Unfortunately, due to the combinatorial nature of the multi-label reduction, it can be very slow in practice. Here's an example application where I asked Mechanical Turkers to multi-label phrases into high-level buckets like "Politics" and "Entertainment".
pmineiro@ubuntu-152% for r in 4; do rm model.${r}; time ~/src/multionlineextract/src/multionlineextract --model model.${r} --data <(./multicat 10 =(sort -R octoplevel.max3.moe.in)) --n_items $(cat octoplevel.max3.moe.in | wc -l) --n_raw_labels $(./statsfrompm n_raw_labels) --max_raw_labels 3 --rank ${r} --priorz $(./statsfrompm priorz) --predict flass.${r} --eta 0.5; done
seed = 45
initial_t = 1000
eta = 0.500000
rho = 0.500000
n_items = 3803
n_raw_labels = 10
max_raw_labels = 3
n_labels (induced) = 176
n_workers = 65536
rank = 4
test_only = false
prediction file = flass.4
priorz = 0.049156,0.087412,0.317253,0.012600,0.135758,0.079440,0.109094,0.016949,0.157750,0.034519
cumul      since      example  current  current  current
avg q      last       counter    label  predict  ratings
-3.515874  -3.515874        2       -1        0        4
-3.759951  -3.922669        5       -1        0        4
-3.263854  -2.767756       10       -1        0        4
-2.999247  -2.696840       19       -1        0        3
-2.531113  -2.014788       36       -1        9        4
-2.503801  -2.474213       69       -1        3        4
-2.452015  -2.396817      134       -1        3        4
-2.214508  -1.968222      263       -1        6        3
-2.030175  -1.842252      520       -1        3        4
-1.907382  -1.783031     1033       -1        1        4
-1.728004  -1.547266     2058       -1        2        4
-1.582127  -1.435591     4107       -1        2        4
-1.460967  -1.339532     8204       -1        9        4
-1.364336  -1.267581    16397       -1        5        4
-1.281301  -1.198209    32782       -1        3        4
-1.267093  -1.178344    38030       -1        3       -1
applying deferred prior updates ... finished
gamma: 0.0010 0.0008 0.0007 0.0006
~/src/multionlineextract/src/multionlineextract --model model.${r} --data   2717.98s user 3.46s system 99% cpu 45:26.28 total
Sadly, yes, that's 45 minutes on one core of my laptop. The good news is that while working on speeding this up, I improved the speed of ordinalonlineextract and nominallabelextract by a factor of 4. However, inference is still $O (|L|^2)$, so the problem above with 176 effective labels is about 7700 times slower than a binary problem ($176^2 / 2^2 = 7744$). A more restrictive assumption, such as "all errors are equally likely" (in the nominal case) or "error likelihood depends only upon the edit distance from the true label" (in the multi-label case), would admit cheaper exact inference. multionlineextract is available from the nincompoop repository on Google Code.
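As a sanity check on those numbers, here is a minimal Python sketch (not part of multionlineextract; the function names are made up for illustration). It assumes the induced label set consists of every subset of the 10 raw labels with at most 3 elements, including the empty set, which is the count that reproduces the 176 reported in the log. It also assumes that "edit distance" between two label sets means the size of their symmetric difference, which is my reading and not something the tool confirms.

```python
from itertools import combinations
from math import comb


def induced_label_count(n_raw_labels, max_raw_labels):
    """Number of label subsets with at most max_raw_labels elements.

    Assumption: the empty set counts as a label, which is what makes
    the total match the n_labels (induced) value in the log.
    """
    return sum(comb(n_raw_labels, k) for k in range(max_raw_labels + 1))


def quadratic_slowdown(n_labels, baseline_labels=2):
    """Cost ratio of O(|L|^2) inference relative to a baseline label count."""
    return (n_labels / baseline_labels) ** 2


n_labels = induced_label_count(10, 3)
slowdown = quadratic_slowdown(n_labels)
print(n_labels, slowdown)

# Enumerate the induced labels explicitly as frozensets of raw label ids.
label_sets = [
    frozenset(c)
    for k in range(4)
    for c in combinations(range(10), k)
]

# Assumption: edit distance = size of the symmetric difference.
distances = {len(a ^ b) for a in label_sets for b in label_sets}
print(len(label_sets) ** 2, sorted(distances))
```

Under these assumptions the full confusion structure ranges over 176 x 176 = 30976 (true, observed) pairs, while the edit distance only ever takes the seven values 0 through 6, which gives some feel for how much the more restrictive assumption could shrink the problem.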