July 27, 2009
So I haven't blogged for the last couple of days. Oops. Oh well, life moves on.
I spent the weekend working in the lab. I ran the self-organizing map on all the data, then ran the distance measure on all of it. Today I ran some preliminary evaluations, and... ahhh, crap. The object recognition rates are really bad, worse than chance in some cases. We may be able to fiddle with some constants to improve them a bit, but there's not a whole lot we can do. Our results have to be finished by the end of today so we can get our poster printed, so we don't have time to rerun any of the data.
Taylor went through all my code this morning to double-check that the poor results weren't caused by a coding error somewhere, and it turns out they weren't. I think they're a result of our distance metric. Jivko used string matching in his paper, but to save time (our data set is much larger, and string matching runtime scales quadratically) we used an n-gram trie, which Matt put together. We knew the results would be worse, but they shouldn't be this much worse. Jivko ran some data through the trie as well, and he said he also got poor results, although I'm not sure how bad. He should be in within an hour or so, so we can ask him what he thinks we should do.
At the moment, since we don't have time to make any major changes, I think we're just going to have to say that our results show that an n-gram trie is ineffective at object recognition. Since Taylor and I will still be here next week, we may try string matching then (we'd have to use all the REU computers for about 2 days straight, which is why we can't do that now). We'll have to keep in touch with Ugonna through email and Google Docs. Hopefully doing that will give us publishable results.
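For anyone curious about the tradeoff between the two distance measures, here is a rough Python sketch. It is not Matt's trie and not the exact string matching from Jivko's paper: I'm assuming plain Levenshtein edit distance as a stand-in for string matching, and a nested-dict trie of n-gram counts compared with a normalized count difference. The function names, the choice of n, and the normalization are all made up for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences; O(len(a) * len(b)) time."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def build_ngram_trie(seq, n):
    """Store every length-n window of seq in a nested-dict trie (assumes n >= 1).

    Inner levels map a symbol to a child dict; the last level maps a symbol
    to the number of times that n-gram occurred.
    """
    root = {}
    for i in range(len(seq) - n + 1):
        node = root
        for sym in seq[i:i + n - 1]:
            node = node.setdefault(sym, {})
        last = seq[i + n - 1]
        node[last] = node.get(last, 0) + 1
    return root


def iter_ngrams(node, depth, prefix=()):
    """Yield (ngram, count) pairs from a trie built with n == depth."""
    for sym, child in node.items():
        if depth == 1:
            yield prefix + (sym,), child
        else:
            yield from iter_ngrams(child, depth - 1, prefix + (sym,))


def ngram_distance(a, b, n=2):
    """Hypothetical distance: normalized L1 difference of n-gram counts, in [0, 1]."""
    counts_a = dict(iter_ngrams(build_ngram_trie(a, n), n))
    counts_b = dict(iter_ngrams(build_ngram_trie(b, n), n))
    total = sum(counts_a.values()) + sum(counts_b.values())
    if total == 0:
        return 0.0  # both sequences are shorter than n
    diff = sum(abs(counts_a.get(k, 0) - counts_b.get(k, 0))
               for k in set(counts_a) | set(counts_b))
    return diff / total
```

Building the trie is a single pass over each sequence, which is why this kind of measure is so much cheaper than the quadratic alignment. The catch is that it only sees local windows and throws away where in the sequence they occurred, which may be part of why it discriminates between objects so much worse.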
Another option we might consider is doing object classification instead of object recognition. A lot of the objects are fairly similar, so classification should give us much better results. We'd just have to hand-label the classes for the objects, but that shouldn't be too hard. Whichever we choose, I don't think we're going to get the results we were looking for, at least not using an n-gram trie. But oh well, hopefully we get some results.
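To make the recognition vs. classification idea concrete, here's a tiny Python sketch. The object names, the class labels, and the predictions are all invented toy values, not our data or results. It just shows why scoring at the class level can only help: a prediction that confuses two similar objects is wrong for recognition but still right for classification if both objects share a hand-assigned class.

```python
def accuracy(predicted, actual):
    """Fraction of positions where the predicted label equals the actual label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        return 0.0
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical hand-labeled mapping from individual objects to classes.
object_to_class = {
    "blue_ball": "ball",
    "red_ball": "ball",
    "small_box": "box",
    "large_box": "box",
}

# Toy labels: two predictions confuse similar objects within the same class.
actual = ["blue_ball", "red_ball", "small_box", "large_box"]
predicted = ["red_ball", "red_ball", "large_box", "large_box"]

recognition_acc = accuracy(predicted, actual)
classification_acc = accuracy(
    [object_to_class[p] for p in predicted],
    [object_to_class[a] for a in actual],
)

print(recognition_acc)     # 0.5
print(classification_acc)  # 1.0
```

The nice part is that this wouldn't need any of the data rerun: the same predictions get rescored after mapping each object to its class.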