Hey Dave - Your visualization is cool, no doubt, but I'm not seeing as much structure in the 3D cloud as I see in 2D visualizations. In 2D, the words form clusters of similarity that are pretty obvious, but I don't really see that here. What training data did you use for word2vec, and for how many iterations did you run t-SNE?
I trained word2vec on Wikipedia and trained a bunch of different models with t-SNE for between 800 and 1,500 iterations. In 2D, the results actually look fine (similar to your graphics and others that have been demonstrated). Unfortunately, at least for me, that doesn't seem to be translating to 3D.
It's probably some mix of the quality of the vectors (maybe I need to try some other word vectors) and t-SNE needing to be tuned a bit better (and/or there's a bug). It's not quite where I want it to be yet, so consider this a "cool tech demo we're still working on" kind of thing.
Did you roll your own version of t-SNE to use in Ersatz? Now that I think about it, I've never seen a 3D visualization with t-SNE before (and searching just now didn't find any); it could be that t-SNE doesn't work as well in 3D (or has bugs that only appear in >2 dimensions).
We're rolling our own for release, but right now we're borrowing liberally from the Barnes-Hut-SNE implementation here: http://homepage.tudelft.nl/19j49/t-SNE.html Barnes-Hut is a modification that makes t-SNE a lot more efficient, so I can train on hundreds of thousands of vectors instead of <10,000.
However, I'm wondering if there's something I'm missing that makes it unsuitable for 3D--i.e., maybe the assumptions being made to speed things up break down after the second dimension. Also, there's interesting discussion in the literature about whether t-SNE is a good dimensionality reduction technique in general (as opposed to only a very powerful visualization technique), so my next step will probably be running the vectors through an autoencoder to generate 3D coordinates and then plotting those and comparing the visualizations.
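One cheap way to test the "approximation breaks down in 3D" hypothesis is to run both the exact and Barnes-Hut solvers at 3 output dims on the same vectors and compare. This is a minimal sketch using scikit-learn as a stand-in implementation (not the Ersatz code or the Delft C++ code), with random clustered vectors standing in for word vectors; note scikit-learn's `barnes_hut` method only supports `n_components` up to 3:

```python
# Sanity check: exact vs Barnes-Hut t-SNE at 3 output dims on stand-in
# "word vectors". If clusters survive with 'exact' but not 'barnes_hut',
# the approximation (theta / tree construction) is the likelier suspect.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.RandomState(0)
# Three well-separated Gaussian clusters, 100-dim, as fake word vectors.
vecs = np.vstack([rng.randn(50, 100) + 10 * i for i in range(3)])

for method in ("exact", "barnes_hut"):
    emb = TSNE(n_components=3, method=method, perplexity=30,
               random_state=0).fit_transform(vecs)
    print(method, emb.shape)  # (150, 3) either way
```

If the exact solver shows clean clusters in 3D and Barnes-Hut doesn't, that points at the speedup rather than the vectors.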
Re: another example of TSNE w/ text--yeah, I've only seen this http://homepage.tudelft.nl/19j49/t-SNE_files/semantic_tsne.j... which seems to work but isn't interactive. Frankly, I'm surprised we got it to work with three.js--we're able to render as many as 250,000 unique words and it runs smooth (it just takes longer to download--this demo has 25,000).
Barnes-Hut uses a quadtree, doesn't it? I don't know whether the code was adapted to use an octree in 3D; maybe it was? FWIW, there's a really interesting Google techtalk on how it works: http://www.youtube.com/watch?v=RJVL80Gg3lA
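For what it's worth, the quadtree-to-octree generalization is mechanical: the same recursive subdivision works in any dimension by splitting each node into 2^d children. A toy illustration (not the actual Barnes-Hut-SNE code, which is C++):

```python
# Toy d-dimensional Barnes-Hut-style tree: 4 children in 2D (quadtree),
# 8 in 3D (octree). Illustrative only -- no center-of-mass bookkeeping.
import numpy as np

class NDTree:
    def __init__(self, center, half_size, capacity=1):
        self.center = np.asarray(center, dtype=float)
        self.half_size = float(half_size)
        self.capacity = capacity
        self.points = []
        self.children = None  # created lazily when the node splits

    def _child_index(self, p):
        # One bit per dimension: which side of the center the point is on.
        return sum(1 << i for i, (x, c) in enumerate(zip(p, self.center)) if x >= c)

    def insert(self, p):
        p = np.asarray(p, dtype=float)
        if self.children is None:
            if len(self.points) < self.capacity:
                self.points.append(p)
                return
            # Split into 2**d children and push existing points down.
            d = len(self.center)
            self.children = []
            for idx in range(2 ** d):
                offset = np.array([1 if (idx >> i) & 1 else -1 for i in range(d)])
                self.children.append(NDTree(self.center + offset * self.half_size / 2,
                                            self.half_size / 2, self.capacity))
            for q in self.points:
                self.children[self._child_index(q)].insert(q)
            self.points = []
        self.children[self._child_index(p)].insert(p)

tree2d = NDTree(center=[0, 0], half_size=1)
tree3d = NDTree(center=[0, 0, 0], half_size=1)
for pt2, pt3 in [(( .5,  .5), ( .5,  .5,  .5)),
                 ((-.5, -.5), (-.5, -.5, -.5))]:
    tree2d.insert(pt2)
    tree3d.insert(pt3)
print(len(tree2d.children), len(tree3d.children))  # 4 8
```

So if the 3D path reuses a hard-coded 4-way split, that would be exactly the kind of bug that only shows up past 2D.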
Thanks, but I've tried that and it gives really messy results. For example, every accented character turns into a space, dividing the word into two meaningless pieces.
Edit: it also seems to put the entire article on a single line with no punctuation. But Google's word2vec examples run with each sentence on one line. Wouldn't this make a difference in training?
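A minimal preprocessing sketch for both issues above, assuming plain Python (this is not the script being discussed): keep accented characters as word characters instead of mapping them to spaces, and emit one sentence per line, which is the shape Google's word2vec demo scripts feed in:

```python
# Lowercase, keep accented letters intact, one sentence per output line.
import re

def sentences_for_word2vec(text):
    # Naive sentence split on ., !, ? -- good enough for a sanity check.
    out = []
    for s in re.split(r"[.!?]+", text):
        # In Python 3, \w matches Unicode word chars, so "café" stays whole.
        tokens = re.findall(r"\w+", s.lower())
        if tokens:
            out.append(" ".join(tokens))
    return out  # join with "\n" to get one sentence per line

lines = sentences_for_word2vec("Café culture thrives. Naïve splitting breaks words!")
print(lines)  # ['café culture thrives', 'naïve splitting breaks words']
```

Whether one-sentence-per-line matters for training depends on the window: word2vec won't let context windows cross line boundaries, so one giant line lets windows span sentence (and even article) breaks.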
Yeah, I've tried from 0.5 to 50--3 seems to be the best value currently. I may try it with the non-Barnes-Hut implementation too, maybe on a smaller number of words. If that works, then I'll bet it's something related to how the quadtree is constructed...
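That kind of sweep is easy to script against the exact (non-Barnes-Hut) solver on a small word set. A sketch assuming scikit-learn as the stand-in implementation and random vectors in place of real word vectors:

```python
# Sweep a few parameter values with the exact solver on a small "vocabulary"
# to separate tuning problems from quadtree/octree problems.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.RandomState(1)
vecs = np.vstack([rng.randn(30, 50) + 8 * i for i in range(2)])  # 60 "words"

embeddings = {}
for perplexity in (0.5, 3, 10, 30):
    embeddings[perplexity] = TSNE(n_components=3, method="exact",
                                  perplexity=perplexity,
                                  random_state=0).fit_transform(vecs)
    print(perplexity, embeddings[perplexity].shape)
```

If some setting clusters well in 3D here but not under Barnes-Hut, the tree construction is the prime suspect.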
Not at all... "documents" were the user's play history and "words" were artists. So if you played one artist twice and another artist once, it'd be like a document that says "artist1 artist1 artist2". The assumption is that document topics are analogous to music genres: each artist creates music within a small set of genres, and each user prefers music within a small set of genres.
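The document construction described above is just a bag-of-artists per user. A tiny sketch (names and data made up):

```python
# Each user's play history becomes one "document": an artist appears once
# per play, so frequency carries listening intensity, and genres play the
# role that topics do for text.
play_history = {
    "user1": ["artist1", "artist1", "artist2"],
    "user2": ["artist2", "artist3"],
}

documents = {user: " ".join(plays) for user, plays in play_history.items()}
print(documents["user1"])  # "artist1 artist1 artist2"
```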
These are a bunch of "word vectors" (generated with word2vec https://code.google.com/p/word2vec/) put through a dimensionality reduction/visualization technique called t-SNE and then plotted in 3D. As far as what the clustering represents, check out t-SNE (http://homepage.tudelft.nl/19j49/t-SNE.html), but the short answer is--it's hard to say...
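The pipeline is short end to end. A minimal sketch with scikit-learn standing in for the t-SNE step and random vectors standing in for real word2vec output (the demo itself renders the coordinates with three.js rather than matplotlib):

```python
# word vectors -> t-SNE -> one (x, y, z) per word, ready to plot.
import numpy as np
from sklearn.manifold import TSNE

words = ["word%d" % i for i in range(30)]      # stand-in vocabulary
rng = np.random.RandomState(2)
vectors = rng.randn(len(words), 64)            # stand-in word2vec vectors

coords = TSNE(n_components=3, method="exact", perplexity=5,
              random_state=0).fit_transform(vectors)
points = {w: tuple(xyz) for w, xyz in zip(words, coords)}
print(coords.shape)  # (30, 3)
```

Each word ends up as a point in 3D space; nearby points had similar high-dimensional vectors, which is all the clustering formally means.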
Here's a longer explanation from a webinar I did yesterday where I demoed this: http://www.youtube.com/watch?v=wmlj5uTUTFY (skip to 12:11 to get to the applicable section)
I ran some visualizations of my own a while back, and they're a lot less aesthetically pleasing, but you can see more structure and clustering.
https://raw.github.com/dhammack/Word2VecExample/master/visua... https://raw.github.com/dhammack/Word2VecExample/master/visua...
(It could also just be a WebGL thing, because the third dimension seems to have a lot less variance than the other two)