What does HackerNews think of char-rnn?
Multi-layer Recurrent Neural Networks (LSTM, GRU, RNN) for character-level language models in Torch
[1] http://prize.hutter1.net
[2] https://github.com/karpathy/char-rnn
http://i.imgur.com/rb0GJvQ.gifv
Basically, you create a video, dump its frames with ffmpeg, run each frame through the RNN, and stitch them back together. It took me several hours to produce just ten seconds of video. Unfortunately, unless you have a Titan X GPU, the maximum size of each image is quite small (certainly less than 1080p), which may be why the frames in this video are split into four quadrants.
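The pipeline above can be sketched as two ffmpeg invocations wrapped in Python. This is a rough sketch, not the commenter's actual script: the file names are placeholders, and the per-frame RNN step is whatever processing you run between the two commands.

```python
# Sketch of the frame pipeline: extract frames -> process each -> reassemble.
# These helpers only build the ffmpeg command lists; pass them to
# subprocess.run(cmd, check=True) to actually execute them.

def extract_cmd(video: str, frame_dir: str) -> list[str]:
    # Dump every frame of the input video as a numbered PNG.
    return ["ffmpeg", "-i", video, f"{frame_dir}/frame_%05d.png"]

def reassemble_cmd(frame_dir: str, fps: int, out_video: str) -> list[str]:
    # Stitch the (RNN-processed) numbered frames back into a video.
    return ["ffmpeg", "-framerate", str(fps),
            "-i", f"{frame_dir}/frame_%05d.png",
            "-pix_fmt", "yuv420p", out_video]

# Usage sketch:
#   import subprocess
#   subprocess.run(extract_cmd("input.mp4", "frames"), check=True)
#   ... run each frames/frame_*.png through the RNN here ...
#   subprocess.run(reassemble_cmd("frames", 24, "output.mp4"), check=True)
```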
The original Word2Vec[1] is missing too. While Gensim and GloVe are nice, Word2Vec still outperforms them both in some circumstances.
Surely there is a good LSTM language modelling project somewhere too? I can't think of one off the top of my head. There's some code in Keras[2], but maybe Karpathy's char-rnn would be better[3] because of the documentation.
[1] https://code.google.com/p/word2vec/
[2] https://github.com/fchollet/keras/blob/master/examples/lstm_...
The output seems to me to be around the level a Markov chain might produce. Karpathy's RNN code produces much, much better results[1].
I wonder if manually extracting features and training the RNN on those is a mistake. RNNs tend to work well on raw text precisely because they learn a representation of the parse structure themselves.