What does HackerNews think of char-rnn?

Multi-layer Recurrent Neural Networks (LSTM, GRU, RNN) for character-level language models in Torch

Language: Lua

Procedural generation for MIDI has been around for a long time - even with far more “primitive” architectures than current GPT-types. Even char-rnn [0] can be used to generate MIDI-based classical scores… I can’t find the exact trained model just now. But as you say… those MIDI generations sound like $hit, so getting them arranged properly would be v v useful.

[0] https://github.com/karpathy/char-rnn
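
For what it's worth, the usual trick is to flatten the MIDI into plain text and train char-rnn on it like any other corpus. A minimal sketch in Python, assuming the mido library; the "n<pitch> d<delay>" token format here is invented for illustration (any reversible text encoding would do), while the th train.lua invocation is char-rnn's standard one:

    # Sketch: flatten a MIDI file into a text stream char-rnn can train on.
    # Requires the `mido` library; the token format is made up here for
    # illustration -- any reversible encoding of note events would do.
    import sys
    import mido

    def midi_to_text(path):
        tokens = []
        for msg in mido.MidiFile(path):  # messages in playback order, .time in seconds
            if msg.type == "note_on" and msg.velocity > 0:
                tokens.append("n%d d%.2f" % (msg.note, msg.time))
        return " ".join(tokens)

    if __name__ == "__main__":
        # Concatenate many scores into data/midi/input.txt, then train with:
        #   th train.lua -data_dir data/midi
        print(midi_to_text(sys.argv[1]))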

I thought this would be about something like giving the Hutter Prize [1] a go using character RNNs [2]. Instead, it's a somewhat confused "gentle introduction" to neural nets (of which there are plenty already, of higher quality), and compression is sort of handwavily discussed, not properly with bits and entropy like us information theorists would have it :)

[1] http://prize.hutter1.net

[2] https://github.com/karpathy/char-rnn
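
To make the information-theoretic framing concrete: a language model that assigns probability p to each next character can, via arithmetic coding, compress text to roughly -log2(p) bits per character, which is just the model's cross-entropy. A tiny sketch (my own illustration, not from the article):

    # The compression view: averaging -log2(p) over the model's predicted
    # probabilities for the observed characters gives bits per character,
    # which an arithmetic coder can approach as a compressed size.
    import math

    def bits_per_char(probs):
        """probs: the model's probability for each character actually seen."""
        return -sum(math.log2(p) for p in probs) / len(probs)

    # A model that always assigns 0.5 to the observed character needs 1 bit/char:
    print(bits_per_char([0.5] * 100))  # 1.0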

I produced something similar to this shortly after the release of https://github.com/karpathy/char-rnn:

http://i.imgur.com/rb0GJvQ.gifv

Basically, you create a video, dump the video's frames with ffmpeg, run each frame through the RNN, and stitch them back together. It took me several hours to produce just ten seconds of video. Unfortunately, unless you have a Titan X GPU, the maximum size of each image is quite small (certainly less than 1080p), which may be why the frames in this video are split into four quadrants.
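
The pipeline is roughly this (a sketch, assuming ffmpeg on PATH; process_frame.lua is a hypothetical placeholder for whatever per-frame model invocation was actually used, which the comment doesn't specify):

    # Sketch of the frame-by-frame pipeline: dump frames, process each one,
    # then re-stitch. `process_frame.lua` is a hypothetical placeholder.
    import glob
    import os
    import subprocess

    os.makedirs("frames", exist_ok=True)
    os.makedirs("out", exist_ok=True)

    # 1. Dump the video's frames as numbered PNGs.
    subprocess.run(["ffmpeg", "-i", "input.mp4", "frames/frame_%05d.png"],
                   check=True)

    # 2. Run each frame through the network (placeholder invocation).
    for frame in sorted(glob.glob("frames/*.png")):
        out = os.path.join("out", os.path.basename(frame))
        subprocess.run(["th", "process_frame.lua", "-input", frame,
                        "-output", out], check=True)

    # 3. Stitch the processed frames back together at the original frame rate.
    subprocess.run(["ffmpeg", "-framerate", "30", "-i", "out/frame_%05d.png",
                    "-c:v", "libx264", "output.mp4"], check=True)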

https://github.com/facebook/MemNN should be in the language modelling (or Deep Learning) part. I'll give them a pass because it was only released a couple of days ago.

The original Word2Vec[1] is missing too. While Gensim and GloVe are nice, Word2Vec still outperforms them both in some circumstances.

Surely there is a good LSTM language modelling project somewhere too? I can't think of one off the top of my head, though. There's some code in Keras[2] (a rough sketch of that approach follows the links below), but maybe Karpathy's char-rnn[3] would be better because of the documentation.

[1] https://code.google.com/p/word2vec/

[2] https://github.com/fchollet/keras/blob/master/examples/lstm_...

[3] https://github.com/karpathy/char-rnn
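
For reference, the Keras version of a character-level LSTM language model is only a few lines. This is my own minimal sketch (next-character prediction over fixed-length windows, assuming a reasonably recent Keras), not the linked example verbatim:

    # Minimal character-level LSTM language model in Keras: predict the
    # next character from a fixed-length window of one-hot encoded input.
    import numpy as np
    from keras.models import Sequential
    from keras.layers import LSTM, Dense
    from keras.utils import to_categorical

    text = open("input.txt").read()
    chars = sorted(set(text))
    char_to_ix = {c: i for i, c in enumerate(chars)}
    seq_len = 40

    X = [[char_to_ix[c] for c in text[i:i + seq_len]]
         for i in range(len(text) - seq_len)]
    y = [char_to_ix[text[i + seq_len]] for i in range(len(text) - seq_len)]
    X = to_categorical(np.array(X), num_classes=len(chars))
    y = to_categorical(np.array(y), num_classes=len(chars))

    model = Sequential([
        LSTM(128, input_shape=(seq_len, len(chars))),
        Dense(len(chars), activation="softmax"),
    ])
    model.compile(loss="categorical_crossentropy", optimizer="adam")
    model.fit(X, y, batch_size=128, epochs=10)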

I haven't looked at the code, but glancing at the results leaves me thinking it might need more work.

The output seems to be around the level a Markov chain might produce. Karpathy's RNN code produces much, much better results[1].

I wonder if manually extracting features and training the RNN on those is a mistake? RNNs tend to work well on text because they encode an understanding of the parse tree themselves.

[1] https://github.com/karpathy/char-rnn
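
For comparison, the Markov-chain baseline being alluded to is just a lookup from the last k characters to the characters that followed them in training text; a minimal sketch:

    # Character-level Markov chain: map each k-character context to the
    # characters that followed it in the training text, then sample.
    import random
    from collections import defaultdict

    def train(text, k=3):
        model = defaultdict(list)
        for i in range(len(text) - k):
            model[text[i:i + k]].append(text[i + k])
        return model

    def generate(model, seed, length=200, k=3):
        out = seed
        for _ in range(length):
            nxt = model.get(out[-k:])
            if not nxt:
                break
            out += random.choice(nxt)
        return out

    corpus = open("input.txt").read()
    print(generate(train(corpus), seed=corpus[:3]))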