What does HackerNews think of self-instruct?
Aligning pretrained language models with instruction data generated by themselves.
Language: Python
When they say "Augment your dataset with synthetic data" on https://developers.googleblog.com/2023/03/announcing-palm-ap... do they mean something like this: https://github.com/yizhongw/self-instruct ?
It says
> We train the Alpaca model on 52K instruction-following demonstrations generated in the style of self-instruct using text-davinci-003
Which leads to self-instruct https://github.com/yizhongw/self-instruct
At a glance, they used an LM to classify instructions and train the model, which IMHO is very similar to RLHF.
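For anyone wondering what "generated in the style of self-instruct" looks like mechanically, here is a minimal sketch of the bootstrapping idea: start from a few seed instructions, ask a model for new ones, and keep only candidates that are not near-duplicates of the pool. Everything named here (`bootstrap_instructions`, `token_overlap`, `fake_generate`, the 0.7 cutoff) is hypothetical and for illustration only. As far as I can tell the actual project filters with ROUGE-L and calls a real model; this sketch swaps in plain Jaccard word overlap and a canned stand-in generator so it runs offline.

```python
from typing import Callable, List


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap between the lowercase word sets of two strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def bootstrap_instructions(
    seed: List[str],
    generate: Callable[[List[str]], List[str]],
    rounds: int = 3,
    max_overlap: float = 0.7,  # assumed cutoff, chosen for illustration
) -> List[str]:
    """Grow an instruction pool by repeatedly asking `generate` for candidates."""
    pool = list(seed)
    for _ in range(rounds):
        for candidate in generate(pool):
            candidate = candidate.strip()
            if not candidate:
                continue
            # Keep only candidates that are not near-duplicates of the pool.
            if all(token_overlap(candidate, kept) < max_overlap for kept in pool):
                pool.append(candidate)
    return pool


def fake_generate(pool: List[str]) -> List[str]:
    """Stand-in for an LM call: canned candidates, one a case-variant duplicate."""
    return [
        "Translate the sentence into French.",
        "translate the sentence into french.",
        f"Write a haiku about topic number {len(pool)}.",
    ]


if __name__ == "__main__":
    seeds = ["Summarize the following paragraph.", "List three prime numbers."]
    for instruction in bootstrap_instructions(seeds, fake_generate, rounds=2):
        print(instruction)
```

Note that nothing in this loop involves a reward model or human preference labels; it only produces instruction data that can later be used for ordinary supervised fine-tuning.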