
Can AI Models Teach Themselves To Follow Instructions?

Leveraging synthetic data to align AI models


Raw LLMs are not very useful.

Interacting with them feels a bit like interacting with the child in this video:

It’s simply not on board with our plans.

In recent years, a lot of research has focused on aligning these models with humans so they become more useful. As a result, methods such as RLHF and instruction tuning helped create banger applications like ChatGPT. Though the two methods are fundamentally different, they both have the same goal:

Create a model that does what you want.

In RLHF (Reinforcement Learning from Human Feedback), a separate reward model is trained on human feedback. This reward model is then used to fine-tune the foundation model so that it essentially adheres to the human feedback.
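To make that slightly more concrete, here is a toy sketch of the reward-modelling idea; the scoring heuristic and the update are stand-ins for a learned reward model and an RL algorithm like PPO, not a real implementation:

```python
# Toy illustration of the RLHF idea (assumed setup, not any real pipeline):
# a reward model scores candidate responses, and the policy's preferences
# are nudged towards responses that score higher.

def toy_reward_model(prompt: str, response: str) -> float:
    """Stand-in for a reward model trained on human preference data.
    Here it simply prefers helpful-sounding, on-topic answers."""
    score = 0.0
    if response.lower().startswith("sure"):
        score += 1.0
    if any(word in response.lower().split() for word in prompt.lower().split()):
        score += 1.0
    return score

def toy_policy_update(candidates: dict, prompt: str) -> None:
    """Stand-in for an RL update (e.g. PPO): raise the sampling weight
    of responses the reward model liked."""
    for response in candidates:
        candidates[response] += toy_reward_model(prompt, response)

prompt = "Give me a short summary of this article"
candidates = {"No.": 1.0, "Sure, here is a short summary of the article:": 1.0}
toy_policy_update(candidates, prompt)
print(max(candidates, key=candidates.get))  # the helpful, on-topic answer wins
```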

Instruction tuning, on the other hand, describes the process of fine-tuning a model on - you guessed it - instructions.

An example of such an instruction could be:

Find a slant rhyme with the word orange. Choose a word from the following list: Florence, door hinge, and tree.

The model would be expected to output something like: “Florence”.
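In practice, instruction tuning usually boils down to standard supervised fine-tuning on such examples. Here is a minimal sketch of how one example might be turned into a training sample; the prompt template is an assumption, not taken from any specific paper:

```python
# Minimal sketch: turning an instruction example into a supervised
# fine-tuning sample. The prompt template below is illustrative.

instruction = ("Find a slant rhyme with the word orange. Choose a word "
               "from the following list: Florence, door hinge, and tree.")
expected_output = "Florence"

prompt = f"Instruction: {instruction}\nOutput:"
completion = f" {expected_output}"

# During fine-tuning, the model sees prompt + completion and the loss is
# typically computed only on the completion tokens, so it learns to answer
# instructions rather than to predict them.
training_example = {"prompt": prompt, "completion": completion}
print(training_example)
```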

However, there is one way in which RLHF and instruction tuning are alike. Human annotators need to hand-label gigantic datasets. That costs a ton of money.

But it doesn’t stop there.

It’s non-trivial for humans to hand-label these datasets. The data are limited in quality, diversity, and creativity.

So, what could a way forward look like?

The authors of the Self-Instruct paper proposed an interesting approach to auto-generating these instruction datasets and then fine-tuning models on them.

The results are impressive!

They fine-tuned GPT-3 on their auto-generated data and achieved a 33% improvement on benchmarks.

Let’s dive in and see how this cool method works in detail.

But before that, a quick word from this week’s sponsor.

This 3-hour ChatGPT & AI Workshop will help you automate tasks & simplify your life using AI at no cost. (+ you get a bonus worth $500 on registering) 🎁

With AI & ChatGPT, you will be able to:

✅ Make smarter decisions based on data in seconds using AI 

✅ Automate daily tasks and increase productivity & creativity

✅ Solve complex business problems using the power of AI

✅ Build stunning presentations & create content in seconds

👉 Hurry! Click here to register (Limited seats: FREE for First 100 people only)🎁

What Does Self-Instruct Do?

Here is their framework in one sentence:

They use a model to generate an instruction dataset and then fine-tune that same model on that dataset, which essentially bootstraps it off its own generations.

Simple, no?

Yes, but it’s actually a bit more involved than that. The authors had to do a lot more stuff to get it to work.

Some Of The Nitty Gritty

To instruction-tune their model, the authors needed to generate data with the following shape:

(Instruction, Input) → Output

To illustrate this a bit more, let’s fill in my rhyming example from above.

(“Create a rhyme for orange from a list”, “Florence, door hinge, tree”) → “Florence”
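As a rough sketch, each training example could be represented like this (the field names are mine, not the paper’s):

```python
from dataclasses import dataclass

# Rough sketch of the data shape described above; field names are hypothetical.
@dataclass
class SelfInstructExample:
    instruction: str  # what the model should do
    input: str        # optional context the instruction operates on ("" if none)
    output: str       # the expected answer

example = SelfInstructExample(
    instruction="Create a rhyme for orange from a list",
    input="Florence, door hinge, tree",
    output="Florence",
)
```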

They started by generating just the instructions. To do that, the authors created 175 hand-crafted tasks (input-output pairs). These were then used to generate instructions that would have the input-output pairs as a consequence. Essentially, the question to the model was: given input X and output Y, what instruction could produce this output from this input?

To use our rhyming example:

“What instruction could have led to the output Florence, given the following list: [Florence, door hinge, tree]?”

Answer: “Create a slant rhyme for orange”

Alright.

That step left them with hundreds of new instructions derived from the 175 input-output pairs.
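A minimal sketch of what such an instruction-generation prompt could look like; the wording and the commented-out `llm.complete` call are assumptions, not the paper’s actual templates:

```python
# Sketch of step 1: show the model an input-output pair and ask what
# instruction could have produced it. Prompt wording is illustrative.

seed_pairs = [
    {"input": "Florence, door hinge, tree", "output": "Florence"},
    {"input": "Good morning", "output": "Guten Morgen"},
]

def build_instruction_prompt(pair: dict) -> str:
    return (
        "Given the input and output below, write an instruction that "
        "could have produced this output from this input.\n"
        f"Input: {pair['input']}\n"
        f"Output: {pair['output']}\n"
        "Instruction:"
    )

for pair in seed_pairs:
    prompt = build_instruction_prompt(pair)
    # new_instruction = llm.complete(prompt)  # hypothetical LLM call
    print(prompt)
```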

Here comes the tricky part. To build the complete training set, they needed to turn the problem around and use these new instructions to generate fresh input-output pairs. This was particularly challenging because the model did not just need to understand what outputs to create from the instructions.

It also needed to come up with a potential input that would be necessary to solve this task and then solve it. In the paper, the authors dive deeper into all the hand-holding they had to do to get the model to produce good results.
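Here is a rough sketch of that second step; the prompt template and the `llm.complete` call are illustrative assumptions rather than the paper’s actual setup:

```python
# Sketch of step 2: given a newly generated instruction, ask the model to
# invent a plausible input and then solve it. Template is illustrative.

def build_instance_prompt(instruction: str) -> str:
    return (
        f"Instruction: {instruction}\n"
        "First write a plausible input for this instruction, "
        "then write the correct output.\n"
        "Input:"
    )

new_instruction = "Create a slant rhyme for orange from a list"
prompt = build_instance_prompt(new_instruction)
# completion = llm.complete(prompt)                       # hypothetical LLM call
# input_text, output_text = completion.split("Output:")   # naive parsing
print(prompt)
```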

Last but not least, the dataset underwent extensive filtering.

For example, they computed the ROUGE-L score (longest-common-subsequence overlap) between instructions and removed those with too much overlap. They also filtered out instructions that included keywords such as image, picture, or graph, as these indicate data that the language model could not process.
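As an illustration of such a filter, here is a minimal sketch that computes a token-level ROUGE-L (longest common subsequence) similarity and applies a keyword blocklist; the threshold and keyword list are illustrative, not the paper’s exact settings:

```python
# Illustrative filtering step: drop new instructions that overlap too much
# with existing ones (ROUGE-L over tokens, i.e. longest common subsequence)
# or that mention modalities a pure language model cannot handle.

def lcs_length(a: list[str], b: list[str]) -> int:
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[len(a)][len(b)]

def rouge_l(candidate: str, reference: str) -> float:
    a, b = candidate.lower().split(), reference.lower().split()
    lcs = lcs_length(a, b)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(a), lcs / len(b)
    return 2 * precision * recall / (precision + recall)  # F1

BANNED = {"image", "picture", "graph"}  # keywords the model cannot act on

def keep(new_instruction: str, existing: list[str], max_rouge: float = 0.7) -> bool:
    if BANNED & set(new_instruction.lower().split()):
        return False
    return all(rouge_l(new_instruction, old) <= max_rouge for old in existing)

print(keep("Create a slant rhyme for orange",
           ["Create a slant rhyme for orange from a list"]))  # False: too similar
```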

Then they went through the three steps - generating instructions, generating input-output pairs, and filtering - over and over again.

This way, they iteratively increased the share of synthetic samples and built a large, diverse, and most importantly entirely model-generated dataset.
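Putting the pieces together, the overall bootstrap loop might look roughly like this; the helper functions are trivial stand-ins for the steps sketched above, not the paper’s actual implementation:

```python
# High-level sketch of the bootstrap loop tying the three steps together.
# The helpers are trivial stand-ins; in the real pipeline each one is a
# model call plus parsing, and filtering uses ROUGE-L as sketched above.

def generate_instructions(task_pool, n=3):
    return [f"new instruction {len(task_pool) + i}" for i in range(n)]  # stub

def generate_instance(instruction):
    return {"instruction": instruction, "input": "stub input", "output": "stub output"}

def keep(candidate, task_pool):
    return all(candidate["instruction"] != t["instruction"] for t in task_pool)  # stub filter

def self_instruct(seed_tasks, rounds=3):
    task_pool = list(seed_tasks)
    for _ in range(rounds):
        instructions = generate_instructions(task_pool)              # step 1
        candidates = [generate_instance(i) for i in instructions]    # step 2
        task_pool += [c for c in candidates if keep(c, task_pool)]   # step 3
    return task_pool

seeds = [{"instruction": "Create a slant rhyme for orange from a list",
          "input": "Florence, door hinge, tree", "output": "Florence"}]
print(len(self_instruct(seeds)))  # the task pool grows each round
```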

Then they proceeded to fine-tune GPT-3.

Interesting so far? You might also like the Language Model Digest.
Consider subscribing for free daily updates on NLP research.

Sponsored
Language Model Digest: A daily newsletter categorizing and easily explaining LLM research papers as they are published.

Training and Results

The authors applied various tricks to augment the data.

For example, they used different line breaks in the prompts and added keywords (e.g. “Task:”) at the beginning and end of the prompt. This was intended to make the model more robust to different input formats.
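As a small, purely illustrative sketch of that kind of prompt-format augmentation (the specific variants are my guesses, not the paper’s exact templates):

```python
import random

# Illustrative prompt-format augmentation: render the same example with
# different separators and optional keywords so the fine-tuned model is not
# tied to a single template. The specific variants are assumptions.

def format_example(instruction: str, output: str) -> str:
    sep = random.choice(["\n", "\n\n"])                  # vary line breaks
    prefix = random.choice(["", "Task: "])               # optional keyword up front
    suffix = random.choice(["", "Answer:", "Output:"])   # optional keyword before the answer
    return f"{prefix}{instruction}{sep}{suffix} {output}".strip()

print(format_example("Create a slant rhyme for orange from a list", "Florence"))
```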

Their results were very impressive.

As mentioned above, the fine-tuned version of GPT-3 outperformed the raw version by 33% on benchmarks. They also managed to be on par with state-of-the-art models such as InstructGPT, which was trained on a private human-labelled dataset.

And they did that with almost entirely machine-generated data.

Big kudos also to the authors for open-sourcing a synthetic dataset of 52K instructions and a set of manually written novel tasks for building and evaluating instruction-following models.

Before we wrap up, here are some closing thoughts.

Closing Thoughts

In my opinion, this can be viewed as a sort of alignment by distillation.

Let me explain what I mean.

Obviously, the model was not able to expand its abilities. That would mean that it generated novel information outside of the data manifold that was captured during the initial training.

Hence, it would have created information out of thin air.

What happened here was something different. The original GPT-3 model was already able to follow instructions.

However, it was trained in a very general way on very general data.

To anthropomorphize the hell out of it, we could say that the model did not know that following instructions was what we wanted from it.

Fine-tuning on self-generated data then essentially made the model collapse internally onto following instructions.

I hope this gave you some food for thought or will spark an idea for your work! Maybe you even learned something like I did when reading this cool paper.

If you have feedback or questions, hit reply on this email or connect with me on Twitter or LinkedIn.

Lots of love and see you next time!

P.S. If you found this useful, please, share it with a friend or subscribe here ⭕️ for weekly deep dives on AI research and the data economy.