Submitted by Vegetable-Skill-9700 t3_121a8p4 in MachineLearning
LahmacunBear t1_jdo7k0w wrote
Here’s a thought — 175B in GPT3 original, the best stuff thrown at it, performs as it did. ChatGPT training tricks, suddenly same size performs magnitudes better. I doubt that currently the LLMs are fully efficient, i.e. just as with GPT3 to 3.5, with the same size we can continue to get much better results, and therefore current results with much smaller models.
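The point is that the jump from base GPT-3 to 3.5 came from the training recipe, not the parameter count. OpenAI's actual pipeline (supervised fine-tuning plus RLHF) isn't public, but here's a minimal sketch of the supervised instruction-tuning step, assuming HuggingFace transformers with gpt2 as a stand-in small model; the dataset and hyperparameters are toy placeholders, not any real recipe:

```python
# Minimal sketch of supervised instruction fine-tuning ("SFT"), the first
# stage of the kind of training trick referenced above. Model, data, and
# hyperparameters are illustrative assumptions only.
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_name = "gpt2"  # stand-in small causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Toy instruction/response pairs; a real run would use many thousands.
pairs = [
    ("Summarize: The cat sat on the mat.", "A cat sat on a mat."),
    ("Translate to French: Hello.", "Bonjour."),
]

class InstructionDataset(torch.utils.data.Dataset):
    """Formats (instruction, response) pairs as next-token-prediction examples."""
    def __init__(self, pairs, tokenizer, max_len=128):
        self.examples = []
        for prompt, response in pairs:
            text = f"Instruction: {prompt}\nResponse: {response}{tokenizer.eos_token}"
            enc = tokenizer(text, truncation=True, max_length=max_len,
                            padding="max_length", return_tensors="pt")
            enc = {k: v.squeeze(0) for k, v in enc.items()}
            enc["labels"] = enc["input_ids"].clone()
            enc["labels"][enc["attention_mask"] == 0] = -100  # ignore padding in the loss
            self.examples.append(enc)

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, idx):
        return self.examples[idx]

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sft-out", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=InstructionDataset(pairs, tokenizer),
)
trainer.train()
```

Note that nothing about the model changes here: the parameter count stays fixed, and only the data and objective differ, which is exactly the "same size, much better results" lever the comment is betting on.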