blarg7459 t1_jbetts9 wrote on March 8, 2023 at 3:23 PM

Reply to comment by CKtalon in [D] Can someone explain the discrepancy between the findings of LLaMA and Chinchilla? by __Maximum__

Doesn't that mean that if you include inference costs, and the model will be used extensively, you may actually get much better bang for your buck by training well past the Chinchilla-optimal point?

farmingvillein t1_jbk3esu wrote on March 9, 2023 at 4:49 PM

Yes, which was arguably the key claim of the LLaMA paper.
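To make the trade-off concrete, here is a rough back-of-the-envelope sketch. It assumes the common approximations of about 6·N·D FLOPs to train a dense transformer with N parameters on D tokens, and about 2·N FLOPs per token at inference. The two models below are hypothetical, and the idea that they reach the same quality is assumed purely for illustration, not taken from either paper.

```python
# Rough lifetime-compute comparison. All numbers are illustrative assumptions.

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training cost: ~6 FLOPs per parameter per training token."""
    return 6.0 * n_params * n_tokens


def inference_flops(n_params: float, n_tokens_served: float) -> float:
    """Approximate inference cost: ~2 FLOPs per parameter per generated token."""
    return 2.0 * n_params * n_tokens_served


def lifetime_flops(n_params: float, train_tokens: float, served_tokens: float) -> float:
    """Training cost plus the cost of serving `served_tokens` tokens."""
    return training_flops(n_params, train_tokens) + inference_flops(n_params, served_tokens)


def break_even_tokens(small: tuple, large: tuple) -> float:
    """Served tokens at which the smaller, longer-trained model becomes cheaper overall.

    Each argument is (n_params, train_tokens). Assumes the small model has fewer
    parameters but a higher training cost than the large one.
    """
    extra_training = training_flops(*small) - training_flops(*large)
    saving_per_token = 2.0 * (large[0] - small[0])
    return extra_training / saving_per_token


# Hypothetical pair, ASSUMED to reach similar quality:
# a 30B model at roughly 20 tokens per parameter (the usual Chinchilla rule of thumb)
# versus a 13B model trained on far more tokens than that rule suggests.
large_model = (30e9, 600e9)
small_model = (13e9, 2e12)

if __name__ == "__main__":
    tokens = break_even_tokens(small_model, large_model)
    print(f"Break-even after about {tokens:.2e} served tokens")
```

Under these assumptions the smaller model costs more to train but less per token served, so past the break-even volume it is cheaper overall. The real answer depends on how much extra training the smaller model needs to match the larger one, which this sketch does not model.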