Submitted by __Maximum__ t3_11l3as6 in MachineLearning
blarg7459 t1_jbetts9 wrote
Reply to comment by CKtalon in [D] Can someone explain the discrepancy between the findings of LLaMA and Chinchilla? by __Maximum__
Doesn't that mean that if you include inference costs, and the model will be used extensively, you may actually get much better bang for your buck by training for much longer than Chinchilla-optimal?
farmingvillein t1_jbk3esu wrote
Yes, which was arguably the key claim of the LLaMA paper.
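To make the trade-off above concrete, here is a rough back-of-the-envelope sketch. It assumes the common approximations of about 6 FLOPs per parameter per training token and about 2 FLOPs per parameter per generated token. The two model configurations are hypothetical, and the sketch simply assumes that both reach comparable quality, which neither paper is being quoted on here.

```python
# Rough compute accounting: training plus lifetime inference.
# Assumptions (not from the thread): ~6*N*D FLOPs to train,
# ~2*N FLOPs per token served at inference time.

def training_flops(params, train_tokens):
    """Approximate training compute for a dense transformer."""
    return 6 * params * train_tokens

def inference_flops(params, tokens_served):
    """Approximate compute to serve a given number of tokens."""
    return 2 * params * tokens_served

def total_flops(params, train_tokens, tokens_served):
    """Training compute plus lifetime inference compute."""
    return training_flops(params, train_tokens) + inference_flops(params, tokens_served)

def break_even_tokens(big, small):
    """Tokens served at which the smaller, longer-trained model
    becomes cheaper overall. Each argument is (params, train_tokens)."""
    extra_training = training_flops(*small) - training_flops(*big)
    saving_per_token = 2 * (big[0] - small[0])
    return extra_training / saving_per_token

# Hypothetical configurations, chosen only for illustration:
# a larger model at roughly 20 tokens per parameter, and a
# smaller model trained well past that ratio.
BIG = (30e9, 600e9)
SMALL = (13e9, 2000e9)

if __name__ == "__main__":
    print(f"break-even at {break_even_tokens(BIG, SMALL):.3g} tokens served")
    for served in (0, 1e12, 1e13):
        big_cost = total_flops(*BIG, served)
        small_cost = total_flops(*SMALL, served)
        print(f"served={served:.0e}  big={big_cost:.3g}  small={small_cost:.3g}")
```

Under these assumptions the smaller model costs more to train, but every token it serves is cheaper, so past the break-even point it wins on total compute. Whether the two models really end up at similar quality is the empirical question, and the sketch does not answer it.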