Co0k1eGal3xy t1_jbzi8wc wrote
Reply to comment by TemperatureAmazing67 in [D] Is anyone trying to just brute force intelligence with enormous model sizes and existing SOTA architectures? Are there technical limitations stopping us? by hebekec256
- Double descent: models with more parameters are MORE data efficient.
- Most of these LLMs barely complete 1 epoch, so there is no concern about overfitting currently.
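
To make the "barely 1 epoch" point concrete, here is a rough sketch of the arithmetic: epochs are just tokens seen in training divided by tokens in the dataset. The function name and the numbers below are made up for illustration, not taken from any real model.

```python
def epochs_completed(tokens_seen: float, dataset_tokens: float) -> float:
    """Return how many full passes over the dataset a training run made."""
    if dataset_tokens <= 0:
        raise ValueError("dataset_tokens must be positive")
    return tokens_seen / dataset_tokens


# Hypothetical numbers, for illustration only.
tokens_seen = 300e9      # assumed tokens processed during training
dataset_tokens = 500e9   # assumed size of the training corpus in tokens

print(f"{epochs_completed(tokens_seen, dataset_tokens):.2f} epochs")
```

If that ratio is at or below 1, the model has seen most examples at most once, which is why memorizing the training set is not the main worry.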