Submitted by sinavski t3_10uh62c in MachineLearning
Hello! I'm trying to figure out which available LLMs one can "relatively easily" play with. My goal is to get a feel for the landscape since I haven't worked in this field before, so I'm trying to run them "from the largest to the smallest".
By "relatively easy", I mean it doesn't require setting up a GPU cluster or cost more than $20 :)
Here are some examples I have found so far:
- ChatGPT (obviously) - 175B params
- OpenAI API to access the GPT-3 models (from ada (0.5B) to davinci (175B)); also Codex. A quick API sketch is below the list.
- BLOOM (176B) - the text window on its demo page seems to work reliably; you just need to keep pressing "generate"
- OPT-175B (Facebook's LLM) - the hosted demo runs surprisingly fast, though still slower than ChatGPT
- Several models on Hugging Face that I managed to run with a Colab Pro subscription: GPT-NeoX 20B, Flan-T5-XXL 11B, XLM-RoBERTa-XXL 10.7B, GPT-J 6B. I spent about $20 total on running these. None of the Hugging Face API interfaces/Spaces worked for me :(. Here is an example notebook I made for NeoX, and a rough loading sketch is below the list.
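For the OpenAI API route, this is roughly all it takes (a minimal sketch assuming the pre-1.0 `openai` Python client and an API key with some credit; the model name and parameters are just examples):

```python
# Minimal sketch: query GPT-3 through the OpenAI API.
# Assumes: pip install openai (pre-1.0 client) and OPENAI_API_KEY set in the environment.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

response = openai.Completion.create(
    model="text-davinci-003",  # largest GPT-3 completion model; ada/babbage/curie are cheaper
    prompt="Explain what a language model is in one sentence.",
    max_tokens=64,
    temperature=0.7,
)
print(response.choices[0].text.strip())
```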
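And here's roughly what the Colab loading code looks like for the causal LMs in that list, using GPT-J 6B as the example (a minimal sketch assuming `transformers`, `accelerate` and `bitsandbytes` are installed; not the exact code from my notebook):

```python
# Minimal sketch: run GPT-J 6B on a Colab GPU with 8-bit weights.
# Assumes: pip install transformers accelerate bitsandbytes
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "EleutherAI/gpt-j-6B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # let accelerate place layers on GPU/CPU
    load_in_8bit=True,   # bitsandbytes int8 so the weights fit in ~8 GB of VRAM
)

inputs = tokenizer("Large language models are", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```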
Does anyone know more models that are easily accessible?
P.S. Some large models I couldn't figure out how to run easily (yet): Galactica (120B) and OPT-30B.
gopher9 t1_j7cbdlg wrote
RWKV 14B, trained on The Pile.
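If it helps, a minimal loading sketch, assuming a recent `transformers` version with RWKV support and the Hugging Face-converted checkpoint `RWKV/rwkv-4-14b-pile` (the original BlinkDL checkpoints use their own runner instead):

```python
# Minimal sketch: run the Pile-trained RWKV via transformers.
# Assumes: pip install transformers accelerate, and enough GPU memory for fp16 (~28 GB for 14B);
# swap in a smaller checkpoint such as RWKV/rwkv-4-430m-pile to try it on a free Colab GPU.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "RWKV/rwkv-4-14b-pile"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

inputs = tokenizer("The Pile is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```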