
Akimbo333 t1_jcjf5j4 wrote

What is the strongest (largest-parameter) LLaMA model that a consumer can run on their own hardware?

6

Z1BattleBoy21 t1_jcjgjiw wrote

10

Akimbo333 t1_jcjhgoh wrote

Cool thanks!!! Do you think that this could be used for a humanoid robot?

2

Z1BattleBoy21 t1_jcjhw2v wrote

In theory, for sure. The only company I know of that's working toward a humanoid robot is https://www.figure.ai/. They haven't released much to the public, so I don't know if they even use an LLM.

2

Akimbo333 t1_jcjmf7u wrote

Oh ok, cool! But I don't have high hopes for Figure.

1

Akimbo333 t1_jcjxnvw wrote

And I have to figure out how to make the model multimodal.

1

Hands0L0 t1_jck1yvf wrote

I got 30B running on a 3090 machine, but the token return is very limited.

1
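
For context, here is a minimal sketch of the kind of setup described above: a ~30B LLaMA-family checkpoint loaded in 4-bit on a single 24 GB card, with a capped generation budget. The checkpoint name and quantization settings are assumptions for illustration, not necessarily what was actually used.

```python
# Hypothetical sketch: a ~30B LLaMA-family checkpoint in 4-bit on one 24 GB GPU,
# with a hard cap on how many new tokens it may generate.
# The model ID and quantization settings are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "huggyllama/llama-30b"  # assumed checkpoint name

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                   # ~17 GB of weights instead of ~65 GB in fp16
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",                   # place layers on the available GPU
)

prompt = "Explain, in one paragraph, why local LLM inference is VRAM-bound."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=300)   # small generation budget
print(tokenizer.decode(output[0], skip_special_tokens=True))
```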

Akimbo333 t1_jck2koh wrote

Oh ok. How many tokens are returned?

1

Hands0L0 t1_jck3lfv wrote

Depends on prompt size, which is going to dictate the quality of the return. Around 300 tokens?

1

Akimbo333 t1_jck53wv wrote

Well, actually, that's not bad! That's about 50-70 words, which in an English class is roughly 3-5 sentences, so essentially a paragraph. That's a good amount for a chatbot! Let me know what you think.

2

Hands0L0 t1_jck5cyd wrote

Considering you can explore context with ChatGPT and Bing through multiple returns, not exactly. Here you need to hit it on your first attempt.

2

Akimbo333 t1_jck73ph wrote

Well, you could always ask it to continue the sentence.

2
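
That "ask it to continue" trick is just a second generation call with the earlier output appended to the prompt. A rough sketch, reusing the hypothetical model and tokenizer from the earlier snippet:

```python
# Rough sketch of "ask it to continue": feed the earlier output back in as context
# and request another chunk. Reuses the hypothetical model/tokenizer from above.
def generate_chunk(prompt: str, max_new_tokens: int = 300) -> str:
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    new_tokens = output[0][inputs["input_ids"].shape[1]:]   # keep only the new part
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

topic = "Write a short essay on running large language models at home."
first = generate_chunk(topic)
second = generate_chunk(topic + first + "\nContinue:")
print(first + second)
```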

Hands0L0 t1_jck7ifi wrote

Not if there is a token limit.

I'm sorry, I don't think I was being clear. The token limit is tied to VRAM. You can load the 30B on a 3090, but it swallows up 20 of the 24 GB of VRAM for the model and prompt alone. That leaves you about 4 GB for returns.

2
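
A rough way to see why that leftover VRAM translates into a small generation budget: the per-token cost being described is essentially the KV cache. A back-of-the-envelope sketch using the published LLaMA-30B shape (60 layers, hidden size 6656) with fp16 cache values:

```python
# Back-of-the-envelope KV-cache arithmetic for LLaMA-30B with fp16 cache values.
# Published model shape: 60 transformer layers, hidden size 6656.
layers, hidden, bytes_per_value = 60, 6656, 2

kv_per_token = 2 * layers * hidden * bytes_per_value   # keys and values for every layer
print(f"{kv_per_token / 2**20:.1f} MiB per token of context")          # ~1.5 MiB

context = 2048                                          # LLaMA's maximum context length
print(f"{context * kv_per_token / 2**30:.1f} GiB for a full window")   # ~3.0 GiB
```

So a few gigabytes of headroom really does correspond to only a modest number of tokens for the prompt plus the reply.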

Akimbo333 t1_jcka9ef wrote

Oh ok. So you can't make it keep talking?

1

Hands0L0 t1_jckbm7h wrote

No, because the predictive text needs the entire conversation history as context to predict what to say next, and the only place to hold that history during generation is in memory (VRAM). If you run out, you run out of room for returns.

2
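
One common workaround (not a fix for the underlying limit) is to keep only as much recent history as fits a fixed token budget. A minimal sketch, reusing the hypothetical tokenizer from the earlier snippet; the budget number is an assumption:

```python
# Minimal sketch of a rolling conversation window: drop the oldest turns once the
# history no longer fits a fixed token budget. Reuses the hypothetical tokenizer
# from the earlier snippet; the budget number is an assumption.
def fit_history(turns: list[str], max_tokens: int = 1700) -> list[str]:
    kept, used = [], 0
    for turn in reversed(turns):                         # newest turns first
        n = len(tokenizer(turn)["input_ids"])
        if used + n > max_tokens:
            break
        kept.append(turn)
        used += n
    return list(reversed(kept))                          # restore chronological order

history = ["User: hi", "Bot: hello!", "User: tell me about llamas"]
prompt = "\n".join(fit_history(history)) + "\nBot:"
```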

Akimbo333 t1_jckc9iu wrote

Damn! There's gotta be a better way to store conversations!!! Maybe one day

1

Hands0L0 t1_jcknz03 wrote

Study CS, come up with a solution, and you can be very rich.

1

bryceschroeder t1_jcygn0x wrote

>strongest

I am running LLaMA 30B at home at full fp16. It takes 87 GB of VRAM across six AMD Instinct MI25s, and speed is reasonable but not fast (it can spit out a sentence in 10-30 seconds or so in a dialog/chatbot context, depending on the length of the response). While the hardware is not "consumer hardware" per se (it's old datacenter hardware), the cost was in line with the kind of money you would spend on a middling gaming setup: the computer cost about $1,500 to build up, and the GPUs to put in it set me back about $500.

1
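
A hedged sketch of what a multi-GPU fp16 load like this can look like with Hugging Face's automatic device mapping; the checkpoint name and per-card memory caps are assumptions, not the poster's actual configuration:

```python
# Hypothetical sketch: shard a ~30B fp16 checkpoint across six 16 GB GPUs with
# Accelerate's automatic device map. Checkpoint name and memory caps are
# assumptions, not the poster's actual configuration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "huggyllama/llama-30b"  # assumed checkpoint name

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,                      # full fp16, ~65 GB of weights
    device_map="auto",                              # spread layers across all visible GPUs
    max_memory={i: "15GiB" for i in range(6)},      # leave headroom on each 16 GB card
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```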

bryceschroeder t1_jcyhyss wrote

To clarify with some additional details, I probably could have spent less on the computer; I sprang for 384 GB of DDR4 and 1 TB NVMe to make loading models faster.

1