artsybashev t1_je65qs7 wrote
A 13B model is quite small. Given that the company is focused on AI hardware, the dataset and other parts of the model might be lagging a bit. The lack of comparisons to other models also suggests that the performance is not that good.
artsybashev t1_jdu2hjs wrote
Reply to comment by super_deap in [D] GPT4 and coding problems by enryu42
The physical world that we know is very different from the virtual twin that we see. The human mind lives in a virtual existence created by the material human brain. This virtual world creates nonexistent things like pain, colors, feelings, and also the feeling of existence.
The virtual world that each of our brains creates is the wonderful place where a soul can emerge. Virtual worlds can also be created by computers. There is no third, magical place besides these two in my view.
artsybashev t1_jds4ekt wrote
Reply to comment by addition in [D] GPT4 and coding problems by enryu42
And soon people will understand that this feedback loop is what creates the thing we call consciousness.
artsybashev t1_jdmpwwd wrote
Reply to comment by danielbln in [R] Reflexion: an autonomous agent with dynamic memory and self-reflection - Noah Shinn et al 2023 Northeastern University Boston - Outperforms GPT-4 on HumanEval accuracy (0.67 --> 0.88)! by Singularian2501
The fluffy, overly complex writing around your main message has worked as a barrier or prefilter that screens out bad job candidates or unqualified contributions to scientific discussion. LLMs are destroying this part. It will be interesting to see what this leads to.
artsybashev t1_jdlml1f wrote
Reply to comment by nekize in [R] Reflexion: an autonomous agent with dynamic memory and self-reflection - Noah Shinn et al 2023 Northeastern University Boston - Outperforms GPT-4 on HumanEval accuracy (0.67 --> 0.88)! by Singularian2501
Sounds like we need an LLM to generate padding for academia and an LLM to write the TL;DR for the readers. The world is dumb.
artsybashev t1_jd6l85h wrote
Reply to comment by maizeq in [R] SPDF - Sparse Pre-training and Dense Fine-tuning for Large Language Models by CS-fan-101
Nvidia has structured sparsity.
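For a rough illustration, here is a minimal sketch (my own, in PyTorch; not Nvidia's API) of the 2:4 pattern their sparse tensor cores accelerate: in every group of four weights, the two smallest-magnitude entries are zeroed.

```python
import torch

def prune_2_4(weight: torch.Tensor) -> torch.Tensor:
    # Zero the 2 smallest-magnitude values in every group of 4,
    # yielding the 2:4 structured-sparsity pattern.
    w = weight.reshape(-1, 4)
    drop = w.abs().topk(2, dim=1, largest=False).indices
    mask = torch.ones_like(w, dtype=torch.bool)
    mask.scatter_(1, drop, False)
    return (w * mask).reshape(weight.shape)

w = torch.randn(4, 8)
print(prune_2_4(w))  # exactly two nonzeros per group of four
```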
artsybashev t1_j9puq9o wrote
Reply to comment by levand in Why bigger transformer models are better learners? by begooboi
It is in a way the same phenomenon. If you think about the information in images, overfitting means starting to learn even the noise patterns in the images. If your training data does not contain enough real information to fill the model's capacity, the model will start to learn noise and overfit to your data.
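A toy sketch of that failure mode (my own illustration, assuming PyTorch): an overparameterized model drives training loss to near zero on pure-noise labels, i.e. it spends its capacity memorizing noise.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(32, 1)
y = torch.randn(32, 1)  # labels are pure noise: there is no real signal

# Far more capacity than the 32 points contain information for.
model = nn.Sequential(nn.Linear(1, 256), nn.ReLU(), nn.Linear(256, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

for _ in range(2000):
    loss = nn.functional.mse_loss(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

print(loss.item())  # near zero: the model has memorized the noise
```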
artsybashev t1_j8i33cp wrote
Reply to comment by Appropriate_Ant_4629 in GPU comparisons: RTX 6000 ADA vs Hopper h100 by N3urAlgorithm
Yeah, might be. I've only seen companies do machine learning in two ways. One is to rent a cluster of GPUs and train something big for a week or two to explore something interesting. The other pattern is to retrain a model every week with fresh data. Maybe that is the case for OP: retraining a model each week and serving it with some cloud platform. It makes sense to build a dedicated instance for a recurring task if you know there will be a need for it for more than a year. I guess it is also cheaper than using the upfront payment option in AWS.
artsybashev t1_j8e2dmj wrote
Reply to comment by N3urAlgorithm in GPU comparisons: RTX 6000 ADA vs Hopper h100 by N3urAlgorithm
I understand that you have given up hope for the cloud. Just so you understand the options: $50k gives you about 1000 days of 4x A100 from vast.ai at today's pricing. Since there will be at least one new hardware generation within 3 years, you will probably get more like 6 years of 4x A100, or one year of 4x A100 plus one year of 4x H100. Keeping your rig at 100% utilization for 3 years might be hard if you plan to have holidays.
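Back-of-the-envelope check of that number (my own sketch; the ~$0.52/GPU-hour spot price is an assumption chosen to match the estimate, not a current quote):

```python
budget = 50_000              # USD
gpus = 4
price_per_gpu_hour = 0.52    # USD, assumed vast.ai spot price per A100
daily_cost = gpus * price_per_gpu_hour * 24
print(f"${daily_cost:.2f}/day -> {budget / daily_cost:.0f} days")
# ~$49.92/day -> ~1002 days, i.e. roughly 1000 days of continuous 4x A100
```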
artsybashev t1_j7k04qr wrote
Reply to comment by ginger_beer_m in [N] Google: An Important Next Step On Our AI Journey by EducationalCicada
If Xi Jinping, Putin, and Trump have taught you anything, it is that being correct is absolutely useless. Just having some sort of a plan, coming up with a good story, and making some fact-sounding arguments is a lot more valuable than the average person thinks. Nothing more is required to be one of the most influential people alive.
artsybashev t1_j5jj7fi wrote
Reply to comment by hiptobecubic in [D] Couldn't devs of major GPTs have added an invisible but detectable watermark in the models? by scarynut
no
artsybashev t1_j5fhyex wrote
Reply to comment by EmmyNoetherRing in [D] Couldn't devs of major GPTs have added an invisible but detectable watermark in the models? by scarynut
Infowars gets a new meaning in 10 years
artsybashev t1_j5fhioy wrote
Reply to comment by conchoso in [D] Couldn't devs of major GPTs have added an invisible but detectable watermark in the models? by scarynut
Yeah, it is easier to modify the color of a pixel than characters in a text in a way that humans do not detect. Something can be done through typos, weird word choices, or calculating a checksum of word choices, but those methods can easily sound unnatural to human readers.
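A toy version of the "checksum of word choices" idea (my own sketch in the spirit of green-list watermarking; not any vendor's real scheme): hash each adjacent word pair into a "green" or "red" bucket, have the generator prefer green choices, and detect by counting the green fraction.

```python
import hashlib

def is_green(prev_word: str, word: str) -> bool:
    # Deterministic "checksum" of a word choice, keyed on the previous word.
    h = hashlib.sha256(f"{prev_word}|{word}".encode()).digest()
    return h[0] % 2 == 0  # roughly half of all possible choices are green

def green_fraction(text: str) -> float:
    words = text.lower().split()
    hits = [is_green(a, b) for a, b in zip(words, words[1:])]
    return sum(hits) / max(len(hits), 1)

# Unwatermarked text hovers near 0.5; a generator that prefers green
# words would push this fraction well above 0.5, which is detectable.
print(green_fraction("the quick brown fox jumps over the lazy dog"))
```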
artsybashev t1_j5fh1m8 wrote
Reply to comment by EmmyNoetherRing in [D] Couldn't devs of major GPTs have added an invisible but detectable watermark in the models? by scarynut
I wonder if in 50 years LLMs will be able to produce "viruses" that cause problems in competing models. Like one AI hacking another AI by injecting disruptive training data into the enemy's training procedure.
artsybashev t1_j5fgnjm wrote
Reply to comment by EmmyNoetherRing in [D] Couldn't devs of major GPTs have added an invisible but detectable watermark in the models? by scarynut
So they are effectively polluting the public space with their AI output, which only they have the tool to detect. Smells like anti-competitive behavior to me. This potentially makes competing teams' models worse, since they will be eating the shit that GPT-3 pushes out.
artsybashev t1_j2v9lx2 wrote
Reply to comment by currentscurrents in [R] Massive Language Models Can Be Accurately Pruned in One-Shot by starstruckmon
If you believe in the singularity, at some point we reach a loop where "AI" creates better methods to run calculations, which it then uses to build better "AI". In a way that is already happening, but once that loop gets faster and more autonomous, it can reach a balance where the development is "optimally" fast.
artsybashev t1_j2suada wrote
Reply to comment by yahma in [R] Massive Language Models Can Be Accurately Pruned in One-Shot by starstruckmon
An A100 can hold about 75B parameters in 8-bit. With pruning that is doable, but it won't be quite the same perplexity.
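The memory math behind that figure (my own sketch; the ~7% overhead allowance is an assumption, and activations/KV cache are ignored):

```python
vram_gb = 80            # A100 80GB
bytes_per_param = 1     # int8 quantization: one byte per parameter
overhead = 0.93         # leave ~7% headroom for buffers (assumption)
max_params_b = vram_gb * overhead / bytes_per_param
print(f"~{max_params_b:.0f}B parameters fit in weights alone")  # ~74B
```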
artsybashev t1_j1ph7f3 wrote
Reply to comment by AltruisticNight8314 in [D] Running large language models on a home PC? by Zondartul
One A100 80GB will get you started with models in the 500M-15B range. You can rent that for about $50 per day. See where that takes you in a week.
artsybashev t1_j1c7pzh wrote
Reply to comment by step21 in [D] When chatGPT stops being free: Run SOTA LLM in cloud by _underlines_
A lot of stuff can be run locally with `git clone ...` and `docker compose up`.
artsybashev t1_j154fhy wrote
Reply to comment by caedin8 in [D] Running large language models on a home PC? by Zondartul
That is just the inference. Training requires more like 100x A100 and a cluster to train on. Just a million dollars to get started.
artsybashev t1_izoujfv wrote
Reply to comment by new_name_who_dis_ in [P] I made a command-line tool that explains your errors using ChatGPT (link in comments) by jsonathan
Yeah. A lot of the time I get a better answer from ChatGPT, but you really need to take its responses with a grain of salt.
artsybashev t1_izorw1t wrote
Reply to comment by _poisonedrationality in [P] I made a command-line tool that explains your errors using ChatGPT (link in comments) by jsonathan
Yeah, it is annoyingly, confidently wrong. Even when you point out its mistake, it might try to explain as if no mistakes were made. Sometimes it admits that there was a mistake. From a coworker this would be really annoying behaviour.
artsybashev t1_iw29zh1 wrote
A lot of deep learning has been the modern equivalent of witchcraft: just some ideas that might make sense, squashed together.
Hyperparameter tuning is one of the most obscure and hardest-to-learn parts of neural network training, since it is hard to do multiple runs for models that take more than a few weeks or thousands of dollars to train. Most researchers have just learned some good initial guesses and might run the model with a few sets of hyperparameters, from which the best result is chosen.
Some of the hyperparameter tuning can also be done with a smaller model, and the amount of tuning can be reduced while growing the model to the target size, as in the sketch below.
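A minimal sketch of that proxy-model idea (my own illustration in PyTorch; whether the chosen hyperparameters transfer to the large model is exactly the open question):

```python
import torch
import torch.nn as nn

def train_proxy(lr: float, hidden: int = 32, steps: int = 200) -> float:
    # Train a small stand-in model and report its final loss.
    torch.manual_seed(0)
    x = torch.randn(256, 8)
    y = x.sum(dim=1, keepdim=True)  # simple learnable target
    model = nn.Sequential(nn.Linear(8, hidden), nn.ReLU(), nn.Linear(hidden, 1))
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        loss = nn.functional.mse_loss(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return loss.item()

# Sweep cheaply on the proxy, then reuse the winner for the expensive run.
best_lr = min([1e-4, 1e-3, 1e-2], key=train_proxy)
print("best lr on the small proxy:", best_lr)
```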
artsybashev t1_iv5vgsp wrote
Reply to comment by PicnicBasketPirate in New member! Been lurking for a while! Just bought my daughter her first small gumbo pot! It’s a vintage Magnalite 5qt! These puppies last forever! by cindy_lou_who_1982
The discussion section is definitely worth reading. Apparently aluminum is OK as long as the levels stay below the limits of your kidneys' ability to get rid of it.
That is around 0.1 mg per day of aluminum getting into your bloodstream. The bioavailability from different sources is a bit complicated to figure out. The main sources are probably deodorants, vaccines, cookware, processed food, drinking water, drugs, sun lotions, and exposure in some occupations.
artsybashev t1_jefp0o2 wrote
Reply to comment by Sopel97 in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Yes, the only thing they can do is ban you from their service.