Cheap_Meeting t1_j973fm1 wrote on February 19, 2023 at 7:30 PM

Reply to [D] Is Google a language transformer like ChatGPT except without the G (Generative) part? by Lets_Gooo_123

I went ahead and asked ChatGPT your question for you:

No, Google is not a language transformer like ChatGPT. While Google has developed language models like BERT and GPT-3, it is primarily a search engine that uses various algorithms to deliver search results to users. Google's language models are used to improve search results and to power services like Google Assistant, but they are not the primary focus of the company.

As for ChatGPT, it is true that its main purpose is to generate human-like text based on prompts provided by users. However, it is not just a search engine that can talk back. It is a complex machine learning model that has been trained on vast amounts of text data and uses advanced natural language processing techniques to generate responses.

While it is true that the development of more algorithms and computing power is necessary for further advancements in AI, ChatGPT and other similar models have already made significant strides in the field of natural language processing. They have the potential to be used in a wide range of applications, including language translation, content creation, and customer service. However, it is unlikely that they will completely replace human jobs, as they are still limited by their inability to understand and reason about the world in the way that humans do.

Cheap_Meeting t1_j972cc6 wrote on February 19, 2023 at 7:22 PM

Reply to [R] [N] In this paper, we show how a conversational model, 3.5x smaller than SOTA, can be optimized to outperform the baselines through Auxiliary Learning. Published in the ACL Anthology: "Efficient Task-Oriented Dialogue Systems with Response Selection as an Auxiliary Task." by radi-cho

First author is a high-school student. Impressive.

Cheap_Meeting t1_j7j70tj wrote on February 7, 2023 at 4:17 AM

Reply to comment by MysteryInc152 in [D] List of Large Language Models to play with. by sinavski

That's not my takeway. GLM-130B is even behind OPT according to the mean win rate, and the instruction tuned version of OPT in turn is worse than FLAN-T5 which is a 10x smaller model (https://arxiv.org/pdf/2212.12017.pdf Table 14)

Cheap_Meeting t1_j7exknk wrote on February 6, 2023 at 8:06 AM

Reply to [D] Is English the optimal language to train NLP models on? by MrOfficialCandy

English due to data availability.

Cheap_Meeting t1_j7cit9i wrote on February 5, 2023 at 7:58 PM

Reply to comment by MysteryInc152 in [D] List of Large Language Models to play with. by sinavski

Are any benchmark scores such as MMLU or BigBench available for Aleph Alpha's models?

Cheap_Meeting t1_j7chivx wrote on February 5, 2023 at 7:49 PM

Reply to [D] List of Large Language Models to play with. by sinavski

In terms of Consumer Apps, the Poe app from Quora has access to two models from Open AI and one from Anthropic.

Perplexity.ai, YouChat and Neeva are search engines that integrated LLMs.

Google has an AI + Search Event on Wednesday where they are likely to announce something as well.

In terms of APIs and getting a feeling for these models, I would use OpenAI's APIs. Their models are the best publically available models. Open Source models are still far behind.

Cheap_Meeting t1_j7bsnoo wrote on February 5, 2023 at 5:03 PM

Reply to comment by kaiser_xc in [N] GitHub CEO on why open source developers should be exempt from the EU’s AI Act by EmbarrassedHelp

No, what u/mulokisch is saying is that the cookie disclaimers are unrelated to GDPR.

Cheap_Meeting t1_j3uo0s9 wrote on January 11, 2023 at 4:23 AM

Reply to comment by [deleted] in [D] Found very similar paper to my submitted paper on Arxiv by [deleted]

Also, you might want to tell them that you thought their paper was really well executed and if they would be willing to chat and if there are internship opportunities on their team (if you are interested in that).

Cheap_Meeting t1_j3unrox wrote on January 11, 2023 at 4:21 AM

Reply to comment by ktpr in [D] Found very similar paper to my submitted paper on Arxiv by [deleted]

It's a preprint server, meant to publish preprints.

Cheap_Meeting t1_j3unjt1 wrote on January 11, 2023 at 4:19 AM

Reply to comment by ASuarezMascareno in [D] Found very similar paper to my submitted paper on Arxiv by [deleted]

We can't afford that in Machine Learning by the time your paper has passed the review process it's going to be outdated.

Cheap_Meeting t1_j3ulqr7 wrote on January 11, 2023 at 4:05 AM

Reply to [D] Options for data scientist who wants more social interaction by pseudoanonyme

Data scientist is not a very well-defined job title. It can involve more or less social interaction depending on the company.

But you could get more social interaction by doing pair programming with your coworkers or asking to be assigned more work involving social interaction, e.g. talking to clients or mentoring new coworkers.

You could eventually transition into another career path such as people management, education, project management, program management, sales, etc.

Cheap_Meeting t1_j21v6mi wrote on December 29, 2022 at 1:24 AM

Reply to comment by All-DayErrDay in [D] DeepMind has at least half a dozen prototypes for abstract/symbolic reasoning. What are their approaches? by valdanylchuk

I think the main limitations of LLMs are:

Hallucinations: They will make up facts.
Alignment/Safety: They will sometimes give undesirable outputs.
"Honesty": They cannot make reliable statements about their own knowledge and capabilities.
Reliability: They can perform a lot of tasks, but often not reliably.
Long-context (& lack of memory): They cannot (trivially) be used if the input size exceeds the context length.
Generalization: They often require task-specific finetuning or prompting.
Single modality: They cannot easily perform tasks on audio, image, video.
Input/Output paradigm: It is unclear on how to use them for tasks which don't have a specific inputs and outputs (e.g. tasks which require taking many steps).
Agency: LLMs don't act as agents which have their own goals.
Cost: Both training and inference incur significant cost.

Cheap_Meeting t1_j21q96n wrote on December 29, 2022 at 12:48 AM

Reply to comment by ReasonablyBadass in [D] DeepMind has at least half a dozen prototypes for abstract/symbolic reasoning. What are their approaches? by valdanylchuk

Yes, they are trained on a much larger amount of language data than a human sees in their lifetime.

However, I would argue that it's a worthwhile trade-off. Computers can more easily ingest a large amount of data. Humans get feedback from the environment (like their parents), can cross-reference different modalities, and have inductive biases.

Cheap_Meeting t1_j0s0eiq wrote on December 18, 2022 at 11:47 PM

Reply to comment by iamgianluca in [D] Will there be a replacement for Machine Learning Twitter? by MrAcurite

I tried using it but the user interface is so much worse than Twitter. Any tips?

Cheap_Meeting t1_izuygjv wrote on December 12, 2022 at 1:06 AM

Reply to [P] LORA Dreambooth - fine-tune Stable diffusion models twice as faster than Dreambooth method, smaller model sizes 3-4 MBs by Illustrious_Row_9971

Did you do an evaluation of the quality impact?

Cheap_Meeting t1_iyhla19 wrote on December 1, 2022 at 1:39 PM

Reply to [D] Cloud providers for hobby use by gyurisc

Colab

Cheap_Meeting t1_iyhl471 wrote on December 1, 2022 at 1:38 PM

Reply to comment by C0hentheBarbarian in OpenAI ChatGPT [R] by Sea-Photo5230

I actually think prompt engineering is becoming less and less important with things such as ChatGPT.

Cheap_Meeting t1_iyhkxds wrote on December 1, 2022 at 1:36 PM

Reply to comment by Sea-Photo5230 in OpenAI ChatGPT [R] by Sea-Photo5230

That now they can collect data to make it even better.

Cheap_Meeting t1_iybzc8v wrote on November 30, 2022 at 6:54 AM

Reply to comment by ThisIsMyStonerAcount in [D] I'm at NeurIPS, AMA by ThisIsMyStonerAcount

Did you hear about any parties on Thursday. Apparently Google, OpenAI and DeepMind are all tomorrow.

Cheap_Meeting t1_ixplzje wrote on November 25, 2022 at 8:19 AM

Reply to [D] Informal meetup at NeurIPS next week by tlyleung

I'm down.

Cheap_Meeting t1_ixnt905 wrote on November 24, 2022 at 9:44 PM

Reply to [P] Stable Diffusion 2.0 Announcement by hardmaru

This reads like an announcement for the release of a traditional piece of software. It would be nice if you could instead publish some metrics such as FID or ideally side-by-side human evaluation against SD 1.5 / DALLE-2.

One of the best things about the machine learning community is that we have been taking a rational metrics-driving approach. I hope that as ML gets more and more real-world use cases, and both open-source and commercial applications that are not tied to academic research become more prevalent, we don't lose that.

Cheap_Meeting t1_ixn5wlc wrote on November 24, 2022 at 6:47 PM

Reply to comment by OnlyInspector4654 in [P] Stable Diffusion 2.0 Announcement by hardmaru

I'm not sure if that question was directed at you specifically.

Cheap_Meeting t1_ix56uu6 wrote on November 20, 2022 at 8:58 PM

Reply to comment by erannare in [R] Tips on training Transformers by parabellum630

>but the transformer is stuck and loss doesn't decrease after abt 20k steps.

Presumably they meant training loss, which would indicate that this is an optimization problem.

Cheap_Meeting t1_ix56atv wrote on November 20, 2022 at 8:54 PM

Reply to [R] Tips on training Transformers by parabellum630

I disagree with some of the other advice here. I would suggest starting with something that you know works. That means you could either use a training setup from another modality such as vision or text and apply it to your data, or you could try to reproduce a result from the literature first.

Cheap_Meeting t1_ix520oo wrote on November 20, 2022 at 8:26 PM

Reply to comment by AlexGrinch in [D] Why do we train language models with next word prediction instead of some kind of reinforcement learning-like setup? by blazejd

Rereading my own comment, it could have been phrased better. Let me try again:

I think you are taking OP's question too literally. At least as I understand it, the intent of OP's question was: "Why are self-supervised autoregressive models the predominant form of generative models for language? Intuitively it would seem that the training process should be closer to how humans learn language."