suflaj t1_j3ek7d0 wrote on January 8, 2023 at 12:28 AM

Reply to comment by currentscurrents in [D] Will NLP Researchers Lose Our Jobs after ChatGPT? by singularpanda

> This is a complex and poorly-defined task

Not at all. First of all, ChatGPT does not understand complexity; it would do you well not to think of tasks as sitting on some hierarchy of difficulty for it. Secondly, there is no requirement that the task be well defined. From what I could gather, you only need to convince ChatGPT that it is not giving out an opinion, and then it can hallucinate pretty much anything.

Specifically the task you gave it is likely implicitly present in the dataset, in the sense that the dataset allowed the model to learn the connections between the words you gave it. I hate to break your bubble, but the task is also achievable even with GPT2, a much less expressive model, since it can be represented as a prompt.

It will be easier to see the shortcomings there, but to put it simply, ChatGPT also has them. For example, by default it does not, in the general case, differentiate between uppercase and lowercase letters, even when that is relevant to the task. Such things are too subtle for it. Once you realize the biases it has in this regard, you begin to see through the cracks. Likewise, once you give it a counting task, it says it can count, but it does not always succeed.
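One way to check claims like these is to compute the ground truth yourself and compare it with what the model says. A minimal sketch in Python; the function name `count_char` and the sample sentence are made up for illustration, and nothing here calls ChatGPT:

```python
def count_char(text: str, ch: str, case_sensitive: bool = True) -> int:
    """Count occurrences of a single character, optionally ignoring case."""
    if len(ch) != 1:
        raise ValueError("ch must be a single character")
    if not case_sensitive:
        text, ch = text.lower(), ch.lower()
    return text.count(ch)


# Hypothetical sample text; paste the same text into the model and ask it
# how many uppercase "A" characters it contains.
sample = "An apple A day keeps Anna away"

exact = count_char(sample, "A")                          # uppercase only
folded = count_char(sample, "A", case_sensitive=False)   # any case

print(exact, folded)
```

If the model reports the case-folded count when asked for uppercase only, it ignored case; if it reports neither number, it miscounted.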

What is fascinating is the amount of memory ChatGPT has. Compared to other models, it is very large. But it is limited, and it is not preserved outside of the session.
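To make "limited and not preserved outside of the session" concrete, here is a toy sketch of how a chat front end could keep context. It is an assumption for illustration, not how ChatGPT is actually implemented: the `Session` class, the word-based budget, and the `max_words` value are all made up.

```python
from collections import deque


class Session:
    """Toy chat session: keeps recent turns within a fixed word budget."""

    def __init__(self, max_words: int = 50):
        self.max_words = max_words  # hypothetical stand-in for a token limit
        self.turns = deque()

    def add(self, speaker: str, text: str) -> None:
        self.turns.append(f"{speaker}: {text}")
        # Drop the oldest turns once the budget is exceeded.
        while self._size() > self.max_words and len(self.turns) > 1:
            self.turns.popleft()

    def _size(self) -> int:
        return sum(len(turn.split()) for turn in self.turns)

    def prompt(self) -> str:
        """Everything the model would see for the next reply."""
        return "\n".join(self.turns)


first = Session(max_words=12)
first.add("user", "my name is Ana")
first.add("bot", "nice to meet you Ana")
first.add("user", "what is the capital of France")

second = Session(max_words=12)  # a new session starts with no history
```

After the third turn, the earliest turns have fallen out of `first` because the budget was exceeded, and `second` never saw them at all.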

I would say that the people hyping it up probably just do not understand it that well. LLMs are fascinating, yes, but what is fascinating is not ChatGPT specifically; it is how malleable the knowledge is. I would advise you to not understand it, because then the magic stays alive. I had a lot of fun for the first week when I was using it, but I never even use it nowadays.

I would also advise you to approach it more critically. I would advise you to first look into how blatantly racist and sexist it is. With that, you can see the reflection of its creators in it. And most of all, I would advise you to focus on its shortcomings. They are easy to find once you start talking to it more like you'd talk with a friend. They will help you use it more effectively.

currentscurrents t1_j3emas4 wrote on January 8, 2023 at 12:43 AM

>I hate to break your bubble, but the task is also achievable even with GPT2

Is it? I would love to know how. I can run GPT2 locally, and that would be a fantastic level of zero-shot learning to be able to play around with.

I have no doubt you can fine-tune GPT2 or T5 to achieve this, but in my experience they aren't nearly as promptable as GPT3/ChatGPT.

>Specifically the task you gave it is likely implicitly present in the dataset, in the sense that the dataset allowed the model to learn the connections between the words you gave it

I'm not sure what you're getting at here. It has learned the connections and meanings between words of course, that's what a language model does.

But it still followed my instructions, and it can follow a wide variety of other detailed instructions you give it. These tasks are too specific to have been in the training data; it is successfully generalizing zero-shot to new NLP tasks.

suflaj t1_j3emtbh wrote on January 8, 2023 at 12:47 AM

> I would love to know how to do this! I can run GPT2 locally, and that would be a fantastic level of zero-shot learning to be able to play around with.

It depends on how much you can compress the prompts. GPT2 is severely limited by memory. This means that you would need to train it on already condensed prompts. But in reality, it has the same (albeit not as refined) capabilities as ChatGPT.

> But it still followed my instructions

Well, it turns out that following instructions can be reduced to a symbol manipulation task. Again, you're giving it too much credit. I do agree that it is wide, but it is not as wide as Google or Wikipedia, which would represent humanity I guess.

> it is successfully generalizing zero-shot to new NLP tasks.

As are lesser models. Transformer-based models are fairly successful at it; we have hypothesized this since GPT2 and confirmed it with GPT3. But one thing: technically it generalized few-shot to a new NLP task. On zero-shot problems it generally hallucinates or states that it doesn't know. Ask it, for example, what a "gebutzeripanim" is. I made that up just now.

As for the task you gave it, you cannot claim it is zero-shot, as you cannot prove its components were not in the training data. Unless you want to say that you're pretty sure the prompt you gave it was not in the training data, but hey, that can apply to all generative models; that's what generalization is. But there are tasks it fails on because it just cannot do some things. Ask it to integrate or differentiate certain functions and you'll quickly see what I mean.

It can tell you all you want to know about integration, it can tell you all the rules perfectly, but it simply cannot apply them as well.
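A claimed derivative is easy to sanity-check numerically, so you do not have to take the model's word for it. A minimal sketch using only the standard library; `check_derivative` and the two candidate answers are hypothetical, not actual ChatGPT output:

```python
import math


def check_derivative(f, claimed, points, h=1e-5, tol=1e-4):
    """Compare a claimed derivative with a central finite difference."""
    for x in points:
        numeric = (f(x + h) - f(x - h)) / (2 * h)
        if abs(numeric - claimed(x)) > tol * max(1.0, abs(numeric)):
            return False
    return True


def f(x):
    return x * math.sin(x)


# Two hypothetical candidate answers for d/dx [x * sin(x)].
def right(x):
    return math.sin(x) + x * math.cos(x)   # product rule applied


def wrong(x):
    return math.cos(x)                     # product rule forgotten


points = [0.5, 1.0, 2.0]
print(check_derivative(f, right, points))
print(check_derivative(f, wrong, points))
```

The same idea works for integrals: differentiate the claimed antiderivative numerically and compare it with the original function.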

blose1 t1_j3j5vpp wrote on January 8, 2023 at 10:59 PM

>Ask it, for an example, what a "gebutzeripanim" is. I made that up just now.

Q: what gebutzeripanim is?

A: I'm sorry, but I am unable to find any information about "gebutzeripanim." Could you please provide more context or clarify what you are asking about?

suflaj t1_j3ld7gy wrote on January 9, 2023 at 10:38 AM

Exactly. It's not exactly meant to be a zero-shot model, but rather a conversational agent. If it were a zero-shot model, here is what it could say:

gebutzeripanim is a new phrase for me
based on its start it sounds german
based on its middle it sounds vaguely korean
based on its end it sounds vaguely turkish
it might be made up slang

But in conversation you wouldn't just start decomposing stuff you never heard about. Here is what it says if you try to fish out these answers:

I'm sorry, but I'm not familiar with the term "gebutzeripanim." Could you provide more context or clarify what you are referring to?
Without more context or information about the word, it is difficult for me to determine whether it has a connection to the German language. "Gebutzeripanim" is not a word that I am familiar with, and a search of online resources did not yield any results. It is possible that this is a made-up or invented word, or that it is used in a specific context that is not familiar to me.
Without more context or information about the word, it is difficult for me to determine whether it has a connection to the Korean language. "Gebutzeripanim" is not a word that I am familiar with, and a search of online resources did not yield any results. It is possible that this is a made-up or invented word, or that it is used in a specific context that is not familiar to me. It is also possible that the word includes elements from multiple languages or that it is a hybrid of several languages. Without more information, it is difficult to say for sure.
Without more context or information about the word, it is difficult for me to determine whether it has a connection to the Turkish language. "Gebutzeripanim" is not a word that I am familiar with, and a search of online resources did not yield any results. It is possible that this is a made-up or invented word, or that it is used in a specific context that is not familiar to me. It is also possible that the word includes elements from multiple languages or that it is a hybrid of several languages. Without more information, it is difficult to say for sure.
It is not possible for me to determine whether "gebutzeripanim" is a made-up or invented word, as I do not have sufficient information about the word. It is possible that "gebutzeripanim" is a slang term that is not widely known or used, or that it is a word that has been created recently and has not yet gained widespread recognition. Without more context or information, it is difficult to say for sure what the word means or how it is used.

It mostly bases its answers on the fact that it doesn't know. And given that it uses past answers as part of the prompt, you can see that from the second time I ask it about a language, it suggests the word could be made up of several languages, not because it understands that, but only because I mentioned it.

If you ask it in a new session whether it is made out of words or phrases from several languages, it answers with

> I'm sorry, but I am unable to find any information about a word spelled "gebutzeripanim." It is possible that this is a made-up word or a word from a language that I am not familiar with. Can you provide any context or additional information about the word that might help me to better understand it?

Since it basically needs to explicitly see things in training, it's not really a zero-shot, but rather a few-shot model. There are instances where it seems like it can connect the dots but you can't really say it happens in the general case...