Submitted by Business-Ad6451 t3_zqhhdc in MachineLearning
[removed]
I detect sarcasm.
Oh, really?
I think he's just a tiny bit skeptical, considering that's one of the biggest challenges in NLP. Probably thousands of people have tried it already, and even GPT-3 doesn't seem to ace sarcasm yet.
Oh I am sure it’s a trivial task to solve…
Considering the average Reddit user routinely fails to detect sarcasm, I think the bots are quite a ways away.
Eh, disagree. It's called "artificial intelligence", not "artificial Dunning-Kruger effect".
To be fair, ChatGPT very confidently bullshits about everything, even about 2+2 being equal to 3. But I agree, AI being able to detect sarcasm shouldn't be far away; it just definitely won't be solved by BERT.
That's amusing because his comment is written in a way that is often sarcastic, but the comment is literally true: it absolutely would be useful to have such a bot.
So, was he really being sarcastic? Good luck to you and your bot :P
Well aren't you a smart little bot.
200% precision!
I mean. It could be if it always has context on every domain area.
Better semantic search can help solve this problem, since it lets us augment a project like this with an external knowledge base. At Marqo (the startup I work for), we created a demo where GPT provides up-to-date news summarisation by using Marqo as a knowledge base:
https://medium.com/creator-fund/building-search-engines-that-think-like-humans-e019e6fb6389
This could be applied to op's project. You can visit Marqo: https://github.com/marqo-ai/marqo
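For anyone curious what that looks like in practice, here is a generic sketch of the retrieve-then-prompt pattern the demo describes. This is not Marqo's actual API (see the repo above for that); it assumes sentence-transformers is installed and uses a toy two-document knowledge base:

```python
from sentence_transformers import SentenceTransformer, util

# Generic retrieve-then-prompt pattern: embed a small knowledge base,
# find the passage closest to the question, and prepend it to the prompt
# that would be sent to GPT (or any other LLM).
model = SentenceTransformer("all-MiniLM-L6-v2")

knowledge_base = [
    "Sarcasm often inverts the literal sentiment of a statement.",
    "The post asks about building a sarcasm classifier for Reddit comments.",
]
kb_embeddings = model.encode(knowledge_base, convert_to_tensor=True)

question = "What is the post asking about?"
query_embedding = model.encode(question, convert_to_tensor=True)
hits = util.semantic_search(query_embedding, kb_embeddings, top_k=1)

context = knowledge_base[hits[0][0]["corpus_id"]]
prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"
print(prompt)  # this augmented prompt is what you'd hand to the LLM
```

The point is just that the LLM never has to memorise the knowledge base; you can swap in fresh documents at query time.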
Hello.
Sarcasm is algorithmically challenging. It is an antithetic form of human expression. You have to take into account the phenomenon of linguistic ellipsis, which means that words, phrases and clauses are understood via world knowledge and pragmatics. As you have probably researched, typical ML implementations produce average results. Before going into the specs of the embeddings, I believe you have to check your dataset. There is a difference between a headlines dataset produced by publishers and other forms of short text, like tweets, that are user-generated content. You have to think about how intended sarcasm, perceived sarcasm, irony, hashtags, emoticons and other written linguistic expressions present in the domain of sentiment analysis shape the problem. It is very interesting to see how an LLM performs on this task. I hope you make progress.
If you ever need a data set, I’ll happily donate my mother.
If you’re worried about the dimensionality of the embeddings, why not do some dimensional reduction on them?
Because if I use SVD, the matrix I would get would have rank ≤ min{m, n}, assuming an m×n embedding matrix. But I want to reduce the 26000×768 matrix to 10000×768, which can't be done using SVD.
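For what it's worth, truncated SVD (or PCA) is normally used to shrink the 768-dimensional feature axis rather than the number of rows, so the row count stays at 26000. A minimal sketch with scikit-learn, using a random matrix as a stand-in for the real embeddings:

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

# Stand-in for a 26000 x 768 matrix of sentence embeddings.
embeddings = np.random.randn(26000, 768).astype(np.float32)

# Reduce the feature dimension (columns) from 768 to, say, 128;
# the number of rows (examples) stays 26000.
svd = TruncatedSVD(n_components=128, random_state=0)
reduced = svd.fit_transform(embeddings)

print(reduced.shape)  # (26000, 128)
print("explained variance:", svd.explained_variance_ratio_.sum())
```

Reducing the number of rows is a different problem (subsampling or clustering the examples), which is presumably why SVD doesn't fit what you want here.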
New to this: Are there some labelled datasets for sarcasm?
Yes. Available on Kaggle.
I'm not sure about the quality of these datasets. As in: outside of the challenge/shared task, they are worth nothing.
The reason is that, to assemble data for the negative (non-sarcasm) set, they usually resort to data that are clearly distinguishable either by style (news vs. non-news) or by topic (e.g. politics-related vs. non-politics-related).
Some forms of sarcasm can be detected (e.g. hyperbole), but others are completely indiscernible without knowing the author's context (if I said "I love Sundays", you need some context about my Sundays to tell whether I'm being sarcastic or not).
Can you share the link, please?
google.com/
Maybe 85% of detecting sarcasm involves knowing the speaker's actual opinion on the topic, so the listener can assess the low probability of earnestness.
The other 15% lies in the overly direct phrasing of the counterintuitive opinion.
So, where a speaker's earnestness is already unlikely: if they employ very clear verbiage, and especially emphatic punctuation or enthusiastic modifiers, it is more likely sarcasm than some other alternative (e.g., a changed opinion or an acknowledgment of nuance).
Clarity alone gives maybe a 60% chance; counterintuitive enthusiasm pushes it toward near certainty.
A vegan about a steak:
I. "Doesn't that look delicious!"
V.
Ii. "That actually kind of looks delicious...."
Both are counterintuitive statements for this speaker, but the emphasis and certainty of statement i., versus the hedged uncertainty of statement ii., make it clear which is more likely sarcastic.
Great question, by the way!
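A toy way to encode that intuition in code, purely illustrative (the trigger cues, weights and thresholds are made up for the example):

```python
import re

# Toy scorer for the heuristic above: given that we already know the
# statement contradicts the speaker's known stance (the "85%" part),
# emphatic surface cues (the "15%" part) push the sarcasm estimate up,
# while hedging pulls it back down.
EMPHATIC_MODIFIERS = {"absolutely", "totally", "so", "just", "really", "truly"}

def sarcasm_score(text: str, contradicts_known_stance: bool) -> float:
    if not contradicts_known_stance:
        return 0.1  # little reason to suspect sarcasm
    score = 0.6  # counterintuitive but plainly worded: roughly a coin flip
    if "!" in text:
        score += 0.2
    if any(w in text.lower().split() for w in EMPHATIC_MODIFIERS):
        score += 0.1
    if re.search(r"\.\.\.|kind of|sort of|actually", text.lower()):
        score -= 0.2  # hedging / uncertainty cuts the other way
    return max(0.0, min(1.0, score))

print(sarcasm_score("Doesn't that look delicious!", True))               # high
print(sarcasm_score("That actually kind of looks delicious....", True))  # lower
```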
This seems like an extremely difficult problem. Humans fail to recognize sarcastic journalism all the time; I expect only the original authors could tell for some of it.
(For instance, famous alleged-fraudster SBF has a lot of articles in places like the NYT which most readers think are "good press" for him, but I'm fairly sure are actually the journalists lowkey making fun of him.)
Yeah, sure you made one.
It's not gonna be a simple text classifier.
The issue isn't the sarcasm; the issue is in the definition of sarcasm and its parameters, because an actual person is putting those parameters on the sarcasm. Reddit, Facebook... any of these platforms intent on machine learning need to understand: are they pursuing mimicking machine growth, or human growth? And if it's human, then which side am I or anyone else pushing it toward; if machine, then what category? Be honest, is what I tell myself.
A good way to begin is to algorithmically ask a series of specific questions: for example, is it an end user or a bot user, what is the intent, the query, the command... After picking up keyword grouping pairs like "well that", the bot could automatically send the end user a clarification prompt ("clarify meaning"), roughly as in the sketch below.
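A minimal sketch of that clarification idea, with made-up trigger phrases and wording:

```python
from typing import Optional

# Toy version of the "ask for clarification" flow: if a comment contains a
# phrase that often signals sarcasm, reply with a clarification request
# instead of guessing the intent.
TRIGGER_PHRASES = ["well that", "oh great", "yeah right"]

def maybe_ask_clarification(comment: str) -> Optional[str]:
    lowered = comment.lower()
    if any(phrase in lowered for phrase in TRIGGER_PHRASES):
        return "Just to clarify: did you mean that literally or sarcastically?"
    return None  # no trigger phrase found, handle the comment as usual

print(maybe_ask_clarification("Well that went perfectly."))
print(maybe_ask_clarification("The deploy finished without errors."))
```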
Are you Sheldon Cooper?
Nah. Physics was never my strong suit.
Professor Frink, Professor Frink, he'll make you laugh, he'll make you think...
bUt I dOn'T tHiNk He CaN dEtEcT sArCaSiM
UnLeSs ThEy UsE /s
https://www.kaggle.com/datasets/danofer/sarcasm
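If anyone wants a quick baseline on a dataset like that, something along these lines works as a starting point (a sketch: the file name and the `comment`/`label` column names are assumptions, so check the actual schema on the Kaggle page):

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Assumed file and column names; adjust to the dataset's actual schema.
df = pd.read_csv("train-balanced-sarcasm.csv").dropna(subset=["comment"])

X_train, X_test, y_train, y_test = train_test_split(
    df["comment"], df["label"], test_size=0.2, random_state=42
)

vectorizer = TfidfVectorizer(max_features=50_000, ngram_range=(1, 2))
clf = LogisticRegression(max_iter=1000)

clf.fit(vectorizer.fit_transform(X_train), y_train)
print(classification_report(y_test, clf.predict(vectorizer.transform(X_test))))
```

A bag-of-words baseline like this mostly learns topic and style cues, which is exactly the dataset-artifact problem mentioned elsewhere in the thread, so treat its accuracy with suspicion.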
We presented a paper about this, and predicting winning jokes in games of Cards Against Humanity at EMNLP :)
"Cards Against AI: Predicting Humor in a Fill-in-the-blank Party Game"
A sarcasm detector. Boy, that’s useful.