jobeta t1_j0yw8sh wrote on December 20, 2022 at 1:12 PM

New to this: Are there some labelled datasets for sarcasm?

Business-Ad6451 OP t1_j0ywigc wrote on December 20, 2022 at 1:14 PM

Yes. Available on Kaggle.

CMDRJohnCasey t1_j0z0e6h wrote on December 20, 2022 at 1:49 PM

I'm not sure about the quality of these datasets. As in: outside of the challenge/shared task, they are worth nothing.

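The reason is that, to assemble data for the negative (non-sarcasm) set, they usually resort to data that are clearly distinguishable from the sarcastic examples either by style (news vs. non-news) or by topic (e.g. politics-related vs. non-politics-related). A classifier can then score well by learning the source or topic rather than sarcasm itself.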
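One rough way to check a dataset for this kind of leakage is to look at which tokens most strongly separate the two classes. The sketch below is only an illustration: the function name `top_label_tokens` is hypothetical, the four example texts are invented toy data (not taken from any real dataset), and smoothed log-odds is just one simple choice of score. If the top tokens are topical or stylistic rather than anything related to sarcasm, the labels are probably separable by source.

```python
import math
from collections import Counter


def top_label_tokens(texts, labels, k=2):
    """Rank tokens by add-one smoothed log-odds of class 1 (sarcastic) vs. class 0.

    Returns (tokens most tied to class 0, tokens most tied to class 1).
    """
    counts = {0: Counter(), 1: Counter()}
    for text, label in zip(texts, labels):
        counts[label].update(text.lower().split())

    vocab = set(counts[0]) | set(counts[1])
    # Add-one smoothing so unseen tokens do not produce a zero probability.
    totals = {c: sum(counts[c].values()) + len(vocab) for c in (0, 1)}

    def log_odds(token):
        p1 = (counts[1][token] + 1) / totals[1]
        p0 = (counts[0][token] + 1) / totals[0]
        return math.log(p1 / p0)

    # Sort by score, breaking ties alphabetically so the output is deterministic.
    ranked = sorted(vocab, key=lambda t: (log_odds(t), t))
    return ranked[:k], ranked[-k:][::-1]


# Invented toy data: the negatives are news-like, the positives are casual remarks.
texts = [
    "senate passes budget bill",
    "senate delays vote on budget",
    "oh great another monday",
    "great i just love traffic",
]
labels = [0, 0, 1, 1]

non_sarcastic_tokens, sarcastic_tokens = top_label_tokens(texts, labels)
print(non_sarcastic_tokens, sarcastic_tokens)
```

In this toy case the tokens that best identify the negative class are political vocabulary, which is exactly the shortcut a model could exploit without learning anything about sarcasm.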

Some forms of sarcasm can be detected from the text alone (e.g. hyperbole), but others are completely indiscernible without knowing the author's context: if I said "I love Sundays", you would need to know something about my Sundays to tell whether I'm being sarcastic or not.

AnyString3053 t1_j0yx2s6 wrote on December 20, 2022 at 1:20 PM

Can you share the link, please?

Business-Ad6451 OP t1_j0z03e7 wrote on December 20, 2022 at 1:46 PM

https://www.kaggle.com/datasets/rmisra/news-headlines-dataset-for-sarcasm-detection

RageOnGoneDo t1_j0z51t9 wrote on December 20, 2022 at 2:26 PM

google.com/