The project aims to automatically detect sarcastic tweets using machine learning techniques. To do this, an annotated dataset is needed so that a model can learn which tweets are sarcastic and which are literal.
Label a sentence as sarcastic if you interpret it as sarcastic. Label it as not sarcastic if it does not contain sarcasm, or if the potential sarcasm is unclear without the tweet context.
The Cambridge Dictionary defines sarcasm as: "The use of remarks that clearly mean the opposite of what they say, made in order to hurt someone's feelings or to criticize something in a humorous way"
Some of the tweets contain the hashtag #sarcasm. It would be interesting to see how many of the tweets containing a sarcasm hashtag are actually sarcastic.
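Once the labels are collected, the hashtag question above could be checked with a few lines of code. The following is only a minimal sketch: the example tweets, the label strings "sarcastic" and "not sarcastic", and the function name hashtag_agreement are assumptions made for illustration and are not part of the project's data or tooling.

```python
import re

# Hypothetical annotated examples: (tweet text, annotator label).
# These tweets and labels are made up for illustration only.
annotated = [
    ("Great, another Monday. #sarcasm", "sarcastic"),
    ("I love waiting in line for two hours #Sarcasm", "sarcastic"),
    ("Reading a paper about #sarcasm detection", "not sarcastic"),
    ("The weather is nice today", "not sarcastic"),
]

# Match #sarcasm as a whole hashtag, case-insensitively,
# so that a longer tag such as "#sarcasmdetection" does not count.
SARCASM_TAG = re.compile(r"#sarcasm\b", re.IGNORECASE)


def hashtag_agreement(rows):
    """Return (tagged, tagged_and_sarcastic, fraction) for the #sarcasm hashtag."""
    tagged_labels = [label for text, label in rows if SARCASM_TAG.search(text)]
    sarcastic = sum(1 for label in tagged_labels if label == "sarcastic")
    # Avoid dividing by zero when no tweet carries the hashtag.
    fraction = sarcastic / len(tagged_labels) if tagged_labels else None
    return len(tagged_labels), sarcastic, fraction


tagged, sarcastic, fraction = hashtag_agreement(annotated)
print(f"{sarcastic} of {tagged} #sarcasm tweets were labeled sarcastic")
```

The third example tweet shows why the hashtag alone may not be a reliable label: it mentions #sarcasm as a topic without being sarcastic itself.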