Reading Notes: The Fine Art of YT Originates from TWSS

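YT is slang that I have discussed before, so I will not repeat it here. There is no need to dig into it; I am only using it as a pretext to introduce a recent reading note.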

Quote

For those who are too polite to know this type of humor, let me explain. When speaking in a non-sexual context, we sometimes say things that are not funny, but which would be funny if the same words were uttered in a sexual context. A listener may detect the double meaning, and respond to your words with, “that’s what she said”, thus putting the remark into a sexual context, and creating a joke. Here’s an example:

Man 1: (looking at deli sandwiches) Wow, they’re much bigger than I expected.
Man 2: That’s what she said!

Excerpted from http://us.textanalyticsnews.com/fc_fcbi1lz/lz.aspx?p1=05555212S3562&CC=&p=1&cID=0&cValue=1

I just finished reading the academic paper on this research, done by researchers at the University of Washington.

It is very research-oriented and academic, and practitioners in industry need not bother with it at all.

It is eye-catching and certainly has academic value, since no one had worked on this so-called TWSS (That's What She Said) problem before.

It aims to use machine learning to identify/classify a subset of puns which might (or might not) carry sarcasm about a brand. But mainly it covers only a very small subset of data associated with some adult jokes.

First of all, puns are the last thing that should be brought to the table for automatic processing in a real-life system, not only because they are statistically rare but also because they are complex and often depend on cultural context. There are countless tasks that are far more widespread and far more tractable for automatic processing. Spending industry resources on such a problem is neither wise nor effective.

This is another one of those cases: technology news reporters like to cover stories like this because they capture people's attention and imagination.

Some research gets twisted or exaggerated out of context to sound like the next big thing in real-life technology.

If this is meant for real applications, they should show benchmarks on a large real-life corpus: not the benchmark reported in the paper on a select corpus from a particular source, but one on social media at large. The first question to answer is how much TWSS there is in social media, then how relevant it is to brands when it does occur, and lastly how the classification would be used in applications. The research publication answers none of these, so it is not worth the time to look into it.
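To make the point about select corpora versus social media at large concrete, here is a minimal sketch of how a classifier's precision depends on how often the target actually occurs. The recall, false-positive rate, and prevalence values are hypothetical numbers chosen for illustration; they are not taken from the paper, and the function name is my own.

```python
def precision_at_prevalence(recall: float,
                            false_positive_rate: float,
                            prevalence: float) -> float:
    """Precision of a binary classifier at a given base rate (Bayes' rule).

    recall: fraction of true TWSS sentences the classifier flags.
    false_positive_rate: fraction of non-TWSS sentences it wrongly flags.
    prevalence: fraction of all sentences that are truly TWSS.
    """
    true_pos = recall * prevalence
    false_pos = false_positive_rate * (1.0 - prevalence)
    if true_pos + false_pos == 0:
        return 0.0  # nothing is flagged at all
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # All numbers below are hypothetical, for illustration only.
    # 0.5 mimics a balanced, hand-picked corpus; the smaller values
    # mimic a phenomenon that is rare in text at large.
    for prevalence in (0.5, 0.01, 0.0001):
        p = precision_at_prevalence(recall=0.8,
                                    false_positive_rate=0.01,
                                    prevalence=prevalence)
        print(f"prevalence={prevalence:.4%}  precision={p:.3f}")
```

With the error rates held fixed, precision falls steeply as the phenomenon becomes rarer, which is why a benchmark on a select corpus says little about behavior on social media at large until the real prevalence is known.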

It is eye-catching. That's all.

RE: Subject: What can jokes teach us about NLP?
Can your text analytics algorithm tell the difference between a joke and a serious statement?

Reference:
http://www.aclweb.org/anthology-new/P/P11/P11-2016.pdf

Permalink for this post: http://blog.sciencenet.cn/blog-362400-617371.html

Author: liwei999