AI Megathread
-
@Faraday said in AI Megathread:
People have targeted it so much
In this community? People love em-dashes in this community.
-
@Ashkuri said in AI Megathread:
@Faraday said in AI Megathread:
People have targeted it so much
In this community? People love em-dashes in this community.
People love nitpicking others and witch hunts in this community.
-
We have at this point many posts about “I’ll never give up my dashes” and that’s good, people shouldn’t give them up. I don’t think anyone here has ever advocated giving them up. Em Dash Use is not an AI-flagging behavior for this specific group of internet citizens.
It just seems like unnecessary upset that we are constantly rehashing “what if they come for my em dashes” when the use of them is probably the only thing this community has ever unanimously agreed on.
Use the em dashes. Love them. It will be ok.
-
@Faraday said in AI Megathread:
@Ashkuri said in AI Megathread:
No one is coming for the em dashes
That is just literally untrue. People have targeted it so much that it’s been widely dubbed the “ChatGPT Hyphen”, and there are a bazillion articles written about how it’s a “tell-tale sign of AI use” by people who don’t know better.
I am not saying that the Wikipedia article was made in bad faith. As you note, it has many sensible disclaimers. But there are just too many people looking for “shortcuts” to identifying AI writing, and they’re liable to summarize and/or quote out of context without the necessary nuance.
I probably shouldn’t be sitting here laughing, but like…
I rattled off five different things I do in my writing that would probably get it flagged as AI, three of which are things I’ve been professionally trained to do as part of published and in-house style guides. And now we’re arguing about “OMG em-dashes!” again, which is… summarizing and/or quoting out of context without the rest of the nuance.
Exceptio probat regulam in casibus non exceptis. (The exception proves the rule in cases not excepted.)
-
@Ashkuri This topic isn’t exclusively about this community. That’s why it’s in the real life category, not the MUing related ones. I don’t give a damn if people here think my writing is AI, these folks don’t impact my life.
However, various institutions are using flawed heuristics – be they AI-driven or meatbrain – to judge whether something is written by an LLM which do include em dashes and other common signs of professional/academic writing, and using those flawed judgements to punish students, workers, etc in ways that can dramatically impact their professional lives.
Scribo ergo LLM sum. (I write, therefore I am an LLM.)
ETA: We’re (or at least I am) using em dashes as a shorthand for “professional writing stuff.”
-
@Pavel said in AI Megathread:
However, various institutions are using flawed heuristics – be they AI-driven or meatbrain – to judge whether something is written by an LLM which do include em dashes and other common signs of professional/academic writing, and using those flawed judgements to punish students, workers, etc in ways that can dramatically impact their professional lives.
Funnily enough, em-dashes (along with emojis) are something that I use in my professional life to gauge whether something was written by an employee or by AI. The giveaway isn’t that they’re included in whatever document, though. It’s that they’re formatted incorrectly.
Our brand standard font doesn’t like em-dashes and will format them like this-- which is both grammatically wrong and stylistically inconsistent (to the point that the forum software isn’t even auto-formatting it for me). You should be–if you know how to use them properly–leaving no spaces between the dash and the word. The professional writers will almost always catch this because it’s a clear grammar mistake. People who don’t know how em-dashes are supposed to work but had them inserted in by Copilot 365 or ChatGPT usually don’t notice the error.
It’s the same with emojis. The ones that you can make in Microsoft programs using keyboard shortcuts or from the available list in Teams have a totally different visual style than the ones that ChatGPT, Writer, and other LLMs spit out. If they show up in a piece, it’s almost always a dead giveaway that someone copied that content from an LLM into Word, Outlook, or Teams, and literally every time I’ve asked someone if they used AI to write that after seeing an out of place emoji, they excitedly confirmed they did.
The thing is, I’m not a manager and I’m not HR. The only repercussions they’re going to face from me judging something as “AI wrote that for them” is me making a bitchy little face behind my computer. If anyone else notices, they’ll probably be praised for being more efficient–right up until someone in leadership wants to know why something isn’t right.
Needless to say, I have a lot of Big Feelings about AI because my company uses it, I’m expected to write about what we use it to do for the public, I’m expected to write about how to use it better for our employees, and people think it can replace parts of my job. Which it can! And does! Often poorly. Especially when people don’t understand how it works, what it actually does, or that artificial intelligence is a terrible misnomer and it should likely be called ‘automation’ instead.
-
@Aria said in AI Megathread:
You should be–if you know how to use them properly–leaving no spaces between the dash and the word
That is a style guide difference.
ETA: At least it used to be, I haven’t checked recently. But when I was first coming up in the Professional Writing Arena we used some bastardised variant of AP style that required a space between. It also did weird shit with ellipses that I didn’t approve of.
-
@Trashcan That footnote is giving me flashbacks to a stand-up argument I had with one of my educators when I was a wee lad (of around 19?) where I held the view that the Oxford comma is never optional. Academic arguments are weird, man. I was/am right, though.
-
@Aria said in AI Megathread:
You should be–if you know how to use them properly–leaving no spaces between the dash and the word.
i came over here to complain about how those are clearly en-dashes, and then i discovered in the quoted text that you did the usual double hyphen dealio we all do, and the forum software just renders it as en-dashes?!?!?!
nodebb why you do this. clearly if you’re going to render that out it should be — not –. absolute clown behavior, nodebb
-
It’s markdown: dashdash is en, dashdashdash is em.
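A minimal sketch of that conversion rule (SmartyPants-style smart dashes) for anyone curious; this is just an illustration, not NodeBB’s actual code:

```python
# Illustrative sketch of the "smart dashes" rule described above;
# not NodeBB's actual implementation.
def smart_dashes(text: str) -> str:
    # Replace the longer run first so "---" isn't consumed by the "--" rule.
    return text.replace("---", "\u2014").replace("--", "\u2013")

print(smart_dashes("Test--test"))    # Test–test  (en dash)
print(smart_dashes("Test---test"))   # Test—test  (em dash)
```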
Test–test
Test—test
Test – test
Test — test
-
@Pavel that’s dumb
-
@Roz Having two kinds of dashes is dumb.
And this is the part where I leave so Roz can’t em dash my brains against the rocks.
-
@Pavel said in AI Megathread:
The only way you can truly tell if writing is LLM generated and not simply a style you’ve come to associate with LLM is to be comparative.
This is not true; people who are very familiar with AI-generated text can identify it accurately 90% of the time without any access to ‘comparative’ sources.
@Aria said in AI Megathread:
Anything I write professionally would almost certainly be pegged as written by AI,
@Pavel said in AI Megathread:
various institutions are using flawed heuristics – be they AI-driven or meatbrain – to judge whether something is written by an LLM
The fear that human-generated content is going to be flagged as written by AI is mostly overblown. People who are not familiar with AI are not good at detecting it, but when you see stats about how AI detection tools are “highly inaccurate”, that statistic is almost always referring to AI not being flagged (evasion), not false positives. Various studies have found commercial AI detector tools to have very low levels of “false positives”: GPTZero identified human content correctly 99.7% of the time, and Pangram also identified human content correctly over 99% of the time, while Originality.ai did slightly less well at only 98+% of the time.
If we take these numbers at face value, the odds of someone familiar with AI output identifying a piece of writing as suspect and putting it through two different commercial AI detectors and both of them flagging it as AI when it was, in fact, human-written, is in the neighborhood of 0.002%. You’re more likely to die in a given year than to have this happen to you. I’m personally comfortable with that level of risk.
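For transparency, here’s the rough arithmetic behind that figure, taking the published false-positive rates at face value and assuming the two detector checks are independent (an assumption, not a given):

```python
# Back-of-the-envelope for the compound false-positive figure quoted above.
# Assumes the two detector checks are independent of each other.
gptzero_fp = 1 - 0.997   # ~0.3% false-positive rate
pangram_fp = 1 - 0.99    # using 1% as the ceiling for "over 99%"

both_flag_human_text = gptzero_fp * pangram_fp
print(f"{both_flag_human_text:.4%}")  # ~0.0030%, the same neighborhood as 0.002%
```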
The odds of someone unfamiliar with AI output accusing you off the cuff of AI use and being wrong about it are about 50%. So. You know. Watch out for that one.
-
@Trashcan said in AI Megathread:
This is not true; people who are very familiar with AI-generated text can identify it accurately 90% of the time without any access to ‘comparative’ sources.
You cited a study with a microscopic sample size and flawed methodology, which (as far as I can see) wasn’t peer reviewed or published in a reputable journal with editorial review. It’s interesting, sure, and maybe can lead to future research, but it by no means proves that “people who are very familiar with AI-generated text can identify it accurately 90% of the time”.
-
@Trashcan said in AI Megathread:
This is not true; people who are very familiar with AI-generated text can identify it accurately 90% of the time without any access to ‘comparative’ sources.
“Person very familiar with Vermeer easily spots forgery” is not a surprise. I was speaking about the general population, who are not very familiar with AI-generated text.
Those studies you quoted, while potentially promising, are very small in scale. Another study has indicated that if English isn’t your first language, there’s a higher chance of your work being pulled up as having been written by AI.
@Trashcan said in AI Megathread:
the odds of someone familiar with AI output identifying a piece of writing as suspect and putting it through two different commercial AI detectors
This part, though, is the most bemusing. The odds of someone familiar with AI output putting it through two different commercial AI detectors in the real world are almost laughably small, in my experience. Academic institutions and non-tech companies aren’t going to fork out for two bits of software that do roughly the same thing; they’re going to go with whoever has the shiniest advertising budget.
-
The funniest part of this to me is that people keep using the en dash / em dash thing as an example, and I myself don’t know when to use either, so I just do whatever and sometimes it gets autocorrected.
-
@MisterBoring Either @Roz or @Aria explained… somewhere up in the higher reaches of this thread. I got a cramp trying to scroll that far.
-
@Pavel said in AI Megathread:
@Aria said in AI Megathread:
You should be–if you know how to use them properly–leaving no spaces between the dash and the word
That is a style guide difference.
ETA: At least it used to be, I haven’t checked recently. But when I was first coming up in the Professional Writing Arena we used some bastardised variant of AP style that required a space between. It also did weird shit with ellipses that I didn’t approve of.
You can use spaces between (I prefer spaces between because ohgodmyeyes), but having a space on one side and not the other like our dumb brand font does is what I was talking about re: stylistic inconsistency. We use a bastardized version of Chicago style where I work that does the no spaces.
-
@Pavel said in AI Megathread:
are very small in scale
By using the method described above, we create 6 datasets with around 20K samples each
The researchers built a dataset of about 2,000 human-written passages spanning six mediums: blogs, consumer reviews, news articles, novels, restaurant reviews, and résumés. They then used four popular large language models to generate AI versions of the content by using prompts designed to elicit similar text to the originals.
What would you consider an acceptable scale?
@Faraday said in AI Megathread:
You cited a study with a microscopic sample size and flawed methodology,
Fair enough, I can’t find any similar studies with a larger sample size. Most other studies find accuracy statistically significantly better than a coin flip, somewhere between just barely above 50% and the upper 60s.
@Pavel said in AI Megathread:
The odds of someone familiar with AI output putting it through two different commercial AI detectors in the real world are almost laughably small
Even if we grant that only two events must occur – the suspicion (we’ll go with a 50/50) and a single check (taking 1% false positives as a rough average from the commercial offerings) – that still works out to about 0.5%: if you’re approaching your 50s, you’re more likely to die in a given year than to have this happen to you. These tools are aware of the negative ramifications of a false positive and are biased towards not returning them.
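The same back-of-the-envelope arithmetic for this looser scenario, again just illustrative:

```python
# One 50/50 suspicion and one detector check at ~1% false positives,
# again assuming the two events are independent (an assumption, not a given).
suspicion = 0.5
single_check_fp = 0.01

print(f"{suspicion * single_check_fp:.1%}")  # 0.5%
```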