Page & Bell

AI Text Cleaner (Markdown & Em Dash Remover)

Paste raw ChatGPT, Claude, or Gemini output and get publish-ready plain text in one step. The cleaner strips markdown emphasis and headers, converts em dashes and smart quotes to your preferred style, deletes invisible zero-width characters, and reports exactly what it changed (“14 markdown marks, 6 em dashes, 3 hidden characters removed”), so you know the text you paste into your CMS, email client, or LinkedIn post will look the way you intended, not the way the model formatted it.


How to use the AI Text Cleaner (Markdown & Em Dash Remover)

  1. Paste AI-generated text into the box. All cleanup options are on by default.
  2. Pick your em dash policy: replace with a comma, a spaced hyphen, or keep them.
  3. Choose how bullet markers should look — normalized dashes, • dots, or removed entirely.
  4. Read the change summary to see what was cleaned, then copy the result.
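Steps 2 and 4 can be sketched in JavaScript roughly as follows. This is a minimal illustration under stated assumptions, not the tool's actual source: the function names applyDashPolicy and formatSummary, the policy strings "comma", "hyphen", and "keep", and the shape of the counts object are all hypothetical.

```javascript
// Hypothetical helper: applies one of three em dash policies and counts the dashes found.
function applyDashPolicy(text, policy = "comma") {
  const found = text.match(/\u2014/g); // \u2014 is the em dash
  const count = found ? found.length : 0;
  if (policy === "keep") {
    return { text, count }; // text untouched, but the count is still reported
  }
  const replacement = policy === "hyphen" ? " - " : ", ";
  // Swallow spaces or tabs around the dash so no stray space is left before the comma.
  const cleaned = text.replace(/[ \t]*\u2014[ \t]*/g, replacement);
  return { text: cleaned, count };
}

// Hypothetical helper: turns per-artifact counts into a one-line change summary.
function formatSummary(counts) {
  const labels = [
    ["markdown", "markdown mark", "markdown marks"],
    ["dashes", "em dash", "em dashes"],
    ["hidden", "hidden character", "hidden characters"],
  ];
  const parts = [];
  for (const [key, singular, plural] of labels) {
    const n = counts[key] || 0;
    if (n > 0) parts.push(`${n} ${n === 1 ? singular : plural}`);
  }
  return parts.length > 0 ? `${parts.join(", ")} removed` : "Nothing to clean";
}

const dashResult = applyDashPolicy("fast \u2014 and cheap", "comma");
console.log(dashResult.text); // fast, and cheap
console.log(formatSummary({ markdown: 14, dashes: 6, hidden: 3 }));
// 14 markdown marks, 6 em dashes, 3 hidden characters removed
```

Returning the count alongside the text, even under the "keep" policy, is what lets a summary report dash density without changing anything.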

What gets cleaned, exactly

Artifact          | Example in                  | Result out
Bold / italic     | **key point** and *aside*   | key point and aside
Headers           | ## Conclusion               | Conclusion
Links             | [our guide](https://…)      | our guide
Em dash           | fast — and cheap            | fast, and cheap
Smart quotes      | “done” & it’s               | "done" & it's
Zero-width chars  | invisible (U+200B…U+FEFF)   | deleted, with a count
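Three of the rows above (headers, links, and smart quotes) map naturally onto single regular-expression replacements. The sketch below shows one way to write them in JavaScript; the function name cleanInlineMarkup, the counts object, and the example.com URL are assumptions for illustration, and the patterns are deliberately simple (image syntax and nested brackets are not handled).

```javascript
// Hypothetical helper covering the header, link, and smart-quote rows of the table.
function cleanInlineMarkup(text) {
  const counts = { headers: 0, links: 0, quotes: 0 };
  const cleaned = text
    // "## Conclusion" -> "Conclusion": 1-6 pound signs at line start, followed by a space.
    .replace(/^[ \t]{0,3}#{1,6}[ \t]+/gm, () => {
      counts.headers++;
      return "";
    })
    // "[label](url)" -> "label": keep the visible text, drop the URL.
    .replace(/\[([^\]]+)\]\([^)\s]+\)/g, (_match, label) => {
      counts.links++;
      return label;
    })
    // Curly double quotes (U+201C, U+201D) -> straight double quote.
    .replace(/[\u201C\u201D]/g, () => {
      counts.quotes++;
      return '"';
    })
    // Curly single quotes and apostrophes (U+2018, U+2019) -> straight apostrophe.
    .replace(/[\u2018\u2019]/g, () => {
      counts.quotes++;
      return "'";
    });
  return { text: cleaned, counts };
}

const sample = "## Conclusion\nSee [our guide](https://example.com/guide).";
console.log(cleanInlineMarkup(sample).text);
// Conclusion
// See our guide.
```

Requiring a space after the pound signs means a hashtag such as #launch is left alone.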

The CMS and email paste workflow

The most common failure mode: you paste model output into WordPress, Mailchimp, or Outlook and the asterisks come along as literal characters, because those editors do not render markdown. The second failure mode is subtler — the paste looks fine, but smart quotes break code snippets, zero-width spaces split words for screen readers, and triple blank lines create awkward gaps in the published page. Running text through this cleaner first gives you a known-plain baseline: straight quotes, single blank lines between paragraphs, no invisible characters, and bullets in one consistent style. Then apply formatting deliberately in your editor, rather than inheriting whatever the model chose.
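The "single blank lines between paragraphs" part of that baseline can be sketched in a few string operations. The function name normalizeBlankLines is an assumption for this example, and trimming trailing spaces is an extra step added here so that lines containing only whitespace count as blank.

```javascript
// Hypothetical helper: collapses runs of blank lines to a single blank line.
function normalizeBlankLines(text) {
  return text
    .replace(/\r\n?/g, "\n") // unify Windows (CRLF) and bare CR line endings
    .replace(/[ \t]+$/gm, "") // strip trailing spaces so whitespace-only lines become empty
    .replace(/\n{3,}/g, "\n\n") // three or more newlines -> exactly one blank line
    .trim();
}

console.log(JSON.stringify(normalizeBlankLines("One\r\n\r\n\r\n\r\nTwo  \n \n\n\nThree\n")));
// "One\n\nTwo\n\nThree"
```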

A practical tip for writers who keep some structure: set the bullet option to “Normalize to -” and the dash policy to “keep”. You get clean, consistent lists and retain dashes where you deliberately want them — the summary still tells you how many dashes exist so you can judge whether the density feels like your voice.
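As a rough sketch of what the bullet option might do, the function below rewrites common list markers to one style and counts them. The name normalizeBullets and the style keys "dash", "dot", and "remove" are assumptions for illustration, not the tool's real option names.

```javascript
// Hypothetical helper: rewrites -, *, + and bullet-dot markers to one consistent style.
function normalizeBullets(text, style = "dash") {
  const prefix = { dash: "- ", dot: "\u2022 ", remove: "" }[style];
  if (prefix === undefined) throw new Error(`Unknown bullet style: ${style}`);
  let count = 0;
  // A marker only counts when it starts the line (after optional indent)
  // and is followed by whitespace, so "*aside*" is not mistaken for a bullet.
  const cleaned = text.replace(/^([ \t]*)[-*+\u2022][ \t]+(?=\S)/gm, (_match, indent) => {
    count++;
    return indent + prefix;
  });
  return { text: cleaned, count };
}

const list = "* one\n+ two\n  \u2022 three";
console.log(normalizeBullets(list, "dash").text);
// - one
// - two
//   - three
```

Because list markers and emphasis both use asterisks, running a step like this before any emphasis stripping keeps the two from interfering.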

Frequently asked questions

Why does AI output have all this formatting in the first place?

LLMs are trained heavily on markdown — documentation, GitHub, forums — and chat interfaces render that markdown into pretty bold text and headers. When you copy from the chat window, some apps give you the rendered text, others give you the raw asterisks and pound signs. Smart quotes and em dashes appear because the training data is full of professionally typeset prose. None of it is a watermark; it is just the model writing in its native dialect.

Are em dashes really a sign of AI writing?

No. The em dash is a legitimate punctuation mark that professional writers have used for centuries. AI models do use it noticeably more often than the average person types it (the character has no dedicated key on most keyboards), which is why frequent em dashes became a folk heuristic for AI text in 2024-25. The goal of this tool is not to hide anything; it is consistency with your own style. If you never type em dashes, your published text should not suddenly be full of them.

What are the hidden characters it removes?

Zero-width spaces (U+200B), zero-width non-joiners and joiners (U+200C, U+200D), word joiners (U+2060), and byte-order marks (U+FEFF). These are invisible but real characters that break search-and-replace, inflate character counts, trip plagiarism checkers, and cause SEO tools to misread your content. They sneak in via copy-paste chains through web apps. The summary tells you how many were found — often zero, sometimes dozens.
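Removing and counting these five code points takes a single character class. The sketch below is illustrative only; the names HIDDEN_CHARS and removeHiddenCharacters are assumptions. One caveat worth knowing: the zero-width joiner also glues together some emoji sequences, and the zero-width non-joiner is meaningful in scripts such as Persian, so a blanket removal like this suits plain prose best.

```javascript
// The five code points listed above: ZWSP, ZWNJ, ZWJ, word joiner, byte-order mark.
const HIDDEN_CHARS = /[\u200B\u200C\u200D\u2060\uFEFF]/g;

// Hypothetical helper: deletes hidden characters and reports how many were found.
function removeHiddenCharacters(text) {
  let count = 0;
  const cleaned = text.replace(HIDDEN_CHARS, () => {
    count++;
    return "";
  });
  return { text: cleaned, count };
}

const hiddenResult = removeHiddenCharacters("\uFEFFzero\u200Bwidth\u200D");
console.log(hiddenResult.text, hiddenResult.count); // zerowidth 3
```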

Will it mangle code blocks in the output?

Not with the default settings. The preserve-code toggle splits the text on fenced ``` blocks and passes them through untouched, so indentation, asterisks, and underscores inside code survive. Turn the toggle off if you want the fences removed and the code treated as ordinary text.
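A split-and-pass-through approach like the one described could look roughly like this. It is a sketch under assumptions: the function name cleanOutsideCode, the cleanProse callback, and the preserveCode flag are hypothetical stand-ins for the real toggle, and an unclosed fence is simply treated as ordinary text.

```javascript
// Three backticks, built with repeat() so the example is easy to embed in other documents.
const FENCE = "`".repeat(3);
// Capturing group: String.prototype.split keeps captured matches in the result array.
const FENCED_BLOCK = new RegExp(`(${FENCE}[\\s\\S]*?${FENCE})`);
const FENCE_LINE = new RegExp(`^${FENCE}.*$\\n?`, "gm");

// Hypothetical helper: runs cleanProse on everything except fenced code blocks.
function cleanOutsideCode(text, cleanProse, preserveCode = true) {
  if (!preserveCode) {
    // Toggle off: drop the fence lines and treat the code as ordinary text.
    return cleanProse(text.replace(FENCE_LINE, ""));
  }
  // After the split, odd indexes hold fenced blocks and even indexes hold prose.
  return text
    .split(FENCED_BLOCK)
    .map((part, index) => (index % 2 === 1 ? part : cleanProse(part)))
    .join("");
}

const stripStars = (t) => t.replace(/\*/g, "");
const doc = "**bold**\n" + FENCE + "js\nconst a = b * c;\n" + FENCE + "\n*done*";
console.log(cleanOutsideCode(doc, stripStars));
// Prose loses its asterisks; the line "const a = b * c;" inside the fence is unchanged.
```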

How does it handle ***bold italic*** or nested emphasis?

The emphasis stripper runs repeatedly until the text stops changing. ***word*** unwraps to *word* on the first pass and to plain word on the second. This iterate-until-stable approach is more reliable than trying to write one regex that anticipates every nesting combination.
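The iterate-until-stable idea can be sketched as a loop around one replace call. This is an illustration, not the tool's actual regex: the names EMPHASIS, stripEmphasisOnce, and stripEmphasis are assumptions, the pattern covers asterisk emphasis only (underscores would need word-boundary guards so that snake_case identifiers survive), and the maxPasses cap is an added safety limit.

```javascript
// One layer of asterisk emphasis: ** or * around text that starts and ends with a non-space.
const EMPHASIS = /(\*\*|\*)(?=\S)(.+?)(?<=\S)\1/g;

// Single pass: unwraps one layer of emphasis wherever it appears.
function stripEmphasisOnce(text) {
  return text.replace(EMPHASIS, "$2");
}

// Hypothetical helper: repeats the pass until the text stops changing.
function stripEmphasis(text, maxPasses = 10) {
  let current = text;
  let removed = 0;
  for (let pass = 0; pass < maxPasses; pass++) {
    const next = current.replace(EMPHASIS, (_match, _marker, inner) => {
      removed++;
      return inner;
    });
    if (next === current) break; // stable: nothing left to unwrap
    current = next;
  }
  return { text: current, removed };
}

console.log(stripEmphasisOnce("***word***")); // *word*
console.log(stripEmphasis("***word***").text); // word
console.log(stripEmphasis("**key point** and *aside*").text); // key point and aside
```

The lookarounds keep arithmetic such as 2 * 3 * 4 intact, because an asterisk followed by a space does not open emphasis.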

Is my text sent to a server?

No. The entire cleaner runs in your browser with JavaScript string operations. Nothing is uploaded, logged, or stored — you can verify by loading the page once and then switching off your internet connection; the tool keeps working.
