Text Cleaner / Formatting Remover
One paste, clean text out. This cleaner reads the rich-text flavour of your clipboard — so it genuinely sees the formatting Word, Google Docs, and web pages smuggle along — then strips tags, straightens smart quotes, fixes spaces and line breaks, and hands you plain text ready for any CMS, email tool, or database. Markup is parsed with the browser's DOM parser, never executed, and nothing leaves your device.
How to use the text cleaner / formatting remover
- Paste your text. If the clipboard carries Word/Google Docs/web formatting, the tool detects it, extracts clean text, and keeps table cells separated by tabs.
- Toggle the operations you want: HTML stripping, smart-quote straightening, dash policy, space collapsing, line-break handling, URL and emoji removal, and case transform.
- Or just press “Clean everything” for the sensible aggressive preset.
- Check the applied-operations summary (it reports exactly what changed and how much), then copy the clean text.
Where hidden formatting actually lives
Copy one sentence from Word and inspect the clipboard: you will find 4–8 KB of HTML — class names like MsoNormal, inline font declarations on every run of text, and locale-specific non-breaking spaces. Google Docs wraps your copy in a docs-internal-guid span with inline styles. Text copied from news sites often carries tracking spans and zero-width characters. Paste any of it into a CMS rich-text field and the junk styles come along, overriding your site's typography — the classic “why is this one paragraph in Calibri?” bug.
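To see what a paste is carrying, you can count the invisible characters yourself. The Python sketch below is illustrative only (the page itself runs in your browser); the set of code points and the function names `report_hidden` and `strip_hidden` are assumptions made for this example, not the tool's actual list.

```python
import unicodedata

# Assumed set of zero-width code points; the real tool's list may differ.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}
NBSP = "\u00a0"


def report_hidden(text: str) -> dict[str, int]:
    """Count zero-width characters and non-breaking spaces by Unicode name."""
    counts: dict[str, int] = {}
    for ch in text:
        if ch in ZERO_WIDTH or ch == NBSP:
            name = unicodedata.name(ch, f"U+{ord(ch):04X}")
            counts[name] = counts.get(name, 0) + 1
    return counts


def strip_hidden(text: str) -> str:
    """Drop zero-width characters and turn non-breaking spaces into plain spaces."""
    return "".join(
        " " if ch == NBSP else ch
        for ch in text
        if ch not in ZERO_WIDTH
    )
```

Running `report_hidden` on a suspicious paste before and after cleaning is a quick way to confirm that nothing invisible survived.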
The reliable workflow for editors and content teams: copy from the source → paste here → press “Clean everything” → copy the plain text into your CMS, then re-apply headings and bold using the CMS's own controls. Ten seconds, and your styles stay yours.
Quick clean here vs the specialised tools
| Job | Use this page | Use the specialist |
|---|---|---|
| Messy Word/web paste | Yes — one click | — |
| Whitespace forensics, code indentation | Basic collapse | Whitespace Cleaner |
| PDF/email line-wrap repair | Smart paragraph mode | Line Break Remover |
| Strip specific symbol classes | URL/emoji toggles only | Special Character Remover |
| AI-output Markdown & em dashes | Dash policy only | AI Text Cleaner |
Frequently asked questions
How does it detect Word or Google Docs formatting?
When you copy from a rich-text editor, your clipboard holds at least two flavours of the same content: text/plain and text/html. Most paste targets quietly take the HTML — which for Word means kilobytes of mso- styles, conditional comments, and o:p tags wrapped around every paragraph. This tool reads the text/html flavour directly on paste, parses it with the browser’s DOMParser (the markup is never rendered or executed, so embedded scripts are inert), and extracts only the visible text.
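The same idea, parse the markup and keep only text nodes, can be sketched outside the browser. In this hedged example Python's standard `html.parser` stands in for DOMParser; the class name `VisibleText`, the helper `visible_text`, and the choice of block-level tags are assumptions for illustration, not the tool's code.

```python
import re
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes; script/style bodies and comments are never kept."""

    SKIP = {"script", "style"}
    # Assumed list of tags that end a line of text.
    BLOCK = {"p", "div", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        if self._skip_depth == 0:
            # Source whitespace (including non-breaking spaces) becomes one space.
            self.parts.append(re.sub(r"\s+", " ", data))


def visible_text(html: str) -> str:
    """Return the text content of an HTML fragment, one block per line."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    lines = (line.strip() for line in "".join(parser.parts).split("\n"))
    return "\n".join(line for line in lines if line)
```

Because the parser only tokenises the markup, a `<script>` element in the clipboard is just more text to discard; nothing in it runs.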
What happens to tables and images when I paste?
Table cells are preserved with tab characters between columns and a line break per row — so a pasted Word or Google Sheets table drops straight into Excel or Sheets with the structure intact. Images have no text content and are dropped. Lists keep one item per line.
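A minimal sketch of that table handling, again using Python's `html.parser` in place of the browser DOM. The names `TableToTsv` and `table_to_tsv` are hypothetical, and the sketch assumes simple tables without nesting or merged cells.

```python
from html.parser import HTMLParser


class TableToTsv(HTMLParser):
    """Turn <tr>/<td>/<th> markup into tab-separated rows."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.rows: list[str] = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row, self._cell = [], None
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []
        # <img> and other tags carry no text, so they simply contribute nothing.

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._row is not None and self._cell is not None:
            # Collapse whitespace inside the cell so stray tabs cannot split it.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append("\t".join(self._row))
            self._row, self._cell = None, None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def table_to_tsv(html: str) -> str:
    """Return one line per table row, with cells separated by tabs."""
    parser = TableToTsv()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.rows)
```

Tab-separated rows are what spreadsheet applications expect from a plain-text paste, which is why the structure survives the round trip.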
Why convert smart quotes and em dashes at all?
Curly quotes (U+201C/U+201D) and em dashes (U+2014) are fine in prose but break things in technical contexts: code snippets fail to compile, CSV parsers mis-split fields, and search/matching treats “word” and "word" as different strings. Word’s AutoCorrect inserts them as you type, so they end up everywhere. The dash policy is separate because some style guides want em dashes kept; choose keep, convert to hyphen, or remove.
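As a rough illustration, quote straightening and the three dash policies can be written as two small functions. This is a sketch under assumptions: the function names and the policy strings `"keep"`, `"hyphen"`, and `"remove"` are invented for the example, and the single curly quotes (U+2018/U+2019) are included on the assumption that the tool handles them alongside the double ones.

```python
# Curly double quotes per the text above; the single quotes are an assumed addition.
QUOTE_MAP = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark / typographic apostrophe
})

EM_DASH = "\u2014"


def straighten_quotes(text: str) -> str:
    """Replace curly quotes with their ASCII equivalents."""
    return text.translate(QUOTE_MAP)


def apply_dash_policy(text: str, policy: str = "keep") -> str:
    """Keep em dashes, convert them to hyphens, or remove them."""
    if policy == "keep":
        return text
    if policy == "hyphen":
        return text.replace(EM_DASH, "-")
    if policy == "remove":
        return text.replace(EM_DASH, "")
    raise ValueError(f"unknown dash policy: {policy!r}")
```

Keeping the two steps separate mirrors the toggles: you can straighten quotes for a CSV export while leaving the em dashes your style guide requires.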
What does “smart paragraph merge” do to line breaks?
It treats a blank line as a paragraph boundary (kept) and any single line break inside a paragraph as a wrap artefact (replaced with a space). This is exactly what you want for text from PDFs and emails, where every visual line ends in a hard return. “Strip all” flattens everything to one line; “keep” touches nothing. The dedicated Line Break Remover offers the same modes with more surgical control.
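The rule described above is short enough to sketch directly. The function name `merge_paragraphs` is hypothetical, and the sketch assumes that a line containing only spaces or tabs counts as blank.

```python
import re


def merge_paragraphs(text: str) -> str:
    """Join wrapped lines with spaces while keeping blank-line paragraph breaks."""
    # Normalise Windows and old Mac line endings first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # One or more blank (or whitespace-only) lines separate paragraphs.
    paragraphs = re.split(r"\n\s*\n", text)
    merged = []
    for paragraph in paragraphs:
        # Single line breaks inside a paragraph are wrap artefacts.
        lines = [line.strip() for line in paragraph.split("\n")]
        joined = " ".join(line for line in lines if line)
        if joined:
            merged.append(joined)
    return "\n\n".join(merged)
```

For example, a two-paragraph email whose lines were hard-wrapped comes back as two unbroken paragraphs separated by a single blank line.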
Is there a size limit?
2 MB of text (about 300,000 words), which covers any realistic document. Beyond that the input is rejected with a clear message rather than freezing your tab. Within the limit, cleaning is instant — every operation is a single linear pass.
When should I use the single-purpose tools instead?
Use this page for the common case: messy paste, one click, done. Reach for the specialised siblings when you need fine control — the Whitespace Cleaner reports per-character-type counts and has a preserve-indentation mode for code; the Line Break Remover has paragraph heuristics tuned for PDFs; the Special Character Remover lets you whitelist exactly which characters survive; and the AI Text Cleaner targets Markdown artefacts and em-dash patterns specific to AI-generated text.
Related tools
- AI Text Cleaner (Markdown & Em Dash Remover): Clean AI-generated text in one paste. Strip Markdown asterisks and headers, replace em dashes and smart quotes, and remove hidden characters.
- Whitespace Cleaner: Remove extra spaces from text. Collapse double spaces, trim line ends, and delete tabs and non-breaking spaces in one click.
- Line Break Remover: Remove line breaks from text instantly. Join PDF copy-paste fragments into clean paragraphs or one line, keeping paragraph breaks if you want.
- Special Character Remover: Remove special characters, symbols, numbers, punctuation, or emojis from text with simple checkboxes, keeping only the characters you choose.
- Word Frequency Counter: Count word frequency in any text, with a sortable table of the most common words, two- and three-word phrases, stopword filtering, and CSV export.
- Comma Separator (Column ⇄ List): Convert a column of values into a comma-separated list and back, with custom delimiters, quote wrapping, and dedupe for SQL IN clauses and CSVs.
Learn more
- How to Clean Up Text Pasted from AI, Word, and PDFs: Curly quotes, em dashes, non-breaking spaces, and hidden characters break code and search. Learn to spot and strip the junk that AI and Word leave behind.
- Zero-Width and Invisible Characters: Why Your Text Breaks: Invisible characters like zero-width spaces and the BOM silently break search, code, and matching. Learn where they come from and how to strip them.