Question 1

Do the matching options change my actual lines?

Accepted Answer

No. Case-insensitive, trim, and ignore-punctuation affect only the comparison key. If your list contains “John@Example.com ” and “john@example.com”, case-insensitive + trim treats them as one duplicate, but the surviving line is output exactly as you pasted it (the first or last occurrence, your choice). This matters for data like emails where you want dedupe logic to be loose but the stored value to stay verbatim.
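
To make the key-versus-value distinction concrete, here is a minimal sketch. The option names, `comparisonKey`, and `dedupeKeepFirst` are hypothetical, and treating "punctuation" as ASCII punctuation is an assumption; the tool's actual internals are not shown in this FAQ.

```typescript
// Hypothetical option names, chosen for illustration only.
interface MatchOptions {
  caseInsensitive: boolean;
  trim: boolean;
  ignorePunctuation: boolean;
}

// Build the key used for comparison. The original line is never modified.
function comparisonKey(line: string, opts: MatchOptions): string {
  let key = line;
  if (opts.trim) key = key.trim();
  if (opts.caseInsensitive) key = key.toLowerCase();
  // Assumption: "punctuation" means the ASCII punctuation characters.
  if (opts.ignorePunctuation) key = key.replace(/[!-\/:-@\[-`{-~]/g, "");
  return key;
}

// Keep the first occurrence of each key, emitting the line verbatim.
function dedupeKeepFirst(lines: string[], opts: MatchOptions): string[] {
  const seen = new Set<string>();
  const out: string[] = [];
  for (const line of lines) {
    const key = comparisonKey(line, opts);
    if (seen.has(key)) continue;
    seen.add(key);
    out.push(line); // the pasted text, not the normalized key
  }
  return out;
}

const keptEmails = dedupeKeepFirst(
  ["John@Example.com ", "john@example.com"],
  { caseInsensitive: true, trim: true, ignorePunctuation: false },
);
console.log(JSON.stringify(keptEmails));
```

Only the key is lowercased and trimmed; the surviving entry still carries its capital letters and trailing space.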

Question 2

When should I keep the last occurrence instead of the first?

Accepted Answer

Keep-last is the right choice when later entries are more up to date — for example a change log of customer records where each re-entry supersedes the previous one, or a re-exported keyword list where the newest line carries corrected spelling. Keep-first preserves your original ordering priority; keep-last preserves recency. The kept line is emitted at the position of that occurrence.
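
The difference between the two modes, including where the kept line lands, can be sketched as follows. The `dedupeByOccurrence` function is hypothetical, and lowercasing stands in for whichever matching options you have switched on.

```typescript
type Keep = "first" | "last";

// Sketch only: a lowercased key stands in for the configured matching options.
function dedupeByOccurrence(lines: string[], keep: Keep): string[] {
  // For each key, remember the index of the occurrence that should survive.
  const winner = new Map<string, number>();
  lines.forEach((line, i) => {
    const key = line.toLowerCase();
    if (keep === "last" || !winner.has(key)) winner.set(key, i);
  });
  // Emit each surviving line at the position of its kept occurrence.
  return lines.filter((line, i) => winner.get(line.toLowerCase()) === i);
}

const keywords = ["iphone case", "usb cable", "iPhone case"];
const keptFirst = dedupeByOccurrence(keywords, "first");
const keptLast = dedupeByOccurrence(keywords, "last");
console.log(JSON.stringify(keptFirst)); // ["iphone case","usb cable"]
console.log(JSON.stringify(keptLast));  // ["usb cable","iPhone case"]
```

With keep-last, the corrected spelling survives and moves to the position of the later entry, so "usb cable" now comes first.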

Question 3

How is this better than Excel’s Remove Duplicates?

Accepted Answer

Three ways. First, you don’t need a spreadsheet open for a 10-second job. Second, Excel compares cells exactly as stored, so “Apple ” with a trailing space and “apple” survive as two “different” values unless you build helper columns with TRIM and LOWER. Third, Excel has no ignore-punctuation mode at all. Here those normalizations are one checkbox each, and the stats bar tells you exactly how many rows were dropped; Excel only tells you after the fact.

Question 4

Will it catch near-duplicates like plurals or reordered words?

Accepted Answer

No — and be suspicious of any tool that silently claims to. “running shoe” vs “running shoes” vs “shoe running” are distinct lines to an exact-match deduper, even with all normalizations on. Catching those requires stemming and token sorting, which changes meaning in ways you should review by hand. For keyword lists, sort the output A–Z first: near-duplicates cluster together visually, making manual review fast.
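
If you want help spotting candidates for that manual review, a rough sketch of the stemming-plus-token-sorting idea looks like this. It is not something this tool does: `reviewKey` and `nearDuplicateGroups` are hypothetical, and the "stemmer" is a deliberately naive rule that strips one trailing "s", which will misfire on words like "glass". Use it to flag groups for review, never to delete lines automatically.

```typescript
// Naive review key: lowercase, split into words, strip a trailing "s",
// sort the words. Assumption: this crude rule is good enough for flagging.
function reviewKey(line: string): string {
  return line
    .toLowerCase()
    .split(/\s+/)
    .filter((t) => t.length > 0)
    .map((t) => (t.length > 3 && t.charAt(t.length - 1) === "s" ? t.slice(0, -1) : t))
    .sort()
    .join(" ");
}

// Group lines that share a review key; return only groups worth a look.
function nearDuplicateGroups(lines: string[]): string[][] {
  const groupIndex = new Map<string, number>();
  const groups: string[][] = [];
  for (const line of lines) {
    const key = reviewKey(line);
    const idx = groupIndex.get(key);
    if (idx === undefined) {
      groupIndex.set(key, groups.length);
      groups.push([line]);
    } else {
      groups[idx].push(line);
    }
  }
  return groups.filter((g) => g.length > 1);
}

const candidates = nearDuplicateGroups([
  "running shoe",
  "trail map",
  "running shoes",
  "shoe running",
]);
console.log(JSON.stringify(candidates));
```

All three "running shoe" variants land in one group, while "trail map" is left alone. Whether any of them should actually be merged is still your call.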

Question 5

How are blank lines handled?

Accepted Answer

Blank lines are excluded from duplicate matching: an empty line is not treated as a duplicate of another empty line. What happens to them is controlled separately by the blank-line policy, which gives you three behaviours: keep them all (default, preserves paragraph grouping), collapse runs of blanks to a single one, or remove every blank line for a tight list ready to upload.
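
The three behaviours can be sketched as below. The names `BlankPolicy`, `dedupeSkippingBlanks`, and `applyBlankPolicy` are hypothetical, and two details are assumptions: that a whitespace-only line counts as blank, and that the policy is applied after duplicates are removed.

```typescript
type BlankPolicy = "keep" | "collapse" | "remove";

// Assumption: whitespace-only lines count as blank.
function isBlank(line: string): boolean {
  return line.trim() === "";
}

// Exact-match dedupe in which blank lines bypass duplicate matching.
function dedupeSkippingBlanks(lines: string[]): string[] {
  const seen = new Set<string>();
  const out: string[] = [];
  for (const line of lines) {
    if (!isBlank(line)) {
      if (seen.has(line)) continue;
      seen.add(line);
    }
    out.push(line);
  }
  return out;
}

// Apply the blank-line policy to an already deduplicated list.
function applyBlankPolicy(lines: string[], policy: BlankPolicy): string[] {
  const out: string[] = [];
  let previousWasBlank = false;
  for (const line of lines) {
    const blank = isBlank(line);
    if (blank && policy === "remove") continue;
    if (blank && policy === "collapse" && previousWasBlank) continue;
    out.push(line);
    previousWasBlank = blank;
  }
  return out;
}

const deduped = dedupeSkippingBlanks(["a", "", "", "b", "a", ""]);
console.log(JSON.stringify(applyBlankPolicy(deduped, "keep")));     // ["a","","","b",""]
console.log(JSON.stringify(applyBlankPolicy(deduped, "collapse"))); // ["a","","b",""]
console.log(JSON.stringify(applyBlankPolicy(deduped, "remove")));   // ["a","b"]
```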

Question 6

What is the size limit and how fast is it?

Accepted Answer

The deduper does one pass with a hash map: normalize each line once, look it up, move on — O(n) overall. A 100,000-line list (roughly 2–3 MB) processes in well under a second in a modern browser. A trailing newline at the end of your paste is ignored, so it never shows up as a phantom blank line in the counts, and Windows CRLF endings are normalized automatically.
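
A minimal sketch of that single pass, under stated assumptions: `splitLines` and `dedupeWithStats` are hypothetical names, a `Set` plays the role of the hash map, and the matching options are left out so every line is compared exactly.

```typescript
// Normalize Windows CRLF endings and ignore one trailing newline, so a
// paste ending in "\n" does not produce a phantom blank line.
function splitLines(text: string): string[] {
  const lines = text.replace(/\r\n/g, "\n").split("\n");
  if (lines[lines.length - 1] === "") lines.pop();
  return lines;
}

// One pass over the input: each line costs one hash lookup, so O(n) overall.
function dedupeWithStats(text: string): { lines: string[]; removed: number } {
  const seen = new Set<string>();
  const lines: string[] = [];
  let removed = 0;
  for (const line of splitLines(text)) {
    if (line === "") {
      lines.push(line); // blank lines bypass duplicate matching
      continue;
    }
    if (seen.has(line)) {
      removed++;
      continue;
    }
    seen.add(line);
    lines.push(line);
  }
  return { lines, removed };
}

const stats = dedupeWithStats("a\r\nb\r\na\r\n");
console.log(JSON.stringify(stats)); // {"lines":["a","b"],"removed":1}
```

The `removed` counter is the kind of figure a stats bar could report; the trailing CRLF contributes neither a line nor a count.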
