Remove duplicate lines

Keeps the first occurrence of each line; optional trim before compare.

Overview

Duplicates in text might seem like a minor annoyance, but anyone who has ever cleaned up an email list, consolidated TypeScript imports, or processed server logs knows how much they slow things down. The logic behind line deduplication is elegant: a Set tracks which lines have already been seen; when a new line appears, it goes to the output; when a repeat is found, it is silently discarded.

Keeping the first occurrence rather than the last is a deliberate choice. In most use cases, the original document order carries meaning: the first import of a dependency usually defines the scope, and the first occurrence of an address in a consolidated list is typically the most reliable one. Reversing the logic to keep the last occurrence is trivial, but it has to be a conscious decision.
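As an illustration of the reversed logic, here is a minimal sketch that keeps the last occurrence instead of the first. The function name dedupe_keep_last is hypothetical and not part of the tool; the sketch assumes exact comparison and walks the lines backwards so that the first line it meets is the last occurrence.

```python
def dedupe_keep_last(text: str) -> str:
    """Keep the last occurrence of each line (hypothetical helper, exact match only)."""
    seen = set()
    kept = []
    # Walk the lines backwards: the first time a line is met here,
    # it is that line's last occurrence in the document.
    for line in reversed(text.splitlines()):
        if line not in seen:
            seen.add(line)
            kept.append(line)
    # Undo the reversal so the survivors follow document order again.
    kept.reverse()
    return "\n".join(kept)


print(dedupe_keep_last("a\nb\na\nc"))
```

For the input a, b, a, c this keeps b, a, c, whereas keeping the first occurrence would give a, b, c.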

The detail that changes everything is the comparison strategy: are two lines with the same content but different casing duplicates? Are admin and Admin the same thing in a config file? What about the version with surrounding spaces? That is why case-insensitive comparison and trim-before-compare options exist. The line written to the output is always the original version of the first occurrence — normalization happens only during comparison and does not alter the content.

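In databases, the closest equivalent is SELECT DISTINCT or GROUP BY, although SQL does not guarantee row order without an ORDER BY. In Haskell, it is nub from Data.List. In Python, it is the classic dict.fromkeys(). All solve the same problem with the same core idea: remember what has already been seen and skip repeats. The Set and dict versions do this with a hash lookup, so each membership check takes roughly constant time; nub only requires equality and compares each element against everything kept so far, which is quadratic.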
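The dict.fromkeys() idiom mentioned above can be sketched in a few lines. The sample list is made up for illustration; the only assumption is Python 3.7 or later, where dictionaries are guaranteed to preserve insertion order.

```python
lines = ["beta", "alpha", "beta", "gamma", "alpha"]

# Dictionary keys are unique and keep insertion order (Python 3.7+),
# so the first occurrence of each value determines its position.
unique = list(dict.fromkeys(lines))

print(unique)  # ['beta', 'alpha', 'gamma']
```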

Technical deep dive

Common use cases

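  • Dependency lists: requirements.txt files, or dependency entries copied out of a package.json or pom.xml, often accumulate duplicates from merges and manual copy-pastes as they grow. Run the tool on the extracted entries rather than on a whole JSON or XML file, where repeated lines such as closing braces and tags are legitimate.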
  • Code imports and directives: consolidating TypeScript, Python, or C# files from multiple sources typically produces repeated imports that compile fine but pollute the codebase.
  • Address and contact lists: email, phone, or ID lists exported from different systems arrive full of duplicates with subtle casing variations.
  • Server logs: concatenating logs from multiple time periods for analysis causes identical warning or error lines to repeat hundreds of times and obscure unique events.
  • Config files: duplicate configuration options can cause unexpected behavior depending on which occurrence the parser uses.

Comparison modes

  • Exact match (default): two lines are duplicates only if they are character-for-character identical, including casing and whitespace. The output line preserves the original casing and spaces.
  • Case-insensitive: converts both lines to lowercase before comparing. Useful for lists of names, addresses, and identifiers that may arrive with inconsistent casing.
  • Trim before compare: strips leading and trailing whitespace (spaces, tabs) before checking whether a line was already seen. The output still shows the original content with its spaces, but the duplicate criterion ignores the padding.
  • Combined case-insensitive and trim: the most permissive mode, which treats variants with extra spaces and different casing as the same line.
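To make the difference between the four modes concrete, here is a small sketch that runs the same sample through each of them. The names comparison_key and dedupe, and the ignore_case and trim options, are hypothetical and chosen for illustration; note that Python's str.strip() removes all surrounding whitespace, not only spaces and tabs.

```python
def comparison_key(line: str, ignore_case: bool = False, trim: bool = False) -> str:
    """Build the value used for the duplicate check; the line itself is not modified."""
    key = line.strip() if trim else line  # strip() removes any surrounding whitespace
    return key.lower() if ignore_case else key


def dedupe(lines, **options):
    """Keep the first occurrence of each line under the chosen comparison mode."""
    seen = set()
    kept = []
    for line in lines:
        key = comparison_key(line, **options)
        if key not in seen:
            seen.add(key)
            kept.append(line)  # always the original text of the first occurrence
    return kept


sample = ["Admin", "admin", "  admin", "Admin  "]

exact = dedupe(sample)                               # all four lines differ
no_case = dedupe(sample, ignore_case=True)           # "admin" merges into "Admin"
trimmed = dedupe(sample, trim=True)                  # padded variants merge
both = dedupe(sample, ignore_case=True, trim=True)   # everything merges

for name, result in [("exact", exact), ("no_case", no_case),
                     ("trimmed", trimmed), ("both", both)]:
    print(name, result)
```

With this sample, exact match keeps all four lines, case-insensitive keeps three, trim keeps two, and the combined mode keeps only the first line, "Admin", in its original form.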

Tool guide

  • What you are working with: Multi-line text (lists, logs, exports) where each line is one unit.

  • What the tool does: Keeps only the first occurrence of each line (optional trim before comparing).

  • Why use it: Clean email, ID, or URL lists pasted from many sources and prepare unique rows for import.

Code Snippets

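Set-based deduplication in JavaScript

// Keeps the first occurrence of each line; later repeats are dropped.
function dedupeLines(text, { caseSensitive = true, trim = false } = {}) {
  const seen = new Set();
  return text
    .split(/\r?\n/) // accept both LF and CRLF line endings
    .filter(line => {
      // Normalize only the comparison key; the original line is what gets kept.
      let key = trim ? line.trim() : line;
      if (!caseSensitive) key = key.toLowerCase();
      if (seen.has(key)) return false;
      seen.add(key);
      return true;
    })
    .join('\n');
}

Python equivalent

def dedupe_lines(text, case_sensitive=True, trim=False):
    """Return text with repeated lines removed, keeping each first occurrence."""
    seen = set()
    result = []
    for line in text.splitlines():
        # Normalize only the comparison key, never the line that is emitted.
        key = line.strip() if trim else line
        if not case_sensitive:
            key = key.lower()
        if key not in seen:
            seen.add(key)
            result.append(line)
    return '\n'.join(result)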

Before

a
b
a
b
c

After

a
b
c

FAQ

What is this tool for?

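It removes duplicate lines from multi-line text, keeping the first occurrence of each line. It runs fully in your browser, which makes it handy for cleaning up lists, logs, and exports in everyday development.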

Are my inputs sent to a server?

Processing happens locally with JavaScript. We do not store what you paste into the text areas.

Can I use this for real production data?

Use at your own risk. For secrets (passwords, tokens), prefer controlled environments and follow your company's policies. Always review the generated output, and do not blindly trust tools you find on the internet.