How to Split a String by Multiple Special Characters in JavaScript

A common text parsing task is to split a string into an array using several different delimiters or special characters. For example, you might need to break a string apart at any punctuation mark, space, or symbol. The most direct way to do this is to call the String.prototype.split() method with a single regular expression that contains a character set.

This guide will teach you how to use a regular expression with a character set ([...]) to split a string by multiple, different special characters at once.

The Core Method: split() with a Character Set

The String.prototype.split() method can accept a regular expression as its argument. To split by multiple different characters, you can define all of those characters inside a single character set ([...]).

Problem: you have a string where words are separated by a mix of different delimiters, and you want to get an array of the words.

// Problem: How to split this string by '.', '_', and '-' all at once?
let messyString = 'word1.word2_word3-word4';

Solution:

let messyString = 'word1.word2_word3-word4';

// The regex /[._-]/ matches any single character that is a dot, underscore, or hyphen.
let words = messyString.split(/[._-]/);

console.log(words);

Output:

['word1', 'word2', 'word3', 'word4']

Note: This is the most direct way to solve the problem. A single split() call with one pattern handles every delimiter, with no intermediate arrays to merge.
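
One behavior worth seeing in code: a character set matches one character at a time, so two delimiters in a row leave an empty string between them. The sketch below uses a made-up input string to show this, along with the + quantifier (covered later in this guide), which treats each run of delimiters as a single one.

```javascript
// Hypothetical input with two pairs of adjacent delimiters: "._" and "--".
const input = 'word1._word2--word3';

// One character per match: empty strings appear between adjacent delimiters.
const single = input.split(/[._-]/);
console.log(single); // ['word1', '', 'word2', '', 'word3']

// One or more characters per match: each run counts as a single delimiter.
const runs = input.split(/[._-]+/);
console.log(runs); // ['word1', 'word2', 'word3']
```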

How the Character Set Works

Let's break down the pattern /[._-]/:

  • / ... /: These forward slashes mark the beginning and end of the regular expression.
  • [._-]: This is the character set. It tells the split() method to break the string wherever it finds any single character that is inside the brackets. In this case, it will split on a literal dot (.), an underscore (_), or a hyphen (-).
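Note on escaping: Inside a character set [...], most special characters (like ., *, +) are treated literally and do not need a backslash. The exceptions are the backslash itself (written \\), a closing bracket (written \]), a caret (^) when it is the first character of the set, and a hyphen (-) that sits between two other characters, where it would otherwise define a range. Escape these, or place the hyphen first or last in the set.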
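
The hyphen rule is the one that tends to cause silent bugs, so here is a small sketch of it. The variable names are illustrative only. An unescaped hyphen between a comma and an underscore defines a range of character codes, which happens to include every digit and uppercase letter.

```javascript
// Unescaped hyphen between ',' and '_' creates a range (U+002C to U+005F).
// That range also covers digits and uppercase ASCII letters.
const accidentalRange = /[,-_]/;
console.log(accidentalRange.test('7')); // true
console.log(accidentalRange.test('Q')); // true

// Escaping the hyphen keeps it a literal character.
const literalHyphen = /[,\-_]/;
console.log(literalHyphen.test('7')); // false
console.log(literalHyphen.test('-')); // true

// With the literal hyphen, digits and uppercase letters survive the split.
console.log('A1-b2,c3'.split(literalHyphen)); // ['A1', 'b2', 'c3']
```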

Example with more characters

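// In a string literal, '\\' is a single backslash character.
let str = 'a.b,c-d_e=f\\g/h';

// This set includes many common delimiters. The hyphen is escaped so that
// ",-_" is not read as a range, which would also match digits and uppercase letters.
let result = str.split(/[.,\-_=\\\/\s]/);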

console.log(result);

Output:

['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
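
When the delimiters are only known at runtime, the character set can be built from a list instead of being typed by hand. The following is a minimal sketch under that assumption; splitByAny is a hypothetical helper, not a built-in, and it expects each delimiter to be a single character. It escapes the characters that are special inside [...] before constructing the pattern.

```javascript
// Hypothetical helper: split `text` on any of the single-character delimiters given.
function splitByAny(text, delimiters) {
  // Escape the characters that are special inside a character set: \ ] ^ -
  const escaped = delimiters
    .map((ch) => ch.replace(/[\\\]^-]/g, '\\$&'))
    .join('');
  return text.split(new RegExp(`[${escaped}]`));
}

const parts = splitByAny('a.b,c-d_e=f\\g/h', ['.', ',', '-', '_', '=', '\\', '/']);
console.log(parts); // ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
```

Because the hyphen is escaped by the helper, the order of the delimiters in the array does not matter.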

A Practical Alternative: Splitting by Anything That Isn't a Letter or Number

In many real-world cases it is easier to define not what the delimiters are, but what they are not. For example, you might want to split a string by any character that is not a letter or a number.

Problem: you want to extract all the "words" from a string, and a "word" is defined as any sequence of letters or numbers.

let messyString = 'User ID: user-123, Role: admin!';

Solution: use a negated character set ([^...]) with a + quantifier.

let messyString = 'User ID: user-123, Role: admin!';

// This regex matches one or more characters that are NOT letters or numbers.
let words = messyString.split(/[^a-zA-Z0-9]+/);

console.log(words);

Output:

['User', 'ID', 'user', '123', 'Role', 'admin', '']

How It Works

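  • [^a-zA-Z0-9]: The ^ at the beginning of the character set negates it, so it matches any character that is not an ASCII letter (a-z, A-Z) or a digit (0-9). Accented and other non-ASCII letters therefore count as delimiters.
  • +: The + is a quantifier that means "one or more." This is important because it treats a sequence of special characters (like : ) as a single delimiter.
  • The trailing '' in the output appears because the string ends with a delimiter (!). split() always returns whatever follows the last match, even when that is an empty string.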
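
When the input starts or ends with a delimiter, the result contains an empty string at that position, as the final '' in the output above shows. Here is a sketch of two common ways to avoid that, followed by a variant for non-ASCII text. The Unicode property escapes \p{L} and \p{N} need the u flag; the sample word is an assumption chosen for illustration.

```javascript
const messyString = 'User ID: user-123, Role: admin!';

// Option 1: split on the delimiters, then drop any empty strings.
const viaSplit = messyString.split(/[^a-zA-Z0-9]+/).filter(Boolean);
console.log(viaSplit); // ['User', 'ID', 'user', '123', 'Role', 'admin']

// Option 2: match the words themselves instead of the delimiters.
// match() returns null when nothing matches, hence the fallback to [].
const viaMatch = messyString.match(/[a-zA-Z0-9]+/g) || [];
console.log(viaMatch); // ['User', 'ID', 'user', '123', 'Role', 'admin']

// Variant for non-ASCII text: \p{L} is any letter, \p{N} is any number.
// 'caf\u00e9' is "café" written with a precomposed é.
const unicodeWords = 'caf\u00e9-123'.split(/[^\p{L}\p{N}]+/u);
console.log(unicodeWords); // ['café', '123']
```

Matching the words directly is often the simpler choice when the goal is extraction rather than splitting, since it never produces empty entries.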

Conclusion

Splitting a string by multiple special characters is a task perfectly suited for a regular expression.

  • The recommended best practice is to use string.split(/[...]/) with a character set ([...]) that contains all the delimiters you want to split by.
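  • For the common task of extracting words, it's often easier to use a negated character set to split by anything that is not an ASCII letter or digit: string.split(/[^a-zA-Z0-9]+/). Expect an empty string in the result when the input starts or ends with a delimiter.
  • This approach is usually shorter and easier to read than chaining multiple split() calls or performing multiple replace() operations, because one pattern handles every delimiter in a single call.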
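
For comparison, here is a rough sketch of the chained alternative next to the character-set version, using the same sample string as the first example. Both produce the same array; the chained form needs a flattening step for every extra delimiter.

```javascript
const input = 'word1.word2_word3-word4';

// Chained approach: every additional delimiter needs another split and flatten.
const chained = input
  .split('.')
  .flatMap((part) => part.split('_'))
  .flatMap((part) => part.split('-'));

// Character-set approach: one call, one pattern.
const viaRegex = input.split(/[._-]/);

console.log(chained);  // ['word1', 'word2', 'word3', 'word4']
console.log(viaRegex); // ['word1', 'word2', 'word3', 'word4']
```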