How to Split a String by Capital Letters in JavaScript
A common text-parsing task is to split a "PascalCase" or "CamelCase" string into an array of separate words. For example, you might want to convert MyVariableName into ['My', 'Variable', 'Name']. The most effective way to achieve this is by using a regular expression to find the boundaries created by the uppercase letters.
This guide will teach you the modern, standard method for splitting a string by capital letters using the String.prototype.match() method. We will also cover the split() method with a lookahead as an alternative and explain its limitations.
The Core Method (Recommended): match() with a Regular Expression
Instead of thinking about where to "split" the string, it's often more robust to think about what "words" you want to match. The match() method with a carefully crafted regular expression can find all the word segments in one pass.
Problem: you have a string in PascalCase and want to split it into an array of words.
// Problem: How to split this into ['My', 'Variable', 'Name']?
let myString = 'MyVariableName';
Solution: this regex matches a capital letter followed by one or more lowercase letters.
let myString = 'MyVariableName';
// This regex finds all occurrences of a capital letter followed by lowercase letters.
let words = myString.match(/[A-Z][a-z]+/g);
console.log(words);
Output:
['My', 'Variable', 'Name']
This is a clean and direct way to extract the words.
How the match() Regex Works
Let's break down the pattern /[A-Z][a-z]+/g:
- / ... /g: The forward slashes denote a regular expression, and the g is the global flag, which is essential for finding all matches in the string, not just the first one.
- [A-Z]: This is a character set that matches any single uppercase letter from A to Z.
- [a-z]+: This is another character set that matches any single lowercase letter. The + is a quantifier that means "one or more times."
So, the regex reads: "Find any sequence that starts with a single uppercase letter, followed by one or more lowercase letters."
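As a quick check of the pieces described above, here is a minimal sketch that runs the pattern with and without the g flag and shows what match() returns when nothing matches. The variable names are illustrative only.

```javascript
const input = 'MyVariableName';

// With the g flag, match() returns every match as a plain array of strings.
const allWords = input.match(/[A-Z][a-z]+/g);
console.log(allWords); // ['My', 'Variable', 'Name']

// Without the g flag, match() stops at the first match and returns
// an array that also carries extra properties such as index.
const firstOnly = input.match(/[A-Z][a-z]+/);
console.log(firstOnly[0]);    // 'My'
console.log(firstOnly.index); // 0

// When nothing matches, match() returns null rather than an empty array,
// so a fallback keeps later array code safe.
const noWords = 'lowercase'.match(/[A-Z][a-z]+/g) || [];
console.log(noWords); // []
```

The `|| []` fallback is worth keeping whenever the input is not guaranteed to contain a capitalized word.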
An Alternative Method: split() with a Positive Lookahead
Another common approach is to use split() with a special regular expression feature called a positive lookahead. A lookahead lets you split the string before a capital letter without consuming that letter.
let myString = 'MyVariableName';
// The regex /(?=[A-Z])/ splits the string at the position just before a capital letter.
let words = myString.split(/(?=[A-Z])/);
console.log(words);
Output:
['My', 'Variable', 'Name']
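To see why the lookahead matters, the short sketch below splits the same string with a plain character set and then with the lookahead. The variable names are illustrative only.

```javascript
const input = 'MyVariableName';

// Without a lookahead, each capital letter is treated as the separator
// and is removed from the result.
const consumed = input.split(/[A-Z]/);
console.log(consumed); // ['', 'y', 'ariable', 'ame']

// With the lookahead, the split happens at the position before each
// capital letter, so the letter stays with the word that follows it.
const kept = input.split(/(?=[A-Z])/);
console.log(kept); // ['My', 'Variable', 'Name']
```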
Why match() is Often Better
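The split() method can be less intuitive and has edge cases. It splits before every capital letter, so a run of capitals (an acronym such as HTML) becomes a series of single-letter elements, and any other characters (spaces, digits, punctuation) stay attached to the neighboring word, requiring more cleanup. The match() approach is generally more direct as it focuses on extracting what you want rather than splitting by what you don't want. Keep in mind that match() returns only what the pattern describes: with /[A-Z][a-z]+/g, a leading lowercase word in a camelCase string is dropped.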
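To see how each approach behaves on less tidy input, here is a minimal sketch that runs both patterns on a camelCase string and on a string containing an acronym. The sample strings are made up for illustration.

```javascript
const splitPattern = /(?=[A-Z])/;
const matchPattern = /[A-Z][a-z]+/g;

// camelCase input: the first word starts with a lowercase letter.
const camel = 'myVariableName';
const camelSplit = camel.split(splitPattern);
const camelMatch = camel.match(matchPattern);
console.log(camelSplit); // ['my', 'Variable', 'Name']
console.log(camelMatch); // ['Variable', 'Name'] (the leading 'my' is not matched)

// Input with an acronym: several capital letters in a row.
const acronym = 'parseHTMLString';
const acronymSplit = acronym.split(splitPattern);
const acronymMatch = acronym.match(matchPattern);
console.log(acronymSplit); // ['parse', 'H', 'T', 'M', 'L', 'String']
console.log(acronymMatch); // ['String']
```

Neither pattern handles both inputs cleanly, so it helps to test against the kinds of strings you actually expect before settling on one.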
How to Handle Edge Cases (Leading/Trailing Spaces)
If your input string might have surrounding whitespace, it's a good practice to trim it first.
Example of problem:
// Problem: Spaces will interfere with the split.
let messyString = ' FirstWordSecondWord ';
Solution: chain the .trim() method before you match() or split().
let messyString = ' FirstWordSecondWord ';
let words = messyString.trim().match(/[A-Z][a-z]+/g);
console.log(words);
Output:
['First', 'Word', 'Second', 'Word']
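This regex works for this input, but because every match must start with a capital letter, a camelCase string that begins with a lowercase word would lose that first word. Making the capital optional, as in /[A-Z]?[a-z]+/g, avoids that.
A broader regex for match that handles more cases, since it also keeps digits and other non-capital characters inside each word: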
let messyString = ' aWordAnotherWord ';
// This will match an optional capital letter, followed by one or more non-capitals.
let words = messyString.trim().match(/[A-Z]?[^A-Z]+/g) || [];
console.log(words);
Output:
['a', 'Word', 'Another', 'Word']
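The trimming, matching, and null fallback shown above can be wrapped in one small function. This is a minimal sketch: splitByCapitals is a hypothetical helper name, not a built-in or library function.

```javascript
// Hypothetical helper: splits a PascalCase or camelCase string into words.
function splitByCapitals(text) {
  // trim() removes surrounding whitespace; the || [] fallback covers
  // the null that match() returns when nothing matches.
  return text.trim().match(/[A-Z]?[^A-Z]+/g) || [];
}

console.log(splitByCapitals(' aWordAnotherWord ')); // ['a', 'Word', 'Another', 'Word']
console.log(splitByCapitals('MyVariableName'));     // ['My', 'Variable', 'Name']
console.log(splitByCapitals('Version2Update'));     // ['Version2', 'Update']
console.log(splitByCapitals(''));                   // []
```

Because [^A-Z] matches anything that is not an uppercase ASCII letter, digits stay attached to the word they follow, and so would spaces or punctuation inside the string.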
Conclusion
Splitting a string by capital letters is a perfect task for regular expressions.
- The recommended best practice is to use the string.match() method with a regex that defines what a "word" looks like (e.g., /[A-Z][a-z]+/g). This is often the most direct and robust approach.
- The string.split() method with a positive lookahead (/(?=[A-Z])/) is a clever and popular alternative that also works well for simple cases.
By choosing the appropriate regex, you can reliably parse PascalCase or CamelCase strings into a more usable array format.