Regex Cheat Sheet: The Complete Regular Expressions Guide

Why Every Developer Needs Regex

Regular expressions (regex) are one of the most powerful tools in a developer's toolkit. Whether you are validating form input, parsing log files, extracting data from HTML, or performing complex search-and-replace operations, regex lets you describe text patterns in a concise, expressive language that works across virtually every programming language.

Despite their reputation for being cryptic, regular expressions follow a logical set of rules. Once you understand the building blocks — metacharacters, character classes, quantifiers, anchors, groups, and lookarounds — you can read and write even complex patterns with confidence. This regex cheat sheet covers every essential concept with practical examples you can copy and use immediately.

Regex Basics: Metacharacters and Literals

At its simplest, a regex pattern is a string of literal characters. The pattern hello matches the exact text "hello". But the real power of regex comes from metacharacters — special characters that have a meaning beyond their literal value.

Metacharacter	Meaning	Example	Matches
`.`	Any character except newline	`h.t`	hat, hot, hit, h9t
`^`	Start of string (or line with `m` flag)	`^Hello`	"Hello world" but not "Say Hello"
`$`	End of string (or line with `m` flag)	`world$`	"Hello world" but not "world peace"
`*`	Zero or more of preceding element	`ab*c`	ac, abc, abbc, abbbc
`+`	One or more of preceding element	`ab+c`	abc, abbc, abbbc (not ac)
`?`	Zero or one of preceding element	`colou?r`	color, colour
`{n,m}`	Between n and m of preceding element	`a{2,4}`	aa, aaa, aaaa
`[ ]`	Character class — match any one character inside	`[aeiou]`	Any single vowel
`\`	Escape a metacharacter to match it literally	`\.`	A literal dot
`|`	Alternation (OR)	`cat|dog`	cat or dog
`( )`	Grouping and capturing	`(ab)+`	ab, abab, ababab

To match a metacharacter literally, escape it with a backslash. For example, \. matches a literal period, \* matches a literal asterisk, and \\ matches a literal backslash.
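
The same escaping rule can be applied programmatically when the text to match comes from user input instead of a hand-written pattern. The following is a minimal sketch; `escapeRegExp` is a hypothetical helper name introduced here for illustration, not something defined elsewhere in this guide.

```javascript
// Hypothetical helper: backslash-escape every regex metacharacter in a string
// so the result can be embedded in a pattern and matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const escaped = escapeRegExp("3.14 * (2+2)");
const literalPattern = new RegExp(escaped);

// The escaped pattern matches the exact text only.
const matchesLiteral = literalPattern.test("area = 3.14 * (2+2)");   // true
// Without escaping, "." would also accept "X"; with escaping it does not.
const matchesLookalike = literalPattern.test("area = 3X14 * (2+2)"); // false

console.log(escaped, matchesLiteral, matchesLookalike);
```

In the replacement string, `$&` stands for the matched character, so each metacharacter is rewritten as a backslash followed by itself.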

Character Classes and Shorthand

Character classes let you match any one character from a defined set. You create a custom class by placing characters inside square brackets, or you can use built-in shorthand classes that represent common character groups.

Shorthand	Equivalent	Meaning	Example Match
`\d`	`[0-9]`	Any digit	0, 5, 9
`\D`	`[^0-9]`	Any non-digit	a, #, space
`\w`	`[a-zA-Z0-9_]`	Any word character	a, Z, 3, _
`\W`	`[^a-zA-Z0-9_]`	Any non-word character	!, @, space
`\s`	`[ \t\n\r\f\v]`	Any whitespace	space, tab, newline
`\S`	`[^ \t\n\r\f\v]`	Any non-whitespace	a, 1, !
`[a-z]`	Custom range	Any lowercase letter	a, m, z
`[A-Z]`	Custom range	Any uppercase letter	A, M, Z
`[0-9]`	Same as `\d`	Any digit	0, 5, 9
`[^abc]`	Negated class	Any character except a, b, or c	d, 1, !

You can combine ranges and individual characters in a single class. For example, [a-zA-Z0-9._%+-] matches any letter, digit, period, underscore, percent, plus, or hyphen — the common characters allowed in the local part of an email address.

// Match a hex color code: # followed by 3 or 6 hex digits
/#([0-9a-fA-F]{3}){1,2}\b/

// Match a US ZIP code: 5 digits, optionally followed by -4 digits
/^\d{5}(-\d{4})?$/
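
As a quick check of how the two patterns above behave, here is a small JavaScript sketch. The sample strings are made up for illustration.

```javascript
const hexColor = /#([0-9a-fA-F]{3}){1,2}\b/;
const zipCode = /^\d{5}(-\d{4})?$/;

// Six hex digits: the group repeats twice.
const longColor = "color: #ff8c00;".match(hexColor)[0];         // "#ff8c00"
// Three hex digits: the group matches once.
const shortColor = "border: 1px solid #f0f".match(hexColor)[0]; // "#f0f"
// Five hex digits: \b rejects the partial match "#ff8".
const fiveDigits = hexColor.test("color: #ff8c0;");             // false

const zipResults = [
  zipCode.test("12345"),      // true
  zipCode.test("12345-6789"), // true
  zipCode.test("1234"),       // false
  zipCode.test("12345-67"),   // false
];

console.log(longColor, shortColor, fiveDigits, zipResults);
```

The trailing `\b` in the hex pattern is what keeps a five-digit value from being accepted as a three-digit color followed by extra characters.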

Quantifiers: How Many to Match

Quantifiers control how many times the preceding element must occur for a match to succeed. By default, quantifiers are greedy — they match as many characters as possible. Add a ? after the quantifier to make it lazy (match as few as possible).

Quantifier	Meaning	Greedy Example	Lazy Version
`*`	0 or more	`a.*b` matches all of "aXYZbXb"	`a.*?b` matches only "aXYZb" in "aXYZbXb"
`+`	1 or more	`\d+` matches "123" in "abc123def"	`\d+?` matches "1"
`?`	0 or 1	`https?` matches "http" or "https"	`??` (rarely used)
`{n}`	Exactly n	`\d{4}` matches "2025"	N/A (exact)
`{n,}`	n or more	`\w{3,}` matches words with 3+ chars	`\w{3,}?` stops after 3 unless the rest of the pattern needs more
`{n,m}`	Between n and m	`[a-z]{2,5}` matches 2 to 5 lowercase letters	`[a-z]{2,5}?` stops after 2 unless the rest of the pattern needs more

Understanding greedy vs. lazy matching is essential for writing correct patterns. When scraping HTML, for example, <div>.*</div> (greedy) matches from the first <div> all the way to the last </div> on the same line, or on the whole page if the s flag lets the dot cross newlines. Using <div>.*?</div> (lazy) stops at the nearest </div>, so it matches each individual, non-nested div block.
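
The difference is easy to see on a short string. This sketch uses a made-up two-div snippet on a single line:

```javascript
const page = "<div>one</div><div>two</div>";

// Greedy: .* runs to the end, then backs up to the LAST </div>.
const greedyMatch = page.match(/<div>.*<\/div>/)[0];
// "<div>one</div><div>two</div>"

// Lazy: .*? expands one character at a time and stops at the FIRST </div>.
// The g flag then repeats the search for the next block.
const lazyMatches = page.match(/<div>.*?<\/div>/g);
// ["<div>one</div>", "<div>two</div>"]

console.log(greedyMatch, lazyMatches);
```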

Anchors and Boundaries

Anchors do not match characters — they match positions in the string. They are zero-width assertions that constrain where a match can occur.

^ — Matches the start of the string. With the m flag, matches the start of each line.
$ — Matches the end of the string. With the m flag, matches the end of each line.
\b — Word boundary. Matches the position between a word character (\w) and a non-word character.
\B — Non-word boundary. Matches any position that is not a word boundary.

// \b prevents partial matches
/\bcat\b/   matches "cat" but NOT "concatenate" or "category"

// ^ and $ together ensure the ENTIRE string matches
/^\d{3}-\d{3}-\d{4}$/   validates "555-123-4567" as a complete phone format

// \B matches inside a word
/\Bcat\B/   matches "cat" in "concatenate" but NOT in "cat" or "category"
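
The claims above can be checked with `RegExp.prototype.test`. A small sketch, with sample strings chosen for illustration:

```javascript
const wholeWord = /\bcat\b/;
const insideWord = /\Bcat\B/;
const phone = /^\d{3}-\d{3}-\d{4}$/;

const anchorResults = {
  wholeWordInSentence: wholeWord.test("the cat sat"),  // true
  wholeWordInLonger: wholeWord.test("concatenate"),    // false
  insideLonger: insideWord.test("concatenate"),        // true
  insideAtStart: insideWord.test("category"),          // false: "cat" starts the word
  phoneExact: phone.test("555-123-4567"),              // true
  phoneEmbedded: phone.test("call 555-123-4567 now"),  // false: ^ and $ reject extra text
};

console.log(anchorResults);
```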

Groups and Capturing

Groups bundle part of a pattern together. Parentheses create a capturing group that saves the matched text for later use — in backreferences, replacements, or programmatic extraction.

(abc) — Capturing group. Matches "abc" and stores it as group 1.
(?:abc) — Non-capturing group. Groups the pattern but does not store the match.
(?<name>abc) — Named capturing group. Stores the match under the label "name".
\1, \2 — Backreferences. Match the same text that was captured by group 1, group 2, etc.

// Capturing group: extract area code from phone number
/\((\d{3})\)\s\d{3}-\d{4}/
// Input: "(555) 123-4567" -> Group 1 captures "555"

// Named group: extract date parts
/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/
// Input: "2025-08-15" -> year="2025", month="08", day="15"

// Backreference: find repeated words
/\b(\w+)\s+\1\b/
// Matches "the the", "is is", "hello hello"

// Non-capturing group: match file extensions without capturing
/\.(?:jpg|png|gif|webp)$/i
// Matches ".jpg", ".PNG", ".gif" — but does not capture the extension

Groups are especially useful in find-and-replace operations. You can reference captured groups in the replacement string using $1, $2, or $<name>. Our Regex Tester supports a replace mode where you can test patterns with backreferences like $1 and $2 in the replacement string.
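
As an illustration of groups in both extraction and replacement, here is a short JavaScript sketch that reuses the date and repeated-word patterns from this section. The input strings are invented for the example.

```javascript
const datePattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;

// Programmatic extraction: named groups appear on match.groups.
const dateMatch = "Released on 2025-08-15.".match(datePattern);
const { year, month, day } = dateMatch.groups; // "2025", "08", "15"

// Replacement: $<name> inserts the text captured by a named group.
const reordered = "2025-08-15".replace(datePattern, "$<day>/$<month>/$<year>");
// "15/08/2025"

// Replacement with a numbered group: collapse doubled words to a single word.
const deduped = "this is is a test test".replace(/\b(\w+)\s+\1\b/g, "$1");
// "this is a test"

console.log(year, month, day, reordered, deduped);
```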

Lookahead and Lookbehind

Lookahead and lookbehind are zero-width assertions that check whether a pattern exists ahead of or behind the current position without including those characters in the match. They are sometimes called "lookarounds."

(?=...) — Positive lookahead. Asserts that what follows matches the pattern.
(?!...) — Negative lookahead. Asserts that what follows does not match the pattern.
(?<=...) — Positive lookbehind. Asserts that what precedes matches the pattern.
(?<!...) — Negative lookbehind. Asserts that what precedes does not match the pattern.

// Positive lookahead: match a number followed by "px"
/\d+(?=px)/          matches "16" in "16px" but not "16em"

// Negative lookahead: match a number NOT followed by "px"
// (the \d alternative stops the engine from backtracking to a partial number such as "1")
/\d+(?!\d|px)/       matches "16" in "16em" but nothing in "16px"

// Positive lookbehind: match a number preceded by "$"
/(?<=\$)\d+/         matches "50" in "$50" but not "50" in "50 items"

// Negative lookbehind: match a number NOT preceded by "$"
// (the \d in the class stops the match from starting mid-number, at the "0" of "$50")
/(?<![\$\d])\d+/     matches "50" in "50 items" but nothing in "$50"
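
Negative lookarounds next to a quantifier are a common source of surprises, because the engine is free to backtrack to a shorter match that satisfies the assertion. The sketch below, using an invented sample string, shows the positive forms working as expected and then the backtracking pitfall with one possible guard.

```javascript
const styleText = "padding 16px, margin 2em, price $50, stock 50 items";

// Positive lookahead: only the number directly followed by "px".
const pxValues = styleText.match(/\d+(?=px)/g);  // ["16"]

// Positive lookbehind: only the number directly preceded by "$".
const prices = styleText.match(/(?<=\$)\d+/g);   // ["50"]

// Pitfall: \d+ gives back the "6", and "1" is not followed by "px",
// so a partial number is matched.
const partial = "16px".match(/\d+(?!px)/)[0];    // "1"

// Guard: also forbid a following digit, so the whole number must pass.
const guarded = "16px".match(/\d+(?!\d|px)/);    // null
const allowed = "16em".match(/\d+(?!\d|px)/)[0]; // "16"

console.log(pxValues, prices, partial, guarded, allowed);
```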

Lookarounds are commonly used for password validation. For example, to require at least one uppercase letter, one lowercase letter, one digit, and one special character:

// Password: 8+ chars, uppercase, lowercase, digit, special char
/^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[!@#$%^&*]).{8,}$/

Each (?=...) checks that the required character type exists somewhere in the string, without consuming any characters. The final .{8,} then matches the entire password of 8 or more characters. If you need to generate passwords that pass this kind of validation, our Password Generator creates cryptographically secure passwords with customizable character sets and length.
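
A single combined pattern only answers yes or no. When a form needs to tell the user which requirement failed, one option is to run each check as its own small pattern. The sketch below is one way to do that; `passwordRules` and `missingRules` are hypothetical names introduced for illustration.

```javascript
// Hypothetical rule table: one small pattern per requirement.
const passwordRules = [
  { name: "uppercase letter", pattern: /[A-Z]/ },
  { name: "lowercase letter", pattern: /[a-z]/ },
  { name: "digit", pattern: /\d/ },
  { name: "special character", pattern: /[!@#$%^&*]/ },
  { name: "at least 8 characters", pattern: /^.{8,}$/ },
];

// Returns the names of the rules the password does not satisfy.
function missingRules(password) {
  return passwordRules
    .filter((rule) => !rule.pattern.test(password))
    .map((rule) => rule.name);
}

// The combined lookahead pattern from above, for comparison.
const combined = /^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[!@#$%^&*]).{8,}$/;

const strongMissing = missingRules("P@ssw0rd!"); // []
const weakMissing = missingRules("password");
// ["uppercase letter", "digit", "special character"]

console.log(strongMissing, weakMissing, combined.test("P@ssw0rd!"), combined.test("password"));
```

Both approaches accept and reject the same inputs here; the rule table trades a little verbosity for more useful error messages.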

Common Regex Patterns

Here are battle-tested regex patterns for the most common validation tasks. You can paste any of these directly into our Regex Tester to see them in action.

Pattern Name	Regex	Matches
Email address	`^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$`	user@example.com
URL (HTTP/HTTPS)	`^https?:\/\/[^\s/$.?#].[^\s]*$`	https://example.com/path
US phone number	`^\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})$`	(555) 123-4567, 555.123.4567
IPv4 address	`^(\d{1,3}\.){3}\d{1,3}$`	192.168.1.1 (format only; octets above 255 also pass)
Date (YYYY-MM-DD)	`^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$`	2025-08-15
Hex color code	`^#([0-9a-fA-F]{3}){1,2}$`	#FF8C00, #f0f
Strong password	`^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[\W_]).{8,}$`	P@ssw0rd!, Str0ng#Key

When working with URLs that contain special characters, remember that you may need to URL-encode values before applying regex validation. Similarly, when parsing structured data formats like JSON, our JSON Formatter can validate and pretty-print the data before you apply regex extraction.
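
Some of these patterns check only the shape of the input. The IPv4 pattern, for instance, accepts any one to three digits per octet, so "999.999.999.999" passes. One way to close that gap is to let the regex capture the octets and do the range check in code. This is a sketch; `isIPv4` is a hypothetical helper name.

```javascript
// Same shape as the table's IPv4 pattern, but with each octet captured.
const ipv4Shape = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/;

// Hypothetical helper: regex for the shape, arithmetic for the 0-255 range.
function isIPv4(value) {
  const match = ipv4Shape.exec(value);
  if (match === null) return false;
  return match.slice(1).every((octet) => Number(octet) <= 255);
}

const ipResults = [
  isIPv4("192.168.1.1"), // true
  isIPv4("999.168.1.1"), // false: first octet is out of range
  isIPv4("192.168.1"),   // false: only three octets
];

console.log(ipResults);
```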

Regex Flags

Flags (also called modifiers) change how the regex engine processes the pattern. In JavaScript, flags appear after the closing delimiter: /pattern/flags.

g (global) — Find all matches instead of stopping at the first one.
i (case-insensitive) — Treat uppercase and lowercase letters as equivalent. /hello/i matches "Hello", "HELLO", "hElLo".
m (multiline) — Make ^ and $ match the start and end of each line, not just the entire string.
s (dotAll) — Make . match newline characters (\n) as well. Without this flag, . matches everything except newlines.
u (unicode) — Enable full Unicode matching. Required for correctly handling emoji and multi-byte characters.
y (sticky) — Match only from the position indicated by lastIndex. Useful for building tokenizers and parsers.

// Global + case-insensitive: find all "the" regardless of case
"The cat and the dog".match(/the/gi)
// Result: ["The", "the"]

// Multiline: match the start of each line
"line 1\nline 2\nline 3".match(/^line/gm)
// Result: ["line", "line", "line"]

// dotAll: match across newlines
"hello\nworld".match(/hello.world/s)
// Result: ["hello\nworld"]
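
The examples above cover g, i, m, and s. The sketch below illustrates the remaining two flags. The `tokenize` function and its tiny arithmetic grammar are assumptions made up for this example, not part of any library.

```javascript
// u flag: without it, "." matches a single UTF-16 code unit, so an emoji
// stored as a surrogate pair counts as two characters.
const emoji = "\u{1F600}";
const withoutU = /^.$/.test(emoji); // false
const withU = /^.$/u.test(emoji);   // true

// y flag: each exec() must match exactly at lastIndex, so unexpected input
// is detected immediately instead of being skipped over.
function tokenize(input) {
  const tokenPattern = /\s*(\d+|[-+*\/()])\s*/y;
  const tokens = [];
  while (tokenPattern.lastIndex < input.length) {
    const position = tokenPattern.lastIndex;
    const match = tokenPattern.exec(input);
    if (match === null) {
      throw new SyntaxError(`Unexpected character at index ${position}`);
    }
    tokens.push(match[1]);
  }
  return tokens;
}

const tokens = tokenize("12 + 3*(4)");
// ["12", "+", "3", "*", "(", "4", ")"]

console.log(withoutU, withU, tokens);
```

With the g flag instead of y, the engine would search forward past any invalid character, which is why sticky matching suits tokenizers.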

When working with regex in different programming languages, be aware that flag syntax varies. Python uses re.IGNORECASE, PHP uses /pattern/i like JavaScript, and Java uses Pattern.CASE_INSENSITIVE. The concepts are the same, but the API differs. If you need to compare output from different regex implementations, our Text Diff Checker can highlight differences between results.

Frequently Asked Questions

What is a regular expression (regex)?

A regular expression (regex) is a sequence of characters that defines a search pattern. It is used for string matching, validation, search-and-replace operations, and text extraction in virtually every programming language including JavaScript, Python, PHP, Java, and Go. Regex patterns can match simple literal text or complex patterns using metacharacters, quantifiers, and groups.

What is the difference between .* and .*? in regex?

.* is a greedy quantifier that matches as many characters as possible, while .*? is a lazy (non-greedy) quantifier that matches as few characters as possible. For example, given the string "<b>hello</b><b>world</b>", the pattern <b>.*</b> matches the entire string from the first <b> to the last </b>, while <b>.*?</b> matches only "<b>hello</b>" — stopping at the first closing tag.

How do I validate an email address with regex?

A practical regex for email validation is ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$. This pattern checks for one or more valid characters before the @ symbol, a domain name with dots, and a top-level domain of at least two letters. Note that the full RFC 5322 email specification is extremely complex, so most applications use a simplified pattern for basic validation and then confirm via a verification email.
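
In JavaScript, applying the pattern might look like the sketch below. `looksLikeEmail` is a hypothetical helper name, and trimming surrounding whitespace first is an assumption about how form input is usually handled.

```javascript
const emailShape = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;

// Hypothetical helper: a basic shape check only. A verification email is
// still needed to confirm that the address exists.
function looksLikeEmail(value) {
  return emailShape.test(value.trim());
}

const emailResults = [
  looksLikeEmail("user@example.com"),    // true
  looksLikeEmail("  user@example.com "), // true: whitespace is trimmed first
  looksLikeEmail("user@localhost"),      // false: no dot and top-level domain
  looksLikeEmail("user@example.c"),      // false: top-level domain is too short
  looksLikeEmail("user name@example.com"), // false: space in the local part
];

console.log(emailResults);
```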

What are lookahead and lookbehind in regex?

Lookahead and lookbehind are zero-width assertions that check whether a pattern exists ahead of or behind the current position without consuming characters. Positive lookahead (?=...) asserts that what follows matches the pattern. Negative lookahead (?!...) asserts that what follows does not match. Positive lookbehind (?<=...) and negative lookbehind (?<!...) work the same way but look backward. They are commonly used for password validation, conditional matching, and extracting text between delimiters.
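
As a sketch of the "text between delimiters" use case, the following extracts bracketed labels from an invented log line, first with lookarounds and then with a plain capturing group for comparison.

```javascript
const logLine = "[INFO] started [WARN] disk almost full";

// Lookarounds: the brackets are required but excluded from the match.
const levels = logLine.match(/(?<=\[)[^\]]+(?=\])/g);
// ["INFO", "WARN"]

// Equivalent without lookarounds: match the brackets, keep only group 1.
const levelsViaGroup = [...logLine.matchAll(/\[([^\]]+)\]/g)].map((m) => m[1]);
// ["INFO", "WARN"]

console.log(levels, levelsViaGroup);
```

The capturing-group version is often easier to read and also works in engines that lack lookbehind support.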

What do the regex flags g, i, m, s, u, and y mean?

In JavaScript regex: g (global) finds all matches instead of stopping at the first; i (case-insensitive) ignores uppercase vs lowercase; m (multiline) makes ^ and $ match the start and end of each line instead of the entire string; s (dotAll) makes the dot (.) match newline characters; u (unicode) enables full Unicode matching; y (sticky) matches only at the exact position indicated by the lastIndex property. Combine flags like /pattern/gi for a global, case-insensitive search.

Conclusion

Regular expressions are a universal tool that every developer encounters sooner or later. The fundamentals are straightforward: metacharacters give special meaning to characters, character classes define sets of characters to match, quantifiers control repetition, anchors pin matches to specific positions, groups capture and organize sub-patterns, and lookarounds enable conditional matching without consuming text.

The patterns in this cheat sheet cover the vast majority of real-world use cases — from validating email addresses and phone numbers to extracting data from structured text. Bookmark this page as a reference, and when you need to build and test a regex pattern interactively, use our free Regex Tester. It highlights matches in real time, shows capture group details, supports all JavaScript flags, and includes a library of preset patterns to get you started.
