Regex Across Languages and Tools
Understand how regex behavior differs between JavaScript, Python, Java, Go, and command-line tools like grep and sed.
Concept
Regular expression syntax looks similar across languages, but important differences exist. A pattern that works in one language may behave differently — or fail entirely — in another.
**Flavor families:**
- **PCRE** (Perl Compatible): PHP and many command-line tools (grep -P). The most feature-rich flavor. Ruby uses the Oniguruma engine, which is PCRE-like but distinct.
- **JavaScript/ECMAScript**: Supported in all browsers and Node.js. Has added features steadily (named groups in ES2018, lookbehind in ES2018, \p{...} with u flag).
- **Python re**: Similar to PCRE but with distinct syntax for named groups ((?P<name>...)) and no support for \p{...} property escapes (use the third-party regex module).
- **Java java.util.regex**: PCRE-like but with some quirks: the pattern is written as a Java string literal, so every regex backslash must itself be escaped ("\\d" in source code for the regex \d).
- **Go regexp**: Uses RE2 engine, which guarantees linear-time matching. Does NOT support backreferences or lookaheads/lookbehinds. This is a deliberate trade-off for performance safety.
- **.NET**: One of the most feature-rich built-in engines: supports balancing groups, variable-length lookbehinds, and more.
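To make the Python entry concrete, here is a minimal sketch showing the built-in re module rejecting \p{L}, plus one common standard-library workaround for "letters only". The helper names `supports` and `letters_only` and the sample text are illustrative assumptions, not part of any library.

```python
import re

def supports(pattern):
    """Return True if Python's built-in re module can compile the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# \p{L} is a Unicode property escape; the built-in re module rejects it.
print(supports(r"\p{L}+"))  # False

# Workaround: "word characters that are neither digits nor underscore".
# For str patterns, \w is Unicode-aware by default in Python 3.
LETTERS = re.compile(r"[^\W\d_]+")

def letters_only(text):
    """Return runs of letters, using only the standard library."""
    return LETTERS.findall(text)

print(letters_only("naïve café 42"))  # ['naïve', 'café']
```

The workaround covers the common "any letter" case only. For other properties, such as scripts or specific categories, the third-party regex module mentioned above is the usual route.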
**Key differences to watch for:**
| Feature | JS | Python | Java | Go | PCRE |
|---|---|---|---|---|---|
| Named groups | (?<n>...) | (?P<n>...) | (?<n>...) | (?P<n>...); also (?<n>...) since Go 1.22 | Both |
| Lookbehind | Fixed + variable | Fixed-length only | Bounded variable-length (unbounded has caveats in Java 9+) | Not supported | Fixed-length per alternative (bounded variable-length in PCRE2 10.43+) |
| \p{L} Unicode | With u flag | regex module only | Yes | Yes | Yes |
| Backreferences | \1 or \k<n> | \1 or (?P=n) | \1 or \k<n> | Not supported | \1 or \k<n> |
| Possessive x++ | Not supported | Yes (3.11+) | Yes | Not supported | Yes |
| Atomic groups | Not supported | Yes (3.11+) | Yes | Not supported | Yes |
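As a small illustration of the Backreferences row, the sketch below finds doubled words in Python using both the numbered form and Python's named form. The sample sentence and the helper name `doubled_words` are assumptions made for the example.

```python
import re

text = "this is is a test test"

# Numbered backreference: \1 must repeat whatever group 1 captured.
numbered = re.compile(r"\b(\w+) \1\b")

# Python's named form: (?P<word>...) defines the group, (?P=word) refers back to it.
named = re.compile(r"\b(?P<word>\w+) (?P=word)\b")

def doubled_words(pattern, s):
    """Return each word that appears twice in a row, separated by one space."""
    return [m.group(1) for m in pattern.finditer(s)]

print(doubled_words(numbered, text))  # ['is', 'test']
print(doubled_words(named, text))     # ['is', 'test']
```

Neither pattern would compile in Go's regexp package, since RE2 has no backreferences. A Go version would need to capture words and compare neighbors in ordinary code instead.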
**Command-line tools and editors:**
- **grep**: basic regex (BRE) by default; use grep -E for extended regex (ERE) or grep -P for PCRE (GNU grep only)
- **sed**: basic regex (BRE) by default; use sed -E for ERE. No lookarounds or named groups.
- **VS Code Find/Replace**: uses JavaScript regex syntax with the g and m flags implicitly enabled
**Practical tip:** If your pattern must work across multiple languages, stick to these widely supported features: character classes ([...], \d, \w, \s), quantifiers (+, *, ?, {n,m}), anchors (^, $), alternation (|), and basic groups (...). Avoid backreferences and lookarounds if Go support is required. For grep and sed without -P, prefer [0-9] over \d, since POSIX BRE and ERE do not define \d.
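One way to apply this tip is a quick lint pass that flags constructs whose support or syntax varies between engines. The sketch below is deliberately naive: the `NON_PORTABLE` table and the `portability_warnings` helper are hypothetical names, and the probes scan the raw pattern text, so escaped sequences such as \\1 or \++ can produce false positives.

```python
import re

# Illustrative probes only: each regex spots one construct whose support
# or syntax differs between the engines discussed in this lesson.
NON_PORTABLE = {
    "lookahead": r"\(\?[=!]",
    "lookbehind": r"\(\?<[=!]",
    "named group": r"\(\?P?<[A-Za-z_]",
    "backreference": r"\\[1-9]|\\k<|\(\?P=",
    "possessive quantifier": r"[*+?}]\+",
    "atomic group": r"\(\?>",
    "unicode property": r"\\[pP]\{",
}

def portability_warnings(pattern):
    """Return labels of constructs that may not work in every engine."""
    return [label for label, probe in NON_PORTABLE.items()
            if re.search(probe, pattern)]

print(portability_warnings(r"(\d{4})-(\d{2})"))         # []
print(portability_warnings(r"(?<=\$)\d+\.\d{2}"))       # ['lookbehind']
print(portability_warnings(r"(?<year>\d{4})\k<year>"))  # ['named group', 'backreference']
```

An empty list does not guarantee portability. It only means none of the listed constructs were spotted, so testing the pattern in each target engine is still the reliable check.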
/(?<year>\d{4})-(?<month>\d{2})/g
Named groups work in JS, Java, and .NET with this syntax. Python uses (?P<year>...) instead. Go uses (?P<year>...) syntax.
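When porting a JavaScript-style pattern to Python, the named-group syntax is often the only thing that needs to change. The sketch below shows one naive way to do that rewrite. The function name `js_named_groups_to_python` is hypothetical, and it assumes the pattern contains no escaped "(?<" sequences or character classes that merely look like group openers.

```python
import re

def js_named_groups_to_python(pattern):
    """Rewrite (?<name>...) and \\k<name> into Python's (?P<name>...) and (?P=name)."""
    # (?<name>  ->  (?P<name>   (the negative lookahead skips lookbehinds: (?<= and (?<!)
    converted = re.sub(r"\(\?<(?![=!])", "(?P<", pattern)
    # \k<name>  ->  (?P=name)
    converted = re.sub(r"\\k<(\w+)>", r"(?P=\1)", converted)
    return converted

js_pattern = r"(?<year>\d{4})-(?<month>\d{2})"
py_pattern = js_named_groups_to_python(js_pattern)
print(py_pattern)  # (?P<year>\d{4})-(?P<month>\d{2})

match = re.search(py_pattern, "Released 2024-03-15")
print(match.group("year"), match.group("month"))  # 2024 03
```

A text substitution like this is a convenience for simple patterns, not a parser. Anything with unusual escaping is safer to convert by hand.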
/(?<=\$)\d+\.\d{2}/g
Lookbehind works in JS, Java, PCRE, and .NET. Python supports only fixed-length lookbehind, which this pattern satisfies because \$ is exactly one character wide. Go does NOT support lookbehind at all.
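The fixed-length restriction is easy to see in Python. The sketch below runs the pattern above, shows a variable-length variant being rejected, and ends with a lookbehind-free rewrite that uses a capture group, a form that also suits engines without lookarounds. The sample strings and the helper name `compiles` are assumptions for illustration.

```python
import re

prices = "Subtotal $19.99, shipping $5.00, qty 3.50"

# Fixed-length lookbehind: the assertion is exactly one character wide.
fixed = re.compile(r"(?<=\$)\d+\.\d{2}")
print(fixed.findall(prices))  # ['19.99', '5.00']

def compiles(pattern):
    """Return True if Python's built-in re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Variable-length lookbehind (\s* can match any number of characters): rejected.
print(compiles(r"(?<=\$\s*)\d+\.\d{2}"))  # False

# Lookbehind-free alternative: match the prefix, capture only the number.
captured = re.compile(r"\$\s*(\d+\.\d{2})")
print(captured.findall("$ 19.99 and $5.00"))  # ['19.99', '5.00']
```

The capture-group form trades a slightly different match span (the dollar sign is consumed) for much wider engine support.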
Exercise
Write a pattern that matches a date in YYYY-MM-DD format using named capture groups. Use the JavaScript syntax (?<name>...) since the Regexflux tester uses the JavaScript engine.