HTTP/HTTPS URL
Difficulty: intermediate
Matches URLs starting with http:// or https://.
Pattern
/https?://[\w.-]+(?:\.[a-zA-Z]{2,})(?:[/\w.-]*)*(?:\?[\w=&.-]*)?(?:#[\w-]*)?/gi
Explanation
Matches URLs with http or https protocol, domain with TLD, optional path segments, query string parameters, and fragment identifiers.
When to Use
Use this pattern to extract or validate HTTP/HTTPS URLs from plain text such as user comments, log files, or documents. For strict URL parsing, use the built-in URL constructor (JavaScript) or urllib.parse (Python) instead. This pattern covers common URL formats but does not handle internationalized domain names (IDN) or percent-encoded characters (for example %20); a match stops at the first character outside the allowed classes.
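As a minimal sketch of that split between extraction and strict parsing, the snippet below uses the pattern only to find candidate substrings and then hands each one to the built-in URL constructor. The helper name extractUrls and the sample text are assumptions made for this illustration.

```javascript
// Pattern from this page, used only to locate candidate substrings.
const urlPattern = /https?:\/\/[\w.-]+(?:\.[a-zA-Z]{2,})(?:[\/\w.-]*)*(?:\?[\w=&.-]*)?(?:#[\w-]*)?/gi;

// Hypothetical helper: returns a URL object for every candidate
// that the built-in URL constructor accepts.
function extractUrls(text) {
  const candidates = text.match(urlPattern) ?? [];
  const parsed = [];
  for (const candidate of candidates) {
    try {
      parsed.push(new URL(candidate)); // throws TypeError on invalid input
    } catch {
      // Skip candidates the parser rejects.
    }
  }
  return parsed;
}

const sampleComment =
  "Docs moved to https://example.com/docs?v=2#intro, see also http://sub.domain.co.uk/path for details";
const urls = extractUrls(sampleComment);
console.log(urls.map((u) => u.hostname)); // [ 'example.com', 'sub.domain.co.uk' ]
```

Because dots are allowed in the path class, a URL that ends a sentence will include the trailing period in the match, so some post-processing may still be needed.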
Step-by-Step Breakdown
| Token | Explanation |
|---|---|
| `https?` | Match 'http' followed by an optional 's', covering both HTTP and HTTPS |
| `://` | Match the literal protocol separator '://' |
| `[\w.-]+` | Match one or more word characters, dots, or hyphens: the domain and any subdomains (e.g., 'api.example') |
| `(?:\.[a-zA-Z]{2,})` | Match a dot followed by two or more letters: the final label of the host, i.e. the top-level domain (.com, .org, or the .uk of .co.uk) |
| `(?:[/\w.-]*)*` | Match an optional path made of slashes, word characters, dots, or hyphens |
| `(?:\?[\w=&.-]*)?` | Optionally match a query string starting with ? containing key=value pairs separated by & |
| `(?:#[\w-]*)?` | Optionally match a fragment identifier starting with # (e.g., #section-1) |
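To see which part of a URL each token in the table consumes, the sketch below wraps the same tokens in named capture groups. The group names (scheme, host, path, query, fragment) are labels chosen for this illustration and are not part of the original pattern.

```javascript
// Same tokens as the breakdown table, wrapped in named groups for inspection.
const labelled = new RegExp(
  String.raw`(?<scheme>https?)` +                 // http or https
  String.raw`://` +                               // literal separator
  String.raw`(?<host>[\w.-]+(?:\.[a-zA-Z]{2,}))` + // domain labels plus TLD
  String.raw`(?<path>(?:[/\w.-]*)*)` +            // optional path
  String.raw`(?<query>\?[\w=&.-]*)?` +            // optional query string
  String.raw`(?<fragment>#[\w-]*)?`,              // optional fragment
  "i"
);

const result = labelled.exec("http://sub.domain.co.uk/path?q=1#top");
const { scheme, host, path, query, fragment } = result.groups;
console.log({ scheme, host, path, query, fragment });
// scheme: 'http', host: 'sub.domain.co.uk', path: '/path', query: '?q=1', fragment: '#top'
```

Building the expression with the RegExp constructor and String.raw avoids having to escape the forward slashes, which is only required inside a /.../ literal.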
Common Mistakes
Forgetting to escape the dot before the TLD, e.g., [\w.-]+.[a-zA-Z]{2,}
Fix: [\w.-]+\.[a-zA-Z]{2,}
An unescaped dot matches any character, so the pattern accepts strings such as https://example_com as valid URLs.
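A short check of this mistake, using only the scheme and host portion of the pattern. The variable names are illustrative.

```javascript
// Host portion only, with and without the escaped dot.
const unescapedDot = /https?:\/\/[\w.-]+.[a-zA-Z]{2,}/i;
const escapedDot = /https?:\/\/[\w.-]+\.[a-zA-Z]{2,}/i;

const noDotInput = "https://example_com";

console.log(unescapedDot.test(noDotInput)); // true: "." matched the letter "c"
console.log(escapedDot.test(noDotInput));   // false: no literal dot in the host
```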
Using greedy .* for the path instead of a restricted character class
Fix: (?:[/\w.-]*)*
A greedy .* runs to the end of the line, so it swallows whatever text follows the URL and adds needless backtracking on long inputs. Use a character class that allows only the characters expected in a URL path.
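The difference is easy to see on a sentence that continues after the URL. This is a small sketch with made-up sample text.

```javascript
// Path matched with a greedy wildcard versus the restricted character class.
const greedyPath = /https?:\/\/[\w.-]+\.[a-zA-Z]{2,}.*/i;
const restrictedPath = /https?:\/\/[\w.-]+\.[a-zA-Z]{2,}(?:[\/\w.-]*)*/i;

const line = "See https://example.com/docs for the full guide";

console.log(greedyPath.exec(line)[0]);     // "https://example.com/docs for the full guide"
console.log(restrictedPath.exec(line)[0]); // "https://example.com/docs"
```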
Not including the query string and fragment portions
Fix: (?:\?[\w=&.-]*)?(?:#[\w-]*)?
URLs frequently include query parameters (?key=value) and fragment identifiers (#section). Without these groups, valid URLs like https://example.com/search?q=test are only partially matched.
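The partial match can be reproduced with the example URL from the explanation. The variable names are chosen for this sketch.

```javascript
// Pattern without and with the optional query and fragment groups.
const withoutQuery = /https?:\/\/[\w.-]+\.[a-zA-Z]{2,}(?:[\/\w.-]*)*/i;
const withQuery = /https?:\/\/[\w.-]+\.[a-zA-Z]{2,}(?:[\/\w.-]*)*(?:\?[\w=&.-]*)?(?:#[\w-]*)?/i;

const searchUrl = "https://example.com/search?q=test";

console.log(withoutQuery.exec(searchUrl)[0]); // "https://example.com/search"
console.log(withQuery.exec(searchUrl)[0]);    // "https://example.com/search?q=test"
```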
Test Strings
Matching
- https://example.com
- http://sub.domain.co.uk/path?q=1#top
- https://api.example.com/v2/users
Non-matching
- ftp://files.com
- example.com
- not-a-url
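The test strings above can be checked with a few lines of JavaScript. This sketch drops the g flag, because a global regex keeps state in lastIndex between test() and exec() calls; the helper matchesWhole is a hypothetical name introduced here.

```javascript
// Same pattern as this page, without the g flag so each call starts at index 0.
const pattern = /https?:\/\/[\w.-]+(?:\.[a-zA-Z]{2,})(?:[\/\w.-]*)*(?:\?[\w=&.-]*)?(?:#[\w-]*)?/i;

const matching = [
  "https://example.com",
  "http://sub.domain.co.uk/path?q=1#top",
  "https://api.example.com/v2/users",
];
const nonMatching = ["ftp://files.com", "example.com", "not-a-url"];

// Hypothetical helper: true only when the pattern consumes the entire string.
function matchesWhole(candidate) {
  const found = pattern.exec(candidate);
  return found !== null && found[0] === candidate;
}

for (const s of matching) console.log(s, "->", matchesWhole(s));    // true for each
for (const s of nonMatching) console.log(s, "->", pattern.test(s)); // false for each
```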
Language Compatibility
| Language | Support |
|---|---|
| JavaScript | Full support |
| Python | Full support |
| Java | Full support |
| PHP | Full support |
| Go | Full support |
| Ruby | Full support |
| C# | Full support |
Code Snippets
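// Same pattern as above, with the slashes escaped for a JavaScript regex literal.
const regex = /https?:\/\/[\w.-]+(?:\.[a-zA-Z]{2,})(?:[\/\w.-]*)*(?:\?[\w=&.-]*)?(?:#[\w-]*)?/gi;
const text = "your text here";
// match() returns null when nothing matches, so fall back to an empty array.
const matches = text.match(regex) ?? [];
console.log(matches);
Common Variations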
HTTPS only
https://[\w.-]+(?:\.[a-zA-Z]{2,})(?:[/\w.-]*)*
Matches only HTTPS URLs. This variation omits the query-string and fragment groups, so a match stops before any ? or #.
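A brief sketch of the HTTPS-only variation applied to text that mixes both schemes. The sample string is invented for illustration.

```javascript
// HTTPS-only variation: the "s" is required and there are no query/fragment groups.
const httpsOnly = /https:\/\/[\w.-]+(?:\.[a-zA-Z]{2,})(?:[\/\w.-]*)*/gi;

const mixed = "http://example.com/a https://example.com/b?x=1";
const secureLinks = mixed.match(httpsOnly) ?? [];

console.log(secureLinks); // [ 'https://example.com/b' ]
```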