Regular Expressions in Python
Regular expressions (regex) in Python are a powerful tool for pattern matching, text searching, and string manipulation. They let developers search, extract, and replace text efficiently using specific patterns. Python provides the built-in re module for working with regular expressions.
In this article, we’ll explore what regular expressions are, their syntax, and practical examples in Python.
A regular expression (regex) is a sequence of characters that defines a search pattern. It is widely used for:
Validating input (emails, phone numbers, passwords).
Searching for patterns in text.
Extracting useful information (like dates, URLs, or numbers).
Replacing or splitting strings based on patterns.
re Module
To use regular expressions in Python, first import the built-in re module:
import re
Python’s re module provides several useful functions:
Function | Description |
---|---|
re.match() | Checks if the pattern matches at the beginning of a string. |
re.search() | Scans the string and returns the first match found anywhere in it. |
re.findall() | Returns all non-overlapping matches as a list. |
re.sub() | Replaces one or more matches with a given string. |
re.split() | Splits a string by the occurrences of a pattern. |
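The difference between re.match() and re.search() is easy to miss, so here is a minimal sketch of it. The sample string is made up for illustration.

```python
import re

text = "Order 1234 shipped"

# re.match() only succeeds if the pattern matches at position 0.
at_start = re.match(r"\d+", text)
print(at_start)                       # None: the string starts with "Order"

# re.search() scans forward and returns the first match anywhere.
first = re.search(r"\d+", text)
print(first.group(), first.span())    # 1234 (6, 10)
```

Both functions return a match object on success and None on failure, so the result can be used directly in an if statement.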
Pattern | Meaning |
---|---|
\d | Matches any digit (0–9). |
\D | Matches any non-digit. |
\w | Matches word characters (letters, digits, underscore). |
\W | Matches non-word characters. |
\s | Matches whitespace (spaces, tabs, newlines). |
. | Matches any character except newline. |
^ | Matches the start of a string. |
$ | Matches the end of a string. |
* | Matches 0 or more repetitions of the preceding element. |
+ | Matches 1 or more repetitions of the preceding element. |
{n,m} | Matches between n and m repetitions of the preceding element. |
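To see a few of these patterns in action, the following sketch runs them against a made-up sample string (the text and variable names are illustrative only).

```python
import re

sample = "Room 42, code A_7\tfloor 3"

digits = re.findall(r"\d", sample)      # one digit per match
numbers = re.findall(r"\d+", sample)    # + groups consecutive digits
words = re.findall(r"\w+", sample)      # letters, digits, underscore
spaces = re.findall(r"\s", sample)      # spaces and the tab

print(digits)    # ['4', '2', '7', '3']
print(numbers)   # ['42', '7', '3']
print(words)     # ['Room', '42', 'code', 'A_7', 'floor', '3']
print(spaces)    # [' ', ' ', ' ', '\t', ' ']

# ^ and $ anchor the pattern to the start and end of the string.
starts_with_room = bool(re.search(r"^Room", sample))   # True
ends_with_digit = bool(re.search(r"\d$", sample))      # True

# {2,3} takes two or three digits at a time; a longer run is split up.
chunks = re.findall(r"\d{2,3}", "7 42 123 98765")
print(chunks)    # ['42', '123', '987', '65']
```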
import re
email = "user@example.com"  # placeholder address for illustration
pattern = r'^[\w\.-]+@[\w\.-]+\.\w+$'
if re.match(pattern, email):
    print("Valid Email")
else:
    print("Invalid Email")
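One caveat when validating input this way: $ also matches just before a trailing newline, so re.match() with ^...$ accepts a string that ends in "\n". A possible alternative is re.fullmatch(), which requires the entire string to match. The sketch below uses an equivalent pattern without the anchors and a placeholder address.

```python
import re

# Same character classes as the pattern above, without the ^ and $ anchors.
pattern = r"[\w.-]+@[\w.-]+\.\w+"

candidate = "user@example.com\n"   # placeholder address with a stray newline

with_dollar = bool(re.match(pattern + r"$", candidate))
with_fullmatch = bool(re.fullmatch(pattern, candidate))
after_strip = bool(re.fullmatch(pattern, candidate.strip()))

print(with_dollar)      # True: $ tolerates the final newline
print(with_fullmatch)   # False: the whole string must match
print(after_strip)      # True
```

This pattern is a loose format check, not a full email validator; it only confirms the general shape of an address.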
import re
text = "Order numbers: 1234, 5678, and 91011"
numbers = re.findall(r'\d+', text)
print(numbers) # ['1234', '5678', '91011']
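When the pieces of each match matter (for example the year, month, and day of a date), capture groups can pull them apart. This is a small sketch with an invented log line; the group names are arbitrary.

```python
import re

log = "Created 2024-01-15, updated 2024-03-02"

# With capture groups, findall() returns one tuple of group values per match.
dates = re.findall(r"(\d{4})-(\d{2})-(\d{2})", log)
print(dates)   # [('2024', '01', '15'), ('2024', '03', '02')]

# finditer() yields match objects, so named groups can be read by name.
named = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")
years = [m.group("year") for m in named.finditer(log)]
days = [m.group("day") for m in named.finditer(log)]
print(years, days)   # ['2024', '2024'] ['15', '02']
```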
import re
text = "I like Java"
new_text = re.sub(r'Java', 'Python', text)
print(new_text) # I like Python
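re.sub() can do more than swap fixed strings. The sketch below, using made-up input, shows three options: backreferences to captured groups, the count argument, and a function as the replacement.

```python
import re

text = "2024-01-15 and 2024-03-02"

# \1, \2, \3 refer to the captured groups: reorder YYYY-MM-DD as DD/MM/YYYY.
swapped = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", text)
print(swapped)   # 15/01/2024 and 02/03/2024

# count limits how many matches are replaced.
once = re.sub(r"\d{4}", "YYYY", text, count=1)
print(once)      # YYYY-01-15 and 2024-03-02

# A function receives each match object and returns the replacement text.
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), "3 apples, 10 pears")
print(doubled)   # 6 apples, 20 pears
```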
import re
text = "apple,banana;grape orange"
fruits = re.split(r'[;,\s]\s*', text)
print(fruits) # ['apple', 'banana', 'grape', 'orange']
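Two details of re.split() are worth knowing: a delimiter at the start or end of the string produces empty strings in the result, and maxsplit caps the number of splits. A short sketch with illustrative input:

```python
import re

messy = " apple, banana "

# Leading and trailing delimiters leave empty strings at the edges.
parts = re.split(r"[;,\s]+", messy)
print(parts)     # ['', 'apple', 'banana', '']

# Filter the empty strings out if they are not wanted.
cleaned = [p for p in parts if p]
print(cleaned)   # ['apple', 'banana']

# maxsplit=1 splits only at the first delimiter.
head_tail = re.split(r",", "a,b,c", maxsplit=1)
print(head_tail)  # ['a', 'b,c']
```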
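Use raw strings (r"pattern") so that backslashes reach the regex engine unchanged instead of being read as Python escape sequences.
Keep regex patterns simple and readable.
Test your regex using tools like regex101.
Compile patterns you reuse with re.compile(); this avoids looking the pattern up again on every call and gives it a single named definition.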
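As a rough illustration of the last two tips, the sketch below compiles a pattern once and reuses it, then uses the re.VERBOSE flag to spread a pattern over several commented lines. The names ORDER_ID and EMAIL and the sample data are hypothetical.

```python
import re

# Compile once, reuse many times.
ORDER_ID = re.compile(r"\b\d{4,}\b")   # runs of four or more digits

lines = ["Order numbers: 1234, 5678", "No ids here", "Late order 91011"]
found = [ORDER_ID.findall(line) for line in lines]
print(found)   # [['1234', '5678'], [], ['91011']]

# re.VERBOSE ignores whitespace in the pattern and allows inline comments.
EMAIL = re.compile(r"""
    [\w.-]+     # local part
    @
    [\w.-]+     # domain name
    \.\w+       # top-level domain
""", re.VERBOSE)

looks_valid = EMAIL.fullmatch("user@example.com") is not None
print(looks_valid)   # True
```

A compiled pattern object exposes the same operations as the module-level functions (match, search, findall, sub, split), so switching between the two styles needs little rewriting.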
Regular expressions in Python are a must-know tool for text processing, validation, and data cleaning. By mastering regex patterns and Python’s re module, you can handle complex text manipulation tasks efficiently.
Whether you’re validating user input, cleaning datasets, or searching logs, regex provides the flexibility and power needed for robust Python applications.