How to Use Regex to Replace in Python

Python's built-in re module searches and replaces text using regular expressions, which makes it well suited to text processing that fixed-string methods cannot express. This guide covers a quick example, a step-by-step breakdown, edge cases, common mistakes, performance tips, and frequently asked questions.

Quick Example

import re

text = "Hello, my phone number is 123-456-7890."
pattern = r"\d{3}-\d{3}-\d{4}"
replacement = "[REDACTED]"
new_text = re.sub(pattern, replacement, text)
print(new_text)  # Output: Hello, my phone number is [REDACTED].

This example replaces a phone number pattern with a placeholder string.

Step-by-Step Breakdown

Importing the re module

import re

The re module is part of the Python Standard Library, so you don't need to install any additional dependencies.

Defining the text and pattern

text = "Hello, my phone number is 123-456-7890."
pattern = r"\d{3}-\d{3}-\d{4}"

The text variable holds the string we want to modify. The pattern variable is a regular expression that matches the phone number format. The \d special sequence matches any digit, and the {3} and {4} specify the exact number of repetitions.

Defining the replacement string

replacement = "[REDACTED]"

This is the string that will replace the matched phone number.
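The replacement string can also refer back to parts of the match. As a minimal sketch, assuming the goal is to keep the last four digits visible (a requirement the example above does not have), a capture group and a \1 backreference could be used like this:

```python
import re

text = "Hello, my phone number is 123-456-7890."
# Capture only the last four digits so the replacement can reuse them.
pattern = r"\d{3}-\d{3}-(\d{4})"
# \1 refers to the first capture group; the raw string keeps the backslash intact.
masked = re.sub(pattern, r"XXX-XXX-\1", text)
print(masked)  # Hello, my phone number is XXX-XXX-7890.
```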

Using re.sub() to replace the pattern

new_text = re.sub(pattern, replacement, text)

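The re.sub() function takes three required arguments: the pattern to match, the replacement (a string or a function), and the text to search. Two optional arguments, count and flags, limit the number of replacements and change how the pattern is matched. It returns a new string with every non-overlapping occurrence of the pattern replaced; the original string is left unchanged.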
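Beyond the three arguments used here, re.sub() also accepts the optional keyword arguments count and flags. The sketch below uses a made-up sentence with two phone numbers to show what each one does:

```python
import re

text = "Call 123-456-7890 or 987-654-3210."
pattern = r"\d{3}-\d{3}-\d{4}"

# count=1 limits the substitution to the first match only.
first_only = re.sub(pattern, "[REDACTED]", text, count=1)
print(first_only)  # Call [REDACTED] or 987-654-3210.

# flags changes how the pattern is interpreted; re.IGNORECASE is one option.
dial = re.sub(r"call", "Dial", text, flags=re.IGNORECASE)
print(dial)  # Dial 123-456-7890 or 987-654-3210.
```

Passing count and flags by keyword avoids mixing up their positions.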

Printing the result

print(new_text)  # Output: Hello, my phone number is [REDACTED].

The modified string is printed to the console.

Handling Edge Cases

Empty/null input

text = ""
new_text = re.sub(pattern, replacement, text)
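print(new_text)  # Output: (an empty line)

If the input string is empty, re.sub() returns an empty string. A null value (None) is not treated as empty: passing None raises a TypeError, as with any other non-string input.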

Invalid input

text = 123
try:
    new_text = re.sub(pattern, replacement, text)
except TypeError:
    print("Error: Input must be a string.")

If the input is not a string, re.sub() will raise a TypeError. You can catch this exception and handle it accordingly.
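One way to make this check explicit is to validate the type before calling re.sub(). The helper below, redact_phone, is a hypothetical name introduced for illustration, and the sketch assumes that rejecting None and numbers with a clear message is the desired behavior:

```python
import re

PHONE_PATTERN = r"\d{3}-\d{3}-\d{4}"

def redact_phone(value):
    """Hypothetical helper: redact phone numbers, rejecting non-string input early."""
    if not isinstance(value, str):
        raise TypeError(f"expected str, got {type(value).__name__}")
    return re.sub(PHONE_PATTERN, "[REDACTED]", value)

print(redact_phone("Call 123-456-7890"))  # Call [REDACTED]

# Both None and an integer are rejected with a TypeError.
for bad in (None, 123):
    try:
        redact_phone(bad)
    except TypeError as exc:
        print("Error:", exc)
```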

Large input

large_text = "Hello, my phone number is 123-456-7890." * 1000
new_text = re.sub(pattern, replacement, large_text)
print(len(new_text))  # Output: 37000

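re.sub() scans the input once, so simple patterns like this one scale roughly linearly with input size. Patterns with nested or ambiguous quantifiers can backtrack heavily and become much slower on large inputs.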

Unicode/special characters

text = "Hello, my phone number is +1-123-456-7890."
pattern = r"\+\d{1,2}-\d{3}-\d{3}-\d{4}"
replacement = "[REDACTED]"
new_text = re.sub(pattern, replacement, text)
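print(new_text)  # Output: Hello, my phone number is [REDACTED].

In a pattern, + is a metacharacter (one or more repetitions), so a literal plus sign must be escaped as \+. The hyphen needs no escaping outside a character class. Unicode text needs no special handling: in Python 3, str patterns and strings are Unicode by default.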
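When the text to replace is a known literal rather than a pattern, re.escape() can do the escaping. The sketch below assumes the exact number is known in advance, which is not the case in the example above, and the variable name known_number is illustrative:

```python
import re

text = "Hello, my phone number is +1-123-456-7890."
known_number = "+1-123-456-7890"  # literal text, not a regex

# Used directly as a pattern, the leading + is a quantifier with nothing to repeat.
try:
    re.sub(known_number, "[REDACTED]", text)
except re.error as exc:
    print("Invalid pattern:", exc)

# re.escape() backslash-escapes the metacharacters so the text matches literally.
new_text = re.sub(re.escape(known_number), "[REDACTED]", text)
print(new_text)  # Hello, my phone number is [REDACTED].
```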

Common Mistakes

Wrong pattern

pattern = r"\d{3}\d{3}\d{4}"  # incorrect pattern

Corrected code:

pattern = r"\d{3}-\d{3}-\d{4}"  # correct pattern

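Without the hyphens, this pattern matches ten consecutive digits (1234567890) and never matches 123-456-7890. Check the pattern against a sample of the real target text.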

Missing r prefix

pattern = "\d{3}-\d{3}-\d{4}"  # missing r prefix

Corrected code:

pattern = r"\d{3}-\d{3}-\d{4}"  # correct r prefix

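The r prefix creates a raw string, so backslashes reach the regex engine unchanged. Without it, "\d" happens to work because Python leaves unrecognized escapes alone, but recent Python versions emit a warning for it, and escapes Python does recognize, such as "\b" (backspace), silently change the meaning of the pattern.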
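The effect of a missing r prefix is easiest to see with \b, which means a word boundary to the regex engine and a backspace character to Python. The sample text below is made up for illustration:

```python
import re

text = "cat concat cat"

# In a raw string, \b reaches the regex engine as a word-boundary assertion.
with_raw = re.sub(r"\bcat\b", "dog", text)
print(with_raw)  # dog concat dog

# Without the r prefix, Python turns "\b" into a backspace character (\x08),
# so the pattern no longer matches anything in this text.
without_raw = re.sub("\bcat\b", "dog", text)
print(without_raw)  # cat concat cat
```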

Not handling exceptions

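try:
    new_text = re.sub(pattern, replacement, text)
except re.error as e:
    print("Invalid pattern:", e)
except TypeError as e:
    print("Invalid input:", e)

re.sub() raises re.error when the pattern is not a valid regular expression and TypeError when the input is not a string. Catch these specific exceptions rather than a bare Exception, which can hide unrelated bugs.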
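Invalid patterns are a common source of exceptions, especially when the pattern comes from user input. The wrapper below, safe_sub, is a hypothetical name, and returning the text unchanged on failure is an assumption about the desired fallback:

```python
import re

def safe_sub(pattern, replacement, text):
    """Hypothetical wrapper that reports an invalid pattern instead of crashing."""
    try:
        return re.sub(pattern, replacement, text)
    except re.error as exc:
        print(f"Invalid pattern {pattern!r}: {exc}")
        return text  # assumed fallback: leave the text as it was

# A valid pattern is applied normally.
print(safe_sub(r"\d{3}-\d{3}-\d{4}", "[REDACTED]", "Call 123-456-7890"))

# An unbalanced parenthesis raises re.error, so the text comes back unchanged.
print(safe_sub(r"(\d{3}", "[REDACTED]", "Call 123-456-7890"))
```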

Performance Tips

Use compiled patterns

compiled_pattern = re.compile(pattern)
new_text = compiled_pattern.sub(replacement, text)

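Compiling the pattern once avoids repeated cache lookups when the same pattern is used many times, for example inside a loop. The re module already caches recently used patterns, so the gain is usually modest; the main benefit is often readability.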
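A typical place for a compiled pattern is a loop over many strings. The sketch below uses a made-up list of records and the illustrative name PHONE_RE:

```python
import re

# Compile once, then reuse the pattern object for every line.
PHONE_RE = re.compile(r"\d{3}-\d{3}-\d{4}")

lines = [
    "Alice: 123-456-7890",
    "Bob: no number on file",
    "Carol: 555-000-1111 and 555-000-2222",
]

redacted = [PHONE_RE.sub("[REDACTED]", line) for line in lines]
for line in redacted:
    print(line)
# Alice: [REDACTED]
# Bob: no number on file
# Carol: [REDACTED] and [REDACTED]
```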

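Use a function only when the replacement depends on the match

new_text = re.sub(pattern, lambda match: "#" * len(match.group()), text)

re.sub() accepts a function as the replacement and calls it once per match with the match object. This adds a function call per match, so it does not improve performance; for a fixed replacement, pass a plain string instead.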
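A replacement function is mainly useful when the new text has to be computed from the match. As a sketch, the hypothetical function mask_digits below hides each digit while leaving the hyphens in place:

```python
import re

text = "Hello, my phone number is 123-456-7890."
pattern = r"\d{3}-\d{3}-\d{4}"

def mask_digits(match):
    # Replace every digit in the matched text with '#', leaving hyphens in place.
    return re.sub(r"\d", "#", match.group())

# re.sub() calls mask_digits once per match and inserts its return value.
masked = re.sub(pattern, mask_digits, text)
print(masked)  # Hello, my phone number is ###-###-####.
```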

Avoid unnecessary replacements

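if "-" in text:  # cheap literal check: every match contains a hyphen
    new_text = re.sub(pattern, replacement, text)
else:
    new_text = text

The in operator tests for a literal substring, so pattern in text would look for the regex source itself and almost never be true. When nothing matches, re.sub() already returns the text unchanged, so a guard only helps as a cheap literal pre-check on a character every match must contain.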
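It can help to confirm what re.sub() does when there is nothing to replace. The short check below, on a made-up sentence, shows that the result equals the input, so any guard is purely an optimization:

```python
import re

text = "No phone number here."
pattern = r"\d{3}-\d{3}-\d{4}"

# With no match, re.sub() returns a string equal to the input.
result = re.sub(pattern, "[REDACTED]", text)
print(result == text)  # True

# re.search() reports whether the pattern occurs at all.
has_match = re.search(pattern, text) is not None
print(has_match)  # False
```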

FAQ

Q: What is the difference between re.sub() and str.replace()?

Answer: re.sub() matches a regular expression pattern, while str.replace() replaces an exact, literal substring. For fixed text, str.replace() is simpler and usually faster.
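A side-by-side comparison on a made-up sentence illustrates the difference:

```python
import re

text = "Order 12 shipped, order 345 pending."

# str.replace() only swaps an exact substring.
literal_result = text.replace("12", "N")
print(literal_result)  # Order N shipped, order 345 pending.

# re.sub() swaps anything that matches the pattern, here any run of digits.
regex_result = re.sub(r"\d+", "N", text)
print(regex_result)  # Order N shipped, order N pending.
```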

Q: Can I use re.sub() with non-string inputs?

Answer: re.sub() accepts str input, or bytes-like input when the pattern and replacement are also bytes. Other types, such as int or None, raise a TypeError; convert them with str() first.
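The sketch below shows the two cases that do work: bytes input paired with a bytes pattern, and a number converted with str() first. The sample values are made up:

```python
import re

data = b"Call 123-456-7890"
# A bytes input needs a bytes pattern and a bytes replacement.
bytes_result = re.sub(rb"\d{3}-\d{3}-\d{4}", b"[REDACTED]", data)
print(bytes_result)  # b'Call [REDACTED]'

# Non-string values such as numbers can be converted with str() first.
number_result = re.sub(r"\d", "#", str(12345))
print(number_result)  # #####
```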

Q: How do I handle Unicode characters in my pattern?

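Answer: In Python 3, str patterns are Unicode by default, so Unicode characters can be written directly in the pattern. Escapes of the form \uXXXX (four hex digits) and \UXXXXXXXX (eight hex digits) also work. Classes such as \d and \w match Unicode digits and word characters unless the re.ASCII flag is set.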
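As a small illustration on made-up text, the sketch below matches the euro sign through a \u escape and shows how the re.ASCII flag narrows what \d matches:

```python
import re

text = "Café costs 5€, café crème costs 7€."

# \u20ac is the euro sign; writing the literal character would work equally well.
euro_result = re.sub(r"(\d+)\u20ac", r"EUR \1", text)
print(euro_result)  # Café costs EUR 5, café crème costs EUR 7.

# \d matches Unicode decimal digits by default; re.ASCII restricts it to 0-9.
room = "Room \u0663"  # ARABIC-INDIC DIGIT THREE
unicode_digits = re.sub(r"\d", "#", room)
ascii_digits = re.sub(r"\d", "#", room, flags=re.ASCII)
print(unicode_digits)        # Room #
print(ascii_digits == room)  # True
```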

Q: Can I use re.sub() with large input strings?

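Answer: Yes. re.sub() processes the whole string in memory in one pass. Simple patterns scale roughly linearly with input size; patterns prone to heavy backtracking can be much slower.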

Q: What is the difference between re.sub() and re.subn()?

Answer: re.sub() returns the modified string, while re.subn() returns a tuple containing the modified string and the number of replacements made.
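A short example on a made-up sentence shows the tuple that re.subn() returns:

```python
import re

text = "Call 123-456-7890 or 987-654-3210."

# re.subn() returns (new_string, number_of_replacements).
new_text, count = re.subn(r"\d{3}-\d{3}-\d{4}", "[REDACTED]", text)
print(new_text)  # Call [REDACTED] or [REDACTED].
print(count)     # 2
```

The count is handy for logging or for checking that a replacement actually happened.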
