How to Use Regex to Match in C#

Regular expressions (regex) are a powerful tool for matching patterns in strings. In C#, the System.Text.RegularExpressions namespace provides a robust implementation of regex, allowing developers to efficiently search, validate, and extract data from strings. In this article, we'll explore how to use regex to match in C#, covering the basics, edge cases, and performance tips.

Quick Example

using System; // needed for Console
using System.Text.RegularExpressions;

public class RegexExample
{
    public static bool IsValidEmail(string input)
    {
        var regex = new Regex(@"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$");
        return regex.IsMatch(input);
    }

    public static void Main()
    {
        Console.WriteLine(IsValidEmail("test@example.com")); // True
        Console.WriteLine(IsValidEmail("invalid_email")); // False
    }
}

This example demonstrates a simple email validation using regex.

Step-by-Step Breakdown

Let's dissect the code:

  1. using System.Text.RegularExpressions;: We import the System.Text.RegularExpressions namespace, which provides the Regex class.
  2. var regex = new Regex(@"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$");: We create a new Regex object with a pattern that matches most common email address formats. The pattern consists of:
    • ^ asserts the start of the string
    • [a-zA-Z0-9._%+-]+ matches one or more ASCII letters, digits, dots, underscores, percent signs, plus signs, or hyphens
    • @ matches the @ symbol
    • [a-zA-Z0-9.-]+ matches one or more ASCII letters, digits, dots, or hyphens
    • \. matches a literal period (escaped with a backslash because . has a special meaning in regex)
    • [a-zA-Z]{2,} matches the top-level domain, which must be at least two letters long
    • $ asserts the end of the string (in .NET it also matches just before a final newline; use \z for a strict end-of-string check)
  3. return regex.IsMatch(input);: We use the IsMatch method to test whether the input string matches the regex pattern.
  4. public static void Main(): We define a Main method to test the IsValidEmail method.
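
To check the breakdown against real inputs, the sketch below runs the same pattern over a handful of sample strings. The class name PatternDemo and the sample addresses are illustrative and not part of the original example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternDemo
{
    // Same pattern as in the Quick Example, created once and reused.
    private static readonly Regex EmailPattern =
        new Regex(@"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$");

    public static bool Check(string input) => EmailPattern.IsMatch(input);

    public static void Main()
    {
        string[] samples =
        {
            "test@example.com",                // True
            "first.last+tag@mail.example.org", // True: dots and + are allowed before the @
            "test@example",                    // False: no \.[a-zA-Z]{2,} part
            "test@example.c",                  // False: top-level domain shorter than two letters
            " test@example.com"                // False: ^ rejects the leading space
        };

        foreach (var sample in samples)
        {
            Console.WriteLine($"\"{sample}\" -> {Check(sample)}");
        }
    }
}
```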

Handling Edge Cases

Here are some common edge cases to consider:

Empty/Null Input

public static bool IsValidEmail(string input)
{
    if (string.IsNullOrEmpty(input)) return false;
    // ...
}

The string.IsNullOrEmpty guard returns false for null or empty input before the regex runs. Without it, IsMatch throws an ArgumentNullException when the input is null.

Invalid Pattern

public static bool IsValidEmail(string input)
{
    try
    {
        var regex = new Regex(@"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$");
        return regex.IsMatch(input);
    }
    catch (ArgumentException ex)
    {
        Console.WriteLine($"Invalid regex pattern: {ex.Message}");
        return false;
    }
}

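We wrap the regex creation and matching in a try-catch block. The Regex constructor throws an ArgumentException when the pattern is malformed, so this guard matters mainly when the pattern comes from configuration or user input; the hard-coded pattern shown here always parses. Note that IsMatch throws an ArgumentNullException (a subclass of ArgumentException) for null input, so this catch also swallows that case and reports it with a misleading message.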
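
The try-catch is most useful when the pattern is not known at compile time. A minimal sketch of that case follows; PatternLoader and TryCreateRegex are hypothetical names introduced here for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternLoader
{
    // Hypothetical helper: reports failure instead of throwing when a pattern does not parse.
    public static bool TryCreateRegex(string pattern, out Regex regex)
    {
        try
        {
            regex = new Regex(pattern);
            return true;
        }
        catch (ArgumentException)
        {
            // Malformed pattern, or a null pattern
            // (ArgumentNullException derives from ArgumentException).
            regex = null;
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(TryCreateRegex(@"^\d{3}-\d{4}$", out _)); // True
        Console.WriteLine(TryCreateRegex(@"[a-z", out _));          // False: unterminated character class
    }
}
```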

Large Input

public static bool IsValidEmail(string input)
{
    if (input == null || input.Length > 1000) return false; // arbitrary limit; null check avoids a NullReferenceException
    // ...
}

We add a simple length check to return false for excessively long inputs.
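
The guards above can be combined into one method. The sketch below also passes a match timeout to the Regex constructor, which is an addition not shown in the original snippets; the 250 ms value and the class name GuardedEmailValidator are assumptions chosen for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GuardedEmailValidator
{
    private const int MaxLength = 1000; // same arbitrary limit as above

    // The third constructor argument makes matching throw
    // RegexMatchTimeoutException if it runs longer than the given interval.
    private static readonly Regex EmailPattern = new Regex(
        @"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$",
        RegexOptions.None,
        TimeSpan.FromMilliseconds(250)); // assumed timeout value

    public static bool IsValidEmail(string input)
    {
        if (string.IsNullOrEmpty(input) || input.Length > MaxLength) return false;

        try
        {
            return EmailPattern.IsMatch(input);
        }
        catch (RegexMatchTimeoutException)
        {
            // Treat inputs that take too long to match as invalid.
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(IsValidEmail("test@example.com"));                         // True
        Console.WriteLine(IsValidEmail(null));                                       // False
        Console.WriteLine(IsValidEmail(new string('a', 2000) + "@example.com"));     // False: too long
    }
}
```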

Unicode/Special Characters

public static bool IsValidEmail(string input)
{
    var regex = new Regex(@"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$", RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
    return regex.IsMatch(input);
}

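RegexOptions.IgnoreCase makes matching case-insensitive, and RegexOptions.CultureInvariant makes those case comparisons independent of the current culture. Because the character classes already list both a-z and A-Z, the flags change little here, and they do not make the pattern accept non-ASCII letters: an address containing an accented letter such as é still fails to match.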
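
To make the difference concrete, the sketch below compares an ASCII-only class (with both flags set) against the Unicode category \p{L}, which matches a letter in any script. The class name UnicodeDemo and the sample strings are illustrative; whether you want to accept non-ASCII letters at all depends on your requirements.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnicodeDemo
{
    // ASCII letters only; the flags affect case handling, not which scripts are accepted.
    private static readonly Regex AsciiLetters = new Regex(
        @"^[a-zA-Z]+$",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    // \p{L} matches any Unicode letter.
    private static readonly Regex AnyLetters = new Regex(@"^\p{L}+$");

    public static bool IsAsciiWord(string input) => AsciiLetters.IsMatch(input);
    public static bool IsLetterWord(string input) => AnyLetters.IsMatch(input);

    public static void Main()
    {
        string name = "Jos\u00E9"; // "José" written with a precomposed é

        Console.WriteLine(IsAsciiWord(name));  // False
        Console.WriteLine(IsLetterWord(name)); // True
    }
}
```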

Common Mistakes

Mistake 1: Incorrect Pattern

// WRONG
var regex = new Regex(@"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+$");

// CORRECT
var regex = new Regex(@"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$");

The incorrect pattern omits the \.[a-zA-Z]{2,} part, so it accepts addresses without a top-level domain, such as test@example.

Mistake 2: Missing Escapes

// WRONG
var regex = new Regex(@"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+.[a-zA-Z]{2,}$");

// CORRECT
var regex = new Regex(@"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$");

The incorrect pattern leaves the dot before the top-level domain unescaped, so it matches any character instead of a literal period, and inputs such as test@example are accepted.

Mistake 3: Incorrect Input

// WRONG
Console.WriteLine(IsValidEmail("test@example")); // False

// CORRECT
Console.WriteLine(IsValidEmail("test@example.com")); // True

The first input lacks a top-level domain, so the pattern rejects it and IsValidEmail returns False.

Performance Tips

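  1. Reuse and compile regex patterns: Create the Regex once (for example, in a static readonly field) instead of on every call, and pass RegexOptions.Compiled for patterns on hot paths. Regex.CompileToAssembly is available only on .NET Framework; on .NET Core and .NET 5+ it throws PlatformNotSupportedException. On .NET 7 and later, the [GeneratedRegex] source generator compiles patterns at build time.
  2. Use RegexOptions deliberately: RegexOptions.Compiled trades slower construction for faster matching, and RegexOptions.NonBacktracking (.NET 7+) guarantees matching time linear in the input length for the patterns it supports. RegexOptions.IgnoreCase and RegexOptions.CultureInvariant change matching behavior; they are not performance optimizations.
  3. Avoid excessive backtracking: .NET does not support possessive quantifiers such as ++. Use atomic groups, written (?>...), instead, and pass a match timeout to the Regex constructor when matching untrusted input.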
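
A minimal sketch of two of these ideas in .NET: a Regex built once with RegexOptions.Compiled and reused, and an atomic group, (?>...), which is the .NET construct for preventing backtracking into a subexpression. The class name PerformanceDemo is illustrative. The atomic-group lines only demonstrate the semantics (the group never gives characters back); they are not a benchmark.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PerformanceDemo
{
    // Built once and reused; RegexOptions.Compiled costs more at construction
    // and is intended to pay off when the pattern is matched many times.
    private static readonly Regex EmailPattern = new Regex(
        @"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$",
        RegexOptions.Compiled);

    public static bool IsValidEmail(string input) =>
        !string.IsNullOrEmpty(input) && EmailPattern.IsMatch(input);

    public static void Main()
    {
        Console.WriteLine(IsValidEmail("test@example.com")); // True

        // Greedy a+ backtracks by one character so the final 'a' can match.
        Console.WriteLine(Regex.IsMatch("aaa", @"^a+a$"));     // True

        // The atomic group keeps all three characters, so the final 'a' cannot match.
        Console.WriteLine(Regex.IsMatch("aaa", @"^(?>a+)a$")); // False
    }
}
```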

FAQ

Q: What is the difference between Regex.IsMatch and Regex.Match?

A: Regex.IsMatch returns a boolean indicating whether the input string matches the regex pattern, while Regex.Match returns a Match object containing information about the match.
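
As an illustration, the sketch below uses Regex.Match with named groups to pull the two halves of an address apart. The group names user and domain, the class name MatchDemo, and the TrySplit helper are hypothetical additions to the article's pattern.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchDemo
{
    // The article's pattern with two named groups added for extraction.
    private static readonly Regex EmailParts = new Regex(
        @"^(?<user>[a-zA-Z0-9._%+-]+)@(?<domain>[a-zA-Z0-9.-]+\.[a-zA-Z]{2,})$");

    // Hypothetical helper: returns false and empty strings when the input does not match.
    public static bool TrySplit(string input, out string user, out string domain)
    {
        Match match = EmailParts.Match(input);
        user = match.Success ? match.Groups["user"].Value : string.Empty;
        domain = match.Success ? match.Groups["domain"].Value : string.Empty;
        return match.Success;
    }

    public static void Main()
    {
        if (TrySplit("test@example.com", out var user, out var domain))
        {
            Console.WriteLine(user);   // test
            Console.WriteLine(domain); // example.com
        }
    }
}
```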

Q: Can I use regex to validate passwords?

A: Yes, but be cautious of common pitfalls, such as using overly complex patterns or neglecting to handle edge cases.
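
One way to keep such a pattern readable is to express each rule as a separate lookahead. The policy below (at least 8 characters, with at least one lowercase letter, one uppercase letter, and one digit) is an assumption made up for this sketch, as is the class name PasswordDemo.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PasswordDemo
{
    // Each (?=...) lookahead checks one rule without consuming input;
    // .{8,} then enforces the minimum length.
    private static readonly Regex Policy = new Regex(
        @"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$");

    public static bool MeetsPolicy(string password) =>
        !string.IsNullOrEmpty(password) && Policy.IsMatch(password);

    public static void Main()
    {
        Console.WriteLine(MeetsPolicy("Passw0rd")); // True
        Console.WriteLine(MeetsPolicy("password")); // False: no uppercase letter or digit
        Console.WriteLine(MeetsPolicy("Sh0rt"));    // False: fewer than 8 characters
    }
}
```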

Q: How do I handle Unicode characters in regex?

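A: .NET regex is Unicode-aware by default: \w and \d match Unicode word characters and decimal digits, and \p{L} matches a letter in any script. RegexOptions.CultureInvariant only affects how case-insensitive comparisons are performed; it does not change which characters a class such as [a-zA-Z] matches.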

Q: Can I use regex to parse HTML?

A: Generally, no. Use a dedicated HTML parsing library instead.

Q: How do I optimize regex performance?

A: Follow the performance tips above: compile patterns you use often, choose RegexOptions deliberately, and avoid excessive backtracking.
