C# Regex Gotchas

This post walks through a few confusing aspects of using regular expressions in C#: double quotes in a pattern, capture and replace, and finally a real-world example of replacing a version number in an automated build process.

1. There are two ways to use regex for match, replace, etc. As far as I am concerned, both give the same results:

1.1. Using Static Regex Class

Match m = Regex.Match("abracadabra", "(a|b|r)+");
string newString = Regex.Replace("  abra  ", @"^\s*(.*?)\s*$", "$1");

1.2. Instantiated Regex Object

string text = "abracadabra1abracadabra2abracadabra3";
string pat = @"
    (		# start the first group
      abra	# match the literal 'abra'
      (		# start the second (inner) group
      cad	# match the literal 'cad'
      )?	# end the second (optional) group
    )		# end the first group
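    +		# match one or more occurrences
    ";
// use RegexOptions.IgnorePatternWhitespace (the 'x' modifier) to ignore whitespace and comments in the pattern
Regex r = new Regex(pat, RegexOptions.IgnorePatternWhitespace);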

Match m = r.Match(text);

Regex.Replace does not modify its input: strings are immutable, so the result must be assigned to a variable to get the replaced string.
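As a quick check of the two usages above, here is a minimal sketch that runs the same trim pattern through the static method and through an instance, and shows that the input is untouched unless the result is assigned. The class and method names (StaticVsInstanceDemo, TrimStatic, TrimInstance) are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StaticVsInstanceDemo
{
    // Instance form: the pattern is parsed once and the object is reused.
    private static readonly Regex TrimRegex = new Regex(@"^\s*(.*?)\s*$");

    // Static form: pattern and input are passed on every call.
    public static string TrimStatic(string input)
    {
        return Regex.Replace(input, @"^\s*(.*?)\s*$", "$1");
    }

    public static string TrimInstance(string input)
    {
        return TrimRegex.Replace(input, "$1");
    }

    public static void Main()
    {
        string original = "  abra  ";

        // The return value is discarded here, so 'original' is unchanged.
        Regex.Replace(original, @"^\s*(.*?)\s*$", "$1");

        Console.WriteLine("[" + original + "]");               // [  abra  ]
        Console.WriteLine("[" + TrimStatic(original) + "]");   // [abra]
        Console.WriteLine("[" + TrimInstance(original) + "]"); // [abra]
    }
}
```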

2. Double quote (") in pattern

There are two ways of writing a double quote in a pattern; the third way is a bit unusual but does no harm.

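// verbatim string literal: "" is one double quote
Regex expression1 = new Regex(@"""([\d\.]+)""");
// regular (escaped) string literal: \" is one double quote
Regex expression11 = new Regex("\"([\\d\\.]+)\"");
// basically the same as the first one: the regex engine sees \" and treats it as a literal quote
Regex expression111 = new Regex(@"\""([\d\.]+)\""");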

In a verbatim string literal (@"..."), a double quote is written as two double quotes (""); in a regular string literal it is escaped with a backslash (\").
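To see what the regex engine actually receives from each spelling, the following sketch prints every pattern via Regex.ToString() next to the text it captures. The class name, field names and sample input are assumptions made for the demo, and the second pattern is written with the same group and quantifier as the other two so that the three are comparable.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuotePatternDemo
{
    // Verbatim literal with doubled quotes.
    public static readonly Regex Verbatim = new Regex(@"""([\d\.]+)""");
    // Regular literal with backslash escapes.
    public static readonly Regex Escaped = new Regex("\"([\\d\\.]+)\"");
    // Verbatim literal that also passes a backslash through to the regex engine.
    public static readonly Regex VerbatimBackslash = new Regex(@"\""([\d\.]+)\""");

    public static void Main()
    {
        string input = "version = \"3.5.2000\"";

        foreach (Regex r in new[] { Verbatim, Escaped, VerbatimBackslash })
        {
            // ToString() returns the pattern text that was passed to the constructor.
            Console.WriteLine(r.ToString() + " -> " + r.Match(input).Groups[1].Value);
        }
    }
}
```

The first two lines of output show the identical pattern text; the third shows the extra backslashes, yet all three capture 3.5.2000.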

3. Capture and Replace

There are two kinds of group: the non-capturing group, written (?:pattern), and the capturing group, written with plain round brackets (pattern). Only capturing groups are numbered, and "$1" in the replacement string refers to the first of them. Concatenating "$1" with another string is tricky: when the other string starts with a digit, the digits become part of the group number, which then points to a non-existent capture, so the reference is emitted as literal text. For example, if you want to follow the capture with the string "123.456", "$1" + "123.456" is not going to work; all you get is the literal "$1123.456". To get around it, use "${1}" + "123.456".



// this gives you capture + "new version"; if the appended text started with a digit, use "${1}" instead of "$1"
str = expression1.Replace(str, "$1" + "new version");
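The digit problem is easy to reproduce. The sketch below uses a made-up pattern and input (the names GroupReferenceDemo, WithBareReference and WithBracedReference are hypothetical) and compares "$1" with "${1}" when the appended text is "123.456". It assumes the default options; with RegexOptions.ECMAScript the group number is resolved differently.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupReferenceDemo
{
    // Group 1 captures the prefix; the old number is matched but not captured.
    private static readonly Regex Prefix = new Regex(@"(Build )[\d\.]+");

    public static string WithBareReference(string input)
    {
        // The replacement string becomes "$1123.456": group 1123 does not exist.
        return Prefix.Replace(input, "$1" + "123.456");
    }

    public static string WithBracedReference(string input)
    {
        // The braces end the group number before the digits that follow.
        return Prefix.Replace(input, "${1}" + "123.456");
    }

    public static void Main()
    {
        string input = "Build 3.5.2000";
        Console.WriteLine(WithBareReference(input));   // $1123.456
        Console.WriteLine(WithBracedReference(input)); // Build 123.456
    }
}
```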


4. A useful example

When doing an automated release, you will typically have a build tool such as NAnt do the version replacement for you. The task is to find the old version number and replace it with the new one using regex.


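string setupFileContents = @"""ARPCOMMENTS"" = ""8:xxx Build 3.5.2000""";

// group 1 keeps everything up to and including "Build "; the old version is matched but not captured
Regex expression1 = new Regex(@"(""ARPCOMMENTS"" = ""8:.*Build )(?:[\d\.]+)");

// "${1}" rather than "$1": a real version number starts with a digit (see section 3)
setupFileContents = expression1.Replace(setupFileContents, "${1}" + "new version no");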
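If the build script also needs to read the current version, one possible variation is to capture the number in a second group and to build the replacement with a MatchEvaluator delegate instead of a replacement string, which sidesteps the "$" parsing altogether. This is only a sketch: the class VersionStampDemo, its methods and the version "3.5.2001" are invented for the example and are not part of NAnt.

```csharp
using System;
using System.Text.RegularExpressions;

public static class VersionStampDemo
{
    // Group 1 = text up to and including "Build ", group 2 = the current version.
    private static readonly Regex VersionLine =
        new Regex(@"(""ARPCOMMENTS"" = ""8:.*Build )([\d\.]+)");

    // Returns the version found in the contents, or null when there is no match.
    public static string ReadVersion(string contents)
    {
        Match m = VersionLine.Match(contents);
        return m.Success ? m.Groups[2].Value : null;
    }

    // Replaces the version; the delegate's result is inserted as-is,
    // so '$' and digits in newVersion are never parsed as group references.
    public static string WriteVersion(string contents, string newVersion)
    {
        return VersionLine.Replace(contents, m => m.Groups[1].Value + newVersion);
    }

    public static void Main()
    {
        string setupFileContents = @"""ARPCOMMENTS"" = ""8:xxx Build 3.5.2000""";

        Console.WriteLine(ReadVersion(setupFileContents));
        // 3.5.2000
        Console.WriteLine(WriteVersion(setupFileContents, "3.5.2001"));
        // "ARPCOMMENTS" = "8:xxx Build 3.5.2001"
    }
}
```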

Have fun, and I hope this clears up some confusion and makes you more confident when using regex.

This entry was posted on Tuesday, January 10th, 2012 at 1:24 am and is filed under General.