This is a quick article showing how to use a regular expression to get all matches of a particular pattern from a string.
Example of a string:
string strHTML = "<span>key1</span>value1" +
    "<span>key2</span>value2" +
    "<span>key3</span>value3";
To get the list of key-value pairs, we need to parse the string above and return a list of matches, where each match holds the key in Groups[1].Value and the value in Groups[2].Value. The first part of the pattern is clear:
<span>(.+?)</span>(.+?)
The tricky bit is defining where a match ends. Here I suggest using the lookahead operator (?=exp), which matches a position that is followed by exp without consuming exp. In our case we want to stop the match when we find another occurrence of the <span> tag or reach the end of the string ($):
MatchCollection mc = Regex.Matches(strHTML, @"<span>(.+?)</span>(.+?)(?=(<span>|$))");
foreach (Match m in mc)
    Response.Write(m.Groups[1].Value + " " + m.Groups[2].Value + "<br />");
Result:
key1 value1
key2 value2
key3 value3
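The same idea can be wrapped in a small helper that returns the pairs as a list instead of writing them to the response. This is only a minimal sketch: the class name SpanPairs and the method Parse are made up for illustration, and it assumes the input is a single line, because "." does not match a newline by default.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SpanPairs
{
    // Hypothetical helper (not part of any library): returns every
    // <span>key</span>value pair found in html, in order of appearance.
    public static List<KeyValuePair<string, string>> Parse(string html)
    {
        var pairs = new List<KeyValuePair<string, string>>();

        // (?=(<span>|$)) is a zero-width lookahead: it checks that the next
        // <span> (or the end of the string) follows, but does not consume it.
        MatchCollection mc = Regex.Matches(html, @"<span>(.+?)</span>(.+?)(?=(<span>|$))");

        foreach (Match m in mc)
        {
            // Groups[1] is the key, Groups[2] is the value.
            pairs.Add(new KeyValuePair<string, string>(m.Groups[1].Value, m.Groups[2].Value));
        }
        return pairs;
    }
}
```

Calling SpanPairs.Parse(strHTML) on the example string should give three pairs, which can then be read through pair.Key and pair.Value.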
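What happens if you leave out the (?=exp) operator and match the suffix directly with (<span>|$)? The first match then consumes the opening <span> of the second pair, so the search resumes after it. You lose the middle pair and the result is: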
key1 value1
key3 value3
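To see why the middle pair disappears, it helps to look at the full text of each match rather than only the groups. The sketch below is an illustration only: LookaheadDemo and MatchedText are hypothetical names, and the second pattern is my assumption of what "without the lookahead" means, namely the same suffix written as an ordinary consuming group.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Hypothetical helper: returns the complete text of every match,
    // joined with " | ", so the consumed characters become visible.
    public static string MatchedText(string html, string pattern)
    {
        var parts = new List<string>();
        foreach (Match m in Regex.Matches(html, pattern))
        {
            parts.Add(m.Value); // m.Value is everything the match consumed
        }
        return string.Join(" | ", parts);
    }
}
```

With the pattern @"<span>(.+?)</span>(.+?)(?=(<span>|$))" each match ends right after its value. With @"<span>(.+?)</span>(.+?)(<span>|$)" the first match also contains the trailing <span>, so the regex engine continues from "key2</span>..." and the next place the pattern can start is the third pair.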