[c#] c# regex matches example

I am trying to get values from the following text. How can this be done with Regex?

Input

Lorem ipsum dolor sit %download%#456 amet, consectetur adipiscing %download%#3434 elit. Duis non nunc nec mauris feugiat porttitor. Sed tincidunt blandit dui a viverra%download%#298. Aenean dapibus nisl %download%#893434 id nibh auctor vel tempor velit blandit.

Output

456  
3434  
298   
893434 

This question is related to c# regex

The answer is


All the other responses I see are fine, but C# has support for named groups!

I'd use the following code:

const string input = "Lorem ipsum dolor sit %download%#456 amet, consectetur adipiscing %download%#3434 elit. Duis non nunc nec mauris feugiat porttitor. Sed tincidunt blandit dui a viverra%download%#298. Aenean dapibus nisl %download%#893434 id nibh auctor vel tempor velit blandit.";

static void Main(string[] args)
{
    Regex expression = new Regex(@"%download%#(?<Identifier>[0-9]*)");
    var results = expression.Matches(input);
    foreach (Match match in results)
    {
        Console.WriteLine(match.Groups["Identifier"].Value);
    }
}

The code that reads: (?<Identifier>[0-9]*) specifies that [0-9]*'s results will be part of a named group that we index as above: match.Groups["Identifier"].Value


Regex regex = new Regex("%download#(\\d+?)%", RegexOptions.SingleLine);
Matches m = regex.Matches(input);

I think will do the trick (not tested).


    public void match2()
    {
        string input = "%download%#893434";
        Regex word = new Regex(@"\d+");
        Match m = word.Match(input);
        Console.WriteLine(m.Value);
    }

This pattern should work:

#\d

foreach(var match in System.Text.RegularExpressions.RegEx.Matches(input, "#\d"))
{
    Console.WriteLine(match.Value);
}

(I'm not in front of Visual Studio, but even if that doesn't compile as-is, it should be close enough to tweak into something that works).


It looks like most of post here described what you need here. However - something you might need more complex behavior - depending on what you're parsing. In your case it might be so that you won't need more complex parsing - but it depends what information you're extracting.

You can use regex groups as field name in class, after which could be written for example like this:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Text.RegularExpressions;

public class Info
{
    public String Identifier;
    public char nextChar;
};

class testRegex {

    const string input = "Lorem ipsum dolor sit %download%#456 amet, consectetur adipiscing %download%#3434 elit. " +
    "Duis non nunc nec mauris feugiat porttitor. Sed tincidunt blandit dui a viverra%download%#298. Aenean dapibus nisl %download%#893434 id nibh auctor vel tempor velit blandit.";

    static void Main(string[] args)
    {
        Regex regex = new Regex(@"%download%#(?<Identifier>[0-9]*)(?<nextChar>.)(?<thisCharIsNotNeeded>.)");
        List<Info> infos = new List<Info>();

        foreach (Match match in regex.Matches(input))
        {
            Info info = new Info();
            for( int i = 1; i < regex.GetGroupNames().Length; i++ )
            {
                String groupName = regex.GetGroupNames()[i];

                FieldInfo fi = info.GetType().GetField(regex.GetGroupNames()[i]);

                if( fi != null ) // Field is non-public or does not exists.
                    fi.SetValue( info, Convert.ChangeType( match.Groups[groupName].Value, fi.FieldType));
            }
            infos.Add(info);
        }

        foreach ( var info in infos )
        {
            Console.WriteLine(info.Identifier + " followed by '" + info.nextChar.ToString() + "'");
        }
    }

};

This mechanism uses C# reflection to set value to class. group name is matched against field name in class instance. Please note that Convert.ChangeType won't accept any kind of garbage.

If you want to add tracking of line / column - you can add extra Regex split for lines, but in order to keep for loop intact - all match patterns must have named groups. (Otherwise column index will be calculated incorrectly)

This will results in following output:

456 followed by ' '
3434 followed by ' '
298 followed by '.'
893434 followed by ' '