[regex] Regex for Comma delimited list

What is the regular expression to validate a comma delimited list like this one:

12365, 45236, 458, 1, 99996332, ......


The answer is


You might want to specify the language just to be safe, but

(\d+, ?)+(\d+)?

ought to work for finding a list inside a larger string. To validate a whole string, anchor it with ^ and $.
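
As a quick check of what this pattern accepts, here is a small JavaScript sketch. The ^ and $ anchors are an addition (not part of the answer's pattern) so that the whole string has to match, and the sample inputs are made up.

```javascript
// The answer's pattern, wrapped in ^...$ so the whole string must match.
const anchored = /^(\d+, ?)+(\d+)?$/;

console.log(anchored.test("12365, 45236, 458, 1, 99996332")); // true
console.log(anchored.test("1, 2,")); // true: a trailing comma is accepted
console.log(anchored.test("7"));     // false: at least one comma is required
console.log(anchored.test("1, a"));  // false
```

If a single number or a strict "no trailing comma" rule matters, one of the stricter patterns further down may be a better fit.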


This one will reject extraneous commas at the start or end of the line, if that's important to you.

^(possible|value|patterns)(, (possible|value|patterns))*$

Replace possible|value|patterns with a regex that matches your allowed values.
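
A sketch of how such a validator might be assembled in JavaScript so that the value pattern is written only once. The helper name buildListValidator and the sample values are hypothetical, and the ", " separator is assumed to be exactly a comma followed by one space.

```javascript
// Hypothetical helper: valuePattern is regex source for one allowed value.
function buildListValidator(valuePattern) {
  const item = `(?:${valuePattern})`;
  // One item, then zero or more ", item" pairs, anchored at both ends.
  return new RegExp(`^${item}(?:, ${item})*$`);
}

const colors = buildListValidator("red|green|blue");
console.log(colors.test("red, blue")); // true
console.log(colors.test(", red"));     // false: leading comma
console.log(colors.test("red, "));     // false: trailing comma
```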


This regex extracts an element from a comma separated list, regardless of contents:

(.+?)(?:,|$)

If you just replace the comma with something else, it should work for any delimiter.
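
A minimal JavaScript sketch of this extraction pattern, assuming an engine with String.prototype.matchAll (ES2020). The helper name splitList is hypothetical. Elements keep any whitespace that follows the comma, so trim them if needed.

```javascript
// Hypothetical helper: collect every element captured by (.+?)(?:,|$).
function splitList(text) {
  const pattern = /(.+?)(?:,|$)/g;
  // Group 1 holds the element without its trailing delimiter.
  return Array.from(text.matchAll(pattern), (m) => m[1]);
}

console.log(splitList("12365, 45236, 458"));
// ["12365", " 45236", " 458"]
```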


In JavaScript, use split to help out, and match negative numbers as well:

'-1,2,-3'.match(/(-?\d+)(,\s*-?\d+)*/)[0].split(',');
// ["-1", "2", "-3"]
// may need trimming if digits are space-separated
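
For space-separated input, the trimming mentioned in the comment could look like the sketch below. It also guards against match returning null when the string contains no number at all; the variable names are illustrative.

```javascript
const input = "-1, 2, -3";
const found = input.match(/(-?\d+)(,\s*-?\d+)*/);
// match returns null when nothing matches, so fall back to an empty list.
const numbers = found ? found[0].split(",").map((s) => Number(s.trim())) : [];

console.log(numbers); // [-1, 2, -3]
```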

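The following will match any comma-delimited word/digit/space combination. Note that it also matches any other string, including one with no commas at all, so it is not useful for validation: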

(((.)*,)*)(.)*

Match duplicate comma-delimited items:

(?<=,|^)([^,]*)(,\1)+(?=,|$)
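
A short JavaScript sketch of the duplicate finder, assuming an engine with lookbehind support (ES2018, for example current Node.js). It only detects duplicates that sit next to each other, and the helper name findAdjacentDuplicates is hypothetical.

```javascript
// Hypothetical helper: list the items that are immediately repeated.
function findAdjacentDuplicates(text) {
  const pattern = /(?<=,|^)([^,]*)(,\1)+(?=,|$)/g;
  // Group 1 is the repeated item itself.
  return Array.from(text.matchAll(pattern), (m) => m[1]);
}

console.log(findAdjacentDuplicates("x,x,x,y,z,z")); // ["x", "z"]
console.log(findAdjacentDuplicates("a,b,c"));       // []
```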


This regex can be used to split the values of a comma-delimited list. List elements may be quoted, unquoted or empty. Commas inside a pair of quotation marks are not matched.

,(?!(?<=(?:^|,)\s*"(?:[^"]|""|\\")*,)(?:[^"]|""|\\")*"\s*(?:,|$))



/^\d+(?:, ?\d+)*$/

I used this for a list of items that had to be alphanumeric, with no underscore at the front of any item.

^(([0-9a-zA-Z][0-9a-zA-Z_]*)([,][0-9a-zA-Z][0-9a-zA-Z_]*)*)$

It depends a bit on your exact requirements. I'm assuming: all numbers, any length; numbers cannot have leading zeros or contain commas or decimal points; individual numbers are always separated by a comma then a space; and the last number does NOT have a comma and space after it. Any of these being wrong would simplify the solution.

([1-9][0-9]*,[ ])*[1-9][0-9]*

Here's how I built that mentally:

[0-9]  any digit.
[1-9][0-9]*  leading non-zero digit followed by any number of digits
[1-9][0-9]*,  as above, followed by a comma
[1-9][0-9]*,[ ]  as above, followed by a space
([1-9][0-9]*,[ ])*  as above, repeated 0 or more times
([1-9][0-9]*,[ ])*[1-9][0-9]*  as above, with a final number that doesn't have a comma.
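
The same build-up can be sketched in JavaScript by composing the pieces as strings. The ^ and $ anchors are an assumption added here so the whole string is validated, and the variable names are illustrative.

```javascript
const number = "[1-9][0-9]*";          // one number without leading zeros
const numberThenSep = `${number},[ ]`; // a number followed by a comma and a space
// Zero or more "number, " groups, then a final number with no separator.
const numberList = new RegExp(`^(?:${numberThenSep})*${number}$`);

console.log(numberList.test("12365, 45236, 458")); // true
console.log(numberList.test("012, 3"));            // false: leading zero
console.log(numberList.test("1,2"));               // false: no space after the comma
console.log(numberList.test("1, 2, "));            // false: trailing separator
```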

I had a slightly different requirement, to parse an encoded dictionary/hashtable with escaped commas, like this:

"1=This is something, 2=This is something,,with an escaped comma, 3=This is something else"

I think this is an elegant solution, with a trick that avoids a lot of regex complexity:

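// Requires: using System.Collections.Generic; using System.Text.RegularExpressions;
// DecodeValues is an illustrative name; the original snippet was a bare method body.
static Dictionary<int, string> DecodeValues(string encodedValues)
{
    if (string.IsNullOrEmpty(encodedValues))
    {
        return null;
    }

    var retVal = new Dictionary<int, string>();
    // Group 1 is the numeric key; group 2 is the value, where ",," stands for a literal comma.
    var reFields = new Regex(@"([0-9]+)\=(([A-Za-z0-9\s]|(,,))+),");
    // Appending "," lets the last field end with a delimiter like all the others.
    foreach (Match match in reFields.Matches(encodedValues + ","))
    {
        var id = match.Groups[1].Value;
        var value = match.Groups[2].Value;
        // Undo the escaping: a doubled comma becomes a single one.
        retVal[int.Parse(id)] = value.Replace(",,", ",");
    }
    return retVal;
}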

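I think it can be adapted to the original question with an expression like @"([0-9]+),\s?", again matched against the input with a trailing comma appended, parsing Groups[1] of each match (Groups[0] would include the delimiter).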
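
A rough JavaScript rendering of that adaptation, as a sketch rather than the author's code. The helper name parseNumberList is hypothetical. It extracts numbers rather than validating the whole string, so stray non-numeric text is silently skipped.

```javascript
// Hypothetical helper: append a trailing comma so every number ends with a delimiter.
function parseNumberList(text) {
  const pattern = /([0-9]+),\s?/g;
  // Group 1 is the number without the comma that follows it.
  return Array.from((text + ",").matchAll(pattern), (m) => parseInt(m[1], 10));
}

console.log(parseNumberList("12365, 45236, 458")); // [12365, 45236, 458]
```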

I hope it's helpful to somebody and thanks for the tips on getting it close to there, especially Asaph!