Using LINQ on collections, what is the difference between the following lines of code?
if(!coll.Any(i => i.Value))
and
if(!coll.Exists(i => i.Value))
Update 1
When I disassemble .Exists
it looks like there is no code.
Update 2
Anyone know why there is no code there for this one?
This question is related to
c#
linq
collections
The difference is that Any is an extension method for any IEnumerable<T>
defined on System.Linq.Enumerable. It can be used on any IEnumerable<T>
instance.
Exists does not appear to be an extension method. My guess is that coll is of type List<T>
. If so Exists is an instance method which functions very similar to Any.
In short, the methods are essentially the same. One is more general than the other.
When you correct the measurements - as mentioned above: Any and Exists, and adding average - we'll get following output:
Executing search Exists() 1000 times ...
Average Exists(): 35566,023
Fastest Exists() execution: 32226
Executing search Any() 1000 times ...
Average Any(): 58852,435
Fastest Any() execution: 52269 ticks
Benchmark finished. Press any key.
As a continuation on Matas' answer on benchmarking.
TL/DR: Exists() and Any() are equally fast.
First off: Benchmarking using Stopwatch is not precise (see series0ne's answer on a different, but similiar, topic), but it is far more precise than DateTime.
The way to get really precise readings is by using Performance Profiling. But one way to get a sense of how the two methods' performance measure up to each other is by executing both methods loads of times and then comparing the fastest execution time of each. That way, it really doesn't matter that JITing and other noise gives us bad readings (and it does), because both executions are "equally misguiding" in a sense.
static void Main(string[] args)
{
Console.WriteLine("Generating list...");
List<string> list = GenerateTestList(1000000);
var s = string.Empty;
Stopwatch sw;
Stopwatch sw2;
List<long> existsTimes = new List<long>();
List<long> anyTimes = new List<long>();
Console.WriteLine("Executing...");
for (int j = 0; j < 1000; j++)
{
sw = Stopwatch.StartNew();
if (!list.Exists(o => o == "0123456789012"))
{
sw.Stop();
existsTimes.Add(sw.ElapsedTicks);
}
}
for (int j = 0; j < 1000; j++)
{
sw2 = Stopwatch.StartNew();
if (!list.Exists(o => o == "0123456789012"))
{
sw2.Stop();
anyTimes.Add(sw2.ElapsedTicks);
}
}
long existsFastest = existsTimes.Min();
long anyFastest = anyTimes.Min();
Console.WriteLine(string.Format("Fastest Exists() execution: {0} ticks\nFastest Any() execution: {1} ticks", existsFastest.ToString(), anyFastest.ToString()));
Console.WriteLine("Benchmark finished. Press any key.");
Console.ReadKey();
}
public static List<string> GenerateTestList(int count)
{
var list = new List<string>();
for (int i = 0; i < count; i++)
{
Random r = new Random();
int it = r.Next(0, 100);
list.Add(new string('s', it));
}
return list;
}
After executing the above code 4 times (which in turn do 1 000 Exists()
and Any()
on a list with 1 000 000 elements), it's not hard to see that the methods are pretty much equally fast.
Fastest Exists() execution: 57881 ticks
Fastest Any() execution: 58272 ticks
Fastest Exists() execution: 58133 ticks
Fastest Any() execution: 58063 ticks
Fastest Exists() execution: 58482 ticks
Fastest Any() execution: 58982 ticks
Fastest Exists() execution: 57121 ticks
Fastest Any() execution: 57317 ticks
There is a slight difference, but it's too small a difference to not be explained by background noise. My guess would be that if one would do 10 000 or 100 000 Exists()
and Any()
instead, that slight difference would disappear more or less.
Additionally, this will only work if Value is of type bool. Normally this is used with predicates. Any predicate would be generally used find whether there is any element satisfying a given condition. Here you're just doing a map from your element i to a bool property. It will search for an "i" whose Value property is true. Once done, the method will return true.
TLDR; Performance-wise Any
seems to be slower (if I have set this up properly to evaluate both values at almost same time)
var list1 = Generate(1000000);
var forceListEval = list1.SingleOrDefault(o => o == "0123456789012");
if (forceListEval != "sdsdf")
{
var s = string.Empty;
var start2 = DateTime.Now;
if (!list1.Exists(o => o == "0123456789012"))
{
var end2 = DateTime.Now;
s += " Exists: " + end2.Subtract(start2);
}
var start1 = DateTime.Now;
if (!list1.Any(o => o == "0123456789012"))
{
var end1 = DateTime.Now;
s +=" Any: " +end1.Subtract(start1);
}
if (!s.Contains("sdfsd"))
{
}
testing list generator:
private List<string> Generate(int count)
{
var list = new List<string>();
for (int i = 0; i < count; i++)
{
list.Add( new string(
Enumerable.Repeat("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 13)
.Select(s =>
{
var cryptoResult = new byte[4];
new RNGCryptoServiceProvider().GetBytes(cryptoResult);
return s[new Random(BitConverter.ToInt32(cryptoResult, 0)).Next(s.Length)];
})
.ToArray()));
}
return list;
}
With 10M records
" Any: 00:00:00.3770377 Exists: 00:00:00.2490249"
With 5M records
" Any: 00:00:00.0940094 Exists: 00:00:00.1420142"
With 1M records
" Any: 00:00:00.0180018 Exists: 00:00:00.0090009"
With 500k, (I also flipped around order in which they get evaluated to see if there is no additional operation associated with whichever runs first.)
" Exists: 00:00:00.0050005 Any: 00:00:00.0100010"
With 100k records
" Exists: 00:00:00.0010001 Any: 00:00:00.0020002"
It would seem Any
to be slower by magnitude of 2.
Edit: For 5 and 10M records I changed the way it generates the list and Exists
suddenly became slower than Any
which implies there's something wrong in the way I am testing.
New testing mechanism:
private static IEnumerable<string> Generate(int count)
{
var cripto = new RNGCryptoServiceProvider();
Func<string> getString = () => new string(
Enumerable.Repeat("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 13)
.Select(s =>
{
var cryptoResult = new byte[4];
cripto.GetBytes(cryptoResult);
return s[new Random(BitConverter.ToInt32(cryptoResult, 0)).Next(s.Length)];
})
.ToArray());
var list = new ConcurrentBag<string>();
var x = Parallel.For(0, count, o => list.Add(getString()));
return list;
}
private static void Test()
{
var list = Generate(10000000);
var list1 = list.ToList();
var forceListEval = list1.SingleOrDefault(o => o == "0123456789012");
if (forceListEval != "sdsdf")
{
var s = string.Empty;
var start1 = DateTime.Now;
if (!list1.Any(o => o == "0123456789012"))
{
var end1 = DateTime.Now;
s += " Any: " + end1.Subtract(start1);
}
var start2 = DateTime.Now;
if (!list1.Exists(o => o == "0123456789012"))
{
var end2 = DateTime.Now;
s += " Exists: " + end2.Subtract(start2);
}
if (!s.Contains("sdfsd"))
{
}
}
Edit2: Ok so to eliminate any influence from generating test data I wrote it all to file and now read it from there.
private static void Test()
{
var list1 = File.ReadAllLines("test.txt").Take(500000).ToList();
var forceListEval = list1.SingleOrDefault(o => o == "0123456789012");
if (forceListEval != "sdsdf")
{
var s = string.Empty;
var start1 = DateTime.Now;
if (!list1.Any(o => o == "0123456789012"))
{
var end1 = DateTime.Now;
s += " Any: " + end1.Subtract(start1);
}
var start2 = DateTime.Now;
if (!list1.Exists(o => o == "0123456789012"))
{
var end2 = DateTime.Now;
s += " Exists: " + end2.Subtract(start2);
}
if (!s.Contains("sdfsd"))
{
}
}
}
10M
" Any: 00:00:00.1640164 Exists: 00:00:00.0750075"
5M
" Any: 00:00:00.0810081 Exists: 00:00:00.0360036"
1M
" Any: 00:00:00.0190019 Exists: 00:00:00.0070007"
500k
" Any: 00:00:00.0120012 Exists: 00:00:00.0040004"
Source: Stackoverflow.com