[c#] .NET unique object identifier

The information I give here is not new, I just added this for completeness.

The idea of this code is quite simple:

  • Objects need a unique ID, which isn't there by default. Instead, we have to rely on the next best thing, which is RuntimeHelpers.GetHashCode to get us a sort-of unique ID
  • To check uniqueness, this implies we need to use object.ReferenceEquals
  • However, we would still like to have a unique ID, so I added a GUID, which is by definition unique.
  • Because I don't like locking everything if I don't have to, I don't use ConditionalWeakTable.

Combined, that will give you the following code:

public class UniqueIdMapper
{
    private class ObjectEqualityComparer : IEqualityComparer<object>
    {
        public bool Equals(object x, object y)
        {
            return object.ReferenceEquals(x, y);
        }

        public int GetHashCode(object obj)
        {
            return RuntimeHelpers.GetHashCode(obj);
        }
    }

    private Dictionary<object, Guid> dict = new Dictionary<object, Guid>(new ObjectEqualityComparer());
    public Guid GetUniqueId(object o)
    {
        Guid id;
        if (!dict.TryGetValue(o, out id))
        {
            id = Guid.NewGuid();
            dict.Add(o, id);
        }
        return id;
    }
}

To use it, create an instance of the UniqueIdMapper and use the GUID's it returns for the objects.


Addendum

So, there's a bit more going on here; let me write a bit down about ConditionalWeakTable.

ConditionalWeakTable does a couple of things. The most important thing is that it doens't care about the garbage collector, that is: the objects that you reference in this table will be collected regardless. If you lookup an object, it basically works the same as the dictionary above.

Curious no? After all, when an object is being collected by the GC, it checks if there are references to the object, and if there are, it collects them. So if there's an object from the ConditionalWeakTable, why will the referenced object be collected then?

ConditionalWeakTable uses a small trick, which some other .NET structures also use: instead of storing a reference to the object, it actually stores an IntPtr. Because that's not a real reference, the object can be collected.

So, at this point there are 2 problems to address. First, objects can be moved on the heap, so what will we use as IntPtr? And second, how do we know that objects have an active reference?

  • The object can be pinned on the heap, and its real pointer can be stored. When the GC hits the object for removal, it unpins it and collects it. However, that would mean we get a pinned resource, which isn't a good idea if you have a lot of objects (due to memory fragmentation issues). This is probably not how it works.
  • When the GC moves an object, it calls back, which can then update the references. This might be how it's implemented judging by the external calls in DependentHandle - but I believe it's slightly more sophisticated.
  • Not the pointer to the object itself, but a pointer in the list of all objects from the GC is stored. The IntPtr is either an index or a pointer in this list. The list only changes when an object changes generations, at which point a simple callback can update the pointers. If you remember how Mark & Sweep works, this makes more sense. There's no pinning, and removal is as it was before. I believe this is how it works in DependentHandle.

This last solution does require that the runtime doesn't re-use the list buckets until they are explicitly freed, and it also requires that all objects are retrieved by a call to the runtime.

If we assume they use this solution, we can also address the second problem. The Mark & Sweep algorithm keeps track of which objects have been collected; as soon as it has been collected, we know at this point. Once the object checks if the object is there, it calls 'Free', which removes the pointer and the list entry. The object is really gone.

One important thing to note at this point is that things go horribly wrong if ConditionalWeakTable is updated in multiple threads and if it isn't thread safe. The result would be a memory leak. This is why all calls in ConditionalWeakTable do a simple 'lock' which ensures this doesn't happen.

Another thing to note is that cleaning up entries has to happen once in a while. While the actual objects will be cleaned up by the GC, the entries are not. This is why ConditionalWeakTable only grows in size. Once it hits a certain limit (determined by collision chance in the hash), it triggers a Resize, which checks if objects have to be cleaned up -- if they do, free is called in the GC process, removing the IntPtr handle.

I believe this is also why DependentHandle is not exposed directly - you don't want to mess with things and get a memory leak as a result. The next best thing for that is a WeakReference (which also stores an IntPtr instead of an object) - but unfortunately doesn't include the 'dependency' aspect.

What remains is for you to toy around with the mechanics, so that you can see the dependency in action. Be sure to start it multiple times and watch the results:

class DependentObject
{
    public class MyKey : IDisposable
    {
        public MyKey(bool iskey)
        {
            this.iskey = iskey;
        }

        private bool disposed = false;
        private bool iskey;

        public void Dispose()
        {
            if (!disposed)
            {
                disposed = true;
                Console.WriteLine("Cleanup {0}", iskey);
            }
        }

        ~MyKey()
        {
            Dispose();
        }
    }

    static void Main(string[] args)
    {
        var dep = new MyKey(true); // also try passing this to cwt.Add

        ConditionalWeakTable<MyKey, MyKey> cwt = new ConditionalWeakTable<MyKey, MyKey>();
        cwt.Add(new MyKey(true), dep); // try doing this 5 times f.ex.

        GC.Collect(GC.MaxGeneration);
        GC.WaitForFullGCComplete();

        Console.WriteLine("Wait");
        Console.ReadLine(); // Put a breakpoint here and inspect cwt to see that the IntPtr is still there
    }

Examples related to c#

How can I convert this one line of ActionScript to C#? Microsoft Advertising SDK doesn't deliverer ads How to use a global array in C#? How to correctly write async method? C# - insert values from file into two arrays Uploading into folder in FTP? Are these methods thread safe? dotnet ef not found in .NET Core 3 HTTP Error 500.30 - ANCM In-Process Start Failure Best way to "push" into C# array

Examples related to unique

Count unique values with pandas per groups Find the unique values in a column and then sort them How can I check if the array of objects have duplicate property values? Firebase: how to generate a unique numeric ID for key? pandas unique values multiple columns Select unique values with 'select' function in 'dplyr' library Generate 'n' unique random numbers within a range SQL - select distinct only on one column Can I use VARCHAR as the PRIMARY KEY? Count unique values in a column in Excel

Examples related to hashcode

Hashing with SHA1 Algorithm in C# HashMaps and Null values? How to create a HashMap with two keys (Key-Pair, Value)? What is hashCode used for? Is it unique? How does a Java HashMap handle different objects with the same hash code? Hashcode and Equals for Hashset What is the use of hashCode in Java? Good Hash Function for Strings Why do I need to override the equals and hashCode methods in Java? Memory address of variables in Java

Examples related to gethashcode

Correct way to override Equals() and GetHashCode() .NET unique object identifier What is the best algorithm for overriding GetHashCode?