[c#] .NET unique object identifier

Is there a way of getting a unique identifier of an instance?

GetHashCode() is the same for the two references pointing to the same instance. However, two different instances can (quite easily) get the same hash code:

Hashtable hashCodesSeen = new Hashtable();
LinkedList<object> l = new LinkedList<object>();
int n = 0;
while (true)
{
    object o = new object();
    // Remember objects so that they don't get collected.
    // This does not make any difference though :(
    l.AddFirst(o);
    int hashCode = o.GetHashCode();
    n++;
    if (hashCodesSeen.ContainsKey(hashCode))
    {
        // Same hashCode seen twice for DIFFERENT objects (n is as low as 5322).
        Console.WriteLine("Hashcode seen twice: " + n + " (" + hashCode + ")");
        break;
    }
    hashCodesSeen.Add(hashCode, null);
}

I'm writing a debugging addin, and I need to get some kind of ID for a reference which is unique during the run of the program.

I already managed to get internal ADDRESS of the instance, which is unique until the garbage collector (GC) compacts the heap (= moves the objects = changes the addresses).

Stack Overflow question Default implementation for Object.GetHashCode() might be related.

The objects are not under my control as I am accessing objects in a program being debugged using the debugger API. If I was in control of the objects, adding my own unique identifiers would be trivial.

I wanted the unique ID for building a hashtable ID -> object, to be able to lookup already seen objects. For now I solved it like this:

Build a hashtable: 'hashCode' -> (list of objects with hash code == 'hashCode')
Find if object seen(o) {
    candidates = hashtable[o.GetHashCode()] // Objects with the same hashCode.
    If no candidates, the object is new
    If some candidates, compare their addresses to o.Address
        If no address is equal (the hash code was just a coincidence) -> o is new
        If some address equal, o already seen
}
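The lookup described above could be sketched roughly as follows. This is only an illustration: SeenObjects and CheckAndAdd are made-up names, and object.ReferenceEquals stands in for the address comparison, because the debugger API that supplies o.Address is not shown here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: remembers objects in buckets keyed by hash code.
public sealed class SeenObjects
{
    private readonly Dictionary<int, List<object>> buckets =
        new Dictionary<int, List<object>>();

    // Returns true if o was recorded before; otherwise records it and returns false.
    public bool CheckAndAdd(object o)
    {
        int hashCode = o.GetHashCode();
        List<object> candidates;
        if (!buckets.TryGetValue(hashCode, out candidates))
        {
            // No object with this hash code yet, so o must be new.
            candidates = new List<object>();
            buckets.Add(hashCode, candidates);
        }
        foreach (object candidate in candidates)
        {
            // Assumption: a reference comparison replaces the address comparison.
            if (object.ReferenceEquals(candidate, o)) return true;
        }
        // Same hash code but a different instance (or an empty bucket): o is new.
        candidates.Add(o);
        return false;
    }
}

public static class SeenObjectsDemo
{
    public static void Main()
    {
        var seen = new SeenObjects();
        string first = new string('x', 3);
        string second = new string('x', 3); // equal content and hash code, different instance

        Console.WriteLine(seen.CheckAndAdd(first));  // False
        Console.WriteLine(seen.CheckAndAdd(second)); // False
        Console.WriteLine(seen.CheckAndAdd(first));  // True
    }
}
```

Note that this sketch keeps strong references to every object it has seen, so nothing it records is ever collected.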


The answer is


The reference is the unique identifier for the object. I don't know of any way of converting this into anything like a string etc. The value of the reference will change during compaction (as you've seen), but every previous value A will be changed to value B, so as far as safe code is concerned it's still a unique ID.

If the objects involved are under your control, you could create a mapping using weak references (to avoid preventing garbage collection) from a reference to an ID of your choosing (GUID, integer, whatever). That would add a certain amount of overhead and complexity, however.


Have you checked out the ObjectIDGenerator class? It does what you're attempting to do, and what Marc Gravell describes.

The ObjectIDGenerator keeps track of previously identified objects. When you ask for the ID of an object, the ObjectIDGenerator knows whether to return the existing ID, or generate and remember a new ID.

The IDs are unique for the life of the ObjectIDGenerator instance. Generally, the life of an ObjectIDGenerator lasts as long as the Formatter that created it. Object IDs have meaning only within a given serialized stream, and are used for tracking which objects have references to others within the serialized object graph.

Using a hash table, the ObjectIDGenerator retains which ID is assigned to which object. The object references, which uniquely identify each object, are addresses in the runtime garbage-collected heap. Object reference values can change during serialization, but the table is updated automatically so the information is correct.

Object IDs are 64-bit numbers. Allocation starts from one, so zero is never a valid object ID. A formatter can choose a zero value to represent an object reference whose value is a null reference (Nothing in Visual Basic).
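A minimal sketch of how the class might be used outside of serialization is shown below. It relies only on the GetId and HasId methods; the class and variable names are made up for illustration. As far as I know, recent .NET versions flag this type as obsolete together with the rest of formatter-based serialization, so the pragma is there on that assumption.

```csharp
using System;
using System.Runtime.Serialization;

public static class ObjectIdGeneratorDemo
{
    public static void Main()
    {
        // Assumption: newer .NET versions report the type as obsolete (SYSLIB0050).
#pragma warning disable SYSLIB0050
        var generator = new ObjectIDGenerator();
#pragma warning restore SYSLIB0050

        var a = new object();
        var b = new object();
        bool firstTime;

        long idA = generator.GetId(a, out firstTime);      // firstTime is true: a is new
        long idB = generator.GetId(b, out firstTime);      // firstTime is true: b is new
        long idAAgain = generator.GetId(a, out firstTime); // firstTime is false: a is known

        Console.WriteLine(idA == idAAgain); // True
        Console.WriteLine(idA != idB);      // True

        // HasId looks up an ID without assigning one; it returns 0 for unknown objects.
        long unknown = generator.HasId(new object(), out firstTime);
        Console.WriteLine(unknown);         // 0
    }
}
```

The generator holds on to every object it has handed an ID to, so it suits short-lived bookkeeping better than process-wide tracking.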


The information I give here is not new; I just added this for completeness.

The idea of this code is quite simple:

  • Objects need a unique ID, which isn't there by default. Instead, we have to rely on the next best thing, which is RuntimeHelpers.GetHashCode to get us a sort-of unique ID
  • To check uniqueness, this implies we need to use object.ReferenceEquals
  • However, we would still like to have a unique ID, so I added a GUID, which is by definition unique.
  • Because I don't like locking everything if I don't have to, I don't use ConditionalWeakTable.

Combined, that will give you the following code:

using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

public class UniqueIdMapper
{
    private class ObjectEqualityComparer : IEqualityComparer<object>
    {
        public bool Equals(object x, object y)
        {
            return object.ReferenceEquals(x, y);
        }

        public int GetHashCode(object obj)
        {
            return RuntimeHelpers.GetHashCode(obj);
        }
    }

    private Dictionary<object, Guid> dict = new Dictionary<object, Guid>(new ObjectEqualityComparer());
    public Guid GetUniqueId(object o)
    {
        Guid id;
        if (!dict.TryGetValue(o, out id))
        {
            id = Guid.NewGuid();
            dict.Add(o, id);
        }
        return id;
    }
}

To use it, create an instance of the UniqueIdMapper and use the GUIDs it returns for the objects.


Addendum

So, there's a bit more going on here; let me write a bit down about ConditionalWeakTable.

ConditionalWeakTable does a couple of things. The most important one is that it doesn't interfere with the garbage collector: the objects that you reference in this table will be collected regardless. If you look up an object, it basically works the same as the dictionary above.

Curious, no? After all, when the GC considers collecting an object, it checks whether there are references to it, and if there are, it does not collect it. So if an object is referenced from the ConditionalWeakTable, why is it collected anyway?

ConditionalWeakTable uses a small trick, which some other .NET structures also use: instead of storing a reference to the object, it actually stores an IntPtr. Because that's not a real reference, the object can be collected.

So, at this point there are 2 problems to address. First, objects can be moved on the heap, so what will we use as IntPtr? And second, how do we know that objects have an active reference?

  • The object can be pinned on the heap, and its real pointer can be stored. When the GC hits the object for removal, it unpins it and collects it. However, that would mean we get a pinned resource, which isn't a good idea if you have a lot of objects (due to memory fragmentation issues). This is probably not how it works.
  • When the GC moves an object, it calls back, which can then update the references. This might be how it's implemented judging by the external calls in DependentHandle - but I believe it's slightly more sophisticated.
  • Not the pointer to the object itself, but a pointer in the list of all objects from the GC is stored. The IntPtr is either an index or a pointer in this list. The list only changes when an object changes generations, at which point a simple callback can update the pointers. If you remember how Mark & Sweep works, this makes more sense. There's no pinning, and removal is as it was before. I believe this is how it works in DependentHandle.

This last solution does require that the runtime doesn't re-use the list buckets until they are explicitly freed, and it also requires that all objects are retrieved by a call to the runtime.

If we assume they use this solution, we can also address the second problem. The Mark & Sweep algorithm keeps track of which objects have been collected, so as soon as an object has been collected, we know about it. Once the table checks whether the object is still there and finds that it is gone, it calls 'Free', which removes the pointer and the list entry. The object is really gone.

One important thing to note at this point is that things go horribly wrong if ConditionalWeakTable is updated in multiple threads and if it isn't thread safe. The result would be a memory leak. This is why all calls in ConditionalWeakTable do a simple 'lock' which ensures this doesn't happen.

Another thing to note is that cleaning up entries has to happen once in a while. While the actual objects will be cleaned up by the GC, the entries are not. This is why ConditionalWeakTable only grows in size. Once it hits a certain limit (determined by collision chance in the hash), it triggers a Resize, which checks if objects have to be cleaned up -- if they do, free is called in the GC process, removing the IntPtr handle.

I believe this is also why DependentHandle is not exposed directly - you don't want to mess with things and get a memory leak as a result. The next best thing for that is a WeakReference (which also stores an IntPtr instead of an object) - but unfortunately it doesn't include the 'dependency' aspect.

What remains is for you to toy around with the mechanics, so that you can see the dependency in action. Be sure to start it multiple times and watch the results:

class DependentObject
{
    public class MyKey : IDisposable
    {
        public MyKey(bool iskey)
        {
            this.iskey = iskey;
        }

        private bool disposed = false;
        private bool iskey;

        public void Dispose()
        {
            if (!disposed)
            {
                disposed = true;
                Console.WriteLine("Cleanup {0}", iskey);
            }
        }

        ~MyKey()
        {
            Dispose();
        }
    }

    static void Main(string[] args)
    {
        var dep = new MyKey(true); // also try passing this to cwt.Add

        ConditionalWeakTable<MyKey, MyKey> cwt = new ConditionalWeakTable<MyKey, MyKey>();
        cwt.Add(new MyKey(true), dep); // try doing this 5 times f.ex.

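        GC.Collect(GC.MaxGeneration);
        // WaitForFullGCComplete is meant for use with RegisterForFullGCNotification;
        // waiting for pending finalizers is what lets the "Cleanup" messages appear.
        GC.WaitForPendingFinalizers();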

        Console.WriteLine("Wait");
        Console.ReadLine(); // Put a breakpoint here and inspect cwt to see that the IntPtr is still there
    }
}


If you are writing a module in your own code for a specific usage, majkinetor's method MIGHT have worked. But there are some problems.

First, the official documentation does NOT guarantee that GetHashCode() returns a unique identifier (see Object.GetHashCode Method ()):

You should not assume that equal hash codes imply object equality.

Second, even if you have so few objects that GetHashCode() works in most cases, this method can be overridden by some types.
For example, suppose you are using some class C that overrides GetHashCode() to always return 0. Then every object of C will get the same hash code. Unfortunately, Dictionary, Hashtable and some other associative containers make use of this method:

A hash code is a numeric value that is used to insert and identify an object in a hash-based collection such as the Dictionary<TKey, TValue> class, the Hashtable class, or a type derived from the DictionaryBase class. The GetHashCode method provides this hash code for algorithms that need quick checks of object equality.

So, this approach has great limitations.

And even more, what if you want to build a general purpose library? Not only are you not able to modify the source code of the used classes, but their behavior is also unpredictable.

I appreciate that Jon and Simon have posted their answers, and I will post a code example and a suggestion on performance below.

using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;
using System.Runtime.Serialization;
using System.Collections.Generic;


namespace ObjectSet
{
    public interface IObjectSet
    {
        /// <summary> Checks whether an object is in the set. </summary>
        /// <returns> true if the object exists, false otherwise. </returns>
        bool IsExist(object obj);

        /// <summary> Adds the object if it is not in the set yet; otherwise does nothing. </summary>
        /// <returns> true if successfully added, false otherwise. </returns>
        bool Add(object obj);
    }

    public sealed class ObjectSetUsingConditionalWeakTable : IObjectSet
    {
        /// <summary> unit test on object set. </summary>
        internal static void Main() {
            Stopwatch sw = new Stopwatch();
            sw.Start();
            ObjectSetUsingConditionalWeakTable objSet = new ObjectSetUsingConditionalWeakTable();
            for (int i = 0; i < 10000000; ++i) {
                object obj = new object();
                if (objSet.IsExist(obj)) { Console.WriteLine("bug!!!"); }
                if (!objSet.Add(obj)) { Console.WriteLine("bug!!!"); }
                if (!objSet.IsExist(obj)) { Console.WriteLine("bug!!!"); }
            }
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);
        }


        public bool IsExist(object obj) {
            return objectSet.TryGetValue(obj, out tryGetValue_out0);
        }

        public bool Add(object obj) {
            if (IsExist(obj)) {
                return false;
            } else {
                objectSet.Add(obj, null);
                return true;
            }
        }

        /// <summary> internal representation of the set. (only use the key) </summary>
        private ConditionalWeakTable<object, object> objectSet = new ConditionalWeakTable<object, object>();

        /// <summary> used to fill the out parameter of ConditionalWeakTable.TryGetValue(). </summary>
        private static object tryGetValue_out0 = null;
    }

    [Obsolete("It will crash if there are too many objects, and ObjectSetUsingConditionalWeakTable gets better performance.")]
    public sealed class ObjectSetUsingObjectIDGenerator : IObjectSet
    {
        /// <summary> unit test on object set. </summary>
        internal static void Main() {
            Stopwatch sw = new Stopwatch();
            sw.Start();
            ObjectSetUsingObjectIDGenerator objSet = new ObjectSetUsingObjectIDGenerator();
            for (int i = 0; i < 10000000; ++i) {
                object obj = new object();
                if (objSet.IsExist(obj)) { Console.WriteLine("bug!!!"); }
                if (!objSet.Add(obj)) { Console.WriteLine("bug!!!"); }
                if (!objSet.IsExist(obj)) { Console.WriteLine("bug!!!"); }
            }
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);
        }


        public bool IsExist(object obj) {
            bool firstTime;
            idGenerator.HasId(obj, out firstTime);
            return !firstTime;
        }

        public bool Add(object obj) {
            bool firstTime;
            idGenerator.GetId(obj, out firstTime);
            return firstTime;
        }


        /// <summary> internal representation of the set. </summary>
        private ObjectIDGenerator idGenerator = new ObjectIDGenerator();
    }
}

In my test, the ObjectIDGenerator throws an exception complaining that there are too many objects when creating 10,000,000 objects (10x more than in the code above) in the for loop.

Also, the benchmark result is that the ConditionalWeakTable implementation is 1.8x faster than the ObjectIDGenerator implementation.


I know that this has been answered, but it's at least useful to note that you can use:

http://msdn.microsoft.com/en-us/library/system.object.referenceequals.aspx

Which will not give you a "unique id" directly, but combined with WeakReferences (and a hashset?) could give you a pretty easy way of tracking various instances.
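One way this combination might look is sketched below. InstanceTracker and CheckAndTrack are made-up names, and a plain list is used instead of a hash set to keep the example short, which makes each lookup a linear scan.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical tracker: remembers instances without keeping them alive.
public sealed class InstanceTracker
{
    private readonly List<WeakReference<object>> seen =
        new List<WeakReference<object>>();

    // Returns true if this exact instance was tracked before; otherwise tracks it.
    public bool CheckAndTrack(object o)
    {
        // Drop entries whose targets have already been collected.
        seen.RemoveAll(weak => { object t; return !weak.TryGetTarget(out t); });

        foreach (WeakReference<object> weak in seen)
        {
            object target;
            if (weak.TryGetTarget(out target) && object.ReferenceEquals(target, o))
            {
                return true;
            }
        }
        seen.Add(new WeakReference<object>(o));
        return false;
    }
}

public static class InstanceTrackerDemo
{
    public static void Main()
    {
        var tracker = new InstanceTracker();
        string first = new string('x', 3);
        string second = new string('x', 3); // equal content, different instance

        Console.WriteLine(tracker.CheckAndTrack(first));  // False
        Console.WriteLine(tracker.CheckAndTrack(second)); // False
        Console.WriteLine(tracker.CheckAndTrack(first));  // True

        GC.KeepAlive(first);
        GC.KeepAlive(second);
    }
}
```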


You would have to assign such an identifier yourself, manually - either inside the instance, or externally.

For records related to a database, the primary key may be useful (but you can still get duplicates). Alternatively, either use a Guid, or keep your own counter, allocating using Interlocked.Increment (and make it large enough that it isn't likely to overflow).
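The counter variant could look something like the sketch below, assuming you control the class. IdSource and Tracked are hypothetical names; a 64-bit counter is used so that overflow is not a practical concern.

```csharp
using System;
using System.Threading;

// Hypothetical process-wide ID source backed by an atomic counter.
public static class IdSource
{
    private static long lastId = 0;

    // Thread-safe: every call returns a different value.
    public static long Next()
    {
        return Interlocked.Increment(ref lastId);
    }
}

// Hypothetical class that assigns itself an ID on construction.
public sealed class Tracked
{
    private readonly long id = IdSource.Next();

    public long Id { get { return id; } }
}

public static class IdSourceDemo
{
    public static void Main()
    {
        var a = new Tracked();
        var b = new Tracked();
        Console.WriteLine(a.Id != b.Id); // True
        Console.WriteLine(a.Id == a.Id); // True: the ID never changes for an instance
    }
}
```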


RuntimeHelpers.GetHashCode() may help (MSDN).
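As a small illustration of what this buys you: RuntimeHelpers.GetHashCode ignores any GetHashCode override on the type, so it stays usable even for badly behaved classes. The AlwaysZero class is invented for the example. The value is stable for a given instance, but it is still a hash code, so two different instances can share it.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical class with a useless GetHashCode override.
public sealed class AlwaysZero
{
    public override int GetHashCode()
    {
        return 0;
    }
}

public static class RuntimeHashDemo
{
    public static void Main()
    {
        var a = new AlwaysZero();

        // The override is used by the normal call.
        Console.WriteLine(a.GetHashCode()); // 0

        // The runtime-provided hash code bypasses the override and does not
        // change over the lifetime of the instance.
        int first = RuntimeHelpers.GetHashCode(a);
        int second = RuntimeHelpers.GetHashCode(a);
        Console.WriteLine(first == second); // True
    }
}
```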


How about this method:

Set a field in the first object to a new value. If the same field in the second object has the same value, it's probably the same instance. Otherwise, exit as different.

Now set the field in the first object to a different new value. If the same field in the second object has changed to the different value, it's definitely the same instance.

Don't forget to set the field in the first object back to its original value on exit.

Problems?


You can develop your own thing in a second. For instance:

    class Program
    {
        static void Main(string[] args)
        {
            var a = new object();
            var b = new object();
            // Prints the two IDs assigned to a and b.
            Console.WriteLine("{0} {1}", a.GetId(), b.GetId());
        }
    }

    public static class MyExtensions
    {
        //this dictionary should use weak key references
        static Dictionary<object, int> d = new Dictionary<object,int>();
        static int gid = 0;

        public static int GetId(this object o)
        {
            if (d.ContainsKey(o)) return d[o];
            return d[o] = gid++;
        }
    }   

You can choose whatever you like as the unique ID, for instance System.Guid.NewGuid(), or simply an integer for the fastest access.


It is possible to make a unique object identifier in Visual Studio: In the watch window, right-click the object variable and choose Make Object ID from the context menu.

Unfortunately, this is a manual step, and I don't believe the identifier can be accessed via code.


.NET 4 and later only

Good news, everyone!

The perfect tool for this job is built into .NET 4 and it's called ConditionalWeakTable<TKey, TValue>. This class:

  • can be used to associate arbitrary data with managed object instances much like a dictionary (although it is not a dictionary)
  • does not depend on memory addresses, so is immune to the GC compacting the heap
  • does not keep objects alive just because they have been entered as keys into the table, so it can be used without making every object in your process live forever
  • uses reference equality to determine object identity; moreover, class authors cannot modify this behavior so it can be used consistently on objects of any type
  • can be populated on the fly, so does not require that you inject code inside object constructors
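A minimal sketch of how the table could be used to hand out IDs is given below. ObjectIds, IdHolder and GetId are invented names, and the numeric counter is just one possible choice of ID. The values in a ConditionalWeakTable must be reference types, which is why the ID is wrapped in a small holder class.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical helper that attaches a numeric ID to arbitrary objects.
public static class ObjectIds
{
    // Wrapper needed because table values must be reference types.
    private sealed class IdHolder
    {
        public readonly long Id;
        public IdHolder(long id) { Id = id; }
    }

    private static readonly ConditionalWeakTable<object, IdHolder> table =
        new ConditionalWeakTable<object, IdHolder>();

    private static long lastId = 0;

    // Returns the ID already attached to o, or attaches and returns a new one.
    public static long GetId(object o)
    {
        return table.GetValue(o, key => new IdHolder(Interlocked.Increment(ref lastId))).Id;
    }
}

public static class ObjectIdsDemo
{
    public static void Main()
    {
        string first = new string('x', 3);
        string second = new string('x', 3); // equal content, different instance

        Console.WriteLine(ObjectIds.GetId(first) == ObjectIds.GetId(first));  // True
        Console.WriteLine(ObjectIds.GetId(first) == ObjectIds.GetId(second)); // False
    }
}
```

Under contention the callback may run more than once for the same key, so IDs can have gaps, but each object still ends up with exactly one ID.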
