[algorithm] How can I pair socks from a pile efficiently?

We can use hashing to our leverage if you can abstract a single pair of socks as the key itself and its other pair as value.

  1. Make two imaginary sections behind you on the floor, one for you and another for your spouse.

  2. Take one from the pile of socks.

  3. Now place the socks on the floor, one by one, following the below rule.

    • Identify the socks as yours or hers and look at the relevant section on the floor.

    • If you can spot the pair on the floor pick it up and knot them up or clip them up or do whatever you would do after you find a pair and place it in a basket (remove it from the floor).

    • Place it in the relevant section.

  4. Repeat 3 until all socks are over from the pile.


Explanation:

Hashing and Abstraction

Abstraction is a very powerful concept that has been used to improve user experience (UX). Examples of abstraction in real-life interactions with computers include:

  • Folder icons used for navigation in a GUI (graphical user interface) to access an address instead of typing the actual address to navigate to a location.
  • GUI sliders used to control various levels of volume, document scroll position, etc..

Hashing or other not-in-place solutions are not an option because I am not able to duplicate my socks (though it could be nice if I could).

I believe the asker was thinking of applying hashing such that the slot to which either pair of sock goes should be known before placing them.

That's why I suggested abstracting a single sock that is placed on the floor as the hash key itself (hence there is no need to duplicate the socks).

How to define our hash key?

The below definition for our key would also work if there are more than one pair of similar socks. That is, let's say there are two pairs of black men socks PairA and PairB, and each sock is named PairA-L, PairA-R, PairB-L, PairB-R. So PairA-L can be paired with PairB-R, but PairA-L and PairB-L cannot be paired.

Let say any sock can be uniqly identified by,

Attribute[Gender] + Attribute[Colour] + Attribute[Material] + Attribute[Type1] + Attribute[Type2] + Attribute[Left_or_Right]

This is our first hash function. Let's use a short notation for this h1(G_C_M_T1_T2_LR). h1(x) is not our location key.

Another hash function eliminating the Left_or_Right attribute would be h2(G_C_M_T1_T2). This second function h2(x) is our location key! (for the space on the floor behind you).

  • To locate the slot, use h2(G_C_M_T1_T2).
  • Once the slot is located then use h1(x) to check their hashes. If they don't match, you have a pair. Else throw the sock into the same slot.

NOTE: Since we remove a pair once we find one, it's safe to assume that there would only be at a maximum one slot with a unique h2(x) or h1(x) value.

In case we have each sock with exactly one matching pair then use h2(x) for finding the location and if no pair, a check is required, since it's safe to assume they are a pair.

Why is it important to lay the socks on the floor

Let's consider a scenario where the socks are stacked over one another in a pile (worst case). This means we would have no other choice but to do a linear search to find a pair.

Spreading them on the floor gives more visibility which improves the chance of spotting the matching sock (matching a hash key). When a sock was placed on the floor in step 3, our mind had subconsciously registered the location. - So in case, this location is available in our memory we can directly find the matching pair. - In case the location is not remembered, don't worry, then we can always revert back to linear search.

Why is it important to remove the pair from the floor?

  • Short-term human memory works best when it has fewer items to remember. Thus increasing the probability of us resorting to hashing to spot the pair.
  • It will also reduce the number of items to be searched through when linear searching for the pair is being used.

Analysis

  1. Case 1: Worst case when Derpina cannot remember or spot the socks on the floor directly using the hashing technique. Derp does a linear search through the items on the floor. This is not worse than the iterating through the pile to find the pair.
    • Upper bound for comparison: O(n^2).
    • Lower bound for comparison: (n/2). (when every other sock Derpina picks up is the pair of previous one).
  2. Case 2: Derp remembers each the location of every sock that he placed on the floor and each sock has exactly one pair.
    • Upper bound for comparison: O(n/2).
    • Lower bound for comparison: O(n/2).

I am talking about comparison operations, picking the socks from the pile would necessarily be n number of operations. So a practical lower bound would be n iterations with n/2 comparisons.

Speeding up things

To achieve a perfect score so Derp gets O(n/2) comparisons, I would recommend Derpina to,

  • spend more time with the socks to get familiarize with it. Yes, that means spending more time with Derp's socks too.
  • Playing memory games like spot the pairs in a grid can improve short-term memory performance, which can be highly beneficial.

Is this equivalent to the element distinctness problem?

The method I suggested is one of the methods used to solve element distinctness problem where you place them in the hash table and do the comparison.

Given your special case where there exists only one exact pair, it has become very much equivalent to the element distinct problem. Since we can even sort the socks and check adjacent socks for pairs (another solution for EDP).

However, if there is a possibility of more than one pair that can exist for given sock then it deviates from EDP.

Examples related to algorithm

How can I tell if an algorithm is efficient? Find the smallest positive integer that does not occur in a given sequence Efficiently getting all divisors of a given number Peak signal detection in realtime timeseries data What is the optimal algorithm for the game 2048? How can I sort a std::map first by value, then by key? Finding square root without using sqrt function? Fastest way to flatten / un-flatten nested JSON objects Mergesort with Python Find common substring between two strings

Examples related to sorting

Sort Array of object by object field in Angular 6 Sorting a list with stream.sorted() in Java How to sort dates from Oldest to Newest in Excel? how to sort pandas dataframe from one column Reverse a comparator in Java 8 Find the unique values in a column and then sort them pandas groupby sort within groups pandas groupby sort descending order Efficiently sorting a numpy array in descending order? Swift: Sort array of objects alphabetically

Examples related to language-agnostic

IOException: The process cannot access the file 'file path' because it is being used by another process Peak signal detection in realtime timeseries data Match linebreaks - \n or \r\n? Simple way to understand Encapsulation and Abstraction How can I pair socks from a pile efficiently? How do I determine whether my calculation of pi is accurate? What is ADT? (Abstract Data Type) How to explain callbacks in plain english? How are they different from calling one function from another function? Ukkonen's suffix tree algorithm in plain English Private vs Protected - Visibility Good-Practice Concern

Examples related to matching

How can I pair socks from a pile efficiently? Find which rows have different values for a given column in Teradata SQL Check if value exists in column in VBA String Pattern Matching In Java