[multithreading] What is a race condition?

When writing multithreaded applications, one of the most common problems experienced is race conditions.

My questions to the community are:

What is the race condition?
How do you detect them?
How do you handle them?
Finally, how do you prevent them from occurring?

The answer is


A race condition is a situation on concurrent programming where two concurrent threads or processes compete for a resource and the resulting final state depends on who gets the resource first.


Race conditions occur in multi-threaded applications or multi-process systems. A race condition, at its most basic, is anything that makes the assumption that two things not in the same thread or process will happen in a particular order, without taking steps to ensure that they do. This happens commonly when two threads are passing messages by setting and checking member variables of a class both can access. There's almost always a race condition when one thread calls sleep to give another thread time to finish a task (unless that sleep is in a loop, with some checking mechanism).

Tools for preventing race conditions are dependent on the language and OS, but some comon ones are mutexes, critical sections, and signals. Mutexes are good when you want to make sure you're the only one doing something. Signals are good when you want to make sure someone else has finished doing something. Minimizing shared resources can also help prevent unexpected behaviors

Detecting race conditions can be difficult, but there are a couple signs. Code which relies heavily on sleeps is prone to race conditions, so first check for calls to sleep in the affected code. Adding particularly long sleeps can also be used for debugging to try and force a particular order of events. This can be useful for reproducing the behavior, seeing if you can make it disappear by changing the timing of things, and for testing solutions put in place. The sleeps should be removed after debugging.

The signature sign that one has a race condition though, is if there's an issue that only occurs intermittently on some machines. Common bugs would be crashes and deadlocks. With logging, you should be able to find the affected area and work back from there.


Consider an operation which has to display the count as soon as the count gets incremented. ie., as soon as CounterThread increments the value DisplayThread needs to display the recently updated value.

int i = 0;

Output

CounterThread -> i = 1  
DisplayThread -> i = 1  
CounterThread -> i = 2  
CounterThread -> i = 3  
CounterThread -> i = 4  
DisplayThread -> i = 4

Here CounterThread gets the lock frequently and updates the value before DisplayThread displays it. Here exists a Race condition. Race Condition can be solved by using Synchronzation


You can prevent race condition, if you use "Atomic" classes. The reason is just the thread don't separate operation get and set, example is below:

AtomicInteger ai = new AtomicInteger(2);
ai.getAndAdd(5);

As a result, you will have 7 in link "ai". Although you did two actions, but the both operation confirm the same thread and no one other thread will interfere to this, that means no race conditions!


Here is the classical Bank Account Balance example which will help newbies to understand Threads in Java easily w.r.t. race conditions:

public class BankAccount {

/**
 * @param args
 */
int accountNumber;
double accountBalance;

public synchronized boolean Deposit(double amount){
    double newAccountBalance=0;
    if(amount<=0){
        return false;
    }
    else {
        newAccountBalance = accountBalance+amount;
        accountBalance=newAccountBalance;
        return true;
    }

}
public synchronized boolean Withdraw(double amount){
    double newAccountBalance=0;
    if(amount>accountBalance){
        return false;
    }
    else{
        newAccountBalance = accountBalance-amount;
        accountBalance=newAccountBalance;
        return true;
    }
}

public static void main(String[] args) {
    // TODO Auto-generated method stub
    BankAccount b = new BankAccount();
    b.accountBalance=2000;
    System.out.println(b.Withdraw(3000));

}

A sort-of-canonical definition is "when two threads access the same location in memory at the same time, and at least one of the accesses is a write." In the situation the "reader" thread may get the old value or the new value, depending on which thread "wins the race." This is not always a bug—in fact, some really hairy low-level algorithms do this on purpose—but it should generally be avoided. @Steve Gury give's a good example of when it might be a problem.


Race condition is not only related with software but also related with hardware too. Actually the term was initially coined by the hardware industry.

According to wikipedia:

The term originates with the idea of two signals racing each other to influence the output first.

Race condition in a logic circuit:

enter image description here

Software industry took this term without modification, which makes it a little bit difficult to understand.

You need to do some replacement to map it to the software world:

  • "two signals" => "two threads"/"two processes"
  • "influence the output" => "influence some shared state"

So race condition in software industry means "two threads"/"two processes" racing each other to "influence some shared state", and the final result of the shared state will depend on some subtle timing difference, which could be caused by some specific thread/process launching order, thread/process scheduling, etc.


A "race condition" exists when multithreaded (or otherwise parallel) code that would access a shared resource could do so in such a way as to cause unexpected results.

Take this example:

for ( int i = 0; i < 10000000; i++ )
{
   x = x + 1; 
}

If you had 5 threads executing this code at once, the value of x WOULD NOT end up being 50,000,000. It would in fact vary with each run.

This is because, in order for each thread to increment the value of x, they have to do the following: (simplified, obviously)

Retrieve the value of x
Add 1 to this value
Store this value to x

Any thread can be at any step in this process at any time, and they can step on each other when a shared resource is involved. The state of x can be changed by another thread during the time between x is being read and when it is written back.

Let's say a thread retrieves the value of x, but hasn't stored it yet. Another thread can also retrieve the same value of x (because no thread has changed it yet) and then they would both be storing the same value (x+1) back in x!

Example:

Thread 1: reads x, value is 7
Thread 1: add 1 to x, value is now 8
Thread 2: reads x, value is 7
Thread 1: stores 8 in x
Thread 2: adds 1 to x, value is now 8
Thread 2: stores 8 in x

Race conditions can be avoided by employing some sort of locking mechanism before the code that accesses the shared resource:

for ( int i = 0; i < 10000000; i++ )
{
   //lock x
   x = x + 1; 
   //unlock x
}

Here, the answer comes out as 50,000,000 every time.

For more on locking, search for: mutex, semaphore, critical section, shared resource.


What is a race condition?

The situation when the process is critically dependent on the sequence or timing of other events.

For example, Processor A and processor B both needs identical resource for their execution.

How do you detect them?

There are tools to detect race condition automatically:

How do you handle them?

Race condition can be handled by Mutex or Semaphores. They act as a lock allows a process to acquire a resource based on certain requirements to prevent race condition.

How do you prevent them from occurring?

There are various ways to prevent race condition, such as Critical Section Avoidance.

  1. No two processes simultaneously inside their critical regions. (Mutual Exclusion)
  2. No assumptions are made about speeds or the number of CPUs.
  3. No process running outside its critical region which blocks other processes.
  4. No process has to wait forever to enter its critical region. (A waits for B resources, B waits for C resources, C waits for A resources)

There is an important technical difference between race conditions and data races. Most answers seem to make the assumption that these terms are equivalent, but they are not.

A data race occurs when 2 instructions access the same memory location, at least one of these accesses is a write and there is no happens before ordering among these accesses. Now what constitutes a happens before ordering is subject to a lot of debate, but in general ulock-lock pairs on the same lock variable and wait-signal pairs on the same condition variable induce a happens-before order.

A race condition is a semantic error. It is a flaw that occurs in the timing or the ordering of events that leads to erroneous program behavior.

Many race conditions can be (and in fact are) caused by data races, but this is not necessary. As a matter of fact, data races and race conditions are neither the necessary, nor the sufficient condition for one another. This blog post also explains the difference very well, with a simple bank transaction example. Here is another simple example that explains the difference.

Now that we nailed down the terminology, let us try to answer the original question.

Given that race conditions are semantic bugs, there is no general way of detecting them. This is because there is no way of having an automated oracle that can distinguish correct vs. incorrect program behavior in the general case. Race detection is an undecidable problem.

On the other hand, data races have a precise definition that does not necessarily relate to correctness, and therefore one can detect them. There are many flavors of data race detectors (static/dynamic data race detection, lockset-based data race detection, happens-before based data race detection, hybrid data race detection). A state of the art dynamic data race detector is ThreadSanitizer which works very well in practice.

Handling data races in general requires some programming discipline to induce happens-before edges between accesses to shared data (either during development, or once they are detected using the above mentioned tools). this can be done through locks, condition variables, semaphores, etc. However, one can also employ different programming paradigms like message passing (instead of shared memory) that avoid data races by construction.


What is a Race Condition?

You are planning to go to a movie at 5 pm. You inquire about the availability of the tickets at 4 pm. The representative says that they are available. You relax and reach the ticket window 5 minutes before the show. I'm sure you can guess what happens: it's a full house. The problem here was in the duration between the check and the action. You inquired at 4 and acted at 5. In the meantime, someone else grabbed the tickets. That's a race condition - specifically a "check-then-act" scenario of race conditions.

How do you detect them?

Religious code review, multi-threaded unit tests. There is no shortcut. There are few Eclipse plugin emerging on this, but nothing stable yet.

How do you handle and prevent them?

The best thing would be to create side-effect free and stateless functions, use immutables as much as possible. But that is not always possible. So using java.util.concurrent.atomic, concurrent data structures, proper synchronization, and actor based concurrency will help.

The best resource for concurrency is JCIP. You can also get some more details on above explanation here.


A race condition is a kind of bug, that happens only with certain temporal conditions.

Example: Imagine you have two threads, A and B.

In Thread A:

if( object.a != 0 )
    object.avg = total / object.a

In Thread B:

object.a = 0

If thread A is preempted just after having check that object.a is not null, B will do a = 0, and when thread A will gain the processor, it will do a "divide by zero".

This bug only happen when thread A is preempted just after the if statement, it's very rare, but it can happen.


Microsoft actually have published a really detailed article on this matter of race conditions and deadlocks. The most summarized abstract from it would be the title paragraph:

A race condition occurs when two threads access a shared variable at the same time. The first thread reads the variable, and the second thread reads the same value from the variable. Then the first thread and second thread perform their operations on the value, and they race to see which thread can write the value last to the shared variable. The value of the thread that writes its value last is preserved, because the thread is writing over the value that the previous thread wrote.


A race condition is an undesirable situation that occurs when two or more process can access and change the shared data at the same time.It occurred because there were conflicting accesses to a resource . Critical section problem may cause race condition. To solve critical condition among the process we have take out only one process at a time which execute the critical section.


Try this basic example for better understanding of race condition:

    public class ThreadRaceCondition {

    /**
     * @param args
     * @throws InterruptedException
     */
    public static void main(String[] args) throws InterruptedException {
        Account myAccount = new Account(22222222);

        // Expected deposit: 250
        for (int i = 0; i < 50; i++) {
            Transaction t = new Transaction(myAccount,
                    Transaction.TransactionType.DEPOSIT, 5.00);
            t.start();
        }

        // Expected withdrawal: 50
        for (int i = 0; i < 50; i++) {
            Transaction t = new Transaction(myAccount,
                    Transaction.TransactionType.WITHDRAW, 1.00);
            t.start();

        }

        // Temporary sleep to ensure all threads are completed. Don't use in
        // realworld :-)
        Thread.sleep(1000);
        // Expected account balance is 200
        System.out.println("Final Account Balance: "
                + myAccount.getAccountBalance());

    }

}

class Transaction extends Thread {

    public static enum TransactionType {
        DEPOSIT(1), WITHDRAW(2);

        private int value;

        private TransactionType(int value) {
            this.value = value;
        }

        public int getValue() {
            return value;
        }
    };

    private TransactionType transactionType;
    private Account account;
    private double amount;

    /*
     * If transactionType == 1, deposit else if transactionType == 2 withdraw
     */
    public Transaction(Account account, TransactionType transactionType,
            double amount) {
        this.transactionType = transactionType;
        this.account = account;
        this.amount = amount;
    }

    public void run() {
        switch (this.transactionType) {
        case DEPOSIT:
            deposit();
            printBalance();
            break;
        case WITHDRAW:
            withdraw();
            printBalance();
            break;
        default:
            System.out.println("NOT A VALID TRANSACTION");
        }
        ;
    }

    public void deposit() {
        this.account.deposit(this.amount);
    }

    public void withdraw() {
        this.account.withdraw(amount);
    }

    public void printBalance() {
        System.out.println(Thread.currentThread().getName()
                + " : TransactionType: " + this.transactionType + ", Amount: "
                + this.amount);
        System.out.println("Account Balance: "
                + this.account.getAccountBalance());
    }
}

class Account {
    private int accountNumber;
    private double accountBalance;

    public int getAccountNumber() {
        return accountNumber;
    }

    public double getAccountBalance() {
        return accountBalance;
    }

    public Account(int accountNumber) {
        this.accountNumber = accountNumber;
    }

    // If this method is not synchronized, you will see race condition on
    // Remove syncronized keyword to see race condition
    public synchronized boolean deposit(double amount) {
        if (amount < 0) {
            return false;
        } else {
            accountBalance = accountBalance + amount;
            return true;
        }
    }

    // If this method is not synchronized, you will see race condition on
    // Remove syncronized keyword to see race condition
    public synchronized boolean withdraw(double amount) {
        if (amount > accountBalance) {
            return false;
        } else {
            accountBalance = accountBalance - amount;
            return true;
        }
    }
}

A race condition is an undesirable situation that occurs when a device or system attempts to perform two or more operations at the same time, but because of the nature of the device or system, the operations must be done in the proper sequence in order to be done correctly.

In computer memory or storage, a race condition may occur if commands to read and write a large amount of data are received at almost the same instant, and the machine attempts to overwrite some or all of the old data while that old data is still being read. The result may be one or more of the following: a computer crash, an "illegal operation," notification and shutdown of the program, errors reading the old data, or errors writing the new data.


You don't always want to discard a race condition. If you have a flag which can be read and written by multiple threads, and this flag is set to 'done' by one thread so that other thread stop processing when flag is set to 'done', you don't want that "race condition" to be eliminated. In fact, this one can be referred to as a benign race condition.

However, using a tool for detection of race condition, it will be spotted as a harmful race condition.

More details on race condition here, http://msdn.microsoft.com/en-us/magazine/cc546569.aspx.


Examples related to multithreading

How can compare-and-swap be used for a wait-free mutual exclusion for any shared data structure? Waiting until the task finishes What is the difference between Task.Run() and Task.Factory.StartNew() Why is setState in reactjs Async instead of Sync? What exactly is std::atomic? Calling async method on button click WAITING at sun.misc.Unsafe.park(Native Method) How to use background thread in swift? What is the use of static synchronized method in java? Locking pattern for proper use of .NET MemoryCache

Examples related to concurrency

WAITING at sun.misc.Unsafe.park(Native Method) What is the Swift equivalent to Objective-C's "@synchronized"? Custom thread pool in Java 8 parallel stream How to check if another instance of my shell script is running How to use the CancellationToken property? What's the difference between a Future and a Promise? Why use a ReentrantLock if one can use synchronized(this)? NSOperation vs Grand Central Dispatch What's the difference between Thread start() and Runnable run() multiprocessing.Pool: When to use apply, apply_async or map?

Examples related to terminology

The differences between initialize, define, declare a variable What is the difference between a web API and a web service? What does "opt" mean (as in the "opt" directory)? Is it an abbreviation? What's the name for hyphen-separated case? What is Bit Masking? What is ADT? (Abstract Data Type) What exactly are iterator, iterable, and iteration? What is a web service endpoint? What is the difference between Cloud, Grid and Cluster? How to explain callbacks in plain english? How are they different from calling one function from another function?

Examples related to race-condition

How to get last inserted row ID from WordPress database? What is a race condition?