[operating-system] Difference between binary semaphore and mutex

Is there any difference between a binary semaphore and mutex or are they essentially the same?

This question is related to operating-system mutex semaphore binary-semaphore

The answer is


http://www.geeksforgeeks.org/archives/9102 discusses in details.

Mutex is locking mechanism used to synchronize access to a resource. Semaphore is signaling mechanism.

Its up to to programmer if he/she wants to use binary semaphore in place of mutex.


As many folks here have mentioned, a mutex is used to protect a critical piece of code (AKA critical section.) You will acquire the mutex (lock), enter critical section, and release mutex (unlock) all in the same thread.

While using a semaphore, you can make a thread wait on a semaphore (say thread A), until another thread (say thread B)completes whatever task, and then sets the Semaphore for thread A to stop the wait, and continue its task.


They are NOT the same thing. They are used for different purposes!
While both types of semaphores have a full/empty state and use the same API, their usage is very different.

Mutual Exclusion Semaphores
Mutual Exclusion semaphores are used to protect shared resources (data structure, file, etc..).

A Mutex semaphore is "owned" by the task that takes it. If Task B attempts to semGive a mutex currently held by Task A, Task B's call will return an error and fail.

Mutexes always use the following sequence:

  - SemTake
  - Critical Section
  - SemGive

Here is a simple example:

  Thread A                     Thread B
   Take Mutex
     access data
     ...                        Take Mutex  <== Will block
     ...
   Give Mutex                     access data  <== Unblocks
                                  ...
                                Give Mutex

Binary Semaphore
Binary Semaphore address a totally different question:

  • Task B is pended waiting for something to happen (a sensor being tripped for example).
  • Sensor Trips and an Interrupt Service Routine runs. It needs to notify a task of the trip.
  • Task B should run and take appropriate actions for the sensor trip. Then go back to waiting.

   Task A                      Task B
   ...                         Take BinSemaphore   <== wait for something
   Do Something Noteworthy
   Give BinSemaphore           do something    <== unblocks

Note that with a binary semaphore, it is OK for B to take the semaphore and A to give it.
Again, a binary semaphore is NOT protecting a resource from access. The act of Giving and Taking a semaphore are fundamentally decoupled.
It typically makes little sense for the same task to so a give and a take on the same binary semaphore.


Diff between Binary Semaphore and Mutex: OWNERSHIP: Semaphores can be signalled (posted) even from a non current owner. It means you can simply post from any other thread, though you are not the owner.

Semaphore is a public property in process, It can be simply posted by a non owner thread. Please Mark this difference in BOLD letters, it mean a lot.


You obviously use mutex to lock a data in one thread getting accessed by another thread at the same time. Assume that you have just called lock() and in the process of accessing data. This means that you don’t expect any other thread (or another instance of the same thread-code) to access the same data locked by the same mutex. That is, if it is the same thread-code getting executed on a different thread instance, hits the lock, then the lock() should block the control flow there. This applies to a thread that uses a different thread-code, which is also accessing the same data and which is also locked by the same mutex. In this case, you are still in the process of accessing the data and you may take, say, another 15 secs to reach the mutex unlock (so that the other thread that is getting blocked in mutex lock would unblock and would allow the control to access the data). Do you at any cost allow yet another thread to just unlock the same mutex, and in turn, allow the thread that is already waiting (blocking) in the mutex lock to unblock and access the data? Hope you got what I am saying here? As per, agreed upon universal definition!,

  • with “mutex” this can’t happen. No other thread can unlock the lock in your thread
  • with “binary-semaphore” this can happen. Any other thread can unlock the lock in your thread

So, if you are very particular about using binary-semaphore instead of mutex, then you should be very careful in “scoping” the locks and unlocks. I mean that every control-flow that hits every lock should hit an unlock call, also there shouldn’t be any “first unlock”, rather it should be always “first lock”.


Mutexes have ownership, unlike semaphores. Although any thread, within the scope of a mutex, can get an unlocked mutex and lock access to the same critical section of code,only the thread that locked a mutex should unlock it.


Mutex: Suppose we have critical section thread T1 wants to access it then it follows below steps. T1:

  1. Lock
  2. Use Critical Section
  3. Unlock

Binary semaphore: It works based on signaling wait and signal. wait(s) decrease "s" value by one usually "s" value is initialize with value "1", signal(s) increases "s" value by one. if "s" value is 1 means no one is using critical section, when value is 0 means critical section is in use. suppose thread T2 is using critical section then it follows below steps. T2 :

  1. wait(s)//initially s value is one after calling wait it's value decreased by one i.e 0
  2. Use critical section
  3. signal(s) // now s value is increased and it become 1

Main difference between Mutex and Binary semaphore is in Mutext if thread lock the critical section then it has to unlock critical section no other thread can unlock it, but in case of Binary semaphore if one thread locks critical section using wait(s) function then value of s become "0" and no one can access it until value of "s" become 1 but suppose some other thread calls signal(s) then value of "s" become 1 and it allows other function to use critical section. hence in Binary semaphore thread doesn't have ownership.


MUTEX

Until recently, the only sleeping lock in the kernel was the semaphore. Most users of semaphores instantiated a semaphore with a count of one and treated them as a mutual exclusion lock—a sleeping version of the spin-lock. Unfortunately, semaphores are rather generic and do not impose any usage constraints. This makes them useful for managing exclusive access in obscure situations, such as complicated dances between the kernel and userspace. But it also means that simpler locking is harder to do, and the lack of enforced rules makes any sort of automated debugging or constraint enforcement impossible. Seeking a simpler sleeping lock, the kernel developers introduced the mutex.Yes, as you are now accustomed to, that is a confusing name. Let’s clarify.The term “mutex” is a generic name to refer to any sleeping lock that enforces mutual exclusion, such as a semaphore with a usage count of one. In recent Linux kernels, the proper noun “mutex” is now also a specific type of sleeping lock that implements mutual exclusion.That is, a mutex is a mutex.

The simplicity and efficiency of the mutex come from the additional constraints it imposes on its users over and above what the semaphore requires. Unlike a semaphore, which implements the most basic of behaviour in accordance with Dijkstra’s original design, the mutex has a stricter, narrower use case: n Only one task can hold the mutex at a time. That is, the usage count on a mutex is always one.

  1. Whoever locked a mutex must unlock it. That is, you cannot lock a mutex in one context and then unlock it in another. This means that the mutex isn’t suitable for more complicated synchronizations between kernel and user-space. Most use cases, however, cleanly lock and unlock from the same context.
  2. Recursive locks and unlocks are not allowed. That is, you cannot recursively acquire the same mutex, and you cannot unlock an unlocked mutex.
  3. A process cannot exit while holding a mutex.
  4. A mutex cannot be acquired by an interrupt handler or bottom half, even with mutex_trylock().
  5. A mutex can be managed only via the official API: It must be initialized via the methods described in this section and cannot be copied, hand initialized, or reinitialized.

[1] Linux Kernel Development, Third Edition Robert Love


The basic issue is concurrency. There is more than one flow of control. Think about two processes using a shared memory. Now only one process can access the shared memory at a time. If more than one process accesses the shared memory at a time, the contents of shared memory would get corrupted. It is like a railroad track. Only one train can run on it, else there would be an accident.So there is a signalling mechanism, which a driver checks. If the signal is green, the train can go and if it is red it has to wait to use the track. Similarly in case of shared memory, there is a binary semaphore. If the semaphore is 1, a process acquires it (makes it 0) and goes ahead and accesses it. If the semaphore is 0, the process waits. The functionality the binary semaphore has to provide is mutual exclusion (or mutex, in short) so that only one of the many concurrent entities (process or thread) mutually excludes others. It is a plus that we have counting semaphores, which help in synchronizing multiple instances of a resource.

Mutual exclusion is the basic functionality provided by semaphores. Now in the context of threads, we might have a different name and syntax for it. But the underlying concept is the same: how to keep integrity of code and data in concurrent programming. In my opinion, things like ownership, and associated checks are refinements provided by implementations.


Mutex and binary semaphore are both of the same usage, but in reality, they are different.

In case of mutex, only the thread which have locked it can unlock it. If any other thread comes to lock it, it will wait.

In case of semaphone, that's not the case. Semaphore is not tied up with a particular thread ID.


The Toilet example is an enjoyable analogy:

Mutex:

Is a key to a toilet. One person can have the key - occupy the toilet - at the time. When finished, the person gives (frees) the key to the next person in the queue.

Officially: "Mutexes are typically used to serialise access to a section of re-entrant code that cannot be executed concurrently by more than one thread. A mutex object only allows one thread into a controlled section, forcing other threads which attempt to gain access to that section to wait until the first thread has exited from that section." Ref: Symbian Developer Library

(A mutex is really a semaphore with value 1.)

Semaphore:

Is the number of free identical toilet keys. Example, say we have four toilets with identical locks and keys. The semaphore count - the count of keys - is set to 4 at beginning (all four toilets are free), then the count value is decremented as people are coming in. If all toilets are full, ie. there are no free keys left, the semaphore count is 0. Now, when eq. one person leaves the toilet, semaphore is increased to 1 (one free key), and given to the next person in the queue.

Officially: "A semaphore restricts the number of simultaneous users of a shared resource up to a maximum number. Threads can request access to the resource (decrementing the semaphore), and can signal that they have finished using the resource (incrementing the semaphore)." Ref: Symbian Developer Library


The answer may depend on the target OS. For example, at least one RTOS implementation I'm familiar with will allow multiple sequential "get" operations against a single OS mutex, so long as they're all from within the same thread context. The multiple gets must be replaced by an equal number of puts before another thread will be allowed to get the mutex. This differs from binary semaphores, for which only a single get is allowed at a time, regardless of thread contexts.

The idea behind this type of mutex is that you protect an object by only allowing a single context to modify the data at a time. Even if the thread gets the mutex and then calls a function that further modifies the object (and gets/puts the protector mutex around its own operations), the operations should still be safe because they're all happening under a single thread.

{
    mutexGet();  // Other threads can no longer get the mutex.

    // Make changes to the protected object.
    // ...

    objectModify();  // Also gets/puts the mutex.  Only allowed from this thread context.

    // Make more changes to the protected object.
    // ...

    mutexPut();  // Finally allows other threads to get the mutex.
}

Of course, when using this feature, you must be certain that all accesses within a single thread really are safe!

I'm not sure how common this approach is, or whether it applies outside of the systems with which I'm familiar. For an example of this kind of mutex, see the ThreadX RTOS.


In windows the difference is as below. MUTEX: process which successfully executes wait has to execute a signal and vice versa. BINARY SEMAPHORES: Different processes can execute wait or signal operation on a semaphore.


Best Solution

The only difference is

1.Mutex -> lock and unlock are under the ownership of a thread that locks the mutex.

2.Semaphore -> No ownership i.e; if one thread calls semwait(s) any other thread can call sempost(s) to remove the lock.


I think most of the answers here were confusing especially those saying that mutex can be released only by the process that holds it but semaphore can be signaled by ay process. The above line is kind of vague in terms of semaphore. To understand we should know that there are two kinds of semaphore one is called counting semaphore and the other is called a binary semaphore. In counting semaphore handles access to n number of resources where n can be defined before the use. Each semaphore has a count variable, which keeps the count of the number of resources in use, initially, it is set to n. Each process that wishes to uses a resource performs a wait() operation on the semaphore (thereby decrementing the count). When a process releases a resource, it performs a release() operation (incrementing the count). When the count becomes 0, all the resources are being used. After that, the process waits until the count becomes more than 0. Now here is the catch only the process that holds the resource can increase the count no other process can increase the count only the processes holding a resource can increase the count and the process waiting for the semaphore again checks and when it sees the resource available it decreases the count again. So in terms of binary semaphore, only the process holding the semaphore can increase the count, and count remains zero until it stops using the semaphore and increases the count and other process gets the chance to access the semaphore.

The main difference between binary semaphore and mutex is that semaphore is a signaling mechanism and mutex is a locking mechanism, but binary semaphore seems to function like mutex that creates confusion, but both are different concepts suitable for a different kinds of work.


You can clearly remember difference by this:

  1. Mutex lock : is for protecting critical region, Mutex can't be used across processes, only used in single process

  2. Semaphore: is for signalling availability of a resource. Semaphore can be used both across processes and across processes.


Mutex work on blocking critical region, But Semaphore work on count.


Myth:

Couple of article says that "binary semaphore and mutex are same" or "Semaphore with value 1 is mutex" but the basic difference is Mutex can be released only by thread that had acquired it, while you can signal semaphore from any other thread

Key Points:

•A thread can acquire more than one lock (Mutex).

•A mutex can be locked more than once only if its a recursive mutex, here lock and unlock for mutex should be same

•If a thread which had already locked a mutex, tries to lock the mutex again, it will enter into the waiting list of that mutex, which results in deadlock.

•Binary semaphore and mutex are similar but not same.

•Mutex is costly operation due to protection protocols associated with it.

•Main aim of mutex is achieve atomic access or lock on resource


A Mutex controls access to a single shared resource. It provides operations to acquire() access to that resource and release() it when done.

A Semaphore controls access to a shared pool of resources. It provides operations to Wait() until one of the resources in the pool becomes available, and Signal() when it is given back to the pool.

When number of resources a Semaphore protects is greater than 1, it is called a Counting Semaphore. When it controls one resource, it is called a Boolean Semaphore. A boolean semaphore is equivalent to a mutex.

Thus a Semaphore is a higher level abstraction than Mutex. A Mutex can be implemented using a Semaphore but not the other way around.


Mutex: Suppose we have critical section thread T1 wants to access it then it follows below steps. T1:

  1. Lock
  2. Use Critical Section
  3. Unlock

Binary semaphore: It works based on signaling wait and signal. wait(s) decrease "s" value by one usually "s" value is initialize with value "1", signal(s) increases "s" value by one. if "s" value is 1 means no one is using critical section, when value is 0 means critical section is in use. suppose thread T2 is using critical section then it follows below steps. T2 :

  1. wait(s)//initially s value is one after calling wait it's value decreased by one i.e 0
  2. Use critical section
  3. signal(s) // now s value is increased and it become 1

Main difference between Mutex and Binary semaphore is in Mutext if thread lock the critical section then it has to unlock critical section no other thread can unlock it, but in case of Binary semaphore if one thread locks critical section using wait(s) function then value of s become "0" and no one can access it until value of "s" become 1 but suppose some other thread calls signal(s) then value of "s" become 1 and it allows other function to use critical section. hence in Binary semaphore thread doesn't have ownership.


Mutex work on blocking critical region, But Semaphore work on count.


Mutex are used for " Locking Mechanisms ". one process at a time can use a shared resource

whereas

Semaphores are used for " Signaling Mechanisms " like "I am done , now can continue"


Nice articles on the topic:

From part 2:

The mutex is similar to the principles of the binary semaphore with one significant difference: the principle of ownership. Ownership is the simple concept that when a task locks (acquires) a mutex only it can unlock (release) it. If a task tries to unlock a mutex it hasn’t locked (thus doesn’t own) then an error condition is encountered and, most importantly, the mutex is not unlocked. If the mutual exclusion object doesn't have ownership then, irrelevant of what it is called, it is not a mutex.


"binary semaphore" is a programming language circumvent to use a «semaphore» like «mutex». Apparently there are two very big differences:

  1. The way you call each one of them.

  2. The maximum length of the "identifier".


The answer may depend on the target OS. For example, at least one RTOS implementation I'm familiar with will allow multiple sequential "get" operations against a single OS mutex, so long as they're all from within the same thread context. The multiple gets must be replaced by an equal number of puts before another thread will be allowed to get the mutex. This differs from binary semaphores, for which only a single get is allowed at a time, regardless of thread contexts.

The idea behind this type of mutex is that you protect an object by only allowing a single context to modify the data at a time. Even if the thread gets the mutex and then calls a function that further modifies the object (and gets/puts the protector mutex around its own operations), the operations should still be safe because they're all happening under a single thread.

{
    mutexGet();  // Other threads can no longer get the mutex.

    // Make changes to the protected object.
    // ...

    objectModify();  // Also gets/puts the mutex.  Only allowed from this thread context.

    // Make more changes to the protected object.
    // ...

    mutexPut();  // Finally allows other threads to get the mutex.
}

Of course, when using this feature, you must be certain that all accesses within a single thread really are safe!

I'm not sure how common this approach is, or whether it applies outside of the systems with which I'm familiar. For an example of this kind of mutex, see the ThreadX RTOS.


Since none of the above answer clears the confusion, here is one which cleared my confusion.

Strictly speaking, a mutex is a locking mechanism used to synchronize access to a resource. Only one task (can be a thread or process based on OS abstraction) can acquire the mutex. It means there will be ownership associated with mutex, and only the owner can release the lock (mutex).

Semaphore is signaling mechanism (“I am done, you can carry on” kind of signal). For example, if you are listening songs (assume it as one task) on your mobile and at the same time your friend called you, an interrupt will be triggered upon which an interrupt service routine (ISR) will signal the call processing task to wakeup.

Source: http://www.geeksforgeeks.org/mutex-vs-semaphore/


Mutexes have ownership, unlike semaphores. Although any thread, within the scope of a mutex, can get an unlocked mutex and lock access to the same critical section of code,only the thread that locked a mutex should unlock it.


A Mutex controls access to a single shared resource. It provides operations to acquire() access to that resource and release() it when done.

A Semaphore controls access to a shared pool of resources. It provides operations to Wait() until one of the resources in the pool becomes available, and Signal() when it is given back to the pool.

When number of resources a Semaphore protects is greater than 1, it is called a Counting Semaphore. When it controls one resource, it is called a Boolean Semaphore. A boolean semaphore is equivalent to a mutex.

Thus a Semaphore is a higher level abstraction than Mutex. A Mutex can be implemented using a Semaphore but not the other way around.


Apart from the fact that mutexes have an owner, the two objects may be optimized for different usage. Mutexes are designed to be held only for a short time; violating this can cause poor performance and unfair scheduling. For example, a running thread may be permitted to acquire a mutex, even though another thread is already blocked on it. Semaphores may provide more fairness, or fairness can be forced using several condition variables.


At a theoretical level, they are no different semantically. You can implement a mutex using semaphores or vice versa (see here for an example). In practice, the implementations are different and they offer slightly different services.

The practical difference (in terms of the system services surrounding them) is that the implementation of a mutex is aimed at being a more lightweight synchronisation mechanism. In oracle-speak, mutexes are known as latches and semaphores are known as waits.

At the lowest level, they use some sort of atomic test and set mechanism. This reads the current value of a memory location, computes some sort of conditional and writes out a value at that location in a single instruction that cannot be interrupted. This means that you can acquire a mutex and test to see if anyone else had it before you.

A typical mutex implementation has a process or thread executing the test-and-set instruction and evaluating whether anything else had set the mutex. A key point here is that there is no interaction with the scheduler, so we have no idea (and don't care) who has set the lock. Then we either give up our time slice and attempt it again when the task is re-scheduled or execute a spin-lock. A spin lock is an algorithm like:

Count down from 5000:
     i. Execute the test-and-set instruction
    ii. If the mutex is clear, we have acquired it in the previous instruction 
        so we can exit the loop
   iii. When we get to zero, give up our time slice.

When we have finished executing our protected code (known as a critical section) we just set the mutex value to zero or whatever means 'clear.' If multiple tasks are attempting to acquire the mutex then the next task that happens to be scheduled after the mutex is released will get access to the resource. Typically you would use mutexes to control a synchronised resource where exclusive access is only needed for very short periods of time, normally to make an update to a shared data structure.

A semaphore is a synchronised data structure (typically using a mutex) that has a count and some system call wrappers that interact with the scheduler in a bit more depth than the mutex libraries would. Semaphores are incremented and decremented and used to block tasks until something else is ready. See Producer/Consumer Problem for a simple example of this. Semaphores are initialised to some value - a binary semaphore is just a special case where the semaphore is initialised to 1. Posting to a semaphore has the effect of waking up a waiting process.

A basic semaphore algorithm looks like:

(somewhere in the program startup)
Initialise the semaphore to its start-up value.

Acquiring a semaphore
   i. (synchronised) Attempt to decrement the semaphore value
  ii. If the value would be less than zero, put the task on the tail of the list of tasks waiting on the semaphore and give up the time slice.

Posting a semaphore
   i. (synchronised) Increment the semaphore value
  ii. If the value is greater or equal to the amount requested in the post at the front of the queue, take that task off the queue and make it runnable.  
 iii. Repeat (ii) for all tasks until the posted value is exhausted or there are no more tasks waiting.

In the case of a binary semaphore the main practical difference between the two is the nature of the system services surrounding the actual data structure.

EDIT: As evan has rightly pointed out, spinlocks will slow down a single processor machine. You would only use a spinlock on a multi-processor box because on a single processor the process holding the mutex will never reset it while another task is running. Spinlocks are only useful on multi-processor architectures.


While a binary semaphore may be used as a mutex, a mutex is a more specific use-case, in that only the process that locked the mutex is supposed to unlock it. This ownership constraint makes it possible to provide protection against:

  • Accidental release
  • Recursive Deadlock
  • Task Death Deadlock

These constraints are not always present because they degrade the speed. During the development of your code, you can enable these checks temporarily.

e.g. you can enable Error check attribute in your mutex. Error checking mutexes return EDEADLK if you try to lock the same one twice and EPERM if you unlock a mutex that isn't yours.

pthread_mutex_t mutex;
pthread_mutexattr_t attr;
pthread_mutexattr_init (&attr);
pthread_mutexattr_settype (&attr, PTHREAD_MUTEX_ERRORCHECK_NP);
pthread_mutex_init (&mutex, &attr);

Once initialised we can place these checks in our code like this:

if(pthread_mutex_unlock(&mutex)==EPERM)
 printf("Unlock failed:Mutex not owned by this thread\n");

They are NOT the same thing. They are used for different purposes!
While both types of semaphores have a full/empty state and use the same API, their usage is very different.

Mutual Exclusion Semaphores
Mutual Exclusion semaphores are used to protect shared resources (data structure, file, etc..).

A Mutex semaphore is "owned" by the task that takes it. If Task B attempts to semGive a mutex currently held by Task A, Task B's call will return an error and fail.

Mutexes always use the following sequence:

  - SemTake
  - Critical Section
  - SemGive

Here is a simple example:

  Thread A                     Thread B
   Take Mutex
     access data
     ...                        Take Mutex  <== Will block
     ...
   Give Mutex                     access data  <== Unblocks
                                  ...
                                Give Mutex

Binary Semaphore
Binary Semaphore address a totally different question:

  • Task B is pended waiting for something to happen (a sensor being tripped for example).
  • Sensor Trips and an Interrupt Service Routine runs. It needs to notify a task of the trip.
  • Task B should run and take appropriate actions for the sensor trip. Then go back to waiting.

   Task A                      Task B
   ...                         Take BinSemaphore   <== wait for something
   Do Something Noteworthy
   Give BinSemaphore           do something    <== unblocks

Note that with a binary semaphore, it is OK for B to take the semaphore and A to give it.
Again, a binary semaphore is NOT protecting a resource from access. The act of Giving and Taking a semaphore are fundamentally decoupled.
It typically makes little sense for the same task to so a give and a take on the same binary semaphore.


Almost all of the above said it right. Let me also try my bit to clarify if somebody still has a doubt.

  • Mutex -> used for serialization
  • Semaphore-> synchronization.

Purpose of both are different however, same functionality could be achieved through both of them with careful programming.

Standard Example-> producer consumer problem.

initial value of SemaVar=0

Producer                           Consumer
---                                SemaWait()->decrement SemaVar   
produce data
---
SemaSignal SemaVar or SemaVar++  --->consumer unblocks as SemVar is 1 now.

Hope I could clarify.


  • A mutex can be released only by the thread that had acquired it.
  • A binary semaphore can be signaled by any thread (or process).

so semaphores are more suitable for some synchronization problems like producer-consumer.

On Windows, binary semaphores are more like event objects than mutexes.


The basic issue is concurrency. There is more than one flow of control. Think about two processes using a shared memory. Now only one process can access the shared memory at a time. If more than one process accesses the shared memory at a time, the contents of shared memory would get corrupted. It is like a railroad track. Only one train can run on it, else there would be an accident.So there is a signalling mechanism, which a driver checks. If the signal is green, the train can go and if it is red it has to wait to use the track. Similarly in case of shared memory, there is a binary semaphore. If the semaphore is 1, a process acquires it (makes it 0) and goes ahead and accesses it. If the semaphore is 0, the process waits. The functionality the binary semaphore has to provide is mutual exclusion (or mutex, in short) so that only one of the many concurrent entities (process or thread) mutually excludes others. It is a plus that we have counting semaphores, which help in synchronizing multiple instances of a resource.

Mutual exclusion is the basic functionality provided by semaphores. Now in the context of threads, we might have a different name and syntax for it. But the underlying concept is the same: how to keep integrity of code and data in concurrent programming. In my opinion, things like ownership, and associated checks are refinements provided by implementations.


Since none of the above answer clears the confusion, here is one which cleared my confusion.

Strictly speaking, a mutex is a locking mechanism used to synchronize access to a resource. Only one task (can be a thread or process based on OS abstraction) can acquire the mutex. It means there will be ownership associated with mutex, and only the owner can release the lock (mutex).

Semaphore is signaling mechanism (“I am done, you can carry on” kind of signal). For example, if you are listening songs (assume it as one task) on your mobile and at the same time your friend called you, an interrupt will be triggered upon which an interrupt service routine (ISR) will signal the call processing task to wakeup.

Source: http://www.geeksforgeeks.org/mutex-vs-semaphore/


Best Solution

The only difference is

1.Mutex -> lock and unlock are under the ownership of a thread that locks the mutex.

2.Semaphore -> No ownership i.e; if one thread calls semwait(s) any other thread can call sempost(s) to remove the lock.


  • A mutex can be released only by the thread that had acquired it.
  • A binary semaphore can be signaled by any thread (or process).

so semaphores are more suitable for some synchronization problems like producer-consumer.

On Windows, binary semaphores are more like event objects than mutexes.


While a binary semaphore may be used as a mutex, a mutex is a more specific use-case, in that only the process that locked the mutex is supposed to unlock it. This ownership constraint makes it possible to provide protection against:

  • Accidental release
  • Recursive Deadlock
  • Task Death Deadlock

These constraints are not always present because they degrade the speed. During the development of your code, you can enable these checks temporarily.

e.g. you can enable Error check attribute in your mutex. Error checking mutexes return EDEADLK if you try to lock the same one twice and EPERM if you unlock a mutex that isn't yours.

pthread_mutex_t mutex;
pthread_mutexattr_t attr;
pthread_mutexattr_init (&attr);
pthread_mutexattr_settype (&attr, PTHREAD_MUTEX_ERRORCHECK_NP);
pthread_mutex_init (&mutex, &attr);

Once initialised we can place these checks in our code like this:

if(pthread_mutex_unlock(&mutex)==EPERM)
 printf("Unlock failed:Mutex not owned by this thread\n");

On Windows, there are two differences between mutexes and binary semaphores:

  1. A mutex can only be released by the thread which has ownership, i.e. the thread which previously called the Wait function, (or which took ownership when creating it). A semaphore can be released by any thread.

  2. A thread can call a wait function repeatedly on a mutex without blocking. However, if you call a wait function twice on a binary semaphore without releasing the semaphore in between, the thread will block.


Myth:

Couple of article says that "binary semaphore and mutex are same" or "Semaphore with value 1 is mutex" but the basic difference is Mutex can be released only by thread that had acquired it, while you can signal semaphore from any other thread

Key Points:

•A thread can acquire more than one lock (Mutex).

•A mutex can be locked more than once only if its a recursive mutex, here lock and unlock for mutex should be same

•If a thread which had already locked a mutex, tries to lock the mutex again, it will enter into the waiting list of that mutex, which results in deadlock.

•Binary semaphore and mutex are similar but not same.

•Mutex is costly operation due to protection protocols associated with it.

•Main aim of mutex is achieve atomic access or lock on resource


Diff between Binary Semaphore and Mutex: OWNERSHIP: Semaphores can be signalled (posted) even from a non current owner. It means you can simply post from any other thread, though you are not the owner.

Semaphore is a public property in process, It can be simply posted by a non owner thread. Please Mark this difference in BOLD letters, it mean a lot.


You can clearly remember difference by this:

  1. Mutex lock : is for protecting critical region, Mutex can't be used across processes, only used in single process

  2. Semaphore: is for signalling availability of a resource. Semaphore can be used both across processes and across processes.


On Windows, there are two differences between mutexes and binary semaphores:

  1. A mutex can only be released by the thread which has ownership, i.e. the thread which previously called the Wait function, (or which took ownership when creating it). A semaphore can be released by any thread.

  2. A thread can call a wait function repeatedly on a mutex without blocking. However, if you call a wait function twice on a binary semaphore without releasing the semaphore in between, the thread will block.


As many folks here have mentioned, a mutex is used to protect a critical piece of code (AKA critical section.) You will acquire the mutex (lock), enter critical section, and release mutex (unlock) all in the same thread.

While using a semaphore, you can make a thread wait on a semaphore (say thread A), until another thread (say thread B)completes whatever task, and then sets the Semaphore for thread A to stop the wait, and continue its task.


Nice articles on the topic:

From part 2:

The mutex is similar to the principles of the binary semaphore with one significant difference: the principle of ownership. Ownership is the simple concept that when a task locks (acquires) a mutex only it can unlock (release) it. If a task tries to unlock a mutex it hasn’t locked (thus doesn’t own) then an error condition is encountered and, most importantly, the mutex is not unlocked. If the mutual exclusion object doesn't have ownership then, irrelevant of what it is called, it is not a mutex.


  • A mutex can be released only by the thread that had acquired it.
  • A binary semaphore can be signaled by any thread (or process).

so semaphores are more suitable for some synchronization problems like producer-consumer.

On Windows, binary semaphores are more like event objects than mutexes.


Mutex and binary semaphore are both of the same usage, but in reality, they are different.

In case of mutex, only the thread which have locked it can unlock it. If any other thread comes to lock it, it will wait.

In case of semaphone, that's not the case. Semaphore is not tied up with a particular thread ID.


Mutex is used to protect the sensitive code and data, semaphore is used to synchronization.You also can have practical use with protect the sensitive code, but there might be a risk that release the protection by the other thread by operation V.So The main difference between bi-semaphore and mutex is the ownership.For instance by toilet , Mutex is like that one can enter the toilet and lock the door, no one else can enter until the man get out, bi-semaphore is like that one can enter the toilet and lock the door, but someone else could enter by asking the administrator to open the door, it's ridiculous.


They are NOT the same thing. They are used for different purposes!
While both types of semaphores have a full/empty state and use the same API, their usage is very different.

Mutual Exclusion Semaphores
Mutual Exclusion semaphores are used to protect shared resources (data structure, file, etc..).

A Mutex semaphore is "owned" by the task that takes it. If Task B attempts to semGive a mutex currently held by Task A, Task B's call will return an error and fail.

Mutexes always use the following sequence:

  - SemTake
  - Critical Section
  - SemGive

Here is a simple example:

  Thread A                     Thread B
   Take Mutex
     access data
     ...                        Take Mutex  <== Will block
     ...
   Give Mutex                     access data  <== Unblocks
                                  ...
                                Give Mutex

Binary Semaphore
Binary Semaphore address a totally different question:

  • Task B is pended waiting for something to happen (a sensor being tripped for example).
  • Sensor Trips and an Interrupt Service Routine runs. It needs to notify a task of the trip.
  • Task B should run and take appropriate actions for the sensor trip. Then go back to waiting.

   Task A                      Task B
   ...                         Take BinSemaphore   <== wait for something
   Do Something Noteworthy
   Give BinSemaphore           do something    <== unblocks

Note that with a binary semaphore, it is OK for B to take the semaphore and A to give it.
Again, a binary semaphore is NOT protecting a resource from access. The act of Giving and Taking a semaphore are fundamentally decoupled.
It typically makes little sense for the same task to so a give and a take on the same binary semaphore.


At a theoretical level, they are no different semantically. You can implement a mutex using semaphores or vice versa (see here for an example). In practice, the implementations are different and they offer slightly different services.

The practical difference (in terms of the system services surrounding them) is that the implementation of a mutex is aimed at being a more lightweight synchronisation mechanism. In oracle-speak, mutexes are known as latches and semaphores are known as waits.

At the lowest level, they use some sort of atomic test and set mechanism. This reads the current value of a memory location, computes some sort of conditional and writes out a value at that location in a single instruction that cannot be interrupted. This means that you can acquire a mutex and test to see if anyone else had it before you.

A typical mutex implementation has a process or thread executing the test-and-set instruction and evaluating whether anything else had set the mutex. A key point here is that there is no interaction with the scheduler, so we have no idea (and don't care) who has set the lock. Then we either give up our time slice and attempt it again when the task is re-scheduled or execute a spin-lock. A spin lock is an algorithm like:

Count down from 5000:
     i. Execute the test-and-set instruction
    ii. If the mutex is clear, we have acquired it in the previous instruction 
        so we can exit the loop
   iii. When we get to zero, give up our time slice.

When we have finished executing our protected code (known as a critical section) we just set the mutex value to zero or whatever means 'clear.' If multiple tasks are attempting to acquire the mutex then the next task that happens to be scheduled after the mutex is released will get access to the resource. Typically you would use mutexes to control a synchronised resource where exclusive access is only needed for very short periods of time, normally to make an update to a shared data structure.

A semaphore is a synchronised data structure (typically using a mutex) that has a count and some system call wrappers that interact with the scheduler in a bit more depth than the mutex libraries would. Semaphores are incremented and decremented and used to block tasks until something else is ready. See Producer/Consumer Problem for a simple example of this. Semaphores are initialised to some value - a binary semaphore is just a special case where the semaphore is initialised to 1. Posting to a semaphore has the effect of waking up a waiting process.

A basic semaphore algorithm looks like:

(somewhere in the program startup)
Initialise the semaphore to its start-up value.

Acquiring a semaphore
   i. (synchronised) Attempt to decrement the semaphore value
  ii. If the value would be less than zero, put the task on the tail of the list of tasks waiting on the semaphore and give up the time slice.

Posting a semaphore
   i. (synchronised) Increment the semaphore value
  ii. If the value is greater or equal to the amount requested in the post at the front of the queue, take that task off the queue and make it runnable.  
 iii. Repeat (ii) for all tasks until the posted value is exhausted or there are no more tasks waiting.

In the case of a binary semaphore the main practical difference between the two is the nature of the system services surrounding the actual data structure.

EDIT: As evan has rightly pointed out, spinlocks will slow down a single processor machine. You would only use a spinlock on a multi-processor box because on a single processor the process holding the mutex will never reset it while another task is running. Spinlocks are only useful on multi-processor architectures.


http://www.geeksforgeeks.org/archives/9102 discusses in details.

Mutex is locking mechanism used to synchronize access to a resource. Semaphore is signaling mechanism.

Its up to to programmer if he/she wants to use binary semaphore in place of mutex.


  • A mutex can be released only by the thread that had acquired it.
  • A binary semaphore can be signaled by any thread (or process).

so semaphores are more suitable for some synchronization problems like producer-consumer.

On Windows, binary semaphores are more like event objects than mutexes.


Almost all of the above said it right. Let me also try my bit to clarify if somebody still has a doubt.

  • Mutex -> used for serialization
  • Semaphore-> synchronization.

Purpose of both are different however, same functionality could be achieved through both of them with careful programming.

Standard Example-> producer consumer problem.

initial value of SemaVar=0

Producer                           Consumer
---                                SemaWait()->decrement SemaVar   
produce data
---
SemaSignal SemaVar or SemaVar++  --->consumer unblocks as SemVar is 1 now.

Hope I could clarify.


The Toilet example is an enjoyable analogy:

Mutex:

Is a key to a toilet. One person can have the key - occupy the toilet - at the time. When finished, the person gives (frees) the key to the next person in the queue.

Officially: "Mutexes are typically used to serialise access to a section of re-entrant code that cannot be executed concurrently by more than one thread. A mutex object only allows one thread into a controlled section, forcing other threads which attempt to gain access to that section to wait until the first thread has exited from that section." Ref: Symbian Developer Library

(A mutex is really a semaphore with value 1.)

Semaphore:

Is the number of free identical toilet keys. Example, say we have four toilets with identical locks and keys. The semaphore count - the count of keys - is set to 4 at beginning (all four toilets are free), then the count value is decremented as people are coming in. If all toilets are full, ie. there are no free keys left, the semaphore count is 0. Now, when eq. one person leaves the toilet, semaphore is increased to 1 (one free key), and given to the next person in the queue.

Officially: "A semaphore restricts the number of simultaneous users of a shared resource up to a maximum number. Threads can request access to the resource (decrementing the semaphore), and can signal that they have finished using the resource (incrementing the semaphore)." Ref: Symbian Developer Library


"binary semaphore" is a programming language circumvent to use a «semaphore» like «mutex». Apparently there are two very big differences:

  1. The way you call each one of them.

  2. The maximum length of the "identifier".


You obviously use mutex to lock a data in one thread getting accessed by another thread at the same time. Assume that you have just called lock() and in the process of accessing data. This means that you don’t expect any other thread (or another instance of the same thread-code) to access the same data locked by the same mutex. That is, if it is the same thread-code getting executed on a different thread instance, hits the lock, then the lock() should block the control flow there. This applies to a thread that uses a different thread-code, which is also accessing the same data and which is also locked by the same mutex. In this case, you are still in the process of accessing the data and you may take, say, another 15 secs to reach the mutex unlock (so that the other thread that is getting blocked in mutex lock would unblock and would allow the control to access the data). Do you at any cost allow yet another thread to just unlock the same mutex, and in turn, allow the thread that is already waiting (blocking) in the mutex lock to unblock and access the data? Hope you got what I am saying here? As per, agreed upon universal definition!,

  • with “mutex” this can’t happen. No other thread can unlock the lock in your thread
  • with “binary-semaphore” this can happen. Any other thread can unlock the lock in your thread

So, if you are very particular about using binary-semaphore instead of mutex, then you should be very careful in “scoping” the locks and unlocks. I mean that every control-flow that hits every lock should hit an unlock call, also there shouldn’t be any “first unlock”, rather it should be always “first lock”.


Though mutex & semaphores are used as synchronization primitives ,there is a big difference between them. In the case of mutex, only the thread that locked or acquired the mutex can unlock it. In the case of a semaphore, a thread waiting on a semaphore can be signaled by a different thread. Some operating system supports using mutex & semaphores between process. Typically usage is creating in shared memory.


In windows the difference is as below. MUTEX: process which successfully executes wait has to execute a signal and vice versa. BINARY SEMAPHORES: Different processes can execute wait or signal operation on a semaphore.


Modified question is - What's the difference between A mutex and a "binary" semaphore in "Linux"?

Ans: Following are the differences – i) Scope – The scope of mutex is within a process address space which has created it and is used for synchronization of threads. Whereas semaphore can be used across process space and hence it can be used for interprocess synchronization.

ii) Mutex is lightweight and faster than semaphore. Futex is even faster.

iii) Mutex can be acquired by same thread successfully multiple times with condition that it should release it same number of times. Other thread trying to acquire will block. Whereas in case of semaphore if same process tries to acquire it again it blocks as it can be acquired only once.


The concept was clear to me after going over above posts. But there were some lingering questions. So, I wrote this small piece of code.

When we try to give a semaphore without taking it, it goes through. But, when you try to give a mutex without taking it, it fails. I tested this on a Windows platform. Enable USE_MUTEX to run the same code using a MUTEX.

#include <stdio.h>
#include <windows.h>
#define xUSE_MUTEX 1
#define MAX_SEM_COUNT 1

DWORD WINAPI Thread_no_1( LPVOID lpParam );
DWORD WINAPI Thread_no_2( LPVOID lpParam );

HANDLE Handle_Of_Thread_1 = 0;
HANDLE Handle_Of_Thread_2 = 0;
int Data_Of_Thread_1 = 1;
int Data_Of_Thread_2 = 2;
HANDLE ghMutex = NULL;
HANDLE ghSemaphore = NULL;


int main(void)
{

#ifdef USE_MUTEX
    ghMutex = CreateMutex( NULL, FALSE, NULL);
    if (ghMutex  == NULL) 
    {
        printf("CreateMutex error: %d\n", GetLastError());
        return 1;
    }
#else
    // Create a semaphore with initial and max counts of MAX_SEM_COUNT
    ghSemaphore = CreateSemaphore(NULL,MAX_SEM_COUNT,MAX_SEM_COUNT,NULL);
    if (ghSemaphore == NULL) 
    {
        printf("CreateSemaphore error: %d\n", GetLastError());
        return 1;
    }
#endif
    // Create thread 1.
    Handle_Of_Thread_1 = CreateThread( NULL, 0,Thread_no_1, &Data_Of_Thread_1, 0, NULL);  
    if ( Handle_Of_Thread_1 == NULL)
    {
        printf("Create first thread problem \n");
        return 1;
    }

    /* sleep for 5 seconds **/
    Sleep(5 * 1000);

    /*Create thread 2 */
    Handle_Of_Thread_2 = CreateThread( NULL, 0,Thread_no_2, &Data_Of_Thread_2, 0, NULL);  
    if ( Handle_Of_Thread_2 == NULL)
    {
        printf("Create second thread problem \n");
        return 1;
    }

    // Sleep for 20 seconds
    Sleep(20 * 1000);

    printf("Out of the program \n");
    return 0;
}


int my_critical_section_code(HANDLE thread_handle)
{

#ifdef USE_MUTEX
    if(thread_handle == Handle_Of_Thread_1)
    {
        /* get the lock */
        WaitForSingleObject(ghMutex, INFINITE);
        printf("Thread 1 holding the mutex \n");
    }
#else
    /* get the semaphore */
    if(thread_handle == Handle_Of_Thread_1)
    {
        WaitForSingleObject(ghSemaphore, INFINITE);
        printf("Thread 1 holding semaphore \n");
    }
#endif

    if(thread_handle == Handle_Of_Thread_1)
    {
        /* sleep for 10 seconds */
        Sleep(10 * 1000);
#ifdef USE_MUTEX
        printf("Thread 1 about to release mutex \n");
#else
        printf("Thread 1 about to release semaphore \n");
#endif
    }
    else
    {
        /* sleep for 3 secconds */
        Sleep(3 * 1000);
    }

#ifdef USE_MUTEX
    /* release the lock*/
    if(!ReleaseMutex(ghMutex))
    {
        printf("Release Mutex error in thread %d: error # %d\n", (thread_handle == Handle_Of_Thread_1 ? 1:2),GetLastError());
    }
#else
    if (!ReleaseSemaphore(ghSemaphore,1,NULL) )      
    {
        printf("ReleaseSemaphore error in thread %d: error # %d\n",(thread_handle == Handle_Of_Thread_1 ? 1:2), GetLastError());
    }
#endif

    return 0;
}

DWORD WINAPI Thread_no_1( LPVOID lpParam ) 
{ 
    my_critical_section_code(Handle_Of_Thread_1);
    return 0;
}


DWORD WINAPI Thread_no_2( LPVOID lpParam ) 
{
    my_critical_section_code(Handle_Of_Thread_2);
    return 0;
}

The very fact that semaphore lets you signal "it is done using a resource", even though it never owned the resource, makes me think there is a very loose coupling between owning and signaling in the case of semaphores.


MUTEX

Until recently, the only sleeping lock in the kernel was the semaphore. Most users of semaphores instantiated a semaphore with a count of one and treated them as a mutual exclusion lock—a sleeping version of the spin-lock. Unfortunately, semaphores are rather generic and do not impose any usage constraints. This makes them useful for managing exclusive access in obscure situations, such as complicated dances between the kernel and userspace. But it also means that simpler locking is harder to do, and the lack of enforced rules makes any sort of automated debugging or constraint enforcement impossible. Seeking a simpler sleeping lock, the kernel developers introduced the mutex.Yes, as you are now accustomed to, that is a confusing name. Let’s clarify.The term “mutex” is a generic name to refer to any sleeping lock that enforces mutual exclusion, such as a semaphore with a usage count of one. In recent Linux kernels, the proper noun “mutex” is now also a specific type of sleeping lock that implements mutual exclusion.That is, a mutex is a mutex.

The simplicity and efficiency of the mutex come from the additional constraints it imposes on its users over and above what the semaphore requires. Unlike a semaphore, which implements the most basic of behaviour in accordance with Dijkstra’s original design, the mutex has a stricter, narrower use case: n Only one task can hold the mutex at a time. That is, the usage count on a mutex is always one.

  1. Whoever locked a mutex must unlock it. That is, you cannot lock a mutex in one context and then unlock it in another. This means that the mutex isn’t suitable for more complicated synchronizations between kernel and user-space. Most use cases, however, cleanly lock and unlock from the same context.
  2. Recursive locks and unlocks are not allowed. That is, you cannot recursively acquire the same mutex, and you cannot unlock an unlocked mutex.
  3. A process cannot exit while holding a mutex.
  4. A mutex cannot be acquired by an interrupt handler or bottom half, even with mutex_trylock().
  5. A mutex can be managed only via the official API: It must be initialized via the methods described in this section and cannot be copied, hand initialized, or reinitialized.

[1] Linux Kernel Development, Third Edition Robert Love


I think most of the answers here were confusing especially those saying that mutex can be released only by the process that holds it but semaphore can be signaled by ay process. The above line is kind of vague in terms of semaphore. To understand we should know that there are two kinds of semaphore one is called counting semaphore and the other is called a binary semaphore. In counting semaphore handles access to n number of resources where n can be defined before the use. Each semaphore has a count variable, which keeps the count of the number of resources in use, initially, it is set to n. Each process that wishes to uses a resource performs a wait() operation on the semaphore (thereby decrementing the count). When a process releases a resource, it performs a release() operation (incrementing the count). When the count becomes 0, all the resources are being used. After that, the process waits until the count becomes more than 0. Now here is the catch only the process that holds the resource can increase the count no other process can increase the count only the processes holding a resource can increase the count and the process waiting for the semaphore again checks and when it sees the resource available it decreases the count again. So in terms of binary semaphore, only the process holding the semaphore can increase the count, and count remains zero until it stops using the semaphore and increases the count and other process gets the chance to access the semaphore.

The main difference between binary semaphore and mutex is that semaphore is a signaling mechanism and mutex is a locking mechanism, but binary semaphore seems to function like mutex that creates confusion, but both are different concepts suitable for a different kinds of work.


Mutex is used to protect the sensitive code and data, semaphore is used to synchronization.You also can have practical use with protect the sensitive code, but there might be a risk that release the protection by the other thread by operation V.So The main difference between bi-semaphore and mutex is the ownership.For instance by toilet , Mutex is like that one can enter the toilet and lock the door, no one else can enter until the man get out, bi-semaphore is like that one can enter the toilet and lock the door, but someone else could enter by asking the administrator to open the door, it's ridiculous.


Mutex work on blocking critical region, But Semaphore work on count.


Modified question is - What's the difference between A mutex and a "binary" semaphore in "Linux"?

Ans: Following are the differences – i) Scope – The scope of mutex is within a process address space which has created it and is used for synchronization of threads. Whereas semaphore can be used across process space and hence it can be used for interprocess synchronization.

ii) Mutex is lightweight and faster than semaphore. Futex is even faster.

iii) Mutex can be acquired by same thread successfully multiple times with condition that it should release it same number of times. Other thread trying to acquire will block. Whereas in case of semaphore if same process tries to acquire it again it blocks as it can be acquired only once.


Mutex work on blocking critical region, But Semaphore work on count.


Mutex are used for " Locking Mechanisms ". one process at a time can use a shared resource

whereas

Semaphores are used for " Signaling Mechanisms " like "I am done , now can continue"


The Toilet example is an enjoyable analogy:

Mutex:

Is a key to a toilet. One person can have the key - occupy the toilet - at the time. When finished, the person gives (frees) the key to the next person in the queue.

Officially: "Mutexes are typically used to serialise access to a section of re-entrant code that cannot be executed concurrently by more than one thread. A mutex object only allows one thread into a controlled section, forcing other threads which attempt to gain access to that section to wait until the first thread has exited from that section." Ref: Symbian Developer Library

(A mutex is really a semaphore with value 1.)

Semaphore:

Is the number of free identical toilet keys. Example, say we have four toilets with identical locks and keys. The semaphore count - the count of keys - is set to 4 at beginning (all four toilets are free), then the count value is decremented as people are coming in. If all toilets are full, ie. there are no free keys left, the semaphore count is 0. Now, when eq. one person leaves the toilet, semaphore is increased to 1 (one free key), and given to the next person in the queue.

Officially: "A semaphore restricts the number of simultaneous users of a shared resource up to a maximum number. Threads can request access to the resource (decrementing the semaphore), and can signal that they have finished using the resource (incrementing the semaphore)." Ref: Symbian Developer Library


Mutex

Mutexes are typically used to serialise access to a section of re-entrant code that cannot be executed concurrently by more than one thread. A mutex object only allows one thread into a controlled section, forcing other threads which attempt to gain access to that section to wait until the first thread has exited from that section.Proper use of a mutex is to protect a shared resource can have a dangerous unintended side effect. Any two RTOS tasks that operate at different priorities and coordinate via a mutex, create the opportunity for priority inversion. Mutex works in user space.

Semaphore

Semaphore is a signalling mechanism. Semaphore restricts the number of simultaneous users of a shared resource up to a maximum number. Threads can request access to the resource (decrementing the semaphore) and can signal that they have finished using the resource (incrementing the semaphore). It allows number of thread to access shared resources.The correct use of a semaphore is for signaling from one task to another.semaphores can also be used to signal from an interrupt service routine (ISR) to a task. Signaling a semaphore is a non-blocking RTOS behavior and thus ISR safe. Because this technique eliminates the error-prone need to disable interrupts at the task level.This works in kernel space.


On Windows, there are two differences between mutexes and binary semaphores:

  1. A mutex can only be released by the thread which has ownership, i.e. the thread which previously called the Wait function, (or which took ownership when creating it). A semaphore can be released by any thread.

  2. A thread can call a wait function repeatedly on a mutex without blocking. However, if you call a wait function twice on a binary semaphore without releasing the semaphore in between, the thread will block.


Mutex

Mutexes are typically used to serialise access to a section of re-entrant code that cannot be executed concurrently by more than one thread. A mutex object only allows one thread into a controlled section, forcing other threads which attempt to gain access to that section to wait until the first thread has exited from that section.Proper use of a mutex is to protect a shared resource can have a dangerous unintended side effect. Any two RTOS tasks that operate at different priorities and coordinate via a mutex, create the opportunity for priority inversion. Mutex works in user space.

Semaphore

Semaphore is a signalling mechanism. Semaphore restricts the number of simultaneous users of a shared resource up to a maximum number. Threads can request access to the resource (decrementing the semaphore) and can signal that they have finished using the resource (incrementing the semaphore). It allows number of thread to access shared resources.The correct use of a semaphore is for signaling from one task to another.semaphores can also be used to signal from an interrupt service routine (ISR) to a task. Signaling a semaphore is a non-blocking RTOS behavior and thus ISR safe. Because this technique eliminates the error-prone need to disable interrupts at the task level.This works in kernel space.


The Toilet example is an enjoyable analogy:

Mutex:

Is a key to a toilet. One person can have the key - occupy the toilet - at the time. When finished, the person gives (frees) the key to the next person in the queue.

Officially: "Mutexes are typically used to serialise access to a section of re-entrant code that cannot be executed concurrently by more than one thread. A mutex object only allows one thread into a controlled section, forcing other threads which attempt to gain access to that section to wait until the first thread has exited from that section." Ref: Symbian Developer Library

(A mutex is really a semaphore with value 1.)

Semaphore:

Is the number of free identical toilet keys. Example, say we have four toilets with identical locks and keys. The semaphore count - the count of keys - is set to 4 at beginning (all four toilets are free), then the count value is decremented as people are coming in. If all toilets are full, ie. there are no free keys left, the semaphore count is 0. Now, when eq. one person leaves the toilet, semaphore is increased to 1 (one free key), and given to the next person in the queue.

Officially: "A semaphore restricts the number of simultaneous users of a shared resource up to a maximum number. Threads can request access to the resource (decrementing the semaphore), and can signal that they have finished using the resource (incrementing the semaphore)." Ref: Symbian Developer Library


At a theoretical level, they are no different semantically. You can implement a mutex using semaphores or vice versa (see here for an example). In practice, the implementations are different and they offer slightly different services.

The practical difference (in terms of the system services surrounding them) is that the implementation of a mutex is aimed at being a more lightweight synchronisation mechanism. In oracle-speak, mutexes are known as latches and semaphores are known as waits.

At the lowest level, they use some sort of atomic test and set mechanism. This reads the current value of a memory location, computes some sort of conditional and writes out a value at that location in a single instruction that cannot be interrupted. This means that you can acquire a mutex and test to see if anyone else had it before you.

A typical mutex implementation has a process or thread executing the test-and-set instruction and evaluating whether anything else had set the mutex. A key point here is that there is no interaction with the scheduler, so we have no idea (and don't care) who has set the lock. Then we either give up our time slice and attempt it again when the task is re-scheduled or execute a spin-lock. A spin lock is an algorithm like:

Count down from 5000:
     i. Execute the test-and-set instruction
    ii. If the mutex is clear, we have acquired it in the previous instruction 
        so we can exit the loop
   iii. When we get to zero, give up our time slice.

When we have finished executing our protected code (known as a critical section) we just set the mutex value to zero or whatever means 'clear.' If multiple tasks are attempting to acquire the mutex then the next task that happens to be scheduled after the mutex is released will get access to the resource. Typically you would use mutexes to control a synchronised resource where exclusive access is only needed for very short periods of time, normally to make an update to a shared data structure.

A semaphore is a synchronised data structure (typically using a mutex) that has a count and some system call wrappers that interact with the scheduler in a bit more depth than the mutex libraries would. Semaphores are incremented and decremented and used to block tasks until something else is ready. See Producer/Consumer Problem for a simple example of this. Semaphores are initialised to some value - a binary semaphore is just a special case where the semaphore is initialised to 1. Posting to a semaphore has the effect of waking up a waiting process.

A basic semaphore algorithm looks like:

(somewhere in the program startup)
Initialise the semaphore to its start-up value.

Acquiring a semaphore
   i. (synchronised) Attempt to decrement the semaphore value
  ii. If the value would be less than zero, put the task on the tail of the list of tasks waiting on the semaphore and give up the time slice.

Posting a semaphore
   i. (synchronised) Increment the semaphore value
  ii. If the value is greater or equal to the amount requested in the post at the front of the queue, take that task off the queue and make it runnable.  
 iii. Repeat (ii) for all tasks until the posted value is exhausted or there are no more tasks waiting.

In the case of a binary semaphore the main practical difference between the two is the nature of the system services surrounding the actual data structure.

EDIT: As evan has rightly pointed out, spinlocks will slow down a single processor machine. You would only use a spinlock on a multi-processor box because on a single processor the process holding the mutex will never reset it while another task is running. Spinlocks are only useful on multi-processor architectures.


Their synchronization semantics are very different:

  • mutexes allow serialization of access to a given resource i.e. multiple threads wait for a lock, one at a time and as previously said, the thread owns the lock until it is done: only this particular thread can unlock it.
  • a binary semaphore is a counter with value 0 and 1: a task blocking on it until any task does a sem_post. The semaphore advertises that a resource is available, and it provides the mechanism to wait until it is signaled as being available.

As such one can see a mutex as a token passed from task to tasks and a semaphore as traffic red-light (it signals someone that it can proceed).


Though mutex & semaphores are used as synchronization primitives ,there is a big difference between them. In the case of mutex, only the thread that locked or acquired the mutex can unlock it. In the case of a semaphore, a thread waiting on a semaphore can be signaled by a different thread. Some operating system supports using mutex & semaphores between process. Typically usage is creating in shared memory.


Their synchronization semantics are very different:

  • mutexes allow serialization of access to a given resource i.e. multiple threads wait for a lock, one at a time and as previously said, the thread owns the lock until it is done: only this particular thread can unlock it.
  • a binary semaphore is a counter with value 0 and 1: a task blocking on it until any task does a sem_post. The semaphore advertises that a resource is available, and it provides the mechanism to wait until it is signaled as being available.

As such one can see a mutex as a token passed from task to tasks and a semaphore as traffic red-light (it signals someone that it can proceed).


The concept was clear to me after going over above posts. But there were some lingering questions. So, I wrote this small piece of code.

When we try to give a semaphore without taking it, it goes through. But, when you try to give a mutex without taking it, it fails. I tested this on a Windows platform. Enable USE_MUTEX to run the same code using a MUTEX.

#include <stdio.h>
#include <windows.h>
#define xUSE_MUTEX 1
#define MAX_SEM_COUNT 1

DWORD WINAPI Thread_no_1( LPVOID lpParam );
DWORD WINAPI Thread_no_2( LPVOID lpParam );

HANDLE Handle_Of_Thread_1 = 0;
HANDLE Handle_Of_Thread_2 = 0;
int Data_Of_Thread_1 = 1;
int Data_Of_Thread_2 = 2;
HANDLE ghMutex = NULL;
HANDLE ghSemaphore = NULL;


int main(void)
{

#ifdef USE_MUTEX
    ghMutex = CreateMutex( NULL, FALSE, NULL);
    if (ghMutex  == NULL) 
    {
        printf("CreateMutex error: %d\n", GetLastError());
        return 1;
    }
#else
    // Create a semaphore with initial and max counts of MAX_SEM_COUNT
    ghSemaphore = CreateSemaphore(NULL,MAX_SEM_COUNT,MAX_SEM_COUNT,NULL);
    if (ghSemaphore == NULL) 
    {
        printf("CreateSemaphore error: %d\n", GetLastError());
        return 1;
    }
#endif
    // Create thread 1.
    Handle_Of_Thread_1 = CreateThread( NULL, 0,Thread_no_1, &Data_Of_Thread_1, 0, NULL);  
    if ( Handle_Of_Thread_1 == NULL)
    {
        printf("Create first thread problem \n");
        return 1;
    }

    /* sleep for 5 seconds **/
    Sleep(5 * 1000);

    /*Create thread 2 */
    Handle_Of_Thread_2 = CreateThread( NULL, 0,Thread_no_2, &Data_Of_Thread_2, 0, NULL);  
    if ( Handle_Of_Thread_2 == NULL)
    {
        printf("Create second thread problem \n");
        return 1;
    }

    // Sleep for 20 seconds
    Sleep(20 * 1000);

    printf("Out of the program \n");
    return 0;
}


int my_critical_section_code(HANDLE thread_handle)
{

#ifdef USE_MUTEX
    if(thread_handle == Handle_Of_Thread_1)
    {
        /* get the lock */
        WaitForSingleObject(ghMutex, INFINITE);
        printf("Thread 1 holding the mutex \n");
    }
#else
    /* get the semaphore */
    if(thread_handle == Handle_Of_Thread_1)
    {
        WaitForSingleObject(ghSemaphore, INFINITE);
        printf("Thread 1 holding semaphore \n");
    }
#endif

    if(thread_handle == Handle_Of_Thread_1)
    {
        /* sleep for 10 seconds */
        Sleep(10 * 1000);
#ifdef USE_MUTEX
        printf("Thread 1 about to release mutex \n");
#else
        printf("Thread 1 about to release semaphore \n");
#endif
    }
    else
    {
        /* sleep for 3 secconds */
        Sleep(3 * 1000);
    }

#ifdef USE_MUTEX
    /* release the lock*/
    if(!ReleaseMutex(ghMutex))
    {
        printf("Release Mutex error in thread %d: error # %d\n", (thread_handle == Handle_Of_Thread_1 ? 1:2),GetLastError());
    }
#else
    if (!ReleaseSemaphore(ghSemaphore,1,NULL) )      
    {
        printf("ReleaseSemaphore error in thread %d: error # %d\n",(thread_handle == Handle_Of_Thread_1 ? 1:2), GetLastError());
    }
#endif

    return 0;
}

DWORD WINAPI Thread_no_1( LPVOID lpParam ) 
{ 
    my_critical_section_code(Handle_Of_Thread_1);
    return 0;
}


DWORD WINAPI Thread_no_2( LPVOID lpParam ) 
{
    my_critical_section_code(Handle_Of_Thread_2);
    return 0;
}

The very fact that semaphore lets you signal "it is done using a resource", even though it never owned the resource, makes me think there is a very loose coupling between owning and signaling in the case of semaphores.


On Windows, there are two differences between mutexes and binary semaphores:

  1. A mutex can only be released by the thread which has ownership, i.e. the thread which previously called the Wait function, (or which took ownership when creating it). A semaphore can be released by any thread.

  2. A thread can call a wait function repeatedly on a mutex without blocking. However, if you call a wait function twice on a binary semaphore without releasing the semaphore in between, the thread will block.


The answer may depend on the target OS. For example, at least one RTOS implementation I'm familiar with will allow multiple sequential "get" operations against a single OS mutex, so long as they're all from within the same thread context. The multiple gets must be replaced by an equal number of puts before another thread will be allowed to get the mutex. This differs from binary semaphores, for which only a single get is allowed at a time, regardless of thread contexts.

The idea behind this type of mutex is that you protect an object by only allowing a single context to modify the data at a time. Even if the thread gets the mutex and then calls a function that further modifies the object (and gets/puts the protector mutex around its own operations), the operations should still be safe because they're all happening under a single thread.

{
    mutexGet();  // Other threads can no longer get the mutex.

    // Make changes to the protected object.
    // ...

    objectModify();  // Also gets/puts the mutex.  Only allowed from this thread context.

    // Make more changes to the protected object.
    // ...

    mutexPut();  // Finally allows other threads to get the mutex.
}

Of course, when using this feature, you must be certain that all accesses within a single thread really are safe!

I'm not sure how common this approach is, or whether it applies outside of the systems with which I'm familiar. For an example of this kind of mutex, see the ThreadX RTOS.


At a theoretical level, they are no different semantically. You can implement a mutex using semaphores or vice versa (see here for an example). In practice, the implementations are different and they offer slightly different services.

The practical difference (in terms of the system services surrounding them) is that the implementation of a mutex is aimed at being a more lightweight synchronisation mechanism. In oracle-speak, mutexes are known as latches and semaphores are known as waits.

At the lowest level, they use some sort of atomic test and set mechanism. This reads the current value of a memory location, computes some sort of conditional and writes out a value at that location in a single instruction that cannot be interrupted. This means that you can acquire a mutex and test to see if anyone else had it before you.

A typical mutex implementation has a process or thread executing the test-and-set instruction and evaluating whether anything else had set the mutex. A key point here is that there is no interaction with the scheduler, so we have no idea (and don't care) who has set the lock. Then we either give up our time slice and attempt it again when the task is re-scheduled or execute a spin-lock. A spin lock is an algorithm like:

Count down from 5000:
     i. Execute the test-and-set instruction
    ii. If the mutex is clear, we have acquired it in the previous instruction 
        so we can exit the loop
   iii. When we get to zero, give up our time slice.

When we have finished executing our protected code (known as a critical section) we just set the mutex value to zero or whatever means 'clear.' If multiple tasks are attempting to acquire the mutex then the next task that happens to be scheduled after the mutex is released will get access to the resource. Typically you would use mutexes to control a synchronised resource where exclusive access is only needed for very short periods of time, normally to make an update to a shared data structure.

A semaphore is a synchronised data structure (typically using a mutex) that has a count and some system call wrappers that interact with the scheduler in a bit more depth than the mutex libraries would. Semaphores are incremented and decremented and used to block tasks until something else is ready. See Producer/Consumer Problem for a simple example of this. Semaphores are initialised to some value - a binary semaphore is just a special case where the semaphore is initialised to 1. Posting to a semaphore has the effect of waking up a waiting process.

A basic semaphore algorithm looks like:

(somewhere in the program startup)
Initialise the semaphore to its start-up value.

Acquiring a semaphore
   i. (synchronised) Attempt to decrement the semaphore value
  ii. If the value would be less than zero, put the task on the tail of the list of tasks waiting on the semaphore and give up the time slice.

Posting a semaphore
   i. (synchronised) Increment the semaphore value
  ii. If the value is greater or equal to the amount requested in the post at the front of the queue, take that task off the queue and make it runnable.  
 iii. Repeat (ii) for all tasks until the posted value is exhausted or there are no more tasks waiting.

In the case of a binary semaphore the main practical difference between the two is the nature of the system services surrounding the actual data structure.

EDIT: As evan has rightly pointed out, spinlocks will slow down a single processor machine. You would only use a spinlock on a multi-processor box because on a single processor the process holding the mutex will never reset it while another task is running. Spinlocks are only useful on multi-processor architectures.


Apart from the fact that mutexes have an owner, the two objects may be optimized for different usage. Mutexes are designed to be held only for a short time; violating this can cause poor performance and unfair scheduling. For example, a running thread may be permitted to acquire a mutex, even though another thread is already blocked on it. Semaphores may provide more fairness, or fairness can be forced using several condition variables.


Examples related to operating-system

Context.startForegroundService() did not then call Service.startForeground() Fork() function in C python: get directory two levels up Find Process Name by its Process ID Best way to find os name and version in Unix/Linux platform How to run a program without an operating system? How to make parent wait for all child processes to finish? Get operating system info Running windows shell commands with python What are the differences between virtual memory and physical memory?

Examples related to mutex

What is the Swift equivalent to Objective-C's "@synchronized"? Mutex lock threads When should one use a spinlock instead of mutex? Is there a Mutex in Java? Mutex example / tutorial? When should we use mutex and when should we use semaphore Proper use of mutexes in Python Lock, mutex, semaphore... what's the difference? Example for boost shared_mutex (multiple reads/one write)? What is mutex and semaphore in Java ? What is the main difference?

Examples related to semaphore

semaphore implementation "The semaphore timeout period has expired" error for USB connection Semaphore vs. Monitors - what's the difference? Is there a Mutex in Java? When should we use mutex and when should we use semaphore Lock, mutex, semaphore... what's the difference? Delete all SYSTEM V shared memory and semaphores on UNIX-like systems What is mutex and semaphore in Java ? What is the main difference? Difference between binary semaphore and mutex What is a semaphore?

Examples related to binary-semaphore

Difference between binary semaphore and mutex