There are 3 main issues with multithreading:
1) Race Conditions
2) Caching / stale memory
3) Complier and CPU optimisations
volatile
can solve 2 & 3, but can't solve 1. synchronized
/explicit locks can solve 1, 2 & 3.
1) Consider this thread unsafe code:
x++;
While it may look like one operation, it's actually 3: reading the current value of x from memory, adding 1 to it, and saving it back to memory. If few threads try to do it at the same time, the result of the operation is undefined. If x
originally was 1, after 2 threads operating the code it may be 2 and it may be 3, depending on which thread completed which part of the operation before control was transferred to the other thread. This is a form of race condition.
Using synchronized
on a block of code makes it atomic - meaning it make it as if the 3 operations happen at once, and there's no way for another thread to come in the middle and interfere. So if x
was 1, and 2 threads try to preform x++
we know in the end it will be equal to 3. So it solves the race condition problem.
synchronized (this) {
x++; // no problem now
}
Marking x
as volatile
does not make x++;
atomic, so it doesn't solve this problem.
2) In addition, threads have their own context - i.e. they can cache values from main memory. That means that a few threads can have copies of a variable, but they operate on their working copy without sharing the new state of the variable among other threads.
Consider that on one thread, x = 10;
. And somewhat later, in another thread, x = 20;
. The change in value of x
might not appear in the first thread, because the other thread has saved the new value to its working memory, but hasn't copied it to the main memory. Or that it did copy it to the main memory, but the first thread hasn't updated its working copy. So if now the first thread checks if (x == 20)
the answer will be false
.
Marking a variable as volatile
basically tells all threads to do read and write operations on main memory only. synchronized
tells every thread to go update their value from main memory when they enter the block, and flush the result back to main memory when they exit the block.
Note that unlike data races, stale memory is not so easy to (re)produce, as flushes to main memory occur anyway.
3) The complier and CPU can (without any form of synchronization between threads) treat all code as single threaded. Meaning it can look at some code, that is very meaningful in a multithreading aspect, and treat it as if it’s single threaded, where it’s not so meaningful. So it can look at a code and decide, in sake of optimisation, to reorder it, or even remove parts of it completely, if it doesn’t know that this code is designed to work on multiple threads.
Consider the following code:
boolean b = false;
int x = 10;
void threadA() {
x = 20;
b = true;
}
void threadB() {
if (b) {
System.out.println(x);
}
}
You would think that threadB could only print 20 (or not print anything at all if threadB if-check is executed before setting b
to true), as b
is set to true only after x
is set to 20, but the compiler/CPU might decide to reorder threadA, in that case threadB could also print 10. Marking b
as volatile
ensures that it won’t be reordered (or discarded in certain cases). Which mean threadB could only print 20 (or nothing at all). Marking the methods as syncrhonized will achieve the same result. Also marking a variable as volatile
only ensures that it won’t get reordered, but everything before/after it can still be reordered, so synchronization can be more suited in some scenarios.
Note that before Java 5 New Memory Model, volatile didn’t solve this issue.