Friday, July 22, 2016

How important is thread-safety, and how do you implement it, in a Java concurrency / multithreading environment?

In a multi-threaded / concurrent environment, thread-safety is a very important consideration. If thread-safety is not implemented, shared data can become inconsistent.
Let’s take a look at the following example:
public class Parent {
     private static int counter = 0;
    
     public int getCount(long millis){
        
         try {
              // Pretend heavy-loading job here
              Thread.sleep(millis);
         } catch (InterruptedException e) {
         }
         return ++counter;
     }
    
     class Child extends Thread{
         private final String name;
         private final Parent counter;
         private final long millis;
        
         public Child(final String name, final Parent counter, final long millis){
              this.name = name;
              this.counter = counter;
              this.millis = millis;
         }
        
         public void run(){
              for (int i = 0; i < 50; i++){
                  int oCounter = counter.getCount(millis);
                  System.out.println(name + ": " + oCounter);
              }
         }
     }
    
     public static void main(String[] args){
         Parent counterA = new Parent();
         Child a = counterA.new Child("A", counterA, 5);
         Child b = counterA.new Child("B", counterA, 10);
        
         a.start();
         b.start();
        
     }
}
In the above example, there is a Child class. The run() method of the Child class calls the getCount() method of the Parent class. Each time getCount() is called, the counter is incremented by 1 and the latest value is returned to the caller. The run() method of the Child class then prints the value after each increment.
There are 3 implementation features to be aware of in this example:
(1)   From the main() method, we can see that 2 Child threads are created and both refer to the same instance of the Parent class.
public static void main(String[] args){
         Parent counterA = new Parent();
         Child a = counterA.new Child("A", counterA, 5);
         Child b = counterA.new Child("B", counterA, 10);
(2)   The run() method of the Child class calls the getCount() method of the Parent class 50 times.
     public void run(){
         for (int i = 0; i < 50; i++){
              int oCounter = counter.getCount(millis);
              System.out.println(name + ": " + oCounter);
         }
     }
(3)   The getCount() method of the Parent class sleeps for a while, based on the provided millis value, to simulate a heavy-loading process.
     public int getCount(long millis){
         try {
              // Pretend heavy-loading job here
              Thread.sleep(millis);
         } catch (InterruptedException e) {   }
         return ++counter;
     }
We would expect the final value of the counter to be 100 (2 threads x 50 increments).
However, after several executions, you will probably find that the final value varies between 98, 99 and 100. Sometimes it is even 96 or 97.

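This happens because ++counter is not a single atomic step: it reads the counter, adds 1, and writes the result back. When Thread 1 (e.g. Child a) enters the method at t1, Thread 2 (e.g. Child b) may not have finished writing its own update yet. Thread 1 then reads the value that has not been increased, both threads write the same number, and one increment is lost.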
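To make the interleaving concrete, the sketch below replays one unlucky ordering step by step in a single thread. The class name LostUpdateReplay and the local variables readByA and readByB are made up for illustration; they stand in for the two Child threads each reading the counter before the other has written it back.

```java
// Hypothetical single-threaded replay of the interleaving described above.
// ++counter is really three steps: read, add 1, write back.
class LostUpdateReplay {
    static int counter = 0;

    // Replays one unlucky interleaving of two threads and returns the result.
    static int replay() {
        counter = 0;
        int readByA = counter;   // thread A reads 0
        int readByB = counter;   // thread B reads 0 before A has written
        counter = readByA + 1;   // thread A writes 1
        counter = readByB + 1;   // thread B also writes 1: A's increment is lost
        return counter;
    }

    public static void main(String[] args) {
        System.out.println("Two increments, final value: " + replay()); // prints 1
    }
}
```

Two increments were requested but the counter only moved by one, which is the kind of gap that shows up as 98 or 99 instead of 100.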
 
To avoid this race condition, we can use either of the following object-level synchronization approaches:
Approach 1 - method-synchronization
Add “synchronized” to the method to make it thread-safe. A synchronized non-static method locks on the instance it is called on, so only one thread at a time can execute this method on this Parent class instance. Because both threads here share a single Parent instance, that is enough to protect the static variable. Note that the lock is held for the whole method, including the sleep.
public synchronized int getCount(long millis){
    
     try {
         // Pretend heavy-loading job here
         Thread.sleep(millis);
     } catch (InterruptedException e) {
     }
     return ++counter;
}
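Which lock a synchronized method takes depends on whether the method is static. The sketch below checks this with Thread.holdsLock(); MonitorProbe and its method names are hypothetical and not part of the Parent example.

```java
// Hypothetical helper class, not part of the Parent example.
class MonitorProbe {
    // A synchronized instance method locks the instance it is called on.
    synchronized boolean instanceMethodHoldsThis() {
        return Thread.holdsLock(this);
    }

    // A synchronized instance method does not lock the Class object.
    synchronized boolean instanceMethodHoldsClass() {
        return Thread.holdsLock(MonitorProbe.class);
    }

    // A static synchronized method locks the Class object instead.
    static synchronized boolean staticMethodHoldsClass() {
        return Thread.holdsLock(MonitorProbe.class);
    }

    public static void main(String[] args) {
        MonitorProbe probe = new MonitorProbe();
        System.out.println(probe.instanceMethodHoldsThis());       // true
        System.out.println(probe.instanceMethodHoldsClass());      // false
        System.out.println(MonitorProbe.staticMethodHoldsClass()); // true
    }
}
```

So the synchronized getCount() above is guarded by the Parent instance, not by the Parent class, even though the variable it protects is static.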
Approach 2 – block-synchronization
Wrap the block that operates on the static variable in “synchronized(this) { .. }”.
Block-synchronization can be placed in 2 places: (1) the callee class (i.e. the Parent class here); (2) the caller class (i.e. the Child class).
(1)   Callee class (i.e. Parent Class here)
In the callee class, we can implement the block-synchronization inside the getCount() method, around the operation on the static variable:
public int getCount(long millis){
    
     try {
         // Pretend heavy-loading job here
         Thread.sleep(millis);
     } catch (InterruptedException e) {   }
    
     synchronized(this){
         // This variable is thread-safe now
         return ++counter;
     }
}
(2)   Caller class (i.e. Child class here)
In the caller class, we can implement the block-synchronization inside the run() method by wrapping the call to the callee method that touches the static variable (i.e. getCount() of the Parent class):
public void run(){
     for (int i = 0; i < 50; i++){
         // synchronized (this){ // confusing usage (and incorrect in this example)
         synchronized (Parent.this){ // Precise usage
             
              // operation on getCount() is thread-safe now
              int oCounter = counter.getCount(millis);

              System.out.println(name + ": " + oCounter);
         }
     }
}
There is one point to bear in mind regarding the (2) Caller class approach. As the block synchronization is located in the caller’s class instead of the callee, the lock needs to be stated explicitly as synchronized(Parent.this) (i.e. synchronized(<Callee>.this)). Otherwise synchronized(this) would use the instance-level lock of each Child thread object, so each thread would hold a different lock, instead of using the only Parent class instance as the “lock”.
This is one of the common mistakes when implementing a thread-safe application with a nested-class design.
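The difference between the two lock objects can be checked directly. In the sketch below, LockChoiceDemo and Worker are hypothetical stand-ins for Parent and Child, reduced to just the objects that synchronized would lock on.

```java
// Hypothetical stand-ins for Parent and Child, reduced to the lock objects.
class LockChoiceDemo {
    class Worker {
        // The monitor that synchronized(this) would use inside Worker.
        Object ownMonitor() {
            return this;
        }

        // The monitor that synchronized(LockChoiceDemo.this) would use inside Worker.
        Object outerMonitor() {
            return LockChoiceDemo.this;
        }
    }

    public static void main(String[] args) {
        LockChoiceDemo outer = new LockChoiceDemo();
        Worker a = outer.new Worker();
        Worker b = outer.new Worker();
        // Each Worker has its own monitor, so the two would not exclude each other.
        System.out.println(a.ownMonitor() == b.ownMonitor());     // false
        // Both share the enclosing instance, so this lock is common to both.
        System.out.println(a.outerMonitor() == b.outerMonitor()); // true
        System.out.println(a.outerMonitor() == outer);            // true
    }
}
```

Two threads only exclude each other when they synchronize on the same object, which is why the enclosing instance works here and the thread's own instance does not.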
After modification, the final version, with block-synchronization applied in both the callee and the caller, is as below:
public class Parent {
     private static int counter = 0;
    
     public int getCount(long millis){
        
         try {
              // Pretend heavy-loading job here
              Thread.sleep(millis);
         } catch (InterruptedException e) {
         }
        
         synchronized(this){
              // This variable is thread-safe now
              return ++counter;
         }
     }
    
     class Child extends Thread{
         private final String name;
         private final Parent counter;
         private final long millis;
        
         public Child(final String name, final Parent counter, final long millis){
              this.name = name;
              this.counter = counter;
              this.millis = millis;
         }
        
         public void run(){
              for (int i = 0; i < 50; i++){
                  synchronized (Parent.this){
                       // operation on getCount() is thread-safe now
                       int oCounter = counter.getCount(millis);
                       System.out.println(name + ": " + oCounter);
                  }
              }
         }
     }
    
     public static void main(String[] args){
        
         Parent counterA = new Parent();
         Child a = counterA.new Child("A", counterA, 5);
         Child b = counterA.new Child("B", counterA, 10);
        
         a.start();
         b.start();
     }
}
Conclusion
Personally, I prefer block-synchronization on the callee class for thread-safety, as it has advantages over the other 2 approaches. Compared with them, this approach can:
(1)   Guarantee that only one caller thread at a time can execute the operations on the static variable;
(2)   Minimize the area of code being locked. Locking has a time cost, so minimizing the locked area theoretically maximizes the performance of the program.
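The effect of the locked area can be observed with a small timing harness. The sketch below is only an illustration under assumptions: LockScopeDemo, its Call interface and the 5 ms sleep are made up, the counter is an instance field for simplicity, and the measured times will vary from machine to machine.

```java
// Hypothetical timing harness; names are illustrative, not from the example above.
class LockScopeDemo {
    private int counter = 0;

    interface Call {
        int on(LockScopeDemo demo);
    }

    private static void pretendWork(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt visible to callers
        }
    }

    // Method-synchronization: the lock is held during the sleep as well.
    synchronized int wholeMethod(long millis) {
        pretendWork(millis);
        return ++counter;
    }

    // Callee block-synchronization: the lock covers only the increment.
    int smallBlock(long millis) {
        pretendWork(millis);
        synchronized (this) {
            return ++counter;
        }
    }

    // Invokes the call 20 times from each of two threads; returns {final count, elapsed ms}.
    static long[] run(Call call) {
        LockScopeDemo demo = new LockScopeDemo();
        Runnable loop = () -> {
            for (int i = 0; i < 20; i++) {
                call.on(demo);
            }
        };
        Thread a = new Thread(loop);
        Thread b = new Thread(loop);
        long start = System.nanoTime();
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException("interrupted while waiting", e);
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        return new long[] { demo.counter, elapsedMs };
    }

    public static void main(String[] args) {
        long[] whole = run(d -> d.wholeMethod(5));
        long[] small = run(d -> d.smallBlock(5));
        System.out.println("whole method: count=" + whole[0] + ", ms=" + whole[1]);
        System.out.println("small block:  count=" + small[0] + ", ms=" + small[1]);
    }
}
```

Both variants should report a count of 40. The whole-method variant is expected to take roughly twice as long, because the two threads cannot sleep at the same time while one of them holds the lock.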

The above approaches should be enough to guarantee thread-safety in most cases. However, they are still not enough in some circumstances. One obvious circumstance is when more than one instance of the callee is created: synchronized(this) then locks a different object per instance, while the static counter is shared by all of them.
For this circumstance, you may also be interested in “What the difference are between synchronized this vs class?” for more details on the difference between synchronized(this) and synchronized(class) in a concurrent environment.
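As a rough sketch of the multi-instance case, the variant below guards the static counter with the Class object, so the lock is shared however many instances exist. SharedCounter and runTwoInstances() are hypothetical names, and this is one possible arrangement rather than the only one.

```java
// Hypothetical variant of the Parent counter, guarded by a class-level lock.
class SharedCounter {
    private static int counter = 0;

    // Every instance locks the same Class object, so the static counter stays
    // protected even when each thread uses its own SharedCounter instance.
    int increment() {
        synchronized (SharedCounter.class) {
            return ++counter;
        }
    }

    // Starts two threads, each with its own instance, and returns how many
    // increments were recorded in total.
    static int runTwoInstances(int perThread) {
        int before;
        synchronized (SharedCounter.class) {
            before = counter;
        }
        Runnable work = () -> {
            SharedCounter own = new SharedCounter(); // one instance per thread
            for (int i = 0; i < perThread; i++) {
                own.increment();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException("interrupted while waiting", e);
        }
        synchronized (SharedCounter.class) {
            return counter - before;
        }
    }

    public static void main(String[] args) {
        System.out.println(runTwoInstances(10_000)); // prints 20000
    }
}
```

With synchronized(this) in place of synchronized(SharedCounter.class), each thread would lock only its own instance and increments could be lost again.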
