Java Concurrency Tidbits



Threads



Threads allow multiple streams of program control flow to coexist within a process. They share process-wide resources such as memory and file handles, but each thread has its own program counter, stack, and local variables. 

Threads also provide a natural decomposition for exploiting hardware parallelism on multiprocessor systems; multiple threads within the same program can be scheduled simultaneously on multiple CPUs.

Threads are sometimes called lightweight processes, and most modern operating systems treat
threads, not processes, as the basic units of scheduling. In the absence of explicit coordination,
threads execute simultaneously and asynchronously with respect to one another. Since threads
share the memory address space of their owning process, all threads within a process have access
to the same variables and allocate objects from the same heap, which allows finer-grained data
sharing than inter-process mechanisms. But without explicit synchronization to coordinate access to
shared data, a thread may modify variables that another thread is in the middle of using, with
unpredictable results.





  • Threads are heavyweight objects.
  • The number of threads in a program should be roughly proportional to the number of processors you have.
  • Threads are useful in server applications for improving resource utilization and throughput.
  • 34:35 - disadvantages of blocking: if a lock is not available, your thread is suspended and woken up again when the lock becomes available. This costs two context switches.
  • Synchronizer: any class that manages the control flow of other threads.
  • 38:55 - Concurrency Utilities in JDK 5.0 compared with synchronized:
  • When you start waiting for an intrinsic lock you must be ready to wait forever - there is no way to interrupt the wait.
  • Acquire and release calls need to happen in the same block - the lock cannot be acquired in one method and released in another.
  • A Semaphore is used to control the number of concurrent threads that are using a resource (see the sketch below).
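A minimal sketch of that last point, assuming an illustrative limit of three concurrent users of the resource (the class and method names are made up for the example):

    import java.util.concurrent.Semaphore;

    public class ResourceLimiter {
        // Illustrative assumption: at most 3 threads may use the resource at once.
        private final Semaphore permits = new Semaphore(3);

        public void useResource() throws InterruptedException {
            permits.acquire();        // blocks until a permit is free
            try {
                // work with the shared resource here
            } finally {
                permits.release();    // always return the permit
            }
        }
    }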




Concurrency


Java 5 - Coarse-grained concurrency

Java 7 - Finer-grained parallelism - influenced by hardware design evolution; core counts continue to increase.
  • Data-intensive tasks - sorting, searching - are split up to get the answer faster (see the fork/join sketch below).
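Java 7's finer-grained parallelism surfaces as the fork/join framework. A minimal sketch of a data-intensive task (summing an array recursively); the threshold and task name are illustrative assumptions:

    import java.util.concurrent.ForkJoinPool;
    import java.util.concurrent.RecursiveTask;

    // Sums a long[] by splitting it until the chunks are small enough to sum directly.
    class SumTask extends RecursiveTask<Long> {
        private static final int THRESHOLD = 10_000;  // illustrative cutoff
        private final long[] data;
        private final int from, to;

        SumTask(long[] data, int from, int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                long sum = 0;
                for (int i = from; i < to; i++) sum += data[i];
                return sum;
            }
            int mid = (from + to) >>> 1;
            SumTask left = new SumTask(data, from, mid);
            left.fork();                                          // run the left half asynchronously
            long rightSum = new SumTask(data, mid, to).compute(); // compute the right half in this thread
            return rightSum + left.join();                        // combine when the left half is done
        }
    }

    // Usage: long total = new ForkJoinPool().invoke(new SumTask(array, 0, array.length));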



Thread-Safe Collections




  • ConcurrentHashMap - multiple reads can overlap; the map as a whole cannot be locked for exclusive access.
  • Queue - a list-like collection designed for holding elements prior to processing.
  • BlockingQueue - blocks on put() when full and on take() when empty (30:00 in the Concurrency Utilities JavaPolis talk; see the sketch below).
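A minimal producer/consumer sketch of a bounded blocking queue; the capacity, item format, and thread structure are illustrative:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    public class ProducerConsumer {
        public static void main(String[] args) {
            BlockingQueue<String> queue = new ArrayBlockingQueue<>(10); // bounded capacity

            Thread producer = new Thread(() -> {
                try {
                    for (int i = 0; i < 100; i++) {
                        queue.put("item-" + i);          // blocks when the queue is full
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            Thread consumer = new Thread(() -> {
                try {
                    while (true) {
                        System.out.println(queue.take()); // blocks when the queue is empty
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            producer.start();
            consumer.start();
        }
    }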





---------------------------------------------------------------------------------------------------------------------------------------------------------------------


  • The only way an object can be shared across threads is if a reference to it is published to the heap (illustrated below).
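A small illustration of what "published to the heap" means; the class and field names are invented for the example. A reference kept in a local variable is confined to one thread, while a reference stored in a field other threads can reach is shared:

    import java.util.ArrayList;
    import java.util.List;

    public class Publication {
        // Published: any thread that can see this field can reach the list.
        // (volatile here so the publication itself is visible; final or a lock also works.)
        static volatile List<String> shared;

        void publish() {
            List<String> local = new ArrayList<>();   // thread-confined so far
            local.add("only this thread can see the list yet");
            shared = local;                           // publishing the reference shares the object
        }
    }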


ReadWriteLocks


In a way, I am surprised that ReadWriteLocks are as bad as they are. I really thought that write locks would get priority over read locks, since not doing that would result in obvious starvation scenarios. Instead, with Java 5, as soon as you have 3 or more read threads, you can expect complete starvation of the writer threads. With Java 6, the situation has improved somewhat.


Instead of using ReadWriteLock, we would rather recommend that you use higher-level concurrency classes such as the atomic classes, ConcurrentHashMap, and ConcurrentLinkedQueue (one example of that substitution is sketched below).
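As one illustration of that advice, a simple counter that might otherwise be guarded by a ReadWriteLock can be replaced by an atomic class; the class below is a made-up example of that substitution:

    import java.util.concurrent.atomic.AtomicLong;

    public class HitCounter {
        private final AtomicLong hits = new AtomicLong();

        public void record()  { hits.incrementAndGet(); }  // lock-free write
        public long current() { return hits.get(); }        // lock-free read, so readers never starve writers
    }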


 The ReentrantLock class, which implements Lock, has the same concurrency and memory semantics as synchronized, but also adds features like lock polling, timed lock waits, and interruptible lock waits. Additionally, it offers far better performance under heavy contention. (In other words, when many threads are attempting to access a shared resource, the JVM will spend less time scheduling threads and more time executing them.)
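A sketch of those extra features (lock polling, a timed wait, and an interruptible wait); the account example and the one-second timeout are illustrative:

    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.locks.ReentrantLock;

    public class Account {
        private final ReentrantLock lock = new ReentrantLock();
        private long balance;

        // Lock polling: give up immediately if the lock is unavailable.
        public boolean tryDeposit(long amount) {
            if (!lock.tryLock()) return false;
            try { balance += amount; return true; } finally { lock.unlock(); }
        }

        // Timed wait: give up after one second instead of blocking forever.
        public boolean depositWithin(long amount) throws InterruptedException {
            if (!lock.tryLock(1, TimeUnit.SECONDS)) return false;
            try { balance += amount; return true; } finally { lock.unlock(); }
        }

        // Interruptible wait: another thread can interrupt us while we block on the lock.
        public void deposit(long amount) throws InterruptedException {
            lock.lockInterruptibly();
            try { balance += amount; } finally { lock.unlock(); }
        }
    }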


ConcurrentHashMap 
  • Uses multiple write locks for different hash buckets: 32 locks, each guarding a different subset of the buckets (lock striping, sketched below).
  • Locks are primarily used by mutative operations (put() and remove()).
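A rough sketch of the lock-striping idea, not ConcurrentHashMap's actual implementation: each stripe of buckets has its own lock, so writers that hash to different stripes do not contend (the real class also lets most reads proceed without locking):

    import java.util.HashMap;
    import java.util.Map;

    public class StripedMap {
        private static final int STRIPES = 32;              // mirrors the "32 locks" note above
        private final Object[] locks = new Object[STRIPES];
        private final Map<String, String>[] buckets;

        @SuppressWarnings("unchecked")
        public StripedMap() {
            buckets = new HashMap[STRIPES];
            for (int i = 0; i < STRIPES; i++) {
                locks[i] = new Object();
                buckets[i] = new HashMap<>();
            }
        }

        private int stripe(String key) {
            return (key.hashCode() & 0x7fffffff) % STRIPES;
        }

        public void put(String key, String value) {
            int s = stripe(key);
            synchronized (locks[s]) {   // only this stripe is locked; the other 31 stay available
                buckets[s].put(key, value);
            }
        }

        public String get(String key) {
            int s = stripe(key);
            synchronized (locks[s]) {
                return buckets[s].get(key);
            }
        }
    }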
---------------------------------------------------------------------------------------------------------------------------

I think it's more useful to think about IO-bound work and CPU-bound work; threads can help with both.
For IO-bound work you are presumably waiting for external resources (in your case, feeds to be read). If you must wait on multiple external resources then it only makes sense to wait on them in parallel rather than wait on them one after the other. This is best done by spinning up threads which block on the IO.
For CPU-bound work you want to use all of your cores to maximize the throughput of completing that work. To do that, you should create a pool of worker threads roughly the same size as your number of cores and break up and distribute the work across them. [How you break up and distribute the work is itself an interesting problem.]
In practice, I find that most applications have both of these problems and it makes sense to use threads to solve both kinds of problems.
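A sketch of the two kinds of pools described above; the pool sizes, task bodies, and the feed-fetching example are assumptions for illustration:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class Pools {
        public static void main(String[] args) {
            int cores = Runtime.getRuntime().availableProcessors();

            // CPU-bound work: roughly one worker per core keeps every core busy
            // without oversubscribing them.
            ExecutorService cpuPool = Executors.newFixedThreadPool(cores);

            // IO-bound work: these threads spend most of their time blocked on
            // external resources, so a larger (here unbounded) pool lets the waits overlap.
            ExecutorService ioPool = Executors.newCachedThreadPool();

            ioPool.submit(() -> { /* fetch a feed, blocking on the network */ });
            cpuPool.submit(() -> { /* parse, sort, or search the fetched data */ });

            ioPool.shutdown();
            cpuPool.shutdown();
        }
    }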