跳至主要内容

Jail AtomicLong?

Jail AtomicLong?

The atomic class in Java Kingdom provide very handy and powerful functionalities in cases where our program’s state is simple enough to delegate to a atomic variable. If we look at the packages – java.util.concurrent.atomic, there exists five general kinds of atomic class:

Atomic
| | | | \__ Primitives: Long, Integer, Boolean, Reference
| | | \---- Arrasy element: LongArray, IntegerArray, ReferenceArray
| | \______ Refelction related: LongFieldUpdater, IntegerFieldUpdater, ReferenceFieldUpdater
| \-------- Reference with state: MarkableReference, StampedReference
\__________ Primitives: Striped64, LongAdder, DoubleAdder, LongAccumulator, DoubleAccumulator[*]

[*]: Other four kinds of classes all start with `Atomic`, except this one.

Each kinds of class has different usages domains as above list concludes, except the first and last type. They seems have some duplication: that is this three similar class

  • AtomicLong
  • AtomicAdder
  • AtomicAccumulator

Obviously enough, the AtomicAccumulator is the abstraction of AtomicAdder, i.e. which support more general operations but not so efficient compared with a simple “Adder”. So, the newcomer LongAdder (who comes in JDK since 1.8) report to the Java Kingdom that “we should jail the AtomicLong and deprecate it in the future.”

Today, the Java Judge Committee open a meeting to let AtomicLong and LongAdder to do “face-to-face confrontation”.


Memory for Speed

LongAdder start with documentation that Doug Lea write:

Under low update contention, the two classes have similar characteristics. But under high contention, expected throughput of this class is significantly higher

“As all of you seen, LongAdder will be significantly efficient under high contention.” LongAdder said.

“But we should notice why it is more efficient? The AtomicLong is already very fast which use optimistic lock CAS to avoid heavy lock path. So how LongAdder make it more efficient? He must hide some fact.” AtomicLong challenged.

“Em, right. LongAdder make it at the expense of higher space consumption.” LongAdder added.

"Yes. He didn’t tell all truth. The original document of LongAdder is

But under high contention, expected throughput of this class is significantly higher, at the expense of higher space consumption.

"In details, It maintains one or more variables which sum to the real value. So explains it!

One or more variables that together maintain an initially zero long sum

LongAdder think the drawback is exposed, he has to show his advantages "Ok, Let’s start with documentation

When updates (method add) are contended across threads, the set of variables may grow dynamically to reduce contention. Method sum (or, equivalently, longValue) returns the current total combined across the variables maintaining the sum.

"So we can grasp a general idea that under high contention, different thread will update different variables, which avoid contention as AtomicLong did. We can compare the implementations:

// AtomicLong
public final long addAndGet(long delta) {  
  return unsafe.getAndAddLong(this, valueOffset, delta) + delta;  
}
// Unsafe
public final long getAndAddLong(Object obj, long offset, long delta) {  
  long old;  
  do {  
    old = this.getLongVolatile(obj, offset);  
  } while(!this.compareAndSwapLong(obj, offset, old, old + delta));
  return old;  
}

"As you can see, the implementation of AtomicLong will continue to try if CAS failed under high thread contention. But our LongAdder is different:

public void add(long x) {
    Cell[] as; long b, v; int m; Cell a;
    if ((as = cells) != null || !casBase(b = base, b + x)) {
        boolean uncontended = true;
        if (as == null || (m = as.length - 1) < 0 ||
            (a = as[getProbe() & m]) == null ||
            !(uncontended = a.cas(v = a.value, v + x)))
            longAccumulate(x, null, uncontended);
    }
}

"Our LongAdder has a long base and Cell[] cells to store sum value. Under low contention, thread will just update the base use CAS which is the same with LongAdder. But under high contention, most thread will CAS failed and enter the if condition.

"those threads will check whether the cells array is already init

as == null || (m = as.length - 1) < 0

"whether the a random (getProbe is a thread local random number generation) cell is init

a = as[getProbe() & m]) == null

“and finally, whether that cell contented. If all those trial failed, we will init cells or try to attach new Cell, or in more special cases, expand size of cells to reduce the collision.” LongAdder finished.

if (…
!(uncontended = a.cas(v = a.value, v + x)))
longAccumulate(x, null, uncontended);

People are almost overwhelmed by the complex but dedicated implementation of LongAdder, but except AtomicLong. He found some other drawbacks.

“But” AtomicLong said loudly, “you pay for it more than space”, he wait a while and continue “You are not accurate. The following is the citation from your sum() method”

Returns the current sum. The returned value is NOT an atomic snapshot; invocation in the absence of concurrent updates returns an accurate result, but concurrent updates that occur while the sum is being calculated might not be incorporated.

“The sum is accumulated dynamically by iterating all cells (if exists), which, without lock, can be updated during iteration, so is not accurate. By the way, your sum() is of course not efficient than mine.” AtomicLong finished.

All people nod and LongAdder is blushed.

“So you are very suitable for situations like collecting statistics, which has high contention and endure the accuracy loss, just like the ConcurrentHashMap's size().” AtomicLong close the debate and return to work.


PS

One more questions let reader to ponder:

if ((as = cells) != null || !casBase(b = base, b + x)) {

During add, what about check base first then cells?

Written with StackEdit.

评论

此博客中的热门博文

Spring Boot: Customize Environment

Spring Boot: Customize Environment Environment variable is a very commonly used feature in daily programming: used in init script used in startup configuration used by logging etc In Spring Boot, all environment variables are a part of properties in Spring context and managed by Environment abstraction. Because Spring Boot can handle the parse of configuration files, when we want to implement a project which uses yml file as a separate config file, we choose the Spring Boot. The following is the problems we met when we implementing the parse of yml file and it is recorded for future reader. Bind to Class Property values can be injected directly into your beans using the @Value annotation, accessed via Spring’s Environment abstraction or bound to structured objects via @ConfigurationProperties. As the document says, there exists three ways to access properties in *.properties or *.yml : @Value : access single value Environment : can access multi

Elasticsearch: Join and SubQuery

Elasticsearch: Join and SubQuery Tony was bothered by the recent change of search engine requirement: they want the functionality of SQL-like join in Elasticsearch! “They are crazy! How can they think like that. Didn’t they understand that Elasticsearch is kind-of NoSQL 1 in which every index should be independent and self-contained? In this way, every index can work independently and scale as they like without considering other indexes, so the performance can boost. Following this design principle, Elasticsearch has little related supports.” Tony thought, after listening their requirements. Leader notice tony’s unwillingness and said, “Maybe it is hard to do, but the requirement is reasonable. We need to search person by his friends, didn’t we? What’s more, the harder to implement, the more you can learn from it, right?” Tony thought leader’s word does make sense so he set out to do the related implementations Application-Side Join “The first implementation

Implement isdigit

It is seems very easy to implement c library function isdigit , but for a library code, performance is very important. So we will try to implement it and make it faster. Function So, first we make it right. int isdigit ( char c) { return c >= '0' && c <= '9' ; } Improvements One – Macro When it comes to performance for c code, macro can always be tried. #define isdigit (c) c >= '0' && c <= '9' Two – Table Upper version use two comparison and one logical operation, but we can do better with more space: # define isdigit(c) table[c] This works and faster, but somewhat wasteful. We need only one bit to represent true or false, but we use a int. So what to do? There are many similar functions like isalpha(), isupper ... in c header file, so we can combine them into one int and get result by table[c]&SOME_BIT , which is what source do. Source code of ctype.h : # define _ISbit(bit) (1 << (