跳至主要内容

GC Roots

GC Roots

This blog is about the a question I received:

Are the GC roots of Minor GC and Major GC the same?

GC Algorithm

If we are asked to come up with a GC collection algorithm, we may think about a solution : in which we maintain the count of reference to some objects, if count drop to 0, we claim the memory. This is called Reference Count, which can’t solve the cyclic reference, is a wrong algorithm.

The GC collection algorithm adopted by JVM is marking algorithm from GC roots. The GC roots sounds very mysterious, but its meaning is very simple: GC roots is the object that live outside Java heap and referring to object in Java heap.

From the Hotspot JVM source code, we can see the following GC roots types:

enum RootType {  
  universe = 1,  
  jni_handles           = 2,  
  threads               = 3,  
  object_synchronizer   = 4,  
  system_dictionary     = 5,  
  class_loader_data     = 6,  
  management            = 7,  
  jvmti                 = 8,  
  code_cache            = 9  
};

And from comment of source code, we can conclude the meaning of those roots:

  • Universe: is a name space holding known system classes and objects in the VM. The object heap is allocated and accessed through Universe, and various allocation support is provided.
  • JNIHandles: JNI code
  • Threads: live threads
  • ObjectSynchronizer: monitor object
  • Management: JVM used management class
  • JvmtiExport: JVM tool interface, for debug & profiling;
  • SystemDictionary: Loaded classes are accessible through the SystemDictionary.
  • ClassLoaderDataGraph: A class loader represents a linkset. Conceptually, a linkset identifies the complete transitive closure of resolved links that a dynamic linker can produce. A ClassLoaderData also encapsulates the allocation space, called a metaspace, used by the dynamic linker to allocate the runtime representation of all the types it defines.

PSScavenge for Minor GC

In order to answer the question, we need to dive into the marking algorithm process. So, we choose the PSScavenge as our target. Scavenging is a garbage collector for young generation in Hotspot JVM, PS represents parallel process.

A simple search we found the following method, which is the GC process:

PSScavenge::invoke()

Through some code reading, we can draw following invocation sequence (we take Universe as an example, other roots are the same):

PSScavengeScavengeRootsTaskUniversePSRootsClosurePSPromotirun task in threadsUniverse::oops_dodo_oop, do_oop_nv, do_oop_workcopy_and_push_safe_barrierPSScavengeScavengeRootsTaskUniversePSRootsClosurePSPromoti

And in do_oop_work, live object is copied to survivor space and card_table (the utility used to avoid scan old generation when handle young generation) is updated:

// p is GC roots
oop o = RawAccess<OOP_NOT_NULL>::oop_load(p);  
oop new_obj = o->is_forwarded()  
    ? o->forwardee()  
    : copy_to_survivor_space<promote_immediately>(o);

RawAccess<OOP_NOT_NULL>::oop_store(p, new_obj);

if ((!PSScavenge::is_obj_in_young((HeapWord*)p)) &&  
ParallelScavengeHeap::heap()->is_in_reserved(p)) {  
  if (PSScavenge::is_obj_in_young(new_obj)) {  
    PSScavenge::card_table()->inline_write_ref_field_gc(p, new_obj);  
  }  
}

PSParallelCompact for Major GC

In order to compare, we then see the GC code for old generation. Using similar way, we can have the illustration of PSParallelCompact's , which is garbage collector for old generation:

invoke_no_policymarking_phaseMarkFromRootsTaskMarkAndPushClosurParCompactdo_oop, do_oop_nvmark_and_pushinvoke_no_policymarking_phaseMarkFromRootsTaskMarkAndPushClosurParCompact

Until now, we still not find any differences between GC roots of young generation and old generation. When doing young generation garbage collection, card_table should be used, but we didn’t see any clue. So we decided to review the PSScavenge::invoke().

This time, we find we miss a different task from ScavengeRootsTaskOldToYoungRootsTask

OldToYoungRootsTask

In PSScavenge::invoke(), Hotspot will open a thread pool to run a serial tasks. Except the ScavengeRootsTask which handle all the roots as above listed, OldToYoungRootsTask handle reference in card_table:

PSScavengeOldToYoungRootsTasPSCardTablecard_table->scavenge_contents_parallelPSScavengeOldToYoungRootsTasPSCardTable

So, we can understand where card_table is used and there exists no special roots other than constant listed above. Now, can see the answer of our question: under the strict definition of GC roots, Minor GC and Major GC use same roots.

Let to be Solved

Although the problem can be answered, but more questions arises:

  • The meaning and usage of is_forwarded & forwardee is still unclear for me;
  • How young generation to old generation reference resolved in old generation mark and compaction? Didn’t find any clue in PSScavenge::invoke_no_policy();

Abstraction for Ref

OldToYoungRootsTask: This task is used to scan old to young roots in parallel

The markOop describes the header of an object.

OopsInGenClosure: Closure for iterating roots from a particular generation
Note: all classes deriving from this MUST call this do_barrier method at the end of their own do_oop method! Note: no do_oop defined, this is an abstract class.

Code Ref

Written with StackEdit.

评论

此博客中的热门博文

Spring Boot: Customize Environment

Spring Boot: Customize Environment Environment variable is a very commonly used feature in daily programming: used in init script used in startup configuration used by logging etc In Spring Boot, all environment variables are a part of properties in Spring context and managed by Environment abstraction. Because Spring Boot can handle the parse of configuration files, when we want to implement a project which uses yml file as a separate config file, we choose the Spring Boot. The following is the problems we met when we implementing the parse of yml file and it is recorded for future reader. Bind to Class Property values can be injected directly into your beans using the @Value annotation, accessed via Spring’s Environment abstraction or bound to structured objects via @ConfigurationProperties. As the document says, there exists three ways to access properties in *.properties or *.yml : @Value : access single value Environment : can access multi

Elasticsearch: Join and SubQuery

Elasticsearch: Join and SubQuery Tony was bothered by the recent change of search engine requirement: they want the functionality of SQL-like join in Elasticsearch! “They are crazy! How can they think like that. Didn’t they understand that Elasticsearch is kind-of NoSQL 1 in which every index should be independent and self-contained? In this way, every index can work independently and scale as they like without considering other indexes, so the performance can boost. Following this design principle, Elasticsearch has little related supports.” Tony thought, after listening their requirements. Leader notice tony’s unwillingness and said, “Maybe it is hard to do, but the requirement is reasonable. We need to search person by his friends, didn’t we? What’s more, the harder to implement, the more you can learn from it, right?” Tony thought leader’s word does make sense so he set out to do the related implementations Application-Side Join “The first implementation

Implement isdigit

It is seems very easy to implement c library function isdigit , but for a library code, performance is very important. So we will try to implement it and make it faster. Function So, first we make it right. int isdigit ( char c) { return c >= '0' && c <= '9' ; } Improvements One – Macro When it comes to performance for c code, macro can always be tried. #define isdigit (c) c >= '0' && c <= '9' Two – Table Upper version use two comparison and one logical operation, but we can do better with more space: # define isdigit(c) table[c] This works and faster, but somewhat wasteful. We need only one bit to represent true or false, but we use a int. So what to do? There are many similar functions like isalpha(), isupper ... in c header file, so we can combine them into one int and get result by table[c]&SOME_BIT , which is what source do. Source code of ctype.h : # define _ISbit(bit) (1 << (