Notes on Two GC Questions


The two questions I was challenged on

Will an “escaped” Thread be garbage collected?

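  • A newly created Thread has no reference pointing to it. Will the thread object be garbage collected after the method returns?
  • (Figure: code that creates a Thread without keeping a reference to it.)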

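  • Clearly, a live thread is a GC root, so it will not be collected. The real question is: with no reference to it, how does the JVM manage this thread object?
    • First, every Thread is organized into a ThreadGroup, and a ThreadGroup can have a parent, so all threads form a tree. A thread and its ThreadGroup reference each other, and the root of the tree is the system ThreadGroup.
    • (Figure: the ThreadGroup tree, rooted at the system ThreadGroup.)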

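    • Second, a thread always belongs to exactly one ThreadGroup. If none is specified, it joins the ThreadGroup of the thread that created it (main in this example).
    • Third, the system ThreadGroup contains two very special threads, ReferenceHandler and FinalizerThread. The incoming reference (back pointer) of each of these threads is a “stack root”.
    • (Figure: the GC path from the stack root to these threads.)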

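    • Finally, the address of “stack root in thread "Finalizer"” is 0x000070000595f968. In the FinalizerThread's stack frames, this address sits at the very bottom of the stack, and its content is the Thread object, which is allocated in YoungGen.
    • (Figure: the FinalizerThread stack frames.)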
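
The claim above can be checked with a small experiment. The following is a minimal sketch, not taken from the original post: the class name `EscapedThreadDemo` and the helpers `startEscapedThread`, `findLiveThread` and `rootGroup` are made up for illustration. It starts a thread without keeping a reference, requests a GC, and then looks the thread up again through `Thread.getAllStackTraces()`. It also walks the ThreadGroup parents up to the root group.

```java
import java.util.concurrent.CountDownLatch;

public class EscapedThreadDemo {

    /** Starts a thread and deliberately keeps no reference to it in the caller. */
    static void startEscapedThread(String name, CountDownLatch release) {
        Thread t = new Thread(() -> {
            try {
                release.await(); // stay alive until the demo releases us
            } catch (InterruptedException ignored) {
                // exit quietly
            }
        }, name);
        t.setDaemon(true); // do not keep the JVM alive if the demo forgets to release
        t.start();
    } // the local variable t goes out of scope here

    /** Looks a live thread up by name using the JVM's own bookkeeping. */
    static Thread findLiveThread(String name) {
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.getName().equals(name)) {
                return t;
            }
        }
        return null;
    }

    /** Walks the ThreadGroup tree upward from the current thread to its root. */
    static ThreadGroup rootGroup() {
        ThreadGroup g = Thread.currentThread().getThreadGroup();
        while (g.getParent() != null) {
            g = g.getParent();
        }
        return g;
    }

    public static void main(String[] args) {
        CountDownLatch release = new CountDownLatch(1);
        startEscapedThread("escaped-demo", release);

        System.gc(); // only a request, but a live thread must survive any GC

        Thread found = findLiveThread("escaped-demo");
        System.out.println("still reachable after GC: " + (found != null));
        System.out.println("thread group: " + found.getThreadGroup().getName());
        System.out.println("root group:   " + rootGroup().getName());

        release.countDown(); // let the thread finish
    }
}
```

Because no ThreadGroup is passed to the constructor, the new thread should report the same group as the thread that created it, which matches the second point above.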

What are the principles for combining the various GC algorithms?

Nothing special here; the summary is as follows:

Principles

  • If you have a single processor, single thread machine then you should use the serial collector (default for some configurations, can be enabled explicitly with -XX:+UseSerialGC).
  • For multiprocessor machines where your workload is basically CPU bound, use the parallel collector. This is enabled by default if you use the -server flag, or you can enable it explicitly with -XX:+UseParallelGC.
  • If you’d rather keep the GC pauses shorter at the expense of using more total CPU time for GC, and you have more than one CPU, you can use the concurrent collector (-XX:+UseConcMarkSweepGC). Note that the concurrent collector tends to require more RAM allocated to the JVM than the serial or parallel collectors for a given workload because some memory fragmentation can occur.

Algorithm summary

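(Figure: diagram of the young and old generation collectors and which of them can be combined, including a blue box marked “?”.)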

  • “Serial” is a stop-the-world, copying collector which uses a single GC thread.
  • “ParNew” is a stop-the-world, copying collector which uses multiple GC threads. It differs from “Parallel Scavenge” in that it has enhancements that make it usable with CMS. For example, “ParNew” does the synchronization needed so that it can run during the concurrent phases of CMS.
  • “Parallel Scavenge” is a stop-the-world, copying collector which uses multiple GC threads. This is like the previous parallel copying collector, but the algorithm is tuned for gigabyte heaps (over 10GB) on multi-CPU machines. This collection algorithm is designed to maximize throughput while minimizing pauses. It has an optional adaptive tuning policy which will automatically resize heap spaces. If you use this collector, you can only use the original mark-sweep collector in the old generation (i.e. the newer old generation concurrent collector cannot work with this young generation collector).
  • “Serial Old” is a stop-the-world, mark-sweep-compact collector that uses a single GC thread.
  • “CMS” is a mostly concurrent, low-pause collector.
  • “Parallel Old” is a compacting collector that uses multiple GC threads.

Using the -XX flags for our collectors for JDK 6:

  • UseSerialGC is “Serial” + “Serial Old”
  • UseParNewGC is “ParNew” + “Serial Old”
  • UseConcMarkSweepGC is “ParNew” + “CMS” + “Serial Old”. “CMS” is used most of the time to collect the tenured generation. “Serial Old” is used when a concurrent mode failure occurs.
  • UseParallelGC is “Parallel Scavenge” + “Serial Old”
  • UseParallelOldGC is “Parallel Scavenge” + “Parallel Old”
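
To see which combination a given JVM actually ended up with, one option is to ask the standard `java.lang.management` API. The sketch below is an illustration, not part of the original post, and the class name `ActiveCollectors` is made up. On HotSpot the reported names typically look like "Copy" / "MarkSweepCompact" for the serial pair, "PS Scavenge" / "PS MarkSweep" for the parallel collectors, and "ParNew" / "ConcurrentMarkSweep" for CMS, but the exact strings depend on the JVM vendor and version.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ActiveCollectors {

    /** Returns the names of the collectors registered in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Each bean reports the memory pools it collects and how often it has run.
            System.out.println(gc.getName()
                    + " pools=" + Arrays.toString(gc.getMemoryPoolNames())
                    + " collections=" + gc.getCollectionCount());
        }
    }
}
```

Running it with different flags, for example `java -XX:+UseSerialGC ActiveCollectors`, is a quick way to confirm which pairing a flag selects on the JDK at hand.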

FAQ

  • UseParNew and UseParallelGC both collect the young generation using multiple GC threads. Which is faster?
    There’s no one correct answer to this question. Mostly they perform equally well, but I’ve seen one do better than the other in different situations. If you want to use GC ergonomics, it is only supported by UseParallelGC (and UseParallelOldGC), so that’s what you’ll have to use.
  • Why don’t “ParNew” and “Parallel Old” work together?
    • “ParNew” is written in a style where each generation being collected offers certain interfaces for its collection. For example, “ParNew” (and “Serial”) implements space_iterate() which will apply an operation to every object in the young generation.
    • When collecting the tenured generation with either “CMS” or “Serial Old”, the GC can use space_iterate() to do some work on the objects in the young generation.
    • This makes the mix-and-match of collectors work but adds some burden to the maintenance of the collectors and to the addition of new collectors. And the burden seems to be quadratic in the number of collectors.
    • Alternatively, “Parallel Scavenge” (at least with its initial implementation before “Parallel Old”) always knew how the tenured generation was being collected and could call directly into the code in the “Serial Old” collector. “Parallel Old” is not written in the “ParNew” style, so matching it with “ParNew” doesn’t just happen without significant work. By the way, we would like to match “Parallel Scavenge” only with “Parallel Old” eventually and clean up any of the ad hoc code needed for “Parallel Scavenge” to work with both.
    • Please don’t think too much about the examples I used above. They are admittedly contrived and not worth your time.
  • How do I use “CMS” with “Serial”?
    • -XX:+UseConcMarkSweepGC -XX:-UseParNewGC. Don’t use -XX:+UseConcMarkSweepGC and -XX:+UseSerialGC. Although that seems like a logical combination, it will result in a message saying something about conflicting collector combinations and the JVM won’t start. Sorry about that. Our bad.
  • Is the blue box with the “?” a typo?
    • That box represents the new garbage collector that we’re currently developing called Garbage First, or G1 for short. G1 will provide:
      • More predictable GC pauses
      • Better GC ergonomics
      • Low pauses without fragmentation
      • Parallelism and concurrency in collections
      • Better heap utilization
    • G1 straddles the young generation – tenured generation boundary because it is a generational collector only in the logical sense. G1 divides the heap into regions and during a GC can collect a subset of the regions. It is logically generational because it dynamically selects a set of regions to act as a young generation which will then be collected at the next GC (as the young generation would be).
    • The user can specify a goal for the pauses and G1 will do an estimate (based on past collections) of how many regions can be collected in that time (the pause goal). That set of regions is called a collection set and G1 will collect it during the next GC.
    • G1 can choose the regions with the most garbage to collect first (Garbage First, get it?) so gets the biggest bang for the collection buck.
    • G1 compacts so fragmentation is much less a problem. Why is it a problem at all? There can be internal fragmentation due to partially filled regions.
    • The heap is not statically divided into a young generation and a tenured generation so the problem of an imbalance in their sizes is not there.
    • Along with a pause time goal the user can specify a goal on the fraction of time that can be spent on GC during some period (e.g., during the next 100 seconds don’t spend more than 10 seconds collecting). For such goals (10 seconds of GC in a 100 second period) G1 can choose a collection set that it expects it can collect in 10 seconds and schedules the collection 90 seconds (or more) from the previous collection. You can see how an evil user could specify 0 collection time in the next century so again, this is just a goal, not a promise.
    • If G1 works out as we expect, it will become our low-pause collector in place of “ParNew” + “CMS”. And if you’re about to ask when will it be ready, please don’t be offended by my dead silence. It’s the highest priority project for our team, but it is software development so there are the usual unknowns. It will be out by JDK7. The sooner the better as far as we’re concerned.
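
The idea of picking a collection set under a pause goal can be illustrated with a toy model. The sketch below is not HotSpot's algorithm: the `Region` type, its fields and the greedy rule are assumptions made for illustration. It sorts candidate regions by the amount of garbage they hold and adds them to the collection set as long as their estimated cost still fits the remaining pause budget.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class CollectionSetSketch {

    /** Hypothetical per-region summary; not a real HotSpot type. */
    static final class Region {
        final int id;
        final long garbageBytes;      // reclaimable space in the region
        final double estimatedMillis; // predicted cost of collecting it

        Region(int id, long garbageBytes, double estimatedMillis) {
            this.id = id;
            this.garbageBytes = garbageBytes;
            this.estimatedMillis = estimatedMillis;
        }
    }

    /** Greedy "garbage first" choice: most garbage first, within the pause goal. */
    static List<Region> chooseCollectionSet(List<Region> candidates, double pauseGoalMillis) {
        List<Region> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingLong((Region r) -> r.garbageBytes).reversed());

        List<Region> chosen = new ArrayList<>();
        double budget = pauseGoalMillis;
        for (Region r : sorted) {
            if (r.estimatedMillis <= budget) { // skip regions that would blow the goal
                chosen.add(r);
                budget -= r.estimatedMillis;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = Arrays.asList(
                new Region(0, 900, 4.0),
                new Region(1, 100, 1.0),
                new Region(2, 500, 3.0),
                new Region(3, 700, 5.0));
        for (Region r : chooseCollectionSet(regions, 8.0)) {
            System.out.println("collect region " + r.id + " (" + r.garbageBytes + " bytes of garbage)");
        }
    }
}
```

As in the text, the estimates only express a goal: if the predicted costs are wrong, the actual pause can still exceed the budget.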
  • summary:
    • Use -XX:+UseParallelGC when you want parallel collection of the YOUNG generation ONLY, while still using the serial mark-sweep method for the OLD generation.
    • Use -XX:+UseParallelOldGC when you want parallel collection of both the YOUNG generation (this automatically sets -XX:+UseParallelGC) and the OLD generation.
    • Use -XX:+UseParNewGC together with -XX:+UseConcMarkSweepGC when you want parallel collection of the YOUNG generation and CMS for the OLD generation.
    • You can’t combine -XX:+UseParallelGC or -XX:+UseParallelOldGC with -XX:+UseConcMarkSweepGC. That’s why you need -XX:+UseParNewGC to be paired with CMS; otherwise use -XX:+UseSerialGC explicitly, or -XX:-UseParNewGC if you wish to use the serial method for the young generation.
  • Using -XX:+UseParNewGC together with -XX:+UseConcMarkSweepGC causes longer pauses for minor GCs than -XX:+UseParallelGC.
    • This is because promoting objects from the young to the old generation requires running a best-fit algorithm (due to old generation fragmentation) to find an address for each object.
    • No such algorithm is needed with -XX:+UseParallelGC, because it can only be paired with a mark-and-compact collector, which leaves no fragmentation.

HotSpot JVM garbage collection options cheat sheet

  • HotSpot GC collectors
    • HotSpot JVM may use one of 6 combinations of garbage collection algorithms listed below.
    Young collector | Old collector | JVM option
    Serial (DefNew) | Serial Mark-Sweep-Compact | -XX:+UseSerialGC
    Parallel scavenge (PSYoungGen) | Serial Mark-Sweep-Compact (PSOldGen) | -XX:+UseParallelGC
    Parallel scavenge (PSYoungGen) | Parallel Mark-Sweep-Compact (ParOldGen) | -XX:+UseParallelOldGC
    Serial (DefNew) | Concurrent Mark Sweep | -XX:+UseConcMarkSweepGC -XX:-UseParNewGC
    Parallel (ParNew) | Concurrent Mark Sweep | -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
    G1 | G1 | -XX:+UseG1GC
  • GC logging options
    • Note that many of the logging options can be changed on a running JVM via JMX (e.g. with JConsole).
    JVM option | Description

    General options
    -verbose:gc or -XX:+PrintGC | Print basic GC info
    -XX:+PrintGCDetails | Print more elaborated GC info
    -XX:+PrintGCTimeStamps | Print timestamps for each GC event (seconds count from start of JVM)
    -XX:+PrintGCDateStamps | Print date stamps at garbage collection events (e.g. 2011-09-08T14:20:29.557+0400: [GC… )
    -XX:+PrintGCTaskTimeStamps | Print timestamps for individual GC worker thread tasks (very verbose)
    -Xloggc: | Redirects GC output to a file instead of console
    -XX:+PrintTenuringDistribution | Print detailed demography of young space after each collection
    -XX:+PrintTLAB | Print TLAB allocation statistics
    -XX:+PrintReferenceGC | Print times for weak/soft/JNI/etc reference processing during STW pause
    -XX:+PrintJNIGCStalls | Reports if GC is waiting for native code to unpin object in memory
    -XX:+PrintGCApplicationStoppedTime | Print pause summary after each stop-the-world pause
    -XX:+PrintGCApplicationConcurrentTime | Print time for each concurrent phase of GC
    -XX:+PrintClassHistogramAfterFullGC | Prints class histogram after full GC
    -XX:+PrintClassHistogramBeforeFullGC | Prints class histogram before full GC
    -XX:+HeapDumpAfterFullGC | Creates heap dump file after full GC
    -XX:+HeapDumpBeforeFullGC | Creates heap dump file before full GC
    -XX:+HeapDumpOnOutOfMemoryError | Creates heap dump in out-of-memory condition
    -XX:HeapDumpPath=<path> | Specifies path to save heap dumps

    CMS specific options
    -XX:PrintCMSStatistics=2 | Print additional CMS statistics if n >= 1
    -XX:+PrintCMSInitiationStatistics | Print CMS initiation details
    -XX:PrintFLSStatistics=2 | Print additional info concerning free lists
    -XX:PrintFLSCensus=2 | Print additional info concerning free lists
    -XX:+PrintPromotionFailure | Print additional diagnostic information following promotion failure
    -XX:+CMSDumpAtPromotionFailure | Dump useful information about the state of the CMS old generation upon a promotion failure.
    -XX:+CMSPrintChunksInDump | In a CMS dump enabled by option above, include more detailed information about the free chunks.
    -XX:+CMSPrintObjectsInDump | In a CMS dump enabled by option above, include more detailed information about the allocated objects.
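
The note about changing options on a running JVM via JMX can be tried from code as well as from JConsole. The sketch below is an illustration under assumptions: it relies on the HotSpot-specific `com.sun.management.HotSpotDiagnosticMXBean`, the class name `ManageableFlags` and the helper `enableIfWriteable` are made up, and which flags are writeable at runtime depends on the JVM version. It uses `HeapDumpOnOutOfMemoryError` from the table above as the example flag.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

public class ManageableFlags {

    /** Obtains the HotSpot diagnostic bean of the current JVM. */
    static HotSpotDiagnosticMXBean diagnostics() {
        return ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
    }

    /**
     * Turns a boolean flag on if the JVM allows changing it at runtime.
     * getVMOption throws IllegalArgumentException when the flag does not exist.
     */
    static boolean enableIfWriteable(String flag) {
        HotSpotDiagnosticMXBean bean = diagnostics();
        VMOption option = bean.getVMOption(flag);
        if (!option.isWriteable()) {
            return false; // not a manageable flag on this JVM
        }
        bean.setVMOption(flag, "true");
        return true;
    }

    public static void main(String[] args) {
        String flag = "HeapDumpOnOutOfMemoryError";
        System.out.println("before: " + diagnostics().getVMOption(flag).getValue());
        System.out.println("changed: " + enableIfWriteable(flag));
        System.out.println("after:  " + diagnostics().getVMOption(flag).getValue());
    }
}
```

The same bean is what JConsole exposes under `com.sun.management:type=HotSpotDiagnostic`, so the calls here mirror what the GUI does.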


    Original author: JVM
    Original URL: https://juejin.im/entry/5abdad505188257cc20d6434