The Microservice kingdom has grown rapidly in recent years. It has many powerful weapons (frameworks and tools) such as Spring Cloud, Dubbo, ServiceComb, and Service Mesh1. With the help of those weapons, the Microservice kingdom has taken much territory from the old Monolithic kingdom.
One day, while the king of Microservice was inspecting his country, a sentinel called free ran up to him and reported in a hurry, “My Lord, the memory in our server 1 region is in danger: only 5% of the memory is still under the control of our linux officer.”
“Don’t be in such a hurry. Tell me the details. When did this happen?” The King was very experienced and wanted more details.
“Ever since server 1 introduced docker, the memory usage of every microservice, like user-service and gateway-service, has been steadily increasing,” free reported.
“How dare you! docker is the new military officer2 our great King introduced.” The DevOps officer who had introduced docker rebuked free angrily.
“Calm down. We all know it has nothing to do with docker. There must be some misunderstanding between docker and the other microservices. Anyway, for now, does any officer have a suggestion for how to deal with the memory crisis in server 1?” the King said.
"docker support the resource limit. We can limit the memory upper bounds docker can use if we have to." The DevOps officer says reluctantly.
“Yes, we have to. The linux officer has a very stubborn but loyal subordinate called OOM-killer. If he detects there is no memory left for his boss, linux, he will choose some processes and kill them to release memory. Even our microservices may be killed in that case3.”
“Can’t we fire that guy?” The DevOps officer didn’t want to limit his docker’s growth.
“No, we can’t. If linux let memory usage keep increasing and did not respond to OutOfMemory, it would crash itself, which would affect even more people4,” the King refused.
“OK, I will apply the limit.” The DevOps officer decided to do it himself. He then sent docker a command5 and told him to restart:
docker run --memory=xxm --memory-swap=xxm ...
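For instance, a minimal sketch with concrete values (the 700m figure is an assumption for illustration; as note 5 below explains, memory-swap covers memory plus swap, so giving both flags the same value disables swap):

$ docker run --memory=700m --memory-swap=700m -d <image>   # memory == memory-swap: no swap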
A few days later, the docker officer sent the sentinel docker ps in with an emergency report. After hearing it, the Microservice King called all the officers together for a meeting.
"Just now, the docker reported that our user-service is killed. docker ps, come in, show us the failure.
$ docker ps | grep 'user-service'
$ docker ps -a | grep 'user-service'
0f742445f839 user-service java ... 16 hours ago
And the dmesg sentinel also reported that a docker container was killed. Does any officer have a suggestion?” the King said gravely.
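One way to confirm docker’s verdict is the container’s OOMKilled state flag, which docker records for every container; a sketch (container name assumed):

$ docker inspect --format '{{.State.OOMKilled}}' user-service
true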
All the officers were too afraid to speak, because they understood this was a very critical failure.
After some time, the core officer Java stepped forward and said, “Our department has some competent staff like jps, jinfo, and jmap; we can send them to investigate this failure.”
“Fine. Show me why this happened and how to deal with such situations within 2 days,” the King ordered.
The Java officer arrived at the accident scene and found only some wreckage (a kernel stack trace):
[ 583.447974] Pid: 1954, comm: java ...
[ 583.447980] Call Trace:
[ 583.447998] [<ffffffff816df13a>] dump_header+0x83/0xbb
[ 583.448108] [<ffffffff816df1c7>] oom_kill_process.part.6+0x55/0x2cf
[ 583.448124] [<ffffffff81067265>] ? has_ns_capability_noaudit+0x15/0x20
[ 583.448137] [<ffffffff81191cc1>] ? mem_cgroup_iter+0x1b1/0x200
[ 583.448150] [<ffffffff8113893d>] oom_kill_process+0x4d/0x50
...
[ 583.448275] [<ffffffff8115b4d3>] do_anonymous_page.isra.35+0xa3/0x2f0
[ 583.448288] [<ffffffff8115f759>] handle_pte_fault+0x209/0x230
[ 583.448301] [<ffffffff81160bb0>] handle_mm_fault+0x2a0/0x3e0
[ 583.448320] [<ffffffff816f844f>] __do_page_fault+0x1af/0x560
[ 583.448341] [<ffffffffa02b0a80>] ? vfsub_read_u+0x30/0x40 [aufs]
[ 583.448358] [<ffffffffa02ba3a7>] ? aufs_read+0x107/0x140 [aufs]
[ 583.448371] [<ffffffff8119bb50>] ? vfs_read+0xb0/0x180
[ 583.448384] [<ffffffff816f880e>] do_page_fault+0xe/0x10
[ 583.448396] [<ffffffff816f4bd8>] page_fault+0x28/0x30
[ 583.448405] Task in /lxc/0f742445f8397ee7928c56bcd5c05ac29dcc6747c6d1c3bdda80d8e688fae949 killed as a result of limit of /lxc/0f742445f8397ee7928c56bcd5c05ac29dcc6747c6d1c3bdda80d8e688fae949
[ 583.448412] memory: usage xxxMB, limit xxxMB, failcnt 342
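Wreckage like this is normally dug out of the kernel log; a sketch (the exact message wording varies by kernel version):

$ dmesg | grep -iE 'oom|killed process'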
No heap dump, no core dump: the investigation was mired in a stalemate. Just at that moment, docker said, “We can restart the user-service and you can jump into the container to see what happens.” Java thought it a good idea to try to reproduce the error, so they set about doing so.
$ docker run --memory=xxm --memory-swap=xxm -d ... --name user-service
$ docker exec -it user-service bash
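Once inside, the limit docker applied can be read straight from the cgroup filesystem; a sketch assuming cgroup v1, which matches this aufs-era setup (the value shown corresponds to a hypothetical 700 MB limit):

$ cat /sys/fs/cgroup/memory/memory.limit_in_bytes
734003200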
The Java officer came into the container, followed by jps, jmap, jinfo, and the officer’s son java. jmap was eager to show his ability, so he said, “I can show the memory usage of a process: classloader statistics, a heap histogram…”
“So show us,” Java interrupted him.
jmap called over jps and head to help him, then got to work:
$ jps
1 Jar
xxx jps
$ jmap -histo <vmid> | head
num #instances #bytes class name
----------------------------------------------
1: 2083 18549536 [B
2: 1654 2146632 [I
3: 15388 1471480 [C
4: 3671 409312 java.lang.Class
5: 15031 360744 java.lang.String
6: 2909 314808 [Ljava.lang.Object;
7: 7206 230592 java.util.concurrent.ConcurrentHashMap$Node
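Since memory kept climbing while they watched, a live view of GC activity would also have helped; a sketch using jstat, which ships with the JDK alongside jps and jmap (<vmid> as above, 1000 is the sampling interval in milliseconds):

$ jstat -gcutil <vmid> 1000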
They spent some time analyzing the memory usage and looking for a memory leak. Even as they analyzed, the memory usage kept rising. The heap analysis gave no results, and they grew anxious. At that point, the Java officer noticed that jps and jinfo were playing around:
$ jps -lvm
1 Jar
...
$ jinfo -flags <vmid>
...
Non-default VM flags: -XX:CICompilerCount=3
Command line: -Djava.awt.headless=true ...
$ jinfo -sysprops <vmid>
Just as he was about to get angry, he noticed that the command line lacked -Xmx, the flag that sets the JVM’s maximum heap size. Suddenly, an insight dawned on him: these microservices had no JVM memory settings at all, because they were Spring Boot projects started with a simple java -jar xxx.jar.
The Java officer called java over and said, “Show us the default memory limits of the JVM in this environment.” java set to work right away:
$ java -XX:+PrintFlagsFinal -version | grep -iE 'HeapSize|PermSize'
uintx ErgoHeapSizeLimit = 0 {product}
uintx HeapSizePerGCThread = 87241520 {product}
uintx InitialHeapSize := 192937984 {product}
uintx LargePageHeapSizeThreshold = 134217728 {product}
uintx MaxHeapSize := 3072327680 {product}
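That MaxHeapSize is plausible for a host with roughly 12 GB of RAM (an assumed figure for illustration): by default the JVM takes about one quarter of the physical memory it sees, and a JVM of this era sees the host’s memory, not the container’s limit:

$ echo $((3072327680 / 1024 / 1024))   # MaxHeapSize in MB
2930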
Seeing that MaxHeapSize was much larger than the memory limit docker had set, they all understood: the JVM did not know about docker’s memory limit, so it felt no pressure to GC before its heap grew past the container’s cap; it simply kept requesting more memory until it was killed. The solution was also very simple: add -Xmx700m to the startup options.
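Put together, the fixed startup looks something like this sketch (the 1g limit, image, and jar names are assumptions; -Xmx is kept below --memory to leave headroom for the JVM’s non-heap memory such as metaspace, thread stacks, and code cache):

$ docker run --memory=1g --memory-swap=1g -d --name user-service \
      <image> java -Xmx700m -jar user-service.jar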
Errors When Using the Java Tools
- “Error attaching to process”: run the tool as the same user that owns the target JVM process
- “Metadata does not appear to be polymorphic”: typically caused by a version mismatch between the tool’s JDK and the target JVM
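For the first error, a sketch (the user name is an assumption):

$ sudo -u app-user jmap -histo <vmid>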
References
- Docker documentation: limiting a container’s resources
- Default heap size of the JVM
- Getting the parameters of a running JVM
- Docker container is killed after exceeding its memory limit
These are four mainstream microservice frameworks and technologies in the Java web world. ↩︎
Docker is a virtualization technique that can be used to isolate resources and environments and to provide elastic service scaling. Since we draw an analogy between microservice frameworks and ‘weapons’, and docker is used to manage those microservice applications, we call it a ‘military officer’. ↩︎
An example can be seen in this blog post: OOM killer kills the tomcat. ↩︎
Actually, we can turn off linux’s memory overcommit to keep linux itself from crashing, but then, once memory is exhausted, every allocation made through malloc will fail, which stops every process and leaves the system unusable. So we had better not turn this feature off. ↩︎
Swap in docker also carries a heavy performance penalty, so we had better disable it. In docker, memory-swap covers both memory and swap, so, to disable swap, we set the same value for memory and memory-swap. For more details, you can refer to this page. ↩︎