Microservice Log Practice(2): Aggregate Log

Tony got up early today, determined to finish the following requirements for microservice logging. He has just centralized the logging configuration with Spring Cloud Config; now he wants to collect the logs.

  • The configuration is centralized;
  • The log output is centralized: it is stored in one place on a server;
  • The logs are naturally grouped by request.

Log Collection

“When it comes to log-related jobs, the ELK stack comes to mind immediately. Besides Logstash, Elastic also provides the ‘Beats’ family to handle logs. Since we append logs to files for future use, we can use Filebeat to collect them, send them to Logstash for parsing, store them in Elasticsearch, and visualize them in Kibana,” Tony thinks.

“Oh, I forgot that our services now run in Docker so that they can scale elastically, so the requirement is no longer simple. The first problem: where is the log file stored? If we store it inside the Docker container, how can Filebeat collect it? Do we put Filebeat in Docker as well?

“Considering the locations of Filebeat and the log file, we seem to have four choices:

                         log in docker   log outside docker
Filebeat in docker       (1)             (2)
Filebeat outside docker  (3)             (4)

“Choice (3) is not reasonable, so ignore it. As for choice (1), putting Filebeat into the container means every microservice container also has to run a Filebeat, even though Docker officially recommends running only one service per container. It would also add considerable overhead, because we have many microservices in total. One more drawback is persistence: if the container is removed, its log files are gone for good, which may cause data loss.” Tony strikes through options one and three.

“So the remaining options are two and four. But how do we get the log file outside of Docker? Aha, I remember that Docker supports volumes and bind mounts to share files between a container and the host machine. The microservice logs as usual, but the file actually lives outside the container. As for where Filebeat runs, inside or outside Docker seems to make no difference.”

Log View

“The last requirement is the hard one: the logs are naturally grouped by request.” Tony is lost in thought. “This is an essential feature: in a microservice architecture, the logs of a single request can be spread across multiple microservices, different files, even different hosts. If we had to log in to each machine and grep for the error message, we would have no time for anything else. By grouping the logs of a single request, we can clearly see the call chain across hosts and where the problem may lie.

“But how? Even with the logs already stored in Elasticsearch, how can I find the ones that belong together? What about adding an extra tag to each log message and then searching for that tag in Kibana?”

RequestId & Kibana

“Yes, we can add a requestId header when the request reaches the gateway, and then fetch the requestId whenever we append a log.

“We can use a Filter to intercept the request and add the identifier:

@Override
public void doFilter(ServletRequest servletRequest, ServletResponse servletResponse,
    FilterChain filterChain) throws IOException, ServletException {
  // Wrap the incoming request so that downstream code sees a requestId header.
  filterChain
      .doFilter(new AddRequestId((HttpServletRequest) servletRequest), servletResponse);
}
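The AddRequestId wrapper itself is not shown in this post. The following is only a minimal sketch of the idea, assuming the header is literally named requestId and modeling the headers as a plain Map so that the example runs without the servlet API. RequestIdHeaders and withRequestId are hypothetical names. In a real filter, similar logic would typically live in a subclass of HttpServletRequestWrapper that overrides getHeader.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical helper: models request headers as a Map to stay free of the servlet API.
final class RequestIdHeaders {
  static final String REQUEST_ID = "requestId"; // assumed header name

  private RequestIdHeaders() {}

  // Returns a copy of the headers that is guaranteed to carry a request id.
  // An id already set by an upstream caller is kept, so one request keeps one id
  // across all the services it passes through.
  static Map<String, String> withRequestId(Map<String, String> headers) {
    Map<String, String> copy = new LinkedHashMap<>(headers);
    String existing = copy.get(REQUEST_ID);
    if (existing == null || existing.isEmpty()) {
      copy.put(REQUEST_ID, UUID.randomUUID().toString());
    }
    return copy;
  }
}
```

Keeping an existing id instead of always generating a new one is a design choice: only the first hop (the gateway) creates the id, and every later service reuses it.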

“However, how do we add the requestId to the log? Add one more parameter to every log method call? Impossible! There are far too many log calls. Actually, the requestId is part of the log message’s pattern, much like the date or the thread name. So logback’s MDC is the perfect match: it is bound to a specific thread, just as a request is.”

<pattern>%d %5p %X{requestId} [${application}@${env}@${HOST_ADDRESS}] --- [%t] %-40.40c{39} : %m%n%ex</pattern>

“Once again, a Filter is the best place to populate the MDC variable, because it avoids intruding on the original code:

@Override
public void doFilter(ServletRequest servletRequest, ServletResponse servletResponse,
    FilterChain filterChain) throws IOException, ServletException {
  MDC.put(REQUEST_ID, ((HttpServletRequest) servletRequest).getHeader(REQUEST_ID));
  try {
    filterChain.doFilter(servletRequest, servletResponse);
  } finally {
    MDC.remove(REQUEST_ID);
  }
}
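The finally block in this filter matters because servlet containers typically serve requests from a pool of reused worker threads, and the MDC is bound to the thread. The sketch below illustrates the effect with a simplified thread-local stand-in; MiniMdc and PooledThreadDemo are hypothetical names, and MiniMdc is not logback's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Simplified stand-in for a mapped diagnostic context; NOT logback's implementation.
final class MiniMdc {
  private static final ThreadLocal<Map<String, String>> CONTEXT =
      ThreadLocal.withInitial(HashMap::new);

  private MiniMdc() {}

  static void put(String key, String value) { CONTEXT.get().put(key, value); }
  static String get(String key) { return CONTEXT.get().get(key); }
  static void remove(String key) { CONTEXT.get().remove(key); }
}

final class PooledThreadDemo {
  private PooledThreadDemo() {}

  // Handles two "requests" on the same pooled thread and returns the requestId
  // that the second request finds before it has set its own.
  static String secondRequestSees(boolean cleanUp) {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    Runnable first = () -> {
      MiniMdc.put("requestId", "req-1");
      try {
        // ... handle the first request, logging along the way ...
      } finally {
        if (cleanUp) {
          MiniMdc.remove("requestId");
        }
      }
    };
    Callable<String> second = () -> MiniMdc.get("requestId");
    try {
      pool.submit(first).get(); // wait until the first request is done
      return pool.submit(second).get();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }
}
```

Under these assumptions, secondRequestSees(false) returns "req-1", a stale id leaking into the next request, while secondRequestSees(true) returns null.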

“Then we only need to filter the logs in Kibana by requestId to see the call chain across the microservices.” Tony finishes the design and starts writing some demos to test it.

Written with StackEdit.
