383. Print processes and their messages from a given log file
Problem Statement:
An application runs several processes in parallel, and every process writes its logs to a single shared log file. Each log line contains a timestamp, the process name, the message type (WARN, ERROR, etc.), and the actual message. You are given a start timestamp and an end timestamp. Write an algorithm that processes the log file, counts the messages of each type per process within the start and end timestamps, and finds the top 2 processes with the most messages.
Example:
Sample Logfile:
Timestamp|Process|MessageType|Message
1540403503|S1|WARN|warning message
1540403503|S1|WARN|warning message
1540403504|S2|WARN|warning message
1540403504|S1|ERROR|error message
1540403604|S4|ERROR|error message
1540403614|S1|DEBUG|debug message
1540403614|S5|DEBUG|debug message
1540403615|S6|INFO|info message
1540403615|S6|DEBUG|debug message
1540403615|S6|DEBUG|debug message
1540403715|S7|INFO|info message
1540403715|S7|INFO|info message

Output:
S1 - 1 ERROR / 1 DEBUG / 2 WARN /
S2 - 1 WARN /
S4 - 1 ERROR /
S5 - 1 DEBUG /
S6 - 1 INFO / 2 DEBUG /
S7 - 2 INFO /
Top 2 processes are S1 : 4 and S6 : 3
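Each log line in the sample has four pipe-separated fields. As a minimal sketch of reading one line, the record below splits it into those fields and checks the timestamp against a range. The `LogEntry` name, its field names, and the inclusive range bounds are assumptions made for illustration, not part of the original problem, and records require Java 16 or later.

```java
import java.util.Optional;

/** One parsed log line; the record and its field names are illustrative. */
record LogEntry(long timestamp, String process, String type, String message) {

    /** Splits "Timestamp|Process|MessageType|Message" into its four fields. */
    static Optional<LogEntry> parse(String line) {
        // A limit of 4 keeps any '|' inside the message text intact.
        String[] parts = line.split("\\|", 4);
        if (parts.length < 4) {
            return Optional.empty(); // malformed line
        }
        try {
            long ts = Long.parseLong(parts[0].trim());
            return Optional.of(new LogEntry(ts, parts[1].trim(), parts[2].trim(), parts[3]));
        } catch (NumberFormatException e) {
            return Optional.empty(); // non-numeric timestamp, e.g. the header row
        }
    }

    /** True when the timestamp lies in [start, end]; inclusive bounds are an assumption. */
    boolean inRange(long start, long end) {
        return timestamp >= start && timestamp <= end;
    }
}
```

For example, `LogEntry.parse("1540403504|S1|ERROR|error message")` yields an entry whose `process()` is `S1` and whose `type()` is `ERROR`, while the header row yields an empty `Optional` because its first field is not a number.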
Approach:
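- Use a LinkedHashMap with the process name as key and a HashMap as value; the inner HashMap has the message type as key and its count as value. The LinkedHashMap keeps the processes in the order in which they first appear in the log.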
- Process the given log file, one line at a time.
- Read the timestamp; if it lies within the given range, proceed, otherwise skip the line.
- Check whether the process name exists in the LinkedHashMap:
  - If not, create a HashMap with the message type as key and 1 as value, then insert this HashMap into the LinkedHashMap with the process name as key.
  - If yes, get the HashMap from the LinkedHashMap using the process name as key. If the message type already exists in the retrieved HashMap, increase its count; otherwise insert it with count 1.
- At the end iterate the LinkedHashMap to print the result.
- During this iteration, keep track of the top two processes (those with the most log messages), using the same technique as finding the two largest elements in an array.
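The steps above can be sketched in Java roughly as follows. This is one possible implementation under stated assumptions, not the article's reference solution: the class and method names (`LogSummary`, `countByProcess`, `topTwo`) are hypothetical, the log is assumed to be already read into a list of lines, the range is treated as inclusive, and `computeIfAbsent` plus `merge` stand in for the explicit "exists / does not exist" checks described in the list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LogSummary {

    /** Counts message types per process for lines whose timestamp is in [start, end]. */
    static Map<String, Map<String, Integer>> countByProcess(List<String> lines, long start, long end) {
        // LinkedHashMap keeps processes in first-seen order.
        Map<String, Map<String, Integer>> counts = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = line.split("\\|", 4);
            if (parts.length < 4) continue;          // malformed line
            long ts;
            try {
                ts = Long.parseLong(parts[0].trim());
            } catch (NumberFormatException e) {
                continue;                            // e.g. the header row
            }
            if (ts < start || ts > end) continue;    // outside the requested range
            // Create the inner map on first sight of the process, then bump the type count.
            counts.computeIfAbsent(parts[1], p -> new HashMap<>())
                  .merge(parts[2], 1, Integer::sum);
        }
        return counts;
    }

    /** Single pass that returns up to two "name : total" strings, largest total first. */
    static List<String> topTwo(Map<String, Map<String, Integer>> counts) {
        String first = null, second = null;
        int firstTotal = -1, secondTotal = -1;
        for (Map.Entry<String, Map<String, Integer>> entry : counts.entrySet()) {
            int total = entry.getValue().values().stream().mapToInt(Integer::intValue).sum();
            if (total > firstTotal) {
                // The old leader becomes the runner-up.
                second = first;
                secondTotal = firstTotal;
                first = entry.getKey();
                firstTotal = total;
            } else if (total > secondTotal) {
                second = entry.getKey();
                secondTotal = total;
            }
        }
        List<String> result = new ArrayList<>();
        if (first != null) result.add(first + " : " + firstTotal);
        if (second != null) result.add(second + " : " + secondTotal);
        return result;
    }
}
```

Calling `countByProcess` on the sample log with a range that covers every timestamp and passing the result to `topTwo` gives `S1 : 4` and `S6 : 3`. Because the comparisons are strict, a process that ties with an earlier one does not displace it. The inner map is a plain HashMap, so the order in which the message types of one process are printed is not guaranteed.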
Output:
S1 - 1 ERROR / 1 DEBUG / 2 WARN /
S2 - 1 WARN /
S4 - 1 ERROR /
S5 - 1 DEBUG /
S6 - 1 INFO / 2 DEBUG /
S7 - 2 INFO /
Top 2 processes are S1 : 4 and S6 : 3