Tuesday, March 15, 2022

[SOLVED] Logstash: how to output the filtered log in the correct order?

Issue

I'm trying to mask part of my logs using a Logstash filter.

Here is an example of my logs.

[2022/02/22 12:19:56.092] [INFO ] Controller : co.test.api.controller.LoginApiController#loginUsersPost 
[2022/02/22 12:19:56.092] [INFO ] API : Object[][{F120001,class LoginRequestDto {
    msn: 08022222222
    userLoginDto: class UserLoginAuthQrDto {
        class UserLoginDto {
            type: 01
        }
        number: 20290520021255J
    }
}}]

However, Logstash outputs JSON like the following.

{"message":"[2022/02/22 12:19:56.092] [INFO ] API : Object[][{F120001,class LoginRequestDto {","@timestamp":
{"message":"    userLoginDto: class UserLoginAuthQrDto {","@timestamp":"2022-02-22T03:19:56.931Z"}
{"message":"            type: 01","@timestamp":"2022-02-22T03:19:56.931Z"}
{"message":"        number: 20290520021255J","@timestamp":"2022-02-22T03:19:56.931Z"}
{"message":"[2022/02/22 12:19:56.092] [INFO ] Controller : co.test.api.controller.LoginApiController#logi
{"message":"    msn:XXXXX","@timestamp":"2022-02-22T03:19:56.930Z"}
{"message":"        class UserLoginDto {","@timestamp":"2022-02-22T03:19:56.931Z"}
{"message":"        }","@timestamp":"2022-02-22T03:19:56.931Z"}
{"message":"    }","@timestamp":"2022-02-22T03:19:56.931Z"}
{"message":"}}] ","@timestamp":"2022-02-22T03:19:56.931Z"}

The "msn" field was successfully masked by my config, but as you can see, the order of the lines has changed.

Here is my Logstash configuration.

input {
    file {
        mode => "tail"
        path => ["/test/app_info.log"]
        sincedb_path => "/home/logstash/output/sincedb/app_info.log"
        start_position => "beginning"
        codec => plain {
            charset => "UTF-8"
        }
    }
}
filter {
    mutate {
        remove_field => ["path", "@version", "host"]
        gsub => [
            "message", "msn:.*", "msn:XXXXX"
        ]
    }  
}
output {
    file {
        path => "/test/logstash/output/test_%{+YYYYMMdd}.log"
    }
}

If I omit the gsub, the lines come out in the same order as in the original log, so I assume the reordering is caused by gsub.

Does anyone know how to output the filtered log in the correct order?


Solution

If you want to preserve the order of events in Logstash, you have to set pipeline.workers to 1, so that there is only a single worker thread, and also set pipeline.ordered to true. Both of these can be set in logstash.yml, and both are described in the Logstash documentation for that settings file.
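
As a minimal sketch, the two settings named above could look like this in logstash.yml. The setting names come from the answer; the comments are added for illustration, and Logstash has to be restarted for the change to take effect.

```yaml
# logstash.yml
# Use a single worker thread so events are not processed in parallel.
pipeline.workers: 1
# Ask Logstash to preserve the order in which events were read.
pipeline.ordered: true
```

Running with a single worker limits throughput, so this is a trade-off that mostly suits pipelines where line order matters more than speed, such as the multi-line log in the question.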

The details of the filters can affect whether order happens to be preserved, but Logstash does not guarantee ordering unless pipeline.ordered is in effect. The fact that the order is kept when you remove the gsub is an implementation detail: the Logstash developers are free to change the codebase so that events are reordered even without the gsub.



Answered By - Badger
Answer Checked By - Mary Flores (WPSolving Volunteer)