
Comments (6)

cosmo0920 commented on June 16, 2024

> However, I found an error "dump an error event: error_class=ArgumentError error="invalid byte sequence in UTF-8" location="/usr/lib/ruby/gems/2.5.0/gems/fluent-plugin-concat-2.4.0/lib/fluent/plugin/filter_concat.rb:291:in `match'" recently.
> I have added the replace_invalid_sequence but no luck. Please advise. Thank you!!

This parameter should be added to the filter parser plugin configuration, not the filter concat plugin.

https://docs.fluentd.org/filter/parser#replace_invalid_sequence
Setting replace_invalid_sequence to true should handle invalid byte sequences in UTF-8 or other encodings.

from fluent-plugin-concat.

chikinchoi commented on June 16, 2024

Hi @cosmo0920 ,

I understand that replace_invalid_sequence should be added to the filter parser plugin. I saw that there are several parser plugins, e.g. "json", "csv", "multiline". However, I don't need to parse the data into another format in the concat filter. May I know how to use replace_invalid_sequence with the concat filter?
Thank you.

<filter **firelens**>
  @type concat
  key log
  multiline_start_regexp '^\{\\"@timestamp'
  multiline_end_regexp '/\}/'
  separator ""
  flush_interval 1
  timeout_label @NORMAL
</filter>


chikinchoi commented on June 16, 2024

Hi @cosmo0920 ,

I think that there is a mutual exclusion in this case. I have considered the below solution to fix the "docker has split over multiple lines due to its 16KB line limit" issue and also the "invalid byte sequence in UTF-8" issue.

According to [1], events pass through the filters in the order they appear in the configuration, from top to bottom. Therefore, if I place the concat filter first, it triggers the "invalid byte sequence in UTF-8" issue, because "replace_invalid_sequence" is only applied later, in the parser filter. If I place the parser filter first, it triggers the "docker has split over multiple lines due to its 16KB line limit" issue, because the "key" field ("log") in some records is not a complete log, having been split across multiple lines.
Could you please add a new feature which is to add a new parameter replace_invalid_sequence into the concat plugin or suggest another solution to fix this mutual exclusion? Thank you very much!

<filter **firelens**>
  @type concat
  key log
  multiline_start_regexp '^\{\\"@timestamp'
  multiline_end_regexp '/\}/'
  separator ""
  flush_interval 1
  timeout_label @NORMAL
</filter>

<filter **firelens**>
  @type parser
  key_name log
  reserve_data true
  replace_invalid_sequence true
  emit_invalid_record_to_error false
  <parse>
  @type json
  </parse>
</filter>

[1] https://docs.fluentd.org/filter


cosmo0920 commented on June 16, 2024

> Could you please add a new feature which is to add a new parameter replace_invalid_sequence into the concat plugin or suggest another solution to fix this mutual exclusion? Thank you very much!

We won't add replace_invalid_sequence to the filter concat plugin.
In the Fluentd world, one plugin should have one functionality.
A monolithic plugin does not follow the Fluentd design concept.

Instead, how about using fluent-plugin-string-scrub to scrub invalid byte sequences?


chikinchoi commented on June 16, 2024

Hi @cosmo0920 ,

Thank you for your suggestion.
I added the string_scrub filter with the config below, and the invalid byte sequence issue is gone.

<filter **>
  @type string_scrub
  replace_char ?
</filter>

However, I don't really understand this string_scrub plugin. May I know what replace_char ? is used for?
Could I have some example input and the output after the filter is applied? Thank you very much!!


cosmo0920 commented on June 16, 2024

replace_char is used as the replacement argument of Ruby's String#scrub!: https://ruby-doc.org/core-2.4.0/String.html#method-i-scrub-21
The invalid byte sequence issue is solved. Closing.
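
For the example input and output asked about above, here is a minimal sketch in plain Ruby of what scrubbing with a replacement character does. It calls String#scrub directly rather than going through the string_scrub plugin, and the sample string and variable names are made up for illustration, so treat it as an approximation of the plugin's effect rather than its actual code.

```ruby
# Hypothetical log line containing 0xFF, a byte that never occurs in valid UTF-8.
raw = "caf\xFF latte".dup.force_encoding("UTF-8")
puts raw.valid_encoding?        # false

# Regexp matching against an invalid string is the kind of call that failed
# in filter_concat.rb. Capture the error message instead of crashing.
match_error = begin
  raw.match(/latte/)
  nil
rescue ArgumentError => e
  e.message
end
puts match_error.inspect

# Equivalent in spirit to `replace_char ?`: each invalid byte becomes "?".
scrubbed = raw.scrub("?")
puts scrubbed                   # caf? latte
puts scrubbed.valid_encoding?   # true
puts scrubbed.match?(/latte/)   # true

# With no argument, scrub uses U+FFFD (the Unicode replacement character)
# for UTF-8 strings.
default_scrubbed = raw.scrub
puts default_scrubbed == "caf\uFFFD latte"   # true
```

Because the scrubbed string is valid UTF-8, later regexp-based filters such as concat can match against it without raising.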

