
Comments (5)

boring-cyborg commented on September 24, 2024

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

from airflow.

Taragolis commented on September 24, 2024

This happens because of log deduplication, which can kick in when logs are streamed from remote logging:

def _interleave_logs(*logs):
    records = []
    for log in logs:
        records.extend(_parse_timestamps_in_log_file(log.splitlines()))
    last = None
    for _, _, v in sorted(
        records, key=lambda x: (x[0], x[1]) if x[0] else (pendulum.datetime(2000, 1, 1), x[1])
    ):
        if v != last:  # dedupe
            yield v
            last = v
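
To make the effect of the `# dedupe` check concrete, here is a minimal stand-in that keeps only that part of the logic. It is a sketch, not Airflow code: the helper name `dedupe_consecutive` and the sample lines are made up for illustration, and the timestamp parsing and sorting are left out on purpose.

```python
def dedupe_consecutive(lines):
    """Yield lines, skipping any line identical to the one just before it."""
    last = None
    for line in lines:
        if line != last:
            yield line
            last = line


# Hypothetical container output where the same line is legitimately printed twice.
container_output = ["step 1", "ok", "ok", "step 2", "ok"]
deduped = list(dedupe_consecutive(container_output))
print(deduped)
```

The second consecutive "ok" is dropped even though the program really printed it twice, which is the kind of loss described in this issue.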


Zoynels commented on September 24, 2024

As I understand it, the main problem is log.splitlines(), which splits the log string into individual lines rather than into log messages. The function then analyzes and deduplicates line by line, but what we need is to analyze and deduplicate whole log messages.
Because Airflow can be configured with a custom log format, we would need to store a pattern (or custom patterns) in the config to split the whole log into log messages.
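
A rough sketch of that idea, under loud assumptions: the `MESSAGE_START` pattern, the helper names, and the sample log are all hypothetical, and nothing here is an existing Airflow setting or API. Lines that do not match the message-start pattern are treated as continuations of the previous message, and deduplication is then applied per message instead of per line.

```python
import re

# Hypothetical pattern: a new message starts with "[YYYY-MM-DD".
# A real deployment would have to derive this from its configured log format.
MESSAGE_START = re.compile(r"^\[\d{4}-\d{2}-\d{2}")


def split_into_messages(log_text, start_pattern=MESSAGE_START):
    """Group physical lines into logical messages using a start-of-message pattern."""
    messages = []
    for line in log_text.splitlines():
        if start_pattern.match(line) or not messages:
            messages.append(line)          # start of a new message
        else:
            messages[-1] += "\n" + line    # continuation of the previous message
    return messages


def dedupe_messages(messages):
    """Drop a message only if it equals the message immediately before it."""
    last = None
    for message in messages:
        if message != last:
            yield message
            last = message


sample = (
    "[2024-09-24 10:00:00] INFO - container output:\n"
    "ok\n"
    "ok\n"
    "[2024-09-24 10:00:01] INFO - done\n"
    "[2024-09-24 10:00:01] INFO - done\n"
)
messages = list(dedupe_messages(split_into_messages(sample)))
```

With this grouping the two "ok" lines survive because they belong to one multi-line message, while the fully repeated "done" message is still collapsed.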


Taragolis commented on September 24, 2024

If you have a suggestion for how to improve logging, feel free to raise a PR that works with every kind of existing logger and introduces no breaking changes.


jscheffl commented on September 24, 2024

I have run into the same thing multiple times, for example when using DockerOperator, which logs all stdout of the container upon failure. The logs are messed up not only because of the split lines but also because the file log handler tries to sort messages by default. This not only causes a lot of overhead on the server, it also changes the order and causes confusion.

Looking forward to somebody raising a PR that allows log sorting and merging to be turned off :-)

