Code Monkey home page Code Monkey logo

Comments (15)

joto avatar joto commented on June 12, 2024

I have seen the same issues in logs, too. Interestingly enough I came to the opposite conclusion: Maybe we have to do the VACUUM ourselves also instead of relying on the autovacuum to do its jobs whenever it feels like doing so. The problem from my experience is that working with a table that has not been analyzed can mean PostgreSQL will choose a disastrous plan and that's something we can not risk. I know that eventually PostgreSQL will probably get around to doing the ANALYZE automatically, but "eventually" is not something I can work with.

And there is another issue behind this. Why is the vacuum even necessary? We created that table and we fill it with COPY. If I understand the docs correctly, we should be able to do a COPY FREEZE which means we don't need the vacuum, doesn't it? Currently we can not do this easily because of the way the code is structured (table creation and copy is run in different threads), but maybe this would be a better solution.

from osm2pgsql.

anayrat avatar anayrat commented on June 12, 2024

Usually, t is not the application's job to worry about how the RDBMS clean the tables. The default behavior should be "let postgres do the job" and the "analyze by the application" should be the exception. Not the opposite.

The DBA can control aggressiveness of autovacuum. If you perform analyze or vacuum in the application, how can we control this? I know there are parameters to control that, but this implies to be able to change them.

From what I can see, new tuples are inserted or deleted. So vacuum is needed to clean old rows or update hint bits. You can only perform COPY FREEZE if "the table being loaded has been created or truncated in the current subtransaction"

from osm2pgsql.

joto avatar joto commented on June 12, 2024

Usually, t is not the application's job to worry about how the RDBMS clean the tables.

Yeah, I know. Unfortunately our use case seems sufficiently unusual that we are constantly coming up against these kinds of problems where the database isn't magically doing the right thing.

From what I can see, new tuples are inserted or deleted.

Only in "append" mode, i.e. when updating the data. And in that case osm2pgsql should not run ANALYZE. In "create" mode, ANALYZE is run explicitly. Are you seing those explicit ANALYZEs in "append" mode?

from osm2pgsql.

joto avatar joto commented on June 12, 2024

Looking more closely at your log snippet I am seeing something strange. The autovacuum seems to be working on "planet_osm_ways_nodes_bucket_idx", but that index is only created after the ANALYZE "planet_osm_ways" it seems to be blocking on. So that should not happen.

What version of osm2pgsql are you running?

from osm2pgsql.

anayrat avatar anayrat commented on June 12, 2024

We use version 1.4.1, I ask someone to answer other questions :)
And yes, I see explicit ANALYZE (see logs above).

from osm2pgsql.

anayrat avatar anayrat commented on June 12, 2024

Oh, I see it is an old version, and osm2pgsql's behavior has changed in 1.5.2 around analyze.

from osm2pgsql.

anayrat avatar anayrat commented on June 12, 2024

Just FTR, we updated using backport (debian bulleyes). The problem is fixed. Furthermore, queries are faster. Thanks for yours answers, and sorry for the noise.

from osm2pgsql.

pnorman avatar pnorman commented on June 12, 2024

PostgreSQL's autoanalyze works okay normally, but because we know what the upcoming usage of the table will be, we can do better by forcing it manually at appropriate times.

from osm2pgsql.

anayrat avatar anayrat commented on June 12, 2024

Yes, but you have to keep in mind the analyze can prevent vacuum.

from osm2pgsql.

pnorman avatar pnorman commented on June 12, 2024

That's fine, at no point during the import should vacuum be run. There's no need to vacuum the tables in the processing or phase 2 parts of the import, because they get rewritten in the postprocessing.

from osm2pgsql.

joto avatar joto commented on June 12, 2024

There's no need to vacuum the tables in the processing or phase 2 parts of the import, because they get rewritten in the postprocessing.

That's only true for the output tables, the middle tables don't get rewritten.

from osm2pgsql.

pnorman avatar pnorman commented on June 12, 2024

But the middle tables are append-only during import, so there's no dead tuples.

from osm2pgsql.

anayrat avatar anayrat commented on June 12, 2024

But append-only table also require vacuum (update hint bits, freeze ...). That's why, since pg13, autovacuum can trigger on append-only workload.

from osm2pgsql.

rustprooflabs avatar rustprooflabs commented on June 12, 2024

The error initially reported was just a timeout of one attempt by autovacuum. The daemon lives on and will do its job, it'll come back to any tables that need vacuuming at another time.

from osm2pgsql.

anayrat avatar anayrat commented on June 12, 2024

Not at all. In our case, the daemon was cancelled by analyze. Then, it was restarted, but it took a long time to vacuum a big table and before the next vacuum finish, it was again cancelled by analyze.

from osm2pgsql.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.