Comments (9)

ctdk commented on July 16, 2024

You know, that's a very good question. There's already a setting to delete log entries, so similar ones for node statuses and reports are totally reasonable (I've even been bitten by the reports filling everything up, but like you I just set up a cron job to clear them out). I'll get that added before sending 0.11.6 out. With sandboxes I'm not quite sure, but I'll check that out too. I have a feeling that they aren't useful for very long at all, but I haven't done much in there for a while.

from goiardi.

julian7 commented on July 16, 2024

👍 I just realized the TOAST size of log_infos is about 55GB, which wins the gold medal over reports, at just 501MB, and search_items, at a mere 379MB. The node has been running for 15 months.

ctdk commented on July 16, 2024

Urk. I've added some new options for purging that kind of data, but they still need testing. (Once again I've had stuff come up that needed to be dealt with, plus I want to get that cookbook issue mentioned elsewhere out of the way, and I've been dragging my feet on writing a proper go test test for it, because creating a cookbook that way is awful.)

I was thinking about this issue again though after seeing this comment (really) for the last few days, and while log_infos needs some pretty serious refactoring I noticed one thing that might help right off the bat. When I wrote that feature I don't think I realized how much extraneous information it would create, especially for people who run chef as a cron job, and I didn't think about it much afterwards (so thanks for bringing it up). Anyway, along with periodic purging it looks like I maybe should have at the very least set log_infos up to store the data differently: I set the storage type in Postgres to EXTERNAL (out-of-line storage, no compression), but it may have been better to use EXTENDED (which also allows compression) for this.

If either of you has a table with this data handy, would you mind making a copy of the table (presumably of a subset of the data) and seeing whether altering the storage to EXTENDED makes a difference? I'll try it out too, but it may take a little while to generate some data for it.
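
One way to run that comparison, as a rough sketch rather than anything goiardi ships: the hypothetical helper below only builds the SQL statements, so they can be reviewed before being pasted into psql. It assumes the column is log_infos.extended_info (the name used later in this thread); the copy table names and the sample size are made up for illustration. The second copy is altered before any rows are loaded because SET STORAGE only applies to values written after the change.

```go
package main

import "fmt"

// storageExperiment returns SQL statements that load the same sample of rows
// into two copies of a table, one keeping the original storage setting and one
// switched to EXTENDED, and then compare their on-disk sizes.
// The helper and the "_storage_*" table names are hypothetical.
func storageExperiment(table, column string, sampleRows int) []string {
	baseline := table + "_storage_baseline"
	extended := table + "_storage_extended"
	return []string{
		// Baseline copy: same columns and the same per-column storage settings.
		fmt.Sprintf("CREATE TABLE %s (LIKE %s INCLUDING STORAGE);", baseline, table),
		fmt.Sprintf("INSERT INTO %s SELECT * FROM %s LIMIT %d;", baseline, table, sampleRows),
		// Second copy: switch the column to EXTENDED before loading any rows,
		// since SET STORAGE does not rewrite rows that are already stored.
		fmt.Sprintf("CREATE TABLE %s (LIKE %s INCLUDING STORAGE);", extended, table),
		fmt.Sprintf("ALTER TABLE %s ALTER COLUMN %s SET STORAGE EXTENDED;", extended, column),
		// Load from the baseline copy so both tables hold exactly the same rows.
		fmt.Sprintf("INSERT INTO %s SELECT * FROM %s;", extended, baseline),
		// Total size includes the TOAST table, which is where the bulk lives.
		fmt.Sprintf("SELECT pg_size_pretty(pg_total_relation_size('%s')) AS baseline, pg_size_pretty(pg_total_relation_size('%s')) AS extended;", baseline, extended),
	}
}

func main() {
	for _, stmt := range storageExperiment("log_infos", "extended_info", 10000) {
		fmt.Println(stmt)
	}
}
```

If the table lives in a non-default schema, pass the schema-qualified name as the first argument. Both copies can be dropped once the sizes have been compared.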

julian7 commented on July 16, 2024

TBH I'm not interested in keeping run results for eternity. What I'm interested in is having the whole infra backed up periodically, and so far the goiardi database backup takes the most resources.

Nevertheless, changing it to EXTENDED makes sense.

I've changed log_infos.extended_info to use EXTENDED storage, and I'll give it a week to collect some more data.

ctdk commented on July 16, 2024

@julian7 This is a little embarrassing, but I went to look at time-based log_infos purging, and realized I had in fact set up an optional argument for purging entries when I originally added the feature (this is what I get for adding things I don't always use). It's -K or --log-event-keep, log-event-keep in the config file, or $GOIARDI_LOG_EVENT_KEEP as an environment variable.
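
For anyone wondering what a "keep" setting like this amounts to, here is a minimal sketch of count-based pruning. It is not goiardi's actual code: it assumes the value means "retain this many of the most recent entries", which this comment does not spell out, and the logEvent type and pruneToNewest function are hypothetical names.

```go
package main

import "fmt"

// logEvent is a hypothetical stand-in for one event log entry.
type logEvent struct {
	ID     int
	Action string
}

// pruneToNewest keeps at most keep entries from events, which is assumed to be
// ordered oldest first, and reports how many entries were dropped.
// A keep value of zero or less is treated here as "purging disabled".
func pruneToNewest(events []logEvent, keep int) ([]logEvent, int) {
	if keep <= 0 || len(events) <= keep {
		return events, 0
	}
	dropped := len(events) - keep
	// Re-slice so that only the newest keep entries remain.
	return events[dropped:], dropped
}

func main() {
	events := []logEvent{{1, "create"}, {2, "modify"}, {3, "modify"}, {4, "delete"}}
	kept, dropped := pruneToNewest(events, 2)
	fmt.Printf("kept %d entries, dropped %d, oldest remaining ID is %d\n", len(kept), dropped, kept[0].ID)
}
```

With a database backend the same idea would be a periodic DELETE of everything older than the newest N rows instead of a slice operation.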

That said, those log_infos entries can still take up way too much space. Using EXTENDED helps a bit, but I'm looking at good ways to make that better.

rmoriz commented on July 16, 2024

GOIARDI_LOG_EVENT_KEEP does not purge node_statuses and sandboxes for me with a PG-based setup. I also wonder if this can be the reason for the memory "leak" when using the go-based data store.

ctdk commented on July 16, 2024

The wheels turn slowly, but they've started turning again. I've pushed up another prerelease with a simplified tack at tackling the log info memory usage (optional skipping of recording that information) after backing off of more complicated ideas like storing diffs, at least for now. The node status and report purging is actually being started now too.

It needs more testing, but so far it looks good.

ctdk commented on July 16, 2024

The node statuses and reports should be dealt with now with the latest release (yay). I haven't been able to get an answer on the sandbox issue yet, though, so for now I'm leaving that be. Closing this out for now, but I'll keep the sandbox cleaning on my mind.

julian7 commented on July 16, 2024

@ctdk you had more time than I have :) I'll look into this as soon as I can.
