
Comments (7)

alekc commented on June 23, 2024

Added an additional point about vendoring.

ctdk commented on June 23, 2024

I apologize for this response taking so long - I worked on it for a long time yesterday, until my left arm started both going numb and hurting a great deal. :-( There are many good points here. Here are my thoughts, along with some extra information:

  • To get this out of the way, some of these items are addressed in the various 1.0.0 branches. However, you are not wrong about the age of the project and maintaining backwards compatibility. Coupled with the fact that I originally started this to learn Go in the first place and was learning as I went along, there are definitely some weird things going on as you mentioned. It does sometimes feel like it's burdened with code that ought to be rewritten.
  • Chef has historically been extremely lenient about what you can put in a cookbook. Goiardi is more strict than Chef Server in many ways, especially with metadata, but many of the tests in the various chef-pedant suites test for acceptance (or rejection) of different malformed cookbooks.

To address some specific points:

  • 1.0.0 is using gorilla/mux for HTTP handling and middleware. I need to check out echo to compare, but if it has significant advantages, hopefully the switch wouldn't be too difficult since the general concept is in place. It's a big part of why the 1.0.0-dev goiardi completes more tests in a much shorter amount of time than 0.11.x (something like 3.5-6 minutes vs. 12+ minutes). That said, I set that up a long time ago.
  • Additionally many of the handlers have been redone from how they look in master. I won't claim they're perfect, but they should be much better.
  • I have no objections to structured logging; there's some method to the madness, but it's not consistent at all. Mostly it just didn't occur to me initially. I've started a related but different process to improve the logging over there by using the new and improved error wrapping functionality they added to the language recently. Currently I have it using the compat package, though.
  • Opencensus looks very interesting. Goiardi does export some metrics to statsd, if enabled, but while the statsd client isn't handrolled the actual sending of the metrics is.
  • I'll take a look at viper as well. I went with toml for the config files because yaml was aggravating me in an unrelated project I was working on, as I recall. I thought toml was easier to deal with than yaml, but I also just used it via BurntSushi/toml and never actually dug in deep to how it works behind the scenes. I will say that occasionally weird cases would pop up with toml having trouble with some postgres related config items.
  • The Data and File layers have improved greatly in 1.0.0, and those conditionals should have been all moved out from the controller. (NB: "should" may not mean "was" here, of course.) That said, I'm sure there's room for improvement; I'll address some of that below.
  • The anonymous JSON vs. proper structs is there as a combination of much of that code having been written early in my Go journey before I got a better grip on JSON decoding, and being very cautious. (Encoding was never too troublesome, though.) I'll do a chef-pedant run and see what examples I can get of malformed input. As I recall there was some trouble with invalid items in the request body JSON just going away if there was a problem. There's a healthy chunk of code present for validating the objects currently in place; some of it might just be unnecessary, but being able to send back the proper error messages was also a consideration. More on this further down.
  • Oddly enough there's already a place where something like that happens, but only in the in-memory side. It's another thing I did quite a while ago, though, so I'll need to dig that up to see why. It might have been an action-at-a-distance fix.
  • In regards to data storage: MySQL's already been dropped in 1.0.0-dev, because Postgres both has more useful features and because there's only so much of me to go around. The local datastore is definitely not the best solution now and hasn't really been once it moved past being a learning project, if that makes sense. I went with implementing a key/value store inside because it was easier initially, and then was leery of breaking backwards compatibility. I've definitely thought about sqlite for goiardi. One thing that did present a bit of a stumbling block when I first thought about using sqlite was trouble cross compiling goiardi because of the cgo requirement. That being said, it can be worked around (or just not worry about providing the more esoteric binaries like for NetBSD or linux/s390x).
  • I'm a little more loath to change the db migrations, since the sqitch-managed setup it's currently using does work pretty darn well. It does require an external tool, though, so I'm open to suggestions.

Testing and goiardi have an interesting history. I've tried to add go tests where it's practical, and there's more in the 1.0.0 branches, but those haven't always worked as well as I would have hoped. Since it's available, I also rely heavily on the various versions of chef-pedant (for 11.x, the old standalone 12.x oc-chef-pedant, and now I'm dipping my toes into the current oc-chef-pedant in the chef-server repo).

The pedant suite does provide a useful way to test every aspect of the server, in ways that don't work as well when testing internally, and it has exposed countless problems. Unfortunately, it does have some downsides. It's pedantic about error messages, so there's been a lot of work making the error messages fit what it expects them to be. There are also a lot of tests in place to make sure the server will still work with ancient cookbooks, but this may be better with the more recent test suite.

Some tests have also needed to be changed to fit what I want goiardi's behavior to be. This is especially the case with authentication, which is much more far-reaching than you'd expect. Basically, goiardi does its authentication checks first, with allowances made for endpoints that don't require authentication, and then moves on to the request handlers. The official Chef server does it the other way around, so pedant will flip out because it expects a 404, 405, etc. instead of a 401 or 403 error. I think it's better to do it the way I did, so I went and changed the tests. There are also a handful of tests that just plain don't work for utterly trivial reasons, but where exactly the failures come from is kind of hard to pin down. The tests in question abuse the metadata by doing things like making the description an array or setting a metadata entry to nil. Chef sweeps these kinds of issues under the carpet because formerly all that information was stored as JSON documents in CouchDB, and now it's gzipped JSON blobs in Postgres. Since goiardi uses proper database tables, which I stand by, translating back and forth is a bit more complicated.

All that said, there's a considerable amount of overhead with the various pedants too. It has a lot of dependencies and can be pretty finicky. There are also many tests that are less relevant to non-Chef Software Inc. users, tests for internal endpoints that goiardi doesn't use, and tests for old behavior and weird JSON abuse you shouldn't actually be able to do. Wading through all the rspec is itself an ordeal, so I'm open to alternate suggestions for at least normal testing.

One thing I've been thinking about lately is how 1.0.0 is basically ready and has been for a while. The holdup's been updating the documentation and testing out Policyfiles. I also usually did most of my goiardi work commuting to and from work, which unsurprisingly is no longer the case. The ACL work was also extremely draining and is one of two big reasons 1.0.0 development stalled (the other being that I didn't have an actual use case for organizations for a long time). Getting the ACLs to work in the way Chef Server's do was both arduous and full of false starts. I even ended up throwing out a few years of work and starting over last year. Ironically, actually mucking about with the ACLs seems to be discouraged now.

Basically, I'm thinking that finally letting go, and not trying to catch up to the absolute latest version of Chef before putting out 1.0.0, is the way to go: instead of devoting energy to battling with the latest pedant and its changes to get the current code ready, I'd merge in those changes you've made lately (and thank you for those, btw), and then do this stuff. What do you think?

ctdk commented on June 23, 2024

I may be overlooking it, but where's the vendoring point?

alekc commented on June 23, 2024

I may be overlooking it, but where's the vendoring point?

No idea where it went :| My point was: let's abandon vendoring and go completely with go.mod. As far as I can see, we don't use any private dependencies, so go.mod can cover everything the vendor directory currently does.

I need to check out echo to compare, but if it has significant advantages hopefully the switch wouldn't be too difficult since the general concept is in place

Yes, it should not be too much of an issue passing from one to the other. Tbh, gorilla/mux is already much better compared to what was there previously, but I still prefer the syntax/functionality of echo a bit more 😉 Also, response headers (such as JSON content types) are set automatically based on the return type, and it's a bit easier to catch the output before it's emitted (e.g. for logs).

Additionally many of the handlers have been redone from how they look in master.

Yes, I can see them in the 1.0.0 branch; it looks much better indeed. I suppose in the end we would have all handlers structured into specific methods like you did for clients?

	s.HandleFunc("/clients", clientListHandler).Methods("GET")
	s.HandleFunc("/clients", clientCreateHandler).Methods("POST")

Goiardi does export some metrics to statsd

Yes, we amended statsd reporting (to include some additional bits) in one of our commits (you can find it in the PR related to the memory leak). OpenCensus would permit us/consumers to tackle the following points:

  • decide whether we want to pull (Prometheus) or push (statsd) metrics based on the existing monitoring stack
  • also implement tracing

In regards to data storage: MySQL's already been dropped in 1.0.0-dev, because Postgres both has more useful features and because there's only so much of me to go around.

Makes sense. I would say that once we have a properly defined interface for the data layer, if someone needs MySQL it should be fairly trivial to bring it back without any big changes to the main code structure.

The local datastore is definitely not the best solution now and hasn't really been once it moved past being a learning project, if that makes sense. I went with implementing a key/value store inside because it was easier initially, and then was leery of breaking backwards compatibility. I've definitely thought about sqlite for goiardi. One thing that did present a bit of a stumbling block when I first thought about using sqlite was trouble cross compiling goiardi because of the cgo requirement. That being said, it can be worked around (or just not worry about providing the more esoteric binaries like for NetBSD or linux/s390x).

Several points:

  • Tbh, I wouldn't worry too much about exotic binaries; I don't think there would be much demand for setting up goiardi in production on a "weird" OS. It would (most likely) just be Linux x86-64/Windows.
  • If we still want to provide local file storage (which, tbh, I don't know how much value holds nowadays), we can also think about using something like https://github.com/boltdb/bolt or leaving the current key-value storage (I have not looked into it yet, so I can't say a lot about it). But generally speaking, I think the higher priority should go to the PostgreSQL solution and the actual release; then, IF there is demand for local storage, see what can be done.

I'm a little more loath to change the db migrations, since the sqitch-managed setup it's currently using does work pretty darn well. It does require an external tool, though, so I'm open to suggestions.

For our scenario (running in Kubernetes) db init/upgrade was a bit of a sore point (in the end we resorted to using a pgsql init script and shipping off the dump file). It didn't help that sqitch is Perl-based, so it requires a specific Perl environment and prior knowledge of Perl. But it's not a deal breaker for now, and can be tackled later on.

Validations, json, chef inspec, etc.

While rewriting the cookbook handling part, I noticed that once I had converted the anonymous JSON to a proper struct, I could delete almost all of the validation logic because it was taken care of by the JSON unmarshalling itself (e.g. you cannot place a string inside a numeric field, and a null value for an array just leaves the slice empty). So I would be very tempted to define our structs around valid requests and, in case invalid JSON arrives in a request, just refuse it with a StatusBadRequest and a reason in the body/logs.
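
A minimal sketch of that approach with stdlib encoding/json (the cookbookMeta struct is a hypothetical, trimmed-down stand-in for goiardi's real metadata types):

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// cookbookMeta is a hypothetical, trimmed-down metadata struct. Decoding
// into typed fields rejects type mismatches (a number where a string
// belongs) with no hand-rolled checks, and DisallowUnknownFields refuses
// stray keys instead of silently dropping them.
type cookbookMeta struct {
	Name    string   `json:"name"`
	Version string   `json:"version"`
	Deps    []string `json:"deps"`
}

func decodeMeta(body []byte) (*cookbookMeta, error) {
	dec := json.NewDecoder(bytes.NewReader(body))
	dec.DisallowUnknownFields()
	var m cookbookMeta
	if err := dec.Decode(&m); err != nil {
		return nil, err // caller maps this to 400 Bad Request
	}
	return &m, nil
}

func main() {
	_, err := decodeMeta([]byte(`{"name":"apt","version":1.0}`)) // number, not string
	fmt.Println(err != nil)                                      // true

	_, err = decodeMeta([]byte(`{"name":"apt","bogus":true}`)) // unknown key
	fmt.Println(err != nil)                                    // true
}
```

The decode error text also gives a usable reason to echo back in the 400 body or the logs.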

This brings me to another point, which I expressed previously in one of our MRs: chef-pedant. Should we rely on it as heavily as we do now? The pedant suite is a tool to test Chef Server, but for something like goiardi, I think the first priority should be functionality from the client's point of view (as in using cinc-client, knife, chef-client). Like what happened with knife user upload, for example (#75), where existing logic (presumably taken from chef-pedant?) was breaking the actual end-user experience and preventing them from making a valid request. I don't think there is a lot of value in goiardi returning error messages in exactly the same way as the official Chef server does (if they're not used for validation logic inside chef-client/knife).

I think the best course of action for testing (besides unit testing for specific functions) would be to have a set of valid requests (i.e. cookbook uploads, user creation, deletion, etc.) and run them through goiardi's handlers. I would imagine them working in the following fashion:

Basically, I'm thinking that finally letting go, and not trying to catch up to the absolute latest version of Chef before putting out 1.0.0, is the way to go: instead of devoting energy to battling with the latest pedant and its changes to get the current code ready, I'd merge in those changes you've made lately (and thank you for those, btw), and then do this stuff. What do you think?

Probably trying to catch up to the latest version would not be necessary, tbh. What's the current situation with the ACLs and orgs? Are they finished?
If you are not against restructuring the whole goiardi codebase like I outlined above, it might be better to wait a bit and release the refactored (/rewritten) version; after all, we can't ship two major versions with potentially breaking changes in a short period of time.

alekc commented on June 23, 2024

Oh, and by the Data interface I meant something like this:

type DataLayer interface {
	BeginTransaction() error
	Commit() error
	Rollback() error
	GetCookbook(ctx context.Context, name string) (cookbook.Cookbook, error)
	UpdateCookbook(ctx context.Context, cb cookbook.Cookbook) error
}

where BeginTransaction/Commit/Rollback, in the in-memory case for example, would probably just be dummy no-op functions

Also added a point about https://github.com/golangci/golangci-lint

ctdk commented on June 23, 2024

It sounds like we're in general agreement, then. First, I'd like to address a few of your points:

  • I'm not passionate about gorilla/mux, to be clear. Someone suggested it to me years ago, is all, and it was light years better than what I had cobbled together. The current handler situation isn't entirely consistent, either, given how long I've been working on 1.0.0 in fits and starts and changing ideas of how they should be organized. It might even be a good time to further regularize those.
  • Sounds good with metrics. Metrics are good, after all, and I've been at a little bit of an impasse with the best ways to put out per-org metrics; I cooked up something for that in 1.0.0-dev, but I'm on the fence as to how good of a solution it is. Also working with more metric backends is a good thing, I think. Also also, tracing.
  • In regards to local file storage, I've been told that a common use case for goiardi is as a test server for cookbook development and such. Then again, there's a lot of overhead, cognitive and otherwise, keeping things the way they are. The common availability of docker containers would help as well. More on this below.
  • In re: chef pedant and its various versions, it does have some value as a way to compare the behavior knife, chef-client, etc. expect from a server and to exercise more weird corner cases. I'll admit that many of the error messages have given me a lot of grief over the years, though, and as I mentioned previously there are a lot of tests covering chef 0.10 and before behavior that's not really relevant anymore. So... it depends. What might be even more useful, I just realized, is a pedant suite that's both cut down (to avoid unnecessary tests) and modified further than I had before to deal with error message mismatches and such.

All that said, not relying on chef-pedant for normal testing would be pretty nice.

  • ACLs and organizations are finished, yes. I got Policyfiles in as well, but haven't gotten those thoroughly tested yet (which is a big reason I was working on getting goiardi to work with that latest chef-pedant from chef-server).

Additionally:

  • Relying on postgres-in-docker as a local datastore replacement, which has a definite appeal, does make the sqitch issue more pressing.
  • In regards to vendoring vs. go.mod, the current situation isn't intended to be permanent. I set it up that way to be a temporary measure and plan for it to go away. Actually it looks like now or shortly after would be a good time anyway, since go 1.15 just came out.
  • I like the data layer interface idea.
  • I've used go lint in the past, but didn't find it especially useful after knocking out the worst transgressions since it had a lot of false positives. Linters also make me a bit twitchy in the best of circumstances, but I'd be surprised if the golang ones could possibly fill me with as much rage as e.g. rubocop ("What do you MEAN, I shouldn't use $$?"). I'll take it for a spin real quick.

And finally, I suppose there's not really a reason to cut a release of the current 1.0.0 work right now. I was thinking about making one as a checkpoint, as it were, but taking the time to make it better will pay off in the end.

Thank you for all your work and ideas that you've had so far, by the way. I apologize for being kind of slow on this lately.

alekc commented on June 23, 2024

No worries :)

For the next couple of weeks or so I will be busy with deploying the current version of goiardi to prod, once that's over and we are stable I will be able to slowly work on the new implementation. I will post updates once I have something usable.

Leaving this open for further discussion.
