thought-machine / please
High-performance extensible build system for reproducible multi-language builds.
Home Page: https://please.build
License: Apache License 2.0
Root cause is the upstream bug golang/go#6329. It affects us disproportionately though because we always build tests with coverage annotations.
We have internal rules for building JS. It's not so obvious how best to separate them from our private repo, but it would be super nice to have them included.
Extension to #8: add the ability to identify system-level tools and add them as dependencies, so that changes to them invalidate other outputs. For example protoc: if someone installs a new version, we'd know it was different and regenerate all generated code correctly.
e.g. plz clean //package:some_cc_rule should clean the associated #a and #hdr rules as well.
We have some situations, like testing in JavaScript, where some kind of debug flag would allow a more interactive debugging session. For example, with Karma we can launch the test in a browser and set breakpoints, load sourcemaps, etc., but we only want that for local development, so it shouldn't be the default behaviour.
On python_binary at least, because the target with the pre-build function doesn't have any dependencies. I think it's still OK on python_test which is maybe why I didn't notice sooner.
Seems we're using cffi in kind of a naive way that invokes a C parser at startup. There are techniques for avoiding this - see e.g. http://cffi.readthedocs.org/en/latest/overview.html (the out-of-line ABI example) which should make things significantly faster at startup. It will be more complex to package it though.
Apparently PyPy is faster to start up and so forth. We need to be able to install PyPy and protobuf-3.0 either via apt or in a non-root-owned directory, though.
From Riccardo:
I had the following BUILD file:
go_library(
    name = 'postprocessing',
    srcs = ['internal_bank_transfers.go'],
    deps = [
        '//common/protos:banking_proto',
        '//common/protos:category_proto',
        '//ml/api/proto:yunus_proto',
        '//third_party/go:logging',
    ],
    visibility = ['//ml/api/...'],
)
go_test(
    name = 'internal_bank_transfers_test',
    srcs = ['internal_bank_transfers_test.go'],
    deps = [
        ':postprocessing',
        '//common/protos:banking_proto',
        '//common/protos:category_proto',
        '//third_party/go:testify',
    ],
)
Turns out that the go_test was failing with the following error:
14:44:19.878 CRITICAL: //ml/api/yunus/postprocessing:internal_bank_transfers_test failed:
Error building target //ml/api/yunus/postprocessing:internal_bank_transfers_test: exit status 1
src/ml/api/yunus/postprocessing/internal_bank_transfers.go:12:2: cannot find package "ml/api/proto/yunus_proto" in any of:
/usr/local/go/src/ml/api/proto/yunus_proto (from $GOROOT)
/home/riccardo/workspace/core3/plz-out/tmp/ml/api/yunus/postprocessing/internal_bank_transfers_test/src/ml/api/proto/yunus_proto (from $GOPATH)
/home/riccardo/workspace/core3/plz-out/tmp/ml/api/yunus/postprocessing/internal_bank_transfers_test/third_party/go/src/ml/api/proto/yunus_proto
until I added the //ml/api/proto:yunus_proto library to the deps of the test.
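For reference, the workaround amounts to duplicating the proto dependency on the test itself. A sketch of the amended go_test (labels as above):

```
go_test(
    name = 'internal_bank_transfers_test',
    srcs = ['internal_bank_transfers_test.go'],
    deps = [
        ':postprocessing',
        '//common/protos:banking_proto',
        '//common/protos:category_proto',
        '//ml/api/proto:yunus_proto',  # duplicated from the library's deps
        '//third_party/go:testify',
    ],
)
```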
When a containerised test specifies dependent binaries that exist at different levels of the directory hierarchy, they are flattened when copied into the container, i.e.
filegroup(
    name = 'services',
    srcs = [
        '//foo:foo_service',
        '//foo/bar:bar_service',
    ],
    link = False,
)
python_test(
    name = 'e2e_test',
    srcs = ['e2e_test.py'],
    data = [
        ':services',
    ],
    container = True,
    labels = ['e2e'],
)
puts foo.pex and bar.pex into the working directory. I would have expected foo.pex to be there, but bar.pex to be in a bar/ subdirectory.
We have internal rules for building & testing Thrax grammars. Integrate them into plz itself.
go_test rebuilds all its transitive inputs at build time which can be slow. Would be nice if it reused previous compiles better.
The biggest problem is that we want coverage and don't know whether the test will be invoked with 'plz cover' or not, so we need build config support for this to work.
I suspect that getting the logic just so for the dependency resolution is going to be a bit of a rabbit hole; currently we know we need to walk back up parent projects to resolve variables but there might be any amount of other things lurking.
It may be an idea to rewrite it in Java so we can reuse existing logic for this stuff. The Braintree guys have written https://github.com/pgr0ss/bazel-deps which might be a good basis for it - looks nice and simple.
I've got this pencilled in for an intern project but that will obviously depend on whether we find anyone appropriate and whether they get assigned to something "more important" (if such a thing could ever exist).
It does some replacement of the input file to determine the output file name, which doesn't work if the input is ":rule". Not obvious how to fix though...
This seems to appear rarely and is apparently PyPy related. I'm pretty sure it happens when something else has already gone wrong, maybe because we panic upwards through a cgo call which I'm sure will be extremely surprising to PyPy.
Not entirely sure how best to solve. Ideal would be never to panic through cgo and report all errors back to get reraised, but it's hard to see how to do that in a generic way.
With the relatively recent introduction of the ability to access the hash of a rule's input sources, we now have the ability (or at least I hope we do) to produce an output file whose name is derived from the input sources. This is handy for files that we want to cache forever, or which are versioned by their contents.
To use this, however, we will need a way to access some information about all input deps by label (so that we can reference their outputs). The most obvious example would be a rule which templates a javascript file into an index.html, with the javascript file being cached forever on some CDN. We could stamp the output js file with its input hash (e.g. 12345.js) and then have an html template that can include it like this:
<script src="//app/frontend/js:compiled" />
The build rule could then template that label value for the hash of the sources (as long as it was a declared dependency).
This is obviously useful for other situations, perhaps the most pressing being the creation of containers and their use in a deployment config file (such as a kubernetes config file). We might want a potential build rule to look like this.
kube_config(
    name = 'my_app',
    config = 'config_template.yaml',
    container_repo = 'https://secret-containers.company.com:1234',
    deps = ['//my_app/backend:container_rule'],
)
This rule will build the deps (which would create versioned containers), then template their container name / version / repo string into the config file.
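The hash-derived naming plus templating step can be sketched as follows (pure illustration: hashed_name and template_config are hypothetical helpers, and sha1 truncated to 12 characters is an arbitrary choice, not what plz would necessarily do):

```python
import hashlib

def hashed_name(src_bytes, ext):
    # Derive a stable output name from the hash of the input sources,
    # so the artifact can be cached forever by content.
    return hashlib.sha1(src_bytes).hexdigest()[:12] + ext

def template_config(template, label, name):
    # Replace a build label placeholder with the hash-derived artifact
    # name, roughly what the proposed rule would do at build time.
    return template.replace(label, name)

js_name = hashed_name(b"console.log('hi');", ".js")
html = template_config('<script src="//app/frontend/js:compiled" />',
                       '//app/frontend/js:compiled', js_name)
```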
The new version of Go has some stricter checks about what can be passed to cgo (see https://github.com/golang/proposal/blob/master/design/12416-cgo-pointers.md). We fail on this because we pass pointers to Go structures through cgo and back again. AFAIK we are not doing anything conceptually wrong because the pointers are opaque to C, but it's against the new rules so we'll have to find some workaround / alternate approach.
Note that targeting go1.6 seems to work just fine, it's only building plz with it that is problematic.
It's not too hard to pass the wrong type to an argument (e.g. string instead of list of strings) and the resulting error messages can be obtuse. Maybe we should verify the type of all arguments earlier to be more explicit.
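A minimal sketch of what earlier verification could look like (check_arg_types and its error format are hypothetical, not plz's actual implementation):

```python
def check_arg_types(rule_name, args, expected):
    # Validate argument types up front so the user gets a clear message
    # instead of an obscure downstream error.
    for name, types in expected.items():
        if name in args and not isinstance(args[name], types):
            raise TypeError("%s: argument '%s' should be %s, not %s" % (
                rule_name, name,
                " or ".join(t.__name__ for t in types),
                type(args[name]).__name__))

# e.g. srcs must be a list/tuple, not a bare string:
check_arg_types('go_library', {'name': 'lib', 'srcs': ['a.go']},
                {'name': (str,), 'srcs': (list, tuple)})
```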
Ability to "build" things that are not in the source tree. Motivating example: Riccardo wants to use descriptor.proto which is installed with protobuf. Historically we've vendorised these things (eg. byte.grm for Thrax) but it'd be nice to be able to include standard things like that without having to vendorise them.
Vague plan is to have a way of identifying 'system' dependencies which plz provides by default. In this case it'd just be a proto_library rule plus whatever's needed to move the files about.
At the moment they're the same, ie. if you do "plz build -c cover //mypackage:mytarget", any tools needed for that rule are compiled with -c cover as well. Ideal would be that they were compiled with -c opt since we don't need them to be slowed down by coverage instrumentation.
Generalisation of this is if we supported cross-compiling we'd need to understand to build tools for the local arch.
Found with Lorenzo before; it looks like when a target with need_transitive_deps = True depends on a target with some tools, it receives those tools as well as the tree of deps in its working directory. It shouldn't do this; tools shouldn't appear in the working directory that way.
They definitely were back in the pre-1.0 days, because it enables a workflow where you set PYTHONPATH to plz-out/gen, edit files locally, and run with python, and it should Just Work. Currently that doesn't work when you edit deps, because you've got stale copies of them in plz-out.
Can't remember exactly why this changed - I suspect it may have had something to do with JS, I remember lots of problems with symlinks around webpack / npm.
java_library and .plzconfig should allow customising the flags passed to javac, in a similar manner to what we do for gcc / clang.
From James. It wasn't actually what he intended (he really wanted to use deps) but of course plz shouldn't panic.
Something about dependencies not being resolved; no doubt require/provide related. Not sure if it will be easy to resolve though.
I'm not sure we're correctly freeing memory that is returned to PyPy in various places (eg. glob() returns a char**). Should investigate and fix.
i.e. if :label has three outputs, you get the dir three times. You should only get it once.
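A sketch of the intended de-duplication (unique_out_dirs is a hypothetical helper, not the actual plz code):

```python
import os

def unique_out_dirs(outputs):
    # Collapse an output list to unique directories, preserving order,
    # so a rule with several outputs in one directory yields it once.
    seen, dirs = set(), []
    for out in outputs:
        d = os.path.dirname(out)
        if d not in seen:
            seen.add(d)
            dirs.append(d)
    return dirs
```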
Not sure why, and whenever I try something and re-source it it seems to go away and pop back up again later so it's very hard to pin down :(
It would be nice to be able to build a python_binary which didn't contain any source, just .pyc/.pyo bytecode.
I've got a branch (python-precompile) which implements this in a reasonably nice way; each python_library precompiles its relevant bytecode. I'm a little worried that the files contain timestamps, although I suspect this doesn't make things worse at the moment because the zipfiles have mtimes in them anyway.
So far I've not merged it because it turns out to be a pessimisation on the pexes; they are a little larger (expected) but, weirdly, about 10-15% slower to start up. You'd obviously expect it to be faster since the bytecode doesn't need recompilation, and indeed stripped pexes are 30% or so faster. Not sure precisely what's going on with it; ideally we would get the performance improvement in both cases.
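For illustration, the per-library precompilation step can be sketched with the stdlib alone (paths and module contents here are made up):

```python
import os
import py_compile
import tempfile

# Precompile a module's bytecode ahead of packaging, roughly as the
# python-precompile branch does per python_library.
src_dir = tempfile.mkdtemp()
src = os.path.join(src_dir, "mylib.py")
with open(src, "w") as f:
    f.write("def greet():\n    return 'hello'\n")

# cfile= keeps the .pyc next to the source rather than in __pycache__.
# Note the resulting file embeds the source mtime in its header, which
# is the timestamp/determinism worry mentioned above.
pyc = py_compile.compile(src, cfile=src + "c", doraise=True)
```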
Instrumentation must be messing it up, although don't know exactly what's wrong.
Need rkt to resolve their bug where they have to be run as root first.
This functionality already exists when building (and testing?) targets.
Can be worked around by excluding them and adding a maven_jar and a manual dep, but that is somewhat awkward.
Not sure if / how we can detect the presence of sources when please_maven runs.
You can find the relevant test case in core3/, branch model-python-server-test.
We have a BUILD file like this (note the double slash in the time_graph dependency).
(core3/experimental/diana/models/frontend/src/js/components/monitor/BUILD)
js_library(
    name = 'monitor',
    srcs = ['monitor.js'],
    deps = [
        '//experimental/diana/models/frontend/src/js/components/header:header',
        '//experimental/diana/models/frontend/src/js/services:monitor_service',
        '//experimental/diana/models/frontend/src/js/services:benchmark_service',
        '//experimental/diana/models/frontend/src/js/components/graphs:size_graph',
        '//experimental/diana/models/frontend/src/js/components/graphs:bench_graph',
        '//experimental/diana/models/frontend/src/js/components//graphs:time_graph',
        '//third_party/js:react',
    ],
    visibility = ['//experimental/diana/models/...'],
)
Building that project results in the error below. Running a second time it is successful!
~/dev/core3 model-python-server-test ✔ 45m ⍉
▶ plz build //experimental/diana/models/frontend/src/...
Building [78/166, 0.2s]:
=> [ 0.0s] //third_party/js:babel-preset-react_npm Finished post-build function for //third_party/js:babel-preset-react_npm
=> [ 0.0s] //third_party/js:babel-plugin-syntax-trailing-function-commas_npm Preparing...
=> [ 0.0s] //third_party/js:webpack-dev-server_npm Preparing...
=> [ 0.0s] //third_party/js:css-loader_npm Running post-build function for //third_party/js:css-loader_npm
=> [ 0.0s] //third_party/js:chokidar_npm Finished post-build function for //third_party/js:chokidar_npm
=> [-0.0s] //third_party/js:babel-relay-plugin_npm Finished post-build function for //third_party/js:babel-relay-plugin_npm
=> [-0.0s] //third_party/js:imports-loader_npm Running post-build function for //third_party/js:imports-loader_npm
=> [-0.0s] //third_party/js:chalk_npm Running post-build function for //third_party/js:chalk_npm
=> [-0.0s] //third_party/js:fs-extra_npm Running post-build function for //third_party/js:fs-extra_npm
=> [-0.0s] //third_party/js:extract-text-webpack-plugin_npm Running post-build function for //third_party/js:extract-text-webpack-plugin_npm
10:24:46.702 CRITICAL: //experimental/diana/models/frontend/src/js/components/graphs:time_graph failed:
Error moving outputs for target //experimental/diana/models/frontend/src/js/components/graphs:time_graph: Rule //experimental/diana/models/frontend/src/js/components/graphs:time_graph failed to create output plz-out/tmp/experimental/diana/models/frontend/src/js/components/graphs/time_graph/timeGraph.js
~/dev/core3 model-python-server-test ✔ 46m ⍉
▶ plz build //experimental/diana/models/frontend/src/...
Build finished; total time 13.56s, incrementality 92.6%. Outputs:
plz-out/gen/experimental/diana/models/frontend/src/css/main.scss
plz-out/gen/experimental/diana/models/frontend/src/js/compiled.js
plz-out/gen/experimental/diana/models/frontend/src/js/compiled.js.css
plz-out/gen/experimental/diana/models/frontend/src/js/compiled.js.css.map
plz-out/gen/experimental/diana/models/frontend/src/js/compiled.js.map
plz-out/gen/experimental/diana/models/frontend/src/js/components/graphs/benchGraph.js
plz-out/gen/experimental/diana/models/frontend/src/js/components/graphs/sizeGraph.js
plz-out/gen/experimental/diana/models/frontend/src/js/components/graphs/timeGraph.js
plz-out/gen/experimental/diana/models/frontend/src/js/components/header/header.js
plz-out/gen/experimental/diana/models/frontend/src/js/components/header/headerData.js
plz-out/gen/experimental/diana/models/frontend/src/js/components/model/modelBenchmark.js
plz-out/gen/experimental/diana/models/frontend/src/js/components/model/modelName.js
plz-out/gen/experimental/diana/models/frontend/src/js/components/model/modelSize.js
plz-out/gen/experimental/diana/models/frontend/src/js/components/model/modelTime.js
plz-out/gen/experimental/diana/models/frontend/src/js/components/model/modelUncompressedSize.js
plz-out/gen/experimental/diana/models/frontend/src/js/components/model/modelVersion.js
plz-out/gen/experimental/diana/models/frontend/src/js/components/monitor/monitor.js
plz-out/gen/experimental/diana/models/frontend/src/js/services/BenchmarkService.js
plz-out/gen/experimental/diana/models/frontend/src/js/services/MonitorService.js
I think the problem is that it fails the first time (but manages to place it in the cache) and passes the second time because it's going through a different code path since the artifact is in the cache.
Depends on #14
Thinking about #13, it's probably not ideal to integrate esoteric languages directly into plz since they become a compile-time dependency. A better system would be if one could subinclude('https://github.com/thought-machine/please-plugins/grm/grm.build_defs') and get all the bits you needed (with optional revision / hash verification, of course).
I actually think most of the bits we need for this are there already (it's really just a combination of remote_file and subinclude), although it would be nice to be able to create 'virtual targets' to fetch any necessary tools which didn't get duplicated in every package. That would need more significant changes, so we might encourage people to do it manually for now.
Like Python + Java. Slightly trickier since 'go test' generally needs an argument to specify the test to run; not sure how best to handle that.
Would make it easy to set up a repo for other people to use.
We might also consider storing tools etc in ~/.please by default instead of in /opt which would also be lower friction.
From our high-level design meeting at the Keats the other night: experiment with adding a "work factor" flag so we don't upload cache artifacts that turn out to be easy to build locally.
The rule works fine on systems with non-losing linkers but unfortunately Apple's one doesn't support -b binary any more and their replacement is a bit weak. Need to find some workaround.
Currently plz test --max_flakes=1 runs a test only once. This is possibly not the most obvious interpretation; one could also read it as allowing one flake, and hence running the test twice if the first run failed.
Let's decide what the correct behaviour / interpretation is, then make it that way.
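The two readings can be stated precisely (a sketch; runs_allowed and the interpretation names are made up for illustration):

```python
def runs_allowed(max_flakes, interpretation):
    # Two possible readings of --max_flakes=N:
    #   'total_runs': N is the total number of attempts (current behaviour).
    #   'extra_runs': N is the number of flakes tolerated, i.e. up to N+1
    #                 attempts, re-running only if an earlier run failed.
    if interpretation == 'total_runs':
        return max(1, max_flakes)
    return max_flakes + 1
```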
Initial skeleton of some documentation is in repo.
If we upload an artifact of the wrong version, plz downloads it once, then in future always attempts to run that, but fails because it's the wrong one. It would be better if it checked after download (would need to be pre-fork) and deleted it.
We have a case where we retrieve build output, run a post-build function, and then fail to retrieve the final artifacts. Then after building for real we re-run the post build function which is not idempotent and it dies.
Ideally we would guarantee it only runs once, but it's not clear how we do that in the face of caching, and I think it's too harsh to not cache any rule with a post-build function.
Or maybe we just say they must be idempotent?
e.g. if I run 'plz cover //src/query/...' I should only get source files from //src/query/... in the output.
Might as well keep it similar to Bazel's plugins / exported_plugins attributes and java_plugin rule.
Because we use zip and jar respectively which save timestamps.
Should extend the increasingly misnamed jarcat to perform the functions we use them for (which are very limited, essentially "zip everything in this directory") and override the modification timestamps.
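A stdlib sketch of the timestamp-override idea (illustrative only, not jarcat's actual code):

```python
import io
import zipfile

# Write zip entries with a fixed timestamp so the archive is
# byte-for-byte reproducible regardless of when it's built.
EPOCH = (1980, 1, 1, 0, 0, 0)  # earliest timestamp the zip format allows

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    # ZipInfo lets us pin date_time instead of taking the file's mtime.
    info = zipfile.ZipInfo("a/hello.txt", date_time=EPOCH)
    zf.writestr(info, "hello")
```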
Feature request from Riccardo. Can currently be controlled by --num_threads but it'd be nice if you could permanently override in .plzconfig.local or whatever.
i.e. in the following situation
genrule(
    name = 'a',
    provides = {'go': ':b'},
)
genrule(
    name = 'b',
    outs = ['b.go'],
)
genrule(
    name = 'c',
    srcs = [':a'],
    requires = ['go'],
)
SRCS of c is empty at build time because a has no outs, but it should be b.go.
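A sketch of the expected resolution logic (the graph encoding and resolve_srcs are hypothetical, just to pin down the intended behaviour):

```python
def resolve_srcs(target, graph, requires):
    # For each source label, follow the dep's 'provides' entry for the
    # first required language, then take that provided target's outs.
    srcs = []
    for dep in graph[target]['srcs']:
        node = graph[dep]
        for lang in requires:
            if lang in node.get('provides', {}):
                node = graph[node['provides'][lang]]
                break
        srcs.extend(node.get('outs', []))
    return srcs

graph = {
    ':a': {'provides': {'go': ':b'}, 'outs': []},
    ':b': {'outs': ['b.go']},
    ':c': {'srcs': [':a']},
}
srcs = resolve_srcs(':c', graph, ['go'])  # → ['b.go']
```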
No doubt it needs to be reassigned somewhere, but it's not obvious exactly how to tie it in because this stuff is a bit fiddly.
When building multiple targets at once, like some/target/..., targets with dependency cycles will be silently skipped. Furthermore, tests that depend on those targets will be skipped as well, giving the impression that everything is in correct working order.
Tend to get errors about "Failed to remove outputs" and things. Should be robust to this.