github-vet / bots Goto Github PK
View Code? Open in Web Editor NEWBots for running analysis on GitHub's public Go repositories and crowdsourcing their classification.
License: MIT License
Bots for running analysis on GitHub's public Go repositories and crowdsourcing their classification.
License: MIT License
The paths output in github-vet/rangeloop-pointer-findings#6717 appear to contain duplicates. We might expect to see this in case the callgraph ends up with multiple non-unique nodes that have the same signature. Based on my recollection of the type definition, that's not impossible.
The callgraph being reported in github-vet/rangeloop-pointer-findings#8083 has some strange edges, including a few outgoing edges from Atoi
which make absolutely no sense. The repository does not contain any declaration bodies for Atoi
so this seems impossible on its face.
github-vet/rangeloop-pointer-findings#8532 was erroneously marked as involving a 'third-party' fucnction despite its presence in the repository. It's quite possible the use of variadic functions is throwing things off. No idea how many arguments are expected in the signature that VetBot found.
TrackBot is failing to apply community threshold labels. Top of my priority list.
To avoid more manual effort, VetBot should build graphviz visualizations of suspicious callgraphs, upload them somewhere (probably Google's version of an S3 Bucket) and embed the picture into the uploaded file in the issue.
Originally I wanted to keep the images in the same repository as the issues, but I don't like that so much since it uses at least one more API request (and, anecdotally, we're already hitting the rate limit between VetBot and TrackBot).
This is a nice-to-have feature, which should probably be done anyway to make the UX better for our crowdsourcers.
... clearly we don't want that.
Easily reproducible.
TrackBot removed the fresh label on this issue when an 'uninteresting' reaction was added. It should only count the 'voting' reactions instead.
We need a more comprehensive list of GitHub repositories to look through. Right now the list of repositories I have was retrieved from this script which uses a clever trick dirty hack to attempt to grab all the repositories. Amusingly, it retrieved 223k of the roughly 700k public Golang repos on GitHub.
A better way is to use the date filter on GitHub's search API to visit all of the repositories by month and sort in order by creation date. If there are more than 1000 results in a given month, the search can be repeated by splitting the month in two.
Another option is to grab the GitHub repositories from https://pkg.go.dev, but that will require use of a different API.
The structure of checkUnaryExpr
in looppointer.go
gives up at the first issue found. We should try and get it to report every issue found, in case there are multiples.
I made more false-negatives! 😅
When validating an asyncSuspicion, we have to report paths ending in third-party calls. This may explain why I was seeing some BFS searches that ended with no paths earlier (in #63).
(The current data set is definitely corrupted by false negatives now, if it wasn't already before... it's #beta.)
I'm seeing several false-positives in which the address of a range loop is assigned immediately before breaking out of the range loop. This is an obvious false-positive and it is possible to make the analyzer ignore it by just checking the block for the break or return keyword and verifying the inner-most loop on the stack matches the Obj of the range variable.
e.g.
var specialX *X
for _, x := range xs {
if x.isSpecial() {
specialX = &x
break
}
}
also this
This tracks and justifies a change in the approx-call-graph branch.
bots/cmd/vet-bot/pointerescapes/pointerescapes.go
Lines 178 to 185 in aba1e3c
Line 179 alone is probably not enough. Taking the address of a pointer variable can also be dangerous. Any attempt to dereference the pointer can lead to retrieving the loop variable once more. See below for more.
func main() {
var ptrs []**int
for _, x := range []int{1,2,3,4,5} {
y := foo(&x)
fmt.Printf("loop: %v,%v\n", y, *y)
ptrs = append(ptrs, y)
}
for _, px := range ptrs {
fmt.Printf("ptrs: %v,%v\n", px, *px)
}
}
func foo(x *int) **int {
fmt.Printf("foo: %v,%v\n", x, &x)
return &x
}
There is no reason I should be running these bots from my local machine -- even though it works (for now), there are obvious issues involved.
Ideally both vet-bot
and track-bot
would be deployed in the cloud. All it requires is one container and a persistent disk (for recovery).
If anyone wants to take a stab at that, I'm open to PRs. Otherwise I'll get around to it once testing is done.
Also want CI/CD, possibly via Travis. Any expertise there would be great.
There's no reason we cannot report the suspicious call-graph paths the tool is using to make its assessment(s). This would require storing those traces at the BFS.
As a first step, I intend to get a minimal amount of callgraph information in place -- just reporting what currently exists on the stack of the callgraph whenever the BFS hits a triggering node.
I'll link source information in as a second step.
The current approach skips reporting on some third-party functions, leading to false-negatives. This needs to be reconsidered carefully.
This is vague -- it's mostly a reminder to me to look carefully at what nogofunc and pointerescapes both do with third-party functions in mind.
The findings in this issue can be avoided by better static analysis.
Consider the below code.
for _, port := range ports {
wg.Add(1)
go func(p int, wg *sync.WaitGroup) {
done := make(chan struct{})
s.jobChan <- portJob{
open: openChan,
closed: closedChan,
filtered: filteredChan,
ip: host,
port: p,
done: done,
ctx: ctx,
}
<-done
wg.Done()
}(port, wg)
}
The static analysis is triggering for one of two reasons:
port
as an argument to the function in the goroutine Both are false-positives for our purposes so it'd be preferred if the analysis ignored these sorts of examples automatically.
By default, vet-bot
runs in 'scrape mode' -- it samples from a list of github repos and visits them, parsing the code and running the analyses.
A command-line argument should be introduced which changes its operation to scrape a single repository given as a command-line argument. Example usage:
vet-bot scrape kalexmills/bad-go
The default usage could also be replaced by:
vet-bot sample all-repos.csv
which could continuously sample from the all-repos.csv
file and output the list of visited repos in all-repos_visited.csv
.
Probably getting the wrong node from the syntax tree. I will resolve in the morning.
This issue needs to be closed and purged from the record. It can be picked up on another pass.
One way to avoid false-positives due to passing pointers into third-party functions that the analyzer has no visibility into is to use a whitelist for well-known third party functions.
For instance, passing a pointer to fmt.Println
is always going to be a false-positive. We don't currently have any mechanism to add these.
If this is implemented, it is possible to achieve a false negative in case of sample code which aliases a package name to something in the standard library. We should work around that issue by checking package aliases and disabling safe functions appropriately. This ought to be doable since all that syntax is present in the same file where calls are being made.
A sanity check added to nogofunc
is being triggered, which, for me, is calling into question the accuracy of some of the callgraph procedures.
Based on the symptoms, I suspect one of these:
lazyInitCalledBy
is wrongnogofunc
is wrongCalledByGraphBFS
is wrongThere is no reason to consider an edge of the callgraph for a function whose declaration contains no pointers.
We can potentially experience a massive decrease in false-positives and the size of the callgraph reported by the tool this way.
However once we remove these edges, every other function in the current package will be absent from the callgraph just like third-party functions. After making this change, it isn't immediately clear that we will be able to distinguish between functions defined in the repository which do not use pointers and functions defined outside the repository
But I think we can do so if we take into account information present at the callsite and assume that the repository source type-checks (even though we don't actually type-check it ourselves).
Here's a short "proof". Maybe it's only useful to help me wrap my head around it, and maybe I've overcomplicated the claim that needs to be shown. I'm putting it here in case someone can explain how it's wrong. Hopefully I've written it in a way that isn't too confusing -- if not, feel free to ask questions, I don't bite.
Suppose we construct the callgraph as described above (only include functions which take a pointer as an argument). We need to show that we have enough information to distinguish between cases 1a, 1b, and 2 at the callsite.
Suppose we see a callsite whose function falls into case 1a.
Suppose we see a callsite whose function falls into case 1b.
Suppose that we see a callsite whose function falls into case 2.
In debugging, I'm noticing that pointerescapes
seems to be including empty function names in the SafePtrs list; perhaps from anonymous function calls. Analysis needs to be done to understand the impact of this; I think we want to disallow it.
In order to avoid false-negatives completely, nogofunc
needs a complete list of third-party code that starts goroutines.
That's not going to happen without fetching dependencies, which makes me very sad.
Issues like this one are false-positives. Our break detection feature can instead read to the end of the statement block.
A lot of 'generic' Go can pass pointers around as empty interfaces. They can still be stored as such. These arguments should also be considered equivalent to pointers when constructing the callgraph.
Right now, nothing is done in the callgraph (and therefore any of the analyzers) to consider variadic functions. This is a potential source of false-negatives.
EDIT: actually they may erroneously be reported as third-party functions when they shouldn't be. Changing the labels.
Welp...
/ # wc -l data/visited_repos.csv
321683 data/visited_repos.csv
/ # wc -l repos.csv
223664 repos.csv
That's a thing.
The pointerescape
analysis pass checks each pointer argument of each function call, whereas the nogofunc
pass does not. Since we're already doing it for pointerescape
, it would be simple to determine if a pointer argument is used during a goroutine and thread this information back through the callgraph like we do there.
That may or may not be needed, though. It depends how many false-positives that analyzer is picking up.
In addition to closing any issues containing test code found (which it does currently), track-bot should also close any code it finds which lives in the vendor directory of a target repository.
from the logs:
2020/12/25 21:41:18 failed to retrieve root commit ID for repo prog666/youtube-channels-parser
2020/12/25 21:41:18 repo prog666/youtube-channels-parser will be tried again despite error: GET https://api.github.com/repos/prog666/youtube-channels-parser: 403 API rate limit of 5000 still exceeded until 2020-12-25 21:45:57 +0000 UTC, not making remote request. [rate reset in 4m39s]
2020/12/25 21:41:18 rate limit hit; blocking until 2020-12-25T21:43:18Z
Looks like the rate-limiter is not using the duration returned from GitHub, so it's polling every 2 minutes to see if it can start again.
It's easy to visualize graphs with graphviz, and there's a go library for it here.
We can construct the DAG as an image, upload it to the target repository, and link to it from the image.
Each node of the DAG can contain a list of source locations where the function is known to be defined.
I'd estimate the above would take me a day of effort, so it might be worthwhile.
Later, I'd like to think about linking to all of those locations in the issue as well. I want to see what the DAGs look like first before going that far, since it might still be too much data.
Follow-on to #31
I'm not comfortable moving this into GA without better automated tests for the static analyzers.
The analysis from the looppointer package includes false positives, which can be vetted by the community. Importantly, it provides coverage of a different sort of check than the one we are currently performing.
looppointer
does not require type-checking, so it would be somewhat simple to integrate with the current design of vet-bot. exportloopref
may also be useful, but would require type checking, which may require more extensive rewrite and would almost certainly require keeping more of the contents of a repository in memory at one time.
There's another false positive which can occur when a range loop ignores all of its variables, as seen in this finding.
This range loop ranges over a channel and ignores the results returned from the ticker.
for _ = range ticker.C {
ch := make(chan bool)
go func() {
err = a.Client.Call(a.ctx, "GetInfo", &common.ServerInfo, &common.Config)
if err != nil {
a.log("RPC Client Call:", err.Error())
return
}
ch <- true
}()
// Server集群列表获取
select {
case <-ch:
serverList, err := a.getServerList()
if err != nil {
a.log("RPC Client Call:", err.Error())
break
}
if len(serverList) == 0 {
a.log("No server node available")
break
}
if len(serverList) == len(a.ServerList) {
for i, server := range serverList {
// TODO 可能会产生问题
if server != a.ServerList[i] {
a.ServerList = serverList
// 防止正在传输重置client导致数据丢失
a.Mutex.Lock()
a.Client.Close()
a.newClient()
a.Mutex.Unlock()
break
}
}
} else {
a.log("Server nodes from old to new:", a.ServerList, "->", serverList)
a.ServerList = serverList
a.Mutex.Lock()
a.Client.Close()
a.newClient()
a.Mutex.Unlock()
}
case <-time.NewTicker(time.Second * 3).C:
break
}
}
Leaving this for the record. A potentially substantial number of false-negatives were possible prior to commit 88d108f, due to a negated conditional.
Need a section on the kinds of analysis that the bots perform.
Currently the plan is to add functions to the accept list whenever they are found in an issue. However, there is no reason to be so conservative. For instance, commonly-used packages like testify
and the go standard library can be used to produce some of this information either automatically or via manual effort.
The relevant file is in kubernetes/acceptlist.yml
At the moment, both bots are deployed to a very small Kubernetes cluster I'm hosting on GKE. The deployment config lives in the kubernetes
path.
It would be nice to have the ability to update the bot deployment whenever a new tag is pushed to the main
branch.
The current static analysis that is being performed does not account for the more insidious examples. To get them all, something rather similar to Rust's lifetime checker needs to be written to perform escape analysis of references to range loop variables.
That is a rather more involved task than hacking on the procedure from go vet
, which is all that I have done so far.
Related: golang/go
Related: dominikh/go-tools
Related: kyoh86/exportloopref
Related: kyoh86/looppointer
For some codebases, some of the goroutine searches seem to be really hairy (they're actually just a BFS tree being printed in an exceptionally lazy manner).
If we really want to see them analyzed, we're going to need to visualize them like trees. This means laying them out as trees either in a textual format (which VetBot could do) or using some other tool.
One strategy even if the output becomes easier to parse could be to add a new label for issues whose callgraphs may be much too complicated and need further analysis. We can sequester them and learn how to deal with them later.
With that in mind, I think adding a new 👀 status will allow us to set aside issues that are going to need additional data crunching in order to classify. The point here isn't to kick the can down the road, we might learn that there aren't enough issues that fall into this category to justify a significant development effort.
This is open for discussion.
Sanity checks are finding that ArgDeclPos can sometimes be token.NoPos
, which causes the argument to be skipped inside pointerescapes
, which seems bad. It looks like I stuck this inside the callgraph in case id.Obj
is nil (i.e. the argument may require type-checking to determine it's type).
It's quite possible this could be causing false-negatives, so I'll need to analyze, address it and scan again.
There are several features to test (manually for now... 😱)
Currently looppointer does not report when a UnaryExpr is used directly in a composite literal. But it must.
Once #28 lands, there will be one more thing we can use in the call-graph to disambiguate nodes without using any type information.
Whether we want to do this depends entirely on how many false-positives we see. (EDIT: maybe not; I kind of just want to do it...)
We can use name, arity, and the pattern of pointer arguments to uniquely identify a function in the callgraph.
For instance, a function
func foo(x int, y *int, z int) {
// ...
}
would have signature {foo, 3, [false, true, false]}
.
We can easily construct this graph for function signatures. However, using the graph at the call-site without type information is a little bit trickier. We can use the position where the pointer argument(s) are being passed to lookup the set of all the nodes in the call-graph whose signatures match and run our BFS starting from those nodes.
That lookup is only a tiny bit tricky. I think it's easiest trading off space for time by using an index.
Functions which do not take any pointer arguments are also totally useless for our purposes. I think we can just ignore them in every case (see #102). Any edges which we may miss couldn't have a pointer crossing them in any case.
I believe the way I've written TrackBot it allows for some fairly significant misbehavior. I don't think I had originally realized that GitHub allows for multiple emoji reactions from the same user. Needs to be checked.
Currently, VetBot could misreport when the callgraph bottoms out due to hitting third-party code. It will say something like 'this function could store a reference to pointer' which is technically correct but not very helpful.
Instead of reporting an inaccurate error message which would mislead the user, VetBot should report that it's reached a function call for which it cannot view the definition. It should report something like 'reference to a range loop variable could be passed into untrusted third-party code at %pos'. The path the variable is supposed to take should also be reported.
This is particularly annoying. github-vet/rangeloop-pointer-findings#308
Snippet quoted below:
for _, p := range parallelisms {
// Enable error count gathering
var gotError chan bool
var performed chan int
var allDone chan bool
if logFailedCalls {
gotError = make(chan bool)
performed = make(chan int)
allDone = make(chan bool)
go func() {
errNb := 0
totalNb := 0
for {
select {
case <-allDone:
results := float32(errNb) / float32(totalNb) * 100
fmt.Printf("Benchmarking %s - %.00f%% error rate\n", fmt.Sprintf("%s/%v", r.Name, p), results)
break
case nb := <-performed:
totalNb += nb
case <-gotError:
errNb++
}
}
}()
}
b.Run(fmt.Sprintf("%s/%v", r.Name, p), func(b *testing.B) {
b.SetParallelism(p)
b.RunParallel(func(pb *testing.PB) {
counter := 0
for pb.Next() {
if r.StreamsRequest && !r.StreamsReturns {
err := benchStreamClient(b, action, currentService, r, tmpl)
if err != nil {
if logFailedCalls {
gotError <- true
}
}
counter = counter + b.N
} else if !r.StreamsRequest && r.StreamsReturns {
err := benchStreamServer(b, action, currentService, r, tmpl)
if err != nil {
if logFailedCalls {
gotError <- true
counter++
}
}
counter = counter + b.N
} else {
err := benchCall(b, action, currentService, r, tmpl)
if err != nil {
if logFailedCalls {
gotError <- true
}
}
counter = counter + b.N
}
}
if logFailedCalls {
performed <- counter
}
})
})
if logFailedCalls {
allDone <- true
}
}
The track-bot can label issues based on whether they appear to be in a test repository. A simple way to check is by looking at the filename. If it ends in test.go
it is likely making use of the standard library's testing package.
In some cases, root signatures appear to be missing from the callgraph when they should not be; a potential source of false-negatives.
This log statement appeared recently in VetBot.
2020/12/25 17:31:01 writeptr: could not find root signature {validateDefaultValueItemsAgainstSchema 4} in callgraph; 3rd-party code suspected
Had this actually been third-party code, it wouldn't be a private method.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.