cki-project / sktm Goto Github PK

View Code? Open in Web Editor NEW

0.0 8.0 6.0 352 KB

"Sonic Kernel Testing Manager" - a job scheduler for skt

License: GNU General Public License v2.0

Python 100.00%

kernel patchwork ci testing

sktm's People

Contributors

Watchers

Forkers

spbnick veruu major danghai dzickusrh agustinhenze

sktm's Issues

Unify at least part of Patchwork functionality by using mboxes when possible

mbox retrieval is same for both Patchwork1 and 2 so we can use the same codebase for eg. header retrieval.

fix UX of patchwork command when PW2 is used

A timestamp of the last patch needs to be passed on a command line if we want to use PW2, but patch number is used with PW1. This is both confusing (users shouldn't care about patchwork versions) and not user-friendly.

This might need a deeper rework of how patches are retrieved, and be a part of patchwork interfaces rework in general.

sktm test_init.py check_pending

In test_init.py::test_wait_for_pending_done has a check_pending Mock() call that sets a return value, which is unneeded as that fails with Exceptions.

It would also be nice to mimic that a little better.
Adding: self.watcher_obj.pj = [(1, 2, 3)]

makes it correctly hang. Is there an easy way to Mock removal of those 'pj' ids? Such that 60 seconds later, check_pending succeeds? (mocking turning sleep from 60 seconds to 1 second would be nice too).

sktm must exit when there is no work to do

Once sktm queues a baseline test or a test of various patchwork patches, it should exit. It should check on the jobs it started during a subsequent run. This eliminates the need for sktm to have multiple processes running for long periods.

Consider using LinchPin for running tests

Consider using LinchPin to abstract away sktm's dependence on Beaker for running tests. This might allow us to support running tests with libvirt to support a simple development setup. See also LinchPin documentation.

We have to carefully evaluate LinchPin, though. It might be simpler and more reliable to implement our own abstraction for just two providers: Beaker and libvirt, given our relatively minor feature requirements. LinchPin might also bring in too many dependencies.

Fix variable names in db.py

Commit 526b188 was supposed to clean up variable names so they are more descriptive, but sometimes guessed wrong and as a result, some of get_sourceid() results are in variables called patchset_id which is very confusing.

Most of the results is correctly named as sourceid, which is how the rest of variables should be too.

Name patchwork.skt_patchwork and patchwork.skt_patchwork2 properly

We don't need to repeat "patchwork" inside the "patchwork" module, we don't need to repeat "skt" inside the "skt" package, plus we better use descriptive names instead of "2" and nothing. E.g. "patchwork.XMLRPCInterface" and "patchwork.RESTInterface".

Use path manipulations instead of string concatenations

Both URL and file path joins use slashes as dividers between their parts. When joining them, we need to be extra careful if the original path ends with slash or there is none. Examples:

path/to/dir + file --> path/to/dir/file
path/to/dir/ + file --> path/to/dir/file
http://url.com/ + part --> http://url.com/part
http://url.com + part --> http://url.com/part

Right now, most of the code uses string concatenation for joins, which doesn't handle any of these cases. This requires us to be extra careful about what is passed to the functions (eg. some websites don't like double slashes), which is not that simple with all the passing between different functions and objects and doesn't check user inputs either.

os.path.join() can deal with all these cases and works just fine with both URLs and filepaths. This allows us to simply type os.path.join('http://url.com', 'part') or os.path.join(directory, filename) (if variables are used) and not care about ending slashes anywhere.

Refactor sktm.watcher.check_patchwork to not need trivial comments

At the moment the sktm.watcher.check_patchwork function needs a bunch of comments to explain what it loops over (and for some other trivial parts), which indicates the code is a bit obscure. Refactor the function and possibly some other related parts of sktm.watcher to reduce the need for such comments.

Error: establish a working "baseline" commit

Hi, I got error after the command:

./sktm.py -v --jjname sktm baseline \
          git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git \
          master

The output console from UI Jenkins shows the building of Kernel module and errors from:

[sktm] Running shell script
+ set +x
+ skt -vv --state --rc skt-rc --junit junit run --wait
2018-06-01 13:38:51,806     INFO   runner type: beaker
2018-06-01 13:38:51,806     INFO   beaker template: beakerjob.xml
CA_CERT configuration points to non-existing file: /etc/beaker/ca.pem
2018-06-01 13:38:52,247     INFO   submitted jobid: None
2018-06-01 13:38:52,255    ERROR   Exception caught: Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/skt/executable.py", line 107, in wrapper
    func(cfg)
  File "/usr/lib/python2.7/site-packages/skt/executable.py", line 289, in cmd_run
    cfg.get('wait'), uid=cfg.get('uid'))
  File "/usr/lib/python2.7/site-packages/skt/runner.py", line 455, in run
    self.wait(jobid, reschedule)
  File "/usr/lib/python2.7/site-packages/skt/runner.py", line 391, in wait
    self.add_to_watchlist(jobid, reschedule)
  File "/usr/lib/python2.7/site-packages/skt/runner.py", line 373, in add_to_watchlist
    root = self.getresultstree(jobid)
  File "/usr/lib/python2.7/site-packages/skt/runner.py", line 115, in getresultstree
    bkr = subprocess.Popen(args, stdout=subprocess.PIPE)
  File "/usr/lib64/python2.7/subprocess.py", line 390, in __init__
    errread, errwrite)
  File "/usr/lib64/python2.7/subprocess.py", line 1025, in _execute_child
    raise child_exception
TypeError: execv() arg 2 must contain only strings

[Pipeline] }
[Pipeline] // stage
[Pipeline] stage
[Pipeline] { (Report)
[Pipeline] sh
[sktm] Running shell script
+ set +x
+ skt -vv --state --rc skt-rc --junit junit report --reporter mail '{'\''mailfrom'\'': '\''Jenkins <[email protected]>'\'',                           '\''mailto'\'': '\''Root <[email protected]>'\''}'
2018-06-01 13:38:52,849    DEBUG   Starting new HTTP connection (1): pc.example.com
Traceback (most recent call last):
  File "/bin/skt", line 11, in <module>
    load_entry_point('skt==1', 'console_scripts', 'skt')()
  File "/usr/lib/python2.7/site-packages/skt/executable.py", line 871, in main
    args.func(cfg)
  File "/usr/lib/python2.7/site-packages/skt/executable.py", line 340, in cmd_report
    reporter.report()
  File "/usr/lib/python2.7/site-packages/skt/reporter.py", line 535, in report
    self.update_mergedata()
  File "/usr/lib/python2.7/site-packages/skt/reporter.py", line 264, in update_mergedata
    mergedata = self.infourldata(mergedata)
  File "/usr/lib/python2.7/site-packages/skt/reporter.py", line 217, in infourldata
    response = requests.get(self.cfg.get("infourl"))
  File "/usr/lib/python2.7/site-packages/requests/api.py", line 72, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/api.py", line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 508, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='pc.example.com', port=4040): Max retries exceeded with url: /result/21ad1173589ef63a93f94e05c879393e2c27588c.csv (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f60efca3e50>: Failed to establish a new connection: [Errno -2] Name or service not known',))
[Pipeline] }
[Pipeline] // stage
[Pipeline] sh
[sktm] Running shell script
+ kill 85227
[Pipeline] junit
Recording test results
[Pipeline] }
[Pipeline] // node
[Pipeline] End of Pipeline
ERROR: script returned exit code 1
Finished: FAILURE

Could you help me handle it? Thanks

Document setup

Document a functional setup with configuration suitable for testing.

Add support for passing extra arguments to the filtering program

The patch-filtering program which could be specified with the --filter option can accept arguments besides what sktm passes to it by itself. Add support for specifying more options to the filtering program. Perhaps another option, like --filter-args.

Implement integration tests

Consider using the same unittest framework skt is starting using.

Describe dependencies in README.md

db.get_repoid should use a transaction

To avoid race condition between checking if a repo ID exists and creating one, db.get_repoid should operate within a transaction. That, or just revert back to using "INSERT OR IGNORE" (which is atomic) since nothing at the moment uses the separate db.create_repoid function.

Unit testing for patchwork.py

The patchwork.py functions/methods need unit tests.

db.create_repoid still handles missing repo ID

db.create_repoid still uses "INSERT OR IGNORE INTO", which ignores a situation where an ID already exists. Instead, it should abort if an ID for the specified repo already exists. db.get_repoid already handles "create if not exists logic".

Exception occurs when checking pending patches

Traceback (most recent call last):
  File "/home/skt/sktm/sktm.py", line 135, in <module>
    sw.wait_for_pending()
  File "/home/skt/sktm/sktm/__init__.py", line 367, in wait_for_pending
    self.check_pending()
  File "/home/skt/sktm/sktm/__init__.py", line 357, in check_pending
    self.db.commit_tested(patches, series_id)
UnboundLocalError: local variable 'series_id' referenced before assignment

Implement passing of original message subject

Same as PR #24 (use already existing infrastructure). This way we can reply with the Re: <original subject>, in addition to in-reply-to header

Document all the code

Document all the code following the conventions for Google Python Style Guide.

Remove baseline rechecking logic

The baseline rechecking logic found here should be removed: https://github.com/RH-FMK/sktm/blob/master/sktm/__init__.py#L354-L361

It has already been removed from skt: CKI-project/skt@f108292

sktm can only mark baseline commit with a worse status

Sktm only marks baseline commits with a worse test status, but never better. We need to consider if we need to do anything with that WRT infrastructure failures resulting in baseline test failures and making it stuck in that.

Refactor patchwork.py to not need trivial comments

Same as #34 just for patchwork.py, mostly related to variable names. For example:

# Patch position in series
cpatch = int(smatch.group(1))

could be instead named something like patch_number which would be a) more telling and b) get rid of the comment too.

db.get_sourceid should use a transaction or have no ID creation

At the moment db.get_sourceid is prone to a race condition between checking if a source ID exists and creating one. It should either use a transaction (but that complicates the logic since create_sourceid is also a public function which can also create a transaction), or better just have no ID creation.

sktm doesn't handle locked databases

Sktm aborts if it detects SQLite database is locked by another process:

Traceback (most recent call last):
File "/home/skt/sktm/sktm.py", line 129, in
sw.wait_for_pending()
File "/home/skt/sktm/sktm/init.py", line 297, in wait_for_pending
self.check_pending()
File "/home/skt/sktm/sktm/init.py", line 287, in check_pending
patches, bres, bid, series)
File "/home/skt/sktm/sktm/db.py", line 391, in commit_patchtest
brepoid = self.get_repoid(baserepo)
File "/home/skt/sktm/sktm/db.py", line 114, in get_repoid
(baserepo,))
sqlite3.OperationalError: database is locked

This needs investigation, and, if possible, sktm should persist and retry until the database is unlocked.

Agree on skip logic and fix it

Right now for PW2, we are skipping the patch series if they contain one of SKIP patterns in their name, however we are not checking the names of patches themselves. This doesn't seem like a correct strategy since a single patchset can span both kernel and appropriate tools, see for example these netdev series. By checking only series' names, this won't be detected and we'd get a patch application error as we try to apply the patches meant for tools.

For PW1 interface, skipping is based on names of patches, not series (since there are no series, hah), which seems like the correct behavior - we should be fine by skipping tools patches from the patchset since they shouldn't be touching kernel code.

I'd say we should implement the same granularity for PW2 as well - check the series name as we do, but also check names of the individual patches so we can skip them properly.

I'll do this as a part of patchwork rework if there isn't anything I'm missing that would prevent it. @spbnick please shout if you see a hole in the above.

Fix email retrieval from patch headers

Those regexes get confused when , is in the name. Check with eg this patch, where the "VMware, Inc.\" <[email protected]> gets parsed into u'"VMware', u'[email protected]'

Use explicit flag for RH-fork detection

Currently, self.fields is used as an indicator, which is a) unclear and b) hacky to set up -- the RPC needs to be initialized for parent class to initialize correctly, but the self.fields is set up during RPC initialization and then needs to be hacked around to not be overwritten in the parent class etc.

We should have a separate flag for RH-fork indication, and possibly clear up self.fields setup as well.

Rename "sktm.py" to "sktm"

Rename the sktm.py program to just sktm. We don't need the .py on the end, and we don't need it to be confused with a module.

Refactor patchwork.py

The sktm/patchwork.py needs refactoring as it's hard to understand, use and maintain. Some ideas include:

Extract common interface into an abstract class to be the base for both Patchwork interfaces.
Make methods and variables not used from outside private
Create a type/class for data both implementations need about a patch, make the implementations provide functions for fetching those and implement the rest of the handling in the base, abstract class.
Separate classes into their own files. This way it's easier to look at implementations separately. Right now, when searching for an implementation of a function, you often don't know which class you're looking at, which makes it difficult to understand what's going on.
Simplify and narrow down the public interfaces
Include #35.

More ideas are welcome!

Mark all private class methods as such

To avoid exposing too much interface, prepend all private class method names with __ to have them recognized as such.

Create contribution guide, same as skt has

Describe purpose and usage better in README.md

Bring the code into compliance with Google Python Style Guide

Make the code comply with Google Python Style Guide, where possible. Where not possible, document exceptions or additions. Document our compliance and requirements for contributions. Proceed gradually. Add pylint checking to CI runs.

Add series to "pendingpatches" after submitting to Jenkins

At the moment series are added to "pendingpatches" table before a job is submitted to Jenkins. Consider doing it after, so that we don't have patches pending in cases where e.g. Jenkins cannot be reached, or the specified Jenkins project is not found, or doesn't have the requisite parameters.

Convert TODOs/FIXMEs to issues

Go over the code and either fix, or convert all the TODOs and FIXMEs in the code to issues, then remove them from the code.

Implement unit tests

Consider using the same unittest framework skt is starting using.

Record dropped patches

We need to record series dropped by the filter program in the database somehow, so that we can derive the "last tested patch" from them as well as from pending and tested patches. Otherwise we end up re-running the filter on dropped series, possibly generating reports with every run, until a patch that passes the filter shows up, which we can add to the database, and thus advance the notion of the "last tested patch" past it.

patchwork: V1 doesn't seem to collect all e-mail addresses

There is a difference between number of addresses v1 and v2 collects per patchset: at least in one case v1 only collected one address vs v2's five. The particular debug output was concerning internal patches, so cannot be presented here. Nevertheless this should be looked into.

update_baseline always runs commit()

The update_baseline() method always runs commit() even if there are no transactions to commit. Here are the three possible code branches:

No baseline exists: INSERT and commit()
An older baseline exists: UPDATE and commit()
A newer baseline exists: do nothing

`sktm patchwork` aborts if a Jenkins job has no junit results

Even though sktm patchwork can submit any number of jobs in a single run,
it aborts with an exception if it detects a single of those jobs aborted:

[43632] 2018-04-16 04:17:14,039     INFO   submitted build: upstream #80
[43632] 2018-04-16 04:17:14,039     INFO   submitted patchset: ['https://patchwork.ozlabs.org/patch/897858']
[43632] 2018-04-16 04:17:16,839     INFO   submitted build: upstream #81
[43632] 2018-04-16 04:17:16,839     INFO   submitted patchset: ['https://patchwork.ozlabs.org/patch/897863']
[43632] 2018-04-16 04:17:20,333     INFO   submitted build: upstream #82
[43632] 2018-04-16 04:17:20,333     INFO   submitted patchset: ['https://patchwork.ozlabs.org/patch/897892']
[43632] 2018-04-16 04:24:26,006     INFO   job completed: jjid=53; type=1
[43632] 2018-04-16 04:24:26,863     INFO   build_status=ABORTED
Traceback (most recent call last):
  File "/home/skt/sktm/sktm.py", line 129, in <module>
    sw.wait_for_pending()
  File "/home/skt/sktm/sktm/__init__.py", line 275, in wait_for_pending
    self.check_pending()
  File "/home/skt/sktm/sktm/__init__.py", line 218, in check_pending
    bres = self.jk.get_result(self.jobname, bid)
  File "/home/skt/sktm/sktm/jenkins.py", line 143, in get_result
    build.get_status()))
Exception: No results for build 53 (ABORTED)

Instead it should note the job result and continue waiting for the rest of the
jobs.

Use print function instead of statement

In Python3, print statement was promoted to print() function. As one of first steps for compatibility, move to using print() when needed (don't forget to import print_function from __future__ module)

Sktm ignores missing Jenkins job name

When executing the patchwork command, without supplying a --jjname or the
corresponding option in the configuration file, sktm doesn't complain but
produces an obscure stack trace, after spending a while communicating with
the Patchwork server:

Traceback (most recent call last):
  File "/home/skt/sktm/sktm.py", line 127, in <module>
    args.func(sw, cfg)
  File "/home/skt/sktm/sktm.py", line 95, in cmd_patchwork
    sw.check_patchwork()
  File "/home/skt/sktm/sktm/__init__.py", line 196, in check_patchwork
    makeopts=self.makeopts),
  File "/home/skt/sktm/sktm/jenkins.py", line 210, in build
    job = self.server.get_job(jobname)
  File "/usr/lib/python2.7/site-packages/jenkinsapi/jenkins.py", line 138, in get_job
    return self.jobs[jobname]
  File "/usr/lib/python2.7/site-packages/jenkinsapi/jobs.py", line 88, in __getitem__
    raise UnknownJob(job_name)
jenkinsapi.custom_exceptions.UnknownJob: None

Instead, sktm should clearly state the problem and abort as soon as possible.

unknown column "patchsource_id" in foreign key definition

The FOREIGN_KEY constraint in the pendingpatches table is invalid after the changes in #58. It must be removed.

Need a decent way to match beaker and jenkins jobs

Pass jenkins ID to then add to beaker whiteboard.

Second half of skt's #86

Patches get stuck in 'pending' state if they change project

Despite the fact that it was not a part of intended Patchwork workflow, in reality patches tend to change project they belong to. If the project is changed after sktm retrieves the patch and starts testing it, and before the testing is complete, the patch gets permanently stuck in the database (pendingpatches table) because matching for removal is done based on - between other things - project the patch belongs to.

Fixing this might need a rework of DB-related structure or functions - ideally we need a way to add and remove patches from pending so it only cares about Patchwork instance baseURL, not project.

Make sktm work with Jenkins CSRF protection enabled

Make sktm work with Jenkins with Cross Site Request Forgery protection enabled. At the moment sktm fails to access Jenkins if it's enabled:

[skt@bkr-hv02 ~]$ ${HOME}/sktm/sktm.py -v baseline git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next
[9362] 2018-04-03 04:12:54,953     INFO   checking baseline: git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git [for-next]
Traceback (most recent call last):
  File "/home/skt/sktm/sktm.py", line 127, in <module>
    args.func(sw, cfg)
  File "/home/skt/sktm/sktm.py", line 85, in cmd_baseline
    sw.check_baseline()
  File "/home/skt/sktm/sktm/__init__.py", line 161, in check_baseline
    makeopts=self.makeopts),
  File "/home/skt/sktm/sktm/jenkins.py", line 212, in build
    self.server.build_job(jobname, params)
  File "/usr/lib/python2.7/site-packages/jenkinsapi/jenkins.py", line 170, in build_job
    self[jobname].invoke(build_params=params or {})
  File "/usr/lib/python2.7/site-packages/jenkinsapi/job.py", line 209, in invoke
    allow_redirects=False
  File "/usr/lib/python2.7/site-packages/jenkinsapi/utils/requester.py", line 151, in post_and_confirm_status
    response.text.encode('UTF-8')
jenkinsapi.custom_exceptions.JenkinsAPIException: Operation failed. url=http://bkr-hv02.dsal.lab.eng.bos.redhat.com:8080/job/skt_test/buildWithParameters, data={'json': '{"redirectTo": ".", "parameter": [{"name": "baserepo", "value": "git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git"}, {"name": "ref", "value": "for-next"}], "statusCode": "303"}', 'ref': 'for-next', 'baserepo': 'git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git'}, headers={'Content-Type': 'application/x-www-form-urlencoded'}, status=403, text=<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 403 No valid crumb was included in the request</title>
</head>
<body><h2>HTTP ERROR 403</h2>
<p>Problem accessing /job/skt_test/buildWithParameters. Reason:
<pre>    No valid crumb was included in the request</pre></p><hr><a href="http://eclipse.org/jetty">Powered by Jetty:// 9.4.z-SNAPSHOT</a><hr/>

</body>
</html>

Figure out DB schema

It's likely we don't need all the data we now store in the database and it's contents can be largely simplified. This needs to be properly discussed and agreed upon.

For the start:

patchtest table looks like a residue of previous functionality where new patches from Patchwork were cross-checked with the DB content. With sending the emails (and eventually setting checks / tags in Patchwork), we shouldn't need it.
testrun contains all test runs, while only baseline tests are really checked. We can use baseline_tests instead, or only store required information in baseline table (and update rows after new successful runs)
we shouldn't need full history of all tested patches (table patch), combination of last patch info stored with patchwork data and pendingpatches table should be enough to function. With the emails and checks / tags mentioned in the first point, it's useless to keep them

There might be something I'm missing why the above ideas won't work, and other things I haven't mentioned. Let's brainstorm about them and create a new, simpler DB schema and contents.

Sktm fails to extract e-mail addresses from some headers

Sktm fails to extract e-mail addresses from headers, when they're not enclosed into angle brackets (< and >). In general, sktm needs to follow the standard way of extracting addresses, preferably with the help of an existing library, instead of implementing it itself.

Do not import "sktm" in sktm modules

The sktm/*.py modules should not import "sktm", as that breaks composition. Instead move the required definitions to another module under sktm and import that.

cki-project / sktm Goto Github PK

sktm's People

Contributors

Watchers

Forkers

sktm's Issues

Recommend Projects

Recommend Topics

Recommend Org