arporter / habakkuk
Fortran code analysis for performance prediction
When analysing code such as:
rtmp1 = (sshn_u%data(ji ,jj ) + hu%data(ji ,jj ))*un%data(ji ,jj)
Habakkuk reports that there are no array accesses.
Currently Habakkuk contains code to generate a dot-format file for use with graphviz. However, graphviz has an eponymously titled Python package and it would be better to use that. We can probably make it optional so as to avoid a mandatory dependence on graphviz (i.e. Habakkuk is still useful even if graphviz is not installed).
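One way to keep graphviz optional is to guard the import and fall back to writing the dot text by hand. The sketch below is an illustration only: `dag_to_dot` and its edge-list argument are hypothetical names, not part of Habakkuk, while `Digraph`, `edge` and `source` come from the graphviz Python package.

```python
try:
    import graphviz  # optional dependency
except ImportError:
    graphviz = None


def dag_to_dot(edges, name="dag"):
    """Return dot-format text for a list of (parent, child) edges.

    Hypothetical helper: uses the graphviz package when it is installed
    and otherwise builds the dot text directly.
    """
    if graphviz is not None:
        graph = graphviz.Digraph(name=name)
        for parent, child in edges:
            graph.edge(parent, child)
        return graph.source
    # Fallback: emit the dot text ourselves, quoting node names so that
    # names such as a(i) remain valid identifiers.
    lines = ["digraph {} {{".format(name)]
    for parent, child in edges:
        lines.append('    "{}" -> "{}"'.format(parent, child))
    lines.append("}")
    return "\n".join(lines)
```

With this shape the rest of the code only ever sees dot text, so nothing else needs to know whether the package is present.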
Testing is rudimentary at the moment. In this ticket I will improve the coverage of the test suite. This will then provide firmer foundations for future development.
Habakkuk currently includes a patched version of the fparser package as extracted from f2py.
The patches are currently being put into the fparser package under stfc/fparser#9. Once that is done we can remove all of the fparser code from Habakkuk and make it depend on the fparser package on pypi.
Habakkuk currently makes no attempt to deduce what performance might be obtained by SIMD-vectorising the loops that it finds. The only way to account for this at the moment is to assume perfect SIMD and multiply the performance estimate it produces by the vector length (e.g. 2 for SSE, 4 for AVX2).
Since Habakkuk already has support for loop-unrolling we could, in principle, unroll the loop by the vector length and look to pack contiguous array accesses into 'vector' variables/nodes. We will investigate the feasibility of doing this in this ticket.
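As a rough sketch of the packing step, suppose the unrolled loop body has been reduced to a list of (array, offset) pairs, one per access. Accesses to the same array with consecutive offsets can then be grouped into vectors. The function name and the data layout here are assumptions made for illustration and do not reflect Habakkuk's DAG classes.

```python
from collections import defaultdict


def pack_contiguous(accesses, vector_length):
    """Split (array, offset) accesses into vector and scalar accesses.

    Returns (vectors, scalars). `vectors` holds (array, first_offset) for
    each run of `vector_length` consecutive offsets; every other access is
    returned unchanged in `scalars`.
    """
    offsets = defaultdict(set)
    for array, offset in accesses:
        offsets[array].add(offset)

    vectors, scalars = [], []
    for array in sorted(offsets):
        pending = set(offsets[array])
        for start in sorted(offsets[array]):
            if start not in pending:
                continue  # already packed into an earlier vector
            run = [start + k for k in range(vector_length)]
            if all(o in pending for o in run):
                vectors.append((array, start))
                pending.difference_update(run)
            else:
                scalars.append((array, start))
                pending.discard(start)
    return vectors, scalars


# a(i) = b(i) + c(2*i), unrolled four times
unrolled = ([("a", k) for k in range(4)] +
            [("b", k) for k in range(4)] +
            [("c", 2 * k) for k in range(4)])
vectors, scalars = pack_contiguous(unrolled, 4)
```

In this example the accesses to a and b each pack into one vector, while the strided accesses to c remain scalar.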
We don't currently recognise MAX and MIN as Fortran intrinsics. We need to add them (including an estimation of their cost in FLOPs and cycles).
It should produce a usage message at the very least...
Currently the type of a DAGNode is left as None if it represents a scalar variable.
This is not very nice so in this issue we will change Habakkuk to give such nodes a scalar type.
Although CPU microarchitecture details are pulled out of a file, this is currently hard-wired to be config_ivy_bridge.py. Since this microarchitecture has no FMA support (and thus no cost associated with an FMA) this causes problems if the user attempts to have the code generate FMAs. Currently two tests are set to xfail because of this limitation.
We need to build upon the existing functionality to make the choice of micro-architecture configurable by the user.
Processing the named dynamo source file fails with:
dag_node.DAGError: "DAG Error: Unrecognised child type: <class 'fparser.Fortran2003.Mult_Operand'>, (temp1 * rho_ref_at_quad(i) * theta_ref_at_quad(i)) ** temp2"
This is because in #3 I changed the code to no longer silently skip any parser-generated object that it didn't recognise.
During DAG construction a node can be tagged as being an integer quantity if it is found to be used as an array index. However, we currently make no attempt to ensure that other nodes are updated to be consistent. This can mean that an arithmetic operation can be wrongly identified as a FLOP when in fact its arguments are integer.
In this issue we'll add a further processing step to ensure that information on which nodes are integer is propagated through the DAG.
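A minimal sketch of such a propagation step is given below. It assumes the DAG can be viewed as a mapping from each node to the nodes it is computed from, and that integer-ness flows from an index node back to its operands. The function name, the dictionary layout and the `stop_at` argument are all hypothetical. A real implementation would also need to stop at type conversions such as INT, which is what `stop_at` stands in for.

```python
from collections import deque


def propagate_integer(operands, integer_seeds, stop_at=()):
    """Return the set of nodes that must hold integer values.

    operands:      dict mapping a node name to the names of the nodes it
                   is computed from.
    integer_seeds: nodes already tagged as integer (e.g. array indices).
    stop_at:       nodes whose operands need not be integer (e.g. a
                   type-conversion intrinsic).
    """
    integer = set(integer_seeds)
    queue = deque(integer_seeds)
    while queue:
        node = queue.popleft()
        if node in stop_at:
            continue
        for operand in operands.get(node, ()):
            if operand not in integer:
                integer.add(operand)
                queue.append(operand)
    return integer


# a(i+1) = 2.0*b(i): the "+" feeds an array index, the "*" does not
dag = {"idx": ["+"], "+": ["i", "1"], "*": ["2.0", "b(i)"]}
integer_nodes = propagate_integer(dag, ["idx"])
```

Any operator node that ends up in the returned set would then be excluded from the FLOP count.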
For code of the form:
b(i) = 3*a(i)
a(i) = 2
Habakkuk will count three cache-line accesses. This is because the assignment to a(i) creates a new node in the DAG (named "a'(i)") which is then seen as a new array access. This is clearly incorrect.
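One possible remedy is to count accesses by the underlying array reference rather than by DAG node name. The sketch below assumes that the only difference between successive versions of a node is the run of primes after the array name, as in "a'(i)". Both function names are hypothetical.

```python
def base_reference(node_name):
    """Strip the primes that mark a re-assigned array element.

    For example "a'(i)" becomes "a(i)".
    """
    array, paren, index = node_name.partition("(")
    return array.rstrip("'") + paren + index


def count_distinct_accesses(node_names):
    """Count array references, treating a(i) and a'(i) as the same one."""
    return len({base_reference(name) for name in node_names})


# Nodes produced by:  b(i) = 3*a(i) ;  a(i) = 2
n_accesses = count_distinct_accesses(["b(i)", "a(i)", "a'(i)"])
```

Under that assumption the example above yields two references, b(i) and a(i), instead of three.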
Repository needs re-structuring if it is to work with pypi and (cleanly) with travis.
I'll try to follow the suggestions here:
https://hynek.me/articles/sharing-your-labor-of-love-pypi-quick-and-dirty/
Hi,
Which Fortran dialects are supported? Can this handle F2008 object oriented style code?
We currently only support (and test with) Python 2.7. In this issue we will extend Habakkuk to support both 2.7 and Python 3.
CodeCov (https://codecov.io/) offers coverage information on git diffs which will be very useful for reviewing pull requests. This Issue is to attempt integration with that service.
Although the parser code in parse2003.py recognises indirect array accesses, it currently flattens such expressions into strings, i.e. my_array(map(i)+1) results in an array index stored as "map(i)+1". If i is the loop variable and we wish to unroll the loop then this is going to cause problems.
This issue can possibly be thought of as identifying contiguous and non-contiguous array accesses for the purposes of memory-bandwidth usage and potential SIMD vectorisation.
Currently running the test suite results in a lot of .gv files in the CWD.
In this issue we will change the tests to use temporary directories to avoid this.
We'll also investigate the intermittent test failures that I've sometimes seen when running pytest in parallel.
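A sketch of the temporary-directory approach using only the standard library is shown below. `scratch_cwd` and `write_dot_file` are hypothetical names, with the latter standing in for whatever code under test writes a .gv file to the CWD. In a pytest suite the built-in `tmpdir` fixture could serve the same purpose.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def scratch_cwd():
    """Run the enclosed block inside a fresh temporary directory."""
    old_cwd = os.getcwd()
    tmp = tempfile.mkdtemp()
    os.chdir(tmp)
    try:
        yield tmp
    finally:
        # Always restore the CWD and clean up, even if the test fails
        os.chdir(old_cwd)
        shutil.rmtree(tmp)


def write_dot_file(name):
    """Stand-in for code under test that writes a .gv file to the CWD."""
    with open(name + ".gv", "w") as fout:
        fout.write("digraph {}\n")
```

Because each test gets its own directory, this may also help with the parallel runs if the intermittent failures turn out to be caused by tests writing files with the same name.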
Habakkuk fails to store the array-index expression when it is enclosed within parentheses, e.g.:
a((i+j)) = 2.0*b(i)
Requesting the full_name of the node representing the LHS of this expression just returns "a()".
The updates to fparser have broken Habakkuk.
This issue (and branch) will address any issues found in getting habakkuk working for the (pre-processed) NEMO code base.
Need to update installation instructions to say that habakkuk may be installed from pypi.
Also need to remove text about f2py installation since it now just depends on the fparser package.
Somewhere along the line Habakkuk's ability to handle indirect array accesses (e.g. my_array(map(i) + 1)) has been broken. Similarly, we now fail to handle e.g. my_array(2*i). We will fix these problems in this issue.
Currently setup.py does not specify which fparser version to install. Since the API has changed in the latest release (0.0.7) this means Habakkuk does not work out of the box. We will fix this by updating Habakkuk's use of fparser.
While working on #8 I've realised that the schedule generator quietly ignores the cost of any intrinsic operations - it considers only operators. This is a significant omission because the intrinsics that we do currently recognise (sin, cos, **) are computationally costly. In this issue we'll think about how we might remedy the situation.
Travis can automate the process of making a release to pypi.
In this issue we'll configure this project to make use of that functionality.
It seems that a DIVSD does not exclusively occupy the execution port - i.e. it's possible for an independent MULSD to make use of the multiplication hardware while a DIVSD is in progress.
Unfortunately this doesn't appear to be documented anywhere. We therefore need to be able to produce a performance estimate that allows for this overlapping, maybe with bounds since we don't know just how much the two operations can be overlapped.
In conjunction with this, Agner's published results for 1/throughput of a DIVSD have a range of 8-14 cycles. Currently habakkuk produces a performance estimate using a single value for this throughput but again, we could do with producing bounds on the estimate.
I'm extending this issue to cope with overlapping of ADDSDs with DIVSDs as well.
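As a simple illustration of what such bounds might look like, the sketch below uses a throughput-only model that ignores latencies and dependencies. The best case takes the smallest DIVSD cost and assumes the other operations overlap fully with the divisions. The worst case takes the largest DIVSD cost and assumes no overlap at all. The function name and arguments are hypothetical, and the value of one cycle for the other operations is a placeholder chosen for the example.

```python
def cycle_bounds(n_div, n_other, div_cycles, other_cycles):
    """Return (best, worst) cycle counts for independent operations.

    n_div:        number of divisions
    n_other:      number of other operations (e.g. multiplies or adds)
    div_cycles:   (min, max) inverse throughput of a division, in cycles
    other_cycles: inverse throughput of the other operations, in cycles
    """
    div_min, div_max = div_cycles
    # Best case: fastest divide, other work hidden behind the divisions
    best = max(n_div * div_min, n_other * other_cycles)
    # Worst case: slowest divide, nothing overlaps
    worst = n_div * div_max + n_other * other_cycles
    return best, worst


# Two divisions and ten multiplies, with the 8-14 cycle range quoted
# above and an assumed cost of 1 cycle per multiply
best, worst = cycle_bounds(2, 10, div_cycles=(8, 14), other_cycles=1)
```

For this example the estimate lies between 16 and 38 cycles.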
It would be very useful if Habakkuk were able to generate some parameterised expression for the working-set size of a loop body. Will probably have to do this in terms of the loop bounds.
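A sketch of one such expression for the simplest case follows: a single loop over i with unit-stride accesses of the form array(i + offset). An array accessed at offsets low to high touches n + (high - low) elements over n iterations, so the working set is linear in n. The function name, the input layout and the default of 8 bytes per element (double precision) are assumptions.

```python
def working_set_bytes(accesses, bytes_per_element=8):
    """Return (per_iteration, constant) in bytes.

    The working set of a loop with n iterations is then approximately
    per_iteration*n + constant.

    accesses: iterable of (array, offset) pairs meaning array(i + offset).
    """
    extent = {}
    for array, offset in accesses:
        low, high = extent.get(array, (offset, offset))
        extent[array] = (min(low, offset), max(high, offset))
    per_iteration = bytes_per_element * len(extent)
    constant = bytes_per_element * sum(
        high - low for low, high in extent.values())
    return per_iteration, constant


# b(i) = a(i) + a(i+1)
per_iteration, constant = working_set_bytes([("a", 0), ("a", 1), ("b", 0)])
```

Here the result corresponds to a working set of 16*n + 8 bytes.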
A lot of work has been done on fparser and Habakkuk needs some work to make use of the latest version.