gintool / gin Goto Github PK

GI in No Time - a Simple Microframework for Genetic Improvement

License: MIT License

Java 99.89% Shell 0.11%

gin's Introduction

Gin: A Tool for Experimentation with GI

Gin is a Genetic Improvement (GI) tool. Genetic Improvement is the application of Genetic Programming and other Metaheuristics to existing software, to improve it in some way. In its initial incarnation, Gin was designed to reduce the execution time of Java code by modifying source files whilst preserving functionality as embodied by a set of JUnit tests.

This repository is undergoing active development. Please raise an issue if you have a specific request or notice a bug. If you want to learn more about GI, please see the dedicated Genetic Improvement website.

Please cite the following papers, if you use Gin for academic purposes:

release 2 and newer: "Gin: Genetic Improvement Research Made Easy", Alexander E.I. Brownlee, Justyna Petke, Brad Alexander, Earl T. Barr, Markus Wagner, and David R. White, GECCO Proceedings, 2019.
release 1: "GI in No Time", David R. White, 3rd International GI Workshop, GECCO Companion Material Proceedings, 2017.

Extensions and specific parts of Gin:

"Standard" Statement/Line Edits from the statement and line folders: "Program Transformation Landscapes for Automated Program Modification Using Gin", Petke, J., Alexander, B., Barr, E.T., Brownlee, A. E. I., Wagner, M. & White, D.R., vol 28. 2023.
Edits from the insert folder: "Injecting Shortcuts for Faster Running Java Code", Alexander E. I. Brownlee, Justyna Petke, Anna F. Rasburn, CEC 2020.
Edits from the llm branch: "Enhancing Genetic Improvement Mutations Using Large Language Models", Alexander E. I. Brownlee, James Callan, Karine Even-Mendoza, Alina Geiger, Carol Hanna, Justyna Petke, Federica Sarro and Dominik Sobania. International Symposium on Search Based Software Engineering 2023. LNCS 14415.
Regression Test Selection (RTS) strategies: "Enhancing Genetic Improvement of Software with Regression Test Selection", Giovani Guizzo, Justyna Petke, Federica Sarro, Mark Harman, ICSE 2021.
Profiler and associated tools: "Comparing Apples and Oranges? Investigating the Consistency of CPU and Memory Profiler Results Across Multiple Java Versions", Watkinson, M., Brownlee, A.E.I. Automated Software Engineering vol 31. 2024.

The Gin Design Philosophy

The goal of Gin is to stimulate development in GI tooling, and to lower the barrier to experimenting with GI and related ideas such as program fragility.

With this in mind, it was written in Java and targets Java, a language almost universally familiar to GI researchers. It also mostly avoids functional constructs and similar in Java. In fact, the code is in parts stupidly simple, in that design patterns and good practice are sacrificed in the name of brevity, simplicity, and readability. It is not designed to be an elegant or general solution, but to be easily understood and modified. The second release of Gin largely shares the same philosophy, while providing utilities to handle large Maven and Gradle projects, that bring additional complexity in parts of the codebase.

In order to reduce Gin's footprint, we rely on several Java libraries, e.g., JavaParser and JUnit to parse and test the code under optimisation, respectively. Consequently, operations on statements are done on anything that extends the com.github.javaparser.ast.stmt.Statement class (including blocks).

Getting Started

These instructions will show you how to build Gin and run a simple local search on the example program. Gin is designed to be hacked rather than used as a library, so there is no "API" in the traditional sense, you should just modify the code.

Prerequisites

Gin requires:

JDK 17
Gradle (tested with version 8.0.2)
A number of dependencies, which can be downloaded manually or via Gradle (recommended)
For Maven projects: make sure the Java version is set to the same version as Gin's.

JDK downloads:http://www.oracle.com/technetwork/java/javase/downloads/index.html

Gradle can be downloaded from their website:https://gradle.org/install

The library dependencies can be found in the build.gradle file.

If you have multiple JREs on your system, you may need to call something like export JAVA_HOME="/usr/lib/jvm/java-17-oracle/jre" as well as update-alternatives (Linux) to ensure that Gradle uses Java 17.

Important to note

There are a few known issues that should be noted:

Gin often has issues when running on one of its dependencies #55
Gin's profiler will not work with Windows and Java Flight Recorder on Gradle projects #94

Installing and Building gin

These instructions were tested on OS X and Ubuntu 22.04 LTS.

Clone the repo:

git clone https://github.com/gintool/gin.git

Build using gradle (alternatively import into your favourite IDE, such as IntelliJ). We also provide a gradle wrapper with Gradle 8.0.2.

cd gin
gradle build

This will build and test Gin, and also create a fat jar at build/gin.jar containing all required dependencies.

If you want to use Gin with an IDE, it can be useful to have all the dependencies in a top-level directory:

gradle copyToLib

This will place all dependencies in the top-level lib directory.

Running Unit Tests

gradle test

Documentation

We provide the javadoc task, though the documentation is yet incomplete.

gradle javadoc

Running a Simple Example

If you build the fat jar as above, you can run Gin's local search on a simple example with:

java -jar build/gin.jar -f examples/triangle/Triangle.java -m "classifyTriangle(int,int,int)"

100 steps will be run by default. Your output will be similar to the following:

gin.LocalSearch.search() INFO: Localsearch on file: examples/triangle/Triangle.java method: classifyTriangle(int,int,int)
gin.LocalSearch.search() INFO: Original execution time: 195839357ns
gin.LocalSearch.search() INFO: Step: 1, Patch: | gin.edit.line.CopyLine examples/triangle/Triangle.java:17 -> examples/triangle/Triangle.java:18 |, Failed to compile 
gin.LocalSearch.search() INFO: Step: 2, Patch: | gin.edit.line.CopyLine examples/triangle/Triangle.java:10 -> examples/triangle/Triangle.java:23 |, Time: 395527186ns 
gin.LocalSearch.search() INFO: Step: 3, Patch: | gin.edit.line.DeleteLine examples/triangle/Triangle.java:40 |, New best time: 192528915(ns) 
..

The output tells you for each step whether a patch was invalid (e.g. because it could not be applied) or whether the patched code failed to compile or it failed to pass all provided tests. If none of these conditions hold, then the runtime is shown, and it is highlighted if the patch has resulted in the fastest successful passing of all tests seen so far. At the end, a brief summary is shown.

Logging

We use tinylog for logging, allowing for several logging variants, for example:

java -Dtinylog.format="{level}: {message}" -Dtinylog.level=debug -jar build/gin.jar -f examples/triangle/Triangle.java -m "classifyTriangle(int,int,int)"

Utilities

For input arguments to the list of utilities mentioned below, simply run:

java -cp build/gin.jar gin.<utility_name>

Analysing Patches

Gin provides a simple tool called PatchAnalyser, which will run a patch specified as a commandline argument and report execution time improvement, as well as dumping out annotated source files indicating statement numbering etc.

java -cp build/gin.jar gin.PatchAnalyser -f examples/triangle/Triangle.java -p "| gin.edit.line.DeleteLine examples/triangle/Triangle.java:10 |"

NOTE: Due to the way IDs in SourceFileTree are handled, IDs generated against a particular target source file might change from one version of Gin to another. So, don't try to run PatchAnalyser on patches generated by an old version of Gin on a newer version.

Profiling for Maven and Gradle projects

Gin also provides a Profiler that identifies those parts of the software most exercised by the project's unit tests.

Before you run the below, please make sure you have Maven installed. The default Maven home path is set to ' /usr/local/' (with binary in '/usr/local/bin/mvn'). Please change the path with -mavenHome or -h parameter, if need be.

java -cp build/gin.jar gin.util.Profiler -p my-app -d examples/maven-simple/ -h <path_to_mavenHome>

In case you want to use a Regression Test Selection (RTS) technique to speed up the profiling phase, you can use the gin.util.RTSProfiler class instead. RTS is fully supported for Maven projects.

java -cp build/gin.jar gin.util.RTSProfiler -p my-app -d examples/maven-simple/ -h <path_to_mavenHome> -rts ekstazi

Gin integrates 2 RTS techniques: Ekstazi (-rts ekstazi - default) and a Random selection (-rts random - not recommended). To disable it in gin.util.RTSProfiler and use all test cases to test all target methods, use the option -rts none.

The output is saved in profiler_output.csv. Note that this is empty for the simple project above as Profiler depends on jfr and inherits its constraints.

We've observed it's best to run Gin from within real-world project's repositories, in case test cases have some unexpected hard-coded dependencies.

A full example with an existing Maven project is given further below.

As sampler profiling is a stochastic process, it is also worth performing repeat runs, ideally with a reboot between runs. We have provided a tool to merge multiple profiler CSVs into a single file: gin.util.analysis.MergeProfilerFiles. This will retain only hot methods appearing in more than a specific fraction of the repeats, and takes the union of all unit tests observed as calling a given hot method.

Automated test case generation for Maven and Gradle projects

Gin uses EvoSuite to generate test cases automatically. Make sure test class file are available (all constraints of EvoSuite are inherited).

java -cp build/gin.jar gin.util.TestCaseGenerator -d examples/maven-simple -p my-app -classNames com.mycompany.app.App -generateTests -h <path_to_mavenHome>

Generated tests can be run and integrated with Profiler, PatchAnalyser and Samplers for Maven projects only. Gradle support is in development, though the generated tests can be run if EvoSuite dependency is added to the build.gradle file of the project (please see an example init.gradle file in the testgeneration directory).

Note that the below command will instrument the pom.xml file with the EvoSuite dependency. If invoking Samplers with the generated tests, please supply the EvoSuite dependency on the classpath, e.g., `java -cp build/gin.jar: testgeneration/evosuite-1.0.6.jar gin.'.

java -cp build/gin.jar gin.util.TestCaseGenerator -projectDir examples/maven-simple -projectName my-app -test -h <path_to_mavenHome>

You can also remove all tests from the project's directory before new test generation:

java -cp build/gin.jar gin.util.TestCaseGenerator -d examples/maven-simple -p my-app -removeTests

Samplers

We provide an abstract Sampler class that provides utilities for testing patches made at the class and method level. It takes method names and tests associated with those methods as input. Please note that you can simply supply the output of Profiler as an input file. Also note that we capture both the actual and expected result in case of failed assertions. Also, worth noting that the samplers will probably produce different lists of patches with different versions of Gin (owing to updates to Java syntax, the RNG, and supporting libraries that mean the space of possible edits changes).

We provide three subclasses to show example usage scenarios and show how easily Gin can be extended.

EmptyPatchTester simply runs all the project tests (declared in the input method file) through Gin:

java -cp build/gin.jar gin.util.EmptyPatchTester -d examples/triangle/ -c examples/triangle/ -m examples/triangle/method_file.csv

DeleteEnumerator deletes each line and each statement in a method (sampled at random, without replacement) and saves the result to sampler_output.csv.

java -cp build/gin.jar gin.util.DeleteEnumerator -d examples/triangle/ -c examples/triangle/ -m examples/triangle/method_file.csv

RandomSampler makes random edits to a project and saves the results to sampler_output.csv.

java -cp build/gin.jar gin.util.RandomSampler -d examples/triangle/ -c examples/triangle/ -m examples/triangle/method_file.csv

Assuming EvoSuite tests were generated and original tests not removed:

java -cp build/gin.jar:testgeneration/evosuite-1.0.6.jar gin.util.RandomSampler -d examples/maven-simple -p my-app -m examples/maven-simple/example_profiler_results.csv -h <path_to_mavenHome>

The CSV written out by RandomSampler contains one line per unit test per patch. The utility gin.util.analysis.AggregateRandomSamplerOutput can be used to aggregate these results to one line per patch, with summary statistics indicating test pass rates and run times.

Gin also offers an implementation of the multi-objective algorithm NSGA-II for improving the execution time and memory consumption of software. You can run it similarly to the other samplers shown above, for the example triangle project the following command should be called:

java -cp build/gin.jar gin.algorithm.nsgaii.NSGAII -d examples/triangle/ -c examples/triangle/ -m examples/triangle/method_file.csv

Test Runners

The tests can be run internally (through InternalTestRunner), or in a separate jvm (through ExternalTestRunner). Moreover, each test case can be run in a separate jvm as well. This covers the situation where a target project has multiple threads that mean tests cannot be run safely in parallel. We added these options to Samplers, for example:

Tests run in a separate jvm:

java -cp build/gin.jar gin.util.EmptyPatchTester -d examples/triangle/ -c examples/triangle/ -m examples/triangle/method_file.csv -j

Each test case run in a separate jvm:

java -cp build/gin.jar gin.util.EmptyPatchTester -d examples/triangle/ -c examples/triangle/ -m examples/triangle/method_file.csv -J

Full Example with a Maven Project

We will now try cloning, profiling, and sampling for a project taken from GitHub: spatial4j. This example is intended to run with Java 8, so please use Gin version 2.1 (check out tag v2.1). Until we have a chance to update this tutorial, please try substituting spatial4j with jCodec [https://github.com/jcodec/jcodec] - it is longer running but does work under Java 17.

First, move into the examples directory for working, and clone the project.

cd examples
git clone https://github.com/locationtech/spatial4j.git
cd spatial4j
git checkout tags/spatial4j-0.7

Build with maven to ensure we have all the dependencies we need, that the original java source has compiled, and that the tests all run:

mvn compile
mvn test

Run Gin's Profiler. In this case we limit to the first 20 tests for speed using -n 20. Results are written to a CSV. This CSV is used to specify the target methods and unit tests for the samplers below.

projectnameforgin='spatial4j'; java -Dtinylog.level=trace -cp ../../build/gin.jar gin.util.Profiler -r 1 -mavenHome /usr/share/maven -n 20 -p $projectnameforgin -d . -o $projectnameforgin.Profiler_output.csv

Run EmptyPatchTester. This serves as a baseline, showing the performance of the original unaltered code against the unit tests.

projectnameforgin='spatial4j'; java -Dtinylog.level=trace -cp ../../build/gin.jar gin.util.EmptyPatchTester -J -p $projectnameforgin -d . -m $projectnameforgin.Profiler_output.csv -o $projectnameforgin.EmptyPatchTester_output.csv  -mavenHome /usr/share/maven

Run RandomSampler to test the effect of different edits in the space. Here, we limit to statement edits; we allow only 1 edit per patch; and we test 100 edits sampled at random.

projectnameforgin='spatial4j'; editType='STATEMENT'; patchSize='1'; patchNumber='100'; java -Dtinylog.level=trace -cp ../../build/gin.jar gin.util.RandomSampler -j -p $projectnameforgin -d . -m $projectnameforgin.Profiler_output.csv -o $projectnameforgin.RandomSampler_${editType}_patchSize${patchSize}_patchNumber${patchNumber}_output.csv -mavenHome /usr/share/maven -editType $editType -patchNumber $patchNumber -patchSize $patchSize

Contributing

Please feel free to open issues and submit pull requests. If you'd like to aid in the development of Gin more generally, please get in touch with Sandy Brownlee and Justyna Petke.

License

This project is licensed under the MIT License. Please see the LICENSE.md file for details. By submitting a pull request, you agree to license your contribution under the MIT license to this project.

gin's People

Contributors

Stargazers

Watchers

gin's Issues

lonely SortTest.java

There is a lonely SortTest.java in examples/locoGP

Testing

New tests are welcome :)

Re-design and tidy-up

think about re-structuring to pull out common parts into one place for easier manipulation, e.g., have a fitness folder along the lines of edits folder

features for "patch analyser"

To support the final assessment of the impact of patches, it would be good to provide a bit of help to the users. Most of the following thoughts are around an optional testing phase that happens immediately after the actual optimisation phase (the evolution). This is a bit similar to what is implemented in the automated algorithm configurator irace.

Some random thoughts, in no particular order:

For the functional properties (e.g. "number of tests passed"), the final test phase provides an easy-to-use overview. This becomes especially important in case Gin works with more than just a single solution at a time in the future, e.g. by considering trade-offs or populations or multiple solutions that all satisfy the functional requirements are found. As functional tests are typically deterministic in their outcome, no special handling of noise is necessary.
This also applies to measurements of deviation from the oracle (think of a target vector from which the actual calculations of the patched code deviate).
However, some measurements are not deterministic: runtime, memory consumption (debatable), energy consumption, ... For these cases, in order to achieve meaningful and trustworthy observations that can be used/published, it would be good if tests can be run repeatedly. In irace, I have to specify this for each test (== problem instance) individually. As tests can vary in the resources needed, we can go this way or simply let the user provide a global setting, e.g., "do 100 repetitions". If Gin does the repetitions, then the individual results should be piped into a CSV file, because you never know what kind of statistical analyses the user wants to run.
Statistics: if we do the processing on Gin's end, then we need to decide what statistics to provide. min/max/average/median/stdev/quartiles might be a good start - and already more than what the average user needs. The alternative is to let the user handle this, which is how irace handles it. It does not know anything about repetitions, but just runs the configurations on the provided test instances.
Testing & side-effects & plenty of noise: as some tests can be affected by noise and can have side-effects (e.g. due to warming up of the system, due to side-effects, due to dynamic overclocking of single cores, ...), and in case Gin is to do the repetitions, then Gin might have to execute patches A/B/C/D alternatingly, in order to bias the measurements as little as possible and to average out noise: ABCDABCDABCD...
For two patches one might actually want to do ABABAB...BABABA, but it is not clear to me how to extend this to larger sets of patches - therefore the suggested round-robin.

A few of these thoughts have consequences for Gin's internal survival mechanism (read: which solution to pick in case of a local search), but that's a different story...

Test Case for Examples

We should add a new test case that compiles all the examples in ./examples/ and runs their tests against them.

The examples are excluded from the main source tree to ensure dynamic class loading works correctly, but this also means they're not being properly built and tested.

An alternative would be to have gradle build and test them. Unsure what's best.

Bug? Size zero patches possible

LocalSearch.neighbour() can generate a patch with no Edits in it, by deleting an edit when patch size == 1. Not sure if this is intentional or not!

Method not found error thrown from SourceFile

Not sure what the reason is yet. If a method has signature:
"methodname Map<List, Integer>> methodname(int, Map<List, Integer>)"
the following is thrown when trying to setup SourceFile:
"gin.SourceFile.getTargetMethodRootNodesFromCU() ERROR: Couldn't find these methods:.."
I narrowed it down to gin.misc.FullyQualifiedNames.getFQName(MethodDeclaration m) which returns the wrong FQName, namely:
"Integer>methodname(int,Map<List,Integer>)"
instead of:
"methodname(int,Map<List,Integer>)"

similar issue occurs for the following method signature:
"List<Pair<Integer,Integer>> methodname(int , int , int)"
returns:
"Integer>>methodname(int,int,int)"
instead of:
"methodname(int,int,int)"

Tutorial

It would be nice to have a tutorial :) e.g. if you have a new edit/search how can you experiment with Gin? also a hello world (trianlge) example etc.

Test index incorrect

Looks like a regression after commit 60262c1 that I didn't spot when checking the PR.

Should be easily fixed. Working on it now.

command-line output improvement

Example:

Markuss-MBP:git-gin wagner$ java -jar build/gin.jar examples/locoGP/SortBubbleDouble.java
Optimising source file: examples/locoGP/SortBubbleDouble.java

Initial execution time: 5377250.0 (ns)

Step 1 | COPY 11 -> 1:0 |*** New best *** Time: 4127871.0(ns)
Step 2 |Time: 5673439.0
Step 3 |Time: 4937142.0
Step 4 | COPY 11 -> 1:0 | DEL 12 |*** New best *** Time: 2406271.0(ns)
Step 5 | COPY 11 -> 1:0 | DEL 12 | COPY 11 -> 5:0 |Time: 4719200.0
Step 6 | DEL 12 |Time: 4197840.0
Step 7 | COPY 11 -> 1:0 | DEL 12 | DEL 15 |Patch invalid
Step 8 | COPY 11 -> 1:0 |Time: 5805536.0
Step 9 | COPY 11 -> 1:0 |Time: 3091739.0
Step 10 | COPY 11 -> 1:0 | DEL 12 | DEL 4 |Patch invalid
Step 11 | COPY 11 -> 1:0 | DEL 12 | MOVE 8 -> 3:1 |Time: 4251221.0
Step 12 | DEL 12 |Time: 2804988.0
Step 13 | COPY 11 -> 1:0 | DEL 12 | MOVE 19 -> 0:2 |Time: 3854282.0
Step 14 | DEL 12 |Time: 2794879.0
Step 15 | DEL 12 |Time: 2700009.0
Step 16 | COPY 11 -> 1:0 | DEL 12 | COPY 1 -> 6:0 |Time: 1.5343549E7
Step 17 | COPY 11 -> 1:0 | DEL 12 | DEL 13 |Patch invalid
Step 18 | DEL 12 |*** New best *** Time: 2008354.0(ns)
Step 19 | DEL 12 | DEL 0 |Failed to compile
Step 20 | DEL 12 | COPY 5 -> 2:0 |Time: 4259756.0
Step 21 |Time: 4324886.0
Step 22 | DEL 12 | MOVE 12 -> 6:0 |Patch invalid
Step 23 | DEL 12 | COPY 10 -> 4:0 |Failed to compile
Step 24 |Time: 2329325.0
Step 25 |Time: 3040417.0
Step 26 |Time: 2725780.0
Step 27 |*** New best *** Time: 1742064.0(ns)
Step 28 | MOVE 5 -> 1:0 |Failed to compile
Step 29 | MOVE 3 -> 2:0 |Time: 8787057.0
Step 30 | COPY 2 -> 1:0 |Time: 4770382.0
Step 31 | COPY 18 -> 5:0 |Failed to compile
Step 32 | DEL 17 |Time: 2432983.0
Step 33 | DEL 16 |Failed to compile
Step 34 | MOVE 1 -> 4:0 |Failed to compile
Step 35 | MOVE 6 -> 0:1 |Patch invalid
Step 36 | COPY 2 -> 2:0 |Failed to compile
Step 37 | COPY 6 -> 0:2 |Failed to compile
Step 38 | COPY 13 -> 4:0 |Failed to compile
Step 39 | MOVE 9 -> 3:1 |Failed to pass all tests
Step 40 | MOVE 9 -> 0:2 |Failed to compile
Step 41 | COPY 19 -> 0:0 |Failed to compile
Step 42 | DEL 2 |Patch invalid
Step 43 | DEL 11 |Patch invalid
Step 44 | MOVE 0 -> 2:0 |Failed to compile
Step 45 | MOVE 3 -> 2:0 |Time: 8274254.0
Step 46 | COPY 10 -> 0:1 |Time: 4223412.0
Step 47 | MOVE 11 -> 4:0 |Patch invalid
Step 48 | COPY 3 -> 5:0 |Failed to compile
Step 49 | MOVE 18 -> 5:0 |Failed to compile
Step 50 | MOVE 5 -> 6:2 |Failed to compile
Step 51 | DEL 2 |Patch invalid
Step 52 | DEL 10 |Time: 2993312.0
Step 53 | COPY 11 -> 6:0 |Failed to compile
Step 54 | MOVE 8 -> 3:2 |Time: 2385970.0
Step 55 | COPY 11 -> 4:0 |Time: 2803413.0
Step 56 | DEL 5 |Time: 3657767.0
Step 57 | COPY 9 -> 1:0 |Failed to compile
Step 58 | DEL 13 |Patch invalid
Step 59 | MOVE 4 -> 0:2 |Patch invalid
Step 60 | DEL 15 |Patch invalid
Step 61 | DEL 10 |Time: 2856155.0
Step 62 | DEL 4 |Patch invalid
Step 63 | MOVE 18 -> 1:0 |Failed to compile
Step 64 | COPY 5 -> 1:0 |Failed to compile
Step 65 | COPY 0 -> 2:0 |Failed to compile
Step 66 | MOVE 7 -> 5:0 |Failed to compile
Step 67 | COPY 2 -> 0:0 |Time: 2157261.0
Step 68 | DEL 13 |Patch invalid
Step 69 | MOVE 3 -> 0:2 |Time: 5331342.0
Step 70 | DEL 5 |Time: 3313758.0
Step 71 | COPY 16 -> 6:1 |Failed to compile
Step 72 | COPY 2 -> 0:1 |*** New best *** Time: 1394517.0(ns)
Step 73 |Time: 5582996.0
Step 74 | COPY 2 -> 0:1 | DEL 0 |Failed to compile
Step 75 | COPY 2 -> 0:1 | DEL 0 |Failed to compile
Step 76 | COPY 2 -> 0:1 | DEL 12 |Time: 3002779.0
Step 77 |Time: 5694114.0
Step 78 | COPY 2 -> 0:1 | MOVE 7 -> 5:0 |Failed to compile
Step 79 |Time: 3539437.0
Step 80 | COPY 2 -> 0:1 | MOVE 8 -> 2:0 |Failed to pass all tests
Step 81 | COPY 2 -> 0:1 | COPY 16 -> 0:1 |Failed to compile
Step 82 | COPY 2 -> 0:1 | MOVE 3 -> 5:0 |Failed to compile
Step 83 |Time: 2307858.0
Step 84 |Time: 4806585.0
Step 85 |Time: 4796634.0
Step 86 | COPY 2 -> 0:1 | DEL 4 |Patch invalid
Step 87 |Time: 2615552.0
Step 88 | COPY 2 -> 0:1 | COPY 13 -> 6:1 |Failed to compile
Step 89 | COPY 2 -> 0:1 | MOVE 15 -> 4:0 |Patch invalid
Step 90 |Time: 3400359.0
Step 91 |Time: 1422670.0
Step 92 | COPY 2 -> 0:1 | DEL 19 |Failed to compile
Step 93 | COPY 2 -> 0:1 | COPY 14 -> 2:0 |Time: 2111640.0
Step 94 | COPY 2 -> 0:1 | MOVE 6 -> 2:0 |Patch invalid
Step 95 |Time: 8795765.0
Step 96 | COPY 2 -> 0:1 | MOVE 14 -> 0:2 |Failed to compile
Step 97 |Time: 7091829.0
Step 98 |Time: 6114283.0
Step 99 |Time: 8386180.0
Step 100 |Time: 1971654.0

Best patch found: | COPY 2 -> 0:1 |
Found at step: 72
Best execution time: 1394517.0 (ns)
Speedup (%): 0.7406635361941513

==> Does *** New best *** mean that this patch passes all test cases, or just more than before? I assume Gin does not optimise in a multi-objective fashion right away, but maybe it optimises lexicographically <testsFailed,runtime>? Or what does it do? Minimise time, but always passing all tests? Yes, I have yet to look into the source code... Given the distinction of "patch invalid" and "failed to compile", I think the goal is runtime minimisation under the condition that all tests are passed, right? If so, this might be a statement for the Readme.md, e.g., "Running a Simple Example" with an explanation of the terms used.

==> Maybe "Speedup (%): 0.7406635361941513" can be rewritten to "Speedup (%): 74.06635361941513"? Strictly speaking, the speedup is no 0.74%.

SourceFileTree.getStatement() and getNode() don't clone properly

These methods create clones of the nodes to return, but use clone() rather than CloneVisitorCopyIDs so any child nodes don't get copied with their associated IDs. Only really an issue for debugging but could trip someone up sometime!

Bug? JUnit tests and classes in Gin jar

Ordinarily if there is a bug in the test class, then we get a compile error, and if a test fails, there is a "failed to pass all tests".

In examples/TriangleTest.java, if all references to the Triangle class are replaced with ExampleTriangleProgram, then Gin will run, but reporting much faster run times (one example just now has 1203275 ns for the unpatched program), like the delay code didn't trigger but everything else did.

I assume this is because the JUnit reflection code finds ExampleTriangleProgram in the Gin classpath and runs it, though it doesn't explain why the delay isn't firing. Possibly this is just "something to watch out for"?

TriangleTest needs compiling to run Triangle Example out of the box.

Reported by a few users - the TriangleTest class file is not in Gin, and therefore the minimal example doesn't run.

Operator separation of concerns

Currently, the Edit classes are just wrappers for ints, which are then used by Patch.apply(). It would be cleaner to put the operation in each of the Edit classes - though this would mean they have access to the lists of statements and blocks in Patch so not easy to separate out.

Java 1.11 Compatibility

A few people have reported problems with future versions of the JDK and Gin. This could be a regression, but we should do some proper evaluation and work out what's going on.

Improve GP search

Basic GenProg-like GP search added in utilities. A few efficiency improvements could be made: history recording (to avoid fitness-recalculations), early termination (e.g. for bug fixing when fix is found, or on first timeout for runtime improvement). Tests are also missing.

Extend TestCaseGenerator to fully support Gradle projects

As it stands for Gradle projects we can generate tests, but we will have to manually modify the build.gradle file in the given gradle project in order to add the EvoSuite dependency (as shown in testgeneration/init.gradle), so that they could be run with gradle test command (for profiling). For Maven projects the pom is updated programmatically. Theoretically one could use FileWriter to do this task for Gradle, but it would be much nicer to do this programmatically and it seems like wtih gradle tooling api version 6.6.1 it would be possible, hence I am not implementing the FileWriter hack. However, currently updating the tooling dependency breaks a Gin test, so perhaps it's best to put this issue on hold until we are ready for a general upgrade of dependencies.

Add timing sampler

Add new sampler (well, copy tidy versions of the one from private repo) from this paper: Brownlee AEI, Petke J & Rasburn AF (2020) Injecting Shortcuts for Faster Running Java Code. In: IEEE World Congress on Computational Intelligence, Glasgow, 19.07.2020-24.07.2020. Piscataway, NJ, USA: IEEE. https://wcci2020.org/

This sampler extends RandomSampler, by adding repeat runs for successful patches, interleaved with runs of the original code, to allow comparisons of run times.

diff generator

We need the facility to output a Gin patch as a standard diff.

Multiclass Capability

Currently, gin only supports single-class optimisation.

A paper in development, and general application, requires the ability to handle larger systems.

This issue is to discuss and record development to that end.

Only tests with annotations will be run with LocalSearch and their timeout modified

This is due to implmentation of JUnitBridge.annotateTestWithTimeout and TestRunner.testsForClass methods which assume each individual Junit test is annotated. This is to avoid running setup/cleanup as a test, I think. Temporary workaround is to get all declared methods and only run those that start with "test" in its name, i.e:
add the following at line 84 in TestRunner (and import java.lang.reflect.Method;):
if (methods.size()==0) { for (Method eachTestMethod : clazz.getDeclaredMethods()) { if (eachTestMethod.getName().startsWith("test")) { String methodName = eachTestMethod.getName(); UnitTest test = new UnitTest(testClassName, methodName); tests.add(test); } } }
Need a cleaner solution.

Implement Continuous Integration

Gin needs CI. We need it to accelerate development and make things scale.

Someone want to step up for this?

broken test cases

Markuss-MBP:git-gin wagner$ java -jar build/gin.jar examples/locoGP/SortBubble.java
Optimising source file: examples/locoGP/SortBubble.java

Initial execution time: -1.0 (ns)

Step 1 | COPY 5 -> 1:0 |Failed to compile
Step 2 | MOVE 10 -> 3:0 |Failed to compile
...
Step 100 | DEL 3 |Failed to compile

Best patch found: |
Found at step: 0
Best execution time: -1.0 (ns)
Speedup (%): -0.0

==> This seems to be a broken test case?
==> The same for SortHeap.java, SortInsertion.java, SortMerge.java, SortQuick.java, SortRadix.java, SortSelection.java, SortSelection2.java, and SortShell.text - unless the goal here is to fix a bug, of course.

Profiler.java: explicitly set sampling interval

https://github.com/gintool/gin/blob/master/src/main/java/gin/util/Profiler.java

line67 now:
private static final String HPROF_ARG = "-agentlib:hprof=cpu=samples,lineno=y,depth=1,file=";

amend to:
private static final String HPROF_ARG = "-agentlib:hprof=cpu=samples,interval=10,lineno=y,depth=1,file=";

as interval=10 is hprof's current default.

source:
https://docs.oracle.com/javase/8/docs/technotes/samples/hprof.html

Line edits target empty and comment lines

Line edits can target any lines in a source file, including empty and comment-only lines. The constructors currently target one or two of these lines chosen uniformly at random. It might make more sense that comment/empty lines are excluded from the edit source and destination lines.

It is already possible to get a list of lines excluding those that are empty/comments by calling SourceFileLine.getLineIDsNonEmptyOrComments(). The samplers and enumerators do this, choose a pair of line numbers from that list, and call the Line edit constructors with specific line numbers rather than just an RNG and a source file. Something like this:

[code]
List allLines = sourceFile.getLineIDsNonEmptyOrComments(false);
List targetMethodLines = sourceFile.getLineIDsNonEmptyOrComments(true);
CopyLine cl = new CopyLine(sourceFile,targetMethodLines.get(rng.nextInt(targetMethodLines.size())), sourceFile, allLines.get(rng.nextInt(allLines.size())));
[/code]

We could either change the standard (SourceFile, RNG) constructors for Line edits to do this; or make an extended version of Line edits that only target non-empty/comment lines. My preference is for the latter; anyone have any thoughts either way?

Add "shortcut" operators

Add new operators (well, copy tidy versions of the ones from private repo) from this paper: Brownlee AEI, Petke J & Rasburn AF (2020) Injecting Shortcuts for Faster Running Java Code. In: IEEE World Congress on Computational Intelligence, Glasgow, 19.07.2020-24.07.2020. Piscataway, NJ, USA: IEEE. https://wcci2020.org/

Test cross-platform support

Test Gin on Windows and ensure all code is cross-platform, particularly unit tests.

FullyQualifiedNames.getClassName() confused by inner classes

It seems that if inner classes appear before the main class declaration, this method can pick those up instead, with can throw off detection of target methods.

e.g. in Disruptor, trying to target
com.lmax.disruptor.Sequence.compareAndSet(long,long)
gets broken because Sequence.java contains classes LhsPadding,Value, and RhsPadding before Sequence.

Add a list of publications section to README

Add a section to the Gin repo with papers that use Gin. We could add a note, "if you used Gin in your research, please let us know, so we could add your paper to the list below”

Allow end user to choose specific edits

As the list of edits is likely to grow, it might be better to allow a user to specify specific edits that can be used in sampling/searching, rather than the "families" of edits specified by EditType at the moment. I expect something like allowing a list of string of edit names, which are then loaded using reflection.

This will mean that all edits will need a standard constructor (probably taking a sourcefile and RNG as arguments).

Early termination of test execution

I was talking to Justyna and she raised up the idea of early termination of test executions when a single test fails.
Thus, instead of executing the test suite until the end, we can stop the execution at the first failing test case and return the maximum fitness value.

We can alternatively set it as default (since I believe it is going to speed up considerably the optimisation), and add a command line option such as "-ft" (full testing) to run all test cases even when one fails, just in case this functionality is needed.

We have to also think how to do it in case of multiple repetitions of test case executions.

SortCocktail example throws exception

gradle shadowJar
java -jar build/gin.jar ./examples/locoGP/SortCocktail.java

produces

Step 93 | COPY 2 -> 0:2 | DEL 12 | MOVE 16 -> 2:0 |Exception in thread "main" java.lang.AssertionError: I am not a child of my parent.

Improve README instructions based on Markus' feedback

From Markus' original comments:

Incorporate Integration Tests

The tests in IntegrationTests.java test the dynamic loading of in-memory compiled classes and use java assertions. These need to be turned into proper integration tests. I was unable to get these working with junit, because it interferes with the custom class loading.

So for now, integrating these into cradle to run on "gradle test" would be a good idea.

Inner Class Support for In-Memory Compilation

Currently in-memory compilation assumes a single class under optimisation, and that the class does not contain inner classes. The InMemoryCompiler used by gin will actually compiler the inner class, but it doesn't return the inner class' compiled object. We need to fork InMemoryCompiler and add a method to return a collection of class objects, including any inner classes, rather than a single class.

This step is necessary to support #25

Mismatch between treatment of invalid patches between Internal and External test runner

Basically patchValid check is not performed in ExternalTestRunner. Spoted by @GiovaniGuizzo , thank you.
We need to decide on default behaviour and consistently apply it for both.

Edit fromString methods assume no spaces in filenames

These methods typically use spaces to split into tokens for parsing. This split will include any spaces in the source file name.

license: Apache 2.0 includes a patent part

Some comparisons:
https://exygy.com/which-license-should-i-use-mit-vs-apache-vs-gpl/
https://www.quora.com/Whats-the-difference-between-Apache-v2-0-and-MIT-license-What-are-the-major-consequences-of-using-Apache-software-over-MIT
https://snyk.io/blog/mit-apache-bsd-fairest-of-them-all

Information overload: https://en.wikipedia.org/wiki/Comparison_of_free_and_open-source_software_licenses

TLDR: yes, the MIT one is short and sweet and in plain English, but I do like the Apache 2.0 one for being just a little bit more detailed. Maybe this is something to consider.

Deprecated / unchecked warnings

There are many compilation warnings - some due to deprecated calls to JavaParser but plenty of others.

Add this to build.gradle to see them all:

tasks.withType(JavaCompile) { configure(options) { options.compilerArgs << '-Xlint:deprecation' << '-Xlint:unchecked' } }

Running Gin on one of its dependencies

If Gin is run of one of it’s dependencies (e.g. commons libraries) a linkage error occurs if a different version off the same library is used. This is to be expected as two classes with the same name are loaded to the CacheClassLoader. In this case InternalTestRunner simply should not be run: we need to confirm that it's okay with ExternalTestRunner, and then add a note on that to the README.

Add a statement mutation operator

Not Java-specific as such but I think will be heavily slanted towards Java syntax. The idea is to look within statements for possible modifications. So e.g.:

for operators, look for things that take the same arguments and return values (e.g. == =! > < are all mutually swappable)
for an assignment, swap in a new variable or value for the RHS. This will mean maintaining a list of variables for the class (or possibly just for the parent statement so they are in scope)

Operator to mutate boolean expressions

Mainly to benefit from short circuiting, add a mutation operator to vary the order content of boolean expressions (so, look for || or && and swap their child nodes)

Add test case generation

Use Evosuite before applying GIN
(do we want to integrate evolsuite with gin?)

Using more tests in GIN should give a smoother gradient on a problem, provided each test is somewhat independent. Not sure what fitness evosuite uses, but it could certainly help to automatically boost the test cases in GIN.

Standardise and improve Javadoc

The current Javadoc is not complete. I have already improved and am still improving the Javadoc of some of the classes. For example: CacheClassLoader.

My suggestion is to improve and standardise the Javadoc while modifying a class, rather than improving the Javadoc of the whole prohect in one go (too boring).

If you are willing to add Javadoc, here is a link to best practices of documentation: Oracle's Guide.

Multiple CopyStatement to the same block are all applied at the same location

Current behaviour:

When multiple statements are inserted into the same block they are all applied at the location of the earliest insertion.

Expected behaviour:

Insertions should happen at the positions at which they are applied individually.

Steps to reproduce:

Run the following command, inserting the statement delay(); in the source code: ("35 -> 34:61")

java -cp build/gin.jar gin.PatchAnalyser -f examples/triangle/Triangle.java -p "| gin.edit.statement.CopyStatement examples/triangle/Triangle.java:35 -> examples/triangle/Triangle.java:34:61 |"

Then diff examples/triangle/Triangle.java.original examples/triangle/Triangle.java.patched -c yields:

*** examples/triangle/Triangle.java.original	2020-12-10 20:44:36.721198276 +0000
--- examples/triangle/Triangle.java.patched	2020-12-10 21:07:01.147888414 +0000
***************
*** 16,21 ****
--- 16,22 ----
              a = b;
              b = tmp;
          }
+         delay();
          if (a > c) {
              int tmp = a;
              a = c;

Similarly, run the following command, inserting the statement delay(); 10 lines later in the source code: ("35 -> 34:96")

java -cp build/gin.jar gin.PatchAnalyser -f examples/triangle/Triangle.java -p "| gin.edit.statement.CopyStatement examples/triangle/Triangle.java:35 -> examples/triangle/Triangle.java:34:96 |"

diff examples/triangle/Triangle.java.original examples/triangle/Triangle.java.patched -c yields:

*** examples/triangle/Triangle.java.original	2020-12-10 20:44:36.721198276 +0000                
--- examples/triangle/Triangle.java.patched	2020-12-10 21:15:36.664564081 +0000
***************
*** 26,31 ****
--- 26,32 ----
              b = c;
              c = tmp;
          }
+         delay();
          if (a + b <= c) {
              return INVALID;
          } else if (a == b && b == c) {

Applied at the same time, both insertions happen at the same position. ("35 -> 34:61 | 35 -> 34:96" is incorrectly applied as "35 -> 34:61 | 35 -> 34:61")

java -cp build/gin.jar gin.PatchAnalyser -f examples/triangle/Triangle.java -p "| gin.edit.statement.CopyStatement examples/triangle/Triangle.java:35 -> examples/triangle/Triangle.java:34:61 | gin.edit.statement.CopyStatement examples/triangle/Triangle.java:35 -> examples/triangle/Triangle.java:34:96 |"

The diff yields:

*** examples/triangle/Triangle.java.original	2020-12-10 20:44:36.721198276 +0000                
--- examples/triangle/Triangle.java.patched	2020-12-10 21:18:07.734566718 +0000
***************
*** 16,21 ****
--- 16,23 ----
              a = b;
              b = tmp;
          }
+         delay();
+         delay();
          if (a > c) {
              int tmp = a;
              a = c;

Expected behaviour: the diff should yield instead:

*** examples/triangle/Triangle.java.original	2020-12-10 21:21:21.964570109 +0000               
--- examples/triangle/Triangle.java.patched	2020-12-10 21:21:04.691236475 +0000
***************
*** 16,21 ****
--- 16,22 ----
              a = b;
              b = tmp;
          }
+         delay();
          if (a > c) {
              int tmp = a;
              a = c;
***************
*** 26,31 ****
--- 27,33 ----
              b = c;
              c = tmp;
          }
+         delay();
          if (a + b <= c) {
              return INVALID;
          } else if (a == b && b == c) {

Make all classes Serializable

I propose making all classes Serializable (i.e. making them implement java.io.Serializable).
This will allow Gin's objects to be serialized by projects that depend on it or by users that want to perform some kind of distributed computing (e.g. Apache Spark, Hadoop, etc).

Performance issue with the parsing of the method file

There is a minor performance issue with the parsing of the method file.
This block is not very efficient when we have thousands of test cases.

A HashSet instead of a List solves this problem. I tried to change it, but apparently the method index in this List is used in some classes. The refactoring would change the functionality of such classes, so I decided to create this issue instead.

If I end up fixing this issue, I'll commit the changes.
Cheers,
Giovani.

Upgrade JavaParser dependency

It's rather behind, and much has been improved.

Test poisoning over multiple repetitions in ExternalTestRunner

I came across a problem with test poisoning when executing multiple repetitions of a same test suite while using ExternalTestRunner.
Basically, in the second execution of a test case (more precisely, this one), it fails.
Apparently there are some interactions between tests. Probably a static variable is being changed by other tests and it is making this test fail.

I fixed the bug in my development branch by forcing the start of a new process after each repetition, i.e., after all test cases in the test set are executed, the next test set repetition is done in a clean process (see it here).
This makes the test pass (intended behaviour).

However, I am not sure if this should be considered default behaviour.
In any case, I am happy to fix this bug, I just need your input on how to properly do it.
A few options:

Add this feature as default;
Add this feature as default, but also add an option to Sampler to disable it;
Do not add the feature as default, but add an option to enable it in Sampler.

Cheers,
Giovani.

Failure of RTS tests due to uninitiated .ekstazi directory

gin.util.regression.impl.EkstaziRTSTest.testConstructor fails as examples/unittests/ekstazi/.ekstazi is not created by Config.createStorer(); - no idea why; need to consult with Giovani, to be resolved later