Semantic-Preserving Program Transformations

This project contains the program transformation tool and the datasets of transformed programs for the paper 'On the Generalizability of Neural Program Models with respect to Semantic-Preserving Program Transformations' (arXiv, ScienceDirect), published in Information and Software Technology (IST), Elsevier, 2021.

Structure

├── JavaMethodTransformer   # source code for the program transformation tool.
└── images                  # some figures from the paper for README.

Motivating Example:

Figure 1: A misprediction in code2vec is revealed by renaming the variable other to var0 in the compareTo method of the java-small/test/hadoop/ApplicationAttemptId.java file.

Semantic Program Transformations:

Variable Renaming (VN) - renames a variable.
Permute Statement (PS) - swaps two independent statements in a basic block.
Unused Statement (UN) - inserts an unused string declaration.
Loop Exchange (LX) - replaces for loops with while loops or vice versa.
Switch to If (SF) - replaces a switch statement with an equivalent if statement.
Boolean Exchange (BX) - flips the value of a boolean variable (true to false or vice versa) and propagates the change throughout the method.
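To make the idea concrete, here is a small hand-written sketch of one method before and after three of the transformations listed above (VN, UN, and LX). It is an illustration only, not output of JavaMethodTransformer; the class name, method names, and the identifiers var0 and unused are assumptions chosen for the example.

```java
// Illustrative only: hand-written before/after versions of one method,
// not output produced by JavaMethodTransformer.
final class TransformationExample {

    // Original method: counts the positive values using a for loop.
    static int countPositive(int[] values) {
        int count = 0;
        for (int i = 0; i < values.length; i++) {
            if (values[i] > 0) {
                count++;
            }
        }
        return count;
    }

    // Same method after three transformations:
    //   Variable Renaming (VN): count -> var0
    //   Unused Statement (UN):  an unused string declaration is inserted
    //   Loop Exchange (LX):     the for loop becomes a while loop
    static int countPositiveTransformed(int[] values) {
        int var0 = 0;
        String unused = "dummy"; // never read, so behavior is unchanged
        int i = 0;
        while (i < values.length) {
            if (values[i] > 0) {
                var0++;
            }
            i++;
        }
        return var0;
    }

    public static void main(String[] args) {
        int[] sample = {3, -1, 0, 7};
        // Both calls return the same value because the semantics are preserved.
        System.out.println(countPositive(sample));
        System.out.println(countPositiveTransformed(sample));
    }
}
```

Both versions compute the same result for every input, which is why a model that predicts different method names for them is considered to have failed to generalize.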

Program Transformation Tool:

Create the jar file (JavaMethodTransformer.jar) using Maven and then call the jar with the following arguments:

args[0] = Input directory containing the original methods.
args[1] = Output directory for the transformed methods.

$ cd <...>/JavaMethodTransformer/
$ mvn clean compile assembly:single
$ java -jar target/jar/JavaMethodTransformer.jar <.../methods/> <.../transforms/>

Datasets of Transformed Methods:

single-place - apply the transformation to each candidate location separately.
all-place - apply the transformation to all candidate locations simultaneously.
x-percent - apply the transformation to a randomly selected X% of the candidate locations, where X ∈ {25, 50, 75}.
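The three dataset variants differ only in which candidate locations receive the transformation. The sketch below shows one way that selection could look. The class PlacementStrategies and its methods are hypothetical and are not taken from JavaMethodTransformer; candidate locations are represented as plain integer indices, and rounding the X% subset size up is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper; the names below do not come from JavaMethodTransformer.
final class PlacementStrategies {

    // single-place: one transformed program per candidate location.
    static List<List<Integer>> singlePlace(List<Integer> candidates) {
        List<List<Integer>> result = new ArrayList<>();
        for (Integer location : candidates) {
            result.add(Collections.singletonList(location));
        }
        return result;
    }

    // all-place: one transformed program that covers every candidate location.
    static List<Integer> allPlace(List<Integer> candidates) {
        return new ArrayList<>(candidates);
    }

    // x-percent: a random subset holding roughly X% of the candidates.
    // Rounding the subset size up is an assumption made for this sketch.
    static List<Integer> xPercent(List<Integer> candidates, int percent, Random random) {
        List<Integer> shuffled = new ArrayList<>(candidates);
        Collections.shuffle(shuffled, random);
        int size = (int) Math.ceil(shuffled.size() * percent / 100.0);
        return new ArrayList<>(shuffled.subList(0, size));
    }

    public static void main(String[] args) {
        List<Integer> candidates = Arrays.asList(0, 1, 2, 3);
        System.out.println(singlePlace(candidates));
        System.out.println(allPlace(candidates));
        System.out.println(xPercent(candidates, 50, new Random(42)));
    }
}
```

With four candidate locations, single-place yields four transformed programs, all-place yields one, and x-percent yields one program per value of X.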

Generalizability Metrics:

Prediction Change Percentage (PCP):

The percentage of predictions that differ between the original program and the transformed program.

Type of Changes:
- CCP - the percentage of correct predictions that stay correct.
- CWP - the percentage of correct predictions that become wrong.
- WWSP - the percentage of wrong predictions that stay the same wrong prediction.
- WCP - the percentage of wrong predictions that become correct.
- WWDP - the percentage of wrong predictions that change to a different wrong prediction.
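
The change types above can be computed from three aligned lists: the actual method names, the predictions on the original programs, and the predictions on the transformed programs. The following is a minimal sketch, not the evaluation code used for the paper. The class PredictionChange is hypothetical, and it assumes every percentage is taken over all compared pairs, so the five categories sum to 100%; a different denominator (for example, only the originally correct predictions) is equally plausible.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch; not the evaluation code used for the paper.
final class PredictionChange {

    enum Category { CCP, CWP, WWSP, WCP, WWDP }

    // Classifies one (actual, before, after) triple into a change type.
    static Category classify(String actual, String before, String after) {
        boolean correctBefore = actual.equals(before);
        boolean correctAfter = actual.equals(after);
        if (correctBefore) {
            return correctAfter ? Category.CCP : Category.CWP;
        }
        if (correctAfter) {
            return Category.WCP;
        }
        return before.equals(after) ? Category.WWSP : Category.WWDP;
    }

    // Share of each change type, as a percentage of all pairs (assumed denominator).
    static Map<Category, Double> percentages(String[] actual, String[] before, String[] after) {
        Map<Category, Double> result = new EnumMap<>(Category.class);
        for (Category category : Category.values()) {
            result.put(category, 0.0);
        }
        if (actual.length == 0) {
            return result;
        }
        double share = 100.0 / actual.length;
        for (int i = 0; i < actual.length; i++) {
            result.merge(classify(actual[i], before[i], after[i]), share, Double::sum);
        }
        return result;
    }

    // PCP: percentage of pairs whose prediction differs after the transformation.
    static double pcp(String[] before, String[] after) {
        if (before.length == 0) {
            return 0.0;
        }
        int changed = 0;
        for (int i = 0; i < before.length; i++) {
            if (!before[i].equals(after[i])) {
                changed++;
            }
        }
        return 100.0 * changed / before.length;
    }

    public static void main(String[] args) {
        String[] actual = {"getName", "getName", "setId", "run"};
        String[] before = {"getName", "getName", "getId", "start"};
        String[] after = {"getName", "getValue", "setId", "stop"};
        System.out.println(percentages(actual, before, after));
        System.out.println(pcp(before, after));
    }
}
```

Under this assumption, PCP equals the sum of CWP, WCP, and WWDP, since those are exactly the cases where the prediction changes.
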
Sub-token Comparison:
- Precision - the percentage of predicted sub-tokens that are true positives.
- Recall - the percentage of actual (ground-truth) sub-tokens that are correctly predicted.
- F1-Score - the harmonic mean of precision (P) and recall (R).
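As a worked example of the sub-token comparison, the sketch below scores one predicted method name against the actual name. It is an assumption-laden illustration rather than the evaluation script used in the paper: the class SubTokenScore is hypothetical, names are split on camelCase boundaries, matching is case-insensitive, each actual sub-token can be matched at most once, and scores are returned as fractions in [0, 1] rather than percentages.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the evaluation script used for the paper.
final class SubTokenScore {

    // Splits a camelCase name into lowercase sub-tokens,
    // e.g. "getFileName" -> [get, file, name].
    static List<String> subTokens(String methodName) {
        List<String> tokens = new ArrayList<>();
        for (String part : methodName.split("(?<=[a-z0-9])(?=[A-Z])")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase());
            }
        }
        return tokens;
    }

    // Returns {precision, recall, f1} for one predicted/actual name pair.
    static double[] score(String predicted, String actual) {
        List<String> predictedTokens = subTokens(predicted);
        List<String> remaining = new ArrayList<>(subTokens(actual));
        int actualCount = remaining.size();
        int truePositives = 0;
        for (String token : predictedTokens) {
            // remove(Object) returns true only if the token was still unmatched.
            if (remaining.remove(token)) {
                truePositives++;
            }
        }
        double precision = predictedTokens.isEmpty()
                ? 0.0 : (double) truePositives / predictedTokens.size();
        double recall = actualCount == 0
                ? 0.0 : (double) truePositives / actualCount;
        double f1 = (precision + recall) == 0.0
                ? 0.0 : 2 * precision * recall / (precision + recall);
        return new double[] {precision, recall, f1};
    }

    public static void main(String[] args) {
        double[] result = score("getFileName", "getName");
        System.out.printf("P=%.2f R=%.2f F1=%.2f%n", result[0], result[1], result[2]);
    }
}
```

For the prediction getFileName against the actual name getName, two of the three predicted sub-tokens are correct (P = 2/3) and both actual sub-tokens are recovered (R = 1), giving F1 = 0.8.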

Experimental Setting:

Target Downstream Task:
- Method Name Prediction (a.k.a. Code Summarization)
Subject Neural Program Models:
- code2vec model - represents programs with AST paths (monolithic path embeddings).
- code2seq model - represents programs with AST paths (encodes paths node by node).
- GGNN model - represents programs with graphs (semantic edges + nodes).
Original Java Datasets:
- Java-Small, Java-Med, and Java-Large

Citation:

On the Generalizability of Neural Program Models with respect to Semantic-Preserving Program Transformations

@article{rabin2021generalizability,
  title = {On the generalizability of Neural Program Models with respect to semantic-preserving program transformations},
  author = {Md Rafiqul Islam Rabin and Nghi D.Q. Bui and Ke Wang and Yijun Yu and Lingxiao Jiang and Mohammad Amin Alipour},
  journal = {Information and Software Technology},
  volume = {135},
  pages = {106552},
  year = {2021},
  issn = {0950-5849},
  doi = {https://doi.org/10.1016/j.infsof.2021.106552},
  url = {https://www.sciencedirect.com/science/article/pii/S0950584921000379}
}
