fse-24-unitrans's Introduction

Exploring and Unleashing the Power of Large Language Models in Automated Code Translation

Preparation

  • JDK 17
  • javafx-sdk-20 (see https://openjfx.io/openjfx-docs/#introduction)
  • Stack backtrace for C++: https://github.com/NEWPLAN/newplan_toolkit/backtrace
  • EMMA coverage tool: https://emma.sourceforge.net/
  • transformers == 4.30.2
  • torch == 1.12.1
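
Before running the pipeline, it can help to confirm that the two pinned Python packages match. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `find_mismatches` are hypothetical, and the check covers only the pip-installable pins (the JDK, JavaFX, backtrace, and EMMA tools need to be verified separately).

```python
from importlib import metadata

# Pins copied from the Preparation list.
PINNED = {"transformers": "4.30.2", "torch": "1.12.1"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Hypothetical helper: map each off-pin package to (expected, found)."""
    mismatches = {}
    for package, expected in pinned.items():
        found = lookup(package)
        # Ignore local build tags such as "1.12.1+cu113".
        if found is None or found.split("+")[0] != expected:
            mismatches[package] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for package, (expected, found) in find_mismatches(PINNED).items():
        print(f"{package}: expected {expected}, found {found}")
```

The script prints nothing when both packages match their pins.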

Attachments

  • Please find the data noise breakdown here: img.png
  • Please find the tmp.java file here: tmp.pdf.
  • Please find the statistical test results here: statistical test.pdf.
  • Please find the OJ experimental results, which are reported in the threats to validity section, in the folder oj_samples.

Cleaned Dataset

  • ./cleaned_data/testable_samples.jsonl: cleaned dataset used in this work, including parallel functions in Java, Python, and C++.
  • ./cleaned_data/transcoder_evaluation_gfg: test cases associated with the cleaned dataset.
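
To inspect the cleaned dataset, a generic JSON Lines reader is enough. The sketch below assumes only that the file holds one JSON object per line, as the `.jsonl` extension suggests. The record fields are not documented here, so the script lists them rather than guessing; `read_jsonl` is a hypothetical helper, not a function from the repository.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Parse one JSON value per non-empty line and return them as a list."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


if __name__ == "__main__":
    dataset = Path("./cleaned_data/testable_samples.jsonl")
    if dataset.exists():
        samples = read_jsonl(dataset)
        print(f"{len(samples)} samples loaded")
        # Field names are not documented, so print them instead of assuming any.
        if samples and isinstance(samples[0], dict):
            print("fields:", sorted(samples[0].keys()))
```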

Quick Start

  • Test Case Generation Phase

    1. generate inputs with LLMs (taking GPT3.5 as an example)
      python gpt3_5.py --dst_lang ${dst_lang} --obj 0 --test_case_num ${test_case_num} --k ${sample_k}
    
    2. collect test cases
      python process_valid_inputs.py --model ${model_name} --dst_lang ${dst_lang}
    
  • Translation Augmentation Phase

    1. translation augmentation (taking GPT3.5 as an example)
      python gpt3_5.py --src_lang ${src_lang} --dst_lang ${dst_lang} --obj 3 --k ${sample_k} --test_case_num ${test_case_num}  
    
    2. post-process translated programs
      python process_translation.py --src_lang ${src_lang} --dst_lang ${dst_lang} --suffix ${suffix}
    
    3. translation evaluation
      python fetch_feedbacks.py --model ${model_name} --src_lang ${src_lang} --dst_lang ${dst_lang} --test_case_num ${test_case_num} --round ${round}
    
  • Translation Repair Phase

    1. error info analysis
      python process_feedbacks.py --src_lang ${src_lang} --dst_lang ${dst_lang} --round ${round} --test_case_num ${test_case_num}  
    
    2. program repair
      python gpt3_5.py --src_lang ${src_lang} --dst_lang ${dst_lang} --obj 4 --k ${sample_k} --test_case_num ${test_case_num}
    
    3. post-process repaired programs
      python process_translation.py --src_lang ${src_lang} --dst_lang ${dst_lang} --suffix ${suffix}
    
  • Evaluation

    1. evaluation for computational accuracy
      python evaluation_CA.py --model ${model_name} --src_lang ${src_lang} --dst_lang ${dst_lang} --k ${CA@k} --timeout ${timeout} --suffix ${suffix}
    
    2. evaluation for exact match accuracy
      python evaluation_EM.py --model ${model_name} --src_lang ${src_lang} --dst_lang ${dst_lang} --suffix ${suffix}
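
The translation augmentation and repair steps can be chained from one small driver. The sketch below is not part of the repository: `RunConfig`, `round_commands`, and `run_round` are hypothetical names, and every value in the example configuration is a placeholder. It assumes that `fetch_feedbacks.py` takes `--round` in the same way `process_feedbacks.py` does, and that one suffix serves both post-processing steps. By default it only prints the commands.

```python
import shlex
import subprocess
from dataclasses import dataclass


@dataclass
class RunConfig:
    # All values are placeholders; substitute the ones used in your setup.
    model: str
    src_lang: str
    dst_lang: str
    sample_k: int
    test_case_num: int
    round_id: int
    suffix: str


def round_commands(cfg):
    """Return the augmentation and repair commands for one round, in order."""
    langs = ["--src_lang", cfg.src_lang, "--dst_lang", cfg.dst_lang]
    sampling = ["--k", str(cfg.sample_k), "--test_case_num", str(cfg.test_case_num)]
    # Assumption: the same suffix is used for both post-processing steps.
    post_process = ["python", "process_translation.py", *langs, "--suffix", cfg.suffix]
    return [
        ["python", "gpt3_5.py", *langs, "--obj", "3", *sampling],
        post_process,
        # Assumption: fetch_feedbacks.py takes --round, as process_feedbacks.py does.
        ["python", "fetch_feedbacks.py", "--model", cfg.model, *langs,
         "--test_case_num", str(cfg.test_case_num), "--round", str(cfg.round_id)],
        ["python", "process_feedbacks.py", *langs,
         "--round", str(cfg.round_id), "--test_case_num", str(cfg.test_case_num)],
        ["python", "gpt3_5.py", *langs, "--obj", "4", *sampling],
        post_process,
    ]


def run_round(cfg, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in round_commands(cfg):
        print(shlex.join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical example values, shown as a dry run.
    run_round(RunConfig(model="gpt3_5", src_lang="java", dst_lang="python",
                        sample_k=1, test_case_num=3, round_id=1, suffix="demo"))
```

Passing `dry_run=False` runs the scripts in sequence and stops at the first failing step, because `subprocess.run` is called with `check=True`.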
    
