Comments (4)

murthyrudra commented on September 13, 2024

Hi @zorazrw, this is the command I ran:

python nl2code_codegen.py --language en --model_size 2B --model_data mono \
 --num_tests_input 0 --num_tests_eval 100 --num_examples 0 --temperature 0.8 \
 --top_p 0.95 --num_return_sequences 50

This is my environment:

- `transformers` version: 4.24.0
- Platform: Linux-4.18.0-425.13.1.el8_7.x86_64-x86_64-with-glibc2.17
- Python version: 3.8.11
- Huggingface_hub version: 0.11.1
- PyTorch version (GPU?): 1.12.1 (True)
- Tensorflow version (GPU?): 2.12.0 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: No

Please let me know if you need any other information.

from odex.

zorazrw commented on September 13, 2024

Nice catch on the whitespace stripping! Thanks a lot for running the comparison studies as well.
I tried to reproduce the results: adding prompt.strip() did improve them a lot, but the results on my end are still ~10 points lower than the ones you reported, as shown below:

Overall Pass@K Scores: 
[pass@1] 0.3465 (439)
[pass@2] 0.4220 (439)
[pass@3] 0.4615 (439)
[pass@4] 0.4861 (439)
[pass@5] 0.5027 (439)
[pass@6] 0.5147 (439)
[pass@7] 0.5236 (439)
[pass@8] 0.5304 (439)
[pass@9] 0.5358 (439)
[pass@10] 0.5399 (439)

Would you be able to provide more configuration details, or do you spot any settings that may differ?
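For anyone following along, the whitespace fix amounts to stripping the prompt before it is sent to the model. The snippet below is only a minimal sketch: `raw_prompt` is a hypothetical example, and the actual prompt construction in nl2code_codegen.py is not shown in this thread and may differ.

```python
# Hypothetical prompt ending in an indented blank line; the real prompt
# format used by the script is an assumption here.
raw_prompt = 'def f(x):\n    """Return x squared."""\n    \n'

# str.strip() removes leading and trailing whitespace (spaces, tabs, newlines).
clean_prompt = raw_prompt.strip()

print(repr(raw_prompt))
print(repr(clean_prompt))
```

Trailing whitespace can change how the end of the prompt is tokenized, which is a plausible reason the generations shift, though the thread does not confirm the exact mechanism.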

neubig commented on September 13, 2024

@zorazrw: is this fixed?

zorazrw commented on September 13, 2024

Yes, we are able to get similar results using the current code that includes whitespace cleaning.

Overall Pass@K Scores:
[pass@1] 0.4160 (439)
[pass@2] 0.4701 (439)
[pass@3] 0.4945 (439)
[pass@4] 0.5085 (439)
[pass@5] 0.5177 (439)
[pass@6] 0.5241 (439)
[pass@7] 0.5291 (439)
[pass@8] 0.5330 (439)
[pass@9] 0.5361 (439)
[pass@10] 0.5385 (439)

Considering the randomness of sampling, this should be close enough to the results in the first comment.

The results we report in the paper have slightly larger variance for smaller values of k, because by default we sampled 10 predictions instead of 50.
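To illustrate the point about sample count, here is a small sketch using the commonly used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n is the number of sampled predictions and c the number that pass. Whether the evaluation script in this repository computes pass@k exactly this way is an assumption, and `simulate_spread` with its per-sample pass probability `p` is a hypothetical helper for illustration only.

```python
import random
from math import comb
from statistics import pstdev


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n samples of which c passed (requires k <= n)."""
    if k > n:
        raise ValueError("k must not exceed n")
    if n - c < k:
        # Fewer than k failing samples: every size-k subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def simulate_spread(n: int, k: int, p: float = 0.3,
                    trials: int = 2000, seed: int = 0) -> float:
    """Standard deviation of the pass@k estimate over repeated simulated runs.

    Each simulated sample passes independently with probability p
    (a made-up value, not taken from the thread).
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        c = sum(rng.random() < p for _ in range(n))
        estimates.append(pass_at_k(n, c, k))
    return pstdev(estimates)


spread_10 = simulate_spread(n=10, k=1)
spread_50 = simulate_spread(n=50, k=1)
print(f"std of pass@1 estimate with n=10: {spread_10:.3f}")
print(f"std of pass@1 estimate with n=50: {spread_50:.3f}")
```

In this simulation the estimate from 10 samples fluctuates more from run to run than the one from 50 samples, which is consistent with the larger variance mentioned above.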
