
Comments (8)

carlosejimenez commented on August 16, 2024

Hi @Hk669
Just to be clear, I noticed that you used a "test-repo" in your examples. I'm not sure if that's just a placeholder, but generally the evaluation process will only work with examples already in the SWE-bench dataset. This is because we have the correct tests and behavior logged for those instances only.

However, the other command you mentioned does seem to point to a real issue:

./run_eval.sh ../trajectories/hrushi66/azure-gpt-3.5-turbo-1106__SWE-bench_Lite__default__t-0.00__p-0.95__c-3.00__install-1/all_preds.jsonl

Can you confirm which version of swebench you're using, and make sure you're on the latest version of the repository/package?
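For example, something like this should show what is installed and pull the latest release (assuming you installed swebench via pip; adjust accordingly if you are working from a source checkout):

pip show swebench
pip install --upgrade swebench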


klieret commented on August 16, 2024

Thanks for the report and the detailed information you've already provided. Could you also paste the command you ran for the evaluation?


klieret commented on August 16, 2024

(my suspicion is that you might be specifying the wrong input file, just because I know that this happened to me before...)


Hk669 commented on August 16, 2024

> Thanks for the report and the detailed information you've already provided. Could you also paste the command you ran for the evaluation?

The evaluation command I ran:

./run_eval.sh ../trajectories/hrushi669/azure-gpt-3.5-turbo-1106__SWE-agent__test-repo__default__t-0.00__p-0.95__c-3.00__install-1/all_preds.jsonl


Hk669 commented on August 16, 2024

> (my suspicion is that you might be specifying the wrong input file, just because I know that this happened to me before...)

I gave it the correct path to the input file (all_preds.jsonl), as shown in the evaluation command above. Can you please help me figure this out?


Hk669 commented on August 16, 2024

There is also this issue, even though Miniconda is already installed on my PC:

(venv) (base) hrushi669@Hrushikesh:/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation$ ./run_eval.sh ../trajectories/hrushi66/azure-gpt-3.5-turbo-1106__SWE-bench_Lite__default__t-0.00__p-0.95__c-3.00__install-1/all_preds.jsonl       
Found 8 total predictions, will evaluate 3 (5 are empty)
🏃 Beginning evaluation...
2024-06-01 17:03:59,954 - run_evaluation - INFO - Found 3 predictions across 1 model(s) in predictions file
2024-06-01 17:03:59,963 - run_evaluation - INFO - [azure-gpt-3.5-turbo-1106__SWE-bench_Lite__default__t-0.00__p-0.95__c-3.00__install-1/django__django/4.1] # of predictions to evaluate: 1 (0 already evaluated)
2024-06-01 17:03:59,978 - run_evaluation - INFO - [azure-gpt-3.5-turbo-1106__SWE-bench_Lite__default__t-0.00__p-0.95__c-3.00__install-1/django__django/3.0] # of predictions to evaluate: 1 (0 already evaluated)
2024-06-01 17:03:59,994 - run_evaluation - INFO - [azure-gpt-3.5-turbo-1106__SWE-bench_Lite__default__t-0.00__p-0.95__c-3.00__install-1/sphinx-doc__sphinx/3.5] # of predictions to evaluate: 1 (0 already evaluated)
2024-06-01 17:04:00,075 - testbed - INFO - Created log file /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/results/azure-gpt-3.5-turbo-1106__SWE-bench_Lite__default__t-0.00__p-0.95__c-3.00__install-1/testbed_django_3.0.log       
2024-06-01 17:04:00,075 - testbed - INFO - Created log file /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/results/azure-gpt-3.5-turbo-1106__SWE-bench_Lite__default__t-0.00__p-0.95__c-3.00__install-1/testbed_django_4.1.log       
2024-06-01 17:04:00,076 - testbed - INFO - Created log file /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/results/azure-gpt-3.5-turbo-1106__SWE-bench_Lite__default__t-0.00__p-0.95__c-3.00__install-1/testbed_sphinx_3.5.log       
2024-06-01 17:04:00,079 - testbed - INFO - Repo django/django: 1 versions
2024-06-01 17:04:00,079 - testbed - INFO - Repo django/django: 1 versions
2024-06-01 17:04:00,080 - testbed - INFO - Repo sphinx-doc/sphinx: 1 versions
2024-06-01 17:04:00,082 - testbed - INFO -      Version 4.1: 1 instances
2024-06-01 17:04:00,082 - testbed - INFO -      Version 3.0: 1 instances
2024-06-01 17:04:00,084 - testbed - INFO -      Version 3.5: 1 instances
2024-06-01 17:04:00,092 - testbed - INFO - Using conda path /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7
2024-06-01 17:04:00,094 - testbed - INFO - Using conda path /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/sphinx/3.5/tmp0ges26wf
2024-06-01 17:04:00,095 - testbed - INFO - Using conda path /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/4.1/tmpu2up60g0
2024-06-01 17:04:00,106 - testbed - INFO - Using working directory /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/4.1/tmpc4yi_9nb for testbed
2024-06-01 17:04:00,107 - testbed - INFO - Using working directory /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/sphinx/3.5/tmp0eb_sfio for testbed
2024-06-01 17:04:00,108 - testbed - INFO - Using working directory /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmpx0a2ox00 for testbed
2024-06-01 17:04:00,118 - testbed - INFO - No conda path provided, creating temporary install in /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3...
2024-06-01 17:04:00,119 - testbed - INFO - No conda path provided, creating temporary install in /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/sphinx/3.5/tmp0ges26wf/miniconda3...
2024-06-01 17:04:00,119 - testbed - INFO - No conda path provided, creating temporary install in /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/4.1/tmpu2up60g0/miniconda3...
2024-06-01 17:04:00,124 - testbed - INFO - django/3.0 instances in a single process
2024-06-01 17:04:00,125 - testbed - INFO - sphinx/3.5 instances in a single process
2024-06-01 17:04:00,125 - testbed - INFO - django/4.1 instances in a single process
2024-06-01 17:04:00,128 - testbed - INFO - django/3.0 using Miniconda link: https://repo.anaconda.com/miniconda/Miniconda3-py39_23.10.0-1
2024-06-01 17:04:00,128 - testbed - INFO - sphinx/3.5 using Miniconda link: https://repo.anaconda.com/miniconda/Miniconda3-py39_23.10.0-1
2024-06-01 17:04:00,129 - testbed - INFO - django/4.1 using Miniconda link: https://repo.anaconda.com/miniconda/Miniconda3-py39_23.10.0-1

2024-06-01 17:10:29,623 - testbed - ERROR - Error: Command '['bash', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/4.1/tmpu2up60g0/miniconda3/miniconda.sh', '-b', '-u', '-p', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/4.1/tmpu2up60g0/miniconda3', '&&', 'conda', 'init', '--all']' returned non-zero exit status 1.
2024-06-01 17:10:29,627 - testbed - ERROR - Error: Command '['bash', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/sphinx/3.5/tmp0ges26wf/miniconda3/miniconda.sh', '-b', '-u', '-p', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/sphinx/3.5/tmp0ges26wf/miniconda3', '&&', 'conda', 'init', '--all']' returned non-zero exit status 1.
2024-06-01 17:10:29,634 - testbed - ERROR - Error stdout: PREFIX=/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/sphinx/3.5/tmp0ges26wf/miniconda3
Unpacking payload ...

  0%|          | 0/69 [00:00<?, ?it/s]
Extracting : archspec-0.2.1-pyhd3eb1b0_0.conda:   0%|          | 0/69 [01:01<?, ?it/s]
Extracting : archspec-0.2.1-pyhd3eb1b0_0.conda:   1%|▏         | 1/69 [01:01<1:09:58, 61.74s/it]
Extracting : boltons-23.0.0-py39h06a4308_0.conda:   1%|▏         | 1/69 [02:13<1:09:58, 61.74s/it]
Extracting : boltons-23.0.0-py39h06a4308_0.conda:   3%|▎         | 2/69 [02:13<1:15:25, 67.54s/it]
Extracting : brotli-python-1.0.9-py39h6a678d5_7.conda:   3%|▎         | 2/69 [02:13<1:15:25, 67.54s/it]
Extracting : bzip2-1.0.8-h7b6447c_0.conda:   4%|▍         | 3/69 [02:13<1:14:17, 67.54s/it]
Extracting : c-ares-1.19.1-h5eee18b_0.conda:   6%|▌         | 4/69 [02:13<1:13:10, 67.54s/it]
Extracting : ca-certificates-2023.08.22-h06a4308_0.conda:   7%|▋         | 5/69 [02:13<1:12:02, 67.54s/it]
Extracting : certifi-2023.7.22-py39h06a4308_0.conda:   9%|▊         | 6/69 [02:13<1:10:55, 67.54s/it]
Extracting : cffi-1.15.1-py39h5eee18b_3.conda:  10%|█         | 7/69 [02:13<1:09:47, 67.54s/it]
Extracting : charset-normalizer-2.0.4-pyhd3eb1b0_0.conda:  12%|█▏        | 8/69 [02:13<1:08:39, 67.54s/it]

concurrent.futures.process._RemoteTraceback:
'''
Traceback (most recent call last):
  File "concurrent/futures/process.py", line 387, in wait_result_broken_or_wakeup
  File "multiprocessing/connection.py", line 256, in recv
TypeError: __init__() missing 1 required positional argument: 'msg'
'''

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "entry_point.py", line 69, in <module>
  File "concurrent/futures/process.py", line 562, in _chain_from_iterable_of_lists
  File "concurrent/futures/_base.py", line 609, in result_iterator
  File "concurrent/futures/_base.py", line 446, in result
  File "concurrent/futures/_base.py", line 391, in __get_result
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
[21327] Failed to execute script 'entry_point' due to unhandled exception!

2024-06-01 17:10:29,639 - testbed - ERROR - Error stdout: PREFIX=/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/4.1/tmpu2up60g0/miniconda3
Unpacking payload ...

  0%|          | 0/69 [00:00<?, ?it/s]
Extracting : archspec-0.2.1-pyhd3eb1b0_0.conda:   0%|          | 0/69 [01:09<?, ?it/s]
Extracting : archspec-0.2.1-pyhd3eb1b0_0.conda:   1%|▏         | 1/69 [01:09<1:18:49, 69.56s/it]
Extracting : boltons-23.0.0-py39h06a4308_0.conda:   1%|▏         | 1/69 [02:37<1:18:49, 69.56s/it]
Extracting : boltons-23.0.0-py39h06a4308_0.conda:   3%|▎         | 2/69 [02:37<1:29:34, 80.21s/it]
Extracting : brotli-python-1.0.9-py39h6a678d5_7.conda:   3%|▎         | 2/69 [02:37<1:29:34, 80.21s/it]
Extracting : bzip2-1.0.8-h7b6447c_0.conda:   4%|▍         | 3/69 [02:37<1:28:13, 80.21s/it]
Extracting : c-ares-1.19.1-h5eee18b_0.conda:   6%|▌         | 4/69 [02:37<1:26:53, 80.21s/it]
Extracting : ca-certificates-2023.08.22-h06a4308_0.conda:   7%|▋         | 5/69 [02:37<1:25:33, 80.21s/it]
Extracting : certifi-2023.7.22-py39h06a4308_0.conda:   9%|▊         | 6/69 [02:37<1:24:13, 80.21s/it]
Extracting : cffi-1.15.1-py39h5eee18b_3.conda:  10%|█         | 7/69 [02:37<1:22:53, 80.21s/it]
Extracting : charset-normalizer-2.0.4-pyhd3eb1b0_0.conda:  12%|█▏        | 8/69 [02:37<1:21:32, 80.21s/it]

concurrent.futures.process._RemoteTraceback:
'''
Traceback (most recent call last):
  File "concurrent/futures/process.py", line 387, in wait_result_broken_or_wakeup
  File "multiprocessing/connection.py", line 256, in recv
TypeError: __init__() missing 1 required positional argument: 'msg'
'''

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "entry_point.py", line 69, in <module>
  File "concurrent/futures/process.py", line 562, in _chain_from_iterable_of_lists
  File "concurrent/futures/_base.py", line 609, in result_iterator
  File "concurrent/futures/_base.py", line 446, in result
  File "concurrent/futures/_base.py", line 391, in __get_result
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
[21337] Failed to execute script 'entry_point' due to unhandled exception!

2024-06-01 17:10:29,659 - testbed - ERROR - Error: Command '['bash', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3/miniconda.sh', '-b', '-u', '-p', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3', '&&', 'conda', 'init', '--all']' returned non-zero exit status 1.
2024-06-01 17:10:29,669 - testbed - ERROR - Error stdout: PREFIX=/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3
Unpacking payload ...

  0%|          | 0/69 [00:00<?, ?it/s]
Extracting : archspec-0.2.1-pyhd3eb1b0_0.conda:   0%|          | 0/69 [01:03<?, ?it/s]
Extracting : archspec-0.2.1-pyhd3eb1b0_0.conda:   1%|▏         | 1/69 [01:03<1:11:27, 63.04s/it]
Extracting : boltons-23.0.0-py39h06a4308_0.conda:   1%|▏         | 1/69 [02:14<1:11:27, 63.04s/it]
Extracting : boltons-23.0.0-py39h06a4308_0.conda:   3%|▎         | 2/69 [02:14<1:16:00, 68.07s/it]
Extracting : brotli-python-1.0.9-py39h6a678d5_7.conda:   3%|▎         | 2/69 [02:14<1:16:00, 68.07s/it]
Extracting : bzip2-1.0.8-h7b6447c_0.conda:   4%|▍         | 3/69 [02:14<1:14:52, 68.07s/it]
Extracting : c-ares-1.19.1-h5eee18b_0.conda:   6%|▌         | 4/69 [02:14<1:13:44, 68.07s/it]
Extracting : ca-certificates-2023.08.22-h06a4308_0.conda:   7%|▋         | 5/69 [02:14<1:12:36, 68.07s/it]
Extracting : certifi-2023.7.22-py39h06a4308_0.conda:   9%|▊         | 6/69 [02:14<1:11:28, 68.07s/it]
Extracting : cffi-1.15.1-py39h5eee18b_3.conda:  10%|█         | 7/69 [02:14<1:10:20, 68.07s/it]
Extracting : charset-normalizer-2.0.4-pyhd3eb1b0_0.conda:  12%|█▏        | 8/69 [02:14<1:09:12, 68.07s/it]

concurrent.futures.process._RemoteTraceback:
'''
Traceback (most recent call last):
  File "concurrent/futures/process.py", line 387, in wait_result_broken_or_wakeup
  File "multiprocessing/connection.py", line 256, in recv
TypeError: __init__() missing 1 required positional argument: 'msg'
'''

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "entry_point.py", line 69, in <module>
  File "concurrent/futures/process.py", line 562, in _chain_from_iterable_of_lists
  File "concurrent/futures/_base.py", line 609, in result_iterator
  File "concurrent/futures/_base.py", line 446, in result
  File "concurrent/futures/_base.py", line 391, in __get_result
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
[21328] Failed to execute script 'entry_point' due to unhandled exception!

2024-06-01 17:10:29,725 - testbed - ERROR - Error traceback: Traceback (most recent call last):
  File "/mnt/c/Users/hrush/OneDrive - Student Ambassadors/Desktop/AutoSwe/venv/lib/python3.12/site-packages/swebench/harness/context_manager.py", line 82, in __call__
    output = subprocess.run(cmd, **combined_args)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['bash', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3/miniconda.sh', '-b', '-u', '-p', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3', '&&', 'conda', 'init', '--all']' returned non-zero exit status 1.

2024-06-01 17:10:29,728 - testbed - ERROR - Error traceback: Traceback (most recent call last):
  File "/mnt/c/Users/hrush/OneDrive - Student Ambassadors/Desktop/AutoSwe/venv/lib/python3.12/site-packages/swebench/harness/context_manager.py", line 82, in __call__
    output = subprocess.run(cmd, **combined_args)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['bash', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/4.1/tmpu2up60g0/miniconda3/miniconda.sh', '-b', '-u', '-p', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/4.1/tmpu2up60g0/miniconda3', '&&', 'conda', 'init', '--all']' returned non-zero exit status 1.

2024-06-01 17:10:29,734 - testbed - ERROR - Error traceback: Traceback (most recent call last):
  File "/mnt/c/Users/hrush/OneDrive - Student Ambassadors/Desktop/AutoSwe/venv/lib/python3.12/site-packages/swebench/harness/context_manager.py", line 82, in __call__
    output = subprocess.run(cmd, **combined_args)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['bash', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/sphinx/3.5/tmp0ges26wf/miniconda3/miniconda.sh', '-b', '-u', '-p', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/sphinx/3.5/tmp0ges26wf/miniconda3', '&&', 'conda', 'init', '--all']' returned non-zero exit status 1.

❌ Evaluation failed: Command '['bash', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3/miniconda.sh', '-b', '-u', '-p', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3', '&&', 'conda', 'init', '--all']' returned non-zero exit status 1.
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3.12/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
           ^^^^^^^^^^^^^^^^
  File "/mnt/c/Users/hrush/OneDrive - Student
Ambassadors/Desktop/AutoSwe/venv/lib/python3.12/site-packages/swebench/harness/engine_evaluation.py", line 177, in main
    setup_testbed(data_groups[0])
  File "/mnt/c/Users/hrush/OneDrive - Student
Ambassadors/Desktop/AutoSwe/venv/lib/python3.12/site-packages/swebench/harness/engine_validation.py", line 91, in      
setup_testbed
    with TestbedContextManager(
  File "/mnt/c/Users/hrush/OneDrive - Student
Ambassadors/Desktop/AutoSwe/venv/lib/python3.12/site-packages/swebench/harness/context_manager.py", line 285, in       
__enter__
    self.exec(install_cmd)
  File "/mnt/c/Users/hrush/OneDrive - Student
Ambassadors/Desktop/AutoSwe/venv/lib/python3.12/site-packages/swebench/harness/context_manager.py", line 95, in        
__call__
    raise e
  File "/mnt/c/Users/hrush/OneDrive - Student
Ambassadors/Desktop/AutoSwe/venv/lib/python3.12/site-packages/swebench/harness/context_manager.py", line 82, in        
__call__
    output = subprocess.run(cmd, **combined_args)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['bash',
'/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3/miniconda
.sh', '-b', '-u', '-p',
'/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3', '&&',  
'conda', 'init', '--all']' returned non-zero exit status 1.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/evaluation.py", line 72, in main
    run_evaluation(
  File "/mnt/c/Users/hrush/OneDrive - Student
Ambassadors/Desktop/AutoSwe/venv/lib/python3.12/site-packages/swebench/harness/run_evaluation.py", line 203, in main   
    pool.map(eval_engine, eval_args)
  File "/usr/lib/python3.12/multiprocessing/pool.py", line 367, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/multiprocessing/pool.py", line 774, in get
    raise self._value
subprocess.CalledProcessError: Command '['bash', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3/miniconda.sh', '-b', '-u', '-p', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3', '&&', 'conda', 'init', '--all']' returned non-zero exit status 1.

==================================
Log directory for evaluation run: results/azure-gpt-3.5-turbo-1106__SWE-bench_Lite__default__t-0.00__p-0.95__c-3.00__install-1
- Wrote per-instance scorecards to ../trajectories/hrushi66/azure-gpt-3.5-turbo-1106__SWE-bench_Lite__default__t-0.00__p-0.95__c-3.00__install-1/scorecards.json
Reference Report:
- no_generation: 5
- generated: 3
- with_logs: 0
- install_fail: 0
- reset_failed: 0
- no_apply: 0
- applied: 0
- test_errored: 0
- test_timeout: 0
- resolved: 0
- Wrote summary of run to ../trajectories/hrushi66/azure-gpt-3.5-turbo-1106__SWE-bench_Lite__default__t-0.00__p-0.95__c-3.00__install-1/results.json

This error has been haunting me for three hours now. Can anyone help, @klieret?


klieret commented on August 16, 2024

Regarding your initial report: I believe you need to specify the dataset name or path as the second argument (the default is princeton-nlp/SWE-bench, which is probably not what you need here).
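For example, something along these lines (an untested sketch; the predictions path is just the one from your earlier comment, and princeton-nlp/SWE-bench_Lite is assumed to be the split you actually ran against):

./run_eval.sh ../trajectories/hrushi66/azure-gpt-3.5-turbo-1106__SWE-bench_Lite__default__t-0.00__p-0.95__c-3.00__install-1/all_preds.jsonl princeton-nlp/SWE-bench_Lite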

Let me ping @carlosejimenez and @john-b-yang for this issue.


ivan4722 commented on August 16, 2024

Hi, I am having a similar issue: I do not know where to get the SWE-bench Lite dataset.
Here is what I'm running:

#!/bin/bash
python run_evaluation.py \
    --predictions_path "../../../SWE-agent/trajectories/vscode/gpt4__SWE-bench_Lite__default__t-0.00__p-0.95__c-2.00__install-1/all_preds.jsonl" \
    --swe_bench_tasks "SWE-bench_Lite/data/dev-00000-of-00001.parquet" \
    --log_dir "logs" \
    --testbed "testbed" \
    --skip_existing \
    --timeout 900 \
    --verbose

I also tried converting the data to a .json file with a Python script, but that did not work either. Could you help direct me to the dataset?
I ran SWE-agent with the regular command:

python run.py --model_name gpt4 \
    --per_instance_cost_limit 2.00 \
    --config_file ./config/default.yaml
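Based on the comment above about the default being princeton-nlp/SWE-bench, I am guessing the dataset name can be passed directly instead of a local parquet file, so maybe something like this would work (just a guess on my side, not verified):

python run_evaluation.py \
    --predictions_path "../../../SWE-agent/trajectories/vscode/gpt4__SWE-bench_Lite__default__t-0.00__p-0.95__c-2.00__install-1/all_preds.jsonl" \
    --swe_bench_tasks "princeton-nlp/SWE-bench_Lite" \
    --log_dir "logs" \
    --testbed "testbed" \
    --skip_existing \
    --timeout 900 \
    --verbose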

