Comments (8)
Hi @Hk669
Just to be clear, I noticed that you used a "test-repo" in your examples. I'm not sure if that's just a placeholder, but generally the evaluation process will only work with examples already in the SWE-bench dataset. This is because we have the correct tests and behavior logged for those instances only.
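A quick way to check whether every prediction can actually be evaluated is to compare the instance IDs in the predictions file against the dataset. This is only a sketch: it assumes each line of all_preds.jsonl carries an `instance_id` field (as SWE-agent's output normally does), and the function name is just for illustration.

```python
import json

def unknown_instance_ids(preds_path, dataset_ids):
    """Return predicted instance_ids that are absent from the dataset.

    Any ID returned here cannot be evaluated, because the harness only
    has gold tests recorded for instances in the SWE-bench dataset.
    """
    with open(preds_path) as f:
        pred_ids = {json.loads(line)["instance_id"] for line in f if line.strip()}
    return sorted(pred_ids - set(dataset_ids))
```

Anything this returns (for example an ID from a private test repo) will have no logged tests and will show up as unevaluatable.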
However, the other command you mentioned: ./run_eval.sh ../trajectories/hrushi66/azure-gpt-3.5-turbo-1106__SWE-bench_Lite__default__t-0.00__p-0.95__c-3.00__install-1/all_preds.jsonl
does seem like an issue. Can you confirm the version of swebench that you're using? Can you make sure to use the latest version of the repository/package?
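To report the installed version, a minimal check (this assumes swebench was installed as a pip package into the active environment):

```python
from importlib import metadata

# Prints the installed swebench version, or a notice if the package is missing.
try:
    print("swebench version:", metadata.version("swebench"))
except metadata.PackageNotFoundError:
    print("swebench is not installed in this environment")
```

If the reported version is behind the latest release, upgrading with pip install --upgrade swebench (or reinstalling from the repository) is the easiest fix.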
from swe-agent.
Thanks for the report and the already verbose information. Could you also paste the command you ran for evaluation?
(my suspicion is that you might be specifying the wrong input file, just because I know that this happened to me before...)
> Thanks for the report and the already verbose information. Could you also paste the command you ran for evaluation?

The evaluation command:
./run_eval.sh ../trajectories/hrushi669/azure-gpt-3.5-turbo-1106__SWE-agent__test-repo__default__t-0.00__p-0.95__c-3.00__install-1/all_preds.jsonl
> (my suspicion is that you might be specifying the wrong input file, just because I know that this happened to me before...)

I gave it the correct path to the input file (all_preds.jsonl), as shown in the evaluation command above. Can you please help me out with this?
There is also another issue, even though Miniconda is already installed on my PC:
(venv) (base) hrushi669@Hrushikesh:/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation$ ./run_eval.sh ../trajectories/hrushi66/azure-gpt-3.5-turbo-1106__SWE-bench_Lite__default__t-0.00__p-0.95__c-3.00__install-1/all_preds.jsonl
Found 8 total predictions, will evaluate 3 (5 are empty)
🏃 Beginning evaluation...
2024-06-01 17:03:59,954 - run_evaluation - INFO - Found 3 predictions across 1 model(s) in predictions file
2024-06-01 17:03:59,963 - run_evaluation - INFO - [azure-gpt-3.5-turbo-1106__SWE-bench_Lite__default__t-0.00__p-0.95__c-3.00__install-1/django__django/4.1] # of predictions to evaluate: 1 (0 already evaluated)
2024-06-01 17:03:59,978 - run_evaluation - INFO - [azure-gpt-3.5-turbo-1106__SWE-bench_Lite__default__t-0.00__p-0.95__c-3.00__install-1/django__django/3.0] # of predictions to evaluate: 1 (0 already evaluated)
2024-06-01 17:03:59,994 - run_evaluation - INFO - [azure-gpt-3.5-turbo-1106__SWE-bench_Lite__default__t-0.00__p-0.95__c-3.00__install-1/sphinx-doc__sphinx/3.5] # of predictions to evaluate: 1 (0 already evaluated)
2024-06-01 17:04:00,075 - testbed - INFO - Created log file /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/results/azure-gpt-3.5-turbo-1106__SWE-bench_Lite__default__t-0.00__p-0.95__c-3.00__install-1/testbed_django_3.0.log
2024-06-01 17:04:00,075 - testbed - INFO - Created log file /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/results/azure-gpt-3.5-turbo-1106__SWE-bench_Lite__default__t-0.00__p-0.95__c-3.00__install-1/testbed_django_4.1.log
2024-06-01 17:04:00,076 - testbed - INFO - Created log file /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/results/azure-gpt-3.5-turbo-1106__SWE-bench_Lite__default__t-0.00__p-0.95__c-3.00__install-1/testbed_sphinx_3.5.log
2024-06-01 17:04:00,079 - testbed - INFO - Repo django/django: 1 versions
2024-06-01 17:04:00,079 - testbed - INFO - Repo django/django: 1 versions
2024-06-01 17:04:00,080 - testbed - INFO - Repo sphinx-doc/sphinx: 1 versions
2024-06-01 17:04:00,082 - testbed - INFO - Version 4.1: 1 instances
2024-06-01 17:04:00,082 - testbed - INFO - Version 3.0: 1 instances
2024-06-01 17:04:00,084 - testbed - INFO - Version 3.5: 1 instances
2024-06-01 17:04:00,092 - testbed - INFO - Using conda path /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7
2024-06-01 17:04:00,094 - testbed - INFO - Using conda path /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/sphinx/3.5/tmp0ges26wf
2024-06-01 17:04:00,095 - testbed - INFO - Using conda path /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/4.1/tmpu2up60g0
2024-06-01 17:04:00,106 - testbed - INFO - Using working directory /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/4.1/tmpc4yi_9nb for testbed
2024-06-01 17:04:00,107 - testbed - INFO - Using working directory /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/sphinx/3.5/tmp0eb_sfio for testbed
2024-06-01 17:04:00,108 - testbed - INFO - Using working directory /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmpx0a2ox00 for testbed
2024-06-01 17:04:00,118 - testbed - INFO - No conda path provided, creating temporary install in /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3...
2024-06-01 17:04:00,119 - testbed - INFO - No conda path provided, creating temporary install in /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/sphinx/3.5/tmp0ges26wf/miniconda3...
2024-06-01 17:04:00,119 - testbed - INFO - No conda path provided, creating temporary install in /mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/4.1/tmpu2up60g0/miniconda3...
2024-06-01 17:04:00,124 - testbed - INFO - django/3.0 instances in a single process
2024-06-01 17:04:00,125 - testbed - INFO - sphinx/3.5 instances in a single process
2024-06-01 17:04:00,125 - testbed - INFO - django/4.1 instances in a single process
2024-06-01 17:04:00,128 - testbed - INFO - django/3.0 using Miniconda link: https://repo.anaconda.com/miniconda/Miniconda3-py39_23.10.0-1
2024-06-01 17:04:00,128 - testbed - INFO - sphinx/3.5 using Miniconda link: https://repo.anaconda.com/miniconda/Miniconda3-py39_23.10.0-1
2024-06-01 17:04:00,129 - testbed - INFO - django/4.1 using Miniconda link: https://repo.anaconda.com/miniconda/Miniconda3-py39_23.10.0-1
2024-06-01 17:10:29,623 - testbed - ERROR - Error: Command '['bash', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/4.1/tmpu2up60g0/miniconda3/miniconda.sh', '-b', '-u', '-p', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/4.1/tmpu2up60g0/miniconda3', '&&', 'conda', 'init', '--all']' returned non-zero exit status 1.
2024-06-01 17:10:29,627 - testbed - ERROR - Error: Command '['bash', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/sphinx/3.5/tmp0ges26wf/miniconda3/miniconda.sh', '-b', '-u', '-p', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/sphinx/3.5/tmp0ges26wf/miniconda3', '&&', 'conda', 'init', '--all']' returned non-zero exit status 1.
2024-06-01 17:10:29,634 - testbed - ERROR - Error stdout: PREFIX=/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/sphinx/3.5/tmp0ges26wf/miniconda3
Unpacking payload ...
0%| | 0/69 [00:00<?, ?it/s]
Extracting : archspec-0.2.1-pyhd3eb1b0_0.conda: 0%| | 0/69 [01:01<?, ?it/s]
Extracting : archspec-0.2.1-pyhd3eb1b0_0.conda: 1%|▏ | 1/69 [01:01<1:09:58, 61.74s/it]
Extracting : boltons-23.0.0-py39h06a4308_0.conda: 1%|▏ | 1/69 [02:13<1:09:58, 61.74s/it]
Extracting : boltons-23.0.0-py39h06a4308_0.conda: 3%|▎ | 2/69 [02:13<1:15:25, 67.54s/it]
Extracting : brotli-python-1.0.9-py39h6a678d5_7.conda: 3%|▎ | 2/69 [02:13<1:15:25, 67.54s/it]
Extracting : bzip2-1.0.8-h7b6447c_0.conda: 4%|▍ | 3/69 [02:13<1:14:17, 67.54s/it]
Extracting : c-ares-1.19.1-h5eee18b_0.conda: 6%|▌ | 4/69 [02:13<1:13:10, 67.54s/it]
Extracting : ca-certificates-2023.08.22-h06a4308_0.conda: 7%|▋ | 5/69 [02:13<1:12:02, 67.54s/it]
Extracting : certifi-2023.7.22-py39h06a4308_0.conda: 9%|▊ | 6/69 [02:13<1:10:55, 67.54s/it]
Extracting : cffi-1.15.1-py39h5eee18b_3.conda: 10%|█ | 7/69 [02:13<1:09:47, 67.54s/it]
Extracting : charset-normalizer-2.0.4-pyhd3eb1b0_0.conda: 12%|█▏ | 8/69 [02:13<1:08:39, 67.54s/it]
concurrent.futures.process._RemoteTraceback:
'''
Traceback (most recent call last):
File "concurrent/futures/process.py", line 387, in wait_result_broken_or_wakeup
File "multiprocessing/connection.py", line 256, in recv
TypeError: __init__() missing 1 required positional argument: 'msg'
'''
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "entry_point.py", line 69, in <module>
File "concurrent/futures/process.py", line 562, in _chain_from_iterable_of_lists
File "concurrent/futures/_base.py", line 609, in result_iterator
File "concurrent/futures/_base.py", line 446, in result
File "concurrent/futures/_base.py", line 391, in __get_result
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
[21327] Failed to execute script 'entry_point' due to unhandled exception!
2024-06-01 17:10:29,639 - testbed - ERROR - Error stdout: PREFIX=/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/4.1/tmpu2up60g0/miniconda3
Unpacking payload ...
0%| | 0/69 [00:00<?, ?it/s]
Extracting : archspec-0.2.1-pyhd3eb1b0_0.conda: 0%| | 0/69 [01:09<?, ?it/s]
Extracting : archspec-0.2.1-pyhd3eb1b0_0.conda: 1%|▏ | 1/69 [01:09<1:18:49, 69.56s/it]
Extracting : boltons-23.0.0-py39h06a4308_0.conda: 1%|▏ | 1/69 [02:37<1:18:49, 69.56s/it]
Extracting : boltons-23.0.0-py39h06a4308_0.conda: 3%|▎ | 2/69 [02:37<1:29:34, 80.21s/it]
Extracting : brotli-python-1.0.9-py39h6a678d5_7.conda: 3%|▎ | 2/69 [02:37<1:29:34, 80.21s/it]
Extracting : bzip2-1.0.8-h7b6447c_0.conda: 4%|▍ | 3/69 [02:37<1:28:13, 80.21s/it]
Extracting : c-ares-1.19.1-h5eee18b_0.conda: 6%|▌ | 4/69 [02:37<1:26:53, 80.21s/it]
Extracting : ca-certificates-2023.08.22-h06a4308_0.conda: 7%|▋ | 5/69 [02:37<1:25:33, 80.21s/it]
Extracting : certifi-2023.7.22-py39h06a4308_0.conda: 9%|▊ | 6/69 [02:37<1:24:13, 80.21s/it]
Extracting : cffi-1.15.1-py39h5eee18b_3.conda: 10%|█ | 7/69 [02:37<1:22:53, 80.21s/it]
Extracting : charset-normalizer-2.0.4-pyhd3eb1b0_0.conda: 12%|█▏ | 8/69 [02:37<1:21:32, 80.21s/it]
concurrent.futures.process._RemoteTraceback:
'''
Traceback (most recent call last):
File "concurrent/futures/process.py", line 387, in wait_result_broken_or_wakeup
File "multiprocessing/connection.py", line 256, in recv
TypeError: __init__() missing 1 required positional argument: 'msg'
'''
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "entry_point.py", line 69, in <module>
File "concurrent/futures/process.py", line 562, in _chain_from_iterable_of_lists
File "concurrent/futures/_base.py", line 609, in result_iterator
File "concurrent/futures/_base.py", line 446, in result
File "concurrent/futures/_base.py", line 391, in __get_result
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
[21337] Failed to execute script 'entry_point' due to unhandled exception!
2024-06-01 17:10:29,659 - testbed - ERROR - Error: Command '['bash', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3/miniconda.sh', '-b', '-u', '-p', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3', '&&', 'conda', 'init', '--all']' returned non-zero exit status 1.
2024-06-01 17:10:29,669 - testbed - ERROR - Error stdout: PREFIX=/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3
Unpacking payload ...
0%| | 0/69 [00:00<?, ?it/s]
Extracting : archspec-0.2.1-pyhd3eb1b0_0.conda: 0%| | 0/69 [01:03<?, ?it/s]
Extracting : archspec-0.2.1-pyhd3eb1b0_0.conda: 1%|▏ | 1/69 [01:03<1:11:27, 63.04s/it]
Extracting : boltons-23.0.0-py39h06a4308_0.conda: 1%|▏ | 1/69 [02:14<1:11:27, 63.04s/it]
Extracting : boltons-23.0.0-py39h06a4308_0.conda: 3%|▎ | 2/69 [02:14<1:16:00, 68.07s/it]
Extracting : brotli-python-1.0.9-py39h6a678d5_7.conda: 3%|▎ | 2/69 [02:14<1:16:00, 68.07s/it]
Extracting : bzip2-1.0.8-h7b6447c_0.conda: 4%|▍ | 3/69 [02:14<1:14:52, 68.07s/it]
Extracting : c-ares-1.19.1-h5eee18b_0.conda: 6%|▌ | 4/69 [02:14<1:13:44, 68.07s/it]
Extracting : ca-certificates-2023.08.22-h06a4308_0.conda: 7%|▋ | 5/69 [02:14<1:12:36, 68.07s/it]
Extracting : certifi-2023.7.22-py39h06a4308_0.conda: 9%|▊ | 6/69 [02:14<1:11:28, 68.07s/it]
Extracting : cffi-1.15.1-py39h5eee18b_3.conda: 10%|█ | 7/69 [02:14<1:10:20, 68.07s/it]
Extracting : charset-normalizer-2.0.4-pyhd3eb1b0_0.conda: 12%|█▏ | 8/69 [02:14<1:09:12, 68.07s/it]
concurrent.futures.process._RemoteTraceback:
'''
Traceback (most recent call last):
File "concurrent/futures/process.py", line 387, in wait_result_broken_or_wakeup
File "multiprocessing/connection.py", line 256, in recv
TypeError: __init__() missing 1 required positional argument: 'msg'
'''
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "entry_point.py", line 69, in <module>
File "concurrent/futures/process.py", line 562, in _chain_from_iterable_of_lists
File "concurrent/futures/_base.py", line 609, in result_iterator
File "concurrent/futures/_base.py", line 446, in result
File "concurrent/futures/_base.py", line 391, in __get_result
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
[21328] Failed to execute script 'entry_point' due to unhandled exception!
2024-06-01 17:10:29,725 - testbed - ERROR - Error traceback: Traceback (most recent call last):
File "/mnt/c/Users/hrush/OneDrive - Student Ambassadors/Desktop/AutoSwe/venv/lib/python3.12/site-packages/swebench/harness/context_manager.py", line 82, in __call__
output = subprocess.run(cmd, **combined_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['bash', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3/miniconda.sh', '-b', '-u', '-p', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3', '&&', 'conda', 'init', '--all']' returned non-zero exit status 1.
2024-06-01 17:10:29,728 - testbed - ERROR - Error traceback: Traceback (most recent call last):
File "/mnt/c/Users/hrush/OneDrive - Student Ambassadors/Desktop/AutoSwe/venv/lib/python3.12/site-packages/swebench/harness/context_manager.py", line 82, in __call__
output = subprocess.run(cmd, **combined_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['bash', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/4.1/tmpu2up60g0/miniconda3/miniconda.sh', '-b', '-u', '-p', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/4.1/tmpu2up60g0/miniconda3', '&&', 'conda', 'init', '--all']' returned non-zero exit status 1.
2024-06-01 17:10:29,734 - testbed - ERROR - Error traceback: Traceback (most recent call last):
File "/mnt/c/Users/hrush/OneDrive - Student Ambassadors/Desktop/AutoSwe/venv/lib/python3.12/site-packages/swebench/harness/context_manager.py", line 82, in __call__
output = subprocess.run(cmd, **combined_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['bash', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/sphinx/3.5/tmp0ges26wf/miniconda3/miniconda.sh', '-b', '-u', '-p', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/sphinx/3.5/tmp0ges26wf/miniconda3', '&&', 'conda', 'init', '--all']' returned non-zero exit status 1.
❌ Evaluation failed: Command '['bash', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3/miniconda.sh', '-b', '-u', '-p', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3', '&&', 'conda', 'init', '--all']' returned non-zero exit status 1.
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.12/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/multiprocessing/pool.py", line 48, in mapstar
return list(map(*args))
^^^^^^^^^^^^^^^^
  File "/mnt/c/Users/hrush/OneDrive - Student Ambassadors/Desktop/AutoSwe/venv/lib/python3.12/site-packages/swebench/harness/engine_evaluation.py", line 177, in main
    setup_testbed(data_groups[0])
  File "/mnt/c/Users/hrush/OneDrive - Student Ambassadors/Desktop/AutoSwe/venv/lib/python3.12/site-packages/swebench/harness/engine_validation.py", line 91, in setup_testbed
    with TestbedContextManager(
  File "/mnt/c/Users/hrush/OneDrive - Student Ambassadors/Desktop/AutoSwe/venv/lib/python3.12/site-packages/swebench/harness/context_manager.py", line 285, in __enter__
    self.exec(install_cmd)
  File "/mnt/c/Users/hrush/OneDrive - Student Ambassadors/Desktop/AutoSwe/venv/lib/python3.12/site-packages/swebench/harness/context_manager.py", line 95, in __call__
    raise e
  File "/mnt/c/Users/hrush/OneDrive - Student Ambassadors/Desktop/AutoSwe/venv/lib/python3.12/site-packages/swebench/harness/context_manager.py", line 82, in __call__
    output = subprocess.run(cmd, **combined_args)
subprocess.CalledProcessError: Command '['bash', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3/miniconda.sh', '-b', '-u', '-p', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3', '&&', 'conda', 'init', '--all']' returned non-zero exit status 1.
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/evaluation.py", line 72, in main
run_evaluation(
  File "/mnt/c/Users/hrush/OneDrive - Student Ambassadors/Desktop/AutoSwe/venv/lib/python3.12/site-packages/swebench/harness/run_evaluation.py", line 203, in main
pool.map(eval_engine, eval_args)
File "/usr/lib/python3.12/multiprocessing/pool.py", line 367, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/multiprocessing/pool.py", line 774, in get
raise self._value
subprocess.CalledProcessError: Command '['bash', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3/miniconda.sh', '-b', '-u', '-p', '/mnt/c/Users/hrush/AutoSWE/AutoSwe/SWE-agent/evaluation/testbed/a09d900d21/django/3.0/tmphk0bcuv7/miniconda3', '&&', 'conda', 'init', '--all']' returned non-zero exit status 1.
==================================
Log directory for evaluation run:
results/azure-gpt-3.5-turbo-1106__SWE-bench_Lite__default__t-0.00__p-0.95__c-3.00__install-1
- Wrote per-instance scorecards to ../trajectories/hrushi66/azure-gpt-3.5-turbo-1106__SWE-bench_Lite__default__t-0.00__p-0.95__c-3.00__install-1/scorecards.json
Reference Report:
- no_generation: 5
- generated: 3
- with_logs: 0
- install_fail: 0
- reset_failed: 0
- no_apply: 0
- applied: 0
- test_errored: 0
- test_timeout: 0
- resolved: 0
- Wrote summary of run to ../trajectories/hrushi66/azure-gpt-3.5-turbo-1106__SWE-bench_Lite__default__t-0.00__p-0.95__c-3.00__install-1/results.json
This error has been haunting me for three hours now; can anyone help? @klieret
Regarding your initial report: I believe you need to specify the dataset name or path as the second argument (the default is princeton-nlp/SWE-bench, which is probably not what you need here).
Let me ping @carlosejimenez and @john-b-yang for this issue.
Hi, I am having a similar issue: I do not know where to get the SWE-bench_Lite dataset.
Here is what I'm running:
#!/bin/bash
python run_evaluation.py \
    --predictions_path "../../../SWE-agent/trajectories/vscode/gpt4__SWE-bench_Lite__default__t-0.00__p-0.95__c-2.00__install-1/all_preds.jsonl" \
    --swe_bench_tasks "SWE-bench_Lite/data/dev-00000-of-00001.parquet" \
    --log_dir "logs" \
    --testbed "testbed" \
    --skip_existing \
    --timeout 900 \
    --verbose
I also tried converting the data to a .json file with a Python script, but that did not work either. Could you help direct me to the dataset?
I ran SWE-agent with the regular command.
python run.py --model_name gpt4 \
    --per_instance_cost_limit 2.00 \
    --config_file ./config/default.yaml