
k2's People

Contributors

davendw49, eltociear, richardscottoz


k2's Issues

🆘 I need the geosignal.json dataset for fine-tuning.

The amount of data to translate is too large. Is there a Chinese version of geosignal.json? I am badly in need of the Chinese version of the dataset. It's fine if there isn't one. Sorry to take up an issue. Thanks a lot. 🙏

Issues in evaluation code.

It looks like the provided code "run_eval.py" is not consistent with the provided benchmark dataset. I've encountered a few issues and would like to know how to resolve them:

  1. args.geobenchmark is set to "npee", but there is no such benchmark file. I've read in another issue that you asked to replace it with "geobenchmark_npee.json". However, I am not sure how the code will run when we pass the apstudy.json benchmark, since the code is only written for npee.

  2. In the following image, regarding the code line "for the_answer_is in ['wa', 'woa']": can you please explain what 'wa' and 'woa' are? They are not mentioned anywhere in the code base or in the npee dataset.

  3. In the same image, regarding the code line "source = source_target['source'][question_type][the_answer_is]": if you load the npee.json benchmark file as JSON, then source_target['source'] raises a KeyError, since only six keys are available (['noun', 'choice', 'completion', 'tf', 'qa', 'discussion']), so this key seems to be wrong.

  4. Moreover, even if you say "source_target[question_type][the_answer_is]" is the correct format, "the_answer_is" still raises a KeyError, since only ['question', 'answer'] exist in the "choice" element of the npee file. What's the right format?
    (screenshot from the original issue)

  5. How do you evaluate and test the apstudy.json benchmark, given that the code is not written for it?
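The indexing problem in points 3 and 4 can be reproduced without the real benchmark file. The stand-in dictionary below mirrors the six top-level keys and the 'question'/'answer' fields reported in this issue; all values are placeholders, not actual benchmark data, and the "fixed" indexing at the end is a guess based on that structure, not a confirmed fix.

```python
import json

# Minimal stand-in for geobenchmark_npee.json, using the six top-level
# keys reported in this issue. Values are placeholders, not real data.
source_target = {
    "noun": {"question": [], "answer": []},
    "choice": {"question": ["Which rock is igneous?"], "answer": ["A"]},
    "completion": {"question": [], "answer": []},
    "tf": {"question": [], "answer": []},
    "qa": {"question": [], "answer": []},
    "discussion": {"question": [], "answer": []},
}

question_type = "choice"

# The line from run_eval.py fails: there is no 'source' key at the top level.
try:
    source = source_target["source"][question_type]
except KeyError as e:
    print("KeyError:", e)

# Indexing by question type directly does work with this structure:
questions = source_target[question_type]["question"]
print(questions)
```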

input_ls.json

Your generate example references this file, but there doesn't seem to be an example file in the repository?

Maybe include one with a few sample geoscience questions (held out from training) so people can check whether they get results similar to yours.

'What is the most common igneous rock?' or things like that, perhaps?
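Something along these lines would do. Since the real schema of input_ls.json is not documented, the field names below ("instruction", "input") are assumptions in a common instruction-tuning style, not the project's actual format.

```python
import json

# Hypothetical input_ls.json with a couple of held-out geoscience
# questions. The schema here is a guess, not K2's documented format.
sample = [
    {"instruction": "What is the most common igneous rock?", "input": ""},
    {"instruction": "Briefly describe how basalt forms.", "input": ""},
]

with open("input_ls.json", "w") as f:
    json.dump(sample, f, indent=2)
```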

Due to the dependency conflict in k2.yaml, running the k2 model is impossible. Please update the file.

There is an error when I try to run conda env create -f k2.yaml as shown in the README file. These are the errors that pip showed:

ERROR: Cannot install datasets==2.11.0, evaluate==0.4.0, gradio-client==0.1.3, gradio==3.27.0, huggingface-hub==0.13.3 and transformers==4.32.0 because these package versions have conflicting dependencies.

The conflict is caused by:
    The user requested huggingface-hub==0.13.3
    datasets 2.11.0 depends on huggingface-hub<1.0.0 and >=0.11.0
    evaluate 0.4.0 depends on huggingface-hub>=0.7.0
    gradio 3.27.0 depends on huggingface-hub>=0.13.0
    gradio-client 0.1.3 depends on huggingface-hub>=0.13.0
    transformers 4.32.0 depends on huggingface-hub<1.0 and >=0.15.1

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts
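Reading the resolver output above, the pin that cannot be satisfied is huggingface-hub==0.13.3, since transformers 4.32.0 requires huggingface-hub>=0.15.1. One possible fix is to loosen that single pin in the pip section of k2.yaml; the fragment below is a sketch, and the surrounding contents of the real k2.yaml are assumed, not copied from the repository.

```
# Sketch of the pip section of k2.yaml with the conflicting pin relaxed.
# Only the huggingface-hub line changes: transformers 4.32.0 needs
# >=0.15.1, and datasets/gradio accept anything below 1.0.
  - pip:
      - datasets==2.11.0
      - evaluate==0.4.0
      - gradio==3.27.0
      - gradio-client==0.1.3
      - huggingface-hub>=0.15.1,<1.0
      - transformers==4.32.0
```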


Example website generation parameters?

For your examples, it would be interesting to know what generation parameters you set for each query (and whether they are the same), so results can be compared.

e.g. I get somewhat different results, or outright failures, with the defaults.
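For reference, these are the kinds of knobs in question. The names below are standard Hugging Face transformers generate() arguments, but the values are illustrative guesses, not the settings the k2 demo website actually uses.

```python
# Typical sampling parameters passed to model.generate() in the
# transformers library. Values here are placeholders, not the
# k2 demo's actual configuration.
gen_kwargs = {
    "do_sample": True,
    "temperature": 0.7,        # lower = more deterministic output
    "top_p": 0.9,              # nucleus sampling cutoff
    "top_k": 50,               # restrict to the 50 most likely tokens
    "max_new_tokens": 512,     # cap on generated length
    "repetition_penalty": 1.1, # discourage verbatim loops
}

# Usage (assuming `model` and `input_ids` already exist):
# output = model.generate(input_ids, **gen_kwargs)
```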

Tokenizer not available error

(transformers) ubuntu@:~/data/k2$ python -m apply_delta --base-model-path decapoda-research/llama-7b-hf --target-model-path /home/ubuntu/geollama/ --delta-path daven3/k2_fp_delta
Loading the delta weights from daven3/k2_fp_delta
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/transformers/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/ubuntu/anaconda3/envs/transformers/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/ubuntu/data/k2/apply_delta.py", line 165, in <module>
    apply_delta(args.base_model_path, args.target_model_path, args.delta_path)
  File "/home/ubuntu/data/k2/apply_delta.py", line 127, in apply_delta
    delta_tokenizer = AutoTokenizer.from_pretrained(delta_path, use_fast=False)
  File "/home/ubuntu/anaconda3/envs/transformers/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 720, in from_pretrained
    return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/home/ubuntu/anaconda3/envs/transformers/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1825, in from_pretrained
    raise EnvironmentError(
OSError: Can't load tokenizer for 'daven3/k2_fp_delta'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'daven3/k2_fp_delta' is the correct path to a directory containing all relevant files for a LlamaTokenizerFast tokenizer.

Now, because you have a different sort of model, it perhaps can't find a tokenizer from the path name. Do you actually have this tokenizer available so you could upload it?
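A possible local workaround while the tokenizer files are missing from the delta repo is to fall back to the base model's tokenizer. The helper below is a generic sketch, not part of apply_delta.py; the `load` argument stands in for something like AutoTokenizer.from_pretrained, and whether the base tokenizer is actually compatible with the delta weights is an assumption.

```python
def load_tokenizer_with_fallback(load, delta_path, base_path):
    """Try to load a tokenizer from the delta repo; if that repo has no
    tokenizer files (OSError, as in the traceback above), fall back to
    the base model's tokenizer.

    `load` stands in for a loader such as AutoTokenizer.from_pretrained.
    """
    try:
        return load(delta_path)
    except OSError:
        return load(base_path)


# Demonstration with a stub loader that mimics the failing delta repo:
def fake_load(path):
    if path == "daven3/k2_fp_delta":
        raise OSError("Can't load tokenizer")
    return f"tokenizer-from-{path}"


tok = load_tokenizer_with_fallback(
    fake_load, "daven3/k2_fp_delta", "decapoda-research/llama-7b-hf"
)
print(tok)  # tokenizer-from-decapoda-research/llama-7b-hf
```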
