Comments (2)
Perhaps there is already a way to do this
Indeed there is!
from dalm.models.retriever_only_base_model import AutoModelForSentenceEmbedding

# Load the retriever without a PEFT adapter or bitsandbytes quantization
model = "arcee-ai/bge-code-retriever"
retriever_model = AutoModelForSentenceEmbedding(
    model,
    get_peft=False,
    use_bnb=False,
    is_autoregressive=True,
)
tokenizer = retriever_model.tokenizer

# Example passage to embed (a snippet of source code)
passage = """class RetrieverModel:
    def __init__(self, base_model=DEFAULT_BASE_MODEL):
        self.tokenizer = AutoTokenizer.from_pretrained(base_model)
        self.model = SentenceTransformer(base_model)
        self.base_model = base_model
        self.peft_model = None
        self.on_cuda = False
        self.device = None
        # special code retriever model
        self.is_code_retriever = False
    def to(self, device):
        self.model.to(device)
        self.on_cuda = True
        self.device = device"""

# Tokenize the passage and move the tensors to the model's device
model_inputs = tokenizer(passage, return_tensors="pt")
model_inputs = model_inputs.to(retriever_model.device)

# Compute and print the passage embedding
embeddings = retriever_model(model_inputs["input_ids"], model_inputs["attention_mask"])
print(embeddings)
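Once you have embeddings like the one printed above, a common next step in retrieval is to rank passages against a query by cosine similarity. The following is only a minimal sketch and not part of dalm: `cosine_similarity` and `rank_passages` are hypothetical helper names, and it assumes each embedding has already been converted to a plain list of floats (for a PyTorch tensor, something like `embeddings[0].tolist()` would typically do this).

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (sequences of floats)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return passage indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, p) for p in passage_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

With real embeddings you would pass the query vector and a list of passage vectors to `rank_passages` and take the first few indices as the top matches. For large collections, a batched tensor operation or a vector index is likely a better fit than this pure-Python loop.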