Reading arguments
(0) Example JSON file
{
  "model_name_or_path": "test",
  "task_name": "document_classification",
  "data_dir": "data",
  "output_dir": "checkpoint"
}
(1) Load from a JSON file in the Python console
from ratsnlp.arguments import load_arguments
model_args, data_args, training_args = load_arguments(json_file_path="examples/document_classification.json")
(2) Load by passing the JSON file path as an external argument
from ratsnlp.arguments import load_arguments
model_args, data_args, training_args = load_arguments()
python examples/document_classification.py examples/document_classification.json
(3) Load by injecting the arguments directly from outside
from ratsnlp.arguments import load_arguments
model_args, data_args, training_args = load_arguments()
python examples/document_classification.py --model_name_or_path test2 --task_name doc --data_dir data --output_dir check
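To make the three modes above concrete, here is a minimal sketch of how such a loader could dispatch between them. It is not the ratsnlp implementation: `load_arguments_sketch`, the `Arguments` dataclass, and the `argv` parameter are hypothetical names for illustration, and the sketch returns a single object rather than the `(model_args, data_args, training_args)` tuple shown above.

```python
import argparse
import json
import sys
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class Arguments:
    # Field names mirror the keys of the example JSON file in (0).
    model_name_or_path: str
    task_name: str
    data_dir: str
    output_dir: str


def load_arguments_sketch(json_file_path: Optional[str] = None,
                          argv: Optional[List[str]] = None) -> Arguments:
    """Hypothetical stand-in for load_arguments (an assumption, not the library code)."""
    argv = sys.argv[1:] if argv is None else argv

    # Mode (2): the only command-line argument is a path ending in .json.
    if json_file_path is None and len(argv) == 1 and argv[0].endswith(".json"):
        json_file_path = argv[0]

    # Modes (1) and (2): read every field from the JSON file.
    if json_file_path is not None:
        with open(json_file_path, encoding="utf-8") as f:
            return Arguments(**json.load(f))

    # Mode (3): every field is injected as a --flag.
    parser = argparse.ArgumentParser()
    for field in fields(Arguments):
        parser.add_argument(f"--{field.name}", type=str, required=True)
    return Arguments(**vars(parser.parse_args(argv)))


# Mode (3), with the same flags as the command above passed as a list.
args = load_arguments_sketch(argv=[
    "--model_name_or_path", "test2", "--task_name", "doc",
    "--data_dir", "data", "--output_dir", "check",
])
print(args.output_dir)
```

Taking `argv` as a parameter keeps the sketch testable without touching `sys.argv`; when it is omitted the function falls back to the real command line.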
from nlpbook.
Code
from ratsnlp.nlpbook import *
from ratsnlp.nlpbook.classification import NsmcCorpus, Runner

if __name__ == "__main__":
    # Read the settings from a JSON file; the commented-out call reads them
    # from the command line instead.
    args = load_arguments(json_file_path="examples/document_classification.json")
    # args = load_arguments()
    set_logger(args)
    # Download the downstream corpus and the pretrained model into their cache directories.
    download_downstream_dataset(
        args.downstream_corpus_name,
        cache_dir=args.downstream_corpus_dir,
        force_download=False
    )
    download_pretrained_model(
        args.pretrained_model_name,
        cache_dir=args.pretrained_model_cache_dir,
        force_download=False
    )
    check_exist_checkpoints(args)
    seed_setting(args)
    # Build the tokenizer, the data loaders, and the model (two labels for NSMC).
    tokenizer = get_tokenizer(args)
    corpus = NsmcCorpus()
    train_dataloader, val_dataloader, test_dataloader = get_dataloaders(corpus, tokenizer, args)
    model = get_pretrained_model(args, num_labels=2)
    runner = Runner(model, args)
    checkpoint_callback, trainer = get_trainer(args)
    if args.do_train:
        trainer.fit(
            runner,
            train_dataloader=train_dataloader,
            val_dataloaders=val_dataloader,
        )
    if args.do_predict:
        # Evaluate on the test set using the best checkpoint.
        trainer.test(
            runner,
            test_dataloaders=test_dataloader,
            ckpt_path=checkpoint_callback.best_model_path,
        )
config
{
  "pretrained_model_name": "kobert",
  "pretrained_model_cache_dir": "/Users/david/works/cache/kobert",
  "downstream_corpus_name": "nsmc",
  "downstream_corpus_dir": "/Users/david/works/cache/nsmc",
  "downstream_task_name": "document-classification",
  "downstream_model_dir": "/Users/david/works/cache/checkpoint",
  "do_train": true,
  "do_eval": true,
  "do_predict": false,
  "batch_size": 32
}
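Since the script reads `args.do_train`, `args.do_predict`, and the various directory fields straight from this config, a typo in a key name would only surface later as a missing attribute. One way to catch that early is a typed config object that rejects unknown keys. The sketch below is an assumption, not part of ratsnlp: `ClassificationConfig` and `parse_config` are hypothetical names, and the defaults on the boolean fields and `batch_size` are made up for illustration.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class ClassificationConfig:
    # Keys copied from the config above; the default values are assumptions.
    pretrained_model_name: str
    pretrained_model_cache_dir: str
    downstream_corpus_name: str
    downstream_corpus_dir: str
    downstream_task_name: str
    downstream_model_dir: str
    do_train: bool = False
    do_eval: bool = False
    do_predict: bool = False
    batch_size: int = 32


def parse_config(text: str) -> ClassificationConfig:
    """Parse a JSON config string, rejecting unknown keys and wrong types."""
    raw = json.loads(text)
    known = {f.name for f in fields(ClassificationConfig)}
    unknown = sorted(set(raw) - known)
    if unknown:
        raise ValueError(f"unknown config keys: {unknown}")
    config = ClassificationConfig(**raw)
    for f in fields(config):
        value = getattr(config, f.name)
        # Exact type match, so that true/false is not accepted for batch_size.
        if type(value) is not f.type:
            raise TypeError(f"{f.name} should be {f.type.__name__}, got {type(value).__name__}")
    return config


config = parse_config("""
{
  "pretrained_model_name": "kobert",
  "pretrained_model_cache_dir": "/Users/david/works/cache/kobert",
  "downstream_corpus_name": "nsmc",
  "downstream_corpus_dir": "/Users/david/works/cache/nsmc",
  "downstream_task_name": "document-classification",
  "downstream_model_dir": "/Users/david/works/cache/checkpoint",
  "do_train": true,
  "do_eval": true,
  "do_predict": false,
  "batch_size": 32
}
""")
print(config.do_train, config.do_predict, config.batch_size)
```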
Training locally
The following three approaches are equivalent:
- Train with the arguments defined directly in train_local.py
python train_local.py
- Train with the settings defined in train_config.json (the JSON below)
python train_local.py train_local.json
{
  "pretrained_model_name": "beomi/kcbert-base",
  "downstream_corpus_name": "nsmc",
  "downstream_corpus_root_dir": "data",
  "downstream_task_name": "document-classification",
  "downstream_model_dir": "checkpoint/document-classification",
  "do_train": true,
  "do_eval": true,
  "batch_size": 32
}
- Pass the args to train_local.py directly on the command line
CUDA_VISIBLE_DEVICES=1 python cls_train_local.py --pretrained_model_name beomi/kcbert-base --downstream_corpus_root_dir data --downstream_corpus_name nsmc --downstream_task_name document-classification --downstream_model_dir checkpoint/document-classification2 --batch_size 32
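The JSON and command-line forms carry the same key/value pairs, so one can be derived mechanically from the other. The sketch below turns a flat config into the equivalent `--key value` command. `config_to_cli_args` and `build_command` are hypothetical helpers, and writing booleans as lowercase `true`/`false` is an assumption: the command above simply omits `do_train` and `do_eval`, so how the real script expects them on the command line is not shown here.

```python
import json
import shlex
from typing import Any, Dict, List


def config_to_cli_args(config: Dict[str, Any]) -> List[str]:
    """Turn a flat config dict into a ['--key', 'value', ...] list."""
    cli_args: List[str] = []
    for key, value in config.items():
        cli_args.append(f"--{key}")
        if isinstance(value, bool):
            # Assumption: booleans are spelled the JSON way on the command line.
            cli_args.append("true" if value else "false")
        else:
            cli_args.append(str(value))
    return cli_args


def build_command(script: str, config: Dict[str, Any]) -> str:
    """Render a shell-quoted command line for the given script and config."""
    return shlex.join(["python", script, *config_to_cli_args(config)])


config = json.loads("""
{
  "pretrained_model_name": "beomi/kcbert-base",
  "downstream_corpus_name": "nsmc",
  "downstream_corpus_root_dir": "data",
  "batch_size": 32
}
""")
print(build_command("train_local.py", config))
```

`shlex.join` requires Python 3.8 or later and quotes any value that contains spaces or shell metacharacters.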
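Running inference locally
The following three approaches are equivalent:
- Run inference with the arguments defined directly in deploy_local.py
- Run inference with the settings defined in deploy_config.json
- Pass the args to deploy_local.py directly on the command line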