Hi there! I am trying to test with your pretrained dependency parsin

Test PTB Dependency Parsing Model about ace HOT 9 OPEN

alibaba-nlp commented on September 23, 2024

Test PTB Dependency Parsing Model

from ace.

Comments (9)

wangxinyu0922 commented on September 23, 2024

I have just uploaded the PTB dataset to OneDrive.

For inference, you may make a file like this (add dummy tags in the 7th, 8th, and 9th columns) and follow the instructions:

1\tBut\t_\t_\t_\t_\t_\t0\troot\t0:root
2\tI\t_\t_\t_\t_\t_\t0\troot\t0:root
3\tfound\t_\t_\t_\t_\t_\t0\troot\t0:root
4\tthe\t_\t_\t_\t_\t_\t0\troot\t0:root
5\tlocation\t_\t_\t_\t_\t_\t0\troot\t0:root
6\twonderful\t_\t_\t_\t_\t_\t0\troot\t0:root
7\tand\t_\t_\t_\t_\t_\t0\troot\t0:root
7.1\tfound\t_\t_\t_\t_\t_\t0\troot\t0:root
8\tthe\t_\t_\t_\t_\t_\t0\troot\t0:root
9\tneighbors\t_\t_\t_\t_\t_\t0\troot\t0:root
10\tvery\t_\t_\t_\t_\t_\t0\troot\t0:root
11\tkind\t_\t_\t_\t_\t_\t0\troot\t0:root
12\t.\t_\t_\t_\t_\t_\t0\troot\t0:root

woshiyyya commented on September 23, 2024

Hi Xinyu,

Thanks for uploading the data!

I created a folder named data and put a train.tsv file in it with the demo case you provided.

Run:
CUDA_VISIBLE_DEVICES=0 python train.py --config config/ptb_parsing_model.yaml --parse --target_dir data --keep_order

But I still got an error:

2022-09-07 02:59:16,391 Reading data from /home/yunxuan2/.flair/datasets/ptb_3.3.0_modified
2022-09-07 02:59:16,391 Train: /home/yunxuan2/.flair/datasets/ptb_3.3.0_modified/train_modified.conllu
2022-09-07 02:59:16,391 Test: /home/yunxuan2/.flair/datasets/ptb_3.3.0_modified/test.conllu
2022-09-07 02:59:16,391 Dev: /home/yunxuan2/.flair/datasets/ptb_3.3.0_modified/dev.conllu
Traceback (most recent call last):
  File "train.py", line 85, in <module>
    config = ConfigParser(config,all=args.all,zero_shot=args.zeroshot,other_shot=args.other,predict=args.predict)
  File "/projects/clio1/probing/ACE/flair/config_parser.py", line 63, in __init__
    self.corpus: ListCorpus=self.get_corpus
  File "/projects/clio1/probing/ACE/flair/config_parser.py", line 329, in get_corpus
    current_dataset=getattr(datasets,corpus)(tag_to_bioes=self.target)
  File "/projects/clio1/probing/ACE/flair/datasets.py", line 360, in __init__
    train = UniversalDependenciesDataset(data_folder/'train_modified.conllu', in_memory=in_memory, add_root=True)
  File "/projects/clio1/probing/ACE/flair/datasets.py", line 1006, in __init__
    assert path_to_conll_file.exists()
AssertionError

Do you know how to fix that?

wangxinyu0922 commented on September 23, 2024

Have you checked whether the dataset is in the correct place?

lizhou21 commented on September 23, 2024

Hi Xinyu,
Is there something wrong with the data format provided?
I just found that the code token = Token(fields[1], head_id=int(fields[6])) raises ValueError: invalid literal for int() with base 10: '_'.

So I guess the 0-th column is the token id,
the 1st column is the token,
the 2nd to 5th columns are "_",
the 6th column is 0 (dummy tag),
the 7th column is "_",
the 8th column is "root" (dummy tag),
the 9th column is "0:root" (dummy tag).

Is that right?
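The column layout guessed above can be sketched in Python as follows. This is only an illustration of that guess, which the maintainer does not confirm in this thread; `dummy_conllu_rows` is a hypothetical helper and is not part of ACE. The one hard requirement visible in the error message is that field 6 must parse as an integer.

```python
from typing import List


def dummy_conllu_rows(tokens: List[str]) -> List[str]:
    """Build tab-separated 10-column rows with placeholder head/relation values."""
    rows = []
    for idx, token in enumerate(tokens, start=1):
        fields = [
            str(idx),            # column 0: token id
            token,               # column 1: token
            "_", "_", "_", "_",  # columns 2-5: unused placeholders
            "0",                 # column 6: dummy head, read with int(fields[6])
            "_",                 # column 7: unused placeholder
            "root",              # column 8: dummy tag
            "0:root",            # column 9: dummy tag
        ]
        rows.append("\t".join(fields))
    return rows


rows = dummy_conllu_rows(["But", "I", "found", "the", "location", "wonderful", "."])
for row in rows:
    # Mirror the check that failed in the thread: field 6 must be an integer.
    int(row.split("\t")[6])
print("\n".join(rows))
```

Whether sentences need a trailing blank line or other markers depends on the reader in flair/datasets.py, so that part is left out here.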

lizhou21 commented on September 23, 2024

After changing the data format, I also ran into the same AssertionError.
Have you resolved it?

wangxinyu0922 commented on September 23, 2024

Have you ensured that the path /home/yunxuan2/.flair/datasets/ptb_3.3.0_modified/train_modified.conllu exists? If not, you may download the data above and put it at this path.
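A quick way to run this check before launching train.py is sketched below. The three file names come from the log output earlier in this thread; `missing_files` is a hypothetical helper, and using the current user's home directory for the `.flair` folder is an assumption, so adjust the path to your own setup.

```python
from pathlib import Path
from typing import List, Sequence

# File names as printed in the "Train/Test/Dev" log lines of this thread.
EXPECTED_FILES = ("train_modified.conllu", "dev.conllu", "test.conllu")


def missing_files(data_folder: Path, names: Sequence[str] = EXPECTED_FILES) -> List[str]:
    """Return the expected file names that are not present in data_folder."""
    return [name for name in names if not (data_folder / name).exists()]


# Assumed location: the flair dataset cache under the current user's home directory.
folder = Path.home() / ".flair" / "datasets" / "ptb_3.3.0_modified"
missing = missing_files(folder)
if missing:
    print(f"Missing in {folder}: {', '.join(missing)}")
else:
    print(f"All expected files found in {folder}")
```

Any file reported as missing would trigger the same `assert path_to_conll_file.exists()` failure shown in the traceback.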

lizhou21 commented on September 23, 2024

Yes, I have done that, and I solved this problem: the target_dir also needs to contain dev/test datasets.
Now I can parse the dataset on CPU (very slow), but it fails when I run it on GPU.

It shows me:

Traceback (most recent call last):
  File "train.py", line 378, in <module>
    train_eval_result, train_loss = student.evaluate(loader,out_path=Path('outputs/train.'+'.'+tar_file_name+'.conllu'),embeddings_storage_mode="none",prediction_mode=True)
  File "/DM_parser/ACE/flair/models/dependency_model.py", line 1174, in evaluate
    arc_scores, rel_scores = self.forward(batch, prediction_mode=prediction_mode)
  File "/DM_parser/ACE/flair/models/dependency_model.py", line 597, in forward
    self.embeddings.embed(sentences,embedding_mask=self.selection)
  File "/DM_parser/ACE/flair/embeddings.py", line 185, in embed
    embedding.embed(sentences)
  File "/DM_parser/ACE/flair/embeddings.py", line 97, in embed
    self._add_embeddings_internal(sentences)
  File "/DM_parser/ACE/flair/embeddings.py", line 2960, in _add_embeddings_internal
    self._add_embeddings_to_sentences(sentences)
  File "/DM_parser/ACE/flair/embeddings.py", line 3155, in _add_embeddings_to_sentences
    sequence_output, pooled_output, hidden_states = self.model(input_ids, attention_mask=mask, inputs_embeds = inputs_embeds)
  File "/home/anaconda3/envs/ACE_parser/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/anaconda3/envs/ACE_parser/lib/python3.6/site-packages/transformers/modeling_bert.py", line 753, in forward
    input_ids=input_ids, position_ids=position_ids, token_type_ids=token_type_ids, inputs_embeds=inputs_embeds
  File "/home/anaconda3/envs/ACE_parser/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/anaconda3/envs/ACE_parser/lib/python3.6/site-packages/transformers/modeling_roberta.py", line 68, in forward
    input_ids, token_type_ids=token_type_ids, position_ids=position_ids, inputs_embeds=inputs_embeds
  File "/home/anaconda3/envs/ACE_parser/lib/python3.6/site-packages/transformers/modeling_bert.py", line 178, in forward
    inputs_embeds = self.word_embeddings(input_ids)
  File "/home/anaconda3/envs/ACE_parser/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/anaconda3/envs/ACE_parser/lib/python3.6/site-packages/torch/nn/modules/sparse.py", line 114, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/home/anaconda3/envs/ACE_parser/lib/python3.6/site-packages/torch/nn/functional.py", line 1484, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected object of device type cuda but got device type cpu for argument #1 'self' in call to _th_index_select

I tried changing
sequence_output, pooled_output, hidden_states = self.model(input_ids, attention_mask=mask, inputs_embeds = inputs_embeds)

to

sequence_output, pooled_output, hidden_states = self.model(input_ids.cuda(), attention_mask=mask.cuda(), inputs_embeds = inputs_embeds)

but it still shows the same error.

T T,

wangxinyu0922 commented on September 23, 2024

You may try to uncomment these lines

ACE/train.py

Lines 226 to 238 in 7033e91

    
    # if student.selection[idx] == 1:
    # 	embedding.to(flair.device)
    # 	if 'elmo' in embedding.name:
    # 		# embedding.reset_elmo()
    # 		# continue
    # 		# pdb.set_trace()
    # 		embedding.ee.elmo_bilm.cuda(device=embedding.ee.cuda_device)
    # 		states=[x.to(flair.device) for x in embedding.ee.elmo_bilm._elmo_lstm._states]
    # 		embedding.ee.elmo_bilm._elmo_lstm._states = states
    # 		for idx in range(len(embedding.ee.elmo_bilm._elmo_lstm._states)):
    # 			embedding.ee.elmo_bilm._elmo_lstm._states[idx]=embedding.ee.elmo_bilm._elmo_lstm._states[idx].to(flair.device)
    # else:
    embedding.to('cpu')
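For readers hitting the same RuntimeError, the snippet below is a minimal plain-PyTorch sketch of the general issue, not ACE code. In the traceback, argument #1 'self' of the index select appears to be the embedding weight, which would explain why calling .cuda() on the inputs alone did not help: the module's parameters have to be on the same device as the inputs. `module_device` is a hypothetical helper, and the sketch falls back to CPU when no GPU is available.

```python
import torch
from torch import nn


def module_device(module: nn.Module) -> torch.device:
    """Return the device that holds the module's first parameter."""
    return next(module.parameters()).device


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

embedding = nn.Embedding(num_embeddings=10, embedding_dim=4)  # parameters start on CPU
input_ids = torch.tensor([[1, 2, 3]], device=device)

# Moving only input_ids to the GPU leaves the weights on CPU, so the lookup
# would raise a device-mismatch error on a GPU machine. Moving the module
# itself (nn.Module.to works in place) puts the weights next to the inputs.
embedding.to(device)
output = embedding(input_ids)

print(module_device(embedding), output.device)
```

The unconditional embedding.to('cpu') in the excerpt above would have the opposite effect, which is consistent with the suggestion to restore the commented-out branch that calls embedding.to(flair.device).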

lizhou21 commented on September 23, 2024

Hi Xinyu, I have resolved the problem and successfully applied ACE to parse my data. Thanks for your help.
