Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex.
02/21/2019 12:11:44 - INFO - __main__ - device: cpu n_gpu: 1, distributed training: False, 16-bits training: False
02/21/2019 12:11:45 - INFO - pytorch_pretrained_bert.tokenization - loading vocabulary file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /home/tong.guo/.pytorch_pretrained_bert/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
02/21/2019 12:11:45 - INFO - pytorch_pretrained_bert.modeling - loading archive file ../model_file/bert-base-uncased.tar.gz
02/21/2019 12:11:45 - INFO - pytorch_pretrained_bert.modeling - extracting archive file ../model_file/bert-base-uncased.tar.gz to temp dir /tmp/tmpaho9_3dk
02/21/2019 12:11:50 - INFO - pytorch_pretrained_bert.modeling - Model config {
"attention_probs_dropout_prob": 0.1,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"max_position_embeddings": 512,
"num_attention_heads": 12,
"num_hidden_layers": 12,
"type_vocab_size": 2,
"vocab_size": 30522
}
02/21/2019 12:11:55 - INFO - pytorch_pretrained_bert.modeling - Weights of BertForSequenceClassification not initialized from pretrained model: ['classifier.weight', 'classifier.bias']
02/21/2019 12:11:55 - INFO - pytorch_pretrained_bert.modeling - Weights from pretrained model not used in BertForSequenceClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
02/21/2019 12:11:55 - INFO - pytorch_pretrained_bert.modeling - loading archive file ../model_file/bert-base-uncased.tar.gz
02/21/2019 12:11:55 - INFO - pytorch_pretrained_bert.modeling - extracting archive file ../model_file/bert-base-uncased.tar.gz to temp dir /tmp/tmpfehb71wu
02/21/2019 12:11:59 - INFO - pytorch_pretrained_bert.modeling - Model config {
"attention_probs_dropout_prob": 0.1,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"max_position_embeddings": 512,
"num_attention_heads": 12,
"num_hidden_layers": 12,
"type_vocab_size": 2,
"vocab_size": 30522
}
02/21/2019 12:12:03 - INFO - pytorch_pretrained_bert.modeling - Weights of BertForSequenceClassification not initialized from pretrained model: ['classifier.weight', 'classifier.bias']
02/21/2019 12:12:03 - INFO - pytorch_pretrained_bert.modeling - Weights from pretrained model not used in BertForSequenceClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
02/21/2019 12:12:03 - INFO - __main__ - *** Example ***
02/21/2019 12:12:03 - INFO - __main__ - guid: dev-1
02/21/2019 12:12:03 - INFO - __main__ - tokens: [CLS] [UNK] ' s chief operating officer , [UNK] [UNK] , and [UNK] [UNK] , the chief financial officer , will report directly to [UNK] [UNK] . [SEP] [UNK] [UNK] [UNK] [UNK] [UNK] [UNK] and [UNK] [UNK] [UNK] [UNK] [UNK] [UNK] will report to [UNK] . [SEP]
02/21/2019 12:12:03 - INFO - __main__ - input_ids: 101 100 1005 1055 2708 4082 2961 1010 100 100 1010 1998 100 100 1010 1996 2708 3361 2961 1010 2097 3189 3495 2000 100 100 1012 102 100 100 100 100 100 100 1998 100 100 100 100 100 100 2097 3189 2000 100 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
02/21/2019 12:12:03 - INFO - __main__ - input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
02/21/2019 12:12:03 - INFO - __main__ - segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
02/21/2019 12:12:03 - INFO - __main__ - label: 1 (id = 1)
02/21/2019 12:12:03 - INFO - __main__ - *** Example ***
02/21/2019 12:12:03 - INFO - __main__ - guid: dev-2
02/21/2019 12:12:03 - INFO - __main__ - tokens: [CLS] [UNK] world ' s two largest auto ##makers said their [UNK] . [UNK] . sales declined more than predicted last month as a late summer sales frenzy caused more of an industry backlash than expected . [SEP] [UNK] sales at both [UNK] and [UNK] . 2 [UNK] [UNK] [UNK] . declined more than predicted as a late summer sales frenzy prompted a larger - than - expected industry backlash . [SEP]
02/21/2019 12:12:03 - INFO - __main__ - input_ids: 101 100 2088 1005 1055 2048 2922 8285 12088 2056 2037 100 1012 100 1012 4341 6430 2062 2084 10173 2197 3204 2004 1037 2397 2621 4341 21517 3303 2062 1997 2019 3068 25748 2084 3517 1012 102 100 4341 2012 2119 100 1998 100 1012 1016 100 100 100 1012 6430 2062 2084 10173 2004 1037 2397 2621 4341 21517 9469 1037 3469 1011 2084 1011 3517 3068 25748 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
02/21/2019 12:12:03 - INFO - __main__ - input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
02/21/2019 12:12:03 - INFO - __main__ - segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
02/21/2019 12:12:03 - INFO - __main__ - label: 1 (id = 1)
02/21/2019 12:12:03 - INFO - __main__ - *** Example ***
02/21/2019 12:12:03 - INFO - __main__ - guid: dev-3
02/21/2019 12:12:03 - INFO - __main__ - tokens: [CLS] [UNK] to the federal [UNK] for [UNK] [UNK] and [UNK] ( news - web sites ) , there were 19 reported cases of me ##as ##les in the [UNK] [UNK] in 2002 . [SEP] [UNK] [UNK] for [UNK] [UNK] and [UNK] said there were 19 reported cases of me ##as ##les in the [UNK] [UNK] in 2002 . [SEP]
02/21/2019 12:12:03 - INFO - __main__ - input_ids: 101 100 2000 1996 2976 100 2005 100 100 1998 100 1006 2739 1011 4773 4573 1007 1010 2045 2020 2539 2988 3572 1997 2033 3022 4244 1999 1996 100 100 1999 2526 1012 102 100 100 2005 100 100 1998 100 2056 2045 2020 2539 2988 3572 1997 2033 3022 4244 1999 1996 100 100 1999 2526 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
02/21/2019 12:12:03 - INFO - __main__ - input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
02/21/2019 12:12:03 - INFO - __main__ - segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
02/21/2019 12:12:03 - INFO - __main__ - label: 1 (id = 1)
02/21/2019 12:12:03 - INFO - __main__ - *** Example ***
02/21/2019 12:12:03 - INFO - __main__ - guid: dev-4
02/21/2019 12:12:03 - INFO - __main__ - tokens: [CLS] [UNK] tropical storm rapidly developed in the [UNK] of [UNK] [UNK] and was expected to hit somewhere along the [UNK] or [UNK] coasts by [UNK] night . [SEP] [UNK] tropical storm rapidly developed in the [UNK] of [UNK] on [UNK] and could have hurricane - force winds when it hits land somewhere along the [UNK] coast [UNK] night . [SEP]
02/21/2019 12:12:03 - INFO - __main__ - input_ids: 101 100 5133 4040 5901 2764 1999 1996 100 1997 100 100 1998 2001 3517 2000 2718 4873 2247 1996 100 2030 100 20266 2011 100 2305 1012 102 100 5133 4040 5901 2764 1999 1996 100 1997 100 2006 100 1998 2071 2031 7064 1011 2486 7266 2043 2009 4978 2455 4873 2247 1996 100 3023 100 2305 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
02/21/2019 12:12:03 - INFO - __main__ - input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
02/21/2019 12:12:03 - INFO - __main__ - segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
02/21/2019 12:12:03 - INFO - __main__ - label: 0 (id = 0)
02/21/2019 12:12:03 - INFO - __main__ - *** Example ***
02/21/2019 12:12:03 - INFO - __main__ - guid: dev-5
02/21/2019 12:12:03 - INFO - __main__ - tokens: [CLS] [UNK] company didn ' t detail the costs of the replacement and repairs . [SEP] [UNK] company officials expect the costs of the replacement work to run into the millions of dollars . [SEP]
02/21/2019 12:12:03 - INFO - __main__ - input_ids: 101 100 2194 2134 1005 1056 6987 1996 5366 1997 1996 6110 1998 10315 1012 102 100 2194 4584 5987 1996 5366 1997 1996 6110 2147 2000 2448 2046 1996 8817 1997 6363 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
02/21/2019 12:12:03 - INFO - __main__ - input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
02/21/2019 12:12:03 - INFO - __main__ - segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
02/21/2019 12:12:03 - INFO - __main__ - label: 0 (id = 0)
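The four arrays logged for each example follow BERT's standard sentence-pair packing: `[CLS] A [SEP] B [SEP]`, mapped to ids through the vocabulary (out-of-vocabulary tokens become `[UNK]`, id 100), with a mask marking real tokens, segment ids separating the two sentences, and zero-padding out to the maximum sequence length (128 here, judging by the array lengths). A minimal sketch of that packing, using a hypothetical `build_features` helper rather than the script's actual code:

```python
def build_features(tokens_a, tokens_b, vocab, max_seq_length=128):
    """Pack a pre-tokenized sentence pair into BERT input features.

    tokens_a / tokens_b are WordPiece token lists; vocab maps
    token -> id, with unknown tokens falling back to [UNK].
    """
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    # Segment 0 covers [CLS] + sentence A + its [SEP];
    # segment 1 covers sentence B + the final [SEP].
    segment_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    input_ids = [vocab.get(t, vocab["[UNK]"]) for t in tokens]
    input_mask = [1] * len(input_ids)  # 1 = real token, 0 = padding
    pad = [0] * (max_seq_length - len(input_ids))
    return input_ids + pad, input_mask + pad, segment_ids + pad
```

Checking against the log: in dev-1 the 28 leading zeros in `segment_ids` are `[CLS]` plus the 26 tokens of sentence A plus one `[SEP]`, and the `input_mask` flips to 0 exactly where the padding zeros begin in `input_ids`.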
02/21/2019 12:12:04 - INFO - __main__ - ***** Running evaluation *****
02/21/2019 12:12:04 - INFO - __main__ - Num examples = 1725
02/21/2019 12:12:04 - INFO - __main__ - Batch size = 8
Evaluating: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 216/216 [06:06<00:00, 1.70s/it]
02/21/2019 12:18:11 - INFO - __main__ - ***** Eval results *****
02/21/2019 12:18:11 - INFO - __main__ - eval_accuracy = 0.33507246376811595
02/21/2019 12:18:11 - INFO - __main__ - eval_loss = 1.002936492777533
02/21/2019 12:18:11 - INFO - __main__ - global_step = 0
02/21/2019 12:18:11 - INFO - __main__ - loss = None
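The reported `eval_accuracy` is plain fraction-correct over the dev set; with 1725 examples, 0.33507246376811595 works out to exactly 578 correct predictions. A quick arithmetic check:

```python
# eval_accuracy = (number of correct predictions) / (number of examples)
num_examples = 1725
reported_accuracy = 0.33507246376811595
correct = round(reported_accuracy * num_examples)  # 578 correct predictions
recomputed = correct / num_examples
```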
Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex.
02/20/2019 19:49:30 - INFO - __main__ - device: cuda
02/20/2019 19:49:30 - INFO - pytorch_pretrained_bert.tokenization_transfo_xl - loading vocabulary file ../model_file/transfo-xl-wt103-vocab.bin
02/20/2019 19:49:30 - INFO - pytorch_pretrained_bert.tokenization_transfo_xl - loading corpus file ../model_file/transfo-xl-wt103-corpus.bin
02/20/2019 19:49:36 - INFO - pytorch_pretrained_bert.modeling_transfo_xl - loading weights file ../model_file/transfo-xl-wt103-pytorch_model.bin
02/20/2019 19:49:36 - INFO - pytorch_pretrained_bert.modeling_transfo_xl - loading configuration file ../model_file/transfo-xl-wt103-config.json
02/20/2019 19:49:36 - INFO - pytorch_pretrained_bert.modeling_transfo_xl - Model config {
"adaptive": true,
"attn_type": 0,
"clamp_len": 1000,
"cutoffs": [
20000,
40000,
200000
],
"d_embed": 1024,
"d_head": 64,
"d_inner": 4096,
"d_model": 1024,
"div_val": 4,
"dropatt": 0.0,
"dropout": 0.1,
"ext_len": 0,
"init": "normal",
"init_range": 0.01,
"init_std": 0.02,
"mem_len": 1600,
"n_head": 16,
"n_layer": 18,
"n_token": 267735,
"pre_lnorm": false,
"proj_init_std": 0.01,
"same_length": true,
"sample_softmax": -1,
"tgt_len": 128,
"tie_projs": [
false,
true,
true,
true
],
"tie_weight": true,
"untie_r": true
}
02/20/2019 19:49:51 - INFO - __main__ - Evaluating with bsz 10 tgt_len 128 ext_len 0 mem_len 1600 clamp_len 1000
02/20/2019 19:57:35 - INFO - __main__ - Time : 464.00s, 2416.66ms/segment
02/20/2019 19:57:35 - INFO - __main__ - ====================================================================================================
02/20/2019 19:57:35 - INFO - __main__ - | test loss 2.90 | test ppl 18.213
02/20/2019 19:57:35 - INFO - __main__ - ====================================================================================================
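Transformer-XL's test perplexity is the exponential of the mean cross-entropy loss in nats: the log rounds the loss to 2.90, and the reported ppl of 18.213 implies an unrounded loss of ln(18.213) ≈ 2.9022. A sketch of the relation, assuming the loss is measured in nats per token:

```python
import math

def perplexity(mean_loss_nats):
    """Perplexity = exp(mean cross-entropy loss in nats)."""
    return math.exp(mean_loss_nats)

# The log prints the loss rounded to two decimals; the unrounded
# value can be recovered from the reported perplexity.
unrounded_loss = math.log(18.213)  # ~2.9022, which rounds to 2.90
```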