
Comments (5)

jacobzweig commented on June 19, 2024

Hey @ningshixian you'll need to remove the pooling layers from trainable weights, e.g.

trainable_vars = [var for var in trainable_vars if "/cls/" not in var.name and "/pooler/" not in var.name]
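
A minimal sketch of that filter, run on stand-in objects so it can be tried without loading the BERT module. The `Var` type, the helper name `drop_unused_heads`, and the variable names are hypothetical; they only mimic the scope names a BERT module might expose.

```python
from collections import namedtuple

# Stand-in for a TensorFlow variable; only the .name attribute matters here.
Var = namedtuple("Var", ["name"])


def drop_unused_heads(variables, excluded=("/cls/", "/pooler/")):
    """Keep variables whose name contains none of the excluded scope fragments."""
    return [v for v in variables if not any(tag in v.name for tag in excluded)]


# Hypothetical variable names, for illustration only.
all_vars = [
    Var("bert_module/bert/encoder/layer_11/output/dense/kernel:0"),
    Var("bert_module/bert/pooler/dense/kernel:0"),
    Var("bert_module/cls/predictions/output_bias:0"),
]

kept = drop_unused_heads(all_vars)
print([v.name for v in kept])
```

Only the encoder variable survives the filter in this toy list.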

from keras-bert.

ningshixian commented on June 19, 2024

Hello,
I modified part of the code:

bert_outputs = self.bert(inputs=bert_inputs, signature="tokens", as_dict=True)['sequence_output']
return (max_seq_length, self.output_size)

But when I run the code, I get the following error:
"F tensorflow/core/framework/tensor_shape.cc:44] Check failed: NDIMS == dims() (2 vs. 4)Asking for tensor of 2 dimensions from a tensor of 4 dimensions
Aborted (core dumped)"

May I know the reason?

I have found the answer in google-research/bert#146


ningshixian commented on June 19, 2024

But another question has come up.
When I execute the following code, I get this error:
ValueError: An operation has None for gradient. Please make sure that all of your ops have a gradient defined
'''
trainable_vars = self.bert.variables
trainable_vars = [var for var in trainable_vars if not "/cls/" in var.name]
trainable_vars = trainable_vars[-self.n_fine_tune_layers :]
for var in trainable_vars:
    self.trainable_weights.append(var)
'''
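
One possible reading of this error, offered as an assumption rather than a confirmed diagnosis: when only "/cls/" is filtered out, the final slice can still pick up pooler variables, and those do not feed 'sequence_output', so they would have no gradient. The sketch below uses stand-in objects with hypothetical names and ordering (the real module's variable order may differ) to show how the slice changes with the filter.

```python
from collections import namedtuple

# Stand-in for a TensorFlow variable; only the .name attribute is used.
Var = namedtuple("Var", ["name"])


def last_n_trainable(variables, n, excluded):
    """Drop variables matching any excluded fragment, then keep the last n."""
    kept = [v for v in variables if not any(tag in v.name for tag in excluded)]
    return kept[-n:]


# Hypothetical names and ordering, for illustration only.
variables = [
    Var("module/bert/encoder/layer_10/output/dense/kernel:0"),
    Var("module/bert/encoder/layer_11/output/dense/kernel:0"),
    Var("module/bert/pooler/dense/kernel:0"),
    Var("module/bert/pooler/dense/bias:0"),
    Var("module/cls/predictions/output_bias:0"),
]

# Filtering "/cls/" alone leaves the pooler variables at the end of the list.
only_cls = last_n_trainable(variables, 2, excluded=("/cls/",))
# Filtering both scopes makes the slice land on encoder variables.
both = last_n_trainable(variables, 2, excluded=("/cls/", "/pooler/"))

print([v.name for v in only_cls])
print([v.name for v in both])
```

In this toy list, the first selection consists entirely of pooler variables, while the second contains only encoder variables.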


changquanyou commented on June 19, 2024

2019-04-10 16:33:34.752272: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 11481 MB memory) -> physical GPU (device: 0, name: Tesla K40m, pci bus id: 0000:82:00.0, compute capability: 3.5)
segment_ids name: "segment_ids_1:0"
dtype: DT_INT32
tensor_shape {
dim {
size: -1
}
dim {
size: 256
}
}

segment_ids_1:0
==> Tensor("segment_ids_1:0", shape=(?, 256), dtype=int32)
input_ids name: "input_ids_1:0"
dtype: DT_INT32
tensor_shape {
dim {
size: -1
}
dim {
size: 256
}
}

input_ids_1:0
==> Tensor("input_ids_1:0", shape=(?, 256), dtype=int32)
input_mask name: "input_mask_1:0"
dtype: DT_INT32
tensor_shape {
dim {
size: -1
}
dim {
size: 256
}
}

input_mask_1:0
==> Tensor("input_mask_1:0", shape=(?, 256), dtype=int32)
label_ids name: "label_ids_1:0"
dtype: DT_INT32
tensor_shape {
dim {
size: -1
}
dim {
size: 256
}
}

label_ids_1:0
==> Tensor("label_ids_1:0", shape=(?, 256), dtype=int32)
probabilities name: "loss/Softmax:0"
dtype: DT_FLOAT
tensor_shape {
dim {
size: -1
}
dim {
size: 3
}
}

(3613, 2)
2019-04-10 16:33:37.213336: F tensorflow/core/framework/tensor_shape.cc:44] Check failed: NDIMS == dims() (2 vs. 4)Asking for tensor of 2 dimensions from a tensor of 4 dimensions
Aborted (core dumped)

Above is the error message. I trained the classification model on CPU and also ran prediction locally, and both work normally. But when I run it on the server with a GPU, I get this error.


ningshixian commented on June 19, 2024

Hey @ningshixian you'll need to remove the pooling layers from trainable weights, e.g.

trainable_vars = [var for var in trainable_vars if "/cls/" not in var.name and "/pooler/" not in var.name]

Thanks, but it does not work.
