Comments (11)

jiaxiang-wu commented on June 21, 2024

Hi, are you using checkpoint files produced by your own training code, instead of pre-trained models provided by us? This will cause the above error message.

If you do need to use your own model definition and pre-trained models, then you need to create your own ModelHelper class and a Python script to use it, similar to:

  1. nets/resnet_at_cifar10.py (which defines a ModelHelper class)
  2. nets/resnet_at_cifar10_run.py (which uses the above class)

from pocketflow.

as754770178 commented on June 21, 2024

I downloaded models_resnet_56_at_cifar_10.tar.gz from https://api.ai.tencent.com/pocketflow/list.html and decompressed it into models.

error:
InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [32] rhs shape= [16]
[[Node: model/save/Assign_13 = Assign[T=DT_FLOAT, _class=["loc:@model/resnet_model/batch_normalization_11/gamma"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](model/resnet_model/batch_normalization_11/gamma, model/save/RestoreV2/_27)]]
[[Node: model/save/RestoreV2/_98 = _SendT=DT_FLOAT, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_104_model/save/RestoreV2", _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

BowieHsu commented on June 21, 2024

You have to modify the number of layers in resnet.py; the default number of layers should be 50 or 101.

jiaxiang-wu commented on June 21, 2024

Hi @as754770178
For nets/resnet_at_cifar10_run.py, the default number of layers is 20. Since you have downloaded the ResNet-56 model, you need to specify the number of layers with --resnet_size 56.
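
To see which variables disagree before restoring, the shapes expected by the graph can be compared with the shapes stored in the checkpoint. The sketch below is a minimal illustration and not part of PocketFlow: `find_shape_mismatches` is a hypothetical helper, and the two dictionaries are hand-written stand-ins (the second variable name is made up). With TensorFlow, the checkpoint side could presumably be filled from `tf.train.list_variables(checkpoint_path)`, which returns (name, shape) pairs.

```python
def find_shape_mismatches(graph_shapes, ckpt_shapes):
    """Return (name, graph_shape, ckpt_shape) for variables whose shapes differ."""
    mismatches = []
    for name, graph_shape in sorted(graph_shapes.items()):
        ckpt_shape = ckpt_shapes.get(name)
        # Only report variables present on both sides with different shapes.
        if ckpt_shape is not None and list(ckpt_shape) != list(graph_shape):
            mismatches.append((name, list(graph_shape), list(ckpt_shape)))
    return mismatches


# Illustrative values only: the first entry mirrors the error above
# (lhs = graph variable, rhs = value stored in the checkpoint).
graph_shapes = {
    "model/resnet_model/batch_normalization_11/gamma": [32],
    "model/resnet_model/batch_normalization_1/gamma": [16],  # hypothetical name
}
ckpt_shapes = {
    "model/resnet_model/batch_normalization_11/gamma": [16],
    "model/resnet_model/batch_normalization_1/gamma": [16],  # hypothetical name
}

mismatches = find_shape_mismatches(graph_shapes, ckpt_shapes)
for name, graph_shape, ckpt_shape in mismatches:
    print(name, "graph:", graph_shape, "checkpoint:", ckpt_shape)
```

A mismatch in a layer that sits at a different depth in the two networks, as here, usually suggests that the graph was built with a different number of layers than the checkpoint.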

as754770178 commented on June 21, 2024

Thanks. I misunderstood how PocketFlow works: I thought the net defined in nets/resnet_at_cifar10_run.py was itself the student net. Actually, does PocketFlow prune/quantize the net defined in nets/resnet_at_cifar10_run.py and use the result as the student net? Is my understanding correct?

jiaxiang-wu commented on June 21, 2024

I'm not sure whether I have understood your question.

In PocketFlow, the student network and the teacher network (which exists only if network distillation is enabled) share the same network architecture. The student network may have further restrictions introduced by pruning or quantization operations, while the teacher network is the full-precision uncompressed network. Does this resolve your question?

as754770178 commented on June 21, 2024

I want to confirm that, in network distillation, the student comes only from applying pruning or quantization operations to the full-precision uncompressed network.

jiaxiang-wu commented on June 21, 2024

Yes, for all model compression methods in PocketFlow, the compressed network (or student network) only comes from the pruned / quantized version of a full-precision uncompressed network (or teacher network). This is independent of network distillation, which only adds a distillation loss term to the training of the student network.
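
As a rough illustration of "only adds a distillation loss term", the sketch below computes a student loss with and without such a term. It is not PocketFlow's implementation: the temperature-softened KL divergence, the `weight` and `temperature` defaults, and all function names are assumptions chosen for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def student_loss(student_logits, teacher_logits, label,
                 weight=4.0, temperature=4.0, use_distillation=True):
    """Cross-entropy on the true label, plus an optional distillation term."""
    # Task loss: the same whether or not distillation is enabled.
    loss = -math.log(softmax(student_logits)[label])
    if use_distillation:
        # Assumed form of the extra term: KL between softened outputs.
        p_teacher = softmax(teacher_logits, temperature)
        p_student = softmax(student_logits, temperature)
        loss += weight * kl_divergence(p_teacher, p_student)
    return loss


# Hypothetical logits for a 3-class example.
student_logits = [2.0, 0.5, -1.0]
teacher_logits = [3.0, 0.0, -2.0]
base = student_loss(student_logits, teacher_logits, 0, use_distillation=False)
full = student_loss(student_logits, teacher_logits, 0)
print(base, full)
```

In this sketch the student's architecture and its pruning or quantization constraints are untouched by the flag; only the scalar being minimized changes.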

as754770178 commented on June 21, 2024

ok, thanks

as754770178 commented on June 21, 2024

I defined my own net, but the variable names are prefixed with 'model', such as 'model/resnet_v1_110/block1/unit_1/bottleneck2_v1/conv1/BatchNorm/beta', whereas they should be 'resnet_v1_110/block1/unit_1/bottleneck2_v1/conv1/BatchNorm/beta'.

`NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key model/resnet_v1_110/block1/unit_1/bottleneck2_v1/conv1/BatchNorm/beta not found in checkpoint
[[Node: model/save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_model/save/Const_0_0, model/save/RestoreV2/tensor_names, model/save/RestoreV2/shape_and_slices)]]
[[Node: model/save/RestoreV2/_1059 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_1064_model/save/RestoreV2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]]`
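
One general way to handle a scope prefix like this is to build a mapping from checkpoint names (without the prefix) to graph variable names (with it). The sketch below shows only the name mapping in plain Python; `strip_scope_prefix` is a hypothetical helper. In TensorFlow 1.x, `tf.train.Saver` accepts a `var_list` dictionary keyed by checkpoint names, so a mapping of this kind could be used for restoring. Whether the fix referenced in this thread works this way is not stated here.

```python
def strip_scope_prefix(graph_names, prefix="model/"):
    """Map checkpoint-style names (prefix removed) to graph variable names."""
    name_map = {}
    for graph_name in graph_names:
        if graph_name.startswith(prefix):
            ckpt_name = graph_name[len(prefix):]
        else:
            # Names without the prefix are assumed to match the checkpoint as-is.
            ckpt_name = graph_name
        name_map[ckpt_name] = graph_name
    return name_map


# Variable name taken from the error message above.
graph_names = [
    "model/resnet_v1_110/block1/unit_1/bottleneck2_v1/conv1/BatchNorm/beta",
]
name_map = strip_scope_prefix(graph_names)
for ckpt_name, graph_name in name_map.items():
    print(ckpt_name, "->", graph_name)
```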

jiaxiang-wu commented on June 21, 2024

It seems this issue has been resolved in #27. Closing. Reopen it if there are any further questions.
