Why is the model file size not fixed, and why does it seem to increase as training progresses?

jcjohnson commented on September 7, 2024

Why is the model file size not fixed? It seems to increase as training progresses.

Comments (3)

austingg commented on September 7, 2024

Yes, I have noticed this too, and I am a little confused.
Is the network different for different styles?

jcjohnson commented on September 7, 2024

The models with instance normalization use a slightly different architecture than the ECCV16 models; they use half as many filters per layer, so they have a smaller file size.

The size of checkpoints increases slightly during training because, in addition to the model, the checkpoints also store a history of training and validation losses. You could reduce the size of checkpoints by stripping this history, but this would probably not make a big difference.

In addition to model weights, Torch checkpoints also store tensors for the gradients of model weights; this makes checkpoints twice as big as they really need to be. It is possible to remove these gradient tensors before saving checkpoints, but this is annoying and somewhat error-prone so I have not implemented it here.
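
As a rough illustration of the two effects described above, here is a back-of-the-envelope sketch in Python (not the Torch code this repository uses). The parameter count, the 4 bytes per weight, the 8 bytes per recorded loss value, and the function name are all assumptions chosen for illustration; real checkpoints also carry serialization overhead that is ignored here.

```python
def estimate_checkpoint_bytes(num_params, num_loss_entries,
                              bytes_per_param=4, bytes_per_loss=8,
                              store_gradients=True):
    """Rough checkpoint size: weights, optional gradients, and loss history."""
    weight_bytes = num_params * bytes_per_param
    # Gradient tensors have the same shape as the weights they belong to,
    # so keeping them doubles the space taken by the parameters.
    grad_bytes = weight_bytes if store_gradients else 0
    # The loss history grows with every recorded training/validation loss.
    history_bytes = num_loss_entries * bytes_per_loss
    return weight_bytes + grad_bytes + history_bytes


# Hypothetical model with 1.5 million parameters (an assumed figure).
early = estimate_checkpoint_bytes(1_500_000, num_loss_entries=1_000)
late = estimate_checkpoint_bytes(1_500_000, num_loss_entries=40_000)
stripped = estimate_checkpoint_bytes(1_500_000, 40_000, store_gradients=False)

print(early, late, stripped)
```

Under these assumptions the loss history adds only a few hundred kilobytes over the course of training, while dropping the gradient tensors removes roughly half of the file.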

logic1988 commented on September 7, 2024

@jcjohnson Thank you. Do you mean that setting checkpoint_every to a larger value can slightly reduce the model size, but that the more important step is to remove the gradient tensors in the code? I think one solution would be to write a script that removes the unneeded data, so that users who need it can run it.
