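Hello, and thank you for sharing this method. When I reproduced it on a regression model, I saw a large performance loss. May I ask you a few questions? (1) After loading the pretrained model, the algorithm compresses while it trains, and this ...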

About parameter settings during training (torch-model-compression, open issue)

Annmixiu commented on June 8, 2024

from torch-model-compression.

Comments (4)

gdh1995 commented on June 8, 2024

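(1) It looks like the pruning damaged the model. Try a longer pruning interval. For example, the prune_interval parameter of the resrep method in prune.py is the number of iterations between pruning steps, not the number of epochs, so for a larger dataset I suggest increasing it.

(2) Is this resrep? Accuracy drops at the iteration where pruning happens, and continued training usually recovers some of it.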
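
To make the iteration-versus-epoch point concrete, here is a minimal sketch. It is not code from torch-model-compression: `should_prune` and `prune_events_per_epoch` are hypothetical helpers, and the assumption is only that pruning fires whenever a global iteration counter reaches a multiple of `prune_interval`.

```python
def should_prune(iteration: int, prune_interval: int) -> bool:
    """Hypothetical trigger: fire when the global iteration counter
    reaches a positive multiple of prune_interval."""
    if prune_interval <= 0:
        raise ValueError("prune_interval must be positive")
    return iteration > 0 and iteration % prune_interval == 0


def prune_events_per_epoch(steps_per_epoch: int, prune_interval: int) -> float:
    """Average number of pruning events per epoch for an
    iteration-based interval (assumed trigger rule above)."""
    if prune_interval <= 0:
        raise ValueError("prune_interval must be positive")
    return steps_per_epoch / prune_interval


# With an interval of 200 iterations, a loader with 50,000 steps per epoch
# would prune 250 times in a single epoch, far more often than a small one.
print(prune_events_per_epoch(50_000, 200))
print(prune_events_per_epoch(400, 200))
```

Under this assumption, a fixed interval that suits a small dataset prunes far more aggressively per epoch on a large one, which is why scaling the interval with dataset size is suggested.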

Annmixiu commented on June 8, 2024

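(1) OK, thank you. In the resrep example you provided, I see that compression runs once every 200 iterations. I have more data, so I modified part of your training code as follows: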

    # begin epoch

    print("training...")
    for epoch in range(0, self.config["epoch"]):

        # setting lr
        if epoch <= self.config["warmup_epoch"]:
            lr = 0.005
        else:
            lr = 0.005 * (0.995 ** ((epoch - 1) // 2))
        self.config["lr"] = lr

        self.variable_dict["epoch"] = epoch
        self.run_hook(self.epoch_begin_hook)
        self.variable_dict["avg_mentor"] = AvgMeter()  # operate and update average
        self.model.train()
        # max_step = len(self.trainloader)
        # print("This is the max_step", max_step)
        for step, data in enumerate(tqdm(self.trainloader)):
            # tqdm.write("Step: {}".format(step))
            self.variable_dict["step"] = step + 1
            self.variable_dict["iteration"] += 1
            self.run_hook(self.iteration_begin_hook)  # record prune_iteration
            data = self._sample_to_device(data, self.variable_dict["base_device"])
            c_sample_number = self._get_sample_number(data)  # get the num_sample
            predict = self.config["predict_function"](self.model, data)
            self.variable_dict["loss"] = self.config["calculate_loss_function"](
                predict, data
            )
            self.on_loss_backward()
            self.variable_dict["loss"].backward()
            self.after_loss_backward()
            # add the gradient of penalty with decay to the gradient of precision parameter for the compactor
            self.optimizer.step()  # update parameter
            self.optimizer.zero_grad()  # zero out gradient
        evaluate_result = self.config["evaluate_function"](predict, data)
        evaluate_result["loss"] = self.variable_dict["loss"].item()  # get high accuracy loss
        self.variable_dict["avg_mentor"].update(
            evaluate_result, c_sample_number
        )  # update average
        # del evaluate_result, predict
        # if self.variable_dict["iteration"] % self.config["log_interval"] == 0:
        self.write_log(
            self.variable_dict["epoch"],
            self.variable_dict["iteration"],
            self.variable_dict["avg_mentor"].get(),
            self.config["lr"],
        )
        self.write_tensorboard(
            "train_log",
            self.variable_dict["iteration"],
            self.variable_dict["avg_mentor"].get(),
        )
        self.run_hook(self.iteration_end_hook)  # model prune, optimizer prune, compute flops
        del self.variable_dict["loss"]
        self.scheduler.step()  # adjust lr

        # evaluation/testing step
        print("evaluating...")
        self.variable_dict["test_avg_mentor"] = AvgMeter()
        self.model.eval()
        # for step, data in enumerate(self.testloader):
        #     self.variable_dict["step"] = step + 1
        #     data = self._sample_to_device(data, self.variable_dict["base_device"])
        #     c_sample_number = self._get_sample_number(data)
        #     predict = self.config["predict_function"](self.model, data)
        evaluate_result = self.config["evaluate_function"](predict, data)
        evaluate_result["loss"] = self.config["calculate_loss_function_test"](
            predict, data
        ).item()
        self.variable_dict["test_avg_mentor"].update(
            evaluate_result, c_sample_number
        )  # update precision result and num_sample
        if self.variable_dict["step"] % self.config["log_interval"] == 0:  # (useless at present) decide log_out_period, log_interval=20
            self.write_log(
                self.variable_dict["epoch"],
                self.variable_dict["step"],
                self.variable_dict["test_avg_mentor"].get(),
            )
        # del evaluate_result, predict
        self.write_log(
            self.variable_dict["epoch"],
            "final",
            self.variable_dict["test_avg_mentor"].get(),
            self.config["lr"],
        )
        self.write_tensorboard(
            "test_log",
            self.variable_dict["epoch"],
            self.variable_dict["test_avg_mentor"].get(),
        )
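
In the code as posted, the loop over `self.testloader` is commented out, so the evaluation section appears to reuse `predict` and `data` from the last training batch. The sketch below shows, in plain Python, the per-batch accumulation that the commented-out loop would perform. `evaluate_over_loader`, `predict_fn`, `loss_fn` and `count_fn` are hypothetical names for illustration, not part of torch-model-compression.

```python
from typing import Any, Callable, Iterable


def evaluate_over_loader(
    batches: Iterable[Any],
    predict_fn: Callable[[Any], Any],
    loss_fn: Callable[[Any, Any], float],
    count_fn: Callable[[Any], int] = len,
) -> float:
    """Return the sample-weighted mean loss over every batch in `batches`."""
    total_loss = 0.0
    total_samples = 0
    for batch in batches:
        prediction = predict_fn(batch)      # forward pass on this batch only
        n = count_fn(batch)                 # number of samples in the batch
        total_loss += float(loss_fn(prediction, batch)) * n
        total_samples += n
    if total_samples == 0:
        raise ValueError("the loader yielded no samples")
    return total_loss / total_samples


# Toy usage: the "prediction" is the batch itself and the loss is its mean value.
toy_batches = [[1.0, 2.0], [3.0]]
mean_loss = evaluate_over_loader(
    toy_batches,
    predict_fn=lambda batch: batch,
    loss_fn=lambda pred, batch: sum(pred) / len(pred),
)
print(mean_loss)
```

In a PyTorch loop this would typically run after `model.eval()` and inside `torch.no_grad()`, with the held-out loader passed as `batches`.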

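In summary, because the dataset for my speech regression model is very large, I did not compress at the iteration interval you originally set, and instead compressed once per epoch. I will change this and try setting the compression frequency to the number of iterations equivalent to 2-3 epochs, or to 1.5 epochs.
(2) I would also like to ask: if compression is working properly, do the training and validation losses decrease along a fairly smooth curve, as they do when training a model normally?
Thank you for your many answers and your help. I wish you all the best in your work and life!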
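
The conversion from "1.5 epochs" or "2-3 epochs" to an iteration count can be sketched as below. This is an illustration under stated assumptions: `epochs_to_iterations` is a hypothetical helper, `steps_per_epoch` stands for `len(self.trainloader)`, and the result would be supplied as `prune_interval` in whatever way the resrep configuration expects.

```python
def epochs_to_iterations(num_epochs: float, steps_per_epoch: int) -> int:
    """Convert a (possibly fractional) number of epochs into a whole
    number of training iterations, never less than one."""
    if num_epochs <= 0:
        raise ValueError("num_epochs must be positive")
    if steps_per_epoch <= 0:
        raise ValueError("steps_per_epoch must be positive")
    return max(1, round(num_epochs * steps_per_epoch))


# Example: a loader that yields 1,000 batches per epoch.
steps_per_epoch = 1_000
for epochs in (1.5, 2, 3):
    print(epochs, "epochs ->", epochs_to_iterations(epochs, steps_per_epoch), "iterations")
```

Deriving the interval from the loader length keeps the pruning frequency tied to dataset size, so it does not need to be re-tuned by hand when the amount of data changes.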

gdh1995 commented on June 8, 2024

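I only remember ResNet on CIFAR-10: during training the validation accuracy kept rising, dropped a little whenever pruning happened, and then kept rising again. I don't recall how the loss behaved.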

Annmixiu commented on June 8, 2024

OK, thank you for the answer.
