
mqbench's Introduction



Introduction

MQBench is an open-source model quantization toolkit based on PyTorch fx.

The vision of MQBench is to provide:

  • SOTA Algorithms. With MQBench, hardware vendors and researchers can benefit from the latest research progress in academia.
  • Powerful Toolkits. With the toolkit, quantization nodes can be inserted into the original PyTorch module automatically with respect to the specific hardware. After training, the quantized model can be smoothly converted to a format that can run inference on the real device; a typical workflow is sketched below.
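
A minimal sketch of that workflow. The imports match those used in the issues further down this page; the model and the calibration loop are placeholders, so treat this as an illustration rather than the canonical recipe:

import torch
import torchvision.models as models

from mqbench.prepare_by_platform import prepare_by_platform, BackendType
from mqbench.utils.state import enable_calibration, enable_quantization

model = models.resnet18(pretrained=True).eval()

# Trace the model and insert fake-quantize nodes for the target backend.
model = prepare_by_platform(model, BackendType.Tensorrt)

# Calibration phase: observers collect statistics, fake quantization is off.
enable_calibration(model)
with torch.no_grad():
    model(torch.randn(1, 3, 224, 224))  # feed real calibration data here

# QAT / evaluation phase: fake quantization is on, using calibrated params.
enable_quantization(model)
out = model(torch.randn(1, 3, 224, 224))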

Installation

git clone git@github.com:ModelTC/MQBench.git
cd MQBench
python setup.py install

Documentation

MQBench aims to support (1) various deployable quantization algorithms and (2) hardware backend libraries to facilitate the development of the community.

For detailed information, please refer to the MQBench documentation.

Citation

If you use this toolkit or benchmark in your research, please cite this project.

@article{MQBench,
  title   = {MQBench: Towards Reproducible and Deployable Model Quantization Benchmark},
  author  = {Yuhang Li* and Mingzhu Shen* and Jian Ma* and Yan Ren* and Mingxin Zhao* and
             Qi Zhang* and Ruihao Gong* and Fengwei Yu and Junjie Yan},
  journal = {Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks},
  year    = {2021}
}

License

This project is released under the Apache 2.0 license.

mqbench's People

Contributors

a1trl9, estherxue, helloyongyang, iccv2023submission, jixiege, pannenetsf, sherlockhz1415, thb1314, tracin, un-knight, wm901115nwpu, www516717402, xhplus, zhangyuncheny, zhiwei-dong


mqbench's Issues

About int16 quantization

The documentation mentions that NNIE and QNN support 16-bit quantization. During actual training, is it sufficient to simply set the bit of a_qscheme to 16?
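
For reference, based on the extra_qconfig_dict format shown in the other issues on this page, that would presumably look like the following (whether the backend accepts 16 here is exactly the open question):

extra_qconfig_dict = {
    'a_qscheme': {
        'bit': 16,          # 16-bit activations; assumes the NNIE/QNN backend accepts it
        'symmetry': True,
        'per_channel': False,
        'pot_scale': False
    }
}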

2-bit LSQ quantization on ResNet18

Hi,
Is there a 2-bit LSQ quantization config for ResNet18? I used the provided config, changed the quantization bits to 2, kept everything else the same, and got 65.508 top-1 accuracy. There is about a 2.1% gap compared with the 67.6 reported in the original LSQ paper. Can you give me some advice on reproducing the results?
Thanks.
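
For reference, a minimal sketch of such a config, reusing the extra_qconfig_dict layout shown in the PACT issue further down this page and assuming MQBench's LearnableFakeQuantize implements LSQ:

extra_qconfig_dict = {
    'w_fakequantize': 'LearnableFakeQuantize',  # LSQ-style learnable step size
    'a_fakequantize': 'LearnableFakeQuantize',
    'w_qscheme': {'bit': 2, 'symmetry': True, 'per_channel': False, 'pot_scale': False},
    'a_qscheme': {'bit': 2, 'symmetry': True, 'per_channel': False, 'pot_scale': False}
}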

MQBench in DETR

Hi,
I tried to apply MQBench to DETR from Facebook by running:

model = torch.hub.load('facebookresearch/detr', 'detr_resnet101_panoptic', num_classes=250)
model.eval()
model = prepare_by_platform(model, BackendType.Tensorrt)

but I ran into this error:

File "/home/shawn/.cache/torch/hub/facebookresearch_detr_master/models/position_encoding.py", line 40, in forward
dim_t = torch.arange(self.num_pos_feats, dtype=torch.float32, device=x.device)
TypeError: arange() received an invalid combination of arguments - got (int, device=Attribute, dtype=torch.dtype), but expected one of:

  • (Number end, *, Tensor out, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)
  • (Number start, Number end, Number step, *, Tensor out, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)

It would be nice to get any help or intuition on how to fix this.
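
For context, this failure is characteristic of torch.fx symbolic tracing: x.device becomes an attribute of an fx Proxy, and torch.arange cannot accept a Proxy as its device argument. A common workaround, sketched here independently of MQBench (the helper name is hypothetical), is to move the offending call into a function that tracing treats as a leaf:

import torch
import torch.fx

def make_dim_t(num_pos_feats, device):
    # Executed eagerly even under symbolic tracing, so torch.arange
    # never receives an fx Proxy for `device`.
    return torch.arange(num_pos_feats, dtype=torch.float32, device=device)

# Register the helper so fx records a call to it instead of tracing into it.
torch.fx.wrap('make_dim_t')

The position_encoding module would then call make_dim_t(self.num_pos_feats, x.device) instead of torch.arange directly.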

Problem in advanced_ptq

I tried the new fix in the latest commit. My original issue (#64) was solved, but a new one appeared.

[screenshot of the error]

The model I want to quantize is YOLOv5, and the new problem is still with model_3_cv2. The architecture of the model at the place where the error occurs:

[screenshot of the model architecture]

It's in the BottleneckCSP block in YOLO. I'm guessing the concat operation in the original structure causes a problem with the input to the quantization node?

I tried again and the error occurs in the same place, but not with the same argument. The full output:

Apex recommended for faster mixed precision training: https://github.com/NVIDIA/apex
Using CUDA device0 _CudaDeviceProperties(name='TITAN Xp', total_memory=12196MB)

[MQBENCH] INFO: Quantize model Scheme: Tensorrt Mode: Training
[MQBENCH] INFO: Weight Qconfig:
    FakeQuantize: AdaRoundFakeQuantize Params: {}
    Oberver:      MSEObserver Params: Symmetric: False / Bitwidth: 8 / Per channel: True / Pot scale: False / Extra kwargs: {'p': 2.4}
[MQBENCH] INFO: Activation Qconfig:
    FakeQuantize: FixedFakeQuantize Params: {}
    Oberver:      EMAMSEObserver Params: Symmetric: False / Bitwidth: 8 / Per channel: False / Pot scale: False / Extra kwargs: {'p': 2.4}
[MQBENCH] INFO: Replace module to qat module.
[MQBENCH] INFO: Insert act quant x_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_0_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_1_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_2_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_3_cv1_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_3_m_0_cv1_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_3_m_0_cv2_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant add_1_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant cat_1_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_3_act_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_3_cv4_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_4_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_5_cv1_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_5_m_0_cv1_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_5_m_0_cv2_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant add_2_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_5_m_1_cv1_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_5_m_1_cv2_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant add_3_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_5_m_2_cv1_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_5_m_2_cv2_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant add_4_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant cat_2_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_5_act_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_5_cv4_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_6_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_7_cv1_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_7_m_0_cv1_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_7_m_0_cv2_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant add_5_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_7_m_1_cv1_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_7_m_1_cv2_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant add_6_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_7_m_2_cv1_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_7_m_2_cv2_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant add_7_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant cat_3_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_7_act_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_7_cv4_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_8_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_9_cv1_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant cat_4_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_9_cv2_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_10_cv1_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_10_m_0_cv1_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_10_m_0_cv2_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant cat_5_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_10_act_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_10_cv4_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_11_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant cat_6_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_14_cv1_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_14_m_0_cv1_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_14_m_0_cv2_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant cat_7_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_14_act_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_14_cv4_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_15_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant cat_8_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_18_cv1_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_18_m_0_cv1_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_18_m_0_cv2_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant cat_9_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_18_act_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_18_cv4_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant cat_10_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_21_cv1_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_21_m_0_cv1_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_21_m_0_cv2_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant cat_11_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_21_act_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_21_cv4_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant cat_12_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_24_cv1_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_24_m_0_cv1_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_24_m_0_cv2_conv_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant cat_13_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_24_act_post_act_fake_quantizer
[MQBENCH] INFO: Insert act quant model_24_cv4_conv_post_act_fake_quantizer
begin calibration now!
[MQBENCH] INFO: Enable observer and Disable quantize for act_fake_quant
[MQBENCH] INFO: Enable observer and Disable quantize for weight_fake_quant
begin advanced PTQ now!
[MQBENCH] INFO: Disable observer and Disable quantize.
[MQBENCH] INFO: Disable observer and Enable quantize.
[MQBENCH] INFO: prepare layer reconstruction for model.0.conv
[MQBENCH] INFO: the node list is below!
[MQBENCH] INFO: [model_0_conv, model_0_conv_post_act_fake_quantizer]
[MQBENCH] INFO: GraphModule(
  (model_0_conv_post_act_fake_quantizer): FixedFakeQuantize(
    fake_quant_enabled=tensor([1], device='cuda:0', dtype=torch.uint8), observer_enabled=tensor([0], device='cuda:0', dtype=torch.uint8), quant_min=0, quant_max=255, dtype=torch.quint8, qscheme=torch.per_tensor_affine, ch_axis=-1, scale=tensor([0.08819], device='cuda:0'), zero_point=tensor([0], device='cuda:0', dtype=torch.int32)
    (activation_post_process): EMAMSEObserver(min_val=0.0, max_val=22.489063262939453 ch_axis=-1 pot=False)
  )
  (model): Module(
    (0): Module(
      (conv): ConvReLU2d(
        3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)
        (weight_fake_quant): AdaRoundFakeQuantize(
          fake_quant_enabled=tensor([1], device='cuda:0', dtype=torch.uint8), observer_enabled=tensor([0], device='cuda:0', dtype=torch.uint8), quant_min=0, quant_max=255, dtype=torch.quint8, qscheme=torch.per_channel_affine, ch_axis=0, scale=List, zero_point=List
          (activation_post_process): MSEObserver(min_val=List, max_val=List ch_axis=0 pot=False)
        )
      )
    )
  )
)

def forward(self, x_post_act_fake_quantizer):
    model_0_conv = getattr(self.model, "0").conv(x_post_act_fake_quantizer);  x_post_act_fake_quantizer = None
    model_0_conv_post_act_fake_quantizer = self.model_0_conv_post_act_fake_quantizer(model_0_conv);  model_0_conv = None
    return model_0_conv_post_act_fake_quantizer
    
[MQBENCH] INFO: learn the scale for model_0_conv_post_act_fake_quantizer
Init alpha to be FP32
[MQBENCH] INFO: The world size is 1.
[MQBENCH] INFO: start tuning by adaround
[MQBENCH] INFO: Total loss:	3539151.000 (rec:3539151.000, round:0.000)	b=20.00	count=50
[MQBENCH] INFO: Total loss:	3210827.750 (rec:3210827.750, round:0.000)	b=20.00	count=100
[MQBENCH] INFO: Total loss:	3043429.500 (rec:3043429.500, round:0.000)	b=20.00	count=150
[MQBENCH] INFO: Total loss:	3222497.000 (rec:3222497.000, round:0.000)	b=20.00	count=200
[MQBENCH] INFO: Total loss:	3222481.750 (rec:3222481.750, round:0.000)	b=20.00	count=250
[MQBENCH] INFO: Total loss:	3222472.000 (rec:3222472.000, round:0.000)	b=20.00	count=300
[MQBENCH] INFO: Total loss:	2805912.000 (rec:2805912.000, round:0.000)	b=20.00	count=350
[MQBENCH] INFO: Total loss:	3222206.250 (rec:3222202.250, round:3.956)	b=20.00	count=400
[MQBENCH] INFO: Total loss:	3873423.750 (rec:3873419.750, round:3.885)	b=19.44	count=450
[MQBENCH] INFO: Total loss:	3222203.000 (rec:3222199.250, round:3.812)	b=18.88	count=500
[MQBENCH] INFO: Total loss:	2942019.250 (rec:2942015.500, round:3.754)	b=18.31	count=550
[MQBENCH] INFO: Total loss:	2805662.750 (rec:2805659.000, round:3.698)	b=17.75	count=600
[MQBENCH] INFO: Total loss:	3461206.500 (rec:3461202.750, round:3.633)	b=17.19	count=650
[MQBENCH] INFO: Total loss:	3595007.750 (rec:3595004.250, round:3.568)	b=16.62	count=700
[MQBENCH] INFO: Total loss:	3461003.500 (rec:3461000.000, round:3.505)	b=16.06	count=750
[MQBENCH] INFO: Total loss:	3753907.250 (rec:3753903.750, round:3.447)	b=15.50	count=800
[MQBENCH] INFO: Total loss:	3093474.250 (rec:3093470.750, round:3.391)	b=14.94	count=850
[MQBENCH] INFO: Total loss:	3873098.000 (rec:3873094.750, round:3.344)	b=14.38	count=900
[MQBENCH] INFO: Total loss:	3332251.250 (rec:3332248.000, round:3.295)	b=13.81	count=950
[MQBENCH] INFO: Total loss:	3594865.000 (rec:3594861.750, round:3.241)	b=13.25	count=1000
[MQBENCH] INFO: Total loss:	3093389.750 (rec:3093386.500, round:3.188)	b=12.69	count=1050
[MQBENCH] INFO: Total loss:	3872996.500 (rec:3872993.250, round:3.139)	b=12.12	count=1100
[MQBENCH] INFO: Total loss:	3753790.000 (rec:3753787.000, round:3.086)	b=11.56	count=1150
[MQBENCH] INFO: Total loss:	2923425.000 (rec:2923422.000, round:3.033)	b=11.00	count=1200
[MQBENCH] INFO: Total loss:	3594803.250 (rec:3594800.250, round:2.975)	b=10.44	count=1250
[MQBENCH] INFO: Total loss:	3332164.500 (rec:3332161.500, round:2.916)	b=9.88	count=1300
[MQBENCH] INFO: Total loss:	2870452.500 (rec:2870449.750, round:2.854)	b=9.31	count=1350
[MQBENCH] INFO: Total loss:	3753696.750 (rec:3753694.000, round:2.788)	b=8.75	count=1400
[MQBENCH] INFO: Total loss:	2945751.250 (rec:2945748.500, round:2.721)	b=8.19	count=1450
[MQBENCH] INFO: Total loss:	3753680.000 (rec:3753677.250, round:2.652)	b=7.62	count=1500
[MQBENCH] INFO: Total loss:	3460759.750 (rec:3460757.250, round:2.577)	b=7.06	count=1550
[MQBENCH] INFO: Total loss:	3221720.500 (rec:3221718.000, round:2.496)	b=6.50	count=1600
[MQBENCH] INFO: Total loss:	3538183.500 (rec:3538181.000, round:2.407)	b=5.94	count=1650
[MQBENCH] INFO: Total loss:	3209970.000 (rec:3209967.750, round:2.308)	b=5.38	count=1700
[MQBENCH] INFO: Total loss:	3460724.250 (rec:3460722.000, round:2.198)	b=4.81	count=1750
[MQBENCH] INFO: Total loss:	3042632.750 (rec:3042630.750, round:2.079)	b=4.25	count=1800
[MQBENCH] INFO: Total loss:	2923285.500 (rec:2923283.500, round:1.945)	b=3.69	count=1850
[MQBENCH] INFO: Total loss:	3594662.250 (rec:3594660.500, round:1.791)	b=3.12	count=1900
[MQBENCH] INFO: Total loss:	3872818.500 (rec:3872817.000, round:1.612)	b=2.56	count=1950
[MQBENCH] INFO: Total loss:	3753604.000 (rec:3753602.500, round:1.396)	b=2.00	count=2000
[MQBENCH] INFO: prepare layer reconstruction for model.1.conv
[MQBENCH] INFO: the node list is below!
[MQBENCH] INFO: [model_1_conv, model_1_conv_post_act_fake_quantizer]
[MQBENCH] INFO: GraphModule(
  (model_1_conv_post_act_fake_quantizer): FixedFakeQuantize(
    fake_quant_enabled=tensor([1], device='cuda:0', dtype=torch.uint8), observer_enabled=tensor([0], device='cuda:0', dtype=torch.uint8), quant_min=0, quant_max=255, dtype=torch.quint8, qscheme=torch.per_tensor_affine, ch_axis=-1, scale=tensor([0.10162], device='cuda:0'), zero_point=tensor([0], device='cuda:0', dtype=torch.int32)
    (activation_post_process): EMAMSEObserver(min_val=0.0, max_val=25.912405014038086 ch_axis=-1 pot=False)
  )
  (model): Module(
    (1): Module(
      (conv): ConvReLU2d(
        16, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)
        (weight_fake_quant): AdaRoundFakeQuantize(
          fake_quant_enabled=tensor([1], device='cuda:0', dtype=torch.uint8), observer_enabled=tensor([0], device='cuda:0', dtype=torch.uint8), quant_min=0, quant_max=255, dtype=torch.quint8, qscheme=torch.per_channel_affine, ch_axis=0, scale=List, zero_point=List
          (activation_post_process): MSEObserver(min_val=List, max_val=List ch_axis=0 pot=False)
        )
      )
    )
  )
)

def forward(self, model_0_conv_post_act_fake_quantizer):
    model_1_conv = getattr(self.model, "1").conv(model_0_conv_post_act_fake_quantizer);  model_0_conv_post_act_fake_quantizer = None
    model_1_conv_post_act_fake_quantizer = self.model_1_conv_post_act_fake_quantizer(model_1_conv);  model_1_conv = None
    return model_1_conv_post_act_fake_quantizer
    
[MQBENCH] INFO: learn the scale for model_1_conv_post_act_fake_quantizer
Init alpha to be FP32
[MQBENCH] INFO: The world size is 1.
[MQBENCH] INFO: start tuning by adaround
[MQBENCH] INFO: Total loss:	1575079.750 (rec:1575079.750, round:0.000)	b=20.00	count=50
[MQBENCH] INFO: Total loss:	1575079.250 (rec:1575079.250, round:0.000)	b=20.00	count=100
[MQBENCH] INFO: Total loss:	1470952.500 (rec:1470952.500, round:0.000)	b=20.00	count=150
[MQBENCH] INFO: Total loss:	1300293.250 (rec:1300293.250, round:0.000)	b=20.00	count=200
[MQBENCH] INFO: Total loss:	1304137.125 (rec:1304137.125, round:0.000)	b=20.00	count=250
[MQBENCH] INFO: Total loss:	1616733.500 (rec:1616733.500, round:0.000)	b=20.00	count=300
[MQBENCH] INFO: Total loss:	1646501.125 (rec:1646501.125, round:0.000)	b=20.00	count=350
[MQBENCH] INFO: Total loss:	1325806.375 (rec:1325763.375, round:43.054)	b=20.00	count=400
[MQBENCH] INFO: Total loss:	1325726.125 (rec:1325684.125, round:41.956)	b=19.44	count=450
[MQBENCH] INFO: Total loss:	1344530.250 (rec:1344489.500, round:40.739)	b=18.88	count=500
[MQBENCH] INFO: Total loss:	1646303.125 (rec:1646263.750, round:39.414)	b=18.31	count=550
[MQBENCH] INFO: Total loss:	1302296.750 (rec:1302258.750, round:38.059)	b=17.75	count=600
[MQBENCH] INFO: Total loss:	1366993.125 (rec:1366956.375, round:36.708)	b=17.19	count=650
[MQBENCH] INFO: Total loss:	1302282.500 (rec:1302247.000, round:35.524)	b=16.62	count=700
[MQBENCH] INFO: Total loss:	1574807.500 (rec:1574772.875, round:34.652)	b=16.06	count=750
[MQBENCH] INFO: Total loss:	1637826.375 (rec:1637792.375, round:33.994)	b=15.50	count=800
[MQBENCH] INFO: Total loss:	1444923.375 (rec:1444890.000, round:33.409)	b=14.94	count=850
[MQBENCH] INFO: Total loss:	1300030.750 (rec:1299997.875, round:32.837)	b=14.38	count=900
[MQBENCH] INFO: Total loss:	1303803.750 (rec:1303771.500, round:32.255)	b=13.81	count=950
[MQBENCH] INFO: Total loss:	1302106.750 (rec:1302075.125, round:31.657)	b=13.25	count=1000
[MQBENCH] INFO: Total loss:	1616274.375 (rec:1616243.375, round:31.059)	b=12.69	count=1050
[MQBENCH] INFO: Total loss:	1629899.750 (rec:1629869.250, round:30.472)	b=12.12	count=1100
[MQBENCH] INFO: Total loss:	1487585.250 (rec:1487555.375, round:29.897)	b=11.56	count=1150
[MQBENCH] INFO: Total loss:	1299870.875 (rec:1299841.625, round:29.275)	b=11.00	count=1200
[MQBENCH] INFO: Total loss:	1290767.875 (rec:1290739.250, round:28.602)	b=10.44	count=1250
[MQBENCH] INFO: Total loss:	1470381.375 (rec:1470353.500, round:27.921)	b=9.88	count=1300
[MQBENCH] INFO: Total loss:	1487473.500 (rec:1487446.250, round:27.240)	b=9.31	count=1350
[MQBENCH] INFO: Total loss:	1299758.250 (rec:1299731.750, round:26.540)	b=8.75	count=1400
[MQBENCH] INFO: Total loss:	1637492.875 (rec:1637467.125, round:25.801)	b=8.19	count=1450
[MQBENCH] INFO: Total loss:	1299750.875 (rec:1299725.875, round:25.026)	b=7.62	count=1500
[MQBENCH] INFO: Total loss:	1366667.625 (rec:1366643.375, round:24.194)	b=7.06	count=1550
[MQBENCH] INFO: Total loss:	1616043.625 (rec:1616020.375, round:23.299)	b=6.50	count=1600
[MQBENCH] INFO: Total loss:	1629663.625 (rec:1629641.250, round:22.344)	b=5.94	count=1650
[MQBENCH] INFO: Total loss:	1637379.750 (rec:1637358.375, round:21.319)	b=5.38	count=1700
[MQBENCH] INFO: Total loss:	1487367.000 (rec:1487346.750, round:20.197)	b=4.81	count=1750
[MQBENCH] INFO: Total loss:	1637375.500 (rec:1637356.500, round:18.948)	b=4.25	count=1800
[MQBENCH] INFO: Total loss:	1469221.625 (rec:1469204.125, round:17.536)	b=3.69	count=1850
[MQBENCH] INFO: Total loss:	1344037.125 (rec:1344021.250, round:15.918)	b=3.12	count=1900
[MQBENCH] INFO: Total loss:	1487311.250 (rec:1487297.250, round:14.057)	b=2.56	count=1950
[MQBENCH] INFO: Total loss:	1303450.875 (rec:1303439.000, round:11.883)	b=2.00	count=2000
[MQBENCH] INFO: prepare layer reconstruction for model.2.conv
[MQBENCH] INFO: the node list is below!
[MQBENCH] INFO: [model_2_conv, model_2_conv_post_act_fake_quantizer]
[MQBENCH] INFO: GraphModule(
  (model_2_conv_post_act_fake_quantizer): FixedFakeQuantize(
    fake_quant_enabled=tensor([1], device='cuda:0', dtype=torch.uint8), observer_enabled=tensor([0], device='cuda:0', dtype=torch.uint8), quant_min=0, quant_max=255, dtype=torch.quint8, qscheme=torch.per_tensor_affine, ch_axis=-1, scale=tensor([0.07901], device='cuda:0'), zero_point=tensor([0], device='cuda:0', dtype=torch.int32)
    (activation_post_process): EMAMSEObserver(min_val=0.0, max_val=20.147693634033203 ch_axis=-1 pot=False)
  )
  (model): Module(
    (2): Module(
      (conv): ConvReLU2d(
        32, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)
        (weight_fake_quant): AdaRoundFakeQuantize(
          fake_quant_enabled=tensor([1], device='cuda:0', dtype=torch.uint8), observer_enabled=tensor([0], device='cuda:0', dtype=torch.uint8), quant_min=0, quant_max=255, dtype=torch.quint8, qscheme=torch.per_channel_affine, ch_axis=0, scale=List, zero_point=List
          (activation_post_process): MSEObserver(min_val=List, max_val=List ch_axis=0 pot=False)
        )
      )
    )
  )
)

def forward(self, model_1_conv_post_act_fake_quantizer):
    model_2_conv = getattr(self.model, "2").conv(model_1_conv_post_act_fake_quantizer);  model_1_conv_post_act_fake_quantizer = None
    model_2_conv_post_act_fake_quantizer = self.model_2_conv_post_act_fake_quantizer(model_2_conv);  model_2_conv = None
    return model_2_conv_post_act_fake_quantizer
    
[MQBENCH] INFO: learn the scale for model_2_conv_post_act_fake_quantizer
Init alpha to be FP32
[MQBENCH] INFO: The world size is 1.
[MQBENCH] INFO: start tuning by adaround
[MQBENCH] INFO: Total loss:	1022320.125 (rec:1022320.125, round:0.000)	b=20.00	count=50
[MQBENCH] INFO: Total loss:	1012533.938 (rec:1012533.938, round:0.000)	b=20.00	count=100
[MQBENCH] INFO: Total loss:	1050814.500 (rec:1050814.500, round:0.000)	b=20.00	count=150
[MQBENCH] INFO: Total loss:	1190185.875 (rec:1190185.875, round:0.000)	b=20.00	count=200
[MQBENCH] INFO: Total loss:	996599.250 (rec:996599.250, round:0.000)	b=20.00	count=250
[MQBENCH] INFO: Total loss:	1050711.375 (rec:1050711.375, round:0.000)	b=20.00	count=300
[MQBENCH] INFO: Total loss:	1149552.250 (rec:1149552.250, round:0.000)	b=20.00	count=350
[MQBENCH] INFO: Total loss:	1022317.188 (rec:1022144.812, round:172.386)	b=20.00	count=400
[MQBENCH] INFO: Total loss:	1012527.938 (rec:1012360.938, round:166.975)	b=19.44	count=450
[MQBENCH] INFO: Total loss:	1305114.000 (rec:1304953.000, round:161.035)	b=18.88	count=500
[MQBENCH] INFO: Total loss:	1177806.500 (rec:1177651.875, round:154.631)	b=18.31	count=550
[MQBENCH] INFO: Total loss:	1022247.688 (rec:1022099.750, round:147.910)	b=17.75	count=600
[MQBENCH] INFO: Total loss:	1079310.000 (rec:1079168.375, round:141.625)	b=17.19	count=650
[MQBENCH] INFO: Total loss:	1268717.500 (rec:1268581.000, round:136.557)	b=16.62	count=700
[MQBENCH] INFO: Total loss:	1177739.000 (rec:1177606.250, round:132.722)	b=16.06	count=750
[MQBENCH] INFO: Total loss:	996573.312 (rec:996443.750, round:129.561)	b=15.50	count=800
[MQBENCH] INFO: Total loss:	1268653.875 (rec:1268527.250, round:126.623)	b=14.94	count=850
[MQBENCH] INFO: Total loss:	1160553.625 (rec:1160429.750, round:123.814)	b=14.38	count=900
[MQBENCH] INFO: Total loss:	1125875.125 (rec:1125754.000, round:121.146)	b=13.81	count=950
[MQBENCH] INFO: Total loss:	996476.125 (rec:996357.625, round:118.517)	b=13.25	count=1000
[MQBENCH] INFO: Total loss:	1080828.375 (rec:1080712.500, round:115.918)	b=12.69	count=1050
[MQBENCH] INFO: Total loss:	1050525.375 (rec:1050412.125, round:113.307)	b=12.12	count=1100
[MQBENCH] INFO: Total loss:	996359.562 (rec:996248.938, round:110.646)	b=11.56	count=1150
[MQBENCH] INFO: Total loss:	1238611.125 (rec:1238503.250, round:107.931)	b=11.00	count=1200
[MQBENCH] INFO: Total loss:	1268448.625 (rec:1268343.500, round:105.166)	b=10.44	count=1250
[MQBENCH] INFO: Total loss:	1194375.125 (rec:1194272.750, round:102.322)	b=9.88	count=1300
[MQBENCH] INFO: Total loss:	1189787.625 (rec:1189688.250, round:99.391)	b=9.31	count=1350
[MQBENCH] INFO: Total loss:	1149302.125 (rec:1149205.750, round:96.362)	b=8.75	count=1400
[MQBENCH] INFO: Total loss:	1268396.875 (rec:1268303.625, round:93.201)	b=8.19	count=1450
[MQBENCH] INFO: Total loss:	1080680.375 (rec:1080590.500, round:89.817)	b=7.62	count=1500
[MQBENCH] INFO: Total loss:	1149271.250 (rec:1149185.000, round:86.198)	b=7.06	count=1550
[MQBENCH] INFO: Total loss:	1268374.250 (rec:1268291.875, round:82.328)	b=6.50	count=1600
[MQBENCH] INFO: Total loss:	1125635.875 (rec:1125557.625, round:78.223)	b=5.94	count=1650
[MQBENCH] INFO: Total loss:	1025074.938 (rec:1025001.062, round:73.882)	b=5.38	count=1700
[MQBENCH] INFO: Total loss:	1177378.875 (rec:1177309.625, round:69.255)	b=4.81	count=1750
[MQBENCH] INFO: Total loss:	1160254.875 (rec:1160190.625, round:64.281)	b=4.25	count=1800
[MQBENCH] INFO: Total loss:	1050356.625 (rec:1050297.750, round:58.832)	b=3.69	count=1850
[MQBENCH] INFO: Total loss:	1149190.125 (rec:1149137.375, round:52.783)	b=3.12	count=1900
[MQBENCH] INFO: Total loss:	1177344.125 (rec:1177298.125, round:45.972)	b=2.56	count=1950
[MQBENCH] INFO: Total loss:	996172.812 (rec:996134.625, round:38.158)	b=2.00	count=2000
[MQBENCH] INFO: prepare layer reconstruction for model.3.cv1.conv
[MQBENCH] INFO: the node list is below!
[MQBENCH] INFO: [model_3_cv1_conv, model_3_cv1_conv_post_act_fake_quantizer]
[MQBENCH] INFO: GraphModule(
  (model_3_cv1_conv_post_act_fake_quantizer): FixedFakeQuantize(
    fake_quant_enabled=tensor([1], device='cuda:0', dtype=torch.uint8), observer_enabled=tensor([0], device='cuda:0', dtype=torch.uint8), quant_min=0, quant_max=255, dtype=torch.quint8, qscheme=torch.per_tensor_affine, ch_axis=-1, scale=tensor([0.03633], device='cuda:0'), zero_point=tensor([0], device='cuda:0', dtype=torch.int32)
    (activation_post_process): EMAMSEObserver(min_val=0.0, max_val=9.263504028320312 ch_axis=-1 pot=False)
  )
  (model): Module(
    (3): Module(
      (cv1): Module(
        (conv): ConvReLU2d(
          64, 32, kernel_size=(1, 1), stride=(1, 1)
          (weight_fake_quant): AdaRoundFakeQuantize(
            fake_quant_enabled=tensor([1], device='cuda:0', dtype=torch.uint8), observer_enabled=tensor([0], device='cuda:0', dtype=torch.uint8), quant_min=0, quant_max=255, dtype=torch.quint8, qscheme=torch.per_channel_affine, ch_axis=0, scale=List, zero_point=List
            (activation_post_process): MSEObserver(min_val=List, max_val=List ch_axis=0 pot=False)
          )
        )
      )
    )
  )
)

def forward(self, model_2_conv_post_act_fake_quantizer):
    model_3_cv1_conv = getattr(self.model, "3").cv1.conv(model_2_conv_post_act_fake_quantizer);  model_2_conv_post_act_fake_quantizer = None
    model_3_cv1_conv_post_act_fake_quantizer = self.model_3_cv1_conv_post_act_fake_quantizer(model_3_cv1_conv);  model_3_cv1_conv = None
    return model_3_cv1_conv_post_act_fake_quantizer
    
[MQBENCH] INFO: learn the scale for model_3_cv1_conv_post_act_fake_quantizer
Init alpha to be FP32
[MQBENCH] INFO: The world size is 1.
[MQBENCH] INFO: start tuning by adaround
[MQBENCH] INFO: Total loss:	104984.023 (rec:104984.023, round:0.000)	b=20.00	count=50
[MQBENCH] INFO: Total loss:	88707.805 (rec:88707.805, round:0.000)	b=20.00	count=100
[MQBENCH] INFO: Total loss:	101765.180 (rec:101765.180, round:0.000)	b=20.00	count=150
[MQBENCH] INFO: Total loss:	85504.398 (rec:85504.398, round:0.000)	b=20.00	count=200
[MQBENCH] INFO: Total loss:	90762.773 (rec:90762.773, round:0.000)	b=20.00	count=250
[MQBENCH] INFO: Total loss:	85503.766 (rec:85503.766, round:0.000)	b=20.00	count=300
[MQBENCH] INFO: Total loss:	107946.148 (rec:107946.148, round:0.000)	b=20.00	count=350
[MQBENCH] INFO: Total loss:	90780.094 (rec:90760.984, round:19.107)	b=20.00	count=400
[MQBENCH] INFO: Total loss:	107402.766 (rec:107384.234, round:18.528)	b=19.44	count=450
[MQBENCH] INFO: Total loss:	107402.000 (rec:107384.094, round:17.908)	b=18.88	count=500
[MQBENCH] INFO: Total loss:	90778.031 (rec:90760.828, round:17.205)	b=18.31	count=550
[MQBENCH] INFO: Total loss:	86059.000 (rec:86042.539, round:16.464)	b=17.75	count=600
[MQBENCH] INFO: Total loss:	90776.555 (rec:90760.805, round:15.751)	b=17.19	count=650
[MQBENCH] INFO: Total loss:	86053.188 (rec:86038.055, round:15.130)	b=16.62	count=700
[MQBENCH] INFO: Total loss:	90340.594 (rec:90325.953, round:14.639)	b=16.06	count=750
[MQBENCH] INFO: Total loss:	94209.672 (rec:94195.453, round:14.219)	b=15.50	count=800
[MQBENCH] INFO: Total loss:	104980.758 (rec:104966.906, round:13.854)	b=14.94	count=850
[MQBENCH] INFO: Total loss:	104980.453 (rec:104966.898, round:13.551)	b=14.38	count=900
[MQBENCH] INFO: Total loss:	107382.312 (rec:107369.055, round:13.256)	b=13.81	count=950
[MQBENCH] INFO: Total loss:	86042.688 (rec:86029.703, round:12.988)	b=13.25	count=1000
[MQBENCH] INFO: Total loss:	101750.539 (rec:101737.820, round:12.716)	b=12.69	count=1050
[MQBENCH] INFO: Total loss:	88696.070 (rec:88683.656, round:12.417)	b=12.12	count=1100
[MQBENCH] INFO: Total loss:	87091.375 (rec:87079.273, round:12.100)	b=11.56	count=1150
[MQBENCH] INFO: Total loss:	107925.648 (rec:107913.859, round:11.790)	b=11.00	count=1200
[MQBENCH] INFO: Total loss:	88690.602 (rec:88679.125, round:11.475)	b=10.44	count=1250
[MQBENCH] INFO: Total loss:	87085.820 (rec:87074.672, round:11.147)	b=9.88	count=1300
[MQBENCH] INFO: Total loss:	101740.039 (rec:101729.234, round:10.805)	b=9.31	count=1350
[MQBENCH] INFO: Total loss:	86021.062 (rec:86010.609, round:10.452)	b=8.75	count=1400
[MQBENCH] INFO: Total loss:	86020.656 (rec:86010.570, round:10.084)	b=8.19	count=1450
[MQBENCH] INFO: Total loss:	91561.102 (rec:91551.406, round:9.694)	b=7.62	count=1500
[MQBENCH] INFO: Total loss:	111278.703 (rec:111269.430, round:9.277)	b=7.06	count=1550
[MQBENCH] INFO: Total loss:	85475.719 (rec:85466.891, round:8.827)	b=6.50	count=1600
[MQBENCH] INFO: Total loss:	86016.086 (rec:86007.742, round:8.341)	b=5.94	count=1650
[MQBENCH] INFO: Total loss:	88679.156 (rec:88671.328, round:7.831)	b=5.38	count=1700
[MQBENCH] INFO: Total loss:	85473.922 (rec:85466.633, round:7.287)	b=4.81	count=1750
[MQBENCH] INFO: Total loss:	91554.492 (rec:91547.789, round:6.707)	b=4.25	count=1800
[MQBENCH] INFO: Total loss:	107909.867 (rec:107903.789, round:6.079)	b=3.69	count=1850
[MQBENCH] INFO: Total loss:	108901.992 (rec:108896.586, round:5.403)	b=3.12	count=1900
[MQBENCH] INFO: Total loss:	90728.711 (rec:90724.047, round:4.667)	b=2.56	count=1950
[MQBENCH] INFO: Total loss:	101727.688 (rec:101723.836, round:3.852)	b=2.00	count=2000
[MQBENCH] INFO: prepare layer reconstruction for model.3.m.0.cv1.conv
[MQBENCH] INFO: the node list is below!
[MQBENCH] INFO: [model_3_m_0_cv1_conv, model_3_m_0_cv1_conv_post_act_fake_quantizer]
[MQBENCH] INFO: GraphModule(
  (model_3_m_0_cv1_conv_post_act_fake_quantizer): FixedFakeQuantize(
    fake_quant_enabled=tensor([1], device='cuda:0', dtype=torch.uint8), observer_enabled=tensor([0], device='cuda:0', dtype=torch.uint8), quant_min=0, quant_max=255, dtype=torch.quint8, qscheme=torch.per_tensor_affine, ch_axis=-1, scale=tensor([0.03124], device='cuda:0'), zero_point=tensor([0], device='cuda:0', dtype=torch.int32)
    (activation_post_process): EMAMSEObserver(min_val=0.0, max_val=7.966949939727783 ch_axis=-1 pot=False)
  )
  (model): Module(
    (3): Module(
      (m): Module(
        (0): Module(
          (cv1): Module(
            (conv): ConvReLU2d(
              32, 32, kernel_size=(1, 1), stride=(1, 1)
              (weight_fake_quant): AdaRoundFakeQuantize(
                fake_quant_enabled=tensor([1], device='cuda:0', dtype=torch.uint8), observer_enabled=tensor([0], device='cuda:0', dtype=torch.uint8), quant_min=0, quant_max=255, dtype=torch.quint8, qscheme=torch.per_channel_affine, ch_axis=0, scale=List, zero_point=List
                (activation_post_process): MSEObserver(min_val=List, max_val=List ch_axis=0 pot=False)
              )
            )
          )
        )
      )
    )
  )
)

def forward(self, model_3_cv1_conv_post_act_fake_quantizer):
    model_3_m_0_cv1_conv = getattr(getattr(self.model, "3").m, "0").cv1.conv(model_3_cv1_conv_post_act_fake_quantizer);  model_3_cv1_conv_post_act_fake_quantizer = None
    model_3_m_0_cv1_conv_post_act_fake_quantizer = self.model_3_m_0_cv1_conv_post_act_fake_quantizer(model_3_m_0_cv1_conv);  model_3_m_0_cv1_conv = None
    return model_3_m_0_cv1_conv_post_act_fake_quantizer
    
[MQBENCH] INFO: learn the scale for model_3_m_0_cv1_conv_post_act_fake_quantizer
Init alpha to be FP32
[MQBENCH] INFO: The world size is 1.
[MQBENCH] INFO: start tuning by adaround
[MQBENCH] INFO: Total loss:	74922.727 (rec:74922.727, round:0.000)	b=20.00	count=50
[MQBENCH] INFO: Total loss:	90180.148 (rec:90180.148, round:0.000)	b=20.00	count=100
[MQBENCH] INFO: Total loss:	74915.336 (rec:74915.336, round:0.000)	b=20.00	count=150
[MQBENCH] INFO: Total loss:	73166.547 (rec:73166.547, round:0.000)	b=20.00	count=200
[MQBENCH] INFO: Total loss:	72276.906 (rec:72276.906, round:0.000)	b=20.00	count=250
[MQBENCH] INFO: Total loss:	90175.766 (rec:90175.766, round:0.000)	b=20.00	count=300
[MQBENCH] INFO: Total loss:	72275.961 (rec:72275.961, round:0.000)	b=20.00	count=350
[MQBENCH] INFO: Total loss:	90108.414 (rec:90098.922, round:9.493)	b=20.00	count=400
[MQBENCH] INFO: Total loss:	92421.492 (rec:92412.227, round:9.269)	b=19.44	count=450
[MQBENCH] INFO: Total loss:	73170.367 (rec:73161.320, round:9.049)	b=18.88	count=500
[MQBENCH] INFO: Total loss:	90901.281 (rec:90892.445, round:8.837)	b=18.31	count=550
[MQBENCH] INFO: Total loss:	77723.289 (rec:77714.672, round:8.619)	b=17.75	count=600
[MQBENCH] INFO: Total loss:	77713.320 (rec:77704.906, round:8.412)	b=17.19	count=650
[MQBENCH] INFO: Total loss:	77236.125 (rec:77227.891, round:8.231)	b=16.62	count=700
[MQBENCH] INFO: Total loss:	82413.883 (rec:82405.820, round:8.064)	b=16.06	count=750
[MQBENCH] INFO: Total loss:	77230.914 (rec:77222.992, round:7.918)	b=15.50	count=800
[MQBENCH] INFO: Total loss:	89802.352 (rec:89794.555, round:7.800)	b=14.94	count=850
[MQBENCH] INFO: Total loss:	73140.039 (rec:73132.367, round:7.674)	b=14.38	count=900
[MQBENCH] INFO: Total loss:	73135.469 (rec:73127.945, round:7.525)	b=13.81	count=950
[MQBENCH] INFO: Total loss:	72246.219 (rec:72238.844, round:7.377)	b=13.25	count=1000
[MQBENCH] INFO: Total loss:	86495.039 (rec:86487.812, round:7.226)	b=12.69	count=1050
[MQBENCH] INFO: Total loss:	92371.758 (rec:92364.688, round:7.072)	b=12.12	count=1100
[MQBENCH] INFO: Total loss:	77215.203 (rec:77208.281, round:6.918)	b=11.56	count=1150
[MQBENCH] INFO: Total loss:	90134.648 (rec:90127.891, round:6.759)	b=11.00	count=1200
[MQBENCH] INFO: Total loss:	74565.508 (rec:74558.906, round:6.603)	b=10.44	count=1250
[MQBENCH] INFO: Total loss:	90058.266 (rec:90051.820, round:6.447)	b=9.88	count=1300
[MQBENCH] INFO: Total loss:	78599.281 (rec:78593.000, round:6.279)	b=9.31	count=1350
[MQBENCH] INFO: Total loss:	74871.906 (rec:74865.812, round:6.093)	b=8.75	count=1400
[MQBENCH] INFO: Total loss:	74869.305 (rec:74863.414, round:5.894)	b=8.19	count=1450
[MQBENCH] INFO: Total loss:	73118.312 (rec:73112.625, round:5.685)	b=7.62	count=1500
[MQBENCH] INFO: Total loss:	73114.555 (rec:73109.086, round:5.471)	b=7.06	count=1550
[MQBENCH] INFO: Total loss:	90841.016 (rec:90835.766, round:5.249)	b=6.50	count=1600
[MQBENCH] INFO: Total loss:	90840.742 (rec:90835.727, round:5.016)	b=5.94	count=1650
[MQBENCH] INFO: Total loss:	72227.391 (rec:72222.625, round:4.765)	b=5.38	count=1700
[MQBENCH] INFO: Total loss:	74859.797 (rec:74855.312, round:4.485)	b=4.81	count=1750
[MQBENCH] INFO: Total loss:	82375.586 (rec:82371.414, round:4.174)	b=4.25	count=1800
[MQBENCH] INFO: Total loss:	77658.773 (rec:77654.953, round:3.824)	b=3.69	count=1850
[MQBENCH] INFO: Total loss:	90828.023 (rec:90824.586, round:3.436)	b=3.12	count=1900
[MQBENCH] INFO: Total loss:	77655.438 (rec:77652.430, round:3.004)	b=2.56	count=1950
[MQBENCH] INFO: Total loss:	73101.320 (rec:73098.805, round:2.517)	b=2.00	count=2000
[MQBENCH] INFO: prepare layer reconstruction for model.3.m.0.cv2.conv
[MQBENCH] INFO: the node list is below!
[MQBENCH] INFO: [model_3_m_0_cv2_conv, model_3_m_0_cv2_conv_post_act_fake_quantizer]
[MQBENCH] INFO: GraphModule(
  (model_3_m_0_cv2_conv_post_act_fake_quantizer): FixedFakeQuantize(
    fake_quant_enabled=tensor([1], device='cuda:0', dtype=torch.uint8), observer_enabled=tensor([0], device='cuda:0', dtype=torch.uint8), quant_min=0, quant_max=255, dtype=torch.quint8, qscheme=torch.per_tensor_affine, ch_axis=-1, scale=tensor([0.04500], device='cuda:0'), zero_point=tensor([0], device='cuda:0', dtype=torch.int32)
    (activation_post_process): EMAMSEObserver(min_val=0.0, max_val=11.474054336547852 ch_axis=-1 pot=False)
  )
  (model): Module(
    (3): Module(
      (m): Module(
        (0): Module(
          (cv2): Module(
            (conv): ConvReLU2d(
              32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)
              (weight_fake_quant): AdaRoundFakeQuantize(
                fake_quant_enabled=tensor([1], device='cuda:0', dtype=torch.uint8), observer_enabled=tensor([0], device='cuda:0', dtype=torch.uint8), quant_min=0, quant_max=255, dtype=torch.quint8, qscheme=torch.per_channel_affine, ch_axis=0, scale=List, zero_point=List
                (activation_post_process): MSEObserver(min_val=List, max_val=List ch_axis=0 pot=False)
              )
            )
          )
        )
      )
    )
  )
)

def forward(self, model_3_m_0_cv1_conv_post_act_fake_quantizer):
    model_3_m_0_cv2_conv = getattr(getattr(self.model, "3").m, "0").cv2.conv(model_3_m_0_cv1_conv_post_act_fake_quantizer);  model_3_m_0_cv1_conv_post_act_fake_quantizer = None
    model_3_m_0_cv2_conv_post_act_fake_quantizer = self.model_3_m_0_cv2_conv_post_act_fake_quantizer(model_3_m_0_cv2_conv);  model_3_m_0_cv2_conv = None
    return model_3_m_0_cv2_conv_post_act_fake_quantizer
    
[MQBENCH] INFO: learn the scale for model_3_m_0_cv2_conv_post_act_fake_quantizer
Init alpha to be FP32
[MQBENCH] INFO: The world size is 1.
[MQBENCH] INFO: start tuning by adaround
[MQBENCH] INFO: Total loss:	37715.312 (rec:37715.312, round:0.000)	b=20.00	count=50
[MQBENCH] INFO: Total loss:	41225.355 (rec:41225.355, round:0.000)	b=20.00	count=100
[MQBENCH] INFO: Total loss:	44845.805 (rec:44845.805, round:0.000)	b=20.00	count=150
[MQBENCH] INFO: Total loss:	45799.879 (rec:45799.879, round:0.000)	b=20.00	count=200
[MQBENCH] INFO: Total loss:	50526.023 (rec:50526.023, round:0.000)	b=20.00	count=250
[MQBENCH] INFO: Total loss:	46262.328 (rec:46262.328, round:0.000)	b=20.00	count=300
[MQBENCH] INFO: Total loss:	46661.832 (rec:46661.832, round:0.000)	b=20.00	count=350
[MQBENCH] INFO: Total loss:	44792.629 (rec:44706.270, round:86.358)	b=20.00	count=400
[MQBENCH] INFO: Total loss:	50602.289 (rec:50518.762, round:83.526)	b=19.44	count=450
[MQBENCH] INFO: Total loss:	44917.375 (rec:44837.086, round:80.288)	b=18.88	count=500
[MQBENCH] INFO: Total loss:	41291.262 (rec:41214.719, round:76.543)	b=18.31	count=550
[MQBENCH] INFO: Total loss:	44213.484 (rec:44141.004, round:72.481)	b=17.75	count=600
[MQBENCH] INFO: Total loss:	46723.406 (rec:46654.953, round:68.454)	b=17.19	count=650
[MQBENCH] INFO: Total loss:	38436.500 (rec:38371.410, round:65.089)	b=16.62	count=700
[MQBENCH] INFO: Total loss:	39184.832 (rec:39122.195, round:62.638)	b=16.06	count=750
[MQBENCH] INFO: Total loss:	46708.672 (rec:46647.875, round:60.795)	b=15.50	count=800
[MQBENCH] INFO: Total loss:	44473.789 (rec:44414.590, round:59.198)	b=14.94	count=850
[MQBENCH] INFO: Total loss:	39177.609 (rec:39119.922, round:57.687)	b=14.38	count=900
[MQBENCH] INFO: Total loss:	39995.133 (rec:39938.891, round:56.242)	b=13.81	count=950
[MQBENCH] INFO: Total loss:	39993.488 (rec:39938.703, round:54.785)	b=13.25	count=1000
[MQBENCH] INFO: Total loss:	45231.879 (rec:45178.508, round:53.372)	b=12.69	count=1050
[MQBENCH] INFO: Total loss:	37747.441 (rec:37695.473, round:51.970)	b=12.12	count=1100
[MQBENCH] INFO: Total loss:	44648.543 (rec:44597.988, round:50.554)	b=11.56	count=1150
[MQBENCH] INFO: Total loss:	44176.258 (rec:44127.168, round:49.091)	b=11.00	count=1200
[MQBENCH] INFO: Total loss:	39979.871 (rec:39932.332, round:47.540)	b=10.44	count=1250
[MQBENCH] INFO: Total loss:	46283.957 (rec:46238.078, round:45.878)	b=9.88	count=1300
[MQBENCH] INFO: Total loss:	45815.332 (rec:45771.234, round:44.098)	b=9.31	count=1350
[MQBENCH] INFO: Total loss:	37730.723 (rec:37688.531, round:42.191)	b=8.75	count=1400
[MQBENCH] INFO: Total loss:	50532.953 (rec:50492.746, round:40.207)	b=8.19	count=1450
[MQBENCH] INFO: Total loss:	50530.844 (rec:50492.684, round:38.161)	b=7.62	count=1500
[MQBENCH] INFO: Total loss:	44718.082 (rec:44682.020, round:36.063)	b=7.06	count=1550
[MQBENCH] INFO: Total loss:	41224.871 (rec:41190.969, round:33.901)	b=6.50	count=1600
[MQBENCH] INFO: Total loss:	39137.375 (rec:39105.684, round:31.691)	b=5.94	count=1650
[MQBENCH] INFO: Total loss:	44428.445 (rec:44399.043, round:29.402)	b=5.38	count=1700
[MQBENCH] INFO: Total loss:	45190.570 (rec:45163.543, round:27.029)	b=4.81	count=1750
[MQBENCH] INFO: Total loss:	45187.898 (rec:45163.363, round:24.536)	b=4.25	count=1800
[MQBENCH] INFO: Total loss:	39127.074 (rec:39105.195, round:21.881)	b=3.69	count=1850
[MQBENCH] INFO: Total loss:	47384.465 (rec:47365.414, round:19.051)	b=3.12	count=1900
[MQBENCH] INFO: Total loss:	44413.383 (rec:44397.352, round:16.032)	b=2.56	count=1950
[MQBENCH] INFO: Total loss:	50498.168 (rec:50485.352, round:12.818)	b=2.00	count=2000
[MQBENCH] INFO: prepare layer reconstruction for model.3.cv3
[MQBENCH] INFO: the node list is below!
[MQBENCH] INFO: [model_3_cv3, cat_1, cat_1_post_act_fake_quantizer, model_3_bn, model_3_act, model_3_act_post_act_fake_quantizer]
[MQBENCH] INFO: GraphModule(
  (cat_1_post_act_fake_quantizer): FixedFakeQuantize(
    fake_quant_enabled=tensor([1], device='cuda:0', dtype=torch.uint8), observer_enabled=tensor([0], device='cuda:0', dtype=torch.uint8), quant_min=0, quant_max=255, dtype=torch.quint8, qscheme=torch.per_tensor_affine, ch_axis=-1, scale=tensor([0.10376], device='cuda:0'), zero_point=tensor([146], device='cuda:0', dtype=torch.int32)
    (activation_post_process): EMAMSEObserver(min_val=-15.115386009216309, max_val=11.34388256072998 ch_axis=-1 pot=False)
  )
  (model_3_act_post_act_fake_quantizer): FixedFakeQuantize(
    fake_quant_enabled=tensor([1], device='cuda:0', dtype=torch.uint8), observer_enabled=tensor([0], device='cuda:0', dtype=torch.uint8), quant_min=0, quant_max=255, dtype=torch.quint8, qscheme=torch.per_tensor_affine, ch_axis=-1, scale=tensor([0.05508], device='cuda:0'), zero_point=tensor([39], device='cuda:0', dtype=torch.int32)
    (activation_post_process): EMAMSEObserver(min_val=-2.1693410873413086, max_val=11.876273155212402 ch_axis=-1 pot=False)
  )
  (model): Module(
    (3): Module(
      (cv3): Conv2d(
        32, 32, kernel_size=(1, 1), stride=(1, 1), bias=False
        (weight_fake_quant): AdaRoundFakeQuantize(
          fake_quant_enabled=tensor([1], device='cuda:0', dtype=torch.uint8), observer_enabled=tensor([0], device='cuda:0', dtype=torch.uint8), quant_min=0, quant_max=255, dtype=torch.quint8, qscheme=torch.per_channel_affine, ch_axis=0, scale=List, zero_point=List
          (activation_post_process): MSEObserver(min_val=List, max_val=List ch_axis=0 pot=False)
        )
      )
      (bn): BatchNorm2d(64, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
      (act): LeakyReLU(negative_slope=0.1, inplace=True)
    )
  )
)
import torch
def forward(self, model_3_cv2, add_1_post_act_fake_quantizer):
    model_3_cv3 = getattr(self.model, "3").cv3(add_1_post_act_fake_quantizer);  add_1_post_act_fake_quantizer = None
    cat_1 = torch.cat((model_3_cv3, model_3_cv2), dim = 1);  model_3_cv3 = model_3_cv2 = None
    cat_1_post_act_fake_quantizer = self.cat_1_post_act_fake_quantizer(cat_1);  cat_1 = None
    model_3_bn = getattr(self.model, "3").bn(cat_1_post_act_fake_quantizer);  cat_1_post_act_fake_quantizer = None
    model_3_act = getattr(self.model, "3").act(model_3_bn);  model_3_bn = None
    model_3_act_post_act_fake_quantizer = self.model_3_act_post_act_fake_quantizer(model_3_act);  model_3_act = None
    return model_3_act_post_act_fake_quantizer
    
[MQBENCH] INFO: learn the scale for cat_1_post_act_fake_quantizer
[MQBENCH] INFO: learn the scale for model_3_act_post_act_fake_quantizer
Init alpha to be FP32
[MQBENCH] INFO: The world size is 1.
[MQBENCH] INFO: start tuning by adaround
Traceback (most recent call last):
  File "yolov5_quantization.py", line 232, in <module>
    yolov5_quantization(opt, device)
  File "yolov5_quantization.py", line 146, in yolov5_quantization
    model, cali_data, config.quantize.reconstruction)
  File "/root/Yolov5-Filter-Pruning/quantization/mqbench/advanced_ptq.py", line 459, in ptq_reconstruction
    subgraph_reconstruction(subgraph, cached_inps, cached_oups, config)
  File "/root/Yolov5-Filter-Pruning/quantization/mqbench/advanced_ptq.py", line 261, in subgraph_reconstruction
    out_quant = subgraph(*cur_inp)
  File "/opt/conda/lib/python3.7/site-packages/torch/fx/graph_module.py", line 308, in wrapped_call
    return cls_call(self, *args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
TypeError: forward() missing 1 required positional argument: 'add_1_post_act_fake_quantizer'

Question about EMAQuantileObserver

I am confused about the implementation in the forward method of EMAQuantileObserver:

cur_total = 0
clip_value = torch.max(-min_val_cur, max_val_cur)
for i, cnt in enumerate(hist):
    if cur_total + cnt >= self.threshold * x.numel():
        clip_value = (i + 0.5) * (max_val_cur / self.bins)
        break

What role does the variable cur_total play in this code? Shouldn't cur_total be incremented by cnt in each loop iteration?
Thank you!
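
For reference, the presumably intended accumulation, assuming cur_total is meant to track the cumulative histogram count below the clipping threshold (an assumption, not a confirmed patch):

cur_total = 0
clip_value = torch.max(-min_val_cur, max_val_cur)
for i, cnt in enumerate(hist):
    if cur_total + cnt >= self.threshold * x.numel():
        clip_value = (i + 0.5) * (max_val_cur / self.bins)
        break
    cur_total += cnt  # accumulate counts of the bins kept so far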

Errors when reproducing AdaRound results in example/PTQ/ptq.py

File "ptq.py", line 69, in
model = ptq_reconstruction(
File "/home/xxx/mq/MQBench/mqbench/advanced_ptq.py", line 418, in ptq_reconstruction
subgraph_reconstruction(subgraph, cached_inps, cached_oups, config)
File "/home/xxx/mq/MQBench/mqbench/advanced_ptq.py", line 198, in subgraph_reconstruction
assert USE_LINK or USE_DDP, 'either USE_LINK or USE_DDP should be True'
AssertionError: either USE_LINK or USE_DDP should be True
Thank you very much if you can answer this question.

Error when using advanced_ptq

Hi, when I use advanced_ptq of MQBench to quantize my model, I get a KeyError at this point.

The AdaRound optimization completes for the first few layers, and then this error appears at one layer, like:

However, the same pipeline with naive_ptq works well. Looking forward to your reply.

I think there is a bug in the imagenet_example folder.

There is an error at application/imagenet_example/main.py, line 22.

The function prepare_qat_fx_by_platform cannot be resolved; however, prepare_by_platform can.

Possibly, the same error exists at application/imagenet_example/main.py, line 153.
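
Presumably the fix is just the renamed entry point, matching the API used elsewhere in these issues (an assumption, not a confirmed patch):

# before (fails):
# model = prepare_qat_fx_by_platform(model, BackendType.Tensorrt)
# after:
model = prepare_by_platform(model, BackendType.Tensorrt)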

Activation FakeQuantizer Insertion after Concat

Hi,

It seems that in the current implementation of the quantizer, an activation FakeQuantizer is inserted after the concat op, like the following:

[screenshot of the traced graph after concat]

It is quite reasonable in most cases. However, if only one branch is required for inference (for example, the other branches might only be used as learning guidance), is it still the optimal choice? If it's worth a try, is there any public API to support customized insertion?

Any advice would be appreciated.


About mixed-precision quantization

Is it possible to quantize different layers with different quantization schemes, e.g., use w8a16 for the first layer and w8a8 for the other layers?

About QAT methods

Hello, I am very glad to see that you are building a quantization library. Recently I read a paper called 'Learnable Companding Quantization for Accurate Low-bit Neural Networks' (LCQ). The results in this paper are very good, but I have not been able to reproduce them myself, so I would like to ask you to reproduce it. If the results can be adopted, they would refresh the rankings. Thank you.

symbolic_trace fails

When using this code with mmdetection, symbolic_trace raises an error. Have you tried this?

Support standard ONNX quantized operators?

Hi, I've been looking for an end-to-end quantization deployment solution. So far I've tried the onnxruntime + TVM stack, but onnxruntime only supports naive PTQ methods. Your work looks really promising; however, I wonder whether MQBench can export the quantized model in the form of standard ONNX quantized ops like QLinearConv, QuantizeLinear, DequantizeLinear, etc.?

See: apache/tvm#8838

How to use with a model built by mmdet

When the model is built with mmdet, it looks like:

object {
    module list aaaa
    module list bbb
}

When using prepare_by_platform to trace it, I get an error like: TypeError: 'xxxobject' object does not support indexing

Training with PACT, but the clipping value for weights and activations, denoted as `alpha`, does not seem to change

The clipping value for weights and activations, denoted alpha, is initialized to 6.0. In my opinion, this value should be updated during training, but I found it is not. I am training with the imagenet_example, just adding the following config to enable PACT:

if args.quant:
        extra_params = {
            'extra_qconfig_dict': {
                'w_observer': "MinMaxObserver",
                'a_observer': "EMAMinMaxObserver",
                'w_fakequantize': "PACTFakeQuantize",
                'a_fakequantize': "PACTFakeQuantize",
                'a_fakeq_params': {},
                'w_qscheme': {
                    'bit': 8,
                    'symmetry': True,
                    'per_channel': False,
                    'pot_scale': False
                },
                'a_qscheme': {
                    'bit': 8,
                    'symmetry': True,
                    'per_channel': False,
                    'pot_scale': False
                }
            },
            'extra_quantizer_dict': {},
            'preserve_attr': {},
            'concrete_args': {},
            'extra_fuse_dict': {}
        }
        print("==> config with extra params", extra_params)
        model = prepare_by_platform(model, args.backend, extra_params)
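
One thing worth checking, an assumption based on how PACT is usually implemented rather than a confirmed diagnosis: PACT's alpha is a learnable parameter living on the fake-quantize modules that prepare_by_platform inserts, so the optimizer must be constructed from the prepared model for alpha to receive gradient updates (args.lr here stands in for the example's actual learning-rate argument):

model = prepare_by_platform(model, args.backend, extra_params)
# Build the optimizer AFTER preparation so the inserted fake-quantize
# parameters (including PACT's alpha) are actually optimized.
optimizer = torch.optim.SGD(model.parameters(), lr=args.lr, momentum=0.9)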

How to Register a new BackendType?

Nice work!
I tried to add a new backend type according to my quantization needs, and I got stuck with some trouble.
I just added my own backend name, say 'dummy_type', to the class BackendType in prepare_by_platform.py, like:

class BackendType(Enum):
    Academic = 'Academic'
    Tensorrt = 'Tensorrt'
    SNPE = 'SNPE'
    PPLW8A16 = 'PPLW8A16'
    NNIE = 'NNIE'
    dummy_type = 'dummy_type'

and modified the corresponding ParamsTable:

ParamsTable = {
    BackendType.dummy_type: dict(qtype='affine', ...),}

However, when I reinstalled my environment (updating the mqbench package), I got this error:

File "./miniconda3/envs/mqbench/lib/python3.7/site-packages/mqbench/prepare_by_platform.py", line 274, in prepare_qat_fx_by_platform
    quantizer = DEFAULT_MODEL_QUANTIZER[deploy_backend](extra_quantizer_dict, extra_fuse_dict)
KeyError: <BackendType.dummy_type: 'dummy_type'>

It looks like DEFAULT_MODEL_QUANTIZER can't recognize my own backend type, and I guess I might have missed some other settings. What should I do to fix this problem? And will you release some docs to help users set up their own backend types? Thanks for your help ;)
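
From the traceback, DEFAULT_MODEL_QUANTIZER is a registry keyed by backend type, so a new backend also needs a model quantizer registered under it. A sketch, assuming the register_model_quantizer decorator in mqbench.utils.registry works the same way as for the built-in backends (the class name here is hypothetical):

from mqbench.prepare_by_platform import BackendType
from mqbench.utils.registry import register_model_quantizer
from mqbench.custom_quantizer import ModelQuantizer

@register_model_quantizer(BackendType.dummy_type)
class DummyTypeQuantizer(ModelQuantizer):
    # Reuse the generic quantizer behaviour; override hooks as needed.
    pass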

Some problems about using MQBench tool in YOLOv5

First, thanks for your work!
When I use the MQBench tool with YOLOv5, I made some changes to the YOLOv5 code according to the instructions in the documentation, but during training I found that training was very slow.
So I would like to ask whether this is normal or whether I need to optimize something further. Looking forward to your reply! Thanks!
[training-speed screenshots]

Default quant_min is mismatched with the hardware

Hi,

When the bit width is 8 and the backend is TensorRT, the default quant_min and quant_max of the fake quantizer are -128 and 127 respectively. But in the documentation for TensorRT, you mention that

For weights, [lb, ub] = [-127, 127]. For activations, [lb, ub] = [-128, 127].

Is there a conflict here with regard to the weights?

reassign[name] = swap_module(mod, mapping, {})

Hi,

I was trying to do PTQ on my own PyTorch model when the following error occurred:
TypeError: cannot assign 'torch.FloatTensor' as parameter 'weight' (torch.nn.Parameter or None expected)

The full error is:

Traceback (most recent call last):
  File "ptq_test.py", line 162, in <module>
    model = prepare_by_platform(net, BackendType.Tensorrt, extra_config)
  File "/home/wenqianzhao/tmp2022/Simple-SR/mqbench/prepare_by_platform.py", line 336, in prepare_by_platform
    prepared = quantizer.prepare(graph_module, qconfig)
  File "/home/wenqianzhao/tmp2022/Simple-SR/mqbench/custom_quantizer.py", line 70, in prepare
    model = self._weight_quant(model, qconfig)
  File "/home/wenqianzhao/tmp2022/Simple-SR/mqbench/custom_quantizer.py", line 131, in _weight_quant
    self._qat_swap_modules(model, self.additional_qat_module_mapping)
  File "/home/wenqianzhao/tmp2022/Simple-SR/mqbench/custom_quantizer.py", line 251, in _qat_swap_modules
    root = self._convert(root, all_mappings, inplace=True)
  File "/home/wenqianzhao/tmp2022/Simple-SR/mqbench/custom_quantizer.py", line 278, in _convert
    self._convert(mod, mapping, True, new_scope)
  File "/home/wenqianzhao/tmp2022/Simple-SR/mqbench/custom_quantizer.py", line 279, in _convert
    reassign[name] = swap_module(mod, mapping, {})
  File "/home/wenqianzhao/anaconda3/envs/mqbench/lib/python3.8/site-packages/torch/quantization/quantize.py", line 534, in swap_module
    new_mod = mapping[type(mod)].from_float(mod)
  File "/home/wenqianzhao/anaconda3/envs/mqbench/lib/python3.8/site-packages/torch/nn/intrinsic/qat/modules/conv_fused.py", line 423, in from_float
    return super(ConvReLU2d, cls).from_float(mod)
  File "/home/wenqianzhao/anaconda3/envs/mqbench/lib/python3.8/site-packages/torch/nn/qat/modules/conv.py", line 52, in from_float
    qat_conv.weight = mod.weight
  File "/home/wenqianzhao/anaconda3/envs/mqbench/lib/python3.8/site-packages/torch/nn/modules/module.py", line 968, in __setattr__
    raise TypeError("cannot assign '{}' as parameter '{}' "
TypeError: cannot assign 'torch.FloatTensor' as parameter 'weight' (torch.nn.Parameter or None expected)


Attached is code to reproduce the error:

import math
import torch
import torch.nn as nn
from torch.nn.parameter import Parameter

from mqbench.prepare_by_platform import prepare_by_platform   # add quant nodes for specific Backend
from mqbench.prepare_by_platform import BackendType           # contain various Backend, contains Academic.
from mqbench.utils.state import enable_calibration            # turn on calibration algorithm, determine scale, zero_point, etc.
from mqbench.utils.state import enable_quantization            # turn on fake quantization after calibration (FP32 -> INT8 simulation)

from mqbench.fake_quantize import FixedFakeQuantize
from mqbench.observer import MSEObserver


extra_config = {
    'w_observer': MSEObserver,                                # custom weight observer
    'a_observer': MSEObserver,                                # custom activation observer
    'w_fakequantize': FixedFakeQuantize,                      # custom weight fake quantize function
    'a_fakequantize': FixedFakeQuantize,                      # custom activation fake quantize function
    'w_qscheme': {
        'bit': 8,                                             # custom bitwidth for weight,
        'symmetry': False,                                    # custom whether quant is symmetric for weight,
        'per_channel': True,                                  # custom whether quant is per-channel or per-tensor for weight,
        'pot_scale': False,                                   # custom whether scale is power of two for weight.
    },
    'a_qscheme': {
        'bit': 8,                                             # custom bitwidth for activation,
        'symmetry': False,                                    # custom whether quant is symmetric for activation,
        'per_channel': True,                                  # custom whether quant is per-channel or per-tensor for activation,
        'pot_scale': False,                                   # custom whether scale is power of two for activation.
    }
}




class Scale(nn.Module):
    def __init__(self, init_value=1e-3):
        super(Scale, self).__init__()
        self.scale = Parameter(torch.FloatTensor([init_value]))

    def forward(self, x):
        return x * self.scale


class AWRU(nn.Module):
    def __init__(self, nf, kernel_size, wn, act=nn.ReLU(True)):
        super(AWRU, self).__init__()
        self.res_scale = Scale(1)
        self.x_scale = Scale(1)

        self.body = nn.Sequential(
            wn(nn.Conv2d(nf, nf, kernel_size, padding=kernel_size//2)),
            act,
            wn(nn.Conv2d(nf, nf, kernel_size, padding=kernel_size//2)),
        )

    def forward(self, x):
        res = self.res_scale(self.body(x)) + self.x_scale(x)
        return res


class AWMS(nn.Module):
    def __init__(self, nf, out_chl, wn, act=nn.ReLU(True)):
        super(AWMS, self).__init__()
        self.tail_k3 = wn(nn.Conv2d(nf, nf, 3, padding=3//2, dilation=1))
        self.tail_k5 = wn(nn.Conv2d(nf, nf, 5, padding=5//2, dilation=1))
        self.scale_k3 = Scale(0.5)
        self.scale_k5 = Scale(0.5)
        self.fuse = wn(nn.Conv2d(nf, nf, 3, padding=3 // 2))
        self.act = act
        self.w_conv = wn(nn.Conv2d(nf, out_chl, 3, padding=3//2))

    def forward(self, x):
        x0 = self.scale_k3(self.tail_k3(x))
        x1 = self.scale_k5(self.tail_k5(x))
        cur_x = x0 + x1

        fuse_x = self.act(self.fuse(cur_x))
        out = self.w_conv(fuse_x)

        return out


class LFB(nn.Module):
    def __init__(self, nf, wn, act=nn.ReLU(inplace=True)):
        super(LFB, self).__init__()
        self.b0 = AWRU(nf, 3, wn=wn, act=act)
        self.b1 = AWRU(nf, 3, wn=wn, act=act)
        self.b2 = AWRU(nf, 3, wn=wn, act=act)
        self.b3 = AWRU(nf, 3, wn=wn, act=act)
        self.reduction = wn(nn.Conv2d(nf * 4, nf, 3, padding=3//2))
        self.res_scale = Scale(1)
        self.x_scale = Scale(1)

    def forward(self, x):
        x0 = self.b0(x)
        x1 = self.b1(x0)
        x2 = self.b2(x1)
        x3 = self.b3(x2)
        res = self.reduction(torch.cat([x0, x1, x2, x3], dim=1))

        return self.res_scale(res) + self.x_scale(x)


class WeightNet(nn.Module):
    def __init__(self, config):
        super(WeightNet, self).__init__()

        in_chl = config.IN_CHANNEL
        nf = config.N_CHANNEL
        n_block = config.RES_BLOCK
        out_chl = config.N_WEIGHT
        scale = config.SCALE

        act = nn.ReLU(inplace=True)
        wn = lambda x: nn.utils.weight_norm(x)

        rgb_mean = torch.FloatTensor([0.4488, 0.4371, 0.4040]).view([1, 3, 1, 1]) 
        self.register_buffer('rgb_mean', rgb_mean)

        self.head = nn.Sequential(
            wn(nn.Conv2d(in_chl, nf, 3, padding=3//2)),
            act,
        )

        body = []
        for i in range(n_block):
            body.append(LFB(nf, wn=wn, act=act))
        self.body = nn.Sequential(*body)

        self.up = nn.Sequential(
            wn(nn.Conv2d(nf, nf * scale ** 2, 3, padding=3//2)),
            act,
            nn.PixelShuffle(upscale_factor=scale)
        )

        self.tail = AWMS(nf, out_chl, wn, act=act)

    def forward(self, x):
        x = x - self.rgb_mean
        x = self.head(x)
        x = self.body(x)
        x = self.up(x)
        out = self.tail(x)

        return out


if __name__ == '__main__':
    from easydict import EasyDict as edict

    config = edict()
    config.IN_CHANNEL = 3
    config.N_CHANNEL = 32
    config.RES_BLOCK = 4
    config.N_WEIGHT = 72
    config.SCALE = 2

    net = WeightNet(config).cuda()
    net.eval()
    model = prepare_by_platform(net, BackendType.Tensorrt, extra_config)

    cnt = 0
    for p in net.parameters():
        cnt += p.numel()
    print(cnt)

    x = torch.randn(1, 3, 32, 32).cuda()
    out = net(x)
    print(out.size())
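A likely cause, for what it's worth: nn.utils.weight_norm removes .weight as an nn.Parameter and recomputes it from weight_g/weight_v on every forward, so when the QAT swap runs qat_conv.weight = mod.weight it receives a plain tensor, which is exactly what the TypeError reports. A sketch of a possible workaround (not an official MQBench recipe): fold the norm back into a plain parameter before preparing.

```python
import torch.nn as nn

def strip_weight_norm(model: nn.Module) -> nn.Module:
    """Fold weight_g/weight_v back into a plain .weight Parameter so the
    QAT module swap in prepare_by_platform can reassign it."""
    for m in model.modules():
        try:
            nn.utils.remove_weight_norm(m)
        except ValueError:
            pass  # module had no weight_norm hook attached
    return model

net = strip_weight_norm(net)
model = prepare_by_platform(net, BackendType.Tensorrt, extra_config)
```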

not compatible with PyTorch 1.9+

FX was quite experimental in torch 1.8 and has been significantly upgraded since then.
Are there any plans to update MQBench to be compatible with PyTorch 1.9 and up?

error in ptq: RuntimeError: Zero-point must be Int32, Float or Half, found Long

When I run ptq.py, I hit the following error:
RuntimeError: Zero-point must be Int32, Float or Half, found Long
Going back to the source code, I found the issue in /home/jwliu/MQBench-main/mqbench/fake_quantize/fixed.py, in forward():
X = torch.fake_quantize_per_channel_affine(X, self.scale.data, self.zero_point.data.long(), self.ch_axis, self.quant_min, self.quant_max)
Why is the zero-point type deliberately set to long?
How can I solve this problem?
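For what it's worth, newer PyTorch releases require per-channel zero-points to be int32 (or float/half), while .long() produces int64, which matches the error message. A minimal local patch to try in fixed.py, assuming your PyTorch build accepts int32 here:

```python
# mqbench/fake_quantize/fixed.py, in forward(): cast to int32 instead of int64
X = torch.fake_quantize_per_channel_affine(
    X, self.scale.data, self.zero_point.data.int(),
    self.ch_axis, self.quant_min, self.quant_max)
```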

About MQBench future

Thank you for opening up this great project.

  1. When will the next version be released, and what will it include?
  2. Please consider writing a more detailed README.
  3. Where are the examples? How can the toolkit best be used?

How to customize which modules are quantized

I was doing PTQ using MQBench. After prepare_by_platform, I found that some modules are not supported for quantization. How can I select which parts of the model to quantize? For example, what should I do to select only the convs and exclude others such as getitem?
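One mechanism worth trying (the key names below are recalled from the quantizer's attributes and should be checked against your installed version): prepare_by_platform accepts a prepare_custom_config_dict whose extra_quantizer_dict can exclude nodes from quantization. A sketch, with the module names as placeholders:

```python
import operator

from mqbench.prepare_by_platform import prepare_by_platform, BackendType

prepare_custom_config_dict = {
    'extra_quantizer_dict': {
        # skip these named submodules entirely (names are placeholders)
        'exclude_module_name': ['body.0', 'tail'],
        # skip function-type nodes such as getitem
        'exclude_function_type': [operator.getitem],
    }
}
model = prepare_by_platform(model, BackendType.Tensorrt, prepare_custom_config_dict)
```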

onnx_qnn convert_deploy problems

[screenshot of the network structure]

I'm having a conversion problem using BackendType.ONNX_QNN. The network looks like the one in the screenshot above, and the error is "op_type=Conv not in FAKE_QUANTIZE_OP" (raised in the fuse-ReLU pass). Looking forward to a reply.

fbgemm backend not found

Hi,
I want to set the backend to fbgemm, but I cannot find that option in the code. How can I solve this? Thanks.
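The available targets are exactly the members of the BackendType enum in your installed version, so a quick check is to enumerate them (fbgemm may simply not be among them yet):

```python
from mqbench.prepare_by_platform import BackendType

print([b.name for b in BackendType])  # backends this MQBench build supports
```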

input quantize

model = self._insert_fake_quantize_for_act_quant(model, qconfig)

This inserts activation fake-quantize nodes but excludes the input data. Will accuracy be affected if the input data is not fake-quantized?
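One way to measure the impact empirically, using plain PyTorch rather than an MQBench API (here x stands for your input batch and prepared_model for the prepared network): fake-quantize the input yourself and compare accuracy with and without it.

```python
import torch
from torch.quantization import FakeQuantize, MovingAverageMinMaxObserver

# Symmetric signed 8-bit fake-quantizer for the network input.
input_fq = FakeQuantize(observer=MovingAverageMinMaxObserver,
                        quant_min=-128, quant_max=127,
                        dtype=torch.qint8, qscheme=torch.per_tensor_symmetric)

x_q = input_fq(x)            # observer updates its range, then fake-quantizes x
out = prepared_model(x_q)    # compare against prepared_model(x)
```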

QNN backend not working

import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 8, 3, 1)
        self.bn1 = nn.BatchNorm2d(8)
        self.lk1 = nn.LeakyReLU()

        self.conv2 = nn.Conv2d(8, 16, 3, 1)
        self.bn2 = nn.BatchNorm2d(16)
        self.lk2 = nn.LeakyReLU()

        self.conv3 = nn.Conv2d(16, 32, 3, 1)

    def forward(self, x):
        x1 = self.lk1(self.bn1(self.conv1(x)))
        x2 = self.lk2(self.bn2(self.conv2(x1)))
        x3 = self.conv3(x2)

        return x3

no function of prepare_qat_fx_by_platform

Hi,

I can't find the definition of prepare_qat_fx_by_platform, which should be in the mqbench.prepare_by_platform module.

Did you perhaps forget to upload the function?
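For what it's worth, the tracebacks elsewhere in these issues show the entry point living in mqbench/prepare_by_platform.py, and the more recent code snippets in this thread import it under the shorter name, so recent versions are presumably used as:

```python
from mqbench.prepare_by_platform import prepare_by_platform, BackendType

model = prepare_by_platform(model, BackendType.Tensorrt)
```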

onnx inference

Hello.
I finished converting the model to quantized ONNX, but I can't run inference with ONNX Runtime. Error log: No Op registered for LearnablePerTensorAffine with domain_version of 11.
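LearnablePerTensorAffine is MQBench's fake-quantize custom op, which ONNX Runtime does not know about. The intended path is to run convert_deploy for a concrete backend, which replaces the fake-quantize nodes with a deployable representation; a sketch, assuming the convert_deploy entry point in mqbench.convert_deploy, with the input name and shape as placeholders:

```python
from mqbench.convert_deploy import convert_deploy
from mqbench.prepare_by_platform import BackendType

input_shape = {'data': [1, 3, 224, 224]}   # placeholder input name/shape
convert_deploy(model, BackendType.Tensorrt, input_shape)
```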

Will OpenVINO support be added?

As the title says: OpenVINO already supports quantization and has open-sourced a model compression toolkit, NNCF, which can import quantized ONNX operators.
Could you consider adding support for OpenVINO INT8 quantization?
