CVPR2022 Papers (Papers/Codes/Demos)

Focal and Global Knowledge Distillation for Detectors(探测器的焦点和全局知识蒸馏)
keywords: Object Detection, Knowledge Distillation
paper | code

Unknown-Aware Object Detection: Learning What You Don't Know from Videos in the Wild(未知感知对象检测：从野外视频中学习你不知道的东西)
paper | code

Localization Distillation for Dense Object Detection(密集对象检测的定位蒸馏)
keywords: Bounding Box Regression, Localization Quality Estimation, Knowledge Distillation
paper | code

视频目标检测(Video Object Detection)

Unsupervised Activity Segmentation by Joint Representation Learning and Online Clustering(通过联合表示学习和在线聚类进行无监督活动分割)
paper

3D目标检测(3D object detection)

Back to Reality: Weakly-supervised 3D Object Detection with Shape-guided Label Enhancement(带有形状引导标签增强的弱监督 3D 对象检测)
paper | code

Canonical Voting: Towards Robust Oriented Bounding Box Detection in 3D Scenes(在 3D 场景中实现稳健的定向边界框检测)
paper | code

A Versatile Multi-View Framework for LiDAR-based 3D Object Detection with Guidance from Panoptic Segmentation(在全景分割的指导下，用于基于 LiDAR 的 3D 对象检测的多功能多视图框架)
keywords: 3D Object Detection with Point-based Methods, 3D Object Detection with Grid-based Methods, Cluster-free 3D Panoptic Segmentation, CenterPoint 3D Object Detection
paper

Pseudo-Stereo for Monocular 3D Object Detection in Autonomous Driving(自动驾驶中用于单目 3D 目标检测的伪立体)
keywords: Autonomous Driving, Monocular 3D Object Detection
paper | code

伪装目标检测(Camouflaged Object Detection)

Zoom In and Out: A Mixed-scale Triplet Network for Camouflaged Object Detection(放大和缩小：用于伪装目标检测的混合尺度三元组网络)
paper | code

关键点检测(Keypoint Detection)

UKPGAN: A General Self-Supervised Keypoint Detector(一个通用的自监督关键点检测器)
paper | code

车道线检测(Lane Detection)

Rethinking Efficient Lane Detection via Curve Modeling(通过曲线建模重新思考高效车道检测)
keywords: Segmentation-based Lane Detection, Point Detection-based Lane Detection, Curve-based Lane Detection, autonomous driving
paper | code

分割(Segmentation)

全景分割(Panoptic Segmentation)

Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation(弯曲现实：适应全景语义分割的失真感知Transformer)
keywords: Semantic and panoramic segmentation, Unsupervised domain adaptation, Transformer
paper | code

语义分割(Semantic Segmentation)

Representation Compensation Networks for Continual Semantic Segmentation(连续语义分割的表示补偿网络)
paper | code

Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels(使用不可靠伪标签的半监督语义分割)
paper | code

Weakly Supervised Semantic Segmentation using Out-of-Distribution Data(使用分布外数据的弱监督语义分割)
paper | code

Self-supervised Image-specific Prototype Exploration for Weakly Supervised Semantic Segmentation(弱监督语义分割的自监督图像特定原型探索)
paper | code

Multi-class Token Transformer for Weakly Supervised Semantic Segmentation(用于弱监督语义分割的多类token Transformer)
paper | code

Cross Language Image Matching for Weakly Supervised Semantic Segmentation(用于弱监督语义分割的跨语言图像匹配)
paper

Learning Affinity from Attention: End-to-End Weakly-Supervised Semantic Segmentation with Transformers(从注意力中学习亲和力：使用 Transformers 的端到端弱监督语义分割)
paper | code

ST++: Make Self-training Work Better for Semi-supervised Semantic Segmentation(让自我训练更好地用于半监督语义分割)
keywords: Semi-supervised learning, Semantic segmentation, Uncertainty estimation
paper | code

Class Re-Activation Maps for Weakly-Supervised Semantic Segmentation(弱监督语义分割的类重新激活图)
paper | code

实例分割(Instance Segmentation)

E2EC: An End-to-End Contour-based Method for High-Quality High-Speed Instance Segmentation(一种基于端到端轮廓的高质量高速实例分割方法)
paper | code

Efficient Video Instance Segmentation via Tracklet Query and Proposal(通过 Tracklet Query 和 Proposal 进行高效的视频实例分割)
paper

SoftGroup for 3D Instance Segmentation on Point Clouds(用于点云上的 3D 实例分割)
keywords: 3D Vision, Point Clouds, Instance Segmentation
paper | code

估计(Estimation)

姿态估计(Human Pose Estimation)

Forecasting Characteristic 3D Poses of Human Actions
paper

Learning Local-Global Contextual Adaptation for Multi-Person Pose Estimation(学习用于多人姿势估计的局部-全局上下文适应)
keywords: Top-Down Pose Estimation(从上至下姿态估计), Limb-based Grouping, Direct Regression
paper

MixSTE: Seq2seq Mixed Spatio-Temporal Encoder for 3D Human Pose Estimation in Video(用于视频中 3D 人体姿势估计的 Seq2seq 混合时空编码器)
paper

光流/位姿/运动估计(Optical Flow/Pose/Motion Estimation)

CPPF: Towards Robust Category-Level 9D Pose Estimation in the Wild(CPPF：在野外实现稳健的类别级 9D 位姿估计)
paper | code

OVE6D: Object Viewpoint Encoding for Depth-based 6D Object Pose Estimation(用于基于深度的 6D 对象姿态估计的对象视点编码)
paper | code

CamLiFlow: Bidirectional Camera-LiDAR Fusion for Joint Optical Flow and Scene Flow Estimation(用于联合光流和场景流估计的双向相机-LiDAR 融合)
paper

深度估计(Depth Estimation)

ChiTransformer: Towards Reliable Stereo from Cues(从线索走向可靠的立体视觉)
paper

Rethinking Depth Estimation for Multi-View Stereo: A Unified Representation and Focal Loss(重新思考多视图立体的深度估计：统一表示和焦点损失)
paper | code

ITSA: An Information-Theoretic Approach to Automatic Shortcut Avoidance and Domain Generalization in Stereo Matching Networks(立体匹配网络中自动避免捷径和域泛化的信息论方法)
keywords: Learning-based Stereo Matching Networks, Single Domain Generalization, Shortcut Learning
paper

Attention Concatenation Volume for Accurate and Efficient Stereo Matching(用于精确和高效立体匹配的注意力连接体积)
keywords: Stereo Matching, cost volume construction, cost aggregation
paper | code

Occlusion-Aware Cost Constructor for Light Field Depth Estimation(光场深度估计的遮挡感知成本构造函数)
paper | [code](https://github.com/YingqianWang/OACC-Net)

NeW CRFs: Neural Window Fully-connected CRFs for Monocular Depth Estimation(用于单目深度估计的神经窗口全连接 CRF)
keywords: Neural CRFs for Monocular Depth
paper

OmniFusion: 360 Monocular Depth Estimation via Geometry-Aware Fusion(通过几何感知融合进行 360 度单目深度估计)
keywords: monocular depth estimation(单目深度估计), transformer
paper

图像处理(Image Processing)

超分辨率(Super Resolution)

Reflash Dropout in Image Super-Resolution(图像超分辨率中的闪退dropout)
paper

Towards Bidirectional Arbitrary Image Rescaling: Joint Optimization and Cycle Idempotence(迈向双向任意图像缩放：联合优化和循环幂等)
paper

HyperTransformer: A Textural and Spectral Feature Fusion Transformer for Pansharpening(用于全色锐化的纹理和光谱特征融合Transformer)
paper | code

HDNet: High-resolution Dual-domain Learning for Spectral Compressive Imaging(光谱压缩成像的高分辨率双域学习)
keywords: HSI Reconstruction, Self-Attention Mechanism, Image Frequency Spectrum Analysis
paper

图像复原/图像增强/图像重建(Image Restoration/Image Reconstruction)

Event-based Video Reconstruction via Potential-assisted Spiking Neural Network(通过电位辅助尖峰神经网络进行基于事件的视频重建)
paper

图像去噪/去模糊/去雨去雾(Image Denoising)

E-CIR: Event-Enhanced Continuous Intensity Recovery(事件增强的连续强度恢复)
keywords: Event-Enhanced Deblurring, Video Representation
paper | code

图像编辑/图像修复(Image Edit/Inpainting)

HairCLIP: Design Your Hair by Text and Reference Image(通过文本和参考图像设计你的头发)
keywords: Language-Image Pre-Training (CLIP), Generative Adversarial Networks
paper

Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding(增量transformer结构增强图像修复与掩蔽位置编码)
keywords: Image Inpainting, Transformer, Image Generation
paper | code

图像翻译(Image Translation)

FlexIT: Towards Flexible Semantic Image Translation(迈向灵活的语义图像翻译)
paper

Exploring Patch-wise Semantic Relation for Contrastive Learning in Image-to-Image Translation Tasks(探索图像到图像翻译任务中对比学习的补丁语义关系)
keywords: image translation, knowledge transfer, Contrastive learning
paper

风格迁移(Style Transfer)

Style-ERD: Responsive and Coherent Online Motion Style Transfer(响应式和连贯的在线运动风格迁移)
paper

CLIPstyler: Image Style Transfer with a Single Text Condition(具有单一文本条件的图像风格转移)
keywords: Style Transfer, Text-guided synthesis, Language-Image Pre-Training (CLIP)
paper

人脸(Face)

人脸识别/检测(Facial Recognition/Detection)

An Efficient Training Approach for Very Large Scale Face Recognition(一种有效的超大规模人脸识别训练方法)
paper | code

人脸生成/合成/重建/编辑(Face Generation/Face Synthesis/Face Reconstruction/Face Editing)

Sparse to Dense Dynamic 3D Facial Expression Generation(稀疏到密集的动态 3D 面部表情生成)
keywords: Facial expression generation, 4D face generation, 3D face modeling
paper

人脸伪造/反欺骗(Face Forgery/Face Anti-Spoofing)

Domain Generalization via Shuffled Style Assembly for Face Anti-Spoofing(通过 Shuffled Style Assembly 进行域泛化以进行人脸反欺骗)
paper | code

Voice-Face Homogeneity Tells Deepfake
paper | code

Protecting Celebrities with Identity Consistency Transformer(使用身份一致性transformer保护名人)
paper

目标跟踪(Object Tracking)

Iterative Corresponding Geometry: Fusing Region and Depth for Highly Efficient 3D Tracking of Textureless Objects(迭代对应几何：融合区域和深度以实现无纹理对象的高效 3D 跟踪)
paper | [code](https://github.com/DLR-RM/3DObjectTracking)

TCTrack: Temporal Contexts for Aerial Tracking(空中跟踪的时间上下文)
paper | code

Beyond 3D Siamese Tracking: A Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds(超越 3D 连体跟踪：点云中 3D 单对象跟踪的以运动为中心的范式)
keywords: Single Object Tracking, 3D Multi-object Tracking / Detection, Spatial-temporal Learning on Point Clouds
paper

Correlation-Aware Deep Tracking(相关感知深度跟踪)
paper

图像&视频检索/视频理解(Image&Video Retrieval/Video Understanding)

BEVT: BERT Pretraining of Video Transformers(视频Transformer的 BERT 预训练)
keywords: Video understanding, Vision transformers, Self-supervised representation learning, BERT pretraining
paper | code

行为识别/动作识别/检测/分割/定位(Action/Activity Recognition)

OpenTAL: Towards Open Set Temporal Action Localization(走向开放集时间动作定位)
paper | code

End-to-End Semi-Supervised Learning for Video Action Detection(视频动作检测的端到端半监督学习)
paper

Learnable Irrelevant Modality Dropout for Multimodal Action Recognition on Modality-Specific Annotated Videos(模态特定注释视频上多模态动作识别的可学习不相关模态丢失)
paper

Weakly Supervised Temporal Action Localization via Representative Snippet Knowledge Propagation(通过代表性片段知识传播的弱监督时间动作定位)
paper | code

Colar: Effective and Efficient Online Action Detection by Consulting Exemplars(通过咨询示例进行有效且高效的在线动作检测)
keywords: Online action detection(在线动作检测)
paper

图像/视频字幕(Image/Video Caption)

Hierarchical Modular Network for Video Captioning(用于视频字幕的分层模块化网络)
paper | code

X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning(使用 Transformer 进行 3D 密集字幕的跨模式知识迁移)
paper

医学影像(Medical Imaging)

Adaptive Early-Learning Correction for Segmentation from Noisy Annotations(从噪声标签中分割的自适应早期学习校正)
keywords: medical-imaging segmentation, Noisy Annotations
paper | code

Temporal Context Matters: Enhancing Single Image Prediction with Disease Progression Representations(时间上下文很重要：使用疾病进展表示增强单图像预测)
keywords: Self-supervised Transformer, Temporal modeling of disease progression
paper

GAN/生成式/对抗式(GAN/Generative/Adversarial)

Practical Evaluation of Adversarial Robustness via Adaptive Auto Attack(通过自适应自动攻击对对抗鲁棒性的实际评估)
paper

Frequency-driven Imperceptible Adversarial Attack on Semantic Similarity(对语义相似性的频率驱动的不可察觉的对抗性攻击)
paper

Shadows can be Dangerous: Stealthy and Effective Physical-world Adversarial Attack by Natural Phenomenon(阴影可能很危险：自然现象的隐秘而有效的物理世界对抗性攻击)
paper

Protecting Facial Privacy: Generating Adversarial Identity Masks via Style-robust Makeup Transfer(保护面部隐私：通过风格稳健的化妆转移生成对抗性身份面具)
paper

Adversarial Texture for Fooling Person Detectors in the Physical World(物理世界中愚弄人探测器的对抗性纹理)
paper

Label-Only Model Inversion Attacks via Boundary Repulsion(通过边界排斥的仅标签模型反转攻击)
paper

图像生成/图像合成/视频合成(Image Generation/Image Synthesis/Video Generation)

Dynamic Dual-Output Diffusion Models(动态双输出扩散模型)
paper

Exploring Dual-task Correlation for Pose Guided Person Image Generation(探索姿势引导人物图像生成的双任务相关性)
paper | code

Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning(告诉我什么并告诉我如何：通过多模式调节进行视频合成)
paper | code

3D Shape Variational Autoencoder Latent Disentanglement via Mini-Batch Feature Swapping for Bodies and Faces(基于小批量特征交换的三维形状变化自动编码器潜在解纠缠)
paper | code

Interactive Image Synthesis with Panoptic Layout Generation(具有全景布局生成的交互式图像合成)
paper

Polarity Sampling: Quality and Diversity Control of Pre-Trained Generative Networks via Singular Values(极性采样：通过奇异值对预训练生成网络的质量和多样性控制)
paper

Autoregressive Image Generation using Residual Quantization(使用残差量化的自回归图像生成)
paper | code

三维视觉(3D Vision)

X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning(使用 Transformer 进行 3D 密集字幕的跨模式知识迁移)
paper

点云(Point Cloud)

Contrastive Boundary Learning for Point Cloud Segmentation(点云分割的对比边界学习)
paper | code

Shape-invariant 3D Adversarial Point Clouds(形状不变的 3D 对抗点云)
paper | code

ART-Point: Improving Rotation Robustness of Point Cloud Classifiers via Adversarial Rotation(通过对抗旋转提高点云分类器的旋转鲁棒性)
paper

Lepard: Learning partial point cloud matching in rigid and deformable scenes(Lepard：在刚性和可变形场景中学习部分点云匹配)
paper | code

A Unified Query-based Paradigm for Point Cloud Understanding(一种基于统一查询的点云理解范式)
paper

CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding(用于 3D 点云理解的自监督跨模态对比学习)
keywords: Self-Supervised Learning, Contrastive Learning, 3D Point Cloud, Representation Learning, Cross-Modal Learning
paper | code

三维重建(3D Reconstruction)

Neural Face Identification in a 2D Wireframe Projection of a Manifold Object(流形对象的二维线框投影中的神经人脸识别)
paper | [code](https://manycore-research.github.io/faceformer)

Generating 3D Bio-Printable Patches Using Wound Segmentation and Reconstruction to Treat Diabetic Foot Ulcers
keywords: semantic segmentation, 3D reconstruction, 3D bio-printers
paper

H4D: Human 4D Modeling by Learning Neural Compositional Representation(通过学习神经组合表示进行人体 4D 建模)
keywords: 4D Representation(4D 表征), Human Body Estimation(人体姿态估计), Fine-grained Human Reconstruction(细粒度人体重建)
paper

场景重建/新视角合成(Novel View Synthesis)

Point-NeRF: Point-based Neural Radiance Fields(基于点的神经辐射场)
paper | code

CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields(文本和图像驱动的神经辐射场操作)
keywords: NeRF, Image Generation and Manipulation, Language-Image Pre-Training (CLIP)
paper | code

模型压缩(Model Compression)

知识蒸馏(Knowledge Distillation)

Knowledge Distillation as Efficient Pre-training: Faster Convergence, Higher Data-efficiency, and Better Transferability(知识蒸馏作为高效的预训练：更快的收敛、更高的数据效率和更好的可迁移性)
paper | code

Focal and Global Knowledge Distillation for Detectors(探测器的焦点和全局知识蒸馏)
keywords: Object Detection, Knowledge Distillation
paper | code

量化(Quantization)

IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization(学习具有类内异质性的合成图像以进行零样本网络量化)
paper | code

神经网络结构设计(Neural Network Structure Design)

BatchFormer: Learning to Explore Sample Relationships for Robust Representation Learning(学习探索样本关系以进行鲁棒表征学习)
keywords: sample relationship, data scarcity learning, Contrastive Self-Supervised Learning, long-tailed recognition, zero-shot learning, domain generalization, self-supervised learning
paper | code

CNN

DeltaCNN: End-to-End CNN Inference of Sparse Frame Differences in Videos(视频中稀疏帧差异的端到端 CNN 推断)
keywords: sparse convolutional neural network, video inference accelerating
paper

A ConvNet for the 2020s
paper | code

Transformer

Delving Deep into the Generalization of Vision Transformers under Distribution Shifts(深入研究分布变化下的视觉Transformer的泛化)
keywords: out-of-distribution (OOD) generalization, Vision Transformers
paper | code

Mobile-Former: Bridging MobileNet and Transformer(连接 MobileNet 和 Transformer)
keywords: Light-weight convolutional neural networks(轻量卷积神经网络), Combination of CNN and ViT
paper

神经网络架构搜索(NAS)

β-DARTS: Beta-Decay Regularization for Differentiable Architecture Search(可微架构搜索的 Beta-Decay 正则化)
paper

MLP

An Image Patch is a Wave: Quantum Inspired Vision MLP(图像补丁是波浪：量子启发的视觉 MLP)
paper | code

数据处理(Data Processing)

数据增广(Data Augmentation)

TeachAugment: Data Augmentation Optimization Using Teacher Knowledge(使用教师知识进行数据增强优化)
paper | code

3D Common Corruptions and Data Augmentation(3D 常见损坏和数据增强)
keywords: Data Augmentation, Image restoration, Photorealistic image synthesis
paper

图像压缩(Image Compression)

Neural Data-Dependent Transform for Learned Image Compression(用于学习图像压缩的神经数据相关变换)
paper | [code](https://dezhao-wang.github.io/Neural-Syntax-Website/)

异常检测(Anomaly Detection)

Generative Cooperative Learning for Unsupervised Video Anomaly Detection(用于无监督视频异常检测的生成式协作学习)
paper

Self-Supervised Predictive Convolutional Attentive Block for Anomaly Detection(用于异常检测的自监督预测卷积注意力块) (paper not yet uploaded)
paper | code

模型训练/泛化(Model Training/Generalization)

Towards Efficient and Scalable Sharpness-Aware Minimization(迈向高效和可扩展的锐度感知最小化)
keywords: Sharp Local Minima, Large-Batch Training
paper

CAFE: Learning to Condense Dataset by Aligning Features(通过对齐特征学习压缩数据集)
keywords: dataset condensation, coreset selection, generative models
paper | code

The Devil is in the Margin: Margin-based Label Smoothing for Network Calibration(魔鬼在边缘：用于网络校准的基于边缘的标签平滑)
paper | code

DN-DETR: Accelerate DETR Training by Introducing Query DeNoising(通过引入查询去噪加速 DETR 训练)
keywords: Detection Transformer
paper | code

长尾分布(Long-Tailed Distribution)

Targeted Supervised Contrastive Learning for Long-Tailed Recognition(用于长尾识别的有针对性的监督对比学习)
keywords: Long-Tailed Recognition(长尾识别), Contrastive Learning(对比学习)
paper

图像特征提取与匹配(Image feature extraction and matching)

Probabilistic Warp Consistency for Weakly-Supervised Semantic Correspondences(弱监督语义对应的概率扭曲一致性)
paper | code

多模态学习(Multi-Modal Learning)

视觉-语言（Vision-language）

Conditional Prompt Learning for Vision-Language Models(视觉语言模型的条件提示学习)
paper | code

NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks(视觉和视觉语言任务中的自然语言解释模型)
paper | code

**L-Verse: Bidirectional Generation Between Image and Text(图像和文本之间的双向生成)** (Oral Presentation)
paper

HairCLIP: Design Your Hair by Text and Reference Image(通过文本和参考图像设计你的头发)
keywords: Language-Image Pre-Training (CLIP), Generative Adversarial Networks
paper

CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields(文本和图像驱动的神经辐射场操作)
keywords: NeRF, Image Generation and Manipulation, Language-Image Pre-Training (CLIP)
paper | code

Vision-Language Pre-Training with Triple Contrastive Learning(三重对比学习的视觉语言预训练)
keywords: Vision-language representation learning, Contrastive Learning
paper | code

视觉预测(Vision-based Prediction)

Adaptive Trajectory Prediction via Transferable GNN(基于可迁移 GNN 的自适应轨迹预测)
paper

Towards Robust and Adaptive Motion Forecasting: A Causal Representation Perspective(迈向稳健和自适应运动预测：因果表示视角)
paper | code

How many Observations are Enough? Knowledge Distillation for Trajectory Forecasting(多少个观察就足够了？轨迹预测的知识蒸馏)
keywords: Knowledge Distillation, trajectory forecasting
paper

Motron: Multimodal Probabilistic Human Motion Forecasting(多模式概率人体运动预测)
paper

数据集(Dataset)

GrainSpace: A Large-scale Dataset for Fine-grained and Domain-adaptive Recognition of Cereal Grains(用于细粒度和域自适应识别谷物的大规模数据集)
paper

Kubric: A scalable dataset generator(Kubric：可扩展的数据集生成器)
paper | code

A Large-scale Comprehensive Dataset and Copy-overlap Aware Evaluation Protocol for Segment-level Video Copy Detection(用于分段级视频复制检测的大规模综合数据集和复制重叠感知评估协议)
paper

小样本学习/零样本学习(Few-shot Learning/Zero-shot Learning)

Learning to Affiliate: Mutual Centralized Learning for Few-shot Classification(小样本分类的相互集中学习)
paper

MSDN: Mutually Semantic Distillation Network for Zero-Shot Learning(用于零样本学习的相互语义蒸馏网络)
keywords: Zero-Shot Learning, Knowledge Distillation
paper | code

持续学习(Continual Learning/Life-long Learning)

On Generalizing Beyond Domains in Cross-Domain Continual Learning(关于跨域持续学习中的域外泛化)
paper

场景图(Scene Graph)

场景图生成(Scene Graph Generation)

Classification-Then-Grounding: Reformulating Video Scene Graphs as Temporal Bipartite Graphs(将视频场景图重新格式化为时间二分图)
keywords: Video Scene Graph Generation, Transformer, Video Grounding
paper | code

视觉定位(Visual Localization)

Spatial Commonsense Graph for Object Localisation in Partial Scenes(局部场景中对象定位的空间常识图)
paper | code

图像分类(Image Classification)

GlideNet: Global, Local and Intrinsic based Dense Embedding NETwork for Multi-category Attributes Prediction(用于多类别属性预测的基于全局、局部和内在的密集嵌入网络)
keywords: multi-label classification
paper | code

迁移学习/domain/自适应(Transfer Learning/Domain Adaptation)

How Well Do Sparse Imagenet Models Transfer?(稀疏 Imagenet 模型的迁移效果如何？)
paper

A Simple Multi-Modality Transfer Learning Baseline for Sign Language Translation(用于手语翻译的简单多模态迁移学习基线)
paper

Weakly Supervised Object Localization as Domain Adaption(作为域适应的弱监督对象定位)
keywords: Weakly Supervised Object Localization(WSOL), Multi-instance learning based WSOL, Separated-structure based WSOL, Domain Adaption
paper | code

度量学习(Metric Learning)

Enhancing Adversarial Robustness for Deep Metric Learning(增强深度度量学习的对抗鲁棒性)
keywords: Adversarial Attack, Adversarial Defense, Deep Metric Learning
paper

对比学习(Contrastive Learning)

Selective-Supervised Contrastive Learning with Noisy Labels(带有噪声标签的选择性监督对比学习)
paper | code

HCSC: Hierarchical Contrastive Selective Coding(分层对比选择性编码)
keywords: Self-supervised Representation Learning, Deep Clustering, Contrastive Learning
paper | code

Crafting Better Contrastive Views for Siamese Representation Learning(为连体表示学习制作更好的对比视图)
paper | code

元学习(Meta Learning)

What Matters For Meta-Learning Vision Regression Tasks?(元学习视觉回归任务的重要性是什么？)
paper

机器人(Robotic)

IFOR: Iterative Flow Minimization for Robotic Object Rearrangement(IFOR：机器人对象重排的迭代流最小化)
paper

自监督学习/半监督学习(Self-supervised Learning/Semi-supervised Learning)

Class-Aware Contrastive Semi-Supervised Learning(类感知对比半监督学习)
keywords: Semi-Supervised Learning, Self-Supervised Learning, Real-World Unlabeled Data Learning
paper

A study on the distribution of social biases in self-supervised learning visual models(自监督学习视觉模型中social biases分布的研究)
paper

神经网络可解释性(Neural Network Interpretability)

Do Explanations Explain? Model Knows Best(解释解释吗？模型最清楚)
paper

Interpretable part-whole hierarchies and conceptual-semantic relationships in neural networks(神经网络中可解释的部分-整体层次结构和概念语义关系)
paper

人群计数(Crowd Counting)

Boosting Crowd Counting via Multifaceted Attention(通过多方面注意提高人群计数)
paper | code

联邦学习(Federated Learning)

Differentially Private Federated Learning with Local Regularization and Sparsification(局部正则化和稀疏化的差分私有联邦学习)
paper

其他(Others)

**L-Verse: Bidirectional Generation Between Image and Text(图像和文本之间的双向生成)** (Vision-Language Representation Learning)
paper | code

Backbone

MPViT: Multi-Path Vision Transformer for Dense Prediction
paper | code

CLIP

PointCLIP: Point Cloud Understanding by CLIP
paper | code

Blended Diffusion for Text-driven Editing of Natural Images
paper | code

NAS

ISNAS-DIP: Image-Specific Neural Architecture Search for Deep Image Prior
paper | code

NeRF

Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields
paper

NeRF in the Dark: High Dynamic Range View Synthesis from Noisy Raw Images
paper

Visual Transformer

Backbone

MPViT: Multi-Path Vision Transformer for Dense Prediction
paper | code

应用(Application)

Language-based Video Editing via Multi-Modal Multi-Level Transformer
paper | code

Embracing Single Stride 3D Object Detector with Sparse Transformer
paper | code

Spatio-temporal Relation Modeling for Few-shot Action Recognition
paper | code

Mask-guided Spectral-wise Transformer for Efficient Hyperspectral Image Reconstruction
paper | code

数据增强(Data Augmentation)

AlignMix: Improving representation by interpolating aligned features
paper | code

实例分割(Instance Segmentation)

自监督实例分割(Self-supervised Instance Segmentation)

FreeSOLO: Learning to Segment Objects without Annotations
paper | code

视频理解(Video Understanding)

行为识别(Action Recognition)

Spatio-temporal Relation Modeling for Few-shot Action Recognition
paper | code

图像编辑(Image Editing)

Blended Diffusion for Text-driven Editing of Natural Images
paper | code

Low-level Vision

ISNAS-DIP: Image-Specific Neural Architecture Search for Deep Image Prior
paper | code

超分辨率(Super-Resolution)

图像超分辨率(Image Super-Resolution)

Learning the Degradation Distribution for Blind Image Super-Resolution
paper | code

视频超分辨率(Video Super-Resolution)

BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment
paper | code

3D点云(3D Point Cloud)

PointCLIP: Point Cloud Understanding by CLIP
paper | code

3D目标检测(3D Object Detection)

Embracing Single Stride 3D Object Detector with Sparse Transformer
paper | code

3D人体姿态估计(3D Human Pose Estimation)

MHFormer: Multi-Hypothesis Transformer for 3D Human Pose Estimation
paper | code

3D语义场景补全(3D Semantic Scene Completion)

MonoScene: Monocular 3D Semantic Scene Completion
paper | code

3D重建(3D Reconstruction)

BANMo: Building Animatable 3D Neural Models from Many Casual Videos
paper | code

深度估计(Depth Estimation)

单目深度估计(Monocular Depth Estimation)

Toward Practical Self-Supervised Monocular Indoor Depth Estimation
paper | code

人群计数(Crowd Counting)

Leveraging Self-Supervision for Cross-Domain Crowd Counting
paper | code

医学图像(Medical Image)

BoostMIS: Boosting Medical Image Semi-supervised Learning with Adaptive Pseudo Labeling and Informative Active Annotation
paper | code

场景图生成(Scene Graph Generation)

SGTR: End-to-end Scene Graph Generation with Transformer
paper | code

风格迁移(Style Transfer)

StyleMesh: Style Transfer for Indoor 3D Scene Reconstructions
paper | code

高光谱图像重建(Hyperspectral Image Reconstruction)

Mask-guided Spectral-wise Transformer for Efficient Hyperspectral Image Reconstruction
paper | code

水印(Watermarking)

Deep 3D-to-2D Watermarking: Embedding Messages in 3D Meshes and Extracting Them from 2D Renderings
paper | code

数据集(Datasets)

It's About Time: Analog Clock Reading in the Wild
paper | code

Toward Practical Self-Supervised Monocular Indoor Depth Estimation
paper | code

新任务(New Task)

Language-based Video Editing via Multi-Modal Multi-Level Transformer
paper | code

It's About Time: Analog Clock Reading in the Wild
paper | code

yanglee00 / cvpr-2022-papers

分类目录(Table of Contents):

检测(Detection)

2D目标检测(2D Object Detection)

视频目标检测(Video Object Detection)

3D目标检测(3D object detection)

伪装目标检测(Camouflaged Object Detection)

关键点检测(Keypoint Detection)

车道线检测(Lane Detection)

分割(Segmentation)

全景分割(Panoptic Segmentation)

语义分割(Semantic Segmentation)

实例分割(Instance Segmentation)

估计(Estimation)

姿态估计(Human Pose Estimation)

光流/位姿/运动估计(Optical Flow/Pose/Motion Estimation)

深度估计(Depth Estimation)

图像处理(Image Processing)

超分辨率(Super Resolution)

图像复原/图像增强/图像重建(Image Restoration/Image Reconstruction)

图像去噪/去模糊/去雨去雾(Image Denoising)

图像编辑/图像修复(Image Edit/Inpainting)

图像翻译(Image Translation)

风格迁移(Style Transfer)

人脸(Face)

人脸识别/检测(Facial Recognition/Detection)

人脸生成/合成/重建/编辑(Face Generation/Face Synthesis/Face Reconstruction/Face Editing)

人脸伪造/反欺骗(Face Forgery/Face Anti-Spoofing)

目标跟踪(Object Tracking)

图像&视频检索/视频理解(Image&Video Retrieval/Video Understanding)

行为识别/动作识别/检测/分割/定位(Action/Activity Recognition)

图像/视频字幕(Image/Video Caption)

医学影像(Medical Imaging)

GAN/生成式/对抗式(GAN/Generative/Adversarial)

图像生成/图像合成/视频合成(Image Generation/Image Synthesis/Video Generation)

三维视觉(3D Vision)

点云(Point Cloud)

三维重建(3D Reconstruction)

场景重建/新视角合成(Novel View Synthesis)

模型压缩(Model Compression)

知识蒸馏(Knowledge Distillation)

量化(Quantization)

神经网络结构设计(Neural Network Structure Design)

CNN

Transformer

神经网络架构搜索(NAS)

MLP

数据处理(Data Processing)

数据增广(Data Augmentation)

图像压缩(Image Compression)

异常检测(Anomaly Detection)

模型训练/泛化(Model Training/Generalization)

长尾分布(Long-Tailed Distribution)

图像特征提取与匹配(Image feature extraction and matching)

多模态学习(Multi-Modal Learning)

视觉-语言（Vision-language）

视觉预测(Vision-based Prediction)

数据集(Dataset)

小样本学习/零样本学习(Few-shot Learning/Zero-shot Learning)

持续学习(Continual Learning/Life-long Learning)

场景图(Scene Graph)

场景图生成(Scene Graph Generation)

视觉定位(Visual Localization)

图像分类(Image Classification)