Introduction to cv-arxiv-daily, from ederev

Updated on 2024.08.22

Usage instructions: here

Table of Contents

Semantic Segmentation
Instance Segmentation
Panoptic Segmentation
Object Detection
Keypoint Detection
Open-Vocabulary
Image Captioning

Semantic Segmentation

Publish Date	Title	Authors	PDF (arXiv ID)	Code
2024-08-20	NeCo: Improving DINOv2's spatial representations in 19 GPU hours with Patch Neighbor Consistency	Valentinos Pariza et.al.	2408.11054	null
2024-08-20	CO2Wounds-V2: Extended Chronic Wounds Dataset From Leprosy Patients	Karen Sanchez et.al.	2408.10827	null
2024-08-20	Rethinking Video Segmentation with Masked Video Consistency: Did the Model Learn as Intended?	Chen Liang et.al.	2408.10627	null
2024-08-20	Subspace Prototype Guidance for Mitigating Class Imbalance in Point Cloud Semantic Segmentation	Jiawei Han et.al.	2408.10537	link
2024-08-19	Imbalance-Aware Culvert-Sewer Defect Segmentation Using an Enhanced Feature Pyramid Network	Rasha Alshawi et.al.	2408.10181	null
2024-08-19	Dynamic Label Injection for Imbalanced Industrial Defect Segmentation	Emanuele Caruso et.al.	2408.10031	link
2024-08-19	Detecting Adversarial Attacks in Semantic Segmentation via Uncertainty Estimation: A Deep Analysis	Kira Maag et.al.	2408.10021	null
2024-08-19	Segment-Anything Models Achieve Zero-shot Robustness in Autonomous Driving	Jun Yan et.al.	2408.09839	link
2024-08-18	OVOSE: Open-Vocabulary Semantic Segmentation in Event-Based Cameras	Muhammad Rameez Ur Rahman et.al.	2408.09424	link
2024-08-18	Elite360M: Efficient 360 Multi-task Learning via Bi-projection Fusion and Cross-task Collaboration	Hao Ai et.al.	2408.09336	null
2024-08-17	Cross-Species Data Integration for Enhanced Layer Segmentation in Kidney Pathology	Junchao Zhu et.al.	2408.09278	link
2024-08-17	GoodSAM++: Bridging Domain and Capacity Gaps via Segment Anything Model for Panoramic Semantic Segmentation	Weiming Zhang et.al.	2408.09115	null
2024-08-17	Depth-guided Texture Diffusion for Image Semantic Segmentation	Wei Sun et.al.	2408.09097	null
2024-08-15	5%>100%: Breaking Performance Shackles of Full Fine-Tuning on Visual Recognition Tasks	Dongshuo Yin et.al.	2408.08345	link
2024-08-14	MedTsLLM: Leveraging LLMs for Multimodal Medical Time Series Analysis	Nimeesha Chan et.al.	2408.07773	link
2024-08-15	MetaSeg: MetaFormer-based Global Contexts-aware Network for Efficient Semantic Segmentation	Beoungwoo Kang et.al.	2408.07576	link
2024-08-15	MagicFace: Training-free Universal-Style Human Image Customized Synthesis	Yibin Wang et.al.	2408.07433	null
2024-08-14	Segment Using Just One Example	Pratik Vora et.al.	2408.07393	null
2024-08-14	Ensemble architecture in polyp segmentation	Hao-Yun Hsu et.al.	2408.07262	link
2024-08-14	Leveraging Perceptual Scores for Dataset Pruning in Computer Vision Tasks	Raghavendra Singh et.al.	2408.07243	null
2024-08-14	Enhancing Autonomous Vehicle Perception in Adverse Weather through Image Augmentation during Semantic Segmentation Training	Ethan Kou et.al.	2408.07239	null
2024-08-13	ReCLIP++: Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation	Jingyun Wang et.al.	2408.06747	link
2024-08-10	Dilated Convolution with Learnable Spacings	Ismail Khalfaoui-Hassani et.al.	2408.06383	null
2024-08-12	Correlation Weighted Prototype-based Self-Supervised One-Shot Segmentation of Medical Images	Siladittya Manna et.al.	2408.06235	null
2024-08-12	A-BDD: Leveraging Data Augmentations for Safe Autonomous Driving in Adverse Weather and Lighting	Felix Assion et.al.	2408.06071	null
2024-08-12	Enhancing 3D Transformer Segmentation Model for Medical Image with Token-level Representation Learning	Xinrong Hu et.al.	2408.05889	link
2024-08-11	Seg-CycleGAN : SAR-to-optical image translation guided by a downstream task	Hannuo Zhang et.al.	2408.05777	null
2024-08-11	MacFormer: Semantic Segmentation with Fine Object Boundaries	Guoan Xu et.al.	2408.05699	null
2024-08-10	Multimodal generative semantic communication based on latent diffusion model	Weiqi Fu et.al.	2408.05455	null
2024-08-09	In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation	Dahyun Kang et.al.	2408.04961	link
2024-08-09	ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation	Mengcheng Lan et.al.	2408.04883	link
2024-08-09	Extracting Signal Electron Trajectories in the COMET Phase-I Cylindrical Drift Chamber Using Deep Learning	Fumihiro Kaneko et.al.	2408.04795	null
2024-08-08	SAM 2 in Robotic Surgery: An Empirical Evaluation for Robustness and Generalization in Surgical Video Segmentation	Jieming Yu et.al.	2408.04593	null
2024-08-08	SegXAL: Explainable Active Learning for Semantic Segmentation in Driving Scene Scenarios	Sriram Mandalika et.al.	2408.04482	null
2024-08-08	What could go wrong? Discovering and describing failure modes in computer vision	Gabriela Csurka et.al.	2408.04471	null
2024-08-07	CAS-ViT: Convolutional Additive Self-attention Vision Transformers for Efficient Mobile Applications	Tianfang Zhang et.al.	2408.03703	link
2024-08-07	SAM2-PATH: A better segment anything model for semantic segmentation in digital pathology	Mingya Zhang et.al.	2408.03651	link
2024-08-06	Post-Mortem Human Iris Segmentation Analysis with Deep Learning	Afzal Hossain et.al.	2408.03448	null
2024-08-06	Comb, Prune, Distill: Towards Unified Pruning for Vision Model Compression	Jonas Schmitt et.al.	2408.03046	link
2024-08-05	Perception Matters: Enhancing Embodied AI with Uncertainty-Aware Semantic Segmentation	Sai Prasanna et.al.	2408.02297	null
2024-08-05	Cross-Domain Semantic Segmentation on Inconsistent Taxonomy using VLMs	Jeongkee Lim et.al.	2408.02261	null
2024-08-05	Curriculum learning based pre-training using Multi-Modal Contrastive Masked Autoencoders	Muhammad Abdullah Jamal et.al.	2408.02245	null
2024-08-04	Pixel-Level Domain Adaptation: A New Perspective for Enhancing Weakly Supervised Semantic Segmentation	Ye Du et.al.	2408.02039	null
2024-08-03	Bayesian Active Learning for Semantic Segmentation	Sima Didari et.al.	2408.01694	null
2024-08-03	A Comparative Analysis of CNN-based Deep Learning Models for Landslide Detection	Omkar Oak et.al.	2408.01692	null
2024-08-03	Leveraging GNSS and Onboard Visual Data from Consumer Vehicles for Robust Road Network Estimation	Balázs Opra et.al.	2408.01640	null
2024-08-02	Multi-Unit Floor Plan Recognition and Reconstruction Using Improved Semantic Segmentation of Raster-Wise Floor Plans	Lukas Kratochvila et.al.	2408.01526	null
2024-08-02	Balanced Residual Distillation Learning for 3D Point Cloud Class-Incremental Semantic Segmentation	Yuanzhi Su et.al.	2408.01356	null
2024-08-02	StitchFusion: Weaving Any Visual Modalities to Enhance Multimodal Semantic Segmentation	Bingyu Li et.al.	2408.01343	null
2024-08-02	Visible-Thermal Multiple Object Tracking: Large-scale Video Dataset and Progressive Fusion Approach	Yabin Zhu et.al.	2408.00969	link
2024-08-01	Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation	Siyu Jiao et.al.	2408.00744	link
2024-08-01	Synthetic dual image generation for reduction of labeling efforts in semantic segmentation of micrographs with a customized metric function	Matias Oscar Volman Stern et.al.	2408.00707	null
2024-08-01	AMAES: Augmented Masked Autoencoder Pretraining on Public Brain MRI Data for 3D-Native Segmentation	Asbjørn Munk et.al.	2408.00640	null
2024-08-01	SegStitch: Multidimensional Transformer for Robust and Efficient Medical Imaging Segmentation	Shengbo Tan et.al.	2408.00496	link
2024-07-31	Open-Vocabulary Audio-Visual Semantic Segmentation	Ruohao Guo et.al.	2407.21721	null
2024-07-31	MTA-CLIP: Language-Guided Semantic Segmentation with Mask-Text Alignment	Anurag Das et.al.	2407.21654	null
2024-07-31	Small Object Few-shot Segmentation for Vision-based Industrial Inspection	Zilong Zhang et.al.	2407.21351	link
2024-07-31	On-the-fly Point Feature Representation for Point Clouds Analysis	Jiangyi Wang et.al.	2407.21335	null
2024-07-31	Fine-grained Metrics for Point Cloud Semantic Segmentation	Zhuheng Lu et.al.	2407.21289	null
2024-07-30	PLANesT-3D: A new annotated dataset for segmentation of 3D plant point clouds	Kerem Mertoğlu et.al.	2407.21150	null
2024-07-30	Learning Ordinality in Semantic Segmentation	Rafael Cristino et.al.	2407.20959	null
2024-07-29	Improving 2D Feature Representations by 3D-Aware Fine-Tuning	Yuanwen Yue et.al.	2407.20229	null
2024-07-29	Background Semantics Matter: Cross-Task Feature Exchange Network for Clustered Infrared Small Target Detection With Sky-Annotated Dataset	Yimian Dai et.al.	2407.20078	link
2024-07-29	Language-driven Grasp Detection with Mask-guided Attention	Tuan Van Vo et.al.	2407.19877	null
2024-07-29	Rethinking RGB-D Fusion for Semantic Segmentation in Surgical Datasets	Muhammad Abdullah Jamal et.al.	2407.19714	null
2024-07-29	ALEN: A Dual-Approach for Uniform and Non-Uniform Low-Light Image Enhancement	Ezequiel Perez-Zarate et.al.	2407.19708	link
2024-07-28	ASI-Seg: Audio-Driven Surgical Instrument Segmentation with Surgeon Intention Understanding	Zhen Chen et.al.	2407.19435	link
2024-07-27	Ensembling convolutional neural networks for human skin segmentation	Patryk Kuban et.al.	2407.19310	null
2024-07-27	Sewer Image Super-Resolution with Depth Priors and Its Lightweight Network	Gang Pan et.al.	2407.19271	null
2024-07-26	Sparse Refinement for Efficient High-Resolution Semantic Segmentation	Zhijian Liu et.al.	2407.19014	null
2024-07-29	Learning Spectral-Decomposed Tokens for Domain Generalized Semantic Segmentation	Jingjun Yi et.al.	2407.18568	null
2024-07-25	Taxonomy-Aware Continual Semantic Segmentation in Hyperbolic Spaces for Open-World Perception	Julia Hindel et.al.	2407.18145	null
2024-07-25	TiCoSS: Tightening the Coupling between Semantic Segmentation and Stereo Matching within A Joint Learning Framework	Guanfeng Tang et.al.	2407.18038	null
2024-07-25	Segmentation-guided MRI reconstruction for meaningfully diverse reconstructions	Jan Nikolas Morshuis et.al.	2407.18026	link
2024-07-24	Embedding-Free Transformer with Inference Spatial Reduction for Efficient Semantic Segmentation	Hyunwoo Yu et.al.	2407.17261	link
2024-07-24	Trans2Unet: Neural fusion for Nuclei Semantic Segmentation	Dinh-Phu Tran et.al.	2407.17181	null
2024-07-24	PiPa++: Towards Unification of Domain Adaptive Semantic Segmentation via Self-supervised Learning	Mu Chen et.al.	2407.17101	null
2024-07-25	Enhancing Environmental Monitoring through Multispectral Imaging: The WasteMS Dataset for Semantic Segmentation of Lakeside Waste	Qinfeng Zhu et.al.	2407.17028	link
2024-07-24	Progressive Query Refinement Framework for Bird's-Eye-View Semantic Segmentation from Surrounding Images	Dooseop Choi et.al.	2407.17003	link
2024-07-23	Deformable Convolution Based Road Scene Semantic Segmentation of Fisheye Images in Autonomous Driving	Anam Manzoor et.al.	2407.16647	null
2024-07-23	Deep Bayesian segmentation for colon polyps: Well-calibrated predictions in medical imaging	Daniela L. Ramos et.al.	2407.16608	null
2024-07-23	Augmented Efficiency: Reducing Memory Footprint and Accelerating Inference for 3D Semantic Segmentation through Hybrid Vision	Aditya Krishnan et.al.	2407.16102	null
2024-07-22	MILAN: Milli-Annotations for Lidar Semantic Segmentation	Nermin Samet et.al.	2407.15797	null
2024-07-22	Diffusion for Out-of-Distribution Detection on Road Scenes and Beyond	Silvio Galesso et.al.	2407.15739	link
2024-07-22	MSSPlace: Multi-Sensor Place Recognition with Visual and Text Semantics	Alexander Melekhin et.al.	2407.15663	link
2024-07-22	Learning at a Glance: Towards Interpretable Data-limited Continual Semantic Segmentation via Semantic-Invariance Modelling	Bo Yuan et.al.	2407.15429	link
2024-07-22	Is user feedback always informative? Retrieval Latent Defending for Semi-Supervised Domain Adaptation without Source Data	Junha Song et.al.	2407.15383	null
2024-07-21	Point Transformer V3 Extreme: 1st Place Solution for 2024 Waymo Open Dataset Challenge in Semantic Segmentation	Xiaoyang Wu et.al.	2407.15282	null
2024-07-20	Downstream-Pretext Domain Knowledge Traceback for Active Learning	Beichen Zhang et.al.	2407.14720	null
2024-07-19	Panoptic Segmentation of Mammograms with Text-To-Image Diffusion Model	Kun Zhao et.al.	2407.14326	null
2024-07-19	Early Preparation Pays Off: New Classifier Pre-tuning for Class Incremental Semantic Segmentation	Zhengyuan Xie et.al.	2407.14142	link
2024-07-19	GaussianBeV: 3D Gaussian Representation meets Perception Models for BeV Segmentation	Florian Chabot et.al.	2407.14108	null
2024-07-18	Many Perception Tasks are Highly Redundant Functions of their Input Data	Rahul Ramesh et.al.	2407.13841	null
2024-07-18	GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model	Abdelrahman Shaker et.al.	2407.13772	link
2024-07-18	SegPoint: Segment Any Point Cloud via Large Language Model	Shuting He et.al.	2407.13761	null
2024-07-18	MeshSegmenter: Zero-Shot Mesh Semantic Segmentation via Texture Synthesis	Ziming Zhong et.al.	2407.13675	link
2024-07-18	Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models	Xiaoyu Zhu et.al.	2407.13642	null
2024-07-18	FADE: A Task-Agnostic Upsampling Operator for Encoder-Decoder Architectures	Hao Lu et.al.	2407.13500	link
2024-07-18	FREST: Feature RESToration for Semantic Segmentation under Multiple Adverse Conditions	Sohyun Lee et.al.	2407.13437	null
2024-07-18	Lightweight Uncertainty Quantification with Simplex Semantic Segmentation for Terrain Traversability	Judith Dijk et.al.	2407.13392	null
2024-07-18	Learning from the Web: Language Drives Weakly-Supervised Incremental Learning for Semantic Segmentation	Chang Liu et.al.	2407.13363	link
2024-07-18	Make a Strong Teacher with Label Assistance: A Novel Knowledge Distillation Approach for Semantic Segmentation	Shoumeng Qiu et.al.	2407.13254	link
2024-07-18	OE-BevSeg: An Object Informed and Environment Aware Multimodal Framework for Bird's-eye-view Vehicle Semantic Segmentation	Jian Sun et.al.	2407.13137	null
2024-07-17	Weighting Pseudo-Labels via High-Activation Feature Index Similarity and Object Detection for Semi-Supervised Segmentation	Prantik Howlader et.al.	2407.12630	link
2024-07-17	Instance-wise Uncertainty for Class Imbalance in Semantic Segmentation	Luís Almeida et.al.	2407.12609	null
2024-07-18	Benchmarking Robust Self-Supervised Learning Across Diverse Downstream Tasks	Antoni Kowalczuk et.al.	2407.12588	link
2024-07-17	Dual-level Adaptive Self-Labeling for Novel Class Discovery in Point Cloud Segmentation	Ruijie Xu et.al.	2407.12489	link
2024-07-17	Progressive Proxy Anchor Propagation for Unsupervised Semantic Segmentation	Hyun Seok Seong et.al.	2407.12463	link
2024-07-17	ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference	Mengcheng Lan et.al.	2407.12442	null
2024-07-17	Serialized Point Mamba: A Serialized Point Cloud Mamba Segmentation Model	Tao Wang et.al.	2407.12319	null
2024-07-16	FoodMem: Near Real-time and Precise Food Video Segmentation	Ahmad AlMughrabi et.al.	2407.12121	null
2024-07-16	Mitigating Background Shift in Class-Incremental Semantic Segmentation	Gilhan Park et.al.	2407.11859	link
2024-07-16	Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation	Juncheng Ma et.al.	2407.11820	null
2024-07-16	XEdgeAI: A Human-centered Industrial Inspection Framework with Data-centric Explainable Edge AI Approach	Truong Thanh Hung Nguyen et.al.	2407.11771	null
2024-07-16	OAM-TCD: A globally diverse dataset of high-resolution tree cover maps	Josh Veitch-Michaelis et.al.	2407.11743	link
2024-07-16	SFPNet: Sparse Focal Point Network for Semantic Segmentation on General LiDAR Point Clouds	Yanbo Wang et.al.	2407.11569	link
2024-07-16	Leveraging Segment Anything Model in Identifying Buildings within Refugee Camps (SAM4Refugee) from Satellite Imagery for Humanitarian Operations	Yunya Gao et.al.	2407.11381	link
2024-07-16	Learning Modality-agnostic Representation for Semantic Segmentation from Any Modalities	Xu Zheng et.al.	2407.11351	null
2024-07-16	Centering the Value of Every Modality: Towards Efficient and Resilient Modality-agnostic Semantic Segmentation	Xu Zheng et.al.	2407.11344	null
2024-07-16	TCFormer: Visual Recognition via Token Clustering Transformer	Wang Zeng et.al.	2407.11321	link
2024-07-15	Distributed Semantic Segmentation with Efficient Joint Source and Task Decoding	Danish Nazir et.al.	2407.11224	null
2024-07-15	No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations	Walter Simoncini et.al.	2407.10964	link
2024-07-15	APC: Adaptive Patch Contrast for Weakly Supervised Semantic Segmentation	Wangyu Wu et.al.	2407.10649	null
2024-07-15	Automated Label Unification for Multi-Dataset Semantic Segmentation with GNNs	Rong Ma et.al.	2407.10534	null
2024-07-14	Shape2Scene: 3D Scene Representation Learning Through Pre-training on Shape Data	Tuo Feng et.al.	2407.10200	link
2024-07-14	RAPiD-Seg: Range-Aware Pointwise Distance Distribution Networks for 3D LiDAR Segmentation	Li Li et.al.	2407.10159	link
2024-07-14	HSFusion: A high-level vision task-driven infrared and visible image fusion network via semantic and geometric domain transformation	Chengjie Jiang et.al.	2407.10047	null
2024-07-13	Background Adaptation with Residual Modeling for Exemplar-Free Class-Incremental Semantic Segmentation	Anqi Zhang et.al.	2407.09838	null
2024-07-13	Enhancing Semantic Segmentation with Adaptive Focal Loss: A Novel Approach	Md Rakibul Islam et.al.	2407.09828	null
2024-07-13	3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance	Xiaoxu Xu et.al.	2407.09826	link
2024-07-13	TASeg: Temporal Aggregation Network for LiDAR Semantic Segmentation	Xiaopei Wu et.al.	2407.09751	null
2024-07-12	FANet: Feature Amplification Network for Semantic Segmentation in Cluttered Background	Muhammad Ali et.al.	2407.09379	link
2024-07-12	Salt & Pepper Heatmaps: Diffusion-informed Landmark Detection Strategy	Julian Wyatt et.al.	2407.09192	null
2024-07-12	Evaluating the Adversarial Robustness of Semantic Segmentation: Trying Harder Pays Off	Levente Halmosi et.al.	2407.09150	link
2024-07-12	Cs2K: Class-specific and Class-shared Knowledge Guidance for Incremental Semantic Segmentation	Wei Cong et.al.	2407.09047	null
2024-07-12	Textual Query-Driven Mask Transformer for Domain Generalized Segmentation	Byeonghyun Pak et.al.	2407.09033	null
2024-07-12	Global Attention-Guided Dual-Domain Point Cloud Feature Learning for Classification and Segmentation	Zihao Li et.al.	2407.08994	null
2024-07-11	Explore the Potential of CLIP for Training-Free Open Vocabulary Semantic Segmentation	Tong Shao et.al.	2407.08268	link
2024-07-11	Enrich the content of the image Using Context-Aware Copy Paste	Qiushi Guo et.al.	2407.08151	null
2024-07-10	MambaVision: A Hybrid Mamba-Transformer Vision Backbone	Ali Hatamizadeh et.al.	2407.08083	link
2024-07-10	Satellite Image Time Series Semantic Change Detection: Novel Architecture and Analysis of Domain Shift	Elliot Vincent et.al.	2407.07616	link
2024-07-10	H-FCBFormer Hierarchical Fully Convolutional Branch Transformer for Occlusal Contact Segmentation with Articulating Paper	Ryan Banks et.al.	2407.07604	link
2024-07-11	Trainable Highly-expressive Activation Functions	Irit Chelly et.al.	2407.07564	link
2024-07-10	Deformable-Heatmap-Segmentation for Automobile Visual Perception	Hongyu Jin et.al.	2407.07493	null
2024-07-10	Exploring the Untouched Sweeps for Conflict-Aware 3D Segmentation Pretraining	Tianfang Sun et.al.	2407.07465	null
2024-07-11	HAFormer: Unleashing the Power of Hierarchy-Aware Features for Lightweight Semantic Segmentation	Guoan Xu et.al.	2407.07441	null
2024-07-09	ItTakesTwo: Leveraging Peer Representations for Semi-supervised LiDAR Semantic Segmentation	Yuyuan Liu et.al.	2407.07171	link
2024-07-08	Training-free CryoET Tomogram Segmentation	Yizhou Zhao et.al.	2407.06833	link
2024-07-09	CycleSAM: One-Shot Surgical Scene Segmentation using Cycle-Consistent Feature Matching to Prompt SAM	Aditya Murali et.al.	2407.06795	null
2024-07-09	LuSNAR:A Lunar Segmentation, Navigation and Reconstruction Dataset based on Muti-sensor for Autonomous Exploration	Jiayi Liu et.al.	2407.06512	link
2024-07-08	Leveraging image captions for selective whole slide image annotation	Jingna Qiu et.al.	2407.06363	link
2024-07-08	Object-Oriented Material Classification and 3D Clustering for Improved Semantic Perception and Mapping in Mobile Robots	Siva Krishna Ravipati et.al.	2407.06077	link
2024-07-08	Test-time adaptation for geospatial point cloud semantic segmentation with distinct domain shifts	Puzuo Wang et.al.	2407.06043	null
2024-07-08	RHRSegNet: Relighting High-Resolution Night-Time Semantic Segmentation	Sarah Elmahdy et.al.	2407.06016	link
2024-07-07	Semantic Segmentation for Real-World and Synthetic Vehicle's Forward-Facing Camera Images	Tuan T. Nguyen et.al.	2407.05452	null
2024-07-07	Self-supervised Learning via Cluster Distance Prediction for Operating Room Context Awareness	Idris Hamoud et.al.	2407.05448	null
2024-07-06	A Study of Test-time Contrastive Concepts for Open-world, Open-vocabulary Semantic Segmentation	Monika Wysoczańska et.al.	2407.05061	null
2024-07-06	BlessemFlood21: Advancing Flood Analysis with a High-Resolution Georeferenced Dataset for Humanitarian Aid Support	Vladyslav Polushko et.al.	2407.05007	null
2024-07-05	Explainable Metric Learning for Deflating Data Bias	Emma Andrews et.al.	2407.04866	null
2024-07-05	LMSeg: A deep graph message-passing network for efficient and accurate semantic segmentation of large-scale 3D landscape meshes	Zexian Huang et.al.	2407.04326	null
2024-07-04	Beyond Pixels: Semi-Supervised Semantic Segmentation with a Multi-scale Patch-based Multi-Label Classifier	Prantik Howlader et.al.	2407.04036	link
2024-07-04	Relative Difficulty Distillation for Semantic Segmentation	Dong Liang et.al.	2407.03719	link
2024-07-04	POSTURE: Pose Guided Unsupervised Domain Adaptation for Human Body Part Segmentation	Arindam Dutta et.al.	2407.03549	null
2024-07-03	A Unified Framework for 3D Scene Understanding	Wei Xu et.al.	2407.03263	null
2024-07-03	ISWSST: Index-space-wave State Superposition Transformers for Multispectral Remotely Sensed Imagery Semantic Segmentation	Chang Li et.al.	2407.03033	null
2024-07-03	ShiftAddAug: Augment Multiplication-Free Tiny Neural Network with Hybrid Computation	Yipin Guo et.al.	2407.02881	null
2024-07-03	Knowledge Transfer with Simulated Inter-Image Erasing for Weakly Supervised Semantic Segmentation	Tao Chen et.al.	2407.02768	link
2024-07-02	Open Panoramic Segmentation	Junwei Zheng et.al.	2407.02685	link
2024-07-08	Holistically-Nested Structure-Aware Graph Neural Network for Road Extraction	Tinghuai Wang et.al.	2407.02639	null
2024-07-02	Rethinking Data Augmentation for Robust LiDAR Semantic Segmentation in Adverse Weather	Junsung Park et.al.	2407.02286	link
2024-07-02	MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders	Baijiong Lin et.al.	2407.02228	link
2024-07-02	Occlusion-Aware Seamless Segmentation	Yihong Cao et.al.	2407.02182	link
2024-07-02	VRBiom: A New Periocular Dataset for Biometric Applications of HMD	Ketan Kotwal et.al.	2407.02150	null
2024-07-02	Label Anything: Multi-Class Few-Shot Semantic Segmentation with Visual Prompts	Pasquale De Marinis et.al.	2407.02075	link
2024-07-02	Multi-Grained Contrast for Data-Efficient Unsupervised Representation Learning	Chengchao Shen et.al.	2407.02014	link
2024-07-01	Label-free Neural Semantic Image Synthesis	Jiayi Wang et.al.	2407.01790	null
2024-07-01	PanopticRecon: Leverage Open-vocabulary Instance Segmentation for Zero-shot Panoptic Reconstruction	Xuan Yu et.al.	2407.01349	null
2024-07-01	CSFNet: A Cosine Similarity Fusion Network for Real-Time RGB-X Semantic Segmentation of Driving Scenes	Danial Qashqai et.al.	2407.01328	link
2024-06-29	SolarSAM: Building-scale Photovoltaic Potential Assessment Based on Segment Anything Model (SAM) and Remote Sensing for Emerging City	Guohao Wang et.al.	2407.00296	link
2024-07-01	Mobile Robot Oriented Large-Scale Indoor Dataset for Dynamic Scene Understanding	Yifan Tang et.al.	2406.19791	null
2024-06-28	Precision matters: Precision-aware ensemble for weakly supervised semantic segmentation	Junsung Park et.al.	2406.19638	link
2024-06-28	PPTFormer: Pseudo Multi-Perspective Transformer for UAV Segmentation	Deyi Ji et.al.	2406.19632	null
2024-06-27	Mamba or RWKV: Exploring High-Quality and High-Efficiency Segment Anything Model	Haobo Yuan et.al.	2406.19369	link
2024-06-27	ProtoGMM: Multi-prototype Gaussian-Mixture-based Domain Adaptation Model for Semantic Segmentation	Nazanin Moradinasab et.al.	2406.19225	null
2024-06-30	Segment Anything Model for automated image data annotation: empirical studies using text prompts from Grounding DINO	Fuseini Mumuni et.al.	2406.19057	null
2024-06-27	Divide, Ensemble and Conquer: The Last Mile on Unsupervised Domain Adaptation for On-Board Semantic Segmentation	Tao Lian et.al.	2406.18809	null
2024-06-26	CAS: Confidence Assessments of classification algorithms for Semantic segmentation of EO data	Nikolaos Dionelis et.al.	2406.18279	null
2024-06-26	The Surprising Effectiveness of Multimodal Large Language Models for Video Moment Retrieval	Meinardus Boris et.al.	2406.18113	link
2024-06-26	Few-Shot Medical Image Segmentation with High-Fidelity Prototypes	Song Tang et.al.	2406.18074	link
2024-06-25	Local-to-Global Cross-Modal Attention-Aware Fusion for HSI-X Semantic Segmentation	Xuming Zhang et.al.	2406.17679	null
2024-06-25	DocParseNet: Advanced Semantic Segmentation and OCR Embeddings for Efficient Scanned Document Annotation	Ahmad Mohammadshirazi et.al.	2406.17591	link
2024-06-25	Principal Component Clustering for Semantic Segmentation in Synthetic Data Generation	Felix Stillger et.al.	2406.17541	null
2024-06-25	Investigating Self-Supervised Methods for Label-Efficient Learning	Srinivasa Rao Nandam et.al.	2406.17460	null
2024-06-25	Pseudo Labelling for Enhanced Masked Autoencoders	Srinivasa Rao Nandam et.al.	2406.17450	null
2024-06-25	Mamba24/8D: Enhancing Global Interaction in Point Clouds via State Space Model	Zhuoyuan Li et.al.	2406.17442	null
2024-06-25	Implicit-Zoo: A Large-Scale Dataset of Neural Implicit Functions for 2D Images and 3D Scenes	Qi Ma et.al.	2406.17438	link
2024-06-24	Instance Consistency Regularization for Semi-Supervised 3D Instance Segmentation	Yizheng Wu et.al.	2406.16776	link
2024-06-24	μ-Net: A Deep Learning-Based Architecture for μ-CT Segmentation	Pierangela Bruno et.al.	2406.16724	null
2024-06-24	GATSBI: An Online GTSP-Based Algorithm for Targeted Surface Bridge Inspection and Defect Detection	Harnaik Dhami et.al.	2406.16625	link
2024-06-24	LOGCAN++: Local-global class-aware network for semantic segmentation of remote sensing images	Xiaowen Ma et.al.	2406.16502	link
2024-06-24	Cascade Reward Sampling for Efficient Decoding-Time Alignment	Bolian Li et.al.	2406.16306	link
2024-06-24	SegNet4D: Effective and Efficient 4D LiDAR Semantic Segmentation in Autonomous Driving Environments	Neng Wang et.al.	2406.16279	link
2024-06-23	UDHF2-Net: An Uncertainty-diffusion-model-based High-Frequency TransFormer Network for High-accuracy Interpretation of Remotely Sensed Imagery	Pengfei Zhang et.al.	2406.16129	null
2024-06-22	Fine-grained Background Representation for Weakly Supervised Semantic Segmentation	Xu Yin et.al.	2406.15755	link
2024-06-20	Evaluation of Deep Learning Semantic Segmentation for Land Cover Mapping on Multispectral, Hyperspectral and High Spatial Aerial Imagery	Ilham Adi Panuntun et.al.	2406.14220	null
2024-06-20	Trusting Semantic Segmentation Networks	Samik Some et.al.	2406.14201	null
2024-06-20	EvSegSNN: Neuromorphic Semantic Segmentation for Event Data	Dalia Hareb et.al.	2406.14178	null
2024-06-20	Seg-LSTM: Performance of xLSTM for Semantic Segmentation of Remotely Sensed Images	Qinfeng Zhu et.al.	2406.14086	link
2024-06-19	Search-based DNN Testing and Retraining with GAN-enhanced Simulations	Mohammed Oualid Attaoui et.al.	2406.13359	null
2024-06-19	Deep Learning-Based 3D Instance and Semantic Segmentation: A Review	Siddiqui Muhammad Yasir et.al.	2406.13308	null
2024-06-18	Reparameterizable Dual-Resolution Network for Real-time Semantic Segmentation	Guoyu Yang et.al.	2406.12496	link
2024-06-18	Agriculture-Vision Challenge 2024 -- The Runner-Up Solution for Agricultural Pattern Recognition via Class Balancing and Model Ensemble	Wang Liu et.al.	2406.12271	null
2024-06-17	OoDIS: Anomaly Instance Segmentation Benchmark	Alexey Nekrasov et.al.	2406.11835	link
2024-06-17	Multimodal Learning To Improve Segmentation With Intraoperative CBCT & Preoperative CT	Maximilian E. Tschuchnig et.al.	2406.11650	null
2024-06-17	SWCF-Net: Similarity-weighted Convolution and Local-global Fusion for Efficient Large-scale Point Cloud Semantic Segmentation	Zhenchao Lin et.al.	2406.11441	link
2024-06-17	Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding	Yunsong Wang et.al.	2406.11283	null
2024-06-17	Frozen CLIP: A Strong Backbone for Weakly Supervised Semantic Segmentation	Bingfeng Zhang et.al.	2406.11189	link
2024-06-16	α-SSC: Uncertainty-Aware Camera-based 3D Semantic Scene Completion	Sanbao Su et.al.	2406.11021	null
2024-06-16	PyramidMamba: Rethinking Pyramid Feature Fusion with Selective Space State Model for Semantic Segmentation of Remote Sensing Imagery	Libo Wang et.al.	2406.10828	link
2024-06-15	GenMM: Geometrically and Temporally Consistent Multimodal Data Generation for Video and LiDAR	Bharat Singh et.al.	2406.10722	null
2024-06-15	A Late-Stage Bitemporal Feature Fusion Network for Semantic Change Detection	Chenyao Zhou et.al.	2406.10678	link
2024-06-14	ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers	Narges Norouzi et.al.	2406.09936	link
2024-06-14	Label-Efficient Semantic Segmentation of LiDAR Point Clouds in Adverse Weather Conditions	Aldi Piroli et.al.	2406.09906	null
2024-06-14	Exploring the Benefits of Vision Foundation Models for Unsupervised Domain Adaptation	Brunó B. Englert et.al.	2406.09896	link
2024-06-14	Open-Vocabulary Semantic Segmentation with Image Embedding Balancing	Xiangheng Shan et.al.	2406.09829	link
2024-06-13	Instance-level quantitative saliency in multiple sclerosis lesion segmentation	Federico Spagnolo et.al.	2406.09335	link
2024-06-13	APSeg: Auto-Prompt Network for Cross-Domain Few-Shot Semantic Segmentation	Weizhao He et.al.	2406.08372	null
2024-06-12	Dataset Enhancement with Instance-Level Augmentations	Orest Kupyn et.al.	2406.08249	link
2024-06-16	A$^{2}$-MAE: A spatial-temporal-spectral unified remote sensing pre-training method based on anchor-aware masked autoencoder	Lixian Zhang et.al.	2406.08079	null
2024-06-12	OpenObj: Open-Vocabulary Object-Level Neural Radiance Fields with Fine-Grained Understanding	Yinan Deng et.al.	2406.08009	link
2024-06-12	SimSAM: Simple Siamese Representations Based Semantic Affinity Matrix for Unsupervised Image Segmentation	Chanda Grover Kamra et.al.	2406.07986	link
2024-06-12	Small Scale Data-Free Knowledge Distillation	He Liu et.al.	2406.07876	link
2024-06-11	Beyond Bare Queries: Open-Vocabulary Object Retrieval with 3D Scene Graph	Sergey Linok et.al.	2406.07113	null
2024-06-11	PanoSSC: Exploring Monocular Panoptic 3D Scene Reconstruction for Autonomous Driving	Yining Shi et.al.	2406.07037	null
2024-06-12	LiSD: An Efficient Multi-Task Learning Framework for LiDAR Segmentation and Detection	Jiahua Xu et.al.	2406.07023	null
2024-06-10	Stable Neighbor Denoising for Source-free Domain Adaptive Segmentation	Dong Zhao et.al.	2406.06813	link
2024-06-09	Transforming Heart Chamber Imaging: Self-Supervised Learning for Whole Heart Reconstruction and Segmentation	Abdul Qayyum et.al.	2406.06643	null
2024-06-10	Merlin: A Vision Language Foundation Model for 3D Computed Tomography	Louis Blankemeier et.al.	2406.06512	null
2024-06-10	UMAD: Unsupervised Mask-Level Anomaly Detection for Autonomous Driving	Daniel Bogdoll et.al.	2406.06370	null
2024-06-09	Scaling Graph Convolutions for Mobile Vision	William Avery et.al.	2406.05850	link
2024-06-09	Solution for CVPR 2024 UG2+ Challenge Track on All Weather Semantic Segmentation	Jun Yu et.al.	2406.05837	null
2024-06-09	Convolution and Attention-Free Mamba-based Cardiac Image Segmentation	Abbas Khan et.al.	2406.05786	null
2024-06-09	Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language	Mark Hamilton et.al.	2406.05629	link
2024-06-08	A Two-Stage Adverse Weather Semantic Segmentation Method for WeatherProof Challenge CVPR 2024 Workshop UG2+	Jianzhao Wang et.al.	2406.05513	null
2024-06-08	Layered Image Vectorization via Semantic Simplification	Zhenyu Wang et.al.	2406.05404	null
2024-06-08	1st Place Winner of the 2024 Pixel-level Video Understanding in the Wild (CVPR'24 PVUW) Challenge in Video Panoptic Segmentation and Best Long Video Consistency of Video Semantic Segmentation	Qingfeng Liu et.al.	2406.05352	null
2024-06-07	USE: Universal Segment Embeddings for Open-Vocabulary Image Segmentation	Xiaoqi Wang et.al.	2406.05271	null
2024-06-07	Semantic Segmentation on VSPW Dataset through Masked Video Consistency	Chen Liang et.al.	2406.04979	null
2024-06-07	Nacala-Roof-Material: Drone Imagery for Roof Detection, Classification, and Segmentation to Support Mosquito-borne Disease Risk Assessment	Venkanna Babu Guthula et.al.	2406.04949	null
2024-06-06	Characterizing segregation in blast rock piles: a deep-learning approach leveraging aerial image analysis	Chengeng Liu et.al.	2406.04149	null
2024-06-06	Frequency-based Matcher for Long-tailed Semantic Segmentation	Shan Li et.al.	2406.03917	link
2024-06-07	Enhanced Semantic Segmentation Pipeline for WeatherProof Dataset Challenge	Nan Zhang et.al.	2406.03799	link
2024-06-06	DSNet: A Novel Way to Use Atrous Convolutions in Semantic Segmentation	Zilu Guo et.al.	2406.03702	link
2024-06-05	Comparative Benchmarking of Failure Detection Methods in Medical Image Segmentation: Unveiling the Role of Confidence Aggregation	Maximilian Zenk et.al.	2406.03323	null
2024-06-05	Learning Semantic Traversability with Egocentric Video and Automated Annotation Strategy	Yunho Kim et.al.	2406.02989	null
2024-06-04	W-RIZZ: A Weakly-Supervised Framework for Relative Traversability Estimation in Mobile Robotics	Andre Schreiber et.al.	2406.02822	link
2024-06-04	Window to Wall Ratio Detection using SegFormer	Zoe De Simone et.al.	2406.02706	link
2024-06-04	Detecting Endangered Marine Species in Autonomous Underwater Vehicle Imagery Using Point Annotations and Few-Shot Learning	Heather Doig et.al.	2406.01932	null
2024-06-03	EAGLE: Efficient Adaptive Geometry-based Learning in Cross-view Understanding	Thanh-Dat Truong et.al.	2406.01429	null
2024-06-03	TE-NeXt: A LiDAR-Based 3D Sparse Convolutional Network for Traversability Estimation	Antonio Santo et.al.	2406.01395	link
2024-06-03	ARCH2S: Dataset, Benchmark and Challenges for Learning Exterior Architectural Structures from Point Clouds	Ka Lung Cheung et.al.	2406.01337	link
2024-06-03	LSKSANet: A Novel Architecture for Remote Sensing Image Semantic Segmentation Leveraging Large Selective Kernel and Sparse Attention Mechanism	Miao Fu et.al.	2406.01228	null
2024-06-04	GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer	Ding Jia et.al.	2406.01210	link
2024-06-03	S-CycleGAN: Semantic Segmentation Enhanced CT-Ultrasound Image-to-Image Translation for Robotic Ultrasonography	Yuhan Song et.al.	2406.01191	link
2024-06-02	Diffusion Features to Bridge Domain Gap for Semantic Segmentation	Yuxiang Ji et.al.	2406.00777	null
2024-06-02	Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation	Yunheng Li et.al.	2406.00670	link
2024-06-02	Semi-supervised Video Semantic Segmentation Using Unreliable Pseudo Labels for PVUW2024	Biao Wu et.al.	2406.00587	null
2024-05-31	Uncertainty Quantification for Bird's Eye View Semantic Segmentation: Methods and Benchmarks	Linlin Yu et.al.	2405.20986	null
2024-05-31	Revisiting and Maximizing Temporal Knowledge in Semi-supervised Semantic Segmentation	Wooseok Shin et.al.	2405.20610	link
2024-05-30	P-MSDiff: Parallel Multi-Scale Diffusion for Remote Sensing Image Segmentation	Qi Zhang et.al.	2405.20443	link
2024-05-30	SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow	Chaoyang Wang et.al.	2405.20282	link
2024-05-30	MCDS-VSS: Moving Camera Dynamic Scene Video Semantic Segmentation by Filtering with Self-Supervised Geometry and Motion	Angel Villar-Corrales et.al.	2405.19921	link
2024-05-30	Open-Set Domain Adaptation for Semantic Segmentation	Seun-An Choe et.al.	2405.19899	link
2024-05-30	DenseSeg: Joint Learning for Semantic Segmentation and Landmark Detection Using Dense Image-to-Shape Representation	Ron Keuth et.al.	2405.19746	link
2024-05-30	Twin Deformable Point Convolutions for Point Cloud Semantic Segmentation in Remote Sensing Scenes	Yong-Qiang Mao et.al.	2405.19735	null
2024-05-30	CRIS: Collaborative Refinement Integrated with Segmentation for Polyp Segmentation	Ankush Gajanan Arudkar et.al.	2405.19672	null
2024-05-29	Organizing Background to Explore Latent Classes for Incremental Few-shot Semantic Segmentation	Lianlei Shan et.al.	2405.19568	null
2024-05-29	Enabling Visual Recognition at Radio Frequency	Haowen Lai et.al.	2405.19516	null
2024-05-29	Reasoning3D -- Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models	Tianrun Chen et.al.	2405.19326	null
2024-05-29	A Good Foundation is Worth Many Labels: Label-Efficient Panoptic Segmentation	Niclas Vödisch et.al.	2405.19035	link
2024-05-29	Parameter-efficient Fine-tuning in Hyperspherical Space for Open-vocabulary Semantic Segmentation	Zelin Peng et.al.	2405.18840	null
2024-05-28	Learning to Detour: Shortcut Mitigating Augmentation for Weakly Supervised Semantic Segmentation	JuneHyoung Kwon et.al.	2405.18148	null
2024-05-28	Edge-guided and Class-balanced Active Learning for Semantic Segmentation of Aerial Images	Lianlei Shan et.al.	2405.18078	null
2024-05-28	RT-GS2: Real-Time Generalizable Semantic Segmentation for 3D Gaussian Representations of Radiance Fields	Mihnea-Bogdan Jurca et.al.	2405.18033	null
2024-05-28	DMT-JEPA: Discriminative Masked Targets for Joint-Embedding Predictive Architecture	Shentong Mo et.al.	2405.17995	link
2024-05-28	The Binary Quantized Neural Network for Dense Prediction via Specially Designed Upsampling and Attention	Xingyu Ding et.al.	2405.17776	null
2024-05-27	Evaluation of Multi-task Uncertainties in Joint Semantic Segmentation and Monocular Depth Estimation	Steven Landgraf et.al.	2405.17097	null
2024-05-27	DSU-Net: Dynamic Snake U-Net for 2-D Seismic First Break Picking	Hongtao Wang et.al.	2405.16980	null
2024-05-27	Collective Perception Datasets for Autonomous Driving: A Comprehensive Review	Sven Teufel et.al.	2405.16973	null
2024-05-27	Zero-Shot Video Semantic Segmentation based on Pre-Trained Diffusion Models	Qian Wang et.al.	2405.16947	link
2024-05-27	A re-calibration method for object detection with multi-modal alignment bias in autonomous driving	Zhihang Song et.al.	2405.16848	null
2024-05-25	BOLD: Boolean Logic Deep Learning	Van Minh Nguyen et.al.	2405.16339	null
2024-05-25	Improving 3D Occupancy Prediction through Class-balancing Loss and Multi-scale Representation	Huizhou Chen et.al.	2405.16099	null
2024-05-25	Intensity and Texture Correction of Omnidirectional Image Using Camera Images for Indirect Augmented Reality	Hakim Ikebayashi et.al.	2405.16008	null
2024-05-24	Visualize and Paint GAN Activations	Rudolf Herdt et.al.	2405.15636	null
2024-05-24	Leveraging knowledge distillation for partial multi-task learning from multiple remote sensing datasets	Hoàng-Ân Lê et.al.	2405.15394	link
2024-05-24	U3M: Unbiased Multiscale Modal Fusion Model for Multimodal Semantic Segmentation	Bingyu Li et.al.	2405.15365	link
2024-05-24	Cross-Domain Few-Shot Semantic Segmentation via Doubly Matching Transformation	Jiayi Chen et.al.	2405.15265	link
2024-05-23	Mamba-R: Vision Mamba ALSO Needs Registers	Feng Wang et.al.	2405.14858	null
2024-05-23	Segformer++: Efficient Token-Merging Strategies for High-Resolution Semantic Segmentation	Daniel Kienzle et.al.	2405.14467	link
2024-05-23	MAMBA4D: Efficient Long-Sequence Point Cloud Video Understanding with Disentangled Spatial-Temporal State Space Models	Jiuming Liu et.al.	2405.14338	null
2024-05-23	Tuning-free Universally-Supervised Semantic Segmentation	Xiaobo Yang et.al.	2405.14294	null
2024-05-23	SCMix: Stochastic Compound Mixing for Open Compound Domain Adaptation in Semantic Segmentation	Kai Yao et.al.	2405.14278	null
2024-05-23	Harmony: A Joint Self-Supervised and Weakly-Supervised Framework for Learning General Purpose Visual Representations	Mohammed Baharoon et.al.	2405.14239	link
2024-05-24	Leveraging Semantic Segmentation Masks with Embeddings for Fine-Grained Form Classification	Taylor Archibald et.al.	2405.14162	null
2024-05-23	Skip-SCAR: A Modular Approach to ObjectGoal Navigation with Sparsity and Adaptive Skips	Yaotian Liu et.al.	2405.14154	null
2024-05-22	TS40K: a 3D Point Cloud Dataset of Rural Terrain and Electrical Transmission System	Diogo Lavado et.al.	2405.13989	null
2024-05-22	Semantic Equitable Clustering: A Simple, Fast and Effective Strategy for Vision Transformer	Qihang Fan et.al.	2405.13337	link
2024-05-21	Transparency Distortion Robustness for SOTA Image Segmentation Tasks	Volker Knauthe et.al.	2405.12864	null
2024-05-20	A comprehensive overview of deep learning techniques for 3D point cloud classification and semantic segmentation	Sushmita Sarker et.al.	2405.11903	null
2024-05-20	Salience-guided Ground Factor for Robust Localization of Delivery Robots in Complex Urban Environments	Jooyong Park et.al.	2405.11855	null
2024-05-20	Universal Organizer of SAM for Unsupervised Semantic Segmentation	Tingting Li et.al.	2405.11742	link
2024-05-19	Interpreting a Semantic Segmentation Model for Coastline Detection	Conor O'Sullivan et.al.	2405.11500	null
2024-05-17	CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation	Mushui Liu et.al.	2405.10530	link
2024-05-16	Towards Task-Compatible Compressible Representations	Anderson de Andrade et.al.	2405.10244	link
2024-05-16	A Preprocessing and Postprocessing Voxel-based Method for LiDAR Semantic Segmentation Improvement in Long Distance	Andrea Matteazzi et.al.	2405.10046	null
2024-05-16	Towards Realistic Incremental Scenario in Class Incremental Semantic Segmentation	Jihwan Kwak et.al.	2405.09858	null
2024-05-22	Synth-to-Real Unsupervised Domain Adaptation for Instance Segmentation	Yachan Guo et.al.	2405.09682	null
2024-05-14	CLIP with Quality Captions: A Strong Pretraining for Vision Tasks	Pavan Kumar Anasosalu Vasu et.al.	2405.08911	null
2024-05-14	Rethinking Scanning Strategies with Vision Mamba in Semantic Segmentation of Remote Sensing Imagery: An Experimental Study	Qinfeng Zhu et.al.	2405.08493	null
2024-05-14	TEDNet: Twin Encoder Decoder Neural Network for 2D Camera and LiDAR Road Detection	Martín Bayón-Gutiérrez et.al.	2405.08429	link
2024-05-13	IMAFD: An Interpretable Multi-stage Approach to Flood Detection from time series Multispectral Data	Ziyang Zhang et.al.	2405.07916	null
2024-05-12	Building a Strong Pre-Training Baseline for Universal 3D Large-Scale Perception	Haoming Chen et.al.	2405.07201	link
2024-05-10	GreedyViG: Dynamic Axial Graph Construction for Efficient Vision GNNs	Mustafa Munir et.al.	2405.06849	link
2024-05-10	Enhancing Weakly Supervised Semantic Segmentation with Multi-modal Foundation Models: An End-to-End Approach	Elham Ravanbakhsh et.al.	2405.06586	null
2024-05-10	Semantic and Spatial Adaptive Pixel-level Classifier for Semantic Segmentation	Xiaowen Ma et.al.	2405.06525	link
2024-05-10	Multi-Target Unsupervised Domain Adaptation for Semantic Segmentation without External Data	Yonghao Xu et.al.	2405.06502	link
2024-05-10	Multi-level Personalized Federated Learning on Heterogeneous and Long-Tailed Data	Rongyu Zhang et.al.	2405.06413	null
2024-05-10	Context-Guided Spatial Feature Reconstruction for Efficient Semantic Segmentation	Zhenliang Ni et.al.	2405.06228	link
2024-05-10	Zero-shot Degree of Ill-posedness Estimation for Active Small Object Change Detection	Koji Takeda et.al.	2405.06185	null
2024-05-10	Prior-guided Diffusion Model for Cell Segmentation in Quantitative Phase Imaging	Zhuchen Shao et.al.	2405.06175	null
2024-05-09	Mask-TS Net: Mask Temperature Scaling Uncertainty Calibration for Polyp Segmentation	Yudian Zhang et.al.	2405.05830	null
2024-05-08	OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies	Lingdong Kong et.al.	2405.05259	link
2024-05-08	Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving	Lingdong Kong et.al.	2405.05258	link
2024-05-08	Weakly-supervised Semantic Segmentation via Dual-stream Contrastive Learning of Cross-image Contextual Information	Qi Lai et.al.	2405.04913	null
2024-05-08	DeepDamageNet: A two-step deep-learning model for multi-disaster building damage segmentation and classification using satellite imagery	Irene Alisjahbana et.al.	2405.04800	null
2024-05-07	FRACTAL: An Ultra-Large-Scale Aerial Lidar Dataset for 3D Semantic Segmentation of Diverse Landscapes	Charles Gaydon et.al.	2405.04634	link
2024-05-07	A New Dataset and Comparative Study for Aphid Cluster Detection and Segmentation in Sorghum Fields	Raiyan Rahman et.al.	2405.04305	null
2024-05-07	ELiTe: Efficient Image-to-LiDAR Knowledge Transfer for Semantic Segmentation	Zhibo Zhang et.al.	2405.04121	null
2024-05-06	PTQ4SAM: Post-Training Quantization for Segment Anything	Chengtao Lv et.al.	2405.03144	link
2024-05-04	MMEarth: Exploring Multi-Modal Pretext Tasks For Geospatial Representation Learning	Vishal Nedungadi et.al.	2405.02771	link
2024-05-04	Few-Shot Fruit Segmentation via Transfer Learning	Jordan A. James et.al.	2405.02556	link
2024-05-03	DiffMap: Enhancing Map Segmentation with Map Prior Using Diffusion Model	Peijin Jia et.al.	2405.02008	null
2024-05-02	Development of Skip Connection in Deep Neural Networks for Computer Vision and Medical Image Analysis: A Survey	Guoping Xu et.al.	2405.01725	link
2024-05-02	Explainable AI (XAI) in Image Segmentation in Medicine, Industry, and Beyond: A Survey	Rokas Gipiškis et.al.	2405.01636	null
2024-05-02	CromSS: Cross-modal pre-training with noisy labels for remote sensing image segmentation	Chenying Liu et.al.	2405.01217	null
2024-05-02	Uncertainty-aware self-training with expectation maximization basis transformation	Zijia Wang et.al.	2405.01175	null
2024-05-01	Exploring Self-Supervised Vision Transformers for Deepfake Detection: A Comparative Analysis	Huy H. Nguyen et.al.	2405.00355	null
2024-04-30	Masked Multi-Query Slot Attention for Unsupervised Object Discovery	Rishav Pramanik et.al.	2404.19654	link
2024-04-30	DELINE8K: A Synthetic Data Pipeline for the Semantic Segmentation of Historical Documents	Taylor Archibald et.al.	2404.19259	null
2024-04-29	Swin2-MoSE: A New Single Image Super-Resolution Model for Remote Sensing	Leonardo Rossi et.al.	2404.18924	link
2024-04-29	IPixMatch: Boost Semi-supervised Semantic Segmentation with Inter-Pixel Relation	Kebin Wu et.al.	2404.18891	null
2024-04-29	Towards Long-term Robotics in the Wild	Stephen Hausler et.al.	2404.18477	null
2024-04-27	Multi-Stream Cellular Test-Time Adaptation of Real-Time Models Evolving in Dynamic Environments	Benoît Gérin et.al.	2404.17930	link
2024-04-27	GLIMS: Attention-Guided Lightweight Multi-Scale Hybrid Network for Volumetric Semantic Segmentation	Ziya Ata Yazıcı et.al.	2404.17854	link
2024-04-27	CLFT: Camera-LiDAR Fusion Transformer for Semantic Segmentation in Autonomous Driving	Junyi Gu et.al.	2404.17793	link
2024-04-26	Optimizing Universal Lesion Segmentation: State Space Model-Guided Hierarchical Networks with Feature Importance Adjustment	Kazi Shahriar Sanjid et.al.	2404.17235	null
2024-04-25	Calculation of Femur Caput Collum Diaphyseal angle for X-Rays images using Semantic Segmentation	Deepak Bhatia et.al.	2404.17083	null
2024-04-25	Boosting Unsupervised Semantic Segmentation with Principal Mask Proposals	Oliver Hahn et.al.	2404.16818	link
2024-04-26	Multi-Scale Representations by Varying Window Attention for Semantic Segmentation	Haotian Yan et.al.	2404.16573	link
2024-04-25	360SFUDA++: Towards Source-free UDA for Panoramic Segmentation by Learning Reliable Category Prototypes	Xu Zheng et.al.	2404.16501	null
2024-04-25	Semantic Segmentation Refiner for Ultrasound Applications with Zero-Shot Foundation Models	Hedda Cohen Indelman et.al.	2404.16325	null
2024-04-25	Style Adaptation for Domain-adaptive Semantic Segmentation	Ting Li et.al.	2404.16301	null
2024-04-29	A Multi-objective Optimization Benchmark Test Suite for Real-time Semantic Segmentation	Yifan Zhao et.al.	2404.16266	link
2024-04-24	3D Freehand Ultrasound using Visual Inertial and Deep Inertial Odometry for Measuring Patellar Tracking	Russell Buchanan et.al.	2404.15847	null
2024-04-24	Vision Transformer-based Adversarial Domain Adaptation	Yahan Li et.al.	2404.15817	link
2024-04-22	OccFeat: Self-supervised Occupancy Feature Prediction for Pretraining BEV Segmentation Networks	Sophia Sirko-Galouchenko et.al.	2404.14027	link
2024-04-21	Semantic-Rearrangement-Based Multi-Level Alignment for Domain Generalized Segmentation	Guanlong Jiao et.al.	2404.13701	null
2024-04-21	PV-S3: Advancing Automatic Photovoltaic Defect Detection using Semi-Supervised Semantic Segmentation of Electroluminescence Images	Abhishek Jha et.al.	2404.13693	null
2024-04-21	A Complete System for Automated 3D Semantic-Geometric Mapping of Corrosion in Industrial Environments	Rui Pimentel de Figueiredo et.al.	2404.13691	null
2024-04-21	LMFNet: An Efficient Multimodal Fusion Approach for Semantic Segmentation in High-Resolution Remote Sensing	Tong Wang et.al.	2404.13659	null
2024-04-21	Towards Unified Representation of Multi-Modal Pre-training for 3D Understanding via Differentiable Rendering	Ben Fei et.al.	2404.13619	null
2024-04-20	AMMUNet: Multi-Scale Attention Map Merging for Remote Sensing Image Segmentation	Yang Yang et.al.	2404.13408	link
2024-04-19	BACS: Background Aware Continual Semantic Segmentation	Mostafa ElAraby et.al.	2404.13148	link
2024-04-19	Foundation Model assisted Weakly Supervised LiDAR Semantic Segmentation	Yilong Chen et.al.	2404.12861	null
2024-04-19	COIN: Counterfactual inpainting for weakly supervised semantic segmentation for medical images	Dmytro Shvetsov et.al.	2404.12832	link
2024-04-19	A Point-Based Approach to Efficient LiDAR Multi-Task Perception	Christopher Lang et.al.	2404.12798	null
2024-04-19	Generalized Few-Shot Meets Remote Sensing: Discovering Novel Classes in Land Cover Mapping via Hybrid Semantic Segmentation Framework	Zhuohong Li et.al.	2404.12721	link
2024-04-19	Improving Prediction Accuracy of Semantic Segmentation Methods Using Convolutional Autoencoder Based Pre-processing Layers	Hisashi Shimodaira et.al.	2404.12718	null
2024-04-19	Show and Grasp: Few-shot Semantic Segmentation for Robot Grasping through Zero-shot Foundation Models	Leonardo Barcellona et.al.	2404.12717	null
2024-04-18	A Perspective on Deep Vision Performance with Standard Image and Video Codecs	Christoph Reich et.al.	2404.12330	null
2024-04-18	Deep Gaussian mixture model for unsupervised image segmentation	Matthias Schwab et.al.	2404.12252	link
2024-04-18	Observation, Analysis, and Solution: Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training	Jin Gao et.al.	2404.12210	link
2024-04-18	How to Benchmark Vision Foundation Models for Semantic Segmentation?	Tommie Kerssies et.al.	2404.12172	link
2024-04-19	Tendency-driven Mutual Exclusivity for Weakly Supervised Incremental Semantic Segmentation	Chongjie Si et.al.	2404.11981	null
2024-04-18	Group-On: Boosting One-Shot Segmentation with Supportive Query	Hanjing Zhou et.al.	2404.11871	null
2024-04-17	Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach	Mir Rayat Imtiaz Hossain et.al.	2404.11732	null
2024-04-17	A Semantic Segmentation-guided Approach for Ground-to-Aerial Image Matching	Francesco Pro et.al.	2404.11302	link
2024-04-17	Learning from Unlabelled Data with Transformers: Domain Adaptation for Semantic Segmentation of High Resolution Aerial Images	Nikolaos Dionelis et.al.	2404.11299	link
2024-04-16	A Concise Tiling Strategy for Preserving Spatial Context in Earth Observation Imagery	Ellianna Abrahams et.al.	2404.10927	link
2024-04-16	Vocabulary-free Image Classification and Semantic Segmentation	Alessandro Conti et.al.	2404.10864	link
2024-04-16	Gasformer: A Transformer-based Architecture for Segmenting Methane Emissions from Livestock in Optical Gas Imaging	Toqi Tahamid Sarker et.al.	2404.10841	link
2024-04-16	Learning Feature Inversion for Multi-class Anomaly Detection under General-purpose COCO-AD Benchmark	Jiangning Zhang et.al.	2404.10760	link
2024-04-16	ECLAIR: A High-Fidelity Aerial LiDAR Dataset for Semantic Segmentation	Iaroslav Melekhov et.al.	2404.10699	link
2024-04-16	Contextrast: Contextual Contrastive Learning for Semantic Segmentation	Changki Sung et.al.	2404.10633	null
2024-04-16	Label merge-and-split: A graph-colouring approach for memory-efficient brain parcellation	Aaron Kujawa et.al.	2404.10572	null
2024-04-16	LAECIPS: Large Vision Model Assisted Adaptive Edge-Cloud Collaboration for IoT-based Perception System	Shijing Hu et.al.	2404.10498	null
2024-04-16	Adversarial Identity Injection for Semantic Face Image Synthesis	Giuseppe Tarollo et.al.	2404.10408	null
2024-04-16	Domain-Rectifying Adapter for Cross-Domain Few-Shot Segmentation	Jiapeng Su et.al.	2404.10322	link
2024-04-16	Learnable Prompt for Few-Shot Semantic Segmentation in Remote Sensing Domain	Steve Andreas Immanuel et.al.	2404.10307	link
2024-04-15	Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL	Fangwei Zhong et.al.	2404.09857	null
2024-04-15	In-Context Translation: Towards Unifying Image Recognition, Processing, and Generation	Han Xue et.al.	2404.09633	null
2024-04-15	The revenge of BiSeNet: Efficient Multi-Task Image Segmentation	Gabriele Rosi et.al.	2404.09570	null
2024-04-16	Human-in-the-Loop Segmentation of Multi-species Coral Imagery	Scarlett Raine et.al.	2404.09406	link
2024-04-14	Bridging Data Islands: Geographic Heterogeneity-Aware Federated Learning for Collaborative Remote Sensing Semantic Segmentation	Jieyi Tan et.al.	2404.09292	null
2024-04-12	Analyzing Decades-Long Environmental Changes in Namibia Using Archival Aerial Photography and Deep Learning	Girmaw Abebe Tadesse et.al.	2404.08544	null
2024-04-12	LaSagnA: Language-based Segmentation Assistant for Complex Queries	Cong Wei et.al.	2404.08506	link
2024-04-12	Tackling Ambiguity from Perspective of Uncertainty Inference and Affinity Diversification for Weakly Supervised Semantic Segmentation	Zhiwei Yang et.al.	2404.08195	link
2024-04-12	Pay Attention to Your Neighbours: Training-Free Open-Vocabulary Semantic Segmentation	Sina Hajimiri et.al.	2404.08181	link
2024-04-10	AI-Guided Feature Segmentation Techniques to Model Features from Single Crystal Diamond Growth	Rohan Reddy Mekala et.al.	2404.08017	null
2024-04-11	Exploiting Object-based and Segmentation-based Semantic Features for Deep Learning-based Indoor Scene Classification	Ricardo Pereira et.al.	2404.07739	null
2024-04-11	OpenTrench3D: A Photogrammetric 3D Point Cloud Dataset for Semantic Segmentation of Underground Utilities	Lasse H. Hansen et.al.	2404.07711	link
2024-04-11	Implicit and Explicit Language Guidance for Diffusion-based Visual Perception	Hefeng Wang et.al.	2404.07600	null
2024-04-11	Improving Shift Invariance in Convolutional Neural Networks with Translation Invariant Polyphase Sampling	Sourajit Saha et.al.	2404.07410	null
2024-04-10	AI-Guided Defect Detection Techniques to Model Single Crystal Diamond Growth	Rohan Reddy Mekala et.al.	2404.07306	null
2024-04-10	RESSCAL3D: Resolution Scalable 3D Semantic Segmentation of Point Clouds	Remco Royen et.al.	2404.06863	null
2024-04-10	O2V-Mapping: Online Open-Vocabulary Mapping with Neural Implicit Representation	Muer Tie et.al.	2404.06836	null
2024-04-10	Convolution-based Probability Gradient Loss for Semantic Segmentation	Guohang Shan et.al.	2404.06704	link
2024-04-09	Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation	Luca Barsellotti et.al.	2404.06542	null
2024-04-09	QueSTMaps: Queryable Semantic Topological Maps for 3D Scene Understanding	Yash Mehan et.al.	2404.06442	null
2024-04-09	DaF-BEVSeg: Distortion-aware Fisheye Camera based Bird's Eye View Segmentation with Occlusion Reasoning	Senthil Yogamani et.al.	2404.06352	null
2024-04-09	Hierarchical Insights: Exploiting Structural Similarities for Reliable 3D Semantic Segmentation	Mariella Dreissig et.al.	2404.06124	null
2024-04-09	Improving Facial Landmark Detection Accuracy and Efficiency with Knowledge Distillation	Zong-Wei Hong et.al.	2404.06029	null
2024-04-08	Evaluating the Efficacy of Cut-and-Paste Data Augmentation in Semantic Segmentation for Satellite Imagery	Ionut M. Motoi et.al.	2404.05693	null
2024-04-08	AlignZeg: Mitigating Objective Misalignment for Zero-shot Semantic Segmentation	Jiannan Ge et.al.	2404.05667	null
2024-04-08	Impact of LiDAR visualisations on semantic segmentation of archaeological objects	Raveerat Jaturapitpornchai et.al.	2404.05512	null
2024-04-08	Rethinking the Spatial Inconsistency in Classifier-Free Diffusion Guidance	Dazhong Shen et.al.	2404.05384	link
2024-04-08	GPS-free Autonomous Navigation in Cluttered Tree Rows with Deep Semantic Segmentation	Alessandro Navone et.al.	2404.05338	null
2024-04-08	Human Detection from 4D Radar Data in Low-Visibility Field Conditions	Mikael Skog et.al.	2404.05307	null
2024-04-08	iVPT: Improving Task-relevant Information Sharing in Visual Prompt Tuning by Cross-layer Dynamic Connection	Nan Zhou et.al.	2404.05207	null
2024-04-08	UniMix: Towards Domain Adaptive and Generalizable LiDAR Semantic Segmentation in Adverse Weather	Haimei Zhao et.al.	2404.05145	null
2024-04-07	D2SL: Decouple Defogging and Semantic Learning for Foggy Domain-Adaptive Segmentation	Xuan Sun et.al.	2404.04807	null
2024-04-06	HawkDrive: A Transformer-driven Visual Perception System for Autonomous Driving in Night Scene	Ziang Guo et.al.	2404.04653	link
2024-04-05	Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation	Zifu Wan et.al.	2404.04256	link
2024-04-05	Image-Text Co-Decomposition for Text-Supervised Semantic Segmentation	Ji-Jia Wu et.al.	2404.04231	link
2024-04-05	MarsSeg: Mars Surface Semantic Segmentation with Multi-level Extractor and Connector	Junbo Li et.al.	2404.04155	null
2024-04-04	Language-Guided Instance-Aware Domain-Adaptive Panoptic Segmentation	Elham Amin Mansour et.al.	2404.03799	null
2024-04-04	Flattening the Parent Bias: Hierarchical Semantic Segmentation in the Poincaré Ball	Simon Weber et.al.	2404.03778	link
2024-04-04	Background Noise Reduction of Attention Map for Weakly Supervised Semantic Segmentation	Izumi Fujimori et.al.	2404.03394	null
2024-04-03	GPU-Accelerated RSF Level Set Evolution for Large-Scale Microvascular Segmentation	Meher Niger et.al.	2404.02813	null
2024-04-03	RS-Mamba for Large Remote Sensing Image Dense Prediction	Sijie Zhao et.al.	2404.02668	link
2024-04-03	A Satellite Band Selection Framework for Amazon Forest Deforestation Detection Task	Eduardo Neto et.al.	2404.02659	null
2024-04-03	SG-BEV: Satellite-Guided BEV Fusion for Cross-View Semantic Segmentation	Junyan Ye et.al.	2404.02638	link
2024-04-03	Active learning for efficient annotation in precision agriculture: a use-case on crop-weed semantic segmentation	Bart M. van Marrewijk et.al.	2404.02580	null
2024-04-03	HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras	Zhongyu Xia et.al.	2404.02517	link
2024-04-03	Optimizing traffic signs and lights visibility for the teleoperation of autonomous vehicles through ROI compression	I. Dror et.al.	2404.02481	null
2024-04-03	RS3Mamba: Visual State Space Model for Remote Sensing Images Semantic Segmentation	Xianping Ma et.al.	2404.02457	link
2024-04-02	Constrained Robotic Navigation on Preferred Terrains Using LLMs and Speech Instruction: Exploiting the Power of Adverbs	Faraz Lotfi et.al.	2404.02294	null
2024-04-02	Multi-Level Label Correction by Distilling Proximate Patterns for Semi-supervised Semantic Segmentation	Hui Xiao et.al.	2404.02065	null
2024-04-02	Synthetic Data for Robust Stroke Segmentation	Liam Chalcroft et.al.	2404.01946	link
2024-04-02	Improving Bird's Eye View Semantic Segmentation by Task Decomposition	Tianhao Zhao et.al.	2404.01925	null
2024-04-02	Samba: Semantic Segmentation of Remotely Sensed Images with State Space Model	Qinfeng Zhu et.al.	2404.01705	link
2024-04-02	Beyond Image Super-Resolution for Image Recognition with Task-Driven Perceptual Loss	Jaeha Kim et.al.	2404.01692	link
2024-04-01	PDF: A Probability-Driven Framework for Open World 3D Point Cloud Semantic Segmentation	Jinfeng Xu et.al.	2404.00979	link
2024-04-01	GOV-NeSF: Generalizable Open-Vocabulary Neural Semantic Fields	Yunsong Wang et.al.	2404.00931	link
2024-04-02	Rethinking Saliency-Guided Weakly-Supervised Semantic Segmentation	Beomyoung Kim et.al.	2404.00918	link
2024-03-31	Training-Free Semantic Segmentation via LLM-Supervision	Wenfang Sun et.al.	2404.00701	null
2024-03-31	LAESI: Leaf Area Estimation with Synthetic Imagery	Jacek Kałużny et.al.	2404.00593	null
2024-03-29	Modeling Weather Uncertainty for Multi-weather Co-Presence Estimation	Qi Bi et.al.	2403.20092	null
2024-03-29	MambaMixer: Efficient Selective State Space Models with Dual Token and Channel Selection	Ali Behrouz et.al.	2403.19888	null
2024-03-28	Segmentation Re-thinking Uncertainty Estimation Metrics for Semantic Segmentation	Qitian Ma et.al.	2403.19826	null
2024-03-28	ENet-21: An Optimized light CNN Structure for Lane Detection	Seyed Rasoul Hosseini et.al.	2403.19782	null
2024-03-29	Genetic Quantization-Aware Approximation for Non-Linear Operations in Transformers	Pingcheng Dong et.al.	2403.19591	link
2024-03-28	DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs	Donghyun Kim et.al.	2403.19588	link
2024-03-28	Learning Multiple Representations with Inconsistency-Guided Detail Regularization for Mask-Guided Matting	Weihao Jiang et.al.	2403.19213	null
2024-03-27	Lift3D: Zero-Shot Lifting of Any 2D Vision Model to 3D	Mukund Varma T et.al.	2403.18922	null
2024-03-27	I2CKD: Intra- and Inter-Class Knowledge Distillation for Semantic Segmentation	Ayoub Karine et.al.	2403.18490	null
2024-03-28	ViTAR: Vision Transformer with Any Resolution	Qihang Fan et.al.	2403.18361	null
2024-03-27	Generating Diverse Agricultural Data for Vision-Based Farming Applications	Mikolaj Cieslak et.al.	2403.18351	null
2024-03-27	Road Obstacle Detection based on Unknown Objectness Scores	Chihiro Noguchi et.al.	2403.18207	null
2024-03-26	The Need for Speed: Pruning Transformers with One Recipe	Samir Khaki et.al.	2403.17921	link
2024-03-26	Compressed Multi-task embeddings for Data-Efficient Downstream training and inference in Earth Observation	Carlos Gomes et.al.	2403.17886	link
2024-03-26	PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition	Chenhongyi Yang et.al.	2403.17695	link
2024-03-26	Integrating Mamba Sequence Model and Hierarchical Upsampling Network for Accurate Semantic Segmentation of Multiple Sclerosis Legion	Kazi Shahriar Sanjid et.al.	2403.17432	null
2024-03-25	Optimizing LiDAR Placements for Robust Driving Perception in Adverse Conditions	Ye Li et.al.	2403.17009	link
2024-03-25	DreamLIP: Language-Image Pre-training with Long Captions	Kecheng Zheng et.al.	2403.17007	link
2024-03-25	TwinLiteNetPlus: A Stronger Model for Real-time Drivable Area and Lane Segmentation	Quang-Huy Che et.al.	2403.16958	null
2024-03-25	HPL-ESS: Hybrid Pseudo-Labeling for Unsupervised Event-based Semantic Segmentation	Linglin Jing et.al.	2403.16788	null
2024-03-25	SatSynth: Augmenting Image-Mask Pairs through Diffusion Models for Aerial Semantic Segmentation	Aysim Toker et.al.	2403.16605	null
2024-03-25	Self-Supervised Learning for Medical Image Data with Anatomy-Oriented Imaging Planes	Tianwei Zhang et.al.	2403.16499	null
2024-03-25	GoodSAM: Bridging Domain and Capacity Gaps via Segment Anything Model for Distortion-aware Panoramic Semantic Segmentation	Weiming Zhang et.al.	2403.16370	null
2024-03-24	Dual-modal Prior Semantic Guided Infrared and Visible Image Fusion for Intelligent Transportation System	Jing Li et.al.	2403.16227	null
2024-03-24	Segment Anything Model for Road Network Graph Extraction	Congrui Hetang et.al.	2403.16051	link
2024-03-24	SM2C: Boost the Semi-supervised Segmentation for Medical Image by using Meta Pseudo Labels and Mixed Images	Yifei Wang et.al.	2403.16009	null
2024-03-22	Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting	Jun Guo et.al.	2403.15624	null
2024-03-22	A2DMN: Anatomy-Aware Dilated Multiscale Network for Breast Ultrasound Semantic Segmentation	Kyle Lucke et.al.	2403.15560	null
2024-03-22	InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding	Yi Wang et.al.	2403.15377	link
2024-03-22	Anytime, Anywhere, Anyone: Investigating the Feasibility of Segment Anything Model for Crowd-Sourcing Medical Image Annotations	Pranav Kulkarni et.al.	2403.15218	link
2024-03-22	Your Image is My Video: Reshaping the Receptive Field via Image-To-Video Differentiable AutoAugmentation and Fusion	Sofia Casarin et.al.	2403.15194	null
2024-03-22	Improve Cross-domain Mixed Sampling with Guidance Training for Adaptive Segmentation	Wenlve Zhou et.al.	2403.14995	link
2024-03-21	WeatherProof: Leveraging Language Guidance for Semantic Segmentation in Adverse Weather	Blake Gella et.al.	2403.14874	null
2024-03-21	Learning to Project for Cross-Task Knowledge Distillation	Dylan Auty et.al.	2403.14494	null
2024-03-21	OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation	Bohao Peng et.al.	2403.14418	link
2024-03-21	Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models	Pablo Marcos-Manchón et.al.	2403.14291	link
2024-03-21	OTSeg: Multi-prompt Sinkhorn Attention for Zero-Shot Semantic Segmentation	Kwanyoung Kim et.al.	2403.14183	link
2024-03-21	Evidential Semantic Mapping in Off-road Environments with Uncertainty-aware Bayesian Kernel Inference	Junyoung Kim et.al.	2403.14138	null
2024-03-21	Soft Masked Transformer for Point Cloud Processing with Skip Attention-Based Upsampling	Yong He et.al.	2403.14124	null
2024-03-21	Semantics from Space: Satellite-Guided Thermal Semantic Segmentation Annotation for Aerial Field Robots	Connor Lee et.al.	2403.14056	null
2024-03-20	When Cars meet Drones: Hyperbolic Federated Learning for Source-Free Domain Adaptation in Adverse Weather	Giulia Rizzoli et.al.	2403.13762	null
2024-03-20	Next day fire prediction via semantic segmentation	Konstantinos Alexis et.al.	2403.13545	null
2024-03-20	MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining	Di Wang et.al.	2403.13430	link
2024-03-20	AMCO: Adaptive Multimodal Coupling of Vision and Proprioception for Quadruped Robot Navigation in Outdoor Environments	Mohamed Elnoor et.al.	2403.13235	null
2024-03-20	Modeling the Label Distributions for Weakly-Supervised Semantic Segmentation	Linshan Wu et.al.	2403.13225	link
2024-03-19	Reflectivity Is All You Need!: Advancing LiDAR Semantic Segmentation	Kasi Viswanath et.al.	2403.13188	link
2024-03-19	As Firm As Their Foundations: Can open-sourced foundation models be used to create adversarial examples for downstream tasks?	Anjun Hu et.al.	2403.12693	null
2024-03-19	PCT: Perspective Cue Training Framework for Multi-Camera BEV Segmentation	Haruya Ishikawa et.al.	2403.12530	null
2024-03-19	Semantics, Distortion, and Style Matter: Towards Source-free UDA for Panoramic Segmentation	Xu Zheng et.al.	2403.12505	null
2024-03-18	Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation	Wangbo Zhao et.al.	2403.11808	link
2024-03-18	LSKNet: A Foundation Lightweight Backbone for Remote Sensing	Yuxuan Li et.al.	2403.11735	link
2024-03-18	TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models	Lisa Weijler et.al.	2403.11691	null
2024-03-18	OurDB: Ouroboric Domain Bridging for Multi-Target Domain Adaptive Semantic Segmentation	Seungbeom Woo et.al.	2403.11582	null
2024-03-18	MCD: Diverse Large-Scale Multi-Campus Dataset for Robot Perception	Thien-Minh Nguyen et.al.	2403.11496	null
2024-03-18	Uncertainty-Calibrated Test-Time Model Adaptation without Forgetting	Mingkui Tan et.al.	2403.11491	null
2024-03-17	TAG: Guidance-free Open-Vocabulary Semantic Segmentation	Yasufumi Kawano et.al.	2403.11197	link
2024-03-17	MaskDiffusion: Exploiting Pre-trained Diffusion Models for Semantic Segmentation	Yasufumi Kawano et.al.	2403.11194	link
2024-03-17	DuPL: Dual Student with Trustworthy Progressive Learning for Robust Weakly Supervised Semantic Segmentation	Yuanchen Wu et.al.	2403.11184	link
2024-03-17	LERENet: Eliminating Intra-class Differences for Metal Surface Defect Few-shot Semantic Segmentation	Hanze Ding et.al.	2403.11122	null
2024-03-17	Adaptive Semantic-Enhanced Denoising Diffusion Probabilistic Model for Remote Sensing Image Super-Resolution	Jialu Sui et.al.	2403.11078	link
2024-03-17	Intelligent Railroad Grade Crossing: Leveraging Semantic Segmentation and Object Detection for Enhanced Safety	Al Amin et.al.	2403.11060	null
2024-03-15	FeatUp: A Model-Agnostic Framework for Features at Any Resolution	Stephanie Fu et.al.	2403.10516	link
2024-03-15	Real-Time Image Segmentation via Hybrid Convolutional-Transformer Architecture Search	Hongyuan Yu et.al.	2403.10413	link
2024-03-15	Region-aware Distribution Contrast: A Novel Approach to Multi-Task Partially Supervised Learning	Meixuan Li et.al.	2403.10252	null
2024-03-15	Exploring Optical Flow Inclusion into nnU-Net Framework for Surgical Instrument Segmentation	Marcos Fernández-Rodríguez et.al.	2403.10216	null
2024-03-15	TransLandSeg: A Transfer Learning Approach for Landslide Semantic Segmentation Based on Vision Foundation Model	Changhong Hou et.al.	2403.10127	null
2024-03-15	Visual Foundation Models Boost Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation	Jingyi Xu et.al.	2403.10001	link
2024-03-14	WeakSurg: Weakly supervised surgical instrument segmentation using temporal equivariance and semantic continuity	Qiyuan Wang et.al.	2403.09551	null
2024-03-14	Annotation Free Semantic Segmentation with Vision Foundation Models	Soroush Seifi et.al.	2403.09307	null
2024-03-14	When Semantic Segmentation Meets Frequency Aliasing	Linwei Chen et.al.	2403.09065	link
2024-03-13	CART: Caltech Aerial RGB-Thermal Dataset in the Wild	Connor Lee et.al.	2403.08997	link
2024-03-13	SLCF-Net: Sequential LiDAR-Camera Fusion for Semantic Scene Completion using a 3D Recurrent U-Net	Helin Cao et.al.	2403.08885	null
2024-03-13	Segmentation of Knee Bones for Osteoarthritis Assessment: A Comparative Analysis of Supervised, Few-Shot, and Zero-Shot Learning Approaches	Yun Xin Teoh et.al.	2403.08761	null
2024-03-13	Real-time 3D semantic occupancy prediction for autonomous vehicles using memory-efficient sparse convolution	Samuel Sze et.al.	2403.08748	null
2024-03-13	Semantic Segmentation of Solar Radio Spikes at Low Frequencies	Pearse C. Murphy et.al.	2403.08546	null
2024-03-13	Language-Driven Visual Consensus for Zero-Shot Semantic Segmentation	Zicheng Zhang et.al.	2403.08426	null
2024-03-13	LIX: Implicitly Infusing Spatial Geometric Prior Knowledge into Visual Semantic Segmentation for Autonomous Driving	Sicen Guo et.al.	2403.08215	null
2024-03-13	Multiscale Low-Frequency Memory Network for Improved Feature Extraction in Convolutional Neural Networks	Fuzhi Wu et.al.	2403.08157	link
2024-03-12	Mitigating the Impact of Attribute Editing on Face Recognition	Sudipta Banerjee et.al.	2403.08092	null
2024-03-12	Hunting Attributes: Context Prototype-Aware Learning for Weakly Supervised Semantic Segmentation	Feilong Tang et.al.	2403.07630	link
2024-03-12	PeLK: Parameter-efficient Large Kernel ConvNets with Peripheral Convolution	Honghao Chen et.al.	2403.07589	null
2024-03-12	Open-World Semantic Segmentation Including Class Similarity	Matteo Sodano et.al.	2403.07532	link
2024-03-11	Average Calibration Error: A Differentiable Loss for Improved Reliability in Image Segmentation	Theodore Barfoot et.al.	2403.06759	link
2024-03-11	Forest Inspection Dataset for Aerial Semantic Segmentation and Depth Estimation	Bianca-Cerasela-Zelia Blaga et.al.	2403.06621	link
2024-03-11	OMH: Structured Sparsity via Optimally Matched Hierarchy for Unsupervised Semantic Segmentation	Baran Ozaydin et.al.	2403.06546	null
2024-03-11	3D Semantic Segmentation-Driven Representations for 3D Object Detection	Hayeon O et.al.	2403.06501	link
2024-03-11	Point Mamba: A Novel Point Cloud Backbone Based on State Space Model with Octree-Based Ordering Strategy	Jiuming Liu et.al.	2403.06467	link
2024-03-11	Towards the Uncharted: Density-Descending Feature Perturbation for Semi-supervised Semantic Segmentation	Xiaoyang Wang et.al.	2403.06462	null
2024-03-11	Refining Segmentation On-the-Fly: An Interactive Framework for Point Cloud Semantic Segmentation	Peng Zhang et.al.	2403.06401	null
2024-03-10	Style Blind Domain Generalized Semantic Segmentation via Covariance Alignment and Semantic Consistence Contrastive Learning	Woo-Jin Ahn et.al.	2403.06122	link
2024-03-09	Mask-Enhanced Segment Anything Model for Tumor Lesion Semantic Segmentation	Hairong Shi et.al.	2403.05912	link
2024-03-08	Attention-guided Feature Distillation for Semantic Segmentation	Amir M. Mansourian et.al.	2403.05451	link
2024-03-08	Generalized Correspondence Matching via Flexible Hierarchical Refinement and Patch Descriptor Distillation	Yu Han et.al.	2403.05388	null
2024-03-08	Frequency-Adaptive Dilated Convolution for Semantic Segmentation	Linwei Chen et.al.	2403.05369	link
2024-03-08	Embedded Deployment of Semantic Segmentation in Medicine through Low-Resolution Inputs	Erik Ostrowski et.al.	2403.05340	null
2024-03-08	LVIC: Multi-modality segmentation by Lifting Visual Info as Cue	Zichao Dong et.al.	2403.05159	null
2024-03-06	ECAP: Extensive Cut-and-Paste Augmentation for Unsupervised Domain Adaptive Semantic Segmentation	Erik Brorsson et.al.	2403.03854	link
2024-03-06	Multi-Grained Cross-modal Alignment for Learning Open-vocabulary Semantic Segmentation from Text Supervision	Yajie Liu et.al.	2403.03707	null
2024-03-06	Causal Prototype-inspired Contrast Adaptation for Unsupervised Domain Adaptive Semantic Segmentation of High-resolution Remote Sensing Imagery	Jingru Zhu et.al.	2403.03704	null
2024-03-06	GSNeRF: Generalizable Semantic Neural Radiance Fields with Enhanced 3D Scene Understanding	Zi-Ting Chou et.al.	2403.03608	null
2024-03-06	Multi-task Learning for Real-time Autonomous Driving Leveraging Task-adaptive Attention Generator	Wonhyeok Choi et.al.	2403.03468	null
2024-03-05	Improved LiDAR Odometry and Mapping using Deep Semantic Segmentation and Novel Outliers Detection	Mohamed Afifi et.al.	2403.03111	null
2024-03-05	ActiveAD: Planning-Oriented Active Learning for End-to-End Autonomous Driving	Han Lu et.al.	2403.02877	null
2024-03-05	DDF: A Novel Dual-Domain Image Fusion Strategy for Remote Sensing Image Semantic Segmentation with Unsupervised Domain Adaptation	Lingyan Ran et.al.	2403.02784	null
2024-03-08	Learning without Exact Guidance: Updating Large-scale High-resolution Land Cover Maps from Low-resolution Historical Labels	Zhuohong Li et.al.	2403.02746	link
2024-03-05	FastOcc: Accelerating 3D Occupancy Prediction by Fusing the 2D Bird's-Eye View and Perspective View	Jiawei Hou et.al.	2403.02710	null
2024-03-05	Deep Common Feature Mining for Efficient Video Semantic Segmentation	Yaoyan Zheng et.al.	2403.02689	null
2024-03-04	Self-Supervised Facial Representation Learning with Facial Region Awareness	Zheng Gao et.al.	2403.02138	null
2024-03-04	Semi-Supervised Semantic Segmentation Based on Pseudo-Labels: A Survey	Lingyan Ran et.al.	2403.01909	null
2024-03-04	Map-aided annotation for pole base detection	Benjamin Missaoui et.al.	2403.01868	null
2024-03-04	AllSpark: Reborn Labeled Features from Unlabeled in Transformer for Semi-Supervised Semantic Segmentation	Haonan Wang et.al.	2403.01818	link
2024-03-02	Benchmarking Segmentation Models with Mask-Preserved Attribute Editing	Zijin Yin et.al.	2403.01231	link
2024-03-02	Auxiliary Tasks Enhanced Dual-affinity Learning for Weakly Supervised Semantic Segmentation	Lian Xu et.al.	2403.01156	null
2024-03-01	Rethinking Few-shot 3D Point Cloud Semantic Segmentation	Zhaochong An et.al.	2403.00592	link
2024-03-01	Small, Versatile and Mighty: A Range-View Perception Framework	Qiang Meng et.al.	2403.00325	null
2024-03-01	YOLO-MED: Multi-Task Interaction Network for Biomedical Images	Suizhi Huang et.al.	2403.00245	null
2024-02-29	FusionVision: A comprehensive approach of 3D object reconstruction and segmentation from RGB-D cameras using YOLO and fast segment anything	Safouane El Ghazouali et.al.	2403.00175	link
2024-02-29	RSAM-Seg: A SAM-based Approach with Prior Knowledge Integration for Remote Sensing Image Semantic Segmentation	Jie Zhang et.al.	2402.19004	null
2024-02-28	Spatial Coherence Loss for Salient and Camouflaged Object Detection and Beyond	Ziyun Yang et.al.	2402.18698	null
2024-02-29	Separate and Conquer: Decoupling Co-occurrence via Decomposition and Representation for Weakly Supervised Semantic Segmentation	Zhiwei Yang et.al.	2402.18467	link
2024-02-29	A Modular System for Enhanced Robustness of Multimedia Understanding Networks via Deep Parametric Estimation	Francesco Barbato et.al.	2402.18402	link
2024-02-28	Enhancing Roadway Safety: LiDAR-based Tree Clearance Analysis	Miriam Louise Carnot et.al.	2402.18309	null
2024-02-28	Self-Supervised Learning in Electron Microscopy: Towards a Foundation Model for Advanced Image Analysis	Bashir Kazimi et.al.	2402.18286	null
2024-02-28	PRCL: Probabilistic Representation Contrastive Learning for Semi-Supervised Semantic Segmentation	Haoyu Xie et.al.	2402.18117	null
2024-02-28	Spannotation: Enhancing Semantic Segmentation for Autonomous Navigation with Efficient Image Annotation	Samuel O. Folorunsho et.al.	2402.18084	link
2024-02-27	Weakly Supervised Co-training with Swapping Assignments for Semantic Segmentation	Xinyu Yang et.al.	2402.17891	link
2024-02-27	Mitigating Distributional Shift in Semantic Segmentation via Uncertainty Estimation from Unlabelled Data	David S. W. Williams et.al.	2402.17653	null
2024-02-27	Masked Gamma-SSL: Learning Uncertainty Estimation via Masked Image Modeling	David S. W. Williams et.al.	2402.17622	null
2024-02-27	A Large-scale Evaluation of Pretraining Paradigms for the Detection of Defects in Electroluminescence Solar Cell Images	David Torpey et.al.	2402.17611	null
2024-02-27	Scribble Hides Class: Promoting Scribble-Based Weakly-Supervised Semantic Segmentation with Its Class Label	Xinliang Zhang et.al.	2402.17555	link
2024-02-26	ConSept: Continual Semantic Segmentation via Adapter-based Vision Transformer	Bowen Dong et.al.	2402.16674	null
2024-02-26	UN-SAM: Universal Prompt-Free Segmentation for Generalized Nuclei Images	Zhen Chen et.al.	2402.16663	link
2024-02-26	Placing Objects in Context via Inpainting for Out-of-distribution Segmentation	Pau de Jorge et.al.	2402.16392	link
2024-02-26	BLO-SAM: Bi-level Optimization Based Overfitting-Preventing Finetuning of SAM	Li Zhang et.al.	2402.16338	link
2024-02-23	Modified CycleGAN for the synthesization of samples for wheat head segmentation	Jaden Myers et.al.	2402.15135	null
2024-02-22	Semantic Image Synthesis with Unconditional Generator	Jungwoo Chae et.al.	2402.14395	null
2024-02-22	Think before You Leap: Content-Aware Low-Cost Edge-Assisted Video Semantic Segmentation	Mingxuan Yan et.al.	2402.14326	null
2024-02-21	Tumor segmentation on whole slide images: training or prompting?	Huaqian Wu et.al.	2402.13932	null
2024-02-26	BenchCloudVision: A Benchmark Analysis of Deep Learning Approaches for Cloud Detection and Segmentation in Remote Sensing Imagery	Loddo Fabio et.al.	2402.13918	link
2024-02-21	Zero-BEV: Zero-shot Projection of Any First-Person Modality to BEV Maps	Gianluca Monaci et.al.	2402.13848	null
2024-02-21	Generalizable Semantic Vision Query Generation for Zero-shot Panoptic and Semantic Segmentation	Jialei Chen et.al.	2402.13697	null
2024-02-20	Cross-Domain Transfer Learning with CoRTe: Consistent and Reliable Transfer from Black-Box to Lightweight Segmentation Model	Claudia Cuttano et.al.	2402.13122	null
2024-02-19	LangXAI: Integrating Large Vision Models for Generating Textual Explanations to Enhance Explainability in Visual Perception Tasks	Truong Thanh Hung Nguyen et.al.	2402.12525	link
2024-02-19	Towards Explainable LiDAR Point Cloud Semantic Segmentation via Gradient Based Target Localization	Abhishek Kuriyal et.al.	2402.12098	link
2024-02-19	ISCUTE: Instance Segmentation of Cables Using Text Embedding	Shir Kozlovsky et.al.	2402.11996	null
2024-02-18	Key Patch Proposer: Key Patches Contain Rich Information	Jing Xu et.al.	2402.11458	link
2024-02-17	ChatEarthNet: A Global-Scale, High-Quality Image-Text Dataset for Remote Sensing	Zhenghang Yuan et.al.	2402.11325	link
2024-02-17	A Decoding Scheme with Successive Aggregation of Multi-Level Features for Light-Weight Semantic Segmentation	Jiwon Yoo et.al.	2402.11201	null
2024-02-16	HistoSegCap: Capsules for Weakly-Supervised Semantic Segmentation of Histological Tissue Type in Whole Slide Images	Mobina Mansoori et.al.	2402.10851	null
2024-02-16	Selective Prediction for Semantic Segmentation using Post-Hoc Confidence Estimation and Its Performance under Distribution Shift	Bruno Laboissiere Camargos Borges et.al.	2402.10665	null
2024-02-16	Efficient Multi-task Uncertainties for Joint Semantic Segmentation and Monocular Depth Estimation	Steven Landgraf et.al.	2402.10580	null
2024-02-15	Is Continual Learning Ready for Real-world Challenges?	Theodora Kontogianni et.al.	2402.10130	null
2024-02-15	Robust semi-automatic vessel tracing in the human retinal image by an instance segmentation neural network	Siyi Chen et.al.	2402.10055	null
2024-02-15	MM-Point: Multi-View Information-Enhanced Multi-Modal Self-Supervised 3D Point Cloud Understanding	Hai-Tao Yu et.al.	2402.10002	link
2024-02-14	Automated Plaque Detection and Agatston Score Estimation on Non-Contrast CT Scans: A Multicenter Study	Andrew M. Nguyen et.al.	2402.09569	null
2024-02-14	Reducing Texture Bias of Deep Neural Networks via Edge Enhancing Diffusion	Edgar Heinert et.al.	2402.09530	null
2024-02-13	Adaptive Hierarchical Certification for Segmentation using Randomized Smoothing	Alaa Anani et.al.	2402.08400	link
2024-02-13	Improving Image Coding for Machines through Optimizing Encoder via Auxiliary Loss	Kei Iino et.al.	2402.08267	null
2024-02-12	Semantic segmentation for recognition of epileptiform patterns recorded via Microelectrode Arrays in vitro	Gabriel Galeote-Checa et.al.	2402.08099	null
2024-02-11	Data Quality Aware Approaches for Addressing Model Drift of Semantic Segmentation Models	Samiha Mirza et.al.	2402.07258	null
2024-02-09	More than the Sum of Its Parts: Ensembling Backbone Networks for Few-Shot Segmentation	Nico Catalano et.al.	2402.06581	null
2024-02-09	Hybridnet for depth estimation and semantic segmentation	Dalila Sánchez-Escobedo et.al.	2402.06539	null
2024-02-09	Classifying point clouds at the facade-level using geometric features and deep learning networks	Yue Tan et.al.	2402.06506	link
2024-02-09	ControlUDA: Controllable Diffusion-assisted Unsupervised Domain Adaptation for Cross-Weather Semantic Segmentation	Fengyi Shen et.al.	2402.06446	null
2024-02-08	Early Fusion of Features for Semantic Segmentation	Anupam Gupta et.al.	2402.06091	null
2024-02-08	Privacy-Preserving Synthetic Continual Semantic Segmentation for Robotic Surgery	Mengya Xu et.al.	2402.05860	link
2024-02-08	On the Effect of Image Resolution on Semantic Segmentation	Ritambhara Singh et.al.	2402.05398	null
2024-02-07	Multi-Scale Semantic Segmentation with Modified MBConv Blocks	Xi Chen et.al.	2402.04618	null
2024-02-06	Energy-based Domain-Adaptive Segmentation with Depth Guidance	Jinjing Zhu et.al.	2402.03795	null
2024-02-05	SGS-SLAM: Semantic Gaussian Splatting For Neural Dense SLAM	Mingrui Li et.al.	2402.03246	null
2024-02-05	RRWNet: Recursive Refinement Network for Effective Retinal Artery/Vein Segmentation and Classification	José Morano et.al.	2402.03166	link
2024-02-05	Unsupervised semantic segmentation of high-resolution UAV imagery for road scene parsing	Zihan Ma et.al.	2402.02985	link
2024-02-04	M$^3$Face: A Unified Multi-Modal Multilingual Framework for Human Face Generation and Editing	Mohammadreza Mofayezi et.al.	2402.02369	null
2024-02-04	Exploring Intrinsic Properties of Medical Images for Self-Supervised Binary Semantic Segmentation	Pranav Singh et.al.	2402.02367	null
2024-02-04	Region-Based Representations Revisited	Michal Shlapentokh-Rothman et.al.	2402.02352	link
2024-02-03	Multi-Level Feature Aggregation and Recursive Alignment Network for Real-Time Semantic Segmentation	Yanhua Zhang et.al.	2402.02286	link
2024-02-03	Revisiting Generative Adversarial Networks for Binary Semantic Segmentation on Imbalanced Datasets	Lei Xu et.al.	2402.02245	link
2024-02-03	Evaluating the Robustness of Off-Road Autonomous Driving Segmentation against Adversarial Attacks: A Dataset-Centric analysis	Pankaj Deoli et.al.	2402.02154	link
2024-02-03	Decomposition-based and Interference Perception for Infrared and Visible Image Fusion in Complex Scenes	Xilai Li et.al.	2402.02096	null
2024-02-02	Convolution kernel adaptation to calibrated fisheye	Bruno Berenguel-Baeta et.al.	2402.01456	link
2024-02-02	Delving into Decision-based Black-box Attacks on Semantic Segmentation	Zhaoyu Chen et.al.	2402.01220	null
2024-02-02	Scale Equalization for Multi-Level Feature Fusion	Bum Jun Kim et.al.	2402.01149	null
2024-02-01	We're Not Using Videos Effectively: An Updated Domain Adaptive Video Segmentation Baseline	Simar Kareer et.al.	2402.00868	link
2024-02-01	Automatic Segmentation of the Spinal Cord Nerve Rootlets	Jan Valosek et.al.	2402.00724	link
2024-02-01	A Framework for Building Point Cloud Cleaning, Plane Detection and Semantic Segmentation	Ilyass Abouelaziz et.al.	2402.00692	null
2024-01-31	Convolution Meets LoRA: Parameter Efficient Finetuning for Segment Anything Model	Zihan Zhong et.al.	2401.17868	link
2024-01-31	Leveraging Swin Transformer for Local-to-Global Weakly Supervised Semantic Segmentation	Rozhan Ahmadi et.al.	2401.17828	link
2024-02-01	Tiered approach for rapid damage characterisation of infrastructure enabled by remote sensing and deep learning technologies	Nadiia Kopiika et.al.	2401.17759	null
2024-01-31	Towards Image Semantics and Syntax Sequence Learning	Chun Tao et.al.	2401.17515	null
2024-01-30	Evaluation of Out-of-Distribution Detection Performance on Autonomous Driving Datasets	Jens Henriksson et.al.	2401.17013	null
2024-01-30	CAFCT: Contextual and Attentional Feature Fusions of Convolutional Neural Networks and Transformer for Liver Tumor Segmentation	Ming Kang et.al.	2401.16886	null
2024-01-29	Bridging Generative and Discriminative Models for Unified Visual Perception with Diffusion Priors	Shiyin Dong et.al.	2401.16459	null
2024-01-28	SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks	Serdar Erisen et.al.	2401.15741	link
2024-01-28	UP-CrackNet: Unsupervised Pixel-Wise Road Crack Detection via Adversarial Image Restoration	Nachuan Ma et.al.	2401.15647	null
2024-01-27	Vanishing-Point-Guided Video Semantic Segmentation of Driving Scenes	Diandian Guo et.al.	2401.15261	link
2024-01-26	Biological Valuation Map of Flanders: A Sentinel-2 Imagery Analysis	Mingshi Li et.al.	2401.15223	null
2024-01-26	Kitchen Food Waste Image Segmentation and Classification for Compost Nutrients Estimation	Raiyan Rahman et.al.	2401.15175	null
2024-01-26	SSR: SAM is a Strong Regularizer for domain adaptive semantic segmentation	Yanqi Ge et.al.	2401.14686	null
2024-01-25	CloudTracks: A Dataset for Localizing Ship Tracks in Satellite Images of Clouds	Muhammad Ahmed Chaudhry et.al.	2401.14486	null
2024-01-25	Unlocking Past Information: Temporal Embeddings in Cooperative Bird's Eye View Prediction	Dominik Rößle et.al.	2401.14325	null
2024-01-24	Segment Any Cell: A SAM-based Auto-prompting Fine-tuning Framework for Nuclei Segmentation	Saiyang Na et.al.	2401.13220	null
2024-01-24	Boundary and Relation Distillation for Semantic Segmentation	Dong Zhang et.al.	2401.13174	null
2024-01-23	DatUS^2: Data-driven Unsupervised Semantic Segmentation with Pre-trained Self-supervised Vision Transformer	Sonal Kumar et.al.	2401.12820	link
2024-01-23	Self-Supervised Vision Transformers Are Efficient Segmentation Learners for Imperfect Labels	Seungho Lee et.al.	2401.12535	null
2024-01-23	Self-supervised Learning of LiDAR 3D Point Clouds via 2D-3D Neural Calibration	Yifan Zhang et.al.	2401.12452	null
2024-01-22	Scaling Up Quantization-Aware Neural Architecture Search for Efficient Deep Learning on the Edge	Yao Lu et.al.	2401.12350	null
2024-01-22	Exploring Simple Open-Vocabulary Semantic Segmentation	Zihang Lai et.al.	2401.12217	link
2024-01-22	Out-of-Distribution Detection & Applications With Ablated Learned Temperature Energy	Will LeVine et.al.	2401.12129	link
2024-01-22	HomeRobot Open Vocabulary Mobile Manipulation Challenge 2023 Participant Report (Team KuzHum)	Volodymyr Kuzma et.al.	2401.12048	null
2024-01-22	SemPLeS: Semantic Prompt Learning for Weakly-Supervised Semantic Segmentation	Ci-Siang Lin et.al.	2401.11791	null
2024-01-22	EmerDiff: Emerging Pixel-level Semantic Knowledge in Diffusion Models	Koichi Namekata et.al.	2401.11739	null
2024-01-22	MetaSeg: Content-Aware Meta-Net for Omni-Supervised Semantic Segmentation	Shenwang Jiang et.al.	2401.11738	null

(back to top)

Instance Segmentation

Publish Date	Title	Authors	PDF	Code
2024-08-20	Vocabulary-Free 3D Instance Segmentation with Vision and Language Assistant	Guofeng Mei et.al.	2408.10652	null
2024-08-21	LSVOS Challenge 3rd Place Report: SAM2 and Cutie based VOS	Xinyu Liu et.al.	2408.10469	null
2024-08-19	Leveraging Superfluous Information in Contrastive Representation Learning	Xuechu Yu et.al.	2408.10292	null
2024-08-19	3D-Aware Instance Segmentation and Tracking in Egocentric Videos	Yash Bhalgat et.al.	2408.09860	null
2024-08-18	VrdONE: One-stage Video Visual Relation Detection	Xinjie Jiang et.al.	2408.09408	link
2024-08-17	GoodSAM++: Bridging Domain and Capacity Gaps via Segment Anything Model for Panoramic Semantic Segmentation	Weiming Zhang et.al.	2408.09115	null
2024-08-16	Zero-Shot Dual-Path Integration Framework for Open-Vocabulary 3D Instance Segmentation	Tri Ton et.al.	2408.08591	null
2024-08-16	Tuning a SAM-Based Model with Multi-Cognitive Visual Adapter to Remote Sensing Instance Segmentation	Linghao Zheng et.al.	2408.08576	null
2024-08-16	Tell Codec What Worth Compressing: Semantically Disentangled Image Coding for Machine with LMMs	Jinming Liu et.al.	2408.08575	null
2024-08-15	5%>100%: Breaking Performance Shackles of Full Fine-Tuning on Visual Recognition Tasks	Dongshuo Yin et.al.	2408.08345	link
2024-08-09	Assessment of Cell Nuclei AI Foundation Models in Kidney Pathology	Junlin Guo et.al.	2408.06381	link
2024-08-13	Performance Evaluation of YOLOv8 Model Configurations, for Instance Segmentation of Strawberry Fruit Development Stages in an Open Field Environment	Abdul-Razak Alhassan Gamani et.al.	2408.05661	null
2024-08-08	Embodied Uncertainty-Aware Object Segmentation	Xiaolin Fang et.al.	2408.04760	null
2024-08-08	Robust Approximate Characterization of Single-Cell Heterogeneity in Microbial Growth	Richard D. Paul et.al.	2408.04501	link
2024-08-07	CAS-ViT: Convolutional Additive Self-attention Vision Transformers for Efficient Mobile Applications	Tianfang Zhang et.al.	2408.03703	link
2024-08-07	SAM2-PATH: A better segment anything model for semantic segmentation in digital pathology	Mingya Zhang et.al.	2408.03651	link
2024-08-06	Evaluation of Segment Anything Model 2: The Role of SAM2 in the Underwater Environment	Shijie Lian et.al.	2408.02924	link
2024-08-09	NuLite -- Lightweight and Fast Model for Nuclei Instance Segmentation and Classification	Cristian Tommasino et.al.	2408.01797	link
2024-08-02	Amodal Segmentation for Laparoscopic Surgery Video Instruments	Ruohua Shi et.al.	2408.01067	null
2024-08-01	Leaf Angle Estimation using Mask R-CNN and LETR Vision Transformer	Venkat Margapuri et.al.	2408.00749	null
2024-08-01	A Simple Background Augmentation Method for Object Detection with Diffusion Model	Yuhang Li et.al.	2408.00350	null
2024-07-31	Con4m: Context-aware Consistency Learning Framework for Segmented Time Series Classification	Junru Chen et.al.	2408.00041	null
2024-07-31	MaskUno: Switch-Split Block For Enhancing Instance Segmentation	Jawad Haidar et.al.	2407.21498	null
2024-08-02	Depth-Wise Convolutions in Vision Transformers for Efficient Training on Small Datasets	Tianxiao Zhang et.al.	2407.19394	link
2024-07-26	A Survey on Cell Nuclei Instance Segmentation and Classification: Leveraging Context and Attention	João D. Nunes et.al.	2407.18673	null
2024-07-25	LKCell: Efficient Cell Nuclei Instance Segmentation with Large Convolution Kernels	Ziwei Cui et.al.	2407.18054	link
2024-07-26	Quality Assured: Rethinking Annotation Strategies in Imaging AI	Tim Rädsch et.al.	2407.17596	null
2024-07-24	McGAN: Generating Manufacturable Designs by Embedding Manufacturing Rules into Conditional Generative Adversarial Network	Zhichao Wang et.al.	2407.16943	null
2024-07-22	Enhancing Cell Instance Segmentation in Scanning Electron Microscopy Images via a Deep Contour Closing Operator	Florian Robert et.al.	2407.15817	null
2024-07-19	Panoptic Segmentation of Mammograms with Text-To-Image Diffusion Model	Kun Zhao et.al.	2407.14326	null
2024-07-19	Scale Disparity of Instances in Interactive Point Cloud Segmentation	Chenrui Han et.al.	2407.14009	null
2024-07-18	GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model	Abdelrahman Shaker et.al.	2407.13772	link
2024-07-17	AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer	Zhuguanyu Wu et.al.	2407.12951	link
2024-07-17	Close the Sim2real Gap via Physically-based Structured Light Synthetic Data Simulation	Kaixin Bai et.al.	2407.12449	null
2024-07-17	Serialized Point Mamba: A Serialized Point Cloud Mamba Segmentation Model	Tao Wang et.al.	2407.12319	null
2024-07-16	SGIFormer: Semantic-guided and Geometric-enhanced Interleaving Transformer for 3D Instance Segmentation	Lei Yao et.al.	2407.11564	null
2024-07-19	Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes	Zhi Cai et.al.	2407.11464	link
2024-07-16	Generative AI Driven Task-Oriented Adaptive Semantic Communications	Yuzhou Fu et.al.	2407.11354	null
2024-07-15	M18K: A Comprehensive RGB-D Dataset and Benchmark for Mushroom Detection and Instance Segmentation	Abdollah Zakeri et.al.	2407.11275	link
2024-07-14	Part2Object: Hierarchical Unsupervised 3D Instance Segmentation	Cheng Shi et.al.	2407.10084	link
2024-07-12	WSESeg: Introducing a Dataset for the Segmentation of Winter Sports Equipment with a Baseline for Interactive Segmentation	Robin Schön et.al.	2407.09288	null
2024-07-11	SLoRD: Structural Low-Rank Descriptors for Shape Consistency in Vertebrae Segmentation	Xin You et.al.	2407.08555	null
2024-07-10	MambaVision: A Hybrid Mamba-Transformer Vision Backbone	Ali Hatamizadeh et.al.	2407.08083	link
2024-07-12	Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation	Hao Fang et.al.	2407.07427	link
2024-07-09	Improved Block Merging for 3D Point Cloud Instance Segmentation	Leon Denis et.al.	2407.06991	null
2024-07-09	Joint prototype and coefficient prediction for 3D instance segmentation	Remco Royen et.al.	2407.06958	null
2024-07-08	Training-free CryoET Tomogram Segmentation	Yizhou Zhao et.al.	2407.06833	link
2024-07-05	Rethinking Visual Prompting for Multimodal Large Language Models with External Knowledge	Yuanze Lin et.al.	2407.04681	null
2024-07-11	Slice-100K: A Multimodal Dataset for Extrusion-based 3D Printing	Anushrut Jignasu et.al.	2407.04180	null
2024-07-04	Performance of Medical Image Fusion in High-level Analysis Tasks: A Mutual Enhancement Framework for Unaligned PAT and MRI Image Fusion	Yutian Zhong et.al.	2407.03992	link
2024-07-03	Context-Aware Video Instance Segmentation	Seunghun Lee et.al.	2407.03010	link
2024-07-03	ADFQ-ViT: Activation-Distribution-Friendly Post-Training Quantization for Vision Transformers	Yanfeng Jiang et.al.	2407.02763	null
2024-07-02	LiDAR-based HD Map Localization using Semantic Generalized ICP with Road Marking Detection	Yansong Gong et.al.	2407.02061	null
2024-07-02	Multi-Grained Contrast for Data-Efficient Unsupervised Representation Learning	Chengchao Shen et.al.	2407.02014	link
2024-07-01	PanopticRecon: Leverage Open-vocabulary Instance Segmentation for Zero-shot Panoptic Reconstruction	Xuan Yu et.al.	2407.01349	null
2024-07-01	Robot Instance Segmentation with Few Annotations for Grasping	Moshe Kimhi et.al.	2407.01302	link
2024-06-28	PM-VIS+: High-Performance Video Instance Segmentation without Video Annotation	Zhangjing Yang et.al.	2406.19665	link
2024-07-01	3D Feature Distillation with Object-Centric Priors	Georgios Tziafas et.al.	2406.18742	null
2024-06-26	CoDA: Interactive Segmentation and Morphological Analysis of Dendroid Structures Exemplified on Stony Cold-Water Corals	Kira Schmitt et.al.	2406.18236	link
2024-06-25	Semi-supervised classification of dental conditions in panoramic radiographs using large language model and instance segmentation: A real-world dataset evaluation	Bernardo Silva et.al.	2406.17915	null
2024-06-25	Depth-Guided Semi-Supervised Instance Segmentation	Xin Chen et.al.	2406.17413	null
2024-06-25	XAMI -- A Benchmark Dataset for Artefact Detection in XMM-Newton Optical Images	Elisabeta-Iulia Dima et.al.	2406.17323	link
2024-06-24	GMT: Guided Mask Transformer for Leaf Instance Segmentation	Feng Chen et.al.	2406.17109	null
2024-06-24	Instance Consistency Regularization for Semi-Supervised 3D Instance Segmentation	Yizheng Wu et.al.	2406.16776	link
2024-06-23	CholecInstanceSeg: A Tool Instance Segmentation Dataset for Laparoscopic Surgery	Oluwatosin Alabi et.al.	2406.16039	link
2024-06-22	Fine-grained Background Representation for Weakly Supervised Semantic Segmentation	Xu Yin et.al.	2406.15755	link
2024-06-21	TraceNet: Segment one thing efficiently	Mingyuan Wu et.al.	2406.14874	null
2024-06-19	3D Instance Segmentation Using Deep Learning on RGB-D Indoor Data	Siddiqui Muhammad Yasir et.al.	2406.14581	null
2024-06-20	2nd Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation	Bin Cao et.al.	2406.13939	null
2024-06-18	Competitive Learning for Achieving Content-specific Filters in Video Coding for Machines	Honglei Zhang et.al.	2406.12367	null
2024-06-17	OoDIS: Anomaly Instance Segmentation Benchmark	Alexey Nekrasov et.al.	2406.11835	link
2024-06-18	Benchmarking Label Noise in Instance Segmentation: Spatial Noise Matters	Eden Grad et.al.	2406.10891	link
2024-06-15	MMVR: Millimeter-wave Multi-View Radar Dataset and Benchmark for Indoor Perception	M. Mahbubur Rahman et.al.	2406.10708	link
2024-06-14	4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities	Roman Bachmann et.al.	2406.09406	null
2024-06-12	2nd Place Solution for MOSE Track in CVPR 2024 PVUW workshop: Complex Video Object Segmentation	Zhensong Xu et.al.	2406.08192	null
2024-06-11	PanoSSC: Exploring Monocular Panoptic 3D Scene Reconstruction for Autonomous Driving	Yining Shi et.al.	2406.07037	null
2024-06-11	RS-DFM: A Remote Sensing Distributed Foundation Model for Diverse Downstream Tasks	Zhechao Wang et.al.	2406.07032	null
2024-06-11	Dual Thinking and Perceptual Analysis of Deep Learning Models using Human Adversarial Examples	Kailas Dayanandan et.al.	2406.06967	link
2024-06-11	UVIS: Unsupervised Video Instance Segmentation	Shuaiyi Huang et.al.	2406.06908	null
2024-06-10	Diving into Underwater: Segment Anything Model Guided Underwater Salient Instance Segmentation and A Large-scale Dataset	Shijie Lian et.al.	2406.06039	link
2024-06-09	Scaling Graph Convolutions for Mobile Vision	William Avery et.al.	2406.05850	link
2024-06-08	1st Place Winner of the 2024 Pixel-level Video Understanding in the Wild (CVPR'24 PVUW) Challenge in Video Panoptic Segmentation and Best Long Video Consistency of Video Semantic Segmentation	Qingfeng Liu et.al.	2406.05352	null
2024-06-07	Nacala-Roof-Material: Drone Imagery for Roof Detection, Classification, and Segmentation to Support Mosquito-borne Disease Risk Assessment	Venkanna Babu Guthula et.al.	2406.04949	null
2024-06-06	Instance Segmentation and Teeth Classification in Panoramic X-rays	Devichand Budagam et.al.	2406.03747	link
2024-06-04	Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation	Mohamed El Amine Boudjoghra et.al.	2406.02548	link
2024-06-04	Generative Active Learning for Long-tailed Instance Segmentation	Muzhi Zhu et.al.	2406.02435	link
2024-06-03	MultiPly: Reconstruction of Multiple People from Monocular Video in the Wild	Zeren Jiang et.al.	2406.01595	null
2024-06-03	An expert-driven data generation pipeline for histological images	Roberto Basla et.al.	2406.01403	link
2024-06-03	MP-PolarMask: A Faster and Finer Instance Segmentation for Concave Images	Ke-Lei Wang et.al.	2406.01356	null
2024-06-03	SceneTextGen: Layout-Agnostic Scene Text Image Synthesis with Diffusion Models	Qilong Zhangli et.al.	2406.01062	null
2024-06-05	From Seedling to Harvest: The GrowingSoy Dataset for Weed Detection in Soy Crops via Instance Segmentation	Raul Steinmetz et.al.	2406.00313	link
2024-06-04	Extreme Point Supervised Instance Segmentation	Hyeonjun Lee et.al.	2405.20729	null
2024-05-29	Reasoning3D -- Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models	Tianrun Chen et.al.	2405.19326	null
2024-05-28	Adapting Pre-Trained Vision Models for Novel Instance Detection and Segmentation	Yangxiao Lu et.al.	2405.17859	link
2024-05-26	Understanding the Effect of using Semantically Meaningful Tokens for Visual Representation Learning	Neha Kalibhat et.al.	2405.16401	null
2024-05-25	Video Prediction Models as General Visual Encoders	James Maier et.al.	2405.16382	null
2024-05-25	Efficient Temporal Action Segmentation via Boundary-aware Query Voting	Peiyao Wang et.al.	2405.15995	link
2024-05-24	Autonomous Quilt Spreading for Caregiving Robots	Yuchun Guo et.al.	2405.15373	null
2024-05-23	Harmony: A Joint Self-Supervised and Weakly-Supervised Framework for Learning General Purpose Visual Representations	Mohammed Baharoon et.al.	2405.14239	link
2024-05-22	PerSense: Personalized Instance Segmentation in Dense Images	Muhammad Ibraheem Siddiqui et.al.	2405.13518	null
2024-05-22	Unsupervised Pre-training with Language-Vision Prompts for Low-Data Instance Segmentation	Dingwen Zhang et.al.	2405.13388	link
2024-05-22	Semantic Equitable Clustering: A Simple, Fast and Effective Strategy for Vision Transformer	Qihang Fan et.al.	2405.13337	link
2024-05-22	Vision Transformer with Sparse Scan Prior	Qihang Fan et.al.	2405.13335	link
2024-05-20	Improving the Explain-Any-Concept by Introducing Nonlinearity to the Trainable Surrogate Model	Mounes Zaval et.al.	2405.11837	null
2024-05-19	Unifying 3D Vision-Language Understanding via Promptable Queries	Ziyu Zhu et.al.	2405.11442	null
2024-05-18	PS6D: Point Cloud Based Symmetry-Aware 6D Object Pose Estimation in Robot Bin-Picking	Yifan Yang et.al.	2405.11257	null
2024-05-16	DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data	Chengxiang Fan et.al.	2405.10185	link
2024-05-22	Synth-to-Real Unsupervised Domain Adaptation for Instance Segmentation	Yachan Guo et.al.	2405.09682	null
2024-05-13	PLUTO: Pathology-Universal Transformer	Dinkar Juyal et.al.	2405.07905	null
2024-05-12	PotatoGANs: Utilizing Generative Adversarial Networks, Instance Segmentation, and Explainable AI for Enhanced Potato Disease Identification and Classification	Mohammad Shafiul Alam et.al.	2405.07332	link
2024-05-11	Global Motion Understanding in Large-Scale Video Object Segmentation	Volodymyr Fedynyak et.al.	2405.07031	null
2024-05-10	GreedyViG: Dynamic Axial Graph Construction for Efficient Vision GNNs	Mustafa Munir et.al.	2405.06849	link
2024-05-13	CSA-Net: Channel-wise Spatially Autocorrelated Attention Networks	Nick Nikzad et.al.	2405.05755	null
2024-05-07	A Self-Supervised Method for Body Part Segmentation and Keypoint Detection of Rat Images	László Kopácsi et.al.	2405.04650	null
2024-05-07	AugmenTory: A Fast and Flexible Polygon Augmentation Library	Tanaz Ghahremani et.al.	2405.04442	link
2024-05-06	PTQ4SAM: Post-Training Quantization for Segment Anything	Chengtao Lv et.al.	2405.03144	link
2024-05-03	Towards general deep-learning-based tree instance segmentation models	Jonathan Henrich et.al.	2405.02061	null
2024-04-30	UniFS: Universal Few-shot Instance Perception with Point Representations	Sheng Jin et.al.	2404.19401	link
2024-04-29	From Density to Geometry: YOLOv8 Instance Segmentation for Reverse Engineering of Optimized Structures	Thomas Rochefort-Beaudoin et.al.	2404.18763	link
2024-04-28	Garbage Segmentation and Attribute Analysis by Robotic Dogs	Nuo Xu et.al.	2404.18112	null
2024-04-25	Self-Balanced R-CNN for Instance Segmentation	Leonardo Rossi et.al.	2404.16633	link
2024-05-04	Unknown Object Grasping for Assistive Robotics	Elle Miller et.al.	2404.15001	null
2024-04-22	Surgical-DeSAM: Decoupling SAM for Instrument Segmentation in Robotic Surgery	Yuyang Sheng et.al.	2404.14040	link
2024-04-22	PM-VIS: High-Performance Box-Supervised Video Instance Segmentation	Zhangjing Yang et.al.	2404.13863	null
2024-04-27	FisheyeDetNet: 360° Surround view Fisheye Camera based Object Detection System for Autonomous Driving	Ganesh Sistu et.al.	2404.13443	null
2024-04-19	Nuclei Instance Segmentation of Cryosectioned H&E Stained Histological Images using Triple U-Net Architecture	Zarif Ahmed et.al.	2404.12986	null
2024-04-19	FipTR: A Simple yet Effective Transformer Framework for Future Instance Prediction in Autonomous Driving	Xingtai Gui et.al.	2404.12867	link
2024-04-18	Spot-Compose: A Framework for Open-Vocabulary Object Retrieval and Drawer Manipulation in Point Clouds	Oliver Lemke et.al.	2404.12440	null
2024-04-18	Performance Evaluation of Segment Anything Model with Variational Prompting for Application to Non-Visible Spectrum Imagery	Yona Falinie A. Gaus et.al.	2404.12285	null
2024-04-17	Mushroom Segmentation and 3D Pose Estimation from Point Clouds using Fully Convolutional Geometric Features and Implicit Pose Encoding	George Retsinas et.al.	2404.12144	link
2024-04-18	The devil is in the object boundary: towards annotation-free instance segmentation using Foundation Models	Cheng Shi et.al.	2404.11957	link
2024-04-17	Criteria for Uncertainty-based Corner Cases Detection in Instance Segmentation	Florian Heidecker et.al.	2404.11266	null
2024-04-12	SEVD: Synthetic Event-based Vision Dataset for Ego and Fixed Traffic Perception	Manideep Reddy Aliminati et.al.	2404.10540	link
2024-04-15	NOISe: Nuclei-Aware Osteoclast Instance Segmentation for Mouse-to-Human Domain Transfer	Sai Kumar Reddy Manne et.al.	2404.10130	link
2024-04-12	Structured Model Pruning for Efficient Inference in Computational Pathology	Mohammed Adnan et.al.	2404.08831	null
2024-04-12	Benchmarking the Cell Image Segmentation Models Robustness under the Microscope Optical Aberrations	Boyuan Peng et.al.	2404.08549	null
2024-04-12	Let It Flow: Simultaneous Optimization of 3D Flow and Object Clustering	Patrik Vacek et.al.	2404.08363	link
2024-04-12	AdaContour: Adaptive Contour Descriptor with Hierarchical Representation	Tianyu Ding et.al.	2404.08292	link
2024-04-11	ViM-UNet: Vision Mamba for Biomedical Segmentation	Anwai Archit et.al.	2404.07705	link
2024-04-09	Automated National Urban Map Extraction	Hasan Nasrallah et.al.	2404.06202	null
2024-04-06	Panoptic Perception: A Novel Task and Fine-grained Dataset for Universal Remote Sensing Image Interpretation	Danpei Zhao et.al.	2404.04608	null
2024-04-04	Language-Guided Instance-Aware Domain-Adaptive Panoptic Segmentation	Elham Amin Mansour et.al.	2404.03799	null
2024-04-04	OW-VISCap: Open-World Video Instance Segmentation and Captioning	Anwesa Choudhuri et.al.	2404.03657	null
2024-04-04	CORP: A Multi-Modal Dataset for Campus-Oriented Roadside Perception Tasks	Beibei Wang et.al.	2404.03191	null
2024-04-02	Segment Any 3D Object with Language	Seungjun Lee et.al.	2404.02157	null
2024-04-01	What is Point Supervision Worth in Video Instance Segmentation?	Shuaiyi Huang et.al.	2404.01990	null
2024-04-01	SUGAR: Pre-training 3D Visual Representations for Robotics	Shizhe Chen et.al.	2404.01491	null
2024-04-01	Teeth-SEG: An Efficient Instance Segmentation Framework for Orthodontic Treatment based on Anthropic Prior Knowledge	Bo Zou et.al.	2404.01013	null
2024-04-01	Instance-Aware Group Quantization for Vision Transformers	Jaehyeon Moon et.al.	2404.00928	null
2024-03-29	Multi-Region Transfer Learning for Segmentation of Crop Field Boundaries in Satellite Images with Limited Labels	Hannah Kerner et.al.	2404.00179	null
2024-03-29	FISBe: A real-world benchmark dataset for instance segmentation of long-range thin filamentous structures	Lisa Mais et.al.	2404.00130	null
2024-03-29	ECLIPSE: Efficient Continual Learning in Panoptic Segmentation with Visual Prompt Tuning	Beomyoung Kim et.al.	2403.20126	link
2024-04-01	Efficient 3D Instance Mapping and Localization with Neural Fields	George Tang et.al.	2403.19797	null
2024-03-28	DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs	Donghyun Kim et.al.	2403.19588	link
2024-03-27	Annolid: Annotate, Segment, and Track Anything You Need	Chen Yang et.al.	2403.18690	null
2024-03-26	Spectral Convolutional Transformer: Harmonizing Real vs. Complex Multi-View Spectral Operators for Vision Transformer	Badri N. Patro et.al.	2403.18063	link
2024-03-26	PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition	Chenhongyi Yang et.al.	2403.17695	link
2024-03-25	GoodSAM: Bridging Domain and Capacity Gaps via Segment Anything Model for Distortion-aware Panoramic Semantic Segmentation	Weiming Zhang et.al.	2403.16370	null
2024-03-24	AutoInst: Automatic Instance-Based Segmentation of LiDAR 3D Scans	Cedric Perauer et.al.	2403.16318	null
2024-03-22	Language-Based Depth Hints for Monocular Depth Estimation	Dylan Auty et.al.	2403.15551	null
2024-03-22	BSNet: Box-Supervised Simulation-assisted Mean Teacher for 3D Instance Segmentation	Jiahao Lu et.al.	2403.15019	link
2024-03-20	MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining	Di Wang et.al.	2403.13430	link
2024-03-19	CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation	Wenqi Zhu et.al.	2403.12455	link
2024-03-19	Multi-Object RANSAC: Efficient Plane Clustering Method in a Clutter	Seunghyeon Lim et.al.	2403.12449	null
2024-03-18	EffiPerception: an Efficient Framework for Various Perception Tasks	Xinhao Xiang et.al.	2403.12317	null
2024-03-18	Aerial Lifting: Neural Urban Semantic and Building Instance Lifting from Aerial Imagery	Yuqi Zhang et.al.	2403.11812	link
2024-03-18	Better (pseudo-)labels for semi-supervised instance segmentation	François Porcher et.al.	2403.11675	null
2024-03-18	Synthesizing multi-log grasp poses	Arvid Fälldin et.al.	2403.11623	null
2024-03-18	MISS: Memory-efficient Instance Segmentation Framework By Visual Inductive Priors Flow Propagation	Chih-Chung Hsu et.al.	2403.11576	null
2024-03-18	Augment Before Copy-Paste: Data and Memory Efficiency-Oriented Instance Segmentation Framework for Sport-scenes	Chih-Chung Hsu et.al.	2403.11572	null
2024-03-18	Circle Representation for Medical Instance Object Segmentation	Juming Xiong et.al.	2403.11507	link
2024-03-18	ShapeFormer: Shape Prior Visible-to-Amodal Transformer-based Amodal Instance Segmentation	Minh Tran et.al.	2403.11376	null
2024-03-16	Segment Any Object Model (SAOM): Real-to-Simulation Fine-Tuning Strategy for Multi-Class Multi-Instance Segmentation	Mariia Khan et.al.	2403.10780	null
2024-03-15	Grasp Anything: Combining Teacher-Augmented Policy Gradient Learning with Instance Segmentation to Grasp Arbitrary Objects	Malte Mosbach et.al.	2403.10187	null
2024-03-14	WeakSurg: Weakly supervised surgical instrument segmentation using temporal equivariance and semantic continuity	Qiyuan Wang et.al.	2403.09551	null
2024-03-14	StainFuser: Controlling Diffusion for Faster Neural Style Transfer in Multi-Gigapixel Histology Images	Robert Jewsbury et.al.	2403.09302	link
2024-03-14	Customizing Segmentation Foundation Model via Prompt Learning for Instance Segmentation	Hyung-Il Kim et.al.	2403.09199	null
2024-03-14	When Semantic Segmentation Meets Frequency Aliasing	Linwei Chen et.al.	2403.09065	link
2024-03-09	Segmentation Guided Sparse Transformer for Under-Display Camera Image Restoration	Jingyun Xue et.al.	2403.05906	null
2024-03-07	SAM-PD: How Far Can SAM Take Us in Tracking and Segmenting Anything in Videos by Prompt Denoising	Tao Zhou et.al.	2403.04194	link
2024-03-05	CenterDisks: Real-time instance segmentation with disk covering	Katia Jodogne-Del Litto et.al.	2403.03296	link
2024-03-04	RISeg: Robot Interactive Object Segmentation via Body Frame-Invariant Features	Howard H. Qian et.al.	2403.01731	null
2024-03-04	MCA: Moment Channel Attention Networks	Yangbo Jiang et.al.	2403.01713	link
2024-03-03	Self-Supervised Representation Learning with Meta Comprehensive Regularization	Huijie Guo et.al.	2403.01549	null
2024-03-03	End-to-End Human Instance Matting	Qinglin Liu et.al.	2403.01510	link
2024-03-02	Boosting Box-supervised Instance Segmentation with Pseudo Depth	Xinyi Yu et.al.	2403.01214	null
2024-02-29	FusionVision: A comprehensive approach of 3D object reconstruction and segmentation from RGB-D cameras using YOLO and fast segment anything	Safouane El Ghazouali et.al.	2403.00175	link
2024-02-28	Feature Denoising For Low-Light Instance Segmentation Using Weighted Non-Local Blocks	Joanne Lin et.al.	2402.18307	null
2024-02-27	A Vanilla Multi-Task Framework for Dense Visual Prediction Solution to 1st VCL Challenge -- Multi-Task Robustness Track	Zehui Chen et.al.	2402.17319	null
2024-02-27	Few-shot adaptation for morphology-independent cell instance segmentation	Ram J. Zaveri et.al.	2402.17165	null
2024-02-26	Outline-Guided Object Inpainting with Diffusion Models	Markus Pobitzer et.al.	2402.16421	null
2024-02-26	SPINEPS -- Automatic Whole Spine Segmentation of T2-weighted MR images using a Two-Phase Approach to Multi-class Semantic and Instance Segmentation	Hendrik Möller et.al.	2402.16368	link
2024-02-28	Few-Shot Learning for Annotation-Efficient Nucleus Instance Segmentation	Yu Ming et.al.	2402.16280	null
2024-02-27	ISCUTE: Instance Segmentation of Cables Using Text Embedding	Shir Kozlovsky et.al.	2402.11996	null
2024-02-19	Real-time 3D Semantic Scene Perception for Egocentric Robots with Binocular Vision	K. Nguyen et.al.	2402.11872	link
2024-02-17	ReViT: Enhancing Vision Transformers with Attention Residual Connections for Visual Recognition	Anxhelo Diko et.al.	2402.11301	link
2024-02-15	Robust semi-automatic vessel tracing in the human retinal image by an instance segmentation neural network	Siyi Chen et.al.	2402.10055	null
2024-02-15	SAWEC: Sensing-Assisted Wireless Edge Computing	Khandaker Foysal Haque et.al.	2402.10021	null
2024-02-14	Deep Rib Fracture Instance Segmentation and Classification from CT on the RibFrac Challenge	Jiancheng Yang et.al.	2402.09372	null
2024-02-14	TDViT: Temporal Dilated Video Transformer for Dense Video Tasks	Guanxiong Sun et.al.	2402.09257	link
2024-02-12	Complete Instances Mining for Weakly Supervised Instance Segmentation	Zecheng Li et.al.	2402.07633	link
2024-02-11	Improving Pallet Detection Using Synthetic Data	Henry Gann et.al.	2402.07098	null
2024-02-07	Boundary-aware Contrastive Learning for Semi-supervised Nuclei Instance Segmentation	Ye Zhang et.al.	2402.04756	link
2024-02-07	FM-Fusion: Instance-aware Semantic Mapping Boosted by Vision-Language Foundation Models	Chuhao Liu et.al.	2402.04555	null
2024-02-15	M2fNet: Multi-modal Forest Monitoring Network on Large-scale Virtual Dataset	Yawen Lu et.al.	2402.04534	null
2024-02-06	Multi-class Road Defect Detection and Segmentation using Spatial and Channel-wise Attention for Autonomous Road Repairing	Jongmin Yu et.al.	2402.04064	null
2024-02-06	SISP: A Benchmark Dataset for Fine-grained Ship Instance Segmentation in Panchromatic Satellite Images	Pengming Feng et.al.	2402.03708	link
2024-02-05	InstanceDiffusion: Instance-level Control for Image Generation	Xudong Wang et.al.	2402.03290	link
2024-02-05	Instance Segmentation XXL-CT Challenge of a Historic Airplane	Roland Gruber et.al.	2402.02928	null
2024-02-04	Spatio-temporal Prompting Network for Robust Video Feature Extraction	Guanxiong Sun et.al.	2402.02574	link
2024-02-06	Deep Spectral Improvement for Unsupervised Image Instance Segmentation	Farnoosh Arefi et.al.	2402.02474	link
2024-02-01	A Manifold Representation of the Key in Vision Transformers	Li Meng et.al.	2402.00534	null
2024-01-31	Shrub of a thousand faces: an individual segmentation from satellite images using deep learning	Rohaifa Khaldi et.al.	2401.17985	null
2024-02-02	YOLO-World: Real-Time Open-Vocabulary Object Detection	Tianheng Cheng et.al.	2401.17270	link
2024-01-29	SHViT: Single-Head Vision Transformer with Memory Efficient Macro Design	Seokju Yun et.al.	2401.16456	link
2024-01-28	SegmentAnyTree: A sensor and platform agnostic deep learning model for tree segmentation using laser scanning data	Maciej Wielgosz et.al.	2401.15739	null
2024-01-30	SAM-based instance segmentation models for the automation of structural damage detection	Zehao Ye et.al.	2401.15266	null
2024-01-25	CloudTracks: A Dataset for Localizing Ship Tracks in Satellite Images of Clouds	Muhammad Ahmed Chaudhry et.al.	2401.14486	null
2024-01-25	Rethinking Patch Dependence for Masked Autoencoders	Letian Fu et.al.	2401.14391	link
2024-01-25	On generalisability of segment anything model for nuclear instance segmentation in histology images	Kesi Xu et.al.	2401.14248	null
2024-01-31	UniM-OV3D: Uni-Modality Open-Vocabulary 3D Scene Understanding with Fine-Grained Feature Representation	Qingdong He et.al.	2401.11395	link
2024-01-19	One Step Learning, One Step Review	Xiaolong Huang et.al.	2401.10962	link
2024-01-18	A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting	Wouter Van Gansbeke et.al.	2401.10227	link
2024-01-19	Skeleton-Guided Instance Separation for Fine-Grained Segmentation in Microscopy	Jun Wang et.al.	2401.09895	null
2024-01-18	SEINE: Structure Encoding and Interaction Network for Nuclei Instance Segmentation	Ye Zhang et.al.	2401.09773	link
2024-01-18	Instance Brownian Bridge as Texts for Open-vocabulary Video Instance Segmentation	Zesen Cheng et.al.	2401.09732	link
2024-01-18	P2Seg: Pointly-supervised Segmentation via Mutual Distillation	Zipeng Wang et.al.	2401.09709	null
2024-01-25	SymTC: A Symbiotic Transformer-CNN Net for Instance Segmentation of Lumbar Spine MRI	Jiasong Chen et.al.	2401.09627	link
2024-01-17	Trapped in texture bias? A large scale comparison of deep instance segmentation	Johannes Theodoridis et.al.	2401.09109	link
2024-01-16	Segment Anything Model Can Not Segment Anything: Assessing AI Foundation Model's Generalizability in Permafrost Mapping	Wenwen Li et.al.	2401.08787	null

Panoptic Segmentation

Publish Date	Title	Authors	PDF	Code
2024-08-19	DiscoNeRF: Class-Agnostic Object Field for 3D Object Discovery	Corentin Dumery et.al.	2408.09928	null
2024-07-23	SAM-CP: Marrying SAM with Composable Prompts for Versatile Segmentation	Pengfei Chen et.al.	2407.16682	null
2024-07-23	Strike a Balance in Continual Panoptic Segmentation	Jinpeng Chen et.al.	2407.16354	link
2024-07-19	Panoptic Segmentation of Mammograms with Text-To-Image Diffusion Model	Kun Zhao et.al.	2407.14326	null
2024-07-19	MC-PanDA: Mask Confidence for Panoptic Domain Adaptation	Ivan Martinović et.al.	2407.14110	link
2024-07-15	OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models	Zijian Zhou et.al.	2407.11213	null
2024-07-12	A Fair Ranking and New Model for Panoptic Scene Graph Generation	Julian Lorenz et.al.	2407.09216	null
2024-07-12	From Easy to Hard: Learning Curricular Shape-aware Features for Robust Panoptic Scene Graph Generation	Hanrong Shi et.al.	2407.09191	null
2024-07-10	Panoptic Segmentation of Galactic Structures in LSB Images	Felix Richards et.al.	2407.07494	null
2024-07-03	Context-Aware Video Instance Segmentation	Seunghun Lee et.al.	2407.03010	link
2024-07-01	PanopticRecon: Leverage Open-vocabulary Instance Segmentation for Zero-shot Panoptic Reconstruction	Xuan Yu et.al.	2407.01349	null
2024-07-01	Gradient-based Class Weighting for Unsupervised Domain Adaptation in Dense Prediction Visual Tasks	Roberto Alcover-Couso et.al.	2407.01327	null
2024-06-14	Task-aligned Part-aware Panoptic Segmentation through Joint Object-Part Representations	Daan de Geus et.al.	2406.10114	link
2024-06-11	PanoSSC: Exploring Monocular Panoptic 3D Scene Reconstruction for Autonomous Driving	Yining Shi et.al.	2406.07037	null
2024-06-08	1st Place Winner of the 2024 Pixel-level Video Understanding in the Wild (CVPR'24 PVUW) Challenge in Video Panoptic Segmentation and Best Long Video Consistency of Video Semantic Segmentation	Qingfeng Liu et.al.	2406.05352	null
2024-06-07	3rd Place Solution for PVUW Challenge 2024: Video Panoptic Segmentation	Ruipu Wu et.al.	2406.04002	null
2024-06-01	2nd Place Solution for PVUW Challenge 2024: Video Panoptic Segmentation	Biao Wu et.al.	2406.00500	null
2024-05-29	A Good Foundation is Worth Many Labels: Label-Efficient Panoptic Segmentation	Niclas Vödisch et.al.	2405.19035	link
2024-05-23	Efficient Robot Learning for Perception and Mapping	Niclas Vödisch et.al.	2405.14688	null
2024-05-16	4D Panoptic Scene Graph Generation	Jingkang Yang et.al.	2405.10305	link
2024-05-16	An Integrated Framework for Multi-Granular Explanation of Video Summarization	Konstantinos Tsigos et.al.	2405.10082	null
2024-05-12	Building a Strong Pre-Training Baseline for Universal 3D Large-Scale Perception	Haoming Chen et.al.	2405.07201	link
2024-05-03	Panoptic-SLAM: Visual SLAM in Dynamic Environments using Panoptic Segmentation	Gabriel Fischer Abati et.al.	2405.02177	link
2024-04-28	Panoptic Segmentation and Labelling of Lumbar Spine Vertebrae using Modified Attention Unet	Rikathi Pal et.al.	2404.18291	null
2024-04-15	The revenge of BiSeNet: Efficient Multi-Task Image Segmentation	Gabriele Rosi et.al.	2404.09570	null
2024-04-15	kNN-CLIP: Retrieval Enables Training-Free Segmentation on Continually Expanding Large Vocabularies	Zhongrui Gui et.al.	2404.09447	null
2024-04-12	COCONut: Modernizing COCO Segmentation	Xueqing Deng et.al.	2404.08639	null
2024-04-04	Language-Guided Instance-Aware Domain-Adaptive Panoptic Segmentation	Elham Amin Mansour et.al.	2404.03799	null
2024-04-02	JRDB-PanoTrack: An Open-world Panoptic Segmentation and Tracking Robotic Dataset in Crowded Human Environments	Duy-Tho Le et.al.	2404.01686	null
2024-03-29	ECLIPSE: Efficient Continual Learning in Panoptic Segmentation with Visual Prompt Tuning	Beomyoung Kim et.al.	2403.20126	link
2024-03-29	Using Images as Covariates: Measuring Curb Appeal with Deep Learning	Ardyn Nordstrom et.al.	2403.19915	null
2024-03-21	PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model	Zheng Zhang et.al.	2403.14598	link
2024-03-14	PosSAM: Panoptic Open-vocabulary Segment Anything	Vibashan VS et.al.	2403.09620	link
2024-03-01	Small, Versatile and Mighty: A Range-View Perception Framework	Qiang Meng et.al.	2403.00325	null
2024-03-01	PEM: Prototype-based Efficient MaskFormer for Image Segmentation	Niccolò Cavagnero et.al.	2402.19422	link
2024-02-23	Benchmarking the Robustness of Panoptic Segmentation for Automated Driving	Yiting Wang et.al.	2402.15469	null
2024-02-21	Generalizable Semantic Vision Query Generation for Zero-shot Panoptic and Semantic Segmentation	Jialei Chen et.al.	2402.13697	null
2024-02-17	Semantically-aware Neural Radiance Fields for Visual Scene Understanding: A Comprehensive Review	Thang-Anh-Quan Nguyen et.al.	2402.11141	link
2024-02-04	Generalizable Entity Grounding via Assistance of Large Language Model	Lu Qi et.al.	2402.02555	null
2024-01-25	UrbanGenAI: Reconstructing Urban Landscapes using Panoptic Segmentation and Diffusion Models	Timo Kapsalis et.al.	2401.14379	null
2024-01-23	MUSES: The Multi-Sensor Semantic Perception Dataset for Driving under Uncertainty	Tim Brödermann et.al.	2401.12761	link
2024-01-23	Self-supervised Learning of LiDAR 3D Point Clouds via 2D-3D Neural Calibration	Yifan Zhang et.al.	2401.12452	null
2024-01-18	OMG-Seg: Is One Model Good Enough For All Segmentation?	Xiangtai Li et.al.	2401.10229	link
2024-01-18	RAP-SAM: Towards Real-Time All-Purpose Segment Anything	Shilin Xu et.al.	2401.10228	link
2024-01-18	A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting	Wouter Van Gansbeke et.al.	2401.10227	link
2024-02-07	Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering	Damien Robert et.al.	2401.06704	link
2024-01-18	UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding	Bowen Shi et.al.	2401.06397	null
2024-01-11	CoSSegGaussians: Compact and Swift Scene Segmenting 3D Gaussians	Bin Dou et.al.	2401.05925	null
2024-01-04	3D Open-Vocabulary Panoptic Segmentation with 2D-3D Vision-Language Distillation	Zihao Xiao et.al.	2401.02402	null
2023-12-28	Unsupervised Universal Image Segmentation	Dantong Niu et.al.	2312.17243	link

Object Detection

Publish Date	Title	Authors	PDF	Code
2024-08-20	A Closer Look at Data Augmentation Strategies for Finetuning-Based Low/Few-Shot Object Detection	Vladislav Li et.al.	2408.10940	null
2024-08-20	Aligning Object Detector Bounding Boxes with Human Preference	Ombretta Strafforello et.al.	2408.10844	null
2024-08-20	LightMDETR: A Lightweight Approach for Low-Cost Open-Vocabulary Object Detection Training	Binta Sow et.al.	2408.10787	null
2024-08-20	Just a Hint: Point-Supervised Camouflaged Object Detection	Huafeng Chen et.al.	2408.10777	null
2024-08-21	Generative AI in Industrial Machine Vision -- A Review	Hans Aoyang Zhou et.al.	2408.10775	null
2024-08-20	Detection of Intracranial Hemorrhage for Trauma Patients	Antoine P. Sanner et.al.	2408.10768	null
2024-08-20	SAM-COD: SAM-guided Unified Framework for Weakly-Supervised Camouflaged Object Detection	Huafeng Chen et.al.	2408.10760	null
2024-08-20	Leveraging Temporal Contexts to Enhance Vehicle-Infrastructure Cooperative Perception	Jiaru Zhong et.al.	2408.10531	null
2024-08-19	Leveraging Superfluous Information in Contrastive Representation Learning	Xuechu Yu et.al.	2408.10292	null
2024-08-19	SHARP: Segmentation of Hands and Arms by Range using Pseudo-Depth for Enhanced Egocentric 3D Hand Pose Estimation and Action Recognition	Wiktor Mucha et.al.	2408.10037	null
2024-08-19	Segment-Anything Models Achieve Zero-shot Robustness in Autonomous Driving	Jun Yan et.al.	2408.09839	link
2024-08-19	Latent Diffusion for Guided Document Table Generation	Syed Jawwad Haider Hamdani et.al.	2408.09800	null
2024-08-18	Adversarial Attacked Teacher for Unsupervised Domain Adaptive Object Detection	Kaiwen Wang et.al.	2408.09431	null
2024-08-18	Boundary-Recovering Network for Temporal Action Detection	Jihwan Kim et.al.	2408.09354	null
2024-08-18	YOLOv1 to YOLOv10: The fastest and most accurate real-time object detection systems	Chien-Yao Wang et.al.	2408.09332	null
2024-08-17	GSLAMOT: A Tracklet and Query Graph-based Simultaneous Locating, Mapping, and Multiple Object Tracking System	Shuo Wang et.al.	2408.09191	null
2024-08-17	PADetBench: Towards Benchmarking Physical Attacks against Object Detection	Jiawei Lian et.al.	2408.09181	link
2024-08-17	MaskBEV: Towards A Unified Framework for BEV Detection and Map Segmentation	Xiao Zhao et.al.	2408.09122	null
2024-08-17	Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community	Jiancheng Pan et.al.	2408.09110	null
2024-08-16	SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation	Xinyu Xiong et.al.	2408.08870	link
2024-08-16	Multimodal Relational Triple Extraction with Query-based Entity Object Transformer	Lei Hei et.al.	2408.08709	null
2024-08-16	Tell Codec What Worth Compressing: Semantically Disentangled Image Coding for Machine with LMMs	Jinming Liu et.al.	2408.08575	null
2024-08-15	5%>100%: Breaking Performance Shackles of Full Fine-Tuning on Visual Recognition Tasks	Dongshuo Yin et.al.	2408.08345	link
2024-08-15	Learned Multimodal Compression for Autonomous Driving	Hadi Hadizadeh et.al.	2408.08211	null
2024-08-16	OC3D: Weakly Supervised Outdoor 3D Object Detection with Only Coarse Click Annotation	Qiming Xia et.al.	2408.08092	null
2024-08-15	CamoTeacher: Dual-Rotation Consistency Learning for Semi-Supervised Camouflaged Object Detection	Xunfa Lai et.al.	2408.08050	null
2024-08-15	Co-Fix3D: Enhancing 3D Object Detection with Collaborative Refinement	Wenxuan Li et.al.	2408.07999	null
2024-08-15	GOReloc: Graph-based Object-Level Relocalization for Visual SLAM	Yutong Wang et.al.	2408.07917	link
2024-08-14	See It All: Contextualized Late Aggregation for 3D Dense Captioning	Minjung Kim et.al.	2408.07648	null
2024-08-14	Panacea+: Panoramic and Controllable Video Generation for Autonomous Driving	Yuqing Wen et.al.	2408.07605	null
2024-08-14	Infra-YOLO: Efficient Neural Network Structure with Model Compression for Real-Time Infrared Small Object Detection	Zhonglin Chen et.al.	2408.07455	null
2024-08-14	Sign language recognition based on deep learning and low-cost handcrafted descriptors	Alvaro Leandro Cavalcante Carneiro et.al.	2408.07244	link
2024-08-13	Vision Language Model for Interpretable and Fine-grained Detection of Safety Compliance in Diverse Workplaces	Zhiling Chen et.al.	2408.07146	null
2024-08-13	Divide and Conquer: Improving Multi-Camera 3D Perception with 2D Semantic-Depth Priors and Input-Dependent Queries	Qi Song et.al.	2408.06901	null
2024-08-13	Integrating Saliency Ranking and Reinforcement Learning for Enhanced Object Detection	Matthias Bartolo et.al.	2408.06803	link
2024-08-13	Exploring Domain Shift on Radar-Based 3D Object Detection Amidst Diverse Environmental Conditions	Miao Zhang et.al.	2408.06772	null
2024-08-13	Unified-IoU: For High-Quality Object Detection	Xiangjie Luo et.al.	2408.06636	link
2024-08-13	MV-DETR: Multi-modality indoor object detection by Multi-View DEtecton TRansformers	Zichao Dong et.al.	2408.06604	null
2024-08-12	Latent Disentanglement for Low Light Image Enhancement	Zhihao Zheng et.al.	2408.06245	null
2024-08-12	MR3D-Net: Dynamic Multi-Resolution 3D Sparse Voxel Grid Fusion for LiDAR-Based Collective Perception	Sven Teufel et.al.	2408.06137	link
2024-08-12	DPDETR: Decoupled Position Detection Transformer for Infrared-Visible Object Detection	Junjie Guo et.al.	2408.06123	null
2024-08-12	Optimizing Vision Transformers with Data-Free Knowledge Transfer	Gousia Habib et.al.	2408.05952	null
2024-08-12	MV2DFusion: Leveraging Modality-Specific Object Semantics for Multi-Modal 3D Detection	Zitian Wang et.al.	2408.05945	null
2024-08-12	Multi-scale Contrastive Adaptor Learning for Segmenting Anything in Underperformed Scenes	Ke Zhou et.al.	2408.05936	null
2024-08-13	Weakly Supervised Video Anomaly Detection and Localization with Spatio-Temporal Prompts	Peng Wu et.al.	2408.05905	null
2024-08-11	U-DECN: End-to-End Underwater Object Detection ConvNet with Improved DeNoising Training	Zhuoyan Liu et.al.	2408.05780	link
2024-08-11	FADE: A Dataset for Detecting Falling Objects around Buildings in Video	Zhigang Tu et.al.	2408.05750	null
2024-08-11	Evaluating BM3D and NBNet: A Comprehensive Study of Image Denoising Across Multiple Datasets	Ghazal Kaviani et.al.	2408.05697	null
2024-08-09	DeepInteraction++: Multi-Modality Interaction for Autonomous Driving	Zeyu Yang et.al.	2408.05075	link
2024-08-09	RadarPillars: Efficient Object Detection from 4D Radar Point Clouds	Alexander Musiat et.al.	2408.05020	null
2024-08-09	Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation	Yifan Feng et.al.	2408.04804	link
2024-08-08	SOD-YOLOv8 -- Enhancing YOLOv8 for Small Object Detection in Traffic Scenes	Boshra Khalili et.al.	2408.04786	null
2024-08-08	Data-Driven Pixel Control: Challenges and Prospects	Saurabh Farkya et.al.	2408.04767	null
2024-08-10	SAM2-Adapter: Evaluating & Adapting Segment Anything 2 in Downstream Tasks: Camouflage, Shadow, Medical Image Segmentation, and More	Tianrun Chen et.al.	2408.04579	null
2024-08-07	Impact Analysis of Data Drift Towards The Development of Safety-Critical Automotive System	Md Shahi Amran Hossain et.al.	2408.04476	null
2024-08-08	Detecting Car Speed using Object Detection and Depth Estimation: A Deep Learning Framework	Subhasis Dasgupta et.al.	2408.04360	null
2024-08-08	Multi-Scale and Detail-Enhanced Segment Anything Model for Salient Object Detection	Shixuan Gao et.al.	2408.04326	null
2024-08-07	PaveCap: The First Multimodal Framework for Comprehensive Pavement Condition Assessment with Dense Captioning and PCI Estimation	Blessing Agyei Kyem et.al.	2408.04110	link
2024-08-07	Vision-Language Guidance for LiDAR-based Unsupervised 3D Object Detection	Christian Fruhwirth-Reisinger et.al.	2408.03790	link
2024-08-07	Data Generation Scheme for Thermal Modality with Edge-Guided Adversarial Conditional Diffusion Model	Guoqing Zhu et.al.	2408.03748	link
2024-08-07	CAS-ViT: Convolutional Additive Self-attention Vision Transformers for Efficient Mobile Applications	Tianfang Zhang et.al.	2408.03703	link
2024-08-07	L4DR: LiDAR-4DRadar Fusion for Weather-Robust 3D Object Detection	Xun Huang et.al.	2408.03677	null
2024-08-07	Designing Extremely Memory-Efficient CNNs for On-device Vision Tasks	Jaewook Lee et.al.	2408.03663	null
2024-08-07	Leveraging LLMs for Enhanced Open-Vocabulary 3D Scene Understanding in Autonomous Driving	Amirhosein Chahe et.al.	2408.03516	null
2024-08-07	GUI Element Detection Using SOTA YOLO Deep Learning Models	Seyed Shayan Daneshvar et.al.	2408.03507	null
2024-08-06	AI Foundation Models in Remote Sensing: A Survey	Siqi Lu et.al.	2408.03464	null
2024-08-06	Biomedical Image Segmentation: A Systematic Literature Review of Deep Learning Based Object Detection Methods	Fazli Wahid et.al.	2408.03393	null
2024-08-06	Diverse Generation while Maintaining Semantic Coordination: A Diffusion-Based Data Augmentation Method for Object Detection	Sen Nie et.al.	2408.02891	null
2024-08-05	HQOD: Harmonious Quantization for Object Detection	Long Huang et.al.	2408.02561	link
2024-08-05	Tensorial template matching for fast cross-correlation with rotations and its application for tomography	Antonio Martinez-Sanchez et.al.	2408.02398	null
2024-08-05	AssemAI: Interpretable Image-Based Anomaly Detection for Manufacturing Pipelines	Renjith Prasad et.al.	2408.02181	null
2024-08-04	KAN-RCBEVDepth: A multi-modal fusion algorithm in object detection for autonomous driving	Zhihao Lai et.al.	2408.02088	null
2024-08-06	A Survey and Evaluation of Adversarial Attacks for Object Detection	Khoi Nguyen Tiet Nguyen et.al.	2408.01934	null
2024-08-04	CAF-YOLO: A Robust Framework for Multi-Scale Lesion Detection in Biomedical Imagery	Zilin Chen et.al.	2408.01897	link
2024-08-03	Supervised Image Translation from Visible to Infrared Domain for Object Detection	Prahlad Anand et.al.	2408.01843	null
2024-08-03	Domain penalisation for improved Out-of-Distribution Generalisation	Shuvam Jena et.al.	2408.01746	null
2024-08-03	LAM3D: Leveraging Attention for Monocular 3D Object Detection	Diana-Alexandra Sas et.al.	2408.01739	null
2024-08-02	A Robotics-Inspired Scanpath Model Reveals the Importance of Uncertainty and Semantic Object Cues for Gaze Guidance in Dynamic Scenes	Vito Mengers et.al.	2408.01322	null
2024-08-02	Underwater Object Detection Enhancement via Channel Stabilization	Muhammad Ali et.al.	2408.01293	link
2024-08-02	PGNeXt: High-Resolution Salient Object Detection via Pyramid Grafting Network	Changqun Xia et.al.	2408.01137	null
2024-08-02	Effect of Fog Particle Size Distribution on 3D Object Detection Under Adverse Weather Conditions	Ajinkya Shinde et.al.	2408.01085	null
2024-08-02	Boosting Gaze Object Prediction via Pixel-level Supervision from Vision Foundation Model	Yang Jin et.al.	2408.01044	null
2024-08-02	Visible-Thermal Multiple Object Tracking: Large-scale Video Dataset and Progressive Fusion Approach	Yabin Zhu et.al.	2408.00969	link
2024-08-01	Joint Neural Networks for One-shot Object Recognition and Detection	Camilo J. Vargas et.al.	2408.00701	null
2024-08-01	Harnessing Uncertainty-aware Bounding Boxes for Unsupervised 3D Object Detection	Ruiyang Zhang et.al.	2408.00619	null
2024-08-05	U2UData: A Large-scale Cooperative Perception Dataset for Swarm UAVs Autonomous Flight	Tongtong Feng et.al.	2408.00606	null
2024-08-01	MUFASA: Multi-View Fusion and Adaptation Network with Spatial Awareness for Radar Object Detection	Xiangyuan Peng et.al.	2408.00565	null
2024-08-01	MonoMM: A Multi-scale Mamba-Enhanced Network for Real-time Monocular 3D Object Detection	Youjia Fu et.al.	2408.00438	null
2024-08-01	DNTextSpotter: Arbitrary-Shaped Scene Text Spotting via Improved Denoising Training	Yu Xie et.al.	2408.00355	null
2024-08-01	A Simple Background Augmentation Method for Object Detection with Diffusion Model	Yuhang Li et.al.	2408.00350	null
2024-08-01	Diff3DETR:Agent-based Diffusion Model for Semi-supervised 3D Object Detection	Jiacheng Deng et.al.	2408.00286	null
2024-08-01	RoCo:Robust Collaborative Perception By Iterative Object Matching and Pose Adjustment	Zhe Huang et.al.	2408.00257	link
2024-07-31	Dynamic Object Queries for Transformer-based Incremental Object Detection	Jichuan Zhang et.al.	2407.21687	null
2024-07-31	Spatial Transformer Network YOLO Model for Agricultural Object Detection	Yash Zambre et.al.	2407.21652	null
2024-07-31	Evaluating SAM2's Role in Camouflaged Object Detection: From SAM to SAM2	Lv Tang et.al.	2407.21596	null
2024-07-31	InScope: A New Real-world 3D Infrastructure-side Collaborative Perception Dataset for Open Traffic Scenarios	Xiaofei Zhang et.al.	2407.21581	null
2024-07-31	Voxel Scene Graph for Intracranial Hemorrhage	Antoine P. Sanner et.al.	2407.21580	null
2024-07-31	MarvelOVD: Marrying Object Recognition and Vision-Language Models for Robust Open-Vocabulary Object Detection	Kuo Wang et.al.	2407.21465	link
2024-07-30	Candidate Distant Trans-Neptunian Objects Detected by the New Horizons Subaru TNO Survey	Wesley C. Fraser et.al.	2407.21142	null
2024-07-30	What is YOLOv5: A deep look into the internal features of the popular object detector	Rahima Khanam et.al.	2407.20892	null
2024-07-30	WARM-3D: A Weakly-Supervised Sim2Real Domain Adaptation Framework for Roadside Monocular 3D Object Detection	Xingcheng Zhou et.al.	2407.20818	null
2024-07-31	Integer-Valued Training and Spike-Driven Inference Spiking Neural Network for High-performance and Energy-efficient Object Detection	Xinhao Luo et.al.	2407.20708	link
2024-07-31	Weakly Supervised Intracranial Hemorrhage Segmentation with YOLO and an Uncertainty Rectified Segment Anything Model	Pascal Spiegler et.al.	2407.20461	null
2024-07-29	MEVDT: Multi-Modal Event-Based Vehicle Detection and Tracking Dataset	Zaid A. El Shair et.al.	2407.20446	null
2024-07-30	AxiomVision: Accuracy-Guaranteed Adaptive Visual Model Selection for Perspective-Aware Video Analytics	Xiangxiang Dai et.al.	2407.20124	link
2024-07-29	Octave-YOLO: Cross frequency detection network with octave convolution	Sangjune Shin et.al.	2407.19746	null
2024-07-29	Cross-Layer Feature Pyramid Transformer for Small Object Detection in Aerial Images	Zewen Du et.al.	2407.19696	null
2024-07-29	Practical Video Object Detection via Feature Selection and Aggregation	Yuheng Shi et.al.	2407.19650	link
2024-07-28	Solving Short-Term Relocalization Problems In Monocular Keyframe Visual SLAM Using Spatial And Semantic Data	Azmyin Md. Kamal et.al.	2407.19518	link
2024-07-28	Depth-Wise Convolutions in Vision Transformers for Efficient Training on Small Datasets	Tianxiao Zhang et.al.	2407.19394	link
2024-07-27	Sewer Image Super-Resolution with Depth Priors and Its Lightweight Network	Gang Pan et.al.	2407.19271	null
2024-07-27	Enhancing Tree Type Detection in Forest Fire Risk Assessment: Multi-Stage Approach and Color Encoding with Forest Fire Risk Evaluation Framework for UAV Imagery	Jinda Zhang et.al.	2407.19184	null
2024-07-27	Reducing Spurious Correlation for Federated Domain Generalization	Shuran Ma et.al.	2407.19174	null
2024-07-27	Robust Multimodal 3D Object Detection via Modality-Agnostic Decoding and Proximity-based Modality Ensemble	Juhan Cha et.al.	2407.19156	link
2024-07-25	LION: Linear Group RNN for 3D Object Detection in Point Clouds	Zhe Liu et.al.	2407.18232	link
2024-07-25	XS-VID: An Extremely Small Video Object Detection Dataset	Jiahao Guo et.al.	2407.18137	null
2024-07-25	SaccadeDet: A Novel Dual-Stage Architecture for Rapid and Accurate Detection in Gigapixel Images	Wenxi Li et.al.	2407.17956	null
2024-07-25	A Novel Perception Entropy Metric for Optimizing Vehicle Perception with LiDAR Deployment	Yongjiang He et.al.	2407.17942	null
2024-07-25	Hierarchical Object Detection and Recognition Framework for Practical Plant Disease Diagnosis	Kohei Iwano et.al.	2407.17906	null
2024-07-25	Advancing 3D Point Cloud Understanding through Deep Transfer Learning: A Comprehensive Survey	Shahab Saquib Sohail et.al.	2407.17877	null
2024-07-25	Enhancing Fine-grained Object Detection in Aerial Images via Orthogonal Mapping	Haoran Zhu et.al.	2407.17738	link
2024-07-26	Unsqueeze [CLS] Bottleneck to Learn Rich Representations	Qing Su et.al.	2407.17671	link
2024-07-24	SDLNet: Statistical Deep Learning Network for Co-Occurring Object Detection and Identification	Binay Kumar Singh et.al.	2407.17664	null
2024-07-24	PEEKABOO: Hiding parts of an image for unsupervised object localization	Hasib Zunair et.al.	2407.17628	link
2024-07-24	ALPI: Auto-Labeller with Proxy Injection for 3D Object Detection using 2D Labels Only	Saad Lahlali et.al.	2407.17197	null
2024-07-24	DVPE: Divided View Position Embedding for Multi-View 3D Object Detection	Jiasen Wang et.al.	2407.16955	link
2024-07-23	What Matters in Range View 3D Object Detection	Benjamin Wilson et.al.	2407.16789	link
2024-07-23	A Framework for Pupil Tracking with Event Cameras	Khadija Iddrisu et.al.	2407.16665	null
2024-07-24	Velocity Driven Vision: Asynchronous Sensor Fusion Birds Eye View Models for Autonomous Vehicles	Seamie Hayes et.al.	2407.16636	null
2024-07-23	COALA: A Practical and Vision-Centric Federated Learning Platform	Weiming Zhuang et.al.	2407.16560	link
2024-07-23	Dynamic Retraining-Updating Mean Teacher for Source-Free Object Detection	Trinh Le Ba Khanh et.al.	2407.16497	link
2024-07-23	MonoWAD: Weather-Adaptive Diffusion Model for Robust Monocular 3D Object Detection	Youngmin Oh et.al.	2407.16448	link
2024-07-23	ESOD: Efficient Small Object Detection on High-Resolution Images	Kai Liu et.al.	2407.16424	null
2024-07-23	Understanding Impacts of Electromagnetic Signal Injection Attacks on Object Detection	Youqian Zhang et.al.	2407.16327	null
2024-07-23	DeepClean: Integrated Distortion Identification and Algorithm Selection for Rectifying Image Corruptions	Aditya Kapoor et.al.	2407.16302	null
2024-07-23	FoRA: Low-Rank Adaptation Model beyond Multimodal Siamese Network	Weiying Xie et.al.	2407.16129	link
2024-07-22	PLayerTV: Advanced Player Tracking and Identification for Automatic Soccer Highlight Clips	Håkon Maric Solberg et.al.	2407.16076	null
2024-07-23	Disentangling spatio-temporal knowledge for weakly supervised object detection and segmentation in surgical video	Guiqiu Liao et.al.	2407.15794	link
2024-07-22	Towards Open-World Object-based Anomaly Detection via Self-Supervised Outlier Synthesis	Brian K. S. Isaac-Medina et.al.	2407.15763	link
2024-07-22	YOLOv10 for Automated Fracture Detection in Pediatric Wrist Trauma X-rays	Ammar Ahmed et.al.	2407.15689	link
2024-07-22	SS-SFR: Synthetic Scenes Spatial Frequency Response on Virtual KITTI and Degraded Automotive Simulations for Object Detection	Daniel Jakab et.al.	2407.15646	null
2024-07-22	YOLO-pdd: A Novel Multi-scale PCB Defect Detection Method Using Deep Representations with Sequential Images	Bowen Liu et.al.	2407.15427	null
2024-07-22	Learning High-resolution Vector Representation from Multi-Camera Images for 3D Object Detection	Zhili Chen et.al.	2407.15354	link
2024-07-22	Explore the LiDAR-Camera Dynamic Adjustment Fusion for 3D Object Detection	Yiran Yang et.al.	2407.15334	null
2024-07-21	Weak-to-Strong Compositional Learning from Generative Models for Language-based Object Detection	Kwanyong Park et.al.	2407.15296	null
2024-07-21	Multiple Object Detection and Tracking in Panoramic Videos for Cycling Safety Analysis	Jingwei Guo et.al.	2407.15199	link
2024-07-21	Rethinking Feature Backbone Fine-tuning for Remote Sensing Object Detection	Yechan Kim et.al.	2407.15143	null
2024-07-19	Enhancing Layout Hotspot Detection Efficiency with YOLOv8 and PCA-Guided Augmentation	Dongyang Wu et.al.	2407.14498	null
2024-07-19	MLMT-CNN for Object Detection and Segmentation in Multi-layer and Multi-spectral Images	Majedaldein Almahasneh et.al.	2407.14473	null
2024-07-19	EmoCAM: Toward Understanding What Drives CNN-based Emotion Recognition	Youssef Doulfoukar et.al.	2407.14314	null
2024-07-19	Bucketed Ranking-based Losses for Efficient Training of Object Detectors	Feyza Yavuz et.al.	2407.14204	link
2024-07-18	GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model	Abdelrahman Shaker et.al.	2407.13772	link
2024-07-18	General Geometry-aware Weakly Supervised 3D Object Detection	Guowen Zhang et.al.	2407.13748	link
2024-07-18	Enhancing Source-Free Domain Adaptive Object Detection with Low-confidence Pseudo Label Distillation	Ilhoon Yoon et.al.	2407.13524	link
2024-07-18	Learning Camouflaged Object Detection from Noisy Pseudo Label	Jin Zhang et.al.	2407.13157	null
2024-07-18	DFMSD: Dual Feature Masking Stage-wise Knowledge Distillation for Object Detection	Zhourui Zhang et.al.	2407.13147	null
2024-07-18	FocusDiffuser: Perceiving Local Disparities for Camouflaged Object Detection	Jianwei Zhao et.al.	2407.13133	null
2024-07-17	AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer	Zhuguanyu Wu et.al.	2407.12951	link
2024-07-17	Toward INT4 Fixed-Point Training via Exploring Quantization Error for Gradients	Dohyung Kim et.al.	2407.12637	null
2024-07-17	CerberusDet: Unified Multi-Task Object Detection	Irina Tolstykh et.al.	2407.12632	link
2024-07-17	Weighting Pseudo-Labels via High-Activation Feature Index Similarity and Object Detection for Semi-Supervised Segmentation	Prantik Howlader et.al.	2407.12630	link
2024-07-17	Enhancing Wrist Abnormality Detection with YOLO: Analysis of State-of-the-art Single-stage Detection Models	Ammar Ahmed et.al.	2407.12597	link
2024-07-17	Embracing Events and Frames with Hierarchical Feature Refinement Network for Object Detection	Hu Cao et.al.	2407.12582	null
2024-07-17	Close the Sim2real Gap via Physically-based Structured Light Synthetic Data Simulation	Kaixin Bai et.al.	2407.12449	null
2024-07-17	GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval	Han Zhou et.al.	2407.12431	link
2024-07-17	Exploring Deeper! Segment Anything Model with Depth Perception for Camouflaged Object Detection	Zhenni Yu et.al.	2407.12339	null
2024-07-16	AFIDAF: Alternating Fourier and Image Domain Adaptive Filters as an Efficient Alternative to Attention in ViTs	Yunling Zheng et.al.	2407.12217	null
2024-07-16	The object detection method aids in image reconstruction evaluation and clinical interpretation of meniscal abnormalities	Natalia Konovalova et.al.	2407.12184	null
2024-07-16	A Case for Application-Aware Space Radiation Tolerance in Orbital Computing	Meiqi Wang et.al.	2407.11853	null
2024-07-16	Improving Unsupervised Video Object Segmentation via Fake Flow Generation	Suhwan Cho et.al.	2407.11714	link
2024-07-16	Relation DETR: Exploring Explicit Position Relation Prior for Object Detection	Xiuquan Hou et.al.	2407.11699	link
2024-07-16	Bridge Past and Future: Overcoming Information Asymmetry in Incremental Object Detection	Qijie Mo et.al.	2407.11499	link
2024-07-16	Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes	Zhi Cai et.al.	2407.11464	link
2024-07-16	Generative AI Driven Task-Oriented Adaptive Semantic Communications	Yuzhou Fu et.al.	2407.11354	null
2024-07-16	LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction	Penghui Du et.al.	2407.11335	link
2024-07-16	TCFormer: Visual Recognition via Token Clustering Transformer	Wang Zeng et.al.	2407.11321	link
2024-07-16	PADRe: A Unifying Polynomial Attention Drop-in Replacement for Efficient Vision Transformer	Pierre-David Letourneau et.al.	2407.11306	null
2024-07-15	OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models	Zijian Zhou et.al.	2407.11213	null
2024-07-15	Interpreting Hand gestures using Object Detection and Digits Classification	Sangeetha K et.al.	2407.10902	null
2024-07-15	RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception	Chunliang Li et.al.	2407.10876	link
2024-07-15	OPEN: Object-wise Position Embedding for Multi-view 3D Object Detection	Jinghua Hou et.al.	2407.10753	link
2024-07-15	Anticipating Future Object Compositions without Forgetting	Youssef Zahran et.al.	2407.10723	null
2024-07-15	OVLW-DETR: Open-Vocabulary Light-Weighted Detection Transformer	Yu Wang et.al.	2407.10655	link
2024-07-15	Backdoor Attacks against Image-to-Image Networks	Wenbo Jiang et.al.	2407.10445	null
2024-07-14	Shape2Scene: 3D Scene Representation Learning Through Pre-training on Shape Data	Tuo Feng et.al.	2407.10200	link
2024-07-14	LabelDistill: Label-guided Cross-modal Knowledge Distillation for Camera-based 3D Object Detection	Sanmin Kim et.al.	2407.10164	link
2024-07-14	FSD-BEV: Foreground Self-Distillation for Multi-view 3D Object Detection	Zheng Jiang et.al.	2407.10135	link
2024-07-14	Plain-Det: A Plain Multi-Dataset Object Detector	Cheng Shi et.al.	2407.10083	link
2024-07-12	DART: An Automated End-to-End Object Detection Pipeline with Data Diversification, Open-Vocabulary Bounding Box Annotation, Pseudo-Label Review, and Model Training	Chen Xin et.al.	2407.09174	link
2024-07-12	Open Vocabulary Multi-Label Video Classification	Rohit Gupta et.al.	2407.09073	null
2024-07-12	DroneMOT: Drone-based Multi-Object Tracking Considering Detection Difficulties and Simultaneous Moving of Drones and Objects	Peng Wang et.al.	2407.09051	null
2024-07-11	OmniNOCS: A unified NOCS dataset and model for 3D lifting of 2D objects	Akshay Krishnan et.al.	2407.08711	null
2024-07-11	Approaching Outside: Scaling Unsupervised 3D Object Detection from 2D Scene	Ruiyang Zhang et.al.	2407.08569	link
2024-07-11	Projecting Points to Axes: Oriented Object Detection via Point-Axis Representation	Zeyang Zhao et.al.	2407.08489	null
2024-07-11	Semi-Supervised Object Detection: A Survey on Progress from CNN to Transformer	Tahira Shehzadi et.al.	2407.08460	null
2024-07-11	PowerYOLO: Mixed Precision Model for Hardware Efficient Object Detection with Event Data	Dominika Przewlocka-Rus et.al.	2407.08272	null
2024-07-11	Knowledge distillation to effectively attain both region-of-interest and global semantics from an image where multiple objects appear	Seonwhee Jin et.al.	2407.08257	link
2024-07-11	Enrich the content of the image Using Context-Aware Copy Paste	Qiushi Guo et.al.	2407.08151	null
2024-07-11	DMM: Disparity-guided Multispectral Mamba for Oriented Object Detection in Remote Sensing	Minghang Zhou et.al.	2407.08132	null
2024-07-10	MambaVision: A Hybrid Mamba-Transformer Vision Backbone	Ali Hatamizadeh et.al.	2407.08083	link
2024-07-10	Bayesian Detector Combination for Object Detection with Crowdsourced Annotations	Zhi Qin Tan et.al.	2407.07958	link
2024-07-10	Cross Domain Object Detection via Multi-Granularity Confidence Alignment based Mean Teacher	Jiangming Chen et.al.	2407.07780	null
2024-07-10	LSM: A Comprehensive Metric for Assessing the Safety of Lane Detection Systems in Autonomous Driving	Jörg Gamerdinger et.al.	2407.07740	null
2024-07-10	Few-Shot Domain Adaptive Object Detection for Microscopic Images	Sumayya Inayat et.al.	2407.07633	link
2024-07-10	Simplifying Source-Free Domain Adaptation for Object Detection: Effective Self-Training Strategies and Performance Insights	Yan Hao et.al.	2407.07586	link
2024-07-09	Exploring Camera Encoder Designs for Autonomous Driving Perception	Barath Lakshmanan et.al.	2407.07276	null
2024-07-09	Category-level Object Detection, Pose Estimation and Reconstruction from Stereo Images	Chuanrui Zhang et.al.	2407.06984	null
2024-07-09	Cue Point Estimation using Object Detection	Giulia Argüello et.al.	2407.06823	link
2024-07-09	CoLA: Conditional Dropout and Language-driven Robust Dual-modal Salient Object Detection	Shuang Hao et.al.	2407.06780	link
2024-07-09	Graph-Based Captioning: Enhancing Visual Descriptions by Interconnecting Region Captions	Yu-Guan Hsieh et.al.	2407.06723	null
2024-07-08	Stochastic Traveling Salesperson Problem with Neighborhoods for Object Detection	Cheng Peng et.al.	2407.06366	null
2024-07-08	GeoWATCH for Detecting Heavy Construction in Heterogeneous Time Series of Satellite Images	Jon Crall et.al.	2407.06337	null
2024-07-08	Multi-clue Consistency Learning to Bridge Gaps Between General and Oriented Object in Semi-supervised Detection	Chenxu Wang et.al.	2407.05909	link
2024-07-08	Boosting 3D Object Detection with Semantic-Aware Multi-Branch Framework	Hao Jing et.al.	2407.05769	null
2024-07-08	Short-term Object Interaction Anticipation with Disentangled Object Detection @ Ego4D Short Term Object Interaction Anticipation Challenge	Hyunjin Cho et.al.	2407.05713	link
2024-07-08	Weakly Supervised Test-Time Domain Adaptation for Object Detection	Anh-Dzung Doan et.al.	2407.05607	null
2024-07-08	Towards Reflected Object Detection: A Benchmark	Zhongtian Wang et.al.	2407.05575	null
2024-07-08	GMC: A General Framework of Multi-stage Context Learning and Utilization for Visual Detection Tasks	Xuan Wang et.al.	2407.05566	null
2024-07-07	CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs	Akshat Ramachandran et.al.	2407.05266	link
2024-07-07	Unlocking Textual and Visual Wisdom: Open-Vocabulary 3D Object Detection Enhanced by Comprehensive Guidance from Text and Image	Pengkun Jiao et.al.	2407.05256	null
2024-07-06	SCSA: Exploring the Synergistic Effects Between Spatial and Channel Attention	Yunzhong Si et.al.	2407.05128	link
2024-07-06	Quantizing YOLOv7: A Comprehensive Study	Mohammadamin Baghbanbashi et.al.	2407.04943	null
2024-07-05	SH17: A Dataset for Human Safety and Personal Protective Equipment Detection in Manufacturing Industry	Hafiz Mughees Ahmad et.al.	2407.04590	link
2024-07-05	Multi-Branch Auxiliary Fusion YOLO with Re-parameterization Heterogeneous Convolutional for accurate object detection	Zhiqiang Yang et.al.	2407.04381	link
2024-07-05	Towards Stable 3D Object Detection	Jiabao Wang et.al.	2407.04305	null
2024-07-04	LiDAR-based Real-Time Object Detection and Tracking in Dynamic Environments	Wenqiang Du et.al.	2407.04115	null
2024-07-04	FIPGNet:Pyramid grafting network with feature interaction strategies	Ziyi Ding et.al.	2407.04085	null
2024-07-08	Detect Closer Surfaces that can be Seen: New Modeling and Evaluation in Cross-domain 3D Object Detection	Ruixiao Zhang et.al.	2407.04061	link
2024-07-04	The Solution for the GAIIC2024 RGB-TIR object detection Challenge	Xiangyu Wu et.al.	2407.03872	null
2024-07-04	StreamLTS: Query-based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection	Yunshuang Yuan et.al.	2407.03825	null
2024-07-04	CoMix: A Comprehensive Benchmark for Multi-Task Comic Understanding	Emanuele Vivoli et.al.	2407.03550	link
2024-07-03	Comics Datasets Framework: Mix of Comics datasets for detection benchmarking	Emanuele Vivoli et.al.	2407.03540	null
2024-07-03	Visual Grounding with Attention-Driven Constraint Balancing	Weitai Kang et.al.	2407.03243	null
2024-07-03	Category-Aware Dynamic Label Assignment with High-Quality Oriented Proposal	Mingkui Feng et.al.	2407.03205	null
2024-07-03	SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding	Weitai Kang et.al.	2407.03200	link
2024-07-03	Global Context Modeling in YOLOv8 for Pediatric Wrist Fracture Detection	Rui-Yang Ju et.al.	2407.03163	link
2024-07-03	YOLOv5, YOLOv8 and YOLOv10: The Go-To Detectors for Real-time Vision	Muhammad Hussain et.al.	2407.02988	null
2024-07-03	A Pairwise DomMix Attentive Adversarial Network for Unsupervised Domain Adaptive Object Detection	Jie Shao et.al.	2407.02835	null
2024-07-03	ADFQ-ViT: Activation-Distribution-Friendly Post-Training Quantization for Vision Transformers	Yanfeng Jiang et.al.	2407.02763	null
2024-07-02	SMILe: Leveraging Submodular Mutual Information For Robust Few-Shot Object Detection	Anay Majee et.al.	2407.02665	null
2024-07-02	Robust ADAS: Enhancing Robustness of Machine Learning-based Advanced Driver Assistance Systems for Adverse Weather	Muhammad Zaeem Shahzad et.al.	2407.02581	null
2024-07-03	Similarity Distance-Based Label Assignment for Tiny Object Detection	Shuohao Shi et.al.	2407.02394	link
2024-07-02	OpenSlot: Mixed Open-set Recognition with Object-centric Learning	Xu Yin et.al.	2407.02386	null
2024-07-02	DM3D: Distortion-Minimized Weight Pruning for Lossless 3D Object Detection	Kaixin Xu et.al.	2407.02098	null
2024-07-02	Multi-Grained Contrast for Data-Efficient Unsupervised Representation Learning	Chengchao Shen et.al.	2407.02014	link
2024-07-02	Adaptive Modality Balanced Online Knowledge Distillation for Brain-Eye-Computer based Dim Object Detection	Zixing Li et.al.	2407.01894	link
2024-07-01	Scarecrow monitoring system:employing mobilenet ssd for enhanced animal supervision	Balaji VS et.al.	2407.01435	null
2024-07-01	Formal Verification of Object Detection	Avraham Raviv et.al.	2407.01295	link
2024-07-01	Cross-Architecture Auxiliary Feature Space Translation for Efficient Few-Shot Personalized Object Detection	Francesco Barbato et.al.	2407.01193	null
2024-07-01	Eliminating Position Bias of Language Models: A Mechanistic Approach	Ziqi Wang et.al.	2407.01100	null
2024-07-01	No More Potentially Dynamic Objects: Static Point Cloud Map Generation based on 3D Object Detection and Ground Projection	Soojin Woo et.al.	2407.01073	link
2024-07-01	Mobile Robot Oriented Large-Scale Indoor Dataset for Dynamic Scene Understanding	Yifan Tang et.al.	2406.19791	null
2024-06-28	Basketball-SORT: An Association Method for Complex Multi-object Occlusion Problems in Basketball Multi-object Tracking	Qingrui Hu et.al.	2406.19655	null
2024-06-27	Robustness Testing of Black-Box Models Against CT Degradation Through Test-Time Augmentation	Jack Highton et.al.	2406.19557	null
2024-06-27	BOrg: A Brain Organoid-Based Mitosis Dataset for Automatic Analysis of Brain Diseases	Muhammad Awais et.al.	2406.19556	link
2024-06-27	Weighted Circle Fusion: Ensembling Circle Representation from Different Object Detection Results	Jialin Yue et.al.	2406.19540	null
2024-06-27	Stereo Vision Based Robot for Remote Monitoring with VR Support	Mohamed Fazil M. S. et.al.	2406.19498	null
2024-06-27	HUWSOD: Holistic Self-training for Unified Weakly Supervised Object Detection	Liujuan Cao et.al.	2406.19394	link
2024-06-27	STAL3D: Unsupervised Domain Adaptation for 3D Object Detection via Collaborating Self-Training and Adversarial Learning	Yanan Zhang et.al.	2406.19362	null
2024-06-27	Towards Reducing Data Acquisition and Labeling for Defect Detection using Simulated Data	Lukas Malte Kemeter et.al.	2406.19175	null
2024-06-30	Segment Anything Model for automated image data annotation: empirical studies using text prompts from Grounding DINO	Fuseini Mumuni et.al.	2406.19057	null
2024-06-27	BiCo-Fusion: Bidirectional Complementary LiDAR-Camera Fusion for Semantic- and Spatial-Aware 3D Object Detection	Yang Song et.al.	2406.19048	null
2024-06-27	A Universal Railway Obstacle Detection System based on Semi-supervised Segmentation And Optical Flow	Qiushi Guo et.al.	2406.18908	null
2024-06-26	SpY: A Context-Based Approach to Spacecraft Component Detection	Trupti Mahendrakar et.al.	2406.18709	null
2024-06-26	Unveiling the Unknown: Conditional Evidence Decoupling for Unknown Rejection	Zhaowei Wu et.al.	2406.18443	link
2024-06-26	CTS: Sim-to-Real Unsupervised Domain Adaptation on 3D Detection	Meiying Zhang et.al.	2406.18129	null
2024-06-26	The Surprising Effectiveness of Multimodal Large Language Models for Video Moment Retrieval	Meinardus Boris et.al.	2406.18113	link
2024-06-25	ET tu, CLIP? Addressing Common Object Errors for Unseen Environments	Ye Won Byun et.al.	2406.17876	null
2024-06-25	MDHA: Multi-Scale Deformable Transformer with Hybrid Anchors for Multi-View 3D Object Detection	Michelle Adeline et.al.	2406.17654	link
2024-06-25	Embedded event based object detection with spiking neural network	Jonathan Courtois et.al.	2406.17617	null
2024-06-27	Towards Open-set Camera 3D Object Detection	Zhuolin He et.al.	2406.17297	null
2024-06-25	Exploring Test-Time Adaptation for Object Detection in Continually Changing Environments	Shilei Cao et.al.	2406.16439	null
2024-06-23	Review of Zero-Shot and Few-Shot AI Algorithms in The Medical Domain	Maged Badawi et.al.	2406.16143	null
2024-06-22	Smart Feature is What You Need	Zhaoxin Hu et.al.	2406.15805	link
2024-06-22	MR-MLLM: Mutual Reinforcement of Multimodal Comprehension and Vision Perception	Guanqun Wang et.al.	2406.15768	null
2024-06-21	DiPEx: Dispersing Prompt Expansion for Class-Agnostic Object Detection	Jia Syuen Lim et.al.	2406.14924	null
2024-06-21	MOS: Model Synergy for Test-Time Adaptation on LiDAR-Based 3D Object Detection	Zhuoxiao Chen et.al.	2406.14878	null
2024-06-20	Visible-Thermal Tiny Object Detection: A Benchmark Dataset and Baselines	Xinyi Ying et.al.	2406.14482	link
2024-06-20	Enhanced Bank Check Security: Introducing a Novel Dataset and Transformer-Based Approach for Detection and Verification	Muhammad Saif Ullah Khan et.al.	2406.14370	link
2024-06-20	HoTPP Benchmark: Are We Good at the Long Horizon Events Forecasting?	Ivan Karpukhin et.al.	2406.14341	link
2024-06-20	LeYOLO, New Scalable and Efficient CNN Architecture for Object Detection	Lilian Hollard et.al.	2406.14239	link
2024-06-20	SSAD: Self-supervised Auxiliary Detection Framework for Panoramic X-ray based Dental Disease Diagnosis	Zijian Cai et.al.	2406.13963	link
2024-06-20	Towards the in-situ Trunk Identification and Length Measurement of Sea Cucumbers via Bézier Curve Modelling	Shuaixin Liu et.al.	2406.13951	link
2024-06-19	DPO: Dual-Perturbation Optimization for Test-time Adaptation in 3D Object Detection	Zhuoxiao Chen et.al.	2406.13891	link
2024-06-19	Semantic Enhanced Few-shot Object Detection	Zheng Wang et.al.	2406.13498	null
2024-06-19	Snowy Scenes, Clear Detections: A Robust Model for Traffic Light Detection in Adverse Weather Conditions	Shivank Garg et.al.	2406.13473	link
2024-06-19	Strengthening Layer Interaction via Dynamic Layer Attention	Kaishen Wang et.al.	2406.13392	link
2024-06-18	Privacy Preserving Federated Learning in Medical Imaging with Uncertainty Estimation	Nikolas Koutsoubis et.al.	2406.12815	link
2024-06-18	Online Anchor-based Training for Image Classification Tasks	Maria Tzelepi et.al.	2406.12662	null
2024-06-18	ViDSOD-100: A New Dataset and a Baseline Model for RGB-D Video Salient Object Detection	Junhao Lin et.al.	2406.12536	link
2024-06-18	SDNIA-YOLO: A Robust Object Detection Model for Extreme Weather Conditions	Yuexiong Ding et.al.	2406.12395	null
2024-06-18	Competitive Learning for Achieving Content-specific Filters in Video Coding for Machines	Honglei Zhang et.al.	2406.12367	null
2024-06-18	Certified ML Object Detection for Surveillance Missions	Mohammed Belcaid et.al.	2406.12362	null
2024-06-18	DASSF: Dynamic-Attention Scale-Sequence Fusion for Aerial Object Detection	Haodong Li et.al.	2406.12285	null
2024-06-18	The Solution for CVPR2024 Foundational Few-Shot Object Detection Challenge	Hongpeng Pan et.al.	2406.12225	null
2024-06-17	Multi-Dimensional Pruning: Joint Channel, Layer and Block Pruning with Latency Constraint	Xinglong Sun et.al.	2406.12079	null
2024-06-17	V3Det Challenge 2024 on Vast Vocabulary and Open Vocabulary Object Detection: Methods and Results	Jiaqi Wang et.al.	2406.11739	null
2024-06-17	YOLO-FEDER FusionNet: A Novel Deep Learning Architecture for Drone Detection	Tamara R. Lenhard et.al.	2406.11641	null
2024-06-17	Low-power Ship Detection in Satellite Images Using Neuromorphic Hardware	Gregor Lenz et.al.	2406.11319	null
2024-06-17	Semi-Supervised Domain Adaptation Using Target-Oriented Domain Augmentation for 3D Object Detection	Yecheol Kim et.al.	2406.11313	link
2024-06-17	Syn-to-Real Unsupervised Domain Adaptation for Indoor 3D Object Detection	Yunsong Wang et.al.	2406.11311	null
2024-06-17	Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding	Yunsong Wang et.al.	2406.11283	null
2024-06-18	YOLO9tr: A Lightweight Model for Pavement Damage Detection Utilizing a Generalized Efficient Layer Aggregation Network and Attention Mechanism	Sompote Youwai et.al.	2406.11254	link
2024-06-16	Open-Vocabulary X-ray Prohibited Item Detection via Fine-tuning CLIP	Shuyang Lin et.al.	2406.10961	null
2024-06-16	SparseDet: A Simple and Effective Framework for Fully Sparse LiDAR-based 3D Object Detection	Lin Liu et.al.	2406.10907	null
2024-06-15	Object Detection using Oriented Window Learning Vision Transformer: Roadway Assets Recognition	Taqwa Alhadidi et.al.	2406.10712	null
2024-06-14	EFM3D: A Benchmark for Measuring Progress Towards 3D Egocentric Foundation Models	Julian Straub et.al.	2406.10224	null
2024-06-14	YOLOv1 to YOLOv10: A comprehensive review of YOLO variants and their application in the agricultural domain	Mujadded Al Rabbani Alif et.al.	2406.10139	null
2024-06-14	Shelf-Supervised Multi-Modal Pre-Training for 3D Object Detection	Mehar Khurana et.al.	2406.10115	null
2024-06-14	Automated GIS-Based Framework for Detecting Crosswalk Changes from Bi-Temporal High-Resolution Aerial Images	Richard Boadu Antwi et.al.	2406.09731	null
2024-06-14	An alternate approach for estimating grain-growth kinetics	Manoj Prabakar et.al.	2406.09653	link
2024-06-13	Scene Graph Generation in Large-Size VHR Satellite Imagery: A Large-Scale Dataset and A Context-Aware Approach	Yansheng Li et.al.	2406.09410	link
2024-06-13	Towards Evaluating the Robustness of Visual State Space Models	Hashmat Shadab Malik et.al.	2406.09407	link
2024-06-13	Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models	Yushi Hu et.al.	2406.09403	null
2024-06-13	Enhanced Object Detection: A Study on Vast Vocabulary Object Detection Track for V3Det Challenge 2024	Peixi Wu et.al.	2406.09201	null
2024-06-13	Computer vision-based model for detecting turning lane features on Florida's public roadways	Richard Boadu Antwi et.al.	2406.08822	null
2024-06-13	BEVSpread: Spread Voxel Pooling for Bird's-Eye-View Representation in Vision-based Roadside 3D Object Detection	Wenjie Wang et.al.	2406.08785	link
2024-06-12	UnO: Unsupervised Occupancy Fields for Perception and Forecasting	Ben Agro et.al.	2406.08691	null
2024-06-12	Transformation-Dependent Adversarial Attacks	Yaoteng Tan et.al.	2406.08443	null
2024-06-12	Dataset Enhancement with Instance-Level Augmentations	Orest Kupyn et.al.	2406.08249	link
2024-06-12	Chemistry3D: Robotic Interaction Benchmark for Chemistry Experiments	Shoujie Li et.al.	2406.08160	link
2024-06-12	CT3D++: Improving 3D Object Detection with Keypoint-induced Channel-wise Transformer	Hualian Sheng et.al.	2406.08152	null
2024-06-12	MWIRSTD: A MWIR Small Target Detection Dataset	Nikhil Kumar et.al.	2406.08063	link
2024-06-12	Sense Less, Generate More: Pre-training LiDAR Perception with Masked Autoencoders for Ultra-Efficient 3D Sensing	Sina Tayebati et.al.	2406.07833	link
2024-06-13	A Deep Learning Approach to Detect Complete Safety Equipment For Construction Workers Based On YOLOv7	Md. Shariful Islam et.al.	2406.07707	null
2024-06-11	Transforming a rare event search into a not-so-rare event search in real-time with deep learning-based object detection	J. Schueler et.al.	2406.07538	link
2024-06-11	Understanding Visual Concepts Across Models	Brandon Trabucco et.al.	2406.07506	link
2024-06-11	Minimizing Energy Costs in Deep Learning Model Training: The Gaussian Sampling Approach	Challapalli Phanindra Revanth et.al.	2406.07332	null
2024-06-11	Unsupervised Object Detection with Theoretical Guarantees	Marian Longa et.al.	2406.07284	null
2024-06-11	Advancing Grounded Multimodal Named Entity Recognition via LLM-Based Reformulation and Box-Based Segmentation	Jinyuan Li et.al.	2406.07268	link
2024-06-11	EFFOcc: A Minimal Baseline for EFficient Fusion-based 3D Occupancy Network	Yining Shi et.al.	2406.07042	link
2024-06-11	RS-DFM: A Remote Sensing Distributed Foundation Model for Diverse Downstream Tasks	Zhechao Wang et.al.	2406.07032	null
2024-06-12	LiSD: An Efficient Multi-Task Learning Framework for LiDAR Segmentation and Detection	Jiahua Xu et.al.	2406.07023	null
2024-06-11	Teaching with Uncertainty: Unleashing the Potential of Knowledge Distillation in Object Detection	Junfei Yi et.al.	2406.06999	null
2024-06-10	UnSupDLA: Towards Unsupervised Document Layout Analysis	Talha Uddin Sheikh et.al.	2406.06236	null
2024-06-10	UEMM-Air: A Synthetic Multi-modal Dataset for Unmanned Aerial Vehicle Object Detection	Fan Liu et.al.	2406.06230	link
2024-06-10	ReCon1M: A Large-scale Benchmark Dataset for Relation Comprehension in Remote Sensing Imagery	Xian Sun et.al.	2406.06028	null
2024-06-10	Solution for SMART-101 Challenge of CVPR Multi-modal Algorithmic Reasoning Task 2024	Jinwoo Ahn et.al.	2406.05963	null
2024-06-10	Open-Vocabulary Part-Based Grasping	Tjeard van Oort et.al.	2406.05951	null
2024-06-09	Stealthy Targeted Backdoor Attacks against Image Captioning	Wenshu Fan et.al.	2406.05874	link
2024-06-09	Scaling Graph Convolutions for Mobile Vision	William Avery et.al.	2406.05850	link
2024-06-09	Mamba YOLO: SSMs-Based YOLO For Object Detection	Zeyu Wang et.al.	2406.05835	link
2024-06-09	ControlLoc: Physical-World Hijacking Attack on Visual Perception in Autonomous Driving	Chen Ma et.al.	2406.05810	null
2024-06-09	SAM-PM: Enhancing Video Camouflaged Object Detection using Spatio-Temporal Attention	Muhammad Nawfal Meeran et.al.	2406.05802	link
2024-06-07	Nacala-Roof-Material: Drone Imagery for Roof Detection, Classification, and Segmentation to Support Mosquito-borne Disease Risk Assessment	Venkanna Babu Guthula et.al.	2406.04949	null
2024-06-07	EGOR: Efficient Generated Objects Replay for incremental object detection	Zijia An et.al.	2406.04829	null
2024-06-07	UCDNet: Multi-UAV Collaborative 3D Object Detection Network by Reliable Feature Mapping	Pengju Tian et.al.	2406.04648	null
2024-06-07	UVCPNet: A UAV-Vehicle Collaborative Perception Network for 3D Object Detection	Yuchao Wang et.al.	2406.04647	null
2024-06-06	CORU: Comprehensive Post-OCR Parsing and Receipt Understanding Dataset	Abdelrahman Abdallah et.al.	2406.04493	link
2024-06-06	DeTra: A Unified Model for Object Detection and Trajectory Forecasting	Sergio Casas et.al.	2406.04426	null
2024-06-06	Parameter-Inverted Image Pyramid Networks	Xizhou Zhu et.al.	2406.04330	link
2024-06-06	Semmeldetector: Application of Machine Learning in Commercial Bakeries	Thomas H. Schmitt et.al.	2406.04050	null
2024-06-06	Frequency-based Matcher for Long-tailed Semantic Segmentation	Shan Li et.al.	2406.03917	link
2024-06-06	Instance Segmentation and Teeth Classification in Panoramic X-rays	Devichand Budagam et.al.	2406.03747	link
2024-06-05	FedPylot: Navigating Federated Learning for Real-Time Object Detection in Internet of Vehicles	Cyprien Quéméneur et.al.	2406.03611	link
2024-06-05	LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection	Qiang Chen et.al.	2406.03459	link
2024-06-05	Global Clipper: Enhancing Safety and Reliability of Transformer-based Object Detection Models	Qutub Syed Sha et.al.	2406.03229	null
2024-06-05	Situation Monitor: Diversity-Driven Zero-Shot Out-of-Distribution Detection using Budding Ensemble Architecture for Object Detection	Qutub Syed et.al.	2406.03188	null
2024-06-05	Enhanced Automotive Object Detection via RGB-D Fusion in a DiffusionDet Framework	Eliraz Orfaig et.al.	2406.03129	null
2024-06-04	Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation	Mohamed El Amine Boudjoghra et.al.	2406.02548	link
2024-06-04	SatSplatYOLO: 3D Gaussian Splatting-based Virtual Object Detection Ensembles for Satellite Feature Recognition	Van Minh Nguyen et.al.	2406.02533	null
2024-06-04	GrootVL: Tree Topology is All You Need in State Space Model	Yicheng Xiao et.al.	2406.02395	link
2024-06-04	Low-Rank Adaption on Transformer-based Oriented Object Detector for Satellite Onboard Processing of Remote Sensing Images	Xinyang Pu et.al.	2406.02385	link
2024-06-04	Radar Spectra-Language Model for Automotive Scene Parsing	Mariia Pushkareva et.al.	2406.02158	null
2024-06-04	Detecting Endangered Marine Species in Autonomous Underwater Vehicle Imagery Using Point Annotations and Few-Shot Learning	Heather Doig et.al.	2406.01932	null
2024-06-04	GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer	Ding Jia et.al.	2406.01210	link
2024-06-03	Learning Adaptive Fusion Bank for Multi-modal Salient Object Detection	Kunpeng Wang et.al.	2406.01127	link
2024-06-03	Visual Car Brand Classification by Implementing a Synthetic Image Dataset Creation Pipeline	Jan Lippemeier et.al.	2406.01071	null
2024-06-03	Multi-Object Tracking based on Imaging Radar 3D Object Detection	Patrick Palmer et.al.	2406.01011	null
2024-05-31	Power of Cooperative Supervision: Multiple Teachers Framework for Enhanced 3D Semi-Supervised Object Detection	Jin-Hee Lee et.al.	2405.20720	link
2024-05-30	On Calibration of Object Detectors: Pitfalls, Evaluation and Baselines	Selim Kuzucu et.al.	2405.20459	link
2024-05-30	RTGen: Generating Region-Text Pairs for Open-Vocabulary Object Detection	Fangyi Chen et.al.	2405.19854	link
2024-05-30	Improving Object Detector Training on Synthetic Data by Starting With a Strong Baseline Methodology	Frank A. Ruis et.al.	2405.19822	null
2024-05-30	Fully Test-Time Adaptation for Monocular 3D Object Detection	Hongbin Lin et.al.	2405.19682	null
2024-05-30	YotoR-You Only Transform One Representation	José Ignacio Díaz Villa et.al.	2405.19629	null
2024-05-29	Enabling Visual Recognition at Radio Frequency	Haowen Lai et.al.	2405.19516	null
2024-05-29	Model Agnostic Defense against Adversarial Patch Attacks on Object Detection in Unmanned Aerial Vehicles	Saurabh Pathak et.al.	2405.19179	null
2024-05-29	RGB-T Object Detection via Group Shuffled Multi-receptive Attention and Multi-modal Supervision	Jinzhong Wang et.al.	2405.18955	null
2024-05-29	SSGA-Net: Stepwise Spatial Global-local Aggregation Networks for Autonomous Driving	Yiming Cui et.al.	2405.18857	null
2024-05-29	PillarHist: A Quantization-aware Pillar Feature Encoder based on Height-aware Histogram	Sifan Zhou et.al.	2405.18734	null
2024-05-28	A Review and Implementation of Object Detection Models and Optimizations for Real-time Medical Mask Detection during the COVID-19 Pandemic	Ioanna Gogou et.al.	2405.18387	link
2024-05-28	Is a 3D-Tokenized LLM the Key to Reliable Autonomous Driving?	Yifan Bai et.al.	2405.18361	null
2024-05-28	Intent3D: 3D Object Detection in RGB-D Scans Based on Human Intention	Weitai Kang et.al.	2405.18295	null
2024-05-28	DMT-JEPA: Discriminative Masked Targets for Joint-Embedding Predictive Architecture	Shentong Mo et.al.	2405.17995	link
2024-05-28	Self-supervised Pre-training for Transferable Multi-modal Perception	Xiaohao Xu et.al.	2405.17942	null
2024-05-28	Boosting General Trimap-free Matting in the Real-World Image	Leo Shan Wenzhang Zhou Grace Zhao et.al.	2405.17916	null
2024-05-28	The Binary Quantized Neural Network for Dense Prediction via Specially Designed Upsampling and Attention	Xingyu Ding et.al.	2405.17776	null
2024-05-27	Understanding differences in applying DETR to natural and medical images	Yanqi Xu et.al.	2405.17677	null
2024-05-27	Hardness-Aware Scene Synthesis for Semi-Supervised 3D Object Detection	Shuai Zeng et.al.	2405.17422	link
2024-05-27	Tracking Small Birds by Detection Candidate Region Filtering and Detection History-aware Association	Tingwei Liu et.al.	2405.17323	null
2024-05-27	Enhanced Automotive Radar Collaborative Sensing By Exploiting Constructive Interference	Lifan Xu et.al.	2405.17297	null
2024-05-27	SCaRL- A Synthetic Multi-Modal Dataset for Autonomous Driving	Avinash Nittur Ramesh et.al.	2405.17030	null
2024-05-27	Collective Perception Datasets for Autonomous Driving: A Comprehensive Review	Sven Teufel et.al.	2405.16973	null
2024-05-27	OED: Towards One-stage End-to-End Dynamic Scene Graph Generation	Guan Wang et.al.	2405.16925	link
2024-05-27	ContrastAlign: Toward Robust BEV Feature Alignment via Contrastive Learning for Multi-Modal 3D Object Detection	Ziying Song et.al.	2405.16873	null
2024-05-27	A re-calibration method for object detection with multi-modal alignment bias in autonomous driving	Zhihang Song et.al.	2405.16848	null
2024-05-26	A Study on Unsupervised Anomaly Detection and Defect Localization using Generative Model in Ultrasonic Non-Destructive Testing	Yusaku Ando et.al.	2405.16580	null
2024-05-25	GreenCOD: A Green Camouflaged Object Detection Method	Hong-Shuo Chen et.al.	2405.16144	null
2024-05-24	UNION: Unsupervised 3D Object Detection using Object Appearance-based Pseudo-Classes	Ted Lentsch et.al.	2405.15688	link
2024-05-24	Multimodal Object Detection via Probabilistic a priori Information Integration	Hafsa El Hafyani et.al.	2405.15596	link
2024-05-24	Scale-Invariant Feature Disentanglement via Adversarial Learning for UAV-based Object Detection	Fan Liu et.al.	2405.15465	null
2024-05-24	Leveraging knowledge distillation for partial multi-task learning from multiple remote sensing datasets	Hoàng-Ân Lê et.al.	2405.15394	link
2024-05-24	Towards Global Optimal Visual In-Context Learning Prompt Selection	Chengming Xu et.al.	2405.15279	null
2024-05-24	Unbiased Faster R-CNN for Single-source Domain Generalized Object Detection	Yajing Liu et.al.	2405.15225	null
2024-05-24	ODGEN: Domain-specific Object Detection Data Generation with Diffusion Models	Jingyuan Zhu et.al.	2405.15199	null
2024-05-24	MonoDETRNext: Next-generation Accurate and Efficient Monocular 3D Object Detection Method	Pan Liao et.al.	2405.15176	null
2024-05-23	Learning to Detect and Segment Mobile Objects from Unlabeled Videos	Yihong Sun et.al.	2405.14841	link
2024-05-23	Designing A Sustainable Marine Debris Clean-up Framework without Human Labels	Raymond Wang et.al.	2405.14815	link
2024-05-23	Drones Help Drones: A Collaborative Framework for Multi-Drone Object Trajectory Prediction and Beyond	Zhechao Wang et.al.	2405.14674	link
2024-05-23	Improving Single Domain-Generalized Object Detection: A Focus on Diversification and Alignment	Muhammad Sohail Danish et.al.	2405.14497	link
2024-05-23	YOLOv10: Real-Time End-to-End Object Detection	Ao Wang et.al.	2405.14458	link
2024-05-23	Harmony: A Joint Self-Supervised and Weakly-Supervised Framework for Learning General Purpose Visual Representations	Mohammed Baharoon et.al.	2405.14239	link
2024-05-22	Two Heads are Better Than One: Neural Networks Quantization with 2D Hilbert Curve-based Output Representation	Mykhailo Uss et.al.	2405.14024	null
2024-05-22	TS40K: a 3D Point Cloud Dataset of Rural Terrain and Electrical Transmission System	Diogo Lavado et.al.	2405.13989	null
2024-05-22	Class-Conditional self-reward mechanism for improved Text-to-Image models	Safouane El Ghazouali et.al.	2405.13473	link
2024-05-22	Adaptive Wireless Image Semantic Transmission and Over-The-Air Testing	Jiarun Ding et.al.	2405.13403	null
2024-05-21	BiomedParse: a biomedical foundation model for image parsing of everything everywhere all at once	Theodore Zhao et.al.	2405.12971	null
2024-05-21	FFAM: Feature Factorization Activation Map for Explanation of 3D Detectors	Shuai Liu et.al.	2405.12601	link
2024-05-21	Active Object Detection with Knowledge Aggregation and Distillation from Large Models	Dejie Yang et.al.	2405.12509	link
2024-05-21	Mutual Information Analysis in Multimodal Learning Systems	Hadi Hadizadeh et.al.	2405.12456	null
2024-05-20	Multi-View Attentive Contextualization for Multi-View 3D Object Detection	Xianpeng Liu et.al.	2405.12200	null
2024-05-20	Bangladeshi Native Vehicle Detection in Wild	Bipin Saha et.al.	2405.12150	link
2024-05-20	Salience-guided Ground Factor for Robust Localization of Delivery Robots in Complex Urban Environments	Jooyong Park et.al.	2405.11855	null
2024-05-20	DATR: Unsupervised Domain Adaptive Detection Transformer with Dataset-Level Adaptation and Prototypical Alignment	Jianhong Han et.al.	2405.11765	link
2024-05-20	Versatile Teacher: A Class-aware Teacher-student Framework for Cross-domain Adaptation	Runou Yang et.al.	2405.11754	link
2024-05-19	FADet: A Multi-sensor 3D Object Detection Network based on Local Featured Attention	Ziang Guo et.al.	2405.11682	link
2024-05-19	SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization	Jialong Guo et.al.	2405.11582	link
2024-05-18	InfRS: Incremental Few-Shot Object Detection in Remote Sensing Images	Wuzhou Li et.al.	2405.11293	null
2024-05-18	Visible and Clear: Finding Tiny Objects in Difference Map	Bing Cao et.al.	2405.11276	null
2024-05-17	A Versatile Framework for Analyzing Galaxy Image Data by Implanting Human-in-the-loop on a Large Vision Model	Mingxiang Fu et.al.	2405.10890	null
2024-05-17	DuoSpaceNet: Leveraging Both Bird's-Eye-View and Perspective View Representations for 3D Object Detection	Zhe Huang et.al.	2405.10577	null
2024-05-16	Drone-type-Set: Drone types detection benchmark for drone detection and tracking	Kholoud AlDosari et.al.	2405.10398	null
2024-05-16	Grounded 3D-LLM with Referent Tokens	Yilun Chen et.al.	2405.10370	link
2024-05-16	Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection	Tianhe Ren et.al.	2405.10300	link
2024-05-16	Towards Task-Compatible Compressible Representations	Anderson de Andrade et.al.	2405.10244	link
2024-05-16	SpecDETR: A Transformer-based Hyperspectral Point Object Detection Network	Zhaoxu Li et.al.	2405.10148	link
2024-05-16	SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection	Mingxuan Liu et.al.	2405.10053	link
2024-05-19	FPDIoU Loss: A Loss Function for Efficient Bounding Box Regression of Rotated Object Detection	Siliang Ma et.al.	2405.09942	null
2024-05-19	PillarNeXt: Improving the 3D detector by introducing Voxel2Pillar feature encoding and extracting multi-scale features	Xusheng Li et.al.	2405.09828	null
2024-05-16	Size-invariance Matters: Rethinking Metrics and Losses for Imbalanced Multi-object Salient Object Detection	Feiran Li et.al.	2405.09782	link
2024-05-15	Synth-to-Real Unsupervised Domain Adaptation for Instance Segmentation	Guo Yachan et.al.	2405.09682	null
2024-05-15	Dynamic Loss Decay based Robust Oriented Object Detection on Remote Sensing Images with Noisy Labels	Guozhang Liu et.al.	2405.09024	null
2024-05-14	CLIP with Quality Captions: A Strong Pretraining for Vision Tasks	Pavan Kumar Anasosalu Vasu et.al.	2405.08911	null
2024-05-14	Open-Vocabulary Object Detection via Neighboring Region Attention Alignment	Sunyuan Qiang et.al.	2405.08593	null
2024-05-14	RDPN6D: Residual-based Dense Point-wise Network for 6Dof Object Pose Estimation Based on RGB-D Images	Zong-Wei Hong et.al.	2405.08483	link
2024-05-14	Multimodal Collaboration Networks for Geospatial Vehicle Detection in Dense, Occluded, and Large-Scale Events	Xin Wu et.al.	2405.08251	link
2024-05-13	oTTC: Object Time-to-Contact for Motion Estimation in Autonomous Driving	Abdul Hannan Khan et.al.	2405.07698	null
2024-05-13	MonoMAE: Enhancing Monocular 3D Detection through Depth-Aware Masked Autoencoders	Xueying Jiang et.al.	2405.07696	null
2024-05-13	Quality-aware Selective Fusion Network for V-D-T Salient Object Detection	Liuxin Bao et.al.	2405.07655	link
2024-05-13	Fast Training Data Acquisition for Object Detection and Segmentation using Black Screen Luminance Keying	Thomas Pöllabauer et.al.	2405.07653	null
2024-05-13	Integrity Monitoring of 3D Object Detection in Automated Driving Systems using Raw Activation Patterns and Spatial Filtering	Hakan Yekta Yatbaz et.al.	2405.07600	null
2024-05-13	Environmental Matching Attack Against Unmanned Aerial Vehicles Object Detection	Dehong Kong et.al.	2405.07595	null
2024-05-13	Enhancing 3D Object Detection by Using Neural Network with Self-adaptive Thresholding	Houze Liu et.al.	2405.07479	null
2024-05-12	Building a Strong Pre-Training Baseline for Universal 3D Large-Scale Perception	Haoming Chen et.al.	2405.07201	link
2024-05-12	Differentiable Model Scaling using Differentiable Topk	Kai Liu et.al.	2405.07194	link
2024-05-12	Resource Efficient Perception for Vision Systems	A V Subramanyam et.al.	2405.07166	link
2024-05-10	How to Augment for Atmospheric Turbulence Effects on Thermal Adapted Object Detection Models?	Engin Uzun et.al.	2405.06383	null
2024-05-10	Precise Apple Detection and Localization in Orchards using YOLOv5 for Robotic Harvesting Systems	Jiang Ziyue et.al.	2405.06260	null
2024-05-13	CSA-Net: Channel-wise Spatially Autocorrelated Attention Networks	Nick Nikzad et.al.	2405.05755	null
2024-05-09	Depth Awakens: A Depth-perceptual Attention Fusion Network for RGB-D Camouflaged Object Detection	Xinran Liua et.al.	2405.05614	null
2024-05-09	The object detection model uses combined extraction with KNN and RF classification	Florentina Tatrin Kurniati et.al.	2405.05551	null
2024-05-08	Reviewing Intelligent Cinematography: AI research for camera-based video production	Adrian Azzarelli et.al.	2405.05039	null
2024-05-07	A Novel Wide-Area Multiobject Detection System with High-Probability Region Searching	Xianlei Long et.al.	2405.04589	null
2024-05-07	DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving	Chen Min et.al.	2405.04390	null
2024-05-07	A New Dataset and Comparative Study for Aphid Cluster Detection and Segmentation in Sorghum Fields	Raiyan Rahman et.al.	2405.04305	null
2024-05-07	ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers	Jinke Li et.al.	2405.04299	link
2024-05-07	Deep Event-based Object Detection in Autonomous Driving: A Survey	Bingquan Zhou et.al.	2405.03995	null
2024-05-06	BadFusion: 2D-Oriented Backdoor Attacks against 3D Object Detection	Saket S. Chaturvedi et.al.	2405.03884	null
2024-05-06	RepVGG-GELAN: Enhanced GELAN with VGG-STYLE ConvNets for Brain Tumour Detection	Thennarasi Balakrishnan et.al.	2405.03541	link
2024-05-06	Low-light Object Detection	Pengpeng Li et.al.	2405.03519	null
2024-05-09	Salient Object Detection From Arbitrary Modalities	Nianchang Huang et.al.	2405.03352	link
2024-05-06	Modality Prompts for Arbitrary Modality Salient Object Detection	Nianchang Huang et.al.	2405.03351	null
2024-05-06	PTQ4SAM: Post-Training Quantization for Segment Anything	Chengtao Lv et.al.	2405.03144	link
2024-05-05	Performance Evaluation of Real-Time Object Detection for Electric Scooters	Dong Chen et.al.	2405.03039	link
2024-05-05	SalFAU-Net: Saliency Fusion Attention U-Net for Salient Object Detection	Kassaw Abraham Mulat et.al.	2405.02906	null
2024-05-07	Adaptive Guidance Learning for Camouflaged Object Detection	Zhennan Chen et.al.	2405.02824	null
2024-05-05	PVTransformer: Point-to-Voxel Transformer for Scalable 3D Object Detection	Zhaoqi Leng et.al.	2405.02811	null
2024-05-05	Fused attention mechanism-based ore sorting network	Junjiang Zhen et.al.	2405.02785	null
2024-05-02	Segmentation-Free Outcome Prediction in Head and Neck Cancer: Deep Learning-based Feature Extraction from Multi-Angle Maximum Intensity Projections (MA-MIPs) of PET Images	Amirhosein Toosi et.al.	2405.01756	null
2024-05-02	PointCompress3D -- A Point Cloud Compression Framework for Roadside LiDARs in Intelligent Transportation Systems	Walter Zimmer et.al.	2405.01750	null
2024-05-02	Development of Skip Connection in Deep Neural Networks for Computer Vision and Medical Image Analysis: A Survey	Guoping Xu et.al.	2405.01725	link
2024-05-06	SOAR: Advancements in Small Body Object Detection for Aerial Imagery Using State Space Models and Programmable Gradients	Tushar Verma et.al.	2405.01699	null
2024-05-02	Overcoming LLM Challenges using RAG-Driven Precision in Coffee Leaf Disease Remediation	Dr. Selva Kumar S et.al.	2405.01310	null
2024-05-02	Towards Consistent Object Detection via LiDAR-Camera Synergy	Kai Luo et.al.	2405.01258	link
2024-05-02	Federated Learning with Heterogeneous Data Handling for Robust Vehicular Object Detection	Ahmad Khalil et.al.	2405.01108	link
2024-05-01	Object detection under the linear subspace model with application to cryo-EM images	Amitay Eldar et.al.	2405.00364	link
2024-04-30	Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation	Yunhao Ge et.al.	2404.19752	null
2024-04-30	Quantifying Nematodes through Images: Datasets, Models, and Baselines of Deep Learning	Zhipeng Yuan et.al.	2404.19748	null
2024-04-30	Masked Multi-Query Slot Attention for Unsupervised Object Discovery	Rishav Pramanik et.al.	2404.19654	link
2024-04-30	Physical Backdoor: Towards Temperature-based Backdoor Attacks in the Physical World	Wen Yin et.al.	2404.19417	null
2024-04-30	UniFS: Universal Few-shot Instance Perception with Point Representations	Sheng Jin et.al.	2404.19401	link
2024-04-30	Pseudo Label Refinery for Unsupervised Domain Adaptation on Cross-dataset 3D Object Detection	Zhanwei Zhang et.al.	2404.19384	null
2024-04-29	MiPa: Mixed Patch Infrared-Visible Modality Agnostic Object Detection	Heitor R. Medeiros et.al.	2404.18849	link
2024-04-29	Leveraging PointNet and PointNet++ for Lyft Point Cloud Classification Challenge	Rajat K. Doshi et.al.	2404.18665	null
2024-04-29	CoSense3D: an Agent-based Efficient Learning Framework for Collective Perception	Yunshuang Yuan et.al.	2404.18617	link
2024-04-29	Efficient Meta-Learning Enabled Lightweight Multiscale Few-Shot Object Detection in Remote Sensing Images	Wenbin Guan et.al.	2404.18426	null
2024-04-29	Multi-modal Perception Dataset of In-water Objects for Autonomous Surface Vehicles	Mingi Jeong et.al.	2404.18411	null
2024-04-28	FAD-SAR: A Novel Fishing Activity Detection System via Synthetic Aperture Radar Images Based on Deep Learning Method	Yanbing Bai et.al.	2404.18245	null
2024-04-28	RadSimReal: Bridging the Gap Between Synthetic and Real Data in Radar Object Detection With Simulation	Oded Bialer et.al.	2404.18150	null
2024-04-27	Reliable Student: Addressing Noise in Semi-Supervised 3D Object Detection	Farzad Nozarian et.al.	2404.17910	link
2024-04-27	A Hybrid Approach for Document Layout Analysis in Document images	Tahira Shehzadi et.al.	2404.17888	null
2024-04-27	BoostRad: Enhancing Object Detection by Boosting Radar Reflections	Yuval Haitman et.al.	2404.17861	null
2024-04-26	Inhomogeneous illuminated image enhancement under extremely low visibility condition	Libang Chen et.al.	2404.17503	null
2024-04-26	Cost-Sensitive Uncertainty-Based Failure Recognition for Object Detection	Moussa Kassem Sbeyti et.al.	2404.17427	link
2024-04-26	Enhancing mmWave Radar Point Cloud via Visual-inertial Supervision	Cong Fan et.al.	2404.17229	link
2024-04-25	Generating Minimalist Adversarial Perturbations to Test Object-Detection Models: An Adaptive Multi-Metric Evolutionary Search Approach	Cristopher McIntyre-Garcia et.al.	2404.17020	link
2024-04-25	Constellation Dataset: Benchmarking High-Altitude Object Detection for an Urban Intersection	Mehmet Kerem Turkcan et.al.	2404.16944	link
2024-04-25	Self-Balanced R-CNN for Instance Segmentation	Leonardo Rossi et.al.	2404.16633	link
2024-04-25	Cross-Domain Spatial Matching for Camera and Radar Sensor Data Fusion in Autonomous Vehicle Perception System	Daniel Dworak et.al.	2404.16548	null
2024-04-25	Commonsense Prototype for Outdoor Unsupervised 3D Object Detection	Hai Wu et.al.	2404.16493	link
2024-04-25	IMWA: Iterative Model Weight Averaging Benefits Class-Imbalanced Learning Tasks	Zitong Huang et.al.	2404.16331	null
2024-04-25	CFMW: Cross-modality Fusion Mamba for Multispectral Object Detection under Adverse Weather Conditions	Haoyuan Li et.al.	2404.16302	link
2024-04-24	AutoGluon-Multimodal (AutoMM): Supercharging Multimodal AutoML with Foundation Models	Zhiqiang Tang et.al.	2404.16233	null
2024-04-24	Observational parameters of Blue Large-Amplitude Pulsators	P. Pietrukowicz et.al.	2404.16089	null
2024-04-26	A Survey on Visual Mamba	Hanwei Zhang et.al.	2404.15956	null
2024-04-24	Steal Now and Attack Later: Evaluating Robustness of Object Detection against Black-box Adversarial Attacks	Erh-Chung Chen et.al.	2404.15881	null
2024-04-24	Revisiting Out-of-Distribution Detection in LiDAR-based 3D Object Detection	Michael Kösel et.al.	2404.15879	link
2024-04-23	CFPFormer: Feature-pyramid like Transformer Decoder for Segmentation and Detection	Hongyi Cai et.al.	2404.15451	null
2024-04-23	Source-free Domain Adaptation for Video Object Detection Under Adverse Image Conditions	Xingguang Zhang et.al.	2404.15252	null
2024-04-23	Efficient Transformer Encoders for Mask2Former-style models	Manyi Yao et.al.	2404.15244	null
2024-04-23	Gallbladder Cancer Detection in Ultrasound Images based on YOLO and Faster R-CNN	Sara Dadjouy et.al.	2404.15129	null
2024-04-23	External Prompt Features Enhanced Parameter-efficient Fine-tuning for Salient Object Detection	Wen Liang et.al.	2404.15008	null
2024-04-23	ContextualFusion: Context-Based Multi-Sensor Fusion for 3D Object Detection in Adverse Operating Conditions	Shounak Sural et.al.	2404.14780	null
2024-04-23	Unified Unsupervised Salient Object Detection via Knowledge Transfer	Yao Yuan et.al.	2404.14759	link
2024-04-22	CKD: Contrastive Knowledge Distillation from A Sample-wise Perspective	Wencheng Zhu et.al.	2404.14109	null
2024-04-22	Benchmarking Multi-Modal LLMs for Testing Visual Deep Learning Systems Through the Lens of Image Mutation	Liwen Wang et.al.	2404.13945	null
2024-04-22	NeRF-DetS: Enhancing Multi-View 3D Object Detection with Sampling-adaptive Network of Continuous NeRF-based Representation	Chi Huang et.al.	2404.13921	null
2024-04-22	TeamTrack: A Dataset for Multi-Sport Multi-Object Tracking in Full-pitch Videos	Atom Scott et.al.	2404.13868	null
2024-04-22	Toward Robust LiDAR based 3D Object Detection via Density-Aware Adaptive Thresholding	Eunho Lee et.al.	2404.13852	null
2024-04-21	A Nasal Cytology Dataset for Object Detection and Deep Learning	Mauro Camporeale et.al.	2404.13745	null
2024-04-23	Clio: Real-time Task-Driven Open-Set 3D Scene Graphs	Dominic Maggio et.al.	2404.13696	link
2024-04-20	FisheyeDetNet: Object Detection on Fisheye Surround View Camera Systems for Automated Driving	Ganesh Sistu et.al.	2404.13443	null
2024-04-20	Efficient and Concise Explanations for Object Detection with Gaussian-Class Activation Mapping Explainer	Quoc Khanh Nguyen et.al.	2404.13417	link
2024-04-19	A comparison between single-stage and two-stage 3D tracking algorithms for greenhouse robotics	David Rapado-Rincon et.al.	2404.12963	null
2024-04-19	Language-Driven Active Learning for Diverse Open-Set 3D Object Detection	Ross Greer et.al.	2404.12856	link
2024-04-19	ECOR: Explainable CLIP for Object Recognition	Ali Rasekh et.al.	2404.12839	null
2024-04-19	A Point-Based Approach to Efficient LiDAR Multi-Task Perception	Christopher Lang et.al.	2404.12798	null
2024-04-19	ELEV-VISION-SAM: Integrated Vision Language and Foundation Model for Automated Estimation of Building Lowest Floor Elevation	Yu-Hsuan Ho et.al.	2404.12606	null
2024-04-18	The devil is in the object boundary: towards annotation-free instance segmentation using Foundation Models	Cheng Shi et.al.	2404.11957	link
2024-04-18	Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition	Xunsong Li et.al.	2404.11903	null
2024-04-17	TempBEV: Improving Learned BEV Encoders with Combined Image and BEV Space Temporal Aggregation	Thomas Monninger et.al.	2404.11803	null
2024-04-17	Multimodal 3D Object Detection on Unseen Domains	Deepti Hegde et.al.	2404.11764	null
2024-04-17	Equivariant Spatio-Temporal Self-Supervision for LiDAR Object Detection	Deepti Hegde et.al.	2404.11737	null
2024-04-17	Multi-resolution Rescored ByteTrack for Video Object Detection on Ultra-low-power Embedded Systems	Luca Bompani et.al.	2404.11488	link
2024-04-17	EcoMLS: A Self-Adaptation Approach for Architecting Green ML-Enabled Systems	Meghana Tedla et.al.	2404.11411	null
2024-04-17	Detector Collapse: Backdooring Object Detection to Catastrophic Overload or Blindness	Hangtao Zhang et.al.	2404.11357	null
2024-04-17	Simple In-place Data Augmentation for Surveillance Object Detection	Munkh-Erdene Otgonbold et.al.	2404.11226	null
2024-04-19	Feature Corrective Transfer Learning: End-to-End Solutions to Object Detection in Non-Ideal Visual Conditions	Chuheng Wei et.al.	2404.11214	null
2024-04-17	GhostNetV3: Exploring the Training Strategies for Compact Models	Zhenhua Liu et.al.	2404.11202	null
2024-04-17	How to deal with glare for improved perception of Autonomous Vehicles	Muhammad Z. Alam et.al.	2404.10992	null
2024-04-17	Leveraging 3D LiDAR Sensors to Enable Enhanced Urban Safety and Public Health: Pedestrian Monitoring and Abnormal Activity Detection	Nawfal Guefrachi et.al.	2404.10978	null
2024-04-16	OSR-ViT: A Simple and Modular Framework for Open-Set Object Detection and Discovery	Matthew Inkawhich et.al.	2404.10865	null
2024-04-16	Learning Feature Inversion for Multi-class Anomaly Detection under General-purpose COCO-AD Benchmark	Jiangning Zhang et.al.	2404.10760	link
2024-04-16	Watch Your Step: Optimal Retrieval for Continual Learning at Scale	Truman Hickok et.al.	2404.10758	null
2024-04-16	Camera clustering for scalable stream-based active distillation	Dani Manjah et.al.	2404.10411	null
2024-04-15	Low-Light Image Enhancement Framework for Improved Object Detection in Fisheye Lens Datasets	Dai Quoc Tran et.al.	2404.10078	link
2024-04-15	Explainable Light-Weight Deep Learning Pipeline for Improved Drought Stress	Aswini Kumar Patra et.al.	2404.10073	null
2024-04-15	VFMM3D: Releasing the Potential of Image by Vision Foundation Model for Monocular 3D Object Detection	Bonan Ding et.al.	2404.09431	null
2024-04-14	TEXT2TASTE: A Versatile Egocentric Vision System for Intelligent Reading Assistance Using Large Language Model	Wiktor Mucha et.al.	2404.09254	null
2024-04-14	DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection	Lewei Yao et.al.	2404.09216	null
2024-04-14	Coreset Selection for Object Detection	Hojun Lee et.al.	2404.09161	null
2024-04-14	Fusion-Mamba for Cross-modality Object Detection	Wenhao Dong et.al.	2404.09146	null
2024-04-13	The Snake's Beating Heart? A Millisecond Pulsar Binary in the Galactic Center Radio Filament G359.1-0.2	Marcus E. Lower et.al.	2404.09098	null
2024-04-13	BG-YOLO: A Bidirectional-Guided Method for Underwater Object Detection	Jian Zhang et.al.	2404.08979	null
2024-04-13	Shifting Spotlight for Co-supervision: A Simple yet Efficient Single-branch Network to See Through Camouflage	Yang Hu et.al.	2404.08936	null
2024-04-12	Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation	Yanhao Zheng et.al.	2404.08603	link
2024-04-12	FashionFail: Addressing Failure Cases in Fashion Object Detection and Segmentation	Riza Velioglu et.al.	2404.08582	link
2024-04-12	Analyzing Decades-Long Environmental Changes in Namibia Using Archival Aerial Photography and Deep Learning	Girmaw Abebe Tadesse et.al.	2404.08544	null
2024-04-12	MambaDFuse: A Mamba-based Dual-phase Model for Multi-modality Image Fusion	Zhe Li et.al.	2404.08406	link
2024-04-12	Overcoming Scene Context Constraints for Object Detection in wild using Defilters	Vamshi Krishna Kancharla et.al.	2404.08293	null
2024-04-11	ConsistencyDet: Robust Object Detector with Denoising Paradigm of Consistency Model	Lifan Jiang et.al.	2404.07773	link
2024-04-11	Exploiting Object-based and Segmentation-based Semantic Features for Deep Learning-based Indoor Scene Classification	Ricardo Pereira et.al.	2404.07739	null
2024-04-11	Run-time Monitoring of 3D Object Detection in Automated Driving Systems Using Early Layer Neural Activation Patterns	Hakan Yekta Yatbaz et.al.	2404.07685	null
2024-04-11	Finding Dino: A plug-and-play framework for unsupervised detection of out-of-distribution objects using prototypes	Poulami Sinhamahapatra et.al.	2404.07664	null
2024-04-11	Separated Attention: An Improved Cycle GAN Based Under Water Image Enhancement Method	Tashmoy Ghosh et.al.	2404.07649	null
2024-04-11	GLID: Pre-training a Generalist Encoder-Decoder Vision Model	Jihao Liu et.al.	2404.07603	null
2024-04-11	SFSORT: Scene Features-based Simple Online Real-Time Tracker	M. M. Morsali et.al.	2404.07553	link
2024-04-11	The Sydney Radio Star Catalogue: properties of radio stars at megahertz to gigahertz frequencies	Laura N. Driessen et.al.	2404.07418	null
2024-04-11	Simplifying Two-Stage Detectors for On-Device Inference in Remote Sensing	Jaemin Kang et.al.	2404.07405	null
2024-04-11	A fine-tuning workflow for automatic first-break picking with deep learning	Amir Mardan et.al.	2404.07400	link
2024-04-10	Identification of Fine-grained Systematic Errors via Controlled Scene Generation	Valentyn Boreiko et.al.	2404.07045	null
2024-04-10	Accurate Tennis Court Line Detection on Amateur Recorded Matches	Sameer Agrawal et.al.	2404.06977	null
2024-04-10	Sparse Points to Dense Clouds: Enhancing 3D Detection with Limited LiDAR Data	Aakash Kumar et.al.	2404.06715	null
2024-04-10	Scaling Multi-Camera 3D Object Detection through Weak-to-Strong Eliciting	Hao Lu et.al.	2404.06700	link
2024-04-09	Learning Embeddings with Centroid Triplet Loss for Object Identification in Robotic Grasping	Anas Gouda et.al.	2404.06277	link
2024-04-09	Label-Efficient 3D Object Detection For Road-Side Units	Minh-Quan Dao et.al.	2404.06256	null
2024-04-09	Automatic Defect Detection in Sewer Network Using Deep Learning Based Object Detector	Bach Ha et.al.	2404.06219	null
2024-04-09	YOLC: You Only Look Clusters for Tiny Object Detection in Aerial Images	Chenguang Liu et.al.	2404.06180	link
2024-04-09	Enhanced Radar Perception via Multi-Task Learning: Towards Refined Data for Sensor Fusion Applications	Huawei Sun et.al.	2404.06165	null
2024-04-09	Improving Facial Landmark Detection Accuracy and Efficiency with Knowledge Distillation	Zong-Wei Hong et.al.	2404.06029	null
2024-04-08	Retrieval-Augmented Open-Vocabulary Object Detection	Jooyeon Kim et.al.	2404.05687	link
2024-04-08	3D-COCO: extension of MS-COCO dataset for image detection and 3D reconstruction modules	Maxence Bideaux et.al.	2404.05641	null
2024-04-08	Detecting Every Object from Events	Haitian Zhang et.al.	2404.05285	link
2024-04-08	MOSE: Boosting Vision-based Roadside 3D Object Detection with Scene Cues	Xiahan Chen et.al.	2404.05280	null
2024-04-08	Rendering-Enhanced Automatic Image-to-Point Cloud Registration for Roadside Scenes	Yu Sheng et.al.	2404.05164	null
2024-04-08	Better Monocular 3D Detectors with LiDAR from the Past	Yurong You et.al.	2404.05139	link
2024-04-07	AirShot: Efficient Few-Shot Detection for Autonomous Exploration	Zihan Wang et.al.	2404.05069	link
2024-04-07	Hyperbolic Learning with Synthetic Captions for Open-World Detection	Fanjie Kong et.al.	2404.05016	null
2024-04-07	MonoTAKD: Teaching Assistant Knowledge Distillation for Monocular 3D Object Detection	Hou-I Liu et.al.	2404.04910	null
2024-04-07	Few-Shot Object Detection: Research Advances and Challenges	Zhimeng Xin et.al.	2404.04799	null
2024-04-05	SCAResNet: A ResNet Variant Optimized for Tiny Object Detection in Transmission and Distribution Towers	Weile Li et.al.	2404.04179	link
2024-04-05	Designing Robots to Help Women	Martin Cooney et.al.	2404.04123	null
2024-04-04	Is CLIP the main roadblock for fine-grained open-world perception?	Lorenzo Bianchi et.al.	2404.03539	link
2024-04-04	DQ-DETR: DETR with Dynamic Query for Tiny Object Detection	Yi-Xin Huang et.al.	2404.03507	null
2024-04-05	A Methodology to Study the Impact of Spiking Neural Network Parameters considering Event-Based Automotive Data	Iqra Bano et.al.	2404.03493	null
2024-04-04	MonoCD: Monocular 3D Object Detection with Complementary Depths	Longfei Yan et.al.	2404.03181	link
2024-04-03	DPFT: Dual Perspective Fusion Transformer for Camera-Radar-based Object Detection	Felix Fent et.al.	2404.03015	link
2024-04-03	ALOHa: A New Measure for Hallucination in Captioning Models	Suzanne Petryk et.al.	2404.02904	null
2024-04-03	FlightScope: A Deep Comprehensive Assessment of Aircraft Detection Algorithms in Satellite Imagery	Safouane El Ghazouali et.al.	2404.02877	link
2024-04-03	HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras	Zhongyu Xia et.al.	2404.02517	link
2024-04-04	TE-TAD: Towards Full End-to-End Temporal Action Detection via Time-Aligned Coordinate Expression	Ho-Joong Kim et.al.	2404.02405	link
2024-04-05	EGTR: Extracting Graph from Transformer for Scene Graph Generation	Jinbae Im et.al.	2404.02072	link
2024-04-03	Cooperative Students: Navigating Unsupervised Domain Adaptation in Nighttime Object Detection	Jicheng Yuan et.al.	2404.01988	link
2024-04-02	Towards Enhanced Analysis of Lung Cancer Lesions in EBUS-TBNA -- A Semi-Supervised Video Object Detection Method	Jyun-An Lin et.al.	2404.01929	null
2024-04-02	Scene Adaptive Sparse Transformer for Event-based Object Detection	Yansong Peng et.al.	2404.01882	link
2024-04-02	Semi-Supervised Domain Adaptation for Wildfire Detection	JooYoung Jang et.al.	2404.01842	link
2024-04-02	Sparse Semi-DETR: Sparse Learnable Queries for Semi-Supervised Object Detection	Tahira Shehzadi et.al.	2404.01819	null
2024-04-02	Analyzing the Single Event Upset Vulnerability of Binarized Neural Networks on SRAM FPGAs	Ioanna Souvatzoglou et.al.	2404.01757	null
2024-04-02	Disentangled Pre-training for Human-Object Interaction Detection	Zhuolong Li et.al.	2404.01725	link
2024-04-02	Task Integration Distillation for Object Detectors	Hai Su et.al.	2404.01699	null
2024-04-02	Beyond Image Super-Resolution for Image Recognition with Task-Driven Perceptual Loss	Jaeha Kim et.al.	2404.01692	link
2024-03-29	PLoc: A New Evaluation Criterion Based on Physical Location for Autonomous Driving Datasets	Ruining Yang et.al.	2403.19893	null
2024-03-29	MambaMixer: Efficient Selective State Space Models with Dual Token and Channel Selection	Ali Behrouz et.al.	2403.19888	null
2024-03-28	DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs	Donghyun Kim et.al.	2403.19588	link
2024-03-28	OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via Cycle-Modality Propagation	Zhenyu Wang et.al.	2403.19580	link
2024-03-28	Sparse Generation: Making Pseudo Labels Sparse for weakly supervision with points	Tian Ma et.al.	2403.19306	null
2024-03-28	CAT: Exploiting Inter-Class Dynamics for Domain Adaptive Object Detection	Mikhail Kennerley et.al.	2403.19278	link
2024-03-28	Algorithmic Ways of Seeing: Using Object Detection to Facilitate Art Exploration	Louie Søs Meyer et.al.	2403.19174	null
2024-03-28	CRKD: Enhanced Camera-Radar Object Detection with Cross-modality Knowledge Distillation	Lingjun Zhao et.al.	2403.19104	null
2024-03-28	A Real-Time Framework for Domain-Adaptive Underwater Object Detection with Image Enhancement	Junjie Wen et.al.	2403.19079	null
2024-03-27	Illicit object detection in X-ray images using Vision Transformers	Jorgen Cani et.al.	2403.19043	null
2024-03-27	Benchmarking Object Detectors with COCO: A New Path Forward	Shweta Singh et.al.	2403.18819	link
2024-03-27	PhysicsAssistant: An LLM-Powered Interactive Learning Robot for Physics Lab Investigations	Ehsan Latif et.al.	2403.18721	null
2024-03-27	CosalPure: Learning Concept from Group Images for Robust Co-Saliency Detection	Jiayi Zhu et.al.	2403.18554	null
2024-03-27	BAM: Box Abstraction Monitors for Real-time OoD Detection in Object Detection	Changshun Wu et.al.	2403.18373	null
2024-03-27	Ship in Sight: Diffusion Models for Ship-Image Super Resolution	Luigi Sigillo et.al.	2403.18370	link
2024-03-27	DODA: Diffusion for Object-detection Domain Adaptation in Agriculture	Shuai Xiang et.al.	2403.18334	link
2024-03-27	Tracking-Assisted Object Detection with Event Cameras	Ting-Kang Yen et.al.	2403.18330	link
2024-03-27	SGDM: Static-Guided Dynamic Module Make Stronger Visual Models	Wenjie Xing et.al.	2403.18282	null
2024-03-27	Road Obstacle Detection based on Unknown Objectness Scores	Chihiro Noguchi et.al.	2403.18207	null
2024-03-26	State of the art applications of deep learning within tracking and detecting marine debris: A survey	Zoe Moorton et.al.	2403.18067	null
2024-03-26	The Solution for the CVPR 2023 1st foundation model challenge-Track2	Haonan Xu et.al.	2403.17702	null
2024-03-26	PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition	Chenhongyi Yang et.al.	2403.17695	link
2024-03-26	UADA3D: Unsupervised Adversarial Domain Adaptation for 3D Object Detection with Sparse LiDAR and Large Domain Gaps	Maciej K Wozniak et.al.	2403.17633	link
2024-03-26	SSF3D: Strict Semi-Supervised 3D Object Detection with Switching Filter	Songbur Wong et.al.	2403.17390	null
2024-03-26	Decoupled Pseudo-labeling for Semi-Supervised Monocular 3D Object Detection	Jiacheng Zhang et.al.	2403.17387	null
2024-03-26	AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving	Mingfu Liang et.al.	2403.17373	null
2024-03-26	Staircase Localization for Autonomous Exploration in Urban Environments	Jinrae Kim et.al.	2403.17330	null
2024-03-25	Co-Occurring of Object Detection and Identification towards unlabeled object discovery	Binay Kumar Singh et.al.	2403.17223	null
2024-03-25	Optimizing LiDAR Placements for Robust Driving Perception in Adverse Conditions	Ye Li et.al.	2403.17009	link
2024-03-25	Isolated Diffusion: Optimizing Multi-Concept Text-to-Image Generation Training-Freely with Isolated Diffusion Guidance	Jingyuan Zhu et.al.	2403.16954	null
2024-03-25	RCBEVDet: Radar-camera Fusion in Bird's Eye View for 3D Object Detection	Zhiwei Lin et.al.	2403.16440	link
2024-03-25	ASDF: Assembly State Detection Utilizing Late Fusion by Integrating 6D Pose Estimation	Hannah Schieber et.al.	2403.16400	link
2024-03-25	Impact of Video Compression Artifacts on Fisheye Camera Visual Perception Tasks	Madhumitha Sakthi et.al.	2403.16338	null
2024-03-24	Cross-domain Multi-modal Few-shot Object Detection via Rich Text	Zeyu Shangguan et.al.	2403.16188	link
2024-03-24	Semantic Is Enough: Only Semantic Information For NeRF Reconstruction	Ruibo Wang et.al.	2403.16043	null
2024-03-23	Adversarial Defense Teacher for Cross-Domain Object Detection under Poor Visibility Conditions	Kaiwen Wang et.al.	2403.15786	null
2024-03-25	Point-DETR3D: Leveraging Imagery Data with Spatial Point Prior for Weakly Semi-supervised 3D Object Detection	Hongzhi Gao et.al.	2403.15317	null
2024-03-22	CR3DT: Camera-RADAR Fusion for 3D Detection and Tracking	Nicolas Baumann et.al.	2403.15313	link
2024-03-22	IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection	Junbo Yin et.al.	2403.15241	link
2024-03-22	SFOD: Spiking Fusion Object Detector	Yimeng Fan et.al.	2403.15192	link
2024-03-22	CRPlace: Camera-Radar Fusion with BEV Representation for Place Recognition	Shaowei Fu et.al.	2403.15183	null
2024-03-22	An In-Depth Analysis of Data Reduction Methods for Sustainable Deep Learning	Víctor Toscano-Durán et.al.	2403.15150	link
2024-03-22	Gradient-based Sampling for Class Imbalanced Semi-supervised Object Detection	Jiaming Li et.al.	2403.15127	link
2024-03-22	VRSO: Visual-Centric Reconstruction for Static Object Annotation	Chenyao Yu et.al.	2403.15026	link
2024-03-21	Deep Active Learning: A Reality Check	Edrina Gashi et.al.	2403.14800	null
2024-03-21	Multi-Agent VQA: Exploring Multi-Agent Foundation Models in Zero-Shot Visual Question Answering	Bowen Jiang et.al.	2403.14783	link
2024-03-21	T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy	Qing Jiang et.al.	2403.14610	link
2024-03-21	UAV-Assisted Maritime Search and Rescue: A Holistic Approach	Martin Messmer et.al.	2403.14281	null
2024-03-21	Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection	Tim Salzmann et.al.	2403.14270	null
2024-03-21	3D Object Detection from Point Cloud via Voting Step Diffusion	Haoran Hou et.al.	2403.14133	null
2024-03-20	EcoSense: Energy-Efficient Intelligent Sensing for In-Shore Ship Detection through Edge-Cloud Collaboration	Wenjun Huang et.al.	2403.14027	null
2024-03-20	RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition	Ziyu Liu et.al.	2403.13805	link
2024-03-20	Bounding Box Stability against Feature Dropout Reflects Detector Generalization across Environments	Yang Yang et.al.	2403.13803	link
2024-03-20	Fostc3net: A Lightweight YOLOv5 Based On the Network Structure Optimization	Danqing Ma et.al.	2403.13703	null
2024-03-20	Find n' Propagate: Open-Vocabulary 3D Object Detection in Urban Environments	Djamahl Etchegaray et.al.	2403.13556	link
2024-03-20	MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining	Di Wang et.al.	2403.13430	link
2024-03-20	Few-shot Oriented Object Detection with Memorable Contrastive Learning in Remote Sensing Images	Jiawei Zhou et.al.	2403.13375	null
2024-03-20	DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception	Yibo Wang et.al.	2403.13304	null
2024-03-19	SceneScript: Reconstructing Scenes With An Autoregressive Structured Language Model	Armen Avetisyan et.al.	2403.13064	null
2024-03-19	TAPTR: Tracking Any Point with Transformers as Detection	Hongyang Li et.al.	2403.13042	null
2024-03-19	Wildfire danger prediction optimization with transfer learning	Spiros Maggioros et.al.	2403.12871	link
2024-03-19	As Firm As Their Foundations: Can open-sourced foundation models be used to create adversarial examples for downstream tasks?	Anjun Hu et.al.	2403.12693	null
2024-03-19	EAS-SNN: End-to-End Adaptive Sampling and Representation for Event-based Detection with Recurrent Spiking Neural Networks	Ziming Wang et.al.	2403.12574	null
2024-03-19	DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM	Yixuan Wu et.al.	2403.12488	link
2024-03-19	TransformMix: Learning Transformation and Mixing Strategies from Data	Tsz-Him Cheung et.al.	2403.12429	null
2024-03-19	VisionGPT: LLM-Assisted Real-Time Anomaly Detection for Safe Visual Navigation	Hao Wang et.al.	2403.12415	link
2024-03-19	Entity6K: A Large Open-Domain Evaluation Dataset for Real-World Entity Recognition	Jielin Qiu et.al.	2403.12339	null
2024-03-18	EffiPerception: an Efficient Framework for Various Perception Tasks	Xinhao Xiang et.al.	2403.12317	null
2024-03-18	Prototipo de un Contador Bidireccional Automático de Personas basado en sensores de visión 3D	Benjamín Ojeda-Magaña et.al.	2403.12310	null
2024-03-18	Align and Distill: Unifying and Improving Domain Adaptive Object Detection	Justin Kay et.al.	2403.12029	link
2024-03-18	TrajectoryNAS: A Neural Architecture Search for Trajectory Prediction	Ali Asghar Sharifi et.al.	2403.11695	null
2024-03-18	Just Add $100 More: Augmenting NeRF-based Pseudo-LiDAR Point Cloud for Resolving Class-imbalance Problem	Mincheol Chang et.al.	2403.11573	null
2024-03-18	R2SNet: Scalable Domain Adaptation for Object Detection in Cloud-Based Robots Ecosystems via Proposal Refinement	Michele Antonazzi et.al.	2403.11567	null
2024-03-18	Continual Forgetting for Pre-trained Vision Models	Hongbo Zhao et.al.	2403.11530	link
2024-03-17	V2X-DGW: Domain Generalization for Multi-agent Perception under Adverse Weather Conditions	Baolu Li et.al.	2403.11371	null
2024-03-17	Advanced Knowledge Extraction of Physical Design Drawings, Translation and conversion to CAD formats using Deep Learning	Jesher Joshua M et.al.	2403.11291	null
2024-03-17	ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models	Siyuan Huang et.al.	2403.11289	link
2024-03-19	CPA-Enhancer: Chain-of-Thought Prompted Adaptive Enhancer for Object Detection under Unknown Degradations	Yuwei Zhang et.al.	2403.11220	link
2024-03-19	GRA: Detecting Oriented Objects through Group-wise Rotating and Attention	Jiangshan Wang et.al.	2403.11127	null
2024-03-17	Self-supervised co-salient object detection via feature correspondence at multiple scales	Souradeep Chakraborty et.al.	2403.11107	link
2024-03-15	SimPB: A Single Model for 2D and 3D Object Detection from Multiple Cameras	Yingqi Tang et.al.	2403.10353	link
2024-03-15	Generative Region-Language Pretraining for Open-Ended Object Detection	Chuang Lin et.al.	2403.10191	link
2024-03-15	A Hybrid SNN-ANN Network for Event-based Object Detection with Spatial and Temporal Attention	Soikat Hasan Ahmed et.al.	2403.10173	null
2024-03-15	CSDNet: Detect Salient Object in Depth-Thermal via A Lightweight Cross Shallow and Deep Perception Network	Xiaotong Yu et.al.	2403.10104	null
2024-03-15	SparseFusion: Efficient Sparse Multi-Modal Fusion Framework for Long-Range 3D Perception	Yiheng Li et.al.	2403.10036	null
2024-03-14	Attention-based Class-Conditioned Alignment for Multi-Source Domain Adaptive Object Detection	Atif Belal et.al.	2403.09918	link
2024-03-14	Open-Vocabulary Object Detection with Meta Prompt Representation and Instance Contrastive Optimization	Zhao Wang et.al.	2403.09433	null
2024-03-14	D3T: Distinctive Dual-Domain Teacher Zigzagging Across RGB-Thermal Gap for Domain-Adaptive Object Detection	Dinh Phat Do et.al.	2403.09359	link
2024-03-14	Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring	Yufei Zhan et.al.	2403.09333	link
2024-03-14	EfficientMFD: Towards More Efficient Multimodal Synchronous Fusion Detection	Jiaqing Zhang et.al.	2403.09323	link
2024-03-14	Knowledge Distillation in YOLOX-ViT for Side-Scan Sonar Object Detection	Martin Aubard et.al.	2403.09313	link
2024-03-14	MOTPose: Multi-object 6D Pose Estimation for Dynamic Video Sequences using Attention-based Temporal Fusion	Arul Selvam Periyasamy et.al.	2403.09309	null
2024-03-14	CLIP-EBC: CLIP Can Count Accurately through Enhanced Blockwise Classification	Yiming Ma et.al.	2403.09281	link
2024-03-14	D-YOLO a robust framework for object detection in adverse weather conditions	Zihan Chu et.al.	2403.09233	null
2024-03-14	Improving Distant 3D Object Detection Using 2D Box Supervision	Zetong Yang et.al.	2403.09230	null
2024-03-14	PoIFusion: Multi-Modal 3D Object Detection via Fusion at Points of Interest	Jiajun Deng et.al.	2403.09212	null
2024-03-13	MIM4D: Masked Modeling with Multi-View Video for Autonomous Driving Representation Learning	Jialv Zou et.al.	2403.08760	link
2024-03-13	PRAGO: Differentiable Multi-View Pose Optimization From Objectness Detections	Matteo Taiana et.al.	2403.08586	null
2024-03-13	A Multimodal Fusion Network For Student Emotion Recognition Based on Transformer and Tensor Product	Ao Xiang et.al.	2403.08511	null
2024-03-13	Improved YOLOv5 Based on Attention Mechanism and FasterNet for Foreign Object Detection on Railway and Airway tracks	Zongqing Qi et.al.	2403.08499	null
2024-03-13	IAMCV Multi-Scenario Vehicle Interaction Dataset	Novel Certad et.al.	2403.08455	null
2024-03-13	Advancing Security in AI Systems: A Novel Approach to Detecting Backdoors in Deep Neural Networks	Khondoker Murad Hossain et.al.	2403.08208	null
2024-03-12	TaskCLIP: Extend Large Vision-Language Model for Task Oriented Object Detection	Hanning Chen et.al.	2403.08108	null
2024-03-12	Aedes aegypti Egg Counting with Neural Networks for Object Detection	Micheli Nayara de Oliveira Vicente et.al.	2403.08016	null
2024-03-12	Mondrian: On-Device High-Performance Video Analytics with Compressive Packed Inference	Changmin Jeon et.al.	2403.07598	null
2024-03-12	PeLK: Parameter-efficient Large Kernel ConvNets with Peripheral Convolution	Honghao Chen et.al.	2403.07589	null
2024-03-12	A Survey of Vision Transformers in Autonomous Driving: Current Trends and Future Directions	Quoc-Vinh Lai-Dang et.al.	2403.07542	null
2024-03-12	JSTR: Joint Spatio-Temporal Reasoning for Event-based Moving Object Detection	Hanyu Zhou et.al.	2403.07436	null
2024-03-12	Eliminating Cross-modal Conflicts in BEV Space for LiDAR-Camera 3D Object Detection	Jiahui Fu et.al.	2403.07372	null
2024-03-12	SparseLIF: High-Performance Sparse LiDAR-Camera Fusion for 3D Object Detection	Hongcheng Zhang et.al.	2403.07284	null
2024-03-12	Adaptive Bounding Box Uncertainties via Two-Step Conformal Prediction	Alexander Timans et.al.	2403.07263	link
2024-03-11	Class Imbalance in Object Detection: An Experimental Diagnosis and Study of Mitigation Strategies	Nieves Crasto et.al.	2403.07113	link
2024-03-11	LISO: Lidar-only Self-Supervised 3D Object Detection	Stefan Baur et.al.	2403.07071	null
2024-03-11	Real-time Transformer-based Open-Vocabulary Detection with Efficient Fusion Head	Tiancheng Zhao et.al.	2403.06892	link
2024-03-11	LeOCLR: Leveraging Original Images for Contrastive Learning of Visual Representations	Mohammad Alkhalefi et.al.	2403.06813	null
2024-03-11	Genetic Learning for Designing Sim-to-Real Data Augmentations	Bram Vanherle et.al.	2403.06786	link
2024-03-11	Evaluating the Energy Efficiency of Few-Shot Learning for Object Detection in Industrial Settings	Georgios Tsoumplekas et.al.	2403.06631	null
2024-03-11	Cross-domain and Cross-dimension Learning for Image-to-Graph Transformers	Alexander H. Berger et.al.	2403.06601	null
2024-03-11	SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection	Yuxuan Li et.al.	2403.06534	link
2024-03-11	3D Semantic Segmentation-Driven Representations for 3D Object Detection	Hayeon O et.al.	2403.06501	link
2024-03-11	Fine-Grained Pillar Feature Encoding Via Spatio-Temporal Virtual Grid for 3D Object Detection	Konyul Park et.al.	2403.06433	link
2024-03-10	Transformer based Multitask Learning for Image Captioning and Object Detection	Debolena Basak et.al.	2403.06292	null
2024-03-10	Poly Kernel Inception Network for Remote Sensing Detection	Xinhao Cai et.al.	2403.06258	link
2024-03-08	SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised Learning for Robust Infrared Small Target Detection	Yahao Lu et.al.	2403.05416	link
2024-03-08	Exploring Robust Features for Few-Shot Object Detection in Satellite Imagery	Xavier Bou et.al.	2403.05381	link
2024-03-08	Frequency-Adaptive Dilated Convolution for Semantic Segmentation	Linwei Chen et.al.	2403.05369	link
2024-03-08	VLM-PL: Advanced Pseudo Labeling approach Class Incremental Object Detection with Vision-Language Model	Junsu Kim et.al.	2403.05346	null
2024-03-08	Improving the Successful Robotic Grasp Detection Using Convolutional Neural Networks	Hamed Hosseini et.al.	2403.05211	null
2024-03-08	LanePtrNet: Revisiting Lane Detection as Point Voting and Grouping on Curves	Jiayan Cao et.al.	2403.05155	null
2024-03-08	RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features	Geonho Bang et.al.	2403.05061	null
2024-03-08	ActFormer: Scalable Collaborative Perception via Active Queries	Suozhi Huang et.al.	2403.04968	null
2024-03-07	FriendNet: Detection-Friendly Dehazing Network	Yihua Fan et.al.	2403.04443	link
2024-03-07	Effectiveness Assessment of Recent Large Vision-Language Models	Yao Jiang et.al.	2403.04306	null
2024-03-07	ACC-ViT: Atrous Convolution's Comeback in Vision Transformers	Nabil Ibtehaz et.al.	2403.04200	null
2024-03-07	CN-RMA: Combined Network with Ray Marching Aggregation for 3D Indoors Object Detection from Multi-view Images	Guanlin Shen et.al.	2403.04198	link
2024-03-07	Scalable and Robust Transformer Decoders for Interpretable Image Classification with Foundation Models	Evelyn Mannix et.al.	2403.04125	null
2024-03-07	CMDA: Cross-Modal and Domain Adversarial Adaptation for LiDAR-Based 3D Object Detection	Gyusam Chang et.al.	2403.03721	null
2024-03-06	Adversarial Infrared Geometry: Using Geometry to Perform Adversarial Attack against Infrared Pedestrian Detectors	Kalibinuer Tiliwalidi et.al.	2403.03674	null
2024-03-06	Multi-task Learning for Real-time Autonomous Driving Leveraging Task-adaptive Attention Generator	Wonhyeok Choi et.al.	2403.03468	null
2024-03-06	FLAME Diffuser: Grounded Wildfire Image Synthesis using Mask Guided Diffusion	Hao Wang et.al.	2403.03463	null
2024-03-06	Performance Evaluation of Semi-supervised Learning Frameworks for Multi-Class Weed Detection	Jiajia Li et.al.	2403.03390	link
2024-03-05	Detecting Concrete Visual Tokens for Multimodal Machine Translation	Braeden Bowen et.al.	2403.03075	null
2024-03-05	Loss Design for Single-carrier Joint Communication and Neural Network-based Sensing	Charlotte Muth et.al.	2403.02929	null
2024-03-05	Are Dense Labels Always Necessary for 3D Object Detection from Point Cloud?	Chenqiang Gao et.al.	2403.02818	null
2024-03-05	Bootstrapping Rare Object Detection in High-Resolution Satellite Imagery	Akram Zaytar et.al.	2403.02736	null
2024-03-05	FastOcc: Accelerating 3D Occupancy Prediction by Fusing the 2D Bird's-Eye View and Perspective View	Jiawei Hou et.al.	2403.02710	null
2024-03-05	False Positive Sampling-based Data Augmentation for Enhanced 3D Object Detection Accuracy	Jiyong Oh et.al.	2403.02639	null
2024-03-05	BSDP: Brain-inspired Streaming Dual-level Perturbations for Online Open World Object Detection	Yu Chen et.al.	2403.02637	null
2024-03-04	NiNformer: A Network in Network Transformer with Token Mixing Generated Gating Function	Abdullah Nazhat Abdullah et.al.	2403.02411	link
2024-03-04	COMMIT: Certifying Robustness of Multi-Sensor Fusion Systems against Semantic Attacks	Zijian Huang et.al.	2403.02329	null
2024-03-04	Scalable Vision-Based 3D Object Detection and Monocular Depth Estimation for Autonomous Driving	Yuxuan Liu et.al.	2403.02037	link
2024-03-02	TUMTraf V2X Cooperative Perception Dataset	Walter Zimmer et.al.	2403.01316	link
2024-03-02	Run-time Introspection of 2D Object Detection in Automated Driving Systems Using Learning Representations	Hakan Yekta Yatbaz et.al.	2403.01172	null
2024-03-02	ELA: Efficient Local Attention for Deep Convolutional Neural Networks	Wei Xu et.al.	2403.01123	null
2024-03-02	Beyond Night Visibility: Adaptive Multi-Scale Fusion of Infrared and Visible Images	Shufan Pei et.al.	2403.01083	null
2024-03-01	Learning Causal Features for Incremental Object Detection	Zhenwei He et.al.	2403.00591	null
2024-03-01	Abductive Ego-View Accident Video Understanding for Safe Driving Perception	Jianwu Fang et.al.	2403.00436	null
2024-03-04	DAMS-DETR: Dynamic Adaptive Multispectral Detection Transformer with Competitive Query Selection and Adaptive Feature Fusion	Junjie Guo et.al.	2403.00326	link
2024-03-01	YOLO-MED: Multi-Task Interaction Network for Biomedical Images	Suizhi Huang et.al.	2403.00245	null
2024-02-29	FusionVision: A comprehensive approach of 3D object reconstruction and segmentation from RGB-D cameras using YOLO and fast segment anything	Safouane El Ghazouali et.al.	2403.00175	link
2024-02-29	LLMs in Political Science: Heralding a New Era of Visual Analysis	Yu Wang et.al.	2403.00154	null
2024-02-29	SeMoLi: What Moves Together Belongs Together	Jenny Seidenschwarz et.al.	2402.19463	null
2024-02-29	Genie: Smart ROS-based Caching for Connected Autonomous Robots	Zexin Li et.al.	2402.19410	null
2024-02-29	ProtoP-OD: Explainable Object Detection with Prototypical Parts	Pavlos Rath-Manakidis et.al.	2402.19142	null
2024-02-29	Theoretically Achieving Continuous Representation of Oriented Bounding Boxes	Zikai Xiao et.al.	2402.18975	link
2024-02-29	Boosting Semi-Supervised Object Detection in Remote Sensing Images With Active Teaching	Boxuan Zhang et.al.	2402.18958	null
2024-02-29	Edge Computing Enabled Real-Time Video Analysis via Adaptive Spatial-Temporal Semantic Filtering	Xiang Chen et.al.	2402.18927	null
2024-02-29	A Simple yet Effective Network based on Vision Transformer for Camouflaged Object and Salient Object Detection	Chao Hao et.al.	2402.18922	null
2024-02-29	Privacy-Preserving Autoencoder for Collaborative Object Detection	Bardia Azizian et.al.	2402.18864	null
2024-02-29	Debiased Novel Category Discovering and Localization	Juexiao Feng et.al.	2402.18821	null
2024-02-28	Spatial Coherence Loss for Salient and Camouflaged Object Detection and Beyond	Ziyun Yang et.al.	2402.18698	null
2024-02-28	UniMODE: Unified Monocular 3D Object Detection	Zhuoling Li et.al.	2402.18573	null
2024-02-28	Detection of Micromobility Vehicles in Urban Traffic Videos	Khalil Sabri et.al.	2402.18503	link
2024-02-28	Sunshine to Rainstorm: Cross-Weather Knowledge Distillation for Robust 3D Object Detection	Xun Huang et.al.	2402.18493	null
2024-02-28	Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization	Deng Li et.al.	2402.18447	null
2024-02-28	Unveiling novel insights into Kirchhoff migration for effective object detection using experimental Fresnel dataset	Won-Kwang Park et.al.	2402.18322	null
2024-02-28	Zero-Shot Aerial Object Detection with Visual Description Regularization	Zhengqing Zang et.al.	2402.18233	null
2024-02-28	VulMCI: Code Splicing-based Pixel-row Oversampling for More Continuous Vulnerability Image Generation	Tao Peng et.al.	2402.18189	link
2024-02-27	SDDGR: Stable Diffusion-based Deep Generative Replay for Class Incremental Object Detection	Junsu Kim et.al.	2402.17323	null
2024-02-27	A Vanilla Multi-Task Framework for Dense Visual Prediction Solution to 1st VCL Challenge -- Multi-Task Robustness Track	Zehui Chen et.al.	2402.17319	null
2024-02-27	Probing Multimodal Large Language Models for Global and Local Semantic Representation	Mingxu Tao et.al.	2402.17304	link
2024-02-27	Deployment Prior Injection for Run-time Calibratable Object Detection	Mo Zhou et.al.	2402.17207	null
2024-02-26	A NIRCam-dark galaxy detected with the MIRI/F1000W filter in the MIDIS/JADES Hubble Ultra Deep Field	Pablo G. Pérez-González et.al.	2402.16942	null
2024-02-26	DEYO: DETR with YOLO for End-to-End Object Detection	Haodong Ouyang et.al.	2402.16370	link
2024-02-26	mAPm: multi-scale Attention Pyramid module for Enhanced scale-variation in RLD detection	Yunusa Haruna et.al.	2402.16291	null
2024-02-26	Real-Time Vehicle Detection and Urban Traffic Behavior Analysis Based on UAV Traffic Videos on Mobile Devices	Yuan Zhu et.al.	2402.16246	null
2024-02-25	Semi-supervised Open-World Object Detection	Sahal Shaji Mullappilly et.al.	2402.16013	link
2024-02-24	MMW-Carry: Enhancing Carry Object Detection through Millimeter-Wave Radar-Camera Fusion	Xiangyu Gao et.al.	2402.15897	null
2024-02-23	A Comprehensive Survey of Convolutions in Deep Learning: Applications, Challenges, and Future Trends	Abolfazl Younesi et.al.	2402.15490	null
2024-02-23	A Universal Method for Solar Filament Detection from H-alpha Observations using Semi-supervised Deep Learning	Andrea Diercke et.al.	2402.15407	null
2024-02-23	EMIFF: Enhanced Multi-scale Image Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection	Zhe Wang et.al.	2402.15272	link
2024-02-22	WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition	Lianghui Zhu et.al.	2402.14812	link
2024-02-22	High-Speed Detector For Low-Powered Devices In Aerial Grasping	Ashish Kumar et.al.	2402.14591	null
2024-02-22	S^2Former-OR: Single-Stage Bimodal Transformer for Scene Graph Generation in OR	Jialun Pei et.al.	2402.14461	null
2024-02-22	YOLO-TLA: An Efficient and Lightweight Small Object Detection Model based on YOLOv5	Peng Gao et.al.	2402.14309	null
2024-02-21	YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information	Chien-Yao Wang et.al.	2402.13616	link
2024-02-21	TransGOP: Transformer-Based Gaze Object Prediction	Binglu Wang et.al.	2402.13578	link
2024-02-21	Unsupervised learning based object detection using Contrastive Learning	Chandan Kumar et.al.	2402.13465	null
2024-02-20	Combining unsupervised and supervised learning in microscopy enables defect analysis of a full 4H-SiC wafer	Binh Duong Nguyen et.al.	2402.13353	null
2024-02-20	GOOD: Towards Domain Generalized Orientated Object Detection	Qi Bi et.al.	2402.12765	null
2024-02-20	CST: Calibration Side-Tuning for Parameter and Memory Efficient Transfer Learning	Feng Chen et.al.	2402.12736	null
2024-02-20	YOLO-Ant: A Lightweight Detector via Depthwise Separable Convolutional and Large Kernel Design for Antenna Interference Source Detection	Xiaoyu Tang et.al.	2402.12641	link
2024-02-20	Efficient Parameter Mining and Freezing for Continual Object Detection	Angelo G. Menezes et.al.	2402.12624	null
2024-02-19	LangXAI: Integrating Large Vision Models for Generating Textual Explanations to Enhance Explainability in Visual Perception Tasks	Truong Thanh Hung Nguyen et.al.	2402.12525	link
2024-02-19	UncertaintyTrack: Exploiting Detection and Localization Uncertainty in Multi-Object Tracking	Chang Won Lee et.al.	2402.12303	link
2024-02-19	Weakly Supervised Object Detection in Chest X-Rays with Differentiable ROI Proposal Networks and Soft ROI Pooling	Philip Müller et.al.	2402.11985	link
2024-02-19	SDGE: Stereo Guided Depth Estimation for 360° Camera Sets	Jialei Xu et.al.	2402.11791	null
2024-02-19	Reinforcement Learning as a Parsimonious Alternative to Prediction Cascades: A Case Study on Image Segmentation	Bharat Srikishan et.al.	2402.11760	link
2024-02-18	LiRaFusion: Deep Adaptive LiDAR-Radar Fusion for 3D Object Detection	Jingyu Song et.al.	2402.11735	link
2024-02-18	MultiCorrupt: A Multi-Modal Robustness Dataset and Benchmark of LiDAR-Camera Fusion for 3D Object Detection	Till Beemelmanns et.al.	2402.11677	link
2024-02-18	VoltSchemer: Use Voltage Noise to Manipulate Your Wireless Charger	Zihao Zhan et.al.	2402.11423	null
2024-02-18	A Multispectral Automated Transfer Technique (MATT) for machine-driven image labeling utilizing the Segment Anything Model (SAM)	James E. Gallagher et.al.	2402.11413	null
2024-02-17	GraphKD: Exploring Knowledge Distillation Towards Document Object Detection with Structured Graph Creation	Ayan Banerjee et.al.	2402.11401	link
2024-02-17	ReViT: Enhancing Vision Transformers with Attention Residual Connections for Visual Recognition	Anxhelo Diko et.al.	2402.11301	link
2024-02-16	AutoGPT+P: Affordance-based Task Planning with Large Language Models	Timo Birr et.al.	2402.10778	null
2024-02-16	STF: Spatio-Temporal Fusion Module for Improving Video Object Detection	Noreen Anwar et.al.	2402.10752	link
2024-02-16	CodaMal: Contrastive Domain Adaptation for Malaria Detection in Low-Cost Microscopes	Ishan Rajendrakumar Dave et.al.	2402.10478	link
2024-02-15	LLMs as Bridges: Reformulating Grounded Multimodal Named Entity Recognition	Jinyuan Li et.al.	2402.09989	link
2024-02-15	A Comprehensive Review on Computer Vision Analysis of Aerial Data	Vivek Tetarwal et.al.	2402.09781	null
2024-02-14	Few-Shot Object Detection with Sparse Context Transformers	Jie Mei et.al.	2402.09315	null
2024-02-14	TDViT: Temporal Dilated Video Transformer for Dense Video Tasks	Guanxiong Sun et.al.	2402.09257	link
2024-02-14	Efficient One-stage Video Object Detection by Exploiting Temporal Consistency	Guanxiong Sun et.al.	2402.09241	link
2024-02-14	Switch EMA: A Free Lunch for Better Flatness and Sharpness	Siyuan Li et.al.	2402.09240	link
2024-02-13	Leveraging Self-Supervised Instance Contrastive Learning for Radar Object Detection	Colin Decourt et.al.	2402.08427	null
2024-02-13	Improving Image Coding for Machines through Optimizing Encoder via Auxiliary Loss	Kei Iino et.al.	2402.08267	null
2024-02-13	Object Detection in Thermal Images Using Deep Learning for Unmanned Aerial Vehicles	Minh Dang Tu et.al.	2402.08251	null
2024-02-12	MODIPHY: Multimodal Obscured Detection for IoT using PHantom Convolution-Enabled Faster YOLO	Shubhabrata Mukherjee et.al.	2402.07894	link
2024-02-12	Evaluation of a Smart Mobile Robotic System for Industrial Plant Inspection and Supervision	Georg K. J. Fischer et.al.	2402.07691	null
2024-02-12	AYDIV: Adaptable Yielding 3D Object Detection via Integrated Contextual Vision Transformer	Tanmoy Dam et.al.	2402.07680	link
2024-02-12	A Flow-based Credibility Metric for Safety-critical Pedestrian Detection	Maria Lyssenko et.al.	2402.07642	null
2024-02-12	Context-aware Multi-Model Object Detection for Diversely Heterogeneous Compute Systems	Justin Davis et.al.	2402.07415	null
2024-02-10	Domain Adaptable Fine-Tune Distillation Framework For Advancing Farm Surveillance	Raza Imam et.al.	2402.07059	link
2024-02-10	Semantic Object-level Modeling for Robust Visual Camera Relocalization	Yifan Zhu et.al.	2402.06951	null
2024-02-09	Neural Rendering based Urban Scene Reconstruction for Autonomous Driving	Shihao Shen et.al.	2402.06826	null
2024-02-09	Event-to-Video Conversion for Overhead Object Detection	Darryl Hannan et.al.	2402.06805	null
2024-02-09	Transfer learning with generative models for object detection on limited datasets	Matteo Paiano et.al.	2402.06784	null
2024-02-09	SWITCH: An Exemplar for Evaluating Self-Adaptive ML-Enabled Systems	Arya Marda et.al.	2402.06351	link
2024-02-08	A versatile robotic hand with 3D perception, force sensing for autonomous manipulation	Nikolaus Correll et.al.	2402.06018	link
2024-02-08	InstaGen: Enhancing Object Detection by Training on Synthetic Dataset	Chengjian Feng et.al.	2402.05937	null
2024-02-08	YOLO-CIANNA: Galaxy detection with deep learning in radio data. I. A new YOLO-inspired source detection method applied to the SKAO SDC1	D. Cornu et.al.	2402.05925	link
2024-02-08	Using YOLO v7 to Detect Kidney in Magnetic Resonance Imaging: A Supervised Contrastive Learning	Pouria Yazdian Anari et.al.	2402.05817	null
2024-02-08	Scrapping The Web For Early Wildfire Detection	Mateo Lostanlen et.al.	2402.05349	null
2024-02-07	Toward Accurate Camera-based 3D Object Detection via Cascade Depth Estimation and Calibration	Chaoqun Wang et.al.	2402.04883	null
2024-02-07	STAR: Shape-focused Texture Agnostic Representations for Improved Object Detection and 6D Pose Estimation	Peter Hönig et.al.	2402.04878	link
2024-02-07	Streamlined Hybrid Annotation Framework using Scalable Codestream for Bandwidth-Restricted UAV Object Detection	Karim El Khoury et.al.	2402.04673	null
2024-02-07	G-NAS: Generalizable Neural Architecture Search for Single Domain Generalization Object Detection	Fan Wu et.al.	2402.04672	link
2024-02-07	LLMs Meet VLMs: Boost Open Vocabulary Object Detection with Fine-grained Descriptors	Sheng Jin et.al.	2402.04630	null
2024-02-07	FM-Fusion: Instance-aware Semantic Mapping Boosted by Vision-Language Foundation Models	Chuhao Liu et.al.	2402.04555	null
2024-02-06	Breaking Data Silos: Cross-Domain Learning for Multi-Agent Perception from Independent Private Sources	Jinlong Li et.al.	2402.04273	link
2024-02-06	Acceleration and energy consumption optimization in cascading classifiers for face detection on low-cost ARM big.LITTLE asymmetric architectures	Alberto Corpas et.al.	2402.04090	null
2024-02-06	YOLOPoint Joint Keypoint and Object Detection	Anton Backhaus et.al.	2402.03989	link
2024-02-06	Enhancing Embodied Object Detection through Language-Image Pre-training and Implicit Object Memory	Nicolas Harvey Chapman et.al.	2402.03721	null
2024-02-06	Online Informative Sampling using Semantic Features in Underwater Environments	Shrutika Vishal Thengane et.al.	2402.03636	null
2024-02-06	BEAM: Beta Distribution Ray Denoising for Multi-view 3D Object Detection	Feng Liu et.al.	2402.03634	link
2024-02-05	Stitching the Spectrum: Semantic Spectrum Segmentation with Wideband Signal	Daniel Uvaydov et.al.	2402.03465	link
2024-02-05	HASSOD: Hierarchical Adaptive Self-Supervised Object Detection	Shengcao Cao et.al.	2402.03311	link
2024-02-05	ActiveAnno3D -- An Active Learning Framework for Multi-Modal 3D Object Detection	Ahmed Ghita et.al.	2402.03235	null
2024-02-05	Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector	Yuqian Fu et.al.	2402.03094	link
2024-02-05	Improving Robustness of LiDAR-Camera Fusion Model against Weather Corruption from Fusion Strategy Perspective	Yihao Huang et.al.	2402.02738	null
2024-02-04	Spatio-temporal Prompting Network for Robust Video Feature Extraction	Guanxiong Sun et.al.	2402.02574	link
2024-02-04	Gazebo Plants: Simulating Plant-Robot Interaction with Cosserat Rods	Junchen Deng et.al.	2402.02570	null
2024-02-04	DeSparsify: Adversarial Attack Against Token Sparsification Mechanisms in Vision Transformers	Oryan Yehezkel et.al.	2402.02554	null
2024-02-03	$\textit{A Contrario}$ Paradigm for YOLO-based Infrared Small Target Detection	Alina Ciocarlan et.al.	2402.02288	null
2024-02-03	CoFiNet: Unveiling Camouflaged Objects with Multi-Scale Finesse	Cunhan Guo et.al.	2402.02217	null
2024-02-03	Decomposition-based and Interference Perception for Infrared and Visible Image Fusion in Complex Scenes	Xilai Li et.al.	2402.02096	null
2024-02-02	Dynamic Occupancy Grids for Object Detection: A Radar-Centric Approach	Max Peter Ronecker et.al.	2402.01488	null
2024-02-02	Phrase Grounding-based Style Transfer for Single-Domain Generalized Object Detection	Hao Li et.al.	2402.01304	null
2024-02-02	Spiking CenterNet: A Distillation-boosted Spiking Neural Network for Object Detection	Lennard Bodden et.al.	2402.01287	null
2024-02-02	TSJNet: A Multi-modality Target and Semantic Awareness Joint-driven Image Fusion Network	Yuchan Jie et.al.	2402.01212	null
2024-02-02	A Survey for Foundation Models in Autonomous Driving	Haoxiang Gao et.al.	2402.01105	null
2024-02-01	Semantic-Aware and Goal-Oriented Communications for Object Detection in Wireless End-to-End Image Transmission	Fatemeh Zahra Safaeipour et.al.	2402.01064	null
2024-02-01	Vehicle Perception from Satellite	Bin Zhao et.al.	2402.00703	link
2024-02-01	A Manifold Representation of the Key in Vision Transformers	Li Meng et.al.	2402.00534	null
2024-02-01	Night-Rider: Nocturnal Vision-aided Localization in Streetlight Maps Using Invariant Extended Kalman Filtering	Tianxiao Gao et.al.	2402.00330	link
2024-02-01	FineBio: A Fine-Grained Video Dataset of Biological Experiments with Hierarchical Annotation	Takuma Yagi et.al.	2402.00293	null
2024-01-31	Capacity Constraint Analysis Using Object Detection for Smart Manufacturing	Hafiz Mughees Ahmad et.al.	2402.00243	null
2024-01-31	Improving Object Detection Quality in Football Through Super-Resolution Techniques	Karolina Seweryn et.al.	2402.00163	null
2024-01-31	Real-time Traffic Object Detection for Autonomous Driving	Abdul Hannan Khan et.al.	2402.00128	null
2024-01-31	Enhancing Multimodal Large Language Models with Vision Detection Models: An Empirical Study	Qirui Jiao et.al.	2401.17981	null
2024-01-31	MelNet: A Real-Time Deep Learning Algorithm for Object Detection	Yashar Azadvatan et.al.	2401.17972	null
2024-01-31	Source-free Domain Adaptive Object Detection in Remote Sensing Images	Weixing Liu et.al.	2401.17916	null
2024-01-31	SubPipe: A Submarine Pipeline Inspection Dataset for Segmentation and Visual-inertial Localization	Olaya Álvarez-Tuñón et.al.	2401.17907	link
2024-01-31	Do Object Detection Localization Errors Affect Human Performance and Trust?	Sven de Witte et.al.	2401.17821	null
2024-01-31	Haris: an Advanced Autonomous Mobile Robot for Smart Parking Assistance	Layth Hamad et.al.	2401.17741	null
2024-01-30	AdvGPS: Adversarial GPS for Multi-Agent Perception Attack	Jinlong Li et.al.	2401.17499	link
2024-01-30	YOLO-World: Real-Time Open-Vocabulary Object Detection	Tianheng Cheng et.al.	2401.17270	link
2024-01-30	A Bearing-Angle Approach for Unknown Target Motion Analysis Based on Visual Measurements	Zian Ning et.al.	2401.17117	null
2024-01-30	LF Tracy: A Unified Single-Pipeline Approach for Salient Object Detection in Light Field Cameras	Fei Teng et.al.	2401.16712	link
2024-01-30	Characterization of Magnetic Labyrinthine Structures through Junctions and Terminals Detection using Template Matching and CNN	Vinícius Yu Okubo et.al.	2401.16688	null
2024-01-30	The Why, When, and How to Use Active Learning in Large-Data-Driven 3D Object Detection for Safe Autonomous Driving: An Empirical Exploration	Ross Greer et.al.	2401.16634	null
2024-01-29	SHViT: Single-Head Vision Transformer with Memory Efficient Macro Design	Seokju Yun et.al.	2401.16456	link
2024-01-29	Computer Vision for Primate Behavior Analysis in the Wild	Richard Vogg et.al.	2401.16424	null
2024-01-29	MixSup: Mixed-grained Supervision for Label-efficient LiDAR-based 3D Object Detection	Yuxue Yang et.al.	2401.16305	link
2024-01-29	Towards Scenario Generalization for Vision-based Roadside 3D Object Detection	Lei Yang et.al.	2401.16110	link
2024-01-29	Rectify the Regression Bias in Long-Tailed Object Detection	Ke Zhu et.al.	2401.15885	null
2024-01-29	LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection	Sifan Zhou et.al.	2401.15865	link
2024-01-29	LCVO: An Efficient Pretraining-Free Framework for Visual Question Answering Grounding	Yuhan Chen et.al.	2401.15842	null
2024-01-28	Real-time object detection and robotic manipulation for agriculture using a YOLO-based learning approach	Hongyu Zhao et.al.	2401.15785	null
2024-01-27	New Foggy Object Detecting Model	Rahul Banavathu et.al.	2401.15455	null
2024-01-27	You Only Look Bottom-Up for Monocular 3D Object Detection	Kaixin Xiong et.al.	2401.15319	null
2024-01-26	pLitterStreet: Street Level Plastic Litter Detection and Mapping	Sriram Reddy Mandhati et.al.	2401.14719	link
2024-01-26	From Blurry to Brilliant Detection: YOLOv5-Based Aerial Object Detection with Super Resolution	Ragib Amin Nihal et.al.	2401.14661	null
2024-01-25	UrbanGenAI: Reconstructing Urban Landscapes using Panoptic Segmentation and Diffusion Models	Timo Kapsalis et.al.	2401.14379	null
2024-01-25	MultiTest: Physical-Aware Object Insertion for Testing Multi-sensor Fusion Perception Systems	Xinyu Gao et.al.	2401.14314	null
2024-01-25	Knowledge Graph Driven UAV Cognitive Semantic Communication Systems for Efficient Object Detection	Xi Song et.al.	2401.13995	null
2024-01-24	PLATE: A perception-latency aware estimator	Rodrigo Aldana-López et.al.	2401.13596	null
2024-01-24	Deep Learning for Improved Polyp Detection from Synthetic Narrow-Band Imaging	Mathias Ramm Haugland et.al.	2401.13315	null
2024-01-24	AMANet: Advancing SAR Ship Detection with Adaptive Multi-Hierarchical Attention Network	Xiaolin Ma et.al.	2401.13214	null
2024-01-23	Enhancing Object Detection Performance for Small Objects through Synthetic Data Generation and Proportional Class-Balancing Technique: A Comparative Study in Industrial Scenarios	Jibinraj Antony et.al.	2401.12729	null
2024-01-23	Pragmatic Communication in Multi-Agent Collaborative Perception	Yue Hu et.al.	2401.12694	null
2024-01-23	Small Language Model Meets with Reinforced Vision Vocabulary	Haoran Wei et.al.	2401.12503	null
2024-01-23	Self-supervised Learning of LiDAR 3D Point Clouds via 2D-3D Neural Calibration	Yifan Zhang et.al.	2401.12452	null
2024-01-22	OK-Robot: What Really Matters in Integrating Open-Knowledge Models for Robotics	Peiqi Liu et.al.	2401.12202	link
2024-01-22	Out-of-Distribution Detection & Applications With Ablated Learned Temperature Energy	Will LeVine et.al.	2401.12129	link
2024-01-22	A Saliency Enhanced Feature Fusion based multiscale RGB-D Salient Object Detection Network	Rui Huang et.al.	2401.11914	null
2024-01-22	Large receptive field strategy and important feature extraction strategy in 3D object detection	Leichao Cui et.al.	2401.11913	null
2024-01-22	Detect-Order-Construct: A Tree Construction based Approach for Hierarchical Document Structure Analysis	Jiawei Wang et.al.	2401.11874	link
2024-01-22	Rethinking Centered Kernel Alignment in Knowledge Distillation	Zikai Zhou et.al.	2401.11824	link

(back to top)

Keypoint Detection

Publish Date	Title	Authors	PDF	Code
2024-08-15	Towards Practical Human Motion Prediction with LiDAR Point Clouds	Xiao Han et.al.	2408.08202	null
2024-07-31	Certifying Robustness of Learning-Based Keypoint Detection and Pose Estimation Methods	Xusheng Luo et.al.	2408.00117	null
2024-07-26	SHIC: Shape-Image Correspondences with no Keypoint Supervision	Aleksandar Shtedritski et.al.	2407.18907	null
2024-07-25	LION: Linear Group RNN for 3D Object Detection in Point Clouds	Zhe Liu et.al.	2407.18232	link
2024-07-22	RADA: Robust and Accurate Feature Learning with Domain Adaptation	Jingtai He et.al.	2407.15791	null
2024-07-09	LVLM-empowered Multi-modal Representation Learning for Visual Place Recognition	Teng Wang et.al.	2407.06730	null
2024-07-04	PFGS: High Fidelity Point Cloud Rendering via Feature Splatting	Jiaxu Wang et.al.	2407.03857	link
2024-07-03	A Radiometric Correction based Optical Modeling Approach to Removing Reflection Noise in TLS Point Clouds of Urban Scenes	Li Fang et.al.	2407.02830	link
2024-07-02	Multi-Grained Contrast for Data-Efficient Unsupervised Representation Learning	Chengchao Shen et.al.	2407.02014	link
2024-06-28	Beyond First-Order: A Multi-Scale Approach to Finger Knuckle Print Biometrics	Chengrui Gao et.al.	2406.19672	null
2024-07-23	A Certifiable Algorithm for Simultaneous Shape Estimation and Object Tracking	Lorenzo Shaikewitz et.al.	2406.16837	link
2024-06-03	Scale-Free Image Keypoints Using Differentiable Persistent Homology	Giovanni Barbarani et.al.	2406.01315	link
2024-06-23	W-Net: A Facial Feature-Guided Face Super-Resolution Network	Hao Liu et.al.	2406.00676	null
2024-05-25	Deep-PE: A Learning-Based Pose Evaluator for Point Cloud Registration	Junjie Gao et.al.	2405.16085	null
2024-06-01	Benchmarking Fish Dataset and Evaluation Metric in Keypoint Detection -- Towards Precise Fish Morphological Assessment in Aquaculture Breeding	Weizhen Liu et.al.	2405.12476	link
2024-05-14	TP3M: Transformer-based Pseudo 3D Image Matching with Reference	Liming Han et.al.	2405.08434	null
2024-05-15	Vector-Symbolic Architecture for Event-Based Optical Flow	Hongzhi You et.al.	2405.08300	null
2024-05-13	RGBD-Glue: General Feature Combination for Robust RGB-D Point Cloud Registration	Congjia Chen et.al.	2405.07594	null
2024-05-08	Unsupervised Skin Feature Tracking with Deep Neural Networks	Jose Chang et.al.	2405.04943	null
2024-05-07	A Self-Supervised Method for Body Part Segmentation and Keypoint Detection of Rat Images	László Kopácsi et.al.	2405.04650	null
2024-04-30	A Light-weight Transformer-based Self-supervised Matching Network for Heterogeneous Images	Wang Zhang et.al.	2404.19311	null
2024-04-25	Adaptive Local Binary Pattern: A Novel Feature Descriptor for Enhanced Analysis of Kidney Abnormalities in CT Scan Images using ensemble based Machine Learning Approach	Tahmim Hossain et.al.	2404.14560	null
2024-04-19	SkelFormer: Markerless 3D Pose and Shape Estimation using Skeletal Transformers	Vandad Davoodnia et.al.	2404.12625	null
2024-04-17	Pixel-Wise Symbol Spotting via Progressive Points Location for Parsing CAD Images	Junbiao Pang et.al.	2404.10985	null
2024-03-28	Towards Long Term SLAM on Thermal Imagery	Colin Keil et.al.	2403.19885	link
2024-03-28	Instance-Adaptive and Geometric-Aware Keypoint Learning for Category-Level 6D Object Pose Estimation	Xiao Lin et.al.	2403.19527	link
2024-03-27	RoboKeyGen: Robot Pose and Joint Angles Estimation via Diffusion-based 3D Keypoint Generation	Yang Tian et.al.	2403.18259	null
2024-03-18	FE-DeTr: Keypoint Detection and Tracking in Low-quality Image Frames with Events	Xiangyuan Wang et.al.	2403.11662	link
2024-03-05	Self-supervised 3D Patient Modeling with Multi-modal Attentive Fusion	Meng Zheng et.al.	2403.03217	null
2024-02-22	A Self-supervised Pressure Map human keypoint Detection Approach: Optimizing Generalization and Computational Efficiency Across Datasets	Chengzhang Yu et.al.	2402.14241	null
2024-02-25	A Feature Matching Method Based on Multi-Level Refinement Strategy	Shaojie Zhang et.al.	2402.13488	null
2024-03-05	3D Kinematics Estimation from Video with a Biomechanical Model and Synthetic Training Data	Zhi-Yi Lin et.al.	2402.13172	null
2024-02-25	Region Feature Descriptor Adapted to High Affine Transformations	Shaojie Zhang et.al.	2402.09724	null
2024-01-29	Reconstructing Close Human Interactions from Multiple Views	Qing Shuai et.al.	2401.16173	link
2024-01-17	To deform or not: treatment-aware longitudinal registration for breast DCE-MRI during neoadjuvant chemotherapy via unsupervised keypoints detection	Luyi Han et.al.	2401.09336	link
2024-01-08	Flowmind2Digital: The First Comprehensive Flowmind Recognition and Conversion Approach	Huanyu Liu et.al.	2401.03742	link
2024-03-22	6D-Diff: A Keypoint Diffusion Framework for 6D Object Pose Estimation	Li Xu et.al.	2401.00029	null
2023-12-27	Bezier-based Regression Feature Descriptor for Deformable Linear Objects	Fangqing Chen et.al.	2312.16502	null
2023-12-24	Residual Learning for Image Point Descriptors	Rashik Shrestha et.al.	2312.15471	null
2023-12-22	BonnBeetClouds3D: A Dataset Towards Point Cloud-based Organ-level Phenotyping of Sugar Beet Plants under Field Conditions	Elias Marks et.al.	2312.14706	null
2023-12-19	Adaptive Distribution Masked Autoencoders for Continual Test-Time Adaptation	Jiaming Liu et.al.	2312.12480	null
2023-12-19	An effective image copy-move forgery detection using entropy image	Zhaowei Lu et.al.	2312.11793	link
2023-12-11	VoxelKP: A Voxel-based Network Architecture for Human Keypoint Estimation in LiDAR Data	Jian Shi et.al.	2312.08871	link
2023-12-11	Keypoint-based Stereophotoclinometry for Characterizing and Navigating Small Bodies: A Factor Graph Approach	Travis Driver et.al.	2312.06865	link

(back to top)

Open-Vocabulary

Publish Date	Title	Authors	PDF	Code
2024-08-20	OpenScan: A Benchmark for Generalized Open-Vocabulary 3D Scene Understanding	Youjun Zhao et.al.	2408.11030	link
2024-08-20	Open 3D World in Autonomous Driving	Xinlong Cheng et.al.	2408.10880	null
2024-08-20	LightMDETR: A Lightweight Approach for Low-Cost Open-Vocabulary Object Detection Training	Binta Sow et.al.	2408.10787	null
2024-08-20	Vocabulary-Free 3D Instance Segmentation with Vision and Language Assistant	Guofeng Mei et.al.	2408.10652	null
2024-08-20	SZTU-CMU at MER2024: Improving Emotion-LLaMA with Conv-Attention for Multimodal Emotion Recognition	Zebang Cheng et.al.	2408.10500	link
2024-08-18	OVOSE: Open-Vocabulary Semantic Segmentation in Event-Based Cameras	Muhammad Rameez Ur Rahman et.al.	2408.09424	link
2024-08-17	Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community	Jiancheng Pan et.al.	2408.09110	null
2024-08-16	From Lazy to Prolific: Tackling Missing Labels in Open Vocabulary Extreme Classification by Positive-Unlabeled Sequence Learning	Haoran Ranran Zhang et.al.	2408.08981	null
2024-08-16	Zero-Shot Dual-Path Integration Framework for Open-Vocabulary 3D Instance Segmentation	Tri Ton et.al.	2408.08591	null
2024-08-15	Towards Flexible Visual Relationship Segmentation	Fangrui Zhu et.al.	2408.08305	null
2024-08-15	VLPG-Nav: Object Navigation Using Visual Language Pose Graph and Object Localization Probability Maps	Senthil Hariharan Arul et.al.	2408.08301	null
2024-08-15	DM2RM: Dual-Mode Multimodal Ranking for Target Objects and Receptacles Based on Open-Vocabulary Instructions	Ryosuke Korekata et.al.	2408.07910	null
2024-08-18	Rethinking Open-Vocabulary Segmentation of Radiance Fields in 3D Space	Hyunjee Lee et.al.	2408.07416	null
2024-08-13	Fingerspelling within Sign Language Translation	Garrett Tanzer et.al.	2408.07065	null
2024-08-11	An analysis of HOI: using a training-free method with multimodal visual foundation models when only the test set is available, without the training set	Chaoyi Ai et.al.	2408.05772	null
2024-08-11	Efficient and Versatile Robust Fine-Tuning of Zero-shot Models	Sungyeon Kim et.al.	2408.05749	null
2024-08-09	In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation	Dahyun Kang et.al.	2408.04961	link
2024-08-09	ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation	Mengcheng Lan et.al.	2408.04883	link
2024-08-07	Leveraging LLMs for Enhanced Open-Vocabulary 3D Scene Understanding in Autonomous Driving	Amirhosein Chahe et.al.	2408.03516	null
2024-08-05	Explain via Any Concept: Concept Bottleneck Model with Open Vocabulary Concepts	Andong Tan et.al.	2408.02265	null
2024-08-01	Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation	Siyu Jiao et.al.	2408.00744	link
2024-07-31	Open-Vocabulary Audio-Visual Semantic Segmentation	Ruohao Guo et.al.	2407.21721	null
2024-07-31	MarvelOVD: Marrying Object Recognition and Vision-Language Models for Robust Open-Vocabulary Object Detection	Kuo Wang et.al.	2407.21465	link
2024-07-29	MaskInversion: Localized Embeddings via Optimization of Explainability Maps	Walid Bousselham et.al.	2407.20034	null
2024-07-24	DexGANGrasp: Dexterous Generative Adversarial Grasping Synthesis for Task-Oriented Manipulation	Qian Feng et.al.	2407.17348	null
2024-07-25	LangOcc: Self-Supervised Open Vocabulary Occupancy Estimation via Volume Rendering	Simon Boeder et.al.	2407.17310	null
2024-07-24	OVR: A Dataset for Open Vocabulary Temporal Repetition Counting in Videos	Debidatta Dwibedi et.al.	2407.17085	null
2024-07-23	SAM-CP: Marrying SAM with Composable Prompts for Versatile Segmentation	Pengfei Chen et.al.	2407.16682	null
2024-07-24	MicroEmo: Time-Sensitive Multimodal Emotion Recognition with Micro-Expression Dynamics in Video Dialogues	Liyun Zhang et.al.	2407.16552	null
2024-07-18	Which objects help me to act effectively? Reasoning about physically-grounded affordances	Anne Kemmeren et.al.	2407.13811	null
2024-07-18	SegPoint: Segment Any Point Cloud via Large Language Model	Shuting He et.al.	2407.13761	null
2024-07-18	Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models	Xiaoyu Zhu et.al.	2407.13642	null
2024-07-18	Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation	Pengfei Wang et.al.	2407.13362	null
2024-07-18	OVGNet: A Unified Visual-Linguistic Framework for Open-Vocabulary Robotic Grasping	Li Meng et.al.	2407.13175	link
2024-07-17	CerberusDet: Unified Multi-Task Object Detection	Irina Tolstykh et.al.	2407.12632	link
2024-07-17	ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference	Mengcheng Lan et.al.	2407.12442	null
2024-07-17	VEON: Vocabulary-Enhanced Occupancy Prediction	Jilai Zheng et.al.	2407.12294	null
2024-07-18	LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction	Penghui Du et.al.	2407.11335	link
2024-07-17	Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion	Philipp Allgeuer et.al.	2407.11211	null
2024-07-15	OVLW-DETR: Open-Vocabulary Light-Weighted Detection Transformer	Yu Wang et.al.	2407.10655	link
2024-07-15	Evaluating Model Bias Requires Characterizing its Mistakes	Isabela Albuquerque et.al.	2407.10633	null
2024-07-13	DexGrasp-Diffusion: Diffusion-based Unified Functional Grasp Synthesis Pipeline for Multi-Dexterous Robotic Hands	Zhengshen Zhang et.al.	2407.09899	null
2024-07-13	Dense Multimodal Alignment for Open-Vocabulary 3D Scene Understanding	Ruihuang Li et.al.	2407.09781	null
2024-07-12	DART: An Automated End-to-End Object Detection Pipeline with Data Diversification, Open-Vocabulary Bounding Box Annotation, Pseudo-Label Review, and Model Training	Chen Xin et.al.	2407.09174	link
2024-07-12	Open Vocabulary Multi-Label Video Classification	Rohit Gupta et.al.	2407.09073	null
2024-07-12	Navi2Gaze: Leveraging Foundation Models for Navigation and Target Gazing	Jun Zhu et.al.	2407.09053	null
2024-07-12	OVExp: Open Vocabulary Exploration for Object-Oriented Navigation	Meng Wei et.al.	2407.09016	null
2024-07-12	Global-Local Collaborative Inference with LLM for Lidar-Based Open-Vocabulary Detection	Xingyu Peng et.al.	2407.08931	link
2024-07-11	Explore the Potential of CLIP for Training-Free Open Vocabulary Semantic Segmentation	Tong Shao et.al.	2407.08268	link
2024-07-10	OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion	Hao Wang et.al.	2407.07844	link
2024-07-10	Scaling Law in Neural Data: Non-Invasive Speech Decoding with 175 Hours of EEG Data	Motoshige Sato et.al.	2407.07595	null
2024-07-12	Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation	Hao Fang et.al.	2407.07427	link
2024-07-09	Exploring Scalability of Self-Training for Open-Vocabulary Temporal Action Localization	Jeongseok Hyun et.al.	2407.07024	link
2024-07-09	Towards Open-World Mobile Manipulation in Homes: Lessons from the Neurips 2023 HomeRobot Open Vocabulary Mobile Manipulation Challenge	Sriram Yenamandra et.al.	2407.06939	null
2024-07-09	Graph-Based Captioning: Enhancing Visual Descriptions by Interconnecting Region Captions	Yu-Guan Hsieh et.al.	2407.06723	null
2024-07-07	Unlocking Textual and Visual Wisdom: Open-Vocabulary 3D Object Detection Enhanced by Comprehensive Guidance from Text and Image	Pengkun Jiao et.al.	2407.05256	null
2024-07-06	A Study of Test-time Contrastive Concepts for Open-world, Open-vocabulary Semantic Segmentation	Monika Wysoczańska et.al.	2407.05061	null
2024-07-05	CountGD: Multi-Modal Open-World Counting	Niki Amini-Naieni et.al.	2407.04619	null
2024-07-03	A Unified Framework for 3D Scene Understanding	Wei Xu et.al.	2407.03263	null
2024-07-02	Open Panoramic Segmentation	Junwei Zheng et.al.	2407.02685	link
2024-07-01	PanopticRecon: Leverage Open-vocabulary Instance Segmentation for Zero-shot Panoptic Reconstruction	Xuan Yu et.al.	2407.01349	null
2024-07-01	Fast and Efficient: Mask Neural Fields for 3D Scene Segmentation	Zihan Gao et.al.	2407.01220	null
2024-07-01	Object Segmentation from Open-Vocabulary Manipulation Instructions Based on Optimal Transport Polygon Matching with Multimodal Foundation Models	Takayuki Nishimura et.al.	2407.00985	null
2024-06-29	When Robots Get Chatty: Grounding Multimodal Human-Robot Conversation and Collaboration	Philipp Allgeuer et.al.	2407.00518	null
2024-06-28	PoliFormer: Scaling On-Policy RL with Transformers Results in Masterful Navigators	Kuo-Hao Zeng et.al.	2406.20083	null
2024-07-01	3D Feature Distillation with Object-Centric Priors	Georgios Tziafas et.al.	2406.18742	null
2024-06-26	Open-vocabulary Mobile Manipulation in Unseen Dynamic Environments with 3D Semantic Maps	Dicong Qiu et.al.	2406.18115	null
2024-06-24	High-resolution open-vocabulary object 6D pose estimation	Jaime Corsetti et.al.	2406.16384	null
2024-07-01	A Simple Framework for Open-Vocabulary Zero-Shot Segmentation	Thomas Stegmüller et.al.	2406.16085	null
2024-06-21	Open-vocabulary Pick and Place via Patch-level Semantic Maps	Mingxi Jia et.al.	2406.15677	null
2024-06-21	Open-Vocabulary Temporal Action Localization using Multimodal Guidance	Akshita Gupta et.al.	2406.15556	null
2024-06-19	StableSemantics: A Synthetic Language-Vision Dataset of Semantic Representations in Naturalistic Images	Rushikesh Zawar et.al.	2406.13735	null
2024-06-17	V3Det Challenge 2024 on Vast Vocabulary and Open Vocabulary Object Detection: Methods and Results	Jiaqi Wang et.al.	2406.11739	null
2024-06-17	Understanding Multi-Granularity for Open-Vocabulary Part Segmentation	Jiho Choi et.al.	2406.11384	link
2024-06-16	Open-Vocabulary X-ray Prohibited Item Detection via Fine-tuning CLIP	Shuyang Lin et.al.	2406.10961	null
2024-06-14	Open-Vocabulary Semantic Segmentation with Image Embedding Balancing	Xiangheng Shan et.al.	2406.09829	link
2024-06-14	Language-Guided Manipulation with Diffusion Policies and Constrained Inpainting	Ce Hao et.al.	2406.09767	null
2024-06-13	ImageNet3D: Towards General-Purpose Object-Level 3D Understanding	Wufei Ma et.al.	2406.09613	link
2024-06-21	Enhanced Object Detection: A Study on Vast Vocabulary Object Detection Track for V3Det Challenge 2024	Peixi Wu et.al.	2406.09201	null
2024-06-13	Auto-Vocabulary Segmentation for LiDAR Points	Weijie Wei et.al.	2406.09126	null
2024-06-13	LLM-Driven Robots Risk Enacting Discrimination, Violence, and Unlawful Actions	Rumaisa Azeem et.al.	2406.08824	null
2024-06-12	OpenObj: Open-Vocabulary Object-Level Neural Radiance Fields with Fine-Grained Understanding	Yinan Deng et.al.	2406.08009	link
2024-06-12	CTC-aligned Audio-Text Embedding for Streaming Open-vocabulary Keyword Spotting	Sichen Jin et.al.	2406.07923	null
2024-06-11	Beyond Bare Queries: Open-Vocabulary Object Retrieval with 3D Scene Graph	Sergey Linok et.al.	2406.07113	null
2024-06-10	Open-Vocabulary Part-Based Grasping	Tjeard van Oort et.al.	2406.05951	null
2024-06-07	USE: Universal Segment Embeddings for Open-Vocabulary Image Segmentation	Xiaoqi Wang et.al.	2406.05271	null
2024-06-07	OVMR: Open-Vocabulary Recognition with Multi-Modal References	Zehong Ma et.al.	2406.04675	link
2024-06-07	FusionBench: A Comprehensive Benchmark of Deep Model Fusion	Anke Tang et.al.	2406.03280	link
2024-06-04	Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation	Mohamed El Amine Boudjoghra et.al.	2406.02548	link
2024-06-04	OpenGaussian: Towards Point-Level 3D Gaussian-based Open Vocabulary Understanding	Yanmin Wu et.al.	2406.02058	null
2024-06-04	FastLGS: Speeding up Language Embedded Gaussians with Feature Grid Mapping	Yuzhou Ji et.al.	2406.01916	null
2024-06-03	ELSA: Evaluating Localization of Social Activities in Urban Streets	Maryam Hosseini et.al.	2406.01551	null
2024-06-03	EAGLE: Efficient Adaptive Geometry-based Learning in Cross-view Understanding	Thanh-Dat Truong et.al.	2406.01429	null
2024-06-02	Collaborative Novel Object Discovery and Box-Guided Cross-Modal Alignment for Open-Vocabulary 3D Object Detection	Yang Cao et.al.	2406.00830	link
2024-06-01	Learning Background Prompts to Discover Implicit Knowledge for Open Vocabulary Object Detection	Jiaming Li et.al.	2406.00510	null
2024-05-31	Diversifying Query: Region-Guided Transformer for Temporal Sentence Grounding	Xiaolong Sun et.al.	2406.00143	null
2024-05-30	OpenDAS: Domain Adaptation for Open-Vocabulary Segmentation	Gonca Yilmaz et.al.	2405.20141	null
2024-05-30	RTGen: Generating Region-Text Pairs for Open-Vocabulary Object Detection	Fangyi Chen et.al.	2405.19854	link
2024-05-29	Reasoning3D -- Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models	Tianrun Chen et.al.	2405.19326	null
2024-05-29	Parameter-efficient Fine-tuning in Hyperspherical Space for Open-vocabulary Semantic Segmentation	Zelin Peng et.al.	2405.18840	null
2024-05-28	OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision	Junjie Wang et.al.	2405.17913	link
2024-06-03	EgoNCE++: Do Egocentric Video-Language Models Really Understand Hand-Object Interactions?	Boshen Xu et.al.	2405.17719	link
2024-05-27	GOI: Find 3D Gaussians of Interest with an Optimizable Open-vocabulary Semantic-space Hyperplane	Yansong Qu et.al.	2405.17596	null
2024-05-26	Map-based Modular Approach for Zero-shot Embodied Question Answering	Koya Sakamoto et.al.	2405.16559	null
2024-05-26	CRoFT: Robust Fine-Tuning with Concurrent Optimization for OOD Generalization and Open-Set OOD Detection	Lin Zhu et.al.	2405.16417	link
2024-05-25	DynRefer: Delving into Region-level Multi-modality Tasks via Dynamic Resolution	Yuzhong Zhao et.al.	2405.16071	link
2024-05-24	Open-Vocabulary SAM3D: Understand Any 3D Scene	Hanchen Tai et.al.	2405.15580	null
2024-05-24	3D Unsupervised Learning by Distilling 2D Open-Vocabulary Segmentation Models for Autonomous Driving	Boyi Sun et.al.	2405.15286	link
2024-05-23	TIGER: Text-Instructed 3D Gaussian Retrieval and Coherent Editing	Teng Xu et.al.	2405.14455	null
2024-05-23	Tuning-free Universally-Supervised Semantic Segmentation	Xiaobo Yang et.al.	2405.14294	null
2024-05-19	Unsupervised Image Prior via Prompt Learning and CLIP Semantic Guidance for Low-Light Image Enhancement	Igor Morawski et.al.	2405.11478	null
2024-05-17	Open-Vocabulary Spatio-Temporal Action Detection	Tao Wu et.al.	2405.10832	null
2024-05-16	When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models	Xianzheng Ma et.al.	2405.10255	link
2024-05-16	SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection	Mingxuan Liu et.al.	2405.10053	link
2024-05-15	A Survey On Text-to-3D Contents Generation In The Wild	Chenhan Jiang et.al.	2405.09431	null
2024-05-14	Open-Vocabulary Object Detection via Neighboring Region Attention Alignment	Sunyuan Qiang et.al.	2405.08593	null
2024-05-13	Open-vocabulary Auditory Neural Decoding Using fMRI-prompted LLM	Xiaoyu Chen et.al.	2405.07840	null
2024-05-13	Constructing a BPE Tokenization DFA	Martin Berglund et.al.	2405.07671	null
2024-05-10	Are EEG-to-Text Models Working?	Hyejeong Jo et.al.	2405.06459	link
2024-05-09	Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control	Gunshi Gupta et.al.	2405.05852	link
2024-05-09	DragGaussian: Enabling Drag-style Manipulation on 3D Gaussian Representation	Sitian Shen et.al.	2405.05800	null
2024-05-09	RoboHop: Segment-based Topological Map Representation for Open-World Visual Navigation	Sourav Garg et.al.	2405.05792	null
2024-05-08	OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies	Lingdong Kong et.al.	2405.05259	link
2024-05-08	Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving	Lingdong Kong et.al.	2405.05258	link
2024-05-08	DiffMatch: Visual-Language Guidance Makes Better Semi-supervised Change Detector	Kaiyu Li et.al.	2405.04788	link
2024-05-14	Splat-MOVER: Multi-Stage, Open-Vocabulary Robotic Manipulation via Editable Gaussian Splatting	Ola Shorinwa et.al.	2405.04378	null
2024-05-03	DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos	Wen-Hsuan Chu et.al.	2405.02280	link
2024-05-03	EEG2TEXT: Open Vocabulary EEG-to-Text Decoding with EEG Pre-Training and Multi-View Transformer	Hanwen Liu et.al.	2405.02165	null
2024-04-30	One-Stage Open-Vocabulary Temporal Action Detection Leveraging Temporal Multi-scale and Action Label Features	Trung Thanh Nguyen et.al.	2404.19542	link
2024-04-30	MoST: Multi-modality Scene Tokenization for Motion Prediction	Norman Mu et.al.	2404.19531	null
2024-04-28	Garbage Segmentation and Attribute Analysis by Robotic Dogs	Nuo Xu et.al.	2404.18112	null
2024-04-29	MER 2024: Semi-Supervised Learning, Noise Robustness, and Open-Vocabulary Multimodal Emotion Recognition	Zheng Lian et.al.	2404.17113	link
2024-04-23	DENOISER: Rethinking the Robustness for Open-Vocabulary Action Recognition	Haozhe Cheng et.al.	2404.14890	null
2024-04-19	ECOR: Explainable CLIP for Object Recognition	Ali Rasekh et.al.	2404.12839	null
2024-04-18	Spot-Compose: A Framework for Open-Vocabulary Object Retrieval and Drawer Manipulation in Point Clouds	Oliver Lemke et.al.	2404.12440	null
2024-04-18	The devil is in the object boundary: towards annotation-free instance segmentation using Foundation Models	Cheng Shi et.al.	2404.11957	link
2024-04-17	OVAL-Prompt: Open-Vocabulary Affordance Localization for Robot Manipulation through LLM Affordance-Grounding	Edmond Tong et.al.	2404.11000	null
2024-04-16	Watch Your Step: Optimal Retrieval for Continual Learning at Scale	Truman Hickok et.al.	2404.10758	null
2024-04-16	Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V	Peiyuan Zhi et.al.	2404.10220	null
2024-04-15	Cross-Modal Self-Training: Aligning Images and Pointclouds to Learn Classification without Labels	Amaya Dharmasiri et.al.	2404.10146	link
2024-04-15	Evolving Interpretable Visual Classifiers with Large Language Models	Mia Chiquier et.al.	2404.09941	null
2024-04-15	kNN-CLIP: Retrieval Enables Training-Free Segmentation on Continually Expanding Large Vocabularies	Zhongrui Gui et.al.	2404.09447	null
2024-04-14	DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection	Lewei Yao et.al.	2404.09216	null
2024-04-12	Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation	Yanhao Zheng et.al.	2404.08603	link
2024-04-12	Pay Attention to Your Neighbours: Training-Free Open-Vocabulary Semantic Segmentation	Sina Hajimiri et.al.	2404.08181	link
2024-04-11	Transferable and Principled Efficiency for Open-Vocabulary Segmentation	Jingxuan Xu et.al.	2404.07448	link
2024-04-10	O2V-Mapping: Online Open-Vocabulary Mapping with Neural Implicit Representation	Muer Tie et.al.	2404.06836	null
2024-04-09	GOAT-Bench: A Benchmark for Multi-Modal Lifelong Navigation	Mukul Khanna et.al.	2404.06609	null
2024-04-09	Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation	Luca Barsellotti et.al.	2404.06542	null
2024-04-10	Exploring the Potential of Large Foundation Models for Open-Vocabulary HOI Detection	Ting Lei et.al.	2404.06194	link
2024-04-08	Retrieval-Augmented Open-Vocabulary Object Detection	Jooyeon Kim et.al.	2404.05687	link
2024-04-08	MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation	Kunpeng Song et.al.	2404.05674	link
2024-04-07	Hyperbolic Learning with Synthetic Captions for Open-World Detection	Fanjie Kong et.al.	2404.05016	null
2024-04-06	Mixed-Query Transformer: A Unified Image Segmentation Architecture	Pei Wang et.al.	2404.04469	null
2024-04-05	Open vocabulary keyword spotting through transfer learning from speech synthesis	Kesavaraj V et.al.	2404.03914	null
2024-04-04	OpenNeRF: Open Set 3D Neural Scene Segmentation with Pixel-Wise Features and Rendered Novel Views	Francis Engelmann et.al.	2404.03650	null
2024-04-04	Is CLIP the main roadblock for fine-grained open-world perception?	Lorenzo Bianchi et.al.	2404.03539	link
2024-04-04	Learning Transferable Negative Prompts for Out-of-Distribution Detection	Tianqi Li et.al.	2404.03248	link
2024-04-04	LeGrad: An Explainability Method for Vision Transformers via Feature Formation Sensitivity	Walid Bousselham et.al.	2404.03214	link
2024-04-03	ALOHa: A New Measure for Hallucination in Captioning Models	Suzanne Petryk et.al.	2404.02904	null
2024-04-03	Low-resource neural machine translation with morphological modeling	Antoine Nzeyimana et.al.	2404.02392	link
2024-04-02	Segment Any 3D Object with Language	Seungjun Lee et.al.	2404.02157	null
2024-04-03	ViTamin: Designing Scalable Vision Models in the Vision-Language Era	Jieneng Chen et.al.	2404.02132	link
2024-04-01	OVFoodSeg: Elevating Open-Vocabulary Food Image Segmentation via Image-Informed Textual Representation	Xiongwei Wu et.al.	2404.01409	null
2024-04-02	Open-Vocabulary Federated Learning with Multimodal Prototyping	Huimin Zeng et.al.	2404.01232	link
2024-04-01	GOV-NeSF: Generalizable Open-Vocabulary Neural Semantic Fields	Yunsong Wang et.al.	2404.00931	link
2024-04-01	From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models	Rongjie Li et.al.	2404.00906	link
2024-03-31	Training-Free Semantic Segmentation via LLM-Supervision	Wenfang Sun et.al.	2404.00701	null
2024-03-30	Do Vision-Language Models Understand Compound Nouns?	Sonal Kumar et.al.	2404.00419	link
2024-03-30	Image-to-Image Matching via Foundation Models: A New Perspective for Open-Vocabulary Semantic Segmentation	Yuan Wang et.al.	2404.00262	null
2024-03-29	FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models	Barbara Toniella Corradini et.al.	2403.20105	null
2024-03-28	OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via Cycle-Modality Propagation	Zhenyu Wang et.al.	2403.19580	link
2024-03-27	Lift3D: Zero-Shot Lifting of Any 2D Vision Model to 3D	Mukund Varma T et.al.	2403.18922	null
2024-03-26	Hierarchical Open-Vocabulary 3D Scene Graphs for Language-Grounded Robot Navigation	Abdelrhman Werby et.al.	2403.17846	null
2024-03-26	OVER-NAV: Elevating Iterative Vision-and-Language Navigation with Open-Vocabulary Detection and StructurEd Representation	Ganlong Zhao et.al.	2403.17334	null
2024-03-22	Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting	Jun Guo et.al.	2403.15624	null
2024-03-21	PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model	Zheng Zhang et.al.	2403.14598	link
2024-03-21	Exosense: A Vision-Centric Scene Understanding System For Safe Exoskeleton Navigation	Jianeng Wang et.al.	2403.14320	null
2024-03-21	Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models	Pablo Marcos-Manchón et.al.	2403.14291	link
2024-03-21	Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection	Tim Salzmann et.al.	2403.14270	null
2024-03-20	Learning from Models and Data for Visual Grounding	Ruozhen He et.al.	2403.13804	null
2024-03-20	Embedding Pose Graph, Enabling 3D Foundation Model Capabilities with a Compact Representation	Hugues Thomas et.al.	2403.13777	null
2024-03-20	Find n' Propagate: Open-Vocabulary 3D Object Detection in Urban Environments	Djamahl Etchegaray et.al.	2403.13556	link
2024-03-19	AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents	Jieming Cui et.al.	2403.12835	null
2024-03-19	DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM	Yixuan Wu et.al.	2403.12488	link
2024-03-19	CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation	Wenqi Zhu et.al.	2403.12455	link
2024-03-19	VisionGPT: LLM-Assisted Real-Time Anomaly Detection for Safe Visual Navigation	Hao Wang et.al.	2403.12415	link
2024-03-19	OV9D: Open-Vocabulary Category-Level 9D Object Pose and Size Estimation	Junhao Cai et.al.	2403.12396	null
2024-03-18	OpenOcc: Open Vocabulary 3D Scene Reconstruction via Occupancy Representation	Haochen Jiang et.al.	2403.11796	null
2024-03-17	TAG: Guidance-free Open-Vocabulary Semantic Segmentation	Yasufumi Kawano et.al.	2403.11197	link
2024-03-17	MaskDiffusion: Exploiting Pre-trained Diffusion Models for Semantic Segmentation	Yasufumi Kawano et.al.	2403.11194	link
2024-03-16	N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields	Yash Bhalgat et.al.	2403.10997	null
2024-03-16	Unsupervised Collaborative Metric Learning with Mixed-Scale Groups for General Object Retrieval	Shichao Kan et.al.	2403.10798	link
2024-03-15	Generative Region-Language Pretraining for Open-Ended Object Detection	Chuang Lin et.al.	2403.10191	link
2024-03-15	Do Visual-Language Maps Capture Latent Semantics?	Matti Pekkanen et.al.	2403.10117	null
2024-03-14	GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping	Yuhang Zheng et.al.	2403.09637	link
2024-03-14	PosSAM: Panoptic Open-vocabulary Segment Anything	Vibashan VS et.al.	2403.09620	link
2024-03-14	Renovating Names in Open-Vocabulary Segmentation Benchmarks	Haiwen Huang et.al.	2403.09593	null
2024-03-14	Open-Vocabulary Object Detection with Meta Prompt Representation and Instance Contrastive Optimization	Zhao Wang et.al.	2403.09433	null
2024-03-14	OpenGraph: Open-Vocabulary Hierarchical 3D Graph Representation in Large-Scale Outdoor Environments	Yinan Deng et.al.	2403.09412	link
2024-03-14	Language-Grounded Dynamic Scene Graphs for Interactive Object Search with Mobile Manipulation	Daniel Honerkamp et.al.	2403.08605	link
2024-03-12	Learning Generalizable Feature Fields for Mobile Manipulation	Ri-Zhao Qiu et.al.	2403.07563	null
2024-03-12	Open-Vocabulary Scene Text Recognition via Pseudo-Image Labeling and Margin Loss	Xuhua Ren et.al.	2403.07518	null
2024-03-11	Real-time Transformer-based Open-Vocabulary Detection with Efficient Fusion Head	Tiancheng Zhao et.al.	2403.06892	link
2024-03-02	A Cross-Modal Approach to Silent Speech with LLM-Enhanced Recognition	Tyler Benster et.al.	2403.05583	link
2024-03-14	OmniCount: Multi-label Object Counting with Semantic-Geometric Priors	Anindya Mondal et.al.	2403.05435	null
2024-03-08	Exploring Robust Features for Few-Shot Object Detection in Satellite Imagery	Xavier Bou et.al.	2403.05381	link
2024-03-07	Self-Adapting Large Visual-Language Models to Edge Devices across Visual Modalities	Kaiwen Cai et.al.	2403.04908	link
2024-03-06	Popeye: A Unified Visual-Language Model for Multi-Source Ship Detection from Remote Sensing Imagery	Wei Zhang et.al.	2403.03790	null
2024-03-06	Multi-Grained Cross-modal Alignment for Learning Open-vocabulary Semantic Segmentation from Text Supervision	Yajie Liu et.al.	2403.03707	null
2024-03-05	MOKA: Open-Vocabulary Robotic Manipulation through Mark-Based Visual Prompting	Fangchen Liu et.al.	2403.03174	null
2024-03-03	Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary Action Recognition	Kun-Yu Lin et.al.	2403.01560	link
2024-03-10	Benchmarking Segmentation Models with Mask-Preserved Attribute Editing	Zijin Yin et.al.	2403.01231	link
2024-03-01	Multi-modal Attribute Prompting for Vision-Language Models	Xin Liu et.al.	2403.00219	null
2024-02-29	DOZE: A Dataset for Open-Vocabulary Zero-Shot Object Navigation in Dynamic Environments	Ji Ma et.al.	2402.19007	null
2024-02-29	MOSAIC: A Modular System for Assistive and Interactive Cooking	Huaxiaoyue Wang et.al.	2402.18796	null
2024-02-26	CARTE: pretraining and transfer for tabular learning	Myung Jun Kim et.al.	2402.16785	link
2024-02-23	OpenSUN3D: 1st Workshop Challenge on Open-Vocabulary 3D Scene Understanding	Francis Engelmann et.al.	2402.15321	null
2024-02-21	Real-time 3D-aware Portrait Editing from a Single Image	Qingyan Bai et.al.	2402.14000	link
2024-02-21	Generalizable Semantic Vision Query Generation for Zero-shot Panoptic and Semantic Segmentation	Jialei Chen et.al.	2402.13697	null
2024-02-20	A Touch, Vision, and Language Dataset for Multimodal Alignment	Letian Fu et.al.	2402.13232	link
2024-02-19	Open3DSG: Open-Vocabulary 3D Scene Graphs from Point Clouds with Queryable Objects and Open-Set Relationships	Sebastian Koch et.al.	2402.12259	link
2024-02-18	Verifiably Following Complex Robot Instructions with Foundation Models	Benedict Quartey et.al.	2402.11498	null
2024-02-15	Mind the Modality Gap: Towards a Remote Sensing Vision-Language Model via Cross-modal Alignment	Angelos Zavras et.al.	2402.09816	null
2024-02-14	Open-Vocabulary Segmentation with Unpaired Mask-Text Supervision	Zhaoqing Wang et.al.	2402.08960	link
2024-02-20	InstaGen: Enhancing Object Detection by Training on Synthetic Dataset	Chengjian Feng et.al.	2402.05937	null
2024-02-15	Open-Vocabulary Calibration for Vision-Language Models	Shuoyuan Wang et.al.	2402.04655	link
2024-02-07	OV-NeRF: Open-vocabulary Neural Radiance Fields with Vision and Language Foundation Models for 3D Semantic Understanding	Guibiao Liao et.al.	2402.04648	null
2024-02-07	LLMs Meet VLMs: Boost Open Vocabulary Object Detection with Fine-grained Descriptors	Sheng Jin et.al.	2402.04630	null
2024-02-06	Listen, Chat, and Edit: Text-Guided Soundscape Modification for Enhanced Auditory Experience	Xilin Jiang et.al.	2402.03710	null
2024-02-05	FROSTER: Frozen CLIP Is A Strong Teacher for Open-Vocabulary Action Recognition	Xiaohu Huang et.al.	2402.03241	null
2024-02-05	Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector	Yuqian Fu et.al.	2402.03094	link
2024-02-02	YOLO-World: Real-Time Open-Vocabulary Object Detection	Tianheng Cheng et.al.	2401.17270	link
2024-01-29	Bridging Generative and Discriminative Models for Unified Visual Perception with Diffusion Priors	Shiyin Dong et.al.	2401.16459	null
2024-01-29	Spatial-Aware Latent Initialization for Controllable Image Generation	Wenqiang Sun et.al.	2401.16157	null
2024-01-29	LCVO: An Efficient Pretraining-Free Framework for Visual Question Answering Grounding	Yuhan Chen et.al.	2401.15842	null
2024-01-25	Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks	Tianhe Ren et.al.	2401.14159	link
2024-01-25	True Knowledge Comes from Practice: Aligning LLMs with Embodied Environments via Reinforcement Learning	Weihao Tan et.al.	2401.14151	link
2024-01-22	Exploring Simple Open-Vocabulary Semantic Segmentation	Zihang Lai et.al.	2401.12217	link
2024-01-22	OK-Robot: What Really Matters in Integrating Open-Knowledge Models for Robotics	Peiqi Liu et.al.	2401.12202	link
2024-01-22	HomeRobot Open Vocabulary Mobile Manipulation Challenge 2023 Participant Report (Team KuzHum)	Volodymyr Kuzma et.al.	2401.12048	null
2024-01-31	UniM-OV3D: Uni-Modality Open-Vocabulary 3D Scene Understanding with Fine-Grained Feature Representation	Qingdong He et.al.	2401.11395	link
2024-01-18	OMG-Seg: Is One Model Good Enough For All Segmentation?	Xiangtai Li et.al.	2401.10229	link
2024-01-18	Question-Answer Cross Language Image Matching for Weakly Supervised Semantic Segmentation	Songhe Deng et.al.	2401.09883	link
2024-01-18	Instance Brownian Bridge as Texts for Open-vocabulary Video Instance Segmentation	Zesen Cheng et.al.	2401.09732	link
2024-01-17	POP-3D: Open-Vocabulary 3D Occupancy Prediction from Images	Antonin Vobecky et.al.	2401.09413	null
2024-01-17	OCTO+: A Suite for Automatic Open-Vocabulary Object Placement in Mixed Reality	Aditya Sharma et.al.	2401.08973	null
2024-01-16	Robotic Imitation of Human Actions	Josua Spisak et.al.	2401.08381	null

Image Captioning

Publish Date	Title	Authors	PDF	Code
2024-08-19	The Brittleness of AI-Generated Image Watermarking Techniques: Examining Their Robustness Against Visual Paraphrasing Attacks	Niyar R Barman et.al.	2408.10446	null
2024-08-16	An End-to-End Model for Photo-Sharing Multi-modal Dialogue Generation	Peiming Guo et.al.	2408.08650	null
2024-08-13	PathInsight: Instruction Tuning of Multimodal Datasets and Models for Intelligence Assisted Diagnosis in Histopathology	Xiaomin Wu et.al.	2408.07037	null
2024-08-12	Prompt Recovery for Image Generation Models: A Comparative Study of Discrete Optimizers	Joshua Nathaniel Williams et.al.	2408.06502	null
2024-08-09	Surveying the Landscape of Image Captioning Evaluation: A Comprehensive Taxonomy and Novel Ensemble Method	Uri Berger et.al.	2408.04909	null
2024-08-09	FUSE-ing Language Models: Zero-Shot Adapter Discovery for Prompt Optimization Across Tokenizers	Joshua Nathaniel Williams et.al.	2408.04816	link
2024-08-08	Enhancing Journalism with AI: A Study of Contextualized Image Captioning for News Articles using LLMs and LMMs	Aliki Anagnostopoulou et.al.	2408.04331	null
2024-08-06	Multitask and Multimodal Neural Tuning for Large Models	Hao Sun et.al.	2408.03001	null
2024-08-05	Modelling Visual Semantics via Image Captioning to extract Enhanced Multi-Level Cross-Modal Semantic Incongruity Representation with Attention for Multimodal Sarcasm Detection	Sajal Aggarwal et.al.	2408.02595	null
2024-08-04	Dataset Scale and Societal Consistency Mediate Facial Impression Bias in Vision-Language AI	Robert Wolfe et.al.	2408.01959	null
2024-08-03	A Novel Evaluation Framework for Image2Text Generation	Jia-Hong Huang et.al.	2408.01723	null
2024-08-02	The Phantom Menace: Unmasking Privacy Leakages in Vision-Language Models	Simone Caldarella et.al.	2408.01228	null
2024-07-30	AI Safety in Practice: Enhancing Adversarial Robustness in Multimodal Image Captioning	Maisha Binte Rashid et.al.	2407.21174	null
2024-07-29	BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues	Sara Sarto et.al.	2407.20341	link
2024-07-29	VolDoGer: LLM-assisted Datasets for Domain Generalization in Vision-Language Tasks	Juhwan Choi et.al.	2407.19795	null
2024-07-26	SWIFT: Semantic Watermarking for Image Forgery Thwarting	Gautier Evennou et.al.	2407.18995	null
2024-07-26	HICEScore: A Hierarchical Metric for Image Captioning Evaluation	Zequn Zeng et.al.	2407.18589	null
2024-07-26	SPOLRE: Semantic Preserving Object Layout Reconstruction for Image Captioning System Testing	Yi Liu et.al.	2407.18512	null
2024-07-23	VisMin: Visual Minimal-Change Understanding	Rabiul Awal et.al.	2407.16772	null
2024-07-23	Imperfect Vision Encoders: Efficient and Robust Tuning for Vision-Language Models	Aristeidis Panos et.al.	2407.16526	null
2024-07-23	Harmonizing Visual Text Comprehension and Generation	Zhen Zhao et.al.	2407.16364	null
2024-07-26	Multimodal Unlearnable Examples: Protecting Data against Multimodal Contrastive Learning	Xinwei Liu et.al.	2407.16307	link
2024-07-28	DiffX: Guide Your Layout to Cross-Modal Generative Modeling	Zeyu Wang et.al.	2407.15488	link
2024-07-21	VideoGameBunny: Towards vision assistants for video games	Mohammad Reza Taesiri et.al.	2407.15295	null
2024-07-20	Downstream-Pretext Domain Knowledge Traceback for Active Learning	Beichen Zhang et.al.	2407.14720	null
2024-07-19	On Pre-training of Multimodal Language Models Customized for Chart Understanding	Wan-Cyuan Fan et.al.	2407.14506	null
2024-07-19	EVLM: An Efficient Vision-Language Model for Visual Understanding	Kaibing Chen et.al.	2407.14177	null
2024-07-18	Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models	Xiaoyu Zhu et.al.	2407.13642	null
2024-07-17	LookupViT: Compressing visual information to a limited number of tokens	Rajat Koner et.al.	2407.12753	null
2024-07-16	Controllable Contextualized Image Captioning: Directing the Visual Narrative through User-Defined Highlights	Shunqi Mao et.al.	2407.11449	link
2024-07-17	CIC-BART-SSA: Controllable Image Captioning with Structured Semantic Augmentation	Kalliopi Basioti et.al.	2407.11393	link
2024-07-15	Can Textual Semantics Mitigate Sounding Object Segmentation Preference?	Yaoting Wang et.al.	2407.10947	link
2024-07-12	TCAN: Animating Human Images with Temporally Consistent Pose Guidance using Diffusion Models	Jeongho Kim et.al.	2407.09012	null
2024-07-12	15M Multimodal Facial Image-Text Dataset	Dawei Dai et.al.	2407.08515	null
2024-07-17	Pseudo-RIS: Distinctive Pseudo-supervision Generation for Referring Image Segmentation	Seonghoon Yu et.al.	2407.07412	link
2024-07-08	Leveraging image captions for selective whole slide image annotation	Jingna Qiu et.al.	2407.06363	link
2024-07-08	Pseudo-triplet Guided Few-shot Composed Image Retrieval	Bohan Hou et.al.	2407.06001	null
2024-07-08	Negative Results of Image Processing for Identifying Duplicate Questions on Stack Overflow	Faiz Ahmed et.al.	2407.05523	null
2024-07-11	Resampled Datasets Are Not Enough: Mitigating Societal Bias Beyond Single Attributes	Yusuke Hirota et.al.	2407.03623	null
2024-07-02	Certainly Uncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness	Khyathi Raghavi Chandu et.al.	2407.01942	null
2024-07-01	Semantic Compositions Enhance Vision-Language Contrastive Learning	Maxwell Aladago et.al.	2407.01408	null
2024-06-28	Assistive Image Annotation Systems with Deep Learning and Natural Language Capabilities: A Review	Moseli Mots'oehli et.al.	2407.00252	null
2024-06-28	PathGen-1.6M: 1.6 Million Pathology Image-text Pairs Generation through Multi-agent Collaboration	Yuxuan Sun et.al.	2407.00203	null
2024-06-28	MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment	Jihao Liu et.al.	2406.19736	link
2024-06-27	RAVEN: Multitask Retrieval Augmented Vision-Language Learning	Varun Nagaraj Rao et.al.	2406.19150	null
2024-07-02	Revisiting Backdoor Attacks against Large Vision-Language Models	Siyuan Liang et.al.	2406.18844	null
2024-06-26	MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data	William Berman et.al.	2406.18790	null
2024-06-24	Enhancing Scientific Figure Captioning Through Cross-modal Learning	Mateo Alejandro Rojas et.al.	2406.17047	null
2024-07-01	A Simple Framework for Open-Vocabulary Zero-Shot Segmentation	Thomas Stegmüller et.al.	2406.16085	null
2024-06-22	Reading Is Believing: Revisiting Language Bottleneck Models for Image Classification	Honori Udo et.al.	2406.15816	null
2024-06-20	Does Object Grounding Really Reduce Hallucination of Large Vision-Language Models?	Gregor Geigle et.al.	2406.14492	null
2024-06-20	From Descriptive Richness to Bias: Unveiling the Dark Side of Generative Image Caption Enrichment	Yusuke Hirota et.al.	2406.13912	null
2024-06-19	Reinforcing Pre-trained Models Using Counterfactual Images	Xiang Li et.al.	2406.13316	null
2024-06-18	Do More Details Always Introduce More Hallucinations in LVLM-based Image Captioning?	Mingqian Feng et.al.	2406.12663	null
2024-06-18	VRSBench: A Versatile Vision-Language Benchmark Dataset for Remote Sensing Image Understanding	Xiang Li et.al.	2406.12384	link
2024-06-17	Composing Object Relations and Attributes for Image-Text Matching	Khoi Pham et.al.	2406.11820	null
2024-06-17	LLARVA: Vision-Action Instruction Tuning Enhances Robot Learning	Dantong Niu et.al.	2406.11815	null
2024-06-17	MFC-Bench: Benchmarking Multimodal Fact-Checking with Large Vision-Language Models	Shengkang Wang et.al.	2406.11288	link
2024-06-14	From Pixels to Prose: A Large Dataset of Dense Image Captions	Vasu Singla et.al.	2406.10328	null
2024-06-14	OSPC: Detecting Harmful Memes with Large Language Model as a Catalyst	Jingtao Cao et.al.	2406.09779	null
2024-06-13	ImageNet3D: Towards General-Purpose Object-Level 3D Understanding	Wufei Ma et.al.	2406.09613	link
2024-06-13	Yo'LLaVA: Your Personalized Language and Vision Assistant	Thao Nguyen et.al.	2406.09400	null
2024-06-13	Towards Vision-Language Geo-Foundation Model: A Survey	Yue Zhou et.al.	2406.09385	link
2024-06-11	Translating speech with just images	Dan Oneata et.al.	2406.07133	link
2024-06-11	UVIS: Unsupervised Video Instance Segmentation	Shuaiyi Huang et.al.	2406.06908	null
2024-06-10	TRINS: Towards Multimodal Language Models that Can Read	Ruiyi Zhang et.al.	2406.06730	null
2024-06-10	VCR: Visual Caption Restoration	Tianyu Zhang et.al.	2406.06462	link
2024-06-10	FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model	Yebin Lee et.al.	2406.06004	link
2024-06-09	Stealthy Targeted Backdoor Attacks against Image Captioning	Wenshu Fan et.al.	2406.05874	link
2024-06-07	Interpretable Multimodal Out-of-context Detection with Soft Logic Regularization	Huanhuan Ma et.al.	2406.04756	null
2024-06-06	Understanding Retrieval Robustness for Retrieval-Augmented Image Captioning	Wenyan Li et.al.	2406.02265	link
2024-06-03	Dragonfly: Multi-Resolution Zoom Supercharges Large Visual-Language Model	Kezhen Chen et.al.	2406.00977	link
2024-06-01	DS@BioMed at ImageCLEFmedical Caption 2024: Enhanced Attention Mechanisms in Medical Caption Generation through Concept Detection Integration	Nhi Ngoc-Yen Nguyen et.al.	2406.00391	null
2024-06-01	Image Captioning via Dynamic Path Customization	Yiwei Ma et.al.	2406.00334	link
2024-05-30	OpenDAS: Domain Adaptation for Open-Vocabulary Segmentation	Gonca Yilmaz et.al.	2405.20141	null
2024-05-30	RTGen: Generating Region-Text Pairs for Open-Vocabulary Object Detection	Fangyi Chen et.al.	2405.19854	link
2024-05-29	Multi-Modal Generative Embedding Model	Feipeng Ma et.al.	2405.19333	null
2024-05-29	MetaToken: Detecting Hallucination in Image Descriptions by Meta Classification	Laura Fieback et.al.	2405.19186	null
2024-05-31	Benchmarking and Improving Detail Image Caption	Hongyuan Dong et.al.	2405.19092	link
2024-05-28	Text-only Synthesis for Image Captioning	Qing Zhou et.al.	2405.18258	null
2024-05-24	How Culturally Aware are Vision-Language Models?	Olena Burda-Lassen et.al.	2405.17475	null
2024-05-25	Semantic Importance-Aware Communications with Semantic Correction Using Large Language Models	Shuaishuai Guo et.al.	2405.16011	null
2024-05-23	LG-VQ: Language-Guided Codebook Learning	Guotao Liang et.al.	2405.14206	null
2024-05-23	A Survey on Vision-Language-Action Models for Embodied AI	Yueen Ma et.al.	2405.14093	null
2024-05-22	CrossCheckGPT: Universal Hallucination Ranking for Multimodal Foundation Models	Guangzhi Sun et.al.	2405.13684	null
2024-05-25	Class-Conditional self-reward mechanism for improved Text-to-Image models	Safouane El Ghazouali et.al.	2405.13473	link
2024-05-21	Towards Retrieval-Augmented Architectures for Image Captioning	Sara Sarto et.al.	2405.13127	null
2024-05-16	UniRAG: Universal Retrieval Augmentation for Multi-Modal Large Language Models	Sahel Sharifymoghaddam et.al.	2405.10311	null
2024-05-16	ROCOv2: Radiology Objects in COntext Version 2, an Updated Multimodal Image Dataset	Johannes Rückert et.al.	2405.10004	link
2024-05-16	Chameleon: Mixed-Modal Early-Fusion Foundation Models	Chameleon Team et.al.	2405.09818	null
2024-05-14	Contextual Emotion Recognition using Large Vision Language Models	Yasaman Etesam et.al.	2405.08992	null
2024-05-13	Boostlet.js: Image processing plugins for the web via JavaScript injection	Edward Gaibor et.al.	2405.07868	link
2024-05-09	Using Machine Translation to Augment Multilingual Classification	Adam King et.al.	2405.05478	null
2024-05-03	LLM as Dataset Analyst: Subpopulation Structure Discovery with Large Language Model	Yulin Luo et.al.	2405.02363	link
2024-05-02	Technical Report of NICE Challenge at CVPR 2024: Caption Re-ranking Evaluation Using Ensembled CLIP and Consensus Scores	Kiyoon Jeong et.al.	2405.01028	link
2024-05-01	Beyond Human Vision: The Role of Large Vision Language Models in Microscope Image Analysis	Prateek Verma et.al.	2405.00876	null
2024-05-01	The Pyramid of Captions	Delong Chen et.al.	2405.00485	null
2024-04-29	Enhancing Interactive Image Retrieval With Query Rewriting Using Large Language Models and Vision Language Models	Hongyi Zhu et.al.	2404.18746	null
2024-04-28	Semi-supervised Text-based Person Search	Daming Gao et.al.	2404.18106	null
2024-04-28	Compressed Image Captioning using CNN-based Encoder-Decoder Framework	Md Alif Rahman Ridoy et.al.	2404.18062	null
2024-04-26	Learning text-to-video retrieval from image captioning	Lucas Ventura et.al.	2404.17498	null
2024-04-25	OmniSearchSage: Multi-Task Multi-Entity Embeddings for Pinterest Search	Prabhat Agarwal et.al.	2404.16260	link
2024-04-24	FairDeDup: Detecting and Mitigating Vision-Language Fairness Disparities in Semantic Dataset Deduplication	Eric Slyman et.al.	2404.16123	null
2024-04-23	Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval	Young Kyun Jang et.al.	2404.15516	null
2024-04-23	GeoLLM-Engine: A Realistic Environment for Building Geospatial Copilots	Simranjit Singh et.al.	2404.15500	null
2024-04-12	FLoRA: Enhancing Vision-Language Models with Parameter-Efficient Federated Learning	Duy Phuong Nguyen et.al.	2404.15182	null
2024-04-21	Lost in Space: Probing Fine-grained Spatial Understanding in Vision and Language Resamplers	Georgios Pantazopoulos et.al.	2404.13594	link
2024-04-19	Data Alignment for Zero-Shot Concept Generation in Dermatology AI	Soham Gadgil et.al.	2404.13043	null
2024-04-19	MM-PhyRLHF: Reinforcement Learning Framework for Multimodal Physics Question-Answering	Avinash Anand et.al.	2404.12926	null
2024-04-19	The Solution for the CVPR2024 NICE Image Captioning Challenge	Longfei Huang et.al.	2404.12739	null
2024-04-16	LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation?	Yuchi Wang et.al.	2404.10763	link
2024-04-15	ANCHOR: LLM-driven News Subject Conditioning for Text-to-Image Synthesis	Aashish Anantha Ramakrishnan et.al.	2404.10141	link
2024-04-15	Bridging Vision and Language Spaces with Assignment Prediction	Jungin Park et.al.	2404.09632	link
2024-04-13	On Speculative Decoding for Multimodal Large Language Models	Mukul Gagrani et.al.	2404.08856	null
2024-04-12	Enhancing Visual Question Answering through Question-Driven Image Captions as Prompts	Övgü Özdemir et.al.	2404.08589	link
2024-04-11	View Selection for 3D Captioning via Diffusion Ranking	Tiange Luo et.al.	2404.07984	null
2024-04-11	Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models	Haotian Zhang et.al.	2404.07973	null
2024-04-09	Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation	Luca Barsellotti et.al.	2404.06542	null
2024-04-06	Panoptic Perception: A Novel Task and Fine-grained Dataset for Universal Remote Sensing Image Interpretation	Danpei Zhao et.al.	2404.04608	null
2024-04-04	CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching	Dongzhi Jiang et.al.	2404.03653	link

(back to top)

