Towards Generalizable Few-Shot Object Detection via Enhanced Representation Learning

dc.contributor.author: Zhang, Yan
dc.contributor.supervisor: Laganière, Robert
dc.date.accessioned: 2026-01-06T13:59:59Z
dc.date.available: 2026-01-06T13:59:59Z
dc.date.issued: 2026-01-06
dc.description.abstract: Few-shot object detection (FSOD), which aims to detect novel categories with minimal training examples, faces significant challenges in learning robust feature representations due to severe data scarcity. Additionally, FSOD models often struggle to distinguish objects from visually ambiguous backgrounds, restricting their generalization capability. We propose a novel FSOD framework designed to address these challenges through two key innovations. First, we introduce Wavelet-Semantic Fusion Attention (WSFA), which enhances semantic ViT features by incorporating frequency-domain information via discrete wavelet transform, providing complementary edge and texture cues through cross-modal attention. Second, we propose the Learnable Background Prototype (LBP) that explicitly models the background patterns, significantly improving foreground-background discrimination. These contributions are then integrated into a unified single-stage transformer-based detection framework with inter-class contrastive learning. Comprehensive experiments on standard FSOD benchmarks (PASCAL VOC and MS COCO) demonstrate that our method achieves stable improvements over strong baseline methods and outperforms existing state-of-the-art approaches. This work provides a practical solution for scenarios with limited annotated data, enhancing the applicability of object detection in real-world applications.
dc.identifier.uri: http://hdl.handle.net/10393/51223
dc.identifier.uri: https://doi.org/10.20381/ruor-31646
dc.language.iso: en
dc.publisher: Université d'Ottawa | University of Ottawa
dc.rights: Attribution-ShareAlike 4.0 International (en)
dc.rights.uri: http://creativecommons.org/licenses/by-sa/4.0/
dc.subject: Computer Vision
dc.subject: Object Detection
dc.subject: Few-Shot Learning
dc.title: Towards Generalizable Few-Shot Object Detection via Enhanced Representation Learning
dc.type: Thesis (en)
thesis.degree.discipline: Génie / Engineering
thesis.degree.level: Masters
thesis.degree.name: MCS
uottawa.department: Science informatique et génie électrique / Electrical Engineering and Computer Science
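
The abstract mentions frequency-domain edge and texture cues obtained via a discrete wavelet transform. As a rough illustration of what such cues look like, the sketch below computes a single-level 2D Haar transform in plain Python. It is not the thesis's WSFA module: the abstract does not name the wavelet family, so Haar is an assumption, and the function name `haar_dwt2` and the subband names are hypothetical.

```python
def haar_dwt2(image):
    """Single-level 2D orthonormal Haar transform (illustrative sketch).

    `image` is a list of equal-length rows with even height and width.
    Returns four subbands, each half the input size in both dimensions:
    (approx, row_detail, col_detail, diag_detail).
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must be even")

    approx, row_detail, col_detail, diag_detail = [], [], [], []
    for r in range(0, rows, 2):
        a_row, rd_row, cd_row, dd_row = [], [], [], []
        for c in range(0, cols, 2):
            # The four pixels of one non-overlapping 2x2 block.
            p00, p01 = image[r][c], image[r][c + 1]
            p10, p11 = image[r + 1][c], image[r + 1][c + 1]
            # Low-pass in both directions: a coarse, smoothed copy.
            a_row.append((p00 + p01 + p10 + p11) / 2)
            # Top row minus bottom row: responds to horizontal edges.
            rd_row.append((p00 + p01 - p10 - p11) / 2)
            # Left column minus right column: responds to vertical edges.
            cd_row.append((p00 - p01 + p10 - p11) / 2)
            # Diagonal difference: responds to corners and fine texture.
            dd_row.append((p00 - p01 - p10 + p11) / 2)
        approx.append(a_row)
        row_detail.append(rd_row)
        col_detail.append(cd_row)
        diag_detail.append(dd_row)
    return approx, row_detail, col_detail, diag_detail


# Toy input: a dark first column next to a bright region (a vertical edge).
toy_image = [[0, 1, 1, 1] for _ in range(4)]
approx, row_detail, col_detail, diag_detail = haar_dwt2(toy_image)
```

On this toy input only the column-difference subband is non-zero, and only in the blocks that contain the edge. High-frequency subbands of this kind are one plausible source of the "edge and texture cues" described in the abstract; how the thesis fuses them with ViT features through cross-modal attention is not specified in this record.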

Files

Original bundle

Name: Zhang_Yan_2026_thesis.pdf
Size: 42.67 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.65 KB
Format: Item-specific license agreed upon to submission