Efficient Text-Driven 3D Scene Editing Based on Gaussian Splatting
Publisher
Université d'Ottawa | University of Ottawa
Abstract
Text-guided 3D scene editing has advanced rapidly with diffusion models and neural rendering. While Neural Radiance Fields (NeRF) excel at 3D reconstruction and novel view synthesis, they face key challenges: computational inefficiency, multi-view inconsistency, and poor handling of motion blur. To overcome these limitations, we introduce a 3D Gaussian Splatting (3DGS)-based framework for efficient and consistent 3D editing; 3DGS renders in real time and offers a more practical alternative to NeRF. We enhance its performance with two key components: (1) a Complementary Information Mutual Learning Network (CIMLN) that refines 3DGS-derived depth maps, enabling depth-aware image editing; and (2) a Wavelet Consensus Attention mechanism that aligns latent codes across views during diffusion denoising, ensuring consistent multi-view outputs. We also address motion-blurred scene reconstruction by applying low-pass filtering and jointly optimizing Gaussian parameters with camera trajectories. Experiments show that our method achieves photorealistic, structurally consistent results while maintaining real-time performance and significantly reducing computational overhead compared to prior methods.
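The abstract names the Wavelet Consensus Attention mechanism but does not spell out its construction. As a rough illustration of the general idea it describes (aligning diffusion latents across views in the wavelet domain), the following is a minimal PyTorch sketch. Everything in it is an assumption made for illustration, not the thesis's method: the one-level Haar transform, the one-token-per-view attention, and the names `haar_dwt`, `haar_idwt`, and `wavelet_consensus` are all invented here.

```python
import torch

def haar_dwt(x):
    # One-level 2D Haar transform of latents x with shape (V, C, H, W),
    # H and W even. Returns four subbands, each of shape (V, C, H/2, W/2).
    a = x[..., 0::2, 0::2]
    b = x[..., 0::2, 1::2]
    c = x[..., 1::2, 0::2]
    d = x[..., 1::2, 1::2]
    ll = (a + b + c + d) / 2
    lh = (a + b - c - d) / 2
    hl = (a - b + c - d) / 2
    hh = (a - b - c + d) / 2
    return ll, lh, hl, hh

def haar_idwt(ll, lh, hl, hh):
    # Exact inverse of haar_dwt.
    a = (ll + lh + hl + hh) / 2
    b = (ll + lh - hl - hh) / 2
    c = (ll - lh + hl - hh) / 2
    d = (ll - lh - hl + hh) / 2
    v, ch, h, w = ll.shape
    x = ll.new_empty(v, ch, 2 * h, 2 * w)
    x[..., 0::2, 0::2] = a
    x[..., 0::2, 1::2] = b
    x[..., 1::2, 0::2] = c
    x[..., 1::2, 1::2] = d
    return x

def wavelet_consensus(latents, temperature=1.0):
    # latents: (V, C, H, W) diffusion latents, one per camera view.
    # Attend across views in the low-frequency (LL) band only, so coarse
    # scene structure is shared while per-view detail is preserved.
    ll, lh, hl, hh = haar_dwt(latents)
    tokens = ll.flatten(1)                     # (V, C*H/2*W/2), one token per view
    sim = tokens @ tokens.T / (temperature * tokens.shape[1] ** 0.5)
    attn = sim.softmax(dim=-1)                 # (V, V) view-to-view weights
    ll_aligned = (attn @ tokens).view_as(ll)   # consensus LL band
    return haar_idwt(ll_aligned, lh, hl, hh)

# Example: apply once per denoising step, before the UNet forward pass.
views = torch.randn(4, 4, 64, 64)   # 4 views, Stable-Diffusion-style latents
aligned = wavelet_consensus(views)
assert aligned.shape == views.shape
```

Under these assumptions, each view's coarse structure is pulled toward a consensus over all views at every denoising step, while per-view high-frequency detail passes through untouched; that split is one plausible way to enforce multi-view consistency without washing out texture.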
Keywords
3D Gaussian Splatting, Diffusion, 3D Editing
