Efficient Text-Driven 3D Scene Editing Based on Gaussian Splatting

Publisher

Université d'Ottawa | University of Ottawa

Creative Commons

Attribution 4.0 International

Abstract

Text-guided 3D scene editing has advanced with diffusion models and neural rendering. While Neural Radiance Fields (NeRF) excel at 3D reconstruction and novel view synthesis, they face key challenges: computational inefficiency, multi-view inconsistency, and poor handling of motion blur. To overcome these limitations, we introduce a 3D Gaussian Splatting (3DGS)-based framework for efficient and consistent 3D editing. 3DGS offers real-time rendering and a more practical alternative to NeRF. We enhance its performance with two key components: (1) a Complementary Information Mutual Learning Network (CIMLN) for refining 3DGS-derived depth maps, enabling depth-aware image editing; and (2) a Wavelet Consensus Attention mechanism that aligns latent codes across views during diffusion denoising, ensuring consistent multi-view outputs. We also address motion-blurred scene reconstruction by applying low-pass filtering and jointly optimizing Gaussian parameters with camera trajectories. Experiments show that our method achieves photorealistic, structurally consistent results while maintaining real-time performance and significantly reducing computational overhead compared to prior methods.

Keywords

3D Gaussian Splatting, Diffusion, 3D Editing
