Lang3D-XL: Language Embedded 3D Gaussians for Large-scale Scenes

  • Shai Krakovsky*
  • Gal Fiebelman
  • Sagie Benaim
  • Hadar Averbuch-Elor

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Embedding a language field in a 3D representation enables richer semantic understanding of spatial environments by linking geometry with descriptive meaning. This allows for a more intuitive human-computer interaction, enabling querying or editing scenes using natural language, and could potentially improve tasks like scene retrieval, navigation, and multimodal reasoning. While such capabilities could be transformative, in particular for large-scale scenes, we find that recent feature distillation approaches cannot effectively learn over massive Internet data due to challenges in semantic feature misalignment and inefficiency in memory and runtime. To this end, we propose a novel approach to address these challenges. First, we introduce extremely low-dimensional semantic bottleneck features as part of the underlying 3D Gaussian representation. These are processed by rendering and passing them through a multi-resolution, feature-based, hash encoder. This significantly improves efficiency both in runtime and GPU memory. Second, we introduce an Attenuated Downsampler module and propose several regularizations addressing the semantic misalignment of ground truth 2D features. We evaluate our method on the in-the-wild HolyScenes dataset and demonstrate that it surpasses existing approaches in both performance and efficiency.

Original language: English
Title of host publication: Proceedings - SIGGRAPH Asia 2025 Conference Papers, SA 2025
Editors: Stephen N. Spencer, Taku Komura, Michael Wimmer, Hongbo Fu
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9798400721373
DOIs
State: Published - 14 Dec 2025
Event: 2025 SIGGRAPH Asia 2025 Conference Papers, SA 2025 - Hong Kong, Hong Kong
Duration: 15 Dec 2025 to 18 Dec 2025

Publication series

Name: Proceedings - SIGGRAPH Asia 2025 Conference Papers, SA 2025

Conference

Conference: 2025 SIGGRAPH Asia 2025 Conference Papers, SA 2025
Country/Territory: Hong Kong
City: Hong Kong
Period: 15/12/25 to 18/12/25

Bibliographical note

Publisher Copyright:
© 2025 Copyright held by the owner/author(s).
