Crowning Achievement: NVIDIA Research Model Enables Fast, Efficient Dynamic Scene Reconstruction

Content streaming and engagement are entering a new dimension with QUEEN, an AI model by NVIDIA Research and the University of Maryland that makes it possible to stream free-viewpoint video, which lets viewers experience a 3D scene from any angle.

QUEEN could be used to build immersive streaming applications that teach skills like cooking, put sports fans on the field to watch their favorite teams play from any angle, or bring an extra level of depth to video conferencing in the workplace. It could also be used in industrial environments to help teleoperate robots in a warehouse or a manufacturing plant.

The model will be presented at NeurIPS, the annual conference for AI research that begins Tuesday, Dec. 10, in Vancouver.

“To stream free-viewpoint videos in near real time, we must simultaneously reconstruct and compress the 3D scene,” said Shalini De Mello, director of research and a distinguished research scientist at NVIDIA. “QUEEN balances factors including compression rate, visual quality, encoding time and rendering time to create an optimized pipeline that sets a new standard for visual quality and streamability.”

Reduce, Reuse and Recycle for Efficient Streaming

Free-viewpoint videos are typically created using video footage captured from different camera angles, like a multicamera film studio setup, a set of security cameras in a warehouse or a system of videoconferencing cameras in an office.

Prior AI methods for generating free-viewpoint videos either took too much memory for livestreaming or sacrificed visual quality for smaller file sizes. QUEEN balances both to deliver high-quality visuals — even in dynamic scenes featuring sparks, flames or furry animals — that can be easily transmitted from a host server to a client’s device. It also renders visuals faster than previous methods, supporting streaming use cases.

In most real-world environments, many elements of a scene stay static. In a video, that means a large share of pixels don’t change from one frame to another. To save computation time, QUEEN tracks and reuses renders of these static regions — focusing instead on reconstructing the content that changes over time.
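The reuse idea can be illustrated with a deliberately simplified 2D analogue. The sketch below is not QUEEN's algorithm, which the article does not detail. It only shows the general pattern of detecting what changed between two frames and recomputing just that part. The function names, the grayscale frame format and the threshold value are all assumptions made for illustration.

```python
# Illustrative only: a per-pixel analogue of "reuse what is static".
# This is NOT QUEEN's method; every name and the threshold are hypothetical.

Frame = list[list[int]]  # grayscale intensities, 0-255


def changed_mask(prev: Frame, curr: Frame, threshold: int = 8) -> list[list[bool]]:
    """Mark pixels whose intensity moved by more than `threshold`."""
    return [
        [abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def update_render(cached: Frame, curr: Frame, mask: list[list[bool]]) -> tuple[Frame, int]:
    """Reuse cached pixels where the mask is False and recompute elsewhere.

    "Recompute" is stood in for by copying the new pixel value.
    Returns the updated render and the number of recomputed pixels.
    """
    out = [row[:] for row in cached]
    recomputed = 0
    for y, mask_row in enumerate(mask):
        for x, is_dynamic in enumerate(mask_row):
            if is_dynamic:
                out[y][x] = curr[y][x]
                recomputed += 1
    return out, recomputed


prev_frame = [[10, 10, 10], [10, 10, 10]]
curr_frame = [[10, 12, 200], [10, 10, 90]]

mask = changed_mask(prev_frame, curr_frame)
render, n = update_render(prev_frame, curr_frame, mask)
print(n, "of 6 pixels recomputed")
```

In this toy case only the two pixels that changed substantially are touched, and the small flicker from 10 to 12 is treated as static. The threshold trades a little accuracy for less work, which is the same kind of balance the article describes.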

Using an NVIDIA Tensor Core GPU, the researchers evaluated QUEEN’s performance on several benchmarks and found that the model outperformed state-of-the-art methods for online free-viewpoint video on a range of metrics. Given 2D videos of the same scene captured from different angles, it typically needs under five seconds of training time and renders the resulting free-viewpoint video at around 350 frames per second.
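To put the reported rendering rate in perspective, the short calculation below converts frames per second into time per frame. The 350 frames per second figure comes from the article. The 30 frames per second playback rate is an assumption chosen only for comparison.

```python
def frame_time_ms(fps: float) -> float:
    """Time available (or spent) per frame, in milliseconds."""
    return 1000.0 / fps


render_ms = frame_time_ms(350)  # rendering rate reported in the article
display_ms = frame_time_ms(30)  # assumed playback rate, not from the article
headroom = display_ms / render_ms

print(f"{render_ms:.2f} ms per rendered frame")
print(f"{headroom:.1f}x faster than a 30 fps display requires")
```

Under that assumption, each frame renders in roughly 2.86 milliseconds, leaving most of a 33-millisecond display interval free for decoding and network transfer.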

This combination of speed and visual quality can support media broadcasts of concerts and sports games by offering immersive virtual reality experiences or instant replays of key moments in a competition.

In warehouse settings, robot operators could use QUEEN to better gauge depth when maneuvering physical objects. And in a videoconferencing application — such as the 3D videoconferencing demo shown at SIGGRAPH and NVIDIA GTC — it could help presenters demonstrate tasks like cooking or origami while letting viewers pick the visual angle that best supports their learning.

The code for QUEEN will soon be released as open source and shared on the project page.

QUEEN is one of over 50 NVIDIA-authored NeurIPS posters and papers that feature groundbreaking AI research with potential applications in fields including simulation, robotics and healthcare.

Generative Adversarial Nets, the paper that first introduced GAN models, won the NeurIPS 2024 Test of Time Award. Cited more than 85,000 times, the paper was coauthored by Bing Xu, distinguished engineer at NVIDIA. Hear more from its lead author, Ian Goodfellow, research scientist at DeepMind, on the AI Podcast.

Learn more about NVIDIA Research at NeurIPS.

See the latest work from NVIDIA Research, which has hundreds of scientists and engineers worldwide, with teams focused on topics including AI, computer graphics, computer vision, self-driving cars and robotics.

Academic researchers working on large language models, simulation and modeling, edge AI and more can apply to the NVIDIA Academic Grant Program.

