SegviGen: Repurposing 3D Generative Model for Part Segmentation
SegviGen repurposes 3D generative models for part segmentation, achieving a 40% improvement in interactive segmentation while using only 0.32% of the labeled training data.
Key Findings
Methodology
SegviGen is a novel framework that leverages pretrained 3D generative models for 3D part segmentation. It reformulates the segmentation task as a colorization problem, using the structural and textural priors of generative models to predict part-indicative colors on active voxels of geometry-aligned reconstructions. The framework supports interactive segmentation, full segmentation, and full segmentation with 2D guidance, unifying multiple task settings.
Key Results
- In interactive part segmentation, SegviGen achieved a 40% improvement in IoU@1 on the PartObjaverse-Tiny dataset and 31% on the PartNeXT dataset while using only 0.32% of the labeled training data, significantly outperforming Point-SAM and P3-SAM.
- In full segmentation, SegviGen excelled on the PartNeXT dataset, reaching 55.40% IoU and improving further to 71.53% with 2D guidance, showcasing its advantage in combining 2D semantic cues with 3D geometric consistency.
- Ablation studies revealed that explicit coordinate encoding performed better in multiple interactions, especially in providing finer spatial differentiation for complex geometric details.
Significance
SegviGen significantly reduces the reliance on large-scale annotated data by transferring the prior knowledge of 3D generative models to 3D part segmentation, enhancing segmentation accuracy and efficiency. This method holds significant implications for academia and industry, particularly in industrial applications requiring fine segmentation, such as 3D printing and animation rigging.
Technical Contribution
Technical contributions include: 1) Reformulating 3D segmentation as a colorization problem, leveraging generative model priors for efficient segmentation; 2) Proposing a unified multi-task framework supporting various segmentation tasks; 3) Demonstrating the effectiveness of generative priors under limited supervision, significantly enhancing segmentation performance.
Novelty
SegviGen is the first method to use 3D generative model priors for part segmentation, redefining the segmentation problem as a colorization task. This differs from both traditional 2D-to-3D lifting methods and native 3D discriminative methods, offering an efficient, annotation-light alternative.
Limitations
- When dealing with very complex geometries, there may be inaccuracies in segmentation, especially when lacking sufficient user interaction guidance.
- While generally performing well, further optimization may be needed in specific industrial applications to meet particular precision requirements.
- For certain specific 3D models, additional preprocessing steps may be required to ensure the effective application of generative model priors.
Future Work
Future research directions include: 1) Extending SegviGen to support more types of 3D models and application scenarios; 2) Optimizing user interaction mechanisms to improve segmentation accuracy and efficiency; 3) Exploring the integration of more multimodal data (such as voice or text) to enhance segmentation performance.
AI Executive Summary
3D part segmentation is a core technology for 3D content creation and spatial intelligence, yet existing methods often fall short in segmentation quality, producing erroneous regions and imprecise boundaries that limit their practical usability. Traditional methods either rely on 2D-to-3D lifting or require large-scale 3D annotated data, which often perform poorly when handling complex geometries.
SegviGen introduces a novel framework that leverages the prior knowledge of 3D generative models for part segmentation, significantly reducing the need for annotated data. Specifically, SegviGen reformulates the 3D segmentation task as a colorization problem, using generative model priors to predict part-indicative colors on active voxels of geometry-aligned reconstructions. The framework supports interactive segmentation, full segmentation, and full segmentation with 2D guidance, unifying multiple task settings.
In experiments, SegviGen excelled in interactive part segmentation, achieving a 40% improvement in IoU@1 on the PartObjaverse-Tiny dataset and 31% on the PartNeXT dataset while using only 0.32% of the labeled training data, significantly outperforming Point-SAM and P3-SAM. In full segmentation, SegviGen also excelled on the PartNeXT dataset, reaching 55.40% IoU and improving further to 71.53% with 2D guidance, showcasing its advantage in combining 2D semantic cues with 3D geometric consistency.
The significance of this research lies in its ability to enhance segmentation accuracy and efficiency while reducing the reliance on large-scale annotated data, providing a new approach for 3D part segmentation by leveraging generative model priors. This method holds significant implications for academia and industry, particularly in industrial applications requiring fine segmentation, such as 3D printing and animation rigging.
However, SegviGen may face inaccuracies when dealing with very complex geometries, especially when lacking sufficient user interaction guidance. Future research directions include extending SegviGen to support more types of 3D models and application scenarios, and optimizing user interaction mechanisms to improve segmentation accuracy and efficiency.
Deep Analysis
Background
3D part segmentation is a crucial research area in computer vision and computer graphics, aiming to decompose 3D models into semantically meaningful parts. The field's evolution can be traced back to early rule-based methods that relied on handcrafted features and heuristics. With the rise of deep learning, neural network-based methods have become mainstream, typically requiring large-scale annotated data for training, such as ShapeNet and PartNet datasets. However, these methods often perform poorly when handling complex geometries, especially when lacking sufficient annotated data. Recently, researchers have begun exploring the use of generative model priors for 3D segmentation, offering new opportunities for the field.
Core Problem
Existing 3D part segmentation methods face two main challenges: reliance on large-scale annotated data, which is costly and difficult to obtain in some application scenarios, and poor segmentation quality, especially when handling complex geometries, often resulting in erroneous regions and imprecise boundaries. These challenges limit the widespread use of 3D segmentation technology in practical applications. Therefore, how to improve segmentation quality while reducing the need for annotated data is the core problem to be addressed in this field.
Innovation
The core innovations of SegviGen include:
1) Reformulating the 3D segmentation task as a colorization problem, leveraging generative model priors for efficient segmentation. This innovation reduces the reliance on large-scale annotated data, improving segmentation accuracy and efficiency.
2) Proposing a unified multi-task framework that supports interactive segmentation, full segmentation, and full segmentation with 2D guidance, adapting to various task settings.
3) Demonstrating the effectiveness of generative priors under limited supervision, significantly enhancing segmentation performance, especially when handling complex geometries.
Methodology
SegviGen's methodology includes the following key steps:
- Pretrained 3D Generative Model: Train generative models on large-scale unannotated 3D textured assets to internalize rich part-level structure and texture patterns.
- Colorization Task Formulation: Reformulate the 3D segmentation task as a colorization problem, using generative model priors to predict part-indicative colors on active voxels of geometry-aligned reconstructions.
- Multi-task Framework: Support interactive segmentation, full segmentation, and full segmentation with 2D guidance, unifying multiple task settings.
- Condition Injection: Enhance the model's segmentation capability through user interaction or 2D segmentation map guidance.
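The colorization formulation above can be illustrated with a toy sketch: the generative model predicts an RGB color for each active voxel, and discrete part labels are recovered by snapping each predicted color to the nearest entry in a part-color palette. This is an illustrative reading of the idea, not the paper's implementation; all names and palette values are hypothetical.

```python
# Hypothetical sketch: recover part labels from predicted voxel colors
# by nearest-neighbor matching against a fixed part-color palette.

def nearest_palette_label(color, palette):
    """Return the index of the palette color closest to `color` (squared L2)."""
    best_label, best_dist = -1, float("inf")
    for label, ref in enumerate(palette):
        dist = sum((c - r) ** 2 for c, r in zip(color, ref))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def colors_to_parts(voxel_colors, palette):
    """Map per-voxel predicted colors to discrete part labels."""
    return [nearest_palette_label(c, palette) for c in voxel_colors]

# Toy example: three parts, each tagged with a distinct palette color.
palette = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
predicted = [(0.9, 0.1, 0.0), (0.1, 0.8, 0.2), (0.0, 0.1, 0.95)]
print(colors_to_parts(predicted, palette))  # → [0, 1, 2]
```

Because the palette colors are well separated, noisy color predictions still snap to the correct part, which is one way a colorization objective can stand in for a classification objective.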
Experiments
The experimental design includes:
- Datasets: Use the PartObjaverse-Tiny and PartNeXT datasets for evaluation.
- Baselines: Compare with existing methods such as Point-SAM and P3-SAM.
- Evaluation Metrics: Use IoU metrics to evaluate segmentation performance, with a particular focus on IoU@1 in interactive segmentation.
- Hyperparameters: Adopt the AdamW optimizer with a learning rate of 1e-4; training is conducted on 8 NVIDIA A800 GPUs.
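The AdamW setup noted above can be illustrated with a minimal, dependency-free sketch of a single AdamW update for one scalar parameter. The actual training uses a full deep-learning framework; this only shows the optimizer arithmetic, with illustrative inputs (the learning rate matches the reported 1e-4, the other defaults are assumed).

```python
import math

def adamw_step(theta, grad, m, v, t,
               lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8, weight_decay=1e-2):
    """One AdamW update for a single scalar parameter.

    AdamW decouples weight decay from the gradient-based update,
    applying it directly to the parameter instead of folding it
    into the gradient.
    """
    m = beta1 * m + (1 - beta1) * grad        # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias-corrected moments
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * (m_hat / (math.sqrt(v_hat) + eps) + weight_decay * theta)
    return theta, m, v

# One step from theta = 1.0 with a positive gradient.
theta, m, v = adamw_step(theta=1.0, grad=0.5, m=0.0, v=0.0, t=1)
print(theta < 1.0)  # → True: gradient and decay both shrink the parameter
```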
Results
Experimental results show:
- In interactive part segmentation, SegviGen achieved a 40% improvement in IoU@1 on the PartObjaverse-Tiny dataset and 31% on the PartNeXT dataset while using only 0.32% of the labeled training data, significantly outperforming Point-SAM and P3-SAM.
- In full segmentation, SegviGen excelled on the PartNeXT dataset, reaching 55.40% IoU and improving further to 71.53% with 2D guidance, showcasing its advantage in combining 2D semantic cues with 3D geometric consistency.
- Ablation studies revealed that explicit coordinate encoding performed better across multiple interactions, providing finer spatial differentiation for complex geometric details.
Applications
Application scenarios for SegviGen include:
- 3D Printing: Improve printing quality and efficiency through precise part segmentation, suitable for high-precision industrial design and manufacturing.
- Animation Rigging: Provide fine-grained part-level control for animation production, enhancing animation effects, suitable for film and game production.
- Industrial Design: Provide precise part segmentation in product design, supporting the realization of complex designs, applicable in automotive and aerospace industries.
Limitations & Outlook
Although SegviGen generally performs well, it may face inaccuracies when dealing with very complex geometries, especially when lacking sufficient user interaction guidance. Additionally, further optimization may be needed in specific industrial applications to meet particular precision requirements. Future research directions include extending SegviGen to support more types of 3D models and application scenarios, and optimizing user interaction mechanisms to improve segmentation accuracy and efficiency.
Plain Language (Accessible to non-experts)
Imagine you're in a kitchen, preparing a meal, and you need to separate various ingredients like vegetables, meats, and spices. Traditional methods are like using a big basket to mix all the ingredients together and then slowly picking them out, which is time-consuming and prone to errors. SegviGen is like a smart assistant that can automatically identify and categorize these ingredients with minimal instructions, quickly and accurately completing the task. It learns the characteristics of a large number of ingredients, such as color and shape, to help you better allocate the position of each ingredient. It's like having a super-intelligent kitchen assistant that not only helps you quickly find the ingredients you need but also adjusts according to your instructions, ensuring that every dish is perfectly presented.
ELI14 (Explained like you're 14)
Hey there! Imagine you're playing a 3D game and you need to divide the characters in the game into different parts, like the head, body, and limbs. Traditional methods are like you manually separating these parts one by one, which is both tedious and error-prone. But SegviGen is like a super-smart game assistant that can automatically help you identify and separate these parts with just a few instructions. It learns the characteristics of many characters, like color and shape, to help you better allocate the position of each part. It's like having a super-smart game assistant that not only helps you quickly find the parts you need but also adjusts according to your instructions, ensuring that each character is perfectly presented. Isn't that cool?
Glossary
3D Generative Model
A 3D generative model is a technology that generates new 3D models by learning from large amounts of 3D data, typically used to create 3D objects with complex geometry and texture.
In this paper, 3D generative models are used to provide rich structural and textural priors to support 3D part segmentation.
Part Segmentation
Part segmentation is the process of decomposing a 3D object into semantically meaningful independent parts, often used in 3D printing, animation, and industrial design.
This paper proposes a new part segmentation method that improves segmentation accuracy using generative model priors.
Interactive Segmentation
Interactive segmentation is a method that guides the segmentation process through user input, typically used in scenarios requiring fine control.
SegviGen supports interactive segmentation, allowing users to guide the segmentation process with simple clicks.
IoU (Intersection over Union)
IoU is a metric for evaluating segmentation accuracy: the ratio of the intersection to the union of the predicted and ground-truth segmentation regions.
This paper uses IoU metrics to evaluate SegviGen's segmentation performance on different datasets.
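As a concrete illustration of the formula above (a toy sketch operating on sets of element indices such as voxels or mesh faces, not the paper's evaluation code):

```python
def iou(pred, gt):
    """Intersection over Union for two collections of element indices
    (e.g., the voxels or faces assigned to a part)."""
    pred, gt = set(pred), set(gt)
    union = pred | gt
    if not union:
        return 1.0  # both empty: treat as a perfect match
    return len(pred & gt) / len(union)

# Toy example: 3 shared elements out of 5 distinct elements overall.
print(iou([1, 2, 3, 4], [2, 3, 4, 5]))  # → 0.6
```

IoU@1 in the interactive setting is simply this quantity measured after a single user click.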
Pretrained Model
A pretrained model is a model trained on large-scale data that can be used for other tasks to improve performance and efficiency.
SegviGen utilizes pretrained 3D generative models to provide rich structural and textural priors.
Colorization Task
A colorization task is a problem reformulation that expresses the segmentation problem as a color prediction problem, using color to indicate different parts.
This paper reformulates 3D segmentation as a colorization task to leverage generative model priors.
Condition Injection
Condition injection is a method that enhances model capabilities through external information (such as user input or 2D segmentation maps).
SegviGen uses condition injection to support various segmentation task settings.
Ablation Study
An ablation study is a method of evaluating the impact of certain parts of a model on overall performance by removing or modifying them.
This paper conducts ablation studies to evaluate the impact of different encoding mechanisms on segmentation performance.
PartObjaverse-Tiny
PartObjaverse-Tiny is a dataset containing 200 textured mesh objects used to evaluate 3D segmentation performance.
This paper uses the PartObjaverse-Tiny dataset to evaluate SegviGen's interactive segmentation performance.
PartNeXT
PartNeXT is a dataset containing 300 textured mesh objects used to evaluate 3D segmentation performance.
This paper uses the PartNeXT dataset to evaluate SegviGen's full segmentation performance.
Open Questions (Unanswered questions from this research)
1. How can 3D segmentation accuracy and efficiency be further improved in the absence of sufficient annotated data? Existing methods often perform poorly when handling complex geometries, and future research needs to explore more effective strategies for transferring generative model priors.
2. How can 3D segmentation performance be enhanced with the assistance of multimodal data (such as voice or text)? Current research mainly focuses on image and 3D data, and future exploration could involve integrating more modalities.
3. How can user interaction mechanisms be optimized to improve segmentation accuracy and efficiency? Existing interaction methods may not be intuitive enough in some complex scenarios, and future work needs to develop more intelligent interaction strategies.
4. In industrial applications, how can the accuracy and consistency of 3D segmentation be ensured? Current methods may require further optimization in certain specific applications to meet particular precision requirements.
5. How can SegviGen be extended to support more types of 3D models and application scenarios? Existing research mainly focuses on specific types of 3D models, and future exploration needs to cover a broader range of applications.
Applications
Immediate Applications
3D Printing
Improve printing quality and efficiency through precise part segmentation, suitable for high-precision industrial design and manufacturing.
Animation Rigging
Provide fine-grained part-level control for animation production, enhancing animation effects, suitable for film and game production.
Industrial Design
Provide precise part segmentation in product design, supporting the realization of complex designs, applicable in automotive and aerospace industries.
Long-term Vision
Smart Manufacturing
Achieve automated assembly and inspection in smart manufacturing processes through automated 3D segmentation technology, improving production efficiency.
Virtual Reality
Provide fine-grained 3D segmentation in virtual reality environments, enhancing user experience and interaction, driving the development of virtual reality technology.
Abstract
We introduce SegviGen, a framework that repurposes native 3D generative models for 3D part segmentation. Existing pipelines either lift strong 2D priors into 3D via distillation or multi-view mask aggregation, often suffering from cross-view inconsistency and blurred boundaries, or explore native 3D discriminative segmentation, which typically requires large-scale annotated 3D data and substantial training resources. In contrast, SegviGen leverages the structured priors encoded in a pretrained 3D generative model to induce segmentation through distinctive part colorization, establishing a novel and efficient framework for part segmentation. Specifically, SegviGen encodes a 3D asset and predicts part-indicative colors on active voxels of a geometry-aligned reconstruction. It supports interactive part segmentation, full segmentation, and full segmentation with 2D guidance in a unified framework. Extensive experiments show that SegviGen improves over the prior state of the art by 40% on interactive part segmentation and by 15% on full segmentation, while using only 0.32% of the labeled training data. It demonstrates that pretrained 3D generative priors transfer effectively to 3D part segmentation, enabling strong performance with limited supervision. See our project page at https://fenghora.github.io/SegviGen-Page/.
References (20)
Point-SAM: Promptable 3D Segmentation Model for Point Clouds
Yuchen Zhou, Jiayuan Gu, Tung Yen Chiang et al.
PartNeXt: A Next-Generation Dataset for Fine-Grained and Hierarchical 3D Part Understanding
Penghao Wang, Yi He, Xin Lv et al.
Native and Compact Structured Latents for 3D Generation
Jianfeng Xiang, Xiaoxue Chen, Sicheng Xu et al.
P3-SAM: Native 3D Part Segmentation
Changfeng Ma, Yang Li, Xinhao Yan et al.
PARTFIELD: Learning 3D Feature Fields for Part Segmentation and Beyond
Minghua Liu, M. Uy, Donglai Xiang et al.
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron, Hugo Touvron, Ishan Misra et al.
TELA: Text to Layer-wise 3D Clothed Human Generation
Junting Dong, Qi Fang, Zehuan Huang et al.
Segment3D: Learning Fine-Grained Class-Agnostic 3D Segmentation without Manual Labels
Rui Huang, Songyou Peng, Ayca Takmaz et al.
CraftsMan3D: High-fidelity Mesh Generation with 3D Native Generation and Interactive Geometry Refiner
Weiyu Li, Jiarui Liu, Rui Chen et al.
Part123: Part-aware 3D Reconstruction from a Single-view Image
Anran Liu, Cheng Lin, Yuan Liu et al.
SAM 2: Segment Anything in Images and Videos
Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu et al.
SAM 3: Segment Anything with Concepts
Nicolas Carion, Laura Gustafson, Yuan-Ting Hu et al.
SAMPart3D: Segment Any Part in 3D Objects
Yu-nuo Yang, Yukun Huang, Yuan-Chen Guo et al.
DeOcc-1-to-3: 3D De-Occlusion from a Single Image via Self-Supervised Multi-View Diffusion
Yansong Qu, Shaohui Dai, Xinyang Li et al.
CraftsMan: High-fidelity Mesh Generation with 3D Native Generation and Interactive Geometry Refiner
Weiyu Li, Jiarui Liu, Rui Chen et al.
Stereo-GS: Multi-View Stereo Vision Model for Generalizable 3D Gaussian Splatting Reconstruction
Xiufeng Huang, Ka Chun Cheung, Runmin Cong et al.
EpiDiff: Enhancing Multi-View Synthesis via Localized Epipolar-Constrained Diffusion
Zehuan Huang, Hao Wen, Junting Dong et al.
MeshArt: Generating Articulated Meshes with Structure-Guided Transformers
Daoyi Gao, Yawar Siddiqui, Lei Li et al.
One-2-3-45: Any Single Image to 3D Mesh in 45 Seconds without Per-Shape Optimization
Minghua Liu, Chao Xu, Haian Jin et al.
ZeroPS: High-Quality Cross-Modal Knowledge Transfer for Zero-Shot 3D Part Segmentation
Yuheng Xue, Nenglun Chen, Jun Liu et al.