Image and Video Synthesis and Generation - a VCLab-HKPU Collection

updated Apr 22

This collection features VCLab's work on accelerating, distilling, and improving image and video synthesis and generation models.

Diversity-Preserved Distribution Matching Distillation for Fast Visual Synthesis

Paper • 2602.03139 • Published Feb 3

Note [arXiv 2026] Diversity-preserved DMD for fast synthesis. | Code: https://github.com/Multimedia-Analytics-Laboratory/dpdmd

CoCoEdit: Content-Consistent Image Editing via Region Regularized Reinforcement Learning

Paper • 2602.14068 • Published Feb 15

Note [arXiv 2026] Content-consistent editing via region-regularized RL. | Code: https://github.com/langmanbusi/CoCoEdit

DNAEdit: Direct Noise Alignment for Text-Guided Rectified Flow Editing

Paper • 2506.01430 • Published Jun 2, 2025

Note [NeurIPS 2025 Spotlight] Direct noise alignment for rectified flow editing. | Code: https://xiechenxi99.github.io/DNAEdit/

GPSToken: Gaussian Parameterized Spatially-adaptive Tokenization for Image Representation and Generation

Paper • 2509.01109 • Published Sep 1, 2025

Note [NeurIPS 2025] Gaussian-parameterized spatial tokens. | Code: https://github.com/xtudbxk/GPSToken

InsViE-1M: Effective Instruction-based Video Editing with Elaborate Dataset Construction

Paper • 2503.20287 • Published Mar 26, 2025

Note [ICCV 2025] 1M-scale instruction-based video editing. | Code: https://github.com/langmanbusi/InsViE

Exact Feature Distribution Matching for Arbitrary Style Transfer and Domain Generalization

Paper • 2203.07740 • Published Mar 15, 2022

Note [ECCV 2022 Oral] Exact feature distribution matching. | Code: https://github.com/YBZh/EFDM