CVPR 2026 Highlight Rare Disease Facial Analysis

RDFace: A Benchmark Dataset for Rare Disease Facial Image Analysis under Extreme Data Scarcity and Phenotype-Aware Synthetic Generation

Official repository for the CVPR 2026 accepted paper. RDFace provides a modular framework for rare disease facial image analysis, supporting data preprocessing, supervised classification, few-shot learning, synthetic image generation, and LLM-based report generation.

Ganlin Feng1, Yuxi Long1, Hafsa Ali2, Erin Lou3, Fahad Butt1, Qian Liu4, Yang Wang2, Pingzhao Hu1,3,*
1Western University, 2Concordia University, 3University of Toronto, 4University of Winnipeg
*Correspondence: phu49@uwo.ca

Overview

A modular benchmark for rare disease facial image analysis.

This repository provides a modular framework for facial image analysis in rare disease diagnosis, built around the RDFace dataset. It covers five components: data preprocessing, supervised classification, few-shot learning, synthetic image generation, and LLM-based report generation.

01

Data preprocessing

Utilities for preparing pediatric facial images and analyzing synthetic images for downstream benchmarking.

02

Classification

Supports conventional supervised learning and few-shot classification for rare disease facial phenotype recognition.

03

Synthetic generation

Modules for generating identity- and phenotype-consistent synthetic images in low-data settings.
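To make the few-shot setting mentioned under Classification concrete, here is a minimal sketch of N-way K-shot episode sampling followed by nearest-prototype classification. It is not taken from the RDFace codebase: the function names (`sample_episode`, `class_prototypes`, `classify`), the placeholder labels, and the toy 2-D feature vectors are all assumptions for illustration. In practice the features would be embeddings from whatever backbone is being evaluated.

```python
import random
from collections import defaultdict


def sample_episode(labels, n_way, k_shot, n_query, rng):
    """Sample one N-way K-shot episode.

    Returns (support, query): two lists of indices into `labels`.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    # Only classes with enough images for both support and query are eligible.
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= k_shot + n_query]
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot + n_query samples")
    support, query = [], []
    for c in rng.sample(eligible, n_way):
        picked = rng.sample(by_class[c], k_shot + n_query)
        support.extend(picked[:k_shot])
        query.extend(picked[k_shot:])
    return support, query


def class_prototypes(features, labels, support):
    """Average the support feature vectors of each class."""
    grouped = defaultdict(list)
    for i in support:
        grouped[labels[i]].append(features[i])
    return {
        c: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for c, vecs in grouped.items()
    }


def classify(vec, protos):
    """Return the class whose prototype is closest in squared Euclidean distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(protos, key=lambda c: sq_dist(vec, protos[c]))


# Toy data: three hypothetical classes with well-separated 2-D "embeddings".
labels = ["class_a"] * 4 + ["class_b"] * 4 + ["class_c"] * 4
features = (
    [[0, 0], [0, 1], [1, 0], [1, 1]]
    + [[5, 5], [5, 6], [6, 5], [6, 6]]
    + [[10, 0], [10, 1], [11, 0], [11, 1]]
)

support, query = sample_episode(labels, n_way=2, k_shot=2, n_query=2,
                                rng=random.Random(0))
protos = class_prototypes(features, labels, support)
correct = sum(classify(features[i], protos) == labels[i] for i in query)
accuracy = correct / len(query)
print(f"episode accuracy: {accuracy:.2f}")
```

Reported few-shot accuracy is usually averaged over many such episodes. Passing a seeded `random.Random` keeps the sampled episodes reproducible across runs.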

Dataset Access

Hybrid release model for synthetic and real facial images.

RDFace includes both real pediatric facial images and synthetic data. Due to privacy considerations, the dataset is released under a hybrid access model.

Open Access

RDFace-Syn

The synthetic dataset is freely available for research and benchmarking.

RDFace-Syn on Kaggle
  • Freely available for research use
  • Designed for benchmarking under data scarcity
  • Useful for synthetic-only and synthetic-assisted experiments

Controlled Access

RDFace-Real

The real dataset contains identifiable facial images and is distributed under controlled access.

Request RDFace-Real
  • Approved users will receive a Research ID
  • Access is granted only for approved research purposes
  • Ethics approval, such as REB or IRB approval, is required

Usage Policy: Do not share or redistribute the data. Do not attempt re-identification. The dataset must be acknowledged in research outputs. Any use without a valid Research ID is considered unauthorized.

Project Video

Watch the RDFace overview video.

This video introduces the motivation, dataset construction, benchmark design, and synthetic data generation components of RDFace.

Reference

Cite RDFace.

Please cite our paper if you use RDFace, its synthetic data, benchmark splits, or related code.

@inproceedings{rdface2026,
  title     = {RDFace: A Benchmark Dataset for Rare Disease Facial Image Analysis under Extreme Data Scarcity and Phenotype-Aware Synthetic Generation},
  author    = {Ganlin Feng and Yuxi Long and Hafsa Ali and Erin Lou and Fahad Butt and Qian Liu and Yang Wang and Pingzhao Hu},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2026}
}

Acknowledgement

Funding, ethics, and data sources.

This work was supported in part by the Canada Research Chairs Tier II Program (CRC-2021-00482) and the Canada Foundation for Innovation John R. Evans Leaders Fund (JELF #43481). All data collection procedures were approved by the Western University Health Science Research Ethics Board (HSREB; Reference No. 2023-122744-77394). Facial photographs of children with rare diseases were collected from publicly available sources, including the published literature and foundation websites, and we gratefully acknowledge these sources.