Hi everyone, sharing a new research paper + dataset (Apache 2.0) that may be of interest for research on stylistic conditioning and mixture-based diffusion architectures.
Paper (arXiv):
https://arxiv.org/pdf/2601.07941
Core idea
The paper introduces the Moonworks Lunara Aesthetic Dataset, designed explicitly to study style conditioning and aesthetics, rather than treating style as an emergent byproduct of large-scale pretraining. The dataset covers regional arts (e.g., Nordic, East Asian, South Asian, Middle Eastern), as well as general art styles (e.g., oil, sketch).
Lunara Aesthetic Dataset
The dataset focuses on high-quality, human-curated image–prompt pairs with explicit stylistic grounding.
Highlights:
- ~2k image–prompt pairs with consistently high aesthetic scores
- Prompts emphasize stylistic intent (art movement, regional style, medium, mood)
- Suitable for:
  - Style-conditioned diffusion
  - Evaluating disentanglement between content and aesthetics
Diffusion mixture perspective
The dataset was created with a diffusion mixture architecture that uses fewer than 10B parameters at inference.
This aligns well with recent interest in:
- Modular fine-tuning (LoRA / adapters per style)
- Better controllability without scaling model size indiscriminately
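For anyone less familiar with the per-style adapter idea, here is a minimal pure-Python sketch of a LoRA-style low-rank update on a single linear layer, with one adapter per style. This is not the architecture from the paper. The helper names (`make_adapter`, `lora_forward`), the toy dimensions, and the style keys are assumptions for illustration only.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def make_adapter(d_out, d_in, rank, rng):
    """Create one low-rank adapter: A is small random, B starts at zero."""
    A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
    B = [[0.0] * rank for _ in range(d_out)]
    return {"A": A, "B": B}

def lora_forward(W, adapter, x, alpha=1.0):
    """Compute y = W x + (alpha / rank) * B (A x) for a column vector x."""
    rank = len(adapter["A"])
    base = matmul(W, x)                                  # frozen base weights
    low = matmul(adapter["B"], matmul(adapter["A"], x))  # low-rank correction
    scale = alpha / rank
    return [[b[0] + scale * l[0]] for b, l in zip(base, low)]

rng = random.Random(0)
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # toy frozen layer, shape 2 x 3
x = [[1.0], [2.0], [3.0]]               # toy input column vector

# One adapter per style (hypothetical keys); the base weights W are shared.
adapters = {style: make_adapter(2, 3, rank=2, rng=rng) for style in ("nordic", "oil")}

# Because B starts at zero, an untrained adapter leaves the base output unchanged.
y_nordic = lora_forward(W, adapters["nordic"], x)
```

The point of the sketch is that swapping styles only swaps the small `A`/`B` pair while `W` stays frozen, which is what makes per-style adapters cheap to train and store.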
Colab: quick experimentation
Here’s a Colab notebook for loading the dataset and running basic visualizations:
https://colab.research.google.com/drive/1beodSkLWIyiaGfJIo4kkQzDPjS8lJb0S?usp=sharing
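If you just want a quick first pass outside the notebook, something like the sketch below tallies how often each style shows up in the prompts. The records and the `prompt` field name are hypothetical placeholders, so check the actual schema in the Colab before relying on them. The keywords are simply the style categories mentioned above.

```python
from collections import Counter

# Hypothetical records; the real dataset's field names and contents may differ.
pairs = [
    {"prompt": "Nordic folk art, oil painting of a fjord at dusk"},
    {"prompt": "East Asian ink sketch of a mountain village"},
    {"prompt": "oil portrait in a South Asian miniature style"},
]

# Keywords taken from the style categories named in this post.
STYLE_KEYWORDS = ["nordic", "east asian", "south asian", "middle eastern", "oil", "sketch"]

def count_styles(records, keywords=STYLE_KEYWORDS):
    """Count how many prompts mention each style keyword (case-insensitive)."""
    counts = Counter()
    for record in records:
        text = record["prompt"].lower()
        for keyword in keywords:
            if keyword in text:  # naive substring match, so "oil" also hits "foil"
                counts[keyword] += 1
    return counts

style_counts = count_styles(pairs)
```

Substring matching is crude, but it is usually enough to spot obvious imbalances between styles before doing anything more careful.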
Discussion
I’m curious how others here are thinking about mixture vs. monolithic diffusion models for style.
Would love to hear thoughts, critiques, or related work people are exploring.