Hi everyone,
I’m a professional in the stone restoration trade (marble/terrazzo) and I’m building a diagnostic dataset for Computer Vision. I’ve uploaded a sample project to Hugging Face and would appreciate some technical feedback on my approach.
Link to Dataset: https://huggingface.co/datasets/RomMilk/marble-surface-damage-coco/tree/main
What’s in the Zips: I’ve provided two versions of the same data so researchers can compare:
- output_tiles.zip: 181 tiled patches (512x512) with COCO JSON annotations.
- Full Res.zip: The original high-resolution captures of the same surfaces.
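If it helps reviewers get started, here is a minimal sketch of loading the tiled set with pycocotools and counting instances per class. The annotation filename and folder layout ("output_tiles/annotations.json") are assumptions on my part; adjust the paths to whatever the archive actually unpacks to.

```python
# Minimal sketch: load the COCO annotations and count instances per damage class.
# "output_tiles/annotations.json" is an assumed path, not the guaranteed zip layout.
from pycocotools.coco import COCO

coco = COCO("output_tiles/annotations.json")

# List the damage categories present in the set
cats = coco.loadCats(coco.getCatIds())
print("Categories:", [c["name"] for c in cats])

# Count annotated instances per category
for cat in cats:
    ann_ids = coco.getAnnIds(catIds=[cat["id"]])
    print(f"{cat['name']}: {len(ann_ids)} instances")
```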
The Project Focus: This specific set features unpolished and dull surfaces. I have intentionally NOT labeled for “dullness” or “dirt.” Instead, I am focusing strictly on physical substrate damage:
- Surface Cracks & Chips
- Grout Failure / Eroded Grout
- Deep Scratches
The goal is to train a model that can “see through” the dirt and lack of shine to find permanent structural issues.
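If you want to sanity-check that policy visually, a quick way is to overlay the polygons on a tile and confirm that only cracks, chips, grout failure, and scratches are outlined, not stains or dull patches. The paths below are again assumptions; point them at the extracted output_tiles.zip.

```python
# Sketch: overlay the damage polygons on one tile for a quick visual check.
# Paths are assumed; adjust to the actual contents of output_tiles.zip.
import matplotlib.pyplot as plt
from PIL import Image
from pycocotools.coco import COCO

coco = COCO("output_tiles/annotations.json")
img_info = coco.loadImgs(coco.getImgIds()[0])[0]
tile = Image.open(f"output_tiles/{img_info['file_name']}")

plt.imshow(tile)
anns = coco.loadAnns(coco.getAnnIds(imgIds=[img_info["id"]]))
coco.showAnns(anns)  # draws the polygon masks on the current axes
plt.axis("off")
plt.show()
```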
Future Scope: This is just the start. I have a library of thousands of images of clean, polished marble, as well as different stone types (Terrazzo, Granite) and specific architectural features like stone showers and countertops.
I’m looking for your expertise on:
- Annotation Quality: Do these polygons/labels look precise enough for your training pipelines?
- The "Tile" vs. "Hi-Res" Debate: For detecting hairline cracks in stone, do you prefer working with these pre-cut 512x512 tiles, or is it better to have the full-scale image? (A re-tiling sketch follows this list.)
- Labeling: Since I have "Clean/Polished" versions of these same stone types, would adding those as a "Baseline" class significantly increase the value of this set?
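On the tile question: if you would rather work from the full-resolution captures, re-tiling them yourself is straightforward. The sketch below cuts overlapping 512x512 windows so hairline cracks are not lost at tile borders; the folder names, tile size, and overlap are illustrative choices, not part of the uploaded dataset, and the COCO polygons would still need to be re-clipped to the new grid.

```python
# Illustrative sketch: cut overlapping 512x512 tiles from the full-resolution captures.
# "full_res" / "my_tiles" and the 128 px overlap are assumptions, not the dataset layout.
from pathlib import Path
from PIL import Image

TILE, STRIDE = 512, 384  # stride < tile size gives a 128 px overlap

src_dir = Path("full_res")
out_dir = Path("my_tiles")
out_dir.mkdir(exist_ok=True)

for img_path in src_dir.glob("*.jpg"):
    img = Image.open(img_path)
    w, h = img.size
    for top in range(0, max(h - TILE, 0) + 1, STRIDE):
        for left in range(0, max(w - TILE, 0) + 1, STRIDE):
            tile = img.crop((left, top, left + TILE, top + TILE))
            tile.save(out_dir / f"{img_path.stem}_y{top}_x{left}.jpg")
```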
Thanks for any insights you can share!