10–14 Jun 2025
University of Stavanger
Europe/Oslo timezone

Dataset Optimisation for Enhanced Automated Sewer Damage Detection

Not scheduled
20m

Oral presentation

Speaker

Kamil ALTINAY (University of Birmingham)

Description

Starting from the hypothesis that “less is more”, this research proposes an automated sewer damage detection method that uses automated data filtering to reduce the quantity and maximise the utility of existing datasets. The study uses the Sewer-ML dataset and proposes a data optimisation strategy that reduces it to a high-quality, relevant subset by applying CLIP-IQA (Contrastive Language-Image Pre-training Image Quality Assessment) to filter images by quality, together with prediction reliability and label checks. A damage detection model based on Residual Networks (ResNets) is trained on the refined dataset, achieving a significant increase in performance. The Sewer-ML dataset contains more than 1.3 million images; however, many of these are low-quality or misrepresented, which can negatively impact training outcomes. Our results demonstrate that using less than 2% of the dataset, carefully selected through automated filtering, can yield performance comparable to models trained on the full dataset, validating the effectiveness of the proposed method.
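
The three filtering criteria named in the description (image quality, prediction reliability, label agreement) could be combined roughly as in the sketch below. This is an illustration only, not the authors' implementation: the `Sample` record, the `keep` and `filter_dataset` helpers, the threshold values, and the rule that all three checks must pass are assumptions made for the example. The quality score is taken as a precomputed number in [0, 1], such as a CLIP-IQA-style model might produce.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical record for one sewer inspection image."""
    image_id: str
    quality: float    # assumed precomputed quality score in [0, 1]
    label: int        # dataset annotation: 1 = damage, 0 = no damage
    pred_prob: float  # assumed model probability of "damage"


def keep(sample, min_quality=0.6, min_confidence=0.9):
    """Return True if the sample passes all three checks.

    Thresholds are illustrative assumptions, not values from the study.
    """
    # Prediction reliability: confidence in whichever class is predicted.
    confidence = max(sample.pred_prob, 1.0 - sample.pred_prob)
    # Label check: the predicted class must agree with the annotation.
    predicted = int(sample.pred_prob >= 0.5)
    return (
        sample.quality >= min_quality
        and confidence >= min_confidence
        and predicted == sample.label
    )


def filter_dataset(samples, **thresholds):
    """Keep only the samples that pass the quality, reliability and label checks."""
    return [s for s in samples if keep(s, **thresholds)]


# Made-up samples covering each rejection reason.
SAMPLES = [
    Sample("a", quality=0.90, label=1, pred_prob=0.97),  # passes all checks
    Sample("b", quality=0.30, label=1, pred_prob=0.99),  # low image quality
    Sample("c", quality=0.80, label=0, pred_prob=0.55),  # unreliable prediction
    Sample("d", quality=0.85, label=0, pred_prob=0.95),  # prediction contradicts label
    Sample("e", quality=0.70, label=0, pred_prob=0.02),  # passes all checks
]

if __name__ == "__main__":
    kept = filter_dataset(SAMPLES)
    print([s.image_id for s in kept])
```

In this toy run only samples `a` and `e` survive. Raising either threshold shrinks the retained subset further, which is the lever a filtering strategy of this kind would use to trade dataset size against data quality.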

Primary authors

Dr Jelena Ninic (University of Birmingham)
Kamil ALTINAY (University of Birmingham)
