Urban scene segmentation using homogeneous U-Net ensemble: a study on the cityscapes dataset

dc.contributor.authorHmyria, I. O.
dc.contributor.authorKravets, N. S.
dc.contributor.authorГмиря, І. О.
dc.contributor.authorКравець, Н. С.
dc.date.accessioned2025-12-26T08:48:29Z
dc.date.available2025-12-26T08:48:29Z
dc.date.issued2025
dc.descriptionHmyria I. O. Urban scene segmentation using homogeneous U-Net ensemble: a study on the cityscapes dataset / I. O. Hmyria, N. S. Kravets // Радіоелектроніка, інформатика, управління. – 2025. – № 3 (74). – C. 64-76.
dc.description.abstractEN: Context. Semantic segmentation plays a critical role in computer vision tasks such as autonomous driving and urban scene understanding. While designing new model architectures can be complex, improving performance through ensemble techniques applied to existing models has shown promising potential. This paper investigates ensemble learning as a strategy to enhance segmentation accuracy without modifying the underlying U-Net architecture. Objective. The aim of this work is to develop and evaluate a homogeneous ensemble of U-Net models trained with distinct initialization and data augmentation techniques, and to assess the effectiveness of various ensemble aggregation strategies in improving segmentation performance on complex urban dataset. Method. The proposed approach constructs an ensemble of five structurally identical U-Net models, each trained with unique weight initialization and augmentation schemes to ensure prediction diversity. Several ensemble strategies are examined, including softmax averaging, max voting, proportional weighting, exponential weighting, and optimized weighted voting. Evaluation is conducted on the Cityscapes dataset using a range of segmentation metrics. Results. Experimental findings demonstrate that ensemble models outperform individual U-Net instances and the baseline in terms of accuracy, mean IoU, and specificity. The optimized weighted ensemble achieved the highest accuracy (87.56%) and mean IoU (0.6504), exceeding the best individual model by approximately 3%. However, these improvements come with a notable increase in inference time, highlighting a trade-off between accuracy and computational efficiency. Conclusions. The ensemble-based approach effectively enhances segmentation accuracy while leveraging existing model architectures. Although the increased computational cost presents a limitation for real-time applications, the method is well-suited for high-precision tasks. Future research will focus on reducing inference time and extending the ensemble methodology to other architectures and datasets. UK:
dc.identifier.urihttps://eir.zp.edu.ua/handle/123456789/25707
dc.language.isoen
dc.publisherНаціональний університет «Запорізька політехніка»
dc.subjectconvolutional neural network, semantic segmentation, U-Net, ensemble learning, data augmentation techniques, model initialization, Cityscapes, urban scenes
dc.subjectзгорткова нейронна мережа, семантична сегментація, U-Net, ансамблеве навчання, методи збільшення обсягу даних, ініціалізація ваг, Cityscapes, урбаністичні сцени
dc.titleUrban scene segmentation using homogeneous U-Net ensemble: a study on the cityscapes dataset
dc.title.alternativeСегментація міських сцен за допомогою однорідного ансамблю U-Net: дослідження на датасеті cityscapes
dc.typeArticle

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
S_64 Hmyria.pdf
Size:
922.93 KB
Format:
Adobe Portable Document Format

License bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
license.txt
Size:
1.71 KB
Format:
Item-specific license agreed upon to submission
Description: