Data clustering based on inductive learning of neuro-fuzzy network with distance hashing

Subbotin, S. A.; Субботін, Сергій Олександрович

dc.contributor.author	Subbotin, S. A.
dc.contributor.author	Субботін, Сергій Олександрович
dc.date.accessioned	2026-02-06T11:11:41Z
dc.date.available	2026-02-06T11:11:41Z
dc.date.issued	2022
dc.description	Subbotin S. A. Data clustering based on inductive learning of neuro-fuzzy network with distance hashing / S. A. Subbotin // Радіоелектроніка, інформатика, управління. – 2022. – № 4 (63). – C. 71-85.
dc.description.abstract	Context. Cluster analysis is widely used to analyze data of varying nature and dimensionality. However, known cluster analysis methods are slow and memory-intensive because they must compute pairwise distances between instances in a multidimensional feature space. In addition, their results are difficult for a human to perceive and analyze when the number of features is large. Objective. The purpose of the work is to increase the speed of cluster analysis and the interpretability of the resulting partition into clusters, and to reduce the memory requirements of cluster analysis. Method. A method for cluster analysis of multidimensional data is proposed. For each instance, it computes a hash based on the distance to a conditional center of coordinates and uses the one-dimensional coordinate along the hash axis to determine distances between instances. It treats the resulting hash as a pseudo-output feature and splits it into intervals, which are matched with labels of pseudo-classes (clusters), yielding a rough crisp partition of the feature space and of the sample instances. It then automatically generates a partition of the input features into fuzzy terms, determines the rules for assigning instances to clusters and, as a result, forms a fuzzy inference system of the Mamdani-Zadeh classifier type, which is further trained as a neuro-fuzzy network to ensure acceptable values of the clustering quality functional. This makes it possible to reduce the number of terms and features used, to evaluate their contribution to decisions about assigning instances to clusters, to increase the speed of cluster analysis, and to increase the interpretability of the resulting partition of the data into clusters. Results. Mathematical support for solving the problem of cluster analysis of high-dimensional data has been developed. Experiments have been carried out that confirmed the operability of the developed mathematical support. Conclusions. The developed method and its software implementation can be recommended for practical use in problems of analyzing data of varying nature and dimensionality.
dc.identifier.uri	https://eir.zp.edu.ua/handle/123456789/26648
dc.language.iso	en
dc.publisher	Національний університет "Запорізька політехніка"
dc.subject	cluster analysis
dc.subject	neuro-fuzzy network
dc.subject	hash
dc.subject	fuzzy inference
dc.subject	data analysis
dc.subject	кластер-аналіз
dc.subject	нейро-нечітка мережа
dc.subject	хеш
dc.subject	нечітке виведення
dc.subject	аналіз даних
dc.title	Data clustering based on inductive learning of neuro-fuzzy network with distance hashing
dc.title.alternative	Neuro-fuzzy network for data clustering with distance hashing and self-learning
dc.type	Article
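
The first stage of the method summarized in the abstract (distance hash, one-dimensional hash axis, interval split into pseudo-class labels) can be illustrated with a minimal sketch. This is not the author's implementation: the function names `distance_hash` and `rough_partition`, the choice of the per-feature minimum as the "conditional center of coordinates", the Euclidean metric, and the equal-width intervals are all assumptions made for illustration. Fuzzy term generation, rule extraction, and neuro-fuzzy training are not shown.

```python
import math


def distance_hash(instance, origin):
    """Euclidean distance from one instance to a reference point (assumed metric)."""
    return math.sqrt(sum((x - o) ** 2 for x, o in zip(instance, origin)))


def rough_partition(data, n_intervals):
    """Label each instance with a pseudo-cluster derived from its 1-D hash.

    Assumptions (not taken from the source): the reference point is the
    per-feature minimum of the sample, and the hash axis is cut into
    equal-width intervals.
    """
    n_features = len(data[0])
    origin = [min(row[j] for row in data) for j in range(n_features)]

    # One distance per instance: N computations, no pairwise distance matrix.
    hashes = [distance_hash(row, origin) for row in data]

    lo, hi = min(hashes), max(hashes)
    width = (hi - lo) / n_intervals or 1.0  # fall back to 1.0 if all hashes are equal

    # Interval index along the hash axis; the maximum is clamped into the last interval.
    labels = [min(int((h - lo) / width), n_intervals - 1) for h in hashes]
    return hashes, labels


data = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9], [10.0, 10.0]]
hashes, labels = rough_partition(data, n_intervals=3)
print(labels)  # [0, 0, 1, 1, 2]
```

In this sketch the labels serve only as the rough crisp partition; instances at the same distance from the reference point but in different directions receive the same hash, which is one reason the abstract describes a subsequent fuzzy rule base and training step.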

Files

Original bundle

Name: S_71 Subbotin.pdf
Size: 1.04 MB
Format: Adobe Portable Document Format

Collections

Радіоелектроніка, інформатика, управління - 2022, №4 (63)