Data Preprocessors - Cleaning Data and Removing Noise

Data preprocessing turns a noisy data set into a cleaned data set by transforming its data features. Noise comes in different kinds; two prominent examples are biological noise and technical noise. Both reduce how precisely a data set reflects the true information. Biological noise arises because biological systems exhibit natural variation that may be falsely interpreted as signal. Technical noise is introduced into the data set during measurement by technical confounding factors.
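As a rough illustration of what "transforming data features" can mean, the following sketch drops features whose values barely vary across samples, on the assumption that such features carry little signal. It is not ClustEval's implementation and does not use the ClustEval API; the function name `remove_low_variance_features`, the `min_variance` parameter, and the sample data are all hypothetical.

```python
from statistics import pvariance


def remove_low_variance_features(rows, min_variance):
    """Drop feature columns whose population variance is below min_variance.

    Hypothetical helper for illustration only; not part of ClustEval.
    rows is a list of samples, each a list of numeric feature values.
    Returns the cleaned rows and the indices of the features that were kept.
    """
    if not rows:
        return [], []
    # Transpose so that each tuple holds all values of one feature.
    columns = list(zip(*rows))
    kept = [j for j, col in enumerate(columns) if pvariance(col) >= min_variance]
    cleaned = [[row[j] for j in kept] for row in rows]
    return cleaned, kept


# Three samples with three features; the middle feature is nearly constant.
noisy = [
    [1.0, 5.00, 10.0],
    [2.0, 5.01, 20.0],
    [3.0, 4.99, 30.0],
]
cleaned, kept = remove_low_variance_features(noisy, min_variance=0.01)
print(kept)     # indices of the retained features
print(cleaned)  # data set without the near-constant feature
```

A variance threshold is only one possible cleaning strategy. Which transformation is appropriate depends on the kind of noise expected in the data set.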

ClustEval provides data preprocessors for removing noise from data sets. For a list of all available data preprocessors, head over here.

See Writing Data Preprocessors for more information on how to extend ClustEval with new data preprocessors.