Article | 22-July-2019
This paper focuses on hierarchical clustering of categorical data and compares two approaches which can be used for this task. The first one, an extremely common approach, is to perform a binary transformation of the categorical variables into sets of dummy variables and then use the similarity measures suited for binary data. These similarity measures are well examined, and they occur in both commercial and non-commercial software. However, a binary transformation can possibly cause a loss of
Jana Cibulková,
Zdenek Šulc,
Sergej Sirota,
Hana Rezanková
Statistics in Transition New Series, Volume 20, Issue 2, 33–47
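The first approach named in the abstract above (dummy coding followed by a similarity measure for binary data) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy records, the choice of the Jaccard coefficient, and the use of complete linkage are all assumptions made for illustration only.

```python
from itertools import combinations


def dummy_encode(rows):
    """Turn categorical records into 0/1 dummy vectors (one bit per category level)."""
    n_vars = len(rows[0])
    # Sorted category levels observed for each variable.
    levels = [sorted({row[j] for row in rows}) for j in range(n_vars)]
    encoded = []
    for row in rows:
        bits = []
        for j, value in enumerate(row):
            bits.extend(1 if value == level else 0 for level in levels[j])
        encoded.append(bits)
    return encoded


def jaccard_dissimilarity(x, y):
    """1 minus the Jaccard coefficient of two binary vectors (illustrative choice)."""
    both = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(x, y) if a == 1 or b == 1)
    return 1.0 - both / either if either else 0.0


def agglomerate(points, n_clusters):
    """Naive complete-linkage agglomerative clustering; returns lists of row indices."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Complete linkage: distance between the two farthest members.
            d = max(
                jaccard_dissimilarity(points[i], points[j])
                for i in clusters[a]
                for j in clusters[b]
            )
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return clusters


# Hypothetical toy data: three categorical variables per record.
rows = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("blue", "large", "round"),
    ("blue", "large", "square"),
]
encoded = dummy_encode(rows)
clusters = agglomerate(encoded, n_clusters=2)
```

In this sketch every record contributes exactly one 1 per original variable, so two records that agree on no category still share all of their 0 entries. Measures such as Jaccard ignore those joint absences, whereas other binary measures count them; which measure to use is a separate choice that the sketch does not settle.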
Research Article | 13-December-2019
Large-scale complex surveys typically contain a large number of variables measured on an even larger number of respondents. Missing data is a common problem in such surveys. Since usually most of the variables in a survey are categorical, multiple imputation requires robust methods for modelling high-dimensional categorical data distributions. This paper introduces the 3-stage Hybrid Multiple Imputation (HMI) approach, computationally efficient and easy to implement, to impute complex survey
Humera Razzak,
Christian Heumann
Statistics in Transition New Series, Volume 20, Issue 4, 33–58