Polish Statistical Association
Central Statistical Office of Poland
Subject: Economics, Statistics & Probability
ISSN: 1234-7655
eISSN: 2450-0291
Katarzyna Stąpor * / Tomasz Smolarczyk * / Piotr Fabian *
Keywords : heteroscedastic discriminant analysis, feature subset selection, variable importance, credit scoring model
Citation Information : Statistics in Transition New Series. Volume 17, Issue 2, Pages 265-280, DOI: https://doi.org/10.21307/stattrans-2016-018
License : (CC BY 4.0)
Published Online: 06-July-2017
Credit granting is a fundamental decision and one of the most complex tasks that every credit institution faces. Credit scoring databases are typically large and contain redundant and irrelevant features. An effective classification model gives managers an objective basis for decisions in place of intuition and experience. This study proposes an approach for building a credit scoring model that combines the heteroscedastic extension (Loog, Duin, 2002) of classical Fisher Linear Discriminant Analysis (Fisher, 1936, Krzyśko, 1990) with a feature selection algorithm that retains sufficient information for classification purposes. We tested five feature subset selection algorithms: two filters and three wrappers. To evaluate the accuracy of the proposed credit scoring model and to compare it with existing approaches, we used the German credit data set from the study (Chen, Li, 2010). The results suggest that the proposed hybrid approach is an effective and promising method for building credit scoring models.
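To make the notion of a filter-type feature selector concrete, here is a minimal sketch in Python. The abstract does not name the filters that were tested, so the two-class Fisher score used here is an assumed stand-in, not necessarily one of the paper's algorithms. The function names (`fisher_score`, `select_top_k`) and the toy data are hypothetical and are not taken from the German credit data set.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Two-class Fisher score of one feature: (m1 - m0)^2 / (v0 + v1)."""
    g0 = [v for v, y in zip(values, labels) if y == 0]
    g1 = [v for v, y in zip(values, labels) if y == 1]
    gap = (mean(g1) - mean(g0)) ** 2
    spread = pvariance(g0) + pvariance(g1)
    if spread == 0:
        # No within-class variation: perfectly separating or useless.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def select_top_k(rows, labels, k):
    """Score each feature independently and keep the k best (a filter)."""
    n_features = len(rows[0])
    scores = [
        fisher_score([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k]), scores


# Hypothetical toy data: 6 applicants, 3 features, label 1 = bad credit.
rows = [
    [1.0, 5.0, 2.0],
    [1.2, 3.0, 2.5],
    [0.8, 4.0, 1.5],
    [3.0, 4.0, 2.5],
    [3.2, 5.0, 3.0],
    [2.8, 3.0, 2.0],
]
labels = [0, 0, 0, 1, 1, 1]

selected, scores = select_top_k(rows, labels, k=2)
print(selected, scores)
```

A filter such as this ranks features without consulting the classifier. A wrapper would instead score candidate subsets by the accuracy of the discriminant model trained on them, which costs more computation but reflects the behaviour of the final classifier.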