Assessment of customer credit with efficient multivariate classifier

Part of : WSEAS transactions on business and economics ; Vol.10, No.3, 2013, pages 170-179

Pages:
170-179
Abstract:
Assessing customer credit is important for financial institutions, and many techniques have been developed for customer credit classification. Traditional methods suffer from inaccurate prediction and/or inefficient data analysis. In this paper, we adopt an entropy-based evaluation method and a PCA-based classifier to address these two problems. A new feature evaluation criterion, combined with a novel feature selection method, is proposed to identify the critical factors for classification. By means of multivariate analysis, we obtain not only a reduced subset of relevant features but also an efficient classifier. The aim of this paper is to assess customer credit both effectively and efficiently. Experimental results on the German credit case show that the proposed method simultaneously improves accuracy in both the high-credit and bad-credit classes when compared with the traditional C4.5 scheme.
Keywords:
customer credit assessment, feature selection, multi-class classification, PCA, multivariate analysis, C4.5
Notes:
Includes figures, tables, and bibliography