Authors: David Bell, Gongde Guo, Hui Wang, Zhining Liao
Tags: 2004, conceptual modeling
The k-Nearest-Neighbors (kNN) method for classification is simple but effective in many cases. Its success depends on selecting a “good value” for k. In this paper, we propose a contextual probability-based classification algorithm (CPC) that reduces the bias introduced by k by looking at multiple sets of nearest neighbors rather than a single set of k nearest neighbors. The proposed formalism is based on probability, and the idea is to aggregate the support of multiple neighborhoods for the various classes to better reveal the true class of each new instance. To choose a series of more relevant neighborhoods for aggregation, three neighborhood selection methods (distance-based, symmetric-based, and entropy-based) are proposed and evaluated. The experimental results show that CPC obtains better classification accuracy than kNN and is indeed less biased by k after saturation is reached. Moreover, entropy-based CPC obtains the best performance among the three proposed neighborhood selection methods.
Read the full paper here: https://link.springer.com/chapter/10.1007/978-3-540-30464-7_25
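
The idea of aggregating support across several neighborhoods, rather than trusting a single k, can be illustrated with a minimal sketch. This is not the paper's CPC formulation: it assumes Euclidean distance, takes the neighborhoods to be the k nearest instances for each k in a hypothetical `ks` tuple, and simply averages the class proportions found in each neighborhood. The function name `multi_neighborhood_vote` and its parameters are illustrative assumptions, and none of the three neighborhood selection methods from the abstract is implemented here.

```python
import math
from collections import Counter


def multi_neighborhood_vote(train, query, ks=(1, 3, 5)):
    """Classify `query` by averaging class support over several neighborhoods.

    train: list of (feature_tuple, label) pairs.
    ks:    neighborhood sizes to aggregate (an assumption of this sketch,
           standing in for the paper's neighborhood selection methods).
    Returns (predicted_label, support), where support maps label -> score.
    """
    if not train:
        raise ValueError("train must contain at least one instance")

    # Rank training instances by Euclidean distance to the query.
    ranked = sorted(train, key=lambda item: math.dist(item[0], query))

    support = Counter()
    for k in ks:
        neighbors = ranked[:k]  # shorter than k if k exceeds the training size
        counts = Counter(label for _, label in neighbors)
        for label, count in counts.items():
            # Class proportion in this neighborhood, averaged over neighborhoods.
            support[label] += count / len(neighbors) / len(ks)

    predicted = max(support, key=support.get)
    return predicted, dict(support)


# Toy data: two well-separated clusters (made up for illustration).
toy_train = [
    ((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((1.0, 0.0), "a"),
    ((5.0, 5.0), "b"), ((5.0, 6.0), "b"), ((6.0, 5.0), "b"),
]
label, support = multi_neighborhood_vote(toy_train, (0.2, 0.2))
```

Because each neighborhood contributes a normalized class distribution, the aggregated scores sum to 1, and a single poorly chosen k has less influence on the final decision than it would in plain kNN.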