Approximate k Nearest Neighbor Search with Linear Combination Method

Authors

1 Computer Engineering Department, Imam Reza International University, Mashhad, Iran

2 Faculty of Computer Engineering and Information Technology, Sadjad University of Technology, Mashhad, Iran

Abstract

Approximate k nearest neighbor search problem in high dimensional Euclidean spaces is a classical problem in computational geometry, image similarity search, video, and so on. In this problem, we are given a point set P of size n in the d-dimensional space and a parameter k, the goal is to preprocess P. So that given a query point q we can return fairly fast k points in which the points are good approximations of the k nearest neighbors to Q in P. In this paper, an algorithm for searching k nearest neighbor is presented for high dimensional data. In this method, first, data with high-dimensional are embedded in hamming space, then with a linear combination of random vectors and embedded data in hamming space, hash tables are formed. We conduct extensive experiments for this algorithm on big dataset of handwriting English single-digit images. This algorithm led good results for sparse matrices. Experimental results show that the proposed algorithm has the better accuracy comparing to the new methods.

Keywords