## Nearest "Word" Problem

A place to discuss the science of computers and programs, from algorithms to computability.

Formal proofs preferred.

Moderators: phlip, Larson, Moderators General, Prelates

### Nearest "Word" Problem

I've got to implement a system that, given a "signature" (a,b,c,d) where a-d are single digits, returns nearest neighbours within a data structure, with a percentage difference.

Naively, you could use a binary search tree, treating the signature as a single int for the key, but there's a chance that you'll miss a neighbour without doing a full iteration of the tree, so I want to do something different. My current thought is to do something similar to a binary tree, but using the difference (using Euclidean distance), and intertwining a sorted doubly-linked list (so each node has pointers left, right, previous and next, and maybe parent), using the tree as the primary search, and using the linked list to quickly find neighbours. What are your thoughts on this?

I've read about BK Trees, but I can't find any good information on them, except for this blog post. Does anyone know where else I can try looking for info on BK Trees, or if there's a better structure I should look into?
roband wrote:Mav is a cow.

UniJam 2012: Inter-university Games Jam hosted by Nottingham Trent University DevSoc.
nlug: Nottingham Linux User Group
DevSoc: The Nottingham Trent University Software Development Society

RoadieRich
The Black Hand

Posts: 1030
Joined: Tue Feb 12, 2008 11:40 am UTC
Location: Somewhere only we know

### Re: Nearest "Word" Problem

Sounds like you're learning about K Nearest Neighbors?

R-Trees, M-Trees, K-D Trees and BSP-Trees should all work (each solves the problem in a slightly different fashion). Each has a (much) larger wiki entry than BK Trees.

freakish777

Posts: 328
Joined: Wed Jul 13, 2011 2:14 pm UTC

### Re: Nearest "Word" Problem

If the number of dimensions is large there is locality sensitive hashing, if it is low there are the trees mentioned above.
Unfortunately I don't know if 4 is large .
korona

Posts: 116
Joined: Sun Jul 04, 2010 8:40 pm UTC

### Re: Nearest "Word" Problem

It looks like a k-d tree will do what I want. Thanks!
roband wrote:Mav is a cow.

UniJam 2012: Inter-university Games Jam hosted by Nottingham Trent University DevSoc.
nlug: Nottingham Linux User Group
DevSoc: The Nottingham Trent University Software Development Society

RoadieRich
The Black Hand

Posts: 1030
Joined: Tue Feb 12, 2008 11:40 am UTC
Location: Somewhere only we know

Return to Computer Science

### Who is online

Users browsing this forum: No registered users and 5 guests