Nearest "Word" Problem

A place to discuss the science of computers and programs, from algorithms to computability.

Formal proofs preferred.

Moderators: phlip, Larson, Moderators General, Prelates

Nearest "Word" Problem

Postby RoadieRich » Mon Apr 09, 2012 3:03 pm UTC

I've got to implement a system that, given a "signature" (a,b,c,d) where a-d are single digits, returns nearest neighbours within a data structure, with a percentage difference.

Naively, you could use a binary search tree, treating the signature as a single int for the key, but there's a chance that you'll miss a neighbour without doing a full iteration of the tree, so I want to do something different. My current thought is to do something similar to a binary tree, but using the difference (using Euclidean distance), and intertwining a sorted doubly-linked list (so each node has pointers left, right, previous and next, and maybe parent), using the tree as the primary search, and using the linked list to quickly find neighbours. What are your thoughts on this?

I've read about BK Trees, but I can't find any good information on them, except for this blog post. Does anyone know where else I can try looking for info on BK Trees, or if there's a better structure I should look into?
roband wrote:Mav is a cow.

UniJam 2012: Inter-university Games Jam hosted by Nottingham Trent University DevSoc.
nlug: Nottingham Linux User Group
DevSoc: The Nottingham Trent University Software Development Society
User avatar
RoadieRich
The Black Hand
 
Posts: 1030
Joined: Tue Feb 12, 2008 11:40 am UTC
Location: Somewhere only we know

Re: Nearest "Word" Problem

Postby freakish777 » Mon Apr 09, 2012 5:59 pm UTC

Sounds like you're learning about K Nearest Neighbors?

R-Trees, M-Trees, K-D Trees and BSP-Trees should all work (each solves the problem in a slightly different fashion). Each has a (much) larger wiki entry than BK Trees.
User avatar
freakish777
 
Posts: 328
Joined: Wed Jul 13, 2011 2:14 pm UTC

Re: Nearest "Word" Problem

Postby korona » Mon Apr 09, 2012 7:57 pm UTC

If the number of dimensions is large there is locality sensitive hashing, if it is low there are the trees mentioned above.
Unfortunately I don't know if 4 is large :D.
korona
 
Posts: 116
Joined: Sun Jul 04, 2010 8:40 pm UTC

Re: Nearest "Word" Problem

Postby RoadieRich » Tue Apr 10, 2012 4:01 am UTC

It looks like a k-d tree will do what I want. Thanks!
roband wrote:Mav is a cow.

UniJam 2012: Inter-university Games Jam hosted by Nottingham Trent University DevSoc.
nlug: Nottingham Linux User Group
DevSoc: The Nottingham Trent University Software Development Society
User avatar
RoadieRich
The Black Hand
 
Posts: 1030
Joined: Tue Feb 12, 2008 11:40 am UTC
Location: Somewhere only we know


Return to Computer Science

Who is online

Users browsing this forum: Slageammalymn and 3 guests