Looking for an algorithm to test for abridgement

A place to discuss the science of computers and programs, from algorithms to computability.

Formal proofs preferred.

Moderators: phlip, Moderators General, Prelates

User avatar
Qaanol
The Cheshirest Catamount
Posts: 3054
Joined: Sat May 09, 2009 11:55 pm UTC

Looking for an algorithm to test for abridgement

Postby Qaanol » Fri Mar 16, 2018 4:22 am UTC

Here’s the scenario:

• You have a book, and some of the words are underlined.
• I hand you another book.
• You have to figure out if it is possible to start with your book, erase some of the non-underlined words, and end up with my book (or at least, the same words in the same order as my book).

The straightforward “greedy” algorithm takes exponential time, and I’d like to do better.

It only takes linear time to check that the words of my book are a subsequence of the words in your book. It also only takes linear time to check that the underlined words in your book are a subsequence of the words in my book. Yet even with both of those conditions satisfied, I can’t see a good way to tackle the original problem.
wee free kings

>-)
Posts: 524
Joined: Tue Apr 24, 2012 1:10 am UTC

Re: Looking for an algorithm to test for abridgement

Postby >-) » Fri Mar 16, 2018 6:05 am UTC

Here is a first attempt:

Name the words of the book as X_i for the ith word in the first book (which contains underlined words), and Y_j for the jth word in the second book. Let U(i) denote whether X_i is underlined.

Define A(i,j) to be a boolean function such that A(i,j) is true when X[i:] can be abridged to Y[j:] (in other words, consider all words after and including word X_i, and Y_j -- with 0 indexing).

So the answer to our question is given by A(0,0)

Then we can compute A(0,0) in time |X|*|Y| as follows:

A(|X|, |Y|) = True
A(|X|, z) = False, for z < |Y|

Other than these special cases, we have the following cases

A(i,j) = ...

A(i+1, j+1) if U(i) and X_i = Y_j

False if U(i) and X_i != Y_j

A(i+1, j) if X_i if not U(i) and X_i != Y_i

OR[A(i+1, j), A(i+1, j+1)] if not U(i) and X_i = Y_j

User avatar
notzeb
Without Warning
Posts: 629
Joined: Thu Mar 08, 2007 5:44 am UTC
Location: a series of tubes

Re: Looking for an algorithm to test for abridgement

Postby notzeb » Fri Mar 16, 2018 6:07 am UTC

Spoiler:
You can do this in quadratic time with dynamic programming. As you go through the original book, word by word, keep track of all the possible corresponding positions you might be at in the possibly-abridged-book. Your list has at most linear size at each step, and updating it once takes at most linear time.

Is linear time possible?


Edit: ninja'd
Zµ«V­jÕ«ZµjÖ­Zµ«VµjÕ­ZµkV­ZÕ«VµjÖ­Zµ«V­jÕ«ZµjÖ­ZÕ«VµjÕ­ZµkV­ZÕ«VµjÖ­Zµ«V­jÕ«ZµjÖ­ZÕ«VµjÕ­ZµkV­ZÕ«ZµjÖ­Zµ«V­jÕ«ZµjÖ­ZÕ«VµjÕ­Z

User avatar
Qaanol
The Cheshirest Catamount
Posts: 3054
Joined: Sat May 09, 2009 11:55 pm UTC

Re: Looking for an algorithm to test for abridgement

Postby Qaanol » Fri Mar 16, 2018 8:50 pm UTC

Thanks!
wee free kings


Return to “Computer Science”

Who is online

Users browsing this forum: No registered users and 5 guests