1185: "Ineffective Sorts"

This forum is for the individual discussion thread that goes with each new comic.

Moderators: Moderators General, Prelates, Magistrates

bluegrey
Posts: 1
Joined: Wed Mar 20, 2013 7:24 pm UTC

Re: 1185: "Ineffective Sorts"

Postby bluegrey » Wed Mar 20, 2013 7:32 pm UTC

The important bound on the runtime of sleepsort will be a function of the operating system's task scheduler. This looks to be O(log (N)).

bfeist
Posts: 12
Joined: Mon May 07, 2012 10:54 am UTC

Re: 1185: "Ineffective Sorts"

Postby bfeist » Thu Mar 21, 2013 2:35 am UTC

Elvish Pillager wrote:If n = O(k log k) then log n = O(log(k log k)) = O(log k + log log k) = O(log k)
Hence there is no asymptotic difference between log n and log k.


You're right; thanks!

Mr_Happy
Posts: 1
Joined: Thu Mar 21, 2013 10:26 pm UTC

Re: 1185: "Ineffective Sorts"

Postby Mr_Happy » Thu Mar 21, 2013 10:39 pm UTC

So yeah, I implemented StackSort in C# (in one go so it's really ugly :P ). It sorts my list of 4 elements in about 4 minutes on my machine, it's close to O(1). I'm a first time poster (long time lurker) so I can't place the code or an actual link, but you can find it here :) : pastebin dot com/pzP19zJa

User avatar
pitareio
Posts: 128
Joined: Wed Sep 19, 2012 7:03 pm UTC
Location: the land of smelly cheese

Re: 1185: "Ineffective Sorts"

Postby pitareio » Fri Mar 22, 2013 8:06 pm UTC

Randall, you created a monster. If people start coding programs that search StackOverflow, download and compile the code, the next step is coding programs that will search the internet for programs that will search StackOverflow for the solution to any given problem. And the day is close when all computer programmers will actually be replaced by computer programs.

Or maybe it's just part of the plan that will lead to the singularity.

Kit.
Posts: 1117
Joined: Thu Jun 16, 2011 5:14 pm UTC

Re: 1185: "Ineffective Sorts"

Postby Kit. » Fri Mar 22, 2013 9:17 pm UTC

pitareio wrote:Randall, you created a monster. If people start coding programs that search StackOverflow, download and compile the code, the next step is coding programs that will search the internet for programs that will search StackOverflow for the solution to any given problem. And the day is close when all computer programmers will actually be replaced by computer programs.

...except those who put stuff on StackOverflow.

HarpyRotter
Posts: 6
Joined: Wed Mar 20, 2013 11:16 am UTC

Re: 1185: "Ineffective Sorts"

Postby HarpyRotter » Fri Mar 22, 2013 10:36 pm UTC

Mr_Happy wrote:So yeah, I implemented StackSort in C# (in one go so it's really ugly :P ). It sorts my list of 4 elements in about 4 minutes on my machine, it's close to O(1). I'm a first time poster (long time lurker) so I can't place the code or an actual link, but you can find it here :) : pastebin dot com/pzP19zJa


Dude, that's freaking awesome! But still, 4 minutes? What's keeping it so long? Is it only finding a good algorithm after a few tries? Perhaps you should try ordering the search results under different criteria, such as number of votes vs. relevance vs. oldest, and if you do, please tell us your results :D.

User avatar
pitareio
Posts: 128
Joined: Wed Sep 19, 2012 7:03 pm UTC
Location: the land of smelly cheese

Re: 1185: "Ineffective Sorts"

Postby pitareio » Fri Mar 22, 2013 10:44 pm UTC

HarpyRotter wrote:
Mr_Happy wrote:So yeah, I implemented StackSort in C# (in one go so it's really ugly :P ). It sorts my list of 4 elements in about 4 minutes on my machine, it's close to O(1). I'm a first time poster (long time lurker) so I can't place the code or an actual link, but you can find it here :) : pastebin dot com/pzP19zJa


Dude, that's freaking awesome! But still, 4 minutes? What's keeping it so long? Is it only finding a good algorithm after a few tries? Perhaps you should try ordering the search results under different criteria, such as number of votes vs. relevance vs. oldest, and if you do, please tell us your results :D.


The problem is that he'll need a sort algorithm for that.

EugeneStyles
Posts: 38
Joined: Fri Apr 27, 2007 2:59 am UTC

Re: 1185: "Ineffective Sorts"

Postby EugeneStyles » Sat Mar 23, 2013 1:34 am UTC

Okay, so this comic got me thinking about sort algorithms again, as I sometimes do (thanks to whoever posted sleepsort, BTW, that's brilliant. Stupid, but brilliant), and especially the "half-hearted" sort got me thinking. I can think of some applications where a half-hearted sort might actually be pretty useful - search result rankings, for instance. You want them in "relevance" order, but relevance is a nebulous thing, and you really don't care about the relative positions of items that are within about 5% of each other in relevance. If you can do better than O(n log n) complexity by not caring if the result is fully sorted, that might be worthwhile. Maybe even bonus points if the order of "close" values is non-deterministic.

So, thinking that someone had probably already thought about it (since there are so many articles even on purposely bad search algorithms), I looked around, but couldn't find anything. Does anyone know anything about half-hearted sorting algorithms? Or am I basically just talking about a specialized hash table?

(edit: my word-brain isn't working well this evening)

Manabu
Posts: 30
Joined: Tue Nov 30, 2010 1:57 am UTC

Re: 1185: "Ineffective Sorts"

Postby Manabu » Sun Mar 24, 2013 10:07 pm UTC

EugeneStyles wrote:Okay, so this comic got me thinking about sort algorithms again, as I sometimes do (thanks to whoever posted sleepsort, BTW, that's brilliant. Stupid, but brilliant), and especially the "half-hearted" sort got me thinking. I can think of some applications where a half-hearted sort might actually be pretty useful - search result rankings, for instance. You want them in "relevance" order, but relevance is a nebulous thing, and you really don't care about the relative positions of items that are within about 5% of each other in relevance. If you can do better than O(n log n) complexity by not caring if the result is fully sorted, that might be worthwhile. Maybe even bonus points if the order of "close" values is non-deterministic.

So, thinking that someone had probably already thought about it (since there are so many articles even on purposely bad search algorithms), I looked around, but couldn't find anything. Does anyone know anything about half-hearted sorting algorithms? Or am I basically just talking about a specialized hash table?

The half-hearted merge sort, as shown in the comic, don't even start to sort anything. It is completely useless. What can be more useful is a "half-hearted quicksort", where when reaching small enough partition size, it stops sorting instead of handing the small array to a insertion sort. But you only gain on the constant factor, it is still O(n lg n).

If you have many items with a convenient binary representation, like integers, MSB radix sort is probably the best choice, runs in O(n), and it can too be kinda stopped in the middle for a partially ordered array. But as you will be probably using radix sizes of 256, opportunities for stopping in the middle while keeping reasonable closeness will be narrow.

User avatar
Darekun
Posts: 38
Joined: Thu Jan 03, 2013 7:57 am UTC

Re: 1185: "Ineffective Sorts"

Postby Darekun » Wed Apr 10, 2013 3:55 am UTC

I'm not real sure why, but I optimized sleepsort:

Code: Select all

PBYTE OrdSort(PBYTE pBytesToSort, int Length)
{
   BOOL bMember[256];
   int iSrc, iDst = 0, iMember;

   //This part represents the spawning of the threads.
   PBYTE Result = (PBYTE)malloc(Length); //Not an in-place sort.
   for ( iMember = 0; iMember < 256; ++iMember ) bMember[iMember] = false;
   for ( iSrc = 0; iSrc < Length; ++iSrc ) bMember[pBytesToSort[iSrc]] = true;

   //This part represents the task scheduler dequeuing threads.
   for ( iMember = 0; iMember < 256; ++iMember ) if ( bMember[iMember] )
      for ( iSrc = 0; iSrc < Length; ++iSrc ) if ( pBytesToSort[iSrc] == iMember )
      {
         //This part represents the threads delivering their payloads.
         Result[iDst] = iMember;
         ++iDst;
      }

   return Result;
}

(Translated from Delphi as I post this, there may be bugs but hopefully you get the idea.)

It… looks like it's O(N).
Spoiler:
More accurately, it limits at K×N×V time, where K is constant, N is the length of input as usual, and V is the number of unique values in the input. There's a small time contribution from the size of each element, but that has more effect on the memory it needs. An optimization for large elements would be to make bMember a collection instead of an array, but V is more normally proportional to the size of each element, and by this point you've probably realized why I made the example sort bytes.




Unrelated, I've coded three sorts in the past month or so — all based on bubble sort. Bubble sort is actually not that bad when N is very small… like the number of attributes in an RPG. The other two were garage games where it helps to run a partial bubble sort(one pass each direction, O(N)) each frame, as a half-hearted sort.

Appity
Posts: 6
Joined: Wed May 21, 2008 4:33 am UTC

Re: 1185: "Ineffective Sorts"

Postby Appity » Wed Apr 10, 2013 11:59 am UTC

Darekun wrote:I'm not real sure why, but I optimized sleepsort:

Code: Select all

PBYTE OrdSort(PBYTE pBytesToSort, int Length)
{
   BOOL bMember[256];
   int iSrc, iDst = 0, iMember;

   //This part represents the spawning of the threads.
   PBYTE Result = (PBYTE)malloc(Length); //Not an in-place sort.
   for ( iMember = 0; iMember < 256; ++iMember ) bMember[iMember] = false;
   for ( iSrc = 0; iSrc < Length; ++iSrc ) bMember[pBytesToSort[iSrc]] = true;

   //This part represents the task scheduler dequeuing threads.
   for ( iMember = 0; iMember < 256; ++iMember ) if ( bMember[iMember] )
      for ( iSrc = 0; iSrc < Length; ++iSrc ) if ( pBytesToSort[iSrc] == iMember )
      {
         //This part represents the threads delivering their payloads.
         Result[iDst] = iMember;
         ++iDst;
      }

   return Result;
}

(Translated from Delphi as I post this, there may be bugs but hopefully you get the idea.)

It… looks like it's O(N).
Spoiler:
More accurately, it limits at K×N×V time, where K is constant, N is the length of input as usual, and V is the number of unique values in the input. There's a small time contribution from the size of each element, but that has more effect on the memory it needs. An optimization for large elements would be to make bMember a collection instead of an array, but V is more normally proportional to the size of each element, and by this point you've probably realized why I made the example sort bytes.




Unrelated, I've coded three sorts in the past month or so — all based on bubble sort. Bubble sort is actually not that bad when N is very small… like the number of attributes in an RPG. The other two were garage games where it helps to run a partial bubble sort(one pass each direction, O(N)) each frame, as a half-hearted sort.



The enigma that is sleepsort just keeps getting more interesting to me, and I'm not exactly sure why. The audacity, for this sorting algorithm to join together such different concepts. I get the same unexplainable joy when I see physics take extreme liberties with mathematics to arrive at a good solution.

Thanks for sharing, Darekun!!

User avatar
phlip
Restorer of Worlds
Posts: 7573
Joined: Sat Sep 23, 2006 3:56 am UTC
Location: Australia
Contact:

Re: 1185: "Ineffective Sorts"

Postby phlip » Wed Apr 10, 2013 2:13 pm UTC

Darekun: With a few tweaks, to make it able to handle having input values that are equal, and the like... this essentially reduces to radix sort, which is O(n), but has a bunch of restrictions that apply to the input (eg the values you're sorting must be integers, and must have a compile-time-fixed minimum and maximum range).

Code: Select all

enum ಠ_ಠ {°□°╰=1, °Д°╰, ಠ益ಠ╰};
void ┻━┻︵​╰(ಠ_ಠ ⚠) {exit((int)⚠);}
[he/him/his]

User avatar
Darekun
Posts: 38
Joined: Thu Jan 03, 2013 7:57 am UTC

Re: 1185: "Ineffective Sorts"

Postby Darekun » Fri Apr 12, 2013 2:58 am UTC

Appity wrote:The enigma that is sleepsort just keeps getting more interesting to me, and I'm not exactly sure why. The audacity, for this sorting algorithm to join together such different concepts.

There's also a deliberate conceptual brutality to it — it's not just deoptimizing a sort, it's using the deoptimization to do work "efficiently".

Appity wrote:Thanks for sharing, Darekun!!

Certainly!

phlip wrote:Darekun: With a few tweaks, to make it able to handle having input values that are equal, and the like... this essentially reduces to radix sort, which is O(n), but has a bunch of restrictions that apply to the input (eg the values you're sorting must be integers, and must have a compile-time-fixed minimum and maximum range).

Mine does actually handle duplicates; otherwise you could get rid of the inner for-if, leaving its contents running on the outer for-if.

In a sense, the part that represents spawning threads detects the range — or did you mean in the sense that my implementation uses a range of 00h-FFh? From the Wikipedia page, it looks like it'd be relatively easy to have item length as a parameter to a radix sort, and then run that many passes…

But, interesting, sleepsort is a deoptimized radix sort with only one pass. Since it's deoptimizing a fast sort, I guess the next question is what's the slowest sort we can apply this deoptimization to? :3

User avatar
phlip
Restorer of Worlds
Posts: 7573
Joined: Sat Sep 23, 2006 3:56 am UTC
Location: Australia
Contact:

Re: 1185: "Ineffective Sorts"

Postby phlip » Fri Apr 12, 2013 3:29 am UTC

Darekun wrote:In a sense, the part that represents spawning threads detects the range — or did you mean in the sense that my implementation uses a range of 00h-FFh?

Yeah, it's that the inputs are restricted to the range of a byte... so your code can loop from 0 to 255 and still call that O(1) since it's independent of the size of the input. If you make the range of the inputs unbounded then suddenly the length of that loop needs to vary with the input, or you need to do some kind of extra processing, and that affects the asymptotic runtime.

Code: Select all

enum ಠ_ಠ {°□°╰=1, °Д°╰, ಠ益ಠ╰};
void ┻━┻︵​╰(ಠ_ಠ ⚠) {exit((int)⚠);}
[he/him/his]

keldor
Posts: 81
Joined: Thu Jan 26, 2012 9:18 am UTC

Re: 1185: "Ineffective Sorts"

Postby keldor » Tue May 07, 2013 1:57 am UTC

Presenting:

SuperBogoSort!

Code: Select all

superBogoSort(input)
{
    while (true)
    {
        //  generate a list of random elements
        candidate = generateRandomList(input.length);
        // is our list sorted?
        if (isSorted(candidate))
        {
            //does our list match the input?
            check = shuffle(candidate);
            if (input.equals(check))
                return candidate;
        }
    }
}


Works in O(n!*j^n), where j = the number of possible elements.

elasto
Posts: 3778
Joined: Mon May 10, 2010 1:53 am UTC

Re: 1185: "Ineffective Sorts"

Postby elasto » Tue May 07, 2013 2:35 am UTC

Why generate a random list of the same length? Feels like cheating!

Generate a random list of random length, then check if it's sorted, and then check if it matches a randomly shuffled version of the input.

Much better!

rmsgrey
Posts: 3655
Joined: Wed Nov 16, 2011 6:35 pm UTC

Re: 1185: "Ineffective Sorts"

Postby rmsgrey » Tue May 07, 2013 1:40 pm UTC

elasto wrote:Why generate a random list of the same length? Feels like cheating!

Generate a random list of random length, then check if it's sorted, and then check if it matches a randomly shuffled version of the input.

Much better!


And it's still only O(n2) time for a non-deterministic computer (unless you also parallelise the check whether the two lists are permutations, in which case it's O(n) to check that the candidate is sorted, or O(1) if you also parallelise that)

lgw
Posts: 437
Joined: Mon Apr 12, 2010 10:52 pm UTC

Re: 1185: "Ineffective Sorts"

Postby lgw » Tue May 07, 2013 5:44 pm UTC

Manabu wrote:If you have many items with a convenient binary representation, like integers, MSB radix sort is probably the best choice, runs in O(n), and it can too be kinda stopped in the middle for a partially ordered array. But as you will be probably using radix sizes of 256, opportunities for stopping in the middle while keeping reasonable closeness will be narrow.


phlip wrote:Darekun: With a few tweaks, to make it able to handle having input values that are equal, and the like... this essentially reduces to radix sort, which is O(n), but has a bunch of restrictions that apply to the input (eg the values you're sorting must be integers, and must have a compile-time-fixed minimum and maximum range).


For some reason, it really bugs me when I see radix sort described as O(n). It's better to think of it this way:

Given n elements from a set of size k distinct choices, radix sort works in O(n * log(k)).

There are a lot of algorithms like this - you can often choose between O(n * log(n)) and O(n * log(k)).

Darekun wrote:In a sense, the part that represents spawning threads detects the range — or did you mean in the sense that my implementation uses a range of 00h-FFh? From the Wikipedia page, it looks like it'd be relatively easy to have item length as a parameter to a radix sort, and then run that many passes…


There used to be mechanical card sorters that were built around radix sort. They had 3 hoppers, and would divide cards in the input hopper into the 2 working hoppers on each pass, then stack them back in the input hopper for the next pass. Very cool to watch. They did, in fact, have the length as a control on the front - the one I remember had two knobs, one for start column and one for end column.
"In no set of physics laws do you get two cats." - doogly

arthurd006_5
Posts: 98
Joined: Mon Oct 18, 2010 9:49 am UTC

Re: 1185: "Ineffective Sorts"

Postby arthurd006_5 » Wed May 08, 2013 10:56 am UTC

lgw wrote:For some reason, it really bugs me when I see radix sort described as O(n). It's better to think of it this way:

Given n elements from a set of size k distinct choices, radix sort works in O(n * log(k)).

O(n+k) - k for clearing and reading the array, and n for inserting the elements in the middle, where:
- k >= n
- if it's appropriate, k is close to n

so describing an appropriate radix sort as O(n) works for me.

User avatar
phlip
Restorer of Worlds
Posts: 7573
Joined: Sat Sep 23, 2006 3:56 am UTC
Location: Australia
Contact:

Re: 1185: "Ineffective Sorts"

Postby phlip » Fri May 10, 2013 12:33 am UTC

lgw wrote:For some reason, it really bugs me when I see radix sort described as O(n). It's better to think of it this way:

Given n elements from a set of size k distinct choices, radix sort works in O(n * log(k)).

Sure, that's fair. "k" is often a constant for a given implementation of radix sort, but for the algorithm in general it's certainly a parameter.

Code: Select all

enum ಠ_ಠ {°□°╰=1, °Д°╰, ಠ益ಠ╰};
void ┻━┻︵​╰(ಠ_ಠ ⚠) {exit((int)⚠);}
[he/him/his]

User avatar
Von_Cheam
Posts: 17
Joined: Wed Sep 19, 2012 2:30 pm UTC
Location: Cheam, England

Re: 1185: "Ineffective Sorts"

Postby Von_Cheam » Wed Jul 19, 2017 12:46 am UTC

Hate to be a necromancer and all that, but I just came across this:

https://github.com/drathier/stack-overflow-import
Do you ever feel like all you’re doing is copy/pasting from Stack Overflow?

Let’s take it one step further.

Code: Select all

from stackoverflow import quick_sort
will go through the search results
of [python] quick sort looking for the largest code block that doesn’t
syntax error in the highest voted answer from the highest voted question
and return it as a module.


Gotta ask: is this in fact most awfully open to exploitation? Or is getting upvotes for malicious code on StackOverflow difficult enough that it's not as bad as it looks?


Return to “Individual XKCD Comic Threads”

Who is online

Users browsing this forum: acunning40, BytEfLUSh and 117 guests