Stacks and Queues

A place to discuss the implementation and style of computer programs.

Moderators: phlip, Moderators General, Prelates

User avatar
thePowerworks
Posts: 71
Joined: Wed Feb 20, 2008 5:40 am UTC

Stacks and Queues

Postby thePowerworks » Wed Feb 20, 2008 6:25 pm UTC

I'm in my second Java class (it's CS302 here) and we are learning about stacks and queues. We have to do all sorts of pointless tasks like sort them, remove elements, reverse them, etc.

My question is, what is a useful tasks that stacks or queues would perform more efficiently than an array or ArrayList.

Thanks!

Micron
Posts: 319
Joined: Sat Feb 16, 2008 1:03 am UTC

Re: Stacks and Queues

Postby Micron » Wed Feb 20, 2008 7:07 pm UTC

thePowerworks wrote:My question is, what is a useful tasks that stacks or queues would perform more efficiently than an array or ArrayList.


Forgive my wariness but that sounds an awfully lot like a homework question.

In any case the "pointless tasks" you are learning now are actually vitally important in a program of any complexity. You are probably working with simplified examples but the skills this is supposed to teach you are no less important. Even if you are never the person writing the data structure or its "sort" method you will need to be able to understand which structure to use and what the performance implications are.

If you want a specific example of one of these data structures go look up the "call stack" and see what you find.

EvanED
Posts: 4331
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI
Contact:

Re: Stacks and Queues

Postby EvanED » Wed Feb 20, 2008 7:31 pm UTC

It's not so much whether they would be more efficient or not, and in fact it doesn't even really make sense to ask that question. Stacks and queues specify an interface, not an implementation, and different implementations could have differing performance capabilities. Some of the time, you'll be able to use a stack or queue implementation that is faster than an array, sometimes not. (If you are then a linked list probably would be the backing store.)

The point of using a stack or a queue is that you look at a problem you want to solve, say "to solve this problem I need to put stuff onto a list, take it back off in a FIFO [LIFO] order, and I don't need to look at anything but the top element, thus I will use a stack (queue)." It's the same sort of reasoning that you go through when deciding to use an array instead of a linked list: "to solve this problem I need fast random access into my collection, and i do little inserting and deleting into the middle, thus I will use an array." (Edit.) For instance, if you're doing something where you need to sort the elements of the stack or queue, that's probably not the right data structure to use. It is probably still worthwhile to think about how you would do it so you become more familiar with the sorts of things that they can and can't do easily, but it's not something you would do in practice.

Now, efficiency, as much as it can be answered.

First, stacks. Stacks are actually really easy to implement with a "backing store" of an array, and they are efficient. All you need is the array and an index to the last element. If you push something, you add it to the array and increment the index. (You have to deal with overflow too; if you use a vector or something instead of a raw array it will do a lot of this for you.) Poping means returning the element at the last index, and then decrementing it.

You can also really easily do stacks with a linked list. In fact the code is a bit simpler than raw arrays as you don't have to worry about overflow at all. This avoids resizing costs but gives you allocation and deallocation/GC costs. (Which dominates depends on your usage patterns.)

Queues are a bit different story. Queues are actually a bit messier to implement with an array, because you need to also keep a pointer to the first element. And you need to take into account that even though the queue may only ever have 10 elements in it at a time, if you keep pushing and popping it will hit the end of the backing array after a bit. You can fix this problem with something called a ring buffer, but the code isn't particularly pretty. It's definitely sufficiently ugly that if it wasn't encapsulated in a set of other functions, I would not be happy. (And one you do that... hey, you have a Queue class!)

The linked list solution though is almost as easy as the stack case if you were to implement it all yourself. It has the same sorts of tradeoffs in terms of what performance you get and lose.
Last edited by EvanED on Wed Feb 20, 2008 7:41 pm UTC, edited 1 time in total.

User avatar
segmentation fault
Posts: 1770
Joined: Wed Dec 05, 2007 4:10 pm UTC
Location: Nu Jersey
Contact:

Re: Stacks and Queues

Postby segmentation fault » Wed Feb 20, 2008 7:34 pm UTC

arrays are good for random access, but need to be resized when inserting/deleting
lists (stack/queue) perform poorly on random access, but dont need to be resized.
people are like LDL cholesterol for the internet

User avatar
rrwoods
Posts: 1509
Joined: Mon Sep 24, 2007 5:57 pm UTC
Location: US

Re: Stacks and Queues

Postby rrwoods » Wed Feb 20, 2008 7:36 pm UTC

Meh, might as well toss up one more here:

thePowerworks wrote:stacks and queues ... sort them

Huh?

To answer your question, here's a real-world use for each. Terms I've used in quotes will be defined by your textbook/lecturer/online resource and may vary from source to source.

Stacks: An undo/redo stack. Say you're programming an application that allows users to undo and redo actions. Each action the user takes has a corresponding undo action (for example, in a word processor, deleting text would have an undo action of putting the text back). When the user takes an action, its corresponding undo action is "pushed" onto an undo stack. If the user clicks undo, then the action on the "top" of the undo stack is "popped" and performed, and the original action is "pushed" onto a redo stack. If the user clicks redo, the action on the "top" of the redo stack is "popped" and performed, and its corresponding undo action is "pushed" onto the undo stack.

Queues: A message queue. Say you're programming an application that handles messages which result in some (possibly message-dependent) code being executed. Most of this code is nontrivial to execute, so any given message might take several seconds (or minutes, or days, depends on the application) to handle. If multiple messages arrive in quick succession, you don't want to lose any, so you use a queue. Whenever a message arrives, you "enqueue" that message. When you're ready to handle a message, you get the one at the "front" of the queue, "dequeue" it, and process it. (If there isn't anything in the queue, you wait.) This ensures that all messages are handled (within memory constraints) and that messages are handled in the order they are received.
31/M/taken/US
age/gender/interest/country

Belial wrote:The sex card is tournament legal. And I am tapping it for, like, six mana.

User avatar
thePowerworks
Posts: 71
Joined: Wed Feb 20, 2008 5:40 am UTC

Re: Stacks and Queues

Postby thePowerworks » Wed Feb 20, 2008 11:32 pm UTC

Micron wrote:Forgive my wariness but that sounds an awfully lot like a homework question.


I can understand but I assure you it was simply for my better understanding (considering my professor is an excellent Computer Scientist but lack the ability to teach).

Thanks to everyone who posted. I think I have a better grasp after reading this thread + working out the examples in my textbook.

User avatar
Aperfectring
Posts: 252
Joined: Fri Sep 07, 2007 3:47 am UTC
Location: Oregon (happily)

Re: Stacks and Queues

Postby Aperfectring » Thu Feb 21, 2008 1:39 am UTC

Sorting a stack or queue does seem like a pointless exercise. A priority queue is the only situation I can think of (off the top of my head) where you would have to do any sorting, and in that case it is just a merge sort when a new element comes in.

Ring buffers are used quite often, though usually only in situations where you want to keep memory footprint down and performance up. A typical use case is in drivers and in the kernel. I agree that ring buffers can be confusing, but once you have dealt with a couple situations, you start to understand them at a glance.
Odds are I did well on my probability exam.

User avatar
Durinthal
Posts: 799
Joined: Mon Dec 11, 2006 9:46 pm UTC
Location: 127.0.0.1

Re: Stacks and Queues

Postby Durinthal » Thu Feb 21, 2008 4:53 am UTC

With a high-level language like Java, a stack or queue isn't necessarily better than any other similar structure for a particular task. At a level closer to the hardware or in theoretical areas, a single stack might be all you have to work with. Having a good working knowledge of them now can be useful later.

User avatar
thePowerworks
Posts: 71
Joined: Wed Feb 20, 2008 5:40 am UTC

Re: Stacks and Queues

Postby thePowerworks » Thu Feb 21, 2008 5:24 am UTC

To be honest, the jump from my first java class (CS201) and this one about 10x in difficulty.

I am making perfect score on my programming homework exercise, but I don't think I am doing so well in exams or quizes.

When it comes to programming I understand it completely but the written problems that are all theory are killing me.

Not to mention we had one test that was nothing but Big Oh analysis and it killed me. I did not study it that heavily (focused more on everything else we learned) and he put nothing but Big Oh on the test.

I'm also taking a discrete structures class right now which is destroying me.

oh well. life goes on.

Posi
Posts: 111
Joined: Mon Jul 16, 2007 6:08 am UTC

Re: Stacks and Queues

Postby Posi » Thu Feb 21, 2008 8:26 am UTC

thePowerworks wrote:To be honest, the jump from my first java class (CS201) and this one about 10x in difficulty.

I am making perfect score on my programming homework exercise, but I don't think I am doing so well in exams or quizes.

When it comes to programming I understand it completely but the written problems that are all theory are killing me.

Not to mention we had one test that was nothing but Big Oh analysis and it killed me. I did not study it that heavily (focused more on everything else we learned) and he put nothing but Big Oh on the test.

I'm also taking a discrete structures class right now which is destroying me.

oh well. life goes on.

A better understanding of Big O might help you understand why the stack and queue are important. Big O provides a theoretic benchmark for each type of data structure. With out it, you might as well use a list for everything. Most any important data structure can be emulated with a list, that is significantly slower.

User avatar
thePowerworks
Posts: 71
Joined: Wed Feb 20, 2008 5:40 am UTC

Re: Stacks and Queues

Postby thePowerworks » Thu Feb 21, 2008 6:20 pm UTC

I understand that basis of big oh. The problem came in when it was time to estimate running times. I forgot most of the rules that we learned in class.

And I think it was a bit unfair of him to put 15 big oh questions on a 25 problem test that was supposed to cover what we learned in the first 1/4 of the semester (we learned a lot more than big oh).

He just knew that would screw most of us over, I think the class average was around 42.

orangeperson
Posts: 117
Joined: Sun May 13, 2007 2:17 am UTC

Re: Stacks and Queues

Postby orangeperson » Thu Feb 21, 2008 8:02 pm UTC

Disclaimer: Everything I know about CS I learned from the internet, so I don't really know that much.
EvanED (Paraphrase) wrote:Arrays may be better than linked lists for implementing Stacks and Queues, depending on stuff.

I fail to see how arrays could possibly be better. To make a linked list Stack, you just need to store a pointer to the list. Take the car (this is the only term I know for that idea), store the cdr, and delete the cons to take something off the stack. If you use an array, you have to have a limit on the array size, reallocate the array, or waste a lot of space with an oversized array.
spjork.

Micron
Posts: 319
Joined: Sat Feb 16, 2008 1:03 am UTC

Re: Stacks and Queues

Postby Micron » Thu Feb 21, 2008 8:20 pm UTC

orangeperson wrote:If you use an array, you have to have a limit on the array size, reallocate the array, or waste a lot of space with an oversized array.


Using a linked list you probably have to allocate or free memory every time you push or pop an element from the stack and that memory can end up highly fragmented. With an array you only perform a single allocation and your cache may be able to hold to top of the stack so you can perform several operations without even having to go out to your main memory.

Like EvanED said, it depends on "stuff" and which implementation works better for you really depends on how you expect it to be used.

btilly
Posts: 1877
Joined: Tue Nov 06, 2007 7:08 pm UTC

Re: Stacks and Queues

Postby btilly » Thu Feb 21, 2008 8:24 pm UTC

orangeperson wrote:Disclaimer: Everything I know about CS I learned from the internet, so I don't really know that much.
EvanED (Paraphrase) wrote:Arrays may be better than linked lists for implementing Stacks and Queues, depending on stuff.

I fail to see how arrays could possibly be better. To make a linked list Stack, you just need to store a pointer to the list. Take the car (this is the only term I know for that idea), store the cdr, and delete the cons to take something off the stack. If you use an array, you have to have a limit on the array size, reallocate the array, or waste a lot of space with an oversized array.

Look at it in more detail. To store 1000 pieces of data on the linked list stack you need 1000 structs, each of which has both a pointer and a piece of data. Plus as you add/remove elements you have to manage allocating/deallocating memory. (Which takes work and may impose overhead of its own.)

To store the same data on the array you don't need to have those 1000 pointers. And as you add/remove elements you don't have to allocate/deallocate memory unless the whole array needs to be moved. And if you follow an exponential allocation process (that's one where every time you need to move the array you allocate a fixed fraction extra) then the amortized average cost of those allocations for growing a large dataset turns out to be constant per element added to the stack. And the extra memory at the end of the array is itself bounded by a constant times the number of things in the stack. Those costs are at the least comparable to (and sometimes less than) the cost of memory allocation with a stack.

You have to get into a very detailed analysis, but to within a constant factor (which could go either way depending on the exact implementation) the array implementation uses the same memory overhead as the linked list, allocation takes the same overhead (if you're adding, removing, adding, etc then the array probably wins), and deallocation is faster.

The array clearly loses if you will occasionally have the stack swell and you need to reclaim memory after it goes down, or if you really need each individual insert to take a bounded amount of time (some inserts into the array will cause an expensive reallocation).

In practice these cases are fairly rare, and the array implementation is just fine (and possibly better) than the linked list. Which is why most scripting languages (from Perl on) just offer arrays and then offer fast array operations (shift, unshift, push, pop) that can be used to emulate both a stack and a queue. In those languages the array approach is almost always better simply because using native data structures is faster than manipulating data structures with high level operations.
Some of us exist to find out what can and can't be done.

Others exist to hold the beer.

User avatar
Berengal
Superabacus Mystic of the First Rank
Posts: 2707
Joined: Thu May 24, 2007 5:51 am UTC
Location: Bergen, Norway
Contact:

Re: Stacks and Queues

Postby Berengal » Fri Feb 22, 2008 3:00 am UTC

Speaking of scripting languages, stacks and queues, I feel I have to share my recent discovery in python:

Code: Select all

def performActionOnEveryElementAndItsSubElementsInAListUsingAFunctionWithAReallyLongNameBecauseCommentsCauseCancer(list, stack == true):
  if stack:
    pop = list.pop
  else:
    pop = lambda x = 0: list.pop(x)
  while list:
    element = pop()
    list.extend(element.subelements)
    action(element)


The thought process was:
- "So a stack is neat for traversing trees in a non-recursive manner, but so is a que. I should try implementing both, to prove to myself I can."
- "Neat, python lists have a pop method! I can use this for both!"
- "So pop() pops the last, pop(0) pops the first... okay, but I can't go if stack: pop = list.pop; else: pop = list.pop(0)"
- "Extra variable? Works, but I still think I could do it better."
- "Lambdas!"

These days I'm programming mostly in java because that's required for my course. This has made me love python's "functions as first-class objects" deal, and it's lambda operator even more. This little discovery did nothing to lessen my love for those.
It is practically impossible to teach good programming to students who are motivated by money: As potential programmers they are mentally mutilated beyond hope of regeneration.

orangeperson
Posts: 117
Joined: Sun May 13, 2007 2:17 am UTC

Re: Stacks and Queues

Postby orangeperson » Fri Feb 22, 2008 3:43 am UTC

So, if your stack length fluctuates and you need to reclaim the space, or the architecture has better support for them, use linked lists. Sounds OK. Array queues seem a bit more complex though, because of this ring buffer that was mentioned. This is what came to mind:

Code: Select all

queue = array of length we know we won't go over
firstElementPos=0
lastElementPos=0
enqueue(element):
   if lastElementPos == queue.length: lastElementPos = 1
   else lastElementPos++
   queue[lastElementPos] = element
dequeue:
   element = queue[firstElementPos]
   if firstElementPos == queue.length: firstElementPos = 1
   else firstElementPos++

Of course it's more complicated, because we might dequeue an empty list. And if we don't have a limit on the queue length, I see how it could get really messy.

Berengal: Why can't you just use list.pop()? you'll still get through all of the elements in the list. Also, the subelements property seems useful, but I couldn't find anything about it.
spjork.

akashra
Posts: 503
Joined: Tue Jan 01, 2008 6:54 am UTC
Location: Melbourne, AU
Contact:

Re: Stacks and Queues

Postby akashra » Fri Feb 22, 2008 4:02 am UTC

Stacks are incredibly useful for things like RPN and building trees. While there's other ways to do it, I'd argue the most efficient way to convert a sorted List into a Binary Tree is using a Stack.
It's also the best way to build graphical representation structures for a number of things (just off the top of my head: Elimination brackets for sports). They're also going to get used if you process recursive rules matching those rules against properties.
You may also want to use them when you're splitting string data such as say lap times in races where the number of laps is variable, and each line contains various data.

Queues are, well... they're just damn efficient. When you think Queue, think LinkedList.
( find / -name \*base\* -exec chown us : us { } \ ; )

User avatar
0xDEADBEEF
Posts: 284
Joined: Wed Jan 30, 2008 6:05 am UTC
Location: Austin, TX
Contact:

Re: Stacks and Queues

Postby 0xDEADBEEF » Fri Feb 22, 2008 4:40 am UTC

Newer languages either have some kind of vector template, or dynamic list, that can function easily and efficiently as a stack or queue.

However, if you ever find yourself coding in C, or maybe maintaining a C++ program written without the benefit of the Standard Template Library, you'll learn to appreciate the ability to dynamically allocate a new node, put it at the "correct" end of the list, and de-allocate the memory when you're done with it. It's harder to code, but the performance is excellent.

If you're learning how to do that manually in Java, you'll find C++ very similar. Believe me, you're learning how to operate a valuable tool, even if they haven't taught you when to really use said tool just yet. The time will come.

Posi
Posts: 111
Joined: Mon Jul 16, 2007 6:08 am UTC

Re: Stacks and Queues

Postby Posi » Fri Feb 22, 2008 5:44 am UTC

thePowerworks wrote:I understand that basis of big oh. The problem came in when it was time to estimate running times. I forgot most of the rules that we learned in class.

And I think it was a bit unfair of him to put 15 big oh questions on a 25 problem test that was supposed to cover what we learned in the first 1/4 of the semester (we learned a lot more than big oh).

He just knew that would screw most of us over, I think the class average was around 42.
That is about how much of my exams where for big oh. For me at least, big oh was never covered in any of the assignments (but really, how could it they were all coding), so it was made up with increased emphasis on the exam. The prof did dedicate the two lectures before an exam to big oh analysis however.

Also, how bad is 42 at your school? While a class average of 42 would be rather bad at my school, it is nothing terrible. My school also scales nearly ever test.

btilly
Posts: 1877
Joined: Tue Nov 06, 2007 7:08 pm UTC

Re: Stacks and Queues

Postby btilly » Fri Feb 22, 2008 7:23 am UTC

orangeperson wrote:So, if your stack length fluctuates and you need to reclaim the space, or the architecture has better support for them, use linked lists. Sounds OK. Array queues seem a bit more complex though, because of this ring buffer that was mentioned. This is what came to mind:

Array queues are no more complicated to implement than array stacks. Just have a regular array. When the array threatens to go off of the bounds, allocate a new array with some buffer room and copy it over. Wash, rinse and repeat. If you're using a high level dynamic language then all of the bookkeeping is already done for you, just use push to add elements and shift to take them off. (For technical reasons in Perl it is somewhat slower to use unshift/pop instead, though it will work. I suspect that few other languages have the algorithm trick I gave Perl, so unshift/pop is likely far slower than push/shift.)

What gets complicated is trying to implement an array queue efficiently. However the naive implemention is, within a constant factor of the best possible queue implementation. (Not, admittedly, with a great constant...) Therefore if you're working in a scripting language like JavaScript where the naive array solution is easy to implement and anything else is difficult, you'd be strongly advised to implement it that way. Otherwise it just depends on how much work you want to do.

This reminds me of a study that I saw many years ago. This group asked a number of people to solve a problem in any language they wanted. They were asked to record how long it took to solve it, and to send in the solution. What they found is that users of dynamic languages (Perl, Python, Lisp, etc) consistently quickly came up with solutions that were very short, did what was asked, and performed reasonably well. Users of languages like C and C++ inevitably took longer and generally came up with much longer solutions that were, on average, slower than the high-level scripting languages. But some of the solutions in C were much, much faster than any solution in the scripting language.

You can draw many lessons from this study. The one that I draw is that unless you really, really need the extra speed, you should stick to more productive languages. And if you really, really need the extra speed, odds are that you're going to shoot yourself in the foot and still not get it.
Some of us exist to find out what can and can't be done.

Others exist to hold the beer.

User avatar
Berengal
Superabacus Mystic of the First Rank
Posts: 2707
Joined: Thu May 24, 2007 5:51 am UTC
Location: Bergen, Norway
Contact:

Re: Stacks and Queues

Postby Berengal » Fri Feb 22, 2008 6:17 pm UTC

orangeperson wrote:Berengal: Why can't you just use list.pop()? you'll still get through all of the elements in the list. Also, the subelements property seems useful, but I couldn't find anything about it.

Because list.pop() returns the last element in the list. This means that the second element to be evaluated is the last element of the last element in the original list's list of subelements, also known as depth-first when traversing trees. To implement breadth-first as an option I needed the local pop variable to be one of two functions, popping either the first or the last element of the list, and to remove lots of "if stack: pop(); else: pop(0)" from the function it needed to take the same amount of arguements. By using lambda : list.pop(0) (I think that's still valid python, removing the argument thingamajig), I can set pop to either list.pop, making it pop the last, or the lambda function, making it pop the first, with both functions taking no arguements.

@btilly: I've been trying to tell my brother this for a year now, but he's constantly implementing algorithms running 10x as fast as mine while only using 2x the amount of time (if even that much) :P
He has this knack of seeing the best way to implement something really easy. Often when I tell him something like "So I need to take that double, convert it to an int, run it through a regexp, and multiply the results with the original double's mantissa" and he'll answer with something like "No you don't. Just cast the double* to an int*, bitshift right twice, xor with 0x54AF0436 and wave your magic wand."
It is practically impossible to teach good programming to students who are motivated by money: As potential programmers they are mentally mutilated beyond hope of regeneration.

User avatar
thePowerworks
Posts: 71
Joined: Wed Feb 20, 2008 5:40 am UTC

Re: Stacks and Queues

Postby thePowerworks » Fri Feb 22, 2008 6:50 pm UTC

Posi wrote:
thePowerworks wrote:Also, how bad is 42 at your school? While a class average of 42 would be rather bad at my school, it is nothing terrible. My school also scales nearly ever test.


It's not that uncommon. But he does not scale at all. So that makes it a bit more disturbing.

I like him, but he does suck at teaching.

btilly
Posts: 1877
Joined: Tue Nov 06, 2007 7:08 pm UTC

Re: Stacks and Queues

Postby btilly » Fri Feb 22, 2008 7:57 pm UTC

Berengal wrote:@btilly: I've been trying to tell my brother this for a year now, but he's constantly implementing algorithms running 10x as fast as mine while only using 2x the amount of time (if even that much) :P
He has this knack of seeing the best way to implement something really easy. Often when I tell him something like "So I need to take that double, convert it to an int, run it through a regexp, and multiply the results with the original double's mantissa" and he'll answer with something like "No you don't. Just cast the double* to an int*, bitshift right twice, xor with 0x54AF0436 and wave your magic wand."

That's a good skill and it is wonderful when programming in the small.

It won't scale to programming a large piece of software. Instead you should write your software in a straightforward fashion, benchmark, find the critical routines, and then rewrite them as quickly as possible.

That kind of micro-optimization also won't give the same kind of wins that the right algorithm gives over the wrong one. Let me give a simple example for him to play with. (This comes from an actual business problem that came up for a friend of mine when dynamically laying out a web page.) You have a collection of articles that you want to group into a fixed number of columns n. Each article has a size > 0, the order in which the articles will appear has been specified, and you wish to make the longest column be as short as possible. (Note: the fact that the order has been specified makes this not be the knacksack problem.) Write a function that takes the number of columns followed by an array of sizes of the articles and returns an array listing the number of articles to put into each column.

So, for example, column_counts(3, [25, 10, 10, 10, 20]) might return [1, 3, 1]. Or [1, 2, 2]. But it could not return [2,2,1] because that last solution has the longest column of length 35 while the first two have none longer than 30.

Challenge him to come up with the fastest version he can. As motivation tell him that I have an unoptimized version written in Perl which can divide 10,000 articles of random length into 100 columns in under half a second.
Some of us exist to find out what can and can't be done.

Others exist to hold the beer.

User avatar
Berengal
Superabacus Mystic of the First Rank
Posts: 2707
Joined: Thu May 24, 2007 5:51 am UTC
Location: Bergen, Norway
Contact:

Re: Stacks and Queues

Postby Berengal » Fri Feb 22, 2008 10:55 pm UTC

I just challenged my brother. After I had explained what he had to do, he said "Hmmm. I can think of several algorithms to do this, but the fast ones are hard to implement, and not good, while the good ones are slow." to which I replied "you have to sort 10000 articles of random length into 100 coloumns in less than half a second." His reply? "Slow and clunky algorithm it is then."
It is practically impossible to teach good programming to students who are motivated by money: As potential programmers they are mentally mutilated beyond hope of regeneration.

btilly
Posts: 1877
Joined: Tue Nov 06, 2007 7:08 pm UTC

Re: Stacks and Queues

Postby btilly » Fri Feb 22, 2008 11:21 pm UTC

Berengal wrote:I just challenged my brother. After I had explained what he had to do, he said "Hmmm. I can think of several algorithms to do this, but the fast ones are hard to implement, and not good, while the good ones are slow." to which I replied "you have to sort 10000 articles of random length into 100 coloumns in less than half a second." His reply? "Slow and clunky algorithm it is then."

Please keep us updated on whether he actually succeeds.
Some of us exist to find out what can and can't be done.

Others exist to hold the beer.

User avatar
Aperfectring
Posts: 252
Joined: Fri Sep 07, 2007 3:47 am UTC
Location: Oregon (happily)

Re: Stacks and Queues

Postby Aperfectring » Mon Feb 25, 2008 12:58 am UTC

While premature optimization is bad, you can't completely ignore optimization while writing code.

Any programmer worth their weight will know about, and how to deal with, linked lists. They may be rusty on them, but the will still know about them and when they could be beneficial. Because of this, an algorithm for stacks and queues will generally be much clearer to programmers in general if written with a linked list.

However, if you know that you are in a size-constrained, and performance critical situation, you shouldn't start with a linked list. My argument with the statement that premature optimization is bad, is that bad programmers interpret that as an excuse not to understand the situation.

First try to understand the situation the code will be used in first, then choose an algorithm that seems appropriate, then optimize the algorithm if necessary.
Odds are I did well on my probability exam.

btilly
Posts: 1877
Joined: Tue Nov 06, 2007 7:08 pm UTC

Re: Stacks and Queues

Postby btilly » Mon Feb 25, 2008 7:19 am UTC

Aperfectring wrote:While premature optimization is bad, you can't completely ignore optimization while writing code.

Any programmer worth their weight will know about, and how to deal with, linked lists. They may be rusty on them, but the will still know about them and when they could be beneficial. Because of this, an algorithm for stacks and queues will generally be much clearer to programmers in general if written with a linked list.

While this advice may be good, it is highly language specific. For C programmers? Yes. For Perl programmers? No, there your default implementation is to use an array. (It is also the most efficient possible implementation in pure Perl.) For Java programmers? No, they expect to find it already built, and both stacks and queues are in java.util.

The lesson here is that when you argue based on expectations, you need to be very aware of how expectations vary, and what the right expectations are within your programming community.

That said, good programmers generally know several languages, and C is a common one that people learn. So the linked list implementation
Aperfectring wrote:However, if you know that you are in a size-constrained, and performance critical situation, you shouldn't start with a linked list. My argument with the statement that premature optimization is bad, is that bad programmers interpret that as an excuse not to understand the situation.

The key is the word "premature". If you know up front that you need to optimize, then optimizing is hardly premature. But for every case where someone properly starts optimizing up front, there are dozens where programmers start micro-optimizing when they don't need to and get into trouble as a result.

As to your point, my opinion is that bad programmers find excuses to not understand the situation because they don't have the mental capacity and/or skills to understand the situation. But having them think about optimization on top of everything else doesn't help them understand. And since most programs spend most of their time in a small fraction of the code, it is reasonable to tell programmers to write clear code all along, and then have one or two good ones benchmark the code, find the hot spots, and figure out how to optimize them. Sure, you don't get perfect that way. But you generally get "good enough".
Aperfectring wrote:First try to understand the situation the code will be used in first, then choose an algorithm that seems appropriate, then optimize the algorithm if necessary.

Yup.

Berengal wrote:I just challenged my brother. After I had explained what he had to do, he said "Hmmm. I can think of several algorithms to do this, but the fast ones are hard to implement, and not good, while the good ones are slow." to which I replied "you have to sort 10000 articles of random length into 100 coloumns in less than half a second." His reply? "Slow and clunky algorithm it is then."

Update request. Did he succeed yet?
Some of us exist to find out what can and can't be done.

Others exist to hold the beer.

User avatar
Berengal
Superabacus Mystic of the First Rank
Posts: 2707
Joined: Thu May 24, 2007 5:51 am UTC
Location: Bergen, Norway
Contact:

Re: Stacks and Queues

Postby Berengal » Mon Feb 25, 2008 12:56 pm UTC

No, he's been too busy trying to implement twisted's "multithreading" in c, or so he says.

Myself I've been thinking about how to do it in python, but haven't come up with a bullet-proof algorithm yet. I've been busy on an assignement as well, so I haven't had too much time to think about it, but the fact that I can't see a clear way after a few days is starting to bug me. Any hints? (Small ones, don't drop the entire secret).
It is practically impossible to teach good programming to students who are motivated by money: As potential programmers they are mentally mutilated beyond hope of regeneration.

btilly
Posts: 1877
Joined: Tue Nov 06, 2007 7:08 pm UTC

Re: Stacks and Queues

Postby btilly » Mon Feb 25, 2008 3:54 pm UTC

Berengal wrote:No, he's been too busy trying to implement twisted's "multithreading" in c, or so he says.

Myself I've been thinking about how to do it in python, but haven't come up with a bullet-proof algorithm yet. I've been busy on an assignement as well, so I haven't had too much time to think about it, but the fact that I can't see a clear way after a few days is starting to bug me. Any hints? (Small ones, don't drop the entire secret).

Sorry, I can't figure out a way to drop a hint without making the efficient solution obvious. (Albeit somewhat tricky.)

However it can be challenging to come up with a version that will divide, say, 150 random articles into 10 buckets in reasonable time on a PC. That you can do armed with common sense and knowledge of a few types of optimization techniques. (eg memoization)
Some of us exist to find out what can and can't be done.

Others exist to hold the beer.

qbg
Posts: 586
Joined: Tue Dec 18, 2007 3:37 pm UTC

Re: Stacks and Queues

Postby qbg » Mon Feb 25, 2008 9:48 pm UTC

Berengal wrote:Myself I've been thinking about how to do it in python, but haven't come up with a bullet-proof algorithm yet. I've been busy on an assignement as well, so I haven't had too much time to think about it, but the fact that I can't see a clear way after a few days is starting to bug me. Any hints? (Small ones, don't drop the entire secret).

I coded up the first (and probably wrong) algorithm that came to mind:
Spoiler:

Code: Select all

(defun fit (cols sizes)
  (let* ((total (reduce #'+ sizes))
    (avg (/ total cols))
    (size 0)
    (result (list 0)))
    (if (<= (length sizes) cols)
   (return-from fit (append sizes (loop repeat (- cols (length sizes)) collect 0)))
   (progn
     (dolist (art-size sizes)
       (cond
         ((<= (+ art-size size) avg)
          (incf (car result))
          (incf size art-size))
         ((> (- art-size (- avg size)) (/ art-size 2))
          (decf cols)
          (decf total size)
          (push 1 result)
          (setf size art-size)
          (setf avg (/ total cols)))
         (t
          (incf (car result))
          (push 0 result)
          (decf cols)
          (decf total (+ art-size size))
          (setf size 0)
          (setf avg (/ total cols)))))
     (nreverse result)))))

It takes in a number and a list and returns a list:
(fit 3 '(25 10 10 10 20)) => (1 3 1)

It can fit 10000 articles into 100 columns on my machine in 5-6 milliseconds. (another sign that it is probably wrong)

User avatar
e946
Posts: 621
Joined: Wed Jul 11, 2007 6:32 am UTC

Re: Stacks and Queues

Postby e946 » Mon Feb 25, 2008 10:07 pm UTC

Some thoughts on my attempt at this.

Spoiler:
Apparently, recursion isn't the way to go. With some optimizations like calculating a minimum and maximum value for each column I got it to finish in under a second for up to about 7 columns and 100 articles but it sucks after that. Even if you get it down to 2 choices (like choosing between using 15 or 16 articles, etc) per column, it should take something like 2^100 iterations or several hundred thousand times the current age of the universe to find the best one.

The only problem is the only other way I can think of doing this would run in polynomial time but would be vulnerable to falling into local minimums, so I don't know where to go.

btilly
Posts: 1877
Joined: Tue Nov 06, 2007 7:08 pm UTC

Re: Stacks and Queues

Postby btilly » Mon Feb 25, 2008 10:09 pm UTC

I don't have scheme installed so I can't test that directly. But for a test, make the sizes of the articles be 1, 2, ... , 100 and try to put them into 10 columns.

The answer that you should get is [32, 13, 10, 9, 7, 7, 6, 6, 5, 5].
Some of us exist to find out what can and can't be done.

Others exist to hold the beer.

User avatar
Cosmologicon
Posts: 1806
Joined: Sat Nov 25, 2006 9:47 am UTC
Location: Cambridge MA USA
Contact:

Re: Stacks and Queues

Postby Cosmologicon » Mon Feb 25, 2008 11:46 pm UTC

Are you sure it's not easy to do this problem naively? I tried doing it as naively as I could in C++, and it only takes about 0.3 seconds to bin n = 1,000,000 random numbers of average size k = 500 into m = 1000 bins. It takes 0.9 seconds to read in the numbers, and a total of 1.2 seconds running time. If I understand the notation correctly, this algorithm should take O(n lg nk) operations. (It may be identical to qbg's algorithm; I don't know Scheme.)

btilly
Posts: 1877
Joined: Tue Nov 06, 2007 7:08 pm UTC

Re: Stacks and Queues

Postby btilly » Tue Feb 26, 2008 12:07 am UTC

Cosmologicon wrote:Are you sure it's not easy to do this problem naively? I tried doing it as naively as I could in C++, and it only takes about 0.3 seconds to bin n = 1,000,000 random numbers of average size k = 500 into m = 1000 bins. It takes 0.9 seconds to read in the numbers, and a total of 1.2 seconds running time. If I understand the notation correctly, this algorithm should take O(n lg nk) operations. (It may be identical to qbg's algorithm; I don't know Scheme.)

I can confirm that if you get the right idea, there is a straightforward solution. I suspect that you've got that idea, but not expressed it in the most general way. If you express that idea carefully, then k should not enter into the running time. In fact you can make the sizes be floating point instead of integers and not change the efficiency at all.

Here is that idea.

Spoiler:
Find the smallest possible maximum column length with at most m columns.
Some of us exist to find out what can and can't be done.

Others exist to hold the beer.

User avatar
Cosmologicon
Posts: 1806
Joined: Sat Nov 25, 2006 9:47 am UTC
Location: Cambridge MA USA
Contact:

Re: Stacks and Queues

Postby Cosmologicon » Tue Feb 26, 2008 12:49 am UTC

Yeah, I guess that's what I did....
Spoiler:
I'm not sure why k shouldn't enter into the running time. It's no big deal, but I'm just wondering if I missed something big. The initial lower bound I put on the bin size is max(k, max(a)) where max(a) is the maximum article length. The initial upper bound on the bin size is nk. So the range that I needed to search is O(nk) in size, with n operations to test whether a given bin size is viable. And I thought that doing a binary search on a range of size N is O(lg(N)). It seems to me that if they're floats, it'll take O(lg(nk/e)) checks, where e is the precision you want on the bin size.

User avatar
Berengal
Superabacus Mystic of the First Rank
Posts: 2707
Joined: Thu May 24, 2007 5:51 am UTC
Location: Bergen, Norway
Contact:

Re: Stacks and Queues

Postby Berengal » Tue Feb 26, 2008 2:03 am UTC

So I had some time and tried to code one of the ideas I had. It sort of works, but not quite:

Spoiler:
The idea is that the best solution is the one where all columns are of equal length. Since that's usually not possible however, you have to give the program some leeway. That's where I screwed up, I think. I'll have another go at it when I've got more time. I think I can sort out the bugs eventually:

Code: Select all

def columnSort(nrColumns, articleList):
    totalArticleLength = sum(articleList)
    averageColumnLength = totalArticleLength / nrColumns
    print 'Average column length is: %d' % averageColumnLength # prints for debugging
    averageArticleLength = totalArticleLength / len(articleList)
    print 'Average article length is: %d' % averageArticleLength
    sortedColumns = [[] for x in xrange(nrColumns)]
    for column in sortedColumns:
        columnLength = sum(column)
        while columnLength < averageColumnLength and articleList:
            if columnLength + articleList[0] <= averageColumnLength:
                print 'Adding article %d' % articleList[0]
                column.append(articleList.pop(0))
            elif articleList[0] < averageArticleLength:
                print 'Adding article %d' % articleList[0]
                column.append(articleList.pop(0))
            else:
                columnLength = sum(column)
                break
            columnLength = sum(column)
    return [len(column) for column in sortedColumns]

And the printout:

Code: Select all

>>> columnSort(10, range(1,101))
Average column length is: 505
Average article length is: 50
Adding article 1
Adding article 2
Adding article 3
Adding article 4
...
Adding article 95
Adding article 96
Adding article 97
[32, 13, 10, 8, 7, 6, 6, 5, 5, 5]
>>>
It is practically impossible to teach good programming to students who are motivated by money: As potential programmers they are mentally mutilated beyond hope of regeneration.

qbg
Posts: 586
Joined: Tue Dec 18, 2007 3:37 pm UTC

Re: Stacks and Queues

Postby qbg » Tue Feb 26, 2008 2:25 am UTC

btilly wrote:I don't have scheme installed so I can't test that directly. But for a test, make the sizes of the articles be 1, 2, ... , 100 and try to put them into 10 columns.

The answer that you should get is [32, 13, 10, 9, 7, 7, 6, 6, 5, 5].

1) Your numbers add up to 99, not 100
2) It's not Scheme, but Common Lisp:
CL-USER> *s*
(1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55
56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81
82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100)
CL-USER> (fit 10 *s*)
(31 13 10 9 8 7 6 6 5 5)

Also, I think my algorithm is O(n)

btilly
Posts: 1877
Joined: Tue Nov 06, 2007 7:08 pm UTC

Re: Stacks and Queues

Postby btilly » Tue Feb 26, 2008 3:07 am UTC

qbg wrote:
btilly wrote:I don't have scheme installed so I can't test that directly. But for a test, make the sizes of the articles be 1, 2, ... , 100 and try to put them into 10 columns.

The answer that you should get is [32, 13, 10, 9, 7, 7, 6, 6, 5, 5].

1) Your numbers add up to 99, not 100

Add them again. 32+13 = 45. 35+10 = 55. 45+9 = 64. 54+7 = 71. 61+7 = 78. 68+6 = 84. 74+6 = 90, 90+5 = 95 and 95+5 = 100.
qbg wrote:2) It's not Scheme, but Common Lisp:
CL-USER> *s*
(1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55
56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81
82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100)
CL-USER> (fit 10 *s*)
(31 13 10 9 8 7 6 6 5 5)

Also, I think my algorithm is O(n)

While it is different than mine, your answer is correct.

I don't have common lisp on this machine either. (Rummages around, installs.) OK, I sorted 1..100 in alphabetical order. Your solution gives:

> (fit 10 (list 1 10 100 11 12 13 14 15 16 17 18 19 2 20 21 22 23 24 25 26 27 28 29 3 30 31 32 33 34 35 36 37 38 39 4 40 41 42 43 44 45 46 47 48 49 5 50 51 52 53 54 55 56 57 58 59 6 60 61 62 63 64 65 66 67 68 69 7 70 71 72 73 74 75 76 77 78 79 8 80 81 82 83 84 85 86 87 88 89 9 90 91 92 93 94 95 96 97 98 99
))
(24 15 11 10 8 7 7 6 7 5)

A correct solution is 25 15 11 10 8 7 7 7 5 5. Your solution has the 9th column of length 558. The correct solution never exceeds 537. Therefore your solution is wrong.

Cosmologicon wrote:Yeah, I guess that's what I did....
Spoiler:
I'm not sure why k shouldn't enter into the running time. It's no big deal, but I'm just wondering if I missed something big. The initial lower bound I put on the bin size is max(k, max(a)) where max(a) is the maximum article length. The initial upper bound on the bin size is nk. So the range that I needed to search is O(nk) in size, with n operations to test whether a given bin size is viable. And I thought that doing a binary search on a range of size N is O(lg(N)). It seems to me that if they're floats, it'll take O(lg(nk/e)) checks, where e is the precision you want on the bin size.

Here is the trick you needed.
Spoiler:
When you test a particular upper bound of the bin size, it is easy to keep track of what range of bin sizes would result in every decision being the same, and therefore result in the same solution. This allows you to "walk" the solutions together more quickly and get an exact solution in fewer steps. In the case of floating point it also gives you as exact a solution as possible without having to think carefully about precision and the internal representation.
Some of us exist to find out what can and can't be done.

Others exist to hold the beer.

qbg
Posts: 586
Joined: Tue Dec 18, 2007 3:37 pm UTC

Re: Stacks and Queues

Postby qbg » Tue Feb 26, 2008 5:00 am UTC

Updated code that produces better answers:
Spoiler:

Code: Select all

(defun fit (cols sizes)
  (labels ((fit-help (fixed-size)
        (let ((cols-done 0)
         (size 0)
         min-increase
         (result (list 0)))
          (do ((art-sizes sizes (cdr art-sizes)))
         ((null art-sizes))
       (if (<= (+ size (car art-sizes)) fixed-size)
           (progn
             (incf (car result))
             (incf size (car art-sizes)))
           (progn
             (let ((mi2 (- (+ size (car art-sizes)) fixed-size)))
          (setf min-increase (if (null min-increase) mi2 (min min-increase mi2))))
             (push 1 result)
             (incf cols-done)
             (setf size (car art-sizes))
             (if (>= cols-done cols)
            (progn
              (return-from fit-help (fit-help (+ fixed-size min-increase))))))))
          (nreverse result))))
    (fit-help (/ (reduce #'+ sizes) cols))))


Only takes 10 milliseconds for the 10000 articles/100 columns problem

User avatar
e946
Posts: 621
Joined: Wed Jul 11, 2007 6:32 am UTC

Re: Stacks and Queues

Postby e946 » Tue Feb 26, 2008 5:43 am UTC

btilly wrote:A correct solution is 25 15 11 10 8 7 7 7 5 5. Your solution has the 9th column of length 558. The correct solution never exceeds 537. Therefore your solution is wrong.


Just curious, how'd you get 537? I've got one written up with a maximum column length of 540, and when I manually set the max length to 537 it can't fit them all in 10 columns. Unless you're referring to the alphabetically sorted version?

I rewrote mine to move away from recursively guessing how many should go in each. Instead,

Spoiler:
I calculate the maximum column length, although I have no idea why the function i'm using to get the max works, I just know it does.

Code: Select all

max_column_length=avg_column_length+avg_article_length-(longest_article/num_columns)


From here I just run through the list of articles. I see what the column's length would be if I added the article, and if it would be longer than the max, I move on to the next article. Either way, I then add the the article to the "currently selected column" and so on.


The only problem, as I referred to in my last post, is that the solution I come up with might just be "close" but not really the minimum and I'd have no way of knowing.


Return to “Coding”

Who is online

Users browsing this forum: No registered users and 8 guests