Coding: Fleeting Thoughts

A place to discuss the implementation and style of computer programs.

Moderators: phlip, Moderators General, Prelates

User avatar
Jplus
Posts: 1721
Joined: Wed Apr 21, 2010 12:29 pm UTC
Location: Netherlands

Re: Coding: Fleeting Thoughts

Postby Jplus » Mon Dec 03, 2012 11:54 am UTC

troyp wrote:
Jplus wrote:[1] Just like the type signature of a function in Haskell, the template declaration serves to provide information as well as to catch errors early on. [...]

well, okay, but the type info is already in the declaration line (or whatever you call it). Ideally, the parameter would be there. There's no reason to have the "Template ..." line: that's just C++ boilerplate.

To the contrary. The template declaration is part of the function declaration. C++ has a very good reason to do it like that: it encodes type information differently than Haskell does.
Spoiler:
Consider these type signatures in Haskell:

Code: Select all

qsort :: [a] -> [a]
qsort :: [Int] -> [Int]

If you know Haskell, you can tell that the first is a template function while the second is not. That's because Haskell has a rule that type variables start with a lowercase letter while concrete types and type constructors start with an uppercase. In C++ on the other hand concrete types may start with either case (the most common convention is to have library types start with a lowercase and user-defined types with an uppercase). So if you'd just leave out the template declaration of my C++ code, the parameter list suddenly means something entirely different:

Code: Select all

// template <class I>
void qsort (I begin, I end);

The compiler now must try to find some concrete type with the name "I". If you were to redesign the language you could do this:

Code: Select all

qsort (auto begin, auto end); // auto means "automatically infer the type"

But then you lose the power to specify that begin and end should be of the same type. Alternatively you could introduce additional keywords in the parameter list to indicate type variables:

Code: Select all

void qsort (typename I begin, typename I end);

But this approach suffers from a significant disadvantage compared to the current approach. The fact that the function is a template doesn't stand out anymore, for two reasons. The first is that the parameter list may contain several other kinds of additions, most of which serve to further specify a concrete type; e.g. "const", "volatile", "&" and even "template" and "typename" to disambiguate dependent names. So the reader of the code would have to be really attentive when reading the parameter list in order to find out whether each type is concrete or templated. The second reason is that you lose a very obvious tag that screams "I'M A TEMPLATE! LOOK WHAT PARAMETERS I'VE GOT!".
In addition, it doesn't even make the function declaration any shorter. You replace two short lines with a single long one, which is less readable.

By the way, in a sense Haskell also separates the type variables from the data variables. It's just so much more clean. Look at this, see what I mean?

Code: Select all

qsort :: (Ord a) => [a] -> [a]

troyp wrote:Sorry, "pointer declarations" was inaccurate. I meant the overhead of using pointers. I was also thinking of "auto", although presumably that's not only used when dereferencing pointers.

Wrong again. You seem to know less about C++ than I assumed. Those are not necessarily pointers, although they use pointer syntax; they are iterators. Iterators provide a common interface to containers so algorithms can do their stuff without worrying whether they're dealing with a std::vector, a std::deque, a raw array or something else. The Haskell equivalent would be a "!" function which you could universally apply to IArray, MArray and any user-defined array-like type constructor. Using iterators in C++ is not overhead, just like using the "!" function in Haskell is not overhead.
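To make that concrete, here's a trivial sketch (print_all is a made-up helper, not taken from the earlier code) of one template working unchanged on a vector, a deque and a raw array:

Code: Select all

#include <deque>
#include <iostream>
#include <vector>

// One algorithm, any container: all it needs is a pair of iterators.
template <class I>
void print_all (I begin, I end) {
    while (begin != end)
        std::cout << *begin++ << ' ';
    std::cout << '\n';
}

int main () {
    std::vector<int> v = {3, 1, 2};
    std::deque<int> d = {4, 5};
    int raw[] = {6, 7};
    print_all(v.begin(), v.end());
    print_all(d.begin(), d.end());
    print_all(raw, raw + 2); // raw pointers are iterators too
}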

troyp wrote:
[3] Admittedly that one is long-winded, but if that is of any concern you can use Boost.Phoenix and write it like "_1 < p" instead. C++ does not enforce long-winded code.
um...that's a pretty reasonable syntax. Can I ask why you didn't use it in the first place (and why everyone doesn't use it - always)? I suspect there are drawbacks to using it, but if there aren't...please, please never use that horrific bind syntax again!

I didn't use it in the first place because it's part of the Boost library rather than the standard library and I started out writing code that only relied on the standard library. It has no disadvantages other than not being in the standard library (then again, Boost is usually considered almost-standard). Actually I think it will be adopted into the C++ standard library at some point, because Bind was also adopted from Boost. Boost.Phoenix is newer. Neither Phoenix nor Bind is very well known among the general population of C++ programmers, by the way; that's a consequence of there being tons of C++ programmers.
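For comparison, here's a minimal sketch of the same predicate written both ways (count_less is a made-up example; it assumes Phoenix v3, i.e. Boost 1.47 or later):

Code: Select all

#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>
#include <boost/phoenix/core.hpp>
#include <boost/phoenix/operator.hpp>

long count_less (const std::vector<int>& v, int p) {
    // With std::bind you spell out the comparison function object...
    long a = std::count_if(v.begin(), v.end(),
            std::bind(std::less<int>(), std::placeholders::_1, p));
    // ...with Boost.Phoenix the expression "_1 < p" is the predicate.
    namespace phx = boost::phoenix;
    long b = std::count_if(v.begin(), v.end(), phx::placeholders::_1 < p);
    assert(a == b); // same result either way
    return a;
}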

troyp wrote:One way to see that there's stuff in the C++ code that's irrelevant to the algorithm is to just compare it to the equivalent Haskell. If something's missing from the Haskell version, it couldn't be essential to the algorithm.
I'm familiar with both languages. While I do find the Haskell version easy to read, I don't find it easier than the C++ version. The class of programs we're talking about is the class of programs that are very short because they use the most naive version of an algorithm. While you need to type slightly more in C++ to express these algorithms, I think it comes just as close to what is in your head. It's a different way to translate what's in your head, that's all.

The C++ version may be just as easy to understand (although I'm skeptical on whether that would scale to more complex code), but it's still going to take longer to actually read, and longer to scan, simply because it's more verbose and there are extraneous details to worry about. Also, the class of programs for which Haskell is extremely readable is not just toy showoff programs. It includes a large range of programs: basically, if you're dealing with numerical or algorithmic code which doesn't involve mutation or randomness, chances are the Haskell implementation will be particularly elegant.

There is nothing verbose about the C++ version (except for the choice to use longer, more speaking names), nor does it contain "extraneous details", not even compared to the Haskell version. The only difference is that C++ tends to be written as a sequence of short imperative statements while Haskell tends to be written as long lines of nested function applications, and that C++ requires all type information to be explicit (even automatic type deduction) where Haskell allows you to have the compiler figure it out. That you have the option to leave out type information in Haskell does not mean it isn't essential, in fact it very much is and usually it's better to not leave it out.

troyp wrote:
No, you want median-of-three quicksort because that's optimal for sorted and reversed ranges and yields a better pivot on average for all other ranges. The "flip the pivot to the start" thing was invented only for that class of very naive but easy to demonstrate programs.

Huh? There's nothing naive about flipping the pivot. It just simplifies the code in some algorithms (possibly at the cost of a tiny inefficiency). And while I'm a bit vague on quicksort, I'm pretty sure randomized pivots are better than median-of-3. The ideal pivot would be the true median, but that's impossible to determine efficiently, so random pivot selection provides a good-enough approximation on average. Median of first/middle/last is also an approximation, but it's obviously a flawed one. On pathological inputs you could be pivoting every time on the second smallest(/largest) element! Asymptotically, that's just as bad as flipping on the first element. Of course, the pathological input for median-of-3 is much less likely than for pivot-on-first (which I guess is the only reason this method is used). Still, random is better.

You are wrong, really. Allow me to explain.
Spoiler:
First off, random number generators, especially good ones, are complex beasts that take quite a lot of time and space to generate each successive number. Taking the median of the first, middle and last element is much more efficient.
Secondly, the median-of-three approach is superior probability-wise. Consider three distinct classes of inputs:
  • Sorted or reverse-sorted data. Median-of-three selects the optimal pivot, i.e. the middle, while a random pivot will on average be halfway between the middle and an extreme end.
  • Random data. Median-of-three effectively takes the median of three random samples, while a single random pivot isn't any better than just taking the first element (actually worse, because it spends more effort).
  • Artificially constructed median-of-three killer sequences (extremely unlikely unless on purpose). Random pivot selection is fairly robust against such an attack, but you can secure median-of-three selection against them as well by switching to another sorting algorithm at O(log n) recursion depth. The latter is called introsort, and it's the algorithm that implements std::sort in the C++ STL.
Taking these three classes into account, median-of-three performs much better than random pivot selection on average.
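For concreteness, here's one way median3 could look (just a sketch; any three-comparison version will do):

Code: Select all

// Median of three values using at most three comparisons;
// only requires operator< on T.
template <class T>
const T& median3 (const T& a, const T& b, const T& c) {
    if (a < b)
        return b < c ? b : (a < c ? c : a);
    else
        return a < c ? a : (b < c ? c : b);
}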

troyp wrote:
Here's complete median-of-three quicksort in C++ as derived from my previous implementation. I invite you to write a Haskell equivalent where choosing the pivot takes only constant time (i.e. it should operate on arrays). You may assume median3 to be already implemented, as I did.

(Complete? You seem to be missing a partition routine ;-) )

Very cheeky, but std::partition is available from <algorithm> just like Haskell's partition algorithm is available from Data.List. You were the one to introduce it.
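To show how std::partition slots in, here's a minimal sketch of a median-of-three quicksort built on it (not my original implementation, just an illustration; it assumes the median3 sketched above):

Code: Select all

#include <algorithm>
#include <iterator>

template <class T>
const T& median3 (const T& a, const T& b, const T& c); // as sketched above

template <class I> // I must be a random access iterator
void quicksort (I begin, I end) {
    typedef typename std::iterator_traits<I>::value_type T;
    while (end - begin > 1) {
        T pivot = median3(*begin, *(begin + (end - begin) / 2), *(end - 1));
        // Three-way split: [begin, mid1) < pivot, [mid1, mid2) == pivot,
        // [mid2, end) > pivot, so duplicates can't cause infinite recursion.
        I mid1 = std::partition(begin, end,
                                [&](const T& x) { return x < pivot; });
        I mid2 = std::partition(mid1, end,
                                [&](const T& x) { return !(pivot < x); });
        quicksort(begin, mid1);
        begin = mid2; // loop instead of recursing on the right part
    }
}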

troyp wrote:I'm really not familiar with mutable arrays in Haskell. I couldn't write an inplace sort offhand. But I have seen inplace procedures written as recursive do-blocks in Haskell and it really wasn't that bad. I wouldn't say it was particularly good, either, but reasonable (not worse than C++).

Perhaps it's not much worse, but I assure you it certainly isn't any better. I've seen Haskell translations of imperative algorithms and they were uglier than the original. I've been hinting at this all along; you are all ecstatic about Haskell elegance, but as soon as we start doing real-world things such as mutating arrays in-place it turns ugly. Haskell looks more natural when you stay in the Platonic dream world but C++ looks more natural for real stuff. In either case the difference is marginal, as C++ has Boost.Phoenix and Haskell has monads.

(Let me emphasise that I don't mean to suggest that C++ is better; I just coloured my formulation to show that I'm an Aristotelian.)

troyp wrote:
You only demonstrated for fmap and foldr. I don't see how this would work for qsort.

If you want indexes, I'm not sure if there's a standard class, but it's trivial to write:

Code: Select all

{-# LANGUAGE FlexibleInstances #-}  -- needed for the (Array Integer) instance
import Data.Array (Array, bounds, (!))

class Indexable a where
    (!*) :: a t -> Integer -> t
instance Indexable [] where
    xs !* n = xs !! (fromInteger n)
instance Indexable (Array Integer) where
    a !* n = let (beg, _) = bounds a
             in a ! (beg + n)

Thank you for the attempt, but this is not enough for implementing quicksort. You also need a way to write to an index (not necessarily in-place). There is nothing in the Haskell hierarchical libraries that meets the requirements; I tell you it's because it would be very non-trivial. Otherwise it would already be there, preventing a lot of code duplication. Note how every container type constructor in the hierarchical libraries has its own sorting function...

troyp wrote:If you mean in an in-place implementation of quicksort, lists and normal arrays are immutable, so you couldn't use them anyway. I think there's a type class in Data.Array that serves as an interface for all mutable array types, but I don't know the details (like I said, I don't know mutable arrays). Of course, you could still make the quicksort function work on lists and arrays: you'd just have to convert to and from a mutable array to do the sorting.

You are talking about MArray. Converting to another container is not a solution, because you still need a separate conversion function for each container type constructor. Face it: C++ offers a more general interface for this kind of thing than Haskell. That's not something for Haskell to be ashamed of, because C++ and Haskell were designed with different purposes in mind. Still, it's ridiculous to claim that Haskell provides a more elegant language for implementing quicksort than C++.
"There are only two hard problems in computer science: cache coherence, naming things, and off-by-one errors." (Phil Karlton and Leon Bambrick)

coding and xkcd combined

(Julian/Julian's)

EvanED
Posts: 4331
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI
Contact:

Re: Coding: Fleeting Thoughts

Postby EvanED » Mon Dec 03, 2012 3:32 pm UTC

Jplus wrote:
troyp wrote:
Jplus wrote:[1] Just like the type signature of a function in Haskell, the template declaration serves to provide information as well as to catch errors early on. [...]

well, okay, but the type info is already in the declaration line (or whatever you call it). Ideally, the parameter would be there. There's no reason to have the "Template ..." line: that's just C++ boilerplate.

To the contrary. The template declaration is part of the function declaration. C++ has a very good reason to do it like that: it encodes type information differently than Haskell does.
Spoiler:
Consider these type signatures in Haskell:

Code: Select all

qsort :: [a] -> [a]
qsort :: [Int] -> [Int]

If you know Haskell, you can tell that the first is a template function while the second is not. That's because Haskell has a rule that type variables start with a lowercase letter while concrete types and type constructors start with an uppercase. In C++ on the other hand concrete types may start with either case (the most common convention is to have library types start with a lowercase and user-defined types with an uppercase). So if you'd just leave out the template declaration of my C++ code, the parameter list suddenly means something entirely different:

Code: Select all

// template <class I>
void qsort (I begin, I end);

The compiler now must try to find some concrete type with the name "I". If you were to redesign the language you could do this:

Code: Select all

qsort (auto begin, auto end); // auto means "automatically infer the type"

But then you lose the power to specify that begin and end should be of the same type. Alternatively you could introduce additional keywords in the parameter list to indicate type variables:

Code: Select all

void qsort (typename I begin, typename I end);

But this approach suffers from a significant disadvantage compared to the current approach. The fact that the function is a template doesn't stand out anymore, for two reasons. The first is that the parameter list may contain several other kinds of additions, most of which serve to further specify a concrete type; e.g. "const", "volatile", "&" and even "template" and "typename" to disambiguate dependent names. So the reader of the code would have to be really attentive when reading the parameter list in order to find out whether each type is concrete or templated. The second reason is that you lose a very obvious tag that screams "I'M A TEMPLATE! LOOK WHAT PARAMETERS I'VE GOT!".
In addition, it doesn't even make the function declaration any shorter. You replace two short lines with a single long one, which is less readable.

By the way, in a sense Haskell also separates the type variables from the data variables. It's just so much more clean. Look at this, see what I mean?

Code: Select all

qsort :: (Ord a) => [a] -> [a]

I feel like this part of the conversation just went:

troyp: "C++ is so verbose; look at all this extra concrete syntax you need to mark templates"
JPlus: "Nuh uh! Here's why it's verbose!"

:-)

BTW, I think there's a pretty good syntax for doing inline template definitions where the template parameters stand out -- void foo(<T> param1, <U> & param2).

But you also missed a couple big reasons to not do this. The first is that you "couldn't" do this for classes, and having to use template<...> everywhere makes things more consistent between class & function templates (not that the standards committee apparently cares much about consistency). The second is that it's occasionally useful to have template parameters that do not appear in the parameter list, and which callers must then explicitly supply. The canonical example of this is something like boost::lexical_cast<string>(10) to convert an int to a string. (Though you could also offer the full template<...> form and just desugar my syntax to it.) The third is that it's occasionally useful to explicitly instantiate a template -- say foo<double>(5) or something -- even if you could apply argument deduction. A syntax that removes template<...> makes the order of template parameters implicit, which means now you have to make sure you pick up the right arguments in the right order.
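(For reference, the pattern behind lexical_cast is roughly the following. This is a stripped-down sketch without Boost's error handling, and lexical_cast_sketch is a made-up name.)

Code: Select all

#include <sstream>

// Target can't be deduced from the argument, so callers must supply
// it explicitly, as in lexical_cast_sketch<std::string>(10).
template <typename Target, typename Source>
Target lexical_cast_sketch (const Source& s) {
    std::stringstream ss;
    ss << s;
    Target t;
    ss >> t;
    return t;
}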

In short: I kind of agree with both positions. C++ does things pretty well for the overall designs of the language (the fact that you've got explicit types, it has classes, it's imperative, it doesn't want to impose lexical conventions on names, etc.) -- but I also think that said designs do lead to a fair bit of syntactic cruft.

troyp
Posts: 557
Joined: Thu May 22, 2008 9:20 pm UTC
Location: Lismore, NSW

Re: Coding: Fleeting Thoughts

Postby troyp » Mon Dec 03, 2012 3:36 pm UTC

Alternatively you could introduce additional keywords in the parameter list to indicate type variables:

Code: Select all

void qsort (typename I begin, typename I end);

But this approach suffers from a significant disadvantage compared to the current approach. The fact that the function is a template doesn't stand out anymore, for two reasons. The first is that the parameter list may contain several other kinds of additions, most of which serve to further specify a concrete type; e.g. "const", "volatile", "&" and even "template" and "typename" to disambiguate dependent names. So the reader of the code would have to be really attentive when reading the parameter list in order to find out whether each type is concrete or templated. The second reason is that you lose a very obvious tag that screams "I'M A TEMPLATE! LOOK WHAT PARAMETERS I'VE GOT!".
In addition, it doesn't even make the function declaration any shorter. You replace two short lines with a single long one, which is less readable.

What's wrong with something like

Code: Select all

void qsort<I>(I begin, I end);
? That's how other languages do it. Even if there's some technical reason against it, the fact remains that it's more readable, which is the issue in question.

By the way, in a sense Haskell also separates the type variables from the data variables. It's just so much more clean. Look at this, see what I mean?

Code: Select all

qsort :: (Ord a) => [a] -> [a]

So, a few things:
[1] Being so much more clean is really the crucial point.
[2] Your terminology is off. "a" above is a type variable. "Ord" is a type class, not any sort of variable. Data variables are regular variables in the function definition; they don't occur in the type signature at all (so it actually is true that type and data variables are separated, but that's not what the signature above illustrates).
[3] A type class is quite different from a template. A template is a way of implementing (pseudo-) parametric polymorphism, so it corresponds to the "a"s above. A type class is a disciplined way of handling ad-hoc polymorphism (overloading). C++ doesn't really have a counterpart (concepts sound like they would have been similar, but I'm just going off my vague impression of them).
[4] Haskell does separate type variables from data variables, but the situation is quite different. Haskell separates all types (variable or constant) from the definition into a type signature. This works out really well. The signature and parameter list are uncluttered by being separated and the type signature can be read or ignored separately as desired, commented out, etc. I'd be perfectly happy if C++ moved its types to the line above to join the template parameter.
Wrong again. You seem to know less about C++ than I assumed. Those are not necessarily pointers, although they use pointer syntax; they are iterators. Iterators provide a common interface to containers so algorithms can do their stuff without worrying whether they're dealing with a std::vector, a std::deque, a raw array or something else. The Haskell equivalent would be a "!" function which you could universally apply to IArray, MArray and any user-defined array-like type constructor. Using iterators in C++ is not overhead, just like using the "!" function in Haskell is not overhead.

I know what iterators are. I don't know much C++, but I understand all your code (well except for the "auto" keyword - I know that does some kind of memory management, but I'm not sure what). If I was going to write it, I'd have to look stuff up, but I'm okay to read simple C++.

You're right, there are no pointers defined. When I was writing my post I just looked and saw the "*"s and forgot they were iterators. I don't have a problem with iterators, per se (although I do think they're often used where other things - indexes or whole-sequence operations - would be more readable).

I didn't use it in the first place because it's part of the Boost library rather than the standard library and I started out writing code that only relied on the standard library. It has no disadvantages other than not being in the standard library (then again, Boost is usually considered almost-standard). Actually I think it will be adopted into the C++ standard library at some point, because Bind was also adopted from Boost. Boost.Phoenix is newer. Neither Phoenix nor Bind is very well known among the general population of C++ programmers, by the way; that's a consequence of there being tons of C++ programmers.

Oh, okay. I was thinking they were both from Boost. Well, I hope they bring Phoenix into the standard library, because that syntax is a huge improvement.

First off, random number generators, especially good ones, are complex beasts that take quite a lot of time and space to generate each successive number. Taking the median of the first, middle and last element is much more efficient.
Secondly, the median-of-three approach is superior probability-wise. Consider three distinct classes of inputs:
  • Sorted or reverse-sorted data. Median-of-three selects the optimal pivot, i.e. the middle, while a random pivot will on average be halfway between the middle and an extreme end.
  • Random data. Median-of-three effectively takes the median of three random samples, while a single random pivot isn't any better than just taking the first element (actually worse, because it spends more effort).
  • Artificially constructed median-of-three killer sequences (extremely unlikely unless on purpose). Random pivot selection is fairly robust against such an attack, but you can secure median-of-three selection against them as well by switching to another sorting algorithm at O(log n) recursion depth. The latter is called introsort, and it's the algorithm that implements std::sort in the C++ STL.
Taking these three classes into account, median-of-three performs much better than random pivot selection on average.

I'm not sure how high a quality PRNG you'd need (I guess you might need a cryptographically secure one to truly give a watertight statistical guarantee of O(n log n)), but we're only talking about a constant factor, in any case. I don't know a lot about the relative performance of quicksort implementations. You may be right about median-of-3 being most efficient in typical cases. I do know (as you concede) that it's O(n^2) worst-time. That's reason enough to at least consider the randomized alternative. I admit it's very unlikely to occur accidentally (but there may be cases when you need to worry about malicious input). As for swapping sorts at O(log(n)) depth, that may be a viable approach, but it's not directly relevant, since it's not even remotely quicksort.

Very cheeky, but std::partition is available from <algorithm> just like Haskell's partition algorithm is available from Data.List. You were the one to introduce it.

Ah. I did wonder about that after I posted (I thought it was odd that you were passing in a predicate rather than just the pivot). I guess that's my assumptions showing. I've never seen a standard partition function that partitions in-place. The ones I've seen all return a pair of new lists. I guess it shouldn't be surprising given that it's C++.

Perhaps it's not much worse, but I assure you it certainly isn't any better. I've seen Haskell translations of imperative algorithms and they were uglier than the original. I've been hinting at this all along; you are all ecstatic about Haskell elegance, but as soon as we start doing real-world things such as mutating arrays in-place it turns ugly. Haskell looks more natural when you stay in the Platonic dream world but C++ looks more natural for real stuff. In either case the difference is marginal, as C++ has Boost.Phoenix and Haskell has monads.


I never claimed Haskell was particularly elegant for imperative code. But it is particularly elegant for functional code, which occurs a lot more than you want to admit. Often, mutation can be avoided using persistent data structures (and new, efficient data structures are being written all the time). Stuff like IO and randomness require monadic code, but this can be isolated to prevent the infection from spreading. And even the ugly parts are rarely worse than C++. Saying C++ is as readable as Haskell is like saying Haskell is as fast as C++: you might be able to find examples to support it if you look hard, but in reality it's just not so.

You are talking about MArray. Converting to another container is not a solution, because you still need a separate conversion function for each container type constructor. Face it: C++ offers a more general interface for this kind of thing than Haskell.

I'm not sure what you're asking for. You wanted to be able to use a function on either lists or arrays. But you say you want to be able to write to an index. You have to convert to another container to do that, since lists and ("standard") arrays are immutable. That would be the case in any language (it's the point of an immutable data type). You don't have to use separate conversion functions, though; you could have a single toMArray function that works on both lists and arrays (I have no idea if it exists; if not you'd have to implement it as a type class method).
Still, it's ridiculous to claim that Haskell provides a more elegant language for implementing quicksort than C++.

Um, what? I said that about the functional implementations of quicksort in Haskell and C++. You've moved the discussion to an imperative implementation, but I never claimed Haskell was more elegant for an in-place quicksort.

User avatar
Yakk
Poster with most posts but no title.
Posts: 11129
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Coding: Fleeting Thoughts

Postby Yakk » Mon Dec 03, 2012 3:55 pm UTC

The auto keyword no longer refers to automatic storage, and in practice it hasn't since the 1970s. The auto keyword refers to automatic type deduction, saying that the variable's type is set to exactly what it is constructed from.
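A trivial sketch:

Code: Select all

#include <vector>

void example () {
    std::vector<int> v(10);
    auto it = v.begin(); // deduced: std::vector<int>::iterator
    auto x = 3.14;       // deduced: double
}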
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

EvanED
Posts: 4331
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI
Contact:

Re: Coding: Fleeting Thoughts

Postby EvanED » Mon Dec 03, 2012 4:37 pm UTC

troyp wrote:[3] A type class is quite different from a template. A template is a way of implementing (pseudo-) parametric polymorphism, so it corresponds to the "a"s above. A type class is a disciplined way of handling ad-hoc polymorphism (overloading). C++ doesn't really have a counterpart (concepts sound like they would have been similar, but I'm just going off my vague impression of them).

I am not super familiar with concepts or Haskell, but I almost said the same thing. If template<typename T, typename U> T foo(T a, U b) is like foo :: t * u -> t, and you have a concept LessThanComparable, then template<LessThanComparable T> T max(T a, T b) is like max :: (LessThanComparable a) => a * a -> a.

I think the main differences are that (1) in Haskell terms, the definitions of the typeclass instantiation are local to uses of the corresponding concept (e.g. you could implement a function in different ways for different typeclasses), (2) in C++ you're probably not restricted to using functions from the type class. (I could be wrong on either or both of these, on either the C++ or Haskell side. :-))

troyp
Posts: 557
Joined: Thu May 22, 2008 9:20 pm UTC
Location: Lismore, NSW

Re: Coding: Fleeting Thoughts

Postby troyp » Mon Dec 03, 2012 4:39 pm UTC

Okay, that makes sense, I guess. So it's used in generic code that takes iterators because you don't have immediate access to the container type.

Is it actually possible to recover the container type from an iterator?

Also, I can't believe they changed the meaning of a keyword! I would never have expected C++ to do that.

edit: @EvanED:
In Haskell, type classes don't have their own namespaces. They define overloaded functions in the global namespace. AFAIK, there's not even any way to rename the type class methods when you import one. This, along with global instances, has always bothered me about type classes. They don't seem very scalable. I feel they should have been designed differently (although I don't know how).

I'm not quite sure what you mean by (2), but Haskell type classes can use outside functions (if they're from another type class, it has to appear as a superclass (prerequisite) in the "context" of the type class)
Last edited by troyp on Mon Dec 03, 2012 5:00 pm UTC, edited 1 time in total.

User avatar
You, sir, name?
Posts: 6983
Joined: Sun Apr 22, 2007 10:07 am UTC
Location: Chako Paul City
Contact:

Re: Coding: Fleeting Thoughts

Postby You, sir, name? » Mon Dec 03, 2012 4:51 pm UTC

Xanthir wrote:
You, sir, name? wrote:I've always done my parsing via graph transformation rules where at all possible. Works for most semi-context-free grammars (you can add some context dependency at the cost of rule complexity), and is much closer to how humans evaluate expressions (PEMDAS being the archetypal rule-stack example from mathematics).

Could you point to an example of this? A search in the scholarly literature doesn't appear to be immediately fruitful.


Pratt parsing is perhaps the simplest version of this type of parsing. I'm using a home-cooked version that uses more complex "operators" and emphasizes the digraph/tree rewriting aspect of what I'm doing.

Although it's a bit forgotten and underused, it's surprisingly efficient and fairly easy to implement.
Last edited by You, sir, name? on Mon Dec 03, 2012 5:52 pm UTC, edited 1 time in total.
I edit my posts a lot and sometimes the words wrong order words appear in sentences get messed up.

EvanED
Posts: 4331
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI
Contact:

Re: Coding: Fleeting Thoughts

Postby EvanED » Mon Dec 03, 2012 5:13 pm UTC

troyp wrote:Is it actually possible to recover the container type from an iterator?

Iterators needn't come from containers, which is a large part of why so much C++ code operates on iterator ranges rather than containers. But you can recover the element type (i.e. what gets returned by *iter) using typename std::iterator_traits<TheIteratorTypeInQuestion>::value_type, ...::pointer, or ...::reference as appropriate. (Most iterators would let you say TheIteratorTypeInQuestion::value_type and stuff, but the fact that pointers into an array can be used like iterators means that won't always work.)
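For instance, here's a little made-up sketch of why you go through iterator_traits:

Code: Select all

#include <iterator>

// Works for class-type iterators *and* raw pointers, because
// std::iterator_traits has a partial specialization for pointers;
// writing I::value_type would fail for the pointer case.
template <class I>
typename std::iterator_traits<I>::value_type
first_element (I begin) {
    return *begin;
}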

Also, I can't believe they changed the meaning of a keyword! I would never have expected C++ to do that.

Yup. But in this case it's a keyword that basically hasn't been regularly used for a timespan naturally measurable in decades.

I'm not quite sure what you mean by (2), but Haskell type classes can use outside functions (if they're from another type class, it has to appear as a superclass (prerequisite) in the "context" of the type class)

Take the typeclass-less version, in C++ just template<typename T> void foo(T x). There's no restriction (at declaration time) on what I do with x. Maybe I call x.bar(), maybe I say x < y, maybe I call bar(x). All of those are fine for the declaration of foo, and they'll be looked up when foo is instantiated (roughly speaking, "called"). By contrast, my understanding is that in Haskell you couldn't even make such a definition unless you had a typeclass (or multiple typeclasses) that stated those functions were present, and then those typeclasses would be part of the type of foo.
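A tiny made-up example of what I mean by checked-at-instantiation:

Code: Select all

#include <iostream>

template <typename T>
void foo (T x) {
    x.bar(); // no declared requirement on T; checked per instantiation
}

struct HasBar { void bar () { std::cout << "bar\n"; } };

int main () {
    foo(HasBar()); // fine: HasBar has a bar()
    // foo(42);    // error, but only if you actually write this call
}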

I don't actually know this, but I'd speculate the same is true if you put in concepts. I could say template<LessThanComparable T> void foo(T x) and then the type class guarantees that x < y will work (well, will provide more accurate error messages), but I suspect it's still true that you could continue to say x.foo() even if that function's not part of the LessThanComparable type class.

(And to let me forestall another potential question: template<LessThanComparable T> ... is desugared to template<typename T> requires LessThanComparable<T> ..., and you can use the latter syntax directly to impose multiple concepts on a single template parameter.)

User avatar
Jplus
Posts: 1721
Joined: Wed Apr 21, 2010 12:29 pm UTC
Location: Netherlands

Re: Coding: Fleeting Thoughts

Postby Jplus » Mon Dec 03, 2012 6:14 pm UTC

EvanED wrote:I feel like this part of the conversation just went:

troyp: "C++ is so verbose; look at all this extra concrete syntax you need to mark templates"
JPlus: "Nuh uh! Here's why it's verbose!"

*Scratches behind ears*
*Looks around awkwardly*

EvanED wrote:In short: I kind of agree with both positions. C++ does things pretty well for the overall designs of the language (the fact that you've got explicit types, it has classes, it's imperative, it doesn't want to impose lexical conventions on names, etc.) -- but I also think that said designs do lead to a fair bit of syntactic cruft.

Really? I honestly, seriously don't find it verbose at all.

troyp wrote:What's wrong with something like

Code: Select all

void qsort<I>(I begin, I end);
? That's how other languages do it. Even if there's some technical reason against it, the fact remains that it's more readable, which is the issue in question.

Perhaps you mean the Java syntax:

Code: Select all

<I> void qsort (I begin, I end); // I'll refrain from the public final static stuff

It's basically the same as in C++, but they're leaving out the "template" and "class" keywords. The latter is not an option in C++ because template parameters aren't always types. The former would probably be possible, but it would be less consistent: "template <class I, class J>" looks very analogous to "void (int a, int b)" and that's what you want, because the first is a function over types while the second is a function over objects.
But well, I guess you have a point that the word "template" isn't strictly necessary.

troyp wrote:
By the way, in a sense Haskell also separates the type variables from the data variables. It's just so much more clean. Look at this, see what I mean?

Code: Select all

qsort :: (Ord a) => [a] -> [a]

So, a few things:
[1] Being so much more clean is really the crucial point.
I meant "separating type variables from data variables is just so much more clean", not "Haskell is just so much more clean". I know that you're arguing for the latter but I wasn't.
troyp wrote:[2] Your terminology is off. "a" above is a type variable. "Ord" is a type class, not any sort of variable. [...]
[3] A type class is quite different from a template. [...]
[4] Haskell does separate type variables from data variables, but the situation is quite different. Haskell separates all types (variable or constant) from the definition into a type signature. [...]

Yes I should have formulated that better, but all that is just semantics. What I meant to say is this. The Haskell type signature above consists of two parts, "(Ord a)" and "[a] -> [a]", separated by "=>". The first part gives requirements on types and is analogous to the "template <class I>" part in the C++ function declaration. The second part gives the actual type of the function, and is analogous to the "void (I, I)" part of the C++ function declaration. So I was pointing out that that separation is important and that Haskell is doing it too.

You are right to point out that Haskell on top of that also separates the function type from the names of its data parameters. This is an orthogonal issue, and contrary to what you seem to suggest I don't necessarily agree that it is a good thing.

(I gave some explanation on concepts and typeclasses here but was ninja'd by EvanED.)

troyp wrote:
I didn't use it in the first place because it's part of the Boost library rather than the standard library and I started out writing code that only relied on the standard library. [...]

Oh, okay. I was thinking they were both from Boost. Well, I hope they bring Phoenix into the standard library, because that syntax is a huge improvement.

Actually, I hope so too.

troyp wrote:I'm not sure how high a quality PRNG you'd need (I guess you might need a cryptographically secure one to truly give a watertight statistical guarantee of O(n log n)), but we're only talking about a constant factor, in any case. I don't know a lot about the relative performance of quicksort implementations. You may be right about median-of-3 being most efficient in typical cases. I do know (as you concede) that it's O(n^2) worst-time. That's reason enough to at least consider the randomized alternative. I admit it's very unlikely to occur accidentally (but there may be cases when you need to worry about malicious input). As for swapping sorts at O(log(n)) depth, that may be a viable approach, but it's not directly relevant, since it's not even remotely quicksort.

I have not stated this clearly enough: random pivot is more likely to degrade to O(n^2) than median-of-three.

Also, introsort is very much quicksort because in practice it will never switch to the emergency sort. Introsort is basically just quicksort done right.

troyp wrote:I never claimed Haskell was particularly elegant for imperative code. But it is particularly elegant for functional code, which occurs a lot more than you want to admit. Often, mutation can be avoided using persistent data structures (and new, efficient data structures are being written all the time). Stuff like IO and randomness require monadic code, but this can be isolated to prevent the infection from spreading. And even the ugly parts are rarely worse than C++. Saying C++ is as readable as Haskell is like saying Haskell is as fast as C++: you might be able to find examples to support it if you look hard, but in reality it's just not so.

I'm going to say the exact opposite: Haskell and C++ are incredibly close with respect to performance and verbosity. I think it's bizarre that we're having this discussion while we have languages as diverse as Forth, Python and APL out there.

C++ and Haskell are like twin brothers with their emphasis on rich static strong typing and Turing-complete metafunctions. Haskell chose to focus on the lazy functional side and C++ chose to focus on the explicit procedural side, but Haskell can mimic C++ with monads (in which case it may become just slightly less elegant and almost as performant) and C++ can mimic Haskell with template metaprogramming libraries like Boost.Phoenix (in which case it may become just slightly less elegant and almost as lazy).

The language benchmarks game seems to confirm that for anything more than just a single function for demonstration purposes, C++ and Haskell are relatively close both in verbosity and performance.

troyp wrote:
You are talking about MArray. Converting to another container is not a solution, because you still need a separate conversion function for each container type constructor. Face it: C++ offers a more general interface for this kind of thing than Haskell.

I'm not sure what you're asking for. You wanted to be able to use a function on either lists or arrays. But you say you want to be able to write to an index. You have to convert to another container to do that, since lists and ("standard") arrays are immutable. That would be the case in any language (it's the point of an immutable data type). You don't have to use separate conversion functions, though; you could have a single toMArray function that works on both lists and arrays (I have no idea if it exists; if not you'd have to implement it as a type class method).
Still, it's ridiculous to claim that Haskell provides a more elegant language for implementing quicksort than C++.

Um, what? I said that about the functional implementations of quicksort in Haskell and C++. You've moved the discussion to an imperative implementation, but I never claimed Haskell was more elegant for an in-place quicksort.

Three things are intertwining here. One thing is that you'd have to implement median-of-three pivot selection in order to not make your quicksort implementation a joke. Or at least use the middle element rather than the first. Either way requires a random access container (something array-ish). But that will already make your code more complex (and force you to write your own partitioning algorithm), even if you just use the lazy, nonmonadic IArray.
Another thing is that it's perfectly possible to write a sorting algorithm in C++ that works on raw arrays, vectors, deques, linked lists and whatever other sequence container you might roll yourself. The quicksort I wrote, with only a slight modification, can work on a doubly linked list, even though actually applying it to a linked list would render the pivot selection inefficient; the point is that C++ offers that functionality out of the box while Haskell does not. You can probably solve that with some more ad-hoc typeclasses, although you still haven't fully demonstrated that, but even then you'll probably need to write more ad-hoc typeclasses for the next algorithm that you want to work across multiple container types.
The third thing is that if you want your sort to be in-place (more on that below), you have to use MArray, or if on top of that you want to avoid all the complex datastructure overhead that is associated with laziness (i.e. for C++-like performance) you must use STUArray. The problem is that in either case your code is bound to become less elegant, and in the latter case you lose full genericity.

So, two remarks on the in-place issue. First, I never actually meant to challenge you to write a truly in-place sorting algorithm; I just wanted to point out that doing so would make your code messy. Secondly, "writing to an index" doesn't necessarily mean "mutating the container in-place", at least not in Haskell (I've hinted at that before). IArray lets you write to an index, but it returns a new array that has the particular index changed. In order to keep that somewhat manageable it's backed by a highly complex datastructure.

Concluding, I didn't really move the discussion to an imperative implementation. I did challenge you to write a quicksort implementation in Haskell that makes more sense than the silly algorithm that only serves demonstration purposes. But as far as I'm concerned that can still be purely functional, lazy, nonmonadic code with immutable datastructures. I predict that that will still be much less concise than the demonstration algorithm, probably the same length as the C++ algorithm or even beyond.

For maximum clarity: there are two challenges. The first is to write quicksort with efficient median-of-three pivot selection. The other (older) is to write typeclasses that allow you to apply the same quicksort implementation to IArray, MArray and [] (on second thought, you may scratch MArray because it probably requires monadic code, and substitute Seq for bonus points). I've already done the equivalent of both at the same time in C++.
You (troyp) of course don't have to implement the solutions yourself, or even take up the challenge at all; I'm just making a claim that I consider justified until proven otherwise. I'm actually quite curious whether somebody will be able to prove me wrong.
"There are only two hard problems in computer science: cache coherence, naming things, and off-by-one errors." (Phil Karlton and Leon Bambrick)

coding and xkcd combined

(Julian/Julian's)

EvanED
Posts: 4331
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI
Contact:

Re: Coding: Fleeting Thoughts

Postby EvanED » Mon Dec 03, 2012 7:35 pm UTC

Jplus wrote:
troyp wrote:

Code: Select all

void qsort<I>(I begin, I end);
Perhaps you mean the Java syntax:

Code: Select all

<I> void qsort (I begin, I end); // I'll refrain from the public final static stuff

I was actually surprised to see you're right for Java. However, C# uses the syntax troyp suggested.

It's basically the same as in C++, but they're leaving out the "template" and "class" keywords. The latter is not an option in C++ because template parameters aren't always types. The former would probably be possible, but it would be less consistent: "template <class I, class J>" looks very analogous to "void (int a, int b)" and that's what you want, because the first is a function over types while the second is a function over objects.

I would also suggest that the typename/class could be optional (or even prohibited) for type template parameters, while still allowing declarations of non-type and template template parameters as they are currently. Heck, from the parser's standpoint such a thing would be a trivial modification. I'm not sure whether it'd be a good design or not. FWIW, D appears to take this approach (though I'm... less enamored with some other choices of theirs).

Yes I should have formulated that better, but all that is just semantics. What I meant to say is this. The Haskell type signature above consists of two parts, "(Ord a)" and "[a] -> [a]", separated by "=>". The first part gives requirements on types and is analogous to the "template <class I>" part in the C++ function declaration.

I definitely disagree here. Consider the head function. (hd? first? Whatever.) In Haskell it would have the type signature [a] -> a, which has no type constraints even though it is a polymorphic function; the C++ version would still have template<typename T> .... What the template<...> says (sticking to the common case anyway) is that the function is polymorphic -- the Haskell equivalent of that is that there is a type variable in the type.
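In code, the analogy looks like this (a sketch, using std::vector as a stand-in for Haskell's list):

Code: Select all

#include <vector>

// head :: [a] -> a, C++ style: polymorphic, yet no constraint on T
// appears anywhere. (Like Haskell's head, undefined on an empty list.)
template <typename T>
T head (const std::vector<T>& xs) {
    return xs.front();
}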

In the absence of concepts, the C++ equivalent of the Haskell type constraint Ord a is implicit in the definition of the function, not ever explicit. (In fact, this lack of explicitness is a reasonably big problem in practice and really what concepts were aimed at solving.)

User avatar
Xeio
Friends, Faidites, Countrymen
Posts: 5101
Joined: Wed Jul 25, 2007 11:12 am UTC
Location: C:\Users\Xeio\
Contact:

Re: Coding: Fleeting Thoughts

Postby Xeio » Mon Dec 03, 2012 7:54 pm UTC

Hrmmm, so I guess you could do some functional list programming in C# with IEnumerable. Just need some extra extension methods like

Code: Select all

// Extension methods must be static and live in a static class;
// the generic parameter <T> goes on each method.
public static IEnumerable<T> Add<T>(this IEnumerable<T> list, T item){
    foreach(T t in list)
        yield return t;
    yield return item;
}

public static IEnumerable<T> Remove<T>(this IEnumerable<T> list, T item){
    foreach(T t in list)
        // "t != item" doesn't compile for an unconstrained T;
        // use the default equality comparer instead.
        if(!EqualityComparer<T>.Default.Equals(t, item))
            yield return t;
}
Bonus, it's all lazy (if you want that, if not you could just call ToList() or something). Still not really a way around mutable objects though.

I think things like this are why IEnumerable scares me a bit.

User avatar
Yakk
Poster with most posts but no title.
Posts: 11129
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Coding: Fleeting Thoughts

Postby Yakk » Mon Dec 03, 2012 8:12 pm UTC

EvanED wrote:In the absence of concepts, the C++ equivalent of the Haskell type constraint Ord a is implicit in the definition of the function, not ever explicit. (In fact, this lack of explicitness is a reasonably big problem in practice and really what concepts were aimed at solving.)

Well, there is enable_if and SFINAE hackery.
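E.g., a sketch using the C++11 spellings (quadruple is a made-up example):

Code: Select all

#include <type_traits>

// Only participates in overload resolution when T is integral;
// anything else is removed by SFINAE instead of causing an error.
template <class T>
typename std::enable_if<std::is_integral<T>::value, T>::type
quadruple (T x) {
    return 4 * x;
}

// quadruple(2) == 8; quadruple(2.5) finds no matching overload.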
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

User avatar
Xeio
Friends, Faidites, Countrymen
Posts: 5101
Joined: Wed Jul 25, 2007 11:12 am UTC
Location: C:\Users\Xeio\
Contact:

Re: Coding: Fleeting Thoughts

Postby Xeio » Mon Dec 03, 2012 10:03 pm UTC

TIL: DateTime.MaxValue can throw an ArgumentOutOfRangeException.

Which leads me to wonder why anyone would design a calendar that has a maximum date.

User avatar
You, sir, name?
Posts: 6983
Joined: Sun Apr 22, 2007 10:07 am UTC
Location: Chako Paul City
Contact:

Re: Coding: Fleeting Thoughts

Postby You, sir, name? » Mon Dec 03, 2012 10:16 pm UTC

Xeio wrote:TIL: DateTime.MaxValue can throw an ArgumentOutOfRangeException.

Which leads me to wonder why anyone would design a calendar that has a maximum date.


Ask the Mayans?
I edit my posts a lot and sometimes the words wrong order words appear in sentences get messed up.

User avatar
Xeio
Friends, Faidites, Countrymen
Posts: 5101
Joined: Wed Jul 25, 2007 11:12 am UTC
Location: C:\Users\Xeio\
Contact:

Re: Coding: Fleeting Thoughts

Postby Xeio » Mon Dec 03, 2012 10:23 pm UTC

From what I understand they just stopped counting on the Mayan calendar, there isn't any reason it couldn't go higher just by incrementing the date.

User avatar
Shivahn
Posts: 2200
Joined: Tue Jan 06, 2009 6:17 am UTC

Re: Coding: Fleeting Thoughts

Postby Shivahn » Tue Dec 04, 2012 12:35 am UTC

Xeio wrote:From what I understand they just stopped counting on the Mayan calendar, there isn't any reason it couldn't go higher just by incrementing the date.

This is correct.

The Mayan calendar only ends 10 days before the Gregorian calendar.

troyp
Posts: 557
Joined: Thu May 22, 2008 9:20 pm UTC
Location: Lismore, NSW

Re: Coding: Fleeting Thoughts

Postby troyp » Tue Dec 04, 2012 3:27 am UTC

EvanED wrote:Iterators needn't come from containers, which is a large part of why so much C++ code operates on iterator ranges rather than containers. But you can recover the element type (i.e. what gets returned by *iter) using typename std::iterator_traits<TheIteratorTypeInQuestion>::value_type, ...::pointer, or ...::reference as appropriate. (Most iterators would let you say TheIteratorTypeInQuestion::value_type and stuff, but the fact that pointers into an array can be used like iterators means that won't always work.)

Yeah, the element type is what I meant. Sorry, I got confused.

Take the typeclass-less version, in C++ just template<typename T> void foo(T x). There's no restriction (at declaration time) on what I do with x. Maybe I call x.bar(), maybe I say x < y, maybe I call bar(x). All of those are fine for the declaration of foo, and they'll be looked up when foo is instantiated (roughly speaking, "called"). By contrast, my understanding is that in Haskell you couldn't even make such a definition unless you had a typeclass (or multiple typeclasses) that stated those functions were present, and then those typeclasses would be part of the type of foo.


Oh, I see: you mean functions that apply to method parameters of the type of the type class instance (surely there has to be a simple way to say such a simple thing...). Yeah, if you apply overloaded functions from a different type class to the variable in a default definition, they must be in the context. Keep in mind, though, there are some polymorphic functions you could apply that don't come from a type class (parametric functions that apply to any type: constant functions or functions that embed one or more copies of the variable in some structure). So those could be used without affecting the context.

I don't actually know this, but I'd speculate the same is true if you put in concepts. I could say template<LessThanComparable T> void foo(T x) and then the type class guarantees that x < y will work (well, will provide more accurate error messages), but I suspect it's still true that you could continue to say x.foo() even if that function's not part of the LessThanComparable type class.

Sounds like a reasonable prediction. Making the prereqs optional would be a hole in the type system, but I guess it would make sense for C++ (you might want to try something and catch the exception to apply a default if it doesn't work).

Jplus wrote:I have not stated this clearly enough: random pivot is more likely to degrade to O(n^2) than median-of-three.

What do you mean? A random pivot guarantees (probabilistically) O(n log n) worst-case. For a specific execution, there is always a finite chance of it taking quadratic time, but the bigger the data set, the less likely that is - and you can't have a pathological scenario where data sets are systematically ordered in such a way as to provoke quadratic behaviour. (Aside: I'm not sure if it makes sense to call a quadratic execution of randomized quicksort O(n^2), since it's an asymptotic notation and you can't have asymptotic behaviour of a single execution)

Also, introsort is very much quicksort because in practice it will never switch to the emergency sort. Introsort is basically just quicksort done right.

Ah, I'm an idiot. Sorry, I wasn't thinking of n in O(log n) as n ;-) I was thinking of it as "total recursion depth". I understand now. You're right, that's just quicksort with an "escape hatch". And probably a good idea for something like quicksort where you have a pathological case you expect to be very rare.

I'm going to say the exact opposite: Haskell and C++ are incredibly close with respect to performance and verbosity. I think it's bizarre that we're having this discussion while we have languages as diverse as Forth, Python and APL out there.

Well, I see where you're coming from. My impression is that Haskell is usually several times slower than C++. And I do see that as very close. But also, I realize that if performance is really critical and the C++ code is only just adequate, it might make all the difference. Haskell may simply not be good enough. This is kind of the situation I see with readability. Even with the most readable language and code, it's not as easy to read code as I'd like. I feel there's not much wriggle room for making it even worse.

I think to some extent we disagree because we have different standards for readability and you consider distractions insignificant that I'd consider important. For me, it's not enough to be able to read a line and understand it immediately. I want to be able to understand it just from catching it in my peripheral vision. I want to be able to see and take in as much information as possible on a single screen (though "take in" has precedence over "see"). And I don't want to have to think about something extraneous to the code logic even for a fraction of a second. That tiny distraction could be enough to make me lose some incipient thought or intuition forming in the back of my mind.

The quicksort I wrote, with only a slight modification, can work on a doubly linked list, even though actually applying it to a linked list would render the pivot selection inefficient; the point is that C++ offers that functionality out of the box while Haskell does not.

Is that what you're worried about? Personally I think it would be a mistake to overload random access for arrays and lists since the time complexity is so different. But if you want to, you can do it easily enough. You don't need an ad hoc type class for each algorithm, just make one type class that abstracts whatever common properties you want. You might even be able to use some type extension to define an MArray instance for lists and use them with the same methods. The reason this is not straightforward is that Haskell arrays have a higher kind than C++ arrays (or Haskell lists). They can use any sensible type for indexes, not just integers. So to unify arrays and lists, you probably want to make a newtype from IArray that fixes the index type to Integer. Then make that an instance of your new type class.

For maximum clarity: there are two challenges. The first is to write quicksort with efficient median-of-three pivot selection. The other (older) is to write typeclasses that allow you to apply the same quicksort implementation to IArray, MArray and [] (on second thought, you may scratch MArray because it probably requires monadic code, and substitute Seq for bonus points). I've already done the equivalent of both at the same time in C++.
You (troyp) of course don't have to implement the solutions yourself, or even take up the challenge at all; I'm just making a claim that I consider justified until proven otherwise. I'm actually quite curious whether somebody will be able to prove me wrong.

Okay, I'm not going to do them now, but I'll try to get around to them. The type class should be simple. To write an efficient quicksort, I'll have to have a look at what data structures are available. Haskell may be quite readable, but as far as writability goes, it's a lot more difficult (at least for novices) to evaluate and learn fancy persistent data structure libraries than it would be to just mutate array elements.

So, two remarks on the in-place issue. First, I never actually meant to challenge you to write a truly in-place sorting algorithm; I just wanted to point out that doing so would make your code messy. Secondly, "writing to an index" doesn't necessarily mean "mutating the container in place", at least not in Haskell (I've hinted at that before). IArray lets you write to an index, but it returns a new array that has the particular index changed. In order to keep that somewhat manageable it's backed by a highly complex data structure.
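
For example, with plain Data.Array (the IArray class offers the same (//) interface):

Code: Select all

import Data.Array

-- (//) returns a fresh array with the given indices replaced;
-- the original array is untouched.
old, new :: Array Int Char
old = listArray (0, 2) "abc"
new = old // [(1, 'X')]
-- old ! 1 == 'b', but new ! 1 == 'X'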

Oh, OK. Well, if you're using a persistent data type that allows efficient updates, there's no reason I can see that it should make your code ugly. Not if the interface is well designed. The code wouldn't even need to be monadic unless that made it simpler. List functions are always going to be a bit nicer because they're supported by special syntax, but code using other data structures can still be expressed elegantly. User-defined operators (with user-defined precedence and associativity) are a big help. One issue is that arrays, unfortunately, have no useful data constructor to pattern match against. It's not a huge deal, though.

EvanED wrote:However, C# uses the syntax troyp suggested.

So does Scala. I was actually also thinking Java did.

EvanED
Posts: 4331
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI
Contact:

Re: Coding: Fleeting Thoughts

Postby EvanED » Tue Dec 04, 2012 4:30 am UTC

troyp wrote:Sounds like a reasonable prediction. Making the prereqs optional would be a hole in the type system, but I guess it would make sense for C++ (you might want to try something and catch the exception to apply a default if it doesn't work).

There's no type system hole in the C++ way, merely a "how informative are the declarations and the error messages" hole. In the Haskell/concepts way (well, concepts if my hypothesis about being able to stray outside the typeclass is wrong), when you say template<LessThanComparable T> void foo(T x) {...}, the compiler could check at the time foo() is declared that it does not impose any requirements on x other than those stated in the LessThanComparable concept, just like if I have foo :: (Ord a) => ... the compiler can check then that foo only calls functions in the Ord type class (and other polymorphic functions that don't impose additional constraints on the argument).
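
In Haskell terms, that early check looks like this (a contrived sketch; smaller and broken are made-up names):

Code: Select all

-- With (Ord a) in the context, the body may only use what Ord provides,
-- and GHC verifies that when the function is compiled, not when it's used.
smaller :: Ord a => a -> a -> a
smaller x y = if x < y then x else y   -- fine: (<) comes from Ord

-- This one is rejected at the definition site, because Show is not
-- in the context:
-- broken :: Ord a => a -> String
-- broken x = show x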

If my guess is right and you're allowed to stray outside the concept, things would just work the same way they do now: the compiler can't check things when foo is declared, but when it is instantiated with a specific type, it goes through and substitutes that type in place and does type checking then.

You never would be, and never are, able to call a function that isn't declared; there's no type-safety compromise. If you get an exception as a result of some call, it's because you or someone else wrote throw WhyDidYouDoThatException() :-).

troyp
Posts: 557
Joined: Thu May 22, 2008 9:20 pm UTC
Location: Lismore, NSW

Re: Coding: Fleeting Thoughts

Postby troyp » Tue Dec 04, 2012 7:58 am UTC

Sorry, I misread what you wrote. I thought you were defining an "instance" of the concept and saying you could take the "concept variable" x in one of the "concept methods" and apply an arbitrary method .foo() to it, which every class instantiating the concept was assumed to have, even though it's not from a prerequisite concept. (So: kind of like if I wrote a type class "class Bar x where..." with a default method that tests a::x and b::x for equality even though (Eq x) isn't in the context.) I missed the fact that you were actually defining a function template (is that the word?)*
* I know you were quite clear and gave the signature for foo, but I was expecting instance syntax I wouldn't recognize so I just glossed over the code and picked out the bits I was looking for :oops: Somehow I even failed to notice that the identifier "foo" had already occurred when you spoke of calling x.foo() - I think I'll blame C++ verbosity for that one...

EvanED wrote:If my guess is right and you're allowed to stray outside the concept, things would just work the same way they do now: the compiler can't check things when foo is declared, but when it is instantiated with a specific type, it goes through and substitutes that type in place and does type checking then.

Okay, so on the right page now I hope...would there be any advantage in leaving a necessary concept out of the declaration? That seems like a bad idea with no upside to me. Of course, it may just be I'm not understanding how concepts would work. I'm going off the most cursory knowledge of how templates work and no knowledge of concepts at all...

User avatar
Jplus
Posts: 1721
Joined: Wed Apr 21, 2010 12:29 pm UTC
Location: Netherlands

Re: Coding: Fleeting Thoughts

Postby Jplus » Tue Dec 04, 2012 12:28 pm UTC

EvanED wrote:I was actually surprised to see you're right for Java. However, C# uses the syntax troyp suggested.
troyp wrote:So does Scala. I was actually also thinking Java did.

I think the Java syntax would make more sense for C++ because the return type of a function may depend on the template parameters. It's mostly a moot point, though.

EvanED wrote:I would also suggest that the typename/class could be optional (or even prohibited) for type template parameters, while still allowing declarations of non-type and template template parameters as they are currently. Heck, from the parser's standpoint such a thing would be a trivial modification. I'm not sure it'd be a good design or not. FWIW, D appears to take this approach (though I'm... less enamored with some other choices of theirs).

I think that's a bad idea for the same reason it's a bad idea to make int the default type for function parameters.

EvanED wrote:
Jplus wrote:Yes I should have formulated that better, but all that is just semantics. What I meant to say is this. The Haskell type signature above consists of two parts, "(Ord a)" and "[a] -> [a]", separated by "=>". The first part gives requirements on types and is analogous to the "template <class I>" part in the C++ function declaration.

I definitely disagree here. Consider the head function. (hd? first? Whatever.) In Haskell it would have the type signature [a]->a, which has no type constraints even though it is a polymorphic function; the C++ version would still have template<typename T> .... What the template<...> says (sticking to the common case anyway) is that the function is polymorphic -- the Haskell equivalent of that is that there is a type variable in the type.

The template declaration really does do more than signaling polymorphy. That's why a few posts ago I compared

Code: Select all

template <class I>
void qsort (I begin, I end);
with the hypothetical alternative

Code: Select all

void qsort (auto begin, auto end);
.
Both would be polymorphic, but only the former shows that both arguments have to be of the same type. Concepts, when they're finally explicit, will also go into the template declaration. So it's definitely the "type requirements" section; even when it's absent, it's still in a sense conveying that all argument and return types are required to be concrete.
You are right to point out that Haskell handles polymorphic type identity without needing the typeclass section, but I consider that a subtlety. Both languages still split type information over two places: requirements in the first (mostly) and function interface in the second.

troyp wrote:
Jplus wrote:I have not stated this clearly enough: random pivot is more likely to degrade to O(n^2) than median-of-three.

What do you mean? A random pivot guarantees (probabilistically) O(n log n) worst-case. For a specific execution, there is always a finite chance of it taking quadratic time, but the bigger the data set, the less likely that is - and you can't have a pathological scenario where data sets are systematically ordered in such a way as to provoke quadratic behaviour.

If there is a finite chance that the algorithm takes quadratic time, then obviously it doesn't guarantee O(n log n) worst-case. In fact quicksort never can, by itself. Nothing in principle stops your random pivot from being the smallest or largest element at each recursion. In fact, on random data this is improbable, but it is still much more likely with a random pivot than picking the second smallest or second largest element is with median-of-three (because in the latter case two samples must be "unlucky" instead of one). You can however rely on quicksort performing log-linear in the average case, and protect it against "bad luck" by providing an emergency sort. If you're doing that anyway, median-of-three pivot selection again makes more sense than random pivot selection.

troyp wrote:
I'm going to say the exact opposite: Haskell and C++ are incredibly close with respect to performance and verbosity. I think it's bizarre that we're having this discussion while we have languages as diverse as Forth, Python and APL out there.

Well, I see where you're coming from. [...]

I think to some extent we disagree because we have different standards for readability and you consider distractions insignificant that I'd consider important. For me, it's not enough to be able to read a line and understand it immediately. I want to be able to understand it just from catching it in my peripheral vision. I want to be able to see and take in as much information as possible on a single screen (though "take in" has precedence over "see"). And I don't want to have to think about something extraneous to the code logic even for a fraction of a second. That tiny distraction could be enough to make me lose some incipient thought or intuition forming in the back of my mind.

Those distractions that you're talking about, i.e. keywords like "template" and "auto"; it's not just that I consider them "insignificant distractions". In fact I don't consider them distractions at all. To the contrary, I think they serve as clarifications. Since you consider them distractions, I can understand that you like Haskell better. But the "being able to understand it just from catching it in your peripheral vision" seems exceptional to me; I think it's not feasible for most people to have such a direct Haskell-brain interface. The same is true of C++ and any other programming language of course. I'm just saying there's an alternative way to look at those keywords, i.e. to consider them clarifications rather than distractions.

More on the understandability of code: I side with Bret Victor when he says that basically all mainstream languages were badly designed. I can recommend reading that page from top to bottom, as well as the rest of his website, though most of it isn't directly related to our discussion.

troyp wrote:[...] Haskell arrays have a higher kind than C++ arrays (or Haskell lists). They can use any sensible type for indexes, not just integers.

Good point.
"There are only two hard problems in computer science: cache coherence, naming things, and off-by-one errors." (Phil Karlton and Leon Bambrick)

coding and xkcd combined

(Julian/Julian's)

User avatar
Yakk
Poster with most posts but no title.
Posts: 11129
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Coding: Fleeting Thoughts

Postby Yakk » Tue Dec 04, 2012 1:36 pm UTC

An advantage of checked concepts that the implementation can violate is that you get earlier error messages that are more informative. Instead of an error 3 deep in some library code, you get the error of failure to match at instantiation of the template.

Plus you can overload templates with concept based pattern matching.

Concept maps were also proposed.

Checked concepts give you the additional advantage of making sure all type errors happen at the interface, but they're harder, and possibly require a perfect first iteration of adding concepts to template code.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

User avatar
tastelikecoke
Posts: 1208
Joined: Mon Feb 01, 2010 7:58 am UTC
Location: Antipode of Brazil
Contact:

Re: Coding: Fleeting Thoughts

Postby tastelikecoke » Tue Dec 04, 2012 2:01 pm UTC

Wow, so much for a Haskell quicksort.
To be fair, my merge sort in Haskell isn't very spectacular:

Code: Select all

merge' [] ys = ys
merge' xs [] = xs
merge' (x:xs) (y:ys) = if x > y then y:(merge' (x:xs) ys) else x:(merge' xs (y:ys))
msort [] = []
msort [a] = [a]
msort xs = merge' (msort (take (l `quot` 2) xs)) (msort (drop (l `quot` 2) $ xs))
           where l = length xs


Back in the lower levels, I like C++'s STL. The syntax is crazy, but at least you don't need to do structs anymore. Plus I can do this:

Code: Select all

vector<vector<vector<vector<int> > > > vvvv(11,
      vector<vector<vector<int> > >(11,
            vector<vector<int> >(11,
               vector<int>(11))));

I only wish the compiler errors about it were shorter and more readable.

troyp
Posts: 557
Joined: Thu May 22, 2008 9:20 pm UTC
Location: Lismore, NSW

Re: Coding: Fleeting Thoughts

Postby troyp » Tue Dec 04, 2012 2:42 pm UTC

Jplus wrote:If there is a finite chance that the algorithm takes quadratic time, then obviously it doesn't guarantee O(n log n) worst-case. In fact quicksort never can, by itself. Nothing in principle stops your random pivot from being the smallest or largest element at each recursion. In fact on random data, while improbable, this is much more likely with a random pivot than taking the second smallest or second largest element with median-of-three (because in the latter two samples must be "unlucky" instead of one). You can however rely on quicksort performing log-linear in the average case, and protect it against "bad luck" by providing an emergency sort. If you're doing that anyway, median-of-three pivot selection again makes more sense than random pivot selection.


There's a reason I said "quadratic" rather than "O(n^2)". A particular execution may be quadratic compared to typical running times for various sized inputs, but it's only O(n^2) if it's asymptotically quadratic. Which it's not.

The distinction here isn't just semantics. Randomized quicksort offers genuinely stronger guarantees of loglinearity than median-of-3 quicksort.
  • Median3 can only provide such a guarantee by making assumptions about the order of inputs. Randomized quicksort requires no assumptions about order.
  • For randomized quicksort, the probability of quadratic behaviour vanishes as input size increases to infinity. For median3, quadratic behaviour could occur no matter how large the input grows.
If you think median-of-3 is the best sort to use, fair enough. You're probably right. But the differences in worst-case behaviour you're dismissing are real. And yes, you could use a backup sort to provide O(n log n) worst case, but while that may be of practical relevance in choosing a sort, it doesn't say anything about the guarantees of randomized or median-of-3 quicksort. In that case, overall worst-case running time is just inherited from the other sort (assuming it's better, of course).
edit: regarding the "randomized quicksort only needs one bad pivot choice": I'm a bit rusty on the details of the pivot choice routines, but I think randomized quicksort usually also uses a median of 3 (just 3 random choices rather than 3 deterministic ones). I'm just mentioning this because your remark brought it to mind and I thought it might be necessary to give the worst-case probabilistic guarantee (I'm not sure whether it is or not - it might just be an optimization)


Jplus wrote:Those distractions that you're talking about, i.e. keywords like "template" and "auto"; it's not just that I consider them "insignificant distractions". In fact I don't consider them distractions at all. To the contrary, I think they serve as clarifications. Since you consider them distractions, I can understand that you like Haskell better.

Yeah, I do appreciate clarifications, but my interpretation of C++ syntax is different. I find C++ less readable mainly because (1) the verbosity is just too great - programs become great walls of busy text; and (2) a lot of the clarifications are not clarifying program logic, but rather lower-level details that are not salient to high-level programming. (Of course, C++ is also used for lower-level programming where the criteria would be quite different.)

I acknowledge that readability is going to vary quite a bit from person to person, though.

But the "being able to understand it just from catching it in your peripheral vision" seems exceptional to me; I think it's not feasible for most people to have such a direct Haskell-brain interface. The same is true of C++ and any other programming language of course. I'm just saying there's an alternative way to look at those keywords, i.e. to consider them clarifications rather than distractions.

Well, it is exceptional. I certainly can't understand most code that easily. But I would like to. And in some cases, with simple mathematical or list-recursive equations, Haskell really does approach that ideal level of readability. Most people don't have a Haskell-Brain interface, but people with a degree of mathematical maturity do have a Math-Brain interface to some degree. If you see a simple equation or expression using notation you're highly familiar with, you can perceive its meaning seemingly as a unit, without actually having to read. At its best Haskell captures the flavour of mathematical syntax, which is a key to its extreme elegance in those cases.

Now of course most code is going to fall far short of my ideals for readability, but my point is that this is the kind of yardstick by which I measure readability. When it comes to readability, I virtually never think "well, that's good enough". Because IMO, it virtually never is. Hence my complaining about every little thing ;-)

Jplus wrote:More on the understandability of code: I side with Bret Victor when he says that basically all mainstream languages were badly designed. I can recommend reading that page from top to bottom, as well as the rest of his website, though most of it isn't directly related to our discussion.

Yeah, I've seen that before, and I do agree with what he's saying. We really should have much better tools than we do. He seems to focus on visual tools, but I think even in terms of simple text editing, we fall short. I often find myself aware of limitations and annoyances that just shouldn't be there. Just earlier, for instance, when I said in really simple cases a type signature could just be a distraction and waste of screen space...what I was thinking (but didn't go into, in the interest of some degree of brevity) was that this should simply not be an issue because any text editor with a haskell-mode should be able to hide the type sigs at a keystroke (and any without a haskell-mode should just need a short regex in a filter bar the first time you do it). But too often, we do without simple functionality like this.

Yakk wrote:An advantage of checked concepts that the implementation can violate is that you get earlier error messages that are more informative. Instead of an error 3 deep in some library code, you get the error of failure to match at instantiation of the template.

Ah, I get it. I'm not used to thinking about templates. I was thinking that delaying the error would tend to bury it, but of course it actually gives at a chance to be instantiated so the error ends up in user code. So, yeah, I reverse my opinion. That makes perfect sense.
Last edited by troyp on Tue Dec 04, 2012 3:06 pm UTC, edited 1 time in total.

User avatar
Xeio
Friends, Faidites, Countrymen
Posts: 5101
Joined: Wed Jul 25, 2007 11:12 am UTC
Location: C:\Users\Xeio\
Contact:

Re: Coding: Fleeting Thoughts

Postby Xeio » Tue Dec 04, 2012 2:55 pm UTC

Ooooooooooo. Automatic threading by compilers? Neat.

Might have to actually read that paper later, but I have a feeling it might make my brain melt.

User avatar
Jplus
Posts: 1721
Joined: Wed Apr 21, 2010 12:29 pm UTC
Location: Netherlands

Re: Coding: Fleeting Thoughts

Postby Jplus » Tue Dec 04, 2012 5:15 pm UTC

tastelikecoke wrote:Wow, so much for a Haskell quicksort.

Hope I didn't ruin your fun. :P

tastelikecoke wrote:To be fair, My merge sort in Haskell isn't very spectacular:

Code: Select all

merge' [] ys = ys
merge' xs [] = xs
merge' (x:xs) (y:ys) = if x > y then y:(merge' (x:xs) ys) else x:(merge' xs (y:ys))
msort [] = []
msort [a] = [a]
msort xs = merge' (msort (take (l `quot` 2) xs)) (msort (drop (l `quot` 2) $ xs))
           where l = length xs

To be fair, merge sort takes a bit more code in C++ as well. By the way, taking the length of a list is expensive. You can improve your code by doing a bottom-up merge sort.
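
Something along these lines (a quick sketch of the bottom-up idea):

Code: Select all

-- Wrap every element in a singleton run, then repeatedly merge pairs of
-- runs until one remains. No call to length, no repeated splitting.
msort :: Ord a => [a] -> [a]
msort = go . map (:[])
  where
    go []   = []
    go [xs] = xs
    go xss  = go (mergePairs xss)

    mergePairs (xs:ys:rest) = merge xs ys : mergePairs rest
    mergePairs xss          = xss

    merge [] ys = ys
    merge xs [] = xs
    merge (x:xs) (y:ys)
      | x <= y    = x : merge xs (y:ys)
      | otherwise = y : merge (x:xs) ys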

tastelikecoke wrote:Back in the lower levels, I like C++'s STL. The syntax is crazy, but at least you don't need to do structs anymore. Plus I can do this:

Code: Select all

vector<vector<vector<vector<int> > > > vvvv(11,
      vector<vector<vector<int> > >(11,
            vector<vector<int> >(11,
               vector<int>(11))));

I only wish the compiler errors about it were shorter and more readable.

But, er, why would you want to do that?


troyp wrote:There's a reason I said "quadratic" rather than "O(n^2)". A particular execution may be quadratic compared to typical running times for various sized inputs, but it's only O(n^2) if it's asymptotically quadratic. Which it's not.

I'm sorry, but this doesn't seem to make any sense?

troyp wrote:The distinction here isn't just semantics. Randomized quicksort offers genuinely stronger guarantees of loglinearity than median-of-3 quicksort.
  • Median3 can only provide such a guarantee by making assumptions about the order of inputs. Randomized quicksort requires no assumptions about order.
  • For randomized quicksort, the probability of quadratic behaviour vanishes as input size increases to infinity. For median3, quadratic behaviour could occur no matter how large the input grows.

Spoiler:
Let's assume random data for now, we'll address structured data below. Using a PRNG to select a pivot in this case makes no difference, because whatever element of the range you choose, it's always random. So the only distinction we're really making here is between picking one element and making that one the pivot, or taking three elements and making the median of those the pivot.
In the single element case, the probability that you pick an extreme value is 2/n. Indeed this probability approaches zero as n approaches infinity, but now you're glossing over the fact that you also have to select pivots for each subrange, recursively. A large range has so many recursive subranges, all with smaller n, that the probability that you'll pick a bad pivot in a significant portion of the subranges is actually very high. If you look at online sort visualisations like here, you see it happening before your own eyes.
That's not all; if you pick a pivot close to the extreme values, that's almost as bad as picking the extreme values themselves. The bad news is that "close" grows with n. So in reality the probability of your single element being a bad pivot is much greater than 2/n; in fact you could maintain that the probability does not diminish with n.

Now contrast that with median-of-three. The probability that you pick an extreme value is zero. The probability that you pick the next extreme value (i.e. second smallest or second greatest) is only on the order of (1/n)(1/(n-1)), because two of the three samples have to be "unlucky" at once. Again, the larger the range the more bad pivots there are, but median-of-three pivot selection has the strong advantage over single element pivot selection that values closer to the middle of the range are more likely to be selected than values at the extremes. You pick the center value with asymptotic probability roughly 3/(2n), where single element pivot selection does it only with probability 1/n. Concluding, median-of-three selects much better pivots on average.

Now, structured data. For structured data it does matter where you sample your data, so the option to pick randomly becomes relevant. Examples of structured data include, among many other kinds, sorted data and so called "killer sequences", i.e. malign sequences that are constructed specifically to force a deterministic pivot selection mechanism to pick the worst possible pivot on O(n) recursions. Now, choosing the pivot samples from fixed locations may have both advantages and disadvantages, depending on the structure of the data:
  • For sorted data, the middle element is always the optimal pivot so using that as a fixed sample location has a strong advantage.
  • For random data, it does not matter where you sample, so sampling at fixed locations has the advantage over random sampling that it adds much less overhead.
  • For killer sequences, using a fixed location is a clear disadvantage because it can be exploited.
That's two points for fixed locations and only one for random locations.
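
If you don't want to take the distribution argument on faith, it's easy to check by brute force. A throwaway sketch (tiny n only, since it enumerates all permutations):

Code: Select all

import Data.List (permutations, sort)

-- Over all permutations of [1..n]: how often does each strategy pick a
-- pivot from the middle third of the sorted order? "single" takes the
-- first element; "med3" the median of first, middle and last.
middleThird :: Int -> Int -> Bool
middleThird n r = 3 * r > n && 3 * r <= 2 * n

pivotQuality :: Int -> (Double, Double)
pivotQuality n = (frac single, frac med3)
  where
    perms     = permutations [1 .. n]
    frac k    = fromIntegral k / fromIntegral (length perms)
    single    = length [p | p <- perms, middleThird n (head p)]
    med3      = length [p | p <- perms, middleThird n (median3 p)]
    median3 p = sort [head p, p !! (length p `div` 2), last p] !! 1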

troyp wrote:If you think median-of-3 is the best sort to use, fair enough. You're probably right. But the differences in worst-case behaviour you're dismissing are real.

I'm emphasising worst-case behaviours, not dismissing them. Explaining to you that they're the opposite of what you think they are, in fact.

troyp wrote:edit: regarding the "randomized quicksort only needs one bad pivot choice": I'm a bit rusty on the details of the pivot choice routines, but I think randomized quicksort usually also uses a median of 3 (just 3 random choices rather than 3 deterministic ones). I'm just mentioning this because your remark brought it to mind and I thought it might be necessary to give the worst-case probabilistic guarantee (I'm not sure whether it is or not - it might just be an optimization)

Random median-of-three pivot selection
  • picks the optimal pivot for sorted ranges with the same probability as for random data (roughly 3/(2n)), while median-of-three with fixed sampling on the center will pick it with probability 1;
  • for random data, adds overhead but no better pivot selection;
  • is less vulnerable to killer sequences, but at the cost of adding a lot of overhead for all other situations;
  • is still quadratic in the worst case.

troyp wrote:[...] and (2) a lot of the clarifications are not clarifying program logic, but rather lower-level details that are not salient to high-level programming.

Such as what? Perhaps there are low-level details in the code when you're actually doing low-level programming, but as far as I'm aware of there were no low-level details lying around in my example quicksort code.
"There are only two hard problems in computer science: cache coherence, naming things, and off-by-one errors." (Phil Karlton and Leon Bambrick)

coding and xkcd combined

(Julian/Julian's)

User avatar
Yakk
Poster with most posts but no title.
Posts: 11129
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Coding: Fleeting Thoughts

Postby Yakk » Tue Dec 04, 2012 6:11 pm UTC

Your complexity theory seems rough, troyp.

There are ways to say that "almost always it takes ~ n lg n time for big n, unless we get really unlucky", but that isn't O(n lg n) notation.

O(n lg n) is for: there is a K such that for large n, it always takes K n lg n time or less. Always. Not "almost always".
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

EvanED
Posts: 4331
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI
Contact:

Re: Coding: Fleeting Thoughts

Postby EvanED » Tue Dec 04, 2012 6:54 pm UTC

Jplus wrote:
EvanED wrote:I was actually surprised to see you're right for Java. However, C# uses the syntax troyp suggested.
troyp wrote:So does Scala. I was actually also thinking Java did.

I think the Java syntax would make more sense for C++ because the return type of a function may depend on the template parameters. It's mostly a moot point, though.

Personally I think the way names in return types are looked up currently is pretty annoying anyway, as you can't do this:

Code: Select all

struct S {
    struct I {};
    I foo();                  // fine: inside S, plain 'I' refers to S::I
};
I S::foo() { return I(); }    // error: out here the return type must be written S::I

and have to explicitly state that I is a member of S. The alternative auto foo() -> type syntax, or whatever it is, was invented to help with almost this situation, but I'm not sure how widely-supported it is yet. I would have preferred (from more of a clean-slate perspective) if the return type were looked up a little "later" anyway; looking it up after seeing any potential template declaration in foo<I, J>(I a, J b) would make fine sense to me, even though in some sense the <I, J> appears after the return type.

Jplus wrote:I think that's a bad idea for the same reason it's a bad idea to make int the default type for function parameters.

Like I said, I don't necessarily think it's a good idea, but I also think the two situations are not very comparable. In both my experience and from a quick grep through /usr/include/boost, non-type template parameters are in the extreme minority (my search was definitely imperfect and incomplete in a couple different ways, but scanning across a couple thousand lines of grep template **/*.hpp, I saw under a dozen that took non-types); passing an int is probably not even a majority let alone so overwhelmingly a non-majority.

Jplus wrote:The template declaration really does do more than signaling polymorphy. That's why a few posts ago I compared

Code: Select all

template <class I>
void qsort (I begin, I end);
with the hypothetical alternative

Code: Select all

void qsort (auto begin, auto end);
.
Both would be polymorphic, but only the former shows that both arguments have to be of the same type.

I get what you're saying here, but I have two counterarguments.

The first is that's not what you said. You said "The Haskell type signature above consists of two parts, "(Ord a)" and "[a] -> [a]", separated by "=>". The first part gives requirements on types and is analogous to the "template <class I>" part in the C++ function declaration", but that's just not true, for the reasons I stated above: until and if the standards committee incorporates concepts, C++'s equivalent of the typeclass requirements is entirely implicit and appears nowhere in the type declaration. The template<typename T> bit would be like if in Haskell you had to say foo :: TypeVar a: [a] -> [a] or something and explicitly declare what the type placeholders in the type are. The difference between qsort(I begin, I end) and qsort(auto begin, auto end) entirely corresponds to the fact that in the "main" part of the Haskell type you actually use explicit type variables (e.g. [a] -> [b] -> [(a,b)]) instead of just saying something like [auto] -> [auto] -> [(auto, auto)].

Which leads me to my second point, which is that there are still reasonable syntaxes that would eliminate the extra syntactic cruft, keep it obvious which arguments are type parameters, and let you restrict things the way you want here for the very very common case of all your template parameters being types that appear in the parameter list, which is to introduce type variables to C++. I already suggested a syntax like void qsort(<T> begin, <T> end), but you could also steal from ML to get void qsort('a begin, 'a end). Both of these (particularly the first) I think should be reasonably easy to implement. (The second would be made easier if you said void qsort(''a begin, ''a end).) Personally I think that would improve the readability of the common case a little, though that's different from saying that it'd be a good idea. (Remember, my argument is less that "C++ is doing things wrong" than that "C++ syntax does carry around a lot of baggage that's unnecessary in the common case but is there to make rare things easier", and that Haskell either made design choices that obviate the need for those less common cases or drops support for them, with the benefit that the common case becomes more readable.)


troyp wrote:
Jplus wrote:I have not stated this clearly enough: random pivot is more likely to degrade to O(n^2) than median-of-three.

What do you mean? A random pivot guarantees (probabilistically) O(n log n) worst-case. For a specific execution, there is always a finite chance of it taking quadratic time, but the bigger the data set, the less likely that is - and you can't have a pathological scenario where data sets are systematically ordered in such a way as to provoke quadratic behaviour.

If there is a finite chance that the algorithm takes quadratic time, then obviously it doesn't guarantee O(n log n) worst-case. In fact quicksort never can, by itself. Nothing in principle stops your random pivot from being the smallest or largest element at each recursion. In fact on random data, while improbable, this is much more likely with a random pivot than taking the second smallest or second largest element with median-of-three (because in the latter two samples must be "unlucky" instead of one). You can however rely on quicksort performing log-linear in the average case, and protect it against "bad luck" by providing an emergency sort. If you're doing that anyway, median-of-three pivot selection again makes more sense than random pivot selection.

I suspect both of these arguments are pretty firmly-footed. The first points were already made by troyp in his latest post, but I'll state them perhaps in a bit different terms.

troyp's argument that randomized pivot provides better guarantees than pure quicksort is definitely on solid ground from a theoretical standpoint. When looking at randomized algorithms, complexity folks usually look at the expected performance (over the random choices) of the worst-case inputs. Under this analysis, randomized quicksort is O(n log n) expected in the worst case (in fact, it's O(n log n) expected for every input) and median-of-three quicksort is O(n^2) worst case. That's just a fact. Why does this analysis make sense? Because if your random choices are really random, there is no possibility of bias towards worst-case behavior; you're guaranteed O(n log n) expected. Some executions may take longer, but some will take shorter. With median-of-three, you can have a bias towards O(n^2). For instance, if an attacker wanted to carry out a denial-of-service attack or something by making you do a lot of extra computation, they could craft an input that would force O(n^2). With randomized quicksort, the attacker could not. Point: troyp.

However, I suspect Jplus is also on solid ground from a practical standpoint. Maybe it is reasonable to make assumptions about the input -- in which case the attacker scenario may no longer apply. Or if you add in a fallback sort, even bad cases become not so bad. Point: Jplus.

:-)

User avatar
Yakk
Poster with most posts but no title.
Posts: 11129
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Coding: Fleeting Thoughts

Postby Yakk » Tue Dec 04, 2012 9:15 pm UTC

EvanED wrote:Like I said, I don't necessarily think it's a good idea, but I also think the two situations are not very comparable. In both my experience and from a quick grep through /usr/include/boost, non-type template parameters are in the extreme minority (my search was definitely imperfect and incomplete in a couple different ways, but scanning across a couple thousand lines of grep template **/*.hpp, I saw under a dozen that took non-types); passing an int is probably not even a majority let alone so overwhelmingly a non-majority.

Except the C++ template system is analogous to a pre-struct pre-double C language.

Imagine C with only pointers, ints, shorts, chars and longs. That's it. Almost every argument is either a pointer or an int. And most arguments are ints (or can be seamlessly converted to ints).

As of C++03 template parameters can be integral types, function pointers, and types.

In C++11 we have template template parameters, type parameters, integral parameters, function pointers, and variadic lists of the same.

Next we can expect to have concept-typed types passed to templates. Heck, maybe we'll even have concepts passed to templates, which can then be used to qualify another type.

If we had robust concept support, you'd be expected to do:

Code: Select all

template< strict (movable || copyable) T,
          strict forward_iterator iterator,
          strict function_type_compatible<T(iterator::value_type const&)> func >
generator<T> generate_transform( iterator begin, iterator end, func f=func() );

where the "default" case is no longer "just types".
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

EvanED
Posts: 4331
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI
Contact:

Re: Coding: Fleeting Thoughts

Postby EvanED » Tue Dec 04, 2012 9:48 pm UTC

Yakk wrote:As of C++03 template parameters can be integral types, function pointers, and types.

In C++11 we have template template parameters, type parameters, integral parameters, function pointers, and variadic lists of the same.

Template template parameters are in C++98/03 (and I could probably count on one hand the times I've seen one), so the only thing that C++11 adds from your lists is variadic lists, which doesn't really affect my point, as my alternative syntax could easily incorporate variadic syntax. (I also have vague memories of additional loosening of what kinds of non-type template parameters you can pass, but I don't remember specifics.)

Yakk wrote:Next we can expect to have concept-typed types passed to templates. Heck, maybe we'll even have concepts passed to templates, which can then be used to qualify another type. ... where the "default" case is no longer "just types".

This may be true, and it would mean that my "overwhelming majority" argument would cease to hold, but that's also not yet C++. Mostly I'm talking about what would be a reasonable alternative syntax for what we have now, aka what are reasonable choices that the standards committee could have made when they were coming up with the original version.

User avatar
Yakk
Poster with most posts but no title.
Posts: 11129
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Coding: Fleeting Thoughts

Postby Yakk » Tue Dec 04, 2012 9:59 pm UTC

Except standards should take into account that the language is going to be extended.

Making decisions about features that are easy to parse, and easy to parse extensions for, is a good idea. Help claw C++ out of its serious parsing mess.

Maybe it is easy to parse "typed" and "non-typed" template parameters either way, but I would guess that keeping that keyword mandatory rather than optional makes the grammar much easier to parse, and makes future extensions easier to parse too. Throw in the bad experience with typeless default-int parameters from C, and I could see the standards committee being pretty gun shy.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

troyp
Posts: 557
Joined: Thu May 22, 2008 9:20 pm UTC
Location: Lismore, NSW

Re: Coding: Fleeting Thoughts

Postby troyp » Wed Dec 05, 2012 3:52 am UTC

Jplus wrote:I'm sorry, but this doesn't seem to make any sense?

Sorry, there seems to be some confusion here, but I'm not sure where it is so I don't know what I need to clarify. All I was saying there is that O(..) is a property of an algorithm, not a single execution. It doesn't make sense to say that a single execution is quadratic so randomized quicksort is quadratic. You need to show that it could always be quadratic no matter how large the input gets.

Now of course, you can show that, and that establishes the regular Ω(n^2) worst case bound. But for the probabilistic worst-case guarantee of O(n log n), you're dealing with a statistical property (most often, expectation). If we're talking about the expected worst-case running-time, to show that's quadratic, you'd have to show that there's some choice of input data for which the sort takes quadratic time on average (over a large number of runs) *. Note that expected worst-case is quite distinct from average-case. For median-of-3, being deterministic, the expected worst-case is just the worst-case: O(n^2). For randomized, it's O(n log n).

* I'm doing the same thing here I was complaining about earlier: acting as if a single case could prove asymptotic properties. However, here the simplification doesn't really affect the argument. Technically, you need to provide not just one choice of input data, but one for every given input size above some (arbitrary) bound.

Jplus wrote:Let's assume random data for now, we'll address structured data below. Using a PRNG to select a pivot in this case makes no difference, because whatever element of the range you choose, it's always random. So the only distinction we're really making here is between picking one element and making that one the pivot, or taking three elements and making the median of those the pivot...

For randomized data things are the same, yes. The number of elements we're picking isn't the issue (except that it's O(1)). The issue is randomization, this is just adding noise. Assume the randomized version uses the median of 3 points as well, to keep things simple.

Jplus wrote:Now, structured data. For structured data it does matter where you sample your data, so the option to pick randomly becomes relevant. Examples of structured data include, among many other kinds, sorted data and so called "killer sequences", i.e. malign sequences that are constructed specifically to force a deterministic pivot selection mechanism to pick the worst possible pivot on O(n) recursions. Now, choosing the pivot samples from fixed locations may have both advantages and disadvantages, depending on the structure of the data:
  • For sorted data, the middle element is always the optimal pivot so using that as a fixed sample location has a strong advantage.
  • For random data, it does not matter where you sample, so sampling at fixed locations has the advantage over random sampling that it adds much less overhead.
  • For killer sequences, using a fixed location is a clear disadvantage because it can be exploited.
That's two points for fixed locations and only one for random locations.


(1) I thought you were specifically talking about nonrandom data here?
(2) More relevantly, what the hell!? :-)
This isn't a beauty pageant! And it's not a debate about "which is better" either. I keep saying that I'll take your word for median-of-3's superiority, but you just don't seem to believe me. ;-)

Jplus wrote:I'm emphasising worst-case behaviours, not dismissing them. Explaining to you that they're the opposite of what you think they are, in fact.

That sounds like you're saying that median-of-3 rather than randomized has the probabilistic loglinear guarantee...but I'm sure you're not. So I'll assume you mean there's no difference in expected running-time. That simply isn't true, though. If you look up any treatment of randomized quicksort it should have a proof of expected loglinearity. It's quite intuitive, anyway. Since the input is randomized, expected case for randomized is just the same as average case for median-of-3. It's just in one case the randomized input is an assumption of the definition and in the other it's a property of the algorithm.

note: Btw, there's really no point for me to say "worst case expected running time". With randomized input, there's no distinction between worst-, average- and best-case. I think I started saying worst-case so it would be clear that I wasn't just talking about deterministic average case.


Yakk wrote:Your complexity theory seems rough, troyp. There are ways to say that "almost always it takes ~ n lg n time for big n, unless we get really unlucky", but that isn't O(n lg n) notation.


? Well, I'm no expert on complexity theory, but if you're suggesting that big-Oh notation is inappropriate for use in stating probabilistic guarantees, you'll have to take it up with someone who is. Here - this is a search for "expected worst-case running time" on CiteSeerX. You'll notice some big Ohs and Thetas there.

Of course, even if I were to agree to reject that standard usage, it's just notation. It wouldn't change the fact that randomized quicksort has been proven to offer guarantees that deterministic quicksort does not.

Yakk wrote:O(n lg n) is for: there is a K such that for large n, it always takes K n lg n time or less. Always. Not "almost always".


Um...yes, exactly...so a single execution cannot ever disprove an algorithm being O(n log n), since the threshold for "large" inputs can always be chosen to be larger than that input's size.
edit: Maybe that isn't relevant to what you're saying. It's hard to know what you're referring to without context.
Are you just complaining about probabilistic bounds again? The definition of Big Oh is quite consistent with expected running times. Just as you can use it for the growth of running time or memory usage or number of array accesses, you can use it for expected running time. That's all that's happening.

EvanED wrote:However, I suspect Jplus is also on solid ground from a practical standpoint. Maybe it is reasonable to make assumptions about the input -- in which case the attacker scenario may no longer apply. Or if you add in a fallback sort, even bad cases become not so bad. Point: Jplus.

Well, yes, but that point was conceded at the start without contest. I've repeatedly (in every post) accepted Julian's claims that median-of-3 is a better sort practically. He refuses to accept my agreement! :-) But that doesn't mean there's actually a debate.

No...the only thing under debate is the probabilistic behaviour of quicksort in the worst-case. And I don't see any room for opinion there. As you said, it's just a matter of fact.

User avatar
Yakk
Poster with most posts but no title.
Posts: 11129
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Coding: Fleeting Thoughts

Postby Yakk » Wed Dec 05, 2012 9:55 am UTC

Sure: but that is really carefully qualified. It isn't O(nlgn) worst case, it is an average execution time of O(nlgn). I guess that is sort of what you said though! Sigh.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

troyp
Posts: 557
Joined: Thu May 22, 2008 9:20 pm UTC
Location: Lismore, NSW

Re: Coding: Fleeting Thoughts

Postby troyp » Wed Dec 05, 2012 11:55 am UTC

I think there's some terminological confusion. I'm just using the terminology I remember from the algorithms class I took and that I saw in the textbook. I thought it was well known, but now I wonder if it's not as standard as I thought, or just more obscure. Anyway, I'll explain the definitions I'm using.

Average running time means "running time on average input". Which is a bit ambiguous but usually average input just means randomize the inputs and take the average.

Expected running time is used with randomized algorithms and means "running time on an average execution". It doesn't mean average input, so you could have best-case, average-case and worst-case expected running times. But: if the randomized algorithm is one that randomizes the inputs, then obviously those 3 cases are equivalent (which is the case for randomized quicksort).

So to say the expected worst-case running time is O(n log n) means you can choose any input whatsoever, and the expectation of the running time will follow an asymptotic loglinear pattern for that input. The core point is "no pathological input is possible".
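
Stated formally (writing T(x, r) for the running time on input x with random choices r, and taking the expectation over r):

Code: Select all

\[
  \max_{|x| = n} \; \mathbb{E}_r \left[ T(x, r) \right] = O(n \log n)
\]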

Spoiler:
[edit: I just realized that randomized implementations of quicksort usually don't randomize the input, just the pivot choice. So it's harder to see what the expected running times are than I'm making out here. What I'm saying below still applies to algorithms that do randomize the inputs, but not to (typical) randomized quicksorts. Still, I think it's a useful intuition* for why randomized quicksort is loglinear expected RT, so I'll leave it.]

To put it informally, randomizing the input converts the deterministic average case into the expected worst case (because whatever input you fix on, it just gets randomized by the algorithm). Average case assumes random inputs. Randomizing enforces random inputs, so both best and worst case "collapse" to average.

* why? because it seems intuitive that the two are equivalent: if you pick random pivots, what's the difference if the rest of the elements are in a different order (They're still partitioned the same)? Well, the resulting subarrays have their elements in a different order - but that's only going to affect the subsequent pivot choices - which are randomized.
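
For reference, pivot randomization (as opposed to input shuffling) looks something like this; a throwaway sketch using System.Random, with no attention to efficiency:

Code: Select all

import System.Random (randomRIO)

-- Pick the pivot position at random at every recursion; the input order
-- itself is left alone. IO because of the random draws.
qsortR :: Ord a => [a] -> IO [a]
qsortR [] = return []
qsortR xs = do
  i <- randomRIO (0, length xs - 1)
  let p    = xs !! i
      rest = take i xs ++ drop (i + 1) xs
  left  <- qsortR [y | y <- rest, y < p]
  right <- qsortR [y | y <- rest, y >= p]
  return (left ++ p : right)
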
Last edited by troyp on Wed Dec 05, 2012 2:18 pm UTC, edited 2 times in total.

EvanED
Posts: 4331
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI
Contact:

Re: Coding: Fleeting Thoughts

Postby EvanED » Wed Dec 05, 2012 3:20 pm UTC

Yakk wrote:Sure: but that is really carefully qualified. It isn't O(nlgn) worst case, it is an average execution time of O(nlgn). I guess that is sort of what you said though! Sigh.

It's a little misleading to say the randomized quicksort is "O(n log n) worst case" because of the possibility of a pathologically bad set of random choices. However, for the reasons I gave in an earlier post and troyp just gave as well, I would say troyp is still closer to correct than you: the misleading part of that statement (dropping the expectation) is less of a problem IMO than confusing average/worst-case analysis (which refers to inputs) with deviations in running time caused by random choices. (Or at the very least using basically the same term "average execution time" for those two very different concepts.)

troyp wrote:Randomizing enforces random inputs, so both best and worst case "collapse" to average.

This is a good way of looking at it.

troyp
Posts: 557
Joined: Thu May 22, 2008 9:20 pm UTC
Location: Lismore, NSW

Re: Coding: Fleeting Thoughts

Postby troyp » Wed Dec 05, 2012 8:02 pm UTC

EvanED wrote:It's a little misleading to say the randomized quicksort is "O(n log n) worst case" because of the possibility of a pathologically bad set of random choices.

Yes, by convention "worst case O(..)" would refer to running time, not expected running time. I did my best to keep things straight without resorting to tedious formalism and to only use unmodified terms like "worst-case O(n log n)" when the context was clearly established. But there's always the chance I've done so when the context wasn't as clear to the reader as I believed, in which case I apologize.

User avatar
ahammel
My Little Cabbage
Posts: 2135
Joined: Mon Jan 30, 2012 12:46 am UTC
Location: Vancouver BC
Contact:

Re: Coding: Fleeting Thoughts

Postby ahammel » Wed Dec 05, 2012 8:15 pm UTC

I suppose the real worst case scenario for randomized quicksort is when there's a successful attack on your PRNG with the result that the attacker can construct a pathological O(n^2) list :lol:
He/Him/His/Alex
God damn these electric sex pants!

EvanED
Posts: 4331
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI
Contact:

Re: Coding: Fleeting Thoughts

Postby EvanED » Wed Dec 05, 2012 8:26 pm UTC

ahammel wrote:I suppose the real worst case scenario for randomized quicksort is when there's a successful attack on your PRNG with the result that the attacker can construct a pathological O(n^2) list :lol:

Well, that's why I wrote an "if your random choices are really random" at one point. You just need one of those hardware noise generators. :-)

User avatar
ahammel
My Little Cabbage
Posts: 2135
Joined: Mon Jan 30, 2012 12:46 am UTC
Location: Vancouver BC
Contact:

Re: Coding: Fleeting Thoughts

Postby ahammel » Wed Dec 05, 2012 8:29 pm UTC

It's no use, I've got root access to atmospheric noise and radioactive decay!
He/Him/His/Alex
God damn these electric sex pants!

User avatar
Yakk
Poster with most posts but no title.
Posts: 11129
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Coding: Fleeting Thoughts

Postby Yakk » Wed Dec 05, 2012 9:08 pm UTC

ahammel wrote:I suppose the real worst case scenario for randomized quicksort is when there's a successful attack on your PRNG with the result that the attacker can construct a pathological O(n^2) list :lol:

O(n^2) in time and space space space space space space space space.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

