troyp wrote:Jplus wrote: Just like the type signature of a function in Haskell, the template declaration serves to provide information as well as to catch errors early on. [...]
well, okay, but the type info is already in the declaration line (or whatever you call it). Ideally, the parameter would be there. There's no reason to have the "Template ..." line: that's just C++ boilerplate.
To the contrary. The template declaration is part of the function declaration. C++ has a very good reason to do it like that: it encodes type information differently than Haskell does.
Code: Select all
qsort :: [a] -> [a]
qsort :: [Int] -> [Int]
If you know Haskell, you can tell that the first is a template function while the second is not. That's because Haskell has a rule that type variables start with a lowercase letter while concrete types and type constructors start with an uppercase. In C++ on the other hand concrete types may start with either case (the most common convention is to have library types start with a lowercase and user-defined types with an uppercase). So if you'd just leave out the template declaration of my C++ code, the parameter list suddenly means something entirely different:
Code: Select all
// template <class I>
void qsort (I begin, I end);
The compiler now must try to find some concrete type with the name "I". If you were to redesign the language you could do this:
Code: Select all
qsort (auto begin, auto end); // auto means "automatically infer the type"
But then you lose the power to specify that begin and end should be of the same type. Alternatively you could introduce additional keywords in the parameter list to indicate type variables:
Code: Select all
void qsort (typename I begin, typename I end);
But this approach suffers from a significant disadvantage compared to the current approach. The fact that the function is a template doesn't stand out anymore, for two reasons. The first is that the parameter list may contain several other kinds of additions, most of which serve to further specify a concrete type; e.g. "const", "volatile", "&" and even "template" and "typename" to disambiguate dependent names. So the reader of the code would have to be really attentive at reading the parameter list in order to find out whether each type is concrete or templated. The second reason is that you loose a very obvious tag that screams "I'M A TEMPLATE! LOOK WHAT PARAMETERS I'VE GOT!".
In addition, it doesn't even make the function declaration any shorter. You replace two short lines by a single long one, which is less readable.
By the way, in a sense Haskell also separates the type variables from the data variables. It's just so much more clean. Look at this, see what I mean?
Code: Select all
qsort :: (Ord a) => [a] -> [a]
troyp wrote:Sorry, "pointer declarations" was inaccurate. I meant the overhead of using pointers. I was also thinking of "auto", although presumably that's not only used when derefererncing pointers.
Wrong again. You seem to know less about C++ than I assumed. Those are not necessarily pointers, although they use pointer syntax; they are iterators. Iterators provide a common interface to containers so algorithms can do their stuff without worrying whether they're dealing with a std::vector, a std::deque, a raw array or something else. The Haskell equivalent would be a "!" function which you could universally apply to IArray, MArray and any user-defined array-like type constructor. Using iterators in C++ is not overhead, just like using the "!" function in Haskell is not overhead.
troyp wrote:um...that's a pretty reasonable syntax. Can I ask why you didn't use it in the first place (and why everyone doesn't use it - always)? I suspect there are drawbacks to using it, but if there aren't...please, please never use that horrific bind syntax again! Admittedly that one is long-winded, but if that is of any concern you can use Boost.Phoenix and write it like "_1 < p" instead. C++ does not enforce long-winded code.
I didn't use it in the first place because it's part of the Boost library rather than the standard library and I started out writing code that only relied on the standard library. It has no disadvantages other than not being in the standard library (then again, Boost is usually considered almost-standard). Actually I think it will be adopted into the C++ standard library at some point, because Bind was also adopted from Boost. Boost.Phoenix is newer. Neither Phoenix nor Bind are very well-known among the general population of C++ programmers, by the way; that's the consequence of there being tons of C++ programmers.
troyp wrote:One way to see that there's stuff in the C++ code that's irrelevant to the algorithm is to just compare it to the equivalent Haskell. If something's missing from the Haskell version, it couldn't be essential to the algorithm.I'm familiar with both languages. While I do find the Haskell version easy to read, I don't find it easier than the C++ version. The class of programs we're talking about is the class of programs that are very short because they use the most naieve version of an algorithm. While you need to type slightly more in C++ to express these algorithms, I think it comes just as close to what is in your head. It's a different way to translate what's in your head, that's all.
The C++ version may be just as easy to understand (although I'm skeptical on whether that would scale to more complex code), but it's still going to take longer to actually read, and longer to scan, simply because it's more verbose and there are extraneous details to worry about. Also, the class of programs for which Haskell is extremely readable is not just toy showoff programs. It includes a large range of programs: basically, if you're dealing with numerical or algorithmic code which doesn't involve mutation or randomness, chances are the Haskell implementation will be particularly elegant.
There is nothing verbose about the C++ version (except for the choice to use longer, more speaking names), nor does it contain "extraneous details", not even compared to the Haskell version. The only difference is that C++ tends to be written as a sequence of short imperative statements while Haskell tends to be written as long lines of nested function applications, and that C++ requires all type information to be explicit (even automatic type deduction) where Haskell allows you to have the compiler figure it out. That you have the option to leave out type information in Haskell does not mean it isn't essential, in fact it very much is and usually it's better to not leave it out.
troyp wrote:No, you want median-of-three quicksort because that's optimal for sorted and reversed ranges and yields a better pivot on average for all other ranges. The "flip the pivot to the start" thing was invented only for that class of very naieve but easy to demonstrate programs.
Huh? There's nothing naieve about flipping the pivot. It just simplifies the code in some algorithms (possibly at the cost of a tiny inefficiency). And while I'm a bit vague on quicksort, I'm pretty sure randomized pivots are better than median-of-3. The ideal pivot would be the true median, but that's impossible to determine efficiently, so random pivot selection provides a good-enough approximation on average. Median of first/middle/last is also an approximation, but it's obviously a flawed one. On pathological inputs you could be pivoting every time on the second smallest(/largest) element! Asymptotically, that's just as bad as flipping on the first element. Of course, the pathological input for median-of-3 is much less likely than for pivot-on-first (which I guess is the only reason this method is used). Still, random is better.
You are wrong, really. Allow me to explain.
Secondly, the median-of-three approach is superior probability-wise. Consider three distinct classes of inputs:
- Sorted or reverse-sorted data. Median-of-three selects the optimal pivot, i.e. the middle, while a random pivot will on average be halfway between the middle and an extreme end.
- Random data. Median-of-three effectively takes the median of three random samples, while a single random pivot isn't any better than just taking the first element (actually worse, because it spends more effort).
- Artificially constructed median-of-three killer sequences (extremely unlikely unless on purpose). Random pivot selection is fairly robust against such an attack, but you can secure median-of-three selection against them as well by switching to another sorting algorithm at O(log n) recursion depth. The latter is called introsort, and it's the algorithm that implements std::sort in the C++ STL.
troyp wrote:Here's complete median-of-three quicksort in C++ as derived from my previous implementation. I invite you to write a Haskell equivalent where choosing the pivot takes only constant time (i.e. it should operate on arrays). You may assume median3 to be already implemented, as I did.
(Complete? You seem to be missing a partition routine )
Very cheeky, but std::partition is available from <algorithm> just like Haskell's partition algorithm is available from Data.List. You were the one to introduce it.
troyp wrote:I'm really not familiar with mutable arrays in Haskell. I couldn't write an inplace sort offhand. But I have seen inplace procedures written as recursive do-blocks in Haskell and it really wasn't that bad. I wouldn't say it was particularly good, either, but reasonable (not worse than C++).
Perhaps it's not much worse but I assure you it certainly isn't any better. I've seen Haskell translations of imperative algorithms and they were uglier than the original. I've been hinting at this all the time; you are all extatic about Haskell elegance but as soon as we start doing real-world things such as mutating arrays inplace it turns ugly. Haskell looks more natural when you stay in the Platonic dream world but C++ looks more natural for real stuff. In either case the difference is marginal, as C++ has Boost.Phoenix and Haskell has monads.
(Let me emphasise that I don't mean to suggest that C++ is better; I just coloured my formulation to show that I'm an Aristotelian.)
troyp wrote:You only demonstrated for fmap and foldr. I don't see how this would work for qsort.
If you want indexes, I'm not sure if there's standard class, but it's trivial to write:
Code: Select all
class Indexable a where
(!*) :: a t -> Integer -> t
instance Indexable  where
xs !* n = xs !! (fromInteger n)
instance Indexable (Array Integer) where
a !* n = let (beg, end) = bounds a
in a ! (beg + n)
Thank you for the attempt, but this is not enough for implementing quicksort. You also need a way to write to an index (not necessarily in-place). There is nothing in the Haskell hierarchical libraries that meets the requirements; I tell you it's because it would be very non-trivial. Otherwise it would already be there, preventing a lot of code duplication. Note how every container type constructor in the hierarchical libraries has its own sorting function...
troyp wrote:If you mean in an in-place implementation of quicksort, lists and normal arrays are immutable, so you couldn't use them anyway. I think there's a type class in Data.Array that serves as an interface for all mutable array types, but I don't know the details (like I said, I don't know mutable arrays). Of course, you could still make the quicksort function work on lists and arrays: you'd just have to convert to and from a mutable array to do the sorting.
You are talking about MArray. Converting to another container is not a solution, because you still need a separate conversion function for each container type constructor. Face it: C++ offers a more general interface for this kind of thing than Haskell. That's not something for Haskell to be ashamed of, because C++ and Haskell were designed with different purposes in mind. Still, it's ridiculous to claim that Haskell provides a more elegant language for implementing quicksort than C++.