Coding: Fleeting Thoughts

A place to discuss the implementation and style of computer programs.

Moderators: phlip, Prelates, Moderators General

Re: Coding: Fleeting Thoughts

Postby EvanED » Wed Jan 25, 2012 5:28 pm UTC

b.i.o wrote:It also has some other interesting things happening like typestates, which, as far as I understand them (not very far), are kind of like compile-time assertions.

Sort of. It's just a fancier type system. You can view something like int x; as a compile-time assertion that x can never hold, say, a string if you want, and from some point of view that's actually basically what types do.

Typestates just allow you to have far more powerful types. For example, I think you can do things like have a mutex variable have a type which allows you to call mutex.lock() when it's unlocked and mutex.unlock() when it's locked, but not vice-versa. That's something very few languages can do: generally speaking, if mutex has a lock() function, you can call it anywhere and even if it fails at runtime, the compiler will be perfectly happy.

(Note: I haven't looked at Rust in particular so I'm not sure what typestates look like in it.)

Rust also has strong built-in support for concurrency. Things are immutable by default, and there's support for message-passing concurrency. (I know there are ways to do concurrency in C++ too, but the whole of the Rust language is designed at the ground up with doing concurrency safely.)

And for what it's worth, I strongly feel that immutability is a key part of single-process concurrency both from a complier optimization standpoint and a programmer "WTF is going on" standpoint.
EvanED
 
Posts: 4141
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI

Re: Coding: Fleeting Thoughts

Postby Jplus » Wed Jan 25, 2012 6:57 pm UTC

b.i.o wrote:
Jplus wrote:Is there an online document that explains why this language exists? Because, I don't see any strong advantages over C++.

From the GitHub Wiki:
What is this project's goal, in one sentence?
To design and implement a safe, concurrent, practical, static systems language.
Those goals/properties are also listed on the homepage, but I didn't find them very convincing.

b.i.o wrote:I think the differences are in the first two. Rust, unlike C++, deals with memory safety for you,
The STL deals with memory safety for you, too. A new C++ program that applies modern coding conventions is as safe as a new Rust program. C++, however, allows you to make use of established libraries.
b.i.o wrote:and also has a GC that you can use if you want to.
There are several GCs for C++, too. With the third-party GCs for C++ you can GC everything instead of only the shared types.
b.i.o wrote:It also has some other interesting things happening like typestates, which, as far as I understand them (not very far), are kind of like compile-time assertions.
Compile-time assertions have been available in C++ for years. You need to use Boost for it, but it's perfectly portable. (Also, see my reaction to EvanED below.)
b.i.o wrote:Rust also has strong built-in support for concurrency. Things are immutable by default, and there's support for message-passing concurrency. (I know there are ways to do concurrency in C++ too, but the whole of the Rust language is designed at the ground up with doing concurrency safely.)
Yeah. Well. This is probably the most only convincing advantage. C++, however, is catching up. C++11 has builtin concurrency (admittedly without message passing), FastFlow has been around for a while and provides message passing (admittedly with a more restrictive license), and Boost.Context is coming up which will provide lightweight threads (i.e. fibers) and also message passing. Nowadays everyone wants message passing, so C++ is inevitably going to become a full-fledged member of the club.
Immutability isn't a real difference because const-qualified objects in C++ are just as thread-safe as default-immutable objects in Rust.
As for "being designed from the ground up for doing concurrency safely": it's just a matter of strategy. C++ is designed such that you can add any functionality through libraries, while Rust attempts to provide everything you might need as a built-in. By analogy, you could also say that Rust "is being designed from the ground up to provide memory-safe vectors", but that only serves to distract people from the fact that C++ has memory-safe vectors, too. Personally I think C++'s "extensibility strategy" is more powerful because it allows for efficient implementations of unanticipated technologies such as hash tables, lambda functions, currying, event handling through signals and slots, generic type concepts, static assertions, distributed processing and concurrency with message passing.

EvanED wrote:Typestates just allow you to have far more powerful types. For example, I think you can do things like have a mutex variable have a type which allows you to call mutex.lock() when it's unlocked and mutex.unlock() when it's locked, but not vice-versa. That's something very few languages can do: generally speaking, if mutex has a lock() function, you can call it anywhere and even if it fails at runtime, the compiler will be perfectly happy.
If this is what typestates in Rust provide then that's kind of interesting, but still not something you wouldn't be able to reproduce in C++. As a simple example, take the mutex again. You can always try to lock a C++ std::mutex, even when it's already locked, but that's exactly the point; in such a case the caller will be forced to wait until the mutex is unlocked.
Note: using mutexes would defeat the purpose of message passing.

All other features listed on the Rust homepage are equivalent to what C++ has to offer, if not weak or restricted versions thereof. Conclusion: I'm even less convinced than I was.
Feel free to call me Julian. J+ is just an abbreviation.
Image coding and xkcd combined
User avatar
Jplus
 
Posts: 1558
Joined: Wed Apr 21, 2010 12:29 pm UTC
Location: classified

Re: Coding: Fleeting Thoughts

Postby EvanED » Wed Jan 25, 2012 7:13 pm UTC

Jplus wrote:
b.i.o wrote:I think the differences are in the first two. Rust, unlike C++, deals with memory safety for you,
The STL deals with memory safety for you, too.

If the STL really dealt with memory safety, vector::operator[] would be range-checked.

Even in the STL there is metric fuckton of ways to corrupt your heap. C++ isn't even remotely memory-safe.

b.i.o wrote:and also has a GC that you can use if you want to.
There are several GCs for C++, too.

Sort of. I don't know if Rust has a moving collector, but you can't have one for C or C++ and still be compliant. And it's moving collectors which have made automatic GC bearable.

b.i.o wrote:Immutability isn't a real difference because const-qualified objects in C++ are just as thread-safe as default-immutable objects in Rust.

If your fancy-schmancy STL doesn't give you decent functional data structures -- and it doesn't -- then you basically don't have immutability. (I don't know how Rust does here, but Clojure does this very very right.)

If this is what typestates in Rust provide then that's kind of interesting, but still not something you wouldn't be able to reproduce in C++. As a simple example, take the mutex again. You can always try to lock a C++ std::mutex, even when it's already locked, but that's exactly the point; in such a case the caller will be forced to wait until the mutex is unlocked.

You're missing the point. With typestates, you can't try to lock a mutex a second time, because the program won't compile.

Or if you want another example, take files. You can't read from a file that hasn't been opened. This has to be a run time error without typestates, but is something that the compiler would be able to prove correct.

All other features listed on the Rust homepage are equivalent to what C++ has to offer, if not weak or restricted versions thereof. Conclusion: I'm even less convinced than I was.

You should study more PL.
EvanED
 
Posts: 4141
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI

Re: Coding: Fleeting Thoughts

Postby Yakk » Wed Jan 25, 2012 7:44 pm UTC

As an example of how a typestate system would work in C++:
Code: Select all
void foo() {
  File f;
  int bar;
  f >> bar; // compile time error
  f.open(); // typestate of f changes to being an open file
  f >> bar; // legal
  File f2;
  if (testfunc()) {
    f2.open(); // typestate of f2 changes to open
  } else {
    // nothing -- doing nothing here without returning might be illegal!
  }
  f2 >> bar; // illegal -- f2's openness is ambiguous.
  // alternatively, it might be simply illegal to reach here without opening f2 in the else branch above!
}

Marking up such typestates is an interesting problem. I wonder how Rust does it? I should learn.

You could structure code in C++ so that this works, but it would be annoying:
Code: Select all
void foo() {
  File f;
  int bar;
  f >> bar; // compile time error, Files don't have a >> operator
  OpenFile fo = f.open(); // open the file f
  fo >> bar; // legal
  File f2;
  try {
    struct FailedToOpen {};
    OpenFile of2 = [&]()->OpenFile {
      if (testfunc()) {
        return f2.open;
      }
      throw FailedToOpen();
    }();
    of2 >> bar;
  } catch( FailedToOpen ) {
    // handle not having an open file.
  }
}

whereby I create a new type whenever I'd create a new typestate, and manage their lifetimes with scopes and lambdas and catch and throw.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.
User avatar
Yakk
Poster with most posts but no title.
 
Posts: 10450
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Coding: Fleeting Thoughts

Postby Jplus » Wed Jan 25, 2012 10:45 pm UTC

EvanED wrote:If the STL really dealt with memory safety, vector::operator[] would be range-checked.

Even in the STL there is metric fuckton of ways to corrupt your heap. C++ isn't even remotely memory-safe.

You can restrict yourself to the safe operations without losing any functionality (except for the ability to corrupt the heap). There is vector::at for those who want it.
(BTW: woah, no need to get so excited...)

EvanED wrote:
b.i.o wrote:Immutability isn't a real difference because const-qualified objects in C++ are just as thread-safe as default-immutable objects in Rust.

If your fancy-schmancy STL doesn't give you decent functional data structures -- and it doesn't -- then you basically don't have immutability.

Make a const std::list or a const std::vector and you'll be perfectly sure that you can't change it after initialization. If you think there is more to immutability than not being able to change an object, I'm glad to hear about it.

EvanED wrote:
If this is what typestates in Rust provide then that's kind of interesting, but still not something you wouldn't be able to reproduce in C++. As a simple example, take the mutex again. You can always try to lock a C++ std::mutex, even when it's already locked, but that's exactly the point; in such a case the caller will be forced to wait until the mutex is unlocked.

You're missing the point. With typestates, you can't try to lock a mutex a second time, because the program won't compile.

Or if you want another example, take files. You can't read from a file that hasn't been opened. This has to be a run time error without typestates, but is something that the compiler would be able to prove correct.

Yes, I missed your point. I agree that if this is truly what Rust implements, it's a very interesting feature. However, I suspect this would come at a high compile-time cost.

EvanED wrote:
All other features listed on the Rust homepage are equivalent to what C++ has to offer, if not weak or restricted versions thereof. Conclusion: I'm even less convinced than I was.

You should study more PL.

Oh really? Let's have a look at the list.

rust-lang.org wrote:Compilation model: batch, ahead-of-time, C/C++ compatible
Exactly like C++.
rust-lang.org wrote:Type system: static, structural-algebraic, with metadata
"Static": like C++. "Structural-algebraic": sounds more like Haskell et al, but that's definitely equivalent to what C++ has to offer. "With metadata": alright, I admit I don't know what they mean by that. If this refers to the typestate stuff I repeat that I find it an impressive feature. Otherwise it might also refer to something like type traits, which have been present in C++ for years.
rust-lang.org wrote:Type inference: yes, only local variables
C++11 has type inference too. Everywhere.
rust-lang.org wrote:Generic types: yes, only simple, non-turing-complete substitution
C++ has fully Turing-complete generics.
rust-lang.org wrote:Concurrency: isolated tasks, message passing
Already discussed.
rust-lang.org wrote:Unique pointers: move semantics, no races or sharing
Apart from unique pointers and move semantics C++ also offers shared pointers, intrusive pointers, weak pointers, bare pointers and references, as well as the possibility for library developers to implement any other kind of pointer they want.
rust-lang.org wrote:Memory safety: no buffer overflow, use before init, NULL or free()
Already discussed.
rust-lang.org wrote:Immutability: immutable by default, mutability is the special case
Already discussed.
rust-lang.org wrote:Garbage collection: optional, per-task, only "shared" types
Already discussed.
rust-lang.org wrote:Error handling: isolated tasks, unrecoverable unwinding
Sounds exactly like exceptions in C++.
rust-lang.org wrote:Text: utf8 strings, ucs4 characters
Less choice than in C++.
Feel free to call me Julian. J+ is just an abbreviation.
Image coding and xkcd combined
User avatar
Jplus
 
Posts: 1558
Joined: Wed Apr 21, 2010 12:29 pm UTC
Location: classified

Re: Coding: Fleeting Thoughts

Postby EvanED » Wed Jan 25, 2012 11:14 pm UTC

Jplus wrote:
EvanED wrote:If the STL really dealt with memory safety, vector::operator[] would be range-checked.

Even in the STL there is metric fuckton of ways to corrupt your heap. C++ isn't even remotely memory-safe.

You can restrict yourself to the safe operations without losing any functionality (except for the ability to corrupt the heap). There is vector::at for those who want it.
(BTW: woah, no need to get so excited...)

And who actually does that?

Also, you can't use iterators. Or at least you can't move them. Or call most <algorithm>s with them.

Make a const std::list or a const std::vector and you'll be perfectly sure that you can't change it after initialization. If you think there is more to immutability than not being able to change an object, I'm glad to hear about it.

Code: Select all
void frob(const list<int>& l) {
    print_list(l);
    print_list(l);
}


Are the two print_list calls guaranteed to print the same thing? (Answer: no, if another thread changes l at the same time.)

And that's even ignoring issues like mutable members and the ability to cast away const from non-physically-const objects.

rust-lang.org wrote:Type inference: yes, only local variables
C++11 has type inference too. Everywhere.

"Everywhere" isn't true for C++. It can't infer return types or argument types. (It can infer template argument types, but only if they are used as parameter types.)

Furthermore, Rust has unification-based type inference, which C++ doesn't even remotely have. See this blog, and note in particular the second example. That's something C++ can't do.

Edit: also, structural typing is the exact opposite of what C++ has.

Edit 2: this page has an example of its typestate abilities. It's actually a bit different than I was envisioning... I'm not sure if it's more powerful or less powerful. It is definitely not emulatible in C++ without separate objects (a la Yakk's example), however.
EvanED
 
Posts: 4141
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI

Re: Coding: Fleeting Thoughts

Postby Jplus » Thu Jan 26, 2012 12:29 am UTC

EvanED wrote:
Jplus wrote:You can restrict yourself to the safe operations without losing any functionality (except for the ability to corrupt the heap). There is vector::at for those who want it.

And who actually does that?

I suppose all people who make a fuss about the Ada/Java/Rust-type of safety.

EvanED wrote:Also, you can't use iterators. Or at least you can't move them. Or call most <algorithm>s with them.

Could you clarify? I think I'm using iterators and calling algorithms with them all the time, but perhaps you mean something different.

EvanED wrote:
Make a const std::list or a const std::vector and you'll be perfectly sure that you can't change it after initialization. If you think there is more to immutability than not being able to change an object, I'm glad to hear about it.

Code: Select all
void frob(const list<int>& l) {
    print_list(l);
    print_list(l);
}


Are the two print_list calls guaranteed to print the same thing? (Answer: no, if another thread changes l at the same time.)

And that's even ignoring issues like mutable members and the ability to cast away const from non-physically-const objects.

Correct answer: if the source object of l was declared const (and not volatile), then yes, the two print_list calls are guaranteed to print the same result.
If a class allows for member mutation in const-declared objects, you shouldn't use it because the interface has been poorly defined. Only member functions that do not affect the state of the object must be marked const.
const_cast should only be used on rvalue references to objects that were declared non-const, otherwise it's undefined behaviour.

EvanED wrote:
rust-lang.org wrote:Type inference: yes, only local variables
C++11 has type inference too. Everywhere.

"Everywhere" isn't true for C++. It can't infer return types or argument types. (It can infer template argument types, but only if they are used as parameter types.)

Furthermore, Rust has unification-based type inference, which C++ doesn't even remotely have. See this blog, and note in particular the second example. That's something C++ can't do.

Alright, C++ doesn't infer argument types, because then the language couldn't be statically typed anymore, so "everywhere" isn't true. But C++11 can infer return types and I think it can also do the things in the examples from the blog. Learn about auto and decltype. (The article makes mention of "C++09" because at the time the new standard was expected to be published by 2009. The exact specifications in the final standard might differ a bit from this article.)

EvanED wrote:Edit: also, structural typing is the exact opposite of what C++ has.

Well, more or less. So what? You can achieve the same things with either approach.

EvanED wrote:Edit 2: this page has an example of its typestate abilities. It's actually a bit different than I was envisioning... I'm not sure if it's more powerful or less powerful. It is definitely not emulatible in C++ without separate objects (a la Yakk's example), however.

Yes that's definitely something you can't do as easily in C++. Hooray for Rust.
(FWIW: you could probably use static assertions and boost::enable_if to avoid most of the inconveniences in Yakk's workaround, including the need to define a new type for every typestate. Still, I agree Rust has a strong advantage over C++ here.)
Feel free to call me Julian. J+ is just an abbreviation.
Image coding and xkcd combined
User avatar
Jplus
 
Posts: 1558
Joined: Wed Apr 21, 2010 12:29 pm UTC
Location: classified

Re: Coding: Fleeting Thoughts

Postby EvanED » Thu Jan 26, 2012 1:25 am UTC

Jplus wrote:
EvanED wrote:
Jplus wrote:You can restrict yourself to the safe operations without losing any functionality (except for the ability to corrupt the heap). There is vector::at for those who want it.
...
Could you clarify? I think I'm using iterators and calling algorithms with them all the time, but perhaps you mean something different.

You said you can restrict yourself to safe operations without losing functionality. Moving iterators is not an operation where there is a guaranteed-safe version. In other words, you cannot know whether iter++ will corrupt your heap (or more realistically, crash or reveal private information) without understanding what containers it may have come from and where in those containers it might be. Nor is there an alternative type of iterator you can ask for whose operations are guaranteed safe.

vector<int> v1, v2; ... find(v1.begin(), v2.end(), 4) is an obvious error, but there is no standard way for find to figure that out, nor is there an equivalent operation which will find such an error.

(Edit: Actually my final statement is a bit off; I think C++11 may have added convenience versions of the <algorithm>s that take a container instead of an iterator range, or something like that (e.g. boost::range). However, it is true that there is no way to write a guaranteed-safe function which takes an iterator range.)

You can come back with "but don't do that", but I can rightfully say "we already have a term for 'is memory safe if you don't make mistakes': it's called not memory safe".

Correct answer: if the source object of l was declared const (and not volatile), then yes, the two print_list calls are guaranteed to print the same result.

But was it declared const? How can you know? How can the complier know? The answer to that is that you as the programmer and the compiler (as, well, the compiler) has to be prepared for the case where that function is called with an object which isn't physically const. ("Physically const" = "declared as const".) This makes reasoning more difficult for the programmer and can inhibit optimizations.

And guess what? There's no way to say "this function must be passed an object which is physically const."

That's the difference between C++ const and building your language around true immutability.


Alright, C++ doesn't infer argument types, because then the language couldn't be statically typed anymore, so "everywhere" isn't true.

That's also not true; Haskell, ML, and some other more obscure languages are capable of full type inference, including parameters, and are statically typed.

But C++11 can infer return types...

OK, this looks to be true. I didn't know that.

...and I think it can also do the things in the examples from the blog.

However, this isn't.

auto variables need an initializer, and decltype isn't type inference; it just gives you another way to specify a particular type. The use case is basically templates, where you can't know what the type of an expression is until instantiation time, so you were previously unable to name it.
EvanED
 
Posts: 4141
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI

Re: Coding: Fleeting Thoughts

Postby b.i.o » Thu Jan 26, 2012 3:36 am UTC

Jplus wrote:Alright, C++ doesn't infer argument types, because then the language couldn't be statically typed anymore

...the OCaml code I write every day would like a word worth you.
User avatar
b.i.o
Green is the loneliest number
 
Posts: 2520
Joined: Fri Jul 27, 2007 4:38 pm UTC
Location: Hong Kong

Re: Coding: Fleeting Thoughts

Postby EvanED » Thu Jan 26, 2012 3:55 am UTC

I want to expand a bit more on the const vs immutable objects thing, because I really believe that an ML/Clojure-style language, where the only variables which can ever change are those explicitly marked as such, has as it Right™.

First, let's keep going with the concurrency example. Does frob() (the function that prints a list twice) have to lock its argument if it wants consistent results between the two calls to print_list? The answer is yes, for the same reason as I gave in my last post: it can't know whether what its passed is physically const. The only way around this is if you make it an idiom that frob can never be called with something that isn't physically const. And that's something which is just as intrusive as if you've ever had to add const to non-const code -- only this time, you don't have the type system to help you find those places. If you've got even one path -- maybe you have to trace an object through parameters to five functions, or worse -- to frob with a non-physically-const object, you have a potential race condition on your hand.

By contrast, in a language designed with immutability in mind, you don't have to lock. Why do you have locks? Because you have conflicts between threads. Those conflicts are either writer-writer or writer-reader conflicts. But if you can't have writers, you can't have conflicts. And you've eliminated race conditions. (This is too optimistic, of course. But immutability does go a very long way towards that goal.)

Furthermore, you can't even really say "program with immutable objects in C++". While technically you could, you'd have to build/assemble your own library, because the C++ standard library isn't even remotely built for it. (Given a physically-const list and item, how can I construct a new physically-const list with that item appended to the end of that list? You can do it, but it ain't pretty and it must be slow because list must be a doubly-linked list.) And furthermore, even if you did that, I suspect you'd only be able to reap the programmer benefits -- and the compiler wouldn't be able to do the deep reasoning required to realize that the object can't change.

But second, you don't even need concurrency.
Code: Select all
void frob(list<int> const & l) {
    print_list(l);
    do_something();
    print_list(l);
}

The program is single-threaded. Now can you guarantee that you'll get the same result?

You still can't, because do_something() might change l. (This is true even if it's not passed l as a parameter. Maybe a pointer or reference to l's referent is stored in a global, or maybe in a member variable in a class where frob and do_something are members.)

Thus the same arguments as to why immutability is good for concurrent programs applies here in the sequential case. Both programmer and compiler have to reason about the potential of l changing. (More realistically you'll have the following situation. For the compiler, if it doesn't inline do_something (and BTW it can't if do_something() is virtual) the compiler will assume that l does change and give up optimization opportunities. For the programmer, he'll have to either guess that l doesn't change or go tromping through do_something to establish that it won't.)
EvanED
 
Posts: 4141
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI

Re: Coding: Fleeting Thoughts

Postby troyp » Thu Jan 26, 2012 6:06 am UTC

Type state looks interesting. There's actually a classic paper[PDF] (which I haven't read) on this construct.
troyp
 
Posts: 529
Joined: Thu May 22, 2008 9:20 pm UTC
Location: Lismore, NSW

Re: Coding: Fleeting Thoughts

Postby Jplus » Thu Jan 26, 2012 12:03 pm UTC

EvanED wrote:You said you can restrict yourself to safe operations without losing functionality. Moving iterators is not an operation where there is a guaranteed-safe version. In other words, you cannot know whether iter++ will corrupt your heap (or more realistically, crash or reveal private information) without understanding what containers it may have come from and where in those containers it might be. Nor is there an alternative type of iterator you can ask for whose operations are guaranteed safe.

You're dead wrong. There is an alternative type of iterator, called the const iterator, which is memory safe (it can't corrupt your heap because you can't write to it). Const-declared and const-reference-passed containers will only return const iterators. Even if your iterator is not const, just incrementing it will never corrupt your heap.

EvanED wrote:vector<int> v1, v2; ... find(v1.begin(), v2.end(), 4) is an obvious error, but there is no standard way for find to figure that out, nor is there an equivalent operation which will find such an error.
[...]
You can come back with "but don't do that", but I can rightfully say "we already have a term for 'is memory safe if you don't make mistakes': it's called not memory safe".

Containers guarantee that they always return valid begin and end iterators. Algorithms guarantee to do the right thing if you give them valid iterators. So it's safe as long as you don't interfere with the natural flow of things on purpose. "Memory safe if the programmer doesn't fuck with it on purpose" is memory safe. I challenge you to give an example of incorrect iterator use where the programmer doesn't do it on purpose.

EvanED wrote:
Correct answer: if the source object of l was declared const (and not volatile), then yes, the two print_list calls are guaranteed to print the same result.

But was it declared const? How can you know? How can the complier know? [...]

And guess what? There's no way to say "this function must be passed an object which is physically const."

That's the difference between C++ const and building your language around true immutability.

Point taken.

EvanED wrote:
Alright, C++ doesn't infer argument types, because then the language couldn't be statically typed anymore, so "everywhere" isn't true.

That's also not true; Haskell, ML, and some other more obscure languages are capable of full type inference, including parameters, and are statically typed.

(Also in response to b.i.o on OCaml:)
I said "C++ doesn't infer argument types" in response to your own claim, assuming that you were talking about non-template functions. As you already pointed out, C++ can infer argument types in templates. So you can't do the following in C++:
Code: Select all
auto add3 (auto first, auto second, auto third) {
    return first + second + third;
}

void foo () {
    int a = 1, b = 2, c = 3;
    std::string x = "He", y = "ll", z = "o!";
    add3(a, b, c);
    add3(x, y, z);
}

But you can do this:
Code: Select all
template <class T>
T add3 (T first, T second, T third) {
    return first + second + third;
}

void foo () {
    int a = 1, b = 2, c = 3;
    std::string x = "He", y = "ll", z = "o!";
    add3(a, b, c);
    add3(x, y, z);
}

So why can Haskell and OCaml do it everywhere, while C++ can only do it in templates? Because Haskell and OCaml don't offer the possibility to define non-generic functions. If they did, you'd have the same situation where you can't always let the compiler infer argument types.

EvanED wrote:
...and I think it can also do the things in the examples from the blog.

However, this isn't.

auto variables need an initializer, and decltype isn't type inference; it just gives you another way to specify a particular type. The use case is basically templates, where you can't know what the type of an expression is until instantiation time, so you were previously unable to name it.

It's true, this is another thing that C++ can't do while Rust can.

On to your next post:
EvanED wrote:[...]
By contrast, in a language designed with immutability in mind, you don't have to lock. Why do you have locks? Because you have conflicts between threads. Those conflicts are either writer-writer or writer-reader conflicts. But if you can't have writers, you can't have conflicts. And you've eliminated race conditions. (This is too optimistic, of course. But immutability does go a very long way towards that goal.)

It's true that interface-enforced immutability eliminates the need for locks, but so does message passing. As I said before, C++ is already on its way to provide message passing, and in fact it will probably do so before Rust is released. So although "true" immutability will probably always be absent from C++, lock-free multithreading is definitely going to be available.

EvanED wrote:[...]
But second, you don't even need concurrency.
Code: Select all
void frob(list<int> const & l) {
    print_list(l);
    do_something();
    print_list(l);
}

The program is single-threaded. Now can you guarantee that you'll get the same result?

You still can't, because do_something() might change l. (This is true even if it's not passed l as a parameter. Maybe a pointer or reference to l's referent is stored in a global, or maybe in a member variable in a class where frob and do_something are members.)

While it's technically true that you can (try to) do such things in C++, this is again a case where the programmer is sabotaging their own program on purpose. This is even more contrived than the example where you pass an algorithm the wrong iterators on purpose. In fact it's simply undefined behaviour, and to continue with your next paragraph:

EvanED wrote:Thus the same arguments as to why immutability is good for concurrent programs applies here in the sequential case. Both programmer and compiler have to reason about the potential of l changing. (More realistically you'll have the following situation. For the compiler, if it doesn't inline do_something (and BTW it can't if do_something() is virtual) the compiler will assume that l does change and give up optimization opportunities. For the programmer, he'll have to either guess that l doesn't change or go tromping through do_something to establish that it won't.)

No, a conformant C++ compiler will assume that l does NOT change. Because the programmer has explicitly stated that frob will not affect its argument. Hence, the compiler will also apply const-related optimizations, for examply by altering the order of the statements. Obviously if the programmer has crafted some sneaky workaround in order to be able to affect the referent of l in the body of do_something, his program might do all kinds of unexpected things and might behave differently on different compilers.
Because the kind of trick you described constitutes undefined behaviour, programmers are also justified to assume it doesn't happen.

Also, for clarity: whether a function is inlined or not does not affect a C++ compiler's assumptions about object change. If a function takes its argument by value or by const reference, the compiler will always assume that the original object remains unaltered.

troyp wrote:Type state looks interesting. There's actually a classic paper[PDF] (which I haven't read) on this construct.

Thanks. Again, let me emphasize that I think typestate is a real advantage of Rust.
Feel free to call me Julian. J+ is just an abbreviation.
Image coding and xkcd combined
User avatar
Jplus
 
Posts: 1558
Joined: Wed Apr 21, 2010 12:29 pm UTC
Location: classified

Re: Coding: Fleeting Thoughts

Postby Yakk » Thu Jan 26, 2012 3:51 pm UTC

Jplus wrote:You're dead wrong. There is an alternative type of iterator, called the const iterator, which is memory safe (it can't corrupt your heap because you can't write to it). Const-declared and const-reference-passed containers will only return const iterators. Even if your iterator is not const, just incrementing it will never corrupt your heap.

You didn't cover "leak information from other parts of your heap".

I was under the impression that incrementing an iterator past end is undefined behavior, regardless of it being a const_iterator or not.

Dereferencing a const iterator past end is going to execute undefined behavior, isn't it? (Ie, a crash, or worse -- well, probably just junk data or a memory fault)
Containers guarantee that they always return valid begin and end iterators. Algorithms guarantee to do the right thing if you give them valid iterators. So it's safe as long as you don't interfere with the natural flow of things on purpose. "Memory safe if the programmer doesn't fuck with it on purpose" is memory safe. I challenge you to give an example of incorrect iterator use where the programmer doesn't do it on purpose.
Every typo by every programmer is included in the set of on purpose actions?

You are making a joke here, no?

Please may it be a joke.
Point taken.
It is the general problem that a restricted foo is not a type of foo in every sense. A square is not a restricted rectangle. An immutable foo is not a foo.

A square is almost a restricted rectangle, and an immutable foo is almost a foo. In fact, an immutable square is an immutable rectangle.

And, with some caveats that are impossible to guarantee in practice in C++, you can treat a foo as an immutable foo in a particular stretch of code.
But you can do this:
Code: Select all
template <class T>
T add3 (T first, T second, T third) {
    return first + second + third;
}

You can do much better:
Code: Select all
template<class A, class B, class C>
auto add3(A a, B b, C c)->decltype(a+b+c)  {
  return a+b+c;
}

which now takes 3 of any type and has a deduced return type.

Note, however, in both cases I told the compiler how to work out the return type separate from how I told it how to calculate the return value. In one case, we said it was T (when operator+ might be overloaded to return something else -- don't joke, I ran into that issue with operator<< more than once), and in another case I have to duplicate the expression.
On to your next post:
EvanED wrote:[...]By contrast, in a language designed with immutability in mind, you don't have to lock. Why do you have locks? Because you have conflicts between threads. Those conflicts are either writer-writer or writer-reader conflicts. But if you can't have writers, you can't have conflicts. And you've eliminated race conditions. (This is too optimistic, of course. But immutability does go a very long way towards that goal.)

It's true that interface-enforced immutability eliminates the need for locks, but so does message passing. As I said before, C++ is already on its way to provide message passing, and in fact it will probably do so before Rust is released. So although "true" immutability will probably always be absent from C++, lock-free multithreading is definitely going to be available.
Well, constexpr is coming to C++, which is a step closer to true immutability in a sense.

All you need is a const-type qualifier that cannot be cast away, and you cannot turn non-const versions into a const version. Call it "const_I_am_not_kidding_this_time".
EvanED wrote:[...]
But second, you don't even need concurrency.
Code: Select all
void frob(list<int> const & l) {
    print_list(l);
    do_something();
    print_list(l);
}

The program is single-threaded. Now can you guarantee that you'll get the same result?

You still can't, because do_something() might change l. (This is true even if it's not passed l as a parameter. Maybe a pointer or reference to l's referent is stored in a global, or maybe in a member variable in a class where frob and do_something are members.)

While it's technically true that you can (try to) do such things in C++, this is again a case where the programmer is sabotaging their own program on purpose.

Ok, now I think you actually aren't joking.

You clearly haven't worked on a project of a non-trivial size and complexity. Large programs end up being worked on by people you have never met, and parts of the code are doing things that you will never, ever read or know about. do_something() could quite easily cause arbitrary code to be run -- maybe it goes off and sends a message to the user interface, which updates some dodad, which causes something else to happen, which eventually touches some control that makes a twerk to the contents of the list.

Any program you can hold in your head and fully understand is a trivial program, and trivial programs are an easy problem and not really worth talking about when talking about programming language features (other than "we can make more programs into trivial programs").

This is at least half of the point of immutable objects and "safe" pointers and the like -- it reduces the number of things you have to hold in your head to understand what a chunk of code does. (it does the same for the compiler, which allows for optimizations, and is at least half of the point as well... don't ask me about the third half).
This is even more contrived than the example where you pass an algorithm the wrong iterators on purpose. In fact it's simply undefined behaviour, and to continue with your next paragraph:
As far as my current rudimentary understanding of the C++ standard, modifying l in that function is perfectly ok, and the compiler has to assume that it could happen. By making that function call, you have wandered into lala land.

I could be wrong. But Herb agrees with me:
http://gotw.ca/gotw/081.htm
Because the kind of trick you described constitutes undefined behaviour, programmers are also justified to assume it doesn't happen.

I'd seriously like a citation from the standard (or draft version). I've done a quick googling, and failed to find such a citation.

Here is the draft standard (11 MB!)
http://www.open-std.org/jtc1/sc22/wg21/ ... /n3242.pdf
... sadly, that might just be changes since 98.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.
User avatar
Yakk
Poster with most posts but no title.
 
Posts: 10450
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Coding: Fleeting Thoughts

Postby EvanED » Thu Jan 26, 2012 4:34 pm UTC

Jplus wrote:You're dead wrong. There is an alternative type of iterator, called the const iterator, which is memory safe (it can't corrupt your heap because you can't write to it).

You keep missing the point. Moving iterators, ironically enough, is the unsafe operation I'm talking about. Yakk's response is right: moving iterators past the end of a container is undefined behavior. (You can go one item past the end or, for reverse iterators, one past the beginning.) Theoretically, it could corrupt your heap. More practically, it will either crash (for the node-based containers like lists and maps) or allow you to reveal private information (for vectors). Why? Because it's not memory safe..

And how can you tell if an iterator is at the beginning or end of a container? Without knowing the container, you can't. Hence my statement that it's impossible to write a function which takes an iterator range, iterates over it, and is guaranteed-safe.

I challenge you to give an example of incorrect iterator use where the programmer doesn't do it on purpose.

My example. That's an easy enough mistake to make.

Don't like it? What about
Code: Select all
mismatch(v1.begin(), v2.begin(), v1.end());
mismatch(v1.begin(), v1.end(), v2.begin());

Without looking at the docs, which one is correct? If you know C++ well, you can probably guess... but are you 100% sure?

And what else do you have to know in order to know that this call is safe? Suppose that line is in a function foo where you are just given the three iterators. Can you determine whether that call will be safe? (No.) Do you want to go looking at all the callers of foo, and maybe their callers?

So why can Haskell and OCaml do it everywhere, while C++ can only do it in templates? Because Haskell and OCaml don't offer the possibility to define non-generic functions.

Wrong again.
Code: Select all
# let add x y = x + y;;
val add : int -> int -> int = <fun>
# add 1 1;;
- : int = 2
# add 1.0 1.0;;
Error: This expression has type float but an expression was expected of type
         int


While it's technically true that you can (try to) do such things in C++, this is again a case where the programmer is sabotaging their own program on purpose. This is even more contrived than the example where you pass an algorithm the wrong iterators on purpose. In fact it's simply undefined behaviour,...

I am not 100% positive about this, but I think you're wrong. I'd like to see a citation for your standpoint. What you're essentially saying is that the following program provokes undefined behavior:
Code: Select all
int main() {
    list<int> l;
    list<int> const & r = l;
    std::cout << "r's size: " << r.size() << "\n";
    l.push_back(4);
    std::cout << "r's size: " << r.size() << "\n";
    return 0;
}


No, a conformant C++ compiler will assume that l does NOT change. Because the programmer has explicitly stated that frob will not affect its argument.

Two mistakes. First, stating that frob will not affect its argument is different from saying that argument won't change, for reasons which we have discussed plenty. Second, it's not even saying that frob will not affect it's argument... it's saying it won't affect its argument through that reference. Here's another frob:
Code: Select all
void frob2(list<int> const & r1, list<int> & r2) {
    print_list(r1);
    r2.push_back(1);
    print_list(r2);
}


The compiler is not allowed to assume that r1 and r2 do not alias, and by my understanding this program must work "as expected" even if you call it as in frob2(some_list, some_list)[i] -- that is, print out the contents of [i]some_list, add 1 to it, then print out the new contents of some_list.

(Now, you would be correct if that reference was marked restrict. Oh wait, C++ doesn't have restrict -- not even C++11.)

Also, for clarity: whether a function is inlined or not does not affect a C++ compiler's assumptions about object change. If a function takes its argument by value or by const reference, the compiler will always assume that the original object remains unaltered.

Inlining would make visible whether there are changes. If you're right and the compiler actually is allowed to assume the referent of a const pointer or reference can't change, this doesn't matter; if I'm right and it isn't, this does matter.
EvanED
 
Posts: 4141
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI

Re: Coding: Fleeting Thoughts

Postby jareds » Thu Jan 26, 2012 4:52 pm UTC

So...I have a pet peeve with treating someone like they don't know what they're talking about while giving copious misinformation yourself.
Jplus wrote:You're dead wrong. There is an alternative type of iterator, called the const iterator, which is memory safe (it can't corrupt your heap because you can't write to it). Const-declared and const-reference-passed containers will only return const iterators. Even if your iterator is not const, just incrementing it will never corrupt your heap.
And lo,
ISO/IEC 14882:2003 subclause 17.4.3.8 wrote:Violation of the preconditions specified in a function’s Required behavior paragraph results in undefined behavior [emphasis mine] unless the function’s Throws paragraph specifies throwing an exception when the precondition is violated.
ISO/IEC 14882:2003 subclause 24.1 paragraph 5 wrote:Just as a regular pointer to an array guarantees that there is a pointer value pointing past the last element of the array, so for any iterator type there is an iterator value that points past the last element of a corresponding container. These values are called past-the-end values. Values of an iterator i for which the expression *i is defined are called dereferenceable. The library never assumes that past-the-end values are dereferenceable.
ISO/IEC 14882:2003 Table 72—Input iterator requirements wrote:operation ; type ; semantics, pre/post-conditions
...
++r ; X& ; pre: r is dereferenceable, pre: ...
ISO/IEC 14882:2003 subclause 24.1 paragraph 4 wrote:Besides its category, a forward, bidirectional, or random access iterator can also be mutable or constant depending on whether the result of the expression *i behaves as a reference or as a reference to a constant. Constant iterators do not satisfy the requirements for output iterators, and the result of the expression *i (for constant iterator i) cannot be used in an expression where an lvalue is required.
Ergo, incrementing an input iterator past the past-the-end value is undefined as it violates the precondition of ++, and constant iterators add nothing that makes this legal. (Of course, it is illegal for output iterators as well.)

Jplus wrote:
EvanED wrote:vector<int> v1, v2; ... find(v1.begin(), v2.end(), 4) is an obvious error, but there is no standard way for find to figure that out, nor is there an equivalent operation which will find such an error.
[...]
You can come back with "but don't do that", but I can rightfully say "we already have a term for 'is memory safe if you don't make mistakes': it's called not memory safe".

Containers guarantee that they always return valid begin and end iterators.
Correct.
Jplus wrote:Algorithms guarantee to do the right thing if you give them valid iterators.
And lo,
ISO/IEC 14882:2003 subclause 24.1 paragraph 7 wrote:Most of the library’s algorithmic templates that operate on data structures have interfaces that use ranges. A range is a pair of iterators that designate the beginning and end of the computation. A range [i, i) is an empty range; in general, a range [i, j) refers to the elements in the data structure starting with the one pointed to by i and up to but not including the one pointed to by j. Range [i, j) is valid if and only if j is reachable from i. The result of the application of functions in the library to invalid ranges is undefined. [emphasis mine]
So algorithms are not guaranteed to do the correct thing if you pass them a range using iterators from two different containers.

Jplus wrote:So it's safe as long as you don't interfere with the natural flow of things on purpose. "Memory safe if the programmer doesn't fuck with it on purpose" is memory safe. I challenge you to give an example of incorrect iterator use where the programmer doesn't do it on purpose.
Of course, you cannot seriously claim that calling find(v1.begin(), v2.end(), 4) can only be done on purpose.

Jplus wrote:
EvanED wrote:[...]
But second, you don't even need concurrency.
Code: Select all
void frob(list<int> const & l) {
    print_list(l);
    do_something();
    print_list(l);
}

The program is single-threaded. Now can you guarantee that you'll get the same result?

You still can't, because do_something() might change l. (This is true even if it's not passed l as a parameter. Maybe a pointer or reference to l's referent is stored in a global, or maybe in a member variable in a class where frob and do_something are members.)

While it's technically true that you can (try to) do such things in C++, this is again a case where the programmer is sabotaging their own program on purpose. This is even more contrived than the example where you pass an algorithm the wrong iterators on purpose. In fact it's simply undefined behaviour, and to continue with your next paragraph:
[snip]
No, a conformant C++ compiler will assume that l does NOT change. Because the programmer has explicitly stated that frob will not affect its argument. Hence, the compiler will also apply const-related optimizations, for examply by altering the order of the statements. Obviously if the programmer has crafted some sneaky workaround in order to be able to affect the referent of l in the body of do_something, his program might do all kinds of unexpected things and might behave differently on different compilers.
Because the kind of trick you described constitutes undefined behaviour, programmers are also justified to assume it doesn't happen.

No, this is simply ignorant of what const and volatile qualifiers mean in C++. As you might expect...
ISO/IEC 14882:2003 subclause 7.1.5.1 paragraph 3 wrote:A pointer or reference to a cv-qualified type need not actually point or refer to a cv-qualified object, but it is treated as if it does; a const-qualified access path cannot be used to modify an object even if the object referenced is a non-const object and can be modified through some other access path.

In other words, it explicitly points out that a pointer or reference to a const type might point or refer to a non-const object that can therefore be modified by some means other than via the const pointer or reference.
ISO/IEC 14882:2003 subclause 7.1.5.1 paragraph 5 wrote:Example:
Code: Select all
[...]
int i = 2; // not cv-qualified
const int* cip; // pointer to const int
cip = &i; // OK: cv-qualified access path to unqualified
*cip = 4; // ill-formed: attempt to modify through ptr to const

int* ip;
ip = const_cast<int*>(cip); // cast needed to convert const int* to int*
*ip = 4; // defined: *ip points to i, a non-const object   [ <===== emphasis mine ]
So, the following is totally legal, since var is not const.
Code: Select all
list<int> var;
void do_something()
{
    var.push_back(0);
}
void foo()
{
    frob(var);
}

Pointers and references to const-qualified types rarely, if ever, help the implementation generate better code. They just help the programmer maintain const-correctness.
jareds
 
Posts: 414
Joined: Wed Jan 03, 2007 3:56 pm UTC

Re: Coding: Fleeting Thoughts

Postby EvanED » Thu Jan 26, 2012 5:30 pm UTC

Yakk wrote:Well, constexpr is coming to C++, which is a step closer to true immutability in a sense. All you need is a const-type qualifier that cannot be cast away, and you cannot turn non-const versions into a const version. Call it "const_I_am_not_kidding_this_time".

constexpr only helps in a tiny, tiny way. You really need that physically_const keyword. (My suggestion is easier to spell. :-))

But that's not all you need... you also need a standard library that makes it reasonable to work with physically-const objects. And the STL isn't even in the right ballpark. It's over in the football stadium or something.
EvanED
 
Posts: 4141
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI

Re: Coding: Fleeting Thoughts

Postby Yakk » Thu Jan 26, 2012 6:31 pm UTC

Meh, most algorithms that take template const_iterators will work fine with this_is_actually_const_no_I_am_not_kidding_iterators.

You might have to write up immutable list-type objects that can support soft-splicing for efficiency purposes (ie, immutable tree manipulation for fast insert that uses reference counted subtrees). Polishing that would be hard, writing it wouldn't.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.
User avatar
Yakk
Poster with most posts but no title.
 
Posts: 10450
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Coding: Fleeting Thoughts

Postby EvanED » Thu Jan 26, 2012 7:18 pm UTC

Yakk wrote:Meh, most algorithms that take template const_iterators will work fine with this_is_actually_const_no_I_am_not_kidding_iterators.

The algorithms will work, but none of the data structures in the STL are amenable to being used in a const fashion.

I mean, what's probably the staple data structure of languages designed around immutability? Singly-linked lists. And the STL doesn't even have that, let alone fancier stuff like functional maps or Clojure-style persistent "hash tables".

You can't even say "well that's only an implementation problem", because the interfaces of the STL don't even admit the possibility of functional structures. E.g. map::insert returns an iterator to the new position and an indication of whether it was inserted; a functional tree would need to return a new map. Even the interfaces of the container-adapters, like stack, can't be used in an immutable fashion.

That's what I mean by "the STL isn't even in the right ballpark." Though I guess that only applies to the container portion. So maybe I should have said half of the STL isn't even in the right ballpark. :-)
EvanED
 
Posts: 4141
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI

Re: Coding: Fleeting Thoughts

Postby Yakk » Thu Jan 26, 2012 8:12 pm UTC

Code: Select all
template<typename T>
struct Stack {
  typedef std::shared_ptr<really_const Stack> spStack;
  Stack( T data_, spStack next_ = spStack() ):data(data_), next(next_) {}
  T data;
  std::shared_ptr<really_const Stack> next;

  really_const Stack pop() really_const { return *next; }
  T examine() really_const { return data; }
  really_const Stack push(T data) really_const { return Stack( data, *this ); }
  struct really_const_iterator { ... };
  really_const_iterator begin() really_const { ... };
  really_const_iterator end() really_const { ... };
  really_const Stack reverse() really_const { ... };
};
I'm not seeing anything particularly tricky. Sure, lots of polish would need be done, but that kind of simple data structure isn't hard to get a prototype out.

The STL is nice to have and all (as it is generates a bunch of near-metal performance data structures that can be used at a relatively high level of abstraction), but I really don't get the angst that really_const data structures would be missing.

One thing that I did notice is that really_const classes aren't really const. The destruction of a really_const class ends up decrementing a reference count of things it owns, which can cause instances of things pointed to to be destroyed. In a sense, Immutable objects do not change between the end of construction and the beginning of destruction, which is important to note.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.
User avatar
Yakk
Poster with most posts but no title.
 
Posts: 10450
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Coding: Fleeting Thoughts

Postby oops » Fri Jan 27, 2012 7:05 am UTC

https://www.destroyallsoftware.com/talks/wat

A video documenting the hilarious behaviors of javascript.
oops
 
Posts: 9
Joined: Tue Jan 24, 2012 12:14 am UTC

Re: Coding: Fleeting Thoughts

Postby Xanthir » Fri Jan 27, 2012 7:06 pm UTC

oops wrote:https://www.destroyallsoftware.com/talks/wat

A video documenting the hilarious behaviors of javascript.

I *just* learned why the behavior of "{}+{}" and "{}+[]" happen. It actually does make sense! (For some definition of "sense".)

So, first, JS is somewhat weakly-typed around a few operations - for "+" it first tries to coerce the objects into a number via valueOf() so it can do addition, then if that fails, coerces them into a string via toString() so it can do concatenation.

This is why "[]+[]" (returns "") and "[]+{}" (returns "[object Object]") work the way they do - arrays toString as a comma-separated toString of their elements, so [].toString() yields "", while objects toString as "[object MyClass]", which is "[object Object]" for {}.

However, "{}+[]" returns 0 and "{}+{}" returns NaN, not "[object Object]" and "[object Object][object Object]", as you might expect. You may be able to figure out the reason if you first store the values in variables and evaluate the expressions again (you'll get the "expected" results).

It's because of the strange way that JS REPLs work, and the particulars of the JS language's grammar. If you just input something like "{}+[]", rather than interpreting it as "<object> <binary-plus> <array>", it interprets it as "<block> <unary-plus> <array>". An empty block is just a statement and doesn't do anything. Unary-plus adds a wrinkle to the coercion rules - if first uses valueOf to try and get a number, and if that fails it calls toString and then *parses it as a number*. The empty string (which is what [] stringifies as) parses as 0, while "[object Object]" parses as NaN.

If you force the REPL to interpret the first two characters as an object, rather than a block, it works as expected. Storing them into vars does that, as does wrapping the first arg in parentheses - "({})+{}" yields "[object Object][object Object]" like it's supposed to.
(defun fibs (n &optional (a 1) (b 1)) (take n (unfold '+ a b)))
User avatar
Xanthir
My HERO!!!
 
Posts: 4325
Joined: Tue Feb 20, 2007 12:49 am UTC
Location: The Googleplex

Re: Coding: Fleeting Thoughts

Postby troyp » Fri Jan 27, 2012 11:42 pm UTC

I suspected weak typing and implicit coercion were the reason for the noncommutativity of +. I also suspect they might be the reason for the nontransitivity of ==. I've never investigated them, though (I haven't used JS that much).

I'm a bit shocked that JS interprets {} as an empty block in expressions, though. That really seems pretty stupid. I mean {} is the idiomatic way to write an empty object, isn't it? And empty blocks surely aren't very common. They should just require an empty block to be {;}.
troyp
 
Posts: 529
Joined: Thu May 22, 2008 9:20 pm UTC
Location: Lismore, NSW

Re: Coding: Fleeting Thoughts

Postby Laguana » Sat Jan 28, 2012 12:39 am UTC

I don't suppose anyone here knows how to make a native executable from a java project? I'm updating a java program, and previously it had been made into a native 32bit executable for a server without the JRE, but the previous author can't remember how they did it. I've tried gcj, but it gives compile errors, despite javac being happy with it.
Laguana
 
Posts: 49
Joined: Sat Jan 19, 2008 10:13 pm UTC

Re: Coding: Fleeting Thoughts

Postby Jplus » Sat Jan 28, 2012 2:47 am UTC

EvanED, Yakk and jareds wrote:Let's all hammer on Julian!
No, just kidding.
I know you guys are serious, and I realize I've been wrong about quite a few things.

I feel I shouldn't hijack this thread any longer with enormous quote trees, so I'll just concede the main points without properly attributing the posters who convinced me or the sources they referred to:
  • Incrementing an iterator beyond past-the-end is always undefined behaviour, regardless of its state of mutability, and therefore it's never memory-safe. (OK, some incomplete attribution. @jareds: I had to read those ISO C++03 standard quotes a few times before I could understand how you derived your conclusion. Nice bit of induction you did there.)
  • Despite the guarantees offered by STL containers and by STL algorithms, using iterators is never strictly guaranteed to be safe because the library user can make mistakes.
  • Modifying an object inside a function to which it was passed by const reference is not undefined behaviour (I did a search of my own to confirm this; see here).
  • That an argument to a function is passed by const reference of itself doesn't help the compiler to optimize the function.

Then, a few reactions to some less-central points that passed along the way.

@EvanED: I claimed Haskell and OCaml don't offer the possibility to define a non-generic function, and you rightly pointed out that they can. Still, my underlying claim remains valid that because of static typing, C++ and ML essentially behave the same with regard to generics and type inference: either the function is generic, or the type of its arguments can't be inferred from the calling context.

On treating someone like they don't know what they're talking about and hypocrisy:
@jareds: you're absolutely right, and in fact I have that pet peeve too.
@EvanED: I apologize for my way of responding to you. FWIW, I was really thinking that you did know what you were talking about, even though I thought you were wrong about some things.

@jareds: of course STL algorithms don't guarantee to do the right thing if you pass them non-matching iterators. I admit I was sloppy in my formulation, but wasn't it obvious that I meant that STL algorithms guarantee to do the right thing if you pass them iterators that are valid relative to each other?

On the possibility of practical immutable "functional" containers in C++:
While I agree with Yakk that it can be implemented, I think EvanED is right that it couldn't be done without violating the STL container interface conventions.

Finally, let me remark that I've learned something. Thanks to all three of you for the instructive comments.


Now that I've mostly cleaned my hands (I hope), I would like to return to the issue of the unexpected side effect because I think there is more to it. To summarize, we have (in C++)
  • two objects of the same type T: arg and imp;
  • a function frob, which takes arg as a const reference and modifies imp as a side effect.
If arg and imp happen to be the same object, frob will behave in a way that programmers normally don't expect (this scenario was originally brought forward by EvanED to illustrate why constness in C++ isn't the same as immutability in Rust, but I won't dispute that anymore).

To make such a situation as plausible as possible, we assume that imp is not an argument to frob, nor derived from an argument to frob (e.g. through a const_cast). We also assume that the programmer who makes the problematic call to frob (the client programmer) isn't the one who implemented frob (the library programmer).

Now I can think of a few ways in which the problem could emerge, but I think none of these is very realistic. I'm open to your suggestions and opinions.
Note: I'm not denying that the problem is technically possible, I just don't think it will occur under realistic assumptions.

1. imp is a mutable global variable which is included through the header of the library.
Apart from <iostream>, where the use of mutable global variables is more or less necessary, nobody would trust a library that works with such global variables . Even in exceptional libraries such as <iostream> where there is an understandable reason to do it, no sane library writer would provide functions that both take an argument of type T and modify the global variable as a side effect. Not just because it's error-prone but also because there is no sensible use for it (try to imagine such a thing to see what I mean).

2. frob is a non-const member function of an object of which imp is a data member, and that same object allows clients to retrieve a (const) reference to imp.
This would be something like a std::complex<T> where real() returns a reference instead of a copy and where frob is a calculation over a T on the one hand and a std::complex<T> on the other hand, modifying the latter in place. Note that <complex> is not like that because it's bad interface design and that nobody would want to use it otherwise. Still, in this case the client programmer would be consciously passing a data member of the parent object to a non-const member function of that same object, so to expect that the argument passed to frob remains unmodified would be a direct contradiction.

3. frob is a const member function of an object of which imp is a data member, and that same object allows clients to retrieve a (const) reference to imp.
This would basically mean that the library programmer is actively misleading the client programmer, i.e. sabotaging the program on purpose.

4. frob is a member function of imp itself.
This is the only realistic scenario so far: the assignment family of operators are expected to work this way (so frob could be one of those operators in this case). But if you assign an object to itself, you are justified to expect that it remains unchanged, and if you modify-assign it to itself inplace, you will always expect it to change. Also note that you can't take an argument by const reference and then (modify-)assign anything to it, including itself. So here the problem does not actually apply.

Why am I making this point? Because I think it illustrates how passing by const reference for all practical purposes still guarantees that the source object will not be changed by the receiving function. For multi-threaded applications const references are less helpful than "immutable references" (because in the former case the receiving function cannot rely on the value not to change), but for single-threaded applications I think they're practically equivalent.
Feel free to call me Julian. J+ is just an abbreviation.
Image coding and xkcd combined
User avatar
Jplus
 
Posts: 1558
Joined: Wed Apr 21, 2010 12:29 pm UTC
Location: classified

Re: Coding: Fleeting Thoughts

Postby TheChewanater » Sat Jan 28, 2012 3:29 am UTC

Laguana wrote:I don't suppose anyone here knows how to make a native executable from a java project? I'm updating a java program, and previously it had been made into a native 32bit executable for a server without the JRE, but the previous author can't remember how they did it. I've tried gcj, but it gives compile errors, despite javac being happy with it.

It seems most reasonable to just install JRE on the server, but a quick Google search brings up this.
ImageImage
http://internetometer.com/give/4279
No one can agree how to count how many types of people there are. You could ask two people and get 10 different answers.
User avatar
TheChewanater
 
Posts: 1286
Joined: Sat Aug 08, 2009 5:24 am UTC
Location: lol why am I still wearing a Santa suit?

Re: Coding: Fleeting Thoughts

Postby Laguana » Sat Jan 28, 2012 5:29 am UTC

Anything which installs the JRE won't work, since the server deliberately doesn't have java on it for security reasons. I think that means that jsmooth won't work for me either (also jsmooth seems to be windows based, and the server runs linux).

Maybe I can poke at the source until gcj likes it...
Laguana
 
Posts: 49
Joined: Sat Jan 19, 2008 10:13 pm UTC

Re: Coding: Fleeting Thoughts

Postby Xanthir » Sat Jan 28, 2012 5:36 am UTC

troyp wrote:I'm a bit shocked that JS interprets {} as an empty block in expressions, though. That really seems pretty stupid. I mean {} is the idiomatic way to write an empty object, isn't it? And empty blocks surely aren't very common. They should just require an empty block to be {;}.

Note: that is mostly an artifact of how *the REPL* evaluates things. It's less likely for that to come up in real code; for example, typing "var x = {} + [];" gives you the expected result, not the crazy one, because in that context it's unambiguous that {} is an object, not a block. To get the equivalent in real code you'd have to write something like "function foo() { {}+[]; }" which is obviously useless.
(defun fibs (n &optional (a 1) (b 1)) (take n (unfold '+ a b)))
User avatar
Xanthir
My HERO!!!
 
Posts: 4325
Joined: Tue Feb 20, 2007 12:49 am UTC
Location: The Googleplex

Re: Coding: Fleeting Thoughts

Postby troyp » Sat Jan 28, 2012 6:15 am UTC

Xanthir wrote:
troyp wrote:I'm a bit shocked that JS interprets {} as an empty block in expressions, though. That really seems pretty stupid. I mean {} is the idiomatic way to write an empty object, isn't it? And empty blocks surely aren't very common. They should just require an empty block to be {;}.

Note: that is mostly an artifact of how *the REPL* evaluates things. It's less likely for that to come up in real code; for example, typing "var x = {} + [];" gives you the expected result, not the crazy one, because in that context it's unambiguous that {} is an object, not a block. To get the equivalent in real code you'd have to write something like "function foo() { {}+[]; }" which is obviously useless.


So if it's in any context where it has to be an expression, it'll be interpreted correctly as an object? Yeah, I guess that's not so bad. I mean, I'd never say that if it was another language (except maybe Perl), but given JS's whole fast-and-loose ethos I suppose it doesn't matter so much.

edit:
Laguaga wrote:Anything which installs the JRE won't work, since the server deliberately doesn't have java on it for security reasons. I think that means that jsmooth won't work for me either (also jsmooth seems to be windows based, and the server runs linux).

If you're not allowed to stall the JRE, you probably wouldn't be allowed to install a wrapper program like that anyhow, since they'd contain the JRE inside them.
Laguana wrote:Maybe I can poke at the source until gcj likes it...

This might help.
[edit-edit] oh, you said gcj, not gcc (Well, this gives you another option)

[edit again] Another thing you could try is to compile it to byte code with javac and then see if gcj will compile that byte code.
troyp
 
Posts: 529
Joined: Thu May 22, 2008 9:20 pm UTC
Location: Lismore, NSW

Re: Coding: Fleeting Thoughts

Postby jaap » Sat Jan 28, 2012 12:00 pm UTC

Jplus wrote:I would like to return to the issue of the unexpected side effect because I think there is more to it. To summarize, we have (in C++)
  • two objects of the same type T: arg and imp;
  • a function frob, which takes arg as a const reference and modifies imp as a side effect.
If arg and imp happen to be the same object, frob will behave in a way that programmers normally don't expect (this scenario was originally brought forward by EvanED to illustrate why constness in C++ isn't the same as immutability in Rust, but I won't dispute that anymore).

I've run into this problem when I wrote a matrix multiplication function. I had it as a static utility function outside of my Matrix class, with a signature like matrixMultiply(Matrix& result, const Matrix& m1, const Matrix& m2). For various reasons I did not want to use any temporary memory storage and wrote the calculated entries into the result matrix as they were being calculated.
Things went wrong when I wanted to do A=A*B and essentially called the function with matrixMultiply(A,A,B). Matrix A was being modified during the calculation whilst also still being used as input to the same calculation. This is an easy mistake to make if you unthinkingly trust in const.
User avatar
jaap
 
Posts: 1829
Joined: Fri Jul 06, 2007 7:06 am UTC

Re: Coding: Fleeting Thoughts

Postby Jplus » Sat Jan 28, 2012 1:41 pm UTC

I find your case interesting, because you defy my assumption that imp is not passed as an argument to frob, as well as my assumption that the library programmer is not the same person as the client programmer.
Did you know you were passing A both as the first and as the second argument?
Did you assume that the first argument would not alias with the other arguments when you implemented matrixMultiply?

FWIW, this is how Boost.uBLAS deals with such issues.
Feel free to call me Julian. J+ is just an abbreviation.
Image coding and xkcd combined
User avatar
Jplus
 
Posts: 1558
Joined: Wed Apr 21, 2010 12:29 pm UTC
Location: classified

Re: Coding: Fleeting Thoughts

Postby jaap » Sat Jan 28, 2012 2:34 pm UTC

Jplus wrote:I find your case interesting, because you defy my assumption that imp is not passed as an argument to frob, as well as my assumption that the library programmer is not the same person as the client programmer.
Did you know you were passing A both as the first and as the second argument?
Did you assume that the first argument would not alias with the other arguments when you implemented matrixMultiply?

Yes, and yes, but it wasn't until I tracked the cause of the problem that I realised I had made that assumption in the implementation, and what the consequences were. I'm less likely to make the same mistake again.
User avatar
jaap
 
Posts: 1829
Joined: Fri Jul 06, 2007 7:06 am UTC

Re: Coding: Fleeting Thoughts

Postby Yakk » Sat Jan 28, 2012 3:18 pm UTC

Note that self-assignment breaks quite often in C++ classes. The copy-swap idiom is an attempt to get around the issues surrounding it, but even it is imperfect.

Amusingly, code that explicitly checks for self-assignment, then stops the assignment, also tends to also have bugs. Because many of the problems in self assignment are actually subtle problems with assignment, that are first noticed in a self-assignment check, and then that possibility is explicitly tested and eliminated, and the programmer things they are done.

---

In a program with a non-trivial amount of state (be it encapsulated in object pointers, a document state, or what have you), for whom reaching that state is reasonably possible within a given function, and whenever you pass a const reference to a part of that state into a function, that function modifying that global state by accident isn't a rare occurrence. It is common enough that, when working on reasonably large software projects, when calling non-trivial functions, I make sure I have a local copy of things rather than a const reference to things, to make sure that it doesn't get blown away.

The issue isn't with "one library programmer" and "one client programmer" -- it is with 100s of programmers on the client and interface side, spread over a decade+ of development. The functions being called often aren't of narrow purpose, and understanding what they do isn't easy, because they contain dozens of special cases that have built up over the years to handle real bugs and real problems and real edge cases. In addition, said code grows cruft -- bug fixes and edge cases that are no longer valid. They are functions written by programmers who have never met.

A const reference to something in your "global" or somewhat global application state cannot be presumed to not change -- and you don't know, in a given function, where your const reference came from. And it might not even be you doing the work -- you might run into a problem, block on user input, pop up a message dialog, and process messages that results in a timer firing which processes an incremental rationalization of the applications data structures, or picks up on a message coming back from a worker thread whose job was to load a texture, and when the texture was loaded it sends a message to the main thread, where the texture is inserted into the list of textures that the document stores and added to a particular object.

Immutability isn't a panacea for that kind of problem, but it at least lets you reason locally about what you are doing. You still need to worry about "more global" problems -- when you get an immutable copy of the texture list from the document, and you go off and process it, you cannot safely reassign it back to the texture list of the document, because the document's texture list might have been changed since you got it by some third party. (note that here, the texture list is immutable while the document is not, and the problem is with what texture list the document has).
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.
User avatar
Yakk
Poster with most posts but no title.
 
Posts: 10450
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Coding: Fleeting Thoughts

Postby Jplus » Sun Jan 29, 2012 2:49 am UTC

The copy-swap idiom is an attempt to make assignment more robust against interruption, and it doesn't have anything to do with self-assignment per se. See the second half of chapter 3 of this document for a discussion.

For dealing with self-assignment there is another, very simple and widely known idiom, which should cover all cases. You simply wrap the entire body of your assignment function (except for the return statement) into an if statement, which is only executed if the right operand does not alias with this. The Herb Sutters, Dave Abrahamses, Alexander Stepanovs and Sean Parents all know this and always include this idiom in their code examples (for example in the document I mentioned above). Also, every introductory C++ textbook warns novice programmers to perform said check.
So although you can forget to apply this idiom (yes, by accident), it is both quickly discovered and easily fixed. Hence, any library that enjoyed some adoption can be trusted in this regard.

Of course, all code can contain bugs, so when an assignment function checks for aliasing it can still contain other, perhaps more subtle bugs. Such remaining bugs however have nothing to do with (the alleged complexity of) self-assignment.


As for a function accidentally modifying global state: first note that this is a different class of problems than the one I was talking about. I was specifically interested in those cases where a function is modifying the referent of an argument that was passed by const-reference, against the expectations of some client programmer. As illustrated by my four scenarios, this neither requires that the affected object is "global" (it might be a member of an object on which the function is called), nor that the modification is by accident (it might be intended).

I agree that it's possible for a function to accidentally modify global state, in general (whether that function is passed a const reference or not). A very minimal example is writing to an invalid pointer.
It appears that you intended to address the intersection of these two classes of problems, probably in response to my first scenario, but frankly, your reasoning seems all rather theoretical and even a bit vague. You obviously have experience with this; how about you just give a concrete-ish example?

Besides, note that I didn't talk about "the library programmer" and "the client programmer" because I thought that two is a special number. Rather, two is the minimal amount of programmers that you need in order to create a situation where a programmer may call a function of which they don't know how it was implemented (as you pointed out, the program would be too trivial if they did know). In other words, the number two is only playing a role as the minimal abstraction of "multiple". I'll be very sceptical about any suggestion that a hundred programmers are fundamentally different from two programmers in this respect.
Feel free to call me Julian. J+ is just an abbreviation.
Image coding and xkcd combined
User avatar
Jplus
 
Posts: 1558
Joined: Wed Apr 21, 2010 12:29 pm UTC
Location: classified

Re: Coding: Fleeting Thoughts

Postby Xanthir » Sun Jan 29, 2012 6:47 am UTC

troyp wrote:
Xanthir wrote:
troyp wrote:I'm a bit shocked that JS interprets {} as an empty block in expressions, though. That really seems pretty stupid. I mean {} is the idiomatic way to write an empty object, isn't it? And empty blocks surely aren't very common. They should just require an empty block to be {;}.

Note: that is mostly an artifact of how *the REPL* evaluates things. It's less likely for that to come up in real code; for example, typing "var x = {} + [];" gives you the expected result, not the crazy one, because in that context it's unambiguous that {} is an object, not a block. To get the equivalent in real code you'd have to write something like "function foo() { {}+[]; }" which is obviously useless.


So if it's in any context where it has to be an expression, it'll be interpreted correctly as an object? Yeah, I guess that's not so bad. I mean, I'd never say that if it was another language (except maybe Perl), but given JS's whole fast-and-loose ethos I suppose it doesn't matter so much.

It's not so odd. In *any* language with braces denoting blocks, like C, having "{}+foo" occur at the top level will interpret that {} as a block.
(defun fibs (n &optional (a 1) (b 1)) (take n (unfold '+ a b)))
User avatar
Xanthir
My HERO!!!
 
Posts: 4325
Joined: Tue Feb 20, 2007 12:49 am UTC
Location: The Googleplex

Re: Coding: Fleeting Thoughts

Postby troyp » Sun Jan 29, 2012 12:10 pm UTC

Xanthir wrote:It's not so odd. In *any* language with braces denoting blocks, like C, having "{}+foo" occur at the top level will interpret that {} as a block

I don't have a problem with C doing it. That's what it should do. My issue with JS was, there are two possible interpretations. You'd think if a language uses a symbol for more than one purpose and ambiguities are possible, the language would restrict one of the usages to resolve them. In this case, they could have disallowed empty blocks or required a different syntax for an empty object literal. Instead, they just allowed the ambiguities to remain.

I don't know any other language that uses braces for blocks/scopes as well as literals, so I'm not sure how other languages handle this [ actually, I just tried irb: Ruby interprets {} as a hash on its own, and even does so in {}+foo, despite the fact that hashes don't have a + method (so it throws an error) ].

edit: one scenario occurs to me where {} might be misinterpreted outside of the REPL, namely runtime code generation. If you were building an expression to be passed to eval, you'd have to specifically account for these cases.
troyp
 
Posts: 529
Joined: Thu May 22, 2008 9:20 pm UTC
Location: Lismore, NSW

Re: Coding: Fleeting Thoughts

Postby Dason » Mon Jan 30, 2012 12:49 am UTC

If I'm going to pick up Python should I go with Python2 or Python3?
double epsilon = -.0000001;
User avatar
Dason
 
Posts: 1293
Joined: Wed Dec 02, 2009 7:06 am UTC
Location: ~/

Re: Coding: Fleeting Thoughts

Postby b.i.o » Mon Jan 30, 2012 3:38 am UTC

What are you using Python for? They're similar enough that switching between the two isn't very hard at all.
User avatar
b.i.o
Green is the loneliest number
 
Posts: 2520
Joined: Fri Jul 27, 2007 4:38 pm UTC
Location: Hong Kong

Re: Coding: Fleeting Thoughts

Postby Dason » Mon Jan 30, 2012 4:03 am UTC

Mainly just messing around and python is a nice enough scripting language to have in my bag of known languages. I think I'll go with Python2 though because one of the interfaces that looks interesting to me looks to be for python2 only for the time being.
double epsilon = -.0000001;
User avatar
Dason
 
Posts: 1293
Joined: Wed Dec 02, 2009 7:06 am UTC
Location: ~/

Re: Coding: Fleeting Thoughts

Postby joshz » Mon Jan 30, 2012 4:03 am UTC

Python 2 is much more commonly used. From what I've been told, 3 has some strange behavior, though I can't remember it off the top of my head.
You, sir, name? wrote:If you have over 26 levels of nesting, you've got bigger problems ... than variable naming.
suffer-cait wrote:it might also be interesting to note here that i don't like 5 fingers. they feel too bulky.
User avatar
joshz
 
Posts: 1466
Joined: Tue Nov 11, 2008 2:51 am UTC
Location: Pittsburgh, PA

Re: Coding: Fleeting Thoughts

Postby Aaeriele » Mon Jan 30, 2012 4:46 am UTC

Python 2 and Python 3 both see use, but many of the major/popular 3rd-party libraries are written for 2, not 3, and thus a lot of people still use 2 because of that.
Vaniver wrote:Harvard is a hedge fund that runs the most prestigious dating agency in the world, and incidentally employs famous scientists to do research.

afuzzyduck wrote:ITS MEANT TO BE FLUTTERSHY BUT I JUST SEE AAERIELE! CURSE YOU FORA!
User avatar
Aaeriele
 
Posts: 2107
Joined: Tue Feb 23, 2010 3:30 am UTC
Location: San Francisco, CA

PreviousNext

Return to Coding

Who is online

Users browsing this forum: No registered users and 4 guests