Smart Pointers


Postby sourmìlk » Wed Apr 04, 2012 11:10 am UTC

So, smart pointers: When should I use them? Should I ever not use them?

I'm not asking about when to use specific kinds of smart pointers. I understand that. I'm wondering how often I should use them, if at all, and whether there are cases where I should never use them.

Re: Smart Pointers

Postby gametaku » Wed Apr 04, 2012 11:37 am UTC

Every time you need a pointer.

If you want to reduce overhead, raw pointers can be safely used in classes where, by design, they are not involved in the creation or destruction of the object they point at. One example is a node's parent pointer in a tree structure.

Re: Smart Pointers

Postby Divinas » Wed Apr 04, 2012 12:21 pm UTC

Agreed. Use smart pointers as much as possible. BUT, be sure to understand the semantics of the smart pointers that you are using, what kind of problems they have, and how you can avoid getting into those pitfalls (most notably, cyclic references)

Re: Smart Pointers

Postby Jplus » Wed Apr 04, 2012 3:10 pm UTC

I'd say, always use them except when you're implementing some new kind of linked data structure that should be available to other programmers through a library. So that's still virtually always. Smart pointers are better than raw ones in all possible ways except efficiency, so typically you'll find exceptions only where efficiency is a prime concern.

Re: Smart Pointers

Postby Yakk » Wed Apr 04, 2012 3:27 pm UTC

The categories of smart pointer are important. They solve different problems. The short answer for when you shouldn't use them is "when the categories of smart pointer that exist don't work well for your problem".

I'd sketch out that there are at least 3 categories of smart pointer.

There are seriously intrusive smart pointers, such as what is used in garbage collected languages. These have unpredictable run-time behavior and costs in general.

There are run time limited smart pointers that require careful compile-time work to make them work correctly, like reference counted shared_ptr in C++ and weak_ptr.

There are compile-time smart pointers that enable RAII semantics on resource sharing, like unique_ptr in C++. These can have basically zero overhead. (So the "you cannot afford the overhead" is not a good reason to not use smart pointers)
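To make the "basically zero overhead" point concrete, here is a minimal sketch, assuming C++11 and a mainstream standard library. Widget and both function names are made up for the example. The size check reflects what common implementations do with the default deleter; the standard does not strictly promise it.

```cpp
#include <cassert>
#include <memory>

// Hypothetical resource type used only for this illustration.
struct Widget {
    int value = 42;
};

// With the default deleter, unique_ptr usually stores just the raw pointer.
// This holds on mainstream implementations, but it is an observation,
// not a guarantee from the standard.
constexpr bool unique_ptr_is_pointer_sized() {
    return sizeof(std::unique_ptr<Widget>) == sizeof(Widget*);
}

// Ownership is released automatically at scope exit; no delete is written.
int read_through_unique_ptr() {
    std::unique_ptr<Widget> w(new Widget);  // C++11; std::make_unique is C++14
    assert(w != nullptr);
    return w->value;
}  // the Widget is destroyed here
```

Calling unique_ptr_is_pointer_sized() on your own toolchain is a quick way to check whether the claim holds there.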

There are cases where none of the above apply. One is a high-performance situation where the ownership of a pointer is passed around in a multi-threaded environment, in a limited chunk of code that can be completely audited, will remain unchanged after being written, and does not grant access to said pointer. (I.e., you are writing a high-performance data structure at the lowest level, you have identified the smart pointer as a serious performance bottleneck, and the specs for said data structure are fixed and are never, ever going to change later.)

Another case is where you have pointers that are not intended for heap-like allocation situations -- where your pointer simply says where your output is, but said target can be changed. This shouldn't occur in a persistent data structure, only a transactional one (like the arguments to a function). Often references work better here, but operator= semantics on references are annoying if you want to be able to copy your transactional data structure around easily. (Note: I'd avoid using pointers as an "optional reference" -- boost::optional style seems like a better idea.)

In the context of C++, a smart pointer is just something that behaves like a pointer, with a class and RAII semantics wrapped around it. There are a ridiculous number of ways you can have a C++ smart pointer with various quirks and advantages, and (in theory) each of these should be ruled out before you go off and use raw pointers. (memory coloring, intrusive refcount, external refcount, augmented allocation refcount, weak refcount (and general category hierarchy), ownership based, single threaded circular linked list, the abhorrent auto_ptr, etc.)

However, with all of the above in mind, the most important reason to use raw pointers is that you are interacting with a code base that is built on top of raw pointers, and there is no easy way to shoehorn your smart pointers into the system. Because you cannot rewrite everything, and because consistency makes maintenance easier, mimicking the existing code logic and style is an important skill.

Re: Smart Pointers

Postby Sc4Freak » Thu Apr 05, 2012 3:13 am UTC

I disagree with some of these answers.

Use a smart pointer when you need to express ownership semantics. That is, if you have a pointer which owns the pointed-to object, then you should use a smart pointer. Non-owning pointers should use raw pointers (but prefer references when applicable). In C++11, that means using unique_ptr to express unique ownership (a single instance owns an object) and shared_ptr to express shared ownership (multiple instances own an object).

Take for example a tree structure. A node owns its children, so a node should have a smart pointer to its children. But a child node obviously doesn't own its parent - that would be cyclical. Instead, it should have a non-owning reference back to its parent. In this case, a raw pointer expresses that relationship perfectly correctly, safely, and succinctly.
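For illustration, a minimal sketch of the layout described above, assuming C++11. The names Node, add_child, and depth are invented for the example. Note that the raw parent pointer is only valid while the parent stays where it is, so this sketch assumes nodes are never moved or copied after children are attached.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Node {
    int value;
    Node* parent = nullptr;                       // non-owning back-reference
    std::vector<std::unique_ptr<Node>> children;  // owning references

    explicit Node(int v) : value(v) {}

    // Takes ownership of the child and wires up its parent pointer.
    Node& add_child(std::unique_ptr<Node> child) {
        assert(child != nullptr);
        child->parent = this;
        children.push_back(std::move(child));
        return *children.back();
    }
};

// Walks the raw parent pointers up to the root and counts the steps.
std::size_t depth(const Node& n) {
    std::size_t d = 0;
    for (const Node* p = n.parent; p != nullptr; p = p->parent) ++d;
    return d;
}
```

Destroying the root destroys every descendant through the unique_ptr members, so no child can outlive the parent it points back to as long as ownership is never transferred out of the tree.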

Re: Smart Pointers

Postby Divinas » Thu Apr 05, 2012 8:48 am UTC

I disagree. In a single-threaded application, what Sc4Freak is saying is reasonable. But in a multithreaded one, that is not necessarily so. If we have a structure where A owns B and B keeps a raw pointer to A, and you use the raw pointer while A is being destroyed, the pointer is invalid. If you used a weak, non-owning smart pointer, such as boost::weak_ptr, you're guarded against that.

Re: Smart Pointers

Postby EvanED » Thu Apr 05, 2012 9:01 am UTC

Sc4Freak wrote:I disagree with some of these answers.

Use a smart pointer when you need to express ownership semantics. That is, if you have a pointer which owns the pointed-to object, then you should use a smart pointer. Non-owning pointers should use raw pointers (but prefer references when applicable). In C++11, that means using unique_ptr to express unique ownership (a single instance owns an object) and shared_ptr to express shared ownership (multiple instances own an object).

Take for example a tree structure. A node owns its children, so a node should have a smart pointer to its children. But a child node obviously doesn't own its parent - that would be cyclical. Instead, it should have a non-owning reference back to its parent. In this case, a raw pointer expresses that relationship perfectly correctly, safely, and succinctly.

I somewhat disagree. I think this depends on what kind of bugs you like.

The problem is that logic errors in your program can still leave you with dangling pointers. Full use of reference-counted smart pointers avoids this. Do you like that behavior, where errors will often be more evident during development but more dangerous when deployed, or would you prefer that the ownership semantics become a bit unclear, and errors can be masked, but you are cutting out any problems with dangling pointers?

Re: Smart Pointers

Postby Divinas » Thu Apr 05, 2012 9:19 am UTC

Well, if you use owning pointers everywhere, you run into different bugs: A owns B, B owns A, and you've got at best a memory leak, and at worst some other resource acquired through RAII that is stuck and never released. One should know the smart pointers and use whatever is appropriate.
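The A-owns-B, B-owns-A case can be reproduced in a few lines. This is only a sketch, assuming C++11; Owner, Owned, live_objects, and objects_left_after_scope are names invented for the example. A global counter stands in for a leak detector.

```cpp
#include <cassert>
#include <memory>

// Counts live objects so the leak can be observed without external tools.
static int live_objects = 0;

struct Owner;

struct Owned {
    std::shared_ptr<Owner> strong_back;  // owning back-reference: forms a cycle
    std::weak_ptr<Owner> weak_back;      // non-owning back-reference: no cycle
    Owned() { ++live_objects; }
    ~Owned() { --live_objects; }
};

struct Owner {
    std::shared_ptr<Owned> child;
    Owner() { ++live_objects; }
    ~Owner() { --live_objects; }
};

// Builds A -> B plus a back-reference B -> A, then drops the local handle.
// Returns how many objects are still alive afterwards.
int objects_left_after_scope(bool use_weak_back_reference) {
    int before = live_objects;
    {
        std::shared_ptr<Owner> a = std::make_shared<Owner>();
        a->child = std::make_shared<Owned>();
        if (use_weak_back_reference)
            a->child->weak_back = a;
        else
            a->child->strong_back = a;
        // A strong back-reference bumps A's count; a weak one does not.
        assert(a.use_count() == (use_weak_back_reference ? 1 : 2));
    }
    return live_objects - before;
}
```

With the weak back-reference both objects are destroyed at the end of the scope; with the strong one, each keeps the other alive and neither destructor runs.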

Re: Smart Pointers

Postby Yakk » Thu Apr 05, 2012 12:25 pm UTC

Sc4Freak wrote:Use a smart pointer when you need to express ownership semantics.
There are smart pointers that don't express ownership semantics.

Either these pointers are a bad idea, or you should use smart pointers when you aren't trying to express ownership semantics sometimes.
Sc4Freak wrote:Take for example a tree structure. A node owns its children, so a node should have a smart pointer to its children. But a child node obviously doesn't own its parent - that would be cyclical. Instead, it should have a non-owning reference back to its parent. In this case, a raw pointer expresses that relationship perfectly correctly, safely, and succinctly.
The thing is, there are non-owning smart pointers. In the standard library, weak_ptr is a non-owning smart pointer.

If I was building a tree structure and didn't want to do std library QA, and my children needed a pointer back to their parents, I'd use shared_ptr for the pointer-to-child, and weak_ptr for the pointer-to-parent. If I decided that no child should ever have a pointer to an invalid or null parent unless it was a root, I'd add in a root flag, and then assert that the call to lock() on the weak_ptr always returned a non-null pointer so long as the root flag is unset. (I'm only using shared_ptr here in order to permit weak_ptr.)

The goal of this would be robustness -- I'm trying to avoid certain classes of error which I find to be common, and using smart pointers to sanity check my code.

Now, the persistent pointer-to-parent would, if nothing goes wrong, be no more dangerous than the weak_ptr, given that I'm asserting things left, right, and center that the weak_ptr lock() never fails. But as a side effect of this change I've decoupled the "I am a root" flag and made it explicit, instead of overloading a pointer value so that nullptr serves double duty -- I consider that to be a plus, maintenance-wise (while nullptr as a flag is a pretty common idiom, the boolean saying explicitly what it means carries with it a bit more documentation).
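A rough sketch of that design, assuming C++11. TreeNode, checked_parent, and attach are names invented for the example, and std::weak_ptr exposes the validity check through lock(), which returns an empty shared_ptr once the parent is gone.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct TreeNode {
    bool is_root = true;                              // explicit flag, not "parent is null"
    std::weak_ptr<TreeNode> parent;                   // non-owning and checkable
    std::vector<std::shared_ptr<TreeNode>> children;  // shared_ptr only to permit weak_ptr

    // Returns the parent, asserting the invariant that a non-root node
    // always has a live parent. Must not be called on a root.
    std::shared_ptr<TreeNode> checked_parent() const {
        assert(!is_root);
        std::shared_ptr<TreeNode> p = parent.lock();
        assert(p && "non-root node has an expired parent");
        return p;
    }
};

// Attaches child under parent and clears the child's root flag.
void attach(const std::shared_ptr<TreeNode>& parent,
            const std::shared_ptr<TreeNode>& child) {
    child->parent = parent;
    child->is_root = false;
    parent->children.push_back(child);
}
```

In a build with asserts enabled, a stale parent shows up at the line that calls checked_parent() rather than as a read of freed memory somewhere later.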

Re: Smart Pointers

Postby Sc4Freak » Thu Apr 05, 2012 5:03 pm UTC

Divinas wrote:I disagree. In a single-threaded application, what Sc4Freak is saying is reasonable. But in a multithreaded one, that is not necessarily so. If we have a structure where A owns B and B keeps a raw pointer to A, and you use the raw pointer while A is being destroyed, the pointer is invalid. If you used a weak, non-owning smart pointer, such as boost::weak_ptr, you're guarded against that.

But multi-threading doesn't change the picture at all. If A owns B, B cannot outlive A because it's owned by it. If you violate that invariant, not even a smart pointer can save you. Smart pointers alone are not enough to guarantee consistency in multi-threaded applications - which is no different to raw pointers. That is, since A owns B and A is getting destroyed, it means that B is getting destroyed as well. Using a smart pointer to try and get to A from B doesn't help you here.

I should probably also note that the thread safety of shared_ptr isn't any stronger than that of a raw pointer. Interlocked increments/decrements are used in the implementation because that's what is necessary to get thread safety to the same level as raw pointers. The guarantee that shared_ptr gives is that manipulating separate instances from different threads is guaranteed to be safe - which is the same as raw pointers. So if your application isn't thread-safe when using raw pointers for non-owning references, it isn't thread-safe when using smart pointers.

EvanED wrote:I somewhat disagree. I think this depends on what kind of bugs you like.

The problem is that logic errors in your program can still leave you with dangling pointers. Full use of reference-counted smart pointers avoids this. Do you like that behavior, where errors will often be more evident during development but more dangerous when deployed, or would you prefer that the ownership semantics become a bit unclear, and errors can be masked, but you are cutting out any problems with dangling pointers?

Yes, that's right from a pure practical standpoint. If you use smart pointers to define ownership and make a mistake somewhere, you can potentially have a dangling pointer. But using reference counting doesn't save you from having to figure out ownership - and if you make a mistake there, you get a cyclic reference.

I suppose at that point it comes down to a matter of opinion which you prefer, and a debate over which situation is more likely or more dangerous. But if you get your ownership right in the first place, then using raw pointers for non-owning references is just as correct, and more efficient.

Yakk wrote:The thing is, there are non-owning smart pointers. In the standard library, weak_ptr is a non-owning smart pointer.

Either these pointers are a bad idea, or you should use smart pointers when you aren't trying to express ownership semantics sometimes.

Yes, that's exactly what I'm saying. weak_ptr is a weapon of last resort - its main use is when you want a non-owning reference to an object that isn't your ancestor somewhere in the ownership graph. The only example I can think of where this would be useful is a resource cache - you'd have a weak_ptr to an expensive resource held by a shared_ptr, which you can use to query whether the resource is still alive.
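The resource cache idea might look roughly like this. It is a single-threaded sketch assuming C++11; Texture, TextureCache, and the load counter are invented for the example, and the "load" is just a constructor call.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical expensive resource.
struct Texture {
    std::string name;
    explicit Texture(const std::string& n) : name(n) {}
};

class TextureCache {
public:
    // Returns the live resource if some client still holds it; otherwise
    // loads it again. The cache itself never keeps a resource alive.
    std::shared_ptr<Texture> get(const std::string& name) {
        std::weak_ptr<Texture>& slot = cache_[name];
        std::shared_ptr<Texture> live = slot.lock();
        if (!live) {
            live = std::make_shared<Texture>(name);  // stand-in for a real load
            slot = live;
            ++loads_;
        }
        assert(live != nullptr);
        return live;
    }

    int loads() const { return loads_; }

private:
    std::map<std::string, std::weak_ptr<Texture>> cache_;
    int loads_ = 0;
};
```

Two get() calls made while a client still holds the texture share one load; once every client has let go, the next get() loads it again.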

But for expressing plain old non-owning references like a tree node? I say use a raw pointer. It's exactly as correct, and more efficient.

Yakk wrote:If I was building a tree structure and didn't want to do std library QA, and my children needed a pointer back to their parents, I'd use shared_ptr for the pointer-to-child, and weak_ptr for the pointer-to-parent. If I decided that no child should ever have a pointer to an invalid or null parent unless it was a root, I'd add in a root flag, and then assert that the call to lock() on the weak_ptr always returned a non-null pointer so long as the root flag is unset. (I'm only using shared_ptr here in order to permit weak_ptr.)

The goal of this would be robustness -- I'm trying to avoid certain classes of error which I find to be common, and using smart pointers to sanity check my code.

Now, the persistent pointer-to-parent would, if nothing goes wrong, be no more dangerous than the weak_ptr, given that I'm asserting things left, right, and center that the weak_ptr lock() never fails. But as a side effect of this change I've decoupled the "I am a root" flag and made it explicit, instead of overloading a pointer value so that nullptr serves double duty -- I consider that to be a plus, maintenance-wise (while nullptr as a flag is a pretty common idiom, the boolean saying explicitly what it means carries with it a bit more documentation).

But I'm not seeing where the difference is between using a raw pointer vs. a weak pointer in this case. I don't think I'm understanding your example - what, if anything, has weak_ptr actually bought you here?

Re: Smart Pointers

Postby Sc4Freak » Thu Apr 05, 2012 5:18 pm UTC

An example to help illustrate what I mean:
Code:
void Frobnicate(const shared_ptr<Foo>& foo)
{
   foo->Bar();
   // Do other stuff to foo (observe, mutate, etc)
}
This uses shared_ptr to enforce reference counting. That's great, but what has it bought you? Frobnicate doesn't change the ownership of foo. It doesn't hold any references, it doesn't modify any of its owners. It just observes (or mutates) foo - it's a non-owning reference. So why not this instead:

Code:
void Frobnicate(Foo* foo)
{
   foo->Bar();
   // Do other stuff to foo (observe, mutate, etc)
}

Or even better, this:
Code:
void Frobnicate(Foo& foo)
{
   foo.Bar();
   // Do other stuff to foo (observe, mutate, etc)
}

Passing parameters by pointer or reference is something we've been doing for years and years. But I've seen people using shared_ptrs everywhere - even for non-owning references. And it doesn't make any sense - why use a shared_ptr when a raw pointer (or preferably a reference where applicable) does just as well? It's exactly the same principle whether you're passing parameters into a function or storing references back to a parent node.

This is why I advocate using smart pointers like shared_ptr and unique_ptr only if you need to express an owning reference. But if you just want to use an object that you don't own? Use a raw pointer or reference. (I probably should have said "if you want to use an object that you don't own and which is guaranteed to outlive you" to be completely correct, but in C++ lifetime is almost always tied to ownership anyway)

Re: Smart Pointers

Postby Yakk » Thu Apr 05, 2012 5:33 pm UTC

Sc4Freak wrote:But for expressing plain old non-owning references like a tree node? I say use a raw pointer. It's exactly as correct, and more efficient.
You plan for it to be exactly as correct. In small, toy programs, you can probably track down the bugs.

But you have persistent state.
Yakk wrote:The goal of this would be robustness -- I'm trying to avoid certain classes of error which I find to be common, and using smart pointers to sanity check my code.

Now, the persistent pointer-to-parent would, if nothing goes wrong, be no more dangerous than the weak_ptr, given that I'm asserting things left, right, and center that the weak_ptr lock() never fails.

Sc4Freak wrote:But I'm not seeing where the difference is between using a raw pointer vs. a weak pointer in this case. I don't think I'm understanding your example - what, if anything, has weak_ptr actually bought you here?
The goal of this (using a weak_ptr with asserts that it never fails) instead of a raw pointer would be robustness. I'm trying to avoid certain classes of error which I find common, and using smart pointers to sanity check my code.

Now, the persistent pointer-to-parent would, if nothing goes wrong, be no more dangerous than weak_ptr, given that I'm asserting things left, right, and center that the weak_ptr lock() never fails. But when something goes wrong, behavior is different.

In the weak_ptr case, we get an assert that something happened -- which, in a debug or instrumented release, gives us the exact line number where we have a weak_ptr that was no longer valid. In a raw pointer case, you may or may not get crap data from what you dereferenced -- you'll be accessing memory that at one point was an instance of a certain class, but no longer. Checks against obvious things (like bool flags or the like) will silently do something random, and it is not unreasonable for a deallocated instance-of-X to be replaced with a newly allocated instance-of-X, so it could even be a perfectly valid node.

In the weak_ptr case, instead we get a null pointer from lock(). This means a guaranteed segfault if we dereference it (or a reasonable offset from it) on any reasonable system. And we can assert that it isn't nullptr (generating diagnostics instead of a crash), and branch away from dereferencing it (and avoid a crash entirely).

This can happen if something goes wrong.

Now, will something go wrong? In my experience, yes, something will go wrong. If not in this class, then in the next 100 classes you write. Debugging the problem becomes easier when the weak_ptr assert is hit instead of a crash or silent corruption. Sure, there will be a performance hit -- but programmer time can be turned into performance improvements. And by making debugging easier and bugs easier to track and fix, I'm freeing up programmer time. If this tree structure isn't in a performance critical section, I'm losing performance where I can afford to, and I can spend it on improved performance where I need it.

If the tree structure is in a performance critical section that isn't anywhere near "this fails, someone dies" level code, I can do things from only conditionally using the weak_ptr, through to spending a bunch more time tuning it and actually guaranteeing that the parent pointer of the child never gets out of sync.

You don't code to make your code the fastest it can be. You code it to make your code the easiest to debug and maintain such that it is still fast enough to do the task it needs doing. We aren't talking about an O-notation hit here (barring massive concurrency), we are talking about a constant factor slowdown.

Re: Smart Pointers

Postby zmic » Thu Apr 05, 2012 7:13 pm UTC

sourmìlk wrote:So, smart pointers: When should I use them? Should I ever not use them?

I'm not asking about when to use specific kinds of smart pointers. I understand that. I'm wondering how often I should use them, if at all, and whether there are cases where I should never use them.


In my own experience I've had little use for them. At the high level there's usually your STL containers to do your allocation/deallocation stuff, at the low level I usually write small classes that are straightforward so not much can go wrong.

Re: Smart Pointers

Postby You, sir, name? » Sat Apr 07, 2012 12:22 am UTC

EvanED wrote:
Sc4Freak wrote:I disagree with some of these answers.

Use a smart pointer when you need to express ownership semantics. That is, if you have a pointer which owns the pointed-to object, then you should use a smart pointer. Non-owning pointers should use raw pointers (but prefer references when applicable). In C++11, that means using unique_ptr to express unique ownership (a single instance owns an object) and shared_ptr to express shared ownership (multiple instances own an object).

Take for example a tree structure. A node owns its children, so a node should have a smart pointer to its children. But a child node obviously doesn't own its parent - that would be cyclical. Instead, it should have a non-owning reference back to its parent. In this case, a raw pointer expresses that relationship perfectly correctly, safely, and succinctly.

I somewhat disagree. I think this depends on what kind of bugs you like.

The problem is that logic errors in your program can still leave you with dangling pointers. Full use of reference-counted smart pointers avoids this. Do you like that behavior, where errors will often be more evident during development but more dangerous when deployed, or would you prefer that the ownership semantics become a bit unclear, and errors can be masked, but you are cutting out any problems with dangling pointers?


Huh? If A owns B, and B refers to A, then if A expires, B expires as well (because A owns it) and no pointer is left dangling. Safe as houses.

Re: Smart Pointers

Postby EvanED » Sat Apr 07, 2012 1:18 am UTC

You, sir, name? wrote:
EvanED wrote:The problem is that logic errors in your program can still leave you with dangling pointers. Full use of reference-counted smart pointers avoids this. Do you like that behavior, where errors will often be more evident during development but more dangerous when deployed, or would you prefer that the ownership semantics become a bit unclear, and errors can be masked, but you are cutting out any problems with dangling pointers?


Huh? If A owns B, and B refers to A, then if A expires, B expires as well (because A owns it) and no pointer is left dangling. Safe as houses.

You've assumed the code is right. You can't talk about what happens if it's wrong if you take that as an assumption. :-)

I'm saying: A owns B. C (perhaps a function, as a parameter) gets a non-counted reference to B. A expires, which expires B. C is left with a dangling reference.

Or say you have a tree. A has a child of B (ref counted), and B has a raw pointer back to A. Now you just want the B subtree, so you make a copy of B's shared_ptr (incrementing its count to 2) then delete A (which in the process decrements B's count back to 1). But you forget to reset B's parent pointer to NULL. That's your dangling pointer. It's an error that would have been caught with something like weak_ptr, but of course won't be caught by the raw pointer.
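Here is a small sketch of that subtree scenario with the parent link held as a weak_ptr instead of a raw pointer, assuming C++11. SNode and slice_out_subtree are names made up for the example, and each node has a single child only to keep it short.

```cpp
#include <cassert>
#include <memory>

struct SNode {
    std::weak_ptr<SNode> parent;   // guarded back-reference
    std::shared_ptr<SNode> child;  // single child keeps the sketch short
};

// Builds A -> B, keeps only B, and drops A without resetting B's parent.
std::shared_ptr<SNode> slice_out_subtree() {
    std::shared_ptr<SNode> a = std::make_shared<SNode>();
    a->child = std::make_shared<SNode>();
    a->child->parent = a;

    std::shared_ptr<SNode> b = a->child;  // B's count goes to 2
    assert(b.use_count() == 2);

    a.reset();  // A is destroyed; B's count drops back to 1
    return b;   // B's parent link was never cleared
}
```

The forgotten reset is still a logic error, but it is now detectable: expired() on the returned node's parent reports true and lock() yields an empty shared_ptr, where a raw pointer would silently keep pointing at freed memory.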

Re: Smart Pointers

Postby You, sir, name? » Sat Apr 07, 2012 10:01 am UTC

EvanED wrote:
You, sir, name? wrote:
EvanED wrote:The problem is that logic errors in your program can still leave you with dangling pointers. Full use of reference-counted smart pointers avoids this. Do you like that behavior, where errors will often be more evident during development but more dangerous when deployed, or would you prefer that the ownership semantics become a bit unclear, and errors can be masked, but you are cutting out any problems with dangling pointers?


Huh? If A owns B, and B refers to A, then if A expires, B expires as well (because A owns it) and no pointer is left dangling. Safe as houses.

You've assumed the code is right. You can't talk about what happens if it's wrong if you take that as an assumption. :-)

I'm saying: A owns B. C (perhaps a function, as a parameter) gets a non-counted reference to B. A expires, which expires B. C is left with a dangling reference.


The same thing could be said about passing a pointer to an element in an array to a function, and then deallocating the array. You can't guard against idiots shooting themselves in the foot.

Or say you have a tree. A has a child of B (ref counted), and B has a raw pointer back to A. Now you just want the B subtree, so you make a copy of B's shared_ptr (incrementing its count to 2) then delete A (which in the process decrements B's count back to 1). But you forget to reset B's parent pointer to NULL. That's your dangling pointer. It's an error that would have been caught with something like weak_ptr, but of course won't be caught by the raw pointer.


This is leaving the pattern altogether. A should have a unique_ptr to B if it owns B.

Re: Smart Pointers

Postby Yakk » Sat Apr 07, 2012 1:45 pm UTC

You, sir, name? wrote:The same thing could be said about passing a pointer to an element in an array to a function, and then deallocating the array. You can't guard against idiots shooting themselves in the foot.
The existence of the possibility of a crash in C++ does not mean that avoiding a crash is pointless.

In particular, it is a really common case that raw pointers stored in persistent data structures end up dangling. This actually happens in production code. Sometimes the case that causes it to dangle is obscure and rare -- sometimes the code path is written and tested, and 5 years later someone changes a different bit of code without realizing that it impacts that old bit of code, and the unit test has bit rotted in the meantime (or it didn't catch that corner case).

Dangling pointers happen. If it is your opinion that there is never a justified run time cost to reduce the possibility of segfault due to dangling pointers and logic errors in the code, you are quite simply wrong. Quite often, performance is not an issue in a section of code, while reliability and stability are.

C++ can be used to write code that is as fast as or faster than hand-coded C, but that doesn't mean that C++ cannot also be used to write code that trades performance for reliability. And in any large project, reliability is a bigger problem than performance for the majority of the code.

Unguarded pointers in persistent data structures are dangerous. Sometimes the risk is worth the performance. At other times, the performance isn't worth it, and guarding them is a better idea.

Re: Smart Pointers

Postby EvanED » Sat Apr 07, 2012 4:00 pm UTC

You, sir, name? wrote:The same thing could be said about passing a pointer to an element in an array to a function, and then deallocating the array. You can't guard against idiots shooting themselves in the foot.

Basically, what Yakk said. Smart pointers can't save you from crashes and other memory errors entirely (which is why we should all stop using memory unsafe languages [1]). In fact, there are a couple kinds of errors (like cycles -> leaks) that, from some point of view, can't happen without them.

What they can do is, with reasonably diligent use, make it easier to avoid particular errors. Memory leaks are of course the clearest example, but in general that's a pretty benign error as errors go. Far more important IMO is that if you use them pervasively, you can eliminate other, more damaging, errors -- like dangling pointers. But that requires that you don't go with the "use smart pointers for ownership and raw pointers for others" perspective that you and Sc4Freak suggest.

You can't always do this of course if you have to interface with libraries that take raw pointers, or if you've measured the overhead of the smart pointer operations in one part of your code to be too much. But I do think it's an ideal to aspire to.

1 This is a troll, but a bit of a heartfelt one; basically, I'm really fucking tired of hearing the words "buffer overflow". We had a faculty candidate talk here a month or so ago that basically told us how he could have killed someone with an infection where the initial attack vector was a buffer overflow, after explaining how to use a buffer overflow in an electronic voting system which was actually reasonably-well designed with security in mind (except for the lack of a VVPT) running on an embedded chip with no OS. In 2012, this is ridiculous. I'm basically to the point where I don't trust the collective programmers of any project to write it correctly in a non-memory-safe language.

I'm not sure what the solution is here to the broader question, as there are other vulnerabilities like SQL injections that also should just not happen today but do, and that memory-safe languages can't help with. I think the long-term solution is one of two things. The first is an actual certification process for software developers, like the one for engineers, with consequences for screwing up. It's possible that people who had to put their stamp of approval on "this has no buffer overflows" and would face loss of livelihood if they were wrong would be careful enough that they could use languages like C and C++, but I sort of doubt it. The second possibility is formal verification of the partial correctness of code, in order to prove the absence of certain kinds of errors. I have a firm background in formal verification, and I don't think it's ever going to be tractable in a language like C or C++.

Wow, that turned into a bit of a tirade... might have to start a new thread if this goes much of anywhere. :-)

You, sir, name? wrote:
Or say you have a tree. A has a child B (ref counted), and B has a raw pointer back to A. Now you just want the B subtree, so you make a copy of B's shared_ptr (incrementing its count to 2), then delete A (which in the process decrements B's count back to 1). But you forget to reset B's parent pointer to NULL. That's your dangling pointer. It's an error that would have been caught with something like weak_ptr, but of course won't be caught by the raw pointer.


This is leaving the pattern altogether. A should have a unique_ptr to B if it owns B.

Um, what if you want to support the operation I said? (I.e. slice out a subtree and drop the rest.)
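
A minimal sketch of that slice-out-a-subtree operation, assuming children are held by shared_ptr and the back-pointer is a weak_ptr. The names Node, add_child and build_and_slice are made up for illustration, not taken from anyone's code in this thread. The point is only that, after A is dropped, B's parent link reports itself as expired instead of dangling.

```cpp
#include <memory>
#include <vector>

// Hypothetical node type: children are shared so a subtree can outlive
// its parent; the back-pointer is non-owning and can detect a dead parent.
struct Node {
    int value = 0;
    std::weak_ptr<Node> parent;
    std::vector<std::shared_ptr<Node>> children;
};

// Attach child to parent and record the back-pointer.
void add_child(const std::shared_ptr<Node>& parent,
               const std::shared_ptr<Node>& child) {
    child->parent = parent;
    parent->children.push_back(child);
}

// Build A -> B, keep only B, and let A die when the function returns.
std::shared_ptr<Node> build_and_slice() {
    auto a = std::make_shared<Node>();
    auto b = std::make_shared<Node>();
    b->value = 42;
    add_child(a, b);

    // Take an extra owning handle to the B subtree.
    std::shared_ptr<Node> kept = a->children.front();

    // On return, the local handles a and b go away. A is destroyed because
    // nothing else owns it; B survives through the returned pointer.
    return kept;
}
```

Calling build_and_slice() and then checking parent.expired() on the result should give true, and parent.lock() should give a null shared_ptr, so forgetting to reset the parent link is a detectable condition rather than undefined behaviour. The cost is the usual weak_ptr overhead on every parent access.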
EvanED
 
Posts: 4145
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI

Re: Smart Pointers

Postby Yakk » Mon Apr 23, 2012 2:46 pm UTC

I ran into this in an old tab. And I wanted to pipe up about SQL injections.

SQL injections can be seriously mitigated with type-safe improvements. Getting it to happen is hard.

Basically, a string that comes from outside of the compilation unit (be it a config file, user input, or the like) should be a different type of string than a compiled-in string.

And SQL shouldn't take user generated strings as arguments.

Perl, for example, added an over-type system (taint mode) to deal with exactly that -- languages like C++ should also be able to handle it (C++ via its type system). Other languages could handle it either via their type system or via a language augmentation (like Perl did).

The problem with all of these is that they can easily make doing something harder. And as a society, we reward doing something more highly than doing something right. So for something like the above to work, it has to work in a way that is easy for the developer. You can see some of this in the language extensions in C# (I think it is called LINK or LINC?) that create sub-languages -- similar techniques work in C++.
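
A rough sketch of how that split might look in C++, assuming two made-up types: UntrustedString for anything that came from outside, and SqlText for query text. Neither is a real library API. SqlText can only be started from a char array such as a string literal, and outside data can only get in through append_value, which quotes it. The quote-doubling here is purely illustrative; real code would more likely bind parameters through the database's prepared-statement interface.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical wrapper for text that came from outside the program.
class UntrustedString {
public:
    explicit UntrustedString(std::string raw) : raw_(std::move(raw)) {}
    // Deliberately awkward name: callers must opt in to the raw bytes.
    const std::string& unsafe_raw() const { return raw_; }
private:
    std::string raw_;
};

// Hypothetical query-text type.
class SqlText {
public:
    // Accepts char arrays of known size (in practice, string literals),
    // but not std::string or const char*. A local char buffer would still
    // get through, so this is a convention aid, not a proof.
    template <std::size_t N>
    explicit SqlText(const char (&literal)[N]) : text_(literal, N - 1) {}

    // Outside data can only be appended as a quoted SQL string literal,
    // with embedded single quotes doubled.
    SqlText& append_value(const UntrustedString& value) {
        text_ += '\'';
        for (char c : value.unsafe_raw()) {
            if (c == '\'') text_ += '\'';
            text_ += c;
        }
        text_ += '\'';
        return *this;
    }

    const std::string& str() const { return text_; }

private:
    std::string text_;
};
```

With this, SqlText q("SELECT * FROM players WHERE name = "); q.append_value(UntrustedString(input)); compiles, while SqlText q(input) with a std::string does not. That is the "make the wrong thing hard to type" half of the idea; making the right thing easy is the harder part.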
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.
Yakk
Poster with most posts but no title.
 
Posts: 10466
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Smart Pointers

Postby Xeio » Mon Apr 23, 2012 3:15 pm UTC

Yakk wrote:SQL injections can be seriously mitigated with type-safe improvements. Getting it to happen is hard.

Basically, a string that comes from outside of the compilation unit (be it a config file, user input, or the like) should be a different type of string than a compiled-in string.

I think that's probably overkill though, given that prepared statements already essentially eliminate any chance of SQL injection. There's no need to try to implement a type-safe system to do this.
Xeio
Friends, Faidites, Countrymen
 
Posts: 4853
Joined: Wed Jul 25, 2007 11:12 am UTC
Location: C:\Users\Xeio\

Re: Smart Pointers

Postby Yakk » Mon Apr 23, 2012 3:27 pm UTC

Meh, the same problem happens every time you construct an XML statement. Now, SQL statements are directly executed, while XML tends to be parsed, and parser failures tend not to cause arbitrary code to execute, while malformed SQL statements can cause nearly arbitrary code to execute. So there is a difference of severity.

The general problem of input sanitization is made easier if you mark up user input strings as being a different type than non-user input strings.

On the other hand, you could just treat non-user input strings that come in via the "put text here" interface as non-sanitized (why are you trusting your programmers not to have < in their string constants?), and have a different "route" that the markup text comes through.

Which is effectively building that kind of type system, where the convention of what interface you call determines what the type of the character data is, and making the "safe" character data only come from other parts of the module/object rather than via public interfaces (i.e., treat everything from a public interface as if it were user input).
Yakk
Poster with most posts but no title.
 
Posts: 10466
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Smart Pointers

Postby Xeio » Mon Apr 23, 2012 3:45 pm UTC

Well, from a programmer's perspective it'd still generally be easier to have the writer object be smart enough to escape things that need it. I'd much rather have the library be able to handle unsafe strings than have to manage X types of strings (a different type for every library I'm calling, and X is at least 3, one each for HTML, XML, and SQL, and those are only the most common formats).

Also it means that you'd end up having to write every call twice, either in the library, or having to construct a safe string from an unsafe one whenever you want to make a call into the library.

Granted, to some extent you will have to write two functions anyways, because you'll most likely want to be able to support writing raw strings, but there's not much way around needing that code in the library to support it.
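
As a sketch of what such a writer could look like, assuming a made-up XmlWriter class (not any real library's API): text() is the default path and escapes the five predefined XML entities, while raw() is the second function mentioned above, for markup the caller vouches for.

```cpp
#include <string>

// Hypothetical minimal writer: the interface you call decides whether
// the data is treated as markup or as character data.
class XmlWriter {
public:
    // Appends markup verbatim; the caller vouches for it.
    XmlWriter& raw(const std::string& markup) {
        out_ += markup;
        return *this;
    }

    // Appends character data, escaping the characters that XML treats
    // specially.
    XmlWriter& text(const std::string& data) {
        for (char c : data) {
            switch (c) {
                case '&':  out_ += "&amp;";  break;
                case '<':  out_ += "&lt;";   break;
                case '>':  out_ += "&gt;";   break;
                case '"':  out_ += "&quot;"; break;
                case '\'': out_ += "&apos;"; break;
                default:   out_ += c;        break;
            }
        }
        return *this;
    }

    const std::string& str() const { return out_; }

private:
    std::string out_;
};
```

Usage would be along the lines of w.raw("<name>").text(userInput).raw("</name>"). Nothing stops a caller from passing user input to raw(), which is exactly the gap a separate string type would close.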
Xeio
Friends, Faidites, Countrymen
 
Posts: 4853
Joined: Wed Jul 25, 2007 11:12 am UTC
Location: C:\Users\Xeio\

Re: Smart Pointers

Postby WarDaft » Mon Apr 23, 2012 6:19 pm UTC

I'm rather of the opinion that whoever decided that SQL queries should involve specifying the entire command as a string and passing it to a single query function for all uses was an idiot, no matter their other accomplishments. Yes, it works, but in nearly the worst way possible. It's like how getting a PhD from Harvard by the age of 10 would not absolve you of idiocy if you then set your pants on fire while still wearing them.

Distinct type structures defining the commands would make it essentially impenetrable.

Consider:
Code:
data Predicate = OR [Predicate] |
      AND [Predicate] |
      FIELDSTRPRED Field (String -> Bool) |
      FIELDINTPRED Field (Int -> Bool)
newtype Entry = ...
newtype Table = TABLE String
newtype Field = FIELD String
newtype Database = DB (Server, User, Password)
type Server = String
type User = String
type Password = String

query :: Database -> Table -> Predicate -> IO Entry
query (DB (s,u,p)) = connect s u p where ...

...

gameLookup = query gameDBInfo
playerLookup = gameLookup playerTable

...

showStats name = playerLookup (FIELDSTRPRED playerName (== name)) >>= printStats

statsRequest = getLine >>= showStats
Now, it's not close to the features of fully functional SQL, because I only spent a few minutes typing it up, but there is absolutely nothing that anyone can actually do to showStats that will cause it to do anything other than look up a player with exactly that name and then print their stats, with no modification of permissions or unwanted actions performed on the database. The action statsRequest is completely safe, no matter what the user inputs. I 'wrote' it in Haskell because I just don't know how to write it in any other language with a type system strong enough to force only this use.
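
For comparison, a loose C++ approximation of the same shape, under heavy assumptions: Row, Field, Predicate, field_equals, match_all, match_any and query are all invented names, and the "database" is just an in-memory vector. It is weaker than the Haskell version in what the type system enforces, but it shows the same property: the user's text is only ever compared as data and never parsed as a command.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for the types in the Haskell sketch.
using Row = std::map<std::string, std::string>;     // one table entry
using Predicate = std::function<bool(const Row&)>;  // structured condition

struct Field { std::string name; };

// True when the row has the field and its value equals `value` exactly.
Predicate field_equals(Field field, std::string value) {
    return [field, value](const Row& row) -> bool {
        auto it = row.find(field.name);
        return it != row.end() && it->second == value;
    };
}

// Conjunction of predicates (the AND constructor).
Predicate match_all(std::vector<Predicate> parts) {
    return [parts](const Row& row) -> bool {
        for (const auto& p : parts) {
            if (!p(row)) return false;
        }
        return true;
    };
}

// Disjunction of predicates (the OR constructor).
Predicate match_any(std::vector<Predicate> parts) {
    return [parts](const Row& row) -> bool {
        for (const auto& p : parts) {
            if (p(row)) return true;
        }
        return false;
    };
}

// Returns copies of the rows that satisfy the predicate.
std::vector<Row> query(const std::vector<Row>& table, const Predicate& pred) {
    std::vector<Row> result;
    for (const auto& row : table) {
        if (pred(row)) result.push_back(row);
    }
    return result;
}
```

Feeding it a classic injection string such as bob' OR '1'='1 through field_equals simply looks for a player with that exact, odd name and finds none.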
All Shadow priest spells that deal Fire damage now appear green.
Big freaky cereal boxes of death.
WarDaft
 
Posts: 1574
Joined: Thu Jul 30, 2009 3:16 pm UTC

Re: Smart Pointers

Postby Sc4Freak » Mon Apr 23, 2012 8:05 pm UTC

LINQ?
Sc4Freak
 
Posts: 673
Joined: Thu Jul 12, 2007 4:50 am UTC
Location: Redmond, Washington

Re: Smart Pointers

Postby Yakk » Mon Apr 23, 2012 8:42 pm UTC

Ya, that sounds like the name of the embedded application-specific language Microsoft added to C#. :)
Yakk
Poster with most posts but no title.
 
Posts: 10466
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Smart Pointers

Postby WarDaft » Tue Apr 24, 2012 12:15 am UTC

I'm not saying that such things should exist and don't; I imagined that they did. I just didn't know any examples that satisfied what I was going for, so I was specific. Rather, I wanted to politely express my rage that there are examples that do not follow anything like such a system, instead opting for a horrible 1-string-argument-interpreted-as-a-command single do-it-all function.
WarDaft
 
Posts: 1574
Joined: Thu Jul 30, 2009 3:16 pm UTC

