1353: "Heartbleed"

This forum is for the individual discussion thread that goes with each new comic.

Moderators: Moderators General, Prelates, Magistrates

User avatar
Rombobjörn
Posts: 147
Joined: Mon Feb 27, 2012 11:56 am UTC
Location: right between the past and the future

Re: 1353: "Heartbleed"

Postby Rombobjörn » Wed Apr 09, 2014 4:58 pm UTC

poslundc wrote:You guys can debate the merits of programming languages all you want. Anywhere that human interaction is required there is the potential for bugs. This wasn't a failure of language choice, it was more than likely a failure of process. Any mission-critical code like this should've been reviewed line-by-line by at least one other programmer before shipping. Any C/C++ programmer worth their salt would've red-flagged an arbitrary memcpy like that.

Code review and coding standards are good things, but the fact remains that if the equivalent of that memcpy had been written in a language with bounds checking, then it wouldn't have leaked any information.

KarenRei
Posts: 285
Joined: Sat Jun 16, 2012 10:48 pm UTC

Re: 1353: "Heartbleed"

Postby KarenRei » Wed Apr 09, 2014 5:12 pm UTC

poslundc wrote:You guys can debate the merits of programming languages all you want. Anywhere that human interaction is required there is the potential for bugs.


Nobody's disputing that. The matter is over what *happens* when a bug exists. Is it a compiletime error, a runtime error, or a silent error with potentially grave consequences? Using C++/STL instead of raw pointers shifts most potential errors from the latter category to the former two.

Mazzula
Posts: 54
Joined: Mon Sep 08, 2008 5:22 pm UTC

Re: 1353: "Heartbleed"

Postby Mazzula » Wed Apr 09, 2014 5:18 pm UTC

Rombobjörn wrote:Heartbleed is yet another example of why coding in C is a bad idea. A memcpy with an incorrect size caused all this because C compilers do no bounds checking. Heartbleed wouldn't have happened if OpenSSL had been written in, for example, Ada. Instead of an information leak that leaves no trace it would have been a denial of service at the worst.

Your post brings to mind several thoughts.

  • You mention that the leak leaves no trace. Yet the heartbeat message itself provides the opportunity for tracing. It might be interesting to set up honeypots to detect clients that are exploiting the weakness, to at least identify the threats.

  • Web servers account for a significant share (maybe 2%?) of total energy use in the US. Halving the coding efficiency of web servers entails doubling that energy cost. Ada is an interesting choice because the cost penalty is much less than a factor of two, maybe 10-20%. I gather that Ada was designed for embedded systems, so a lot of thought went into efficiency.

  • I agree with you that better tools can make software more reliable, the notion that it is always possible to defeat such tools (an inevitable consequence of the Church–Turing thesis) shouldn't dissuade us from using tools to help. Simply making it easier to write safe code than unsafe code will go a long way toward making systems safer.

  • Ada really might be an OK language for this purpose. I admit that I haven't thought about Ada for a long time.

    The weakness of Ada, in my view, is that its verbosity makes the meaning of code less apparent at a glance, so ordinary coding errors are harder to spot. Of course, my perception of this weakness may arise from the fact that I no longer use Ada-style languages, but I think that the causation goes the other way, that I was drawn towards less verbose languages because they made it easier to look at code. I started programming in the 1960s with Algol 60, and later programmed quite a lot in Pascal, so it seems Ada ought to be more welcoming to me than I find it to be. I do admire it, despite not really wanting to use it.

    On the other hand, when I compile C code with Visual Studio it nags me about old code that uses memcpy and warns me to replace such calls with memcpy_s, which makes the bounds checking explicit. The cost of memcpy_s over memcpy (in the context of the overall execution time of an algorithm) is trivial and this class of compiler warning eliminates this kind of bug. Presumably Microsoft added the warnings because they had bad experiences with their own buffer overflow issues, and I think the quality of their code has improved as a result.

  • I wonder if it would be efficient to eliminate this class of bug with appropriate sandboxing. If each TLS/SSL session involved its own address space, and its own certificates (presumably issued by the host's intermediate CA), then buffer overrun errors would only reveal information pertinent to that session. Of course, sandboxing has limitations because ultimately we want to interact with "the real world" or to otherwise interact with some global state, so there must be a way out of the sandbox.

KarenRei
Posts: 285
Joined: Sat Jun 16, 2012 10:48 pm UTC

Re: 1353: "Heartbleed"

Postby KarenRei » Wed Apr 09, 2014 5:26 pm UTC

memcpy_s isn't the solution. It still trusts the coder not to screw up, and that's only one type of common error anyway. Who hasn't seen the sort of bug, for example, where a function allocates a bunch of memory blocks, and then for various reasons has to return in the middle, and then someone adds a new return in there but mistakenly doesn't free all of the memory blocks that came before it, or someone adds a new pointer and misses freeing it in one of the return statements? That's just one example among countless. The data structure should manage itself. And it doesn't have to be a performance hit, moving to STL can often even be a performance boost.

poslundc
Posts: 7
Joined: Fri Jun 08, 2012 4:05 pm UTC

Re: 1353: "Heartbleed"

Postby poslundc » Wed Apr 09, 2014 5:34 pm UTC

KarenRei wrote:
poslundc wrote:You guys can debate the merits of programming languages all you want. Anywhere that human interaction is required there is the potential for bugs.


Nobody's disputing that. The matter is over what *happens* when a bug exists. Is it a compiletime error, a runtime error, or a silent error with potentially grave consequences? Using C++/STL instead of raw pointers shifts most potential errors from the latter category to the former two.


I don't want to get into a big showdown of language features. I'm sure you're aware that STL is not some kind of panacea:

  • Many safety-critical and embedded systems can't afford to use dynamic allocation, which STL containers generally rely on.*
  • STL introduces a whole new class of bugs. If you like your errors silent and deadly, I can't count the number of times I've seen a programmer retain a raw pointer to the contents of a std::vector, unaware that the backing storage can be relocated. Of course you're not supposed to *do* that, but you're also not supposed to do an unchecked memcpy. So how is one better than the other?

I'm not saying language features can't be helpful in both preventing and early detecting these sorts of bugs. But in a safety-critical environment the process is what saves your bacon. In this particular example, there's no way a memcpy like that should have passed review.

* Custom allocators are a way around this but they are fraught with their own complications and hazards in a safety-critical environment.

User avatar
BytEfLUSh
Posts: 308
Joined: Sat Jan 02, 2010 1:15 am UTC
Location: Sombor, Serbia, Pizza
Contact:

Re: 1353: "Heartbleed"

Postby BytEfLUSh » Wed Apr 09, 2014 10:50 pm UTC

GOOMHR (yay, I finally get to say that)

rhomboidal wrote:Title Text: I looked at some of the data dumps from vulnerable sites, and it was ... bad. I saw emails, passwords, password hints. SSL keys and session cookies. [snip]

As I read that, I thought: "It's going in the right direction, PLEASE make it so!" It did. Thank you.
2-3 days ago I read a post on a forum I frequent that started similar to that. Except, the guy went on to rant about the company that makes the software game and went on about the things he noticed. I was pretty mad he didn't use the opportunity and, it turns out - he never even heard of Blade Runner.

To make things worse, it's a forum about post-apocalyptic Sci-Fi MMORPG, set far in the future.
Image

Image

-- Professor Dan, The Man from Earth (paraphrased)

Jamaican Castle
Posts: 151
Joined: Fri Nov 30, 2007 9:10 pm UTC

Re: 1353: "Heartbleed"

Postby Jamaican Castle » Thu Apr 10, 2014 1:03 am UTC

I'm curious why the people who initially designed the protocol allowed for clients to send arbitrary heartbeat requests in the first place, as opposed to making it a standardized message or at least of standard length. Was there a specific use they had in mind for that ability, or was it just an arbitrary decision that wasn't a problem up until now?

Mazzula
Posts: 54
Joined: Mon Sep 08, 2008 5:22 pm UTC

Re: 1353: "Heartbleed"

Postby Mazzula » Thu Apr 10, 2014 1:11 am UTC

In order for OpenSSL to be widely accepted and deployed it had to be fast, faster than just about anything else at doing the job. Otherwise it simply wouldn't get used enough to become the de facto standard implementation, and that would likely have led to worse loopholes in the alternative wide variety of implementations. Also, it seems as though TLS/SSL is a contained enough protocol that it would be possible to test an implementation very thoroughly. A widely deployed, open solution would face a wide variety of challenges, and would become hardened over time. So, I can't really fault them for choosing C, it was a reasonable choice. Quite possibly, even given what is now known, the alternatives would have led to worse outcomes, with possibly no less use of C (it just wouldn't be OpenSSL).

Another thing to remember is that compilers also have bugs, and the more complex the language the more likely these become, especially with optimizations enabled. Likely the OpenSSL developers wanted to use a compiler that had been popular and stable over a long history, and that would have argued for C, especially in 1998, when OpenSSL was first developed. Then, once the OpenSSL code base itself had a long history, there would have been risk in a re-implementation. C++ wasn't likely a viable option for OpenSSL, because that language, although it existed, hadn't become standardized until that year. That initial standard had some technical issues that were resolved in later revisions that were aimed at improving compilers.

The worst approach, in my view, would be to decide that critical software infrastructure needs oversight by some organ of government, in the way that the FDA oversees medicine. I probably shouldn't even have mentioned it. This bug was horrible, but it has been fixed and the fix is rapidly becoming widely deployed. The code surely is going to get another look by many experts. I don't think that any likely regulators are going to be able to provide better scrutiny than this code has received or will receive.

User avatar
ahammel
My Little Cabbage
Posts: 2135
Joined: Mon Jan 30, 2012 12:46 am UTC
Location: Vancouver BC
Contact:

Re: 1353: "Heartbleed"

Postby ahammel » Thu Apr 10, 2014 1:27 am UTC

For security-sensitive applications, I think Haskell has a lot to offer. You've got the famous purity guarantee, a fancy type system, and good tooling for thoroughly testing the sensitive bits of the app and theorem proving.

In the future, it might be neat to see theorem proving compilers like Whiley's used in security applications.
He/Him/His/Alex
God damn these electric sex pants!

Mazzula
Posts: 54
Joined: Mon Sep 08, 2008 5:22 pm UTC

Re: 1353: "Heartbleed"

Postby Mazzula » Thu Apr 10, 2014 1:28 am UTC

Jamaican Castle wrote:I'm curious why the people who initially designed the protocol allowed for clients to send arbitrary heartbeat requests in the first place, as opposed to making it a standardized message or at least of standard length. Was there a specific use they had in mind for that ability, or was it just an arbitrary decision that wasn't a problem up until now?


I don't know but, being an ultracrepidarian, I will answer anyway.

I suspect it might have several uses. Although TCP/IP allows a Keep-Alive, it isn't widely used. There is otherwise no traffic at all on an idle TCP/IP connection. Thus, if a client computer crashes, the server may never discover the fact and will likely eventually close the connection, assuming the client is no longer there. For TSL/SSL, this would require the negotiation of a new session. Since this is costly, a heartbeat may be worthwhile.

Another use might be to generate fake traffic in order to disguise the timing of real traffic. Since the messages are encrypted once the protocol gets going, the content appears only to be random sequences of various lengths.

User avatar
ahammel
My Little Cabbage
Posts: 2135
Joined: Mon Jan 30, 2012 12:46 am UTC
Location: Vancouver BC
Contact:

Re: 1353: "Heartbleed"

Postby ahammel » Thu Apr 10, 2014 1:40 am UTC

<tinfoil hat on>

So, what's known about Heartbleed's provenance? By which I mean, how do we know the NSA didn't plant it intentionally?

<tinfoil hat off>
He/Him/His/Alex
God damn these electric sex pants!

Jamaican Castle
Posts: 151
Joined: Fri Nov 30, 2007 9:10 pm UTC

Re: 1353: "Heartbleed"

Postby Jamaican Castle » Thu Apr 10, 2014 1:42 am UTC

Mazzula wrote:I don't know but, being an ultracrepidarian, I will answer anyway.

I suspect it might have several uses. Although TCP/IP allows a Keep-Alive, it isn't widely used. There is otherwise no traffic at all on an idle TCP/IP connection. Thus, if a client computer crashes, the server may never discover the fact and will likely eventually close the connection, assuming the client is no longer there. For TSL/SSL, this would require the negotiation of a new session. Since this is costly, a heartbeat may be worthwhile.

Another use might be to generate fake traffic in order to disguise the timing of real traffic. Since the messages are encrypted once the protocol gets going, the content appears only to be random sequences of various lengths.


I get why it has a heartbeat, I'm just not sure why it would be "the client sends whatever it wants, and the server echoes it" as opposed to "the client sends a specific, pre-defined message, and the server responds with a different pre-defined message". I suppose it could be to test the connection/server for dropped data, because the client can compare the reply with the data it originally sent out.

User avatar
Isaac Hill
Systems Analyst????
Posts: 542
Joined: Wed Mar 14, 2007 9:35 pm UTC
Location: Middletown, RI

Re: 1353: "Heartbleed"

Postby Isaac Hill » Thu Apr 10, 2014 2:49 am UTC

dp2 wrote:The trick is, there's no check that the size and the data match. So, one side can say "Here's 64K bytes, send the same 64K back to me" but only actually send one byte of data. The other side puts that single byte in memory, not knowing it's only one byte. Then, knowing it has to send 64K back, it puts that same byte PLUS the next (64K - 1) bytes from its memory into the packet and sends it back. All those extra bytes can be anything in memory. It might be junk, but it can very easily be valuable info. And this happens with every heartbeat.
Doesn't not checking the incoming data kind of ignore the whole point of the heartbeat checking for connectivity? If I tell you I'm sending 64k bytes, then you only receive 1 byte from me, that would be a pretty good indicator that the connection is bad.

poslundc wrote:
addams wrote:
poslundc wrote:You guys can debate the merits of programming languages all you want. Anywhere that human interaction is required there is the potential for bugs. This wasn't a failure of language choice, it was more than likely a failure of process. Any mission-critical code like this should've been reviewed line-by-line by at least one other programmer before shipping. Any C/C++ programmer worth their salt would've red-flagged an arbitrary memcpy like that.

Dan.

Wonderful!
Two more Posts and you can join the Conversation with full Linking Powers!

Yes! You know how to fix the vulnerabilities of Human Interactions with Machines?
How? Many have considered disposing of the Human Interface.

What fun would that be?
You want to Add more Programers?

Good Idea!
(please keep in mind, programers are people, too.)


I'm an engineer; I don't value quantity over quality. I'm specially trained to write code of sufficient quality for use in pacemakers, medical lasers and space shuttles, where it has to work correctly the first time and every time. My undergrad thesis was software to monitor a nuclear reactor. But hey, I'm only five thousand posts away from being worthy to be a part of the conversation. Good to see this is where the intellectuals hang out; I'm real thrilled to be here.

Dna.
This guy ending his rant about the awesomeness of his attention to detail by misspelling his own name made me laugh more than the comic did.
Alleged "poems"
that don't follow a rhyme scheme
are not poetry

KarenRei
Posts: 285
Joined: Sat Jun 16, 2012 10:48 pm UTC

Re: 1353: "Heartbleed"

Postby KarenRei » Thu Apr 10, 2014 2:49 am UTC

poslundc wrote:I can't count the number of times I've seen a programmer retain a raw pointer to the contents of a std::vector


Exactly: *a raw pointer*. The problem remains the use of raw pointers.

Mazzula
Posts: 54
Joined: Mon Sep 08, 2008 5:22 pm UTC

Re: 1353: "Heartbleed"

Postby Mazzula » Thu Apr 10, 2014 2:51 am UTC

Jamaican Castle wrote:I get why it has a heartbeat, I'm just not sure why it would be "the client sends whatever it wants, and the server echoes it" as opposed to "the client sends a specific, pre-defined message, and the server responds with a different pre-defined message". I suppose it could be to test the connection/server for dropped data, because the client can compare the reply with the data it originally sent out.

The way I read the protocol, the content comprises two main parts, payload and padding. The payload must be echoed, but the padding must be ignored by the receiver. The padding thus need not be identical between request and response. The padding must be a minimum of 16 bytes. So, in a way, the server both echoes the client content and also sends a different message. But I agree that it still seems odd to disguise the message as little as that, thus providing known identical plaintext.

Mazzula
Posts: 54
Joined: Mon Sep 08, 2008 5:22 pm UTC

Re: 1353: "Heartbleed"

Postby Mazzula » Thu Apr 10, 2014 3:14 am UTC

ahammel wrote:<tinfoil hat on>

So, what's known about Heartbleed's provenance? By which I mean, how do we know the NSA didn't plant it intentionally?

<tinfoil hat off>

It always seemed suspicious to me that Al Gore suddenly stopped pushing for the Clipper Chip/Skipjack mandate, with its explicit backdoor. I figured that the NSA had broken public key encryption and thus wanted to promote its use (speaking of foil hats...). Now, with this and the compromised Dual_EC_DRBG, it seems that they hadn't broken it after all, but just resorted to more mundane methods.

Mazzula
Posts: 54
Joined: Mon Sep 08, 2008 5:22 pm UTC

Re: 1353: "Heartbleed"

Postby Mazzula » Thu Apr 10, 2014 3:29 am UTC

ahammel wrote:For security-sensitive applications, I think Haskell has a lot to offer. You've got the famous purity guarantee

If a program actually has no side effects, how can you tell if it is running?

Even if such side effects are limited to I/O, how can a program be forbidden to use I/O (e.g. to a file system) to store global state information?

User avatar
addams
Posts: 10224
Joined: Sun Sep 12, 2010 4:44 am UTC
Location: Oregon Coast: 97444

Re: 1353: "Heartbleed"

Postby addams » Thu Apr 10, 2014 4:03 am UTC

Mazzula wrote:
ahammel wrote:For security-sensitive applications, I think Haskell has a lot to offer. You've got the famous purity guarantee

If a program actually has no side effects, how can you tell if it is running?

Even if such side effects are limited to I/O, how can a program be forbidden to use I/O (e.g. to a file system) to store global state information?

If a drug has no side effects, how can you tell if it is working?
I don't know. How?
Life is, just, an exchange of electrons; It is up to us to give it meaning.

We are all in The Gutter.
Some of us see The Gutter.
Some of us see The Stars.
by mr. Oscar Wilde.

Those that want to Know; Know.
Those that do not Know; Don't tell them.
They do terrible things to people that Tell Them.

User avatar
ahammel
My Little Cabbage
Posts: 2135
Joined: Mon Jan 30, 2012 12:46 am UTC
Location: Vancouver BC
Contact:

Re: 1353: "Heartbleed"

Postby ahammel » Thu Apr 10, 2014 4:30 am UTC

Mazzula wrote:
ahammel wrote:For security-sensitive applications, I think Haskell has a lot to offer. You've got the famous purity guarantee

If a program actually has no side effects, how can you tell if it is running?

Even if such side effects are limited to I/O, how can a program be forbidden to use I/O (e.g. to a file system) to store global state information?

Haskell doesn't make it impossible to write code with side-effects; it makes it possible to write code that is guaranteed to be free of side effects.
He/Him/His/Alex
God damn these electric sex pants!

User avatar
Icalasari
Posts: 107
Joined: Wed May 26, 2010 5:11 am UTC

Re: 1353: "Heartbleed"

Postby Icalasari » Thu Apr 10, 2014 4:50 am UTC

dp2 wrote:
gsilverfish wrote:So as an end user I'm confused. Do I just wait for my sites to email me when it's time to change the password? If I understand correctly there won't be much point to it until they update things on their end.

That's correct. You can pester sites to get it fixed, and you can keep an eye on your money via non-web means. But as a user, there's nothing for you to do.

http://www.cnet.com/news/how-to-protect ... bleed-bug/


Dammit. I'm lazy and reuse a password for most sites (to be fair, they are accounts I don't mind losing if they get compromised), so I'm just going to have to wait, at which point I'll likely have forgotten (fucking memory)

REALLY frustrating is that Yahoo has yet to update, and that's one of the sites that I do use a unique password on. Sooooo much info is exposed right now until they get off their asses and update...

User avatar
CocoaNutCakery
Posts: 66
Joined: Fri May 24, 2013 6:26 am UTC
Contact:

Re: 1353: "Heartbleed"

Postby CocoaNutCakery » Thu Apr 10, 2014 5:33 am UTC

spoonyspork wrote:Fun fact: hotmail used to plop your username and password right on the url created on login, not obscured in any way (IIRC the url read something like 'hotmail.com/uid=<username>&pwd=<password>' or something equally as absurd). I could actually tell when my mom was lying about triple-checking to make sure she'd typed her password correctly when she'd complain she couldn't log in. Fun times, man. Fun times.

*re-lurks -- just here to change password on yet another site, just in case*


Okay? But that means that someone needed to either be looking over your shoulder or looking through your browsing history, and I assume that, in the latter case, a keylogger program would work better.

On the flip side, Heartbleed is something that is currently an issue and doesn't require someone to target you directly.

skeets011
Posts: 1
Joined: Thu Apr 10, 2014 6:06 am UTC

Re: 1353: "Heartbleed"

Postby skeets011 » Thu Apr 10, 2014 6:10 am UTC

I came to xkcd for a laugh today...I left with more stress...

Jamaican Castle
Posts: 151
Joined: Fri Nov 30, 2007 9:10 pm UTC

Re: 1353: "Heartbleed"

Postby Jamaican Castle » Thu Apr 10, 2014 6:23 am UTC

ahammel wrote:<tinfoil hat on>

So, what's known about Heartbleed's provenance? By which I mean, how do we know the NSA didn't plant it intentionally?

<tinfoil hat off>


It doesn't seem like it achieves what the NSA would want in a backdoor.

If you want to make a bunch of money (or make fraudulent purchases, etc.) you just need a few people's information. If you want to monitor what people are doing (to fight crime, or oppress your population, or catch Russian spies, or whatever you think the NSA is up to) they you need a lot of people's information - since you're looking for a needle in a haystack, you need the ability to search vast portions of that haystack, or odds are you'll never find it.

I mean, I don't doubt they would/could/maybe did exploit it, if they knew about it. But if they had a group making network attacks for their own use and this is what they came up with, they maybe need to try again.

Edit: because I confuse these sneaky government agencies sometimes.
Last edited by Jamaican Castle on Thu Apr 10, 2014 6:35 am UTC, edited 1 time in total.

keldor
Posts: 80
Joined: Thu Jan 26, 2012 9:18 am UTC

Re: 1353: "Heartbleed"

Postby keldor » Thu Apr 10, 2014 6:28 am UTC

CocoaNutCakery wrote:
spoonyspork wrote:Fun fact: hotmail used to plop your username and password right on the url created on login, not obscured in any way (IIRC the url read something like 'hotmail.com/uid=<username>&pwd=<password>' or something equally as absurd). I could actually tell when my mom was lying about triple-checking to make sure she'd typed her password correctly when she'd complain she couldn't log in. Fun times, man. Fun times.

*re-lurks -- just here to change password on yet another site, just in case*


Okay? But that means that someone needed to either be looking over your shoulder or looking through your browsing history, and I assume that, in the latter case, a keylogger program would work better.

On the flip side, Heartbleed is something that is currently an issue and doesn't require someone to target you directly.


Or they could just snoop your web traffic and see the URL request on its way to the server...

Jamaican Castle
Posts: 151
Joined: Fri Nov 30, 2007 9:10 pm UTC

Re: 1353: "Heartbleed"

Postby Jamaican Castle » Thu Apr 10, 2014 6:37 am UTC

keldor wrote:Or they could just snoop your web traffic and see the URL request on its way to the server...


Or, better/worse yet, snoop on Hotmail's web traffic and see everybody's URL requests.

User avatar
Rombobjörn
Posts: 147
Joined: Mon Feb 27, 2012 11:56 am UTC
Location: right between the past and the future

Re: 1353: "Heartbleed"

Postby Rombobjörn » Thu Apr 10, 2014 9:42 am UTC

Jamaican Castle wrote:I get why it has a heartbeat, I'm just not sure why it would be "the client sends whatever it wants, and the server echoes it" as opposed to "the client sends a specific, pre-defined message, and the server responds with a different pre-defined message".

Pre-defined messages might open the protocol to known-plaintext attacks.

nlitchfield
Posts: 16
Joined: Mon Mar 07, 2011 1:25 pm UTC

Re: 1353: "Heartbleed"

Postby nlitchfield » Thu Apr 10, 2014 11:46 am UTC

Rysto wrote:
ManaUser wrote:I like how the people that found it created not only a catchy name, but a matching logo. Not many people take the time to do that anymore... actually has anyone taken the time to do that before?

I like how they created a catchy name and a logo, but apparently couldn't be bothered to notify major OS vendors before disclosing the vulnerability.


Which O/S vendors would those be? They notified the software developers for openssl which is exactly what you would expect them to do. They then held off public disclosure until the issue was fixed and the developers had made the update available including to all O/S vendors and the general public. I don't see how google security did anything but TheRightThing(tm) here.

It's also likely to turn out to be a poster boy counter-argument for the "open source is more secure because anyone can, and many do, review the code". This is probably the core open source security library implementation and it went unfixed for, well a comparable or longer time than closed source vendors.

I'm sure I'm not the only one imagining that certain agencies might have already been well aware of this, as might have been all sorts of neer do wells. Its a mess.

KarenRei
Posts: 285
Joined: Sat Jun 16, 2012 10:48 pm UTC

Re: 1353: "Heartbleed"

Postby KarenRei » Thu Apr 10, 2014 12:55 pm UTC

Let's take a practical view of how the code would change by switching to STL. The bug is in this code:

Code: Select all

unsigned char *buffer, *bp;
int r;
buffer = OPENSSL_malloc(1 + 2 + payload + padding);
bp = buffer;
*bp++ = TLS1_HB_RESPONSE;
s2n(payload, bp);
memcpy(bp, pl, payload);
RAND_pseudo_bytes(p, padding);
r = dtls1_write_bytes(s, TLS1_RT_HEARTBEAT, buffer, 3 + payload + padding);


First off, bad dog, no biscuit! for using cryptic couple-letter-long variable names instead of names that actually describe things... so let's clarify: bp is the output buffer, they're setting the first field to the response type in a really ridiculous manner, then they're copying the stated length (not the actual length) from the source packet (payload) to the output, then they copy the stated length of data, then they add random padding and send it.

The STL equivalent, with the same sort of bug and lousy coding approach, could be something like:

Code: Select all

std::string bp;
bp.reserve(1 + 2 + payload + padding);
int r;
bp += TLS1_HB_RESPONSE;
bp += pl;
RAND_pseudo_bytes(p, padding);
r = dtls1_write_bytes(s, TLS1_RT_HEARTBEAT, buffer);


Note that the entire need for the coder to spell out the length check disappears - they *can't* screw up there, the entire possibility for the error has been removed.

What about the standard C excuse, performance? There's for-loops with length checks in them (each of the += statements), versus two in the previous one (the TLS1_HB_RESPONSE one is superfluous, we can assume the check isn't needed there). But given that's just a couple bytes, the memcpy, random bytes, and write bytes will eat up vastly more time, the little added by the first will be insignificant. Furthermore, you can be assured with SSL that the code for your memory handling has optimized, from allocation onward, which you're not guaranteed when you make your own piecemeal implementations, and since stl structures are all .h/.tcc code, it's fully available to the compiler to optimize *into* your code instead of just being fixed library calls. And that whole example is a pointless exercise anyway as it wouldn't occur if they weren't programming like idiots - if they'd just fill out their data stucture, then do a write_bytes on it all at once.

Then there's all the other stuff, like the ease of threading STL provides, especially on C++11 - you can std::thread().detach() your write/failure callback, either as one function or a lambda that calls multiple functions. If you need to keep your buffer around until it's no longer needed and then free it afterwards, you can declare it as a std::shared_ptr and it'll take care of deallocation on its own at the end. With such easy one-line managed threading, performance-critical software can make much better use of multiple CPU cores.
Last edited by KarenRei on Thu Apr 10, 2014 3:53 pm UTC, edited 2 times in total.

Mazzula
Posts: 54
Joined: Mon Sep 08, 2008 5:22 pm UTC

Re: 1353: "Heartbleed"

Postby Mazzula » Thu Apr 10, 2014 1:05 pm UTC

addams wrote:
Mazzula wrote:
ahammel wrote:For security-sensitive applications, I think Haskell has a lot to offer. You've got the famous purity guarantee

If a program actually has no side effects, how can you tell if it is running?

Even if such side effects are limited to I/O, how can a program be forbidden to use I/O (e.g. to a file system) to store global state information?

If a drug has no side effects, how can you tell if it is working?
I don't know. How?


In programming, a side effect is any action of a function other than generating its return value. From Wikipedia, "In computer science, a function or expression is said to have a side effect if, in addition to returning a value, it also modifies some state or has an observable interaction with calling functions or the outside world."

It is a different word from the word as used in medicine, where an effect is not considered to be a side effect if it was the desired effect.

It would be nice if your definition really did apply to Haskell, so that the purity guarantee meant that programs only did what the programmer desired for them to do. But alas...

Mazzula
Posts: 54
Joined: Mon Sep 08, 2008 5:22 pm UTC

Re: 1353: "Heartbleed"

Postby Mazzula » Thu Apr 10, 2014 2:07 pm UTC

KarenRei wrote:Let's take a practical view of how the code would change by switching to STL. […]

Code: Select all

unsigned char *buffer, *bp;
int r;
buffer = OPENSSL_malloc(1 + 2 + payload + padding);
bp = buffer;
*bp++ = TLS1_HB_RESPONSE;
s2n(payload, bp);
memcpy(bp, pl, payload);
RAND_pseudo_bytes(p, padding);
r = dtls1_write_bytes(s, TLS1_RT_HEARTBEAT, buffer, 3 + payload + padding);



I think the last lines were actually…

Code: Select all

memcpy(bp, pl, payload);
bp += payload;
RAND_pseudo_bytes(bp, padding);
r = dtls1_write_bytes(s, TLS1_RT_HEARTBEAT, buffer, 3 + payload + padding);


KarenRei wrote:The STL equivalent, with the same sort of bug and lousy coding approach, could be something like:

Code: Select all

std::string bp;
bp.reserve(1 + 2 + payload + padding);
int r;
bp += TLS1_HB_RESPONSE;
bp += pl;
RAND_pseudo_bytes(p, padding);
r = dtls1_write_bytes(s, TLS1_RT_HEARTBEAT, buffer);


Note that the entire need for the coder to spell out the length check disappears - they *can't* screw up there, the entire possibility for the error has been removed.


I don't really understand your code.

You seem to have removed the "s2n(payload, bp);". How does the payload size get encoded into the buffer?
You don't seem to have put the padding anywhere useful.
These bugs show there is a risk associated with re-implementation.

But the most important thing is that you haven't taken the correct action on a buffer overflow. In the case of a buffer overflow, the standard says that the incoming message is supposed to be ignored. Ignoring the message is an active response in this code. So you haven't removed the need for an explicit buffer check, you have simply replaced the explicit buffer check with an implicit one that will crash OpenSSL.

Perhaps crashing OpenSSL is better than a buffer overflow, but you still need the explicit check in order to have correct code.

The increase in overhead doesn't come from the buffer checking, but from maintaining the std:string structure and accessing the bytes of that string using array operations rather than operations like *bp++ in the original code. The implementation of "bp += TLS1_HB_RESPONSE;" will likely be at least an order of magnitude slower than the implementation of "*bp++ = TLS1_HB_RESPONSE;".

The loops are also likely to take much longer, because the size of the payload is likely to be small (except when the bug is being exploited). I believe that a payload size of 1 byte is typical. So the "looping" part of the loop does not actually swamp the other overhead.

And for all that, you still haven't removed the need for an explicit buffer overflow check.

Given that inefficiency means more energy will be used to perform the computation, how many tons of greenhouse gasses is it worth to switch from C to C++?

justvisiting
Posts: 9
Joined: Fri May 17, 2013 10:21 am UTC

Re: 1353: "Heartbleed"

Postby justvisiting » Thu Apr 10, 2014 2:50 pm UTC

jpvlsmv wrote:Right, and that's the part that's not being made clear in the mass-media version of "The Sky Has Fallen" reporting on this bug (as well as on the actual tech side.

Yes, you will get 64k of memory from the server. But you don't control which 64k1. And even if you happen to get the server's key, that string of bytes is (should be) indistinguishable from 8192 bits from /dev/random (except that you can test it to see if it happens to decrypt something, or match the signature on a certificate)

--Joe
[1] It depends on where your dynamic memory allocator decides to put the 1 byte (in this example) Alice sent to Bob's server. It would have to be in a lower address within 64k of the data you actually want. Your placement will usually be influenced by the size of the allocation (1byte may be far away from a 4-byte allocation) and how fragmented the server's memory is (i.e. how busy it has been since it started), and you may still crash the server process if you read past the end of a memory page into unallocated space.


OpenSSL does its own memory caching, so most of the time it does not actually call the real malloc() and free(). So, most of the time the leaked memory contains something that was previously used by OpenSSL in the same process. This often includes remnants of previous HTTP requests, which is how people have been able to trivially get other people's usernames/passwords/sessions/etc. by simply connecting to the server and sending heartbeat requests.

About the private key: I believe the key doesn't sit in memory by itself, but it is surrounded by other certificate-related data that can be easily distinguished.

User avatar
Ken_g6
Posts: 78
Joined: Tue Jun 29, 2010 10:45 pm UTC
Location: in yer GPUz fakterin' primez in wardrobez

Re: 1353: "Heartbleed"

Postby Ken_g6 » Thu Apr 10, 2014 3:13 pm UTC

justvisiting wrote:OpenSSL does its own memory caching, so most of the time it does not actually call the real malloc() and free(). So, most of the time the leaked memory contains something that was previously used by OpenSSL in the same process. This often includes remnants of previous HTTP requests, which is how people have been able to trivially get other people's usernames/passwords/sessions/etc. by simply connecting to the server and sending heartbeat requests.

About the private key: I believe the key doesn't sit in memory by itself, but it is surrounded by other certificate-related data that can be easily distinguished.
Yeah, I largely blame their malloc for this. If they'd used a secure malloc that wipes the memory it allocates, preferably with a code that's easily recognizable when debugging, then all the attackers would have gotten would be 0xDEADBEEF, or the like.

jpvlsmv
Posts: 85
Joined: Wed Jan 09, 2013 9:43 pm UTC

Re: 1353: "Heartbleed"

Postby jpvlsmv » Thu Apr 10, 2014 3:26 pm UTC

justvisiting wrote:
jpvlsmv wrote:Yes, you will get 64k of memory from the server. But you don't control which 64k1. And even if you happen to get the server's key, that string of bytes is (should be) indistinguishable from 8192 bits from /dev/random (except that you can test it to see if it happens to decrypt something, or match the signature on a certificate)

--Joe
[1] It depends on where your dynamic memory allocator decides to put the 1 byte (in this example) Alice sent to Bob's server. It would have to be in a lower address within 64k of the data you actually want. Your placement will usually be influenced by the size of the allocation (1byte may be far away from a 4-byte allocation) and how fragmented the server's memory is (i.e. how busy it has been since it started), and you may still crash the server process if you read past the end of a memory page into unallocated space.


OpenSSL does its own memory caching, so most of the time it does not actually call the real malloc() and free(). So, most of the time the leaked memory contains something that was previously used by OpenSSL in the same process. This often includes remnants of previous HTTP requests, which is how people have been able to trivially get other people's usernames/passwords/sessions/etc. by simply connecting to the server and sending heartbeat requests.

About the private key: I believe the key doesn't sit in memory by itself, but it is surrounded by other certificate-related data that can be easily distinguished.

Ah, that's the part that I'd been missing -- it was the "allocator"'s fault. And Theo and Ted from OpenBSD have pointed me in this direction as well.

Guess I was in the lucky 10000 that day. (Although from the response to the bug, I'd say it was more than 10000 on Monday)
--Joe

KarenRei
Posts: 285
Joined: Sat Jun 16, 2012 10:48 pm UTC

Re: 1353: "Heartbleed"

Postby KarenRei » Thu Apr 10, 2014 3:47 pm UTC

Mazzula wrote:
KarenRei wrote:Let's take a practical view of how the code would change by switching to STL. […]

Code: Select all

unsigned char *buffer, *bp;
int r;
buffer = OPENSSL_malloc(1 + 2 + payload + padding);
bp = buffer;
*bp++ = TLS1_HB_RESPONSE;
s2n(payload, bp);
memcpy(bp, pl, payload);
RAND_pseudo_bytes(p, padding);
r = dtls1_write_bytes(s, TLS1_RT_HEARTBEAT, buffer, 3 + payload + padding);



I think the last lines were actually…

Code: Select all

memcpy(bp, pl, payload);
bp += payload;
RAND_pseudo_bytes(bp, padding);
r = dtls1_write_bytes(s, TLS1_RT_HEARTBEAT, buffer, 3 + payload + padding);


I was copying straight from the patch that introduced the bug, although your version makes more sense - perhaps the original patch contained a different bug that was quickly patched?

KarenRei wrote:The STL equivalent, with the same sort of bug and lousy coding approach, could be something like:

Code: Select all

std::string bp;
bp.reserve(1 + 2 + payload + padding);
int r;
bp += TLS1_HB_RESPONSE;
bp += pl;
RAND_pseudo_bytes(p, padding);
r = dtls1_write_bytes(s, TLS1_RT_HEARTBEAT, buffer);


Note that the entire need for the coder to spell out the length check disappears - they *can't* screw up there, the entire possibility for the error has been removed.


I don't really understand your code. You seem to have removed the "s2n(payload, bp);".


That's part of the problem of having only a partial implementation here, I wasn't about to go back and rework the entire openssl stack for a discussion on xkcd. My assumption was that the std::string length is taken as the buffer length everywhere in the program, which is the only thing that makes sense here.

You don't seem to have put the padding anywhere useful.


Again, this was based on the original patch that introduced the bug that I saw on the web. The padding was put in the same place as in the C version above.

But the most important thing is that you haven't taken the correct action on a buffer overflow.


Neither did the C version. This wasn't written to be correct code, it was written to be a C++ version of the same incorrect code. The difference is that while the latter gives incorrect behavior, the former leaks raw memory content to malicious attackers.

Perhaps crashing OpenSSL is better than a buffer overflow


"Perhaps"? Is there any remote question that it's the better option? The order of preferability of bugs is of the order:

1) Compiletime error
2) Runtime error
3) Silent error

As stated in my earlier post on the subject, C++/STL takes much of what would be in the third category and moves them to the first two categories. This is a Very Good Thing(TM).

The increase in overhead doesn't come from the buffer checking, but from maintaining the std:string structure and accessing the bytes of that string using array operations rather than operations like *bp++ in the original code.


As stated, in any *reasonable* implementation they would have just had a structure and assigned structurename.bp = ....; instead of assigning it by pointer dererence (or copying into a string), and it would have been even faster. Perhaps they did that because they experienced trouble because C has such crappy support for data alignment (*cough* std::aligned_storage *cough*)? Also, as stated, C++ lets you do the exact same potentially dangerous operations as the un-bounds-checked, potentially-null-destroying *bp++ example above. But you have to do your dangerous behavior *explicitly*, preferably by extending std::string, less preferably by const casting c_str().

The loops are also likely to take much longer, because the size of the payload is likely to be small


Not if you take a look at the actual data structure, and factor in all of the time to make the padding and send the message (or to receive it in the first place).

Given that inefficiency means more energy will be used to perform the computation, how many tons of greenhouse gasses is it worth to switch from C to C++?


Given that if they'd used C++ aligned data structures in the first place they wouldn't have had to increment that pointer in the first place, how many tons of greenhouse gasses is it worth to keep using C instead of C++?

KarenRei
Posts: 285
Joined: Sat Jun 16, 2012 10:48 pm UTC

Re: 1353: "Heartbleed"

Postby KarenRei » Thu Apr 10, 2014 4:03 pm UTC

As an just an example of what I mean about about STL containing optimizations that most people don't do, allowing switchovers to STL to gain performance increases: how many people do you know that make a regular habit of using the restrict keyword all throughout their code when not dealing with threaded memory? I'd wager that most C coders don't even know what it does. But restrict is used many dozens of times in the STL backends. Without restrict, the compiler can't trust that each object in memory is constant from moment to moment, meaning that it has to insert (often unnecessary) reads all over the place.

It pays to make use of highly optimized, common, safe data structures. And if you want to use unsafe operations to gain further performance, fine, you can, but you have to do it *explicitly*.

M2tM
Posts: 1
Joined: Mon Mar 24, 2014 6:55 pm UTC

Re: 1353: "Heartbleed"

Postby M2tM » Thu Apr 10, 2014 5:42 pm UTC

KarenRei wrote:memcpy_s isn't the solution. It still trusts the coder not to screw up, and that's only one type of common error anyway. Who hasn't seen the sort of bug, for example, where a function allocates a bunch of memory blocks, and then for various reasons has to return in the middle, and then someone adds a new return in there but mistakenly doesn't free all of the memory blocks that came before it, or someone adds a new pointer and misses freeing it in one of the return statements? That's just one example among countless. The data structure should manage itself. And it doesn't have to be a performance hit, moving to STL can often even be a performance boost.


In general, RAII is a very powerful concept. Modern C++ may not have been around when this stuff was written, but now that it is, combined with the interoperability of C with C++ means that the most obvious choice to avoid this type of issue in the future would indeed be to begin modernizing the code base.

C programmers can sometimes sneer at C++, but honestly there's no reason to eschew the benefits of more powerful and safe language features when they don't provide overhead, and indeed, inlining and more explicit generic type-safety can improve performance during compiler optimization.

keldor
Posts: 80
Joined: Thu Jan 26, 2012 9:18 am UTC

Re: 1353: "Heartbleed"

Postby keldor » Thu Apr 10, 2014 6:50 pm UTC

KarenRei wrote:As an just an example of what I mean about about STL containing optimizations that most people don't do, allowing switchovers to STL to gain performance increases: how many people do you know that make a regular habit of using the restrict keyword all throughout their code when not dealing with threaded memory? I'd wager that most C coders don't even know what it does. But restrict is used many dozens of times in the STL backends. Without restrict, the compiler can't trust that each object in memory is constant from moment to moment, meaning that it has to insert (often unnecessary) reads all over the place.

It pays to make use of highly optimized, common, safe data structures. And if you want to use unsafe operations to gain further performance, fine, you can, but you have to do it *explicitly*.


I thought that was the purpose of the 'volatile' keyword - to tell the compiler that given location in memory could change, i.e. don't store it in a register.

As far as performance, extra memory reads don't really matter, since it just lives in the L1 cache. x86 is short on registers to begin with, so you're not going to be keeping much there.

User avatar
BlueNight
Posts: 270
Joined: Sat Aug 18, 2007 3:59 am UTC
Location: Albuquerque
Contact:

Re: 1353: "Heartbleed"

Postby BlueNight » Thu Apr 10, 2014 6:56 pm UTC

Is it wrong of me to giggle at the thought that this came to light just after XP Final Update Day?
---------
BlueNight

keldor
Posts: 80
Joined: Thu Jan 26, 2012 9:18 am UTC

Re: 1353: "Heartbleed"

Postby keldor » Thu Apr 10, 2014 6:56 pm UTC

keldor wrote:
KarenRei wrote:As an just an example of what I mean about about STL containing optimizations that most people don't do, allowing switchovers to STL to gain performance increases: how many people do you know that make a regular habit of using the restrict keyword all throughout their code when not dealing with threaded memory? I'd wager that most C coders don't even know what it does. But restrict is used many dozens of times in the STL backends. Without restrict, the compiler can't trust that each object in memory is constant from moment to moment, meaning that it has to insert (often unnecessary) reads all over the place.

It pays to make use of highly optimized, common, safe data structures. And if you want to use unsafe operations to gain further performance, fine, you can, but you have to do it *explicitly*.


I thought that was the purpose of the 'volatile' keyword - to tell the compiler that given location in memory could change, i.e. don't store it in a register.

As far as performance, extra memory reads don't really matter, since it just lives in the L1 cache. x86 is short on registers to begin with, so you're not going to be keeping much there.


Anyway, one compiler optimization involves looking at variables to see if they could possibly be accessed by another thread - for instance, a locally scoped variable can't be accessed by another thread legally. This is what that no alias compiler flag is for, to tell it to assume that funny pointer shenanigans are not used.

User avatar
PhoenixRising
Posts: 129
Joined: Wed Mar 27, 2013 3:53 am UTC

Re: 1353: "Heartbleed"

Postby PhoenixRising » Thu Apr 10, 2014 9:46 pm UTC

KarenRei wrote:Let's take a practical view of how the code would change by switching to STL. The bug is in this code:

Code: Select all

unsigned char *buffer, *bp;
int r;
buffer = OPENSSL_malloc(1 + 2 + payload + padding);
bp = buffer;
*bp++ = TLS1_HB_RESPONSE;
s2n(payload, bp);
memcpy(bp, pl, payload);
RAND_pseudo_bytes(p, padding);
r = dtls1_write_bytes(s, TLS1_RT_HEARTBEAT, buffer, 3 + payload + padding);


First off, bad dog, no biscuit! for using cryptic couple-letter-long variable names instead of names that actually describe things... so let's clarify: bp is the output buffer, they're setting the first field to the response type in a really ridiculous manner, then they're copying the stated length (not the actual length) from the source packet (payload) to the output, then they copy the stated length of data, then they add random padding and send it.

The STL equivalent, with the same sort of bug and lousy coding approach, could be something like:

Code: Select all

std::string bp;
bp.reserve(1 + 2 + payload + padding);
int r;
bp += TLS1_HB_RESPONSE;
bp += pl;
RAND_pseudo_bytes(p, padding);
r = dtls1_write_bytes(s, TLS1_RT_HEARTBEAT, buffer);


Note that the entire need for the coder to spell out the length check disappears - they *can't* screw up there, the entire possibility for the error has been removed.

What about the standard C excuse, performance? There's for-loops with length checks in them (each of the += statements), versus two in the previous one (the TLS1_HB_RESPONSE one is superfluous, we can assume the check isn't needed there). But given that's just a couple bytes, the memcpy, random bytes, and write bytes will eat up vastly more time, the little added by the first will be insignificant. Furthermore, you can be assured with SSL that the code for your memory handling has optimized, from allocation onward, which you're not guaranteed when you make your own piecemeal implementations, and since stl structures are all .h/.tcc code, it's fully available to the compiler to optimize *into* your code instead of just being fixed library calls. And that whole example is a pointless exercise anyway as it wouldn't occur if they weren't programming like idiots - if they'd just fill out their data stucture, then do a write_bytes on it all at once.

Then there's all the other stuff, like the ease of threading STL provides, especially on C++11 - you can std::thread().detach() your write/failure callback, either as one function or a lambda that calls multiple functions. If you need to keep your buffer around until it's no longer needed and then free it afterwards, you can declare it as a std::shared_ptr and it'll take care of deallocation on its own at the end. With such easy one-line managed threading, performance-critical software can make much better use of multiple CPU cores.


I couldn't help but notice you've been steadily advocating the use of C++ and its STL containers in this thread. As someone who's seen too much nonsense from those STL containers, I can say they won't work the way you think they will for this.

The problem with STL, and C++ in general, is the lack of a well-defined ABI (application binary interface; basically an agreement on the size, layout, etc. of data in memory). This means every implementation of the STL is potentially incompatible with every other implementation. There are multiple threads on sites like StackOverflow that cover this issue: here, here, and here. For a security library which is as widely used as OpenSSL, this could be disastrous - how do you guarantee your server's STL-based version of OpenSSL is compatible with a given client's STL-based OpenSSL? Additionally, anyone who wanted to work with the OpenSSL source would need to use exactly the same version of the same compiler as the official OpenSSL lib.

Even more importantly, using C++ STL containers means only C++ apps can use OpenSSL. Other languages would need to write wrappers around the STL-based SSL lib, which defeats the entire purpose of using these containers. And once again, every wrapper around OpenSSL would need to use exactly the same version of the same compiler, or things will crash in odd and hard-to-debug ways.

The STL has many good uses, but I don't believe this is one of them.
riverssong wrote:I step away for ten minutes and y'all are creating a new civilization?!

This is getting out of hand. Stop that right this minute.


Return to “Individual XKCD Comic Threads”

Who is online

Users browsing this forum: cellocgw, mscha and 50 guests