fagricipni
Posts: 41
Joined: Thu Nov 04, 2010 7:32 pm UTC

Call it the logarithm base 2, please.

Because if I didn't know better, the phrasing "binary log" would have me wondering if there was some different kind of logarithm than the one that I have been familiar with since I was 13; and I can easily see other people experiencing the same type of confusion.

erik65536
Posts: 8
Joined: Wed May 04, 2011 7:06 pm UTC

gmalivuk wrote:It seems rather foolish to rely on the assumption that attackers will never use the characters you're using.

Once again, the entropy values that have been discussed through most of this thread refer to the difficulty of cracking a password *after* knowing the algorithm used to generate it. And knowing someone has a unicode character in their password means we *will* include those in our search. At which point it turns out this method isn't any more secure than just typing the code for the same character.

I agree that the original poster was overstating the security, but I think they do have a valid point about using unicode characters.

Few attackers actually use a truly brute force approach. If they did, then they would try every possible binary permutation. An attacker usually assumes some information about the password so that they can reduce the key space and thus the complexity of breaking a password, even if they are just assuming [a-zA-Z0-9]. It's all about information. The more information an attacker knows, the more an attacker can reduce the key space.

The larger the key space, the longer it will take on average for an attacker to discover your password. However, the attacker could get extremely lucky and pick your password correctly the first time. So choosing a larger key space does not guarantee security; it just decreases the probability that an attacker will crack your password in a reasonable time.

When calculating the entropy, the assumption is that an attacker knows only what you knew at the time you generated the password: the key space from which you picked your password (and hopefully you picked it completely at random from that key space). That is a good conservative estimate, because if the attacker knows anything more you are probably screwed anyway. So if we consider the worst-case scenario to be that the attacker knows only the algorithm, both passwords below have the same security.

But that is not the only information we know. We know that attackers are more likely to choose certain key spaces than others. Well, at least we can make an educated guess on the probability. I think it's fair to say the probability of an attacker choosing a key space that contains unicode characters is lower than the probability of choosing one that excludes them. So you have used information you know about crackers to decrease the probability of your password being cracked. But in the "worst case" scenario your password still has enough entropy to be secure.

Granted there are still issues with memorization and actually typing unicode characters especially on mobile devices. But if you know something about an attacker, why not use it against them?

gmalivuk
GNU Terry Pratchett
Posts: 26726
Joined: Wed Feb 28, 2007 6:02 pm UTC
Location: Here and There
Contact:

fagricipni wrote:Call it the logarithm base 2, please.

Because if I didn't know better, the phrasing "binary log" would have me wondering if there was some different kind of logarithm than the one that I have been familiar with since I was 13; and I can easily see other people experiencing the same type of confusion.
Well if the one you've been familiar with that long is base 10, the binary logarithm is something new.

And if someone isn't sure of what "binary logarithm" means, they could google it, and discover immediately that it means base-2 logarithm.
Unless stated otherwise, I do not care whether a statement, by itself, constitutes a persuasive political argument. I care whether it's true.
---
If this post has math that doesn't work for you, use TeX the World for Firefox or Chrome

(he/him/his)

Yakk
Poster with most posts but no title.
Posts: 11115
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

fagricipni wrote:Call it the logarithm base 2, please.

Because if I didn't know better, the phrasing "binary log" would have me wondering if there was some different kind of logarithm than the one that I have been familiar with since I was 13; and I can easily see other people experiencing the same type of confusion.

And to be even worse, there are 3 things that go by the term "log".

log base 2 is used in computer science as a default. Sometimes people bother with the log2 or call it lg instead of log, but inside computer science you are next to never going to use log base 10.

Using log to mean log base e isn't that off-putting. ln is more often used.

Log base 10 isn't used in mathematics (proper: people who apply it, like engineers or scientists, may use it) or computer science at all. If you wrote down log in those contexts, and you actually meant base 10, people would look at you weird.

In this case, the strength of passwords is information theoretic in an algorithmic sense -- firmly in the domain of computer science. Using "log" as the base 2 log without any additional adjectives would be standard use of language in this domain. Saying "binary log" is actually being redundant. The use of "binary" there is just to give someone who doesn't know what is going on some kind of hint.

And calling it the binary log is also a relatively common name for it -- the only other log worth mentioning being the natural log (base e).
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

bigjeff5
Posts: 127
Joined: Tue Nov 10, 2009 3:59 am UTC

Isaac5 wrote:How does one calculate this?

Remember that "bits of entropy" is just an easy way of describing the total number of combinations for a password scheme - it's easier to compare 50 bits of entropy to 60 bits of entropy than it is to compare a thousand trillion password combinations to a million trillion password combinations. As Eebster said, it's the log base 2 of the total number of password combinations.

To find the total number of password combinations, you take the total possibilities for each digit (or word, in the case of the passphrase) and raise that to the power of the total number of digits in the password (or words in the passphrase).

So, if x = possible characters/words (i.e. 26 for lower case alphabet, 50,000 for a small dictionary) and y = number of digits/words, then the formula is like so:

entropy = log2(x^y) OR entropy = log(x^y) / log(2)

So in Windows Calculator (which can't do log base 2), for a 10 character password using only lower case alphabetical characters the formula is entropy = log(26^10) / log(2), and for four dictionary words the formula is log(50,000^4) / log(2).

To check your result, 2^<entropy> should equal x^y.
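For anyone who would rather script it than use a calculator, the same formula can be sketched in Python. The function name and variable names here are just illustrative, and the sketch assumes every character or word is picked uniformly at random, as in the examples above.

```python
import math

def entropy_bits(choices_per_position: int, positions: int) -> float:
    """Bits of entropy for a password picked uniformly at random."""
    # log2(x^y) equals y * log2(x), which avoids building a huge number first.
    return positions * math.log2(choices_per_position)

# 10 lower-case letters, and four words from a 50,000-word dictionary.
lowercase_10 = entropy_bits(26, 10)
four_words = entropy_bits(50_000, 4)

print(f"10 lower-case letters: {lowercase_10:.1f} bits")  # 47.0 bits
print(f"4 dictionary words:    {four_words:.1f} bits")    # 62.4 bits
```

The check from the post still applies: 2 raised to the result should come back to x^y, up to floating-point rounding.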

jpk wrote:So one of my passwords expired towards the end of last week, and I thought I'd give this a try. Did the reset, used four common words, had to insert a digit to keep the algorithm happy, so I used digits as separators.

I remembered the digits, but the words themselves were completely gone by the time the system prompted me for a refresher.
I'm not playing that game again. It was embarrassing having to ask for a reset after just a few hours.

Doh! >.<

The mental image is key to this type of thing. If you build the silly picture linking all four words it should be difficult to forget. It must be both silly and an actual scene of some kind that you picture mentally in order for it to be memorable. You can either make one big mental scene, as in the comic, or you can make three different scenes, each one linked to the next. I personally think the linked set of scenes is a little easier than one big scene with all objects, and it is a good way to remember longer lists, too. If you didn't take the time to do that, though, it's just 4 random unrelated words, and they will be hard to remember.

I would stick at first to words you can readily visualize, as more abstract words and verbs are not as easy (still completely doable, though). Remember that the less sense an image makes, the easier it will be to remember. Also avoid places that you can't easily recognize - if you can construct a quick mental picture of London and immediately recognize it as London, then go for it, but if you have no idea what Duluth would look like, best not to use it.

There are more techniques that build on this, allowing you to remember completely unfamiliar words, but they all rely on this mental image trick so it is fundamental. If you can't make mental images for some reason, then you may be stuck with old-fashioned repetition, which sucks. I still can't imagine a 4 word phrase being harder than 8-10 random characters, but obviously people are different.

I recommend looking up the memory books by Harry Lorayne or Dean Vaughn if you are interested in going further - the techniques really are pretty amazing. Or with more elbow grease you can find all of the tricks online in various places. Look for memory linking or memory associations. In fact, the more advanced techniques make memorizing Tr0ub@dor&3 and G%@Csb89 a snap too, but they involve things like linking numbers and symbols to a silly picture, and then linking that silly picture to the main silly picture that represents the password, so they are a bit more involved and easier to screw up.

Eebster the Great
Posts: 3409
Joined: Mon Nov 10, 2008 12:58 am UTC
Location: Cleveland, Ohio

Yakk wrote:And to be even worse, there are 3 things that go by the term "log".

Worse still, there are two different definitions of entropy, based on the two different bases used in information theory and physics.

And the decadic (base-ten) logarithm is sometimes called the "common log," despite the fact that it is the least commonly used of the three.
Last edited by Eebster the Great on Thu Sep 08, 2011 11:07 pm UTC, edited 1 time in total.

TheGrammarBolshevik
Posts: 4878
Joined: Mon Jun 30, 2008 2:12 am UTC
Location: Going to and fro in the earth, and walking up and down in it.

Isaac5 wrote:How does one calculate this?

Have you tried logarithms?
Nothing rhymes with orange,
Not even sporange.

Yakk
Poster with most posts but no title.
Posts: 11115
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Eebster the Great wrote:
Yakk wrote:And to be even worse, there are 3 things that go by the term "log".
Worse still, there are two different definitions of entropy, based on the two different bases used in information theory and physics.

And the decimal (base-ten) logarithm is sometimes called the "common log," despite the fact that it is the least commonly used of the three.
"Common" as in "base" or "without value"?

Pfhorrest
Posts: 5374
Joined: Fri Oct 30, 2009 6:11 am UTC
Contact:

Eebster the Great wrote:
Yakk wrote:And to be even worse, there are 3 things that go by the term "log".

Worse still, there are two different definitions of entropy, based on the two different bases used in information theory and physics.

If I recall, the mathematical expressions of them are the same however, which is why Shannon named his entropy after the thermodynamics concept to begin with.

Besides which, information is a physical property anyway, and when dealing with things like black holes it becomes pretty obvious that thermodynamic entropy and informational entropy are the same thing when you get down to it. Heat is just noisy energy.
Forrest Cameranesi, Geek of All Trades
"I am Sam. Sam I am. I do not like trolls, flames, or spam."
The Codex Quaerendae (my philosophy) - The Chronicles of Quelouva (my fiction)

Eebster the Great
Posts: 3409
Joined: Mon Nov 10, 2008 12:58 am UTC
Location: Cleveland, Ohio

Yakk wrote:
Eebster the Great wrote:And the decimal (base-ten) logarithm is sometimes called the "common log," despite the fact that it is the least commonly used of the three.
"Common" as in "base" or "without value"?

I don't understand your question. It was called the "common logarithm" because at one point it was commonly used to expedite multiplication. Base-ten log tables were valuable when working with a base-ten number system.

Pfhorrest wrote:
Eebster the Great wrote:
Yakk wrote:And to be even worse, there are 3 things that go by the term "log".

Worse still, there are two different definitions of entropy, based on the two different bases used in information theory and physics.

If I recall, the mathematical expressions of them are the same however, which is why Shannon named his entropy after the thermodynamics concept to begin with.

They are directly proportional, yes: thermodynamic entropy equals the Shannon entropy in bits multiplied by the constant [imath]k_B \ln{2}[/imath], where [imath]k_B[/imath] is the Boltzmann constant. It's true that they really are the same thing, but they are calculated slightly differently in different contexts, which could cause additional confusion if you already aren't sure which log to use.

Yakk
Poster with most posts but no title.
Posts: 11115
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Eebster the Great wrote:
Yakk wrote:
Eebster the Great wrote:And the decimal (base-ten) logarithm is sometimes called the "common log," despite the fact that it is the least commonly used of the three.
"Common" as in "base" or "without value"?
I don't understand your question. It was called the "common logarithm" because at one point it was commonly used to expedite multiplication. Base-ten log tables were valuable when working with a base-ten number system.
Yes. But using "common" to mean "base" or "without value" in that context is funnier.

Eebster the Great
Posts: 3409
Joined: Mon Nov 10, 2008 12:58 am UTC
Location: Cleveland, Ohio

Yakk wrote:
Eebster the Great wrote:
Yakk wrote:
Eebster the Great wrote:And the decimal (base-ten) logarithm is sometimes called the "common log," despite the fact that it is the least commonly used of the three.
"Common" as in "base" or "without value"?
I don't understand your question. It was called the "common logarithm" because at one point it was commonly used to expedite multiplication. Base-ten log tables were valuable when working with a base-ten number system.
Yes. But using "common" to mean "base" or "without value" in that context is funnier.

Well but technically it's funny because of the other meanings of "base" and "value," not because of the other meanings of "common."

gmalivuk
GNU Terry Pratchett
Posts: 26726
Joined: Wed Feb 28, 2007 6:02 pm UTC
Location: Here and There
Contact:

Not if it's meant as a criticism of the uselessness of base-10 logarithms.

ganglion
Posts: 9
Joined: Wed Oct 20, 2010 5:23 am UTC

A few days ago I decided to have a go at writing a cgi app which would implement the idea from the comic, so people could pick their own. It's also quite fun looking at a grid of random words and trying to pick a set which makes a good picture. The link is here:

http://highfellow.org/misc/chooser/chooser-intro.html

It's somewhat configurable - you can change the number of choices per word and the number of words, as well as the min and max word lengths.

I decided to set the default number of words to 3 rather than 4, because it's quicker to type and still takes well over a year to brute-force at 1000 hits per sec.

The code is here:

http://highfellow.org/misc/chooser.tgz

You can use it however you wish, but a credit would be nice.

If anyone has already done this, sorry - I gave up after the second or third page of the thread history.

IsilZha
Posts: 8
Joined: Mon Aug 15, 2011 2:08 pm UTC

Well, a bit delayed, mostly because I forgot I posted here...

gmalivuk wrote:
Sure, and then be unable to login quickly and securely on another computer.

Never had a problem with it. No, I do not copy paste it from somewhere, I do know my password.

Jorpho wrote:Well, ALT+[four numpad characters] can pop up something outside of the ASCII keyspace, right? That's an interesting idea.

Yes, that's exactly how it works.

gmalivuk wrote:How is that more effective than just including "alt0151" or whatever at that point in your password? With the added benefit that it'll still work for password fields that don't accept full unicode input.

This was pretty effectively covered. It provides a much larger key space per character, and, as mentioned before, I know of no brute force cracker that actually even searches the full unicode key space. (Not saying it's not possible, but I've never seen one.)

gmalivuk wrote:It seems rather foolish to rely on the assumption that attackers will never use the characters you're using.

Rely on it? No. Added benefit that it's highly likely that no one attempting to break your password will even include the appropriate key space in their search? Hell yes.

Once again, the entropy values that have been discussed through most of this thread refer to the difficulty of cracking a password *after* knowing the algorithm used to generate it. And knowing someone has a unicode character in their password means we *will* include those in our search. At which point it turns out this method isn't any more secure than just typing the code for the same character.

Right, but I don't go around telling possible attackers that there are unicode characters in my password. Regardless, it still gives them a larger key space to work through, making it that much harder. So the "downside" of giving a would-be attacker that information still leaves them with this: my password will still take exponentially more time to crack than virtually any other password of similar length. Which is really the whole idea of combating brute force: make it take impractically long to crack.

In all likelihood, a would-be attacker isn't going to be anyone I know or ever even talk to, and they'll literally spend eternity and never crack it.

Now, on the other hand, I could say that I use Unicode characters, and not actually do so. Now the attacker is still spending even more time in any password cracking attempts regardless of whether I actually do or not. If they don't, they risk the first situation: they will never break my password.

This sounds like a win-win for me.

I agree that the original poster was overstating the security, but I think they do have a valid point about using unicode characters.

I was simply pointing out that in most (if not all) cases, a brute force attack won't even include the appropriate key space - in those cases, it literally will never break the password.

In the cases that they actually do - entropy is significantly increased anyway, making the time it takes to crack it even more impractical - so much so it's effectively unbreakable.

Few attackers actually use a truly brute force approach. If they did, then they would try every possible binary permutation. An attacker usually assumes some information about the password so that they can reduce the key space and thus the complexity of breaking a password, even if they are just assuming [a-zA-Z0-9]. It's all about information. The more information an attacker knows, the more an attacker can reduce the key space.

The larger the key space, the longer it will take on average for an attacker to discover your password. However, the attacker could get extremely lucky and pick your password correctly the first time. So choosing a larger key space does not guarantee security; it just decreases the probability that an attacker will crack your password in a reasonable time.

When calculating the entropy, the assumption is that an attacker knows only what you knew at the time you generated the password: the key space from which you picked your password (and hopefully you picked it completely at random from that key space). That is a good conservative estimate, because if the attacker knows anything more you are probably screwed anyway. So if we consider the worst-case scenario to be that the attacker knows only the algorithm, both passwords below have the same security.

But that is not the only information we know. We know that attackers are more likely to choose certain key spaces than others. Well, at least we can make an educated guess on the probability. I think it's fair to say the probability of an attacker choosing a key space that contains unicode characters is lower than the probability of choosing one that excludes them. So you have used information you know about crackers to decrease the probability of your password being cracked. But in the "worst case" scenario your password still has enough entropy to be secure.

Granted there are still issues with memorization and actually typing unicode characters especially on mobile devices. But if you know something about an attacker, why not use it against them?

This hit the nail on the head. And as I mentioned before, the fact that I even suggest that I use Unicode characters in my password(s) has a psychological effect: Do they take my word for it, and are now looking at a significantly increased time frame to brute force it, or do they call my bluff, and risk running a futile effort for a password that their cracker will literally never solve?

As for ease of use - I've never had a problem actually using it in what I use those types of passwords for. I'm sure I'm not the only one that considers what I'm protecting and how strong of a password I'm going to use. I sure as hell don't need it for some silly internet forum account. (Or do I?)

Eebster the Great
Posts: 3409
Joined: Mon Nov 10, 2008 12:58 am UTC
Location: Cleveland, Ohio

IsilZha wrote:
gmalivuk wrote:How is that more effective than just including "alt0151" or whatever at that point in your password? With the added benefit that it'll still work for password fields that don't accept full unicode input.

This was pretty effectively covered. It provides a much larger key space per character, and, as mentioned before, I know of no brute force cracker that actually even searches the full unicode key space. (Not saying it's not possible, but I've never seen one.)

It provides a larger key space "per character," but no larger "per input time," as it is no different from simply including "0151." Yes, 0151 is four characters as opposed to just one, but if anything it takes less time to type.

And there is no reason a cracker would need to search the full unicode keyspace (which is enormous) when she could just search the ones that can be input with Windows alt-codes.

Once again, the entropy values that have been discussed through most of this thread refer to the difficulty of cracking a password *after* knowing the algorithm used to generate it. And knowing someone has a unicode character in their password means we *will* include those in our search. At which point it turns out this method isn't any more secure than just typing the code for the same character.

Right, but I don't go around telling possible attackers that there are unicode characters in my password. Regardless, it still gives them a larger key space to work through, making it that much harder. So the "downside" of giving a would-be attacker that information still leaves them with this: my password will still take exponentially more time to crack than virtually any other password of similar length. Which is really the whole idea of combating brute force: make it take impractically long to crack.

You just told everybody on this site that you do this. Besides, if you do it, there are probably others out there that do it too.

And it does not give them a "larger key space," assuming your password includes just a single non-ASCII character. In that case it is--as I said before--no higher entropy than a four-digit string (lower entropy actually, as not every four-digit alt-code returns a usable unicode character).
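A quick way to see the point about the four-digit string is to compute both entropies. This is only a sketch: it assumes every code from 0000 to 9999 is usable, which is an upper bound given the remark above that not every alt-code returns a usable character, and the names are made up for illustration.

```python
import math

# Upper bound: assume all 10,000 four-digit codes (0000-9999) are usable.
ALT_CODE_COUNT = 10 ** 4

# Entropy of one character chosen uniformly via a four-digit alt-code.
alt_char_bits = math.log2(ALT_CODE_COUNT)

# Entropy of typing four random digits as ordinary characters.
four_digit_bits = 4 * math.log2(10)

print(f"alt-code character: at most {alt_char_bits:.2f} bits")
print(f"four typed digits:  {four_digit_bits:.2f} bits")
```

Under that assumption the two come out equal, so the alt-code character adds nothing beyond what the four typed digits would.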

Again, it really comes down to the fact that a good password-generating scheme is secure even if people know you are using it.

Yakk
Poster with most posts but no title.
Posts: 11115
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

It adds the "alt down" and "alt up" balanced non-nesting "keys" to the password. The keys between alt-down and alt-up must be digits.

I'd estimate the increase in entropy from adding this to be somewhere on the order of 2%?

MrRubix
Posts: 49
Joined: Sun Jul 27, 2008 2:59 pm UTC

It really doesn't matter -- having longer passwords with larger keyspaces will be harder to crack even if the cracker knows your password is complex. It's always a good idea, though, to understand how most crackers work so you can avoid making common mistakes.

An ideal password would be a complete mess -- a long, random string of letters, cases, and symbols from a large keyspace. This way, no pattern-reliant cracker would be able to take advantage of anything, and would be forced to use hugely impractical brute-force methods to crack it. But such a password is hard to remember. So we aim for some ideal tradeoff. Of course, in doing so, you risk relying on patterns that a cracker can rely on and factor into his search approach based on statistics and common patterns (e.g. "You may be using a larger keyspace, but most people just tack stuff on the end, like mypassword1 or abc123Ω").

Some sites/interfaces don't allow you to use certain characters anyway, or force you to change your password every so often, resulting in a huge pain in the ass ("previous password + 1") and not much more security at all if you're resorting to easily-memorable passwords. Long, annoying passwords also take longer to type out, which can be a pain if you're using them often.

Even if you try to make your password leet by using something like $up3rC00lp455w0rd, one could make a password cracker that tries different permutations of leet-substitutions and get your password that way.

IMO the best, practical approach is to generate a random password until you get something you can type comfortably, and then fix a mnemonic story to it after the fact.

I'll generate a random 20-length pass: w&v2hGFZG9bj{zpo~5X)

Story for memory: magic w(and) + velocity 2h (physics) + GO FAT ZEBRAS GO + nine blowjobs (from said zebras) + {zippo lighter setting tilde on fire 5 TIMES)
It's a weird story, but using it, I can recall the original pass from memory: w&v2hGFZG9bj{zpo~5X)
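Generating a candidate like that could look something like the sketch below. It assumes a 94-character set of printable ASCII (letters, digits, punctuation, no space) and uses Python's secrets module for the randomness; the post doesn't say what generator was actually used, so the function name and alphabet are illustrative.

```python
import math
import secrets
import string

# Assumed alphabet: 52 letters + 10 digits + 32 punctuation marks = 94 characters.
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def random_password(length: int = 20) -> str:
    """Pick each character independently and uniformly from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

candidate = random_password()
bits = len(candidate) * math.log2(len(ALPHABET))
print(candidate, f"(about {bits:.0f} bits)")
```

Re-rolling until something feels comfortable to type does throw away part of the space, so the figure printed is an upper bound for a password chosen that way.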

Yakk
Poster with most posts but no title.
Posts: 11115
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Except writing the actual story would be a better password than the gibberish short form.

Magic wand with velocity 2 times height makes fat zebras go...

You'll note that I added connective words. These don't make the password worse (it makes it worse for the length, but not worse than not having the words), while making it easier to remember.

This method (of long pass phrases based off random gibberish) is both easier to remember for humans, and stronger than the original short(er) gibberish. Any attack that goes after the long pass phrases can "compactify" them for only a small amount of additional entropy and generate short gibberish phrases. Your memorized algorithm to turn your phrases into your short gibberish is something humans are worse at remembering than they are at remembering words, so it is an inefficient use of your memory -- meanwhile, programs are quite good at going through such compactification algorithms.

MrRubix
Posts: 49
Joined: Sun Jul 27, 2008 2:59 pm UTC

Yakk wrote:Except writing the actual story would be a better password than the gibberish short form.

Magic wand with velocity 2 times height makes fat zebras go...

You'll note that I added connective words. These don't make the password worse (it makes it worse for the length, but not worse than not having the words), while making it easier to remember.

This method (of long pass phrases based off random gibberish) is both easier to remember for humans, and stronger than the original short(er) gibberish. Any attack that goes after the long pass phrases can "compactify" them for only a small amount of additional entropy and generate short gibberish phrases. Your memorized algorithm to turn your phrases into your short gibberish is something humans are worse at remembering than they are at remembering words, so it is an inefficient use of your memory -- meanwhile, programs are quite good at going through such compactification algorithms.

While this is a good, correct point, it's a bit impractical to have such a long password like that, especially if you have to use it often. Having the mnemonic describe something smaller yet sufficiently strong is an arguably favorable tradeoff if we're evaluating the quality of a password by its strength against crackers, ease of memorization, and ease of inputting.

Having a long sentence password is still prone to dictionary attacks because "magic wand" may as well be MW. At some point we need a decent way to memorize large chunks of entropy. While you can do this with full sentences, they're a bitch to type. If your memory can handle it, I see nothing wrong with memorizing a complex 20-character pass with a mnemonic story.

superluser
Posts: 16
Joined: Wed Aug 17, 2011 5:36 am UTC

Yakk wrote:It adds the "alt down" and "alt up" balanced non-nesting "keys" to the password. The keys between alt-down and alt-up must be digits.

I'd estimate the increase in entropy from adding this to be somewhere on the order of 2%?

Just because your password has higher entropy doesn't make it more difficult to compromise. For example, I think you could probably use only two words if they're both very uncommon, and achieve much higher entropy than the four common words like battery, correct, horse, and staple. What if, for example, you used the words ossifrage and squeamish?

Your password would have very high entropy. Also, it would be guessed in no time at all. There's a difference between making something secure and making something difficult to compromise, like our dyspeptic bird of prey. Adding unicode characters could actually price a password cracker out of even simple passwords, as the calculations may be (actually, are probably) based on the 6.5 bits of printable ASCII. And there can be passwords that are easy to guess but ^[Abcd would be literally unguessable for anyone trying to telnet into your server, effectively disabling that route of entry for attackers.

TheGrammarBolshevik
Posts: 4878
Joined: Mon Jun 30, 2008 2:12 am UTC
Location: Going to and fro in the earth, and walking up and down in it.

Why would you expect to get more entropy from restricting the space to uncommon words?
Nothing rhymes with orange,
Not even sporange.

IsilZha
Posts: 8
Joined: Mon Aug 15, 2011 2:08 pm UTC

Eebster the Great wrote:
IsilZha wrote:
gmalivuk wrote:How is that more effective than just including "alt0151" or whatever at that point in your password? With the added benefit that it'll still work for password fields that don't accept full unicode input.

This was pretty effectively covered. It provides a much larger key space per character, and, as mentioned before, I know of no brute force cracker that actually even searches the full unicode key space. (Not saying it's not possible, but I've never seen one.)

It provides a larger key space "per character," but no larger "per input time," as it is no different from simply including "0151." Yes, 0151 is four characters as opposed to just one, but if anything it takes less time to type.

Are you sure about that? I don't think you actually bothered to do the math. :p

8 characters with a character set of 94 = 94^8 = 6,095,689,385,410,820 possible combinations.
12 characters with a character set of 94 = 94^12 = 475,920,314,814,253,000,000,000 possible combinations.

8 characters with unicode characters, we won't even include the entire space, we'll assume a character set size of 1000 in total:
1000^8 = 1,000,000,000,000,000,000,000,000 possible combinations. More than double the key space of simply adding "0151" rather than the alt code.
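For anyone who wants to check these figures, here is a quick sketch in Python. The variable names are mine, and the 1000-symbol character set is the assumption made in the post, not a real count of alt-code characters.

```python
# Key-space sizes for the three cases compared above.
ascii_8 = 94 ** 8        # 8 printable-ASCII characters
ascii_12 = 94 ** 12      # 12 printable-ASCII characters (8 plus the digits "0151")
unicode_8 = 1000 ** 8    # 8 characters from an assumed 1000-symbol set

print(f"{ascii_8:,}")
print(f"{ascii_12:,}")
print(f"{unicode_8:,}")

# Ratio between the all-exotic 8-character case and the 12-character ASCII case.
print(unicode_8 / ascii_12)
```

Python integers are arbitrary precision, so the printed values are exact rather than rounded to 15 significant digits.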

In other words: you are completely wrong.

And there is no reason a cracker would need to search the full unicode keyspace (which is enormous) when she could just search the ones that can be input with Windows alt-codes.

...and? It's still a larger key space (see above.) A key space that most don't even attempt.

You just told everybody on this site that you do this.

And?

Besides, if you do it, there are probably others out there that do it too.

Assumption. Regardless, this completely side-steps the actual issue: how many brute force attacks even include it in their character set search?

And it does not give them a "larger key space," assuming your password includes just a single non-ASCII character. In that case it is--as I said before--no higher entropy than a four-digit string (lower entropy actually, as not every four-digit alt-code returns a usable unicode character).

I already blatantly proved you wrong. You would've known this if you had bothered to do the math instead of just stating your off-the-cuff guesswork as fact.

Again, it really comes down to the fact that a good password-generating scheme is secure even if people know you are using it.

This is exactly what I was saying. A) They already have to brute force my PW because dictionary attacks won't work. B) Any would-be attacker very likely will never have met or talked to me, likely meaning that they will not be searching the Unicode space - making it unbreakable to them. C) Even knowing all this, they're still stuck brute forcing for longer than their lifetime to break it, so the whole exercise is rather futile.

PS: My most secure password has:
140,935,105,818,184,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 possible combinations.

Enjoy brute forcing it until the sun runs out of fuel?

superluser
Posts: 16
Joined: Wed Aug 17, 2011 5:36 am UTC

TheGrammarBolshevik wrote:Why would you expect to get more entropy from restricting the space to uncommon words?

There are 600,000 words in the OED. If we restrict the dictionary to only the 598,000 words that aren't included in the 2**11 words Randall assumes are common, each word now gets about 19 bits of entropy (roughly 2**19 possibilities, modulo good RNG).

I'll admit, I was being flip when I said that you could use only two words. Two uncommon words would get you only 39 bits of entropy (19 bits per word, one order bit), while four common words would get you 46 bits (eleven bits per word, two order bits). You'd have to add another 6.5 bits of entropy (since you get half a bit from adding another thing to order) to make up the difference. So add a random printable ASCII character, and you'll have it.
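The per-word numbers can be sketched in a few lines of Python. The word counts are the ones quoted in the post. As a simplification of my own, this counts only the per-word bits and leaves the "order bits" out.

```python
import math

OED_WORDS = 600_000        # figure quoted in the post
COMMON_WORDS = 2 ** 11     # size of the common-word list the comic assumes

uncommon_words = OED_WORDS - COMMON_WORDS

bits_per_uncommon = math.log2(uncommon_words)   # bits from one random uncommon word
bits_per_common = math.log2(COMMON_WORDS)       # bits from one random common word

two_uncommon = 2 * bits_per_uncommon
four_common = 4 * bits_per_common

print(round(bits_per_uncommon, 2), round(two_uncommon, 1), four_common)
```

These figures only hold if each word is drawn uniformly at random from its list.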

But the point that I was trying to make is that just because something is high entropy doesn't make it hard to defeat. And just because something's hard to defeat, it doesn't make it high entropy. Usually, low-entropy passwords are easy to defeat, and it's not really a very safe bet to assume that a low-entropy password that's hard to defeat will remain so, but they are not one and the same.

Eebster the Great
Posts: 3409
Joined: Mon Nov 10, 2008 12:58 am UTC
Location: Cleveland, Ohio

Yakk wrote:It adds the "alt down" and "alt up" balanced non-nesting "keys" to the password. The keys between alt-down and alt-up must be digits.

No it doesn't, because again, the assumption is that we know there is one alt-code in the password, and this code necessarily consists of an alt-down, four digits (if fewer than four, this is the same as using leading zeroes), then alt-up. So there are 10,000 possible characters that could be included (actually less than that, but I'm rounding), which is the same as the number of four-digit combinations.

IsilZha wrote:
Eebster the Great wrote:
IsilZha wrote:
gmalivuk wrote:How is that more effective than just including "alt0151" or whatever at that point in your password? With the added benefit that it'll still work for password fields that don't accept full unicode input.

This was pretty effectively covered. It provides a much larger key space per character, and, as mentioned before, I know of no brute force cracker that actually even searches the full unicode key space. (Not saying it's not possible, but I've never seen one.)

It provides a larger key space "per character," but no larger "per input time," as it is no different from simply including "0151." Yes, 0151 is four characters as opposed to just one, but if anything it takes less time to type.

Are you sure about that? I don't think you actually bothered to do the math. :p

8 characters with a character set of 94 = 94^8 = 6,095,689,385,410,820 possible combinations.
12 characters with a character set of 94 = 94^12 = 475,920,314,814,253,000,000,000 possible combinations.

8 characters with unicode characters, we won't even include the entire space, we'll assume a character set size of 1000 in total:
1000^8 = 1,000,000,000,000,000,000,000,000 possible combinations. More than double the key space of simply adding "0151" rather than the alt code.

In other words: you are completely wrong.

This calculation is utter bullshit. Did you even read the discussion? The comparison was between a password including a unicode character input via Windows alt-code and the same password with that character replaced by four decimal digits. Nobody suggested every single character in the password should be chosen from the set of all unicode characters; that would take forever to type (and again, would be no more secure than just a string of four times as many digits, and also no easier to remember, as they are in fact the same thing).

Assumption. Regardless, this completely side-steps the actual issue: how many brute force attacks even include it in their character set search?

There only needs to be one such attacker who does this. It doesn't even need to be an attacker targeting you; it could be one targeting the entire database. You might say it is unlikely you would get attacked, but then arguing about password security is pointless in the first place. And as has been said a billion times, relying on security by obscurity is generally a bad idea.

And it does not give them a "larger key space," assuming your password includes just a single non-ASCII character. In that case it is--as I said before--no higher entropy than a four-digit string (lower entropy actually, as not every four-digit alt-code returns a usable unicode character).

I already blatantly proved you wrong. You would've known this if you had bothered to do the math instead of just stating your off-the-cuff guesswork as fact.

And if you thought before you typed, you wouldn't have to repeatedly post this annoying, smug "lol you're so stupid, you didn't calculate" shit.

PS: My most secure password has:
140,935,105,818,184,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 possible combinations.

Enjoy brute forcing it until the sun runs out of fuel?

Supposing your password is eight completely random printable ASCII characters (and as I understand it they are not in any sense completely random, but I'll give you the benefit of the doubt here) and a randomly chosen alt-code, of which there are a maximum of 10,000, the actual number of combinations is 94^8 * 10000 = 60956893854108160000, or just under 66 bits. Now, that really should be plenty of entropy for a password, but it's nothing like what you think it is.
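A minimal sketch of that calculation, assuming (as the post does) eight uniformly random printable ASCII characters plus one alt-code drawn from at most 10,000 possibilities:

```python
import math

ASCII_PRINTABLE = 94     # printable ASCII characters, excluding space
ALT_CODES = 10_000       # upper bound on four-digit alt-codes

combinations = ASCII_PRINTABLE ** 8 * ALT_CODES
bits = math.log2(combinations)

print(combinations)
print(round(bits, 2))
```

Because not every four-digit code produces a usable character, the real figure would be somewhat lower than what this prints.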

gmalivuk
GNU Terry Pratchett
Posts: 26726
Joined: Wed Feb 28, 2007 6:02 pm UTC
Location: Here and There

superluser wrote:Two uncommon words would get you only 39 bits of entropy (19 bits per word, one order bit), while four common words would get you 46 bits (eleven bits per word, two order bits).
Why do you think you need "order bits"? If order didn't matter, we'd *subtract* that many bits from the total entropy, but adding up the per-word entropy already assumes order matters.

But the point that I was trying to make is that just because something is high entropy doesn't make it hard to defeat.
It makes it hard to defeat through any sort of brute force attack by an attacker who knows how you generated your password. So please explain what you mean by hard to defeat, since it's obviously something different.

IsilZha wrote:More than double the key space of simply adding "0151" rather than the alt code.
Wowee! A whole one single bit of additional entropy! What an amazing password technique!

What this basically amounts to is adding a little more entropy because we don't know whether the alt key is pressed down or not for different parts of the password. That's the only additional thing unicode gets you, apart from security through obscurity which is highly unrecommended.

You just told everybody on this site that you do this.
And?
And so now we all know how you generate your passwords. There goes the obscurity part of your "security".

Besides, if you do it, there are probably others out there that do it too.
Assumption. Regardless, this completely side-steps the actual issue: how many brute force attacks even include it in their character set search?
So you call someone out for assuming that some others use the same technique, but then go on to restate your own assumption that attackers don't?

And it does not give them a "larger key space," assuming your password includes just a single non-ASCII character. In that case it is--as I said before--no higher entropy than a four-digit string (lower entropy actually, as not every four-digit alt-code returns a usable unicode character).
I already blatantly proved you wrong. You would've known this if you had bothered to do the math instead of just stating your off-the-cuff guesswork as fact.
A single alt-code is a single alt-code, and the only difference between using unicode or not is whether the alt key is pushed down at the time. Which adds one bit of entropy.

They already have to brute force my PW because dictionary attacks won't work.
Dictionary attacks are a kind of brute-force attack. And claiming the only two options are dictionary attack and "brute force" just proves how little you know about the whole topic.

I bet you a million dollars it really really doesn't.
Unless stated otherwise, I do not care whether a statement, by itself, constitutes a persuasive political argument. I care whether it's true.
---
If this post has math that doesn't work for you, use TeX the World for Firefox or Chrome

(he/him/his)

Yakk
Poster with most posts but no title.
Posts: 11115
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

If you have a reasonably high-entropy password (ie, a password selected from a ridiculously huge set), the odds that you will get an easy-to-guess password are lower than the odds that a cosmic ray will hit the computer that asks for the password and cause a glitch that lets in someone guessing the wrong password.

And that isn't hyperbole.

The ability to pick a single element from some huge set and say "you said 1000 random alphabetical characters was hard to crack -- but aaaaaaaaa...aaaaaa is easy to guess" is a failure to understand probability. (4700 bits of entropy in 1000 random alphabetical characters. Adding case gives you another +1000 bits. 2^5700 is large enough that we are in the cosmic ray accident territory.)
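The bit counts in the parenthetical can be reproduced with a short sketch, assuming every character is chosen uniformly and independently:

```python
import math

n_chars = 1000

single_case_bits = n_chars * math.log2(26)   # 26 letters, one case
mixed_case_bits = n_chars * math.log2(52)    # 52 letters, both cases

print(round(single_case_bits), round(mixed_case_bits))
```

Doubling the alphabet adds exactly one bit per character, which is where the extra 1000 bits come from.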
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

fagricipni
Posts: 41
Joined: Thu Nov 04, 2010 7:32 pm UTC

IsilZha wrote:
Jorpho wrote:Well, ALT+[four numpad characters] can pop up something outside of the ASCII keyspace, right? That's an interesting idea.

Yes, that's exactly how it works.

gmalivuk wrote:How is that more effective than just including "alt0151" or whatever at that point in your password? With the added benefit that it'll still work for password fields that don't accept full unicode input.

This was pretty effectively covered. It provides a much larger key space per character, and, as mentioned before, I know of no brute force cracker that actually even searches the full unicode key space. (Not saying it's not possible, but I've never seen one.)

gmalivuk wrote:It seems rather foolish to rely on the assumption that attackers will never use the characters you're using.

Rely on it? No. Added benefit that it's highly likely that no one attempting to break your password will even include the appropriate key space in their search? Hell yes.

Once again, the entropy values that have been discussed through most of this thread refer to the difficulty of cracking a password *after* knowing the algorithm used to generate it. And knowing someone has a unicode character in their password means we *will* include those in our search. At which point it turns out this method isn't any more secure than just typing the code for the same character.

Right, but I don't go around telling possible attackers that there are unicode characters in my password. Regardless, it still gives them a larger key space to work through, making it that much harder. So the "downside" of giving a would-be attacker that information still leaves them with this: my password will still take exponentially more time to crack than virtually any other password of similar length. Which is really the whole idea of combating brute force: make it take impractically long to crack.

In all likelihood, a would-be attacker isn't going to be anyone I know or ever even talk to, and they'll literally spend eternity and never crack it.

Now, on the other hand, I could say that I use Unicode characters, and not actually do so. Now the attacker is still spending even more time in any password cracking attempts regardless of whether I actually do or not. If they don't, they risk the first situation: they will never break my password.

This sounds like a win-win for me.

I agree that the original poster was overstating the security, but I think they do have a valid point about using unicode characters.

I was simply pointing out that in most (if not all) cases, a brute force attack won't even include the appropriate key space - in those cases, it literally will never break the password.

In the cases that they actually do - entropy is significantly increased anyway, making the time it takes to crack it even more impractical - so much so it's effectively unbreakable.

Few attackers actually use a truly brute force approach. If they did, then they would try every possible binary permutation. An attacker usually assumes some information about the password so that they can reduce the key space and thus the complexity of breaking a password. Even if they are just assuming [a-zA-Z0-9]. It's all about information. The more information an attacker knows, the more an attacker can reduce the key space.

But that is not the only information we know. We know that attackers are more likely to choose certain key spaces than others. Well, at least we can make an educated guess on the probability. I think it's fair to say the probability of an attacker choosing a key space that contains unicode characters is lower than that of them excluding them. So you have used information you know about crackers to decrease the probability of your password being cracked. But in the "worst case" scenario your password still has enough entropy to be secure.

Granted there are still issues with memorization and actually typing unicode characters especially on mobile devices. But if you know something about an attacker, why not use it against them?

This hit the nail on the head. And as I mentioned before, the fact that I even suggest that I use Unicode characters in my password(s) has a psychological effect: Do they take my word for it, and are now looking at a significantly increased time frame to brute force it, or do they call my bluff, and risk running a futile effort for a password that their cracker will literally never solve?

As for ease of use - I've never had a problem actually using it in what I use those types of passwords for. I'm sure I'm not the only one that considers what I'm protecting and how strong of a password I'm going to use. I sure as hell don't need it for some silly internet forum account. (Or do I? )

* lg is log base 2, and I have rounded things down; eg, lg 24=4.58... is rounded down to 4.5, not up to 4.6, because I want to always underestimate the entropy.

superluser
Posts: 16
Joined: Wed Aug 17, 2011 5:36 am UTC

gmalivuk wrote:
But the point that I was trying to make is that just because something is high entropy doesn't make it hard to defeat.
It makes it hard to defeat through any sort of brute force attack by an attacker who knows how you generated your password. So please explain what you mean by hard to defeat, since it's obviously something different.

High entropy will make a password difficult to brute force. But if there are deficiencies in the security system, certain random numbers may be less secure, despite being high entropy (for an unlikely example, a PKI pair of numbers that are relatively prime but each have easily-guessable factors). All I'm saying is that just because your RNG barfs it out doesn't automatically make it secure. Heck, that's why BN_generate_prime has a safe flag.

gmalivuk
GNU Terry Pratchett
Posts: 26726
Joined: Wed Feb 28, 2007 6:02 pm UTC
Location: Here and There

Oh yeah, it definitely has a pretty good chance of making your password slightly more secure when you use something attackers are in practice less likely to be using. It's just that you shouldn't rely on that overmuch if it makes a difference to whether you consider something "strong enough" for your purposes.

As you said: always *underestimate* entropy. Assume the attacker knows everything about your password generating scheme, apart from the actual results of your random number generator. If you *still* have enough entropy in that case to feel safe, go ahead and use the password. But if you only feel safe enough when you also assume that attackers are unlikely to guess your method, then your password is relying on security through obscurity and should be rejected.

superluser wrote:All I'm saying is that just because your RNG barfs it out doesn't automatically make it secure.
Well yeah. In the really really unlikely event that your random 12-character ASCII string ends up being an English word, you should definitely reject it, because then it's part of a space attackers are likely to check before they bother getting to the brute force step.

gmalivuk wrote:
I bet you a million dollars it really really doesn't.
To expand on this a bit: a password from a space that size means that you've got about 24 completely random alt+#### unicode characters. Which means you're basically remembering a full 96-digit random number (okay, apparently slightly less than that because yours is closer to 10^95 than 10^96, but whatever).
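A rough check of that estimate, as a sketch. The 1.4e95 value is only the approximate magnitude of the quoted figure, and the 10,000 possibilities per alt-code is the upper bound used earlier in the thread.

```python
import math

claimed_space = 1.4e95        # approximate magnitude of the claimed key space
codes_per_alt = 10_000        # upper bound on four-digit alt-codes

# Number of independent random alt-codes needed to span a space that large.
alt_codes_needed = math.log(claimed_space, codes_per_alt)

# Equivalent length in random decimal digits.
decimal_digits = math.log10(claimed_space)

print(round(alt_codes_needed, 2), round(decimal_digits, 2))
```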

This is possible, I suppose, since it's possible to remember 10 10-digit phone numbers (though difficult when none of them have the same area code and you can't look them up in the phone book when you're unsure if you remember correctly). I just don't believe your claim that your password has that much entropy.

Pfhorrest
Posts: 5374
Joined: Fri Oct 30, 2009 6:11 am UTC

gmalivuk wrote:
superluser wrote:All I'm saying is that just because your RNG barfs it out doesn't automatically make it secure.
Well yeah. In the really really unlikely event that your random 12-character ASCII string ends up being an English word, you should definitely reject it, because then it's part of a space attackers are likely to check before they bother getting to the brute force step.

This is basically what I was arguing a few pages ago in this thread, and someone argued back that this concern wasn't very important on balance. My argument was that attackers are going to check commonly-used-pattern spaces first and uncommonly-used-pattern spaces later, and only brute-force enumerate every remaining possible string in the character space once they have exhausted all of those. So what matters is not only the size of the space of possible passwords from your pattern (the entropy), but also how commonly used your pattern is. An eight-character initialism of a sentence you found personally significant in an obscure book you read once has low entropy, given that the attacker is trying just that pattern (initialisms of memorable sentences in books you've read). But given how far down the attacker's list of patterns to try that will be (after English words, capitalized and uncapitalized, with one number after them, with two numbers, with three, etc... with the numbers before, etc... etc... initialisms of famous quotes from popular movies, books, etc...), it's closer to a completely random string of eight letters than it is to an eight-letter English word.

The point of my argument was that, for a given character set and length, you really do have a trade-off between the ease-of-use of a pattern (which will make it more commonly used and thus a higher-priority target, the epitome of such being single common words) and the security of the pattern (which will require it to use harder-to-use and thus less-common patterns, the epitome of such being something from the space of passwords that don't match any other pattern).
Forrest Cameranesi, Geek of All Trades
"I am Sam. Sam I am. I do not like trolls, flames, or spam."
The Codex Quaerendae (my philosophy) - The Chronicles of Quelouva (my fiction)

superluser
Posts: 16
Joined: Wed Aug 17, 2011 5:36 am UTC

gmalivuk wrote:
superluser wrote:All I'm saying is that just because your RNG barfs it out doesn't automatically make it secure.
Well yeah. In the really really unlikely event that your random 12-character ASCII string ends up being an English word, you should definitely reject it, because then it's part of a space attackers are likely to check before they bother getting to the brute force step.

It's entertaining when people quote other people, say "This is basically what I'm saying," and then violently disagree with it. So I'll do that.

This is basically what I'm saying. (1) Run some sanity checks on the output of your RNG, and (2) if you know a blind spot in common or naïve attacks, and you can have sufficient entropy when hiding out in those blind spots, it can't hurt to hang out there.
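Point (1) could look something like the sketch below. The `looks_weak` check is a hypothetical stand-in of mine (it only rejects all-letter strings); a real check might also consult word lists or leaked-password lists.

```python
import secrets
import string

# 52 letters + 10 digits + 32 punctuation marks = 94 printable ASCII characters.
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def looks_weak(candidate: str) -> bool:
    """Hypothetical sanity check: flag strings made up only of letters,
    since those fall in a space attackers are likely to try early."""
    return candidate.isalpha()

def generate_password(length: int = 12) -> str:
    """Draw uniformly random strings until one passes the sanity check."""
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if not looks_weak(candidate):
            return candidate

print(generate_password())
```

Rejecting a small slice of the output space costs very little entropy, as long as the rejected slice really is small.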

It does emphatically require you to strictly observe the same rules about high-entropy values for your passwords. So "paßword" is not going to be a better password, but "correct battery hor∫e staple" might make it difficult for script kiddies to defeat your password. APTs may still get by, but a higher entropy password may not make it more difficult for APTs to compromise a system, especially when there are other passwords, trojan PDFs, suspiciously new smoke detectors, &c. to try instead.

(wait. As eszet is alt-s on a Mac, but a long s is alt-b?)

Eebster the Great
Posts: 3409
Joined: Mon Nov 10, 2008 12:58 am UTC
Location: Cleveland, Ohio

I think the odds of you getting an easily-crackable password from a good algorithm are extraordinarily small though, so most people just don't worry about that. They are, in fact, the same as the odds of a brute forcer stumbling onto your secure password anyway, so if you aren't worried about the one, there's no reason to worry about the other either.

It's all a game of probability no matter how you look at it.

superluser
Posts: 16
Joined: Wed Aug 17, 2011 5:36 am UTC

Eebster the Great wrote:I think the odds of you getting an easily-crackable password from a good algorithm are extraordinarily small though, so most people just don't worry about that. They are, in fact, the same as the odds of a brute forcer stumbling onto your secure password anyway, so if you aren't worried about the one, there's no reason to worry about the other either.

Depends on the type of password. If you're generating primes (as in my previous example), the output of a true RNG is

(1) increasingly less likely to be a prime as the size of the password increases
(2) increasingly less likely to be a safe prime
(3) increasingly less likely to be a strong prime (though this requirement has been largely superseded)

The more constraints you put on this, the less entropy the result will have, but the more secure the password will (probably) be. Again, you're not worried about brute force, you're worried about shortcuts.
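To put rough numbers on (1) and to show what the "safe prime" constraint in (2) means, here is a small sketch. The density figure is the standard prime number theorem estimate, and the trial-division test is only meant for tiny illustrative values, not for real key generation.

```python
import math

def prime_density(bits: int) -> float:
    """Rough chance that a random integer of about `bits` bits is prime,
    using the prime number theorem estimate 1 / ln(2**bits)."""
    return 1.0 / (bits * math.log(2))

def is_prime(n: int) -> bool:
    """Trial division; only suitable for small illustrative values."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True

def is_safe_prime(p: int) -> bool:
    """A safe prime is a prime p for which (p - 1) / 2 is also prime."""
    return is_prime(p) and is_prime((p - 1) // 2)

for bits in (64, 512, 2048):
    print(bits, prime_density(bits))

print([p for p in range(2, 100) if is_safe_prime(p)])
```

Each extra filter shrinks the set of acceptable outputs, which is the entropy cost described above.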

gmalivuk
GNU Terry Pratchett
Posts: 26726
Joined: Wed Feb 28, 2007 6:02 pm UTC
Location: Here and There

Of all the 12-character ASCII passwords, approximately 0.08% consist entirely of letters (upper and lower case). So the entropy lost by rules which prohibit such passwords is minuscule.
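That percentage, and the entropy cost of banning such passwords, can be sketched as follows, assuming 94 printable ASCII characters of which 52 are letters:

```python
import math

LETTERS = 52
PRINTABLE = 94
LENGTH = 12

# Fraction of all 12-character passwords that consist only of letters.
letters_only_fraction = (LETTERS / PRINTABLE) ** LENGTH

# Bits of entropy lost by excluding that fraction of the space.
entropy_lost_bits = -math.log2(1 - letters_only_fraction)

print(f"{letters_only_fraction:.4%}")
print(entropy_lost_bits)
```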

Eebster the Great
Posts: 3409
Joined: Mon Nov 10, 2008 12:58 am UTC
Location: Cleveland, Ohio

gmalivuk wrote:Of all the 12-character ASCII passwords, approximately 0.08% consist entirely of letters (upper and lower case). So the entropy lost by rules which prohibit such passwords is minuscule.

Right, but the odds of picking such a password are exactly equally minuscule.

gmalivuk
GNU Terry Pratchett
Posts: 26726
Joined: Wed Feb 28, 2007 6:02 pm UTC
Location: Here and There

But the reason to then reject it is that the odds of an attacker checking that password before others are very high.

MrRubix
Posts: 49
Joined: Sun Jul 27, 2008 2:59 pm UTC

Ultimately, it comes down to the notion that easily-memorable passwords have some sort of non-random structure to them, and that non-random structure makes a password easier to crack, even if it's high-entropy.

The true power of high-entropy comes into play when you expand your keyspace and lack any discernible structure. You want a password that *requires* brute-force to crack *even* if the cracker were to know how you got your password (which he ideally shouldn't, but you should assume the worst), such that brute-forcing would take eons and be more trouble than it's worth.

Yakk
Poster with most posts but no title.
Posts: 11115
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

No, it doesn't come down to that notion? You can have lots of entropy with structure that makes it easy to remember.

The "pass phrase of random words, which you then connect with additional words of your choice" generates easy to remember pass phrases that have more than enough entropy.
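A minimal sketch of the random-words part of that scheme. The word list here is a made-up placeholder of 2048 entries (the size the comic assumes); a real list of actual words would take its place, and the connecting words you add yourself are not counted.

```python
import math
import secrets

# Placeholder entries standing in for a real 2048-word list (hypothetical).
WORDS = [f"word{i:04d}" for i in range(2048)]

def passphrase(n_words: int = 4) -> str:
    """Pick n_words uniformly at random, with replacement, from WORDS."""
    return " ".join(secrets.choice(WORDS) for _ in range(n_words))

def entropy_bits(n_words: int = 4) -> float:
    """Entropy of the random words alone, assuming the attacker knows the list."""
    return n_words * math.log2(len(WORDS))

print(passphrase())
print(entropy_bits())
```

The entropy comes entirely from the random draws, so the structure that makes the phrase memorable doesn't reduce it.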

@#())IFΦDSKM1)*(8@F style passwords are harder to remember (even with mnemonics) and have no higher security (per effort put into remembering them). Which is the point of the comic.

gmalivuk
GNU Terry Pratchett
Posts: 26726
Joined: Wed Feb 28, 2007 6:02 pm UTC
Location: Here and There

MrRubix wrote:Ultimately, it comes down to the notion that easily-memorable passwords have some sort of non-random structure to it, and that non-random structure makes a password easier to crack, even if it's high-entropy.
That may be a notion some people have, but it's a completely incorrect one.