What-if 0034: "Twitter"

neremanth
Posts: 157
Joined: Wed Jul 25, 2012 4:24 pm UTC
Location: UK

Re: What-if 0034: Twitter

Postby neremanth » Tue Feb 26, 2013 6:18 pm UTC

Messysaurus wrote:Why wouldn't the number of string possibilities be 140^27 instead of 27^140 (first sentence)? My thought process: if you were to look at numbers 1-1000, there are 10 characters and with a length of 3 characters; 10^3 = 1000 possibilities, not 3^10 = ~57k. I apologize in advance if I'm missing something obvious.


You're right that with a length of 3 characters and 10 possible digits for each of those characters, there are 10^3 = 1000 possibilities. You're just getting the application wrong when it comes to the Twitter situation. For the 3-character, 10-digit case, we have length = 3 and possible characters = 10, and the number of possible strings is calculated by

possible strings = (possible characters)^(length) = 10^3 = 1000

For the Twitter case, we have length = 140 and possible characters = 27, so indeed

possible strings = (possible characters)^(length) = 27^140

and not 140^27.
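
The rule above (possible strings = possible characters ^ length) can be checked with a short sketch. The function name count_strings is only illustrative, and it counts strings of exactly the given length, which is the assumption used in the post.

```python
def count_strings(alphabet_size: int, length: int) -> int:
    """Number of distinct strings of exactly `length` symbols from an alphabet."""
    return alphabet_size ** length


three_digit = count_strings(10, 3)    # the 3-character, 10-digit case
tweets = count_strings(27, 140)       # 27 symbols, 140 characters
swapped = count_strings(140, 27)      # the mistaken 140^27

print(three_digit)
# Compare sizes by digit count, since the numbers are too long to read comfortably.
print(len(str(tweets)), len(str(swapped)))
```

Python integers are arbitrary precision, so both large values are computed exactly; 27^140 comes out far longer than 140^27.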

arjan
Posts: 67
Joined: Tue Feb 26, 2013 7:48 pm UTC
Location: The Netherlands

Re: What-if 0034: Twitter

Postby arjan » Tue Feb 26, 2013 8:08 pm UTC

?ecnetnes siht ni retcarahc tsal eht sseug ot drah ti sI

(I tried a few compressors to compress Huckleberry Finn from the Gutenberg project, but they all seemed to fail miserably. 25% was the best result. None of them had the option "treat this file as plain English", though. I think they should come up with the idea themselves)
Last edited by arjan on Wed Feb 27, 2013 12:09 am UTC, edited 2 times in total.

amulshah7
Posts: 6
Joined: Mon Sep 17, 2012 4:59 am UTC

Re: What-if 0034: Twitter

Postby amulshah7 » Tue Feb 26, 2013 8:22 pm UTC

This what-if reminded me of this:

http://www.youtube.com/watch?v=DAcjV60RnRw

It's an analysis of how many possible songs there can be.

TranquilFury
Posts: 131
Joined: Thu Oct 15, 2009 1:24 am UTC

Re: What-if 0034: Twitter

Postby TranquilFury » Wed Feb 27, 2013 12:28 am UTC

MythSearcher wrote:This reminds me of 2 things:
1) The Freefall comic (can't find the right one, but it is about tweeting all possible combination of letters and dictionary words and storage capacity limits)
2) There are leap years. =_,= (super evil grin)

That conversation between Florence and Dvorak starts here:
http://freefall.purrsia.com/ff2100/fc02056.htm

Novgorod
Posts: 3
Joined: Tue Jan 08, 2013 5:35 pm UTC

Re: What-if 0034: Twitter

Postby Novgorod » Wed Feb 27, 2013 5:06 am UTC

DR6 wrote:Even then, assuming that only the first 100 languages (or 10, for that matter) are relevant already makes a difference. (You end up with 10^53 and 10^52 respectively.)


Actually, what's really relevant in your consideration is the increased entropy, i.e. you assume 1.2 bits per character instead of 1.1. A ~10% change in entropy makes a 10% change to the binary exponent (or "effective" text length) rather than to the result, so you easily change the result by orders of magnitude.
Now even in the what-if text an uncertainty range for the normal English entropy is admitted to be on the order of 20% (1.0 to 1.2 bits per character, thus the 1.1 "mean" value) and I assume this number might change with time or with the group of people (or even method) chosen to determine the entropy in the first place.
The total uncertainty of this calculation is therefore already many orders of magnitude, so that considering other languages most likely lies within that uncertainty...
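
A rough sketch of how much that entropy range matters, assuming the count of messages is simply 2^(bits per character x 140) as in the what-if. The helper name log10_message_count is made up for illustration.

```python
import math

TWEET_LENGTH = 140  # characters, as in the what-if


def log10_message_count(bits_per_char: float, length: int = TWEET_LENGTH) -> float:
    """Base-10 exponent of 2 ** (bits_per_char * length)."""
    return bits_per_char * length * math.log10(2)


for bits in (1.0, 1.1, 1.2):
    exponent = log10_message_count(bits)
    print(f"{bits:.1f} bits/char -> about 10^{exponent:.1f} messages")

# Width of the whole 1.0 to 1.2 range, in orders of magnitude.
spread = log10_message_count(1.2) - log10_message_count(1.0)
print(f"spread: {spread:.1f} orders of magnitude")
```

Working with the exponent instead of the full number keeps the comparison readable: a small change in bits per character shifts the exponent, not just the leading digits.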

blackle
Posts: 1
Joined: Wed Feb 27, 2013 6:59 am UTC

Re: What-if 0034: Twitter

Postby blackle » Wed Feb 27, 2013 7:04 am UTC

Hi guys! This is my first post to the forums. I was inspired by the new whatif to create a bookmarklet that changes everything on twitter to either "there's a horse in aisle five" or "my house is full of traps"

Code:

javascript:(function(){var elements=document.getElementsByTagName("p");for(var i=0; elements[i]; i++){var e=elements[i];if(e.className == "js-tweet-text"){e.innerHTML=((Math.random()<0.5) ? "My house is full of traps." : "There's a horse in aisle five.");}}}());

ijuin
Posts: 1111
Joined: Fri Jan 09, 2009 6:02 pm UTC

Re: What-if 0034: Twitter

Postby ijuin » Wed Feb 27, 2013 7:19 am UTC

orthogon wrote:I have to admit to not getting Twitter. I mean, I get the idea, but I hate the way that the character limit makes writers whom I admire for their lucid, erudite, clear and insightful prose in other media produce awkward, garbled and ugly tweets.

The thing that annoys me about Twitter is that once you start tweeting, everybody expects you to keep tweeting pretty much 24/7, so if you go offline for a few hours people start asking where you went and act annoyed that you didn't keep posting updates. It's like they assume that they have a God-given right to know your status at all times--sort of like those people who call your cell phone and act annoyed that you can't suddenly drop everything any time of the day or night to talk to them, just because it's a cell phone and not a landline.

computronium
Posts: 1
Joined: Wed Feb 27, 2013 12:07 pm UTC

Re: What-if 0034: Twitter

Postby computronium » Wed Feb 27, 2013 12:15 pm UTC

Very similar to the Van Loon quote on this What If is an excellent song called Randy Described Eternity by Built To Spill, which starts with these lyrics:

Every thousand years
This metal sphere
Ten times the size of Jupiter
Floats just a few yards past the earth
You climb on your roof
And take a swipe at it
With a single feather
Hit it once every thousand years
'til you've worn it down
To the size of a pea
Yeah I'd say that's a long time
But it's only half a blink
In the place you're gonna be


Tried linking to the song and the lyrics in full, but got denied as "spam". Hmm. Please go give the song a spin on YouTube.

cantab314
Posts: 45
Joined: Tue Sep 25, 2012 6:03 pm UTC

Re: What-if 0034: Twitter

Postby cantab314 » Wed Feb 27, 2013 1:31 pm UTC

YellowYeti wrote:What proportion have to spell 'lose' as 'loose' before it becomes the correct spelling?
I don't know, but I'm tipping "weary" to become the correct spelling for "wary" first.

eculc
Wet Peanut Butter
Posts: 451
Joined: Mon Jun 27, 2011 4:25 am UTC

Re: What-if 0034: Twitter

Postby eculc » Wed Feb 27, 2013 2:38 pm UTC

cantab314 wrote:
YellowYeti wrote:What proportion have to spell 'lose' as 'loose' before it becomes the correct spelling?
I don't know, but I'm tipping "weary" to become the correct spelling for "wary" first.

If you'll notice, "Defiantly" has already overtaken "Definitely"
Um, this post feels devoid of content. Good luck?
For comparison, that means that if the cabbage guy from Avatar: The Last Airbender filled up his cart with lettuce instead, it would be about a quarter of a lethal dose.

CharlieP
Posts: 397
Joined: Mon Dec 17, 2012 10:22 am UTC
Location: Nottingham, UK

Re: What-if 0034: Twitter

Postby CharlieP » Wed Feb 27, 2013 2:59 pm UTC

There's a horse in aisle five. Good luck reassembling it from all the frozen beefburgers and lasagne though.
This is my signature. There are many like it, but this one is mine.

TimS
Posts: 6
Joined: Wed Feb 27, 2013 3:15 pm UTC

Re: What-if 0034: Twitter

Postby TimS » Wed Feb 27, 2013 3:16 pm UTC

If you have 7 billion people speaking all possible tweets, it'd only take 6.523 eternal minutes. ((10^47 seconds / 7 billion / (10^32 years)) days)
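
The arithmetic in that expression can be reproduced with a small sketch. It takes the figures from the post as given (10^47 seconds in total, 7 billion speakers, one "eternal day" of 10^32 years) and assumes a 365-day year, which is what seems to give the quoted figure.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # assumption: 365-day year
MINUTES_PER_DAY = 24 * 60

TOTAL_SECONDS = 1e47        # time to read every tweet aloud, from the post
PEOPLE = 7e9                # everyone reading in parallel
ETERNAL_DAY_YEARS = 1e32    # length of one "eternal day", from the post

eternal_day_seconds = ETERNAL_DAY_YEARS * SECONDS_PER_YEAR
eternal_days = TOTAL_SECONDS / PEOPLE / eternal_day_seconds
eternal_minutes = eternal_days * MINUTES_PER_DAY

print(f"{eternal_minutes:.3f} eternal minutes")
```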

NiteClerk
Posts: 44
Joined: Wed Sep 14, 2011 4:22 pm UTC

Re: What-if 0034: Twitter

Postby NiteClerk » Wed Feb 27, 2013 4:27 pm UTC

silverkitty wrote:"To a normal English speaker, “Hi, I’m Mxyztplk” is basically indistinguishable from “Hi, I’m Mxzkqklt” "
...how many English speakers have to recognize something before it becomes "normal"?


In the 1970s Americans learned to recognize and say Zbigniew Brzezinski. (Hah. Spell check wants to change Zbigniew to ignitible.)

bmonk
Posts: 662
Joined: Thu Feb 18, 2010 10:14 pm UTC
Location: Schitzoed in the OTT between the 2100s and the late 900s. Hoping for singularity.

Re: What-if 0034: Twitter

Postby bmonk » Wed Feb 27, 2013 6:35 pm UTC

I think this pizza fish is upside down. Would you please hand me that piano?
Having become a Wizard on n.p. 2183, the Yellow Piggy retroactively appointed his honorable self a Temporal Wizardly Piggy on n.p.1488, not to be effective until n.p. 2183, thereby avoiding a partial temporal paradox. Since he couldn't afford two philosophical PhDs to rule on the title.

DarkJMKnight
Posts: 1
Joined: Wed Feb 27, 2013 7:37 pm UTC

Re: What-if 0034: Twitter

Postby DarkJMKnight » Wed Feb 27, 2013 7:46 pm UTC

I wz srprzd no1 pntd out...

Bah, I can't do it anymore.

I was surprised that no one has pointed out...
that Twitter posts are often pre-compressed, meaning their information density is higher than 1-1.2 bits per letter.
But I suppose he was talking about 'proper English' sentences.

tomintx
Posts: 13
Joined: Tue Sep 18, 2012 8:15 pm UTC

Re: What-if 0034: Twitter

Postby tomintx » Wed Feb 27, 2013 8:27 pm UTC

Randall wrote:Hi, I’m Mxyztplk

Dammit! Now I'll need to change my password everywhere.

Eternal Density
Posts: 5581
Joined: Thu Oct 02, 2008 12:37 am UTC

Re: What-if 0034: Twitter

Postby Eternal Density » Thu Feb 28, 2013 3:56 am UTC

We'll never run out of things to say, but that doesn't stop us from repeating/retweeting ourselves continually.

There's a horse in aisle five. Hypothetically speaking, Summer Glau would be more likely to tweet that her house is full of traps than inform the world that there's a horse in aisle five. I guess. In any case, that possibility seems more awesome.
Play the game of Time! castle.chirpingmustard.com Hotdog Vending Supplier But what is this?
In the Marvel vs. DC film-making war, we're all winners.

Anne E Moose
Posts: 3
Joined: Mon Feb 25, 2013 1:34 am UTC

Re: What-if 0034: Twitter

Postby Anne E Moose » Thu Feb 28, 2013 8:40 am UTC

FarAlSamShaidar wrote:I wrote this all out expecting the difference to be larger than that, though now that I think about it I can see why it's not a HUGE difference. Still, off by more than 2x, it's bigger than rounding errors.

It's off by a factor of exactly* 2.2, but you've overlooked the biggest rounding error. The uncertainty in the information density of English text is +/-0.1 bits per character, which is magnified by the exponentiation and the length of a Twitter post to be +/- four orders of magnitude**.

*Well, 2.2 - 1/(2^152.9), which is close enough.
**A factor of 16384, to be precise.
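
A quick check of that second footnote, assuming the uncertainty is applied to all 140 characters. The uncertainty is kept in tenths of a bit so the exponent stays an exact integer.

```python
import math

TWEET_LENGTH = 140
UNCERTAINTY_TENTHS = 1   # +/-0.1 bits per character, expressed in tenths

# 140 characters * 0.1 bits each = 14 extra (or missing) bits in the exponent.
extra_bits = TWEET_LENGTH * UNCERTAINTY_TENTHS // 10
factor = 2 ** extra_bits
orders_of_magnitude = math.log10(factor)

print(factor, round(orders_of_magnitude, 1))
```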

JesterBLUE
Posts: 1
Joined: Tue Feb 26, 2013 1:15 pm UTC

Re: What-if 0034: Twitter

Postby JesterBLUE » Fri Mar 01, 2013 12:03 am UTC

The thing that gets me is wrestling with the sense of scale. A bird wearing away a mountain one speck at a time is clearly going to take a long time. But this is no ordinary mountain.

For example, the top 37.5% of that mountain is in outer space. That's 11 Mt. Everests in the atmosphere and 7 Mt. Everests in space on top of it. And that's from sea level to the top of Everest. It's almost double if you just count base to summit height.
Granted, it is a fairly spiky mountain - its slope is a bit steeper than 60 degrees, so that should cut down on its volume, but still...

And then there's the fact that it only gets worn away a speck every 1,000 years. In the 10,000 years from the invention of writing to the present, the bird has worn away 10 mm^3 of mountain. 1/100 of a ml. That's a fingernail clipping.

addams
Posts: 10271
Joined: Sun Sep 12, 2010 4:44 am UTC
Location: Oregon Coast: 97444

Re: What-if 0034: Twitter

Postby addams » Fri Mar 01, 2013 2:08 am UTC

eculc wrote:
cantab314 wrote:
YellowYeti wrote:What proportion have to spell 'lose' as 'loose' before it becomes the correct spelling?
I don't know, but I'm tipping "weary" to become the correct spelling for "wary" first.

If you'll notice, "Defiantly" has already overtaken "Definitely"

Oh, Dear God. The poor spellers have won.
Oh. I am so sorry. They didn't intend to.

I know this; Because, I am one of 'them'.
Spelling is hard. Good spellers deserve our respect.
What they want is for us to spell well.

I do what I can. Spelling well is not something I can do.
Zen:
If a word in the dictionary were misspelled; How would we know?

It is not funny to some people. I knew a woman that could spell.
When she found out I could not spell, it stressed our relationship.

No color is as off putting as ignorance.
She saw it as willful ignorance.
It's not.

Poor Spellers Untie was a great slogan when we had no chance of winning.
It's not so funny, now.
Life is, just, an exchange of electrons; It is up to us to give it meaning.

We are all in The Gutter.
Some of us see The Gutter.
Some of us see The Stars.
by mr. Oscar Wilde.

Those that want to Know; Know.
Those that do not Know; Don't tell them.
They do terrible things to people that Tell Them.

Klear
Posts: 1965
Joined: Sun Jun 13, 2010 8:43 am UTC
Location: Prague

Re: What-if 0034: Twitter

Postby Klear » Fri Mar 01, 2013 9:27 am UTC

JesterBLUE wrote:The thing that gets me is wrestling with the sense of scale. A bird wearing away a mountain one speck at a time is clearly going to take a long time. But this is no ordinary mountain.

For example, the top 37.5% of that mountain is in outer space. That's 11 Mt. Everests in the atmosphere and 7 Mt. Everests in space on top of it. And that's from sea level to the top of Everest. It's almost double if you just count base to summit height.
Granted, it is a fairly spiky mountain - its slope is a bit steeper than 60 degrees, so that should cut down on its volume, but still...

And then there's the fact that it only gets worn away a speck every 1,000 years. In the 10,000 years from the invention of writing to the present, the bird has worn away 10 mm^3 of mountain. 1/100 of a ml. That's a fingernail clipping.


That's exactly the point. To boggle the mind.

ilduri
Posts: 43
Joined: Thu Nov 29, 2012 7:59 am UTC
Location: Canada

Re: What-if 0034: Twitter

Postby ilduri » Sat Mar 02, 2013 3:03 am UTC

The quote that came to my mind when I read this was that one from the Quran:
Were every tree on earth a pen
And were the ocean filled with ink
With seven oceans more
Even so the words of God
Would not be exhausted

(Luq'man 31:27)

(which, if you interpret it to mean "words about God," is a rather humorous description of the rate at which theology is published).

Regarding the question of using multiple languages, the amount of overlap would vary significantly depending on how we define the difference between a language and a dialect. For example, many linguists consider English and Scots to be two different languages, because of their history, despite the fact that they're largely mutually intelligible and a number of sentences could be written identically in both. They're an example of convergent evolution. And many other languages are in similar situations.
Another thing to consider is the fact that the Chinese characters used to write a sentence in, say, Mandarin, can often be read as a meaningful sentence in other Chinese languages like Cantonese (albeit pronounced differently), and may even be meaningful in Japanese (though probably not as a complete sentence). Thus the number of possible sentences written in Chinese script is significantly lower per language than those written in Roman script.

KarenRei
Posts: 285
Joined: Sat Jun 16, 2012 10:48 pm UTC

Re: What-if 0034: Twitter

Postby KarenRei » Mon Mar 04, 2013 8:12 am UTC

tibfulv wrote:Hm. I never knew the lands where that mountain was was supposed to be Svithjod. If I remember correctly, that's the norse name for Sweden, or as Svithjod hin mikla (Great Svithjod), Russia. Based on the Karakorum?

huanghos bookmarklet is working beautifully, too.


I don't know which particular Nordic language spells it suchly, but in Icelandic, Sweden is "Svíþjóð". With time most Nordics lost the þ and ð in favor of th and d (the latter being an especially bad shift, ð and d do not sound similar). Þjóð means nation - hence, the "Sví Nation"

sonar1313
Posts: 183
Joined: Tue Mar 05, 2013 5:29 am UTC

Re: What-if 0034: Twitter

Postby sonar1313 » Tue Mar 05, 2013 5:42 am UTC

computronium wrote:Very similar to the Van Loon quote on this What If is an excellent song called Randy Described Eternity by Built To Spill, which starts with these lyrics:

Every thousand years
This metal sphere
Ten times the size of Jupiter
Floats just a few yards past the earth
You climb on your roof
And take a swipe at it
With a single feather
Hit it once every thousand years
'til you've worn it down
To the size of a pea
Yeah I'd say that's a long time
But it's only half a blink
In the place you're gonna be


Tried linking to the song and the lyrics in full, but got denied as "spam". Hmm. Please go give the song a spin on YouTube.


Apparently the both of us registered to talk about eternity, not Twitter. What the Van Loon quote reminded me of was an even older reference to eternity from a hellfire-and-brimstone sermon in A Portrait of The Artist as a Young Man; according to the Google Books link, the Van Loon book was published in 1921, and Portrait of the Artist was published in 1916 (and serialized prior to that.) I had to read it in AP Lang, you see, and the eternity sermon is basically the one thing that's stayed with me after the rest of the book disappeared. I'm paraphrasing, but eternity is described thusly: imagine a mountain of sand a million miles high. Now imagine a bird flies to the mountain every million years and removes one grain of sand. Then when the mountain is gone, the bird once again flies to where the mountain was and replaces it, one grain at a time, every million years. When the mountain has disappeared and reappeared once, that is not even an instant in the span of eternity.

I would imagine that the idea of such a mountain to illustrate eternity predates Joyce as well. But the Van Loon quote works better for illustrating the Twitter question.

Quicksilver
Posts: 437
Joined: Wed Apr 29, 2009 6:21 am UTC

Re: What-if 0034: Twitter

Postby Quicksilver » Tue Mar 05, 2013 7:01 am UTC

addams wrote:
eculc wrote:
cantab314 wrote:
YellowYeti wrote:What proportion have to spell 'lose' as 'loose' before it becomes the correct spelling?
I don't know, but I'm tipping "weary" to become the correct spelling for "wary" first.

If you'll notice, "Defiantly" has already overtaken "Definitely"

Oh, Dear God. The poor spellers have won.
Oh. I am so sorry. They didn't intend to.

I know this; Because, I am one of 'them'.
Spelling is hard. Good spellers deserve our respect.
What they want is for us to spell well.

I do what I can. Spelling well is not something I can do.
Zen:
If a word in the dictionary were misspelled; How would we know?

It is not funny to some people. I knew a woman that could spell.
When she found out I could not spell, it stressed our relationship.

No color is as off putting as ignorance.
She saw it as willful ignorance.
It's not.

Poor Spellers Untie was a great slogan when we had no chance of winning.
It's not so funny, now.
This may be one of the best things I've ever read on the internet ever.

tendays
Posts: 957
Joined: Sat Feb 17, 2007 6:21 pm UTC
Location: HCMC

Re: What-if 0034: Twitter

Postby tendays » Tue Mar 05, 2013 7:36 am UTC

DR6 wrote:Ah, but we are talking about possible tweets, without accounting how probable they are.

Interestingly, as he bases his calculations on entropy, his answer really counts probable English sentences! As soon as sentences do not all have the same probability we get a value that is too low. To take an extreme example, if we had a language with 1048578 valid 140 character sentences, but where people most of the time (like 1048575 times in 1048576) used just two of them we get 2.0000273952 sentences which we'd round to two, instead of over a million. Now I wonder to what extent this affects the value for English sentences...
<Will> s/hate/love/
Hammer wrote:We are only mildly modly. :D
Beware of the shrolymerase!
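
The effect tendays describes can be sketched by computing 2^H (the "effective" number of sentences) for a skewed distribution. The split below is one reading of the example: two common sentences share 1048575/1048576 of the probability, and 1048576 rare sentences share the rest equally. The helper name perplexity is illustrative.

```python
import math


def perplexity(groups):
    """Return 2 ** H, where H is the Shannon entropy in bits.

    `groups` is a list of (count, probability_of_each_item) pairs, so a
    million equally likely items need only one entry.
    """
    entropy_bits = -sum(count * p * math.log2(p) for count, p in groups if p > 0)
    return 2 ** entropy_bits


COMMON = 2            # sentences used almost all of the time
RARE = 1_048_576      # remaining valid sentences (assumed equally rare)
common_mass = 1_048_575 / 1_048_576

skewed = [
    (COMMON, common_mass / COMMON),
    (RARE, (1 - common_mass) / RARE),
]
uniform = [(COMMON + RARE, 1 / (COMMON + RARE))]

effective_skewed = perplexity(skewed)
effective_uniform = perplexity(uniform)
print(effective_skewed, effective_uniform)
```

With equal probabilities the entropy count recovers the full number of sentences; with the skewed distribution it collapses to barely more than two, which is the point of the post. The exact trailing digits depend on how the rare probability is split.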

Erzengel
Posts: 9
Joined: Thu Apr 26, 2012 4:36 pm UTC

Re: What-if 0034: Twitter

Postby Erzengel » Mon Mar 25, 2013 5:47 pm UTC

addams wrote:Zen:
If a word in the dictionary were misspelled; How would we know?


Fortunately, there is no "THE dictionary", there are only "dictionaries". Thus we can apply the Byzantine Generals Algorithm to the various dictionaries to determine which is the "correct" spelling. And if it's misspelled in all the dictionaries? Well, then it's not really misspelled, as they are the definitive source of correct spelling. There is no nebulous "correct spelling" in the ether, only that which we define for ourselves through common acceptance.

FarAlSamShaidar wrote:There's a rather large problem with Randall's math. Thought I'd NEVER say that. But this is not like binary in that leading (or trailing, depending on endian-ness) 0s don't affect the result. In other words, messages of 139 characters in length are wholly different than those of 140 characters. Even more so for those of 100 characters in length. Etc. If we assume that Mr. H. wants any string of characters that are English, rather than complete and logical sentences (which is, more or less, the assumption Randall makes) then even one-letter messages such as "I" and "a" are valid. Using 1.1 bits per letter then, the proper answer is 2^(140*1.1) + 2^(139*1.1) + 2^(138*1.1) ... + 2^(2*1.1) + 2^(1*1.1); or in other words (sorry, I don't really know LaTeX or if it can be used in forums) SUM(2^(n*1.1), 1, 140). That gives an answer of approximately 4.28*10^46.

I wrote this all out expecting the difference to be larger than that, though now that I think about it I can see why it's not a HUGE difference. Still, off by more than 2x, it's bigger than rounding errors.


Thank you, I wanted to say exactly that.
Last edited by Erzengel on Mon Mar 25, 2013 7:23 pm UTC, edited 1 time in total.
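
The sum in the quoted post can be evaluated directly. This is a minimal sketch under the same assumption as the quote (1.1 bits per character at every length from 1 to 140); the function name messages_of_length is illustrative.

```python
BITS_PER_CHAR = 1.1
MAX_LENGTH = 140


def messages_of_length(n: int) -> float:
    """Estimated number of English messages of exactly n characters."""
    return 2 ** (BITS_PER_CHAR * n)


longest_only = messages_of_length(MAX_LENGTH)
all_lengths = sum(messages_of_length(n) for n in range(1, MAX_LENGTH + 1))
ratio = all_lengths / longest_only

print(f"140 characters only: {longest_only:.3g}")
print(f"1 to 140 characters: {all_lengths:.3g}")
print(f"ratio: {ratio:.3f}")
```

Because this is a geometric series, nearly all of the total comes from the longest few lengths, which is why counting shorter messages changes the answer by a small constant factor and not by orders of magnitude.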

AlexTheSeal
Posts: 53
Joined: Mon Oct 26, 2009 12:57 am UTC

Re: What-if 0034: Twitter

Postby AlexTheSeal » Mon Mar 25, 2013 5:55 pm UTC

Oktalist wrote:"If a million monkeys were given a million typewriters, eventually one of them might produce the complete works of Shakespeare, but to reach it would it be worth wading through four hundred copies of 'Money' by Martin Amis?"
- Simon Munnery

I'm just a dumb American, and I don't know who Simon Munnery is, but I do know that Martin Amis is a freakin' genius. So I'd have to answer "yes."

Code:

10 REM WORLD'S SMALLEST ADVENTURE GAME
20 PRINT "YOU ARE IN A CAVE (N, S, E, W)? ";
30 INPUT A$
40 GOTO 10

Lulled to sleep by the one-hertz chuckle of Linux logfile writes since 1997.

snowyowl
Posts: 464
Joined: Tue Jun 23, 2009 7:36 pm UTC

Re: What-if 0034: "Twitter"

Postby snowyowl » Tue Apr 30, 2013 12:46 pm UTC

Does anyone know where I can find the small community that repeated the same six posts over and over in the same order for ten years?
The preceding comment is an automated response.

addams
Posts: 10271
Joined: Sun Sep 12, 2010 4:44 am UTC
Location: Oregon Coast: 97444

Re: What-if 0034: "Twitter"

Postby addams » Fri May 03, 2013 1:13 am UTC

snowyowl wrote:Does anyone know where I can find the small community that repeated the same six posts over and over in the same order for ten years?

Why would a group of People do such a stupid thing?
All the other stupid stuff had already been done?
Life is, just, an exchange of electrons; It is up to us to give it meaning.

We are all in The Gutter.
Some of us see The Gutter.
Some of us see The Stars.
by mr. Oscar Wilde.

Those that want to Know; Know.
Those that do not Know; Don't tell them.
They do terrible things to people that Tell Them.

speising
Posts: 2354
Joined: Mon Sep 03, 2012 4:54 pm UTC
Location: wien

Re: What-if 0034: "Twitter"

Postby speising » Fri May 03, 2013 11:57 am UTC

addams wrote:
snowyowl wrote:Does anyone know where I can find the small community that repeated the same six posts over and over in the same order for ten years?

Why would a group of People do such a stupid thing?
All the other stupid stuff had already been done?



because they can. that's the only reason required.

mgold
Posts: 1
Joined: Thu Feb 12, 2015 3:35 am UTC

What-if 34: Twitter, ebook compression

Postby mgold » Thu Feb 12, 2015 3:45 am UTC

So I've tried compressing .txt ebooks and I have yet to reach the claimed 1/8th size of compression. I've found bzip2 seems to compress text the best by a fair margin (vs zip, gz, 7z) and it still only achieves about 1/4th. Am I doing it wrong? Or is it just not an ideal enough case to reach 1/8th?

I used "The History of Pottery Part 1 by H. B. Walters" on Project Gutenberg since it's fairly large (over 1MB) and fairly recent (2013) so I figured there wouldn't be any weird change in character frequency from being written a long time ago.

1. I converted it from utf-8 to ascii using iconv
2. I zipped it using bzip2 --best --keep
3. I divided the filesize of the compressed file by the uncompressed and got roughly .26, not the .13ish I was hoping for
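
The three steps above can be sketched with Python's standard bz2 module in place of the command-line tool. The sample text here is a stand-in (a repeated sentence), so its ratio says nothing about real prose; to repeat the experiment, the bytes of the ebook file would be passed in instead. For reference, a ratio of .26 on 8-bit characters works out to about 2.1 bits per character.

```python
import bz2


def compression_stats(data: bytes) -> tuple[float, float]:
    """Return (compressed/original size ratio, bits per original byte)."""
    compressed = bz2.compress(data, compresslevel=9)  # 9 corresponds to --best
    ratio = len(compressed) / len(data)
    return ratio, 8 * ratio


# Stand-in input: highly repetitive, so it compresses far better than a book.
sample = ("It is hard to guess the last character in this sentence. " * 2000).encode("ascii")

ratio, bits_per_char = compression_stats(sample)
print(f"ratio {ratio:.4f}, about {bits_per_char:.3f} bits per character")
```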

PinkShinyRose
Posts: 834
Joined: Mon Nov 05, 2012 6:54 pm UTC
Location: the Netherlands

Re: What-if 0034: Twitter

Postby PinkShinyRose » Sun Feb 15, 2015 4:07 pm UTC

eculc wrote:
cantab314 wrote:
YellowYeti wrote:What proportion have to spell 'lose' as 'loose' before it becomes the correct spelling?
I don't know, but I'm tipping "weary" to become the correct spelling for "wary" first.

If you'll notice, "Defiantly" has already overtaken "Definitely"

Now I'm wondering if there are people counting uses of each spelling variant. I'm also wondering if spell checking software is causing language shifts to move towards increasing numbers of homonyms.

tms
Posts: 266
Joined: Fri Sep 21, 2012 12:53 am UTC
Location: The Way of the Hedgehog

Re: What-if 34: Twitter, ebook compression

Postby tms » Fri Feb 27, 2015 1:05 pm UTC

mgold wrote:So I've tried compressing .txt ebooks and I have yet to reach the claimed 1/8th size of compression. I've found bzip2 seems to compress text the best by a fair margin (vs zip, gz, 7z) and it still only achieves about 1/4th. Am I doing it wrong? Or is it just not an ideal enough case to reach 1/8th?

The mentioned compressors work without context, which is part of the point of them but makes predictions somewhat unreliable.
- No, son. I said 'duck'.
- Duck duck duck duck! Duck duck duck duck!

Logological
Posts: 9
Joined: Tue Mar 17, 2015 3:52 pm UTC

Re: What-if 0034: "Twitter"

Postby Logological » Tue Mar 17, 2015 5:00 pm UTC

I suspect there's a small error in this What-If. The article states,
This means that a good compression algorithm should be able to compress ASCII English text—which is eight bits per letter—to about 1/8th of its original size.

However, ASCII is and has always been a 7-bit character set. (Various microcomputer manufacturers and operating system developers have devised their own 8-bit character sets whose lower 7 bits are fully or mostly identical to ASCII, but the sets as a whole were never properly referred to as ASCII, even by their creators.) The compression ratio given in the article is probably a bit off then.
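
To put a number on "a bit off", here is a small sketch comparing the ideal compressed size for 7-bit and 8-bit storage, assuming the 1.1 bits per character entropy figure used in the article. In practice ASCII text is usually stored one character per 8-bit byte, so the 8-bit figure is the one a file on disk would show.

```python
ENTROPY_BITS_PER_CHAR = 1.1   # estimate for English used in the article


def ideal_ratio(storage_bits_per_char: int) -> float:
    """Best-case compressed size as a fraction of the original size."""
    return ENTROPY_BITS_PER_CHAR / storage_bits_per_char


ratio_7bit = ideal_ratio(7)
ratio_8bit = ideal_ratio(8)
print(f"7-bit: {ratio_7bit:.3f}  8-bit: {ratio_8bit:.3f}")
```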

