
Re: 1953: "The History of Unicode"

Posted: Mon Feb 12, 2018 1:27 pm UTC
by cellocgw
da Doctah wrote:
Mikeski wrote:
jc wrote:I know a few who like to call themselves Mainiacs.

The only former-Bostonian I've worked with called himself a Masshole.

Kinda makes me wish there was a good self-deprecating term for "Minnesotan". Other than "Canadian".


I think you mean "Baja Canadian".


as opposed to, maybe, "Bwahaha Canadian" ?

Re: 1953: "The History of Unicode"

Posted: Mon Feb 12, 2018 1:42 pm UTC
by Godsguy
ivnja wrote:
Eternal Density wrote:(What's the word for people who live in Maine?)
Mainers, although I don't think there's technically an official demonym.

jc wrote:I know a few who like to call themselves Mainiacs.
And then there's the ultra-special breed, the Maineiacs, who you'll find screaming abuse at opposing goaltenders from the student section balcony at the Alfond every other Friday and Saturday night from mid-October to mid-February. (Go Blue!)



While the rest of the country might spell/pronounce it "Mainer", the true name is closer to "Mainah".

You people and your R's. Leave em at the bordah.....

Re: 1953: "The History of Unicode"

Posted: Mon Feb 12, 2018 2:47 pm UTC
by ebow
Godsguy wrote:While the rest of the country might spell/pronounce it "Mainer", the true name is closer to "Mainah".

You people and your R's. Leave em at the bordah.....


I'd like to see some dater to support the idear that R's need not apply.

(Or is that only the case in mainland Massachusetts, not in the District of Maine?)

Re: 1953: "The History of Unicode"

Posted: Tue Feb 13, 2018 4:39 pm UTC
by Shifty
J%r wrote:Did the image change? Somehow I only see the top rectangle.


When the cartoon was originally posted, the date on the left of the third panel was "1998" instead of "1988". I suspect correcting that was the cause of the momentary glitching.

Re: 1953: "The History of Unicode"

Posted: Tue Feb 13, 2018 6:02 pm UTC
by jc
Godsguy wrote:While the rest of the country might spell/pronounce it "Mainer", the true name is closer to "Mainah".

You people and your R's. Leave em at the bordah.....


But an R seems to have crept in somehow. It should be "boadah".

Re: 1953: "The History of Unicode"

Posted: Thu Feb 15, 2018 8:58 pm UTC
by xtifr
So yes, why does the forum block random unicode characters like Egyptian Hieroglyphs? I can understand blocking things like the bidi control characters, and I can maybe understand blocking the penis hieroglyph (and the ejaculating penis hieroglyph), but the whole code block? That's just bizarre.

In any case, though I can't post it (and that may be just as well), I do want to remind aubergine users that there are perfectly good unicode characters for the penis itself: U+130B8 and U+130BA. Not as colorful as the aubergine, perhaps, but much less ambiguous.

Re: 1953: "The History of Unicode"

Posted: Thu Feb 15, 2018 10:40 pm UTC
by Soupspoon
xtifr wrote:So yes, why does the forum block random unicode characters like Egyptian Hieroglyphs? I can understand blocking things like the bidi control characters, and I can maybe understand blocking the penis hieroglyph (and the ejaculating penis hieroglyph), but the whole code block? That's just bizarre.

Whitelisting?

Start with selected ASCII codes lower than 32 (CR/LF accepted but standardised, no BELL?) and the whole of the 32 to 126 range, then expand into allowing long-accepted Codepage/Unicode territories, certainly, but most of the Emoji pages just lower the information density.
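
A minimal sketch of the kind of whitelist described above, assuming only CR/LF from the control range plus codes 32 to 126 are accepted. The names `is_allowed` and `filter_post` are made up for illustration; this is not the forum software's actual filter.

```python
def is_allowed(ch: str) -> bool:
    """Whitelist: printable ASCII (32..126) plus CR and LF. BELL and friends are out."""
    return 32 <= ord(ch) <= 126 or ch in "\r\n"


def filter_post(text: str) -> str:
    """Standardise line endings to LF, then drop every character not on the whitelist."""
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return "".join(ch for ch in text if is_allowed(ch))


if __name__ == "__main__":
    # The BELL (\a) and the aubergine (U+1F346) are both silently dropped.
    print(repr(filter_post("hi\a there\r\n\U0001F346")))
```

Expanding into "long-accepted" territories would then mean adding further ranges to `is_allowed`, one deliberate decision at a time.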

Like I often cannot identify elements in a 'Graphics Pack'ed Dwarf Fortress screenshot, because that little (exquisitely drawn!) pictorial array of pixels makes less sense to me than a given colour of Codepage-437 character image, most of the time.

Words, though. They work well enough, most of the time, with a good application of punctuation, a limited number of standardised in-line smiley images, proper mark-up and faux mark-up ([explanation]maybe for irony[/explanation]).

Sending me an aubergine symbol would still confuse me, though I have been aware of its popular symbology (a minced reference, to get around penis-symbol blacklisting, no doubt).

Sending me the 1F64F "PERSON WITH FOLDED HANDS" symbol (which you can't do, here, but if you could I'd advise you to drastically increase the font-size for me to have a hope to identify it without some form of pinch-zooming) confuses me more as it looks more like a "deep in prayer" symbol. Is that what you meant? Apparently Randall used it as "thank you", in the comic title-text example. But "Thank God! (for helicopter heroes? - though I don't see the helicopter or medal symbols myself)" might be the message I take away, which is quite a different meaning from most of the various possibilities a sender might intend.

(Heck, even some of the authorised in-line smiley symbols are ambiguous. :| )

Re: 1953: "The History of Unicode"

Posted: Thu Feb 15, 2018 11:30 pm UTC
by ucim
Problem with Unicode and Emojis is that they don't go far enough. We should have a single symbol for every possible internet thought. Think of all the typing it would save!

(Upon further reflection, we can probably have a symbol for every possible internet thought just using the upper case ASCII character set.)

Jose

Re: 1953: "The History of Unicode"

Posted: Mon Feb 19, 2018 5:38 pm UTC
by chridd
xtifr wrote:So yes, why does the forum block random unicode characters like Egyptian Hieroglyphs? I can understand blocking things like the bidi control characters, and I can maybe understand blocking the penis hieroglyph (and the ejaculating penis hieroglyph), but the whole code block? That's just bizarre.
It has problems with any character with a code above U+FFFF, i.e., any character that takes more than two bytes in UTF-16 (which includes, among other things, hieroglyphs and most emoji). (...which is odd, because an older version of the forum software allowed them just fine, and last I checked old posts that have the characters still display okay.)
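
To illustrate the U+FFFF boundary mentioned above, here is a small sketch that counts UTF-16 code units per character. The helper names are hypothetical, and it says nothing about how the forum backend actually stores text; it only shows which characters need a surrogate pair.

```python
def utf16_units(ch: str) -> int:
    """Number of 16-bit code units needed to encode one character in UTF-16."""
    return len(ch.encode("utf-16-le")) // 2


def needs_surrogate_pair(ch: str) -> bool:
    """True for code points above U+FFFF, i.e. outside the Basic Multilingual Plane."""
    return ord(ch) > 0xFFFF


if __name__ == "__main__":
    # A plain letter, a hieroglyph from the thread, and the folded-hands emoji.
    for ch in ("A", "\U000130B8", "\U0001F64F"):
        print(f"U+{ord(ch):04X}: {utf16_units(ch)} code unit(s)")
```

Anything for which `needs_surrogate_pair` is true takes four bytes in UTF-16 rather than two, which matches the set of characters the forum reportedly rejects.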

Re: 1953: "The History of Unicode"

Posted: Mon Feb 19, 2018 6:07 pm UTC
by Soupspoon
I say that "It has problems with…" is probably more like "It has been told to refuse…". A filter that did not exist in prior forum backends, perhaps, thus messages already featuring them still do[1], but the current generation of the backend is (with possibly good reasons) set up to reject characters that are technically passable, but blanket banned either as default (not unbanned, locally) or as a chosen installation option by our senior moderator(s).

And that's that, I think, until this is brought up in the forum-admin areas (public or otherwise) and the option to allow it is posited, indicated as possible and then made so by those with sufficient bit-flipping rights on the server. Not worth arguing about here, I would say (unless someone with coloured text cares to just pass by here and say that this is not the case, and it's no use petitioning anybody to change something that isn't doable in the first place, for far more complex reasons than I have been guessing about).


[1] OTOH, I note many "unavailable image" inserts, without having checked where it thinks the image is, maybe an obsolete 3rd-party image-store site but it seems to be consistent among early years posts all over the place so they might be once-hosted-on-here ones that didn't get ported (or not properly so).

Re: 1953: "The History of Unicode"

Posted: Thu Feb 22, 2018 11:09 pm UTC
by jc
xtifr wrote:So yes, why does the forum block random unicode characters like Egyptian Hieroglyphs? I can understand blocking things like the bidi control characters, and I can maybe understand blocking the penis hieroglyph (and the ejaculating penis hieroglyph), but the whole code block? That's just bizarre.

In any case, though I can't post it (and that may be just as well), I do want to remind aubergine users that there are perfectly good unicode characters for the penis itself: U+130B8 and U+130BA. Not as colorful as the aubergine, perhaps, but much less ambiguous.

For anyone who's curious, you can find those characters at http://www.unicode.org/charts/PDF/U13000.pdf .

Re: 1953: "The History of Unicode"

Posted: Thu Feb 22, 2018 11:39 pm UTC
by ucim
... and where can I find a list of unicode characters with (potentially bad) side effects on web pages, browsers, and other user agents?

Jose

Re: 1953: "The History of Unicode"

Posted: Fri Feb 23, 2018 12:45 am UTC
by Soupspoon
Well, there'd be a potential list for one security issue, that I'm not even sure that we are protected from. Or could/should be, because there are legitimate reasons for using those сhагасtегѕ. And that's just an obvious exploit. I never actually typed the plural of "character", in this post, and I made not as much effort to be subtle as I could have (you probably spotted it). Imagine that instead I was messing about with RTL and other non-printing codes that are 'allowable'?
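
A rough sketch of how such lookalike and direction-control tricks could be flagged, using only Python's standard `unicodedata` module. The script guess (first word of each letter's Unicode name) is a crude assumption made for illustration, not a proper script-property lookup, and the function names are hypothetical.

```python
import unicodedata

# Bidi classes of the explicit embedding, override and isolate controls.
BIDI_CONTROLS = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}


def scripts_in(word: str) -> set:
    """Crude script guess: the first word of each letter's Unicode name."""
    return {unicodedata.name(ch, "UNKNOWN").split()[0] for ch in word if ch.isalpha()}


def has_bidi_control(text: str) -> bool:
    """True if the text contains an explicit bidi embedding/override/isolate control."""
    return any(unicodedata.bidirectional(ch) in BIDI_CONTROLS for ch in text)


if __name__ == "__main__":
    spoofed = "\u0441haracters"  # first letter is CYRILLIC SMALL LETTER ES, not Latin c
    print(scripts_in(spoofed))
    print(has_bidi_control("abc\u202Edef"))  # U+202E is RIGHT-TO-LEFT OVERRIDE
```

A word that mixes scripts is not proof of mischief, since there are legitimate reasons to mix them, so this is at best a hint for a human to look closer. Note also that the plain direction marks (U+200E, U+200F) are not covered by this set.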

There are people whose whole paycheque (legitimate or illicit) heavily depends upon thinking these things through. Or responding to them, hence a good part of the CVEs thing. (It has been a sideline/distraction, at most, even when I was ostensibly employed as consultant in IT security (they changed my job name, but not my workload, which I was Ok with), so I let the Big Guys earn their keep without professing anything like a similar level of expertise, or claiming any particularly significant hat (of any greyscale value) for myself.)

Re: 1953: "The History of Unicode"

Posted: Fri Feb 23, 2018 7:29 am UTC
by Steve the Pocket
jc wrote:
xtifr wrote:So yes, why does the forum block random unicode characters like Egyptian Hieroglyphs? I can understand blocking things like the bidi control characters, and I can maybe understand blocking the penis hieroglyph (and the ejaculating penis hieroglyph), but the whole code block? That's just bizarre.

In any case, though I can't post it (and that may be just as well), I do want to remind aubergine users that there are perfectly good unicode characters for the penis itself: U+130B8 and U+130BA. Not as colorful as the aubergine, perhaps, but much less ambiguous.

For anyone who's curious, you can find those characters at http://www.unicode.org/charts/PDF/U13000.pdf .

What the hell.

Forget emojis; why in Dickens's deuce is this a thing?!

Re: 1953: "The History of Unicode"

Posted: Fri Feb 23, 2018 12:06 pm UTC
by Flumble
Steve the Pocket wrote:Forget emojis; why in Dickens's deuce is this a thing?!

You know what the "uni" in unicode stands for, right? Egyptian hieroglyphs have (allegedly) been used as a script by a whole civilization.
It makes much more sense for those to be included in unicode than, say, emoji ...let alone things like skin tone modifiers. U+130B8 U+1F3FE U+1F93F U+0356

Re: 1953: "The History of Unicode"

Posted: Fri Feb 23, 2018 2:13 pm UTC
by orthogon
jc wrote:
xtifr wrote:So yes, why does the forum block random unicode characters like Egyptian Hieroglyphs? I can understand blocking things like the bidi control characters, and I can maybe understand blocking the penis hieroglyph (and the ejaculating penis hieroglyph), but the whole code block? That's just bizarre.

In any case, though I can't post it (and that may be just as well), I do want to remind aubergine users that there are perfectly good unicode characters for the penis itself: U+130B8 and U+130BA. Not as colorful as the aubergine, perhaps, but much less ambiguous.

For anyone who's curious, you can find those characters at http://www.unicode.org/charts/PDF/U13000.pdf .

I am disappoint that the name of that codepoint is "EGYPTIAN HIEROGLYPH D053" not "EGYPTIAN HIEROGLYPH EJACULATING PENIS". It's like they noticed that was one of the hieroglyphs, and prudishly decided to call them all by their codepoints in some other system to spare their blushes. If it's good enough for hieroglyphs, why aren't the Roman letters called things like "ASCII CHARACTER 41"?
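
For anyone who wants to check codepoint names without opening the PDF chart, a small sketch using Python's standard `unicodedata` module. The helper `describe` is a made-up name, and what it returns for the hieroglyphs depends on the Unicode database bundled with your Python build.

```python
import unicodedata


def describe(ch: str) -> str:
    """Format a character as 'U+XXXX NAME', with a placeholder if it has no name."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"


if __name__ == "__main__":
    # The Roman letter at 0x41, then the two hieroglyphs mentioned in the thread.
    for ch in ("A", "\U000130B8", "\U000130BA"):
        print(describe(ch))
```

The Roman letter comes back with a descriptive name, while the hieroglyphs should come back with catalogue-style names of the kind quoted above.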

Re: 1953: "The History of Unicode"

Posted: Fri Feb 23, 2018 8:12 pm UTC
by Steve the Pocket
Flumble wrote:
Steve the Pocket wrote:Forget emojis; why in Dickens's deuce is this a thing?!

You know what the "uni" in unicode stands for, right? Egyptian hieroglyphs have (allegedly) been used as a script by a whole civilization.

A dead civilization. It's not like we're leaving modern-day ancient Egyptians without a way to communicate on their phones. And I've never seen anyone use hieroglyphics in print, only ever just photographs of tomb inscriptions. Maybe a monochrome diagram from time to time, on its own line, but you can digitize that without custom characters.

Re: 1953: "The History of Unicode"

Posted: Sat Feb 24, 2018 8:09 pm UTC
by xtifr
Steve the Pocket wrote:
Flumble wrote:You know what the "uni" in unicode stands for, right? Egyptian hieroglyphs have (allegedly) been used as a script by a whole civilization.

A dead civilization.


A dead civilization which is the subject of tons of research and which regularly generates scholarly papers. The fact that you may only read popularizations which show photos of tombs rather than going into detail about the language doesn't mean there aren't regularly published papers about the language, or which quote the language in detail.

Heck, Egyptology is such a prominent branch of archaeology that it has its own name.

But ancient Egyptian isn't the only dead language covered by Unicode. Unicode is big! It has Cuneiform and Ogham and Runic as well. And plenty of room to spare.

There's a movement to get Tolkien's Tengwar (Elvish) accepted. Some folks have even staked out a code block and produced unofficial fonts using that block. So far, the Unicode Consortium have balked at accepting any made-up languages, but there's enough scholarship studying Tolkien that they could end up conceding on this one. For the moment, though, they're still limiting it to real languages. But that is pretty much the only limit so far.

Unicode isn't designed just to allow millennials in the US and China to send each other pictures of aubergines. It's designed to be useful for grownups as well! :P :wink:

edit: fix typo

Re: 1953: "The History of Unicode"

Posted: Sun Feb 25, 2018 12:49 am UTC
by rmsgrey
xtifr wrote:So far, the Unicode Consortium have balked at accepting any made-up languages


I expect most made-up languages are supported by Unicode (a lot are supported by ASCII) - it's the made-up graphemes that the Consortium presumably draw the line at.

Re: 1953: "The History of Unicode"

Posted: Sun Feb 25, 2018 12:52 am UTC
by GlassHouses
xtifr wrote:So far, the Unicode Consortium have balked at accepting any made-up languages

What about emoji, though?

N.B. I'm not one of those people who disapprove of emoji in Unicode. There are plenty of code points available to have some fun with... But I am curious, what's the official rationale there?

Re: 1953: "The History of Unicode"

Posted: Sun Feb 25, 2018 1:42 am UTC
by Pfhorrest
Emoji predated unicode and are in some older CJK encodings that unicode has to interconvert with (unicode’s primary mission), and since it then has to do that basic unicode anyway and people turned out to use it so much, they may as well get it right while they’re at it.