The devil said, “Take this glyph-laden grimoire and try to render it cross-platform.”

merari42@lemmy.world · 1 day ago

The devil said, “Take this glyph-laden grimoire and try to render it cross-platform.”

esa@discuss.tchncs.de · 1 day ago

It’s a joke because it includes useless letters nobody needs, like that weird o with the leg, and a rich set of field and record separating characters that are almost completely forgotten, etc, but not normal letters used in everyday language >:(

CameronDev@programming.dev · 1 day ago

weird o with the leg

Can you elaborate? Do you mean Q or p?

esa@discuss.tchncs.de · 1 day ago

Q. P is a common character across languages. But Q is mostly unused, at least outside the Romance languages, which appear to spell K that way. But that can be solved by letting the characters have the same code point, and rendering it as K in most regions, and Q in France. I can’t imagine any problems arising from that. :)

spizzat2@lemm.ee · edit-2 1 day ago

While we’re at it, I have some other suggestions…

For example, in year 1 that useless letter “c” would be dropped to be replased either by “k” or “s,” and likewise “x” would no longer be part of the alphabet. The only kase in which “c” would be retained would be the “ch” formation, which will be dealt with later. year 2 might reform “w” spelling, so that “which” and “one” would take the same konsonant, wile year 3 might well abolish “y” replasing it with “i” and iear 4 might fiks the “g/j” anomali wonse and for all.
Jenerally, then, the improvement would kontinue iear bai iear with iear 5 doing awai with useless double konsonants, and iears 6-12 or so modifaiing vowlz and the rimeining voist and unvoist konsonants. Bai iear 15 or sou, it wud fainali bi posibl tu meik ius ov thi ridandant letez “c,” “y” and “x”–bai now jast a memori in the maindz ov ould doderez–tu riplais “ch,” “sh,” and “th” rispektivli.
Fainali, xen, aafte sam 20 iers ov orxogrefkl riform, wi wud hev a lojikl, kohirnt speling in ius xrewawt xe Ingliy-spiking werld.

setVeryLoud(true);@lemmy.ca · 19 hours ago

Look into the Shavian alphabet

Onomatopoeia@lemmy.cafe · 1 day ago

Haha, nicely done. I had to work harder and harder to read it.

esa@discuss.tchncs.de · 1 day ago

Jess. Ai’m still lukking får the ekvivalent åv /r/JuropijenSpelling her ån lemmi. Fæntæstikk søbreddit vitsj æbsolutli nids lemmi representeysjen.

lad@programming.dev · 1 day ago

If that’s a joke, it’s a good one. Otherwise, well, there are a lot of “this letter isn’t needed, let’s throw it away” proposals, and in most cases they will not work as well as you think.

esa@discuss.tchncs.de · 1 day ago

Yes, I am joking. We probably could do something like the old ISO 646 or whatever it was that swapped letters depending on locale (or equivalent), but it’s not something we want to return to.

It’s also not something we’re entirely free of: Even though it’s mostly gone, apparently Bulgarian locales do something interesting with Cyrillic characters. cf https://tonsky.me/blog/unicode/

AnarchistArtificer@slrpnk.net · 19 hours ago

Damn, thanks for that link; earlier today I was telling a non-techy friend about Unicode quirks, and I could vaguely remember that post, but not well enough to remember how to find it. I didn’t try very hard because it wasn’t a big deal, so the serendipity of finding it via your comment was neat.

CameronDev@programming.dev · edit-2 1 day ago

That is quite a unique quip. I love the idea of geo-based rendering, every application that renders text needs location access to be strictly correct :D.

I’d go further with the codepoint reduction, and delete w (uu can be used instead) and k (hard c can take its place).

esa@discuss.tchncs.de · edit-2 1 day ago

To unjerk, as it were, it was a thing. So on old systems they’d do stuff like represent æøå with the same code points as {|}. Curly brace languages must have looked pretty weird back then:)
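
A rough sketch of what that looked like, assuming the Norwegian/Danish national variant of ISO 646, where six ASCII punctuation positions were reassigned to Æ, Ø, Å and their lowercase forms. The table name `ISO646_NO` and the helper `as_iso646_no` are made up for illustration; the code only simulates how such a terminal would have displayed ASCII source text.

```python
# Assumed mapping: the Norwegian/Danish ISO 646 variant reused these
# six ASCII positions (0x5B-0x5D and 0x7B-0x7D) for national letters.
ISO646_NO = str.maketrans({
    "[": "Æ", "\\": "Ø", "]": "Å",
    "{": "æ", "|": "ø", "}": "å",
})


def as_iso646_no(text: str) -> str:
    """Show ASCII text the way a Norwegian/Danish ISO 646 display would."""
    return text.translate(ISO646_NO)


c_snippet = "if (a || b) { x[i] = 1; }"
print(as_iso646_no(c_snippet))  # if (a øø b) æ xÆiÅ = 1; å
```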

CameronDev@programming.dev · 1 day ago

It still is a thing in some fonts: https://blog.miguelgrinberg.com/post/font-ligatures-for-your-code-editor-and-terminal

Took me a while to work out what they were called. Font rendering is hard :(

palordrolap@fedia.io · 1 day ago

Those “almost completely forgotten” characters were important when ASCII was invented, and a lot of that data is still around in some form or another. Also, since they’re there, they’re still available for the use for which they were designed. You can be sure that someone would want to re-invent them if they weren’t already there.
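
For anyone who has never seen them used: a minimal sketch of the record separator (RS, 0x1E) and unit separator (US, 0x1F) doing the job CSV usually does. The function names `dump_records` and `load_records` are hypothetical, and the sketch assumes the field data never contains those two control characters itself.

```python
# ASCII reserves four separator control characters (0x1C-0x1F).
FS, GS, RS, US = "\x1c", "\x1d", "\x1e", "\x1f"


def dump_records(records):
    """Join fields with US and records with RS (hypothetical helper)."""
    return RS.join(US.join(fields) for fields in records)


def load_records(blob):
    """Inverse of dump_records: split on RS, then on US."""
    return [record.split(US) for record in blob.split(RS)]


# Commas, quotes and newlines need no escaping, unlike in CSV.
rows = [["Ada", "Lovelace, Countess"], ["Grace", 'said "hi"\nthen left']]
blob = dump_records(rows)
assert load_records(blob) == rows
```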

Some operating systems did assign symbols to those characters anyway. MS-DOS being notable for this. Other standards also had code pages where different languages had different meanings for the byte ranges beyond ASCII. One language might have “é” in one place and another language in another. This caused problems.
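
The kind of problem this caused can be sketched with Python's built-in codecs: the same single byte decodes to a different character depending on which code page the reader assumes. The three code pages here are just examples picked for illustration.

```python
raw = b"\xe9"  # one byte, no encoding label attached

print(raw.decode("latin-1"))  # é  (ISO 8859-1, Western European)
print(raw.decode("cp437"))    # Θ  (original IBM PC / MS-DOS code page)
print(raw.decode("cp1251"))   # й  (Windows Cyrillic)

# And the same letter lands on a different byte in another code page.
print("é".encode("cp437"))    # b'\x82'
```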

Unicode is an extension of ASCII that covers all bases and has all the necessary symbols in fixed places.
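
A quick check of the “extension of ASCII” part, as a sketch: the first 128 Unicode code points are the ASCII characters, and UTF-8 encodes them as the same single bytes. Anything past that range, like Æ, still has one fixed code point but takes more than one byte in UTF-8.

```python
ascii_bytes = bytes(range(128))
text = ascii_bytes.decode("ascii")

# Code points 0-127 are exactly the ASCII values...
assert [ord(ch) for ch in text] == list(range(128))
# ...and UTF-8 leaves pure ASCII data byte-for-byte unchanged.
assert text.encode("utf-8") == ascii_bytes

print(ord("Q"), ord("Æ"))   # 81 198
print("Æ".encode("utf-8"))  # b'\xc3\x86'
```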

That languages X, Y and Z don’t happen to have their alphabets in contiguous runs because they’re extended Latin is a problem, but not something that much can be done about.

It’s understandable that anyone would want their alphabet to be the base language, but one has to be or you end up in code page hell again. English happened to get there first.

If you want a fun exercise (for various interpretations of “fun”), design your own standard. Do you put the digits 0-9 as code points 0-9 or do you start with your preferred alphabet there? What about upper and lower case? Which goes first? Where do you put Chinese?

esa@discuss.tchncs.de · 1 day ago

I’m not entirely sure here, but you are aware you’re in a humour community, yeah?

palordrolap@fedia.io · 1 day ago

I see I’ve forgotten to put on my head net today. You know the one. Looks like a volleyball net. C shape. Attaches at the back. Catches things that go woosh.

The_Decryptor@aussie.zone · 1 day ago

That’s “Extended ASCII”; basic ASCII only has upper- and lowercase Latin characters and things like <, =, >, and ?

And probably half of the control codes are still used, mostly for their original purpose too: teletype systems. They’re just virtual these days.

esa@discuss.tchncs.de · edit-2 1 day ago

No, I’m pretty sure the weird o with the leg is in basic ASCII. It’s also missing Latin characters like Æ. It’s a very weird standard.