
A look at terminal emulators, part 1 [LWN.net]

A look at terminal emulators, part 1

Posted Mar 31, 2018 5:06 UTC (Sat) by gutschke (subscriber, #27910)
Parent article: A look at terminal emulators, part 1

Testing Unicode compliance makes sense. It would have been nice to also test things like combining characters, although I honestly don't know how frequently they are encountered in real-life scenarios.
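
As a rough illustration of why combining characters matter to a terminal, here is a minimal Python sketch using only the standard `unicodedata` module. The helper name `naive_cell_count` is made up for this example. It assumes every code point with a zero canonical combining class occupies exactly one cell, which ignores wide East Asian characters and marks whose combining class is 0, so it is not how any particular terminal measures text.

```python
import unicodedata

def naive_cell_count(text):
    """Count code points whose canonical combining class is 0.

    Simplification (an assumption of this sketch): each such code point
    is treated as one cell wide, and every other code point is treated
    as attaching to the previous cell.
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)

decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT
precomposed = unicodedata.normalize("NFC", decomposed)  # single code point U+00E9

# Both spellings should take one cell, although they differ in length.
print(len(decomposed), naive_cell_count(decomposed))
print(len(precomposed), naive_cell_count(precomposed))
```

The two strings look the same on screen but differ in code point count, so a terminal that advances the cursor once per code point would misplace everything after the accent.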

Testing right-to-left text is interesting though, as I am not even sure there is an unambiguously sane way for terminals to interpret changes in direction. What is supposed to happen if I explicitly set the cursor position adjacent to existing mixed-direction text? Should the existing text at this cursor position influence how new text is formatted? Or should that only happen if I output a sequence of characters without intervening escape codes? How about when I change the terminal from overwrite into insert mode? Does that change behavior? How does cursor positioning work when it encounters mixed text? Is the answer different for relative and absolute cursor positioning commands?

While it is certainly possible to come up with *an* answer to these questions, I doubt it is the only answer. I'd expect that each implementation does something different, resulting in behavior that is so hard to predict that no application could rely on anything but the most basic set of features. But who knows, maybe there is a standard for this and I just don't know about it.

All of this gets even more complex for languages like Arabic that require ligatures (or shaping) to make the text intelligible. You can no longer think of a word as being made up of distinct characters; instead, you have to look at most or all of the word to figure out how to render it. I have a really hard time seeing how terminal emulators can provide a sane implementation with these constraints. It pretty much breaks all assumptions about a fixed grid of characters, and about the ability to arbitrarily position the cursor at any X/Y coordinate. Y coordinates still make sense (until we consider vertically rendered text), but X coordinates are getting increasingly fuzzy.

I'd love to hear if there is a universally agreed-upon convention for handling this problem, and whether there are any terminal emulators that correctly implement all the corner cases.



A look at terminal emulators, part 1

Posted Mar 31, 2018 9:40 UTC (Sat) by dottedmag (subscriber, #18590) [Link]

There is an annex to the Unicode standard, http://unicode.org/reports/tr9/ (the Unicode Bidirectional Algorithm), implemented in many browsers and UI toolkits, as they have to implement the same editing features (though not on a fixed character grid).

A look at terminal emulators, part 1: more on bidi and such

Posted Apr 2, 2018 4:08 UTC (Mon) by tzafrir (subscriber, #11501) [Link] (1 responses)

Joining of Arabic (and similar scripts: Farsi and some others) was indeed not supported here. Though I'm not sure it's that much of a complication with fixed-width fonts: in the Arabic script the shape of a letter may (and usually does) change depending on its place in the word. But it would still be a single character occupying a single character's space in a terminal. It does mean that the shapes of neighbouring letters may change when you change a single letter. How does that work with the various terminal emulators in this review?
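
To make the "one letter, several shapes" point concrete, here is a small Python sketch using the standard `unicodedata` module. It relies on the legacy Arabic presentation-form code points, which encode each contextual shape separately. That is only a convenient way to name the shapes here; it is an assumption of this example and not a claim about how any terminal or font actually performs shaping.

```python
import unicodedata

BEH = "\u0628"  # ARABIC LETTER BEH, as stored in ordinary text

# Presentation-form code points for the four contextual shapes of BEH
FORMS = ["\ufe8f", "\ufe90", "\ufe91", "\ufe92"]

for form in FORMS:
    # NFKC normalization folds each presentation form back to the base letter
    base = unicodedata.normalize("NFKC", form)
    print(f"U+{ord(form):04X} {unicodedata.name(form)} -> U+{ord(base):04X}")
```

Each shape is still a single code point that maps back to the same base letter, which fits the observation that the letter keeps one cell while its appearance depends on its neighbours.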

Though some south-east Asian scripts are indeed more complex and break that assumption of one character per space.

I think this was not mentioned in the review, but Konsole's bidi support is optional (it's an option you have to tick somewhere deep in the menus). Indeed there's one common use case where bidi is annoying: if you use an Israeli locale, the day of the week and the month name are in Hebrew, and thus the output of ls has some Hebrew in it. Some terminals would make a mess of it, aligning some file names to the right and some file names to the left.

Generally in my experience mlterm works relatively well for editing Hebrew. Konsole: less so.

A look at terminal emulators, part 1: more on bidi and such

Posted Apr 10, 2018 13:24 UTC (Tue) by pjm (guest, #2080) [Link]

I actually object to the article's "they should be rendered right to left when displayed" and "handle this correctly [i.e. rendering right to left]": terminal input doesn't really have a good concept of where paragraphs start or end, what the base directionality of the paragraph is, and so on, so it's hard to say that any attempted bidi rendering really is "correct". Picking arbitrary answers such as "each line is considered a new paragraph" makes text really hard to follow due to the incorrect reordering for text that doesn't match that arbitrary answer.
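
The "each line is a new paragraph" choice can be sketched in a few lines of Python. The function below guesses a base direction from the first strong character of a line, a simplification of rules P2 and P3 in UAX #9 that ignores directional isolates and embeddings. The name `base_direction` and the sample lines are made up for illustration.

```python
import unicodedata

def base_direction(line, default="ltr"):
    """Guess a line's base direction from its first strong character.

    Simplified from rules P2 and P3 of UAX #9: directional isolates
    and embeddings are not handled in this sketch.
    """
    for ch in line:
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            return "ltr"
        if cls in ("R", "AL"):
            return "rtl"
    return default  # no strong character found

lines = [
    "-rw-r--r-- 1 user user 42 notes.txt",  # starts with Latin letters
    "\u05e9\u05dc\u05d5\u05dd.txt",          # starts with Hebrew letters
    "12345",                                # digits only, no strong character
]
for line in lines:
    print(base_direction(line))
```

Under this heuristic, adjacent lines of one listing can end up with different base directions purely because of which character happens to come first, which is the kind of arbitrary answer the comment objects to.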

While rendering rtl text left to right does make that text very slow to read, it does at least have the advantage that it's clear what the logical order is (avoiding the ls problem referred to above), so I would often prefer the ltr approach when using a terminal. However, I'll grant that this is only viable because I don't use an rtl locale for LC_MESSAGES.

(I wonder, is there any value in the Mongolian solution, i.e. rotating the text so that everything is written top-to-bottom instead of either ltr or rtl? 90°-rotated text is still a bit slower to read than unrotated text, but much quicker than reading ltr isolated-form letters for Arabic script.)

A look at terminal emulators, part 1

Posted Mar 7, 2019 15:27 UTC (Thu) by nicm (guest, #50555) [Link]

> combining characters, although I honestly don't know how frequently they are encountered in real-life scenarios.

Regular combining characters are common, much more common among terminal users than right-to-left text in my experience.

It would be interesting to know if any terminals support the zero width joiner (U+200D); I don't believe many do.
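
For reference, a short Python sketch (standard `unicodedata` module only) of what a zero width joiner sequence looks like at the code point level. The emoji sequence chosen here is just an illustrative example, and the sketch says nothing about how a given terminal renders it.

```python
import unicodedata

ZWJ = "\u200d"
# WOMAN + ZERO WIDTH JOINER + PERSONAL COMPUTER, an emoji ZWJ sequence
sequence = "\U0001F469" + ZWJ + "\U0001F4BB"

for ch in sequence:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} ({unicodedata.category(ch)})")

# The joiner is a format character (category Cf) with combining class 0,
# so a check based only on combining classes will not detect it.
print(unicodedata.combining(ZWJ))
```

A terminal without joiner support would typically see three separate code points here rather than one glyph, so its idea of the cursor column can disagree with that of an application which treats the sequence as a single unit.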

A look at terminal emulators, part 1

Posted Sep 12, 2019 11:55 UTC (Thu) by egmont (guest, #134373) [Link]

@gutschke You raise excellent points about BiDi.

Over the last 1.5 years (after you made your comment, and before I make this one now) I spent a lot of time pondering these issues and studying existing practice and an ancient specification on the topic, and came up with a draft specification for the desired behavior, which is available at https://terminal-wg.pages.freedesktop.org/bidi/. It hopefully provides a good enough answer to your questions.

I also implemented this feature accordingly in GNOME Terminal 3.34, or more precisely, VTE 0.58, which are getting officially released today. With this, all other VTE-based terminal emulators (including Tilix, Terminator, Xfce Terminal, ROXTerm, Guake and a whole lot more) also get BiDi support according to this new emerging specification.

I hope you'll like it!


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds








