
Add font feature API to Text #29695


Open
wants to merge 1 commit into
base: text-overhaul

Conversation

QuLogic
Member

@QuLogic QuLogic commented Mar 1, 2025

PR summary

Font features allow font designers to provide alternate glyphs or shaping within a single font. These features may be accessed via special tags corresponding to internal tables of glyphs.

The mplcairo backend supports font features via an elaborate re-use of the font file path. This commit adds the API to make this officially supported in the main user API.

At this time, nothing in Matplotlib itself uses these settings, but they will have an effect with libraqm. I am opening this PR early for review of the API while I work through some issues with the latter. Consequently, the What's New note will not show the effect of this API, and there are only smoke tests.

PR checklist

@QuLogic
Member Author

QuLogic commented Mar 1, 2025

The What's New entry with libraqm will look something more like this (minus a bug with the kerning, hopefully):
image

@anntzer
Contributor

anntzer commented Mar 1, 2025

A difficulty that I've not really handled with mplcairo but warrants at least discussion is what you want to do with subranges. Harfbuzz supports toggling a feature only for some of the characters (see https://harfbuzz.github.io/harfbuzz-hb-common.html#hb-feature-from-string, e.g. aalt[3:5]=2), which mplcairo just forwards directly to harfbuzz, but this can(?) become problematic for multiline inputs, which get fed (by matplotlib) one-line-at-a-time to the rendering machinery, so something like aalt[3:5] likely(?) gets interpreted as "characters 3-to-5 of each line" rather than "characters 3-to-5 of the entire string".

The two main alternatives I can think of are either to do nothing, like mplcairo (subranges are interpreted as "repeated over each line"), or to reparse ranges and reinterpret them as "indices over the full string" (after line splitting, matplotlib tweaks the actual ranges that get fed to harfbuzz when shaping each line).
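The second alternative can be sketched in plain Python. This is only an illustration, not Matplotlib code: `split_range_by_line` is a hypothetical helper, and it assumes half-open `[start, end)` ranges counted in characters of the full string, with lines separated by `"\n"`.

```python
def split_range_by_line(text, start, end):
    """Map a half-open [start, end) range over *text* to per-line ranges.

    Returns one entry per line: (line, (lo, hi)) when part of the range
    falls on that line, or (line, None) when the line is not affected.
    """
    result = []
    offset = 0  # index in *text* of the first character of the current line
    for line in text.split("\n"):
        # Clip the full-string range to this line, then make it line-relative.
        lo = max(start, offset) - offset
        hi = min(end, offset + len(line)) - offset
        result.append((line, (lo, hi) if lo < hi else None))
        offset += len(line) + 1  # +1 skips the newline itself
    return result


per_line = split_range_by_line("abcd\nefgh", 3, 7)
```

Here the range covers "d", the newline, "e" and "f", so the first line gets `(3, 4)` and the second gets `(0, 2)`. Under the "do nothing" alternative, both lines would instead receive the same unmodified range.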

@anntzer
Contributor

anntzer commented Mar 2, 2025

Also, looking at this again, I wonder how well this will interact with font fallback: it seems not unreasonable that two entries in the font fallback list would want to use different font features (e.g., a latin script and a chinese script likely want very different font features). Perhaps the real question here is whether font fallback should really have been implemented by stashing (references to) multiple fonts in a single FontProperties (though I'm not sure I can immediately design something much better), but the PR here makes the question more salient, I'd say.

@QuLogic
Member Author

QuLogic commented Mar 22, 2025

Since we have automatic wrapping (Text(..., wrap=True)), I think the only possible implementation is to reparse the ranges ourselves.

For settings across font fallback, I guess that is possible to do, but it might require quite a bit of bookkeeping work.

@anntzer
Contributor

anntzer commented Mar 22, 2025

The comments at #29794 (comment) also apply, but because this PR doesn't actually touch the rendering API, I guess it's fine.

@tacaswell
Member

Discussed on a developer call: we decided not to implement the sub-range application yet. Promoting a tuple of strings to dict[tuple[int, int], tuple[str, ...]] is something we can do unambiguously later, and given that we expect most strings to be "short", hopefully the demand for mixed language within one Text object will be low.
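A rough sketch of why that promotion is unambiguous: a plain sequence of feature strings can always be read as one range spanning the whole text. `promote_features` is a hypothetical name for illustration, not part of this PR.

```python
def promote_features(features, text_length):
    """Return features in the ranged form {(start, end): (feature, ...)}.

    A plain sequence of feature strings becomes a single range covering the
    whole text; a mapping that is already ranged is only normalized.
    """
    if isinstance(features, dict):
        return {tuple(rng): tuple(tags) for rng, tags in features.items()}
    return {(0, text_length): tuple(features)}


promoted = promote_features(("smcp", "-liga"), text_length=10)
```

Because the two input shapes (sequence versus mapping) cannot be confused, accepting the ranged form later would not change the meaning of existing calls.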

@anntzer
Contributor

anntzer commented May 7, 2025

Even if we don't explicitly support sub-ranges, we still need to decide what happens if someone writes e.g. fontfeatures=["+kern[3:5]"] (where the whole syntax gets interpreted by harfbuzz). It's probably fine to just document the limitation for now ("the interaction of subranges and multiline text is currently unspecified and behavior may change in the future").
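To make the syntax in question concrete, here is a small sketch that parses a subset of the HarfBuzz feature-string format (an optional `+`/`-` prefix, a four-character tag, an optional `[start:end]` range and an optional `=value`). It is not HarfBuzz's parser and not code from this PR; `parse_feature` is a hypothetical helper, and in this sketch an explicit `=value` takes precedence over the prefix.

```python
import re

_FEATURE_RE = re.compile(
    r"(?P<sign>[+-]?)"                                  # optional on/off prefix
    r"(?P<tag>[A-Za-z0-9]{4})"                          # four-character tag
    r"(?:\[(?P<start>\d*)(?P<colon>:?)(?P<end>\d*)\])?"  # optional range
    r"(?:=(?P<value>\d+))?"                             # optional value
)


def parse_feature(spec):
    """Return (tag, value, start, end); end is None when unbounded."""
    m = _FEATURE_RE.fullmatch(spec)
    if m is None:
        raise ValueError(f"unsupported feature string: {spec!r}")
    if m["value"] is not None:
        value = int(m["value"])
    else:
        value = 0 if m["sign"] == "-" else 1
    start = int(m["start"]) if m["start"] else 0
    if m["end"]:
        end = int(m["end"])
    elif m["start"] and not m["colon"]:
        end = start + 1  # "kern[3]" addresses a single position
    else:
        end = None
    return m["tag"], value, start, end


parsed = parse_feature("+kern[3:5]")
```

For `"+kern[3:5]"` this yields the tag `kern`, value 1, and the range 3 to 5. Whether those indices refer to the whole string or to each line is exactly the part left unspecified above.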

@QuLogic
Member Author

QuLogic commented May 12, 2025

Since we moved the information from FontProperties to Text, I was actually going to implement the splitting. However, I ran into an issue in that the Text object is only supplied to Renderer.draw_text if the string is not multiline:

mtext = self if len(info) == 1 else None

I tried a small change:

diff --git a/lib/matplotlib/text.py b/lib/matplotlib/text.py
index 3b0de58814..b255a93c52 100644
--- a/lib/matplotlib/text.py
+++ b/lib/matplotlib/text.py
@@ -800,9 +800,7 @@ class Text(Artist):
 
             angle = self.get_rotation()
 
             for line, wh, x, y in info:
-
-                mtext = self if len(info) == 1 else None
                 x = x + posx
                 y = y + posy
                 if renderer.flipy():
@@ -816,14 +814,19 @@ class Text(Artist):
                 else:
                     textrenderer = renderer
 
-                if self.get_usetex():
-                    textrenderer.draw_tex(gc, x, y, clean_line,
-                                          self._fontproperties, angle,
-                                          mtext=mtext)
-                else:
-                    textrenderer.draw_text(gc, x, y, clean_line,
-                                           self._fontproperties, angle,
-                                           ismath=ismath, mtext=mtext)
+                xt, yt = self.get_transform().inverted().transform((x, y))
+                with cbook._setattr_cm(self, _x=xt, _y=yt, _text=clean_line,
+                                       convert_xunits=lambda x: x,
+                                       convert_yunits=lambda y: y,
+                                       _horizontalalignment='left', _verticalalignment='bottom'):
+                    if self.get_usetex():
+                        textrenderer.draw_tex(gc, x, y, clean_line,
+                                              self._fontproperties, angle,
+                                              mtext=self)
+                    else:
+                        textrenderer.draw_text(gc, x, y, clean_line,
+                                               self._fontproperties, angle,
+                                               ismath=ismath, mtext=self)
 
         gc.restore()
         renderer.close_group('text')

AFAICT, only the PGF backend uses mtext and under certain conditions will place the text using the original position and alignment instead of the x/y passed to draw_text. Unfortunately, these don't seem to match (I assume the x/y passed in accounts for the descenders and other flourishes), so this breaks the PGF tests. I wonder if there's a better condition to be put in here:

if mtext and (
        (angle == 0 or
         mtext.get_rotation_mode() == "anchor") and
        mtext.get_verticalalignment() != "center_baseline"):

@QuLogic QuLogic changed the base branch from main to text-overhaul June 5, 2025 01:38
@QuLogic QuLogic moved this to Waiting for other PR in Font and text overhaul Jun 5, 2025
@QuLogic QuLogic moved this from Waiting for other PR to Todo in Font and text overhaul Jun 5, 2025
@QuLogic QuLogic linked an issue Jul 1, 2025 that may be closed by this pull request
@QuLogic QuLogic changed the title Add font feature API to FontProperties and Text Add font feature API to Text Jul 1, 2025
@QuLogic QuLogic changed the title Add font feature API to Text Add font feature API to Text Jul 1, 2025
Font features allow font designers to provide alternate glyphs or
shaping within a single font. These features may be accessed via special
tags corresponding to internal tables of glyphs.

The mplcairo backend supports font features via an elaborate re-use of
the font file path [1]. This commit adds the API to make this officially
supported in the main user API.

At this time, nothing in Matplotlib itself uses these settings, but
they will have an effect with libraqm.

[1] https://github.com/matplotlib/mplcairo/blob/v0.6.1/README.rst#font-formats-and-features
@QuLogic
Member Author

QuLogic commented Jul 10, 2025

I've added a note about subranges in multiline text, some comments about where TODO items are left, and rebased on the text-overhaul branch. I also split the previous comment about multiline text into its own issue.

Everything should now pass as well due to preloading from #30231.

@anntzer
Contributor

anntzer commented Jul 10, 2025

Looking at this in relation to #30282, I wonder (sorry for opening again a potential can of worms) whether associating the font features (and likewise the language) with the Text object is really the better design, or whether they should be associated with the FT2Font object; i.e., consider that the same font with different features is really two different font objects (possibly with some caching along the way) and support feature ranges by actually supporting having multiple fonts in the same text string. Then the fundamental rendering function would no longer be a FT2Font method, but a free function like layout(string, [(font0, nchars0), (font1, nchars1), ...]) (returning a list of glyphs that agg can rasterize and the vector backends can embed) where font0 (which includes font features, languages, fallback info) is used for the first nchars0 characters, etc.
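The input shape of such a free function might look like the sketch below. Everything here is hypothetical: `FontSpec` stands in for "a font plus its features and language", and `split_into_runs` only pairs each font with the characters it covers; it does no shaping and returns no glyphs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FontSpec:
    """Hypothetical stand-in for a font together with its settings."""
    family: str
    features: tuple = ()
    language: str = "en"


def split_into_runs(string, runs):
    """Pair each (font, nchars) entry with the substring it covers."""
    if sum(nchars for _, nchars in runs) != len(string):
        raise ValueError("run lengths must add up to the string length")
    out = []
    pos = 0
    for font, nchars in runs:
        out.append((font, string[pos:pos + nchars]))
        pos += nchars
    return out


base = FontSpec("DejaVu Sans")
no_liga = FontSpec("DejaVu Sans", features=("-liga",))
runs = split_into_runs("AV fi", [(base, 3), (no_liga, 2)])
```

In this framing the same family with different features is simply two distinct `FontSpec` values, which is what would let feature ranges fall out of multi-font support.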

@QuLogic
Member Author

QuLogic commented Jul 10, 2025

support feature ranges by actually supporting having multiple fonts in the same text string.

That seems like a fairly large change, and I'm not sure how it would interact with the fact that you can essentially construct FontProperties by passing individual font* keyword arguments to Text. I'm not against the concept, but not sure how it would look.

@QuLogic
Member Author

QuLogic commented Jul 11, 2025

We discussed this idea on the call earlier today, though we did not have quorum to decide anything.

@tacaswell suggested that we provide some lightweight description of ranged-properties to apply to all font properties keyword arguments that could be passed to Text. Essentially, either the single property, or a list of ranges and properties (as in #29794 most likely, since not all these properties are strings.) Internally, Text would flatten this all out into discrete FT2Font objects to be passed to layouting. At the lower level, #30000 already has layout at almost a free-standing level and it wouldn't be too difficult to make that change.

However, as that is an additional large set of changes, we likely want to defer doing anything of the sort right now until after all the libraqm PRs are in. So the plan would be something like:

  1. Add fontfeatures here and language in #29794 (Add language parameter to Text objects) to Text.
  2. Don't document any ranged application in either one (though features would allow it just because libraqm does it within the string, we won't guarantee or document that explicitly.)
  3. In the future (either after the text overhaul is done or maybe in the next release), start accepting either setting | list[tuple[setting, start, end]] for all font properties (font features, language, variant, weight, style).

I'd also like to confirm with @timhoffm whether this seems alright from an API point of view, or if he can see some kind of clash between steps 1 and 3. Or if we'd want to expose this ranged-setting option in a different way to the user, or not at all.

@timhoffm
Member

1 is ok. I have some reservations on 3. I believe list[tuple[setting, start, end]] is error-prone and not user friendly. If I understand correctly, this would mean "this is cumbersome", fontweight=[("normal", 0, 6), ("bold", 6, 8), ("normal", 10, -1)]. What happens if the ranges are not aligned? Is there a default, so that "this is cumbersome", fontweight=[("bold", 6, 8)] would be sufficient? What happens if the ranges overlap? But more importantly, if I change to "the handling is cumbersome", I'd have to re-count the characters.

Multiple formats are effectively a form of rich text. I'd rather either decide not to support that, or alternatively go one step further and consider some sort of mini-language, e.g. a subset of markdown or html as a text-based rich text standard. We'd have to write (or use as an optional dependency) a parser. The parsed text structure would then be transformed to the chain of FT2Font objects.

@story645
Member

I'd rather either decide not to support that, or alternatively go one step further and consider some sort of mini-language, e.g. a subset of markdown or html as a text-based rich text standard.

Would this be akin to something like https://github.com/znstrider/highlight_text or https://github.com/tomicapretto/flexitext?

@QuLogic
Member Author

QuLogic commented Jul 12, 2025

1 is ok. I have some reservations on 3. I believe list[tuple[setting, start, end]] is error-prone and not user friendly. If I understand correctly, this would mean "this is cumbersome", fontweight=[("normal", 0, 6), ("bold", 6, 8), ("normal", 10, -1)]. What happens if the ranges are not aligned? Is there a default, so that "this is cumbersome", fontweight=[("bold", 6, 8)] would be sufficient? What happens if the ranges overlap? But more importantly, if I change to "the handling is cumbersome", I'd have to re-count the characters.

Anything uncovered would use the default (rcParams, effectively.) I can't say about overlap, though I'd expect an error. I do agree that this is somewhat cumbersome, though.
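Those semantics (uncovered characters fall back to the default, overlapping ranges are an error) could be sketched as follows. `resolve_ranges` is a hypothetical helper, not a proposed API; it assumes half-open ranges and does not handle negative indices such as the `-1` in the example above.

```python
def resolve_ranges(length, default, ranges):
    """Expand [(setting, start, end), ...] into one setting per character."""
    resolved = [default] * length
    covered = [False] * length
    for setting, start, end in ranges:
        if not 0 <= start <= end <= length:
            raise ValueError(f"range ({start}, {end}) is outside 0..{length}")
        for i in range(start, end):
            if covered[i]:
                raise ValueError(f"character {i} is covered by more than one range")
            covered[i] = True
            resolved[i] = setting
    return resolved


text = "this is cumbersome"
weights = resolve_ranges(len(text), "normal", [("bold", 5, 7)])
```

This bolds only the characters at indices 5 and 6 (the word "is"). It also shows the usability concern: editing the text shifts those indices, so the ranges have to be recounted by hand.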

Multiple formats are effectively a form of rich text. I'd rather either decide not to support that, or alternatively go one step further and consider some sort of mini-language, e.g. a subset of markdown or html as a text-based rich text standard. We'd have to write (or use as an optional dependency) a parser. The parsed text structure would then be transformed to the chain of FT2Font objects.

I did consider that after the call, and I expect something HTML-like is most likely to be familiar to users. I haven't determined how difficult that would be to implement.
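As a very rough feasibility sketch, the standard library's `html.parser` can already turn a tiny HTML-like subset into styled runs. The supported tags (`<b>`, `<i>`) and the `(text, weight, style)` run format are assumptions made for this illustration, not a proposed design.

```python
from html.parser import HTMLParser


class RunParser(HTMLParser):
    """Collect (text, weight, style) runs from a tiny HTML-like subset."""

    def __init__(self):
        super().__init__()
        self.runs = []
        self._open = []  # stack of currently open tags

    def handle_starttag(self, tag, attrs):
        if tag not in ("b", "i"):
            raise ValueError(f"unsupported tag <{tag}>")
        self._open.append(tag)

    def handle_endtag(self, tag):
        if not self._open or self._open[-1] != tag:
            raise ValueError(f"unexpected </{tag}>")
        self._open.pop()

    def handle_data(self, data):
        weight = "bold" if "b" in self._open else "normal"
        style = "italic" if "i" in self._open else "normal"
        self.runs.append((data, weight, style))


def parse_rich_text(markup):
    parser = RunParser()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return parser.runs


runs = parse_rich_text("this <b>is</b> cumbersome")
```

Unlike index ranges, the markup travels with the text, so rewording the string does not require recounting characters.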

Would this be akin to something like https://github.com/znstrider/highlight_text or https://github.com/tomicapretto/flexitext?

These are good, though I think one downside is that they need to do layout themselves. Each different font block (including bold vs normal) is laid out separately and joined with Packers AFAICT. With libraqm though, we are able to individually pick fonts per character, thus getting a true holistic layout for a single-line string (and libraqm even supports configurable spacing between characters/words). So I would like to get something implemented here at some point, but it's maybe not a rush to do so.

For now though, I think we'll stick with 1 and not specifically try to do any ranged setting applications.

@timhoffm
Member

For now though, I think we'll stick with 1 and not specifically try to do any ranged setting applications.

That’s a reasonable first step and we can still go either way later.

@khaledhosny

I wonder (sorry for opening again a potential can of worms) whether associating the font features (and likewise the language) with the Text object is really the better design, or whether they should be associated with the FT2Font object; i.e., consider that the same font with different features is really two different font objects (possibly with some caching along the way) and support feature ranges by actually supporting having multiple fonts in the same text string.

This would break interaction between parts of text that use different features. For example, take the string "AV" where the user wants to enable a feature on the "V" glyph only: if this results in two fonts, then no kerning will be applied between the two glyphs.
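A toy model of that effect, with made-up numbers: the advance widths and the kerning adjustment below are invented for illustration (real values come from the font), and `run_width` is a hypothetical stand-in for shaping one run.

```python
# Invented advance widths and pair adjustment, in font units.
ADVANCE = {"A": 700, "V": 700}
KERNING = {("A", "V"): -80}


def run_width(text):
    """Width of a run shaped as one unit, applying pair kerning within it."""
    width = sum(ADVANCE[ch] for ch in text)
    width += sum(KERNING.get(pair, 0) for pair in zip(text, text[1:]))
    return width


together = run_width("AV")                  # one run: the A-V pair is seen
separate = run_width("A") + run_width("V")  # two runs: the pair is never seen
```

In this model the single run is narrower than the two separate runs, because the pair adjustment can only apply when both glyphs are shaped together.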

Projects
Status: Ready for Review
Development

Successfully merging this pull request may close these issues.

OTF feature support (alternate figure styles, etc.)
6 participants