Page MenuHomePhabricator

Bug 1584000 - Migrate glyph to character association code from libThebes to graphite for sandboxed libGraphite performance r=jfkthame,froydnj
ClosedPublic

Authored by shravanrn on Sep 27 2019, 4:42 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sun, May 4, 1:28 PM
Unknown Object (File)
Mar 17 2025, 3:09 AM
Unknown Object (File)
Jan 11 2025, 6:26 PM
Unknown Object (File)
Jan 10 2025, 3:04 AM
Unknown Object (File)
Jan 9 2025, 12:01 AM
Unknown Object (File)
Jan 1 2025, 8:32 PM
Unknown Object (File)
Dec 19 2024, 1:06 PM
Subscribers

Diff Detail

Repository
rMOZILLACENTRAL mozilla-central
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
gfx/graphite2/src/GraphiteExtra.cpp
1 ↗(On Diff #169384)

This file needs an MPL header.

10–13 ↗(On Diff #169384)

This isn't equivalent to NS_ASSERTION, so it really shouldn't reuse that name.

80 ↗(On Diff #169384)

Rather than allocating a record here and returning a pointer, can we have the caller pass a reference to a local (stack) variable and just have this function fill it in, thus eliminating a heap allocation?

89–96 ↗(On Diff #169384)

The NS_ASSERTIONs here (which aren't the same as Gecko's normal NS_ASSERTION) will abort() if any of the allocations failed. That's not desirable; we'd prefer to gracefully fail to render the text, but not kill the process altogether. Compare the existing code in gfxGraphiteShaper, which uses fallible array allocations and just returns an error nsresult if allocation failed.

Further: do we really need to do these allocations from within the (sandboxed) code here, or could we have the caller pass in the buffers that it wants filled? The current code avoids doing these four heap allocations entirely in the common case, by using AutoTArray so that everything just lives on the stack unless the text run being shaped is unusually long.

gfx/graphite2/src/moz.build
11

Can we do this without actually putting the GraphiteExtra files directly into the graphite source and include directories? Given that this isn't part of the upstream library (which we periodically update from new upstream release tarballs), it'd be nice if we can segregate our added chunk of code to avoid confusion. Even if it's just putting our additions into a gfx/graphite2/geckoextra/ folder instead of inside upstream's /src/ and /include/.

gfx/graphite2/src/GraphiteExtra.cpp
1 ↗(On Diff #169384)

will fix!

10–13 ↗(On Diff #169384)

will fix!

61–67 ↗(On Diff #169384)

So this change was intentional to ensure the code in this module never directly uses aText.

From the security side, the general idea is that the less we share with the sandbox the better. In this case, I don't think sharing aText one way or the other matters, as i suspect aText is inferrable from the other data available to this function. However, if this is not the case, this is definitely a win. Please let me know about this.

From the performance side, once graphite is sandboxed, having graphite code using aText would require making a copy of aText in memory accessible by the sandboxed libGraphite. So the tradeoff here is function call for InMiddleOfSurrogatePair that calls into thebes or making a copy of aText. If you have a sense for one being more performant than the other, please let me know.

80 ↗(On Diff #169384)

will fix!

89–96 ↗(On Diff #169384)

Ah, good catch. Will fix fallible allocations.

I think being able to use AutoTArray gets tricky, as the graphite code is unaware of AutoTArray. Would you know if AutoTArray is guaranteed to be continuous storage i.e. Is it safe to cast AutoTArray<float> to a float*? If not, we would need to pull in AutoTArray into graphite, which I don't think would be a good idea...

Alternately, another option that retains the performance would be to have the glyph_to_char_association code exist both in thebes and in graphite. The unsandboxed version can use the implementation in thebes. While the sandboxed version can use the one in graphite.

Let me know any thoughts!

gfx/graphite2/src/moz.build
11

will fix!

gfx/graphite2/src/GraphiteExtra.cpp
89–96 ↗(On Diff #169384)

Yes, an AutoTArray's storage will always be a contiguous block. So if you have

AutoTArray<float, N> myarray;
bool ok = myarray.SetLength(len, fallible);
// if !ok, allocation failed so we have to bail out!

then myarray.Elements() will give you the float* you want, guaranteed to be sized for len elements (either on the stack, if it doesn't exceed the AutoTArray's built-in buffer size N, or on the heap).

gfx/graphite2/src/GraphiteExtra.cpp
61–67 ↗(On Diff #169384)

So the sandboxed code is not able to dereference aText, is that what you're saying? OK - then I guess it's a question of whether it is cheaper to make an in-sandbox copy of aText or to call back out once per character. (Roughly -- the actual number of callbacks will depend on the shaping behavior, but it'll generally be approximately equivalent to the number of characters.)

I'm under the impression that the transitions in/out of the sandboxed code are fairly expensive, so that makes me suspect that it'll probably be cheaper to make a copy of the text (once) than to do a per-character callback, but maybe you'll want to measure each option to see if that's correct. I wouldn't particularly trust my intuition about this, as I don't really know anything about how this interface works.

gfx/graphite2/src/GraphiteExtra.cpp
61–67 ↗(On Diff #169384)

Yup sandboxed graphite can't dereference aText without making a copy. I'll look into the performance and follow up.

Also just wanted to confirm the security related question above... i.e. say we go the callback route and never expose aText to graphite directly, can graphite "learn" the value of aText from the other inputs such as aSegment?

froydnj added 1 blocking reviewer(s): jfkthame.

Jonathan has caught all the things that I would have suggested--and many things that I wouldn't have!

gfx/graphite2/src/GraphiteExtra.cpp
61–67 ↗(On Diff #169384)

Yes - we've passed the same text buffer to gr_make_seg already, and graphite must have iterated over its contents in order to do its work at all. So presumably there's already a sandbox-side copy of aText being created for gr_make_seg. We should just use the same thing here.

gfx/graphite2/src/GraphiteExtra.cpp
89–96 ↗(On Diff #169384)

@jfkthame - I am hoping to avoid the use of AutoTArray here. Could you please confirm if this is ok? The reasoning is provided below.

So I was playing with this some more. The difficulty here is that small array optimizations (stack based arrays) are not permitted with a sandboxed graphite. Thus if we want to use AutoTArray for the unsandboxed graphite, we do have to introduce an branch in the code, probably something like

#ifdef MOZ_SANDBOXED_GRAPHITE
  gr_glyph_to_char_cluster* clusters = new gr_glyph_to_char_cluster[aLength];
  uint16_t* gids = new uint16_t[glyphCount];
  float* xLocs = new float[glyphCount];
  float* yLocs = new float[glyphCount];
#else
  AutoTArray<gr_glyph_to_char_cluster, SMALL_GLYPH_RUN> clusters;
  AutoTArray<uint16_t, SMALL_GLYPH_RUN> gids;
  AutoTArray<float, SMALL_GLYPH_RUN> xLocs;
  AutoTArray<float, SMALL_GLYPH_RUN> yLocs;
#endif

In general I have been trying to avoid these sort of changes in favor of code that works with both the sandboxed and unsandboxed graphite. Code in the above style tends to get harder to maintain over the long term.

What are your thoughts on just moving to heap allocations for these structures?

Address comments for thebes code migration

gfx/graphite2/geckoextra/src/GraphiteExtra.cpp
15–28

It'd be good to explicitly #undef these CHECK... and NS_IS_... macros at the end of the file, to avoid the risk of possible conflicts when unified compilation is used and other source files end up as part of the same compilation unit.

34

gids should be declared uint16_t rather than unsigned short, I think.

Oh, and Gecko style wants the a prefix on all arguments, so ugly though it is, I guess they should be aClusters, aGids, aXLocs, aYLocs. Though actually... why not just pass in the gr_glyph_to_char_association pointer here?

87

That comment doesn't look right!

92–103

OK... it's a bit unfortunate to have to heap-allocate here, but can we at least minimize the number of allocations? Something along the lines of

char* buffer = malloc(sizeof(gr_glyph_to_char_association) +
    aLength * sizeof(gr_glyph_to_char_cluster) +
    glyphCount * (2 * sizeof(float) + sizeof(uint16_t)));
gr_glyph_to_char_association* data = (gr_glyph_to_char_association*)(buffer);
data->clusters = (gr_glyph_to_char_cluster*)(data + 1);
data->xLocs = (float*)(data->clusters + aLength);
data->yLocs = data->xLocs + glyphCount;
data->gids = (uint16_t*)(data->yLocs + glyphCount);

(with added failure checking and appropriate C++ casts) would condense the five allocate/free pairs into just one, which should reduce the overhead.

105–108

Oh, and if we combine all the above into a single allocation, we can just use calloc() rather than malloc() or new and it'll be pre-zeroed for us, perhaps more efficiently.

gfx/thebes/gfxGraphiteShaper.cpp
194–195

I think you've removed all users of this constant (right?), so it can just go away.

Fix more comments for thebes code migration

Need to fix failing tests

gfx/graphite2/geckoextra/src/GraphiteExtra.cpp
113–117

This looks a bit broken - size1 twice, and no size4?

More generally, this feels over-engineered to me. We know exactly what types we're dealing with, and don't need such verbose and generalized code here.

As a starting point, calloc will give us a pointer that is "suitably aligned for any object type", so we don't need to align it any further. From there, we can keep track of the alignment requirements of the various objects, basically working from larger to smaller: 16-byte cluster records (but their fields only need 4-byte alignment), then 4-byte floats, and finally 2-byte shorts, and know that we won't be misaligning anything.

A couple of static_asserts to verify that the sizes/alignment requirements of the types are indeed what we expect would probably be a good idea, in case we someday meet a weird architecture, but I don't think we need to go to the lengths here.

(And we don't need a separate memoryPool field; we can always use the beginning of the calloced block as the gr_glyph_to_char_association record itself, so that's what we'll need to free later.)

gfx/graphite2/geckoextra/src/GraphiteExtra.cpp
113–117

Ah, thanks! Saved me some debugging. Had everything working, and then I did some formatting changes which broke tests, so knew I had a bug somewhere. 👍

The alignment suggestions makes sense. Will fix.

Simplify and correct memory pool allocation

gfx/graphite2/geckoextra/include/GraphiteExtra.h
8–9

The name of the #include guard doesn't match the filename!

35

Use uint32_t for aLength, please, to match usage in gfxGraphiteShaper.

Actually, maybe we should instead (but equivalently) go with graphite's gr_uint32, as this is effectively being added to the graphite library as an extra API. That would apply to all the various integer types here. I don't feel strongly either way, really - the stdint types or the graphite ones - but let's be consistent in using them, and not this unsigned int, given that we use uint32_t for text lengths all over the place.

gfx/graphite2/geckoextra/src/GraphiteExtra.cpp
15–19

I'm not thrilled about seeing abort() here, where it ends up replacing (currently non-fatal, debug-only assertions). If we really need to terminate the program if one of these checks fails - which we don't currently - then we should be using MOZ_CRASH to do so with an appropriate "reason" message.

When this is sandboxed, though, I guess MOZ_CRASH probably isn't an option; I'm curious, actually, what would abort() do if called from within the WASM-sandboxed code?

I think instead we should make gr_get_glyph_to_char_association safely bail out if one of these consistency checks fails, freeing the allocated record and returning null. (This may be simpler to implement if you inline the loop - where those checks and potential bail-outs happen - rather than having it in a separate helper function.)

This CHECK macro would then become something more like

#define CHECK(cond, str) \
  do { \
    if (!(cond)) { \
      free(memoryPool); \
      return nullptr; \
    } \
  while (0)
21

This shouldn't be needed, see below.

107–111

In theory, these multiplications could overflow (although in practice something else would probably have broken before aLength or aGlyphCount got that big). To be safer, though, don't store these into size_t variables, which might not be large enough (on a 32-bit build)...

113–118

...instead, just add them directly to a uint64_t total, which can't overflow given the sizes of the types and the fact that the multipliers (counts) are 32-bit values.

Then a single check that total < SIZE_MAX should be all you need to ensure the result is valid, rather than this repeated use of ADD_UNSIGNED_NO_OVERFLOW.

146–147

Need to handle allocation failure here.

shravanrn marked 7 inline comments as done.

Address code review comments

This looks much nicer, thanks. Just a couple remaining tweaks I'd like to see - just code-style nits rather than anything functional - otherwise I think this should be ready to go.

gfx/graphite2/geckoextra/src/GraphiteExtra.cpp
20

This use of static_assert feels strange to me; it's not a convention we use anywhere in the codebase, AFAIK, and it's not really asserting anything useful.

A more common idiom would be to wrap the macro body in do { ... } while (false). Though in this case, where you don't have multiple statements to package up, another option would be

#define CHECK(cond, str) if (cond) { /*OK, do nothing*/ } else return false
31

This is a boolean success/failure result, so please use a bool success value rather than an int -- so return true for success and false for failure. Adjust the CHECK macro and return value accordingly...

149–151

...and with the change to a bool, this becomes more like

bool succeeded = LoopThrough(...);
if (!succeeded) {
  gr_free...(...);
  return nullptr;
}

which IMO reads better.

shravanrn marked 3 inline comments as done.

Address code review comments

gfx/graphite2/geckoextra/include/GraphiteExtra.h
8–9

Ah... a symptom of too much refactoring. Will fix!

35

Yeah, my original concern with uint32_t types was that the RLBox APIs handling of uint32_t types was not fully implemented. However, I have confirmed that this should be fine. Will fix!

gfx/graphite2/geckoextra/src/GraphiteExtra.cpp
15–19

When this is sandboxed, though, I guess MOZ_CRASH probably isn't an option

Yup, we cannot directly use MOZ_CRASH. The function would have to return some error code which thebes code would have to call MOZ_CRASH on... However, I think its fine to fail and return null pointer here as the thebes code would just bail out without crashing

I'm curious, actually, what would abort() do if called from within the WASM-sandboxed code?

The way I have setup the wasm sandbox, the abort does the same thing as an abort in Firefox... invokes the Firefox signal handler...

20

Sure, I can change it. I generally prefer static_assert as you can use it outside function bodies as well (say in class definitions)

I think this should be fine, modulo one adjustment as noted inline - thanks.

gfx/graphite2/geckoextra/src/GraphiteExtra.cpp
21

This isn't quite what I meant; the convention is to wrap the entire macro body in do ... while, so I would expect this to be

#define CHECK(cond, str) \
    do { \
      if (!(cond)) { \
        return false; \
      } \
    } while (false)

which encapsulates the macro body so that it behaves syntactically more like a simple statement. Note that do ... while (false) is guaranteed to execute its body exactly once, because the always-false termination test is at the end of the loop.

(See also the [definitions of PR_BEGIN_MACRO and PR_END_MACRO](https://searchfox.org/mozilla-central/rev/d7537a9cd2974efa45141d974e5fc7c727f1a78f/nsprpub/pr/include/prtypes.h#123-124); I guess those definitions themselves probably aren't accessible here, but you can just do the same thing explicitly.)

This revision is now accepted and ready to land.Oct 21 2019, 8:11 PM

@jfkthame Just before landing I wanted to run one extra test across all platforms. It looks like the Windows opt builds are causing issues
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a7f69424cf9bb518d569ad534fb8c5dc2fde2add&selectedJob=272244454
I'm unvestigating this further, although this is a little difficult to debug as this is happening only on Windows + optimized builds... Please let me know you have any suggestions

@jfkthame Just before landing I wanted to run one extra test across all platforms. It looks like the Windows opt builds are causing issues
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a7f69424cf9bb518d569ad534fb8c5dc2fde2add&selectedJob=272244454
I'm unvestigating this further, although this is a little difficult to debug as this is happening only on Windows + optimized builds... Please let me know you have any suggestions

It's hitting an array-bounds violation. Given how this is only on Windows opt builds, I'm guessing there might be a usage of an uninitialized value or something like that, where other compilers or debug builds end up with it zero-initialized "by chance", but the opt builds get a random value. If just inspecting the code doesn't find anything, maybe running under valgrind on Linux would detect it and point to the exact place.

It's hitting an array-bounds violation. Given how this is only on Windows opt builds, I'm guessing there might be a usage of an uninitialized value or something like that, where other compilers or debug builds end up with it zero-initialized "by chance", but the opt builds get a random value. If just inspecting the code doesn't find anything, maybe running under valgrind on Linux would detect it and point to the exact place.

Valgrind turned out to be too noisy... So, I've setup a local Windows build... Took a while to work through build system issues.

It looks like this is some sort of linking bug... the error is actually thrown on the first call to graphite (i.e.gr_seg_n_slots in the function gr_get_glyph_to_char_association). I'm not sure why exactly this is happening, but will play with this some more.
Let me know if you have any thoughts - (perhaps some bug in how I've use the moz.build?)

Fix moz.build for windows builds

This latest patch fixes the moz.build on Windows as well. Tested on my machine. However, try is down for now so will check again tomorrow

@jfkthame : So I have confirmed that everything has been fixed locally on windows build. But it looks like the windows machines on try are still down (See here). Any thoughts on how we can proceed? Also, when we are ready, I just wanted to mention that I don't have permissions to land code, so I was hoping you could help land this?

@jfkthame : So I have confirmed that everything has been fixed locally on windows build. But it looks like the windows machines on try are still down (See here).

The logs there show things like

[task 2019-10-28T22:05:37.188Z] 22:05:37     INFO -  FATAL ERROR PROCESSING MOZBUILD FILE
[task 2019-10-28T22:05:37.188Z] 22:05:37     INFO -  ==============================
[task 2019-10-28T22:05:37.188Z] 22:05:37     INFO -  The error occurred while processing the following file:
[task 2019-10-28T22:05:37.188Z] 22:05:37     INFO -      z:/build/build/src/gfx/angle/targets/angle_common/moz.build
[task 2019-10-28T22:05:37.188Z] 22:05:37     INFO -  The underlying problem is we referenced a path that does not exist. That path is:
[task 2019-10-28T22:05:37.188Z] 22:05:37     INFO -      z:/build/build/src/gfx/graphite2/geckoextra/moz.build
[task 2019-10-28T22:05:37.189Z] 22:05:37     INFO -  Either create the file if it needs to exist or do not reference it.

Did you create a graphite2/geckoextra/moz.build file locally, but forget to hg add it to the patch?

Oh, on second glance, I think the problem is that you deleted the geckoextra/moz.build file, but didn't remove the reference to geckoextra in gfx/moz.build's DIRS variable, so the build system is still looking for it.

Yup, that was definitely it. I am sorting out one last issue and will update the code review.