Page MenuHomePhabricator

Bug 1598380 - Use single PBO per texture for uploads. r=kvark
ClosedPublic

Authored by jnicol on Dec 9 2019, 3:56 PM.

Details

Summary

Uploading texture data is showing up frequently in profiles on Adreno devices,
especially when zooming on a text-heavy page. Specifically, the time is spent in
glMapBufferRange and glBufferSubData, most likely when internally allocating the
buffer before transferring data in to it.

Currently, we are creating a new PBO, by calling glBufferData(), for each
individual upload region. This change makes it so that we calculate the required
size for all upload regions to a texture, then create single a PBO of the
required size. The entire buffer is then mapped only once, and each individual
upload chunk is written to it. This can require the driver to allocate a large
buffer, sometimes multiple megabytes in size. However, it handles this case much
better than allocating tens or even hundreds of smaller buffers.

An upload chunk may require more space in a PBO than the original CPU-side
buffer, so that the data is aligned correctly for performance or correctness
reasons. Therefore it is the caller of Device.upload_texture()'s responsibility
to call a new function, Device.required_upload_size(), to calculate the required
size beforehand.

On AMD Macs, there is a bug where PBO uploads from a non-zero offset can
fail. See bug 1603783. Therefore this patch preserves the current behaviour on
AMD Mac, reallocating a new PBO for each upload, therefore ensuring the offset
is always zero.

Diff Detail

Repository
rMOZILLACENTRAL mozilla-central
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

jnicol created this revision.Dec 9 2019, 3:56 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 9 2019, 3:56 PM
phab-bot requested review of this revision.Dec 9 2019, 3:56 PM
phab-bot changed the visibility from "Custom Policy" to "Public (No Login Required)".
phab-bot changed the edit policy from "Custom Policy" to "Restricted Project (Project)".
phab-bot removed a project: Restricted Project.
kvark accepted this revision.Dec 9 2019, 6:46 PM

Thank you for this change!

It's an improvement for allocating less small PBOs. It comes at a cost that the PBO sizes we allocate are not regular.
There is also an inconvenience involved in the fact we have to go through the uploads twice now: once to estimate the size, and once to do a copy.

I'll file a bug to address this as a follow-up.

gfx/wr/webrender/src/device/gl.rs
2714

we have the same assert below

3656–3657

shouldn't this be required_upload_size()?

This revision is now accepted and ready to land.Dec 9 2019, 6:46 PM
jnicol added a comment.Dec 9 2019, 7:44 PM

Thanks for the quick review!

It comes at a cost that the PBO sizes we allocate are not regular.

This was the case before too, but I agree it's a definite disadvantage that we should address in follow up.

The double pass is unfortunate, but better than my initial draft's 3, where the debug clears were done separately to size calculation :) will also be good to get rid of this in follow up!

gfx/wr/webrender/src/device/gl.rs
2714

Different functions though. The user should be passing the same stride to both, but asserting in both seems sensible to me.

3656–3657

Good catch, now that the function exists we should use it!

jnicol added inline comments.Dec 9 2019, 10:09 PM
gfx/wr/webrender/src/device/gl.rs
3656–3657

This is actually a bit icky to do, because the TextureUploader doesn't have access to the Device to call that function. But it can be made a static function (or whatever we call that in rust.) Hopefully we'll be replacing this code soon anyway, so it's not the end of the world.

jnicol updated this revision to Diff 205145.Dec 9 2019, 10:17 PM
CosminS reopened this revision.Dec 12 2019, 6:43 PM
This revision is now accepted and ready to land.Dec 12 2019, 6:43 PM
jnicol updated this revision to Diff 209087.Dec 17 2019, 2:11 PM
jnicol edited the summary of this revision. (Show Details)

Revision updated.

jnicol retitled this revision from Bug 1598380 - Use single PBO per texture for uploads. r?kvark to Bug 1598380 - Use single PBO per texture for uploads. r=kvark.Dec 17 2019, 2:11 PM
jnicol edited the summary of this revision. (Show Details)

New version keeps the old behaviour on AMD Mac of reinitializing the buffer prior to each upload. It does so by changing TextureUploader's Some(PixelBuffer) field (indicating immediate or PBO), into an enum TextureUploaderType indicating immediate, single-use PBOs, or multi-use PBO.

This also includes the regression fixes from bug 1602678, addressing your latest review comments on those.

jnicol requested review of this revision.Dec 17 2019, 2:16 PM
kvark requested changes to this revision.Dec 17 2019, 4:58 PM
kvark added inline comments.
gfx/wr/webrender/src/device/gl.rs
3689

can we merge this match with the if body above?

gfx/wr/webrender/src/internal_types.rs
365

should we remove the CacheTextureId from the strcut TextureCacheUpdate now?

gfx/wr/webrender/src/renderer.rs
1522

nit: let's do let (upload_size, _) = instead

3498

why do we break and not continue?

This revision now requires changes to proceed.Dec 17 2019, 4:58 PM
jnicol added inline comments.Dec 17 2019, 6:30 PM
gfx/wr/webrender/src/device/gl.rs
3689

I would love to. the problem I ran in to was that I want a reference to the buffer (either existing mutli-use or a new single use). If it's multi-use I don't want to move it out of self.uploader_type. But if it's single-use we need to create a new one here, but it needs to outlive the reference to it.

I've come up with something, let me know if you think it's good, or you have a better solution

gfx/wr/webrender/src/renderer.rs
1522

Sure, will change the other calls too

3498

Because I program bad :( good catch!

jnicol updated this revision to Diff 209233.Dec 17 2019, 6:30 PM

Revision updated.

kvark added inline comments.Dec 17 2019, 7:05 PM
gfx/wr/webrender/src/device/gl.rs
3689

It's a pattern I've faced before. Typically what you can do is:

let something; // leave uninitialized
let other_thing = match match_thing {
  Thing::New() {
    something = do_something(); // initialized here
    &something // scope is still larger than the match arm
  }
  Thing::Other(ref thing) => thing,
};
kvark accepted this revision.Dec 17 2019, 7:07 PM
This revision is now accepted and ready to land.Dec 17 2019, 7:07 PM