Tuesday, April 16, 2013

"Twenty-one. That's blackjack." "Hit me."

Like any idiot who gets a natural 21, I just had to take another hit. 21 was ported relatively uneventfully. Some of the underlying work for Australis, the new Firefox UX, is in this version (invisibly), and it ported pretty much as-is to 10.4, so that's a good sign for the future. I'm still watching some of the later workbugs but I don't see them landing until at least Fx24 (Fx23 is the current nightly), and Australis isn't going to be in place in full by then. Other than breaking issue 82 again, which was a printing bug introduced by an incompatible fix way back in 6.0 and is trivial to back out, the browser appears to basically work in debugging mode. getUserMedia still functions fine, we're starting to get into that curve back to where we should be with JavaScript performance, and there continue to be improvements to the graphics stack which make animations and display smoother. Later this week, once I've done a couple of other bug investigations, I'll flip it over and build optimized for further testing. That's another version in the can, relatively painlessly for a change.

But I just had to, had to, take another hit with the king and the ace showing. The "hit" is something called jemalloc, which is an improved memory allocator with lower overhead: instead of asking for little tiny allocations from the operating system, jemalloc creates "arenas" of larger memory blocks (assuming that more allocation requests are following) and then parcels those out with a faster internal routine. It also scales better between threads by keeping multiple arenas in play so they don't have to contend with each other. Certain kernel-level operations and multithreading are not well optimized on 10.4, as issue 193 demonstrates, and anything that reduces the amount of locking and waiting for kernel resources is clearly a benefit because of Firefox's increasing dependence on threads for multicore systems. You can read about the gory implementation details here.
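For the curious, here is a toy sketch in C of the arena idea described above. This is emphatically not jemalloc's code: the names (`arena_t`, `arena_alloc`, `arena_for_thread`) and the sizes are made up for illustration, it never frees anything, and it does no locking at all, which a real allocator would need on each arena. It only shows the two tricks: grab one big block up front and carve small requests out of it, and keep several arenas so different threads usually land on different ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical values for illustration only; jemalloc's real ones differ. */
#define ARENA_BYTES (64 * 1024)
#define ARENA_COUNT 4
#define ALIGNMENT   16

typedef struct {
    unsigned char *base;   /* one large block obtained up front */
    size_t used;           /* bytes already handed out from that block */
    size_t system_calls;   /* times we went to the system allocator */
} arena_t;

static arena_t arenas[ARENA_COUNT];

/* Map a caller-supplied thread id to an arena so threads rarely share one. */
static arena_t *arena_for_thread(unsigned thread_id) {
    return &arenas[thread_id % ARENA_COUNT];
}

/* Hand out `size` bytes from the arena, or NULL if it cannot satisfy the
 * request. Only the first call on an arena touches the system allocator
 * (malloc stands in for the operating system here). */
static void *arena_alloc(arena_t *a, size_t size) {
    if (a->base == NULL) {
        a->base = malloc(ARENA_BYTES);
        if (a->base == NULL) return NULL;
        a->system_calls++;
    }
    /* Round the request up so every returned offset stays aligned. */
    size_t rounded = (size + ALIGNMENT - 1) & ~(size_t)(ALIGNMENT - 1);
    if (rounded < size || rounded > ARENA_BYTES - a->used) return NULL;
    void *p = a->base + a->used;   /* the "faster internal routine": a bump */
    a->used += rounded;
    assert(((uintptr_t)p - (uintptr_t)a->base) % ALIGNMENT == 0);
    return p;
}
```

Calling `arena_alloc(arena_for_thread(id), n)` over and over costs one trip to the system allocator per arena rather than one per request, which is the whole point when those trips are where the contention lives.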

Firefox works just fine with jemalloc disabled (since it is intended to be mostly transparent), and that is how we've shipped TenFourFox so far. (Near as I can determine, AuroraFox and SeaMonkeyPPC aren't using it either, or it's not actually turned on.) Well, 10.4 must have a really crummy default allocator, because after some fiddling to account for operating system differences, I was able to slot it in and WOW! the browser not only starts up apparently normally, but is noticeably faster. Most of the deadlocked sites we're tracking in issue 193 are up to 25% faster in wall clock time compared to the non-jemalloc 21, which is already itself faster than 20. Sites with less contention are less improved, of course, but it's an improvement right where we need it. Even if it doesn't fix the actual underlying issue in the kernel, it eliminates another source of contention, and that's enough to get us through the hump. 10.5 is considerably less affected due to kernel improvements, but it could still benefit as well.

But, and here's the "busted" part, major parts of the browser's interface to OS X are screwed up when using jemalloc as the allocator. While menus, widgets and gadgets all work, cut and paste doesn't work, minimizing the window doesn't work, and drag and drop doesn't work (they don't do anything other than log an error to the system console). If I rebuild the browser with jemalloc statically disabled, they start working again, so that's the problem. That's not shippable no matter how much faster the browser is, and there is at least one crash bug related to drag and drop on Intel 10.5 that caused Mozilla to disable jemalloc on anything less than 64-bit 10.6. The PowerPC kernel might not use the same code or crash in the same way, but right now I can't even test it.

I'm suspicious that memory alignment is the problem and it's stomping on some sort of internal memory move routine, but it's going to take a while to debug it and IonMonkey is still highest priority, so this is going to slip to Job 2. Still, look for it soon once I get IonMonkey into a working state (the interested can watch issue 218). 21 is an improvement over 20, but if I can get jemalloc off the ground, maybe we'll be able to say blackjack with 22 or 24. Never bet against the house!
