
Key: ASC-2595
Type: Bug
Status: Open
Priority: B
Assignee: Erik Tierney
Reporter: wsharp
Votes: 4
Watchers: 1

ActionScript Compiler (ASC)

Performance: Early binding static Math functions would double performance

Created: 03/06/06 07:15 AM   Updated: 02/16/09 03:58 PM
Component/s: Performance
Security Level: Public (All JIRA Users )

Severity: Performance
Reproducibility: Every Time
Discoverability: Low
Found in Version: Flash 9.0 - Deferred Bugs - AVMPLUS_1_0_0_d538
Milestone: Future
Affected OS(s): All OS Platforms - All
Steps to Reproduce:
1. Using the test case in the bug folder, time Math.abs against a custom abs function (myabs)
2. Math.abs: 800 msecs
3. myabs: 125 msecs

Actual Results: hand-written abs function is 6x faster than using Math.abs

Expected Results: Same speed or something similar

Workaround: Write your own abs/min/max/etc. routines.
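The workaround might look roughly like the sketch below. It is written in TypeScript as a stand-in for ActionScript 3 (the syntax is close); myabs is the name used in the timings above, while mymin and mymax are hypothetical names added for illustration. These simple versions do not match the built-ins on every edge case: myabs(-0) returns -0, and mymin/mymax return their second argument when the first is NaN, whereas the built-in Math functions return NaN.

```typescript
// Hand-written replacements that avoid the Math lookup and the native call.
// mymin and mymax are illustrative names, not from the original test case.
function myabs(x: number): number {
  return x < 0 ? -x : x;
}

function mymin(a: number, b: number): number {
  return a < b ? a : b;
}

function mymax(a: number, b: number): number {
  return a > b ? a : b;
}
```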

Because the Math routines are static to the class, the "Math" constant first needs to be looked up before Math.abs can be called. If the verifier could optimize this out so we could bind directly to the Math routines, we'd double our performance. I'm using VTUNE to compare the overhead of Math.abs vs a custom function. A custom function will actually always be faster because we don't have to go out to a native C++ routine, which has slightly more overhead.

A further work item for the next release would be to inline Math.abs and other simple routines directly into the JITed code. Math.abs is basically a "fabs" instruction on the PC, yet our current system needs to make two function calls (MathClass::abs, MathUtils::abs) to generate that result. Having it completely inlined (or written in SSE if possible) would make it 10-50x faster?


        
Language Found: English
Bugbase Id: 161818
Needs Release Note: No
Triaged: Yes
Regression: No
QA Owner: wsharp
Fixed Version: Flash 9.0 - Deferred Bugs
Participants: BugDB Migration, Erik Tierney, JIRA Migration Admin, Lars Hansen, Trevor Baker and wsharp


JIRA Migration Admin - [06/04/07 12:35 PM ]
Move from BugDB issue number 161818

JIRA Migration Admin - [06/04/07 12:35 PM ]
Milestone ID = 905
Milestone = Flash 9.0 - Deferred Bugs
Build ID = 17275
Build = AVMPLUS_1_0_0_d538
Fix Build ID = null
Fix Build = null

BugDB Migration - [06/04/07 12:35 PM ]
[wsharp 3/6/06] Entered Bug.
[dansmith 3/7/06] Deferring for this release. Good optimization for following release.
[edwsmith 3/8/06]

implementation notes: there are two suggestions here, comments for each

1. speed up early binding to static functions. Math.abs() becomes
findproperty<Math>
early-bind-to abs()

The findproperty is slow, it requires a HT lookup.

fix 1 (asc): at compile time, recognize that the result of findproperty can't vary from call to call in the same scope, so only call it once and keep re-using the result. this requires no vm changes and would give the 2x speedup
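The effect of fix 1 can be illustrated with a toy model, sketched here in TypeScript. The Scope class, its lookup counter, and the two sum functions are all hypothetical; they only model a hash-table lookup performed per call versus once per scope, and do not reflect actual ASC output.

```typescript
// Toy model of a global scope backed by a hash table.
// `lookups` counts how many times findProperty is executed.
class Scope {
  lookups = 0;
  private table = new Map<string, unknown>();

  define(name: string, value: unknown): void {
    this.table.set(name, value);
  }

  findProperty(name: string): unknown {
    this.lookups++;
    if (!this.table.has(name)) throw new ReferenceError(name);
    return this.table.get(name);
  }
}

interface MathLike {
  abs(x: number): number;
}

// Current behaviour: one findProperty per Math.abs call.
function sumAbsNaive(scope: Scope, xs: number[]): number {
  let total = 0;
  for (const x of xs) {
    const m = scope.findProperty("Math") as MathLike;
    total += m.abs(x);
  }
  return total;
}

// Fix 1: look "Math" up once per scope and reuse the result.
function sumAbsHoisted(scope: Scope, xs: number[]): number {
  const m = scope.findProperty("Math") as MathLike;
  let total = 0;
  for (const x of xs) {
    total += m.abs(x);
  }
  return total;
}
```

Both functions return the same sum; only the number of hash-table lookups differs (one per element versus one in total).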

fix 2 (avm): the findproperty call is necessary because we don't have a vm-wide int-indexed table of the toplevel symbols. but we could. Domain & DomainENV could be updated to assign every toplevel symbol an integer slot in a single table. at verify time, we would look for Math and find it in the Domain and get the integer slot_id. at run time, we'd use the id to index into a DomainENV based table. The two tables would have parallel structure, just like method_id's work: traits has a table of AbstractFunction* pointers, and VTable has a table organized the same way, with MethodENV pointers.
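A minimal sketch of the parallel-table idea in fix 2, in TypeScript. The class and method names (Domain, DomainEnv, addSymbol, slotOf) are assumptions chosen to mirror the comment; this is not the actual avmplus data layout.

```typescript
// Verify-time side: assigns each top-level name a fixed integer slot.
class Domain {
  private slotIds = new Map<string, number>();

  // Returns the existing slot for a name, or allocates the next one.
  addSymbol(name: string): number {
    const existing = this.slotIds.get(name);
    if (existing !== undefined) return existing;
    const id = this.slotIds.size;
    this.slotIds.set(name, id);
    return id;
  }

  // Name-to-slot resolution, done once when the code is verified.
  slotOf(name: string): number {
    const id = this.slotIds.get(name);
    if (id === undefined) {
      throw new ReferenceError(`unknown top-level symbol: ${name}`);
    }
    return id;
  }
}

// Run-time side: values stored in the same slot order as Domain,
// so a lookup is a plain array index instead of a hash-table probe.
class DomainEnv {
  private slots: unknown[] = [];

  set(slotId: number, value: unknown): void {
    this.slots[slotId] = value;
  }

  get(slotId: number): unknown {
    return this.slots[slotId];
  }
}
```

In this model the verifier would call slotOf("Math") once and embed the resulting integer in the generated code, and each run-time access would then be a single get(slotId).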

2. inline math functions like abs() and sqrt(), to use FPU instructions.

first, create some opcodes for these functions, like OP_abs and OP_sqrt. then there are two approaches, as before.

fix 1 (asc) use the new opcodes

fix 2 (avm) recognize Math.abs at verify time (VT) and call MIR to emit OP_abs (etc) like we do for other ops.

either way, implement OP_abs/sqrt, etc, in MIR as follows. in pass 1 (verify pass) if we know we have an FPU with the right operation, generate MIR_abs, MIR_sqrt, etc. (new floating point MIR operations). If we don't, then emit MIR_cs to call some static helper function.

in the second pass, generate MD code for MIR_abs, sqrt, etc, using the cpu instructions that make sense.

note that if the CPU/FPU doesn't have the operation, then the MIR code will already have a static call to a helper. We don't want to have a MIR opcode that calls a static helper. if a helper is required, it should be invoked with the appropriate MIR call opcode.

for example, x87 has sin/cos/tan, but SSE and PPC don't. but all three FPUs have sqrt support.
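The pass 1 decision described above could be sketched as follows, in TypeScript. The capability table follows the comment for sqrt/sin/cos/tan; the abs entries, the MIR_sin/MIR_cos/MIR_tan opcode names, and the helper names other than MathUtils::abs are assumptions for illustration only.

```typescript
type MathOp = "abs" | "sqrt" | "sin" | "cos" | "tan";
type Fpu = "x87" | "sse" | "ppc";

// Operations each FPU is treated as supporting natively.
// sqrt/sin/cos/tan follow the comment above; abs is an assumption.
const NATIVE_OPS: Record<Fpu, ReadonlySet<MathOp>> = {
  x87: new Set<MathOp>(["abs", "sqrt", "sin", "cos", "tan"]),
  sse: new Set<MathOp>(["abs", "sqrt"]),
  ppc: new Set<MathOp>(["abs", "sqrt"]),
};

type MirInstr =
  | { kind: "native"; opcode: string } // e.g. MIR_abs, MIR_sqrt
  | { kind: "call"; opcode: "MIR_cs"; helper: string };

// Pass 1: emit a dedicated MIR op only when the FPU has the operation.
// Otherwise emit an ordinary static call, so no MIR math opcode ever
// hides a helper call behind it.
function emitMathOp(fpu: Fpu, op: MathOp): MirInstr {
  if (NATIVE_OPS[fpu].has(op)) {
    return { kind: "native", opcode: `MIR_${op}` };
  }
  // Helper naming is hypothetical, modelled on MathUtils::abs.
  return { kind: "call", opcode: "MIR_cs", helper: `MathUtils::${op}` };
}
```

With this split, the second pass only ever sees MIR_abs, MIR_sqrt and so on for targets that can lower them to a CPU instruction.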

Ed

Lars Hansen - [12/17/07 09:55 AM ]
The Math binding in the global object is mutable and the function properties on the Math object are mutable and deletable in ES3 and ES4. Thus Ed's fix 1.1 fails (if we want to be standards-compliant). Fix 1.2 is better but we'd still have to dynamically look up the function property in the Math object, and then call it. Inlining has similar issues.

Another fix might be to do guarded inlining, i.e., recognize Math.foo on the source level and effectively rewrite it to code like this:

  $theMathObjectIsDirty ? Math.foo(x) : OP_foo(x)

which checks a bit in some data structure (the Domain?) that is set whenever the Math object is modified or whenever the global object's Math binding is modified. Effectively, it pushes the complexity into the Math object and into the global object.
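A toy model of the guard, sketched in TypeScript. GuardedGlobals, its methods, and opAbs are hypothetical names; the model only shows the dirty bit being set by any write to the Math binding or to a Math property, and the call site choosing between the slow dynamic path and the inlined operation.

```typescript
type UnaryFn = (x: number) => number;

class GuardedGlobals {
  // Set whenever the Math binding or any Math property is modified.
  mathIsDirty = false;
  private math: Record<string, UnaryFn> = { abs: Math.abs };

  // Rebinding the global "Math" name.
  setMathBinding(obj: Record<string, UnaryFn>): void {
    this.math = obj;
    this.mathIsDirty = true;
  }

  // Overwriting a function property on the Math object.
  setMathProperty(name: string, fn: UnaryFn): void {
    this.math[name] = fn;
    this.mathIsDirty = true;
  }

  // Slow path: dynamic property lookup followed by a call.
  callMath(name: string, x: number): number {
    const fn = this.math[name];
    if (typeof fn !== "function") {
      throw new TypeError(`Math.${name} is not a function`);
    }
    return fn(x);
  }
}

// Stand-in for an inlined OP_abs; `0 - x` maps -0 to +0 like Math.abs.
function opAbs(x: number): number {
  return x <= 0 ? 0 - x : x;
}

// What a rewritten Math.abs(x) call site would do.
function guardedAbs(g: GuardedGlobals, x: number): number {
  return g.mathIsDirty ? g.callMath("abs", x) : opAbs(x);
}
```

As long as nothing touches Math, every call takes the opAbs branch; the first modification flips the bit and all later calls fall back to the dynamic path.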

Even with that fix, "with" and "eval" get in the way, as they can introduce new bindings for "Math", so the optimization is not available in a scope affected by any of those constructs.

Alternatively I suppose it's possible to inline if some incantation can turn on early-binding. Is there such an incantation already?

Trevor Baker - [01/27/09 02:53 PM ]
back to internal review for reprioritization