hand optimised methods #435

Closed: pull request with 2 commits

hrj (Contributor) commented Mar 2, 2016

Background

I am trying to design a simple JIT compiler for doppio. During analysis, I noticed that for my benchmark, the following methods were the most frequently called:

count of calls: method name
------------------------------------
161516: java/lang/Math/min(II)I
147661: java/lang/Object/<init>()V
113636: com/sun/tools/javac/file/ZipFileIndex/get4ByteLittleEndian([BI)I
113633: com/sun/tools/javac/file/ZipFileIndex/access$500([BI)I
90974: com/sun/tools/javac/file/ZipFileIndex/get2ByteLittleEndian([BI)I
90962: com/sun/tools/javac/file/ZipFileIndex/access$400([BI)I
84696: java/lang/String/compareTo(Ljava/lang/String;)I
83489: com/sun/tools/javac/file/ZipFileIndex$Entry/compareTo(Ljava/lang/Object;)I, com/sun/tools/javac/file/ZipFileIndex$Entry/compareTo(Lcom/sun/tools/javac/file/ZipFileIndex$Entry;)I

Implementing the JIT compiler will take some time, but I thought I would hand-compile the above methods for two reasons:

  • To measure how much performance benefit it brings.
  • Some of these methods are too complex for the first iteration of JIT that I am planning. They have loops and branches that will take a long time to JIT optimise.

Changes

This PR adds more "trapped methods". They improve speed because they are implemented as native JS functions that run in a NativeStackFrame, rather than as interpreted methods that run in a BytecodeStackFrame.
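To illustrate what "hand-compiling" a hot method means here, the following is a minimal sketch of two of the profiled methods written directly in TypeScript. The function names are hypothetical, the functions are standalone and do not use doppio's trapped-method or NativeStackFrame machinery, and the byte-assembly logic assumes the javac method does what its name suggests (reads a little-endian 32-bit int from a signed byte array).

```typescript
// Hypothetical hand-written equivalents of two hot methods from the profile.
// These are standalone functions, not doppio's actual trapped-method API.

// java/lang/Math/min(II)I: both arguments are assumed to already be 32-bit ints.
function mathMinInt(a: number, b: number): number {
  return a <= b ? a : b;
}

// get4ByteLittleEndian([BI)I: assumed to assemble a little-endian int from four
// signed bytes. JS bitwise operators yield signed 32-bit results, like Java ints.
function get4ByteLittleEndian(buf: Int8Array, pos: number): number {
  return (
    (buf[pos] & 0xff) |
    ((buf[pos + 1] & 0xff) << 8) |
    ((buf[pos + 2] & 0xff) << 16) |
    ((buf[pos + 3] & 0xff) << 24)
  );
}
```

Because each call is a plain JS function call with no operand stack or bytecode dispatch, this is the kind of code a JS engine can inline and optimise well.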

Benchmark result

For the "compile" benchmark:

Environment    Time before    Time after    Improvement (%)
-----------------------------------------------------------
Chromium 48    26870          23327         13.2
Firefox 44     33796          29602         11.6

hrj (Contributor, Author) commented Mar 2, 2016

If this PR is acceptable, then further work can be:

  • Add trapped implementations for all similar methods, such as the other Math.min and Math.max variants.
  • Remove the "simple" trapped methods once the JIT is implemented, for example com/sun/tools/javac/file/ZipFileIndex/get4ByteLittleEndian. These have no branches or loops, so JITing them is easy and has predictable tradeoffs (JIT overhead vs. performance improvement).

jvilk (Member) commented Mar 3, 2016

I am not willing to add any additional trapped methods, even for speed. It adds additional maintenance burden that we should not attempt to bear. I would rather focus on improving the speed of the bytecode so that these aren't necessary.

If you want to surface an API to register custom trapped methods, I will merge in a PR for that feature, and you can continue to use hand-defined methods in your projects.
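One possible shape for such a registration API is sketched below. Everything in it is an assumption made for illustration: the class name, the method names, and the use of the full method signature as the lookup key are all hypothetical and are not taken from doppio.

```typescript
// Hypothetical sketch of a registry for user-supplied trapped methods.
// None of these names come from doppio.
type NativeImpl = (...args: any[]) => any;

class TrappedMethodRegistry {
  private impls = new Map<string, NativeImpl>();

  // The key is the full method signature, e.g. "java/lang/Math/min(II)I".
  register(signature: string, impl: NativeImpl): void {
    if (this.impls.has(signature)) {
      throw new Error(`Trapped method already registered: ${signature}`);
    }
    this.impls.set(signature, impl);
  }

  // Returns the native replacement, or undefined so the caller can fall back
  // to interpreting the method's bytecode.
  lookup(signature: string): NativeImpl | undefined {
    return this.impls.get(signature);
  }
}

const registry = new TrappedMethodRegistry();
registry.register("java/lang/Math/min(II)I", (a: number, b: number) => (a <= b ? a : b));
```

With a hook like this, hand-defined methods would live in the embedding project, so the maintenance burden would stay outside the core repository.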

I hope your JIT works out! That would certainly be interesting to see. One speed issue with our bytecode interpreter is that we eagerly create stack frames, rather than create the stack frame lazily when it is needed to suspend / create an exception stack trace. Changing how that works is a serious undertaking, but may make it easier to implement a JIT.

jvilk closed this Mar 3, 2016

hrj (Contributor, Author) commented Mar 4, 2016

I can understand your hesitancy to merge these changes in.

> If you want to surface an API to register custom trapped methods, I will merge in a PR for that feature, and you can continue to use hand-defined methods in your projects.

That's a great idea; thanks! Adding it to my todo list.

> I hope your JIT works out! That would certainly be interesting to see. One speed issue with our bytecode interpreter is that we eagerly create stack frames, rather than create the stack frame lazily when it is needed to suspend / create an exception stack trace. Changing how that works is a serious undertaking, but may make it easier to implement a JIT.

Is lazy creation of stack frames possible with the asynchronous-calling behaviour of doppio? #359 seems a bit relevant.

The JIT I am planning is very simple; in the first cut, it will just coalesce opcodes to avoid hammering the operand stack, similar to the first point in #32. Control-flow opcodes (invokes, conditional jumps) will not be included in the JIT trace, at least in the first cut. I am hoping to write a design document soon, to get your feedback.
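As a rough sketch of what coalescing opcodes could look like, the toy example below compares a stack-based interpreter with a compiler that turns a straight-line run of opcodes into a single generated JS function. The opcode set, the `Op` type, and the `interpret` and `coalesce` names are all hypothetical; this is not doppio's bytecode representation or the planned JIT design, only an instance of the general idea under those assumptions.

```typescript
// Toy opcode set for illustration only (no control flow, ints only).
type Op =
  | { kind: "iload"; index: number }
  | { kind: "iconst"; value: number }
  | { kind: "iadd" }
  | { kind: "imul" }
  | { kind: "istore"; index: number };

// Baseline: every opcode pushes to or pops from a runtime operand stack.
function interpret(ops: Op[], locals: number[]): void {
  const stack: number[] = [];
  for (const op of ops) {
    switch (op.kind) {
      case "iload": stack.push(locals[op.index]); break;
      case "iconst": stack.push(op.value); break;
      case "iadd": { const b = stack.pop()!; const a = stack.pop()!; stack.push((a + b) | 0); break; }
      case "imul": { const b = stack.pop()!; const a = stack.pop()!; stack.push(Math.imul(a, b)); break; }
      case "istore": locals[op.index] = stack.pop()!; break;
    }
  }
}

// Coalescing: simulate the operand stack at compile time with expression
// strings, so the generated function never touches a runtime stack.
function coalesce(ops: Op[]): (locals: number[]) => void {
  const exprs: string[] = []; // symbolic operand stack
  const body: string[] = [];  // generated JS statements
  let temps = 0;
  for (const op of ops) {
    switch (op.kind) {
      case "iload": {
        // Read the local into a temporary right away so that a later istore
        // to the same slot cannot change the value already "on the stack".
        const t = `t${temps++}`;
        body.push(`const ${t} = locals[${op.index}];`);
        exprs.push(t);
        break;
      }
      case "iconst": exprs.push(`(${op.value})`); break;
      case "iadd": { const b = exprs.pop()!; const a = exprs.pop()!; exprs.push(`((${a} + ${b}) | 0)`); break; }
      case "imul": { const b = exprs.pop()!; const a = exprs.pop()!; exprs.push(`Math.imul(${a}, ${b})`); break; }
      case "istore": body.push(`locals[${op.index}] = ${exprs.pop()!};`); break;
    }
  }
  return new Function("locals", body.join("\n")) as (locals: number[]) => void;
}

// locals[2] = (locals[0] + locals[1]) * 3
const program: Op[] = [
  { kind: "iload", index: 0 },
  { kind: "iload", index: 1 },
  { kind: "iadd" },
  { kind: "iconst", value: 3 },
  { kind: "imul" },
  { kind: "istore", index: 2 },
];
const compiled = coalesce(program);
```

Here six opcodes collapse into two local reads and one assignment. Excluding control-flow opcodes keeps the compile-time stack simulation trivial, since every run has a single entry and a single exit.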

jvilk (Member) commented Mar 8, 2016

You'd have to change a lot to lazily create stack frames, but it is possible to do. I look forward to seeing your proposal!

hrj mentioned this pull request May 20, 2016