Further improve performance #83

Merged
merged 6 commits into from
Jan 2, 2024

tim-hoffman

No description provided.

@tim-hoffman tim-hoffman changed the title Further improve perforamance Further improve performance Jan 2, 2024
@tim-hoffman tim-hoffman marked this pull request as ready for review January 2, 2024 18:17
@tim-hoffman tim-hoffman requested review from iangneal and 0xddom January 2, 2024 18:17
Collaborator

@0xddom 0xddom left a comment

I don't fully understand what's going on, but I trust you that this improves performance. If I understood correctly, splitting compute from execute lets you avoid copying the env unnecessarily, correct? Do you have any run-time numbers now?

Also, I was thinking: do we still clone the whole IR every time we run a pass? Maybe we could improve performance further by moving the IR and only creating new copies of the things we actually modify. I don't know if it will work, since we need the IDs for each bucket, but it may be worth a try.
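The copy-only-on-modify idea could look roughly like the sketch below. It is an illustration under assumptions, not code from this repository: `Bucket`, its fields, and `clamp_negatives` are hypothetical names, and the real IR types are not shown in this thread. It relies on the standard `Rc::make_mut`, which clones the inner value only when the pointer is still shared.

```rust
use std::rc::Rc;

// Hypothetical IR node; the real bucket types are not shown in this thread.
#[derive(Clone, Debug)]
struct Bucket {
    id: usize,
    value: i64,
}

/// Hypothetical pass: rewrites only buckets that hold a negative value.
/// Buckets that are not modified stay shared with the input IR.
fn clamp_negatives(ir: &[Rc<Bucket>]) -> Vec<Rc<Bucket>> {
    ir.iter()
        .map(|bucket| {
            // Copies the pointer, not the Bucket.
            let mut shared = Rc::clone(bucket);
            if shared.value < 0 {
                // The input slice still holds a reference, so make_mut
                // clones the Bucket here and leaves the original untouched.
                Rc::make_mut(&mut shared).value = 0;
            }
            shared
        })
        .collect()
}
```

Because the copy is made with `Clone`, a rewritten bucket keeps its `id`, which is one way the ID concern might be handled. Whether this fits the actual transform methods is an open question.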

_compute_instruction,
_execute_instruction
);
error::add_loc_if_err(result, inst.as_ref())
}

/****************************************************************************************************
* Private implemenation
Collaborator

nit: typo

@tim-hoffman
Author

You got it. Profiling showed that Env clones and destructors were taking the most time by far.
I've replaced "execute" with "compute" in many cases where the bucket cannot update the Env. The compute_or_execute macro handles cases where the Env might be updated but the updated Env is never used afterwards. There it is safe to try the "compute" approach first and fall back to the "execute" approach only if an Env-modifying instruction is encountered.
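As a rough illustration of that fallback, here is a minimal sketch. Everything in it is an assumption made for the example: the `Env` and `Inst` types, the `compute` and `execute` functions, and the macro's argument list are hypothetical and are not the definitions from this PR.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the interpreter environment.
#[derive(Clone, Debug)]
struct Env {
    vars: HashMap<String, i64>,
}

// Hypothetical instructions: one read-only, one that modifies the Env.
enum Inst {
    Load(String),
    Store(String, i64),
}

/// Read-only path: borrows the Env, so nothing is cloned or dropped.
/// Returns None when the instruction would have to modify the Env.
fn compute(inst: &Inst, env: &Env) -> Option<i64> {
    match inst {
        Inst::Load(name) => Some(env.vars.get(name).copied().unwrap_or(0)),
        Inst::Store(..) => None,
    }
}

/// Mutating path: takes the Env by value and returns the updated copy.
fn execute(inst: &Inst, mut env: Env) -> (i64, Env) {
    match inst {
        Inst::Load(name) => (env.vars.get(name).copied().unwrap_or(0), env),
        Inst::Store(name, value) => {
            env.vars.insert(name.clone(), *value);
            (*value, env)
        }
    }
}

/// Try the borrow-only path first; clone the Env only if that path gives up.
/// The updated Env from the fallback is discarded, so this is only valid
/// where the caller never needs it.
macro_rules! compute_or_execute {
    ($inst:expr, $env:expr, $compute:ident, $execute:ident) => {
        match $compute($inst, $env) {
            Some(value) => value,
            None => $execute($inst, Env::clone($env)).0,
        }
    };
}
```

In this sketch a `Load` never clones the Env, while a `Store` pays for one clone whose result is thrown away, which mirrors the trade-off described above.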

@tim-hoffman
Author

Each pass does still reconstruct the IR via the transform methods but I haven't seen that as a bottleneck in profiling so far.

@0xddom
Collaborator

0xddom commented Jan 2, 2024

> Each pass does still reconstruct the IR via the transform methods but I haven't seen that as a bottleneck in profiling so far.

It's not only about performance but also memory usage. We can leave it for future work, though, if it's not pressing. This PR has enough in it already.

@tim-hoffman tim-hoffman merged commit 560117e into llvm Jan 2, 2024
2 checks passed
@tim-hoffman tim-hoffman deleted the th/van-929 branch January 2, 2024 20:07
2 participants