Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support parallel runs in Gradle multi-module projects #183

Open
loetifuss opened this issue Mar 24, 2023 · 12 comments
Open

Support parallel runs in Gradle multi-module projects #183

loetifuss opened this issue Mar 24, 2023 · 12 comments
Labels
enhancement New feature or request parser-gradle question Further information is requested

Comments

@loetifuss
Copy link

In a multi-module build I have applied the plugin to all subprojects. When I try to run the "rewriteRun" task on all subprojects, I'm getting the class loading errors below for each subproject.
When I try to run the build without parallel processing ("--no-parallel") the tasks will run fine. However since there are hundreds of subprojects this is not a good solution.

Is the plugin intended to be used on multiple subprojects? I tried to apply the plugin on the root project but that will produce other issues (e.g. StackOverflowErrors).

5: Task failed with an exception.
-----------
* What went wrong:
Execution failed for task ':subprojectA:rewriteRun'.
> loader org.openrewrite.gradle.RewriteClassLoader @37fc83c0 attempted duplicate class definition for com.fasterxml.jackson.module.kotlin.KotlinModule$Builder. (com.fasterxml.jackson.module.kotlin.KotlinModule$Builder is in unnamed module of loader org.openrewrite.gradle.RewriteClassLoader @37fc83c0, parent loader org.gradle.internal.classloader.VisitableURLClassLoader @4894f78d)

* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
> Run with --scan to get full insights.
==============================================================================
@timtebeek timtebeek added bug Something isn't working question Further information is requested labels Mar 24, 2023
@timtebeek
Copy link
Contributor

hi @loetifuss ; Interesting use of rewrite you've got there! I don't know the specifics around our Gradle plugin too well (yet), but can imagine that it was not foreseen to be used in parallel on multiple subprojects, but not present on the root project.

It sounds like you've found a workaround for now by not running in parallel; does that make this issue feature request to support running in parallel?

Typically we do suggest to add the plugin to the root module, and then optionally increase the memory with -Dorg.gradle.jvmargs=-Xmx2G or more as needed. Could that be an option there too?

I don't suppose your project is open source is it? Otherwise we'd have another option through public.moderne.io.

@loetifuss
Copy link
Author

Thanks for the update. It would indeed be great if the plugin could work on individual subprojects since this would leverage Gradle's built-in parallel processing of subprojects. My current goal is to automatically update gradle build dependencies in a large multimodule build using the "ChangeDependencyArtifactId" recipe.
As far as I understand, when running the plugin on the root project, it tries to parse alle source files from all subprojects first. With a build containing millions of lines of code that's a very heavy task to perform. Or is there some way of filtering the input (currently I'm only interested in the Gradle build files)?

@timtebeek
Copy link
Contributor

I've not used these options on Gradle projects myself yet, and from the documentation it's questionable if it will work to exclude Java sources, but we do have both exclusions and plainTextMasks options that you could try to run from the root whilst not parsing any sources as Java. 🤔

Just creatively looking for a quick workaround for you on your limited use-case; ideally we'd support more of this officially of course, but I imagine you mostly want something that works today, with an option on something better in the future.

@sambsnyd
Copy link
Member

sambsnyd commented Apr 8, 2023

For such a large project it'll generally work better to separate parsing and recipe execution into separate steps.
That is how our commercial saas product works - build plugin produces LST jars, saas runs the recipes on beefy hardware and issues pull requests. Perhaps that would be of interest to you @loetifuss ?

@timtebeek timtebeek changed the title "duplicate class definition" errors when running rewriteRun on subprojects Support parallel runs in Gradle multi-module projects Apr 25, 2023
@timtebeek timtebeek added enhancement New feature or request and removed bug Something isn't working labels Apr 25, 2023
@timtebeek timtebeek moved this to Backlog in OpenRewrite Apr 25, 2023
@wakingrufus
Copy link

typically, gradle plugins should be designed to be applied to the project within which they are operating (ie subproject) for better project encapsulation, which helps with parallel builds among other things.
The Gradle Worker API is also ideal to use, as it handles parallelization of multiple units of work within a single subproejct for you.
I have experience working with these things in my day job as well as in my work on https://github.com/JLLeitschuh/ktlint-gradle So I would be willing to help out the effort to bring this functionality to the Open Rewrite plugin

@wakingrufus
Copy link

wakingrufus commented Aug 7, 2023

I took a brief look at the code, and my first impression is:
the plugin uses a custom classloader (RewriteClassLoader) which delegates to the parent classloader in some cases. the parent classloader is the gradle build's classloader which is shared across all projects, so, when running parallel builds, it causes race conditions that results in duplicate class definitions. Instead, the plugins should use the Gradle Worker API in classLoaderIsolation() mode i think. while this will hurt overall performance, the ability to run parallel builds will gain that back and more for many users.

@shanman190
Copy link
Collaborator

So currently the rewrite plugin was only ever really designed to be applied to the root project and operate over all of the source code at once. There are aspects here that relate to being able to properly type attribute files between dependent projects. It may be possible to allow Gradle to perform more of the orchestration a little bit, but the rewrite plugin would have to make sure that all source sets have completed for a project before then performing the rewrite-specific tasks.

Now the other part that is going to be a larger kicker, is that event IsolationMode.NONE isn't actually sufficiently isolated enough for the purposes of rewrite. As an specific example, the worker's Groovy and Kotlin versions leak into the isolated boundary and for rewrite, we need to make sure that the project's Groovy and Kotlin versions are used instead for parsing. So the custom classloader isolation would have to remain if only for that aspect in particular.

The last part that will make this difficult, is that the rewrite plugin presently targets Gradle 4.0 and newer. The worker api first appears on the scene in Gradle 4.1 with the method that you call out not appearing until Gradle 5.6.

Personally, I'd enjoy seeing the ability to use several of the features available in newer Gradle versions, but just mentioning a few things about the current state presently.

@wakingrufus
Copy link

Can you explain more what "attribute files" means? If we just need the classes from other projects loaded, we can piggyback off the sub project's compile classpath to get that.

As for gradle version, even 5.x is EOL at this point, so maybe its time to make the jump. But i understand Moderne may have clients still on older versions, so that might be a tough call.

On another note, doing this refactor would probably allow the plugin to make use of task caching as well, which i notice it currently doesn't.

@shanman190
Copy link
Collaborator

"attribute files" in terms of type attribution from a compiler standpoint. So if we have project A, B, C, and D with relationships as so:

A -> B
A -> C
B -> D
C -> D

Then D needs to fully compile all source sets (main, test, etc), then it can do the rewrite parsing steps, then B and C fully compile all sources sets and do their rewrite steps, then finally A does it's work. Just the normal task graph stuff. The rewrite LSTs are type aware in so far as a given LST element points to it's immediate Java type reference which then in turn points to it's transitive type references. In order for the transitive nature to occur, previous elements must have had their types attributed and cached for the later stages to perform their work.

@wakingrufus
Copy link

"attribute files" in terms of type attribution from a compiler standpoint. So if we have project A, B, C, and D with relationships as so:

A -> B
A -> C
B -> D
C -> D

Then D needs to fully compile all source sets (main, test, etc), then it can do the rewrite parsing steps, then B and C fully compile all sources sets and do their rewrite steps, then finally A does it's work. Just the normal task graph stuff. The rewrite LSTs are type aware in so far as a given LST element points to it's immediate Java type reference which then in turn points to it's transitive type references. In order for the transitive nature to occur, previous elements must have had their types attributed and cached for the later stages to perform their work.

Thank you for the explanation. And to do this, rewrite needs the AST/LST directly from sources, not the class files?

@shanman190
Copy link
Collaborator

It's a bit of both. For files in the project the actual source code is used and those LSTs are combined with classes from the classpath and JDK itself to create the entire tree structure.

You can see a more detailed explanation of this here:
https://docs.openrewrite.org/concepts-explanations/lossless-semantic-trees

@sergeykad
Copy link

Parallel execution supported since Gradle 1.2 (2012), so even if you want to support very old Gradle versions it should not be a problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request parser-gradle question Further information is requested
Projects
Status: Backlog
Development

No branches or pull requests

6 participants