Add asynchronous concurrent execution #3687
Conversation
Force-pushed from 1484d67 to f81588d
Force-pushed from fd5af51 to 6a139c6
Left comments. Looks good overall.
Force-pushed from 495e166 to 9835194
Force-pushed from ed7e05f to 7cb3237
Force-pushed from 9592272 to e0ab3f3
Overall LGTM, some minor comments
> Streams enable the overlap of computation and data transfer, ensuring continuous GPU activity.
[NIT] Seems vague and out of place.
We need to explain why streams should be used; added an extra sentence to avoid the "out of place" feeling.
The whole paragraph seems out of place. You come from the section Streams and concurrent execution, where you explain this idea, and then you restate it at the beginning of Managing streams. It is not wrong, but it does not read well and feels unnecessary/redundant. As it is a NIT, feel free to resolve this comment if you don't agree with the suggestion.
> Asynchronous memory operations allow data to be transferred between the host and device while kernels are being executed on the GPU. Using operations like
I think this sentence is a bit misleading. A reader could miss the fact that these operations must be on different streams to get this behavior. Suggested wording: "Asynchronous memory operations on multiple streams allow data to be transferred between the host and device while kernels are executed (and do not block the host while copying the data)."
Updated sentence.
> One of the primary benefits of asynchronous operations is the ability to overlap data transfer with kernel execution, leading to better resource utilization and improved performance.
[NIT] You could clarify that multiple streams are needed to copy data while executing a kernel in parallel.
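To illustrate the point raised here, a minimal sketch of overlapping a host-to-device copy with kernel execution on separate streams; buffer names and sizes are illustrative, and pinned host memory is assumed since truly asynchronous copies require it (requires a GPU to run):

```cpp
#include <hip/hip_runtime.h>

// Illustrative kernel working on an independent buffer.
__global__ void compute(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * data[i] + 1.0f;
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    // Pinned host memory so hipMemcpyAsync can proceed without blocking.
    float* h_a;
    hipHostMalloc(&h_a, bytes, hipHostMallocDefault);

    float *d_a, *d_b;
    hipMalloc(&d_a, bytes);
    hipMalloc(&d_b, bytes);

    hipStream_t s0, s1;
    hipStreamCreate(&s0);
    hipStreamCreate(&s1);

    // The copy on s0 and the kernel on s1 are on different streams,
    // so the hardware may execute them concurrently.
    hipMemcpyAsync(d_a, h_a, bytes, hipMemcpyHostToDevice, s0);
    compute<<<dim3((n + 255) / 256), dim3(256), 0, s1>>>(d_b, n);

    hipStreamSynchronize(s0);
    hipStreamSynchronize(s1);

    hipStreamDestroy(s0);
    hipStreamDestroy(s1);
    hipFree(d_a);
    hipFree(d_b);
    hipHostFree(h_a);
    return 0;
}
```

Enqueuing both operations on the same stream would serialize them, which is exactly the pitfall the comment asks the text to call out.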
> another. This technique is especially useful in applications with large data sets that need to be processed quickly.
>
> Concurrent data transfers
How does this differ from the Asynchronous memory operations section? This feels repetitive, and it is not clear how you wish to distinguish between concurrent and asynchronous within this context.
Removed the concurrent data transfers section and added a hipMemcpyPeerAsync mention to the previous section.
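For reference, a hedged sketch of how hipMemcpyPeerAsync might be used; the device ids, buffer size, and peer-access setup are illustrative and assume two peer-capable GPUs (requires such hardware to run):

```cpp
#include <hip/hip_runtime.h>

int main() {
    int deviceCount = 0;
    hipGetDeviceCount(&deviceCount);
    if (deviceCount < 2) return 0;  // a peer copy needs two devices

    const size_t bytes = 1 << 20;

    hipSetDevice(0);
    void* src;
    hipMalloc(&src, bytes);
    hipDeviceEnablePeerAccess(1, 0);  // let device 0 access device 1

    hipSetDevice(1);
    void* dst;
    hipMalloc(&dst, bytes);
    hipDeviceEnablePeerAccess(0, 0);  // let device 1 access device 0

    hipSetDevice(0);
    hipStream_t stream;
    hipStreamCreate(&stream);

    // Device-to-device copy enqueued on a stream; the host is not blocked.
    hipMemcpyPeerAsync(dst, 1, src, 0, bytes, stream);
    hipStreamSynchronize(stream);

    hipStreamDestroy(stream);
    hipFree(src);
    hipSetDevice(1);
    hipFree(dst);
    return 0;
}
```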
Main comments were addressed.