-
Notifications
You must be signed in to change notification settings - Fork 81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove copy from hsMatrix2SIMD in Metal renderer #1647
base: master
Are you sure you want to change the base?
Remove copy from hsMatrix2SIMD in Metal renderer #1647
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if it makes sense to return a matrix_float4x4&
rather than a pointer, since we always seem to be immediately dereferencing it
b11d7cf
to
14d89ad
Compare
I've rebased this branch to include the Linux runner changes. @dpogue - I'm still considering your question. No real thoughts yet - I think the underlying fMap in the matrix is a pointer type which is why it was a pointer here. Shame the matrix type includes extra data beyond the matrix, but using the fMap as the raw underlying matrix data is a small inconvenience. |
14d89ad
to
618ccc2
Compare
I've edited this commit to use reinterpret_cast and to return a reference instead of a pointer. |
hsMatrix2SIMD used to do a transpose of the input hsMatrix - but shaders and code were rewritten to make this unnecessary. hsMatrix2SIMD still does a vestigial memory copy. This commit removes that memory copy. This changes the meaning of hsMatrix2SIMD - so functions that need to make a copy now do so manually. Otherwise a direct cast is made. hsMatrix2SIMD has also been made constexpr. This should result in a small (~1%) performance improvement in the Metal renderer main loop.
618ccc2
to
0aa4708
Compare
matrix_float4x4* hsMatrix2SIMD(const hsMatrix44& src, matrix_float4x4* dst); | ||
inline const matrix_float4x4& hsMatrix2SIMD(const hsMatrix44& src) | ||
{ | ||
//reinterperate_cast not allowed in constexpr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
//reinterperate_cast not allowed in constexpr | |
// reinterpret_cast not allowed in constexpr |
inline const matrix_float4x4& hsMatrix2SIMD(const hsMatrix44& src) | ||
{ | ||
//reinterperate_cast not allowed in constexpr | ||
return *reinterpret_cast<const simd_float4x4 *>(src.fMap); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return *reinterpret_cast<const simd_float4x4 *>(src.fMap); | |
return *reinterpret_cast<const simd_float4x4*>(src.fMap); |
matrix_float4x4 mat; | ||
hsMatrix2SIMD(drawable->GetPaletteMatrix(span.fBaseMatrix + 1), &mat); | ||
fDevice.CurrentRenderCommandEncoder()->setVertexBytes(&mat, sizeof(matrix_float4x4), VertexShaderArgumentBlendMatrix1); | ||
const matrix_float4x4 *mat = &hsMatrix2SIMD(drawable->GetPaletteMatrix(span.fBaseMatrix + 1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const matrix_float4x4 *mat = &hsMatrix2SIMD(drawable->GetPaletteMatrix(span.fBaseMatrix + 1)); | |
const matrix_float4x4* mat = &hsMatrix2SIMD(drawable->GetPaletteMatrix(span.fBaseMatrix + 1)); |
@@ -1315,8 +1314,7 @@ void plMetalPipeline::IRenderProjection(const plRenderPrimFunc& render, plLightI | |||
fCurrentRenderPassUniforms->fogColor = {0.f, 0.f, 0.f}; | |||
fCurrentRenderPassUniforms->diffuseCol = {1.f, 1.f, 1.f, 1.f}; | |||
|
|||
matrix_float4x4 tXfm; | |||
hsMatrix2SIMD(proj->GetTransform(), &tXfm); | |||
const matrix_float4x4 &tXfm = hsMatrix2SIMD(proj->GetTransform()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const matrix_float4x4 &tXfm = hsMatrix2SIMD(proj->GetTransform()); | |
const matrix_float4x4& tXfm = hsMatrix2SIMD(proj->GetTransform()); |
if (weights[j]) { | ||
float weight = weights[j]; | ||
if (weight) { | ||
const simd_float4x4 simdMatrix = hsMatrix2SIMD(matrixPalette[indices & 0xFF]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const simd_float4x4 simdMatrix = hsMatrix2SIMD(matrixPalette[indices & 0xFF]); | |
const simd_float4x4& simdMatrix = hsMatrix2SIMD(matrixPalette[indices & 0xFF]); |
hsMatrix2SIMD used to do a transpose of the input hsMatrix - but shaders and code were rewritten to make this unnecessary. hsMatrix2SIMD still does a vestigial memory copy. This commit removes that memory copy.
This changes the meaning of hsMatrix2SIMD - so functions that need to make a copy now do so manually. Otherwise a direct cast is made. hsMatrix2SIMD has also been made constexpr.
This should result in a small (~1%) performance improvement in the Metal renderer main loop.