
Vectorize coordinates._construct_face_centroids() #1117

Open · wants to merge 10 commits into base: main

Conversation

erogluorhan
Member

@erogluorhan erogluorhan commented Dec 21, 2024

Closes #1116.

Overview

Optimizes `_construct_face_centroids()` in `coordinates.py` by replacing the for-loop with vectorization, using a masking method based on @hongyuchen1030's older suggestion.

Further optimization of the code is possible, but that can be handled in future PRs.
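
For illustration, here is a minimal, self-contained sketch of the masking idea (the toy inputs and the exact mask construction are only illustrative, not the code in `coordinates.py`):

```python
import numpy as np

# Toy inputs (illustrative only): two quad faces, no fill values.
node_x = np.array([0.0, 1.0, 1.0, 0.0, 2.0, 2.0])
face_nodes = np.array([[0, 1, 2, 3],
                       [1, 4, 5, 2]])
n_nodes_per_face = np.array([4, 4])

# mask[i, j] is True for the first n_nodes_per_face[i] columns of face i,
# so padded columns contribute zero to the sums.
mask = np.arange(face_nodes.shape[1]) < n_nodes_per_face[:, None]

centroid_x = np.sum(node_x[face_nodes] * mask, axis=1) / n_nodes_per_face
print(centroid_x)  # mean x-coordinate per face: [0.5 1.5]
```

Note that, as the review discussion below points out, fill values in the connectivity array break the fancy indexing in this sketch; the toy example simply avoids them.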

FYI @philipc2: The masking method this PR uses might be useful for cases we previously dealt with via partitioning, e.g. the nodal averaging problem we looked into last year.

FYI @rajeeja: This is where the Cartesian-based centroid calculations are being fixed. Welzl is a bit different, but you may still want to look into it.

Results

#1116 shows the profiling results for the existing code on a 4GB SCREAM dataset, where the execution time is around 5 minutes.

The optimized code runs in around 6 seconds; see below:

[Screenshot (2024-12-21): profiling of the optimized code, showing an execution time of around 6 seconds]

PR Checklist

General

  • An issue is created and linked
  • Add appropriate labels
  • Filled out Overview and Expected Usage (if applicable) sections

Testing

  • Tests cover all possible logical paths in your function

@rajeeja
Contributor

rajeeja commented Dec 21, 2024

Good fix, will test Welzl and see what fix is appropriate there.

@philipc2 philipc2 added the run-benchmark Run ASV benchmark workflow label Dec 21, 2024
@philipc2 philipc2 added this to the Scalability & Performance milestone Dec 21, 2024

github-actions bot commented Dec 21, 2024

ASV Benchmarking

Benchmark Comparison Results

Benchmarks that have improved:

| Change | Before [62ca314] | After [41b915a] | Ratio | Benchmark (Parameter) |
|---|---|---|---|---|
| - | 448M | 397M | 0.89 | face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/geoflow-small/grid.nc')) |
| - | 637M | 394M | 0.62 | face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/quad-hexagon/grid.nc')) |
| - | 1.76±0.07ms | 600±9μs | 0.34 | mpas_ocean.ConnectivityConstruction.time_n_nodes_per_face('120km') |
| - | 596±10μs | 482±10μs | 0.81 | mpas_ocean.ConnectivityConstruction.time_n_nodes_per_face('480km') |
| - | 6.12±0.1ms | 4.78±0.07ms | 0.78 | mpas_ocean.ConstructFaceLatLon.time_cartesian_averaging('120km') |
| - | 482M | 380M | 0.79 | mpas_ocean.Integrate.peakmem_integrate('480km') |

Benchmarks that have stayed the same:

| Change | Before [62ca314] | After [41b915a] | Ratio | Benchmark (Parameter) |
|---|---|---|---|---|
|  | 400M | 394M | 0.98 | face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/mpas/QU/oQU480.231010.nc')) |
|  | 432M | 424M | 0.98 | face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/scrip/outCSne8/outCSne8.nc')) |
|  | 18.8±0.2ms | 18.8±0.2ms | 1 | face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/mpas/QU/oQU480.231010.nc')) |
|  | 7.69±0.08ms | 7.62±0.1ms | 0.99 | face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/scrip/outCSne8/outCSne8.nc')) |
|  | 44.2±0.5ms | 43.2±0.4ms | 0.98 | face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/geoflow-small/grid.nc')) |
|  | 3.94±0.05ms | 3.89±0.1ms | 0.99 | face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/quad-hexagon/grid.nc')) |
|  | 2.98±0.02s | 3.04±0.03s | 1.02 | import.Imports.timeraw_import_uxarray |
|  | 677±10μs | 664±10μs | 0.98 | mpas_ocean.CheckNorm.time_check_norm('120km') |
|  | 446±3μs | 445±10μs | 1 | mpas_ocean.CheckNorm.time_check_norm('480km') |
|  | 665±7ms | 648±8ms | 0.98 | mpas_ocean.ConnectivityConstruction.time_face_face_connectivity('120km') |
|  | 42.6±0.4ms | 41.5±0.4ms | 0.98 | mpas_ocean.ConnectivityConstruction.time_face_face_connectivity('480km') |
|  | 3.62±0.05ms | 3.66±0.04ms | 1.01 | mpas_ocean.ConstructFaceLatLon.time_cartesian_averaging('480km') |
|  | 3.52±0.01s | 3.49±0.01s | 0.99 | mpas_ocean.ConstructFaceLatLon.time_welzl('120km') |
|  | 222±2ms | 227±2ms | 1.02 | mpas_ocean.ConstructFaceLatLon.time_welzl('480km') |
|  | 302±1ns | 299±2ns | 0.99 | mpas_ocean.ConstructTreeStructures.time_ball_tree('480km') |
|  | 810±2ns | 883±2ns | 1.09 | mpas_ocean.ConstructTreeStructures.time_kd_tree('120km') |
|  | 290±3ns | 289±2ns | 1 | mpas_ocean.ConstructTreeStructures.time_kd_tree('480km') |
|  | 441±5ms | 436±4ms | 0.99 | mpas_ocean.CrossSection.time_const_lat('120km', 1) |
|  | 222±3ms | 219±1ms | 0.99 | mpas_ocean.CrossSection.time_const_lat('120km', 2) |
|  | 114±0.5ms | 115±2ms | 1.01 | mpas_ocean.CrossSection.time_const_lat('120km', 4) |
|  | 361±5ms | 356±2ms | 0.99 | mpas_ocean.CrossSection.time_const_lat('480km', 1) |
|  | 181±2ms | 179±1ms | 0.99 | mpas_ocean.CrossSection.time_const_lat('480km', 2) |
|  | 93.0±0.7ms | 93.4±1ms | 1 | mpas_ocean.CrossSection.time_const_lat('480km', 4) |
|  | 122±1ms | 122±0.7ms | 1 | mpas_ocean.DualMesh.time_dual_mesh_construction('120km') |
|  | 8.80±0.07ms | 8.61±0.3ms | 0.98 | mpas_ocean.DualMesh.time_dual_mesh_construction('480km') |
|  | 1.09±0.01s | 1.07±0s | 0.99 | mpas_ocean.GeoDataFrame.time_to_geodataframe('120km', False) |
|  | 55.0±0.6ms | 53.4±0.6ms | 0.97 | mpas_ocean.GeoDataFrame.time_to_geodataframe('120km', True) |
|  | 86.5±0.5ms | 86.4±1ms | 1 | mpas_ocean.GeoDataFrame.time_to_geodataframe('480km', False) |
|  | 5.63±0.04ms | 5.38±0.06ms | 0.96 | mpas_ocean.GeoDataFrame.time_to_geodataframe('480km', True) |
|  | 339M | 339M | 1 | mpas_ocean.Gradient.peakmem_gradient('120km') |
|  | 316M | 316M | 1 | mpas_ocean.Gradient.peakmem_gradient('480km') |
|  | 2.75±0.01ms | 2.77±0.01ms | 1.01 | mpas_ocean.Gradient.time_gradient('120km') |
|  | 314±2μs | 315±4μs | 1 | mpas_ocean.Gradient.time_gradient('480km') |
|  | 244±3μs | 231±6μs | 0.95 | mpas_ocean.HoleEdgeIndices.time_construct_hole_edge_indices('120km') |
|  | 119±0.7μs | 118±2μs | 0.99 | mpas_ocean.HoleEdgeIndices.time_construct_hole_edge_indices('480km') |
|  | 402M | 396M | 0.99 | mpas_ocean.Integrate.peakmem_integrate('120km') |
|  | 173±4ms | 172±3ms | 0.99 | mpas_ocean.Integrate.time_integrate('120km') |
|  | 12.2±0.2ms | 11.7±0.1ms | 0.96 | mpas_ocean.Integrate.time_integrate('480km') |
|  | 347±2ms | 350±2ms | 1.01 | mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('120km', 'exclude') |
|  | 347±3ms | 348±2ms | 1 | mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('120km', 'include') |
|  | 346±3ms | 347±2ms | 1 | mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('120km', 'split') |
|  | 23.2±0.2ms | 23.2±0.3ms | 1 | mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('480km', 'exclude') |
|  | 23.1±0.1ms | 23.3±0.3ms | 1.01 | mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('480km', 'include') |
|  | 22.9±0.1ms | 23.0±0.1ms | 1 | mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('480km', 'split') |
|  | 55.2±0.1ms | 55.6±0.3ms | 1.01 | mpas_ocean.RemapDownsample.time_inverse_distance_weighted_remapping |
|  | 45.3±0.1ms | 45.1±0.04ms | 1 | mpas_ocean.RemapDownsample.time_nearest_neighbor_remapping |
|  | 355±0.4ms | 356±2ms | 1 | mpas_ocean.RemapUpsample.time_inverse_distance_weighted_remapping |
|  | 261±0.9ms | 261±0.8ms | 1 | mpas_ocean.RemapUpsample.time_nearest_neighbor_remapping |
|  | 312M | 314M | 1.01 | quad_hexagon.QuadHexagon.peakmem_open_dataset |
|  | 311M | 312M | 1 | quad_hexagon.QuadHexagon.peakmem_open_grid |
|  | 6.32±0.07ms | 6.42±0.08ms | 1.02 | quad_hexagon.QuadHexagon.time_open_dataset |
|  | 5.44±0.08ms | 5.43±0.04ms | 1 | quad_hexagon.QuadHexagon.time_open_grid |

Benchmarks that have got worse:

| Change | Before [62ca314] | After [41b915a] | Ratio | Benchmark (Parameter) |
|---|---|---|---|---|
| + | 1.18±0μs | 1.30±0μs | 1.1 | mpas_ocean.ConstructTreeStructures.time_ball_tree('120km') |

@philipc2 philipc2 removed the run-benchmark Run ASV benchmark workflow label Dec 21, 2024
@philipc2
Member

Nice work @erogluorhan

It appears that a few ASV benchmarks are failing (which may or may not be due to the changes in this PR). I can look deeper into it on Monday.

@erogluorhan
Member Author

All sounds good, thanks to you both @rajeeja @philipc2!

@erogluorhan
Member Author

I can't see any failing benchmarks though! @philipc2

@philipc2
Member

> I can't see any failing benchmarks though! @philipc2

If you take a look at the GitHub Bot above, it shows which ones failed. If they failed before and after, it means something was wrong before this PR.

Looks like this PR didn't break anything; it must have been something with how the benchmark was written before.

@erogluorhan
Member Author

Ok, I see what you mean now, @philipc2 !

Also, regarding benchmarking, I'd expect `mpas_ocean.ConstructFaceLatLon.time_cartesian_averaging('120km')` to show the difference this PR's optimization makes, but it does not. I think the dataset is pretty small for that, which makes me think big-data benchmarking would be important in the future.

@philipc2 philipc2 added the run-benchmark Run ASV benchmark workflow label Dec 23, 2024
Comment on lines 335 to 337
centroid_x = np.sum(node_x[face_nodes] * mask, axis=1) / n_nodes_per_face
centroid_y = np.sum(node_y[face_nodes] * mask, axis=1) / n_nodes_per_face
centroid_z = np.sum(node_z[face_nodes] * mask, axis=1) / n_nodes_per_face
Member

This is problematic for grids that have variable element sizes.

face_node_connectivity will contain fill values, which will lead to an index error here.

IndexError: index -9223372036854775808 is out of bounds for axis 0 with size 83886080

This wasn't an issue for the SCREAM dataset because it is completely composed of quads.

Contributor

How about fixing this and changing/standardizing the way we handle connectivity?

Member

> changing/standardizing the way we handle connectivity

Can you elaborate? I've explored storing connectivity as partitioned arrays in #978, although for something as simple as computing centroids, Numba seems to offer a significantly simpler implementation, performing close to compiled-language performance without needing to create an additional mask.

Contributor

My understanding is that with -1 etc. the mask approach fails, isn't that correct? So I was thinking that adopting a convention that allows the masking to work would simplify things.

Not sure about the performance; I think NumPy's vectorized operations might be highly optimized for numerical computations and would hopefully outperform the explicit loop here, even when those loops are parallelized with Numba.

@philipc2
Member

Here is a minimal example that fails. Feel free to create a test case from it:

import numpy as np
import uxarray as ux

node_lon = np.array([-20.0, 0.0, 20.0, -20.0, -40.0])
node_lat = np.array([-10.0, 10.0, -10.0, 10.0, -10.0])
# The first face is a triangle padded with the fill value -1; the second uses all four slots
face_node_connectivity = np.array([[0, 1, 2, -1], [0, 1, 3, 4]])

uxgrid = ux.Grid.from_topology(
    node_lon=node_lon,
    node_lat=node_lat,
    face_node_connectivity=face_node_connectivity,
    fill_value=-1,
)

uxgrid.construct_face_centers()

@philipc2
Member

Consider this implementation with Numba.

import numpy as np
from numba import njit, prange

# _normalize_xyz_scalar: uxarray's existing helper that normalizes a point onto the unit sphere


@njit(cache=True, parallel=True)
def _construct_face_centroids(node_x, node_y, node_z, face_nodes, n_nodes_per_face):
    centroid_x = np.zeros((face_nodes.shape[0]), dtype=np.float64)
    centroid_y = np.zeros((face_nodes.shape[0]), dtype=np.float64)
    centroid_z = np.zeros((face_nodes.shape[0]), dtype=np.float64)
    n_face = n_nodes_per_face.shape[0]

    for i_face in prange(n_face):
        # Only the first n_nodes_per_face[i_face] entries of this row are valid nodes
        n_max_nodes = n_nodes_per_face[i_face]

        x = np.mean(node_x[face_nodes[i_face, 0:n_max_nodes]])
        y = np.mean(node_y[face_nodes[i_face, 0:n_max_nodes]])
        z = np.mean(node_z[face_nodes[i_face, 0:n_max_nodes]])

        # Project the Cartesian average back onto the unit sphere
        x, y, z = _normalize_xyz_scalar(x, y, z)

        centroid_x[i_face] = x
        centroid_y[i_face] = y
        centroid_z[i_face] = z
    return centroid_x, centroid_y, centroid_z

Without partitioning our arrays, it's really difficult to find an elegant vectorized solution directly using NumPy.

@rajeeja
Contributor

rajeeja commented Dec 23, 2024

> Consider this implementation with Numba. [...]
>
> Without partitioning our arrays, it's really difficult to find an elegant vectorized solution directly using NumPy.

Yes, it was a sticky one; I left it for a later PR to optimize this.

@rajeeja
Contributor

rajeeja commented Jan 10, 2025

This also might use a bit more memory due to the creation of the mask array, but that should be fine for larger grids, and the speedup would be pretty good considering it uses vectorized operations and broadcasting. Seems to work, and I'm okay with approving this.

@philipc2
Member

@erogluorhan

Let me know if you have any thoughts about the Numba implementation that I shared.

@rajeeja
Contributor

rajeeja commented Jan 11, 2025

> @erogluorhan
>
> Let me know if you have any thoughts about the Numba implementation that I shared.

I'm fine with the Numba suggestion as well; we would have to modify this PR to use it. We use Numba all over the project.

@philipc2
Member

@erogluorhan @rajeeja

With the parallel Numba implementation, I was able to obtain the following times on my local machine for a 3.75km MPAS grid (84 million nodes, 42 million faces)

[Screenshot: timing results on the 3.75km MPAS grid]

@erogluorhan
Member Author

erogluorhan commented Jan 16, 2025

@philipc2 thanks very much for catching the original issue in this PR and suggesting the Numba solution!

Also thank you and @rajeeja for the convo you kept running here!

Please see some considerations of mine below and let me know your thoughts on each of them. They are mostly for the future; at least for the scope of this PR, I am fine with the current Numba version, so feel free to review it as is:

  1. Even the ~5 secs with the method I tried, and the current execution time of ~8 secs (on Philip's local machine), are sub-ideal for the long-term performance of this function. We need to bring it to a better spot (say around a second or two) for this setup (4GB SCREAM), because when a user works with time-series data with several time steps, perhaps on an even finer-resolution grid, the performance should still be okay.

  2. Speaking of time-series, the user will likely use chunking schemes, e.g. each time step being a separate chunk. In such scenarios, we need to be sure Numba and Dask will play well together.

  3. While the current Numba implementation seems to have decent performance for now, we saw that vectorization would give us even better performance with my original attempt (if only we didn't have the masking issue with the nonuniform connectivity array), and I thought further parts of the code could then be optimized in the future. That said, partitioning the connectivity as Philip attempted in "DRAFT: API for accessing partitioned connectivity, reworked topological aggregations" #978 still seems like the best bet to me for achieving vectorized performance. @philipc2, was there any difficulty in that code that made you stop and close that PR?

Thoughts?

@philipc2
Member

Hi @erogluorhan

> Even the ~5 secs with the method I tried, and the current execution time of ~8 secs (on Philip's local machine), are sub-ideal for the long-term performance of this function. We need to bring it to a better spot (say around a second or two) for this setup (4GB SCREAM), because when a user works with time-series data with several time steps, perhaps on an even finer-resolution grid, the performance should still be okay.

This is a one-time execution, so I personally think a couple of seconds is acceptable, especially for grids with millions of elements. The comparison is also not direct, as I ran my benchmark on a 3.75km MPAS grid (~84 million nodes). Do you recall the size of the SCREAM grid?

> Speaking of time-series, the user will likely use chunking schemes, e.g. each time step being a separate chunk. In such scenarios, we need to be sure Numba and Dask will play well together.

The centroid computation here is independent of any time-series data, since this is tied to the Grid and not a UxDataset.

> While the current Numba implementation seems to have decent performance for now, we saw that vectorization would give us even better performance with my original attempt (if only we didn't have the masking issue with the nonuniform connectivity array), and I thought further parts of the code could then be optimized in the future.

While I agree the speed looks significantly better in the masked approach, it only appears that way due to the fixed element dimension. While partitioning, as initially explored in #978, may improve our execution time, that PR has been stale for a while and I plan to revisit it in the near future. However, for our needs, especially with the face-bounds optimization, Numba has proven to be an excellent choice for obtaining significant performance gains.

There was some discussion about Numba in #1124

Pinging @Huite if he has anything he'd like to comment on.

@Huite

Huite commented Jan 16, 2025

Happy to comment!

You can easily circumvent the fill values for floating point data by utilizing NaNs and the relevant numpy methods.
E.g. index with the -1 value, then overwrite those values with NaN, then use the appropriate NaN-aware function to filter.

Vectorized

# Fancy indexing: the fill value -1 grabs the last node, but those slots are overwritten with NaN next
face_nodes_x = node_x[face_node_connectivity]
face_nodes_x[face_node_connectivity == -1] = np.nan
# The NaN-aware mean ignores the padded slots
centroid_x = np.nanmean(face_nodes_x, axis=1)

This allocates two intermediate arrays (the face_nodes_x and the boolean mask). That isn't too bad, although I'm fairly confident numpy doesn't parallelize these operations.

Numba

With regards to numba, you're probably leaving quite some performance on the table:

 x = np.mean(node_x[face_nodes[i_face, 0:n_max_nodes]])

Note that `node_x[face_nodes[i_face, 0:n_max_nodes]]` continuously allocates two small (heap-allocated) arrays. My guess is that this takes quite some memory, because it's right in the hottest part of the code. Ideally you pre-allocate and reuse the memory instead. The parallelization with prange makes this slightly trickier, though.

You can always write your own non-allocating mean, e.g. something like:

def mean(face, node_x):
    accum = 0.0
    count = 0
    for index in face:
        if index == -1:  # fill value marks the end of this face's nodes
            break
        accum += node_x[index]
        count += 1
    return accum / count

This might not have optimal performance either, because memory is accessed all over the place with `node_x[index]`, which probably causes cache misses (depending on the size of `node_x`).

You want to do indexing and computation of mean in parallel. The mean ideally operates over a contiguous piece of memory.
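
For what it's worth, here is a rough sketch of that combination: scalar accumulation inside `prange`, so the hot loop allocates no temporary arrays. It is hypothetical and single-coordinate, assumes a fill value of -1, and skips the normalization step from the earlier snippet.

```python
import numpy as np
from numba import njit, prange

INT_FILL = -1  # assumed fill value


@njit(cache=True, parallel=True)
def face_centroid_x(node_x, face_nodes):
    n_face, n_max = face_nodes.shape
    centroid_x = np.empty(n_face, dtype=np.float64)
    for i_face in prange(n_face):
        # Accumulate scalars; no per-face temporary arrays are created.
        accum = 0.0
        count = 0
        for j in range(n_max):
            index = face_nodes[i_face, j]
            if index == INT_FILL:
                break
            accum += node_x[index]
            count += 1
        centroid_x[i_face] = accum / count  # assumes every face has at least one valid node
    return centroid_x
```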

Centroid versus center-of-mass

But something else to consider is whether the mean is really what you're after.

I compute a mean for triangles, but use a center-of-mass approach otherwise. These are normally equivalent for convex polygons. The reason (which I should've commented in the code...!) is that there are quite a number of grids with things like hanging nodes (e.g. in quadtree meshes).

See this sketch in paint.

[Sketch: a quad cell with a hanging node (node 3) on one edge; the desired center is marked with an x]

Node 3 is hanging. My expectation is that the user wants to get the spot marked by the x (which is the center). But if you compute the mean of the vertices, node 3 is going to "pull" the centroid towards the right, which is undesirable.

You can avoid this by using a center of mass approach instead, as I do:

https://github.com/Deltares/xugrid/blob/fc0bdcb0eaeff62887f1e62fadd61957629dff61/xugrid/ugrid/connectivity.py#L584

To deal with the fill values, I "close" the faces with close_polygons (as shapely calls it): I add another column to the face nodes and replace the fill values with each face's first node. The center of mass can then be computed easily, because the fill values result in zero-area weights (thereby discounting their value).

My method generates relatively many temporary numpy arrays, but you could easily avoid those with fully scalar operations in numba.
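
To make that concrete, here is a rough NumPy sketch of the close-and-weight idea. It is not the xugrid implementation: it assumes planar x/y coordinates and simple polygons, and uses a fan triangulation from each face's first node in place of an explicit closing column.

```python
import numpy as np


def face_centers_of_mass(node_x, node_y, face_node_connectivity, fill_value=-1):
    # Replace fill values with each face's first node so the triangles they
    # form are degenerate and receive zero area weight.
    faces = face_node_connectivity.copy()
    faces = np.where(faces == fill_value, faces[:, :1], faces)

    x, y = node_x[faces], node_y[faces]
    x0, y0 = x[:, :1], y[:, :1]

    # Fan triangulation from the first node: triangles (v0, v_i, v_{i+1}).
    ax, ay = x[:, 1:-1] - x0, y[:, 1:-1] - y0
    bx, by = x[:, 2:] - x0, y[:, 2:] - y0
    area = 0.5 * (ax * by - ay * bx)            # signed triangle areas
    cx = (x0 + x[:, 1:-1] + x[:, 2:]) / 3.0     # triangle centroids
    cy = (y0 + y[:, 1:-1] + y[:, 2:]) / 3.0

    total = area.sum(axis=1)                    # assumes nonzero face areas
    return (area * cx).sum(axis=1) / total, (area * cy).sum(axis=1) / total
```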

My general philosophy is to rely on a vectorized solution first and use numba primarily when there's no other way, mostly to keep JIT latency down. Even though caching means the compilation cost is only paid on first use, CI runners, for example, will not be able to utilize the caches. But it's debatable at which point numba becomes superior; I have 16 cores on my laptop, so numba's prange will give huge speedups compared to the vectorized approach...

Labels
run-benchmark Run ASV benchmark workflow
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimize Face Centroid Calculations
5 participants