Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Add a bylayer keyword to zonal #820

Open
wants to merge 11 commits into
base: main
Choose a base branch
from
Open

[WIP] Add a bylayer keyword to zonal #820

wants to merge 11 commits into from

Conversation

asinghvi17
Copy link
Collaborator

@asinghvi17 asinghvi17 commented Nov 12, 2024

Successor to #775. This provides a switch that allows you to iterate directly over the values of a rasterstack, instead of iterating over the layers and then the layer values.

It currently defaults to true (the old behaviour) but setting it to false allows iteration over the NamedTuple values of each dimselector in the stack, which is the new behaviour.

Needs tests. This also fails when skipmissing=true, because it's not implemented for RasterStacks. From discussion in #775 we still need to decide whether skipmissing on a rasterstack should skip a cell if it has any missing values, or skip a cell if all its values are missing. Personally, I'm inclined to option 1 :).

if false, `f` gets an iterable of named tuples.  if true, `f` gets an iterable of values, and we use `maplayers` outside that.

Still an open question what the default should be.  I would like to have it be `true` to underline the nature of RasterStacks as being arrays of namedtuples as well, but `false` would be less breaking.
@asinghvi17 asinghvi17 self-assigned this Nov 12, 2024
@rafaqz
Copy link
Owner

rafaqz commented Nov 12, 2024

Yeah, lets not break things straight away. The main reason to have it as true is so sum and mean just work as-is.

I think we probably need to move away from returning NamedTuple at some point to a similar object that math works on. At that point most workflows wont break when we flip the switch, but until then bylayer=true needs to be the default.

I also think option 1 is the correct skipmissing behaviour. We could also allow passing a Symbol like skipmissing=:somelayer to specify the layer to use.

Project.toml Outdated Show resolved Hide resolved
src/methods/zonal.jl Outdated Show resolved Hide resolved
src/methods/zonal.jl Outdated Show resolved Hide resolved
src/methods/zonal.jl Outdated Show resolved Hide resolved
src/methods/zonal.jl Outdated Show resolved Hide resolved
@asinghvi17
Copy link
Collaborator Author

asinghvi17 commented Nov 13, 2024

Here's what the namedtuple warning looks like:

julia> zonal(mean, st; of = [polygon], bylayer=false)
┌ Warning: The function you passed to `zonal` cannot handle a RasterStack's values directly,
│ as they are NamedTuples of the form `(; varname = value, ...)`.
│ 
│ Set `bylayer=true` to apply the function to each layer of the stack separately, 
│ or change the function to accept NamedTuple input.
└ @ Rasters ~/.julia/dev/Rasters/src/methods/zonal.jl:170
ERROR: MethodError: no method matching /(::@NamedTuple{a::Missing, b::Missing}, ::Int64)
The function `/` exists, but no method is defined for this combination of argument types.

Closest candidates are:
  /(::BigInt, ::Union{Int16, Int32, Int64, Int8, UInt16, UInt32, UInt64, UInt8})
   @ Base gmp.jl:560
  /(::BigFloat, ::Union{Int16, Int32, Int64, Int8})
   @ Base mpfr.jl:553
  /(::Missing, ::Number)
   @ Base missing.jl:123
  ...

Stacktrace:
  [1] mean(f::typeof(identity), itr::Base.SkipMissing{RasterStack{…}})
    @ Statistics ~/.julia/packages/Statistics/gbcbG/src/Statistics.jl:69
  [2] mean(itr::Base.SkipMissing{RasterStack{…}})
    @ Statistics ~/.julia/packages/Statistics/gbcbG/src/Statistics.jl:44
  [3] _maybe_skipmissing_call(f::typeof(mean), A::RasterStack{…}, sm::Bool)
    @ Rasters ~/.julia/dev/Rasters/src/methods/zonal.jl:206

@asinghvi17
Copy link
Collaborator Author

asinghvi17 commented Nov 13, 2024

It's pretty easy to miss in the giant error stack though. This also doesn't activate in zonal where of is a single geometry, only where it is a vector of geometries.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants