Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JP-3824: Fix a bug in wcs_from_sregions when crpix is provided #326

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

mcara
Copy link
Member

@mcara mcara commented Dec 18, 2024

Resolves JP-3824

wcs_from_sregions is used by make_output_wcs function in the JWST pipeline to compute output WCS of the resampled image for the ResampleStep . Currently wcs_from_sregions assigns a bounding box to the output WCS that is computed in the assumption that crpix is (0, 0) and it is not adjusted when crpix is provided by the user and when it is different from (0, 0). This causes unit test failures in JWST pipeline when run with master branch of gwcs since now inverse WCS transforms check for bounding box.

A separate issue observed while working on this (which may or may not be an issue) is that currently crpix is considered 0-indexed while in FITS WCS standard it is 1-indexed.

Tasks

  • update or add relevant tests
  • update relevant docstrings and / or docs/ page
  • Does this PR change any API used downstream? (if not, label with no-changelog-entry-needed)
    • write news fragment(s) in changes/: echo "changed something" > changes/<PR#>.<changetype>.rst (see below for change types)
    • run regression tests with this branch installed ("git+https://github.com/<fork>/stcal@<branch>")
news fragment change types...
  • changes/<PR#>.apichange.rst: change to public API
  • changes/<PR#>.bugfix.rst: fixes an issue
  • changes/<PR#>.general.rst: infrastructure or miscellaneous change

@mcara mcara requested a review from a team as a code owner December 18, 2024 06:42
@mcara mcara requested review from emolter and nden and removed request for a team December 18, 2024 06:42
@mcara mcara self-assigned this Dec 18, 2024
@mcara mcara added the bug Something isn't working label Dec 18, 2024
Copy link

codecov bot commented Dec 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.94%. Comparing base (35726e9) to head (3031f2c).
Report is 4 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##             main     #326       +/-   ##
===========================================
+ Coverage   29.57%   86.94%   +57.36%     
===========================================
  Files          36       49       +13     
  Lines        7949     8954     +1005     
===========================================
+ Hits         2351     7785     +5434     
+ Misses       5598     1169     -4429     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@kmacdonald-stsci kmacdonald-stsci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new return tuple from _calculate_offsets needs to be verified the docstring and computation are correct. Other than that, it looks good.

for axis_range, minval, shift in zip(bbox, axis_min_values, shifts):
output_bounding_box.append(
(
axis_range[0] + shift + minval,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the docstring for _calculate_offsets it says shifts are to be subtracted, but they are added here. Is the doctrstring wrong or is the calculation wrong?

Copy link
Member Author

@mcara mcara Dec 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neither is wrong: basically shifts are equivalent to 0-indexed CRPIX and should be subtracted from input coordinates to the WCS transformation. However, here we are adjusting the bounding box coordinates (not input coordinates to the WCS transform) so that it "moves" with the CRPIX value. Currently bounding box stays glued to the (0, 0) corner regardless of the output image shape or CRPIX location.

Copy link
Collaborator

@emolter emolter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good but just a few small questions

changes/326.bugfix.rst Outdated Show resolved Hide resolved
tuple
A tuple of offset *values* that are to be *subtracted* from input
coordinates (negative values of offsets in the returned transform).
Note: these are equivalent to an effective 0-indexed "crpix".
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that these offsets are always equal to crpix? If so, what is the point of returning this, if crpix is an input to the function? If not, can you clarify this note a little bit?

@@ -30,7 +30,7 @@ def _create_wcs_object_without_distortion(
shape,
):
# subtract 1 to account for pixel indexing starting at 0
shift = models.Shift() & models.Shift()
shift = models.Shift(0) & models.Shift(0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this change necessary? Does this shift actually subtract 1 as the comment indicates?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, this was a leftover from testing

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is incorrect either way. The default value for the shift is 0 anyway. So my change does not have any effect. I'll undo it.

@mcara
Copy link
Member Author

mcara commented Dec 18, 2024

The bigger issue is: should the bounding box of the resampled image follow (i.e., be centered) at CRPIX or should it always be fixed to the bottom-left corner of the resampled image.

The difference between the two approaches:

  1. bbox follows crpix: If user sets a large output shape for the resampled image and places rcpix at the middle then most (valid) data will be centered at the center of the output. User can control where data will be by modifying crpix.
  2. bbox is fixed: Then no matter what output shape user chooses, the input data will appear only in the bbox which is situated in the lower-left corner of the resampled image (when with_bounding_box for inverse WCS is enabled). By changing CRPIX, user shifts where data fall in the output but the bbox is not shifted and so it acts as a mask. So, as an example, if we increase CRPIX, we will move data to the right and there will be an empty region (no data) near the origin (BLC) but the data will not be "full" as bbox will cut off data on the right/top side. That is, we will be moving data out of the "view" (which, in this case, is the bbox and it is stationary) and the data will be trimmed by the bbox.

I think currently 2nd approach is implemented (albeit with a bug). 1st approach seems to be more intuitive as it allows control of where data will be in the output image and it is intuitive to estimate this based on the output shape. In the 2nd approach output shape larger than bbox makes no sense at all as it will be filled with 0s or NaN.

I know this is kind of a philosophical question but it is important to make a decision and based on this I will know how to fix spacetelescope/jwst#9017.

CC: @nden @emolter @melanieclarke

@mcara
Copy link
Member Author

mcara commented Dec 18, 2024

I have addressed most (all) comments, I believe. Along the way, fixed a typo in ramp test.

@mcara
Copy link
Member Author

mcara commented Dec 18, 2024

Just to be clear, this PR takes approach #1 in my previous (-2) comment. I think it is more intuitive (to the user) and kind of makes sense: bbox is moving with CRPIX thus allowing user to control where in the image the data is resampled.

With the other approach, both output_shape and crpix are pretty much useless since resampled pixels are contained only within the bounding box that is forced to stay in the bottom-left corner of the image. Modifying crpix in this case only can move valid pixels out of the bounding box. One problem is that we do not know (in the resample step) what this bounding box is and thus we don't know where to choose CRPIX. With a poor initial choice, pixmap is all nan and resample crashes before we can see what was the computed bounding box. It's counterintuitive.

@mcara
Copy link
Member Author

mcara commented Dec 19, 2024

Oh, there is a third idea:

  1. Make the bounding box using the entire array shape. First, drizzle figures out overlaps as it resamples. Bounding boxes on the output images do not help much, unless that is what the user wants. Second, output WCS usually is a distortion-free WCS that is easily invertible so guarding against out-of-domain values is not necessary here. So, users can provide a bounding box with their custom WCS if that is what they want but in an automated pipeline, why impose one at all?

This actually makes me think that approach #1 is probably the better one if we are going to compute a restrictive bounding box in the pipeline. Or option #3.

Copy link
Collaborator

@emolter emolter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me Mihai, I just raised one more question. But before we merge it I'd like to check whether it indeed fixes the issue that we have now bypassed via Nadia's PR for AL-852 spacetelescope/jwst@9f1bbbd. I'll get back to you hopefully later today about that

Comment on lines 343 to 347
log.warning(
"Setting minimum array dimension for axis %d to 10."
% (len(output_bounding_box) - k)
)
upper = 10
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is 10 the default here?

In what cases would upper be less than 1? Should this ever happen for valid values of crval and crpix? I thought this PR attempted to fix this lack of overlap for any reasonable input WCS.

Copy link
Member Author

@mcara mcara Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR initially attempted to fix lack of overlap due to a bug in the code: applying crpix to coordinate transforms in WCS but not to the bounding box. However, some tests in romancal failed when the following two conditions occurred:

  1. the combination of crval is such that all input images create an outline (bounding box) that is way outside of the output image (towards the negative coordinates); AND
  2. shape is not specified.

When crpix is provided (but size is not), it could happen that only a small part (or none) is in the quadrant with all positive coordinates. Also, the bounding box could be shifted to the right and up (from the origin of the image coordinate system). In this case the size should be determined from the upper bounding box coordinates and not from the bounding box width and height. However, if the bounding box is, for example, in the quadrant with all negative coordinates, then what should be the size of the output image. I think drizzle needs a minimum of 2 or 3 pixels. 10 is a small enough number that it will not use a lot of resources but we can lower it to let's say 3.

Or would you prefer to raise an exception and quit?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I think that is how size should be determined at all times (from the largest coordinate values of the bounding box): it just happens that the lower limit of the bounding box is always 0 when crval is not specified.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok here is my understanding so far. I still don't understand WCS transforms very well so please pardon me if anything here is obviously wrong:

  • This PR is making it so that the bounding_box is updated to respect crpix != 0, which means that the forward and reverse transforms should be self-consistent and will respect bounding_box.
  • there's no reason why crpix and crval cannot be outside the image itself, but bounding_box should always be around the image
  • if bounding_box is nowhere near the image, then there is something wrong with either bounding_box or crpix and crval

Are these three statements true?
If so, isn't (1) an instance of a bad WCS, since it's making a bounding_box that isn't in the image?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • This PR is making it so that the bounding_box is updated to respect crpix != 0, which means that the forward and reverse transforms should be self-consistent and will respect bounding_box.

"forward and reverse transforms should be self-consistent" - that is something that always should be happening. "will respect bounding_box" this depends on the settings such as with_bounding_box (for the inverse) and I think always for the forward transform. This PR does not deal with this.

  • there's no reason why crpix and crval cannot be outside the image itself, but bounding_box should always be around the image

First part of the statement is correct. Second part is correct too if one wants to get finite numbers (not nan) for the world coordinates for pixels in an actual image described by said WCS object or if one wants to get any data in the resampled image. If the bounding box is not in the image then pixel coordinates of that image's pixels will not map to finite world coordinates.

  • if bounding_box is nowhere near the image, then there is something wrong with either bounding_box or crpix and crval

"something wrong" - it depends. If this is the result of our computations for default parameters - then yes. If it is the result of the user's input - then it's user's choice (maybe by mistake or maybe not ... who knows?

Are these three statements true? If so, isn't (1) an instance of a bad WCS, since it's making a bounding_box that isn't in the image?

What is (1) referring to? A WCS cannot make a bounding_box. We create a bounding box to enclose the data. That is what the current code does before adding a shift to the detector coordinates to account for crpix.

At least for the case of resampled image, the bounding box should enclose the region of the image/detector coordinate system where all input images will be resampled (projected) but it also should be trimmed to the actual size of the images (a new thing to add to the code).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On and around this line it appears that currently the forward_transform and backward_transform do not round-trip if the bounding_box is not None. I thought this PR was supposed to fix that.

You say "currently the forward_transform and backward_transform do not round-trip". To me "round-tripping is that the following is true:

world = forward_transform(x, y)
assert np.allclose(backward_transform(*world), (x, y))

Is this condition not satisfied (modulo nan)? I think it still should be. The issue is that the bounding box is off when crpix is specified by the user and hence nans in the output. I believe this PR fixes the issue and so those lines in the reproject will not be needed.

Can you give me an example of when getting NaN for the world coordinates on purpose, or getting no data in the resampled image, would be the desired behavior?

No. And that is the issue! Current code creates a bounding box (when crpix is defined) that may not be in the resampled => no data.

Along the same lines, under what circumstances would this not be a mistake?

I don't know.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, thanks for all the explanations, I think I understand now. The bounding box does not modify the transform in any way, but it does set the results of the transform to NaN outside the box.

Going all the way back to the original question, if we can't think of any real use-cases for a bounding box that is entirely outside the data, then IMO it would be better to raise an exception than log a warning.

I will admit also that some of my confusion came because I made a typo when removing those lines in resample_utils.reproject() and it led to a bunch of test failures. The branch I made (typo fixed) is here, and I'm running our full regtest suite on those changes here, pointing to this PR branch as the stcal dependency. I'm ready to approve this as long as those tests come back clean

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you did change this log.warning() to an error, would that cause the Romancal tests to fail again?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you help me understand why there are regression test failures in the run I linked above?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you help me understand why there are regression test failures in the run I linked above?

probably my prs were not rebased.

I removed log warnings and now raise an exception.

@emolter
Copy link
Collaborator

emolter commented Dec 20, 2024

@mcara mcara force-pushed the fix-bbox-shift-make-wcs branch 3 times, most recently from 06db4b1 to ed42588 Compare December 22, 2024 13:06
@mcara
Copy link
Member Author

mcara commented Dec 22, 2024

I made somewhat major changes to the code in ed42588 and combined two functions into one. I think it is much clearer this way. I also made the bounding box be on pixel edge (int coord +- 0.5) because even if footprints of input images when projected to the "resampled" WCS cover half of a resampled pixel, in order to get those input data into that "corner" or "edge" resampled pixel, the bounding box should be at the outermost edge of the pixel so as to guarantee that pixmap will not be NaN at the footprint vertices.

Also, something is not right with test_wcs_from_footprints(): how can output image be 4x4 and WCS not be NaN outside of it at coordinate (4, 4) - see, i.e.,

assert all(np.isclose(wcs.footprint()[2], wcs(4, 4)))
? I fixed the test in ed42588.

Regression tests after restoring the old behavior in 77a8176: https://github.com/spacetelescope/RegressionTests/actions/runs/12454766437 (just one or two failures that do not look to me are caused by the code changes that preserve old bounding box behavior).

After removing the code that was reproducing the old behavior in d0a4a76, I started regression tests here: https://github.com/spacetelescope/RegressionTests/actions/runs/12457411084 I expect 0-4% pixel to have different values in the resampled images depending on output bounding box correction and image size.

@mcara
Copy link
Member Author

mcara commented Dec 22, 2024

From the last regression test the only failure that I see that can be ascribed to this PR is:

Keyword S_REGION has different values

That is to be expected.

Another failure in both "old" boundary box and "new" boundary box code is in test_nircam_image_sirs:

         Headers have different number of cards:
          a: 197
          b: 195
         Extra keyword 'R_SIRSKL' in a: 'crds://jwst_nircam_sirskernel_0001.asdf'
         Inconsistent duplicates of keyword ''      :
          Occurs 20 time(s) in a, 19 times in (b)
         Keyword         [16] has different values:
            a> NIR Optimized Convolution Kernel reference file information
            b> Mask reference file information
         ....................................

I do not see how this failure could be caused by this PR. It is probably due to some reference file change.

@emolter
Copy link
Collaborator

emolter commented Jan 2, 2025

Just getting back to this after vacation. Can you please fix the check-types failure? Also, why are the downstream Romancal tests failing and is there an associated fix for that?

@mcara
Copy link
Member Author

mcara commented Jan 17, 2025

romancal tests will be fixed via spacetelescope/romancal#1584

@mcara mcara force-pushed the fix-bbox-shift-make-wcs branch from d0a4a76 to 1850146 Compare January 17, 2025 09:26
@emolter
Copy link
Collaborator

emolter commented Jan 17, 2025

Sounds good, so we should hold off on merging this one until the Romancal testing one is merged, right? The Romancal one appears not to break anything even though it's still pointed to stcal/main.

What is going on with the saturation and dark current unit tests for Romancal? I see lots of failures there too, are those unrelated? I don't see them on Romancal main though, at least not in the spot I checked.

@mcara
Copy link
Member Author

mcara commented Jan 17, 2025

What is going on with the saturation and dark current unit tests for Romancal? I see lots of failures there too, are those unrelated? I don't see them on Romancal main though, at least not in the spot I checked.

I don't see how they could be related to this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working ramp_fitting testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants