Converting to DEPHY format #518

Open
dustinswales opened this issue Sep 19, 2024 · 25 comments

Comments

@dustinswales
Collaborator

As part of the CCPP v7.0.0 release, all SCM cases included in the repository are in the DEPHY format. Provided with the codebase are script(s) to convert cases in the "old" SCM format to the DEPHY format.
However, this script is not documented, so it's not clear how to use it.

@grantfirl @ligiabernardet

@grantfirl
Collaborator

Here is the broad-stroke documentation for using it:

The script is in ccpp-scm/scm/etc/scripts/dephy_converter.py. It takes only one argument: -n name_of_case (where name_of_case corresponds to an existing case data file found in the ccpp-scm/scm/data/processed_case_input directory, without the .nc extension).

The script reads in the old file in ccpp-scm/scm/data/processed_case_input together with the associated case configuration namelist in ccpp-scm/scm/etc/case_config and outputs a new DEPHY-formatted case file in ccpp-scm/scm/data/processed_case_input named "name_of_case_SCM_driver.nc". It also modifies the case configuration namelist for the case.

Before trying to use the script, I would create a backup of the case data file and case configuration namelist. Then, you can check that the conversion worked by running the original case and the DEPHY version and comparing the output. It may not be bit-for-bit, but the output should look VERY similar if plotted.
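
The backup step suggested above could be scripted roughly as follows. This is only a minimal sketch and not part of the SCM: `backup_case_files`, `converter_command`, and the backup directory are hypothetical names, the caller supplies the actual paths of the case data file and the case configuration namelist, and the `python` interpreter name is an assumption. Only the `-n name_of_case` argument comes from the description above.

```python
import shutil
from pathlib import Path


def backup_case_files(case_data_file, case_config_file, backup_dir):
    """Copy the case data file and its namelist into backup_dir.

    Hypothetical helper: the converter modifies the case configuration
    namelist, so pristine copies are kept before running it.
    """
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    copies = []
    for src in (Path(case_data_file), Path(case_config_file)):
        dst = backup_dir / src.name
        shutil.copy2(src, dst)  # copy2 also preserves timestamps
        copies.append(dst)
    return copies


def converter_command(case_name):
    """Build the converter invocation described above (-n name_of_case)."""
    return ["python", "dephy_converter.py", "-n", case_name]
```

After the backup, the command list could be passed to `subprocess.run(..., check=True)` from the directory that holds dephy_converter.py.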

@dustinswales
Collaborator Author

Thanks @grantfirl!
Let's see if this is all that I-Kuan needs, and if so we can a) Add this to the documentation and/or b) Create a discussion thread with your response as the "Answer" for others to see?
The latter seems adequate to me until we can update the docs.

@ihursmas

ihursmas commented Sep 24, 2024

Thank you @dustinswales and @grantfirl!

So far in the CCPP SCM v7.0.0, I've converted the required forcings from the conventional format to DEPHY format without any issues, and I've run all seven 33-hour periods of the ATOMIC test case using the SCM forcings in both the conventional and DEPHY formats and using three physics suites (GFS_v16, GFS_v17_p8, and RRFS_v1beta). For all the periods and physics suites tested, the outputs of advective tendencies - i.e. “T_force_tend”, “qv_force_tend”, “u_force_tend”, and “v_force_tend” - are identical between the runs driven by the conventional- and DEPHY-format forcings. The boundary forcings are identical between the forcing files (under data/processed_case_input/) in the conventional and DEPHY formats as well. In general, the runs driven by different forcing formats are similar, regardless of choices of tested period and physics suite. However, one or two specific combinations of period and physics suite show noticeable differences, particularly in cloud fraction and/or surface fluxes, between the runs driven by different forcing formats. Would this relatively significant sensitivity (of certain variables, under certain run cases) to forcing format be a concern, or is it actually not so uncommon?

@dustinswales
Collaborator Author

@ihursmas
Great to hear that things are working, well kinda working at least.
"Significant sensitivity" worries me, so I would like to know more about your configuration. Can you provide some details? (e.g. your SCM configuration and possibly the case file(s))

@ihursmas

ihursmas commented Sep 25, 2024

@dustinswales
Sure thing. When using the GFS_v16 physics suite and the period from 22 UTC on Jan. 16 to 06 UTC Jan. 18, 2020, the differences between the runs driven by forcings in the conventional vs. DEPHY formats are most significant.

The corresponding files can be found here:

  • case files: atomic_ERA5_Jan16T22Jan18T06_dephy.nml and atomic_ERA5_Jan16T22Jan18T06.nml under /scratch2/BMC/mcwi/I-kuan.Hu/CCPPSCM.v7/v7.0.0/scm/etc/case_config;
  • forcing files: atomic_ERA5_Jan16T22Jan18T06_dephy_SCM_driver.nc and atomic_ERA5_Jan16T22Jan18T06.nc under /scratch2/BMC/mcwi/I-kuan.Hu/CCPPSCM.v7/v7.0.0/scm/data/processed_case_input.

The runs are atomic_ERA5_Jan16T22Jan18T06_dephy_SCM_GFS_v16 and atomic_ERA5_Jan16T22Jan18T06_SCM_GFS_v16 under /scratch2/BMC/mcwi/I-kuan.Hu/CCPPSCM.v7/v7.0.0/scm/output.

Based on diagnostics using NCVIEW, I think the differences stem from the 2D variable rad_cloud_fraction, which affects 1D variables such as max_cloud_fraction and tprcp_rate_inst, and all the variables associated with surface turbulent and radiative heat fluxes. However, pwat still looks very similar between the runs driven by the different forcing formats.

The same period conducted using GFS_v17_p8 or RRFS_v1beta shows much smaller discrepancies between the runs driven by different formats of forcings, but I think those differences are still relatively noticeable compared to their counterparts for other periods.

Let me know if you don't have access to all the files listed above (I will then try to deliver them here). Thank you!

@dustinswales
Collaborator Author

@ihursmas Sorry, I was out yesterday. I'm taking a look at this now.

@dustinswales
Collaborator Author

@ihursmas @grantfirl
I created some plots for all the fields in output.nc files, and there are noticeable differences in most of the fields.

cd scm/test
./cmp_scmout.py -fbl /scratch2/BMC/mcwi/I-kuan.Hu/CCPPSCM.v7/v7.0.0/scm/output/atomic_ERA5_Jan16T22Jan18T06_SCM_GFS_v16/output.nc -frt /scratch2/BMC/mcwi/I-kuan.Hu/CCPPSCM.v7/v7.0.0/scm/output/atomic_ERA5_Jan16T22Jan18T06_dephy_SCM_GFS_v16/output.nc
NOTE: I-Kuan, if you want to use this script, you will need to modify it to point to a local directory.

T_force_tend:
[screenshot of plot]

qv_force_tend:
[screenshot of plot]

qv:
[screenshot of plot]

Temperature:
[screenshot of plot]

So the forcing is nearly identical between the DEPHY and non-DEPHY runs. There are some bitwise differences, but nothing that stands out to me as egregious. Yet the state in the SCM evolves quite differently?
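
To separate "bitwise" differences from real ones when comparing two output files field by field, something like the following could be used. It is a rough sketch under stated assumptions: the function names and the `roundoff_tol` threshold are made up for illustration, and the fields are assumed to have already been read from the two output.nc files into flat sequences of floats (reading the NetCDF files is left out).

```python
def max_abs_and_rel_diff(baseline, test):
    """Return (max absolute difference, max relative difference)."""
    if len(baseline) != len(test):
        raise ValueError("fields must have the same length")
    max_abs = 0.0
    max_rel = 0.0
    for b, t in zip(baseline, test):
        diff = abs(b - t)
        max_abs = max(max_abs, diff)
        scale = max(abs(b), abs(t))
        if scale > 0.0:
            max_rel = max(max_rel, diff / scale)
    return max_abs, max_rel


def classify_field(baseline, test, roundoff_tol=1e-6):
    """Label a field 'identical', 'roundoff' or 'different'.

    roundoff_tol is an arbitrary illustrative threshold, not an SCM value.
    """
    max_abs, max_rel = max_abs_and_rel_diff(baseline, test)
    if max_abs == 0.0:
        return "identical"
    if max_rel <= roundoff_tol:
        return "roundoff"
    return "different"
```

Applied to every variable in the two files, this would show at a glance which fields (e.g. the forcing tendencies) differ only at roundoff level and which state variables have genuinely diverged.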

@ihursmas

@dustinswales
Thank you for the further analysis!
Yes, the results you showed look consistent with what I saw. Fortunately, the different evolution between the DEPHY- and non-DEPHY-driven runs is only noticeable for this specific period (i.e. Jan16T22Jan18T06), and the differences seem to be smaller when using physics suites other than GFS_v16 (like GFS_v17_p8 and RRFS_v1beta). I honestly don't know what the cause of these differences could be, given the almost identical forcings in the DEPHY and non-DEPHY formats...

@dustinswales
Collaborator Author

@ihursmas No worries. @hertneky is looking into this closer to get to the root cause of these differences. We don't expect differences between runs with DEPHY vs. non-DEPHY, so we need to understand what's going on.

@ihursmas

ihursmas commented Sep 26, 2024

@dustinswales Thanks! I quickly checked the sensitivity to column_area (I set it to 1.69E8, and I think the default is 2E9): the T and qv differences are smaller, yet the qc difference (which would lead to cloud fraction differences) is larger. I'm also wondering whether the momentum nudging in the conventional format (set with mom_forcing_type = 3 and relax_time = 3600.0) was genuinely reflected in the DEPHY format, as I saw relatively large differences in u_force_tend and v_force_tend between the DEPHY and non-DEPHY runs (compared to the differences in T_force_tend and qv_force_tend, which are several orders of magnitude smaller than the values in the individual runs).

u_force_tend:
[plot: scm u_force_tend]
v_force_tend:
[plot: scm v_force_tend]
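
For reference, momentum nudging is usually written as a Newtonian relaxation toward a reference wind over a time scale such as relax_time. The sketch below is the generic textbook form, not code taken from the SCM, and `nudging_tendency` is a hypothetical name. Because the tendency depends on the model's own evolving wind, any divergence in u or v between two runs would feed back into u_force_tend and v_force_tend, unlike prescribed advective tendencies.

```python
def nudging_tendency(model_value, reference_value, relax_time):
    """Newtonian relaxation tendency: (reference - model) / relax_time.

    Generic form for illustration only; relax_time is in seconds.
    """
    if relax_time <= 0.0:
        raise ValueError("relax_time must be positive")
    return (reference_value - model_value) / relax_time


# Example: a model wind of 8 m/s nudged toward 10 m/s with relax_time = 3600 s
tendency = nudging_tendency(8.0, 10.0, 3600.0)  # (10 - 8) / 3600 m/s^2
```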

@hertneky
Collaborator

@ihursmas I've compared the configs for both and I don't see any discrepancies there. I am going to look for anything that might cause diffs in the code, specifically the u_/v_force_tend.

@ihursmas

ihursmas commented Oct 8, 2024

Hi @hertneky and @dustinswales, do we have any updates? Thanks!

@hertneky
Collaborator

hertneky commented Oct 8, 2024

Hi @ihursmas, I have not found anything that is specifically causing the differences we see. I replicated your runs and looked at various variables, many of which seemed to have rounding/precision errors that grow with time. In scm_input.F90, I did notice precision differences between the original and DEPHY formats when initializing some of the variables (double vs. single precision). I decided to build and run in 32-bit to see if the differences were reduced, but came across an issue with running the ATOMIC case in 32-bit for the DEPHY format. Note that a supplied case in DEPHY format, 'twpice', did not have the issue, so it may be unique to the ATOMIC DEPHY case. Again, this is just for single precision, as everything ran fine otherwise. @scrasmussen will help look at the single-precision issue.

@lisa-bengtsson
Contributor

@hertneky @dustinswales @ihursmas @grantfirl Hi! I'm just checking if anyone has any updates on this work? It seems that it is not related to precision, but something else in the tendency files?

@hertneky
Collaborator

@ihursmas @dustinswales @grantfirl

First, did you run the dephy_converter without any modifications? I may need to test mods and want to ensure I am using the same starting point as you.

There was an issue with how column area was applied for the DEPHY format. Fixing this issue had no effect on your latest results, since you had column_area set the same in both nmls.

  • It was not properly assigning the area from the DEPHY forcing file, so it was taking whatever was in the nml. The column_area in the nml for DEPHY is intended to be used only for scalability tests; otherwise the value set in the forcing file should be used. However, this was not working correctly.

There have been issues opened wrt w_ls/omega not being present in the DEPHY forcing.

  • For non-DEPHY format, your omega is read in and assigned to scm_state%omega
  • For DEPHY format, omega is not present in the forcing file, since forc_wap = 0, and is therefore not read in/assigned. It looks like omega is used by the physics, but I am no expert with the physics. @grantfirl @dustinswales Can you comment on this?

@ihursmas

@hertneky Thank you for getting back to this.
I didn't modify dephy_converter.py; all I did was run ./dephy_converter.py -n atomic_ERA5_[a certain period].
That's a good point about the lack of w_ls. I can generate a set of forcing files that include w_ls for sensitivity tests. My guess is that although w_ls does not impact the large-scale forcings (since I directly used the large-scale advective tendencies as forcings, i.e. thermo_forcing_type = 1 for the non-DEPHY format), it might play a role in the closure in the PBL scheme.

@ihursmas

ihursmas commented Dec 12, 2024

Hi all, I did several other tests and still haven’t been able to figure out what causes the issue. Interestingly, I tried the TWPICE case as a test and found discrepancies of similar magnitude, which makes me think this may be an intrinsic issue with the transformation of the forcing file format. Some details are provided below.

  • @hertneky The absence of omega/w_ls in the DEPHY forcing file I produced is due to the choice of thermo_forcing_type. For all the ATOMIC cases I set thermo_forcing_type = 1, which allows the SCM to use already-computed (horizontal and vertical) advective tendencies of temperature and moisture as forcings, and omega/w_ls will NOT be converted to the DEPHY forcing file. If we set thermo_forcing_type = 2, the SCM then uses the prescribed omega/w_ls (depending on which is available) and the modeled thermodynamic state to compute the advective tendencies of temperature and moisture, and omega/w_ls will be converted to the DEPHY forcing file. I tried setting thermo_forcing_type = 2 for the ATOMIC case (the period from 22 UTC on Jan. 16 to 06 UTC Jan. 18, 2020) that we’re trying to address here, and the discrepancies become even larger.
  • Providing w_ls in the non-DEPHY-format forcings that were later converted to the DEPHY-format forcings barely changes the result.
  • The TWPICE runs using the same GFS_v16 physics suite and with the non-DEPHY vs. DEPHY forcings:
    [plot: scm T]
    [plot: scm qv]
    To me, the magnitudes of the T and qv discrepancies are similar to those for the ATOMIC period from 22 UTC on Jan. 16 to 06 UTC Jan. 18, 2020 (the scales of the color bars differ from those of the ATOMIC run).

@lisa-bengtsson
Contributor

Thanks I-Kuan for following up on this, the difference is quite large and doesn't seem to be just truncation... Is there some vertical interpolation involved in the conversion?

@ihursmas

@lisa-bengtsson No, the SCM handles the temporal and vertical interpolations of forcings after reading in the forcings.

@grantfirl
Collaborator

@hertneky @dustinswales @ihursmas @lisa-bengtsson

As far as I can tell, the differences that we're seeing with this particular ATOMIC case time period using the SCM_GFS_v16 suite are due to differences in surface fluxes from NSST. Since marine boundary layer clouds are acutely sensitive to surface fluxes, if the surface fluxes differ by a small amount, this difference can easily be amplified by the cloud response. If one simulation develops clouds and one does not, the impact on the column for all state variables can be huge.

Here are the surface fluxes comparisons using the SCM_GFS_v16 suite:
time_series_lhf.pdf
time_series_shf.pdf

If I run both DEPHY and non-DEPHY formatted cases with the "no_nsst" version of the SCM_GFS_v16 suite (uses sfc_ocean instead and the SST is actually held to the forced value), then there are relatively negligible differences between the cases.

Here are the surface fluxes from the SCM_GFS_v16_no_nsst suite:
time_series_lhf.pdf
time_series_shf.pdf

For oceanic field campaign cases, I think that we should be using either suites with the sfc_ocean scheme rather than NSST unless we have the additional required input like depth of the thermocline, etc. OR use prescribed surface fluxes. I'm not at all confident in how the NSST scheme works. It appears that if we give it an initial SST, it produces reasonable surface fluxes, but it also produces its own SST that diverges slightly from the prescribed value. It is only using the prescribed SST for the first time step. It must be using some kind of default values for the mixed layer above the thermocline, but I'm not sure if that's really what we want?

@grantfirl
Collaborator

@ihursmas @dustinswales @grantfirl

  • For non-DEPHY format, your omega is read in and assigned to scm_state%omega
  • For DEPHY format, omega is not present in the forcing file, since forc_wap = 0, and is therefore not read in/assigned. It looks like omega is used by the physics, but I am no expert with the physics. @grantfirl @dustinswales Can you comment on this?

I definitely need to look at what is happening in the non-DEPHY format case. If it is reading and assigning omega, this does get passed to physics and used by some schemes, even if it doesn't get applied in the forcing. This is likely not desired and would lead to differences between non-DEPHY and DEPHY cases if the DEPHY cases are completely ignoring any omega values if it is not used in the forcing. If this is the case, the DEPHY cases are doing it right and the non-DEPHY cases are doing it wrong, IMO.

@grantfirl
Collaborator

I do think that some of what we're seeing is related to the growth of differences in the least-significant digits, akin to single- vs. double-precision changes. If the changes always start out small and grow in time, this is consistent with that possibility. One way to test this would be to make two DEPHY cases that differ in some (or all) variables by some tiny amount and see if the evolution of differences is similar to what one observes between non-DEPHY and DEPHY versions of the same case.
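
The perturbation test proposed above could be prototyped along these lines. This is only an illustrative sketch: `perturb` shows one way to apply a tiny relative perturbation to a variable read from a case file, and the logistic map is a toy stand-in for a nonlinear model (it is not the SCM) that shows how a roundoff-sized difference can grow with time. All names and parameter values here are assumptions.

```python
import random


def perturb(values, rel_eps=1e-12, seed=0):
    """Scale each value by (1 + r), with r uniform in [-rel_eps, rel_eps]."""
    rng = random.Random(seed)
    return [v * (1.0 + rng.uniform(-rel_eps, rel_eps)) for v in values]


def logistic_trajectory(x0, steps, r=3.9):
    """Iterate the logistic map, a toy stand-in for a nonlinear model."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def difference_growth(x0=0.3, rel_eps=1e-12, steps=200):
    """Absolute difference between a baseline and a slightly perturbed run."""
    base = logistic_trajectory(x0, steps)
    pert = logistic_trajectory(x0 * (1.0 + rel_eps), steps)
    return [abs(a - b) for a, b in zip(base, pert)]
```

If the differences between two perturbed DEPHY runs grow at a rate similar to the non-DEPHY vs. DEPHY differences, that would support the precision explanation; if they stay much smaller, something else is likely going on.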

@grantfirl
Collaborator

I definitely need to look at what is happening in the non-DEPHY format case. If it is reading and assigning omega, this does get passed to physics and used by some schemes, even if it doesn't get applied in the forcing. This is likely not desired and would lead to differences between non-DEPHY and DEPHY cases if the DEPHY cases are completely ignoring any omega values if it is not used in the forcing. If this is the case, the DEPHY cases are doing it right and the non-DEPHY cases are doing it wrong, IMO.

This does seem to be the case for non-DEPHY cases. The statein%vvl variable that has the standard name of lagrangian_tendency_of_air_pressure is associated by pointer to the scm_state%omega variable that is given the time and space-interpolated values in scm_forcing.F90 regardless of what the thermo_forcing_type control was set to. Schemes can use this variable even if it is not used for forcing.

While this is certainly wrong, I wonder if it is a big deal because we're moving on from the non-DEPHY files anyway? It does appear to be controlled correctly for the DEPHY cases (vertical velocities are initialized and kept at zero if the forc_wap and forc_w global attributes are both false), although it also might not be the desired behavior to pass in forcing omega/w to physics in addition to using it in forcing.
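
The behaviour described in this comment can be summarized in a small sketch. It is a simplified Python model of what is stated above, not the SCM's Fortran source; the function name and the `case_format` strings are hypothetical.

```python
def omega_seen_by_physics(case_format, forcing_omega, forc_wap=False, forc_w=False):
    """Return the vertical-velocity profile the physics would receive.

    Simplified model of the behaviour described in this thread:
    non-DEPHY cases always expose the interpolated forcing omega
    (regardless of thermo_forcing_type), while DEPHY cases keep zeros
    unless the forc_wap or forc_w global attribute is set.
    """
    if case_format == "non-DEPHY":
        return list(forcing_omega)
    if case_format == "DEPHY":
        if forc_wap or forc_w:
            return list(forcing_omega)
        return [0.0] * len(forcing_omega)
    raise ValueError(f"unknown case format: {case_format}")
```

Under this model, a case whose forcing file carries a nonzero omega but does not use it for forcing would hand different vertical velocities to the physics depending on the file format.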

@ihursmas

ihursmas commented Dec 18, 2024

For oceanic field campaign cases, I think that we should be using either suites with the sfc_ocean scheme rather than NSST unless we have the additional required input like depth of the thermocline, etc. OR use prescribed surface fluxes. I'm not at all confident in how the NSST scheme works. It appears that if we give it an initial SST, it produces reasonable surface fluxes, but it also produces its own SST that diverges slightly from the prescribed value. It is only using the prescribed SST for the first time step. It must be using some kind of default values for the mixed layer above the thermocline, but I'm not sure if that's really what we want?

@grantfirl Thank you for digging into this further and figuring out one dominant cause! Yes, this is a very good point: we did find that the simulated SST drifted away from the observed values when using the default suite and namelist files for both SCM_GFS_v16 and SCM_GFS_v17_p8. So we used the prescribed SST for demonstrating the ATOMIC case in our manuscript, although I did this by modifying the NSST scheme rather than using the "no_nsst" set of suite and namelist files directly.

I guess the reason I did not mention the role of prescribed SST is that I wanted the ATOMIC case to be tested more broadly rather than having to be run with certain constraints (e.g. prescribed boundary conditions such as SST). I thought that if the SCM drifted away, it would drift similarly whether driven with the non-DEPHY or the DEPHY format forcings. But apparently I was wrong, based on what you showed here. So maybe we should add a README file or something similar that tells users what the most "scientifically supported" configuration for the ATOMIC case is? And would this need to be applied to other cases, such as TWPICE, which also shows non-negligible differences between the runs driven by non-DEPHY and DEPHY format forcings? But then the flexibility of testing prescribed vs. simulated fields at the atm-ocn interface would be lost, so I don't know what the best solution would be.

@ihursmas

ihursmas commented Dec 18, 2024

I definitely need to look at what is happening in the non-DEPHY format case. If it is reading and assigning omega, this does get passed to physics and used by some schemes, even if it doesn't get applied in the forcing. This is likely not desired and would lead to differences between non-DEPHY and DEPHY cases if the DEPHY cases are completely ignoring any omega values if it is not used in the forcing. If this is the case, the DEPHY cases are doing it right and the non-DEPHY cases are doing it wrong, IMO.

This does seem to be the case for non-DEPHY cases. The statein%vvl variable that has the standard name of lagrangian_tendency_of_air_pressure is associated by pointer to the scm_state%omega variable that is given the time and space-interpolated values in scm_forcing.F90 regardless of what the thermo_forcing_type control was set to. Schemes can use this variable even if it is not used for forcing.

While this is certainly wrong, I wonder if it is a big deal because we're moving on from the non-DEPHY files anyway? It does appear to be controlled correctly for the DEPHY cases (vertical velocities are initialized and kept at zero if the forc_wap and forc_w global attributes are both false), although it also might not be the desired behavior to pass in forcing omega/w to physics in addition to using it in forcing.

@grantfirl Yes I agree that I should have omega or w_ls available in the non-DEPHY format forcing file, even if thermo_forcing_type is set to 1 for the ATOMIC case, to ensure that the large-scale vertical motion taken by whatever schemes that need it is consistent with the large-scale vertical advective tendencies.

But for the issue we're trying to address here, I don't think the presence of omega or w_ls causes the differences between that specific ATOMIC run driven by non-DEPHY and DEPHY format forcings. I did a test regarding the inclusion of omega and w_ls in the forcing file, and the results of the experiments are very similar, if not perfectly identical, to the set of runs where the forcing file does not carry omega or w_ls. I guess one reason is that both the shallow and deep convection schemes, which as far as I know may use omega or w_ls for their triggering criteria and/or closures, are barely or not-at-all activated in that specific ATOMIC run. Yet again I totally agree that for the sake of consistency, I should include omega and w_ls in the non-DEPHY format forcing file that will be converted to DEPHY format forcing file.
