Tests fail when using new ifx compiler #31

Open
eirikurj opened this issue Dec 19, 2024 · 6 comments

eirikurj commented Dec 19, 2024

Description

When compiling with the new ifx compiler, some tests are failing. This is preventing us from updating our docker images as part of https://github.com/mdolab/docker/pull/266.

Steps to reproduce issue

  1. Pull the docker container mdolab/public:u22-intel-impi-latest-amd64-failed (specifically sha256:d318081f9bf4cc2c110685d4592fe6ee2b0f7799aff8cbefe87e138d04b224b7)
  2. In ~/repos/cmplxfoil run testflo -v -n 1 .

Current behavior

When running one of the failed tests, for example, testflo -n 1 -s -v ./tests/test_solver_class.py:TestDerivativesCST.test_alpha_sens on the docker container the following error is printed

mdolabuser@a6661930c69d:~/repos/cmplxfoil$ testflo -n 1 -s -v ./tests/test_solver_class.py:TestDerivativesCST.test_alpha_sens
######## Fitting CST coefficients to coordinates in /home/mdolabuser/repos/cmplxfoil/tests/n0012BluntTE.dat ########
Upper surface
    L2 norm of coordinates in dat file versus fit coordinates: 0.0003504064468410577
    Fit CST coefficients: [0.16601024 0.13092967]
Lower surface
    L2 norm of coordinates in dat file versus fit coordinates: 0.0003504064499657832
    Fit CST coefficients: [-0.16601024 -0.13092967]
+----------------------------------------------------------------------+
|  Switching to Aero Problem: fc                                       |
+----------------------------------------------------------------------+
 LEXITFLAG TRUE, GOING TO 90...
 LEXITFLAG TRUE, GOING TO 90...
 LEXITFLAG TRUE, GOING TO 90...
 LEXITFLAG TRUE, GOING TO 90...
 LEXITFLAG TRUE, GOING TO 90...
 LEXITFLAG TRUE, GOING TO 90...
 LEXITFLAG TRUE, GOING TO 90...
./tests/test_solver_class.py:TestDerivativesCST.test_alpha_sens  ... FAIL (00:00:5.03, 172 MB)
Traceback (most recent call last):
  File "/home/mdolabuser/repos/cmplxfoil/./tests/test_solver_class.py", line 369, in test_alpha_sens
    np.testing.assert_allclose(checkSensFD, actualSensCS, rtol=relTol, atol=absTol)
  File "/home/mdolabuser/.pyenv/versions/3.11.9/lib/python3.11/site-packages/numpy/testing/_private/utils.py", line 1504, in assert_allclose
    assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
  File "/home/mdolabuser/.pyenv/versions/3.11.9/lib/python3.11/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/mdolabuser/.pyenv/versions/3.11.9/lib/python3.11/site-packages/numpy/testing/_private/utils.py", line 797, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Not equal to tolerance rtol=0.002, atol=1e-05

Mismatched elements: 1 / 1 (100%)
Max absolute difference: 10005205.62637494
Max relative difference: inf
 x: array(10005205.626375)
 y: array(0.)



The following tests failed:
test_solver_class.py:TestDerivativesCST.test_alpha_sens


Passed:  0
Failed:  1
Skipped: 0


Ran 1 test using 1 processes
Wall clock time:   00:00:5.76
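As an aside, the "Max relative difference: inf" in the output is expected whenever the reference value is exactly zero: the relative tolerance is measured against the desired values, so only atol provides any slack. A minimal reproduction of the comparison, with the values copied from the failure above:

```python
import numpy as np

# When the desired value is exactly zero, rtol contributes no slack
# (rtol * |desired| == 0), so only atol applies, and the reported
# relative difference is inf.
actual = np.array(10005205.62637494)  # value from the failing test
desired = np.array(0.0)

try:
    np.testing.assert_allclose(actual, desired, rtol=0.002, atol=1e-05)
    raised = False
except AssertionError:
    raised = True
```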

Expected behavior

All tests should pass

Observations

  • The build process looks somewhat messy when using Intel: there seems to be a mix of compilers, with gcc for the interface C code, ifx for compiling the source, and ifort for the library. While this is probably not the cause, we should address it.
  • Since this is F77 code, it is possible we have hit an ifx issue not yet encountered in our other repositories, which are mostly Fortran 90 or newer. The porting guide might help, but it states that F77 support is completely implemented.
  • I did some minor tests, and simply removing optimization, i.e., changing -O2 to -O0 and rebuilding, makes the tests pass. This indicates that some optimization affects the code when using ifx in a way that does not show up with ifort.

I would appreciate it if someone could dig into this and identify the issue and possible solutions.
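For context on what the failing test checks: it compares a finite-difference derivative (checkSensFD) against a complex-step derivative (actualSensCS). The complex-step method relies on exact complex arithmetic, so a miscompilation of complex operations would break it directly. A minimal Python sketch of the technique (illustration only, not CMPLXFOIL's implementation):

```python
import numpy as np

def complex_step_derivative(f, x, h=1e-30):
    # Perturb the input along the imaginary axis; the imaginary part of
    # the result divided by the step gives the derivative with no
    # subtractive cancellation, so h can be tiny.
    return np.imag(f(x + 1j * h)) / h

def finite_difference_derivative(f, x, h=1e-7):
    # Central finite difference, limited by subtractive cancellation.
    return (f(x + h) - f(x - h)) / (2 * h)

f = lambda x: np.exp(x) * np.sin(x)
x0 = 1.5
cs = complex_step_derivative(f, x0)
fd = finite_difference_derivative(f, x0)
# Analytic derivative for comparison: exp(x) * (sin(x) + cos(x))
exact = np.exp(x0) * (np.sin(x0) + np.cos(x0))
```

If the compiler mishandles complex arithmetic under optimization, the complex-step result degenerates (e.g., to zero) while the finite-difference result stays finite, which matches the x/y mismatch in the traceback.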

A-CGray commented Dec 19, 2024

To add to the confusion, if you build cmplxfoil using the gcc config file on these images, the tests still fail, even though every part of the build process is done with a GCC compiler (either gcc or gfortran). See the attached log below.

cmplxfoil-make.log

Given that the tests don't fail on the GCC images, what is different between the Intel and GCC images that could cause gcc-compiled code to behave differently?

Ignore the above: I forgot to pip install cmplxfoil again after rebuilding with gcc. After doing that, the tests pass, so this is just an Intel compiler issue.


A-CGray commented Jan 14, 2025

This is tenuous at best, but it seems that as of 2023, ifx was known not to work well with complex numbers, at least from a performance perspective.

Also, some of the default floating-point arithmetic behaviour differs between ifort and ifx: ifort checks for NaNs by default, while ifx does not.
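Illustration only (Python rather than Fortran): without a runtime NaN check, a NaN generated early simply propagates through later arithmetic and nothing fails until a final result is compared, which is why this default-behaviour difference could plausibly matter here.

```python
import numpy as np

# With no runtime NaN checking (ifx's default, per the comment above),
# an invalid operation produces NaN silently...
x = np.float64(-1.0)
with np.errstate(invalid="ignore"):
    y = np.sqrt(x)          # NaN, no error raised

# ...and the NaN then propagates through all downstream arithmetic.
z = 2.0 * y + 1.0           # still NaN
```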


eirikurj commented Jan 14, 2025

The NaN check might be a problem, since -fp-model=fast is the default. Although I did not report it here, I believe I ran this with precise and strict at some point and it had no effect; we can test this.
The issue with ifx, complex numbers, and optimization seems like a more plausible explanation, since the code does work with optimizations disabled (-O0). It is possible that a newer compiler version will fix this, but we should check the compiler release notes.

eirikurj commented

Did a very quick test with the latest image, trying these combinations of optimization flags and floating-point models. All three floating-point models pass, but only when optimization is turned off:

| Opt flag | fast | precise | strict |
| -------- | ---- | ------- | ------ |
| -O0      | pass | pass    | pass   |
| -O1      | fail | fail    | fail   |
| -O2      | fail | fail    | fail   |
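For reference, a hypothetical shell sketch of the sweep above; the FF_FLAGS variable name is an assumption for illustration, not the repo's actual config variable:

```shell
# Hypothetical sketch: print the build command for each combination of
# optimization level and floating-point model tested above (9 in total).
sweep() {
  for opt in -O0 -O1 -O2; do
    for fp in fast precise strict; do
      echo "make clean && make FF_FLAGS=\"$opt -fp-model=$fp\""
    done
  done
}
sweep
```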


A-CGray commented Jan 14, 2025

Damn, it's not that then. I'm also not fully convinced this is purely a complex-number issue, as some of the failing tests don't involve the complexified code.


A-CGray commented Jan 14, 2025

@eirikurj, as a fallback, and to avoid holding up https://github.com/mdolab/docker/pull/266 any longer, we could change the logic in the Intel config file so that we use ifort -O2 if available and ifx -O0 otherwise?

I've implemented this in #33
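The proposed fallback could look roughly like this as a shell sketch (an illustration of the idea only, not the actual config-file change in #33; the variable names are assumptions):

```shell
# Sketch: prefer ifort with full optimization; fall back to ifx with
# optimization disabled until the ifx miscompilation is resolved.
if command -v ifort >/dev/null 2>&1; then
  FC=ifort
  OPT_FLAGS="-O2"
else
  FC=ifx
  OPT_FLAGS="-O0"
fi
echo "FC=$FC OPT_FLAGS=$OPT_FLAGS"
```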
