Skip to content

Commit

Permalink
or1k: we only support ordered compares
Browse files Browse the repository at this point in the history
Tighten the conditions to support ordered fpu compares only.

This fixes github issue #1
  • Loading branch information
stffrdhrn committed Apr 1, 2019
1 parent 6049a0d commit 975f33d
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions gcc/config/or1k/or1k.md
Original file line number Diff line number Diff line change
Expand Up @@ -481,7 +481,7 @@
(define_expand "cstore<F:mode>4"
[(set (match_operand:SI 0 "register_operand" "")
(if_then_else:F
(match_operator 1 "comparison_operator"
(match_operator 1 "ordered_comparison_operator"
[(match_operand:F 2 "register_operand" "")
(match_operand:F 3 "register_operand" "")])
(match_dup 0)
Expand Down Expand Up @@ -592,7 +592,7 @@
(define_expand "cbranch<F:mode>4"
[(set (pc)
(if_then_else
(match_operator 0 "comparison_operator"
(match_operator 0 "ordered_comparison_operator"
[(match_operand:F 1 "register_operand" "")
(match_operand:F 2 "register_operand" "")])
(label_ref (match_operand 3 "" ""))
Expand Down

15 comments on commit 975f33d

@bandvig
Copy link

@bandvig bandvig commented on 975f33d Apr 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stffrdhrn
Stafford, the update makes possible to build NewLIB itself with options -mhard-float -mdouble-float. However it looks like it brings an issue into printf() internals when printing a floating point value.
Example sinf_out.c:

#include <stdio.h>
#include <math.h>

int main(void)
{
  float x = 0.0f;
  float y;
  do
  {
    y = sinf(x);
    printf(" x = %f  y = %f\r\n", x, y);
    printf(" x = %24.17f  y = %24.17f\r\n", x, y);
    x = x + 1.0f;
  }
  while (x < 4.0f);
  return 0;
}

Being compiled with or1k-elf-gcc -pipe -O2 -mboard=atlys sinf_out.c -lm -o sinf_out.elf it's output is

 x = 0.000000  y = 0.000000
 x =      0.00000000000000000  y =      0.00000000000000000
 x = 1.000000  y = 0.8415-1
 x =      1.00000000000000000  y =      0.84147101640701294
 x = 2.000000  y = 0.91/30-
 x =      2.00000000000000000  y =      0.90929740667343140
 x = 3.000000  y = 0.0>1120
 x =      3.00000000000000000  y =      0.0>112000167369843

If I build it with previous version of tool-chain (without the modification and NewLIB was build without -mhard-float -mdouble-float options) it’s output is normal:

 x = 0.000000  y = 0.000000
 x =      0.00000000000000000  y =      0.00000000000000000
 x = 1.000000  y = 0.841471
 x =      1.00000000000000000  y =      0.84147101640701294
 x = 2.000000  y = 0.909297
 x =      2.00000000000000000  y =      0.90929740667343140
 x = 3.000000  y = 0.141120
 x =      3.00000000000000000  y =      0.14112000167369843

Not sure could it be related to unordered fp- comparison or not.

@stffrdhrn
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for testing this. This is interesting, the extra > and / characters seem strange. Is this running on FPGA or simulator? Just from the nature it looks like some sort of timing/corruption issue.

I tried to compile newlib myself right now and I got the below error. Its interesting that the compiler error is in the dtoa code (double to ascii) which is used for your printf test.

during RTL pass: expand
In file included from ../newlib/newlib/libc/stdlib/dtoa.c:30:
/home/shorne/work/gnu-toolchain/newlib/newlib/libc/include/stdlib.h: In function '_dtoa_r':
/home/shorne/work/gnu-toolchain/newlib/newlib/libc/include/stdlib.h:292:8: internal compiler error: in emit_move_insn, at expr.c:3746
  292 | char * _dtoa_r (struct _reent *, double, int, int, int *, int*, char**);
      |        ^~~~~~~
0x919da1 emit_move_insn(rtx_def*, rtx_def*)
        ../../gcc/gcc/expr.c:3745
0x8fdf9d force_reg(machine_mode, rt

Note, I might be getting this failure due to my recent change to fix issue #4 .

@stffrdhrn
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note, I got the failure with compiling newlib with CFLAGS_FOR_TARGET=-O2 -mhard-float -mdouble-float. When removing -O2 I don't get the failure.

I will need to fix it anyway.

@stffrdhrn
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note, when running in sim (or1k-elf-run) I don't see the issue.

or1k-elf-run ./printfp
 x = 0.000000  y = 0.000000
 x =      0.00000000000000000  y =      0.00000000000000000
 x = 1.000000  y = 0.841471
 x =      1.00000000000000000  y =      0.84147101640701294
 x = 2.000000  y = 0.909297
 x =      2.00000000000000000  y =      0.90929740667343140
 x = 3.000000  y = 0.141120
 x =      3.00000000000000000  y =      0.14112000167369843

BUT Running in icarus I can reproduce the issue: fusesoc run --target marocchino_tb --tool icarus mor1kx-generic --elf-load ./printfp --option_orfpx64a32_abi GCC9

Loading       26121 words
                   0 : Illegal Wishbone B3 cycle type (xxx)
 x = 0.000000  y = 0.000000
 x =      0.00000000000000000  y =      0.00000000000000000
 x = 1.000000  y = 0.8415-1
 x =      1.00000000000000000  y =      0.84147101640701294
 x = 2.000000  y = 0.91/30-
 x =      2.00000000000000000  y =      0.90929740667343140
 x = 3.000000  y = 0.0>1120
 x =      3.00000000000000000  y =      0.0>112000167369843

@bandvig
Copy link

@bandvig bandvig commented on 975f33d Apr 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stffrdhrn
I run it on FPGA. Simulation typically produces too long output for analysis. Even so the example, I believe, could be reduced down to

#include <stdio.h>
#include <math.h>
int main(void)
{
  printf("%f\r\n",sinf(2.0f));
  return 0;
}

it would be useful to have a reference for debug. Is it possible to use sim to generate, say, sequence of PC of completed instruction and written back values?

Perhaps, we additionally should check GCC behavior when it processed expression which involves operands o different types, for example "an integer" <add> "a double". I've remembered that at 1-st time I compiled Whetstone it operated incorrectly. To fix it I had to edit Whetstone sources by adding explicit type conversions into such expressions. Have GCC got an option to force types coercion before computing an expression with mixed type of operands?

@stffrdhrn
Copy link
Owner Author

@stffrdhrn stffrdhrn commented on 975f33d Apr 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be useful to have a reference for debug. Is it possible to use sim to generate, say, sequence of PC of completed instruction and written back values?

This can be done in sim. see:

or1k-elf-run --trace-insn ./printfp

I uploaded my output: https://gist.github.com/06c6e0dc0736b363c8bf2efea0981794

0x006a70  l.movhi r25,0x1      - gpr <- 0x10000                                 
0x006a74  l.lwz r23,-1708(r27) - gpr <- 0x0                                     
0x006a78  l.ori r8,r0,0x1f     - gpr <- 0x1f                                    
0x006a7c  lf.sub.d r24,r24,r21 - fd32r <- -1.0000*2^-1                          
0x006a80  l.addi r27,r25,-1704 - sys-sr-cy <- 0x1, sys-sr-ov <- 0x0, gpr <- 0xf958

@stffrdhrn
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have GCC got an option to force types coercion before computing an expression with mixed type of operands?

This should be done by default. I don't know of any option to force it to so.

@bandvig
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. As I've never build sim I've got a question. Could I build sim it while building binutils or I have to build it after completion GCC stage 2 (along with building gdb)?

@stffrdhrn
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is part of binutils and has no dependencies on gcc. You should already be building and installing it if you are building binutils.

Does or1k-elf-run exist in your path?

@stffrdhrn
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note another option, and probably better is to use or1ksim as it produces insn traces that match the or1k verilog monitor.

Did anyone as orfp32a64 to or1ksim?

@bandvig
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or1k-elf-run is definitely not present in my path because I build binutils traditionally with --disable-sim configuring option. :)
I doubt if or1ksim supports orfpx64a32.
Well, I'll rebuild binutils with sim soon.

@stffrdhrn
Copy link
Owner Author

@stffrdhrn stffrdhrn commented on 975f33d Apr 10, 2019 via email

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bandvig
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note, when running in sim (or1k-elf-run) I don't see the issue.

or1k-elf-run ./printfp
 x = 0.000000  y = 0.000000
 x =      0.00000000000000000  y =      0.00000000000000000
 x = 1.000000  y = 0.841471
 x =      1.00000000000000000  y =      0.84147101640701294
 x = 2.000000  y = 0.909297
 x =      2.00000000000000000  y =      0.90929740667343140
 x = 3.000000  y = 0.141120
 x =      3.00000000000000000  y =      0.14112000167369843

BUT Running in icarus I can reproduce the issue: fusesoc run --target marocchino_tb --tool icarus mor1kx-generic --elf-load ./printfp --option_orfpx64a32_abi GCC9

Loading       26121 words
                   0 : Illegal Wishbone B3 cycle type (xxx)
 x = 0.000000  y = 0.000000
 x =      0.00000000000000000  y =      0.00000000000000000
 x = 1.000000  y = 0.8415-1
 x =      1.00000000000000000  y =      0.84147101640701294
 x = 2.000000  y = 0.91/30-
 x =      2.00000000000000000  y =      0.90929740667343140
 x = 3.000000  y = 0.0>1120
 x =      3.00000000000000000  y =      0.0>112000167369843

@stffrdhrn
I've found the problem.
C/C++ double to integer conversion assumes truncation (rounding toward zero). But lf.ftoi.s(d) performes conversion in according with rounding mode bits of FPCSR, And these bits are "nearest-even ronding mode" by default.
I added code with change rounding mode to "toward zero" at the beginning of main() of the example and Whetstone. And printf() become printing floating point values correctly.
Not sure how the issue should be fixed correctly in hardware. As a temporary workaround I’m going to add parameter like OPTION_FTOI_ROUNDING = “IEEE”/”CPP”, where “CPP” is supposed to be default and it should force "toward zero" for lf.ftoi.s(d) instruction regardless of rounding mode bits of FPCSR.

@stffrdhrn
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great find. This is another thing we should clarify in the spec. But why is this something we didn't see before? Did we never build libc with fpu before?

@bandvig
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously I build NewLIB (and never uClibc) with FPU for GCC5-based tool-chain only. I suppose that GCC5 just doesn't involve lf.ftoi.s(d) but calls appropriate subroutine instead.

Please sign in to comment.