Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ward gets confused by multiple statements on a single line #11

Open
fingolfin opened this issue Sep 22, 2015 · 7 comments
Open

Ward gets confused by multiple statements on a single line #11

fingolfin opened this issue Sep 22, 2015 · 7 comments
Labels

Comments

@fingolfin
Copy link
Member

In the process of aligning the codebases in master and hpcgap-default, I made commit f862255ff449fcb85228cebc1862d6cd7680eb32 which basically only changes whitespace, yet causes HPC-GAP to segfault.

Eventually, I discovered this was due to the changes in oper.c, which confused ward. Consider this diff:

@@ -1751,8 +1752,7 @@ Obj DoOperation1Args (
     Obj                 prec;

     /* get the types of the arguments                                      */
-    type1 = TYPE_OBJ_FEO( arg1 );
-    id1 = ID_TYPE( type1 );
+    type1 = TYPE_OBJ_FEO( arg1 );  id1 = ID_TYPE( type1 );

     /* try to find an applicable method in the cache                       */
     cache = 1+ADDR_OBJ( CacheOper( oper, 1 ) );

The old code (on two lines) causes ward to output this:

    type1 = TYPE_OBJ_FEO( arg1 );
    ReadGuard(type1);
    id1 = ID_TYPE( type1 );

The new code (on one line) causes ward to output this:

    ReadGuard(type1),
    type1 = TYPE_OBJ_FEO( arg1 );  id1 = ID_TYPE( type1 );

While I can of course just revert that commit, I think this is a rather sever issue, as new code is likely to run into this issue again. Plus, I really prefer in this particular case to have the two statements on one line (I usually never, but this is one of the few cases where I think "breaking the rules" is justified).

Ideally, this should just be handled right, possibly by ward inserting a newline after the first semicolon; if that is infeasible, perhaps at least an error can be triggered?

fingolfin added a commit to gap-system/gap that referenced this issue Sep 22, 2015
In this case, two statements on a single line confuse ward, resulting
in wrong read guards, which in turn lead to a crash.

See gap-system/ward#11
@rbehrends
Copy link
Contributor

Yes, this is a known issue, alas. The rewriting part of Ward basically relies on regex matching and needs to be changed. I have a partial rewrite in the works, but due to other work that has gotten pushed back again and again (primarily because workarounds existed and so it never became a huge priority, especially because only a few people were working on HPC-GAP).

@fingolfin fingolfin added the bug label Mar 22, 2017
@fingolfin
Copy link
Member Author

This issue is a blocker for the HPC-GAP kernel unification. It would be really great to get this resolved.

@rbehrends
Copy link
Contributor

rbehrends commented May 5, 2017

The underlying problem is that Ward examines the C source after it has gone through the preprocessor. This means that it's difficult to identify the position within a line in the original source code where the access that has to be guarded will appear. Short of writing your own preprocessor that tracks the source coordinates fully, that is.

A possible solution would be the following: split the line into statements if there are multiple semicolons and it isn't some other construct, such as a for statement (not trivial, but feasible), and skip all statements that don't contain the identifier that needs to be guarded. This is still a heuristic, but possibly a better one. Note that it still won't help with the above example, as the identifier occurs in the first statement, too.

Creating a warning/error if there are multiple statements on a line would be an easier option, as it can probably be done in the parser proper instead of the rewrite engine, but still not entirely trivial.

@fingolfin
Copy link
Member Author

I am still confused. When you talk about "preprocessor", do you mean the usual C preprocessor?
If no, then which preprocessor do you mean? If yes, then why does that matter? After all, the preprocessor does not split lines with multiple statements.

@rbehrends
Copy link
Contributor

I do mean the C preprocessor. The problem is that Ward does not see the original source code. If you write (say):

Obj obj = foo; SET_ELM_PLIST(list,1,obj);

what Ward sees is:

Obj obj = foo; do { Obj sep_Obj = (obj); ADDR_OBJ(list)[1] = sep_Obj; } while (0);

(Obviously, ADDR_OBJ also gets expanded, but I've elided that for simplicity.)

It can determine from this that there is a write guard necessary for list because ADDR_OBJ(list)[1] is assigned to. What it cannot do is relate that back to the original statement and tell you whether the assignment was originally part of Obj obj = foo or SET_ELM_PLIST(list,1,obj).

@fingolfin
Copy link
Member Author

Ahhh, now I see, thank you!

@fingolfin
Copy link
Member Author

Of course in the example above, the code

type1 = TYPE_OBJ_FEO( arg1 );  id1 = ID_TYPE( type1 );

gets turned into

type1 = TYPE_OBJ_FEO( arg1 );  id1 = ((*(Bag**)(type1))[4]);

So in that particular case, one might be able to resolve it (at least if one is willing to do something hackish), but of course in the general case, this is is hopeless.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants