forked from HowardHinnant/papers
-
Notifications
You must be signed in to change notification settings - Fork 5
/
Copy pathUFCS-is-breaking.html
377 lines (331 loc) · 15.1 KB
/
UFCS-is-breaking.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
<!DOCTYPE HTML>
<html>
<head>
<title>UFCS is a breaking change, of the absolutely worst kind</title>
<style>
p {text-align:justify}
li {text-align:justify}
blockquote.note
{
background-color:#E0E0E0;
padding-left: 15px;
padding-right: 15px;
padding-top: 1px;
padding-bottom: 1px;
}
ins {color:#00A000}
del {color:#A00000}
</style>
</head>
<body>
<address align=right>
Document number: P3027R0
<br/>
Audience: EWG
<br/>
<br/>
<a href="mailto:[email protected]">Ville Voutilainen</a><br/>
<a href="mailto:[email protected]">Mark Gillard</a><br/>
<a href="mailto:[email protected]">Mikael Kilpeläinen</a><br/>
<a href="mailto:[email protected]">Elias Kosunen</a><br/>
<a href="mailto:[email protected]">Jari Ronkainen</a><br/>
<a href="mailto:[email protected]">Mikko Sivulainen</a><br/>
<a href="mailto:[email protected]">Lauri Vasama</a><br/>
2023-10-26<br/>
</address>
<hr/>
<h1 align=center>UFCS is a breaking change, of the absolutely worst kind</h1>
<h2>A part zero general remark</h2>
<p>I'm sure some of you will read this paper, and have hasty
reactions thinking "no, no, it doesn't do that, you have misunderstood
the proposal, it will look for a member <b>first</b> <em>and only then</em>
look for a free function, so the problems you're concerned
about don't arise".</p>
<p>They do. That doesn't help <b>at all</b>. So please read this paper
carefully, and keep your mind, eyes, and ears open.</p> Contemplate
what it says, and avoid those hasty reactions.</p>
<h2>Abstract</h2>
<p>This paper explains why C++ should never, not ever,
adopt a Unified Function Call
Syntax that can turn (syntactic) member function calls into calls of free
functions, possibly using ADL.</p>
<p>The proposal <a href="https://open-std.org/JTC1/SC22/WG21/docs/papers/2023/p3021r0.pdf">P3021</a> (correctly) says thus:
<pre><blockquote>Q: Is this fully backward-compatible without breaking existing code?
A: Yes
</blockquote></pre>
</p>
<p>But what that means is not a feature, it's a bug, it's a HUGE bug.</p>
<p>Correct, it doesn't "break" your existing code in a way that would
turn it from well-formed to ill-formed. And that's a problem, because
<b>when that code changes meaning</b>, it may continue to compile, with a different meaning.
</p>
<p>
And that in and of itself is not even the biggest problem. That proposal
breaks an existing guarantee that all existing C++ has, or thought it has,
a guarantee that a sizable portion of our users rely on, some as a happy
accident (but such a happy accident makes them very happy), and some
as a deathly serious and intentional design choice:
</p>
<p>
<b>It breaks the guarantee that code that uses member function calls will <em>never</em> be subject to any of the complexity and woes of ADL</b>.
</p>
<p>
We do have proposals trying to put that guarantee into good use:
see <a href="https://open-std.org/JTC1/SC22/WG21/docs/papers/2023/p2855r0.html">P2855</a> for an attempt to tame the effects of ADL on how to write
Customization Points (and thing that aren't customizations, but just
opt-ins) for P2300. The UFCS proposal makes approaches like that no
longer work.
</p>
<p>Or, as another phrasing: it doesn't break the well-formedness of code. Even when it should. It doesn't break short-term "programming", as defined as
a notion by a certain Dr. Winters.</p>
<p>But it breaks <b>"engineering"</b>, as defined by the same person.</p>
<p>Because it breaks the meaning of code across refactorings, possibly
silently accepting a program that wasn't intended to be accepted,
giving it a meaning that is not at all what a programmer intended.</p>
<h2>Some remarks on the history of UFCS</h2>
<p>
As far as I understood, multiple members of WG21 opposed UFCS
for this sort of reasons. I doubt they expressed the rationale
sufficiently crisply. I vividly recall "it breaks future code"
being stated and to some extent explained.. ..but I do not recall
it being clearly pointed out that we're breaking a major
and massively significant design guarantee, or that the future
breakages are far worse than they might immediately appear,
because they may be <em>silent</em>.
</p>
<h2>Some breakage examples</h2>
<h3>A rename of a member function</h3>
<p>I trust most of us have done this sort of a refactoring a thousand
times.You simply have</p>
<p>
<pre><blockquote><code>
struct Foo {
void snap();
};</code></blockquote></pre>
</p>
<p>and then calls to that in any which context,</p>
<p>
<pre><blockquote><code>
template <class T>
void do_snap(T&& f) {
f.snap();
}</code></blockquote></pre>
</p>
<p>you then decide to rename snap() to slap().</p>
<p>Fine, you rename it in the library that declares and defines Foo.
And then you rebuild the users, and if any one of them is wrong, the
compiler will tell you. This is literally the bread-and-butter of a
statically type-checked language, you can do this at arbitrary scales.
</p>
<p>Right, now some of you might (or more likely will) say "just use an IDE".
The problem is that that's not a sufficient answer. You won't always
have all the declarations/definitions and uses in the same project
for refactoring browsers to find them all. I can, for example, safely
assure you that I won't have all of Qt loaded in my IDE when I want
to refactor something in Qt Core, nor will any of you have all the
applications loaded in your IDE when you refactor one of your libraries.</p>
<p>Right, now, the problem; with the proposed UFCS, you might not get
an error that snap() was renamed to slap(). It might continue to
compile, finding a free function, arbitrarily far away in some
code you never wrote, brought in by any of the many associated namespaces
that might have that effect. In other words, The Woes of ADL.
</p>
<p>
And then it's anyone's guess whether you ever intended a function
like that to be called by your code. For a lot of code, it's today
safe to assume that you certainly didn't, because you used a call
syntax that's guaranteed not to go looking for an ADL overload set.
</p>
<h3>A removal of a member function, or never writing one</h3>
<p>I think this example has been brought up before, but I want
to reiterate it regardless. Somewhat like before, with
some small changes, we have</p>
<p>
<pre><blockquote><code>
struct Foo {
void slap();
};</code></blockquote></pre>
</p>
<p>and then calls to that in any which context,</p>
<p>
<pre><blockquote><code>
template <class T>
void slap(T&& f) {
f.slap();
}</code></blockquote></pre>
</p>
<p>
Of course, those with keen eyes will notice that we have a free function
template slap() which calls a member slap(). Perfectly reasonable, perfectly
fine, not worrisome by any means.
</p>
<p>
Let's add a class:</p>
<p>
<pre><blockquote><code>
struct Bar {
};</code></blockquote></pre>
</p>
<p>Okay. You have existing calls of the free function template slap() with
an argument of type Foo. Now add some with an argument of type Bar,
or just add calls of some template that ends up calling the free
function template slap()
with a Bar argument, when it already called slap() with a Foo argument.
</p>
<p>The added call, via whichever layers and forwarders, is infinitely
recursive. The same thing happens if you have an existing member
slap() and you for whatever reason remove it.</p>
<p><b>If that doesn't scare the living daylights out of you, I wonder what will.</b>
<p>We have here some refactoring that seem perfectly reasonable, and the first part is actually no refactoring at all, it's just a new type, and you're going
to introduce it into the system. Write the class, start using it. See
if the code using it compiles.</p>
<p>And then it silently compiles and explodes your program. Sure, fine,
unit tests, release tests, smoke tests, all of that - but the problem
is hideously difficult to grasp for a non-expert, and it's a terrible
prospect to have a problem like that sneak past any of those testing rounds.
</p>
<h3>Changing the type of a function parameter</h3>
<p>Next, we have a class that has an innocent free()
function, and we call it in a function that takes
a reference to our class type:
<p>
<pre><blockquote><code>
struct Kraken {
void free();
};</code></blockquote></pre>
</p>
<p>
<pre><blockquote><code>
void release(Kraken& k) {
k.free();
}</code></blockquote></pre>
</p>
<p>We then change the processing function to take the parameter by
pointer:</p>
<p>
<pre><blockquote><code>
void release(Kraken* k) {
if (k)
k.free();
}</code></blockquote></pre>
</p>
<p>
We forgot to change the call from <code>k.free()</code> to
<code>k->free()</code>. The language won't point out this little
problem, and will be looking for a non-member free() function to call instead.
</p>
<p>A variant of this theme is changing</p>
<p>
<pre><blockquote><code>
void cleanup(X& x) {
x.close();
}</code></blockquote></pre>
</p>
to
<p>
<pre><blockquote><code>
void cleanup(int x) {
x.close();
}</code></blockquote></pre>
</p>
<p>These last two examples end up deallocating memory and closing a file
handle, and it's unlikely that was the intent of the code.</p>
<h3>Making a member inaccessible, or deleting a member</h3>
<p>
We end with the same result as removing a member if we tighten a member's
access or =delete it.
</p>
<p>We have relatively consistently avoided changing the meaning of
code on such access changes.</p>
<p>And for a deleted member, that's so opposite to how deleted members
should work that it's no longer funny - we have a perfect match,
lookup finds it, overload resolution resolves to it - the call should be ill-formed, not dispatch
to something else. This is just fundamentally unlike how a deleted
overload is supposed to behave.
</p>
<h3>None of these problems are specific to generic code that uses templates</h3>
<p>Sure, I used templated callers in those examples. But the problems
may arise without any templates. The use of templates there is a subtle
hint; template parameters bring in additional associated namespaces.
For those familiar with the woes of ADL, this isn't new, that's just
more namespaces that ADL may look into to successfully find a non-member
function to call with your member call syntax, once UFCS enters the
building.</p>
<h2>A non-recap on the woes of ADL</h2>
<p>I'm actually not going deep into the woes of ADL. Whether accidentally
or deliberately, many users completely and categorically avoid its
complexity by using straight-up OOP code with member functions and
their call syntax thereof. It's wildly and widely successful; it's
easily comprehensible to millions of programmers to explain "here's
the type of your thing, here's its data, here are the functions
that manipulate it, all in one cohesive package". And as a bonus, none of the long-distance
written-by-someone-else functions mess with any of that.
</p>
<p>The proposal for making member-syntax calls reach into non-member
functions breaks that simplicity and comprehensibility.
That's not a service to our
users, that's a devastating and horrible disservice to our users.
Please don't inflict this horror upon them.
</p>
<h2>Some responses to the Motivation parts of the UFCS proposal</h2>
<h3>"Enable generic code"</h3>
<p>Sure, UFCS would allow writing multi-meaning generic code that
works the same way with both member functions and non-member functions.
But it also dilutes the ability to reason about such code. Considering
what ADL can do, and what functions you end up calling, that genericity
isn't necessarily what you want, so it's much harder to reason about
the meaning of code.
</p>
<p>Yeah, we could introduce new syntaxes for "member-only" or "free function only" calls, but we already have those meanings, and they are in widespread
use. The ability to call members and only members got there first. Don't break it.</p>
<h3>"Enable encapsulated code: Nonmember nonfriends increase
encapsulation, lower coupling"</h3>
<p>The reasons stated in this one are all technically lofty and good.. but they are also
hard to understand for users accustomed to how member functions work
everywhere, and how to design and program with simple classes. They
don't just get a benefit out of this, they get increased complexity
where none was appropriate. So despite all the technical wins here,
I disagree on this one being a pure win, it has downsides as well.</p>
<p>It's also not like tightly-coupled member functions having unrestricted
access to all of your data members are always an encapsulation problem.
In some over-simplified examples it might seem that way, but in practice,
many of those data members are themselves of class type, so there is
encapsulation in the code even if you don't ever use a single non-member
function.</p>
<h3>"It's friendlier to programmers: Discoverability"</h3>
<p>Well, first of all, I disagree with this sort of UFCS being friendly
to anyone, expert or non-expert, considering that it removes a fundamental
guarantee and replaces it with something that you can't reason about at all,
let alone reason about it simply.</p>
<p>The additional specific part with which I disagree with is the code completion
discoverability, and "to programmers to discover what they can do next in their code". It's already sometimes a problem to show umpteen completion
choices for "what they can do next", adding more is not a pure improvement.
And in various situations, showing more completions that allow you
to invoke functions you're actually never supposed to use for your
particular problem isn't an improvement.</p>
<h2>Summa summarum</h2>
<p>Enabling a multi-meaning for a member function call to also find and call
non-member calls without changing the syntax of those calls is a terrible
idea at such a level that words fail me. WG21 should not do this disservice
to its users. If we want to introduce such a multi-meaning facility,
we <b>have to</b> find a new syntax for it, despite how icky that might
seem to some. Otherwise we're hoping for a magical improvement, but
that magic ends up being the thing that brings forth non-improvements
that make those improvements void.</p>
<p>
The way I see this whole discussion is that we want to enable terse
and simple and uniform code. While that's a lofty goal, and could
theoretically serve some non-experts well in addition to experts,
it ends up not doing so when it introduces hideous complexity
in places that were previously guaranteed to not have such complexity.</p>
<p>We would certainly serve experts. We would probably serve some non-experts.
But at the same time, we expose many programmers to a possibly huge amount
of added complexity just dropped into their code, whether existing or new,
with completely
unpredictable and hard-to-reason results. Except some such results can be exemplified
to be seriously undesirable, and have the most horrible change of meaning,
like turning compile-time errors into run-time errors, including logic
errors, infinite loops and crashes.</p>
<p>The overall benefit/cost ratio isn't there. That ratio is terrible. Don't do it.</p>
</body>
</html>