-
Notifications
You must be signed in to change notification settings - Fork 102
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Re-organise @benchmark
display
#260
base: main
Are you sure you want to change the base?
Conversation
Glancing at the code, I feel like there may be a few too many commented out lines/blocks, however, nothing else jumps out at me 🙂. I do have more feedback on the style though. I'll just go through the dot points you've raised here and then bring up one or two things from your last commit in #236.
I've said it before, but I really like this. I do wonder if
I'm not sure that this is a good idea. In
I think some rounding is a very good idea. On that note, I wonder of important/useful the
This has been discussed before, and is sensible.
Hmmm, can't say I'm a fan of this. I see from the original issue you say you don't like Now, a few things from your comments in the issue.
To me, I find the jagged line length more offputting than the lack of breathing space. I am still in favour of reordering the lines as previously mentioned.
We've already got the least informative bit of information on the first line
The difference in length with the numbers is only around 2-3 characters, so you can still make the length somewhat closer. Furthermore, as you're already constructing strings from the values you can just take the length of those strings. That is in fact what currently occurs with |
The code is a mess BTW, sorry. Lots to tidy up.
I don't love this, maybe there's a better way. I hoped there would be a self-evident symbol but the best I could think of is ㊿, and that seems not to print as a single-width character. (For the mean,
I'm pretty sure this is garbage. I presume the reason to print 3 figures after the
Yes, I can see the logic of how you got there. But it jars every time for me; the larger values are above, not to the right. And even in text If not In traditional graphs, this is often marked with a symbol for breaking the axis. That might be another way, rather than labelling the last bar.
This is good, and I agree that coding zero only by the colour would be too subtle. But until I read the source I wasn't sure that the smallest bar meant exactly zero, not just rounded down below the next-biggest bar. My hope is that this will be easier to infer with the difference of shade. (Especially with the log scale, although less of a concern if that's harder to run into.) Maybe you considered this, but printing
Sure. It would be easy to make it fit the first line exactly. The reason I showed 3 together above, though, is that I think comparing many is common, and that it looks strange to me to have the graphs change size as you go down the page. It could be fixed and also match the first line, if the times were printed (say) always with 4 numbers, I guess all the text is ragged-right here, nothing else is justified, and in this context a fixed length looks OK to me. And once we spend the lines on the graph, we may as well not squeeze it too much horizontally. One reason to like ragged-right is that I think it's less damaged by word wrapping. In 50 columns: Edit: At the width shown in #217 (comment), trying |
I was talking about the positioning of the
This only makes sense when you arrange the values in a column and want to line up the decimal places. This already isn't happening, so why not just go with three significant figures?
I think we've both said all that's worth saying on this.
Hmm, I still don't see
If you do significant figures instead of decimal places, then the length of the longest line will be much more consistent, and I think you could just set a close length here and be no more than a char or two off.
I'm not referring the the ragged-right alightment. That's never been in question. I do content though that visually:
is not as nice as,
For a few reasons:
|
Sure. I meant that needing to provide a key at all seems a bit awkward, I think we agree the first line would look better without. We can try before/after etc, but I meant first to wonder whether there might be bigger changes to avoid that entirely. Maybe if it marked all the quartiles, then 1/4 1/2 3/4 would be obvious without a key? Something like this (drawing, not data!)
Symbols like
|
You could do |
1 similar comment
You could do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this is great. A few specifics:
- Personally (I know others disagree), I really like the use of gray to indicate bins with 0 counts
- I like the idea of ꒛ for a broken axis; I would be sure to have one axis-line on either side so it stands out, and make sure the right edge value is clearly associated with the pre-break value.
- I worry a bit about the jagged linear/log scale of rounding to 2 digits; have you generated synthetic benchmark results and see if you're happy with the centration of the histogram in all cases?
- I like printing the actual trial details under the axis
Good point. One consequence of this rounding is that it sometimes demands a minimum width of 10%, but not always. Made-up cases with 2 samples, differing by 1% -- perhaps these should look more similar? I wonder a bit whether the graph should have a minimum width in all cases, perhaps even 10%. Useful graphs tend to be much wider, so perhaps seeing bars clumped together would be a reminder not to try too hard to interpret noise. |
I like the idea of a minimum 10% width. I bet going to 3 sigfigs would also sidestep the problem pretty well. (Yes, it will always be possible to find cases that do weird things with any number of sigfigs, but even with 3 I bet benchmarking noise would typically nix the oddities.) |
Also you might want to test against #261 |
Experimenting with Also, maybe the right way to lower-bound the width at roughly 10% is to demand that the limits are outside mean+-5% (or maybe 3%). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this now does what I planned.
Can I persuade any heavy users to try living on this branch for daily use, and see what they love or hate, or if they find further bugs? Paging @chriselrod
idea of ꒛ for a broken axis; I would be sure to have one axis-line on either side so it stands out, and make sure the right edge value is clearly associated with the pre-break value.
This I meant to play with a bit more. Just squeezing ꒛
into the existing gap looked awful, but there might be nice designs nearby.
pos = searchsortedlast(sorteddata, min + i * Δ) | ||
bins[i] = pos - lastpos | ||
lastpos = pos | ||
function histogram_bindata(data, fences::AbstractRange) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I re-wrote this, as it seemed the shortest path to making all the edge cases I invented work out (e.g. always highlight the correct median bar). It will be slower but no longer needs to sort the times, so I think in the end the whole thing comes out a similar speed.
function hist_round_low(times, lo=minimum(times), av=mean(times)) | ||
av < 0.1 && return 0.0 # stop at 0, not 0.01 ns, in trivial cases | ||
raw = min(lo, av / 1.03) # demand low edge 3% from mean, or further | ||
return round(raw, RoundDown; sigdigits = 2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left edge of the histogram is 0 for trivial cases like @benchmark 1+1
, minor tidying-up. These will also have a right edge of 1 ns
(the right edge is simply never less than 1ns).
av / 1.03
guarantees at least 3% below the mean. A similar condition on the right edge means the width is seldom below 10%. Thus benchmarks with very small variation will tend to sit in the middle. I'm not 100% sure that's a good idea, perhaps real use will tell?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After using this a bit, I do think seeing a scatter of bars in the centre is a sign that your whole distribution is very narrow, and hence you need not pay too much attention to the histogram.
Here's the example from #258 , old-vs-new. I guess my claim is that (with practice, perhaps) the fact that the first distribution is much narrower than the second here jumps out at you.
Notice also that, with multiple trials, it's easier to tell which ones allocate.
avgcolor = :green | ||
medcolor = :blue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same colours for the two bars as before. I think not changing them without good reason is a good idea for not confusing existing users. On some dark themes :blue
is quite dark. On other dark themes, :light_blue
is white :/
Would be easy to make these customisable, but not in this PR.
# printstyled(io, "½", color=medcolor) | ||
printstyled(io, "◑", color=medcolor) | ||
elseif i == q25pos | ||
# Quartile markers exist partly to explain the median marker, without needing a legend |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the quartile markers too much clutter? I'm not sure they are very useful.
The idea was to make the median marker self-explanatory. "quarter == quartile" isn't even a pun, right?
They are light grey -- if you have colours visible, then the colours should make obvious the connection between ◑
and median
etc. It's when you don't that the symbols may be confusing. I guess you could do printstyled("◔", hidden=true)
to hide them at the terminal (with colour) and show them when copy-pasted (without).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After ccb2bc3 the quartile markers are now hidden, on Julia 1.7 and later. On 1.6 this option doesn't exist (or would require messing with terminal codes).
Maybe it's too weird to do two things, effectively. It does look less cluttered without them.
if i == avgpos | ||
printstyled(io, bar; color=avgcolor) | ||
# If mean & median bars co-incide, colour the mean. Matches markers above. | ||
elseif i == medpos | ||
printstyled(io, bar; color=medcolor) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When bars co-incide, this gives preference to the mean: green bar and no blue bar.
However, for the markers, at present the blue median marker will be offset slightly. Making it go missing seemed weird to me. But not sure this is the right decision. Maybe the highlighted bar should move over too, in such cases? Is moving the median better than moving the mean?
The quartile markers will simply go missing if they coincide with any others.
# mean == median, three bars | ||
Trial(Parameters(), [3,4,5], [0,0,0], 0, 0) | ||
|
||
# three bars, including mean not median | ||
Trial(Parameters(), pi * [1,3,4,4], [0,0,0,100], 1, 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These examples previously highlighted the wrong bar as median. This was computed by hand based on the sorted times, which I think worked fine in most cases. Now it uses searchsortedlast(fences, t)
with t = median(times)
, the exact same method as used for filling in the bars themselves.
Could try a bit harder to write automatic tests for these, maybe. I believe tests above should catch obvious things, like changes which introduce off-by-1 errors in marker positions.
_nonhistwidth = 5 + length(boxspace) | ||
_minhistwidth = 18 + length(caption) | ||
histwidth = max(_minhistwidth, min(90, displaysize(io)[2]) - _nonhistwidth) | ||
# This should fit it within your terminal, but stops growing at 90 columns. Below about | ||
# 55 columns it will stop shrinking, by which point the first line has already wrapped. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This auto-sizes the histogram to fit your window, as suggested here #260 (comment) , but stops at 90 characters.
I think it works OK, but is it a good idea? That results in quite a big histogram, maybe it should stay smaller? Or maybe it should fill arbitrarily wide windows?
Notice also that it avoids printing trailing spaces in the histogram etc, which makes the display when word-wrapped a bit less broken up. The line on the left (like @info
) also makes a bit clearer what has been line-wrapped.
Sorry, I only just noticed this request. I am pretty swamped with teaching and probably can't use this much, but I checked out the branch and will start using it when the occasion arises. But especially given the delay, it's worth commenting that I like what I'm seeing above! |
For the next week or so I'm terribly busy, but I'll try to make some time to have another look at this within the next few weeks. |
@tecosaur bump? |
214d95d
to
6cb572c
Compare
Linus Torvalds on this: https://lkml.org/lkml/2020/5/29/1038. Limit of 80 characters was set when screens were 4:3 @ 720 × 480, I don't know anyone who programs these days have a 4:3 screen, certainly not the 99% of users of Julia, out of everything... I hope this PR gets merged if there are no major issues |
My 2c on column widths: It's not about my screen width, because I don't tend to have code files or repls take up the entire width of my screen. Splitting into 2/3 columns is often preferable. |
With 2/3 columns this will happily adjust down. With a huge screen, you won't have to turn your head. Sorry have not got around to typing a reply to all this. Tried above to seed threaded discussions on separate points of possible debate, to be more bite-sized. |
Adjusting down seems like a good solution to me, and is not something I tested thoroughly (if it works, I'm happy with that). |
My suggestion would be to stay comfortably <108 characters, because that is the most Github will show, even though it may show much less (as few as 50 with side-by-side view):
|
#260 (comment) is the explicit demonstration, in 55 columns. Care has been taken that line wrapping will be minimally disruptive on very narrow screens, like pasting into slack, even if initially run on a wider terminal. As explained there, it never expands beyond about 90, for which my reference is perhaps https://github.com/invenia/BlueStyle#synopsis (but, edit, thanks for the github 108 number). |
My complaint remains that looking at one screenshot is pretty misleading. I think you need to use the hammer to find out what's good about it, not look at the picture in the catalogue. But failing that, I tried to show off various aspects in a variety of screenshots above. Besides the scaling issue, what you miss is that the GC time is not printed when there is no GC time. Partly so as not to print "non-data ink", and partly to sharpen the difference between allocating and non-allocating versions. And more... when I have time. |
There are indeed variants that look slightly different, however I don't think a barrage of screenshots with slight differences is actually that helpful, so I went with a "maximal" (showing all elements) version. Particularly as the hiding behaviour remains unaffected — I've just tweaked order and styling (which is most apparent in the "maximal" version, that I used), I think this is a valid comparison for the changes I have made. |
bump, I still really want this to happen |
What needs to be done to resolve this? Is anyone waiting on a review or does it just lack for some free time to finish it off? If it's quibbling about details and everyone agrees this is an improvement over the status quo, I'd propose it's time to get this merged and then we can polish further in future PRs. |
There are currently merge conflicts, and unresolved discussion points. I do have the code for my (B) compromise/blend version from #260 (comment) which perhaps we'd all be happy with as an improvement on the current? |
Sorry I have not got around to writing up a reply to review points. I do think most have sensible answers, will try soon. |
still want this, bump |
I could tidy up the commit history with what I have, merge that, and let McAbbot come back to this and discuss further changes when their time allows. What do other people think of that? |
bump, do we still need tests for this? |
@mcabbott what is the state of this PR? |
This proposes to re-organise
show
for@benchmark
. The idea is:@benchmark
#236The histogram is tweaked as follows:
IOContext(stdout, :logbins => true)
(as before)+
not<
.To test on an example I didn't invent, here is the case from #249 / this post. Big screenshots but IMO it's useful to look at a few of them together, since surely you are comparing things:
Notice that in the 3rd case above, the mean and median co-incide, so the median marker is moved one to the left. A small lie, but omitting it seems like a puzzle. Note also that no GC times are printed because they are all zero.
Plain text:
Before (not exactly the same data). Notice the switch to log scale in the 2nd case. With the same size window, fits 4 less lines here.
After, on a dark theme (not the exact same data):
and on iTerm's version of Solarized Dark, with no minimum contrast:
As we know from stack traces, the
:light_black
has been set equal to the background. Here that doesn't seem like a huge problem.But also, this them sets
:light_blue
the same as the foreground, which is weird. I made the median bar that colour since I thought:blue
was nearly invisible in some dark themes. But perhaps it should change back.