pgf layers do not get cleared #2412

Open
xworld21 opened this issue Sep 5, 2024 · 11 comments · May be fixed by #2480

Comments

Contributor

xworld21 commented Sep 5, 2024

A baffling scope bug, seemingly related to some issues reported on ar5iv. I spotted this in the atoms-and-orbitals test: the content of pgf layers (created via pgfonlayer) does not clear when closing a group.

\documentclass{article}
\usepackage{tikz}
\begin{document}
\begin{tikzpicture}
    \pgfdeclarelayer{background}
    \pgfsetlayers{background,main}
    \draw[red] (-1,0) -- (1,0);
    \begin{pgfonlayer}{background}
    \filldraw (0,0) circle (4pt);
    \end{pgfonlayer}
\end{tikzpicture}
\begin{tikzpicture}
    \pgfdeclarelayer{background}
    \pgfsetlayers{background,main}
    \draw[red] (-1,0) -- (1,0);
\end{tikzpicture}
\end{document}

renders as

[image]

The black circle is still in the background layer of the second picture, even though it is a new tikzpicture.

@dginev dginev added this to the LaTeXML-0.8.9 milestone Sep 5, 2024
Contributor Author

xworld21 commented Jan 4, 2025

Found the reason! This is the actual bug:

\documentclass{article}
\newbox\myboxA
\newbox\myboxB
\begin{document}

\setbox\myboxA\vbox{A must appear once.}
\setbox\myboxB\vbox{B must appear once.}

\box\myboxA

\box\myboxA

\begingroup
  \box\myboxB
\endgroup

\box\myboxB

\end{document}

The PDF will show two boxes; LaTeXML will show three. This is because LaTeXML's \box simply clears the reference to the box, and that change is local to the group. Clearing the reference globally fixes this issue, but is also completely wrong: somehow the whatsit itself must be emptied out in place, or, say, marked as used, to get the correct semantics.
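
A way to check this without comparing rendered output might be to ask whether the register is void once the group has closed. This is only a minimal sketch, not an existing LaTeXML test, and the printed messages are illustrative.

```latex
\documentclass{article}
\newbox\myboxB
\begin{document}
\setbox\myboxB\vbox{B must appear once.}

\begingroup
  \box\myboxB % use the box inside a group
\endgroup

% In TeX the register stays void after the group, so the first branch is taken.
\ifvoid\myboxB
  Register is void after the group.
\else
  Register still holds the box after the group.
\fi
\end{document}
```

An implementation that only clears the reference locally would presumably end up in the second branch.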

Contributor Author

xworld21 commented Jan 4, 2025

Another facet of the same bug: changing the color of a copy of a box changes the color of all the boxes, which has a funny time-travelling effect.

\copy\myboxA

{\color{red}\copy\myboxA}

\copy\myboxA

All three boxes will be red! It looks like \copy must do a proper copy – a shallow clone is not enough (I just tried).

Collaborator

dginev commented Jan 4, 2025

Really great to have this diagnostic, thank you! We should convert these snippets into new tests.

The TeXbook mentions that \box always acts globally here:

Box registers are local to groups just as arithmetic registers are. But there’s a
big difference between box registers and all the rest: When you use a \box, it
loses its value.

This is on page 119, in Chapter 15.

And indeed, \copy may want to do a deep clone of the object it looks up; currently it only copies a pointer to the same object stored in the state table.
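
For a future test, something along these lines might capture the requirement that \copy yields an independent box. It is only a sketch: \myboxD is a made-up register name, and it compares widths rather than colors.

```latex
\documentclass{article}
\newbox\myboxA
\newbox\myboxD % hypothetical second register, for illustration only
\begin{document}
\setbox\myboxA\hbox{A}
\setbox\myboxD\copy\myboxA            % D receives a copy; A keeps its box
\setbox\myboxD\hbox{\unhbox\myboxD B} % extend the copy only

\ifvoid\myboxA A is void.\else A still holds a box.\fi

% If the copy is independent, the two widths differ.
Width of A: \the\wd\myboxA; width of D: \the\wd\myboxD

\box\myboxA

\box\myboxD
\end{document}
```

In TeX, \copy leaves the source register untouched, so A should still hold a box and keep its original width.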

Contributor Author

xworld21 commented Jan 4, 2025

> The TeXbook mentions that \box always acts globally here:

It acts globally on the box, but not on the register, e.g.

\documentclass{article}
\newbox\myboxC
\begin{document}
\setbox\myboxC\vbox{C must appear twice.}
\begingroup
  \setbox\myboxC\vbox{C must appear twice.}
  \box\myboxC
\endgroup
\box\myboxC
\end{document}

should print two boxes. If you assign the register globally, you'll only get one.
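
For comparison, here is a sketch of the global variant described in the last sentence. The only changes are the \global prefix on the inner \setbox and the box text.

```latex
\documentclass{article}
\newbox\myboxC
\begin{document}
\setbox\myboxC\vbox{C appears only once in this variant.}
\begingroup
  % The global assignment replaces the value at every level.
  \global\setbox\myboxC\vbox{C appears only once in this variant.}
  \box\myboxC % prints the box and voids the register
\endgroup
\box\myboxC % void here, so nothing is printed
\end{document}
```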

Collaborator

dginev commented Jan 4, 2025

Ah I see, nice example. At least I would need a number of new tests here to make sure I'm not missing nuances.

But the TeXbook is consistently discussing "box registers" and "voiding" box registers, rather than individual boxes (be they horizontal or vertical).

Here is some more text on that, from page 120:

A box register is either “void” or it contains an hbox or a vbox. There is a
difference between a void register and one that contains an empty box whose
height, width, and depth are zero; for example, if \box3 is void, you can say \unhbox3
or \unvbox3 or \unhcopy3 or \unvcopy3, but if \box3 is equal to \hbox{} you can say
only \unhbox3 or \unhcopy3. If you say \global\setbox3=<box>, register \box3 will
become “globally void” when it is subsequently used or unboxed.

Exercise 15.11 touches on how magical that is (again for \global purposes):

And what’s in \box3 after {\global\setbox3=\hbox{A}\setbox3=\hbox{}} ?
Answer: \hbox{A}. But after {\global\setbox3=\hbox{A}\setbox3=\box3}, \box3
will be void.
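
The two cases of the exercise could be turned into a small check along these lines. This is a sketch only; it uses \ifvoid to report the state of \box3 after each group, and the messages are illustrative.

```latex
\documentclass{article}
\begin{document}
% First case: per the TeXbook answer, \box3 holds \hbox{A} afterwards.
{\global\setbox3=\hbox{A}\setbox3=\hbox{}}
\ifvoid3 First case: void.\else First case: not void.\fi

% Second case: per the TeXbook answer, \box3 is void afterwards.
{\global\setbox3=\hbox{A}\setbox3=\box3}
\ifvoid3 Second case: void.\else Second case: not void.\fi
\end{document}
```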

So I am wondering if this also doesn't concern the exact scoping mechanics of the \setbox assignments... Maybe \box expires the value of the "box register" at the level where it was assigned, rather than at the exact current group level? More to learn for me.

Contributor Author

xworld21 commented Jan 5, 2025

> So I am wondering if this also doesn't concern the exact scoping mechanics of the \setbox assignments... Maybe \box expires the value of the "box register" at the level where it was assigned, rather than at the exact current group level?

In C/Perl/Rust parlance, I think it boils down to the register being a reference to the box rather than the box, with \setbox acting on the reference (which is local to the scope), and all the other commands on the box (which is in the global heap). Thus whenever you consume a box, all references to that box become void, but whether \box0 refers to that box or another one is still decided by the scope you are in.

Knuth says on page 119 that this is a consequence of the lower-level implementation: 'TeX [makes boxes void] for efficiency, since it is desirable to avoid copying the contents of potentially large boxes'. Note also how he says 'When you use a \box, it loses its value' (emphasis mine). This is about references vs values, even if he doesn't say so explicitly.

@dginev dginev linked a pull request Jan 8, 2025 that will close this issue
Collaborator

dginev commented Jan 8, 2025

> Thus whenever you consume a box, all references to that box become void

I am pretty sure \box voids the register assignment, not the box itself -- if it voided the box, it wouldn't appear on screen. And the TeXbook only discusses "box registers" in this context from what I'm seeing.

I made a PR that addresses the limitation in LaTeXML: the AssignValue($name, undef) trick only works as expected if it is called in the exact same local frame where the assignment was made. When the assignment happens in a higher frame, the undef "shadows" it locally, but once the TeX group is completed, the original will be back.

I think we were simply missing a more precise method to unbind a single frame's assignments from the undo table.

Contributor Author

xworld21 commented Jan 8, 2025

> I am pretty sure \box voids the register assignment, not the box itself -- if it voided the box, it wouldn't appear on screen. And the TeXbook only discusses "box registers" in this context from what I'm seeing.

I haven't been able to fully decipher the web sources, but I suspect we are both slightly wrong. The \box primitive voids the register locally, unless I completely misunderstand how the equivalence tables work. Then in many cases the box itself gets deallocated in ship_out, which creates the behaviour we are discussing – the box doesn't exist anymore.

To test this behaviour, we need a situation in which \box is not shipped in the group; I bet that the box will suddenly become available again in the upper level until something else ships the box.
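
One possible setup, as a sketch: move the box into a scratch register inside the group, so that nothing is typeset there. \myboxE is a hypothetical register name and the messages are illustrative.

```latex
\documentclass{article}
\newbox\myboxE % hypothetical register, for illustration only
\begin{document}
\setbox\myboxE\hbox{E}
\begingroup
  \setbox0\box\myboxE % consumes \myboxE without typesetting it
\endgroup
% The local assignment to \box0 is undone here; \myboxE is what we inspect.
\ifvoid\myboxE
  Register is void at the upper level.
\else
  Register is available again at the upper level.
\fi
\end{document}
```

If \box voids the register in place rather than through a saved local assignment, the first branch is the one to expect.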

Collaborator

dginev commented Jan 8, 2025

> I suspect we are both slightly wrong.

Quite possibly.

Since you mention the web sources, I was also consulting them without much success, and actually with some frustration, since there is a comment that is almost useful, but only almost.

In:

procedure begin_box(@!box_context:integer);
label exit, done;
var @!p,@!q:pointer; {run through the current list}
@!m:quarterword; {the length of a replacement list}
@!k:halfword; {0 or |vmode| or |hmode|}
@!n:eight_bits; {a box number}
begin case cur_chr of
box_code: begin scan_eight_bit_int; cur_box:=box(cur_val);
  box(cur_val):=null; {the box becomes void, at the same level}
  end;
copy_code: begin scan_eight_bit_int; cur_box:=copy_node_list(box(cur_val));
  end;
last_box_code: @<If the current list ends with a box node, delete it from
  the list and make |cur_box| point to it; otherwise set |cur_box:=null|@>;
vsplit_code: @<Split off part of a vertical box, make |cur_box| point to it@>;
othercases @<Initiate the construction of an hbox or vbox, then |return|@>
endcases;@/
box_end(box_context); {in simple cases, we use the box immediately}
exit:end;

The text "the box becomes void, at the same level" should really be the clear answer to our conundrum.

But cur_val is contextually implied, and the number it holds has baked-in structure that isn't easy to penetrate from this snippet alone. What we can see is that cur_box continues to hold the actual box contents for some extra time, while the "box register" in box(cur_val) gets voided.

Contributor Author

xworld21 commented Jan 8, 2025

> The text "the box becomes void, at the same level" should really be the clear answer to our conundrum.

Indeed, same level as what? However, I just checked how \def is implemented, and I was wrong about the equivalence tables. \def is implemented by checking the current level and, if necessary, pushing the current definition onto a stack (see eq_define) before replacing it with a new one. In contrast, box(cur_val):=null replaces the current definition without pushing the previous one onto the stack, even if it comes from a previous level, and doesn't change the level of the definition either. So 'the same level' means 'the same level as the definition'. In LaTeXML terms, you undef the box all the way up to the frame where the definition was made. Does that make sense?
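
A sketch of what this reading implies for nested groups: the box is assigned one level up from where it is consumed, and an older value exists at the outer level. \myboxG is a made-up register name.

```latex
\documentclass{article}
\newbox\myboxG % hypothetical register, for illustration only
\begin{document}
\setbox\myboxG\hbox{outer}
\begingroup
  \setbox\myboxG\hbox{inner}% local assignment; the outer value is saved
  \begingroup
    \setbox0\box\myboxG % consumes "inner" two levels down
  \endgroup
  % Still void here: the voiding happened at the level of the assignment.
  \ifvoid\myboxG Void one level up.\else Not void one level up.\fi
\endgroup

% The saved outer value comes back when the assigning group closes.
\ifvoid\myboxG Void at the outer level.\else Outer value restored.\fi
\end{document}
```

Under this reading, TeX reports the register as void one level up and restores the outer value at the outer level.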

Collaborator

dginev commented Jan 8, 2025

@xworld21 great! That is exactly how I had interpreted it, and what #2480 attempts.
