Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tal:content and tal:replace are sometimes treated as to be translated #876

Closed
viktordick opened this issue Jul 20, 2020 · 33 comments
Closed
Labels

Comments

@viktordick
Copy link
Contributor

viktordick commented Jul 20, 2020

I originally posted this in malthe/chameleon#326 but it was closed there as wontfix with the hint

To work around this issue, make sure expression is coerced to a string or unicode in _compile_zt_expr.
That said, the approach taken is quite different from how Chameleon was first imagined. The idea with expression compilers is that they generate code at compile time, not at render time (even if it is cached). I would imagine that quite a bit of performance is lost due to the extra function calls and scope changes.

We have seen some weird behavior when switching to Zope4 that seems to be related to the Chameleon cache.

We use a translation service that effectively delegates translation to a Python Script in the context of the page template in question. But for the sake of providing an MWE, I stripped it down to something that simply returns a fixed text whenever something is to be translated. You can find it here.

If using a page template like

<p i18n:translate="">original text</p>

This correctly replaces the content of the p tag by translated.

However, sometimes it also replaces the content of

<p tal:content="string:original text">something</p>

and the same with tal:replace. By trying to nail down the problem, I found that this happens if the Chameleon cache file for the given page template has been created by the currently running Zope instance. So if I create a new page template, the behavior can be observed. If I restart Zope, it is gone. If I stop Zope, clear the Chameleon cache and start it again, it can again be observed.

Trying to debug the issue, I found myself inside the Chameleon cache file in a code block like

    def __quote(target, quote, quote_entity, default, default_marker):
        __tt = type(target)
        if ((__tt is int) or (__tt is float) or (__tt is int)):
            target = str(target)
        else:
            if (__tt is bytes):
                target = decode(target)
            else:
                if (__tt is not str):
                    try:
                        target = target.__html__
                    except:
                        __converted = convert(target)
                        target = (str(target) if (target is __converted) else __converted)
                    else:
                        return target()
            if (target is not None):
                ...

If we are in the restarted run (where tal:replace does not get translated), target is of type str (with content original text) and we land in the last branch of the evaluation, where the string is being escaped.

However, if the Chameleon cache was cleared before starting Zope, target is of type <class 'chameleon.tokenize.Token'>. Since such a Token does not have the __html__ attribute, convert(target) is called, which somehow ends up calling the translation service.

There is one thing most probably broken here and one thing that I maybe simply am not using correctly: One is that the type of target changes depending on if there was already a Chameleon cache present when Zope was started, leading to different behaviors. The second is that the econtext given to the render function contains the fields

{'__convert': functools.partial(<function PageTemplate.render.<locals>.translate at 0x7fb5b35e5af0>, target_language=None),
 '__decode': <function PageTemplate.render.<locals>.decode at 0x7fb5b35e5ca0>,
 '__on_error_handler': None,
 '__translate': <function PageTemplate.render.<locals>.translate at 0x7fb5b35e5af0>,
...

Not sure if convert should point to a translate function if there is a specific __translate field also which seems to be the correct one for calling tags with i18n:translate.

Steps to reproduce

python3 -m venv Zope4
cd Zope4
bin/pip install -r https://raw.githubusercontent.com/zopefoundation/Zope/4.5/requirements-full.txt
bin/pip install https://github.com/perfact/Products.PerFactTranslationDomain@dummy-translator
bin/mkwsgiinstance -d zope -u admin:admin
bin/runwsgi zope/etc/zope.ini

Open a browser at http://localhost:8080/manage_main and provide the credentials admin:admin. Now add a page template testpt with the content

<p tal:content="string:replaced text">original text</p>

and let it render (http://localhost:8080/testpt)

Expected behavior

We should obtain a paragraph with the content replaced text.

Actual behavior

On the first try, we obtain a paragraph with the content translated. If restarting Zope, we again obtain the expected behavior. If removing everything in zope/var/cache/ and restarting Zope, we again obtain translated.

Versions

As seen above, I took the newly released Zope 4.5 as starting point. It contains Chameleon 3.8.1

@d-maurer
Copy link
Contributor

d-maurer commented Jul 20, 2020 via email

@d-maurer
Copy link
Contributor

d-maurer commented Jul 20, 2020 via email

@viktordick
Copy link
Contributor Author

I also can not reproduce it with chameleon itself (I have come up with essentially the same example).

I described the minimal example within Zope in the original post. Maybe we are doing something wrong with our translation product - that code has been untouched for years and I don't have contact to the original author. I somehow made it work under Zope 4, but for example I do not understand why we are providing an IFallbackTranslationDomainFactory instead of an ITranslationDomainFactory - from reading the documentation, this should be the right way to go, but when I tried to switch, regular translations no longer worked.

@d-maurer
Copy link
Contributor

d-maurer commented Jul 20, 2020 via email

@d-maurer
Copy link
Contributor

d-maurer commented Jul 21, 2020 via email

@d-maurer
Copy link
Contributor

I asked @malthe for help to reproduce the problem ("malthe/chameleon#326 (comment)"). Apparently, even for the initial compilation, Token is not always used as expression type; I have not yet been able to get chameleon use this type.

@d-maurer d-maurer added the bug label Jul 21, 2020
@d-maurer
Copy link
Contributor

The problem occurs only with Python 3. A small problem reproducing example for Python 3 is

from Products.PageTemplates.PageTemplate import PageTemplate
from Products.PageTemplates.tests.util import useChameleonEngine
useChameleonEngine()
def translate(*args, **kw): return "translated"

from z3c.pt import pagetemplate; pagetemplate.fast_translate = translate

t = PageTemplate()
ts5 = u"""<p tal:replace="string:x">original text</p>"""
t.write(ts5)
t()

This results in translated, the correct result would be x.

@viktordick
Copy link
Contributor Author

@d-maurer I created a Zope installation from your branch as described in my original post. The problem is the same. I put a breakpoint into my translation function and this is the call stack:

  /usr/lib/python3.8/threading.py(890)_bootstrap()
-> self._bootstrap_inner()
  /usr/lib/python3.8/threading.py(932)_bootstrap_inner()
-> self.run()
  /usr/lib/python3.8/threading.py(870)run()
-> self._target(*self._args, **self._kwargs)
  /home/vdick/Zope4/lib/python3.8/site-packages/waitress/task.py(86)handler_thread()
-> task.service()
  /home/vdick/Zope4/lib/python3.8/site-packages/waitress/channel.py(350)service()
-> task.service()
  /home/vdick/Zope4/lib/python3.8/site-packages/waitress/task.py(171)service()
-> self.execute()
  /home/vdick/Zope4/lib/python3.8/site-packages/waitress/task.py(441)execute()
-> app_iter = self.channel.server.application(environ, start_response)
  /home/vdick/Zope4/src/zope/src/ZPublisher/httpexceptions.py(30)__call__()
-> return self.application(environ, start_response)
  /home/vdick/Zope4/lib/python3.8/site-packages/paste/translogger.py(69)__call__()
-> return self.application(environ, replacement_start_response)
  /home/vdick/Zope4/src/zope/src/ZPublisher/WSGIPublisher.py(359)publish_module()
-> response = _publish(request, new_mod_info)
  /home/vdick/Zope4/src/zope/src/ZPublisher/WSGIPublisher.py(254)publish()
-> result = mapply(obj,
  /home/vdick/Zope4/src/zope/src/ZPublisher/mapply.py(85)mapply()
-> return debug(object, args, context)
  /home/vdick/Zope4/src/zope/src/ZPublisher/WSGIPublisher.py(63)call_object()
-> return obj(*args)
  /home/vdick/Zope4/src/zope/src/Shared/DC/Scripts/Bindings.py(335)__call__()
-> return self._bindAndExec(args, kw, None)
  /home/vdick/Zope4/src/zope/src/Shared/DC/Scripts/Bindings.py(372)_bindAndExec()
-> return self._exec(bound_data, args, kw)
  /home/vdick/Zope4/src/zope/src/Products/PageTemplates/ZopePageTemplate.py(285)_exec()
-> result = self.pt_render(extra_context=bound_names)
  /home/vdick/Zope4/src/zope/src/Products/PageTemplates/ZopePageTemplate.py(374)pt_render()
-> result = PageTemplate.pt_render(self, source, extra_context)
  /home/vdick/Zope4/src/zope/src/Products/PageTemplates/PageTemplate.py(83)pt_render()
-> return super(PageTemplate, self).pt_render(
  /home/vdick/Zope4/lib/python3.8/site-packages/zope/pagetemplate/pagetemplate.py(133)pt_render()
-> return self._v_program(
  /home/vdick/Zope4/src/zope/src/Products/PageTemplates/engine.py(354)__call__()
-> return template.render(**kwargs)
  /home/vdick/Zope4/lib/python3.8/site-packages/z3c/pt/pagetemplate.py(176)render()
-> return base_renderer(**context)
  /home/vdick/Zope4/lib/python3.8/site-packages/chameleon/zpt/template.py(307)render()
-> return super(PageTemplate, self).render(**_kw)
  /home/vdick/Zope4/lib/python3.8/site-packages/chameleon/template.py(192)render()
-> self._render(stream, econtext, rcontext)
  /home/vdick/Zope4/instance/var/cache/82bf2d94b3701d53e88a8bf035743b0b.py(126)render()
-> __content = __quote(__content, None, '\xad', None, None)
  /home/vdick/Zope4/instance/var/cache/82bf2d94b3701d53e88a8bf035743b0b.py(69)__quote()
-> __converted = convert(target)
  /home/vdick/Zope4/lib/python3.8/site-packages/chameleon/zpt/template.py(287)translate()
-> return txl(msgid, **kwargs)
  /home/vdick/Zope4/lib/python3.8/site-packages/z3c/pt/pagetemplate.py(163)translate()
-> return fast_translate(
  /home/vdick/Zope4/lib/python3.8/site-packages/chameleon/i18n.py(64)fast_translate()
-> result = translate(
  /home/vdick/Zope4/lib/python3.8/site-packages/zope/i18n/__init__.py(192)translate()
-> return util.translate(
> /home/vdick/Zope4/lib/python3.8/site-packages/Products/PerFactTranslationDomain/__init__.py(43)translate()
-> return 'translated'

I will try to further narrow down the problem, but maybe this helps.

@viktordick
Copy link
Contributor Author

@d-maurer OK I am sorry. Somehow I failed to actually install the correct branch - partly due to the # in the branch name and me not managing to construct a correct pip command to install this branch into the virtual environment, but mostly because in my virtualenv that I constructed from the requirements-full.txt Zope was installed into src and linked using easy_install while manually installing a checked-out branch using bin/pip install installs it into lib and somehow src takes precedence.

Anyhow, if I insert your changes directly, it works!

@viktordick
Copy link
Contributor Author

I have to do some more tests. With a minimal setup it seems that the problem originally posted is now gone. However, I have another example that still seems to be broken (an HTML fragment that is inserted using tal:replace="structure ..." but which seems to be still translated because our translation does an HTML cleanup and contained script-tags are filtered out, meaning the content has again gone through translation).

@viktordick
Copy link
Contributor Author

The problem that I am still seeing is that if I have a file named file_with_content in the context of the page template and use

<p tal:replace="here/file_with_content"></p>

the translation is called with the File object as argument. If no translation service is present, the content of the file is inserted instead.

The problem is similar to the original one because here also inside the chameleon cache file the passed target has no __html__ attribute and therefore convert(target) is called. However, in this case there is no difference between a cold run and a run that uses the chameleon cache.

So maybe it is not the same problem. I can work around it by using here/file_with_content/data, but in Zope 2 with Python 2 this was not necessary.

@viktordick
Copy link
Contributor Author

Let me add to that comment: Even though I can work around it, there is no i18n tag and (as far as I know, but I have to get more acquainted with that) no i18n.messageid, so the original assessment holds here also: no translation should be called. But the file created by chameleon seems to follow the logic: If the message is not a string and has no __html__ attribute, it should be passed through convert, which happens to point to the translate service.

The target is constructed by

__cache_139679322444512 = _static_139679340286928('path', 'here/file_containing_content', econtext=econtext)(_static_139679340286736(econtext, __zt_tmp))

with

_static_139679340286928 == <function _compile_zt_expr at 0x...>
_static_139679340286736 == <function _C2ZContextWrapper at 0x...>
_static_139679340286736(econtext, __zt_tmp) == <Products.PageTemplates.engine._with_vars_from_chameleon.<locals>.ContextWrapper object at 0x...>
__cache_139679322444512 == <File at /file_containing_content>

@d-maurer
Copy link
Contributor

d-maurer commented Jul 22, 2020 via email

@malthe
Copy link
Contributor

malthe commented Jul 22, 2020

I think I have described what's happening rather well in my comment in the Chameleon issue and given an easy workaround. Not sure what else is there to do? This is not a Chameleon issue.

@d-maurer
Copy link
Contributor

d-maurer commented Jul 22, 2020 via email

@d-maurer
Copy link
Contributor

d-maurer commented Jul 22, 2020 via email

@d-maurer
Copy link
Contributor

d-maurer commented Jul 22, 2020 via email

@viktordick
Copy link
Contributor Author

@d-maurer Thank you very much. I guess this simply means that our translation product should return any object that it is not ready to translate (probably anything that is not a string). This seems to work!

I hope this is the last surprise with this.

I do not know where this difference between zope.pagetemplate and chameleon should best be best documented, but I think it should be.

@d-maurer
Copy link
Contributor

d-maurer commented Jul 22, 2020 via email

@viktordick
Copy link
Contributor Author

Do not forget to handle (i.e. translate) zope.i18nmessageid.Message appropriately.

I think we do not use that, but I will keep it in mind. Thanks.

@d-maurer
Copy link
Contributor

Fixed in 4.x via #878

@d-maurer
Copy link
Contributor

Note that chameleon has an extremely wide notion for "i18n message". Those are the objects automatically (i.e. without an explicit i18n command) handed over to the translation function. While zope.pagetemplate considers only zope.i18nmessageid.Message instances as "i18n message"s, chameleon treats any object whose type is not str or unicode, which is not a number and which has not a __html__ method as "i18n message" and hands it over to the translation function.

The fix in #878 cannot change this; it only prevents that an internal chameleon type accidentally reaches the translation function.

@viktordick
Copy link
Contributor Author

I found one more problem, although it is a corner case that we probably do not actually use. I mentioned above how using tal:content="here/somefile" with a File object will also trigger the translation, so we changed our translation so it rejects any messages that fall under Chameleon's definition of translatable message but not zope.pagetemplate's.

However, now the case

<div i18n:translate="" tal:content="here/somefile"></div>

also does not get translated. In fact, the chameleon cache file contains the lines

__content = translate(__content, default=None, domain=__i18n_domain, context=__i18n_context, target_language=getitem('target_language'))
__content = __quote(__content, None, '\xad', None, None)

According to the observations above, this will pass the object twice through the translation, but it will in both cases be rejected and the __quote call will finally transform the message into a string using str because the translation returned the object itself.
Unfortunately, inside the translate function I have no way of knowing if there is an actual i18n:translate attribute on the tag and we are called from the explicit translate or if we are called from __quote because the object has no __html__ method. Short of inspecting the call stack that is, but I do not think that would be a good idea.

Returning str(msgid) in the translation for objects that are not strings also makes no difference between having an explicit i18n:translate attribute and not having it.

This only occurs with Files, not anything dynamic like a DTML Method or Script (Python). For those, the msgid is already a string. It can be worked around by using here/somefile/data, which will pass the content of the File instead of the File itself. Since we are probably not using such a combination anywhere (use the content of a file, but translate it), I can live with the current situation. I only wanted to mention it in case someone else stumbles upon it. It is a behavior that changed if compared to Zope 2.

I also reported this in chameleon#329.

So in summary, Chameleon has a different notion about what needs translating and while an explicit i18n:translate attribute will cause translation even for strings or objects with __hml__ method, for objects without such a method an explicit i18n:translate attribute will not make any difference and Chameleon will not really inform the translation function if it was called because of an explicit attribute or simply because of the type of the message.

By the way, in a comment above I said that the PRs related to this might not be needed any more because the translation engine anyhow has to reject messages it does not expect. However, upon closer inspection this is not correct. In particular it solves the deviating behavior between a run that creates the cache file and one that uses it. Without these changes, a similar problem like the one I still observe with File objects might occur with Tokens, but it would go away with a restart.

@d-maurer Unless you think this case should also be fixed somehow, I guess we could close this issue.

@d-maurer
Copy link
Contributor

d-maurer commented Jul 25, 2020 via email

@malthe
Copy link
Contributor

malthe commented Jul 25, 2020 via email

@d-maurer
Copy link
Contributor

d-maurer commented Jul 25, 2020 via email

@malthe
Copy link
Contributor

malthe commented Jul 25, 2020 via email

@viktordick
Copy link
Contributor Author

The translation function should always convert the object to a string, even a file-like object. I don’t see how this case is different than the other one we’ve discussed.
Whether the translation function converts the object to a string or simply returns it makes no difference, except that in the second case it is called twice if there is an explicit i18n:translate attribute. But it is called twice with the same arguments and both times simply returns its argument, so it makes no difference.

So the conclusion seems to be that chameleon expects the input to be a string in these cases. This can be achieved by either using here/somefile/data or python: str(here.somefile). In any case it is something I currently can not fix in the translation function but have to fix in the page templates (of which there are many). Like I said, the case that we insert content from a File and also want it translated is probably not present. And now that we have the fix from #878 and changed our translation function (returning the msgid itself or str(msgid) for messages that are not strings, both strategies work out the same), all other cases are covered.

The only way to cover this case also would be to allow a configuration of Chameleon that causes it to act like zope.pagetemplate (which probably means that __quote should not contain a call translate, although I am not sure if this would also handle zope.i18nmessageid.Message correctly - instead, it could simply use str(target) if the target is not a string. If there is an explicit i18n:translate attribute, an explicit call to translate is anyhow included before __quote is called).

I will close this issue now because the only think that could still be fixed would have to be fixed in Chameleon.

@malthe
Copy link
Contributor

malthe commented Jul 26, 2020

In the case where you have a file-object and <div i18n:translate="" tal:content="here/somefile"></div>:

  1. The conversion step (typically handled by the "translate" function) will convert the value to a string.
  2. The translation step (also handled by the "translate" function) will attempt to translate the string.

Why does this not result in a double translation? Because in (1) we either convert to a string or translate (a string or a message), never both. But it's up to the translate function to implement this logic.

@viktordick
Copy link
Contributor Author

1. The conversion step (typically handled by the "translate" function) will convert the value to a string.
2. The translation step (also handled by the "translate" function) will attempt to translate the string.

If I implement the translation function as either converting to string or translating, there is no double translation. But there is no single translation either, even if there is an i18n:translate attribute. That is because first 2. is called, then 1. (explicit translate is called before __quote). Since the translation returns a string, __quote will no longer fall back to using convert and therefore translate.

Maybe the order should be different, but I guess the primary function of __quote is to do the quoting when using tal:content without the structure keyword. Then it depends if for

<div i18n:translate="" tal:content="here/somefile"></div>

we expect the translated content to be quoted, or the quoted content to be translated. I think I would expect the translated content to be quoted.

@viktordick
Copy link
Contributor Author

As a side note, the original problem that caused me to investigate and post this issue was a

<script tal:replace="structure here/somefile></script>

where the somefile contained script tags. Since our translation does an HTML cleanup in order to avoid code injections by translations, the script tags were gone because Chameleon called __convert on the File, which has the same behavior as __quote and calls the translation for non-strings without __html__.

@d-maurer
Copy link
Contributor

d-maurer commented Jul 26, 2020 via email

@malthe
Copy link
Contributor

malthe commented Jul 26, 2020

I think I will take back some of what I have suggested after having looked more at this issue. However, here's an idea:

Typically, an i18n:translate will be used in conjunction with i18n:domain. On the other hand, an attempt to convert an object to a string – which invokes the translate function under the hood – does not get a value for domain. In fact it only gets a target_language keyword argument.

That is, for a file object, you should be able to tell whether or not a translation is being requested based on whether or not a translation domain is provided.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants