This repository has been archived by the owner on May 2, 2022. It is now read-only.

Add script for finding missing translations across all locales #469

Closed
wants to merge 2 commits into from

michaelmcmillan commented Mar 29, 2020

The script is heavily commented with what I've been thinking. There might be a logical error here, so please read through the script to see if it makes sense.

This script will produce false positives, since it looks at every change ever made to each locale file. Why did I make it like that? Because translations are added at different points in time: a new feature branch will contain new translations, while all the existing branches and locales will be missing those.

The false positives are things like translations that have since been removed. In return, the output includes translations that are being worked on right now in other branches, which should let us start translating before a PR is done and widen the bottleneck that is currently halting PRs. We could also just add the false positives to an ignore list to make them go away.
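The core idea described above can be sketched roughly as follows. This is not the script from this PR (which is not shown here); `find_missing`, its parameters, and the sample data are hypothetical names chosen for illustration, and the step that collects keys from git history is assumed to have already happened.

```python
from typing import Dict, Iterable, List, Set


def find_missing(
    all_known_keys: Iterable[str],
    locales: Dict[str, Dict[str, str]],
    ignored: Iterable[str] = (),
) -> Dict[str, List[str]]:
    """Return, per locale, the known keys that the locale has no entry for."""
    # Keys on the ignore list are treated as accepted false positives.
    wanted: Set[str] = set(all_known_keys) - set(ignored)
    return {
        name: sorted(wanted - set(translations))
        for name, translations in locales.items()
    }


if __name__ == "__main__":
    # Hypothetical data: keys assumed to be gathered from every revision and branch.
    known = {"Sweden", "Netherlands", "Old removed key"}
    locales = {
        "no": {"Netherlands": "Nederland"},
        "se": {"Sweden": "Sverige"},
    }
    report = find_missing(known, locales, ignored={"Old removed key"})
    for locale, missing in report.items():
        for key in missing:
            print(f"{locale} missing translation for: {key}")
```

Keeping the ignore list as a plain set of keys means a false positive can be silenced by adding a single line, without touching the comparison logic.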

I'm hoping this will make it easier for us to quickly grab all missing translations for all locales and then hand them off to a non-technical person who doesn't know JSON or git, maybe via a friendlier medium like Google Sheets.
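One possible way to hand the result to a translator is a CSV file, which Google Sheets can import. The sketch below is an assumption about how that could look, not part of the PR; `missing_to_csv` and the column names are hypothetical.

```python
import csv
import io
from typing import Dict, List


def missing_to_csv(missing: Dict[str, List[str]]) -> str:
    """Render missing keys as CSV with an empty column for the translator to fill in."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["locale", "source text", "translation"])
    for locale in sorted(missing):
        for key in missing[locale]:
            # The csv module quotes keys containing commas, quotes or newlines.
            writer.writerow([locale, key, ""])
    return buffer.getvalue()
```

The third column is deliberately left empty so the sheet can be filled in and later read back into the locale files.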

This is what I get if I run the script right now for no, en-IN and se.

"no missing translation for: Sweden" basically means that the Norwegian locale (no.json) has never had a row in its JSON file where the key is Sweden and the value is Sverige.

Does this make sense? Haha, I'm actually not sure. Here's the full output:

no missing translation for: Sweden
no missing translation for: https://www.whatismyzip.com
no missing translation for: What is my Zip code?
---
se missing translation for: this
se missing translation for: https://who.org
se missing translation for:  article as an attempt to chart these numbers.
se missing translation for: In total <%= numberWithSpaces(totalPeopleInContactWithInfected) %> people have reported that they have been in close contact with a person who was tested positive for COVID-19.
se missing translation for: In total <%= numberWithSpaces(totalInfectedPeopleWithSymptoms) %> people have reported that they have tested positive for COVID-19 and experience symptoms.
se missing translation for: In total <%= numberWithSpaces(totalPeopleInContactWithInfected) %> people have reported that they have been tested for COVID-19.
se missing translation for: Join the most important crowdsource! Regardless if you're healthy or not, please submit the form below – that is also valuable information!
se missing translation for: Zip code information
se missing translation for: Netherlands
se missing translation for:  if you can't get it to work.
se missing translation for: I agree to my being data in accordance with the privacy statement
se missing translation for: To WHO.org
se missing translation for: /healthcondition/
se missing translation for: Using self-report, you can get a better picture of how many people have symptoms without exposing health workers to potential contamination hazards and without using up valuable infection control equipment that is already in short supply. We at Bustbyte, well helped by other volunteers, have created this tool in response to
se missing translation for: In total <%= numberWithSpaces(totalPeopleWithSymptoms) %> people have reported that they experience symptoms.
---
en-IN missing translation for: this
en-IN missing translation for: https://who.org
en-IN missing translation for:  article as an attempt to chart these numbers.
en-IN missing translation for: In total <%= numberWithSpaces(totalPeopleInContactWithInfected) %> people have reported that they have been in close contact with a person who was tested positive for COVID-19.
en-IN missing translation for: In total <%= numberWithSpaces(totalInfectedPeopleWithSymptoms) %> people have reported that they have tested positive for COVID-19 and experience symptoms.
en-IN missing translation for: In total <%= numberWithSpaces(totalPeopleInContactWithInfected) %> people have reported that they have been tested for COVID-19.
en-IN missing translation for: Join the most important crowdsource! Regardless if you're healthy or not, please submit the form below – that is also valuable information!
en-IN missing translation for: Zip code information
en-IN missing translation for: Netherlands
en-IN missing translation for:  if you can't get it to work.
en-IN missing translation for: I agree to my being data in accordance with the privacy statement
en-IN missing translation for: To WHO.org
en-IN missing translation for: /healthcondition/
en-IN missing translation for: Using self-report, you can get a better picture of how many people have symptoms without exposing health workers to potential contamination hazards and without using up valuable infection control equipment that is already in short supply. We at Bustbyte, well helped by other volunteers, have created this tool in response to
en-IN missing translation for: In total <%= numberWithSpaces(totalPeopleWithSymptoms) %> people have reported that they experience symptoms.

    return {}

if __name__ == '__main__':
    all_locales = ('no', 'se', 'en-IN')
Member

Is this correct?

Member Author

Just to test, but should include all locales.

Contributor

Would be nice to import that from all available locales in our repo.

Member Author

Very good point – I'll add that
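Deriving the locale list from the repository instead of hard-coding it could look something like this. The directory layout (one `<locale>.json` file per locale in a single folder) is an assumption based on the `no.json` file mentioned earlier, and `discover_locales` is a hypothetical helper.

```python
from pathlib import Path
from typing import Tuple


def discover_locales(locales_dir: Path) -> Tuple[str, ...]:
    """Return the locale names, taken from the *.json file names in locales_dir."""
    # Path.stem drops the ".json" suffix, so "en-IN.json" becomes "en-IN".
    return tuple(sorted(path.stem for path in locales_dir.glob("*.json")))
```

With this in place, `all_locales` would pick up a newly added locale file without anyone having to edit the script.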

@adriaandotcom
Contributor

Also note that the front end trims the spacing from the keys of the translation files.

@fossecode
Member

What if we ran this as part of our GitHub workflow instead, and verified that all the locales on the current branch contain the same keys? Or would that work against its purpose?

@adriaandotcom
Contributor

Yeah, maybe nice to have warnings when you push to a repo or something.
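A branch-local check of the kind suggested here might be sketched as below: it only compares the locale files as they exist on the current checkout and returns a non-zero exit code when they disagree, which a CI step could use to warn or fail. The function names and the assumption of flat `<locale>.json` files are hypothetical.

```python
import json
from pathlib import Path
from typing import Dict, Set


def key_differences(locales: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each incomplete locale to the keys that some other locale has but it lacks."""
    all_keys: Set[str] = set().union(*locales.values())
    return {
        name: all_keys - keys
        for name, keys in locales.items()
        if all_keys - keys
    }


def check_directory(locales_dir: str) -> int:
    """Print missing keys for every locale file and return a process exit code."""
    locales = {
        path.stem: set(json.loads(path.read_text(encoding="utf-8")))
        for path in Path(locales_dir).glob("*.json")
    }
    differences = key_differences(locales)
    for name in sorted(differences):
        for key in sorted(differences[name]):
            print(f"{name} missing translation for: {key}")
    # Non-zero signals a failure to the calling workflow step.
    return 1 if differences else 0
```

A workflow step could call `sys.exit(check_directory("locales"))`, where the directory name is again an assumption. Unlike the history-based approach in the PR description, this would not see translations that only exist on other branches.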

@michaelmcmillan
Member Author

> Also note that the front end trims the spacing from the keys of the translation files.

Ah good catch! So it removes all whitespace, or simply trims redundant whitespace?

@adriaandotcom
Contributor

adriaandotcom commented Mar 29, 2020

It basically does this:

app.use((req, res, next) => {
  const translate = (
    text: string,
    ...options: string[] | [Replacements]
  ): string => {
    const replaced = text.replace(/[\s\n\t]+/g, ' ').trim();
    // eslint-disable-next-line @typescript-eslint/ban-ts-ignore
    // @ts-ignore
    return i18n.__.apply(req, [replaced, ...options]);
  };
  res.locals.__ = translate;
  res.__ = translate;
  next();
});
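For a Python script comparing keys, a roughly equivalent normalization could be applied before the comparison, so that keys differing only in whitespace (such as the ones with a leading space in the output above) are treated as the same key. `normalize_key` is a hypothetical name, and the whitespace classes of JavaScript and Python regular expressions are similar but not guaranteed to be identical.

```python
import re


def normalize_key(text: str) -> str:
    """Collapse each run of whitespace into one space, then strip both ends."""
    return re.sub(r"\s+", " ", text).strip()
```

For example, `normalize_key(" if you can't get it to work.")` drops the leading space, and a key broken across several lines in a template collapses onto a single line.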

@michaelmcmillan
Member Author

Nice! I'll steal that line. I've opened a new PR where this script is written in JS instead: #481
