Add more API languages. #81
That seems like a pretty interesting idea. I'm a little concerned about changing the folder structure (or in this case adding a folder structure) for the existing files. My other main concern is that we would need someone to maintain the different languages as well as make sure changes in files in one language end up propagating to the other languages. I, unfortunately, do not speak or write in Russian. |
It might be better to somehow store the different languages together, but that still wouldn't handle the divergence that comes with maintaining more than one language. |
Were you thinking just the descriptions or all fields converted to Russian and, inevitably, other languages? |
Yeah, pretty much. And also having the ability to request a specific language in a GET param. |
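To make the GET-param idea concrete, here is a minimal sketch of picking a language from a request URL. The parameter name `lang`, the default `en`, and the set of supported codes are all assumptions for illustration; nothing in this thread fixes them.

```python
from urllib.parse import urlparse, parse_qs

DEFAULT_LANG = "en"              # assumption: English stays the default
SUPPORTED = {"en", "ru", "es"}   # hypothetical set of available translations


def requested_language(url: str) -> str:
    """Return the language asked for in the query string, or the default."""
    query = parse_qs(urlparse(url).query)
    # parse_qs maps each parameter to a list of values; take the first one.
    lang = query.get("lang", [DEFAULT_LANG])[0].lower()
    # Unknown languages fall back to the default instead of raising an error.
    return lang if lang in SUPPORTED else DEFAULT_LANG
```

With this, a request for `/api/spells/fireball?lang=ru` would select Russian, and a request without the parameter would behave exactly as the API does today.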
Does the SRD exist in any other languages currently? There is prob a risk of copyright infringement as well, validating that translations are not pulling from proprietary (D&D) info. Just pointing that out as something to think about. |
I am not a lawyer. @benjaminapetersen yes the SRD exists in other languages. According to the OGL, it is allowed to translate the SRD as long as the translation is also under OGL. So yes, you may translate the SRD if you want. You shouldn't be able to get sued for that. But currently there is no official translation of the SRD. (By "official", I mean WotC-approved.) There are official translations of books, but not any of the SRD. Therefore I don't see why this should be included as part of this project's goals. Plus, if things are translated from the D&D books and not the SRD, we have no way to know that, and this project could be hosting copyright-infringing material without anyone noticing. It's already hard to keep non-SRD monsters, spells, races, and subclasses out of this repo; I don't see why we should take on the burden of doing it in languages we don't understand. |
Hi guys! I'd love to have this API available in Spanish. I understand the problems @ogregoire is pointing out and the others about changing the project structure or maintaining the language. My proposition to do it would be:
then, during the building process and before refreshing the database, automatically create new localized files (
:) |
This could work, indeed. But that'd require a lot of work to do the mapping which I don't know how to do. @bagelbits any insight on how to do so? However I wouldn't go as far as to include the country so far (enUS, esES), because there are simply not enough translations yet. Also I don't like the file naming: usually, in the programming languages I know, different locales are named as So basically, I'd recommend using:
|
I'm ok with that naming, I was just following the same pattern of the current files (using For the mapping I was thinking on a simple replacement script that gets the value of I can try later to write a POC for this. The more tedious work will be replacing the current values with the keys, but I think I can automate it too to replace the original files values with the keys and generate the first language (en). |
How would you deal with incomplete or in progress translation? |
By taking any language keys missing from the incomplete or in-progress translation from the "master" language, English. What I saw in several projects I worked on is to use that master language and add at the end of the text |
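The fallback described above could look roughly like this. It is only a sketch: the flat key format (`spells.fireball.name`) and the function names are hypothetical, and it does not add any marker to untranslated text.

```python
def with_fallback(master: dict, translation: dict) -> dict:
    """Build a complete locale: translated values where present, master otherwise."""
    merged = dict(master)
    for key, value in translation.items():
        # Ignore keys the master does not know and empty (unfinished) values.
        if key in master and value:
            merged[key] = value
    return merged


def missing_keys(master: dict, translation: dict) -> list:
    """List the master keys that still need translating, in sorted order."""
    return sorted(key for key in master if not translation.get(key))
```

`missing_keys` is there so translators can see what is left; the API itself would only ever need the merged result.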
Here you can find the POC: #158 |
I have some thoughts but I'll have to come back to this in a day or two. I haven't been exactly in the best headspace the last few days. Though I do like the direction where this is going. |
Sorry that took so long. I left a comment on that PR just for how we're keying everything. I think we could probably clean it up a bit based off of that but I really like this approach. It's simple and elegant, my favorite way to solve a problem. :D |
If we don't want values to be lists, I'd suggest keys along the lines of |
@carloslancha @bagelbits @fergcb https://github.com/Javrd/spanish-srd5.1-crawl/releases/tag/v0.1.0 |
@Javrd That's really useful! I think the first step is to pick up the work from the POC. This would break all of the English language out into a separate doc that could then be hot-swapped for alternative language files. I think the current state is that the POC is sound, but naming conventions need to be updated, and I think there are a bunch of merge conflicts, so the language file would probably have to start over. |
I'm worried about how this will actually get stored in the backend. Our json <-> mongodb pipeline would need to be altered a bit. Do we make one database per language? Keep languages as separate collections? Do we include translations in the documents themselves? https://stackoverflow.com/questions/23802834/multilingual-data-modeling-on-mongodb There's a few good approaches on this SO question that are worth exploring or feeling out |
Hmmmm. I think either separate db per language or separate collections? I'm trying to think about how to support this from a GraphQL standpoint. I guess it also raises the question on the API: how do we want to distinguish which language? Would that be in the URL or as a param? |
I think the API side can be flexible. We can default to something like |
Hey! I just created a pull request (#445) for another approach to multilingual support. It allows us to parse the source data and separate it into what should be translatable (locale) and what shouldn't be (templates). It also allows us to build the source data back together with an altered locale file, resulting in a translated version of the database. Would love to hear your thoughts on it! |
Oh dang. I completely forgot to encapsulate the alternative design we came up with in the Discord. I should do that here. I'll take a look at your PR though. |
I didn't know there was a Discord 😅, I'll check that out and get up to speed. Okay cool, let me know if you have any questions about it! |
@djurnamn Right. So. Here's my alternative suggestion:
I've been thinking about the multi-language support for the API a little bit more. And I think the design of one set of collections per language might be flawed/does not scale. On the one hand, it means you can just copy the file of all text from one language and translate it inline. However, I don't think the models in the API will easily support hot swapping which collection you're talking to based on the incoming language request. And I don't want to add a new set of models for each new supported language. The API should not care about new languages that get added after we start supporting them. We can handle this one of two ways.
Option A
Convert any string or array of strings to a hash where the key is the ISO language code and the value is the string/array in that language:
{
  "description": {
    "en_us": "something",
    "pt_br": "algo",
    "ja_jp": "なにか"
  }
}
or
{
  "description": {
    "en_us": ["something"],
    "pt_br": ["algo"],
    "ja_jp": ["なにか"]
  }
}
Option B
Option B is similar to Option A, except backwards compatible. Namely, we keep strings and arrays of strings the same. However, we add an additional key for each. The key would be the same but we append
{
  "description": "something",
  "description::localization": {
    "en_us": "something",
    "pt_br": "algo",
    "ja_jp": "なにか"
  }
}
or
{
  "description": "something",
  "description::localization": {
    "en_us": ["something"],
    "pt_br": ["algo"],
    "ja_jp": ["なにか"]
  }
}
Either is a pretty massive change, but this is an exceptionally complicated feature. I'm honestly leaning towards Option A, but I could be convinced for B. |
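To show how the API side might consume the Option A shape, here is a rough sketch that flattens a stored document down to one language. It assumes a localization map can be recognised as a dict whose keys are all known locale codes; the `LOCALES` set and the `en_us` fallback are assumptions, not part of the proposal.

```python
LOCALES = {"en_us", "pt_br", "ja_jp"}  # assumption: the set of known locale codes


def is_localized(value) -> bool:
    """True for an Option A map, i.e. a non-empty dict keyed only by locale codes."""
    return isinstance(value, dict) and bool(value) and set(value) <= LOCALES


def resolve(node, lang: str, fallback: str = "en_us"):
    """Replace every localization map in `node` with the value for `lang`."""
    if is_localized(node):
        # Use the requested language, or the fallback if it is not translated yet.
        return node.get(lang, node.get(fallback))
    if isinstance(node, dict):
        return {key: resolve(value, lang, fallback) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve(item, lang, fallback) for item in node]
    return node
```

The point of the sketch is that the models never need to know which languages exist: adding a language only adds keys inside the maps.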
You can find the original post in Discord here. And if you haven't joined the Discord server yet. Here's the invite. |
Okay, that's cool! My populate templates script could fairly easily be modified to put the data back together in either of those shapes. And I could extend the part currently reading from one locale file to iterate through a locales folder, allowing us to rebuild the source files with any set of languages we like. Thanks for the invite! :) |
Excellent! Yeah, my thoughts are you would basically build two scripts. One is a throwaway script that just coerces the data into this new shape. The second is a helper/tool script that will just prepare the database for a new language. Like adding in |
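For the first, throwaway script, something along these lines might be enough. This is a sketch under assumptions: the field names in `TRANSLATABLE_KEYS` are hypothetical stand-ins for whatever ends up being considered translatable, and `en_us` is assumed to be the language of the existing data.

```python
TRANSLATABLE_KEYS = {"name", "desc"}  # hypothetical: fields that hold display text


def is_text(value) -> bool:
    """True for a string or a non-empty list made up only of strings."""
    if isinstance(value, str):
        return True
    return isinstance(value, list) and bool(value) and all(isinstance(v, str) for v in value)


def coerce(node, lang: str = "en_us"):
    """Wrap translatable strings and string arrays in an Option A style map."""
    if isinstance(node, dict):
        return {
            key: {lang: value}
            if key in TRANSLATABLE_KEYS and is_text(value)
            else coerce(value, lang)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [coerce(item, lang) for item in node]
    return node
```

Values that are already wrapped are left alone, so running it twice over the same file should not nest the maps.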
Yeah, that sounds good. Let me know how I can help! I think at least the logic for distinguishing between translatable and non-translatable values in my script could be useful for that. It would be cool to have the locales separately in some standardized format (like WebExtensions json) so that they can be pulled into, and maintained in, a translation management system. And perhaps then, the second script you mention could optionally parse the locale files and add their values in the localization map. |
@djurnamn Sorry for taking so long to respond. However, we now have semantic versioning for the docker images that get built for the DB, so I feel way more comfortable with the breaking change this will cause.
Can you say more about this? Are you saying having the locale files being separate from the rest of the data similar to your initial proposal? That is technically doable if it gets all stitched together before getting shoved into the DB. I think I'm still leaning towards Option A if we go that route. Thoughts? |
Hey @bagelbits! Oh, that's cool! Yeah, I guess that just felt like a more manageable way to maintain the translated content. The compiled version would still be what you outlined in Option A. If the build script, that combines the translatable and non-translatable content into the preferred format, is outside of the scope for what this repo should be, I could just maintain that separately. I'll start working on a new version of the build script that outputs the compiled data in the Option A format. |
@djurnamn I think Option A as the final stitched product makes a lot of sense. I can also see how splitting the locales into their own files makes it a lot easier to manage and work within the repo itself. It also means if you want to translate to a new language, you just copy from the language you feel comfortable translating from. Does a mix of both sound good? (Assuming that made sense) |
Also, thank you for doing the legwork on this one! |
Sounds great! Yeah, no problem! Thanks for creating and maintaining this, it opens up so many cool possibilities. :) |
I don't know if you get notifications from changes made in the pull request, but I managed to get this working the other day if you wanna try it out! :) |
Excellent! I'll try to take a look this week! |
Hey, any update on this? |
I rewrote the parser and broke it out into its own repo: https://github.com/sospodd/5e-srd-translations It identifies the translatable content and creates a separate locale file for them. It can also create templates based on the structure of 5e-database json files where all translated content is represented by placeholders (paths in the generated locale json structure). |
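As a rough illustration of the split described here, the sketch below pulls translatable values out into a flat locale dict and leaves path placeholders in the template. It is not the code from that repo: the `{{path}}` placeholder syntax, the dotted path format, and the `TRANSLATABLE_KEYS` set are all assumptions.

```python
TRANSLATABLE_KEYS = {"name", "desc"}  # hypothetical: fields treated as translatable


def _join(path: str, part) -> str:
    """Append one segment to a dotted path, handling the empty root path."""
    return f"{path}.{part}" if path else str(part)


def _is_text(value) -> bool:
    """True for a string or a list made up only of strings."""
    return isinstance(value, str) or (
        isinstance(value, list) and all(isinstance(v, str) for v in value)
    )


def split(node, path: str, locale: dict):
    """Return a template for `node`, collecting translatable values into `locale`."""
    if isinstance(node, dict):
        template = {}
        for key, value in node.items():
            child = _join(path, key)
            if key in TRANSLATABLE_KEYS and _is_text(value):
                locale[child] = value                 # goes into the locale file
                template[key] = "{{" + child + "}}"   # placeholder stays behind
            else:
                template[key] = split(value, child, locale)
        return template
    if isinstance(node, list):
        return [split(item, _join(path, i), locale) for i, item in enumerate(node)]
    return node
```

A translator would then only ever touch the locale dict, while the template keeps the structure and the non-translatable values.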
That's awesome @djurnamn! But is it a standalone project, or would it be integrated into this repository? |
I think it can make sense to maintain the parser and generated source locale separately. And anybody who wants to create a translation in their own language can just fork that and get to work. I made a version of the template population script in my original pull request that took values from multiple locales and generated the 5e-database json files in the "Option A" format (where each translatable property is an object with locale keys and translated values). It wasn't super pretty, but I believe it worked. I'll try to revisit that soon and add it to the 5e-srd-translations repo as well. Not really sure how to proceed from there but we'd at least be able to generate the data in the desired shape. |
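The population step could be sketched like this: take a template with placeholders plus several locale dicts, and emit the "Option A" shape. The `{{path}}` placeholder syntax is an assumption for illustration, not necessarily what the actual script uses.

```python
import re

# Assumption: a placeholder is a whole string of the form {{some.dotted.path}}.
PLACEHOLDER = re.compile(r"^\{\{(.+)\}\}$")


def populate(node, locales: dict):
    """Replace each placeholder with a map of locale code to translated value."""
    if isinstance(node, dict):
        return {key: populate(value, locales) for key, value in node.items()}
    if isinstance(node, list):
        return [populate(item, locales) for item in node]
    if isinstance(node, str):
        match = PLACEHOLDER.match(node)
        if match:
            path = match.group(1)
            # Only locales that actually contain this path end up in the map.
            return {
                lang: strings[path]
                for lang, strings in locales.items()
                if path in strings
            }
    return node
```

Locales that have not translated a given path are simply left out of that map, which leaves the choice of fallback to whoever reads the data.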
Hey, love this repo!
I play DnD in Ukraine, and we use Russian as our primary language for the parties, so I would love to have this API available in different languages.
Maybe creating folders like en, ru and just storing the different json files there would be a good option?
I am ready to contribute, just don't want to deploy my own API :)