Use locale-aware sort ordering in JavaScript. #184
base: master
@dracos: thanks for this. But AIUI, calling localeCompare with a single argument means the list will be sorted according to the locale of the viewer, rather than the locale of the data, and I'm unconvinced (though open to persuasion) that that's a worthwhile thing to do. This should be a useful base for the full version though, once we add a language code to all the source data.
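To make the distinction concrete, here is a minimal sketch of the two call styles. The names are made-up sample data, and `'et'` (Estonian) merely stands in for whatever language code the source data might eventually carry; none of this is taken from the PR itself.

```javascript
// Hypothetical sample data, not from the real dataset.
const names = ['Tamm', 'Zimmer', 'Šmigun', 'Saar'];

// Single-argument form: no locale is given, so the runtime's default
// locale (in a browser, effectively the viewer's) decides the order.
const viewerOrder = [...names].sort((a, b) => a.localeCompare(b));

// Two-argument form: the order follows the stated locale instead,
// here Estonian, regardless of who is viewing the page.
const dataOrder = [...names].sort((a, b) => a.localeCompare(b, 'et'));

console.log(viewerOrder, dataOrder);
```

On a runtime with Estonian collation data available, the second list should place Zimmer before Tamm, while the first depends entirely on the environment it runs in.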
I think the order of the viewer is the right thing to do, because they are the person who will be trying to find things in the list. As the Unicode Collation Algorithm says: "If a Swedish employee at this French company accesses the data from a Swedish company location, the customer names need to show up in the order that meets this employee's expectations—that is, in a Swedish order" or "For example, if a German businessman making a database selection to sum up revenue in each of the cities from O... to P... for planning purposes does not realize that all cities starting with Ö were excluded because the query selection was using a Swedish collation, he will be one very unhappy customer." Likewise, if I sort the names in the Estonian list, I will not then look for Z before T.
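The Ö example from the quote can be reproduced with `Intl.Collator`. This is only an illustrative sketch: the city list is invented, and the results assume the runtime ships Swedish (`'sv'`) and German (`'de'`) collation data.

```javascript
// Hypothetical city names chosen to straddle the O..P range.
const cities = ['Oslo', 'Zürich', 'Örebro', 'Paris'];

// Sort a copy of the list using the collation rules of the given locale.
const sortWith = (locale) =>
  [...cities].sort(new Intl.Collator(locale).compare);

// Swedish treats Ö as a separate letter at the end of the alphabet.
const swedish = sortWith('sv');

// German treats Ö as a variant of O, so it stays within the O..P range.
const german = sortWith('de');

console.log(swedish, german);
```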
I, on the other hand, will, because I know it's a list of Estonian names and I know that in Estonian Z does come before T. So I'm curious as to why you[1] might actually be doing this (sorting the list to look for names starting with Z). [1] Including hypothetical yous.
The page is in English, though. I agree that if the page were in Estonian, then the sort order should be Estonian to match. What if e.g. the Estonian list contained a couple of Spanish names, or it's a legislature with names in multiple 'languages', or if the legislature is for an area with multiple official languages with different sort orders? I think the sorting of data in text is a property of the language of the text, not the language of the data. As to your question, I might e.g. want to see all the representatives in a particular party, so sort by party then scroll through to find the party name. I don't want to have to know the sorting order of the language(s) of every legislature in order to find it :)
As we currently only offer the text in English, wouldn't that mean that we should always only sort in English order, regardless of locale settings (nullifying this change)? There are three or possibly four different language settings/options in play here:
For a few countries we have (though don’t currently display) the data of things like Party and Constituency names in multiple languages. I would like to offer a way of toggling the language of the data between, say, English / Finnish / Swedish or English / Persian in those cases — even where we have no way to change the language of the rest of the text on the page. So let’s say we had a way to toggle between the English and Finnish versions of the data on http://data.everypolitician.org/finland/term_table.html (currently Group is English but Area is Finnish). And then let’s say someone visits that page with their locale set to German.
(And if the answer is different for people's names than for other types of data, where should (the made up) "Östman Party" and "Östmanin eduskuntaryhmä" appear when sorting by Group?)
Purely in terms of speed, sorting the UK House of Commons, which is one of the largest datasets we're likely to encounter, seems reasonably performant to me: http://everypolitician-viewer-pr-184.herokuapp.com/uk/term_table.html (though I haven't tested it on a particularly wide variety of browsers/machines/etc.) So the question is really just whether this is the right thing to do or not (or, I guess, whether it's at least better than what we currently have). To further complicate things, we may (ideally should) have both display_ and sort_ names. Clicking the sort on the Name column seems like it should use the sort_name, but that may be rather surprising to the user.
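For what it's worth, both points (speed and sort_name) could be handled along these lines. This is a rough sketch, not the PR's actual code: the row shape, the `name`/`sort_name` field names, and the `sortByName` helper are all assumptions made up for illustration. The one thing it leans on is that a single reused `Intl.Collator` is the generally recommended alternative to calling `localeCompare` once per comparison when sorting large arrays.

```javascript
// Hypothetical rows: `name` is what gets displayed, `sort_name` is optional.
const rows = [
  { name: 'Anna de Vries', sort_name: 'Vries, Anna de' },
  { name: 'Bob Adams' }, // no sort_name, so the display name is used
  { name: 'Carl Östman', sort_name: 'Östman, Carl' },
];

// Hypothetical helper: sorts a copy of the rows by sort_name, falling
// back to name. Passing undefined as the locale uses the runtime default.
function sortByName(data, locale) {
  // Build the collator once and reuse it for every comparison.
  const collator = new Intl.Collator(locale);
  const key = (row) => row.sort_name || row.name;
  return [...data].sort((a, b) => collator.compare(key(a), key(b)));
}

const sorted = sortByName(rows, 'en');
console.log(sorted.map((row) => row.name));
```

Note how "Anna de Vries" ends up last even though the displayed name starts with A, which is exactly the kind of result that might surprise someone clicking the Name column.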
I haven't actually tested that this works, and I see reports of possible slowness using localeCompare, so that should be checked too.