Use locale-aware sort ordering in JavaScript. #184

Open
wants to merge 2 commits into
base: master

Contributor

@dracos dracos commented May 15, 2015

Haven't actually tested that this works, and I've seen reports that localeCompare can be slow, so that should be checked too.

@tmtmtmtm
Contributor

@dracos: thanks for this. But AIUI, using the single-value localeCompare means it'll be sorted according to the locale of the viewer, rather than the locale of the data, and I'm unconvinced (though open to persuasion) that that's a worthwhile thing to do.

This should be a useful base for the full version though, once we add a language code to all the source data.
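
To make the distinction above concrete, here is a minimal sketch of the two behaviours. The helper names `sortForViewer` and `sortForData` are hypothetical and are not taken from this pull request, and the per-dataset language code is assumed to exist, as discussed above.

```javascript
// Hypothetical helpers for illustration only; not code from this PR.

// Viewer-locale ordering: with no locales argument, localeCompare uses the
// default locale of the JavaScript runtime (in a browser, the viewer's).
function sortForViewer(names) {
  return names.slice().sort(function (a, b) {
    return a.localeCompare(b);
  });
}

// Data-locale ordering: pass an explicit BCP 47 language tag instead.
// Assumes every dataset carries a language code such as 'et' or 'fi'.
function sortForData(names, dataLocale) {
  return names.slice().sort(function (a, b) {
    return a.localeCompare(b, dataLocale);
  });
}
```

Both helpers copy the array before sorting, so the caller's list is left untouched. `sortForData(names, 'et')` would then give the same order to every viewer, whereas `sortForViewer(names)` can differ from one browser to the next.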

@dracos
Contributor Author

dracos commented May 16, 2015

I think the order of the viewer is the right thing to do, because they are the person who will be trying to find things in the list. As the Unicode Collation Algorithm says: "If a Swedish employee at this French company accesses the data from a Swedish company location, the customer names need to show up in the order that meets this employee's expectations—that is, in a Swedish order" or "For example, if a German businessman making a database selection to sum up revenue in each of of the cities from O... to P... for planning purposes does not realize that all cities starting with Ö were excluded because the query selection was using a Swedish collation, he will be one very unhappy customer." – if I sort the names in the Estonian list, I will not then look for Z before T.

@tmtmtmtm
Contributor

> if I sort the names in the Estonian list, I will not then look for Z before T.

I, on the other hand, will, because I know it's a list of Estonian names and I know that in Estonian Z does come before T.

So I'm curious as to why you[1] might actually be doing this (sorting the list to look for names starting with Z).


[1] Including hypothetical yous.
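
For anyone following along, the Z-before-T point can be checked with a small sketch like the one below. The surnames are made-up samples chosen only for their initial letters, and `sortedWith` is a hypothetical helper, not something in this PR.

```javascript
// Made-up sample surnames, chosen only to start with S, T and Z.
var sampleNames = ['Tamm', 'Zirk', 'Saar'];

// Hypothetical helper: sort a copy of the list with the given locale's collation.
function sortedWith(locale, list) {
  var collator = new Intl.Collator(locale);
  return list.slice().sort(collator.compare);
}

var englishOrder = sortedWith('en', sampleNames);
var estonianOrder = sortedWith('et', sampleNames);

console.log('en:', englishOrder.join(', '));
console.log('et:', estonianOrder.join(', '));
```

In a runtime that ships Estonian collation data, the `et` line should put Zirk ahead of Tamm while the `en` line keeps Tamm first. A runtime without that data silently falls back to its default locale, so `Intl.Collator.supportedLocalesOf(['et'])` is worth checking before trusting the result.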

@dracos
Contributor Author

dracos commented May 18, 2015

The page is in English, though. I agree that if the page were in Estonian, then the sort order should be Estonian to match. But what if, e.g., the Estonian list contained a couple of Spanish names, or it's a legislature with names in multiple 'languages', or the legislature is for an area with multiple official languages with different sort orders? I think the sorting of data in text is a property of the language of the text, not the language of the data.

As to your question, I might, e.g., want to see all the representatives in a particular party, so I'd sort by party and then scroll through to find the party name. I don't want to have to know the sorting order of the language(s) of every legislature in order to find it :)

@tmtmtmtm
Contributor

> The page is in English, though … I think the sorting of data in text is a property of the language of the text, not the language of the data.

As we currently only offer the text in English, wouldn't that mean that we should always sort in English order, regardless of locale settings (nullifying this change)?


There are three or possibly four different language settings/options in play here:

  1. The language of the page (currently always English)
  2. The viewer’s browser locale
  3. The language of the data being viewed
  4. The language(s) of the country being viewed

For a few countries we have (though don’t currently display) the data of things like Party and Constituency names in multiple languages. I would like to offer a way of toggling the language of the data between, say, English / Finnish / Swedish or English / Persian in those cases — even where we have no way to change the language of the rest of the text on the page.

So let’s say we had a way to toggle between the English and Finnish versions of the data on http://data.everypolitician.org/finland/term_table.html (currently Group is English but Area is Finnish). And then let’s say someone visits that page with their locale set to German.

  • When the data view is set to English, and the table is sorted by Name, where should Peter Östman appear?
  • When the data view is set to Finnish, and the table is sorted by Name, where should Peter Östman appear?

(And if the answer is different for people's names than for other types of data, where should (the made up) "Östman Party" and "Östmanin eduskuntaryhmä" appear when sorting by Group?)
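
One way to experiment with the questions above is to make the choice of collation locale an explicit policy. This is only a sketch: the policy names, the `settings` object and the sample surnames (other than Östman) are all hypothetical, and nothing here is part of this PR.

```javascript
// Hypothetical policy switch: which of the four language settings drives sorting?
function collationLocale(policy, settings) {
  switch (policy) {
    case 'page': return settings.pageLanguage;          // 1. language of the page
    case 'viewer': return settings.viewerLocale;        // 2. viewer's browser locale
    case 'data': return settings.dataLanguage;          // 3. language of the data
    case 'country': return settings.countryLanguages;   // 4. language(s) of the country
    default: throw new Error('Unknown policy: ' + policy);
  }
}

function sortNames(names, policy, settings) {
  // Intl.Collator accepts a single tag or an array of tags in priority order.
  var collator = new Intl.Collator(collationLocale(policy, settings));
  return names.slice().sort(collator.compare);
}

// The scenario from the comment: English page, German viewer, Finnish data.
var scenario = {
  pageLanguage: 'en',
  viewerLocale: 'de',
  dataLanguage: 'fi',
  countryLanguages: ['fi', 'sv']
};
var surnames = ['Zetterberg', 'Östman', 'Pettersson', 'Olsson'];

['page', 'viewer', 'data', 'country'].forEach(function (policy) {
  console.log(policy + ':', sortNames(surnames, policy, scenario).join(', '));
});
```

With full locale data, the `page` and `viewer` policies should place Östman next to Olsson, while the `data` and `country` policies should move it after Zetterberg, since Finnish and Swedish treat Ö as a separate letter at the end of the alphabet. The sketch does not answer which policy is right; it only makes the difference visible.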

@tmtmtmtm tmtmtmtm requested a deployment to everypolitician-viewer-pr-184 May 26, 2015 14:59 Pending
@tmtmtmtm tmtmtmtm deployed to everypolitician-viewer-pr-184 May 26, 2015 15:00 Active
@tmtmtmtm tmtmtmtm deployed to everypolitician-viewer-pr-184 May 26, 2015 15:02 Active
@tmtmtmtm
Contributor

Purely in terms of speed, sorting the UK House of Commons, which is one of the largest datasets we're likely to encounter, seems reasonably performant to me: http://everypolitician-viewer-pr-184.herokuapp.com/uk/term_table.html (though I haven't tested it on a particularly wide variety of browsers/machines/etc.)
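
If speed does turn out to matter on slower machines, MDN's `localeCompare` page suggests creating one `Intl.Collator` and reusing its `compare` function when sorting large arrays. A minimal sketch, assuming table rows are plain objects; `makeSorter` and the row fields are hypothetical names rather than anything in this PR:

```javascript
// Hypothetical helper: build the collator once and reuse it for every sort.
function makeSorter(locale) {
  // Passing undefined here means "use the runtime's default locale".
  var collator = new Intl.Collator(locale);
  return function (rows, key) {
    return rows.slice().sort(function (a, b) {
      return collator.compare(a[key], b[key]);
    });
  };
}

var sortRows = makeSorter('en');
var tableRows = [
  { name: 'Smith', group: 'B Party' },
  { name: 'Jones', group: 'C Party' },
  { name: 'Adams', group: 'A Party' }
];

var rowsByName = sortRows(tableRows, 'name');
var rowsByGroup = sortRows(tableRows, 'group');
```

The collator is constructed once per locale rather than implicitly on every comparison, which is where the reported `localeCompare` overhead tends to come from. Whether the difference is noticeable for a table of this size would still need measuring.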

So the question is really just whether this is the right thing to do or not (or, I guess, whether it's at least better than what we currently have).

To further complicate things, we may (ideally should) have both display_ and sort_ names. Clicking the sort on the Name column seems like it should use the sort_name, but that may be rather surprising to the user.
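
The surprise is easy to see in a small sketch. The field names `display_name` and `sort_name` are assumptions based on the "display_ and sort_ names" wording above, the records other than Peter Östman are made up, and `sortBySortName` is a hypothetical helper.

```javascript
// Assumed record shape: display_name is shown, sort_name drives the ordering.
var people = [
  { display_name: 'Peter Östman', sort_name: 'Östman, Peter' },
  { display_name: 'Zoe Adams', sort_name: 'Adams, Zoe' },   // made-up record
  { display_name: 'Anna Berg', sort_name: 'Berg, Anna' }    // made-up record
];

// Hypothetical helper: order by sort_name, falling back to display_name.
function sortBySortName(list, locale) {
  var collator = new Intl.Collator(locale);
  return list.slice().sort(function (a, b) {
    return collator.compare(
      a.sort_name || a.display_name,
      b.sort_name || b.display_name
    );
  });
}

var shownOrder = sortBySortName(people, 'en').map(function (p) {
  return p.display_name;
});
console.log(shownOrder.join(' | '));
```

Sorted this way, the Name column reads Zoe Adams, Anna Berg, Peter Östman: correct by family name, but apparently unsorted to a viewer reading the first letter of each cell.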

@tmtmtmtm tmtmtmtm removed the 3 - WIP label Jul 16, 2016
2 participants