diff --git a/docs/site/TEMP-TEXT-FILES/data-stability.txt b/docs/site/TEMP-TEXT-FILES/data-stability.txt deleted file mode 100644 index 0b37438cbd9..00000000000 --- a/docs/site/TEMP-TEXT-FILES/data-stability.txt +++ /dev/null @@ -1,7 +0,0 @@ -Data stability -Please be mindful of data stability. Data change in the CLDR/Survey Tool will have impact to user experiences on devices and applications. -Please follow below tips to help with data stability: -Carefully review the previously Approved data before suggesting for a change. -When it's clearly incorrect, Add your suggestion and start a forum discussion -Don't change the data when it is already acceptable (even if not optimal)-consider data preference vs. data inaccuracy. -Bring evidence of a variant being much better and in customary use than the existing Approved data to the Forum discussions and gain consensus to change the Approved value. \ No newline at end of file diff --git a/docs/site/TEMP-TEXT-FILES/ddl.txt b/docs/site/TEMP-TEXT-FILES/ddl.txt deleted file mode 100644 index ffc93725ea4..00000000000 --- a/docs/site/TEMP-TEXT-FILES/ddl.txt +++ /dev/null @@ -1,7 +0,0 @@ -CLDR DDL Subcommittee -The Common Locale Data Repository (CLDR) is widely used, and the content has grown dramatically over the years with participation by organizations of all types and sizes, as well as many individual contributors. -Contributors for Digitally Disadvantaged Languages (DDL) face unique challenges. The CLDR-DDL subcommittee has been formed to evaluate mechanisms to make it easier for contributors for DDLs to: -become contributors to CLDR -improve the coverage for their language in CLDR -raise the status of their contributions, so that the CLDR data for their language is incorporated into more products. -The DDL Subcommittee has started to meet every other week as of June, 2023. 
\ No newline at end of file diff --git a/docs/site/TEMP-TEXT-FILES/empty-cache.txt b/docs/site/TEMP-TEXT-FILES/empty-cache.txt deleted file mode 100644 index ae802833d03..00000000000 --- a/docs/site/TEMP-TEXT-FILES/empty-cache.txt +++ /dev/null @@ -1,24 +0,0 @@ -Empty Cache -The Survey Tool depends on Javascript and you may run into issues because of JavaScript on your system being outdated. -In the Survey tool, when going to pages like the Forum or Dashboard, you may see a Disconnect error. As shown in this example screenshot with the Details expanded. -These errors are typically JavaScript being outdated on your system and clearing cache will resolve the problem. -To clear your cache, follow the steps below (Windows/macOS). -Windows (please scroll down for macOS/Safari) -1. On the page that you are running in to the error, click F12 or open Developer Tools from the browser (Settings > More tools > Developer tools). -2. Go to the settings, the ⚙️ icon: -3. Find "Disable Cache (while DevTools is open)" and check. -4. Refresh the page by clicking on the Refresh button on the page. -For additional information about Browser cache tips, see https://www.getfilecloud.com/blog/2015/03/tech-tip-how-to-do-hard-refresh-in-browsers/#.XRJaApMzbuM for examples. -macOS - Safari -Open Safari's settings (cmd-, or go to Preferences under the Safari menu) -Go to the Advanced tab and enable the Develop menu -In the Develop menu, select Empty Caches (cmd-alt-E) -Now refresh the web-page (cmd-R) -When you run into case JavaScript is the issue, you may have to try the following: -How do I stop JavaScript caching in Chrome? -Open Google Chrome and navigate to the page you want to test. -Press F12 (if available) or open Developer Tools from within Chrome's settings (Settings > More tools > Developer tools). -Click the cog in the top right of the pop-out box. Check the "Disable Cache (while DevTools is open)" setting box. 
-Then right-click the "Refresh" button (arrow going in a circle), and pick the last option (hard reload/empty cache). -Sometimes you need to flush your browser's cache when the Survey Tool updates. -See https://www.getfilecloud.com/blog/2015/03/tech-tip-how-to-do-hard-refresh-in-browsers/#.XRJaApMzbuM for examples. \ No newline at end of file diff --git a/docs/site/TEMP-TEXT-FILES/index-bcp47-extension.txt b/docs/site/TEMP-TEXT-FILES/index-bcp47-extension.txt deleted file mode 100644 index af25e2b780a..00000000000 --- a/docs/site/TEMP-TEXT-FILES/index-bcp47-extension.txt +++ /dev/null @@ -1,20 +0,0 @@ -Unicode Extensions for BCP 47 -IETF BCP 47 Tags for Identifying Languages defines the language identifiers (tags) used on the Internet and in many standards. It has an extension mechanism that allows additional information to be included. The Unicode Consortium is the maintainer of the extension ‘u’ for Locale Extensions, as described in rfc6067, and the extension 't' for Transformed Content, as described in rfc6497. -The subtags available for use in the 'u' extension provide language tag extensions that provide for additional information needed for identifying locales. The 'u' subtags consist of a set of keys and associated values (types). For example, a locale identifier for British English with numeric collation has the following form: en-GB-u-kn-true -The subtags available for use in the 't' extension provide language tag extensions that provide for additional information needed for identifying transformed content, or a request to transform content in a certain way. For example, the language tag "ja-Kana-t-it" can be used as a content tag indicates Japanese Katakana transformed from Italian. It can also be used as a request for a given transformation. -For more details on the valid subtags for these extensions, their syntax, and their meanings, see LDML Section 3.7 Unicode BCP 47 Extension Data. 
-Machine-Readable Files for Validity Testing -Beginning with CLDR version 1.7.2, machine-readable files are available listing the valid attributes, keys, and types for each successive version of LDML. The most recently released version is always available at http://unicode.org/Public/cldr/latest/ in a file of the form cldr-common*.zip (in older versions the file was of the form cldr-core*.zip). Inside that file, the directory "common/bcp47/" contains the data files defining the valid attributes, keys, and types. -The BCP47 data is also currently maintained in a source code repository, with each release tagged, for viewing directly without unzipping. For example, see https://github.com/unicode-org/cldr/tree/release-38/common/bcp47. The current development snapshot is found at https://github.com/unicode-org/cldr/tree/master/common/bcp47. -All releases including the latest are listed on http://cldr.unicode.org/index/downloads, with a link to each respective data directory under the column heading Data, and direct access to the repository under the GitHub Tag. -For example, the timezone.xml file looks like the following: - - - - -Using this data, an implementation would determine that "fr-u-tz-adalv" and "fr-u-tz-aedxb" are both valid. Some data in the CLDR data files also requires reference to LDML for validation according to Appendix Q of LDML. For example, LDML defines the type 'codepoints' to define specific code point ranges in Unicode for specific purposes. -Version Information -The following is not necessary for correct validation of the -u- extension, but may be useful for some readers. -Each release has an associated data directory of the form "http://unicode.org/Public/cldr/", where "" is replaced by the release number. The version number for any file is given by the directory where it was downloaded from. 
If that information is no longer available, the version can still be accessed by looking at the common/dtd/ldml.dtd file in the cldr-common*.zip file (for older versions, the core.zip file), at the element cldrVersion, such as the following. This information is also accessible with a validating XML parser. - -For each release after CLDR 1.8, types introduced in that release are also marked in the data files by the XML attribute "since", such as in the following example: \ No newline at end of file diff --git a/docs/site/TEMP-TEXT-FILES/index-charts.txt b/docs/site/TEMP-TEXT-FILES/index-charts.txt deleted file mode 100644 index a5210ca0799..00000000000 --- a/docs/site/TEMP-TEXT-FILES/index-charts.txt +++ /dev/null @@ -1,23 +0,0 @@ -CLDR Charts -The Unicode CLDR Charts provide different ways to view the Common Locale Data Repository data. -Latest - The charts for the latest release version -Dev - A snapshot of data under development -Previous - Previous available charts are linked from the download page in the Charts column -The format of most of the fields in the charts will be clear from the Name and ID, such as the months of the year. The format for others, such as the date or time formats, is structured and requires more interpretation. For more information, see UTS #35: Locale Data Markup Language (LDML). -Most charts have "double links" somewhere in each row. These are links that put the address of that row into the address bar of the browser for copying. -Note that not all CLDR data is included in the charts. -Version Deltas -Delta Data - Data that changed in the current release. -Delta DTDs - Differences between CLDR DTD's over time. -Locale-Based Data -Verification - Constructed data for verification: Dates, Timezones, Numbers -Summary - Provides a summary view of the main locale data. Language locales (those with no territory or variant) are presented with fully resolved data; the inherited or aliased data can be hidden if desired. 
Other locales do not show inherited or aliased data, just the differences from the respective language locale. The English value is provided for comparison (shown as "=" if it is equal to the localized value, and n/a if not available). The Sublocales column shows variations across locales. Hovering over each Sublocale value shows a pop-up with the locales that have that value. -By-Type - provides a side-by-side comparison of data from different locales for each field. For example, one can see all the locales that are left-to-right, or all the different translations of the Arabic script across languages. Data that is unconfirmed or provisional is marked by a red-italic locale ID, such as ·bn_BD·. -Character Annotations - The CLDR emoji character annotations. -Subdivision Names - The (draft) CLDR subdivision names (names for states, provinces, cantons, etc.). -Collation Tailorings - Collation charts (draft) for CLDR locales. -Other Data -Supplemental Data - General data that is not part of the locale hierarchy but is still part of CLDR. Includes: plural rules, day-period rules, language matching, language-script information, territories (countries), and their subdivisions, timezones, and so on. -Transform - (Disabled temporarily) Some of the transforms in CLDR: the transliterations between different scripts. For more on transliterations, see Transliteration Guidelines. -Keyboards - Provides a view of keyboard data: layouts for different locales, mappings from characters to keyboards, and from keyboards to characters. -For more details on the locale data collection process, please see the CLDR process. For filing or viewing bug reports, see CLDR Bug Reports. 
\ No newline at end of file diff --git a/docs/site/TEMP-TEXT-FILES/index-keyboard-workgroup.txt b/docs/site/TEMP-TEXT-FILES/index-keyboard-workgroup.txt deleted file mode 100644 index 4cf1cca777c..00000000000 --- a/docs/site/TEMP-TEXT-FILES/index-keyboard-workgroup.txt +++ /dev/null @@ -1,37 +0,0 @@ -CLDR Keyboard Subcommittee -The CLDR Keyboard Subcommittee is developing a new cross-platform standard XML format for use by keyboard authors for inclusion in the CLDR source repository. -News -2023-Feb-29: The CLDR-TC has authorized the proposed specification to be released as stable (out of Technical Preview). -2023-May-15: The CLDR-TC has authorized Public Review Issue #476 of the proposed specification, as a "Technical Preview." The PRI closed on 2023-Jul-15. -Background -CLDR (Common Locale Data Repository) -Computing devices have become increasingly personal and increasingly affordable to the point that they are now within reach of most people on the planet. The diverse linguistic requirements of the world's 7+ billion people do not scale to traditional models of software development. In response to this, Unicode CLDR has emerged as a standards-based solution that empowers specialist and community input, as a means of balancing the needs of language communities with the technologies of major platform and service providers. -The challenge and promise of Keyboards -Text input is a core component of most computing experiences and is most commonly achieved using a keyboard, whether hardware or virtual (on-screen or touch). However, keyboard support for most of the world's languages is either completely missing or often does not adequately support the input needs of language communities. Improving text input support for minority languages is an essential part of the Unicode mission. -Keyboard data is currently completely platform-specific. 
Consequently, language communities and other keyboard authors must see their designs developed independently for every platform/operating system, resulting in unnecessary duplication of technical and organizational effort. -There is no central repository or contact point for this data, meaning that such authors must separately and independently contact all platform/operating system developers. -LDML: The universal interchange format for keyboards -The CLDR Keyboard Subcommittee is currently rewriting and redeveloping the existing LDML (XML) definition for keyboards (UTS#35 part 7) in order to define core keyboard-based text input requirements for the world's languages. This format allows the physical and virtual (on-screen or touch) keyboard layouts for a language to be defined in a single file. Input Method Editors (IME) or other input methods are not currently in scope for this format. -CLDR: A home for the world's newest keyboards -Today, there are many existing platform-specific implementations and keyboard definitions. This project does not intend to remove or replace existing well-established support. -The goal of this project is that, where otherwise unsupported languages are concerned, CLDR becomes the common source for keyboard data, for use by platform/operating system developers and vendors. -As a result, CLDR will also become the point of contact for keyboard authors and language communities to submit new or updated keyboard layouts to serve those user communities. CLDR has already become the definitive and publicly available source for the world's locale data. -Unicode: Enabling the world's languages -Keyboard support is part of a multi-step, often multi-year process of enabling a new language or script. -Three critical parts of initial support for a language in content are: -Encoding, in the Unicode Standard -Display, including fonts and text layout -Input -Today, the vast majority of the languages of the world are already in the Unicode encoding. 
The open-source Noto font provides a wide range of fonts to support display, and the Unicode character properties play a vital role in display. However, input support often lags many years behind when a script is added to Unicode. -The LDML keyboard format, and the CLDR repository, will make it much easier to deliver text input. -Common Questions -What is the history of this effort? -In 2012, the original LDML keyboard format was designed to describe keyboards for comparative purposes. In 2018, a PRI was created soliciting further feedback. -The CLDR Keyboard Subcommittee was formed and has been meeting since mid-2020. It quickly became apparent that the existing LDML format was insufficient for implementing new keyboard layouts. -What is the current status? -Release -Updates to LDML (UTS#35) Part 7: Keyboards are scheduled to be released as part of CLDR v45. -Implementations -The SIL Keyman project is actively working on an open-source implementation of the LDML format. -How can I get involved? -If you want to be engaged in this workgroup, please contact the CLDR Keyboard Subcommittee via the Unicode contact form. \ No newline at end of file diff --git a/docs/site/TEMP-TEXT-FILES/index-process.txt b/docs/site/TEMP-TEXT-FILES/index-process.txt deleted file mode 100644 index 58a7f404862..00000000000 --- a/docs/site/TEMP-TEXT-FILES/index-process.txt +++ /dev/null @@ -1,144 +0,0 @@ -CLDR Process -Introduction -This document describes the Unicode CLDR Technical Committee's process for data collection, resolution, public feedback and release. -The process is designed to be light-weight; in particular, the meetings are frequent, short, and informal. Most of the work is by email or phone, with a database recording requested changes (See change request). -When gathering data for a region and language, it is important to have multiple sources for that data to produce the most commonly used data. 
The initial versions of the data were based on best available sources, and updates with new and improvements are released twice a year with work by contributors inside and outside of the Unicode Consortium. -It is important to note that CLDR is a Repository, not a Registration. That is, contributors should NOT expect that their suggestions will simply be adopted into the repository; instead, it will be vetted by other contributors. -The CLDR Survey Tool is the main channel for collecting data, and bug/feature request are tracked in a database (CLDR Bug Reports). -The final approval of the release of any version of CLDR is up to the decision of the CLDR Technical Committee. -Formal Technical Committee Procedures -For more information on the formal procedures for the Unicode CLDR Technical Committee, see the Technical Committee Procedures for the Unicode Consortium. -Specification Changes -The UTS #35: Locale Data Markup Language (LDML) specification are kept up to date with each release with change/added structure for new data types or other features. -Requests for changes are entered in the bug/feature request database (CLDR Bug Reports). -Structural changes are always backwards-compatible. That is, previous files will continue to work. Deprecated elements remain, although their usage is strongly discouraged. -There is a standing policy for structural changes that require non-trivial code for proper implementation, such as time zone fallback or alias mechanisms. These require design discussions in the Technical Committee that demonstrates correct function according to the proposed specification. -Data- Submission and Vetting -The contributors of locale data are expected to be language speakers residing in the country/region. In particular, national standards organizations are encouraged to be involved in the data vetting process. 
-There are two types of data in the repository: -Core data (See Core data for new locales): The content is collected from language experts typically with a CLDR Technical Committee member involvement, and is reviewed by the committee. This is required for a new language to be added in CLDR. See also Exemplar Character Sources. -Common locale data: This is the bulk of the CLDR data and data collection occurs twice a year using the Survey tool. (See How to Contribute.) -The following 4 states are used to differentiate the data contribution levels. The initial data contributions are normally marked as draft; this may be changed once the data is vetted. -Level 1: unconfirmed -Level 2: provisional -Level 3: contributed (= minimally approved) -Level 4: approved (equivalent to an absent draft attribute) -Implementations may choose the level at which they wish to accept data. They may choose to accept even unconfirmed data if having some data is better than no data for their purpose. Approved data are vetted by language speakers; however, this does not mean that the data is guaranteed to be error-free -- this is simply the best judgment of the vetters and the committee according to the process. -Survey Tool User Levels -There are multiple levels of access and control: -Vetter Level Number of Votes Description -TC Member 50 / 6 or 4 - Manage users in their organization -- Can vet and submit data for all locales (However, their vetting work is only done to correct issues.) -- Can see the email addresses for all vetters in their organization -- Only uses a 50 vote for items agreed to by the CLDR technical Committee -- TC members may have a 6 or 4 regular vote depending on how actively their organization participates in the TC -TC Organization Managers 6 - Manage users in their organization -- Can vet and submit data for all locales (However, their vetting work is only done to correct issues.) 
-- Can see the email addresses for all vetters in their organization -Organization Managers 4 -Manage users in their organization -- Can vet and submit data for all locales (However, their vetting work is only done to correct issues.) -- Can see the email addresses for all vetters in their organization -TC Organization Vetter 6 - Can vet and submit data for a particular set of locales. -- Can see the email addresses for submitted data in their locales. -- Cannot manage other users. -Organization Vetter 4 - Can vet and submit data for a particular set of locales -- Can see the email addresses for submitted data in their locales. -- Cannot manage other users. -Guest Vetter 1 - Can vet and submit data for a particular set of locales -- Cannot see email addresses. -- Cannot manage other users. -Locked Vetter 0 - If a user is locked or removed, then their vote is considered a zero weight. -These levels are decided by the technical committee and the TC representative for the respective organizations. -Unicode TC members (full/institutional/supporting) can assign its users to Regular or Guest level, and with approval of the TC, users at the Expert level. -TC Organizations that are fully engaged in the CLDR Technical Committee are given a higher vote level of 6 votes to reflect their level of expertise and coordination in the working of CLDR and the survey tool as compared to the normal organization vote level of 4 votes -Liaison or associate members can assign to Guest, or to other levels with approval of the TC. -The liaison/associate member him/herself gets TC status in order to manage users, but gets a Guest status in terms of voting, unless the committee approves a higher level. -Users assigned to "unicode.org" are normally assigned as Guest, but the committee can assign a different level. -Voting Process -Each user gets a vote on each value, but the strength of the vote varies according to the user level (see table above). 
-For each value, each organization gets a vote based on the maximum (not cumulative) strength of the votes of its users who voted on that item. -For example, if an organization has 10 Vetters for one locale, if the highest user level who voted has user level of 4 votes, then the vote count attributed to the organization as a whole is 4 for that item. -Optimal Field Value -For each release, there is one optimal field value determined by the following: -Add up the votes for each value from each organization. -Sort the possible alternative values for a given field -by the most votes (descending) -then by UCA order of the values (ascending) -The first value is the optimal value (O). -The second value (if any) is the next best value (N). -Draft Status of Optimal Field Value -Let O be the optimal value's vote, N be the vote of the next best value (or zero if there is none), and G be the number of organizations that voted for the optimal value. Let oldStatus be the draft status of the previously released value. -Assign the draft status according to the first of the conditions below that applies: -Resulting Draft Status Condition -approved - O > N and O ≥ 8, for established locales* -- O > N and O ≥ 4, for other locales -contributed - O > N and O ≥ 4 and oldstatus < contributed -- O > N and O ≥ 2 and G ≥ 2 -provisional O ≥ N and O ≥ 2 -unconfirmed otherwise -Established locales are currently found in coverageLevels.xml, with approvalRequirement[@votes="8"] -Some specific items have an even higher threshold. See approvalRequirement elements in coverageLevels.xml for details. -If the oldStatus is better than the new draft status, then no change is made. Otherwise, the optimal value and its draft status are made part of the new release. 
-For example, if the new optimal value does not have the status of approved, and the previous release had an approved value (one that does not have an error and is not a fallback), then that previously-released value stays approved and replaces the optimal value in the following steps. -It is difficult to develop a formulation that provides for stability, yet allows people to make needed changes. The CLDR committee welcomes suggestions for tuning this mechanism. Such suggestions can be made by filing a new ticket. -Data- Resolution -After the contribution of collecting and vetting data, the data needs to be refined free of errors for the release: -Collisions errors are resolved by retaining one of the values and removing the other(s). -The resolution choice is based on the judgment of the committee, typically according to which field is most commonly used. -When an item is removed, an alternate may then become the new optimal value. -All values with errors are removed. -Non-optimal values are handled as follows -Those with no votes are removed. -Those with votes are marked with alt=proposed and given the draft status: unconfirmed -If a locale does not have minimal data (at least at a provisional level), then it may be excluded from the release. Where this is done, it may be restored to the repository for the next submission cycle. -This process can be fine-tuned by the Technical Committee as needed, to resolve any problems that turn up. A committee decision can also override any of the above process for any specific values. -For more information see the key links in CLDR Survey Tool (especially the Vetting Phase). -Notes: -If data has a formal problem, it can be fixed directly (in CVS) without going through the above process. 
Examples include: -syntactic problems in pattern, extra trailing spaces, inconsistent decimals, mechanical sweeps to change attributes, translatable characters not quoted in patterns, changing ' (punctuation mark) to curly apostrophe or s-cedilla to s-comma-below, removing disallowed exemplar characters (non-letter, number, mark, uppercase when there is a lowercase). -These are changed in-place, without changing the draft status. -Linguistically-sensitive data should always go through the survey tool. Examples include: -names of months, territories, number formats, changing ASCII apostrophe to U+02BC modifier letter apostrophe or U+02BB modifier letter turned comma, or U+02BD modifier letter reversed comma, adding/removing normal exemplar characters. -The TC committee can authorize bulk submissions of new data directly (CVS), with all new data marked draft="unconfirmed" (or other status decided by the committee), but only where the data passes the CheckCLDR console tests. -The survey tool does not currently handle all CLDR data. For data it doesn't cover, the regular bug system is used to submit new data or ask for revisions of this data. In particular: -Collation, transforms, or text segmentation, which are more complex. -For collation data, see the comparison charts at http://www.unicode.org/cldr/comparison_charts.html or the XML data at http://unicode.org/cldr/data/common/collation/ -For transforms, see the XML data at http://unicode.org/cldr/data/common/transforms/ -Non-linguistic locale data: -XML data: http://unicode.org/cldr/data/common/supplemental/ -HTML view: http://www.unicode.org/cldr/data/diff/supplemental/supplemental.html -Prioritization -There may be conflicting common practices or standards for a given country and language. Thus LDML provides keyword variants to reflect the different practices (for example, for German it allows the distinction between PHONEBOOK and DICTIONARY collation.). 
-When there is an existing national standard for a country that is widely accepted in practice, the goal is to follow that standard as much as possible. Where the common practice in the country deviates from the national standard, or if there are multiple conflicting common practices, or options in conforming to the national standard, or conflicting national standards, multiple variants may be entered into the CLDR, distinguished by keyword variants or variant locale identifiers. -Where a data value is identified as following a particular national standard (or other reference), the goal is to keep that data aligned with that standard. There is, however, no guarantee that data will be tagged with any or all of the national standards that it follows. -Maintenance Releases -Maintenance releases, such as 26.1, are issued whenever the standard identifiers change (that is, BCP 47 identifiers, Time zone identifiers, or ISO 4217 Currency identifiers). Updates to identifiers will also mean updating the English names for those identifiers. -Corrigenda may also be included in maintenance releases. Maintenance releases may also be issued if there are substantive changes to supplemental data (non-language such as script info, transforms) data or other critical data changes that impact the CLDR data users community. -The structure and DTD may change, but except for additions or for small bug fixes, data will not be changed in a way that would affect the content of resolved data. -Data Retention Policy -Public Feedback Process -The public can supply formal feedback into CLDR via the Survey Tool or by filing a Bug Report or Feature Request. There is also a public forum for questions at CLDRMailing List (details on archives are found there). -There is also a members-only CLDRmailing list for members of the CLDR Technical Committee. -Public Review Issues may be posted in cases where broader public feedback is desired on a particular issue. 
-Be aware that changes and updates to CLDR will only be taken in response to information entered in the Survey Tool or by filing a Bug Report or Feature Request. Discussion on public mailing lists is not monitored; no actions will be taken in response to such discussion -- only in response to filed bugs. The process of checking and entering data takes time and effort; so even when bugs/feature requests are accepted, it may take some time before they are in a release of CLDR. -Data Release Process -Version Numbering -The locale data is frozen per version. Once a version is released, it is never modified. Any changes, however minor, will mean a newer version of the locale data being released. The version numbering scheme is "xy.z", where z is incremented for maintenance releases, and xy is incremented for regular semi-annual releases as defined by the regular semi-annual schedule. -Release Schedule -Early releases of a version of the common locale data will be issued as either alpha or beta releases, available for public feedback. The dates for the next scheduled release will be on CLDR Project. -The schedule milestones are listed below. -Milestone JiraPhase Description -Survey Tool Shakedown Selected survey tool users try out the survey tool and supply feedback. The contributed data will be considered as real data. -Data Submission dsub All registered survey tool users can add data and vet (vote for) data. -Data Vetting dvet The survey tool users' focus shifts to resolving data differences/disputes and errors. -Data Resolution The data contribution is closed for general contributors. The Technical Committee will close remaining errors and issues found during the release process. -Alpha and Beta releases rc The release candidates are available for testing. Only showstoppers will be triaged and fixed at this point. -Release final Release completed with referenceable release notes and links. -Labels in the Jira column correspond to the phase field in Jira. 
The phase field in Jira is used to identify tickets that need to be completed before the start of each milestone (table above). -Meetings and Communication -The currently-scheduled meetings are listed on the Unicode Calendar. Meetings are held by phone, every week at 8:00 AM Pacific Time (-08:00 GMT in winter, -07:00 GMT in summer). An additional meeting is scheduled every other Monday, depending on need and people's availability. -There is an internal email list for the Unicode CLDR Technical Committee, open to Unicode members and invited experts. All national standards bodies that are interested in locale data are also invited to become involved by establishing a Liaison membership in the Unicode Consortium, to gain access to this list. -Officers -The current Technical Committee Officers are: -Chair: Mark Davis (Google) -Vice-Chair: Annemarie Apple (Google) \ No newline at end of file diff --git a/docs/site/TEMP-TEXT-FILES/index-survey-tool.txt b/docs/site/TEMP-TEXT-FILES/index-survey-tool.txt deleted file mode 100644 index 5aeb81462c1..00000000000 --- a/docs/site/TEMP-TEXT-FILES/index-survey-tool.txt +++ /dev/null @@ -1,16 +0,0 @@ -CLDR Survey Tool -Survey Tool | Accounts | Guide | FAQ and Known Bugs -Introduction -CLDR provides key building blocks for software to support the world's languages, with the largest and most extensive standard repository of locale data available. -Translations in the Unicode Common Locale Data Repository are gathered and processed via what is called the Survey Tool, an online tool that can be used to view data for different languages and propose additions or changes. This tool provides a way to propose new localized data, see what others have proposed, and communicate with them to resolve differences.
During each submission period, contributors from Unicode Consortium members, other organizations and the public at large are invited to review the data for their languages and countries, and propose new translations of terms or modifications, including language translations entirely new to the repository. -Below are the main pages to look at. -Schedule -For the Milestone schedule, see the navigation bar on the left. -Accounts -You don't need an account to view data for a particular language. If you wish to propose changes or additions, you will need an account. For how to get one, see Survey Tool Accounts. If you would like to add data for a new locale, see Adding New Locales. -Guide -For an overview of how the Survey Tool works, see the Survey Tool Guide. -New Fields -To see a summary of the new fields that will be in the next version of CLDR, see http://cldr.unicode.org/index/downloads/dev. At the top of that page you can follow a link to the beta release page. -Development -For developers, see the development pages. \ No newline at end of file diff --git a/docs/site/TEMP-TEXT-FILES/plurals.txt b/docs/site/TEMP-TEXT-FILES/plurals.txt deleted file mode 100644 index 71c6143e9d3..00000000000 --- a/docs/site/TEMP-TEXT-FILES/plurals.txt +++ /dev/null @@ -1,46 +0,0 @@ -Plurals & Units -Plurals -In CLDR, Plurals are used for localized Units and Compact numbers (under Numbers). -In the Survey Tool, the plural forms shown differ by language, because only the forms relevant to that language are shown. -For example, in French, the One and Other forms are distinguished. Please see Plural Rules, and file a ticket if you see a form in the Survey Tool that is not expected in your language. -Note: Many of the sets of names form Logical Groups, and you need to make sure they have the same status or you will get error messages. See Logical Groups for more information.
-Localized Units -Localized units provide more natural ways of expressing unit phrases that vary in plural form, such as "1 hour" vs "2 hours". While they cannot express all the intricacies of natural languages, the plural forms allow for more natural phrasing than constructions like "1 hour(s)". -As well as being used for durations, like "3.5 hours", they can also be used for relative times, such as: -"3 hours ago" (regarding an event that took place 3 hours in the past; that is, 3 hours before now). -"In 3 hours" (regarding an event that will take place 3 hours in the future; that is, 3 hours from now). -Casing of Relative Times -All of these should have the same casing for the first character (capitalized or not) if the pattern has letters coming before the placeholder. For example, in the following, either #1 needs to have a lowercase 'd', or #2 needs to have an uppercase 'H'. -Dentro de {0} horas -hace {0} años -Each unit may have multiple plural forms, one for each category (see below). These are composed with numbers using a unitPattern. A formatted number will be substituted in place of the number placeholder. -For example, for English if the unit is an hour and the number is 1234, then the number is looked up to get the rule category other. The number is then formatted into "1,234" and composed with the unitPattern for other to get the final result. Examples are in the table below for the unit hour. 
-Locale Number Formatted number Plural category CLDR Unit Pattern PH Unit Pattern Final Result -en 0 "0" other {0} hours [NUMBER] hours "0 hours" -en 1 "1" one {0} hour [NUMBER] hour "1 hour" -en 1234 "1,234" other {0} hours [NUMBER] hours "1,234 hours" -fr 0 "0" one {0} heure [NUMBER] heure "0 heure" -fr 1 "1" one {0} heure [NUMBER] heure "1 heure" -fr 1234 "1 234" other {0} heures [NUMBER] heures "1 234 heures" -Narrow and Short Forms -Unit formats -Whereas an expression like “{0} englische Meile pro Stunde” may be fine for the long form of the unit speed-mile-per-hour, it does not work for the short or narrow forms. In particular, the narrow form needs to be absolutely as short as possible. It is intended for circumstances where there is very little room in the UI. However, the message will have additional context. Thus the user will typically know, for example, that the units are speed or distance. So a much shorter abbreviation can be used than would work in general. -In addition, when English units are used in languages that don't use them, they will typically be accompanied by the equivalent metric amount. For example, a map in Russian might show the distance between cities in the US both in metric units and in English units (since they are used in the US, and may be needed for reference there). Or an ad for computer monitors might have “60,9 cm (24 Zoll) Full-HD Monitor”, with Zoll for inches. The metric unit might not even be present where the English unit is in common use, such as when measuring computer screen sizes. -The short form can be longer than the narrow form, but should also be as short as possible, while still being clear and grammatical. It may have less context available than the narrow form, and thus may need to be somewhat longer than the narrow form in order to be clear. -Some techniques for shortening the narrow or short form include: -Drop the space between the value and the unit: “{0}km” instead of “{0} km”. 
-Use symbols like km² or / instead of longer terms like “Quadrat” or “ pro ”. -Use symbols that would be understood in context: e.g. “/h” for “ per hour” when the topic is speed, or "Mi" for mile(s) when the topic is distance. -Replace the qualifiers "English" or "American" by an abbreviation (UK, US), or drop them if most people would understand that the measurement would be an English unit (and not, say, an obsolete German or French one). -Use narrow symbols for CJK languages, such as “/” instead of “／”. -Which of these techniques you can use will depend on your language, of course. -Unit display names -The short and narrow forms of the display names for a unit need not be as short as the symbol used in the actual unit formats. In fact, since the display name may often provide the context necessary to properly understand a unit symbol, the display name will often be longer and more explicit than the short or narrow form of a unit symbol. Often the narrow display name is not specified so that it falls back to the short display name. -Past and Future -Unit patterns for past and future (3 hours ago, In 4 hours) are related to Relative Dates, and occur in the same circumstances. They need to have the same casing behavior. That is, if the translation for "Yesterday" starts with an uppercase 'Y', then the translation for "In {0} hours" needs to start with an uppercase 'I' (if it doesn't start with the placeholder). -Minimal Pairs -Minimal pairs are used to verify the different grammatical features used by a language. These messages are not to be translated literally; do not simply translate the English! -Plurals (cardinals) and Ordinals. See Determining Plural Categories. -Grammatical Case and Gender. See Grammatical Inflection. -Compound Units -Units of measurement can be formed from other units and other components. For more information, see Compound Units.
\ No newline at end of file diff --git a/docs/site/TEMP-TEXT-FILES/resolving-errors.txt b/docs/site/TEMP-TEXT-FILES/resolving-errors.txt deleted file mode 100644 index 45c7901bc7a..00000000000 --- a/docs/site/TEMP-TEXT-FILES/resolving-errors.txt +++ /dev/null @@ -1,22 +0,0 @@ -Handling Logical Group Errors -A "logical group" is a set of items that need to be treated as a single unit in terms of voting. -Examples of common logical groups in the survey tool data are: -Sets of month names or weekday names in a calendar. -Any group of items that have plural categories associated with them (for example, in currencies "1 US Dollar", "5 US Dollars"). -In compact decimals, groups of formats for 4-5-6 digits, 7-8-9 digits, 10-11-12 digits, or 13-14-15 digits. -All errors in logical groups must be resolved. Any unresolved errors must be resolved by the CLDR Technical Committee before a new version of CLDR can be released. -Logical Group Errors -There are two Errors or Warnings that you may see in the Survey Tool, and these errors should be fixed from the linguistic side as much as possible. -Error type 1: "Incomplete Logical Group" -This is the most serious error. It means that one or more items in a logical group have been added, but at least one other item is missing (✘). -To fix: Make sure that values for ALL of the items in the logical group are there. -An example: vote/enter values for all of the month names. Once you enter values for all the items in a logical group, this error will disappear. -Error type 2: "Inconsistent Draft Status" -This happens when the voting results would leave one of the items in a group with a lower draft status (✔︎ approved, ✔︎ contributed, ✘ provisional, ✘ unconfirmed) than some other item in the group. -All of the items have to have the same status. -To fix: Go through all your votes, and use the forum to coordinate with other vetters and come to an agreement on all items in the group.
-Error type 3: "This item has a lower draft status (in its logical group) than X." -Same as Error type 2. -Inherited items can count as errors if they are part of a Logical Group. The easiest way to resolve these is to explicitly vote for the inherited or aliased items. Here is an example, before and after. -Before -After \ No newline at end of file diff --git a/docs/site/TEMP-TEXT-FILES/review-formats.txt b/docs/site/TEMP-TEXT-FILES/review-formats.txt deleted file mode 100644 index 15a8f878666..00000000000 --- a/docs/site/TEMP-TEXT-FILES/review-formats.txt +++ /dev/null @@ -1,38 +0,0 @@ -Review Reports -The Reports provide a way to get an overview of some of the formats to help ensure consistency. They help you to see how the data from different sections will fit together to produce the results that users will see. -Some of the reports will not be relevant during a Limited Submission, because you will not be able to make changes to data. In v43, only the Person Names Report is relevant. -You should review the reports: -well before* the end of the Submission phase -at the start of the Vetting phase -well before* the end of the Vetting phase -* Make sure you leave enough time to vote for any additional needed items. Often a fix will require cooperation from other vetters, so you will need to file at least one forum request so that they know that there is a problem and why you think it needs fixing. -Once you are done, check the appropriate item in the top window: -I have reviewed the items below, and they are all acceptable. -The items are not all acceptable, but I have entered votes for the right ones and filed at least one forum post to explain the problems. -I have not reviewed the items. -Sometimes there will be a structural problem that cannot be fixed by votes, even if all the vetters agree. For example, Latvian needs some extra support for formatting person names. In that case, you should file a ticket to report the problem.
Don't file a ticket if the problem can be solved by you and the other vetters changing your votes. -To get started, in the Survey Tool, open the Reports from the left navigation. -General Tips -To correct the data, use the View links on the right of each line in Reports to go directly to the field and correct the data. Sometimes the 'view' link can't go to the exact line, because multiple items are involved in the formatting. -File at least one forum request where you need others to change their votes. If it is a general problem, such as the capitalization being wrong for all abbreviated months, you can file a forum request for the first one, and state the general problem there. -Examples of Problems -Check for consistency between different forms by looking at them side by side. -The casing is inconsistent (e.g. some months are capitalized and others are lowercase). See the capitalization rule. -Spelling consistency. -Use of hyphens in some rows/columns, but not in others. -Some abbreviations have periods and others don't, or some use native 'periods' and others use ASCII ".". -Person Name tips -Please read the Miscellaneous: Person Name Formats under "Review Report". -Date & Time Review Tips -Even if your language preference is to use the 12 hour format, it's important to also pay attention to the Times 24h section. In applications using CLDR data, users can set their preference to 24 hour formats. -Lower down on the page are charts of weekdays, months, and quarters for review. When a language has two different forms depending on format vs stand-alone, there will be two rows for the same item. Russian, for example, uses the genitive for format (top row, highlighted in yellow in the screenshot), and the nominative for stand-alone (second row). -Number formats Review tips -Each form should be acceptable for your locale. -Review the cells within each row for consistency. -Also look for consistency across the rows.
-Check that each cell has the correct plural form (if your language has plural forms). -Zones Review tips -The first two columns identify the timezone (metazone). -Compare the items in each row for consistency. -Compare the items in the same column across different rows. -City names that use hyphens do not show the hyphens in patterns because they are constructed from the city name and the pattern {0} Zeit. Consider whether it would be better to always remove the hyphens, or to add them to the pattern {0}-Zeit. \ No newline at end of file