tc39 / ecma402 Goto Github PK

View Code? Open in Web Editor NEW

524.0 524.0 102.0 4.94 MB

Status, process, and documents for ECMA 402

Home Page: https://tc39.es/ecma402/

License: Other

HTML 99.63% Shell 0.37%

ecma402's People

Contributors

Stargazers

Watchers

Forkers

caridy zbraniecki anba stasm littledan watilde jswalden mathiasbynens looknd nwservices alexxnica kryndex jebcat1982 okuryu rxaviers dilijev kleopatra999 jameslinus chicoxyzzy jackhorton fabalbon srl295 a136806 sffc snowlend nciric achatt89 rockframeworks ms2ger gsathya ryzokuken gibson042 frankyftang neotim intern5 dalavancloud leobalter romulocintra bocoup jkalio52 kamingli1st jennylut santiagoh3 spectranaut ttblood hebertsilva smulacashout cymgen30 caiolima longlho mbrukman pmsto shvaikalesh rkirsling cybernetics mobz pyrocat101 cortizg ptomato wangyuks federicobucchi isabella232 exe-boss palxx marcdahan bruna17 manny27nyc fayti1703 gramidt loonyfoxe frenchy77 fayazmiraz acidburn0zzz idanho global-localhost global19 global19-atlassian-net daebowale yinhk-forks pinkdiamond1 v-s-naaviinesh iwtem ghdevs boldgainobu dimples1988 alexsio-nau jsph-lm justingrant ben-allen ethicalsecurity-agency akjkmega trflynn89 salewski linusg arai-a seanpm2001 jessealama gesa

ecma402's Issues

[Discussion] find a unicode symbol to replace `[[<_key_>]]` mechanism in the 3rd edition

/cc @rwaldron @bterlson

[Migrating Ecma-402 to HTML] Section 7 - Requirements for Standard Built-in ECMAScript Objects

All intrinsics must be code formatted

Expose datetime formatting user preferences

Cloned from whatwg/html#171

All operating systems allow user to select time format. Usually between 12h, 24h or follow automatic.

Firefox OS is currently providing navigator.mozHour12 that can take true/false/undefined values and that nicely fits into Intl API:

(new Date()).toLocaleString(navigator.languages, {
  hour12: navigator.mozHour12,
  hour: 'numeric',
  minute: 'numeric'
});

When "use default" is used, mozHour is undefined and Intl uses the default hour12 value for the given locale. If mozHour is specified to either true or false, Intl API follows that.

There's also an event associated with it: timeformatchange that is dipatched on window.

We're not yet sure if it should be part of Ecma 402 or HTML spec because of events, but I'm opening the issue here to start the conversation in case the HTML group will throw it here.

I can see us wanting to design the API to handle more than one preference or just follow @ehsan's proposal from the HTML issue:

[NoInterfaceObject, Exposed=(Window,Worker)]
interface NavigatorSystemHour12 {
  readonly attribute boolean systemHour12;
};
Navigator implements NavigatorSystemHour12;
WorkerNavigator implements NavigatorSystemHour12;

Extend Formatters to allow for token formatting

UPDATE:

DateTimeFormat and NumberFormat are design to provide an opaque string as an output that is not meant to be manipulated by the consumer.
Unfortunately, that is impossible if one needs to format the output and that currently forces very dirty solutions that may break internationalization.

Example use case, is formatting minutes token to be smaller or different color than hours in hour:minute string.

var formatter = Intl.DateTimeFormat(navigator.languages, {
  hour: 'numeric',
  minute: 'numeric'
});
var string = formatter.format(new Date());
element.innerHTML = string; // we want to display hour token with bold font.

Modern user interfaces do that all the time - examples are Material Design in Android, modern Windows Mobile UI and PS4.

My proposal is to allow for token format strings to be available as option for the formatter. It could look like this:

var formatter = Intl.DateTimeFormat(navigator.languages, {
  hour: 'numeric',
  minute: 'numeric',
  format: {
    hour: '<strong>$&</strong>' // similar to String.replace(hour, '<strong>$&</strong>')
  }
});
var string = formatter.format(new Date()); // return '<strong>14</strong>:12' in pl

For non HTML outputs, the formatting may involve console formatting characters for terminal UI apps.

[Migrating Ecma-402 to HTML] Section 10 - Collator Objects

[Proposal] Compact Decimal Format to abbreviate large numbers

CLDR contains the details to abbreviate large numbers, e.g.:

        "decimalFormats-numberSystem-latn": {
          "standard": "#,##0.###",
          "long": {
            "decimalFormat": {
              "1000-count-one": "0 thousand",
              "1000-count-other": "0 thousand",
              "10000-count-one": "00 thousand",
              "10000-count-other": "00 thousand",
              "100000-count-one": "000 thousand",
              "100000-count-other": "000 thousand",
              "1000000-count-one": "0 million",
              "1000000-count-other": "0 million",
              "10000000-count-one": "00 million",
              "10000000-count-other": "00 million",
              "100000000-count-one": "000 million",
              "100000000-count-other": "000 million",
              "1000000000-count-one": "0 billion",
              "1000000000-count-other": "0 billion",
              "10000000000-count-one": "00 billion",
              "10000000000-count-other": "00 billion",
              "100000000000-count-one": "000 billion",
              "100000000000-count-other": "000 billion",
              "1000000000000-count-one": "0 trillion",
              "1000000000000-count-other": "0 trillion",
              "10000000000000-count-one": "00 trillion",
              "10000000000000-count-other": "00 trillion",
              "100000000000000-count-one": "000 trillion",
              "100000000000000-count-other": "000 trillion"
            }
          },
          "short": {
            "decimalFormat": {
              "1000-count-one": "0K",
              "1000-count-other": "0K",
              "10000-count-one": "00K",
              "10000-count-other": "00K",
              "100000-count-one": "000K",
              "100000-count-other": "000K",
              "1000000-count-one": "0M",
              "1000000-count-other": "0M",
              "10000000-count-one": "00M",
              "10000000-count-other": "00M",
              "100000000-count-one": "000M",
              "100000000-count-other": "000M",
              "1000000000-count-one": "0B",
              "1000000000-count-other": "0B",
              "10000000000-count-one": "00B",
              "10000000000-count-other": "00B",
              "100000000000-count-one": "000B",
              "100000000000-count-other": "000B",
              "1000000000000-count-one": "0T",
              "1000000000000-count-other": "0T",
              "10000000000000-count-one": "00T",
              "10000000000000-count-other": "00T",
              "100000000000000-count-one": "000T",
              "100000000000000-count-other": "000T"
            }
          }
        },

@rxaviers proposed this feature a while ago IIRC, but I could find the thread, so I'm posting it here to formalize the proposal.

Proposal

This proposal goes hand-to-hand with the pluralization (#34) since it will have to compute what's the pluralization token to choose the right format (In the CLDR data you will see that for english we have a lot of *-count-one and *-count-other).

The initial proposal could be to add one more configuration to specify either:

a) best-fit for the abbreviation, which means we choose the biggest matching decimal from the segments
b) the decimal reference to force to use a particular formatting option (e.g.: 10000, which could produce 1234K);

new Intl.NumberFormat('en', { compact: 'best-fit' }).format(1234000); // 1.2M
new Intl.NumberFormat('en', { compact: '4-digits' }).format(1234000); // 1234K

Open questions

how to match this with currency values so we can produce something like $12M.
the name of the configuration option when creating a new Intl.NumberFormat()
1M vs 1.2M
1M vs 1 Million (this will probably require another configuration to specify long vs short.
rounding and custom rounding settings? (related to rxaviers/ecma402-number-format-round-option#1 (comment))

/cc @jfparadis @zbraniecki

[Migrating Ecma-402 to HTML] Section 6 - Identification of Locales, Currencies, and Time Zones

[Migrating Ecma-402 to HTML] Annex A/B

https://github.com/tc39/ecma402/blob/master/spec/source/annexes.html

All string literals must be code formatted

@bterlson why have you changed the convention for string literals?

to:

[Migrating Ecma-402 to HTML] Section 12 - DateTimeFormat Objects

https://github.com/tc39/ecma402/blob/master/spec/source/datetimeformat.html

API for BCP 47 language tags

(cloned from bugzilla)

The ECMAScript Internationalization API Specification describes a number of abstract operations to process BCP 47 language tags. Some of them might be useful to expose as public API:

IsStructurallyValidLanguageTag
CanonicalizeLanguageTag
CanonicalizeLocaleList
LookupMatcher

Unit formatting API

Unit formatting is similar to number formatting, but it involves one more units.
The goal is to produce strings like "5 MB/s", "10 KB", "35 °C" etc.

I believe the most relevant for the Web audience areas would be:

Digital units
Duration units
Weather units
Length units
Compound units

and maybe:

Energy units
Frequency units

I would suggest that we limit ourselves first to just short version of units and use compound patterns to produce related units (like speed out of length and duration units).

In the future we may leave a window open to add 'variant' option for 'short'/'long' etc.

Proposed API:

var f = Intl.UnitFormat(navigator.languages, {
  units: [
    {type: 'digital', match: 'bestFit'}
  ]
});
f.format(1024); // would return '1 KB' in en-US

var f = Intl.UnitFormat(navigator.languages, {
  units: {
    {type: 'digital', unit: 'bestFit'},
    {type: 'length', unit: 'second'}
  }
});
f.format(1024); // would return '1 KB/s' in en-US

var f = Intl.UnitFormat(navigator.languages, {
  units: {
    {type: 'length', unit: 'bestFit'},
  }
});
f.format(2048); // would return '2 km' in en-US

Open questions

Should that be a separate formatter or should we try to extend NumberFormat to handle units?
How to handle unitless formatting like '5/101' (use case: 5 out of 101 emails unread)
How to define reference unit in which the passed variable is in? Should we hardcode - meter for length, byte for digital? Or should we allow for it to be an option?
We should define some rounding strategy options

9.2.5 ResolveLocale 13.f.ii only considers zero or one 'type' subtag

ResolveLocale cannot accept some valid calendars in a u-ca- extension, because it only considers a key subtag followed by zero or one type subtags. RFC 6067 allows for "zero or more" type subtags to make the keyword.

In the current spec the following calendar identifiers are incorrectly matched:

ethiopic-amete-alem is an alias for ethioaa, but will be matched as ethiopic.
islamic-civil replaces the deprecated islamicc, but will be matched as islamic.
islamic-umalqura, islamic-tbla, and islamic-rgsa will each be matched as islamic.

[Discussion] bidi control characters when formatting dates

Notes:

Edge currently includes "a bunch" of bidi control characters when formatting dates.
CLDR has the bidi data for structured text (STT is still in the proposal status, need more exploration here).
CLDR has direction marks in the date patterns for locales that need it.
Spec says nothing about bidi (probably assumes the previous bullet).
Other browsers are not using bidi structured text for dates explicitly.

Problems:

Formatting dates as regular text does not preserve the structure or direction and as a result the text on the screen becomes incomprehensible.
Users in different geographies are accustomed to different rules for structured text display. (e.g.: Arabic vs. Hebrew).
Isomorphic/Universal apps that will format dates on the server using node/v8 will produce different results than Chakra/Edge. (e.g.: React checksum will fail, which implies a full re-rendering of the initial payload).
Chakra/Edge doing something different causes interop problems when users try to parse the localized output.

Proposals:

Spec the details about STT and direction marks and the use of bidi for dates, and align all implementations with Chakra/Edge.
Spec the optional use of STT and direction marks for dates, get Chakra/Edge to align, get others to add the new option.
Make localized output opaque by ignoring STT rules on dates, and get Chakra/Edge to drop the feature (add a strong statement about how the localized output should be considered opaque).
Spec the use of direction marks for dates, get Chakra/Edge to align (others are already using CLDR which includes the direction marks when needed).

Links:

STT (Structured Text): http://cldr.unicode.org/development/development-process/design-proposals/bidi-handling-of-structured-text

'length' property of DateTime Format Function

The 'length' property of the DateTime Format function returned from get Intl.DateTimeFormat.prototype.format changed from 0 to 1 in ECMA-402, 2nd ed.

In ECMA-402, 1st ed:

a. Let F be a Function object, with internal properties set as specified for built-in functions in ES5, 15, or successor, and the length property set to 0, that takes the argument date and performs the following steps:

In ECMA-402, 2nd ed:

a. Let F be a new built-in function object as defined in 12.3.4.
b. The value of F’s length property is 1.

The change is not listed in Annex B, so I'm not sure if it was intentional.

(V8 and SpiderMonkey still return 0, test262 also tests for 0 (https://github.com/tc39/test262/blob/master/test/intl402/DateTimeFormat/prototype/format/12.3.2_1_a_L15.js). Only JavaScriptCore already returns 1.)

2nd Edition HTML Release Checklist

Review hi-fi doc https://tc39.github.io/ecma402/ to match current pdf doc www.ecma-international.org/ecma-402/2.0/ECMA-402.pdf
Update build steps, re-org files and clean up the repo (caridy)
Produce HTML version using old-toc ecmarkup
Produce PDF from HTML
Tag 2nd edition code for posterity

[proposal] expose internal date pattern resolved per instance

As part of the effort to align with EWM, we want to expose more internals, low level APIs. The resolved pattern for dates is something V8 is already exposing, where it matches the CLDR pattern used to format the date. E.g.:

(new Intl.DateTimeFormat("en-US")).resolved.pattern // should return "M/d/y"

From the user perspective, having access to the CLDR pattern helps to interoperate with other systems and languages, including the DOM. E.g.: specifically the <input/> tag where the pattern can be specified.

Segmentation / Break Iteration

( Placeholder, TBD )

Intl.PluralRules()

UPDATE: This proposal has advanced to stage 1, details here: https://github.com/caridy/intl-plural-rules-spec

cloned from: https://groups.google.com/forum/#!topic/javascript-globalization/3nFDf5al5hU

It would be very helpful for all l10n libraries to get CLDR plural forms into Intl API.

The proposed API could look like this:

var cardinal = new Intl.PluralFormat(‘en’, {style: ‘cardinal’});
console.log(cardinal.format(0)); // “other”
console.log(cardinal.format(1)); // “one”
console.log(cardinal.format(2)); // “other”

var ordinal = new Intl.PluralFormat(‘en’, {style: ‘ordinal’});
console.log(ordinal.format(11)); // “one”
console.log(ordinal.format(22)); // “two”
console.log(ordinal.format(33)); // “few”
console.log(ordinal.format(44)); // “other”

Relative time/dates

There are two major areas where relative time/date formatting could use standardization:

Relative time representation

Good example are timers - representing time like "31 hours, 15 minutes and 23 seconds" as "31:15:23".

pretty relative date/time.
A lot of web apps humanize relative date and time with messages like:

a few seconds ago
less than a minute ago
one hour ago
in one hour
yesterday
last week
in 2 days

All of those patterns are available in CLDR and it would enable web authors to provide better user interfaces if we could incorporate this kind of formatting either into DateTimeFormat or into a separate RelativeDateTimeFormat.

Pseudolocales support

Mozilla and Microsoft use a convention of pseudo-locales to help developers ensure their user interfaces are localizable. Two pseudo-locales we use are:

qps-ploc - slightly wider version of pseudo-english
qps-plocm - slightly wider version of pseudo-RTL-english

You can see an example screenshot of Firefox OS using qps-ploc here: https://bug900182.bmoattachments.org/attachment.cgi?id=8418596

And Microsoft docs here: https://msdn.microsoft.com/en-us/library/windows/desktop/dd319106%28v=vs.85%29.aspx

It would be helpful if Intl API followed the convention and recognized pseudolocales for Numbers and DateTime formatting.

Add support for arbitrary code inside template tags

Why not support any arbitrary code inside template strings expressions?

This is not supported i.e:

`${ if (a == 2) {
    console.log('hi');
} }`

And with more complex code, like if, if-else, for loops, etc..

It would be nice to develop a template engine that uses this native approach instead of a string manipulation.

Expose dayperiod option in DateTimeFormat

Currently, DateTimeFormat is always displaying dayperiod token if hour12 is set. This is not always matching UX design.

We have encountered multiple scenarios in which the hour12 setting and dayperiod displaying are controlled separately, including MacOS Date&Time Settings.

I'd like to get dayperiod as another token for DateTimeFormat options as a boolean flag where 'true' means to display it if hour12 is true, and 'false' means not to display it even when hour12 is true.

10.1.1 InitializeCollator: collator must be in italics

Intl.DurationFormat

Since there's little progress with deciding on what to do with durations, and ICU's API doesn't seem too defined about it, I'd like to start a discussion about possible choices. It seems to me that we have four:

We can add it to NumberFormat
We can add it to UnitFormat
We can create a separate DurationFormat
We can kickstart some RuleBasedNumberFormat

The common use cases are:

Timer "05:12:21"
Stopwatch "12:21.05"
Video/Music player remaining/elapsed time "-04:13"

The formatting differs from time formatter, because the UI will define the range of units to be displayed, like h:m:s, m:s.S and the number of digits per unit (m:ss vs mm:ss or mm:ss.SSS vs mm:ss.SS)

NumberFormat would be convenient for that reason - it would benefit from similar API of defining minimumIntegerDigits and maximumIntegerDigits for each component, and the value itself may be just a number of milliseconds.

Fitting it into UnitFormat seems to be natural fit for CLDR Unit Elements [0] together with compound units, unit sequences and coordinate units, but may be hard and requiring us to bend over backward to accommodate for the options that will be required.

Custom DurationFormat seems to be the easiest choice from the API design perspective, but it increases the number of objects we specify. It's been the choice I made so far for Firefox OS patterns, but I'll be happy to migrate our code once we reach a consensus here.

RuleBasedNumberFormat is what ICU seems to be using now for duration patterns, but it's a pretty big API most similar to Mozilla's proprietary Date.prototype.toLocaleFormat and I'm not sure if we want to go there.

The API I use in DurationFormat looks like this:

var f = Intl.DurationFormat(locales, {
  minUnit: 'second', // millisecond | second | minute | hour
  maxUnit: 'hour' // millisecond | second | minute | hour
});
f.format(i); // 52:34:51

Thougths?

[0] http://www.unicode.org/reports/tr35/tr35-general.html#Unit_Elements

[Compatibility Hazard] ECMA 402 2nd Edition Changed the [[Call]] Behavior of Intl Constructors

The 1st Edition of ECMA 402 specified the [[Call]] behavior for Intl constructors; e.g. Intl.DateTimeFormat.call(this [, locales [, options]]) to return the this context object that was passed-in. In the 2nd Edition, the [[Call]] behavior no longer states that the passed-in context object should be retuned. This change is a potential compatibility hazard.

As the developer and maintainer of the popular FormatJS i18n libraries, I've begun receiving issues from developers testing Chrome Canary (49) that their dates and numbers were failing for format, causing an Error to be thrown.

The high-level framework integration libs that are part of FormatJS — react-intl, ember-intl, handlebars-intl — are used by many web apps, including many of Yahoo's web apps. All these libraries use the underlying intl-format-cache which memoizes the Intl constructors because they are expensive to create. The memoization technique essentially does the following:

function constructIntlInstance(IntlConstructor) {
    return function () {
        var args = Array.prototype.slice.call(arguments);
        var instance = Object.create(IntlConstructor.prototype);
        IntlConstructor.apply(instance, args);
        return instance;
    };
}

Note: That this code depends on the following invariant:

var instance = Object.create(IntlConstructor.prototype);
instance === IntlConstructor.call(instance); // true

It is dependent on the the Intl constructors being .call()-able and the returning the context object passed-in. This .call() behavior for the Intl constructors is supported in all ECMA 402 1st Edition implementations.

After receiving an issue report about this code causing an Error to be thrown Chrome Canary (49), I dug in and found this recent V8 change which updates V8's implementation to match ECMA 402 2nd Edition, thus removing the code that returns the passed-in context object when the Intl constructors are .call()-ed.

Today, I've released [email protected] which changes the memoization implementation to make sure the [[Construct]] behavior always happens by invoking the Intl constructors with new. Essentially doing the following:

function constructIntlInstance(IntlConstructor) {
    return (...args) => new IntlConstructor(...args);
}

In ES5 an equivalent would be:

function constructIntlInstance(IntlConstructor) {
    return function () {
        var args = Array.prototype.slice.call(arguments);
        return new (Function.prototype.bind.apply(IntlConstructor, [null].concat(args)))();
    };
}

While this issue is now "fixed" in intl-format-cache, developers must upgrade their dependencies and re-deploy their apps. I will help to communicate this change, but I'm worried that removing the 1st Edition [[Call]] behavior will break many apps/sites 😞

How should we move forward to prevent end-users from having broken experiences?

Edited based on @rwaldron's feedback.

Numbering/formatting issue in latest spec PDF

Just noticed this in the official 2nd edition doc linked from the Ecma site. The numbers go in the following order, 5-a-b, 1-c, 6:

I think the _1_ here is supposed to be _c_ and the _c_ should be _d_.

[Migrating Ecma-402 to HTML] Section 11 - NumberFormat Objects

https://github.com/tc39/ecma402/blob/master/spec/source/numberformat.html

Expose ability to produce a base form of the word used by Collator

A common use case for Collator is to sort long lists of terms, and a common UX pattern for long lists of words is to group them by letters and potentially show a sidebar letter scrolling.

Collator allows for different levels of sensitivity when comparing, but that doesn't help with grouping. We can sort "żaba" into words starting with "z", but we can't identify that after removing diacritics and reducing it to a base form it is "zaba" (and that's what Collator is using with sensitivity = base).

Could we expose to users?

Create an "npm update-pages" (like ecmascript-asyncawait has)

https://github.com/tc39/ecmascript-asyncawait/blob/master/package.json#L31-L33

First day of the week -- Intl.getCalendarInfo()

It seems like there's no way with the current specifications to know the first day of the week for a given locale.

I understand ISO 8601 does consider Monday as the first working day: maybe this is the reason. However there are some locales (AFAIK Canada, U.S. and Mexico) where Sunday should be shown as first day of the week - eg in calendars or date pickers. Is there any plan to make available this information?

Table 1, 2, 3 "not found" using latest ecmarkup

Errors look like:

[2015-09-29T14:37:25.218Z] Warning: can't find clause, production, note or example with id #table-1
...

Using eg. <emu-clause id="table-1">:

Error: Clause doesn't have header: <emu-clause id="table-1">

But we don't want a header here and tables aren't:

a clause
a note
a production

Using eg.

<emu-example id="table-1">
  <figure>
    <figcaption>Table 1 &mdash; Collator options settable through both extension keys and options properties</figcaption>
    <table class="real-table">
...
    </table>
  </figure>
</emu-example>

Looks more like what we want, but includes a generated <figcaption> that we don't want:

Maybe @bterlson has advice? Maybe we need an <emu-figure>?

<emu-figure id="table-1">
  <figcaption>...</figcaption>
  <table>...</table>
</emu-figure>

Note that <figure id=...> has same error.

Intl.DisplayNames API

There's a lot of data related to Language names, Timezone names, Script names and Region names contained in CLDR that would allow many App Settings to be easier to develop.

Many popular applications contain some combination of "language selector" and "timezone selector" in their UI. We use it in Firefox OS and Firefox desktop, but most of popular webapps like Gmail, Facebook or Twitter do the same.

It would be awesome to tap into CLDR resources and expose the ability to get localized versions of those tokens. We'll want to do this for Firefox OS as part of the 'mozIntl' API and I think it would make sense to standardize it.

Open questions:

We'll have to fallback gracefully if the localized value for the token is not available
We'll need to evaluate the disk space cost of that data. I can see us not wanting to keep Script/Region/Territory data, and only expose Language/Timezone. But it would be smart to design API to allow us to expose more in the future.
Most use cases will involve a loop to retrieve many names (for a drop-down selector), so the function that retrieves the name for a given language has to be fast, but we probably don't want to load all strings into memory. Need to design the spec for balance.

Tracking the new algorithm convention for return-if-abrupt

tc39/ecma262#129

Propose for way to display alternative symbol currency variant for Number.prototype.toLocaleString() and Intl.NumberFormat

Notice the issue in andyearnshaw/Intl.js#84
Actually I have three proposals:

Change "symbol" and "symbol-alt-variant" for "RUB" currency in CLDR. The ₽ symbol is official since 2014 and many people already use it and you can even see it on coins in Russia now.
Add a way to display alternative symbol currency variant for Number.prototype.toLocaleString() and Intl.NumberFormat. There is already currencyDisplay option but it can be only name, symbol or code and there is no way to display alternative symbol.
Add a way to display custom symbol currency variant. This will be useful in situations when some official currency symbol was changed but specification is in progress of adopting these changes. So one can override currency symbol. Also this may be useful for some virtual currencies some websites may use for their specific purposes.

[Spec Bug] Requested options should expand best match options in DateTime Format

Background

From the algo described in "2.6.2 Elements availableFormats, appendItems", it seems that the requested options should have a leading role when formatting the best match pattern. Below is the relevant section:

http://unicode.org/reports/tr35/tr35-dates.html#availableFormats_appendItems

Once a skeleton match is found, the corresponding pattern is used, but with adjustments. Consider the following dateFormatItem:

    <dateFormatItem id="yMMMd">d MMM y</dateFormatItem>

If this is the best match for yMMMMd, pattern is automatically expanded to produce the pattern "d MMMM y" in response to the request. Of course, if the desired behavior is that a request for yMMMMd should produce something other than "d MMMM y", a separate dateFormatItem must be present, for example:

    <dateFormatItem id="yMMMMd">d 'de' MMMM 'de' y</dateFormatItem>

The Problem

402 does not consider the requested options as part of the resolvedOptions, instead, it only uses the required options to find the best match. In algo 12.1.1 InitializeDateTimeFormat (dateTimeFormat, locales, options), in step 26.c.1, Set dateTimeFormat.[[<prop>]] to p., it stores the output option in an internal slot (e.g.: "long" for "{month}") from the format returned from best match algo, the proposal is to overruling that with the value of property "month" from requested options if exist, as described in unicode documentation cited above.

This issue is the major source of problems in Intl.js (Intl polyfill issues: 125, 117, 124, 145 and co.).

Rework spec text in 9.2.8: SupportedLocales (availableLocales, requestedLocales, options) to link to the abstract operations

there isn't a notion of an abstract op as a value, instead we should just call the corresponding abstract directly in the then or the else.

"NewTarget" must not be in italics

For reference, see: http://tc39.github.io/ecma262/#sec-error-message

The incorrect text is here: https://github.com/tc39/ecma402/blob/master/spec/collator.html#L129

Publish an html version of the 2.0 specification

The 1.0 specification is available
http://www.ecma-international.org/ecma-402/1.0/
but there is no html version for 2.0
http://www.ecma-international.org/ecma-402/2.0/

Thanks for putting the current draft (3.0) on GitHub pages 👍
(referencing and deeplinking specs is just nicer with html)

[Migrating Ecma-402 to HTML] Section 5 - Notational Conventions

[Migrating Ecma-402 to HTML] Section 9 - Locale and Parameter Negotiation

https://github.com/tc39/ecma402/blob/master/spec/source/negotiation.html

Add navigator.locales for user preferences

There are already two languages for specifying options to a locale:

A simple locale tag (e.g., "en-US") plus an object with further options ({ tz: "Asia/Jerusalem" })
A BCP 47 string representing the combination of the two (e.g., "en-US-u-tz-jerusalm")

Each has some advantages:

The options object is easier for programmers to read and write, and easier to manipulate one field of.
The BCP47 tag is easier to pass over the network, and it can passed around as a single object for all the possible locale information (TODO: confirm that everything we represent in options objects has a BCP47 subtag)

An idea here: resolvedOptions() shouldn't return just a plain old object, but an instance of a special class whose toString() method returns the BCP47 locale for those options! This way, we get the best of all worlds. The output of resolvedOptions() can also then be used as the first (and only) argument to constructors like Intl.DateTimeFormat.

These Intl.Locale objects can be used generally as the way of representing locale information within ECMAScript. HTML could include an Array of relevant complex Locales (not just simple locales like "en-US") in navigator.locales, which would be usable either as a string or to get these higher-level properties; detailed user preferences could be reflected in the Locale which would be difficult to reflect in the simple navigator.language. Locales would still have own properties for things that resolvedOptions() currently includes as own properties, and would therefore probably be web-compatible as an upgrade. We probably couldn't just replace navigator.language with a Locale for web compatibility reasons, but there's no reason that the string first argument to formatters can't be more flexible (if it isn't already) by continuing to call ToString() and then parsing the BCP47.

(To clarify, this isn't my idea but was something a group of us came up with yesterday.)

@domenic @ericf @caridy Thoughts?

Locale related APIs -- language matching

UPDATE: Latest Proposal - Nov 2015

I started working on a spec and polyfill for Intl.Locale API.

I should have some proposal for this next week. Currently I have five functions on it:

isStructurallyValidLanguageTag(locale)
canonicalizeLocaleList(locales)
resolveLocale(availableLocales, requestedLocales, options, relevantExtensionKeys, localeData)
prioritizeAvailableLocales(availableLocales, requestedLocales, defaultLocale)
getDirection(locale)

The rationale for those five are:

isStructurallyValidTag
This allows testing language tag for sanitation/validation purposes.
canonicalizeLocaleList
This is the core function allowing all language negotiation operations
resolveLocale

this exposes the core function used by all Intl formatters that allows people to write their own custom formatters or polyfills.

prioritizeAvailableLocales

This is basic language negotiation function for all localization libraries. It takes requestedLocales list (navigator.languages or custom), availableLocales (app provided locales), and a fallback defaultLocale (default app locale, not default host env locale) and returns the list of prioritized available locales that best matches user requested locales.

getDirection

Simple function allowing l10n libraries to set html.dir based on their negotiated locale.

@caridy - One question I'd like to get help with is about using Set instead of Array. In all cases where we carry a list of locales, we always want it deduplicated and Set does just that. It would be sweet to use it instead of an Array, but it's API seems to be significantly less user friendly than that of the Array - it's impossible to just retrieve locales[0], it's not easy to map etc.
Should I stick to Array or try to use Set?

Parsing APIs

2020 Update: See #342 for further discussion.

Currently the spec has no APIs for parsing dates or numbers. Supposedly the response used to be "use a datepicker" when asking about date parsing. I'm wondering if that changed.

List formatting API

A common scenario is to have a list of items (Array in JS) which has to be formatted in a given language. CLDR provides all the patterns for that.

Example use cases:

List of participants in a chat
List of days for which an alarm is set
List of affected events in Calendar app

Proposed API:

var f = new Intl.ListFormat(navigator.languages, {
  'type': 'regular', // regular or duration
  'variant': 'regular', // regular, short, narrow
});

f.format(['John', 'Amy', 'Nick']); // "John, Amy and Nick" in en-US

this would also supersede current Array.prototype.toLocaleString with ability to build better localized lists.

I can imagine this not having it's own object and only extending Array.prototype.toLocaleString with options.

Intl.breakIterator

Standardize Intl.v8BreakIterator.

Backpointers:

Update 1 (Sept 26th, 2016):

Proposal from @littledan: https://github.com/littledan/BreakIterator/blob/master/README.md

[Migrating Ecma-402 to HTML] Section 13 - Locale Sensitive Functions of the ECMAScript Language Specification

Grouping for sorting

Collator is enabling users to build locale-aware sorted list of words. Common UI scenario like Contacts list, or Music app list is to group the list by character and then provide a sidebar grouping selector.

I can see two APIs that could facilitate that and I'm not sure which one would be better:

List on input, nested list on output

var input = ['abcd', 'foo', 'γβς', 'أبجدية  عربية'];
var ret = [
  ['a', ['abcd']],
  ['f', ['foo']],
  ['γ', ['γβς']],
  ['أبجدية  عربية']
]

In this scenario, the words are sorted by alphabet, the alphabets are sorted by language preferences, and then you receive a nested list of your terms with grouping characters.

The settings would be similar to collator - sensitivity, maybe max number of groups (which would give a..c on one hand, and aa...ac, ad...ag on the other)

I like it because it's simple and works well for simple scenarios, but I feel it doesn't scale well.

The problem with this input is that it operates on strings and requires the user to later compare the result back to his original database. If there are 10000 song titles, then operating on the result of this API in order to get the song entry that is associated with a given name is going to be very confusing.

One other limitation of this approach is that with very big databases, a common practice is to avoid having to load the whole database into memory. So instead of that, we could combine two different techniques:

an API to provide a list of character groups
an API to provide an indexable version of the strings

In that scenario, Intl API is called when the entry is being added to the database and it receives an indexable version of the string with things like sensitivity taken into account.

Then, another API provides the list of group names [a..z][ا...غ] etc. so that the UI can show options. Then it's on the user and his database to be able to load just entries from a visible/selected range.

I like the latter API because it feels more scalable and leads to better performance.

I'm looking for opinions, prior work or ideas

[Migrating Ecma-402 to HTML] Section 8 - The Intl Object

https://github.com/tc39/ecma402/blob/master/spec/source/intl.html

tc39 / ecma402 Goto Github PK

ecma402's People

Contributors

Stargazers

Watchers

Forkers

ecma402's Issues

Proposal

Open questions

Notes:

Problems:

Proposals:

Links:

Background

The Problem

Recommend Projects

Recommend Topics

Recommend Org