Fix 2-digit date formatting and repeated month symbol for ja/zh #145
…ths were formatted incorrectly, plus the month symbol within Japanese and Chinese date strings was repeated.
- Change the interpretation of CLDR data so that matching a format uses the 'availableFormats skeleton', and formatting an object uses the format string. Previously both tasks used the format string.
- Once a format has been located, allow a requested format to be used if it is similar to the located format, e.g. numbers can be formatted as 2-digit, but short can't. Previously, once the requested format had been used to locate a nearest match, only the format in that string would be used.
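The second point can be illustrated with a minimal sketch. The helper names `isSimilar` and `resolveStyles` are hypothetical and not taken from the polyfill source, and treating only `numeric` and `2-digit` as "similar" is an assumption drawn from the example in the description.

```javascript
// Hypothetical sketch: decide whether a requested style may replace the
// style found in the located format. Not the polyfill's actual code.
const NUMERIC_STYLES = new Set(['numeric', '2-digit']);

// Assumption: two styles are "similar" only when both are numeric forms.
function isSimilar(requested, located) {
  return NUMERIC_STYLES.has(requested) && NUMERIC_STYLES.has(located);
}

// For each field of the located format, keep the requested style when it is
// similar to the located one; otherwise fall back to the located style.
function resolveStyles(requested, located) {
  const resolved = {};
  for (const field of Object.keys(located)) {
    const want = requested[field];
    resolved[field] =
      want !== undefined && isSimilar(want, located[field])
        ? want
        : located[field];
  }
  return resolved;
}
```

With a located format of all-numeric fields, a request for `{ year: '2-digit', month: 'short' }` would keep `2-digit` for the year but leave the month numeric, since `short` is not a numeric style.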
hey @ianhk, thanks a lot for putting some time on this. Few notes:
I expect that the solution comes from the data side, and the data that we generate for internal consumption.
Hi @caridy, Sorry for being cryptic! I had two issues - both of which overlap some open issues. I took a look at the spec - http://unicode.org/reports/tr35/tr35-dates.html#availableFormats_appendItems My interpretation of the spec suggests the Currently the polyfill uses the format string for matching (after a bit of tweaking in expandFormat()) and for formatting. Once located, the format string is used without regard to the desired format specified when matching. The change is in two parts (GitHub makes it look worse than I think it is). src/cldr.js src/core.js I'm not sure if it changes the spec - although I believe this is closer to the desired behaviour. Both my issues are fixed by this change. The double symbol issue is fixed because I request a
Here the format string indicates month should be a number followed by the month symbol. My request for a Hope this makes sense. (And I understand this is a significant change.)
ok, that makes sense. Ideally, we can work out the details by producing the right data structure rather than making any changes in the algo from the spec. The way I see it, we can modify the way we compute the data structure (today I'm using the right-hand side operand from CLDR, IIRC). By changing that we should be good, because the algo computes the best match based on the structure and outputs the formatted token based on pattern and pattern12, which makes it easy to modify the input from CLDR. I will look into the details on Monday.
btw, if this is true, then this was completely my fault, an oversight on my part when interpreting the Unicode information, and we should be able to quickly fix this without touching the 402 spec.
Thanks for your speedy reply. It may not be clear from the diffs, but I haven't touched the format string parsing or matching algorithm. Parsing is now called twice, once for the left-hand (skeleton) side and once for the right-hand (format) side, and places the results into the same rules array. Matching is against the left, but the return value from the match is based on the right. (if any of that makes any sense...)
I wonder why not just match against the left-hand side and pick the pattern from the right, which implies keeping the exact same data structure we use today, just with different values, since we are computing everything based on the right-hand side today.
I think we need parsed options (day/month/year etc formats) from both sides. Left hand for matching, right for formatting. Hence storing both sides in the same rule.
I'm taking a crack at this one, will have something on Monday.
This is a significant change, though it fixes the issues for us in key languages so that the formats we use are the same as native Intl on FF and Chrome (ignoring the comma difference for one language in Chrome). Tests pass.
Fix issues in intl.js polyfill that meant 2-digit years, days and months were formatted incorrectly, plus the month symbol within Japanese and Chinese date strings was repeated.