* Unicode accents
* Lexer now looks for combining dicritical marks and adds them to the same character
* Parser's `parseSymbol` now recognizes both combined and uncombined forms of Unicode accents, and builds accent objects just like the accent functions
* Added CJK support to math mode (not just text mode)
* Add invalid combining character test
* Add MathML test
* Add weak support for other Latin-1 characters
This maintains backwards compatibility, but it uses the wrong font.
There's a TODO to fix this later.
Also refactor symbol code to use for..of
* Update Unicode screenshot
* Remove dot from accented i and j (in math mode)
Also add dotless Unicode characters to support some accented i's and j's
* Fix \imath, \jmath, \pounds, and more tests
* Switch from for..of to .split().forEach()
Save around 800 bytes in minified code
* Fix split
* normalize() detection
* Convert back to vanilla for loops
* Fix merge
* Move normalize dependency to unicodeMake.js
* Make unicodeSymbols into a lookup table instead of macros
This is important for multi-accented characters.
* Add comments about when to run
* Move symbols definition into unicodeMake/Symbols.js
* Remove CJK support in text mode
* Add missing semicolon
* Refactor unicodeAccents to its own file
* Dotless i/j support in text mode
* Remove excess character mappings
* Fix Åå in math mode (still via Times)
* Update to support #1030
* Add accented Greek letter support (for supported Greek symbols)
* Update screenshot
* remove Æ, æ, Ø, ø, and ß from math mode test
TeX and CSS treat line heights in fundamentally different ways. In
TeX, every character is treated as a box of its precise height and
depth; the line height (\baselineskip) applies after characters have
been assembled into lines. In CSS, in contrast, every character
creates a "line box" corresponding to the accompanying font. When
characters of different fonts and sizes are placed into the same
span, the resulting line box contains the line boxes of all children.
This is unfortunate because, for example, we want `\frac{1}{2}` to
behave in vertical spacing contexts like it is exactly as tall and
deep as the visible fraction (which is the TeX behavior). Given CSS
constraints, though, in most contexts the fraction has extra vertical
space: the line boxes for the numerator and denominator create
padding. For small boxes, this isn't so bad. To really see the
problem put a tall rule in the denominator of a fraction, or check
out the VerticalSpacing screenshotter test, which has way more space
than it should.
Solving this problem in CSS is difficult. There is no easy way to get
rid of the extra line boxes.
But there is *a* way, namely tables. A table-cell with vertical-align
top, bottom, or middle is ignored for the purposes of line height
calculation.
So in this commit, makeVList puts its contents into a
vertical-align:bottom table-cell (to clear unwanted line boxes), with
an extra row used to represent depth.
Many Chrome screenshotter tests change. This is because Chrome rounds
table dimensions to integral numbers of pixels, while it uses
sub-pixel positioning for non-table displayed tabs. That makes many
vlists a fraction of a pixel wider than they used to be.
So `\text{Hi}` becomes one <span...>Hi</span>, rather than two
<span...>H</span><span...>i</span>.
This allows the font renderer to apply kerning, which changes some
test output.
Summary:
This diff provides support for Latin-1, Cyrillic, and CJK characters
inside \text{} groups. For Latin-1 and Cyrillic characters we use
glyph metrics from a glyph from Basic Latin that has roughly the same
bounding box. We use the metrics for a capital 'M' to approximate the
full-width CJK characters. Half-width characters are not supported yet.
Test Plan:
- make test
- make screenshots
Reviewers: emily