Commit Graph

26 Commits

Author SHA1 Message Date
renovate[bot]
16a34c3b35 chore(deps): update dependency flow-bin to v0.134.0 [skip netlify] (#2533)
* chore(deps): update dependency flow-bin to v0.134.0 [skip netlify]

* Fix flow errors

Co-authored-by: Renovate Bot <bot@renovateapp.com>
Co-authored-by: Young Min Kim <mail@ylem.kim>
2020-09-26 12:19:42 +09:00
Ron Kok
2dc8f8121a Support \operatorname* (#1899)
* Support \operatorname*

* Fix lint errors

* Fix YAML

* Update screenshots

* Break out a function to avoid code duplication

* Fix lint errors

* Restore wrapper span

* Update docs

* Reinstall color macros lost in merge

* Update screenshots

* Add type annotations, Move to utils file, add \limits to screenshots

* Fix lint errors

* Rearrange screen shot to fit onto page

* Update screenshots

* tweak location of utils.js and assembleSupSup.js
2019-07-17 21:38:44 -04:00
ylemkimon
3dfd17d9b4 Add catcode to Lexer, move comment parsing back to Lexer (#1789)
* Remove redundant consumeSpaces()

- Spaces after command sequence are ignored in Lexer
- parseExpression consumes spaces in the math mode

* Add catcode to Lexer, move comment parsing back to Lexer

- Fix parsing a comment before a sup/subscript argument
- Fix parsing a comment before an expression
- Fix parsing a comment before or between \hline
- Fix parsing a comment in the macro definition
- Fix parsing a comment including a command sequence

* Update Lexer.js

* Update Parser.js

* catcode -> catcodes
2018-11-24 18:42:14 -05:00
ylemkimon
3907545e2c Add raw string group, move comment parsing to Parser, change URL group parser (#1711)
* Add raw string group

* Move comment parsing to Parser

* Use raw string group in URL group parser

* Update types.js

* Add multi-level nested url test
2018-10-12 21:21:57 -04:00
ylemkimon
fdb155aa97 Build ECMAScript modules (#1479)
* Separate type import statement from module import statement

* Remove extension from import statements

* Build ECMAScript modules

* Add `cross-env` devDependency

* Use `babel-plugin-import-rename` instead of custom plugin

* Improve `.babelrc` style and add comments

* Update README.md

* Change file extension to `.mjs`

Comply with Node.js spec. Use extensionless package:main.

* Enforce only ESM compatible imports

* Dedupe packages

* Add `unicodeMake.js` to overrides:excludedFiles

* Fix .eslintrc merge conflict

* Use rollup to bundle ES module

* Remove `eslint-plugin-import`

* Change build directory to `dist`

* Change build directory to `dist`

* Change build directory

* Move docs from README.md to browser.md

* Update update-sri.js

* Revert update-sri.js

* Revert update-sri.js

* Update .eslintrc

* Remove SSH key testing
2018-08-13 13:06:40 +09:00
Erik Demaine
2202aa774f Comments without terminating newlines, \href fixes, \url support (#1529)
* Comments without terminating newlines in nonstrict mode

Fix #1506 by allowing single-line comments (`%` without terminating newline)
in nonstrict mode.  `Lexer` and `MacroExpander` now store the `Settings`
object, so the `Lexer` can complain about missing newline according to the
`strict` setting.  I filtered this out from the snapshot tests with a slightly
different `replacer`.

* Reimplement \href like \verb, add \url

Major restructuring to lex URL arguments differently, e.g. to support
`\href%{hello}` and `\href{http://foo.com/#test%}{hello}`.  The new URL
parsing code is simpler, but involves a special case in `parseSymbol`
like `\verb`.

Also add support for `\url` while we're here.

* Cleanup

* Fix flow errors and improve error messages

* Add \url to documentation

* Improve doc formatting
2018-07-31 14:13:30 -04:00
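The comment rule from the commit above (a `%` comment that reaches end of input without a newline is accepted only in nonstrict mode) might be sketched as follows. This is a minimal illustration, not the actual `Lexer` code: the function name `skipComment` and the boolean `strict` parameter are assumptions made for the example.

```javascript
// Hypothetical helper, not KaTeX's API.
// Given that input[pos] is "%", return the position just past the comment.
function skipComment(input, pos, strict) {
    const newline = input.indexOf("\n", pos);
    if (newline === -1) {
        // The comment runs to the end of the input without a newline.
        if (strict) {
            throw new Error("% comment has no terminating newline");
        }
        return input.length;
    }
    // Skip the comment text and the newline that terminates it.
    return newline + 1;
}
```

In this sketch the strictness decision lives in the lexing helper itself, which mirrors the commit's point that the lexer needs access to the settings.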
ylemkimon
518379aed5 lexer: Remove match-at dependency, use RegExp (#1447)
* lexer: Remove `match-at` dependency, use RegExp

* chore(package): update flow-bin to version 0.75.0

* Fix flow error

* Remove unused flow libs

* Minor fix

* Throw an error when `RegExp.exec` jumps
2018-06-28 03:13:27 +09:00
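A rough sketch of what "throw an error when `RegExp.exec` jumps" can mean: a global regular expression resumes at `lastIndex`, but it is free to skip ahead to the next place it matches, so the caller has to check `match.index`. The names `lexStrict` and `wordsAndDigits` are hypothetical, and the token pattern is deliberately tiny.

```javascript
// Hypothetical sketch; `pattern` must carry the `g` flag so that
// `lastIndex` controls where `exec` starts searching.
function lexStrict(input, pattern) {
    const tokens = [];
    let pos = 0;
    while (pos < input.length) {
        pattern.lastIndex = pos;
        const match = pattern.exec(input);
        // A global regex may skip unmatched characters and match later
        // in the string. Treat any such jump (or an empty match) as an error.
        if (match === null || match.index !== pos || match[0].length === 0) {
            throw new Error(`Unexpected character at position ${pos}`);
        }
        tokens.push(match[0]);
        pos = pattern.lastIndex;
    }
    return tokens;
}

// Toy pattern: runs of lowercase letters or runs of digits.
const wordsAndDigits = /[a-z]+|[0-9]+/g;
```

With this pattern, `lexStrict("ab12cd", wordsAndDigits)` yields three tokens, while `lexStrict("ab!cd", wordsAndDigits)` throws at position 2 instead of silently skipping the `!`.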
Erik Demaine
1ed99d9ff3 Strict setting controls \newline display-mode behavior; fix MacroExpander space handling (#1314)
* Strict setting controls \newline display-mode behavior

* Bug-fix space handling in macros

Whitespace after a \controlWord is now handled within the lexer, not by the
MacroExpander.  This way, \\ expanding to \newline doesn't accidentally
cause spaces to get consumed.

* Rename nonstrict -> reportNonstrict; strictBehavior -> useStrictBehavior

* Second category of errorCodes
2018-05-16 09:37:41 -04:00
Erik Demaine
484d44ee70 Unicode accents (#992)
* Unicode accents

* Lexer now looks for combining diacritical marks and adds them to the same character
* Parser's `parseSymbol` now recognizes both combined and uncombined forms of Unicode accents, and builds accent objects just like the accent functions
* Added CJK support to math mode (not just text mode)

* Add invalid combining character test

* Add MathML test

* Add weak support for other Latin-1 characters

This maintains backwards compatibility, but it uses the wrong font.
There's a TODO to fix this later.

Also refactor symbol code to use for..of

* Update Unicode screenshot

* Remove dot from accented i and j (in math mode)

Also add dotless Unicode characters to support some accented i's and j's

* Fix \imath, \jmath, \pounds, and more tests

* Switch from for..of to .split().forEach()

Save around 800 bytes in minified code

* Fix split

* normalize() detection

* Convert back to vanilla for loops

* Fix merge

* Move normalize dependency to unicodeMake.js

* Make unicodeSymbols into a lookup table instead of macros

This is important for multi-accented characters.

* Add comments about when to run

* Move symbols definition into unicodeMake/Symbols.js

* Remove CJK support in text mode

* Add missing semicolon

* Refactor unicodeAccents to its own file

* Dotless i/j support in text mode

* Remove excess character mappings

* Fix Åå in math mode (still via Times)

* Update to support #1030

* Add accented Greek letter support (for supported Greek symbols)

* Update screenshot

* remove Æ, æ, Ø, ø, and ß from math mode test
2017-12-28 23:32:45 -07:00
Erik Demaine
3280652bd6 Fix space handling (#912)
Fixes several issues with space handling: (fix #910)
1. "Control symbols" (as they're called in the TeXbook), such as `\\`, should
   not have spaces eaten after them (only "control words" such as `\foo`).
2. In math mode, spaces should be consumed at the parser level, not the
   gullet level.  This enables `\\ [x]` to parse differently from `\\[x]`
3. Eat spaces between arguments, so `\frac x y` still works.
   (This used to work only because math mode ate all spaces.
    The analog in text mode wouldn't have worked.)

Also eat spaces in initial arguments in math mode, and before ^ and _ in atoms.
2017-10-10 10:09:37 -04:00
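The first rule in the commit above (spaces are eaten after control words but not after control symbols) can be illustrated with a small tokenizer. This is a sketch under simplifying assumptions, not the project's lexer: `tokenPattern` and `tokenize` are hypothetical names, and a lone trailing backslash is simply skipped.

```javascript
// Hypothetical tokenizer. Alternatives, in order:
//   control word   (backslash + letters), plus any trailing whitespace
//   control symbol (backslash + one non-letter), no whitespace eaten
//   a run of whitespace
//   any other single character
const tokenPattern = /\\[a-zA-Z]+[ \t\r\n]*|\\[^a-zA-Z]|[ \t\r\n]+|[^\\]/gu;

function tokenize(input) {
    const tokens = [];
    for (const match of input.matchAll(tokenPattern)) {
        const text = match[0];
        if (/^\\[a-zA-Z]/.test(text)) {
            // Control word: drop the whitespace that was eaten with it.
            tokens.push(text.trimEnd());
        } else if (/^[ \t\r\n]+$/.test(text)) {
            // Collapse a whitespace run into a single space token.
            tokens.push(" ");
        } else {
            tokens.push(text);
        }
    }
    return tokens;
}
```

Because the space after `\\` survives as its own token here, a later stage can tell `\\ [x]` apart from `\\[x]`, which is the distinction the second rule relies on.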
Kevin Barabash
eaef0127c5 Add support for comments, fixes #20 (#884) 2017-09-25 21:50:27 -06:00
Ashish Myles
59bed2ad08 Add SourceLocation to encapsulate Token/ParseNode debug information. (#904)
* Add SourceLocation to encapsulate Token/ParseNode debug information.

* Specify concrete Token text type as it captures type mismatches.

* Responded to comments.
2017-09-25 14:29:41 -04:00
Erik Demaine
f10ea4cbeb Implement \verb (#614)
* Implement \verb

* Implement @gagern's comments

* \verb: look up characters one at a time.

* Add screenshot test for \verb

* Add error tests for \verb

* Include space symbol in typewriter font, and fix single quotes

This is based on https://github.com/Khan/MathJax-dev/pull/2
which hasn't been accepted yet at the time this commit is made.

* Add \verb* tests

* \verb should use Typewriter-Regular font!

* Switch \verb to use text mode and no-break space.

* Screenshot update with Typewriter-Regular

* \verb test: fix *, add commas to make spaces clear

* Fix spaces and style handling

* Implement @kevinbarabash's comments

* Make error clearly an assertion failure

* verb screenshot for Chrome
2017-09-21 23:43:05 -04:00
Erik Demaine
6857689946 Advanced macro support and magic \dots (#794)
* Advanced macro support and magic \dots

* Fix \relax behavior

* Use \DOTSB in \iff, \implies, \impliedby

* Add multiple expansion test

* Implement some of @kevinbarabash's comments

* More @kevinbarabash comments

* Token moved from merge

* Add type to defineMacro

* @flow
2017-09-04 20:27:04 -04:00
Ashish Myles
13f3eac741 To @flow: Token, Lexer, ParseError, and ParseNode. (#839)
* To @flow: Token, Lexer, ParseError, and ParseNode.

* PR fixes 1.
2017-09-04 15:27:58 -04:00
Hossein Saniei
a019f36f8a Upgrade the source to use ES6 syntax including classes, import and static properties (#679)
* Add babel transform-class-properties to have static class properties

* Upgrade Lexer and Parser files to use ES6 classes

* Update eslint max line length to 90 characters (more indent because of using ES6 classes)

* Upgrade eslint and jasmine to support ES stage-2 features

* Use static properties to place constants near their functions

* Migrate all remaining sources to ES6 syntax

* Increase eslint max line length to 84

* Remove non-babelified endpoint in dev server.js

* Clean up server.js functions after removing browserified

* Make screenshotter not use the babel endpoint, as we babelify everything now
2017-07-03 08:09:21 -04:00
Martin von Gagern
bd9db332d2 Turn var into const or let 2017-01-13 22:37:17 -05:00
Martin von Gagern
4a9c2acbf7 Add some more symbols (#502)
This adds support for the following input sequences:

    -- --- ` ' `` '' \degree \pounds \maltese

resulting in – — ‘ ’ “ ” ° £ ✠ symbols already present in our fonts.

As part of this modification, the recognition of multiple dashes was moved
from the lexer to the parser.
This is necessary since in math mode a sequence of hyphens is just a
sequence of minus signs, just as a pair of apostrophes in math mode is a
double prime, not a right double quotation mark.
To make this easier, parseGroup and parseOptionalGroup have been merged.
2016-07-24 19:56:31 -07:00
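Why dash recognition belongs in the parser rather than the lexer can be shown with a mode-aware pass over already-lexed tokens. The function `combineDashes` below is a hypothetical illustration, not the code from the commit above; it assumes each hyphen arrives as its own `"-"` token.

```javascript
// Hypothetical sketch: merge hyphen runs only in text mode.
function combineDashes(tokens, mode) {
    if (mode === "math") {
        // In math mode every hyphen stays a separate minus sign.
        return tokens.slice();
    }
    const out = [];
    let i = 0;
    while (i < tokens.length) {
        if (tokens[i] === "-" && tokens[i + 1] === "-" && tokens[i + 2] === "-") {
            out.push("\u2014"); // --- becomes an em dash
            i += 3;
        } else if (tokens[i] === "-" && tokens[i + 1] === "-") {
            out.push("\u2013"); // -- becomes an en dash
            i += 2;
        } else {
            out.push(tokens[i]);
            i += 1;
        }
    }
    return out;
}
```

The lexer does not know the mode, so it cannot make this choice; a stage that does know the mode can.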
Martin von Gagern
8c55aed39a Allow macro definitions in settings (#493)
* Introduce MacroExpander

The job of the MacroExpander is turning a stream of possibly expandable
tokens, as obtained from the Lexer, into a stream of non-expandable tokens
(in KaTeX, even though they may well be expandable in TeX) which can be
processed by the Parser.  The challenge here is that we don't have
mode-specific lexer implementations any more, so we need to do everything on
the token level, including reassembly of sizes and colors.

* Make macros available in development server

Now one can specify macro definitions like \foo=bar as part of the query
string and use these macros in the formula being typeset.

* Add tests for macro expansions

* Handle end of input in special groups

This avoids an infinite loop if input ends prematurely.

* Simplify parseSpecialGroup

The parseSpecialGroup method now returns a single token spanning the whole
special group, and leaves matching that string against a suitable regular
expression to whoever is calling the method.  Suggested by @cbreeden.

* Incorporate review suggestions

Add improvements suggested by Kevin Barabash during review.

* Input range sanity checks

Ensure that both tokens of a token range come from the same lexer,
and that the range has a non-negative length.

* Improved wording of two comments
2016-07-08 12:24:31 -07:00
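The MacroExpander idea described in the commit above, turning a stream of possibly expandable tokens into a stream of non-expandable ones, might look roughly like this. `TinyMacroExpander` is a hypothetical, heavily simplified class: tokens are plain strings, macros take no arguments, and the expansion budget of 1000 is an arbitrary assumption.

```javascript
// Hypothetical sketch, not the real MacroExpander.
class TinyMacroExpander {
    // tokens: array of strings; macros: Map from name to array of strings.
    constructor(tokens, macros) {
        // Store reversed so that pop() yields the next token in order.
        this.stack = tokens.slice().reverse();
        this.macros = macros;
    }

    // Return the next non-expandable token, or "EOF" at end of input.
    nextToken() {
        let budget = 1000; // guards against endlessly recursive macros
        while (this.stack.length > 0) {
            const token = this.stack.pop();
            if (!this.macros.has(token)) {
                return token;
            }
            budget -= 1;
            if (budget < 0) {
                throw new Error("Too many macro expansions");
            }
            // Push the expansion back so it is itself expanded next.
            const expansion = this.macros.get(token);
            for (let i = expansion.length - 1; i >= 0; i--) {
                this.stack.push(expansion[i]);
            }
        }
        return "EOF";
    }
}
```

Pushing the expansion back onto the stack, instead of returning it directly, is what lets a macro expand to further macros.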
Kevin Barabash
14a58adb90 Migrate to eslint
Summary
We'd like contributors to use the same linter and lint rules that we use
internally.  This diff swaps out jshint for eslint and fixes all lint failures
except for the max-len failures in the test suites.

Test Plan:
- ka-lint src
- make lint
- make test

Reviewers: emily
2015-12-01 10:02:08 -08:00
Martin von Gagern
d423bec089 Rewrote lexer, avoiding some mode-specific distinctions
There are two main motivations for this commit.  One is unicode input, which
requires unicode characters to get past the lexer.  See discussion in #261.
The second is in preparation for #266, where we'd deal with one token of
look-ahead but might be lexing that token in an unknown mode in some cases.
The unit test shipped with this commit addresses the latter concern, since
it checks that a math-mode-only token may immediately follow some text mode
content group.

In this new implementation, all the various things that could get matched
have been collected into a single regular expression.  The hope is that
this will be beneficial for performance and keep the code simpler.
The code was written with Unicode input in mind, including non-BMP codepoints.

The role of the lexer as a gate keeper, keeping out invalid TeX syntax, has
been abandoned.  That role is still fulfilled by the symbols and functions
tables, though, since any input which is neither a symbol nor a command is
still considered invalid input, even though it lexes successfully.
2015-10-02 20:06:03 +02:00
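The two design points in the commit above (one combined regular expression, and validity decided by lookup tables instead of by the lexer) can be sketched as below. The pattern, the `lex` and `findInvalid` functions, and the three-entry `knownSymbols` table are all assumptions made for illustration; the real tables are far larger.

```javascript
// Hypothetical combined pattern. Alternatives, in order:
//   whitespace run | control sequence | surrogate pair (non-BMP) | any char
const tokenRegex =
    /[ \t\r\n]+|\\(?:[a-zA-Z]+|[\s\S])|[\uD800-\uDBFF][\uDC00-\uDFFF]|[\s\S]/g;

// The lexer accepts everything; it never rejects input.
function lex(input) {
    return input.match(tokenRegex) || [];
}

// Stand-in for the symbols and functions tables.
const knownSymbols = new Set(["x", "+", "\\alpha"]);

// Validity is checked afterwards, against the table.
function findInvalid(tokens) {
    return tokens.filter((t) => t.trim() !== "" && !knownSymbols.has(t));
}
```

Here `lex("x#")` succeeds and returns two tokens, and only `findInvalid` reports `#` as unknown. A non-BMP character such as U+1D49C comes out as a single token because the surrogate-pair alternative is tried before the single-character fallback.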
Martin von Gagern
2f7a54877a Implement environments, for arrays and matrices in particular
This commit introduces environments, and implements the parser
infrastructure to handle them, even including arguments after the
“\begin{name}” construct.  It also offers a way to turn array-like data
structures, i.e. delimited by “&” and “\\”, into nested arrays of groups.
Environments are essentially functions which call back to the parser to
parse their body.  It is their responsibility to stop at the next “\end”,
while the parser takes care of verifying that the names match between
“\begin” and “\end”.  The environment has to return a ParseResult, to
provide the position that goes with the resulting node.

One application of this is the “array” environment.  So far, it supports
column alignment, but no column separators, and no multi-column shorthands
using “*{…}”.  Building on the same infrastructure, there are “matrix”,
“pmatrix”, “bmatrix”, “vmatrix” and “Vmatrix” environments.  Internally
these are just “\left..\right” wrapped around an array with no margins at
its ends.  Spacing for arrays and matrices was derived from the LaTeX
sources, and comments indicate the appropriate references.

Now we have hard-wired breaks in parseExpression, to always break on “}”,
“\end”, “\right”, “&”, “\\” and “\cr”.  This means that these symbols are
never PART of an expression, at least not without some nesting.  They may
follow AFTER an expression, and the caller of parseExpression should be
expecting them.  The implicit groups for sizing or styling don't care what
ended the expression, which is all right for them.  We still have support
for breakOnToken, but now it is only used for “]” since that MAY be used to
terminate an optional argument, but otherwise it's an ordinary symbol.
2015-06-18 22:24:40 +02:00
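The break rule for parseExpression described in the commit above could be sketched like this. It is a simplified stand-in that treats tokens as plain strings and collects them without building parse nodes; the set name `endOfExpression` and the returned object shape are assumptions for the example.

```javascript
// Tokens that always end an expression (hypothetical constant name).
const endOfExpression = new Set(["}", "\\end", "\\right", "&", "\\\\", "\\cr"]);

// Collect tokens from `start` until a hard-wired break token or the
// optional `breakOnToken` (e.g. "]" for optional arguments) is seen.
function parseExpression(tokens, start, breakOnToken) {
    const body = [];
    let position = start;
    while (position < tokens.length) {
        const token = tokens[position];
        if (endOfExpression.has(token) || token === breakOnToken) {
            break; // the caller is expected to consume this token
        }
        body.push(token);
        position += 1;
    }
    return {body, position};
}
```

Without `breakOnToken`, a `]` is collected like any ordinary symbol; passing `"]"` makes it a terminator only where an optional argument is being parsed.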
Ben Alpert
0f6530096b Don't slice in lexer
Summary: Theoretically this allocates way less. In practice it seems to be exactly the same speed.

Test Plan: make test

Reviewers: emily

Reviewed By: emily

Differential Revision: https://phabricator.khanacademy.org/D16621
2015-04-06 10:39:39 -07:00
Jmeas
fec04614b8 Adds JSHint to the build system and tidies up code. 2014-10-01 21:28:46 -04:00
Emily Eisenberg
def1a47935 Add optional arguments
Summary:
Add correct parsing of optional arguments. Now, things like `\rule` can shift
based on its argument, and parsing of `\sqrt[3]{x}` fails (correctly) because we
don't support that yet.

Also, cleaned up the lexing code a bit. There was a vestige of the old types in
the lexer (they have now been completely moved to symbols.js). As a byproduct,
this made it hard to call `expect("]")`, because it would look at the type of
the Token and the type for "]" was "close". Now, all functions just look at the
text of the parsed token, and in special occasions (like in the dimension lexer)
it can return some data along with it.

Test Plan:
 - Make sure tests still work, and new tests work
 - Make sure no huxley screenshots changed
 - Make EXTRA SURE `\sqrt[3]{x}` fails.

Reviewers: alpert

Reviewed By: alpert

Differential Revision: http://phabricator.khanacademy.org/D13505
2014-10-01 14:20:47 -07:00
Emily Eisenberg
35d9d972fd Move js files into src/
Test plan:
- Make sure huxley tests, jasmine tests, make build, make metrics, make test all
  still work.

Auditors: alpert
2014-09-15 02:50:34 -07:00