lolight - tokenizer and syntax highlighter

Lightweight tokenizer and syntax highlighter, less than 3kB of minified code. No language specific syntax support, just a CSS stylable breakdown into tokens. Default styles included.

This is an awesome piece of code in my opinion. Instead of adding support for whole languages, lolight is basically one big regular expression that finds common words in code like def, class and end and applies CSS classes to them. Apart from that, a few other expressions find comments, numbers and strings, and that’s it.
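To make the idea concrete, here is a minimal sketch of that approach. It is not lolight’s source: the tiny keyword list, the token expressions, the `highlight` function and the class names (`kwd`, `str`, `num`, `cmt`) are all made up for illustration.

```javascript
// Hypothetical mini-highlighter in the spirit of lolight, NOT its actual code.
// Tiny stand-in keyword list; the real library uses a far longer pattern.
const KEYWORD = /^(class|def|end|function|if|else|return|var)$/;

// One alternative per token kind, tried in this order:
// 1 comment, 2 string, 3 number, 4 word, 5 any other single character.
const TOKEN = /(\/\/.*|#.*)|("(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')|(\b\d+(?:\.\d+)?\b)|([A-Za-z_]\w*)|([\s\S])/g;

// Escape characters that would otherwise be interpreted as markup.
function escapeHtml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Wrap each recognised token in a span; the class names are assumptions.
function highlight(code) {
  return code.replace(TOKEN, (match, comment, string, number, word) => {
    const text = escapeHtml(match);
    if (comment) return `<span class="cmt">${text}</span>`;
    if (string) return `<span class="str">${text}</span>`;
    if (number) return `<span class="num">${text}</span>`;
    if (word && KEYWORD.test(word)) return `<span class="kwd">${text}</span>`;
    return text; // plain identifiers, punctuation and whitespace stay unstyled
  });
}

console.log(highlight("def x = 42 # hi"));
// <span class="kwd">def</span> x = <span class="num">42</span> <span class="cmt"># hi</span>
```

Nothing here knows which language it is looking at, which is the whole trick: a word is highlighted because it looks like a keyword in some language, not because a grammar says so.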

This is lolight’s keyword regex, which I find fascinating. It’s a bit like code golf:

^(a(bstract|lias|nd|rguments|rray|s(m|sert)?|uto)|b(ase|egin|ool(ean)?|reak|yte)|c(ase|atch|har|hecked|lass|lone|ompl|onst|ontinue)|de(bugger|cimal|clare|f(ault|er|p)?|init|l(egate|ete)?)|dim|do|double|e(cho|ls?if|lse(if)?|nd|nsure|num|vent|x(cept|ec|p(licit|ort)|te(nds|nsion|rn)))|f(allthrough|alse|inal(ly)?|ixed|loat|or(each)?|riend|rom|unc(tion)?)|global|goto|guard|i(f|mp(lements|licit|ort)|n(it|clude(_once)?|line|out|stanceof|t(erface|ernal)?)?|s)|l(ambda|et|ock|ong)|m(odule|utable)|NaN|n(amespace|ative|ext|ew|il|ot|ull)|o(bject|perator|r|ut|verride)|p(ackage|arams|rivate|rotected|rotocol|ublic)|r(aise|e(adonly|do|f|gister|peat|quire(_once)?|scue|strict|try|turn))|s(byte|ealed|el(f|ect)|hort|igned|izeof|tatic|tring|truct|ubscript|uper|ynchronized|witch)|t(emplate|hen|his|hrows?|ransient|rue|ry|ype(alias|def|id|name|of))|u(n(checked|def(ined)?|ion|less|signed|til)|se|sing)|v(ar|irtual|oid|olatile)|w(char_t|hen|here|hile|ith)|xor|yield)$
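The golfing comes from factoring out shared prefixes: `de(f(ault|er|p)?|l(egate|ete)?)` covers def, default, defer, defp, del, delegate and delete in one branch. As a rough illustration, the snippet below copies just the c, de and e branches from the regex above into a JavaScript pattern and tests a few words. The names `KEYWORD_EXCERPT` and `isKeyword` are made up for this example.

```javascript
// Excerpt only: the c…, de… and e… branches of the regex above, wrapped in ^(...)$.
// The full pattern continues the same way for the remaining letters.
const KEYWORD_EXCERPT = /^(c(ase|atch|har|hecked|lass|lone|ompl|onst|ontinue)|de(bugger|cimal|clare|f(ault|er|p)?|init|l(egate|ete)?)|e(cho|ls?if|lse(if)?|nd|nsure|num|vent|x(cept|ec|p(licit|ort)|te(nds|nsion|rn))))$/;

// Hypothetical helper: true when the whole word is matched by the excerpt.
function isKeyword(word) {
  return KEYWORD_EXCERPT.test(word);
}

// The anchors ^ and $ mean the entire word must match, not just a prefix.
for (const word of ["class", "def", "default", "elif", "elsif", "elseif", "classes", "ending"]) {
  console.log(word, isKeyword(word));
}
```

The first six words match and the last two do not, because `classes` and `ending` only start with a keyword and the trailing `$` rejects the extra letters.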

It’s not perfect: one code block made it “break out” of the block and highlight the entire HTML of my page, all the way down to the closing html tag. But it is novel, interesting and small. I’m glad I found it.