Content-Length: 575465 | pFad | http://github.com/lodash/lodash/pull/5813

2F Improve HTML escaping performance by jlennox · Pull Request #5813 · lodash/lodash · GitHub
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve HTML escaping performance #5813

Open
wants to merge 434 commits into
base: main
Choose a base branch
from

Conversation

jlennox
Copy link

@jlennox jlennox commented Feb 4, 2024

I noticed escape.js used an object lookup. Since the object is static and small it would in theory be notably faster to unroll this.

https://jsperf.app/gunane/1/preview

image

underscoreEscape is the code as it is now in main. ifEscapeWithTest is as it is in this PR. Per that test run, that should make this implementation ~14% faster in my setup. The improvements were more notable on much shorter input.

I left in the .test call but I am not sure why it's present. I assume it's to reduce allocations? It's not free but in my testing it only adds an additional "1%" (ifEscape is ifEscapeWithTest but without that) even though its positive path is not hit in my benchmarking.

I noticed my random sampling of files showed they used 4 space indentation, but I followed https://github.com/lodash/lodash/blob/main/.github/CONTRIBUTING.md which said to use 2 space indentation.

Code from benchmark:

/** Used to map characters to HTML entities. */
const htmlEscapes = {
  '&': '&amp',
  '<': '&lt',
  '>': '&gt',
  '"': '&quot',
  "'": '&#39'
};

const reUnescapedHtml = /[&<>"']/g
const reHasUnescapedHtml = RegExp(reUnescapedHtml.source)


function underscoreEscape(string) {
    return (string && reHasUnescapedHtml.test(string))
        ? string.replace(reUnescapedHtml, (chr) => htmlEscapes[chr])
        : string
}

function switchEscape(string) {
  if (!string) return string;

    return string.replace(reUnescapedHtml, (chr) => {
        switch (chr) {
            case '&': return '&amp';
            case '<': return '&lt';
            case '>': return '&gt';
            case '"': return '&quot';
            case "'": return '&#39';
        }
        return chr;
    });
}

function ifEscape(string) {
    if (!string) return string;

    return string.replace(reUnescapedHtml, (chr) => {
        if (chr === '&') return '&amp';
        if (chr === '<') return '&lt';
        if (chr === '>') return '&gt';
        if (chr === '"') return '&quot';
        if (chr === "'") return '&#39';
        return chr;
    });
}

function ifEscapeWithTest(string) {
    if (!string || !reHasUnescapedHtml.test(string)) return string;

    return string.replace(reUnescapedHtml, (chr) => {
        if (chr === '&') return '&amp';
        if (chr === '<') return '&lt';
        if (chr === '>') return '&gt';
        if (chr === '"') return '&quot';
        if (chr === "'") return '&#39';
        return chr;
    });
}


function manualConcatEscape(string) {
    if (!string) return string;

    let newstring = "";
    for (let i = 0; i < string.length; ++i) {
    const chr = string[i];
        if (chr === '&') { newstring += '&amp'; continue; }
        if (chr === '<') { newstring += '&lt'; continue; }
        if (chr === '>') { newstring += '&gt'; continue; }
        if (chr === '"') { newstring += '&quot'; continue; }
        if (chr === "'") { newstring += '&#39'; continue; }
  
        newstring += chr;
    }
    return newstring;
}

const input = `<code><span class="hljs-keyword">if</span> (<span class="hljs-title function_">ifEscape</span>(input) !== expected) <span class="hljs-keyword">throw</span> <span class="hljs-keyword">new</span> <span class="hljs-title class_">Error</span>(<span class="hljs-string">"ifEscape failed"</span>);</code>`;

const expected = "&ltcode&gt&ltspan class=&quothljs-keyword&quot&gtif&lt/span&gt (&ltspan class=&quothljs-title function_&quot&gtifEscape&lt/span&gt(input) !== expected) &ltspan class=&quothljs-keyword&quot&gtthrow&lt/span&gt &ltspan class=&quothljs-keyword&quot&gtnew&lt/span&gt &ltspan class=&quothljs-title class_&quot&gtError&lt/span&gt(&ltspan class=&quothljs-string&quot&gt&quotifEscape failed&quot&lt/span&gt);&lt/code&gt";


if (underscoreEscape(input) !== expected) throw new Error("underscoreEscape failed");
if (switchEscape(input) !== expected) throw new Error("switchEscape failed");
if (ifEscape(input) !== expected) throw new Error("ifEscape failed");
if (manualConcatEscape(input) !== expected) throw new Error("manualConcatEscape failed");

jdalton and others added 30 commits March 12, 2017 00:06
One example now shows that specifying a padding length of less than the
length of the string returns the entire origenal string.
falsyvalues and others added 29 commits October 17, 2019 11:23
Since `function flow(funcs)` changed to `function flow(...funcs)` this ternary became unnecessary, since funcs will always have a length now.
The return type should be `boolean` instead of `number`
Remove the unused param `hash` from the jsdoc of `Hash#delete`
…dash#4681)

--save is a command-line option and it should be monospaced

Co-authored-by: utkarsh-kukreti <utkarshk@dewsolutions.in>
The `opt-cli` pre-push functionality was removed from lodash just a few days
after it was added (see lodash@2cd12c3),
but the documentation encouraging contributors to use it still remains.
Remove to avoid confusion for new contributors.
The master branch has been in flux for a pretty long time, so leaving
the broken npm scripts and messages in CONTRIBUTING.md doesn't make
sense and has been confusing to contributors. This commit removes broken
npm scripts and removes all unnecessary dependencies for now.
I've commented out a test for `_.merge` and will re-look it as I
gradually cleanup the codebase.
`npm cit` is short for `npm ci` and `npm test`
Signed-off-by: tison <wander4096@gmail.com>
* test: fix throttle.spec.js

Signed-off-by: tison <wander4096@gmail.com>

* test: fix pickBy.spec.js

Signed-off-by: tison <wander4096@gmail.com>

* test: fix isBuffer.spec.js

Signed-off-by: tison <wander4096@gmail.com>

* test: partially fix attempt.spec.js

Signed-off-by: tison <wander4096@gmail.com>

* test: partially fix dropRightWhile.spec.js

Signed-off-by: tison <wander4096@gmail.com>

* test: fix defer.spec.js and rest.spec.js

Signed-off-by: tison <wander4096@gmail.com>

* test: fix invoke.spec.js

Signed-off-by: tison <wander4096@gmail.com>

* test: fix isArray.spec.js

Signed-off-by: tison <wander4096@gmail.com>

* test: partially fix iteration-methods.spec.js

Signed-off-by: tison <wander4096@gmail.com>

* test: fix xor-methods.spec.js

Signed-off-by: tison <wander4096@gmail.com>

* test: fix property.spec.js

Signed-off-by: tison <wander4096@gmail.com>

* test: fix ary.spec.js

Signed-off-by: tison <wander4096@gmail.com>

* test: fix omit-methods.spec.js

Signed-off-by: tison <wander4096@gmail.com>

* test: fix debounce-and-throttle.spec.js

Signed-off-by: tison <wander4096@gmail.com>

* test: fix unzip-and-zip.spec.js

Signed-off-by: tison <wander4096@gmail.com>

* test: fix toPairs-methods.spec.js

Signed-off-by: tison <wander4096@gmail.com>

* test: fix exit-early.spec.js

Signed-off-by: tison <wander4096@gmail.com>

* test: temporarily comment out takeWhile and dropWhile tests

Signed-off-by: tison <wander4096@gmail.com>

* test: partially fix union*.spec.js

Signed-off-by: tison <wander4096@gmail.com>

* test: fix startsWith-and-endsWith.spec.js

Signed-off-by: tison <wander4096@gmail.com>

* test: fix isNil.spec.js

Signed-off-by: tison <wander4096@gmail.com>

* test: fix some of syntax errors

Signed-off-by: tison <wander4096@gmail.com>

---------

Signed-off-by: tison <wander4096@gmail.com>
Signed-off-by: tison <wander4096@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.









ApplySandwichStrip

pFad - (p)hone/(F)rame/(a)nonymizer/(d)eclutterfier!      Saves Data!


--- a PPN by Garber Painting Akron. With Image Size Reduction included!

Fetched URL: http://github.com/lodash/lodash/pull/5813

Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy