
Ensure that angle brackets in pyscript tag are escaped before parsing #684

Merged: 3 commits, Aug 16, 2022

Conversation

philippjfr (Contributor) commented Aug 12, 2022

Without escaping angle brackets (< and >) the DOMParser will strip out anything that looks like an HTML tag.

  • Add test

@philippjfr philippjfr added the type: bug Something isn't working label Aug 12, 2022
Co-authored-by: James A. Bednar <jbednar@users.noreply.github.com>
@philippjfr philippjfr requested a review from fpliger August 13, 2022 08:55
@madhur-tandon madhur-tandon merged commit 8275aa2 into main Aug 16, 2022
@madhur-tandon madhur-tandon deleted the angle_bracket_escape branch August 16, 2022 16:11
function escape(str: string): string {
    return str.replace(/</g, "&lt;").replace(/>/g, "&gt;");
}
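
To illustrate what the function above produces, here is a minimal sketch that applies the same replacements to a Python snippet. The name `escapeAngleBrackets` and the sample source are illustrative assumptions, not taken from the PR.

```typescript
// Same replacements as the PR's `escape`, renamed here (hypothetical name)
// so the sketch does not clash with the legacy global `escape` function.
function escapeAngleBrackets(str: string): string {
    return str.replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Hypothetical py-script content containing tag-like text and a comparison.
const pySource = "print('<b>hi</b>') if 1 < 2 else None";
const escaped = escapeAngleBrackets(pySource);

console.log(escaped);
// print('&lt;b&gt;hi&lt;/b&gt;') if 1 &lt; 2 else None
```

When the escaped string is later run through an HTML parser, `&lt;` and `&gt;` are decoded back to `<` and `>`, so the extracted text should match the original source instead of losing the tag-like parts.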

Contributor

I think we should escape more?
I'm not an expert in the field, but a quick googling found this:
https://stackoverflow.com/a/6234804

I guess we should probably escape ', " and & as well?

Contributor Author

I was indeed quite conservative here. However, I think < and > may be special in that they break the parser outright, while the other characters are generally parsed correctly. It might be best to simply write some tests to confirm.
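
A rough sketch of what such a test table could look like. Everything here is an assumption for illustration: `fakeParse` is a crude stand-in for the real parser so the code runs outside a browser (in a browser one would pass a function built on `DOMParser` instead), and `roundTripFailures` and the sample cases are hypothetical names and inputs, not part of the PR.

```typescript
type Transform = (text: string) => string;

// Crude, hypothetical stand-in for an HTML parser's text extraction:
// drops tag-like spans and decodes four named entities in a single pass.
const ENTITIES: Record<string, string> = { lt: "<", gt: ">", amp: "&", quot: '"' };
function fakeParse(html: string): string {
    return html
        .replace(/<\/?[a-zA-Z][^>]*>/g, "")
        .replace(/&(lt|gt|amp|quot);/g, (match: string, name: string) => ENTITIES[name] ?? match);
}

// Same replacements as the PR's `escape` (renamed, hypothetical name).
function escapeAngleBrackets(str: string): string {
    return str.replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Returns the cases whose text does not survive prepare -> parse unchanged.
function roundTripFailures(prepare: Transform, parse: Transform, cases: string[]): string[] {
    return cases.filter((src) => parse(prepare(src)) !== src);
}

const cases = [
    "x = 1 < 2",
    "print('<b>hi</b>')",
    "print(\"it's\")",
    "flags = a & b",
    'print("a &amp; b")', // an entity typed literally in the source
];

const failures = roundTripFailures(escapeAngleBrackets, fakeParse, cases);
console.log(failures);
```

With this stand-in, quotes and a bare `&` survive the round trip, while an entity typed literally in the source does not. Whether the real parser behaves the same way is exactly what such a test would need to confirm.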

Contributor

For example, imagine the following code:

<py-script>
js.console.info("a &amp; b");
</py-script>

I would expect it to print a &amp; b literally, but what it actually prints is a & b.
Trying to print "a &quot b" is even worse: the entity is parsed as a quote character ("), so Python reads "a " b", which results in a Python SyntaxError.

Contributor Author

That's indeed bad. It sounds like we actually have to unescape those HTML entities.

Contributor

Right, for those it's the opposite direction.
By the way, I just checked what JS does:

<script>
    console.info("a &amp; b");
</script>

This prints a &amp; b literally, so we should probably do the same.
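
One possible way to get that behavior, sketched under assumptions: escape `&` as well, and do it first, so that entities typed in the source come back unchanged after the parser decodes them. `escapeForParser` and `decodeEntities` are hypothetical names. `decodeEntities` is only a simplified stand-in for the parser's entity decoding, and the thread does not show whether the PR adopted this approach.

```typescript
// Hypothetical variant of the PR's `escape`: "&" is replaced first so the
// "&" characters introduced by the later replacements are not re-escaped.
function escapeForParser(str: string): string {
    return str
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;");
}

// Simplified stand-in for the parser's entity decoding (single pass).
function decodeEntities(html: string): string {
    const table: Record<string, string> = { lt: "<", gt: ">", amp: "&", quot: '"' };
    return html.replace(/&(lt|gt|amp|quot);/g, (match: string, name: string) => table[name] ?? match);
}

const snippet = 'js.console.info("a &amp; b")';

// Without escaping "&", the entity in the source is decoded and lost.
const lossy = decodeEntities(snippet);

// With "&" escaped first, the source text survives the round trip.
const roundTripped = decodeEntities(escapeForParser(snippet));

console.log(lossy);
console.log(roundTripped === snippet);
```

The order of the replacements matters: escaping `&` after `<` and `>` would turn `&lt;` into `&amp;lt;`, and the original angle brackets would then come back as literal entity text.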

Labels
type: bug Something isn't working
4 participants