Ensure that angle brackets in pyscript tag are escaped before parsing #684
Co-authored-by: James A. Bednar <jbednar@users.noreply.github.com>
function escape(str: string): string {
    return str.replace(/</g, "&lt;").replace(/>/g, "&gt;")
}
I think we should escape more?
I'm not an expert in the field, but a quick search turned up this:
https://stackoverflow.com/a/6234804
I guess we should probably escape ', " and & as well?
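For reference, a minimal sketch of what escaping all five characters could look like. The names `escapeHtml` and `HTML_ESCAPES` are hypothetical and not part of this PR, and whether the extra characters need escaping here at all is exactly the open question in this thread.

```typescript
// Hypothetical helper, not the PR's code: maps each character that is
// significant in HTML text or attribute values to its entity.
const HTML_ESCAPES: Record<string, string> = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
};

function escapeHtml(str: string): string {
    // One pass over the input, so the "&" introduced by an entity
    // is never escaped a second time.
    return str.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}
```

Doing the replacement in a single pass avoids the ordering pitfall of chained `.replace()` calls, where `&` would have to be handled first.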
I was indeed quite conservative here. However, I think < and > may be special in that they break the parser outright, while the others are generally parsed correctly. It might be best to simply write some tests to confirm.
For example, imagine the following code:
<py-script>
js.console.info("a &amp; b");
</py-script>
I would expect it to print a &amp; b literally; what it actually prints is a & b.
Trying to print "a &quot; b" is even worse, because &quot; is parsed as a quote ("), so Python reads "a " b", which results in a Python SyntaxError.
That's indeed bad, sounds like we actually have to unescape those HTML entities.
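A rough sketch of what unescaping a fixed set of entities might look like. `unescapeHtml` and `HTML_ENTITIES` are hypothetical names, the entity list is an assumption, and the thread does not settle whether or where such a step belongs.

```typescript
// Hypothetical helper, not the PR's code: maps a small, fixed set of
// entities back to the characters they stand for.
const HTML_ENTITIES: Record<string, string> = {
    "&lt;": "<",
    "&gt;": ">",
    "&quot;": '"',
    "&#39;": "'",
    "&amp;": "&",
};

function unescapeHtml(str: string): string {
    // One pass over the input, so "&amp;lt;" becomes "&lt;" rather than "<".
    return str.replace(
        /&(?:lt|gt|quot|#39|amp);/g,
        (entity) => HTML_ENTITIES[entity] ?? entity,
    );
}
```

Any entity outside the list is left untouched, so the function only reverses an escaping step that used the same five entities.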
Uhm, right, for those it's the opposite direction.
Btw, I just checked what JS does:
<script>
console.info("a &amp; b");
</script>
prints a &amp; b, so we should probably do the same.
Without escaping angle brackets (< and >), the DOMParser will strip out anything that looks like an HTML tag.
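To illustrate the effect, here is a small sketch that runs without a browser. `stripTagLike` is a hypothetical, regex-based stand-in for what happens to tag-like text during parsing; it is an approximation for illustration only and not how DOMParser works internally.

```typescript
// The helper from this PR: replace angle brackets with entities.
function escape(str: string): string {
    return str.replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Hypothetical stand-in for the parser's effect on text content:
// anything of the form <...> is treated as a tag and dropped.
function stripTagLike(str: string): string {
    return str.replace(/<[^>]*>/g, "");
}

const source = "if a<b and c>d: print(a)";

// Without escaping, "<b and c>" looks like a tag and disappears.
const withoutEscaping = stripTagLike(source);

// With escaping, no tag-like text is left for the stand-in to remove.
const withEscaping = stripTagLike(escape(source));

console.log(withoutEscaping); // if ad: print(a)
console.log(withEscaping);    // if a&lt;b and c&gt;d: print(a)
```

The escaped text still contains entities at this point, so a later step has to turn them back into < and > before the code reaches Python.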