The main case I can think of is wanting some forum functionality. Perhaps you want to allow your users to be able to write in markdown. This would provide an extra layer of protection as you could take the HTML generated from the markdown and further lock it down to only an allowed set of elements like `h1`. Just in case someone tried some of the markdown escape hatches that you didn't expect.
I think this might be the answer. There's no point to it by itself (either you separate data and code or you don't and let the user do anything to your page), but if you're already using a sanitiser and you can't use `textContent` because (such as with Markdown) there'll be HTML tags in the output, then this could be extra hardening. Thanks!
You still need innerHTML when you want to inject HTML tags in the page, and you could already use innerText when you didn't want to.
Having something in between is seriously useless.
What makes you say this?
If you mean to convey that it's possible to configure it to filter properly, let me introduce you to `textContent` which is older than Firefox (I'm struggling to find a date it's so old)
How would I set a header level using textContent?
document.createElement("h1").textContent = `Hello, ${username}!`
If you allow <h1> in the setHTML configuration or use the default, users with the tag in their username also always get it rendered as markupIf that's true, seems like it's still a security risk given what you can do with CSS these days: https://news.ycombinator.com/item?id=47132102
Or I guess you could completely restyle and change the text of UI elements so it looks like the user is doing one thing when they're actually doing something completely different like sending you money
.setHTML("<h1>Hello</h1>", new Sanitizer({}))
will strip all elements out. That's not too difficult.Plus this is defense-in-depth. Backends will still need to sanitize usernames on some standard anyhow (there's not a lot of systems out there that should take arbitrary Unicode input as usernames), and backends SHOULD (in the RFC sense [1]) still HTML-escape anything they output that they don't want to be raw HTML.
new Sanitizer({})
This Sanitizer will allow everything by default, but setHTML will still block elements/attributes that can lead to XSS.You might want something like:
new Sanitizer({ replaceWithChildrenElements: ["h1"], elements: [], attributes: [] })
This will replace <h1> elements with their children (i.e. text in this case), but disallow all other elements and attributes.How exactly, given that setHTML sanitizes the input? If you don't want to have any HTML tags allowed, seems you can configure that already? https://wicg.github.io/sanitizer-api/#built-in-safe-default-...
The article says that the output is:
<h1>Hello my name is</h1>
So it keeps (non-script) html tags (and presumably also attributes) in the input. Idk how you're asking "how" since it's the default behaviorStripping HTML tags completely has always been possible with the drop-in replacement `textContent`. Making a custom configuration object for that is much more roundabout
I can see how it's a way of allowing some tags like bold and italic without needing a library or some custom parser, but I didn't understand what the point of this default could be and so why it exists (a sibling comment proposed a plausible answer: hardening on top of another solution)
> Yes, because that's the default configuration, if you don't want that, stop using the default configuration?
"don't use it if it's not what you want" is perhaps the silliest possible answer to the question "what's the use-case for this"
Maybe you meant .innerHTML? .innerText AFAIK doesn't try to parse HTML (why would it?), but I don't understand what you mean with nonstandard, both .innerHTML and .innerText are part of the standards, and I think they've been for a long time.
> but I didn't understand what the point of this default could be and so why it exists (a sibling comment proposed a plausible answer: hardening on top of another solution) [...] the question "what's the use-case for this"
I guess maybe third time could be the charm: it's for preventing XSS holes that are very common when people use .innerHTML
That information is in the question, so sadly no this still doesn't make sense to me because I don't understand any scenario in which this is what the developer wants. You always still need more code (to filter the right tags) or can just use textContent (separating data and code completely, imo the recommended solution)
> Maybe you meant .innerHTML? .innerText AFAIK doesn't try to parse HTML (why would it?)
No, I didn't mean that, yes it does, and no I don't know why it is this way. If you don't believe me and don't want to check it out for yourself, I'm not sure what more I can say
Client-side includes.
The default might be suitable for something like an internal blog where you want to allow people to sometimes go crazy with `<style>` tags etc, just not inject scripts, but I would expect it to almost always make sense to define a specific allowed tag and attribute list, as is usually done with the userland predecessors to this API.
Your lack of imagination is disturbing :-)
Anyone who wants to provide some level of flexibility but within bounds. Say, you want to allow <strong> and <em> in a forum post but not <script>. It's not too difficult to imagine uses.
With a safe API like this one that's tied to the browser's own interpretation of HTML (i.e. it is perfectly placed to know exactly what is and isn't dangerous given it is the one rendering it) wouldn't it be much better to rely on that?
Are we taking out all the fun of the web? I absolutely loved the <marquee> names people had in the early days of Facebook, it was all harmless fun.
If injection of frontend code takes down your backend, your backend sucks, fix it.