PR welcome? Can I fix the HTML sanitizer to allow standard data-* attributes?

I just ran into this exact same issue. If you set a breakpoint in Strava’s embed script, you can see the element it gets back when it runs its querySelector on the DOM: the author’s intended markup has been stripped down to just a bare:

<div class="strava-embed-placeholder"></div>

So all the data-* attributes, with the real IDs you tried to enter, have been stripped. These attributes are a very standard web practice; see the Mozilla docs on HTMLElement/dataset. They’re commonly used as a sort of parameter list for a separate script you’re running. For example, I might have an attribute like data-embed-id="1234567", and my script will look up the embedId dataset key and use its value as an argument. Without this simple attribute practice, using scripts on Write.as appears trivially broken. (I truly don’t think this is just a small edge-case practice of script embedding.)
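For anyone who hasn’t used dataset before, here is a rough sketch, in Go only to match the sanitizer code, of how an attribute name maps to the key a script reads. The function name datasetKey is made up for illustration; the rule it follows is the one browsers apply: drop the data- prefix, then turn each dash followed by a lowercase ASCII letter into that letter upper-cased.

```go
package main

import (
	"fmt"
	"strings"
)

// datasetKey converts an attribute name such as "data-embed-id" into the
// key a browser exposes on element.dataset ("embedId").
// It returns false when the name is not a data-* attribute.
// Hypothetical helper, written only to illustrate the naming rule.
func datasetKey(attr string) (string, bool) {
	const prefix = "data-"
	if !strings.HasPrefix(attr, prefix) || len(attr) == len(prefix) {
		return "", false
	}
	name := attr[len(prefix):]
	var b strings.Builder
	for i := 0; i < len(name); i++ {
		c := name[i]
		// A dash followed by an ASCII lowercase letter is dropped and the
		// letter is upper-cased; every other byte is copied unchanged.
		if c == '-' && i+1 < len(name) && name[i+1] >= 'a' && name[i+1] <= 'z' {
			b.WriteByte(name[i+1] - 'a' + 'A')
			i++
			continue
		}
		b.WriteByte(c)
	}
	return b.String(), true
}

func main() {
	key, ok := datasetKey("data-embed-id")
	fmt.Println(key, ok) // embedId true
}
```

So once the sanitizer drops data-embed-id, the embedId key simply doesn’t exist for the embed script to read.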

I’m guessing there’s an HTML sanitizer that was written without knowledge of this very safe practice. I couldn’t find the WriteFreely code responsible after a quick search, but I’d be happy to send a patch if it’s welcome.

I’m pretty sure this diff is all that’s needed:

  func getSanitizationPolicy() *bluemonday.Policy {
	  policy := bluemonday.UGCPolicy()
+	  policy.AllowAttrs("style", "class", "id", "data").Globally()
	  policy.AllowAttrs("src", "style").OnElements("iframe", "video", "audio")
	  policy.AllowAttrs("src", "type").OnElements("source")
	  policy.AllowAttrs("frameborder", "width", "height").Matching(bluemonday.Integer).OnElements("iframe")
	  policy.AllowAttrs("allowfullscreen").OnElements("iframe")
	  policy.AllowAttrs("controls", "loop", "muted", "autoplay").OnElements("video")
	  policy.AllowAttrs("controls", "loop", "muted", "autoplay", "preload").OnElements("audio")
	  policy.AllowAttrs("target").OnElements("a")
	  policy.AllowAttrs("title").OnElements("abbr")
-	  policy.AllowAttrs("style", "class", "id").Globally()
	  policy.AllowAttrs("alt").OnElements("img")
	  policy.AllowElements("header", "footer")
	  policy.AllowURLSchemes("http", "https", "mailto", "xmpp")
	  return policy
  }

Based on:

@matt would this be acceptable as a PR?

Thanks for bringing this up and taking a look at it. Yes, more than happy to merge a pull request once it’s confirmed that this works! You should be on the right track there.
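One thing worth checking when confirming the change: as far as I can tell, bluemonday’s AllowAttrs matches attribute names exactly, so passing "data" would permit only an attribute literally named data, not data-embed-id. Newer bluemonday releases have a policy.AllowDataAttributes() method for this case, though whether the version WriteFreely pins includes it would need to be verified. The sketch below is a hand-rolled illustration of the difference between exact and prefix matching, not bluemonday’s actual code; exactAllowlist, allowedExact and allowedWithData are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// exactAllowlist mimics an allowlist keyed by full attribute name.
// Illustration only; this is not how bluemonday is implemented.
var exactAllowlist = map[string]bool{
	"style": true, "class": true, "id": true, "data": true,
}

// allowedExact reports whether name appears verbatim in the allowlist.
func allowedExact(name string) bool {
	return exactAllowlist[strings.ToLower(name)]
}

// allowedWithData additionally accepts any data-* attribute that has a
// non-empty suffix after the "data-" prefix.
func allowedWithData(name string) bool {
	lower := strings.ToLower(name)
	if strings.HasPrefix(lower, "data-") && len(lower) > len("data-") {
		return true
	}
	return exactAllowlist[lower]
}

func main() {
	// Compare both checks on a plain, a literal "data", and a data-* name.
	for _, attr := range []string{"class", "data", "data-embed-id"} {
		fmt.Printf("%-14s exact=%-5v withData=%v\n",
			attr, allowedExact(attr), allowedWithData(attr))
	}
}
```

With exact matching, data-embed-id is still rejected even though "data" is on the list, which is the behavior to rule out by running a test post through the patched sanitizer.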