I'm currently dealing with XSS.
I need to detect whether a string contains XSS. To solve this I followed that link, and this is the code I'm using:
public static boolean containsXSS(String value) {
    if (StringUtils.isEmpty(value)) {
        return false;
    }
    String stripXss = stripXSS(value);
    return !value.equals(stripXss);
}

public static String stripXSS(String value) {
    if (StringUtils.isBlank(value))
        return value;
    // Use the ESAPI library to avoid encoded attacks.
    Encoder encoder = ESAPI.encoder();
    value = encoder.canonicalize(value);
    // Avoid null characters
    value = value.replaceAll("\0", "");
    // Clean out HTML
    Document.OutputSettings outputSettings = new Document.OutputSettings();
    outputSettings.escapeMode(Entities.EscapeMode.xhtml);
    outputSettings.prettyPrint(false);
    value = Jsoup.clean(value, "", Whitelist.none(), outputSettings);
    return value;
}
With the code above I do catch things like: <script>alert('xss')</script>
My problem is that the following string is flagged as containing XSS although it doesn't contain any: {"item" :5}
It's because Jsoup.clean turns it into {&quot;item&quot; :5}, so the result no longer equals the original string.
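To show what I mean, here is a minimal sketch of the comparison problem using only the standard library. The names stripAndEscape, unescape, naiveCheck and checkAfterUnescape are hypothetical stand-ins I wrote for illustration; they are not the jsoup or ESAPI API, and the real cleaner does far more. The point is only that escaping a double quote is enough to make the equals check fail, and that undoing the escaping before comparing might avoid that (I have not verified this against jsoup itself):

```java
import java.util.regex.Pattern;

// Hypothetical sketch: none of these helpers come from jsoup or ESAPI.
class XssCompareSketch {

    // Very rough stand-in for "remove every tag"; a real cleaner parses the HTML.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    // Stand-in for a cleaner that strips tags and then escapes the remaining text.
    static String stripAndEscape(String value) {
        String withoutTags = TAG.matcher(value).replaceAll("");
        return withoutTags
                .replace("&", "&amp;")
                .replace("\"", "&quot;");
    }

    // Reverses only the two escapes applied by stripAndEscape.
    static String unescape(String value) {
        return value
                .replace("&quot;", "\"")
                .replace("&amp;", "&");
    }

    // Same idea as containsXSS above: flags anything the cleaner changed.
    static boolean naiveCheck(String value) {
        return !value.equals(stripAndEscape(value));
    }

    // Compares after undoing the escaping, so only removed markup counts.
    static boolean checkAfterUnescape(String value) {
        return !value.equals(unescape(stripAndEscape(value)));
    }

    public static void main(String[] args) {
        String json = "{\"item\" :5}";
        String script = "<script>alert('xss')</script>";

        System.out.println(naiveCheck(json));           // true: false positive caused by &quot;
        System.out.println(checkAfterUnescape(json));   // false
        System.out.println(checkAfterUnescape(script)); // true: the tags were removed
    }
}
```

In this toy version the JSON string is only flagged by the naive comparison, while the script string is flagged by both.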
I have tried to solve this, but with no success. It makes me wonder whether my algorithm is completely wrong (if so, where can I find an algorithm to detect XSS?), or whether I simply shouldn't compare the result to the original String.
I would very much appreciate your help.
Thanks