
My coworkers and I disagree on whether we should HTML-encode user input before saving it to the DB, or save it exactly as entered.

I also found various answers which say that the DB should store the plain(!) input.

Why? Because the DB should know that the length of the user's input is 1 for < and not 4 for &lt;

The HTML encoding should be done only when outputting.
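To make that rule concrete, here is a minimal sketch using Python's standard html module. The render_comment helper is hypothetical and only illustrates encoding at output time; it is not taken from any particular framework.

```python
import html

raw = "<"                    # what the user typed, stored as-is
encoded = html.escape(raw)   # what an HTML page needs: "&lt;"

print(len(raw))      # 1
print(len(encoded))  # 4


def render_comment(stored_text: str) -> str:
    """Hypothetical output helper: encode only when building HTML."""
    return "<p>" + html.escape(stored_text) + "</p>"


print(render_comment("a < b"))  # <p>a &lt; b</p>
```

The stored value keeps its real length, and the encoding cost is paid only by the consumer that actually produces HTML.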

But having said that, I see that Stack Overflow does not follow this rule.

When I save a question here on SO that contains a plain <, the preview pane (obviously) shows it as &lt;. But when I submit the question, the content is sent as JSON with HTML encoding applied!

JSON.stringify does not do HTML encoding.

So if I type this into the input:

enter image description here

and submit, I see (via Fiddler) that it actually sends the HTML-encoded value:

enter image description here

Question:

As you can see, I'm a bit confused. Common wisdom says that the DB should store whatever the user typed, 1:1, and that sanitization should happen at output.


2 Answers


You must encode input sent to the server, because otherwise the anti-Cross-Site-Scripting (XSS) protection on the server will block the entire request. However, you then decode this input before saving it to the DB.

In other words, what you see in the POST isn't necessarily what is saved to the database.
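As a rough sketch of that round trip, assuming the client HTML-encodes the field and the server undoes it before saving (decode_for_storage is a hypothetical name, and Python's standard html module stands in for whatever the real server uses):

```python
import html


def decode_for_storage(posted_value: str) -> str:
    """Hypothetical server-side step: undo the transport encoding before saving."""
    return html.unescape(posted_value)


posted = "a &lt; b"                  # what shows up in the POST body
stored = decode_for_storage(posted)  # what goes into the DB
rendered = html.escape(stored)       # encoded again only when writing HTML

print(stored)    # a < b
print(rendered)  # a &lt; b
```

The encoded form exists only in transit and on the rendered page; the database row holds the plain text.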


8 Comments

Craig, what is the problem (with all due respect to the XSS library) with sending the input 1:1 to the DB and then, at output, sanitizing and encoding it? I mean, why decode again on the server just because of this layer?
Allowing unencoded requests to the server is flat-out dangerous. It's worth the extra steps for the (relatively infrequent!) times you want to allow markup in submissions to ensure that those fields and those fields only contain markup. Additionally, you don't usually allow any HTML at all in the POST. Usually it's only a carefully limited whitelist of tags. So you'll be encoding / decoding anyway.
It's important to recognize that not all servers have anti-XSS features, and also important to recognize that if you were to rely upon client-side encoding, an attacker could use Fiddler to simply send an unencoded attack string to the server.
@ericlaw, I didn't suggest relying on client-side encoding! I agree that would not be safe. Rather, I said you should block unencoded requests at the server.
Still, I don't see what the problem is with unencoded requests at the server (when inserting into the DB). The DB should save data: not personal data, or readable data, or safe data, but DATA. When outputting, though, I agree that I should make this data harmless when displayed.

I think it's always better to make sure data is safe before storing it in the DB. This is because we don't know how many downstream services will consume the data. Encoding before storing in the DB at least guarantees security even if engineers forget to encode in the front-end UI when they consume the data.

Regarding the statement that the DB should save whatever the user types 1:1, this is totally wrong. It never considers security. Normally we need to do lots of checking and transformation in the DTO validation layer. For example, we check email addresses and convert them to lower case before storing them in the DB. Security is always the number one rule!

1 Comment

I wasn't talking about email validation; that's not related to encoding. Anyway, nowadays I know that the DB should store data, not encoded or otherwise transformed. There can be non-web consumers too, so encoding belongs at the output layer.
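A small sketch of why that matters, using only the Python standard library. The sample string and the to_html, to_json and to_csv_row helpers are made up for illustration; the point is that each consumer needs a different encoding of the same stored value.

```python
import csv
import html
import io
import json

stored = 'Tom & "Jerry" <3'  # raw value, exactly as the user typed it


def to_html(text: str) -> str:
    """Encode for an HTML page."""
    return html.escape(text)


def to_json(text: str) -> str:
    """Encode for a JSON API; no HTML entities involved."""
    return json.dumps(text)


def to_csv_row(text: str) -> str:
    """Encode as a one-field CSV row for an export job."""
    buf = io.StringIO()
    csv.writer(buf).writerow([text])
    return buf.getvalue()


print(to_html(stored))     # Tom &amp; &quot;Jerry&quot; &lt;3
print(to_json(stored))     # "Tom & \"Jerry\" <3"
print(to_csv_row(stored))  # "Tom & ""Jerry"" <3"
```

If the DB held the HTML-encoded form instead, the JSON and CSV consumers would first have to decode it, or would leak entities such as &amp; into their output.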
