Should I save in db - user input as html encode? [closed]

Question

Closed. This question is opinion-based. It is not currently accepting answers.


Closed last year.


My coworkers and I disagree on whether we should HTML-encode user input before saving it to the DB, or save it exactly as entered.

I have also found various answers saying that the DB should store the plain (!) input.

Why? Because the DB should know that the user's input has length 1 for < and not length 4 for &lt;

HTML encoding should be applied only when outputting.
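To make the length argument concrete, here is a minimal Python sketch. The question does not name a server language, so Python's standard `html` module is only an assumption for illustration, and `render_comment` is a hypothetical helper. It shows that the raw character has length 1, its encoded form has length 4, and that encoding can be deferred until the HTML is built.

```python
import html

raw = "<"                    # what the user typed
encoded = html.escape(raw)   # the HTML-encoded form, "&lt;"

print(len(raw))      # 1
print(len(encoded))  # 4


def render_comment(stored_text: str) -> str:
    """Hypothetical output helper: encode only when building HTML."""
    return "<p>" + html.escape(stored_text) + "</p>"


print(render_comment("a < b"))  # <p>a &lt; b</p>
```

With this arrangement a length limit or a search over the stored column works on what the user actually typed, not on entity text.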

But having said that, I see that Stack Overflow does not follow this rule.

When I write a question here on SO that contains a plain <, the preview pane (obviously) shows the <. But when I submit the question, the content is sent as JSON with HTML encoding applied!

(JSON.stringify does not do HTML encoding.)

So if I type this in the input:

(screenshot not included)

and submit, I see (via Fiddler) that it actually sends the HTML-encoded value:

(screenshot not included)

Question:

As you can see, I'm a bit confused. Common logic says that the DB should save whatever the user types, 1:1.

Sanitization should be done at output.

Craig Stuntz · Accepted Answer · 2014-03-11 12:17:38Z

1

+50

You must encode input sent to the server, because otherwise the anti-cross-site-scripting (XSS) protection on the server will block the entire request. However, you decode this input before saving it to the DB.

In other words, what you see in the POST isn't necessarily what is saved to the database.
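The flow described here can be sketched as follows. This is only an illustration under assumptions: the thread does not show Stack Overflow's server code, so Python with an in-memory SQLite database stands in for it, and `save_post` and the `posts` table are hypothetical names.

```python
import html
import sqlite3


def save_post(conn: sqlite3.Connection, encoded_body: str) -> None:
    """Hypothetical handler: decode the HTML-encoded request value, then store raw text."""
    raw_body = html.unescape(encoded_body)
    # Parameterized query, so the raw text is never spliced into the SQL string.
    conn.execute("INSERT INTO posts (body) VALUES (?)", (raw_body,))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (body TEXT)")

save_post(conn, "if a &lt; b then")  # the value as it appears on the wire
stored = conn.execute("SELECT body FROM posts").fetchone()[0]
print(stored)  # if a < b then
```

The encoded form exists only in transit; the row holds the text exactly as the user typed it.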


8 Comments

Royi Namir Over a year ago

Craig, with all due respect to the XSS library, what is the problem with sending the input 1:1 to the DB and then sanitizing and encoding it on output? I mean, why decode again at the server just because of this layer?

Craig Stuntz Over a year ago

Allowing unencoded requests to the server is flat-out dangerous. It's worth the extra steps for the (relatively infrequent!) times you want to allow markup in submissions to ensure that those fields and those fields only contain markup. Additionally, you don't usually allow any HTML at all in the POST. Usually it's only a carefully limited whitelist of tags. So you'll be encoding / decoding anyway.
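As a rough illustration of a "carefully limited whitelist of tags", here is a minimal Python sketch built on the standard `html.parser` module. The tag set, the class name `WhitelistFilter`, and the function `filter_markup` are all assumptions made for this example; it is not a vetted sanitizer, and real systems generally rely on a maintained library for this.

```python
import html
from html.parser import HTMLParser

ALLOWED_TAGS = {"b", "i", "em", "strong", "code"}  # assumed whitelist, for illustration


class WhitelistFilter(HTMLParser):
    """Keeps whitelisted tags (without attributes) and escapes all text."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"<{tag}>")  # attributes are dropped on purpose

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(html.escape(data))  # text is always encoded


def filter_markup(text: str) -> str:
    parser = WhitelistFilter()
    parser.feed(text)
    parser.close()  # flush any buffered text
    return "".join(parser.parts)


print(filter_markup('<i onclick="x()">ok</i>'))
print(filter_markup("<b>hi</b><script>alert(1)</script>"))
```

Tags outside the whitelist are dropped and their text content is kept as escaped text, so the output can only contain the allowed tags with no attributes.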

EricLaw Over a year ago

It's important to recognize that not all servers have anti-XSS features, and also important to recognize that if you were to rely upon client-side encoding, an attacker could use Fiddler to simply send an unencoded attack string to the server.

Craig Stuntz Over a year ago

@ericlaw, I didn't suggest relying on client-side encoding! I agree that would not be safe. Rather, I said you should block unencoded requests at the server.

Royi Namir Over a year ago

I still don't see what the problem is with unencoded requests at the server (when inserting into the DB). The DB should save data: not person data, or readable data, or safe data, but DATA. When outputting, though, I agree that I should "make this data harmless when displayed".


Johnny Yin · Accepted Answer · 2024-06-05 00:46:09Z

0

I think it's always better to make sure data is safe before storing it in the DB, because we don't know how many downstream services will consume it. Encoding before storing at least guarantees security even if engineers forget to encode in the front-end UI when they consume the data.

As for the statement that the DB should save whatever the user types 1:1, I think this is totally wrong, because it does not consider security. Normally we need to do a lot of checking and transformation in the DTO validation layer. For example, we check email addresses and convert them to lower case before storing them in the DB. Security is always the number one rule!


1 Comment

Royi Namir Over a year ago

I wasn't talking about email validation; that's not related to encoding. Anyway, nowadays I know that the DB should store data, not encoded or anything else. There can also be non-web consumers. So encoding belongs on the output side.
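The point about non-web consumers can be sketched like this in Python. The three helper functions and the sample value are hypothetical; the idea is only that one raw stored value is encoded differently for each output channel, which would not work cleanly if the DB already held HTML entities.

```python
import csv
import html
import io
import json

stored = 'Tom & "Jerry" <3'  # raw value as kept in the database


def to_html(text: str) -> str:
    """Web page consumer: HTML-encode at output time."""
    return html.escape(text)


def to_json(text: str) -> str:
    """API consumer: JSON escaping only, no HTML entities."""
    return json.dumps({"body": text})


def to_csv(text: str) -> str:
    """Export consumer: CSV quoting only."""
    buf = io.StringIO()
    csv.writer(buf).writerow([text])
    return buf.getvalue()


print(to_html(stored))
print(to_json(stored))
print(to_csv(stored))
```

Each consumer can recover the original text from its own format, because none of them has to undo an encoding meant for a different channel.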
