
I have a Post model with a FileField that is used to upload files. How can I validate the file type (PDF for now, or other types if I change it later)? Preferably I'd like to validate the content, but if that's not possible, I guess checking the suffix would do too. I looked online, but most of the solutions I found are old, and they no longer work now that Django and its documentation have been updated. Any help would be appreciated. Thanks.

class Post(models.Model):
    author = models.ForeignKey('auth.User',default='')
    title = models.CharField(max_length=200)
    text = models.TextField()
    PDF = models.FileField(null=True, blank=True)
    created_date = models.DateTimeField(
            default=timezone.now)
    published_date = models.DateTimeField(
            blank=True, null=True)

    def publish(self):
        self.published_date = timezone.now()
        self.save()

    def __str__(self):
        return self.title

4 Answers

23

With Django 1.11 and later you can use FileExtensionValidator. With earlier versions, or for extra validation, you can build your own validator based on it. You should probably write your own validator either way because of this warning in the documentation:

Don’t rely on validation of the file extension to determine a file’s type. Files can be renamed to have any extension no matter what data they contain.

Here's sample code using the existing validator:

from django.core.validators import FileExtensionValidator
from django.db import models


class Post(models.Model):
    # Accept only file names ending in .pdf (this checks the name, not the content)
    PDF = models.FileField(null=True, blank=True, validators=[FileExtensionValidator(['pdf'])])

Source code is also available so you can easily create your own:

https://docs.djangoproject.com/en/1.11/_modules/django/core/validators/#FileExtensionValidator
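For versions before 1.11, a hand-written validator could look roughly like the sketch below. It is not Django's own implementation, only the same kind of extension check. The names `ALLOWED_EXTENSIONS` and `validate_file_extension` are made up for this example, and the `ImportError` fallback exists only so the snippet can run outside a Django project.

```python
import os

try:
    from django.core.exceptions import ValidationError
except ImportError:
    # Assumption for illustration only: lets the sketch run without Django.
    ValidationError = ValueError

# Hypothetical setting; extend it if other file types should be accepted.
ALLOWED_EXTENSIONS = ("pdf",)


def validate_file_extension(value):
    """Reject files whose name does not end in an allowed extension."""
    # value is expected to expose a .name attribute, as Django file objects do.
    extension = os.path.splitext(value.name)[1][1:].lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValidationError(
            "File extension '%s' is not allowed. Allowed extensions are: %s."
            % (extension, ", ".join(ALLOWED_EXTENSIONS))
        )
```

It would be attached the same way as the built-in one, e.g. `validators=[validate_file_extension]` on the FileField. Like any extension check, it only looks at the file name.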


1 Comment

You REALLY should not trust only validation of the file name extension! Django really should include a validator that checks the file's content to be what is claimed by using libmagic. See @bimsapi's answer below, and check out github.com/mbourqui/django-constrainedfilefield or write a custom validator that uses libmagic!
2

Think of validation in terms of:

  • Name/extension
  • Metadata (content type, size)
  • Actual content (is it really a PNG as the content-type says, or is it a malicious PDF?)

The first two are mostly cosmetic: that information is pretty easy to spoof. By adding content validation (via file magic - https://pypi.python.org/pypi/filemagic) you add a little additional protection.

Here is a good, related answer: "Django: Validate file type of uploaded file". It may be old, but the core idea should be easily adapted.
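As a rough illustration of the content check, here is a minimal sketch that looks for the PDF signature at the start of the file. This is a hand-rolled check and not libmagic/filemagic, which covers far more file types. The names `PDF_MAGIC` and `validate_pdf_content` are invented for this example, it assumes the uploaded file object supports `tell()`, `seek()` and `read()`, and the `ImportError` fallback is only there so the snippet runs outside a Django project.

```python
try:
    from django.core.exceptions import ValidationError
except ImportError:
    # Assumption for illustration only: lets the sketch run without Django.
    ValidationError = ValueError

# PDF files begin with this signature, followed by the version number.
PDF_MAGIC = b"%PDF-"


def validate_pdf_content(value):
    """Reject files whose first bytes are not the PDF signature."""
    start = value.tell()
    value.seek(0)
    header = value.read(len(PDF_MAGIC))
    value.seek(start)  # restore the position so later readers see the whole file
    if header != PDF_MAGIC:
        raise ValidationError("The uploaded file does not look like a PDF.")
```

This only proves the file starts like a PDF; it says nothing about whether the rest of the file is well formed or harmless. It is also strict about offset 0, so a file with stray bytes before the signature would be rejected.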


0

First, I'd advise you to rename the field from 'PDF' to 'pdf'. Then, to validate in older versions of Django, you could do this:

forms.py

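from django import forms

from .models import Post  # assumes Post lives in this app's models.py


class PostForm(forms.ModelForm):
    # fields here
    class Meta:
        model = Post
        fields = ["title", "text", "pdf"]

    def clean(self):
        cd = self.cleaned_data
        pdf = cd.get('pdf', None)
        # Only a fresh upload carries content_type. An existing file, or a
        # cleared field (False), does not, so the check is skipped for those.
        content_type = getattr(pdf, 'content_type', None)
        if content_type is not None:
            main, _, sub = content_type.partition('/')
            # main here would most likely be application, as pdf mime type is application/pdf,
            # but I'd like to be on the safe side in case it returns octet-stream/pdf
            if not (main in ["application", "octet-stream"] and sub == "pdf"):
                raise forms.ValidationError(u'Please use a PDF file')
        return cd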

2 Comments

This works great when I upload a file, but somehow gives an error when I try to edit my file and delete the attachment.
content_type comes from the user, and they could say it's any file type they want. It's bad practice to trust that. Please use the answer provided by @bimsapi instead.
0

Here is a simple example of a form with file type validation, modeled on the extension check in Django 1.11's FileExtensionValidator:

import os

from django import forms

from .models import Profile  # assumes Profile lives in this app's models.py


class ImageForm(forms.ModelForm):
    ALLOWED_TYPES = ['jpg', 'jpeg', 'png', 'gif']

    class Meta:
        model = Profile
        fields = ['image', ]

    # Django calls clean_<fieldname>(), so the name must match the 'image' field
    def clean_image(self):
        image = self.cleaned_data.get('image', None)
        if not image:
            raise forms.ValidationError('Missing image file')
        # Compare the lower-cased extension, without the leading dot
        extension = os.path.splitext(image.name)[1][1:].lower()
        if extension not in self.ALLOWED_TYPES:
            raise forms.ValidationError('File type is not allowed')
        return image

