I was looking at the documentation, and the examples section doesn't show how to create a UUID based on file contents. Google did not help either.

I've tried this:

>>> import uuid
>>> data = open('/media/emmc/DCIM/100ABC06/00059.JPG','rb')
>>> contents = data.read()
>>> len(contents)
9155
>>> uuid = uuid.UUID(contents)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/uuid.py", line 134, in __init__
ValueError: badly formed hexadecimal UUID string

Also this:

>>> uuid = uuid.UUID(str(contents))
>>> uuid = uuid.UUID(contents.decode('ascii'))
>>> uuid = uuid.UUID(contents.decode('utf8'))

Please help me understand how to generate a UUID based on File contents in Python 2.7.

    The documents you link tell you how to use the function. They don't say you can convert a whole file into a UUID. Probably what you want is a hash. Commented Jun 2, 2015 at 13:52

2 Answers

If you want to create a hash of a file's contents, you probably don't need a UUID. Instead, use hashlib with MD5, SHA-1, SHA-256, or any other supported algorithm to create a fingerprint of the file.
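
As a rough sketch of the hashlib approach, the function below reads a file in chunks and returns its hex digest. The name `file_digest` and the `algorithm` and `chunk_size` parameters are my own choices for illustration, not part of hashlib; only `hashlib.new`, `update`, and `hexdigest` come from the library.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of the file at `path` (hypothetical helper)."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()
```

Calling `file_digest('/media/emmc/DCIM/100ABC06/00059.JPG', 'md5')` should give the same value as running md5sum on that file. Two files with identical contents get the same digest, which is usually what "an ID based on file contents" needs.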

1 Comment

Thanks - I did the MD5sum of the file.

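When you pass a string to uuid.UUID() as the first argument, it must contain exactly 32 hexadecimal digits (braces, hyphens, and a URN prefix are ignored). Raw binary data is accepted only through the bytes or bytes_le arguments, and then it must be exactly 16 bytes long. Your 9155-byte file contents fit neither form, hence the ValueError.

Refer to the docs: https://docs.python.org/2/library/uuid.html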

Create a UUID from either a string of 32 hexadecimal digits, a string of 16 bytes as the bytes argument, a string of 16 bytes in little-endian order as the bytes_le argument, a tuple of six integers (32-bit time_low, 16-bit time_mid, 16-bit time_hi_version, 8-bit clock_seq_hi_variant, 8-bit clock_seq_low, 48-bit node) as the fields argument, or a single 128-bit integer as the int argument. When a string of hex digits is given, curly braces, hyphens, and a URN prefix are all optional. For example, these expressions all yield the same UUID:

UUID('{12345678-1234-5678-1234-567812345678}')
UUID('12345678123456781234567812345678')
UUID('urn:uuid:12345678-1234-5678-1234-567812345678')
UUID(bytes='\x12\x34\x56\x78'*4)
UUID(bytes_le='\x78\x56\x34\x12\x34\x12\x78\x56' +
              '\x12\x34\x56\x78\x12\x34\x56\x78')
UUID(fields=(0x12345678, 0x1234, 0x5678, 0x12, 0x34, 0x567812345678))
UUID(int=0x12345678123456781234567812345678)
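
If you really do need the result to be a UUID object, one option is to reduce the file contents to 16 bytes first and pass those through the bytes argument. This is only a sketch: `uuid_from_contents` is a hypothetical helper name, and using MD5 for the reduction is my assumption (it happens to produce exactly 16 bytes). The result is deterministic, but its version and variant bits are whatever the digest contains, so it is not a standard versioned UUID.

```python
import hashlib
import uuid


def uuid_from_contents(contents):
    """Derive a deterministic UUID from a byte string (hypothetical helper)."""
    # An MD5 digest is exactly 16 bytes, the size UUID(bytes=...) requires.
    digest = hashlib.md5(contents).digest()
    return uuid.UUID(bytes=digest)


# Identical contents always map to the same UUID.
print(uuid_from_contents(b"abc"))
```

For a real file, read it in binary mode first, for example `uuid_from_contents(open(path, 'rb').read())`, where `path` is whatever file you want to identify.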

1 Comment

This is valid in Python 3 as well.
