2

I have a question about data storage. I have a program that is creating a list of objects. What is the best way to store these on file so that the program can reload them later? I've tried to use Pickle, but I think I might be heading down the wrong alley and I keep getting this error when I try to read back the data:

    Traceback (most recent call last):
      File "test.py", line 110, in <module>
        knowledge = pickle.load(open("data.txt"))
      File "/sw/lib/python3.1/pickle.py", line 1356, in load
        encoding=encoding, errors=errors).load()
      File "/sw/lib/python3.1/codecs.py", line 300, in decode
        (result, consumed) = self._buffer_decode(data, self.errors, final)
    UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0: invalid start byte

Edited to add: here's a bit of the code I'm trying:

FILE = open("data.txt", "rb")

knowledge = pickle.load(open("data.txt"))

FILE = open("data.txt", 'wb')

pickle.dump(knowledge, FILE)

  • Which Python version? How did you create the file? Commented Jun 8, 2011 at 14:54
  • Retry pickling. Read the documentation carefully! Post some code here and we'll help you find what's wrong :). You could also use JSON, there are several modules for that around. Commented Jun 8, 2011 at 14:55

4 Answers

9

I think the problem is that the line

knowledge = pickle.load(open("data.txt"))

doesn't open the file in binary mode. Here is the same error reproduced, and then avoided by passing "rb", in Python 3.2:

>>> import pickle
>>> 
>>> knowledge = {1:2, "fred": 19.3}
>>> 
>>> with open("data.txt", 'wb') as FILE:
...     pickle.dump(knowledge, FILE)
... 
>>> knowledge2 = pickle.load(open("data.txt"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/codecs.py", line 300, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0: invalid start byte
>>> knowledge2 = pickle.load(open("data.txt","rb"))
>>> knowledge2
{1: 2, 'fred': 19.3}
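
As a minimal sketch of the same fix in script form, the helpers below open the file in binary mode for both writing and reading, and use `with` so the file is closed even if pickling fails. The function names and the `knowledge.pkl` file name are made up for illustration, and a temporary directory is used so the example doesn't touch real files.

```python
import os
import pickle
import tempfile


def save_objects(objects, path):
    # "wb": pickle writes bytes, so the file must be opened in binary mode.
    with open(path, "wb") as f:
        pickle.dump(objects, f)


def load_objects(path):
    # "rb": opening in text mode here is what fails in the question.
    with open(path, "rb") as f:
        return pickle.load(f)


knowledge = [{1: 2, "fred": 19.3}, {"wilma": [1, 2, 3]}]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "knowledge.pkl")  # hypothetical file name
    save_objects(knowledge, path)
    restored = load_objects(path)

print(restored == knowledge)
```

Instances of your own classes can be stored the same way, as long as the class can be imported under the same module name when the file is loaded.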

1

There's no need to reimplement what shelve, Python's object persistence library, already provides. Example:

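import shelve

# placeholder values for illustration; substitute your own
filename = 'data.shelf'
key = 'knowledge'
data = {1: 2, 'fred': 19.3}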

d = shelve.open(filename) # open -- file may get suffix added by low-level
                          # library

d[key] = data   # store data at key (overwrites old data if
                # using an existing key)
data = d[key]   # retrieve a COPY of data at key (raise KeyError if no
                # such key)
del d[key]      # delete data stored at key (raises KeyError
                # if no such key)
flag = key in d         # true if the key exists (d.has_key() is Python 2 only)
klist = list(d.keys())  # a list of all existing keys (slow!)

# as d was opened WITHOUT writeback=True, beware:
d['xx'] = list(range(4))  # this works as expected, but...
d['xx'].append(5)         # *this doesn't!* -- d['xx'] is STILL [0, 1, 2, 3]!

# having opened d without writeback=True, you need to code carefully:
temp = d['xx']      # extracts the copy
temp.append(5)      # mutates the copy
d['xx'] = temp      # stores the copy right back, to persist it

# or, d=shelve.open(filename,writeback=True) would let you just code
# d['xx'].append(5) and have it work as expected, BUT it would also
# consume more memory and make the d.close() operation slower.

d.close()       # close it
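
For a Python 3 program like the one in the question, the same idea might look like the sketch below. It assumes Python 3.4 or later, where `shelve.open` can be used as a context manager, and the `knowledge` file name and `facts` key are invented for the example.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "knowledge")  # hypothetical file name

    # First "run" of the program: store a list under a key.
    with shelve.open(path) as db:
        db["facts"] = [1, 2, 3]

    # Later "run": writeback is off by default, so update the stored
    # list with an explicit read-modify-write.
    with shelve.open(path) as db:
        facts = db["facts"]   # a copy of the stored list
        facts.append(4)       # mutate the copy
        db["facts"] = facts   # store it back to persist the change

    # Reopen once more to confirm the change was persisted.
    with shelve.open(path) as db:
        reloaded = db["facts"]
        has_facts = "facts" in db

print(reloaded, has_facts)
```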


0

If you just want to recreate some class objects later, the easiest solution would be to dump their properties into a file and then read them back, creating the objects based on the contents.

See: http://docs.python.org/tutorial/inputoutput.html
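
One way this could look is sketched below, using JSON as the file format. The `Fact` class, its `to_dict`/`from_dict` helpers, and the `facts.json` file name are all hypothetical; they only illustrate the "dump the properties, rebuild the objects" idea.

```python
import json
import os
import tempfile


class Fact:
    """Hypothetical example class; substitute your own."""

    def __init__(self, subject, value):
        self.subject = subject
        self.value = value

    def to_dict(self):
        # Only plain, JSON-compatible values go into the file.
        return {"subject": self.subject, "value": self.value}

    @classmethod
    def from_dict(cls, d):
        return cls(d["subject"], d["value"])


facts = [Fact("fred", 19.3), Fact("wilma", 2)]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "facts.json")

    # Write the properties of every object as one JSON list.
    with open(path, "w", encoding="utf-8") as f:
        json.dump([fact.to_dict() for fact in facts], f)

    # Read them back and recreate the objects from the stored properties.
    with open(path, encoding="utf-8") as f:
        rebuilt = [Fact.from_dict(d) for d in json.load(f)]

print([(fact.subject, fact.value) for fact in rebuilt])
```

The cost of this approach is that `to_dict` and `from_dict` have to be kept in sync with the class by hand.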

2 Comments

No, this is not easy. It's a lot of additional typing and violates DRY (and therefore also carries the risk of getting out of sync).
The data structure is fairly complicated. The article you linked to advises against doing it manually and suggests pickle. Do you have any idea what might be causing my error?
-1

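You can use cPickle or pickle; it doesn't matter (in Python 3, cPickle no longer exists and pickle uses the C implementation automatically). Open the file in binary mode ('wb' to write, 'rb' to read), and try setting the protocol to -1.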

Try something like this:

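try:
    import cPickle as pickle  # Python 2: C-accelerated implementation
except ImportError:
    import pickle  # Python 3: cPickle was folded into pickle

my_file = open('wohoo.file', 'wb')

largeObject = {'example': [1, 2, 3]}  # placeholder; insert your own object here
pickle.dump(largeObject, my_file, -1)  # -1 selects the highest protocol
my_file.close()

other_file = open('wohoo.file', 'rb')
welcomeBack = pickle.load(other_file)
other_file.close()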
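
As a small illustration of what the `-1` means, the sketch below pickles to an in-memory bytes object with `pickle.dumps` instead of a file. The sample dictionary is made up; the point is only that a negative protocol number selects `pickle.HIGHEST_PROTOCOL`, and that loading doesn't need to be told which protocol was used.

```python
import pickle

data = {"fred": 19.3, "numbers": list(range(5))}  # made-up sample data

# A negative protocol number selects the highest protocol available.
newest = pickle.dumps(data, -1)
explicit = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)
same_bytes = newest == explicit

# The protocol is detected automatically when loading.
restored = pickle.loads(newest)

print(same_bytes, restored == data)
```

The result of `dumps` is a `bytes` object either way, which is why files holding pickled data have to be opened in binary mode.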

1 Comment

-1 Provably wrong, read again. It does find the file. It can even read it. It only fails to decode it into the encoding Python prefers.
