I'm trying out a few things in Python in the hope of learning it. I'm on Ubuntu 20.04 and I have the following problem:

When I write a script and I want to try it, making changes and re-running it to see what happens, I use a text editor (Geany).

Among other things, my script contains a function that populates a very long list (43.5MB). If I want to inspect one element of the list and write print(mylist[0][0]['key']), for instance, the list gets repopulated when I re-run the script, which takes a long time, even though the list was already built on the previous run and its contents don't change on the second run. If, however, I do change something in the rest of the code, e.g. the list-generating part, then I can easily re-run the whole thing and see the effect of the changes.

If I do it in the Python console, I can just print the list, changing the indices according to the element I want to inspect, but if I then want to change something in the list generation, that's not so easy.

So I have a dilemma. Is there a way to run the script once and then access the list changing the indices without having to re-run the script, while at the same time keeping the flexibility to also change other parts of the script and run it again?

  • You may need IPython. Commented Aug 15, 2022 at 9:08
  • Save the list to a file next to your script file? By default try to load this file, if it doesn't exist, create it with your long processing. If you want to regenerate the thing, delete the file by hand, or pass some argument to your script Commented Aug 15, 2022 at 9:08
  • @AlexeyLarionov: With a file is how I'm doing it now, but I see it a little as a workaround and with overhead. I was wondering if there is a quicker way. Commented Aug 15, 2022 at 9:18
  • If you want to work interactively, IPython or a Jupyter notebook can be very convenient. Commented Aug 15, 2022 at 9:25
  • @ThierryLathuille, Mechanic Pig: I will look into that. I only briefly skimmed through and I found both somewhat complicated. But then again they can do a lot of things, so maybe it's indeed the way to go. Commented Aug 15, 2022 at 9:28

1 Answer

Here's how I would approach your script, say my_script.py.

import pickle
import sys

def generate_list():
    # placeholder: put the slow list-building code here
    return []

def dump_file(data, filepath):
    with open(filepath, 'wb') as f:
        pickle.dump(data, f)

def load_file(filepath):
    with open(filepath, 'rb') as f:
        return pickle.load(f)

if __name__ == "__main__":
    need_generate = False
    try:
        my_list = load_file('cached_list.pickle')
    except (OSError, EOFError, pickle.UnpicklingError):
        # cache file is missing, unreadable, or corrupt
        need_generate = True
    # a 'generate' argument forces regeneration even if the cache loaded
    need_generate = need_generate or (len(sys.argv) >= 2 and sys.argv[1] == 'generate')
    if need_generate:
        my_list = generate_list()
        dump_file(my_list, 'cached_list.pickle')

    # now use my_list

If you call python my_script.py, it tries to load the list from cached_list.pickle. The path is relative, so the file is looked up in the directory you run the script from (the script's own folder, if you run it from there). If loading fails (e.g. because the file doesn't exist), the list is regenerated and saved to that file for future runs. You can also call the script as python my_script.py generate to force regeneration and overwrite the file. Alternatively, delete the file from the folder and the next run will regenerate it.
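If you end up caching more than one object, the same load-or-regenerate logic can be folded into a single helper. This is only a sketch of that idea, not part of the script above: load_or_generate, its force parameter, and slow_build are names made up for illustration, and it assumes the cached object can be pickled.

```python
import os
import pickle
import tempfile


def load_or_generate(filepath, generate, force=False):
    """Return the object cached at filepath, or build it with generate() and cache it.

    load_or_generate and its parameters are hypothetical names for this sketch.
    """
    if not force:
        try:
            with open(filepath, 'rb') as f:
                return pickle.load(f)
        except (OSError, EOFError, pickle.UnpicklingError):
            # cache file is missing, unreadable, or corrupt: rebuild below
            pass
    data = generate()
    with open(filepath, 'wb') as f:
        pickle.dump(data, f)
    return data


if __name__ == "__main__":
    def slow_build():
        # stand-in for the real, slow list-generating code
        return [[{'key': 'value'}]]

    # a temporary folder is used here only to keep the demo self-contained
    cache_path = os.path.join(tempfile.mkdtemp(), 'cached_list.pickle')
    my_list = load_or_generate(cache_path, slow_build)   # builds and saves
    my_list = load_or_generate(cache_path, slow_build)   # loads from the file
    print(my_list[0][0]['key'])
```

Passing force=True plays the same role as the generate argument in the script: the cache file is ignored and overwritten with a freshly built list.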
