
I was curious whether anyone could tell me if using SQLite to store a dictionary (as in brute force) for use in a Python script is an efficient method. While I am relatively new to Python, I have quite a bit of experience with other programming languages and am currently working on a pentesting tool to use in BackTrack. So far I am quite impressed with the speed and simplicity of Python, and my SQL queries seem to work well for returning the needed prefixes for my brute-force tool.

What I'm really wondering is: what is the standard way to store large data files in Python? Am I overlooking a better (faster) way of storing my prefixes simply because of my comfort with SQL? Please bear in mind that I am not using Python to query IDs 0 through n and use them; rather, I am using Python to narrow down the possibilities and query only those dictionary entries that match the criteria. Any help or opinions would be much appreciated!
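For illustration, here is a minimal sketch of the kind of prefix lookup described above; the table and column names are hypothetical, and a real wordlist would of course be much larger:

```python
import sqlite3

# Hypothetical schema: a table of dictionary words, queried by prefix.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (word TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO words VALUES (?)",
                 [("password",), ("passport",), ("letmein",)])

# Narrow the candidate list to entries matching a prefix.
prefix = "pass"
rows = conn.execute(
    "SELECT word FROM words WHERE word LIKE ? || '%' ORDER BY word",
    (prefix,),
).fetchall()
matches = [w for (w,) in rows]
print(matches)  # ['passport', 'password']
```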

  • How fast would you like it to be? How large is the dataset (MB, GB, TB)? What do you mean by "prefixes": as in a prefix tree (trie)? How well do the operations you perform on the data fit the relational model? Commented Nov 6, 2011 at 8:29

3 Answers


Yes, SQLite is a reasonable choice for implementing a dictionary. For speed, use the :memory: option, and make sure to create the appropriate indexes for your lookups and queries.
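A minimal sketch of both points, using an assumed key/value table (names are illustrative):

```python
import sqlite3

# An in-memory database avoids disk I/O entirely.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE kv (key TEXT, value TEXT)")
db.execute("CREATE INDEX idx_kv_key ON kv (key)")  # index the lookup column

db.executemany("INSERT INTO kv VALUES (?, ?)",
               [("alpha", "1"), ("beta", "2")])

# With the index in place, equality lookups avoid a full table scan.
row = db.execute("SELECT value FROM kv WHERE key = ?", ("beta",)).fetchone()
print(row[0])  # prints "2"
```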

For large, persistent databases it also works well. For speed, batch your writes into large transactions instead of committing per key.
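To illustrate the transaction point, here is a sketch that inserts a batch of rows under a single commit; a file path would replace :memory: for a persistent database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path for a persistent DB
conn.execute("CREATE TABLE entries (word TEXT)")
words = [("word%d" % i,) for i in range(10000)]

# One transaction for the whole batch -- far faster than one commit per row,
# because each commit forces SQLite to sync to durable storage.
with conn:  # the connection context manager commits once on success
    conn.executemany("INSERT INTO entries VALUES (?)", words)

count = conn.execute("SELECT COUNT(*) FROM entries").fetchone()[0]
print(count)  # 10000
```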

The suggested and appropriate uses for SQLite as a datastore are covered on its website: http://www.sqlite.org/features.html




If Raymond Hettinger also recommends SQLite, then that's probably your best bet.

But a native Python solution would be to use a "pickle" file. You would build a Python dict that holds the data, then "pickle" the dict; later you could "unpickle" the dict. If you only have one key you need to search on, then this might possibly be a good way to go.

For Python 2.x you would likely want to use the cPickle module. For Python 3.x there is only pickle, but it automatically uses the fast C implementation when available, so it should be comparably fast.

http://docs.python.org/library/pickle.html
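A short sketch of the pickle approach described above; the dict contents and file name are purely illustrative:

```python
import os
import pickle
import tempfile

# Build a Python dict holding the data, keyed by whatever you search on.
data = {"pass": ["password", "passport"], "let": ["letmein"]}

# Pickle the dict to a file (hypothetical path)...
path = os.path.join(tempfile.gettempdir(), "words.pkl")
with open(path, "wb") as f:
    pickle.dump(data, f, protocol=pickle.HIGHEST_PROTOCOL)

# ...and later unpickle it back into memory.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == data)  # True
os.remove(path)
```

The trade-off is that the whole dict must fit in memory at once, whereas SQLite can query a large dataset without loading all of it.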

On the other hand if your data set is truly large, so large that SQLite is starting to choke on it, then instead of splitting it up into multiple smaller SQLite files and managing them, it might make sense to just dump everything into a real database such as PostgreSQL.



Semi off-topic, here are some useful links.

THC-Hydra :P

Also, here is a great video on password policies and using them for brute forcing.

http://www.irongeek.com/i.php?page=videos/hack3rcon2/martin-bos-your-password-policy-sucks

