I'm new to Python, so below is a (poorly written) script that tries to find the password of a zip file using a mix of "brute force" and "dictionary" approaches.

Example: if a dictionary line is "thisisastringdictionnary" and the password is "astring", the script will find it, because it tests every possible substring of the line within a certain limit (for example 500 characters).
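
The enumeration described above could be sketched as follows. This is only an illustration written for Python 3: `candidate_substrings` and `max_len` are hypothetical names, and the sketch assumes the 500 limit means a maximum substring length.

```python
def candidate_substrings(line, max_len=500):
    """Yield every non-empty substring of `line` of at most `max_len` characters."""
    n = len(line)
    for start in range(n):
        # The end index stops at the end of the line or at the length
        # limit, whichever comes first, so no empty string is produced.
        for end in range(start + 1, min(n, start + max_len) + 1):
            yield line[start:end]


if __name__ == "__main__":
    for candidate in candidate_substrings("abc"):
        print(candidate)
```

Starting the end index at `start + 1` means each substring is produced exactly once per position and no attempt is spent on an empty password.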

The script below works fine but it's very slow. That's why I would like to use pool/multiprocessing.

Script without multiprocessing (working):

import zipfile
import thread
import sys
zipfilename = 'toopen.zip'
dictionary = 'dictionnary.txt'
zip_file = zipfile.ZipFile(zipfilename)
def openZip(sub, start, stringLen):
    try:
        zip_file.extractall(pwd=sub)
        sub = 'Password found: %s' % sub
        print sub
        sys.exit(0)
    except SystemExit:
        sys.exit(0)
    except:
        print str((start/float(stringLen))*100)+"%"
        pass
def main():
    password = None
    zip_file = zipfile.ZipFile(zipfilename)
    with open(dictionary, 'r') as f:
        for line in f.readlines():
            password = line.strip('\n')
            for start in range(len(password)):
                for index in range(500):
                    sub = password[start:index+1]
                    openZip(sub, start, len(password));

if __name__ == '__main__':
    main()

In my attempts with multiprocessing I ran into several problems:

  • the script won't stop/exit when the password is found
  • the printing inside the try/except comes out jumbled (every process prints in no particular order), so the progress indicator no longer works
  • I'm not even sure I'm doing this right :/

Below is my attempt:

import zipfile
import thread
import sys
from multiprocessing import Pool

zipfilename = 'toopen.zip'
dictionary = 'dictionnary.txt'
zip_file = zipfile.ZipFile(zipfilename)
def openZip(sub):
    try:
        zip_file.extractall(pwd=sub[0])
        sub = 'Password found: %s' % sub[0]
        print sub[0]
        sys.exit(0)
    except SystemExit:
        sys.exit(0)
    except:
        print str((sub[1]/float(sub[2]))*100)+"%"
        pass
def main():
    p = Pool(4)

    password = None
    zip_file = zipfile.ZipFile(zipfilename)
    with open(dictionary, 'r') as f:
        for line in f.readlines():
            password = line.strip('\n')
            pwdList = []
            for start in range(len(password)):
                for index in range(500):
                    sub = password[start:index+1]
                    pwdList.append([sub, start, len(password)])
            p.map(openZip, pwdList)

if __name__ == '__main__':
    main()

I'm probably missing something trivial, but I'm having a hard time understanding how to use multiprocessing properly. Any help would be greatly appreciated. :)

1 Answer

Two things:

1) The progress indicator code needs rethinking in a multi-process program

Several workers run in parallel (multiprocessing.Pool starts worker processes, not threads). Each print statement reaches stdout whenever its worker happens to be scheduled, so the progress output comes out jumbled. Since you are tracking progress per dictionary line, you could print the worker's process id along with the progress indicator. Better still, print the dictionary line/password that the worker is currently processing.

Another approach is to print overall progress in terms of lines processed from the dictionary file. If a worker has just processed the 7th line of a 10-line dictionary file, a progress of 70% can be displayed when it finishes. Note that the accuracy of this indicator again depends on scheduling: the worker processing line 6 may finish after the one processing line 7, so 70% would be displayed first and then 60%. This can be avoided by storing the highest line number processed so far and displaying progress based on that maximum. An approximation of progress should be sufficient. If more accuracy is needed, it gets more complex, because the workers have to be synchronized to record progress.
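
One way to get a single, ordered progress display is to do no printing in the workers and count finished tasks in the parent instead. The sketch below is written for Python 3 and is only an illustration: `check`, `run_with_progress` and `report_every` are hypothetical names, and `check` compares against a made-up password rather than opening a zip file.

```python
from multiprocessing import Pool


def check(candidate):
    """Hypothetical stand-in for one password attempt; it prints nothing."""
    return candidate == "astring"


def run_with_progress(candidates, processes=4, report_every=1000):
    """Run check() over all candidates and print progress from the parent only."""
    total = len(candidates)
    done = 0
    pool = Pool(processes)
    try:
        # imap_unordered hands back each result as soon as a worker finishes it.
        results = pool.imap_unordered(check, candidates, chunksize=100)
        for done, _ in enumerate(results, 1):
            # Only the parent prints, so the percentage never goes backwards.
            if done % report_every == 0 or done == total:
                print("%.1f%%" % (100.0 * done / total))
    finally:
        pool.close()
        pool.join()
    return done


if __name__ == "__main__":
    run_with_progress(["x"] * 5000)
```

Because the count is kept in the parent, it measures work the pool has actually completed, not how far the candidate list has been built.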

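2) Exiting the whole program when the password is found

sys.exit() ends only the worker that calls it: it raises SystemExit in that worker, while the parent process and the other workers keep running. To stop everything, use os._exit or another mechanism. Note that os._exit called inside a pool worker also ends only that worker, so the stop has to be triggered from the parent process, for example by having the worker return the password and letting the parent shut the pool down.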
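
A minimal sketch of one such mechanism, written for Python 3: the worker returns its result instead of exiting, and the parent stops the pool at the first hit. `SECRET`, `check` and `find_password` are hypothetical names used only to make the example runnable. In the real script, `check` would try the candidate against the zip file.

```python
from multiprocessing import Pool

SECRET = "astring"  # made-up password, only here so the sketch can run


def check(candidate):
    """Stand-in for the real zip test.

    Returns the candidate on success and None on failure, so the outcome
    travels back to the parent instead of the worker calling sys.exit().
    """
    return candidate if candidate == SECRET else None


def find_password(candidates, processes=4):
    """Return the first matching candidate, or None if nothing matches."""
    pool = Pool(processes)
    try:
        # Results arrive as soon as any worker finishes a task.
        for result in pool.imap_unordered(check, candidates, chunksize=100):
            if result is not None:
                return result
        return None
    finally:
        # terminate() stops the remaining workers without finishing queued work.
        pool.terminate()
        pool.join()


if __name__ == "__main__":
    print(find_password(["this", "is", "astring", "dictionnary"]))
```

With `p.map`, the parent only gets control back after every task has run. Iterating over `imap_unordered` lets it react to the first success.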

6 Comments

I don't understand the first point: why would I have to iterate over my sublist instead of accessing the arguments like I do? Also, the other arguments are only there to display the progress (badly done). Where should I put the progress print? Thanks for your answer already :)
Have you tried printing sub[0] in the openZip method? It will be a list. You could display progress only once, when a worker is about to finish. In the openZip method, it would go after the suggested for loop.
No, it's not a list. pwdList is a list of lists, but the map function passes each inner list as sub, so sub is a list and sub[0] is the first argument...
In that case, you can print the status after p.map. It should not be in openZip.
Yes, I tried, but it's not going to work: it would measure pwdList being built, not the work done by the pool.