I need to write a script that extracts all the data from a client's files without launching the application they were created in. The application saves all data to an XML file, encrypts only the content of that file before compressing it, and finally changes the extension of the compressed file to make it "more difficult" to recognize.
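Since only the extension is changed, the compression format can probably be identified from the first few bytes of the file. Here is a minimal sketch in Python, assuming the container is one of a handful of common formats. The names `MAGIC`, `sniff_container` and `sniff_file` are made up for illustration, and the real format may well not be in this short list.

```python
# Well-known leading "magic" bytes for a few common compressed formats.
# This table is deliberately short; an "unknown" result only means the
# format is not one of these five.
MAGIC = {
    b"PK\x03\x04": "zip",
    b"\x1f\x8b": "gzip",
    b"BZh": "bzip2",
    b"\xfd7zXZ\x00": "xz",
    b"7z\xbc\xaf\x27\x1c": "7z",
}


def sniff_container(head: bytes) -> str:
    """Guess the container format from the first bytes of a file."""
    for magic, name in MAGIC.items():
        if head.startswith(magic):
            return name
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 8 bytes of a file and guess its container format."""
    with open(path, "rb") as fh:
        return sniff_container(fh.read(8))
```

If this reports a known format, the matching standard-library module (`zipfile`, `gzip`, `bz2`, `lzma`) should be able to unpack the file regardless of its extension.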
The application has a text viewer that shows part, but not all, of the data in a client file, so my coworkers and I have to combine copying that text with tabbing through the fields in all the other areas we need data from. Using the application itself is slow and practically useless, as it has horrendous memory issues that make it crash constantly.
Anyway - I have been able to figure out some of the basics because the encryption seems fairly weak, or at least its pattern is easy to see. The same characters always produce the same output, in every client file from every customer on every machine I will be using, so the encryption is identical across all files.
It works on blocks of three characters, turning each block into four characters (one more than went in), and starts a new block on the fourth character. = appears to act as a null or filler character when a block is short.
For example: A becomes QQ==, AA is QUE=, AAA is QUFB, and AAAA is QUFBQQ==
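The three-in, four-out grouping with = as filler has the same shape as standard Base64, which is an encoding rather than encryption. That is only a guess from four samples, so here is a minimal sketch in Python that tests it. `SAMPLES` and `matches_base64` are illustrative names, and real files could still use a modified alphabet or an extra layer that these short samples do not reveal.

```python
import base64

# Plaintext/output pairs copied from the examples above.
SAMPLES = {
    "A": "QQ==",
    "AA": "QUE=",
    "AAA": "QUFB",
    "AAAA": "QUFBQQ==",
}


def matches_base64(samples: dict) -> bool:
    """Return True if every observed pair agrees with standard Base64."""
    return all(
        base64.b64encode(plain.encode("utf-8")).decode("ascii") == coded
        for plain, coded in samples.items()
    )


print(matches_base64(SAMPLES))                       # True
print(base64.b64decode("QUFBQQ==").decode("utf-8"))  # AAAA
```

If the check keeps passing as more pairs are added, `base64.b64decode` on the encrypted field contents would be the first thing to try on real data.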
I've worked out the single-character case: it's just a list of all the Unicode characters that are changed. For example, QQ== is A, Qg== is B, Qw== is C, then we move to RA== for D, RQ== for E, and cycle onwards through the character table.
It starts to get fun when we introduce a second and/or third character to the string. Now !0 is ITA=, and each step in the second character moves the third output character forward four places: ITE= for !1, ITY= for !6, then rolling over to ITc= for !7, and so on up to IT8= for !?. After that the second output character advances and the third starts over, so !@ is IUA=. And so on.
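The step of four is what you would expect if the output characters are 6-bit indexes into a 64-character alphabet. With two input bytes, the third output character carries only the low 4 bits of the second byte, shifted left by 2, so adding 1 to the input moves the output 4 places. Every 16th step those 4 bits wrap around and the second output character advances instead. Here is a minimal sketch in Python, assuming the standard Base64 alphabet. `ALPHABET` and `encode_pair` are illustrative names, not anything taken from the application.

```python
import string

# Standard Base64 alphabet: A-Z, a-z, 0-9, then + and /.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_pair(text: str) -> str:
    """Hand-encode exactly two single-byte characters the way Base64 would."""
    b0, b1 = text.encode("latin-1")
    i0 = b0 >> 2                          # top 6 bits of the first byte
    i1 = ((b0 & 0x03) << 4) | (b1 >> 4)   # low 2 bits of byte 0 + top 4 bits of byte 1
    i2 = (b1 & 0x0F) << 2                 # low 4 bits of byte 1, padded with two zero bits
    return ALPHABET[i0] + ALPHABET[i1] + ALPHABET[i2] + "="


for pair in ("!0", "!1", "!6", "!7"):
    print(pair, encode_pair(pair))
# !0 ITA=
# !1 ITE=
# !6 ITY=
# !7 ITc=
```

The "rolling back" from uppercase to lowercase at !7 is just the index passing 25 (Z) and continuing into the lowercase part of the alphabet.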
Anyway, I would love some pointers on a few things here. How do I take what I know and find the algorithm? And from there, how do I use that to decrypt the rest of the data?