0

I am trying to compare strings from two text files. The text is produced by the BinText application, which reads an .exe and writes a file in a format such as the one below:

File pos Mem pos ID Text

======== ======= == ====

00000000004D 00000040004D 0 !This program cannot be run in DOS mode.

0000000000A0 0000004000A0 0 Rich!

I tried line.split with a single space as the separator, but that also splits the content of the last column on its spaces. Instead of [!This program cannot be run in DOS mode.] I got [!This,program,cannot,be,run,in,DOS,mode.]

Is there a simple way to put the entire column 3 (the Text column) from the txt file into the array without splitting it?

2
  • 1
    Did you utilize split's count parameter? Commented Aug 7, 2014 at 21:27
  • I didn't. I added the count, experimented with the parameter, and got the result I wanted. Thanks! Commented Aug 7, 2014 at 22:21

4 Answers

2

How about this:

data = []
for line in input_file:
    data.append(line.strip().split(' ', 3))

This will give you:

['00000000004D', '00000040004D', '0', '!This program cannot be run in DOS mode.']
['0000000000A0', '0000004000A0', '0', 'Rich!']

Documentation on split() function
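
The same idea applied to a whole file can be sketched as follows. This is a minimal sketch, not part of the original answer: the function name parse_bintext_lines and the rule for skipping the two header lines are assumptions based on the sample in the question. It passes None as the separator so that runs of spaces between columns are treated as one delimiter.

```python
def parse_bintext_lines(lines):
    """Split each data line into [file_pos, mem_pos, id, text]."""
    rows = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and the two header lines (assumed layout).
        if not line or line.startswith(("File pos", "=")):
            continue
        # sep=None splits on runs of whitespace; maxsplit=3 keeps the text whole.
        parts = line.split(None, 3)
        if len(parts) == 4:
            rows.append(parts)
    return rows


sample = [
    "File pos Mem pos ID Text",
    "======== ======= == ====",
    "",
    "00000000004D 00000040004D 0 !This program cannot be run in DOS mode.",
    "0000000000A0 0000004000A0 0 Rich!",
]
rows = parse_bintext_lines(sample)
for row in rows:
    print(row)
```

With a real file, the list could be replaced by an open file object, since iterating over a file also yields one line at a time.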


1 Comment

I used this advice and increased the split count, since the columns in the file are not separated by exactly one space. I tried to post the actual formatting initially, but it showed only one space. Issue resolved.
1

If the first part of the string has a constant length, you can use slicing:

In [1]: s = '00000000004D 00000040004D 0 !This program cannot be run in DOS mode.'

In [2]: s[28:]
Out[2]: '!This program cannot be run in DOS mode.'
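
If the other columns are needed too, the same slicing approach can be extended. This is only a sketch: the offsets below assume 12-character position fields, a 1-character ID, and single-space separators, as in the sample line, and would need adjusting for the real file.

```python
s = '00000000004D 00000040004D 0 !This program cannot be run in DOS mode.'

# Assumed fixed layout: 12 chars, space, 12 chars, space, 1 char, space, text.
file_pos = s[0:12]
mem_pos = s[13:25]
str_id = s[26:27]
text = s[28:]

print(file_pos, mem_pos, str_id, text, sep=' | ')
```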


1

As you can see in the Python docs, the str.split method has an optional maxsplit argument which, if given, specifies the maximum number of splits performed on the string.

Assuming that you already know how to read the file you can specify a maximum of 3 splits:

data = "00000000004D 00000040004D 0 !This program cannot be run in DOS mode."
data.split(None, 3)
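
The choice of separator matters when columns are padded with more than one space. The sketch below uses a made-up line with three spaces between columns (an assumption about the real file, which the question does not show exactly) to compare a literal ' ' separator with None.

```python
line = "00000000004D   00000040004D   0   !This program cannot be run in DOS mode."

# A literal ' ' separator treats every single space as a delimiter,
# so the padding produces empty fields and uses up the split budget.
by_space = line.split(' ', 3)

# None splits on runs of whitespace, so padding counts as one delimiter.
by_whitespace = line.split(None, 3)

print(by_space)
print(by_whitespace)
```

Here by_space starts with '00000000004D', '', '' and leaves the remaining columns fused together, while by_whitespace yields the four expected fields.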


0
In [93]: s = "00000000004D 00000040004D 0 !This program cannot be run in DOS mode."

In [94]: s.rsplit("0",1)[-1] # rsplit once on the 0
Out[94]: ' !This program cannot be run in DOS mode.'

In [95]: import re

In [96]: re.split(r"\d\s", s)[-1] # single digit followed by whitespace
Out[96]: '!This program cannot be run in DOS mode.'
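
Both one-liners above rely on the text column not containing a "0" or a digit followed by whitespace. A variant that anchors the pattern at the start of the line avoids that dependency. This is a sketch rather than the answerer's method; the pattern name LINE_RE and the second sample string are made up for illustration.

```python
import re

# Three whitespace-delimited fields, then everything else as the text column.
LINE_RE = re.compile(r"(\S+)\s+(\S+)\s+(\S+)\s+(.*)")

s = "00000000004D 00000040004D 0 !This program cannot be run in DOS mode."
tricky = "0000000000A0 0000004000A0 0 Version 2 0 build"  # hypothetical text

fields = LINE_RE.match(s).groups()
tricky_text = LINE_RE.match(tricky).group(4)

print(fields)
print(tricky_text)
```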
