I have a file containing lines like

12  45 some text
56 78      #another type of text
22     34 after column 2 are other data

I need to split each line, storing the first two elements in two variables and the text after the second column in a third variable. In C, this can be accomplished with sscanf():

sscanf(line, "%d %d %[^\n]", &a, &b, textArray);

I know about the scanf Python module, but apparently it is not part of the standard library and it is not packaged in Debian.

How can you do this using the standard Python tools?

2 Answers

split is all you need.

 line.split(None, 2)

From the docs for split:

string.split(s[, sep[, maxsplit]])

Return a list of the words of the string s. If the optional second argument sep is absent or None, the words are separated by arbitrary strings of whitespace characters (space, tab, newline, return, formfeed). If the second argument sep is present and not None, it specifies a string to be used as the word separator. The returned list will then have one more item than the number of non-overlapping occurrences of the separator in the string. If maxsplit is given, at most maxsplit number of splits occur, and the remainder of the string is returned as the final element of the list (thus, the list will have at most maxsplit+1 elements). If maxsplit is not specified or -1, then there is no limit on the number of splits (all possible splits are made).

The behavior of split on an empty string depends on the value of sep. If sep is not specified, or specified as None, the result will be an empty list. If sep is specified as any string, the result will be a list containing one element which is an empty string.
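As a minimal sketch of how this could be applied to the lines in the question: the helper name `parse_line` and the choice to raise `ValueError` on short lines are assumptions made for illustration, not part of the original answer. Because `maxsplit` is 2, the third element keeps its internal whitespace exactly as it appears in the line.

```python
def parse_line(line):
    """Split a line into two integers and the untouched remainder.

    parse_line is a hypothetical helper name used only for this sketch.
    """
    # Drop the trailing newline so it does not end up in the text field.
    parts = line.rstrip('\n').split(None, 2)
    if len(parts) < 2:
        raise ValueError('expected at least two columns: %r' % line)
    a = int(parts[0])
    b = int(parts[1])
    # The remainder may be missing if the line has only two columns.
    text = parts[2] if len(parts) > 2 else ''
    return a, b, text


if __name__ == '__main__':
    for sample in ['12  45 some text\n',
                   '56 78      #another type of text\n',
                   '22     34 after column 2 are other data\n']:
        print(parse_line(sample))
```

Converting with `int()` mirrors the `%d` conversions in the C call; if the first two columns are not always integers, the strings can be kept as they are instead.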

1 Comment

You are right, thanks. If line = "22 34 after column 2 are other data", then parts = line.split(None, 2) does the job.

Assuming that the first elements of the string are numbers, I would suggest something like

def split(line):
    # Split on any run of whitespace
    list0 = line.split()
    # Keep only the tokens made entirely of digits
    list1 = [y for y in list0 if y.isdigit()]
    # Everything after the first two tokens, rejoined with single spaces
    rest = ' '.join(list0[2:])

    a = list1[0]
    b = list1[1]
    return a, b, rest

# ex:

print(split('22     34 after column 2 are other data'))

# output >> ('22', '34', 'after column 2 are other data')

4 Comments

Thank you for your answer. But try applying it to the line "22 34 after column 2 are other data". In this case there is a number after column 2 that should be kept in rest. Also, the spaces between 22 and 34 should not end up in front of the text in rest.
Thank you. I suppose the line list0= [x for x in line.split() if x != ''] can be replaced by simply list0=line.split()
Hmm, well, the method is not totally correct. For example, if the line is "22 34 after column 2 are other data", i.e., if the data after column 2 are separated by an arbitrary number of spaces, I need this second part exactly as it is in the file. The join method separates the elements by only one space, so the behavior of the function is not the same as sscanf in C.
@jgpallero so you need the whitespace too?
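For the whitespace concern raised in these comments, one possible alternative is a regular expression that captures the remainder of the line untouched. This is only a sketch under stated assumptions: the names `LINE_RE` and `scan_line` are hypothetical, the first two columns are assumed to be optionally signed integers, and non-matching lines return `None`.

```python
import re

# Hypothetical pattern: two integers separated by whitespace, then the
# rest of the line captured as-is (the dot does not match a newline).
LINE_RE = re.compile(r'\s*([-+]?\d+)\s+([-+]?\d+)[ \t]*(.*)')


def scan_line(line):
    """Return (int, int, rest) or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2)), m.group(3)


if __name__ == '__main__':
    print(scan_line('22     34 after   column 2 are other data\n'))
```

Unlike the join-based function, the spacing inside the third field is preserved, and the spaces between the first two numbers never reach it.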
