2

I'm trying to split a string into an int list for further processing, but I can't remove the whitespace between the elements of the list. The string x is supposed to have a length of 1000 instead of 1019. I read the Python documentation and found strip() for stripping whitespace from strings, but it only removes leading and trailing whitespace. How should I go about removing this whitespace, and how do I convert a str list to an int list? My code is as follows:

import array

x = """73167176531330624919225119674426574742355349194934
96983520312774506326239578318016984801869478851843
85861560789112949495459501737958331952853208805511
12540698747158523863050715693290963295227443043557
66896648950445244523161731856403098711121722383113
62229893423380308135336276614282806444486645238749
30358907296290491560440772390713810515859307960866
70172427121883998797908792274921901699720888093776
65727333001053367881220235421809751254540594752243
52584907711670556013604839586446706324415722155397
53697817977846174064955149290862569321978468622482
83972241375657056057490261407972968652414535100474
82166370484403199890008895243450658541227588666881
16427171479924442928230863465674813919123162824586
17866458359124566529476545682848912883142607690042
24219022671055626321111109370544217506941658960408
07198403850962455444362981230987879927244284909188
84580156166097919133875499200524063689912560717606
05886116467109405077541002256983155200055935729725
71636269561882670428252483600823257530420752963450"""

y=[]  

for i in range(0,len(x)): #String is now in a string list
    if x[i]!='':
        y.append(x[i])
        print(y[i])

print(len(x))

9 Answers

4

See this SO question. Not an exact duplicate but the answer is what you need :)

''.join(x.split())
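
A minimal sketch of why this works, using a short stand-in string rather than the full 1000-digit one (the names `sample`, `cleaned`, and `digits` are made up for illustration). Called with no arguments, `split()` breaks the string on any run of whitespace, newlines included, and `''.join()` glues the pieces back together:

```python
# Short stand-in for the multi-line string in the question.
sample = """12345
67890
13579"""

# split() with no arguments splits on any run of whitespace
# (spaces, tabs, newlines), so the newlines disappear here.
cleaned = ''.join(sample.split())

# One int per character of the cleaned string.
digits = [int(c) for c in cleaned]

print(len(sample), len(cleaned))  # 17 15
print(digits[:5])                 # [1, 2, 3, 4, 5]
```

Applied to the string in the question, the same two steps should take the length from 1019 to 1000.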

2 Comments

Wow that was a real elegant method! Thanks!
@paradox thanks, but as said I stole it from the other SO question :)
2
>>> x = """73167176531330624919225119674426574742355349194934
... 96983520312774506326239578318016984801869478851843
... 85861560789112949495459501737958331952853208805511
... 12540698747158523863050715693290963295227443043557
... 66896648950445244523161731856403098711121722383113
... 62229893423380308135336276614282806444486645238749
... 30358907296290491560440772390713810515859307960866
... 70172427121883998797908792274921901699720888093776
... 65727333001053367881220235421809751254540594752243
... 52584907711670556013604839586446706324415722155397
... 53697817977846174064955149290862569321978468622482
... 83972241375657056057490261407972968652414535100474
... 82166370484403199890008895243450658541227588666881
... 16427171479924442928230863465674813919123162824586
... 17866458359124566529476545682848912883142607690042
... 24219022671055626321111109370544217506941658960408
... 07198403850962455444362981230987879927244284909188
... 84580156166097919133875499200524063689912560717606
... 05886116467109405077541002256983155200055935729725
... 71636269561882670428252483600823257530420752963450"""
>>> len(x)
1019
>>> x = x.replace("\n", "")
>>> len(x)
1000

At this point you can handle x however you like, such as to convert it to an int:

>>> x = int(x)

Or convert it to a list of ints, one per character:

>>> x = [int(c) for c in x]

You can also do this in place in your source, which you might find more convenient:

>>> x = """731...
... ...450""".strip().replace("\n", "")
# strip used if you want to include extra leading or trailing whitespace
# for formatting

1 Comment

I like it, this seems to express the programmer's intention better than other solutions.
1

Easier than all this: you can remove the newlines from the literal string by escaping them with backslashes:

x = """73167176531330624919225119674426574742355349194934\
96983520312774506326239578318016984801869478851843\
85861560789112949495459501737958331952853208805511\
12540698747158523863050715693290963295227443043557\
66896648950445244523161731856403098711121722383113\
62229893423380308135336276614282806444486645238749\
30358907296290491560440772390713810515859307960866\
70172427121883998797908792274921901699720888093776\
65727333001053367881220235421809751254540594752243\
52584907711670556013604839586446706324415722155397\
53697817977846174064955149290862569321978468622482\
83972241375657056057490261407972968652414535100474\
82166370484403199890008895243450658541227588666881\
16427171479924442928230863465674813919123162824586\
17866458359124566529476545682848912883142607690042\
24219022671055626321111109370544217506941658960408\
07198403850962455444362981230987879927244284909188\
84580156166097919133875499200524063689912560717606\
05886116467109405077541002256983155200055935729725\
71636269561882670428252483600823257530420752963450"""

print(len(x))

-> 1000

~ray

1 Comment

I would not call it easier since you have to edit each line of the input string/variable but it is a great alternative way to approach the problem.
0

There are many ways of removing whitespace from strings in Python. Here are two ways I know of:

import string
string.join(str.split(), "")  # Python 2 only; str here is your string, not the built-in type

Or,

import re
re.sub(r"\s+", "", str)
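
For ordinary whitespace the two approaches should give the same result. Here is a small sketch with a made-up string `s` (not from the question), using the `''.join()` spelling that also runs on Python 3 and a raw-string pattern so the backslash in `\s` reaches the regex engine untouched:

```python
import re

# Hypothetical input mixing a space, a newline, and a tab.
s = "12 34\n56\t78"

# Split on runs of whitespace, then join the pieces with nothing between them.
via_join = "".join(s.split())

# Replace every run of whitespace with the empty string.
via_regex = re.sub(r"\s+", "", s)

print(via_join)   # 12345678
print(via_regex)  # 12345678
```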

2 Comments

I like ''.join(str.split()) more. It's shorter :)
It looks nicer, it's just harder for me to remember. :)
0

Maybe you can rewrite your string like this:

x = ("1234" +
     "5678"
    )

This will avoid the newlines in the string, which you get using multiline strings.
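
As a side note, Python also concatenates adjacent string literals at compile time, so inside parentheses the `+` can be dropped. A short sketch with the same placeholder digits:

```python
# Adjacent string literals are joined automatically; the parentheses
# let the expression span several lines without backslashes.
x = ("1234"
     "5678")

print(x)       # 12345678
print(len(x))  # 8
```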

Comments

0

For all whitespace (\n and ' '):

import re
y = [int(i) for i in re.sub(r'\s', '', x)]

For your specific string which actually contains newlines instead of ' ':

y = [int(i) for i in x if i != '\n']

Comments

0

This should produce an int list:

x = [int(i) for i in x if i.isdigit()]

1 Comment

It will, but it would be a list of individual digits, not of the large whitespace-separated numbers in the OP.
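
To illustrate the difference this comment points at, here is a sketch on a short stand-in string (the names `sample`, `per_digit`, and `per_line` are hypothetical). Which result is wanted depends on what the list is for:

```python
# Short stand-in for the multi-line string in the question.
sample = """314
159
265"""

# One int per digit character; whitespace is skipped by the isdigit() test.
per_digit = [int(c) for c in sample if c.isdigit()]

# One int per whitespace-separated chunk, i.e. one number per line here.
per_line = [int(chunk) for chunk in sample.split()]

print(per_digit)  # [3, 1, 4, 1, 5, 9, 2, 6, 5]
print(per_line)   # [314, 159, 265]
```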
0

Try this:

import string

y = [int(elt) for elt in x if elt not in string.whitespace]

(thanks tgray)

3 Comments

Nope doesn't work, I get an error : ValueError: invalid literal for int() with base 10: ' '
Yeah, that's because you also have return characters. See the edited version.
You probably want to use string.whitespace in your conditional. i.e. import string; y = [int(i) for i in x if i not in string.whitespace]
0

Use re.sub(r"\s", "", current_string) to remove the whitespace.
As for converting the str list to an int list, simply convert each str to int as you split the values into the list. I'm not sure how you plan to do the splitting, so I can't write actual code for the exact conversion.

Comments
