I have a 1996 * 9 array:

array([[ 0.,  1.,  1., ...,  1.,  1.,  0.],
       [ 1.,  1.,  0., ...,  1.,  0.,  1.],
       [ 0.,  1.,  1., ...,  1.,  1.,  0.],
       ..., 
       [ 0.,  0.,  0., ...,  0.,  0.,  1.],
       [ 0.,  1.,  1., ...,  1.,  1.,  0.],
       [ 0.,  1.,  1., ...,  1.,  1.,  0.]])

I want a 1996 * 1 array.

What I did:

pd.DataFrame(train_L.astype(int)).apply(lambda x: ''.join(str(x)), axis = 1)

I get

0       0    0\n1    1\n2    1\n3    1\n4    1\n5    1...
1       0    1\n1    1\n2    0\n3    0\n4    0\n5    0...
2       0    0\n1    1\n2    1\n3    0\n4    1\n5    1...
3       0    0\n1    1\n2    1\n3    0\n4    1\n5    1...
4       0    1\n1    0\n2    0\n3    0\n4    0\n5    0...

The problem:

  1. It introduced an extra all-zero column.
  2. It introduced \n1.
  3. It converts the type too many times.

My question: Is there an easy way to do the merge without such caveats?


Example output

What I have:

v1 v2 v3 ... v9
1  0  0  ... 1

I want:

      v1
1\t0\t0\t...\t1

  1. The number of columns is reduced to 1.
  2. The elements are separated by \t.

Why I need such a weird form:

For image processing, we have one column for the labels of an image. However, one image may have multiple labels, so I have to squeeze multiple labels into one column. That's a requirement of the library.

  • What do you mean by a 1996 x 1 array? You're not concatenating or reshaping here; can you explain the desired output? Commented Feb 16, 2016 at 19:35
  • I updated my question. Is that clear? Commented Feb 16, 2016 at 19:38
  • @EdChum: He wants all the columns to be merged into one with ''.join() Commented Feb 16, 2016 at 19:40
  • So you want df.astype(str).apply(lambda x: ''.join(x), axis=1)? Commented Feb 16, 2016 at 19:41
  • What you want is now clear, but I feel I should mention that's a very unusual thing to want. Commented Feb 16, 2016 at 19:42

2 Answers

This results in a string, which is probably not what you want. Perhaps you should explain why you would like your data in your requested format.

a = np.array([[ 0.,  1.,  1., 1.,  1.,  0.],
              [ 1.,  1.,  0., 1.,  0.,  1.],
              [ 0.,  1.,  1., 1.,  1.,  0.],
              [ 0.,  0.,  0., 0.,  0.,  1.],
              [ 0.,  1.,  1., 1.,  1.,  0.],
              [ 0.,  1.,  1., 1.,  1.,  0.]])

v = pd.DataFrame(['\t'.join([str(val) for val in row]) for row in a], columns=['v1'])

for row in v.iterrows():
    print(row[1].v1)
0.0     1.0     1.0     1.0     1.0     0.0
1.0     1.0     0.0     1.0     0.0     1.0
0.0     1.0     1.0     1.0     1.0     0.0
0.0     0.0     0.0     0.0     0.0     1.0
0.0     1.0     1.0     1.0     1.0     0.0
0.0     1.0     1.0     1.0     1.0     0.0

>>> v
                             v1
0  0.0\t1.0\t1.0\t1.0\t1.0\t0.0
1  1.0\t1.0\t0.0\t1.0\t0.0\t1.0
2  0.0\t1.0\t1.0\t1.0\t1.0\t0.0
3  0.0\t0.0\t0.0\t0.0\t0.0\t1.0
4  0.0\t1.0\t1.0\t1.0\t1.0\t0.0
5  0.0\t1.0\t1.0\t1.0\t1.0\t0.0
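
If the labels should read 0/1 rather than 0.0/1.0, one option is to cast the array to int once before joining. This is only a sketch on a small stand-in array (the name `a` and the column name `v1` follow the example above; the two rows are illustrative data, not the real 1996 x 9 array):

```python
import numpy as np
import pandas as pd

# Small stand-in for the real label array.
a = np.array([[0., 1., 1., 1., 1., 0.],
              [1., 1., 0., 1., 0., 1.]])

# Cast to int once, then join each row's values with a tab.
rows = ['\t'.join(map(str, row)) for row in a.astype(int)]
v = pd.DataFrame(rows, columns=['v1'])

print(v.shape)
print(v.loc[0, 'v1'])
```

The result has one column and one tab-separated string per input row.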

1 Comment

I updated the question to explain why I need such a weird form.

You can apply a lambda after converting the dtype to str:

In [14]:

df = pd.DataFrame(np.random.randn(4,5))
df

Out[14]:
          0         1         2         3         4
0  1.036485 -1.243777  1.286254  1.973786 -0.083245
1  1.698828  1.696846  0.037732 -0.630546 -0.135069
2 -1.231337 -1.166480  0.046414 -0.965710  1.341809
3  0.591176  0.275267 -0.446553 -0.230353  0.258817

In [16]:
df.astype(str).apply(lambda x: ''.join(x), axis=1)

Out[16]:
0    1.03648484941-1.243776761241.286253591521.9737...
1    1.698827772721.696846119330.0377324485782-0.63...
2    -1.23133722226-1.166480155330.046414100678-0.9...
3    0.5911755605680.275266550205-0.446552705185-0....
dtype: object

If you want the values separated by a tab, you can just join with a tab:

In [17]:
df.astype(str).apply(lambda x: '\t'.join(x), axis=1)

Out[17]:
0    1.03648484941\t-1.24377676124\t1.28625359152\t...
1    1.69882777272\t1.69684611933\t0.0377324485782\...
2    -1.23133722226\t-1.16648015533\t0.046414100678...
3    0.591175560568\t0.275266550205\t-0.44655270518...
dtype: object
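
For comparison, here is a sketch of why the attempt in the question likely produced the \n output: `str(x)` on a row returns the printed form of the whole Series (index labels and newlines included), so `''.join` just rebuilds that text. Converting the elements to str first lets `join` see the individual values. The small `train_L` below is stand-in data, not the real array:

```python
import numpy as np
import pandas as pd

# Stand-in for the real train_L array.
train_L = np.array([[0., 1., 1.],
                    [1., 0., 1.]])
df = pd.DataFrame(train_L.astype(int))

# str(row) is the text representation of the whole Series,
# so joining its characters reproduces the index labels and newlines.
bad = df.apply(lambda row: ''.join(str(row)), axis=1)

# Element-wise conversion to str, then a tab join per row.
good = df.astype(str).apply(lambda row: '\t'.join(row), axis=1)

print(repr(bad[0]))
print(repr(good[0]))
```

Calling `good.to_frame('v1')` would turn the resulting Series into a one-column DataFrame if that shape is needed.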

3 Comments

Thanks! One problem: the separation (by a tab) is missing.
What separation? You stated you wanted to join without a space.
Each element is separated by \t. I am not sure whether the example output is clear.
