Revisions to Conditional Concatenation of a Pandas DataFrame

replaced http://stackoverflow.com/ with https://stackoverflow.com/

Source Link

edited May 23, 2017 at 12:40

1

There's no need to create a lambda for this.

Let's suppose we have the following dataframe:

my_df = pd.DataFrame({
    'Apple':  ['1', '4', '7'],
    'Pear':   ['2', '5', '8'],
    'Cherry': ['3', np.nan, '9']})

Which is:

Apple Cherry Pear
   1      3    2
   4    NaN    5
   7      9    8

An easier way to achieve what you want without the apply() function is:

Use iterrows() to go through the rows one by one.
Use Series() and str.cat() to join the values of each row.

You'll get this:

l = []
for _, row in my_df.iterrows():
    l.append(pd.Series(row).str.cat(sep='::'))

empty_df = pd.DataFrame(l, columns=['Result'])

Because str.cat() leaves out missing values by default (na_rep=None), NaN is dropped automatically, which gives the desired result:

Result
1::3::2
   4::5
7::9::8
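The NaN handling comes from the na_rep parameter of str.cat(). Here is a minimal sketch on a single row; the sample Series and the '?' placeholder are made up for illustration:

```python
import numpy as np
import pandas as pd

# A single row with one missing value (illustrative data).
row = pd.Series(['4', np.nan, '5'])

# With the default na_rep=None, missing values are left out of the result.
skipped = row.str.cat(sep='::')

# Passing na_rep substitutes a placeholder instead of dropping the value.
filled = row.str.cat(sep='::', na_rep='?')

print(skipped)
print(filled)
```

If the missing value has to stay visible in the merged string, na_rep is the knob to turn.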

The entire program may look like:

import pandas as pd
import numpy as np


def merge_columns(my_df):
    l = []
    for _, row in my_df.iterrows():
        l.append(pd.Series(row).str.cat(sep='::'))
    empty_df = pd.DataFrame(l, columns=['Result'])

    return empty_df.to_string(index=False)


if __name__ == '__main__':
    my_df = pd.DataFrame({
        'Apple': ['1', '4', '7'],
        'Pear': ['2', '5', '8'],
        'Cherry': ['3', np.nan, '9']})
    print(merge_columns(my_df))

I also made a couple of other changes in my answer:

added an if __name__ == '__main__' guard
moved the logic into its own function so that you can reuse it later

As @MathiasEttinger suggested, you can also rewrite the above function with a list comprehension to get slightly better performance:

def merge_columns_1(my_df):
    l = [pd.Series(row).str.cat(sep='::') for _, row in my_df.iterrows()]

    return pd.DataFrame(l, columns=['Result']).to_string(index=False)
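Since iterrows() already yields each row as a Series, the pd.Series(row) wrapper can probably be dropped. The sketch below is a hypothetical variant, not part of the original answer: the name merge_columns_2 and the sep parameter are assumptions, and it returns the DataFrame itself rather than a string so the result can be processed further.

```python
import numpy as np
import pandas as pd


def merge_columns_2(my_df, sep='::'):
    # Hypothetical variant: each row from iterrows() is already a Series,
    # so str.cat() can be called on it directly.
    merged = [row.str.cat(sep=sep) for _, row in my_df.iterrows()]
    # Return the DataFrame so the caller decides how to display it.
    return pd.DataFrame(merged, columns=['Result'])


my_df = pd.DataFrame({
    'Apple': ['1', '4', '7'],
    'Pear': ['2', '5', '8'],
    'Cherry': ['3', np.nan, '9']})

result = merge_columns_2(my_df)
print(result.to_string(index=False))
```

Calling to_string(index=False) on the result should print the same kind of table as the functions above.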

I'll leave the order of the columns as an exercise for the OP.
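For reference, one possible way to control the order is to select the columns explicitly before iterating. This is only a sketch: the column_order list is an assumption about what the OP wants.

```python
import numpy as np
import pandas as pd

my_df = pd.DataFrame({
    'Apple': ['1', '4', '7'],
    'Pear': ['2', '5', '8'],
    'Cherry': ['3', np.nan, '9']})

# Assumed order, chosen only for illustration.
column_order = ['Apple', 'Pear', 'Cherry']

# Selecting with a list of labels returns the columns in that order,
# so each row is concatenated in the same order.
ordered = [pd.Series(row).str.cat(sep='::')
           for _, row in my_df[column_order].iterrows()]

print(ordered)
```

Making the order explicit also avoids depending on how a particular pandas version orders the columns of a DataFrame built from a dict.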


replaced http://codereview.stackexchange.com/ with https://codereview.stackexchange.com/

Source Link

edited Apr 13, 2017 at 12:40

Community Bot

1


added 39 characters in body

Source Link

edited Feb 6, 2017 at 11:03

Grajdeanu Alex


added 441 characters in body

Source Link

edited Feb 6, 2017 at 10:56

Grajdeanu Alex

added 265 characters in body

Source Link

edited Feb 6, 2017 at 9:07

Grajdeanu Alex

Source Link

answered Feb 6, 2017 at 8:45

Grajdeanu Alex
