
The Scenario:

I have two dataframes, fc0 and yc0. fc0 is a cluster, and yc0 is another dataframe that needs to be merged into fc0.

The data looks like this:

fc0

     uid         1         2         3         4         5         6  
234  235  4.000000  4.074464  4.128026  3.973045  3.921663  4.024864   
235  236  3.524208  3.125669  3.652112  3.626923  3.524318  3.650589   
236  237  4.174080  4.226267  4.200133  4.150983  4.124157  4.200052

yc0

iid  uid    1    2    5    6    9    15
0    944  5.0  3.0  4.0  3.0  3.0  5.0 

The Twist

I have 1682 columns in fc0 and only a few hundred values in yc0. I need the yc0 row to go into fc0, with each value landing under its matching column.

In my haste to resolve it, I even tried yc0.reset_index(inplace=True), but that didn't really help.

Expected Output

     uid         1         2         3         4         5         6  
234  235  4.000000  4.074464  4.128026  3.973045  3.921663  4.024864   
235  236  3.524208  3.125669  3.652112  3.626923  3.524318  3.650589   
236  237  4.174080  4.226267  4.200133  4.150983  4.124157  4.200052
944  5.0       3.0       NaN       NaN       4.0       3.0       3.0

References

Link1 Tried this, but it ended up inserting NaN values for the first 16 columns and shifting the rest of the data by that many columns.

Link2 Couldn't match column keys, besides I tried it for row.

Link3 Merging doesn't match the columns in it.

Link4 Concatenation doesn't work that way.

Link5 Same issues with Join.

EDIT 1

fc0.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 235 entries, 234 to 468
Columns: 1683 entries, uid to 1682
dtypes: float64(1682), int64(1)
memory usage: 3.0 MB

and

yc0.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1 entries, 0 to 0
Columns: 336 entries, uid to 1007
dtypes: float64(335), int64(1)
memory usage: 2.7 KB
  • Is it right that the values in your yc0 DataFrame don't align with the columns in your expected output? For example in yc0 the uid value is 944, but in the expected output 944 is the index, and uid is now 5.0. Commented Dec 7, 2017 at 18:17
  • @Ben well, when I see that in show variables of Pycharm, I can see the uid is 944 but when I print it, it shows iid uid in yc0. So I'm quite confused which one is correct! Commented Dec 7, 2017 at 18:22
  • Can you show the info for both input data frames in your question. fc0.info() and yc0.info() ? Commented Dec 7, 2017 at 18:52
  • @ScottBoston Please check in Edit 1 of Question. Commented Dec 8, 2017 at 1:38
  • It seems that you need to adjust the columns in your yc0 dataframe. You could try isolating the numerical columns, shifting them with df.columns = df.columns - 1, and then using pd.concat. Pandas will align the columns based on the column index. The trick is to get the columns of your yc0 dataframe to line up correctly with your fc0 dataframe. Commented Dec 8, 2017 at 3:49
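
One way to check whether the column labels really line up, as the last comment suggests, is to list the yc0 labels that fc0 does not contain. The sketch below is only a guess at the cause: it assumes yc0's item labels are strings such as '1' while fc0's are integers. The frames and the helper normalize_label are hypothetical stand-ins, not the asker's data.

```python
import pandas as pd

# Hypothetical stand-ins: fc0 uses integer item labels, yc0 uses string labels.
fc0 = pd.DataFrame({"uid": [235], 1: [4.0], 2: [4.074464], 3: [4.128026]})
yc0 = pd.DataFrame({"uid": [944], "1": [5.0], "2": [3.0]})

# Labels in yc0 that fc0 does not know about. '1' and 1 are different labels.
missing_before = [c for c in yc0.columns if c not in fc0.columns]


def normalize_label(label):
    """Turn digit-only string labels such as '1' into the integer 1."""
    if isinstance(label, str) and label.isdigit():
        return int(label)
    return label


# rename accepts a callable and applies it to every column label.
yc0_fixed = yc0.rename(columns=normalize_label)
missing_after = [c for c in yc0_fixed.columns if c not in fc0.columns]

print(missing_before)
print(missing_after)
```

If missing_before is non-empty and missing_after is empty, the label types were the mismatch; if both are empty, the cause lies elsewhere.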

1 Answer


Here's an MCVE. Does this small sample show the behavior you are expecting? pd.concat aligns the frames on column labels and fills the gaps with NaN.

import numpy as np
import pandas as pd

# Two frames that share columns B, C and E only
df1 = pd.DataFrame(np.random.randint(0,100,(5,4)), columns=list('ABCE'))

    A   B   C   E
0  81  57  54  88
1  63  63  74  10
2  13  89  88  66
3  90  81   3  31
4  66  93  55   4

df2 = pd.DataFrame(np.random.randint(0,100,(5,4)), columns=list('BCDE'))

    B   C   D   E
0  93  48  62  25
1  24  97  52  88
2  53  50  21  13
3  81  27   7  81
4  10  21  77  19

df_out = pd.concat([df1,df2])
print(df_out)

Output:

      A   B   C     D   E
0  81.0  57  54   NaN  88
1  63.0  63  74   NaN  10
2  13.0  89  88   NaN  66
3  90.0  81   3   NaN  31
4  66.0  93  55   NaN   4
0   NaN  93  48  62.0  25
1   NaN  24  97  52.0  88
2   NaN  53  50  21.0  13
3   NaN  81  27   7.0  81
4   NaN  10  21  77.0  19
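
Applied to frames shaped like fc0 and yc0, the same idea might look like the sketch below. It rests on several assumptions: the item columns are integer labels in both frames, uid is the intended row label, and the 944 row should keep NaN for items it has no value for. The small frames are stand-ins built from the values in the question, not the real 1682-column data.

```python
import pandas as pd

# Stand-in for fc0, built from the sample rows in the question.
fc0 = pd.DataFrame(
    {
        "uid": [235, 236, 237],
        1: [4.000000, 3.524208, 4.174080],
        2: [4.074464, 3.125669, 4.226267],
        3: [4.128026, 3.652112, 4.200133],
        4: [3.973045, 3.626923, 4.150983],
        5: [3.921663, 3.524318, 4.124157],
        6: [4.024864, 3.650589, 4.200052],
    },
    index=[234, 235, 236],
)

# Stand-in for yc0: one user with ratings for only some items.
yc0 = pd.DataFrame(
    {"uid": [944], 1: [5.0], 2: [3.0], 5: [4.0], 6: [3.0], 9: [3.0], 15: [5.0]}
)

# Use uid as the row label on both sides so concat only has to align
# the item columns; labels missing on one side are filled with NaN.
combined = pd.concat([fc0.set_index("uid"), yc0.set_index("uid")])

# Keep fc0's columns in fc0's order. In this stand-in that drops items 9
# and 15; with the real fc0 (items 1 to 1682) nothing would be dropped.
combined = combined.reindex(columns=fc0.columns.drop("uid"))

print(combined)
```

Calling combined.reset_index() afterwards would turn uid back into an ordinary column if that layout is preferred.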