0

My code is downloading an excel report and turning it into a dataframe that I then create a few columns based on existing column information. I wasn't experiencing issues before, but am getting this error now:

ValueError: cannot set using a multi-index selection indexer with a different length than the value

Here is a sample of the code. The error occurs on the first line:

df.loc[df['Blank'] != 'ENDING MY','Month'] = pd.DatetimeIndex(df['Date']).month
df.loc[(df['Blank'] == 'ENDING MY') & (df['Commodity'] == 'All Upland Cotton'),'Month'] = 7
df.loc[(df['Blank'] == 'ENDING MY') & (df['Commodity'] == 'All Wheat'),'Month'] = 5
df.loc[(df['Blank'] == 'ENDING MY') & (df['Commodity'] == 'Corn'),'Month'] = 8
df.loc[(df['Blank'] == 'ENDING MY') & (df['Commodity'] == 'Soybeans'),'Month'] = 8
df.loc[(df['Blank'] == 'ENDING MY') & (df['Commodity'] == 'Sorghum'),'Month'] = 8

There are only three potential variables in the 'Blank' column: it is empty, STARTING MY, or ENDING MY. This particular data pull has both ENDING MY and STARTING MY which might be different than when I was testing it.

However, because the code is outputting an error on the first line, the two options are STARTING MY and blank. Before we had blank and not STARTING MY so I tried a line of code before this that was simply:

df.loc[df['Blank'] == 'STARTING MY','Month'] = pd.DatetimeIndex(df['Date']).month

The error threw again on that line.

Does anyone have any ideas why it is causing this issue and what I can do to fix it?

Sample columns from dataframe: ** are column names. The month column is calculated. In this case soybeans are supposed to be month 8 as well.

**Commodity** **Blank** **Value1** **Value 2** **Value 3** **Date**    **Month**
All Wheat                   1           3          4       2020-08-17      8
All Wheat                   4           4          2       2020-08-17      8
Corn                        1           12         5       2020-08-17      8
Corn                        4           24         5       2020-08-17      8
Soybeans      ENDING MY     2           34         24      2020-08-17      8
Soybeans      ENDING MY     34          2          34      2020-08-17      8
Sorghum       STARTING MY   4           45         3       2020-08-17      8
Sorghum       STARTING MY   4           34         4       2020-08-17      8
2
  • 1
    Can you paste a sample of your dataframe in the question? Commented Aug 17, 2020 at 13:12
  • sure, I've updated with a sample! Commented Aug 17, 2020 at 13:51

1 Answer 1

1

IIUC, you need to do this, convert the Date column to datetime first then, set the value:

df['Date'] = pd.to_datetime(df['Date'])
df.loc[df['Blank'] == 'STARTING MY','Month'] = df['Date'].dt.month
print(df)

   Commodity        Blank  Value1  Value 2  Value 3       Date  Month
0  All Wheat          NaN       1        3        4 2020-08-17    NaN
1  All Wheat          NaN       4        4        2 2020-08-17    NaN
2       Corn          NaN       1       12        5 2020-08-17    NaN
3       Corn          NaN       4       24        5 2020-08-17    NaN
4   Soybeans    ENDING MY       2       34       24 2020-08-17    NaN
5   Soybeans    ENDING MY      34        2       34 2020-08-17    NaN
6    Sorghum  STARTING MY       4       45        3 2020-08-17    8.0
7    Sorghum  STARTING MY       4       34        4 2020-08-17    8.0
Sign up to request clarification or add additional context in comments.

2 Comments

I appreciate your response, but that doesn't solve the issue. I end up with the same error. I believe it is already in datetime, i just used df = df.fillna('') to rid the NaN spaces.
Hmm are you sure you converted date to datetime object using pd.to_datetime and then just use df['Date']? That shouldn't give you the same error.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.