2

I have a dataframe df_alltrades with a number of similar columns where the string includes integers:

    instrument  Bid0Mkt  Bid1Mkt  Bid2Mkt  Bid3Mkt Bid4Mkt  ...
0   EURUSD      1.1      1.2      1.2      1.3     1.3
1   NZDUSD      0.6      0.65     0.7      0.9     0.92
.   .           .        .        .        .       
.   .           .        .        .        .       

I also have corresponding columns for OfferXMkt and a number of other columns formatted as str.

I want to remove any columns where X in BidXMkt and OfferXMkt is greater than 0 so that I end up with something like:

     instrument  Bid0Mkt  Offer0Mkt ...
0    EURUSD      1.1      
1    NZDUSD      0.6      
.    .           .        
.    .           .            

I can do this by writing out each if statement

for column in df_alltrades.columns:
    if '1' in column:
        df_alltrades.drop(columns=column, inplace=True)
    if '2' in column:
        df_alltrades.drop(columns=column, inplace=True)
    if '3' in column:
        df_alltrades.drop(columns=column, inplace=True)
    if '4' in column:
        df_alltrades.drop(columns=column, inplace=True)

but I was hoping to clean it up with a loop, something like:

for column in df_alltrades.columns:
    for C in range(1, 5):
        if C in column:
            df_alltrades.drop(columns=column, inplace=True)

but I'm getting the error: TypeError: 'in <string>' requires string as left operand, not int

I have also tried things like;

for X in range(1, 5):
    BidMkt = 'Bid{}Mkt'.format(X)

    df_new = df_alltrades.drop([BidMkt], axis = 1)

which even though print(BidMkt) gives the full list, only Bid4Mkt is dropped.

Thanks for any help!

3 Answers 3

1

The error is telling you where the problem is C needs to be a string not an integer:

   for column in df_alltrades.columns:
        for C in range(1, 5):
            if str(C) in column:
                df_alltrades.drop(columns=column, inplace=True)

but if I were you I would first do this to get the headers as a list

headers = list(df.columns.values)

Filter the necessary columns from this list and then use it to slice the Dataframe like this

df = df[headers]
Sign up to request clarification or add additional context in comments.

1 Comment

Ahh thank you! Just the if str(C) in column: line has fixed my problem, this was driving me mad
1

Check what columns don't have 0 in it:

for column in df_alltrades.columns:
    if 'Bid' in column and '0' not in column:
        df_alltrades.drop(columns=column, inplace=True)

2 Comments

Unfortunately this would remove all other columns I have though, including instrument (and others not listed)
@cscarlet Sorry, I had forgot about restricting it only to those columns. Would the edit work for you?
1

Create a remove list using list comprehension and drop those columns

d = {'instrument': {0: 'EURUSD', 1: 'NZDUSD'},
 'Bid0Mkt': {0: 1.1, 1: 0.6},
 'Bid1Mkt': {0: 1.2, 1: 0.65},
 'Bid2Mkt': {0: 1.2, 1: 0.7},
 'Bid3Mkt': {0: 1.3, 1: 0.9},
 'Bid4Mkt': {0: 1.3, 1: 0.92},
 'Another': {0: 1, 1: 2}}

df = pd.DataFrame(d)

remove = [col for col in df.columns if col.startswith('Bid') and int(col[3]) > 0]
new_df = df.drop(columns=remove)

  instrument  Bid0Mkt  Another
0     EURUSD      1.1        1
1     NZDUSD      0.6        2

4 Comments

Getting ValueError: invalid literal for int() with base 10: 't' quick read through this looks like it won't work due to the string containing letters "Strings with letters cannot be converted to an integer unless those letters have a special meaning in Python."
@cscarlet then your actual column layout is not the same as your example. Do your Bid columns begin at df.columns[1:] Probably not. df.columns[1:] skips the first column, which in your example is instrument That way I am only looking at the bid columns.
Ah okay, they do but there are other columns following the Bid columns that do not contain integers - sorry I should have specified
@cscarlet Then just modify the list comprehension to include string.startswith: remove = [col for col in df.columns if col.startswith('Bid') and int(col[3]) > 0]

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.