I have written the following code to request pages and use regex to look for strings that resemble interest rates. The overall code works; however, it creates multiple empty DataFrames, and I can't get it to drop the empty ones to clean up my output. I have tried using .dropna, .drop, and .empty to get rid of the empty DataFrames, but the output remains unchanged and keeps printing them alongside the data I already have. Is there a method I am not aware of that could get rid of these empty frames? Code and output below (with a rough sketch of what I've tried at the end):
import datetime
import re

import pandas as pd
import requests as r
from bs4 import BeautifulSoup as bs

plcompetitors = ['https://www.lendingclub.com/loans/personal-loans',
                 'https://www.marcus.com/us/en/personal-loans',
                 'https://www.discover.com/personal-loans/']

# cycle through links in the list until it finds APR rates (fixed or variable) using regex
for link in plcompetitors:
    cdate = datetime.date.today()
    l = r.get(link)
    l.encoding = 'utf-8'
    data = l.text
    soup = bs(data, 'html.parser')
    # every text node containing something like "5%"
    paragraph = soup.find_all(text=re.compile('[0-9]%'))
    for n in paragraph:
        matches = []
        # capture ranges such as "6.99% to 24.99%" or "6.99% - 24.99%"
        matches.extend(re.findall(r'(?i)\d+(?:\.\d+)?%\s*(?:to|-)\s*\d+(?:\.\d+)?%', n.string))
        sint = pd.Series(matches)
        qdate = pd.Series([datetime.datetime.now()] * len(sint))
        slink = pd.Series([link] * len(sint))
        df = pd.concat([qdate, sint, slink], axis=1)
        df.columns = ['Date', 'Interest Rate', 'URL']
        print(df)
Output:
...
0 ...
1 ...
[2 rows x 3 columns]
...
0 ...
[1 rows x 3 columns]
...
0 ...
1 ...
2 ...
3 ...
[4 rows x 3 columns]
Empty DataFrame
Columns: [Date, Interest Rate, URL]
Index: []
Empty DataFrame
Columns: [Date, Interest Rate, URL]
Index: []
Empty DataFrame
Columns: [Date, Interest Rate, URL]
Index: []
Empty DataFrame
Columns: [Date, Interest Rate, URL]
Index: []
...
0 ...
[1 rows x 3 columns]
Empty DataFrame
Columns: [Date, Interest Rate, URL]
Index: []
Empty DataFrame
Columns: [Date, Interest Rate, URL]
Index: []
Empty DataFrame
Columns: [Date, Interest Rate, URL]
Index: []
Empty DataFrame
Columns: [Date, Interest Rate, URL]
Index: []
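For reference, my attempts to suppress the empty frames looked roughly like this, placed just before the print(df) line (a simplified sketch, not my exact code):

df = df.dropna()        # no effect - the empty frames have no NaN rows to drop
df = df.drop(df.index)  # also does not stop the empty frames from printing
df.empty                # returns True for the empty frames, but I haven't found how to use it to skip them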