Taking python output to a pandas dataframe

Question

I'm trying to take the output from this code into a pandas dataframe. I'm really only trying to pull the first part of the output, which is the stock symbol, company name, field3, and field4. The output has a lot of other data I'm not interested in, but it's giving me everything. Could someone help me put this into a dataframe, if possible?

The current output is in this format

["ABBV","AbbVie","_DRUGM","S&P 100,&nbsp;S&P 500"],["ABC","AmerisourceBergen","_MEDID","S&P 500"],

Desired Output

Full Code

    import requests
    import pandas as pd
    
    url = "https://www.stockrover.com/build/production/Research/tail.js?1644930560"
    
    payload={}
    headers = {}
    
    response = requests.request("GET", url, headers=headers, data=payload)
    
    print(response.text)

Please clarify what you are asking. You identify a list of lists as the output but fail to tell us what it is the output of. You want a dataframe but fail to define what you expect the format of the dataframe to be. We are good but unfortunately not clairvoyant. Please edit your question to show a minimal reproducible set consisting of sample input, expected output, actual output, and only the relevant code necessary to reproduce the problem.
– itprorh66, Commented Feb 26, 2022 at 18:50

preritdas · Accepted Answer · 2022-02-26 22:02:32Z

Use a dictionary to store the data from your tuple of lists, then create a DataFrame based on that dictionary. In my solution below, I omit the 'ID' field because the index of the DataFrame serves the same purpose.

import pandas as pd

# Store the data you're getting from requests
data = ["ABBV","AbbVie","_DRUGM","S&P 100,&nbsp;S&P 500"],["ABC","AmerisourceBergen","_MEDID","S&P 500"]

# Create an empty dictionary with relevant keys
dic = {
    "Ticker": [],
    "Name": [],
    "Field3": [],
    "Field4": []
}

# Append each row's fields to the matching dictionary key
for lst in data:
    dic['Ticker'].append(lst[0])
    dic['Name'].append(lst[1])
    dic['Field3'].append(lst[2])
    dic['Field4'].append(lst[3])

# Create a DataFrame from the dictionary above
df = pd.DataFrame(dic)

The result is a DataFrame with one row per list in the data and one column per dictionary key.

Edit: A More Efficient Approach

In my solution above, I manually indexed each list and appended to each key of the dic dictionary by name. Using zip, we can streamline the process so that it works for a response of any length and keeps working if you change the labels of the dictionary.

The only caveat to this method is that you have to make sure the order of keys in the dictionary lines up with the order of the data in each list in your response. For example, if Ticker is the first dictionary key, the ticker must be the first item in each list resulting from your response. This was true for the first solution too, however.

new_dic = {
    "Ticker": [],
    "Name": [],
    "Field3": [],
    "Field4": []
}

for lst in data:  # Iterate over each row
    for key, item in zip(new_dic, lst):  # Pair each key with the item at the same position
        new_dic[key].append(item)  # Append the item to that key's list

df = pd.DataFrame(new_dic)

The result is identical to the method above:

Edit (even better!)

I'm coming back to this after learning from a commenter that pd.DataFrame() accepts two-dimensional data (such as a list of lists) directly and builds a DataFrame from it. This simplifies the entire process considerably:

import pandas as pd

# Store the data you're getting from requests
data = ["ABBV","AbbVie","_DRUGM","S&P 100,&nbsp;S&P 500"],["ABC","AmerisourceBergen","_MEDID","S&P 500"]

# Define columns
columns = ['ticker', 'name', 'field3', 'field4']

df = pd.DataFrame(data, columns=columns)

The result (same as first two):
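A quick way to confirm the equivalence is to build the frame both ways and compare. This is a minimal sketch that assumes only the two sample rows from the question; the lowercase column names and the `by_column` helper dictionary are illustrative choices, not part of the original data.

```python
import pandas as pd

# The two sample rows from the question, as a list of lists
data = [
    ["ABBV", "AbbVie", "_DRUGM", "S&P 100,&nbsp;S&P 500"],
    ["ABC", "AmerisourceBergen", "_MEDID", "S&P 500"],
]
columns = ["ticker", "name", "field3", "field4"]

# Direct construction: each inner list becomes one row
df = pd.DataFrame(data, columns=columns)

# The longer route: collect each column's values into a dictionary first
by_column = {col: [row[i] for row in data] for i, col in enumerate(columns)}
df_from_dict = pd.DataFrame(by_column)

print(df.shape)                 # (number of rows, number of columns)
print(df.equals(df_from_dict))  # True when both frames hold the same values
```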

edited Feb 26, 2022 at 22:02

answered Feb 26, 2022 at 19:56

preritdas

6 Comments

J R Over a year ago

I appreciate your solutions very much, they work really nicely! Can you point me in the right direction on how to pull the information directly from the output instead of having the data hardcoded like data = ["ABBV"? Thanks again for all the help!

Rabinzel Over a year ago

I'm not sure why you build a new dictionary and then append to it? If you have to rely on the position in the list, then just do this one line and it does exactly the same (data as a list like in OP's post): df = pd.DataFrame(data, columns= ['ticker', 'name', 'field3', 'field4'])

preritdas Over a year ago

@JR Sure thing. Saving your outputs to variables is generally good practice. One thing you can do is parse the body of the requests.Response object as JSON using its .json() method, for example: response = requests.get(url).json(). If the server returns valid JSON, this makes response a dictionary (or a list, depending on the payload). You can then bypass the dic/new_dic dictionaries, as long as you modify the code to interpret the response in the way it is returned.
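As a rough sketch of pulling the rows straight out of the response text: the URL in the question points to a JavaScript file, so .json() may not be able to parse it as a whole. Assuming the rows appear in the text as JSON-style arrays of exactly four double-quoted strings (an assumption based on the sample output in the question), a regular expression can pick them out and json.loads can parse each one. The `sample_text` string, the `ROW_PATTERN` name, and the `extract_rows` helper are hypothetical stand-ins; with a live request you would pass response.text instead.

```python
import json
import re

import pandas as pd

# Hypothetical stand-in for response.text; the real file contains much more.
sample_text = (
    'var x=1;["ABBV","AbbVie","_DRUGM","S&P 100,&nbsp;S&P 500"],'
    '["ABC","AmerisourceBergen","_MEDID","S&P 500"],other stuff'
)

# Matches a bracketed array of exactly four double-quoted strings.
# Assumes the strings themselves contain no escaped double quotes.
ROW_PATTERN = re.compile(r'\["[^"]*"(?:,"[^"]*"){3}\]')


def extract_rows(text):
    """Return every four-string array found in text as a Python list."""
    return [json.loads(match) for match in ROW_PATTERN.findall(text)]


rows = extract_rows(sample_text)
df = pd.DataFrame(rows, columns=["ticker", "name", "field3", "field4"])
print(df)
```

If the real file also contains other four-string arrays that are not stock rows, the pattern would need tightening, for example by requiring an all-uppercase first field.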

Rabinzel Over a year ago

I played around with this a bit in the past because I couldn't figure out whether pd.DataFrame takes my input data row-wise or column-wise. Over time you run into different data with different format goals. If you pass a dict, each key is a column name and its value (e.g. a list) holds that column's values. If you pass a list of lists (or tuples), every list/tuple represents a row, and the number of columns equals the length of each list. See the pandas documentation; there are good examples. You're welcome!
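The row-wise versus column-wise distinction can be illustrated with a small sketch. It uses only the first two fields of the sample rows from the question; the variable names are illustrative.

```python
import pandas as pd

# A dict is read column-wise: each key is a column name, each value a column.
from_dict = pd.DataFrame({
    "ticker": ["ABBV", "ABC"],
    "name": ["AbbVie", "AmerisourceBergen"],
})

# A list of lists is read row-wise: each inner list becomes one row.
from_rows = pd.DataFrame(
    [["ABBV", "AbbVie"], ["ABC", "AmerisourceBergen"]],
    columns=["ticker", "name"],
)

# Both routes describe the same table.
print(from_dict.equals(from_rows))

# Without explicit column names, pandas labels the columns 0, 1, ...
unnamed = pd.DataFrame([["ABBV", "AbbVie"], ["ABC", "AmerisourceBergen"]])
print(list(unnamed.columns))
```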

Rabinzel Over a year ago

the good thing is, I didn't know about the json thingy with request. Will try it myself. So we helped each other :) .....and I hope we helped the OP too!
