I'm relatively new to Python, so I'm not sure how to achieve this. I was following an online tutorial on plotting a decision tree using the Iris dataset (for classification). However, I'm trying to plot a single tree from a random forest regressor instead.

Here's a snippet of the data I'm using (link: Data).

Here's the code I was using:

# Import Libraries and Load Data
import pandas as pd 
data = pd.read_csv("/Users/.../Desktop/cars_test.csv") 
import matplotlib.pyplot as plt
import numpy as np
cars = data

# Model
from sklearn.ensemble import RandomForestRegressor
model = RandomForestRegressor(n_estimators=10)

# Train
model.fit(cars.data, cars.target)

# Extract single tree for analysis
estimator = model.estimators_[5]

However, I'm getting an error that I'm not sure how to fix:

AttributeError                            Traceback (most recent call last)
<ipython-input-27-37164305d7fe> in <module>()
     10 
     11 # Train
---> 12 model.fit(cars.data, cars.target)
     13 
     14 # Extract single tree for analysis

~/anaconda3/lib/python3.6/site-packages/pandas/core/generic.py in __getattr__(self, name)
   4370             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   4371                 return self[name]
-> 4372             return object.__getattribute__(self, name)
   4373 
   4374     def __setattr__(self, name, value):

AttributeError: 'DataFrame' object has no attribute 'data'

Any suggestions as to what I'm doing wrong?

1 Answer

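You need to adapt the tutorial code to your own data. The dataset object in the tutorial presumably exposes data and target attributes, but the DataFrame you loaded has neither, which is why cars.data raises the AttributeError. Instead, extract the matrix of input data (X) and the response variable (y) from your DataFrame yourself. I'm assuming here that th_km_per_year is the column you want to predict and that the remaining columns are usable as features, so adapt accordingly.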
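To make the difference concrete, here is a minimal sketch, assuming the tutorial loaded Iris with scikit-learn's load_iris. The tiny DataFrame below is made up for illustration, and age_years is a hypothetical column name; only th_km_per_year comes from your data.

```python
import pandas as pd
from sklearn.datasets import load_iris

# load_iris() returns a Bunch, a dict-like container that
# exposes its entries as attributes such as .data and .target.
iris = load_iris()
print(type(iris).__name__)
print(hasattr(iris, "data"), hasattr(iris, "target"))

# A DataFrame only exposes its own column names as attributes.
# 'age_years' is a hypothetical column used for illustration.
cars = pd.DataFrame({
    "age_years": [1, 5, 10],
    "th_km_per_year": [20.0, 15.0, 8.0],
})
print(hasattr(cars, "data"), hasattr(cars, "target"))
print(list(cars.columns))
```

Since the DataFrame has no such attributes, X and y have to be selected by column name, as in the code that follows.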

# Import libraries
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Load data
cars = pd.read_csv("/Users/.../Desktop/cars_test.csv")

# Split into features (X) and response (y).
# Assumes 'th_km_per_year' is the column to predict
# and that every other column is numeric.
X = cars.loc[:, cars.columns != 'th_km_per_year'].values
y = cars['th_km_per_year'].values

# Model
model = RandomForestRegressor(n_estimators=10)

# Train
model.fit(X, y)

# Extract single tree for analysis
estimator = model.estimators_[5]
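Once you have the estimator, one way to plot it is sklearn.tree.plot_tree, which is available in scikit-learn 0.21 and later. The following is only a sketch under a few assumptions: the data is synthetic so the example runs on its own, age_years and engine_cc are hypothetical column names, and max_depth=3 and random_state=0 are my additions to keep the plot readable and repeatable.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import plot_tree

# Synthetic stand-in for cars_test.csv. Every column except
# 'th_km_per_year' is a made-up name for illustration.
rng = np.random.default_rng(0)
cars = pd.DataFrame({
    "age_years": rng.integers(1, 15, size=200),
    "engine_cc": rng.integers(900, 3000, size=200),
})
cars["th_km_per_year"] = (
    25 - 0.8 * cars["age_years"] + rng.normal(0, 2, size=200)
)

# Features are all columns except the response.
X = cars.drop(columns="th_km_per_year")
y = cars["th_km_per_year"]

# max_depth=3 is an assumption to keep the drawn tree small.
model = RandomForestRegressor(n_estimators=10, max_depth=3, random_state=0)
model.fit(X, y)

# Each entry of estimators_ is a fitted DecisionTreeRegressor.
estimator = model.estimators_[5]

# Draw the single tree and save it to an image file.
fig, ax = plt.subplots(figsize=(14, 6))
annotations = plot_tree(
    estimator,
    feature_names=list(X.columns),
    filled=True,
    rounded=True,
    ax=ax,
)
fig.savefig("tree.png", dpi=150)
```

In a notebook you could call plt.show() instead of saving to a file. With your real data, replace the synthetic DataFrame with the one read from your CSV.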