
I am trying to store my model to HDFS using Python.

This code uses the pydoop library:

    import pydoop.hdfs as hdfs

    from_path = prediction_model.fit(orginal_telecom_80p_train[features], orginal_telecom_80p_train["Churn"])
    to_path = 'hdfs://192.168.1.101:8020/user/volumata/python_models/churn_model.sav'
    hdfs.put(from_path, to_path)

But while using this, I get this error:

AttributeError: 'LogisticRegression' object has no attribute 'startswith'

Then I tried the pickle option:

    import pickle

    with open('hdfs://192.168.1.101:8020/user/volumata/python_models/') as hdfs_loc:
        pickle.dump(prediction_model, hdfs_loc)

The pickle option works fine locally, but when I try to store the model in HDFS this option does not work either. Can anyone suggest how to proceed with storing models to HDFS from a Python script?

  • Please edit your question to include the full traceback. Also, not sure where prediction_model comes from, but 1) I don't think prediction_model.fit returns a file path; 2) PySpark is commonly used for machine learning with Hadoop. Commented Apr 20, 2018 at 22:55
  • You removed the answer? Commented Apr 23, 2018 at 5:22
  • Because it doesn't work, yes. I don't use Pickle on Hadoop, so I can't give you a proper solution here Commented Apr 23, 2018 at 5:24
  • Then what else did you use for Hadoop? Can you give me any other suggestion instead of pickle? Commented Apr 23, 2018 at 5:27
  • As mentioned, Spark... Which uses Pickle internally Commented Apr 23, 2018 at 5:28

1 Answer
You have to use hdfs.open instead of open, and open the file for writing:

import pickle
import pydoop.hdfs as hdfs

with hdfs.open(to_path, 'w') as f:
    pickle.dump(prediction_model, f)

4 Comments

    Traceback (most recent call last):
      File "<ipython-input-5-09aac99ede3d>", line 1, in <module>
        with hdfs.open(to_path, 'w') as f:
      File "/home/volumata/anaconda3/lib/python3.5/site-packages/pydoop/hdfs/__init__.py", line 121, in open
        fs = hdfs(host, port, user)
      File "/home/volumata/anaconda3/lib/python3.5/site-packages/pydoop/hdfs/fs.py", line 146, in __init__
        host = common.encode_host(host)
      File "/home/volumata/anaconda3/lib/python3.5/site-packages/pydoop/hdfs/common.py", line 53, in encode_host
        if isinstance(host, unicode):
@SARANYA make sure you are using the latest pydoop version, currently 2.0a3: pip install --upgrade --pre pydoop. Pydoop 1.x does not support Python 3
Same issue after upgrading also.
Maybe the upgrade wasn't successful. The if isinstance(host, unicode) check is not on line 53 in the latest version. Moreover, it's only triggered if you are using Python 2. Check the value of pydoop.__version__
