
Convert a DataFrame to JSON, adding the field names "skill" and "suggestions" as shown in the desired output, then save it as-is to a MongoDB collection

The input is a Python pandas DataFrame:

        0       1       2       3       4       5       6       7
java    hadoop  java    hdfs    c       c++     php     python  html
c       c       c++     hdfs    python  hadoop  java    php     html
c++     c++     c       python  hdfs    hadoop  java    php     html
hadoop  hadoop  java    hdfs    c       c++     php     python  html
hdfs    hdfs    hadoop  java    c       c++     python  php     html
python  python  c++     html    c       php     hdfs    hadoop  java

Desired output, as saved in the MongoDB collection:

{ "_id" : ObjectId("5922a781205a763b55e2e90e"), "skill" : "java", "suggestions" : [ "hadoop", "java", "hdfs", "c", "c++", "php", "python", "html" ] }

{ "_id" : ObjectId("5922a781205a763b55e2e91e"), "skill" : "c", "suggestions" : [ "c", "c++", "hdfs", "python", "hadoop", "java", "php", "html" ] }

{ "_id" : ObjectId("5922a781205a763b55e2e92e"), "skill" : "c++", "suggestions" : [ "c++", "c", "python", "hdfs", "hadoop", "java", "php", "html" ] }

{ "_id" : ObjectId("5922a781205a763b55e2e93e"), "skill" : "hadoop", "suggestions" : [ "hadoop", "java", "hdfs", "c", "c++", "php", "python", "html" ] }

1 Answer


First, reshape the data into the target format.

import pandas as pd

strlist = [['java','hadoop','java','hdfs','c','c++','php','python','html'],
      ['c','c','c++','hdfs','python','hadoop','java','php','html'],
      ['c++','c++','c','python','hdfs','hadoop','java','php','html'],
      ['hadoop','hadoop','java','hdfs','c','c++','php','python','html'],
      ['hdfs','hdfs','hadoop','java','c','c++','python','php','html'],
      ['python','python','c++','html','c','php','hdfs','hadoop','java']]

df = pd.DataFrame(strlist)

# Column 0 holds the skill; columns 1-8 hold its suggestions.
# Build the suggestions list before adding the 'skill' column,
# otherwise df.columns[1:] would also pick up 'skill'.
suggestions = df[df.columns[1:]].values.tolist()
df['skill'] = df[0]
df['suggestions'] = suggestions
df = df[['skill','suggestions']]

print(df)
    skill                                      suggestions
0    java  [hadoop, java, hdfs, c, c++, php, python, html]
1       c  [c, c++, hdfs, python, hadoop, java, php, html]
2     c++  [c++, c, python, hdfs, hadoop, java, php, html]
3  hadoop  [hadoop, java, hdfs, c, c++, php, python, html]
4    hdfs  [hdfs, hadoop, java, c, c++, python, php, html]
5  python  [python, c++, html, c, php, hdfs, hadoop, java]
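In the question itself the skill appears to be the row index rather than column 0. For that layout, a minimal sketch could look like the following; the two-row frame is a shortened stand-in for the real data, and the names `out` and `records` are only illustrative.

```python
import pandas as pd

# Shortened stand-in for the question's frame: skills as the index,
# suggestions in columns 0-7 (assumption: this matches the real layout).
df = pd.DataFrame(
    [['hadoop', 'java', 'hdfs', 'c', 'c++', 'php', 'python', 'html'],
     ['c', 'c++', 'hdfs', 'python', 'hadoop', 'java', 'php', 'html']],
    index=['java', 'c'],
)

# One row per skill: the index label becomes 'skill',
# the row values become the 'suggestions' list.
out = pd.DataFrame({
    'skill': df.index.tolist(),
    'suggestions': df.values.tolist(),
})

# A list of plain dicts, one per row.
records = out.to_dict('records')
```

Each element of `records` then has the same two keys as the desired documents, without the `_id` that MongoDB adds on insert.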

Then insert the DataFrame into the MongoDB database.

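import json

# insert_many expects a list of documents; in Python 3, .values() is only a view
records = list(json.loads(df.T.to_json()).values())
collection.insert_many(records)  # collection: an existing pymongo Collection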
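The answer does not show where `collection` comes from. As a rough sketch, assuming pymongo and a local MongoDB server, the two steps could be wired together as below. The URI, the database name `test`, the collection name `skills`, and both helper functions are hypothetical and not part of the original answer.

```python
import pandas as pd

def frame_to_documents(df):
    """Turn a skill/suggestions frame into a list of plain dicts."""
    return df[['skill', 'suggestions']].to_dict('records')

def save_documents(documents, uri='mongodb://localhost:27017',
                   db_name='test', coll_name='skills'):
    """Insert the documents; all connection details here are placeholders."""
    # Imported here so the conversion step can run without pymongo installed.
    from pymongo import MongoClient
    client = MongoClient(uri)
    result = client[db_name][coll_name].insert_many(documents)
    return result.inserted_ids

df = pd.DataFrame({
    'skill': ['java', 'c'],
    'suggestions': [
        ['hadoop', 'java', 'hdfs', 'c', 'c++', 'php', 'python', 'html'],
        ['c', 'c++', 'hdfs', 'python', 'hadoop', 'java', 'php', 'html'],
    ],
})
documents = frame_to_documents(df)

# Uncomment with a MongoDB server running; MongoDB assigns each "_id".
# inserted_ids = save_documents(documents)
```

Using `to_dict('records')` skips the JSON round trip; the `json.loads(df.T.to_json())` route from the answer yields the same documents for string data like this.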