I have an Excel file with one sheet containing two columns, num1 and num2, both holding integer values. I'm trying to read this data and insert it into a MySQL database using SQLAlchemy and pandas.

from sqlalchemy import create_engine, MetaData,Column,Integer
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker,validates
import pandas as pd

Base = declarative_base()
connection_string = # give your connection string here
engine= create_engine(connection_string)
Base.metadata.bind = engine
s = sessionmaker()
session = s()

class a(Base):
    __tablename__ = 'a'
    id = Column(Integer,primary_key=True)
    num1 = Column(Integer)
    num2 = Column(Integer)

a.__table__.create(checkfirst=True)

excel_sheet_path = # give path to the excel sheet
sheetname = # give your sheet name here

df = pd.read_excel(excel_sheet_path,sheetname).transpose()


dict = df.to_dict()

for i in dict.values():
    session.add(a(**i))
session.commit()

This code throws an AttributeError:

AttributeError: 'numpy.int64' object has no attribute 'translate'

Before converting the DataFrame to a dictionary, I tried several functions such as astype and to_numeric to change the values to plain Python int, but none of them worked. The problem seems to occur only when every column in the DataFrame holds integers. If at least one other column is of type string or date, the program works normally. How do I solve this?

1 Answer

I ran into this problem too. I eventually came up with a somewhat clumsy workaround:

def trans(data):
    """
    Translate numpy int/float values into native Python data types.
    Returns a tuple of row tuples.
    """
    result = []
    # iterate by position so a non-default index also works with iloc
    for i in range(len(data)):
        d0 = data.iloc[i].values
        d = []
        for j in d0:
            type_name = str(type(j))
            if 'int' in type_name or 'float' in type_name:
                # numpy scalars expose .item(), which returns the Python value
                res = j.item() if hasattr(j, 'item') else j
            else:
                res = j
            d.append(res)
        result.append(tuple(d))
    return tuple(result)

However, it performs poorly on data with a large number of rows. Translating a DataFrame with more than 100,000 records takes several minutes.
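One possibly faster alternative, offered as a sketch and not as a tested fix for every setup: NumPy's ndarray.tolist() converts elements to native Python types in a single call, so the per-cell loop can be avoided. The helper name rows_as_native and the small in-memory DataFrame are hypothetical stand-ins for the spreadsheet in the question, and the example assumes the all-integer case described there.

```python
import pandas as pd


def rows_as_native(df):
    """Return one dict per row, with NumPy scalars converted to Python types.

    Hypothetical helper, not part of pandas or SQLAlchemy.
    """
    # ndarray.tolist() converts each element to the nearest Python type,
    # so numpy.int64 values come back as plain int
    rows = df.to_numpy().tolist()
    return [dict(zip(df.columns, row)) for row in rows]


# Stand-in for the data read from Excel (assumed: two integer columns)
df = pd.DataFrame({"num1": [1, 2, 3], "num2": [4, 5, 6]})
records = rows_as_native(df)
```

Assuming the mapped class a from the question, each dict could then be passed on with something like session.add_all(a(**row) for row in records) followed by session.commit().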
