
I am trying to save an array of strings to Postgres, but when I check, it has been saved as an array of chars. Here is an example, using SQLAlchemy as my database engine:

df = pd.read_csv('data.csv')
df.to_sql('tablename', dtype={'array_col': sqlalchemy.dialects.postgresql.ARRAY(sqlalchemy.dialects.postgresql.TEXT)})

When I query for 'array_col', I'm expecting this:

['one','two']

What I get is this:

['','o','n','e','','t','w','o']
  • Can you include the format the strings are stored in within the csv? Commented Apr 3, 2022 at 20:34

1 Answer

I think you need to convert the string from the CSV into a list first, using the converters argument of read_csv(), before calling to_sql.

This example assumes the values are stored comma-separated inside a single CSV field. I used hobbies as the name of my array column. Once the value was converted to a list, I did not seem to need to pass dtype to to_sql, but if you still need it, you would use dtype={'hobbies': ARRAY(TEXT)} in this example. I think ARRAY and TEXT are the correct types, not to be confused with array or text. I define the array column type in my SQLAlchemy model below.


import sys
from io import StringIO
from sqlalchemy import (
    create_engine,
    Integer,
    String,
)
from sqlalchemy.schema import (
    Column,
)
from sqlalchemy.sql import select
from sqlalchemy.orm import declarative_base
import pandas as pd
from sqlalchemy.dialects.postgresql import ARRAY, TEXT

Base = declarative_base()


username, password, db = sys.argv[1:4]


engine = create_engine(f"postgresql+psycopg2://{username}:{password}@/{db}", echo=False)


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String(8), index=True)
    hobbies = Column(ARRAY(TEXT))


Base.metadata.create_all(engine)

csv_content = '''"name","hobbies"
1,"User 1","running,jumping"
2,"User 2","sitting,sleeping"
'''

with engine.begin() as conn:
    def convert_to_array(v):
        return [s.strip() for s in v.split(',') if s.strip()]

    content = StringIO(csv_content)
    df = pd.read_csv(content, converters={'hobbies': convert_to_array})
    df.to_sql('users', schema="public", con=conn, if_exists='append', index_label='id')


with engine.begin() as conn:
    for user in conn.execute(select(User)).all():
        print(user.id, user.name, user.hobbies)
        print("|".join(user.hobbies))
        print(type(user.hobbies))
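
If you want to check the conversion on its own before involving the database, something like the following should do. It is only a sketch: the sample data and the name split_hobbies are made up for illustration, and it assumes the same comma-separated format as above. It reads the same CSV twice, once without and once with a converter, so you can compare what ends up in the column.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data: each hobbies cell is one comma-separated string.
CSV_TEXT = '''name,hobbies
User 1,"running,jumping"
User 2,"sitting, sleeping"
'''


def split_hobbies(value):
    """Split one CSV cell into a list of trimmed, non-empty strings."""
    return [part.strip() for part in value.split(",") if part.strip()]


# Without a converter the cell stays a single string.
raw = pd.read_csv(StringIO(CSV_TEXT))

# With a converter each cell becomes a Python list before to_sql ever sees it.
converted = pd.read_csv(StringIO(CSV_TEXT), converters={"hobbies": split_hobbies})

print(type(raw.loc[0, "hobbies"]))
print(converted.loc[0, "hobbies"])
```

If the column still holds plain strings at this point, the problem is in the CSV parsing step rather than in to_sql.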

1 Comment

convert_to_array might need to be adjusted based on how your array is stored in your csv
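
As a rough illustration of that adjustment, here are two alternative converters. Both are sketches under assumptions about the CSV: the first assumes cells look like a Python list literal such as ['one', 'two'], the second assumes a simple Postgres-style literal such as {one,two} with no quoting or nesting. The function names are hypothetical.

```python
import ast


def parse_python_list(value):
    """Parse a cell such as "['one', 'two']" into a list of strings."""
    if not value.strip():
        return []
    # literal_eval only evaluates Python literals, not arbitrary code.
    return [str(item) for item in ast.literal_eval(value)]


def parse_pg_literal(value):
    """Parse a simple cell such as "{one,two}".

    Assumes no quoted elements, embedded commas, or nested arrays.
    """
    inner = value.strip().strip("{}")
    return [part.strip() for part in inner.split(",") if part.strip()]


print(parse_python_list("['one', 'two']"))
print(parse_pg_literal("{one,two}"))
```

Either function can be passed to read_csv in place of convert_to_array, for example converters={'hobbies': parse_python_list}.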
