
I am trying to read data from a MySQL table, and one of the columns contains large varchar values e.g. of length 49085. When I read the results of the query into a dataframe, the column value is truncated at 87 characters. Please see the code below and the output. Does anyone know how I can read the entire string without truncation?

In the code below, the table (myTable in the query) contains a column description, and one of its rows holds a string of length 49085.

Code:

import sys
import os
from sqlalchemy import create_engine
import pandas as pd

db_connection_str = 'mysql+pymysql://username:password@host/db_name'
db_connection = create_engine(db_connection_str)

#this returns 1 row where the value in the description field is of length 49085
df = pd.read_sql("select id, description, length(description) as len from myTable where length(description) = 49085", con=db_connection)

#this returns the truncated value of length 87
print(df)
print(len(str(df['description'])))

Output:

   id                                             description    len
0  1  This document is for the testing Team.\n\nThe attach...  49085
87
  • Have you tried another driver? Commented Oct 19, 2021 at 16:19
  • I haven't, don't know much about that. Do you mean trying something other than sqlalchemy? Commented Oct 19, 2021 at 16:20
  • yes try mysql.connector Commented Oct 19, 2021 at 16:36

1 Answer

You are being misled by len(str(df['description'])). df['description'] returns a <class 'pandas.core.series.Series'> object and if we call str() on it we get

'0    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...\nName: description, dtype: object'

Under pandas' default display settings, long values are abbreviated with "...", so for this Series the length of that string is 87 no matter how long the stored string actually is. To test the actual length of the string, use

print(len(df['description'][0]))

or similar.
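To make the difference concrete, here is a minimal sketch that needs no database. It assumes a hypothetical one-row DataFrame built in memory as a stand-in for the query result, and it relies on pandas' default display options. The names long_text, repr_length and actual_length are illustrative only.

```python
import pandas as pd

# Hypothetical stand-in for the query result: one row whose
# description is a 49085-character string.
long_text = "x" * 49085
df = pd.DataFrame({"id": [1], "description": [long_text]})

# str() of a Series is its display representation, which abbreviates
# long values, so its length says nothing about the stored string.
repr_length = len(str(df["description"]))

# Selecting the element returns the actual Python string.
actual_length = len(df["description"].iloc[0])

print(repr_length)    # short, driven by display settings
print(actual_length)  # full length of the stored string
```

The data in the DataFrame is intact in both cases; only the printed representation is shortened.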


8 Comments

Thanks! that's helpful to know. I tried print(len(df['description'][0])) and it does show the correct length of 49085. But then when I write the df to a .txt file, I still get the truncated value. Below is the code I'm using to write it to a .txt file.
writePath = r'sample_data.txt'
with open(writePath, 'a') as f:
    dfAsString = df.to_string(index=False)
    f.writelines(dfAsString)
If you're looking to dump the DataFrame to a text file you might have better luck with something like df.to_csv()
Correct me if I'm wrong, but wouldn't that have the same issue? Coz to open the csv I'd need to do that in Excel and the max length of a cell in Excel is 30k characters.
Many applications other than Excel can consume CSV files. What exactly do you intend to do with that text file?
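As a rough sketch of the df.to_csv() suggestion above, the following writes a long value to a CSV file and reads it back with pandas instead of Excel to check that nothing was cut off. The DataFrame is a hypothetical in-memory stand-in for the query result, and the file name sample_data.csv and the temporary directory are assumptions for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the query result.
long_text = "x" * 49085
df = pd.DataFrame({"id": [1], "description": [long_text]})

with tempfile.TemporaryDirectory() as tmp_dir:
    # Assumed file name, written to a throwaway directory.
    csv_path = os.path.join(tmp_dir, "sample_data.csv")

    # to_csv writes the cell values themselves, not the display representation.
    df.to_csv(csv_path, index=False)

    # Read the file back to confirm the value survived the round trip.
    round_trip = pd.read_csv(csv_path)
    written_length = len(round_trip["description"].iloc[0])

print(written_length)
```

Whether the resulting file suits the next step depends on the application that consumes it, which is the question raised in the last comment.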