1
df = pd.DataFrame({'a': ['asdf']}, dtype="string")
df["a"].replace({"a":"b"}, regex=True)

not changed

df = pd.DataFrame({'a': ['asdf']}, dtype="object")
df["a"].replace({"a":"b"}, regex=True)

changed

I want to replace part of a string value with something else, but when I use the string dtype, the replace method has no effect. How can I change string-dtype data? Should I use the object dtype instead?

3 Comments
  • I tried the first code and it worked; it changed the a to b. pandas version 1.3 Commented Aug 26, 2021 at 6:48
  • afternoon_drinker, you need to use .str method on string dtype, see the answer for more details. Commented Aug 26, 2021 at 8:30
  • @sammywemmy, thank you. I used 1.2.4 Commented Aug 26, 2021 at 9:29

2 Answers

1

If you check df.dtypes, the difference is evident: with dtype="string" the column itself has the string dtype (only the dtypes summary is reported as object), so you need to use pandas.Series.str.replace to get your result.

However, when you choose dtype="object", both the column dtype and the underlying data remain object, so you don't need the .str conversion.

Please check the source code, which explains it well:

For calling .str.{method} on a Series or Index, it is necessary to first initialize the :class:StringMethods object, and then call the method.

>>> df = pd.DataFrame({'a': ['asdf']}, dtype="string")
>>> df
      a
0  asdf

>>> df.dtypes
a    string
dtype: object

>>> df["a"].str.replace("a", "b", regex=True)
0    bsdf
Name: a, dtype: string
>>> df = pd.DataFrame({'a': ['asdf']}, dtype="object")
>>> df.dtypes
a    object
dtype: object
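As a minimal sketch of the point above (the variable names are illustrative, not from the original answer), `.str.replace` performs the same substitution on both dtypes. How plain `Series.replace(..., regex=True)` behaves on the string dtype seems to depend on the pandas version, judging by the comments on the question, so it is deliberately left out here.

```python
import pandas as pd

# Same data, two dtypes: the pandas string extension dtype and plain object.
s_string = pd.Series(["asdf"], dtype="string")
s_object = pd.Series(["asdf"], dtype="object")

# The .str accessor is available on both and applies the same substitution.
out_string = s_string.str.replace("a", "b", regex=True)
out_object = s_object.str.replace("a", "b", regex=True)

print(out_string.tolist(), out_string.dtype)
print(out_object.tolist(), out_object.dtype)
```

Passing `regex` explicitly avoids relying on its default value.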

dtype:

Borrowed from @HYRY.

See here for the source of inspiration for the explanation below.

See also the pandas docs entry "All dtypes can now be converted to StringDtype".

The dtype object comes from NumPy; it describes the type of the elements in an ndarray. Every element in an ndarray must have the same size in bytes. For int64 and float64, that is 8 bytes. But for strings, the length is not fixed. So instead of storing the bytes of the strings in the ndarray directly, pandas uses an object ndarray, which stores pointers to objects; because of this, the dtype of this kind of ndarray is object.

Here is an example:

  • the int64 array contains 4 int64 values.
  • the object array contains 4 pointers to 3 string objects.
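The two arrays described above can be sketched roughly as follows. The concrete values and the names `ints`, `word`, and `objs` are assumptions made for illustration; only the shape of the example (4 integers, 4 references to 3 strings) follows the description.

```python
import numpy as np

# Four fixed-size integers stored directly in the array buffer.
ints = np.array([1, 2, 3, 4], dtype=np.int64)

# Four slots holding references to only three distinct str objects:
# the last two slots point at the same Python string.
word = "a longer string"
objs = np.array(["a", "bc", word, word], dtype=object)

print(ints.dtype, ints.itemsize)  # every int64 element occupies 8 bytes
print(objs.dtype)                 # object
print(objs[2] is objs[3])         # both slots reference the same object
```

Because the object array stores references rather than the string bytes, strings of different lengths fit in slots of equal size.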


Note:

The object dtype has a much broader scope. It can hold not only strings but also any other data that pandas doesn't understand.
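A short sketch of that broader scope, using made-up values: an object Series keeps each element as its original Python type, whereas converting to the string dtype turns every value into text.

```python
import pandas as pd

# An object column can mix arbitrary Python values.
mixed = pd.Series(["text", 1, 2.5], dtype="object")
kinds = [type(v).__name__ for v in mixed]
print(mixed.dtype, kinds)

# Other dtypes can be converted to the string dtype with astype.
as_string = pd.Series([1, 2, 3]).astype("string")
print(as_string.tolist(), as_string.dtype)
```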


1 Comment

Thank you. Which type is better to use for string operations?
0

For the string type you can do this:

import pandas as pd

df = pd.DataFrame({'a': ['asdf']}, dtype="string")
# Use the .str accessor for substring replacement on the string dtype.
df["a"].str.replace("a", "b")
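For a literal (non-regex) replacement, `.str.replace` also accepts a `regex` argument. The sketch below uses a made-up value containing a dot to show the difference; the names `s`, `literal`, and `as_regex` are illustrative only.

```python
import pandas as pd

s = pd.Series(["a.b"], dtype="string")

# regex=False treats the pattern as plain text: only the dot is replaced.
literal = s.str.replace(".", "-", regex=False)

# regex=True treats "." as "any character": every character is replaced.
as_regex = s.str.replace(".", "-", regex=True)

print(literal.tolist())
print(as_regex.tolist())
```

Passing `regex` explicitly is the safer choice, since its default has differed between pandas versions.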

1 Comment

Thank you. How would I do this in the case of df["a"].replace({"a":"b"}, regex=False)?
