12

I need to run a task that could be done with a loop, but I imagine there is a more efficient and elegant way to do it. I have a DataFrame with an integer column that I would like to transform to a 4-digit string representation. That is, 3 should be converted to '0003' and 234 to '0234'. I am looking for a vectorized operation that does this for all entries in the column at once, quickly and with simple code.

2 Answers

14

You can use the Series.str.zfill() method:

df['column_name'] = df['column_name'].astype(str).str.zfill(4)

Demo:

In [29]: df = pd.DataFrame({'a':[1,2], 'b':[3,234]})

In [30]: df
Out[30]:
   a    b
0  1    3
1  2  234

In [31]: df['b'] = df['b'].astype(str).str.zfill(4)

In [32]: df
Out[32]:
   a     b
0  1  0003
1  2  0234
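
One thing worth checking is what happens to values that already have more than four digits. A minimal sketch, assuming a made-up extra row (12345) that is not in the question and a hypothetical variable name `padded`; the width passed to zfill acts as a minimum, so longer values should come through unchanged:

```python
import pandas as pd

# Sample data; the 12345 entry is an illustrative addition, not from the question.
df = pd.DataFrame({"a": [1, 2, 3], "b": [3, 234, 12345]})

# Convert to strings first, then left-pad with zeros to at least 4 characters.
padded = df["b"].astype(str).str.zfill(4)
print(padded.tolist())
```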

3 Comments

First time I have seen this one, nice @MaxU +1
Nice, but the transformation is applied AFTER the int column is converted to string. Technically, it does not answer the question.
@jpmartineau, OP asked: "which I would like to transform to a 4-digit string representation" - I think it answers exactly that. ;)
8

You can also do this using the Series.apply() method and an f-string wrapped in a lambda function:

In [1]: import pandas as pd


In [2]: df = pd.DataFrame({'a':[1,2], 'b':[3,234]})

In [3]: df
Out[3]:
   a    b
0  1    3
1  2  234

In [4]: df['b'] = df['b'].apply(lambda x: f"{x:04d}")

In [5]: df
Out[5]:
   a     b
0  1  0003
1  2  0234

In the f-string, the part after the colon (04d) says "zero-pad the field to a minimum width of four characters and format the value as a base-10 integer".
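
As a rough breakdown of that format spec, here is a minimal sketch in plain Python (no pandas needed). The sample values and variable names are made up for illustration. Note that the width is a minimum, so longer numbers are not truncated:

```python
# Format spec "04d": '0' = zero-pad, '4' = minimum field width, 'd' = base-10 integer.
samples = [3, 234, 12345]  # illustrative values, not from the question

padded = [f"{x:04d}" for x in samples]
print(padded)

# The built-in format() accepts the same spec string.
same = [format(x, "04d") for x in samples]
print(same == padded)
```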

4 Comments

This reply better answers the title (but not the description) of the question.
How does it not fit the description? It is "a vector operation that will do this to all entries in the column at once, quickly with simple code".
The description only asks for zero padding, so str.zfill seemed more appropriate. However, from a quick timeit, it seems apply is about twice as fast for a vector of 100000 elements.
From the linked format specification for f-strings: "When no explicit alignment is given, preceding the width field by a zero ('0') character enables sign-aware zero-padding for numeric types. This is equivalent to a fill character of '0' with an alignment type of '='." (emphasis mine).
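
To illustrate the sign-aware padding described in that quote, a small sketch in plain Python follows. The negative sample value is an assumption for illustration, since the question only shows non-negative integers:

```python
value = -3  # hypothetical negative input, not from the question

sign_aware = f"{value:04d}"      # leading '0' before the width
explicit = f"{value:0=4d}"       # fill '0' with '=' alignment, spelled out
right_aligned = f"{value:0>4d}"  # fill '0' with '>' alignment, for contrast

print(sign_aware, explicit, right_aligned)
```

The first two forms place the zeros after the sign, while plain right alignment places them in front of it.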
