The title of this question might not be appropriate...

Suppose I have the following input.csv:

Division,id,name
1,3870,name1
1,4537,name2
1,5690,name3

I need to do some processing based on the id column, which fetches data like this:

>>> get_data(3870)
[{"matchId": 42, "comment": "Awesome match"}, {"matchId": 43, "comment": "StackOverflow is quite good"}]

My goal is to output a CSV that joins the input file with the related data retrieved through get_data:

Division,id,name,matchId,comment
1,3870,name1,42,Awesome match
1,3870,name1,43,StackOverflow is quite good
1,4537,name2,90,Random value
1,4537,name2,91,Still a random value
1,5690,name3,10,Guess what it is
1,5690,name3,11,A random value

However, for some reason, the integer values are converted to floats along the way:

Division,id,name,matchId,comment
1.0,3870.0,name1,42.0,Awesome match
1.0,3870.0,name1,43.0,StackOverflow is quite good
1.0,4537.0,name2,90.0,Random value
1.0,4537.0,name2,91.0,Still a random value
1.0,5690.0,name3,10.0,Guess what it is
1.0,5690.0,name3,11.0,A random value

Here is a short version of my code. I think I missed something...

import pandas as pd

input_df = pd.read_csv(INPUT_FILE)
output_df = pd.DataFrame()

for index, row in input_df.iterrows():
    matches = get_data(row["id"])

    rdict = dict(row)
    for m in matches:
        m.update(rdict)
        # one output row per match
        output_df = output_df.append(m, ignore_index=True)

# FIXME: this was an attempt to solve the problem
output_df["id"] = output_df["id"].astype(int)
output_df["matchId"] = output_df["matchId"].astype(int)

output_df.to_csv(OUTPUT_FILE, index=False)

How can I convert every float column to integer?

1 Answer

The first solution is to pass the parameter float_format='%.0f' to to_csv:

print(output_df.to_csv(index=False, float_format='%.0f'))
Division,comment,id,matchId,name
1,StackOverflow is quite good,3870,43,name1
1,StackOverflow is quite good,4537,43,name2
1,StackOverflow is quite good,5690,43,name3
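
One caveat worth illustrating: float_format is applied to every float column, so a column that really does hold fractions gets rounded as well. A minimal sketch, assuming a made-up `score` column that is not part of the question's data:

```python
import pandas as pd

# Hypothetical frame: "id" holds whole numbers stored as floats,
# "score" is an invented column with genuinely fractional values.
df = pd.DataFrame({
    "id": [3870.0, 4537.0],
    "score": [1.25, 2.75],
    "name": ["name1", "name2"],
})

# float_format is applied to every float column, not only to "id".
csv_text = df.to_csv(index=False, float_format="%.0f")
print(csv_text)
```

If the frame contains such a column, casting only the affected columns is probably the safer route.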

A second possible solution is to apply a custom function, convert_to_int, to every column instead of calling astype on individual columns:

print(output_df)
   Division                      comment    id  matchId   name
0         1  StackOverflow is quite good  3870       43  name1
1         1  StackOverflow is quite good  4537       43  name2
2         1  StackOverflow is quite good  5690       43  name3

print(output_df.dtypes)
Division    float64
comment      object
id          float64
matchId     float64
name         object
dtype: object

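def convert_to_int(x):
    # Cast a column to int; leave it unchanged if the cast fails
    # (e.g. string columns or columns containing NaN).
    try:
        return x.astype(int)
    except (ValueError, TypeError):
        return x

output_df = output_df.apply(convert_to_int)

print(output_df)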
   Division                      comment    id  matchId   name
0         1  StackOverflow is quite good  3870       43  name1
1         1  StackOverflow is quite good  4537       43  name2
2         1  StackOverflow is quite good  5690       43  name3

print(output_df.dtypes)
Division     int32
comment     object
id           int32
matchId      int32
name        object
dtype: object
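
A further option, not part of the solutions above, is to avoid the conversion in the first place. The floats most likely appear because the frame is grown row by row: cells that are missing at some point are filled with NaN, which forces a float dtype. Collecting plain dicts and building the frame once sidesteps that. This is only a sketch under assumptions: `get_data` here is a stand-in stub returning data shaped like the question's example, and the input frame is built inline rather than read from input.csv.

```python
import pandas as pd

def get_data(record_id):
    # Stand-in for the real lookup (assumption): returns a list of
    # dicts shaped like the example in the question.
    fake = {
        3870: [{"matchId": 42, "comment": "Awesome match"},
               {"matchId": 43, "comment": "StackOverflow is quite good"}],
        4537: [{"matchId": 90, "comment": "Random value"}],
    }
    return fake.get(record_id, [])

# Inline replacement for pd.read_csv(INPUT_FILE), for illustration only.
input_df = pd.DataFrame({
    "Division": [1, 1],
    "id": [3870, 4537],
    "name": ["name1", "name2"],
})

rows = []
for record in input_df.to_dict("records"):
    for match in get_data(record["id"]):
        # Merge the input row with each match; input columns come first.
        rows.append({**record, **match})

# Building the frame once from complete rows keeps integer columns integer.
output_df = pd.DataFrame(rows)
print(output_df.dtypes)
```

With the frame built this way, to_csv(index=False) should write the integer columns without a trailing .0, so no cast or float_format is needed.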