
I have an Excel sheet (test.xlsx) with multiple columns: col1, col2, col3, col4, and so on. I want to perform some operation on col2 and col3, and then write output.xlsx containing all the columns again, with col2 and col3 updated.

What I was trying:

df = pd.read_excel('test.xlsx')
col = ['col2', 'col3']
df_with_some_operation = df[col].<some_op>
df_with_some_operation.to_excel('output.xlsx')

I need help with this code so that all the columns, including the updated col2 and col3, end up in the final output.xlsx.

For better visualisation, see below. I do not want to change the column names, only update the content. I picked this example to keep it simple: col2 and col3 are multiplied by 2. Note that the actual sheet has many more columns, but I only have to work on two of them.

input.xlsx
col1  col2  col3
1     2     3

output.xlsx
col1  col2  col3
1     4     6
  • Why can't you just use drop? Commented Aug 27, 2018 at 17:12
  • @roganjosh I need to keep all the columns as it is, just have to update col2,col3 . The excel sheet needs to be used with all the columns. Any better way you can suggest ? Commented Aug 27, 2018 at 17:13
  • Ok, so drop the columns you don't want before to_excel Commented Aug 27, 2018 at 17:14
  • To provide a good solution, you need to indicate what's involved in <and then the logic work>. Vectorised operations may be possible instead of the more generic but inefficient pd.DataFrame.apply. Commented Aug 27, 2018 at 17:23
  • @jpp Please check the code, updated. Commented Aug 27, 2018 at 17:26

3 Answers

1

You can just assign the result of pd.DataFrame.applymap to df[cols]. This will leave the rest of your dataframe unchanged.

df = pd.read_excel('test.xlsx')

cols = ['col2','col3']
df[cols] = df[cols].applymap(lambda c: translate.translate_text(...))

df.to_excel('output.xlsx')
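
As a rough illustration of the same assign-back pattern on the question's "multiply by 2" example, here is a minimal sketch. The in-memory DataFrame stands in for the result of pd.read_excel, and add_one is a hypothetical per-cell function, not something from the question. Note that DataFrame.applymap is deprecated as of pandas 2.1 in favour of DataFrame.map, so the element-wise variant below goes through Series.map instead, which should behave the same on older and newer versions.

```python
import pandas as pd

# Stand-in for pd.read_excel('test.xlsx'); the values are made up for illustration.
df = pd.DataFrame({'col1': [1, 10], 'col2': [2, 20], 'col3': [3, 30]})

cols = ['col2', 'col3']

# Vectorised update: only col2 and col3 are overwritten, col1 is left untouched.
df[cols] = df[cols] * 2


def add_one(value):
    """Hypothetical per-cell operation for logic that cannot be vectorised."""
    return value + 1


# Element-wise alternative: apply Series.map column by column, then assign back.
updated = df.copy()
updated[cols] = updated[cols].apply(lambda s: s.map(add_one))
```

When the operation can be expressed as column arithmetic, the vectorised form is usually the better choice; the element-wise form is the fallback for arbitrary Python functions.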

If you want 2 new columns, you can use pd.DataFrame.join:

df = df.join(df[cols].applymap(lambda c: translate.translate_text(...))
                     .set_axis(['col2a', 'col3a'], axis=1))
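
A self-contained sketch of the join variant, assuming a recent pandas where set_axis returns a new object and takes axis as a keyword. The sample data and the doubling operation are placeholders for the real transformation; add_suffix is shown as an alternative that avoids listing the new names by hand.

```python
import pandas as pd

# Made-up sample data standing in for the sheet in the question.
df = pd.DataFrame({'col1': [1], 'col2': [2], 'col3': [3]})
cols = ['col2', 'col3']

# Placeholder transformation; substitute the real operation here.
doubled = df[cols] * 2

# Rename the transformed columns explicitly, then join them onto the original.
result = df.join(doubled.set_axis(['col2a', 'col3a'], axis=1))

# Same outcome by appending a suffix to every transformed column name.
result_suffix = df.join(doubled.add_suffix('a'))
```

The new names must differ from the existing ones, since join raises an error on overlapping column names unless a suffix is supplied.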

1

Just include the newly generated columns in the original dataframe.

df_with_some_operation = df[col].<and then the logic work>
newcolumns = ["coln1", "coln2"]

df[newcolumns] = df_with_some_operation

This way, if you save your dataframe df, it will have all the original columns as well as the modifications you made.

Note: you can directly assign the new columns instead of writing down separately, like above. This is for understanding only:

newcolumns = ["coln1", "coln2"]

df[newcolumns] = df[col].<and then the logic work>

2 Comments

Sorry for the misunderstanding. I just updated the last line: I need all the columns, including the updated col1 and col2, in output.xlsx.
@steveJ made modifications
1
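import pandas as pd

df = pd.DataFrame({'A': [2, 3, 4], 'B': [5, 7, 9], 'C': [10, 11, 12]})

# Update columns A and B in place; column C is left as it is.
df['A'] = df['A'] * 3
df['B'] = df['B'] * 2

# All three columns are written to the file.
df.to_excel('output.xlsx')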
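
One detail that may matter for the question's workflow: to_excel writes the row index as an extra first column by default, so a file that is read and written back gains a column unless index=False is passed. The sketch below wraps the idea in two hypothetical helpers (scale_columns and rewrite_workbook are names invented here); the file round trip assumes an Excel engine such as openpyxl is installed.

```python
import pandas as pd


def scale_columns(df, factors):
    """Return a copy of df with each column named in `factors` multiplied by its factor."""
    out = df.copy()
    for name, factor in factors.items():
        out[name] = out[name] * factor
    return out


def rewrite_workbook(src, dst, factors):
    """Read src, scale the chosen columns, and write every column to dst."""
    df = pd.read_excel(src)
    # index=False keeps the row index from being written as an extra first column.
    scale_columns(df, factors).to_excel(dst, index=False)


# In-memory demonstration using the sample data from the answer above.
df = pd.DataFrame({'A': [2, 3, 4], 'B': [5, 7, 9], 'C': [10, 11, 12]})
result = scale_columns(df, {'A': 3, 'B': 2})
```

A call such as rewrite_workbook('test.xlsx', 'output.xlsx', {'col2': 2, 'col3': 2}) would then correspond to the example in the question.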

6 Comments

What does this have to do with the question?
from @steveJ :"I need to keep all the columns as it is, just have to update col2,col3", I am just trying provide an example for his requirement (updating columns A and B and keeping column C as it is)
But this writes all the columns to the file
Thats what he asked
The question has been edited to change "apart from" to "including". Apologies :/