0

source table

Please see above my source table:

I am interesting to calculate new column "Group" based on list of interest lots:

List of interest

Value of column "Group" based on condition if lots in source table in column "Lot" exist in lots of interest. If this is not true, value in "Group"column will be copied from "LOT_VIRTUAL_LINE" cell

Desired output:

enter image description here

1
  • 3
    Instead of including images (which can't be copied and pasted), please include plain text (which can). Commented Dec 8, 2015 at 3:30

2 Answers 2

1

Because this question is tagged Pandas, I assume we are talking dataframes and series instead of plain lists. You can use loc to locate the rows and columns that match you criteria (e.g. whether each element in the LOT column isin the series of lots of interest).

df = pd.DataFrame({'LOT': ['A1111', 'A2222', 'A3333', 'B1111', 'B2222', 'B3333'], 
                   'LOT_VIRTUAL_LINE': ['AAA'] * 3 + ['BBB'] * 3})
s = pd.Series(['A1111', 'B2222'], name='Lots Of Interest')
# or... df2 = pd.read_csv('file_path/file_name.csv')

# Value of 'GROUP' defaults to 'LOT_VIRTUAL_LINE'.
df['GROUP'] = df.LOT_VIRTUAL_LINE

# But gets overwritten by 'LOT' if it is in the 'Lots of Interest' series.
mask = df.LOT.isin(s)
# or... mask = df.LOT.isin(df2['Lots of Interest'])  # Whatever the column name is.
df.loc[mask, 'GROUP'] = df.loc[mask, 'LOT']

# Confirm results.
>>> df
     LOT LOT_VIRTUAL_LINE  GROUP
0  A1111              AAA  A1111
1  A2222              AAA    AAA
2  A3333              AAA    AAA
3  B1111              BBB    BBB
4  B2222              BBB  B2222
5  B3333              BBB    BBB
Sign up to request clarification or add additional context in comments.

2 Comments

Thank you, Alexander. Can you please show how I am reading csv file contains one column into Pandas series? Thanks
can you please show me the first two and last two list of the output from pd.read_csv('full_path_to_filename') where full_path_to_filename is just that.
1

Assuming we have a list named lots_of_interest, for instance as the result of read_csv(path).loc[:, 'lots_of_interest'].tolist():

df['Group'] = df.apply(lambda x: x['LOT'] if x['LOT'].isin(lots_of_interest) else x['LOT_VIRTUAL_LINE'], axis=1)

5 Comments

Thank you a lot, Stefan. And if lots_of_interest is data frame read from csv file?
pd.read_csv("C:/Users/fdoktorm/Documents/Databases/SQL Pathfinder/Startup Integration Toolkit/Examples For Development/LotsOfInterest.csv").loc[:, 'LotsOfInterest'].to_list() gives error:(type(self).__name__, name)) AttributeError: 'Series' object has no attribute 'to_list'
Sorry, it's .tolist().
Thank you, Now in lambda function row following error: AttributeError: ("'str' object has no attribute 'isin'", u'occurred at index 0')
Fixed with if x['SPC_LOT'] in LotsOfInterest statement. Thank you a lot, Stefan. Great help

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.