I have a CSV dataset that I need to filter on a condition, but the condition can be true on multiple days. I want to keep only the last row for which the condition is true.
My dataset looks like this:
Date City Summary FlightNo. Terminal Company
2-18-2019 NY Airplane Land 23 7 Delta
2-18-2019 London Cargo handling 4 5 British
2-18-2019 Dubai Airplane land 92 7 Emirates
2-19-2019 Dubai Airplane stay 92 5 Emirates
2-19-2019 Paris Flight cancel 78 2 British
2-19-2019 London Airplane Land 4 5 British
2-19-2019 LA Airplane Land 7 2 United
2-20-2019 Dubai Airplane land 92 3 Emirates
2-20-2019 LA Airplane land 29 3 Delta
2-20-2019 NY Airplane left 23 1 Delta
2-21-2019 Paris Airplane reschedu 78 2 British
2-21-2019 London Airplane land 4 3 British
2-21-2019 LA Airplane from NY land 29 5 Delta
~~~
3-10-2019 London Airplane land 5 5 KLM
3-10-2019 Paris Airplane Land 78 7 AirFrance
3-10-2019 LA Reschedule 29 4 United
3-11-2019 NY Cargo handled 23 7 Delta
3-11-2019 Dubai Arrived be4 2 day 34 7 Etihad
~~~
3-21-2019 Dubai Airplane land 92 5 Emirates
3-21-2019 New Delhi Reschedule 9 4 AirAsia
3-21-2019 London Cargo handling 5 2 Lufthansa
3-22-2019 New Delhi Airplane Land 9 3 AirAsia
3-22-2019 NY Reschedule 23 2 United
3-22-2019 Dubai Airplane land 35 1 Emirates
The code should return the last plane-landing entry among rows where City, FlightNo., and Company all match. As you can see, this condition can be true on multiple days. So if all three columns match and Summary contains "Airplane" and "land", return only the last matching entry.
Edit: the desired output should look like the dataset below:
Date City Summary FlightNo. Terminal Company
2-18-2019 NY Airplane Land 23 7 Delta
2-19-2019 LA Airplane Land 7 2 United
2-20-2019 Dubai Airplane land 92 3 Emirates
2-21-2019 London Airplane Land 4 3 British
2-21-2019 LA Airplane from NY land 29 5 Delta
~~~
3-10-2019 London Airplane land 5 5 KLM
3-10-2019 Paris Airplane Land 78 7 AirFrance
~~~
3-21-2019 Dubai Airplane land 92 5 Emirates
3-22-2019 New Delhi Airplane Land 9 3 AirAsia
3-22-2019 Dubai Airplane land 35 1 Emirates
As shown above, a row is deleted only when all three columns (City, FlightNo., and Company) are the same. If any of them differs, both rows should be kept.
The logic:
Condition 1: if df[Summary] contains "Airplane" and "land", keep the row.
Condition 2: from the already filtered dataset, if City, FlightNo., and Company match between rows within 3 days, keep either the last or the first. So if the filter returns rows for an airplane landing in the same city, with the same flight number, run by the same company, on the 18th and the 20th, only one of those rows should be kept. But if the landings were on the 1st and the 15th of the same month, keep both rows.
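To make the two conditions concrete, here is a rough pandas sketch. It rests on several assumptions: the sample frame is a hypothetical subset of the data above, the match on "Airplane" and "land" is case-insensitive, consecutive landings of the same flight are chained into one cluster while each is less than 3 days after the previous one, and the last row of each cluster is kept.

```python
import pandas as pd

# Hypothetical sample: a few rows shaped like the dataset above.
df = pd.DataFrame(
    {
        "Date": ["2-18-2019", "2-19-2019", "2-20-2019", "2-20-2019", "3-21-2019"],
        "City": ["Dubai", "Dubai", "Dubai", "NY", "Dubai"],
        "Summary": ["Airplane land", "Airplane stay", "Airplane Land",
                    "Airplane left", "Airplane land"],
        "FlightNo.": [92, 92, 92, 23, 92],
        "Company": ["Emirates", "Emirates", "Emirates", "Delta", "Emirates"],
    }
)
df["Date"] = pd.to_datetime(df["Date"], format="%m-%d-%Y")

# Condition 1: Summary contains both "airplane" and "land" (case-insensitive).
summary = df["Summary"].str.lower()
landed = df[summary.str.contains("airplane", regex=False)
            & summary.str.contains("land", regex=False)]

# Condition 2: within each (City, FlightNo., Company) group, start a new
# cluster whenever the gap to the previous landing is 3 days or more
# (assumed boundary), then keep the last row of every cluster.
keys = ["City", "FlightNo.", "Company"]
window = pd.Timedelta(days=3)

landed = landed.sort_values(keys + ["Date"])
gap = landed.groupby(keys)["Date"].diff()      # NaT on the first row of a group
new_cluster = gap.isna() | (gap >= window)
cluster_id = new_cluster.cumsum()              # unique id per cluster

result = landed.groupby(cluster_id).tail(1).sort_values("Date")
print(result)
```

On this sample the Dubai landings on 2-18 and 2-20 collapse into the 2-20 row, while the 3-21 landing stays as its own row. Replacing `tail(1)` with `head(1)` would keep the first row of each cluster instead.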
Please help me find a way to apply all the conditions and keep the last true entry.
EDIT:
Keep the first row if the condition is true again within the next 3 days.
Input:
print (df)
Date City Code Summary Flight No. Company
0 2-18-2019 021 Airplane land 23 Emirates
1 2-18-2019 013 Airplane land 23 Etihad
2 2-19-2019 021 Airplane land 23 Emirates
3 2-19-2019 013 Airplane Land 23 Etihad
4 2-20-2019 021 Airplane land 23 Emirates
5 2-20-2019 055 Airplane land 23 Emirates
6 2-20-2019 013 Airplane land 23 Etihad
7 2-21-2019 021 Airplane land 23 Emirates
8 2-21-2019 013 Airplane land 78 Emirates
9 2-21-2019 055 Airplane from NY land 23 Emirates
10 2-22-2019 021 Airplane land 78 Emirates
11 2-22-2019 013 Airplane Land 78 Emirates
12 2-22-2019 055 Airplane land 78 Emirates
13 2-23-2019 021 Airplane land 78 Etihad
Output:
print (df)
Date City Code Summary Flight No. Company
0 2-18-2019 021 Airplane land 23 Emirates
1 2-18-2019 013 Airplane land 23 Etihad
5 2-20-2019 055 Airplane land 23 Emirates
7 2-21-2019 021 Airplane land 23 Emirates
8 2-21-2019 013 Airplane land 78 Emirates
10 2-22-2019 021 Airplane land 78 Emirates
12 2-22-2019 055 Airplane land 78 Emirates
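One possible reading of this EDIT is an anchored window: the first landing of a flight is kept, later landings less than 3 days after that kept row are dropped, and the next landing 3 or more days later becomes the new anchor. The sketch below assumes that reading. The helper `keep_first_in_window` and the sample frame are hypothetical, and the Summary filter is skipped because every sample row is already a landing.

```python
import pandas as pd

def keep_first_in_window(dates, days=3):
    """For ascending dates, mark True on each row that opens a new window."""
    window = pd.Timedelta(days=days)
    keep, anchor = [], None
    for d in dates:
        if anchor is None or d - anchor >= window:
            anchor = d          # this row is kept and becomes the new anchor
            keep.append(True)
        else:
            keep.append(False)  # still inside the window of the kept row
    return keep

# Hypothetical sample shaped like the EDIT input.
df = pd.DataFrame(
    {
        "Date": ["2-18-2019", "2-18-2019", "2-19-2019",
                 "2-19-2019", "2-20-2019", "2-21-2019"],
        "City Code": ["021", "013", "021", "013", "021", "021"],
        "Summary": ["Airplane land"] * 6,
        "Flight No.": [23] * 6,
        "Company": ["Emirates", "Etihad", "Emirates",
                    "Etihad", "Emirates", "Emirates"],
    }
)
df["Date"] = pd.to_datetime(df["Date"], format="%m-%d-%Y")

keys = ["City Code", "Flight No.", "Company"]
keep = pd.Series(False, index=df.index)

# Walk each group in date order and record which rows open a window.
for _, grp in df.sort_values("Date").groupby(keys):
    keep.loc[grp.index] = keep_first_in_window(grp["Date"])

result = df[keep]
print(result)
```

For flight 23 of Emirates at city code 021, the rows on the 18th and the 21st survive and the 19th and 20th are dropped, which matches rows 0 and 7 of the output above. A plain loop is used because each decision depends on the previously kept row, which is awkward to express with vectorized operations alone.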