I need to extract the strings from column D (yellow) for every row that has a # in column F (blue). I am a beginner and tried both pandas and openpyxl for this task, but with no luck. Which one would be better for this?
I want them stored so I can access them later.
Also, would regular expressions be the easiest way to extract the numbers from column H (green)? Link to OneDrive with the Excel file
Post text and not images, to give us a chance to work with your data. – Nickil Maveli, Dec 30, 2016 at 14:33
@NickilMaveli Added a link to the workbook, thanks for the heads-up. – k_mishap, Dec 30, 2016 at 14:38
2 Answers
I think you need read_excel first, and it seems the first 7 rows have to be skipped:
import pandas as pd

# Skip the 7 rows above the header row
df = pd.read_excel('LTE_KPIs_up.xlsx', skiprows=7)
#print (df)
And then select by loc with boolean indexing:
print (df.loc[df.Unit == '#', 'KPI name'])
0 UE-triggered ERAB Setup Attempts
1 UE-triggered ERAB Setup Successes
4 MME-initiated ERAB Setup Attempts
5 MME-initiated ERAB Setup Successes
8 eNodeB-initiated ERAB Release Attempts
9 eNodeB-initiated ERAB Drops
11 MME-initiated ERAB Release Attempts
12 MME-initiated ERAB Drops
14 ERAB Modification Attempts
15 ERAB Modification Successes
18 HO Preparation Attempts
19 HO Preparation Successes
22 HO Resource Allocation Attempts
23 HO Resource Allocation Successes
26 Handover Attempts
27 Handover Successes
33 EPS Attach Attempts
34 EPS Attach Successes
37 EPS Detach Attempts
38 EPS Detach Successes
40 EPS Authentication Attempts
41 EPS Authentication Successes
43 EPS Security Setup Attempts
44 EPS Security Setup Successes
46 EMM Identification Attepmt
47 EMM Identification Successes
49 EPS Service Request Attemptss
50 EPS Service Request Successes
52 Tracking Area Update Attempts
53 Tracking Area Update Successes
117 S6a Delete Subscriber Data Attempts
118 S6a Delete Subscriber Data Successes
120 S6a Notification Attempts
121 S6a Notification Successes
126 S11 Create Session Attempts
127 S11 Create Session Successes
130 S11 Create Bearer Attempts
131 S11 Create Bearer Successes
134 S11 Update Bearer Attempts
135 S11 Update Bearer Successes
138 Modify Access Bearer Attempts
139 Modify Access Bearer Successes
141 Release Access Bearer Attempts
142 Release Access Bearer Successes
144 Downlink Data Notification Attempts
145 Downlink Data Notification Successes
147 S11 Delete Session Attempts
148 S11 Delete Session Successes
150 S11 Delete Bearer Attempts
151 S11 Delete Bearer Successes
154 Suspend Attempts
155 Suspend Successes
157 Resume Attempts
158 Resume Successes
162 ME Identity Check Attempts
163 ME Identity Check Successes
168 Credit Control Initial Attempts
169 Credit Control Initial Successes
171 Credit Control Termination Attempts
172 Credit Control Termination Successes
Name: KPI name, dtype: object
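Since the question asks for the values to be stored for later access, the same selection can be assigned to a variable instead of printed. The following is only a minimal sketch: the small DataFrame stands in for the real sheet, its three rows are illustrative sample data, and the names counter_kpis and kpi_names are made up for this example.

```python
import pandas as pd

# Stand-in for the sheet: only the two columns used in the selection above.
# The rows are illustrative sample data, not the full workbook.
df = pd.DataFrame({
    'KPI name': ['Handover Attempts', 'Handover Success Rate', 'Handover Successes'],
    'Unit': ['#', '%', '#'],
})

# Same boolean-indexing selection, assigned to a variable instead of printed.
counter_kpis = df.loc[df['Unit'] == '#', 'KPI name']

# A plain list is convenient for later loops or lookups.
kpi_names = counter_kpis.tolist()
print(kpi_names)
```

The Series keeps the original row labels, which helps if the matching rows have to be looked up in df again; the list is simpler if only the names are needed.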
You could select the required values from column D, for the rows where column F is '#', using the following code. I also assume that column H has an '=' sign before the number.
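import pandas as pd

Excelfile = "file.xlsx"
# 'ColumnD', 'ColumnF' and 'ColumnH' are placeholders for the real header names
df = pd.read_excel(Excelfile, sheet_name='Sheet1')
# Keep column D only where column F is '#'; dropna() removes the other rows
selectstring = df['ColumnD'].where(df['ColumnF'] == '#').dropna()
print(selectstring)
# .str[1] takes the part after '=' in every row, not just row 1
print(df['ColumnH'].str.split('=').str[1])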
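On the regular-expression part of the question: if the cells in column H do not all follow the '=' pattern, a regex that picks out the first number may be more robust than splitting. This is a rough sketch using only the standard re module. The contents of column H are not shown in the thread, so the sample cell values, the pattern, and the helper first_number are all assumptions for illustration.

```python
import re

# Hypothetical cell values; the real contents of column H are not shown here.
cells = ['=100', 'target = 99.5', 'n/a', None]

# Optional minus sign, digits, optional decimal part.
NUMBER = re.compile(r'-?\d+(?:\.\d+)?')

def first_number(cell):
    """Return the first number found in a cell as a float, or None if there is none."""
    if not isinstance(cell, str):
        return None  # empty cells and non-text values
    match = NUMBER.search(cell)
    return float(match.group()) if match else None

numbers = [first_number(c) for c in cells]
print(numbers)
```

The cell values could come from something like df['ColumnH'].tolist(), where 'ColumnH' is again a placeholder for the real header name.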