I have the following code block:
from jira import JIRA
import pandas as pd
cert_path = 'C:\\cert.crt'
start_date = '2020-10-01'
end_date = '2020-10-31'
a_session = JIRA(server='https://jira.myinstance-A.com', options={'verify': cert_path}, kerberos=True)
b_session = JIRA(server='https://jira.myinstance-B.com', options={'verify': cert_path}, kerberos=True)
c_session = JIRA(server='https://jira.myinstance-C.com', options={'verify': cert_path}, kerberos=True)
query_1 = 'project = "Test Project 1" and issuetype = Incident and resolution = Resolved and updated >= "{}" and updated <= "{}"'.format(start_date, end_date)
query_2 = 'project = "Test Project 2" and issuetype = Incident and resolution = Resolved and updated >= "{}" and updated <= "{}"'.format(start_date, end_date)
query_3 = 'project = "Test Project 3" and issuetype = Defect and resolution = Resolved and releasedate >= "{}" and releasedate <= "{}"'.format(start_date, end_date)
query_4 = 'project = "Test Project 4" and issuetype = Enhancement and resolution = Done and completed >= "{}" and completed <= "{}"'.format(start_date, end_date)
block_size = 100
block_num = 0
all_issues = []
while True:
    # fetch the next page of results
    start = block_num * block_size
    issues = a_session.search_issues(query_1, start, block_size)
    if len(issues) == 0:
        break
    block_num += 1
    for issue in issues:
        all_issues.append(issue)

issues = pd.DataFrame()
for issue in all_issues:
    # flatten the fields of interest into one row per issue
    d = {
        'key' : issue.key,
        'type' : issue.fields.type,
        'creator' : issue.fields.creator,
        'resolution' : issue.fields.resolution
    }
    issues = issues.append(d, ignore_index=True)
This code runs fine and allows me to:
- retrieve the data associated with query_1 only (which connects to a_session)
- save that data into a Pandas dataframe
Now, I would like to be able to:
a. retrieve the data associated with query_2 (which also connects to a_session) and save it to the issues dataframe
b. retrieve the data associated with query_3 (which connects to b_session) and save it to the issues dataframe
c. retrieve the data associated with query_4 (which connects to c_session) and save it to the issues dataframe
Notice that the structure of query_3 and query_4 differs from that of query_1 and query_2 (the field names are different, among other things).
I could write one GIANT script (which would probably work), but I'm sure there is a more elegant way of approaching this (perhaps with a nested loop). What's the best way of adapting this code block so that it handles cases a, b, and c above?
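To illustrate the kind of structure I have in mind, something like this (untested sketch; get_all_issues is a hypothetical helper that would wrap the pagination logic above):

session_queries = [
    (a_session, query_1),
    (a_session, query_2),
    (b_session, query_3),
    (c_session, query_4),
]

frames = []
for session, query in session_queries:
    # get_all_issues is hypothetical here: it would run the pagination
    # loop above for one query and return one dataframe
    frames.append(get_all_issues(session, query))

issues = pd.concat(frames, ignore_index=True)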
Any help would be much appreciated by this Python novice! Thanks in advance!
UPDATE 1:
I used the (very elegant) solution suggested by @Nick ODell. The code runs fine, but for whatever reason, I get a None result. I spent the past few hours trying to debug this and my leading theory is that the field names are not passed (as they are in d in the original code block I posted).
I tried to amend the get_all_issues function as follows:
def get_all_issues(session, query):
    start = 0
    all_issues = []
    while True:
        issues = session.search_issues(query, start, block_size)
        if len(issues) == 0:
            # No more issues
            break
        start += len(issues)
        for issue in issues:
            all_issues.append(issue)
    issues = pd.DataFrame()
    for issue in all_issues:
        d = {
            'key' : issue.key,
            'type' : issue.fields.type,
            'creator' : issue.fields.creator,
            'resolution' : issue.fields.resolution
        }
        issues = issues.append(d, ignore_index=True)
But now there is an error message saying:
ValueError: All objects passed were None.
How would we amend the get_all_issues() function so that we can nest the following for loop inside it and pass in the field names?
for issue in all_issues:
    d = {
        'key' : issue.key,
        'type' : issue.fields.type,
        'creator' : issue.fields.creator,
        'resolution' : issue.fields.resolution
    }
    issues = issues.append(d, ignore_index=True)
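For what it's worth, this is roughly the shape I imagine the amended function taking, with the loop nested at the end and the dataframe returned explicitly (untested sketch, keeping the same field names as in my snippets above):

def get_all_issues(session, query):
    block_size = 100
    start = 0
    all_issues = []
    while True:
        issues = session.search_issues(query, start, block_size)
        if len(issues) == 0:
            break
        start += len(issues)
        all_issues.extend(issues)
    # build one row per issue, then return the dataframe explicitly
    # so the caller does not end up with None
    rows = []
    for issue in all_issues:
        rows.append({
            'key' : issue.key,
            'type' : issue.fields.type,
            'creator' : issue.fields.creator,
            'resolution' : issue.fields.resolution
        })
    return pd.DataFrame(rows)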
UPDATE 2:
Instead of using pd.json_normalize(issues), I used pd.DataFrame(issues) and added a dictionary of field names. The following code works **because all fields exist in a_session, b_session, and c_session**:
def get_all_issues(session, query):
    block_size = 50
    block_num = 0
    start = 0
    all_issues = []
    while True:
        issues = session.search_issues(query, start, block_size)
        if len(issues) == 0:
            # No more issues
            break
        start += len(issues)
        for issue in issues:
            all_issues.append(issue)
    issues = pd.DataFrame(issues)
    for issue in all_issues:
        d = {
            'key' : issue.key,
            'type' : issue.fields.type,
            'creator' : issue.fields.creator,
            'resolution' : issue.fields.resolution
        }
        issues = issues.append(d, ignore_index=True)
    return issues
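For completeness, I then combine the results from the three instances like this (sketch; the session/query pairing follows the list in my original question):

issues = pd.concat([
    get_all_issues(a_session, query_1),
    get_all_issues(a_session, query_2),
    get_all_issues(b_session, query_3),
    get_all_issues(c_session, query_4),
], ignore_index=True)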
Then, I added 3 new custom fields to the dictionary:
for issue in all_issues:
    d = {
        'key' : issue.key,
        'type' : issue.fields.type,
        'creator' : issue.fields.creator,
        'resolution' : issue.fields.resolution,
        'system_change' : issue.fields.customfield_123,
        'system_resources' : issue.fields.customfield_456,
        'system_backup' : issue.fields.customfield_789
    }
Custom field 123 exists in a_session and b_session, but not in c_session. Custom field 456 exists only in c_session. And custom field 789 exists in b_session and c_session.
Doing so results in the following error: AttributeError: type object 'PropertyHolder' has no attribute 'customfield_123'.
Can anyone suggest an elegant solution to handle this? (i.e. a dictionary that can hold any number of fields, where the code 'understands' which fields apply to a given session)
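One idea I've been toying with, in case it helps frame the question: looking each custom field up defensively with getattr and a default, so that a field missing from a given instance comes back as None instead of raising (untested sketch):

for issue in all_issues:
    d = {
        'key' : issue.key,
        'type' : issue.fields.type,
        'creator' : issue.fields.creator,
        'resolution' : issue.fields.resolution,
        # getattr with a default returns None instead of raising
        # AttributeError when a custom field is missing from an instance
        'system_change' : getattr(issue.fields, 'customfield_123', None),
        'system_resources' : getattr(issue.fields, 'customfield_456', None),
        'system_backup' : getattr(issue.fields, 'customfield_789', None)
    }
    issues = issues.append(d, ignore_index=True)

Thanks!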