I am getting some JSON from an API and writing the response data to a CSV file. Some of the JSON keys are numeric (they are strings, but can be converted to numbers) and some are plain strings. If a key is numeric, I need to convert it to an actual name. I have an XML file that serves as a lookup table, and I am getting errors when reading the text of an XML field.

.json file

[
    {
    "id": "228",
    "fullName2": "users name",
    "4600.0": "0000-00-00",
    "4600.2": "some text"
    }
]

columnLookup.xml file as a look up table

<data>
 <fields>
   <field id="NA" alias="fullName2">full name</field>
   <field id="15493" alias="id">id</field>
   <field id="4600.0" alias="jobTitle">Job Title</field>
 </fields>
</data>

.py file (my code)

def write_to_csv_file(self):
        with open(self.mJSON_file_name) as json_file:
            data = json.load(json_file)

            csv_file = open(self.mCSV_file_name, 'w')

            csv_writer = csv.writer(csv_file)
            count = 0
            keys_dict = {}
            for emp in data:
                if count == 0:
                    header = emp.keys()
                    for key in header:
                        headers = get_name_from_id(key)
                        keys_dict[headers] = ''
                    
                    csv_writer.writerow(keys_dict.keys())
                    count += 1
                csv_writer.writerow(emp.values())
        
        csv_file.close()
        self.mCSV_data = csv_file

def get_name_from_id( search_id):
    query = 'field[@id="{}"]'.format(search_id)
    alias_query = 'field[@alias="{}"]'.format(search_id)

    tree = ET.parse('columnLookup.xml')
    root = tree.getroot()

    for data in root.iter('fields'):
        if search_id.isdigit():
            return data.find(query).text
        else:
            return data.find(alias_query).text

Explanation:

Basically, I take each key from the JSON file and pass it to the get_name_from_id() function. If the key can be converted to a number, I look up the field with that id in the XML file and return its text. If it can't, I look up the field by its alias and return its text. I'm crashing on "fullName2".

When the key from the JSON file is "fullName2", I need it to find the field in the XML file with alias="fullName2" and return the text "full name". Any ideas why I'm getting the AttributeError?

  • Try query = '//field[@id="{}"]'.format(search_id) that is a relative path lookup. Commented Jul 22, 2021 at 21:42
  • Always put the full error message (starting at the word "Traceback") in the question (not in a comment) as text (not a screenshot, not a link to an external portal). It contains other useful information. Commented Jul 22, 2021 at 23:48
  • The error means that it couldn't find the element, so you get None and then try to do None.text. First you should get item = data.find(query) or item = data.find(alias_query), check that item is not None, and only then return item.text. Commented Jul 22, 2021 at 23:49
  • Your for-loop is wrong - you have return in both if and else, so it exits the function on the first fields element. Commented Jul 22, 2021 at 23:53
  • @LMC I think it would also need a dot at the start - './/field[@id="{}"]' - to make it relative to data. Commented Jul 22, 2021 at 23:56

2 Answers

1

I built a minimal working example with the data embedded directly in the code and found that the problem is isdigit(). It returns True only when every character in the string is a digit, so it does not tell you whether the string holds an integer or a float. "4600.0".isdigit() returns False, so the code searches for the key by alias instead of by id.
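
To see this concretely, here is a small check of how str.isdigit() treats keys like the ones in the question. The helper looks_numeric is a hypothetical name, not part of the original code; it is only one possible way to test for a number, assuming that parsing with float() is acceptable for these keys.

```python
keys = ["id", "fullName2", "4600.0", "228"]

# str.isdigit() is True only when every character is a digit,
# so the dot in "4600.0" makes it False.
digit_flags = {key: key.isdigit() for key in keys}


def looks_numeric(value):
    """Hypothetical helper: True if the string parses as a float."""
    try:
        float(value)
    except ValueError:
        return False
    return True


numeric_flags = {key: looks_numeric(key) for key in keys}

print(digit_flags)
print(numeric_flags)
```

Only "228" passes isdigit(), while both "228" and "4600.0" pass the float-based check.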

You shouldn't use isdigit(). Instead, run both XPath queries, and if one of them returns a node, return its text.

for data in root.iter('fields'):
    # compare with None: an element without children is falsy even when found
    item = data.find(query)
    if item is not None:
        return item.text
    item = data.find(alias_query)
    if item is not None:
        return item.text

Another problem could be that you aren't using a relative XPath, so it searches in the wrong place.

query = './/field[@id="{}"]'.format(search_id)
alias_query = './/field[@alias="{}"]'.format(search_id)
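
The difference between the two forms can be checked with a short sketch. It uses the standard library's xml.etree.ElementTree (my assumption, to keep it dependency-free) and searches from the root element. Note that in the question's code the query runs on each fields element, where the direct-child form does match; the leading './/' matters when the search starts higher up the tree.

```python
import xml.etree.ElementTree as ET

xml_text = """<data>
 <fields>
   <field id="NA" alias="fullName2">full name</field>
   <field id="15493" alias="id">id</field>
   <field id="4600.0" alias="jobTitle">Job Title</field>
 </fields>
</data>"""

root = ET.fromstring(xml_text)

# 'field[...]' only matches direct children of the element it is called on.
# <field> is a grandchild of <data>, so this finds nothing from the root.
direct = root.find('field[@id="4600.0"]')

# './/field[...]' searches all descendants of the current element.
relative = root.find('.//field[@id="4600.0"]')

print(direct)         # None
print(relative.text)  # Job Title
```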

Minimal working code

text_json ='''[
    {
    "id": "228",
    "fullName2": "users name",
    "4600.0": "0000-00-00",
    "4600.2": "some text"
    }
]'''

text_xml = '''<data>
 <fields>
   <field id="NA" alias="fullName2">full name</field>
   <field id="15493" alias="id">id</field>
   <field id="4600.0" alias="jobTitle">Job Title</field>
 </fields>
</data>
'''

import json
import csv
import lxml.etree as ET

class Test():
        
    def write_to_csv_file(self):
        #with open(self.mJSON_file_name) as json_file:
        #    data = json.load(json_file)
        data = json.loads(text_json)
        
        #with open(self.mCSV_file_name, 'w') as csv_file:
        with open('output.csv', 'w', newline='') as csv_file:
            csv_writer = csv.writer(csv_file)
            
            header_added = False
            
            for row in data:
                if not header_added:
                    headers = []
                    for key in row.keys():
                        name = get_name_from_id(key)
                        headers.append(name)
                    csv_writer.writerow(headers)
                    header_added = True
            
                csv_writer.writerow(row.values())
        
        #self.mCSV_data = csv_file

# ----

# read it only once

root = ET.fromstring(text_xml)
#tree = ET.parse('columnLookup.xml')
#root = tree.getroot()

def get_name_from_id(search_id):
    print('[get_name_from_id] search_id:', search_id)
    
    query_id    = './/field[@id="{}"]'.format(search_id)
    query_alias = './/field[@alias="{}"]'.format(search_id)
    
    for fields in root.iter('fields'):
        items = fields.xpath(query_alias)
        if items:
            print('query:', query_alias)
            print('items:', items)
            print('text:', items[0].text)
            return items[0].text
        items = fields.xpath(query_id)
        if items:
            print('query:', query_id)
            print('items:', items)
            print('text:', items[0].text)
            return items[0].text
        
# --- main ---

t = Test()
t.write_to_csv_file()
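
Since the XML is read only once, another option is to turn it into a plain dictionary up front and skip XPath for each key. The following is a minimal sketch under my own assumptions, not part of the code above: build_lookup is a hypothetical helper, it treats id="NA" as a placeholder to skip, and it falls back to the raw JSON key when the XML has no matching field (the code above returns None in that case).

```python
import xml.etree.ElementTree as ET

xml_text = """<data>
 <fields>
   <field id="NA" alias="fullName2">full name</field>
   <field id="15493" alias="id">id</field>
   <field id="4600.0" alias="jobTitle">Job Title</field>
 </fields>
</data>"""


def build_lookup(root):
    """Hypothetical helper: map every id and alias to the field's text."""
    lookup = {}
    for field in root.iter('field'):
        for attr in ('id', 'alias'):
            value = field.get(attr)
            # Assumption: skip missing attributes and the "NA" placeholder id
            if value and value != 'NA':
                lookup[value] = field.text
    return lookup


lookup = build_lookup(ET.fromstring(xml_text))

row = {
    "id": "228",
    "fullName2": "users name",
    "4600.0": "0000-00-00",
    "4600.2": "some text",
}

# Assumption: fall back to the raw key when the XML has no matching field
headers = [lookup.get(key, key) for key in row]
print(headers)  # ['id', 'full name', 'Job Title', '4600.2']
```

A dictionary makes each header lookup a single get() call, and the fallback keeps the CSV header row free of empty cells for keys such as "4600.2".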

1

Both attributes can be queried at the same time, returning the first result. This assumes there are no duplicates and that the search argument exists.

from lxml import etree
tree = etree.parse('test.xml')
arr = tree.xpath('//field[@id="{0}" or @alias="{0}"]'.format('15493'))
print(arr[0].text)
# result: id
arr = tree.xpath('//field[@id="{0}" or @alias="{0}"]'.format('fullName2'))
print(arr[0].text)
# result: full name
len(arr)
# result: 1
# the list contains 1 element
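
If the search argument might not exist, arr[0] raises IndexError on the empty list. The sketch below is one hedged way to guard against that. It uses the standard library's xml.etree.ElementTree instead of lxml; as far as I know, its limited XPath subset does not accept "or" inside a predicate, so the two attributes are tried in turn. find_field_text and its default parameter are hypothetical names.

```python
import xml.etree.ElementTree as ET

XML_TEXT = """<data>
 <fields>
   <field id="NA" alias="fullName2">full name</field>
   <field id="15493" alias="id">id</field>
   <field id="4600.0" alias="jobTitle">Job Title</field>
 </fields>
</data>"""


def find_field_text(root, key, default=None):
    """Hypothetical helper: text of the first <field> whose id or alias equals key."""
    for attr in ("id", "alias"):
        node = root.find('.//field[@{}="{}"]'.format(attr, key))
        if node is not None:
            return node.text
    # No match: return the fallback instead of raising
    return default


root = ET.fromstring(XML_TEXT)

print(find_field_text(root, "15493"))                     # id
print(find_field_text(root, "fullName2"))                 # full name
print(find_field_text(root, "4600.2", default="4600.2"))  # 4600.2
```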
