xml to csv python 3

Question

I have a XML file which I want to convert to CSV using Python. The program should convert the every xml data and also large one.

I tried it for 6 weeks, but without any success. I searched the whole Internet but now I need your help.

The program is printing a csv data but without any contents. Thank you for your help!

Please provide a sample of the code you've tried so far and also what the output should look like — user2668284
– user2668284, Commented Aug 9, 2021 at 8:33
Could you put in some sample data in your expected CSV file format and show us how it looks like? It isn't clear. — Ram
– Ram, Commented Aug 9, 2021 at 9:51
This "unicode" stuff seems like a distraction that has nothing to do with the actual problem of getting from XML to CSV. Just replace xml2csv(unicode(sys.argv[1])) with xml2csv("input.xml"). Please trim your code down as much as possible. You need to provide a proper minimal reproducible example. There are also indentation errors in the code in the question. — mzjn
– mzjn, Commented Aug 9, 2021 at 10:11

Martin Evans · Accepted Answer · 2021-08-10 09:09:57Z

It is not possible to write a generic script that would handle ALL types of XML files, you will need to adapt the script for your own needs.

The following approach should get you started, it first finds all of the Metrics tags and builds up CSV rows from there using a standard csv.DictWriter().

from bs4 import BeautifulSoup
import csv

headers = [
    "Scope", 
    "Project", 
    "Namespace", 
    "Type", 
    "Member",
    "MaintainabilityIndex",
    "CyclomaticComplexity",
    "ClassCoupling",
    "DepthOfInheritance",
    "SourceLines",
    "ExecutableLines",
]

namespace = ''

with open('input.xml') as f_input:
    soup = BeautifulSoup(f_input.read(), "xml")

with open('output.csv', 'w', newline='') as f_output:
    csv_output = csv.DictWriter(f_output, fieldnames=headers)
    csv_output.writeheader()
    
    for metrics in soup.find_all('Metrics'):
        parent = metrics.parent
        
        if parent.name == 'Namespace':
            namespace = parent['Name']
        elif parent.name == 'Method':
            member = parent['Name']
        else:
            member = ''
        
        row = {'Scope' : parent.name, 'Namespace' : namespace, 'Member' : member}
        
        for metric in metrics.find_all('Metric'):
            row[metric['Name']] = metric['Value']
        
        csv_output.writerow(row)

This would give you an output.csv CSV file starting like:

Scope,Project,Namespace,Type,Member,MaintainabilityIndex,CyclomaticComplexity,ClassCoupling,DepthOfInheritance,SourceLines,ExecutableLines
Assembly,,,,,88,17,35,3,100,22
Namespace,,TEST,,,84,7,26,1,63,15
NamedType,,TEST,,,91,2,8,1,14,3
Method,,TEST,,void Program.Main(string[] args),93,1,3,,4,1
Method,,TEST,,IHostBuilder Program.CreateHostBuilder(string[] args),84,1,6,,6,2
NamedType,,TEST,,,78,5,19,1,43,12
Method,,TEST,,Startup.Startup(IConfiguration configuration),96,1,1,,4,1

Hopefully this helps to get you started. I made use of the xml parser as this does not convert all tags to lowercase.

Collectives™ on Stack Overflow

xml to csv python 3

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related