I have String variables of the following format:

<?xml version="1.0" encoding="UTF-8"?>
<packages>
    <package name="firstPackage" version="LATEST" params="xxx" force="false"/>
    <package name="secondPackage" version="LATEST" params="xxx" force="false"/>
</packages>

My goal is to split the string and add the pieces to a list, so that I end up with the following:

packages = ['<package name="firstPackage" version="LATEST" params="xxx" force="false"/>', '<package name="secondPackage" version="LATEST" params="xxx" force="false"/>']

I am fairly new to Python but I believe the order of operations is:

  1. Split string by new line ("\n")
  2. For each line that is split this way, add to a list

To do this, I think I need a nested loop: an outer loop to go through the initial multi-line strings, and an inner loop to break each one into lines and store every line as a separate list item. Below is some semi-pseudocode I have written:

variableList = []  # list containing multi-line strings
packageList = []  # list to contain the separated lines
for item in variableList:
    splitPackage = item.split("\n")
    for package in splitPackage:
        packageList.append(package)  # append each line, not the whole split list

Any help or advice would be appreciated.

  • Use the splitlines() function. Commented Aug 24, 2021 at 1:25
  • That's not pseudocode. It looks like working Python code that should do what you want. What's the problem? Commented Aug 24, 2021 at 1:26
  • Okay, so when you ran the code, what happened? How is that different from what you wanted to happen? Commented Aug 24, 2021 at 1:56

2 Answers

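package_list = []
# multiline_string is assumed to hold one of the XML strings from the question
for line in multiline_string.splitlines():
    # keep only lines that carry a name attribute, i.e. the <package .../> elements
    if 'name' in line:
        # trim leading/trailing whitespace
        package_list.append(line.strip())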
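
The question starts from a list of multi-line strings rather than a single one, so the loop above needs an outer loop around it. Here is a minimal, self-contained sketch of that. The names `extract_package_lines`, `documents` and `sample` are made up for illustration, and it assumes each `<package .../>` element sits on its own line, as in the question's example.

```python
def extract_package_lines(documents):
    """Collect the stripped <package .../> lines from each multi-line string."""
    package_lines = []
    for document in documents:
        for line in document.splitlines():
            stripped = line.strip()
            # The trailing space keeps <packages> and </packages> out of the result.
            if stripped.startswith("<package "):
                package_lines.append(stripped)
    return package_lines


# Sample input copied from the question.
sample = """<?xml version="1.0" encoding="UTF-8"?>
<packages>
    <package name="firstPackage" version="LATEST" params="xxx" force="false"/>
    <package name="secondPackage" version="LATEST" params="xxx" force="false"/>
</packages>"""

packages = extract_package_lines([sample])
print(packages)
```

Matching on the `<package ` prefix instead of the substring `name` avoids picking up unrelated lines that happen to contain that word. It is still plain text matching, so it stops working if an element spans several lines.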

A suggestion: don't use splitlines. Use a library that parses XML.

Python ships with an XML parser in its standard library. To achieve your goal, try this:

import xml.etree.ElementTree as ET

# your xml wasn't valid
source = """<?xml version="1.0" encoding="utf-8"?>
<packages>
    <package name="firstPackage" version="LATEST" params="xxx" force="false"/>
    <package name="secondPackage" version="LATEST" params="xxx" force="false"/>
</packages>"""

root = ET.fromstring(source)
for child in root:
    print(child.attrib)

This will output:

{'name': 'firstPackage', 'version': 'LATEST', 'params': 'xxx', 'force': 'false'}
{'name': 'secondPackage', 'version': 'LATEST', 'params': 'xxx', 'force': 'false'}

An advantage of this approach is that you can convert the structured data into any format you'd like, for example into the list you want:

packages = []
for child in root:
    package_attributes = child.attrib.values()
    desired_format = '<package name="{}" version="{}" params="{}" force="{}"/>'
    packages.append(desired_format.format(*package_attributes))

This will give you your desired list.
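
The loop above relies on the attribute values arriving in the same order as the placeholders. A variant that fills the template by attribute name does not depend on that order. The following is only a sketch: the name `template` is an illustrative choice, and it assumes every `<package>` element carries all four attributes (a missing one raises `KeyError`).

```python
import xml.etree.ElementTree as ET

source = """<?xml version="1.0" encoding="utf-8"?>
<packages>
    <package name="firstPackage" version="LATEST" params="xxx" force="false"/>
    <package name="secondPackage" version="LATEST" params="xxx" force="false"/>
</packages>"""

root = ET.fromstring(source)

# Named placeholders are matched against attribute names, not positions.
template = '<package name="{name}" version="{version}" params="{params}" force="{force}"/>'

# child.attrib is a plain dict, so it can be unpacked into format().
packages = [template.format(**child.attrib) for child in root.iter("package")]
print(packages)
```

Attribute values are inserted as-is, so a value containing characters such as `"` or `&` would need escaping before the result is treated as XML again.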

By the way, if the XML comes from an untrusted source, you may want to use a different library to parse it, since the documentation for this one says:

The xml.etree.ElementTree module is not secure against maliciously constructed data

2 Comments

The XML is being provided to me; its format is not my choice, unfortunately. I have to work with what I am given.
If you are certain the first line will be the same across all inputs, just do this: source_code = source_code.replace("<xml", "<?xml").replace('utf-8">', 'utf-8"?>')
