0

So given input as a txt file, we have to create an XML Tree or any hierarchical structure for easier parsing. And then find the last employee of the last CEO.

The txt gives the structure of a company where the columns are in the following order: name, salary, employer. Those with 'NOBODY' as employer are the CEOS in the company. The ones with the employer name work under the mentioned employer. The txt looks something like this:

Vineel Phatak, 520, NOBODY
Ajay Joshi, 250, Vineel Phatak
Abhishek Chauhan, 120, Ajay Joshi
Jayesh Godse, 500, NOBODY
Vijaya Mundada, 60, Abhishek Chauhan
Shital Tuteja, 45, Jayesh Godse
Rajan Gawli, 700, Vineel Phatak
Zeba Khan, 300, Jayesh Godse
Chaitali Sood, 100, Zeba Khan
Sheila Rodrigues, 35, Vineel Phatak

Given this, we have to something like this is to be accomplished:

Company
->Vineel Phatak
-->Ajay Joshi
--->Abhishek Chauhan
---->Vijaya Mundada
-->Rajan Gawli
-->Sheila Rodrigues

->Jayesh Godse
-->Shital Tuteja
-->Zeba Khan
--->Chaitali Sood

In XML Format:

<company>
    <Vineel Phatak>
        <Ajay Joshi>
            <Abhishek Chauhan>
                <Vijaya Mundada />
            </Abhishek Chauhan>
        </Ajay Joshi>
        <Rajan Gawli />
        <Sheila Rodrigues />
    </Vineel Phatak>

    <Jayesh Godse>
        <Shital Tuteja />
        <Zeba Khan>
            <Chaitali Sood />
        </Zeba Khan>
    </Jayesh Godse>
</company>

What I tried doing is that after creating an element named company, since we need to add subelements to the root(company), I tried generating those and appending into a list. And then parsing through the list and comparing to get the values.

# Find last employee of the last introduced CEO
import xml.etree.ElementTree as ET

# Reading Input
inD = open('input.txt', 'r')
data = inD.readlines()
inD.close()

# Creating an element and saving all subelement to list
all_element = []
company = ET.Element('Company')
ceos = []
for i in data:
    t = i.strip().split(',')
    if(t[2].strip() == 'NOBODY'):
        ceos.append(t[0])
    all_element.append(ET.SubElement(company, t[0]))
# company.clear()
# Creating a function to add subelements
def findChilds(name, emp):
    global all_element
    for i in all_element:
        if emp == i.tag:
            name = ET.SubElement(i, name)

# If it is CEO hence no emplyer then directly add subelement to company or else add to the previous subelement
for j in data:
    t = j.strip().split(',')
    if t[2].strip() == 'NOBODY':
        e = ET.SubElement(company, t[0])
    elif t[2].strip() != 'NOBODY':
        findChilds(t[0].strip(), t[2].strip())
        
ET.dump(company)

And the result is below:

<Company><Vineel Phatak><Ajay Joshi /><Rajan Gawli /><Sheila Rodrigues /></Vineel Phatak><Ajay Joshi><Abhishek Chauhan /></Ajay Joshi><Abhishek Chauhan><Vijaya Mundada /></Abhishek Chauhan><Jayesh Godse><Shital Tuteja /><Zeba Khan /></Jayesh Godse><Vijaya Mundada /><Shital Tuteja /><Rajan Gawli /><Zeba Khan><Chaitali Sood /></Zeba Khan><Chaitali Sood /><Sheila Rodrigues /><Vineel Phatak /><Jayesh Godse /></Company>

Which you can see isn't entirely correct. Also deleting the elements (line 18) isn't working as it refuses to add subelements other than the ceos

So at the end, we need to create this hierarchy and then print out the name of the last employee of the last CEO, which in this case is:
Last CEO: Jayesh Godse
Last Employee of CEO(direct or indirect, last to be introduced from the input): Chaitali Sood

Output:
Chaitali Sood

Also the number of CEO and number of Children and GrandChildren under them are not definite, neither are the names.

Im new to the ElementTree, so there might be some predefined functions that I may not have knowledge of, so please pardon my ignorance. Insights and suggestions are much appreciated. Thanks in advance!

1 Answer 1

1

Before listing my example, one remark regarding xml structures: when creating an xml structure, it is a better practice to use the 'class of the object' for the element's tag and store its 'attributes' like name and salary as xml attributes:
<employee name="Vineel Phatak" salary="520"/>
instead of:
<Vineel Phatak/>
This will make parsing a lot easier and gives more flexibility to extend the format.

My example

An example implementation for your questions:

import csv
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Employee:
    linenumber: int
    name: str
    salary: str
    manager_name: str
    subordinates: list


employees = {}  # a dictionary to map names to employees

# load employees
with open('company.csv') as csvfile:
    reader = csv.reader(csvfile)
    for linenumber, row in enumerate(reader):
        (name, salary, manager_name) = [value.strip() for value in row]
        employees[name] = Employee(linenumber, name, salary, manager_name, [])


# link employees to their subordinates
ceos = []
for employee in employees.values():
    if employee.manager_name == 'NOBODY':
        # store the ceos in a list to start building the xml from later
        ceos.append(employee)
    else:
        # look up the manager by it name
        manager = employees[employee.manager_name]
        manager.subordinates.append(employee)

# create xml
companyelement = ET.Element('company')

def add_employees_to_xml_element(xmlelement, employees):
    for employee in employees:
        employee_element = ET.Element("employee", {
            "name": employee.name,
            "salary": employee.salary
        })
        xmlelement.append(employee_element)
        add_employees_to_xml_element(employee_element, employee.subordinates)


add_employees_to_xml_element(companyelement, ceos)
ET.dump(companyelement)

# find the last entered ceo
def linenumber_key(ceo): return ceo.linenumber


last_entered_ceo = max(ceos, key=linenumber_key)
print(f"Last entered CEO: {last_entered_ceo.name}")

# find the last entered (in)direct subordinate of the last entered ceo
def find_last_entered_subordinate(employee, current_last=None):
    for subordinate in employee.subordinates:
        if not current_last:
            current_last = subordinate  # ensuring an initial value
        else:
            current_last = max([current_last, subordinate], key=linenumber_key)
        # recursive: travers the subordinate's subordinates
        current_last = find_last_entered_subordinate(subordinate, current_last)
    return current_last


last_employee = find_last_entered_subordinate(last_entered_ceo)
print(f"Last added subordinate of last CEO: {last_employee.name}")

I broke the excercise down in following parts:

  1. Load Employees from a CSV file into a dictionary, to facilitate (and speed up) lookup of employees by name later on. I also store the line number of each employee to use for your later questions.
  2. Link employees to their subordinates. Assuming that managers might be listed after their subordinates, this could not be combined in the first step. Every employee will have a list of its subordinates, CEOs are stored in a seperate 'root' list.
  3. Create xml using element tree and a recursive function traversing the above created CEOs list.
  4. Find the last entered CEO. We already have a list of CEOs, but because it is created from a dictionary (which does not ensure that elements are retrieved in the same order as they were added), I cannot just take the last element, but should find the CEO with the heighest linenumber.
  5. Find the last entered (in)direct subordinate of the last entered ceo. Similar to above, this time I used a recursive function to retrieve this employee based on the linenumber.

resulting xml:

<company>
    <employee name="Vineel Phatak" salary="520">
        <employee name="Ajay Joshi" salary="250">
            <employee name="Abhishek Chauhan" salary="120">
                <employee name="Vijaya Mundada" salary="60"/>
            </employee>
        </employee>
        <employee name="Rajan Gawli" salary="700"/>
        <employee name="Sheila Rodrigues" salary="35"/>
    </employee>
    <employee name="Jayesh Godse" salary="500">
        <employee name="Shital Tuteja" salary="45"/>
        <employee name="Zeba Khan" salary="300">
            <employee name="Chaitali Sood" salary="100"/>
        </employee>
    </employee>
</company>
Sign up to request clarification or add additional context in comments.

1 Comment

Note that I did not use elementtree for points 4. and 5. as we already have all data in memory.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.