Let's say I have this in a pandas DataFrame:

+-------------------+-----------------+----------+---------+----------+--------+--------------+
| Family            | Genus           | Species  | hasHair | laysEggs | canFly | hasLongHorns |
+-------------------+-----------------+----------+---------+----------+--------+--------------+
| Bovidae           | Ovis            | Sheep    |    1    |     0    |    0   |       0      |
+-------------------+-----------------+----------+---------+----------+--------+--------------+
| Passeroidea       | Passeridae      | Sparrow  |    0    |     1    |    1   |       0      |
+-------------------+-----------------+----------+---------+----------+--------+--------------+
| Ornithorhynchidae | Ornithorhynchus | Platypus |    1    |     1    |    0   |       0      |
+-------------------+-----------------+----------+---------+----------+--------+--------------+
| Bovidae           | Ovis            | Mouflon  |    1    |     0    |    0   |       1      |
+-------------------+-----------------+----------+---------+----------+--------+--------------+
| Passeroidea       | Passeridae      | Passer   |    0    |     1    |    1   |       0      |
+-------------------+-----------------+----------+---------+----------+--------+--------------+

I would like to "summarize" the data to obtain the following:

+-------------------+-----------------+----------+---------+----------+--------+--------------+
| Family            | Genus           | Species  | hasHair | laysEggs | canFly | hasLongHorns |
+-------------------+-----------------+----------+---------+----------+--------+--------------+
| Bovidae           | Ovis            | Sheep    |    1    |     0    |    0   |       0      |
+-------------------+-----------------+----------+---------+----------+--------+--------------+
|                   |                 | Mouflon  |    1    |     0    |    0   |       1      |
+-------------------+-----------------+----------+---------+----------+--------+--------------+
| Ornithorhynchidae | Ornithorhynchus | Platypus |    1    |     1    |    0   |       0      |
+-------------------+-----------------+----------+---------+----------+--------+--------------+
| Passeroidea       | Passeridae      | Sparrow  |    0    |     1    |    1   |       0      |
+-------------------+-----------------+----------+---------+----------+--------+--------------+
|                   |                 | Passer   |    0    |     1    |    1   |       0      |
+-------------------+-----------------+----------+---------+----------+--------+--------------+

As you can see, this is more about layout than actual data processing: the values of the properties are unchanged. I just want to produce a report that is easier to read.

Now, I'm not sure how to tackle this. Can anyone offer some pointers?

Thanks!

R.

1 Answer

To make the data easier to read, you can move the grouping columns into a MultiIndex and sort it:

df = df.set_index(['Family','Genus', 'Species']).sort_index()
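
For reference, here is a minimal, self-contained sketch of the same approach using the sample data from the question. The variable name `report` is only illustrative, and the column values are copied from the table above.

```python
import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "Family": ["Bovidae", "Passeroidea", "Ornithorhynchidae", "Bovidae", "Passeroidea"],
    "Genus": ["Ovis", "Passeridae", "Ornithorhynchus", "Ovis", "Passeridae"],
    "Species": ["Sheep", "Sparrow", "Platypus", "Mouflon", "Passer"],
    "hasHair": [1, 0, 1, 1, 0],
    "laysEggs": [0, 1, 1, 0, 1],
    "canFly": [0, 1, 0, 0, 1],
    "hasLongHorns": [0, 0, 0, 1, 0],
})

# Move the three taxonomy columns into a MultiIndex, then sort by it so that
# rows sharing the same Family and Genus end up next to each other.
report = df.set_index(["Family", "Genus", "Species"]).sort_index()

print(report)
```

With pandas' default display options, repeated outer index labels are printed only once, which gives the blank Family and Genus cells from the desired layout. Note that `sort_index()` also sorts the Species level alphabetically, so Mouflon is listed before Sheep and Passer before Sparrow, unlike the hand-written table in the question.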

1 Comment

Ah yes, even simpler than I thought! :-) Thank you!