
I have a collection called project. This collection contains different documents, and every document contains an array of objects called data.


I want to be able to filter the data (Excel files) by projectAlias and use pymongo and pandas to structure the result as a SQL-style table (columns and rows).


  • I will upload the code I'm currently using Commented Oct 31, 2019 at 21:31
  • So you want to filter it in mongo? $match? Commented May 3, 2020 at 17:41
  • @Adrian do you want to explain what kind of answer you are looking for here? Commented May 3, 2020 at 22:27
  • @Dekel I ended up throwing a new question up, as I was close to finish line on something I was trying to get something out the door :) I got the answer here: stackoverflow.com/a/61577290/4475605 Commented May 4, 2020 at 0:12

2 Answers


There's no code here, so I have to make some guesses:

  • I'm going to assume all your data is already in its own array, extracted from whatever form it originally came in. If it needed to be collected from multiple documents, I assume that's already done
  • I assume each object has a key "projectAlias" with a string value
  • I assume any objects without a "projectAlias" key have been dealt with
  • For laziness, I assume you want to order the data lexicographically (e.g. "a" < "b", "A" < "a")

Something like this might be useful:

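# collect_data is a made-up function that covers the first and third assumptions
data_array = collect_data(documents)
# Sort in place by the "projectAlias" string
data_array.sort(key=lambda obj: obj["projectAlias"])

# Or, to create a new list with the sorted data and leave data_array untouched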
sorted_data = sorted(data_array, key=lambda obj: obj["projectAlias"])
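Since collect_data is made up, here is one way it could look. This is only a sketch under the assumptions listed above: the document shape, the sample values, and the optional alias parameter are all hypothetical and not taken from the question's real data.

```python
def collect_data(documents, alias=None):
    """Hypothetical helper: flatten every document's "data" array into one list.

    Objects without a "projectAlias" key are skipped (third assumption).
    If alias is given, only objects with that exact alias are kept.
    """
    collected = []
    for doc in documents:
        for obj in doc.get("data", []):
            if "projectAlias" not in obj:
                continue
            if alias is not None and obj["projectAlias"] != alias:
                continue
            collected.append(obj)
    return collected


# Made-up documents standing in for what a query on the collection might return
documents = [
    {"name": "doc1", "data": [{"projectAlias": "beta", "rows": 3}, {"rows": 1}]},
    {"name": "doc2", "data": [{"projectAlias": "Alpha", "rows": 7}]},
]

data_array = collect_data(documents)
data_array.sort(key=lambda obj: obj["projectAlias"])

only_beta = collect_data(documents, alias="beta")
```

If pandas is available, passing a list of dicts like data_array to pandas.DataFrame should give one row per object and one column per key, which is probably close to the table the question asks for.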

The key argument of Python's built-in sort takes a function, calls it once on each element of the list, and orders the elements by comparing the values that function returns. Python already defines how strings compare: it goes character by character using Unicode code points, which for the English alphabet puts all capital letters before all lowercase ones. Accents, umlauts, and other variations are compared the same way, so they may not land where you expect. I have no insight there.
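To make the capital-letters-first behaviour concrete, here is a small sketch with made-up aliases. The second call uses str.casefold in the key function as one possible way to ignore case; that is an addition for illustration, not something the question asked for.

```python
# Made-up objects, each with a "projectAlias" string
data_array = [
    {"projectAlias": "beta"},
    {"projectAlias": "Alpha"},
    {"projectAlias": "alpha"},
    {"projectAlias": "Beta"},
]

# Default string comparison: uppercase letters sort before lowercase ones
by_alias = sorted(data_array, key=lambda obj: obj["projectAlias"])
aliases = [obj["projectAlias"] for obj in by_alias]

# Case-insensitive ordering: compare the casefolded alias instead.
# sorted() is stable, so ties keep their original relative order.
by_alias_ci = sorted(data_array, key=lambda obj: obj["projectAlias"].casefold())
aliases_ci = [obj["projectAlias"] for obj in by_alias_ci]

print(aliases)     # ['Alpha', 'Beta', 'alpha', 'beta']
print(aliases_ci)  # ['Alpha', 'alpha', 'beta', 'Beta']
```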

If your data needs some other ordering, define a different lambda function for key so that the values it returns compare the way you want. For example, you could sort by the length of the value:

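# collect_data is a made-up function that covers the first and third assumptions
data_array = collect_data(documents)
# Sort in place by the length of the "projectAlias" string
data_array.sort(key=lambda obj: len(obj["projectAlias"]))

# Or, to create a new list with the sorted data and leave data_array untouched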
sorted_data = sorted(data_array, key=lambda obj: len(obj["projectAlias"]))

If you want more info, the "Key Functions" section of the sorting how-to in python.org's wiki might be useful.


Since this question seems very generic, I'm going to suggest using the $filter operator in an aggregation pipeline:

{ $filter: { input: <array>, as: <string>, cond: <expression> } }

More details can be found at https://docs.mongodb.com/manual/reference/operator/aggregation/filter/
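As a rough sketch of how that could look from Python, the function below builds a pipeline that keeps only the elements of each document's data array whose projectAlias equals a given value. It assumes projectAlias lives on the array elements, which the question does not confirm. The function name, the variable name item, the alias "my-alias", and the database name "mydb" are all made up for illustration.

```python
def build_filter_pipeline(alias):
    """Hypothetical helper: build an aggregation pipeline that filters "data" by alias."""
    return [
        {
            "$project": {
                "data": {
                    "$filter": {
                        "input": "$data",  # the array field on each document
                        "as": "item",      # name for the current array element
                        "cond": {"$eq": ["$$item.projectAlias", alias]},
                    }
                }
            }
        }
    ]


pipeline = build_filter_pipeline("my-alias")

# With pymongo and a running server, this would be run roughly as:
#   from pymongo import MongoClient
#   results = list(MongoClient()["mydb"]["project"].aggregate(pipeline))
```

Each returned document would then carry a data array with only the matching objects, which could be flattened and handed to pandas.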
