0

I have the below PostgreSQL table:

     cust     prod  day  month  year state  quant
0    Bloom    Pepsi    2     12  2011    NY   4232
1    Bloom    Bread   23      5  2015    PA   4167
2    Bloom    Pepsi   22      1  2016    CT   4404
3    Bloom   Fruits   11      1  2010    NJ   4369
4    Bloom     Milk    7     11  2016    CT    210

I have to find and display the average sale of Bloom per state and display like this:

CUST   AVG_NY  AVG_CT AVG_NJ
Bloom  28923   3241   1873

I converted the data to the below form:

[('Bloom', 'Pepsi', 2, 12, 2011, 'NY', 4232), ('Bloom', 'Eggs', 30, 11, 2010, 'NJ', 559), ('Bloom', 'Yogurt', 25, 7, 2014, 'PA', 17), ('Bloom', 'Yogurt', 3, 4, 2011, 'NJ', 1203), ('Bloom', 'Coke', 7, 2, 2010, 'NY', 1229), ('Bloom', 'Coke', 6, 10, 2018, 'PA', 2867), ('Bloom', 'Soap', 6, 1, 2015, 'CT', 4623), ('Bloom', 'Milk', 8, 9, 2010, 'NJ', 1106), ('Bloom', 'Milk', 19, 4, 2013, 'NY', 3516), ('Bloom', 'Soap', 7, 6, 
2015, 'PA', 3404)]

Below is my code and is probably the worsts way to do so:

connection = psycopg2.connect(user="postgres",
                                  password="ss",
                                  host="127.0.0.1",
                                  port="8800",
                                  database="postgres")
cursor = connection.cursor()
postgreSQL_select_Query = "select * from sales"
cursor.execute(postgreSQL_select_Query)

mobile_records = cursor.fetchall()
def takeSecond(elem):
    return elem[0][0]


mobile_records.sort(key=takeSecond)
Bloom1 = []
for i in mobile_records:
    if i[5] == 'NY' and i[0] == 'Bloom':
        Bloom1.append(i)
s1 = 0
for j in Bloom1:
    s1 += j[6]
avg1 = s1/len(Bloom1)


Bloom2 = []
for i in mobile_records:
    if i[5] == 'CT' and i[0] == 'Bloom':
        Bloom2.append(i)
s2 = 0
for j in Bloom2:
    s2 += j[6]
avg2 = s2/len(Bloom2)


Bloom3 = []
for i in mobile_records:
    if i[5] == 'NJ' and i[0] == 'Bloom':
        Bloom3.append(i)
s3 = 0
for j in Bloom3:
    s3 += j[6]
avg3 = s3/len(Bloom3) 

How do I even start to achieve this?

7
  • 1
    Take a look to Pandas Commented Oct 15, 2021 at 6:32
  • 1
    Do you need the average of quant for each state? Commented Oct 15, 2021 at 6:32
  • 1
    Why dont you query the database with a group by statement? Commented Oct 15, 2021 at 6:33
  • @MΛIK yes, quant for each state Commented Oct 15, 2021 at 6:37
  • 2
    I would say check the @MΛIK's answer. I would assume you mean the states by precondition. So, it is just a simple WHERE clause used along with GROUP BY. Commented Oct 15, 2021 at 6:42

1 Answer 1

2

You should take a deeper look into SQL. It's totally not necessary to do it like that. Just use a group by statement.

statement = "SELECT state, AVG(quant) FROM sales WHERE cust = Bloom GROUP BY state"

After executing it you can simply loop through the returned list and check each state.

data = cursor.fetchall()
for dataset in data:
    if dataset[0] == 'NJ':
        # do something with dataset[1]

Note: dataset[0] stores the state and dataset[1] the average.

Sign up to request clarification or add additional context in comments.

4 Comments

So, I have to get this data only using Python and select * from sales is the only query I can use as part of the assignment. That's why I can't use group by or any other SQL statement
@swombhai Ahh okay, now I understand your problem! You should have said that before! Is it allowed to select specific columns or only everything?
Everything, also I think I should use Pandas for this.
@swombhai Pandas would make it easier but it's also approachable with plain python.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.