0

I need to return multiple value in a single SQL query on my postgresql database. Here is my query so far:

SELECT AVG("percent"), MIN("percent"), MAX("percent"), AVG("profit"), MIN("profit"), MAX("profit")
FROM public.log_analyticss 
WHERE "buyPlatform" = 'platA' AND
"date" >= '1526356073.6126819'

Data

date             sellPlatform     profit      percent    
----------       ------------     ----------  ----------
1526356073.61    platA            0           10.1
1526356073.62    platA            22          11
1526356073.63    platA            3           7
1526356073.64    platA            1           8
1526356073.65    platA            11          9
1526356073.66    platA            12          10
1526356073.67    platA            13          15

Desired Result

date             sellPlatform     profit      percent    
----------       ------------     ----------  ----------
1526356073.61    platA            0           10.1         //MIN Profit
1526356073.62    platA            22          11           //MAX Profit
1526356073.63    platA            3           7            //MIN Perc
1526356073.67    platA            13          15           //MAX Perc

//Then somehow I want it return AVG as well if that is every possible. Otherwise, I don't mind running another query to do that.

The issue is that I don't just want the MIN and MAX values. I want the entire row of data from which the MIN and MAX value is taken.

I understand that I am asking for an AVG and a MIN/MAX value which is going to return data in two different formats. I know this is might not be possible. But any help on how to most efficiently do this would be extremely helpful.

Right now I am just pulling the entire dataset into my code and calculating the mean, min and max within my code which I know if very bad and also very very slow. The table has about 8 million rows and the dataset that I am grabbing is about 9000 rows so it is very slow the way I am doing it now.

3
  • Post the expected result for sample data Commented May 15, 2018 at 19:07
  • the resulting rows for mix should be different from max .. how you want these result ? .. show a proper data sample and the expected result Commented May 15, 2018 at 19:11
  • @RadimBača I just added a sample data set and result Commented May 15, 2018 at 19:18

3 Answers 3

1

The most efficient way to retrieve the row associated with the minimum / maximum generally doesn't involve the MIN()/MAX() aggregates at all; instead, you can just attach an ORDER BY to your query, and add a LIMIT 1 to fetch only the first record.

This means you need four SELECT statements with four different orderings, but you can factor out the expensive part (the fetch from log_analyticss) into a temp table or a CTE, e.g.:

WITH Data AS (
  SELECT *
  FROM public.log_analyticss 
  WHERE "buyPlatform" = 'platA' AND
    "date" >= '1526356073.6126819'
)
(SELECT 'Min percent', * FROM Data ORDER BY "percent" ASC LIMIT 1)
UNION ALL
(SELECT 'Max percent', * FROM Data ORDER BY "percent" DESC LIMIT 1)
UNION ALL
(SELECT 'Min profit', * FROM Data ORDER BY "profit" ASC LIMIT 1)
UNION ALL
(SELECT 'Max profit', * FROM Data ORDER BY "profit" DESC LIMIT 1)

In your case, a temp table might be better than a CTE, as you can re-use it to compute the averages.

Note that if one of these maximum/minimum values is shared by two different rows, this query would only return one of them. The chosen row is selected effectively at random, but you can attach more fields to the ORDER BY clause to serve as a tie-breaker.

If you actually want both records in this case, you'll need something more like Auston's or Radim's answer, i.e. compute the aggregates first, then join back to the data on the profit and percent columns. You can still make use of a temp table / CTE here to avoid hitting log_analyticss more than once.

Sign up to request clarification or add additional context in comments.

7 Comments

I am also not very concerned about duplicates. If there are duplicates then I don't care which one of the duplicates is returned
I tried to add this function to the end but it didn't work UNION ALL (SELECT 'Avg Profit', AVG("profit") FROM Data ) any thoughts on how I might do this for both profit and percent. It gives the error each UNION query must have the same number of columns
Definitely the best answer so far. I just want to add the AVG function as well if possible. Otherwise I will just run a second command for the AVG function. Do you know how I would use this same Data variable/temp table to run an AVG function?
@Nevin: Each branch of the UNION needs to have the same set of fields. The average doesn't really fit here, as it doesn't have an associated record. If you're determined to cram all of this into the same data structure regardless, you need to pad it out so that it looks like a log_analyticss row, e.g. SELECT 'Avg', NULL AS "date", NULL AS "sellPlatform", AVG("profit") AS "profit" AVG("percent") AS "percent". Note that the field order needs to exactly match the other subqueries.
That said, finding the max/min over 9000 records - even if you're doing it in client-side code - should take milliseconds at most. If this is taking 7 seconds to run, then it's probably spending 99% of that time in the initial fetch from log_analyticss (and if so, this is what you should be focused on optimising). Does the table have any indexes on buyPlatform or date?
|
0

You need something along these lines:

SELECT a.*
FROM public.log_analyticss  a
JOIN
(
    SELECT 
        MIN("percent") min_percent, 
        MAX("percent") max_percent, 
        MIN("profit") min_profit, 
        MAX("profit") max_profit
    FROM public.log_analyticss 
) t ON a.date = t.date AND
       a.sellPlatform = t.sellPlatform AND
       (a.profit = minprofit OR
        a.profit = maxprofit OR
        a.percent = minpercent OR
        a.percent = maxpercent)

Comments

0

I think the best way to do this is by two queries: The first retrieve the metrics, like you have done; The second query retrieve the sample registers.

Or you can try run over a temporary table (automatic dropped after end session):

CREATE TEMP TABLE statistics AS 
    SELECT AVG(percent) as perc_avg, MIN(percent) as perc_avg, MAX(percent) as perc_max, AVG(profit) as pro_avg, MIN(profit) as pro_min, MAX(profit) as pro_max
    FROM public.log_analyticss 
    WHERE buyPlatform = 'platA' AND
    sellPlatform = 'platB' AND 
    productId = '183948' AND
    date >= '1526356073.6126819'
;
    SELECT date, sellPlatform, profit, percent
    FROM public.log_analyticss a join statistics s
     on (a.profit = s.pro_max or a.profit = s.pro_min or
         a.percent = s.perc_max or a.percent = s.perc_min)
    WHERE buyPlatform = 'platA' AND
    sellPlatform = 'platB' AND 
    productId = '183948' AND
    date >= '1526356073.6126819';

Reference to temporary tables: http://www.postgresql.org/docs/9.2/static/sql-createtable.html

2 Comments

when I do this SELECT AVG("percent"), MIN("percent"), MAX("percent"), AVG("profit"), MIN("profit") it only give me the min, max and mean for either profit or percent not both
I think in this case u need two temporary one to percent and other to profit.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.