1

I'm learning PostgreSQL and I have a database log that has these columns: path text, ip inet, method text, status text, time timestamptz, id int

Using this database my goal is to find any days that have a error percent (status != '200 OK' / status) of over 2.5 percent. I have the query, seen below, that gets to the error percent as a decimal but I can't figure out how to multiply them by 10 to make them a 'percent' nor how to check if they are over 2.5.

I have tried both HAVING and WHERE in multiple spots but I either get a Programming error at group by or a syntax error wherever I enter the HAVING or WHERE. What's the correct way to change it to a percent and check if its over 2.5? Thanks in advance

Mock data:

CREATE TABLE log (
    path text,
    ip inet,
    method text,
    status text,
    "time" timestamp with time zone DEFAULT now(),
    id integer NOT NULL
);

INSERT INTO log VALUES ('/article/bears-love-berries', '198.51.100.76' , 'GET', '200 OK', '2016-07-1 12:54:22+00', 3355597);
INSERT INTO log VALUES ('/article/candidate-is-jerk', '198.51.100.76' , 'GET', '200 OK', '2016-07-1 19:54:22+00', 3355598);
INSERT INTO log VALUES ('/article/bears-love-berries', '198.51.100.76' , 'GET', '404 NOT FOUND', '2016-07-1 22:54:22+00', 3355599);
INSERT INTO log VALUES ('/article/goats-eat-googles', '203.0.113.42' , 'GET', '200 OK', '2016-07-11 12:54:22+00', 3355600);
INSERT INTO log VALUES ('/article/candidate-is-jerk', '203.0.113.42' , 'GET', '200 OK', '2016-07-11 17:54:22+00', 3355601);
INSERT INTO log VALUES ('/article/bears-love-berries', '203.0.113.42' , 'GET', '200 OK', '2016-07-11 19:54:22+00', 3355602);
INSERT INTO log VALUES ('/article/goats-eat-googles', '203.0.113.42' , 'GET', '404 NOT FOUND', '2016-07-15 13:54:22+00', 3355603);
INSERT INTO log VALUES ('/article/bears-love-berries', '198.51.100.34' , 'GET', '200 OK', '2016-07-15 20:54:22+00', 3355604);
INSERT INTO log VALUES ('/article/candidate-is-jerk', '198.51.100.34' , 'GET', '200 OK', '2016-07-15 23:54:22+00', 3355605);
INSERT INTO log VALUES ('/article/candidate-is-jerk', '203.0.113.12' , 'GET', '200 OK', '2016-07-21 19:54:22+00', 3355606);
INSERT INTO log VALUES ('/article/bears-love-berries', '203.0.113.12' , 'GET', '200 OK', '2016-07-21 12:54:22+00', 3355607);
INSERT INTO log VALUES ('/article/goats-eat-googles', '198.51.100.76' , 'GET', '200 OK', '2016-07-21 01:54:22+00', 3355608);
INSERT INTO log VALUES ('/article/goats-eat-googles', '198.51.100.76' , 'GET', '404 NOT FOUND', '2016-07-29 07:54:22+00', 3355609);
INSERT INTO log VALUES ('/article/candidate-is-jerk', '198.51.100.76' , 'GET', '404 NOT FOUND', '2016-07-29 20:54:22+00', 3355610);
INSERT INTO log VALUES ('/article/goats-eat-googles', '198.51.100.34' , 'GET', '200 OK', '2016-07-29 19:54:22+00', 3355611);

Query:

SELECT DATE(time) AS day,
(
  (SELECT CAST(COUNT(status) AS FLOAT)
  FROM log WHERE status != '200 OK') /
  (SELECT CAST(COUNT(status) AS FLOAT))
) AS views
FROM log GROUP BY day
ORDER BY views
2
  • Decompose the problem into smaller problems. What is a percent? It's a ratio of two numbers. What are the two numbers? You need a count status = 200 and a count of status != 200. I think if you can count the errors per day, you'll find the rest to be easy. Commented May 25, 2017 at 22:52
  • @Brandon I can find the count of the status != 200 and the count of the overall requests each day. The question is how do I test if the ratio of the two is greater than 2.5 percent, to put a WHERE or HAVING on the outer SELECT it has a syntax error or a column views not found error Commented May 25, 2017 at 23:26

1 Answer 1

1

After sharing a test case in sqlfiddle, we could get down to a running code that would do it.

The fiddle is here: http://sqlfiddle.com/#!15/b3ed9/39

Instead of multiplicating the result by 100 you can just check against the already-multiplied 0.025, no? I mean, if you have values between 0-1, the equivalent to 2.5% would be just 0.025.

Thus that command should do:

select views.day, views.cnt from
 (select datelog.day,
  (
   cast(count(case when datelog.status != '200 OK' then 1 else null end) as float)
   /
   cast(count(*) as float)
  ) as cnt
  from (select DATE(time) as day, status from log) as datelog
  group by datelog.day
 ) as views
 where views.cnt > 0.025

The last revision was deeply founded in the reply here: SQL use column from subselect in where clause

Sign up to request clarification or add additional context in comments.

8 Comments

I tried both this and HAVING but I get ProgrammingError: Column 'views' does not exist? Also the reason I'm trying to get it to be 2.5 and not 0.025 is I need to present it to python to display it without having to update ie times by 100 in python.
And when I do HAVING instead of where I get a syntax error on GROUP BY not the column error
Don't you think you missed a parens on your two subselect? I mean, I read <outer_select> <select_a> / (<select_b>) AS views -- so instead <outer_select> (<select_a> / <select_b>) AS views shouldn't do it?
Woops I accidently mistyped the post, but no it is still giving a views does not exist error
You are also missing 'FROM log' on your second sub-query. :)
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.