I have a table in an SQLite database, and the same query returns different output when I run it in the sqlite3 shell from my terminal than when I run it from a Python script. The output from the Python script is odd: the row it returns is not an actual row in the table.

The table looks like:

    SELECT manager_name, points_for, wins, year FROM standings;

    manager_name    points_for  wins        year
    Eddy            1513.82     8           2014
    Drew            1351.8      10          2014
    Sammy           1497.72     10          2014
    Mike2           1474.22     10          2014
    Sam             1360.94     7           2014
    Rick            1379.1      7           2014
    Jeff            1381.1      6           2014
    Josh            1411.16     5           2014
    Bay             1237.2      5           2014
    Mike            1187.68     6           2014
    G               1208.66     5           2014
    Brett           1245.78     5           2014
    Luke            1395.76     4           2014
    A  Roberman     1331.1      3           2014

When I run the following query in sqlite3, I get the correct result (most wins, with points as the tiebreaker):

    SELECT max(wins), points_for, manager_name, year FROM (SELECT manager_name, wins, points_for, year FROM standings ORDER BY wins DESC, points_for DESC);

    OUTPUT:

    manager_name   points_for   max(wins)    year
    SamK           1497.72      10           2014

But when I run the query through a Python script, I get a very odd result:

    df = pd.read_sql(" SELECT max(wins), points_for, manager_name, year FROM (SELECT manager_name, wins, points_for, year FROM standings ORDER BY wins DESC, points_for DESC)", conn)

    print df
               max(wins)  points_for manager_name  year
        0         10      1497.72        Al R      2014

Bear in mind that I have run many similar Python scripts against this database and have never gotten output that made no sense like this.

EDIT: using pysqlite gives the same result.

    c.execute('SELECT max(wins), points_for, manager_name, year FROM (SELECT manager_name, wins, points_for, year FROM standings ORDER BY wins DESC, points_for DESC)')

    r = c.fetchone()
    print r
    (10, 1497.72, u'Al R', 2014)
  • Not my cup of tea, but is the missing of "year" before FROM in sqlite3 query intentional? These queries are not the same (string-wise; i'm not into sql-queries)! Commented Jul 28, 2015 at 0:39
  • It was not intentional. I added the edit to include the query for year as well. Thanks Commented Jul 28, 2015 at 1:04
  • Maybe try pysqlite or apsw as a direct interface to sqlite instead of what I suspect is pandas? Also, check if there is a significant difference in sqlite_version library used? Also, try inserting the data from within python (using inserts instead of via commandline import/csv/etc) to see if that has an effect. Commented Jul 28, 2015 at 2:32

1 Answer

Both outputs are correct.

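max(wins) returns the largest wins value found in the subquery.
points_for, manager_name, and year are bare columns (neither aggregated nor grouped), so in general SQLite may take their values from an arbitrary row.

(The ORDER BY in the subquery has no guaranteed effect; max() does its own ordering, if needed.)

To get the row values from one of the rows that matches the max(), you must use SQLite 3.7.11 or later (this is an SQLite extension, not part of standard SQL). If several rows tie for the maximum, as the three managers with 10 wins do here, which of them supplies the values is still unspecified.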
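Since the sqlite3 shell and Python can be linked against different SQLite libraries, it may help to check which version the script actually uses. A minimal sketch, assuming the standard library `sqlite3` module (the variable names are only illustrative):

```python
import sqlite3

# Version of the SQLite library that Python is linked against. This can
# differ from the version of the sqlite3 command-line shell.
lib_version = sqlite3.sqlite_version_info

# Bare columns following the max() row are only available from 3.7.11 on.
has_minmax_bare_columns = lib_version >= (3, 7, 11)

print(sqlite3.sqlite_version, has_minmax_bare_columns)
```

In the shell, `SELECT sqlite_version();` reports the corresponding value for comparison.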

To actually get the first row(s) according to an ORDER BY, you must use the LIMIT clause:

    SELECT wins, points_for, manager_name, year
    FROM standings
    ORDER BY wins DESC, points_for DESC
    LIMIT 1;
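As a rough illustration, the LIMIT query can be run from Python like this. The sketch assumes an in-memory database filled with a few rows copied from the question's table; the column types in the `CREATE TABLE` statement are a guess, since the original schema is not shown.

```python
import sqlite3

# Hypothetical in-memory copy of part of the standings table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE standings "
    "(manager_name TEXT, points_for REAL, wins INTEGER, year INTEGER)"
)
rows = [
    ("Eddy", 1513.82, 8, 2014),
    ("Drew", 1351.8, 10, 2014),
    ("Sammy", 1497.72, 10, 2014),
    ("Mike2", 1474.22, 10, 2014),
    ("Sam", 1360.94, 7, 2014),
]
conn.executemany("INSERT INTO standings VALUES (?, ?, ?, ?)", rows)

# ORDER BY decides which row comes first; LIMIT 1 keeps only that row,
# so every column is taken from the same row.
top = conn.execute(
    "SELECT wins, points_for, manager_name, year "
    "FROM standings "
    "ORDER BY wins DESC, points_for DESC "
    "LIMIT 1"
).fetchone()

print(top)  # (10, 1497.72, 'Sammy', 2014)
```

Because no aggregate is involved, the result does not depend on how a particular SQLite version treats bare columns.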