0

I have the following code that is aimed to return, from the local system, the databases,the version of PostgreSQL, the size of the databases, the tables in each database (not beginning with pg_) and write all this data to a .csv file. Environment: PostgreSQl: v14.6, Python: v3.11, O/S: Win10

#------------------------------------------------------------------------------
import psycopg2
import sys
import csv

##  global variables
database = 'postgres'
db_table = 'results'
db_host = 'localhost'
db_user = 'postgres'
db_password = 'postgres'
csv_file = "postgresData.csv"

# Connection parameters
param_dic = {
    "host"      : db_host,
    "database"  : database,
    "user"      : db_user,
    "password"  : db_password
}

def connect(params_dic):
    conn = None
    try:
        # connect to the PostgreSQL server
        print('Connecting to the PostgreSQL database...')
        conn = psycopg2.connect(**params_dic)
    except (Exception, psycopg2.DatabaseError) as error:
        print(error)
        sys.exit(1)
    ##  print the name of the connected database,  using dict key (database).    
    print("\u001b[32m\tConnecting to the database: \u001b[0m\t{}".format(params_dic["database"]))
    

    cursor = conn.cursor()

    # Get the PostgreSQL version
    cursor.execute("SELECT version();")
    postgres_version = cursor.fetchone()[0]
##    print(postgres_version)

    # Get a list of databases
    cursor.execute("SELECT datname FROM pg_database;")
    databases = [row[0] for row in cursor.fetchall()]
    # for da in databases:
    #     print(da)
       
    
    
    try:
        with open("database_info.csv", "w", newline="") as csv_file:
            csv_writer = csv.writer(csv_file)
            csv_writer.writerow(["Database Name", "Table Name", "Database Size", "PostgreSQL Version"])

            # Fetch database names
            cursor.execute("SELECT datname FROM pg_database;")
            databases = [row[0] for row in cursor.fetchall()]

            for database in databases:
                try:
                    cursor.execute(f"SELECT pg_size_pretty(pg_database_size('{database}'));")
                    database_size = cursor.fetchone()[0]

                # Fetch user-created base table names in the database
                    cursor.execute(f"""
                                   SELECT relname
                                   FROM pg_class
                                   WHERE relkind = 'r'
                                   AND relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = 'public')
                                   AND relname NOT LIKE 'pg_%';
                                   """)
                    tables = [row[0] for row in cursor.fetchall()]

                    for table in tables:
                        csv_writer.writerow([database, table, database_size, postgres_version])
                except Exception as inner_error:
                    print(f"Error fetching data for database '{database}': {inner_error}")

        print("Data has been written to 'database_info.csv'.")
    except (Exception, psycopg2.DatabaseError) as error:
        print("Error: ", error)

    
    
    cursor.close()
    conn.close()

    print("Database information has been written to 'database_info.csv'.")

    ## I do not know why the following line is changing colour in Spyder IDE
    return conn


def main():
    conn = connect(param_dic)
    conn.close()


if __name__ == "__main__":
    main()

Two observations: Observation 1:

cursor.execute(f"""
                                   SELECT relname
                                   FROM pg_class
                                   WHERE relkind = 'r'
                                   AND relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = 'public')
                                   AND relname NOT LIKE 'pg_%';
                                   """)

This is producing an warning which I cannot correct:

F-string is missing placeholders.

I would like to know why this is happening and how to correct the warning, please.

Observation 2:

SELECT relname
                                   FROM pg_class
                                   WHERE relkind = 'r'
                                   AND relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = 'public')
                                   AND relname NOT LIKE 'pg_%';

This SQL script is directly from within the code segment above and from psql prompt is successfully returning the list of expected tables. Why is this SQL not working in the python script?

2
  • i tried your code and for me its returning some data into "database_info.csv" file: postgres,results,7892 kB,"PostgreSQL 16.0, compiled by Visual C++ build 1935, 64-bit" postgres,newtable,7892 kB,"PostgreSQL 16.0, compiled by Visual C++ build 1935, 64-bit" template1,results,7908 kB,"PostgreSQL 16.0, compiled by Visual C++ build 1935, 64-bit".... Commented Nov 4, 2023 at 17:00
  • I do not think that the code is working correctly, I have added the db DVD Rentals on postgresqltutorial.com/postgresql-getting-started/…. This has 15 tables. My current code is pulling each dbname on the system as expected. The code is pulling and incorrectly assigning a single table name to each dbname. Commented Nov 9, 2023 at 9:00

1 Answer 1

0

The program is connecting to a single database and then reading system catalog tables, but as documented

Most system catalogs are copied from the template database during database creation and are thereafter database-specific

The program must get a connection to each database in turn before reading the catalog tables, something like this:

from contextlib import closing
...

    for database in databases:
        if database = 'template0':
            # Can't conenct to template0
            continue
        with closing(psycopg2.connect(dbname=database)) as dconn:
            with dconn.cursor() as dcur:

                # Fetch user-created base table names in the database
                dcur.execute(
                     """
                     SELECT relname
                       FROM pg_class
                       WHERE relkind = 'r'
                         AND relnamespace = (
                           SELECT oid 
                             FROM pg_namespace 
                             WHERE nspname = 'public'
                         )
                       AND relname NOT LIKE 'pg_%';
                       """
                )
                for row in dcur:
                    print(row)
                print()

The warning about f-strings refers to f-strings with no formatting placeholder:

name = 'Alice'
# Placeholder, ok:
s = f'{name}'
# No placeholder, warning
s = f'Alice'
Sign up to request clarification or add additional context in comments.

4 Comments

if database = 'template0': # Can't conenct to template0 continue I do not understand the logic of this statement. The current code checks the databases, why do you suggest your alternative, please? f-string: understood, thank you
The template0 database does not accept connections, so we skip it.
To query an individual database's catalog, you must be connected to that database. In your code you connect to the database defined in param_dic so your query only works for that database.
I have figured the code to return the dbname and associated tables. When I have a working script I will submit it here. What I had not considered is that a dbname that does no have a table is not listed in the result. It is necessary to list all dbnames on the system. Any ideas? I thought that for row in dcur: if row = None: ... might work though this dd not. Any pointers, I would be grateful

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.