3

I'm using Postgresql on Windows in a C# application. The problem I'm having is really weird and can be described as follows:

  • I restart my Windows
  • I run the program
  • One specific query fails: SELECT COUNT(*) AS c FROM files WHERE total_bytes IS NOT NULL
  • I run the program again and everything works normally

Weird notes:

  • I tried making another query before that one (even using the same table) and it worked: SELECT COUNT(*) AS c FROM files
  • I wasn't able to reproduce the error restarting Postgresql. It only happens on a Windows reboot. And it happens only once.

The exception traceback:

Npgsql.NpgsqlException: Exception while reading from stream

   at Npgsql.ReadBuffer.Ensure(Int32 count, Boolean dontBreakOnTimeouts)
   at Npgsql.NpgsqlConnector.DoReadMessage(DataRowLoadingMode dataRowLoadingMode, Boolean isPrependedMessage)
   at Npgsql.NpgsqlConnector.ReadMessageWithPrepended(DataRowLoadingMode dataRowLoadingMode)
   at Npgsql.NpgsqlConnector.ReadMessage(DataRowLoadingMode dataRowLoadingMode)
   at Npgsql.NpgsqlConnector.ReadExpecting[T]()
   at Npgsql.NpgsqlDataReader.NextResultInternal()
   at Npgsql.NpgsqlDataReader.NextResult()
   at Npgsql.NpgsqlCommand.Execute(CommandBehavior behavior)
   at Npgsql.NpgsqlCommand.ExecuteDbDataReaderInternal(CommandBehavior behavior)
   at Npgsql.NpgsqlCommand.ExecuteDbDataReader(CommandBehavior behavior)
   at System.Data.Common.DbCommand.ExecuteReader()
   at Npgsql.NpgsqlCommand.ExecuteReader()
   at DriveShare.Database.Postgresql.ExecuteQuery(NpgsqlCommand command) in c:\projetos\driveshareclient\DriveShare\DriveShare\Database\Postgresql.cs:line 216
   at DriveShare.Database.Postgresql.Query(String sql, Object[] args) in c:\projetos\driveshareclient\DriveShare\DriveShare\Database\Postgresql.cs:line 72
   at DriveShare.Database.Postgresql.QueryOne(String sql, Object[] args) in c:\projetos\driveshareclient\DriveShare\DriveShare\Database\Postgresql.cs:line 83
   at DriveShare.Database.Postgresql.QueryValue(String key, String sql, Object[] args) in c:\projetos\driveshareclient\DriveShare\DriveShare\Database\Postgresql.cs:line 97
   at DriveShare.Database.Postgresql.QueryValue(String key, String sql) in c:\projetos\driveshareclient\DriveShare\DriveShare\Database\Postgresql.cs:line 92
   at DriveShare.Database.FileIndexDataSet.CountIndexedFiles() in c:\projetos\driveshareclient\DriveShare\DriveShare\Database\FileIndexDataSet.cs:line 89
   at DriveShare.Engine.DriveShareEngine.Start() in c:\projetos\driveshareclient\DriveShare\DriveShare\Engine\DriveShareEngine.cs:line 156
   at DriveShareWebService.Program.Main(String[] args) in c:\projetos\driveshareclient\DriveShare\DriveShareWebService\Program.cs:line 19

Since I have to keep the program working, I wrote a workaround to make the app retry that query before proceeding. I'm not proud of that:

public void WaitForConnection()
{
    int limitSeconds = 3 * 60;
    var start = DateTime.Now;
    while (true)
    {
        try
        {
            Log.WaitingForDatabaseConnection();
            Query("SELECT COUNT(*) AS c FROM files WHERE total_bytes IS NOT NULL");
            Log.DatabaseConnectionAquired();
            break;
        }
        catch (Exception e)
        {
            var wastedTime = DateTime.Now - start;
            if (wastedTime.TotalSeconds > limitSeconds)
                throw;
            else
                Log.Exception(e);
        }
        Thread.Sleep(1000);
    }
}

I'm using Npgsql (in a thin abstraction class) to connect to Postgresql. Postgresql logs show some entries about winsock errors that I do not yet understand:

2016-08-16 10:14:34 BRT LOG:  database system was shut down at 2016-08-16 10:12:07 BRT
2016-08-16 10:14:34 BRT FATAL:  the database system is starting up
2016-08-16 10:14:34 BRT LOG:  MultiXact member wraparound protections are now enabled
2016-08-16 10:14:34 BRT LOG:  sistema de banco de dados está pronto para aceitar conexões
2016-08-16 10:14:34 BRT LOG:  autovacuum launcher started
2016-08-16 10:17:16 BRT LOG:  could not receive data from client: unrecognized winsock error 10053
2016-08-16 10:17:27 BRT LOG:  could not send data to client: unrecognized winsock error 10054
2016-08-16 10:17:27 BRT STATEMENT:  SELECT path FROM files
2016-08-16 10:17:27 BRT FATAL:  connection to client lost
2016-08-16 10:17:27 BRT STATEMENT:  SELECT path FROM files
2016-08-16 10:17:27 BRT LOG:  could not receive data from client: unrecognized winsock error 10053
2016-08-16 10:17:27 BRT LOG:  unexpected EOF on client connection with an open transaction
2016-08-16 10:17:33 BRT LOG:  unexpected EOF on client connection with an open transaction
2016-08-16 10:25:14 BRT LOG:  could not receive data from client: unrecognized winsock error 10053
2016-08-16 10:25:15 BRT LOG:  could not receive data from client: unrecognized winsock error 10053
2016-08-16 10:25:15 BRT LOG:  unexpected EOF on client connection with an open transaction
2016-08-16 10:26:30 BRT LOG:  could not send data to client: unrecognized winsock error 10054
2016-08-16 10:26:30 BRT FATAL:  connection to client lost
2016-08-16 10:26:50 BRT LOG:  could not send data to client: unrecognized winsock error 10054
2016-08-16 10:26:50 BRT FATAL:  connection to client lost
2016-08-16 10:26:50 BRT LOG:  could not receive data from client: unrecognized winsock error 10053
2016-08-16 10:26:50 BRT LOG:  unexpected EOF on client connection with an open transaction
2016-08-16 10:27:06 BRT LOG:  could not send data to client: unrecognized winsock error 10054
2016-08-16 10:27:06 BRT FATAL:  connection to client lost
2016-08-16 10:27:06 BRT LOG:  could not send data to client: unrecognized winsock error 10054
2016-08-16 10:27:06 BRT FATAL:  connection to client lost
2016-08-16 10:27:30 BRT LOG:  pedido de desligamento rápido foi recebido
2016-08-16 10:27:30 BRT LOG:  interrompendo quaisquer transações ativas
2016-08-16 10:27:30 BRT LOG:  autovacuum launcher shutting down
2016-08-16 10:27:30 BRT ERROR:  canceling statement due to user request
2016-08-16 10:27:30 BRT LOG:  autovacuum launcher shutting down
2016-08-16 10:27:30 BRT LOG:  shutting down
2016-08-16 10:27:30 BRT LOG:  database system is shut down

I don't expect someone to know what exactly what my problem is. I was just hoping someone might have had some similar issues that could shed some light in it.

4
  • Do you get the same odd behavior if you query the db using a different client, like PgAdmin? Or does this weird behavior only happen from your application? Commented Aug 16, 2016 at 14:39
  • @sstan just tested. The query worked on PgAdmin, but took over 34 seconds on the first run. 686ms on the second. Commented Aug 16, 2016 at 14:50
  • 1
    I'm guessing it could be related to Windows startup mess and some sort of timeout in Npgsql. This query could just be slow enough to trigger it. Commented Aug 16, 2016 at 14:54
  • Sounds like a first-time timeout as others have suggested, but we can be more sure if you can simply post the full exception detail - there seems to be an inner exception to your NpgsqlException that's missing. Commented Aug 16, 2016 at 22:55

1 Answer 1

1

With some help I found the solution in the Npgsql docs, here.

Npgsql by default comes with some timeout parameters for connection and commands. After a Windows reboot, first access to the table was very slow, triggering the command timeout.

With additional parameters on the connection string I was able to change those settings higher and solve my problem:

connectionString += ";Timeout=180;Command Timeout=180";

Bonus tip: a Postgresql function pg_sleep(seconds) helped me reproduce the problem without actual reboots. Very helpful:

SELECT pg_sleep(60);
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.