7

We are using .NET Core 2.1 and Entity Framework Core 2.1.1

I have the following setup in Azure West Europe

  • Azure SQL Database -- Premium P2 250 DTU -- Public endpoint, no VNET peering -- "Allow access to Azure Services" = ON

  • Azure Functions -- Consumption Plan -- Timeout 10 Minutes

  • Azure Blob storage -- hot tier

Multiple blobs are uploaded to Azure Blob storage, and Azure Functions (up to 5 concurrently) are triggered via Azure Event Grid. The functions check the structure of the blobs against metadata stored in Azure SQL DB. Each blob contains up to 500K records with 5 columns of payload data. For each record, the function makes a call against Azure SQL DB; nothing is cached.

When multiple blobs are processed in parallel (up to 5 asynchronous Azure Functions calls at the same time) and the blobs are large (200K-500K records), I often get the following transient and connection errors from .NET Core Entity Framework:

1. An exception has been raised that is likely due to a transient failure. Consider enabling transient error resiliency by adding 'EnableRetryOnFailure()' to the 'UseSqlServer' call.

2. A connection was successfully established with the server, but then an error occurred during the pre-login handshake. (provider: SSL Provider, error: 0 - The wait operation timed out.)

3. Connection Timeout Expired. The timeout period elapsed while attempting to consume the pre-login handshake acknowledgement. This could be because the pre-login handshake failed or the server was unable to respond back in time. This failure occurred while attempting to connect to the routing destination. The duration spent while attempting to connect to the original server was - [Pre-Login] initialization=13633; handshake=535; [Login] initialization=1; authentication=0; [Post-Login] complete=156; The duration spent while attempting to connect to this server was - [Pre-Login] initialization=5679; handshake=2044;

4. A connection was successfully established with the server, but then an error occurred during the pre-login handshake. (provider: SSL Provider, error: 0 - The wait operation timed out.)

5. Server provided routing information, but timeout already expired.

At the same time, no health events were reported for the Azure SQL Database during the test, and the metrics look awesome: MAX Workers < 3.5%, Sum Successful Connections < 35, MAX Sessions Percentage < 0.045%, Max Log IO percentage < 0.024%, Sum Failed Connections = 0, MAX DTU < 10%, Max Data IO < 0.055%, MAX CPU < 10%.

Running connection stats on Azure SQL DB (sys.database_connection_stats_ex): No failed, aborted or throttled connections.

select *
from sys.database_connection_stats_ex
where start_time >= CAST(FLOOR(CAST(getdate() AS float)) AS DATETIME)
order by start_time desc

Has anyone faced similar issues with the combination of .NET Core Entity Framework and Azure SQL Database? Why am I getting these transient errors, and why do the Azure SQL Database metrics look so good, not reflecting the issues at all?

Thanks a lot in advance for any help.

using Microsoft.EntityFrameworkCore;

namespace MyProject.Domain.Data
{
    public sealed class ApplicationDbContextFactory : IApplicationDbContextFactory
    {
        private readonly IConfigurationDbConfiguration _configuration;
        private readonly IDateTimeService _dateTimeService;

        public ApplicationDbContextFactory(IConfigurationDbConfiguration configuration, IDateTimeService dateTimeService)
        {
            _configuration = configuration;
            _dateTimeService = dateTimeService;
        }

        public ApplicationDbContext Create()
        {
            //Not initialized in ctor due to unit testing static functions.
            var options = new DbContextOptionsBuilder<ApplicationDbContext>()
                .UseSqlServer(_configuration.ConfigurationDbConnectionString).Options;

            return new ApplicationDbContext(options, _dateTimeService);
        }
    }
}
4
  • Have you tried the EnableRetryOnFailure() approach? It should resolve most of your issues. I would suggest creating one post per exception, as they may not be related. Commented Jan 15, 2019 at 9:27
  • Yes, EnableRetryOnFailure() has helped us a lot. We experience fewer issues, but they are not completely gone. However, the question of why the transient issues were not reflected/reported in the Azure SQL DB metrics is still open. Commented Jan 16, 2019 at 10:08
  • Do you understand the concept of transient errors? They are networking related, which is why you need a retry strategy. Commented Jan 16, 2019 at 10:15
  • @Thomas: thanks for your answers. Enabling retry on failure strategy has solved all our issues. Commented Jan 26, 2019 at 18:59

2 Answers 2

9

I've found some good documentation on Azure SQL Database transient errors.

From the documentation:

A transient error has an underlying cause that soon resolves itself. An occasional cause of transient errors is when the Azure system quickly shifts hardware resources to better load-balance various workloads. Most of these reconfiguration events finish in less than 60 seconds. During this reconfiguration time span, you might have connectivity issues to SQL Database. Applications that connect to SQL Database should be built to expect these transient errors. To handle them, implement retry logic in their code instead of surfacing them to users as application errors.

It then explains in detail how to build retry logic for transient errors.
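
For illustration, a hand-rolled version of that kind of retry logic might look like the sketch below. The class name TransientRetry, the isTransient predicate, and the attempt count and delays are my own assumptions, not taken from the documentation:

```csharp
using System;
using System.Threading;

// Hypothetical helper: retries an operation when the caller-supplied
// predicate classifies the exception as transient.
public static class TransientRetry
{
    public static T Execute<T>(
        Func<T> operation,
        Func<Exception, bool> isTransient,
        int maxAttempts = 5,     // assumed value, tune for your workload
        int baseDelayMs = 500)   // assumed value, tune for your workload
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            // Only retry while attempts remain and the error looks transient;
            // otherwise the exception propagates to the caller unchanged.
            catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
            {
                // Exponential backoff: with the defaults this waits
                // 500 ms, 1 s, 2 s, 4 s between attempts.
                Thread.Sleep(baseDelayMs * (1 << (attempt - 1)));
            }
        }
    }
}
```

The predicate decides what counts as transient; for SQL Server that would typically mean inspecting the exception's error numbers, which this sketch deliberately leaves to the caller.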

The Entity Framework Core SQL Server provider has built-in retry logic, which you enable like this:

protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
    optionsBuilder
        .UseSqlServer("<connection string>", options => options.EnableRetryOnFailure());
}
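
If the options are built in a factory, as in the question, rather than in OnConfiguring, the same call can go on the options builder there. A rough sketch follows, where the helper name RetryingOptions and the retry count and delay are illustrative assumptions rather than recommended values:

```csharp
using System;
using Microsoft.EntityFrameworkCore;

// Hypothetical helper (not part of the question's code): builds context
// options with the SQL Server retrying execution strategy enabled.
public static class RetryingOptions
{
    public static DbContextOptions<TContext> Build<TContext>(string connectionString)
        where TContext : DbContext
    {
        return new DbContextOptionsBuilder<TContext>()
            .UseSqlServer(connectionString, sqlOptions =>
                sqlOptions.EnableRetryOnFailure(
                    maxRetryCount: 5,                        // assumed value
                    maxRetryDelay: TimeSpan.FromSeconds(30), // assumed value
                    errorNumbersToAdd: null))                // no extra error numbers
            .Options;
    }
}
```

In the question's Create() method this would replace the plain UseSqlServer(...) call, for example `var options = RetryingOptions.Build<ApplicationDbContext>(_configuration.ConfigurationDbConnectionString);`. Keep in mind that once a retrying strategy is enabled, user-initiated transactions have to be run through the strategy returned by Database.CreateExecutionStrategy().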

You can find more information here:

-3

Remove and recreate the database user, and make sure to fill in the Login Name box just below the User Name. This will fix the same issue on older SQL Server versions too.

1 Comment

Make sure you reinstall Windows first. -.-
