0

We have client ERP solution which sends XML file data in Azure Event Hub. We need to process that XML file data & store it in a SQL Server table. The ERP system sends multiple files (1000's files with different schemas) per seconds into Event Hub.

We did a POC's with 2 approaches:

  1. Azure Function with custom Web API's along with stored procedures to do that but it has performance issue for failed cases

  2. Event HUB with Logic App - logging & error handling can be critical with this approach.

We don't have to use blob storage or Azure Data Lake.

Please suggest some best approach and links, or any tools which will help us.

1
  • Thousands files per seconds - I would definitely use Azure Functions and not Logic Apps Commented Jul 12, 2024 at 0:12

1 Answer 1

0

When dealing with

  1. High-throughput scenarios (thousands of files per second)
  2. A varying schemas from Azure Event Hub
  3. Storing data in SQL Server

Performance, flexibility & reliability are key factors.

Recommended Approach: Azure Functions with Event Hub Trigger

Why Azure Functions with Event Hub Trigger is Better:

  • Direct Integration: Azure Functions can directly connect to Event Hub using an Event Hub trigger, which eliminates the overhead of HTTP calls. This direct connection reduces latency and improves performance.
  • Flexibility: Logic Apps can become complex when dealing with a variety of schemas, often requiring repeated actions. Azure Functions offer more flexibility by allowing custom code to handle different schemas effectively.
  • Scalability: Event Hub can handle high throughput and provides a scalable way to ingest a large volume of events.
  • Error Handling: By using Azure Functions, you can implement robust error handling and logging mechanisms to manage failed cases.

Example: (isolated worker model)

 private readonly ILogger<EventHubsFunction> _logger;

    public EventHubsFunction(ILogger<EventHubsFunction> logger)
    {
        _logger = logger;
    }

    [Function(nameof(EventHubFunction))]
    [FixedDelayRetry(5, "00:00:10")]
    [EventHubOutput("dest", Connection = "EventHubConnection")]
    public string EventHubFunction(
        [EventHubTrigger("src", Connection = "EventHubConnection")] string[] input,
        FunctionContext context)
    {
        _logger.LogInformation("First Event Hubs triggered message: {msg}", input[0]);

        var message = $"Output message created at {DateTime.Now}";
        return message;
    }

See Azure Event Hubs trigger for Azure Functions

Additionally, Azure Functions can be used in combination with Durable Functions to handle long-running processes and complex orchestrations. This approach can help with a custom retry and error-handling mechanism if required.

Microsoft has guidlnes on resilient design for Event Hubs and Azure Functions: It covers key measures to ensure robust event streaming solutions. It discusses error handling, designing for idempotency, and managing retries to handle large data volumes effectively.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.