In my SQL Server DB, I have a table of IoT data (nearing 1,000,000 records). When the app receives inbound readings, I want to check that the DB doesn't already have a reading for that device with the same timestamp. What is the fastest way to check for existing records with a matching DeviceId and Timestamp?

Model

public class Reading
{
    public int Id { get; set; }
    public double Measurement { get; set; }
    public int DeviceId { get; set; }
    public Device Device { get; set; }
    public DateTime Timestamp { get; set; }
}

AddReading Method

public class ReadingRepository
{
    private readonly DataContext _context;

    public ReadingRepository(DataContext context)
    {
        _context = context;
    }

    public void AddReading(Reading reading)
    {
        // my proposed method... is there a better way?
        if (!_context.Readings.Any(r =>
            r.DeviceId == reading.DeviceId &&
            r.Timestamp == reading.Timestamp))
            _context.Readings.Add(reading);
    }
}

SqlCommands are an option to cut your round trips in half. You could create a DataContext method along the lines of AddReadingIfNotExists(Reading reading). Commented Jan 23, 2019 at 18:14

1 Answer

The fastest way is to just insert, let a unique index on (DeviceId, Timestamp) reject duplicates, and react to the resulting error.

Alternatively, use a stored procedure.

Anything you do with EF will by definition NOT be the fastest choice. That does not mean abandoning it altogether: EF is good for 80% to 95% of operations. Just bypass it for this one.
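
A minimal sketch of the insert-and-catch approach, for illustration only. It assumes EF Core with the Microsoft.Data.SqlClient provider (older setups use System.Data.SqlClient, and EF6 differs), reuses the Reading class from the question, and gives DataContext a shape that the question does not show. TryAddReading is a hypothetical helper name. It still goes through EF, so it illustrates the unique index and the error handling rather than the fastest possible path.

```csharp
using Microsoft.Data.SqlClient;
using Microsoft.EntityFrameworkCore;

public class DataContext : DbContext
{
    public DataContext(DbContextOptions<DataContext> options) : base(options) { }

    public DbSet<Reading> Readings { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Unique index so SQL Server itself rejects a second reading
        // for the same device and timestamp.
        modelBuilder.Entity<Reading>()
            .HasIndex(r => new { r.DeviceId, r.Timestamp })
            .IsUnique();
    }
}

public class ReadingRepository
{
    // SQL Server error numbers raised for duplicate keys.
    private const int DuplicateKeyInUniqueIndex = 2601;
    private const int UniqueConstraintViolation = 2627;

    private readonly DataContext _context;

    public ReadingRepository(DataContext context)
    {
        _context = context;
    }

    // Hypothetical helper: returns false when the reading already exists.
    public bool TryAddReading(Reading reading)
    {
        _context.Readings.Add(reading);
        try
        {
            _context.SaveChanges();
            return true;
        }
        catch (DbUpdateException ex) when (
            ex.InnerException is SqlException sql &&
            (sql.Number == DuplicateKeyInUniqueIndex ||
             sql.Number == UniqueConstraintViolation))
        {
            // Stop tracking the rejected entity so later saves do not retry it.
            _context.Entry(reading).State = EntityState.Detached;
            return false;
        }
    }
}
```

Any other database error still propagates, because the exception filter only matches the two duplicate-key error numbers.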

3 Comments

Thanks, that's definitely something I'll have to think about doing. The readings do come in a large list via an API request and are first stored in a List<Reading> so I could check the list on itself, and that would probably do the trick. If I were to compare incoming lists against each other, I would suspect that I'd have to find a way to store the list of incoming readings in a variable that can persist across sessions.
Actually, if you get LISTS anyway, then upload the lists to a temp or other staging table via SqlBulkCopy (i.e. allow duplicates there, contrary to your rule) and then use ONE stored procedure to move them to final storage. Suddenly a thread should be able to handle 50k rows in a second (that is an actual number here), and you can use multiple threads / parallel requests. Eliminate duplicates in the SELECT part of the INSERT ... SELECT that moves the rows.
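
A rough sketch of that staging flow, under several assumptions: the final table is dbo.Readings with DeviceId, Timestamp and Measurement columns, the provider is Microsoft.Data.SqlClient, and the move step is inline SQL rather than the stored procedure suggested above. ReadingBulkLoader and #ReadingStaging are hypothetical names. A per-connection temp table is used so that parallel requests do not see each other's rows.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using Microsoft.Data.SqlClient;

public static class ReadingBulkLoader
{
    // Temp table is private to the connection and has no unique index.
    private const string CreateStagingSql = @"
CREATE TABLE #ReadingStaging (
    DeviceId int NOT NULL,
    [Timestamp] datetime2 NOT NULL,
    Measurement float NOT NULL);";

    // GROUP BY collapses duplicates inside the batch (keeping the lowest
    // measurement, an arbitrary choice); NOT EXISTS skips rows already stored.
    private const string MoveSql = @"
INSERT INTO dbo.Readings (DeviceId, [Timestamp], Measurement)
SELECT s.DeviceId, s.[Timestamp], MIN(s.Measurement)
FROM #ReadingStaging AS s
WHERE NOT EXISTS (
    SELECT 1 FROM dbo.Readings AS r
    WHERE r.DeviceId = s.DeviceId AND r.[Timestamp] = s.[Timestamp])
GROUP BY s.DeviceId, s.[Timestamp];";

    public static int Load(string connectionString, IEnumerable<Reading> readings)
    {
        // Copy the incoming list into a DataTable that SqlBulkCopy can stream.
        var table = new DataTable();
        table.Columns.Add("DeviceId", typeof(int));
        table.Columns.Add("Timestamp", typeof(DateTime));
        table.Columns.Add("Measurement", typeof(double));
        foreach (var r in readings)
            table.Rows.Add(r.DeviceId, r.Timestamp, r.Measurement);

        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();

            using (var create = new SqlCommand(CreateStagingSql, connection))
                create.ExecuteNonQuery();

            using (var bulk = new SqlBulkCopy(connection))
            {
                bulk.DestinationTableName = "#ReadingStaging";
                bulk.ColumnMappings.Add("DeviceId", "DeviceId");
                bulk.ColumnMappings.Add("Timestamp", "Timestamp");
                bulk.ColumnMappings.Add("Measurement", "Measurement");
                bulk.WriteToServer(table);
            }

            // Returns the number of rows that reached dbo.Readings.
            using (var move = new SqlCommand(MoveSql, connection))
                return move.ExecuteNonQuery();
        }
    }
}
```

Two parallel requests carrying the same reading can both pass the NOT EXISTS check, so a unique index on (DeviceId, Timestamp) remains a sensible backstop.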
