1

I have a table in SQL Server:

select * from TaskSave

TaskID  SaveTypeID  ResultsPath PluginName                      PluginConfiguration
---------------------------------------------------------------------
  92       1         NULL       NULL                            NULL
  92       7         NULL       RGP_MSWord.WordDocumentOutput   www|D:\Users\Peter\Documents\Temp
  92       7         NULL       RGP_WC.WCOutput                 wcwc|D:\Users\Peter\Documents\Temp|.docx|123|456|789

which I am trying to read with C# / Entity Framework:

public static List<TaskSave> GetSavesPerTask(Task task)
{
    using (Entities dbContext = new Entities())
    {
        var savesPerTask = from p in dbContext.TaskSaves.Include("SaveType").Where(q => q.TaskID == task.TaskID) select p;
        //return savesPerTask.ToList();

        // DEBUG
        var x = savesPerTask.ToList();

        foreach (var y in x)
        {
            Console.WriteLine("SaveTypeID {0}, Plugin Name {1}", y.SaveTypeID, y.PluginName);
        }

        return x;
    }
}

The business rule states that TaskID + SaveTypeID + PluginName are unique, i.e. a task can have more than one plug-in. If the save type is not ‘plugin’, then TaskID + SaveTypeID must be unique.

My problem is that the GetSavesPerTask method returns the wrong results. It retrieves three rows, but row 2 is duplicated – I get 2 rows with PluginName of RGP_MSWord.WordDocumentOutput and not the RGP_WC.WCOutput row. The debug print shows:

SaveTypeID 1, Plugin Name 
SaveTypeID 7, Plugin Name RGP_MSWord.WordDocumentOutput
SaveTypeID 7, Plugin Name RGP_MSWord.WordDocumentOutput

Both the debugger and the ultimate user of the data agree that the third row is absent.

I have tried removing the include clause, but that makes no difference to the result set. Here is the SQL (from the simpler case) as reported by the debugger:

savesPerTask    {SELECT 
    [Extent1].[TaskID] AS [TaskID], 
    [Extent1].[SaveTypeID] AS [SaveTypeID], 
    [Extent1].[ResultsPath] AS [ResultsPath], 
    [Extent1].[PluginName] AS [PluginName], 
    [Extent1].[PluginConfiguration] AS [PluginConfiguration]
    FROM (SELECT 
    [TaskSave].[TaskID] AS [TaskID], 
    [TaskSave].[SaveTypeID] AS [SaveTypeID], 
    [TaskSave].[ResultsPath] AS [ResultsPath], 
    [TaskSave].[PluginName] AS [PluginName], 
    [TaskSave].[PluginConfiguration] AS [PluginConfiguration]
    FROM [dbo].[TaskSave] AS [TaskSave]) AS [Extent1]
    WHERE [Extent1].[TaskID] = @p__linq__0}

I have copy and pasted the SQL from the debugger into SSMS and it gets the correct results. I have tried deleting and recreating those two table in the EF model. I have recompiled many times as I’ve tried different things (adding debug, refreshing the model, moving the return outside of the using block, etc.).

The TaskSave table does not have a primary key, so I get Error 6002 during compilation (which I ignore as it does not seem to affect anything else and I can’t find any work-around on the internet)

'Database.dbo.TaskSave' does not have a primary key defined. The key has been inferred and the definition was created as a read-only table/view.

How can I fix the GetSavesPerTask method to return the correct rows? What silly mistake have I made? I have many other tables and CRUD operations, which seem to work as expected.

Here are my versions

  • .NET 4.6.01055
  • C# 2015
  • SQL Server 2014
  • SQL Server data tools 14.0.50730.0
2

1 Answer 1

2

Most likely, the trouble just simply is the lack of an explicit primary key. Every table in your database ought to have a primary key anyway...

What happens here is: since there's no primary key, EF will just use all non-nullable columns from the table (or the view) as a "replacement" PK. Not sure which those are in your case - possibly (TaskID, SaveTypeID) ??

And when EF reads the data, it will go:

  • read the row
  • check the primary key (or the "stand-in" PK in this case)
  • if it has already read a row with that PK - it just duplicates that row that it already has - it will disregard any non-PK columns from the table/view!

So in your case, once it's read a first row with the "stand-in" PK being TaskID = 92, SaveTypeID = 7, any further rows with those two values will just get the already read values - no matter what's stored in the database!

SOLUTION: as I said before: EVERY table OUGHT TO HAVE a proper primary key to uniquely identify each individual row!

In your case, I'd recommend to just add a surrogate key like this:

ALTER TABLE dbo.TaskSave
ADD TaskSaveID INT IDENTITY(1,1) NOT NULL

ALTER TABLE dbo.TaskSave
ADD CONSTRAINT PK_YourTableName
    PRIMARY KEY CLUSTERED(TaskSaveID)

Now, each row has its own, unique TaskSaveID, and EF can detect the proper PK, and all your troubles should be gone.

Sign up to request clarification or add additional context in comments.

3 Comments

A surrogate identity is enough to solve the problem at hand and is a sufficiently correct answer. A more recommended schema to me would be to set TaskID and SaveTypeID as primary key and create a separate TaskSavePlugin table with TaskID, SaveTypeID and PluginName as PK (PluginName is now not nullable). Then call dbContext.TaskSaves.Include("SaveType").Include("SaveTypePlugins")
@Rhumborl: I quite agree! This is a "quick fix" - to at least solve the missing PK design crime.....
Adding a primary key (identity column) fixed the problem and was easy to do. The TaskSave table started as a simple junction table but it grew until it broke.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.