using hashset in entity framework

Question

I want to know the difference between creating classes with and without a HashSet in the constructor.

Using the Code First approach (4.3), one can create models like this:

public class Blog
{
    public int Id { get; set; }
    public string Title { get; set; }
    public string BloggerName { get; set; }
    public virtual ICollection<Post> Posts { get; set; }
}

public class Post
{
    public int Id { get; set; }
    public string Title { get; set; }
    public DateTime DateCreated { get; set; }
    public string Content { get; set; }
    public int BlogId { get; set; }
    public ICollection<Comment> Comments { get; set; }
}

Or one can create models like this:

public class Customer
{
    public Customer()
    {
        BrokerageAccounts = new HashSet<BrokerageAccount>();
    }
    public int Id { get; set; }
    public string FirstName { get; set; }
    public ICollection<BrokerageAccount> BrokerageAccounts { get; set; }
}

public class BrokerageAccount
{

    public int Id { get; set; }
    public string AccountNumber { get; set; }
    public int CustomerId { get; set; }

}

What is the HashSet doing here?

Should I use a HashSet in the first two models as well?

Is there any article that shows the application of HashSet?

NER1808 · Accepted Answer · 2015-03-23 18:02:26Z

28

The HashSet does not define the type of collection that will be generated when you actually fetch data. This will always be of type ICollection as declared.

The HashSet created in the constructor is to help you avoid NullReferenceExceptions when no records are fetched or exist in the many side of the relationship. It is in no way required.

For example, based on your question, when you try to use a relationship like...

var myCollection = Blog.Posts();

If no Posts exist, then myCollection will be null. That is fine until you chain calls fluently and do something like

var myCollectionCount = Blog.Posts.Count();

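which will fail with an exception because Posts is null (a NullReferenceException when reading the Count property; the LINQ Count() extension method reports an ArgumentNullException instead).

Whereas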

var myCollection = Customer.BrokerageAccounts();
var myCollectionCount = Customer.BrokerageAccounts.Count();

will result in an empty ICollection and a zero count. No exceptions :-)
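The difference described above can be reproduced without a database. The following is a minimal sketch using plain objects only (no Entity Framework context or proxies involved); the `CollectionInitDemo` class and its method names are hypothetical and exist only for illustration. It reads the `Count` property rather than calling the LINQ `Count()` extension method.

```csharp
using System;
using System.Collections.Generic;

public class Post
{
    public int Id { get; set; }
}

// No constructor: Posts stays null until something assigns it.
public class Blog
{
    public int Id { get; set; }
    public virtual ICollection<Post> Posts { get; set; }
}

public class BrokerageAccount
{
    public int Id { get; set; }
}

// The constructor initializes the collection, so it is never null on a new instance.
public class Customer
{
    public Customer()
    {
        BrokerageAccounts = new HashSet<BrokerageAccount>();
    }

    public int Id { get; set; }
    public ICollection<BrokerageAccount> BrokerageAccounts { get; set; }
}

// Hypothetical helper class, for illustration only.
public static class CollectionInitDemo
{
    // Returns true if reading Count on a freshly constructed Blog throws.
    public static bool CountThrowsOnNewBlog()
    {
        var blog = new Blog();
        try
        {
            int count = blog.Posts.Count; // Posts is null here
            return count < 0;             // not reached
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    // Returns the Count of the collection on a freshly constructed Customer.
    public static int CountOnNewCustomer()
    {
        return new Customer().BrokerageAccounts.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountThrowsOnNewBlog()); // True
        Console.WriteLine(CountOnNewCustomer());   // 0
    }
}
```

How Entity Framework populates or replaces these properties on entities it materializes is a separate matter that this sketch does not exercise.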

edited Mar 23, 2015 at 18:02

answered Jul 30, 2014 at 12:18


5 Comments

bradlis7 Over a year ago

Is the () on properties valid (Blog.Posts())? Shouldn't it just be Blog.Posts to access the field?

linac Over a year ago

This seems to be wrong. The debugger shows me exactly the type I use in my constructor, even for data fetched from the database. This is also reflected in different behaviors when accessing the collection (ex. through DataBinding on those collections).

NER1808 Over a year ago

@linac It's not the HashSet that defines the return type, but the definition of the ICollection<T> property. The HashSet is used just to initialize the ICollection property. If you don't initialize the property in the constructor, the debugger will still show the ICollection type as defined. Nothing to do with the HashSet!!

Joep Beusenberg Over a year ago

You'll have to mark your property as virtual, for EF to override the collection type. Otherwise it has no other option than to keep the available list.

andrew pate Over a year ago

I'm kind of curious to know whether there would be a small performance hit from adding the HashSet in the constructor. Does that mean that for every Blog that Entity Framework creates, it first has to make an empty HashSet, which is never used because Entity Framework then overrides the ICollection<BrokerageAccount> BrokerageAccounts getters and setters? ... I can see that having the HashSet might be useful for unit tests.

Daniel · Accepted Answer · 2013-08-10 05:22:51Z

Generally speaking, it is best to use the collection that best expresses your intentions. If you do not specifically intend to use the HashSet's unique characteristics, I would not use it.

It is unordered and does not support lookups by index. Furthermore, it is not as well suited for sequential reads as other collections, and the fact that it allows you to add the same item multiple times without creating duplicates is only useful if you have a reason to use it for that. If that is not your intention, it can hide misbehaving code and make problems difficult to isolate.

The HashSet is mostly useful in situations where insertion and removal times are very important, such as when processing data. It is also extremely useful for comparing sets of data (again when processing) using operations like intersect, except, and union. In any other situation, the cons generally outweigh the pros.
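As a rough illustration of the set behavior mentioned above, here is a small sketch using `HashSet<int>` from `System.Collections.Generic`. The `HashSetDemo` class and its method names are hypothetical; `Add`, `IntersectWith`, `ExceptWith`, and `UnionWith` are the standard `HashSet<T>` members.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, for illustration only.
public static class HashSetDemo
{
    // Adding the same value twice: the second Add returns false and the set is unchanged.
    public static bool SecondAddSucceeds()
    {
        var ids = new HashSet<int>();
        ids.Add(7);
        return ids.Add(7);
    }

    // Elements present in both sequences.
    public static HashSet<int> InBoth(IEnumerable<int> a, IEnumerable<int> b)
    {
        var result = new HashSet<int>(a);
        result.IntersectWith(b);
        return result;
    }

    // Elements of the first sequence that are not in the second.
    public static HashSet<int> OnlyInFirst(IEnumerable<int> a, IEnumerable<int> b)
    {
        var result = new HashSet<int>(a);
        result.ExceptWith(b);
        return result;
    }

    // Elements present in either sequence, without duplicates.
    public static HashSet<int> InEither(IEnumerable<int> a, IEnumerable<int> b)
    {
        var result = new HashSet<int>(a);
        result.UnionWith(b);
        return result;
    }

    public static void Main()
    {
        var a = new[] { 1, 2, 3 };
        var b = new[] { 2, 3, 4 };
        Console.WriteLine(SecondAddSucceeds());         // False
        Console.WriteLine(InBoth(a, b).Count);          // 2
        Console.WriteLine(OnlyInFirst(a, b).Count);     // 1
        Console.WriteLine(InEither(a, b).Count);        // 4
    }
}
```

Each helper copies its first argument into a new set before modifying it, because the `*With` methods mutate the set they are called on.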

Consider that when working with blog posts, inserts and removes are quite rare compared to reads, and you generally want to read the data in a specific order, anyway. That is more or less the exact opposite of what the HashSet is good at. It is highly doubtful that you would ever intend to add the same post twice, for any reason, and I see no reason why you would use set-based operations on posts in a class like that.

Jamie Ide · Accepted Answer · 2013-03-15 13:11:37Z

22

I'm fairly new to Entity Framework but this is my understanding. The collection types can be any type that implements ICollection<T>. In my opinion a HashSet is usually the semantically correct collection type. Most collections should only have one instance of a member (no duplicates) and HashSet best expresses this. I have been writing my classes as shown below and this has worked well so far. Note that the collection is typed as ISet<T> and the setter is private.

public class Customer
{
    public Customer()
    {
        BrokerageAccounts = new HashSet<BrokerageAccount>();
    }
    public int Id { get; set; }
    public string FirstName { get; set; }
    public ISet<BrokerageAccount> BrokerageAccounts { get; private set; }
}
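One detail worth keeping in mind with this pattern: a `HashSet<T>` decides what counts as a duplicate through `Equals` and `GetHashCode`, which for an ordinary class default to reference equality. The sketch below repeats the class from the answer with a plain `BrokerageAccount`; the `SetSemanticsDemo` class, its method names, and the sample account numbers are hypothetical. It runs on plain objects and does not involve Entity Framework.

```csharp
using System;
using System.Collections.Generic;

public class BrokerageAccount
{
    public int Id { get; set; }
    public string AccountNumber { get; set; }
}

public class Customer
{
    public Customer()
    {
        BrokerageAccounts = new HashSet<BrokerageAccount>();
    }

    public int Id { get; set; }
    public string FirstName { get; set; }
    public ISet<BrokerageAccount> BrokerageAccounts { get; private set; }
}

// Hypothetical helper class, for illustration only.
public static class SetSemanticsDemo
{
    // The same object added twice is stored once.
    public static int AddSameInstanceTwice()
    {
        var customer = new Customer();
        var account = new BrokerageAccount { Id = 1, AccountNumber = "A-1" };
        customer.BrokerageAccounts.Add(account);
        customer.BrokerageAccounts.Add(account);
        return customer.BrokerageAccounts.Count;
    }

    // Two distinct objects with the same Id are both stored, because
    // BrokerageAccount does not override Equals/GetHashCode.
    public static int AddTwoInstancesWithSameId()
    {
        var customer = new Customer();
        customer.BrokerageAccounts.Add(new BrokerageAccount { Id = 1, AccountNumber = "A-1" });
        customer.BrokerageAccounts.Add(new BrokerageAccount { Id = 1, AccountNumber = "A-1" });
        return customer.BrokerageAccounts.Count;
    }

    public static void Main()
    {
        Console.WriteLine(AddSameInstanceTwice());      // 1
        Console.WriteLine(AddTwoInstancesWithSameId()); // 2
    }
}
```

So the "no duplicates" guarantee applies to object instances unless the entity class defines its own equality.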

answered Mar 15, 2013 at 13:11


2 Comments

Erik Schierboom Over a year ago

I completely agree. In most of the cases the HashSet is the most natural fit.

Steve Kennedy Over a year ago

hashset still seems right in EF6.x. Natively, EF will use hashsets in this exact manner during db-first creation of types too.
