The Content Data Context

This post is part of a series on learning how to use Entity Framework. The rest of the posts in the series are linked below.

Basics of Entity Framework

Code First

Database First


Recall from an earlier post how the DbContext class provides the framework to establish and maintain an open connection to the database. This class is typically extended with a custom context for the application, which allows the developer to add DbSet instances for all the types which are part of the application’s Entity Framework model. Each DbSet property is a collection of entity instances, one for each record in the database. The DbSet is an implementation of IDbSet, which in turn extends IQueryable and IEnumerable.

The DbContext primarily fulfils the following roles.

Querying

Provides an interface to convert statements from LINQ-to-Entities (using queries as well as method syntax) to the SQL syntax specific to the underlying database provider.

Object Materialisation

Converts database records from ADO.NET types such as DataSet to application-specific types through mapping rules.

Change Tracking

Tracks the state of newly created, modified or deleted entities for later committing to the database.

Persistence

Triggering INSERT, UPDATE or DELETE queries on changed entities when the SaveChanges method is called.

Caching

Provides a local, lightweight embedded store for recently retrieved entities to avoid repeated querying against the primary database.

Connecting to the Database

The DbContext class automatically creates a database whose name is derived from the fully-qualified name of the context class. However, this behaviour can be overridden by the programmer by passing a custom connection string, database name, or connection string name from the .config file associated with the running application.

The DbContext class has 7 variants of the constructor to provide different ways of establishing a connection to the database server. The default constructor triggers the behaviour described above.

The constructor variant that takes a string parameter can be used to specify any one of the following.

  1. A standard connection string.
  2. A name of a connection string declared in the configuration file (using the syntax “name=connectionString”).
  3. An Entity Framework connection string declared in the configuration file.
  4. A database name (which is sought on the server .\SQLEXPRESS by default).

Alternative constructors can accept DbConnection or ObjectContext instances instead of a string in order to work with already open connections.

Entity Collections

Records in the database tables are referenced through a DbSet instance, which is an implementation of the set abstract data type specifically designed for Entity Framework. A collection can be non-generic by using the System.Data.Entity.DbSet type, or generic through the use of System.Data.Entity.DbSet<TEntity> type.

Collections of root entities are often exposed as DbSet properties of the application’s context class. But specific entity types can also be extracted from the database by using the Set method (or its generic variant) on the context.

The DbSet class exposes methods to perform your standard CRUD operations on the table. The most important methods are Create, Add, Find, Remove and Attach. You will notice that most of these methods are also available on the Set ADT, which the DbSet type is based on.

Create, Add and Attach sound similar in nature, but are quite different. The Create method instantiates an entity without adding it to the DbSet. The Add method is used to add the entity to the database and mark it for saving later. Finally, Attach adds an entity to the collection without marking it for saving. This is useful for adding temporary entities to read-only collections.

The revised ContentContext class for this application looks like this.

namespace Notadesigner.Blog
{
    public class ContentContext : DbContext
    {
        public ContentContext("Notadesigner")
        {
        }

        public virtual DbSet Blogs
        {
            get;
            set;
        }

        public virtual DbSet Posts
        {
            get;
            set;
        }

        public virtual DbSet Categories
        {
            get;
            set;
        }
    }
}

Based on this structure, the code to retrieve a specific Post by its primary key would be as shown below.

var db = new ContentContext();
var post = db.Posts.Find(10);

Inserting a new record is done in two parts. First, the entity instance is created, then it is added to the collection and the changes saved to the database.

var db = new ContentContext();
var post = db.Posts.Create();
post.Title = "Ways to Try and Take Over the World";
post.Slug = GenerateSlug(post.Title);
// Set other properties on the entity
…
db.Posts.Add(post);
db.SaveChanges();

Attach is useful when you wish to update only a few properties in a record which contains many columns.

var db = new ContentContext();
var post = db.Posts
                .AsNoTracking()
                .Where(p => p.PostId = 10)
                .FirstOrDefault();
// Change the title only
post.Title = "More Ways to Try and Take Over the World";
db.Posts.Attach(post);
db.SaveChanges();

The above code generates an update statement that only affects the Title column, as opposed to updating all the columns in the Posts table.

Deleting records is as simple as finding an entity and marking it for deletion through the Remove method.

var db = new ContentContext();
var post = db.Posts.Find(10);
db.Posts.Remove(post);
db.SaveChanges();

Irrespective of the method used, changes are committed only if the SaveChanges method is called on the context instance.

Conventions, Overrides

When an application using the code-first approach is launched for the first time, Entity Framework creates tables for each entity in the newly created database by converting the entity name into its plural form. For example, the Post entity is associated with a table called “Posts”. The columns in the table are also created automatically based on the entity properties. The property PostId is associated with a column of non-nullable integers of the same name, and it is assigned as the primary key. Similarly, the Title property gets associated with the non-nullable varchar column called Title, whose maximum size is 256 characters. Default conventions can be overridden by using attributes as shown in the previous post, or by using the Fluent API in the initialisation strategy.

Some of the most essential default conventions are shown below.

Schema

The dbo schema is used for all database operations by default.

Tables

Table names are derived from the plural form of the entity they represent.

Primary and Foreign Keys

The Id property or <Entity>Id property is the default primary key. A property with the same name is required in the dependent table. The foreign key is named by combining the dependent navigation property name with the principal primary key property name, separated by an underscore (i.e. Post_OwnerId).

Nullability

Columns to store reference type and nullable primitive type properties are nullable, and primitive type properties are stored in non-nullable columns.

Laying the Groundwork

This post is part of a series on learning how to use Entity Framework. The rest of the posts in the series are linked below.

Basics of Entity Framework

Code First

Database First


Many application developers prefer a code-first approach to implementing a business application with Entity Framework. In this workflow, the programmer begins by writing classes that represent entities in the domain model. As we are aware, each instance of these classes represents a single row from the database. The entity classes correspond to a single table. Public properties on the class are mapped to columns in this table.

For example, a Person class maps to the Persons table in the database in the conventional Entity Framework implementation. The properties Name (string), Age (int) and Gender (enum), correspond to columns Name (nvarchar), Age (int) and Gender (int). When a row is retrieved from the database, the values from its columns are loaded into the properties of this instance. Conversely, modified values in the Person instance can be written back to the table by passing its reference back into the appropriate Entity Framework APIs.

Traditionalists would cringe at the thought of writing application code without a rock-solid database model already in place to back it up. And their outlook is borne out of poor outcomes from automatic table-generation tools. Fortunately, Entity Framework (and several other modern ORM tools) address the table performance characteristics and data integrity problems very well.

If performance or correctness are a problem, they are probably a result of poor description of the application domain by the programmer, rather than a shortcoming of the ORM. Being able to generate entity structures from the application layer does not take away the need to understand the performance characteristics of a database engine. And errors in designing or describing the domain model by inexperienced application developer can still affect the product adversely.

The Business Model

The subject of this exercise is a multi-user blog that can be used to host and serve posts authored by one or more registered users. Posts are created and owned by any user with the appropriate level of access to the editor interface. The owner can then invite more authors to collaborate with them. Collaborators can edit the contents of a post, but cannot delete it.

Finally, anonymous users can view the post by visiting a specific URL for each post. The blog home page is a read-only index of all posts, sorted in reverse chronological order.

The Blog

The blog is the primary entity in our application. It has several descriptive properties of the site itself, such as the name, description and URL. A blog is owned by a user who has super-admin rights to the application. This user has to be created and linked to the blog at the time of setup.

using System.ComponentModel.DataAnnotations;

namespace Notadesigner.Blog.Models
{
    public class Blog
    {
        [Key]
        public int BlogId { get; set; }

        [Required]
        [StringLength(128)]
        public string Name { get; set; }

        [Required]
        public string BaseUrl { get; set; }

        [Required]
        [ForeignKey("Owner")]
        public int OwnerId { get; set; }

        public User Owner { get; set; }

        public ICollection<Category> Categories { get; set; }

        public ICollection<Post> Posts { get; set; }

        public ICollection<User> Users { get; set; }
    }
}

This example introduces several classes from the DataAnnotations namespace which Entity Framework uses when creating the database tables.

Key

The default convention of the framework is to make any property suffixed with the word Id into the primary key for the corresponding table. The Key attribute is not required on the BlogId property. But making it explicit helps maintenance later down the line. The index in the database engine is prefixed with IX_, e.g. IX_PostId.

Required

Nullable columns are the default convention in Entity Framework. The Required attribute changes this and marks them as non-nullable. In addition to enforcing data integrity, this attribute is also used to provide client-side validation in the MVC framework.

StringLength

This attribute defines an explicit maximum on the size of string columns, such as nvarchar in MS SQL. Like Required, this attribute is also usable for client-side validation.

ForeignKey

This attribute defines a foreign key between two entities which enforces relational integrity between tables in the database. The UserId column can only contain a value which already exists in the corresponding column in the User table.

This attribute requires the name of the property which contains the referenced entity record. In this case, the parameter is “Owner”, which is of type User. The User class is described further below.

Users & Roles

A User is the unique identification of a person who operates or accesses content from the blog. Is is composed of a minimum of a unique identification number and a unique user name. A person who wishes to identify as a particular User in the application also requires the password for that account. Passwords are stored as salted and hashed strings.

Roles are coarse-level controls over the actions that a single User can perform in the application. A role is composed of several granular permissions, which can either be explicitly granted or denied. Permissions are not provided by individual modules in the application. For example, the Posts module might have the permissions view, create, edit-title, edit-body, edit-attributes, publish, unpublish and delete. The Users module might offer the permissions create, reset-password, recover, edit-attributes and delete. A Role organises these permissions into a cohesive group of related permissions (e.g. Subscriber, Publisher, Content Creator, Intern, Moderator, Administrator, etc.).

In a typical usage scenario, Users must belong to at least one role in order to be able to perform any useful identifying activity in the application. But the application can also be configured to allow anonymous access to the entire application.

Posts

A Post is a single unit of content in the application. It is created and owned by a single user, and can have one or more collaborators who contribute to the content.

using System.ComponentModel.DataAnnotations;

namespace Notadesigner.Blog.Models
{
    public class Post
    {
        [Key]
        public int PostId { get; set; }

        [Required]
        [StringLength(256)]
        public string Title { get; set; }

        [Required]
        [Index(IsUnique = true)]
        [StringLength(128)]
        public string Slug { get; set; }

        [StringLength(102400)]
        public string Body { get; set; }

        [Required]
        public PostStatus Status { get; set; }

        public DateTime CreatedOn { get; set; }

        public DateTime ModifiedOn { get; set; }

        [Required]
        [ForeignKey("Owner")]
        public OwnerId { get; set; }

        public User Owner { get; set; }

        public ICollection<User> Collaborators { get; set; }

        public ICollection<Category> Categories { get; set; }
    }
}

The class is fairly vanilla, and has all the data annotations which were described previously. It also introduces the Index attribute on the Slug property. A slug is a human-friendly identifier for a post. Words and phrases add more context to the human mind than plain integers. So it is easier to remember and parse what the phrase “a-field-guide-to-entity-framework” points to, than it is to remember the post ID.

In order for slugs to work correctly, they have to be unique. Therefore, the model requires a unique constraint on the value contained in the slug. This instruction is converted into a unique, clustered index in the database table.

A single entity can have multiple indexes, only one of which can be clustered. An index can be composed of multiple properties.

Categories

Categories provide the taxonomy framework to organise posts into groups of common topics. It is a fairly common feature in all content management systems.

using System.ComponentModel.DataAnnotations;

namespace Notadesigner.Blog.Models
{
    public class Category
    {
        [Key]
        public int CategoryId { get; set; }

        [Required]
        [StringLength(256)]
        public string Name { get; set; }

        [Required]
        [Index(IsUnique = true)]
        public string Slug { get; set; }

        [Required]
        [ForeignKey("Owner")]
        public OwnerId { get; set; }

        public User Owner { get; set; }

        public ICollection<Post> Posts { get; set; }
    }
}

These classes collectively provide all the data objects needed to store and manipulate records from the database in the application. The next post in the series will cover the process of connecting to a data source and loading records into memory by using the data context.

Domain Entities

This post is part of a series on learning how to use Entity Framework. The rest of the posts in the series are linked below.

Basics of Entity Framework

Code First

Database First


An entity class is written in your application programming language using standard language constructs such as the qualifier, class name and public properties. All properties must have an accessor and a mutator.

namespace Notadesigner.Blog
{
    public class Post
    {
        public int PostId { get; set; }
        public string Title { get; set; }
        public string Body { get; set; }
        public DateTime CreatedOn { get; set; }
    }
}

The class is utilised into the Entity Framework by creating a DbSet instance property for it in the DbContext.

namespace Notadesigner.Blog
{
    public class ContentContext : DbContext
    {
        public DbSet Posts { get; set; }
    }
}

Scalar Properties

Entity classes have scalar properties to represent values which are stored in the database column directly. For example, the values for PostId, Title, Body and CreatedOn fields are stored in columns in the Posts table in the database.

Navigation Properties

Navigation properties are used to represent relationships between database tables. Entity Framework supports relationship multiplicities of zero-or-one, one, zero-or-many, or one-or-many. Object references are used to represent relationships. Properties whose type is of another entity are used for zero-or-one, or one multiplicity. Properties whose type is a generic collection instance represent zero-or-many, or one-or-many relationships.

namespace Notadesigner.Blog
{
    public class Post
    {
        public int PostId { get; set; }
        …
        // Multiplicity of one or zero-or-one
        public Author CreatedBy { get; set; }

        // Multiplicity of zero-or-many, or many
        public ICollection { get; set; }
    }
}

Entity States

Each entity class instance has an associated state which is maintained by the Entity Framework internally. The value of the state property changes automatically based on the operations that the developer performs on the entity instance or the database context instance.

This property can transition between the following enumerations.

  • Added
  • Modified
  • Deleted
  • Unchanged
  • Detached

The context instance in the object service layer is responsible for maintaining the current state and changing it in response to API calls. When the SaveChanges method is called, the context instance determines what database operation has to be performed on each entity instance in the DbSet collection based on the value of its State property.

An insert operation is performed for entities whose state is set to Added, update for entities whose state property is set to Modified and delete for entities whose state property is set to Deleted.

The Database Context

This post is part of a series on learning how to use Entity Framework. The rest of the posts in the series are linked below.

Basics of Entity Framework

Code First

Database First


DbContext is probably the single most important class in the EF API, as it directly represents a connection with the database from the application. A minimal context class inherits from DbContext and exposes DbSet instances for each entity type which has to be stored or retrieved from the database.

Querying for information retrieval and storage is done through the DbContext instance. If the developer has not overridden any configuration settings, Entity Framework attempts to connect to a database with the fully qualified name of the DbContext class itself.

namespace Notadesigner.Blog
{
    public class ContentContext : DbContext
    {
        public ContentContext()
        {
        }
    }
}

In the above example, Entity Framework automatically tries to connect to a database called Notadesigner.Blog.ContentContext. This can be overridden by setting the configuration in various ways.

namespace Notadesigner.Blog
{
    public class ContentContext : DbContext
    {
        public ContentContext("Notadesigner")
        {
        }
    }
}

Other ways to override the default connection name is to add a connection string in the application configuration file and supplying it as a parameter to the DbContext constructor.

Operating Entity Framework

This post is part of a series on learning how to use Entity Framework. The rest of the posts in the series are linked below.

Basics of Entity Framework

Code First

Database First


Define the Model

Entity Framework begins be preparing an in-memory model of the entity objects and their relationships, and the mapping of this model to the storage model. A default configuration can be derived by using conventional names for entities and their properties. Programmers can achieve finer control for exceptional cases by overriding configuration settings through various mechanisms.

If the database does not exist, or the configuration directs automatic modification or recreation of the database, then the database is modified or created to match the fields in the conceptual model.

Data Creation, Retrieval, Modification & Storage

An entity is instantiated from its defining class and added to the database context through the object service. Querying to retrieve records is also done through the same layer. Any modifications to the data are performed on the entity instances, followed by a call to the SaveChanges API to commit them to the database.