Data and data access technologies

In my previous post, I covered the key layers of distributed applications. This post looks at the most crucial of those layers, the data layer, and introduces the main data storage options along with the related .NET data access technologies.

Data can be stored in a wide range of data sources, such as relational databases, files on local or distributed filesystems, caching systems, cloud storage, and memory.

  • Relational databases (SQL Server): This is the traditional data source that is designed to store and retrieve data. Queries are written in languages such as T-SQL and follow the Create, Retrieve, Update, and Delete (CRUD) operations model.
  • The filesystem: The filesystem stores and retrieves unstructured data as files on the local disk. It is one of the simplest options for storing and retrieving data, but it has many functional limits and is not distributed by nature.
  • The Distributed File System (DFS): The DFS builds on the local filesystem and removes the size and other limitations imposed by local disks. In a nutshell, a DFS is a pool of networked computers that store data files.
  • NoSQL databases: NoSQL databases store data in a non-relational fashion. They are often used for large or very large volumes of data, and the biggest difference between them and relational databases is that NoSQL data stores are typically schema-free. However, data can still be organized by one or more models, such as key-value stores and document stores, among others.
  • Cloud storage: Infrastructure located in the cloud helps address many issues, such as security, reliability, resilience, and maintenance. Cloud offerings such as Microsoft Azure Storage provide many ways of storing data in different formats, which can be structured or unstructured. As with many other cloud storage offerings, Microsoft Azure Storage exposes an HTTP REST API that can be used by any application or client running on any platform that supports HTTP.
  • In-memory stores: In-memory stores are the fastest data stores, but they are limited in size, not persistent, and cumbersome to use in a distributed multi-server environment. They are used to store temporary and volatile data.
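To make the difference between a relational store and a schema-free store more concrete, here is a minimal sketch. It is written in Python rather than a .NET language purely to keep it short and self-contained. SQLite stands in for a relational database, a plain dictionary of JSON strings stands in for a document-style NoSQL store, and the `customers` table and `customer:N` keys are made up for illustration.

```python
import json
import sqlite3

# Relational store: the schema is declared up front and every row follows it.
conn = sqlite3.connect(":memory:")  # in-memory database, used only for this sketch
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
)

# CRUD: Create, Retrieve, Update, Delete.
conn.execute(
    "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)", (1, "Ann", "Oslo")
)
conn.execute("UPDATE customers SET city = ? WHERE id = ?", ("Bergen", 1))
relational_row = conn.execute(
    "SELECT name, city FROM customers WHERE id = ?", (1,)
).fetchone()
conn.execute("DELETE FROM customers WHERE id = ?", (1,))
remaining_rows = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

# Document-style key-value store: a dict stands in for a NoSQL store.
# No schema is declared, so two documents may carry different fields.
document_store = {}
document_store["customer:1"] = json.dumps({"name": "Ann", "city": "Oslo"})
document_store["customer:2"] = json.dumps({"name": "Bob", "tags": ["vip"]})

second_document = json.loads(document_store["customer:2"])
```

The relational half fails if a row does not fit the declared columns, while the document half accepts any shape and leaves it to the application to cope with missing fields.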

ADO.NET and ADO.NET Entity Framework

The .NET Framework has several data access options, and ADO.NET is the foundation of most of them. In a nutshell, ActiveX Data Objects .NET (ADO.NET) is a collection of classes that implement programming interfaces for connecting to data stores without depending on the structure, implementation, or location of a concrete data store. The challenge it poses is that developers must write complex data access code between the application and the database, which requires a good understanding of the database itself: raw tables, views, stored procedures, the database schema, table definitions, parameters, results, and so on.
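The kind of hand-written data access code described above can be sketched as follows. This is not ADO.NET: it uses Python's built-in `sqlite3` module as a rough analogue of the connection, command, and reader pattern, and the `customers` and `orders` tables and both function names are invented for the example.

```python
import sqlite3


def open_connection():
    """Create a throwaway in-memory database with two illustrative tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            total REAL NOT NULL
        );
        INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
        INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
        """
    )
    return conn


def order_totals_by_customer(conn, minimum_total):
    """Return customers whose summed order total reaches minimum_total."""
    # The caller of this function is shielded from SQL, but its author must
    # know table names, column names, and the join key by heart.
    sql = """
        SELECT c.name, SUM(o.total)
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        HAVING SUM(o.total) >= ?
        ORDER BY c.name
    """
    cursor = conn.execute(sql, (minimum_total,))  # parameterized query
    # Rows arrive as bare tuples and have to be mapped by hand.
    return [{"customer": name, "total": total} for name, total in cursor]


conn = open_connection()
totals = order_totals_by_customer(conn, 20.0)
```

Every query written this way repeats the same steps: compose the SQL, bind parameters, and translate raw rows into something the application understands.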

This is mostly solved by the object-relational mapping (ORM) approach. Programmers create a conceptual model of the data and write their data access code against that model, while an additional layer bridges the entity-relationship model and the actual data store. Entity Framework generates entities according to database tables and provides mechanisms for basic CRUD operations, for managing 1-to-1, 1-to-many, and many-to-many relationships, and for inheritance relationships between entities, among other things.
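The idea behind an ORM can be illustrated with a deliberately tiny, hand-rolled mapper. This is a sketch of the concept only and does not reflect how Entity Framework is implemented; `Customer`, `Order`, and `CustomerMapper` are hypothetical names, and Python with SQLite is used just to keep the example self-contained.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Order:
    id: int
    total: float


@dataclass
class Customer:
    id: int
    name: str
    orders: list = field(default_factory=list)  # 1-to-many relationship


class CustomerMapper:
    """Hypothetical bridge between Customer objects and the underlying tables."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
        conn.execute(
            "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
        )

    def add(self, customer):
        """Persist a customer together with its orders."""
        self.conn.execute(
            "INSERT INTO customers VALUES (?, ?)", (customer.id, customer.name)
        )
        self.conn.executemany(
            "INSERT INTO orders VALUES (?, ?, ?)",
            [(o.id, customer.id, o.total) for o in customer.orders],
        )

    def get(self, customer_id):
        """Load a customer and its orders, or return None if it does not exist."""
        row = self.conn.execute(
            "SELECT id, name FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        if row is None:
            return None
        order_rows = self.conn.execute(
            "SELECT id, total FROM orders WHERE customer_id = ? ORDER BY id",
            (customer_id,),
        )
        return Customer(row[0], row[1], [Order(*r) for r in order_rows])


mapper = CustomerMapper(sqlite3.connect(":memory:"))
mapper.add(Customer(1, "Ann", [Order(10, 25.0), Order(11, 40.0)]))
loaded = mapper.get(1)
```

Code that calls `mapper.add` and `mapper.get` deals only with objects and their relationships; the SQL lives in one place, which is the part a real ORM generates for you.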

In other words, your code works with the class model that you wrote, or that Entity Framework generated from a database, rather than with the database directly. This is achieved through a combination of XML schema files, code generation, and the ADO.NET Entity Framework APIs. The schema files define a conceptual layer that serves as a map between the data store and the application. The ADO.NET Entity Framework lets you write the application against classes that are generated from the conceptual schema, and Entity Framework then takes care of the rest.

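Another important technology that developers often use with Entity Framework is Language Integrated Query (LINQ). LINQ is a feature of the .NET languages rather than a part of Entity Framework itself: it adds data querying capabilities to the language through SQL-like query expressions, which Entity Framework can translate into queries against the data store.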
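As a rough analogy only, the snippet below shows what querying objects with language-level expressions looks like. It is Python, not LINQ, and it runs over an in-memory list rather than a database; the `Customer` class and the sample data are invented. The C# query it loosely mirrors is given in the comment.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    city: str


customers = [
    Customer("Ann", "Oslo"),
    Customer("Bob", "London"),
    Customer("Cid", "London"),
]

# Loosely comparable to the LINQ query expression:
#   from c in customers
#   where c.City == "London"
#   orderby c.Name descending
#   select c.Name
london_names = sorted(
    (c.name for c in customers if c.city == "London"), reverse=True
)
```

The filter, ordering, and projection are expressed in the programming language itself instead of in an SQL string, which is the property LINQ brings to .NET code.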

There are three approaches to working with Entity Framework in a project:

  • Database-first: This approach is used when you already have a database that is going to be used as a data source.
  • Model-first: This approach is used when you have no database. First, you draw the model in the Visual Designer and then instruct it to create the database for you with all the tables.
  • Code-first: This popular approach lets you write your model in code as classes and then instruct Entity Framework to generate the database from the objects described in that code.
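The code-first idea from the list above can be sketched in a few lines: a class is written first, and the table definition is derived from it. This is a toy illustration in Python with SQLite, not Entity Framework's mechanism. The `Product` class, the `SQL_TYPES` mapping, the `create_table_sql` helper, and the convention that a field named `id` becomes the primary key are all assumptions made for the example.

```python
import sqlite3
from dataclasses import dataclass, fields

# Assumed mapping from Python types to SQLite column types.
SQL_TYPES = {int: "INTEGER", str: "TEXT", float: "REAL"}


@dataclass
class Product:
    id: int
    name: str
    price: float


def create_table_sql(model):
    """Derive a CREATE TABLE statement from a dataclass definition."""
    columns = []
    for f in fields(model):
        column = f"{f.name} {SQL_TYPES[f.type]}"
        if f.name == "id":  # convention assumed for this sketch
            column += " PRIMARY KEY"
        columns.append(column)
    return f"CREATE TABLE {model.__name__.lower()} ({', '.join(columns)})"


ddl = create_table_sql(Product)

conn = sqlite3.connect(":memory:")
conn.execute(ddl)
# PRAGMA table_info returns one row per column; index 1 holds the column name.
column_names = [row[1] for row in conn.execute("PRAGMA table_info(product)")]
```

The model class is the single source of truth here, and the database structure follows from it, which is the direction of travel that distinguishes code-first from database-first.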

Key layers of distributed applications

Every application that is going to be used by end users should be designed appropriately, because users expect to process information from various data sources that might be geographically distributed. They also expect this information to be up to date and to reflect changes very quickly. Designing such applications is not an easy task and involves integration among different groups of components. Let's review the layers that form a typical distributed application.

The responsibilities in a distributed system can be divided into four layers:

  • The data layer
  • The business logic layer
  • The server layer
  • The user interface layer

The data layer

The data layer is responsible for storing and accessing data and for querying, updating, or deleting this data. This layer includes the data access logic and is responsible for the performance of the data store, which can be a complicated task, especially when dealing with a large volume of data distributed among different data sources.

The business logic layer

The business logic layer is responsible for the crucial part of the application: the logic that is executed between the client and the data layer. It is the "brain" that coordinates the integration between the data layer, which is used for reading and storing the data, and the user interface layer, which interacts with the client.

The server layer

The server layer is sometimes called a services layer, and that is an accurate term as well. The server layer is responsible for exposing some of the capabilities of the application that can be consumed by other services and used as a data source, for example. This layer works as the interface between our application and the world of other services, which is different from that of the end users.

The server layer is an extremely important part of every distributed application. Its design can affect the overall performance of the system, because it defines how the parts of the application collaborate and how load and data are distributed. It also contains the security mechanisms that validate requests.

The user interface layer

The user interface layer is the layer that clients use to interact with the application. It must contain only the part of the system that is responsible for rendering the interface, which consists of the data, the user interface components, and anything else that matters in the interaction between the user and the application.

This layer also contains the logic for adapting the user interface to different form factors, people, cultures, input methods (such as touchscreens), screen sizes, and resolutions. At the same time, it must be simple and effective and must provide a smooth user experience.

Proper user interface design is important. If the user interface is not friendly, if the experience is not smooth, or if the user does not understand how the system works and how it should be used, the application will not be used.
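To tie the four layers together, here is a minimal sketch of how the responsibilities might be separated in code. It is a single-process toy written in Python, so nothing in it is actually distributed, and every class, function, and field name (`DataLayer`, `BusinessLogicLayer`, `ServiceLayer`, `render_item`, `api_key`) is hypothetical.

```python
class DataLayer:
    """Stores and retrieves data; a dict stands in for a real data source."""

    def __init__(self):
        self._items = {}

    def save(self, item_id, name):
        self._items[item_id] = name

    def find(self, item_id):
        return self._items.get(item_id)


class BusinessLogicLayer:
    """Applies the application's rules between the client and the data layer."""

    def __init__(self, data):
        self._data = data

    def register_item(self, item_id, name):
        cleaned = name.strip()
        if not cleaned:
            raise ValueError("item name must not be empty")
        self._data.save(item_id, cleaned)

    def describe_item(self, item_id):
        name = self._data.find(item_id)
        return None if name is None else {"id": item_id, "name": name}


class ServiceLayer:
    """Exposes capabilities to other services and validates their requests."""

    def __init__(self, logic, api_key):
        self._logic = logic
        self._api_key = api_key

    def handle_get_item(self, request):
        if request.get("api_key") != self._api_key:
            return {"status": 401}
        item = self._logic.describe_item(request["item_id"])
        if item is None:
            return {"status": 404}
        return {"status": 200, "body": item}


def render_item(item):
    """User interface layer: turns data into something a person can read."""
    return f"Item #{item['id']}: {item['name']}"


logic = BusinessLogicLayer(DataLayer())
logic.register_item(1, "  Keyboard ")
service = ServiceLayer(logic, api_key="secret")

ok = service.handle_get_item({"api_key": "secret", "item_id": 1})
denied = service.handle_get_item({"api_key": "wrong", "item_id": 1})
screen_text = render_item(logic.describe_item(1))
```

Each layer talks only to the one beneath it, so the storage mechanism, the business rules, the service contract, and the presentation can each change without rewriting the others.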