Tuesday, December 21, 2010

TRANSACTION ISOLATION LEVEL


·         The transaction isolation level decides how one transaction is isolated from another.
·         Using isolation levels you can control locking behavior in SQL SERVER.
·         There are four transaction isolation levels in SQL SERVER.
·         You may only have one level set at a time.




READ COMMITTED

(Default)

-       A shared lock is held while the data is being read, meaning that no other transaction can change that data until the read completes; uncommitted changes by other transactions are never visible (no dirty reads).

-       Because the shared locks are released as soon as each statement completes, other transactions can still insert and modify data in the same table between reads, as long as it is not currently locked by the first transaction.
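A minimal sketch of the default behavior (the Accounts table and its columns are hypothetical):

```sql
-- READ COMMITTED (the default): a read blocks on rows that another
-- transaction has modified but not yet committed - no dirty reads
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
GO

BEGIN TRANSACTION
SELECT Balance FROM Accounts WHERE AccountId = 1  -- waits if the row is locked
-- the shared lock is released as soon as the SELECT completes
COMMIT TRANSACTION
```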

READ UNCOMMITTED
-       Reads issue no shared locks and do not honor exclusive locks, so uncommitted (dirty) data can be read.
-       This is the least restrictive isolation level, resulting in the best concurrency but the least data integrity.
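A minimal two-session sketch of reading dirty data at this level (the Accounts table is hypothetical):

```sql
-- Session 1: modify a row but do not commit yet
BEGIN TRANSACTION
UPDATE Accounts SET Balance = Balance - 100 WHERE AccountId = 1

-- Session 2: reads the uncommitted value - a dirty read
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
SELECT Balance FROM Accounts WHERE AccountId = 1

-- Session 1: the rollback invalidates the value Session 2 already read
ROLLBACK TRANSACTION
```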

REPEATABLE READ
-       This setting disallows dirty and non-repeatable reads.
-       However, even though shared locks are held on the data that was read, new rows can still be inserted into the table, and will subsequently be read by the transaction (phantoms).
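A sketch of the gap that remains at this level (the table name is hypothetical):

```sql
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
GO

BEGIN TRANSACTION
SELECT COUNT(*) FROM MyTable   -- shared locks held until COMMIT
-- another session: UPDATEs to the rows read above will block,
-- but an INSERT of a new row succeeds...
SELECT COUNT(*) FROM MyTable   -- ...so this count can be higher (a phantom)
COMMIT TRANSACTION
```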

SERIALIZABLE
-       This is the most restrictive setting; it holds shared locks on the range of data read by the transaction.
-       This setting does not allow the insertion of new rows in the range that is locked; therefore, no phantoms are allowed.



Syntax
 SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

Example




SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
GO

BEGIN TRANSACTION
SELECT COUNT(*) FROM MyTable

INSERT INTO MYTABLE VALUES (……)





-       This example begins a transaction that performs a SELECT.
-       Since the isolation level is set to SERIALIZABLE and we have not committed the transaction, range locks will be held on the table's rows.

-       Now, if you attempt to insert rows within the locked range from another connection, the insert will wait indefinitely.
-       The little Query Analyzer globe will spin until the original transaction commits or rolls back.

Hope this helps.           

Regards,
Arun Manglick

'Famous Four' Concurrency Problems


The 'famous four' database concurrency problems occur when more than one user attempts to:

-       Read data that another is modifying.
-       Modify data that another is reading.
-       Modify data that another transaction is trying to modify.

The famous four are as below – remember them as PLND (Planned): Phantom reads, Lost updates, Non-repeatable reads, and Dirty reads.




PHANTOM READS
-       Phantom reads occur when a transaction reads a set of rows once, but a second read returns a different set, because another transaction has inserted or deleted rows in the same table in between.

-       The phantom is the missing or new row.

For example, suppose that a banker would like to offer a low interest rate to all individuals with a balance >= $3000.

The banker runs a stored procedure containing two queries: the first retrieves the list, and the second updates the interest rates.

The first query, at 10:00 AM, returns the names of individuals with a balance of $3000 or higher, and it finds Mr. Green's account.

In the meantime, at 10:01, Mr. Green closes his account since he is about to move out of town.

The second query, at 10:02, produces an inconsistent result: Mr. Green no longer appears on the list, since he no longer holds an account with this bank.

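The banker's stored procedure could be sketched as below (schema and names are hypothetical). Under REPEATABLE READ the 10:01 change is still possible; SERIALIZABLE would block it with range locks:

```sql
BEGIN TRANSACTION
-- 10:00 AM - build the list of qualifying accounts
SELECT Name FROM Accounts WHERE Balance >= 3000
-- 10:01 AM - in another session, Mr. Green's account is deleted
-- 10:02 AM - the update no longer sees the same set of rows
UPDATE Accounts SET InterestRate = 0.02 WHERE Balance >= 3000
COMMIT TRANSACTION
```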

LOST UPDATES
-       Lost updates occur when two transactions modify the same data at the same time, and the update that completes first is overwritten and lost.
-       Reason - READ UNCOMMITTED isolation level.
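A sketch of how the first update gets lost (the Accounts table is hypothetical; the balance starts at 1000):

```sql
-- Session A:
SELECT Balance FROM Accounts WHERE AccountId = 1   -- reads 1000
-- Session B:
SELECT Balance FROM Accounts WHERE AccountId = 1   -- also reads 1000
-- Session A deposits 100, writing the value it computed:
UPDATE Accounts SET Balance = 1100 WHERE AccountId = 1
-- Session B withdraws 100, overwriting A's update:
UPDATE Accounts SET Balance = 900 WHERE AccountId = 1  -- A's deposit is lost
```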


NONREPEATABLE READS

-       Non-repeatable reads occur if a transaction is able to read the same row multiple times and gets a different value each time.
-       These occur when a transaction is reading data while a second transaction is modifying the same data, so data retrieved by the first query does not match that retrieved by the second.

-       Reason - READ UNCOMMITTED isolation level.

Same as the phantom read example above, except for a minor change:

In the meantime, at 10:01, Mr. Green withdraws $2,500.

The second query, at 10:02, produces an inconsistent result because it finds a lower balance.



DIRTY READS

-       Dirty reads are a special case of non-repeatable read.
-       Dirty reads occur while a transaction is updating a row, and a second transaction reads the row before the first transaction is committed.

-       If the original update rolls back, the data read by the second transaction never truly existed; hence a dirty read has occurred.

Suppose that I intend to withdraw $200 from ATM, and my beginning balance is $1000. I punch in my account number and PIN, and the ATM reduces my balance to $800. At this point, I check my wallet, and am pleasantly surprised that my wife hasn't spent $300 from the last paycheck. I cancel my transaction immediately because I don't feel safe walking around with $500 in my wallet. If a banker were to run a report after I started my transaction but before I cancelled it, my balance would show up as $800, even though I never did withdraw the $200.
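The ATM story maps onto SQL roughly as below (table and values are illustrative):

```sql
-- Session 1 (the ATM withdrawal): balance starts at 1000
BEGIN TRANSACTION
UPDATE Accounts SET Balance = Balance - 200 WHERE AccountId = 7  -- now 800

-- Session 2 (the banker's report) reads the uncommitted 800:
-- SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
-- SELECT Balance FROM Accounts WHERE AccountId = 7

-- Session 1: the cancellation restores the balance to 1000
ROLLBACK TRANSACTION  -- the report's figure of 800 was a dirty read
```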


Hope this helps.

Regards,
Arun Manglick

Tuesday, December 7, 2010

Architecture - Scaling - Up & Out

When you scale your application, you can choose from and combine two basic choices:

Scale up: Get a bigger box
Scale out: Get more boxes.

Scale Up: Get a Bigger Box

Scaling up is achieved by adding hardware such as Processors, RAM, and Network Interface Cards to your existing servers to support increased capacity.
This is a simple option and one that can be cost effective, because it does not introduce additional maintenance and support costs.
However, any single points of failure remain, which is a risk.

Beyond a certain threshold, adding more hardware to the existing servers may not produce the desired results.
For an application to scale up effectively, the underlying framework, runtime, and computer architecture must scale up as well.

When scaling up, consider which resources the application is bound by. If it is memory-bound or network-bound, adding CPU resources will not help.

Scale Out: Get More Boxes – Web Farm

Scaling out is achieved by adding more servers and using Load Balancing and Clustering Solutions.
A scale-out scenario also protects against hardware failures, and thus carries less risk: if one server fails, there are additional servers in the cluster that can take over the load.
However, it comes with the additional management cost associated with Web farms and clustering technology.

Scale Out – Can be achieved in both Non-Distributed or Distributed Architecture.
·         In Non-Distributed, you might host multiple Web servers in a Web farm that hosts the presentation and business layers. See the below figure.
·         In Distributed, you might physically partition your application's business logic and use a separately Load-Balanced Middle Tier along with a Load-Balanced Front Tier hosting the presentation layer.



In Non-distributed Architecture - Scaling out Web servers in a Web farm

If your application is I/O-constrained and you must support an extremely large database, you might partition your database across multiple database servers. In general, the ability of an application to scale out depends more on its architecture than on underlying infrastructure.


Scaling Strategy:

1.       Layered Design Practice
2.       Choose between Scale-Up / Scale-Out.
3.       Consider aspects like - Application Layer, Tier, Or Type of Data
4.       Consider Database Partitioning at Design Time

Layered Design Practices to Follow
A Loosely Coupled, Layered design with clean, remotable interfaces is more easily Scaled Out than tightly coupled layers with "chatty" interactions.
A Layered design will have natural clutch points, making it ideal for scaling out at the layer boundaries.
The trick is to find the right layer boundaries. For example, business logic may be more easily relocated to a load-balanced, middle-tier Application Server Farm.

Choose between Scale-Up / Scale-Out:
As mentioned above, Scaling Out comes with additional cost. Thus one should look at Scale-Up options first and conduct performance tests to see whether scaling up meets your defined scalability goals and supports the necessary number of concurrent users at an acceptable level of performance.

If scaling up your solution does not provide adequate scalability because you reach CPU, I/O, or memory thresholds, you must Scale Out and introduce additional servers.

The scaling-up-and-then-out scenario may not always be the most suitable. You need to consider aspects of scalability that may vary by application layer, tier, or type of data.
For example, scaling up and then out may not be the best approach for Web or application servers. On the other hand, it may be the right approach for your database servers, depending on the role of the data and how the data is used.

Consider aspects like - Application Layer, Tier, Or Type of Data:

·         Stateless components: A design with stateless components (for example, a Web front end with no In-Proc Session State and no Stateful business components) is well suited to both Scaling Up and Scaling Out.
·         Static, Reference & Read-Only data: This type of data can easily have many replicas in the right places, for better performance and scalability. This has minimal impact on design and can be largely driven by optimization considerations. Spreading replicas closer to the consumers of that data may be an equally valid approach. However, be aware that whenever you replicate, you will have a loosely synchronized system.
·         Dynamic Data: This is data that is relevant to a particular user or session, and if subsequent requests can come to different Web or application servers, they all need to access it. This data is slightly more complicated to handle than static, read-only data, but you can still optimize and distribute it quite easily.
The important aspect of this data is that you do not query it across partitions. For example, you ask for the contents of user A's shopping cart but do not ask to show all carts that contain a particular item.
·         Core Data: This type of data is well maintained and protected. This is the main case where the "scale up, then out" approach usually applies. Generally, you do not want to hold this type of data in many places due to the complexity of keeping it synchronized. This is the classic case in which you would typically want to scale up as far as you can (ideally, remaining a single logical instance, with proper clustering), and only when this is not enough, consider partitioning and distribution scale-out.

Consider Database Partitioning at Design Time:
If your application uses a very large database and you anticipate an I/O bottleneck, ensure that you design for database partitioning up front.
Moving to a partitioned database later usually results in a significant amount of costly rework and often a complete database redesign.

Database Partitioning provides several benefits:
·         The ability to restrict queries to a single partition, thereby limiting the resource usage to only a fraction of the data.
·         The ability to engage multiple partitions, thereby getting more parallelism and superior performance because you can have more disks working to retrieve your data.
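As a rough sketch, a range-partitioned table in SQL Server (2005 and later) might look like this; the names and boundary values are hypothetical:

```sql
-- Partition orders by year so queries can touch a single partition
CREATE PARTITION FUNCTION pfOrderYear (datetime)
AS RANGE RIGHT FOR VALUES ('2009-01-01', '2010-01-01')

CREATE PARTITION SCHEME psOrderYear
AS PARTITION pfOrderYear ALL TO ([PRIMARY])

CREATE TABLE Orders
(
    OrderId   int      NOT NULL,
    OrderDate datetime NOT NULL
) ON psOrderYear (OrderDate)

-- This query can be restricted to the 2010 partition only:
SELECT COUNT(*) FROM Orders WHERE OrderDate >= '2010-01-01'
```

In production, each partition would typically map to its own filegroup (rather than ALL TO [PRIMARY]) so that multiple disks can serve queries in parallel.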


Hope this helps.

Regards,
Arun Manglick

Architecture - Cohesion(High) & Coupling(Low)

------------------------------------------------------------------------------------------------------------

What is High/Weak Cohesion:

 

High cohesion is achieved when logically related entities, such as classes and their methods, are grouped together. Similarly, a component contains logically related classes.

This results in fewer round trips, because the classes or components are logically grouped and may end up residing in the same tiers. Thus, completing an operation requires few or no local or remote calls.

 

Weak cohesion among components tends to result in more round trips, because the classes or components are not logically grouped and may end up residing in different tiers.

This can force a mix of local and remote calls to complete a logical operation.

------------------------------------------------------------------------------------------------------------

 

Cohesion measures how many different components take advantage of shared processing and data.

Coupling is a degree of dependency (at design or run time) that exists between parts of a system.

 

Reducing Coupling and increasing Cohesion are two key principles to increasing application scalability.

The major design practice that provides these two important design factors – High Cohesion and Low Coupling – and thus increased scalability is Layered Design/Layering.

 

Using Layering to partition your application ensures that your presentation, business, and data access logic are not interspersed.

This logical separation leads to a highly cohesive design in which related classes and data are located close to each other, generally within a single boundary. This helps reduce and optimize the use of expensive local/remote calls.

 

Layering also ensures that your presentation, business, and data access logic are loosely coupled.

This logical separation leads to a Loosely Coupled Design.

If you have tight coupling and need to make changes, the changes are likely to ripple across the tightly coupled components. With loosely coupled components, changes are limited because the complexities of individual components are encapsulated from other components/layers. In addition, loose coupling provides greater flexibility to choose optimized strategies for performance and scalability for different components of your system independently.

 

Recommendations - To ensure appropriate degrees of coupling and cohesion.

 

1.       Design for Loose Coupling.

2.       Design for High Cohesion.

3.       Prefer Layered Design.

 

Design for Loose Coupling

Aim to minimize coupling within and across your application components. With tight coupling, changes are likely to ripple across the tightly coupled components.

In addition, loose coupling provides greater flexibility to choose optimized performance and scalability strategies for different components of your system independently.

 

Patterns & Principles that enable loose coupling are:

 

·         At the Business Layer, use Abstraction. The abstraction can be implemented using:

·         Public Object Interfaces,

·         Common Interface Definitions,

·         Abstract Base Classes, or

·         Messaging

·         Separate interface from implementation – By providing Facades.

 

·         For Web applications – consider message-based communication between the presentation layer and the business layer.

 

Design for High Cohesion

High cohesion is achieved when logically related entities, such as classes and their methods, are grouped together. Similarly, a component contains logically related classes.

This results in fewer round trips, because the classes or components are logically grouped and may end up residing in the same tiers. Thus, completing an operation requires few or no local or remote calls.

 

Layering: Prefer Layered Design

Using Layering to partition your application ensures that your presentation, business, and data access logic are not interspersed.

This logical separation leads to a highly cohesive design in which related classes and data are located close to each other, generally within a single boundary. This helps reduce and optimize the use of expensive local/remote calls.

 

Layering also ensures that your presentation, business, and data access logic are loosely coupled.

This logical separation leads to a Loosely Coupled Design.

 

 

Hope this helps.

 

Regards,

Arun Manglick

 

06 - Deployment Performance Patterns (Scale Out)

There are below deployment patterns

·         Performance Patterns
o   Web Farms
o   Application Farms
o   Load Balancing Cluster

Consider the use of Web farms or Load Balancing Clusters, when designing a Scale Out strategy.

·         Reliability Patterns
o   Fail-Over Cluster

·         Security Patterns
o   Impersonation/Delegation
o   Trusted Subsystem
o   Multiple Trusted Service Identities

Performance Patterns:

WebFarm & Affinity:
A Web farm is a collection of servers where each server replicates/runs the same application.
Requests from clients are distributed to the servers in the farm, so that each has approximately the same load.
Depending on the routing technology used, it may detect failed servers and remove them from the routing list to minimize the impact of a failure.
In simple scenarios, the routing may be on a "round robin" basis where a DNS server hands out the addresses of individual servers in rotation.

Figure 3 illustrates a simple Web farm where each server hosts all of the layers of the application except for the data store.


In Non-distributed Architecture - Scaling out Web servers in a Web farm


Affinity and User Sessions

Web applications often rely on the maintenance of Session State between requests from the same user.
A Web farm can be configured to route all requests from the same user to the same server – a process known as Affinity – in order to maintain state where this is stored in memory on the Web server.
However, for maximum performance and reliability, you should use a separate session state store (SQL Server or State Server) with a Web farm, to remove the requirement for affinity.

Application Farms
In a Distributed Deployment, where the Business and Data layers run on different physical tiers from the Presentation layer, an Application Farm is used to scale out the Business and Data layers.

Similar to a Web farm, an Application Farm is also a collection of servers where each server replicates/runs the same application.
Requests from clients (the presentation tier) are distributed to the servers in the farm, so that each has approximately the same load.

Load Balancing Cluster
An application or service can be installed onto multiple servers that are configured to share the workload, as shown below.
This type of configuration is a Load-Balanced Cluster.



Load balancing Scales The Performance of server-based programs, such as a Web server, by distributing client requests across multiple servers.
Load balancing technologies, commonly referred to as Load Balancers, receive incoming requests and redirect them to a specific host if necessary.
The Load-Balanced Hosts concurrently respond to different client requests, even multiple requests from the same client.

For example, a Web browser may obtain the multiple images within a single Web page from different hosts in the cluster. This distributes the load, speeds up processing, and shortens the response time to clients.


Reliability Patterns
Reliability deployment patterns represent proven design solutions to common reliability problems. The most common approach to improving the reliability of your deployment is to use a fail-over cluster to ensure the availability of your application even if a server fails.

Fail-Over Cluster
A Failover Cluster is a set of servers that are configured so that if one server becomes unavailable, another server automatically takes over for the failed server and continues processing.



Application/Service can be installed onto multiple servers that are configured to take over for one another when a failure occurs.
The process of one server taking over for a failed server is commonly known as Failover. Each server in the cluster has at least one other server in the cluster identified as its standby server.

Security Patterns
Security patterns represent proven design solutions to common security problems.

Impersonation and Delegation approach is a good solution when you must flow the context of the original caller to downstream layers or components in your application.
Trusted Subsystem approach is a good solution when you want to handle authentication and authorization in upstream components and access a downstream resource with a single trusted identity.

Impersonation/Delegation
In the impersonation/delegation authorization model, resources (such as tables and procedures in SQL Server) and the types of operation permitted on each such resource (such as read, write, and delete) are secured using Windows Access Control Lists (ACLs) or the equivalent security features of the targeted resource.
Users access the resources using their original identity through impersonation, as shown below.


The impersonation/delegation authorization model


Trusted Subsystem
In the trusted subsystem (or trusted server) model, users are partitioned into application-defined, logical roles.
Members of a particular role share the same privileges within the application.
Access to operations is authorized based on the role membership of the caller.

With this role-based (or operations-based) approach to security, access to operations (not back-end resources) is authorized based on the role membership of the caller.
Roles, analyzed and defined at application design time, are used as logical containers that group together users who share the same security privileges or capabilities within the application.
The middle tier service uses a fixed identity to access downstream services and resources, as illustrated in below Figure.




Multiple Trusted Service Identities
In some situations, you may require more than one trusted identity. For example, you may have two groups of users, one who should be authorized to perform read/write operations and the other read-only operations. The use of multiple trusted service identities provides the ability to exert more granular control over resource access and auditing, without having a large impact on scalability.


 

Reference: MS Patterns & Practices - Web Application Architecture Guide

Hope this helps.

Regards,
Arun Manglick