HMAC authentication in ASP.NET Web API

In this article I will explain the concepts behind HMAC authentication and will show how to write an example implementation for ASP.NET Web API using message handlers. The project will include both the server-side and the client-side (using Web API’s HttpClient) pieces.

HMAC based authentication

HMAC (hash-based message authentication code) provides a relatively simple way to authenticate HTTP messages using a secret that is known to both client and server. Unlike basic authentication it does not require transport level encryption (HTTPS), which makes it an appealing choice in certain scenarios. Moreover, it guarantees message integrity (prevents malicious third parties from modifying the contents of the message).

On the other hand, a proper HMAC authentication implementation requires slightly more work than basic HTTP authentication, and not all client platforms support it out of the box (most of them do provide the cryptographic algorithms required to implement it, though). My suggestion would be to use it only if HTTPS + basic authentication does not suit your requirements.

One prominent example of HMAC usage is Amazon S3 service.

The basic idea behind HMAC authentication in HTTP can be described as follows:

  • both client and server have access to a secret that will be used to generate the HMAC – it can be a password (or preferably a password hash) created by the user at the time of registration,
  • using the secret, the client generates a message signature with an HMAC algorithm (the algorithm is provided by .NET ‘for free’),
  • the signature is attached to the message (eg. as a header) and the message is sent,
  • the server receives the message and calculates its own version of the signature using the secret (both client and server use the same HMAC algorithm),
  • if the signature computed by the server matches the one on the message, the message is considered authentic.

As you can see, the secret key (eg. password hash) is only shared between client and server once (eg. during user registration). No one will be able to produce a valid signature without access to the secret; also, any modification of the message (eg. appending content) will result in the server calculating a different signature and refusing authorization.

Broadly speaking, to create an HMAC-authenticated client/server pair using ASP.NET Web API we need:

  • a method that returns a string representation of a given HTTP request,
  • a method that calculates the HMAC signature based on the secret string and the message representation,
  • on the client side, a message handler that uses these methods to calculate the signature and attaches it to the request (as an HTTP header),
  • on the server side, a message handler that calculates the signature of the incoming request and compares it with the one contained in the header.

Web API client

Ok, so let’s start by writing the first piece.

public interface IBuildMessageRepresentation  
{
    string BuildRequestRepresentation(HttpRequestMessage requestMessage);
}
public class CanonicalRepresentationBuilder : IBuildMessageRepresentation  
{
    /// <summary>
    /// Builds message representation as follows:
    /// HTTP METHOD\n +
    /// Content-MD5\n +  
    /// Timestamp\n +
    /// Username\n +
    /// Request URI
    /// </summary>
    /// <returns></returns>
    public string BuildRequestRepresentation(HttpRequestMessage requestMessage)
    {
        bool valid = IsRequestValid(requestMessage);
        if (!valid)
        {
            return null;
        }

        if (!requestMessage.Headers.Date.HasValue)
        {
            return null;
        }
        DateTime date = requestMessage.Headers.Date.Value.UtcDateTime;

        string md5 = requestMessage.Content == null ||
            requestMessage.Content.Headers.ContentMD5 == null ?  "" 
            : Convert.ToBase64String(requestMessage.Content.Headers.ContentMD5);

        string httpMethod = requestMessage.Method.Method;
        //string contentType = requestMessage.Content.Headers.ContentType.MediaType;
        if (!requestMessage.Headers.Contains(Configuration.UsernameHeader))
        {
            return null;
        }
        string username = requestMessage.Headers
            .GetValues(Configuration.UsernameHeader).First();
        string uri = requestMessage.RequestUri.AbsolutePath.ToLower();
        // you may need to add more headers if that's required for security reasons
        string representation = String.Join("\n", httpMethod,
            md5, date.ToString(CultureInfo.InvariantCulture),
            username, uri);

        return representation;
    }

    private bool IsRequestValid(HttpRequestMessage requestMessage)
    {
        //for simplicity I am omitting headers check (all required headers should be present)

        return true;
    }
}

A couple of points worth mentioning:

  • we construct the message representation by concatenating the ‘important’ headers, the HTTP method and the URI,
  • instead of incorporating the content itself we use its MD5 hash (base64 encoded),
  • all parts of the message (eg. headers) that can affect its meaning and have side effects on the server side should be included in the representation (otherwise an attacker would be able to modify them without changing the signature).

Now let’s look at the component that will calculate the authentication code (signature).

public interface ICalculateSignature
{
    string Signature(string secret, string value);
}
public class HmacSignatureCalculator : ICalculateSignature
{
    public string Signature(string secret, string value)
    {
        var secretBytes = Encoding.UTF8.GetBytes(secret);
        var valueBytes = Encoding.UTF8.GetBytes(value);
        string signature;

        using (var hmac = new HMACSHA256(secretBytes))
        {
            var hash = hmac.ComputeHash(valueBytes);
            signature = Convert.ToBase64String(hash);
        }
        return signature;
    }
}

The signature will be encoded using base64 so that we can pass it easily in a header. What header, you may ask? Well, unfortunately there is no standard way of including message authentication codes in a message (just as there is no standard way of constructing the message representation). We will use the Authorization HTTP header for that purpose, providing a custom scheme (ApiAuth).

Authorization: ApiAuth HMAC_SIGNATURE

The HMAC will be calculated and attached to the request in a custom message handler.

public class HmacSigningHandler : HttpClientHandler  
{
    private readonly ISecretRepository _secretRepository;
    private readonly IBuildMessageRepresentation _representationBuilder;
    private readonly ICalculateSignature _signatureCalculator;

    public string Username { get; set; }

    public HmacSigningHandler(ISecretRepository secretRepository,
                          IBuildMessageRepresentation representationBuilder,
                          ICalculateSignature signatureCalculator)
    {
        _secretRepository = secretRepository;
        _representationBuilder = representationBuilder;
        _signatureCalculator = signatureCalculator;
    }

    protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request,
                                 System.Threading.CancellationToken cancellationToken)
    {
        if (!request.Headers.Contains(Configuration.UsernameHeader))
        {
            request.Headers.Add(Configuration.UsernameHeader, Username);
        }
        // DateTimeOffset requires the offset to be in whole minutes; computing it
        // from two separate clock reads (Now - UtcNow) can throw, so use UtcNow directly
        request.Headers.Date = DateTimeOffset.UtcNow;
        var representation = _representationBuilder.BuildRequestRepresentation(request);
        var secret = _secretRepository.GetSecretForUser(Username);
        string signature = _signatureCalculator.Signature(secret,
            representation);

        var header = new AuthenticationHeaderValue(Configuration.AuthenticationScheme, signature);

        request.Headers.Authorization = header;
        return base.SendAsync(request, cancellationToken);
    }
}
public class Configuration  
{
    public const string UsernameHeader = "X-ApiAuth-Username";
    public const string AuthenticationScheme = "ApiAuth";
    // validity window (in minutes), used later in the article for replay attack prevention
    public const int ValidityPeriodInMinutes = 5;
}
public class DummySecretRepository : ISecretRepository  
{
    private readonly IDictionary<string, string> _userPasswords
        = new Dictionary<string, string>()
              {
                  {"username","password"}
              };

    public string GetSecretForUser(string username)
    {
        if (!_userPasswords.ContainsKey(username))
        {
            return null;
        }

        var userPassword = _userPasswords[username];
        var hashed = ComputeHash(userPassword, new SHA1CryptoServiceProvider());
        return hashed;
    }

    private string ComputeHash(string inputData, HashAlgorithm algorithm)
    {
        byte[] inputBytes = Encoding.UTF8.GetBytes(inputData);
        byte[] hashed = algorithm.ComputeHash(inputBytes);
        return Convert.ToBase64String(hashed);
    }
}

public interface ISecretRepository  
{
    string GetSecretForUser(string username);
}

In a real-life scenario you could retrieve the hashed password from a persistent store (a database). If you remember how we constructed our message representation, you will notice that we also need to set the Content-MD5 header. We could do it in HmacSigningHandler, but to keep a separation of concerns, and because Web API allows us to combine handlers in a neat way, I moved it to a separate (dedicated) handler.

public class RequestContentMd5Handler : DelegatingHandler  
{
    protected async override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request,
                                       System.Threading.CancellationToken cancellationToken)
    {
        if (request.Content == null)
        {
            return await base.SendAsync(request, cancellationToken);
        }

        byte[] content = await request.Content.ReadAsByteArrayAsync();
        using (MD5 md5 = MD5.Create())
        {
            request.Content.Headers.ContentMD5 = md5.ComputeHash(content);
        }
        var response = await base.SendAsync(request, cancellationToken);
        return response;
    }
}

For simplicity the HMAC handler derives directly from HttpClientHandler. Here is how we would make a request:

static void Main(string[] args)  
{
    var signingHandler = new HmacSigningHandler(new DummySecretRepository(),
                                            new CanonicalRepresentationBuilder(),
                                            new HmacSignatureCalculator());
    signingHandler.Username = "username";

    var client = new HttpClient(new RequestContentMd5Handler()
    {
        InnerHandler = signingHandler
    });
    client.PostAsJsonAsync("http://localhost:48564/api/values","some content").Wait();
}

And that’s basically it as far as the HTTP client is concerned. Let’s have a look at the server part.

Web API service

The general logic is that we want to authenticate every incoming request (we can use per-route handlers to secure only one route, for example). Each request’s authentication code will be calculated using the very same IBuildMessageRepresentation and ICalculateSignature implementations. If the signature does not match (or the content MD5 hash is different from the value in the header) we will immediately return a 401 response.

public class HmacAuthenticationHandler : DelegatingHandler  
{
    private const string UnauthorizedMessage = "Unauthorized request";

    private readonly ISecretRepository _secretRepository;
    private readonly IBuildMessageRepresentation _representationBuilder;
    private readonly ICalculateSignature _signatureCalculator;

    public HmacAuthenticationHandler(ISecretRepository secretRepository,
        IBuildMessageRepresentation representationBuilder,
        ICalculateSignature signatureCalculator)
    {
        _secretRepository = secretRepository;
        _representationBuilder = representationBuilder;
        _signatureCalculator = signatureCalculator;
    }

    protected async Task<bool> IsAuthenticated(HttpRequestMessage requestMessage)
    {
        if (!requestMessage.Headers.Contains(Configuration.UsernameHeader))
        {
            return false;
        }

        if (requestMessage.Headers.Authorization == null 
            || requestMessage.Headers.Authorization.Scheme 
                    != Configuration.AuthenticationScheme)
        {
            return false;
        }

        string username = requestMessage.Headers.GetValues(Configuration.UsernameHeader)
                                .First();
        var secret = _secretRepository.GetSecretForUser(username);
        if (secret == null)
        {
            return false;
        }

        var representation = _representationBuilder.BuildRequestRepresentation(requestMessage);
        if (representation == null)
        {
            return false;
        }

        if (requestMessage.Content != null
            && requestMessage.Content.Headers.ContentMD5 != null
            && !await IsMd5Valid(requestMessage))
        {
            return false;
        }

        var signature = _signatureCalculator.Signature(secret, representation);        

        var result = requestMessage.Headers.Authorization.Parameter == signature;

        return result;
    }

    protected async override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request,
           System.Threading.CancellationToken cancellationToken)
    {
        var isAuthenticated = await IsAuthenticated(request);

        if (!isAuthenticated)
        {
            var response = request
                .CreateErrorResponse(HttpStatusCode.Unauthorized, UnauthorizedMessage);
            response.Headers.WwwAuthenticate.Add(new AuthenticationHeaderValue(
                Configuration.AuthenticationScheme));
            return response;
        }
        return await base.SendAsync(request, cancellationToken);
    }
}

The bulk of the work is done by the IsAuthenticated() method. Also, please note that we do not sign the response, meaning the client will not be able to verify the authenticity of the response (although response signing would be easy to add given the components we already have). I have omitted the IsMd5Valid() method for brevity; it basically compares the content hash with the Content-MD5 header value (just remember not to compare byte[] arrays using the == operator).
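
For reference, here is one possible shape of that omitted method. This is just a minimal sketch (it assumes the whole request body can be buffered and re-hashed on the server, mirroring what RequestContentMd5Handler does on the client); the original implementation may differ:

private async Task<bool> IsMd5Valid(HttpRequestMessage requestMessage)
{
    // recompute the MD5 hash of the request body
    byte[] content = await requestMessage.Content.ReadAsByteArrayAsync();
    byte[] computedHash;
    using (var md5 = MD5.Create())
    {
        computedHash = md5.ComputeHash(content);
    }

    // compare it element by element with the Content-MD5 header value
    // (== on byte[] would only compare references)
    byte[] receivedHash = requestMessage.Content.Headers.ContentMD5;
    return receivedHash != null && computedHash.SequenceEqual(receivedHash);
}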

The configuration part is simple and can look like this (per-route handler):

config.Routes.MapHttpRoute(  
                name: "DefaultApi",
                routeTemplate: "api/{controller}/{id}",
                constraints: null,
                handler: new HmacAuthenticationHandler(new DummySecretRepository(),
                    new CanonicalRepresentationBuilder(), new HmacSignatureCalculator())
                    {
                        InnerHandler = new HttpControllerDispatcher(config)
                    },
                defaults: new { id = RouteParameter.Optional }
            );

Replay attack prevention

There is one very important flaw in the current approach. Imagine a malicious third party intercepts a valid (properly authenticated) HTTP request coming from a legitimate client (eg. using a sniffer). Such a message can be stored and resent to our server at any time, enabling the attacker to repeat operations performed previously by authenticated users. Please note that new messages still cannot be created, as the attacker does not know the secret nor has a way of retrieving it from the intercepted data.

To help us fix this issue, let’s make the following three observations/assumptions about the dates of requests in our system:

  • requests with different Date header values will have different signatures, thus an attacker will not be able to modify the timestamp,
  • we assume identical, consecutive messages coming from a user will always have different timestamps – in other words that no client will want to send two or more identical messages at a given point in time,
  • we introduce a requirement that no http request can be older than X (eg. 5) minutes – if for any reason the message is delayed for more than that it will have to be resent with a refreshed timestamp.

Once we know the above, we can introduce the following changes into the IsAuthenticated() method:

protected async Task<bool> IsAuthenticated(HttpRequestMessage requestMessage)  
{
    //(...)
    var isDateValid = IsDateValid(requestMessage);
    if (!isDateValid)
    {
        return false;
    }
    //(...)

    //disallow duplicate messages being sent within validity window (5 mins)
    if(MemoryCache.Default.Contains(signature))
    {
        return false;
    }

    var result = requestMessage.Headers.Authorization.Parameter == signature;
    if (result == true)
    {
        MemoryCache.Default.Add(signature, username,
                DateTimeOffset.UtcNow.AddMinutes(Configuration.ValidityPeriodInMinutes));
    }
    return result;
}

private bool IsDateValid(HttpRequestMessage requestMessage)  
{
    var utcNow = DateTime.UtcNow;
    var date = requestMessage.Headers.Date.Value.UtcDateTime;
    if (date >= utcNow.AddMinutes(Configuration.ValidityPeriodInMinutes)
        || date <= utcNow.AddMinutes(-Configuration.ValidityPeriodInMinutes))
    {
        return false;
    }
    return true;
}

For simplicity I didn’t test the example for server and client residing in different timezones (although as long as we normalize the dates to UTC we should be safe here).

The Database Timeline

1961 Development begins on Integrated Data Store, or IDS, at General Electric. IDS is generally considered the first “proper” database. It was doing NoSQL and Big Data decades before today’s NoSQL databases.

1967 IBM develops Information Control System and Data Language/Interface (ICS/DL/I), a hierarchical database for the Apollo program. ICS later became Information Management System (IMS), which was included with IBM’s System/360 mainframes.

1970 IBM researcher Edgar Codd publishes his paper A Relational Model of Data for Large Shared Data Banks, establishing the mathematics used by relational databases.

1973 David R. Woolley develops PLATO Notes, which would later influence the creation of Lotus Notes.

1974 Development begins at IBM on System R, an implementation of Codd’s relational model and the first use of the structured query language (SQL). This later evolves into the commercial product IBM DB2. Inspired by Codd’s research, Berkeley researchers Michael Stonebraker and Eugene Wong begin development on INGRES, which became the basis for PostgreSQL, Sybase, and many other relational databases.

1979 The first publicly available version of Oracle is released.

1984 Ray Ozzie founds Iris Associates to create a PLATO-Notes-inspired groupware system.

1988 Lotus Agenda, powered by a document database, is released.

1989 Lotus Notes is released.

1990 Objectivity, Inc. releases its flagship object database.

1991 The key-value store Berkeley DB is developed.

2003 LiveJournal open sources the original version of Memcached.

2005 Damien Katz open sources CouchDB.

2006 Google publishes BigTable paper.

2007 Amazon publishes its Dynamo paper. 10gen starts coding MongoDB. Powerset open sources its BigTable clone, HBase. Neo4j is released.

2008 Facebook open sources Cassandra.

2009 ReadWriteWeb asks: “Is the relational database doomed?” Redis released. First NoSQL meetup in San Francisco.

2010 Some of the leaders of the Memcached project, along with Zynga, open source Membase.

Service Performance Optimization Techniques for .NET – Part I

Abstract: Tuning service runtime performance will improve the utilization of individual services as well as the performance of service compositions that aggregate these services. Even though it is important to optimize every service architecture, agnostic services in particular need to be carefully tuned to maximize their potential for reuse and recomposition.

Because the logic within a service comprises the collective logic of its service capabilities, we need to begin by focusing on performance optimization at the service capability level.

In this article we will explore several approaches for reducing the duration of service capability processing. The upcoming techniques specifically focus on avoiding redundant processing, minimizing idle time, minimizing concurrent access to shared resources, and optimizing the data transfer between service capabilities and service consumers.

Caching to Avoid Costly Processing

Let’s first look at the elimination of unnecessary processing inside a service capability.

Specifically what we’ll be focusing on is:

  • avoidance of repeating calculations if the result doesn’t change
  • avoidance of costly database access if the data doesn’t change
  • developing a better performing implementation of capability logic
  • delegating costly capability logic to specialized hardware solutions
  • avoidance of costly XML transformations by designing service contracts with canonical schemas

A common means of reducing the quantity of processing is to avoid repeated execution of the same capability logic through caching. Instead of executing the same capability twice, you simply store the results of the capability the first time and return the stored results the next time they are requested. Figure 1 shows a flow chart that illustrates a simple caching solution.


Figure 1 – Caching the results of expensive business process activities can significantly improve performance.

For example, it doesn’t make sense to retrieve data from a database more than once if the data is known to not change (or at least known not to change frequently). Reading data from a database requires communication between the service logic and the database. In many cases it even requires communication over a network connection.

There is a lot of overhead just in setting up this type of communication and then there’s the effort of assembling the results of the query in the database. You can avoid all of this processing by avoiding database calls after the initial retrieval of the results. If the results change over time, you can still improve average performance by re-reading every 100 requests (or however often).
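
As a rough sketch of that idea (the ProductCatalog, GetProducts and LoadProductsFromDatabase names are made up for illustration and are not part of any particular framework), a capability could keep a counter and only hit the database again after a fixed number of requests:

public class ProductCatalog
{
    private const int RefreshEveryNRequests = 100;
    private static readonly object SyncRoot = new object();
    private static IList<string> _cachedProducts;
    private static int _requestsSinceRefresh;

    public IList<string> GetProducts()
    {
        lock (SyncRoot)
        {
            // go back to the database on the first call and then again after
            // every 100 requests; all other calls are served from memory
            if (_cachedProducts == null || ++_requestsSinceRefresh >= RefreshEveryNRequests)
            {
                _cachedProducts = LoadProductsFromDatabase();
                _requestsSinceRefresh = 0;
            }
            return _cachedProducts;
        }
    }

    private IList<string> LoadProductsFromDatabase()
    {
        // placeholder for the actual (expensive) database query
        return new List<string> { "sample product" };
    }
}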

Caching can also be effective for expensive computations, data transformations or service invocations as long as:

  • results for a given input do not change or at least do not change frequently
  • delays in visibility of different results are acceptable
  • the number of computation results or database queries is limited
  • the same results are requested frequently
  • a local cache can be accessed faster than a remotely located database
  • computation of the cache key is not more expensive than computing the output
  • increased memory requirements due to large caches do not increase paging to disk (which slows down the overall throughput)

If your service capability meets these criteria, you can remove several blocks from the performance model and replace them with cache access, as shown in Figure 2.


Figure 2 – The business logic, resource access, and message transformation blocks are removed.

To build a caching solution you can:

  • explicitly implement caching in the code of the service
  • intercept incoming messages before the capability logic is invoked
  • centralize the caching logic into a utility caching service

Each solution has its own strengths and weaknesses. For example, explicitly implementing caching logic inside of a service capability allows you to custom-tailor this logic to that particular capability. In this case you can be selective about the cache expiration and refresh algorithms or which parameters make up the cache key. This approach can also be quite labor intensive.
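
To make the first option concrete, a capability that explicitly caches its own results (here with System.Runtime.Caching’s MemoryCache) might look roughly like the sketch below; the OrderService name, the cache key format and the five-minute expiration are illustrative assumptions, not prescriptions from the original text:

public class OrderService
{
    public string GetOrder(int orderId)
    {
        // the capability itself decides which parameters make up the cache key...
        string cacheKey = "order:" + orderId;
        var cached = MemoryCache.Default.Get(cacheKey) as string;
        if (cached != null)
        {
            return cached;
        }

        // ...and which expiration policy applies to the cached result
        string order = LoadOrderFromDatabase(orderId);
        MemoryCache.Default.Add(cacheKey, order, DateTimeOffset.UtcNow.AddMinutes(5));
        return order;
    }

    private string LoadOrderFromDatabase(int orderId)
    {
        // placeholder for the costly database access being avoided
        return "order " + orderId;
    }
}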

Intercepting messages, on the other hand, can be an efficient solution because messages for more than one service capability can be intercepted, potentially without changing the service implementation at all.

You can intercept messages in several different places:

Intermediary

An intermediary between the service and the consumer can transparently intercept messages, inspect them to compute a cache key for the parameters, and then only forward messages to the destination service if no response for the request parameters is present in the cache (Figure 3). This approach relies on the application of Service Agent [SDP].


Figure 3 – Passive intermediaries can cache responses without requiring modifications to the service or the consumer.

Service Container

This is a variation of the previous technique, but here the cache lives inside the same container as the service to avoid introducing a scalability bottleneck with the intermediary (Figure 4). Service frameworks, such as ASMX and WCF, allow for the interception of messages with an HTTP Module or a custom channel.


Figure 4 – Message interception inside the service container enables caching to occur outside the service implementation without involving an intermediary.

Service Proxy

With WCF we can build consumer-side custom channels that can make the caching logic transparent to service consumers and services. Figure 5 illustrates how the cache acts as a service proxy on the consumer side before sending the request to the service. Note that with this approach you will only realize significant performance benefits if the same consumer frequently requests the same data.


Figure 5 – Message interception by a service proxy inside the service consumer introduces caching logic that avoids unnecessary network communication.

Caching Utility Service

An autonomous utility service can be used to provide reusable caching logic, as per the Stateful Service pattern. For this technique to work, the performance savings of the caching logic need to outweigh the performance impact introduced by the extra utility service invocation and communication. This approach can be justified if autonomy and vendor neutrality are high design priorities.


Figure 6 – A utility service is explicitly invoked to handle caching.

Comparing Caching Techniques

Each option has its own trade-offs between potential performance increases and additional overhead. Table 1 provides a summary.


Table 1 – The pros and cons of different service caching architectures.

Cache Implementation Technologies

When you decide on a caching architecture, keep in mind that server-side message interception can still impact performance, because your service will need to compute a cache key, and if it ends up with an oversized cache, the cache itself can actually decrease performance (especially if multiple services run on a shared server).

The higher memory requirements of a service that caches data can lead to increased paging activity on the server as a whole. Modern 64 bit servers equipped with terabytes of memory can reduce the amount of paging activity and thus avoid any associated performance reduction. Hardware-assisted virtualization further enables you to partition hardware resources and isolate services running on the same physical hardware from each other.

You can also leverage existing libraries such as the System.Web.Caching namespace for Web applications. Solutions like System.Runtime.Caching or the Caching Application Block from the Enterprise Library are available for all .NET-based services. These libraries include some more specialized caching features, such as item expiration and cache scavenging. REST services hosted within WCF can leverage ASP.NET caching profiles for output caching and controlling caching headers.

Furthermore, a distributed caching extension is provided with Windows Server AppFabric that offers a distributed, in-memory cache for high performance requirements associated with large-scale service processing. This extension in particular addresses the following problems of distributed and partitioned caching:

  • storing cached data in memory across multiple servers to avoid costly database queries
  • synchronizing cache content across multiple caching nodes for low latency, high scale and high availability
  • caching partitions for fast look-ups and load balancing across multiple caching servers
  • local in-memory caching of cache subsets within services to reduce look-up times beyond the savings realized by optimizations on the caching tier

You also have several options for implementing the message interceptor. ASMX and WCF both offer extensibility points to intercept message processing before the service implementation is invoked. WCF even offers the same extensibility on the service consumer side. Table 2 lists the technology options for these caching architectures.


Table 2 – Technology choices for implementing caching architectures.

Computing Cache Keys

Let’s take a closer look at the moving parts that comprise a typical caching solution. First, we need to compute the cache key from the request message to check if we already have a matching response in the cache. Computing a generic key before the message has been deserialized is straightforward when:

  • the document format does not vary (for example, there is no optional content)
  • the messages are XML element-centric and don’t contain data in XML attributes or mixed mode content
  • the code is already working with XML documents (for example, as with XmlDocument, XmlNode or XPathNavigator objects)
  • the message design only passes reference data (not fully populated business documents)
  • the services expose RESTful endpoints where URL parameters or partial URLs contain all reference data

In these situations, you can implement a simple, generic cache key algorithm. For example, you can load the request into an XmlDocument object and get the request data by examining the InnerText property of the document’s root node. The danger here is that you could wind up with a very long and comprehensive cache key if your request message contains many data elements.
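
As a minimal sketch of such a generic algorithm (hashing the concatenated element values to keep the key short is my own addition, not something the text above requires):

public static string ComputeGenericCacheKey(string requestXml)
{
    // load the raw request message and concatenate all element values
    var document = new XmlDocument();
    document.LoadXml(requestXml);
    string requestData = document.DocumentElement.InnerText;

    // hash the (potentially very long) concatenated data so that the
    // cache key stays short regardless of the number of data elements
    using (var md5 = MD5.Create())
    {
        byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(requestData));
        return Convert.ToBase64String(hash);
    }
}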

Computing a message type-specific cache key requires much more coding work and you may have to embed code for each message type. For server-side caching with ASMX Web services, for example, you would add an HTTP Module to the request processing pipeline for the service call. Inside the custom module, you can then inspect the data items in the XML message content that uniquely identify a service request and possibly bypass the service call.

For client-side caching with ASMX on the other hand, there is no transparent approach for adding caching logic. Custom proxy classes would have to perform all the caching-related processing. Depending on requirements and the number of service consumers, it might be easier to implement caching logic in the service consumer’s code or switch to WCF for adding caching logic transparently.

For WCF-based services, you would define a custom binding with a custom caching channel as part of the channel stack for either the service or the consumer. A custom channel provides access to the WCF Message object, which is oftentimes more convenient than programming against the raw XML message.

Caching REST Responses

The HTTP protocol defines content caching behavior on the server, on intermediaries, and on the client. This native form of content caching was originally responsible for driving widespread support in Web frameworks like ASP.NET.

WCF adds support for ASP.NET caching profiles. These profiles control the caching behavior on the server as well as sending HTTP headers to control caching on intermediaries and the client.

You configure a WCF REST service for ASP.NET caching with a combination of attributes and configuration file settings. You can begin by attributing the service contract with the AspNetCacheProfile attribute. The attribute is only valid for GET requests, which matches how REST uses GET as the preferred verb for read operations.

[ServiceContract]
public interface ICatalogService
{
    [OperationContract]
    [WebGet(UriTemplate = "/param/{itemId}")]
    [AspNetCacheProfile("CacheFor20SecondsServer")]
    string GetCatalogItem(string itemId);
}

Example 1

The attribute references a named profile stored in the service’s configuration file. The service implementation class also needs an attribute to connect the service into the ASP.NET processing pipeline.

[AspNetCompatibilityRequirements(
    RequirementsMode = AspNetCompatibilityRequirementsMode.Required)]
public class CatalogService : ICatalogService
{
}

Example 2

The configuration file further needs to set up the service host for ASP.NET caching, by adding the aspNetCompatibilityEnabled attribute:

<system.serviceModel>
  <serviceHostingEnvironment aspNetCompatibilityEnabled="true" />
  <services>
    <service name="Service">
      <endpoint address="" binding="webHttpBinding"
                contract="IService" />
    </service>
  </services>
  <bindings>
    <webHttpBinding />
  </bindings>
</system.serviceModel>

Example 3

Note that this configuration is at the host level and therefore enables ASP.NET for all services under this host. This could change behavior and performance for other services that don’t require capabilities of ASP.NET. You should evaluate carefully if RESTful services and SOAP-based services should run in the same hosting environment.

The caching profile is also stored in the configuration file:

<system.web>
  <caching>
    <outputCacheSettings>
      <outputCacheProfiles>
        <add name="CacheFor20SecondsServer"
             duration="20" enabled="true"
             location="Server" varyByParam="itemId" />
      </outputCacheProfiles>
    </outputCacheSettings>
  </caching>
</system.web>

Example 4

The profile’s location attribute indicates where the response can be cached. The preceding example configures server-side caching only, but other values are available to allow clients to cache responses as well. Client-side caching offers higher scalability and better performance because it doesn’t increase the service’s memory footprint and avoids unnecessary network calls.

If your architecture allows for response caching, caching should be enabled along the transmission chain; be aware, however, that not all consumers may be built to support HTTP caching headers. WCF consumers, for example, ignore the caching attributes and repeat network calls even when HTTP Cache-Control headers indicate client-side caching is allowed.

If you consume cacheable data, you may invoke services with the System.Net.WebClient or System.Net.HttpWebRequest classes to optimize for performance.
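
For example, an HttpWebRequest-based consumer can opt into HTTP caching through its CachePolicy property; the URL below is a hypothetical placeholder:

// HttpRequestCachePolicy and HttpRequestCacheLevel live in System.Net.Cache
var request = (HttpWebRequest)WebRequest.Create("http://localhost:8080/catalog/param/42");
// reuse a locally cached copy if one is still fresh according to the
// caching headers returned by the service
request.CachePolicy = new HttpRequestCachePolicy(HttpRequestCacheLevel.CacheIfAvailable);

using (var response = (HttpWebResponse)request.GetResponse())
{
    Console.WriteLine("Served from cache: {0}", response.IsFromCache);
}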

Monitoring Cache Efficiency

After you have created a suitable caching architecture, it’s important that you monitor the cache for efficiency. System.Web.Caching and the Caching Application Block both include numerous performance counters to monitor efficiency metrics, such as the cache hit ratio, the number of cache misses, etc.

The cache hit ratio measures the number of times a cached response was returned divided by the total number of requests. If you notice that your cache hit ratio is low, or your number of misses is growing, then your cache criteria may not match the data requested by consumers, or the cached items may be expiring too quickly. In that case your caching is probably adding more overhead than it is improving the performance of your service.
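
If you implement caching yourself rather than through one of these libraries, you can track the same metrics with a couple of counters; here is a minimal sketch (the class and member names are my own, not part of any library):

public static class CacheStatistics
{
    private static long _hits;
    private static long _misses;

    public static void RecordHit() { Interlocked.Increment(ref _hits); }
    public static void RecordMiss() { Interlocked.Increment(ref _misses); }

    // cache hit ratio = cached responses returned / total requests
    public static double HitRatio
    {
        get
        {
            long hits = Interlocked.Read(ref _hits);
            long total = hits + Interlocked.Read(ref _misses);
            return total == 0 ? 0.0 : (double)hits / total;
        }
    }
}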

Reducing Resource Contention

By decreasing resource contention we can further improve performance, minimizing the time a service capability spends waiting for access to shared resources.

Shared resources in this context can be:

  • CPU time
  • memory
  • files
  • service container threads
  • single-threaded code sections
  • database connections
  • databases (locks, rows, tables)

Several of these may exist as shared resources that can be accessed concurrently, whereas others may be limited to one executing thread.

It’s important to understand that even resources that can be accessed concurrently, such as system memory, are not isolated from other programs. Physical memory allocated on behalf of one Web service on a server impacts all other processes on that server because it’s not available to other processes. Therefore, allocating large portions of available system memory to one service can actually reduce performance for all other services on that server.

The memory required by the other services may only be available as virtual memory, which means increased paging activity will reduce performance. Each time a service tries to access a page that is not currently loaded, the operating system has to load the page from disk. That is a slow operation compared to accessing a page that’s already available in memory (because disks are orders of magnitude slower than memory).

Execution of the service stalls until a page that’s currently in memory is written to disk to make room for the requested page, which is then loaded into memory. What disk access essentially does is turn a fast, memory-based operation into a slow, disk-based operation that can degrade performance.

You can monitor performance counters built into the Windows operating system and the .NET framework to determine if paging is impacting performance and if you can improve performance by reducing paging.

A high number of Page Faults indicates high contention for available system memory. If requested pages are frequently not available and have to be loaded from disk, you may also want to keep an eye on performance counters like:

  • Memory\Committed Bytes, to ensure that it doesn’t exceed the amount of physical memory in your server (if you cannot tolerate performance degradation due to paging).
  • Process\Private Bytes, to check that individual processes do not impact the performance of other processes by allocating excessive amounts of memory.

You can best avoid the performance degradation caused by paging by reducing contention for system memory. You can do this either by supplying large amounts of physical memory or by reducing concurrent access to the available memory. Likewise, eliminating contention for other system resources improves performance as well.

Request Throttling

Exclusive access to resources reduces contention and thus improves performance. Sometimes you can relieve contention just by adding more resources, such as more system memory or more CPUs. Other times, adding more resources is not an option, perhaps due to hardware restrictions or a limited number of available database connections. In that case you can avoid concurrency by throttling the number of requests sent to the service. This limits contention by capping the number of concurrent service requests, which reduces CPU context switches and concurrent access to shared resources.

Effective throttling shrinks the idle time component in the performance model as shown in Figure 7.


Figure 7 – Request throttling reduces idle time in a service capability.

Remember though, that throttling is typically scoped to a single service. Throttling can reduce contention within a service, but multiple services sharing a server can still compete for these resources and cause contention and slowdowns. You could introduce an intermediary to throttle access to multiple services, but an intermediary often introduces performance issues instead of fixing them.

Throttling With WCF

WCF allows for the throttling of messages by adding a throttling behavior to the channel. You can configure the throttling behavior in the service or client configuration file as follows:

<configuration>
  <system.serviceModel>
    <services>
      <service name="..."
               behaviorConfiguration="Throttled" >
      </service>
    </services>
    <behaviors>
      <serviceBehaviors>
        <behavior name="Throttled">
          <serviceThrottling
            maxConcurrentCalls="8"
            maxConcurrentSessions="8"
            maxConcurrentInstances="8" />
        </behavior>
      </serviceBehaviors>
    </behaviors>
  </system.serviceModel>
</configuration>

Example 5

This configuration limits concurrent processing to eight and improves performance in scenarios where contention for limited resources causes a problem. Processing more than eight requests simultaneously on a server with fewer than eight processors would cause costly context switches. Throttling the message processing reduces the amount of concurrency and thus the time lost to switching context between the processing threads.

With the Windows Server AppFabric Application Server Extensions installed, you can also configure throttling parameters from the IIS Management tool. Note that as with everything else in WCF, you can configure the concurrency thresholds programmatically. However, placing these values in the configuration file allows for more flexibility to adjust the numbers as necessary without having to recompile code.

Request Throttling with BizTalk Server

BizTalk Server provides more sophisticated throttling features than WCF. The BizTalk messaging engine continuously monitors several performance counters for each BizTalk host. Under several load conditions, BizTalk throttles message processing when certain performance counters exceed configured thresholds. The engine slows down processing of incoming and outgoing messages to reduce contention for resources, such as memory or server connections. You can configure the thresholds for each BizTalk host from the Administration Console.

Throttling offers a relatively simple means of reducing resource contention. But just like with other performance tuning steps, it’s important to understand what it can and cannot do. Throttling in WCF happens for a single service, but other services running on the same server could still compete for the same resources. Throttling in BizTalk occurs at the host level. Other BizTalk hosts or services not hosted in BizTalk could compete for the same physical resources on the server.

You can only control contention in a meaningful way when fully applying the Service Autonomy principle to create truly autonomous service implementations that have their own set of dedicated resources, either physically or through hardware-based virtualization. This level of autonomy allows you to commit your services to hard Service Level Agreement (SLA) requirements.

Conclusion

The second article in this two-part series goes over the following topics: Coarse-Grained Service Contracts, Selecting Application Containers, Performance Policies, REST Service Message Sizes, Hardware Encryption, High Performance Transport, MTOM Encoding, Performance Considerations for Service Contract Design, and Impact on Service-Orientation Principles.

( Referenced from http://www.servicetechmag.com/i69/1212-3 )

Detailed Tutorial for Building ASP.Net Web API RESTful Service

When you are designing, implementing, and releasing a new REST API, a lot of constraints and standards should be considered; once the API is available to the public and clients start consuming it, significant changes are very hard!

There are a lot of API designs on the web, but there is no widely adopted design which works for all scenarios; that is why you are left with many choices and grey areas.

So in this multi-part series we’ll be building from scratch a sample eLearning system API which follows best practices for building a RESTful API using the Microsoft technology stack. We’ll use Entity Framework 6 (Code First) and ASP.NET Web API.

Before digging into the code samples and walkthrough, I would like to talk a little bit about the basics and characteristics of RESTful services and ASP.NET Web API.

Basics of RESTful services:

REST stands for Representational State Transfer; it is a simple stateless architecture that runs over HTTP where each unique URL is a representation of some resource. There are four basic design principles which should be followed when creating a RESTful service:

  • Use HTTP methods (verbs) explicitly and in a consistent way to interact with resources (Uniform Interface), i.e. to retrieve a resource use GET, to create a resource use POST, to update a resource use PUT/PATCH, and to remove a resource use DELETE (see the controller sketch after this list).
  • Interaction with resources is stateless; each request initiated by the client should include within the HTTP headers and body of the request all the parameters, context, and data needed by the server to generate the response.
  • Resource identification should be done through URIs, in simple words the interaction between client and resource in the server should be done using URIs only. Those URIs can act like a service discovery and interface for your RESTful service.
  • Support JSON or/and XML as the format of the data exchanged in the request/response payload or in the HTTP body.
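
To make the uniform-interface idea concrete before the real walkthrough starts, here is a minimal ASP.NET Web API controller sketch showing how those verbs map to controller methods; the CoursesController name and its in-memory dictionary are placeholders I made up, not part of the eLearning API built later in this series:

public class CoursesController : ApiController
{
    // crude in-memory store, purely for illustration
    private static readonly Dictionary<int, string> Courses =
        new Dictionary<int, string> { { 1, "Intro to REST" } };

    // GET api/courses/1 - retrieve a resource
    public string Get(int id)
    {
        return Courses[id];
    }

    // POST api/courses - create a resource
    public void Post([FromBody] string name)
    {
        Courses.Add(Courses.Count + 1, name);
    }

    // PUT api/courses/1 - update a resource
    public void Put(int id, [FromBody] string name)
    {
        Courses[id] = name;
    }

    // DELETE api/courses/1 - remove a resource
    public void Delete(int id)
    {
        Courses.Remove(id);
    }
}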

For more information about RESTful services, you can check this information rich IBM article.

Introducing the ASP.NET Web API

The ASP.NET Web API shipped with ASP.NET MVC 4 and has been around for more than a year and a half. It is considered a framework for building HTTP services which can be consumed by a broad range of clients such as browsers, smart phones, and desktop applications. It is not a part of the MVC framework; it is part of the core ASP.NET platform and can be used in MVC projects, ASP.NET Web Forms, or as a standalone web service.

ASP.Net Web API Stack

Today, with the increased use of smart phones and the trend of building Single Page Apps (SPAs), having a lightweight Web API which exposes your service’s data to clients is very important. ASP.NET Web API will help you out of the box in creating RESTful-compliant services using HTTP features such as URIs, request/response, headers, versioning, and different content formats.

What will we build in this multi-part series?

We need to keep things simple and easy to understand and learn from, but at the same time we need to cover the different features provided by ASP.NET Web API and best practices for building a RESTful service.

We’ll be building a simple API for an eLearning system; this API allows students to enroll in different courses, allows tutors to view the students enrolled in each course, perform CRUD operations on courses and students, and many more operations. I’ll be listing detailed use cases which we’ll cover in the next post.

We’ll discuss and implement different Web API features such as:

  • Using different routing configurations, controllers, resources association, response formatting, and filters.
  • Implementing Dependency Injection using Ninject.
  • Applying results pagination using different formatting techniques.
  • Implementing complex CRUD operations on multiple resources.
  • Securing the Web API by using Basic authentication and forcing SSL.
  • Implementing API versioning using different techniques (URL versioning, by query string, by version header, and by accept header).
  • Implementing resource caching.

Drupal architecture – how to implement loosely coupled communication across modules

Drupal is a free and open-source Content Management System which is used for building websites. It is based on the LAMP architecture, using PHP as the implementation language. You can read success stories on the website and on the author Dries Buytaert’s personal blog. Most of the success of Drupal derives from its modular architecture, which simplifies development, collaboration and user contribution. Drupal’s history can be read here.

A bird’s-eye view of the Drupal architecture can be sketched with the following diagram:

At the ground level there is the Drupal API. It implements the basic functionality of the module system and of the CMS. Physically it is made up of a folder called includes/ which contains a set of PHP files (with a .inc extension). Each of these PHP files implements an API that can be exploited by upper-level modules.
The Core package contains all the Drupal modules that implement the CMS engine. In this package you can find modules for node management, blogging, commenting, forums, menus and so on. Here is a list of Drupal v7 core modules as they appear on the filesystem:

On top of the Core modules there are Community Modules. These are modules contributed by the open-source community which are not in the main distribution. For example, you can find modules for AdSense, Amazon integration, voting and many more (here you can find a complete list of the community modules available).
Finally, there are User modules, which are custom private modules built by developers to implement specific project needs. A typical website is deployed using the Drupal API and Core modules.

Drupal API

The following diagram shows how the Drupal API is structured (the content of the /includes directory):

There is an entry-point file called index.php. This is called on every Drupal request. According to the request, it includes several .inc files. Every file implements a particular feature: for example, database.inc exports the API to access the database, module.inc does the hidden work of the module system, theme.inc implements the theming subsystem, and so on. Every include exports a set of constants and functions in the form of:

<?php
// module sample.inc
define('SAMPLE_CONSTANT', 'sample value');
......

function a_method_1($param1, $param2) {
	// code here ...
}

function a_method_2() {
	// code here ...
}

function a_method_3($input_param1, $input_param2, $input_param3) {
	// code here
}
...................
...................

As an example this is the code extracted from includes/bootstrap.inc:

Here you can find a description of how Drupal boots and processes each request.

Drupal Modules

A module is a set of PHP and configuration files deployed in a particular folder of the Drupal installation. These files follow a particular naming convention; each module has:

  • a .info file (example: mymodule.info) which contains version and dependency information about the module
  • a .module file (example: mymodule.module) which contains the PHP code implementing the module’s functionality. Generally the module uses the Drupal API
  • an .api file (example: mymodule.api) which contains the hooks implemented by the module: the events the module is interested in
  • an .install file (example: mymodule.install) which contains code to execute when the module is installed/uninstalled

One of the first things I asked myself is how they implemented loosely coupled communication between modules. At an abstract level they used the Observer/Observable pattern, principally based on a file-name convention. Every module notifies a set of internal events which carry data. These events are called hooks. When a module is interested in an event, it implements a particular method whose name contains the event name plus a prefix; this method will be automatically called by the module subsystem. The dynamics are sketched in the following diagram:

When Module_A wants to send an event to other interested modules, it invokes the function module_invoke_all() from the Drupal API (file module.inc). module_invoke_all() finds all modules implementing that particular hook and calls, for each of them, a method named <modulename>_<hookname>(params).

RAILS VS DJANGO: AN IN-DEPTH TECHNICAL COMPARISON

I’d like to start with a disclaimer. I have been developing websites using Django for 3 years now and it’s no secret that I like Django. I wrote an open-source app for it and I have started sending patches to Django. I have, however, written this article to be as unbiased as possible and there are plenty of compliments (and criticisms) for both frameworks.

Six months ago I joined a project at my university using Ruby on Rails and have been working with it since then. The first thing I did was to look for reviews and comparisons between the two frameworks, and I remember being frustrated. Most of them were a little shallow and compared the frameworks on a higher level, while I was looking for answers to questions like ‘how are database migrations handled by both?’, ‘what are the differences in the template syntax?’, ‘how is user authentication done?’. The following review will answer these questions and compare how the model, controller, view and testing are handled by each web framework.

A SHORT INTRODUCTION

Both frameworks were born out of the need of developing web applications faster and organizing the code better. They follow the MVC principle, which means the modelling of the domain (model), the presentation of the application data (view) and the user’s interaction (controller) are all separated from each other. As a side note, Django actually considers the framework to be the controller, so Django addresses itself as a model-template-view framework. Django’s template can be understood as the view and the view as the controller of the typical MVC scheme. I’ll be using the standard MVC nomenclature on this post.

Ruby on Rails

Ruby on Rails (RoR) is a web framework written in Ruby and is frequently credited with making Ruby “famous”. Rails puts strong emphasis on convention-over-configuration and testing. Rails’ CoC means almost no config files, a predefined directory structure and following naming conventions. There’s plenty of magic everywhere: automatic imports, controller instance variables automatically passed to the view, a bunch of things such as template names inferred automatically, and much more. This means a developer only needs to specify the unconventional aspects of the application, resulting in cleaner and shorter code.

Django

Django is a web framework written in Python and was named after the guitarist Django Reinhardt. Django’s motivation lies in the “intensive deadlines of a newsroom and the stringent requirements of the experienced Web developers who wrote it”. Django follows explicit is better than implicit (a core Python principle), resulting in code that is very readable even for people who are not familiar with the framework. A project in Django is organized around apps. Each app has its own models, controllers, views and tests and feels like a small project. Django projects are basically a collection of apps, with each app being responsible for a particular subsystem.

MODEL

Let’s start by looking at how each framework handles the MVC principle. The model describes what the data looks like and contains the business logic.

Creating models

Rails creates models by running a command in terminal.

rails generate model Product name:string quantity_in_stock:integer
                             category:references

The command above will automatically generate a migration and an empty model file, which looks like this:

class Product < ActiveRecord::Base
end

Coming from a Django background, one thing that annoyed me was the fact that I couldn’t know which fields a model has just by looking at its model file. What I learned was that Rails uses the model files basically only for business logic and stores what all models look like in a file called schema.rb. This file is automatically updated every time a migration is run. If we take a look at this file we can see what our Product model looks like.

create_table "products", :force => true do |t|
  t.string   "name"
  t.integer  "quantity_in_stock"
  t.integer  "category_id"
  t.datetime "created_at", :null => false
  t.datetime "updated_at", :null => false
end

Notice how there are two extra fields in the model. created_at and updated_at are fields that are added to every model in Rails and will be set automatically.

In Django models are defined in a file called models.py. The same Product model would look like this

class Product(models.Model):
    name = models.CharField()
    quantity_in_stock = models.IntegerField()
    category = models.ForeignKey('Category')
    created_at = models.DateTimeField(auto_now_add=True) # set when it's created
    updated_at = models.DateTimeField(auto_now=True) # set every time it's updated

Notice that we had to explicitly add created_at and updated_at in Django. We also had to tell Django how these fields behave through the parameters auto_now_add and auto_now.

Model field defaults and foreign keys

Rails will by default allow fields to be null. You can see in the example above that the three fields we created allowed null. The reference field to a Category will also neither create an index nor a foreign key constraint. This means referential integrity is not guaranteed. Django’s default is the exact opposite. No field is allowed to be null unless explicitly set so. Django’s ForeignKey will also create a foreign key constraint and an index automatically. Although Rails’ decision here may be motivated by performance concerns, I’d side with Django here, as I believe this decision avoids (accidental) bad design and unexpected situations. For example, a previous student in our project wasn’t aware that all the fields he had created allowed null by default. After a while we noticed that some of our tables contained data that made no sense, such as a poll with null as the title. Since Rails doesn’t add FKs, following our example we could have deleted a category that is still referenced by other products, and these products would then have invalid references. An option is to use a third-party gem that adds support for the automatic creation of foreign keys.

Migrations

Migrations allow the database schema to be changed after it has already been created (actually, in Rails everything is a migration — even creating). I have to give props to Rails for supporting this out-of-the-box for a long time. This is done using Rails’ generator

$ rails generate migration AddPartNumberToProducts part_number:string

This would add a new field called part_number to the Product model.

Django only supports migrations through a third-party library called South; however, I find South’s approach to be both somewhat cleaner and more practical. The equivalent of the migration above can be done by directly editing the Product model definition and adding the new field:

class Product(models.Model):
    ... # old fields
    part_number = models.CharField(max_length=30)  # CharField requires a max_length

and then calling

$ python manage.py schemamigration products --auto

South will automatically recognize that a new field was added to the Product model and create a migration file. The database can then be brought up to date by calling:

$ python manage.py migrate products

Django is finally going to support migrations in its newest version (1.7) by integrating South into it.
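Assuming the same products app, the equivalent workflow with the built-in migrations introduced in 1.7 looks roughly like this (makemigrations replaces South’s schemamigration):

$ python manage.py makemigrations products
$ python manage.py migrate products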

Making queries

Thanks to object-relational mapping, you will not have to write a single line of SQL in either framework. Ruby’s expressiveness lets you write range queries quite nicely.

Client.where(created_at: (Time.now.midnight - 1.day)..Time.now.midnight)

This would find all Clients created yesterday. Python supports neither syntax like 1.day, which is extremely readable and succinct, nor the .. range operator. However, sometimes in Rails I feel like I’m writing prepared statements again. To select all rows where a particular field is greater than or equal to a value, you have to write

Model.where('field >= ?', value)

Django’s way of doing this is not much better but, in my opinion, more elegant. The equivalent line in Django looks like this:

Model.objects.filter(field__gte=value)
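For completeness, a rough Django equivalent of the “created yesterday” query from the Rails example could use the __range lookup (a sketch; Client is assumed to be a model with a created_at DateTimeField):

from datetime import timedelta
from django.utils import timezone

# Client is assumed to be a model with a created_at DateTimeField.
# Midnight today and midnight yesterday.
today = timezone.now().replace(hour=0, minute=0, second=0, microsecond=0)
yesterday = today - timedelta(days=1)

# __range translates to SQL BETWEEN, i.e. inclusive on both ends.
Client.objects.filter(created_at__range=(yesterday, today))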

CONTROLLER

Controllers have the task of making sense of a request and returning an appropriate response. Web applications typically support adding, editing, deleting and showing details of a resource, and RoR’s conventions really shine here by making the development of controllers short and sweet. Controllers are divided into methods, each representing one action (show for fetching the details of a resource, new for showing the form to create a resource, create for receiving the POST data from new and actually creating the resource, etc.). The controller’s instance variables (prefixed with @) are automatically passed to the view, and Rails knows from the method name which template file to use as the view.

class ProductsController < ApplicationController
  # automatically renders views/products/show.html.erb
  def show
    # params is a ruby hash that contains the request parameters
    # instance variables are automatically passed to views
    @product = Product.find(params[:id])
  end
  # returns an empty product, renders views/products/new.html.erb
  def new
    @product = Product.new
  end
  # Receives POST data the user submitted. Most likely coming from
  # a form in the 'new' view.
  def create
    @product = Product.new(params[:product])
    if @product.save
      redirect_to @product
    else
      # overrides default behavior of rendering create.html.erb
      render "new"
    end
  end
end

Django has two different ways of implementing controllers. You can use a method to represent each action, very similar to how Rails does it, or you can create a class for each controller action. Django, however, doesn’t have separate new and create methods; showing the empty form and creating the resource happen in the same controller action. There is also no convention for naming your views. View variables need to be passed explicitly from the controller, and the template file to be used needs to be set as well.

from django.http import HttpResponseRedirect
from django.shortcuts import render

from .forms import ProductForm  # assuming ProductForm is defined in this app's forms.py
from .models import Product

# django usually calls the 'show' method 'detail';
# the product_id parameter comes from the routing
def detail(request, product_id):
    p = Product.objects.get(pk=product_id) # pk = primary key
    # renders detail.html with the third parameter passed as context
    return render(request, 'products/detail.html', {'product': p})

def create(request):
    # check if form was submitted
    if request.method == 'POST':
        # similar to RoR's 'create' action
        form = ProductForm(request.POST) # A form bound to the POST data
        if form.is_valid(): # All validation rules pass
            new_product = form.save()
            return HttpResponseRedirect(new_product.get_absolute_url())
    else:
        # similar to RoR's 'new' action
        form = ProductForm() # An empty form
    return render(request, 'products/create.html', {'form': form})

The amount of boilerplate in this example in Django is obvious when compared to RoR. Django seems to have noticed this and developed a second way of implementing controllers by harnessing inheritance and mixins. This second method is called class-based views (remember, Django calls the controller a view) and was introduced in Django 1.3 to promote code reuse. Many common actions such as displaying, listing, creating and updating a resource already have a class you can inherit from, which greatly simplifies the code. Repetitive tasks such as specifying the view file to be used, fetching the object and passing it to the view are done automatically. The same example above would be only four lines using this pattern.

# Supposes the routing passed in a parameter called 'pk'
# containing the object id and uses it for fetching the object.
# Automatically renders the view /products/product_detail.html
# and passes product as a context variable to the view.
class ProductDetail(DetailView):
    model = Product
# Generates a form for the given model. If data is POSTed,
# automatically validates the form and creates the resource.
# Automatically renders the view /products/product_create.html
# and passes the form as a context variable to the view.
class ProductCreate(CreateView):
    model = Product

When the controllers are simple, using class-based views is usually the best choice, as the code generally ends up very compact and readable. However, depending on how non-standard your controllers are, many functions may need to be overridden to achieve the desired functionality. A common case is when the programmer wants to pass additional variables to the view; this is done by overriding the function get_context_data. Do you want to render a different template depending on a particular field of the current object (model instance)? You’ll have to override render_to_response. Do you want to change how the object is fetched (the default is to use the primary key field pk)? You’ll have to override get_object. For example, if we wanted to select a particular product by its name instead of its id and also pass to our view which products are similar to it, the code would look like this:

class ProductDetail(DetailView):
    model = Product
    def get_object(self, queryset=None):
        return get_object_or_404(Product, name=self.kwargs.get('name'))
    def get_context_data(self, **kwargs):
        # Call the base implementation first to get a context
        context = super(ProductDetail, self).get_context_data(**kwargs)
        # Add in the related products
        context['related_products'] = self.get_object().related_products
        return context

VIEW

Rails views use the Embedded Ruby (ERB) template system, which allows you to write arbitrary Ruby code inside your templates. This makes it extremely powerful and fast, but with great power comes great responsibility: you have to be very careful not to mix your presentation layer with any other kind of logic. I have another example involving fellow students. A new student had joined our RoR project and was working on a new feature. It was time for code review. We started at the controller, and the first thing that struck me was how empty his controller looked. I immediately looked at his views and saw massive Ruby blocks intermixed with HTML. Yes, Rails is not to blame for a programmer’s lack of experience, but my point is that frameworks can protect the developer from some bad practices. Django, for instance, has a very frugal template language. You can do ifs and iterate through data using for loops, but there is no way to select objects that were not passed in from the controller, as it does not execute arbitrary Python expressions. This is a design decision that I strongly believe pushes developers in the right direction. It would have forced the new student in our project to find the correct way of organizing his code.

Assets: CSS, Javascript and images

Rails comes with an excellent built-in asset pipeline. Rails’ asset pipeline is capable of concatenating, minifying and compressing JavaScript and CSS files. Not only that, it also supports other languages such as CoffeeScript, Sass and ERB. Django’s support for assets is pretty much shameful compared to Rails and leaves everything for the developer to handle. The only thing Django offers is something called static files, which basically collects all static files from each app into a single location. A third-party app called django_compressor offers a solution similar to Rails’ asset pipeline.

Forms

Forms in web applications are the interface through which users give input. Forms in Rails consist of helper methods that are used directly in the views.

<%= form_tag("/contact", method: "post") do %>
  <%= label_tag(:subject, "Subject:") %>
  <%= text_field_tag(:subject) %>
  <%= label_tag(:message, "Message:") %>
  <%= text_field_tag(:message) %>
  <%= label_tag(:sender, "Sender:") %>
  <%= text_field_tag(:sender) %>
  <%= label_tag(:cc_myself, "CC myself:") %>
  <%= check_box_tag(:cc_myself) %>
  <%= submit_tag("Submit") %>
<% end %>

The input fields subject, message, etc. can then be read in the controller through the Ruby hash (a dictionary-like structure) params, for example params[:subject] and params[:message]. Django, on the other hand, abstracts the concept of forms. Forms are normal classes that encapsulate fields and can contain validation rules. They look a lot like models.

class ContactForm(forms.Form):
    subject = forms.CharField(max_length=100)
    message = forms.CharField()
    sender = forms.EmailField()
    cc_myself = forms.BooleanField(required=False)

Django knows that the HTML counterpart of a CharField is a text input box and that a BooleanField maps to a checkbox. If you wish, you can change which input element is used through the field’s widget argument.
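As a quick sketch (choosing a Textarea here is just for illustration), overriding the default widget looks like this:

from django import forms

class ContactForm(forms.Form):
    subject = forms.CharField(max_length=100)
    # Render the message as a multi-line <textarea> instead of a single-line text input.
    message = forms.CharField(widget=forms.Textarea)
    sender = forms.EmailField()
    cc_myself = forms.BooleanField(required=False)

Forms in Django are instantiated in the controller.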

def contact(request):
    if request.method == 'POST':
        form = ContactForm(request.POST)
        if form.is_valid():
            return HttpResponseRedirect('/thanks/') # Redirect after POST
    else:
        form = ContactForm() # An unbound form
    return render(request, 'contact.html', { 'form': form })

Django will throw in validation for free. By default all fields are required, unless defined otherwise (such as cc_myself). Using the snippet above, if the form fails validation, the same form is automatically displayed again with an error message and the previously entered input. The code below displays the form in a view.

<form action="/contact/" method="post">
{% csrf_token %} <!-- required by Django's CSRF protection for POST forms -->
{{ form.as_p }} <!-- generates a form very similar to rails' -->
<input type="submit" value="Submit" />
</form>
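Beyond the built-in required checks, form classes can also carry custom validation rules through clean_<fieldname> hooks; a minimal sketch (the subject check is purely illustrative):

from django import forms

class ContactForm(forms.Form):
    subject = forms.CharField(max_length=100)
    message = forms.CharField()
    sender = forms.EmailField()
    cc_myself = forms.BooleanField(required=False)

    def clean_subject(self):
        # clean_<fieldname> methods run during is_valid(); raising
        # ValidationError attaches the error message to that field.
        subject = self.cleaned_data['subject']
        if 'spam' in subject.lower():
            raise forms.ValidationError('Please choose a different subject.')
        return subject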

URLS AND ROUTING

Routing is the task of matching a particular URL to a controller. Rails makes building REST web services a breeze and routes are expressed in terms of HTTP verbs.

get '/products/:id', to: 'products#show'

In the example above a GET request to /products/any_id will be automatically routed to the products controller and the show action. Since it’s such a common task to have all actions (create, show, index, etc.) in a controller, and thanks to convention over configuration, RoR provides a way of quickly declaring all common routes, called resources. If you followed Rails’ conventions when naming your controller methods, this is very handy.

# automatically maps GET /products/:id to products#show
#                    GET /products to products#index
#                    POST /products to products#create
#                    DELETE /products/:id to products#destroy
#                    etc.
resources :products

Django does not use the HTTP verb to route. Instead it is more verbose and uses regular expressions to match URLs to controllers.

urlpatterns = patterns('',
    # matches the detail (show) view of the products app
    url(r'^products/(?P<pk>\d+)/$', products.views.DetailView.as_view(), name='detail'),
    # matches the index view, you get the gist
    url(r'^products/$', products.views.IndexView.as_view(), name='index'),
    url(r'^products/create/$', products.views.CreateView.as_view(), name='create'),
    url(r'^products/(?P<pk>\d+)/delete/$', products.views.DeleteView.as_view(), name='delete'),
)

Since regular expressions are used, simple validation is automatically built in. Requesting /products/test/ will not match any routing rule and a 404 will be raised, as test is not a valid integer. The difference in philosophy can be seen here again: Django does not have any convention for naming controller actions, so it offers no helper like Rails’ resources and every route has to be defined explicitly. This results in each controller requiring several routing rules.
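The name argument on each rule is what lets Django build URLs back from code instead of hard-coding them; a minimal sketch using the 'detail' rule above (the product id 42 is just an example):

from django.core.urlresolvers import reverse  # moved to django.urls in later versions

# Produces '/products/42/' by filling the pk group of the 'detail' pattern.
url_to_product = reverse('detail', args=[42])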

TESTING

Testing is a breeze in Rails, and there is a much stronger emphasis on it in Rails than in Django.

Fixtures

Both frameworks support fixtures (a fancy word for sample data) in a very similar way. I’d give Rails a bonus point, however, for making them more practical to write, as it infers from the file name which model the fixtures are for. Rails uses YAML-formatted fixtures; YAML is a human-readable data serialization format.

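On the Django side, a minimal sketch of how fixtures are pulled into a test (the products fixture file and the Hammer row are assumptions for illustration):

from django.test import TestCase

from products.models import Product  # assuming the app is called 'products'

class ProductTests(TestCase):
    # Fixture files (e.g. products.json, or products.yaml with PyYAML installed)
    # are loaded from each app's fixtures/ directory before every test.
    fixtures = ['products']

    def test_fixture_loaded(self):
        # The fixture is assumed to contain a product named 'Hammer'.
        self.assertTrue(Product.objects.filter(name='Hammer').exists())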