Minifying HTML With ASP.NET MVC And Razor

When I work on a Web application, I minify the JavaScript and CSS files with tools like Yahoo!'s YUI Compressor or Microsoft Ajax Minifier. In two of his articles, Nicholas Zakas explains how to use these tools and the gains you can expect from minification.

Minifying HTML is more difficult. There are very few tools for it, and most of them integrate into the rendering pipeline: they run just after the pages generate the HTML. This means they are invoked on every page request, which increases CPU usage. The only real way to minify HTML while reducing both CPU and bandwidth usage is to rewrite the pages by hand and remove all the useless whitespace.

A typical HTML page is indented and contains comments:

<div>
    <!-- a comment -->
    <h4>header</h4>
    <ul>
        <li>item 1</li>
        <li>item 2</li>
        <li>item 3</li>
    </ul>
</div>

After a manual minification, all the indentation and comments are removed:

<div>
<h4>header</h4>
<ul>
<li>item 1</li>
<li>item 2</li>
<li>item 3</li>
</ul>
</div>

The HTML is 25% smaller. That’s 25% fewer bytes written to the wire, which means less server usage and better response times.
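The saving is easy to measure for any pair of pages. Here is a minimal sketch that counts the UTF-8 bytes of the two snippets above. It assumes one tab per indentation level and "\n" line endings, so other indentation styles will give a different figure; SizeSavings and SavingPercent are names made up for this illustration.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class SizeSavings
{
    // Assumption: one tab per indentation level, "\n" line endings.
    public const string Original =
        "<div>\n" +
        "\t<!-- a comment -->\n" +
        "\t<h4>header</h4>\n" +
        "\t<ul>\n" +
        "\t\t<li>item 1</li>\n" +
        "\t\t<li>item 2</li>\n" +
        "\t\t<li>item 3</li>\n" +
        "\t</ul>\n" +
        "</div>";

    public const string Minified =
        "<div>\n" +
        "<h4>header</h4>\n" +
        "<ul>\n" +
        "<li>item 1</li>\n" +
        "<li>item 2</li>\n" +
        "<li>item 3</li>\n" +
        "</ul>\n" +
        "</div>";

    // Percentage of UTF-8 bytes saved when 'original' is replaced by 'minified'.
    public static double SavingPercent(string original, string minified)
    {
        int before = Encoding.UTF8.GetByteCount(original);
        int after = Encoding.UTF8.GetByteCount(minified);
        return 100.0 * (before - after) / before;
    }

    public static void Main()
    {
        Console.WriteLine(Encoding.UTF8.GetByteCount(Original)); // 116
        Console.WriteLine(Encoding.UTF8.GetByteCount(Minified)); // 87
        double saving = SavingPercent(Original, Minified);
        Console.WriteLine(saving.ToString("F1", CultureInfo.InvariantCulture)); // 25.0
    }
}
```

Running the same measurement on the real output of your own pages is the quickest way to see whether minification is worth it for your application.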

Do not expect improvements if your application executes millions of SQL queries on each page. You have to optimize your application logic first before expecting results from minification.

ASP.NET MVC 3.0 brings the new Razor view engine as a replacement for ASPX. After a first project with Razor, I must say I am really happy with the simplified syntax and the integration with MVC. Razor has another useful advantage over ASPX: it’s very easy to extend. We are going to leverage that to minify the HTML pages automatically.

Razor works in two steps:

  1. The first time a page is called, Razor parses the cshtml file, generates the equivalent C# code, and compiles the rendering code into a DLL.
  2. Every time a page is called, Razor looks for the DLL and invokes the Execute method.
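The practical consequence of these two steps is that anything done to the template during step 1 is paid for once, not on every request. The sketch below imitates that behavior with a plain dictionary cache. It is only an analogy under simplifying assumptions and not how Razor is implemented: CompileOnceDemo, Compile, Render and the "{name}" placeholder are all invented for this illustration.

```csharp
using System;
using System.Collections.Concurrent;

public static class CompileOnceDemo
{
    // Counts how often the expensive "compile" step runs.
    public static int CompileCount;

    // Hypothetical stand-in for the compiled-page cache: path -> render delegate.
    private static readonly ConcurrentDictionary<string, Func<string, string>> Cache =
        new ConcurrentDictionary<string, Func<string, string>>();

    // Stand-in for "parse, generate C#, compile". Any cleanup of the static
    // markup done here (a simple Trim in this sketch) happens once per page.
    private static Func<string, string> Compile(string template)
    {
        CompileCount++;
        string prepared = template.Trim();
        return name => prepared.Replace("{name}", name);
    }

    // Stand-in for a page request: reuse the compiled delegate when it exists.
    public static string Render(string path, string template, string name)
    {
        Func<string, string> page = Cache.GetOrAdd(path, _ => Compile(template));
        return page(name);
    }

    public static void Main()
    {
        const string template = "   <p>Hello {name}</p>   ";
        Console.WriteLine(Render("~/Views/Home/Index.cshtml", template, "Ann")); // <p>Hello Ann</p>
        Console.WriteLine(Render("~/Views/Home/Index.cshtml", template, "Bob")); // <p>Hello Bob</p>
        Console.WriteLine(CompileCount); // 1
    }
}
```

The second request reuses the delegate built by the first one, so the trimming work is not repeated.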

The idea is to minify the HTML before Razor generates the C# code, by providing our own implementation of MvcCSharpRazorCodeGenerator. This class has a VisitSpan method that Razor calls with every token, or Span, found when parsing the page. There are several kinds of spans: some hold the C# expressions in the page, and others, the MarkupSpans, contain the page HTML.

By overriding the VisitSpan method, we can catch all the page markup, apply a minify algorithm to it, and then let Razor generate the DLL for the page. For now, our Minify method is very simple: it trims the whitespace.

using System.Web.Mvc.Razor;
using System.Web.Razor;
using System.Web.Razor.Parser.SyntaxTree;

public sealed class MinifyHtmlCodeGenerator : MvcCSharpRazorCodeGenerator
{
    public MinifyHtmlCodeGenerator(string className, string rootNamespaceName, string sourceFileName, RazorEngineHost host)
        : base(className, rootNamespaceName, sourceFileName, host)
    {
    }
    public override void VisitSpan(Span span)
    {
        // We only minify the static text
        var markupSpan = span as MarkupSpan;
        if (markupSpan == null)
        {
            base.VisitSpan(span);
            return;
        }
        var content = markupSpan.Content;
        content = Minify(content);
        span.Content = content;
        base.VisitSpan(span);
    }
    private string Minify(string content)
    {
        return content.Trim();
    }
}

To integrate the new code generator into the application, we have to change Views/Web.config to replace the default Razor factory with a new one:

<configuration>
  <system.web.webPages.razor>
    <!--<host factoryType="System.Web.Mvc.MvcWebRazorHostFactory, System.Web.Mvc, Version=3.0.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35" />-->
    <host factoryType="Meleze.Web.Razor.MinifyHtmlWebRazorHostFactory,Meleze.Web.Razor" />
  </system.web.webPages.razor>
</configuration>

The factory code is straightforward. It extends the default factory to call our generator.

using System.Web.Mvc.Razor;
using System.Web.Razor.Generator;
using System.Web.WebPages.Razor;

public sealed class MinifyHtmlWebRazorHostFactory : WebRazorHostFactory
{
    public override WebPageRazorHost CreateHost(string virtualPath, string physicalPath)
    {
        WebPageRazorHost host = base.CreateHost(virtualPath, physicalPath);
        if (host.IsSpecialPage)
        {
            return host;
        }
        return new MinifyHtmlMvcWebPageRazorHost(virtualPath, physicalPath);
    }
}
public sealed class MinifyHtmlMvcWebPageRazorHost : MvcWebPageRazorHost
{
    public MinifyHtmlMvcWebPageRazorHost(string virtualPath, string physicalPath)
        : base(virtualPath, physicalPath)
    {
    }
    public override RazorCodeGenerator DecorateCodeGenerator(RazorCodeGenerator incomingCodeGenerator)
    {
        if (incomingCodeGenerator is CSharpRazorCodeGenerator)
        {
            return new MinifyHtmlCodeGenerator(incomingCodeGenerator.ClassName, incomingCodeGenerator.RootNamespaceName, incomingCodeGenerator.SourceFileName, incomingCodeGenerator.Host);
        }
        return base.DecorateCodeGenerator(incomingCodeGenerator);
    }
}

We can already run an ASP.NET MVC application with the HTML minifier. If we put a breakpoint in the VisitSpan method, we can check that it’s called only when Razor compiles a page. It’s not called again when the views are rendered.

Now, we can enhance the minify algorithm with some new requirements:

  • Remove as much whitespace as possible. This does not mean removing all the whitespace: some of it is significant in HTML, and if we remove too many blanks, browsers will not render the page like the original. That’s the case when there is text inside an HTML tag.
  • Remove all the useless HTML comments. We have to keep the comments that contain JavaScript and the IE conditional comments.
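To see why whitespace next to text is significant, take a line such as `<p>Hello @Model.Name and welcome</p>`. The sketch below assumes Razor hands the markup on each side of the expression to the generator as two separate spans, and compares the plain Trim used so far with a rule that keeps one space wherever a span starts or ends with text rather than a tag. SpanWhitespaceDemo and its two methods are hypothetical names used only for this illustration.

```csharp
using System;

public static class SpanWhitespaceDemo
{
    // The approach used so far: trim every markup span.
    public static string NaiveTrim(string span)
    {
        return span.Trim();
    }

    // Trim, but keep a single space where the span begins or ends with text
    // (not with a tag), because that space separates words in the rendered page.
    public static string TrimKeepingTextSpaces(string span)
    {
        string trimmed = span.Trim();
        if (trimmed.Length == 0)
        {
            return string.Empty;
        }
        string prefix = (char.IsWhiteSpace(span[0]) && trimmed[0] != '<') ? " " : "";
        string suffix = (char.IsWhiteSpace(span[span.Length - 1]) && trimmed[trimmed.Length - 1] != '>') ? " " : "";
        return prefix + trimmed + suffix;
    }

    public static void Main()
    {
        // Assumed markup spans around the @Model.Name expression.
        string before = "    <p>Hello ";
        string after = " and welcome</p>\n";
        string name = "Ann"; // value produced by the expression at render time

        Console.WriteLine(NaiveTrim(before) + name + NaiveTrim(after));
        // <p>HelloAnnand welcome</p>
        Console.WriteLine(TrimKeepingTextSpaces(before) + name + TrimKeepingTextSpaces(after));
        // <p>Hello Ann and welcome</p>
    }
}
```

The indentation and the trailing line break are dropped in both cases, but only the second rule keeps the words apart.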

Of course, the new implementation is more complex than a simple trim, but it handles realistic HTML.

// Requires: using System.Linq; using System.Text;
private string Minify(string content)
{
    if (string.IsNullOrWhiteSpace(content))
    {
        return string.Empty;
    }
    var builder = new StringBuilder(content.Length);
    // Minify the comments
    var icommentstart = content.IndexOf("<!--");
    while (icommentstart >= 0)
    {
        var icommentend = content.IndexOf("-->", icommentstart + 3);
        if (icommentend < 0)
        {
            break;
        }
        if (_commentsMarkers.Select(m => content.IndexOf(m, icommentstart)).Any(i => i > 0 && i < icommentend))
        {
            // The comment contains JavaScript or an IE conditional
            // => we keep it, and we leave any later comment untouched as well
            break;
        }
        // Rebuild the content without the comment
        builder.Append(content, 0, icommentstart);
        builder.Append(content, icommentend + 3, content.Length - icommentend - 3);
        content = builder.ToString();
        builder.Clear();
        icommentstart = content.IndexOf("<!--", icommentstart);
    }
    // Minify white space while keeping the HTML compatible with the given one
    var lines = content.Split(_whiteSpaceSeparators, StringSplitOptions.RemoveEmptyEntries);
    for (int i = 0; i < lines.Length; i++)
    {
        var line = lines[i];
        var trimmedLine = line.Trim();
        if (trimmedLine.Length == 0)
        {
            continue;
        }
        // Keep one leading space when the line starts with text, not with a tag
        if (char.IsWhiteSpace(line[0]) && (trimmedLine[0] != '<'))
        {
            builder.Append(' ');
        }
        builder.Append(trimmedLine);
        // Keep one trailing space when the line ends with text, not with a tag
        if (char.IsWhiteSpace(line[line.Length - 1]) && (trimmedLine[trimmedLine.Length - 1] != '>'))
        {
            builder.Append(' ');
        }
        // Keep a line break between lines, and at the end if the content had one
        if ((i < lines.Length - 1) || (_whiteSpaceSeparators.Any(s => s == content[content.Length - 1])))
        {
            builder.Append('\n');
        }
    }
    return builder.ToString();
}
private static readonly char[] _whiteSpaceSeparators = new char[] { '\n', '\r' };
private static readonly string[] _commentsMarkers = new string[] { "{", "}", "function", "var", "[if" };
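The marker test that decides whether a comment is kept can be tried in isolation. The sketch below reuses the same five markers and applies them to a single comment; CommentMarkerDemo and IsKept are hypothetical names for this illustration, and the check is simplified to one comment at a time.

```csharp
using System;
using System.Linq;

public static class CommentMarkerDemo
{
    // Same markers as in the listing above.
    private static readonly string[] Markers = { "{", "}", "function", "var", "[if" };

    // True when the comment contains at least one marker, in which case
    // the minifier leaves it in the page.
    public static bool IsKept(string comment)
    {
        return Markers.Any(m => comment.IndexOf(m, StringComparison.Ordinal) >= 0);
    }

    public static void Main()
    {
        Console.WriteLine(IsKept("<!-- a comment -->"));                         // False
        Console.WriteLine(IsKept("<!--[if lt IE 9]><p>old IE</p><![endif]-->")); // True
        Console.WriteLine(IsKept("<!-- various notes -->"));                     // True
    }
}
```

The last case shows that the test is a heuristic: "various" contains "var", so a harmless comment is kept. That kind of mistake only costs a few bytes, whereas removing a comment that holds script or an IE conditional would change the page.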

I have put the minifier into practice in my own projects and also on a much bigger one, the Orchard CMS. With a simple change in Web.config, the HTML generated by these applications is 10% to 30% smaller.

Building this small tool allowed me to take a look at the Razor internals. I hope you find it useful.

You can find the full sources at https://github.com/meleze/Meleze.Web. There is also a NuGet package for those who just want the DLL.
