Content Management - Syntax Highlighting

Pretty code for all!

Intro

How can there be data science without code? Given this truth, I will want to feature code in a number of programming languages for various reasons, e.g. to understand code or concepts that I learn from someone else, or to detail the implementation of algorithms or project code. Therefore, this post is to familiarise myself with syntax highlighting.

Rouge highlighting

Overview

Roxanne! No? Anyway… Jekyll (version 3) uses Rouge as its default code syntax highlighting engine. Alternatively, one could use Pygments, and the selection is made in _config.yml. One can also add line numbers and modify the CSS styling that is applied to code snippets. There is more information in the Jekyll docs.
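
The highlighter selection could look like the following in _config.yml. This is a minimal sketch rather than a complete configuration: highlighter is the Jekyll setting name, and the commented-out Pygments line assumes the pygments.rb gem is installed, which this post does not cover.

```yaml
# _config.yml (sketch; all other site settings omitted)
highlighter: rouge        # the Jekyll 3 default, written out explicitly
# highlighter: pygments   # alternative; assumes the pygments.rb gem is available
```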

Basically, anything between the start ({% highlight %}) and end ({% endhighlight %}) Liquid tags will have syntax highlighting applied to it according to the available CSS configuration. Apparently, Rouge is compatible with CSS stylesheets for Pygments.

Formatting guide

To use language-specific syntax highlighting, the start tag should have one of these formats:

  • highlight as the languageID language: {% highlight languageID %}.

  • highlight as the languageID language with line numbers: {% highlight languageID linenos %}.

Documentation is available for the languageID codes (short names) supported by Rouge and Pygments. Given that I will use Rouge (the default), I selected a few of the languages that I might use and summarised them here for quick reference (Rouge codes and aliases in brackets). I have listed them according to how I might use them. Admittedly the categories are arbitrary, but still useful:

  • Jekyll blog: Markdown (markdown), Ruby (ruby), Liquid (liquid), YAML (yaml, yml), HTML (html), CSS (css).

  • Data Science: R (r, R, s, S), Python (python, py), SQL languages (sql).

  • Web: JavaScript (javascript, js), JSON (json, json-doc), HTTP requests (http).

  • App dev: Java (java), C++ (cpp).

  • Unix: Unix shell script (shell), Unix shell console (shell_session).

UPDATE: 2016-12-01 (more formats for reference)

  • Config: INI configuration format (ini), general configuration files (conf, config, configuration), .properties config files for Java (properties), Gradle, a build system for the JVM (gradle).

Rouge Example

Below is a basic example of code written in the R language. It is a simple function that prints the phrase “hello console” (followed by the iteration number) to the console repeatedly, as many times as the numerical value of the repeatCount argument. The code is nonsense but valid; I wanted something more substantial than “hello world”.

The Liquid-formatted code supplied to Jekyll is shown below. It is itself highlighted using the {% highlight liquid %} start tag; inside that tag, a {% raw %} tag surrounds the R highlighting block ({% highlight r %}) so that the tags are displayed rather than evaluated:

    
{% highlight r %}
greetConsole <- function(repeatCount){
    for (i in 1:as.integer(repeatCount)){
        print(paste("hello", "console", i))
    }
    return(repeatCount)
}
{% endhighlight %}

is evaluated to:

greetConsole <- function(repeatCount){
    for (i in 1:as.integer(repeatCount)){
        print(paste("hello", "console", i))
    }
    return(repeatCount)
}

and looks like this with line numbers:

1  greetConsole <- function(repeatCount){
2      for (i in 1:as.integer(repeatCount)){
3          print(paste("hello", "console", i))
4      }
5      return(repeatCount)
6  }

Note:

  • I have left the line numbering CSS at the default, but this can be changed with CSS rules targeting the .lineno class.

  • Interestingly, this doesn’t do inline syntax highlighting, which is a great segue to the next section.

GitHub flavoured Markdown (GFM)

Somehow this escaped my notice in the second post of this series on Markdown. Basically, GitHub has its own spin on Markdown. GFM has similar syntax highlighting formatting to RMarkdown, and there is a list of GFM language code abbreviations. I’m not sure how to specify line numbers, though.
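
For fenced GFM-style blocks to be highlighted by Jekyll, the Markdown converter has to parse GFM and hand code blocks to a highlighter. Below is a minimal sketch of the relevant _config.yml settings. It assumes kramdown is the Markdown converter, which this post does not confirm, and Jekyll 3 defaults may already cover some of these values.

```yaml
# _config.yml (sketch; assumes kramdown as the Markdown converter)
markdown: kramdown
kramdown:
  input: GFM                 # parse GitHub-flavoured fenced code blocks
  syntax_highlighter: rouge  # pass fenced blocks to Rouge
```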

This is the same code example used previously in this post, with GFM-formatted syntax highlighting. Note that I had to encase the code in a <pre> (preformatted text) tag to get the right formatting. Alternatively, I could have used the HTML entity (&#96;) for the backtick.

``` r
greetConsole <- function(repeatCount){  
    for (i in 1:as.integer(repeatCount)){  
        print(paste("hello", "console", i))  
    }  
    return(repeatCount)  
}  
```

The code above is rendered as follows:

greetConsole <- function(repeatCount){  
    for (i in 1:as.integer(repeatCount)){  
        print(paste("hello", "console", i))  
    }  
    return(repeatCount)  
} 

More notes

GFM doesn’t seem to have inline syntax highlighting either, but since Pandoc does, any syntax highlighting in RMarkdown will be processed before Jekyll gets hold of it. That said, GFM seems to have basic inline code formatting as long as no language is specified, e.g. ``` hello``` renders as hello.

Incidentally, GFM has emoji integration in the format :EMOJICODE:, and here are “all the emojis” :grinning:. They don’t render locally, but are still good to have :smile:.

Conclusion

Syntax highlighting is something that I will be using often to showcase code, so it was good to get a feel for my options. Rouge looks like the system that I will be using, with GFM as a backup option.

Update:

Needed to configure Jekyll for emojis!
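
One way to do this, sketched here on the assumption that the jemoji plugin is used (this post does not name the plugin), is to enable it in _config.yml. The gem also has to be installed, for example through the site's Gemfile.

```yaml
# _config.yml (sketch; assumes the jemoji gem is installed)
plugins:      # older Jekyll 3 releases use the key "gems" instead
  - jemoji
```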

Written on November 25, 2016