Hakyll compiler to include working code samples
(updated: 09/March/2018 - includes, local path & html, css and js support)
Ensuring that the code you include in a blog post is up to date and works can be a bit of a pain. Often I’ll change code while writing a post and then I have to find and copy anything that has changed. This is manual and error prone.
Fortunately Hakyll is reasonably easy to customise. Here I’ll show one way to write a hakyll compiler to help with this issue.
Goal
What I wanted was
- Include code from a git repo
- Work with a specific version of the code
- Check that the code builds
- Check that tests or any other custom actions succeed
- Check that the repo is still accessible
Example template markdown
---
title: testing
---
[<code setup.repo>] https://gist.github.com/53e179c4244411493ae1f9deebc3cc3f.git
[<code setup.sha>] 5a95ece18ecb248fb745b3e7cb19f5c4d410240f
[<code setup.run>] stack init --resolver lts-12.0
[<code setup.run>] stack build
[<code setup.run>] stack test
Some text
[<code>] main
more text
The markdown should be parsed as follows
[<code setup.repo>]
- is the git repo to pull the code from[<code setup.sha>]
- is the commit to work with[<code setup.run>]
- any number of commands to run in order.[<code>] main
- gets the section namedmain
from the code and inserts it as a markdown code block
Example haskell file with sections
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE OverloadedStrings #-}
import Protolude hiding (onException)
import System.FilePath ((</>))
import qualified System.FilePath as FP
import qualified System.Directory as Dir
import Control.Exception.Safe (onException, throwString)
{-! SECTION< main !-}
main :: IO ()
main = hakyll $ do
match "posts/*" $ do
route $ setExtension "html"
compile $ includeCodeCompiler
>>= renderPandoc
{-! SECTION> main !-}
{-! SECTION<
starts a code section{-! SECTION>
ends a code section- The parser will read all sections from all files in the repo, so section names must be unique. The advantage is that you don’t need to worry about finding paths or paths changing later on.
Result
When pandoc is run the include compiler will insert the code from the main
section and add a title showing the source path (repo relative) and the position (line from & to).
app/site.hs (32 to 37)
main :: IO ()
main = hakyll $ do
match "posts/*" $ do
route $ setExtension "html"
compile $ includeCodeCompiler
>>= renderPandoc
Before including the code, the includeCompiler will checkout the code and run the commands specified in the template. In the example template above I’m cloning from a github gist that does not have a stack.yaml so I run stack init
first. You can use the commands to run tests etc to ensure that your sample code is working 100%. If any action fails, the blog generation is aborted.
Constraints
- I only use markdown, so I’m assuming that all input is markdown
- This is not “production” code. I’m doing a lot of work in IO and throwing exceptions to abort on error
- It works for me, feel free to use the code and change it to match your needs.
Code
Customising hakyll
The hakyll tutorial will give you an idea of how to setup hakyll.
This is a fairly standard match clause to run your posts through pandoc to generate HTML output
match "posts/*" $ do
route $ setExtension "html"
compile $ pandocCompiler
>>= loadAndApplyTemplate "templates/post.html" postCtx
>>= saveSnapshot "content"
>>= loadAndApplyTemplate "templates/default.html" postCtx
>>= relativizeUrls
Lets modify this route to use a new compiler named includeCodeCompiler and pipe that output through pandoc
site.hs (32 to 39)
match "posts/*" $ do
route $ setExtension "html"
compile $ includeCodeCompiler
>>= renderPandoc
>>= loadAndApplyTemplate "templates/post.html" postCtx
>>= saveSnapshot "content"
>>= loadAndApplyTemplate "templates/default.html" postCtx
>>= relativizeUrls
The two changes to notice are
- Call includeCompiler rather than pandocCompiler
- The output of includeCompiler is passed to renderPandoc
Preliminaries
Here are the imports I’m using
site.hs (2 to 20)
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE LambdaCase #-}
import Protolude hiding (onException)
import Prelude (String)
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import qualified Data.List as Lst
import qualified Data.Text as Txt
import qualified Data.Text.IO as Txt
import Data.Monoid (mappend)
import Hakyll
import qualified System.Exit as Xit
import qualified System.Process as Proc
import System.FilePath ((</>))
import qualified System.FilePath as FP
import qualified System.Directory as Dir
import Control.Exception.Safe (onException, throwString)
And a few helper functions for running shell processes and finding files
site.hs (220 to 250)
runShell' :: FilePath -> Text -> IO ()
runShell' workingDir cmd = do
putText cmd
runShell workingDir cmd >>= \case
Right _ -> pass
Left e -> throwString $ Txt.unpack "Error running `" <> Txt.unpack cmd <> "` " <> show e
runShell :: FilePath -> Text -> IO (Either Int ())
runShell workingDir cmd = do
let p = Proc.shell $ Txt.unpack cmd
(_, _, _, phandle) <- Proc.createProcess p { Proc.cwd = Just workingDir }
Proc.waitForProcess phandle >>= \case
Xit.ExitSuccess -> pure $ Right ()
Xit.ExitFailure i -> pure $ Left i
getFilesRec :: FilePath -> IO [FilePath]
getFilesRec p = do
fs <- (p </>) <<$>> getFiles p
ds <- (p </>) <<$>> getDirs p
cs <- traverse getFilesRec ds
pure $ fs <> join cs
getDirs :: FilePath -> IO [FilePath]
getDirs p = do
entries <- (p </>) <<$>> Dir.listDirectory p
filterM Dir.doesDirectoryExist entries
getFiles :: FilePath -> IO [FilePath]
getFiles p = do
entries <- (p </>) <<$>> Dir.listDirectory p
filterM Dir.doesFileExist entries
The includeCodeCompiler
site.hs (51 to 66)
includeCodeCompiler :: Compiler (Item String)
includeCodeCompiler = do
p <- getResourceFilePath
getResourceString >>= withItemBody (unsafeCompiler . includeCompile p)
where
includeCompile :: FilePath -> String -> IO String
includeCompile compilingPath source =
includeCompile' source
`onException`
putStr ("Exception compiling includes for: " <> compilingPath)
includeCompile' :: String -> IO String
includeCompile' source = do
let ls1 = Txt.lines $ Txt.pack source
let (sourceNoSetup, repoPath', sha', cmds', path') = getConfig ls1
A pandoc compiler has the type Compiler (Item String)
. Since this compiler needs to read file it has to be able to perform IO. To allow IO the unsafeCompiler
function is used.
So this code, gets the current file path, the item body and starts the includeCompile in IO
onException
is used to print the name of the file being compiled if there is an exception.
Once the config (repo, sha and commands) have been read the main compiler logic can be run.
site.hs (70 to 110)
case path' of
Nothing ->
case (repoPath', sha', cmds') of
(Nothing, Nothing, []) -> pure $ Txt.unpack . Txt.unlines $ sourceNoSetup
(Just _, Nothing, _) -> throwString "No sha found"
(Just _, _, []) -> throwString "No run commands found"
(Nothing, _, (_:_)) -> throwString "No repo setup found"
(Nothing, Just _, []) -> throwString "No repo setup found"
(Just repoPath, Just sha, cmds) -> do
root <- Dir.getCurrentDirectory
let tempPath = root </> tmpDirectory defaultConfiguration </> "codeIncludeGit"
-- Cleanup from previous post
removeDirectoryRecursiveIfExists tempPath
Dir.createDirectoryIfMissing True tempPath
-- Clone the git repo
runShell' root $ "git clone \"" <> repoPath <> "\" \"" <> Txt.pack tempPath <> "\""
-- Goto the correct sha (if it was specified)
gotoSha sha tempPath
-- Execute the run commands (buid, test etc)
executeRunCommands cmds tempPath
-- Delete all dirs we are not interested in (exclude .git and .stack-work)
removeDirectoryRecursiveIfExists $ tempPath </> ".git"
removeDirectoryRecursiveIfExists $ tempPath </> ".stack-work"
includeCode tempPath repoPath sha sourceNoSetup
Just path ->
includeCode (Txt.unpack path) "**local**" "**local**" sourceNoSetup
includeCode tempPath repoPath sha sourceNoSetup = do
-- Get all files in the repo
files <- getFilesRec tempPath
-- All sections from all files
sections' <- Map.fromList . concat <$> traverse getSections files
let sections = Map.map (\(p, s, e, lang, ls) -> (drop (length tempPath + 1) p, s, e, lang, ls)) sections'
-- Replace sections in the file
replaced' <- traverse (replaceCodeLineSection tempPath sections) sourceNoSetup
let replaced = Txt.unlines . concat $ replaced'
-- Replace sha and repo tokens
pure . Txt.unpack . Txt.replace "2297510b93a903ab23a319f7921351a9725cef0e" sha $ Txt.replace "https://gist.github.com/53e179c4244411493ae1f9deebc3cc3f.git" repoPath replaced
site.hs (114 to 121)
executeRunCommands :: [Text] -> FilePath -> IO ()
executeRunCommands cmds path =
traverse_ (runShell' path) cmds
gotoSha :: Text -> FilePath -> IO ()
gotoSha sha tmpPath = do
runShell' tmpPath ("git reset " <> sha <> " --hard")
void $ runShell tmpPath "git clean -dfx"
This code does the following
- Pre-clone cleanup
- Clone
- Goto the configured commit
- Run the commands
- Read all the sections from the files
- Import the sections into the markdown
Details
Loading the config is done quite simply by filtering the source lines
site.hs (126 to 142)
getConfig ls =
let
cfgPath = "[<code setup.path>]"
cfgRepo = "[<code setup.repo>]"
cfgSha = "[<code setup.sha>]"
cfgRun = "[<code setup.run>]"
path = Txt.strip . Txt.drop (Txt.length cfgRepo) <$> headMay (filter (Txt.isPrefixOf cfgPath) ls)
repo = Txt.strip . Txt.drop (Txt.length cfgRepo) <$> headMay (filter (Txt.isPrefixOf cfgRepo) ls)
sha = Txt.strip . Txt.drop (Txt.length cfgSha) <$> headMay (filter (Txt.isPrefixOf cfgSha) ls)
run = Txt.strip . Txt.drop (Txt.length cfgRun) <$> filter (Txt.isPrefixOf cfgRun) ls
in
(filter (not . Txt.isPrefixOf "[<code setup.") ls, repo, sha, run, path)
removeDirectoryRecursiveIfExists p =
Dir.doesDirectoryExist p >>= \case
True -> Dir.removeDirectoryRecursive p
False -> pass
And once the sections have been loaded from the source code the tags can be replaced in the markdown. Each [<code>]
tag is replaced by a markdown code block, a title showing the source file and offset.
site.hs (147 to 175)
replaceCodeLineSection :: FilePath -> Map Text (FilePath, Int, Int, Text, [Text]) -> Text -> IO [Text]
replaceCodeLineSection tempPath sections line = do
let codeTag = "[<code>]"
let includeTag = "[<include>]"
if Txt.isPrefixOf codeTag line
then
let secName = Txt.strip . Txt.drop (Txt.length codeTag) $ line in
case Map.lookup secName sections of
Nothing -> throwString $ Txt.unpack $ "No section named " <> secName
Just (path, start, end, lang, code) ->
let title = Txt.pack path <> " (" <> show start <> " to " <> show end <> ")" in
pure [ "###### " <> title
, ""
, "~~~{." <> lang <> "}"
, Txt.unlines code
, "~~~"
, ""
]
else
if not $ Txt.isPrefixOf includeTag line
then pure [line]
else
let
incRelPath = Txt.strip . Txt.drop (Txt.length includeTag) $ line
incFullPath = tempPath </> Txt.unpack incRelPath
in
sequenceA [Txt.readFile incFullPath]
Getting sections from the repo
site.hs (181 to 215)
type LineState = (Int, [(Text, (FilePath, Int, Int, Text, [Text]))])
getSections :: FilePath -> IO [(Text, (FilePath, Int, Int, Text, [Text]))]
getSections f =
case FP.takeExtension f of
".hs" -> getLangSections "{-! SECTION< " "{-! SECTION> " "{-! SECTION" "haskell"
".js" -> getLangSections "//!SECTION< " "//!SECTION> " "//!SECTION" "javascript"
".html" -> getLangSections "<!-- !SECTION+ " "<!-- !SECTION- " "<!-- !SECTION" "html"
".css" -> getLangSections "/* !SECTION< " "/* !SECTION> " "/* !SECTION" "html"
_ -> pure []
where
getLangSections startToken endToken cleanToken lang = do
ls <- Txt.lines <$> Txt.readFile f
(_, r) <- foldlM (parseLine ls) (1, []) ls
pure r
where
parseLine :: [Text] -> LineState -> Text -> IO LineState
parseLine ls (lineNum, hist) l =
if not . Txt.isPrefixOf startToken . Txt.strip $ l
then pure (lineNum + 1, hist)
else do
let secName = Txt.strip . fst . Txt.breakOn " " . Txt.strip . Txt.drop (Txt.length startToken) . Txt.strip $ l
end <- scanForEnd ls secName lineNum
pure (lineNum + 1, (secName, (f, lineNum + 1, lineNum + length end, lang, end)) : hist)
scanForEnd ls secName fromLine =
let fromOffset = drop fromLine ls in
case Lst.span (not . Txt.isPrefixOf (endToken <> secName) . Txt.strip) fromOffset of
(_, []) -> throwString $ "No section end found for: " <> Txt.unpack secName
(r, _) -> pure $ filter cleanLine r
cleanLine =
not . Txt.isPrefixOf cleanToken . Txt.strip
Different types of files will need different tag styles. In the code above I’m handling haskell, javascript, css and HTML. You should be able to fairly easily add this to other languages as well.
parseLine
works by going line by line looking for a start token, and for each one that it finds it scans to find the end token. This is a little inefficient but it allows for nested and/or overlapping tags.
Code includes
Sometimes it is useful to include external files into a post. The [<include>]
tag makes this simple.
For example [<include>] /home/user/static/interestingStuff.json
. Unlike the [<code>]
tag, no assumptions are made about the included text. If you want it syntax highlighted simply wrap the text in a code fence.
Speeding up the writing process
Fetching the code from a remote repo and doing a full build each time can be pretty slow. This is fine when you are confirming that everything works correctly, but its not idea when you are writing a post and still making many small changes. To help with this there is a [<code setup.path>]
tag. This tag overrides the repo and run settings. If it is present then all code sections will be read from this path directly without any fetching, building or running of commands.
e.g. [<code setup.path>] /home/user/dev/myProject
Obviously it is important that you remove this setting once the post is done.
Using the compiler
Once you add this to your hakyll you can be sure that you are only using working code blocks. While there is a bit of code in this compiler most of it is for dealing with the file IO and parsing. I think it also shows how easily hakyll can be customised to do useful things.
This code works with hakyll 4.10.0.0. See the cabal file in the gist for other dependencies
Links
- Code on github 2297510b93a903ab23a319f7921351a9725cef0e
- Hakyll tutorial
- The simplest custom Hakyll compiler