___               _   ___   ___ 
|  _|_____ ___ ___| |_|_  | |  _|
|  _|     | .'|_ -|   |_| |_| . |
|_| |_|_|_|__,|___|_|_|_____|___|
                 u/fmash16's page

Generate a static site with a POSIX-compliant shell script, using find, grep, sed, pandoc and vim

After going through static site generators like Jekyll and Hugo, I was still looking for something very simple and minimal that would just generate a static site from markdown files. Hugo, Jekyll, Hexo and other site generators like these are really good and very user friendly. If all you want is a functioning website, those are your go-to. But they have a lot of added functionality that I did not have any use for. Then I came to know of a script called SSG5, written by Roman Zolotarev, and this is what I used to build this blogsite of mine. You can also get my modified version of the script from my github.

SSG5 - the minimal “cooler” site generator

SSG5 is a POSIX-compliant shell script that simply generates html files from markdown and wraps them with the _header.html and _footer.html files you provide. And thus you get your site. So simple, right?

All your posts and other content in markdown should be stored in a directory named “src”. You can write a home page for the site by putting an index.md or index.html file in the src directory.
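To make the wrapping idea concrete, here is a minimal sketch. It is not the actual ssg5 code: the function name wrap_page is made up for illustration, it assumes the page body has already been converted to html, and it relies on mktemp -d, which is widely available but not strictly POSIX.

```shell
#!/bin/sh
# Hypothetical helper, not taken from ssg5: wrap an html body with the
# header and footer files found in the source directory.
# Usage: wrap_page SRC_DIR BODY_FILE > OUTPUT_FILE
wrap_page() {
  src_dir=$1
  body=$2
  # In this sketch the header and footer are optional.
  if [ -f "$src_dir/_header.html" ]; then cat "$src_dir/_header.html"; fi
  cat "$body"
  if [ -f "$src_dir/_footer.html" ]; then cat "$src_dir/_footer.html"; fi
}

# Small demonstration in a throwaway directory.
demo=$(mktemp -d)
printf '<html><title>Test</title>\n' > "$demo/_header.html"
printf '</html>\n' > "$demo/_footer.html"
printf '<h1>Hello, World!</h1>\n' > "$demo/body.html"
wrap_page "$demo" "$demo/body.html" > "$demo/index.html"
cat "$demo/index.html"
```

The real script does more than this (it walks the whole src tree and sets page titles, for example), but the core of "header + converted body + footer" is just concatenation.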

Installation

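You can install ssg5 on your machine by doing the following (the ftp invocation is the OpenBSD one; on other systems a tool like curl or wget can download the same file):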

mkdir -p bin
ftp -vo bin/ssg5 https://rgz.ee/bin/ssg5
chmod +x bin/ssg5

And you got your ssg5! You can copy the executable over to the /usr/local/bin directory to directly use the command ssg5.

Now ssg5 uses lowdown or Markdown.pl by default to convert markdown files to html files. For that you can install lowdown. But I am a pandoc user already, so I modified the script to use pandoc. More on that later.

Usage

In order to generate your site, run the following:

PATH="$HOME/bin:$PATH"
mkdir src dst
echo '# Hello, World!' > src/index.md
echo '<html><title></title>' > src/_header.html
bin/ssg5 src dst 'Test' 'http://www'
firefox dst/index.html

And you got your simple site!

Styling

Well, a site without css doesn’t look that good, right? It’s 2020! It definitely does not look cool. So for styling, you can define your styles in a style.css file in the root directory or in a separate css folder, and link the file in your _header.html by adding the following line:

<link rel="stylesheet" type="text/css" href="/css/style.css">

My customizations

Pandoc

I have been using pandoc for quite a while in order to make notes from markdown. So I modified the ssg5 script to use pandoc instead of lowdown or markdown.pl to generate html from md files. It is very simple actually. The pandoc command is basically

pandoc <filename>

Reducing build time using bash multithreading

Some degree of “threading” can be implemented in a shell script by backgrounding the pandoc command. This enables multiple pandoc conversions to run simultaneously and saves a considerable amount of time. This will be very effective if you have a large blogsite with a lot of markdown files, for example. Let’s have a demonstration of how much time we can save by implementing this, with the first run before the change and the second after.

34026 blocks
[ssg] 269 files, 20 urls
../ssg5 src dst "fmash16's blog" "https://fmash16.github.io"  3.08s user 0.59s system 66% cpu 5.518 total
34026 blocks
[ssg] 269 files, 20 urls
../ssg5 src dst "fmash16's blog" "https://fmash16.github.io"  1.76s user 0.67s system 85% cpu 2.825 total

From 5.5s to 2.8s, that’s quite an improvement, right? The time got almost halved. It will be much more helpful for large sites.
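The backgrounding idea can be sketched roughly as follows. This is not the exact change made to ssg5: convert_all is a hypothetical name, the sketch only looks at .md files directly inside the source directory (ssg5 itself walks the tree with find), and the converter command is passed as arguments so that pandoc can be swapped for anything that reads markdown on stdin and writes html on stdout.

```shell
#!/bin/sh
# Sketch only: convert every .md file in a directory in the background,
# then wait for all of the conversions to finish.
# Usage: convert_all SRC_DIR DST_DIR CONVERTER [ARGS...]
convert_all() {
  src=$1
  dst=$2
  shift 2   # whatever is left in "$@" is the converter command
  for f in "$src"/*.md; do
    [ -f "$f" ] || continue
    out="$dst/$(basename "$f" .md).html"
    # The trailing & starts this conversion without waiting for it.
    "$@" < "$f" > "$out" &
  done
  # Block until every background job has exited.
  wait
}

# Example call, assuming pandoc is installed:
#   convert_all src dst pandoc -f markdown -t html
```

Note that this starts one process per file with no upper limit. That is fine for a few hundred posts, but a very large site might want to cap the number of jobs running at once.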

Generate a home page with post list sorted by date

I also needed a good home page that will list all my posts sorted by date with their description and image. I added a function generate_post_list to the script that does just that.

The function simply pulls the title, date, description, tags and title image from the header part of each markdown file using grep and awk. It then writes them, with links to the posts, into an index.html file in the src directory, which is rendered as the home page. For that, the header of the md file should be in the following format:

Title:
Date: YYYY-MM-DD (this format is preferred for sorting by date)
Description:
Tags:
Image:
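As a small illustration of how one of these fields can be read with grep and awk, here is a sketch. The helper name get_field and the sample post values are made up for the example. Unlike a plain grep -i title, it anchors the match to the start of the line so that body text mentioning the word is not picked up.

```shell
#!/bin/sh
# Sketch: read one "Key: value" field from a post header.
# get_field is a hypothetical helper. $1 = field name, $2 = markdown file.
get_field() {
  # -i ignores case, head keeps only the first match,
  # and awk prints the text that follows the first ": ".
  grep -i "^$1:" "$2" | head -n1 | awk -F ": " '{ print $2 }'
}

# Sample post header with made-up values.
post=$(mktemp)
cat > "$post" <<'EOF'
Title: My first post
Date: 2020-05-17
Description: A short example
Tags: shell, blog
Image: /images/example.png
EOF

get_field title "$post"   # prints: My first post
get_field date "$post"    # prints: 2020-05-17
```

One limitation of splitting on ": " is that a value which itself contains a colon followed by a space gets cut off at that point.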

In order to sort the posts by date, I used vim. The following vim command simply sorts the post blocks in the home page by date:

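# Join each post block into one line, sort the lines in reverse (newest
# first), split the blocks again, and delete the SortByDate marker lines.
vim -N -u NONE -n \
  -c "set nomore" \
  -c ":g/SortByDate:/,/SortByDate/ s/$\n/@@@" \
  -c ":sort!" -c ":%s/@@@/\r/g" -c ":g/SortByDate/d" \
  -c ":wq" "$src/index.html"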
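The join, sort, split trick is perhaps easier to follow outside of vim. Here is a sketch of the same idea using awk and sort instead. This is a swapped-in technique for illustration, not what the script does; sort_blocks is a hypothetical name, and unlike the vim version it drops any lines that sit outside a post block.

```shell
#!/bin/sh
# Sketch of join / sort / split with awk and sort instead of vim.
# Reads the post list on stdin and writes the blocks newest-first on
# stdout, without the SortByDate marker lines.
sort_blocks() {
  awk '
    # A block starts at its "SortByDate:<date>" line.
    /^SortByDate:/ { block = $0; inside = 1; next }
    # A bare "SortByDate" line ends the block: emit it as one line.
    inside && /^SortByDate$/ { print block; inside = 0; next }
    # Everything in between is joined with @@@.
    inside { block = block "@@@" $0 }
  ' |
  sort -r |
  awk '{
    # Split each joined line back into its original lines.
    n = split($0, part, "@@@")
    # part[1] is the marker line, so start printing at part[2].
    for (i = 2; i <= n; i++) print part[i]
  }'
}
```

Because every joined line begins with SortByDate:YYYY-MM-DD, a plain reverse text sort puts the newest post first. That is also why the YYYY-MM-DD date format matters.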

And the generate_post_list function that I added is as follows:

generate_post_list() {
  # Start from a fresh index.html; -f keeps rm quiet if none exists yet.
  rm -f "$1/index.html"
  # Post paths, relative to $1, are read from stdin one per line.
  while read -r f
  do
    # Skip pages that are not posts. POSIX test uses =, not ==.
    if [ "$f" = "./index.html" ] || [ "$f" = "./about.md" ]; then
      continue
    fi
    # Take each field from the first matching line of the post.
    TITLE=$(grep -i title "$1/$f" | head -n1 | awk -F ": " '{ print $2 }')
    DATE=$(grep -i date "$1/$f" | head -n1 | awk -F ": " '{ print $2 }')
    DESC=$(grep -i description "$1/$f" | head -n1 | awk -F ": " '{ print $2 }')
    TAGS=$(grep -i tags "$1/$f" | head -n1 | awk -F ": " '{ print $2 }')
    IMAGE=$(grep -i image "$1/$f" | head -n1 | awk -F ": " '{ print $2 }')
    # One block per post; the SortByDate lines are markers for the vim sort.
    # Note: a % in any field would be read as a printf format specifier.
    printf "\
SortByDate:$DATE
<h1 id=\"$TITLE\" style=\"border-bottom: 0px; padding-bottom: 0em;\">\n\
  <a href=\"/${f%\.md}.html\" style=\"color:#111\" >$TITLE</a>\n\
</h1>\n\
<p class=\"date\">$DATE</p>\n\
<p>$DESC<br/>\n\
<strong>Tags:</strong>$TAGS</p>\n\
<div style=\"text-align: center; border-bottom: 1px solid #ddd; padding-bottom: 0.5em;\">\n\
<a href=\"/${f%\.md}.html\"><img src=\"$IMAGE\" style=\"max-width:55vw; max-height:40vh;\"/></a>\n\
</div>\n\

SortByDate\n\n" >> "$1/index.html"
  done
}

Generate a post archive

A new feature added recently is generating a post archive sorted by date. It follows a technique similar to the one used to generate the post list. The archive is stored in the file archive.html, which can be linked to from the home page using the _header.html file.

Pagination

This is a feature that I felt was very necessary as my blogsite grew in size and the homepage was getting cluttered with lots of posts. Paginating the home page would make it more organized with only a fixed number of posts in a single page. This meant the post list had to be distributed over a number of index pages.

I integrated a very simple script into the ssg5 script in order to implement pagination. The post list is generated and sorted by vim as discussed above. Then, using simple commands like awk, head and cat, the index page is divided into pages with a specified number of posts per page. The code used is:


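    # Pagination
    # Assumed starting value: pages are numbered from 0, since index0.html
    # is renamed to index.html at the end.
    index_file=0
    while true
    do
        # Line number of the 13th post heading, i.e. where the next page starts.
        line=$(awk '/<h2 id/ && ++n==13 {print NR}' "$src/index.html")
        if [ -z "$line" ]; then
            break
        fi
        # head takes the first 12 posts; cat writes the rest to the next page.
        (head -n "$((line - 1))" > "$src/index$index_file.html"; cat > "$src/index$((index_file + 1)).html") < "$src/index.html"
        cp "$src/index$((index_file + 1)).html" "$src/index.html"
        printf "\
<center><a href=\"index$((index_file+1)).html\">Next>></a></center>\n" >> "$src/index$index_file.html"
        index_file=$((index_file + 1))
    done
    # With fewer than 13 posts nothing was split, so index0.html does not exist.
    if [ -f "$src/index0.html" ]; then
        mv "$src/index0.html" "$src/index.html"
    fi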

What I did here:

  1. Used awk to get the line number of the 13th post by pattern matching on the sequence <h2 id, as that is what I used to generate the heading for each post in the index.

  2. Used the head command to select the text up to that line number and put it into a new index file named index{%d}.html, then put the rest back into index.html, and repeated the process until fewer than 13 posts remained.

  3. Added a link to the next page at the end of each index page.

And that’s it. I am not very proud of the code :). And I admit that the code is very clumsy. But it just works for me. It does what I want. Anyone else is free to jump in and edit it according to their needs.
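As one possible starting point for such an edit, the page splitting could also be done in a single awk pass instead of the head and cat loop. This is only a sketch of an alternative, not the code from my script: paginate is a hypothetical name, it writes index0.html, index1.html and so on into a directory you choose, and it does not add the "Next" links.

```shell
#!/bin/sh
# Sketch: split a post list into pages in one awk pass.
# Usage: paginate INPUT_FILE OUTPUT_DIR POSTS_PER_PAGE
paginate() {
  awk -v dir="$2" -v per_page="$3" '
    BEGIN { page = 0; out = dir "/index0.html" }
    /<h2 id/ {
      # A new post starts here. Move to the next page once this one is full.
      if (count == per_page) {
        close(out)
        page++
        count = 0
        out = dir "/index" page ".html"
      }
      count++
    }
    # Every input line goes to the page that is currently open.
    { print > out }
  ' "$1"
}

# Example call with 12 posts per page:
#   paginate src/index.html src 12
```

The posts-per-page count is a parameter here rather than the hard-coded 13th heading, which makes it easier to change later.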

My modified version of ssg5

If you want your site to use pandoc and have a custom home page with a post list sorted by date like mine, you can have a copy of the modified ssg5 script I use on my github.

And finally, Happy blogging!