Generate a static site with a POSIX-compliant shell script, using find, grep, sed, pandoc and vim
After going through static site generators like Jekyll and Hugo, I was still looking for something very simple and minimal that would just generate a static site from markdown files. Hugo, Jekyll, Hexo and similar site generators are really good and very user friendly. If all you want is a functioning website, those are your go-to. But they come with a lot of added functionality that I had no use for. Then I came to know of a script called SSG5, written by Roman Zolotarev, and this is what I used to build this blogsite of mine. You can also get my modified version of the script from my github.
SSG5 - the minimal “cooler” site generator
SSG5 is a POSIX-compliant shell script that simply generates html files from markdown and wraps them with the _header.html and _footer.html files you provide. And thus you get your site. So simple, right?
All your posts and contents in markdown should be stored in a directory named “src”. You can write a home page for the site by putting in an index.md or index.html file in the src directory.
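To picture the wrapping step described above, here is a minimal sketch. It is not code taken from ssg5: the function name wrap_page and its argument order are made up for illustration, and it assumes a missing header or footer should simply be skipped.

```shell
# wrap_page is a hypothetical helper name, not a function taken from ssg5.
# Usage: wrap_page HEADER BODY FOOTER > page.html
wrap_page() {
    header=$1
    body=$2
    footer=$3
    # Assumption of this sketch: a missing header or footer is skipped silently.
    if [ -f "$header" ]; then cat "$header"; fi
    cat "$body"
    if [ -f "$footer" ]; then cat "$footer"; fi
}
```

Something like `wrap_page src/_header.html post.html src/_footer.html > dst/post.html` would then produce one finished page.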
Installation
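You can install ssg5 on your machine by doing the following (this assumes an ftp client that can fetch URLs, such as OpenBSD's; on other systems, any downloader such as curl or wget can fetch the same URL):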
mkdir -p bin
ftp -vo bin/ssg5 https://rgz.ee/bin/ssg5
chmod +x bin/ssg5
And you've got your ssg5! You can copy the executable over to the /usr/local/bin directory to use the ssg5 command directly.
Now ssg5 uses lowdown or markdown.pl by default to convert markdown files to html files. For that you can install lowdown. But I am a pandoc user already, so I modified the script to use pandoc, more on it later.
Usage
To generate your site, run the following:
PATH="$HOME/bin:$PATH"
mkdir src dst
echo '# Hello, World!' > src/index.md
echo '<html><title></title>' > src/_header.html
bin/ssg5 src dst 'Test' 'http://www'
firefox dst/index.html
And you got your simple site!
Styling
Well, a site without css doesn’t look that good, right? It’s 2020! It definitely does not look cool. So for styling, you can define your styles in a style.css file in the root directory or in a separate css folder, and link the file in your _header.html by adding the following line:
<link rel="stylesheet" type="text/css" href="/css/style.css">
My customizations
Pandoc
I have been using pandoc for quite a while in order to make notes from markdown. So I modified the ssg5 script to use pandoc instead of lowdown or markdown.pl to generate html from md files. It is very simple actually. The pandoc command is basically
pandoc <filename>
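As a rough sketch of what a pandoc-based rendering step could look like (this is not the exact code from my script): the function name render_md_with_pandoc is hypothetical, and the sketch assumes every .md file under the source directory should become an .html file at the same relative path in the destination.

```shell
# render_md_with_pandoc is a hypothetical name for this sketch.
# Usage: render_md_with_pandoc SRC DST
render_md_with_pandoc() {
    src=$1
    dst=$2
    find "$src" -type f -name '*.md' | while read -r f; do
        rel=${f#"$src"/}               # path relative to the source directory
        out="$dst/${rel%.md}.html"     # same relative path, .html extension
        mkdir -p "$(dirname "$out")"
        # -f/-t name the input and output formats; the result is an HTML
        # fragment, so the header and footer still have to be wrapped around it.
        pandoc -f markdown -t html "$f" > "$out"
    done
}
```

File names containing newlines would confuse the read loop, which is acceptable for a blog directory but worth knowing.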
Reducing build time using bash multithreading
Some degree of “threading” can be implemented in a shell script by backgrounding the pandoc command. This lets multiple pandoc conversions run simultaneously and saves a considerable amount of time. It is especially effective if you have a large blogsite with a lot of markdown files. Let’s have a demonstration of how much time we can save by implementing this.
- Without threading:
34026 blocks
[ssg] 269 files, 20 urls
../ssg5 src dst "fmash16's blog" "https://fmash16.github.io" 3.08s user 0.59s system 66% cpu 5.518 total
- With threading:
34026 blocks
[ssg] 269 files, 20 urls
../ssg5 src dst "fmash16's blog" "https://fmash16.github.io" 1.76s user 0.67s system 85% cpu 2.825 total
From 5.5s to 2.8s, that’s quite an improvement, right? The time got almost halved. It will be much more helpful for large sites.
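The backgrounding idea boils down to a trailing & on the conversion command and a wait at the end. The sketch below is illustrative only: convert_one and convert_all are made-up names, and it assumes a flat directory of .md files to keep things short.

```shell
# convert_one stands in for whatever command renders a single file
# (an assumption of this sketch; here it calls pandoc).
convert_one() {
    pandoc -f markdown -t html "$1"
}

# Usage: convert_all SRC DST   (flat directory only, to keep the sketch short)
convert_all() {
    src=$1
    dst=$2
    for f in "$src"/*.md; do
        [ -f "$f" ] || continue
        name=$(basename "$f" .md)
        # The trailing & starts the conversion as a background job,
        # so the loop moves on to the next file immediately.
        convert_one "$f" > "$dst/$name.html" &
    done
    # Block until every background job has finished.
    wait
}
```

Note that this starts one process per file with no upper limit, so a very large site may want to cap the number of jobs running at once.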
Generate a home page with post list sorted by date
I also needed a good home page that lists all my posts sorted by date, with their description and image. I added a function, generate_post_list, to the script that does just that.
The function simply pulls the title, date, description, tags and title image from the header part of each markdown file using grep and awk. It then writes them, with links to the posts, into an index.html file in the src directory, which is rendered as the home page. For that, the header of each md file should be in the following format:
Title:
Date: YYYY-MM-DD (this format is preferred for sorting by date)
Description:
Tags:
Image:
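In order to sort the posts by date, I used vim. The following vim command sorts the post blocks in the home page by date: it joins each block between the SortByDate markers into a single line, sorts those lines in reverse order (newest first for YYYY-MM-DD dates), splits them back into separate lines, and finally deletes the marker lines.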
vim -N -u NONE -n \
-c "set nomore" \
-c ":g/SortByDate:/,/SortByDate/ s/$\n/@@@" \
-c ":sort!" -c ":%s/@@@/\r/g" -c ":g/SortByDate/d" \
-c ":wq" "$src/index.html"
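For anyone who would rather not depend on vim, the same join, sort and split idea can be sketched with awk and sort. This is an alternative technique, not what my script does: sort_posts is a hypothetical name, it prints the sorted result to stdout instead of editing the file in place, and it assumes the markers sit at the start of a line and that @@@ never appears in a post block.

```shell
# sort_posts is a hypothetical helper. Usage: sort_posts FILE > sorted.html
sort_posts() {
    awk '
        # Block start: remember the marker line, which carries the date.
        /^SortByDate:/ { inblock = 1; line = $0; next }
        # Block end: emit the whole block as a single line.
        /^SortByDate$/ { print line; inblock = 0; next }
        # Inside a block: glue the line on with a @@@ separator.
        inblock        { line = line "@@@" $0 }
    ' "$1" | sort -r | awk '
        # Split each block back into lines; part[1] is the marker, so skip it.
        {
            n = split($0, part, "@@@")
            for (i = 2; i <= n; i++) print part[i]
        }
    '
}
```

Lines outside the marker pairs are dropped, so the output contains only the post blocks, newest first.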
And the generate_post_list function that I added is as simple as the following:
generate_post_list() {
    # $1 = source directory; the list of files to index is read from stdin
    if [ -f "$1/index.html" ]; then
        rm "$1/index.html"
    fi
    while read -r f
    do
        if [ "$f" = "./index.html" ] || [ "$f" = "./about.md" ]; then
            continue
        fi
        # Take the first matching header line and keep what follows ": "
        TITLE=$(grep -i '^title:' "$1/$f" | head -n1 | awk -F ": " '{ print $2 }')
        DATE=$(grep -i '^date:' "$1/$f" | head -n1 | awk -F ": " '{ print $2 }')
        DESC=$(grep -i '^description:' "$1/$f" | head -n1 | awk -F ": " '{ print $2 }')
        TAGS=$(grep -i '^tags:' "$1/$f" | head -n1 | awk -F ": " '{ print $2 }')
        IMAGE=$(grep -i '^image:' "$1/$f" | head -n1 | awk -F ": " '{ print $2 }')
        # The SortByDate markers delimit one post block for the sorting step
        cat >> "$1/index.html" <<EOF
SortByDate:$DATE
<h1 id="$TITLE" style="border-bottom: 0px; padding-bottom: 0em;">
<a href="/${f%.md}.html" style="color:#111" >$TITLE</a>
</h1>
<p class="date">$DATE</p>
<p>$DESC<br/>
<strong>Tags:</strong>$TAGS</p>
<div style="text-align: center; border-bottom: 1px solid #ddd; padding-bottom: 0.5em;">
<a href="/${f%.md}.html"><img src="$IMAGE" style="max-width:55vw; max-height:40vh;"/></a>
</div>
SortByDate

EOF
    done
}
Generate a post archive
A feature added recently is a post archive sorted by date. It follows a technique similar to the one used to generate the post list. The archive is stored in the file archive.html, which can be linked to from the home page using the _header.html file.
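One possible shape for such an archive generator is sketched below. It is a guess at the idea rather than the code from my script: generate_archive is a hypothetical name, the list markup is invented for illustration, and it assumes the Title: and Date: header lines start at the beginning of a line with exactly that capitalization.

```shell
# generate_archive is a hypothetical helper. Usage: generate_archive SRC > archive.html
generate_archive() {
    src=$1
    echo '<ul>'
    find "$src" -type f -name '*.md' | while read -r f; do
        # First Title:/Date: header line of the file, without the field name.
        title=$(sed -n 's/^Title:[[:space:]]*//p' "$f" | head -n 1)
        date=$(sed -n 's/^Date:[[:space:]]*//p' "$f" | head -n 1)
        rel=${f#"$src"/}
        # Each entry starts with the date, so a plain reverse sort
        # puts the newest post first (for YYYY-MM-DD dates).
        printf '<li>%s: <a href="/%s.html">%s</a></li>\n' "$date" "${rel%.md}" "$title"
    done | sort -r
    echo '</ul>'
}
```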
Pagination
This is a feature that I felt was very necessary as my blogsite grew in size and the homepage was getting cluttered with lots of posts. Paginating the home page would make it more organized with only a fixed number of posts in a single page. This meant the post list had to be distributed over a number of index pages.
I integrated a very simple script into the ssg5 script to implement the pagination feature. The post list is generated and sorted by vim as discussed above. Then, using simple commands like awk, head and cat, the index page is divided into pages with a specified number of posts each. The code used is:
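# Pagination: split the sorted index into pages of 12 posts each
index_file=0
while true
do
    # Line number of the 13th post heading, i.e. where the next page starts
    line=$(awk '/<h2 id/ && ++n==13 {print NR}' "$src/index.html")
    if [ -z "$line" ]; then
        break
    fi
    # head takes the lines before that heading; cat gets the rest of the same stdin
    (head -n "$((line-1))" > "$src/index$index_file.html"; cat > "$src/index$((index_file+1)).html") < "$src/index.html"
    cp "$src/index$((index_file+1)).html" "$src/index.html"
    printf '<center><a href="index%s.html">Next&gt;&gt;</a></center>\n' "$((index_file+1))" >> "$src/index$index_file.html"
    index_file=$((index_file+1))
done
# The first page becomes the home page (index0.html only exists if a split happened)
if [ -f "$src/index0.html" ]; then
    mv "$src/index0.html" "$src/index.html"
fi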
What I did here:
- Used awk to get the line number of the 13th post by pattern matching for the sequence <h2 id, as that is what I used to generate the heading for each post in the index.
- Used the head command to select the text up to that line number and put it into a new index file named index{%d}.html, put the rest into the index.html file, and repeated the process until no post was left.
- Added a link to the next page at the end of each index page.
And that’s it. I am not very proud of the code :). And I admit that the code is very clumsy. But it just works for me. It does what I want. Anyone else is free to jump in and edit it according to their needs.
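As one example of such an edit, the same pagination can be sketched as a single awk pass. This is an alternative, not the code my script uses: paginate is a hypothetical name, the heading marker and page size are passed in as arguments, and the input file is assumed to live outside the output directory (or under a name other than index.html), since index.html is one of the files written.

```shell
# paginate is a hypothetical helper. Usage: paginate FILE PER_PAGE MARKER OUTDIR
# Writes OUTDIR/index.html, OUTDIR/index1.html, OUTDIR/index2.html, ...
paginate() {
    awk -v per="$2" -v marker="$3" -v dir="$4" '
        function page_name(p) { return dir "/index" (p == 0 ? "" : p) ".html" }
        BEGIN { out = page_name(0) }
        # A line containing the marker starts a new post.
        index($0, marker) > 0 {
            if (count == per) {
                # Current page is full: add the Next link, then switch files.
                page++
                printf "<center><a href=\"index%d.html\">Next&gt;&gt;</a></center>\n", page > out
                close(out)
                out = page_name(page)
                count = 0
            }
            count++
        }
        { print > out }
    ' "$1"
}
```

For instance, `paginate posts.html 12 '<h2 id' dst` would give 12 posts per page, with posts.html standing in for wherever the sorted post list was written.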
My modified version of ssg5
If you want your site to use pandoc and have a custom home page with a post list sorted by date like mine, you can have a copy of the modified ssg5 script I use on my github.
And finally, Happy blogging!