I recently worked on a project to port Pombola, a Django site to a static site. It was a legacy project with a huge amount of code that had several apps for different per-country versions so that they could use the same architecture and code. This means that in order to work on one country version, you had to download the whole repo, including the other countries' versions of the site. There were plans to extract the core into a Python package so that each site could live in its own repo, but that never happened.
Code always rots, in particular if you don't keep it tidy as you go. The problem is that requirements from users and clients always pile-up, and are always required "
<em>for yesterday</em>". Hence you make a fast patch that provides the required functionality and you promise yourself that you will come back in the future to fix it. We know that is a lie 😂.
A bad codebase only gets worse as you pile-up bad code over bad-code because it's simply too difficult (if not impossible) to do things the right way.
You can see a dedicated page with details about this project in my portfolio. Keep reading for the technical details on how it was built.
- You have a CMS to add content to the site, which means that non-technical users can manage it.
- You have interactive elements like comments, search and contact forms baked into the architecture.
- Slow because it talks to a database and it needs resources from the server to generate the pages on the fly.
- Needs some sort of caching of pages that are requested frequently, so that you don't have to dynamicly create them each time, and can save hosting resources. But this is not easy to get right!
- You have to watch out for security issues and attacks to your database (good ol' SQL injection and similar).
- Maintenance cost is high, and you need to do backups of the database contents frequently and test them.
- Hosting cost is high
- Database management, migrations etc. is WORK.
- Super fast load times.
- Hosting is easy and cheap, all your host needs to support is html, some (like Gitlab pages) even support several static site generators.
- Easy to build and maintain, fast to develop.
- A bit more secure.
- Your content can be kept under very granular version-control with a resource like prose.io.
- Loads of static site generators to choose from, see for example the ones offered by GitLab.
- No CMS, so until now they remained mainly for tech users; however this is less of a problem today since there are tools like prose.io and mavo.io, among many others
- You need third party services for things like comments, contact forms and search.
You can read a more detailed summary in this Smashing Magazing article "Why Static Site Generators Are The Next Big Thing".
One of our teams was already using a static approach to their product's site, although they started static right away. Basically all they had to do was consume a huge JSON file and spit parts of this JSON file in a human-digestible way in different pages. You could think of that JSON file as their database. They had tried Jekyll but it turned out too complex for what they wanted to do. I don't think they tried other static-site generators out there, probably they didn't know of them.
For example, Middleman is a static-site generator powered by Ruby, so you can still write your own business-logic in Ruby-plain-objects. But the only static-site generator that Github supports automatically in their GitHub pages service is Jekyll, and they were hosting the site in Github (Gitlab supports a bunch of static-site generators, though, including Middleman).
So they created a new approach: they built a Sinatra app with proper service objects etc., then run the app locally and
wget all its pages. Finally, they push this plain HTML pages to a
gh-pages branch on a repo. I believe this happens automatically (through Travis maybe?) everytime a PR is merged to master.
Sinatra gives you great flexibility with less framework code, so that could have been a reason for choosing it over a static-site generator. I worked on that project too.
We decided to use the same approach to port the Django app into a static site.
We had to handle both new content and old content.
First we needed to find a way for users to enter content, and that content to be persisted. We stored the content in a repository in GitHub. Then used the integration that prose.io provides with GitHub.
This is an open-source project that provides a GUI to files stored in GitHub, so that non-technical users can add, update and delete content easily. Each change is recorded in the main branch as a GitHub commit, which means that the content is version-controlled. Also, since this tool is open-source, we could submit PR for any functionality that we required.
The way we used it is the following:
- We keep "the prose repo", where users enter content. Things that live in this repo are blog posts, static pages, anything that is just content.
- In the repo with the Sinatra code that powers the site, we create a
prosefolder. Then we tell Sinatra that the content files are there.
- We clone the prose repo in that folder.
- We add this folder to the Sinatra repo's
Having the prose repo living there allows us to easily make changes in context anytime we need to update any file in the prose repo, like the configuration file or the Travis file (we use a Travis file to deploy to staging).
To port the old content (like old pages, images and old blog posts) from the Django website, we wrote some scripts that would read what was stored in the database and spit it in the form of markdown files with frontmatter. You may be familiar with this format if you ever used Jekyll.
The reason to do this is because prose was made originally for Jekyll blogs, so it produces markdown files with frontmatter in them. You can define the fields that you want your frontmatter to have in prose's config file. The values of these fields can be edited from prose's GUI by the non-technical users. If they click on the "metadata" icon, they will be presented with a form listing all editable fields. Some of them, like the dropdowns or the published checkbox, are very powerful.
We used these fields in the code behind in the Sinatra app to manipulate the data and decide how and what we would present in the views.
The deployment is done automatically through a script. This is what the script does:
- The Sinatra repo is cloned
prosefolder is created under the root of the project, and the prose repo is cloned inside it.
- The Sinatra app is fired.
- We crawl it with
- We push these html pages to the
gh-pagesbranch of our staging or production repo.
Prose makes automatic URL generation very easy, because when you create a new file, it is automatically saved with a filename built from the date and a dashified version of the title.
This is very convenient for new sites, because you can just have a dynamic route in Sinatra, and tell it to go search a file with that slug as filename from a specific folder. Then it's a matter of reading the contents and the frontmatter fields and doing whatever you want with it.
We had a complication though: since we were porting from an existing site, we had to keep the same slugs for every page as the slugs that already existed. Luckily, those were stored in the database, so it was just a matter of adding extra code in the script to deal with that. Basically, we used the creation date of the record in the table, followed by the slug saved in the database for that page, as the name of the file.
We wanted to serve images from a proxy. Prose lets you define a folder to save your images in the same repository where all the content lives. Then, in the GUI, whenever you want to add an image, you can either upload a new image from your computer or select an existing image from that folder.
If you don't provide a specific fullpath to an image or resource, but a relative path instead, Sinatra will assume that the resource has to be read from the current app, so it will search for the resource in the
public directory by default, or in any custom directory you specify in Sinatra settings.