Let’s see how to set up a Gitlab CI/CD process for a Flask app!
This article is also available on Medium.
Recently, I had to prepare a little presentation on CI/CD, automation, and all those DevOps core concepts that help you set up a sane and robust workflow for a project. So I decided to create a sample project to showcase a few CI/CD features, like auto-deploy or unit tests.
I threw together a basic Flask app that just takes in two integers and computes their sum, difference, product or quotient, and then shows the result in another HTML “result page”.
Of course, this is very simple and it would be completely overkill to actually do computations with an online service like this in a real production setting, but it’s a good way of playing around with the basics of automation 🙂
It’s now live here, if you want to test it out!
Except that when the time came to actually add the CI/CD polish on top of it, I realised that I’d never really had the chance to deploy a Flask app before… and that simply using the static Gitlab Pages would obviously not be enough!
So today, I want to show how I finally managed to publish this sample demo project on Heroku and link it to the Gitlab CI/CD tools 🙂
A quick overview of the toolchain
What is CI/CD? And why use it?
Over the years, it’s become common practice to version our code and store it on online Git providers, like Gitlab, Github or Bitbucket for example. This is a good way of having a remote copy of your files, sharing them with others for collaborative dev and keeping a full history of the project’s evolution.
Versioning, however, is just one of the features of these online Git providers. They usually also offer a lot more tools, such as team management, metrics, easy visualisation of files and branches… or CI/CD!
Basically, CI/CD is about automating the build, testing and deployment of your code so that you don’t ever have inconsistencies between your current “up-to-date” code and the version that is in production for the users. Depending on your project, some steps might be removed or added to this, but this is the core idea: by having some checks and updates run regularly, you ensure you have continuous feedback and you can iterate quickly to improve the project.
Since you’re also constantly feeding your production with the latest updates and/or sharing some beta features on other (restricted) staging servers, you can keep your QA team, your beta testers or even your investors informed on what’s going on.
This automated delivery also leads to adding features or fixing bugs in smaller batches, which means that each task is better scoped and can be reviewed more quickly by the right people.
And because everything is run by a computer, you can’t have any weird script to run manually in the middle of the process, or a strange variable to remember to define half-way through: the whole thing has to run smoothly, and on its own. Which is great! This avoids little mistakes and it makes it possible to scale up: you remove the manual steps, so you don’t have a hard limit on how much you can produce with the people in your team anymore 😉
Finally, a cool thing with automation is that it generally prevents you from publishing buggy or unfinished code. If your CI/CD process is set up properly, then there will be some validation that checks everything runs as expected before actually sending the code to production. This way, unless your validation fails to catch some mistake, you don’t push unstable or invalid code to prod, and your clients are far less likely to run into issues!
There are of course other advantages (and gotchas) about automation, but this is already a soft overview of why it’s interesting…
A quick overview of the toolchain
Still, CI/CD isn’t just set up in the twinkling of an eye! When you start to dive into this domain, you’ll quickly discover that DevOps is pretty complex – and it’s no wonder that there are real experts that spend years learning this stuff! 🙂
In my case, overall, the project works in the following way:
- there is Python code that actually implements the app logic: here, a Flask server (with its associated HTML template pages), the “backstage logic” and the tools to check that the app behaves as expected
- the code is versioned thanks to Git and published on Gitlab: this way, I can share the files and get back to any previous state if need be, and I can isolate my changes in small batches as commits
- I use Gitlab’s built-in CI/CD tool to automatically run some scripts: these validate the code (thanks to the unit tests we coded previously)…
- … and send this code to an online service called Heroku which allows devs to quickly deploy, manage and share their apps
To validate that the app works correctly, the usual technique is to add unit tests to the project, and have them run during the CI/CD process so that, if anything goes wrong and one of the tests fails, the entire process stops and the deploy of this buggy version is aborted.
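To make this “stop at the first failure” behaviour more concrete, here is a small Python sketch. It is only a toy model of the idea, not how Gitlab is actually implemented: the run_pipeline function and the two fake stages are hypothetical names made up for this illustration. Each stage is a command, and a non-zero exit code is treated as a failure that prevents the following stages from running.

```python
import subprocess
import sys


def run_pipeline(stages):
    """Run each (name, command) stage in order and stop at the first failure.

    Returns the list of stage names that were started, and a boolean
    telling whether the whole pipeline succeeded.
    """
    started = []
    for name, command in stages:
        started.append(name)
        # A non-zero exit code is how a command signals that it failed.
        result = subprocess.run(command)
        if result.returncode != 0:
            print(f"stage '{name}' failed, aborting the pipeline")
            return started, False
    return started, True


if __name__ == "__main__":
    # Hypothetical stages: the "test" command fails on purpose,
    # so the "deploy" stage is never reached.
    failing_test = [sys.executable, "-c", "import sys; sys.exit(1)"]
    fake_deploy = [sys.executable, "-c", "print('deploying...')"]
    started, ok = run_pipeline([("test", failing_test), ("deploy", fake_deploy)])
    print(started, ok)
```

The important point is the early return: as soon as the test stage reports a failure, the deploy command is simply never executed.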
In short, unit testing is a software testing method where you split your codebase into tiny logical units and verify each individually to be sure that at least that chunk of your code is sound. You simply run each function in a code chunk against a series of examples and validate that, for every example, you get the expected result.
This very granular approach makes it easier to identify where the errors come from, and it makes for a more solid project. It’s also key in avoiding code regressions and maintaining your code, because you can quickly check that everything still runs properly (and that, for example, the new feature you added didn’t break something else without you noticing).
However, unit tests can sometimes be hard to write – either because you don’t know how granular to be, or because the code in itself is difficult to test just based on example cases. In particular, it’s often pointed out that object-oriented programming (OOP) is harder to unit test than functional programming: since OOP relies a lot more on context, it can be difficult to cut down the code and isolate self-contained functions.
Note: for more details on unit testing, and on how you can set up unit tests with Unity and C#, check out this other article I wrote! 🙂
What is my project architecture?
First things first, if you want to see the project and/or get the files, everything is available publicly for free on Gitlab, over here 🚀
Now, as we’ve seen in the introduction, the project is very basic: I just have two HTML pages (the index and the “result page”) in a templates/ folder and a Flask server to render them (in the server.py file). The pages are actually Jinja templates – this way, I could “inject” some data from my server and have dynamic content.
The four operations are extracted to their own file, functions.py, so that I could easily re-import the same functions into both the server and my unit test file.
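As a rough illustration, a functions.py file along these lines could look like the sketch below. The function names (add, subtract, multiply, divide) are assumptions made for this example; the actual names in the repo may differ.

```python
# functions.py (illustrative sketch; the function names are assumptions)


def add(a, b):
    """Return the sum of a and b."""
    return a + b


def subtract(a, b):
    """Return a minus b."""
    return a - b


def multiply(a, b):
    """Return the product of a and b."""
    return a * b


def divide(a, b):
    """Return a divided by b, using true division.

    Dividing by zero raises ZeroDivisionError, which the server
    would need to catch to show a friendly error page.
    """
    return a / b
```

Keeping these functions free of any Flask-specific code is what makes them easy to import and check in isolation.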
All the other files are just for the project’s config and the CI/CD setup. More precisely:
- .gitignore defines which files (or file types) should be ignored by the versioning system
- requirements.txt lists all the Python libs my project uses – this is used by the CI to re-create the right environment when it runs on a machine in the cloud
- Procfile tells Heroku which command to execute with the files it receives from the CI/CD process
- finally, .gitlab-ci.yml is the configuration file for the Gitlab CI: it’s what lists and sets up all the CI/CD steps to run during a pipeline to go from the code in the repo to a running app on Heroku
Setting up the CI/CD
Since this article focuses on the automation part, I won’t detail how the app in itself works. It is basically adapted from all the Flask demos and example projects you can find in their docs or on the net.
Similarly, I don’t want to dive into how you can write unit tests in Python – for those interested, in this really simple project I just used the built-in unittest Python package and prepared one test case per operation to test (+, -, x and ÷).
So let’s assume we have some running Python Flask app and a test_functions.py file we can run to check that everything works properly, and skip to how we define the CI/CD steps.
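For readers who have never used unittest, here is a minimal sketch of what such a test file might contain, with one test case per operation. It is an illustration under assumptions, not the actual file from the repo: the four functions are redefined inline (under assumed names) so that the example runs on its own, whereas the real file would import them from functions.py.

```python
import unittest


# In the real project these would be imported from functions.py;
# they are redefined here, under assumed names, to keep the example
# self-contained.
def add(a, b):
    return a + b


def subtract(a, b):
    return a - b


def multiply(a, b):
    return a * b


def divide(a, b):
    return a / b


class TestOperations(unittest.TestCase):
    """One test case per operation."""

    def test_add(self):
        self.assertEqual(add(2, 3), 5)

    def test_subtract(self):
        self.assertEqual(subtract(5, 3), 2)

    def test_multiply(self):
        self.assertEqual(multiply(4, 3), 12)

    def test_divide(self):
        self.assertEqual(divide(8, 2), 4)


def run_tests():
    """Run the test case and return True only if every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestOperations)
    result = unittest.TextTestRunner(verbosity=2).run(suite)
    return result.wasSuccessful()


if __name__ == "__main__":
    print("all tests passed:", run_tests())
```

In an actual test script, the usual entry point is unittest.main(), which exits with a non-zero status when a test fails; that exit status is what lets a CI job detect the failure.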
Preparing the skeleton of your automation process
We know that the CI/CD process is defined in the .gitlab-ci.yml file. To begin with, we can start with a file that looks like this:
Here, the “stage” blocks are all the steps that you could run in a pipeline instance on your code, and the stages key at the top just lists the ones that are active and will actually be used. Python code can’t really be built, so we’ll just go straight to the testing and deploy parts.
Another important thing is to know which environment we’ll be working in. Because our pipeline will be run online on a machine that Gitlab lends us for this process, we can’t assume anything about it; so the environment is a combination of various options: the OS, the installed packages, the environment variables, the Python-specific packages…
To quickly get a base computer config, the most common trick today is to use a Docker image. Docker containers and images are another deep topic on their own, so I won’t dive into it this time. Just think of them as light virtual machines that you can pull from the global online Docker hub and “instantiate” to spawn a (virtual) computer with a given OS, sometimes some pre-installed tools or libraries, etc. Then, just like on your local computer, you can run some additional commands to install even more software and “finish up” your environment. And finally, you’ll use this well-prepared environment to run your actual build/testing/deploy scripts.
The Docker hub is a big place with thousands of images – so if you’re new to this world, you might need to spend a bit of time to learn your way around and discover some famous images. In my case, I don’t need a lot; thus I’ll go with a very light image, the Linux-based alpine.
To tell the Gitlab CI/CD that I want to use this image as my “base computer setup”, all I have to do is add an image key at the top:
Now, whenever the pipeline starts, Gitlab will apply this specific config to the machine that we borrowed for the process. But because the alpine image is so light, it doesn’t have many tools. For example, it doesn’t have Python.
So let’s also make sure that before any of our steps are run, we add the right libraries and tools; this can be done in the before_script key that contains instructions to execute before each block:
From that point, since there is a .gitlab-ci.yml in the repo, Gitlab will automatically run a pipeline with this structure whenever we push a new commit to the online repository… but this workflow doesn’t actually do anything! 😉
Running the unit tests
To validate that our app works correctly, we need to do two things: install all its Python dependencies and run the unit tests. The libs install is easy to do, using the requirements.txt file, and the unit tests can be executed with the test_functions.py script we prepared.
The run-tests block just needs a little script section with two commands:
Deploying to Heroku
The final step is the deploy stage, where we want to send our code to Heroku so that this service can deploy our app online and share it with the world!
Before you can actually publish anything on Heroku, you need to create an account and prepare a new Heroku application for your project. Just follow the instructions on the platform; when you’re done, if you take a look at your dashboard, you should see your app:
Now – as explained in the Heroku docs, deploying to Heroku is based on Git and we can therefore use commands such as git remote or git push to link the Heroku app to our repo and send it our files. If you go to your app’s settings, you’ll see that it indeed has its own Git URL that we’ll be able to connect to in our CI/CD pipeline.
The last important piece of info we need from the Heroku website is our API key: this is the unique secret code that will allow us to push files to the app while being properly authenticated with our Heroku account.
You can get it in your own account settings, by scrolling down the first page and clicking on the “Reveal” button:
But of course, this is a secret key that should absolutely not be committed to the Git repo – or else everyone will be able to read it and impersonate us! So how can we add some data to the Gitlab project without actually versioning it with our code?
The solution here is to rely on environment variables. Whenever we run our pipeline, Gitlab automatically feeds it a bunch of variables that are either pre-defined by Gitlab itself, or defined by us in the project’s CI/CD settings.
So let’s go ahead and store our API key in a variable named HEROKU_API_KEY, that we protect and mask to be sure it never appears in the pipeline logs. I’ll also define another variable, HEROKU_APP, that contains the name of my Heroku app – this is not a secret, but this way if I ever rename my app I won’t have to do another commit to update my pipeline, I’ll just have to change this environment variable 😉
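To show how a script can pick up these two values at runtime, here is a short Python sketch. It is only an illustration: the helper names are hypothetical, and the URL shape (the API key used as the password on git.heroku.com) is an assumption about Heroku’s Git endpoint that you should double-check against the Heroku docs.

```python
import os


def heroku_remote_url(env=None):
    """Build an authenticated Git URL from the two CI/CD variables."""
    if env is None:
        env = os.environ
    api_key = env["HEROKU_API_KEY"]  # raises KeyError if the variable is missing
    app_name = env["HEROKU_APP"]
    # Assumed URL shape for Heroku's Git endpoint with API-key authentication.
    return f"https://heroku:{api_key}@git.heroku.com/{app_name}.git"


def masked(text, secret):
    """Hide the secret before printing, like a masked CI/CD variable."""
    return text.replace(secret, "****")


if __name__ == "__main__":
    # Fake values for the demo; in a pipeline they would come from os.environ.
    fake_env = {"HEROKU_API_KEY": "not-a-real-key", "HEROKU_APP": "my-demo-app"}
    url = heroku_remote_url(fake_env)
    print(masked(url, fake_env["HEROKU_API_KEY"]))
```

Since the secret only ever lives in the environment, nothing sensitive has to be written in the repo, and renaming the app only means changing one variable.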
All that’s left to do is use those environment variables in our .gitlab-ci.yml file to link the files to the Git URL of the app and authenticate, and then push the updated app code:
The final touch: restricting the CI/CD to the master branch
To wrap this up, we can improve our pipeline by making sure that it only runs when commits are pushed to the master branch (the one that we want to deploy in production). This is a useful tweak, because it reduces the number of pipelines you run (it just ignores any commit pushed to other branches) and therefore the resources you consume.
It’s very easy to do, thanks to the keyword only, to which we give the name(s) of the branch(es) to monitor for CI/CD:
Setting up a Gitlab CI/CD and linking it with Heroku isn’t as straightforward as it sounds, because the two tools need to interface properly for the whole chain to work. But by taking a good look at the docs and starting with a very simple sample like this one, we can learn to use all those amazing services and prepare really great pipelines for our next projects!
I hope you liked this article about DevOps and automation – if you have other ideas of posts I could write on this topic, feel free to leave a comment below. As always, thanks a lot for reading and see you soon for more programming articles 🙂