The BioExcel Binder deployment

Under the hood, the Binder infrastructure builds on container and cloud technologies: it runs on Kubernetes, the most popular container orchestration system, and relies on container and Kubernetes-based tools and infrastructure such as DockerHub, Helm, Helmsman, Grafana and Fluentd for k8s.

Prerequisites/recommended skill level

In this part, we detail the underlying technology of a BioExcel Binder deployment. The intended audience is contributors, collaborators, or IT services interested in forking the code and trying to run their own instances.

We rely heavily on Kubernetes for the BioExcel Binder deployment. A deep knowledge of how to install it (at least on a public provider) and operate it (e.g. Ingress networking, Kustomize, Helm charts, ...) is recommended.

Moreover, since we have implemented customizations, an understanding of how to install and operate a JupyterHub instance is recommended.

All the deployment scripts are based on standard Linux tools like Bash and Make to make it easier to integrate the automation in CI/CD agents.

The infrastructure

The BioExcel Binder is currently hosted in the EMBL-EBI private OpenStack cloud, an isolated Infrastructure as a Service (IaaS) capability within EMBL-EBI’s Data Centre.

The use of Kubernetes and deployment automation keeps the deployment portable between cloud infrastructures and will enable the installation of satellite deployments at other partners’ sites or in the public cloud (a cloud-agnostic approach).

Indeed, to increase availability and meet further needs, we have recently been extending the infrastructure to use the Google Cloud Platform (GCP).

Kubernetes clusters setup

The cluster setup differs according to the infrastructure provider:

Example

Here you can find an example of a command to create a cluster using Magnum.
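
As a rough illustration of what such a command can look like, here is a minimal Bash sketch built around the `openstack coe cluster create` subcommand of the Magnum client. It is not the project's actual command: the cluster name, template name, keypair and node counts are hypothetical placeholders, and the `run` helper with its `DRY_RUN` switch is an assumption added so the command can be previewed without touching a cloud.

```shell
#!/usr/bin/env bash
# Sketch only: names and sizes below are hypothetical placeholders.
set -euo pipefail

# Print the command instead of executing it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Create a Kubernetes cluster from an existing Magnum cluster template.
# Usage: create_magnum_cluster <name> <template> [node-count]
create_magnum_cluster() {
  local name="$1" template="$2" nodes="${3:-3}"
  run openstack coe cluster create \
    --cluster-template "$template" \
    --master-count 1 \
    --node-count "$nodes" \
    --keypair "${KEYPAIR:-my-keypair}" \
    "$name"
}
```

Calling `DRY_RUN=1 create_magnum_cluster binder-test k8s-template 2` prints the resulting command line, which makes it easy to review in a CI log before running it for real.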

Example

You can download an example of Terraform configuration here.

Cluster initialization

Once the Kubernetes cluster is installed, we run a script to set up all the configurations required to initialize a brand new instance. For example:

  • Set up the access for the deploy agents.
  • Set up user accounts.

After this step, ideally, all other operations should be performed by a deployment agent authorized to operate on the cluster (e.g. Helmsman using a dedicated service account).

Example

Here you can see the script we are using to accomplish that.
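
To give an idea of what the deploy-agent part of such an initialization can involve, the following Bash sketch creates a namespace, a service account and a cluster role binding with standard `kubectl create` commands. It is an assumption-laden outline, not the script referenced above: the namespace and account names are hypothetical, and binding `cluster-admin` is chosen only for brevity (a narrower role is usually preferable).

```shell
#!/usr/bin/env bash
# Sketch only: not the project's initialization script.
set -euo pipefail

# Print the command instead of executing it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Give a deploy agent (for example Helmsman) its own service account.
# Usage: init_deploy_agent <namespace> <service-account>
init_deploy_agent() {
  local namespace="$1" account="$2"
  run kubectl create namespace "$namespace"
  run kubectl --namespace "$namespace" create serviceaccount "$account"
  # Assumption: cluster-admin is used here for simplicity only.
  run kubectl create clusterrolebinding "${account}-admin" \
    --clusterrole=cluster-admin \
    --serviceaccount="${namespace}:${account}"
}
```

With `DRY_RUN=1`, `init_deploy_agent binder helmsman` only lists the three `kubectl` commands it would run.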

Deployments over Kubernetes

The BinderHub deployment is packaged using a Helm chart. We also use charts to install other components. For example:

Helmsman

To further improve automation and configurability, we use Helmsman: "a Helm Charts as Code tool which allows you to automate the deployment/management of your Helm charts from version-controlled code".

This way we have a single configuration to manage all the components (Charts) in various environments (Cloud infrastructures and setup flavours).

BinderHub

Example

Here is an example of a Helmsman configuration and how to run it (see the Makefile).
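
For readers without the Makefile at hand, a Helmsman run typically boils down to pointing the `helmsman` command at a desired state file with `-f`, and adding `--apply` to actually execute the changes. The wrapper below is a hedged Bash sketch of that idea; the function name, the `plan`/`apply` modes and the file name `helmsman.yaml` are hypothetical and do not come from the project's Makefile.

```shell
#!/usr/bin/env bash
# Sketch only: a thin wrapper around the helmsman CLI.
set -euo pipefail

# Print the command instead of executing it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Usage: deploy <desired-state-file> [plan|apply]
deploy() {
  local desired_state="$1" mode="${2:-plan}"
  case "$mode" in
    # Without --apply, Helmsman only reports the planned changes.
    plan)  run helmsman -f "$desired_state" ;;
    # With --apply, the planned changes are executed on the cluster.
    apply) run helmsman --apply -f "$desired_state" ;;
    *)     echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}
```

A CI/CD agent could call `deploy helmsman.yaml` on every change to review the plan, and `deploy helmsman.yaml apply` only on the main branch.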

Secrets management

There are secrets at various levels in the configurations. For example:

  • Kubernetes cluster tokens
  • Docker registry accounts
  • JupyterHub tokens
  • TLS certificates

To integrate this sensitive data into the code we use Mozilla SOPS. It is well integrated with Helm (through a dedicated Helm plugin) and with Helmsman (native support via the helm-secrets plugin).

Example

In the Helmsman example configuration you can see how encrypted configurations can be passed transparently using the "secretsFiles" key:

...
valuesFiles:
     - helm_configs/base/config_binder.yml
     - helm_configs/base/config_jupytherhub.yml
     - helm_configs/base/config_ingress.yaml
secretsFiles:
     # Managed by the helm-secrets plugin (SOPS-encrypted files).
     - helm_configs/base/secret.yaml
...
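
A file such as the one listed under "secretsFiles" has to be encrypted before it is committed. The Bash sketch below shows one way this might be done with the `sops` command line. It assumes that encryption keys are already configured for SOPS (for example through a `.sops.yaml` file in the repository); the helper names are hypothetical and the project may handle this step differently.

```shell
#!/usr/bin/env bash
# Sketch only: assumes SOPS keys are configured (e.g. via .sops.yaml).
set -euo pipefail

# Print the command instead of executing it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Encrypt a values file in place so it can be committed safely.
encrypt_secret() {
  local file="$1"
  run sops --encrypt --in-place "$file"
}

# Print the decrypted content to stdout without modifying the file.
view_secret() {
  local file="$1"
  run sops --decrypt "$file"
}
```

For instance, `encrypt_secret helm_configs/base/secret.yaml` would encrypt the secrets file from the configuration above, and `view_secret` lets an authorized user inspect it locally.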