For small operations, say 20 developers or so, deploying web services with Docker and Docker Compose is usually sufficient. When a single server can handle the entire load for a website, the simplicity and transparency of docker-compose is great. I used to take a VM snapshot, run docker-compose pull followed by docker-compose up -d and, just like that, my GitLab instance was up and running the latest version. All of this convenience was threatened when I tried to keep up with current developments in container technology and switch to Podman. However, Podman brought several benefits, plus a push toward being able to scale my applications with Kubernetes.

Access to the Docker socket is equivalent to sudo

There I was, happily running Docker containers, when some security-minded folks from Red Hat started saying things like:

if a user can talk to the docker socket, they can execute the following command:
docker run -ti --privileged -v /:/host fedora chroot /host
Giving them full root access to the host system. This is similar to giving them the following in sudo. [1]

At the time I read this (maybe 2017 or so), I was rather surprised at how simple it was to elevate a user's privileges through Docker in this way. I took it to heart and did the most pragmatic thing I could think of: I ran all my containers directly as root. In fact, on machines that were dedicated to running Docker, I didn't bother creating any user accounts!

I knew there had to be a better way, but I was happy to chug along with Docker for a while and I was not about to go back to provisioning my own servers by hand. The event that finally pushed me to look for a real solution was upgrading my workstation to Fedora 31. This was the release that enabled cgroups v2 by default, which Docker did not support at the time. As a result, or perhaps in addition, the Docker packages were dropped from the Fedora repositories I was using. A quick Google search showed that I could either go back to cgroups v1 and pull Docker from a third-party repository, or plow ahead with Podman and get user-space containers without the security issues of Docker. I chose the latter.

From Docker to Podman

I have been using containers for only a few years and found early on that Docker Compose was an extremely convenient way to set up and configure a series of containers on a single host. I was particularly happy with the ability of volumes to navigate SELinux contexts properly, as I ran all my containers on CentOS or Fedora machines. So after reading promises like:

alias docker=podman

I thought I'd give podman-compose a go, only to find that it did not support a lot of the features I had come to rely on with Docker. I won't go into details about podman-compose's issues, because what struck me most was the Podman developers' assertion:

Out of scope
Supporting docker-compose. We believe that Kubernetes is the defacto standard for composing Pods and for orchestrating containers, making Kubernetes YAML a defacto standard file format. Hence, Podman allows the creation and execution of Pods from a Kubernetes YAML file (see podman-play-kube). Podman can also generate Kubernetes YAML based on a container or Pod (see podman-generate-kube), which allows for an easy transition from a local development environment to a production Kubernetes cluster. [2]

I had looked into Kubernetes many times in the past and I knew it was perhaps something worth learning, but I never had the time or need to do so. Also, since Docker Compose was sufficient, Kubernetes seemed overcomplicated and unnecessarily verbose. Thus began my search for a Kubernetes-from-scratch tutorial and, while there were many, none seemed close to the locally hosted single-node configurations I had in my Docker Compose files. They all seemed to talk about scaling and being cloud-native from the get-go, even the ones built around minikube.

The problem, as I see it, was that I was not trying to learn Kubernetes, but rather Podman. And since Podman's documentation is rather sparse, I thought I had to look at the Kubernetes documentation to figure out how to create a Kube configuration file that Podman would be able to read. It turns out I was only partially right. Here is the path I took to understanding how to migrate from Docker Compose to Podman using a Podman-compatible Kube configuration file.

Podman from Docker Compose

After getting nowhere with the Kubernetes documentation (I wanted to learn Podman, after all), I switched gears and started translating a rather simple Docker Compose file into bare Podman commands. It had been a while since I had run these kinds of docker commands, so it took some effort. The compose file looked like the following. It's just a single PostgreSQL container with two volume mounts: a named volume managed by Docker and a host directory managed by the user.

version: '3.1'

volumes:
    testdata:

services:
  testdb:
    container_name: db
    image: postgres:alpine
    restart: always
    volumes:
      - testdata:/var/lib/postgresql/data
      - ./scripts:/scripts:ro
    ports:
      - "5432:5432"
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: guest

The final set of Podman commands I came up with was as follows:

mkdir -p ./scripts

# Publish host port 5432 to container port 5432, as "5432:5432" does in the compose file
podman pod create --name testdb --publish 5432:5432
podman volume create testdata
podman container create \
    --env POSTGRES_USER=user \
    --env POSTGRES_PASSWORD=guest \
    --name db \
    --pod testdb \
    --restart always \
    --volume testdata:/var/lib/postgresql/data \
    --volume ./scripts:/scripts:ro \
    postgres:alpine
podman pod restart testdb

Notice that I created a "pod" in which the container lives. Since all the containers in a pod share the same network namespace (and therefore the same address), port publishing is handled at the pod level. A Podman-managed volume is mounted to provide persistent storage for the database, and a local directory is mounted to provide some scripts.
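To check from the host that the pod really publishes the port, something like the following may help. It is a minimal sketch that assumes bash (the /dev/tcp path is a bash feature, not a real file), and port_open is a hypothetical helper name, not part of Podman.

```shell
# Hypothetical helper: succeed if something accepts TCP connections on host $1, port $2.
port_open() {
    # Open the connection in a subshell so the descriptor is closed again on exit.
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Usage, once the pod is running:
# port_open 127.0.0.1 5432 && echo "postgres port is reachable"
```

This only shows that the port accepts connections. It does not prove that PostgreSQL has finished initializing behind it.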

Kube Configuration from Podman

Once the pod was created and running, I used the podman generate kube command to create a Kube configuration file. After rearranging it and paring it down to the bare essentials, it looked like this:

apiVersion: v1

kind: Pod

metadata:
  name: testdb

spec:
  volumes:
    - name: testdb-scripts
      hostPath:
        path: ./scripts
        type: Directory

  containers:
    - name: db
      image: postgres:alpine
      volumeMounts:
        - name: testdb-scripts
          mountPath: /scripts
          readOnly: true
      ports:
        - containerPort: 5432
          hostPort: 5432
          protocol: TCP
      env:
        - name: POSTGRES_USER
          value: user
        - name: POSTGRES_PASSWORD
          value: guest

Notice that neither the testdata volume nor the container's restart policy appears anywhere in the file.

Now is a good time to point to the Kube configuration reference. You have to follow the links to the API reference since I couldn't determine if there is a permanent link to the latest version. This was buried under several layers of documentation and it took me ages to find it. Furthermore, it's quite hard to navigate, but to me it's the most useful document when working with Kubernetes.

Anyway, my best-effort attempt to fix the omissions by hand was to merge the following configuration into the Kube file above. I didn't bother with the restart policy, as "Always" is already the default for a pod.

spec:
  volumes:
    - name: testdb-data
      emptyDir: {}
  containers:
    - name: db
      volumeMounts:
        - name: testdb-data
          mountPath: /var/lib/postgresql/data

But this errors out when trying to "play" this from Podman with the podman play kube command:

Error: HostPath is currently the only supported VolumeSource

I could find no way to tell Podman to create a managed volume from the Kube configuration file. It's hard to tell whether this is a deliberate design decision or not, but there is a viable workaround that's not too cumbersome and might be better in the long run.
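Given that error, it can be handy to scan a Kube file for volume sources other than hostPath before playing it. The following is a rough sketch: non_hostpath_volumes is a hypothetical helper, pod.yaml is a placeholder file name, the key list is only a partial sample of Kubernetes volume types, and the check is a line-based grep rather than a real YAML parser. Which sources Podman accepts may also differ between versions.

```shell
# Hypothetical pre-flight check: print (with line numbers) any volume source
# in a Kube YAML file that is not hostPath. Succeeds only if something is found.
non_hostpath_volumes() {
    grep -nE '^[[:space:]]*(emptyDir|persistentVolumeClaim|configMap|secret|nfs):' "$1"
}

# Usage (pod.yaml is a placeholder name):
# if non_hostpath_volumes pod.yaml; then
#     echo "volume sources other than hostPath found" >&2
# fi
```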

SELinux and Kube Config with Podman

It looks like a local directory mount is required if I want to use a Kube configuration file with Podman. So the bare Podman commands became this:

mkdir -p ./scripts
mkdir -p ./testdata

# Publish host port 5432 to container port 5432; no named volume is needed any more
podman pod create --name testdb --publish 5432:5432
podman container create \
    --env POSTGRES_USER=user \
    --env POSTGRES_PASSWORD=guest \
    --name db \
    --pod testdb \
    --restart always \
    --volume ./testdata:/var/lib/postgresql/data:z \
    --volume ./scripts:/scripts:ro \
    postgres:alpine
podman pod restart testdb

That "z" at the end of the volume mount for the testdata directory is important. It tells Podman to relabel the directory so that containers are allowed to use it. Without it, the database cannot be initialized because SELinux prevents the postgres user inside the container from writing to this directory. This wasn't a problem with the scripts directory, apparently because it was mounted read-only and nothing tried to write to it.

So generating the Kube configuration file resulted in this:

apiVersion: v1

kind: Pod

metadata:
  name: testdb

spec:
  volumes:
    - name: testdb-scripts
      hostPath:
        path: ./scripts
        type: Directory
    - name: testdb-testdata
      hostPath:
        path: ./testdata
        type: Directory
  containers:
    - name: db
      image: postgres:alpine
      securityContext:
        allowPrivilegeEscalation: true
        capabilities: {}
        privileged: false
        readOnlyRootFilesystem: false
      volumeMounts:
        - name: testdb-scripts
          mountPath: /scripts
          readOnly: true
        - name: testdb-testdata
          mountPath: /var/lib/postgresql/data
      ports:
        - containerPort: 5432
          hostPort: 5432
          protocol: TCP
      env:
        - name: POSTGRES_USER
          value: user
        - name: POSTGRES_PASSWORD
          value: guest

This time I did not remove the security context block (it wasn't needed before). But it doesn't look like it will help much with permissions on the volume mounts and, sure enough, attempting to "play" this file results in an error. Here's the output from podman logs db:

chmod: /var/lib/postgresql/data: Permission denied

Getting this directory to be writable by the postgres user within the container was far from obvious to me. After trying in vain to use the configuration's securityContext options, I found that I only needed to set the SELinux context of the testdata directory on the host to the right value. I determined the correct setting experimentally, by creating a temporary Podman-managed volume and inspecting its actual location on the host.

podman volume create testvol
# Quote the jq filter so the shell does not treat [0] as a glob pattern
DIR=$(podman volume inspect testvol | jq -r '.[0].Mountpoint')
ls -dZ "$DIR"
podman volume rm testvol

The ls command above printed out the SELinux context I was after:

system_u:object_r:container_file_t:s0
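An SELinux context has four colon-separated fields: user, role, type, and level. For container access to a directory, the type field (container_file_t here) is the part that matters. As a small sketch, the type can be pulled out of a context string like this; selinux_type is a hypothetical helper name.

```shell
# Hypothetical helper: print the type (third field) of an SELinux context string.
selinux_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

selinux_type "system_u:object_r:container_file_t:s0"   # prints container_file_t
```

Since the type is the interesting part, chcon -t container_file_t on the directory changes only that field and leaves the rest of the context alone.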

And so, to wrap it all up, all I had to do was run chcon on the testdata directory and it all seemed to work. Here is the final Kube configuration file:

apiVersion: v1

kind: Pod

metadata:
  name: testdb

spec:
  volumes:
    - name: testdb-scripts
      hostPath:
        path: ./scripts
        type: Directory
    - name: testdb-testdata
      hostPath:
        path: ./testdata
        type: Directory
  containers:
    - name: db
      image: postgres:alpine
      volumeMounts:
        - name: testdb-scripts
          mountPath: /scripts
          readOnly: true
        - name: testdb-testdata
          mountPath: /var/lib/postgresql/data
      ports:
        - containerPort: 5432
          hostPort: 5432
          protocol: TCP
      env:
        - name: POSTGRES_USER
          value: user
        - name: POSTGRES_PASSWORD
          value: guest

And here is the setup and running of the pod:

mkdir -p ./scripts
mkdir -p ./testdata
chcon system_u:object_r:container_file_t:s0 ./testdata
podman play kube postgres.kube.yaml
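One caveat: chcon changes the label on the directory itself but does not record it in the SELinux policy, so a later restorecon or full filesystem relabel can undo it. A more durable variant might look like the sketch below. It assumes the semanage and restorecon tools are installed and that you have root via sudo; fcontext_pattern and label_for_containers are hypothetical helper names.

```shell
# Build the path regex that semanage expects: the directory and everything under it.
fcontext_pattern() {
    abs=$(CDPATH= cd -- "$1" && pwd -P) || return 1
    printf '%s(/.*)?\n' "$abs"
}

# Record the labeling rule in the local policy, then apply it to the files on disk.
label_for_containers() {
    pattern=$(fcontext_pattern "$1") || return 1
    sudo semanage fcontext -a -t container_file_t "$pattern" &&
        sudo restorecon -R "$1"
}

# Usage:
# label_for_containers ./testdata
```

For a quick experiment chcon is simpler. The rule-based approach is mainly useful on machines where the data directory is meant to stick around.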

Conclusion

So there are a number of things to be aware of when switching from Docker and Docker Compose to Podman and a Kube configuration file:

  1. The most useful reference is the Kubernetes API reference, but keep in mind that not everything is available in Podman.
  2. Mounted volumes are limited to user-created host directories and files.
  3. Write access to mounted volumes requires the right SELinux context to be set on the host directory.

Looking ahead with the new configuration format, I can already see that this will ease a transition to running on a Kubernetes cluster in the cases where duplication of the containers is desired. Still, I'm far from there yet and I haven't managed to switch everything over to Podman. I still need to figure out the process of updating container images within a pod using the Kube configuration file as well as determining the stability of the system as a whole. Stay tuned!