Self-hosted Git with Gitea LFS and S3

Using version control in game development

Git isn’t an ideal tool for game development. In fact, version control in general really isn’t designed to handle large amounts of binary data. That’s a frustrating problem for game projects, which tend to pair a lot of code with a huge volume of binary assets. Version control is a necessary tool for most developers these days, though, and starting any project without it somehow feels dirty. This blog post covers my experience using VCS on a Unity project that’s rapidly gaining gigabytes, and my attempt to find a suitable solution at a low cost. While writing this post I came across a recent blog post by Steve Streeting (creator of SourceTree and Ogre3D, and a core contributor to Git LFS). His post covers the same problem, but with more detail on Git LFS locking and UE4 - I recommend giving it a read!

GitLab is pretty great

Initially I chose to host my team’s project on GitLab. I’m familiar with the interface, and it offers features like issue tracking and project boards, as well as Git LFS support - a necessity when dealing with large files. GitLab allows a 10GB repository size even for free accounts - higher than GitHub’s official maximum of 5GB (though you can seemingly reach 100GB before it stops you pushing).

I did not expect to hit the 10GB limit as quickly as we did, but uncompressed audio and texture files alone took up over 8GB of that space, and adding a few more audio asset packs quickly ate the rest. Unfortunately GitLab doesn’t yet offer any way to raise that limit, even for paying customers, so we were forced to look for an alternative.
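For context, which files go through LFS is controlled by patterns in a .gitattributes file at the repository root (normally written for you by git lfs track). A typical set for a Unity project might look like this - the extensions are purely illustrative:

```
*.psd filter=lfs diff=lfs merge=lfs -text
*.png filter=lfs diff=lfs merge=lfs -text
*.wav filter=lfs diff=lfs merge=lfs -text
*.fbx filter=lfs diff=lfs merge=lfs -text
```

Anything matching these patterns is stored as a small pointer file in Git, with the real content living in LFS storage - which is exactly the data that was filling up our quota.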

Alternative paid services

Perforce

Git isn’t the only version control system around; in fact, it isn’t even the most popular in the game dev community. Most large studios use Perforce Helix Core. Perforce uses a centralised architecture, unlike Git’s distributed approach where every developer downloads the whole repository and its history (shallow clones aside). The centralised model appeals to large companies that care exactly how much access each person has to the source code, and want to maintain one true master branch. When your repositories run from hundreds to thousands of gigabytes, it’s also probably quite nice not to download the entire history.

Perforce offers a free tier of Helix Core for teams of up to five members, but limits the repo size to 1GB. The paid tiers start from $19/user/year, and you only get 1GB per additional user, which doesn’t fit with the goal of spending as little as possible.

GitHub or Bitbucket

GitHub and Bitbucket both offer comparable team plans where you can buy extra storage at $5 per 50GB per month. That price is on top of the monthly per-user cost of the plan itself, so it doesn’t work out very cheap.

Self-Hosting

Self-hosting was the next logical step. You can get a VPS with 1 vCPU, 2GB of RAM, and extensible storage for less than $5/month, so it was just a matter of weighing up which Git service had the best features.

GitLab self-hosted

Self-hosting GitLab would be the most seamless transition from the regular GitLab service. Most of the same features are available, and there is detailed documentation available to help set it up. Unfortunately, the recommended system specs list 4GB of RAM and a 4-core CPU. It also occurred to me that GitLab comes with a lot of features - far more than we would actually need - so I decided to keep looking for something with lower requirements.

Gogs

Gogs is a GitHub clone written in Go that you can deploy with Docker. It boasts very low system requirements of just 1 CPU core and 512MB of RAM. Gogs also fully supports LFS and would probably be a great solution.

One thing I was hoping to find in a self-hosted solution was the ability to use a different storage backend for LFS. Being able to use something like AWS S3 would allow very cheap and extensive storage for LFS objects, and could potentially speed up clone times. Unfortunately Gogs does not yet support using S3 for LFS.

Gitea

Like Gogs, Gitea is also a GitHub clone written in Go. Gitea recommends using 1GB RAM and 2 CPU cores for small teams, but I’ve had no issues running with 2GB RAM and 1 CPU core.

Unlike Gogs, Gitea does support using S3 for LFS storage, through a Minio client interface. This means you can use any S3-compatible object storage service too.

Deploying Gitea on Docker with Traefik reverse proxy

Deploying with Docker is very easy. I’ll show the whole docker-compose.yml file and then explain the important lines below. Traefik can be configured through container labels in the compose file, and provides automatic TLS certificates - so this one file is everything you need to get running with an HTTPS-enabled Git service.

docker-compose.yml
version: "3.7"

networks:
  internal-network:
    name: "internal-network"
    internal: true
  external-network:
    name: "external-network"

volumes:
  gitea:
    driver: local
  postgres:
    driver: local

services:
  gitea:
    image: gitea/gitea:latest
    container_name: gitea
    environment:
      - USER_UID=1000
      - USER_GID=1000
      - DB_TYPE=postgres
      - DB_HOST=db:5432
      - DB_NAME=gitea
      - DB_USER=gitea
      - DB_PASSWD=password
    restart: always
    networks:
      - internal-network
      - external-network
    volumes:
      - gitea:/data
      - ./gitea-app.ini:/data/gitea/conf/app.ini
      - /etc/timezone:/etc/timezone:ro
      - /etc/localtime:/etc/localtime:ro
    ports:
      - "222:22"
    depends_on:
      - db
    labels:
      - "traefik.enable=true"
      - "traefik.http.services.gitea.loadbalancer.server.port=3000"
      - "traefik.http.services.gitea.loadbalancer.passhostheader=true"

      - "traefik.docker.network=internal-network"
      - "traefik.http.routers.gitea.rule=Host(`git.yourdomain.com`)"
      - "traefik.http.routers.gitea.entrypoints=websecure"
      - "traefik.http.routers.gitea.tls=true"
      - "traefik.http.routers.gitea.tls.certresolver=leresolver"

  db:
    image: postgres:13
    restart: always
    environment:
      - POSTGRES_USER=gitea
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=gitea
    networks:
      - internal-network
    volumes:
      - postgres:/var/lib/postgresql/data

  traefik:
    image: traefik:v2.2
    restart: always
    security_opt:
      - no-new-privileges:true
    command:
      - "--api=true"
      - "--api.insecure=false"
      - "--api.debug=true"

      - "--log.level=INFO"
      - "--log.filePath=/logs/traefik.log"
      - "--accesslog=true"
      - "--accesslog.filepath=/logs/traefik_access.log"
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - '--providers.docker.defaultRule=Host(`{{ index .Labels "com.docker.compose.service" }}.localhost`)'
      - "--providers.docker.network=internal-network"  # Default network to speak to containers on
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--certificatesresolvers.leresolver.acme.email=<your email>"
      - "--certificatesresolvers.leresolver.acme.storage=/letsencrypt/acme.json"
      - "--certificatesresolvers.leresolver.acme.tlschallenge=true"
      # Use this for testing and debugging. caserver will issue a `Fake LE Intermediate X1` cert that shows up as invalid
      # - "--certificatesresolvers.leresolver.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory"
    ports:
      - "80:80"
      - "443:443"
    networks:
      internal-network:
      external-network:
    volumes:
      - "./letsencrypt:/letsencrypt"
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      - "./logs:/logs"
    labels:
      # Dashboard
      - "traefik.enable=true"
      - "traefik.http.routers.traefik.rule=Host(`traefik.yourdomain.com`)"
      - "traefik.http.routers.traefik.service=api@internal"
      - "traefik.http.routers.traefik.tls.certresolver=leresolver"
      - "traefik.http.routers.traefik.tls=true"
      - "traefik.http.routers.traefik.entrypoints=websecure"
      - "traefik.http.routers.traefik.middlewares=authtraefik"
      - "traefik.http.middlewares.authtraefik.basicauth.users=username:$$apr1$$ftx468m9$$GHtH3awnQiO.7OL29Ydwa."  # The $ signs need to be doubled up to escape

      - "traefik.http.routers.http-catchall.rule=hostregexp(`{host:.+}`)"
      - "traefik.http.routers.http-catchall.entrypoints=web"
      - "traefik.http.routers.http-catchall.middlewares=redirect-to-https"

      - "traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https"
      - "traefik.http.middlewares.redirect-to-https.redirectscheme.permanent=true"

      - "traefik.http.routers.traefik.tls.domains[0].main=git.yourdomain.com"
      - "traefik.http.routers.traefik.tls.domains[1].main=traefik.yourdomain.com"

In the volumes section of the gitea service we mount our gitea-app.ini file, which we will create in the same directory as docker-compose.yml. Mounting the file allows us to edit the Gitea config from outside the container, which helps reduce setup steps a lot.

- ./gitea-app.ini:/data/gitea/conf/app.ini

The Traefik labels on the gitea service tell the reverse proxy to route git.yourdomain.com to port 3000 in the Gitea container and make it available through HTTPS only. We are also pointing to the TLS certificate provider, which is defined below in the traefik service.

The db service is made available on the internal-network only, which means it will not be exposed through the reverse proxy.

The traefik service defines all of the settings for the reverse proxy. Under the command header we define the TLS certificate resolver settings; Traefik automatically handles acquiring and renewing the certificates. Under the labels header we set the URL used to access the Traefik dashboard, along with the username and password for accessing it. You can use a tool like htpasswd to generate the password hash.

- "traefik.http.middlewares.authtraefik.basicauth.users=username:$$apr1$$ftx468m9$$GHtH3awnQiO.7OL29Ydwa."
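If you don’t have htpasswd installed, openssl can produce the same apr1-style hash - here the username, password, and salt are stand-ins for your own values:

```shell
# Generate an apr1 (htpasswd-compatible) hash for Traefik's basicauth middleware.
# 'username', 'password', and the salt 'ftx468m9' are placeholders - use your own.
HASH=$(openssl passwd -apr1 -salt ftx468m9 password)

# docker-compose interpolates single '$' signs, so double them up
# before pasting the result into the label:
printf 'username:%s\n' "$HASH" | sed 's/\$/$$/g'
```

The second command prints the exact `username:$$apr1$$...` string to drop into the basicauth label above.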

gitea-app.ini

This file defines the config settings for Gitea from outside the Docker container; it is mounted by the docker-compose file, as described above. A full config requires a lot more than this, but these are the relevant settings to enable storing LFS files on S3.

[lfs]
STORAGE_TYPE = minio
MINIO_BASE_PATH = /lfs

[storage.minio]
STORAGE_TYPE = minio
SERVE_DIRECT = true
MINIO_ENDPOINT = <your s3 endpoint>
MINIO_ACCESS_KEY_ID = <your s3 access key>
MINIO_SECRET_ACCESS_KEY = <your s3 secret key>
MINIO_BUCKET = <your s3 bucket name>
MINIO_LOCATION = <your s3 region>
MINIO_BASE_PATH = /attachment
MINIO_USE_SSL = true
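As a concrete illustration, pointing at an AWS S3 bucket might look like the following - the region, endpoint, and bucket name here are hypothetical examples, and the keys are of course placeholders:

```
[storage.minio]
STORAGE_TYPE = minio
SERVE_DIRECT = true
MINIO_ENDPOINT = s3.eu-west-1.amazonaws.com
MINIO_ACCESS_KEY_ID = <your access key>
MINIO_SECRET_ACCESS_KEY = <your secret key>
MINIO_BUCKET = my-gitea-storage
MINIO_LOCATION = eu-west-1
MINIO_BASE_PATH = /attachment
MINIO_USE_SSL = true
```

As I understand it, SERVE_DIRECT = true makes Gitea redirect clients to pre-signed S3 URLs rather than proxying every download itself, which takes load off the VPS.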

You should hopefully have enough info here to get up and running quickly. This setup has been working well for my small team for the past few weeks and hopefully it can help you too.