Introduction to geoCML

geoCML is a Containerized, Multi-paradigm, Lightweight deployment pattern for geographic information systems running locally or in the cloud. geoCML deployments package best-in-class open-source GIS software, including QGIS and PostGIS, allowing you to go to production quickly. In this topic, you will learn how geoCML deployments work so that you can make an informed decision about whether geoCML is a good choice for your needs.

How does geoCML work?

geoCML deployments include four containerized microservices: geocml-desktop (the primary user entrypoint), geocml-server (the secondary user entrypoint), geocml-postgres (the primary data store), and geocml-task-scheduler (often abbreviated as gTS). The services communicate with one another via an internal network called geocml-network. Once your deployment is running as a geoCML instance, you can begin using it to complete GIS workflows. By design, each geoCML instance should host only one project/dataset; host multiple datasets across multiple geoCML instances. Specific documentation on each service is available for your reference.

What can geoCML do?

geoCML deployments come out of the box with a variety of features:

  • Desktop GIS workflows

  • Cloud native server GIS workflows

  • Routine database backups

  • PostgreSQL/PostGIS database hosting

  • Python support for writing custom gTS automation tasks

  • Persistence of information between containerized services via the persistence-layer directory

  • Lower disk space and resource usage than other enterprise GIS deployments

What can geoCML not do out of the box?

geoCML is currently limited in the following areas:

  • Integration with ArcGIS proprietary deployments and extensions

  • Database version management

  • Backing up domain values in a database

  • Support for storing raster data in geocml-postgres

  • MySQL support, SQL Server support, etc. (base deployments only support PostgreSQL/PostGIS)

geoCML base deployments are customizable to fit your specific needs. You can customize your services however you’d like! If you would like to contribute to the development of geoCML, you can find the project on GitHub at github.com/geocml.

Quick deployment guide

Get up and running with geoCML in under 15 minutes!

geoCML deployments are multi-paradigm, offering a desktop GIS experience and a server GIS experience with a single deployment. You may host your geoCML instance locally or in the cloud, depending on your needs.

Before instantiating a geoCML deployment, you must have Docker and Docker Compose installed on the machine that will host geoCML. You do not need any additional GIS software installed on the machine. Once you have satisfied these conditions, follow these steps to deploy your geoCML instance.

  1. Clone the geoCML source code from github.com/geocml/geocml-base-deployment

  2. Open a terminal and cd into the source code directory

  3. Run docker compose pull to pull geoCML service images from the project’s hosted container registry (optionally, run docker compose build to build the images on your machine; this is a requirement if you are modifying your images and have not pushed them to your own registry!)

  4. Run docker network create geocml-network

  5. Run docker compose up -d to bring up the instance
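
Steps 3 to 5 can also be scripted. The following is a minimal Python sketch, assuming the source code has already been cloned and that Docker and Docker Compose are available on the PATH. The helper names deployment_commands and deploy are hypothetical and are not part of geoCML; the sketch also assumes that building replaces pulling when build=True.

```python
import subprocess


def deployment_commands(build=False):
    """Return the Docker commands from steps 3-5, in order."""
    # Assumption: "docker compose build" is used instead of "pull" when building locally.
    first = ["docker", "compose", "build"] if build else ["docker", "compose", "pull"]
    return [
        first,
        ["docker", "network", "create", "geocml-network"],
        ["docker", "compose", "up", "-d"],
    ]


def deploy(source_dir, build=False):
    """Run each command from inside the cloned source directory (step 2)."""
    for command in deployment_commands(build):
        # check=True raises CalledProcessError at the first failing command.
        subprocess.run(command, cwd=source_dir, check=True)
```

Calling deploy("geocml-base-deployment") would run the three commands in sequence. Note that docker network create reports an error if geocml-network already exists, so a second run of this sketch would stop at that step.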

That’s it! You can access geoCML Desktop via {deployment host URL}:10000 or geoCML Server portal via {deployment host URL}:80 using a web browser. Further configuration steps for each of these services are discussed in later topics.

geoCML Desktop

geoCML Desktop is the primary access point for your geoCML deployment. It provides a desktop environment in which you can prepare, visualize, and analyze your GIS data. The geocml-desktop container is based on Ubuntu Linux and comes with XPRA installed, allowing you to view running applications in a web browser. geoCML Desktop also comes preinstalled with QGIS, a best-in-class open-source desktop GIS application.

Connecting to geoCML Desktop

After instantiating your geoCML deployment, you can connect to geoCML Desktop via a web browser at {deployment host URL}:10000.
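
If you want to confirm from a script that geoCML Desktop is answering before opening a browser, a check along the following lines may help. This is a minimal sketch: the function names are hypothetical, and the http scheme is an assumption, since this guide only specifies the host and port.

```python
import urllib.error
import urllib.request


def desktop_url(host):
    """Build the geoCML Desktop URL for a deployment host (port 10000)."""
    # Assumption: plain http; adjust if your deployment sits behind TLS.
    return f"http://{host}:10000"


def is_reachable(url, timeout=5.0):
    """Return True if anything answers an HTTP request at the URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The service answered, just not with a success status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar errors.
        return False
```

For example, is_reachable(desktop_url("localhost")) returns False while the instance is still starting or if the port is blocked.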

Using geoCML Desktop

geoCML Desktop is designed to be a replacement for your typical desktop GIS experience. With the default geoCML Desktop, you can:

  • Use QGIS to prepare, aggregate, and visualize your GIS data

  • Configure geoCML Server Portal

  • Connect to the geocml-postgres service

Configuring geoCML Desktop

You may want to extend the default geocml-desktop service with additional applications or configurations that meet your needs. geoCML Desktop has a two-step configuration process: Dockerfile configuration and Ansible configuration.

Docker Configuration

You can use Docker to install packages in geoCML Desktop. Open geocml-base-deployment/Dockerfiles/Dockerfile.geocml-desktop in your favorite text editor and add your use-case-specific geoCML Desktop configuration steps between the Customize Container Here and End Customizations comments.

Ansible Configuration

You can use Ansible to automate advanced, tedious configuration workflows. By default, geoCML Desktop uses Ansible to automate configuring your database. To customize these advanced configurations further, edit geocml-base-deployment/ansible-playbooks/geocml-desktop-playbook.yaml.

Understanding the persistence layer

geoCML deployments do not persist data; the file system within a service should never be changed directly except during the build process. You may wonder how, then, you are to upload and change datasets contained within the geoCML Desktop service.

The persistence layer is a mutable directory shared between the geoCML Desktop service and the deployment’s host machine. geoCML Base Deployments use Docker to bind the host machine’s local persistence layer to a persistence layer within the geoCML Desktop service. This binding allows you to add files to the host machine’s persistence layer and have them available within the geoCML Desktop service. The persistence layer is also available to other services within the deployment.
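
Because the persistence layer is an ordinary directory on the host, adding a dataset amounts to copying a file into it. The sketch below illustrates this; stage_dataset is a hypothetical helper, and the location of the persistence layer on your host is an assumption you must supply yourself.

```python
import shutil
from pathlib import Path


def stage_dataset(source_file, persistence_dir):
    """Copy a dataset into the host-side persistence layer and return its new path."""
    target_dir = Path(persistence_dir)
    if not target_dir.is_dir():
        raise FileNotFoundError(f"persistence layer not found: {target_dir}")
    # copy2 preserves file metadata and returns the path of the copied file.
    return Path(shutil.copy2(source_file, target_dir))
```

After a call such as stage_dataset("parcels.gpkg", "path/to/persistence-layer"), the file should be visible inside the geoCML Desktop service through the bound directory.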

About geocml-project.qgz

geoCML Desktop will automatically open geocml-project.qgz in QGIS when you connect to the XPRA service. geocml-project.qgz is the central project file for your entire geoCML deployment. By design, all of your GIS work must be done in this file.

geoCML Postgres

By default, geoCML deploys a microservice container with a PostGIS-enabled Postgres database for your project. This is the primary data store for a geoCML deployment.

Configuring geoCML Postgres

geoCML Postgres can be fully configured to suit the needs of your use case. geoCML Postgres has a two-step configuration process: Dockerfile configuration and Ansible configuration.

Docker Configuration

You can use Docker to install packages in your geoCML Postgres service deployment. Open geocml-base-deployment/Dockerfiles/Dockerfile.geocml-postgres in your favorite text editor and add your use-case-specific geocml-postgres configuration steps between the Customize Container Here and End Customizations comments.

Ansible Configuration

You can use Ansible to automate advanced, tedious configuration workflows. By default, geoCML Postgres uses Ansible to automate configuring your database. To customize these advanced configurations further, edit geocml-base-deployment/ansible-playbooks/geocml-postgres-playbook.yaml.

Accessing your database

geoCML Postgres creates a database named geocml_db, which is the primary datastore for your geoCML project. Do not change the name of this database! You can access your database in several ways:

  • via geoCML Desktop and QGIS (over the internal geocml-network)

  • via a PostgreSQL data explorer (at {deployment host URL}:5432)

The default credentials for accessing your database are:

  • Username: geocml

  • Password: geocml

geoCML Postgres also configures an admin user for your data store. The default credentials for accessing your data store as an admin are:

  • Username: postgres

  • Password: admin
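
Many PostgreSQL clients accept a connection URI, so the defaults above can be combined into a single string. The following is a minimal sketch using only the Python standard library; connection_uri is a hypothetical helper, and the host value is whatever address your deployment is reachable at.

```python
from urllib.parse import quote


def connection_uri(host, user="geocml", password="geocml", port=5432, database="geocml_db"):
    """Build a libpq-style connection URI from the geoCML Postgres defaults."""
    # Percent-encode the credentials so characters such as "@" or ":" stay valid.
    safe_user = quote(user, safe="")
    safe_password = quote(password, safe="")
    return f"postgresql://{safe_user}:{safe_password}@{host}:{port}/{database}"
```

For example, connection_uri("localhost") produces a URI for the standard user, and connection_uri("localhost", "postgres", "admin") produces one for the admin user. The resulting string can be pasted into a data explorer or passed to a PostgreSQL driver that accepts URIs.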

Current limitations of the geoCML Postgres service

In geoCML v0.2.0, there are known limitations with the geoCML Postgres service. Currently, geoCML Postgres does not support the following data:

  • Raster datasets

  • Domain fields

Support for these data will be added to geoCML Postgres in a future release of geoCML.

geoCML Server

geoCML deploys a microservice container called geoCML Server, which bundles a QGIS Server instance, an Apache web server, and a React application. This microservice is responsible for implementing web and server GIS paradigms for your project. geoCML Server collects information from geocml-project.qgz to serve your data via several web services.

Configuring geoCML Server

geoCML Server can be fully configured to suit the needs of your use case. geoCML Server has a two-step configuration process: Dockerfile configuration and Ansible configuration.

Docker Configuration

You can use Docker to install packages in your geoCML Server service deployment. Open geocml-base-deployment/Dockerfiles/Dockerfile.geocml-server in your favorite text editor and add your use-case-specific geocml-server configuration steps between the Customize Container Here and End Customizations comments.

Ansible Configuration

You can use Ansible to automate advanced, tedious configuration workflows. By default, geoCML Server uses Ansible to automate configuring Apache and QGIS Server. To customize these advanced configurations further, edit geocml-base-deployment/ansible-playbooks/geocml-server-playbook.yaml.

Accessing geoCML Server via the web

geoCML Server Portal is accessible via {deployment host URL}:80. geoCML Server Portal is a web application frontend for geoCML Server written in React. geoCML Server Portal acts as the secondary user entrypoint for your deployment, allowing you to share data with others over the internet. geoCML Server Portal relies on Web Map Service (WMS) information from geocml-project.qgz in order to properly display your instance information.

geoCML Server Portal features:

  • a description of your project,

  • copyright claims for your project’s data,

  • a hosted Leaflet web map,

  • WMS connection information,

  • WFS connection information,

  • WCS connection information,

  • a preview of your hosted data,

  • a list of all hosted data tables

Accessing geoCML Server via an API

geoCML Server exposes all QGIS Server functionality via FastCGI and Apache. Learn more about QGIS Server here: https://docs.qgis.org/3.34/en/docs/server_manual/index.html
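
QGIS Server answers standard OGC requests such as GetCapabilities, which is a convenient first call when testing the API. The sketch below builds such a request URL. The function name is hypothetical, and the endpoint path is deliberately left as a parameter because this guide does not state where geoCML Server mounts QGIS Server.

```python
from urllib.parse import urlencode


def get_capabilities_url(endpoint, service="WMS"):
    """Build an OGC GetCapabilities request URL for a QGIS Server endpoint."""
    # SERVICE and REQUEST are standard OGC query parameters.
    query = urlencode({"SERVICE": service, "REQUEST": "GetCapabilities"})
    return f"{endpoint}?{query}"
```

Passing service="WFS" or service="WCS" builds the equivalent request for the other services listed in geoCML Server Portal.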

geoCML Task Scheduler (gTS)

geoCML deploys a microservice container called geoCML Task Scheduler (gTS). This microservice is responsible for automating routine tasks within your deployment, such as backing up data from geoCML Postgres, restoring geocml_db from a backup, and health-checking services within your deployment. geoCML Task Scheduler exposes a simple Python API for developing additional tasks.

Configuring geoCML Task Scheduler

geoCML Task Scheduler can be fully configured to suit the needs of your use case. geoCML Task Scheduler has a two-step configuration process: Dockerfile configuration and Ansible configuration.

Docker Configuration

You can use Docker to install packages in your geoCML Task Scheduler service deployment. Open geocml-base-deployment/Dockerfiles/Dockerfile.geocml-task-scheduler in your favorite text editor and add your use-case-specific geocml-task-scheduler configuration steps between the Customize Container Here and End Customizations comments.

Ansible Configuration

You can use Ansible to automate advanced, tedious configuration workflows. To customize these configurations, edit geocml-base-deployment/ansible-playbooks/geocml-task-scheduler-playbook.yaml.

Understanding DBBackups

geoCML Postgres is not a persistent data store. Because of this, when your geoCML instance goes down, you will risk losing information in your data store. geoCML Task Scheduler handles this by automatically backing up geocml_db every hour. When your instance is brought back up, geoCML Task Scheduler will automatically restore your data store from the most recent backup. Backups are stored in the DBBackups directory in the persistence layer. Each DBBackup contains a .tabor file defining the schema of geocml_db and a series of CSV files representing the actual data in your tables.
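
If you want to inspect backups yourself, you can locate the newest one in the DBBackups directory. The following sketch is illustrative only and is not how gTS selects a backup: it assumes that each backup is a single entry directly under DBBackups and that file modification times reflect when the backup was taken. The function name is hypothetical.

```python
from pathlib import Path


def most_recent_backup(dbbackups_dir):
    """Return the most recently modified entry in DBBackups, or None if it is empty."""
    entries = list(Path(dbbackups_dir).iterdir())
    if not entries:
        return None
    # Assumption: modification time is a reasonable proxy for backup time.
    return max(entries, key=lambda entry: entry.stat().st_mtime)
```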

Writing Tasks

You can create custom tasks in geoCML Task Scheduler.

  1. Open build-resources/geocml-task-scheduler/geocml-task-scheduler/ in your favorite text editor.

  2. Create a new Python file.

  3. Define a function in the new file with your task logic.

  4. Return 0 if you want your task to run only once. Otherwise, your task will run according to its position in the schedule.

  5. Save your Python file.

  6. Open schedule.py.

  7. Create a new Task object, instantiated with your new function and its execution frequency (in seconds).

  8. Start the task with <your_task>.start().

  9. Rebuild the geocml-task-scheduler container and deploy.

Your new task is created, and it is scheduled for execution.
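
As an illustration of steps 2 to 4, a task file might look like the sketch below. The function name, the module name heartbeat, and the output location are all hypothetical. The schedule.py lines are shown only as comments because the exact import path and argument order of Task are assumptions drawn from the steps above; check them against schedule.py in your deployment.

```python
import os
import tempfile
import time

# Hypothetical output location; a real task would more likely write to the persistence layer.
HEARTBEAT_FILE = os.path.join(tempfile.gettempdir(), "gts-heartbeat.txt")


def write_heartbeat():
    """Record the current Unix time so you can confirm that the task ran."""
    with open(HEARTBEAT_FILE, "w", encoding="utf-8") as handle:
        handle.write(str(int(time.time())))
    # Per step 4, returning 0 means the task runs only once.
    return 0


# In schedule.py (steps 6-8), assuming this file is saved as heartbeat.py:
#
#   from heartbeat import write_heartbeat
#   heartbeat_task = Task(write_heartbeat, 60)  # execution frequency in seconds
#   heartbeat_task.start()
```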

Tabor

Tabor is a database modeling language for GIS based on YAML, but with additional syntax restrictions. The goal of Tabor is to allow GIS users to create and maintain complex database rules using plain-text configuration files. The following is an example of a Tabor configuration file for a PostGIS database:

tabor: 0.2.0
layers:

- name: grass
  schema: public
  owner: geocml
  geometry: polygon
  fields:
    - name: fid
      type: int
      pk: true

- name: trees
  schema: public
  owner: geocml
  geometry: point
  fields:
    - name: fid
      type: int
      pk: true
    - name: genus
      type: text
    - name: species
      type: text
    - name: height_meters
      type: numeric
    - name: circumference_cm
      type: numeric
  constraints:
    - name: on
      layer: grass

- name: streams
  schema: public
  owner: geocml
  geometry: polyline
  fields:
    - name: fid
      type: int
      pk: true

Running this file through the Tabor command line utility generates a valid PostgreSQL schema query that can be used to create or update tables in a PostGIS database.

Downloading and Installing Tabor

Tabor can be downloaded directly from https://github.com/geoCML/tabor. After downloading, extract the .zip file to a directory on your terminal's PATH.

Command Line Usage

tabor read --file <path/to/file> → Converts a .tabor file into a PostGIS schema query.

tabor write --file <path/to/file> --db <name_of_psql_db> --username <name of db user> --password <password of db user?> --host <host of psql db?> --port <port of psql db?> --ignore <tables to ignore?> → Converts a PostGIS database to a .tabor file.

tabor load --file <path/to/file> --db <name_of_psql_db> --username <name of db user> --password <password of db user?> --host <host of psql db?> --port <port of psql db?> → Loads a PostGIS database from a .tabor file.
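
When calling Tabor from a script, it can help to assemble the argument list programmatically. The sketch below does this for tabor write. The helper name is hypothetical, and it assumes that the flags marked with "?" above are optional; it does not assume any particular format for the --ignore value.

```python
def tabor_write_command(file, db, username, **optional):
    """Build the argument list for "tabor write" from the flags listed above."""
    command = ["tabor", "write", "--file", file, "--db", db, "--username", username]
    # Assumption: flags shown with "?" in the usage text may be omitted.
    for flag in ("password", "host", "port", "ignore"):
        value = optional.get(flag)
        if value is not None:
            command += [f"--{flag}", str(value)]
    return command
```

The returned list can be passed to subprocess.run(command, check=True) once Tabor is installed and on your PATH.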

Supported Geometry Types

Tabor supports the following geometry types:

  • polygon → for 2D shapes such as political boundaries

  • polyline → for 1D shapes such as roadway centerlines

  • point → for 0D shapes such as trees

If a layer has no geometry type, it may still be defined as a non-geometric table in the database.

Supported Field Types

Tabor supports the following field types:

  • int → for whole numbers

  • numeric → for decimal numbers

  • text → for unlimited length character strings

It is strongly recommended that you have a primary key (pk) on each of your layers. You can define a field as a primary key using pk: true.
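
The rules above can be checked before a file is handed to Tabor. The following is a minimal sketch of such a check, not part of Tabor itself: it assumes a layer has already been loaded into a Python dictionary with the same keys as the example file, and the function name check_layer is hypothetical.

```python
SUPPORTED_GEOMETRIES = {"polygon", "polyline", "point"}
SUPPORTED_FIELD_TYPES = {"int", "numeric", "text"}


def check_layer(layer):
    """Return a list of human-readable problems found in one layer definition."""
    problems = []
    geometry = layer.get("geometry")
    # A layer without a geometry is allowed: it becomes a non-geometric table.
    if geometry is not None and geometry not in SUPPORTED_GEOMETRIES:
        problems.append(f"unsupported geometry type: {geometry}")
    fields = layer.get("fields", [])
    for field in fields:
        if field.get("type") not in SUPPORTED_FIELD_TYPES:
            problems.append(
                f"unsupported field type for {field.get('name')}: {field.get('type')}"
            )
    # A primary key is recommended rather than required, so this is only a warning.
    if not any(field.get("pk") for field in fields):
        problems.append("no primary key (pk) defined")
    return problems
```

An empty list means the layer uses only the geometry and field types listed above and defines a primary key.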