Introduction to geoCML

geoCML is a Containerized, Multi-paradigm, Lightweight deployment pattern for geographic information systems running locally or in the cloud. geoCML deployments package best-in-class open-source GIS software, including QGIS and PostGIS, allowing you to quickly go to production. In this topic, you will learn how geoCML deployments work so that you can make an informed decision about whether geoCML is a good choice for your needs.

How does geoCML work?

geoCML deployments include four containerized microservices: geocml-desktop (the primary user entrypoint), geocml-server (the secondary user entrypoint), geocml-postgres (the primary data store), and geocml-task-scheduler (an automation service, often abbreviated as gTS). The services communicate over an internal network called geocml-network. Once a deployment is running as a geoCML instance, you can begin using it to complete GIS workflows. By design, each geoCML instance should host only one project/dataset; multiple datasets should be hosted across multiple geoCML instances. Specific documentation on each service is available for your reference.

What can geoCML do?

geoCML deployments come out of the box with a variety of features:

  • Desktop GIS workflows

  • Cloud native server GIS workflows

  • Routine database backups

  • PostgreSQL/PostGIS database hosting

  • Python support for writing custom gTS automation tasks

  • Persist information between containerized services via the persistence-layer directory

  • Realtime GIS

  • Use less space/resources than other enterprise GIS deployments

What can geoCML not do out of the box?

geoCML is currently limited in the following areas:

  • Integration with ArcGIS proprietary deployments and extensions

  • Database version management

  • Backing up domain values in a database

  • Support for storing raster data in geocml-postgres

  • MySQL support, SQL Server support, etc. (base deployments only support PostgreSQL/PostGIS)

geoCML base deployments are customizable to fit your specific needs. You can customize your services however you’d like! If you would like to contribute to the development of geoCML, you can find the project on GitHub at github.com/geocml.

Quick deployment guide

Get up and running with geoCML in under 15 minutes!

geoCML deployments are multi-paradigm, offering a desktop, server, and web GIS experience with a single deployment. You may host your geoCML instance locally or in the cloud, depending on your needs.

Before instantiating a geoCML deployment, you must have Docker and Docker Compose installed on the machine that will host geoCML. You do not need any additional GIS software installed on the host machine. Once you have satisfied these conditions, follow these steps to deploy your geoCML instance.

  1. Clone the geoCML source code from github.com/geocml/geocml-base-deployment

  2. Open a terminal and cd into the source code directory

  3. Copy .env.example into a new file called .env

  4. Update your .env to include your deployment specific configuration variables

  5. Run sh build.sh to build geoCML service images on your machine.

  6. Run docker network create geocml-network

  7. Run sh start.sh to bring up the instance.

That’s it! You can access geoCML Desktop via {deployment host URL}:10000 or geoCML Server Portal via {deployment host URL}:80 using a web browser. Further configuration steps for each of these services are discussed in later topics.
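The steps above can be sketched as a small Python helper. This is only an illustration: the use of git for cloning, the default directory name, and the function names are assumptions, and editing .env (step 4) remains a manual step. The script prints the commands rather than running them.

```python
import shutil
from pathlib import Path

# Repository location taken from step 1.
REPO_URL = "https://github.com/geocml/geocml-base-deployment"


def deployment_commands(repo_dir="geocml-base-deployment"):
    """Return the commands from the steps above, in order, as argv lists.

    Everything after the clone is meant to be run from inside repo_dir (step 2).
    """
    return [
        ["git", "clone", REPO_URL, repo_dir],               # step 1 (git is an assumption)
        ["sh", "build.sh"],                                  # step 5
        ["docker", "network", "create", "geocml-network"],  # step 6
        ["sh", "start.sh"],                                  # step 7
    ]


def create_env_file(repo_dir):
    """Step 3: copy .env.example to .env without overwriting an existing .env."""
    env_path = Path(repo_dir) / ".env"
    if not env_path.exists():
        shutil.copyfile(Path(repo_dir) / ".env.example", env_path)
    return env_path


# Dry run: show what would be executed.
for argv in deployment_commands():
    print(" ".join(argv))
```

To actually run a command, you could pass each argv list to subprocess.run with cwd set to the source code directory.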

Using hosted geoCML images from GHCR

The geoCML development team hosts pre-built containers at our container registry on GitHub. These containers are a great way to demo geoCML, but please note that they are not production ready because they lack the required build arguments. If you want to use geoCML in production, please build your own containers.

Please keep in mind that the GEOCML_DESKTOP_PASSWORD variable in the .env file must be set to access your deployment via geoCML Desktop.
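As a quick sanity check before starting a deployment, a short script can confirm that the variable is present. This is a minimal sketch that assumes the .env file uses plain KEY=VALUE lines; the function names are illustrative and not part of geoCML.

```python
from pathlib import Path


def read_env(path):
    """Parse simple KEY=VALUE lines from a .env file, skipping blanks and comments."""
    values = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def desktop_password_is_set(path=".env"):
    """Return True when GEOCML_DESKTOP_PASSWORD has a non-empty value."""
    return bool(read_env(path).get("GEOCML_DESKTOP_PASSWORD"))
```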

geoCML Desktop

geoCML Desktop is the primary access point for your geoCML deployment. geoCML Desktop provides you with a desktop environment allowing you to prepare, visualize, and analyze your GIS data. The geocml-desktop container is based on Ubuntu Linux and comes with XPRA installed, allowing you to view running applications in a web browser. geoCML Desktop comes preinstalled with QGIS, a best-in-class open-source desktop GIS application. geoCML Desktop is password protected. You must set a password within your deployment’s .env file.

Connecting to geoCML Desktop

After instantiating your geoCML deployment, you can connect to geoCML Desktop via a web browser at {deployment host URL}:10000.
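If the page does not load, it can help to check whether the port is reachable at all. The sketch below builds the two user-facing URLs from the ports named in this guide and tests a TCP connection; the http scheme and the function names are assumptions.

```python
import socket


def service_urls(host):
    """Build the user-facing URLs from the default ports in this guide."""
    return {
        "desktop": f"http://{host}:10000",    # geoCML Desktop (XPRA in the browser)
        "server_portal": f"http://{host}:80",  # geoCML Server Portal
    }


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, port_is_open("localhost", 10000) reports whether geoCML Desktop is listening on a local deployment.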

Using geoCML Desktop

geoCML Desktop is designed to be a replacement for your typical desktop GIS experience. With the default geoCML Desktop, you can:

  • Use QGIS to prepare, aggregate, and visualize your GIS data

  • Configure geoCML Server Portal

  • Connect to the geocml-postgres service

Configuring geoCML Desktop

You may want to extend the default geocml-desktop service with additional applications or configurations that meet your needs. geoCML Desktop has a two-step configuration process: Dockerfile configuration and Ansible configuration.

Docker Configuration

You can use Docker to install packages in geoCML Desktop. Open geocml-base-deployment/Dockerfiles/Dockerfile.geocml-desktop in your favorite text editor, and between the Customize Container Here and End Customizations comments, add your use-case specific geoCML Desktop configuration steps.

Ansible Configuration

You can use Ansible to automate advanced, tedious configuration workflows. By default, geoCML Desktop uses Ansible to automate configuring your database. To further customize your advanced configurations, edit geocml-base-deployment/ansible-playbooks/geocml-desktop-playbook.yaml.

Understanding the persistence layer

geoCML deployments do not persist data; the file system within a service should never be changed directly except during the build process. You may wonder, then, how you are supposed to upload and change datasets within the geoCML Desktop service.

The persistence layer is a mutable and persistent directory shared between the geoCML Desktop service and the deployment’s host machine. geoCML Base Deployments use Docker to bind the host machine’s local persistence layer to a persistence layer within the geoCML Desktop service. This binding allows you to add files to the host machine’s persistence layer and have them available within the geoCML Desktop service. The persistence layer is also available to other services within the deployment.
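Because the persistence layer is an ordinary directory on the host, adding a dataset comes down to a file copy. The following sketch assumes you know where the persistence layer directory lives on your host; the function name and the example paths are hypothetical.

```python
import shutil
from pathlib import Path


def add_to_persistence_layer(source_file, persistence_dir):
    """Copy a dataset into the host's persistence layer directory.

    persistence_dir is the host-side directory that Docker binds into the
    services; its location depends on your deployment.
    """
    target_dir = Path(persistence_dir)
    if not target_dir.is_dir():
        raise FileNotFoundError(f"persistence layer not found: {target_dir}")
    # copy2 preserves timestamps and returns the path of the new file.
    return Path(shutil.copy2(source_file, target_dir))
```

A call such as add_to_persistence_layer("parcels.gpkg", "path/to/persistence-layer") would make the file visible inside geoCML Desktop through the bound directory.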

About geocml-project.qgz

geoCML Desktop will automatically open geocml-project.qgz in QGIS when you connect to the XPRA service. geocml-project.qgz is the central project file for your entire geoCML deployment. By design, all of your GIS work must be done in this file.

geoCML Postgres

By default, geoCML deploys a micro-service container with a PostGIS-enabled Postgres database for your project. This is the primary data store for a geoCML deployment.

Configuring geoCML Postgres

geoCML Postgres can be fully configured to suit the needs of your use case. geoCML Postgres has a two-step configuration process: Dockerfile configuration and Ansible configuration.

Docker Configuration

You can use Docker to install packages in your geoCML Postgres service deployment. Open geocml-base-deployment/Dockerfiles/Dockerfile.geocml-postgres in your favorite text editor, and between the Customize Container Here and End Customizations comments, add your use-case specific geocml-postgres configuration steps.

Ansible Configuration

You can use Ansible to automate advanced, tedious configuration workflows. By default, geoCML Postgres uses Ansible to automate configuring your database. To further customize your advanced configurations, edit geocml-base-deployment/ansible-playbooks/geocml-postgres-playbook.yaml.

Accessing your database

geoCML Postgres creates a database named geocml_db, which is the primary datastore for your geoCML project. Do not change the name of this database! You can access your database in several ways:

  • via geoCML Desktop and QGIS (over the internal geocml-network)

  • via a PostgreSQL data explorer (at {deployment host URL}:5432)

The default credentials for accessing your database are:

  • Username: geocml

  • Password: geocml

geoCML Postgres also configures an admin user for your data store. The default credentials for accessing your data store as an admin are:

  • Username: postgres

  • Password: admin

You can change the default password for both the geocml and postgres users in your deployment’s .env file. Note that these are build-time arguments; you must rebuild your containers to commit these changes.
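Most PostgreSQL clients accept a connection URI, so the defaults above can be assembled into one. This is a small sketch using only the standard library; the function name is illustrative, and you should substitute your own password if you changed it in .env.

```python
from urllib.parse import quote


def geocml_db_uri(host, user="geocml", password="geocml", port=5432):
    """Build a PostgreSQL connection URI for geocml_db.

    The defaults are the documented default credentials and port.
    User and password are percent-encoded so special characters survive.
    """
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/geocml_db"
    )


print(geocml_db_uri("localhost"))
```

The resulting string can be passed to tools such as psql or to a PostgreSQL driver of your choice.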

Current limitations of the geoCML Postgres service

In geoCML v0.3.0, there are known limitations with the geoCML Postgres service. Currently, geoCML Postgres does not support the following data:

  • Raster datasets

  • Domain fields

If you are working with raster data, please store it in the persistence layer rather than in geoCML Postgres.

geoCML Server

geoCML deploys a micro-service container with a QGIS Server instance, an Apache web server, and a React application called geoCML Server Portal (a frontend for your server GIS). This micro-service is responsible for implementing web and server GIS paradigms for your project. geoCML Server collects information from geocml-project.qgz to serve your data via several web services.

Configuring geoCML Server

geoCML Server can be fully configured to suit the needs of your use case. geoCML Server has a two-step configuration process: Dockerfile configuration and Ansible configuration.

Docker Configuration

You can use Docker to install packages in your geoCML Server service deployment. Open geocml-base-deployment/Dockerfiles/Dockerfile.geocml-server in your favorite text editor, and between the Customize Container Here and End Customizations comments, add your use-case specific geocml-server configuration steps.

Ansible Configuration

You can use Ansible to automate advanced, tedious configuration workflows. By default, geoCML Server uses Ansible to automate configuring Apache and QGIS Server. To further customize your advanced configurations, edit geocml-base-deployment/ansible-playbooks/geocml-server-playbook.yaml.

Accessing geoCML Server via the web

geoCML Server Portal is accessible via {deployment host URL}:80. geoCML Server Portal is a web application frontend for geoCML Server written in React. geoCML Server Portal acts as the secondary user entrypoint for your deployment, allowing you to share data with others over the internet. geoCML Server Portal relies on Web Map Service (WMS) information from geocml-project.qgz in order to properly display your instance information.

geoCML Server Portal features:

  • a description of your project,

  • your contact information,

  • copyright claims for your project’s data,

  • a hosted Leaflet web map,

  • WMS connection information,

  • WFS connection information,

  • WCS connection information,

  • a preview of your hosted data,

  • a list of all hosted data tables,

  • similar recommended datasets from DRGON

Accessing geoCML Server via an API

geoCML Server exposes all API functionality for QGIS Server via FastCGI and Apache. Learn more about QGIS Server here: https://docs.qgis.org/3.34/en/docs/server_manual/index.html
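QGIS Server implements the standard OGC services, so a GetCapabilities request is a common first call. The sketch below only builds the request URL; the endpoint path of your deployment is not stated in this guide, so it is passed in by the caller, and the function name is illustrative.

```python
from urllib.parse import urlencode

SUPPORTED_SERVICES = ("WMS", "WFS", "WCS")


def get_capabilities_url(endpoint, service="WMS"):
    """Build an OGC GetCapabilities URL for the given service endpoint.

    endpoint is the full URL of your QGIS Server endpoint (an assumption here;
    check your geoCML Server Portal for the actual connection information).
    """
    if service not in SUPPORTED_SERVICES:
        raise ValueError(f"unsupported service: {service}")
    params = {"SERVICE": service, "REQUEST": "GetCapabilities"}
    return f"{endpoint}?{urlencode(params)}"
```

The returned URL can be opened in a browser or fetched with urllib.request.urlopen to inspect the layers your project publishes.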

geoCML Task Scheduler (gTS)

geoCML deploys a micro-service container called geoCML Task Scheduler (gTS). This micro-service is responsible for automating routine tasks within your deployment such as backing up data from geoCML Postgres, restoring geocml_db from a backup, and healthchecking services within your deployment. geoCML Task Scheduler exposes a simple Python API for developing additional tasks.

Configuring geoCML Task Scheduler

geoCML Task Scheduler can be fully configured to suit the needs of your use case. geoCML Task Scheduler has a two-step configuration process: Dockerfile configuration and Ansible configuration.

Docker Configuration

You can use Docker to install packages in your geoCML Task Scheduler service deployment. Open geocml-base-deployment/Dockerfiles/Dockerfile.geocml-task-scheduler in your favorite text editor, and between the Customize Container Here and End Customizations comments, add your use-case specific geocml-task-scheduler configuration steps.

Ansible Configuration

You can use Ansible to automate advanced, tedious configuration workflows. To customize these configurations, edit geocml-base-deployment/ansible-playbooks/geocml-task-scheduler-playbook.yaml.

Understanding DBBackups

geoCML Postgres is not a persistent data store. Because of this, when your geoCML instance goes down, you will risk losing information in your data store. geoCML Task Scheduler handles this by automatically backing up geocml_db every hour. When your instance is brought back up, geoCML Task Scheduler will automatically restore your data store from the most recent backup. Backups are stored in the DBBackups directory in the persistence layer. Each DBBackup contains a .tabor file defining the schema of geocml_db and a series of CSV files representing the actual data in your tables.
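To inspect what gTS has saved, you can look through the DBBackups directory from the host. The sketch below assumes that each backup is stored as its own subdirectory and picks the newest one by modification time; that layout and the function names are assumptions, not documented behavior.

```python
from pathlib import Path


def latest_backup(dbbackups_dir):
    """Return the most recently modified entry in DBBackups, or None if empty."""
    entries = list(Path(dbbackups_dir).iterdir())
    if not entries:
        return None
    return max(entries, key=lambda p: p.stat().st_mtime)


def describe_backup(backup_dir):
    """List the .tabor schema files and CSV data files found in one backup."""
    backup = Path(backup_dir)
    return {
        "schema": sorted(p.name for p in backup.glob("*.tabor")),
        "tables": sorted(p.name for p in backup.glob("*.csv")),
    }
```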

Writing Tasks

You can create custom tasks in geoCML Task Scheduler.

  1. Open build-resources/geocml-task-scheduler/geocml-task-scheduler/ in your favorite text editor.

  2. Create a new Python file.

  3. Define a function in the new file with your task logic.

  4. Return 0 if you want your task to run only once. Otherwise, your task will run according to its position in the schedule.

  5. Save your Python file.

  6. Open schedule.py

  7. Create a new Task object, instantiated with your new function.

  8. Schedule your task for execution with its execution frequency (in seconds)

  9. Rebuild the geocml-task-scheduler container and deploy

Your new task is created, and it is scheduled for execution.
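As an illustration of steps 2 through 4, a task file might look like the sketch below. The function names and the heartbeat file path are hypothetical. Registering the functions in schedule.py is not shown, because the exact Task constructor is not described here.

```python
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical location; point this at a path inside your persistence layer.
HEARTBEAT_FILE = Path("/tmp/geocml-heartbeat.txt")


def write_heartbeat(path=HEARTBEAT_FILE):
    """Recurring task: record the time of the latest run."""
    Path(path).write_text(datetime.now(timezone.utc).isoformat() + "\n")
    # There is no "return 0" here, so the task keeps running on its schedule.


def announce_startup():
    """One-off task: returning 0 asks the scheduler to run this only once."""
    print("custom gTS tasks loaded")
    return 0
```

Keeping each task free of required arguments makes it straightforward to hand the function to a Task object in schedule.py.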

Tabor

Tabor is a database modeling language for GIS based on YAML, but with additional syntax restrictions. The goal of Tabor is to allow GIS users to create and maintain complex database rules using plain-text configuration files. The following is an example of a Tabor configuration file for a PostGIS database:

tabor: 0.3.0
layers:

- name: grass
  schema: public
  owner: geocml
  geometry: polygon
  srid: 4326
  fields:
    - name: fid
      type: int
      pk: true

- name: trees
  schema: public
  owner: geocml
  geometry: point
  srid: 4326
  fields:
    - name: fid
      type: int
      pk: true
    - name: genus
      type: text
    - name: species
      type: text
    - name: height_meters
      type: numeric
    - name: circumference_cm
      type: numeric
  constraints:
    - name: on
      layer: grass

- name: streams
  schema: public
  owner: geocml
  geometry: polyline
  srid: 4326
  fields:
    - name: fid
      type: int
      pk: true

Running this file through the Tabor command line utility generates a valid PostgreSQL schema query that can be used to create or update tables in a PostGIS database.

Downloading and Installing Tabor

Tabor can be downloaded directly from https://github.com/geoCML/tabor. After downloading, simply extract the downloaded .zip file to a directory accessible on your terminal path.

Command Line Usage

tabor read --file <path/to/file> → Converts a .tabor file into a PostGIS schema query.

tabor write --file <path/to/file> --db <name_of_psql_db> --username <name of db user> --password <password of db user?> --host <host of psql db?> --port <port of psql db?> --ignore <tables to ignore?> → Converts a PostGIS database to a .tabor file

tabor load --file <path/to/file> --db <name_of_psql_db> --username <name of db user> --password <password of db user?> --host <host of psql db?> --port <port of psql db?> → Loads a PostGIS database from a .tabor file.
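If you want to call Tabor from a script, the argument list can be assembled programmatically. This sketch covers tabor write and treats the flags marked with a question mark above as optional, which is an assumption based on that notation; the function name is illustrative.

```python
def tabor_write_args(file, db, username, password=None, host=None, port=None, ignore=None):
    """Build the argv list for `tabor write` from the usage shown above.

    Flags whose value is None are left out of the command.
    """
    args = ["tabor", "write", "--file", file, "--db", db, "--username", username]
    optional = {
        "--password": password,
        "--host": host,
        "--port": port,
        "--ignore": ignore,
    }
    for flag, value in optional.items():
        if value is not None:
            args += [flag, str(value)]
    return args


print(" ".join(tabor_write_args("schema.tabor", "geocml_db", "geocml", port=5432)))
```

The list can then be passed to subprocess.run(args, check=True) on a machine where tabor is on the terminal path.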

Supported Geometry Types

Tabor supports the following geometry types:

  • polygon → for 2D shapes such as political boundaries

  • polyline → for 1D shapes such as roadway centerlines

  • point → for 0D features such as trees

If a layer has no geometry type, it may still be defined as a non-geometric table in the database.

The spatial reference of each layer in your Tabor file can be defined with the srid field, followed by the SRID of your reference system.

Supported Field Types

Tabor supports the following primitive field types:

  • int → for whole numbers

  • numeric → for decimal numbers

  • text → for unlimited length character strings

  • boolean → for true or false values

Tabor also supports arrays of primitive data. You can define an array as type: <primitive> array

It is strongly recommended that you have a primary key (pk) on each of your layers. You can define a field as a primary key using pk: true.

Data Constraints

You can define complex business logic in your Tabor file with data constraints. Data constraints ensure that changes to your database tables meet specific rules before they are committed to the database. You can include as many data constraints on your layers as you need.

Tabor supports the following data constraints:

  • on → Checks that new features are at least partially within the boundaries of another layer (works with all geometry types)

ex.

constraints:
  - name: on
    layer: other_layer

  • length → Checks that new polyline features have either a minimum or maximum length

ex.

constraints:
  - name: length
    minimum: 0.5 # or maximum: 99.5; you must include either a minimum or a maximum value, but not both!

  • near → Checks that new features are placed within a given distance of another layer (works with all geometry types)

ex.

constraints:
  - name: near
    distance: 15.6
    layer: other_layer
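To make the length rule concrete, the following Python sketch mimics what such a check does. It is an illustration of the idea only, not Tabor's implementation: it measures planar length in coordinate units, whereas the units Tabor uses are not specified here, and all function names are hypothetical.

```python
import math


def validate_length_constraint(constraint):
    """Require exactly one of minimum or maximum, as the length rule demands."""
    has_min = "minimum" in constraint
    has_max = "maximum" in constraint
    if has_min == has_max:
        raise ValueError("a length constraint needs exactly one of minimum or maximum")
    return constraint


def polyline_length(points):
    """Planar length of a polyline given as a list of (x, y) vertices."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def satisfies_length(points, constraint):
    """Return True when the polyline meets the length constraint."""
    validate_length_constraint(constraint)
    length = polyline_length(points)
    if "minimum" in constraint:
        return length >= constraint["minimum"]
    return length <= constraint["maximum"]
```

For instance, a two-vertex line from (0, 0) to (3, 4) has a planar length of 5, so it passes a constraint with minimum: 0.5 and fails one with maximum: 4.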

DRGON

DRGON (pronounced 'Dragon') is a Distributed Registry of GISystems Over a Network. DRGON maintains a registry of geoCML deployments over the internet (or an intranet, if you prefer). Using a simple REST API, you can easily query DRGON to find the perfect dataset. A public registry is hosted at https://drgon.geocml.com, but you may also self-host a DRGON instance, depending on your needs.

Quickstart Guide

Before interacting with DRGON, you must first have a hosted geoCML deployment with a properly configured geoCML Server Portal. Next, register for an API key via a POST request to <DRGON_HOST>/apikey. You must provide an email address in the request body. Copy your API key to a safe place; you will only be able to view it once!
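The API key request can be prepared with Python's standard library, as in the sketch below. It assumes the request body is JSON with a single email field, which matches the shape shown in the API reference but is not confirmed in detail; the function name is illustrative. The request is built but not sent.

```python
import json
from urllib.request import Request


def api_key_request(drgon_host, email):
    """Prepare (but do not send) the POST request that asks DRGON for an API key.

    drgon_host must not end with a trailing slash.
    """
    body = json.dumps({"email": email}).encode("utf-8")
    return Request(
        f"{drgon_host}/apikey",
        data=body,
        headers={"Content-Type": "application/json"},  # JSON body is an assumption
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends the request; remember to store the key from the response right away.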

On your deployment’s server machine, add the following values to your .env file:

  1. DRGON_HOST: the host URL of the DRGON instance you want to use (Do not include trailing slash)

  2. DRGON_API_KEY: your DRGON API key

  3. GEOCML_DEPLOYMENT_HOST: the domain name of your hosted geoCML instance

Rebuild your geoCML deployment, and restart your instance. After about a minute of up-time, geoCML Task Scheduler will ping DRGON and automatically register your deployment.

Self Hosting a DRGON Instance

You can host your own dedicated DRGON instance with a few simple steps:

  1. Clone the DRGON source code from github.com/geocml/drgon

  2. Open a terminal and cd into the source code directory

  3. Copy .env.example into a new file called .env

  4. Update your .env to include your deployment specific configuration variables

  5. Run sh build.sh to build DRGON service images on your machine.

  6. Run docker network create drgon-network

  7. Run sh start.sh to bring up the instance.

API Reference

| Endpoint | Method | Request Body | Description |
|---|---|---|---|
| apikey/ | POST | { email: "me@example.com" } | Requests an API key necessary for registering your geoCML deployment with DRGON. This key is 100% free, forever. |
| registry/ | GET | None | Requests a registry containing all geoCML deployments known to DRGON. |
| registry/ | POST | { key: "YOUR_API_KEY", title: "DEPLOYMENT TITLE", description: "DEPLOYMENT DESCRIPTION", owner: "OWNER NAME", tags: "TAG,TAG,TAG" } | Requests that a hosted geoCML Deployment be registered with DRGON. |
| recommendations/ | GET | ?tags=TAG,TAG,TAG&limit=999 | Requests DRGON for recommended datasets based on the provided tags. |
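As an example of querying the registry, the recommendations endpoint takes its parameters in the query string. The sketch below builds such a URL; the function name and the default limit are illustrative choices, and the host is expected without a trailing slash.

```python
from urllib.parse import urlencode


def recommendations_url(drgon_host, tags, limit=10):
    """Build the GET URL for the recommendations endpoint.

    tags is a list of strings that is joined with commas, as in the reference above.
    """
    query = urlencode({"tags": ",".join(tags), "limit": limit}, safe=",")
    return f"{drgon_host}/recommendations/?{query}"


print(recommendations_url("https://drgon.geocml.com", ["rivers", "trees"], limit=5))
```

The URL can be fetched with urllib.request.urlopen or any HTTP client.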

Moderating DRGON

Whoever hosts a DRGON instance is free to moderate their registry however they wish. DRGON automatically moderates deployments containing banned words (e.g. offensive words); however, manual review and moderation of deployments in your registry is highly recommended.

The DRGON development team recommends that you prevent users from abusing your registry, either by uploading duplicate datasets, or by registering non-geoCML deployments. If a user is abusing your registry, you can revoke their API key within the drgon-postgres service database.