MQTT Compact Message DBM

At STI we are developing a custom data management backend consisting of a database and a database manager (DBM). The DBM interprets the relevant MQTT messages and performs a few other useful tasks.

The DBM will accommodate any type of instrument as long as the DBM is able to interpret the instrument’s payload. For this reason, the Weather Station MQTT messages follow certain payload formatting and topic pattern conventions.

The end user needs to know these conventions in order to interpret the data.

The station administrator should design new applications by following these conventions in order for the payload to be recognized and logged in the database correctly.

MQTT Compact Message Topic Specification

All topics shall be of the form:

ws/<action>/<remainder>

The topic header,

ws (mandatory)
is a reserved keyword. All topics in the ws/ tree will be parsed by the database logger. Care should be taken to only publish the relevant data to this tree.
<action> (mandatory)
is one of the reserved keywords. Currently recognized values are:
  • d (uplink) data
  • j join
  • e error
  • r response
  • c command
<remainder>
is action-specific and is described for each action separately in the respective section.
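
As an illustration of the topic layout described above, the following Python sketch splits a topic into its action and remainder. The function name split_topic and the ACTIONS mapping are hypothetical and are not part of the DBM; only the keywords themselves are taken from the list above.

```python
# Reserved action keywords as listed above; the descriptions are for illustration.
ACTIONS = {"d": "data", "j": "join", "e": "error", "r": "response", "c": "command"}


def split_topic(topic):
    """Split 'ws/<action>/<remainder>' into (action, remainder).

    Raises ValueError when the topic is outside the ws/ tree or the
    action is not one of the recognized keywords.
    """
    parts = topic.split("/", 2)
    if len(parts) < 2 or parts[0] != "ws":
        raise ValueError(f"not a ws/ topic: {topic!r}")
    action = parts[1]
    if action not in ACTIONS:
        raise ValueError(f"unknown action {action!r} in {topic!r}")
    remainder = parts[2] if len(parts) == 3 else ""
    return action, remainder


print(split_topic("ws/d/GHA/accra/legon/rain"))
```

The remainder is returned unparsed because its meaning depends on the action.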

Meteorological instruments publish to d, j, e.

An end user can subscribe to d, j, e, r (some of these may need permission) and publish to c (with the same caveat).

The database logger shall publish to r.

Please note that the topic structure is defined in the application.

MQTT Compact Message Payload Specification

This section is a guide to the Weather Station MQTT message format used in this project.

The meteorological information that leaves a gateway as MQTT messages may contain, besides the actual meteorological data, metadata and some networking information.

Each MQTT message is encapsulated in a JSON object. Its structure is the same across all meteo-instruments. The station administrator should design the data decoder considering the constraints that are listed below.

To save bandwidth and prevent the message from being overly bloated, a custom JSON object is implemented. Here is a sample JSON object (222 bytes), from a MeteoHelix device, pretty-printed for clarity:

{
        "d":    [18, 18, 18, 41.4, 100780, 0, 0],
        "h":    {
                "bp":   58,
                "pv":   3.9
        },
        "i":    "0004a30b0021fbae",
        "p":    "652700019f3d60000008ff",
        "t":    "2023-04-17T23:21:08",
        "x":    {
                "b":    125,
                "c":    "4/5",
                "f":    868100000,
                "i":    "24e124fffef580c6",
                "rssi": -63,
                "s":    11,
                "snr":  9.8
        }
}

This scheme has been designed to keep the messages relatively short. Most payloads with all metadata included should fit in between 200 and 300 bytes; without the metadata, they should probably fit in less than 50 bytes.

Assuming a transmission rate of one message every 10 minutes, the data consumption can be estimated at less than 1.3 MB per month per device with all metadata included.
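
The monthly estimate can be reproduced with a short calculation. The sketch below assumes a 30-day month, the upper end of the quoted payload size (300 bytes), and 1 MB = 10**6 bytes; it counts payload bytes only and ignores MQTT and transport overhead.

```python
# Back-of-the-envelope check of the monthly data volume per device.
MESSAGE_BYTES = 300      # upper end of the 200-300 byte range quoted above
INTERVAL_MINUTES = 10    # one message every 10 minutes
DAYS_PER_MONTH = 30      # assumption: a 30-day month

messages_per_month = DAYS_PER_MONTH * 24 * 60 // INTERVAL_MINUTES
bytes_per_month = messages_per_month * MESSAGE_BYTES

print(messages_per_month)            # 4320
print(bytes_per_month / 1_000_000)   # 1.296
```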

In most cases the direct consumer of these messages will be a computer program and not a human being, so “human readability” is not really so important. However, one can still read off the data quite easily with a little practice. Read on to understand how.

All MQTT messages shall be of the form:

i (mandatory)
Device EUI. Format: hexadecimal string (16). This is the main device ‘identifier’ and must be present in all messages.
t (mandatory)
UTC Time stamp. Time stamp is a critical datum and therefore must be present.
d
Optional sensor data: meteorological data. Format: JSON array (see below). Although sensor data is not mandatory, it makes sense to provide it.
x
Optional link data: networking meta-data. Format: JSON object (see below).
h
Optional device data: battery levels and various alerts that some devices provide. This is data that is not part of the meteorological payload.
p
Optional payload data: the original LoRa payload. Format: hexadecimal string of variable length.

Messages containing at least the fields “i” and “t” shall be parsed (although to have any effect whatsoever, at least one of “d”, “x”, “h”, “p” should also be defined). For debugging, you can send “almost empty” messages with only “i” and “t” fields.
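
A consumer may want to check these rules before processing a message. The following is a minimal sketch, not the DBM’s actual parser: the function name check_message is hypothetical, and the timestamp format is assumed to match the sample message shown earlier.

```python
import json
import re
from datetime import datetime

EUI_RE = re.compile(r"[0-9a-fA-F]{16}")
OPTIONAL_KEYS = ("d", "x", "h", "p")


def check_message(raw):
    """Parse a compact message and return (message, optional keys present)."""
    msg = json.loads(raw)
    if not isinstance(msg, dict):
        raise ValueError("payload must be a JSON object")
    for key in ("i", "t"):
        if key not in msg:
            raise ValueError(f"missing mandatory field {key!r}")
    if not isinstance(msg["i"], str) or not EUI_RE.fullmatch(msg["i"]):
        raise ValueError("'i' must be a 16-character hexadecimal string")
    if not isinstance(msg["t"], str):
        raise ValueError("'t' must be a string")
    # Assumption: timestamps look like the sample, e.g. "2023-04-17T23:21:08".
    datetime.strptime(msg["t"], "%Y-%m-%dT%H:%M:%S")
    present = [k for k in OPTIONAL_KEYS if k in msg]
    return msg, present


raw = '{"i": "0004a30b0021fbae", "t": "2023-04-17T23:21:08", "d": [18, 18, 18, 41.4, 100780, 0, 0]}'
msg, present = check_message(raw)
print(present)  # ['d']
```

An “almost empty” debugging message passes the check and yields an empty list of optional keys.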

Meteo-variables

The meteorological data is a JSON array which can be accessed by the payload key d. The interpretation of the array elements is instrument-specific (and could be model-specific). A detailed description can be found on the MQTT Compact Message Decoders page.

Networking metadata

The networking information is a JSON object which can be accessed by the payload key x. All meteorological devices provide the networking data with the same set of elements:

i (mandatory)
Gateway’s MAC address. Format: hexadecimal string (16)
rssi
Connection’s RSSI value
snr
Connection’s signal to noise ratio
f
Frequency
b
Bandwidth
s
Spreading factor
c
Code rate.

Device-specific information

The device-info data is a JSON struct which can be accessed by the payload key h. Meteorological instruments are expected to provide device info data but the set of elements is instrument-specific. Usually, they should provide at least the battery level:

bp
Battery level as percentage.
bv
Battery level as potential in volts

Each instrument may or may not provide additional fields which are documented in the legend and in the following sections.

Payload data

The payload data is the original LoRa payload as sent by the instrument, accessible by the payload key p. Its format is a JSON string.

The ws/d subtree is where the meteo-instruments publish their data. This subtree is of interest to the end user who wants to subscribe to meteorological data.

The meteo-instruments shall send their payload to a topic consisting of four parts with this structure:

ws/d/<project>/<subtopic>

Example

ws/d/GHA/accra/legon/rain

Description

<project> (mandatory)
specify an autonomous project, for example, “GHA”, “TimorLeste”, “test”, etc.
<subtopic>=<region>/<station>/<instrument>
provides arbitrary “local” content without any specific restrictions (maximum string length: 128 characters). In particular <subtopic> may be further split using forward slashes (/). From the DBM perspective, the splitting scheme can be freely chosen by the designer. Our recommendation is to split the subtopic into the region, station, and instrument parts, where
<region>
designates a region where the gateway is deployed (a district or a city name is a good candidate).
<station>
designates the locality of the weather station.
<instrument>
designates the type of the instrument in human readable form, and should serve to easily recognize the type of meteorological information and generally help identify an instrument.
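
When the recommended region/station/instrument split is used, a data topic can be broken into named parts as sketched below. The DataTopic type and the parse_data_topic function are hypothetical names for illustration; topics that use a different subtopic layout, which the DBM also permits, are rejected by this sketch.

```python
from typing import NamedTuple


class DataTopic(NamedTuple):
    project: str
    region: str
    station: str
    instrument: str


def parse_data_topic(topic):
    """Split 'ws/d/<project>/<region>/<station>/<instrument>' into its parts."""
    parts = topic.split("/")
    if len(parts) != 6 or parts[:2] != ["ws", "d"]:
        raise ValueError(f"unexpected data topic: {topic!r}")
    return DataTopic(*parts[2:])


print(parse_data_topic("ws/d/GHA/accra/legon/rain"))
```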

The recommended topic scheme makes it easy to filter messages. Here are examples of subscription by certain criteria:

ws/d/GHA/#
subscribe to all data from Ghana.
ws/d/+/+/legon/#
subscribe to the Legon weather station
ws/d/+/+/+/rain
subscribe to all instruments of type rain.
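
To see which topics a given subscription would receive, the MQTT wildcard rules can be sketched in a few lines of Python. This is a simplified illustration of standard MQTT matching ('+' matches exactly one level, '#' matches all remaining levels); the function name topic_matches is hypothetical, and the special handling of topics starting with '$' is not covered.

```python
def topic_matches(pattern, topic):
    """Return True if an MQTT subscription pattern matches a topic.

    '+' matches exactly one level; '#' matches all remaining levels
    (including none) and is only valid as the last level of the pattern.
    """
    p_levels = pattern.split("/")
    t_levels = topic.split("/")
    for idx, level in enumerate(p_levels):
        if level == "#":
            return idx == len(p_levels) - 1
        if idx >= len(t_levels):
            return False
        if level != "+" and level != t_levels[idx]:
            return False
    return len(p_levels) == len(t_levels)


topic = "ws/d/GHA/accra/legon/rain"
for pattern in ("ws/d/GHA/#", "ws/d/+/+/legon/#", "ws/d/+/+/+/rain"):
    print(pattern, topic_matches(pattern, topic))
```

All three subscription examples above match the sample topic ws/d/GHA/accra/legon/rain.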

ws/r DB Response channel

Responses are simple text messages that provide some information about the outcome of a command. They are a good way to debug or verify the results of commands.

Topic structure

ws/r/<hardware-id>/<cmd>

Examples

ws/r/#
all responses
ws/r/<hid>/#
all responses for device with identifier <hid>.

All commands of the form ws/c/1234567890abcdef/<cmd> will publish their responses to the topic ws/r/1234567890abcdef/<cmd>.
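
A client that issues a command can derive the topic on which to expect the response. The helper below is a hypothetical sketch of that mapping; it only rewrites the topic string and does not talk to a broker.

```python
def response_topic(command_topic):
    """Map 'ws/c/<hid>/<cmd>' to the matching 'ws/r/<hid>/<cmd>' topic."""
    parts = command_topic.split("/")
    if len(parts) != 4 or parts[:2] != ["ws", "c"]:
        raise ValueError(f"not a command topic: {command_topic!r}")
    return "/".join(["ws", "r"] + parts[2:])


print(response_topic("ws/c/1234567890abcdef/crd"))  # ws/r/1234567890abcdef/crd
```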

ws/c DB Command channel

It is possible to query the database using MQTT messages, by publishing and subscribing to certain topics.

Note

This section as well as the logger command functionality is WORK IN PROGRESS. For security reasons some of these features may require an authorized user.

Database commands shall be published to a topic of the form

ws/c/<hid>/<cmd>

<hid>
is the device’s hardware identifier, which must be a valid 16-character string representing 8 bytes of data in hexadecimal format (example: “0123456789abcdef”). Depending on the device type, it is either the device EUI (end nodes) or the MAC address (gateways).
<cmd>
is one of the allowed commands (see below).

Depending on the command, a message body may be required, and the effect of the command also may depend on the contents of a message’s body.
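
Because a malformed identifier produces a topic that does not have the expected form, it can help to validate the parts before publishing. The following is a minimal sketch with a hypothetical helper name, command_topic; accepting upper-case hexadecimal digits is an assumption, since the examples in this document use lower case only.

```python
import re

# 16 hexadecimal characters, i.e. 8 bytes of data.
HID_RE = re.compile(r"[0-9a-fA-F]{16}")


def command_topic(hid, cmd):
    """Build 'ws/c/<hid>/<cmd>' after checking the hardware identifier."""
    if not HID_RE.fullmatch(hid):
        raise ValueError(f"invalid hardware identifier: {hid!r}")
    if not cmd or "/" in cmd:
        raise ValueError(f"invalid command name: {cmd!r}")
    return f"ws/c/{hid}/{cmd}"


print(command_topic("0123456789abcdef", "crd"))  # ws/c/0123456789abcdef/crd
```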

Available and foreseen commands

cmd | works? | description                           | payload?
crd | yes    | set or change coordinates and geohash | JSON object
get | no     | get information                       | none or a JSON array with database field names
set | no     | set field value                       | JSON object, restrictions on allowed fields apply
new | no     | create new device                     | none or a JSON struct with database field names as keys
rm  | no     | remove device                         | none
    | no     | fetch payload’s legend                | none

crd command

Set/update geographical coordinates of a device. Requires a valid hardware ID.

Message format

ws/c/<valid-hardware-id>/crd

Payload

The payload is a JSON object with the following fields:
lat (mandatory)
latitude (float)
lon (mandatory)
longitude (float)
alt (optional)
altitude. Set to zero if not provided.

Example

Topic: ws/c/1234567890abcdef/crd

Payload: {"lat": 1.234, "lon": 5.678, "alt": 90}
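
JSON requires double-quoted keys and strings, so it is safest to let a JSON library build the payload. The sketch below uses a hypothetical helper, crd_payload; the range checks on latitude and longitude are an assumption based on the usual geographic conventions and are not taken from the DBM.

```python
import json


def crd_payload(lat, lon, alt=None):
    """Serialize coordinates for the crd command as a JSON object."""
    if not -90.0 <= lat <= 90.0:
        raise ValueError("latitude must be within [-90, 90]")
    if not -180.0 <= lon <= 180.0:
        raise ValueError("longitude must be within [-180, 180]")
    body = {"lat": lat, "lon": lon}
    if alt is not None:
        body["alt"] = alt  # optional; the DBM treats a missing altitude as zero
    return json.dumps(body)


print(crd_payload(1.234, 5.678, 90))
```

The resulting string can be passed as the message body to the publish call of any MQTT client.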

Actions

If the payload is empty and a valid <hid> is provided, a message containing the coordinates of the device <hid> is published to the topic ws/r/<hid>/crd.

If the payload is not empty and a valid <hid> is provided, the DBM performs the following actions:

  1. Compute the geohash corresponding to the coordinates in the payload.
  2. Compare the new geohash with the old one. If they agree up to a certain resolution, the current active device is updated with the new coordinates. Otherwise, a new device is created with the new values and the old device is “retired” by setting its is_active flag to False.
  3. Publish a response message with an acknowledgment to the topic ws/r/<valid-hardware-id>/crd. Sample payload: “0004a30b00f1d19a/crd completed with code 0”
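
The geohash comparison can be illustrated with the sketch below. It implements the standard geohash encoding and then compares a fixed-length prefix of the old and new hashes. The comparison rule, the function names geohash and crd_action, and the default resolution of 6 characters are assumptions made for illustration; the DBM’s actual rule and resolution are not specified in this document.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash(lat, lon, precision=9):
    """Encode a latitude/longitude pair as a standard geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    value = 0
    use_lon = True  # geohash bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = lon >= mid
            if bit:
                lon_lo = mid
            else:
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            if bit:
                lat_lo = mid
            else:
                lat_hi = mid
        value = value * 2 + int(bit)
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # every 5 bits form one base-32 character
            chars.append(BASE32[value])
            bits = 0
            value = 0
    return "".join(chars)


def crd_action(old_hash, lat, lon, resolution=6):
    """Return ('update' or 'replace', new_hash) for new coordinates.

    Assumption: two positions count as the same site when their
    geohashes share the first `resolution` characters.
    """
    new_hash = geohash(lat, lon)
    if old_hash[:resolution] == new_hash[:resolution]:
        return "update", new_hash
    return "replace", new_hash


print(geohash(57.64911, 10.40744, 5))  # u4pru
```

A longer shared prefix means a smaller cell, so a larger resolution value makes the check more sensitive to small moves.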