Total Open Station: a specialised format converter

It’s 2017 and nine years ago I started writing a set of Python scripts that would become Total Open Station, a humble GPL-licensed tool to download and process data from total station devices. I started from scratch, using the Python standard library and pySerial as best as I could, to create a small but complete program. Under the hood, I’ve been “religiously” following the UNIX philosophy of one tool that does one thing well and that is embodied by the two command line programs that perform the separate steps of:

  1. downloading data via a serial connection
  2. converting the raw data to formats that can be used in GIS or CAD environments

And despite starting as an itch to scratch, I also wanted TOPS to be used by others, to provide something that was absent from the free software world at the time, and that is still unchallenged in that respect. So a basic and ugly graphical interface was created, too. That gives a more streamlined view of the work, and greatly increases the number of potential users. Furthermore, TOPS can run not just on Debian, Ubuntu or Fedora, but also on macOS and Windows, and it is well known that users of the latter operating systems are not too fond of working from a terminal.

Development has always been slow. After 2011 I had only occasional use for the software myself, no access to a real total station, so my interest shifted towards giving a good architecture to the program and extending the number of formats that can be imported and exported. In the process, this entailed rewriting the internal data structures to allow for more flexibility, such as differentiating between point, line and polygon geometries.

Today, I still find GUI programming out of my league and interests. If I’m going to continue developing TOPS it’s for the satisfaction of crafting a good piece of software, learning new techniques in Python or maybe rewriting entirely in a different programming language. It’s clear that the core feature of TOPS is not being a workstation for survey professionals (since it cannot compete with the existing market of proprietary solutions that come attached to most devices), but rather becoming a polyglot converter, capable of handling dozens of raw data formats and flexibly exporting to good standard formats. Flexibly exporting means that TOPS should have features to filter data, to reproject data based on fixed base points with known coordinates, to create separate output files or layers and so on. Basically, to adapt to many more needs than it does now. From a software perspective, there are a few notable examples that I’ve been looking at for a long time: Sphinx, GPSBabel and Pandoc.
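One possible reading of “reproject data based on fixed base points with known coordinates” is a 2D similarity transform (shift, rotation and uniform scale) fitted on two base points. The sketch below illustrates that idea only; it is not code from TOPS, the function name is hypothetical and the coordinates are made up. Heights are left untouched.

```python
def fit_similarity(local_a, local_b, known_a, known_b):
    """Return a function mapping local (x, y) to the known coordinate system.

    Assumes a 2D similarity transform (shift, rotation, uniform scale)
    determined by exactly two base points. Hypothetical helper, not TOPS API.
    """
    la, lb = complex(*local_a), complex(*local_b)
    ka, kb = complex(*known_a), complex(*known_b)
    if la == lb:
        raise ValueError("base points must be distinct")
    # A single complex factor encodes both the rotation and the scale.
    factor = (kb - ka) / (lb - la)

    def transform(point):
        z = ka + (complex(*point) - la) * factor
        return (z.real, z.imag)

    return transform


# Made-up example: the local grid is rotated by 90 degrees and shifted.
to_known = fit_similarity((0, 0), (10, 0), (1000, 2000), (1000, 2010))
x, y = to_known((5, 0))
```

With more than two base points a least-squares fit would be the natural extension, at the price of a more involved implementation.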

Sphinx is a documentation generator written in Python, the same language I used for TOPS. You write a light markup source, and Sphinx can convert it to several formats like HTML, ePub, LaTeX (and PDF), groff. You can write short manuals, like the one I wrote for TOPS, or entire books. Sphinx accepts many options, mostly from a configuration file, and I took a few lines of code that I liked for handling the internal dictionary (key-value hash) of all input and output formats with conditional import of the selected module (rather than importing all modules that won’t be used). Sphinx is clearly excellent at what it does, even though the similarities with TOPS are not many. After all, TOPS has to deal with many undocumented raw formats while Sphinx has the advantage of only one standard format. Sphinx was originally written by Georg Brandl, one of the best Python developers and a contributor to the standard library, in a highly elegant object-oriented architecture that I’m not able to replicate.
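The idea of a dictionary of formats with conditional import can be sketched roughly as follows. This is not the actual code from Sphinx or TOPS: the registry contents and the `load_handler` name are assumptions, and standard-library modules stand in for real format modules.

```python
import importlib

# Hypothetical registry: format name -> (module path, attribute name).
# Standard-library modules stand in for real parser modules here.
FORMATS = {
    "json": ("json", "loads"),
    "csv": ("csv", "reader"),
}


def load_handler(name):
    """Import the module for one format only when it is requested."""
    try:
        module_path, attribute = FORMATS[name]
    except KeyError:
        raise ValueError("unsupported format: %s" % name)
    module = importlib.import_module(module_path)
    return getattr(module, attribute)


# Only the json module is imported by this call; csv is never touched.
parse = load_handler("json")
data = parse('{"id": 1, "x": -0.472}')
```

The registry doubles as the list of supported formats, so a user interface can show all of them without importing any module.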

GPSBabel is a venerable and excellent program for GPS data conversion and transfer. It handles dozens of formats in read/write mode and each format has “suboptions” that are specific to it. GPSBabel also has advanced filtering capabilities, it can merge multiple input files, and for a few years now it has had a minimal graphical interface. Furthermore, GPSBabel is integrated in GIS programs like QGIS and can work in a variety of ways thanks to its programmable command line interface. A strong difference with TOPS is that many of the GPS data formats are binary, and that the basic data structures of waypoints, tracks and routes are essentially the same across formats (contrast that with the monster LandXML specification, or the dozens of possible combinations in a Leica GSI file). GPSBabel is written in portable C++, which I can barely read, so anything other than inspiration for the user interface is out of the question.

Pandoc is a universal document converter that reads many markup document formats and can convert to a greater number of formats including PDF (via LaTeX), docx, OpenDocument. The baseline format for Pandoc is an enriched Markdown. There are two very interesting features of Pandoc as a source of inspiration for a converter: the internal data representation and the Haskell programming language. The internal representation of the document in Pandoc is an abstract syntax tree that is not necessarily as expressive as the source format (think of all the typography and formatting in a printed document) but it can be serialised to/from JSON and allows filters to work regardless of the input or output format. Haskell is a functional language that I have never programmed in, although it lends itself to creating complex and efficient programs that are easily extended. Pandoc works from the command line and has a myriad of options; it’s also rather common to invoke it from Makefiles or short scripts since one tends to work iteratively on a document. I could see a future version of TOPS being rewritten in Haskell.
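Applied to survey data, the same idea might look like the sketch below: a format-neutral structure that can be serialised to and from JSON, plus a filter that never needs to know where the data came from or where it is going. The structure and the function names are hypothetical; the first point is the one recorded with the Elta R55, the second is made up.

```python
import json

# Hypothetical, format-neutral representation of a survey: a list of plain dicts.
survey = [
    {"id": "1", "geometry": "point", "coords": [-0.472, 1.576, 0.004]},
    {"id": "2", "geometry": "point", "coords": [3.250, 0.800, 1.200]},  # made up
]


def to_json(features):
    """Serialise the neutral structure, e.g. to pipe it into an external filter."""
    return json.dumps(features, sort_keys=True)


def from_json(text):
    """Read the neutral structure back."""
    return json.loads(text)


def below_height(features, max_z):
    """Filter that only looks at the neutral structure, never at a file format."""
    return [f for f in features if f["coords"][2] <= max_z]


restored = from_json(to_json(survey))
low_points = below_height(restored, max_z=1.0)
```

Any input parser would produce this structure and any output writer would consume it, so a filter written once works for every pair of formats.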

Scriptability and mode of use both seem to be important concepts to keep in mind for a data converter. For total stations, a common workflow is to download raw data, archive the original files and then convert to another format (or even insert directly into a spatial database). With the two programs totalopenstation-cli-connector and totalopenstation-cli-parser such tasks are easily automated in a single master script (or batch procedure) using a timestamp as identifier for the job and the archived files. This means that once the right parameters for your needs are found, downloading, archiving and loading survey data in your working environment is a matter of seconds, with no point-and-click, no icons, no mistakes. Looking at GPSBabel, I wonder whether keeping the two programs separate really makes sense from a UX perspective, as it would be more intuitive to have a single totalopenstation executable. In fact, this dual approach is a direct consequence of the small footprint of totalopenstation-cli-connector, which merely acts as a convenience layer on top of pySerial.
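The archiving step of such a master script could be sketched as below. The helper names and the file naming scheme are assumptions; the download and conversion steps are left out on purpose, because their options depend on the device and on the formats involved.

```python
import shutil
import time
from pathlib import Path


def archived_name(raw_name, now=None):
    """Prefix a file name with a timestamp that identifies the job."""
    when = now if now is not None else time.localtime()
    stamp = time.strftime("%Y%m%dT%H%M%S", when)
    return "%s_%s" % (stamp, raw_name)


def archive_raw_file(raw_file, archive_dir, now=None):
    """Copy a downloaded raw file into the archive under its timestamped name."""
    raw_file = Path(raw_file)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / archived_name(raw_file.name, now)
    shutil.copy2(raw_file, target)  # copy2 also keeps the modification time
    return target
```

Calling `archive_raw_file("survey.raw", "archive")` right after a download would leave the original untouched and store a copy whose name starts with the date and time of the job.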

It’s also important to think about maintainability of code: I have little interest in developing the perfect UI for TOPS, all the time spent on development is taken from my spare time (since no one is paying for TOPS) and it would be way more useful if dedicated plugins existed for popular platforms (think QGIS, gvSIG, even ArcGIS supports Python, not to mention CAD software). At this time TOPS supports ten (yes, 10) input formats out of … hundreds, I think (some of which are proprietary, binary formats). Expanding the list of supported formats is the single aim that I see as reasonable and worth pursuing.

What is coming in Total Open Station 0.4

More than one year has passed since the first release of Total Open Station (TOPS). Version 0.3 already brought support for multiple data formats and devices, the ability to export your data to common standard formats and a programming library to create scripts around the core functionality of TOPS. We were very proud when TOPS was added to OpenSUSE and it is now being added to Debian and Fedora, three among the most popular GNU/Linux distributions.

Feedback from users of TOPS 0.3 has not been as significant as we would have expected, even though we have been providing a stack component for survey professionals that was totally missing on GNU/Linux and on Mac OS too in many cases. Nevertheless, we have continued developing TOPS, admittedly at a slower pace.

TOPS 0.4 is going to feature support for new raw data formats (including initial support for the popular Leica GSI) and the core data types are being completely rewritten in order to allow handling of polylines and polygons. The lines of code are fewer, making it easier to find new bugs and to start hacking on your own if you want. Thanks to our contributors, we will make more languages available for the program interface.

Being a volunteer-driven project, developer time is a critical resource, but we found out that user feedback and involvement is actually the most valuable resource. With this in mind, we are going to change the project governance to make the role of contributing users more prominent.

If you use a total station as part of your daily work and you care about software freedom, please consider donating to support the development of TOPS, and submitting a bug report about the models and formats you need.

Total Open Station is a free and open source program to download, manage and export survey data from total stations. It runs on all major operating systems and supports a growing number of raw data formats.

Total Open Station packaged for OpenSUSE

Thanks to Angelos Tzotzos (Remote Sensing Laboratory, National Technical University of Athens), Total Open Station has had an installable package for OpenSUSE for a few weeks now. Installing is as easy as:

$ sudo zypper ar http://download.opensuse.org/repositories/Application:/Geo/openSUSE_12.1/ GEO 
$ sudo zypper refresh 
$ sudo zypper install TotalOpenStation

Any report about this package will be very welcome. We have already added these instructions to the website.

A few days ago I met Angelos in Athens. He is a very active member in several OSGeo projects and even more importantly he does an incredible job at animating the small Greek OSGeo community. I hope they will grow as our Italian community did, and I think we have an obligation to help them.

Total Station and GNU/Linux: Zeiss Elta R55 done!

The pySerial library is really good. Today I installed it and in half an hour I got acquainted with its class methods, even though I have little knowledge about serial ports and the like.

With some trial and error about the connection parameters, I was even able to solve the problem with non-printable characters, tweaking the bytesize of the connection.

Briefly, these are the steps I did in the interactive ipython console:

  1. >>> import serial
    >>> ser = serial.Serial('/dev/ttyUSB0',
    ...     baudrate=9600, bytesize=serial.SEVENBITS, timeout=0,
    ...     parity=serial.PARITY_NONE, rtscts=1)
    >>> ser.open()
  2. at this point, start the transfer from the device
  3. check that you have received some data:
    >>> ser.inWaiting()
    648L
    

    A non-zero result means that you have received something.

  4. I saved this value to use it with the read() method of the Serial class:
    >>> n = ser.inWaiting()
    >>> result = ser.read(n)
    
  5. The result object is a string, seeing its contents is as simple as:
    >>> print(result)
    0001 OR.COOR
    0002                   0S        X        0.000 Y         0.000 Z     0.000
    0003                                            Om     397.0370
    0004 POLAR
    0005 INPUT                       th       1.500 ih        0.000
    0006 INPUT                       th       0.000 ih        0.000 Z     0.000
    0007                   1         X       -0.472 Y         1.576 Z     0.004
    END
    

    As you can see, there are no errors after the END sentence, because the serial connection is handled gracefully now. The previous attempt with cat /dev/ttyUSB0 was a bit brutal…

For now, that’s all. I’m going back to studying, and maybe also to writing some Python code for this total station. If you have got a total station and want to contribute to this project, let me know by leaving a comment here.

Total stations and GNU/Linux, part 2

In this off-line weekend, I went on investigating the output from the Zeiss Elta R55 total station.

First of all, it turns out that the file was not binary. A simple

$ file downloaded_data
downloaded_data: Non-ISO extended-ASCII text

could have revealed this simple truth. My error was mainly due to the fact that most of the content in the file was made up of non-printable characters. But my guess about the lines that contained the point coordinates was right. Probably due to a wrong download procedure, there were some problems with that file. All characters with a code of 128 (hex 80) or higher had to be translated by shifting their code down by 128. I used this simple Python script for this task:

>>> with open("downloaded_data", "rb") as read_file:
...     des = read_file.read()
...
>>> # clear the eighth bit: bytes >= 128 are shifted down by 128, the rest are unchanged
>>> print(bytes(b & 0x7F for b in des).decode("ascii"))

Probably this could be done in a better way, but I’m no hexpert at all. And I think this can be completely avoided by downloading from the serial port with the right connection parameters. I think I’m going to use the pySerial library for this task.

Obviously, I’m solving the problem for one model of one manufacturer, but there are many models and many brands. With my short experience, the best solution I can think of is a modular approach, with an abstract connection class that can be subclassed, with the connection parameters for each model.
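A minimal sketch of that modular approach could look like this. The class and method names are hypothetical; the values for the Elta R55 are the ones from the pySerial session in the other post (9600 baud, seven data bits, no parity, RTS/CTS flow control), while the defaults of the base class are assumptions.

```python
class Connection:
    """Base class holding serial parameters; subclasses override them per model."""

    baudrate = 9600
    bytesize = 8      # assumed default, overridden where a model needs it
    parity = "N"      # same value as serial.PARITY_NONE in pySerial
    rtscts = False
    timeout = 0

    def __init__(self, port):
        self.port = port

    def serial_kwargs(self):
        """Keyword arguments meant to be passed to serial.Serial(**kwargs)."""
        return {
            "port": self.port,
            "baudrate": self.baudrate,
            "bytesize": self.bytesize,
            "parity": self.parity,
            "rtscts": self.rtscts,
            "timeout": self.timeout,
        }


class ZeissEltaR55(Connection):
    """Parameters that worked for the Zeiss Elta R55."""

    bytesize = 7      # same value as serial.SEVENBITS in pySerial
    rtscts = True


params = ZeissEltaR55("/dev/ttyUSB0").serial_kwargs()
```

Supporting another model then means adding a small subclass that overrides only the parameters that differ.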

The second part of the story comes when it’s time to process the downloaded data. First of all, take a look at the clean file contents:

   0001 OR.COOR
   0002                   0S        X        0.000 Y         0.000 Z     0.000
   0003                                            Om     397.0370
   0004 POLAR
   0005 INPUT                       th       1.500 ih        0.000
   0006 INPUT                       th       0.000 ih        0.000 Z     0.000
   0007                   1         X       -0.472 Y         1.576 Z     0.004
END
E
E

Let’s comment it, line by line:

  1. the first line contains the OR.COOR string, but I’m not sure about the other possible values it can take; the line starts with 0001 like all other lines, except the last one
  2. the second line contains the X Y Z coordinates of the origin point (maybe represented by the string 0S?); please note that it uses the same format as for normal points, except that instead of the id number there is this special string
  3. the third line contains information about the orientation angle, but I can’t tell anything more specific about this
  4. the fourth line contains the POLAR string, which probably refers to the orientation method; I’m not sure about the other values this field can take
  5. the fifth and sixth lines both start with an INPUT string, that should refer to the height of the reflector prism: 1.500 m is in fact the usual height of the reflector
  6. the seventh line contains our only recorded point, with its id (an integer number) and the X Y Z coordinates with three decimal places
  7. the eighth line indicates that there are no more points to download, and starts with the END string: when downloading, the program should stop here, otherwise the device emits an error (and also a noisy beep), that is represented by the E string on the following lines
  8. attempts to let the download go on even if the device emits the error simply result in more E lines
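The observations above can be turned into a small parser. This is only a sketch based on this single sample: the layout (line number, point id, then X, Y and Z) is an assumption, and the names `XYZ` and `parse_points` are made up for the example.

```python
import re

# Matches "X -0.472 Y 1.576 Z 0.004" with any amount of spacing in between.
XYZ = re.compile(r"X\s+(-?\d+\.\d+)\s+Y\s+(-?\d+\.\d+)\s+Z\s+(-?\d+\.\d+)")


def parse_points(lines):
    """Return (id, x, y, z) tuples, reading no further than the END line."""
    points = []
    for line in lines:
        if line.strip() == "END":
            break  # only E error lines can follow
        match = XYZ.search(line)
        if match is None:
            continue  # header, orientation or INPUT line
        # Assumed layout: line number, point id, then the coordinates.
        fields = line[:match.start()].split()
        if len(fields) < 2:
            continue
        x, y, z = (float(value) for value in match.groups())
        points.append((fields[1], x, y, z))
    return points


sample = """\
   0001 OR.COOR
   0002                   0S        X        0.000 Y         0.000 Z     0.000
   0003                                            Om     397.0370
   0004 POLAR
   0005 INPUT                       th       1.500 ih        0.000
   0006 INPUT                       th       0.000 ih        0.000 Z     0.000
   0007                   1         X       -0.472 Y         1.576 Z     0.004
END
E
E
"""
points = parse_points(sample.splitlines())
```

The origin point comes out with the id 0S next to the recorded point 1, so a caller can tell the two apart and decide whether to keep the origin.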

Part 3 will follow soon.

cat /dev/total_station > file

This post is one of the “dear lazyweb” ones.

Here at the department we have a Zeiss Elta R55 total station. This device has its own software for downloading recorded data, but, as usual, it’s a Windows-only, non-free application.

Is it possible to download data from such a device using a GNU/Linux machine? Nobody knows. I have asked a number of people and no one has ever tried to do this. 🙁 With some good advice from Frankie, today I made my first test.

With substantial help from Elisa, I recorded 1 point. This point has coordinates:

X    -0.472
Y     1.576
Z     0.004

I downloaded from the device using this simple command (it’s ttyUSB0 because my laptop has no serial port)

cat /dev/ttyUSB0 > data

The total number of points is 7. Points 1-6 contain information about the origin point and other parameters. For now, I’m ignoring them. The resulting data file is binary. You can see it here. I am no expert on binary files, so I used GHex to see its contents. Its dumped form looks like this:

...000.....................
...........................
...........................
...000....................0
S.................0.000.Y..
.......0.000.Z.....0.000...
...0003....................
...........................
....39..03.0...............
...000..P..A...............
...........................
...........................
...0005..NPU...............
.........t..........500.i..
.......0.000...............
...0006..NPU...............
.........t........0.000.i..
.......0.000.Z.....0.000...
...000.....................
.................-0.....Y..
.........5.6.Z.....0.00....
.ND........................
...........................
...........................
...............

Some comments about this first test:

  • the ND string seems to mark the end of the data: nothing meaningful follows it
  • the recorded point seems to be in the part immediately before ND

If anyone has any other suggestions about this test, please tell me.