Archaeology and Django: mind your jargon

I have been writing small Django apps for archaeology since 2009 ‒ Django 1.0 had been released a few months earlier. I love Django as a programming framework: my initial choice was based on the ORM, at that time the only geo-enabled ORM that could be used out of the box, and years later GeoDjango still rocks. I almost immediately found out that the admin interface was a huge game-changer: instead of wasting weeks writing boilerplate CRUD, I could just focus on adding actual content and developing the frontend. Having your data model as source code (under version control) is the right thing for me and I cannot go back to using “database” programs like Access, FileMaker or LibreOffice.

Previous work with Django in archaeology

There is some prior art on using Django in the magic field of archaeology, this is what I got from published literature in the field of computer applications in archaeology:

I have been discussing this interaction with Diego Gnesi Bartolani for some time now and he is developing another Django app. Python programming skills are becoming more common among archaeologists and it is not surprising that databases big and small are moving away from desktop-based solutions to web-based

The ceramicist’s jargon

There is one big problem with Django as a tool for archaeological data management: language. Here are some words that are either Python reserved keywords or very important in Django:

  • class (Python keyword)
  • type (Python keyword)
  • object (Python keyword)
  • form (HTML element, Django module)
  • site (Django contrib app)

Unfortunately, these words are not only generic enough to be used in everyday speak, but they are very common in the archaeologist’s jargon, especially for ceramicists.

Class is often used to describe a generic and wide group of objects, e.g. “amphorae”, “fine ware”, “lamps”, ”cooking ware” are classes of ceramic products ‒ i.e. categories. Sometimes class is also used for narrower categories such as “terra sigillata italica”, but the most accepted term in that case is ware. The definition of ware is ambiguous, and it can be based on several different criteria: chemical/geological analysis of source material; visible characteristics such as paint, decoration, manufacturing; typology. The upside is that ware has no meaning in either Python or Django.

Form and type are both used within typologies. There are contrasting uses of these two terms:

  •  a form defines a broad category, tightly linked to function (e.g. dish, drinking cup, hydria, cythera) and a type defines a very specific instance of that form (e.g. Dragendorff 29); sub-types are allowed and this is in my experience the most widespread terminology;
  • a form is a specific instance of a broader function-based category ‒ this terminology is used by John W. Hayes in his Late Roman Pottery.

These terminology problems, regardless of their cause, are complicated by translation from one language to another, and regional/local traditions. Wikipedia has a short but useful description of the general issues of ceramic typology at the Type (archaeology) page.

Site is perhaps the best understood source of confusion, and the less problematic. First of all everyone knows that the word site can have a lots of meanings and lots of archaeologists survive using both the website and the archaeological site meaning everyday. Secondly, even though the sites app is included by default in Django, it is not so ubiquitous ‒ I always used it only when deploying, una tantum.

Object is a generic word. Shame on every programming language designer who ever thought it was a good idea to use such a generic word in a programming language, eventually polluting natural language in this digital age. No matter how strongly you think object is a good term to designate archaeological finds, items, artifacts, features, layers, deposits and so on, thou shalt not use object when creating database fields, programming functions, visualisation interfaces or anything else, really.

The horror is when you end up writing code like this:

class Class(models.Model)
    '''A class. Both a Python class and a classification category.'''


class Type(models.Model)
    '''A type. Actually, a Python class.

    >>> t = Type()
    >>> type(t)
    <class '__main__.Type'>

Not nice.

Is there a solution to this mess? Yes. As any serious Pythonista knows…

Explicit is better than implicit.
Namespaces are one honking great idea — let’s do more of those!

The Zen of Python

Since changing the Python syntax is not a great idea, the best solution is to prefix anything potentially ambiguous to make it explicit (as suggested by the honking idea of namespaces ‒ a prefix is a poor man’s namespace). If you follow this, or a likewise approach, you won’t be left wondering if that form is an HTML form or a category of ceramic items.


class CeramicClass(models.Model):
    '''A wide category of ceramic items, comprising many forms.'''

    name = models.CharField()

class CeramicForm(models.Model):
    '''A ceramic form. Totally different from CeramicType.'''

    name = models.CharField()

class CeramicType(models.Model):
    '''A ceramic type. Whatever that means.'''

    name = models.CharField()
    ceramic_class = models.ForeignKey(CeramicClass)
    ceramic_form = models.ForeignKey(CeramicForm)
    source_ref = models.URLField()

class ArcheoSite(models.Model):
    '''A friendly, muddy, rotting archaeological site.'''

    name = models.CharField()

class CeramicFind(models.Model):
    '''The real thing you can touch and look at.'''

    ceramic_type = models.ForeignKey(CeramicType)
    archeo_site = models.ForeignKey(ArcheoSite)
    ... # billions of other fields