Tag: python

  • Install iosacal with conda

    Starting today, you can install iosacal with conda. This adds to the existing installation procedure with pip. Conda is a good fit for complex projects and has better tooling for reproducibility.

    To install iosacal, first add conda-forge to your channels:

    conda config --add channels conda-forge
    conda config --set channel_priority strict

    Once the conda-forge channel has been enabled, iosacal can be installed with conda:

    conda install iosacal

    or with mamba:

    mamba install iosacal
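
    Whichever installer you use, a quick way to confirm that the package is visible from the active environment is to ask Python whether it can find it. This is only a minimal sketch using the standard library; the helper name is_installed is made up for the example.

```python
from importlib.util import find_spec


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package can be found in this environment."""
    return find_spec(package_name) is not None


if __name__ == "__main__":
    print("iosacal available:", is_installed("iosacal"))
```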
  • Research papers and case studies using iosacal


    I have updated the documentation of iosacal with a new page that lists all the research papers and case studies that mention using the software.

    A collage of figures from the papers using iosacal

    The list is at https://iosacal.readthedocs.io/en/latest/literature.html and it’s longer than I thought, with 6 papers ranging from Norway to Antarctica, from the Last Glacial Maximum to the European Middle Ages.

    It’s humbling to see this small piece of software find its way into so many research projects, and I’m learning a lot by studying these publications.

    Some authors contributed to iosacal with new features and bug fixes, and that is the most accurate metric of a healthy project that I can think of.

    I’m going to add more useful content to the documentation as the main focus of the 0.7 release. In the meantime, you can continue using iosacal 0.6 in your research projects.

  • Harris Matrix Data Package: version 2022 of the hmdp tool with new features for the creation of stratigraphy data packages

    A few weeks ago I presented a new version of the hmdp tool at the ARCHEOFOSS conference in Rome. You can find the archived presentation on Zenodo.

    Harris Matrix Data Package is a proposal for a standardised digital format of archaeological stratigraphy datasets in CSV format, following the table schema developed by Thomas S. Dye for the hm Lisp package, augmented with a metadata descriptor (datapackage.json) that enables consistency checks and streamlined data access with the Frictionless Data tools and programming libraries. In the standard, each dataset consists of various CSV tables and a metadata descriptor, forming a data package. I proposed this standard in 2019 at a previous ARCHEOFOSS conference, based on a 2015 work by Dye and Buck.

    Based on this proposal, hmdp is a command line program for working with archaeological stratigraphy data in the Harris Matrix Data Package format.

    This new version adds an “init” command that creates an empty data package with the correct metadata. You can find the archived source code of hmdp version 2022.10.16 on Zenodo, too.

    The hmdp init command works both interactively and with explicit command line parameters, and it is centered around the idea that in the Harris Matrix Data Package:

    • each Harris Matrix is a data package
    • there is 1 data descriptor
    • there are from 2 to 7 CSV tables
    • each CSV table is a resource

    The two resources that MUST be present are:

    • contexts
    • observations

    Most often, excavation data will make use of three other resources:

    • inferences
    • periods
    • phases

    Two further resources should be used only when radiocarbon dates or other absolute chronology data are available:

    • events
    • event-order

    With the above outline, default presets are defined, and choosing a preset will create the corresponding CSV files (resources). The CSV files are created with only the standard column headers; the data must be filled in by the user.
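
    The resource rules listed above can be sketched as a small check plus a bare-bones descriptor builder. This is not how hmdp itself is implemented: the function names are made up, the '<name>.csv' file naming is an assumption, and the real datapackage.json presumably carries more metadata than the two keys shown here.

```python
import json

# Resource names taken from the lists above.
REQUIRED = {"contexts", "observations"}
OPTIONAL = {"inferences", "periods", "phases", "events", "event-order"}


def check_resources(names):
    """Return a list of problems; an empty list means the set is acceptable."""
    names = set(names)
    problems = []
    for missing in sorted(REQUIRED - names):
        problems.append(f"missing required resource: {missing}")
    for unknown in sorted(names - REQUIRED - OPTIONAL):
        problems.append(f"unknown resource: {unknown}")
    return problems


def minimal_descriptor(package_name, names):
    """Build a minimal descriptor dict with one CSV resource per name.

    Assumption: each resource lives in '<name>.csv' next to the descriptor.
    """
    problems = check_resources(names)
    if problems:
        raise ValueError("; ".join(problems))
    return {
        "name": package_name,
        "resources": [{"name": n, "path": f"{n}.csv"} for n in sorted(set(names))],
    }


if __name__ == "__main__":
    descriptor = minimal_descriptor(
        "example-site", ["contexts", "observations", "phases"]
    )
    print(json.dumps(descriptor, indent=2))
```

    Keeping the required and optional names in two sets makes the “from 2 to 7 CSV tables” rule fall out of the data rather than being hard-coded.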

    The current released version of hmdp init can create a Harris Matrix Data Package from scratch, e.g. in a new empty directory. Support for recognizing existing CSV files and adding the metadata descriptor is in progress.

    The project home page is at https://www.iosa.it/software/harris-matrix/ and the development repository is on Codeberg at https://codeberg.org/steko/harris-matrix-data-package


  • IOSACal 0.5, featuring IntCal20 and more

    After three years of slow-paced development, IOSACal 0.5 is here. The DOI of the latest release is https://doi.org/10.5281/zenodo.630455

    As before, the preferred installation method is with pip in a virtual environment. The documentation is at https://iosacal.readthedocs.io/

    This release brings the new IntCal20 calibration data and several improvements for different use cases, plus one important bug fix. Apart from myself, there were two contributors to this release: I’m grateful to Karl Håkansson and Wesley Weatherbee for their work.

    These are the highlights from the release notes:

    • the project has moved to Codeberg for source code hosting and issue tracking. The new Git repository is at https://codeberg.org/steko/iosacal with a default branch name of main
    • there is an official Code of Conduct that all contributors (including the maintainer) will need to follow, based on Contributor Covenant
    • the documentation has seen some improvements, in particular in the Contributing section. Overall, making contributions easier for both expert and novice users is a major theme in this release.
    • interactive use in Jupyter notebooks is made easier with CalibrationCurve that can be created in many ways (such as loading from an arbitrary file, or from a standard calibration curve called by shorthand)
    • fixed a bug that made plots with AD/CE setting incorrect (contributed by Karl Håkansson)
    • fixed a bug that caused a wrong plot density function for dates 80 BP to 0 BP (contributed by Karl Håkansson)
    • added IntCal20 calibration data (contributed by Wesley Weatherbee)

    On the technical side:

    • the command line interface is now based on the Click library
    • most code is now covered by tests, based on pytest
    • Python 3.6 or above required
    • requires Numpy 1.18 and Matplotlib 3.0

    I don’t have big plans for the next release. I would like to add more tests, modernize the code and make it easier to adapt / tinker with. The only major achievement I’m looking forward to is to submit an article about IOSACal to the Journal of Open Source Software.

  • Total Open Station 0.4 release


    This article was originally published on the Total Open Station website at https://tops.iosa.it/

    After two years of slow development, I took the opportunity of some days off to finally release version 0.4, which had been available in beta since 2017.

    No open bugs were left and this release is mature enough to hit the repositories.

    Find it on PyPI at https://pypi.python.org/pypi/totalopenstation as usual.

    Windows users, please note that the TOPS-on-a-USB-stick version will have to wait a few days more, but the beta version is equally functional.

    What’s new in Total Open Station 0.4

    The new version brings read support for 4 new formats:

    • Carlson RW5
    • Leica GSI
    • Sokkia SDR33
    • Zeiss R5

    Other input formats were improved, most notably Nikon RAW.

    DXF output was improved, even though the default template is not very useful since it is based on an old need from the time when TOPS was developed day to day on archaeological excavations.

    The work behind these new formats is in part by the new contributor to the project, Damien Gaignon (find him as @psolyca on GitHub), who submitted a lot of other code and started helping with project maintenance as well. I am very happy to have Damien on board and, since my usage of TOPS is almost at zero, it’s very likely that I will hand over the development in the near future.

    The internal data structures for handling the conversion between input and output formats are completely new, and based on the Python GeoInterface abstraction offered by the pygeoif library. This allows going beyond single points to managing lines and polygons, even though no such feature is available at the moment. If you often record linear or polygonal features that you’re manually joining in the post-processing stage, think about helping TOPS development and you could get DXF or Shapefiles with the geometries ready to use (yes, Shapefile output is in our plans, too).

    There were many bugfixes, more than 100 commits, 64 by Damien Gaignon and 52 by myself (to be honest, many of my own commits are just merges!).

    This version is the last built on Python 2, and work is already ongoing towards a new version that will be based on Python 3: a more mature codebase will mean a better program, without any visible drastic change.

    Photo by Scott Blake on Unsplash

  • IOSACal 0.4


    IOSACal is an open source program for calibration of radiocarbon dates.

    A few days ago I released version 0.4, which can be installed from PyPI or from source. The documentation and website are at http://c14.iosa.it/ as usual. You will need to have Python 3 already installed.

    The main highlights of this release are the new classes for summed probability distributions (SPD) and paleodemography, contributed by Mario Gutiérrez-Roig as part of his work for the PALEODEM project at IPHES.

    A bug affecting calibrated date ranges extending to the present was corrected.

    On the technical side the most notable changes are the following:

    • requires NumPy 1.14, SciPy 1.1 and Matplotlib 2.2
    • removed dependencies on obsolete functions
    • improved the command line interface

    You can cite IOSACal in your work with the DOI https://doi.org/10.5281/zenodo.630455. This helps the author and contributors to get some recognition for creating and maintaining this software free for everyone.

  • What’s the correlation between the exposure time of your photographs and the time of the day?


    My digital photo archive spans 15 years and holds about 12,600 pictures (not so many, after all). I’m curious to see if there is a correlation between the exposure time of my photographs and the time of the day they were taken. A rather simplistic observation, perhaps.

    In short: there’s nothing spectacular about this correlation, but it’s nice. The morning hours are the ones with the lowest average exposure time, at around 1/320 s between 9 and 10 AM. There’s a sharp increase between 12 and 1 PM, then it increases again after 4 PM towards dusk. I don’t take many pictures at night!

    See for yourself.

    [Figure expotime2: exposure time by time of day]

    The most frequent values for exposure time are in the table below. 1/30 s is the typical exposure time when using the flash, and it’s recognisable in the plot above.

    1/n s    occurrences
    800      1986
    1000     1178
    30       943
    400      547
    250      488
    640      458
    200      450
    500      388
    320      342
    160      337
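
    Both summaries in this post, the most frequent exposure times and the average exposure per hour of the day, can be sketched with the Python standard library alone. This is not the script from the repository: it assumes the EXIF data has already been extracted into (datetime, exposure) pairs, the function names are made up, and the sample records are invented.

```python
from collections import Counter, defaultdict
from datetime import datetime
from fractions import Fraction
from statistics import mean


def most_common_exposures(photos, n=10):
    """Count how often each exposure time occurs, most frequent first."""
    return Counter(exposure for _, exposure in photos).most_common(n)


def mean_exposure_by_hour(photos):
    """Average exposure time (in seconds) for each hour of the day."""
    by_hour = defaultdict(list)
    for taken_at, exposure in photos:
        by_hour[taken_at.hour].append(float(exposure))
    return {hour: mean(values) for hour, values in sorted(by_hour.items())}


if __name__ == "__main__":
    # Made-up sample records, not taken from the real archive.
    sample = [
        (datetime(2015, 6, 1, 9, 15), Fraction(1, 320)),
        (datetime(2015, 6, 1, 9, 40), Fraction(1, 800)),
        (datetime(2015, 6, 1, 21, 5), Fraction(1, 30)),
    ]
    print(most_common_exposures(sample))
    print(mean_exposure_by_hour(sample))
```

    Storing exposure times as fractions keeps values such as 1/30 s exact for counting, while the per-hour average is computed on plain floats.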

    The Python and R scripts are at Codeberg https://codeberg.org/steko/expotime but were originally at Gitlab.com (giving GitLab a spin since GitHub is a monopoly and I don’t like that). I’m still doing some experiments with the source data, then I’ll upload those as well.

  • Archaeology and Django: mind your jargon

    I have been writing small Django apps for archaeology since 2009 ‒ Django 1.0 had been released a few months earlier. I love Django as a programming framework: my initial choice was based on the ORM, at that time the only geo-enabled ORM that could be used out of the box, and years later GeoDjango still rocks. I almost immediately found out that the admin interface was a huge game-changer: instead of wasting weeks writing boilerplate CRUD, I could just focus on adding actual content and developing the frontend. Having your data model as source code (under version control) is the right thing for me and I cannot go back to using “database” programs like Access, FileMaker or LibreOffice.

    Previous work with Django in archaeology

    There is some prior art on using Django in the magic field of archaeology. This is what I found in the published literature on computer applications in archaeology:

    I have been discussing this interaction with Diego Gnesi Bartolani for some time now and he is developing another Django app. Python programming skills are becoming more common among archaeologists and it is not surprising that databases big and small are moving away from desktop-based solutions to web-based ones.

    The ceramicist’s jargon

    There is one big problem with Django as a tool for archaeological data management: language. Here are some words that are either reserved keywords or built-in names in Python, or very important in Django:

    • class (Python keyword)
    • type (Python built-in)
    • object (Python built-in)
    • form (HTML element, Django module)
    • site (Django contrib app)

    Unfortunately, these words are not only generic enough to be used in everyday speech, but they are also very common in the archaeologist’s jargon, especially for ceramicists.
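
    How Python itself treats each of the words listed above can be checked with the standard library. This is a small illustrative sketch; the describe helper is made up. Strictly speaking only class is a reserved keyword, while type and object are built-in names that can be shadowed.

```python
import builtins
import keyword


def describe(word: str) -> str:
    """Say how Python itself treats a given word."""
    if keyword.iskeyword(word):
        return "reserved keyword"
    if hasattr(builtins, word):
        return "built-in name"
    return "no special meaning in Python"


if __name__ == "__main__":
    for word in ["class", "type", "object", "form", "site", "ware"]:
        print(f"{word}: {describe(word)}")
```

    Words like form and site only become loaded once Django enters the picture, which is why they do not show up in either check.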

    Class is often used to describe a generic and wide group of objects, e.g. “amphorae”, “fine ware”, “lamps”, “cooking ware” are classes of ceramic products, i.e. categories. Sometimes class is also used for narrower categories such as “terra sigillata italica”, but the most accepted term in that case is ware. The definition of ware is ambiguous, and it can be based on several different criteria: chemical/geological analysis of source material; visible characteristics such as paint, decoration, manufacturing; typology. The upside is that ware has no meaning in either Python or Django.

    Form and type are both used within typologies. There are contrasting uses of these two terms:

    •  a form defines a broad category, tightly linked to function (e.g. dish, drinking cup, hydria, cythera) and a type defines a very specific instance of that form (e.g. Dragendorff 29); sub-types are allowed and this is in my experience the most widespread terminology;
    • a form is a specific instance of a broader function-based category ‒ this terminology is used by John W. Hayes in his Late Roman Pottery.

    These terminology problems, regardless of their cause, are complicated by translation from one language to another, and regional/local traditions. Wikipedia has a short but useful description of the general issues of ceramic typology at the Type (archaeology) page.

    Site is perhaps the best understood source of confusion, and the least problematic. First of all, everyone knows that the word site can have lots of meanings, and lots of archaeologists survive using both the website and the archaeological site meanings every day. Secondly, even though the sites app is included by default in Django, it is not so ubiquitous: I have only ever used it when deploying, una tantum.

    Object is a generic word. Shame on every programming language designer who ever thought it was a good idea to use such a generic word in a programming language, eventually polluting natural language in this digital age. No matter how strongly you think object is a good term to designate archaeological finds, items, artifacts, features, layers, deposits and so on, thou shalt not use object when creating database fields, programming functions, visualisation interfaces or anything else, really.

    The horror is when you end up writing code like this:

    from django.db import models

    class Class(models.Model):
        '''A class. Both a Python class and a classification category.'''

        pass

    class Type(models.Model):
        '''A type. Actually, a Python class.

        >>> t = Type()
        >>> type(t)
        <class '__main__.Type'>
        '''

    Not nice.

    Is there a solution to this mess? Yes. As any serious Pythonista knows…

    Explicit is better than implicit.
    […]
    Namespaces are one honking great idea — let’s do more of those!

    The Zen of Python

    Since changing the Python syntax is not a great idea, the best solution is to prefix anything potentially ambiguous to make it explicit (as suggested by the honking idea of namespaces: a prefix is a poor man’s namespace). If you follow this, or a similar approach, you won’t be left wondering if that form is an HTML form or a category of ceramic items.

    # pseudo-models.py
    from django.db import models

    # Note: max_length=200 and on_delete=models.PROTECT are arbitrary choices
    # added so that the models validate on current Django versions.

    class CeramicClass(models.Model):
        '''A wide category of ceramic items, comprising many forms.'''

        name = models.CharField(max_length=200)

    class CeramicForm(models.Model):
        '''A ceramic form. Totally different from CeramicType.'''

        name = models.CharField(max_length=200)

    class CeramicType(models.Model):
        '''A ceramic type. Whatever that means.'''

        name = models.CharField(max_length=200)
        ceramic_class = models.ForeignKey(CeramicClass, on_delete=models.PROTECT)
        ceramic_form = models.ForeignKey(CeramicForm, on_delete=models.PROTECT)
        source_ref = models.URLField()

    class ArcheoSite(models.Model):
        '''A friendly, muddy, rotting archaeological site.'''

        name = models.CharField(max_length=200)

    class CeramicFind(models.Model):
        '''The real thing you can touch and look at.'''

        ceramic_type = models.ForeignKey(CeramicType, on_delete=models.PROTECT)
        archeo_site = models.ForeignKey(ArcheoSite, on_delete=models.PROTECT)
        ...  # billions of other fields