A look at pollen data in the Old World

Since the 19th century, the study of archaeobotanical remains has been very important for combining “strictly archaeological” knowledge with environmental data. Pollen data make it possible to assess the introduction of certain domesticated plant species, or the presence of other species that typically grow where humans dwell. Not all pollen data come from archaeological fieldwork, but the relationship between the two sets is strong enough to justify an interested look at pollen data worldwide, their availability and, most importantly, their openness, for which we follow the Open Knowledge Definition.

The starting point for finding pollen data is the NOAA website.

The Global Pollen Database hosted by the NOAA is a reasonable first stop, but its coverage appears to be quite limited outside the US. Furthermore, data from 2005 onwards aren’t available via FTP in simple documented formats, but are instead downloadable as Access databases from another, external website. Defining MS Access databases as a Bad Choice™ for data exchange is perhaps a euphemism.

Unfortunately, the number of separate databases covering single continents or smaller regions is growing, and their approaches to data dissemination differ markedly.


For both North and South America, you can get data from more than one thousand sites directly via FTP. There are no explicit terms of use. Data released by US federal agencies are usually in the public domain.

The README document only states “NOTE: PLEASE CITE ORIGINAL REFERENCES WHEN USING THIS DATA!!!!!”. Fair enough: a requirement for attribution is certainly compatible with the Open Knowledge Definition.


From the GPD website we can easily reach the European Pollen Database, which is hosted on another website though (and things can get even more confusing, given that the NOAA website has some dead links).

You can download EPD data in PostgreSQL dump format (one file for each table, with a separate SQL script create_epd_db.sql). Data in the EPD can be restricted or unrestricted. That’s fine, let’s see how many unrestricted datasets there are. Following the database documentation, the P_ENTITY table contains the use status of each dataset:

steko@gibreel:~/epd-postgres-distribution-20100531$ cat p_entity.dump \
 | awk -F "\t" '{ print $5 }' | sort | uniq -c
 154 R
 1092 U

which is pretty good, because almost 88% of them are unrestricted (NB I write most of my programs in Python but I love one-liners that involve awk, sort and uniq). We could easily create an “unrestricted” subset and make it available for easy download to all those who don’t want to deal with restricted data.
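For those who prefer Python to shell one-liners, the same count and the “unrestricted” subset can be sketched as follows. This is only a minimal sketch: it assumes that p_entity.dump is tab-separated with the use status in the fifth column, as in the command above, and the function names are my own, not part of the EPD distribution.

```python
from collections import Counter


def split_fields(line):
    """Split one line of a tab-separated dump into its fields."""
    return line.rstrip("\n").split("\t")


def count_use_status(lines, column=4):
    """Count the values found in one column (0-based, so 4 is the fifth)."""
    counts = Counter()
    for line in lines:
        fields = split_fields(line)
        if len(fields) > column:
            counts[fields[column]] += 1
    return counts


def unrestricted_rows(lines, column=4):
    """Yield only the rows whose use status is "U" (assumed: unrestricted)."""
    for line in lines:
        fields = split_fields(line)
        if len(fields) > column and fields[column] == "U":
            yield line


def count_file(path):
    """Count use statuses in a dump file; the path is supplied by the caller."""
    with open(path, encoding="utf-8", errors="replace") as dump:
        return count_use_status(dump)
```

Calling count_file("p_entity.dump") from the distribution directory should give the same figures as the shell pipeline, provided the assumptions about the file layout hold.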

But what does “unrestricted” mean for EPD data? Let’s take a more careful look (emphasis mine):

  1. Data will be classified as restricted or unrestricted. All data will be available in the EPD, although restricted data can be used only as provided below.
  2. Unrestricted data are available for all uses, and are included in the EPD on various electronic sites.
  3. Restricted data may be used only by permission of the data originator. Appropriate and ethical use of restricted data is the responsibility of the data user.
  4. Restrictions on data will expire three years after they are submitted to the EPD. Just prior to the time of expiration, the data originator will be contacted by the EPD database manager with a reminder of the pending change. The originator may extend restricted status for further periods of three years by so informing the EPD each time a three-year period expires.

Sounds quite good, doesn’t it? “For all uses” is reassuring, and the short time limit is a good trade-off. The horror comes a few paragraphs below, with the following scary details:

  1. The data are available only to non-profit-making organizations and for research.

Profit-making organizations may use the data, even for legitimate uses, only with the written consent of the EPD Board, who will determine or negotiate the payment of any fee required.

Here the false assumption that only academia is entitled to perform research is taken for granted. And there are even more rules about “normal ethics”: basically, if you use EPD data in a publication, the original data author should be listed among the authors of the work. I always thought citation and attribution were invented for that exact purpose, but it looks like they have a distinctly different approach to attribution. The EPD is even deciding what the “legitimate” uses of pollen data are (I can hardly think of any possible illegitimate use).


You write “Africa” but you read “Europe” again, because most research projects are from French and English universities. For this reason, the situation is almost the same. What is even worse is that in developing countries there are far fewer people or organizations that can afford to buy those data, notwithstanding the fact that in regions under rapid development the study and preservation of environmental resources are of major importance.

Data are downloadable for individual sites using a search engine, in Tilia format (not ASCII, unfortunately). The problems start with the license:

The wording is almost exactly the same as for the EPD seen above:

Normal ethics pertaining to co-authorship of publications applies. The contributor should be invited to be a co-author if a user makes significant use of a single contributor’s site, or if a single contributor’s data comprise a substantial portion of a larger data set analysed, or if a contributor makes a significant contribution to the analysis of the data or to the interpretation of the results. The data will be available only to non-profit-making organisations and for research. Profit-making organisations may use the data for legitimate purposes, only with the written consent of the majority of the members of the Advisory board, who will determine or negotiate the payment of any fee required. Such payment will be credited to the APD.


As with dendrochronological data, universities and research centers seriously misunderstand their role in society as places of research and innovation that are available to everyone. In other words, academia is a closed system producing data (at very high costs for society) that are only available inside its walls, even though it’s all done with public money.

The only positive bit of the story, if any, is that these datasets are nevertheless available on the web, and their terms of use are clearly stated, no matter how restrictive. It would be simply impossible to write a similar article about archaeological pottery, or zooarchaeological finds.

Appendix: Using pollen data

Pollen data are usually presented in the form of summary charts where both stratigraphic data and quantitative pollen data are easily readable. Each “column” of the chart stands for a species or genus. You can create this kind of visualization with free software tools.

The stratigraph package for R can be used for

plotting and analyzing paleontological and geological data distributed through through time in stratigraphic cores or sections. Includes some miscellaneous functions for handling other kinds of palaeontological and paleoecological data.

See the chart for an example of what they look like.

An example plot using the R stratigraph package

ArcheoFOSS 2010: back from Foggia

ArcheoFOSS 2010, the 5th Italian workshop on “Free software, open source e open format nei processi di ricerca archeologica”, took place in Foggia on 6 and 7 May. First of all, it was very good. I’m satisfied with this meeting. Why? Here are some thoughts I sketched while traveling back to Siena.

Lots of talks were about the results and methods of research done by MA and PhD students (myself included). This is one of the most important kinds of research, perhaps the most important of all, and at the same time the most underrated. Our community shows a strong connection between education and research. Making this connection stronger is part of our habits, I believe.

There was a lot of discussion about methodology, and thanks to the firm experience of our friends in Foggia we have gone beyond some stereotypes of the past years. Take for example the recognition that methodology means much more than recording, documentation or technical tools. Add the acceptance of plurality as a (positive) fact rather than a problem. End up with the epiphany that using similar tools (e.g. databases, GIS) doesn’t mean working with the same underlying methodological mindset. In Italy we have a very bad habit of not having a debate about method and theory, but with this workshop we’re clearly building a place open for discussion.

We are well distributed geographically (from many regions of Italy) as well as chronologically and by discipline (from prehistoric to medieval archaeology, both excavation and landscape archaeologists). Despite this variability, there are some strong groups that are references for the whole community. I firmly believe that the University of Foggia should be listed among these groups from now on. Even more interestingly, there are new groups of people that look very promising for their novel approach (I am glad to see that even my department could now be listed here). The ArcheoFOSS workshop is already acting as an incubator for innovation, and in the future we will see more of that, because of the large number of young researchers involved and the friendly and encouraging environment, which is perhaps even more interesting than “open archaeology” for Italian academia. Or maybe it’s just part of the “open archaeology” agenda.

Free software works. It works from a technical perspective, obviously, but also from a social one. We have been learning its limits, its potential and the ways to improve it and share it. There’s a political vein in free software, and it combines well with the need for a new way of doing research in archaeology. On the technical side, I am more and more excited about how creativity is encouraged, instead of being prescribed in advance. We are doing humanities: it would be so silly to lose our creativity (even when it drifts towards chaos and anarchy) in the name of a pseudo-scientific strictness born out of a great misunderstanding. We already won one bet back in the early 2000s, but now we can play for something even more important: not just sharing software and methods, but sharing knowledge. This is our target for 2020, and what we are going to do for the next decade.

Lastly, we’re learning how to act in the real world, and not just discuss among ourselves. Take for example the creation of common tools for creating catalogues. We can do that from the bottom up, with a wide perspective that is going to comprise technical standards, conservation and research needs, all as free software and open formats. grupporicerche already proposed some work in this direction last year, and we again invite all those who have developed databases for archaeological purposes to share them.

What’s missing? Of course, we have lots of areas for improvement. This is also because of the “multidimensional” approach of this initiative. Here I list some topics that I’m particularly interested in:

  • quantitative and statistical methods: let’s bring maths back into archaeology through computing! This is not to say that archaeology can be reduced to numerical terms, but on the contrary to better define the complexity we are dealing with, giving the right weight to “data” (whatever that means) and developing proper archaeological ideas
  • an inter-regional and international approach, to deal with big and not-so-big research themes in a collaborative way
  • encouraging the upgrade of old databases from obsolete, proprietary formats to open and free formats, ready for dissemination on the web
  • building a technological infrastructure for sharing our work, in the many forms it can take, or at least developing best practices for doing that on our own, taking accessibility and sustainability into account from day zero

More comments, insights and excerpts from the round table to follow in the next few days.

This post was originally published at iosa.it.

Lombards? No, thanks

At Palazzo Bricherasio in Turin there is an exhibition titled Longobardi (Lombards), running until January.

If you have a free weekend, and perhaps get the fine weather we had a few days ago, go to Turin. The city is more beautiful than ever and it is a real pleasure to walk around the streets of the centre. As for the exhibition, do not go. Let me explain why.


Beavers and Material Culture

And finally, what about beavers, which build dams and modify the environment for a benefit that is not immediate (food and safe lodges)? Do they have a plan? Are dams artefacts? How is technical knowledge passed on among beavers?


Perhaps these are just provocations with no answer, and dams are certainly not made by hand, but they can become useful prompts for thinking about what the adoption of techniques means…

Enrico Giannichedda, Uomini e cose, 2006.

At Bogli castle

Today I went on a hike to Bogli castle. The walk is not exactly short, and the path is broken in several places, which makes the route rather awkward.

Practically nothing is left of the castle, apart from a few wall tops visible at the surface. “Castle” is perhaps too grand a name: given its small size, I think it was a tower controlling the valley floor. The position is built on a naturally raised stretch of the slope, and on the uphill side there is also a wide ditch cut artificially into the rock.

Some “genius” saw fit to dig a couple of holes here in search of treasure, without realising that these watchtowers were manned by wretches of the lowest social class, forced to spend entire winters in isolation, with no comforts of any kind, let alone precious objects.

I took a commemorative self-portrait:


ex novo

A great, really great idea! When I showed Laura the website ex-novo.org, she answered “There is life on Mars!”. Yes, evidently there is life. Hard to believe, but we are not alone in this vale of tears. Six pieces (journalistic ones; luckily they are not articles for a scientific journal), some with the faint, forgotten flavour of a pamphlet. And a question, asked out loud, that should be asked a little more often and with a little more humility: Why archaeology?

It is not just a desire to hurt ourselves, and there is no point either in wearily repeating that the university is rubbish, that the soprintendenze are rubbish, that the ministry is rubbish, that the laws are rubbish. We can do better; we can try to get out of the crisis (Brogiolo), heads down. What is needed is a healthy, critical pessimism about the present, and an equally healthy, critical optimism about the future. What is needed is for professors to teach: a craft, a way of thinking and of looking at the world, the present and living one, not what once was.

Azzena’s article raises a few smiles and tears apart a few illusions (but who still has any?), although it is self-contained and leaves no room for criticism, which unfortunately is a flaw of much Italian archaeology.

Coming next, some criticism of Mario Torelli, archaeology and the PCI.


Today, I mean. The therapeutic moment, as we were saying. I think that at the moment nobody knows this blog exists. Sooner or later I may send out a proper announcement. Last night after the pizza I went straight to bed, but not before jotting down a couple of deep thoughts. Today I stayed with Prof. Parenti and my coursemates until 16:30. From 16:30 onwards, with Prof. Zanini, we started registering the new website for Gortyna Quartiere Bizantino, on the MediaWiki platform. A nice experiment, I must say.

Sooner or later I will also have to find the time and the will to go and see Prof. Fronza to show him my thesis, and to talk about possible projects for the thesis or who knows what else. In the meantime my future is rather hazy, although it does not seem to me at all lacking in determination.