Stefano Costa

There's more than potsherds out here

I'm an archaeologist and I live in Genoa

MAO Torino – Klavika

The room signs and panel headings at the MAO Torino are typeset in Klavika.

07/01/2015
Flickr selling prints of Creative Commons pictures: a challenge, not a problem
A few weeks ago Flickr, the most popular photo-sharing website, started offering prints of Creative Commons-licensed works in their online shop, among other photographs that were uploaded under traditional licensing terms by their authors.

In short, authors get no compensation when one of their photographs is printed and sold, but they do get a small attribution notice. It has been pointed out that this is totally allowed by the license terms, and some big names seem totally fine with the idea of getting zero pennies when their work circulates in print, with Flickr keeping any profit for themselves.

Some people seemed actually pissed off and saw this as an opportunity to jump off the Flickr wagon (perhaps towards free media sharing services like Mediagoblin, or Wikimedia Commons for truly interesting photographs). Some of us, those who have been involved in the Creative Commons movement for years now, had a sense of unease: after all, the “some rights reserved” were meant to foster creativity, reuse and remixes, not as a revenue stream for Yahoo!, a huge corporation with no known mission of promoting free culture. I’m in the latter group.

But it’s OK, and it’s not really a big deal, for at least two reasons. There are just 385 pictures on display in the Creative Commons category on the Flickr Marketplace, but you’ve got one hundred million images that are actually available for commercial use. Many are beautiful, artistic works. Some are just digital images that happen to have been favorited (or viewed) many times. But there’s one thing in common to all items under the Creative Commons label: they were uploaded to Flickr. Flickr is not going out there on the Web, picking out the best photographs that are under a Creative Commons license, or even in the public domain. I guess they are not legally comfortable with doing that, even if the license totally allows it. In fact, the terms and conditions all Flickr users agreed to state that:

[…] you give to Yahoo the following licence(s):
- For photos, graphics, audio or video you submit or make available on publicly accessible areas of the Yahoo Services, you give to Yahoo the worldwide, royalty-free and non-exclusive licence to use, distribute, reproduce, adapt, publish, translate, create derivative works from, publicly perform and publicly display the User Content on the Yahoo Services
That’s not much different from a Creative Commons Attribution license, albeit much shorter and EULA-like.

In my opinion, until the day we see Flickr selling prints of works that were not uploaded to their service, this is not bad news for creators. Some users feel screwed, but I wouldn’t be outraged, not before seeing how many homes and offices get their walls covered in CC art.

The second reason, and the reason why I’m a bit worried about the reaction to what is happening, is that, uhm, anyone could have been doing this for years, taking CC-licensed stuff from Flickr, and arguably at lower prices ($17.40 for an 8″ x 10″ canvas print?). Again, nobody did, at least not on a large scale. Probably this is because few people feel comfortable commercially appropriating legally available content ‒ those who don’t care do this stuff illegally anyway, Creative Commons or not. In the end, I think we’re looking at a big challenge: can we make the Commons work well for both creators and users, without creators feeling betrayed?

Featured image is Combustion [Explored!] by Emilio Kuffer.
18/12/2014
The fallow deer are back

The fallow deer are back here in Torriglia.

Just a few days ago my father and I were saying that no fallow deer had been seen for many months, whereas until spring we used to run into them every day. He also said that lately he has heard roe deer barking.

Then the day before yesterday I saw three fallow deer in Parodi's meadow at Casabianca. And last night I saw them again at the Crocetta, and after dinner they were right below our house, as if they had heard us and wanted to let us know that none of it is true: they are here.

Who knows why they showed up so suddenly. It was still warm here in the past few days, but it may be that higher up it had already started to get colder, or that the grass has run out, or that it is simply the season to come down to the valley, whatever "season" means these days. Today the wind is blowing and it might get really cold in a few days. Let's hope so.

It's not that I wasn't interested in fallow deer before. But after being in Patagonia, wild animals are no longer the same, not even here. There I saw how they can live, even in an environment that is still shaped by humans, with grazing livestock. And I have the impression that something is wrong here.

I'm glad the fallow deer are back. Today I wasted hours looking for a couple of photos I was sure I had taken a few years ago and wanted to use to illustrate this blog page ‒ in the end I found them.

03/12/2014
Yet another failure for cultural heritage data in Italy

This short informative piece is written in English because I think it will be useful for anyone working on cultural heritage data, not just in Italy.

A few days ago the Istituto Centrale per il Catalogo e la Documentazione published an internal document for all offices in the Ministry of Culture (the actual name is longer, but you get it), announcing imminent changes and the beginning of a process for publishing all records about cultural heritage items (I have no idea of the exact size, but we’re in the millions of records). In short, all records will be publicly available, and there will be at least one image for each record ‒ you’ll get anything from small pieces of prehistoric flint to Renaissance masterpieces, and more. That’s a huge step and we can only be happy to see this, the result of decades of cataloguing, years of digital archiving and … some lobbying and campaigning too. Do you remember Beni Culturali Aperti? The response from the ICCD had been lukewarm at best, basically arguing that the new strong requirements for open government data from article 68 of the Codice dell’Amministrazione Digitale did not apply at all to cultural heritage data. So nobody was optimistic about the developments to follow.

And unfortunately pessimism was justified. Here’s an excerpt from the document published last week:

Nota prot. n. 2975 del 17/11/2014 dell’Istituto Centrale per il Catalogo e la Documentazione

relevant sentence:

Le schede di catalogo verranno rese disponibili con la licenza Creative Commons CC BY-NC-SA

that would be

Catalog records will be made available under the Creative Commons CC BY-NC-SA license

And that was the (small) failure. CC BY-NC-SA is not an open license. The license makes commercial (= paid!) work with such data impossible or very difficult, at a time when the cultural heritage private sector could just benefit from full access to this massive dataset, with zero losses for the gatekeepers. Just when it has been certified that open licenses are becoming more and more widespread and non-open licenses like BY-NC-SA are used less and less, because they’re incompatible with anything else and inhibit reuse, someone decided that it was the right choice, against all international, European and national recommendations and regulations. We can only hope that a better choice will be made in the near future, but the record isn’t very encouraging, to be honest.

23/11/2014
Archaeology in the Mediterranean: I don’t wanna drown in cold water

This post is the second half of the one I had prepared for this year’s Day of Archaeology (Archaeology in the Mediterranean: do not drown if you can). Due to an appropriately timed mistake, I only managed to post the first, more relaxed half of the text. Enjoy this rant.

Written and unwritten rules dictate what is correct, acceptable and ultimately recognised by your peers: it is never entirely clear who sets research agendas for entire disciplines, but ‒ just to be more specific ‒ I feel increasingly stifled by the “trade networks” research framework that has dominated Late Roman pottery studies for the past 40 years now. Invariably, at any dig site, there will be from 1 to 100,000 potsherds from which we should infer that said site was part of the Mediterranean trade network. We are all experts about our “own” material, that is, the finds that we study, and apart from a few genuine gurus most of us have a hard time recognising where one pot was made, what is the exact chronology of one amphora, and so on. But those gurus, as leaders, contribute to setting in stone what should be a temporary convention as to what terminology, chronology and to a larger extent what approach is appropriate. I can hear the drums of impostor syndrome rolling in the back.

I don’t want to drown in this sea of small ceramic sherds and imaginary trade networks. Rather, I really need to spend time understanding why those broken cooking pots ended up exactly where we found them, in a certain room used by a limited number of people, in that stratigraphical position.

At the same time, I’m depressingly frustrated by how mechanical and repetitive the identification of ceramic finds can be: look at shape, compare with known corpora, look at fabric, compare with more or less known corpora. Look at decoration, if any. Lather, rinse, repeat. My other self, the one writing open source computer programs, wonders if all of this could not be done by a mildly intelligent algorithm, liberating thousands of human neurons for more creative research. But this is heresy. We collectively do our research and dissemination as we are told, with sometimes ridiculously detailed guidelines for the preparation of digital illustrations that end up printed either on paper or on PDF (which is the same thing). Our collective knowledge is the result of a lot of work that we need to respect, acknowledge, study and pass on to the next generation.

At the end of the obligations telling you how to study your material, how to publish it, and ultimately how to think about it, you could just be happy and let yourself comfortably drown in the next grant application. Don’t do that. Do more. Follow your crazy idea and sail the winds of Mediterranean archaeology.

19/11/2014
Street signs of Genoa. Typography of the letter A
A few weeks ago I started collecting letter As. I take them from the street signs of Genoa, and I'm trying to let these "top of the class" letters guide me as I get familiar with the typographic history of the signs, especially the oldest ones ‒ roughly dated before 1945. There is something fascinating in the idea that these signs are a single boundless text laid out across the whole city, a palimpsest written at different times but made to be read today.

This series shows some interesting features, above all the transition from the flat-topped A to the pointed one. The dates I have sketched so far are little more than hypothetical, as are the reproductions of the letterforms I have collected (as a true typography novice). It is certainly possible that types overlapped widely over time, although there clearly were moments of ordering and standardising drive. My favourite type is by far the second one in the image below, the most widespread in the historic centre.

The shape of the letter A

I don't know whether any studies have been devoted to this subject; so far I haven't found any. I am proceeding with a stratigraphic method (could it be otherwise?), and this is naturally frustrating because it only allows precise dating with a fair amount of data, which I don't have yet. I found especially interesting those streets where signs with different types are found at different points (e.g. via Corsica and via San Vincenzo, both affected by the construction of via XX Settembre).
1. If a type is used on a sign dedicated to a person who died
  in year X, the type should be considered in use after that date, not at
  that exact date.
2. If a type is used on the sign of a street built in year X,
  the type should be considered in use after that date.
3. If a type does not appear on signs datable after year X,
  it probably went out of use around year X.
4. If a type appears on a building constructed in year X, we cannot draw any information from it, lacking more precise indications.
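The dating rules above lend themselves to a small sketch. The Python below is only an illustration under stated assumptions: the `Sign` record, the type labels and every year in `signs` are made up for the example and are not survey data. For each type it keeps the earliest and latest year after which the type is attested (rules 1 and 2), skips evidence that comes only from a building date (rule 4), and leaves the latest year as a rough hint of when the type may have gone out of use (rule 3).

```python
from dataclasses import dataclass


@dataclass
class Sign:
    type_name: str  # hypothetical label for a letterform
    year: int       # the "year X" of the rules above
    evidence: str   # "death", "street" or "building"


def summarise(signs):
    """Map each type to its (earliest, latest) terminus post quem."""
    ranges = {}
    for sign in signs:
        if sign.evidence == "building":
            continue  # rule 4: a building date tells us nothing
        lo, hi = ranges.get(sign.type_name, (sign.year, sign.year))
        ranges[sign.type_name] = (min(lo, sign.year), max(hi, sign.year))
    return ranges


# Made-up observations, for illustration only.
signs = [
    Sign("flat-top A", 1850, "death"),
    Sign("flat-top A", 1873, "street"),
    Sign("pointed A", 1926, "street"),
    Sign("pointed A", 1900, "building"),
]

for name, (earliest, latest) in summarise(signs).items():
    print(f"{name}: in use after {earliest}; latest dated sign after {latest}")
```

With so few observations the ranges say very little, which is exactly the frustration of the stratigraphic method: the summary only becomes useful as dated signs accumulate.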
Among the most significant events for the urban planning and street naming of Genoa are certainly the two expansions of 1873 and 1926 ‒ on that basis one can, for example, look at the districts of Marassi and Staglieno (annexed in 1873) and Molassana (annexed in 1926). Walking the streets, taking photographs, taking notes… none of these are quick. At some point I will also take a stroll in Staglieno, of course.
08/10/2014
Migrating a database from FileMaker Pro to SQLite
FileMaker Pro is almost certainly one of the least interoperable cases of proprietary database management software. As such, it is the worst choice you could make to manage your data as far as digital preservation goes. Period.

There is data to be salvaged out there. If you find data that you care about, you’ll want to migrate content from FileMaker Pro to another database.

Things are not necessarily easy. If you work primarily on GNU/Linux (Debian in my case) it may be even more difficult. It can be done, but not without proprietary software.

Installing FileMaker Pro

You can get a 30-day free trial of FileMaker Pro (FMP): you may need to register on their website with your e-mail address. It is a regular .exe installer.

Please note that, while other shareware programs exist to extract data from a FileMaker Pro database, there is absolutely no way to do it using free and open source software, and as far as I know nobody has ever done any reverse engineering on the format. Do not waste your time trying to open the file with another program: the FileMaker Pro trial is your best choice. Also, do not waste your time and money buying another piece of proprietary software to “convert” or “export” your data: FileMaker Pro can be used to extract all of your data, including any images that were stored in the database file, and the trial comes at no cost if you already have one of the supported operating systems. After all, the format is proprietary, so it is appropriate to use the native proprietary program to open it.

Extracting data

Alphanumeric data are rather simple to extract, and result in CSV files that can easily be manipulated and imported by any program. Be aware however that you have no way to export your database schema, that is, the relationships between the various tables. If you only have one table, you should not have used a database in the first place, but that’s another story.

Make sure FMP is installed correctly. Open the database file.
1. Go to the menu and choose File → Export records …
2. Give a name for the exported file and make sure you have selected Comma-separated values (CSV) as export format
3. A dialog will appear asking you to select which fields you want to export.
  - Make sure that “UTF-8” is selected at the bottom
  - On the left you see the available fields, on the right the ones you have chosen
  - Click Move all ‒ you should now see the fields listed at the bottom right. If you get an error complaining about Container fields, do not worry, we are going to rescue your images later (see below)
  - Export and your file is saved.
4. Take a look at the CSV file you just saved. It should open in Notepad or any other basic text editor. A spreadsheet will work as well and may help you check for errors in the file, especially encoding errors (accented letters, special characters, newlines inside text fields, etc.)
5. Repeat the above steps for each table. You can choose the table to export from using the drop-down list in the upper left of the export dialog.
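The manual check in step 4 can also be scripted. The following is a minimal Python sketch, not part of the FileMaker procedure itself: `check_rows` and `check_csv` are names invented here, and the sketch assumes that every record of a table should have the same number of fields as the first one. It fails loudly on bytes that are not valid UTF-8 and reports records whose field count differs, which is how stray quotes or unexpected newlines inside text fields usually show up.

```python
import csv


def check_rows(lines):
    """Return (record_count, problems) for an iterable of CSV text lines.

    problems is a list of (line_number, field_count) pairs for records
    whose width differs from that of the first record.
    """
    reader = csv.reader(lines)
    expected = None
    count = 0
    problems = []
    for row in reader:
        count += 1
        if expected is None:
            expected = len(row)  # the first record sets the expected width
        elif len(row) != expected:
            problems.append((reader.line_num, len(row)))
    return count, problems


def check_csv(path):
    # Reading raises UnicodeDecodeError if the file is not valid UTF-8.
    with open(path, newline="", encoding="utf-8") as handle:
        return check_rows(handle)
```

Calling `check_csv` on each exported file (the file name is whatever you chose in step 2) gives the number of records, which you can compare with the record count shown in FileMaker Pro, plus a list of suspicious lines to inspect by hand.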
Extracting images from a Container Field in FileMaker Pro

If, for some unfortunate reason, image files have been stored in the same database file using a Container Field, the normal export procedure will not work and you will need to follow a slightly more complex procedure:
- Exporting the Field Contents of a Container Field – How can I export multiple images from my container field?
```
Go to Record/Request/Page [First]
Loop
    * Set Variable [$filePath; Value: Get ( DesktopPath ) & MyPics::Description & ".jpg"]
    ** Export Field Contents [MyPics::Picture; "$filePath"]
    Go to Record/Request/Page [Next; Exit after last]
End Loop
```
```
* Set Variable Options:
  Name: $filePath
  Value: use one of the following formulas

  Mac: Get ( DesktopPath ) & MyPics::Description & ".jpg"

  Windows: "filewin:" & Get ( DesktopPath ) & MyPics::Description & ".jpg"

  Mac and Windows:
    Choose ( Abs ( Get ( SystemPlatform ) ) - 1 ;
      /* MAC OS X */
      Get ( DesktopPath ) & MyPics::Description & ".jpg"
      ;
      /* WINDOWS */
      "filewin:" & Get ( DesktopPath ) & MyPics::Description & ".jpg"
    )

  Repetition: 1

** Export Field Contents Options:
   Specify target field: Picture
   Specify output file: $filePath
```
Migrating to SQLite

SQLite is a lightweight, file-based real database (i.e. based on actual SQL). You can import CSV data in SQLite very easily, if your starting data are “clean”. If not, you may want to look for alternatives.
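As a rough illustration of the import step, here is a minimal Python sketch using only the standard library. It makes several assumptions that are mine and not part of the procedure above: the exported CSV has no header row, the table and column names are supplied by you (`load_rows` and `import_csv` are hypothetical helper names), and every column is stored as TEXT so that nothing is reinterpreted on the way in.

```python
import csv
import sqlite3


def load_rows(conn, table, columns, rows):
    """Create `table` with TEXT columns, insert `rows`, return the row count."""
    # Names are interpolated into the SQL, so use trusted names only.
    column_defs = ", ".join(f'"{name}" TEXT' for name in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f'CREATE TABLE "{table}" ({column_defs})')
    # Raises sqlite3.ProgrammingError if a row has the wrong number of fields.
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()
    return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]


def import_csv(db_path, csv_path, table, columns):
    """Load one header-less UTF-8 CSV file into a new table of db_path."""
    conn = sqlite3.connect(db_path)
    try:
        with open(csv_path, newline="", encoding="utf-8") as handle:
            return load_rows(conn, table, columns, csv.reader(handle))
    finally:
        conn.close()
```

You would call `import_csv` once per exported table. Keeping everything as TEXT is a deliberately conservative choice: identifiers with leading zeros survive intact, and columns can be converted to numbers or dates later, once the data are safely out of FileMaker Pro. The relationships between tables still have to be rebuilt by hand, since the schema cannot be exported.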

Appendix: if you are on GNU/Linux

If you are on GNU/Linux, there is no way to perform the above procedure, and you will need a working copy of Microsoft Windows. The best solution is to use VirtualBox. In my case, I obtained a copy of Microsoft Windows XP and a legal serial number from my university IT department. The advantage of using VirtualBox is that you can erase FileMaker Pro and Windows once you’re done with the migration, and stay clear of proprietary software.

Let’s see how to obtain a working virtual environment:
1. Install VirtualBox. On Debian it’s a matter of sudo apt-get install virtualbox virtualbox-guest-additions-iso
2. Start VirtualBox and create a new machine. You will probably need to do sudo modprobe vboxdrv (in a terminal) if you get an error message at this stage
3. Install Windows in your VirtualBox. This is the standard Windows XP install and it will take some time. Go grab some tea.
4. … some (virtual!) reboots later …
5. Once Windows is installed, make sure you install the VirtualBox Guest Additions from the Devices menu of the main window. Guest Additions are needed to transfer data between your regular directories and the virtual install.
6. Install the FMP trial and reboot again as needed. Then you can open the database file you need to convert and follow the steps described above
04/09/2014
Low back pain

I went through an acute episode of low back pain a few months ago (the so-called colpo della strega), and I’m slowly recovering to normality ‒ still no lifting of heavy weights for me. It hurt a lot, suddenly, but in retrospect it was not a surprise, because I had been having mild pain for months and I have known since 2010 that there’s the beginning of a slipped disc at L5-S1 in my spine.

I know this is very common, but I cannot help thinking about the consequences of this health issue as an archaeologist. I don’t call myself a field archaeologist now, but I have been spending 2-3 months a year in the field for several years (2003-2010) and in 2009 I did that as a profession for a while (most of the other fieldwork was done with universities). Luckily enough, but without any actual plan, in 2009 I started accumulating some experience with ceramics and I took part in several campaigns doing that instead of digging. I like digging ‒ I know very well that I am far from being good at it, because I think too much and I’m not quite a fast “identify-clean-record-dig” type ‒ but I still like it a lot. And, the less I practice archaeological digging (10 sparse days last year), the more I idealise it as the real archaeology.

Obviously, the idea that archaeology is restricted to fieldwork is wrong, but I’m only fortunate that I have a job and I am not forced to prove this truth.

It hurts.

03/09/2014
Free photos in museums, yet another missed opportunity
The draft decree on URGENT PROVISIONS FOR THE PROTECTION OF CULTURAL HERITAGE, THE DEVELOPMENT OF CULTURE AND THE RELAUNCH OF TOURISM also includes, as is well known, the item Urgent measures for simplification in matters of cultural heritage and landscape, which contains three different rules, entirely unrelated to each other and, in my opinion, a symptom of a political inability to read the world of cultural heritage.

The three items are "free photos", a weakening (yet another) of landscape authorisation procedures, and the reduction from 40 to 30 years of the consultation term for judicial documents held in archives. I have nothing to say on points 2 and 3. On point 1 I do have something to say, adding to the extensive reflections by Luca Corsato, which clearly explain that all the most credible and valuable uses of cultural heritage images (Wiki Loves Monuments, Invasioni Digitali, OpenGLAM, …) are completely excluded from this regime of small concessions.

Here is the full passage in question from the draft decree published by Tafter:

The following activities are free, for the purpose of carrying out the due checks, provided they are carried out on a non-profit basis, not even indirectly for profit, for purposes of study, research, free expression of thought or creative expression, and promotion of knowledge of cultural heritage:

1) the reproduction of cultural heritage items carried out in ways that involve no physical contact with the item, no exposure of the item to light sources, and no use of stands or tripods;

2) the dissemination by any means of images of cultural heritage items, legitimately acquired, in such a way that they cannot be further reproduced by the user except, possibly, at low digital resolution.

So the political leadership of the Ministero dei Beni e delle Attività Culturali e del Turismo believes that study, research, free expression of thought, creative expression and (I shudder at the thought) the promotion of knowledge of cultural heritage are activities that must primarily be carried out on a non-profit basis. Obviously this does not rule out carrying them out for profit, but only with explicit authorisation and, we imagine, payment. It is stated quite clearly that:
- creative expression based on the greatest and most magnificent cultural heritage in the world (as those with nothing to say like to put it) may be for profit only if authorised in advance;
- the promotion of knowledge of cultural heritage, which very simply means letting others, the world, our friends know about the beautiful things we see around us, is likewise subject to the same regime of prior authorisation for profit-making purposes;
- research is on a par with the other activities, never mind the dozens and dozens of companies that do cutting-edge research on cultural heritage, taking part in European calls and projects ‒ making Italy almost the only country in Europe where restrictions of this kind are in force.
But that is not all, because moving on to point 2), "reproductions" become "images" (and so we imagine that the provisions of point 2 do not apply to reproductions that are not images, e.g. 3D prints, the text of a document copied in digital form…). These images may be disseminated by any means, from print to projection on a giant screen, including publication on the Internet, only if nobody can reproduce them further. Evidently the basic technical understanding is missing: when someone publishes a photo on their personal website, anyone who views it on a computer or tablet has already created an identical copy of the file on their own device. Moreover, since in practice the vast majority of this sharing happens through social networks, published images are automatically copied onto servers and from there spread through further copies onto the devices of every user who views them. The very reference to a "user" is at once misleading and meaningless, since most of the operations that process and transfer content are fully automatic.

Moving from the purely technical aspects to the contractual ones, let's take Twitter as an example to better understand the absurdity of the decree.

Twitter's terms of service, at point 5, state:

By submitting, posting or displaying Content on or through the Services, the user grants Twitter a worldwide, non-exclusive, royalty-free licence (with the right to sublicense) to use, copy, reproduce, process, adapt, modify, publish, transmit, display and distribute such Content in any and all media or distribution methods (now known or later developed).

…

The user agrees that their Content may be shared, transmitted, distributed or published by Twitter's partners, and takes on any liability that may arise if they do not have the right to supply Content for such use.

In effect, the draft decree forbids us from publishing an image of a cultural heritage item on Twitter. Equally in effect, this rule is described by many (example 1, example 2) as "selfie freedom", or in any case as favourable precisely to sharing on social networks. I do not quite understand how such an enormous misunderstanding is possible, both on the part of those who produced this text and on the part of some enthusiastic commentators. In my personal opinion, we are facing yet another missed opportunity for a real reform of the rules on the reproduction of cultural heritage. This might seem strange in light of the continually repeated claims about the strategic importance of cultural heritage as a backbone for relaunching Italy, including economically and in terms of tourism. In practice, the political proclamations do not seem to be followed by an adequate analysis of what the sector is asking for. A claim to total control persists, even in cases where there is plainly no possibility of immediately monetising the reproduction, as in the classic case of the museum where you cannot take photos but there is no printed catalogue to buy either. On top of that, a weak signal is sent to local institutions, which I fear will take us from the current, very ugly "No photography" signs to much wordier signs and/or leaflets explaining how to request authorisation to do a perfectly normal thing: photograph what we like. And so a wholly inexplicable distortion will also remain in the collective consciousness, namely the ban on, or requirement of permission for, taking photos inside the museum tout court, whereas the rule is specific to cultural heritage items and does not concern the museum as a structure in the slightest, it often being a mere container of objects, despite theoretical idealisations far removed from reality.

The icing on the cake: reproduction at low digital resolution is permitted, with heartfelt apologies to owners of a retina display. All we can do is have our selfies printed at 1200 dpi and hand them out to friends and "friends". Who will not even be able to resell them as waste paper to get rid of them, but will have to dispose of them on a non-profit basis.
27/05/2014
ICCD tracciati with the Unix shell and Python
ICCD catalogue records are transferred using text files called tracciati. It can be useful to know how to handle these files with the simplest tools available in a Unix environment.

The topic is not particularly exciting, and fortunately it looks set to become obsolete soon with the introduction of an XML format (itself already obsolete, by the way), but a few technical notes may be useful. Unix shell commands let us extract basic information, and then we can move on to more elaborate steps in Python.

For a single tracciato file, to count the number of records it contains we can rely on the CD paragraph, which is mandatory and introduces every record:
```
$ cat tracciato.trc | grep -c -e "^CD:"
1527
```
The main part of the command is the regular expression ^CD:, which together with grep's -c option counts the number of records (1527 in our case). The ^ anchor, which matches only at the start of a line, is needed so that the results do not include other parts of the file, including other fields (for example DSCD), as the next example shows:
```
$ cat tracciato.trc | grep -c "CD:"
2680
```
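The same contrast between the anchored and the unanchored match can be reproduced in Python. This is a minimal sketch, not taken from the notebook linked at the end of the post: the function names are invented here and the `sample` lines are made up to mimic the layout that the shell commands suggest (field name, colon, space, value).

```python
def count_records(lines):
    """Count lines starting with 'CD:', like grep -c -e '^CD:'."""
    return sum(1 for line in lines if line.startswith("CD:"))


def count_loose(lines):
    """Count lines containing 'CD:' anywhere, like grep -c 'CD:'."""
    return sum(1 for line in lines if "CD:" in line)


# Made-up lines, for illustration only.
sample = [
    "CD:\n",
    "NCTN: 00013887\n",
    "DSCD: 1998\n",
    "CD:\n",
    "NCTN: 13888\n",
]

print(count_records(sample), count_loose(sample))  # prints "2 3"
```

Both functions accept any iterable of lines, so an open file object for tracciato.trc can be passed directly in place of `sample`.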
Let's get a list of the NCTN numbers of the records, using a similar regular expression and the cut command, with a space as the delimiter character and selecting only the second field:
```
cat tracciato.trc | grep -e "^NCTN:" | 
cut -d " " -f 2
```
The list produced by the command will be rather long, so it is better to save it to a separate file, using a simple output redirection:
```
cat tracciato.trc | grep -e "^NCTN:" | cut -d " " -f 2 > nctn.txt
```
It is important to check that the data are correct. According to the ICCD standard (both the old version 2.00 of 1992 and the new version 3.00), the NCTN field must consist of 8 digits, so the actual number must be padded with leading zeros. The number 13887 thus becomes 00013887 in the NCTN field, and so on.

In practice, from a computing point of view, 00013887 is not an integer (which would invariably be stored as 13887) but a string of 8 characters, each of which is a digit (the same goes, for example, for Italian postal codes, the CAP). Unfortunately this requirement is often ignored, because the NCTN field is wrongly treated as a numeric field, for instance when creating a database.

In the case I was working on in the past few days we do indeed have a bit of everything: 5 digits, 7 digits, 8 digits.
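Bringing such mixed values back to the 8-character form is a one-liner in Python. The sketch below is an illustration only, and is not the code from the notebook mentioned next: `normalise_nctn` is a name invented here, and the sample values simply reuse the number from the example above at different lengths.

```python
def normalise_nctn(value, width=8):
    """Return an NCTN as a zero-padded string of `width` digits."""
    digits = value.strip()
    # Reject anything that is not made of plain ASCII digits or is too long.
    if not (digits.isascii() and digits.isdigit()) or len(digits) > width:
        raise ValueError(f"not a valid NCTN: {value!r}")
    return digits.zfill(width)


# The same number written with 5, 7 and 8 digits.
codes = ["13887", "0013887", "00013887"]
print([normalise_nctn(code) for code in codes])
```

The function works on strings from start to finish: converting to an integer and back would also pad correctly, but it would silently accept values that are not pure digit strings in the first place.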

In the second part of this post we move to an IPython session (in English), where we see how to further process the data saved in nctn.txt to obtain a readable list of NCTN numbers: continue reading on nbviewer.

Finally, a very readable introduction to ICCD cataloguing is the one by Diego Gnesi Bartolani, which can be downloaded as a PDF from his website.
22/05/2014