Localising Online Data – baby steps

While I’m intending to build a simple seismometer in the near future, there’s a lot of data already available online. Generally this seems to come in two forms: near real-time output from seismometers, and feeds of discrete events. I’ve not yet started investigating the former, as the latter seems an easier entry point to start coding around.

I’m based in northern Italy, and as it happens there are some very good resources online for this region (hardly surprising, as earthquakes are a significant threat to life & property in Italy). A hub for these is the Istituto Nazionale di Geofisica e Vulcanologia (INGV).

Web Services

The service with the snappy name Event Federation of Digital Seismograph Networks Web Services allows queries with a variety of parameters, returning data in a choice of three formats: QuakeML (a rich, dedicated XML), KML (another XML format, popularised by Google Earth and loosely comparable to GML) and text. Hat tip to the folks behind this, it’s Open Data (CC Attribution).
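To give a feel for what a query looks like, here is a minimal sketch that assembles a request URL. The parameter names (starttime, endtime, minmagnitude, format) come from the FDSN event service specification; the base URL is my assumption about where the INGV instance lives, and the dates are arbitrary, so check both against the INGV documentation before relying on them.

```javascript
// Sketch only: build a query URL for an FDSN-style event web service.
// ASSUMPTION: BASE_URL is a guess at the INGV endpoint, not taken from the post.
const BASE_URL = "http://webservices.ingv.it/fdsnws/event/1/query";

// Append each option as a query-string parameter and return the full URL.
function buildEventQuery(baseUrl, options) {
  const url = new URL(baseUrl);
  for (const [key, value] of Object.entries(options)) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

// Arbitrary one-week window, events of magnitude 2 and above, plain-text output.
const queryUrl = buildEventQuery(BASE_URL, {
  starttime: "2017-01-01T00:00:00",
  endtime: "2017-01-08T00:00:00",
  minmagnitude: 2,
  format: "text",
});

console.log(queryUrl);
```

Nothing is fetched here; the URL can be pasted into a browser to see what comes back.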

The INGV also offers a preset translation of the previous week’s events in Atom format (augmented with Dublin Core, W3C Geo & GeoRSS terms for event datetime and location). This is a minimal version of the data from the Web service, but contains all I’m interested in for now: event magnitude, location & time. (I also have a personal bias towards Atom – it’s the first and I think only RFC in which my name appears 🙂)

Relocating the Event

The potential utility of this project lies in estimating the risk of seismic events being felt at a specific location. The feed gives the magnitude of each event at its epicentre, so it may be reasonable to process the raw data to reflect how strongly the event would register at that location instead.

Whether it will be better to use such data raw (i.e. separate inputs for magnitude, latitude & longitude) or pre-processed as input to neural nets remains to be seen. But to flag up high risk at the target location, some combined measure is desirable.

The true values depend on a huge number of factors (for more on this see e.g. The attenuation of seismic intensity by Chiara Pasolini). As a first attempt at a useful approximation I’ll do the following:

  1. restrict data to events within 200km of the ‘home’ location
  2. apply an inverse-square function to the magnitude over the distance
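
The two steps above can be sketched roughly as follows. This is an illustration rather than the code in the repository: the function names, the event object shape and the 1 km minimum distance (there only to avoid dividing by zero when an event sits right on top of the home location) are all my assumptions. Distance is taken as the great-circle distance from the haversine formula, ignoring event depth.

```javascript
// Sketch only: names and object shapes here are hypothetical.
const EARTH_RADIUS_KM = 6371; // mean Earth radius
const MAX_DISTANCE_KM = 200;  // step 1: ignore anything further away than this

function toRadians(degrees) {
  return (degrees * Math.PI) / 180;
}

// Great-circle distance between two lat/lon points, in km (haversine formula).
function haversineKm(lat1, lon1, lat2, lon2) {
  const dLat = toRadians(lat2 - lat1);
  const dLon = toRadians(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

// Step 2: magnitude scaled by 1/d^2. Returns null for events outside the radius.
// ASSUMPTION: distances below minDistanceKm are clamped to avoid division by zero.
function localSignificance(event, home, minDistanceKm = 1) {
  const d = haversineKm(home.lat, home.lon, event.lat, event.lon);
  if (d > MAX_DISTANCE_KM) {
    return null;
  }
  const clamped = Math.max(d, minDistanceKm);
  return event.magnitude / (clamped * clamped);
}

// Made-up coordinates, purely for illustration.
const home = { lat: 45.0, lon: 10.0 };
const events = [
  { lat: 45.5, lon: 10.0, magnitude: 4.0 }, // roughly 56 km away: kept
  { lat: 40.0, lon: 10.0, magnitude: 5.0 }, // roughly 556 km away: dropped
];
console.log(events.map((e) => localSignificance(e, home)));
```

The raw values come out very small for anything more than a few km away, so some rescaling would probably be wanted before feeding them to a network.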

Both are guesstimates of the relative event significance. Major events beyond 200km may well have significant impact locally, but for the purposes of prediction it seems reasonable to exclude them. The data is going to be fed to a machine learning system which will be looking for patterns in it, and intuitively at least it seems to make sense to reduce the target model to that of local geology rather than that of the whole Earth.

The recorded magnitude of events (as in the Richter magnitude scale) is logarithmic, more or less a log10 of the event’s size (strictly of the recorded wave amplitude rather than the power). The corrections that measuring stations apply to allow for their own distance from the event aren’t exactly straightforward. But seismic shaking is very roughly similar to sound, and the intensity of sound at a distance d from a point source is proportional to 1/d^2. This may be wildly different from the true function even under ideal conditions. But it does introduce the distance factor.
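
As an aside, if the two statements above are taken at face value (magnitude M as a log10 of source strength, falloff as 1/d^2), the same distance factor can be applied in log space: log10(10^M / d^2) = M - 2*log10(d). The sketch below shows that variant. It is not what the list earlier describes (that divides the magnitude itself by d^2), the function name is hypothetical, and the 1 km reference distance is an arbitrary choice of mine.

```javascript
// Sketch only: an alternative, log-space version of the inverse-square idea.
// ASSUMPTION: distances are in km and clamped to minDistanceKm, so the
// adjustment is zero at 1 km and log10 never sees zero.
function distanceAdjustedMagnitude(magnitude, distanceKm, minDistanceKm = 1) {
  const d = Math.max(distanceKm, minDistanceKm);
  return magnitude - 2 * Math.log10(d);
}

// Each tenfold increase in distance knocks 2 units off the adjusted value.
console.log(distanceAdjustedMagnitude(4, 10));
console.log(distanceAdjustedMagnitude(4, 100));
```

One practical difference is that the result stays on a magnitude-like scale rather than collapsing towards zero with distance.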

(If you’re interested in looking into the propagation side of this further, the ElastoDynamics Toolbox for Matlab looks promising).

Anyhow, I’ve roughed this out for Node.js, and the code’s up on GitHub. A bit of tweaking of the fields was needed (e.g. the magnitude appears in the text of the <title> element). It looks like it works ok, but I’ve only done limited testing, shoving the output into a spreadsheet and making a chart. The machine I’m working on right now is old and has limited memory, so is a bit slow to do much of that.
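
For the curious, the kind of field tweaking mentioned above might look something like this. The title layout used here is hypothetical (I’m assuming the word "Magnitude", an optional scale in brackets, then a number), as are the pattern and function names; the real feed’s titles should be checked before trusting the regular expression.

```javascript
// Sketch only: pull a magnitude out of an Atom entry title.
// ASSUMPTION: titles look something like "... Magnitude(ML) 2.3 - ...".
const MAGNITUDE_PATTERN = /Magnitude\s*(?:\(\w+\))?\s*(\d+(?:\.\d+)?)/i;

// Returns the magnitude as a number, or null if the title doesn't match.
function magnitudeFromTitle(title) {
  const match = MAGNITUDE_PATTERN.exec(title);
  return match ? parseFloat(match[1]) : null;
}

// Made-up title, purely for illustration.
console.log(magnitudeFromTitle("2017-01-01 12:00:00 UTC - Magnitude(ML) 2.3 - hypothetical place"));
```

Returning null rather than throwing makes it easy to skip entries whose titles don’t fit the expected shape.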


The red dot is the home location.