I’ve now located the minimum data sources needed to start putting together the neural network for this system. I now need to consider how to sample & shape this data. To this end I’ve roughed out a graph – it’s short on details and will undoubtedly change, but should be enough to decide on how to handle the inputs.
As an overall development strategy, I’m starting with a target of the simplest thing that could possibly work, and iteratively moving towards something with a better chance of working.
I’ve not yet had a proper look at what’s available as archived data, but I’m pretty sure what’s needed will be available. The kind of anomalies that precede earthquakes will be relatively rare, so special case signals will be important in training the network. However, the bulk of training data and runtime data will come come from live online sources.
Prior work (eg OPERA) suggests that clear radio precursors are usually only associated with fairly extreme events, and even those are only detectable using traditional means for geographically close earthquakes. The main hypothesis of this project is that Deep Learning techniques may pick up more subtle indicators, but all the same it makes sense to focus initially on more local, more significant events.
The Istituto Nazionale di Geofisica e Vulcanologia (INGV) provides heaps of data, local to Italy and worldwide. A recent event list can be found here. Of what they offer I found it easiest to code against their Atom feed which gives weekly event summaries. (No surprise I found it easiest, I had a hand in the development of RFC4287 🙂
I’ve put together some basic code for GETting and parsing this feed.
The go-to site for natural ELF/VLF radio information is vlf.it and it’s maintainer Renato Romero has a station located in northern Italy. The audio from this is streamed online (along with other channels) by Paul Nicholson. Reception, logging and some processing of this data is possible using Paul’s VLF Receiver Software Toolkit. I found it straightforward to get a simple spectrogram from Renato’s transmissions using these tools. I’ve not set up a script for logging yet, but I’ll probably get that done later today.
It will be desirable to visualise the VLF signal to look for interesting patterns and the best way of doing this is through spectrograms. Conveniently, this makes the problem of recognising anomalies essentially a visual recognition task – the kind of thing the Deep Learning literature is full of.
The Provisional Graph
Here we go –
This is what I’m picturing for the full training/runtime system. But I’m planning to set up pre-training sessions. Imagine RNN 3 and its connections removed. On the left will be a VLF subsystem and on the right a seismic subsystem.
In this phase, data from VLF logs will be presented as a set of labeled spectrograms to a multi-layer convolutional network CNN. VLF signals contain a variety of known patterns, which include:
- Man-made noise – the big one is 50Hz mains hum (and its harmonics), but other sources include things like industrial machinery, submarine radio transmissions.
- Sferics – atmospherics, the radio waves caused by lightning strikes in a direct path to the receiver. These appear as a random crackle of impulses.
- Tweeks – these again are caused by lightning strikes but the impulses are stretched out through bouncing between the earth and the ionosphere. They sound like brief high-pitched pings.
- Whistlers – the impulse of a lightning strike can find its way into the magnetosphere and follow a path to opposite side of the planet, possibly bouncing back repeatedly. These sound like descending slide whistles.
- Choruses – these are caused by the solar wind hitting the magnetosphere and sound like a chorus of birds or frogs.
- Other anomalous patterns – planet Earth and it’s environs are a very complex system and there are many other sources of signals. Amongst these (it is assumed here) will be earthquake precursors caused by geoelectric activity.
Sample audio recordings of the various signals can be found at vlf.it and Natural Radio Lab. They can be quite bizarre. The key reference on these is Renato Romero’s book Radio Nature – strongly recommended to anyone with any interest in this field. It’s available in English and Italian (I got my copy from Amazon).
So…with the RNN 3 path out of the picture, it should be feasible to set up the VLF subsystem as a straightforward image classifier.
On the right hand side, the seismic section, I imagine the pre-training phase being a series of stages, at least with: seismic data->RNN 1; seismic data->RNN 1->RNN 2. If you’ve read The Unreasonable Effectiveness of Recurrent Neural Networks (better still, played with the code – I got it to write a Semantic Web “specification”) you will be aware of how good LSTMs can be at picking up patterns in series. But it’s pretty clear that the underlying system behind geological events will be a lot more complex than the rules of English grammar & syntax. But I’m (reasonably) assuming that sequences of events, ie predictable patterns do occur in geological systems. While I’m pretty certain that this alone won’t allow useful prediction with today’s technology, it should add information to the system as a whole in the form of probabilistic ‘shapes’. Work already done elsewhere would seem to bear this out (eg see A Deep Neural Network to identify foreshocks in real time).
Training & Prediction
Once the two subsystems have been pre-trained for what seems a reasonable length of time, I’ll glue them together, retaining the learnt weights. The VLF spectrograms will now be presented as a temporal sequence, and I strongly suspect the time dimension will have significance in this data, hence the insertion of extra memory in the form of RNN 3.
At this point I currently envisage training the system in real time using live data feeds. (So the seismic sequence on the right will be time now, and the inputs on the left will be now-n). I’m not entirely sure yet how best to flip between training and predicting, worst case periodically cloning the whole system and copying weights across.
A more difficult unknown for me right now is how best to handle the latency between (assumed) precursors and events. The precursors may appear hours, days, weeks or more before the earthquakes. While I’m working on the input sections I think I need to read up a lot more on Deep Learning & cross-correlation.