Kaggle Earthquake Prediction Challenge

I nearly had kittens when danbri pointed this challenge out to me. I’ve thought for a long time that earthquake prediction was well in scope for machine learning, and have been dismayed at how little uptake there has been. (Surprised too, given the co-location of many tech people and the San Andreas fault…) Hopefully this competition will change things. The deadline is in 3 months, with pretty significant $ prizes.

My second reaction was: “Great! I can work on ELFQuake and maybe win a prize!”. But that isn’t the whole picture. The competition is based around data generated in the lab, essentially by squashing a rock and recording its fractures. Apparently a reasonable approximation of geological effects.

I forget offhand where, but I’ve also seen a project aiming to use machine learning on real data. That project, like this competition, I feel is missing a trick. My gut says that although algorithms can almost certainly be useful in predicting seismic events, training on the seismic data alone is a blinkered approach. These events, in the real world, occur as the result of the behaviour of a massively complex system. There are practical limitations on what can be modelled, but I’d suggest it’s possible to creep a little further into the real world by bringing in data from other natural sources. For example, does the position of the moon influence the timing of events? It seems at least credible, given that its gravity is enough to pull the oceans around.

The data source I reckon looks most promising is natural radio: signals that have been shown to sometimes contain artifacts associated with subsequent seismic events. This is the hypothesis of the ELFQuake project (ELF stands for Extremely Low Frequency; it’s in this frequency range, and that of VLF, Very Low Frequency, that earthquake precursors seem to occur).

For me, the Kaggle challenge has acted as a nudge to get me moving on ELFQuake software again. What’s more, material has already appeared for the basic setup needed to process this data – data that has a lot in common with the ELFQuake targets. This is very convenient for me: although I’m now getting somewhat familiar with the principles and algorithms of Deep Learning, my practical experience is virtually non-existent, so I’ve been given a great foot-up. Here’s some material using sklearn: video, github. The toolkit that seems to me the most appealing option is Keras on TensorFlow (on Anaconda), but a lot of the pre/post Python wrangling will be the same.

I’ve put my name down for the competition. It’s a bit of extra motivation – potential $$$s! – to work on this stuff, and what I get together for it can serve as a placeholder in the end-to-end system I’m aiming for (seismic & radio sensors -> data acquisition -> [magic] -> Twitter notifications).

Happy New Year!

A good time to take stock, huddled in front of the fire.

Boo!

As is often the case, I’ve been moving more slowly on this project than I’d have liked. Lack of resources is a continuing problem, but my own tendency to procrastinate has been by far the biggest obstacle to progress. On top of this, my main dev computer packed up recently, so until I can get that fixed or replaced I’m getting things set up again on an old laptop. Frustrating.

Three Steps Forward…

My strategy of taking a multi-pronged approach has had its pros and cons. I’ve got a prototype VLF receiver mostly built and have spent quite a lot of time playing around with Arduinos and related devices. On the software side – which is really the novel aspect of this project – I did make reasonable progress, getting together a provisional system design and some of the implementation. But then stalled. My desire to build hardware to allow local data collection has been something of a distraction, when there’s nothing stopping me from working with data from INGV and VLF.it.

Plans.

Looking ahead, I really need to reboot myself on the software dev. The ultimate target for running code will be nothing more sophisticated than this laptop. But for exploring algorithms and probably NN training, pre-optimisation, I reckon using cloud services will be my best bet. Concurrently I can look at some of the side prongs that I want to include in the system as a whole – notably web publication of data and automatic generation of Twitter notifications.

As everyone who’s worked on a solo project knows, I’ve also got a lot more material in my head, or at best sketched in notebooks, that needs writing up. How often has the New Year resolution been: “Write more docs”?

Mini-Seismograph

On the hardware side, until I’ve got my income a bit better sorted out, I’m pretty much limited to a scattergun approach using whatever components I have at hand. As well as finishing off the mostly-built VLF receiver, I’ve also got the bits for a basic seismograph. It’ll essentially be:

  • ESP32 microcontroller + comms: core of the subsystem, handling the acquisition and preprocessing of data, which it will expose via a basic web server accessible from the local network (the ESP32 includes WiFi connectivity).
  • MPU6050 sensors: accelerometer + gyroscope; a tiny MEMS device, connected over I2C.
  • MicroSD card data logging: experience shows that 100% connectivity is implausible, so some local history is very desirable.
  • Tiny RTC card realtime clock: the comms will be async, so accurate local timestamps are a must.

The ESP32 is a remarkably capable little device and I’m reasonably confident of the viability of interfacing the peripherals. Hopefully just a matter of plodding through example code for each, tweaking as needed.
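As a taster, here’s a minimal sketch of the MPU6050 part in MicroPython on the ESP32 – the pin numbers and polling interval are assumptions, the register addresses are from the MPU6050 datasheet:

from machine import I2C, Pin
import struct
import time

MPU_ADDR = 0x68                            # MPU6050 default I2C address

i2c = I2C(0, scl=Pin(22), sda=Pin(21))     # assumed ESP32 I2C wiring
i2c.writeto_mem(MPU_ADDR, 0x6B, b'\x00')   # PWR_MGMT_1 = 0: wake the device

while True:
    # From ACCEL_XOUT_H (0x3B): 6 bytes = X, Y, Z as big-endian 16-bit values
    raw = i2c.readfrom_mem(MPU_ADDR, 0x3B, 6)
    ax, ay, az = struct.unpack(">hhh", raw)
    print(ax, ay, az)
    time.sleep_ms(100)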

The MPU6050 sensors are much less sensitive than those of typical seismometers, so only events of significant magnitude are likely to be detectable. It remains to be seen, but I have my suspicions that having 2 different types of sensor in there means it will, with a bit of wrangling, be possible to get more effective sensitivity than the individual sensor data would yield. Whatever happens, once the wiring and code are in place for this setup, it should be trivial to extend it to use a more sensitive sensor. (Note the Raspberry Shake 4D configuration.)

Also…

I’ve got a little tangential project on the go. ELFQuake is in essence about trying to model aspects of a physical system – Earth geology and its electronically-detectable artifacts – creating an analogue in software that captures enough to be able to make useful predictions. I’m also increasingly convinced that the design of the analog circuits between the sensors and the standard data acquisition elements (ADCs etc.) will have a major impact on the potential success of the system. Putting these points together, it shouldn’t seem that off the wall that I’ve been working on the design of an analog computer. (I must admit I also want to play with chaotic systems; it’s something I’ve been messing about with for years.)


Arduino front end ideas

So, as mentioned in previous posts, I reckon it’s worth trying to use Arduinos as front-end microcontrollers for this project, as shown in the block diagram here. An Arduino Uno has 6 analog inputs, and the ESP8266 WiFi card I plan to use has one. These are quite limited – 10-bit ADCs with bandwidth that at best may go up to a few kHz. As such, while they should be OK for picking up seismic data, they fall far short for the ELF/VLF radio capture, which should really go up to the region of 20kHz.

On the seismic side, I think a first pass worth trying is a home-hacked sensitive one-axis sensor, plus a 3-axis gyro and a 3-axis accelerometer. I’ll come back to this in a while – I need to research & buy the gyro & accelerometers. But I have all the components for an attempt at a useful radio subsystem, provisional design as follows…

[Block diagram: vlf-filter-based – the provisional filter-based VLF front end]

Starting at top left, the blue circle represents the actual ELF/VLF radio receiver. This will be some kind of antenna, picking up the electric field with a frequency range from somewhere probably in the 100s of mHz up to around say 200kHz. A good starting point for this seems to be the BBB-4 VLF Receiver. It’s a relatively simple 2-transistor design, with a high impedance FET input followed by a bit of further amplification provided by a regular BJT.

A major problem, as mentioned here before, is mains hum interference. It seems that as well as the 50Hz fundamental, there’s also a significant amount of the 3rd harmonic at 150Hz. So I propose using notch filters at these frequencies (also covered in that earlier post). Given what will follow in the circuit, I don’t think these need to be very high-Q/narrow – just enough to prevent these parts of the input swamping everything else and saturating what comes next. These filters are shown as the yellow block in the diagram.

Next comes a bank of bandpass filters. The Arduino+ESP8266 offer 7 channels, so I propose having the first be relatively broadband – pretty much just a buffer for everything coming from the receiver (post-notches). After each filter will be a simple peak level detector, shown above as a diode & capacitor. The levels on these will be passed on to the Arduino/ESP8266 analogue inputs.

(The diagram is simplified a bit. The gain of the different stages will need to be figured out, and additional gain/buffering/level-shifting/limiting stages will be needed.)

The key references on ELF/VLF radio precursors to earthquakes are vlf.it (note especially the OPERA project) and a chapter in Renato Romero’s Radio Nature book. Alas, it seems that research is fairly inconclusive (and in places contradictory). Radio frequencies from the millihertz right up to microwave are mentioned as possibly containing useful information. But keeping things simple is a major consideration here, so I’ll stick to somewhere a bit below audio up to a bit above. Yes, this project is experimental…

I intend to do a bit more examination of the signals that appear in VLF before going further, though whatever happens, the choice of frequency bands at this point has to be fairly arbitrary. Decades in the audio range seem a reasonable starting point. So on top of 0. broadband, here goes:

  1. 0.01 … 10Hz
  2. 20Hz
  3. 200Hz
  4. 2kHz
  5. 20kHz
  6. 40kHz … 200kHz

How narrow or broad to make the filters for best results is another question that I reckon can only be answered with the help of experimentation. But it is possible to make pragmatic educated guesses. I intend to use general-purpose op amps for the implementation.

At the bottom end of (1.), I suspect it’ll be more effort than it’s worth to worry too much about LF roll-off; a simple buffered CR filter should be adequate – effectively just DC blocking. For the top end of (1.), a straightforward two op amp LP filter should be fine. For (2.) – (5.), bandpass filters made from 2 op amps should make a fair starting point. Regarding the steepness of their curves, Butterworth configurations (maximally flat in the passband) keep the design straightforward.

You may notice that bands (3.) and up sit at multiples of 50Hz. But I’m hoping that using standard value/tolerance components will give enough offset to alleviate the hum harmonics. E.g. using the Sallen-Key circuit (this one is a low pass, but shows what I’m talking about):

[Schematic: Sallen-Key low-pass example]

This gives fc = 15.9 kHz and Q = 0.5, subject to component tolerances (typical inexpensive capacitors are +/-10%) – the kind of values that are probably close enough to the decades above to usefully split the ranges, but (hopefully) off-centre for the 50Hz harmonics.
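As a quick sanity check on those numbers, here’s the equal-component, unity-gain Sallen-Key formula in Python – the R and C values are assumed example values that give the 15.9 kHz above, not a final design:

import math

R = 10e3   # ohms (assumed example value)
C = 1e-9   # farads

fc = 1 / (2 * math.pi * R * C)
print(f"fc = {fc:.0f} Hz")   # ~15915 Hz, i.e. 15.9 kHz; Q = 0.5 for this topology

# Effect of +/-10% capacitor tolerance on the centre frequency:
for tol in (-0.10, +0.10):
    shifted = 1 / (2 * math.pi * R * C * (1 + tol))
    print(f"C {tol:+.0%}: fc = {shifted:.0f} Hz")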

I don’t know if I’ve mentioned it before, but as the radio receiver needs to be as far away as possible from power lines (the distance will likely be determined by my WiFi range), I’m intending to use little solar panels feeding rechargeable batteries for power.

While on the subject, I reckon it’ll also be worthwhile adding data from other environmental sensors, notably for temperature and acoustic noise (a mic). Pretty straightforward for Arduinos. Variations in this data may be unlikely to be useful as earthquake precursors, but they will almost certainly play a part in environmental noise picked up by the radio & seismic sensors. My hope is to get a Deep Learning configuration together that will in effect subtract this from the signals of interest.

Arduino – initial experiences

skip to Arduino/WiFi bit, also Issues Raised and a Cunning Plan

Requirements & Constraints

On the hardware side of this project, I want to capture local seismic and ELF/VLF radio data. I’ve given myself two major constraints: it should be simple, and it should be low cost. These constraints are somewhat conflicting. For example, on the seismic side, a simple approach would be to purchase a Raspberry Shake, an off-the-shelf device based on a Raspberry Pi and an (off-the-shelf) geophone. Unfortunately, these gadgets start at $375 USD, and that’s only for one dimension (and there may be software licensing issues). I want to capture 3D data, and want to keep the price comfortably under $100. Note that absolute measurement, calibration etc. are non-requirements for this project. So the plan is to hack something. I’m taking rather a scattergun approach to the hardware – find as many approaches as are feasible and try them out.

Both the seismic and radio sensor subsystems have particular requirements when it comes to physical location. The seismic part should ideally be firmly attached to local bedrock; the radio part should be as far away as possible from interference – mains hum being the elephantine wasp in the room. For my own installation this will probably mean bolting the seismic part to my basement floor (which is largely on bedrock) and having the radio part as far up the fields as I can get it.

What seems the most straightforward starting point is to feed data from the sensors into a local ADC, pass this through a microcontroller into a WiFi transceiver, then pick this up on the home network. (WiFi range may well be an issue – but I’ll cross that bridge when I come to it).

The two microcontroller systems that seem most in the frame due to their relatively low cost are the aforementioned Raspberry Pi and the Arduino family. For a first pass, something Arduino-based seems the best bet – they are a lot cheaper than the Pis, and have the advantage of having multiple ADCs built in (compared to the Pi’s none – though there are straightforward add-ons).

Arduino Fun

Quite a while ago I ordered a couple of Arduino Unos and WiFi shields from Banggood, a China-based retailer of low cost stuff. My only prior experience with Arduinos was when my brother was building something MIDI-related and hit a code problem. He mailed me on the off-chance, and amusingly I was able to solve the problem in my reply – it was a fairly easy bit of C (I hadn’t done any C for years, but coding is coding).

I instantly fell in love with the Arduino boards (actually a clone by GeekCreit). After very little time at all I was able to use the Arduino IDE to get some of the example code running on one of the devices. Light goes on, light goes off, light goes on… Very user friendly.

ESP8266 Nightmares

In my naivety, I assumed the WiFi shields would be as straightforward. Most probably are, but the ones I ordered have been distinctly painful so far. I can at least put the slow progress down as a learning experience. Essentially the ones I got have several issues. The story so far:

The boards I got are labeled “Arduino ESP8266 WiFi Shiald Version 1.0 by WangTongze”. Yup, that’s ‘Shiald’ – not auspicious. The first major issue was that the only official documentation was in Chinese (Mandarin?). I wasted a lot of time trying to treat them as more standard boards, but then found two extremely helpful blog posts by Claus: Using ESP8266 Shield ESP-12E (elecshop.ml) by WangTongze with an Arduino Uno and Arduino ESP8266 WiFi Shield (elecshop.ml) by WangTongze Comparison.

The first of these posts describes a nifty little setup, using an Arduino board as a converter from USB to TTL level RS232 that the Shiald can understand (I didn’t think to order such an adapter). It looks like this:

[Photo: arduino1 – the Uno wired up as a USB-serial adapter for the Shiald]

By default the Shiald connects its serial TX/RX pins to the Arduino’s, which does seem a design flaw. But this can (apparently) be flipped to using software serial via regular digital I/O pins on the Uno. A key thing needed is to tell the Shiald to use 9600 baud rather than its default 115200. The setup above allows this. This part worked for me.
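For reference, the baud change is done with an AT command over the serial link. A hedged sketch from the host side using pyserial – the port name is whatever your adapter shows up as, and AT+UART_DEF is the command in recent ESP8266 AT firmware (older builds used AT+CIOBAUD instead):

import serial

# Assumed port name; the module still talks at its default 115200 here.
with serial.Serial("/dev/ttyUSB0", 115200, timeout=2) as port:
    port.write(b"AT+UART_DEF=9600,8,1,0,0\r\n")   # 9600 baud, 8N1, no flow control
    print(port.read(64))                          # expect an OK response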

However, at this point, after bending the TX/RX pins out of the way on the Shiald and plugging it in on top of the Uno (with jumpers to GPIO for TX/RX), I couldn’t talk to it. So, going back to Claus’s post, he suggests updating the Shiald’s firmware. Following his links, I tried a couple, and ended up with the setup spewing gibberish (at any baud rate).

At this point – after a good few hours yesterday – I was ready to cut my losses with the WiFi Shialds. I’d mentioned to danbri that I was struggling with these cards and he mentioned that he’d had a recommendation (from Libby) of Wemos cards. So I started having a look around at what they were. As it happens, they have a page on their wiki: Tutorial – Returning a Wemos D1 Mini to Factory Firmware (AT). The D1 uses the same ESP8266 chip as my Shiald, so this morning, with nothing to lose, I adjusted the script and gave it a shot. Going back to the setup in the pic (with DIP switches tweaked as Claus suggests), it worked! (Tip: along the way of flashing, I had to press the Shiald’s reset button a couple of times.)

[Screenshot: arduino-at – the Shiald responding to AT commands after reflashing]

So far so pleasing – I thought I might have bricked the board.

(See also ESP8266 Wifi With Arduino Uno and Nano)

Since then I’ve tried the Shiald mounted on top of the Arduino in a good few configurations with various software utilities. I haven’t yet got everything talking to everything else, but this does feel like progress.

Issues Raised and a Cunning Plan

Sooo… these Shialds have been rather thieves of time, but it’s all learning, innit.

These bits of play have forced me into reading up on the Arduinos a bit more. For this project, a key factor is the ADC sample rate. It seems that the maximum achievable for a single ADC is around 9kHz (at 10-bit precision). That should be plenty for the seismic sensor. The radio sensor is another matter: I’d like to be able to cover up to say 20kHz, which means a sampling rate of at least 40kHz. I’m still thinking about this one, but one option would be to use an ADC shield – these ones from Mayhew Labs look plenty fast – though getting the fast data along to WiFi could well be an issue (intermediate baud rates). If necessary, some local processing could be a workaround. I have been intending to present the radio data to the neural network(s) as spectrograms, so maybe running an FFT on the Arduino would be feasible.

Along similar lines, I may have a Cunning Plan: shift some of the processing from digital to analog. This is likely to need a fair amount of research & experimentation, but the practical circuitry could be very straightforward. It seems at least plausible that the earthquake precursors will occur largely in particular frequency regions. The Arduino has 6 analog inputs. So imagine the radio receiver being followed by 6 bandpass filters, each tuned around where precursors may be expected. A simple diode & (leaky) capacitor peak level detector for each of these could provide a very crude spectrogram, at a rate the Arduino could easily handle. Op amp BP filters are straightforward and cheap, so an extra $5 on the analog side might save $40 and a load more work afterward.

Regarding the research – a key source is (of course) Renato Romero’s vlf.it, notably the OPERA project – although that does seem to focus at the low end of potential frequencies.

Hall Effect-based Seismometer, Sanity Check Experiments

PS. Oops! I made a silly mistake in the breadboarding: if you look closely at the photo you can see that the 10k ground resistor at the input of the op amp is going to the + input, not – as intended. Which kind of messes up all my measurements. Hey ho. I have since made a ball-bearing-in-a-jar (1 axis) sensor and roughed out a signal conditioning circuit (which will now need tweaking…), so will repeat the experiment here and do another post asap.

A fun part of this project is the investigation of hardware possibilities for detecting seismic events and ELF/VLF signals. Even though I’m aiming towards minimum budget hardware, my funds for this have been virtually non-existent so I’ve not got much done (grumble, grumble).

For a seismometer, the requirements as it seems to me, are: simplicity, reasonable sensitivity and low cost. Ideally I want to monitor all 3 dimensions with relatively wide bandwidth. A non-requirement is any kind of absolute accuracy or calibrated measurement.

There are a variety of options for seismic sensors, most that I’ve seen fall down for these requirements in one way or another. I won’t go into them here – try searching for accelerometers (low sensitivity), geophones (expensive), pendulum-based systems (complicated build, 3 dimensions would be very tricky…). To give a ballpark, prices for a ready-made seismometer system based on the Raspberry Pi, the Raspberry Shake, start at $375 USD. That’s for one dimension, using a geophone sensor.

Almost a year ago I sketched out an idea for something that might work.

[Photo: DSCN7976 – the original sketch of the idea]

At the time I picked up a linear Hall effect device from Jaycar, a UGN3503UA, costing just $7.75 AUD. It’s in a case very like a transistor’s, with just 3 pins: +ve and -ve power, and output. An example use in the datasheet exploits the same principle I want to use:

[Datasheet figure: gear-sense – gear-tooth sensing with a back-mounted magnet]

A magnet is glued to the back of the sensor. As a (ferrous material) cog approaches the sensor, the magnetic field increases, correspondingly increasing the device’s output voltage.

The other day a bag of ball bearings arrived. I just got around to having a play. This is what the setup looked like:

[Photo: DSCN7974 – the breadboard setup]

I’ve got the Hall effect device soldered to a connector to make breadboarding easier. On the left of it is a blob of Blu-Tack attaching a 1cm diameter/3mm deep neodymium magnet. On the right, a 5/8″ steel ball precision-mounted between my finger & thumb.

Right now I’ve only got a crude +/- homebrew power supply, so I’m using an op amp to buffer a potential divider to provide a lower voltage to suit the device. Another op amp provides 10x amplification of the device’s output.

When I put the magnet in direct contact with the sensor, it saturated at one extreme or the other. I seemed to get the best results with around 1cm of space in between. With a 5.2V supply to the sensor, this gave a no-magnet output of 2.52V (after the 10x amplification). With the magnet, this changed to 3.07V or 1.76V depending on polarity. With the ball bearing 1cm away the output changed by approx 0.01-0.02V, steadily increasing from there to 3.50V/1.22V when the ball bearing touched the sensor.

This sensitivity was less than I’d hoped, but will hopefully be enough to be usable if I tweak a few of the components. I reckon it’s definitely worth going for a prototype to see how it behaves in practice.

I’ll need to find a very small jar 🙂

Here are my full notes:

[Scan: seismo-experiment – full notes]


Preconditioning Seismic Data

The filtered data I have is CSV – lots of lines with the fields:

datetime, latitude, longitude, depth, magnitude

The latter 4 fields will slot in as they are, but a characteristic of seismic events is that they can occur at any time. Say today 4 events were detected at the following times:

E1 01:15:07 lat1 long1 d1 2.2
E2 01:18:06 lat2 long2 d2 3.1
E3 01:20:05 lat3 long3 d3 2.1
E4 08:15:04 lat4 long4 d4 3.5

To get the data in a shape that can act as input to a neural network (my first candidate is PredNet), it seems like there are two main options:

Time Windows

Say we decide on a 6-hour window starting at 00:00. Then E1, E2 and E3 will fall in one window, E4 in the next. Which leads to the question of how to aggregate the first 3 events. Events are often geographically clustered – a large event will be associated with nearby foreshocks and aftershocks. For a first stab at this, it doesn’t seem unreasonable to assume such clustering will be the typical case. With this assumption, the data collapses down to:

[00:00-06:00] E2 lat2 long2 d2 3.1
[06:00-12:00] E4 lat4 long4 d4 3.5

This is lossy, so if, say, E1 and E2 were in totally different locations, the potentially useful information of E1 would be lost. A more sophisticated strategy would be to look for local clustering – not difficult in itself (check Euclidean distances), but then the question would be how to squeeze several event clusters into one time slot. As it stands it’s a simple strategy, and worth a try I reckon.
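To make that concrete, here’s a hedged sketch in Python/pandas of the take-the-biggest-event-per-window idea. The column names follow the CSV fields above; the file name and 6-hour bin size are the ones proposed, nothing more:

import pandas as pd

cols = ["datetime", "latitude", "longitude", "depth", "magnitude"]
events = pd.read_csv("events.csv", names=cols, parse_dates=["datetime"])

# For each 6-hour bin, keep only the most significant (highest magnitude) event.
rows = []
for _, window in events.groupby(pd.Grouper(key="datetime", freq="6H")):
    if not window.empty:
        rows.append(window.loc[window["magnitude"].idxmax()])
windowed = pd.DataFrame(rows)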

Time Differences

This strategy would involve a little transformation, like so:

E1[datetime]-E0[datetime] = ? lat1 long1 d1 2.2
E2[datetime]-E1[datetime] = 00:03:01 lat2 long2 d2 3.1
E3[datetime]-E2[datetime] = 00:02:01 lat3 long3 d3 2.1
E4[datetime]-E3[datetime] = 07:05:01 lat4 long4 d4 3.5

Now I must confess I really don’t know how much sense this makes, but it does capture all the information, so it might just work. Again, it’s pretty simple and also worth a try.
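In pandas terms (reusing the events frame from the sketch above), the transform is essentially a one-liner:

events = events.sort_values("datetime")
# interval since the previous event; NaT for E1, whose predecessor is unknown
events["gap"] = events["datetime"].diff()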

I’d very much welcome comments and suggestions on this – do these strategies make sense? Are there any others that might be worth a try?


Seismic Data – fixed?

As described in my last post, I was seeing significant gaps in the seismic event data I was retrieving from the INGV service. So I re-read their docs. Silly me – I’d missed the option to include query arguments restricting the geographic area of the events (I had code in a subsequent script doing this).

While tweaking the code to cover these parameters I also spotted a really clumsy mistake. I had a function doing more or less this –

for each event element in XML DOM:
        extract event data
        add event data to list
        return list

D’oh! Should have been –

for each event element in XML DOM:
        extract event data
        add event data to list
return list
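In concrete (if simplified) Python terms, the fixed function might look something like this – the element and field names here are assumptions, the real INGV QuakeML is namespaced and much richer:

import xml.etree.ElementTree as ET

def extract_events(xml_text):
    events = []
    root = ET.fromstring(xml_text)
    for event in root.iter("event"):          # iterate over event elements
        events.append({
            "time": event.findtext("time"),           # assumed child names
            "magnitude": event.findtext("magnitude"),
        })
    return events                             # return now outside the loop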

I’ve also improved error handling considerably, discriminating between genuine HTTP errors and HTTP 204 No Content. Now I’ve narrowed the geo area and reduced the time window for each GET down to 1 hour, there are quite a lot of 204s.

I’m now running it over the time period around the l’Aquila quakes as a sanity check. Jeez, 20+ events in some hours, 10+ in most.

Assuming this works OK, I’ll run it over the whole 1997-2017 period; hopefully in ~12 hours’ time I’ll have some usable data.

PS. Looking good, for the 30 days following that of the l’Aquila big one, it produced:

in_zone_count = 8877
max_depth = 62800.0
max_magnitude = 6.1


Seismic Data Wrangling

Following my interim plan of proceeding software-only (until I’ve the funds to get back to playing with hardware), I’ve been looking at getting seismic event data from the INGV Web Service into a Keras/Tensorflow implementation of PredNet.

My code is on GitHub, and rather than linking to individual files which I may rename, I’ll put a README over there with pointers.

As a first step, I put together code to pull the data and dump it down to simple CSV files. This appeared to be working. The demo implementation of PredNet takes HDF5 data from the KITTI vision dataset (videos from a car on the roads around Karlsruhe), extracting it into numpy arrays, with the PredNet engine using Keras. To keep things simple I wanted to follow the same approach. I’m totally new to HDF5 so I pinged Bill Lotter of the PredNet project for clues. He kindly gave me some helpful tips, and concurred with what I’d been thinking – keep the CSV data, and process that into something PredNet can consume.

The data offered by the Web Service is good XML delivered over HTTP (props to INGV). But it does include a lot of material (provenance, measurement accuracy etc) that isn’t needed here. So my service-to-CSV code parses out just the relevant parts, producing a line for each event:

datetime, latitude, longitude, depth, magnitude

e.g.

2007-01-02T05:28:38.870000, 43.612, 12.493, 7700, 1.7

I couldn’t find the info anywhere, but it appears that the INGV service records go back at least to somewhere in the early 1990s, so I chose 1997-01-01T00:00:00 as a convenient start datetime, giving me 20 years of events.

For this to be a similar shape to what PredNet expects, I will aggregate events within a particular time period (taking the most significant event in that period). I reckon 6-hour periods should be about right. This also seemed a reasonable window for calling the service (not). I’ll filter the events down to just those within the region of interest (northern Italy, see earlier post) and then scale the latitude & longitude to an easy integer range (probably [128, 128]). For a first pass I’ll ignore the depth field.
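A hedged sketch of that scaling step – the bounding box numbers below are placeholders for illustration, not the actual region used:

GRID = 128
LAT_MIN, LAT_MAX = 42.0, 47.0   # assumed bounding box, illustration only
LON_MIN, LON_MAX = 7.0, 14.0

def to_grid(lat, lon):
    # map latitude/longitude into integer [0, GRID-1] coordinates
    y = int((lat - LAT_MIN) / (LAT_MAX - LAT_MIN) * (GRID - 1))
    x = int((lon - LON_MIN) / (LON_MAX - LON_MIN) * (GRID - 1))
    return x, y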

As it happens, I’m well on the way to having implemented this. But along the way I did a few sanity checks, e.g. checking for the maximum event magnitude in the region of interest (I got 4.1), and it turned out I was missing some rather significant data points. Here’s one I checked for:

The 2009 L’Aquila earthquake occurred in the region of Abruzzo, in central Italy. The main shock occurred at 03:32 CEST (01:32 UTC) on 6 April 2009, and was rated 5.8 or 5.9 on the Richter magnitude scale and 6.3 on the moment magnitude scale; its epicentre was near L’Aquila, the capital of Abruzzo, which together with surrounding villages suffered most damage.

Nope, it wasn’t in the CSV, but the Web Service knows all about it:

http://webservices.ingv.it/fdsnws/event/1/query?eventId=1895389

Doing a service call for that whole day:

http://webservices.ingv.it/fdsnws/event/1/query?starttime=2009-04-06T00:00:00&endtime=2009-04-06T23:59:59

–  yields 877 events – nightmare day!

I’d set the timeout on the HTTP calls to 2 seconds, but there is so much data associated with each event that this was woefully inadequate. Since upped to 5 mins.

Manually checking calls, I was also sometimes getting an HTTP status code of 413 Request Entity Too Large. This puzzled me mightily – still does, actually. It says request entity, not requested (or response) entity, but the way it’s behaving is that the response requested is too large. Either way I reckon the spec (latest is RFC 7231) is a little open to misinterpretation here. (What the heck – I’ve mailed the IETF HTTP list about it – heh, well well, I’ve co-chaired something with the chair…)

Anyhow, I’ve also tweaked the code to make calls over just 1 hour windows, hopefully it’ll now get the stuff it was missing.

Hmm… I’ve got it running now and it’s giving errors throughout the year 2000, which should be trouble-free. I think I’ll have to have it make several passes/retries to ensure I get the maximum data available.
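Something along these lines, perhaps – a hedged sketch using requests, treating 204 as a legitimately empty window (the function and parameter names are mine, not from the real script):

import time
import requests

def fetch_window(url, retries=3, timeout=300):
    for attempt in range(retries):
        try:
            response = requests.get(url, timeout=timeout)
            if response.status_code == 200:
                return response.text
            if response.status_code == 204:    # no events in this window
                return None
        except requests.RequestException:
            pass                               # network hiccup - try again
        time.sleep(2 ** attempt)               # back off between passes
    raise RuntimeError("giving up on " + url)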

Drat! It’s giving me Entity Too Large with just 1 hour windows, e.g.

http://webservices.ingv.it/fdsnws/event/1/query?starttime=2000-12-13T01:00:00&endtime=2000-12-13T02:00:00

I need to fix this…


Candidate Neural Network Architecture : PredNet

While I sketched out a provisional idea of how I reckoned the network could look, I’m doing what I can to avoid reinventing the wheel. As it happens there’s a Deep Learning problem with implemented solutions that I believe is close enough to the earthquake prediction problem to make a good starting point : predicting the next frame(s) in a video. You train the network on a load of sample video data, then at runtime give it a short sequence and let it figure out what happens next.

This may seem a bit random, but I think I have good justification. The kind of videos people have been working with are things like human movement or the motion of a car. (Well, I’ve seen one notable, fun exception: Adversarial Video Generation applied to the activities of Ms. Pac-Man.) In other words, a projection of objects obeying what is essentially Newtonian physics. Presumably seismic events follow the same kind of model. As mentioned in my last post, I’m currently planning on using online data that places seismic events on a map, providing the following: event time, latitude, longitude, depth and magnitude. The video prediction nets generally operate over time on x, y with R, G, B for colour. Quite a similar shape of data.

So I had a little trawl of what was out there. There’s a surprisingly wide variety of strategies, but one in particular caught my eye: PredNet. This is described in the paper Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning (William Lotter, Gabriel Kreiman & David Cox from Harvard) and has supporting code etc. on GitHub. Several things about it appealed to me. It’s quite an elegant conceptual structure, which translates in practice into a mix of convnets/RNNs, not too far from what I anticipated needing for this application. This (from the paper) might give you an idea:

[Figure: prednet-block – PredNet module structure, from the paper]

Another plus from my point of view was that the demo code is written using Keras on Tensorflow, exactly what I was intending to use.

Yesterday I had a go at getting it running. Right away I hit a snag: I’ve got this laptop set up for TensorFlow etc. on Python 3, but the code uses hickle.py, which uses Python 2. I didn’t want to risk messing up my current setup (it took ages to get working) so I had a go at setting up a Docker container – TensorFlow has an image. Day-long story short, something wasn’t quite right. I suspect the issues I had related to nvidia-docker, which is needed to run on GPUs.

Earlier today I decided to have a look at what would be needed to get the PredNet code Python3-friendly. Running kitti_train.py (KITTI is the demo data set) led straight to an error in hickle.py. Nothing to lose, had a look. “Hickle is a HDF5 based clone of Pickle, with a twist. Instead of serializing to a pickle file, Hickle dumps to a HDF5 file.“ There is a note saying Python3 support is in progress, but the cause of the error turned out to be –

if isinstance(f, file):

file isn’t a thing in Python3. But kitti_train.py was only passing a filename to this, via data_utils.py, so I just commented out the lines associated with the isinstance check. (I guess I should fix it properly and feed back to Hickle’s developer.)
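For the record, a hedged sketch of what a proper version-agnostic check might look like (untested against Hickle itself):

import io
import sys

if sys.version_info[0] >= 3:
    FILE_TYPES = (io.IOBase,)          # what open() returns in Python 3
else:
    FILE_TYPES = (file, io.IOBase)     # 'file' only exists in Python 2

# then, in place of the original test:
# if isinstance(f, FILE_TYPES):
#     ...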

It worked! Well, at least for kitti_train.py. I’ve got it running in the background as I type. This laptop only has a very wimpy GPU (GeForce 920M) and it took a couple of tweaks to prevent near-immediate out-of-memory errors:

export TF_CUDNN_WORKSPACE_LIMIT_IN_MB=100

kitti_train.py, line 35:
batch_size = 2  # was 4

It’s taken about an hour to get to epoch 2/150, but I did renice Python way down so I could get on with other things.

Seismic Data

I’ve also spent a couple of hours on the (seismic) data-collecting code. I’d foolishly started coding around this using Javascript/node, simply because it was the last language I’d done anything similar with. I’ve got very close to having it gather & filter blocks of events from the INGV service and dump them to a (CSV) file. But I reckon I’ll just ditch that and recode it in Python, so I can dump to HDF5 directly – it does seem a popular format around the Deep Learning community.

Radio Data

Yes, there’s that to think about too.

My gut feeling is that applying Deep Learning to the seismic data alone is likely to be somewhat useful for predictions. From what I’ve read, the current approaches being taken (in Italy at least) are effectively along these lines, leaning towards traditional statistical techniques. No doubt some folks are applying Deep Learning to the problem. But I’m hoping that bringing in radio precursors will make a major difference in prediction accuracy.

So far I have in mind generating spectrograms from the VLF/ELF signals. Which gives a series of images… sound familiar? However, I suspect that there won’t be quantitatively all that much information coming from this source (though I’m assuming it’s qualitatively vital). As a provisional plan I’m thinking of pushing it through a few convnet/pooling layers to get the dimensionality way down, then adding that data as another input to the PredNet.
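A hedged sketch of the spectrogram step with scipy – the sample rate and window parameters are placeholder guesses, and the input here is just noise standing in for receiver audio:

import numpy as np
from scipy.signal import spectrogram

fs = 48000                           # assumed receiver sample rate
audio = np.random.randn(10 * fs)     # stand-in for 10s of VLF receiver output

f, t, Sxx = spectrogram(audio, fs=fs, nperseg=1024, noverlap=512)
frames = 10 * np.log10(Sxx + 1e-12)  # dB scale; one column per time step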

Epoch 3/150 – woo-hoo!

PS.

It was taking way too long for my patience, so I changed the parameters a bit more:

nb_epoch = 50 # was 150
batch_size = 2 # was 4
samples_per_epoch = 250 # was 500
N_seq_val = 100 # number of sequences to use for validation

It took ~20 hours to train. kitti_evaluate.py has produced some results, but also exited with an error code. I’m a bit too tired to look into it now, but am very pleased to get a bunch of these: