## Preconditioning Seismic Data

The filtered data I have is a CSV file with many lines, each with the fields:

`datetime, latitude, longitude, depth, magnitude`

The last four fields will slot in as they are, but a characteristic of seismic events is that they can occur at any time, so the datetime field needs some thought. Say 4 events were detected today at the following times:

E1 01:15:07 lat1 long1 d1 2.2
E2 01:18:06 lat2 long2 d2 3.1
E3 01:20:05 lat3 long3 d3 2.1
E4 08:15:04 lat4 long4 d4 3.5

To get the data in a shape that can act as input to a neural network (my first candidate is PredNet), it seems like there are two main options:

#### Time Windows

Say we decide on a 6-hour window starting at 00:00. Then E1, E2 and E3 fall in one window and E4 in the next, which raises the question of how to aggregate the first three events. Events are often geographically clustered: a large event will be associated with nearby foreshocks and aftershocks. For a first stab at this, it doesn’t seem unreasonable to assume such clustering will be the typical case. With this assumption, each window can be represented by its largest-magnitude event, and the data collapses down to:

[00:00-06:00] E2 lat2 long2 d2 3.1
[06:00-12:00] E4 lat4 long4 d4 3.5
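As a rough sketch of this collapse, something like the following would do it with just the standard library. The coordinates and depths are made-up placeholders standing in for lat1, long1, d1 and so on, the function name `aggregate_by_window` is my own, and it assumes the window length divides evenly into 24 hours.

```python
from datetime import datetime, timedelta

# Hypothetical sample events: (datetime, latitude, longitude, depth, magnitude).
# Times and magnitudes follow the example above; the rest are placeholders.
events = [
    (datetime(2020, 1, 1, 1, 15, 7), 43.1, 12.5, 10.0, 2.2),  # E1
    (datetime(2020, 1, 1, 1, 18, 6), 43.2, 12.6, 9.0, 3.1),   # E2
    (datetime(2020, 1, 1, 1, 20, 5), 43.1, 12.6, 11.0, 2.1),  # E3
    (datetime(2020, 1, 1, 8, 15, 4), 42.9, 13.0, 8.0, 3.5),   # E4
]


def aggregate_by_window(events, window_hours=6):
    """Keep only the largest-magnitude event in each fixed time window."""
    strongest = {}
    for event in events:
        when = event[0]
        # Window start is midnight plus a whole number of windows.
        midnight = when.replace(hour=0, minute=0, second=0, microsecond=0)
        index = when.hour // window_hours
        start = midnight + timedelta(hours=index * window_hours)
        if start not in strongest or event[4] > strongest[start][4]:
            strongest[start] = event
    # Return (window_start, event) pairs in chronological order.
    return [(start, strongest[start]) for start in sorted(strongest)]


windows = aggregate_by_window(events)
for start, event in windows:
    print(start.strftime("%H:%M"), event[1:])
```

Note that windows containing no events are simply skipped here; a network expecting evenly spaced time steps would need those gaps filled in somehow.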

This is lossy: if, say, E1 and E2 were in totally different locations, the potentially useful information in E1 would be lost. A more sophisticated strategy would be to look for local clustering, which is not difficult in itself (check Euclidean distances), but then the question would be how to squeeze several event clusters into one time slot. As it stands it’s a simple strategy, and worth a try I reckon.
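For what it's worth, the distance check for local clustering could be sketched along these lines. This is a deliberately naive greedy grouping, not a proper clustering algorithm; the `max_distance` threshold of 0.5 degrees and the sample points are arbitrary placeholders. Treating latitude and longitude as plain Euclidean coordinates is also only a crude approximation, since a degree of longitude covers less ground the further you are from the equator.

```python
import math


def cluster_by_distance(points, max_distance=0.5):
    """Greedy grouping: a point joins the first cluster whose seed is close enough."""
    clusters = []
    for point in points:
        for cluster in clusters:
            # Compare against the cluster's first member (its "seed").
            if math.dist(point, cluster[0]) <= max_distance:
                cluster.append(point)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([point])
    return clusters


# Hypothetical (latitude, longitude) pairs: two nearby events and one far away.
points = [(43.1, 12.5), (43.2, 12.6), (38.0, 15.0)]
clusters = cluster_by_distance(points)
print(len(clusters))
```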

#### Time Differences

This strategy would involve a little transformation, like so:

E1[datetime]-E0[datetime] = ? lat1 long1 d1 2.2
E2[datetime]-E1[datetime] = 00:02:59 lat2 long2 d2 3.1
E3[datetime]-E2[datetime] = 00:01:59 lat3 long3 d3 2.1
E4[datetime]-E3[datetime] = 06:54:59 lat4 long4 d4 3.5

Now I must confess I really don’t know how much sense this makes, but it keeps every event rather than discarding any, so it might just work. Again, it’s pretty simple and also worth a try.
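A minimal sketch of the time-difference transformation might look like this. The coordinates and depths are again invented placeholders, `to_time_differences` is a name I've made up, and expressing the gap in seconds is just one possible choice. The first event has no predecessor, so its gap is left as `None`, matching the `?` above.

```python
from datetime import datetime

# Hypothetical sample events: (datetime, latitude, longitude, depth, magnitude).
# Times and magnitudes follow the example above; the rest are placeholders.
events = [
    (datetime(2020, 1, 1, 1, 15, 7), 43.1, 12.5, 10.0, 2.2),  # E1
    (datetime(2020, 1, 1, 1, 18, 6), 43.2, 12.6, 9.0, 3.1),   # E2
    (datetime(2020, 1, 1, 1, 20, 5), 43.1, 12.6, 11.0, 2.1),  # E3
    (datetime(2020, 1, 1, 8, 15, 4), 42.9, 13.0, 8.0, 3.5),   # E4
]


def to_time_differences(events):
    """Replace each absolute timestamp with the gap (in seconds) since the previous event."""
    rows = []
    previous = None
    for when, lat, lon, depth, mag in events:
        # The first event has nothing to be compared with.
        gap = None if previous is None else (when - previous).total_seconds()
        rows.append((gap, lat, lon, depth, mag))
        previous = when
    return rows


rows = to_time_differences(events)
for row in rows:
    print(row)
```

The gaps here range from a couple of minutes to several hours, so some kind of scaling (a log transform, say) may be needed before feeding them to a network, though that's a guess on my part.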

I’d very much welcome comments and suggestions on this – do these strategies make sense? Are there any others that might be worth a try?