Categories
correlation pandas python statistics

How to get the correlation between two timeseries using Pandas

I have two sets of temperature data, each with readings at regular (but different) time intervals. I’m trying to get the correlation between these two sets of data.

I’ve been playing with Pandas to try to do this. I’ve created two time series and am using TimeSeriesA.corr(TimeSeriesB). However, if the timestamps in the two series do not match up exactly (they’re generally off by seconds), I get Null as the answer. I could get a decent answer if I could:

a) Interpolate/fill the missing times in each time series (I know this is possible in Pandas, I just don’t know how to do it)

b) Strip the seconds out of the Python datetime objects (set seconds to 00 without changing the minutes). I’d lose a degree of accuracy, but not a huge amount

c) Use something else in Pandas to get the correlation between two time series

d) Use something in Python to get the correlation between two lists of floats, each float having a corresponding datetime object, taking the time into account.
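
To make the problem concrete, here is a rough sketch of what I mean by (a) and (b). The data is made up (two hypothetical sensors sampled once a minute, a few seconds apart), and the variable names are only for illustration. I’m assuming that flooring the index to the minute and reindexing onto the combined timestamps with time-based interpolation are reasonable ways to do this; I’m not sure they are the recommended ones.

```python
import numpy as np
import pandas as pd

# Made-up data: two sensors sampled once a minute, offset by a few seconds.
idx_a = pd.date_range("2024-01-01 00:00:03", periods=60, freq="60s")
idx_b = pd.date_range("2024-01-01 00:00:07", periods=60, freq="60s")
signal = np.sin(np.linspace(0, 6, 60))
a = pd.Series(20 + signal, index=idx_a)
b = pd.Series(21 + 2 * signal, index=idx_b)

# corr() aligns on the index first; with no shared timestamps the result is NaN.
print(a.corr(b))

# Option (b): drop the seconds by flooring every timestamp to the minute.
a_floor = a.copy()
a_floor.index = a_floor.index.floor("min")
b_floor = b.copy()
b_floor.index = b_floor.index.floor("min")
corr_floor = a_floor.corr(b_floor)

# Option (a): put both series on the union of all timestamps and
# fill the gaps by interpolating linearly in time.
union = a.index.union(b.index)
a_interp = a.reindex(union).interpolate(method="time")
b_interp = b.reindex(union).interpolate(method="time")
corr_interp = a_interp.corr(b_interp)

print(corr_floor, corr_interp)
```

Flooring is simpler but treats readings a few seconds apart as simultaneous, while interpolation keeps the original timestamps at the cost of inventing in-between values.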

Does anyone have any suggestions?