
Pandas Compute Hourly Average And Set At Middle Of Interval

I want to compute the hourly mean for a time series of wind speed and direction, but I want to set the time at the half hour. So, the average for values from 14:00 to 15:00 will be

Solution 1:

So the easiest way is to resample and then use linear interpolation:

In [21]: rng = pd.date_range('1/1/2011', periods=72, freq='H')

In [22]: ts = pd.Series(np.random.randn(len(rng)), index=rng)

In [23]: ts.head()
Out[23]:
2011-01-01 00:00:00    0.796704
2011-01-01 01:00:00   -1.153179
2011-01-01 02:00:00   -1.919475
2011-01-01 03:00:00    0.082413
2011-01-01 04:00:00   -0.397434
Freq: H, dtype: float64

In [24]: ts2 = ts.resample('30T').interpolate()

In [25]: ts2.head()
Out[25]:
2011-01-01 00:00:00    0.796704
2011-01-01 00:30:00   -0.178237
2011-01-01 01:00:00   -1.153179
2011-01-01 01:30:00   -1.536327
2011-01-01 02:00:00   -1.919475
Freq: 30T, dtype: float64

I believe this is what you need.

Edit to add clarifying example

Perhaps it's easier to see what's going on without random data:

In [29]: ts.head()
Out[29]:
2011-01-01 00:00:00    0
2011-01-01 01:00:00    1
2011-01-01 02:00:00    2
2011-01-01 03:00:00    3
2011-01-01 04:00:00    4
Freq: H, dtype: int64

In [30]: ts2 = ts.resample('30T').interpolate()

In [31]: ts2.head()
Out[31]:
2011-01-01 00:00:00    0.0
2011-01-01 00:30:00    0.5
2011-01-01 01:00:00    1.0
2011-01-01 01:30:00    1.5
2011-01-01 02:00:00    2.0
Freq: 30T, dtype: float64
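To make explicit what the interpolation step produces, here is a minimal sketch assuming the same deterministic hourly series as in the example above. The names `half_hour` and `midpoints` are made up for illustration, and the frequency strings are written as '60min' and '30min' only because those spellings are accepted across pandas versions. It shows that each half-hour value is the midpoint of the two neighbouring hourly samples.

```python
import pandas as pd

# Deterministic hourly series 0, 1, 2, ... (assumed to mirror the example above)
rng = pd.date_range("2011-01-01", periods=72, freq="60min")
ts = pd.Series(range(len(rng)), index=rng, dtype="float64")

# Upsample to 30 minutes; the new :30 rows start as NaN and are filled linearly
ts2 = ts.resample("30min").interpolate()

# Keep only the rows stamped at the half hour
half_hour = ts2[ts2.index.minute == 30]

# Midpoint of each hourly sample and the one that follows it
midpoints = ((ts + ts.shift(-1)) / 2).dropna()

print(half_hour.head())
print(midpoints.head())
```

Note that this is the average of two hourly samples, so it matches an hourly mean only when the input is already hourly data.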

Solution 2:

This post is several years old and uses an API that has long been deprecated. Modern pandas provides the resample method, which is easier to use than pandas.TimeGrouper. However, resample only labels each interval by its left or right edge; labelling it at the middle of the interval is not available out of the box.

It is not hard to do, though.

First we fill in the data that we want to resample:

import datetime
import pandas as pd

# 100 samples at 10-minute spacing, starting at midnight
ts_g = [datetime.datetime.fromisoformat('2019-11-20') +
        datetime.timedelta(minutes=10*x) for x in range(0, 100)]
dg = {'ws': range(0, 100), 'wdir': range(0, 100)}

df_g = pd.DataFrame(data=dg, index=ts_g, columns=['ws', 'wdir'])
df_g.head()

The output would be:

                     ws  wdir
2019-11-20 00:00:00   0     0
2019-11-20 00:10:00   1     1
2019-11-20 00:20:00   2     2
2019-11-20 00:30:00   3     3
2019-11-20 00:40:00   4     4

Now we first resample to 30-minute intervals:

grouped_g = df_g.resample('30min')
halfhourly_ws_g = grouped_g['ws'].mean()
halfhourly_ws_g.head()

The output would be:

2019-11-20 00:00:00     1
2019-11-20 00:30:00     4
2019-11-20 01:00:00     7
2019-11-20 01:30:00    10
2019-11-20 02:00:00    13
Freq: 30T, Name: ws, dtype: int64

Finally the trick to get the centered intervals:

hourly_ws_g = halfhourly_ws_g.add(halfhourly_ws_g.shift(1)).div(2)\
                             .loc[halfhourly_ws_g.index.minute % 60 == 30]
hourly_ws_g.head()

This would produce the expected output:

2019-11-20 00:30:00     2.5
2019-11-20 01:30:00     8.5
2019-11-20 02:30:00    14.5
2019-11-20 03:30:00    20.5
2019-11-20 04:30:00    26.5
Freq: 60T, Name: ws, dtype: float64
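As a possible alternative, not part of the original answer, the same centred labels can be sketched by resampling directly to one hour and then moving each label forward by 30 minutes. The names `hourly_left` and `hourly_centered` are hypothetical, and the data is rebuilt here under the assumption that it matches the 10-minute series above.

```python
import datetime
import pandas as pd

# Same synthetic 10-minute data as in this answer (rebuilt so the sketch is self-contained)
ts_g = [datetime.datetime.fromisoformat("2019-11-20") +
        datetime.timedelta(minutes=10 * x) for x in range(100)]
df_g = pd.DataFrame({"ws": range(100), "wdir": range(100)}, index=ts_g)

# Hourly mean; each bin is labelled by its left edge (the default for hourly bins)
hourly_left = df_g["ws"].resample("60min").mean()

# Move every label from the left edge of the bin to its centre
hourly_centered = hourly_left.copy()
hourly_centered.index = hourly_centered.index + pd.Timedelta(minutes=30)

print(hourly_centered.head())
```

For complete hours this agrees with the half-hour trick. For a partial final hour the two can differ, because the hourly resample weights every sample equally while averaging two half-hour means does not.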
