How to calculate a 3-day centered rolling average in R when the number of measurements per date varies?
I have a dataset in R that contains methane measurements taken on different dates from different cows. The dataset is structured as follows:
ID: An identifier for each cow (integer).
measure_date: The date when the measurements were taken (Date).
ch4: The methane measurement values (numeric).
Each date can have multiple measurements, and some dates have no measurements at all, so they do not appear in the dataset. I need to calculate a 3-day centered rolling average of the methane measurements (ch4) for each animal (ID), using all available measurements within the date window.
Key Requirements: Multiple Measurements Per Date: The number of measurements per date varies, depending on how many times the animal went to the feed bin, and some days have no measurements. A date with no measurements does not appear as a row in the data, but I can easily add those rows if necessary.
3-Day Centered Rolling Average: The rolling average should be calculated over a 3-day window centered on each date. For example, the rolling average for January 2nd should consider all measurements from January 1st, 2nd, and 3rd.
Group by animal: The rolling average should be calculated separately for each animal (ID), so that the calculations for one animal do not affect another.
Example Dataset: Here’s a small sample of what the dataset might look like for one animal (ID):
measure_date <- as.Date(c("2023-01-01", "2023-01-01", "2023-01-01", "2023-01-01", "2023-01-02", "2023-01-03", "2023-01-03", "2023-01-05"))
ch4 <- c(200, 250, 233, 256, 270, 256, 290, 299)
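To make the expected result concrete, here is a minimal base R sketch that applies the definition directly to the sample above: for each calendar day, it averages every measurement taken on that day, the day before, or the day after. The names all_dates, roll3, and expected are only illustrative, and returning NA for a day whose window contains no measurements is an assumption about what the output should be.

```r
# Example data for a single animal (values copied from the sample above)
measure_date <- as.Date(c("2023-01-01", "2023-01-01", "2023-01-01", "2023-01-01",
                          "2023-01-02", "2023-01-03", "2023-01-03", "2023-01-05"))
ch4 <- c(200, 250, 233, 256, 270, 256, 290, 299)

# Every calendar day in the observed range, including days without measurements
all_dates <- seq(min(measure_date), max(measure_date), by = "day")

# For each day d, average every measurement taken on d - 1, d, or d + 1
roll3 <- vapply(seq_along(all_dates), function(i) {
  d <- all_dates[i]
  in_window <- abs(as.numeric(measure_date - d)) <= 1
  # Assumption: a window with no measurements yields NA
  if (any(in_window)) mean(ch4[in_window]) else NA_real_
}, numeric(1))

expected <- data.frame(measure_date = all_dates, ch4_roll3 = roll3)
print(expected)
```

For January 2nd this pools seven measurements (four from January 1st, one from January 2nd, two from January 3rd), giving 1755 / 7, roughly 250.71. January 4th has no row in the data but still gets a value, (256 + 290 + 299) / 3, from its neighbours. This loop is only meant to pin down the definition; it scans all measurements for every day, so it would not scale to a very large dataset.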
Desired Output: For each animal (ID), I want a result that includes: the full sequence of dates (including those with no measurements) and a new column with the 3-day centered rolling average of the methane measurements for each date.
Summary: To summarize, I need an efficient way to compute a 3-day centered rolling average of methane measurements for each animal, considering varying numbers of measurements per date and including dates with no measurements in the sequence.
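One possible direction, sketched here in base R under stated assumptions rather than as a tested solution for the real data: collapse each animal's measurements to a daily sum and a daily count, fill in the missing dates with zeros, and divide the 3-day window sum by the 3-day window count. That weights every individual measurement equally regardless of how many fall on each date. The helper name roll3_one_animal, the two-animal toy data frame df, and the choice of NA for empty windows are all hypothetical; the sketch also assumes ch4 has no missing values.

```r
# Hypothetical helper: 3-day centered mean for ONE animal, built from
# daily sums and daily counts so unequal counts per date are handled.
roll3_one_animal <- function(dates, values) {
  all_dates <- seq(min(dates), max(dates), by = "day")
  n_days <- length(all_dates)

  # Position of each measurement's day within all_dates (1-based)
  idx <- as.integer(dates) - as.integer(min(dates)) + 1L

  # Number of measurements and sum of ch4 per calendar day (0 on empty days)
  day_n <- tabulate(idx, nbins = n_days)
  day_sum <- as.vector(tapply(values, factor(idx, levels = seq_len(n_days)), sum))
  day_sum[day_n == 0] <- 0

  # Centered 3-day window total: previous day + current day + next day
  win <- function(x) c(0, head(x, -1)) + x + c(tail(x, -1), 0)
  win_sum <- win(day_sum)
  win_n <- win(day_n)

  data.frame(measure_date = all_dates,
             n_day = day_n,
             # Assumption: windows with no measurements yield NA
             ch4_roll3 = ifelse(win_n > 0, win_sum / win_n, NA_real_))
}

# Toy data: animal 1 is the sample from the question, animal 2 is made up
df <- data.frame(
  ID = c(rep(1L, 8), rep(2L, 3)),
  measure_date = as.Date(c("2023-01-01", "2023-01-01", "2023-01-01", "2023-01-01",
                           "2023-01-02", "2023-01-03", "2023-01-03", "2023-01-05",
                           "2023-01-02", "2023-01-02", "2023-01-04")),
  ch4 = c(200, 250, 233, 256, 270, 256, 290, 299, 100, 300, 250)
)

# Apply the helper separately to each animal, then stack the results
by_animal <- split(df, df$ID)
result <- do.call(rbind, lapply(names(by_animal), function(id) {
  d <- by_animal[[id]]
  data.frame(ID = d$ID[1], roll3_one_animal(d$measure_date, d$ch4))
}))
print(result)
```

Each animal gets its own date sequence (from its first to its last measurement date), so windows never mix animals. All the work inside the helper is vectorized, but on a very large dataset the split and rbind steps may still be the bottleneck, and a grouped data.table or dplyr version of the same sum-and-count idea might be worth trying.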
None of the methods I have seen or tried so far can handle the differing number of measurements on each measure_date. The dataset is also very large.