The Railroad Commission of Texas (RRC) recently reported oil and natural gas output data through August 2016. Dean Fantazzini has kindly shared his corrected estimates based on this latest RRC data. His statistical procedure adds up the changes in the RRC data from April 2014 to July 2016 to see how incomplete the data has been in the past, then uses this to estimate the “missing barrels of oil and cubic feet of natural gas” that will be added to the current “incomplete data” over the following 24 months. In the past, the RRC data has been about 99% complete when you look back 24 months from the most recently reported month. Dean estimates the “correction factors” that need to be added to the reported data to get a more reliable estimate of recent output levels.
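To make the idea of a correction factor concrete, here is a minimal Python sketch. It is not Dean's actual code: it assumes the factor for a month of a given age is simply the sum of the average past revisions that a month of that age still has ahead of it. The function names and all numbers are hypothetical, chosen only for illustration.

```python
from statistics import mean


def correction_factors(changes_by_lag):
    """Return the expected total future revision (kb/d) for each data age.

    changes_by_lag[k] is a list of past revisions observed when a month's
    reported output aged from k to k+1 months (hypothetical input layout).
    factors[k] adds up the average revisions from age k onward.
    """
    averages = [mean(changes) for changes in changes_by_lag]
    factors = []
    running = 0.0
    # Walk from the oldest lag back to the newest, accumulating revisions.
    for avg in reversed(averages):
        running += avg
        factors.append(running)
    factors.reverse()
    return factors


def corrected(reported, factors):
    """Add each age's factor to the reported value (reported[0] is newest)."""
    return [r + f for r, f in zip(reported, factors)]


# Hypothetical revision history for data that is 0, 1 and 2 months old.
history = [[300, 320, 280], [150, 170, 130], [60, 40, 50]]
factors = correction_factors(history)

# Hypothetical reported output, newest month first, in kb/d.
print(corrected([2800, 3100, 3250], factors))
```

The newest month gets the largest factor because it still has every later revision ahead of it, which is why the most recent corrected estimates are also the least certain.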
The correction factors for the month of August looked very low compared to the historical correction factors, so I asked Dean to check for a statistical break in the correction factors. Based on Dean’s analysis, there has been no statistical trend in the correction factors in the past, but I wondered if there was now a downward trend due to the digitization of reporting by the RRC.
I will quote Dean’s findings below (from an e-mail):
I checked the time series for each correcting factor -for crude oil only- using unit root tests with a breakpoint, and I found that the correcting factors for the latest 6 months are non-stationary (even at the 1% level), with a break in the constant which took place in February 2016. The previous months (older than 6 months) are instead stationary.
The effect of the ongoing digitalization process seems to (finally) appear in the data. However, many more data will be needed to confirm the break in the data structure: for example, the break in the constant is significant only at the 5% probability level, but not at the 1% level.
Given this evidence, reporting both the corrected data using all the vintage data and the corrected data using the last 3 months (to take the structural break into account) may be a wise thing.
I decided to show the correction based on the last 6 months rather than 3 months because that is where the break occurs, though the difference between 3 months and 6 months is not significant (a difference of 12 kb/d less on average each month). I also show the previous method of using all the data (Jan 2014 to Aug 2016 for oil and April 2014 to Aug 2016 for condensate); this is called “all vintage” in the chart that follows.
For July 2016 the 6 month estimate is 161 kb/d higher than the EIA estimate and the all vintage estimate is 235 kb/d higher than the EIA estimate.
Data for TX C+C from July 2015 to July 2016 is below, all in kb/d. The first column is all vintage, then 6 month, then EIA.
Month     All vintage  6 month  EIA
Jul 2015  3452         3451     3452
Aug 2015  3429         3427     3413
Sep 2015  3436         3429     3415
Oct 2015  3431         3421     3404
Nov 2015  3436         3424     3409
Dec 2015  3398         3383     3348
Jan 2016  3443         3424     3361
Feb 2016  3424         3401     3315
Mar 2016  3408         3382     3295
Apr 2016  3396         3363     3245
May 2016  3375         3333     3193
Jun 2016  3380         3321     3172
Jul 2016  3396         3322     3161
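The gaps quoted for July 2016 can be checked directly from the table. The short Python sketch below uses the table values as given. The month labels assume the 13 rows run consecutively from July 2015 to July 2016, and the helper name `gaps_vs_eia` is made up for this example.

```python
# (month, all vintage, 6 month, EIA), all in kb/d, copied from the table.
ROWS = [
    ("Jul 2015", 3452, 3451, 3452),
    ("Aug 2015", 3429, 3427, 3413),
    ("Sep 2015", 3436, 3429, 3415),
    ("Oct 2015", 3431, 3421, 3404),
    ("Nov 2015", 3436, 3424, 3409),
    ("Dec 2015", 3398, 3383, 3348),
    ("Jan 2016", 3443, 3424, 3361),
    ("Feb 2016", 3424, 3401, 3315),
    ("Mar 2016", 3408, 3382, 3295),
    ("Apr 2016", 3396, 3363, 3245),
    ("May 2016", 3375, 3333, 3193),
    ("Jun 2016", 3380, 3321, 3172),
    ("Jul 2016", 3396, 3322, 3161),
]


def gaps_vs_eia(rows):
    """Map each month to (all vintage minus EIA, 6 month minus EIA) in kb/d."""
    return {month: (av - eia, six - eia) for month, av, six, eia in rows}


gaps = gaps_vs_eia(ROWS)
for month, (all_vintage_gap, six_month_gap) in gaps.items():
    print(f"{month}: all vintage {all_vintage_gap:+d}, 6 month {six_month_gap:+d}")
```

For July 2016 this gives 235 kb/d for the all vintage estimate and 161 kb/d for the 6 month estimate, matching the figures in the text. The gap to the EIA estimate widens steadily over the most recent months.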
Dean also provides data on how his estimates have changed over time. In the chart below I show Dean’s Texas C+C corrected estimates (using all vintage data) from June 2015 to August 2016, where the month is the final data point of the estimate. The most recent estimate is lower than the previous 3 months. In the past the correction factors have bounced up and down by quite a bit, so this could change; the 6 month corrected estimate in particular will be more volatile.
The chart below shows how the correction factors have changed over time. Statistically we see no trend in the correction factors from April 2014 to Feb 2016 (the correction factors are “stationary”); from Feb 2016 to August 2016 we see a downward trend significant at the 5% level.
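As a rough illustration of what "a trend significant at the 5% level" means, the Python sketch below fits a straight line to a short series and computes the t statistic of the slope. This is a plain least-squares trend test, not the unit root test with a breakpoint that Dean used, and both series are hypothetical numbers rather than the actual correction factors.

```python
from math import sqrt


def slope_t_stat(y):
    """Fit y = a + b*t by least squares (t = 0, 1, 2, ...); return (b, t stat of b)."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    residuals = [v - (intercept + slope * t) for t, v in enumerate(y)]
    # Residual variance with n - 2 degrees of freedom (two fitted parameters).
    s2 = sum(r * r for r in residuals) / (n - 2)
    return slope, slope / sqrt(s2 / sxx)


# Two-sided 5% critical value of Student's t with 5 degrees of freedom.
T_CRIT_5PCT_DF5 = 2.571

# Hypothetical seven-month series of correction factors, in kb/d.
falling = [100, 92, 85, 80, 70, 66, 55]
flat = [100, 104, 97, 101, 99, 103, 98]

for name, series in (("falling", falling), ("flat", flat)):
    slope, t_stat = slope_t_stat(series)
    verdict = "significant" if abs(t_stat) > T_CRIT_5PCT_DF5 else "not significant"
    print(f"{name}: slope {slope:.2f} kb/d per month, {verdict} at 5%")
```

With only seven data points, a few noisy months can easily move a result across the 5% threshold, which is why more data is needed before the break can be treated as confirmed.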
The natural gas corrected estimate is compared with the EIA estimate below.