Abstract:
In the oil and gas industry, accurate prediction of hydrocarbon production remains a major challenge. Although several empirical and analytical decline analysis approaches have been developed to address it, their underlying assumptions limit their reliability across reservoir types and flow conditions, and no single decline analysis method can handle all kinds of data and reservoirs. Reservoir simulation can also be used, but it demands extensive data, accurate geo-modeling of the reservoir, and a long production-pressure history; it is also computationally expensive and time-consuming. In this context, data-driven approaches offer a promising avenue for a more robust solution.
This study explores hydrocarbon production forecasting using empirical, analytical, and machine learning methods. The analysis used three distinct datasets: one from the Volve oil field, a conventional oil field with production, time, and downhole pressure data collected over 95 months; one from the Marcellus shale gas field, an unconventional gas reservoir with production and time data over 130 months; and one from a conventional gas field in Bangladesh, consisting of production rate and time data over 100 months. The study involves three methodologies: empirical decline curve analysis, using Arps’ Decline Curve Analysis for the conventional oil and gas fields and the Duong model for the Marcellus shale gas field; analytical modeling using the Topaze module of the KAPPA Workstation, specifically including Blasingame type curve analysis; and machine learning and deep learning models, including Autoregressive Linear Regression (LR), Support Vector Regression (SVR), and Long Short-Term Memory (LSTM) networks.
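For readers unfamiliar with the two empirical models named above, the following is a minimal sketch of their standard rate equations as commonly stated in the literature. It is not the fitting procedure used in this study. The function names and all parameter values in the usage note are hypothetical and chosen only for illustration.

```python
import math


def arps_rate(t, qi, di, b):
    """Arps decline rate at time t.

    qi: initial rate, di: initial decline rate (1/time), b: decline exponent.
    b == 0 gives exponential decline; 0 < b < 1 hyperbolic; b == 1 harmonic.
    """
    if b == 0:
        return qi * math.exp(-di * t)
    return qi / (1.0 + b * di * t) ** (1.0 / b)


def duong_rate(t, q1, a, m):
    """Duong decline rate at time t > 0, with q1 the rate at t = 1.

    a and m are the Duong intercept and slope parameters; m must not equal 1.
    """
    if t <= 0:
        raise ValueError("Duong model is defined for t > 0")
    return q1 * t ** (-m) * math.exp(a / (1.0 - m) * (t ** (1.0 - m) - 1.0))


# Hypothetical parameter values, for illustration only (not fitted to any field).
exponential_q = arps_rate(10, qi=1000.0, di=0.1, b=0)
hyperbolic_q = arps_rate(10, qi=1000.0, di=0.1, b=0.5)
duong_q_at_1 = duong_rate(1, q1=500.0, a=1.2, m=1.1)
```

In practice the parameters would be estimated by regression against the observed rate history before the curve is extrapolated.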
The empirical analysis yielded an estimated ultimate recovery (EUR) of 3.99 million Sm3 for the Volve oil field at the end of 120 months, 60.176 Bscf for the conventional gas field after 124 months of production, and 3.03 Bscf for the Marcellus shale gas field after 240 months of production. Analytical modeling of the Volve oil field gave a forecast EUR of 3.88 million Sm3.
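One simple way to read an EUR quoted at a fixed horizon is as the production already recorded plus the forecast production out to that horizon. The sketch below assumes monthly volumes and a hypothetical helper name, `eur_from_forecast`. The numbers are invented for illustration and are not taken from any of the three datasets, and the study's own EUR calculation may differ.

```python
def eur_from_forecast(historical_volumes, forecast_volume, horizon_months):
    """Sum recorded monthly volumes and forecast volumes up to a horizon.

    historical_volumes: monthly volumes for months 1..n.
    forecast_volume: function mapping a month index to a forecast volume.
    horizon_months: last month included in the EUR.
    """
    produced = sum(historical_volumes)
    first_forecast_month = len(historical_volumes) + 1
    forecast = sum(
        forecast_volume(month)
        for month in range(first_forecast_month, horizon_months + 1)
    )
    return produced + forecast


# Invented example: three months of history, then a 10 % per month decline.
history = [100.0, 90.0, 81.0]
eur = eur_from_forecast(history, lambda month: 100.0 * 0.9 ** (month - 1), 5)
```

Any forecasting model, empirical or data-driven, can be passed in as `forecast_volume`, which makes EUR values from different methods directly comparable at the same horizon.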
The machine learning models achieved competitive forecasting accuracy. For the Volve oil field, the LR, SVR, and LSTM models achieved Root Mean Squared Error (RMSE) values of 481.765 Sm3/month, 419.049 Sm3/month, and 361.072 Sm3/month, respectively, on the test dataset, with corresponding EUR values of 4.009 million Sm3, 4.0219 million Sm3, and 3.956 million Sm3. For the conventional gas field, the LR, SVR, and LSTM models attained RMSE values of 8.256 MMscf/month, 8.291 MMscf/month, and 17.034 MMscf/month, respectively, on the unseen data, with EUR values of 61.751 Bscf, 61.754 Bscf, and 61.127 Bscf. For the Marcellus shale gas field, the LR, SVR, and LSTM models achieved RMSE values of 126.7783 Mscf/month, 127.3119 Mscf/month, and 237.0362 Mscf/month, respectively, on the test dataset, with EUR values of 3.1341 Bscf, 3.1357 Bscf, and 2.957 Bscf. These EUR values are very close to those obtained from the empirical and analytical methods. Relative to the mean and standard deviation of the production data, the errors on the oil and shale gas datasets are small. On the conventional gas data, however, both the empirical and the machine learning models produced larger errors because of the nature of the dataset. To evaluate the models, the study also considered the coefficient of determination (R2 score) and the Relative RMSE (RRMSE). In terms of R2, the SVR model outperformed the other two on the unseen Volve oil field data, with a score greater than 0.98. For the Marcellus shale dataset, LR and SVR performed almost identically, achieving R2 scores of 0.9154 and 0.9149, respectively. On the gas field dataset, the LR model explained the variance in the unseen data better than the other models. LSTM performed comparatively poorly on both the gas field and Marcellus shale datasets in terms of both RMSE and R2.
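The three evaluation metrics mentioned above can be sketched as follows. RMSE and R2 follow their standard definitions. RRMSE has more than one definition in the literature, and the version here (RMSE divided by the mean of the observed values) is an assumption, since the exact normalization used in the study is not stated. The sample values are invented for illustration.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rrmse(observed, predicted):
    """RMSE divided by the mean observed value (assumed normalization)."""
    mean_observed = sum(observed) / len(observed)
    return rmse(observed, predicted) / mean_observed


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Invented values, for illustration only.
observed = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 8.0]
scores = (rmse(observed, predicted), rrmse(observed, predicted),
          r2_score(observed, predicted))
```

Because RMSE carries the units of the production rate, a scale-free measure such as RRMSE or R2 is needed to compare errors across datasets reported in Sm3, MMscf, and Mscf.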
The significance of this work lies in its direct comparison of empirical, analytical, and machine learning techniques on diverse datasets, which highlights their respective strengths and limitations. Through this research, a more generalized and robust tool for hydrocarbon production forecasting has been developed.