Avocado (Persea americana) farming in East Africa has expanded in recent years, contributing significantly to economic growth and to the livelihoods of small-scale farmers. However, insects that attack avocado fruits reduce fruit quality and size, causing massive losses. Previous studies have identified the key avocado insect pests, their temporal population patterns, and how landscape vegetation productivity influences their population dynamics. This research analyzed count data on Bactrocera dorsalis and Ceratitis spp. collected at successive time points in an avocado plantation in Thika, Kenya, as part of pest management. The data are overdispersed, because the insects aggregate in their habitat, and serially correlated, because the counts were collected sequentially over time; both features complicate the analysis. We explored variants of generalized linear models (GLMs) that include a sinusoidal component over time, fitted with and without timescale decomposition of the covariates (weather variables). All GLM variants assumed a negative binomial distribution to account for overdispersion. Based on the Akaike information criterion (AIC), GLMs with decomposed covariates had lower AIC values than GLMs without decomposed covariates for both B. dorsalis and Ceratitis spp.; GLMs with a sinusoidal component and decomposed covariates under the negative binomial distribution were therefore the best choice for these data. The contribution of the preceding weekly insect pest counts was statistically significant in all models. The study established that both abiotic and biotic factors drive insect pest infestation.
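The ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The 52-week period, the trailing moving average used as a stand-in for timescale decomposition, the NB2 parameterization (variance = mu + mu^2/theta), and all function names are assumptions made for illustration only.

```python
import math
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean; values well above 1
    suggest overdispersion relative to a Poisson model."""
    return variance(counts) / mean(counts)


def harmonic_terms(t, period=52.0):
    """Sine and cosine regressors for time index t.
    The 52-week period is an assumption (one cycle per year)."""
    angle = 2.0 * math.pi * t / period
    return math.sin(angle), math.cos(angle)


def decompose(series, window=4):
    """Split a covariate into a slow part (trailing moving average) and a
    fast part (deviation from it). A simple stand-in for timescale
    decomposition, not necessarily the method used in the study."""
    slow = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        slow.append(sum(chunk) / len(chunk))
    fast = [x - s for x, s in zip(series, slow)]
    return slow, fast


def nb_logpmf(y, mu, theta):
    """Log-probability of count y under a negative binomial distribution
    with mean mu > 0 and size theta > 0 (variance mu + mu**2 / theta)."""
    return (math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
            + theta * math.log(theta / (theta + mu))
            + y * math.log(mu / (theta + mu)))


def nb_loglik(counts, design, beta, theta):
    """Log-likelihood of a log-link GLM: mu_t = exp(x_t . beta)."""
    total = 0.0
    for y, row in zip(counts, design):
        mu = math.exp(sum(b * x for b, x in zip(beta, row)))
        total += nb_logpmf(y, mu, theta)
    return total


def aic(loglik, n_params):
    """Akaike information criterion; lower values indicate a better
    trade-off between fit and complexity."""
    return 2 * n_params - 2 * loglik


if __name__ == "__main__":
    # Made-up counts, for illustration only (not data from the study).
    counts = [0, 3, 1, 12, 30, 7, 2, 0, 25, 4]
    print("variance/mean:", dispersion_ratio(counts))
    # Design row for week t: intercept, sin, cos, log(1 + previous count).
    design = [(1.0, *harmonic_terms(t), math.log1p(counts[t - 1]))
              for t in range(1, len(counts))]
    beta = (1.5, 0.2, -0.1, 0.3)  # arbitrary values, not fitted estimates
    ll = nb_loglik(counts[1:], design, beta, theta=1.0)
    print("AIC at these parameter values:", aic(ll, n_params=len(beta) + 1))
```

In practice the coefficients and theta would come from a maximum-likelihood fitting routine, and AIC values would be compared between models fitted with and without the decomposed covariates. The sketch only shows how each quantity is defined.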
Published in | International Journal of Data Science and Analysis (Volume 8, Issue 1) |
DOI | 10.11648/j.ijdsa.20220801.11 |
Page(s) | 1-10 |
Creative Commons |
This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited. |
Copyright |
Copyright © The Author(s), 2022. Published by Science Publishing Group |
Keywords | Overdispersion, Negative Binomial Distribution, Sinusoidal Component, Time Series Count Data |
APA Style
Eric Ali Ibrahim, Daisy Salifu, Samuel Musili Mwalili, Thomas Dubois, Henri Edouard Zefack Tonnang. (2022). Analysis of Overdispersed Insect Count Data from an Avocado Plantation in Thika, Kenya. International Journal of Data Science and Analysis, 8(1), 1-10. https://doi.org/10.11648/j.ijdsa.20220801.11
ACS Style
Eric Ali Ibrahim; Daisy Salifu; Samuel Musili Mwalili; Thomas Dubois; Henri Edouard Zefack Tonnang. Analysis of Overdispersed Insect Count Data from an Avocado Plantation in Thika, Kenya. Int. J. Data Sci. Anal. 2022, 8(1), 1-10. doi: 10.11648/j.ijdsa.20220801.11
AMA Style
Eric Ali Ibrahim, Daisy Salifu, Samuel Musili Mwalili, Thomas Dubois, Henri Edouard Zefack Tonnang. Analysis of Overdispersed Insect Count Data from an Avocado Plantation in Thika, Kenya. Int J Data Sci Anal. 2022;8(1):1-10. doi: 10.11648/j.ijdsa.20220801.11
@article{10.11648/j.ijdsa.20220801.11, author = {Eric Ali Ibrahim and Daisy Salifu and Samuel Musili Mwalili and Thomas Dubois and Henri Edouard Zefack Tonnang}, title = {Analysis of Overdispersed Insect Count Data from an Avocado Plantation in Thika, Kenya}, journal = {International Journal of Data Science and Analysis}, volume = {8}, number = {1}, pages = {1-10}, doi = {10.11648/j.ijdsa.20220801.11}, url = {https://doi.org/10.11648/j.ijdsa.20220801.11}, eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ijdsa.20220801.11}, year = {2022} }
TY - JOUR T1 - Analysis of Overdispersed Insect Count Data from an Avocado Plantation in Thika, Kenya AU - Eric Ali Ibrahim AU - Daisy Salifu AU - Samuel Musili Mwalili AU - Thomas Dubois AU - Henri Edouard Zefack Tonnang Y1 - 2022/02/16 PY - 2022 N1 - https://doi.org/10.11648/j.ijdsa.20220801.11 DO - 10.11648/j.ijdsa.20220801.11 T2 - International Journal of Data Science and Analysis JF - International Journal of Data Science and Analysis JO - International Journal of Data Science and Analysis SP - 1 EP - 10 PB - Science Publishing Group SN - 2575-1891 UR - https://doi.org/10.11648/j.ijdsa.20220801.11 VL - 8 IS - 1 ER -