Rady, E. - H. A., M. R. Abonazel, and M. H. Metawe'e, "A Comparison Study of Goodness of Fit Tests of Logistic Regression in R: Simulation and Application to Breast Cancer Data", Academic Journal of Applied Mathematical Sciences, vol. 7, issue 1, pp. 50-59, 2021.

Goodness of fit (GOF) tests for logistic regression assess how well the model suits the data. The null hypothesis of every GOF test is that the model fits. R, as a free software environment, offers many GOF tests across different packages. A Monte Carlo simulation was conducted to study two situations: first, the ability of each test, under its default settings, to accept the null hypothesis when the model truly fits; second, the power of these tests when the assumption of a sufficient linear combination of the explanatory variables is violated (by omitting a linear covariate term, a quadratic term, or an interaction term). We also checked whether the same test gives the same results in different R packages. Because the sample size is expected to affect the simulation results, the pattern of change in GOF test results was estimated under different sample sizes as well as different model settings. All tests accepted the null hypothesis (in more than 95% of simulation trials) when the model truly fit, except the modified Hosmer-Lemeshow test in the "LogisticDx" package under all model settings and Osius and Rojek’s (OsRo) test when the true model had an interaction term between binary and categorical covariates. In addition, the le Cessie-van Houwelingen-Copas-Hosmer unweighted sum of squares (CHCH) test gave unexpectedly different results in different packages. Concerning the power study, all tests had very low power when the departure was a missing covariate. Generally, Stukel's test (package "LogisticDx") and the CHCH test (package "rms") reached a power greater than 80% in detecting a missing quadratic term at smaller sample sizes, while the OsRo test (package "LogisticDx") was better at detecting a missing interaction term. Besides the simulation study, we evaluated the performance of the GOF tests using the breast cancer dataset.

Youssef, A. H., M. R. Abonazel, and O. A. Shalaby, "Determinants of Per Capita Personal Income in U.S. States: Spatial Fixed Effects Panel Data Modeling", Journal of Advanced Research in Applied Mathematics and Statistics, vol. 5, issue 1, pp. 1-13, 2020.

Over the last decades, Per Capita Personal Income (PCPI) has been a common measure of the effectiveness of economic development policy. Therefore, this paper investigates the determinants of personal income by using spatial panel data models for 48 U.S. states during the period from 2009 to 2017. We utilize the following three models: the spatial autoregressive (SAR) model, the spatial error model (SEM), and the spatial autoregressive combined (SAC) model, with individual (or spatial) fixed effects, according to three known methods for constructing spatial weights matrices: binary contiguity, inverse distance, and Gaussian transformation. Additionally, we pay attention to the direct and indirect effects estimates of the explanatory variables for the SAR, SEM, and SAC models. The second objective of this paper is to show how to select the appropriate model to fit our data.
The results indicate that the three spatial weights matrices used provide the same result based on goodness of fit criteria, and the SAC model is the most appropriate of the models presented. However, the SAC model with the spatial weights matrix based on inverse distance is better than the other models used. Also, the results indicate that the percentage of individuals with a graduate or professional degree, real Gross Domestic Product (GDP) per capita, and the number of nonfarm jobs have a positive impact on the PCPI, while the percentage of individuals without a degree or bachelor’s degree has a negative impact on the PCPI.

Abonazel, M. R., "Handling Outliers and Missing Data in Regression Models Using R: Simulation Examples", Academic Journal of Applied Mathematical Sciences, vol. 6, issue 8, pp. 187-203, 2020.

This paper reviews two important problems in regression analysis (outliers and missing data), as well as some methods for handling them. Moreover, two applications are introduced to understand and study these methods through R code. Practical evidence is provided to help researchers deal with these problems in regression modeling with R. Finally, we created a Monte Carlo simulation study to compare different methods of handling missing data in the regression model. Simulation results indicate that, under our simulation factors, the k-nearest neighbors method is the best method to estimate the missing values in regression models.
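
To make the k-nearest neighbors idea concrete, here is a minimal sketch in plain Python rather than the paper's R code. It assumes, purely for illustration, that only one column contains missing values (marked `None`), that all other columns are fully observed, and that unscaled Euclidean distance is an acceptable similarity measure; the function name `knn_impute` is hypothetical.

```python
import math


def knn_impute(rows, target, k=3):
    """Fill None entries in column `target` with the mean of the k nearest complete rows.

    rows: list of lists of numbers; distance is Euclidean over the other columns.
    Returns a new list of rows and leaves the input unchanged.
    """
    complete = [r for r in rows if r[target] is not None]
    if len(complete) < k:
        raise ValueError("not enough complete rows for the requested k")

    def distance(a, b):
        # Compare rows on every column except the one being imputed.
        return math.sqrt(sum((a[j] - b[j]) ** 2
                             for j in range(len(a)) if j != target))

    filled = []
    for r in rows:
        new_row = list(r)
        if r[target] is None:
            nearest = sorted(complete, key=lambda c: distance(r, c))[:k]
            new_row[target] = sum(c[target] for c in nearest) / k
        filled.append(new_row)
    return filled
```

In practice the covariates would usually be standardized first so that no single column dominates the distance; that step is omitted here to keep the sketch short.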

Abonazel, M., and N. Elnabawy, "Using the ARDL bound testing approach to study the inflation rate in Egypt", Economic consultant, vol. 31, issue 3, pp. 24-41, 2020.

According to economic theory, a change in any economic variable may affect another economic variable over time; these effects are not only instantaneous but also extend over future periods. The autoregressive distributed lag (ARDL) model has been used for decades to study the relationship between variables using a single-equation time series. The ARDL model is one of the most general dynamic unrestricted models in the econometric literature. In this model, the dependent variable is expressed in terms of the lagged and current values of the independent variables and its own lagged values.
This paper studies the dynamic causal relationships between the inflation rate, foreign exchange rate, money supply, and gross domestic product (GDP) in Egypt during the period 2005:Q1 to 2018:Q2. Using the bounds testing approach to cointegration and the error correction model, developed within an ARDL model, we investigate whether a long-run equilibrium relationship exists between the inflation rate and three determinants (foreign exchange rate, money supply, and GDP). The results indicate that the exchange rate and the growth in money supply have significant effects on the inflation rate in Egypt, while real GDP has no significant effect on the inflation rate.
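
For reference, the verbal description of the ARDL model above corresponds to the standard textbook form below, written for a single regressor. The notation is an assumption, not taken from the paper: y_t is the dependent variable, x_t the regressor, p and q the lag orders, and ε_t the error term. With several regressors, as in the inflation application, each one contributes its own distributed-lag sum.

```latex
y_t = c + \sum_{i=1}^{p} \phi_i \, y_{t-i} + \sum_{j=0}^{q} \beta_j \, x_{t-j} + \varepsilon_t
```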

Abonazel, M. R., and O. M. Saber, "A Comparative Study of Robust Estimators for Poisson Regression Model with Outliers", Journal of Statistics Applications and Probability, vol. 9, issue 2, pp. 279-286, 2020.

The present paper considers the Poisson regression model when the dataset contains outliers. A Monte Carlo simulation study was conducted to compare the robust (Mallows quasi-likelihood, weighted maximum likelihood) estimators with the nonrobust (maximum likelihood) estimator of this model in the presence of outliers. The simulation results showed that the robust estimators perform better than the maximum likelihood estimator, and that the weighted maximum likelihood estimator is more efficient than the Mallows quasi-likelihood estimator.

Youssef, A. H., M. R. Abonazel, and E. G. Ahmed, "Estimating the Number of Patents in the World Using Count Panel Data Models", Asian Journal of Probability and Statistics, vol. 6, issue 4, pp. 24-33, 2020.

In this paper, we review some estimators of count regression (Poisson and negative binomial) models in panel data modeling. These estimators depend on the type of panel data model (the model with fixed or random effects). Moreover, we study and compare the performance of these estimators based on a real dataset application. In our application, we study the effect of some economic variables on the number of patents for seventeen high-income countries over the period from 2005 to 2016. The results indicate that the negative binomial model with fixed effects is the better and more suitable model for the data, and that the important (statistically significant) variables that affect the number of patents in high-income countries are research and development (R&D) expenditures and gross domestic product (GDP) per capita.

Abonazel, M. R., and A. A. - E. Gad, "Robust partial residuals estimation in semiparametric partially linear model", Communications in Statistics - Simulation and Computation, vol. 49, issue 5: Taylor & Francis, pp. 1223-1236, 2020.

This paper presents a robust version of the partial residuals technique for estimating the parametric and nonparametric components of the semiparametric partially linear model. The robust estimate of the parametric component is constructed by using M-estimation after eliminating the effect of the nonparametric component on both the response and the covariates, based on the pseudo data. Finally, the nonparametric component is estimated robustly by using the residuals from the M-estimation of the parametric component. Simulation studies and a real data analysis illustrate that the proposed estimator performs better than the existing estimators when the dataset contains outliers or the errors have heavy tails.

Abonazel, M. R., N. Helmy, and A. Azazy, "The Performance of Speckman Estimation for Partially Linear Model using Kernel and Spline Smoothing Approaches", International Journal of Mathematical Archive, vol. 10, issue 6, pp. 10-18, 2019.

The Speckman method is commonly used for estimating the partially linear model (PLM). This method uses the kernel approach to estimate the nonparametric part of the PLM. In this paper, we suggest using the spline approach instead of the kernel approach. We then present a comparative study of the two estimators based on the two smoothing (kernel and spline) approaches. A simulation study was conducted to evaluate the performance of these estimators. The results of our study confirmed that the spline smoothing approach was the best.

Abonazel, M. R., and R. A. Farghali, "Liu-Type Multinomial Logistic Estimator", Sankhya B, vol. 81, issue 2, pp. 203-225, Sep, 2019.

Multicollinearity in multinomial logistic regression negatively affects the variance of the maximum likelihood estimator. This leads to inflated confidence intervals, and theoretically important variables become insignificant in hypothesis tests. In this paper, a Liu-type estimator is proposed that has a smaller total mean squared error than the maximum likelihood estimator. The proposed estimator is a general estimator that includes other biased estimators, such as the Liu estimator and the ridge estimator, as special cases. Simulation studies and an application are given to evaluate the performance of our estimator. The results indicate that the proposed estimator is more efficient and reliable than the conventional estimators.

Abonazel, M. R., "Advanced Statistical Techniques Using R: Outliers and Missing Data", Annual Conference on Statistics, Computer Sciences and Operations Research, Faculty of Graduate Studies for Statistical Research, Cairo University, 2019.

This paper reviews two important problems in regression analysis (outliers and missing data), as well as some methods for handling them using R. Moreover, two R applications are introduced to illustrate these methods through R code. Finally, we created a simple simulation study to compare different methods of handling missing data; this serves as an example of how to write R code to perform Monte Carlo simulation studies.