This study explores the circumstances under which traditional statistical methods and machine learning methods perform best. The literature provides no conclusive answer on when each approach performs best; instead, contradictory findings have been reported in support of both. Sample size often plays a role in which approach is recommended, as machine learning methods are said to perform better on large data samples. We performed a simulation study in which we varied several complexity parameters: number of covariates, interactions, interaction depth, regression coefficients, variance of p(x), and formula complexity. Additionally, we examined whether sample size and continuous covariates affected the results by comparing results across different sample sizes and by combining continuous covariates with binary covariates. We evaluated the results using accuracy, sensitivity, and specificity. From 138 models, we identified seven general patterns across sample sizes, covering cases in which (a) a machine learning method performed best, (b) a traditional statistical method performed best, and (c) performance was mixed. We extended the analysis to include more methods from both approaches. For each pattern and performance measure, we selected models. This resulted in 20 median models, in which not all patterns recurred. A similar analysis of three empirical data sets showed similar behavior, although identifying patterns became more challenging. Our findings indicate that the variety within each pattern is too great to conclusively identify which complexity parameters produce a particular pattern, although nuances do exist. Moreover, many similar models are spread across multiple patterns.
The identification of these patterns suggests that their existence might explain the opposing views in the literature. We find that traditional statistical methods outperformed complex machine learning methods in several patterns. Furthermore, we determine that sample size is not the sole determinant of the best approach, as the results include several instances in which traditional statistical methods performed better at larger sample sizes. This adds new insight into how sample size and methods are related.
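As an illustration of the three performance measures named above, the following minimal Python sketch computes accuracy, sensitivity, and specificity from binary class labels. The function name `classification_metrics` and the example labels are hypothetical and are not taken from the study; the sketch assumes that the positive class is coded as 1 and the negative class as 0.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for binary labels.

    Assumes 1 marks the positive class and 0 the negative class.
    """
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("at least one observation is required")

    # Cells of the 2x2 confusion matrix.
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Sensitivity: share of actual positives that were predicted positive.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of actual negatives that were predicted negative.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity


# Hypothetical labels, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
acc, sens, spec = classification_metrics(y_true, y_pred)
```

Reporting sensitivity and specificity alongside accuracy separates performance on the positive class from performance on the negative class, which accuracy alone can hide when the two classes are unevenly represented.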