Fraud Prediction in Financial Statements through Comparative Analysis of Data Mining Methods

Document Type: Original Article

Authors

1 Department of Accounting, Zanjan Branch, Islamic Azad University, Zanjan, Iran.

2 Department of Accounting, Zanjan Branch, Islamic Azad University, Zanjan, Iran.

3 Department of Computer Engineering, Ardabil Branch, Islamic Azad University, Ardabil, Iran.

10.30495/ijfma.2023.71866.1981

Abstract

Fraud increases business risks and costs, erodes investor trust, and calls into question the professional competence and credibility of the accounting profession. This study therefore applies data mining methods to predict fraud risk in companies listed on the Tehran Stock Exchange over the 2014–2021 period. To this end, 96 financial ratios were collected from a review of the theoretical foundations and the research literature. The proposed classifiers, including the k-nearest neighbors algorithm, a Bayesian network, a support vector machine, and a bagging classifier, were adopted for fraud prediction. With the full set of ratios, all classifiers performed relatively poorly. The particle swarm optimization algorithm was therefore used to reduce the number of financial ratios and improve the classifiers. This step extracted 11 effective financial ratios with a precision of 72.92% and a prediction accuracy validity of 84.82%. The extracted ratios were then re-evaluated with the proposed classifiers, and all of the methods improved. The bagging classifier yielded the highest precision and accuracy, 84.28% and 76.85%, respectively, and the lowest prediction error, 23.15%. It was also 87% efficient in fraud prediction.
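The feature-reduction step described in the abstract can be illustrated with a minimal sketch of wrapper-style feature selection using binary particle swarm optimization. The abstract does not give the PSO variant, its parameters, the fitness function, or the data, so everything below is an assumption made for illustration: the synthetic data from the hypothetical `make_toy_data`, the leave-one-out k-nearest-neighbors accuracy used as fitness, the sigmoid transfer rule, and the values of `w`, `c1`, and `c2`. It is not the authors' implementation.

```python
import math
import random


def make_toy_data(n=40, n_informative=2, n_noise=6, seed=1):
    """Synthetic stand-in for financial ratios (assumption, not the study's data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # 1 = "fraud", 0 = "non-fraud" in this toy setting
        row = [rng.gauss(2.0 * label, 1.0) for _ in range(n_informative)]
        row += [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(row)
        y.append(label)
    return X, y


def knn_loo_accuracy(X, y, mask, k=3):
    """Leave-one-out accuracy of k-NN restricted to features where mask == 1."""
    cols = [j for j, m in enumerate(mask) if m]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        dists = sorted(
            (sum((xi[j] - xo[j]) ** 2 for j in cols), y[o])
            for o, xo in enumerate(X) if o != i
        )
        votes = sum(label for _, label in dists[:k])
        pred = 1 if votes * 2 > k else 0  # majority vote among k neighbors
        correct += int(pred == y[i])
    return correct / len(X)


def binary_pso(X, y, n_particles=8, n_iters=15, seed=0):
    """Search for a feature mask that maximizes knn_loo_accuracy."""
    rng = random.Random(seed)
    d = len(X[0])
    # The first particle keeps every feature, so the result is never worse
    # than the all-features baseline.
    pos = [[1] * d] + [[rng.randint(0, 1) for _ in range(d)]
                       for _ in range(n_particles - 1)]
    vel = [[0.0] * d for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [knn_loo_accuracy(X, y, p) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration weights (assumed)
    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(d):
                v = (w * vel[i][j]
                     + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                     + c2 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-6.0, min(6.0, v))  # clamp the velocity
                # Sigmoid transfer: velocity becomes the probability of bit = 1.
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            fit = knn_loo_accuracy(X, y, pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


if __name__ == "__main__":
    X, y = make_toy_data()
    mask, fitness = binary_pso(X, y)
    print("selected feature mask:", mask)
    print("leave-one-out accuracy:", fitness)
```

In this sketch each particle is a bit mask over the candidate ratios, and the fitness is the accuracy of a classifier trained only on the selected ratios. A real study would typically evaluate the selected subset on held-out data rather than reuse the fitness score as the reported accuracy.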
