Software Defect Prediction Using Sampling Techniques and Feature Selection on Bayesian Network

Sukmawati Anggraeni Putri

Abstract


Software defect prediction has become an important part of software quality testing. This study offers software practitioners an alternative way to prioritize which software modules to test, thereby reducing the cost and time of quality testing. From the outset, researchers in software defect prediction have run their experiments on the public NASA MDP datasets. These datasets, however, have two shortcomings: noisy attributes and class imbalance. Attribute noise can be addressed with feature selection algorithms such as Chi Square and Information Gain, while class imbalance can be addressed with sampling techniques such as RUS (Random Undersampling) and SMOTE (Synthetic Minority Over-sampling Technique). This study therefore integrates the sampling techniques (RUS and SMOTE) with an attribute selection algorithm (Information Gain) and applies the combination to a Bayesian Network learner. According to Lessmann et al. (2008), the Bayesian Network is a statistical classifier that performs well on classification tasks. Experiments on four NASA MDP datasets show that the SMOTE + IG model raises the accuracy of the Bayesian Network classifier to an average of 0.912 across the four datasets.
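The abstract names three preprocessing building blocks (RUS, SMOTE, and Information Gain) without giving implementation details. The following is a minimal sketch of how each one could work, written in plain Python under stated assumptions: the function names, the toy two-metric data, and the binarized feature are all hypothetical and are not taken from the study. The Bayesian Network classifier itself is not implemented here.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG = H(class) - H(class | feature), for a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def random_undersample(rows, labels, rng):
    """RUS: randomly keep only as many rows per class as the smallest class has."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = min(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        for row in rng.sample(members, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def smote(minority_rows, n_new, k, rng):
    """SMOTE: build n_new synthetic rows by interpolating between a minority
    row and one of its k nearest minority neighbours (needs >= 2 rows)."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority_rows)
        neighbours = sorted(
            (r for r in minority_rows if r is not base),
            key=lambda r: math.dist(r, base),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy data: two code metrics per module, label 1 = defective.
rng = random.Random(0)  # fixed seed so the toy run is repeatable
rows = [[1.0, 10.0], [1.2, 11.0], [0.9, 9.5], [1.1, 10.5],
        [1.0, 9.0], [1.3, 12.0], [5.0, 50.0], [5.5, 52.0]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

# Option A: undersample the majority class.
rus_rows, rus_labels = random_undersample(rows, labels, rng)

# Option B: oversample the minority class with synthetic rows.
minority = [r for r, y in zip(rows, labels) if y == 1]
synthetic = smote(minority, n_new=4, k=1, rng=rng)
smote_rows = rows + synthetic
smote_labels = labels + [1] * len(synthetic)

# Score a binarized version of the first metric on the balanced data.
size_is_large = [r[0] > 3.0 for r in smote_rows]
ig_size = information_gain(size_is_large, smote_labels)
```

In a pipeline like the one described, attributes would be ranked by their Information Gain score on the resampled training data, and only the top-ranked ones would be passed to the classifier. Resampling only the training folds, not the test fold, avoids leaking synthetic rows into the evaluation.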

Keywords


Software Defects, Sampling Techniques, Feature Selection Algorithms, Bayesian Network Algorithm

References


Catal, C. (2011). Software fault prediction: A literature review and current trends. Expert Systems with Applications, 38(4), 4626–4636.

Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE: Synthetic Minority Over-sampling Technique. Journal of Artificial Intelligence Research, 16, 321–357.

de Carvalho, A. B., Pozo, A., & Vergilio, S. R. (2010). A symbolic fault-prediction model based on multiobjective particle swarm optimization. Journal of Systems and Software, 83(5), 868–882.

Gao, K., & Khoshgoftaar, T. M. (2011). Software Defect Prediction for High-Dimensional and Class-Imbalanced Data. Proceedings of the 23rd International Conference on Software Engineering & Knowledge Engineering, (2).

Hall, T., Beecham, S., Bowes, D., Gray, D., & Counsell, S. (2010). A Systematic Literature Review on Fault Prediction Performance in Software Engineering. IEEE Transactions on Knowledge and Data Engineering, 38(6), 1276–1304.

Kabir, M., & Murase, K. (2012). A new hybrid ant colony optimization algorithm for feature selection. Expert Systems with Applications, 39(3), 3747–3763. http://doi.org/10.1016/j.eswa.2011.09.073

Khoshgoftaar, T. M., & Gao, K. (2009). Feature Selection with Imbalanced Data for Software Defect Prediction. 2009 International Conference on Machine Learning and Applications, 235–240. http://doi.org/10.1109/ICMLA.2009.18

Kohavi, R. (1995). A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection. Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI), 1137–1143.

Lessmann, S., Baesens, B., Mues, C., & Pietsch, S. (2008). Benchmarking Classification Models for Software Defect Prediction: A Proposed Framework and Novel Findings. IEEE Transactions on Software Engineering, 34(4), 485–496.

Ling, C. X. (2003). Using AUC and Accuracy in Evaluating Learning Algorithms, 1–31.

Liu, Y., Yu, X., Huang, J. X., & An, A. (2011). Combining integrated sampling with SVM ensembles for learning from imbalanced datasets. Information Processing & Management, 47(4), 617–631. http://doi.org/10.1016/j.ipm.2010.11.007

Menzies, T., Milton, Z., Turhan, B., Cukic, B., Jiang, Y., & Bener, A. (2010). Defect prediction from static code features: current results, limitations, new approaches. Automated Software Engineering, 17(4), 375–407. http://doi.org/10.1007/s10515-010-0069-5

Putri, S. A. (2017). Combining Integreted Sampling Technique with Feature Selection for Software Defect Prediction. In IEEE International Conference on Cyber and IT Service Management (pp. 1–6).

Song, Q., Jia, Z., Shepperd, M., Ying, S., & Liu, J. (2011). A General Software Defect-Proneness Prediction Framework. IEEE Transactions on Software Engineering, 37(3), 356–370.

Sun, L., & Erath, A. (2015). A Bayesian network approach for population synthesis. Transportation Research Part C: Emerging Technologies, 61, 49–62. http://doi.org/10.1016/j.trc.2015.10.010

Wahono, R. S., & Suryana, N. (2013). Combining Particle Swarm Optimization based Feature Selection and Bagging Technique for Software Defect Prediction. International Journal of Software Engineering and Its Applications, 7(5), 153–166.

Wang, S., Gao, R., & Wang, L. (2016). Bayesian network classifiers based on Gaussian kernel density. Expert Systems with Applications: An International Journal, 51(C), 207–217. http://doi.org/10.1016/j.eswa.2015.12.031

Wang, T., Li, W., Shi, H., & Liu, Z. (2011). Software Defect Prediction Based on Classifiers Ensemble. Journal of Information & Computational Science, 8(16), 4241–4254.

Yap, B. W., Rani, K. A., Aryani, H., Rahman, A., Fong, S., Khairudin, Z., & Abdullah, N. N. (2014). An Application of Oversampling, Undersampling, Bagging and Boosting in Handling Imbalanced Datasets. Proceedings of the First International Conference on Advanced Data and Information Engineering (DaEng-2013), 285, 13–23.




DOI: http://dx.doi.org/10.31599/jki.v19i1.314




Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License
 

Lembaga Penelitian, Pengabdian kepada Masyarakat dan Publikasi Universitas Bhayangkara Jakarta Raya (LPPMP UBJ)