Comparative Analysis of Text Mining Results With TF-IDF Features and SQL LIKE Operator in Indonesian News Search
Analisa Perbandingan Hasil Text Mining Dengan Fitur TF-IDF dan SQL Like Operator Pada Pencarian Berita berbahasa Indonesia
Abstract
This study implements text mining with the TF-IDF method for information retrieval, specifically an Indonesian news search feature. The dataset was sourced from NewsAPI, and a CodeIgniter-based website named "News Plus Six Dua" was built for the study. The Vector Space Model (VSM) is also used to overcome the weakness of the TF-IDF method in the ranking process. The results show that search with the TF-IDF method is more accurate than search with the SQL LIKE operator. For the keyword "Indonesian soccer schedule", TF-IDF achieved 100% precision and 66.7% recall (sensitivity), while the SQL LIKE operator returned no search results (0%). However, the TF-IDF method runs slower than the SQL LIKE operator. Speed was tested against the number of words or terms entered, the size of the dataset, and the access location. For access location, access via hosting was observed to be faster than access via localhost.
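The contrast described in the abstract can be illustrated with a minimal Python sketch. It is not the study's implementation (the site was built on CodeIgniter). The toy documents, the function names, the whitespace tokenizer, and the raw-count times log(N/df) weighting are all assumptions chosen for illustration. The sketch ranks documents by cosine similarity between TF-IDF vectors (the VSM step), mimics a `LIKE '%query%'` filter with a substring test, and computes precision and recall from retrieved and relevant document sets.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase + whitespace split; no stemming or stopword removal.
    return text.lower().split()

def tf_idf_vectors(docs):
    """Build sparse TF-IDF vectors (term -> weight) for tokenized documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Assumed weighting: raw term count times log(N / df).
    idf = {term: math.log(n / df[term]) for term in df}
    vectors = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vectors, idf

def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def tfidf_search(query, raw_docs):
    """Return indices of documents with non-zero similarity, best match first."""
    vectors, idf = tf_idf_vectors([tokenize(d) for d in raw_docs])
    q = {t: c * idf.get(t, 0.0) for t, c in Counter(tokenize(query)).items()}
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    return [i for score, i in sorted(scored, reverse=True) if score > 0]

def like_search(query, raw_docs):
    """Mimic WHERE content LIKE '%query%': the whole query must appear verbatim."""
    needle = query.lower()
    return [i for i, d in enumerate(raw_docs) if needle in d.lower()]

def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical toy corpus, not taken from the study's NewsAPI dataset.
DOCS = [
    "indonesian national soccer team schedule released",
    "soccer league schedule changes announced",
    "government announces new budget plan",
    "indonesian economy grows this quarter",
]
QUERY = "indonesian soccer schedule"

if __name__ == "__main__":
    print("TF-IDF ranking:", tfidf_search(QUERY, DOCS))
    print("LIKE matches:  ", like_search(QUERY, DOCS))
    # Illustrative counts only: 2 retrieved, both relevant, 3 relevant in total.
    print("precision, recall:", precision_recall([0, 1], [0, 1, 2]))
```

In this toy corpus the substring filter finds nothing because no document contains the three query words as one contiguous phrase, while the term-based ranking still returns the documents that share individual terms with the query. The last line shows one way figures like 100% precision and 66.7% recall could arise (two retrieved documents, both relevant, out of three relevant documents); the study's actual counts are not given in the abstract.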
Copyright (c) 2020 Riwa Rambu Hada Enda, Fajar Hariadi
This work is licensed under a Creative Commons Attribution 4.0 International License.