Performance Analysis of the BGE Small and MiniLM-L6 Embedding Models on Retrieval Quality Using RAGAS Metrics
Abstract
The application of Large Language Models (LLMs) in the medical domain is often hampered by hallucination and by knowledge that is not up to date. Retrieval-Augmented Generation (RAG) offers a way to ground LLMs in factual data, but the quality of RAG output depends heavily on the accuracy of the retrieval step. This study analyzes the effect of chunk size and embedding model on retrieval quality in a medical chatbot system at the Nusa Putra Farmedika General Clinic. The method is a comparative experiment that tests three chunk sizes (256, 512, and 1024 tokens) and compares two embedding models, BGE Small and MiniLM-L6. Evaluation was performed automatically with the RAGAS framework, focusing on the Context Recall and Context Precision metrics. The results showed an inverse relationship between chunk size and retrieval quality, with a chunk size of 512 tokens producing the best level of information granularity. BGE Small proved slightly superior to MiniLM-L6 in capturing the semantics of clinical text. The best configuration combined BGE Small with a chunk size of 512 tokens, yielding the highest average score of 0.59 (Context Recall 0.45, Context Precision 0.74). These findings were implemented in a medical chatbot prototype as functional validation. This study recommends this configuration as a technical standard for developing medical chatbots, as a foundational step toward improving context relevance and mitigating hallucinations.
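The experimental grid and the two metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline and does not call the RAGAS library: in RAGAS the relevance and support judgments come from an LLM judge, whereas here they are hand-supplied 0/1 flags, and all function names are hypothetical. The sketch assumes the reported average score is the plain mean of Context Recall and Context Precision, which the abstract does not state explicitly.

```python
from itertools import product

# Configurations named in the abstract: 2 embedding models x 3 chunk sizes.
EMBEDDING_MODELS = ["BGE Small", "MiniLM-L6"]
CHUNK_SIZES = [256, 512, 1024]  # tokens


def context_precision(relevance):
    """Rank-weighted precision over retrieved chunks.

    relevance: 0/1 flags in retrieval order (1 = chunk is relevant).
    Simplified stand-in for the LLM-judged RAGAS metric.
    """
    hits = 0
    weighted = 0.0
    for k, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            weighted += hits / k  # precision@k, counted only at relevant ranks
    return weighted / hits if hits else 0.0


def context_recall(supported):
    """Fraction of reference-answer statements supported by the retrieved context.

    supported: 0/1 flags, one per statement in the reference answer.
    """
    return sum(supported) / len(supported) if supported else 0.0


def average_score(recall, precision):
    """Assumed aggregate: arithmetic mean of the two metrics."""
    return (recall + precision) / 2


# Every (model, chunk size) pair that the comparative experiment covers.
configs = list(product(EMBEDDING_MODELS, CHUNK_SIZES))

if __name__ == "__main__":
    for model, size in configs:
        print(f"{model} @ {size} tokens")
    # Scores reported in the abstract for BGE Small with 512-token chunks.
    print(average_score(recall=0.45, precision=0.74))
```

Under the mean assumption, the reported Context Recall of 0.45 and Context Precision of 0.74 average to about 0.595, which matches the reported 0.59 up to rounding or truncation of the underlying values.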

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.