Article
Optimizing QA Systems: Evaluating Row-Based and Traditional Chunking in Structured-Data-Aware Retrieval-Augmented Generation for University Virtual Assistants
DOI:
https://doi.org/10.47344/7we5dg32
Keywords:
Q&A system, Virtual Assistant, ChatBot, RAG, NLP, Chunking strategy, LLM powered Chatbots
Abstract
This paper presents the development of a question-answering system that assists university students with academic and administrative questions. We examine how different chunking strategies affect the Retrieval-Augmented Generation (RAG) process. Although RAG is typically used with standard chunking methods, we introduce row-based chunking, tailored to structured question-answer datasets, to improve context retrieval for large language models. To assess its effectiveness, we conducted a human evaluation comparing outputs generated with standard chunking against those generated with row-based chunking. The evaluators were students and educators at the university. We found that, on structured datasets, row-based chunking yields more coherent and relevant context than standard chunking. This work highlights the potential of structure-aware chunking methods to improve RAG-based systems for domain-specific applications, paving the way toward more accurate and context-sensitive AI-based assistance in educational settings.
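To make the contrast between the two strategies concrete, the following minimal sketch compares a fixed-size character splitter with a one-chunk-per-row splitter. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the sample rows, the `question`/`answer` field names, the 80-character chunk size, and all function names are hypothetical.

```python
from typing import Dict, List

# Hypothetical sample rows; NOT taken from the paper's dataset.
ROWS: List[Dict[str, str]] = [
    {"question": "When does course registration open?",
     "answer": "Registration opens two weeks before the semester starts."},
    {"question": "How do I request a transcript?",
     "answer": "Submit a request through the student portal."},
]


def format_row(row: Dict[str, str]) -> str:
    """Render one question-answer row as a single text block."""
    return f"Q: {row['question']}\nA: {row['answer']}"


def fixed_size_chunks(rows: List[Dict[str, str]], size: int = 80) -> List[str]:
    """Baseline (assumed): join all rows into one text and cut it every
    `size` characters, ignoring row boundaries."""
    text = "\n".join(format_row(r) for r in rows)
    return [text[i:i + size] for i in range(0, len(text), size)]


def row_based_chunks(rows: List[Dict[str, str]]) -> List[str]:
    """Row-based (assumed): emit one chunk per row, so each question
    stays together with its own answer."""
    return [format_row(r) for r in rows]


if __name__ == "__main__":
    for chunk in fixed_size_chunks(ROWS):
        print(repr(chunk))
    for chunk in row_based_chunks(ROWS):
        print(repr(chunk))
```

In this sketch the fixed-size splitter can cut through the middle of an answer or merge the end of one row with the start of the next, while the row-based splitter always returns complete question-answer pairs. That property is one plausible reason for the more coherent retrieved context reported above.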