Master's Thesis Gentrit Fazlija
Toward Optimizing a Retrieval Augmented Generation Pipeline using Large Language Model
Introduction & Motivation
Hello! Welcome to my project page :)
I'm currently working on my master's thesis. As a student of Mathematics in Data Science, I'm deeply interested in data and how to maximize the inherent value within it. Ever since I was introduced to NLP, I have been hooked. My focus is an information retrieval model that aims to help both current and prospective students understand the different study programs that TUM offers.
To this end, I am leveraging the reasoning capabilities of Large Language Models to extract up-to-date data about the study programs at TUM. The goal is to build a model pipeline that answers a variety of questions one might have about these programs.
Join me on this journey either by checking back on this page around mid-February or connecting with me on LinkedIn.
Research Questions
Q1: Would a multi-query formulation system improve the performance?
Q2: Would optimization approaches, such as an ensemble retriever in combination with child-parent chunking, improve the performance of the passage retriever?
Q3: How much does few-shot prompting help compared to zero-shot prompting?
Q4: How does the performance change when using a free open-source model compared to a paid closed-source model? How can open-source models be optimized?
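To make the terms in Q2 more concrete, here is a minimal Python sketch of the two ideas. It is not the thesis implementation: the function names, the word-based chunk sizes, and the use of reciprocal rank fusion to combine retrievers are all assumptions chosen for illustration. Child-parent chunking indexes small chunks for matching but returns the larger parent chunk as context, and an ensemble retriever merges the ranked lists of several retrievers into one.

```python
from collections import defaultdict


def split_parent_child(doc_id, text, parent_size=40, child_size=10):
    """Split text into parent chunks and smaller child chunks (sizes in words).

    Returns a dict {parent_id: parent_text} and a list of
    (parent_id, child_text) pairs. All names here are hypothetical.
    """
    words = text.split()
    parents, children = {}, []
    for p_start in range(0, len(words), parent_size):
        parent_id = f"{doc_id}-p{p_start // parent_size}"
        parent_words = words[p_start:p_start + parent_size]
        parents[parent_id] = " ".join(parent_words)
        # Each child remembers which parent it came from.
        for c_start in range(0, len(parent_words), child_size):
            child_text = " ".join(parent_words[c_start:c_start + child_size])
            children.append((parent_id, child_text))
    return parents, children


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into a single ranking.

    Each id scores 1 / (k + rank) per list it appears in; k=60 is a
    commonly used default, assumed here rather than taken from the thesis.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by id for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


if __name__ == "__main__":
    text = " ".join(f"w{i}" for i in range(25))
    parents, children = split_parent_child("doc", text, parent_size=10, child_size=5)
    print(len(parents), len(children))

    # Two hypothetical retrievers (e.g. keyword-based and embedding-based)
    # each return a ranked list of parent ids.
    print(reciprocal_rank_fusion([["a", "b", "c"], ["b", "c", "a"]]))
```

In a pipeline along these lines, each retriever would rank the child chunks, the hits would be mapped to their parent ids, and the fused ranking would decide which parent chunks are passed to the language model.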
| Attribute | Value |
|---|---|
| Title (de) | Optimierung einer Retrieval Augmented Generation Pipeline unter Verwendung eines großen Sprachmodells |
| Title (en) | Toward Optimising a Retrieval Augmented Generation Pipeline using Large Language Model |
| Project | |
| Type | Master's Thesis |
| Status | completed |
| Student | Gentrit Fazlija |
| Advisor | Anum Afzal |
| Supervisor | Prof. Dr. Florian Matthes |
| Start Date | 15.08.2023 |
| Sebis Contributor Agreement signed on | 08.08.2023 |
| Checklist filled | Yes |
| Submission date | 15.03.2024 |