Master's Thesis

Multi-Agent Template Filling using Large Language Models: Design and Comparative Evaluation with Speech Input Case Study

Completion

2025/12

Research Area

Web Engineering

Students

Huzefa Ismail Jadliwala

student

Advisers

Verena Traubinger M.Sc.

researcher

Room: 1/B204

Phone: +49 371 531-32100

Email: verena.traubinger@informatik.tu-chemnitz.de

Dr.-Ing. Sebastian Heil

senior researcher

Room: 1/B204

Phone: +49 371 531-32861

Email: sebastian.heil@informatik.tu-chemnitz.de

Description

In domains such as healthcare, manufacturing, and administration, converting unstructured spoken language into structured forms is a critical yet challenging task. Traditional approaches using large language models (LLMs) typically employ monolithic systems that attempt to complete the entire template filling process in a single step. While computationally efficient, these systems are often brittle: they struggle with incomplete or ambiguous inputs, lack modularity for domain adaptation, and provide limited transparency when dealing with domain-specific terminology, noisy transcriptions, or real-world operational conditions.

This thesis proposes a modular, multi-agent approach to template filling, implemented in the Invox system. Instead of treating the problem as a single LLM task, Invox decomposes it into four subtasks (transcription interpretation, field mapping, value inference, and result verification), each handled by a dedicated LLM agent. The system uses Whisper for transcription, GPT-4 and Claude for reasoning, and DeepSeek for semantic validation. Five architectural strategies are explored: (1) Single-Pass Full Input, (2) Iterative Single-Field Processing, (3) Multi-LLM Consensus (Full), (4) Multi-LLM Consensus (Iterative), and (5) Hybrid Refinement. They differ in prompt structure, processing granularity, and verification mechanism. All approaches are evaluated against the same criteria: accuracy, consistency, latency, cost-efficiency, and modularity. The datasets are the MUC-4 benchmark corpus and a real-world industrial dataset of shift reports from steel manufacturing. The goal is not only to assess the performance of each method, but also to better understand the trade-offs that modular, agent-based LLM systems introduce in real-world deployment contexts.
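The difference in processing granularity between the iterative and consensus strategies can be illustrated with a minimal Python sketch. It is not the Invox implementation: the `FieldAgent` type, the function names, the stub agents, and the sample transcript are hypothetical, and majority voting is only an assumed way of forming a consensus, since the description does not state how agent outputs are combined. Real agents would call an LLM; here they are plain callables, so the example runs offline.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional

# Hypothetical agent interface: (transcript, field name) -> extracted value or None.
FieldAgent = Callable[[str, str], Optional[str]]


def fill_iteratively(transcript: str, fields: List[str],
                     agent: FieldAgent) -> Dict[str, Optional[str]]:
    """Iterative single-field processing: one agent call per template field."""
    return {field: agent(transcript, field) for field in fields}


def consensus_value(candidates: List[Optional[str]]) -> Optional[str]:
    """Assumed consensus rule: strict majority vote over non-empty answers.

    Returns None when no agent answered or when the top two answers tie,
    leaving the field open for a later verification step.
    """
    votes = Counter(c for c in candidates if c is not None)
    if not votes:
        return None
    ranked = votes.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def fill_with_consensus(transcript: str, fields: List[str],
                        agents: List[FieldAgent]) -> Dict[str, Optional[str]]:
    """Iterative multi-agent consensus: every agent answers every field."""
    return {
        field: consensus_value([agent(transcript, field) for agent in agents])
        for field in fields
    }


def make_stub_agent(answers: Dict[str, str]) -> FieldAgent:
    """Stand-in for an LLM-backed agent; returns canned answers per field."""
    return lambda transcript, field: answers.get(field)


# Illustrative data only, not taken from the thesis datasets.
agent_a = make_stub_agent({"shift": "night", "machine": "furnace 2"})
agent_b = make_stub_agent({"shift": "night", "machine": "furnace 3"})
agent_c = make_stub_agent({"shift": "night"})

transcript = "Night shift report: the furnace temperature was unstable."
result = fill_with_consensus(transcript, ["shift", "machine", "operator"],
                             [agent_a, agent_b, agent_c])
```

In this sketch the agents agree on `shift`, disagree on `machine`, and none of them answers `operator`, so only `shift` is filled. Leaving contested fields empty rather than guessing is one possible design choice; a verification or refinement agent could then resolve them.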

The objective of this thesis is to create a solution, or to combine existing approaches, for the problem described above: filling out templates with the help of large language models. The work comprises three parts. First, an analysis of the state of the art on LLMs and prompt engineering, multi-agent systems, tools for filling out templates, and other relevant work. Second, the implementation and comparison of five different approaches. Third, a suitable evaluation in which the approaches are tested on datasets against a set of benchmarks and against the requirements elicited from the literature research.
