POSW-Vote: A precision-oriented weighted voting framework for robust information extraction from domain-specific reports
DOI:
https://doi.org/10.54939/1859-1043.j.mst.CSCE9.2025.123-134

Keywords:
Information extraction; Large language models; Ensemble voting; Semantic similarity; Schema-based extraction.

Abstract
Information extraction (IE) from unstructured or semi-structured reports remains a challenging task in specialized domains such as military situation reporting, where textual content is narrative, irregular, and context-dependent. Traditional rule-based or named-entity-recognition (NER) methods often fail to achieve sufficient coverage or adaptability in such settings. While large language models (LLMs) have shown strong potential for schema-based extraction, their outputs exhibit variability across runs and models, limiting consistency and precision. This paper proposes POSW-Vote (Precision-Oriented Similarity-Weighted Voting), a semantic voting algorithm designed to consolidate multiple LLM outputs into a single, stable, structured representation. The method jointly employs similarity-based clustering, reliability weighting, and superstring-aware selection to identify the most complete and contextually correct value for each schema-defined field. Extensive experiments on real-world, expert-annotated Vietnamese military reports demonstrate that POSW-Vote consistently improves Precision and F1-score over single-run and intra-model baselines while remaining robust across heterogeneous models. The results show that the proposed framework enhances the stability and reliability of LLM-based extraction without retraining, offering a scalable, model-agnostic solution for high-stakes domains such as defense intelligence and situational monitoring.
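The abstract names three consolidation steps applied per schema field: similarity-based clustering of candidate values, reliability weighting of the resulting clusters, and superstring-aware selection within the winning cluster. The sketch below is a minimal, hypothetical Python rendering of that pipeline, not the paper's implementation; the `posw_vote` function, the greedy threshold clustering, the per-run `weights`, and the use of `difflib` as a stand-in similarity metric are all illustrative assumptions.

```python
# Minimal sketch of POSW-Vote's per-field consolidation, based only on the
# abstract's description. Names, the greedy threshold clustering, and the
# similarity metric are illustrative assumptions, not the paper's code.
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; a stand-in for the paper's semantic metric."""
    a, b = a.lower(), b.lower()
    if a in b or b in a:  # treat substring/superstring pairs as matches
        return 1.0
    return SequenceMatcher(None, a, b).ratio()


def posw_vote(candidates: list[str], weights: list[float],
              sim_threshold: float = 0.8) -> str | None:
    """Consolidate one schema field's values from multiple LLM runs/models.

    candidates -- one extracted value per run (empty strings are skipped)
    weights    -- assumed per-run reliability weights (e.g., model trust scores)
    """
    # Step 1: similarity-based clustering (greedy, against each cluster's
    # first member as representative).
    clusters: list[list[int]] = []
    for i, cand in enumerate(candidates):
        if not cand:
            continue
        for cluster in clusters:
            if similarity(cand, candidates[cluster[0]]) >= sim_threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    if not clusters:
        return None

    # Step 2: reliability weighting -- keep the cluster with the largest
    # summed run weight.
    best = max(clusters, key=lambda c: sum(weights[i] for i in c))

    # Step 3: superstring-aware selection -- prefer the member containing the
    # most other members, breaking ties by length (the most complete string).
    members = [candidates[i] for i in best]
    return max(members, key=lambda m: (sum(other in m for other in members),
                                       len(m)))
```

For instance, `posw_vote(["Hanoi", "Hanoi, Vietnam", "Ha Noi"], [1.0, 1.0, 1.0])` groups all three candidates into one cluster and returns "Hanoi, Vietnam" as the most complete value; a production variant would presumably swap in an embedding-based semantic similarity and weights derived from each model's measured reliability.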