Task 1: Reading comprehension
Question Answering is an important and broad field of research within Natural Language Processing. Previous editions of PolEval included passage retrieval tasks, which are crucial in narrowing down the range of documents relevant to a human question. However, the whole process of answering a question can be fully automated only once a reading comprehension system is included.
Classical systems for text comprehension relied on span extraction, but this is a limiting technique: it does not work well for morphologically rich languages (such as Polish) and is poorly suited to answering yes-no questions. Recent advances in large, generative language models show that high zero-shot performance is easily achievable. However, aligning these models with more precise task definitions is still challenging, and traditional supervised learning paradigms still outperform GPT. Moreover, the problem of hallucinations is still pervasive: generative models will often fabricate answers instead of stating that the passage is irrelevant.
Task 2: Emotion and sentiment recognition
Understanding human emotions is one of the more challenging tasks in natural language processing. Not only are emotions highly subjective, but people also often struggle to express themselves fully in written language. Understanding the expressed emotions can require additional context, sometimes supplied by external knowledge.
Nowadays, large pre-trained models are used to address two problems: understanding the structure and subtleties of a language, and drawing on knowledge that is not available in the immediate context of a text. This solution is by no means perfect and often requires additional training to fit the task at hand. Nonetheless, associative knowledge acquired from large amounts of unlabeled text gives noticeable gains in tasks such as emotion recognition.
Task 3: Polish Automatic Speech Recognition Challenge
Automatic speech recognition (ASR) has made significant progress over the last decade. Improvements in deep learning and increased data availability have resulted in accuracy levels for automatic speech transcription that are on par with human transcription, at least for specific domains, tasks, and speech characteristics. ASR technology has expanded to cover many new languages, use cases, user demographics, and devices. However, achieving robust speech recognition remains a challenge for many low-resource languages, specific speaker groups, application domains, and acoustic conditions.
To gauge advancements in Polish ASR technology, we are introducing the Open Challenge for Polish ASR. This initiative draws inspiration from the Multi-Domain End-to-End Speech Recognition Benchmark for the English language [1].