CL Fall School 2024

Program

We offer four (on-site) courses, each of which consists of ten 90-minute lectures. All courses are taught in English.

Week I (16-20 September 2024)	9-10.30am 11am-12.30pm	Python for Language Processing (Jakob Prange, University of Augsburg)
	1.30-3pm 3.30-5pm	Multimodal CL and NLP: Combining Language and Vision for (Computational) Semantics (Carina Silberer, IMS Stuttgart)
Week II (23-27 September 2024)	9-10.30am 11am-12.30pm	Argument Mining: From argument diagramming to automatic reconstruction of reasoning (John Lawrence, University of Dundee)
	1.30-3pm 3.30-5pm	Visual Analytics for Linguistics (Raphael Buchmüller, University of Konstanz)

The total workload of the modules corresponds to 3 ECTS (90 hours).

Course descriptions

Python for Language Processing (Jakob Prange, University of Augsburg)

This compact course delivers a practical introduction to the Python programming language, tailored to linguists and language professionals. We will start from basic concepts like loops and reading files, and by the end of the week you will be able to process and automatically annotate a whole text corpus! Python is highly relevant in today's technology landscape due to its versatility and ease of use. It is used frequently for web development and data analysis, for artificial intelligence and automation, in industry and research, by beginners and experienced developers alike.

The course will comprise mostly practical exercises, following a brief introductory lecture each morning. In other words, we will spend as much time coding, understanding code, and critically discussing computational linguistic problems as possible!

After taking the course, participants will know how to analyze and process text files using Python programs and scripts. They will apply good code design techniques and find and fix bugs in their own and others' Python code. Participants will be familiar with common questions and issues in Computational Linguistics and prepared to critically reflect on them, in order to properly approach corresponding programming problems.

Multimodal CL and NLP: Combining Language and Vision for (Computational) Semantics (Carina Silberer, IMS Stuttgart)

How do humans use language to communicate with each other in and about the real world? How can we equip systems with a (better) ability to comprehend and use human language in the physical world? These theoretical and practical questions are at the core of the field of Multimodal NLP.

This course places a particular focus on the connection between language and the visual context. The course provides the fundamental concepts and knowledge of multimodal NLP and its underlying fields, discusses its tasks and applications, and teaches the tools and methods that are commonly used to explore language comprehension through joint reasoning and understanding of images/videos.

Each lecture consists of a theoretical and a practical part (hands-on).

The first part of the course will provide the fundamentals, in particular computer vision and image recognition models used to extract visual representations from visual data (images or videos), as well as multimodal features/representation models, together with practical exercises.

Classical tasks and applications in multimodal NLP will then be discussed and current multimodal models for some of these tasks will be explored practically.

Following this, the course will discuss the limitations of current multimodal models in accounting for language understanding (in the real world). In practical exercises, we will analyse how well current models account for specific linguistic phenomena. The final part focuses on current challenges, tasks and applications in multimodal NLP and robotics.

For the practical exercises, students are expected to have (some) programming skills, ideally in Python. We recommend that course participants have Python installed, and ideally Anaconda. No prior knowledge of computer vision is required.

Argument Mining: From argument diagramming to automatic reconstruction of reasoning (John Lawrence, University of Dundee)

Argument mining builds on opinion mining, sentiment analysis and related tasks to automatically extract not just what people think, but why they hold the opinions they do. Although the topic was largely unheard of less than a decade ago, there are now several thousand papers on it, tens of millions of dollars of commercial and research investment, and a growing community dedicated to the field.

The course will start with a short manual annotation exercise, in which participants will be encouraged to think about their justifications for the choices they make: Why that particular segment? Why that particular structure? Why that argumentation scheme? This develops intuition for how these tasks could be performed automatically. The remainder of the first session will look at mapping the manual annotation process to argument mining tasks.

The second session will move on to state-of-the-art techniques for each argument mining task and the advances that are enabling them, including transformer models, word and sentence embeddings, generative models, and hybrid models. The session will conclude with a short practical implementation exercise and a look ahead at future directions for the field.

Visual Analytics for Linguistics (Raphael Buchmüller, University of Konstanz)

This course provides an introduction to the visualization of linguistic information (LingVis). LingVis integrates techniques developed in Information Visualization (InfoVis) and Visual Analytics with methodologies and analyses from theoretical and computational linguistics. In addition to standard visualization techniques such as bar charts, scatterplots, or line charts, a wide array of advanced and novel methods has been developed. Prominent examples include treemaps, pixel displays, sunburst visualizations, and glyphs of varying complexity.

We will explore how computational linguistics can benefit from LingVis techniques, as these visualizations can help uncover patterns, trends, and relationships in linguistic data that are otherwise difficult to detect using traditional methods. We present a concrete use case for LingVis in the context of Large Language Model (LLM) interactions.

Each day, the course will be divided into two parts: the first session will introduce theoretical concepts and foundational techniques from the field of LingVis, while the second session will focus on deepening these skills through hands-on design and application exercises. Participants will apply what they have learned by investigating how linguistic questions can benefit from visual analysis.

Computational Linguistics Fall School 2024

September 16 - 27, 2024
