Priority Area 'LaSTing'

General information

The Priority Area (German: Schwerpunktprogramm) “Robust Assessment & Safe Applicability of Language Modeling: Foundations for a New Field of Language Science & Technology” (acronym: LaSTing; SPP 2556) aims to advance our understanding of language technology, in particular language modeling, for safer use, especially in applications in the (computational / cognitive) language sciences.

Currently, a call for project proposals is open. We encourage interested scholars to consult the project description below and to consider applying.

General information about the funding scheme is here.

The official call for proposals from the DFG for LaSTing is here.

Slides with additional information on content and practicalities: here.

A detailed project description is here.

Timeline for first funding phase:

Practical and organizational information

Project proposals are submitted as “individual projects” to the DFG. Evaluation and selection are organized and carried out by the DFG, not by the SPP’s coordinator and board.

Projects apply for their own funding for running costs (experiments, conference travel, etc.) and may include additional funding for PostDocs and PhDs. Normally, a project would apply for funding of one PhD or PostDoc position. PhD students are usually funded as E13 65% (Humanities) or E13 100% (Sciences) depending on local custom. Compute architecture and other large investments in equipment cannot be funded through this initiative. It is chiefly a community-building / networking initiative that aims to connect researchers at the interface between language science and language technology, and to educate a new cohort of outstanding young researchers in this emerging interdisciplinary field.

Early-career researchers are particularly encouraged to apply for their own funding (PostDoc position).

Additional measures

On top of each project’s individual budget, the priority area contributes additional measures for networking and community-building. These include extra funding for workshops, short-term collaboration between projects, exchange visits, public outreach measures, and equality and family support. The program is moreover supported by four Mercator Fellows. Annual meetings and annual autumn schools will foster inter-project exchange, interdisciplinary education and public dissemination of research results.

Aims and scope of the priority area

While modern language technology increasingly permeates many areas of application, much of its input-output behaviour and its inner mechanics remains unknown. As a result, recent years have seen the emergence of a field of interdisciplinary and methodologically diverse work at the interface between the cognitive language sciences (broadly construed) and language technology (focused on neural language models, but not exclusively). However, many foundational and methodological issues remain unclear. The overarching goal of this Priority Programme is therefore to channel cross-disciplinary efforts dedicated to the understanding, testing and safe application of modern language technology (with a focus on language modelling).

The Priority Programme LaSTing addresses researchers in the interdisciplinary field of the cognitive and computational language sciences (including classical disciplines such as linguistics, psychology, neuroscience, computational linguistics, artificial intelligence, philosophy, computer science and others) who seek to advance our understanding of language modelling from a theoretical or empirical point of view, or who use modern language technology as a tool for innovative theoretical and empirical research in the cognitive language sciences. Individual projects are expected to relate to at least one of the Priority Area’s core issues, which are robust assessment, safe applicability and foundational questions (detailed below). The Priority Programme especially encourages contributions that seek to address these core issues by bringing to bear concepts and methods from the theoretical/empirical language sciences.

Robust assessment: Given the very rapid pace of recent developments, careful reflection on standards for the methodology of testing and assessment is lagging behind. What is required is a joint effort to converge on proper standards for robust assessment of language models. Methodology is robust, in the sense intended here, if its results are generalisable (carrying over with sufficient certainty to other models and data sets), transferable (insightful beyond the purposes of understanding a single type of computational model), and reproducible (with the same or different models and data sets). Robust methodology also aspires to be as future-proof as possible, i.e. likely to remain relevant to the next generation of models or the next set of adversarial examples.

Safe applicability: As language technology is applied more and more widely, concerns of safe applicability become ever more important. Safe applicability subsumes critical aspects such as being conceptually sound (e.g. anchored in “first principles” or established empirical knowledge), validated (e.g. by mathematical proof or other rigorous derivation) or at least stress-tested across a near-exhaustive traversal of possible conditions of use, ethical (e.g. bias- and harm-free, or privacy-respecting), and also economical (i.e. minimising data requirements and energy consumption). Issues of safe applicability loom particularly large in high-stakes contexts, of which application in the scientific process is a special case. The Priority Area LaSTing therefore also particularly invites contributions that reflect on the safe applicability of language technology for knowledge gain in the cognitive language sciences.

Foundational questions: Progress on understanding the behaviour of language models and their safe applicability is inextricably tied to a better understanding of their core mechanisms and the impact of their training data or their training objectives. But just as relevant are deep foundational questions concerning the nature of language models (e.g. what are LMs models of?) and their proper role in scientific research into human language (e.g. how could LMs be used as explanatory tools for understanding human language?). In response to these issues, the Priority Programme especially welcomes foundational work addressing general properties or potential limits of particular classes of language models, e.g. by using mathematical arguments, simulation studies, tight conceptual argumentation or a mixture of such methods.

Examples of more concrete research questions that fit into these three core issues are:

Examples of work that is outside the scope of this Priority Programme are efforts geared mainly at improving system performance (e.g. based on some benchmark score). Also out of scope are projects that merely seek new areas of application for established tools, with little or no reflection on methods or concepts and no other bearing on the knowledge-oriented cognitive language sciences.

In order to achieve its goals, LaSTing requires broad and deep interdisciplinary collaboration. The Priority Programme therefore implements an extensive suite of individual measures to support diversity, networking and dissemination, and to ensure the success of early-career researchers and scholars with backgrounds underrepresented in academic research. Early-career researchers are explicitly encouraged to submit their own proposals.

Internal Collaboration

If you want to mention collaboration with other project proposals in your application, and you would like to know what others are preparing to submit, you can share and view project ideas in this document: LaSTing ‘meet & collab’.

Additional information on applications (FAQs)