Resources

Code & Models

GitHub Repositories

🤖 C-BERT Model

padjohn/cbert
Library for creating datasets and for training and running the multi-task transformer for causal relation extraction.

Model Weights

🤗 Hugging Face

pdjohn/c-bert
Pre-trained C-BERT model weights for German causal extraction.

Publications

Thesis

📖 PhD Dissertation

Title: Kausalsemantik. Eine Operationalisierung der -sterben Komposita im Umweltdiskurs
Author: Patrick Johnson
Institution: Technical University of Darmstadt
Year: 2026
Link: [coming soon]

The thesis introduces the Causal Semantics framework and uses it to analyze responsibility attributions in German environmental discourse (1990–2020).

Papers

📄 C-BERT Paper

Title: C-BERT: Factorized Causal Relation Extraction
Authors: Patrick Johnson
Year: 2026
Link: DOI | PDF

Introduces the C-BERT multi-task transformer for extracting (C,E,I) tuples from German text.

Data

Indicator Library

The framework uses a library of 644 German causal indicators across multiple families:

  • 🔢 Size: 644 indicator forms
  • 🏷️ Families: 162 semantic families (CAUSE, STOP, THROUGH, etc.)
  • 📊 Distribution: Annotated with frequency and priority
  • 🧲 Polarity: Each indicator marked as promoting (+) or inhibiting (−)
  • 🌟 Salience: Each indicator marked as mono, distributive (0.5) or priority (−)

Download: indicators.csv
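
As a minimal sketch of how such an indicator table might be read in Python: the column names (`form`, `family`, `polarity`, `salience`), the three sample rows, and the helper names `Indicator`, `parse_polarity` and `load_indicators` are all assumptions made for illustration. The actual layout of indicators.csv may differ, so check the file header before reusing this.

```python
import csv
import io
from collections import Counter
from dataclasses import dataclass

# Hypothetical sample in an ASSUMED column layout; these rows are
# illustrative only and are not taken from the real indicators.csv.
SAMPLE_CSV = """form,family,polarity,salience
verursachen,CAUSE,+,mono
stoppen,STOP,-,mono
durch,THROUGH,+,distributive
"""


@dataclass(frozen=True)
class Indicator:
    form: str       # surface form of the indicator
    family: str     # semantic family, e.g. CAUSE, STOP, THROUGH
    polarity: int   # +1 for promoting, -1 for inhibiting
    salience: str   # salience label, kept as plain text here


def parse_polarity(symbol: str) -> int:
    """Map a '+' or '-' marker (ASCII hyphen or Unicode minus) to +1 / -1."""
    symbol = symbol.strip()
    if symbol == "+":
        return 1
    if symbol in {"-", "\u2212"}:
        return -1
    raise ValueError(f"unknown polarity marker: {symbol!r}")


def load_indicators(handle) -> list[Indicator]:
    """Read indicator rows from an open text handle in the assumed layout."""
    reader = csv.DictReader(handle)
    return [
        Indicator(
            form=row["form"],
            family=row["family"],
            polarity=parse_polarity(row["polarity"]),
            salience=row["salience"],
        )
        for row in reader
    ]


indicators = load_indicators(io.StringIO(SAMPLE_CSV))
by_family = Counter(ind.family for ind in indicators)
by_polarity = Counter(ind.polarity for ind in indicators)
```

To run this against the downloaded file, replace the `io.StringIO(SAMPLE_CSV)` argument with `open("indicators.csv", encoding="utf-8", newline="")` and adjust the column names to match the real header.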

Annotation Corpus

Manual annotations of causal relations in German environmental texts:

  • 📄 Size: 2,391 annotated relations
  • 📅 Period: 1990–2020
  • 🌍 Domain: Environmental discourse (forest dieback, species extinction, insect mortality, bee mortality)
  • 🏷️ Annotations: Indicators, entities, polarity, salience, context markers

Availability: A subset containing the sentences from the German Bundestag is available on Hugging Face.

Tools & Libraries

The framework builds on several open-source tools:

Contact & Collaboration

Interested in using or extending this framework? We welcome collaborations!
📧 Email, 🟢 ORCID, 💼 LinkedIn

Citation

If you use this framework in your research, please cite:

@phdthesis{johnson2026causalsemantics,
  title={Kausalsemantik. Eine Operationalisierung der -sterben Komposita im Umweltdiskurs},
  author={Patrick Johnson},
  school={Technical University of Darmstadt},
  year={forthcoming}
}
@misc{cbert,
  title={C-BERT: Factorized Causal Relation Extraction},
  author={Patrick Johnson},
  doi={10.26083/tuda-7797},
  year={2026}
}

License

The code is released under the MIT License.
The documentation is licensed under CC BY 4.0.

Acknowledgments

This research was funded by the Deutsche Forschungsgemeinschaft (DFG) as part of the project FOR 5182 / “Kontroverse Diskurse. Sprachgeschichte als Zeitgeschichte seit 1990”. I am deeply grateful for the financial support that made this work possible. I also thank the members of the research group for their invaluable feedback and stimulating discussions throughout the development of the Causal Semantics framework.
