Blombach, A., & Lindner-Bornemann, B. (2024). Der possessive Dativ in Raum und Zeit. In Dagobert Höllein, Günter Koch, Alexander Werth (Eds.), Regionale Sprachgeschichte(n) (pp. 29-46). Berlin/Boston: De Gruyter.
Dykes, N., Evert, S., Heinrich, P., Humml, M., & Schröder, L. (2024). Finding Argument Fragments on Social Media with Corpus Queries and LLMs. In Philipp Cimiano, Anette Frank, Michael Kohlhase, Benno Stein (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 163-181). Bielefeld, DEU: Springer Science and Business Media Deutschland GmbH.
Dykes, N., Evert, S., Heinrich, P., Humml, M., & Schröder, L. (2024). Leveraging High-Precision Corpus Queries for Text Classification via Large Language Models. In Hautli-Janisz A, Lapesa G, Anastasiou L, Gold V, Liddo AD, Reed C (Eds.), Proceedings of the First Workshop on Language-driven Deliberation Technology (DELITE) @ LREC-COLING 2024 (pp. 52–57). Torino, Italy: ELRA and ICCL.
Heinrich, P., Blombach, A., Doan Dang, B., Zilio, L., Havenstein, L., Dykes, N.,... Schäfer, F. (2024). Automatic Identification of COVID-19-Related Conspiracy Narratives in German Telegram Channels and Chats. In Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue (Eds.), Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) (pp. 1932-1943). Turin, IT.
Heinrich, P., Blombach, A., Doan Dang, B., Zilio, L., Havenstein, L., Dykes, N.,... Schäfer, F. (2024). Automatic Identification of COVID-19-related Narratives in German Telegram Channels and Chats. In Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue (Eds.), LREC-COLING 2024 - Main Conference Proceedings (pp. 1932-1943). Torino, IT: European Language Resources Association (ELRA).
Khan, A.F., Ionov, M., Chiarcos, C., Romary, L., Sérasset, G., & Kabashi, B. (2024). On Modelling Corpus Citations in Computational Lexical Resources. In Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue (Eds.), 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings (pp. 12385-12394). Hybrid, Torino, ITA: European Language Resources Association (ELRA).
Zilio, L., Qian, S., Kanojia, D., & Orăsan, C. (2024). Using character-level models for efficient abbreviation and long-form detection. In Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue (Eds.), 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings (pp. 3028-3037). Torino, Hybrid, IT: European Language Resources Association (ELRA).
Adrian, A., Dykes, N., Evert, S., Heinrich, P., & Keuchen, M. (2023). Automatische Anonymisierung von Gerichtsurteilen – Eine Vision scheint realisierbar. In Erich Schweighofer, Jakob Zanol, Stefan Eder (Eds.), Rechtsinformatik als Methodenwissenschaft des Rechts – Tagungsband des 26. Internationalen Rechtsinformatik Symposions IRIS 2023 (pp. 211-220). Editions Weblaw.
Dykes, N., Wilson, A., & Uhrig, P. (2023). A Pipeline for the Creation of Multimodal Corpora from YouTube Videos. In Piush Aggarwal, Özge Alaçam, Carina Silberer, Sina Zarrieß, Torsten Zesch (Eds.), Proceedings of the 1st Workshop on Linguistic Insights from and for Multimodal Language Processing (LIMO 2023) (pp. 1-5). Ingolstadt, DE: Association for Computational Linguistics.
Adrian, A., Dykes, N., Evert, S., Heinrich, P., Keuchen, M., & Proisl, T. (2022). Manuelle und automatische Anonymisierung von Urteilen. In Axel Adrian, Michael Kohlhase, Stephanie Evert, Martin Zwickel (Eds.), Digitalisierung von Zivilprozess und Rechtsdurchsetzung (pp. 173-197).
Blombach, A., Evert, S., Jannidis, F., Pielström, S., Konle, L., & Proisl, T. (2022). Exploring Lexical Diversities. In Digital Humanities 2022. Conference Abstracts (pp. 130-134). Tokyo, JP.
Chiarcos, C., Gkirtzou, K., Ionov, M., Kabashi, B., Khan, A.F., & Truica, C.-O. (2022). Modelling Collocations in OntoLex-FrAC. In Proceedings of Globalex Workshop on Linked Lexicography within the 13th Language Resources and Evaluation Conference (pp. 10–18). Marseille, France: European Language Resources Association.
Dykes, N., Heinrich, P., & Evert, S. (2022). Retrieving Twitter argumentation with corpus queries and discourse analysis. In Susanne Flach, Martin Hilpert (Eds.), Broadening the Spectrum of Corpus Linguistics: New approaches to variability and change (pp. 229-256). John Benjamins Publishing Company.
Gracia, J., Kabashi, B., & Kernerman, I. (2022). TIAD 2022: The Fifth Translation Inference Across Dictionaries Shared Task. In Proceedings of Globalex Workshop on Linked Lexicography within the 13th Language Resources and Evaluation Conference (pp. 19–25). Marseille, France: European Language Resources Association.
Keuchen, M., Adrian, A., Evert, S., Heinrich, P., & Dykes, N. (2021). Anonymisierung von Gerichtsurteilen – Eine wesentliche Voraussetzung für E-Justice –. In Schweighofer E, Eder S, Hanke P, Kummer F, Saarenpää A (Eds.), Cybergovernance - Tagungsband des 24. Internationalen Rechtsinformatik Symposions IRIS 2021 (pp. 137-149). Editions Weblaw.
Blombach, A., Dykes, N., Evert, S., Heinrich, P., Kabashi, B., & Proisl, T. (2020). A new German Reddit corpus. In Proceedings of the 15th Conference on Natural Language Processing, KONVENS 2019 (pp. 278-279). Erlangen-Nürnberg, DE: German Society for Computational Linguistics and Language Technology.
Blombach, A., Dykes, N., Heinrich, P., Kabashi, B., & Proisl, T. (2020). A Corpus of German Reddit Exchanges (GeRedE). In Nicoletta Calzolari, Frederic Bechet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis (Eds.), LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings (pp. 6310-6316). Marseille, FR: European Language Resources Association (ELRA).
Evert, S., Harlamov, O., Heinrich, P., & Baski, P. (2020). Corpus query lingua franca part II: Ontology. In Nicoletta Calzolari, Frederic Bechet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis (Eds.), LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings (pp. 3346-3352). Marseille, FR: European Language Resources Association (ELRA).
Griebel, T., & Heinrich, P. (2020). The Cultural Political Economy of Brexit in the Age of Austerity. In Griebel T, Evert S, Heinrich P (Eds.), Multimodal Approaches to Media Discourses: Reconstructing the Age of Austerity in the United Kingdom (pp. 163-188). London: Routledge.
Proisl, T., Dykes, N., Heinrich, P., Kabashi, B., Blombach, A., & Evert, S. (2020). EmpiriST Corpus 2.0: Adding Manual Normalization, Lemmatization and Semantic Tagging to a German Web and CMC Corpus. In Nicoletta Calzolari, Frederic Bechet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis (Eds.), LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings (pp. 6142-6148). Marseille, FR: European Language Resources Association (ELRA).
Proisl, T., & Lapesa, G. (2020). KLUMSy@KIPoS: Experiments on Part-of-Speech Tagging of Spoken Italian. In Basile V, Croce D, Di Maro M, Passaro L (Eds.), Proceedings of the 7th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA 2020). Online: CEUR-WS.org.
Kabashi, B. (2019). Collecting collocations for the Albanian language. In Iztok Kosem, Tanara Zingano Kuhn, Margarita Correia, Jose Pedro Ferreira, Maarten Jansen, Isabel Pereira, Jelena Kallas, Milos Jakubicek, Simon Krek, Carole Tiberius (Eds.), Proceedings of Electronic Lexicography in the 21st Century Conference (pp. 478-489). Sintra, PT: Lexical Computing CZ s.r.o.
Kabashi, B., & Proisl, T. (2018). Albanian Part-of-Speech Tagging: Gold Standard and Evaluation. In Calzolari N, Choukri K, Cieri C, Declerck T, Goggi S, Hasida K, Isahara H, Maegaard B, Mariani J, Mazo H, Moreno A, Odijk J, Piperidis S, Tokunaga T (Eds.), Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) (pp. 2593–2599). Miyazaki, JP: European Language Resources Association.
Proisl, T. (2018). SoMeWeTa: A Part-of-Speech Tagger for German Social Media and Web Texts. In Calzolari N, Choukri K, Cieri C, Declerck T, Goggi S, Hasida K, Isahara H, Maegaard B, Mariani J, Mazo H, Moreno A, Odijk J, Piperidis S, Tokunaga T (Eds.), Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) (pp. 665–670). Miyazaki, JP: European Language Resources Association.
Proisl, T., Evert, S., Jannidis, F., Schöch, C., Konle, L., & Pielström, S. (2018). Delta vs. N-Gram Tracing: Evaluating the Robustness of Authorship Attribution Methods. In Calzolari N, Choukri K, Cieri C, Declerck T, Goggi S, Hasida K, Isahara H, Maegaard B, Mariani J, Mazo H, Moreno A, Odijk J, Piperidis S, Tokunaga T (Eds.), Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) (pp. 3309–3314). Miyazaki, JP: European Language Resources Association.
Proisl, T., Heinrich, P., Kabashi, B., & Evert, S. (2018). EmotiKLUE at IEST 2018: Topic-Informed Classification of Implicit Emotions. In Balahur A, Mohammad SM, Hoste V, Klinger R (Eds.), Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (pp. 235–242). Brussels, BE: Association for Computational Linguistics.
Evert, S., Heinrich, P., Henselmann, K., Rabenstein, U., Scherr, E., & Schröder, L. (2017). Combining Machine Learning and Semantic Features in the Classification of Corporate Disclosures. In Loukanova R, Liefke K (Eds.), Proceedings of the Workshop on Logic and Algorithms in Computational Linguistics 2017 (LACompLing2017) (pp. 47-62). Stockholm, SE: Stockholm University.
Proisl, T., Heinrich, P., Evert, S., & Kabashi, B. (2017). Translation Inference across Dictionaries via a Combination of Graph-based Methods and Co-occurrence Statistics. In McCrae J, Bond F, Buitelaar P, Cimiano P, Declerck T, Gracia J, Kernerman I, Ponsoda E, Ordan N, Piasecki M (Eds.), Proceedings of the LDK 2017 Workshops: 1st Workshop on the OntoLex Model (OntoLex-2017), Shared Task on Translation Inference Across Dictionaries & Challenges for Wordnets (pp. 94–102). Galway, IE: CEUR.
Kabashi, B., & Proisl, T. (2016). A Proposal for a Part-of-Speech Tagset for the Albanian Language. In Calzolari Nicoletta, Choukri Khalid, Declerck Thierry, Grobelnik Marko, Maegaard Bente, Mariani Joseph, Moreno Asuncion, Odijk Jan, Piperidis Stelios (Eds.), Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016) (pp. 4305–4310). Portorož, SI: Paris: European Language Resources Association (ELRA).
Proisl, T., & Uhrig, P. (2016). SoMaJo: State-of-the-art tokenization for German web and social media texts. In Cook P, Evert S, Schäfer R, Stemle E (Eds.), Proceedings of the 10th Web as Corpus Workshop (WAC-X) and the EmpiriST Shared Task (pp. 57-62). Berlin, DE: Association for Computational Linguistics (ACL).
Plotnikova, N., Kohl, M., Volkert, K., Lerner, A., Dykes, N., Ermer, H., & Evert, S. (2015). KLUEless: Polarity Classification and Association. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015) (pp. 619–625). Denver, Colorado.
Bartsch, S., & Evert, S. (2014). Towards a Firthian Notion of Collocation. In Abel A, Lemnitzer L (Eds.), Vernetzungsstrategien, Zugriffsstrukturen und automatisch ermittelte Angaben in Internetwörterbüchern (pp. 48–61). Mannheim: Institut für Deutsche Sprache.
Evert, S. (2014). Distributional Semantics in R with the wordspace Package. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: System Demonstrations (pp. 110–114). Dublin, Ireland.
Evert, S., Proisl, T., Greiner, P., & Kabashi, B. (2014). SentiKLUE: Updating a polarity classifier in 48 hours. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval-2014) (pp. 551–555). Dublin, Ireland.
Lapesa, G., & Evert, S. (2014). NaDiR: Naive Distributional Response Generation. In Proceedings of the 4th Workshop on Cognitive Aspects of the Lexicon (CogALex) (pp. 50–59). Dublin, Ireland.
Schulze Wettendorf, C., Jegan, R., Körner, A., Zerche, J., Plotnikova, N., Moreth, J.,... Evert, S. (2014). SNAP: A Multi-Stage XML-Pipeline for Aspect Based Sentiment Analysis. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014) (pp. 578-584). Dublin, Ireland.
Biemann, C., Bildhauer, F., Evert, S., Goldhahn, D., Quasthoff, U., Schäfer, R.,... Zesch, T. (2013). Scalable Construction of High-Quality Web Corpora. Journal for language technology and computational linguistics, 28(2), 23–59.
Evert, S. (2013). Tools for the acquisition of lexical combinatorics. In Gouws RH, Heid U, Schweickard W, Wiegand HE (Eds.), Dictionaries. An International Encyclopedia of Lexicography. Supplementary volume: Recent Developments with Focus on Electronic and Computational Lexicography (HSK 5.4) (pp. 1415–1432). Berlin, New York: Mouton de Gruyter.
Greiner, P., Proisl, T., Evert, S., & Kabashi, B. (2013). KLUE-CORE: A regression model of semantic textual similarity. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 1: Proceedings of the Main Conference and the Shared Task: Semantic Textual Similarity (pp. 181–186). Atlanta, Georgia, USA: Association for Computational Linguistics.
Proisl, T., Greiner, P., Evert, S., & Kabashi, B. (2013). KLUE: Simple and robust methods for polarity classification. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (pp. 395–401). Atlanta, GA: Association for Computational Linguistics.