Marcos Zampieri

Assistant Professor
School of Computing
George Mason University
Fairfax, VA, USA

email

About

I am an Assistant Professor at the School of Computing at George Mason University and currently a Visiting Assistant Professor at the Department of Computer Science at Duke University.

My research interests are in Computational Linguistics and Natural Language Processing.

My research aims to enhance our understanding of human language and communication while improving the robustness, safety, and accessibility of NLP systems in the era of LLMs. Here are some questions driving my research:

Language variation: How do systemic variation (e.g., diatopic and diachronic) and individual variation (e.g., speaker’s L1 and proficiency level) affect the performance of NLP systems? How do we build corpora, benchmarks, and models that capture language variation, and what linguistic insights do these models reveal?
Multilingual NLP: How robust are NLP systems across languages and language varieties? What happens when models trained on standard, well-resourced languages encounter non-standard, low-resource language data, such as code-mixed and transliterated text?
Education: How are LLMs changing the way we teach and learn? How do we deploy safe and pedagogically aligned NLP systems to help students learn language and computing?

I co-founded the VarDial workshop and have served as its co-organizer since 2014. I have served as Program Chair for SemEval (2025–2026), Tutorial Chair for ACL 2022, and Faculty Advisor for NAACL SRW 2024, alongside regular area chair and senior area chair roles at all major NLP venues.

Recent Selected Publications

For a full list of publications, please see Google Scholar.

Narrate2Nav: Real-Time Visual Navigation with Implicit Language Reasoning in Human-Centric Environments
Amirreza Payandeh, Anuj Pokhrel, Daeun Song, Marcos Zampieri, Xuesu Xiao
ICRA (2026) pdf

TigerLLM - A Family of Bangla Large Language Models
Nishat Raihan, Marcos Zampieri
ACL (2025) pdf

Tracing L1 Interference in English Learner Writing: A Longitudinal Corpus with Error Annotations
Poorvi Acharya, J. Elizabeth Liebl, Dhiman Goswami, Kai North, Marcos Zampieri, Antonios Anastasopoulos
EMNLP (2025) pdf

mHumanEval - A Multilingual Benchmark to Evaluate Large Language Models for Code Generation
Nishat Raihan, Antonios Anastasopoulos, Marcos Zampieri
NAACL (2025) pdf

Bayelemabaga: Creating Resources for Bambara NLP
Allahsera Auguste Tapo, Kevin Assogba, Christopher M Homan, M. Mustafa Rafique, Marcos Zampieri
NAACL (2025) pdf

Large Language Models in Computer Science Education: A Systematic Literature Review
Nishat Raihan, Mohammed Latif Siddiq, Joanna CS Santos, Marcos Zampieri
SIGCSE (2025) pdf

Annotator Reliability Through In-Context Learning
Sujan Dutta, Deepak Pandita, Tharindu Weerasooriya, Marcos Zampieri, Christopher Homan, Ashiqur KhudaBukhsh
AAAI (2025) pdf

A Survey of Multimodal Sarcasm Detection
Shafkat Farabi, Tharindu Ranasinghe, Diptesh Kanojia, Yu Kong, Marcos Zampieri
IJCAI (2024) pdf

Language Variety Identification with True Labels
Marcos Zampieri, Kai North, Tommi Jauhiainen, Mariano Felice, Neha Kumari, Nishant Nair, Yash Bangera
LREC-COLING (2024) pdf

Native Language Identification in Texts: A Survey
Dhiman Goswami, Sharanya Thilagan, Kai North, Shervin Malmasi, Marcos Zampieri
NAACL (2024) pdf

Features of Lexical Complexity: Insights from L1 and L2 Speakers
Kai North, Marcos Zampieri
Frontiers in Artificial Intelligence (2023) url

Lexical Complexity Prediction: An Overview
Kai North, Matthew Shardlow, Marcos Zampieri
ACM Computing Surveys (2023) url

ALEXSIS-PT: A New Resource for Portuguese Lexical Simplification
Kai North, Marcos Zampieri, Tharindu Ranasinghe
COLING (2022) pdf

Handling Extreme Class Imbalance in Technical Logbook Datasets
Farhad Akhbardeh, Cecilia O. Alm, Marcos Zampieri, Travis Desell
ACL (2021) pdf

Books

Automatic Language Identification in Texts

Tommi Jauhiainen, Marcos Zampieri, Timothy Baldwin, Krister Lindén
Synthesis Lectures on Human Language Technologies
Springer (2024)

Similar Languages, Varieties, and Dialects: A Computational Perspective

Marcos Zampieri, Preslav Nakov (Editors)
Studies in Natural Language Processing
Cambridge University Press (2021)

Last Updated: July 2026