Research



cwt [AT] mit.edu

Hi, I’m Chris Tanner.

I research and teach Natural Language Processing (NLP) and Machine Learning.

I am the Head of R&D at Kensho, an amazing 100-person ML/NLP firm headquartered in Harvard Square. I am building a research lab and am actively hiring Research Scientists! Additionally, I hold a joint faculty appointment at MIT, where I teach NLP in the Fall and Machine Learning in the Spring.

Currently, my research mostly concerns LLMs and includes tokenization, numeric reasoning, long-form question answering, building evaluation tasks, and alignment.

Before joining Kensho and MIT, I spent three wonderful years teaching full-time at Harvard’s Institute for Applied Computational Science (IACS), which centers on two Master’s programs: Data Science, and Computational Science and Engineering.

I received my PhD in Computer Science (NLP) from Brown University, where I was fortunate to be Eugene Charniak’s final student. My "grand-adviser" is Marvin Minsky (me --> Eugene --> Marvin). Before that, I worked at MIT Lincoln Lab as an Associate Staff NLP researcher from 2009 to 2012.

Current hobbies include woodworking, designing and sewing hiking gear, and going on challenging hikes.

FALL 2023

  1. "Advanced NLP" aka 6.861 at MIT (500 students). Topics include text classification, language modelling, seq2seq models, transformers, and structured models. Students also complete a significant research project.


PAST STUDENTS

  • Haoran Zhang (currently Harvard Master’s)

  • Xiaohan Yang (Harvard Master’s Thesis 2022 -> Apple)

  • Anita Mahinpei (Harvard Master’s Thesis 2022)

  • Xin Zeng (Harvard Master’s Thesis 2022)

  • Jack Scudder (Harvard Master’s Thesis 2022 -> West Point Instructor)

  • Xavier Evans (Harvard Undergrad Independent Study)

  • Ning Hua (Smith x Harvard Independent Study)

  • Jie Sun (Harvard Independent Study -> Co-founded basys.ai)

  • Yoel Zweig (ETH Zurich Master’s Thesis ‘21)

  • Ali Hindy (High School -> Stanford CS)

  • Thomas Fouts (High School -> University of Michigan ME)

  • Mingyue Wei (Harvard Master's 2021 -> Amazon)

  • Alessandro Stolfo (ETH Zurich Master’s ‘21 -> PhD program)

  • Brendan Falk (Harvard ‘20 -> CEO @ Fig)


CURRENT RESEARCH PROJECTS

These projects are active and ongoing; I meet weekly with collaborators to advise and contribute to each:

HUMBLE NLP: An Annotation Suite (Shivas Jayaram, Eduardo Peynetti, Joe Brucker, Vasco Meerman, and Chris Tanner). In Progress.

We are building an easy-to-use annotation platform, with the initial goal of producing the largest and highest-quality event coreference dataset to date.

A Commonsense Approach to Event Coreference Resolution (Sahithya Ravi, Chris Tanner, and Vered Shwartz). In Progress.

We are aiming to improve coreference resolution by injecting commonsense knowledge.

A Simple Unsupervised Approach for Coreference Resolution using Rule-based Weak Supervision (Alessandro Stolfo, Mrinmaya Sachan, Vikram Gupta, and Chris Tanner). In Submission.

We are developing an unsupervised approach to coreference resolution that relies on rule-based weak supervision rather than labeled data.

An Analysis of Model Robustness and Catastrophic Forgetting (Qiang Fei, Yingsi Jian, Mingyue Wei, Shuyuan Xiao, Shahab Asoodeh, Chris Tanner, and Ekin Dogus Cubuk). In Preparation.

For my Capstone course, students partnered with Google Brain to understand catastrophic forgetting, a phenomenon in neural models. We collectively extended this work for a publication. [Blog Overview] [Slides] [Poster] [Poster Video]

Bringing BERT to the field: Transformer models for gene expression prediction in maize (Benjamin Levy, Zihao Xu, Liyang Zhao, Shuying Ni, Phoebe Wong, Ross Karl Kremling, Ross Altman, and Chris Tanner). Preparing for Nature’s Scientific Review.

For my Capstone course, students partnered with Inari to predict gene expression. We collectively extended this work for a publication. [Blog Overview] [Slides] [Poster] [Poster Video]

MASTER’S THESES

End-to-end Entity Linking (Mingyue Wei and Chris Tanner). In Progress.

Mingyue is finishing her thesis research on end-to-end entity linking.

Active Learning for Coreference Resolution (Xiaohan Yang and Chris Tanner). In Progress.

Xiaohan is developing active learning approaches to mitigate the time-intensive nature of annotating coreference resolution data.

Automated Figure Captioning to Assist the Visually Impaired (Anita Mahinpei, Zona Kostic, and Chris Tanner). In Progress.

Anita is developing models to automatically caption data visualizations and figures, which can be highly nuanced and technical.

Symbiotic Coreference Resolution for Entities and Events (Xin Zeng and Chris Tanner). In Progress.

Xin is developing a new approach for jointly performing entity and event coreference.

Commonsense-based Adversarial Attacks (Jack Scudder and Chris Tanner). In Progress.

Jack is researching adversarial attacks in NLP, with a particular focus on commonsense reasoning.


EXPERIENCE

Across my career in academia, industry, and government, my work has spanned:

  • coreference resolution

  • sign language classification

  • natural language understanding (NLU)

  • entity linking

  • citation prediction

  • face recognition

  • topic modelling

  • machine translation

  • streaming algorithms for NLP

  • anomaly detection

  • adaptive web personalization

  • speech recognition via active learning

  • error-correcting codes

  • social network analysis

  • 2D pattern recognition

  • animat-based learning (swarm intelligence)

INVITED TALKS


2020

  • November 20 — Research Talk @ Florida Institute of Technology

  • October 15 — Career Advice @ Florida Institute of Technology

  • May 19 — Open Data Science Conference (ODSC)

  • January 23 — Sequential Data @ Harvard ComputeFest

2019

  • October 27 — RDMeetsIT Panel @ MIT Media Lab + Mercedes-Benz

  • September 27 — PhD Alumni Panel @ Brown

  • April 1 — MIT

  • March 15 — University of Washington

  • March 11 — Coreference Resolution @ Invitae

  • March 6 — CMU

  • February 21 — Brown

  • February 15 — Harvard