Rami Al-Rfou

Staff Research Scientist

Google Research

Biography

Rami Al-Rfou is a Staff Research Scientist at Google Research. In his current role, he is a technical lead for assisted writing applications such as SmartReply. His research focuses on improving the pretraining of large language models through token-free architectures, synthetic datasets constructed with knowledge-base-driven generative models, and improved sampling strategies for multilingual datasets. These pretrained language models, trained on more than 100 languages, are used in query understanding, web page understanding, semantic search, and response ranking in conversations.

Al-Rfou’s research goes beyond language into designing better architectures to understand large-scale data such as graphs. He repurposes language modeling tools to produce novel graph learning algorithms that measure node and graph similarities. These modeling ideas have been deployed at large scale for spam detection and personalization applications.

Al-Rfou received his PhD in Computer Science from Stony Brook University in 2015 under the supervision of Prof. Steven Skiena. He investigated how to use deep learning representations to build a massively multilingual NLP pipeline that supports more than 100 languages. Massively multilingual modeling has gained significant momentum in the years since. Al-Rfou’s experience in sequential modeling and crosslingual applications spans 10 years of academic and industrial research, with applications that have touched the lives of millions of users and open-source code that has helped thousands of students.

Experience

Staff Research Scientist

Google Research

Sep 2020 – Present Mountain View, CA

Responsibilities include:

  • SmartReply Technical Lead
  • Deep Retrieval Research Lead

Senior Research Scientist

Google Research

Apr 2017 – Sep 2020 Mountain View, CA

Software Engineer

Google Research

Jun 2015 – Apr 2017 Mountain View, CA

Research Intern

Microsoft Research

Jun 2013 – Aug 2013 New York City, NY
Host: Leon Bottou
“Investigated new ways to improve semi-supervised learning with word embeddings.”

Research Intern

Google Research

Jun 2012 – Aug 2012 Mountain View, CA
Host: Jay Ponte
“Developed a language-independent, semi-supervised method for multilingual coreference resolution utilizing word embeddings and a fine-tuned dual-encoder ranking model.”

Software Engineer Intern

Google

Jun 2011 – Aug 2011 Mountain View, CA
Host: Mario Guajardo
“Developed a visualization system for the internal networks of Google’s data centers.”

Education

PhD in Natural Language Processing

Stony Brook University

Sep 2010 – Jun 2015 Stony Brook, NY
Dissertation: Polyglot: A Massive Multilingual Natural Language Processing Pipeline. Adviser: Steven Skiena.
Committee: Yejin Choi, Leman Akoglu, Leon Bottou

BSc. in Computer Engineering

University of Jordan

Sep 2004 – Feb 2009 Amman, Jordan
Dissertation: TCP Performance over Wireless Networks: Analysis & Simulation.
GPA: 3.79/4.0

Patents

  • Systems and Methods for Determining Graph Similarity
    US Patent Application US16/850,570

  • Selective text prediction for electronic messaging
    US Patent Application US15/852,916

  • Cooperatively training and/or using separate input and subsequent content neural networks for information retrieval
    US Patent Application US15/476,280

  • Cooperatively training and/or using separate input and response neural network models for determining response(s) for electronic communications
    US Patent Application US15/476,292

  • Iteratively learning coreference embeddings of noun phrases using feature representations that include distributed word representations of the noun phrases
    Issued Oct 02, 2017 US 9514098 B1

Contact