Matei Zaharia

Alma mater: UC Berkeley (Ph.D.), University of Waterloo (B.Math.)
Known for: Apache Spark, Apache Mesos
Scientific career
Fields: Computer Science
Institutions: Stanford University
Thesis: An Architecture for Fast and General Data Processing on Large Clusters (2013)
Doctoral advisors: Ion Stoica, Scott Shenker

Matei Zaharia is a Romanian-Canadian computer scientist specializing in big data, distributed systems, and cloud computing. He is a co-founder and chief technologist of Databricks, and an assistant professor of computer science at Stanford University.[1]


Matei Zaharia was born in Romania. His family later moved to Canada, where he attended Jarvis Collegiate Institute in Toronto for high school and then studied computer science at the University of Waterloo. While at university, he helped program the video game 0 A.D. Upon graduating from the University of Waterloo, he received the Governor General's academic silver medal for highest academic standing. He went on to study at the University of California, Berkeley, where he received a Ph.D. in computer science in 2013 under the supervision of Ion Stoica and Scott Shenker.[2]

He participated in programming contests, winning two silver medals at the International Olympiad in Informatics (IOI) while in high school. He was on the University of Waterloo team that competed in the ACM ICPC programming competition in 2004 and 2005. The team placed 15th in 2004 and won a gold medal in 2005, finishing 3rd worldwide.[3] Both times the team won the title of North American champions.[citation needed]

During his Ph.D. studies, he created the Apache Spark project[4] and co-created the Apache Mesos project. He also designed and implemented one of the two core schedulers used in Apache Hadoop.[5]

He received two Best Paper awards, at NSDI 2012 and SIGCOMM 2012, an Honorable Mention for the Community Award at NSDI 2012, and a Best Demo Award at SIGMOD 2012. Together with Reynold Xin, Parviz Deyhim, Xiangrui Meng, and Ali Ghodsi, he holds the 2014 world record in the Daytona GraySort benchmark, set using Apache Spark.[6] In 2015 he received the ACM Doctoral Dissertation Award.[7]


  1. "How Companies are Using Spark, and Where the Edge in Big Data Will Be". Strata Conference. Retrieved 26 August 2014.
  2. Zaharia, Matei. "An Architecture for Fast and General Data Processing on Large Clusters" (PDF). University of California, Berkeley. Retrieved 29 June 2015.
  3. "Programming Contest Resources".
  4. "Spark: Cluster computing with working sets" (PDF).
  5. "Delay Scheduling: A Simple Technique for Achieving Locality and Fairness in Cluster Scheduling" (PDF).
  6. "Sort Benchmark".
  7. "ACM Doctoral Dissertation Award 2015".

External links