Hao Sun

Hello, everybody! I'm Hao Sun. I'm a computational linguist and cognitive scientist by training.

I recently received my Ph.D. from the Linguistics Department at the University at Buffalo.

I'll start working as an Artificial Intelligence Scientist for Astound.AI very soon.

Email: hsun7@buffalo.edu

General Research Interests

My approach is empirical and falls under the broad umbrella of cognitive science, specifically the information-processing approach. My long-term goal is to theorize language as a computational system with multiple levels of abstraction.

1. Computational linguistics and natural language processing provide a pool of tools for analyzing and modeling language comprehension, production, learning, and linguistic diversity.

2. Psycholinguistics and neurolinguistics provide empirical evidence of how speakers actually use language.

3. Linguistic theories, typological investigations and diachronic processes play a key role in the understanding of language as a computational system.

Research Projects

1. Corpus study of language development: analyzing frequency and semantic patterns of naturalistic language production in the CHILDES corpora.

2. Cognitive modeling: simulating the acquisition of verb argument structures.

3. Grammar induction/unsupervised parsing using nonparametric Bayesian models, in particular the Chinese Restaurant Process.

4. Computational typology and computational historical linguistics: adapting methodologies from the computational biology/bioinformatics literature.


Publications

1. Sun, H. (2010) Complementary or supplementary: a regression study of Taobao user online behavior. Undergraduate Thesis. Best thesis in the Department of Management Information Systems. Fudan University, China.

2. Sun, H. & Koenig, J. P. (2017) There are more valence alternations than the ditransitive. Proceedings of the 43rd annual meeting of the Berkeley Linguistics Society. [pdf]

3. Sun, H. & Pate, J. (2017) The semantic spaces of child speech, child-directed speech and adult-directed speech: a manifold perspective. Proceedings of the 39th annual meeting of the Cognitive Science Society. [pdf]

4. Sun, H. (2018) Association and extension in argument structure learning: a computational perspective. Doctoral Thesis. University at Buffalo. [pdf]


Languages

As a linguist with a passion for learning languages, I speak quite a few of them. My mother tongue is Mandarin. Growing up in Shanghai, I picked up some Shanghai Wu. I speak English at near-native proficiency. I passed the Japanese Language Proficiency Test, Level 1 (the advanced level), in 2011. I also speak French at an intermediate level.


Labs

Computational Linguistics Lab. Linguistics Department, University at Buffalo (Fall 2015-Fall 2017)
The Computational Linguistics Lab, led by Dr. John Pate, focused on unsupervised parsing, cognitive modeling, and computational historical linguistics.

Psycholinguistics Lab. Psychology Department, University at Buffalo (Fall 2013-Spring 2018)
The Psycholinguistics Lab, led by Dr. Gail Mauner, studies language comprehension using a variety of techniques, including reading and eye tracking.

The Tesserae Project. Classics Department, University at Buffalo (Fall 2011-Spring 2012)
Tesserae is a bilingual information retrieval project in the digital humanities, led by Dr. Neil Coffee. Using state-of-the-art information retrieval and machine learning algorithms, it automatically retrieves instances of intertextuality in Latin and ancient Greek texts. I worked on the semantic similarity team and used vector space models to measure sentence similarity.

Previous Work

Exchange Program Coordinator: Summer Language Exchange Program--Chinese Language and Culture, University at Buffalo, USA & Capital Normal University, China

Research Assistant: Computational Linguistics Lab

Language Instructor: Chinese 101, 102