[02/2010] Our parallel spectral clustering paper was accepted by PAMI; please check here.
[11/2009] We released our parallel spectral clustering software; please check below.
I have joined Yahoo! and mainly work on research, new initiatives
and data mining/analysis. Prior to this, I received my Ph.D. degree
from U.C. Santa Barbara (UCSB) in 2009. I have interned at IBM (2005), NEC Labs (2006), and Google (2007-2009).
My research interests include data mining, machine learning, parallel computing, and their applications to
social networks. Specifically, I focus on information recommendation tasks in social networks and the
parallelization of both memory use and computation on distributed computers.
- Spectral Clustering: A simple, easy-to-use MATLAB package for spectral clustering using a sparse similarity matrix (nearest neighbors) and the Nyström method. This package is among the top 20 downloads on Machine Learning Open Source Software (MLOSS).
- Parallel Spectral Clustering (PSC): A C++ implementation of Parallel Spectral Clustering. We aim to provide a highly optimized parallel implementation of all the steps of spectral clustering. We use PARPACK as the underlying eigenvalue decomposition package and F2C to compile the Fortran code. The parallelization is based on MPI.
- Parallel Latent Dirichlet Allocation (PLDA): A parallel C++ implementation of fast Gibbs sampling of Latent Dirichlet Allocation (LDA). The parallelization is based on MPI.
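
For readers unfamiliar with the sampler behind PLDA, here is a minimal single-machine sketch of collapsed Gibbs sampling for LDA in Python. It is not the PLDA code: it uses the plain sampler rather than the fast variant, there is no MPI, and the function name `lda_gibbs`, its parameters, and the default values of `alpha` and `beta` are assumptions made for illustration only.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=50, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch, not PLDA).

    docs is a list of documents, each a list of integer word ids
    in the range [0, vocab_size).
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                 # n(d, k)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # n(k, w)
    topic_total = [0] * num_topics                               # n(k)
    assign = []                                                  # topic of each token

    # Give every token a random initial topic and record the counts.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                # p(z = t | everything else) is proportional to
                # (n(d,t) + alpha) * (n(t,w) + beta) / (n(t) + V * beta).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]

                # Record the newly sampled topic.
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    return doc_topic, topic_word
```

Calling `lda_gibbs([[0, 1, 0, 1], [2, 3, 2, 3]], num_topics=2, vocab_size=4)` returns the document-topic and topic-word count tables. In this sketch all counts live in one process; a parallel version has to decide how to split the documents and how to keep the topic-word counts consistent across machines, which this example does not attempt.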