[11/2009] We have released our parallel spectral clustering software; please see below.
[08/2009] I will be joining Yahoo! in Sunnyvale, CA.
[04/2009] I successfully defended my dissertation.
I am currently a research engineer at Yahoo!, working on research, new initiatives,
and data mining/analysis. I received my Ph.D. in Computer Science
from U.C. Santa Barbara (UCSB) in 2009. I obtained my B.S./M.S. in
Computer Science from National Chiao Tung University (NCTU). Since January 2006 I have been working with the Multimedia Database Laboratory (MMDB), advised by Prof. Edward Y. Chang. I have interned at IBM (2005), NEC (2006), and Google Beijing Research (2007-2009).
I am interested in the areas of data mining, machine learning, parallel computing, and their applications to
social networks. Specifically, we focus on information recommendation tasks in social networks, including
personalized community recommendation, image and community clustering, and automatic image annotation. To
handle large-scale data sets, we parallelize both memory use and computation across distributed computers.
- Spectral Clustering: A simple and easy-to-use MATLAB package for spectral clustering using a sparse similarity matrix (nearest neighbor) and the Nyström method. This package has ranked among the top 20 downloads on the MLOSS website.
- Parallel Spectral Clustering: A parallel C++ implementation of spectral clustering. We aim to provide a highly optimized parallel implementation of all the steps of spectral clustering. We use PARPACK as the underlying eigenvalue decomposition package and F2C to compile the Fortran code.
- Parallel LDA: A C++ implementation of fast Gibbs sampling for Latent Dirichlet Allocation (LDA). This package uses the Message Passing Interface (MPI) for parallelization.
- 4/26-4/30, 2010 WWW 2010, Raleigh, NC, USA (deadline: Nov 2, 2008)
- 7/25-7/28, 2010 KDD 2010, Washington, DC, USA (deadline: Feb 5, 2009, abstract: Feb 2, 2009)
- 7/18-7/23, 2010 SIGIR 2010, Geneva, Switzerland (deadline: Jan 22, 2009, abstract: Jan 15, 2009)