Education

University of California, Santa Barbara, CA

  • Ph.D., Computer Science, 2009
    • Dissertation: Mining Web Social Data with Latent Aspect Models on Distributed Computers
    • Research interests: data mining, machine learning, parallel computing, and their applications to information recommendation in social networks.
    • Advisor: Professor Edward Y. Chang

National Chiao Tung University, Hsinchu, Taiwan

Professional Experience

Yahoo! Inc., Sunnyvale, CA

  • Research Engineer, 2009 - present
    • Member of Advanced Products team.
    • Work on research, new initiatives and data mining/analysis for Yahoo! Open Strategy (YOS).

Google Inc., Beijing, China/Mountain View, CA

  • Research Intern, 2007 - 2009
    • Member of Intelligent Information team.
    • Researched and implemented latent aspect models, including probabilistic graphical models and spectral clustering. Proposed and parallelized a Combinational Collaborative Filtering (CCF) model for personalized community recommendation using Orkut data. Investigated the speedup issue in parallel spectral clustering and performed experiments on large-scale data sets such as RCV and Picasa image data.
    • Researched and implemented latent aspect models and association rules mining, including Latent Dirichlet Allocation (LDA) and FP-growth. Compared LDA with FP-growth for community recommendation tasks and explored the latent information introduced by LDA. Conducted experiments on the Orkut data set.
    • Focused on information recommendation tasks in social networks, such as personalized community/ads recommendation and image/community clustering. To handle large-scale data sets efficiently, we parallelized both memory use and computation on distributed computers with MapReduce and MPI.
    • Supervisor: Dr. Edward Y. Chang

NEC Labs America, Cupertino, CA

  • Research Intern, 2006
    • Member of Adaptive Information and Knowledge Organization (AIKO).
    • Analyzed anonymized personal email and daily browsing behavior to discover novel information matching users' interests. Completed framework design and module implementations (with Eclipse) for a project called "Sharing novel information in personal social networks." Researched ego-centric social networks, user modeling, and novelty detection.
    • Supervisor: Dr. Belle L. Tseng

IBM Corp., Taipei, Taiwan

  • Summer Intern, 2005
    • Member of Pervasive Computing (PvC) department.
    • Developed J2EE applications to automate unit tests and functional tests using IBM Web Everyplace Access (WEA). Designed a detection module for garbage characters to provide multi-language support for an IBM WebSphere product. Studied automation tools on handheld devices.
    • Supervisor: Lily Chen
Academic Experience

University of California, Santa Barbara, CA

  • Staff Research Associate, 2009
    • Sped up Latent Dirichlet Allocation (LDA) by exploring optimizations of the Gibbs sampling algorithm. Enhanced spectral clustering by exploring the use of Locality Sensitive Hashing (LSH) to speed up nearest-neighbor computations.
    • Advisor: Professor Edward Y. Chang
  • Graduate Student Researcher, 2006 - 2009
    • Member of Multimedia Database Lab.
    • Research projects: (1) Analysis of collected datasets to discover the relation between eigen-gaps in the similarity matrices and clustering quality, (2) Fotofiti: automated semantic annotation of digital photos based on features extracted from context and content, (3) Fotowiki: visual and textual information integrated with maps on a wiki-based platform, (4) Mining and analysis of latent information from online social networks, and (5) Research into social network analysis and novelty detection.
    • Advisor: Professor Edward Y. Chang
  • Teaching Assistant, 2004 - 2006
    • Led weekly discussion sections, prepared and graded homework, and held office hours.

National Chiao Tung University, Hsinchu, Taiwan

  • Graduate Student Researcher, 2002 - 2004
    • Member of Multimedia Communication Lab and Hewlett-Packard sponsored project (e-NCTU: A 21st Century Mobile Campus).
    • Studied packet classification and fast address lookup supporting IPv6. Researched routing schemes for seamless handover in mobile networks. Designed and implemented an agent-based mechanism for improving the quality of service over wireless networks using C++ and NS-2.
    • Completed framework design and development of an e-ticketing system. Integrated the system with short message service (SMS), and deployed multimedia news web server/streaming services.
    • Advisor: Professor Yaw-Chung Chen
  • Undergraduate Student Researcher, 2000 - 2001
    • Member of network security research project (Integrated Network Security Protection System).
    • Analyzed vulnerabilities in network protocols and studied intrusion detection systems (IDS). Completed the framework design and development of an auto-patching system and a remote scanner. The entire system is written in C++/Perl/Tcl/Tk and runs on Linux and MS-Windows. This work received the Research Creativity Award from the National Science Council.
    • Advisor: Professor Shiuh-Pyng Shieh
  • Teaching Assistant, 2002 - 2004
    • Prepared and graded assignments, and held office hours for CS1648: Programming Languages.
    • Built an IPv6 tunnel broker for the TANet (Taiwan Academic Network) IPv6 testing bed. Developed a web application facilitating e-document management.

Cisco Academy Training Center, Hsinchu, Taiwan

  • Academy Instructor, 2002 - 2004
    • Led classes and hands-on labs about routers and switches.
    • Taught five Cisco Certified Network Associate (CCNA) courses: Orientation, Networking Basics, Routers and Routing Basics, Switching and Intermediate Routing, and WAN Technologies.
Publications

Book Chapters

  1. Large-scale Spectral Clustering with MapReduce and MPI
    Wen-Yen Chen, Yangqiu Song, Hongjie Bai, Chih-Jen Lin, and Edward Y. Chang
    An invited book chapter to appear in Scaling Up Machine Learning, Cambridge University Press, 2011


  2. Combinational Collaborative Filtering for Personalized Community Recommendation
    Wen-Yen Chen and Edward Y. Chang
    A book chapter to appear in Foundation of Multimedia Information Retrieval, Springer, 2011


Journal Papers

  1. Parallel Spectral Clustering in Distributed Systems
    Wen-Yen Chen, Yangqiu Song, Hongjie Bai, Chih-Jen Lin, and Edward Y. Chang
    IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), Vol. 33, No. 3, pp. 568-586, March 2011
    [PDF (2.5MB)] [Software (MATLAB)] [Software (C++)]


  2. An Agent-Based Metric for Quality of Services over Wireless Networks
    Yaw-Chung Chen and Wen-Yen Chen
    Elsevier Journal of Systems and Software (JSS), Vol. 81, No. 10, pp. 1625-1639, October 2008
    [PDF (1.3MB)]


Conference Papers

  1. Collaborative Filtering for Orkut Communities: Discovery of User Latent Behavior
    Wen-Yen Chen, Jon Chu, Junyi Luan, Hongjie Bai, Yi Wang, and Edward Y. Chang
    International World Wide Web Conference (WWW)
    Madrid, Spain, April 2009 (acceptance rate: 11%).
    [PDF (320KB)]


  2. PLDA: Parallel Latent Dirichlet Allocation
    Yi Wang, Hongjie Bai, Matt Stanton, Wen-Yen Chen, and Edward Y. Chang
    International Conference on Algorithmic Aspects in Information and Management (AAIM)
    San Francisco, CA, June 2009.
    [PDF (217KB)] [Software]


  3. Combinational Collaborative Filtering for Personalized Community Recommendation
    Wen-Yen Chen, Dong Zhang, and Edward Y. Chang
    ACM SIGKDD Int'l Conference on Knowledge Discovery and Data Mining (KDD)
    Las Vegas, NV, August 2008 (acceptance rate: 10%).
    [PDF (473KB)]


  4. Parallel Spectral Clustering
    Yangqiu Song, Wen-Yen Chen, Hongjie Bai, Chih-Jen Lin, and Edward Y. Chang
    European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD)
    Antwerp, Belgium, September 2008 (acceptance rate: 18%).
    Also appears in Lecture Notes in Artificial Intelligence (LNAI), Vol. 5212, pp. 374-389, 2008.
    [PDF (4.4MB)]


  5. The Splog Detection Task and A Solution Based on Temporal and Link Properties
    Y.-R. Lin, W.-Y. Chen, X. Shi, R. Sia, X. Song, Y. Chi, K. Hino, H. Sundaram, J. Tatemura, and B. Tseng
    Text REtrieval Conference Proceedings (TREC)
    Gaithersburg, MD, November 2006.
    [PDF (789KB)]


  6. A Scalable Service for Photo Annotation, Sharing, and Search
    Ben Lee, Wen-Yen Chen, and Edward Y. Chang
    ACM Int'l Conference on Multimedia (MM)
    Santa Barbara, CA, October 2006.
    [PDF (4.0MB)]


  7. Fotowiki - Distributed Map Enhancement Service
    Wen-Yen Chen, Ben Lee, and Edward Y. Chang
    ACM Int'l Conference on Multimedia (MM)
    Santa Barbara, CA, October 2006.
    [PDF (1.9MB)]


  8. Fotofiti: Web Service for Photo Management
    Ben Lee, Wen-Yen Chen, and Edward Y. Chang
    ACM Int'l Conference on Multimedia (MM)
    Santa Barbara, CA, October 2006.
    [PDF (262KB)]


  9. An Agent-Based Metric for Quality of Services over Wireless Networks
    Wen-Yen Chen and Yaw-Chung Chen
    IEEE Int'l Computer Software and Applications Conference (COMPSAC)
    Chicago, IL, September 2006.
    [PDF (632KB)] Nominated for Best Paper Award


Thesis Work and Technical Reports

  1. Mining Web Social Data with Latent Aspect Models on Distributed Computers
    Wen-Yen Chen
    PhD Dissertation, UCSB, September 2009.


  2. Fotofiti: Goals, Functionalities, and Research Challenges
    Ben Lee, Zack Davis, Wen-Yen Chen, Arun Qamra, and Edward Y. Chang
    UCSB Technical Report, March 2006.


  3. An Agent-based System for Improving Quality of Service over Wireless Networks
    Wen-Yen Chen
    M.S. Thesis, NCTU, June 2004.


  4. Study and Design of Fast Address Lookup Scheme Supporting IPv6 Protocol
    Yaw-Chung Chen, Yi-Cheng Chan, Kuo-Chen Kuo, and Wen-Yen Chen
    NCTU Technical Report/NSC-90-2213-E-009-167, June 2002.


  5. Integrated Network Security Protection System
    Wen-Yen Chen
    NCTU Technical Report/NSC-89-2815-C009-051R-E, April 2001. Research Creativity Award

Selected Recent Talks
  • Mining Web Social Data with Latent Aspect Models on Distributed Computers, Computer Science Seminar, National Taiwan University (NTU), Taipei, Taiwan, October 2009 (invited talk).
  • Collaborative Filtering for Orkut Communities: Discovery of User Latent Behavior, WWW'09, Madrid, Spain, April 2009.
  • Combinational Collaborative Filtering for Personalized Community Recommendation, KDD'08, Las Vegas, USA, August 2008.
  • Parallel Spectral Clustering, ECML'08, Antwerp, Belgium, September 2008.
  • A Scalable Service for Photo Annotation, Sharing, and Search, MM'06, Santa Barbara, USA, October 2006.
  • Fotowiki - Distributed Map Enhancement Service, MM'06, Santa Barbara, USA, October 2006.
  • Fotofiti: Web Service for Photo Management, MM'06, Santa Barbara, USA, October 2006.
Professional Activities
  • Reviewer
    • ACM Knowledge Discovery and Data Mining (KDD), 2010
    • ACM Multimedia (MM), 2007, 2008
    • ACM Conference on Information and Knowledge Management (CIKM), 2008
    • IEEE Transactions on Multimedia (TMM), 2009
    • IEEE Transactions on Knowledge and Data Engineering (TKDE), 2007
    • IEEE Communications Magazine, 2007
    • Very Large Data Bases (VLDB), 2008, 2009
    • Multimedia Systems Journal (MMSJ), 2007, 2008, 2009
    • Journal of Information Science and Engineering (JISE), 2009
  • Volunteer
    • ACM Multimedia (MM), 2006
  • Member
    • Society for Industrial and Applied Mathematics (SIAM)
Honors and Awards
  • Student Travel Award in WWW Conference, 2009
  • Student Travel Award in ACM SIGKDD Conference, 2008
  • Doctoral Student Travel Grant, University of California Santa Barbara, 2008
  • Graduate Fellowship, University of California Santa Barbara, 2004
  • Graduate Fellowship, National Science Council of Taiwan, 2002
  • Research Creativity Award, National Science Council of Taiwan, 2001
  • Bronze Prize, Research and Development Contest, NCTU, 2001
  • Undergraduate Fellowship, National Science Council of Taiwan, 2000
  • Fourth Place, Nationwide Hackers' Competition, Accton Corp., 2000
  • Pan Wen-Yuan Scholarship, Pan Wen-Yuan Foundation, 1999
  • Academic Achievement Award (Top 3 student), NCTU, 1999
  • Academic Achievement Award (Top 3 student), NCTU, 1998
Released Software

MATLAB spectral clustering package

Parallel Spectral Clustering (PSC)

  • A parallel C++ implementation of Parallel Spectral Clustering. We aim to provide a highly optimized parallel implementation of all the steps of spectral clustering. We use PARPACK as the underlying eigenvalue decomposition package and F2C to compile the Fortran code. The parallelization is based on MPI.

Parallel Latent Dirichlet Allocation (PLDA)

  • A parallel C++ implementation of fast Gibbs sampling of Latent Dirichlet Allocation (LDA). The parallelization is based on MPI.
Computing Skills
  • C/C++, Java, MATLAB, Perl, Python, MySQL, PHP, and HTML.
  • Linux system administration and shell scripting.
Qualifications
  • Cisco Certified Academy Instructor, Cisco Systems
  • Cisco Certified Network Associate, Cisco Systems