CHU, WEI(褚崴)







About Me

I am an R&D team leader and an award-winning researcher with more than 15 years of well-balanced academic and industry experience. I am now a senior director and researcher with Ant Financial, a subsidiary company of Alibaba Group. I lead a team of 100+ researchers and engineers developing cognitive computing services, including platforms for computer vision, natural language understanding and knowledge graphs. Previously I was the director in charge of the distributed machine learning platform for Alibaba Cloud, known as PAI 2.0. Prior to joining Alibaba, I was a team leader at Microsoft Bing, developing personalized search technology. At Yahoo! Labs I worked with colleagues on web-scale user click streams for content optimization via contextual bandits.

My academic interest is to design and implement statistical learning algorithms that convert large-scale machine-readable data into human-understandable knowledge. I previously conducted research at CCLS, Columbia University, on relational Gaussian processes and other pragmatic Bayesian techniques. Before that, I was a postdoctoral research fellow for three years at the Gatsby Computational Neuroscience Unit, UCL, mentored by Zoubin Ghahramani and David L. Wild on statistical machine learning. I received my Ph.D. degree from the National University of Singapore, under the joint guidance of S. Sathiya Keerthi and Chong Jin Ong. I have published 50+ papers at top-tier conferences and journals, received 8000+ citations according to Google Scholar, and earned a Best Paper Award at ACM WSDM and a Best Demo Award at ACM CIKM.


Recent Work

  1. "Question directed graph attention network for numerical reasoning over text", EMNLP 2020, ranked first on AI2's DROP leaderboard

  2. "Knowledge graph and real-life applications", presented at CogX 2020

  3. "SpellGCN: incorporating phonological and visual similarities into language models for Chinese spelling check", ACL 2020, code available on GitHub


Working Experience

  1. Senior Director of Engineering, AI Dept, Ant Group, 2017.08 to present

  2. Director of Engineering, Alibaba Cloud, Alibaba Group, 2014.11 to 2017.08

  3. Principal Applied Scientist Lead, Bing, Microsoft, 2011.05 to 2014.11

  4. Scientist, Yahoo! Labs, 2008.01 to 2011.05

  5. Associate Research Scientist, CCLS, Columbia University, 2006.01 to 2008.01

  6. Research Fellow, Gatsby Unit, University College London, 2003.02 to 2006.01


Publications

  1. F. Xu, M. Wang, W. Zhang, Y. Cheng and W. Chu (2021) Discrimination-aware mechanism for fine-grained representation learning, CVPR 2021 (View Abstract)

  2. W. Hong, P. Guo, W. Zhang, J. Chen and W. Chu (2021) LPSNet: a lightweight solution for fast panoptic segmentation, CVPR 2021 (View Abstract)

  3. K. Chen, W. Xu, X. Cheng, X. Zou, Y. Zhang, L. Song, T. Wang, Y. Qi and W. Chu (2020) Question directed graph attention network for numerical reasoning over text, EMNLP 2020:6759–6768 (View Abstract)

  4. L. Cao, J. Chen and W. Chu (2020) Variational connectionist temporal classification, ECCV 2020:460-476 (View Abstract)

  5. X. Chen, W. Xu, K. Chen, T. Wang, S. Jiang, F. Wang, W. Chu and Y. Qi (2020) SpellGCN: incorporating phonological and visual similarities into language models for Chinese Spelling Check, ACL 2020:871–881 (View Abstract)

  6. X. Lin, W. Jian, J. He, T. Wang, and W. Chu (2020) Generating informative conversational response using recurrent knowledge-interaction and knowledge-copy, ACL 2020:41–52 (View Abstract)

  7. F. Xu, W. Zhang, Y. Cheng and W. Chu (2020) Metric learning with equidistant and equidistributed triplet-based loss for product image search, WWW 2020:57–65 (View Abstract)

  8. S. Wang, B. Zhu, C. Li, M. Wu, J. Zhang, W. Chu, and Y. Qi (2020) Riemannian proximal policy optimization, Computer and Information Science 13(3) (View Abstract)

  9. W. Zhang, Y. Cheng, X. Guo, Q. Guo, J. Wang, Q. Wang, C. Jiang, M. Wang, F. Xu and W. Chu (2020) Automatic car damage assessment system: reading and understanding videos as professional insurance inspectors, AAAI 2020:13646-13647 Demonstration Track (View Abstract)

  10. W. Huang, X. Cheng, K. Chen, T. Wang, W. Chu (2020) Towards fast and accurate neural Chinese word segmentation with multi-criteria learning, COLING 2020:2062-2072 (View Abstract)

  11. C. Li, X. Yan, X. Deng, Y. Qi, W. Chu, L. Song, J. Qiao, J. He and J. Xiong (2019) Latent Dirichlet allocation for Internet price war, AAAI 2019:639-646 (View Abstract)

  12. X. Cheng, W. Xu, T. Wang, W. Chu, W. Huang, K. Chen and J. Hu (2019) Variational semi-supervised aspect-term sentiment analysis via transformer, CoNLL 2019:961-969 (View Abstract)

  13. W. Huang, X. Cheng, T. Wang and W. Chu (2019) BERT-based multi-head selection for joint entity-relation extraction, NLPCC (2) 2019:713-723 (View Abstract)

  14. W. Sui, Q. Zhang, J. Yang and W. Chu (2018) A novel integrated framework for learning both text detection and recognition, ICPR 2018:2233-2238 (View Abstract)

  15. T. Yin, X. Deng, Y. Qi, W. Chu, J. Pan, X. Yan and J. Xiong (2018) Personalized behavior prediction with encoder-to-decoder structure, NAS 2018:1-10 (View Abstract)

  16. J. Yu, M. Qiu, J. Jiang, J. Huang, S. Song, W. Chu and H. Chen (2018) Modelling domain relationships for transfer learning on retrieval-based question answering systems in E-commerce, ACM International Conference on Web Search and Data Mining (WSDM-11):682–690 (View Abstract)

  17. M. Qiu, P. Zhao, K. Zhang, X. Shi, X. Wang, J. Huang and W. Chu (2017) A short-term rainfall prediction model using multi-task convolutional neural networks, IEEE International Conference on Data Mining (ICDM) (View Abstract)

  18. F. Li et al. (2017) AliMe Assist: an intelligent assistant for creating an innovative E-commerce experience, ACM International Conference on Information and Knowledge Management (CIKM) (View Abstract) Winner of the Best Demo Award

  19. M. Qiu, et al. (2017) AliMe Chat: a sequence to sequence and rerank based ChatBot engine, Annual Meeting of the Association for Computational Linguistics (ACL-55 Short Paper) (View Abstract)

  20. J. Yang, Y. Chen, S. Wang, L. Li, C. Meng, M. Qiu, W. Chu (2017) Practical lessons of distributed deep learning, Workshop on Principled Approaches to Deep Learning, at ICML (View Abstract)

  21. B. Bi, H. Ma, B. Hsu, W. Chu, K. Wang and J. Cho (2015) Learning to recommend related entities to search users, ACM International Conference on Web Search and Data Mining (WSDM-08):139–148 (View Abstract)

  22. J. Yan, W. Chu, R. W. White (2014) Cohort modeling for enhanced personalized search, ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR-37) (View Abstract)

  23. X. Li, C. Guo, W. Chu, Y. Wang, J. Shavlik (2014) Deep learning powered in-session contextual ranking using clickthrough data, Workshop on Personalization: Methods and Applications, at Neural Information Processing Systems (NIPS) (View Abstract)

  24. H. Wang, X. He, M. Chang, Y. Song, R. W. White, W. Chu (2013) Personalized ranking model adaptation for web search, ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR-36) (View Abstract)

  25. R. W. White, W. Chu, A. Hassan, X. He, Y. Song, H. Wang (2013) Enhancing personalized search by mining and modeling task behavior, International World Wide Web Conference (WWW-22) (View Abstract)

  26. H. Wang, Y. Song, M. Chang, X. He, R. W. White, W. Chu (2013) Learning to extract cross-session search tasks, International World Wide Web Conference (WWW-22):1353–1364 (View Abstract)

  27. T. Moon, W. Chu, L. Li, Z. Zheng, Y. Chang (2012) An online learning framework for refining recency search results with user click feedback, Transactions on Information Systems 30(4) (View Abstract)

  28. L. Li, W. Chu, J. Langford, T. Moon, and X. Wang (2012) An unbiased offline evaluation of contextual bandit algorithms with generalized linear models, Journal of Machine Learning Research - Workshop and Conference Proceedings 26 (JMLR W&CP-26) (View Abstract)

  29. P. Bennett, R. W. White, W. Chu, S. Dumais, P. Bailey, F. Borisyuk and X. Cui (2012) Modeling and measuring the impact of short and long-term behavior on search personalization, ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR-35) (View Abstract)

  30. W. Chu, M. Zinkevich, L. Li, A. Thomas, and B. Tseng (2011) Unbiased online active learning in data streams, ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD-17) (View Abstract)

  31. L. Zhang, J. Yang, W. Chu, and B. Tseng (2011) A machine-learned proactive moderation system for auction fraud detection, ACM Conference on Information and Knowledge Management (CIKM-20 Short Paper) (View Abstract)

  32. L. Li, W. Chu, J. Langford and X. Wang (2011) Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms, ACM International Conference on Web Search and Data Mining (WSDM-04) 297-306 (View Abstract) Winner of the Best Paper Award

  33. W. Chu, L. Li, L. Reyzin, and R. E. Schapire (2011) Contextual bandits with linear payoff functions, International Conference on Artificial Intelligence and Statistics (AISTATS-14) (View Abstract)

  34. T. Moon, L. Li, W. Chu, C. Liao, Z. Zheng and Y. Chang (2010) Online learning for recency search ranking using real-time user feedback, International Conference on Information and Knowledge Management (CIKM-19 Short Paper) 1501-1504 (View Abstract)

  35. L. Li, W. Chu, J. Langford and R. E. Schapire (2010) A contextual-bandit approach to personalized news article recommendation, International World Wide Web Conference (WWW-19) 661-670 (View Abstract)

  36. S.-T. Park and W. Chu (2009) Pairwise preference regression for cold-start recommendation, ACM Recommender Systems (RecSys-03):21-28 (View Abstract)

  37. W. Chu and Z. Ghahramani (2009) Probabilistic models for incomplete multi-dimensional arrays, International Conference on Artificial Intelligence and Statistics (AISTATS-12):89-96 (View Abstract)

  38. W. Chu and S.-T. Park (2009) Personalized recommendation on dynamic content using predictive bilinear models, International World Wide Web Conference (WWW-18):692-700 (View Abstract)

  39. W. Chu, et al. (2009) A case study of behavior-driven conjoint analysis on Yahoo! Front Page Today Module, ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD-15 Industry Track):1097-1104 (View Abstract)

  40. R. Silva, W. Chu and Z. Ghahramani (2007) Hidden common cause relations in relational learning, Neural Information Processing Systems (NIPS-20):1345-1352 (View Abstract)

  41. K. Yu and W. Chu (2007) Gaussian process models for link analysis and transfer learning, Neural Information Processing Systems (NIPS-20):1657-1664 (View Abstract)

  42. P. K. Shivaswamy, W. Chu and M. Jansche (2007) A support vector approach to censored targets, IEEE International Conference on Data Mining (ICDM-07):655-660 (View Abstract)

  43. W. Chu and S. S. Keerthi (2007) Support vector ordinal regression, Neural Computation 19(3):792-815 (View Abstract)

  44. V. Sindhwani, W. Chu and S. S. Keerthi (2007) Semi-supervised Gaussian process classifiers, International Joint Conferences on Artificial Intelligence (IJCAI-20):1059-1064 (View Abstract)

  45. W. Chu, V. Sindhwani, Z. Ghahramani and S. S. Keerthi (2006) Relational learning with Gaussian processes, Neural Information Processing Systems (NIPS-19):289-296 (View Abstract)

  46. K. Yu, W. Chu, S. Yu, V. Tresp and Z. Xu (2006) Stochastic relational models for discriminative link prediction, Neural Information Processing Systems (NIPS-19):1553-1560 (View Abstract)

  47. S. K. Shevade and W. Chu (2006) Minimum enclosing spheres formulations for support vector ordinal regression, IEEE International Conference on Data Mining (ICDM-06):1054-1058 (View Abstract)

  48. W. Chu, Z. Ghahramani, R. Krause and D. L. Wild (2006) Identifying protein complexes in high-throughput protein interaction screens using an infinite latent feature model, Pacific Symposium on Biocomputing (PSB-11):231-242 (View Abstract)

  49. W. Chu (2006) Model selection: an empirical study on two kernel classifiers, International Joint Conference on Neural Networks (IJCNN-06):1673-1679

  50. W. Chu, Z. Ghahramani, A. Podtelezhnikov and D. L. Wild (2006) Bayesian segmental models with multiple sequence alignment profiles for protein secondary structure and contact map prediction, IEEE/ACM Transactions on Computational Biology and Bioinformatics 3(2):98-113 (View Abstract)

  51. W. Chu, S. S. Keerthi, C. J. Ong and Z. Ghahramani (2006) Bayesian support vector machines for feature ranking and selection, in I. Guyon, S. Gunn, M. Nikravesh, and L. Zadeh, editors, Feature Extraction, Foundations and Applications, Springer:403-418

  52. W. Chu, Z. Ghahramani, F. Falciani and D. L. Wild (2005)  Biomarker discovery with Gaussian processes in microarray gene expression data,  Bioinformatics 2005(21):3385-3393 (View Abstract)

  53. W. Chu and Z. Ghahramani (2005)  Gaussian processes for ordinal regression,  Journal of Machine Learning Research 6(Jul):1019-1041 (View Abstract)

  54. W. Chu, C. J. Ong and S. S. Keerthi (2005)  An improved conjugate gradient scheme to the solution of least squares SVM,  IEEE Transactions on Neural Networks 16(2):498-501 (View Abstract)

  55. S. S. Keerthi and W. Chu (2005)  A matching pursuit approach to sparse Gaussian process regression, Neural Information Processing Systems (NIPS-18):643-650 (View Abstract)

  56. W. Chu and Z. Ghahramani (2005)  Preference learning with Gaussian processes, International Conference on Machine Learning (ICML-22):137-144 (View Abstract)

  57. W. Chu and S. S. Keerthi (2005)  New approaches to support vector ordinal regression,  International Conference on Machine Learning (ICML-22):145-152 (View Abstract)

  58. W. Chu and Z. Ghahramani (2005) Extensions of Gaussian processes for ranking: semi-supervised and active learning, Workshop on Learning to Rank at NIPS (NIPS-18):29-34 (View Abstract)

  59. W. Chu, Z. Ghahramani and D. L. Wild (2004)  A graphical model for protein secondary structure prediction,  International Conference on Machine Learning (ICML-21):161-168 (View Abstract)

  60. W. Chu, Z. Ghahramani and D. L. Wild (2004)  Protein secondary structure prediction using sigmoid belief networks to parameterize segmental semi-Markov models,  European Symposium on Artificial Neural Networks (ESANN-05):81-86

  61. W. Chu, S. S. Keerthi and C. J. Ong (2004) Bayesian support vector regression using a unified loss function, IEEE Transactions on Neural Networks 15(1):29-44 (View Abstract)

  62. W. Chu (2003)  Bayesian approach to support vector machines, Doctoral Dissertation, National University of Singapore (View Abstract)

  63. K. Duan, S. S. Keerthi, W. Chu, S. K. Shevade and A. N. Poo (2003) Multi-category classification by soft-max combination of binary classifiers, Multiple Classifier Systems (MCS-04), Lecture Notes in Computer Science 2709, Springer:125-134

  64. W. Chu, S. S. Keerthi and C. J. Ong (2003) Bayesian trigonometric support vector classifier, Neural Computation 15(9):2227-2254 (View Abstract)

  65. W. Chu, S. S. Keerthi and C. J. Ong (2002)  A general formulation for support vector machines,  International Conference on Neural Information Processing (ICONIP-09)

  66. W. Chu, S. S. Keerthi and C. J. Ong (2002)  A new Bayesian design method for support vector classification,  International Conference on Neural Information Processing (ICONIP-09)

  67. S. S. Keerthi, et al. (2002) A machine learning approach for the curation of biomedical literature - KDD Cup 2002 (Task 1), SIGKDD Explorations Newsletter 4(2), Honorable Mention

  68. W. Chu, S. S. Keerthi and C. J. Ong (2001)  A unified loss function in Bayesian framework for support vector regression,  International Conference on Machine Learning (ICML-18):51-58


Patents

  1. User trustworthiness, US Patent 9519682 B1

  2. Determining user preference of items based on user ratings and user features, US Patent 8301624 B2

  3. Predicting item-item affinities based on item features by regression, US Patent 8442929 B2

  4. Enhanced matching through explore/exploit schemes, US Patent 8244517 B2

  5. Character recognition method and device, US Patent 10872274 B2

  6. Segmentation-based damage detection, US Patent 10783643 B1

  7. Methods and systems relating to ranking functions for multiple domains, US Patent 10019518 B2

  8. Personalized recommendations on dynamic content, US Patent 9600581 B2

  9. Online active learning in user-generated content streams, US App. 20130111005 A1

  10. Methods and apparatuses for building data identification models, US App. 20180365522 A1

  11. Text information clustering method and text information clustering system, US App. 20180365218 A1

  12. Multi-sampling model training method and device, US App. 20180365525 A1

  13. Question recommendation method and device, US App. 20180330226 A1

  14. Feature data processing method and device, US App. 20180341801 A1


Honors & Awards

  • Best Demo Award, ACM CIKM, 2017

  • Best Paper Award, ACM WSDM, 2011

  • Super Star Team Award, Yahoo!, 2008

  • Honorable Mention Team, ACM KDD CUP, 2002


EMAIL : email dot chuwei at gmail.com

2021.03.24