Chao Wang (王超)

Published: 2023-12-21

Email: cswang@ustc.edu.cn

Homepage: https://faculty.ustc.edu.cn/cswang/

Research interests: intelligent computer architecture, deep learning processors, and FPGA-based application acceleration.


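Chao Wang is a Specially Appointed Professor (特任教授), doctoral supervisor, and Vice Dean of the School of Software Engineering. He is a Senior Member of IEEE, a Senior Member of ACM, a Distinguished Member of CCF, a standing committee member of the CCF Technical Committee on Computer Architecture, an Outstanding Member of the Youth Innovation Promotion Association of the Chinese Academy of Sciences, and a council member of the Jiangsu Institute of Electronics. He serves or has served on the editorial boards of international journals including ACM Transactions on Design Automation of Electronic Systems and IEEE/ACM Transactions on Computational Biology and Bioinformatics. He received his B.S. and Ph.D. degrees from the School of Computer Science and Technology, University of Science and Technology of China, in 2006 and 2011, respectively, and was a visiting scholar at the University of California, Santa Barbara. He has led multiple national and provincial/ministerial research projects, including grants from the National Natural Science Foundation of China, sub-projects of the National Key R&D Program, a major special project of the Equipment Development Department, and a sub-project of the Chinese Academy of Sciences Strategic Priority Research Program. He has published more than 100 papers in leading journals and conferences such as TPDS, TC, TCAD, MICRO, and RTSS, and took part in developing intelligent computing systems based on domestically developed AI chips; the results have been deployed at multiple organizations.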


Awards

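1. 2023 - National-Level Young Talent (国家级青年人才)

2. 2021 - Outstanding Member, Youth Innovation Promotion Association of the Chinese Academy of Sciences

3. 2018 - Best Paper Nomination, CODES+ISSS international conference

4. 2016 - First Prize, Anhui Province Outstanding Academic Paper in Natural Science

5. 2016 - ACM China Rising Star Award Nomination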


Representative Publications

1. Changlong Li, Yu Liang, Liang Shi, Chao Wang, Chun Jason Xue, Xuehai Zhou: Flexible and Efficient Memory Swapping Across Mobile Devices With LegoSwap. IEEE Trans. Parallel Distributed Syst. 35(1): 140-153 (2024)

2. Lei Gong, Chao Wang, Haojun Xia, Xianglan Chen, Xi Li, Xuehai Zhou: Enabling Fast and Memory-Efficient Acceleration for Pattern Matching Workloads: The Lightweight Automata Processing Engine. IEEE Trans. Computers 72(4): 1011-1025 (2023)

3. Wenqi Lou, Lei Gong, Chao Wang, Zidong Du, Xuehai Zhou: OctCNN: A High Throughput FPGA Accelerator for CNNs Using Octave Convolution Algorithm. IEEE Trans. Computers 71(8): 1847-1859 (2022)

4. Yingxue Gao, Lei Gong, Chao Wang, Teng Wang, Xi Li, Xuehai Zhou: Algorithm/Hardware Co-Optimization for Sparsity-Aware SpMM Acceleration of GNNs. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 42(12): 4763-4776 (2023)

5. Yuanbo Wen, Qi Guo, Zidong Du, Jianxing Xu, Zhenxing Zhang, Xing Hu, Wei Li, Rui Zhang, Chao Wang, Xuehai Zhou, Tianshi Chen: Enabling One-Size-Fits-All Compilation Optimization for Inference Across Machine Learning Computers. IEEE Trans. Computers 71(9): 2313-2326 (2022)

6. Teng Wang, Lei Gong, Chao Wang, Yang Yang, Yingxue Gao, Xuehai Zhou, Huaping Chen: ViA: A Novel Vision-Transformer Accelerator Based on FPGA. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 41(11): 4088-4099 (2022)

7. Yuanbo Wen, Qi Guo, Qiang Fu, Xiaqing Li, Jianxing Xu, Yanlin Tang, Yongwei Zhao, Xing Hu, Zidong Du, Ling Li, Chao Wang, Xuehai Zhou, Yunji Chen: BabelTower: Learning to Auto-parallelized Program Translation. ICML 2022: 23685-23700

8. Chao Wang, Lei Gong, Fahui Jia, Xuehai Zhou: An FPGA Based Accelerator for Clustering Algorithms With Custom Instructions. IEEE Trans. Computers 70(5): 725-732 (2021)

9. Chao Wang, Lihui Jin, Lei Gong, Chongchong Xu, Yahui Hu, Luchao Tan, Xuehai Zhou: Tinker: A Middleware for Deploying Multiple NN-Based Applications on a Single Machine. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 40(7): 1495-1499 (2021)

10. Lei Gong, Chao Wang, Xi Li, Xuehai Zhou: Improving HW/SW Adaptability for Accelerating CNNs on FPGAs Through A Dynamic/Static Co-Reconfiguration Approach. IEEE Trans. Parallel Distributed Syst. 32(7): 1854-1865 (2021)

11. Chao Wang, Lei Gong, Xi Li, Qi Yu, Aili Wang, Patrick Hung, Xuehai Zhou: SOLAR: Services-Oriented Deep Learning Architectures-Deep Learning as a Service. IEEE Trans. Serv. Comput. 14(1): 262-273 (2021)

12. Changlong Li, Hang Zhuang, Qingfeng Wang, Chao Wang, Xuehai Zhou: LKSM: Light Weight Key-Value Store for Efficient Application Services on Local Distributed Mobile Devices. IEEE Trans. Serv. Comput. 14(4): 1026-1039 (2021)

13. Xi Zeng, Tian Zhi, Xuda Zhou, Zidong Du, Qi Guo, Shaoli Liu, Bingrui Wang, Yuanbo Wen, Chao Wang, Xuehai Zhou, Ling Li, Tianshi Chen, Ninghui Sun, Yunji Chen: Addressing Irregularity in Sparse Neural Networks Through a Cooperative Software/Hardware Approach. IEEE Trans. Computers 69(7): 968-985 (2020)

14. Chao Wang, Lei Gong, Xiang Ma, Xi Li, Xuehai Zhou: WooKong: A Ubiquitous Accelerator for Recommendation Algorithms With Custom Instruction Sets on FPGA. IEEE Trans. Computers 69(7): 1071-1082 (2020)

15. Xuan Wang, Chao Wang, Jing Cao, Lei Gong, Xuehai Zhou: WinoNN: Optimizing FPGA-Based Convolutional Neural Network Accelerators Using Sparse Winograd Algorithm. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 39(11): 4290-4302 (2020)

16. Chao Wang, Lei Gong, Xi Li, Xuehai Zhou: A Ubiquitous Machine Learning Accelerator With Automatic Parallelization on FPGA. IEEE Trans. Parallel Distributed Syst. 31(10): 2346-2359 (2020)

17. Lei Gong, Chao Wang, Xi Li, Huaping Chen, Xuehai Zhou: MALOC: A Fully Pipelined FPGA Accelerator for Convolutional Neural Networks With All Layers Mapped on Chip. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 37(11): 2601-2612 (2018)

18. Xuda Zhou, Zidong Du, Qi Guo, Shaoli Liu, Chengsi Liu, Chao Wang, Xuehai Zhou, Ling Li, Tianshi Chen, Yunji Chen: Cambricon-S: Addressing Irregularity in Sparse Neural Networks through A Cooperative Software/Hardware Approach. MICRO 2018: 15-28

19. Chao Wang, Xi Li, Aili Wang, Xuehai Zhou: A Classroom Scheduling Service for Smart Classes. IEEE Trans. Serv. Comput. 10(2): 155-164 (2017)

20. Chao Wang, Xi Li, Yunji Chen, Youhui Zhang, Oliver Diessel, Xuehai Zhou: Service-Oriented Architecture on FPGA-Based MPSoC. IEEE Trans. Parallel Distributed Syst. 28(10): 2993-3006 (2017)

21. Chao Wang, Lei Gong, Qi Yu, Xi Li, Yuan Xie, Xuehai Zhou: DLAU: A Scalable Deep Learning Accelerator Unit on FPGA. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 36(3): 513-517 (2017)

22. Bo Wan, Xi Li, Haizhao Luo, Chao Wang, Xianglan Chen, Xuehai Zhou: Work-in-Progress: TTI: A Timing ISA for LET Model in Safety-Critical Systems. RTSS 2017: 363-365

23. Chao Wang, Junneng Zhang, Xi Li, Aili Wang, Xuehai Zhou: Hardware Implementation on FPGA for Task-Level Parallel Dataflow Execution Engine. IEEE Trans. Parallel Distributed Syst. 27(8): 2303-2315 (2016)

24. Chao Wang, Xi Li, Junneng Zhang, Peng Chen, Yunji Chen, Xuehai Zhou, Ray C. C. Cheung: Architecture Support for Task Out-of-Order Execution in MPSoCs. IEEE Trans. Computers 64(5): 1296-1310 (2015)

25. Shaoli Liu, Tianshi Chen, Ling Li, Xi Li, Mingzhe Zhang, Chao Wang, Haibo Meng, Xuehai Zhou, Yunji Chen: FreeRider: Non-Local Adaptive Network-on-Chip Routing with Packet-Carried Propagation of Congestion Information. IEEE Trans. Parallel Distributed Syst. 26(8): 2272-2285 (2015)

26. Chao Wang, Xi Li, Junneng Zhang, Xuehai Zhou, Xiaoning Nie: MP-Tomasulo: A Dependency-Aware Automatic Parallel Execution Engine for Sequential Programs. ACM Trans. Archit. Code Optim. 10(2): 9:1-9:26 (2013)