Zhou YANG (杨 洲)
I am currently a PhD student at SMU and a member of the Software Analytics Research Group, supervised by Prof. David LO. Over the next two years, my main focus will be "SE4AI4SE", exploring the following three questions: (1) To what extent do AI code models (e.g., Copilot) leak private information of data contributors (e.g., GitHub users)? (2) How can we efficiently eliminate private information from these large AI code models without simply retraining them? (3) How can we protect the privacy and intellectual property in software repositories from being learned by AI without authorization?
Our group and I are open to collaboration and communication.
I'm looking for a short-term visit in 2024. If you happen to know good opportunities, let's talk!
I was an MSc student at University College London, studying Software System Engineering. My master's dissertation was on automated program repair for syntax errors, co-supervised by Prof. Earl Barr (UCL) and Prof. Martin Monperrus (KTH).
Conference
[19] Xin Zhou, Bowen Xu, Donggyun Han, Zhou Yang, Junda He and David Lo. "CCBERT: Context-Aware, Fine-Grained, and Self-Supervised Code Change Representation Learning" 2023 IEEE 39th International Conference on Software Maintenance and Evolution (ICSME) (13 pages, Technical Track.)
[18] Julia Kaiwen Lau, Kelvin Kai Wen Kong, Julian Hao Yong, Per Hoong Tan, Zhou Yang, Zi Qian Yong, Joshua Chern Wey Low, Chun Yong Chong, Camellia Lok, Mei Kuan Lim and David Lo. "Synthesizing Speech Test Cases with Text-To-Speech? An Empirical Study on the False Alarms in Automated Speech Recognition Testing" The ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2023) (13 pages, Technical Track.)
[17] Ratnadira Widyasari, Zhou Yang, Ferdian Thung, Sheng Qin Sim, Fiona Wee, Camellia Lok, Jack Phan, Haodi Qi, Constance Tan, Qijin Tay and David Lo. "NICHE: A Curated Dataset of Engineered Machine Learning Projects in Python" The 20th IEEE International Conference on Mining Software Repositories (MSR 2023) (5 pages, Data and Tool Showcase Track.)
[16] Zhou Yang, Chenyu Wang, Jieke Shi, Thong Hoang, Pavneet Kochhar, Qinghua Lu, Zhenchang Xing and David Lo. "What Do Users Ask in Open-Source AI Repositories? An Empirical Study of GitHub Issues" The 20th IEEE International Conference on Mining Software Repositories (MSR 2023) (12 pages, Technical Track.)
[15] Daniel Hao Xian Yuen, Andrew Yong Chen Pang, Zhou Yang, Chun Yong Chong, Mei Kuan Lim and David Lo. "ASDF: A Differential Testing Framework for Automatic Speech Recognition Systems" The 16th IEEE International Conference on Software Testing, Verification and Validation (ICST) 2023 (3 pages, Tool Demo Track.)
[14] Lin Sze Khoo, Jia Qi Bay, Kimberly Ming Lee Yap, Mei Kuan Lim, Chun Yong Chong, Zhou Yang and David Lo. "Exploring and Repairing Gender Fairness Violations in Word Embedding-based Sentiment Analysis Model through Adversarial Patches" IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER 2023) (12 pages, Research Track.) [PDF]
[13] Chen Gong, Zhou Yang, Yunpeng Bai, Jieke Shi, Arunesh Sinha, Bowen Xu, David Lo, Xinwen Hou, Guoliang Fan. "Curiosity-Driven and Victim-Aware Adversarial Policies" The Annual Computer Security Applications Conference (ACSAC 2022) (15 pages, Technical Track.) [Code][PDF] [Honorable Mention Award]
[12] Chengran Yang, Bowen Xu, Ferdian Thung, Yucen Shi, Ting Zhang, Zhou Yang, Xin Zhou, Jieke Shi, Junda He, DongGyun Han, David Lo. "Answer Summarization for Technical Queries: Benchmark and New Approach" The 37th IEEE/ACM International Conference on Automated Software Engineering (ASE 2022) (11 pages, Research Track.)
[11] Jieke Shi, Zhou Yang, Bowen Xu, Hong Jin Kang, David Lo. "Compressing Pre-trained Models of Code into 3 MB" The 37th IEEE/ACM International Conference on Automated Software Engineering (ASE 2022) (12 pages, Research Track.) [PDF][Code] [Nomination for Distinguished Paper Award]
[10] Junda He, Bowen Xu, Zhou Yang, DongGyun Han, Chengran Yang, David Lo. "PTM4Tag: Sharpening Tag Recommendation of Stack Overflow Posts with Pre-trained Models" 30th IEEE/ACM International Conference on Program Comprehension (ICPC 2022) (11 pages, Research Track.) [PDF][Code] [Recommended for Journal Extension]
[9] Chengran Yang, Bowen Xu, Junaed Younus Khan, Gias Uddin, Donggyun Han, Zhou Yang, David Lo. "Aspect-Based API Review Classification: How Far Can Pre-Trained Transformer Model Go?" 2022 IEEE 29th International Conference on Software Analysis, Evolution and Reengineering (SANER). (11 pages, Research Track.) [PDF][Code]
[8] Zhou Yang, Jieke Shi, Muhammad Hilmi Asyrofi and David Lo. "Revisiting Neuron Coverage Metrics and Quality of Deep Neural Networks." 2022 IEEE 29th International Conference on Software Analysis, Evolution and Reengineering (SANER). (12 pages, RENE Track.) [PDF][Code][Video]
[7] Jieke Shi, Zhou Yang, Junda He, Bowen Xu and David Lo. "Can Identifier Splitting Improve Open-Vocabulary Language Model of Code?" 2022 IEEE 29th International Conference on Software Analysis, Evolution and Reengineering (SANER). (5 pages, ERA Track.) [PDF][Poster][Code][Video]
[6] Zhou Yang, Jieke Shi, Junda He and David Lo. "Natural Attack for Pre-trained Models of Code." 2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE). (12 pages, Technical Track.) [PDF][Code]
[5] Zhou Yang, Harshit Jain, Jieke Shi, Muhammad Hilmi Asyrofi and David Lo. "BiasHeal: On-the-Fly Black-Box Healing of Bias in Sentiment Analysis Systems." 2021 IEEE 37th International Conference on Software Maintenance and Evolution (ICSME). (5 pages, NIER Track.) [PDF][Code]
[4] Muhammad Hilmi Asyrofi, Zhou Yang, Jieke Shi, Chu Wei Quan and David Lo. "Can Differential Testing Improve Automatic Speech Recognition Systems?" 2021 IEEE 37th International Conference on Software Maintenance and Evolution (ICSME). (5 pages, NIER Track.) [PDF][Code]
[3] Zhou Yang*, Jieke Shi*, Shaowei Wang and David Lo. "IncBL: Incremental Bug Localization." 2021 IEEE/ACM 36th International Conference on Automated Software Engineering (ASE). (4 pages, Tool Demonstrations, *Equal contributions.) [PDF][Poster][Code][Video]
[2] Zhou Yang, Muhammad Hilmi Asyrofi and David Lo. "BiasRV: Uncovering Biased Sentiment Predictions at Runtime." In Proceedings of ESEC/FSE 2021. Association for Computing Machinery, New York, NY, USA, 1540–1544. [PDF][Code][Video]
[1] Muhammad Hilmi Asyrofi, Zhou Yang, and David Lo. "CrossASR++: A Modular Differential Testing Framework for Automatic Speech Recognition." In Proceedings of ESEC/FSE 2021. Association for Computing Machinery, New York, NY, USA, 1575–1579. [PDF][Code][Video]
Journal
Preprints
[5] Junda He, Xin Zhou, Bowen Xu, Ting Zhang, Kisub Kim, Zhou Yang, Ferdian Thung, Ivana Irsan and David Lo. "Representation Learning for Stack Overflow Posts: How Far are We?" Under review at ACM Transactions on Software Engineering and Methodology.
[4] Junda He, Bowen Xu, Zhou Yang, DongGyun Han, Chengran Yang, Jiakun Liu, Zhipeng Zhao and David Lo. "PTM4Tag+: Tag Recommendation of Stack Overflow Posts with Pre-trained Models" Under review at Empirical Software Engineering.
[3] Zhou Yang, Bowen Xu, Jie M Zhang, Hong Jin Kang, Jieke Shi, Junda He, David Lo. "Stealthy Backdoor Attack for Code Models." Under review at IEEE TSE [PDF]
[2] Zhou Yang, Jieke Shi, Muhammad Hilmi Asyrofi, Bowen Xu, Xin Zhou, DongGyun Han, David Lo. "Prioritizing Speech Test Cases." Under review at IEEE TSE [PDF]
[1] Chen Gong, Zhou Yang, Yunpeng Bai, Junda He, Jieke Shi, Arunesh Sinha, Bowen Xu, Xinwen Hou, Guoliang Fan, David Lo. "Mind Your Data! Hiding Backdoors in Offline Reinforcement Learning Datasets." Under review at IEEE TSE [PDF]
Reviewing:
Conference Activities:
© Zhou Yang | Last Update: Feb 2023