2021 KDD Workshop on Programming Language Processing (PLP)

 Virtual Conference
 Workshop Date: 8:00am - 12:00pm, August 15, 2021 (Singapore Time)


  • Data Mining and Knowledge Discovery has accepted our special issue on programming language processing. The planned submission deadline is October 31, 2021.
  • Data Mining and Knowledge Discovery has expressed interest in the topic of programming language processing. We are preparing a formal special issue proposal. If it is successful, high-quality workshop papers will be recommended to this special issue.
  • The paper submission deadline has been postponed to June 3rd, 2021. We look forward to receiving your final contributions.


Programming languages have their origins in natural language. Unlike natural language, which humans use among themselves, programming languages allow humans to tell machines what to do. Meaningful identifier names and natural language documentation allow other developers to understand the author’s intent and then maintain and extend the code. At the same time, the substantial information contained in code enables machine learning algorithms to assist in a variety of software engineering tasks. However, the mining of programming languages cannot exactly follow the approach of natural language processing, because the two differ. Programming languages demand a high degree of expertise, completeness, and precision because computers cannot interpret anything beyond the statements they are given, whereas natural language may be informal and tolerate minor errors. Programming language syntax is also not based on natural language grammar. We have witnessed an increasing number of successful machine learning techniques for natural language processing, e.g., GPT (Generative Pre-Training) by OpenAI and BERT (Bidirectional Encoder Representations from Transformers) for language understanding. In this deep learning era, what are the challenges and opportunities in deploying such NLP breakthroughs in programming language processing? What are the current, more specialised models for programming language processing? How can machine learning and software engineering researchers collaborate and apply their knowledge to further the field and improve code intelligence? We propose to invite world-leading experts from both machine learning and software engineering to discuss and debate the path forward for mining the value of programming languages.

Topics of Interest

This workshop will provide a premier platform for researchers from both academia and industry to exchange ideas on opportunities, challenges, and cutting-edge techniques of machine learning for software engineering applications and systems. Papers are welcome on topics including, but not limited to, the following three broad categories:

Novel Machine Learning Techniques for Programming Languages
  • Weakly supervised machine learning for programming languages
  • Pretrained models for programming languages
  • Deep generative models for programming languages
  • Graph convolutional neural networks for programming languages
  • Sequence modelling for programming languages
  • Machine translation for programming languages

Novel Machine Learning Applications to Software Engineering Problems
  • Deployment of languages to different platforms
  • Code generation, optimization, and synthesis
  • Software language validation
  • Compilation and interpretation techniques
  • Software language design and implementation
  • Testing techniques for languages
  • Simulation techniques for languages

Novel Machine Learning Systems for Software Engineering Tasks
  • Code recommendation systems
  • Dialogue and interactive systems
  • Performance benchmarks
  • User studies evaluating usability
  • Programming tools, including refactoring editors, checkers, compilers and debuggers
  • Techniques in secure, parallel, distributed, embedded or mobile environments

Call for Papers

Submissions should follow the SIGKDD formatting requirements and will be evaluated using the SIGKDD Research Track evaluation criteria. Preference will be given to papers that are reproducible, and authors are encouraged to share their data and code publicly whenever possible. We strongly recommend that submissions be no more than 4 pages, excluding references and supplementary materials (all in a single PDF). Reviewers will judge whether any pages beyond the recommended length are appropriate. Papers must be submitted in PDF format to EasyChair at https://easychair.org/conferences/?conf=plp2021 and formatted according to the new Standard ACM Conference Proceedings Template.

The review process is single-round and double-blind (submission files have to be anonymized). The program committee will select the papers based on originality, presentation, and technical quality for spotlight and/or poster presentation. Concurrent submissions to other journals and conferences are acceptable.

Any questions may be directed to: c.xu@sydney.edu.au or slivia.ma@uq.edu.au.




Key Dates

  • Paper Submission Deadline: June 3rd, 2021 (extended from May 20th, 2021)
  • Acceptance Notice: June 10th, 2021
  • Camera Ready Submission: June 20th, 2021
  • Workshop Date: August 14th-18th, 2021

All deadlines are 11:59 pm UTC-12 ("Anywhere on Earth").


Organizers

  • Chang Xu, University of Sydney, Australia
  • Siqi Ma, University of Queensland, Australia
  • David Lo, Singapore Management University

Program Committee