  • Publication
    The PIPr Dataset of Public Infrastructure as Code Programs
    With Programming Languages Infrastructure as Code (PL-IaC), developers implement IaC programs in popular imperative programming languages such as Python and TypeScript. Such programs generate the declarative target state of the deployment, i.e., they describe what to set up, not how to set it up. Despite the popularity of PL-IaC, which has grown more than tenfold from 2020 to 2023, we know little about how developers apply it and how IaC programs differ from other software. Such knowledge is essential for applying existing software engineering techniques effectively and for developing new ones for PL-IaC. To shed light on PL-IaC in practice, we present PIPr, the first systematic PL-IaC dataset. PIPr is based on 37 712 public IaC programs on GitHub from August 2022 and includes initial analyses that assess the programming languages, testing techniques, and licenses of the IaC programs. Beyond the metadata and analysis results of all IaC programs, PIPr contains the code of all 15 504 IaC programs whose licenses permit redistribution. PIPr lays the groundwork for future in-depth investigations of PL-IaC in practice.
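The abstract's central idea, an imperative program whose output is a declarative target state, can be illustrated with a minimal sketch. It uses no real IaC SDK: the `Resource` class, the `build_target_state` and `to_document` helpers, and the resource kinds and names are all hypothetical and chosen only for illustration.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Resource:
    """Hypothetical declarative resource description (not a real IaC SDK type)."""
    kind: str
    name: str
    properties: dict = field(default_factory=dict)


def build_target_state(environments):
    """Use ordinary imperative code (loops, conditionals) to emit a target state.

    The result only says WHAT should exist; it contains no deployment steps.
    """
    resources = []
    for env in environments:
        # One log bucket per environment; versioning only in production.
        bucket = Resource("bucket", f"logs-{env}", {"versioning": env == "prod"})
        resources.append(bucket)
        if env == "prod":
            # Production additionally gets an alarm that refers to the bucket.
            alarm = Resource("alarm", f"{bucket.name}-alarm", {"watches": bucket.name})
            resources.append(alarm)
    return resources


def to_document(resources):
    """Serialize the target state as a JSON document."""
    return json.dumps([asdict(r) for r in resources], indent=2, sort_keys=True)


if __name__ == "__main__":
    print(to_document(build_target_state(["dev", "prod"])))
```

In an actual PL-IaC tool, a deployment engine would compare such a target state with the existing infrastructure and derive the required operations, which the program itself never spells out.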