tudasc/Alpaca

ALPACA

ALPACA (Automated Library and Program API Change Analyzer) is a tool built on the Clang compiler frontend that detects and analyzes changes between different versions of C/C++ source code. By combining AST-based semantic information from Clang with syntactic similarity analysis, ALPACA identifies and classifies various types of code changes and refactorings.

Key Features

  • Detection of changes to various entities (functions, variables, records) between two versions of a code base
  • Support for complex C/C++ codebases (>1M LOC)
  • High accuracy (92% change detection rate, 91% correct classification)
  • Can handle template specializations and overloaded functions
  • Built on Clang LibTooling for robust C/C++ parsing

Build Requirements

Core Requirements

  • CMake 3.0 or higher
  • LLVM/Clang 12
  • C++17 compatible compiler
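
To check a local toolchain against the list above, a small script along the following lines may help. It is only a sketch: `version_ge` is a hypothetical helper that is not part of ALPACA, it relies on `sort -V` from GNU coreutils, and it assumes `llvm-config` is on the PATH under that exact name (some distributions install it with a version suffix).

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of ALPACA): succeeds if version $1 >= version $2.
# Relies on `sort -V` (version sort), available in GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Minimum CMake version as stated in the requirements list.
required_cmake="3.0"

# Report the CMake version if cmake is on PATH.
if command -v cmake >/dev/null 2>&1; then
  # First line of the output looks like "cmake version 3.22.1".
  cmake_version=$(cmake --version | head -n 1 | awk '{print $3}')
  if version_ge "$cmake_version" "$required_cmake"; then
    echo "CMake $cmake_version: OK"
  else
    echo "CMake $cmake_version: older than $required_cmake"
  fi
else
  echo "cmake not found on PATH"
fi

# Report the LLVM version if llvm-config is on PATH (assumed name).
if command -v llvm-config >/dev/null 2>&1; then
  echo "LLVM $(llvm-config --version)"
else
  echo "llvm-config not found on PATH"
fi
```

The script only reports what it finds; it does not verify that the detected LLVM installation is the one CMake will pick up.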

Building ALPACA

Native Build

# Clone the repository
git clone https://github.com/tudasc/Alpaca
cd Alpaca

# Configure with CMake (the -B option requires CMake 3.13 or newer)
cmake -B build

# Build
cmake --build build

# Verify the build by printing the help text
./build/APIAnalysis --help

Example Docker Build

We provide a Dockerfile that installs all required build dependencies and sets up an example from the evaluation: an analysis of the OpenMPI codebase.

# Build the Docker image
docker build -t alpaca .

# Run ALPACA in Docker
docker run -v "$(pwd)":/work -w /work alpaca APIAnalysis

Usage

ALPACA requires two versions of the codebase to perform the analysis:

./APIAnalysis --oldDir=/path/to/old/version \
              --newDir=/path/to/new/version \
              [options]

Common Options

  • --oldDir, --newDir: Paths to the old and new versions of the project (required)
  • --oldCD, --newCD: Paths to compilation databases
  • --extra-args: Additional Clang arguments for both versions
  • --extra-args-old, --extra-args-new: Version-specific Clang arguments
  • --exclude: Directories/files to ignore (comma-separated)
  • --doc: Enable the second analysis step, which uses Levenshtein matching
  • --json: Output results in JSON format
  • --ipf: Include private functions in analysis
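
The --doc option above refers to Levenshtein matching. As background, the Levenshtein distance between two strings is the minimum number of single-character insertions, deletions, and substitutions needed to turn one into the other, so similar names or signatures get a small distance. The sketch below illustrates the metric only; the `levenshtein` function is hypothetical, and how ALPACA applies the distance internally is not described here.

```bash
#!/usr/bin/env bash
# Levenshtein edit distance between two strings (two-row dynamic programming).
# Illustrative only: this is not ALPACA's implementation.
levenshtein() {
  local a=$1 b=$2
  local la=${#a} lb=${#b}
  local -a prev curr
  local i j cost del ins sub min

  # Distance from the empty prefix of a to each prefix of b.
  for ((j = 0; j <= lb; j++)); do prev[j]=$j; done

  for ((i = 1; i <= la; i++)); do
    curr[0]=$i
    for ((j = 1; j <= lb; j++)); do
      # Substitution is free when the two characters match.
      if [[ "${a:i-1:1}" == "${b:j-1:1}" ]]; then cost=0; else cost=1; fi
      del=$((prev[j] + 1))
      ins=$((curr[j-1] + 1))
      sub=$((prev[j-1] + cost))
      min=$del
      ((ins < min)) && min=$ins
      ((sub < min)) && min=$sub
      curr[j]=$min
    done
    prev=("${curr[@]}")
  done
  echo "${prev[lb]}"
}

levenshtein "kitten" "sitting"   # prints 3
```

A pair of near-identical identifiers, such as the made-up names `get_size` and `get_sizes`, has a distance of 1, which is why an edit-distance step can help pair up renamed entities that an exact name comparison would miss.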

Using with Compilation Databases

For best results, provide compilation databases for both versions:

  1. Generate compilation databases (for example using Bear):
cd old_version
bear -- make
cd ../new_version
bear -- make
  2. Run the analysis with the compilation databases:
./APIAnalysis --oldDir=old_version \
              --newDir=new_version \
              --oldCD=old_version/compile_commands.json \
              --newCD=new_version/compile_commands.json
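
The two steps above can be wrapped in a small shell function that fails early when a compilation database is missing. This is a sketch under stated assumptions: `run_alpaca` and the `ALPACA_BIN` variable are hypothetical names that are not part of ALPACA, the binary is assumed to live at ./build/APIAnalysis unless overridden, and each `compile_commands.json` is assumed to sit at the top of its version directory (where Bear writes it by default when run from there).

```bash
#!/usr/bin/env bash
# Hypothetical wrapper (not part of ALPACA): checks that both compilation
# databases exist, then invokes the analysis with the documented options.
# ALPACA_BIN is an assumed variable; point it at your APIAnalysis binary.
run_alpaca() {
  local old_dir=$1 new_dir=$2
  local bin=${ALPACA_BIN:-./build/APIAnalysis}
  local dir

  # Refuse to start if either compilation database is missing.
  for dir in "$old_dir" "$new_dir"; do
    if [ ! -f "$dir/compile_commands.json" ]; then
      echo "missing $dir/compile_commands.json (generate it first, e.g. with Bear)" >&2
      return 1
    fi
  done

  "$bin" --oldDir="$old_dir" \
         --newDir="$new_dir" \
         --oldCD="$old_dir/compile_commands.json" \
         --newCD="$new_dir/compile_commands.json"
}
```

After sourcing the function, `run_alpaca old_version new_version` would run the same command as in step 2, while a missing database produces an error message instead of a partial analysis.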

Future Work (currently work in progress)

  • Parallel Processing: Implementation of parallel extraction and analysis using OpenMPI to significantly reduce runtime for large codebases
  • Robust Handling of Overloaded Functions: Reworking of C++ overloaded function handling for improved accuracy and coverage
  • Extended Change Detection: Support for additional types of code changes and refactorings
  • Reference Graph Matcher: Introduction of a new matching algorithm using Reference Graphs to improve entity matching between project versions

License

ALPACA is licensed under the BSD 3-Clause License. See LICENSE for details.
