Work on this exercise locally
This web app is a reference guide — you can read instructions, browse starter code, and view tests here. To actually complete the exercise, you need to work in your local development environment.
- Clone the repository: git clone https://github.com/weihaoqu/program-analysis-bootcamp-student
- Replace failwith "TODO" with your implementation.
- Run the tests: dune runtest labs/lab6-integrated-analyzer
Lab 6: Integrated Analyzer
Overview
Build a complete, multi-pass program analysis tool that detects safety violations (division by zero), security vulnerabilities (injection attacks), and code quality issues (dead code). This lab integrates techniques from Modules 2-6 into a unified pipeline with structured reporting.
Learning Objectives
By completing this lab, you will:
- Define a unified finding type that standardizes outputs across analysis passes
- Implement dead code detection using purely AST-level analysis
- Build a safety analysis pass using the sign abstract domain
- Build a taint analysis pass using the taint abstract domain
- Compose passes in a pipeline that collects and sorts findings
- Generate structured reports from analysis results
Prerequisites
- Module 4: Abstract Interpretation (sign domain, MakeEnv, eval_expr)
- Module 5: Security Analysis (taint domain, source/sink/sanitizer)
- Module 6: Tools Implementation (finding types, multi-pass composition)
Structure
| Part | Points | Files | Description |
|---|---|---|---|
| A | 35 | finding.ml, dead_code.ml | Unified finding type + AST dead code detection |
| B | 40 | safety_analysis.ml, taint_analysis.ml, pipeline.ml | Multi-pass analysis + configurable pipeline |
| C | 25 | reporter.ml, analysis_report.md | Report generation + written analysis |
Provided Files (Do Not Modify)
- sign_domain.ml: Sign abstract domain (from Module 4)
- taint_domain.ml: Taint abstract domain (from Module 5)
- taint_config.ml: Security configuration (sources, sinks, sanitizers)
Part A: Finding Types + Dead Code (35 pts)
finding.ml (15 pts)
Implement the unified finding type operations:
- severity_to_string, category_to_string, severity_to_int
- sort_by_severity, filter_by_severity, filter_by_category
- format_finding
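As a rough illustration of the severity operations listed above, here is a minimal sketch. The record fields, the Low constructor, and the string-typed category are assumptions made for brevity; the starter's finding.ml defines the real types, and only the "higher = more severe" ordering comes from this lab's tips.

```ocaml
(* Hypothetical shapes; the starter's finding.ml defines the real types. *)
type severity = Low | Medium | High

type finding = {
  severity : severity;
  category : string;  (* the lab has a category type; a string keeps this short *)
  message : string;
}

(* Higher number = more severe. *)
let severity_to_int = function
  | Low -> 1
  | Medium -> 2
  | High -> 3

(* Most severe first; List.stable_sort keeps the input order among ties. *)
let sort_by_severity findings =
  List.stable_sort
    (fun a b -> compare (severity_to_int b.severity) (severity_to_int a.severity))
    findings

(* Keep only findings at or above a minimum severity. *)
let filter_by_severity min_sev findings =
  List.filter
    (fun f -> severity_to_int f.severity >= severity_to_int min_sev)
    findings
```

Mapping severities to integers once keeps both sorting and filtering down to plain integer comparisons.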
dead_code.ml (20 pts)
Implement purely AST-level dead code detection:
- collect_used_vars_expr, collect_used_vars_stmts, collect_assigned_vars
- find_unreachable_code: detect statements after Return
- find_unused_variables: detect assigned-but-never-read variables
- find_unused_parameters: detect parameters never read in the body
- Variables/parameters prefixed with _ are exempt from unused warnings
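The checks above might look roughly like the following on a tiny AST. The expr and stmt types here are hypothetical stand-ins (the lab's AST has more constructors, and the real functions return findings rather than bare names or statements), so treat this as a sketch of the idea only.

```ocaml
(* Hypothetical mini-AST; the lab's real AST has more constructors. *)
type expr =
  | Int of int
  | Var of string
  | BinOp of string * expr * expr

type stmt =
  | Assign of string * expr
  | Return of expr

(* Every variable name read by an expression. *)
let rec collect_used_vars_expr = function
  | Int _ -> []
  | Var x -> [ x ]
  | BinOp (_, l, r) -> collect_used_vars_expr l @ collect_used_vars_expr r

(* Variables read anywhere in a flat statement list. *)
let collect_used_vars_stmts stmts =
  List.concat_map
    (function Assign (_, e) | Return e -> collect_used_vars_expr e)
    stmts

let starts_with_underscore x = String.length x > 0 && x.[0] = '_'

(* Assigned but never read, skipping names that start with '_'. *)
let find_unused_variables stmts =
  let used = collect_used_vars_stmts stmts in
  List.filter_map
    (function
      | Assign (x, _)
        when (not (List.mem x used)) && not (starts_with_underscore x) ->
          Some x
      | _ -> None)
    stmts

(* Statements that follow the first Return in a flat statement list. *)
let rec find_unreachable_code = function
  | [] -> []
  | Return _ :: rest -> rest
  | _ :: rest -> find_unreachable_code rest
```

Because the analysis is purely syntactic, it needs no abstract domain: it only compares the set of names that are read against the set of names that are written.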
Part B: Multi-Pass Analysis (40 pts)
safety_analysis.ml (15 pts)
Build a safety analysis pass using the sign domain:
- eval_expr: evaluate expressions in the sign domain
- transfer_stmt: process statements, detect division by zero
- analyze_function, analyze_program
Division by zero detection:
- Divisor is Zero → High severity finding
- Divisor is Top → Medium severity finding
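The two severity rules above can be sketched as a small classifier over the divisor's abstract value. The sign type, the association-list environment, and the check_division helper are assumptions for illustration; the provided sign_domain.ml and the Module 4 MakeEnv functor define the real interface.

```ocaml
(* Hypothetical stand-ins for the provided sign domain and the lab's AST. *)
type sign = Bot | Neg | Zero | Pos | Top
type severity = Medium | High

type expr =
  | Int of int
  | Var of string
  | Div of expr * expr

(* Abstract an integer literal to its sign. *)
let sign_of_int n = if n < 0 then Neg else if n = 0 then Zero else Pos

(* Look up a variable's sign; variables with no binding are treated as Top. *)
let lookup env x = try List.assoc x env with Not_found -> Top

(* Deliberately coarse abstract division. *)
let rec eval_expr env = function
  | Int n -> sign_of_int n
  | Var x -> lookup env x
  | Div (n, d) -> (
      match (eval_expr env n, eval_expr env d) with
      | Bot, _ | _, Bot -> Bot
      | _, Zero -> Bot   (* dividing by a definite zero has no result *)
      | Zero, _ -> Zero  (* zero divided by anything non-zero is zero *)
      | _ -> Top)

(* Classify one division by the abstract sign of its divisor. *)
let check_division env divisor =
  match eval_expr env divisor with
  | Zero -> Some High   (* divisor is definitely zero *)
  | Top -> Some Medium  (* divisor may be zero *)
  | Bot | Neg | Pos -> None
```

In this sketch, `check_division [("x", Zero)] (Var "x")` yields `Some High`, while a variable with no binding evaluates to Top and yields `Some Medium`.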
taint_analysis.ml (15 pts)
Build a taint analysis pass using the taint domain:
- eval_expr: evaluate expressions for taint status
- transfer_stmt: process statements, check sinks for tainted data
- analyze_function, analyze_program
Use Taint_config.default_config for sources, sinks, and sanitizers.
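A minimal sketch of how sources, sinks, and sanitizers could drive taint evaluation follows. The two-point taint type, the function names read_input, exec_query, and escape, and the is_tainted_sink_call helper are all hypothetical; in the lab these come from taint_domain.ml and Taint_config.default_config, whose exact shape is not shown here.

```ocaml
(* Hypothetical two-point domain; the provided taint_domain.ml may differ. *)
type taint = Untainted | Tainted

type expr =
  | Str of string
  | Var of string
  | Concat of expr * expr
  | Call of string * expr list

(* Hypothetical configuration; the lab reads it from Taint_config. *)
let sources = [ "read_input" ]
let sinks = [ "exec_query" ]
let sanitizers = [ "escape" ]

(* Taint is contagious: the result is tainted if either operand is. *)
let join a b = if a = Tainted || b = Tainted then Tainted else Untainted

let rec eval_expr env = function
  | Str _ -> Untainted
  | Var x -> (try List.assoc x env with Not_found -> Untainted)
  | Concat (l, r) -> join (eval_expr env l) (eval_expr env r)
  | Call (f, _) when List.mem f sources -> Tainted
  | Call (f, _) when List.mem f sanitizers -> Untainted
  | Call (_, args) ->
      (* Unknown functions propagate the taint of their arguments. *)
      List.fold_left (fun acc a -> join acc (eval_expr env a)) Untainted args

(* A sink call is flagged when any argument evaluates to Tainted. *)
let is_tainted_sink_call env = function
  | Call (f, args) when List.mem f sinks ->
      List.exists (fun a -> eval_expr env a = Tainted) args
  | _ -> false
```

Wrapping a tainted value in a sanitizer call before it reaches the sink clears the flag in this model, which is the behavior the source/sink/sanitizer configuration is meant to express.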
pipeline.ml (10 pts)
Compose the analysis passes:
- default_passes: return dead_code, safety, and taint passes
- run_all: run all passes and return findings sorted by severity
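One way to picture the composition: treat each pass as a named function from a program to a list of findings, concatenate the results, and sort. The pass record, its field names, and the most-severe-first ordering are assumptions of this sketch rather than the starter's actual definitions.

```ocaml
(* Hypothetical types; the starter's finding.ml and pipeline.ml define the real ones. *)
type severity = Low | Medium | High
type finding = { severity : severity; pass_name : string; message : string }

(* A pass is a named function from a program to findings. *)
type 'prog pass = { name : string; run : 'prog -> finding list }

let severity_to_int = function Low -> 1 | Medium -> 2 | High -> 3

(* Run every pass on the same program, then sort most severe first. *)
let run_all passes program =
  passes
  |> List.concat_map (fun p -> p.run program)
  |> List.stable_sort (fun a b ->
         compare (severity_to_int b.severity) (severity_to_int a.severity))
```

Keeping passes behind a uniform record means adding a fourth analysis later only requires appending one element to the list passed to run_all.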
Part C: Reporter (25 pts)
reporter.ml (15 pts)
Generate structured reports:
- build_report: compute severity/category counts from findings
- format_text_report: human-readable text report
- format_summary: one-line summary
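The counting step could be sketched as below. The report record and the summary wording are hypothetical, and only severity counts are shown; category counts would follow the same pattern. The tests in the starter determine the exact output format expected.

```ocaml
(* Hypothetical types; the starter's finding.ml and reporter.ml define the real ones. *)
type severity = Low | Medium | High
type finding = { severity : severity; category : string; message : string }

type report = { total : int; high : int; medium : int; low : int }

(* Count findings per severity level. *)
let build_report findings =
  let count sev =
    List.length (List.filter (fun f -> f.severity = sev) findings)
  in
  { total = List.length findings;
    high = count High;
    medium = count Medium;
    low = count Low }

(* One-line summary; this wording is illustrative only. *)
let format_summary r =
  Printf.sprintf "%d findings (%d high, %d medium, %d low)"
    r.total r.high r.medium r.low
```

Computing the counts once in build_report lets both the text report and the summary format the same numbers without rescanning the findings.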
analysis_report.md (10 pts, manual grading)
Write a 1-2 page analysis report covering:
- Your analysis pipeline architecture
- A sample program and its findings
- Strengths and limitations of your tool
- How you would extend it for production use
Building and Testing
# Build
dune build labs/lab6-integrated-analyzer/
# Run student tests
dune runtest labs/lab6-integrated-analyzer/starter/tests/
# Check for compilation errors
dune build @check
Tips
- Start with Part A (finding.ml + dead_code.ml) — they have no domain dependencies
- For Part B, reference your Module 4 (sign domain) and Module 5 (taint domain) exercises
- The pipeline is simple once the individual passes work
- Use Finding.severity_to_int for sorting (higher = more severe)
- Exempt _-prefixed variables from unused warnings