Petal App Manager Profiling Tools
Comprehensive profiling infrastructure for analyzing PAM performance under different operational scenarios to identify bottlenecks and optimize resource usage on low-resource devices (RPi, Orin).
Overview
This profiling tool uses py-spy, a sampling profiler that can profile ALL threads and async tasks in PAM:
✅ Captures complete PAM behavior:
MAVLink worker threads for message processing
Redis worker threads for pub/sub handling
MQTT worker threads for message handling
Async background tasks (health status publishing, petal loading, etc.)
Petal startup/shutdown async methods
Proxy initialization and runtime behavior
✅ Rich visualization:
Interactive Speedscope viewer with flame graph, icicle graph, and timeline views
Per-thread filtering and analysis
No installation required (web-based viewer)
Quick Start
1. Prerequisites
Ensure you have:
PAM development environment set up
Python 3.11.x virtual environment at .venv
Redis, MQTT broker, and other PAM dependencies running
PAM already running (for profiling and monitoring to attach to)
2. Install Dependencies
# Navigate to petal-app-manager directory
cd /home/droneleaf/petal-app-manager-dev/petal-app-manager
# Activate virtual environment
source .venv/bin/activate
# Install profiling tools
pip install -r tools/profiling/requirements-profiling.txt
# (Optional) Install Speedscope CLI for local viewing
# Note: The web version at https://www.speedscope.app/ requires NO installation!
# Only install CLI if you prefer local viewing over uploading to web.
# Requires Node.js/npm to be installed first:
npm install -g speedscope
Installed Python tools:
py-spy - Multi-threaded sampling profiler (used by profile_pam.py)
psutil - Process monitoring (used by monitor_cpu.py)
matplotlib - Plotting library (used by monitor_cpu.py)
Optional CLI tools:
speedscope (npm) - Local Speedscope viewer (the web version at speedscope.app requires no installation)
3. Start PAM
# Terminal 1 - Start PAM first (required for profiling)
uvicorn petal_app_manager.main:app --host 0.0.0.0 --port 9000 --log-level info --no-access-log --http h11
4. Run Profiling and/or Monitoring
You can run the profiler alone, CPU monitor alone, or both simultaneously.
Option A: Profiling Only
# Terminal 2 - Profile PAM (attaches to running process)
python tools/profiling/profile_pam.py --scenario idle-no-leaffc
# Press Ctrl+C when done to stop and save profile
Option B: CPU Monitoring Only
# Terminal 2 - Monitor CPU and memory usage
python tools/profiling/monitor_cpu.py --scenario idle-no-leaffc --interval 1 --plot
# Press Ctrl+C when done to save CSV and PNG
Option C: Both Profiling and Monitoring (Not Recommended)
Note: Running both simultaneously is not recommended because the profiler adds significant CPU overhead depending on the sampling rate (30-40% increase at 100Hz), which will distort CPU monitoring results. Run them separately for accurate measurements.
# Terminal 2 - CPU monitoring
python tools/profiling/monitor_cpu.py --scenario idle-no-leaffc --interval 1 --plot
# Terminal 3 - Profiling (if needed separately)
python tools/profiling/profile_pam.py --scenario idle-no-leaffc
# Press Ctrl+C in both terminals when done
This will:
Attach to the running PAM process (PID found automatically)
Profile/monitor until you press Ctrl+C
Generate timestamped output files
Save to tools/profiling/profiles/
Output files (with matching timestamps):
Profile: pam_<scenario>_<timestamp>_profile.speedscope.json
CPU data: pam_<scenario>_<timestamp>_cpu.csv
CPU plot: pam_<scenario>_<timestamp>_cpu.png
5. View Results
Interactive (best for thread analysis):
Speedscope Profile Viewer:
Option 1: Web-based (Recommended - No installation required!)
1. Visit https://www.speedscope.app/
2. Click "Browse" and select the .speedscope.json file from:
tools/profiling/profiles/pam_*_profile.speedscope.json
3. Toggle between views:
- Time Order: See execution timeline
- Left Heavy: Flame graph (top-down)
- Sandwich: Icicle graph (bottom-up)
4. Use thread filter to isolate worker threads
Option 2: CLI (if you installed speedscope via npm)
# View speedscope locally (opens in browser)
speedscope tools/profiling/profiles/pam_idle-no-leaffc_*_profile.speedscope.json
# Or navigate to the file and run
cd tools/profiling/profiles
speedscope pam_*_profile.speedscope.json
CPU Monitor Plots:
# View the PNG plot
xdg-open tools/profiling/profiles/pam_idle-no-leaffc_*_cpu.png
# Or analyze CSV data
cat tools/profiling/profiles/pam_idle-no-leaffc_*_cpu.csv
6. Command-Line Flags
profile_pam.py Flags
python tools/profiling/profile_pam.py [OPTIONS]
Required:
--scenario TEXT Scenario label for organizing files
Examples: idle-no-leaffc, mission-execution, esc-calibration
Optional:
--output PATH Output directory for profile files
Default: tools/profiling/profiles/
--help Show help message and exit
Example:
python tools/profiling/profile_pam.py --scenario idle-no-leaffc
python tools/profiling/profile_pam.py --scenario mission-execution --output /tmp/profiles
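Under the hood, profile_pam.py is essentially a thin wrapper around py-spy record. The sketch below illustrates the attach-and-record flow under that assumption; find_pam_pid and record are hypothetical helpers, not the script's actual internals.

# Minimal sketch of the attach-and-record flow (assumes the documented
# behavior of profile_pam.py; the real script may differ in details).
import datetime
import pathlib
import subprocess

def find_pam_pid() -> int:
    """Locate the running PAM uvicorn process (same check as pgrep)."""
    out = subprocess.run(
        ["pgrep", "-f", "uvicorn.*petal_app_manager"],
        capture_output=True, text=True, check=True,
    )
    return int(out.stdout.split()[0])

def record(scenario: str, output_dir: str = "tools/profiling/profiles") -> pathlib.Path:
    pid = find_pam_pid()
    stamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
    out_file = pathlib.Path(output_dir) / f"pam_{scenario}_{stamp}_profile.speedscope.json"
    out_file.parent.mkdir(parents=True, exist_ok=True)
    # py-spy samples every thread; Ctrl+C stops recording and flushes the file.
    subprocess.run([
        "py-spy", "record",
        "--pid", str(pid),
        "--format", "speedscope",
        "--output", str(out_file),
        "--rate", "100",      # samples per second
        "--idle",             # include idle (sleeping) threads
        "--subprocesses",     # follow worker subprocesses, if any
    ])
    return out_file

if __name__ == "__main__":
    print(record("idle-no-leaffc"))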
monitor_cpu.py Flags
python tools/profiling/monitor_cpu.py [OPTIONS]
Required:
--scenario TEXT Scenario label for organizing files
Examples: idle-no-leaffc, mission-execution
Optional:
--interval INTEGER Measurement interval in seconds (default: 2)
Lower = more data points
--output PATH Output directory for CSV and PNG files
Default: tools/profiling/profiles/
--plot Generate PNG plot in addition to CSV
Recommended for visual analysis
--help Show help message and exit
Examples:
# Basic monitoring (2-second intervals, CSV only)
python tools/profiling/monitor_cpu.py --scenario idle-no-leaffc
# High-resolution monitoring with plot (1-second intervals)
python tools/profiling/monitor_cpu.py --scenario mission-execution --interval 1 --plot
# Custom output directory
python tools/profiling/monitor_cpu.py --scenario test --interval 1 --plot --output /tmp/cpu_data
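At its core, the monitor is a psutil sampling loop that writes the documented CSV columns. The following is a minimal illustration of that approach (the real monitor_cpu.py also handles Ctrl+C, plotting, and argument parsing); the monitor() helper is hypothetical.

# Minimal psutil sampling loop producing the documented CSV columns.
import csv
import datetime
import subprocess
import time

import psutil

def monitor(csv_path: str, interval: float = 1.0) -> None:
    pid = int(subprocess.check_output(
        ["pgrep", "-f", "uvicorn.*petal_app_manager"], text=True).split()[0])
    proc = psutil.Process(pid)
    start = time.monotonic()
    with open(csv_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["elapsed_seconds", "cpu_percent", "memory_mb",
                         "memory_percent", "timestamp"])
        proc.cpu_percent(None)  # prime the counter; the first reading is meaningless
        while True:
            time.sleep(interval)
            writer.writerow([
                round(time.monotonic() - start, 2),
                proc.cpu_percent(None),                    # process CPU % since last sample
                round(proc.memory_info().rss / 2**20, 1),  # resident memory in MB
                round(proc.memory_percent(), 2),
                datetime.datetime.now().isoformat(),
            ])
            fh.flush()

if __name__ == "__main__":
    monitor("pam_idle-no-leaffc_cpu.csv", interval=1.0)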
Profiling Scenarios
Example scenario labels for organizing profile data:
idle-no-leaffc
idle-with-leaffc
esc-calibration
rc-stream
mission-execution
Note: Scenarios are labels for record keeping only. The profiler captures whatever PAM is actually doing at runtime.
Workflow for Profiling Different Scenarios
Start PAM:
# Terminal 1
uvicorn petal_app_manager.main:app --host 0.0.0.0 --port 9000 --log-level info --no-access-log --http h11
Start monitoring (optional but recommended):
# Terminal 2
python tools/profiling/monitor_cpu.py --scenario <scenario> --interval 1 --plot
Start profiling:
# Terminal 3
python tools/profiling/profile_pam.py --scenario <scenario>
Execute the scenario:
For idle-*: Just let it run
For mission-execution: Send mission plan (see example below)
For esc-calibration: Trigger calibration via API
For rc-stream: Start RC streaming
Stop profiling:
Press Ctrl+C in both monitoring and profiling terminals
Files are automatically saved
Example SITL Mission for Profiling
For profiling mission execution in SITL (Software-In-The-Loop), use this sample mission plan:
File: tools/profiling/example_mission_sitl.json
{
"config": {
"joystick_mode": "ENABLED_ON_PAUSE"
},
"edges": [
{
"from": "Takeoff",
"to": "Wait 1"
},
{
"from": "Wait 1",
"to": "GotoLocalWaypoint 1"
},
{
"from": "GotoLocalWaypoint 1",
"to": "GotoLocalWaypoint 2"
},
{
"from": "GotoLocalWaypoint 2",
"to": "GotoLocalWaypoint 3"
},
{
"from": "GotoLocalWaypoint 3",
"to": "Wait 2"
},
{
"from": "Wait 2",
"to": "Land"
}
],
"id": "main",
"nodes": [
{
"name": "Takeoff",
"params": {
"alt": 1
},
"type": "Takeoff"
},
{
"name": "Wait 1",
"params": {
"duration": 2
},
"type": "Wait"
},
{
"name": "GotoLocalWaypoint 1",
"params": {
"speed": [
0.2
],
"waypoints": [
[
0.5,
0,
1
]
],
"yaw_speed": [
30
],
"yaws_deg": [
0
]
},
"type": "GotoLocalPosition"
},
{
"name": "GotoLocalWaypoint 2",
"params": {
"speed": [
0.2,
0.2
],
"waypoints": [
[
0.5,
0.5,
1
],
[
0,
0,
1
]
],
"yaw_speed": [
30,
30
],
"yaws_deg": [
0,
0
]
},
"type": "GotoLocalPosition"
},
{
"name": "GotoLocalWaypoint 3",
"params": {
"speed": [
0.2,
0.3,
0.4
],
"waypoints": [
[
0,
0.5,
1
],
[
0.5,
0.5,
1
],
[
0.5,
0,
1
]
],
"yaw_speed": [
10,
20,
20
],
"yaws_deg": [
0,
10,
20
]
},
"type": "GotoLocalPosition"
},
{
"name": "Wait 2",
"params": {
"duration": 2
},
"type": "Wait"
},
{
"name": "Land",
"params": {},
"type": "Land"
}
]
}
How to profile with this mission:
Start PAM:
# Terminal 1
uvicorn petal_app_manager.main:app --host 0.0.0.0 --port 9000 --log-level info --no-access-log --http h11
Start monitoring:
# Terminal 2
python tools/profiling/monitor_cpu.py --scenario mission-execution --interval 1 --plot
Start profiling:
# Terminal 3
python tools/profiling/profile_pam.py --scenario mission-execution
Send mission via MQTT or HTTP (a Python alternative is sketched after these steps):
# Terminal 4 (or use a separate terminal)
curl -X POST http://localhost:9000/petal-leafsdk/mission/plan \
  -H "Content-Type: application/json" \
  -d @tools/profiling/example_mission_sitl.json
Wait for mission to complete, then stop profiling (Ctrl+C in terminals 2 and 3)
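As an alternative to the curl command in step 4, the mission plan can also be posted from Python; a minimal sketch using the requests library and the endpoint shown above:

# Post the example mission plan to PAM (same endpoint as the curl command above).
import json

import requests

with open("tools/profiling/example_mission_sitl.json") as fh:
    mission = json.load(fh)

resp = requests.post(
    "http://localhost:9000/petal-leafsdk/mission/plan",
    json=mission,
    timeout=10,
)
resp.raise_for_status()
print(resp.status_code, resp.text)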
The profile will capture mission execution including:
Mission loading and validation
Mission runner loop execution
MAVLink command generation
Waypoint processing
Output Formats & Visualizations
Speedscope JSON (Interactive) ⭐
Each profiling run generates a single Speedscope JSON file:
File: pam_<scenario>_<timestamp>_profile.speedscope.json
Three interactive views in one:
Time Order: Chronological execution timeline
Left Heavy: Flame graph (top-down call hierarchy)
Sandwich: Icicle graph (bottom-up call hierarchy)
Features:
Filter by thread name to isolate workers
Search for specific functions
Zoom into call stacks
See exact time percentages and call counts
No installation required (web viewer at speedscope.app)
How to view:
Open https://www.speedscope.app/, click "Browse", and select the .speedscope.json file
Use the thread dropdown to analyze individual workers
Toggle between the three view modes
Or with CLI (if installed):
speedscope tools/profiling/profiles/pam_*_profile.speedscope.json
Monitor Output: CPU and Memory Usage
Each monitoring run generates CSV data and a plot:
Files:
pam_<scenario>_<timestamp>_cpu.csv - Time-series data
pam_<scenario>_<timestamp>_cpu.png - Visualization
CSV Columns:
elapsed_seconds - Time since monitoring started
cpu_percent - Process CPU usage (% of total system)
memory_mb - Process memory usage in MB
memory_percent - Process memory usage (% of total system)
timestamp - ISO timestamp
Plot Format:
Two subplots: CPU usage (top), Memory usage (bottom)
X-axis: Elapsed seconds
Useful for identifying CPU/memory spikes during operations
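The CSV can also be re-plotted or analyzed directly. A small sketch using matplotlib (already installed by the profiling requirements) and the column names documented above; plot_cpu_csv is a hypothetical helper:

# Re-plot a monitor CSV (columns as documented above) with matplotlib.
import csv

import matplotlib.pyplot as plt

def plot_cpu_csv(path: str) -> None:
    t, cpu, mem = [], [], []
    with open(path) as fh:
        for row in csv.DictReader(fh):
            t.append(float(row["elapsed_seconds"]))
            cpu.append(float(row["cpu_percent"]))
            mem.append(float(row["memory_mb"]))
    fig, (ax_cpu, ax_mem) = plt.subplots(2, 1, sharex=True)
    ax_cpu.plot(t, cpu)
    ax_cpu.set_ylabel("CPU (%)")
    ax_mem.plot(t, mem)
    ax_mem.set_ylabel("Memory (MB)")
    ax_mem.set_xlabel("Elapsed seconds")
    fig.tight_layout()
    plt.show()

plot_cpu_csv("tools/profiling/profiles/pam_idle-no-leaffc_20260211_143022_cpu.csv")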
Interpretation Guide
What to Look For
1. Worker Thread Overhead (Target: <5% CPU in idle)
Where: Filter by thread in Speedscope, or look for thread names in flame graph
Look for:
time.sleep() should dominate (95%+ in idle scenarios)
Actual work (message parsing, pub/sub) should be minimal in idle
Excessive polling or busy-waiting
PAM worker threads:
MAVLink worker threads: Message processing
Redis worker threads: Pub/sub handling
MQTT worker threads: Message handling
Example findings:
Thread: MAVLinkWorker-1
├─ time.sleep() ━━━━━━━━━━━━━━━━━━━ 97.5%
└─ parse_message() ━ 2.5% ← This should be minimal in idle
2. Async Task Analysis
Where: Search for async task functions in Speedscope (e.g., health status publishing, petal loading)
Look for:
Health publisher frequency and overhead
Petal loading overhead
Unnecessary async task creation
Example:
publish_health_status()
├─ redis.publish() ━━━━━ 60%
├─ get_health_data() ━━ 30%
└─ json.dumps() ━ 10%
3. Periodic Task Frequency
Where: Speedscope search or timeline view
Look for:
Functions appearing frequently in the timeline
Repetitive call patterns in flame graph
High sample counts for specific functions
How to identify:
Use Speedscope search (Ctrl/Cmd+F) to find a function
Check how many times it appears across samples
Look for regular patterns in timeline view
Example issue: Function appears every few milliseconds when it should run every few seconds
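If a periodic task turns out to fire far more often than intended, the usual remedy is an explicit interval in the task loop. An illustrative sketch (not actual PAM code):

import asyncio

# Illustrative periodic task with an explicit interval: do the work, then
# sleep for the full interval instead of re-running in a tight loop.
async def publish_health_periodically(publish, interval_s: float = 5.0) -> None:
    while True:
        await publish()                  # the periodic work
        await asyncio.sleep(interval_s)  # yield the event loop until the next run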
4. Blocking Operations
Where: Speedscope timeline view, look for long plateaus
Look for:
Synchronous HTTP requests in async context
File I/O blocking event loop
Database queries without connection pooling
Slow JSON parsing on large payloads
Red flags:
requests.get() in an async function (use aiohttp instead)
open().read() on large files (use async I/O)
time.sleep() in an async function (use asyncio.sleep())
5. Message Processing Overhead
Where: Filter by MAVLink/Redis/MQTT threads
Look for:
Time spent in message deserialization
Excessive message copying
Inefficient message routing
Reading Flame Graphs
Visual guide:
Width = CPU time (wider = more time spent)
↓
┌─────────────────────────────────────────────┐
│ main() [60.0s] │ ← Entry point (widest)
├─────────────────────┬───────────────────────┤
│ startup_all() │ message_loop() │ ← Major functions
│ [10.0s] │ [50.0s] │
├──────┬──────────────┼──────────┬────────────┤
│load_ │publish_ │parse_msg │process_msg │ ← Leaf functions
│petals│health() │() │() │ (actual work)
│[5.0s]│[5.0s] │[25.0s] │[25.0s] │
└──────┴──────────────┴──────────┴────────────┘
↑
Height = call stack depth
Interpretation:
Wide plateaus at top = Functions doing actual work (optimization targets)
Thin spikes = Quick function calls (usually not worth optimizing)
Many horizontal slices = Deep call stacks (potential for inlining)
Uneven widths = Branching logic or different code paths
Common Patterns to Identify
Pattern 1: Busy-Wait Loop
Flame graph shows:
while_loop() ━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100%
└─ check_condition() ━━━━━━━━━━━━━━━━━━ 98%
Fix: Add sleep() or use event-based waiting
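An illustrative sketch of the event-based alternative, using asyncio.Event so the waiter consumes no CPU until the condition actually changes (not actual PAM code):

import asyncio

# Event-based waiting instead of a busy-wait loop.
condition_changed = asyncio.Event()

async def waiter() -> None:
    await condition_changed.wait()      # ~0% CPU while waiting
    print("condition met, doing work")

async def producer() -> None:
    await asyncio.sleep(1.0)            # something changes state later...
    condition_changed.set()             # ...and wakes the waiter exactly once

async def main() -> None:
    await asyncio.gather(waiter(), producer())

asyncio.run(main())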
Pattern 2: Excessive Polling
Speedscope timeline shows:
Repeated calls to poll_status() every few milliseconds
Function dominates thread CPU time
Fix: Increase poll interval or use change notifications
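Where the polled state lives in Redis, change notifications can come from the existing pub/sub path instead of polling. A minimal redis-py sketch; the channel name and handler are hypothetical:

import redis

# React to published changes instead of polling a value every few milliseconds.
def handle_status_change(payload: bytes) -> None:
    print("status changed:", payload)

r = redis.Redis(host="localhost", port=6379)
pubsub = r.pubsub()
pubsub.subscribe("status_updates")       # hypothetical channel name

for message in pubsub.listen():          # blocks until a message arrives
    if message["type"] == "message":
        handle_status_change(message["data"])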
Pattern 3: Synchronous I/O in Async Context
Timeline shows long plateau:
async_handler() ━━━━━━━━━━━━━━━━━━━━━
└─ requests.get() ━━━━━━━━━━━━━━━━ (blocking!)
Fix: Use aiohttp.ClientSession() instead
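A before/after sketch of that swap; the /health URL is only an example endpoint:

import aiohttp

# Blocking version (stalls the whole event loop while the request is in flight):
#   import requests
#   async def fetch_status():
#       return requests.get("http://localhost:9000/health").json()

# Non-blocking equivalent with aiohttp:
async def fetch_status() -> dict:
    async with aiohttp.ClientSession() as session:
        async with session.get("http://localhost:9000/health") as resp:
            return await resp.json()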
Pattern 4: Redundant Serialization
Flame graph shows:
send_message() ━━━━━━━━━━━━━
├─ json.dumps() ━━━━━ 40%
├─ json.loads() ━━━━━ 40%
└─ actual_work() ━ 20%
Fix: Cache serialized data or reduce serialize/deserialize cycles
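An illustrative sketch of the caching fix: serialize once and reuse the encoded payload for every transport (not actual PAM code):

import json

# Serialize exactly once and pass the same bytes to every transport or
# subscriber, instead of re-encoding (and re-decoding) per send.
def broadcast(payload: dict, transports) -> None:
    encoded = json.dumps(payload).encode("utf-8")
    for send in transports:
        send(encoded)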
Troubleshooting
Environment Issues
Q: "Command not found: python" or "pip"
# Check if venv is activated
echo $VIRTUAL_ENV # Should show .venv path
# Activate venv
source .venv/bin/activate
# Verify prompt shows (petal-app-manager-3.11)
Q: "ModuleNotFoundError: No module named 'petal_app_manager'"
# Ensure you're in the right directory
pwd # Should be .../petal-app-manager
# Verify PYTHONPATH (profiler sets this automatically)
echo $PYTHONPATH
# Check if PAM is installed in editable mode
pip show petal-app-manager
Q: Wrong Python version
# Check version
python --version # Should be 3.11.x
# If wrong, recreate venv
rm -rf .venv
python3.11 -m venv .venv
source .venv/bin/activate
pip install -e .
Profiler Issues
Q: py-spy error: “Permission denied” or “Operation not permitted”
# Option 1: Run with sudo (not recommended)
sudo $(which python) tools/profiling/profile_pam.py ...
# Option 2: Grant ptrace capability (Linux)
sudo setcap cap_sys_ptrace=eip $(which py-spy)
# Option 3: Adjust ptrace_scope (temporary, less secure)
echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
Q: py-spy error: “No python processes found”
Ensure PAM is running before starting the profiler (it attaches to an existing process):
pgrep -f 'uvicorn.*petal_app_manager'  # should show a PID
Check that uvicorn is installed:
pip show uvicorn
Q: PAM fails to start during profiling
# Check dependencies are running
systemctl status redis
systemctl status mosquitto
# Check port 9000 is available
lsof -i :9000 # Should be empty
# Check PAM can start normally
python -m uvicorn petal_app_manager.main:app --host 0.0.0.0 --port 9000 --log-level info --no-access-log --http h11
# Then Ctrl+C and try profiler again
Q: Profiling terminates early
# Check for errors in output
python tools/profiling/profile_pam.py --scenario idle-no-leaffc 2>&1 | tee profile.log
# Look for common issues:
# - Import errors
# - Configuration file missing
# - Dependency connection failures
Visualization Issues
Q: Speedscope won’t load JSON file
Verify the file exists and is not empty: ls -lh tools/profiling/profiles/*.speedscope.json
Check the file is valid JSON: jq . <file>.speedscope.json | head
Try uploading to an alternative Speedscope instance
Check browser console for errors (F12)
Performance Issues
Q: Profile files are too large (>100MB)
# Keep the profiling window short (press Ctrl+C sooner)
python tools/profiling/profile_pam.py --scenario idle-no-leaffc
# Reduce sampling rate (edit profile_pam.py, change --rate 100 to --rate 50)
Q: Profiling takes too long
# Keep runs short for quick tests (press Ctrl+C after ~30 seconds)
python tools/profiling/profile_pam.py --scenario idle-no-leaffc
Results Interpretation Issues
Q: Profile shows 97% time.sleep(), can’t see actual work
Use Speedscope’s thread filter to isolate worker threads:
1. Upload to speedscope.app
2. Click "Thread" dropdown at top
3. Select individual worker threads (e.g., MAVLinkWorker, RedisWorker, MQTTWorker)
4. Now you'll see actual work in each thread (sleep will be much less dominant)
In idle scenarios, sleep dominating is expected. The key is looking at the non-sleep portions to understand overhead.
Q: Can’t find specific function in flame graph
Use Speedscope search:
1. Upload to speedscope.app
2. Press Ctrl+F or Cmd+F
3. Type function name
4. Click through matches to see all occurrences
Q: Don’t understand thread names
Common thread naming patterns in PAM:
- MainThread: Main event loop, PAM startup
- ThreadPoolExecutor-*: Async executor threads
- MAVLinkWorker-*: MAVLink message processing
- RedisWorker-*: Redis pub/sub handlers
- MQTTWorker-*: MQTT message handlers
- asyncio_*: Asyncio internal threads
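Thread names come from how a thread is created, so if you add your own worker and want it to be easy to filter in Speedscope, give it an explicit name. A generic Python sketch ("MyPetalWorker-1" is a made-up name):

import threading
import time

# Threads appear in py-spy / Speedscope under the name given at creation.
def worker_loop() -> None:
    while True:
        time.sleep(1.0)  # placeholder work

t = threading.Thread(target=worker_loop, name="MyPetalWorker-1", daemon=True)
t.start()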
Quick Reference
Command Cheatsheet
# SETUP
source .venv/bin/activate
pip install -r tools/profiling/requirements-profiling.txt
# START PAM (Required first!)
uvicorn petal_app_manager.main:app --host 127.0.0.1 --port 9000
# PROFILE (in separate terminal)
python tools/profiling/profile_pam.py --scenario idle-no-leaffc
# Press Ctrl+C to stop and save
# MONITOR CPU (in separate terminal)
python tools/profiling/monitor_cpu.py --scenario idle-no-leaffc --interval 1 --plot
# Press Ctrl+C to stop and save
# VISUALIZE
# Upload .speedscope.json to https://www.speedscope.app/
# Or: speedscope tools/profiling/profiles/*.speedscope.json
# Open .png files to view CPU/memory plots
File Naming Conventions
Profile output:
pam_<scenario>_<YYYYMMDD_HHMMSS>_profile.speedscope.json
CPU monitor output:
pam_<scenario>_<YYYYMMDD_HHMMSS>_cpu.csv
pam_<scenario>_<YYYYMMDD_HHMMSS>_cpu.png
Examples:
pam_idle-no-leaffc_20260211_143022_profile.speedscope.json
pam_idle-no-leaffc_20260211_143022_cpu.csv
pam_idle-no-leaffc_20260211_143022_cpu.png
pam_mission-execution_20260211_150145_profile.speedscope.json
pam_mission-execution_20260211_150145_cpu.csv
pam_mission-execution_20260211_150145_cpu.png
Keyboard Shortcuts (Speedscope)
Ctrl/Cmd + F Search for function
Ctrl/Cmd + +/- Zoom in/out
W/A/S/D Navigate flamegraph
T Toggle thread view
V Toggle view (flame/icicle/timeline)
0 Reset zoom
Getting Help
If you encounter issues not covered here:
Start PAM first - profiler and monitor attach to running process
Use separate terminals for PAM, profiling, and monitoring
Use matching scenario labels to keep related files together
Press Ctrl+C to stop - data is saved automatically via signal handlers
Check terminal output for error messages
Verify setup using commands in Quick Reference
Review profile files exist and have content:
ls -lh tools/profiling/profiles/
Common first steps:
# Full environment check
source .venv/bin/activate
python --version # Should be 3.11.x
which py-spy # Should show .venv/bin/py-spy
pip show psutil # For monitor_cpu.py
pwd # Should be .../petal-app-manager
# Verify PAM is running (required for profiling/monitoring)
pgrep -f 'uvicorn.*petal_app_manager' # Should show a PID
curl http://localhost:9000/health # Should return OK
Summary
Use profile_pam.py - py-spy profiler (sees all threads/async tasks)
Use monitor_cpu.py - simple CPU and memory tracking
Start with the Speedscope web viewer - no installation needed, just upload the JSON
Profile all 5 scenarios to identify state-specific bottlenecks
Focus on thread-level analysis in idle scenarios (should be >95% sleep)
Look for:
Excessive polling (check individual worker threads)
Blocking I/O in async context (long plateaus in timeline view)
Redundant work (wide bars in flame graph)
Worker thread overhead (filter by thread in Speedscope)
CPU/memory spikes in monitor plots
Expected idle behavior:
Worker threads: 95%+ in time.sleep()
Periodic tasks: Health publishing, file monitoring, etc.
Total CPU: <5% on idle
Memory: Stable (no leaks)