Skill v1.0.2
currentAutomated scan100/1001 files
name: document-python-direct description: Use direct Python execution for reliable spreadsheet and document/PDF generation operations
Direct Python Execution for Document and Spreadsheet Tasks
When to Use This Skill
Use direct run_shell with Python scripts for document operations when:
- Reading or writing complex Excel files with multiple sheets
- Generating PDF reports, checklists, or forms
- Applying formulas, formatting, or data transformations
- Working with
openpyxl,pandas,reportlab,fpdf2, or similar libraries - The operation involves multiple steps that could exceed agent step limits
- You need precise control over error handling and debugging
- Complex scripts benefit from file-based execution for better reliability
Why Direct Execution?
The shell_agent tool can:
- Hit maximum step limits on complex multi-step operations
- Produce unexplained errors on formatting operations
- Fail on intricate spreadsheet reads/writes due to iterative parsing
- Fail to parse heredoc syntax correctly, causing 'unknown error' failures
Direct run_shell with Python is more reliable because it:
- Executes in a single step with no iteration limits
- Provides clearer, immediate error messages
- Handles complex operations without step constraints
- Gives full control over library imports and execution flow
- Writing scripts to
.pyfiles first avoids shell_agent parsing issues with heredocs
How to Use
Recommended Pattern: Write Script to File First
For complex multi-line scripts, especially when using shell_agent as executor:
# Step 1: Write the Python script to a filecat > process_document.py << 'EOF'import openpyxlfrom openpyxl import Workbook# Your spreadsheet code herewb = openpyxl.load_workbook('file.xlsx')# ... operations ...wb.save('output.xlsx')print('Success')EOF# Step 2: Execute the scriptpython3 process_document.py
Alternative Pattern: Inline Heredoc (Simple Scripts Only)
For short, simple scripts when NOT using shell_agent as the executor:
python3 << 'EOF'import openpyxl# Simple operationwb = openpyxl.load_workbook('file.xlsx')wb.save('output.xlsx')print('Done')EOF
Spreadsheet Examples
Example 1: Read and Transform Excel Data
Write to file first, then execute:
import pandas as pd# Load data from specific sheetdf = pd.read_excel('input.xlsx', sheet_name='Revenue')# Apply transformationsdf['Net_Revenue'] = df['Gross_Revenue'] * (1 - df['Tax_Rate'])# Save resultsdf.to_excel('output.xlsx', index=False, sheet_name='Processed')
Example 2: Multi-Sheet Operations with openpyxl
from openpyxl import load_workbookwb = load_workbook('tour_data.xlsx')# Iterate through sheetsfor sheet_name in wb.sheetnames:ws = wb[sheet_name]for row in ws.iter_rows(min_row=2, max_col=5):passwb.save('tour_data_processed.xlsx')
Example 3: Complex Formatting Operations
from openpyxl import load_workbookfrom openpyxl.styles import Font, PatternFill, Alignmentwb = load_workbook('report.xlsx')ws = wb.activeheader_fill = PatternFill(start_color='4472C4', fill_type='solid')header_font = Font(bold=True, color='FFFFFF')for cell in ws[1]:cell.fill = header_fillcell.font = header_fontcell.alignment = Alignment(horizontal='center')wb.save('report_formatted.xlsx')
PDF Generation Examples
Example 4: Basic PDF Report with reportlab
from reportlab.lib.pagesizes import letterfrom reportlab.pdfgen import canvasfrom reportlab.lib.styles import getSampleStyleSheetfrom reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer# Create PDF documentdoc = SimpleDocTemplate('report.pdf', pagesize=letter)styles = getSampleStyleSheet()story = []# Add contentstory.append(Paragraph('Monthly Report', styles['Heading1']))story.append(Spacer(1, 12))story.append(Paragraph('Generated successfully.', styles['Normal']))# Build PDFdoc.build(story)print('PDF created successfully')
Example 5: PDF Checklist with fpdf2
from fpdf import FPDF# Create PDF instancepdf = FPDF()pdf.add_page()pdf.set_font('Arial', 'B', 16)# Titlepdf.cell(0, 10, 'Safety Checklist', ln=True, align='C')pdf.ln(10)# Checklist itemspdf.set_font('Arial', '', 12)items = ['Check equipment', 'Verify connections', 'Test system', 'Document results']for i, item in enumerate(items, 1):pdf.cell(5, 10, f'[ ] {item}', ln=True)# Savepdf.output('checklist.pdf')print('Checklist PDF generated')
Example 6: Combined Spreadsheet + PDF Workflow
import pandas as pdfrom reportlab.lib.pagesizes import letterfrom reportlab.pdfgen import canvas# Step 1: Process spreadsheet datadf = pd.read_excel('data.xlsx')summary = df.describe()# Step 2: Generate PDF reportc = canvas.Canvas('summary_report.pdf', pagesize=letter)c.drawString(100, 750, 'Data Summary Report')c.drawString(100, 730, f'Total Records: {len(df)}')c.drawString(100, 710, f'Mean Value: {df.iloc[:, 0].mean():.2f}')c.save()print('Spreadsheet processed and PDF report generated')
Error Handling Pattern
Write to file first, then execute:
import sysfrom openpyxl import load_workbooktry:wb = load_workbook('data.xlsx')ws = wb.active# Your operations herevalue = ws['A1'].valuewb.save('output.xlsx')print(f"Success: Processed {ws.max_row} rows")except Exception as e:print(f"Error: {str(e)}", file=sys.stderr)sys.exit(1)
Best Practices
- Prefer file-based execution for complex scripts: write to
.pyfile first, then execute viarun_shell - Import only needed libraries to reduce execution time
- Print clear success/error messages for debugging
- Save intermediate results for complex multi-step transformations
- Test with small data before scaling to large spreadsheets or documents
- Use pandas for data manipulation and openpyxl for formatting when both are needed
- Clean up temporary script files after execution if they won't be reused
- Use PDF libraries appropriate to your needs: reportlab for complex layouts, fpdf2 for simple documents
When NOT to Use This Skill
- Simple single-cell reads/writes (use shell_agent or basic commands)
- Operations that require interactive user input
- Tasks where you need the agent to iteratively refine the approach
- Very simple text files (use basic shell commands instead)
Common Libraries
| Library | Best For | |
|---|---|---|
openpyxl | Reading/writing .xlsx files, formatting, formulas | |
pandas | Data manipulation, analysis, merging datasets | |
xlrd | Reading older .xls files (read-only) | |
xlsxwriter | Creating new .xlsx files with advanced formatting | |
reportlab | Professional PDF reports with complex layouts | |
fpdf2 | Simple PDF generation, checklists, basic documents | |
python-docx | Creating and editing Word documents |
Troubleshooting
Issue: Heredoc syntax fails with 'unknown error' when using shell_agent
- Solution: Write the Python script to a
.pyfile first, then execute it withpython3 script.py. This pattern is significantly more reliable than inline heredoc execution when shell_agent is the executor.
Issue: FileNotFoundError
- Solution: Verify the file path is absolute or relative to the working directory
Issue: PermissionError
- Solution: Ensure the file is not open in another application
Issue: MemoryError on large files
- Solution: Process data in chunks using pandas
chunksizeparameter
Issue: Formatting not applying
- Solution: Ensure you're modifying cell styles before saving, and use
.copy()for style objects
Issue: PDF text encoding errors
- Solution: Use Unicode-compatible fonts and ensure text is properly encoded (UTF-8)
Issue: PDF library not found
- Solution: Install required library with
pip install reportlaborpip install fpdf2
Issue: Agent fails before tool execution (skill selector returns empty)
- Solution: Ensure the skill is properly registered and the task matches the skill's description. For document generation tasks, explicitly mention "PDF" or "spreadsheet" in the task description to trigger skill selection.