Vendor Report Generator
A Python tool that generates comprehensive vendor punchlist reports from Excel files. The tool processes Excel data, normalizes vendor information, calculates metrics, and generates both JSON and interactive HTML reports.
Features
- Direct Excel Processing: Reads Excel files directly using pandas
- Data Normalization: Automatically normalizes vendor names, statuses, and priorities
- 24-Hour Updates: Tracks items added, closed, or changed to monitor status in the last 24 hours (based on Baltimore/Eastern timezone)
- Priority Tracking: Groups items by priority levels (Very High, High, Medium, Low)
- Oldest Unaddressed Items: Identifies and highlights the three oldest unaddressed items per vendor
- Interactive HTML Reports: Generates searchable, filterable HTML reports with tabs and filters
- JSON Export: Exports structured JSON data for further processing
Requirements
- Python 3.8 or higher
- Dependencies listed in requirements.txt
Installation
- Clone the repository:
    git clone https://gitea.lci.ge/ilia.gurielidze/vendor_report.git
    cd vendor_report
- Create a virtual environment (recommended):
    python3 -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies:
    pip install -r requirements.txt
Setup
- Prepare your Excel files:
- Place your Excel files (.xlsx or .xls) in the reports/ directory
- Ensure your Excel files have the following columns (in order):
- Column 0: Punchlist Name
- Column 1: Vendor
- Column 2: Priority
- Column 3: Description
- Column 4: Date Identified
- Column 5: Status Updates
- Column 6: Issue Image
- Column 7: Status
- Column 8: Date Completed (optional)
- Create necessary directories (if they don't exist):
    mkdir -p reports output
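The positional column layout listed above can be captured in a small lookup table. The following is a minimal sketch, not the tool's actual code: the field names in COLUMN_FIELDS and the helper row_to_record are hypothetical and only illustrate mapping a row to named fields by index.

```python
from typing import Any, Dict, Optional, Sequence

# Hypothetical field names, one per column index from the list above.
COLUMN_FIELDS = [
    "punchlist_name",   # Column 0
    "vendor",           # Column 1
    "priority",         # Column 2
    "description",      # Column 3
    "date_identified",  # Column 4
    "status_updates",   # Column 5
    "issue_image",      # Column 6
    "status",           # Column 7
    "date_completed",   # Column 8 (optional)
]


def row_to_record(row: Sequence[Any]) -> Dict[str, Optional[Any]]:
    """Map one spreadsheet row to a dict by column position.

    Missing trailing cells (such as the optional Date Completed)
    are filled with None; extra cells beyond Column 8 are ignored.
    """
    padded = list(row) + [None] * (len(COLUMN_FIELDS) - len(row))
    return dict(zip(COLUMN_FIELDS, padded))


# Example row without the optional Date Completed column.
record = row_to_record([
    "Punchlist A", "Autstand", "(2) High", "Loose guard rail",
    "10/14/25", "Awaiting parts", "", "Incomplete",
])
```

Because the mapping is positional, a spreadsheet whose columns are reordered would be misread, which is why the order in the list above matters.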
Usage
Basic Usage
Generate a report from Excel files in the reports/ directory:
python3 report_generator.py
This will:
- Process all Excel files in the reports/ directory
- Generate a JSON report at output/report.json
- Generate an HTML report at output/report.html
- Save preprocessed data to output/preprocessed_data.txt
Command-Line Options
python3 report_generator.py [OPTIONS]
Options:
- --reports-dir DIR: Directory containing Excel files (default: reports)
- --output FILE: Output JSON file path (default: output/report.json)
- --verbose: Print verbose output (default: True)
Examples:
# Use a custom reports directory
python3 report_generator.py --reports-dir /path/to/excel/files
# Specify custom output file
python3 report_generator.py --output /path/to/output/report.json
# Combine options
python3 report_generator.py --reports-dir my_reports --output my_output/report.json
Programmatic Usage
You can also use the report generator in your own Python scripts:
from report_generator import generate_report
# Generate report with default settings
report_data = generate_report()
# Or with custom settings
report_data = generate_report(
    reports_dir="my_reports",
    output_file="my_output/report.json",
    verbose=True,
)
# report_data is a dictionary containing the full report structure
print(f"Processed {len(report_data['vendors'])} vendors")
Report Structure
JSON Report Structure
The generated JSON report follows this structure:
{
  "report_generated_at": "2025-11-05T22:00:00",
  "vendors": [
    {
      "vendor_name": "VendorName",
      "total_items": 10,
      "closed_count": 5,
      "open_count": 3,
      "monitor_count": 2,
      "updates_24h": {
        "added": [...],
        "closed": [...],
        "changed_to_monitor": [...]
      },
      "oldest_unaddressed": [...],
      "very_high_priority_items": [...],
      "high_priority_items": [...],
      "closed_items": [...],
      "monitor_items": [...],
      "open_items": [...]
    }
  ],
  "summary": {
    "total_vendors": 5,
    "total_items": 50,
    "total_closed": 25,
    "total_open": 15,
    "total_monitor": 10
  }
}
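As an illustration of consuming this structure, the sketch below prints one line per vendor from a loaded report. It relies only on the keys shown above; the helper name vendor_counts is hypothetical, and the sample data is a trimmed in-memory copy of the example rather than real output.

```python
import json
from typing import Any, Dict, List


def vendor_counts(report: Dict[str, Any]) -> List[str]:
    """Return one human-readable summary line per vendor."""
    lines = []
    for vendor in report.get("vendors", []):
        lines.append(
            f"{vendor['vendor_name']}: "
            f"{vendor['open_count']} open, "
            f"{vendor['monitor_count']} monitor, "
            f"{vendor['closed_count']} closed "
            f"(of {vendor['total_items']})"
        )
    return lines


# Trimmed sample mirroring the structure shown above.
sample = json.loads("""
{
  "vendors": [
    {"vendor_name": "VendorName", "total_items": 10,
     "closed_count": 5, "open_count": 3, "monitor_count": 2}
  ]
}
""")
summary_lines = vendor_counts(sample)
```

For a generated file, the same function could be applied to the result of json.load on output/report.json.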
HTML Report Features
The HTML report includes:
- Summary Cards: Overview statistics at the top
- Vendor Tabs: Quick navigation between vendors
- Status Tabs: Filter by status (All, Yesterday's Updates, Oldest Unaddressed, Closed, Monitor, Open)
- Search & Filters:
    - Search by item name or description
    - Filter by vendor, status, or priority
- Quick Filters:
    - Show only vendors with yesterday's updates
    - Show only vendors with oldest unaddressed items
    - Show all vendors
- Interactive Elements: Click tabs to switch views, use filters to narrow down results
Data Processing Details
Vendor Name Normalization
The tool automatically normalizes vendor names:
- Handles case variations (e.g., "autstand" → "Autstand")
- Preserves intentional capitalization (e.g., "AutStand" stays as-is)
- Normalizes combined vendors (e.g., "Autstand/Beumer")
- Handles vendors in parentheses (e.g., "MFO (Amazon)")
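One way to get the behaviour listed above is to re-case only names that are entirely lowercase and leave everything else untouched. This is a simplified sketch under that assumption, not the logic in data_preprocessor.py; the function name normalize_vendor is hypothetical.

```python
def normalize_vendor(raw: str) -> str:
    """Illustrative vendor-name cleanup (not the tool's actual rules)."""
    parts = []
    # Combined vendors such as "Autstand/Beumer" are handled part by part.
    for part in raw.split("/"):
        part = " ".join(part.split())  # trim and collapse inner whitespace
        if not part:
            continue
        # Only all-lowercase names are re-cased; mixed or upper case
        # (e.g. "AutStand", "MFO (Amazon)") is treated as intentional.
        if part.islower():
            part = part.title()
        parts.append(part)
    return "/".join(parts)
```

With this rule, "autstand" becomes "Autstand" while "AutStand" and "MFO (Amazon)" pass through unchanged.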
Status Normalization
Statuses are normalized to:
- Complete: Items with status containing "complete" or "complette"
- Monitor: Items with status containing "monitor" or "montor"
- Incomplete: All other items (default)
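The substring rules above might be sketched as follows. Note one assumption that goes beyond the text: since the word "incomplete" itself contains "complete", this sketch checks for it first so that such items are not counted as Complete. Whether the tool does the same is not stated here, and the function name normalize_status is hypothetical.

```python
from typing import Optional


def normalize_status(raw: Optional[str]) -> str:
    """Map a free-text status to Complete, Monitor, or Incomplete."""
    text = (raw or "").strip().lower()
    # Assumption: an explicit "incomplete" wins, because it would
    # otherwise match the "complete" substring test below.
    if "incomplete" in text:
        return "Incomplete"
    # "complette" and "montor" are the misspellings listed above.
    if "complete" in text or "complette" in text:
        return "Complete"
    if "monitor" in text or "montor" in text:
        return "Monitor"
    # Default for everything else, including empty cells.
    return "Incomplete"
```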
Priority Classification
Priorities are classified as:
- Very High: Priority contains "(1) Very High" or "Very High"
- High: Priority contains "(2) High" or "High" (but not "Very High")
- Medium: Priority contains "(3) Medium" or "Medium"
- Low: Priority contains "(4) Low" or "Low"
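The ordering matters in the rules above: "Very High" has to be tested before "High", because the former contains the latter. A minimal sketch of that ordering, with a hypothetical function name and case-insensitive matching as an added assumption:

```python
from typing import Optional

# Checked in order; "very high" must come before "high".
PRIORITY_RULES = (
    ("very high", "Very High"),
    ("high", "High"),
    ("medium", "Medium"),
    ("low", "Low"),
)


def classify_priority(raw: Optional[str]) -> Optional[str]:
    """Return the priority label, or None if nothing matches."""
    text = (raw or "").lower()
    for needle, label in PRIORITY_RULES:
        if needle in text:
            return label
    return None
```

Values such as "(1) Very High" and "Very High" both match the first rule, so the numeric prefix needs no separate handling.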
24-Hour Window Calculation
The tool uses Baltimore/Eastern timezone (America/New_York) for calculating 24-hour updates:
- Items are considered "added in last 24h" if their date_identified falls on yesterday's date
- Items are considered "closed in last 24h" if their date_completed falls on yesterday's date
- Items are considered "changed to monitor" if their status is Monitor and the date falls within the 24-hour window
Output Files
After running the generator, you'll find:
- output/report.json: Structured JSON report data
- output/report.html: Interactive HTML report (open in browser)
- output/preprocessed_data.txt: Human-readable preprocessed data (for debugging)
Project Structure
vendor_report/
├── report_generator.py # Main report generation script
├── data_preprocessor.py # Excel data preprocessing and normalization
├── html_generator.py # HTML report generation
├── models.py # Pydantic data models
├── excel_to_text.py # Utility for Excel to text conversion
├── requirements.txt # Python dependencies
├── reports/ # Directory for input Excel files
├── output/ # Directory for generated reports
└── README.md # This file
Troubleshooting
No Excel files found
Ensure your Excel files are in the reports/ directory and have .xlsx or .xls extensions.
Date parsing errors
The tool supports common date formats:
- MM/DD/YY (e.g., 10/14/25)
- MM/DD/YYYY (e.g., 10/14/2025)
- YYYY-MM-DD (e.g., 2025-10-17)
- YYYY-MM-DD HH:MM:SS (e.g., 2025-10-17 00:00:00)
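When debugging a value that fails to parse, it can help to test it against the listed formats directly. The sketch below tries each format in turn with datetime.strptime; the function name parse_date is hypothetical, and the tool's own parser may differ.

```python
from datetime import datetime
from typing import Optional

# strptime patterns for the four formats listed above.
DATE_FORMATS = (
    "%m/%d/%y",           # MM/DD/YY
    "%m/%d/%Y",           # MM/DD/YYYY
    "%Y-%m-%d",           # YYYY-MM-DD
    "%Y-%m-%d %H:%M:%S",  # YYYY-MM-DD HH:MM:SS
)


def parse_date(text: str) -> Optional[datetime]:
    """Return a naive datetime for the first matching format, else None."""
    cleaned = text.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt)
        except ValueError:
            continue  # try the next format
    return None
```

The result is a naive datetime; attaching the America/New_York timezone would be a separate step.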
Permission errors
If you encounter permission errors, ensure you have write access to the output/ directory.
Missing dependencies
If you get import errors, ensure all dependencies are installed:
pip install -r requirements.txt
Timezone Notes
The tool uses Baltimore/Eastern timezone (America/New_York) for all date calculations. This ensures consistent 24-hour window calculations regardless of where the script is run. All dates are stored as timezone-aware datetime objects.
License
[Add your license information here]
Contributing
[Add contribution guidelines if applicable]
Support
For issues or questions, please contact [your contact information or issue tracker URL].