471 lines
14 KiB
Markdown
471 lines
14 KiB
Markdown
# Vendor Report Generator - Taskboard Integration Context
|
|
|
|
## 🎯 Goal & Purpose
|
|
|
|
The **Vendor Report Generator** is a Python-based tool designed to automate the generation of comprehensive vendor punchlist reports from Excel files stored in SharePoint. The goal is to:
|
|
|
|
1. **Automate Report Generation**: Eliminate manual Excel processing and report creation
|
|
2. **Centralize Data**: Pull vendor punchlist data directly from SharePoint
|
|
3. **Provide Insights**: Generate actionable reports with metrics, priorities, and status tracking
|
|
4. **Enable Integration**: Make reports accessible within Taskboard for team collaboration
|
|
|
|
### Business Value
|
|
|
|
- **Time Savings**: Automates hours of manual report generation
|
|
- **Accuracy**: Consistent data normalization and calculation
|
|
- **Visibility**: Real-time vendor status tracking and metrics
|
|
- **Accessibility**: Web-based interface for non-technical users
|
|
- **Integration Ready**: Can be embedded as a tool/widget in Taskboard
|
|
|
|
---
|
|
|
|
## 📋 Application Overview
|
|
|
|
### What It Does
|
|
|
|
The application processes Excel files containing vendor punchlist items and generates:
|
|
- **Interactive HTML Reports**: Searchable, filterable web reports with vendor tabs, status filters, and priority grouping
|
|
- **JSON Data**: Structured data for further processing or API integration
|
|
- **Metrics**: Per-vendor statistics (total items, closed/open counts, 24-hour updates, oldest unaddressed items)
|
|
|
|
### Key Features
|
|
|
|
1. **Excel Processing**: Direct pandas-based reading (no manual conversion needed)
|
|
2. **Data Normalization**: Automatically handles vendor name variations, status inconsistencies, priority classifications
|
|
3. **24-Hour Tracking**: Identifies items added, closed, or changed to monitor status in the last 24 hours (Baltimore/Eastern timezone)
|
|
4. **Priority Classification**: Groups items by Very High, High, Medium, Low priorities
|
|
5. **Oldest Items**: Highlights the oldest 3 unaddressed items per vendor
|
|
6. **SharePoint Integration**: Automatically downloads Excel files from SharePoint
|
|
7. **Scheduled Generation**: Can run automatically on a schedule
|
|
8. **Web UI**: User-friendly interface for generating reports
|
|
9. **REST API**: Programmatic access for integration
|
|
|
|
---
|
|
|
|
## 🏗️ Architecture & Components
|
|
|
|
### Core Components
|
|
|
|
```
|
|
vendor_report/
|
|
├── report_generator.py # Main entry point - orchestrates report generation
|
|
├── data_preprocessor.py # Excel parsing, normalization, data cleaning
|
|
├── html_generator.py # Generates interactive HTML reports
|
|
├── models.py # Pydantic data models for validation
|
|
├── sharepoint_downloader.py # SharePoint file downloader
|
|
├── scheduler.py # Scheduled report generation
|
|
├── api_server.py # REST API server
|
|
├── web_ui.py # Web UI server (Flask-based)
|
|
├── config.py # Configuration management
|
|
└── config.yaml # Configuration file
|
|
```
|
|
|
|
### Data Flow
|
|
|
|
```
|
|
SharePoint Excel Files
|
|
↓
|
|
[SharePoint Downloader] → Local reports/ directory
|
|
↓
|
|
[Data Preprocessor] → Normalize vendors, statuses, priorities, parse dates
|
|
↓
|
|
[Report Generator] → Calculate metrics, group by vendor, identify updates
|
|
↓
|
|
[HTML Generator] → Generate interactive report.html
|
|
↓
|
|
[Output] → output/report.json + output/report.html
|
|
```
|
|
|
|
### Processing Pipeline
|
|
|
|
1. **Input**: Excel files with columns:
|
|
- Punchlist Name, Vendor, Priority, Description, Date Identified, Status Updates, Issue Image, Status, Date Completed
|
|
|
|
2. **Preprocessing**:
|
|
- Parse Excel files using pandas
|
|
- Normalize vendor names (handle case variations, combined vendors)
|
|
- Normalize statuses (Complete, Monitor, Incomplete)
|
|
- Classify priorities (Very High, High, Medium, Low)
|
|
- Parse dates (multiple formats supported)
|
|
- Calculate 24-hour windows (Baltimore/Eastern timezone)
|
|
- Calculate item age (days since identified)
|
|
|
|
3. **Report Generation**:
|
|
- Group items by vendor
|
|
- Calculate metrics per vendor (total, closed, open, monitor counts)
|
|
- Identify 24-hour updates (added, closed, changed to monitor)
|
|
- Find oldest 3 unaddressed items per vendor
|
|
- Group by priority levels
|
|
- Generate JSON structure
|
|
- Generate HTML report
|
|
|
|
4. **Output**:
|
|
- `output/report.json`: Structured JSON data
|
|
- `output/report.html`: Interactive HTML report
|
|
- `output/preprocessed_data.txt`: Debug/preview data
|
|
|
|
---
|
|
|
|
## 🔧 Technical Details
|
|
|
|
### Dependencies
|
|
|
|
```python
|
|
# Core
|
|
pandas>=2.0.0 # Excel processing
|
|
openpyxl>=3.0.0 # Excel file reading
|
|
pydantic>=2.0.0 # Data validation
|
|
|
|
# Optional: SharePoint
|
|
Office365-REST-Python-Client>=2.3.0 # SharePoint API
|
|
|
|
# Optional: Scheduling
|
|
apscheduler>=3.10.0 # Task scheduling
|
|
|
|
# Optional: Web UI/API
|
|
flask>=2.3.0 # Web framework
|
|
flask-cors>=4.0.0 # CORS support
|
|
|
|
# Configuration
|
|
pyyaml>=6.0 # YAML config parsing
|
|
```
|
|
|
|
### Configuration
|
|
|
|
Configuration is managed via `config.yaml`:
|
|
|
|
```yaml
|
|
sharepoint:
|
|
enabled: true/false
|
|
site_url: "https://company.sharepoint.com/sites/SiteName"
|
|
folder_path: "/Shared Documents/Reports"
|
|
use_app_authentication: true # Azure AD app auth (recommended)
|
|
client_id: "azure-ad-client-id"
|
|
client_secret: "azure-ad-client-secret"
|
|
|
|
scheduler:
|
|
enabled: true/false
|
|
schedule_type: "interval" | "cron" | "once"
|
|
interval_hours: 24
|
|
cron_expression: "0 8 * * *" # 8 AM daily
|
|
|
|
api:
|
|
enabled: true/false
|
|
port: 8080
|
|
api_key: "optional-api-key"
|
|
|
|
report:
|
|
output_dir: "output"
|
|
reports_dir: "reports"
|
|
```
|
|
|
|
### API Endpoints
|
|
|
|
**Web UI Server** (`web_ui.py`):
|
|
- `GET /` - Web UI interface
|
|
- `POST /api/generate` - Generate report
|
|
- `POST /api/update-sharepoint` - Download files from SharePoint
|
|
- `GET /api/status` - Service status
|
|
- `GET /api/reports` - List generated reports
|
|
- `GET /api/config` - Configuration (safe, no secrets)
|
|
- `GET /reports/<filename>` - Serve report files
|
|
|
|
**API Server** (`api_server.py`):
|
|
- `POST /api/generate` - Generate report (programmatic)
|
|
- `GET /api/status` - Service status
|
|
- `GET /health` - Health check
|
|
|
|
### Data Models
|
|
|
|
**PunchlistItem**:
|
|
- punchlist_name, description, priority, date_identified, date_completed
|
|
- status, status_updates, issue_image, age_days
|
|
|
|
**VendorMetrics**:
|
|
- vendor_name, total_items, closed_count, open_count, monitor_count
|
|
- updates_24h (added, closed, changed_to_monitor)
|
|
- oldest_unaddressed (top 3)
|
|
- very_high_priority_items, high_priority_items
|
|
|
|
**FullReport**:
|
|
- report_generated_at, vendors[], summary{}
|
|
|
|
---
|
|
|
|
## 🔗 Taskboard Integration Possibilities
|
|
|
|
### Option 1: Embedded Widget/Page
|
|
|
|
Create a new page in Taskboard (`/vendor-reports`) that:
|
|
- Uses Taskboard's authentication (already authenticated users)
|
|
- Embeds the generated HTML report in an iframe or renders it directly
|
|
- Provides a button to trigger report generation
|
|
- Shows report history/list
|
|
|
|
**Implementation**:
|
|
```typescript
|
|
// taskboard/src/app/(dashboard)/vendor-reports/page.tsx
|
|
// - Call Python API server to generate reports
|
|
// - Display generated HTML reports
|
|
// - Use Taskboard's UI components for consistency
|
|
```
|
|
|
|
### Option 2: API Integration
|
|
|
|
Create Taskboard API routes that proxy to the Python API:
|
|
- `POST /api/vendor-reports/generate` → Calls Python `POST /api/generate`
|
|
- `GET /api/vendor-reports/list` → Calls Python `GET /api/reports`
|
|
- `GET /api/vendor-reports/status` → Calls Python `GET /api/status`
|
|
|
|
**Benefits**:
|
|
- Single authentication system (Taskboard)
|
|
- Consistent API patterns
|
|
- Can add Taskboard-specific features (notifications, task linking)
|
|
|
|
### Option 3: Background Service
|
|
|
|
Run the Python scheduler as a background service that:
|
|
- Generates reports on schedule
|
|
- Saves reports to a shared location
|
|
- Taskboard displays the latest report
|
|
- Can trigger notifications when reports are updated
|
|
|
|
### Option 4: Task Integration
|
|
|
|
Link reports to Taskboard tasks:
|
|
- Create tasks for vendors with unaddressed items
|
|
- Link report generation to project/task completion
|
|
- Use report metrics in task dashboards
|
|
|
|
---
|
|
|
|
## 🚀 Usage Examples
|
|
|
|
### Command Line
|
|
|
|
```bash
|
|
# Generate report from local files
|
|
python report_generator.py
|
|
|
|
# Generate with custom directories
|
|
python report_generator.py --reports-dir /path/to/excel --output /path/to/output.json
|
|
```
|
|
|
|
### Web UI
|
|
|
|
```bash
|
|
# Start web UI server
|
|
python web_ui.py
|
|
|
|
# Open browser: http://localhost:8080
|
|
# Click "Update Data from SharePoint" → "Generate Report"
|
|
```
|
|
|
|
### API
|
|
|
|
```bash
|
|
# Generate report via API
|
|
curl -X POST http://localhost:8080/api/generate \
|
|
-H "Content-Type: application/json" \
|
|
-d '{"download_from_sharepoint": false}'
|
|
|
|
# Update from SharePoint
|
|
curl -X POST http://localhost:8080/api/update-sharepoint
|
|
```
|
|
|
|
### Scheduled
|
|
|
|
```bash
|
|
# Start scheduler (runs continuously)
|
|
python scheduler.py
|
|
|
|
# Configured via config.yaml:
|
|
# scheduler:
|
|
# enabled: true
|
|
# schedule_type: "cron"
|
|
# cron_expression: "0 8 * * *" # 8 AM daily
|
|
```
|
|
|
|
### Programmatic (Python)
|
|
|
|
```python
|
|
from report_generator import generate_report
|
|
|
|
# Generate report
|
|
report_data = generate_report(
|
|
reports_dir="reports",
|
|
output_file="output/report.json",
|
|
verbose=True
|
|
)
|
|
|
|
# Access data
|
|
vendors = report_data['vendors']
|
|
summary = report_data['summary']
|
|
```
|
|
|
|
---
|
|
|
|
## 📊 Report Structure
|
|
|
|
### JSON Report Format
|
|
|
|
```json
|
|
{
|
|
"report_generated_at": "2025-11-06T16:00:00",
|
|
"vendors": [
|
|
{
|
|
"vendor_name": "VendorName",
|
|
"total_items": 10,
|
|
"closed_count": 5,
|
|
"open_count": 3,
|
|
"monitor_count": 2,
|
|
"updates_24h": {
|
|
"added": [...],
|
|
"closed": [...],
|
|
"changed_to_monitor": [...]
|
|
},
|
|
"oldest_unaddressed": [...],
|
|
"very_high_priority_items": [...],
|
|
"high_priority_items": [...],
|
|
"closed_items": [...],
|
|
"monitor_items": [...],
|
|
"open_items": [...]
|
|
}
|
|
],
|
|
"summary": {
|
|
"total_vendors": 5,
|
|
"total_items": 50,
|
|
"total_closed": 25,
|
|
"total_open": 15,
|
|
"total_monitor": 10
|
|
}
|
|
}
|
|
```
|
|
|
|
### HTML Report Features
|
|
|
|
- **Summary Cards**: Overview statistics
|
|
- **Vendor Tabs**: Quick navigation between vendors
|
|
- **Status Tabs**: Filter by All, Yesterday's Updates, Oldest Unaddressed, Closed, Monitor, Open
|
|
- **Search & Filters**: Search by name/description, filter by vendor/status/priority
|
|
- **Quick Filters**: Show only vendors with updates or oldest items
|
|
- **Responsive Design**: Works on desktop and mobile
|
|
|
|
---
|
|
|
|
## 🔐 Authentication & Security
|
|
|
|
### Current State
|
|
|
|
- **Web UI**: Optional API key authentication
|
|
- **SharePoint**: Azure AD app authentication (recommended) or user credentials
|
|
- **No User Management**: Standalone application
|
|
|
|
### Taskboard Integration Benefits
|
|
|
|
- **Leverage Existing Auth**: Use Taskboard's Authentik/Microsoft Entra ID authentication
|
|
- **Role-Based Access**: Control who can generate/view reports
|
|
- **Audit Trail**: Track who generated reports (via Taskboard user system)
|
|
- **Secure Storage**: Use Taskboard's file storage for reports
|
|
|
|
---
|
|
|
|
## 📝 Integration Checklist
|
|
|
|
### Phase 1: Basic Integration
|
|
- [ ] Set up Python API server as background service
|
|
- [ ] Create Taskboard API route that proxies to Python API
|
|
- [ ] Create Taskboard page to display reports
|
|
- [ ] Add "Generate Report" button in Taskboard UI
|
|
|
|
### Phase 2: Enhanced Integration
|
|
- [ ] Use Taskboard authentication for report access
|
|
- [ ] Store report metadata in Taskboard database
|
|
- [ ] Add report history/versioning
|
|
- [ ] Link reports to projects/tasks
|
|
|
|
### Phase 3: Advanced Features
|
|
- [ ] Scheduled report generation via Taskboard
|
|
- [ ] Notifications when reports are generated
|
|
- [ ] Dashboard widgets showing report metrics
|
|
- [ ] Export reports to Taskboard tasks/boards
|
|
|
|
---
|
|
|
|
## 🛠️ Development Notes
|
|
|
|
### Running Locally
|
|
|
|
```bash
|
|
# Setup
|
|
cd vendor_report
|
|
python -m venv venv
|
|
source venv/bin/activate # Windows: venv\Scripts\activate
|
|
pip install -r requirements.txt
|
|
|
|
# Configure
|
|
cp config.yaml.template config.yaml
|
|
# Edit config.yaml with SharePoint credentials
|
|
|
|
# Run Web UI
|
|
python web_ui.py
|
|
# Open http://localhost:8080
|
|
```
|
|
|
|
### Deployment Considerations
|
|
|
|
- **Python Environment**: Requires Python 3.8+
|
|
- **Dependencies**: Install via pip
|
|
- **Configuration**: Store secrets securely (environment variables or vault)
|
|
- **Port**: Default 8080 (configurable)
|
|
- **File Storage**: Reports saved to `output/` directory
|
|
- **SharePoint**: Requires Azure AD app registration
|
|
|
|
### Error Handling
|
|
|
|
- Graceful handling of missing Excel files
|
|
- SharePoint connection errors logged
|
|
- Invalid data formats handled
|
|
- User-friendly error messages in Web UI
|
|
|
|
---
|
|
|
|
## 📚 Additional Resources
|
|
|
|
- **SharePoint Setup**: See `SHAREPOINT_SETUP.md` for detailed Azure AD configuration
|
|
- **Quick Start**: See `QUICK_START.md` for 5-minute setup guide
|
|
- **Full Documentation**: See `README.md` for complete usage guide
|
|
|
|
---
|
|
|
|
## 💡 Integration Ideas for Taskboard
|
|
|
|
1. **Vendor Dashboard**: Show vendor metrics as cards/widgets
|
|
2. **Report History**: Track when reports were generated, by whom
|
|
3. **Task Creation**: Auto-create tasks for vendors with oldest unaddressed items
|
|
4. **Notifications**: Alert project managers when reports are generated
|
|
5. **Export to Tasks**: Convert report items to Taskboard tasks
|
|
6. **Project Linking**: Associate reports with Taskboard projects
|
|
7. **Scheduled Reports**: Use Taskboard's scheduling to trigger reports
|
|
8. **Role-Based Views**: Different report views for different user roles
|
|
|
|
---
|
|
|
|
## 🔄 Current Status
|
|
|
|
- ✅ Core functionality complete
|
|
- ✅ SharePoint integration working
|
|
- ✅ Web UI functional
|
|
- ✅ API endpoints available
|
|
- ✅ Scheduled generation supported
|
|
- ⏳ Taskboard integration pending
|
|
- ⏳ Authentication integration pending
|
|
- ⏳ Database storage pending
|
|
|
|
---
|
|
|
|
**Last Updated**: November 6, 2025
|
|
**Version**: 1.0
|
|
**Status**: Production Ready (Standalone), Integration Ready (Taskboard)
|
|
|