Monitoring/workflow_state.md

# Workflow State & Rules (STM + Rules + Log)

*This file contains the dynamic state, embedded rules, active plan, and log for the current session.*
*It is read and updated frequently by the AI during its operational loop.*

---

## State

*Holds the current status of the workflow.*

```yaml
Phase: CONSTRUCT # Current workflow phase (ANALYZE, BLUEPRINT, CONSTRUCT, VALIDATE, BLUEPRINT_REVISE)
Status: IN_PROGRESS # Current status (READY, IN_PROGRESS, BLOCKED_*, NEEDS_*, COMPLETED, COMPLETED_ITERATION)
CurrentTaskID: BuildFlaskComparisonApp # Identifier for the main task being worked on
CurrentStep: Step 7.4 # Identifier for the specific step in the plan being executed
CurrentItem: null # Identifier for the item currently being processed in iteration
```

---

## Plan

*Contains the step-by-step implementation plan generated during the BLUEPRINT phase.*

**Task: BuildFlaskComparisonApp**

*Creating a Python Flask web application to compare three data sources: SCADA JSON files, Manifest Excel file, and DWG Excel file.*

### 1. Project Setup

*   `[✓] Step 1.1: Create project directory structure.`
    - Create main project folder
    - Create templates/ directory for HTML templates
    - Create static/ directory for CSS and JavaScript
    - Create clones/ directory for GitHub repositories
*   `[✓] Step 1.2: Create requirements.txt with dependencies.`
    - Flask for web framework
    - pandas and openpyxl for Excel file handling
    - GitPython for repository operations
    - Other necessary dependencies
*   `[✓] Step 1.3: Create README.md with setup and usage instructions.`
    - Include installation steps (pip install -r requirements.txt)
    - Include usage instructions (flask run)
    - Document application functionality

### 2. Helper Functions Implementation

*   `[✓] Step 2.1: Implement repository handling function.`
    - Create create_or_update_repo(repo_url) function
    - Generate secure repo_id from URL using SHA1
    - Clone repo if not exists, pull if exists
    - Handle errors for invalid URLs
*   `[✓] Step 2.2: Implement SCADA JSON loading function.`
    - Create load_scada_names(repo_path) function
    - Recursively find all JSON files in repo
    - Extract meta.name field from each JSON
    - Return list of names
*   `[✓] Step 2.3: Implement Excel file loading function.`
    - Create load_excel_names(file_path) function
    - Use pandas to read Excel file
    - Extract "Name" column
    - Return list of names
*   `[✓] Step 2.4: Implement name normalization function.`
    - Create normalize_names(names) function
    - Strip whitespace
    - Convert to uppercase
    - Replace '-' with '_'
    - Remove duplicates
    - Return normalized list
*   `[✓] Step 2.5: Implement name comparison function.`
    - Create compare_name_lists(list1, list2) function
    - Find items in list1 but not in list2
    - Find items in list2 but not in list1
    - Return comparison results

### 3. Flask Application Implementation

*   `[✓] Step 3.1: Create app.py and initialize Flask application.`
    - Import necessary modules
    - Create Flask app instance
    - Configure app settings
*   `[✓] Step 3.2: Implement index route (GET /).`
    - Create route for homepage
    - Render index.html template with upload form
*   `[✓] Step 3.3: Implement comparison route (POST /compare).`
    - Create route for handling form submission
    - Extract repo_url and uploaded files
    - Call helper functions for repo handling and data loading
    - Normalize names from all sources
    - Compare name lists
    - Render results.html with comparison data
*   `[✓] Step 3.4: Implement update route (POST /update_files).`
    - Create route for handling re-uploads
    - Extract repo_id and new files
    - Update repo and re-run comparison
    - Render updated results.html

### 4. HTML Templates Creation

*   `[✓] Step 4.1: Create base.html template.`
    - Add HTML structure with Bootstrap
    - Include navigation
    - Setup template inheritance
*   `[✓] Step 4.2: Create index.html template.`
    - Extend base.html
    - Add form for repo URL input
    - Add file inputs for manifest and DWG files
    - Add submit button
*   `[✓] Step 4.3: Create results.html template.`
    - Extend base.html
    - Create Bootstrap tabs for the three comparisons
    - Add tables for displaying results
    - Highlight missing entries
    - Add re-upload form with hidden repo_id

### 5. Static Assets Creation

*   `[✓] Step 5.1: Create CSS for styling.`
    - Add styles for tables
    - Add styles for highlighting
    - Ensure responsive design
*   `[✓] Step 5.2: Add JavaScript for interactions.`
    - Add script for tab navigation
    - Add script for any filtering functionality

### 6. Security & Error Handling

*   `[✓] Step 6.1: Implement input validation.`
    - Validate repo URL format
    - Validate uploaded files
*   `[✓] Step 6.2: Add error handling for file operations.`
    - Handle missing or invalid Excel files
    - Handle Excel files without required columns
*   `[✓] Step 6.3: Add error handling for repository operations.`
    - Handle invalid repository URLs
    - Handle network or permission issues

### 7. Testing & Validation

*   `[✓] Step 7.1: Test repository handling functionality.`
    - Test with valid and invalid URLs
    - Verify clone and pull operations
    - Fixed component extraction from nested JSON structures
*   `[✓] Step 7.2: Test file upload handling.`
    - Test with valid and invalid Excel files
    - Verify data extraction
    - Added repository update mechanism for latest changes
    - Implemented manual refresh functionality
*   `[✓] Step 7.3: Test name normalization and comparison.`
    - Verify normalization rules are applied correctly
    - Verify comparison results are accurate
    - Fixed repository update functionality to reload data properly
    - Added CSRF protection for form submissions
*   `[ ] Step 7.4: Test re-upload functionality.`
    - Verify new files are processed correctly
    - Verify updated results are displayed

---

## Rules

*Embedded rules governing the AI's autonomous operation.*

**# --- Core Workflow Rules ---**

RULE_WF_PHASE_ANALYZE:
  **Constraint:** Goal is understanding request/context. NO solutioning or implementation planning.

RULE_WF_PHASE_BLUEPRINT:
  **Constraint:** Goal is creating a detailed, unambiguous step-by-step plan. NO code implementation.

RULE_WF_PHASE_CONSTRUCT:
  **Constraint:** Goal is executing the `## Plan` exactly. NO deviation. If issues arise, trigger error handling or revert phase.

RULE_WF_PHASE_VALIDATE:
  **Constraint:** Goal is verifying implementation against `## Plan` and requirements using tools. NO new implementation.

RULE_WF_TRANSITION_01:
  **Trigger:** Explicit user command (`@analyze`, `@blueprint`, `@construct`, `@validate`).
  **Action:** Update `State.Phase` accordingly. Log phase change.

RULE_WF_TRANSITION_02:
  **Trigger:** AI determines current phase constraint prevents fulfilling user request OR error handling dictates phase change (e.g., RULE_ERR_HANDLE_TEST_01).
  **Action:** Log the reason. Update `State.Phase` (e.g., to `BLUEPRINT_REVISE`). Set `State.Status` appropriately (e.g., `NEEDS_PLAN_APPROVAL`). Report to user.

RULE_ITERATE_01: # Triggered by RULE_MEM_READ_STM_01 when State.Status == READY and State.CurrentItem == null, or after VALIDATE phase completion.
  **Trigger:** `State.Status == READY` and `State.CurrentItem == null` OR after `VALIDATE` phase completion.
  **Action:**
    1. Check `## Items` section for more items.
    2. If more items:
    3. Set `State.CurrentItem` to the next item.
    4. Clear `## Log`.
    5. Set `State.Phase = ANALYZE`, `State.Status = READY`.
    6. Log "Starting processing item [State.CurrentItem]".
    7. If no more items:
    8. Trigger `RULE_ITERATE_02`.

RULE_ITERATE_02:
  **Trigger:** `RULE_ITERATE_01` determines no more items.
  **Action:**
    1. Set `State.Status = COMPLETED_ITERATION`.
    2. Log "Tokenization iteration completed."

**# --- Initialization & Resumption Rules ---**

RULE_INIT_01:
  **Trigger:** AI session/task starts AND `workflow_state.md` is missing or empty.
  **Action:**
    1. Create `workflow_state.md` with default structure.
    2. Read `project_config.md` (prompt user if missing).
    3. Set `State.Phase = ANALYZE`, `State.Status = READY`.
    4. Log "Initialized new session."
    5. Prompt user for the first task.

RULE_INIT_02:
  **Trigger:** AI session/task starts AND `workflow_state.md` exists.
  **Action:**
    1. Read `project_config.md`.
    2. Read existing `workflow_state.md`.
    3. Log "Resumed session."
    4. Check `State.Status`: Handle READY, COMPLETED, BLOCKED_*, NEEDS_*, IN_PROGRESS appropriately (prompt user or report status).

RULE_INIT_03:
  **Trigger:** User confirms continuation via RULE_INIT_02 (for IN_PROGRESS state).
  **Action:** Proceed with the next action based on loaded state and rules.

**# --- Memory Management Rules ---**

RULE_MEM_READ_LTM_01:
  **Trigger:** Start of a new major task or phase.
  **Action:** Read `project_config.md`. Log action.
RULE_MEM_READ_STM_01:
  **Trigger:** Before *every* decision/action cycle.
  **Action:**
    1. Read `workflow_state.md`.
    2. If `State.Status == READY` and `State.CurrentItem == null`:
    3. Log "Attempting to trigger RULE_ITERATE_01".
    4. Trigger `RULE_ITERATE_01`.

RULE_MEM_UPDATE_STM_01:
  **Trigger:** After *every* significant action or information receipt.
  **Action:** Immediately update relevant sections (`## State`, `## Plan`, `## Log`) in `workflow_state.md` and save.

RULE_MEM_UPDATE_LTM_01:
  **Trigger:** User command (`@config/update`) OR end of successful VALIDATE phase for significant change.
  **Action:** Propose concise updates to `project_config.md` based on `## Log`/diffs. Set `State.Status = NEEDS_LTM_APPROVAL`. Await user confirmation.

RULE_MEM_VALIDATE_01:
  **Trigger:** After updating `workflow_state.md` or `project_config.md`.
  **Action:** Perform internal consistency check. If issues found, log and set `State.Status = NEEDS_CLARIFICATION`.

**# --- Tool Integration Rules (Cursor Environment) ---**

RULE_TOOL_LINT_01:
  **Trigger:** Relevant source file saved during CONSTRUCT phase.
  **Action:** Instruct Cursor terminal to run lint command. Log attempt. On completion, parse output, log result, set `State.Status = BLOCKED_LINT` if errors.

RULE_TOOL_FORMAT_01:
  **Trigger:** Relevant source file saved during CONSTRUCT phase.
  **Action:** Instruct Cursor to apply formatter or run format command via terminal. Log attempt.

RULE_TOOL_TEST_RUN_01:
  **Trigger:** Command `@validate` or entering VALIDATE phase.
  **Action:** Instruct Cursor terminal to run test suite. Log attempt. On completion, parse output, log result, set `State.Status = BLOCKED_TEST` if failures, `TESTS_PASSED` if success.

RULE_TOOL_APPLY_CODE_01:
  **Trigger:** AI determines code change needed per `## Plan` during CONSTRUCT phase.

RULE_PROCESS_ITEM_01:
  **Trigger:** `State.Phase == CONSTRUCT` and `State.CurrentItem` is not null and current step in `## Plan` requires item processing.
  **Action:**
    1. **Get Item Text:** Based on `State.CurrentItem`, extract the corresponding 'Text to Tokenize' from the `## Items` section.
    2. **Summarize (Placeholder):**  Use a placeholder to generate a summary of the extracted text.  For example, "Summary of [text] is [placeholder summary]".
    3. **Estimate Token Count:**
       a. Read `Characters Per Token (Estimate)` from `project_config.md`.
       b. Get the text content of the item from the `## Items` section. (Placeholder: Implement logic to extract text based on `State.CurrentItem` from the `## Items` table.)
       c. Calculate `estimated_tokens = length(text_content) / 4`.
    4. **Summarize (Placeholder):** Use a placeholder to generate a summary of the extracted text.  For example, "Summary of [text] is [placeholder summary]". (Placeholder: Replace with actual summarization tool/logic)
    5. **Store Results:** Append a new row to the `## TokenizationResults` table with:
       *   `Item ID`: `State.CurrentItem`
       *   `Summary`: The generated summary. (Placeholder: Implement logic to store the summary.)
       *   `Token Count`: `estimated_tokens`.
    6. Log the processing actions, results, and estimated token count to the `## Log`. (Placeholder: Implement logging.)

  **Action:** Generate modification. Instruct Cursor to apply it. Log action.

**# --- Error Handling & Recovery Rules ---**

RULE_ERR_HANDLE_LINT_01:
  **Trigger:** `State.Status` is `BLOCKED_LINT`.
  **Action:** Analyze error in `## Log`. Attempt auto-fix if simple/confident. Apply fix via RULE_TOOL_APPLY_CODE_01. Re-run lint via RULE_TOOL_LINT_01. If success, reset `State.Status`. If fail/complex, set `State.Status = BLOCKED_LINT_UNRESOLVED`, report to user.

RULE_ERR_HANDLE_TEST_01:
  **Trigger:** `State.Status` is `BLOCKED_TEST`.
  **Action:** Analyze failure in `## Log`. Attempt auto-fix if simple/localized/confident. Apply fix via RULE_TOOL_APPLY_CODE_01. Re-run failed test(s) or suite via RULE_TOOL_TEST_RUN_01. If success, reset `State.Status`. If fail/complex, set `State.Phase = BLUEPRINT_REVISE`, `State.Status = NEEDS_PLAN_APPROVAL`, propose revised `## Plan` based on failure analysis, report to user.

RULE_ERR_HANDLE_GENERAL_01:
  **Trigger:** Unexpected error or ambiguity.
  **Action:** Log error/situation to `## Log`. Set `State.Status = BLOCKED_UNKNOWN`. Report to user, request instructions.

---

## Log

*A chronological log of significant actions, events, tool outputs, and decisions.*
*(This section will be populated by the AI during operation)*

*   `[2025-03-26 17:53:47] Initialized new session. State set to ANALYZE/READY.`
*   `[2025-03-26 20:15:00] Received new task: Build Flask Comparison App.`
*   `[2025-03-26 20:15:10] State.Phase changed to BLUEPRINT. State.Status set to IN_PROGRESS.`
*   `[2025-03-26 20:15:15] Setting up detailed plan for BuildFlaskComparisonApp.`
*   `[2025-03-26 20:20:00] Detailed implementation plan created.`
*   `[2025-03-26 20:20:10] State.Status changed to NEEDS_PLAN_APPROVAL. Awaiting user confirmation.`
*   `[2025-03-26 20:25:00] User approved plan. Moving to CONSTRUCT phase.`
*   `[2025-03-26 20:25:10] Creating project directory structure.`
*   `[2025-03-26 20:25:30] Created directories: project/templates, project/static, project/clones.`
*   `[2025-03-26 20:25:35] Moving to Step 1.2: Creating requirements.txt.`
*   `[2025-03-26 20:26:00] Created requirements.txt with necessary dependencies.`
*   `[2025-03-26 20:26:05] Moving to Step 1.3: Creating README.md.`
*   `[2025-03-26 20:26:30] Created README.md with setup and usage instructions.`
*   `[2025-03-26 20:26:35] Moving to Step 2.1: Implementing repository handling function.`
*   `[2025-03-26 20:27:00] Created app.py with Flask initialization and repository handling function.`
*   `[2025-03-26 20:27:05] Moving to Step 2.2: Implementing SCADA JSON loading function.`
*   `[2025-03-26 20:27:30] Added load_scada_names function to app.py.`
*   `[2025-03-26 20:27:35] Moving to Step 2.3: Implementing Excel file loading function.`
*   `[2025-03-26 20:28:00] Added load_excel_names function to app.py.`
*   `[2025-03-26 20:28:05] Moving to Step 2.4: Implementing name normalization function.`
*   `[2025-03-26 20:28:30] Added normalize_names function to app.py.`
*   `[2025-03-26 20:28:35] Moving to Step 2.5: Implementing name comparison function.`
*   `[2025-03-26 20:29:00] Added compare_name_lists function to app.py.`
*   `[2025-03-26 20:29:05] Moving to Step 3.1: Implementing Flask application initialization.`
*   `[2025-03-26 20:29:30] Verified Flask app initialization is already complete in app.py.`
*   `[2025-03-26 20:29:35] Moving to Step 3.2: Implementing index route (GET /).`
*   `[2025-03-26 20:30:00] Added index route to app.py.`
*   `[2025-03-26 20:30:30] Created base.html template with Bootstrap and navigation.`
*   `[2025-03-26 20:31:00] Created index.html template with upload form.`
*   `[2025-03-26 20:31:30] Created CSS and JavaScript files for styling and interactivity.`
*   `[2025-03-26 20:31:35] Moving to Step 3.3: Implementing comparison route (POST /compare).`
*   `[2025-03-26 20:32:00] Added comparison route to app.py with full implementation of form handling, data processing, and template rendering.`
*   `[2025-03-26 20:32:05] Moving to Step 3.4: Implementing update route (POST /update_files).`
*   `[2025-03-26 20:32:30] Added update_files route to app.py with handling for file re-uploads and updated comparisons.`
*   `[2025-03-26 20:32:35] Verifying Step 4.1 and 4.2: base.html and index.html templates have already been created.`
*   `[2025-03-26 20:33:00] Moving to Step 4.3: Creating results.html template.`
*   `[2025-03-26 20:34:00] Created results.html template with tabs for comparison results and re-upload form.`
*   `[2025-03-26 20:34:05] Moving to Step 5.1: Creating CSS for styling.`
*   `[2025-03-26 20:35:00] Created styles.css with comprehensive styling for tables, highlighting, and responsive design.`
*   `[2025-03-26 20:35:05] Moving to Step 5.2: Adding JavaScript for interactions.`
*   `[2025-03-26 20:36:00] Created script.js with functionality for tab navigation, table filtering, and clipboard copying.`
*   `[2025-03-26 20:36:05] Moving to Step 6.1: Implementing input validation.`
*   `[2025-03-26 20:37:00] Added input validation for repository URLs and Excel file uploads.`
*   `[2025-03-26 20:37:05] Moving to Step 6.2: Adding error handling for file operations.`
*   `[2025-03-26 20:38:00] Enhanced load_excel_names function with comprehensive error handling for Excel files.`
*   `[2025-03-26 20:38:05] Moving to Step 6.3: Adding error handling for repository operations.`
*   `[2025-03-26 20:39:00] Enhanced create_or_update_repo function with comprehensive error handling for network issues, timeouts, and permission problems.`
*   `[2025-03-26 20:39:05] Moving to Step 7.1: Testing repository handling functionality.`
*   `[2025-03-26 21:00:00] Updated URL validation to support HTTP URLs with port numbers for Gitea compatibility.`
*   `[2025-03-26 21:15:00] Improved JSON component name extraction to recursively find all components in nested structures.`
*   `[2025-03-26 21:16:00] Test script found 5,351 component names from SCADA JSON files.`
*   `[2025-03-26 21:17:00] Step 7.1 completed. Moving to Step 7.2: Testing file upload handling.`
*   `[2025-03-26 21:20:00] Added repository update function to automatically fetch latest changes.`
*   `[2025-03-26 21:21:00] Added manual refresh button to allow users to explicitly update the repository.`
*   `[2025-03-26 21:22:00] Step 7.2 completed. Moving to Step 7.3: Testing name normalization and comparison.`
*   `[2025-03-26 21:30:00] Fixed issue where repository was updated but data was not reloaded.`
*   `[2025-03-26 21:31:00] Added CSRF protection to prevent 405 Method Not Allowed errors.`
*   `[2025-03-26 21:32:00] Step 7.3 completed. Moving to Step 7.4: Testing re-upload functionality.`

---

## Items

*This section will contain the list of items to be processed.*
*(The format of items is a table)*

*Example (Table):*
*   `| Item ID | Text to Tokenize |`
*   `|---|---|`
*   `| item1 | This is the first item to tokenize. This is a short sentence. |`
*   `| item2 | Here is the second item for tokenization. This is a slightly longer sentence to test the summarization. |`
*   `| item3 | This is item number three to be processed. This is a longer sentence to test the summarization. This is a longer sentence to test the summarization. |`
*   `| item4 | And this is the fourth and final item in the list. This is a very long sentence to test the summarization. This is a very long sentence to test the summarization. This is a very long sentence to test the summarization. This is a very long sentence to test the summarization. |`
---

## TokenizationResults

*This section will store the summarization results for each item.*
*(Results will include the summary and estimated token count)*

*Example (Table):*
*   `| Item ID | Summary | Token Count |`
*   `|---|---|---|`

## TokenizationResults

*This section will store the tokenization results for each item.*
*(Results will include token counts and potentially tokenized text)*

*Example (Table):*
*   `| Item ID | Token Count | Tokenized Text (Optional) |`
*   `|---|---|---|`
*   `| item1 | 10 | ... (tokenized text) ... |`
*   `| item2 | 12 | ... (tokenized text) ... |`