Add complete scanner-workflow project with auto-scan functionality

2026-02-14 19:55:08 -06:00
parent f2714198f9
commit 41ec7e94e7
6 changed files with 530 additions and 0 deletions
--- a/scanner/scanner-workflow/.env.example
+++ b/scanner/scanner-workflow/.env.example
@@ -0,0 +1,4 @@
+# LLM API Key (OpenAI or OpenAI-compatible)
+# For OpenAI: sk-your-api-key-here
+# For Ollama: leave empty, it runs locally
+API_KEY=
--- a/scanner/scanner-workflow/.gitignore
+++ b/scanner/scanner-workflow/.gitignore
@@ -0,0 +1,29 @@
+# Environment variables with API keys
+.env
+
+# Python
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+.Python
+env/
+venv/
+ENV/
+
+# IDEs
+.vscode/
+.idea/
+*.swp
+*.swo
+
+# OS
+.DS_Store
+Thumbs.db
+
+# Scan directory (generated files)
+scans/*.pdf
+scans/*.json
+
+# Logs
+*.log
--- a/scanner/scanner-workflow/README.md
+++ b/scanner/scanner-workflow/README.md
@@ -0,0 +1,219 @@
+# Brother Scanner Auto-Workflow
+
+Automated scanning workflow for Brother DS mobile scanners with LLM-powered document naming.
+
+## Features
+
+- 🔍 **Auto-detect** - Automatically detects when you place a document in the scanner
+- 📄 **Auto-scan** - Starts scanning without manual intervention
+- 🧠 **Smart naming** - Uses LLM to generate meaningful titles
+- 💾 **Local storage** - Saves to configured local folder
+- 📊 **Scan logging** - Tracks all scans in JSON log
+
+## Prerequisites
+
+### Hardware
+- Brother DS-640 scanner (or compatible Brother mobile scanner)
+
+### Software
+- Python 3.9+
+- Brother scanner drivers and CLI tools (`brscan-skey` or `brscan4`)
+- LLM API (OpenAI or OpenAI-compatible like Ollama)
+
+## Installation
+
+1. **Clone or navigate to the project directory:**
+   ```bash
+   cd scanner-workflow
+   ```
+
+2. **Install Python dependencies:**
+   ```bash
+   pip install -r requirements.txt
+   ```
+
+3. **Configure environment variables:**
+   ```bash
+   cp .env.example .env
+   # Edit .env and add your API key
+   ```
+
+4. **Configure scanner settings (optional):**
+   ```bash
+   # Edit config.yml to set scan directory and brother command
+   # For DS-640, try: brother_cmd: "brscan-skey -s"
+   ```
+
+## Usage
+
+### Basic Auto-Scan (Continuous Mode)
+
+Run the scanner in auto-detect mode. It will wait for you to place a document and automatically scan it:
+
+```bash
+python scanner-auto.py
+```
+
+The script will:
+1. Wait for you to place a document in the scanner
+2. Auto-detect when the document is ready
+3. Start scanning automatically
+4. Generate a meaningful title using LLM
+5. Save as `{title} - {timestamp}.pdf`
+
+### Limit Number of Scans
+
+Scan a maximum of 10 documents and then stop:
+
+```bash
+python scanner-auto.py --max-scans 10
+```
+
+### Test Scanner Detection
+
+Check if the scanner is detected without actually scanning:
+
+```bash
+python scanner-auto.py --test
+```
+
+## Configuration
+
+### config.yml
+
+```yaml
+# Directory where scanned PDFs will be saved
+scan_dir: "scans"
+
+# Brother scanner command
+brother_cmd: "brscan-skey -s"
+
+# LLM API configuration
+api_url: "http://localhost:11434/v1/chat/completions"
+model: "llama3"
+```
+
+### .env
+
+```bash
+# LLM API Key (optional for local LLMs like Ollama)
+API_KEY=
+```
+
+## File Structure
+
+```
+scanner-workflow/
+├── scanner-auto.py          # Main script
+├── config.yml               # Configuration
+├── .env.example             # Environment template
+├── requirements.txt         # Python dependencies
+├── README.md                # This file
+├── .env                     # Your API key (create from .env.example)
+├── scans/                   # Output directory (auto-created)
+│   ├── scan_log.json        # Scan history
+│   └── *.pdf                # Scanned documents
+└── tests/                   # (optional) Test scripts
+```
+
+## LLM Integration
+
+### Using OpenAI
+
+Set your API key in `.env`:
+```bash
+API_KEY=sk-your-openai-api-key
+```
+
+Update `config.yml`:
+```yaml
+api_url: "https://api.openai.com/v1/chat/completions"
+model: "gpt-3.5-turbo"
+```
+
+### Using Ollama (Local)
+
+No API key needed! Just run Ollama locally and update `config.yml`:
+
+```yaml
+api_url: "http://localhost:11434/v1/chat/completions"
+model: "llama3"
+```
+
+Install Ollama:
+```bash
+# macOS
+brew install ollama
+
+# Linux
+curl -fsSL https://ollama.com/install.sh | sh
+
+# Then pull the model
+ollama pull llama3
+```
+
+## Troubleshooting
+
+### Scanner not detected
+
+1. Check if Brother scanner tools are installed:
+   ```bash
+   brscan-skey --version
+   ```
+
+2. Try different `brother_cmd` in `config.yml`:
+   ```yaml
+   brother_cmd: "brscan4 -s"
+   ```
+
+3. Test detection:
+   ```bash
+   python scanner-auto.py --test
+   ```
+
+### LLM not working
+
+1. Verify API key in `.env`
+2. Check API URL in `config.yml`
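+3. Test API connectivity (substitute the `api_url` and `model` values from `config.yml`, and export `API_KEY` first if you use one):
+   ```bash
+   curl -X POST "http://localhost:11434/v1/chat/completions" \
+     -H "Authorization: Bearer $API_KEY" \
+     -H "Content-Type: application/json" \
+     -d '{"model":"llama3","messages":[{"role":"user","content":"Hello"}]}'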
+   ```
+
+### Permission errors
+
+Make sure the script can write to the scan directory:
+```bash
+chmod u+w scans/
+```
+
+## Example Output
+
+```
+============================================================
+🤖 Brother Scanner - Auto-Detect Mode
+============================================================
+📁 Scan directory: scans
+🔄 Brother command: brscan-skey -s
+🧠 LLM API: http://localhost:11434/v1/chat/completions
+============================================================
+
+[14:30:15] Waiting for document...
+✓ Document detected by scanner
+→ Starting scan: brscan-skey -s -f scans/scan_20260214_143015.pdf
+✓ Scan completed: scan_20260214_143015.pdf (245678 bytes)
+→ Generating title with LLM...
+✓ LLM title: Invoice from Acme Corp
+✓ Saved as: Invoice from Acme Corp - 20260214_143015.pdf
+```
+
+## License
+
+MIT License - Feel free to use and modify as needed.
+
+## Contributing
+
+Suggestions and improvements welcome!
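The README says every scan is tracked in `scans/scan_log.json`. As a minimal sketch of how that log could be read back, assuming the entry shape written by `log_scan` in `scanner-auto.py` (`timestamp`, `original_name`, `final_name`), the helpers below list the most recent scans. The names `load_scan_log` and `recent_scans` are hypothetical and not part of this commit.

```python
import json
from pathlib import Path


def load_scan_log(log_path: Path) -> list:
    """Return the logged entries, or an empty list if the log is missing or unreadable."""
    try:
        with open(log_path, "r", encoding="utf-8") as f:
            entries = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return []
    # The script writes a JSON array; anything else is treated as an empty log
    return entries if isinstance(entries, list) else []


def recent_scans(log_path: Path, limit: int = 5) -> list:
    """Return the final file names of the most recent scans, newest first."""
    entries = load_scan_log(log_path)
    # Timestamps come from datetime.isoformat(), so they sort chronologically
    # as plain strings as long as they were all written in the same timezone
    entries.sort(key=lambda e: e.get("timestamp", ""), reverse=True)
    return [e.get("final_name", "") for e in entries[:limit]]


if __name__ == "__main__":
    for name in recent_scans(Path("scans") / "scan_log.json"):
        print(name)
```

Treating a missing or corrupt log as empty keeps a reporting helper like this from failing on a fresh checkout where no scan has run yet.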
--- a/scanner/scanner-workflow/config.yml
+++ b/scanner/scanner-workflow/config.yml
@@ -0,0 +1,14 @@
+# Brother Scanner Configuration
+
+# Directory where scanned PDFs will be saved
+scan_dir: "scans"
+
+# Brother scanner command (adjust based on your scanner model)
+# For Brother DS-640, you might use: brscan-skey -s
+brother_cmd: "brscan-skey -s"
+
+# LLM API configuration (OpenAI-compatible)
+# For OpenAI: https://api.openai.com/v1/chat/completions
+# For Ollama: http://localhost:11434/v1/chat/completions
+api_url: "http://localhost:11434/v1/chat/completions"
+model: "llama3"
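`scanner-auto.py` interpolates `brother_cmd` and the output path into one shell string, which would break if `scan_dir` ever contained a space. A possible alternative, sketched here under the assumption that the command is a plain program name followed by options, is to split it into an argument list and run it without a shell. The helper name `build_scan_argv` is hypothetical, and the `-f` option simply mirrors what the script appends; it is not a verified Brother CLI flag.

```python
import shlex
from pathlib import Path


def build_scan_argv(brother_cmd: str, output_path: Path) -> list:
    """Split the configured command into argv and append the output-file option."""
    # shlex.split honours shell-style quoting, so "brscan-skey -s" becomes
    # ["brscan-skey", "-s"]
    argv = shlex.split(brother_cmd)
    # "-f <path>" mirrors scanner-auto.py; the path stays one argument even
    # when it contains spaces, because no shell re-parses it
    argv += ["-f", str(output_path)]
    return argv


if __name__ == "__main__":
    print(build_scan_argv("brscan-skey -s", Path("scans") / "scan 1.pdf"))
```

The resulting list can be passed to `subprocess.run(argv, capture_output=True, text=True, timeout=60)` with `shell=True` omitted.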
--- a/scanner/scanner-workflow/requirements.txt
+++ b/scanner/scanner-workflow/requirements.txt
@@ -0,0 +1,3 @@
+requests>=2.31.0
+pyyaml>=6.0.1
+python-dotenv>=1.0.0
--- a/scanner/scanner-workflow/scanner-auto.py
+++ b/scanner/scanner-workflow/scanner-auto.py
@@ -0,0 +1,261 @@
+#!/usr/bin/env python3
+"""
+Auto-scanner script for Brother DS mobile scanners.
+Auto-detects when a document is placed and starts scanning automatically.
+"""
+
+import os
+import sys
+import subprocess
+import time
+import json
+from datetime import datetime
+from pathlib import Path
+from typing import Optional, Dict
+
+# Add this script's own directory to the import path
+sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
+
+try:
+    import requests
+    from dotenv import load_dotenv
+    import yaml
+except ImportError as e:
+    print(f"Missing dependency: {e}")
+    print("Run: pip install -r requirements.txt")
+    sys.exit(1)
+
+
+class ScannerAuto:
+    """Automated Brother scanner with LLM-powered naming."""
+
+    def __init__(self, config_path: str = "config.yml"):
+        self.load_config(config_path)
+        self.scan_count = 0
+
+    def load_config(self, config_path: str):
+        """Load configuration from YAML file."""
+        load_dotenv()  # load API_KEY from .env into the process environment
+        with open(config_path, 'r') as f:
+            config = yaml.safe_load(f) or {}
+
+        self.scan_dir = Path(config.get('scan_dir', 'scans'))
+        self.scan_dir.mkdir(parents=True, exist_ok=True)
+        self.brother_cmd = config.get('brother_cmd', 'brscan-skey -s')
+        self.api_url = config.get('api_url', 'http://localhost:11434/v1/chat/completions')
+        self.api_key = os.getenv('API_KEY', '')
+        self.model = config.get('model', 'llama3')
+
+        self.log_file = self.scan_dir / 'scan_log.json'
+
+    def log_scan(self, original_name: str, final_name: str):
+        """Log scan metadata."""
+        log_entry = {
+            'timestamp': datetime.now().isoformat(),
+            'original_name': original_name,
+            'final_name': final_name
+        }
+
+        if self.log_file.exists():
+            with open(self.log_file, 'r') as f:
+                logs = json.load(f)
+        else:
+            logs = []
+
+        logs.append(log_entry)
+
+        with open(self.log_file, 'w') as f:
+            json.dump(logs, f, indent=2)
+
+    def detect_document(self) -> bool:
+        """
+        Detect if a document is placed in the scanner.
+        Returns True if document detected, False otherwise.
+        """
+        try:
+            # Brother scanner detection
+            result = subprocess.run(
+                self.brother_cmd,
+                shell=True,
+                capture_output=True,
+                text=True,
+                timeout=5
+            )
+
+            # Check if scanner reports a document is ready
+            # This may vary based on Brother scanner model and tools
+            if result.returncode == 0 and ('ready' in result.stdout.lower() or 'scan' in result.stdout.lower()):
+                print(f"✓ Document detected by scanner")
+                return True
+
+            return False
+
+        except subprocess.TimeoutExpired:
+            print("✗ Scanner timeout")
+            return False
+        except Exception as e:
+            print(f"✗ Scanner detection error: {e}")
+            return False
+
+    def start_scan(self) -> Optional[str]:
+        """Start scanning and return the saved filename."""
+        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
+
+        # Create output filename (temporary until LLM generates title)
+        temp_name = f"scan_{timestamp}.pdf"
+        output_path = self.scan_dir / temp_name
+
+        try:
+            # Start scanning using Brother CLI
+            scan_cmd = f"{self.brother_cmd} -f {output_path}"
+            print(f"→ Starting scan: {scan_cmd}")
+
+            result = subprocess.run(
+                scan_cmd,
+                shell=True,
+                capture_output=True,
+                text=True,
+                timeout=60  # 1 minute timeout for scan
+            )
+
+            if result.returncode == 0 and output_path.exists():
+                file_size = output_path.stat().st_size
+                print(f"✓ Scan completed: {temp_name} ({file_size} bytes)")
+
+                # Use LLM to generate meaningful title
+                title = self.generate_title(output_path)
+
+                # Rename with final title
+                final_name = f"{title} - {timestamp}.pdf"
+                final_path = self.scan_dir / final_name
+
+                output_path.rename(final_path)
+                self.log_scan(temp_name, final_name)
+
+                print(f"✓ Saved as: {final_name}")
+                return final_name
+
+            else:
+                print(f"✗ Scan failed: {result.stderr}")
+                return None
+
+        except subprocess.TimeoutExpired:
+            print("✗ Scan timeout")
+            return None
+        except Exception as e:
+            print(f"✗ Scan error: {e}")
+            return None
+
+    def generate_title(self, pdf_path: Path) -> str:
+        """Generate a meaningful title using LLM."""
+        # Local servers such as Ollama need no API key, so the Authorization
+        # header is only sent when API_KEY is set
+        headers = {"Content-Type": "application/json"}
+        if self.api_key:
+            headers["Authorization"] = f"Bearer {self.api_key}"
+
+        try:
+            # Simplified version: the PDF at pdf_path is not read or sent yet, so the
+            # model only sees the prompt. In production, extract the text first
+            # (e.g. with pdfminer) and include it in the prompt.
+            print("→ Generating title with LLM...")
+
+            # Simple prompt for LLM
+            prompt = """
+            Analyze this document and suggest a concise, descriptive title (no more than 5 words).
+            Focus on the document type (invoice, receipt, contract, letter, etc.).
+            Return only the title, no other text.
+            """
+
+            response = requests.post(
+                self.api_url,
+                headers=headers,
+                json={
+                    "model": self.model,
+                    "messages": [
+                        {"role": "system", "content": "You are a helpful assistant that generates document titles."},
+                        {"role": "user", "content": prompt}
+                    ],
+                    "max_tokens": 50,
+                    "temperature": 0.3
+                },
+                timeout=30
+            )
+
+            if response.status_code == 200:
+                title = response.json()['choices'][0]['message']['content'].strip()
+                # Clean up title (remove quotes and characters unsafe in filenames)
+                title = "".join(c for c in title.strip('"\'') if c not in '/\\:*?"<>|\r\n').strip() or "document"
+                print(f"✓ LLM title: {title}")
+                return title
+            else:
+                print(f"✗ LLM API error: {response.status_code}")
+                return "document"
+
+        except Exception as e:
+            print(f"✗ LLM error: {e}")
+            return "document"
+
+    def run(self, max_scans: Optional[int] = None):
+        """Main loop - auto-detects and scans documents."""
+        print("=" * 60)
+        print("🤖 Brother Scanner - Auto-Detect Mode")
+        print("=" * 60)
+        print(f"📁 Scan directory: {self.scan_dir}")
+        print(f"🔄 Brother command: {self.brother_cmd}")
+        print(f"🧠 LLM API: {self.api_url}")
+        print("=" * 60)
+
+        if max_scans:
+            print(f"⏱️  Max scans: {max_scans}")
+            print("=" * 60)
+
+        self.scan_count = 0
+
+        try:
+            while True:
+                if max_scans and self.scan_count >= max_scans:
+                    print(f"\n✓ Completed {max_scans} scans")
+                    break
+
+                print(f"\n[{datetime.now().strftime('%H:%M:%S')}] Waiting for document...")
+
+                # Wait for document detection
+                while not self.detect_document():
+                    time.sleep(2)
+
+                # Document detected - start scanning; count only successful scans
+                if self.start_scan():
+                    self.scan_count += 1
+
+        except KeyboardInterrupt:
+            print(f"\n\n✓ Stopped after {self.scan_count} scans")
+        except Exception as e:
+            print(f"\n✗ Error: {e}")
+
+
+def main():
+    """Main entry point."""
+    import argparse
+
+    parser = argparse.ArgumentParser(description='Auto-scan Brother scanner with LLM naming')
+    parser.add_argument('--config', default='config.yml', help='Config file path')
+    parser.add_argument('--max-scans', type=int, help='Maximum number of scans')
+    parser.add_argument('--test', action='store_true', help='Test detection without scanning')
+
+    args = parser.parse_args()
+
+    scanner = ScannerAuto(args.config)
+
+    if args.test:
+        print("🔍 Testing scanner detection...")
+        if scanner.detect_document():
+            print("✓ Scanner detected")
+        else:
+            print("✗ No document detected")
+    else:
+        scanner.run(args.max_scans)
+
+
+if __name__ == '__main__':
+    main()
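`start_scan` builds the final name as `{title} - {timestamp}.pdf` from whatever text the LLM returns, so a reply containing a path separator or other reserved character could make the rename fail or land outside `scan_dir`. The following is a minimal sketch of one way to normalise the title first. The helper names `safe_title` and `final_filename`, the 80-character cap, and the exact character set are assumptions chosen for illustration, not part of this commit.

```python
import re


def safe_title(raw: str, fallback: str = "document", max_len: int = 80) -> str:
    """Turn free-form LLM output into a string that is safe to use in a file name."""
    # Drop surrounding whitespace and quotes, as the script already does
    title = raw.strip().strip("\"'")
    # Replace path separators, reserved punctuation and control characters
    title = re.sub(r'[\\/:*?"<>|\x00-\x1f]', " ", title)
    # Collapse whitespace runs and trim leading or trailing spaces and dots
    title = re.sub(r"\s+", " ", title).strip(" .")
    # Fall back to a generic name if nothing usable is left
    return title[:max_len].rstrip() or fallback


def final_filename(raw_title: str, timestamp: str) -> str:
    """Build the '{title} - {timestamp}.pdf' name described in the README."""
    return f"{safe_title(raw_title)} - {timestamp}.pdf"


if __name__ == "__main__":
    print(final_filename('"Invoice/Acme: Q1"', "20260214_143015"))
```

Replacing unsafe characters with spaces rather than deleting them keeps adjacent words from being joined together in the saved name.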