Add complete scanner-workflow project with auto-scan functionality

quiet
2026-02-14 19:55:08 -06:00
parent f2714198f9
commit 41ec7e94e7
6 changed files with 530 additions and 0 deletions


@@ -0,0 +1,4 @@
# LLM API Key (OpenAI or OpenAI-compatible)
# For OpenAI: sk-your-api-key-here
# For Ollama: leave empty, it runs locally
API_KEY=

scanner/scanner-workflow/.gitignore

@@ -0,0 +1,29 @@
# Environment variables with API keys
.env
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
env/
venv/
ENV/
# IDEs
.vscode/
.idea/
*.swp
*.swo
# OS
.DS_Store
Thumbs.db
# Scan directory (generated files)
scans/*.pdf
scans/*.json
# Logs
*.log


@@ -0,0 +1,219 @@
# Brother Scanner Auto-Workflow
Automated scanning workflow for Brother DS mobile scanners with LLM-powered document naming.
## Features
- 🔍 **Auto-detect** - Automatically detects when you place a document in the scanner
- 📄 **Auto-scan** - Starts scanning without manual intervention
- 🧠 **Smart naming** - Uses LLM to generate meaningful titles
- 💾 **Local storage** - Saves to configured local folder
- 📊 **Scan logging** - Tracks all scans in JSON log
## Prerequisites
### Hardware
- Brother DS-640 scanner (or compatible Brother mobile scanner)
### Software
- Python 3.9+
- Brother scanner drivers and CLI tools (`brscan-skey` or `brscan4`)
- LLM API (OpenAI or OpenAI-compatible like Ollama)
## Installation
1. **Clone or navigate to the project directory:**
```bash
cd scanner-workflow
```
2. **Install Python dependencies:**
```bash
pip install -r requirements.txt
```
3. **Configure environment variables:**
```bash
cp .env.example .env
# Edit .env and add your API key
```
4. **Configure scanner settings (optional):**
```bash
# Edit config.yml to set scan directory and brother command
# For DS-640, try: brother_cmd: "brscan-skey -s"
```
## Usage
### Basic Auto-Scan (Continuous Mode)
Run the scanner in auto-detect mode. It will wait for you to place a document and automatically scan it:
```bash
python scanner-auto.py
```
The script will:
1. Wait for you to place a document in the scanner
2. Auto-detect when the document is ready
3. Start scanning automatically
4. Generate a meaningful title using LLM
5. Save as `{title} - {timestamp}.pdf`
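The steps above amount to a poll-then-scan loop. A minimal sketch of that control flow, assuming hypothetical `detect` and `scan` callables that stand in for the real scanner calls (they are illustrative, not part of the Brother tools):

```python
import time
from typing import Callable, List, Optional


def auto_scan_loop(
    detect: Callable[[], bool],
    scan: Callable[[], Optional[str]],
    max_scans: Optional[int] = None,
    poll_interval: float = 2.0,
) -> List[str]:
    """Poll `detect` until it returns True, then call `scan`, and repeat.

    `detect` and `scan` are hypothetical stand-ins for the scanner calls.
    Returns the filenames produced by `scan`; failed scans (None) are skipped.
    With max_scans=None the loop runs until interrupted.
    """
    saved: List[str] = []
    attempts = 0
    while max_scans is None or attempts < max_scans:
        # Sleep between polls so the scanner is not queried in a tight loop
        while not detect():
            time.sleep(poll_interval)
        attempts += 1
        name = scan()
        if name is not None:
            saved.append(name)
    return saved
```

Passing the two callables in keeps the loop testable without a scanner attached.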
### Limit Number of Scans
Scan a maximum of 10 documents and then stop:
```bash
python scanner-auto.py --max-scans 10
```
### Test Scanner Detection
Check if the scanner is detected without actually scanning:
```bash
python scanner-auto.py --test
```
## Configuration
### config.yml
```yaml
# Directory where scanned PDFs will be saved
scan_dir: "scans"
# Brother scanner command
brother_cmd: "brscan-skey -s"
# LLM API configuration
api_url: "http://localhost:11434/v1/chat/completions"
model: "llama3"
```
### .env
```bash
# LLM API Key (optional for local LLMs like Ollama)
API_KEY=
```
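The two files play different roles: `config.yml` holds non-secret settings, while `.env` supplies `API_KEY` through the environment. A sketch of how they could be combined, assuming the YAML has already been parsed into a dict. The `Settings` class and `resolve_settings` helper are illustrative names, not part of the project; the defaults mirror the values shown above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class Settings:
    scan_dir: str
    brother_cmd: str
    api_url: str
    model: str
    api_key: str


def resolve_settings(
    config: Optional[Mapping[str, str]],
    env: Mapping[str, str] = os.environ,
) -> Settings:
    """Merge parsed config.yml values with API_KEY from the environment."""
    # An empty YAML file parses to None, so treat that as "no overrides"
    config = config or {}
    return Settings(
        scan_dir=config.get("scan_dir", "scans"),
        brother_cmd=config.get("brother_cmd", "brscan-skey -s"),
        api_url=config.get("api_url", "http://localhost:11434/v1/chat/completions"),
        model=config.get("model", "llama3"),
        # The key is read from the environment only, never from config.yml
        api_key=env.get("API_KEY", ""),
    )
```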
## File Structure
```
scanner-workflow/
├── scanner-auto.py # Main script
├── config.yml # Configuration
├── .env.example # Environment template
├── requirements.txt # Python dependencies
├── README.md # This file
├── .env # Your API key (create from .env.example)
├── scans/ # Output directory (auto-created)
│ ├── scan_log.json # Scan history
│ └── *.pdf # Scanned documents
└── tests/ # (optional) Test scripts
```
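`scan_log.json` is a JSON array with one object per scan. A short sketch for reading it back, assuming the `timestamp`, `original_name`, and `final_name` fields that `scanner-auto.py` writes; `load_scan_log` is a hypothetical helper:

```python
import json
from pathlib import Path
from typing import Dict, List


def load_scan_log(log_file: Path) -> List[Dict[str, str]]:
    """Return logged scans, newest first; a missing log means no scans yet."""
    if not log_file.exists():
        return []
    with open(log_file, "r", encoding="utf-8") as f:
        entries = json.load(f)
    # ISO 8601 timestamps sort chronologically as plain strings
    return sorted(entries, key=lambda e: e["timestamp"], reverse=True)


if __name__ == "__main__":
    for entry in load_scan_log(Path("scans") / "scan_log.json"):
        print(f"{entry['timestamp']}  {entry['final_name']}")
```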
## LLM Integration
### Using OpenAI
Set your API key in `.env`:
```bash
API_KEY=sk-your-openai-api-key
```
Update `config.yml`:
```yaml
api_url: "https://api.openai.com/v1/chat/completions"
model: "gpt-3.5-turbo"
```
### Using Ollama (Local)
No API key needed! Just run Ollama locally and update `config.yml`:
```yaml
api_url: "http://localhost:11434/v1/chat/completions"
model: "llama3"
```
Install Ollama:
```bash
# macOS
brew install ollama
# Linux
curl -fsSL https://ollama.com/install.sh | sh
# Then pull the model
ollama pull llama3
```
## Troubleshooting
### Scanner not detected
1. Check if Brother scanner tools are installed:
```bash
brscan-skey --version
```
2. Try different `brother_cmd` in `config.yml`:
```yaml
brother_cmd: "brscan4 -s"
```
3. Test detection:
```bash
python scanner-auto.py --test
```
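Before changing `brother_cmd`, it can help to confirm that the executable it names is on `PATH` at all. A minimal sketch using only the standard library; `command_available` is an illustrative helper, and it only checks that the program exists, not that the scanner responds:

```python
import shlex
import shutil


def command_available(brother_cmd: str) -> bool:
    """Return True if the first word of the command resolves to an executable."""
    parts = shlex.split(brother_cmd)
    return bool(parts) and shutil.which(parts[0]) is not None


if __name__ == "__main__":
    # The two candidates are the commands mentioned in this README
    for cmd in ("brscan-skey -s", "brscan4 -s"):
        status = "found" if command_available(cmd) else "not found"
        print(f"{shlex.split(cmd)[0]}: {status}")
```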
### LLM not working
1. Verify API key in `.env`
2. Check API URL in `config.yml`
3. Test API connectivity:
```bash
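# Substitute the api_url and model values from your config.yml.
# $API_KEY must be set in your shell; omit the Authorization line for Ollama.
curl -X POST "http://localhost:11434/v1/chat/completions" \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"llama3","messages":[{"role":"user","content":"Hello"}]}'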
```
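The same connectivity check can be made from Python. This sketch uses only the standard library and builds an OpenAI-style chat completion request, leaving out the `Authorization` header when no key is set (the Ollama case). `build_chat_request` and `check_llm` are hypothetical helpers, and the default URL and model are the values from `config.yml`:

```python
import json
import urllib.request


def build_chat_request(api_url: str, model: str, api_key: str = "") -> urllib.request.Request:
    """Build a POST request carrying a one-message chat completion payload."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Only send credentials when a key is configured
        headers["Authorization"] = f"Bearer {api_key}"
    body = {"model": model, "messages": [{"role": "user", "content": "Hello"}]}
    return urllib.request.Request(
        api_url,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def check_llm(
    api_url: str = "http://localhost:11434/v1/chat/completions",
    model: str = "llama3",
    api_key: str = "",
) -> str:
    """Send the request and return the model's reply text."""
    req = build_chat_request(api_url, model, api_key)
    with urllib.request.urlopen(req, timeout=30) as resp:
        payload = json.load(resp)
    return payload["choices"][0]["message"]["content"]
```

A connection error from `check_llm()` points at the URL or a server that is not running, while an HTTP 401 points at the key.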
### Permission errors
Make sure the script can write to the scan directory:
```bash
chmod u+w scans/
```
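Write access can also be checked from Python by creating and removing a temporary file in the scan directory. A small sketch; `can_write` is an illustrative helper:

```python
import tempfile
from pathlib import Path
from typing import Union


def can_write(scan_dir: Union[str, Path]) -> bool:
    """Return True if a file can be created inside scan_dir."""
    path = Path(scan_dir)
    if not path.is_dir():
        return False
    try:
        # The temporary file is deleted automatically when the block exits
        with tempfile.TemporaryFile(dir=path):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("scans/ writable:", can_write("scans"))
```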
## Example Output
```
============================================================
🤖 Brother Scanner - Auto-Detect Mode
============================================================
📁 Scan directory: scans
🔄 Brother command: brscan-skey -s
🧠 LLM API: http://localhost:11434/v1/chat/completions
============================================================
[14:30:15] Waiting for document...
✓ Document detected by scanner
→ Starting scan: brscan-skey -s -f scans/scan_20260214_143015.pdf
✓ Scan completed: scan_20260214_143015.pdf (245678 bytes)
→ Generating title with LLM...
✓ LLM title: Invoice from Acme Corp
✓ Saved as: Invoice from Acme Corp - 20260214_143015.pdf
```
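Because the LLM reply becomes part of a filename, it is worth sanitizing before use. A hedged sketch of one way to do that; `safe_filename` is a hypothetical helper, and both the character whitelist and the length limit are assumptions:

```python
import re


def safe_filename(title: str, timestamp: str, max_len: int = 60) -> str:
    """Build '{title} - {timestamp}.pdf' from an untrusted title string."""
    # Keep word characters (letters, digits, underscore), whitespace, hyphens
    cleaned = re.sub(r"[^\w\s-]", "", title)
    # Collapse runs of whitespace, including newlines, into single spaces
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    # Fall back to a generic name if nothing usable is left
    cleaned = cleaned[:max_len].rstrip() or "document"
    return f"{cleaned} - {timestamp}.pdf"
```

Path separators and quotes are dropped, so a reply such as `"Invoice: Acme/Corp"` cannot escape the scan directory.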
## License
MIT License - Feel free to use and modify as needed.
## Contributing
Suggestions and improvements welcome!


@@ -0,0 +1,14 @@
# Brother Scanner Configuration
# Directory where scanned PDFs will be saved
scan_dir: "scans"
# Brother scanner command (adjust based on your scanner model)
# For Brother DS-640, you might use: brscan-skey -s
brother_cmd: "brscan-skey -s"
# LLM API configuration (OpenAI-compatible)
# For OpenAI: https://api.openai.com/v1/chat/completions
# For Ollama: http://localhost:11434/v1/chat/completions
api_url: "http://localhost:11434/v1/chat/completions"
model: "llama3"


@@ -0,0 +1,3 @@
requests>=2.31.0
pyyaml>=6.0.1
python-dotenv>=1.0.0


@@ -0,0 +1,261 @@
#!/usr/bin/env python3
"""
Auto-scanner script for Brother DS mobile scanners.
Auto-detects when a document is placed and starts scanning automatically.
"""
import os
import sys
import subprocess
import time
import json
from datetime import datetime
from pathlib import Path
from typing import Optional, Dict
# Add parent directory to path for config import
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
try:
    import requests
    from dotenv import load_dotenv
    import yaml
except ImportError as e:
    print(f"Missing dependency: {e}")
    print("Run: pip install -r requirements.txt")
    sys.exit(1)


class ScannerAuto:
    """Automated Brother scanner with LLM-powered naming."""

    def __init__(self, config_path: str = "config.yml"):
        self.load_config(config_path)
        self.scan_count = 0

    def load_config(self, config_path: str):
        """Load configuration from the YAML file and the API key from the environment."""
        # Read variables from .env (if present) into the process environment
        load_dotenv()
        with open(config_path, 'r') as f:
            # An empty YAML file parses to None, so fall back to an empty dict
            config = yaml.safe_load(f) or {}
        self.scan_dir = Path(config.get('scan_dir', 'scans'))
        self.scan_dir.mkdir(parents=True, exist_ok=True)
        self.brother_cmd = config.get('brother_cmd', 'brscan-skey -s')
        self.api_url = config.get('api_url', 'http://localhost:11434/v1/chat/completions')
        # The API key lives in the environment (.env), not in config.yml
        self.api_key = os.getenv('API_KEY', '')
        self.model = config.get('model', 'llama3')
        self.log_file = self.scan_dir / 'scan_log.json'

    def log_scan(self, original_name: str, final_name: str):
        """Log scan metadata."""
        log_entry = {
            'timestamp': datetime.now().isoformat(),
            'original_name': original_name,
            'final_name': final_name
        }
        if self.log_file.exists():
            with open(self.log_file, 'r') as f:
                logs = json.load(f)
        else:
            logs = []
        logs.append(log_entry)
        with open(self.log_file, 'w') as f:
            json.dump(logs, f, indent=2)

    def detect_document(self) -> bool:
        """
        Detect if a document is placed in the scanner.
        Returns True if document detected, False otherwise.
        """
        try:
            # Brother scanner detection
            result = subprocess.run(
                self.brother_cmd,
                shell=True,
                capture_output=True,
                text=True,
                timeout=5
            )
            # Check if scanner reports a document is ready
            # This may vary based on Brother scanner model and tools
            output = result.stdout.lower()
            if result.returncode == 0 and ('ready' in output or 'scan' in output):
                print("✓ Document detected by scanner")
                return True
            return False
        except subprocess.TimeoutExpired:
            print("✗ Scanner timeout")
            return False
        except Exception as e:
            print(f"✗ Scanner detection error: {e}")
            return False

    def start_scan(self) -> Optional[str]:
        """Start scanning and return the saved filename."""
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        # Create output filename (temporary until LLM generates title)
        temp_name = f"scan_{timestamp}.pdf"
        output_path = self.scan_dir / temp_name
        try:
            # Start scanning using Brother CLI
            scan_cmd = f"{self.brother_cmd} -f {output_path}"
            print(f"→ Starting scan: {scan_cmd}")
            result = subprocess.run(
                scan_cmd,
                shell=True,
                capture_output=True,
                text=True,
                timeout=60  # 1 minute timeout for scan
            )
            if result.returncode == 0 and output_path.exists():
                file_size = output_path.stat().st_size
                print(f"✓ Scan completed: {temp_name} ({file_size} bytes)")
                # Use LLM to generate meaningful title
                title = self.generate_title(output_path)
                # Drop characters that are unsafe in filenames (e.g. "/" or quotes)
                safe_title = "".join(
                    c for c in title if c.isalnum() or c in " -_"
                ).strip() or "document"
                # Rename with final title
                final_name = f"{safe_title} - {timestamp}.pdf"
                final_path = self.scan_dir / final_name
                output_path.rename(final_path)
                self.log_scan(temp_name, final_name)
                print(f"✓ Saved as: {final_name}")
                return final_name
            else:
                print(f"✗ Scan failed: {result.stderr}")
                return None
        except subprocess.TimeoutExpired:
            print("✗ Scan timeout")
            return None
        except Exception as e:
            print(f"✗ Scan error: {e}")
            return None

    def generate_title(self, pdf_path: Path) -> str:
        """Generate a meaningful title using LLM."""
        try:
            # NOTE: pdf_path is not read yet, so the prompt carries no document
            # text. Extracting text first (for example with pdfminer) is left
            # as a future improvement.
            print("→ Generating title with LLM...")
            # Simple prompt for LLM
            prompt = (
                "Analyze this document and suggest a concise, descriptive title "
                "(no more than 5 words). "
                "Focus on the document type (invoice, receipt, contract, letter, etc.). "
                "Return only the title, no other text."
            )
            # Send the Authorization header only when a key is configured,
            # so local servers such as Ollama work with an empty API_KEY
            headers = {"Content-Type": "application/json"}
            if self.api_key:
                headers["Authorization"] = f"Bearer {self.api_key}"
            response = requests.post(
                self.api_url,
                headers=headers,
                json={
                    "model": self.model,
                    "messages": [
                        {"role": "system", "content": "You are a helpful assistant that generates document titles."},
                        {"role": "user", "content": prompt}
                    ],
                    "max_tokens": 50,
                    "temperature": 0.3
                },
                timeout=30
            )
            if response.status_code == 200:
                title = response.json()['choices'][0]['message']['content'].strip()
                # Clean up title (remove quotes, extra whitespace)
                title = title.strip('"\'').strip()
                print(f"✓ LLM title: {title}")
                # Fall back to basic naming if the reply is empty
                return title or "document"
            print(f"✗ LLM API error: {response.status_code}")
            return "document"
        except Exception as e:
            print(f"✗ LLM error: {e}")
            return "document"

    def run(self, max_scans: Optional[int] = None):
        """Main loop - auto-detects and scans documents."""
        print("=" * 60)
        print("🤖 Brother Scanner - Auto-Detect Mode")
        print("=" * 60)
        print(f"📁 Scan directory: {self.scan_dir}")
        print(f"🔄 Brother command: {self.brother_cmd}")
        print(f"🧠 LLM API: {self.api_url}")
        print("=" * 60)
        if max_scans:
            print(f"⏱️ Max scans: {max_scans}")
            print("=" * 60)
        self.scan_count = 0
        try:
            while True:
                if max_scans and self.scan_count >= max_scans:
                    print(f"\n✓ Completed {max_scans} scans")
                    break
                print(f"\n[{datetime.now().strftime('%H:%M:%S')}] Waiting for document...")
                # Wait for document detection
                while not self.detect_document():
                    time.sleep(2)
                # Document detected - start scanning
                self.scan_count += 1
                self.start_scan()
        except KeyboardInterrupt:
            print(f"\n\n✓ Stopped after {self.scan_count} scans")
        except Exception as e:
            print(f"\n✗ Error: {e}")


def main():
    """Main entry point."""
    import argparse
    parser = argparse.ArgumentParser(description='Auto-scan Brother scanner with LLM naming')
    parser.add_argument('--config', default='config.yml', help='Config file path')
    parser.add_argument('--max-scans', type=int, help='Maximum number of scans')
    parser.add_argument('--test', action='store_true', help='Test detection without scanning')
    args = parser.parse_args()
    scanner = ScannerAuto(args.config)
    if args.test:
        print("🔍 Testing scanner detection...")
        if scanner.detect_document():
            print("✓ Scanner detected")
        else:
            print("✗ No document detected")
    else:
        scanner.run(args.max_scans)


if __name__ == '__main__':
    main()