# DP2 Moderation Assistant - Testing Framework

This testing framework uses Playwright to exercise the DP2 moderation assistant end to end, with a particular focus on debugging calculation issues.

## Overview

The testing suite covers:
- **24 crime types** across 4 categories
- **Form interaction testing** for all input types
- **Calculation accuracy testing** for punishment logic
- **Edge case testing** and validation
- **Debug utilities** with screenshot capture and detailed logging

## Quick Start

### Install Dependencies
```bash
npm install
npx playwright install
```

### Run All Tests
```bash
npm test
```

### Run Tests with UI
```bash
npm run test:ui
```

### Run Debug Tests (Recommended for troubleshooting)
```bash
npm run test:debug
```

### View Test Reports
```bash
npm run test:report
```

## Test Structure

```
tests/
├── dp2-form.spec.ts          # Main functionality tests
├── debug.spec.ts             # Debug utilities with screenshots
└── utils/
    ├── page-objects.ts       # Page object models
    └── test-data.ts          # Test data factories
```

## Crime Categories Tested

### Item Offenses (5 crimes)
- Theft (with special items and classification)
- Unconsensual Killing
- Illegal Item Use
- Inappropriate Item Names
- Inappropriate Book Contents

### Block Offenses (6 crimes)
- Vandalism
- Grief (with classification: minor/moderate/large/massive)
- Theft-Grief
- Vandalism of Infrastructure
- Trespassing (regular + staff/SPP)
- Trespassing on Staff/SPP Land

### Hacking Offenses (5 crimes)
- X-Raying
- Hacking Client
- Lagging Server
- Worldedit Misuse
- Exploit Abuse

### Communication Offenses (8 crimes)
- Abusive Chat
- Inciting Verbal Conflict
- Abusive VC Language
- Lying to Staff
- Manipulation
- Grand Manipulation
- Slander (Against SPP Only)
- Violation of NCA

## Key Features

### 🔍 Debug Tools
- **Screenshot Capture**: Every step saved as PNG for visual debugging
- **Detailed Logging**: Console output with form values and results
- **Step-by-Step Testing**: Individual test flows for complex scenarios
- **Error Isolation**: Targeted tests for specific crime types

### 📊 Test Data Factory
- **Reusable Scenarios**: Pre-built test cases for all crime types
- **Edge Cases**: Zero values, maximum values, validation tests
- **Point Decay Testing**: Automatic point reduction calculations
- **SPP Modifiers**: Special protection person logic testing

### 🎯 Page Object Model
- **Type-Safe Interactions**: Full TypeScript support
- **Dynamic Form Handling**: Adapts to different crime requirements
- **Result Validation**: Automated checking of commands and summaries
- **Reset Functionality**: Clean state between tests

## Debugging Workflows

### 1. Quick Diagnosis
```bash
npm run test:debug
```
This runs step-by-step tests with screenshots showing exactly where calculations fail.

### 2. Specific Crime Testing
```bash
npx playwright test --grep "theft"
npx playwright test --grep "grief"
```

### 3. Visual Debugging
Screenshots are saved to the `debug-screenshots/` directory:
- `01-basic-info-filled.png` - Form filled
- `02-theft-selected.png` - Crime selected
- `03-special-items-filled.png` - Special items entered
- `04-additional-items-added.png` - Additional items added
- `05-calculate-clicked.png` - Calculate button pressed
- `06-results-shown.png` - Final results (or error screenshot)
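
The numbered naming scheme above keeps screenshots sorted in step order. A minimal sketch of a helper that builds such paths follows; `screenshotPath` is a hypothetical name for illustration, and the actual debug tests may construct file names differently.

```typescript
// Hypothetical helper: builds a zero-padded, slugified screenshot path
// in the style of the file names listed above.
function screenshotPath(step: number, label: string, dir = "debug-screenshots"): string {
  const slug = label
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse spaces and punctuation into hyphens
    .replace(/^-+|-+$/g, "");    // trim leading and trailing hyphens
  const prefix = String(step).padStart(2, "0");
  return `${dir}/${prefix}-${slug}.png`;
}

console.log(screenshotPath(1, "Basic info filled"));
// debug-screenshots/01-basic-info-filled.png
```

The returned string can be passed as the `path` option of Playwright's `page.screenshot()`.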

### 4. Console Analysis
Debug tests output detailed logs:
```
=== RESULTS ===
Commands: ['/note Player1 11', '/warn Player1 DP2 violation']
Summary: { crime: 'Theft', basePoints: 11, totalPoints: 11, punishment: 'warning' }
Explanation: Crime: Theft (11 points)...
```

## Known Issues & Fixes

The testing framework has identified three calculation bugs in the current implementation:

### 1. Item Point Calculation Error
**Issue**: Calculator sums item quantities instead of using DP2 item point values.
**Expected**: Use `ITEM_POINTS` mapping (elytra = 20 points, diamond = 5 points, etc.)
**Status**: Tests will fail until this is fixed.
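
To make the difference concrete, here is a minimal sketch contrasting the buggy sum with the expected lookup. Only the elytra and diamond values come from this README; the `StolenItem` shape, the function names, and the assumption that an item's points are multiplied by its quantity are illustrative and may not match `useDP2Calculator.ts`.

```typescript
// Only these two values are taken from this README; the real mapping is larger.
const ITEM_POINTS: Record<string, number> = { elytra: 20, diamond: 5 };

// Hypothetical input shape for illustration.
interface StolenItem {
  name: string;
  quantity: number;
}

// Behavior described in the issue: adds up quantities and ignores item values.
function sumQuantities(items: StolenItem[]): number {
  return items.reduce((total, item) => total + item.quantity, 0);
}

// Expected behavior (assuming points scale with quantity).
function totalItemPoints(items: StolenItem[]): number {
  return items.reduce((total, item) => {
    const points = ITEM_POINTS[item.name];
    if (points === undefined) {
      throw new Error(`No point value defined for item: ${item.name}`);
    }
    return total + points * item.quantity;
  }, 0);
}

const haul: StolenItem[] = [
  { name: "elytra", quantity: 1 },
  { name: "diamond", quantity: 4 },
];
console.log(sumQuantities(haul));   // 5
console.log(totalItemPoints(haul)); // 40
```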

### 2. Missing Theft Classification
**Issue**: Theft should be classified as minor/moderate/severe based on total item points.
**Expected**: <50 points = minor (1 base), 50-500 = moderate (2 base), >500 = severe (3 base)
**Status**: Currently uses a fixed 1-point base.
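
The expected thresholds can be sketched as a small pure function. The name `classifyTheft` and the return shape are assumptions for illustration; the thresholds and base points are the ones listed above.

```typescript
type TheftClass = "minor" | "moderate" | "severe";

interface TheftClassification {
  classification: TheftClass;
  basePoints: number;
}

// Hypothetical helper implementing the thresholds listed above:
// <50 = minor (1), 50-500 = moderate (2), >500 = severe (3).
function classifyTheft(totalItemPoints: number): TheftClassification {
  if (totalItemPoints < 50) {
    return { classification: "minor", basePoints: 1 };
  }
  if (totalItemPoints <= 500) {
    return { classification: "moderate", basePoints: 2 };
  }
  return { classification: "severe", basePoints: 3 };
}

console.log(classifyTheft(49));  // minor, 1 base point
console.log(classifyTheft(50));  // moderate, 2 base points
console.log(classifyTheft(501)); // severe, 3 base points
```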

### 3. Missing Grief Classification
**Issue**: Grief should classify based on block count.
**Expected**: <100 = minor, 100-1000 = moderate, 1000-100000 = large, >100000 = massive
**Status**: Currently uses a fixed 1-point base.
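
A table-driven sketch of the expected grief tiers is shown below. The listed ranges overlap at exactly 1000 blocks, so this sketch assumes inclusive upper bounds (1000 counts as moderate); check that choice against the DP2 rules. `GRIEF_TIERS` and `classifyGrief` are hypothetical names, and base points per tier are omitted because this README does not state them.

```typescript
type GriefClass = "minor" | "moderate" | "large" | "massive";

// Inclusive upper bounds per tier. Treating exactly 1000 blocks as
// "moderate" is an assumption; the README's ranges overlap at that value.
const GRIEF_TIERS: Array<{ maxBlocks: number; classification: GriefClass }> = [
  { maxBlocks: 99, classification: "minor" },
  { maxBlocks: 1000, classification: "moderate" },
  { maxBlocks: 100000, classification: "large" },
];

function classifyGrief(blockCount: number): GriefClass {
  if (!Number.isInteger(blockCount) || blockCount < 0) {
    throw new RangeError(`Invalid block count: ${blockCount}`);
  }
  for (const tier of GRIEF_TIERS) {
    if (blockCount <= tier.maxBlocks) {
      return tier.classification;
    }
  }
  return "massive"; // anything above 100000 blocks
}

console.log(classifyGrief(99));     // minor
console.log(classifyGrief(100));    // moderate
console.log(classifyGrief(100001)); // massive
```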

## Contributing

### Adding New Test Scenarios
1. Add to `TestDataFactory` in `tests/utils/test-data.ts`
2. Include expected results based on DP2 rules
3. Run tests to verify
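
As a rough illustration of step 1, a new scenario might be added as a factory method that accepts overrides. The interfaces and the `theftWarning` method below are hypothetical; the real types in `tests/utils/test-data.ts` may differ. The expected values mirror the sample debug log shown earlier in this README.

```typescript
// Hypothetical scenario shape; adjust to match tests/utils/test-data.ts.
interface ExpectedResult {
  commands: string[];
  punishment: string;
  totalPoints: number;
}

interface TestScenario {
  name: string;
  player: string;
  crime: string;
  expected: ExpectedResult;
}

class TestDataFactory {
  // Defaults mirror the sample debug log; any field can be overridden.
  static theftWarning(overrides: Partial<TestScenario> = {}): TestScenario {
    return {
      name: "theft results in a warning",
      player: "Player1",
      crime: "Theft",
      expected: {
        commands: ["/note Player1 11", "/warn Player1 DP2 violation"],
        punishment: "warning",
        totalPoints: 11,
      },
      ...overrides,
    };
  }
}

const scenario = TestDataFactory.theftWarning({ name: "theft with a custom name" });
console.log(scenario.name, scenario.expected.commands);
```

Keeping defaults in one place means each test only spells out the fields it cares about.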

### Fixing Calculation Bugs
1. Run debug tests to identify failures
2. Check screenshots in `debug-screenshots/`
3. Fix logic in `src/hooks/useDP2Calculator.ts`
4. Re-run tests to verify fixes

## Configuration

### Playwright Config (`playwright.config.ts`)
- Runs against local dev server (`http://localhost:3000`)
- Tests all 3 browsers (Chromium, Firefox, WebKit)
- Captures screenshots on failure
- Generates HTML reports
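
A configuration matching these bullets could look like the following sketch. The `testDir` value and the `fullyParallel` setting are assumptions; the base URL, the three browsers, screenshots on failure, and the HTML reporter come from this README. The project's actual `playwright.config.ts` may differ.

```typescript
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './tests',   // assumed from the test structure shown above
  fullyParallel: true,  // assumption: run test files in parallel
  reporter: 'html',
  use: {
    baseURL: 'http://localhost:3000',
    screenshot: 'only-on-failure',
  },
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    { name: 'firefox', use: { ...devices['Desktop Firefox'] } },
    { name: 'webkit', use: { ...devices['Desktop Safari'] } },
  ],
});
```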

### Test Environment
- Requires Node.js and npm
- Next.js dev server must be running
- 3 browser engines installed via `npx playwright install`

## Performance

- **~30 test scenarios** covering all crime types
- **Parallel execution** across browsers
- **~2-3 minutes** for full test suite
- **Debug tests**: ~5-10 minutes with screenshots

## CI/CD Integration

Tests can be run in CI with:
```yaml
- run: npm ci
- run: npx playwright install
- run: npm test
```

On CI machines that lack the system libraries the browsers need, install those dependencies before running the tests:
```bash
npx playwright install-deps
npm test