Testing Philosophy

Understanding our approach to testing NHL Scrabble.

Testing Goals

Our tests aim to:

  1. Verify correctness - Does the code work as intended?

  2. Prevent regressions - Does existing behavior still work after changes?

  3. Document behavior - What is the expected behavior?

  4. Enable refactoring - Can we change code safely?

Test Pyramid

           /\
          /  \          Integration (20%)
         /____\         - Full workflows
        /      \        - Component interactions
       /  Unit  \       - Mocked external APIs
      /__________\
      Unit (80%)
      - Individual functions
      - Isolated components
      - Fast execution

Why this distribution?

Unit tests (80%):

  • Fast to run (<1s for all)

  • Easy to write

  • Pinpoint failures

  • Test one thing at a time

Integration tests (20%):

  • Verify components work together

  • Test real user workflows

  • Catch integration bugs

  • Slower but comprehensive

No E2E tests:

  • CLI tool, not web app

  • Integration tests cover full workflow

  • Would be slow and fragile

What We Test

Core Logic (100% coverage goal)

  • ScrabbleScorer - Letter value calculations

  • TeamProcessor - Score aggregation

  • PlayoffCalculator - Playoff seeding logic

Why?: This is our unique value - must be correct.
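
For example, an aggregation test can assert the defining property instead of a hard-coded total. A minimal sketch - TeamProcessor's constructor and score_team method are assumed names here, not the confirmed API:

def test_team_score_is_sum_of_player_scores():
    """A team's total should equal the sum of its players' scores."""
    # Sketch: TeamProcessor(scorer=...) and score_team() are assumed names
    scorer = ScrabbleScorer()
    processor = TeamProcessor(scorer=scorer)

    total = processor.score_team(["MATTHEWS", "MARNER"])

    expected = scorer.calculate_score("MATTHEWS") + scorer.calculate_score("MARNER")
    assert total == expected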

API Client (Integration tests)

  • Mock NHL API responses

  • Test retry logic

  • Test rate limiting

  • Test error handling

Why?: External dependency - can’t rely on the real API.
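
Retry logic, for instance, can be exercised by making the mocked call fail once before succeeding. A sketch - it assumes NHLApiClient retries transient connection errors, and the payload shape is illustrative:

import requests


def test_fetch_roster_retries_on_transient_failure(mocker):
    """Assumes NHLApiClient retries a failed request before giving up."""
    mock_get = mocker.patch("requests.get")
    ok_response = mocker.Mock()
    ok_response.json.return_value = {"forwards": [], "defensemen": [], "goalies": []}
    # First call raises, second succeeds
    mock_get.side_effect = [requests.ConnectionError("timeout"), ok_response]

    NHLApiClient().fetch_roster("TOR")

    assert mock_get.call_count == 2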

CLI (Integration tests)

  • Test command invocation

  • Test option parsing

  • Test output formatting

  • Test error messages

Why?: User-facing - must work correctly.
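
Because the CLI is built on Click, click.testing.CliRunner can invoke commands in-process. A sketch - the cli entry point and the standings command name are assumptions, and mock_nhl_api is the fixture shown under Test Fixtures below:

from click.testing import CliRunner


def test_cli_standings_succeeds(mock_nhl_api):
    """Invoke the CLI in-process; no real network calls are made."""
    runner = CliRunner()
    result = runner.invoke(cli, ["standings"])

    assert result.exit_code == 0
    assert result.output  # some report was printed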

Reports (Unit tests)

  • Test formatting logic

  • Test data transformations

  • Test edge cases

Why?: Complex formatting - easy to break.
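
Empty input is a typical edge case worth pinning down. A sketch - format_standings is a hypothetical report helper:

def test_format_standings_empty_input_returns_header_only():
    """Formatting an empty data set should still produce a header."""
    output = format_standings([])  # hypothetical helper

    assert "Team" in output  # assumes a header row is always emitted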

What We Don’t Test

External APIs

Don’t test:

  • NHL API responses

  • Network behavior

  • Third-party libraries

Why?:

  • Out of our control

  • Would be slow and flaky

  • Mock instead

Framework Code

Don’t test:

  • Click argument parsing

  • Pydantic validation

  • Python standard library

Why?:

  • Already tested by authors

  • Trust mature libraries

  • Focus on our code

Mocking Strategy

When to Mock

Mock external dependencies:

  • NHL API calls

  • Network requests

  • File system operations

  • Time/dates (see the time sketch under How We Mock)

Don’t mock:

  • Our own code

  • Data structures

  • Pure functions

How We Mock

Using pytest-mock and unittest.mock:

def test_fetch_roster(mocker):
    # Mock external API
    mock_get = mocker.patch("requests.get")
    mock_get.return_value.json.return_value = {...}

    # Test our code
    client = NHLApiClient()
    roster = client.fetch_roster("TOR")

    # Verify
    assert len(roster) == 25
    mock_get.assert_called_once()
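
Time and dates follow the same pattern: patch the name where our code looks it up, not where it is defined. A sketch with hypothetical names - the nhl_scrabble.season module and current_season() are illustrations, not the actual code:

from datetime import datetime


def test_current_season_uses_todays_date(mocker):
    # Patch datetime in the (hypothetical) module that calls it
    mock_dt = mocker.patch("nhl_scrabble.season.datetime")
    mock_dt.now.return_value = datetime(2024, 10, 15)

    assert current_season() == "2024-2025"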

Test Organization

tests/
├── unit/                   # Fast, isolated unit tests
│   ├── test_scrabble.py    # Scoring logic
│   ├── test_processor.py   # Business logic
│   └── test_models.py      # Data models
├── integration/            # Slower integration tests
│   ├── test_cli.py         # CLI workflows
│   └── test_api_client.py  # API integration
└── conftest.py             # Shared fixtures

Naming Conventions

def test_<function>_<scenario>_<expected>():
    """Test that <function> <expected> when <scenario>."""
    # Arrange
    # Act
    # Assert

Examples:

def test_calculate_score_simple_name_returns_correct_value():
    """Test that calculate_score returns correct value for simple name."""


def test_fetch_roster_api_failure_raises_exception():
    """Test that fetch_roster raises exception on API failure."""

Coverage Targets

  • Core logic: 100% coverage required

  • Reports: >90% coverage

  • Utilities: >80% coverage

  • Overall: >80% coverage

Check coverage:

pytest --cov
pytest --cov --cov-report=html
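
pytest-cov can also enforce the threshold rather than just report it:

pytest --cov --cov-fail-under=80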

Test Fixtures

Reusable test data in conftest.py:

@pytest.fixture
def sample_player():
    """Sample player for testing."""
    return Player(
        firstName="Alexander", lastName="Ovechkin", sweaterNumber=8, positionCode="L"
    )


@pytest.fixture
def mock_nhl_api(mocker):
    """Mock NHL API responses."""
    mock_get = mocker.patch("requests.get")
    mock_get.return_value.json.return_value = {...}
    return mock_get
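
Tests receive a fixture by naming it as a parameter. A sketch, assuming standard Scrabble letter values and upper-casing before scoring:

def test_score_sample_player(sample_player):
    """Consume the shared sample_player fixture from conftest.py."""
    scorer = ScrabbleScorer()
    score = scorer.calculate_score(sample_player.lastName.upper())

    # O=1, V=4, E=1, C=3, H=4, K=5, I=1, N=1
    assert score == 20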

Parametrized Tests

Test multiple inputs efficiently:

@pytest.mark.parametrize(
    "name,expected",
    [
        ("A", 1),
        ("Z", 10),
        ("TEST", 4),
        ("HELLO", 8),
    ],
)
def test_scores(name, expected):
    scorer = ScrabbleScorer()
    assert scorer.calculate_score(name) == expected

Continuous Integration

Tests run automatically on:

  • Every commit (pre-commit hooks)

  • Every push (GitHub Actions)

  • Every PR (required to pass)

  • Multiple Python versions (3.12, 3.13, 3.14, 3.15)

CI workflow:

  - Run pre-commit hooks
  - Run pytest on py3.12, 3.13, 3.14, 3.15
  - Check coverage >80%
  - Run type checking (mypy)
  - Run linting (ruff)

Trade-offs

High Coverage vs Speed

Chose high coverage:

  • ✅ Catch more bugs

  • ✅ Safe refactoring

  • ❌ Slower test suite

Currently: 131 tests in ~25s (acceptable)

Mocking vs Real APIs

Chose mocking:

  • ✅ Fast and reliable

  • ✅ Test edge cases

  • ❌ May miss real API changes

Mitigation: Occasional manual testing against the real API

Unit vs Integration

Chose more unit tests:

  • ✅ Faster feedback

  • ✅ Easier debugging

  • ❌ May miss integration bugs

Mitigation: Comprehensive integration tests for critical paths

Future Improvements

Property-Based Testing

Use Hypothesis for generative testing:

import string

from hypothesis import given
from hypothesis.strategies import text


@given(text(alphabet=string.ascii_letters))
def test_score_any_text(s):
    scorer = ScrabbleScorer()
    assert scorer.calculate_score(s) >= 0

Mutation Testing

Verify tests actually catch bugs:

mutmut run  # Introduce mutations
mutmut results  # See if tests caught them

Performance Testing

Benchmark critical paths:

def test_performance_benchmark(benchmark):
    # benchmark fixture provided by pytest-benchmark
    scorer = ScrabbleScorer()
    result = benchmark(scorer.calculate_score, "LONGNAME" * 100)
    assert result > 0