NHL API Strategy

Understanding our approach to NHL API integration.

The NHL API

NHL provides a public JSON API at api-web.nhle.com with endpoints for:

Current standings
Team rosters
Player statistics
Game schedules
And more

Why This API?

Alternatives Considered

Official NHL Stats API (stats.nhl.com):

More comprehensive data
Better documented
❌ More complex
❌ Requires API key (authentication)

Third-party APIs (ESPN, The Sports DB):

Easier to use
❌ Less reliable
❌ May have usage limits
❌ Not official source

Web Scraping:

Could get any data
❌ Fragile (breaks when HTML changes)
❌ Violates terms of service
❌ Unethical

Chose api-web.nhle.com:

✅ Official NHL source
✅ No authentication required
✅ Simple JSON responses
✅ Reliable uptime
✅ Current roster data

Integration Approach

1. Retry Logic

Network requests can fail. We retry automatically:

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10),
    retry=retry_if_exception_type((RequestException, Timeout)),
)
def _fetch_with_retry(self, url: str) -> dict:
    response = self.session.get(url, timeout=self.timeout)
    response.raise_for_status()
    return response.json()

Why?:

Temporary network blips
Server momentary overload
Connection resets

Strategy: Exponential backoff (wait longer each retry)

2. Rate Limiting

We delay between requests to be polite:

def fetch_roster(self, team_abbrev: str) -> list[Player]:
    time.sleep(self.rate_limit_delay)  # Default 0.3s
    return self._fetch_with_retry(url)

Why?:

Respect API provider
Avoid being blocked
Distribute load over time

Trade-off: Slower (but polite and reliable)

3. Caching

We cache API responses in memory:

@lru_cache(maxsize=128)
def fetch_standings(self) -> list[Team]:
    return self._fetch_standings_uncached()

Why?:

Reduce redundant requests
Faster subsequent runs
Less load on NHL servers

Trade-off: May show slightly stale data

4. Error Handling

We handle errors gracefully:

try:
    roster = api_client.fetch_roster(team_abbrev)
except NHLApiError as e:
    logger.error(f"Failed to fetch {team_abbrev}: {e}")
    failed_teams.append(team_abbrev)
    continue  # Process other teams

Why?:

One team failure shouldn’t break entire analysis
User gets partial results
Failures are logged for debugging

Data Validation

We validate API responses using Pydantic:

class Player(BaseModel):
    firstName: str = Field(..., min_length=1)
    lastName: str = Field(..., min_length=1)
    sweaterNumber: int | None = None
    positionCode: str

Why?:

Catch API changes early
Type-safe data
Self-documenting code
Clear error messages

Trade-offs

Synchronous vs Async

Chose synchronous:

✅ Simpler code
✅ Easier to understand
✅ Sufficient performance
❌ Could be faster with async

Could optimize: Fetch all rosters in parallel with asyncio (~3-5s instead of ~10-15s)

Caching Strategy

Chose in-memory LRU cache:

✅ Simple implementation
✅ Works for CLI use
❌ Cache lost between runs
❌ Not shared across instances

Could optimize: Redis cache for persistent, shared caching

Error Handling Philosophy

Chose graceful degradation:

✅ Partial results better than none
✅ User can see what succeeded
❌ May be unexpected for users

Alternative: Fail fast on any error (more predictable but less useful)

Future Improvements

1. Async/Parallel Fetching

async def fetch_all_rosters(self, teams):
    tasks = [self.fetch_roster(t) for t in teams]
    return await asyncio.gather(*tasks)

Could reduce runtime from ~15s to ~3-5s.

2. Persistent Caching

import redis

cache = redis.Redis()

@cache_with_redis(expiry=3600)
def fetch_standings(): ...

Share cache across runs and users.

3. Webhook/Streaming

Subscribe to roster changes instead of polling:

@on_roster_change
def handle_roster_update(team, changes):
    # Update scores automatically
    ...

Real-time updates without constant polling.

4. GraphQL Alternative

If NHL provides GraphQL:

query {
  teams {
    abbreviation
    roster {
      firstName
      lastName
    }
  }
}

More efficient - fetch exactly what we need.

NHL API Strategy

The NHL API

Why This API?

Alternatives Considered

Integration Approach

1. Retry Logic

2. Rate Limiting

3. Caching

4. Error Handling

Data Validation

Trade-offs

Synchronous vs Async

Caching Strategy

Error Handling Philosophy

Future Improvements

1. Async/Parallel Fetching

2. Persistent Caching

3. Webhook/Streaming

4. GraphQL Alternative

Related