Architecture & Internals
This document explains how the pretalx-client package is structured, why it is
structured that way, and what you need to know to work on it effectively.
Three-layer design
The package is split into three layers. Each has a distinct responsibility and a clear boundary with the layer above it.
            Your code
                |
                v
    +--------------------------+
    | PretalxClient            |  client.py + models.py
    | (public API)             |  Frozen dataclasses, typed methods
    +--------------------------+
                |  delegates HTTP to
                v
    +--------------------------+
    | adapters/                |  normalization.py, schedule.py, talks.py
    | (normalization)          |  Multilingual fields, ID resolution, fallback logic
    +--------------------------+
                |  consumes raw dicts from
                v
    +--------------------------+
    | generated/               |  http_client.py + models.py
    | (auto-generated)         |  One method per OpenAPI endpoint, raw dataclasses
    +--------------------------+
                |  uses
                v
              httpx
                |
                v
         Pretalx REST API
Data flows down as method calls and up as raw dicts that get progressively refined into typed, frozen dataclasses.
Layer 1: Generated (generated/)
Everything in this directory is machine-produced. Two scripts in the parent
django-program repo handle regeneration:
- scripts/pretalx/generate_client.py runs datamodel-code-generator against schemas/pretalx/schema.yml to produce generated/models.py. This file contains plain dataclasses like Submission, Speaker, TalkSlot, and Room that mirror the OpenAPI component schemas.
- scripts/pretalx/generate_http_client.py reads the same schema and emits generated/http_client.py containing GeneratedPretalxClient. This class has one method per operationId in the spec – things like speakers_list(), submissions_list(), slots_list(), and rooms_list(). Every method returns raw dict[str, Any] or list[dict[str, Any]].
GeneratedPretalxClient also provides the pagination and error-handling
primitives that the rest of the package depends on:
| Method | Behavior |
|---|---|
| | Single HTTP request, raises |
| | Same, but returns |
| | Follows |
| | Same, but returns |
Pagination works by reusing a single httpx.Client session and clearing query
params after the first request (subsequent pages encode params in the next
URL).
Do not edit generated files by hand. Regenerate them:
# From the django-program root:
uv run python scripts/pretalx/generate_client.py # models.py
uv run python scripts/pretalx/generate_http_client.py # http_client.py
The generated/__init__.py re-exports the types that the adapter and client
layers actually use, aliased with Generated prefixes to avoid name collisions:
from pretalx_client.generated.models import Speaker as GeneratedSpeaker
from pretalx_client.generated.models import Submission as GeneratedSubmission
from pretalx_client.generated.models import TalkSlot as GeneratedTalkSlot
# etc.
Layer 2: Adapters (adapters/)
The Pretalx API has a few behaviors that make a raw generated client painful to use directly. The adapter layer exists to absorb those quirks.
normalization.py – multilingual fields and ID resolution
Pretalx represents localizable text in two ways depending on the endpoint and authentication level:
# Sometimes a plain string:
{"name": "Main Hall"}
# Sometimes a language-keyed dict:
{"name": {"en": "Main Hall", "de": "Hauptsaal"}}
localized() handles both. It prefers the "en" key, falls back to the first
available value, and returns "" for None. It also recurses into nested
{"name": {...}} structures that appear in some responses.
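A minimal sketch of the rules just described, assuming nothing about the real implementation beyond that paragraph. The name localized_sketch is hypothetical, and treating a "name" key as the nested wrapper is an assumption based on the shape mentioned above.

```python
from typing import Any


def localized_sketch(value: Any) -> str:
    """Reduce a plain string, language-keyed dict, or None to one string."""
    if value is None:
        return ""
    if isinstance(value, dict):
        # Assumed nested shape: {"name": {...}} wraps the real value.
        if "name" in value:
            return localized_sketch(value["name"])
        # Prefer English when it is present.
        if "en" in value:
            return localized_sketch(value["en"])
        # Otherwise fall back to the first available translation.
        return localized_sketch(next(iter(value.values()), None))
    return str(value)
```

Recursing on the selected value means a None or nested entry is handled by the same rules as the top-level input.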
A related problem is foreign-key fields. The real (authenticated) API returns
integer IDs for submission_type, track, room, and tags. The public API
sometimes returns inline objects instead. resolve_id_or_localized() accepts
either form:
# Integer ID with a mapping dict:
resolve_id_or_localized(42, {42: "Tutorial"}) # "Tutorial"
# Localized dict without a mapping:
resolve_id_or_localized({"en": "Tutorial"}) # "Tutorial"
# None:
resolve_id_or_localized(None) # ""
resolve_many_ids_or_localized() does the same for lists (used by tags).
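One way the two helpers could behave is sketched below. The names resolve_sketch and resolve_many_sketch are hypothetical, and two details are assumptions rather than documented behavior: an ID missing from the mapping falls back to its string form, and the list variant returns a list of strings.

```python
from typing import Any, Optional


def resolve_sketch(value: Any, mapping: Optional[dict[int, str]] = None) -> str:
    """Resolve an integer ID, an inline localized dict, or None to a name."""
    if value is None:
        return ""
    if isinstance(value, int):
        # Assumption: an unmapped ID is returned as its string form.
        return (mapping or {}).get(value, str(value))
    if isinstance(value, dict):
        # Inline object: prefer English, else the first available value.
        return str(value.get("en") or next(iter(value.values()), ""))
    return str(value)


def resolve_many_sketch(
    values: Optional[list[Any]],
    mapping: Optional[dict[int, str]] = None,
) -> list[str]:
    """Apply resolve_sketch to each element; None is treated as an empty list."""
    return [resolve_sketch(v, mapping) for v in values or []]
```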
schedule.py – slot normalization and datetime parsing
Two slot formats exist in the wild:
| Field | Legacy format ( | Paginated format ( |
|---|---|---|
| Room | | |
| Talk ref | | |
| Title | | Not present |
normalize_slot() unifies both into a consistent dict with keys room,
start, end, code, title, start_dt, and end_dt. The room value goes
through resolve_id_or_localized(), so it works with both strings and IDs.
parse_datetime() is a thin wrapper around datetime.fromisoformat() that
returns None instead of raising on bad input.
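A wrapper of this kind might look like the sketch below. The name parse_datetime_sketch is hypothetical, and rejecting non-string input up front is an assumption about what counts as "bad input".

```python
from datetime import datetime
from typing import Any, Optional


def parse_datetime_sketch(value: Any) -> Optional[datetime]:
    """Return a datetime for ISO 8601 input, or None if it cannot be parsed."""
    if not isinstance(value, str) or not value:
        return None
    try:
        return datetime.fromisoformat(value)
    except ValueError:
        # Malformed strings are reported as None rather than raised.
        return None


start = parse_datetime_sketch("2026-05-15T09:00:00+00:00")
```

Returning None keeps a single malformed timestamp from aborting the normalization of an entire schedule.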
talks.py – the talks endpoint fallback
This adapter encapsulates a real-world compatibility problem.
fetch_talks_with_fallback() implements a two-step strategy:
1. Try GET /api/events/{slug}/talks/
2. If that returns 404, fall back to fetching GET /api/events/{slug}/submissions/?state=confirmed and GET /api/events/{slug}/submissions/?state=accepted, then concatenate the results.
The fallback exists because some Pretalx instances – including PyCon US – do
not expose the /talks/ endpoint at all. The submissions endpoint returns the
same data in a slightly different shape, and the rest of the normalization
pipeline handles the differences.
The function takes a PretalxClient instance and calls its _get_paginated()
and _get_paginated_or_none() methods directly. It returns raw dicts; the
caller is responsible for converting them into PretalxTalk instances.
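The two-step strategy can be sketched against a stub client. The method names _get_paginated() and _get_paginated_or_none() come from the text above, but their parameters, the relative paths, and the StubClient class are assumptions made for illustration.

```python
from typing import Any, Optional


class StubClient:
    """Stand-in for PretalxClient that serves canned data instead of HTTP."""

    def __init__(self, has_talks_endpoint: bool) -> None:
        self.has_talks_endpoint = has_talks_endpoint

    def _get_paginated_or_none(self, path: str) -> Optional[list[dict[str, Any]]]:
        # None models a 404 from the endpoint.
        if path == "talks/" and self.has_talks_endpoint:
            return [{"code": "T1"}]
        return None

    def _get_paginated(self, path: str, params: dict[str, str]) -> list[dict[str, Any]]:
        return [{"code": f"S-{params['state']}"}]


def fetch_talks_sketch(client: StubClient) -> list[dict[str, Any]]:
    talks = client._get_paginated_or_none("talks/")
    if talks is not None:
        return talks
    # /talks/ is missing: combine confirmed and accepted submissions.
    confirmed = client._get_paginated("submissions/", {"state": "confirmed"})
    accepted = client._get_paginated("submissions/", {"state": "accepted"})
    return confirmed + accepted
```

The sketch checks for None rather than truthiness, so an event whose /talks/ endpoint exists but is empty does not trigger the fallback.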
Layer 3: Client (client.py + models.py)
This is the public API. Consumers import PretalxClient, PretalxSpeaker,
PretalxTalk, PretalxSlot, and SubmissionState from the package root.
PretalxClient
Constructed with an event slug, optional base URL, and optional API token:
client = PretalxClient("pycon-us-2026", api_token="abc123")
Internally it creates a GeneratedPretalxClient and delegates all HTTP through
it. The public methods – fetch_speakers(), fetch_talks(),
fetch_schedule(), etc. – call generated methods to get raw dicts, then pass
them through the appropriate from_api() classmethods on the model dataclasses.
For talks specifically, fetch_talks() routes through
fetch_talks_with_fallback() in the adapter layer before constructing
PretalxTalk instances.
The client also provides _fetch_id_name_mapping(), which fetches lookup tables
from endpoints like /rooms/, /submission-types/, /tracks/, and /tags/.
These mappings are dict[int, str] and get passed into from_api() calls so
integer IDs can be resolved to human-readable names.
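Building such a lookup table from an already-fetched list could look like the sketch below. It assumes every item carries "id" and "name" keys and that the name may be a language-keyed dict; build_id_name_mapping is a hypothetical helper, not the client's private method.

```python
from typing import Any


def build_id_name_mapping(items: list[dict[str, Any]]) -> dict[int, str]:
    """Turn [{"id": 7, "name": ...}, ...] into {7: "..."}."""
    mapping: dict[int, str] = {}
    for item in items:
        name = item.get("name")
        if isinstance(name, dict):
            # Language-keyed name: prefer English, else the first value.
            name = name.get("en") or next(iter(name.values()), "")
        mapping[int(item["id"])] = str(name or "")
    return mapping


rooms = build_id_name_mapping(
    [
        {"id": 7, "name": {"en": "Main Hall", "de": "Hauptsaal"}},
        {"id": 8, "name": "Room B"},
    ]
)
```

A mapping like rooms is what the rooms= argument in the request-flow example further down would receive.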
Model dataclasses
All three model classes (PretalxSpeaker, PretalxTalk, PretalxSlot) are
frozen, slotted dataclasses. Each has a from_api() classmethod that:
1. Tries to parse the raw dict through the corresponding generated dataclass via _parse_generated().
2. If that succeeds, extracts and normalizes fields from the generated instance.
3. If that fails (returns None), falls back to direct dict access with sensible defaults.
This two-path design is the core resilience mechanism. The generated dataclasses capture the “expected” API shape from the OpenAPI spec, but the real API frequently deviates – extra fields, missing fields, different types for the same field across endpoints. The fallback path handles all of that.
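A toy version of the two-path pattern, under stated assumptions: GeneratedSpeakerSketch and SpeakerSketch are hypothetical stand-ins for a generated dataclass and a public model, and the field set is invented for the example. The avatar / avatar_url divergence is used as the trigger for the fallback path.

```python
import dataclasses
from typing import Any, Optional


@dataclasses.dataclass
class GeneratedSpeakerSketch:
    """Stand-in for a generated dataclass; every field here is required."""

    code: str
    name: str
    avatar_url: str


def _try_parse(data: dict[str, Any]) -> Optional[GeneratedSpeakerSketch]:
    names = {f.name for f in dataclasses.fields(GeneratedSpeakerSketch)}
    try:
        return GeneratedSpeakerSketch(**{k: v for k, v in data.items() if k in names})
    except TypeError:
        # A missing required field makes construction fail.
        return None


@dataclasses.dataclass(frozen=True)
class SpeakerSketch:
    code: str
    name: str
    avatar_url: str

    @classmethod
    def from_api(cls, data: dict[str, Any]) -> "SpeakerSketch":
        parsed = _try_parse(data)
        if parsed is not None:
            # Path 1: the payload matched the expected shape.
            return cls(parsed.code, parsed.name, parsed.avatar_url)
        # Path 2: read the dict directly, tolerating the "avatar" variant.
        return cls(
            code=str(data.get("code", "")),
            name=str(data.get("name", "")),
            avatar_url=str(data.get("avatar_url") or data.get("avatar") or ""),
        )
```

Both paths end in the same public type, so callers never need to know which one was taken.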
_parse_generated()
This function is worth understanding in detail:
def _parse_generated[T](cls: type[T], data: dict[str, Any]) -> T | None:
    # Keep only the keys the generated dataclass declares.
    field_names = {f.name for f in _dc.fields(cls)}
    filtered = {k: v for k, v in data.items() if k in field_names}
    try:
        return cls(**filtered)
    except Exception:  # debug logging omitted; the caller falls back
        return None
It filters the raw dict to only the fields the generated dataclass declares,
then tries to construct an instance. If construction fails for any reason –
wrong types, missing required fields, unexpected values – it returns None and
logs a debug message. The caller then falls back to manual dict extraction.
This makes the package forward-compatible with API changes: new fields get ignored, removed fields trigger the fallback, and renamed fields still work through the dict path.
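The three cases in that paragraph can be exercised with a small stand-in. RoomSketch, parse_generated_sketch, and the logger name are hypothetical; the sketch uses a TypeVar instead of the newer generic syntax so it also runs on Python versions before 3.12.

```python
import dataclasses
import logging
from typing import Any, Optional, TypeVar

T = TypeVar("T")
logger = logging.getLogger("pretalx_client.sketch")  # hypothetical logger name


@dataclasses.dataclass
class RoomSketch:
    id: int
    name: str
    capacity: Optional[int] = None


def parse_generated_sketch(cls: type[T], data: dict[str, Any]) -> Optional[T]:
    field_names = {f.name for f in dataclasses.fields(cls)}
    filtered = {k: v for k, v in data.items() if k in field_names}
    try:
        return cls(**filtered)
    except Exception as exc:
        logger.debug("Could not build %s: %s", cls.__name__, exc)
        return None


# New field in the response: silently ignored.
new_field = parse_generated_sketch(RoomSketch, {"id": 7, "name": "Main Hall", "floor": 2})
# Required field removed: construction fails, so the caller would fall back.
removed_field = parse_generated_sketch(RoomSketch, {"id": 7})
# Field renamed: the old name is missing, so this also falls back.
renamed_field = parse_generated_sketch(RoomSketch, {"id": 7, "title": "Main Hall"})
```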
Request flow in practice
A typical fetch_talks() call traverses the entire stack:
client.fetch_talks(submission_types={1: "Talk"}, rooms={7: "Main Hall"})
  |
  +--> fetch_talks_with_fallback(client)                [adapters/talks.py]
  |      |
  |      +--> client._get_paginated_or_none()           [client.py]
  |      |      |
  |      |      +--> self._http._paginate_or_none()     [generated/http_client.py]
  |      |             |
  |      |             +--> httpx.Client.get()          GET /api/events/{slug}/talks/
  |      |             |
  |      |             +--> (follows "next" links until exhausted)
  |      |
  |      +--> (if 404: fall back to /submissions/?state=confirmed + accepted)
  |      |
  |      +--> returns list[dict]
  |
  +--> for each raw dict:
         |
         +--> PretalxTalk.from_api(item, submission_types=..., rooms=...)   [models.py]
                |
                +--> _parse_generated(GeneratedSubmission, item)
                |      |
                |      +--> filters dict to Submission fields
                |      +--> constructs GeneratedSubmission or returns None
                |
                +--> resolve_id_or_localized(raw.submission_type, {1: "Talk"})
                |      [adapters/normalization.py]
                |
                +--> returns frozen PretalxTalk(code="ABC", title="...", ...)
Key design decisions
Why generated code at all?
The Pretalx API has 50+ endpoints. Writing an HTTP method for each by hand is tedious and error-prone. The generated client covers every operation in the schema and builds its URLs from the spec rather than by hand. When Pretalx adds endpoints, regenerating from the updated schema picks them up automatically.
Why not use the generated models directly?
Three reasons:
1. API inconsistency. The OpenAPI spec describes the “ideal” response shape. Real responses diverge: avatar vs avatar_url, speakers as strings vs dicts, integer IDs vs inline objects. The handwritten models absorb this.
2. Consumer ergonomics. The generated Submission dataclass has 17 fields including slots: list[int] and answers: list[int] that consumers never need. PretalxTalk has 12 fields, all resolved to human-readable strings.
3. Stability. Regenerating the generated layer is safe because the public models are a separate, handwritten contract. Internal refactors to the generated code don’t break consumers.
Why frozen dataclasses?
Immutability. Once a PretalxTalk is constructed, it cannot be accidentally
mutated. This makes the objects safe to cache, pass between threads, and use as
dict keys. The slots=True option reduces memory footprint.
Why not pydantic?
This package depends only on httpx. No pydantic, no attrs, no Django. Stdlib
dataclasses plus datamodel-code-generator keeps the dependency tree minimal,
which matters because this package is also used as a standalone library outside
of django-program.
File reference
| File | Purpose |
|---|---|
| | Re-exports |
| | |
| | Frozen dataclasses with |
| | Re-exports adapter functions |
| | |
| | |
| | |
| | Re-exports generated types with |
| | |
| | Generated dataclasses ( |