If there is only one agency show all routes #100

Closed
opened 2026-05-04 23:03:07 +00:00 by maxtkc · 0 comments
Owner

Summary

The GTFS spec makes agency_id optional in agency.txt (when there's only one agency) and in routes.txt (required only when multiple agencies exist). The codebase currently breaks for feeds like LIRR where neither the single agency nor its routes carry an agency_id field — routes never appear in the agency view, and agency display can render empty/broken. The fix is threefold: (1) index agency_id in the virtual-table fieldMaps so missing IDs are bucketed as "", (2) create a small utility that encapsulates the "single-agency fallback" logic so it lives in one place, and (3) update every consumer that compares against agency_id to go through that abstraction. Using "" as the sentinel for a missing agency_id is the natural choice because the CSV parser already produces an empty string for blank/absent fields.

Relevant context

  • Virtual table engine (src/modules/gtfs-parser.ts:buildAndRegisterVirtual, ~L190–270): in-memory store used for all CSV-backed tables. Its query() method uses a pre-built fieldMaps index if available (keyed with String(val ?? ''), so undefined → ""), otherwise falls back to a linear scan with strict === equality. The linear scan does NOT coerce undefined to "", so { agency_id: '' } never matches a route whose agency_id field is absent. The fix: add agency_id to fieldMaps for the agency and routes tables in setupVirtual (~L375–388), which forces the indexed path and gets the normalization for free.
  • setupVirtual (src/modules/gtfs-parser.ts:375): only builds fieldMaps for stop_times (trip_id, stop_id) and trips (route_id, service_id). Routes and agency have no fieldMaps — all their queries fall through to the linear scan.
  • getRoutesForAgencyAsync (src/modules/gtfs-relationships.ts:442): queries routes by { agency_id }. In a single-agency feed where routes omit agency_id, this returns nothing.
  • getRoutesForAgency (sync, src/modules/gtfs-relationships.ts:51): same issue — filter(route => route.agency_id === agency_id) never matches undefined.
  • agency-view-controller.ts:176: calls queryRows('routes', { agency_id }) directly — same problem.
  • map-controller.ts (~L720): route.agency_id === agency_id in highlight logic.
  • service-view-controller.ts (~L192): groups routes by route.agency_id || 'default' — partially defensive but inconsistent with the rest.
  • getAgencyDisplay (src/utils/entity-display.ts:9): falls back to { primary: id ?? '' } — renders a blank card for an agency with no id and no name.
  • Home page (src/modules/page-content-renderer.ts:287): sets data-agency-id="${agencyData.agency_id}" — if id is "", navigation still works but display needs to show "Not specified".

Phase 1 — Core abstraction (src/utils/agency-helpers.ts)

Create a new file with pure utilities that all consumers import. This is the single place that documents and encodes the GTFS optionality rule.

  • Create src/utils/agency-helpers.ts:

    • export const UNSPECIFIED_AGENCY_ID = '' as const
    • export function normalizeAgencyId(id: string | undefined | null): string — returns id if truthy, otherwise UNSPECIFIED_AGENCY_ID
    • export function agencyRouteFilter(agencyId: string, agencyCount: number): string[] — returns [agencyId] in the multi-agency case, or [agencyId, UNSPECIFIED_AGENCY_ID] in the single-agency case (deduplicated if agencyId is already ""). This is the canonical logic for "which agency_id values count as belonging to this agency."
  • Update getAgencyDisplay in src/utils/entity-display.ts:

    • When both agency_name and agency_id are falsy/empty, return { primary: 'Not specified' } instead of an empty string.
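
The helpers listed above could look roughly like the following. This is a minimal sketch based only on the descriptions in this plan, not the committed file; the `AgencyIdInput` alias is a name introduced here for illustration, and the input type uses the widened signature mentioned in the Phase 3 notes.

```typescript
// Sketch of src/utils/agency-helpers.ts, reconstructed from the plan text.
export const UNSPECIFIED_AGENCY_ID = '' as const;

// Hypothetical alias: covers the value types the plan says a record can hold.
type AgencyIdInput = string | number | boolean | undefined | null;

// Returns the id as a string if it is truthy, otherwise the "" sentinel.
// Note that a truthiness check also maps 0 and false to the sentinel.
export function normalizeAgencyId(id: AgencyIdInput): string {
  return id ? String(id) : UNSPECIFIED_AGENCY_ID;
}

// Which agency_id values count as "belonging to this agency".
// Multi-agency feed: only the exact id. Single-agency feed: the id plus the
// sentinel, without listing "" twice when the id is already the sentinel.
export function agencyRouteFilter(agencyId: string, agencyCount: number): string[] {
  if (agencyCount > 1) {
    return [agencyId];
  }
  return agencyId === UNSPECIFIED_AGENCY_ID
    ? [UNSPECIFIED_AGENCY_ID]
    : [agencyId, UNSPECIFIED_AGENCY_ID];
}
```

Keeping both functions pure (no database access) means callers supply the agency count themselves, which is what lets the sync and async code paths share the same rule.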

Phase 2 — Virtual table indexing (src/modules/gtfs-parser.ts)

Fix setupVirtual so agency and route lookups by agency_id use the indexed path (which normalises undefined → ""), not the strict-equality linear scan.

  • In setupVirtual (~L378), add branches for agency and routes:
    } else if (tableName === 'agency') {
      fieldMaps.set('agency_id', new Map());
    } else if (tableName === 'routes') {
      fieldMaps.set('agency_id', new Map());
    }
    
  • Both changes are purely additive — no existing data is mutated.

Gotcha: The fieldMap key is built with String(val ?? ''), so a row where agency_id is undefined ends up in the "" bucket. A queryRows('routes', { agency_id: '' }) call will then correctly find those rows via the map path.
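
To make the difference between the two lookup paths concrete, here is a small standalone sketch. It does not reproduce the real `buildAndRegisterVirtual` code; `linearScan`, `buildIndex`, and the sample rows are hypothetical stand-ins that only mirror the two behaviours described above (strict `===` versus keys built with `String(val ?? '')`).

```typescript
// A row whose optional columns may be absent (undefined).
type Row = Record<string, string | undefined>;

// Stand-in for the fallback path: strict equality, no coercion.
function linearScan(rows: Row[], field: string, value: string): Row[] {
  return rows.filter((row) => row[field] === value);
}

// Stand-in for the indexed path: keys are built with String(val ?? ''),
// so rows with a missing field land in the "" bucket.
function buildIndex(rows: Row[], field: string): Map<string, Row[]> {
  const index = new Map<string, Row[]>();
  for (const row of rows) {
    const key = String(row[field] ?? '');
    const bucket = index.get(key);
    if (bucket) {
      bucket.push(row);
    } else {
      index.set(key, [row]);
    }
  }
  return index;
}

// Made-up routes from a single-agency feed that omits agency_id entirely.
const routes: Row[] = [{ route_id: 'r1' }, { route_id: 'r2' }];

// Strict equality: undefined !== '', so nothing matches.
const linearHits = linearScan(routes, 'agency_id', '').length;

// Indexed lookup: both rows were bucketed under ''.
const indexedHits = (buildIndex(routes, 'agency_id').get('') ?? []).length;

console.log({ linearHits, indexedHits });
```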

Phase 3 — Fix relationship query methods

All methods that fetch routes for an agency must apply the single-agency fallback.

src/modules/gtfs-relationships.ts:

  • getRoutesForAgency(agency_id) (sync, ~L51): count agencies via this.gtfsParser.getFileDataSync('agency.txt').length. Use agencyRouteFilter to get the set of accepted agency_id values; filter routes against that set.
  • getRoutesForAgencyAsync(agency_id) (~L442): await this.gtfsDatabase.getAllRows('agency') to get agency count, call agencyRouteFilter, then union two queryRows calls (one per accepted id).
  • getAgenciesAsync() (~L415): normalize agency_id via normalizeAgencyId when constructing the returned objects' id and agency_id fields.
  • getAgencyByIdAsync(agency_id) (~L984): normalize the incoming agency_id with normalizeAgencyId before passing to queryRows.

src/modules/agency-view-controller.ts:

  • getAgencyData(agency_id) (~L148): normalize with normalizeAgencyId before queryRows.
  • getRoutesForAgency(agency_id) (~L176): fetch agency count via queryRows('agency'), apply agencyRouteFilter, union results with Promise.all.

Notes: normalizeAgencyId signature widened to accept number | boolean | undefined | null to match GTFSDatabaseRecord value types. QueryOnlyDatabase has only queryRows (no getAllRows), so agency count is obtained via queryRows('agency') with no filter.
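
The "count agencies, compute accepted ids, union the queries" pattern used by the async methods can be sketched as below. The `QueryOnlyDatabaseLike` interface, the `GTFSRecord` type, and the in-memory `makeFakeDb` helper are assumptions made for this example; the real `queryRows` signature may differ.

```typescript
type GTFSRecord = Record<string, string | number | boolean | undefined | null>;

// Assumed shape: only queryRows is available, as noted above.
interface QueryOnlyDatabaseLike {
  queryRows(table: string, filter?: Record<string, string>): Promise<GTFSRecord[]>;
}

// Same rule as the Phase 1 helper, inlined so this sketch is self-contained.
function agencyRouteFilter(agencyId: string, agencyCount: number): string[] {
  if (agencyCount > 1 || agencyId === '') {
    return [agencyId];
  }
  return [agencyId, ''];
}

async function getRoutesForAgency(
  db: QueryOnlyDatabaseLike,
  agencyId: string
): Promise<GTFSRecord[]> {
  // No getAllRows here, so the count comes from an unfiltered queryRows.
  const agencies = await db.queryRows('agency');
  const acceptedIds = agencyRouteFilter(agencyId, agencies.length);
  // One query per accepted id, run in parallel, then concatenated.
  const perId = await Promise.all(
    acceptedIds.map((id) => db.queryRows('routes', { agency_id: id }))
  );
  return ([] as GTFSRecord[]).concat(...perId);
}

// Hypothetical in-memory database that mimics the indexed path by
// treating a missing field as "".
function makeFakeDb(tables: Record<string, GTFSRecord[]>): QueryOnlyDatabaseLike {
  return {
    async queryRows(
      table: string,
      filter: Record<string, string> = {}
    ): Promise<GTFSRecord[]> {
      const rows = tables[table] ?? [];
      return rows.filter((row) =>
        Object.keys(filter).every((key) => String(row[key] ?? '') === filter[key])
      );
    },
  };
}
```

Because the accepted ids are deduplicated before querying, the union never returns the same route twice in this sketch.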

Phase 4 — Fix remaining consumers

Any other location that compares route.agency_id or agency.agency_id directly.

  • src/modules/map-controller.ts (highlightAgencyRoutes, ~L720): replace .filter(route => route.agency_id === agency_id) with a filter using normalizeAgencyId on both sides, respecting the single-agency case via agencyRouteFilter (pass agency count or fetch it).
  • src/modules/service-view-controller.ts (~L192): replace route.agency_id || 'default' with normalizeAgencyId(route.agency_id) for consistency. Also fixed the agency lookup key in routesByAgency.get to use normalizeAgencyId(agency.agency_id).
  • src/modules/stop-view-controller.ts (~L125): same change, replacing route.agency_id || 'default' with normalizeAgencyId(route.agency_id). Also fixed the agency lookup key.
  • src/modules/page-content-renderer.ts (~L287): wrap agencyData.agency_id in normalizeAgencyId(agencyData.agency_id) for the data-agency-id attribute so clicking navigates to "" rather than "undefined".
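
A short sketch of why both sides of the grouping need the same normalization, and why the raw attribute interpolation is a problem. `RouteRow`, `AgencyRow`, `groupRoutesByAgency`, and the sample data are hypothetical; only `normalizeAgencyId` corresponds to a name from this plan.

```typescript
interface RouteRow {
  route_id: string;
  agency_id?: string | null;
}

interface AgencyRow {
  agency_name?: string;
  agency_id?: string | null;
}

// Inlined copy of the Phase 1 helper so the sketch is self-contained.
function normalizeAgencyId(id: string | number | boolean | undefined | null): string {
  return id ? String(id) : '';
}

// Group routes under the normalized key instead of `agency_id || 'default'`.
function groupRoutesByAgency(routes: RouteRow[]): Map<string, RouteRow[]> {
  const groups = new Map<string, RouteRow[]>();
  for (const route of routes) {
    const key = normalizeAgencyId(route.agency_id);
    const bucket = groups.get(key);
    if (bucket) {
      bucket.push(route);
    } else {
      groups.set(key, [route]);
    }
  }
  return groups;
}

// Made-up single-agency data with no agency_id anywhere.
const agency: AgencyRow = { agency_name: 'Example Transit' };
const routesByAgency = groupRoutesByAgency([
  { route_id: 'r1' },
  { route_id: 'r2', agency_id: '' },
]);

// The lookup key must be normalized the same way as the grouping key.
const agencyRoutes = routesByAgency.get(normalizeAgencyId(agency.agency_id)) ?? [];

// Interpolating a missing id directly stringifies it as "undefined".
const rawAttribute = `data-agency-id="${agency.agency_id}"`;
const safeAttribute = `data-agency-id="${normalizeAgencyId(agency.agency_id)}"`;

console.log(agencyRoutes.length, rawAttribute, safeAttribute);
```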

Phase 5 — Fix service discovery on home page

The getServices() method in src/modules/page-content-renderer.ts (~L405) only reads from the calendar table. GTFS allows services to be defined exclusively in calendar_dates.txt (no calendar.txt row required). For such feeds the home page shows "No services found in GTFS data." even though trips reference valid service IDs.

src/modules/page-content-renderer.ts:

  • In getServices() (~L405): after fetching calendar rows, also fetch all rows from calendar_dates via this.dependencies.gtfsDatabase.getAllRows('calendar_dates'). Build a Set<string> of service_ids already covered by calendar rows. For each calendar_dates row whose service_id is not in the set, synthesize a minimal { service_id } record and add it to the result array. Return the merged array.
  • No change needed to the rendering code — getServiceDisplay already falls back to service_id as the label, so synthesized records render correctly.

Gotcha: calendar_dates has multiple rows per service_id (one per date exception). Deduplication is essential — use the Set approach above rather than a Map to avoid returning duplicate cards.
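
The merge step can be sketched as a pure function. `mergeServices` and `ServiceRecord` are names invented for this example; in the real `getServices()` the two inputs would come from the `calendar` and `calendar_dates` tables.

```typescript
interface ServiceRecord {
  service_id: string;
  [field: string]: string | undefined;
}

function mergeServices(
  calendarRows: ServiceRecord[],
  calendarDateRows: ServiceRecord[]
): ServiceRecord[] {
  // Service ids already covered by calendar rows.
  const seen = new Set<string>(calendarRows.map((row) => row.service_id));
  const merged = calendarRows.slice();
  for (const row of calendarDateRows) {
    if (seen.has(row.service_id)) {
      continue; // covered by calendar, or already synthesized
    }
    // Record the id right away so later exception rows for the same
    // service do not produce a second card.
    seen.add(row.service_id);
    merged.push({ service_id: row.service_id });
  }
  return merged;
}

// Made-up data: WKD is in both tables, HOL only in calendar_dates (twice).
const services = mergeServices(
  [{ service_id: 'WKD', monday: '1' }],
  [
    { service_id: 'WKD', date: '20260704', exception_type: '2' },
    { service_id: 'HOL', date: '20260704', exception_type: '1' },
    { service_id: 'HOL', date: '20261225', exception_type: '1' },
  ]
);

console.log(services.map((service) => service.service_id));
```

Adding each synthesized id to the set as it is emitted is what keeps a service with many exception dates down to a single record.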

Testing notes

  • Manual: load LIRR feed. Home page should show one agency card labelled by name (or "Not specified" if name is also absent). Clicking it should show all LIRR routes. Map highlight should work.
  • Home page should also show all service cards — including any services defined only in calendar_dates.txt with no corresponding calendar.txt row.
  • Also verify a multi-agency feed is unaffected.
  • Edge case: agency with no agency_id but with a name — card shows name as primary, no secondary.
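
The display expectations above can be captured in a small checkable sketch. `getAgencyDisplaySketch` and `EntityDisplay` are hypothetical; in particular, showing `agency_id` as the secondary label when both fields are present is an assumption inferred from the "no secondary" edge case, not something confirmed from `entity-display.ts`.

```typescript
interface EntityDisplay {
  primary: string;
  secondary?: string;
}

interface AgencyLike {
  agency_name?: string | null;
  agency_id?: string | null;
}

function getAgencyDisplaySketch(agency: AgencyLike): EntityDisplay {
  const name = agency.agency_name || '';
  const id = agency.agency_id || '';
  if (name && id) {
    // Assumption: the id is shown as a secondary label next to the name.
    return { primary: name, secondary: id };
  }
  if (name) {
    return { primary: name };
  }
  if (id) {
    return { primary: id };
  }
  // Neither field is usable: avoid rendering a blank card.
  return { primary: 'Not specified' };
}

const nameOnly = getAgencyDisplaySketch({ agency_name: 'Example Rail' });
const neither = getAgencyDisplaySketch({ agency_id: '' });

console.log(nameOnly, neither);
```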

Original Issue

https://gtfs.org/documentation/schedule/reference/#routestxt

agency_id (Foreign ID referencing agency.agency_id; Conditionally Required): Agency for the specified route.

Conditionally Required:

  • Required if multiple agencies are defined in agency.txt.
  • Recommended otherwise.

I think that every foreign key to agency is optional. It at least applies to services as well. Even in agency.txt, the id is not required, so let's make sure that we handle that safely.

In the routes list, we should show it properly. Let's make sure to abstract out this edge case because it's a tricky one to test. Use LIRR for the manual test.

maxtkc self-assigned this 2026-05-04 23:03:07 +00:00
Reference
gtfs.zone/coloring-book#100