Load from gtfs atlas lookup #36

Closed
opened 2026-03-27 23:09:19 +00:00 by maxtkc · 0 comments
Owner

Summary

Add two new options to the Load dropdown: From URL (paste a direct zip URL) and Search Atlas (search transitland-atlas feeds). A build-time script processes the transitland-atlas DMFR JSON files into a compact, lazily-loaded public/atlas-feeds.json. The UI is transparent — users see exactly which URL they're loading from. The atlas modal shows all available DMFR metadata so users can make informed choices before loading.

Key tradeoffs considered: client-side filtering is fine given that the file is under 1 MB after processing (758 KB minified in the Phase 1 run); no server is needed; the script runs manually and its output is committed.

Relevant Context

Files to modify:

  • src/index.html (lines 163–245): Load dropdown <ul id="load-dropdown">; add two new <li> items, "From URL" and "Search Atlas"
  • src/modules/ui.ts: setupEventListeners() adds click handlers for the two new buttons; loadGTFSFromURL(url) already exists and is the target call for both flows
  • src/modules/modal-utils.ts: showModal() supports an HTML string body + onMount hook; sufficient for "From URL"; atlas search needs a richer custom modal

Files to create:

  • scripts/generate-atlas-data.ts: Node/tsx script that fetches transitland-atlas feeds via the GitHub API, extracts GTFS static feeds, and writes public/atlas-feeds.json
  • src/modules/atlas-search.ts: showAtlasSearchModal() is a custom modal (not showModal) with search input, lazy-loaded results, and click-to-load behavior; it returns the selected URL or null
  • public/atlas-feeds.json: committed generated file; lazy-fetched by atlas-search.ts on first open

Key patterns:

  • Dropdown close pattern: closeLoadDropdown() helper in setupEventListeners() blurs the active element
  • URL loading: this.loadGTFSFromURL(url) in UIController
  • Modal creation pattern in modal-utils.ts: document.createElement('div'), append to document.body, remove on close
  • Notification on success/error: notifications.showSuccess/showError()

DMFR schema (transitland-atlas): each JSON file has feeds[] with id, spec (gtfs | gtfs-rt | ...), urls.static_current, name; and operators[] with name, short_name, tags.country_code, tags.metro_area.


Phase 1: Data pipeline — generate atlas-feeds.json

Write a script that pulls the transitland-atlas feed data and emits a compact, searchable JSON file committed to the repo.

Why first: the atlas search UI depends on this file existing; locking the data format before building the UI avoids rework.

Results: 743 DMFR files processed, 3774 GTFS feeds written, 1994 non-GTFS/no-URL entries skipped, 0 errors. Output is 758KB minified. 1363 of 3774 feeds have operator_name populated; location is sparse (many operators don't set metro_area/country_code). Fetches from raw.githubusercontent.com in batches of 10 — unauthenticated is fine at this scale, but GITHUB_TOKEN env var is supported. raw.githubusercontent.com returns JSON directly (no Content-Type negotiation needed) unlike the API endpoint.

  • Create scripts/generate-atlas-data.ts:
    • Fetch the file tree for transitland/transitland-atlas at path feeds/ using the GitHub API (https://api.github.com/repos/transitland/transitland-atlas/git/trees/HEAD?recursive=1), then fetch each .json file under feeds/
    • Parse each DMFR file; for each entry in feeds[], keep only entries where spec === 'gtfs' and urls.static_current is non-empty
    • Extract per-feed: { id: string, name: string, operator_name: string, location: string, url: string }
      • id: the feed's DMFR id field
      • name: feed.name (may be empty — fall back to id)
      • operator_name: join operators[].name for operators associated with this feed (use associated_feeds or just the first operators[] entry per file)
      • location: operator.tags?.metro_area or operator.tags?.country_code or empty string
      • url: feed.urls.static_current
    • Write the array as minified JSON to public/atlas-feeds.json
    • Log how many feeds were written
  • Add script entry to package.json: "atlas": "tsx scripts/generate-atlas-data.ts" (run with npm run atlas)
  • Run npm run atlas and commit the generated public/atlas-feeds.json
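
The extraction rules above can be sketched roughly as follows. This is a minimal illustration, not the committed script: the DmfrFile, DmfrFeed and DmfrOperator interfaces and the extractAtlasFeeds name are assumptions covering only the fields this plan mentions, and the operator lookup uses the "first operators[] entry per file" simplification.

```typescript
// Hypothetical subset of the DMFR shape, limited to the fields named in this plan.
interface DmfrFeed {
  id: string;
  spec?: string;
  name?: string;
  urls?: { static_current?: string };
}

interface DmfrOperator {
  name?: string;
  tags?: { metro_area?: string; country_code?: string };
}

interface DmfrFile {
  feeds?: DmfrFeed[];
  operators?: DmfrOperator[];
}

// Output record, matching the per-feed shape listed in the plan.
interface AtlasFeed {
  id: string;
  name: string;
  operator_name: string;
  location: string;
  url: string;
}

// Convert one parsed DMFR file into compact AtlasFeed records.
function extractAtlasFeeds(dmfr: DmfrFile): AtlasFeed[] {
  // Simplification from the plan: use the first operator in the same file.
  const operator = dmfr.operators?.[0];
  const location = operator?.tags?.metro_area || operator?.tags?.country_code || '';

  const result: AtlasFeed[] = [];
  for (const feed of dmfr.feeds ?? []) {
    const url = feed.urls?.static_current;
    // Keep only static GTFS feeds that expose a download URL.
    if (feed.spec !== 'gtfs' || !url) continue;
    result.push({
      id: feed.id,
      name: feed.name || feed.id, // fall back to the id when name is empty
      operator_name: operator?.name ?? '',
      location,
      url,
    });
  }
  return result;
}
```

Keeping this step as a pure function means it can be checked against a handful of sample DMFR objects without any network access.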

Gotchas:

  • GitHub API rate-limits unauthenticated requests to 60/hr; the tree endpoint + bulk file fetches may hit this. Use a GITHUB_TOKEN env var if present (Authorization: Bearer $GITHUB_TOKEN).
  • Some DMFR files may be malformed or missing fields — wrap per-file processing in try/catch and log warnings, don't abort the whole run.
  • operators and feeds can have many-to-many relationships. Keep it simple: for each feed, look for the first operators[] entry in the same DMFR file to get the operator name and location. Don't try to resolve cross-file references.
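
The first two gotchas (optional token, per-file error isolation) could be handled along these lines. This is a sketch under assumptions: buildHeaders and processInBatches are hypothetical helper names, and the worker passed in stands for whatever fetch-and-parse step the script actually uses.

```typescript
// Build request headers; the Authorization header is added only when a token is provided.
function buildHeaders(token: string | undefined): Record<string, string> {
  return token ? { Authorization: `Bearer ${token}` } : {};
}

// Run `worker` over `items` in fixed-size batches. A failing item is logged
// and counted, but it does not abort the rest of the run.
async function processInBatches<T, R>(
  items: T[],
  batchSize: number,
  worker: (item: T) => Promise<R>,
): Promise<{ results: R[]; errors: number }> {
  if (batchSize < 1) throw new RangeError('batchSize must be at least 1');

  const results: R[] = [];
  let errors = 0;
  for (let start = 0; start < items.length; start += batchSize) {
    const batch = items.slice(start, start + batchSize);
    // Each item gets its own try/catch so one bad file cannot reject the batch.
    const outcomes = await Promise.all(
      batch.map(async (item): Promise<{ ok: true; value: R } | { ok: false }> => {
        try {
          return { ok: true, value: await worker(item) };
        } catch (err) {
          console.warn('Skipping item after error:', item, err);
          return { ok: false };
        }
      }),
    );
    for (const outcome of outcomes) {
      if (outcome.ok) results.push(outcome.value);
      else errors += 1;
    }
  }
  return { results, errors };
}
```

With a batch size of 10, as in the reported run, the worker would fetch one DMFR file, parse it, and return its extracted feeds; the errors count then feeds the summary log.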

Phase 2: "From URL" dropdown item + modal

Add a simple "paste a URL" option to the Load dropdown.

Why second: self-contained, no dependencies on Phase 1, and useful standalone.

  • In src/index.html, add a new <li> to #load-dropdown after the Upload item:
    <li>
      <a id="from-url-btn" class="flex items-center gap-2">
        <!-- link icon (use the existing chain/link SVG pattern) -->
        From URL
      </a>
    </li>
    
  • In src/modules/ui.ts setupEventListeners(), add a click handler for #from-url-btn:
    • Call closeLoadDropdown()
    • Call this.showFromURLModal()
  • Add showFromURLModal() method to UIController in ui.ts:
    • Use showModal() from modal-utils.ts
    • Body: <input id="gtfs-url-input" type="url" class="input input-bordered w-full" placeholder="https://example.com/gtfs.zip" />
    • Actions: [{ label: 'Cancel' }, { label: 'Load', className: 'btn-primary', onClick: async () => { ... } }]
    • In onClick: read (document.getElementById('gtfs-url-input') as HTMLInputElement).value, validate non-empty, call this.loadGTFSFromURL(url), return false to close
    • If empty, return true (keep modal open — existing pattern from modal-utils)
    • Use onMount to focus the input and wire up Enter key to trigger Load
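
The Load button logic described above can be isolated into a small function. This is only a sketch: handleLoadClick is a hypothetical name, the load callback stands in for this.loadGTFSFromURL, and the return convention (true keeps the modal open, false closes it) is taken from the steps above rather than from modal-utils.ts itself.

```typescript
// Hypothetical helper: decides whether the "From URL" modal stays open.
// Following the plan's convention, `true` keeps the modal open and `false` closes it.
function handleLoadClick(rawValue: string, load: (url: string) => void): boolean {
  const url = rawValue.trim();
  if (url === '') {
    return true; // nothing to load yet, keep the modal open
  }
  load(url); // hand the URL to the loader (loadGTFSFromURL in the plan)
  return false; // request started, close the modal
}
```

The same function can back both the Load button and the Enter key handler, so the two paths cannot drift apart.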

Phase 3: "Search Atlas" dropdown item + modal

Add the full atlas search experience: lazy-load the feed list, filter by text input, display all DMFR metadata, click to load.

Why third: depends on Phase 1 (data file) and benefits from the modal pattern established in Phase 2.

  • In src/index.html, add another <li> to #load-dropdown after "From URL":
    <li>
      <a id="atlas-search-btn" class="flex items-center gap-2">
        <!-- search/globe icon -->
        Search Atlas
      </a>
    </li>
    
  • Create src/modules/atlas-search.ts:
    • Define the type: interface AtlasFeed { id: string; name: string; operator_name: string; location: string; url: string }
    • Module-level cache: let cachedFeeds: AtlasFeed[] | null = null
    • async function loadAtlasFeeds(): Promise<AtlasFeed[]> — fetches /atlas-feeds.json once, caches result; shows a loading state in the modal while fetching
    • export async function showAtlasSearchModal(): Promise<string | null> — builds and shows a custom modal, returns the selected URL or null on cancel
    • Modal structure (create DOM manually, like showModal does):
      modal > modal-box (large/max-w-2xl)
        h3: "Search Atlas"
        input[type=search] #atlas-search-input
        div#atlas-results (scrollable, max-h-96, overflow-y-auto)
          → populated by filterAndRender()
        modal-action
          button: Cancel
      
    • filterAndRender(query, feeds): filter feeds where name/operator_name/location/url contains query (case-insensitive); render up to 50 results as clickable rows showing: feed name (bold), operator name, location, URL (truncated, text-xs text-base-content/60); clicking a row resolves the modal with that URL
    • Use debounce (~200ms) on the search input to avoid re-rendering on every keystroke
    • On open: load feeds (show spinner if not cached), then render all results with empty query
  • In src/modules/ui.ts setupEventListeners(), add click handler for #atlas-search-btn:
    • Call closeLoadDropdown()
    • const url = await showAtlasSearchModal()
    • If url is non-null: this.loadGTFSFromURL(url)
  • Import showAtlasSearchModal in ui.ts
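
The filtering and debounce steps above might look like the sketch below. It assumes the AtlasFeed shape from this plan; filterFeeds and debounce are illustrative names, and rendering is left out so the matching logic stays free of DOM code.

```typescript
interface AtlasFeed {
  id: string;
  name: string;
  operator_name: string;
  location: string;
  url: string;
}

// Return up to `limit` feeds whose name, operator, location, or URL contains
// the query (case-insensitive). An empty query matches every feed.
function filterFeeds(query: string, feeds: AtlasFeed[], limit = 50): AtlasFeed[] {
  const needle = query.trim().toLowerCase();
  const matches: AtlasFeed[] = [];
  for (const feed of feeds) {
    // Join with a newline so a query cannot match across two adjacent fields.
    const haystack = [feed.name, feed.operator_name, feed.location, feed.url]
      .join('\n')
      .toLowerCase();
    if (needle === '' || haystack.includes(needle)) {
      matches.push(feed);
      if (matches.length >= limit) break; // stop early once the cap is reached
    }
  }
  return matches;
}

// Delay calls to `fn` until `waitMs` has passed without a new call.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number,
): (...args: A) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: A) => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}
```

In the modal, the input listener would be wrapped as debounce(handler, 200), with the handler calling filterFeeds and then rendering the returned rows or the empty-state message.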

Gotchas:

  • The modal must be removed from the DOM on both Cancel and result selection — ensure both paths call document.body.removeChild(modal).
  • Empty state: if no results match, show a "No feeds found" message in #atlas-results rather than leaving it blank.
  • The fetch path /atlas-feeds.json works in both dev (Vite serves public/) and prod (dist/ includes public/ contents).
  • Keep the result list render simple — no virtual scrolling needed for 50-item cap.
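
One way to make the first gotcha hard to get wrong is to route both Cancel and row selection through a single settle function. The sketch below is an assumption about structure, not the actual atlas-search.ts: createModalResult is a hypothetical helper, and the cleanup callback stands in for removing the modal element from document.body.

```typescript
// Hypothetical helper: a promise that settles once, running `cleanup` exactly
// one time whether the user selects a value or cancels.
function createModalResult<T>(cleanup: () => void): {
  promise: Promise<T | null>;
  select: (value: T) => void;
  cancel: () => void;
} {
  let settled = false;
  let resolveFn!: (value: T | null) => void;
  const promise = new Promise<T | null>((resolve) => {
    resolveFn = resolve;
  });

  const finish = (value: T | null): void => {
    if (settled) return; // ignore double clicks or a late Cancel
    settled = true;
    cleanup(); // e.g. remove the modal element from the DOM
    resolveFn(value);
  };

  return {
    promise,
    select: (value: T) => finish(value),
    cancel: () => finish(null),
  };
}
```

showAtlasSearchModal() could return the promise from this helper, wiring row clicks to select(url) and the Cancel button to cancel().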

Original Issue

This is important for initial demo purposes. At the moment, the two feeds are great, but we should be able to load any feed from the transitland atlas (https://github.com/transitland/transitland-atlas). To do this, we need a searchable list of URLs.

Notes:

  • We don't need this list to always be perfectly accurate
  • Let's make it a transparent pass-through (don't hide anything, so the user knows exactly what they're doing)
  • We need to figure out how to handle things like feed hierarchy from DMFR: https://github.com/transitland/distributed-mobility-feed-registry
    • We can filter out feeds that are not good for whatever reason; there are enough left over, and users can always use the URL or upload
  • Let's have this as a script that is run manually for now and updates one searchable file. Maybe there is a better way to do this, but this seems fine to me. The feeds directory from the transitland atlas is only 4.9M. Let's load it lazily. Let's optimize to whatever format is easily searchable.

UI: We use a lot of modals, so let's continue that. Load should still be a dropdown with simple options: New feed, Upload, From URL (let's add this while we're at it), and Search Atlas.

From URL and Search Atlas should open modals. From URL will just be a place where you can paste the URL. Search Atlas will give a search box where you can select a feed. Let's give the user all the info we have from transitland; then they can click to load.

maxtkc self-assigned this 2026-03-27 23:09:19 +00:00
Reference
gtfs.zone/coloring-book#36