# Onboarding & Data Import
Status — 2026-04-24 · v0.2.0 shipped. This file was the design spec written before any code landed. Everything it describes is now implemented and on `main`. Treat this doc as historical: it documents the decisions, not the current shape of the code. If you want the current flow, read in this order:

- `README.md` → Quickstart — the working commands.
- `docs/DEPLOYMENT.md` → Quick install (via wizard) — the same thing with more context.
- `fixtures/phase6/SESSION_STATE.md` — the live cross-session tracker: what actually shipped in which commit.

The sections below still accurately describe the design intent, but specific API examples may read differently from the final shipped surface (e.g. `python -m throughline install` — the actual command is `python install.py` or `python -m throughline_cli install`). Commands like `import sample`, `taxonomy review`, and `doctor` landed after this doc was written.
## Original design spec
Planning doc for v0.2.0. Captures how a new user (empty vault OR existing-chat-history user) gets from `git clone` to a working flywheel. Nothing below is implemented yet — this is the design spec before any code lands.

v0.2.0 scope principle (2026-04-23): every onboarding decision happens inside a single `python -m throughline install` wizard (16 steps, one default per step, all-Enter for sensible defaults). No separate `import`, `configure`, `setup` commands for v0.2.0 — one entry point. The wizard is the deliverable.
## `python -m throughline install` — the single entry point
The v0.2.0 onboarding is one wizard that collects every user decision in ordered steps. Each step has a recommended default. A fully hands-off user on the default path presses Enter 16 times and lands on a working config. Two early branches (step 2 Mission and step 6 Privacy) can drop the effective step count to ~9-12 for specialised missions. Re-running (`python -m throughline reconfigure`) lets the user revisit any step; state persists to `~/.throughline/config.toml`.
```text
[1/16]  Python + venv + dependencies check
[2/16]  Mission — Full flywheel / RAG-only / Notes-only  ← early branch
[3/16]  Vector DB — Qdrant / Chroma / LanceDB / DuckDB-VSS
        [skipped if Notes-only]
[4/16]  LLM provider API key (entered here, never persisted to a
        git-tracked path)
[5/16]  LLM provider matrix — Anthropic / OpenAI / Google / xAI /
        DeepSeek / Qwen. Default: anthropic/claude-sonnet-4.6.
[6/16]  Privacy level — Local-only / Hybrid / Cloud-max. Default: Hybrid.
[7/16]  Retrieval backend — embedder + reranker paired
        (bge-m3 + bge-reranker-v2-m3 / nomic / MiniLM / OpenAI API /
        Voyage / Cohere / skip-reranker). [skipped if Notes-only]
[8/16]  Prompt family — auto-derived from step 5 (Claude → XML,
        GPT → Markdown+JSON, Gemini → structured output, else
        generic). Shown for confirm only.
[9/16]  Import source — ChatGPT / Claude / Gemini / multiple / none.
        If none: cold-start warning + explicit confirm.
[10/16] Import scan — count conversations, estimate token volume.
[11/16] Refine tier — Skim ~$0.005/conv, Normal ~$0.04/conv,
        Deep ~$0.20/conv. Wizard auto-suggests based on corpus size
        and daily budget.
[12/16] Card structure — Full/Notes-only: Compact / Standard /
        Detailed. RAG-only: fixed RAG-optimized format, step
        auto-skipped.
[13/16] First-card preview — refine one real conversation at chosen
        tier+structure, show result, optionally tune via 5 dials
        (Tone / Length / Sections / Register / Keep-verbatim).
        Repeat until user approves.
[14/16] Taxonomy source — derive from existing vault / derive from
        first 30 imported conversations / fallback template
        (JD / PARA / Zettelkasten).
[15/16] Daily budget cap — THROUGHLINE_MAX_DAILY_USD. Default $20.
[16/16] Summary + y/N. Writes config, kicks off bulk refine if
        imports were selected.
```
Mission branch effects. Step 2 branches the rest of the wizard:
- Full flywheel (default, recommended): all 16 steps apply.
- RAG-only: skip step 12 (card format fixed to RAG-optimized); step 11 defaults to Skim. Effective ~12 steps.
- Notes-only: skip steps 3 (vector DB), 7 (retrieval backend). No RAG infrastructure installed. Effective ~9 steps.
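In code the branch reduces to a per-mission skip set. A minimal sketch (note: the "effective" counts above also fold in confirm-only and auto-derived steps, so they are not simply 16 minus the skips):

```python
# Hypothetical sketch of the mission branch; not the shipped wizard code.
SKIPPED_STEPS = {
    "full":       set(),      # all 16 steps apply
    "rag_only":   {12},       # card structure fixed to RAG-optimized
    "notes_only": {3, 7},     # no vector DB, no retrieval backend
}

def wizard_steps(mission: str) -> list[int]:
    return [n for n in range(1, 17) if n not in SKIPPED_STEPS[mission]]
```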
Skip the wizard entirely: `python install.py --express` auto-detects the LLM provider env var you have set, picks the Full mission + sane defaults, writes config in ~3 seconds, and exits — no questions asked. Use it when you don't need to override defaults; come back to the full wizard if/when you want to tune.
The rest of this doc goes deeper on each non-trivial step.
## Step 2: Mission — what is throughline for YOU?
This step is the single most important branch in the wizard. It decouples two concerns that the v0.1 architecture accidentally welded together:
- "Cards as notes you read in Obsidian" (human consumption)
- "Cards as embeddings for RAG recall" (machine consumption)
Those aren't the same product. Forcing every user through the same 6-section refined Markdown makes RAG-only users pay for prose they never read, and makes the prose less optimal for retrieval (dense fact claims retrieve better than beautifully paragraphed essays).
Three mission options:
| Mission | Cards are… | Card format | Per-conv cost | Wizard length |
|---|---|---|---|---|
| Full flywheel (default) | Both read and retrieved | Standard 6-section | ~$0.04 | 16 steps |
| RAG-only | Machine-only — never rendered for humans | Title + entities + 3-8 dense claims | ~$0.001 | ~12 steps |
| Notes-only | Read only — no retrieval | Standard 6-section | ~$0.04 | ~9 steps |
RAG-only card example:
```markdown
---
title: "Setting up PyTorch on M2 Mac with MPS backend"
entities: [PyTorch, M2 Mac, MPS, Apple Silicon, GPU]
---
- PyTorch 2.0+ supports MPS backend natively
- Use `torch.device("mps")` instead of `cuda`
- Fallback: `PYTORCH_ENABLE_MPS_FALLBACK=1` for unsupported ops
- Install: `conda install pytorch torchvision -c pytorch-nightly`
- Verify: `torch.backends.mps.is_available()`
```
No narrative prose, no six-section envelope. Pure retrieval food. The title is specific (not "about PyTorch") because the reranker weights it high. Entities help BM25-hybrid retrieval. Claims are atomic so the embedder has dense semantic units.
This is a separate refiner-prompt variant (`prompts/en/refiner.rag_only.<family>.md`) and a separate card-structure enum value — not a degraded Standard. Choosing RAG-only means committing to never browsing cards as prose.
Notes-only skips step 3 (vector DB), step 7 (retrieval backend), and every RAG-path in the running services. The daemon refines, Obsidian shows the results, and there is no Qdrant / Chroma / embedder / reranker deployed at all. For users who want a smart summariser but not a memory system.
Default is Full flywheel — assumes the user wants everything.
## Step 3: Vector DB backends (Notes-only skips)
Default is Qdrant via Docker (production-ready, scales to millions of cards). Lightweight alternatives for users who want to avoid Docker / avoid a dedicated service:
| Backend | Needs service? | Embeddable? | Typical scale |
|---|---|---|---|
| Qdrant (default for Full) | Docker | No | Millions |
| Chroma (default for Local-only privacy tier) | Either | Yes | 10K+ |
| LanceDB | No (Rust-embedded) | Yes | 100K |
| DuckDB-VSS | No (DuckDB extension) | Yes | 100K |
| SQLite-vec | No (SQLite extension) | Yes | 10K |
| pgvector | Postgres | No | Millions |
The choice here affects `ingest_qdrant.py` naming (it will become `ingest_vectors.py` with a `BaseVectorStore` implementation dispatching by `$VECTOR_DB`), `rag_server.py`'s retrieval layer, and the daemon's upsert path. Non-trivial code change — this is the largest engineering item in v0.2.0.
Compatibility constraint: not every vector DB supports every payload schema. The wizard ships a compatibility matrix and disables options that can't service the full payload (title / body_full / tags / knowledge_identity / path). In practice this drops SQLite-vec to "metadata-lean mode" (title + body_preview only).
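A minimal sketch of the planned dispatcher, assuming a `BaseVectorStore` shape roughly like the one named above; method signatures, module paths, and the capability flag are illustrative, not a shipped interface:

```python
import os
from abc import ABC, abstractmethod

class BaseVectorStore(ABC):
    # Backends that cannot service the full payload (title / body_full /
    # tags / knowledge_identity / path) set this False; the wizard then
    # offers them only in metadata-lean mode.
    supports_full_payload: bool = True

    @abstractmethod
    def upsert(self, ids: list[str], vectors: list[list[float]],
               payloads: list[dict]) -> None: ...

    @abstractmethod
    def search(self, vector: list[float], limit: int = 10) -> list[dict]: ...

def get_vector_store() -> BaseVectorStore:
    backend = os.environ.get("VECTOR_DB", "qdrant")
    if backend == "qdrant":
        from stores.qdrant_store import QdrantStore   # hypothetical module
        return QdrantStore()
    if backend == "chroma":
        from stores.chroma_store import ChromaStore   # hypothetical module
        return ChromaStore()
    raise ValueError(f"unsupported VECTOR_DB backend: {backend}")
```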
## Step 5: LLM provider matrix
Default goes through OpenRouter, so any of these work without extra code. The matrix helps users pick by context/cost rather than loyalty:
| Provider | Representative models | Typical use |
|---|---|---|
| Anthropic | Sonnet 4.6, Haiku 4.5, Opus 4.7 | Refine main (balanced / cheap / deep) |
| OpenAI | GPT-5, GPT-4o-mini | Backup / cheap |
| Google | Gemini 3 Flash | Judgement tasks (cheap, fast) |
| xAI | Grok 3 / Grok Code | Time-sensitive content, coding |
| DeepSeek | v3.2 | Low-cost Sonnet alternative |
| Qwen | 3.5 72B | Local alt via Ollama |
Grok requires no code change — it's an OpenRouter model ID. This step is documentation, not engineering.
## Step 6: Privacy level (orthogonal to refine tier)
Three levels, chosen separately from the refine tier. A health-conscious user can pick "Local-only" with the "Deep" tier — the two decisions are independent.
| Level | Slice / refine / route | Embed / rerank | Typical user |
|---|---|---|---|
| Local-only | Ollama Qwen 72B | bge-m3 local | High-sensitivity content (health, therapy, legal) |
| Hybrid (default) | OpenRouter API | bge-m3 local | Most users |
| Cloud-max | OpenRouter API | OpenAI / Voyage API | Fastest; least private |
## Step 7: Retrieval backend — embedder + reranker (paired)
The default is bge-m3 embedder + bge-reranker-v2-m3 reranker. Both must be chosen together because they have compatibility implications (e.g. API-only reranker forces the embedder to be dimension-compatible with its input encoder; some rerankers only work on English, etc.).
Embedder options:
| Backend | Dim | Cost | Quality | Use when |
|---|---|---|---|---|
| bge-m3 (local) | 1024 | $0 | 9/10 | Default; you have ~8 GB RAM |
| OpenAI text-embedding-3-large (API) | 3072 | API | 9/10 | No local GPU / no heavy RAM |
| nomic-embed-text-v1.5 (local) | 768 | $0 | 8/10 | Limited RAM but decent quality |
| all-MiniLM-L6-v2 (local) | 384 | $0 | 6/10 | CPU-only / absolute minimum |
| Voyage voyage-3 (API) | 1024 | API | 9/10 | Long-document retrieval |
Reranker options:
| Backend | Cost | Quality | Notes |
|---|---|---|---|
| bge-reranker-v2-m3 (local) | $0 | 9/10 | Default; 2.3 GB |
| bge-reranker-v2-gemma (local) | $0 | 9/10 | Newer, larger |
| Cohere rerank-v3 (API) | $$ | 9/10 | No local RAM |
| Voyage rerank-2 (API) | $$ | 9/10 | Long-text friendly |
| Jina reranker-v2 (API) | $ | 8/10 | Cheapest API option |
| Skip reranker | $0 | 7/10 | Embedding-only; fastest, least RAM |
The wizard pairs them: choosing an API reranker offers only API-compatible embedders for that provider; choosing Skip reranker is allowed for any embedder.
Binding constraint: the vector DB collection's vector size must match the embedder's dimension. Switching embedders post-install requires rebuilding the collection. The wizard pins this at step 7, and `python -m throughline reconfigure` for this step requires an explicit `--rebuild-vector-db` flag.
Code impact: `rag_server/rag_server.py` needs `BaseEmbedder` + `BaseReranker` abstractions; `scripts/ingest_qdrant.py` becomes `scripts/ingest_vectors.py` with a backend dispatcher; `VECTOR_SIZE` is derived from the active embedder, not hardcoded 1024.
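The dimension constraint falls out cleanly if the embedder abstraction owns its size. A sketch under the same caveat (names are illustrative):

```python
from typing import Protocol

class BaseEmbedder(Protocol):
    dim: int  # the collection's vector size is derived from this
    def embed(self, texts: list[str]) -> list[list[float]]: ...

class BaseReranker(Protocol):
    def rerank(self, query: str, docs: list[str]) -> list[float]: ...

def collection_vector_size(embedder: BaseEmbedder) -> int:
    # Replaces the hardcoded VECTOR_SIZE = 1024: bge-m3 -> 1024,
    # text-embedding-3-large -> 3072, MiniLM -> 384, etc.
    return embedder.dim
```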
## Step 8: Prompt family (auto-picked, confirm only)
Different LLM families produce different quality depending on how the prompt is shaped:
- Anthropic Claude prefers XML tagging (`<recent_history>`, `<current_query>`, `<fail_safe>`). The shipped v0.1 prompts are already this shape.
- OpenAI GPT prefers Markdown + explicit JSON schema for structured output. Tool-use is a first-class path.
- Google Gemini supports structured output directly, with function-declaration-style schemas.
- Other / generic — plain Markdown with explicit output headers, the lowest-common-denominator form.
The wizard auto-derives the prompt family from step 5's provider choice (Anthropic → claude, OpenAI → gpt, Gemini → gemini, else generic). Step 8 shows the derived choice for confirm; users can override but usually shouldn't.
Every shipped prompt exists in the corresponding family variant:
```text
prompts/en/refiner.normal.claude.md
prompts/en/refiner.normal.gpt.md
prompts/en/refiner.normal.gemini.md
prompts/en/refiner.normal.generic.md
```
Cross product with tier (Skim / Normal / Deep) and mode (Full / RAG-only) means up to ~48 prompt files. This is doc-heavy but code-light — the pipeline just loads `refiner.{tier}.{mode}.{family}.md` at call time.
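Since the variants differ only by filename, the loading side stays tiny. A sketch assuming the naming scheme above:

```python
from pathlib import Path

PROMPT_ROOT = Path("prompts/en")

def load_refiner_prompt(tier: str, mode: str, family: str) -> str:
    """e.g. ("normal", "full", "claude") -> prompts/en/refiner.normal.full.claude.md"""
    path = PROMPT_ROOT / f"refiner.{tier}.{mode}.{family}.md"
    return path.read_text(encoding="utf-8")
```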
## Step 11: Refine tier (3 tiers, 40× cost spread)
User picks upfront; can override per-import with `--tier`.
| Tier | Output | Pipeline | Cost per conv | Use case |
|---|---|---|---|---|
| Skim | 1-paragraph summary + 1 tag, single card | Haiku 4.5 one call (skip slicer, skip reranker) | ~$0.005 | Index old chat history for searchable retrieval |
| Normal | Standard 6-section card | Sonnet 4.6 slice → Sonnet 4.6 refine → Haiku 4.5 route | ~$0.04 | Daily use; default |
| Deep | Multi-pass: slice → refine → self-critique → cross-ref | Opus 4.7 + Sonnet 4.6 three-pass | ~$0.20 | Research grade, decisions, long-term memory |
Implementation is not three separate refiner scripts; it's the same pipeline parameterised along (model × prompt × stage count). Skim skips the slicer and reranker; Deep adds a critique stage. The three refiner prompt variants live at `prompts/en/refiner.skim.md`, `refiner.normal.md` (current default), and `refiner.deep.md`.
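A sketch of that parameterisation, one pipeline with three configurations; the dataclass and the OpenRouter-style model IDs are illustrative:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TierConfig:
    refine_model: str
    stages: tuple[str, ...]

TIERS = {
    # Skim: one cheap call, no slicer, no reranker
    "skim":   TierConfig("anthropic/claude-haiku-4.5", ("refine",)),
    # Normal: slice -> refine -> route (the default)
    "normal": TierConfig("anthropic/claude-sonnet-4.6",
                         ("slice", "refine", "route")),
    # Deep: adds the self-critique / cross-ref passes
    "deep":   TierConfig("anthropic/claude-opus-4.7",
                         ("slice", "refine", "critique", "cross_ref", "route")),
}
```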
Cost examples for a typical 1247-conversation ChatGPT import:
- Skim: ~$6
- Normal: ~$48
- Deep: ~$240
## Step 12: Card structure (only for Full and Notes-only missions)
If Mission = RAG-only this step is skipped — the card format is fixed to the RAG-optimized variant documented in Step 2.
The wizard doesn't ask "which structure do you want" in the abstract. It refines one real conversation ($0.04 at Normal tier) and shows the rendered card, then asks "does this fit?". If not, the user swaps structure and re-previews. Cycles until the user agrees.
| Structure | Shape | Suits |
|---|---|---|
| Compact | Title + one paragraph + tags | Zettelkasten / single-claim style |
| Standard (default) | 6-section skeleton (scenario / core / execution / avoid / insight / summary) | Balanced, most users |
| Detailed | 6-section + sidebar (related cards, contradictions, open questions) | Power users, research-grade |
Implementation: three refiner prompt variants, same pipeline. User's choice is persisted to `~/.throughline/config.toml`.
## Step 14: Taxonomy derivation (not template selection)
The wizard prefers to derive the user's taxonomy from their content rather than ship a generic template.
- Path A — user already has an Obsidian vault. The wizard scans top-level directory names plus 3-5 sampled notes per directory, then runs a single Claude pass (~$0.02) that emits a suggested `taxonomy.py`. The user reviews it in a diff view and edits. This is the recommended path — users see "this was built from my existing vault" and immediately trust it.
- Path B — user has imports but no prior vault. The wizard refines the first 30 imported conversations with the Normal-tier pipeline, clusters the resulting cards, then derives a taxonomy from those clusters. Subsequent re-refine of the full import uses the derived taxonomy.
- Fallback templates. Johnny Decimal, PARA, Zettelkasten templates remain available for users who prefer a known shape or want to skip the LLM-derivation step. These are the "I'll figure it out later" escape hatch, not the default path.
Tool: `scripts/derive_taxonomy.py`, one-shot, writes to `config/taxonomy.py`.
## Step 13: First-card preview + 5-dial constrained edit
After import scan + tier + structure choice, before kicking off bulk refine, the wizard refines one randomly selected conversation at the chosen (tier × structure × mode × prompt family) and shows the rendered card. The user sees actual quality, actual structure, actual token/cost footprint.
Five tuning dials are exposed (bounded mutation — no free-form prompt editing, which would break the refiner schema):
- Tone — formal / neutral / casual
- Length — short / medium / long
- Sections (Full / Notes-only only) — toggle any of the six sections off
- Language register — technical / plain / ELI5
- Keep-verbatim quotes — on / off (retain literal original-conversation phrasing inside cards)
Each dial change re-runs refine on the same conversation and shows the updated card. The preview loop costs about $0.04 per cycle at Normal tier (~$0.001 per cycle at Skim / RAG-only); users typically converge in 2-5 cycles ($0.08-$0.20 total).
Once the user approves, the tuned-dial settings are baked into `~/.throughline/config.toml` as refiner parameters. The bulk refine then applies them across the entire import.
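What "baked in as refiner parameters" could look like, as a sketch; the field names are illustrative, not the shipped schema:

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class RefinerDials:
    tone: str = "neutral"            # formal / neutral / casual
    length: str = "medium"           # short / medium / long
    sections: tuple[str, ...] = (    # Full / Notes-only: any may be toggled off
        "scenario", "core", "execution", "avoid", "insight", "summary")
    register: str = "technical"      # technical / plain / ELI5
    keep_verbatim: bool = False      # retain literal conversation phrasing

def refiner_config_table(dials: RefinerDials) -> dict:
    """The dict persisted under a [refiner] table in ~/.throughline/config.toml."""
    return asdict(dials)
```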
What is NOT exposed at this gate:
- Free-form editing of prompt text — the daemon's downstream consumers (router, dedup, Qdrant payload) depend on a stable schema; arbitrary prompts break it.
- Adding new sections — sections are fixed per structure template.
- Changing the YAML frontmatter schema.
Users who want those things fork the repo, edit the prompt files, and run the wizard with `--use-custom-prompts` to bypass the managed defaults.
## The two new-user starting points
### Cold start — empty vault
- User has never used Obsidian or has an Obsidian vault with zero refined cards.
- They install throughline, configure OpenRouter, point it at an empty vault, and start chatting.
- Expected behaviour v0.2.0: the Filter emits a status line (`🌱 cold start: 0 cards · flywheel warming up`) so the user understands RAG will not fire until they accumulate a baseline of cards. The cold-start line remains until ~50 cards; a `🌿 ramping: N cards · partial recall` line appears in the 50-199 range; normal recall status lines resume at 200+.
- No extra configuration is required — the daemon refines every conversation as it happens, and card count grows organically.
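The thresholds make the status line a three-way branch. A sketch using the card counts above:

```python
def flywheel_status(card_count: int) -> str:
    """Status line the Filter prepends while the card base is small."""
    if card_count < 50:
        return f"🌱 cold start: {card_count} cards · flywheel warming up"
    if card_count < 200:
        return f"🌿 ramping: {card_count} cards · partial recall"
    return ""  # 200+: normal recall status lines resume
```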
### Warm start — bulk-import existing chat history
- User has years of conversation history in ChatGPT, Claude, or Gemini and wants that history refined into cards before they start the flywheel.
- For v0.2.0 we ship three adapters that turn source exports into raw Markdown matching the OpenWebUI exporter's on-disk shape. The existing daemon then consumes those raw files without any change.
## Adapter specs (v0.2.0 planning)
All three adapters share the same output contract:
…which is exactly the shape the OpenWebUI exporter produces. The daemon's `queue_existing_raw()` catch-up picks up everything on the next start.
Shared CLI (unified entry point, not loose `scripts/adapters/*.py`):

```bash
python -m throughline import chatgpt <zip_path> --dry-run
python -m throughline import claude <zip_path> --out $THROUGHLINE_RAW_ROOT
python -m throughline import gemini <zip_path>   # --out defaults to $THROUGHLINE_RAW_ROOT
```
### Design principle — aggregation happens at three distinct layers
Before diving into the adapters, a clarification that drives the whole shape of this subsystem:
| Layer | What it aggregates | Who does it |
|---|---|---|
| L1 | Raw source events into multi-turn conversations | adapter (source-specific) |
| L2 | Conversations into individual cards (slice → refine → route) | daemon (already implemented) |
| L3 | Multiple same-topic cards into master cards; cross-source synthesis | daemon's future B2 "Merge & Synthesis" pass (v0.3.0) |
The adapters only do L1. They deliberately do NOT do L3-style "make the whole import come out as one well-organised knowledge base" because:
- An adapter only sees one source. Cross-source dedupe / merge must compare ChatGPT + Gemini + OpenWebUI cards together.
- The right moment for L3 aggregation is "once cards reach some density", not "at bulk-import time" — by then the user's cards from the existing daemon flow are also in play.
- Doing L3 at import couples bulk import to the same machinery that handles steady-state synthesis. Keep the layers independent.
So: adapters produce raw MD that mirrors the shape of existing OpenWebUI raw exports. Daemon handles the rest. B2 in v0.3.0 will provide the "merge + master-card" story.
`--dry-run` prints a summary (number of conversations, approximate token volume, estimated OpenRouter cost at current model prices) and exits without writing anything.
All adapters must tag the emitted raw files with a frontmatter `import_source` field (e.g. `import_source: chatgpt-2026-04-23`) so the user can later purge a whole batch if they regret the import.
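A sketch of a shared emit helper; only the `import_source` field is specified above, so the other frontmatter keys and the filename scheme are assumptions:

```python
from datetime import date
from pathlib import Path

def write_raw_md(out_dir: Path, conv_id: str, title: str, body: str,
                 source: str) -> Path:
    """Emit one raw Markdown file tagged for later batch purge."""
    batch = f"{source}-{date.today().isoformat()}"   # e.g. chatgpt-2026-04-23
    front = f'---\ntitle: "{title}"\nimport_source: {batch}\n---\n\n'
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{conv_id}.md"
    path.write_text(front + body, encoding="utf-8")
    return path
```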
### ChatGPT export
- How to obtain: OpenAI → Settings → Data Controls → Export. An email arrives within a few minutes with a ZIP link.
- ZIP contents: `conversations.json` + `chat.html` + `message_feedback.json` + `model_comparisons.json` + `user.json`.
- Key file: `conversations.json`.
- Structure: array of conversation objects. Each conversation contains a `mapping` field that is a flat dict of message nodes, keyed by node UUID. Each node has `message`, `parent`, `children`. The linear conversation has to be reconstructed by walking from the root along `children`.
- Adapter responsibilities:
  - Expand the `mapping` tree into a linear user/assistant alternation. ChatGPT branches (edits, regenerations) are collapsed — take the last surviving branch by `children[-1]` at each node.
  - Skip system-message noise and tool-call artefacts unless the user opts in.
  - Preserve `create_time` timestamps for `date:` frontmatter.
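A sketch of the tree walk. The field names (`mapping`, `parent`, `children`, `message.author.role`, `content.parts`, `create_time`) match the export structure described above, but OpenAI does not document the format, so treat them as assumptions:

```python
def linearize(conversation: dict) -> list[dict]:
    """Collapse the mapping tree into a user/assistant alternation,
    taking the last surviving branch (children[-1]) at each node."""
    mapping = conversation["mapping"]
    # Root node: the one with no parent.
    node_id = next(nid for nid, n in mapping.items() if n.get("parent") is None)
    turns = []
    while node_id is not None:
        node = mapping[node_id]
        msg = node.get("message")
        if msg and msg.get("author", {}).get("role") in ("user", "assistant"):
            parts = (msg.get("content") or {}).get("parts") or []
            text = "\n".join(p for p in parts if isinstance(p, str)).strip()
            if text:
                turns.append({"role": msg["author"]["role"], "text": text,
                              "create_time": msg.get("create_time")})
        children = node.get("children") or []
        node_id = children[-1] if children else None
    return turns
```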
### Claude.ai export
- How to obtain: Claude.ai → Settings → Account → Privacy → Export data. Email arrives with a ZIP.
- ZIP contents: `conversations.jsonl` (JSON Lines — one conversation per line, not a single JSON array) plus some metadata files.
- Structure: each line is a full conversation object with a linear `messages` array. Much simpler than ChatGPT — no tree reconstruction.
- Adapter responsibilities:
  - Line-by-line parse with `json.loads(line)` in a loop.
  - One raw MD per conversation.
  - Claude Projects are embedded in the same file; optionally filter by project name via `--project-name`.
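The parse loop is nearly trivial. A sketch (the Projects field name is an assumption):

```python
import json
from pathlib import Path

def iter_conversations(path: Path, project_name: str | None = None):
    """Yield one conversation dict per JSONL line, optionally filtered."""
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue
            conv = json.loads(line)
            # Optional Claude Projects filter (field name is hypothetical).
            if project_name and (conv.get("project") or {}).get("name") != project_name:
                continue
            yield conv
```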
### Gemini (Google Takeout)
- How to obtain: `https://takeout.google.com/` → My Activity → check only Gemini Apps → export.
- ZIP contents: `Gemini Apps/MyActivity.json` at either the root or nested under `Takeout/`.
- Structure: `MyActivity.json` is Google's generic My Activity schema — an event log, one query+response per record, with no native conversation boundaries. Google treats this format as unstable and undocumented; it may change without notice.
#### Reconstruction strategy (daily grouping + cross-day semantic stitch)
The adapter cannot just emit "1 event = 1 MD". The daemon's slicer is designed for multi-turn input; feeding it 2-sentence events produces thin, fragmented cards at 6× the cost. But the naive "30-minute time gap" cluster (the upstream author's original approach) is too rigid — it either over-fragments (a topic revisited nightly becomes 7 separate sessions) or over-merges (a single evening's multi-topic burst collapses to one noisy session).
The adapter reconstructs conversations in two passes:
- Day buckets. Group every event into the calendar day of its timestamp.
- Cross-day continuation stitch. For each day boundary
(last event of day N-1 vs first event of day N), embed both via
the local bge-m3 rag_server and compute cosine similarity. If
cosine > 0.5, merge day N's events into day N-1's bucket — the user left a topic open overnight and resumed it. Otherwise keep the boundary.
Each resulting bucket becomes one raw MD file with reconstructed user/assistant alternation. The daemon's slicer then subdivides each bucket by topic as usual.
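A sketch of the two passes; the `time`/`text` field names and the `embed` callable (backed by the local bge-m3 rag_server) are assumptions:

```python
from itertools import groupby
import numpy as np

COSINE_THRESHOLD = 0.5

def reconstruct(events: list[dict], embed) -> list[list[dict]]:
    """Pass 1: bucket events by calendar day. Pass 2: merge across a
    day boundary when the boundary events are semantically continuous."""
    if not events:
        return []
    events = sorted(events, key=lambda e: e["time"])   # ISO 8601 timestamps
    buckets = [list(g) for _, g in groupby(events, key=lambda e: e["time"][:10])]
    merged = [buckets[0]]
    for bucket in buckets[1:]:
        a = np.asarray(embed(merged[-1][-1]["text"]))
        b = np.asarray(embed(bucket[0]["text"]))
        cos = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
        if cos > COSINE_THRESHOLD:
            merged[-1].extend(bucket)   # topic left open overnight: merge
        else:
            merged.append(bucket)       # genuine boundary: keep
    return merged
```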
- Why not finer-grained event-pair semantic clustering (e.g. Haiku per adjacent pair)? It costs ~$1 / 1000 events, and the daemon slicer does a second pass anyway — redundant work. Daily grouping + stitch is the 80/20 tradeoff: ~85% accuracy, zero extra API cost (bge-m3 runs locally), explainable to users ("split on day, merge if topic continues").
- HTML → Markdown: use the `markdownify` library (BeautifulSoup-based). Do not use regex — Gemini's `safeHtmlItem` field contains nested code blocks, lists, and anchor tags that trip naive regex every time.
- Defensive parsing. Unrecognised record shapes are dumped to stderr and skipped rather than crashing the run. The archive build-date is logged so a format change can be correlated with a dated export.
## Privacy / cost guardrails for bulk import
Importing 1000+ chats triggers a large refine backlog. Both privacy and cost risk surface at that moment:
- Cost: every conversation goes through Slicer + Refiner + 2× Router calls. At Sonnet 4.6 prices, a 5-year ChatGPT archive can be $50-$200 of API spend depending on volume. v0.2.0 must ship a `THROUGHLINE_MAX_DAILY_USD` env var that pauses the daemon queue once hit.
- Privacy: imported conversations may contain personally identifiable information the user did not intend to persist as cards. The adapter `--dry-run` output must show this explicit confirmation prompt:
```text
About to ingest 1247 conversations (~3.2M tokens).
- LLM refiner will run on each → approx $42 at current Sonnet rates
- Cards will be written to $VAULT_PATH
- Qdrant vectors will be upserted to collection obsidian_notes
- All cards will carry frontmatter `import_source: chatgpt-2026-04-23`
  so you can later remove this batch with
  `python scripts/bulk_purge.py --import-source chatgpt-2026-04-23`
Continue? (y/N)
```
Only after the user types `y` does actual writing begin. The adapter writes a manifest `state/imports/<import_source>.json` with per-conversation UUIDs so purge can undo cleanly.
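A sketch of the manifest write; key names beyond the per-conversation IDs are illustrative:

```python
import json
from pathlib import Path

def write_manifest(import_source: str, conv_ids: list[str],
                   root: Path = Path("state/imports")) -> Path:
    """Record the batch so bulk_purge.py can undo it cleanly."""
    root.mkdir(parents=True, exist_ok=True)
    path = root / f"{import_source}.json"
    path.write_text(json.dumps(
        {"import_source": import_source, "conversations": conv_ids},
        indent=2), encoding="utf-8")
    return path
```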
## New-user rough edges (v0.2.0 P1 scope)
Not ship-blockers for v0.2.0, but each one is a known friction point for a fresh install. Priority order by observed friction:
- Obsidian is not required. The daemon writes plain Markdown; any editor works. `README.md` + `DEPLOYMENT.md` should say this in the opening paragraph — currently they implicitly assume Obsidian is installed.
- bge-m3 / bge-reranker first download is ~5 GB. A preflight step in `DEPLOYMENT.md` should suggest `huggingface-cli download BAAI/bge-m3 BAAI/bge-reranker-v2-m3` before starting the RAG server, so a slow-connection user sees hf progress instead of a silently stuck server.
- Taxonomy fork friction. The shipped `taxonomy.example.py` is Johnny-Decimal 10-90 with RODC-style leaves. Users on PARA or Zettelkasten hesitate to fork. v0.2.0 ships 2-3 taxonomy templates (`taxonomy.jd.py`, `taxonomy.para.py`, `taxonomy.zettel.py`) and a five-line "how to pick one" section in the deployment guide.
- Uninstall / nuke path. A user who tries throughline and decides it isn't for them needs a one-command uninstall. `scripts/uninstall.sh` shuts down services, drops the Qdrant collection, deletes `state/` and `logs/`, and optionally keeps the refined vault cards. Document the side-effects clearly.
## Hero gif strategy (deferred — do after usability is green)
Status: deferred from v0.2.0. The author chose to spend v0.2.0 on "can a new user actually USE the tool end to end with minimum friction" (U1-U8 above) before any marketing/polish work. Gif authoring lands in a later v0.2.x phase once the adapters ship and at least one external alpha confirms the install path.
The design below stays in the doc so future-you doesn't have to redesign from scratch when the time comes.
Two gifs, different audiences:
- README top hero (30 s, silent, loops). Six beats telling the whole flywheel story: user chats → cards grow → bulk import → richer recall. No CLI details — focus on the closed loop. Text overlays only, no voice.
- Import walkthrough (60 s, no loop). Embedded in this doc below the adapter specs. Shows one complete adapter run end to end (`python -m throughline import gemini ...` recommended, since Gemini is the trickiest — confidence by example).
### Toolchain (chosen for reproducibility)
| Segment | Tool | Why |
|---|---|---|
| Terminal CLI (all progress bars, prompts) | Charm VHS | .tape source checked into repo; anyone can vhs render to regenerate after a UI tweak. |
| Obsidian / OpenWebUI UI capture | OBS Studio (or Loom) | True UI, no automation substitute. |
| Titles, transitions, brand colours | Remotion | React/TS as source. Deterministic render. |
| Final composition | ffmpeg | Simple overlay + concat. |
Each gif has a `docs/assets/hero/` directory (tape files, source `.mp4`, rendered `.gif`). The tape files are the source of truth — pixel-equivalent renders can be reproduced without re-recording UI.
Claude Design (SVG / brand / banners) is a separate deliverable from the gifs — it covers the static visual identity (logo, README `<img>` above the gif, architecture diagram polish). Not a substitute for the UI-capture segments.
## Out of scope (deferred)
- Running without OpenRouter. Some alpha users may want Ollama local models for privacy. Cost pattern is different; taxonomy prompts tuned against Sonnet 4.6 may degrade. v0.3.0 candidate, not v0.2.0.
- Mobile OpenWebUI compatibility. Filter Functions UI on mobile is not tested. Assume desktop-first for v0.2.0.
- Multi-tenant / team vault. Current design is single-user. Changing that touches collection naming, ingest path normalisation, and personal-context leakage. Not v0.2.0.
## Status
Design spec only. None of the adapters or the cold-start status line are implemented yet. Tracking issues will land in the v0.2.0 GitHub milestone once it is opened.