Updated: Sat Jun 27 2026 08:00:00 GMT+0800 (Philippine Standard Time)

Daily Learning Capture Pipeline

Do you save the transcript of the YouTube video link I share to raw folder, or just the link?

Tags: meta, pipeline, telegram, daily-learning, openclaw, capture

Daily Learning Capture Pipeline (Telegram → Wiki)

Question (2026-06-27): Do you save the transcript of the YouTube video link I share to raw folder, or just the link?

Short answer: yes. The transcript (or article body, or PDF text) is now saved alongside the link. The openclaw → GitHub pipeline, which turns Telegram messages into raw/Daily Learning YYYY-MM-DD HH-MM #<msg_id>.md files, now enriches each capture with the source text before committing.

The two pieces of evidence

  1. The same day's commit history carries the upstream change: 078fa39 fix: enrich daily learning captures with source text (2026-06-27). This is the fix that added the enrichment behaviour on top of the earlier link-only captures.
  2. The same-day capture #3482 is the live demo. The Telegram message contained only a YouTube URL (https://youtu.be/O7u6myBRsns?...), but the resulting raw/ file carries:
    • The original Telegram message
    • A ## Linked Source: <URL> header
    • Title + Channel + URL metadata
    • A full ### Transcript section

The same enrichment pattern applies to web articles (full body) and PDFs (extracted text) where they're reachable.
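
To make the layout above concrete, here is a minimal sketch of how an enriched YouTube capture could be split into its parts. The heading strings `## Linked Source:` and `### Transcript` come from the capture described above; the function name `parse_capture`, the returned keys, and the exact whitespace handling are my own assumptions, not something the pipeline is known to expose.

```python
import re

# Heading text follows the capture layout described above.
# The exact spacing around the headings is an assumption.
SOURCE_RE = re.compile(r"^## Linked Source:[ \t]*(\S+)", re.MULTILINE)
TRANSCRIPT_RE = re.compile(
    r"^### Transcript[ \t]*\n(.*?)(?=^#{1,3} |\Z)",
    re.MULTILINE | re.DOTALL,
)


def parse_capture(text: str) -> dict:
    """Split a raw capture into the Telegram message, source URL, and transcript.

    Hypothetical helper: missing parts are returned as None rather than raising.
    """
    source = SOURCE_RE.search(text)
    transcript = TRANSCRIPT_RE.search(text)
    # Everything before the linked-source header is treated as the original message.
    message = text[: source.start()] if source else text
    return {
        "message": message.strip(),
        "source_url": source.group(1) if source else None,
        "transcript": transcript.group(1).strip() if transcript else None,
    }
```

A capture with no `## Linked Source:` header comes back with the whole body as `message` and `None` for the other two fields.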

What this means for ingest behaviour

For the LLM (me) processing these captures:

  • Treat the linked content as the source, not the bare link. The user's intent in sharing the URL is to ingest the content, not to file the bookmark.
  • The Telegram message acts as the capture trigger; reference it in sources: alongside the linked source.
  • For YouTube specifically: extract the channel and create/update a <Channel> entity page so videos cluster across captures.
  • The raw/ filename is just the timestamp + message ID — derive wiki titles from the content, not the filename.
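
Since the raw/ filename only encodes the timestamp and message ID, those are the two things worth pulling out of it. The sketch below assumes the pattern Daily Learning YYYY-MM-DD HH-MM #<msg_id>.md exactly as written on this page; `parse_capture_filename` and `CaptureId` are hypothetical names, and the time in the usage note is made up for illustration.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# Pattern taken from the filename convention on this page; a numeric message ID is assumed.
CAPTURE_NAME = re.compile(
    r"^Daily Learning (\d{4}-\d{2}-\d{2}) (\d{2}-\d{2}) #(\d+)\.md$"
)


class CaptureId(NamedTuple):
    captured_at: datetime
    message_id: int


def parse_capture_filename(name: str) -> Optional[CaptureId]:
    """Return the timestamp and Telegram message ID, or None if the name does not fit."""
    match = CAPTURE_NAME.match(name)
    if match is None:
        return None
    date_part, time_part, msg_id = match.groups()
    try:
        captured_at = datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H-%M")
    except ValueError:
        # Digits matched the pattern but do not form a real date or time.
        return None
    return CaptureId(captured_at, int(msg_id))
```

For example, `parse_capture_filename("Daily Learning 2026-06-27 08-00 #3482.md")` yields the capture time plus message ID 3482, which is enough for the `sources:` reference. The wiki title still has to come from the content.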

This is already documented in CLAUDE.md under the "Daily Learning Telegram captures" section; this page is the user-facing confirmation that the enrichment is now operational, so the prior link-only mental model is out of date.

Caveat — when the linked content isn't reachable

The enrichment is best-effort. Failure modes already encountered in this vault:

  • YouTube videos with no caption track → the capture file may have title + channel but no transcript (e.g. capture #3463, the Justin Sung video, filed as a deliberate stub).
  • Paywalled or bot-blocked articles → may carry only the lede + structured metadata (e.g. capture #3420, the Forbes piece on Tom Zehren).
  • Captures with no link (text-only thoughts) → enrichment doesn't apply; the body is the content.

When the linked content fails to fetch, the capture is still committed; the ingest agent decides whether to treat it as a stub, a transcript-blocked entity page, or to re-attempt the fetch from a different vantage point.
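
The triage decision above can be sketched as a small classifier. This is only an illustration for the YouTube case: it assumes a capture with a `## Linked Source:` header but no non-empty `### Transcript` section is a partial capture, and that a capture with no header at all is a text-only thought. The section heading used for article or PDF bodies isn't stated on this page, so those are not covered. `CaptureKind` and `classify_capture` are hypothetical names.

```python
import re
from enum import Enum


class CaptureKind(Enum):
    ENRICHED = "enriched"    # linked source with a non-empty transcript
    PARTIAL = "partial"      # linked source, but transcript missing or empty
    TEXT_ONLY = "text-only"  # no linked source; the body is the content


# Heading strings follow the capture layout described on this page (spacing assumed).
SOURCE_RE = re.compile(r"^## Linked Source:", re.MULTILINE)
TRANSCRIPT_RE = re.compile(
    r"^### Transcript[ \t]*\n(.*?)(?=^#{1,3} |\Z)",
    re.MULTILINE | re.DOTALL,
)


def classify_capture(text: str) -> CaptureKind:
    """Decide how the ingest step might treat a raw YouTube capture."""
    if SOURCE_RE.search(text) is None:
        return CaptureKind.TEXT_ONLY
    transcript = TRANSCRIPT_RE.search(text)
    if transcript is not None and transcript.group(1).strip():
        return CaptureKind.ENRICHED
    return CaptureKind.PARTIAL
```

A `PARTIAL` result is where the judgment call starts: file a stub, create a transcript-blocked entity page, or retry the fetch.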

Cross-references

  • The Telegram → GitHub → raw/ pipeline is run by openclaw; this isn't documented further in the vault and isn't worth a separate page until something about the pipeline meaningfully changes.
  • See CLAUDE.md § "Daily Learning Telegram captures" for the ingest convention.

Sources

  • Telegram capture #3484 — the meta question
  • Companion commit 078fa39 fix: enrich daily learning captures with source text (2026-06-27) — the upstream pipeline change that introduced the enrichment
  • Telegram capture #3482 — the same-day demonstrating capture