bug(brain): first embed-sync tick reports errors=1 — find which file failed #19

Closed
opened 2026-05-19 11:10:40 +00:00 by mathias · 1 comment
Owner

Symptom

Pod startup at 2026-05-19T10:58:16Z (image 7a13c756, first cold sync) logged:

INFO embed sync added=0 deleted=0 errors=1

One file in brain/wiki/ failed to embed. The aggregate counter surfaces the failure, but not which file or why: vectorstore.Sync collects the per-file errors into SyncResult.Errors, and StartSync logs only the count.

After the DSN-leak fix redeploy at 11:04:56Z, the sync result wasn't re-logged because no new files were added/deleted/errored (the embed sync only emits the INFO line when added+deleted>0 || len(errors)>0).

What to do

  1. Surface per-file error context in the log line — e.g.:

    for _, e := range res.Errors {
        slog.Warn("embed sync per-file error", "err", e.Error())
    }
    

    in internal/vectorstore/sync.go StartSync / next to the aggregate INFO line. Without this, every future sync error is a silent black box.

  2. Trigger a backfill (tracked in a separate hyperguild issue) and inspect the response: POST /backfill-embeddings returns a per-file errors[] array, which is the most direct path to the root cause.

  3. Likely candidates:

    • empty .md file → embed: empty text rejection from embed.Client.Embed
    • file with only frontmatter, no body
    • very large file exceeding ollama's input window

Acceptance criteria

  • vectorstore.StartSync logs each per-file error individually
  • The failing file is identified and either fixed, skipped intentionally, or the embedder is taught to handle the edge case
  • task check clean
Author
Owner

Superseded. Per-item embed sync error logging shipped in commit 078ec02 (v0.7.0), and the underlying root cause is now also fixed (infra#37 + #38 / commit 37fdd33 / v0.8.0).

Live evidence:

2026/05/19 19:57:59 INFO embed sync added=32 deleted=0 errors=0

Per-item logging in the live code:

// internal/vectorstore/sync.go (StartSync)
for _, e := range r.Errors {
    slog.Warn("embed sync item failed", "err", e)
}

When a per-file error happens now, the exact path + upstream error string is logged — that's how the chunking work (infra#38) identified the three oversized files. After chunking, the steady-state error count is 0.

Closing.


Reference: mathias/hyperguild#19