Paper Lantern

SKILL.md Encoding Corruption

BOM/CRLF/code fences silently break the schedule parser
Field Notes · Automation, Cowork, Operations · 2026-03-26 · Repeatedly Verified (3+ occurrences)

Domain: Automation, Cowork, Operations
Verification: Repeatedly Verified (3+ occurrences)
First observed: 2026-03-24
Last verified: 2026-03-26
Knowledge items: kc-2026-03-26-027 (code fences), kc-2026-03-24-001 (BOM/CRLF)


Warning


Pattern 1: BOM / CRLF

Symptoms

Cause pattern

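# Dangerous: in Windows PowerShell 5.1, -Encoding UTF8 writes a BOM
# (PowerShell 7+ treats UTF8 as BOM-less, but do not rely on the host version)
Set-Content -Path "file.md" -Value $content -Encoding UTF8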

# Safe: no BOM
[System.IO.File]::WriteAllText($path, $content, [System.Text.UTF8Encoding]::new($false))
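
For tasks that write SKILL.md from Python rather than PowerShell, the same guarantee can be sketched with the standard library alone. The function name `write_utf8_no_bom` is hypothetical and is not part of the tooling described in these notes.

```python
from pathlib import Path

UTF8_BOM = b"\xef\xbb\xbf"


def write_utf8_no_bom(path, text):
    """Write text as UTF-8 without a BOM and with LF line endings.

    Hypothetical helper for illustration only.
    Returns True when the bytes on disk contain no BOM and no CR.
    """
    # Drop a BOM character that may have been carried over from an earlier read.
    text = text.lstrip("\ufeff")
    # Normalize CRLF and lone CR to LF before writing.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # newline="\n" stops Python from translating "\n" to os.linesep on Windows.
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write(text)
    # Re-read the raw bytes to confirm the result.
    data = Path(path).read_bytes()
    return not data.startswith(UTF8_BOM) and b"\r" not in data
```

The plain `utf-8` codec never emits a BOM on write, so the only extra care needed is the `newline="\n"` argument, which keeps Windows hosts from reintroducing CRLF.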

Reading BOM files (Python side) — use utf-8-sig

import json

# Dangerous: the BOM is decoded as U+FEFF at the start of the text, so json.load raises JSONDecodeError
with open("task_routing.json", "r", encoding="utf-8") as f:
    data = json.load(f)

# Safe: utf-8-sig automatically strips BOM
with open("task_routing.json", "r", encoding="utf-8-sig") as f:
    data = json.load(f)

This pattern was the root cause of an actual production bug (ops_fix fix_20260325_220604_6428) in task_event_emit.py's _load_routing(). Fixed on 2026-03-26 per Orchestra P-01 proposal.


Pattern 2: Code Fence Wrapping (kc-2026-03-26-027)

Symptoms

Root cause

# Dangerous: actual file content ends up like this
```yaml
---
task_id: knowledge-curator
schedule: ...
---
# Knowledge Curator
...
```          ← closing code fence also at end of file

# Safe: frontmatter must start at the first line
---
task_id: knowledge-curator
...

Remedy

  1. Explicitly instruct the LLM: "Save SKILL.md file content without code fences"
  2. Validate before saving: check that the first line is ---
  3. Fix script: automatic code fence removal

# Detect and remove code fence wrapping
$content = Get-Content $skillPath -Raw
if ($content -match '^\s*```(yaml|markdown)?\s*\r?\n') {
    # Remove first-line code fence
    $content = $content -replace '^\s*```(yaml|markdown)?\s*\r?\n', ''
    # Remove last code fence
    $content = $content -replace '\r?\n\s*```\s*$', ''
    [System.IO.File]::WriteAllText($skillPath, $content, [System.Text.UTF8Encoding]::new($false))
    Write-Host "Code fence wrapping removed"
}
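
Remedy steps 2 and 3 can also be applied on the Python side before a file is saved. The following is a minimal sketch under the assumption that the content is already in memory as a string; `strip_code_fence` and `starts_with_frontmatter` are hypothetical names, not functions from the existing tooling.

```python
import re

# Opening fence on the first line, optionally tagged yaml or markdown.
_OPEN_FENCE = re.compile(r"\A[ \t]*```(?:yaml|markdown)?[ \t]*\r?\n")
# Closing fence on the last line, allowing trailing whitespace.
_CLOSE_FENCE = re.compile(r"\r?\n[ \t]*```\s*\Z")


def strip_code_fence(text):
    """Remove a wrapping code fence if the text starts with one."""
    if not _OPEN_FENCE.match(text):
        return text
    text = _OPEN_FENCE.sub("", text, count=1)
    # Replace the closing fence with a single LF so the file still ends in a newline.
    text = _CLOSE_FENCE.sub("\n", text, count=1)
    return text


def starts_with_frontmatter(text):
    """Return True when the first line is exactly '---'."""
    first_line = text.split("\n", 1)[0].rstrip("\r")
    return first_line == "---"
```

A caller would run `strip_code_fence` first and refuse to save unless `starts_with_frontmatter` returns True. Because the comparison is exact, text that still begins with a BOM character also fails the check.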

Actual occurrences (2026-03-26)


Standard Remedy

Correct encoding baseline

Diagnostic commands

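# BOM check: if first 3 bytes are EF-BB-BF, BOM is present
$bytes = [System.IO.File]::ReadAllBytes($path)
$hasBOM = ($bytes.Length -ge 3) -and ($bytes[0] -eq 0xEF) -and ($bytes[1] -eq 0xBB) -and ($bytes[2] -eq 0xBF)
# Line ending check: 0D-0A means CRLF
$hasCRLF = [System.Text.Encoding]::UTF8.GetString($bytes).Contains("`r`n")
# Code fence wrapping check: dangerous if first line starts with ```
$hasFence = (Get-Content $path -TotalCount 1) -match '^\s*```'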
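
The same three checks can be run from Python against the raw bytes, which avoids any decoding step that might hide a BOM. This is an illustrative sketch; the function name `diagnose` and the keys of the returned dictionary are assumptions made for the example.

```python
from pathlib import Path

UTF8_BOM = b"\xef\xbb\xbf"


def diagnose(path):
    """Report BOM, CRLF, and code fence wrapping for one file."""
    data = Path(path).read_bytes()
    has_bom = data.startswith(UTF8_BOM)
    # Skip the BOM so that it does not mask a fence on the first line.
    body = data[len(UTF8_BOM):] if has_bom else data
    first_line = body.split(b"\n", 1)[0]
    return {
        "bom": has_bom,
        "crlf": b"\r\n" in data,
        "code_fence": first_line.lstrip().startswith(b"```"),
    }
```

To scan a whole directory tree, the function can be applied to each result of `Path(root).rglob("SKILL.md")`, where `root` is whatever directory holds the skill files.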

Bulk verify all SKILL.md files

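Get-ChildItem "~\AppData\Roaming\Claude\*.md" -Recurse | ForEach-Object {
    $bytes = [System.IO.File]::ReadAllBytes($_.FullName)
    $bom = ($bytes.Length -ge 3) -and ($bytes[0] -eq 0xEF -and $bytes[1] -eq 0xBB -and $bytes[2] -eq 0xBF)
    # CRLF check: look for the 0D-0A pair in the decoded text
    $crlf = [System.Text.Encoding]::UTF8.GetString($bytes).Contains("`r`n")
    $firstLine = (Get-Content $_.FullName -TotalCount 1)
    $codeFence = $firstLine -match '^\s*```'
    # FullName rather than Name, so files with the same name in different folders stay distinguishable
    [PSCustomObject]@{ File = $_.FullName; BOM = $bom; CRLF = $crlf; CodeFence = $codeFence }
}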

Safe writing (safe_skill_rw.ps1)

Use .\safe_skill_rw.ps1:

# Read
.\safe_skill_rw.ps1 -Mode read -Path $skillPath
# Write (automatically ensures UTF-8 No BOM + LF)
.\safe_skill_rw.ps1 -Mode write -Path $skillPath -Content $newContent
# Bulk verify
.\safe_skill_rw.ps1 -Mode bulk-verify -Path $skillDir
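
The internals of safe_skill_rw.ps1 are not shown in these notes. As an illustration only, the normalization that the write mode is described as guaranteeing (UTF-8 without BOM, LF line endings) might look like this when repairing an existing file from Python. Both `normalize_bytes` and `repair_file` are hypothetical names.

```python
from pathlib import Path

UTF8_BOM = b"\xef\xbb\xbf"


def normalize_bytes(data):
    """Strip a leading UTF-8 BOM and convert CRLF to LF."""
    if data.startswith(UTF8_BOM):
        data = data[len(UTF8_BOM):]
    return data.replace(b"\r\n", b"\n")


def repair_file(path):
    """Rewrite the file in place only if normalization changes it.

    Returns True when the file was rewritten.
    """
    p = Path(path)
    original = p.read_bytes()
    fixed = normalize_bytes(original)
    if fixed == original:
        return False
    p.write_bytes(fixed)
    return True
```

Working on bytes keeps the repair independent of the platform's default encoding and newline handling, and the early return leaves already-clean files untouched.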

Affected Systems

Preventive Measures (built 2026-03-25)

  1. Added "Encoding Safety Rules" section to EMIT_PROTOCOL.md — document read by all tasks before execution
  2. safe_skill_rw.ps1 helper — standard tool for SKILL.md read/write
  3. operational_fixes.jsonl — tracks BOM issues at critical severity
  4. Schedule creation skill updated — encoding safety rules auto-included when creating new schedules
  5. Use utf-8-sig for Python file opens (added 2026-03-26)
  6. Standardized LLM instructions — explicitly prohibit code fence wrapping (added 2026-03-26)

Evidence


Evolution Log