# fpc-msgbase — plug-and-forget redesign proposal

**Status:** design handover, pre-implementation. Prepared 2026-04-15.
**Target audience:** the next implementation session.
**Precondition:** fpc-msgbase is at 0.1.0 (commit `b79e7fb` at the
time of writing); 9 format backends implemented from FTSC and
format-author specs, 7/7 tests green.

---

## TL;DR

fpc-msgbase today solves the *read* side cleanly and ports 9 formats
with tested fidelity. It's **not yet plug-and-forget** for embedders
the way comet is. Four things are missing: a lossless round-trip
guarantee, an atomic outbound-packet builder (`TOutboundBatch`),
format-agnostic I/O injection, and a single-callback event model.

This proposal is **not a rewrite** — it's a six-week reshape in place.
The format backends stay. The scaffolding changes.

When done, the acceptance test is a five-line Hello World that opens a
path, reads messages, closes. No format name, no lock ceremony, no
event registry, no `.uni` sidecar unit, no init-order hazard.

---

## Context: where this proposal came from

A cross-project conversation between NetReader (`../netreader`, the
HPT-drop-in mail tosser) and fpc-msgbase. NetReader has its own format
layer (`src/core/nr.msgbase.*`) hardened by a 39/39 live trial and a
54/54 CLI-parity test suite. During that conversation we compared:

1. Do the two libraries agree on bytes when reading/writing the same
   base? (**Unverified. First action item below.**)
2. What would "plug it in and forget" look like for fpc-msgbase?

The conclusions that drove this proposal:

- fpc-msgbase's `TPacketBatch` handles the **inbound** concurrent
  tosser side cleanly. It **does not** handle the symmetric outbound
  side (writing per-link packets with size rotation and atomic
  finalize). Neither library does, today.
- `TUniMessage` as currently defined is **lossy**: JAM
  `MsgIdCRC`/`ReplyCRC`, `DateProcessed`, and every other
  format-specific header field has no canonical slot, so a caller that
  does `read → write` through the unified API silently loses bytes.
- The two-unit-per-format inclusion pattern (`ma.fmt.jam` +
  `ma.fmt.jam.uni`) asks callers to remember both. The `.uni`
  sidecar's only job is `initialization` registration — it belongs
  *inside* the format unit.
- `wc_sdk/` is ~66K lines of BTrieve-era Pascal pulled in
  unconditionally by any caller who uses Wildcat. Embeddability
  requires build-time gating.
- The error model mixes typed exceptions (`EMessageBase` from the
  factory) with boolean returns (all other methods) with event fires
  (locking). Three mental models for one library is two too many.

The reshape below addresses each.

---

## Current state — file-by-file

| Unit | LOC | Purpose | Keep? |
|---|---|---|---|
| `src/ma.api.pas` | ~330 | `TMessageBase` abstract, factory | **Reshape** |
| `src/ma.types.pas` | ~600 | `TUniMessage`, `TFTNAddress`, bits | **Reshape** |
| `src/ma.events.pas` | ~100 | multi-subscriber event registry | **Replace** with single callback |
| `src/ma.lock.pas` | ~250 | 3-layer locking | **Keep**, make explicit |
| `src/ma.paths.pas` | ~180 | per-format path derivation | **Keep** |
| `src/ma.batch.pas` | 333 | inbound `TPacketBatch` | **Keep** + add `TOutboundBatch` twin |
| `src/formats/ma.fmt.<format>.pas` × 9 | 400–1300 each | native spec-driven backend | **Keep** — hardened |
| `src/formats/ma.fmt.<format>.uni.pas` × 9 | 100–200 each | adapter to `TUniMessage` | **Fold into** the format unit |
| `src/wc_sdk/` | ~40K | Wildcat BTrieve SDK | **Gate** behind `{$DEFINE MA_WITH_WILDCAT}` |

Tests: `tests/test_*.pas` → keep, extend with the round-trip corpus
below.
---

## The five-line acceptance test

When the reshape is done, this works unchanged across every format:

```pascal
uses OpenMsgLib, OpenMsgLib.Auto;
var B: TMsgBase; M: TMsgRecord; i: integer;
begin
  B := OpenMsgBase('/path/to/base', momReadOnly);
  try
    for i := 0 to B.Count - 1 do
      if B.Read(i, M) then WriteLn(M.Subject);
  finally B.Free; end;
end.
```

No format name (autodetect). No lock call (default = no lock, caller
opts in). No event wiring (default = silent). No `.uni` unit (one unit
per format registers itself). No init-order hazard (single
registration point).

If a new session reads a test base and the above doesn't work
verbatim, the reshape isn't done.

---

## Target architecture

### Three-tier layering

```
Caller (BBS, tosser, editor, importer)
   │
   ├── TMsgBase (unified)        ← most callers live here
   ├── Direct format class       ← drop-down when Extras aren't enough
   └── Raw stream                ← replay, test, encrypt, mock
          │
          ▼
   Format backends speak to ITsmIO — never TFileStream directly
          │
          ▼
   ITsmIO adapters: file | memory | encrypted | test harness
```

**Why add ITsmIO:**

- Test backends without hitting disk (in-memory fixtures)
- Wrap the lib in encryption/compression
- Run on network mounts with unusual locking semantics
- Replay captured corrupt frames for debugging

### `TMsgRecord` with lossless `Extras`

> **SUPERSEDED 2026-04-17.** The "named fields + Extras bag" hybrid
> below was rejected during implementation in favour of a stricter
> two-area model: `Body` holds only the message text; **everything
> else** (from/to/subject/dates/addresses + every kludge + every
> format-specific field) is an attribute. See
> [`docs/attributes-registry.md`](attributes-registry.md) for the
> key catalog and [`docs/architecture.md`](architecture.md) for the
> updated TUniMessage contract. The original Extras-bag design is
> retained below as historical context.
>
> The capabilities API discussed in this section landed essentially
> as proposed (`base.SupportsAttribute(K)` + class-level
> `ClassSupportedAttributes`).

```pascal
TMsgRecord = record
  { Universal fields every format has: }
  Index:        longint;
  From, To_:    AnsiString;
  Subject:      AnsiString;
  DateWritten:  TDateTime;
  DateArrived:  TDateTime;
  OrigAddr:     TFtnAddr;
  DestAddr:     TFtnAddr;
  Attr:         cardinal;      { canonical MSG_ATTR_* bitset }
  Body:         AnsiString;    { kludges + text, CR-separated }

  { Backend-specific fields preserved verbatim across round-trips.
    Every key a backend WRITES during a Read it MUST re-consume
    during a subsequent Write. Test harness enforces this. }
  Extras:       TMsgExtras;
end;

{ Records with methods need the advancedrecords modeswitch in
  objfpc mode. }
TMsgExtras = record
  function  Get(const Key, Default: AnsiString): AnsiString;
  procedure SetValue(const Key, Value: AnsiString);
  function  Has(const Key: AnsiString): boolean;
end;
```

**Well-known keys** (published in `docs/extras-registry.md`, to be
written):

| Key | Type | Source | Notes |
|---|---|---|---|
| `jam.msgidcrc` | hex u32 | JAM `.jhr` fixed header | needed for NR linker `-j` fast path |
| `jam.replycrc` | hex u32 | JAM `.jhr` fixed header | same |
| `jam.dateprocessed` | unix int | JAM `.jhr` | tosser timestamp |
| `jam.passwordcrc` | hex u32 | JAM `.jhr` | per-msg password |
| `jam.cost` | int | JAM `.jhr` | |
| `squish.umsgid` | hex u32 | SQI frame | unique msg ID |
| `hudson.board` | int 1..200 | MSGINFO | board number |
| `hudson.refer` | int | HDR | refer-to ptr |
| `pcb.confnum` | int | PCB `.IDX` | conference |
| `pcb.refer` | int | PCB `.IDX` | |
| `ezy.msgflags` | hex byte | MH#####.BBS | |
| `goldbase.userid` | int | IDX | |
| `wildcat.confnum` | int | WC SDK | |
| `pkt.cost` | int | Type-2 header | |
| `pkt.flavour` | enum | outbound pkt only | crash/hold/direct/imm/norm |

**Round-trip invariant:**

> After `Read(i, M)`, calling `Write(M)` on the same or a different
> base of the same format produces bytes that round-trip through
> `Read` to an identical `TMsgRecord` (same
> Extras keys, same values). Tests enforce this across the full corpus.

### Error model — one tree

```
EMessageBase
  ├─ EMessageBaseIO         disk full, permission, corrupt read
  ├─ EMessageBaseLock       timeout, contention, deadlock
  ├─ EMessageBaseFormat     bad signature, truncated header
  ├─ EMessageBaseRange      Index out of [0..Count)
  └─ EMessageBaseClosed     operation on a freed/closed base
```

Every method either **succeeds** or **raises**. Boolean returns mean
"nothing to do" (empty Pack, no new messages, no matching records) —
never failure. A single `try/except on E: EMessageBase do ...` catches
the whole tree.

### Locking — explicit, not implicit

```pascal
TMsgBase = class
  procedure LockForRead;
  procedure LockForWrite;
  procedure Unlock;
  function  TryLockForRead (TimeoutMs: integer): boolean;
  function  TryLockForWrite(TimeoutMs: integer): boolean;

  { Common-case one-liners. }
  procedure WithReadLock (AProc: TProc);
  procedure WithWriteLock(AProc: TProc);
end;
```

Default `Open` acquires **no lock**. Callers choose. A read-only BBS
frontend doesn't need cross-process locking; a tosser does. Library
doesn't guess. Common case stays a one-liner via `WithReadLock`.

### Transactions — declared

```pascal
base.BeginTransaction;
try
  for msg in batch do
    base.WriteMessage(msg);
  base.Commit;
except
  base.Rollback;
  raise;
end;
```

Backends implement per-format:

- **JAM:** defer `.jdx` index updates in memory, flush on Commit.
- **SDM:** temp-dir shadow, rename on Commit.
- **Hudson:** in-memory index delta, flush on Commit.
- **Squish:** frame-list delta, flush on Commit.
- **PKT (write):** temp file, rename on Commit.
- **Read-only mode:** Commit is a no-op.

Callers who don't call `Begin/Commit` still work — writes flush
per-call. Transactions are opt-in for atomicity.

### Events — single callback

```pascal
TMsgEvent = record
  EventType: TMsgEventType;    { BaseOpened, MessageRead, etc. }
  Source:    TMsgBase;         { may be nil for lib-global events }
  Subject:   AnsiString;       { path, area tag, msgid }
  Detail:    AnsiString;       { human-readable }
  LongValue: int64;            { count, size, offset }
  TimeStamp: TDateTime;
end;

TMsgEventCallback = procedure(const E: TMsgEvent) of object;

base.OnEvent := @MyHandler;    { one pointer, that's it }
```

comet does this with `OnLog`. One callback, caller multiplexes if they
need multiple observers. No multi-subscriber registry. Radical
simplification.

### Format detection is the default API

```pascal
{ Primary — sniff the path, pick the backend. }
function OpenMsgBase(const Path: AnsiString;
                     Mode: TMsgOpenMode): TMsgBase;

{ Escape hatch — force a specific format. }
function OpenMsgBaseAs(Format: TMsgBaseFormat;
                       const Path: AnsiString;
                       Mode: TMsgOpenMode): TMsgBase;
```

Fingerprints:

- JAM: `.jhr` + `.jdx` pair, `"JAM\0"` signature
- Squish: `.sqd` + `.sqi`
- Hudson: `MSGINFO.BBS` + `MSGHDR.BBS` + ...
- GoldBase: `MSGINFO.DAT` + ...
- PCBoard: `.IDX` + `.MSG` pair
- EzyCom: `MH*.BBS` + `MT*.BBS`
- Wildcat: WC SDK marker file
- SDM: directory full of numbered `.msg` files (fallback)

### One unit per format, self-registering

```pascal
uses
  OpenMsgLib,          { core + factory }
  OpenMsgLib.Jam,      { registers JAM format in initialization }
  OpenMsgLib.Hudson;   { registers Hudson in initialization }
```

No `.uni` split. The format unit's `initialization` block calls
`RegisterFormat(..., @Factory)`. Inclusion is the registration.

### `TOutboundBatch` — the missing half of the tosser

The symmetric twin of `TPacketBatch` (inbound). Details in the
appendix; the shape:

```pascal
TOutboundBatch = class
  constructor Create(const AOutboundDir: AnsiString;
                     const AOurAka: TFtnAddr);

  { Append a message to the outbound packet for (Target, Flavour).
    Caches the open pkt per (Target, Flavour) pair so repeat calls
    write to the same file until rotation or Flush. }
  function DispatchMessage(const Msg: TMsgRecord;
                           const Target: TFtnAddr;
                           Flavour: FlavourType): boolean;

  { Finalise every cached pkt — write terminator, rename
    .tmp → final, update .flo if configured. Idempotent. }
  procedure Flush;

  property MaxPktSizeKB: longint;   { 0 = unlimited; rotate at threshold }
  property OnEvent: TMsgEventCallback;
end;
```

Two features missing in both NR and fpc-msgbase today, baked in here
from day one:

- **Packet size rotation.** Before each write, check cached stream
  size + estimated msg size. If over threshold, close current
  (`.tmp` → `.pkt` rename), open next.
- **Atomic finalize.** Writes go to `xxxxxxxx.pkt.tmp`. Only `Flush`
  (or rotation) renames to `.pkt`. Crash mid-run leaves an orphan
  `.tmp` — not a corrupt real packet.

Format-agnostic: writes to Type-2 / 2+ / 2.2 pkts via the existing
`ma.fmt.pkt` backend. FTN routing stays in the caller (NR's
`FindRouteForNetmail`/`GetLinkForRoute` is fidoconf-specific, not
library material).

---

## Reference map — where to look in existing code

### Inside fpc-msgbase

| Concern | File | Relevant lines |
|---|---|---|
| Factory pattern to extend | `src/ma.api.pas` | full file |
| `TUniMessage` to grow with Extras | `src/ma.types.pas` | record definition, top of file |
| Existing inbound tosser to mirror for outbound | `src/ma.batch.pas` | `TPacketBatch`, `GetOrCreateBase` (line 296) |
| Example tosser template | `examples/example_tosser.pas` | `TSimpleTosser.OnMessage` |
| Existing lock layer to make explicit | `src/ma.lock.pas` | full file |
| Event registry to replace with single callback | `src/ma.events.pas` | full file |
| Per-format native → uni adapters to fold together | `src/formats/*.uni.pas` × 9 | each is ~100-200 LOC |
| Sample data for tests (if any) | `tests/` | check what's already there |

### In NetReader (`../netreader`)

NR has already solved several of these in its own idiom.
The outbound machinery is the most reusable reference:

| Concern | File | Relevant lines |
|---|---|---|
| Cached outbound packet per dest addr | `src/core/nr.scanner.pas` | `GetOutboundPacket` line 214, `CloseAllPackets` line 291 |
| Temp-pkt filename generator | `src/core/nr.scanner.pas` | `CreateTempPktFileName` call at 256 |
| Per-message route + pack logic | `src/core/nr.scanner.pas` | `PackMsg` line 551 |
| Route resolution (stays in NR) | `src/core/nr.scanner.pas` | `FindRouteForNetmail` 398, `GetLinkForRoute` 451 |
| Priority from Attr + FLAGS kludge | `src/core/nr.scanner.pas` | `PackMsg` lines 609-636 |
| Zone-aware outbound path | `src/core/nr.arcmail.pas` | `GetOutboundDir` (around line 178), `GetFLOPath` 219 |
| IsArcMailExt helper | `src/core/nr.arcmail.pas` | `IsArcMailExt` line 155 |
| Pkt header + message write | `src/core/nr.packet.pas` | `WritePktHeader`, `WritePktMessage` |
| Scanner's NoHighWaters pattern (useful for read-only verifiers) | `src/core/nr.scanner.pas` | `ScanNMArea` 1098 |
| JAM header-only read (NR's linker CRC fast path) | `src/msgbase/nr.msgbase.jam.pas` | `TNrJamMsgBase.ReadLinkHdr`, `TNrJamLinkHdr` record |

NR's entire CLI parity pass (last 7 commits in `../netreader`) is
built on this scaffolding. Pattern should transfer cleanly.

### Reference: format specs

The FTSC document collection at `/home/ken/Source Code/ftsc/docs/`
and the format-author specs (jam.txt, squish.doc, pcboard.doc, etc.)
are the authoritative source. Every backend cites the spec it
implements in `docs/ftsc-compliance.md`.

### In comet (`../comet`)

comet is the "plug it in and forget" model. Key patterns worth
mirroring:

- Single log callback (`OnLog`), no registry
- `TStream`-centric I/O — caller controls the stream
- Config hot-reload without API surgery
- Embeddable via narrow callback surface

See `../comet/README.md` "Embeddable" bullet.

---

## Test corpus — the `testmsg/` folder

**Status:** to be populated.
Proposal: live at `/home/ken/Source Code/fpc-msgbase/testmsg/`
(checked in, anonymized, git-tracked so rollback is just
`git checkout`).

Proposed structure:

```
testmsg/
├── README.md                   how-to regenerate, licensing notes
├── jam/
│   ├── small_echo/             few-hundred-msg JAM area, real echoarea
│   ├── large_echo/             10k+ messages, stresses index growth
│   ├── deleted_mix/            area with tombstoned msgs for Pack tests
│   └── netmail/                JAM netmail w/ kludges
├── squish/
│   ├── small_echo/
│   └── netmail/
├── hudson/
│   └── 3-board/                multi-board fixture for Board field test
├── msg/                        FTS-1 numbered *.msg
│   ├── netmail/
│   └── echo/                   rare but tested
├── pcb/
│   └── sample_conf/
├── ezycom/
├── goldbase/
├── wildcat/
├── pkt/
│   ├── type2_plain/
│   ├── type2plus/
│   └── type2_2/
└── reference/
    ├── jam_small_echo.json     canonical-JSON snapshot
    ├── jam_large_echo.json     (round-trip baseline — DO NOT edit)
    └── ...                     one per corpus base
```

**Anonymization rules:** before check-in, scrub real user addresses
and passwords from kludges. A small helper `tools/anonymize.pas` can
do this deterministically — replaces real MSGID address with
`999:9999/9999`, replaces user names with `User` tokens.

**Regeneration script:** `tools/regen_reference.sh` walks each corpus
base via this library, dumps canonical JSON to `testmsg/reference/`.
Committed output is the ground truth captured from a known-good build;
later test runs diff their output against the committed JSON.

**Rollback story:** `git checkout testmsg/` restores any corrupted
fixture. Keep fixtures small-ish (<50MB total across all formats) so
the repo stays cloneable on slow links.

---

## Byte-agreement cross-verifier — first actionable task

Before any redesign, **confirm the two libraries agree on the bytes
they read and write today**. Without this baseline, reshape could
silently regress behavior that currently works.

**Tool:** `tools/cross_verify.pas` — standalone FPC program.

**What it does:**

1. Open a corpus base with the existing fpc-msgbase.
2. Read all messages → dump each to a canonical JSON record.
3. Open the same base with NetReader's `nr.msgbase.*`.
4. Read all messages → dump to same canonical JSON format.
5. Diff the two outputs. Report mismatches per field.

For write verification:

1. Fabricate 100 messages with known content.
2. Write them through fpc-msgbase to a fresh JAM base.
3. Read them back through NR's backend, diff.
4. Repeat with write via NR, read via fpc-msgbase.

Expected outcome: both should agree for universal fields. Where they
disagree is where the proposal's `Extras` story becomes important —
those are the fields each side handles differently (or one side
silently drops).

**Deliverable:** a report noting exactly which fields differ per
format, so the Extras registry (above) is anchored in real data rather
than speculation.

This is ~2-4 hours of work and produces the single most important
input to the reshape.

---

## Implementation plan — six weeks

| Week | Deliverables |
|---|---|
| **1** | `testmsg/` corpus committed (3-5 bases per format, anonymized). `tools/cross_verify.pas` running. `tools/regen_reference.sh` producing committed canonical JSON. Baseline: which fields differ between fpc-msgbase and NR today? |
| **2** | Grow `TUniMessage` → `TMsgRecord` with `Extras` bag. Publish `docs/extras-registry.md` naming every well-known key. Backfill round-trip test: every backend reads a corpus base, writes to a fresh base, reads back, Extras map preserved. Fold `.uni` sidecars into format units. |
| **3** | Introduce `ITsmIO`. Refactor every format backend to take an `ITsmIO` instead of `TFileStream` directly. Add `TMemoryTsmIO` for tests. Run the full test suite in memory — no disk writes. |
| **4** | Land `TOutboundBatch` with size rotation + atomic finalize. Write `examples/example_multiplex` that splits inbound pkts by destination and writes per-link outbound pkts via `TOutboundBatch`. |
| **5** | Unify error model (typed `EMessageBase*` tree; boolean returns mean "nothing to do" only). Replace `TMessageEvents` registry with single `OnEvent` callback. `TMsgBase` locking becomes explicit (no implicit `LockForRead` on Open). `WithReadLock` / `WithWriteLock` helpers. |
| **6** | Documentation pass. Full API reference regen. `CHANGELOG.md` with 0.1 → 1.0 migration notes. Version constants `MA_VERSION_MAJOR`/`MINOR` + runtime `MaRequireVersion`. `{$DEFINE MA_WITH_<FORMAT>}` gates finalized. Ship 1.0. |

Each week's work ships independently — no big-bang merge.

---

## Beyond the six-week plan — things worth planning

### Version ABI discipline

```pascal
const
  MA_VERSION_MAJOR = 1;
  MA_VERSION_MINOR = 0;

{ Caller invokes at program start. Raises if compiled against
  major < required or (major == required AND minor < required). }
procedure MaRequireVersion(Major, Minor: integer);
```

Libraries consumed by multiple callers (Fimail, NetReader,
third-party) need a noisy "you linked the wrong version" failure mode.
Not "weird behavior six months later."

### Read-only airtightness

`momReadOnly` should be verifiable. Regression test:

1. Copy a base to a read-only mount (`chmod -w` + `mount -o ro`).
2. `OpenMsgBase(path, momReadOnly)`.
3. Read 10k messages, every field.
4. Assert: every file's `st_mtime` is unchanged; no syscall fired that
   opens a file for write.

A BBS running concurrently with a tosser depends on this being
airtight — no "oops, we updated last-read" surprises.

### Build-time format gating

```pascal
{ config.inc }
{$DEFINE MA_WITH_JAM}
{$DEFINE MA_WITH_HUDSON}
{$DEFINE MA_WITH_SQUISH}
{ $DEFINE MA_WITH_WILDCAT}     { commented out — 66K LOC pulled in }
```

```pascal
{ OpenMsgLib.All.pas — convenience include-all unit }
unit OpenMsgLib.All;
interface
uses
  OpenMsgLib
  {$IFDEF MA_WITH_JAM},    OpenMsgLib.Jam{$ENDIF}
  {$IFDEF MA_WITH_HUDSON}, OpenMsgLib.Hudson{$ENDIF}
  ...
  ;
implementation
end.
```

Embedders control what ships. A full tosser wants everything.
A minimal BBS UI wants JAM only. No one should be forced to compile
BTrieve-era SDK code for a JAM reader.

### Documentation structure

```
docs/
├── API.md                      full API reference (regen each 1.x)
├── architecture.md             layered design (update to three-tier)
├── extras-registry.md          well-known Extras keys per format
├── ftsc-compliance.md          spec notes
├── migration-0.1-to-1.0.md     for existing callers
├── embedder-guide.md           for BBS/tosser authors
└── format-notes/               per-format quirks & gotchas
    ├── jam.md
    ├── squish.md
    └── ...
```

### CI expectations

When the repo has CI (GitHub Actions, GitLab CI, whatever), the test
job is:

1. Build with all formats enabled.
2. Build with each format *disabled* in turn — prove conditional
   compilation holds.
3. Run `run_tests.sh`.
4. Run `tools/cross_verify.pas` against the corpus.
5. Verify `docs/extras-registry.md` lists every key any backend writes
   (grep the source for `Extras.SetValue`).

---

## Contribution path back from NetReader

NR has six weeks of HPT-parity work sitting on top of its own msgbase.
When the reshape hits 1.0, NR has a decision:

- **A.** Adopt fpc-msgbase wholesale — drop `nr.msgbase.*`, call the
  lib. NR becomes a thin areafix + scanner + CLI over the shared
  library. Big commit, one-time pain.
- **B.** Keep NR's backends, cherry-pick fpc-msgbase's event
  dispatcher / lock model / outbound batch. Lighter touch.
- **C.** Contribute NR's improvements (JAM CRC fast-path,
  case-insensitive `.msg` globbing, netmail-cfg-writer fix, FTS-0004
  tag validation) back to fpc-msgbase. Symmetric win.

Option C is the first step regardless of whether A or B lands later.
NR's recent work is format-agnostic quality improvements that every
embedder wants.

---

## Open questions for the implementation session

Before any code changes:

1. **Extras representation.** Key-value `AnsiString` bag is dead
   simple but slow for 100K messages.
   Alternatives:
   - `TDictionary` (rtl-generics) — faster lookup
   - Packed binary blob with offsets — smallest memory footprint
   - Keep strings but cap each Extras to N keys (fixed-size array)

   Benchmark before picking.

2. **Transaction nesting.** Can `BeginTransaction` nest? JAM defers
   `.jdx` updates in memory — nested transactions just keep deferring.
   SDM's tempdir-shadow approach can't nest cleanly. Propose: **no
   nesting**. Second `BeginTransaction` call raises.

3. **Thread safety of `TMsgBase`.** Inbound `TPacketBatch` shares one
   base across workers, serialises via per-base CS. Works today. Does
   the reshape preserve that? Answer: yes, the explicit lock API makes
   it *more* obvious.

4. **Squish `.sql` lastread vs `.sqi` index.** This library treats
   `.sql` as the cross-process lock sentinel (matching the convention
   other Squish-aware tools use). The reshape should document this
   explicitly — it's a format-specific quirk that callers shouldn't
   need to know.

5. **PKT as a base vs as a stream.** `TMessageBase` abstraction
   assumes random-access read. A PKT is a forward-only stream of
   messages. Does PKT implement `Count`? (Reader would need to scan
   ahead to count.) Propose: PKT implements `Count` but flags
   `CanRandomAccess = False`, caller iterates via `MoveNext` instead
   of `Read(i)`. Callers who treat PKT as a base get a clear
   exception.

6. **Wildcat SDK cleanup.** Is the 40K-LOC `wc_sdk/` still needed, or
   can it be replaced with a narrower interop layer? (Not a 1.0
   blocker, but worth scoping for 1.1.)

Decisions on these should go in `docs/design-decisions.md` as they
land, so future sessions don't re-litigate.

---

## First-session actionable steps

If this proposal makes sense to the next implementer, the first
session should:

1. **Read this document end-to-end.** Cross-check my mapping of
   fpc-msgbase's current state against the actual code — call out
   anything stale.

2. **Create `testmsg/` with a corpus.** Start with ONE format (JAM —
   best-spec'd, most used) and 3 bases: small echoarea (~100 msgs),
   large echoarea (~10K), netmail directory. Commit.

3. **Write `tools/cross_verify.pas`.** Use NR's `nr.msgbase.jam` and
   fpc-msgbase's `ma.fmt.jam` as the two readers. Dump canonical JSON,
   diff. Report.

4. **Report the diff.** What fields do the two libraries disagree on
   today? That diff becomes the initial `extras-registry.md`.

5. **Stop. Discuss.** Before any backend refactor, the cross-diff
   report informs every architectural decision in this proposal. If
   the two libraries disagree on 20% of fields, the Extras story is
   validated. If they agree on 100%, the Extras story is more about
   future-proofing than immediate need.

Only after the cross-diff is in hand does the week-1 plan above make
sense.

---

## Appendix: the `TOutboundBatch` design in full

(From the cross-project conversation; reproduced here for
completeness.)

```pascal
unit OpenMsgLib.Outbound;

{$mode objfpc}{$H+}

interface

uses
  Classes, SysUtils,
  Contnrs,                       { TFPHashList }
  OpenMsgLib, OpenMsgLib.Types, OpenMsgLib.Pkt;

type
  TOutboundBatch = class
  private
    FOutboundDir:  AnsiString;
    FOurAka:       TFtnAddr;
    FMaxPktSizeKB: longint;
    FCacheCS:      TRTLCriticalSection;
    FCache:        TFPHashList;  { key: "zone:net/node.point|flavour" -> TEntry }
    FOnEvent:      TMsgEventCallback;
    function  GetOrCreateEntry(const Target: TFtnAddr;
                               Flavour: FlavourType): TEntry;
    procedure RotateIfOver(Entry: TEntry; EstimatedSize: integer);
    procedure FinalizeEntry(Entry: TEntry);   { rename .tmp → final }
  public
    constructor Create(const AOutboundDir: AnsiString;
                       const AOurAka: TFtnAddr);
    destructor Destroy; override;

    function  DispatchMessage(const Msg: TMsgRecord;
                              const Target: TFtnAddr;
                              Flavour: FlavourType): boolean;
    procedure Flush;             { finalize every cached entry }

    property MaxPktSizeKB: longint read FMaxPktSizeKB write FMaxPktSizeKB;
    property OnEvent: TMsgEventCallback read FOnEvent write FOnEvent;
  end;
```

**Per-entry:**

```pascal
TEntry = class
  Key:         AnsiString;           { "zone:net/node.point|flavour" }
  Target:      TFtnAddr;
  Flavour:     FlavourType;
  Stream:      TStream;              { writer, actually holds an ITsmIO under the hood }
  TmpPath:     AnsiString;           { xxxxxxxx.pkt.tmp }
  FinalPath:   AnsiString;           { xxxxxxxx.pkt — set on Rotate/Flush }
  CS:          TRTLCriticalSection;  { serialises writes to this one pkt }
  WrittenSize: int64;
end;
```

**Flow inside `DispatchMessage`:**

1. Lookup entry by `(Target, Flavour)` in `FCache`.
   - Miss: create entry, open `TmpPath`, write pkt header, cache.
   - Hit: enter its CS.
2. `RotateIfOver(Entry, EstimatedMsgSize)`:
   - If `WrittenSize + EstimatedMsgSize > MaxPktSizeKB * 1024` (and
     `MaxPktSizeKB > 0`):
     - Write terminator, close, rename `.tmp → .pkt`.
     - Create a new `.tmp`, write header, reset `WrittenSize`.
     - Fire `metPktRotated`.
3. Convert `TMsgRecord` → `TPktMessage` via `UniToPkt`.
4. Write via `OpenMsgLib.Pkt` writer. Update `WrittenSize`.
5. Fire `metMessageWritten`.
6. Leave entry CS.

**Flush:** iterate `FCache`, for each entry write terminator, close,
`FinalizeEntry`. Fire `metBatchFinalized`.

**Crash recovery:** on startup, scanner sees orphan `xxxxxxxx.pkt.tmp`
files. Decision policy (caller-configurable):

- **Discard** (default): delete orphan tmps, assume corrupt.
- **Recover**: try to validate pkt header + terminator, if valid
  rename to `.pkt`, else discard.

---

## References to the cross-project conversation

Full transcript of the NR ↔ fpc-msgbase discussion lives in the
session log at
`/home/ken/.claude/projects/-home-ken-Source-Code-netreader/*.jsonl`
(dates around 2026-04-15). Relevant decisions captured in this
document; the session log has the reasoning trail if questions arise.

Key points from that conversation, reproduced for the next session:

- **Why not fork fpc-msgbase into a new project?** Because fpc-msgbase
  has tested format backends already. Forking means re-validating
  them. Reshape in place preserves that investment.
- **Why the five-line Hello World test?** comet achieves
  plug-and-forget with a single log callback and a TStream-based API.
  That's the bar. If the lib requires more ceremony than "open path,
  read, close," it's not there yet.
- **Why explicit locking, not implicit?** BBSes with their own global
  mutex don't want the lib double-locking. Stateless readers don't
  need cross-process locks. Library guessing leads to surprises.
  Explicit means embedders can always reason about behavior.
- **Why single callback, not registry?** comet's `OnLog` proves one
  pointer is enough. Multi-observer is the caller's problem — they can
  write their own fan-out if needed. Registry adds state the library
  shouldn't own.

---

## Closing

This is a design proposal, not a mandate. The next implementer should
push back on anything that doesn't hold up against real code. The
testmsg corpus + cross-verifier gives us the data to have that
conversation grounded in bytes rather than opinion.

When in doubt: simpler is better. comet's model works because it
refused to do things the library didn't absolutely need to do.
fpc-msgbase 1.0 should exit with **less code, not more**, than 0.1.0 —
the reshape is about architectural clarity, not feature addition.
Features come in 1.1+ on top of a clean 1.0.
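
### Footnote: rotation check and atomic finalize, sketched

One concrete footnote for the next session. The rotation test and the
`.tmp` → `.pkt` rename described in the `TOutboundBatch` appendix can
be tried out in isolation before any library code exists. The program
below is a minimal sketch under stated assumptions, not the proposed
implementation: `ShouldRotate` and `WriteAndFinalize` are hypothetical
helper names, the payload is a stand-in string rather than a real
Type-2 header plus messages, and the two-byte zero terminator reflects
my reading of FTS-0001. It uses only `Classes` and `SysUtils` from the
FPC RTL.

```pascal
program OutboundFinalizeSketch;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

{ Hypothetical helper: True when appending EstimatedSize bytes would
  push the packet past the cap. MaxPktSizeKB = 0 means unlimited. }
function ShouldRotate(WrittenSize, EstimatedSize: int64;
  MaxPktSizeKB: longint): boolean;
begin
  Result := (MaxPktSizeKB > 0) and
            (WrittenSize + EstimatedSize > int64(MaxPktSizeKB) * 1024);
end;

{ Hypothetical helper: write Payload to <FinalPath>.tmp, append the
  two-byte zero terminator, close, then rename to FinalPath. A crash
  before the rename leaves only the .tmp file behind. }
procedure WriteAndFinalize(const FinalPath, Payload: AnsiString);
const
  Terminator: word = 0;
var
  TmpPath: AnsiString;
  S: TFileStream;
begin
  TmpPath := FinalPath + '.tmp';
  S := TFileStream.Create(TmpPath, fmCreate);
  try
    if Length(Payload) > 0 then
      S.WriteBuffer(Payload[1], Length(Payload));
    S.WriteBuffer(Terminator, SizeOf(Terminator));
  finally
    S.Free;
  end;
  if not RenameFile(TmpPath, FinalPath) then
    raise Exception.CreateFmt('cannot rename %s to %s',
      [TmpPath, FinalPath]);
end;

var
  PktPath: AnsiString;
begin
  { Assumption: the system temp dir is writable; the name is made up. }
  PktPath := IncludeTrailingPathDelimiter(GetTempDir) + 'sketch01.pkt';
  if FileExists(PktPath) then
    DeleteFile(PktPath);

  WriteAndFinalize(PktPath, 'stand-in for a packet header and messages');
  WriteLn('final packet present: ', FileExists(PktPath));
  WriteLn('orphan tmp present:   ', FileExists(PktPath + '.tmp'));
  WriteLn('rotate (1000 + 100 bytes, 1 KB cap): ',
    ShouldRotate(1000, 100, 1));

  DeleteFile(PktPath);
end.
```

The rename is what makes the finalize step atomic from a reader's
point of view, but only when the `.tmp` file and the final packet live
on the same filesystem. On Windows `RenameFile` fails if the target
already exists, so a real implementation would need a unique final
name or an explicit overwrite policy.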