fpc-msgbase/docs/architecture.md
Ken Johnson 1e253e8a78 Phase 5: attribute registry + arch / proposal / README updates
New docs/attributes-registry.md publishes the canonical attribute
key catalog in four tiers:

  1. Universal headers — msg.num, from, to, subject, date.*, addr.*,
     area, board, cost.  Every Fido format carries them.
  2. Canonical attribute bits — attr.private, attr.crash, etc.,
     mapped to/from the FTS-1 attribute word.
  3. FTSC kludges — msgid, replyid, pid, tid, flags, chrs, tzutc,
     seen-by, path, via.  Multi-line keys use #13 between lines.
  4. Format-specific — jam.*, squish.*, hudson.*, goldbase.*, ezy.*,
     pcb.*, wildcat.*, pkt.*, msg.*.  Each backend's namespace.

Plus a per-format support matrix showing which keys each backend
carries. Authoritative source remains each backend's
ClassSupportedAttributes -- the matrix can drift; SupportsAttribute()
is the runtime-correct query.

docs/architecture.md TUniMessage section rewritten:
- Documents the strict two-area model (Body + Attributes only).
- Body holds only the message text, never kludges or headers.
- Library never composes presentation -- consumers walk Attributes
  and assemble their own display.
- Adds the capabilities API section pointing at the registry.
- Removes the stale "kludge lines intact and CR-separated" promise
  the previous adapter implementations didn't honor.

docs/PROPOSAL.md flags the original Extras-bag section as
SUPERSEDED 2026-04-17, points to the registry + architecture docs
as the live design. Original text retained as historical context
since it captures the conversation that drove the redesign.

README.md:
- Features list now leads with the lossless two-area model and the
  capabilities API.
- Adds a Status note flagging 0.2 as a breaking change vs 0.1 with
  a one-paragraph migration sketch (msg.WhoFrom -> Attributes.Get
  ('from'), etc.).
- Documentation index links to the new registry doc.
2026-04-17 14:35:19 -07:00

fpc-msgbase — architecture

Layers

        ┌──────────────────────────────────────────────────┐
        │  Caller (BBS, tosser, editor, importer, …)       │
        └──────────────────────────────────────────────────┘
                              │
                              ▼
        ┌──────────────────────────────────────────────────┐
        │  ma.api (TMessageBase, factory, TUniMessage)     │
        ├──────────────────────────────────────────────────┤
        │  ma.events   ma.lock   ma.paths                  │
        │  ma.batch (concurrent tosser helper)             │
        ├──────────────────────────────────────────────────┤
        │  Format backends — one .pas per format           │
        │  ma.fmt.hudson   ma.fmt.jam      ma.fmt.squish   │
        │  ma.fmt.msg      ma.fmt.pkt      ma.fmt.pcboard  │
        │  ma.fmt.ezycom   ma.fmt.goldbase ma.fmt.wildcat  │
        ├──────────────────────────────────────────────────┤
        │  RTL: TFileStream, BaseUnix/Windows for locking  │
        └──────────────────────────────────────────────────┘

Polymorphism

Every backend descends from TMessageBase and implements the abstract DoOpen, DoClose, DoMessageCount, DoReadMessage, DoWriteMessage contract. Callers can either:

  1. Use the unified API — MessageBaseOpen(format, path, mode) returns a TMessageBase. Read/write through TUniMessage. Format-agnostic.
  2. Drop down to format-specific class methods (e.g. TJamBase.IncModCounter, TSquishBase.SqHashName) when they need behaviour the unified API cannot express. Each backend keeps its rich API public.
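
The two access styles can be sketched with stand-in classes. This is a minimal illustration, not the library's code: TSketchBase, TSketchJam, SketchOpen and the single abstract method shown are hypothetical names that only mirror the shape described above.

```pascal
program PolymorphismSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

type
  { Hypothetical stand-in for TMessageBase; only one abstract method
    is shown to keep the sketch short. }
  TSketchBase = class
  protected
    function DoMessageCount: LongInt; virtual; abstract;
  public
    function MessageCount: LongInt;
  end;

  { Hypothetical backend with one extra, format-specific method. }
  TSketchJam = class(TSketchBase)
  protected
    function DoMessageCount: LongInt; override;
  public
    ModCounter: LongWord;
    procedure IncModCounter;
  end;

function TSketchBase.MessageCount: LongInt;
begin
  Result := DoMessageCount;   { public API delegates to the backend }
end;

function TSketchJam.DoMessageCount: LongInt;
begin
  Result := 3;                { pretend the base holds three messages }
end;

procedure TSketchJam.IncModCounter;
begin
  Inc(ModCounter);
end;

{ Hypothetical factory: picks a backend from a format name. }
function SketchOpen(const FormatName: string): TSketchBase;
begin
  if SameText(FormatName, 'jam') then
    Result := TSketchJam.Create
  else
    raise Exception.Create('unknown format: ' + FormatName);
end;

var
  Base: TSketchBase;
begin
  Base := SketchOpen('jam');
  try
    { 1. Format-agnostic path through the base class. }
    WriteLn('messages: ', Base.MessageCount);
    { 2. Drop down only when the unified API is not enough. }
    if Base is TSketchJam then
    begin
      TSketchJam(Base).IncModCounter;
      WriteLn('mod counter: ', TSketchJam(Base).ModCounter);
    end;
  finally
    Base.Free;
  end;
end.
```

The type check before the cast keeps format-specific calls explicit, so code that stays on the base class keeps working when the backend changes.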

TUniMessage — two-area model

TUniMessage = record
  Body:       AnsiString;       { only the message text }
  Attributes: TMsgAttributes;   { everything else, key/value }
end;

Two areas, no surprises:

  • Body carries the user-visible message text and nothing else. Never kludge lines, never headers, never SEEN-BY/PATH. Always a ready-to-display blob.
  • Attributes carries every other piece of data: From, To, Subject, dates, addresses, attribute bits, FTSC kludges (MSGID, ReplyID, PID, SEEN-BY, PATH, …), and per-format extras (jam.msgidcrc, squish.umsgid, pcb.confnum, …).

Same model as RFC 822 email (headers + body). Lossless round-trip across Read → Write → Read is enforced by the regression suite in tests/test_roundtrip_attrs.pas.

The library never composes presentation. A BBS that wants to display kludges inline walks Attributes and prepends ^AMSGID: etc. to its own display. A BBS that hides kludges just shows Body. A tosser that needs MSGID for dupe detection reads Attributes.Get('msgid') directly; no body parsing is required.
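
As a rough sketch of that division of labour, the program below uses a TStringList of name=value pairs as a stand-in for TMsgAttributes. TSketchMessage, RenderWithKludges and the choice of which kludges to show are assumptions made for illustration; the real accessor is Attributes.Get.

```pascal
program DisplaySketch;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

type
  { Hypothetical stand-in for TUniMessage; a TStringList plays the
    role of TMsgAttributes. }
  TSketchMessage = record
    Body: AnsiString;
    Attributes: TStringList;
  end;

{ Builds the text a kludge-showing BBS might display: selected kludges
  first, each prefixed with ^A (#1) and ended with CR (#13), then the
  untouched body. }
function RenderWithKludges(const Msg: TSketchMessage): AnsiString;
const
  Shown: array[0..1] of string = ('msgid', 'pid');
var
  i: Integer;
  Value: string;
begin
  Result := '';
  for i := Low(Shown) to High(Shown) do
  begin
    Value := Msg.Attributes.Values[Shown[i]];
    if Value <> '' then
      Result := Result + #1 + UpperCase(Shown[i]) + ': ' + Value + #13;
  end;
  Result := Result + Msg.Body;
end;

var
  Msg: TSketchMessage;
begin
  Msg.Body := 'Hello, world.';
  Msg.Attributes := TStringList.Create;
  try
    Msg.Attributes.Values['from'] := 'Sysop';
    Msg.Attributes.Values['msgid'] := '1:2/3 0badf00d';
    { A tosser reads the key directly; the body is never parsed. }
    WriteLn('dupe key: ', Msg.Attributes.Values['msgid']);
    { A kludge-hiding BBS shows Body alone. }
    WriteLn('plain:    ', Msg.Body);
    { A kludge-showing BBS composes its own view. }
    WriteLn('composed length: ', Length(RenderWithKludges(Msg)));
  finally
    Msg.Attributes.Free;
  end;
end.
```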

Dates are normalised to TDateTime regardless of how the backend stored them (Hudson MM-DD-YY strings with 1950 pivot, Squish FTS-0001 strings, JAM Unix timestamps, PCBoard / EzyCom DOS PackTime). They are stored in Attributes as date.written / date.received and are written and read via SetDate / GetDate.
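
To make the Hudson case concrete, here is a small sketch of a two-digit-year pivot. The helper name HudsonDateToDateTime is hypothetical, and the rule used (50..99 reads as 1950..1999, 00..49 as 2000..2049) is one plausible reading of "1950 pivot", not a statement of what the backend does.

```pascal
program DatePivotSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

{ Converts 'MM-DD-YY' into a TDateTime. Assumed pivot rule: years
  50..99 become 1950..1999, years 00..49 become 2000..2049. }
function HudsonDateToDateTime(const S: string): TDateTime;
var
  Month, Day, Year: Word;
begin
  if (Length(S) <> 8) or (S[3] <> '-') or (S[6] <> '-') then
    raise EConvertError.Create('expected MM-DD-YY, got "' + S + '"');
  Month := StrToInt(Copy(S, 1, 2));
  Day   := StrToInt(Copy(S, 4, 2));
  Year  := StrToInt(Copy(S, 7, 2));
  if Year >= 50 then
    Inc(Year, 1900)
  else
    Inc(Year, 2000);
  Result := EncodeDate(Year, Month, Day);
end;

begin
  WriteLn(FormatDateTime('yyyy-mm-dd', HudsonDateToDateTime('04-17-26')));
  WriteLn(FormatDateTime('yyyy-mm-dd', HudsonDateToDateTime('12-31-93')));
end.
```

Under the assumed rule the two calls print 2026-04-17 and 1993-12-31.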

Format-specific bit fields (Hudson byte attr, JAM 32-bit attr, Squish attr, MSG word attr, PCB status, EzyCom dual byte) are unrolled into individual attr.* boolean attributes on Read via UniAttrBitsToAttributes and recomposed on Write via UniAttrBitsFromAttributes and the per-format XxxAttrFromUni helpers. The canonical MSG_ATTR_* cardinal bitset stays as the internal pivot.
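
The unroll and recompose steps might look roughly like this for an FTS-0001 style attribute word. The routine names, the 'true' value convention and the keys attr.killsent and attr.local are assumptions for illustration; only attr.private and attr.crash are keys named in the commit notes above, and the mask values follow the FTS-0001 attribute word.

```pascal
program AttrBitsSketch;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

type
  TAttrBit = record
    Mask: Word;
    Key: string;
  end;

const
  { A few FTS-0001 attribute-word bits. The attr.killsent and
    attr.local key spellings are assumed, not taken from the registry. }
  AttrBits: array[0..3] of TAttrBit = (
    (Mask: $0001; Key: 'attr.private'),
    (Mask: $0002; Key: 'attr.crash'),
    (Mask: $0080; Key: 'attr.killsent'),
    (Mask: $0100; Key: 'attr.local')
  );

{ Read direction: one boolean attribute per set bit. }
procedure BitsToAttributes(Bits: Word; Attrs: TStrings);
var
  i: Integer;
begin
  for i := Low(AttrBits) to High(AttrBits) do
    if (Bits and AttrBits[i].Mask) <> 0 then
      Attrs.Values[AttrBits[i].Key] := 'true';
end;

{ Write direction: rebuild the word from the boolean attributes. }
function BitsFromAttributes(Attrs: TStrings): Word;
var
  i: Integer;
begin
  Result := 0;
  for i := Low(AttrBits) to High(AttrBits) do
    if Attrs.Values[AttrBits[i].Key] = 'true' then
      Result := Result or AttrBits[i].Mask;
end;

var
  Attrs: TStringList;
  Original, Rebuilt: Word;
begin
  Attrs := TStringList.Create;
  try
    Original := $0001 or $0100;          { private + local }
    BitsToAttributes(Original, Attrs);
    Rebuilt := BitsFromAttributes(Attrs);
    WriteLn('attributes: ', Attrs.CommaText);
    WriteLn('round trip ok: ', Rebuilt = Original);
  finally
    Attrs.Free;
  end;
end.
```

Driving both directions from one table keeps the read and write mappings from drifting apart, which is what a lossless round trip depends on.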

Capabilities API — backend self-description

Each backend declares the canonical list of attribute keys it understands via a class function:

class function TMessageBase.ClassSupportedAttributes: TStringDynArray;

Callers query before setting:

if base.SupportsAttribute('attr.returnreceipt') then
  RenderReceiptCheckbox
else
  HideReceiptCheckbox;

Backends silently ignore unknown attributes on Write (RFC 822 X-header semantics — fine for forward compatibility); the capabilities API exists so callers know in advance which keys won't survive on a given format. The full per-format support matrix lives in docs/attributes-registry.md.
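
One way SupportsAttribute could sit on top of ClassSupportedAttributes is sketched below. TSketchBase and TSketchHudson are stand-ins, the key list given to the toy backend is made up, and exact case-sensitive matching is an assumption.

```pascal
program CapabilitiesSketch;
{$mode objfpc}{$H+}

uses
  Types;

type
  { Hypothetical stand-ins for TMessageBase and one backend. }
  TSketchBase = class
  public
    class function ClassSupportedAttributes: TStringDynArray; virtual;
    function SupportsAttribute(const Key: string): Boolean;
  end;

  TSketchHudson = class(TSketchBase)
  public
    class function ClassSupportedAttributes: TStringDynArray; override;
  end;

class function TSketchBase.ClassSupportedAttributes: TStringDynArray;
begin
  Result := nil;              { the base class supports nothing itself }
end;

function TSketchBase.SupportsAttribute(const Key: string): Boolean;
var
  Keys: TStringDynArray;
  i: Integer;
begin
  { The virtual class function resolves to the actual backend's list. }
  Keys := ClassSupportedAttributes;
  for i := 0 to High(Keys) do
    if Keys[i] = Key then
      Exit(True);
  Result := False;
end;

class function TSketchHudson.ClassSupportedAttributes: TStringDynArray;
begin
  { Made-up list for the sketch; not the real Hudson key set. }
  SetLength(Result, 4);
  Result[0] := 'from';
  Result[1] := 'to';
  Result[2] := 'subject';
  Result[3] := 'attr.private';
end;

var
  Base: TSketchBase;
begin
  Base := TSketchHudson.Create;
  try
    WriteLn('attr.private: ', Base.SupportsAttribute('attr.private'));
    WriteLn('jam.msgidcrc: ', Base.SupportsAttribute('jam.msgidcrc'));
  finally
    Base.Free;
  end;
end.
```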

Locking

Three layers, applied in order on every Open:

  1. In-process: TRTLCriticalSection per TMessageBase instance.
  2. Cross-process: advisory lock on a sentinel file (<base>.lck or, for Squish, <base>.SQL so we coexist with other Squish-aware tools). fpflock (LOCK_EX or LOCK_SH) on Unix, LockFileEx on Windows. Retry with backoff up to a configurable timeout (default 30s). Lock acquire/release fires events.
  3. OS share modes: fmShareDenyWrite for writers, fmShareDenyNone for readers, matching the DOS-era multi-process sharing conventions every classic format expects.
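
The retry-with-backoff step of the cross-process layer can be sketched independently of the platform call. In this illustration FakeTryLock stands in for a non-blocking attempt (for example fpFlock with LOCK_NB on Unix, or LockFileEx with LOCKFILE_FAIL_IMMEDIATELY on Windows); the helper names, the 10 ms starting delay and the 500 ms cap are assumptions, not the library's values.

```pascal
program LockRetrySketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

type
  TTryLock = function: Boolean;

var
  Attempts: Integer = 0;

{ Stand-in for a non-blocking lock attempt; it simply succeeds on the
  third call so the retry path is exercised. }
function FakeTryLock: Boolean;
begin
  Inc(Attempts);
  Result := Attempts >= 3;
end;

{ Retries TryLock until it succeeds or TimeoutMs elapses, doubling the
  sleep between attempts and capping it at MaxDelayMs. }
function AcquireWithBackoff(TryLock: TTryLock; TimeoutMs: QWord): Boolean;
const
  MaxDelayMs = 500;
var
  Deadline: QWord;
  DelayMs: Cardinal;
begin
  Deadline := GetTickCount64 + TimeoutMs;
  DelayMs := 10;
  repeat
    if TryLock() then
      Exit(True);
    Sleep(DelayMs);
    DelayMs := DelayMs * 2;
    if DelayMs > MaxDelayMs then
      DelayMs := MaxDelayMs;
  until GetTickCount64 >= Deadline;
  Result := False;
end;

begin
  { 30000 ms matches the default timeout mentioned above. }
  if AcquireWithBackoff(@FakeTryLock, 30000) then
    WriteLn('locked after ', Attempts, ' attempts')
  else
    WriteLn('timed out');
end.
```

With the stand-in above the program reports success after three attempts; a real caller would fire the lock-acquired event at that point.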

Events

TMessageEvents lets callers subscribe one or more handlers to receive metBaseOpened, metMessageRead, metMessageWritten, metLockAcquired, metPackProgress, etc. Internally the dispatcher serialises calls so handlers do not need to be reentrant.
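
A serialising dispatcher along these lines could be built on a critical section. The sketch below is only a guess at the shape: TSketchEvents, TSketchHandler, Subscribe and Fire are hypothetical names, and the enumeration reuses three of the event names listed above.

```pascal
program EventsSketch;
{$mode objfpc}{$H+}

uses
  SyncObjs;

type
  TSketchEventType = (metBaseOpened, metMessageRead, metMessageWritten);
  TSketchHandler = procedure(EventType: TSketchEventType; const Info: string);

  { Hypothetical dispatcher: a handler list guarded by one lock. }
  TSketchEvents = class
  private
    FLock: TCriticalSection;
    FHandlers: array of TSketchHandler;
  public
    constructor Create;
    destructor Destroy; override;
    procedure Subscribe(Handler: TSketchHandler);
    procedure Fire(EventType: TSketchEventType; const Info: string);
  end;

constructor TSketchEvents.Create;
begin
  inherited Create;
  FLock := TCriticalSection.Create;
end;

destructor TSketchEvents.Destroy;
begin
  FLock.Free;
  inherited Destroy;
end;

procedure TSketchEvents.Subscribe(Handler: TSketchHandler);
begin
  FLock.Acquire;
  try
    SetLength(FHandlers, Length(FHandlers) + 1);
    FHandlers[High(FHandlers)] := Handler;
  finally
    FLock.Release;
  end;
end;

procedure TSketchEvents.Fire(EventType: TSketchEventType; const Info: string);
var
  i: Integer;
begin
  FLock.Acquire;   { one event at a time, whichever thread fires it }
  try
    for i := 0 to High(FHandlers) do
      FHandlers[i](EventType, Info);
  finally
    FLock.Release;
  end;
end;

var
  Seen: Integer = 0;
  Events: TSketchEvents;

procedure CountHandler(EventType: TSketchEventType; const Info: string);
begin
  Inc(Seen);       { safe without its own lock: calls are serialised }
end;

procedure LogHandler(EventType: TSketchEventType; const Info: string);
begin
  WriteLn('event ', Ord(EventType), ': ', Info);
end;

begin
  Events := TSketchEvents.Create;
  try
    Events.Subscribe(@CountHandler);
    Events.Subscribe(@LogHandler);
    Events.Fire(metBaseOpened, 'demo.jam');
    Events.Fire(metMessageRead, 'msg 1');
    WriteLn('handled: ', Seen);
  finally
    Events.Free;
  end;
end.
```

Holding the lock for the whole dispatch is the simplest way to get the serialisation described above, at the cost of slow handlers delaying other threads.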

Concurrent tossers

TPacketBatch owns a queue of .pkt paths and a worker thread pool. Each worker opens its packet, reads messages, hands each to the caller-provided processor. The batch caches one TMessageBase per destination area so writes serialise through layer-1 locking; layer-2 keeps separate processes (e.g. an editor) safe at the same time.

Behavioural fidelity

Every format backend is implemented from the published format specification (FTSC documents and the original format authors' own spec papers — see docs/ftsc-compliance.md). Tests read and write real sample bases captured from working BBS installations; round-trip tests verify byte-for-byte preservation across read → write → read cycles.