fpc-msgbase/docs/architecture.md
Ken Johnson 850cc65ee3 Milestone 0.3.3: HWM for Hudson + GoldBase + Board context
Adds per-(user, board) HWM for the QuickBBS-family multi-board
formats. The same physical Hudson/GoldBase base file set holds
ALL boards (1..200 for Hudson, 1..500 for GoldBase) in one
LASTREAD.BBS / LASTREAD.DAT file, indexed by user number with
one word slot per board. Caller has to provide both pieces of
context before HWM operations make sense:

- base.MapUser('NetReader', 60001)   - pick a numeric user ID
- base.Board := 5                    - which board this scan is for

src/ma.api.pas:
- New TMessageBase.Board property (longint, default 0).
- Single-area formats (JAM, Squish) ignore it.
- Multi-board formats return -1 from GetHWM when Board <= 0.

src/formats/ma.fmt.hudson.pas:
- New HudsonLastRead record matching QuickBBS LASTREAD.BBS layout.
- TJamBase pattern: FLrStream lazy + EnsureLrStream +
  GetLastRead(user, board) + SetLastRead(user, board, msgnum).
- SetLastRead extends file with zeros to reach the user slot,
  matching QuickBBS convention.
- Uses fpOpen on Unix (same FPC auto-flock workaround as Squish).

src/formats/ma.fmt.goldbase.pas:
- Same shape, GoldBaseLastRead with GOLDBASE_MAX_BOARDS (500)
  slots, file is LASTREAD.DAT.

Both .uni adapters wire DoSupportsHWM/DoGetHWMById/DoSetHWMById
to the new native methods, gating on Board > 0.

tests/test_hwm.pas: 3 new tests covering Hudson + GoldBase:
- TestHudsonRequiresMapUserAndBoard verifies -1 returns when
  MapUser missing, Board missing, or both.
- TestHudsonSetGetPersistence covers two users on two boards
  with cross-session persistence.
- TestGoldBaseSetGet covers a high board number (250) to
  exercise the wider GOLDBASE_MAX_BOARDS range.

Updated docs/architecture.md HWM coverage map: Hudson and
GoldBase moved from deferred to native. EzyCom still deferred
(per-area layout differs); Wildcat/PCBoard still -1.

Suite: 40/40 across 9 programs (test_hwm now 11/11).

NetReader and similar consumers can now register tossers as
high-numbered users (60000+) and walk per-board HWM the way
Allfix has historically done. Tossers coexist with human BBS
users in the same LASTREAD file (different user slots).
2026-04-18 06:12:23 -07:00

fpc-msgbase — architecture

Layers

        ┌──────────────────────────────────────────────────┐
        │  Caller (BBS, tosser, editor, importer, …)       │
        └──────────────────────────────────────────────────┘
                              │
                              ▼
        ┌──────────────────────────────────────────────────┐
        │  ma.api (TMessageBase, factory, TUniMessage)     │
        ├──────────────────────────────────────────────────┤
        │  ma.events   ma.lock   ma.paths                  │
        │  ma.batch (concurrent tosser helper)             │
        ├──────────────────────────────────────────────────┤
        │  Format backends — one .pas per format           │
        │  ma.fmt.hudson   ma.fmt.jam      ma.fmt.squish   │
        │  ma.fmt.msg      ma.fmt.pkt      ma.fmt.pcboard  │
        │  ma.fmt.ezycom   ma.fmt.goldbase ma.fmt.wildcat  │
        ├──────────────────────────────────────────────────┤
        │  RTL: TFileStream, BaseUnix/Windows for locking  │
        └──────────────────────────────────────────────────┘

Polymorphism

Every backend descends from TMessageBase and implements the abstract DoOpen, DoClose, DoMessageCount, DoReadMessage, DoWriteMessage contract. Callers can either:

  1. Use the unified API — MessageBaseOpen(format, path, mode) returns a TMessageBase. Read/write through TUniMessage. Format-agnostic.
  2. Drop down to format-specific class methods (e.g. TJamBase.IncModCounter, TSquishBase.SqHashName) when they need behaviour the unified API cannot express. Each backend keeps its rich API public.

TUniMessage — two-area model

TUniMessage = record
  Body:       AnsiString;       { only the message text }
  Attributes: TMsgAttributes;   { everything else, key/value }
end;

Two areas, no surprises:

  • Body carries the user-visible message text and nothing else. Never kludge lines, never headers, never SEEN-BY/PATH. Always a ready-to-display blob.
  • Attributes carries every other piece of data: From, To, Subject, dates, addresses, attribute bits, FTSC kludges (MSGID, ReplyID, PID, SEEN-BY, PATH, …), and per-format extras (jam.msgidcrc, squish.umsgid, pcb.confnum, …).

Same model as RFC 822 email (headers + body). Lossless round-trip across Read → Write → Read is enforced by the regression suite in tests/test_roundtrip_attrs.pas.

The library never composes presentation. A BBS that wants to display kludges inline walks Attributes and prepends ^aMSGID: etc. to its own display. A BBS that hides kludges just shows Body. A tosser that needs MSGID for dupe detection reads Attributes.Get('msgid') directly — no body parsing required.
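The split between stored data and presentation can be sketched with a toy message type. This is a minimal illustration, not the library's code: TToyMessage, Render, and the use of a TStringList for attributes are hypothetical stand-ins, and the toy list holds only kludge keys.

```pascal
program RenderDemo;
{$mode objfpc}{$H+}
uses Classes, SysUtils;

type
  { Hypothetical stand-in for TUniMessage: body text plus name=value attributes. }
  TToyMessage = record
    Body: AnsiString;
    Attributes: TStringList;
  end;

{ Builds display text. The caller, not the library, decides whether
  kludge lines are shown in front of the body. }
function Render(const Msg: TToyMessage; WithKludges: boolean): AnsiString;
var
  i: integer;
begin
  Result := '';
  if WithKludges then
    for i := 0 to Msg.Attributes.Count - 1 do
      Result := Result + '^a' + UpperCase(Msg.Attributes.Names[i]) + ': ' +
                Msg.Attributes.ValueFromIndex[i] + LineEnding;
  Result := Result + Msg.Body;
end;

var
  M: TToyMessage;
begin
  M.Body := 'Hello, world.';
  M.Attributes := TStringList.Create;
  try
    M.Attributes.Values['msgid'] := '1:2/3 1234abcd';
    M.Attributes.Values['pid'] := 'ToyEd 1.0';
    WriteLn(Render(M, True));    { kludge lines, then the body }
    WriteLn(Render(M, False));   { body only }
  finally
    M.Attributes.Free;
  end;
end.
```

The body string is never modified in either mode, which is the property the two-area model relies on.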

Dates land in TDateTime regardless of how the backend stored them (Hudson MM-DD-YY strings with 1950 pivot, Squish FTS-0001 strings, JAM Unix timestamps, PCBoard / EzyCom DOS PackTime). They are stored in Attributes as date.written / date.received and accessed through SetDate / GetDate.
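As a worked example of the Hudson case, the following sketch expands a two-digit year around a 1950 pivot and converts an MM-DD-YY string to TDateTime. PivotYear and ParseHudsonDate are hypothetical names, and the exact boundary (50 maps to 1950, 49 maps to 2049) is an assumption about how the pivot is applied.

```pascal
program HudsonDate;
{$mode objfpc}{$H+}
uses SysUtils;

{ Expands a two-digit year around a 1950 pivot:
  50..99 -> 1950..1999, 00..49 -> 2000..2049 (assumed boundary). }
function PivotYear(YY: integer): integer;
begin
  if YY >= 50 then
    Result := 1900 + YY
  else
    Result := 2000 + YY;
end;

{ Parses 'MM-DD-YY' into a TDateTime; raises EConvertError on malformed input. }
function ParseHudsonDate(const S: AnsiString): TDateTime;
var
  M, D, Y: integer;
begin
  if (Length(S) <> 8) or (S[3] <> '-') or (S[6] <> '-') then
    raise EConvertError.Create('expected MM-DD-YY: ' + S);
  M := StrToInt(Copy(S, 1, 2));
  D := StrToInt(Copy(S, 4, 2));
  Y := PivotYear(StrToInt(Copy(S, 7, 2)));
  Result := EncodeDate(Y, M, D);
end;

begin
  WriteLn(FormatDateTime('yyyy-mm-dd', ParseHudsonDate('04-18-26')));  { 2026-04-18 }
  WriteLn(FormatDateTime('yyyy-mm-dd', ParseHudsonDate('12-31-93')));  { 1993-12-31 }
end.
```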

Format-specific bit fields (Hudson byte attr, JAM 32-bit attr, Squish attr, MSG word attr, PCB status, EzyCom dual byte) are unrolled into individual attr.* boolean attributes on Read via UniAttrBitsToAttributes and recomposed on Write via UniAttrBitsFromAttributes and the per-format XxxAttrFromUni helpers. The canonical MSG_ATTR_* cardinal bitset stays as the internal pivot.
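The unroll/recompose round trip can be sketched as below. The bit values, the three attr.* keys chosen, and the helper names BitsToAttributes / AttributesToBits are illustrative assumptions; the library's MSG_ATTR_* constants and helper signatures may differ.

```pascal
program AttrBits;
{$mode objfpc}{$H+}
uses Classes;

const
  { Hypothetical bit values for illustration only. }
  BIT_PRIVATE = $0001;
  BIT_CRASH   = $0002;
  BIT_LOCAL   = $0100;

  BitNames: array[0..2] of AnsiString = ('attr.private', 'attr.crash', 'attr.local');
  BitMasks: array[0..2] of cardinal = (BIT_PRIVATE, BIT_CRASH, BIT_LOCAL);

{ Read direction: one boolean attribute per set bit. }
procedure BitsToAttributes(Bits: cardinal; Attrs: TStrings);
var
  i: integer;
begin
  for i := Low(BitMasks) to High(BitMasks) do
    if (Bits and BitMasks[i]) <> 0 then
      Attrs.Values[BitNames[i]] := 'true';
end;

{ Write direction: recompose the bitset from the boolean attributes. }
function AttributesToBits(Attrs: TStrings): cardinal;
var
  i: integer;
begin
  Result := 0;
  for i := Low(BitMasks) to High(BitMasks) do
    if Attrs.Values[BitNames[i]] = 'true' then
      Result := Result or BitMasks[i];
end;

var
  Attrs: TStringList;
  Original, RoundTrip: cardinal;
begin
  Original := BIT_PRIVATE or BIT_LOCAL;
  Attrs := TStringList.Create;
  try
    BitsToAttributes(Original, Attrs);
    RoundTrip := AttributesToBits(Attrs);
    WriteLn(Attrs.CommaText);
    WriteLn(RoundTrip = Original);
  finally
    Attrs.Free;
  end;
end.
```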

High-Water Mark (HWM) — per-user scanner pointer

Tossers, scanners, and editors that want to track "last message I processed for user X" can use the per-user HWM API on TMessageBase:

function  SupportsHWM: boolean;
function  GetHWM(const UserName: AnsiString): longint;
procedure SetHWM(const UserName: AnsiString; MsgNum: longint);
procedure MapUser(const UserName: AnsiString; UserId: longint);
property  ActiveUser: AnsiString;     { auto-bump on Read }

HWM uses the format's native lastread mechanism, not a sidecar. A tosser registers itself as just another user ('NetReader', 'Allfix', 'FidoMail-Toss') and its HWM lives in the same file the BBS uses for human-user lastread, so multiple consumers naturally coexist without colliding.

Coverage:

Format     HWM mechanism
JAM        .JLR (CRC32(lower(name)))
Squish     .SQL (CRC32(lower(name)))
Hudson     LASTREAD.BBS per-(user-id, board); needs MapUser + Board
GoldBase   LASTREAD.DAT per-(user-id, board); needs MapUser + Board
EzyCom     per-area lastread; deferred
Wildcat    SDK exposes MarkMsgRead per-message but no per-user HWM primitive
PCBoard    USERS file lastread per-conference; deferred
MSG, PKT   spec has no HWM concept
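For the JAM and Squish rows, the name-to-key step CRC32(lower(name)) might look like the sketch below. It uses the common reflected CRC-32 (polynomial $EDB88320, initial value $FFFFFFFF, final inversion). Whether the formats or the library apply the final inversion is not stated here, so treat that detail, and the function names, as assumptions.

```pascal
program NameKeyDemo;
{$mode objfpc}{$H+}
uses SysUtils;

{ Bitwise CRC-32: reflected polynomial $EDB88320, initial value $FFFFFFFF,
  result inverted at the end. The final inversion is an assumption of this
  sketch; some JAM tooling may omit it. }
function Crc32(const S: AnsiString): cardinal;
var
  i, bit: integer;
begin
  Result := $FFFFFFFF;
  for i := 1 to Length(S) do
  begin
    Result := Result xor Byte(S[i]);
    for bit := 1 to 8 do
      if (Result and 1) <> 0 then
        Result := (Result shr 1) xor cardinal($EDB88320)
      else
        Result := Result shr 1;
  end;
  Result := not Result;
end;

{ Case-insensitive key for a user name: hash of the lowercased name. }
function NameKey(const UserName: AnsiString): cardinal;
begin
  Result := Crc32(LowerCase(UserName));
end;

begin
  { Sanity check against the well-known CRC-32 check value. }
  WriteLn(IntToHex(Crc32('123456789'), 8));               { CBF43926 }
  WriteLn(NameKey('NetReader') = NameKey('NETREADER'));   { TRUE }
end.
```

Lowercasing before hashing is what lets 'NetReader' and 'NETREADER' resolve to the same lastread slot.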

For the multi-board formats (Hudson, GoldBase) the caller must set both:

  • base.MapUser('NetReader', 60001) — pick a numeric user ID (use 60000+ to avoid colliding with real BBS users).
  • base.Board := N — the board / conference number this scan is for. The same physical Hudson base contains all 200 boards; HWM is per-(user, board).

If either is missing, GetHWM returns -1.
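The per-(user, board) addressing can be sketched as follows, based on the description of the lastread file as indexed by user number with one word slot per board. This is a toy model over an in-memory stream, not the backend's code: zero-based user records, 1-based boards, little-endian words, and returning 0 for a slot past end of file are all assumptions.

```pascal
program LastReadSlots;
{$mode objfpc}{$H+}
uses Classes;

const
  MAX_BOARDS = 200;  { Hudson; GoldBase would use 500 }

{ Byte offset of the word slot for (user, board).
  Assumes zero-based user records and 1-based boards. }
function SlotOffset(UserId, Board: longint): int64;
begin
  Result := (int64(UserId) * MAX_BOARDS + (Board - 1)) * SizeOf(word);
end;

{ Returns -1 when no valid user/board is selected, 0 when the slot
  lies past end of file (assumed to mean "nothing read yet"). }
function GetLastRead(S: TStream; UserId, Board: longint): longint;
var
  W: word;
begin
  Result := -1;
  if (Board < 1) or (Board > MAX_BOARDS) or (UserId < 0) then Exit;
  Result := 0;
  if SlotOffset(UserId, Board) + SizeOf(word) > S.Size then Exit;
  S.Position := SlotOffset(UserId, Board);
  S.ReadBuffer(W, SizeOf(W));
  Result := LEtoN(W);
end;

{ Extends the stream with zeros up to the end of the user's record,
  then writes the slot. }
procedure SetLastRead(S: TStream; UserId, Board: longint; MsgNum: word);
var
  RecEnd: int64;
  Zero: byte;
  W: word;
begin
  if (Board < 1) or (Board > MAX_BOARDS) or (UserId < 0) then Exit;
  RecEnd := SlotOffset(UserId, MAX_BOARDS) + SizeOf(word);
  Zero := 0;
  S.Position := S.Size;
  while S.Position < RecEnd do
    S.WriteBuffer(Zero, 1);
  W := NtoLE(MsgNum);
  S.Position := SlotOffset(UserId, Board);
  S.WriteBuffer(W, SizeOf(W));
end;

var
  F: TMemoryStream;
begin
  F := TMemoryStream.Create;
  try
    SetLastRead(F, 3, 5, 1234);
    WriteLn(F.Size);                 { 1600 = 4 records * 200 boards * 2 bytes }
    WriteLn(GetLastRead(F, 3, 5));   { 1234 }
    WriteLn(GetLastRead(F, 3, 6));   { 0 }
    WriteLn(GetLastRead(F, 3, 0));   { -1: no board selected }
  finally
    F.Free;
  end;
end.
```

With this layout a tosser at user slot 60001 and a human at slot 12 never touch the same bytes, which is why the two can share one file.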

For unsupported formats SupportsHWM returns false and GetHWM returns -1; SetHWM is a no-op. The caller falls back to its own state for those formats (e.g. NetReader's dupedb).

Auto-bump pattern for scanners:

base.ActiveUser := 'NetReader';
for i := 0 to base.MessageCount - 1 do begin
  base.ReadMessage(i, msg);
  { ... process msg ... }
  { HWM auto-tracks the highest msg.num seen for NetReader. }
end;

When ActiveUser is set, ReadMessage calls SetHWM after each successful read if the just-read msg.num is strictly greater than the current HWM. The HWM never decrements: reading a lower-numbered message is a no-op. Auto-bump is off by default (ActiveUser = '').
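The monotonic rule is small enough to show in isolation. In this sketch BumpHWM is a hypothetical helper, and the starting HWM of 0 is an assumption.

```pascal
program AutoBump;
{$mode objfpc}{$H+}

{ Returns the HWM after reading message MsgNum: raised only when MsgNum
  is strictly greater than the current value, never lowered. }
function BumpHWM(Current, MsgNum: longint): longint;
begin
  if MsgNum > Current then
    Result := MsgNum
  else
    Result := Current;
end;

const
  { Message numbers in the order a scanner happens to read them. }
  ReadOrder: array[0..4] of longint = (3, 7, 5, 7, 9);
var
  Hwm, i: longint;
begin
  Hwm := 0;
  for i := Low(ReadOrder) to High(ReadOrder) do
  begin
    Hwm := BumpHWM(Hwm, ReadOrder[i]);
    WriteLn('read ', ReadOrder[i], ' -> HWM ', Hwm);
  end;
end.
```

Reading 5 after 7 leaves the HWM at 7, and the run ends with the HWM at 9.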

Multi-tenant by design: every scanner / tosser gets its own slot in the lastread file, keyed by its name. NetReader registers as 'NetReader', Allfix as 'Allfix', Fimail as 'FidoMail-Toss'; they all coexist in .JLR / .SQL without interfering with each other or with human-user lastread.

Pack/purge is the format's responsibility: each backend's Pack rewrites the lastread file in step with the message renumbering. For JAM and Squish this is handled natively.

Capabilities API — backend self-description

Each backend declares the canonical list of attribute keys it understands via a class function:

class function TMessageBase.ClassSupportedAttributes: TStringDynArray;

Callers query before setting:

if base.SupportsAttribute('attr.returnreceipt') then
  RenderReceiptCheckbox
else
  HideReceiptCheckbox;

Backends silently ignore unknown attributes on Write (RFC 822 X-header semantics — fine for forward compatibility); the capabilities API exists so callers know in advance which keys won't survive on a given format. The full per-format support matrix lives in docs/attributes-registry.md.
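One way such a capabilities query could be wired is sketched below with toy classes. Only the names ClassSupportedAttributes and SupportsAttribute come from the text above; TToyBase, TToyRichBase, the key lists, and the lookup logic are assumptions for illustration.

```pascal
program Capabilities;
{$mode objfpc}{$H+}
uses Types;

type
  { Toy stand-in for TMessageBase. }
  TToyBase = class
    class function ClassSupportedAttributes: TStringDynArray; virtual;
    function SupportsAttribute(const Key: AnsiString): boolean;
  end;

  { Toy backend that understands one extra key. }
  TToyRichBase = class(TToyBase)
    class function ClassSupportedAttributes: TStringDynArray; override;
  end;

class function TToyBase.ClassSupportedAttributes: TStringDynArray;
begin
  SetLength(Result, 2);
  Result[0] := 'from';
  Result[1] := 'subject';
end;

function TToyBase.SupportsAttribute(const Key: AnsiString): boolean;
var
  Keys: TStringDynArray;
  i: integer;
begin
  Keys := ClassSupportedAttributes;   { dispatches to the actual backend class }
  for i := 0 to High(Keys) do
    if Keys[i] = Key then Exit(True);
  Result := False;
end;

class function TToyRichBase.ClassSupportedAttributes: TStringDynArray;
begin
  Result := inherited ClassSupportedAttributes;
  SetLength(Result, Length(Result) + 1);
  Result[High(Result)] := 'attr.returnreceipt';
end;

var
  Plain, Rich: TToyBase;
begin
  Plain := TToyBase.Create;
  Rich := TToyRichBase.Create;
  try
    WriteLn(Plain.SupportsAttribute('attr.returnreceipt'));  { FALSE }
    WriteLn(Rich.SupportsAttribute('attr.returnreceipt'));   { TRUE }
  finally
    Rich.Free;
    Plain.Free;
  end;
end.
```

Making the list a class function means a caller can inspect a format's capabilities without opening a base.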

Locking

Three layers, applied in order on every Open:

  1. In-process: TRTLCriticalSection per TMessageBase instance.
  2. Cross-process: advisory lock on a sentinel file (<base>.lck or, for Squish, <base>.SQL so we coexist with other Squish-aware tools). fpflock with LOCK_EX or LOCK_SH on Unix, LockFileEx on Windows. Retry with backoff up to a configurable timeout (default 30s). Lock acquire/release fires events.
  3. OS share modes: fmShareDenyWrite for writers, fmShareDenyNone for readers, matching DOS-era multi-process sharing conventions every classic format expects.
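The retry-with-backoff step in layer 2 might be structured like the sketch below. The doubling delay, its 10 ms start and 1 s cap, and all names are assumptions; the fake lock function stands in for a non-blocking fpflock or LockFileEx attempt.

```pascal
program LockRetry;
{$mode objfpc}{$H+}
uses SysUtils;

type
  TTryLock = function: boolean;

var
  Attempts: integer = 0;

{ Fake non-blocking lock attempt for the demo: succeeds on the third call.
  A real implementation would try the OS lock without blocking. }
function FakeTryLock: boolean;
begin
  Inc(Attempts);
  Result := Attempts >= 3;
end;

{ Retries TryLock with doubling delays until it succeeds or TimeoutMs elapses.
  Always makes at least one attempt. }
function AcquireWithBackoff(TryLock: TTryLock; TimeoutMs: cardinal): boolean;
var
  Deadline: QWord;
  DelayMs: cardinal;
begin
  Deadline := GetTickCount64 + TimeoutMs;
  DelayMs := 10;
  repeat
    if TryLock() then Exit(True);
    Sleep(DelayMs);
    if DelayMs < 1000 then DelayMs := DelayMs * 2;
  until GetTickCount64 >= Deadline;
  Result := False;
end;

var
  Ok: boolean;
begin
  Ok := AcquireWithBackoff(@FakeTryLock, 30000);
  WriteLn(Ok, ' after ', Attempts, ' attempts');
end.
```

Capping the delay keeps a long wait from overshooting the timeout by more than about one second.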

Events

TMessageEvents lets callers subscribe one or more handlers to receive metBaseOpened, metMessageRead, metMessageWritten, metLockAcquired, metPackProgress, etc. Internally the dispatcher serialises calls so handlers do not need to be reentrant.
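A serialising dispatcher can be sketched with a critical section around the handler loop. TToyEvents and its members are hypothetical and say nothing about how TMessageEvents is implemented. Note that a critical section serialises calls across threads but still lets the owning thread re-enter.

```pascal
program EventDispatch;
{$mode objfpc}{$H+}

type
  TToyEventType = (tetBaseOpened, tetMessageRead);
  TToyHandler = procedure(Ev: TToyEventType);

  { Toy dispatcher: one lock guards both the handler list and dispatch. }
  TToyEvents = class
  private
    FLock: TRTLCriticalSection;
    FHandlers: array of TToyHandler;
  public
    constructor Create;
    destructor Destroy; override;
    procedure Subscribe(H: TToyHandler);
    procedure Fire(Ev: TToyEventType);
  end;

constructor TToyEvents.Create;
begin
  inherited Create;
  InitCriticalSection(FLock);
end;

destructor TToyEvents.Destroy;
begin
  DoneCriticalSection(FLock);
  inherited Destroy;
end;

procedure TToyEvents.Subscribe(H: TToyHandler);
begin
  EnterCriticalSection(FLock);
  try
    SetLength(FHandlers, Length(FHandlers) + 1);
    FHandlers[High(FHandlers)] := H;
  finally
    LeaveCriticalSection(FLock);
  end;
end;

{ Handlers run one at a time, in subscription order, under the lock. }
procedure TToyEvents.Fire(Ev: TToyEventType);
var
  i: integer;
begin
  EnterCriticalSection(FLock);
  try
    for i := 0 to High(FHandlers) do
      FHandlers[i](Ev);
  finally
    LeaveCriticalSection(FLock);
  end;
end;

var
  Seen: integer = 0;

procedure CountHandler(Ev: TToyEventType);
begin
  Inc(Seen);
  WriteLn('event ', Ord(Ev), ', total seen ', Seen);
end;

var
  Events: TToyEvents;
begin
  Events := TToyEvents.Create;
  try
    Events.Subscribe(@CountHandler);
    Events.Fire(tetBaseOpened);
    Events.Fire(tetMessageRead);
  finally
    Events.Free;
  end;
end.
```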

Concurrent tossers

TPacketBatch owns a queue of .pkt paths and a worker thread pool. Each worker opens its packet, reads messages, and hands each one to the caller-provided processor. The batch caches one TMessageBase per destination area so writes serialise through layer-1 locking; layer-2 keeps separate processes (e.g. an editor) safe at the same time.

Behavioural fidelity

Every format backend is implemented from the published format specification (FTSC documents and the original format authors' own spec papers — see docs/ftsc-compliance.md). Tests read and write real sample bases captured from working BBS installations; round-trip tests verify byte-for-byte preservation across read → write → read cycles.