Ask 1 from fpc-binkp consumer thread: non-storage libraries
(fpc-ftn-transport, fpc-binkp, future fpc-comet-proto / fpc-emsi,
SQL-backed messaging like Fastway) only need TFTNAddress, not the
full 1041-line mb.types. Extract to src/mb.address.pas (~90 lines,
only SysUtils) so they can cp a single file into their project.
mb.types continues to uses mb.address so existing callers see the
type transitively -- BUT FPC does not propagate record-field access
through re-export, so consumers that touch TFTNAddress.Zone/Net/
Node/Point directly must add mb.address to their own uses clause.
All 7 in-tree .uni adapters, 2 examples, 5 test harnesses updated.
No behavioural change. Full suite passes, multi-target build
green (x86_64-linux, i386-{linux,freebsd,win32,os2,go32v2}).
302 lines
13 KiB
Markdown
# fpc-msgbase — architecture

## Layers

```
┌──────────────────────────────────────────────────┐
│  Caller (BBS, tosser, editor, importer, …)       │
└──────────────────────────────────────────────────┘
                         │
                         ▼
┌──────────────────────────────────────────────────┐
│  mb.api (TMessageBase, factory, TUniMessage)     │
├──────────────────────────────────────────────────┤
│  mb.events  mb.lock  mb.paths  mb.kludge         │
├──────────────────────────────────────────────────┤
│  Format backends — two .pas units per format:    │
│    mb.fmt.<fmt>      - native record + I/O class │
│    mb.fmt.<fmt>.uni  - TMessageBase adapter      │
│    mb.fmt.hudson(.uni)    mb.fmt.jam(.uni)       │
│    mb.fmt.squish(.uni)    mb.fmt.msg(.uni)       │
│    mb.fmt.pcboard(.uni)   mb.fmt.ezycom(.uni)    │
│    mb.fmt.goldbase(.uni)  mb.fmt.wildcat(.uni)   │
├──────────────────────────────────────────────────┤
│  RTL: TFileStream, BaseUnix/Windows for locking  │
└──────────────────────────────────────────────────┘

┌──────────────────────────────────────────────────┐
│  Sibling: fpc-ftn-transport                      │
│  tt.pkt.format / tt.pkt.reader (registers        │
│  mbfPkt) / tt.pkt.writer / tt.pkt.batch          │
│  plus forthcoming BSO / ArcMail / drop modules.  │
└──────────────────────────────────────────────────┘
```

PKT is a wire format and lives in `fpc-ftn-transport`, not here.
The `mbfPkt` enum value stays in `mb.types` so `tt.pkt.reader`
can register the backend with the unified-API factory. Consumers
wanting to iterate `.pkt` files just `uses tt.pkt.reader` and
call `MessageBaseOpen(mbfPkt, ...)` as usual. TPacketBatch
(was here as `ma.batch`) moved with it as `tt.pkt.batch`.

**Integration gotcha:** to use a backend through the unified
`TMessageBase` API you must include the `.uni` adapter unit in
your `uses` clause, not just the native `mb.fmt.<format>` unit.
The adapter's `initialization` block is what registers the
backend with the factory.

```pascal
uses
  mb.types, mb.events, mb.api,
  mb.fmt.jam, mb.fmt.jam.uni;   { both — .uni is what registers }
```

Forgetting `.uni` produces `EMessageBase: No backend registered
for JAM` at the first `MessageBaseOpen(mbfJam, ...)` call. The
exception message hints at the fix.

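
The registration step described above can be modelled in a few lines.
This is a minimal, self-contained sketch of a register-then-look-up
factory, not the library's code: `TFormatId`, `TBackendFactory`,
`RegisterBackend`, `OpenBackend` and `MakeJam` are hypothetical
stand-ins, and the real factory returns a `TMessageBase`, not a string.

```pascal
program RegistrySketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

type
  { Hypothetical stand-ins for the format enum and the factory callback. }
  TFormatId = (fmtJam, fmtSquish);
  TBackendFactory = function: string;

var
  { Global variables start zeroed, so every slot is nil (unregistered). }
  Registry: array[TFormatId] of TBackendFactory;

procedure RegisterBackend(Fmt: TFormatId; Factory: TBackendFactory);
begin
  Registry[Fmt] := Factory;
end;

function OpenBackend(Fmt: TFormatId): string;
begin
  if not Assigned(Registry[Fmt]) then
    raise Exception.Create('No backend registered');
  Result := Registry[Fmt]();
end;

function MakeJam: string;
begin
  Result := 'JAM backend';
end;

begin
  { In a unit, this call would sit in the initialization block. }
  RegisterBackend(fmtJam, @MakeJam);
  WriteLn(OpenBackend(fmtJam));
  try
    { Nothing registered Squish, so the lookup raises. }
    WriteLn(OpenBackend(fmtSquish));
  except
    on E: Exception do
      WriteLn('Error: ', E.Message);
  end;
end.
```

The second lookup fails for the same reason a missing `.uni` unit does:
nothing ever ran the registration call for that format.
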
## Polymorphism

Every backend descends from `TMessageBase` and implements the abstract
`DoOpen`, `DoClose`, `DoMessageCount`, `DoReadMessage`, `DoWriteMessage`
contract. Callers can either:

1. Use the unified API — `MessageBaseOpen(format, path, mode)` returns a
   `TMessageBase`. Read/write through `TUniMessage`. Format-agnostic.
2. Drop down to format-specific class methods (e.g. `TJamBase.IncModCounter`,
   `TSquishBase.SqHashName`) when they need behaviour the unified API cannot
   express. Each backend keeps its rich API public.

## TUniMessage — two-area model

```pascal
TUniMessage = record
  Body:       AnsiString;        { only the message text }
  Attributes: TMsgAttributes;    { everything else, key/value }
end;
```

Two areas, no surprises:

- **Body** carries the user-visible message text and nothing else.
  Never kludge lines, never headers, never SEEN-BY/PATH. Always a
  ready-to-display blob.
- **Attributes** carries every other piece of data: From, To,
  Subject, dates, addresses, attribute bits, FTSC kludges (MSGID,
  ReplyID, PID, SEEN-BY, PATH, …), and per-format extras
  (`jam.msgidcrc`, `squish.umsgid`, `pcb.confnum`, …).

Same model as RFC 822 email (headers + body). Lossless round-trip
across Read → Write → Read is enforced by the regression suite in
`tests/test_roundtrip_attrs.pas`.

**The library never composes presentation.** A BBS that wants to
display kludges inline walks `Attributes` and prepends `^aMSGID:`
etc. to its own display. A BBS that hides kludges just shows
`Body`. A tosser that needs MSGID for dupe detection reads
`Attributes.Get('msgid')` directly — no body parsing required.

Dates land in `TDateTime` regardless of how the backend stored
them (Hudson `MM-DD-YY` strings with 1950 pivot, Squish FTS-0001
strings, JAM Unix timestamps, PCBoard / EzyCom DOS PackTime).
Stored in attributes as `date.written` / `date.received` via
`SetDate` / `GetDate`.

Format-specific bit fields (Hudson byte attr, JAM 32-bit attr,
Squish attr, MSG word attr, PCB status, EzyCom dual byte) are
unrolled into individual `attr.*` boolean attributes on Read via
`UniAttrBitsToAttributes` and recomposed on Write via
`UniAttrBitsFromAttributes` and the per-format `XxxAttrFromUni`
helpers. The canonical `MSG_ATTR_*` cardinal bitset stays as the
internal pivot.

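
The two-digit-year pivot mentioned for Hudson dates above can be
illustrated with a short, self-contained sketch. It assumes "1950 pivot"
means `YY` 50..99 maps to 19YY and 00..49 maps to 20YY, and that fields
are zero-padded. `HudsonDateToDateTime` is a hypothetical name, not the
backend's actual helper.

```pascal
program HudsonDateSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

{ Parses 'MM-DD-YY'. Assumed pivot: YY >= 50 -> 19YY, otherwise 20YY. }
function HudsonDateToDateTime(const S: string): TDateTime;
var
  Month, Day, Year: Integer;
begin
  if (Length(S) <> 8) or (S[3] <> '-') or (S[6] <> '-') then
    raise EConvertError.CreateFmt('Not an MM-DD-YY date: "%s"', [S]);
  Month := StrToInt(Copy(S, 1, 2));
  Day   := StrToInt(Copy(S, 4, 2));
  Year  := StrToInt(Copy(S, 7, 2));
  if Year >= 50 then
    Inc(Year, 1900)
  else
    Inc(Year, 2000);
  { EncodeDate raises EConvertError on impossible dates such as 02-30. }
  Result := EncodeDate(Year, Month, Day);
end;

begin
  WriteLn(FormatDateTime('yyyy-mm-dd', HudsonDateToDateTime('12-31-99')));
  WriteLn(FormatDateTime('yyyy-mm-dd', HudsonDateToDateTime('01-15-24')));
end.
```
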
### High-Water Mark (HWM) — per-user scanner pointer

Tossers, scanners, and editors that want to track "last message I
processed for user X" can use the per-user HWM API on
`TMessageBase`:

```pascal
function  SupportsHWM: boolean;
function  GetHWM(const UserName: AnsiString): longint;
procedure SetHWM(const UserName: AnsiString; MsgNum: longint);
procedure MapUser(const UserName: AnsiString; UserId: longint);
property  ActiveUser: AnsiString;   { auto-bump on Read }
```

HWM uses the format's native lastread mechanism, not a sidecar.
A tosser registers itself as just another user (`'NetReader'`,
`'Allfix'`, `'FidoMail-Toss'`) and its HWM lives in the same
file the BBS uses for human-user lastread, so multiple consumers
naturally coexist without colliding.

**Coverage:**

| Format | HWM | Mechanism |
|---|:-:|---|
| JAM | ✓ | `.JLR` (CRC32(lower(name))) |
| Squish | ✓ | `.SQL` (CRC32(lower(name))) |
| Hudson | ✓ | `LASTREAD.BBS` per-(user-id, board); needs `MapUser` + `Board` |
| GoldBase | ✓ | `LASTREAD.DAT` per-(user-id, board); needs `MapUser` + `Board` |
| EzyCom | — | per-user state lives in the BBS user records, not the message base; no msg-base lastread file to plumb |
| Wildcat | — | SDK exposes `MarkMsgRead` per-message but no per-user HWM primitive |
| PCBoard | — | USERS file lastread per-conference; deferred |
| MSG, PKT | — | spec has no HWM concept |

For the multi-board formats (Hudson, GoldBase) the caller must
set both:

- `base.MapUser('NetReader', 60001)` — pick a numeric user ID
  (use 60000+ to avoid colliding with real BBS users).
- `base.Board := N` — the board / conference number this scan
  is for. The same physical Hudson base contains all 200 boards;
  HWM is per-(user, board).

If either is missing, `GetHWM` returns -1.

For unsupported formats `SupportsHWM` returns false and `GetHWM`
returns -1; `SetHWM` is a no-op. The caller falls back to its own
state for those formats (e.g. NR's dupedb).

**Auto-bump pattern for scanners:**

```pascal
base.ActiveUser := 'NetReader';
for i := 0 to base.MessageCount - 1 do begin
  base.ReadMessage(i, msg);
  { ... process msg ... }
  { HWM auto-tracks the highest msg.num seen for NetReader. }
end;
```

When `ActiveUser` is set, `ReadMessage` calls `SetHWM` after each
successful read if the just-read `msg.num` is strictly greater
than the current HWM. Never decrements -- reading a lower-numbered
message is a no-op. Default off (`ActiveUser = ''`).

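
The never-decrements rule above reduces to a single comparison. The
following self-contained sketch models it outside the library.
`BumpHWM` is a hypothetical helper, and treating -1 as "nothing
recorded yet" is an assumption borrowed from the `GetHWM` sentinel.

```pascal
program HwmBumpSketch;
{$mode objfpc}{$H+}

{ Returns the high-water mark after reading message MsgNum.
  Only a strictly greater number moves the mark. }
function BumpHWM(CurrentHWM, MsgNum: LongInt): LongInt;
begin
  if MsgNum > CurrentHWM then
    Result := MsgNum
  else
    Result := CurrentHWM;
end;

const
  { Messages read out of order on purpose. }
  ReadOrder: array[0..4] of LongInt = (3, 4, 2, 7, 5);
var
  Hwm: LongInt;
  i: Integer;
begin
  Hwm := -1;   { assumed "no HWM recorded yet" }
  for i := Low(ReadOrder) to High(ReadOrder) do
  begin
    Hwm := BumpHWM(Hwm, ReadOrder[i]);
    WriteLn('read ', ReadOrder[i], ' -> HWM ', Hwm);
  end;
end.
```
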
**Multi-tenant by design:** every scanner / tosser gets its own
slot in the lastread file, keyed by its name. NR as `'NetReader'`,
Allfix as `'Allfix'`, Fimail as `'FidoMail-Toss'` -- they all
coexist in `.JLR` / `.SQL` without interfering with each other or
with human-user lastread.

**Pack/purge** is the format's responsibility: each backend's
Pack rewrites the lastread file in step with the message
renumbering. For JAM and Squish this is handled natively.

### `area` auto-population

When the caller passes an `AAreaTag` to `MessageBaseOpen` (or
sets the `AreaTag` property post-construction), every successful
`ReadMessage` auto-populates `Msg.Attributes['area']` with that
tag — but only if the adapter didn't already populate it from
on-disk data (PKT's AREA kludge, for example).

This saves echomail consumers from having to copy AreaTag into
every message attribute manually. Multi-format scanners always
get a populated `area` when the area is configured.

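
The fill-only-if-missing rule above can be sketched with a plain
`TStringList` standing in for `TMsgAttributes`. This is an illustrative
model only: `FillArea` is a hypothetical helper and the area tags are
made-up sample values.

```pascal
program AreaFillSketch;
{$mode objfpc}{$H+}

uses
  Classes;

{ Attrs stands in for TMsgAttributes. The configured tag is applied only
  when the adapter left 'area' empty. }
procedure FillArea(Attrs: TStrings; const AreaTag: string);
begin
  if (AreaTag <> '') and (Attrs.Values['area'] = '') then
    Attrs.Values['area'] := AreaTag;
end;

var
  FromJam, FromPkt: TStringList;
begin
  FromJam := TStringList.Create;
  FromPkt := TStringList.Create;
  try
    { Simulates an adapter that already parsed an AREA kludge from disk. }
    FromPkt.Values['area'] := 'FIDONEWS';

    FillArea(FromJam, 'LOCAL.CHAT');
    FillArea(FromPkt, 'LOCAL.CHAT');

    WriteLn(FromJam.Values['area']);   { configured tag was applied }
    WriteLn(FromPkt.Values['area']);   { on-disk value was kept }
  finally
    FromPkt.Free;
    FromJam.Free;
  end;
end.
```
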
### Shared kludge plumbing — `mb.kludge`

`mb.kludge` exposes the FTSC-form-kludge parsing/emission helpers
that the inline-kludge backends (MSG, PKT) and the CtrlInfo-style
backend (Squish) share, plus what JAM's FTSKLUDGE subfield walking uses:

```pascal
function ParseKludgeLine(const Line: AnsiString;
                         var A: TMsgAttributes): boolean;
procedure SplitKludgeBlob(const RawBody: AnsiString;
                          out PlainBody: AnsiString;
                          var A: TMsgAttributes);
function BuildKludgePrefix(const A: TMsgAttributes): AnsiString;
function BuildKludgeSuffix(const A: TMsgAttributes): AnsiString;
```

Consumers that need to parse raw FTSC body blobs (e.g. parity
tests, format converters, debug tools) can call these directly
without reaching into a backend. Single source of truth for
kludge naming, INTL/FMPT/TOPT recognition, and the `kludge.<name>`
forward-compat passthrough.

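
For readers unfamiliar with the wire form these helpers deal with, here
is a rough, self-contained sketch of splitting one `^A` control line
into a lower-cased name and a value. `SplitKludge` is a hypothetical
illustration and is not how `ParseKludgeLine` is implemented; the real
helper also applies the naming and INTL/FMPT/TOPT rules mentioned above.

```pascal
program KludgeSplitSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

{ Splits a control line of the form ^A<NAME>[:] <value>.
  Returns False when the line does not start with ^A (SOH, #1). }
function SplitKludge(const Line: string; out Name, Value: string): Boolean;
var
  Rest: string;
  P: Integer;
begin
  Name := '';
  Value := '';
  Result := (Length(Line) > 1) and (Line[1] = #1);
  if not Result then
    Exit;
  Rest := Copy(Line, 2, MaxInt);
  P := Pos(' ', Rest);
  if P = 0 then
    P := Length(Rest) + 1;            { kludge with no value part }
  Name := Copy(Rest, 1, P - 1);
  Value := Trim(Copy(Rest, P + 1, MaxInt));
  { Some kludges end the name with a colon (MSGID:), others do not (INTL). }
  if (Name <> '') and (Name[Length(Name)] = ':') then
    SetLength(Name, Length(Name) - 1);
  Name := LowerCase(Name);
end;

var
  N, V: string;
begin
  if SplitKludge(#1'MSGID: 2:5020/1 1a2b3c4d', N, V) then
    WriteLn(N, ' = ', V);
  if SplitKludge(#1'INTL 2:5020/2 2:5020/1', N, V) then
    WriteLn(N, ' = ', V);
  if not SplitKludge('Plain body text', N, V) then
    WriteLn('not a kludge');
end.
```
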
### Capabilities API — backend self-description

Each backend declares the canonical list of attribute keys it
understands via a class function:

```pascal
class function TMessageBase.ClassSupportedAttributes: TStringDynArray;
```

Callers query before setting:

```pascal
if base.SupportsAttribute('attr.returnreceipt') then
  RenderReceiptCheckbox
else
  HideReceiptCheckbox;
```

Backends silently ignore unknown attributes on Write (RFC 822
X-header semantics — fine for forward compatibility); the
capabilities API exists so callers know in advance which keys won't
survive on a given format. The full per-format support matrix lives
in `docs/attributes-registry.md`.

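
One way to picture the relationship between the advertised key list and
a per-key query is a simple membership test. The sketch below is a
self-contained model under that assumption. `ListSupports` and the
three sample keys are hypothetical, and how `SupportsAttribute` is
implemented in the library is not confirmed here.

```pascal
program CapabilitySketch;
{$mode objfpc}{$H+}

uses
  Types;

{ Models a per-key query as a lookup in the list a backend advertises. }
function ListSupports(const Supported: TStringDynArray;
                      const Key: string): Boolean;
var
  i: Integer;
begin
  Result := False;
  for i := 0 to High(Supported) do
    if Supported[i] = Key then
      Exit(True);
end;

var
  Advertised: TStringDynArray;
begin
  { Hypothetical key list; real lists come from ClassSupportedAttributes. }
  SetLength(Advertised, 3);
  Advertised[0] := 'from';
  Advertised[1] := 'subject';
  Advertised[2] := 'attr.private';

  WriteLn(ListSupports(Advertised, 'attr.private'));
  WriteLn(ListSupports(Advertised, 'attr.returnreceipt'));
end.
```
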
## Locking

Three layers, applied in order on every `Open`:

1. **In-process** — `TRTLCriticalSection` per `TMessageBase` instance.
2. **Cross-process** — advisory lock on a sentinel file
   (`<base>.lck` or, for Squish, `<base>.SQL` so we coexist with other
   Squish-aware tools). `fpflock(LOCK_EX|LOCK_SH)` on Unix,
   `LockFileEx` on Windows. Retry with backoff up to a configurable
   timeout (default 30s). Lock acquire/release fires events.
3. **OS share modes** — `fmShareDenyWrite` for writers,
   `fmShareDenyNone` for readers, matching DOS-era multi-process sharing
   conventions every classic format expects.

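
The retry-with-backoff behaviour in layer 2 can be sketched independently
of any OS locking call. In this self-contained model the lock attempt is
a callback. `TTryLock`, `AcquireWithBackoff`, `FakeTryLock` and the
10 ms / 500 ms delay values are all assumptions made for illustration,
not the values `mb.lock` uses.

```pascal
program LockRetrySketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

type
  { Hypothetical non-blocking attempt; True means the lock was taken. }
  TTryLock = function: Boolean;

function AcquireWithBackoff(TryLock: TTryLock; TimeoutMs: Cardinal): Boolean;
var
  Waited, PauseMs: Cardinal;
begin
  Waited := 0;
  PauseMs := 10;
  repeat
    if TryLock() then
      Exit(True);
    if Waited >= TimeoutMs then
      Exit(False);
    if PauseMs > TimeoutMs - Waited then
      PauseMs := TimeoutMs - Waited;   { never sleep past the deadline }
    Sleep(PauseMs);
    Inc(Waited, PauseMs);              { counts sleep time, not wall clock }
    if PauseMs < 500 then
      PauseMs := PauseMs * 2;          { stop doubling once past 500 ms }
  until False;
end;

var
  Attempts: Integer = 0;

{ Pretends the lock is busy for the first two attempts. }
function FakeTryLock: Boolean;
begin
  Inc(Attempts);
  Result := Attempts >= 3;
end;

begin
  if AcquireWithBackoff(@FakeTryLock, 30000) then
    WriteLn('locked after ', Attempts, ' attempts')
  else
    WriteLn('timed out');
end.
```

A real implementation would pass a callback that wraps the platform's
non-blocking lock request and would fire the lock events on success.
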
## Events

`TMessageEvents` lets callers subscribe one or more handlers to receive
`metBaseOpened`, `metMessageRead`, `metMessageWritten`, `metLockAcquired`,
`metPackProgress`, etc. Internally the dispatcher serialises calls so
handlers do not need to be reentrant.

## Concurrent tossers

`TPacketBatch` (was `ma.batch` here pre-0.4.0; now lives in
`fpc-ftn-transport` as `tt.pkt.batch`) owns a queue of `.pkt`
paths and a worker thread pool. Each worker opens its packet,
reads messages, hands each to the caller-provided processor.
The batch caches one `TMessageBase` per destination area so
writes serialise through layer-1 locking; layer-2 keeps
separate processes (e.g. an editor) safe at the same time.
Class name unchanged for caller compatibility.

## Memory ownership

Shared rule across the fpc-* ecosystem (msgbase, ftn-transport,
binkp, comet, emsi, log):

Public types exposed to callers are either **value records**
(`TFTNAddress`, `TUniMessage`, `TMsgAttributes` — owned by the
caller's stack/heap; copy semantics) or **TObject descendants
the caller constructs and frees** (`TMessageBase` and its
backends). Returned `TBytes` / `string` / `TStream` values are
RTL-managed and the caller frees via normal heap semantics.

The library never allocates memory with `GetMem` and expects
the caller to `FreeMem` (or vice versa). This keeps
static-linked consumers (no shared-heap plugin model like Fastway's
cmem-first pattern) compatible without fiddling.

## Behavioural fidelity

Every format backend is implemented from the published format
specification (FTSC documents and the original format authors' own
spec papers — see `docs/ftsc-compliance.md`). Tests read and write
real sample bases captured from working BBS installations; round-trip
tests verify byte-for-byte preservation across read → write → read
cycles.