Archive note: This describes a December 2025 SoCalNomad publishing
experiment. The system has since evolved, but the scheduling problem is
still worth studying.
The first version of an automated publisher is usually a cron
expression.
At a fixed time, it looks for a story. If a story exists, it
publishes. The approach is easy to understand and easy to debug. It also
produces a site that behaves like machinery because it is machinery.
For SoCalNomad, I wanted automation without the unmistakable cadence
of a feed reader dumping content into WordPress. The system needed to
respect freshness and capacity while allowing the day to have some
texture. That led to what the design documents grandly called the
“Humanized Publishing System.”
The name was too ambitious. Randomness does not make software human.
What it can do is prevent a rigid schedule from becoming the most
visible feature of the publication.
The Decision Pipeline
The publisher ran after ingestion, filtering, and story clustering.
Each run evaluated a sequence of gates:
- Is this breaking news that should bypass the normal schedule?
- Is this an appropriate time of day to publish?
- Has the daily quota already been reached?
- Is there a qualified, unpublished cluster?
- Which available cluster has the strongest evidence and
relevance?
Only after those checks would the workflow claim and publish a
story.
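The gate order above can be sketched as a single function that returns at the first gate that settles the run. This is a minimal illustration, not the original workflow: `Decision`, `decide`, the reason strings, and the parameter names are all assumptions, and the inputs are taken as already computed by earlier steps.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Decision:
    """Outcome of one publisher run, with the reason kept for the audit log."""
    publish: bool
    reason: str
    cluster_id: Optional[str] = None


def decide(
    breaking_id: Optional[str],
    time_gate_open: bool,
    published_today: int,
    daily_quota: int,
    ranked_ids: Sequence[str],
) -> Decision:
    """Evaluate the gates in order and stop at the first one that settles the run."""
    # Gate 1: breaking news bypasses the time gate and the quota.
    if breaking_id is not None:
        return Decision(True, "breaking_bypass", breaking_id)
    # Gate 2: the probabilistic time-of-day gate.
    if not time_gate_open:
        return Decision(False, "time_gate_closed")
    # Gate 3: the daily quota for normal posts.
    if published_today >= daily_quota:
        return Decision(False, "quota_reached")
    # Gate 4: there must be at least one qualified, unpublished cluster.
    if not ranked_ids:
        return Decision(False, "no_qualified_cluster")
    # Gate 5: take the strongest candidate (the list is assumed pre-ranked).
    return Decision(True, "scheduled", ranked_ids[0])
```

Returning a reason on every path, including the "do nothing" paths, makes it possible to explain a quiet afternoon as easily as a busy one.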
The time gate used different probabilities across the day. Midday had
a high chance of publication. Overnight runs had a low chance. Morning,
afternoon, and evening sat between those extremes.
This did not predict audience behavior with scientific precision. It
encoded an editorial preference: publish more often when readers were
likely to be active, but do not make every day identical.
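A time gate of this kind might look like the sketch below. The hour boundaries and the probabilities are placeholders chosen only to match the shape described above (midday high, overnight low, the rest in between); they are not the values the system used. The function names are also hypothetical.

```python
import random

# Placeholder values: only the ordering (midday highest, overnight lowest)
# comes from the description above.
PUBLISH_PROBABILITY = {
    "overnight": 0.05,
    "morning": 0.40,
    "midday": 0.80,
    "afternoon": 0.50,
    "evening": 0.35,
}


def time_band(hour: int) -> str:
    """Map a local hour (0-23) to a named band. Boundaries are assumptions."""
    if not 0 <= hour <= 23:
        raise ValueError(f"hour out of range: {hour}")
    if hour < 6:
        return "overnight"
    if hour < 11:
        return "morning"
    if hour < 14:
        return "midday"
    if hour < 18:
        return "afternoon"
    if hour < 22:
        return "evening"
    return "overnight"


def time_gate_open(hour: int, rng: random.Random) -> bool:
    """Return True with the probability assigned to the current band."""
    return rng.random() < PUBLISH_PROBABILITY[time_band(hour)]
```

Passing the random generator in as an argument keeps the gate testable: a seeded `random.Random` reproduces a run exactly, which matters when someone later asks why a post went out at 3 a.m.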
Quotas Were a Safety Mechanism
The design targeted no more than seven normal posts per day. Breaking
news could exceed that limit.
At first glance, a quota sounds like a growth metric. In practice, it
was a brake. The upstream pipeline could ingest dozens of feeds, and an
AI filter could always find something marginally relevant. Without a
daily limit, a weak scoring model or duplicate cluster could flood the
site before anyone noticed.
The quota converted an unbounded automation problem into a bounded
editorial problem. Even if selection quality degraded, the blast radius
had a ceiling.
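As a rough sketch, the quota check can be a pure function over the day's publication records. The limit of seven comes from the design; everything else here is an assumption, including the `(published_at, is_breaking)` record shape and the choice that breaking posts do not consume the normal quota.

```python
from datetime import date, datetime
from typing import Iterable, Tuple

DAILY_QUOTA = 7  # from the design: no more than seven normal posts per day


def quota_remaining(
    posts: Iterable[Tuple[datetime, bool]],
    today: date,
    quota: int = DAILY_QUOTA,
) -> int:
    """Count today's normal posts and return how many slots are left.

    Each post is a hypothetical (published_at, is_breaking) pair.
    Assumption: breaking posts are tracked but do not use up normal slots.
    """
    used = sum(
        1
        for published_at, is_breaking in posts
        if published_at.date() == today and not is_breaking
    )
    # Clamp at zero so a day that somehow overshot never reports negative room.
    return max(0, quota - used)
```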
That principle generalizes beyond publishing. When a probabilistic
system can create public output, cost money, or modify durable state,
volume limits are not an optimization. They are a control surface.
Selection Still Had to Be Deterministic
Timing could be probabilistic, but story eligibility could not.
A cluster needed multiple independent sources, no existing
publication timestamp, sufficient relevance, and a reasonable freshness
score. Among eligible clusters, the system ranked candidates by source
count, article count, relevance, and recency.
This separation became important:
- Probability decided whether the system should act now.
- Explicit rules decided what it was allowed to act on.
- Ranking decided which qualified item should go first.
Mixing those concerns would have made the workflow difficult to
audit. A random number should never rescue an ineligible story.
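The split between eligibility and ranking can be illustrated with a small sketch that contains no randomness at all. The `Cluster` fields, the threshold values, and the strict lexicographic ordering are assumptions for illustration; the freshness score stands in for recency, and the original system may have weighted these factors differently.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Cluster:
    id: str
    source_count: int
    article_count: int
    relevance: float   # assumed to be in [0, 1]
    freshness: float   # assumed to be in [0, 1], higher means more recent
    published_at: Optional[datetime] = None


# Hypothetical thresholds; the real values are not given in this post.
MIN_SOURCES = 2
MIN_RELEVANCE = 0.6
MIN_FRESHNESS = 0.3


def is_eligible(c: Cluster) -> bool:
    """Explicit rules only: no random input can change this answer."""
    return (
        c.published_at is None
        and c.source_count >= MIN_SOURCES
        and c.relevance >= MIN_RELEVANCE
        and c.freshness >= MIN_FRESHNESS
    )


def rank(clusters: Iterable[Cluster]) -> List[Cluster]:
    """Order eligible clusters, strongest first, with the id as a stable tie-break."""
    eligible = [c for c in clusters if is_eligible(c)]
    return sorted(
        eligible,
        key=lambda c: (
            -c.source_count,
            -c.article_count,
            -c.relevance,
            -c.freshness,
            c.id,
        ),
    )
```

Because `rank` filters through `is_eligible` before sorting, a high score on one factor cannot pull an ineligible cluster into the candidate list.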
Breaking News Needed Its Own Lane
Time curves and quotas are useful until a genuinely time-sensitive
story arrives. The design therefore included a bypass for recent, highly
relevant clusters with strong source support.
The bypass was deliberately narrow. Labeling everything “breaking” is
an easy way to defeat every safeguard in the system. A story needed to
be recent and well-supported, not merely exciting to the language
model.
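A narrow bypass of this sort could be expressed as a conjunction of hard thresholds, so that a high model score alone is never enough. The specific limits below (two hours, 0.9 relevance, three sources) are invented for the sketch, as are the function and parameter names.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical limits, deliberately strict so the bypass stays rare.
BREAKING_MAX_AGE = timedelta(hours=2)
BREAKING_MIN_RELEVANCE = 0.9
BREAKING_MIN_SOURCES = 3


def is_breaking(
    first_seen: datetime,
    relevance: float,
    source_count: int,
    now: datetime,
) -> bool:
    """All three conditions must hold; any single failure keeps the story in the normal lane."""
    age = now - first_seen
    return (
        timedelta(0) <= age <= BREAKING_MAX_AGE
        and relevance >= BREAKING_MIN_RELEVANCE
        and source_count >= BREAKING_MIN_SOURCES
    )
```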
This was an early example of a pattern that appeared repeatedly in
the project: the AI could contribute scores and summaries, but durable
controls belonged in ordinary code and database constraints.
The Problem with Simulating Humanity
There is a philosophical trap in this design. A publisher that waits
a random amount of time is not more authentic. It is only less
regular.
Human editorial behavior includes judgment, interruption, changing
priorities, fatigue, and awareness of what has already been said.
Probability curves imitate one surface characteristic while ignoring the
reasons behind it.
The useful objective was therefore not “make the bot look human.” It
was:
- Avoid a visibly mechanical release pattern
- Concentrate activity during sensible hours
- Preserve room for urgent stories
- Cap daily output
- Log why every publication occurred
That framing is less theatrical and more defensible.
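The last item in that list, logging why each publication occurred, is the easiest to make concrete. One possible shape is a single JSON line per decision. The field names here are hypothetical, not the schema the project used.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict


def publication_log_entry(
    cluster_id: str,
    reason: str,
    factors: Dict[str, Any],
    decided_at: datetime,
) -> str:
    """Serialize one publishing decision as a JSON line.

    `factors` holds whatever inputs drove the decision, for example the
    time band, the random draw, the quota count, and the ranking scores.
    """
    record = {
        "cluster_id": cluster_id,
        "reason": reason,
        "decided_at": decided_at.astimezone(timezone.utc).isoformat(),
        "factors": factors,
    }
    # sort_keys keeps the lines stable and easy to diff.
    return json.dumps(record, sort_keys=True)
```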
What I Would Keep
The exact percentages were guesses and would need analytics to
justify them. The architecture was stronger than the numbers.
I would keep the bounded quota, the breaking-news lane, deterministic
eligibility, quality ranking, and a publication log containing the
decision factors. I would also add stronger idempotency so two
overlapping runs could not claim the same story.
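One common way to get that idempotency is a conditional update, where the database rather than the application decides who wins the claim. The sketch below uses SQLite for brevity; the `clusters` table, its `claimed_by` and `published_at` columns, and the `claim_cluster` helper are assumptions, not the project's actual schema.

```python
import sqlite3


def claim_cluster(conn: sqlite3.Connection, cluster_id: str, run_id: str) -> bool:
    """Try to claim a cluster for this run. Returns True only for the winner.

    The WHERE clause makes the claim conditional: if another run has already
    claimed or published the cluster, no row matches and nothing changes.
    """
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            """
            UPDATE clusters
               SET claimed_by = ?
             WHERE id = ?
               AND claimed_by IS NULL
               AND published_at IS NULL
            """,
            (run_id, cluster_id),
        )
    # Exactly one row changed means this run now owns the cluster.
    return cur.rowcount == 1
```

A run that receives `False` simply stops. It does not need to know which other run won, only that it did not.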
The experiment began as an attempt to make automation feel less
robotic. Its better lesson was about separating policy from chance.
Randomness can vary timing. It should never replace editorial rules,
safety limits, or an audit trail.