Technical writeup

Building an Automated Bidding System for Marketplace Ads

How a manual product-keyword bidding workflow became an objective-driven control system for advertisers, with return estimation, budget pacing, auction-aware bid generation, and measurement built into the loop.

Tags: Anonymized, No exact metrics, Marketplace ads, Bidding and control

Executive Summary

I built the first automated bidding system for a large retail marketplace ads business. The product change was simple: instead of asking advertisers to manually set bids for every product and keyword, the system let them provide a budget and objective, then generated bids on their behalf.

The deeper technical shift was from bid entry to campaign control. The system had to estimate where spend would produce value, translate advertiser goals into a target return policy, simulate likely delivery, generate auction-ready bids, and keep learning from noisy marketplace feedback.

The core idea came from first-principles economics: if an advertiser is allocating spend well, marginal return should be roughly equalized across the portfolio. When one product-keyword pocket is clearly delivering better return than another, budget should flow toward the better opportunity until the expected marginal returns converge, subject to budget, pacing, relevance, and marketplace-health constraints.

The work was not just a prediction model. It was an automated bidding design: estimate returns, choose a target return policy, translate that policy into bids, observe realized outcomes, and adjust without breaking advertiser trust.

The Problem

Before automated bidding, advertisers managed bids at a granular level. A campaign might contain many products and many keyword or query contexts. The advertiser had to decide how much to bid in each place, even though the true value of a bid depended on conversion probability, basket behavior, competition, auction mechanics, attribution lag, budget constraints, and changing marketplace demand.

That workflow created three practical problems.

Advertiser Burden

Manual bid management pushed a hard optimization problem onto advertisers who mostly wanted to express business goals: grow sales, hit a return threshold, acquire new buyers, or spend a budget responsibly.

Allocation Waste

Some parts of a campaign could be underbid while others absorbed spend with weaker expected returns. The platform had better aggregate data than any single advertiser, but the manual workflow could not use that advantage fully.

Limited Product Scope

As the ads system matured, new objectives and richer inventory needed a control layer. Manual bids did not scale cleanly to multi-objective optimization, automated pacing, or model-driven campaign management.

The Product Interface

The most important design choice was to change what the advertiser controlled. Instead of requiring a bid for every product-keyword combination, the advertiser could provide a budget and an objective. The platform took responsibility for bid allocation.

That interface is genuinely easier for the advertiser. It also makes the platform accountable for a harder promise: spend the budget in places that are likely to advance the objective, while avoiding reckless delivery that damages return, trust, or the user experience.

Old interface: "Tell us what each bid should be."

New interface: "Tell us the budget and goal; we will choose bids to allocate spend intelligently."

The Core Insight

The system started from a simple allocation principle. If an advertiser can move spend across opportunities, then an efficient allocation should not leave obvious return gaps. If one product or query context is expected to produce much better return than another, the campaign should push more spend toward the better opportunity until the marginal return is no longer obviously higher.

In ads language, the return signal was ROAS (return on ad spend): revenue attributed to ads divided by ad spend. The exact implementation had to handle attribution, delayed conversions, sparse data, noisy observations, and auction competition. But the governing idea was easy to explain:

Allocate spend so expected marginal ROAS is balanced across eligible opportunities, while respecting budget, pacing, relevance, and marketplace constraints.

This split the system into two linked problems. The first was a large-scale estimation problem: predict expected return across many product-keyword opportunities. The second was a control problem: choose the advertiser-level target return policy that would spend the right amount without crossing the advertiser's return constraint.
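The allocation principle can be illustrated with a small greedy sketch. It assumes each opportunity has a concave, diminishing-returns revenue curve, here a hypothetical `scale * sqrt(spend)` form chosen only for illustration, and it hands each budget increment to whichever opportunity currently has the highest marginal return. The function names, pocket names, and curve parameters are all assumptions, not the production system.

```python
import math

def revenue(scale: float, spend: float) -> float:
    """Hypothetical concave revenue curve: diminishing returns in spend."""
    return scale * math.sqrt(spend)

def marginal_return(scale: float, spend: float, step: float) -> float:
    """Extra revenue per unit of spend for the next increment of size `step`."""
    return (revenue(scale, spend + step) - revenue(scale, spend)) / step

def allocate(scales: dict[str, float], budget: float, step: float = 1.0) -> dict[str, float]:
    """Greedily give each budget increment to the best marginal opportunity."""
    spend = {name: 0.0 for name in scales}
    remaining = budget
    while remaining >= step:
        best = max(scales, key=lambda n: marginal_return(scales[n], spend[n], step))
        spend[best] += step
        remaining -= step
    return spend

# Illustrative portfolio: three product-keyword pockets of different strength.
scales = {"pocket_a": 10.0, "pocket_b": 5.0, "pocket_c": 2.0}
spend = allocate(scales, budget=300.0)
marginals = {n: marginal_return(scales[n], spend[n], 1.0) for n in scales}
print(spend)
print(marginals)
```

Under these assumed curves, the strongest pocket receives the most budget, yet the marginal returns of all three pockets end up close to each other. That is the convergence the principle describes, before any pacing, relevance, or marketplace constraints are layered on.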

System Architecture

The architecture was deliberately practical. The first version needed to work inside an existing marketplace ads stack, operate on batch marketplace data, and produce bids reliably. The design favored debuggable stages over a single opaque model.

Collect Signals

Join impressions, clicks, conversions, spend, attributed sales, product metadata, campaign setup, query context, and budget state.

Estimate Return

Estimate expected sales or return for eligible product-keyword inventory, with smoothing for sparse and delayed outcomes.

Solve Policy

Infer the advertiser's target return threshold or control setting needed to spend efficiently against the objective.

Simulate Delivery

Run bid simulations to estimate spend, pacing, auction eligibility, return, and constraint violations before publishing bids.

Generate Bids

Publish bids into the ads serving path, monitor realized delivery, and feed outcomes back into the next control cycle.
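As a rough sketch of what "debuggable stages" could mean in code, the five stages above can be modeled as plain functions whose outputs are all retained for inspection. Everything here is hypothetical: the stage bodies are placeholders with made-up numbers, and `CycleState` and `run_cycle` are illustrative names rather than parts of the real stack.

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class CycleState:
    """Intermediate outputs of one control cycle, kept so decisions can be replayed."""
    outputs: dict[str, Any] = field(default_factory=dict)

def collect_signals(state: CycleState) -> dict:
    # Placeholder for the joined signal table.
    return {"clicks": 120, "attributed_sales": 240.0}

def estimate_return(state: CycleState) -> float:
    signals = state.outputs["collect_signals"]
    return signals["attributed_sales"] / signals["clicks"]  # value per click

def solve_policy(state: CycleState) -> float:
    return 4.0  # placeholder target return threshold

def simulate_delivery(state: CycleState) -> dict:
    bid = state.outputs["estimate_return"] / state.outputs["solve_policy"]
    return {"candidate_bid": bid, "violations": []}

def generate_bids(state: CycleState):
    sim = state.outputs["simulate_delivery"]
    # Publish only when the simulation reported no constraint violations.
    return sim["candidate_bid"] if not sim["violations"] else None

STAGES = [collect_signals, estimate_return, solve_policy, simulate_delivery, generate_bids]

def run_cycle(stages) -> CycleState:
    state = CycleState()
    for stage in stages:
        state.outputs[stage.__name__] = stage(state)  # every stage output is auditable
    return state

state = run_cycle(STAGES)
print(state.outputs)
```

Because each stage writes a named output, a surprising bid can be traced back through the simulation, the policy, and the estimate that produced it.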

The Algorithm

The first version can be understood as a closed-loop controller around a marketplace auction. The controller had to decide how aggressive each advertiser should be and how that aggressiveness translated into bids for individual opportunities.

1. Estimate opportunity value

For each eligible opportunity, the system estimated expected attributed value. In a sponsored product setting, an opportunity could be represented by a product, keyword, campaign, and serving context. The estimate had to combine signal from granular history with broader priors because many product-keyword pairs were sparse.

The practical modeling question was not "what is the perfect label?" It was "what estimate is stable enough to drive bids tomorrow?" That meant smoothing noisy observations, protecting against outliers, handling conversion delay, and making sure the estimate degraded gracefully when data was thin.
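One common way to get an estimate that degrades gracefully is shrinkage toward a broader prior. The sketch below is a generic technique, not necessarily the one the production system used. It assumes value is measured per click, and the `prior_strength` parameter (a count of pseudo-clicks at the prior rate) is a hypothetical tuning knob.

```python
def shrunk_value_per_click(clicks: int, attributed_sales: float,
                           prior_value_per_click: float,
                           prior_strength: float = 50.0) -> float:
    """Blend granular history with a broader prior.

    prior_strength behaves like a number of pseudo-clicks observed at the
    prior rate, so thin cells lean on the prior and dense cells on their own data.
    """
    return (attributed_sales + prior_strength * prior_value_per_click) / (clicks + prior_strength)

# A sparse cell with one lucky conversion: the raw estimate would be 20.0 per click.
sparse = shrunk_value_per_click(clicks=2, attributed_sales=40.0, prior_value_per_click=1.5)
# A dense cell: the estimate stays close to its own observed 2.0 per click.
dense = shrunk_value_per_click(clicks=5000, attributed_sales=10000.0, prior_value_per_click=1.5)
# No data at all: the estimate falls back to the prior.
no_data = shrunk_value_per_click(clicks=0, attributed_sales=0.0, prior_value_per_click=1.5)
print(sparse, dense, no_data)
```

The sparse cell is pulled most of the way back toward the prior, which keeps a single outlier conversion from driving tomorrow's bid.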

2. Translate value into a bid

Once expected value was available, the bid could be derived from the advertiser's target return policy. At a high level, if an opportunity was expected to produce more value, the system could bid more. If the advertiser needed a stricter return threshold, bids would become more conservative.

candidate_bid = expected_value / target_return_threshold

The real system still needed floors, caps, auction-specific constraints, eligibility rules, and pacing adjustments. But this simple relationship made the system explainable: bids were not magic; they were a function of expected value and the return policy required for the advertiser's goal.
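A minimal sketch of that relationship with a floor and a cap applied follows. It assumes cost-per-click bidding with expected value measured per click. The default `floor` and `cap` values are illustrative, and the choice to skip an opportunity whose raw bid falls below the floor (rather than raise the bid to the floor) is an assumption made here so the return threshold is never loosened silently.

```python
from typing import Optional

def candidate_bid(expected_value: float, target_return_threshold: float,
                  floor: float = 0.05, cap: float = 5.0) -> Optional[float]:
    """Bid implied by expected value and the return policy, with guardrails."""
    if target_return_threshold <= 0:
        raise ValueError("target_return_threshold must be positive")
    raw = expected_value / target_return_threshold
    if raw < floor:
        return None  # assumed behavior: do not bid rather than overpay at the floor
    return min(raw, cap)

print(candidate_bid(2.0, 4.0))   # moderate value, strict threshold
print(candidate_bid(2.0, 2.0))   # same value, looser threshold gives a higher bid
print(candidate_bid(40.0, 4.0))  # very high value is held at the cap
print(candidate_bid(0.1, 4.0))   # too little value to clear the floor
```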

3. Solve for the advertiser-level control setting

The target return threshold was not fixed globally. Each advertiser had a different catalog, budget, competitive environment, and objective. The system therefore solved for a campaign-level or advertiser-level setting that balanced two goals: spending enough of the budget to be useful, and not buying low-return traffic just to increase burn.

Conceptually, the controller searched for the aggressiveness level where simulated spend and simulated return were both acceptable. Too conservative, and the campaign underdelivered. Too aggressive, and the campaign risked missing the advertiser's return target.
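The search can be sketched as a bisection over the return threshold, under the assumption that simulated spend never increases as the threshold gets stricter. The toy simulator below is hypothetical: it treats each opportunity as won whenever the bid meets a known competing price, and charges that price per click. Names such as `Opportunity`, `simulated_spend`, and `solve_threshold` are illustrative.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Opportunity:
    expected_value_per_click: float
    expected_clicks: float
    competing_price: float  # assumed price to beat, paid per click when we win

def simulated_spend(opps: list[Opportunity], threshold: float) -> float:
    """Spend if every bid is expected value divided by the threshold."""
    spend = 0.0
    for o in opps:
        bid = o.expected_value_per_click / threshold
        if bid >= o.competing_price:
            spend += o.expected_clicks * o.competing_price
    return spend

def solve_threshold(opps: list[Opportunity], budget: float, min_return: float,
                    max_return: float = 100.0, iters: int = 50) -> float:
    """Most aggressive threshold >= min_return whose simulated spend fits the budget."""
    if simulated_spend(opps, min_return) <= budget:
        return min_return  # inventory-limited: the return floor binds, not the budget
    lo, hi = min_return, max_return  # assumes spend at max_return fits the budget
    for _ in range(iters):
        mid = (lo + hi) / 2
        if simulated_spend(opps, mid) > budget:
            lo = mid  # too aggressive: overspends
        else:
            hi = mid  # feasible: try to be more aggressive
    return hi

opps = [
    Opportunity(4.0, 100, 0.5),
    Opportunity(2.0, 200, 0.5),
    Opportunity(1.0, 300, 0.5),
]
threshold = solve_threshold(opps, budget=160.0, min_return=1.5)
print(threshold, simulated_spend(opps, threshold))
```

In this toy setup the budget cannot cover all three opportunities, so the search settles just above the threshold at which the weakest one drops out. With a much larger budget the function returns `min_return` instead, which corresponds to the case where the return constraint, not the budget, limits delivery.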

4. Simulate before publishing

Simulation was essential because bid changes interact with auctions and budgets. A bid that looks reasonable in isolation can produce too much spend, too little delivery, or unexpected concentration once it competes in the marketplace. The system therefore used simulation logic to evaluate candidate policies before publishing bids.

This simulation layer also made the system debuggable. When a campaign underdelivered, the team could distinguish between weak inventory, conservative controls, auction competition, budget constraints, eligibility issues, and noisy return estimates.
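To make the diagnostic role concrete, here is a deliberately simplified simulation report. It assumes a known per-click clearing price for each opportunity and models a binding budget as a proportional scale-down of delivery. Both are simplifications, and `Candidate`, `simulate`, and the report fields are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    name: str
    bid: float
    expected_clicks: float
    expected_value_per_click: float
    competing_price: float  # assumed clearing price per click

def simulate(candidates: list[Candidate], budget: float) -> dict:
    """Estimate delivery for a candidate policy before any bid is published."""
    won = [c for c in candidates if c.bid >= c.competing_price]
    spend_by_name = {c.name: c.expected_clicks * c.competing_price for c in won}
    uncapped_spend = sum(spend_by_name.values())
    value = sum(c.expected_clicks * c.expected_value_per_click for c in won)
    if uncapped_spend == 0:
        return {"spend": 0.0, "value": 0.0, "roas": None, "budget_capped": False,
                "lost_auctions": [c.name for c in candidates], "top_share": 0.0}
    scale = min(1.0, budget / uncapped_spend)  # budget binds: scale delivery down
    return {
        "spend": uncapped_spend * scale,
        "value": value * scale,
        "roas": value / uncapped_spend,
        "budget_capped": uncapped_spend > budget,
        "lost_auctions": [c.name for c in candidates if c.bid < c.competing_price],
        "top_share": max(spend_by_name.values()) / uncapped_spend,  # concentration
    }

report = simulate([
    Candidate("a", bid=1.0, expected_clicks=100, expected_value_per_click=4.0, competing_price=0.75),
    Candidate("b", bid=0.5, expected_clicks=200, expected_value_per_click=2.0, competing_price=0.125),
    Candidate("c", bid=0.25, expected_clicks=300, expected_value_per_click=1.0, competing_price=0.5),
], budget=50.0)
print(report)
```

The separate fields are the point: `budget_capped` points to a budget constraint, `lost_auctions` to competition or conservative controls, and `top_share` to concentration, so an underdelivering campaign can be explained rather than guessed at.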

5. Close the loop

After bids were live, the system observed spend, attributed sales, budget utilization, pacing, and return. Those outcomes fed the next estimation and control cycle. The loop was not trying to chase every short-term wiggle. It was trying to stay responsive enough to correct allocation mistakes while staying stable enough that advertisers could trust it.
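One simple way to get "responsive but stable" behavior is a damped, clamped update of the return threshold. The sketch below is an assumed proportional rule, not the production controller. The `gain` and `max_step` parameters are hypothetical, and the rule for when to loosen (only if budget is still unspent) is an illustrative design choice.

```python
def update_threshold(current: float, realized_roas: float, target_roas: float,
                     budget_utilization: float,
                     gain: float = 0.3, max_step: float = 0.15) -> float:
    """Move the threshold a bounded fraction toward correcting the return error."""
    error = (target_roas - realized_roas) / target_roas  # positive: return fell short
    if error < 0 and budget_utilization >= 1.0:
        return current  # return has headroom, but the budget is already fully spent
    step = max(-max_step, min(max_step, gain * error))  # clamp to avoid overreacting
    return current * (1.0 + step)

# Return slightly short of target: tighten a little.
print(update_threshold(4.0, realized_roas=3.0, target_roas=4.0, budget_utilization=0.9))
# Return far short of target: tighten, but only by the maximum step.
print(update_threshold(4.0, realized_roas=1.0, target_roas=4.0, budget_utilization=0.9))
# Return well above target with budget left over: loosen to spend more.
print(update_threshold(4.0, realized_roas=6.0, target_roas=4.0, budget_utilization=0.5))
```

The clamp is what keeps one noisy cycle from swinging bids sharply, while the gain still lets persistent errors accumulate into a real correction over several cycles.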

Production Design

A useful automated bidding system has to be boring in the right places. It needs predictable data dependencies, auditable intermediate tables, replayable decisions, clear ownership of failure modes, and enough observability that a bad campaign outcome can be explained.

The first production version used batch orchestration, warehouse-scale transforms, and database-side simulation functions. That kept large joins and simulations close to the data, reduced operational complexity, and made the system easier to inspect.

The production challenge was not only scale. It was the combination of scale, money movement, delayed feedback, advertiser trust, and marketplace side effects. A notebook model with strong offline metrics would not have been enough. The system had to run every cycle, generate defensible bids, and recover cleanly when inputs were imperfect.

Measurement

The system could not be evaluated by a naive comparison between manual and automated campaigns. Advertisers choose different strategies for different campaigns, and the campaigns that remain manual are often not comparable to the campaigns that adopt automation. A simple observed-ROAS comparison would mix treatment effects with selection bias.

The better measurement frame was experimental and constraint-aware. The question was not "does every automated campaign show higher observed ROAS than every manual campaign?" The question was whether the automated policy improved delivery against the advertiser's stated objective under comparable conditions.

The key families of metrics were:

Return quality: whether realized return stayed above the advertiser's target often enough to preserve trust.
Budget utilization: whether the system could spend the advertiser's intended budget when enough high-quality inventory existed.
Sales or objective lift: whether the campaign produced more of the stated business outcome.
Pacing stability: whether spend was distributed responsibly instead of exhausting budget too early or missing opportunities.
Marketplace health: whether ads delivery remained compatible with relevance, conversion, and user experience.

The safest one-sentence description: the controller tried to maximize useful budget burn subject to an advertiser return constraint and marketplace-quality constraints.
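Several of these metric families can be computed from daily campaign records, as in the sketch below. The record layout, the daily granularity, and the use of the standard deviation of daily utilization as a pacing measure are assumptions for illustration, and the numbers are invented. In an experimental setup, the same function would be applied to each comparable arm rather than to self-selected manual and automated campaigns.

```python
from statistics import pstdev

def campaign_metrics(days: list[tuple[float, float, float]], target_roas: float) -> dict:
    """days: (spend, attributed_sales, daily_budget) per day, with positive budgets."""
    spend = sum(s for s, _, _ in days)
    sales = sum(v for _, v, _ in days)
    budget = sum(b for _, _, b in days)
    daily_utilization = [s / b for s, _, b in days]
    days_on_target = sum(1 for s, v, _ in days if s > 0 and v / s >= target_roas)
    return {
        "realized_roas": sales / spend if spend > 0 else None,
        "return_quality": days_on_target / len(days),  # share of days at or above target
        "budget_utilization": spend / budget,
        "pacing_spread": pstdev(daily_utilization),    # lower means steadier delivery
    }

days = [(100.0, 500.0, 100.0), (50.0, 150.0, 100.0), (100.0, 400.0, 100.0), (75.0, 375.0, 100.0)]
metrics = campaign_metrics(days, target_roas=4.0)
print(metrics)
```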

What Made It Hard

Automated bidding looks straightforward when reduced to a formula, but real marketplace ads systems are full of traps.

Noisy Labels

Observed conversions are delayed, attributed through imperfect rules, and affected by prior user intent. The model cannot treat every conversion as a clean causal label.

Auction Feedback

Bids affect whether an ad is shown, which affects what data is observed. The system has to learn from data generated by previous policies.

Budget Constraints

Campaigns are constrained by budgets, pacing rules, and available inventory. Underdelivery can be a policy problem, a supply problem, or a competition problem.

Sparse Granularity

The more useful the bid granularity, the less data each cell has. The system needed estimates that were granular enough to act but stable enough to trust.

Strategic Behavior

Advertisers respond to what the platform offers. Automated bidding changes what bids mean, so the product had to be understandable and credible.

Safeguards

Ads revenue is not the only objective. The system had to respect relevance, marketplace quality, and longer-term trust.

Why the Design Lasted

The design lasted because it separated durable concepts from replaceable implementation details. The exact return model could improve. The objective set could expand. The ad formats could change. The serving path and data model could mature. But the control loop stayed useful: estimate value, choose a policy, simulate delivery, generate bids, measure outcomes, and adjust.

That separation made the system extensible. Later versions could support more granular return estimation, new advertiser goals, richer models, and additional ad formats without discarding the original product and economic framing.

The most important lesson was that automated bidding is not a single model. It is an interface, an economic contract, a prediction layer, a control loop, a bid generator, and a measurement problem, all tied together. If any one of those pieces is weak, the system can look smart offline and still fail in the marketplace.

Project Summary

The project can be summarized as a sequence of product and technical changes:

The old system required manual bids for product-keyword combinations.
I changed the interface so advertisers could provide a budget and objective.
The economic insight was that efficient allocation should roughly equalize marginal ROAS across the portfolio.
That split the implementation into return estimation and advertiser-level control.
The deployed version generated bids through simulation and safeguards rather than blind model output.
The right measurement was experimental and constraint-aware, not naive manual-versus-automated ROAS.
The architecture endured because the control-loop framing was general enough to support later objectives and richer bidding logic.

I like this project because it began with a simple economic principle and ended as a product that changed how advertisers interacted with the platform.

Short Version

I built the first automated bidding system for a large marketplace ads platform. Before it, advertisers manually set granular product-keyword bids. After it, they could provide a budget and objective, and the platform generated bids on their behalf. The core idea was to allocate spend so expected marginal ROAS was balanced across the advertiser's portfolio, subject to budget, pacing, auction, and marketplace-quality constraints. I built the first production version as a debuggable control loop: estimate returns, solve for an advertiser-level target policy, simulate delivery, publish bids, and measure outcomes. The design became a durable foundation for later objective-based bidding work.