Executive Summary: Alternative lenders processing thousands of funding applications per month run into the same wall: Secretary of State (SOS) verification done by hand. One mid-market MCA shop now sees 5,000 to 6,000 submissions a month, and another reports analysts spending 175+ hours each month logging into 51 different state portals to confirm entity status. The merchant cash advance market alone sits between $19 and $35 billion in annual volume,[1] spread across thousands of brokers and direct funders, and FY2025 SBA 7(a) lending added a record $37.3 billion on top.[2] Demand is outrunning manual workflows, and per-lookup pricing from legacy data providers chews through unit economics fast. This post explains how high-volume alternative lenders automate SOS verification end-to-end, where the hidden costs sit, and how Cobalt Intelligence's Secretary of State API fits into a stack already running thousands of KYB checks a week.
Why Does Manual Secretary of State Verification Break at Scale for Alternative Lenders?
Manual SOS verification works when an underwriter touches a few hundred files per quarter. It does not work when the same underwriter is one of three people behind a queue of thousands of weekly applications.
What Does "Thousands of Lookups Per Month" Actually Look Like?
For a Tier 1 alternative lender or MCA broker, the daily reality is volume the original SOS portals were never designed for. One alt-lender CEO described their submission flow this way:
"When you're doing thousands and thousands and thousands of submissions a month, you cannot have a human go to 51 different state portals."
That is not hyperbole. As noted above, the MCA market alone is in the $19-35B annual range, and "thousands of submissions per month" is a mid-tier reality for direct funders and brokers who would have called the same volume enterprise five years ago.[3]
Where Does Manual SOS Verification Fail Under Volume?
Three failure modes appear once a lender crosses about 500 lookups per week:
• Inconsistent data quality across states. Each Secretary of State office runs its own portal, name-matching logic, and update cadence. An underwriter who searches "ABC Holdings, LLC" in Texas may get a clean active record; the same business in California may surface as "ABC HOLDINGS LLC" and require fuzzy matching to confirm.
• Non-search-friendly portals. Several states still require image-based CAPTCHAs, paid PACER-style logins, or session cookies that break automation scripts. Engineers building in-house scrapers report constant breakage as state UIs change.
• Audit trail gaps. A screenshot pasted into Salesforce is not an audit artifact. Regulators auditing a lender for FinCEN customer due diligence (CDD) or state lending license compliance want timestamped, source-attributable records, not screen captures.[4]
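The name-matching problem in the first bullet can be illustrated with a minimal sketch. It assumes a simple normalization (lowercase, strip punctuation, drop a trailing entity suffix); the suffix list and function names are hypothetical, and real state registries will need more variants and true fuzzy matching on top.

```python
import re

# Hypothetical suffix list for illustration; not exhaustive.
ENTITY_SUFFIXES = {"llc", "inc", "corp", "co", "ltd", "lp", "llp"}

def normalize_entity_name(name: str) -> str:
    """Lowercase, replace punctuation with spaces, drop trailing entity suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while tokens and tokens[-1] in ENTITY_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

def same_entity(a: str, b: str) -> bool:
    """Treat two names as the same entity if their normalized forms match."""
    return normalize_entity_name(a) == normalize_entity_name(b)

print(same_entity("ABC Holdings, LLC", "ABC HOLDINGS LLC"))
```

Exact matching on normalized names is only a first pass; names that still differ after normalization would go to a fuzzy matcher or a human reviewer.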
The compounding effect is that volume growth flattens. A different alt-lender CEO put it bluntly: their team was "stuck at the current submission level for 12 to 24 months" because the manual verification queue could not absorb more files.
What Does Manual SOS Verification Really Cost a High-Volume Lender?
Cost shows up in three places: labor, per-lookup fees from data providers, and the hidden tax of bad data.
What Is the True Labor Cost of Manual KYB at 1,000+ Files Per Week?
A single underwriting analyst running SOS lookups, ownership confirmations, and UCC searches by hand carries a fully loaded cost of roughly $4,300 per month at a mid-market MCA shop. That is one analyst, one shift, no overtime. Triple it for the night-shift queue and weekend coverage many alt-lenders need to keep pace.
A Chief Revenue Officer at one alternative lender named the problem in plain terms:
"$4,300 per month in labor cost just for the manual SOS step. Completely manual still. Sort of the Achilles heel."
At alternative-lending scale, every dollar of avoidable verification labor compounds into real margin erosion. Industry benchmarks show automated underwriting lifts loan profits 10.2% and cuts default rates 6.8% compared to manual review, per a 2024 Management Science study of automated credit decisioning.[5] Labor on the SOS verification step is one of the first places that margin leaks out.
How Do Per-Lookup Fees Compound at MCA Scale?
Aggregator KYB platforms typically charge between $2 and $5 per business lookup at standard volume, sometimes more for premium data tiers. Run that across 5,000 monthly submissions and the math gets ugly:
• 5,000 lookups per month at $3 average = $15,000 per month, or $180,000 per year in lookup fees alone
• Many lenders run multiple lookups per file (initial KYB, beneficial owner pull, UCC search, ongoing monitoring), which can triple the spend
• High-volume rate cards rarely scale linearly. A 10% volume increase often triggers a tier renegotiation that wipes out the savings
For a lender running thousands per week, the API line item often becomes the single largest external data cost on the underwriting P&L. That is the "per-lookup fee tax" Cobalt and similar direct-source providers exist to remove.
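The per-lookup arithmetic above is easy to reproduce. This is a small sketch using the figures from the bullets; the function name and the `lookups_per_file` parameter are illustrative, not part of any vendor's pricing model.

```python
def annual_lookup_fees(files_per_month: int,
                       fee_per_lookup: float,
                       lookups_per_file: int = 1) -> float:
    """Annual spend on per-lookup fees at a flat rate (no tier discounts)."""
    return files_per_month * lookups_per_file * fee_per_lookup * 12

# 5,000 files a month at a $3 average, one lookup per file.
print(annual_lookup_fees(5_000, 3.0))
# Same volume with three lookups per file.
print(annual_lookup_fees(5_000, 3.0, lookups_per_file=3))
```

A flat rate is an assumption here; real rate cards are tiered, which is why the third bullet matters.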
How Bad Is the Hidden Cost of Stale or Wrong Data?
A CTO at one lending infrastructure platform shared a metric that should worry every risk officer:
"We see about a 15% failure rate on the verification step. Sometimes the name is written wrong; sometimes the entity status is just out of date."
A 15% failure rate on a 5,000-file-per-month pipeline means 750 files bouncing back to the queue every month for re-verification. Each re-verification adds cycle time, ties up an analyst, and delays funding decisions. At the automated-underwriting benchmarks cited above, any verification-step failure is direct margin loss, not a small inconvenience.
How Do High-Volume Lenders Compare Bulk Verification Options?
When a lender outgrows manual verification, they typically evaluate four paths. None is a silver bullet; the right answer depends on volume, integration capacity, and how much of the verification stack the lender wants to own.
What Do Aggregator Platforms (Middesk, Alloy) Offer at Scale?
Middesk and Alloy package SOS data, UBO information, and watchlist screening into single API endpoints. Middesk operates as a standalone KYB and business verification API and is also one of the data partners available inside Alloy's data marketplace.[6] The strength of aggregators is convenience: one contract, one integration, one invoice. The cost is the per-lookup pricing model and the indirection. These aggregators are themselves consumers of state-level SOS data, often via direct-source providers further upstream.
For a lender running under 1,000 lookups a month, this is often the right path. Past 5,000, the per-lookup math starts to favor going one layer closer to the source. We compare the major aggregator and direct-source options in our Top 8 Secretary of State API Solutions review.
How Do Legacy Bureaus (D&B, Experian) Fit Modern Alt-Lending?
Dun & Bradstreet and Experian provide deep historical business credit files and remain the standard reference for traditional commercial credit decisions.[7] Their weakness for alternative lenders is real-time SOS status. Bureau records refresh on cycles that lag actual state filings by weeks or months, which is fine for credit history but inadequate for fraud-stage entity verification where a freshly dissolved or suspended entity needs to surface today, not next quarter.
Can a Lender Build Their Own Direct SOS Integration In-House?
A few lenders try. The honest engineering picture is rough. As one CTO of a 1,000+ employee alt-lender said:
"Integrating with all 50 states is obviously really complicated. The maintenance of all of that is the part nobody warns you about."
Each state portal has its own search endpoints, name normalization rules, document-fetch behavior, and rate limits. The build cost is the engineering time to write a scraper for every state portal; the maintenance cost is the ongoing breakage as state UIs change without notice. A Lendflow product write-up frames the in-house build as delivering "70% fewer manual steps" only when paired with continuous platform investment.[8]
Where Does a Direct SOS Data API (Cobalt) Fit?
Cobalt sits one layer closer to the state portals than the aggregator platforms. The trade-off is straightforward: a lender uses Cobalt when they want SOS-source data without per-lookup fees and without the engineering burden of building 51 state integrations themselves. Cobalt is not a full KYB stack; it is the SOS data layer inside a stack that often also includes a TIN verification step, an OFAC/AML screen, and a credit pull from a bureau. For a side-by-side on the direct-source vs aggregator decision, see our Cobalt vs Middesk comparison.
What Does an Automated SOS Verification Workflow Actually Look Like?
The workflow has three parts: entity status, beneficial ownership, and UCC surface. Each maps to a specific underwriting decision.
How Does Entity Status Verification Plug Into a Loan Application Flow?
The standard pattern: a loan application comes in with a business name and state of registration. The verification service hits the SOS API, returns active/inactive/dissolved status plus officer and registered agent data, and the lender's loan origination system flags the file accordingly. An inactive or dissolved entity blocks the file at the gate. An active entity proceeds to credit review.
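The gate described above can be sketched as a small routing function. The status strings and route names are assumptions for illustration; actual status values vary by state and by provider, so anything unrecognized is sent to a human rather than guessed at.

```python
from enum import Enum
from typing import Optional

class Route(Enum):
    BLOCK = "block"
    CREDIT_REVIEW = "credit_review"
    MANUAL_REVIEW = "manual_review"

def route_by_entity_status(status: Optional[str]) -> Route:
    """Map an SOS entity status to the next step in the origination flow."""
    normalized = (status or "").strip().lower()
    if normalized == "active":
        return Route.CREDIT_REVIEW
    if normalized in {"inactive", "dissolved"}:
        return Route.BLOCK
    # Unknown or missing status: do not auto-approve or auto-decline.
    return Route.MANUAL_REVIEW

print(route_by_entity_status("Dissolved"))
```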
This is the highest-volume call in any KYB workflow and the call where per-lookup pricing causes the most pain. A direct-source API lets a lender amortize the data cost across their full subscription rather than paying per file. For where SOS sits relative to TIN verification in the underwriting waterfall, see our EIN verification waterfall placement write-up.
How Does Beneficial Ownership Verification Handle FinCEN Compliance in 2025-2026?
The regulatory landscape shifted significantly in 2025. On March 21, 2025, FinCEN issued an interim final rule that removed beneficial ownership reporting requirements for U.S. domestic companies under the Corporate Transparency Act, narrowing the rule to foreign reporting companies registered to do business in U.S. states.[9] Then on February 13, 2026, FinCEN issued an Order granting exceptive relief to covered financial institutions from identifying and verifying beneficial owners at each new account opening.[10]
The practical takeaway for alternative lenders: federal CTA-driven UBO collection is lighter than the 2024 expectation, but state-level KYB obligations and individual lender risk policies still drive the need for ownership data. Officer information returned by SOS lookups remains the cleanest first-pass source, especially for smaller loans where bureau-level UBO data is thin.[11]
How Do UCC Filings Surface Loan Stacking Risk?
Loan stacking, where a borrower takes multiple advances from different funders without disclosure, is the single largest fraud vector in MCA. Industry research puts stacking on roughly 5 to 6% of all merchant cash advances, and document analysis platforms like Ocrolus advertise transaction-level stacking detection at 99% accuracy.[12] UCC filings registered through SOS offices are a primary stacking signal because every secured advance typically files a UCC-1 against the borrower's assets.
A KYB workflow that pulls SOS entity data and UCC filings in the same call detects stacking before funding rather than after default.
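As a rough sketch of that pre-funding check, the snippet below counts UCC filings that are still active on the decision date. The `UccFiling` shape is hypothetical (it is not any provider's response format), and the threshold for what counts as stacking is a policy choice.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

@dataclass
class UccFiling:
    # Hypothetical record shape for illustration only.
    secured_party: str
    filed_on: date
    lapses_on: date
    terminated: bool = False

def active_filings(filings: List[UccFiling], as_of: date) -> List[UccFiling]:
    """Filings that are filed, not terminated, and not yet lapsed on as_of."""
    return [f for f in filings
            if not f.terminated and f.filed_on <= as_of < f.lapses_on]

def stacking_flag(filings: List[UccFiling], as_of: date,
                  max_active: int = 0) -> bool:
    """Flag the file if more active filings exist than policy allows."""
    return len(active_filings(filings, as_of)) > max_active

filings = [
    UccFiling("Funder A", date(2024, 1, 10), date(2029, 1, 10)),
    UccFiling("Funder B", date(2019, 1, 1), date(2024, 1, 1)),  # lapsed
    UccFiling("Funder C", date(2024, 6, 1), date(2029, 6, 1), terminated=True),
]
print(stacking_flag(filings, as_of=date(2025, 3, 1)))
```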
How Does Cobalt's API Fit a High-Volume Alternative Lender Stack?
With the workflow mechanics defined above (entity status, beneficial ownership, UCC stacking detection), the integration question comes back to data supply chain positioning. Once a lender crosses several thousand lookups per month, the per-lookup fee burden from aggregator platforms becomes the primary cost driver, and the engineering question becomes: how much of that fee pile can be eliminated by going one layer closer to the source? That is the question Cobalt is designed to answer.
What Does the Real-Time vs Cached Data Trade-off Look Like?
Cobalt returns both real-time SOS lookups (queried live from the state portal) and cached results (returned in seconds from a previously fetched record). High-volume lenders typically want cached data for the initial pass (response in seconds, no per-call portal cost) and real-time for files that hit a risk threshold. As one alt-lender engineer described it:
"On the slow states it was about a minute end-to-end. With the cache, it's a matter of seconds."
The cache hit ratio for a typical KYB workload is high enough that the real-time call rarely fires in the daily flow. That is what makes the unit economics work at MCA scale.
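One way to express the cache-first policy is shown below. This is a sketch of the lender-side decision only: `get_cached` and `get_live` are hypothetical callables standing in for whatever client the lender uses, and the 0.7 risk threshold is an arbitrary placeholder.

```python
from typing import Any, Callable, Optional, Tuple

def lookup_entity(key: str,
                  risk_score: float,
                  get_cached: Callable[[str], Optional[Any]],
                  get_live: Callable[[str], Any],
                  risk_threshold: float = 0.7) -> Tuple[Any, str]:
    """Serve from cache for low-risk files; go live on a miss or high risk."""
    cached = get_cached(key)
    if cached is not None and risk_score < risk_threshold:
        return cached, "cache"
    return get_live(key), "live"

# Stand-ins for a real cache and a real live lookup.
cache = {"CA:123": {"status": "active"}}
live_calls = []

def fake_live(key: str) -> dict:
    live_calls.append(key)
    return {"status": "active", "fresh": True}

print(lookup_entity("CA:123", 0.2, cache.get, fake_live))
```

Passing the two lookups in as functions keeps the policy testable without touching a network.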
What API Endpoints Matter Most for KYB at Scale?
The high-volume calls in a typical alternative lender workflow:
• Business search by name plus state. Returns matches plus filing IDs. The most common entry point.
• Detail lookup by filing ID. Returns active/inactive status, registered agent, officers, formation date.
• UCC filing search by debtor. Returns secured-party filings for stacking detection.
• Optional callback flow for slow states. Async pattern where the API hands back a job ID, then posts the result to a webhook when the state portal returns.
A minimal request looks like this:
curl -G "https://apigateway.cobaltintelligence.com/search" \
-H "x-api-key: YOUR_API_KEY" \
--data-urlencode "searchQuery=ABC Holdings LLC" \
--data-urlencode "state=CA"
The response includes entity status, filing date, officers (where the state surfaces them), and registered agent. A lender's KYB orchestrator parses the response, applies risk rules, and either advances the file or routes for human review.
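The same request can be sketched in Python with only the standard library. The URL, header, and parameter names are taken from the curl example above; the helper name is made up, and response parsing is left out because the post does not show the response schema.

```python
from urllib.parse import urlencode
from urllib.request import Request

SEARCH_URL = "https://apigateway.cobaltintelligence.com/search"

def build_search_request(name: str, state: str, api_key: str) -> Request:
    """Build (but do not send) a GET request with URL-encoded query params."""
    query = urlencode({"searchQuery": name, "state": state})
    return Request(f"{SEARCH_URL}?{query}",
                   headers={"x-api-key": api_key},
                   method="GET")

req = build_search_request("ABC Holdings LLC", "CA", "YOUR_API_KEY")
print(req.full_url)
```

Sending it is a matter of passing `req` to `urllib.request.urlopen` and decoding the JSON body; encoding the query string matters because business names routinely contain spaces, commas, and ampersands.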
What Are the Honest Limitations of a Direct SOS Data API?
Three caveats matter for any team evaluating the path:
• Some state portals are slow. A handful of states require async queries that can take 30 to 90 seconds end-to-end on a cache miss. Lenders running real-time UX (think instant decisioning during application) need a fallback strategy for these states.
• SOS data is not credit data. Entity verification answers "does this business exist and is it in good standing." It does not answer "should this business get funded." Bureau pulls and bank statement analysis still belong in the stack.
• Officer data is state-dependent. Some states surface officers in their public SOS records; others do not. For full UBO coverage, lenders pair SOS data with a bureau or an explicit application-form attestation.
Position the SOS API as one layer in the verification stack, not as the entire stack. That framing matches how high-volume lenders actually operate.
What Integration Patterns Scale Past 10K Lookups Per Month?
The first 1,000 lookups per month are easy. The first 10,000 are an integration project. Past that, the patterns separate the teams that scale from the teams that get stuck.
When Do You Choose Sync vs Async Callback?
Synchronous calls work for fast states and cached records. They block the requesting service until a result returns, which is fine when the caller is a background job and bad when the caller is a customer-facing application form.
Async callbacks (the API returns a job ID, then posts the result to a webhook) are the right pattern for slow states and any flow where a 30-second wait would harm UX. Most production KYB workflows mix both: sync for fast states with cache hits, async for slow states or cache misses.
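The mixed pattern reduces to a small routing rule. The sketch below assumes the lender maintains its own list of slow states; the state codes shown are placeholders, not a statement about which states are actually slow.

```python
from typing import FrozenSet

# Placeholder set; a real list would come from the lender's own latency data.
SLOW_STATES: FrozenSet[str] = frozenset({"XX", "YY"})

def choose_call_mode(state: str, cache_hit: bool,
                     slow_states: FrozenSet[str] = SLOW_STATES) -> str:
    """Sync only for fast states with a cache hit; otherwise async callback."""
    if cache_hit and state not in slow_states:
        return "sync"
    return "async"

print(choose_call_mode("CA", cache_hit=True))
print(choose_call_mode("XX", cache_hit=True))
```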
How Do Batch Verification Patterns Handle Portfolio Refresh?
A high-volume lender does not just verify at funding; they re-verify portfolio entities periodically to catch dissolutions, suspensions, or status changes that affect ongoing risk. Batch verification (submit a CSV or list of entity IDs, receive results asynchronously) is the right pattern for this. Run it weekly or monthly, route status changes to a risk queue, and you have continuous portfolio monitoring without blowing up your real-time API budget.
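The "route status changes to a risk queue" step is essentially a diff between the last known status and the latest batch result. A minimal sketch, assuming both are plain dictionaries keyed by a hypothetical entity ID:

```python
from typing import Dict, List, Tuple

def status_changes(previous: Dict[str, str],
                   current: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Return (entity_id, old_status, new_status) for every changed entity."""
    changes = []
    for entity_id, old_status in previous.items():
        new_status = current.get(entity_id)
        # Entities missing from the batch result are skipped, not flagged.
        if new_status is not None and new_status != old_status:
            changes.append((entity_id, old_status, new_status))
    return changes

last_month = {"e1": "active", "e2": "active", "e3": "active"}
this_month = {"e1": "active", "e2": "dissolved"}
print(status_changes(last_month, this_month))
```

Entities that drop out of the batch result entirely (like `e3` here) probably deserve their own retry list, which this sketch leaves out.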
What Engineering Patterns Keep Webhooks and Queues Reliable?
Three operational patterns separate teams that scale from teams that don't:
• Idempotent webhook handlers. Webhook senders retry. Networks blip. Your handler must be able to receive the same callback twice without double-charging or double-recording.
• Dead-letter queues. A small percentage of jobs will fail. Route them to a DLQ for manual review rather than dropping them silently.
• Per-state rate limiting in your own infrastructure. State portals have wildly different rate limits. Build a per-state token bucket on your side rather than letting the upstream provider's rate limits cascade into your application.
These are the "boring engineering" details that turn a working integration into a production-grade pipeline at 10,000+ lookups per month.
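The three patterns above can be sketched in a few dozen lines. This is an in-memory illustration only: class names are made up, a production system would back the seen-set and dead-letter list with a database or a managed queue, and the bucket rates are placeholders.

```python
import time
from typing import Any, Callable, List, Set, Tuple

class TokenBucket:
    """Simple token bucket; keep one instance per state."""
    def __init__(self, rate_per_sec: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = capacity
        self.clock = clock
        self.updated = clock()

    def try_acquire(self) -> bool:
        now = self.clock()
        # Refill based on elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

class WebhookProcessor:
    """Idempotent callback handling with a dead-letter list for failures."""
    def __init__(self, handle: Callable[[Any], None]):
        self.handle = handle
        self.seen: Set[str] = set()
        self.dead_letters: List[Tuple[str, Any, str]] = []

    def process(self, job_id: str, payload: Any) -> str:
        if job_id in self.seen:
            return "duplicate"  # already recorded; safe to acknowledge again
        try:
            self.handle(payload)
        except Exception as exc:
            # Keep the failure for manual review instead of dropping it.
            self.dead_letters.append((job_id, payload, repr(exc)))
            return "dead_lettered"
        self.seen.add(job_id)
        return "processed"

# One bucket per state, e.g. buckets["CA"].try_acquire() before each call.
buckets = {"CA": TokenBucket(rate_per_sec=2.0, capacity=5)}
```

Keying idempotency on the job ID assumes the provider sends a stable identifier with each callback, which is how the async flow is described earlier in this post.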
How Do Lenders Measure ROI on Verification Automation?
Three metrics matter: time-to-decision, default rate, and stacking detection lift. All are measurable; all show real numbers when the verification stack works.
How Much Faster Is the Decision With Automated SOS Verification?
The published benchmarks are consistent. The same Management Science study cited above found algorithmic underwriting outperforms manual underwriting with 10.2% higher loan profits and 6.8% lower default rates.[5] Operational case studies show full automation cuts pre-approved loan time-to-approval from 177 hours to 82 hours, and denial decisions from 146 to 42 hours.
Even partial automation of the SOS step alone removes a 1- to 5-minute manual lookup from every file. At 5,000 files per month, that is roughly 83 to 417 analyst-hours reclaimed monthly.
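The reclaimed-hours estimate is straightforward to check. A minimal sketch, using the file volume and per-lookup minutes stated above (the function name is illustrative):

```python
def analyst_hours_reclaimed(files_per_month: int,
                            minutes_per_lookup: float) -> float:
    """Analyst-hours per month spent on manual lookups at a given pace."""
    return files_per_month * minutes_per_lookup / 60

low = analyst_hours_reclaimed(5_000, 1)
high = analyst_hours_reclaimed(5_000, 5)
print(round(low, 1), round(high, 1))
```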
Does Verification Automation Actually Move Default Rates?
The mechanism is straightforward: faster, cleaner verification surfaces dissolved entities, suspended licenses, and stacking signals that a manual workflow would miss under volume pressure. Synthetic identity fraud now costs financial institutions $20 to $40 billion annually globally, and KYB-stage detection is the cheapest place in the stack to catch it.[13] An entity status check that returns "dissolved 2023" on a 2025 application is the lowest-cost fraud filter in the entire underwriting funnel.
For a 5,000-file-per-month lender, a single avoided fraud file at $50K funded covers a typical KYB API subscription for the year. Two avoided fraud files cover it twice over. The break-even arithmetic on verification automation almost always favors the buy decision.
How Do You Quantify Stacking Detection Lift?
A lender pulling UCC filings in the same KYB call detects active secured-party filings before funding. The math at MCA scale is direct. Take a lender funding 5,000 advances per month at an average $25K per advance. Industry research puts MCA stacking at 5-6% of all advances,[14] which means roughly 250-300 stacked files in that monthly book, representing $6.25M to $7.5M in stacked exposure each month.
If automated UCC pre-funding screening catches half of those at the gate, the lender avoids $3.1M to $3.75M in stacked-file exposure per month. Even applying a conservative 30% loss-given-default to that exposure, the avoided losses come to roughly $0.9M to $1.1M per month, or about $11M to $13.5M annually. Against a KYB API subscription cost in the low five figures, the trade is not close.
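For readers who want to plug in their own book, the stacking arithmetic can be written as one function. The inputs below are the figures used in this section; the catch rate and loss-given-default are the post's illustrative assumptions, not measured values.

```python
from typing import Dict

def stacking_savings(advances_per_month: int, avg_advance: float,
                     stacking_rate: float, catch_rate: float,
                     loss_given_default: float) -> Dict[str, float]:
    """Monthly stacked exposure, exposure avoided, and annual avoided loss."""
    stacked_exposure = advances_per_month * stacking_rate * avg_advance
    avoided_exposure = stacked_exposure * catch_rate
    avoided_loss_annual = avoided_exposure * loss_given_default * 12
    return {
        "stacked_exposure_monthly": stacked_exposure,
        "avoided_exposure_monthly": avoided_exposure,
        "avoided_loss_annual": avoided_loss_annual,
    }

low_case = stacking_savings(5_000, 25_000, 0.05, 0.5, 0.30)
high_case = stacking_savings(5_000, 25_000, 0.06, 0.5, 0.30)
for label, case in (("5% stacking", low_case), ("6% stacking", high_case)):
    print(label, {k: round(v) for k, v in case.items()})
```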











