Ongoing

KOSDAQ CB/BW Dilution Screen

Builder · 2026 · 9 min read

2,934 KOSDAQ CB/BW issuances scored in the final of four calibration runs; the flag rate fell from 49.3% to 33.1% after resolving a systematic denomination mismatch between split-adjusted prices and DART exercise prices. Four cases were confirmed as genuine extreme ITM issuances, above 10x moneyness.

Overview

A forensic screen for in-the-money convertible bond and bond-with-warrant issuances on KOSDAQ, built on top of the kr-derivatives library. The screen prices the embedded conversion option as a European call using Black-Scholes, drawing exclusively from public DART filings and KRX price data. Over four calibration runs on March 15–16, 2026, the methodology was iteratively refined to isolate genuine forensic signals from data quality artifacts, specifically a denomination mismatch created by comparing pykrx split-adjusted prices against DART contractual exercise prices.

Problem

Korean convertible bonds (전환사채) and bonds with warrants (신주인수권부사채) are a documented dilution mechanism on KOSDAQ. A controlling shareholder arranges for the company to issue a CB with a conversion price set below the current stock price. The bondholder — often an affiliated party — receives an embedded option already in the money: they profit at issuance before any repricing. Detecting this pattern across the full DART-listed universe requires both option pricing capability and a cross-market data join that no open tool provided. The natural primary source — SEIBRO, the Korea Securities Depository's bondholder register — has been returning resultCode=99 since early 2026, with no restoration timeline. Waiting for SEIBRO to stabilize was not viable. The question became: can the dilution signal be computed from DART and KRX alone, without proprietary bondholder data?

Constraints

SEIBRO's public API (bondholder register) unavailable — all inputs limited to DART and pykrx
pykrx returns split-adjusted prices; DART exercise prices are contractual snapshots that are NOT retroactively adjusted — comparing them creates false moneyness inflation for stocks that underwent share consolidations
board_date is approximate for 66% of rows (the extractor defaults board_date to issue_date when no separate board meeting record is filed on DART)
53.6% of scored rows fall back to uniform sigma=0.40 because their tickers have fewer than 30 days of price history before the board date — KOSDAQ micro-caps with thin coverage
pykrx adjusted=False returns empty DataFrames (broken server-side at KRX); no alternative source provides true unadjusted historical prices

Approach

The embedded conversion option is priced as a European call using Black-Scholes on six inputs drawn entirely from public data: the stock price at board approval date (from KRX via pykrx), the DART-disclosed exercise price, implied time to maturity, 30-day historical volatility or a 40% uniform fallback, the KTB risk-free rate, and the DART-reported refixing floor where applicable. If the resulting moneyness (stock price divided by exercise price) exceeds 1.0, the issuance is flagged as in-the-money at issuance.

The four-run calibration process moved from a naive first run to a production-quality screen. Run 2 introduced per-ticker volatility and a gap filter for stale price references. Run 3 resolved the denomination mismatch by adjusting exercise prices using DART 감자결정 (capital reduction) filings. Run 4 investigated the remaining 32 extreme outliers case by case using the DART stockTotqySttus.json endpoint, resolving 28 as data artifacts and confirming 4 as genuine forensic signals.
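The pricing and flagging step described above can be sketched roughly as follows. This is a minimal illustration, not the kr-derivatives API: the function and class names are hypothetical, the normal CDF is computed with math.erf instead of scipy.stats.norm so the block has no dependencies, the default r=0.03 is a placeholder rather than the KTB rate the screen uses, and the refixing floor is left out entirely.

```python
import math
from dataclasses import dataclass


def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function (stdlib stand-in for scipy.stats.norm.cdf)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s: float, k: float, t: float, sigma: float, r: float) -> float:
    """Black-Scholes value of a European call on a non-dividend-paying stock."""
    if s <= 0 or k <= 0 or sigma <= 0:
        raise ValueError("s, k and sigma must be positive")
    if t <= 0:
        return max(s - k, 0.0)  # at expiry only intrinsic value remains
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)


@dataclass
class ScreenRow:  # hypothetical container, for illustration only
    moneyness: float
    itm_flag: bool
    call_value: float


def score_issuance(s: float, k: float, t: float,
                   sigma: float = 0.40, r: float = 0.03) -> ScreenRow:
    """Score one issuance: moneyness S/K, ITM flag (S/K > 1.0), and option value."""
    moneyness = s / k
    return ScreenRow(moneyness, moneyness > 1.0, bs_call(s, k, t, sigma, r))
```

Note that the flag depends only on S and K; sigma, r and t affect the option value but not whether an issuance is flagged.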

Key Decisions

Black-Scholes over waiting for SEIBRO

Reasoning:

SEIBRO's API had been broken since early 2026 with no ETA. Waiting would have delayed the screen indefinitely. DART discloses conversion prices and terms at issuance; pykrx provides daily closing prices. Black-Scholes on these inputs detects ITM-at-issuance — the primary dilution signal — without the bondholder register. The approach also proved analytically superior: the signal question is whether the option was priced fairly at issuance, not what happened to it afterward. SEIBRO data would answer the latter; DART+KRX answer the former.

Alternatives considered:

Wait for SEIBRO API restoration — unacceptable timeline dependency on an external agency with no fix timeline
Purchase commercial SEIBRO access — incompatible with MIT licensing and the reproducibility requirement

Adjust exercise price K using DART 감자결정 filings rather than seeking unadjusted prices

Reasoning:

pykrx adjusted=True retroactively scales all historical prices by cumulative consolidation factors. For stocks that underwent share consolidations after CB issuance, the historical price appears multiplied — producing false extreme moneyness. Two approaches exist: (A) fetch true unadjusted prices, or (B) adjust K upward to match the post-consolidation price scale. Path A failed: pykrx adjusted=False returns empty DataFrames server-side, and FinanceDataReader also returns split-adjusted prices only. Path B succeeded: DART crDecsn.json provides structured capital reduction data (shares_before, shares_after, effective_date) for each corp_code. Multiplying K by cumulative consolidation factors occurring after CB issuance aligns the two series correctly.
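Path B can be sketched as below, assuming the capital reduction records have already been fetched and parsed into the three fields named above. The dataclass and function names are hypothetical, and the code does not call the DART API; it only shows the factor arithmetic.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable


@dataclass(frozen=True)
class CapitalReduction:
    """One parsed capital reduction record (field names follow the text above)."""
    shares_before: int
    shares_after: int
    effective_date: date


def cumulative_factor(events: Iterable[CapitalReduction], issue_date: date) -> float:
    """Product of consolidation ratios for events effective after CB issuance."""
    factor = 1.0
    for ev in events:
        if ev.effective_date > issue_date:
            # A 10:1 consolidation gives shares_before / shares_after = 10.
            factor *= ev.shares_before / ev.shares_after
    return factor


def adjust_exercise_price(k: float, events: Iterable[CapitalReduction],
                          issue_date: date) -> float:
    """Scale the contractual K onto the same share basis as split-adjusted prices."""
    return k * cumulative_factor(events, issue_date)
```

For example, a CB issued in March 2020 at K = 500 on a stock that later consolidated 10:1 would be compared at K = 5,000, so a split-adjusted price of 6,000 reads as 1.2x moneyness rather than 12x.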

Alternatives considered:

Heuristic factor detection from price-series discontinuities (>100% single-day jumps) — implemented as fallback but less reliable than DART regulatory filings
Moneyness cap at 10x as a proxy — treats the symptom, not the cause; excludes genuine extreme cases alongside artifacts
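The price-discontinuity fallback mentioned in the first alternative might look something like this. It is a sketch under assumptions: the input is a date-sorted list of closes from a series in which consolidation jumps are still visible, the function name is hypothetical, and the 2.0 ratio threshold is simply the ">100% single-day jump" rule restated.

```python
from typing import List, Tuple


def detect_consolidation_candidates(
    closes: List[Tuple[str, float]], jump_threshold: float = 2.0
) -> List[Tuple[str, float]]:
    """Return (date, ratio) for each day whose close exceeds jump_threshold x the prior close.

    closes must be sorted by date ascending. The ratio is only a candidate
    consolidation factor and should be checked against filings.
    """
    candidates = []
    for (_, prev_close), (day, close) in zip(closes, closes[1:]):
        if prev_close > 0 and close / prev_close > jump_threshold:
            candidates.append((day, close / prev_close))
    return candidates
```

A ratio found this way is noisier than a filed shares_before/shares_after pair, which is consistent with the text's preference for DART filings as the primary source.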

Curated CSV as a manual override layer for unresolvable edge cases

Reasoning:

After DART crDecsn.json adjustment, 32 rows across 10 companies remained above 10x moneyness. Three sources of residual extreme cases: (1) consolidations filed before 2015 (crDecsn.json coverage starts 2015), (2) corporate actions filed under different disclosure types not captured by crDecsn.json, (3) genuine deep-ITM KOSDAQ micro-cap issuances. The DART stockTotqySttus.json endpoint — total issued share counts per reporting period — resolved this by detecting ANY share-count-changing event regardless of filing type. Results were stored as manual_k_adjustments.csv and excluded_corp_codes.csv, consumed by the screen script at runtime. This keeps the automated pipeline clean while documenting the manual research decisions explicitly.
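A minimal sketch of how such an override layer could be consumed at runtime is shown below. The column names (corp_code, k_factor) and row keys are assumptions made for illustration; the actual layout of manual_k_adjustments.csv and excluded_corp_codes.csv is not described in this write-up.

```python
import csv
from typing import Dict, Iterable, List, Set, TextIO


def load_manual_factors(fh: TextIO) -> Dict[str, float]:
    """Read corp_code -> manual K factor (assumed columns: corp_code, k_factor)."""
    return {row["corp_code"]: float(row["k_factor"]) for row in csv.DictReader(fh)}


def load_excluded(fh: TextIO) -> Set[str]:
    """Read the set of excluded corp_codes (assumed column: corp_code)."""
    return {row["corp_code"] for row in csv.DictReader(fh)}


def apply_overrides(rows: Iterable[dict], manual_factors: Dict[str, float],
                    excluded: Set[str]) -> List[dict]:
    """Drop excluded issuers, scale K by any manual factor, and recompute moneyness."""
    out = []
    for row in rows:
        code = row["corp_code"]
        if code in excluded:
            continue
        k_adj = row["exercise_price"] * manual_factors.get(code, 1.0)
        out.append({**row,
                    "exercise_price": k_adj,
                    "moneyness": row["stock_price"] / k_adj})
    return out
```

Keeping the overrides in plain CSV files means each manual decision is visible in version control and can be reverted by deleting a row, without touching the automated adjustment code.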

Alternatives considered:

Exclude all >10x rows as artifacts — loses genuine forensic signals (two corp_codes verified via DART stockTotqySttus.json manual review are real)
Pipeline-only approach — crDecsn.json alone misses 6% of the outlier population (pre-2015 and non-standard disclosure types)

Tech Stack

kr-derivatives (Black-Scholes option pricing, CBSpec schema)
pykrx (KRX price and volume data)
scipy.stats.norm (Black-Scholes CDF computation)
DART OpenAPI (crDecsn.json, stockTotqySttus.json, cb_bw_events.parquet)
pandas, numpy
Python ≥3.11, uv

Result & Impact

Issuances scored (final run): 2,934
Flag rate (final): 33.1%
Calibration runs: 4
Denominator-adjusted rows: 944
Confirmed extreme ITM (>10x): 4

The first open-source screen to price KOSDAQ CB/BW dilution using Black-Scholes without proprietary SEIBRO data. The four-run calibration resolved a systematic data quality issue — split-adjusted prices applied against unadjusted exercise prices — that affects any forensic analysis of Korean CB/BW issuances using pykrx. The methodology and the curated correction tables are published under MIT license and fully reproducible.

Learnings

External API failures are design constraints, not incidents. SEIBRO's breakdown forced the Black-Scholes approach, which is analytically preferable: the signal question (was the option fairly priced at issuance?) is answerable from DART and KRX alone. Designing around the dependency produced a better screen than waiting for the dependency to stabilize.
Comparing split-adjusted prices against unadjusted contractual prices is a structural data quality issue for any Korean small-cap forensic analysis. pykrx adjusted=True retroactively scales all historical prices; DART contract prices are never retroactively adjusted. Any analysis comparing a historical price series to a contractual reference price must address this denomination mismatch explicitly.
stockTotqySttus.json is the ground truth for Korean share count history — more reliable than crDecsn.json (capital reduction filings) because it captures ALL share-count-changing events regardless of how they were classified in DART, including splits filed as par-value changes and consolidations filed as restructuring events.
Diminishing returns in iterative calibration are real and predictable. Run 2 left the flag rate unchanged (0pp) but exposed the root cause; Run 3 then cut it by 15.3pp by resolving the systematic mismatch. Run 4 required high effort (277 API calls, multi-hour company-by-company investigation) for a 0.9pp reduction. The remaining 5-10x band does not warrant a Run 5: the structural quality limiters (53.6% sigma fallback, 66% approximate board dates) now dominate, and addressing them requires upstream pipeline changes, not K-adjustment refinement.

The Four-Run Journey

Run	Date	Issuances Scored	Flag Rate	Key Change	Primary Issue Resolved
1	2026-03-15	2,988	49.3%	Baseline: uniform σ=0.40, no K adjustment	Discovered: 527 rows >5× moneyness from split-contaminated prices
2	2026-03-15	2,939	49.3%	Per-ticker historical vol; gap filter for board-to-issue >365d	Root cause identified: adjusted S vs unadjusted K denomination mismatch
3	2026-03-15	2,939	34.0%	K adjustment using DART 감자결정 (crDecsn.json)	864 rows adjusted; >10× rows: 241→32 (−87%)
4	2026-03-16	2,934	33.1%	Manual K adjustments for 10 residual outlier companies	>10× rows: 32→4; remaining 4 confirmed as genuine ITM

Run 2 is the counterintuitive one: adding per-ticker volatility and a stale-reference filter changed the composition of the scored population but not the flag rate. That finding was itself the signal. The flag depends only on moneyness (S/K), so no refinement of sigma could move it; the inflated tail was not a vol computation problem. It was a denomination mismatch between two price series that should never have been compared directly.

The fix was upstream in the data pipeline: adjust K by the product of all share consolidation factors occurring after CB issuance, sourced from DART 감자결정 filings rather than trying to obtain unadjusted price series (which no accessible source provides).

Structural Quality Limitations (Stable Across All Runs)

Two issues did not improve across any run, because they are structural:

53.6% sigma fallback rate. More than half of scored rows use the uniform 40% volatility assumption because their tickers have fewer than 30 days of price history before the board date in price_volume.parquet. KOSDAQ micro-caps — exactly the issuers most likely to issue manipulative CBs — are the least covered in historical price data. This does not affect the ITM flag (moneyness is S/K, independent of sigma) but limits the precision of bs_call_value and underpricing_pct for half the population.
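The fallback rule described above can be sketched as follows. The estimator details are assumptions: this version uses the sample standard deviation of daily log returns over the last 30 closes and annualizes with 252 trading days, which may differ from what the screen actually does. The function name is hypothetical.

```python
import math
import statistics
from typing import Sequence, Tuple

FALLBACK_SIGMA = 0.40        # uniform fallback from the text
MIN_HISTORY_DAYS = 30        # minimum price history before the board date
TRADING_DAYS_PER_YEAR = 252  # assumption: annualization convention


def historical_sigma(closes: Sequence[float]) -> Tuple[float, bool]:
    """Return (sigma, used_fallback) from closes before the board date, oldest first."""
    if len(closes) < MIN_HISTORY_DAYS:
        return FALLBACK_SIGMA, True
    window = closes[-MIN_HISTORY_DAYS:]
    log_returns = [math.log(b / a) for a, b in zip(window, window[1:])]
    sigma = statistics.stdev(log_returns) * math.sqrt(TRADING_DAYS_PER_YEAR)
    if sigma <= 0:
        # A flat price series would give sigma = 0, which Black-Scholes cannot use.
        return FALLBACK_SIGMA, True
    return sigma, False
```

Returning the used_fallback flag alongside sigma lets downstream review separate rows whose option value rests on observed volatility from rows that rest on the 40% assumption.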

66% approximate board dates. When DART does not file a separate board meeting record, the extractor defaults board_date to issue_date. The forensic question is “what was the price when the decision was made?” not “what was the price when the paperwork was filed?” For most CBs the gap is small (median: 0 days), but the 34% with explicit board dates are analytically cleaner and should be weighted accordingly in downstream review.
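One way to carry this distinction into downstream review is to tag each row and sort explicit-date rows first. The sketch below treats board_date equal to issue_date as the marker of a defaulted date, which is a proxy rather than a certainty, and the row keys and function names are hypothetical.

```python
from datetime import date
from typing import Iterable, List


def tag_board_dates(rows: Iterable[dict]) -> List[dict]:
    """Add board_date_approx=True where board_date equals issue_date (assumed default)."""
    return [{**row, "board_date_approx": row["board_date"] == row["issue_date"]}
            for row in rows]


def review_order(rows: Iterable[dict]) -> List[dict]:
    """Flagged rows only, explicit board dates first, then by moneyness descending."""
    flagged = [r for r in tag_board_dates(rows) if r["moneyness"] > 1.0]
    return sorted(flagged, key=lambda r: (r["board_date_approx"], -r["moneyness"]))
```

Sorting is the simplest form of weighting; a numeric weight per row would work equally well if the review process needs one.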

Both limitations are addressable but upstream: sigma requires broader price coverage or alternative data sources; board date precision requires DART sub-document parsing. Neither is a kr-derivatives change.

What the Screen Does Not Do

The 33.1% flag rate is not a fraud rate. KOSDAQ has structural reasons for at-issuance discounts — distressed issuers often attract capital by pricing the conversion option below market; the bondholder takes credit risk, and the discount partially compensates for it. The screen’s output is a priority queue for human review: companies where the moneyness and structural patterns together warrant closer attention. False positives are expected at roughly 40% of the flagged population.

The forensic question the screen answers — was the option priced fairly at issuance? — is necessary but not sufficient for a fraud determination. What comes after the screen is where forensic judgment enters: Does the bondholder have an affiliate relationship with management? Did the company have access to other capital at the time? Was the repricing below the refixing floor disclosed in the prospectus? Those questions require SEIBRO data, officer network analysis, and DART document review. This screen provides the top of that funnel.
