Ongoing

Korean Lease Risk Engine

Builder · 2026 · 7 min read

V0 CLI proof-of-concept: resolves a Korean rental address to a district-level distress band using 16 years of court registry data, and places the district on the 17-month 임차권 (lease-right registration) → 강제경매 (forced auction) cascade timeline. Not calibrated for paying-tenant decisions; property-level analysis requires V0.5.

Overview

A CLI-first tenant lease risk engine for Korean residential rentals: 전세 (jeonse, lump-sum deposit leases) and 월세 (wolse, monthly rent). The tool takes an address and deposit amount, resolves the address to a 시군구 (city/county/district) via the 행정안전부 (Ministry of the Interior and Safety) 법정동 master codebook, queries the district-credit-risk court registry database read-only, and emits five structured risk signals: district composite distress z-score, district distress stage (1–5), lease-lien z-score, position on the 17-month cascade timeline, and the five-year 전세→월세 shift trend for the district. Output is structured JSON plus a markdown risk report with statutory disclaimers. V0 operates at the district level only. Property-level analysis, covering LTV, 신탁 (trust) and joint mortgage detection, requires parsing the 등기부등본 (property title register) PDF, which is the V0.5 gate.
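
The five signals listed above could be carried in a small typed record before being serialised to JSON. The following is a minimal sketch under stated assumptions: the class name, field names, the unit of the trend field, and the example values are all hypothetical and are not taken from the project.

```python
import json
from dataclasses import asdict, dataclass

VALID_POSITIONS = {"pre-window", "in-window", "post-window"}


@dataclass(frozen=True)
class DistrictRiskSignals:
    """One record per query. All field names here are illustrative assumptions."""

    district_code: str               # 시군구 code of the resolved district
    composite_distress_z: float      # district composite distress z-score
    distress_stage: int              # district distress stage, 1 to 5
    lease_lien_z: float              # lease-lien z-score
    cascade_position: str            # position on the 17-month cascade timeline
    jeonse_to_wolse_trend_5y: float  # five-year shift trend (unit is assumed)

    def __post_init__(self) -> None:
        # Reject values outside the ranges the write-up describes.
        if not 1 <= self.distress_stage <= 5:
            raise ValueError("distress_stage must be between 1 and 5")
        if self.cascade_position not in VALID_POSITIONS:
            raise ValueError(f"unknown cascade_position: {self.cascade_position!r}")

    def to_json(self) -> str:
        # ensure_ascii=False keeps any Korean text readable in the output.
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)


if __name__ == "__main__":
    # Placeholder values for illustration only, not real district data.
    example = DistrictRiskSignals("00000", 1.2, 3, 0.8, "in-window", -2.5)
    print(example.to_json())
```

Validating in `__post_init__` keeps malformed records from reaching the report renderer, which matters when the JSON is the contract between the CLI and any downstream consumer.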

Problem

Korean tenants signing 전세 (jeonse) leases commit deposits of tens to hundreds of millions of won with limited structured tools for pre-signing due diligence. Existing consumer tools focus on market price comparison (KB시세, 부동산114) — none provide a district-level view of the court registry distress signals that predict whether a deposit is likely to be recoverable if the landlord defaults. The 전세사기 crisis (36,950 confirmed victims as of late 2024) demonstrated that the risk was invisible in price data but visible in the registry data: 임차권 filings spike 17 months before the forced auctions that strand deposits. A tool that surfaces the registry signal before a tenant commits gives them information that market prices cannot provide.

Constraints

  • No free programmatic API exists for 등기부등본 (individual property title registry) — the document required for property-level LTV and 신탁 detection must be supplied by the user as a PDF, which gates V0.5
  • Within-district z-score normalisation systematically under-ranks chronically distressed districts: 인천 미추홀구 (the canonical 전세사기 epicenter) ranks 50th of 228 at amber/stage 3 in V0 — it should rank in the top 10 at red/stage 5. The cause is that the district's own history becomes the baseline, so persistent high distress reads as 'normal for that district.' This is a known V0 limitation that V0.5 must fix with a cross-district baseline.
  • Legal positioning must comply with 변호사법 §109 — the tool cannot characterise contract clauses as lawful or unlawful, cannot recommend signing or refusing a contract, and cannot give legal advice. Every output carries a statutory disclaimer.
  • The engine reads district-credit-risk's iros.duckdb directly — it has no independent data pipeline and cannot run unless the parent project's database is present and current

Approach

V0 reuses the district-credit-risk DuckDB as a read-only data source rather than building an independent collection pipeline. Address resolution maps the user-supplied address string against the 행정안전부 법정동 master (49,861 rows) to extract the 10-digit 법정동 code, then cross-references the IROS 시군구 code table (228 rows) to identify the district. District signals are computed as rolling z-scores over the trailing 12 months, measured against the district's own history. The 17-month cascade position is computed by detecting the most recent 임차권 onset spike in the district and calculating the months elapsed since it. The result is one of three values: pre-window (spike not yet detected), in-window (1–17 months elapsed since the spike), or post-window (more than 17 months elapsed). Legal compliance is enforced through a dedicated disclaimers module that prepends the 변호사법 §109 safe-harbour framing to every output.

Key Decisions

Read-only consumption of district-credit-risk database rather than independent pipeline

Reasoning:

Building a second IROS collection pipeline for lease-risk-engine would duplicate 222,903 rows of data collection, 74 tests' worth of pipeline validation, and 5 API keys' worth of rate-limit budget. The district-level signals the engine needs are already computed and maintained by district-credit-risk's monthly update cycle. Reading that database read-only adds no collection cost, and the data stays current automatically.

Alternatives considered:
  • Independent IROS pipeline — doubles infrastructure, doubles maintenance, identical data
  • Embedded district lookup tables (static snapshots) — goes stale immediately and loses the monthly update

District-level scope only for V0, not property-level

Reasoning:

Property-level analysis (LTV, 신탁, joint mortgage) requires 등기부등본 data. The only way to access this for an arbitrary address is the user-supplied PDF path — there is no free programmatic API. Blocking V0 on that capability would have prevented shipping anything. District-level signals are independently valuable: they answer 'is this neighbourhood deteriorating?' even when property-level data is unavailable.

Alternatives considered:
  • Wait for V0.5 to ship anything — delays a working tool indefinitely waiting for a dependency with no resolution timeline
  • Use KB시세 for property-level estimation — requires a commercial data license incompatible with the open-data positioning

부동산 확인 서비스 (property verification service) legal classification

Reasoning:

Providing district-level risk signals from public registry data is factual analysis, not legal advice. The 변호사법 §109 boundary is crossed by characterising contract clauses as lawful or unlawful, or by directing someone to sign or refuse. The engine stays strictly on the factual side: it reports what the registry data shows, not what the tenant should do. This framing is documented in full in the transaction-assurance legal framework memo.

Alternatives considered:
  • Position as a legal advisory tool — triggers 변호사법 §109 and requires attorney involvement in every output
  • No disclaimer — unacceptable liability exposure if a tenant relies on the output and suffers a loss
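
A disclaimer step of the kind described above could be as small as the following sketch. The function name and the disclaimer wording are placeholders written for illustration; the wording paraphrases this write-up and is not the tool's actual statutory text.

```python
# Placeholder wording only; the real statutory disclaimer is not reproduced here.
DISCLAIMER = (
    "This report summarises public registry data. It is not legal advice, "
    "does not characterise any contract clause as lawful or unlawful, "
    "and does not direct anyone to sign or refuse a lease."
)


def with_disclaimer(report_body: str, disclaimer: str = DISCLAIMER) -> str:
    """Return the report with the disclaimer prepended exactly once."""
    if not disclaimer.strip():
        raise ValueError("disclaimer must not be empty")
    if report_body.startswith(disclaimer):
        # Already prepended; avoid duplicating it on repeated rendering.
        return report_body
    return f"{disclaimer}\n\n{report_body}"


if __name__ == "__main__":
    print(with_disclaimer("District stage: 3 (amber)"))
```

Routing every output path through one function like this makes the "every output carries a disclaimer" rule testable in a single place.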

Tech Stack

  • Python 3.11
  • DuckDB (read-only, from district-credit-risk)
  • pandas
  • Jinja2 (markdown report rendering)
  • 행정안전부 법정동 master codebook (49,861 rows)
  • IROS 시군구 code table (228 rows)

Result & Impact

  • 228
    Districts covered
  • 0 / 228
    Address resolution errors
  • 5
    Signals emitted
  • 변호사법 §109 compliant
    Legal framework

V0 is a working proof-of-concept submitted as part of the 한국부동산원 Housing-Rights Idea Competition (2026-05-08). It demonstrates that district-level 전세 risk signals can be surfaced from public court registry data alone, without commercial data sources. The known chronic-distress ranking bug (인천 미추홀구 at 50/228 instead of top-10) is documented and will be corrected in V0.5 via cross-district baseline normalisation.

Learnings

  • V0 is worth shipping even with a known limitation if the limitation is documented and the use case is scoped to exclude the failure mode. V0 is correct for 'is this a deteriorating district?' — it is wrong for 'rank all 228 districts by absolute distress level.' Publishing V0 with the scope clearly stated is more useful than waiting for V0.5.
  • Within-district z-score normalisation is conceptually straightforward but produces structurally incorrect rankings for chronically distressed districts. The fix (cross-district baseline) requires a design change, not a bug patch — it changes what the z-score means.
  • Reading another project's database as a dependency is clean only when the upstream project has rigorous data quality guarantees. District-credit-risk's 74 tests and idempotent pipeline make this safe; a less tested upstream would make the dependency fragile.
  • The 변호사법 §109 boundary is more tractable than it appears. Factual analysis of public registry data is clearly on the safe side of the line; the constraints are specific (no clause characterisation, no sign/refuse directive). Designing within those constraints from the start is much easier than retrofitting compliance.

V0 scope and limitations

V0 is a district-level tool. It answers: is this neighbourhood showing court registry distress signals right now, and where does it sit on the 17-month cascade timeline?

It does not answer: is this specific property safe to rent? That requires property-level 등기부등본 analysis — V0.5, planned but not yet built.

Known V0 limitation: Within-district z-score normalisation under-ranks chronically distressed districts. A district that has been in persistent distress for years (e.g. 인천 미추홀구) will read as “moderate” because its own history has become the baseline. V0.5 introduces cross-district baseline normalisation to correct this. Until then, treat the district ranking as directional, not absolute.
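
The under-ranking effect can be reproduced with a toy example. The sketch below uses made-up numbers and simplifies the scoring to "latest value versus prior values" rather than a rolling 12-month window. The pooled baseline is only one possible reading of "cross-district baseline"; the actual V0.5 design is not specified here.

```python
from statistics import mean, pstdev


def zscore(value: float, baseline: list[float]) -> float:
    """Standard score of value against a baseline sample."""
    sigma = pstdev(baseline)
    if sigma == 0:
        raise ValueError("baseline has zero variance")
    return (value - mean(baseline)) / sigma


# Made-up filing rates per district, oldest first; the last entry is the latest month.
series = {
    "chronic": [40, 42, 41, 43, 42],  # persistently high distress
    "rising": [4, 5, 4, 5, 9],        # low baseline, recent jump
    "quiet": [3, 4, 3, 4, 3],         # low and stable
}

# Within-district: each district is scored against its own history.
within = {name: zscore(v[-1], v[:-1]) for name, v in series.items()}

# Cross-district (assumed form): every district is scored against pooled history.
pooled = [x for v in series.values() for x in v[:-1]]
cross = {name: zscore(v[-1], pooled) for name, v in series.items()}

within_rank = sorted(within, key=within.get, reverse=True)
cross_rank = sorted(cross, key=cross.get, reverse=True)

if __name__ == "__main__":
    print("within-district ranking:", within_rank)
    print("cross-district ranking: ", cross_rank)
```

Under within-district scoring the chronically distressed district ranks below the one with a recent jump, because its high level is its own norm. Against the pooled baseline it ranks first. The two scores answer different questions, which is why the ranking should be read as directional until the baseline changes.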

This tool is a 부동산 확인 서비스 (property verification service) using public registry data. It does not provide legal advice, does not characterise any contract clause as lawful or unlawful, and does not direct users to sign or refuse a lease. Output carries the statutory disclaimer required under 변호사법 §109.