Ongoing

Korean Apartment Transaction Anomaly Screen

Builder · 2026

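A statistical screen that uses only public MOLIT (Ministry of Land, Infrastructure and Transport, 국토부) data to flag apartment buildings where clusters of cancelled transactions warrant a closer look for possible price manipulation. Demo: 강남구 (Gangnam-gu), 2024, with 3,754 transactions analyzed and 33 buildings flagged. Sector-level findings were cross-validated against public investigation records from the FSS (Financial Supervisory Service) and 국토부.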

Overview

A forensic screen for Korean apartment transaction records that identifies statistical patterns associated with 부동산 가격 띄우기 (real estate price pumping): coordinated price inflation via record-high transactions that are filed and then cancelled before transfer tax falls due. The tool consumes the national MOLIT 실거래가 (actual transaction price) API and scores buildings on three signals: cancellation cluster rate, 신고가 (record-high price) flag, and 법인-매수 (corporate purchase) concentration. The output is a district-normalized building ranking that lists, for each building, the cancelled transactions in detail, the 보유일수 (holding days) exploitation window, and a comparison against the district cancellation rate. The legal positioning is deliberate: the tool flags buildings that warrant a second look; it does not characterize transactions as manipulated.

Problem

Korean real estate price manipulation leaves systematic traces in the public transaction record. Under the MOLIT reporting system, every apartment transaction must be registered, including transactions that are later cancelled, which are marked by the 해제여부 (cancellation status) field. A coordinated scheme records a building-high price to inflate the district reference value, then cancels the transaction before transfer tax assessment, so the price is reported but never actually paid. No market-facing tool existed to screen for this pattern before purchase. Government enforcement (국토부 investigations, FSS audits) is retrospective and internal; this product serves a different use case: prevention for market participants.

Constraints

  • Legal positioning must stay strictly within 변호사법 (Attorney-at-Law Act) §109: the tool cannot assert that any transaction is manipulated, cannot recommend legal action, and cannot advise whether to buy or sell. Every output is framed as 'flags for further review,' not a determination.
  • The MOLIT API returns raw XML with underdocumented field semantics: the 해제여부 (cancellation status) flag, 거래유형 (transaction type), and price fields all had to be reverse-engineered from live responses and cross-referenced against KAR public records.
  • The p90 신고가 threshold must be computed per building × trailing window, not per district: a district-level p90 produces too many false positives in rapidly appreciating submarkets.
  • HouseMuch's existing patent covers AVM (automated valuation model) accuracy (시세 정확도), not manipulation detection, so there is no IP conflict, but this had to be verified before the build.
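
The XML handling mentioned in the second constraint might look roughly like the sketch below. It is an illustration under stated assumptions, not the project's parser: the payload is a hand-written sample, the tag names 아파트 and 거래금액, the "O" cancellation marker, and the 만원 price unit are assumptions about the response format, and `parse_items` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written sample payload for illustration only. Tag names other than
# 해제여부 and 거래유형, and the "O" cancellation marker, are assumptions.
SAMPLE_XML = """<response><body><items>
  <item><아파트>Sample A</아파트><거래금액> 250,000</거래금액><거래유형>중개거래</거래유형><해제여부>O</해제여부></item>
  <item><아파트>Sample B</아파트><거래금액>180,000</거래금액><거래유형>직거래</거래유형><해제여부> </해제여부></item>
</items></body></response>"""


def parse_items(xml_text: str) -> list[dict]:
    """Turn each <item> element into a dict with normalised fields."""
    rows = []
    for item in ET.fromstring(xml_text).iter("item"):
        # Collect child tags into a flat dict, trimming stray whitespace.
        raw = {child.tag: (child.text or "").strip() for child in item}
        rows.append({
            "building": raw.get("아파트", ""),
            # Prices are assumed to arrive as comma-grouped strings in 만원.
            "price_manwon": int((raw.get("거래금액") or "0").replace(",", "")),
            "deal_type": raw.get("거래유형", ""),
            # Assumed convention: a non-empty flag means the deal was cancelled.
            "cancelled": raw.get("해제여부", "") != "",
        })
    return rows


records = parse_items(SAMPLE_XML)
```

Normalising the cancellation flag to a boolean at parse time keeps the later scoring steps independent of whatever marker the live API actually uses.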

Approach

Three-signal composite scoring, all from the public MOLIT 실거래가 API:

  • Signal 1, cancellation cluster rate: the proportion of registered transactions at a building that were subsequently cancelled within a trailing window, measured against the district baseline.
  • Signal 2, 신고가 flag: transactions reported at or above the p90 price for that building in the trailing period.
  • Signal 3, 법인-매수 concentration: the share of purchases attributed to corporate buyers, which correlates with coordinated schemes.

Each signal is scored 0–45 points, so the composite maximum is 135. Scores are district-normalized so the ranking reflects local market context. The demo output includes per-cancelled-transaction detail (보유일수, cancellation timing relative to 신고가) and the district's aggregate cancellation rate as a footnote.
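
The scoring arithmetic described above could be sketched as follows. Only the 0–45 range per signal and the 135 maximum come from the text; the linear scaling, the saturation caps (20 percentage points, 30%, 50%), and the names `BuildingSignals`, `scale`, and `composite_score` are assumptions made for illustration, since the actual mapping from raw signal to points is not stated.

```python
from dataclasses import dataclass

MAX_POINTS_PER_SIGNAL = 45  # from the text: three signals, 0-45 each, 135 total


@dataclass
class BuildingSignals:
    cancel_rate: float           # cancelled / registered transactions at the building
    district_cancel_rate: float  # the same ratio for the whole district
    record_high_share: float     # share of transactions at or above the building p90
    corporate_share: float       # share of purchases by corporate buyers


def scale(value: float, cap: float) -> float:
    """Map a raw value linearly onto 0-45 points, saturating at `cap` (assumed)."""
    if cap <= 0:
        return 0.0
    return MAX_POINTS_PER_SIGNAL * min(max(value / cap, 0.0), 1.0)


def composite_score(s: BuildingSignals) -> float:
    # Signal 1: excess cancellation rate over the district baseline.
    # Hypothetical cap: an excess of 20 percentage points saturates the signal.
    excess = s.cancel_rate - s.district_cancel_rate
    cancel_points = scale(excess, cap=0.20)
    # Signals 2 and 3: hypothetical saturation caps of 30% and 50%.
    record_points = scale(s.record_high_share, cap=0.30)
    corporate_points = scale(s.corporate_share, cap=0.50)
    return cancel_points + record_points + corporate_points


example = BuildingSignals(cancel_rate=0.15, district_cancel_rate=0.05,
                          record_high_share=0.15, corporate_share=0.25)
score = composite_score(example)
```

Subtracting the district rate before scaling is one simple way to express "measured against the district baseline"; a ratio or z-score would be equally plausible readings.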

Key Decisions

Cancellation cluster as primary signal over raw price deviation

Reasoning:

Genuine record-high prices occur constantly in appreciating districts, so on their own they are not anomalous. The anomalous event is a cancellation that follows a record-high price. Requiring both the 신고가 flag and a subsequent cancellation within the trailing window reduces false positives while preserving sensitivity to the specific manipulation pattern.
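
A minimal pandas sketch of that joint condition is shown below, assuming a table with hypothetical columns `building`, `date`, `price`, and `cancelled`. The 365-day window, the choice to compute the p90 from prior transactions only (`closed="left"`), and the helper name `flag_building` are assumptions for illustration; the sample rows are made up.

```python
import pandas as pd

# Made-up sample rows; column names are hypothetical.
df = pd.DataFrame({
    "building": ["A"] * 6 + ["B"] * 4,
    "date": pd.to_datetime([
        "2024-01-10", "2024-02-10", "2024-03-10", "2024-04-10", "2024-05-10",
        "2024-06-10", "2024-01-15", "2024-03-15", "2024-05-15", "2024-06-15",
    ]),
    "price": [100.0, 102.0, 101.0, 103.0, 130.0, 104.0, 200.0, 205.0, 210.0, 190.0],
    "cancelled": [False, False, False, False, True, False, False, False, False, True],
})


def flag_building(group: pd.DataFrame, window: str = "365D") -> pd.DataFrame:
    """Flag transactions that are both record-high and cancelled, per building."""
    g = group.sort_values("date").copy()
    prices = g.set_index("date")["price"]
    # Trailing p90 over earlier transactions only; closed="left" excludes the
    # current row, so the first transaction has no threshold (NaN).
    p90 = prices.rolling(window, closed="left").quantile(0.9)
    g["p90_trailing"] = p90.to_numpy()
    # Comparisons against NaN evaluate to False, so rows without history
    # are never marked as record-high.
    g["record_high"] = g["price"] >= g["p90_trailing"]
    g["flagged"] = g["record_high"] & g["cancelled"]
    return g


result = pd.concat([flag_building(g) for _, g in df.groupby("building")])
hits = result[result["flagged"]]
```

In this sample, building B's cancelled transaction is not flagged because its price sits below that building's trailing p90, while several uncancelled record-high prices are also left alone; only the combination counts.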

Alternatives considered:
  • Price-deviation-only scoring — high false positive rate in strong submarkets; misses the coordination signal entirely
  • 법인-매수 only — under-captures individual-seller schemes and misses the cancel-refile pattern

Public MOLIT API only — no commercial data sources

Reasoning:

Using only public data makes the methodology reproducible, auditable, and explainable to any institutional buyer without requiring proprietary data access agreements. The government investigation records (FSS, 국토부) used to cross-validate the demo findings are themselves public, which creates an independent validation path.

Alternatives considered:
  • Commercial transaction databases (e.g. 부동산R114) — adds data richness but creates license dependency and blocks reproducibility claims
  • KB시세 for 신고가 baseline — paid API, creates dependency on KB data agreement

'Flags for second look' positioning, not 'detects manipulation'

Reasoning:

Statistical signals cannot determine intent. A building with a high cancellation rate may have legitimate explanations (buyer financing failures, developer renegotiations). Positioning the output as a prioritized list for human review, not a determination, is both legally safer under 변호사법 §109 and analytically honest.

Alternatives considered:
  • 'Manipulation detection' framing — triggers 변호사법 §109 concerns and overstates what statistical signals can establish

Tech Stack

  • Python 3.11
  • MOLIT 실거래가 OpenAPI (국토부, data.go.kr)
  • pandas
  • pytest

Result & Impact

  • 3,754
    Transactions analyzed (demo)
  • 33
    Buildings flagged (demo)
  • 81 / 135
    Top composite score
  • Confirmed (sector-level)
    Government cross-validation

The demo (강남구, Gangnam-gu, 2024) flags 33 buildings ranked by composite anomaly score; the top-scored building reached 81/135. The findings were cross-validated at sector level against public government records: an FSS audit (616 / 10,640 suspicious mortgages; ₩380억, about KRW 38 billion, adjudicated) and the 국토부 2025 investigation, which confirmed 161 violations and 10 police referrals across the sector. The comparison indicates that the cancellation-cluster signal picks up the same category of activity that government investigators target, using only public data.

Learnings

  • The enforcement vs. prevention distinction is the entire commercial rationale. Government enforcement is retrospective and internal — it identifies violations after damage is done and keeps findings non-public. A market-facing prevention tool serves buyers and their advisors before a transaction closes, which is a different use case with a different buyer entirely.
  • The cancellation field (해제여부) in the MOLIT API is more forensically valuable than any price field. A price can be set to anything by agreement; a cancellation is a discrete event that the API records independently of the parties' stated intentions.
  • Cross-validating demo findings against public government investigation records before any outreach is the right order of operations: it converts a statistical flag into an independently verified finding, which is a materially different claim.
  • The prior ABANDON verdict (2026-03-27) rested on two wrong assumptions: that detection requires non-public data, and that the government already does this for free. Both were overturned. Wrong assumptions about data availability and competitive dynamics are the most common reason good research ideas get abandoned prematurely.