Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[Unreleased] — targeted as 0.6.0
Phase 11 (HWP5 → HWPX silent-gap closure)
This release closes the largest batch of “HWP5 decoder has the bytes, but projection or shared model can’t carry them” gaps measured against truth HWPX fixtures from 한컴 Office. All work is HWP5-leg only — HWPX encode/decode paths were already verified by existing golden tests.
Added — Wave 0 (audit infrastructure)
audit-hwp5warning taxonomy (SilentGap,DroppedControl, …) and--strictmode so silently dropped semantics are surfaced as build signals instead of accumulating as invisible technical debt.
Added — Wave 1 (character style line family + word break)
- Carry
underline(1b) andstrikeout(1c) line family (DOT,DASH,DOT_DASH, etc.) throughHwpxCharShapeinstead of collapsing toSOLID. - Carry
breakLatinWord = HYPHENATION(1d) throughWordBreakTypeand HWP5 projection so HWPXbreakLatinWordis no longer alwaysKEEP_WORD. - Surface silent
shadowdecode gap as warning (1a) until carry lands.
Added — Wave 2 (paragraph layout fidelity)
- Carry
lineSpacingType = AtLeast(2a). - Verify all 6 alignment variants (2b),
indent+pageBreakBefore(2cd), andborder+shading(2e) carry against truth HWPX fixtures.
Added — Wave 3 (paragraph-level checked decode)
- HWP5 paragraph-level
paraPr.checkeddecode so the third location of checkable-bullet truth (the per-item checked state) is no longer lost. Closes the legacy R1 line.
Added — Wave 4 (Field / Object parity)
- Wave 4a: carry HWP5
Rectcontrol through Core → HWPX. - Wave 4b: carry HWP5 footnote control through Core → HWPX.
- Wave 4c: carry HWP5 chart (OLE-backed BinData) end-to-end as
Control::EmbeddedChartpassthrough — emitsChart/chartN.xml+BinData/oleN.ole+<hp:switch>block. Closes theDroppedControl:ole_objectmeasurement. - Carry HWP5 fixed-width space and non-breaking space through Core to HWPX inline text.
- Carry HWP5 checkable bullet
checkedChar+paraHead.checkablethrough bullet conversion. - Tab fidelity end-to-end: carry inline
<hp:tab width / leader / type>attributes through a newRunContent::InlineTextvariant; HWPX encoder/decoder updated symmetrically; fill-type → leader mapping rebuilt against openhwp truth. See debug doc.docs/debug/2026-05-26_tab_fidelity_bugs.mdand.docs/debug/2026-05-27_hwpx_decoder_inline_tab_attrs_lost.md. - Field controls: emit non-zero
fieldidforCROSSREFand keepfieldBegin idwithin signed 32-bit range.
Added — Wave 5 (page-level features)
- Wave 5 gap A: carry per-ctrl
applyPageType(BOTH/ODD/EVEN) through HWP5head/footctrl property word into multiple<hp:header>/<hp:footer>elements. Verified againstsample-header-footer-odd-eventruth fixture. - Wave 5 gap B: carry
secdctrl property bits (0/1/2/5/19) intoSection.visibility.hide_first_header / footer / page_num / empty_line / master_page. Verified againstsample-header-footer-hide-firsttruth fixture. Newcrates/hwpforge-smithy-hwp5/examples/probe_secd.rsreusable probe. - Wave 5 gap C (
masterPagecarry): deferred. Diagnosed as fixture-asymmetric on macOS 한컴 (truth HWPX hasmasterpage0.xmlbut the paired HWP5 has no master-page sub-records). See debug doc.docs/debug/2026-05-27_hwp5_page_features_lost.mdfor full probe results. Resume when a PC-한컴 fixture is available.
Fixed — Wave 6 (corpus-driven conversion robustness)
Measured by extending the audit-hwp5 signal source from synthetic
fixtures to the real government-document corpus and clustering the
pre-categorized conversion failures. The two real bugs recovered all
29 hwp5_convert_failed documents (plus one more from 탈락); the
remaining failures are inputs that are genuinely not HWP5.
- Schemeless hyperlink URLs (e.g.
www.motie.go.kr) are normalized tohttp://instead of aborting the whole conversion. The HWPX encoder previously rejected any URL outside thehttp:///https:///mailto:allowlist; explicit unsafe schemes (javascript:,data:,file:, …) are still rejected. (16 corpus docs) - Non-leading table header rows are demoted to normal rows (with a
warning) in HWP5 projection instead of failing Core validation with
NonLeadingTableHeaderRow. Real 한글 tables sometimes restate a header row mid-table; the leading header block is preserved and the stray header row is demoted so the document still converts. (14 corpus docs) .hwpinputs that are actually ZIP (PK.., i.e. an HWPX saved with a.hwpextension) or a Hancom secured/DRM container (SCDS..) now fail with an actionable message instead of a raw CFB byte dump.
Changed — Breaking
- ADR-002:
hwpforge_core::Section.header: Option<HeaderFooter>→Section.headers: Vec<HeaderFooter>(andfooter→footers). This changes the publicSectionshape so that multiple page-type-scoped headers (ODD/EVEN/BOTH) can coexist as required by HWPX wire format. EmptyVecmeans “no header”. JSON dump now emits a list instead of an object (or omits when empty). Patch / MD encoder / CLI / MCP consumers updated to iterate the slot. See.docs/architecture/adr/ADR-002-section-multi-header-footer-cardinality.md.
Added — Public API
hwpforge_core::RunContent::InlineText(InlineText)variant for mixed-content runs that include<hp:tab>(and is forward-compatible with<hp:lineBreak>/<hp:nbSpace>/<hp:fwSpace>).InlineText/InlineSegment/InlineTabAttrare#[non_exhaustive].Sectionconstructors continue to benew()/with_paragraphs(); push headers/footers viasection.headers.push(HeaderFooter::…).
Migration
- Replace
section.header = Some(hf)withsection.headers.push(hf). - Replace
section.header.as_ref()withsection.headers.first()(single-slot consumers) or iteratesection.headers(general). serdeshape now usesheaders/footersarrays; persisted JSON using the old keys must be migrated.- Match arms on
RunContentmay need a newInlineText(_)arm. UseRunContent::carries_text()/plain_text()for read-only consumers that don’t care about inline attribute fidelity.
[0.5.2] - 2026-05-13
Added
hwpforgeCLI gainsto-mdlossy / lossless modes for HWPX → Markdown export choice.- HWP5 → HWPX projection preserves field controls and checkable state (precursor to Phase 11 Wave 3/4 closure).
Fixed
- Dependency hygiene updates (
sha2, GitHub Actions pinning, suppressRUSTSEC-2026-0097non-applicable advisory).
[0.5.1] - 2026-04-13
Added
- HWP5 → HWPX style fidelity bridge improvements (more char/para style surface preserved end-to-end).
- HWPX char effects: preserve
emboss,engrave,superscript,subscript(also covered later under Wave 1 audit).
Fixed
- Warn on conflicting vertical-position bits instead of silently normalizing.
- HWP5 paragraph layout hints (linesegarray, safe table height) carried to HWPX so visual diff matches truth better.
convert-hwp5/audit-hwp5/inspectsummary share the same style-projection warning source.
[0.5.0] - 2026-03-20
Changed — Breaking
- Adopt shared
ordered/bullet/outlinelist semantics acrosscore,blueprint, andsmithy-hwpx. Markdown bridge integrated. - Add checkable bullet semantics (HWPX
heading(type="BULLET")+bullet.checkedChar+bullet.paraHead.checkable+paraPr.checked). Markdown task lists normalize to this shared HWP semantic; ordered task lists intentionally drop numbering on the way in. - Tighten bullet semantics: paragraph
heading_levelis no longer a catch-all for list semantics (see gotcha #7 inCLAUDE.md).
Added
- Markdown bridge: preserve task list continuation paragraphs; normalize ordered task lists.
- Fixtures: reorganized under
examples/andtests/fixtures/.
Fixed
- HWPX style id bridging for registry-local style ids.
- Outline contract hardening (golden tests).
[0.4.0] - 2026-03-20
Changed
- Promote the workspace release line to
0.4.0for the breaking tab semantics contract inhwpforge-coreandhwpforge-blueprint. - Add shared tab-stop semantics across the IR stack so HWPX/HWP5 codecs can preserve explicit tab definitions and paragraph tab references.
Migration
hwpforge_core::TabDefnow includes explicitstops; downstream struct literals must initialize the new field.hwpforge_blueprint::Template,ParaShape, andPartialParaShapenow carry tab definition references/collections directly.- Consumers matching on
BlueprintErrorCodeshould handle the new tab-related error codes explicitly.
[0.3.0] - 2026-03-19
Changed
- Promote the workspace release line to
0.3.0for the breaking HWPX section editing contract update. - Preserve-first section editing now requires preservation metadata on
ExportedSectionand rejects stale or legacy section exports explicitly.
[0.2.0] - 2026-03-17
Changed
- Adopt the
hwpforge-corev0.2.0 public DOM contract for richer table and image semantics. - Align the workspace release line and internal crate pins on
0.2.0.
Migration
Table,TableRow,TableCell, andImageare now#[non_exhaustive]and should be constructed vianew/with_*builders instead of struct literals.- Table DOM now carries page-break, repeat-header, cell-spacing, border/fill, header-row, cell margin, and vertical-alignment semantics directly in
hwpforge-core. - Image DOM now carries placement metadata directly in
hwpforge-core. - Validation now exposes
CoreErrorCode::NonLeadingTableHeaderRow; downstream code that inspects validation codes should handle it explicitly.
0.1.0 - 2026-03-06
Added
- hwpforge: Umbrella crate with feature flags (
hwpx,md,full) - hwpforge-foundation: Primitive types (HwpUnit, Color BGR, branded Index
, enums, error codes) - hwpforge-core: Format-independent document model with typestate validation (Draft/Validated)
- Document, Section, Paragraph, Run, Table, Image
- Controls: TextBox, Footnote, Endnote, Equation, Chart (18 types)
- Shapes: Line, Ellipse, Polygon, Arc, Curve, ConnectLine
- References: Bookmark, CrossRef, Field, Memo, IndexMark
- Layout: Multi-column, captions, headers/footers, page numbers, master pages
- Annotations: Dutmal, compose characters
- hwpforge-blueprint: YAML-based style template system
- Template inheritance with DFS merge
- StyleRegistry with deduplicated fonts, char shapes, para shapes
- Built-in default template (Hancom 한컴바탕)
- BorderFill support
- hwpforge-smithy-hwpx: Full HWPX codec (KS X 6101)
- Decoder: HWPX ZIP+XML -> Core Document
- Encoder: Core Document -> HWPX ZIP+XML
- Lossless roundtrip for all supported content
- HancomStyleSet support (Classic/Modern/Latest)
- 22 default styles with per-style charPr/paraPr
- ZIP bomb defense (50MB/500MB/10k limits)
- OOXML chart generation (18 chart types)
- Golden fixture tests with real Hancom 한글 files
- hwpforge-smithy-md: Markdown codec
- GFM decoder (pulldown-cmark) with YAML frontmatter
- Lossy encoder (readable GFM) and lossless encoder (HTML+YAML)
- Full pipeline: MD -> Core -> HWPX verified in Hancom 한글