
Semantic Integrity — Case Study Library

3 Levels of Semantic Validation in Patent Translation

Neural machine translation (NMT) optimizes for statistical fluency, not legal accuracy. These case studies document the systematic failure modes that alter patent scope — and the alignment protocols that correct them.

Level 1

Term-Level Precision

Domain-Specific Terminology & Hallucination Prevention

Generic NMT models favor high-frequency wording over domain-correct terminology, producing plausible-sounding but technically wrong translations. With polysemous terms — "current" as water vs. electricity, "soft" as texture vs. mechanical compliance — models default to the statistically dominant sense, introducing semantic errors that alter patent scope.
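The correction at this level can be sketched as a glossary-enforcement pass: for each polysemous source term, the domain-correct target rendering must appear in the candidate translation. This is a minimal illustrative sketch — the glossary entries, domains, and function names here are hypothetical, not the actual validation tooling.

```python
# Hypothetical domain glossary: source term -> {domain: required target term}.
# Entries are illustrative examples, not a real terminology database.
POLYSEMY_GLOSSARY = {
    "spring": {"mechanical": "ressort", "hydrology": "source"},
    "soft": {"robotics": "souple", "textiles": "doux"},
}

def validate_terms(source_terms, translation, domain):
    """Flag polysemous source terms whose domain-correct rendering
    is missing from the candidate translation."""
    violations = []
    for term in source_terms:
        expected = POLYSEMY_GLOSSARY.get(term, {}).get(domain)
        if expected and expected not in translation.lower():
            violations.append((term, expected))
    return violations
```

A failed check surfaces exactly the statistically-dominant-meaning errors described above: a "spring" rendered as "source" in a mechanical claim is flagged before filing.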

13 cases · Semiconductors · Photonics · Medical · Robotics · Telecom
View Case Studies →
Level 2

Phrase-Level Accuracy

Multi-Word Technical Expressions

Technical terms often function as indivisible semantic units that lose meaning under word-by-word translation. Generic models break compound expressions into components and translate them independently, destroying the technical relationship that defines the concept. A "gate oxide layer" is a unified semiconductor concept, not three words that can be recombined arbitrarily.
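The validation principle here is to treat compound expressions as atomic units: the approved multi-word target rendering must appear intact, never recombined from independently translated components. A minimal sketch, assuming a hypothetical compound glossary (the entries below are illustrative):

```python
# Hypothetical glossary of compound terms that must survive as units.
COMPOUND_GLOSSARY = {
    "gate oxide layer": "couche d'oxyde de grille",
    "drug-eluting stent": "stent à élution de médicament",
}

def check_compounds(source_text, translation):
    """Return compounds present in the source whose approved target
    rendering does not appear intact in the translation."""
    broken = []
    for src, tgt in COMPOUND_GLOSSARY.items():
        if src in source_text.lower() and tgt not in translation.lower():
            broken.append((src, tgt))
    return broken
```

A word-by-word recombination such as "couche de grille en oxyde" fails the check, because the unified concept — not its parts — is what the claim protects.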

2 cases · Medical Devices · Surgical Instruments
View Case Studies →
Level 3

In-Context Consistency

Document-Wide Terminological Stability

NMT models have no document-level memory — each sentence is translated with little or no awareness of the rest of the document. In multi-claim patent documents, this produces catastrophic term drift: a single component can accumulate three different French renderings across 15 claims. Patent examiners read such variation as intentional claim differentiation, triggering indefiniteness rejections under 35 U.S.C. § 112(b).
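The consistency audit at this level reduces to collecting every target rendering of each source term across the claim set and flagging any term with more than one. A minimal sketch, assuming (hypothetically) that each claim is available as a list of aligned (source term, target term) pairs:

```python
from collections import defaultdict

def find_inconsistencies(aligned_claims):
    """aligned_claims: one list of (source_term, target_term) pairs
    per claim. Returns terms with more than one target rendering."""
    renderings = defaultdict(set)
    for pairs in aligned_claims:
        for src, tgt in pairs:
            renderings[src].add(tgt)
    # Any term with multiple renderings risks being read by an
    # examiner as intentional claim differentiation.
    return {src: sorted(tgts)
            for src, tgts in renderings.items() if len(tgts) > 1}
```

Run over a full claim set, the audit surfaces exactly the drift described above — e.g. a fastener rendered as both "attache" and "fixation" — before an examiner does.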

4 cases · Medical Devices · Bio-Pharma · Cardiology · Telecommunications 5G/6G
View Case Studies →