Traditional Chinese QA for AI Content in CI/CD

「視頻載入中」.

Five characters. Readers in Taiwan are jarred in an instant. A typical coding assistant may not understand that localization context.

Literally correct. Contextually wrong.

Words like 「視頻」 and 「軟件」 betray Simplified Chinese training data. This isn’t a prompting problem — it’s a side effect of how training corpora are distributed.

We blame AI for not being smart enough. What it’s actually doing is faithfully reflecting the biases in its training data.

Without sufficient localized training data, this kind of bias is expected behavior.

For engineers, this opens a new dimension of engineering quality: how do you ensure AI-generated content meets specific language standards inside an automated pipeline?

In the past, this relied primarily on manual review. Now we can let tools catch high-risk strings first, then hand off to humans for judgment.

From Converter to QA: zhtw’s Pivot

Six months ago, zhtw had one job: a CLI tool for converting between Simplified and Traditional Chinese.

The use case was simple: you have a Simplified Chinese document, and you want it in Traditional.

Conversion is deterministic. QA is probabilistic.

The current version is 4.3.0, released April 11, 2026. It supports Python 3.10+ and is installable via pip, pipx, or Homebrew.

Three Layers of Conversion: Ambiguity as a First-Class Problem

Take the character 「干」. In Simplified Chinese, it can map to Traditional 「干」, 「乾」, or 「幹」.

A naive character-level mapping of 「干」→「乾」 turns 「干擾」 (interference) into 「乾擾」 — semantically wrong.

zhtw 4.1.0 introduced a third conversion layer: Balanced Defaults.

The three layers are:

- Term (vocabulary layer): Full term matching. zhtw ships with over 31,000 term mappings (sourced from the official zhtw repository dictionary tables). This is the highest-priority check.
- Balanced Defaults: For 10 high-frequency ambiguous characters (干, 面, 復, 裡, 發, 後, 表, 簾, 谷, 纖), this layer picks the most contextually common mapping and maintains a Protect Terms list that tells the system which known terms to leave untouched.
- Charmap (character layer): The final fallback, a character-level mapping across 6,360 character pairs.

Most people assume Simplified-to-Traditional is a 1:1 mapping. Ambiguous characters are where the real action is.

The core design principle: don’t assume every character has a single correct mapping. For ambiguous characters, you bring in context and frequency statistics to make the “most likely correct” call — not the “definitively correct” one.
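To make the layering concrete, here is a minimal sketch of the lookup order. The tables are toy data I made up for illustration, and folding the Protect Terms check into the term scan is my simplification, not a description of zhtw's internals.

```python
# Toy tables invented for illustration; zhtw's real dictionaries are far larger.
PROTECT_TERMS = {"干涉"}                                   # known terms left untouched
TERMS = {"软件": "軟體", "干扰": "干擾", "干净": "乾淨"}      # layer 1: full terms
BALANCED_DEFAULTS = {"干": "乾"}                            # layer 2: most common choice
CHARMAP = {"软": "軟", "扰": "擾", "净": "淨"}               # layer 3: character fallback


def convert(text: str) -> str:
    """Convert text using term match first, then defaults, then charmap."""
    longest = max(len(term) for term in PROTECT_TERMS | set(TERMS))
    out, i = [], 0
    while i < len(text):
        # Try the longest possible term starting at position i.
        for size in range(min(longest, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in PROTECT_TERMS:
                out.append(chunk)          # protected: copy verbatim
                break
            if chunk in TERMS:
                out.append(TERMS[chunk])   # full-term mapping wins
                break
        else:
            # No term matched: fall back to a single character.
            size = 1
            ch = text[i]
            out.append(BALANCED_DEFAULTS.get(ch, CHARMAP.get(ch, ch)))
        i += size
    return "".join(out)


print(convert("干扰"))  # 干擾 (term layer keeps 干)
print(convert("干燥"))  # 乾燥 (balanced default picks 乾)
print(convert("干涉"))  # 干涉 (protected term)
```

The point of the ordering is that the single-character default only fires when no longer term claims the character first.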

To validate this logic, zhtw runs Golden Tests across all six SDKs, aligning outputs so that Python, TypeScript, and Java all produce the same result.
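A golden test in this sense can be sketched as a shared fixture of input and expected pairs that every SDK must reproduce. The fixture format and the names `GOLDEN_CASES`, `run_golden`, and `toy_convert` below are hypothetical; I'm not reproducing zhtw's actual test files.

```python
import json

# Hypothetical fixture format: a JSON list of {"input", "expected"} pairs
# that each language binding would load and run.
GOLDEN_CASES = json.loads("""
[
  {"input": "干扰", "expected": "干擾"},
  {"input": "软件", "expected": "軟體"}
]
""")


def run_golden(convert, cases):
    """Return the cases where convert(input) differs from the expected output."""
    failures = []
    for case in cases:
        actual = convert(case["input"])
        if actual != case["expected"]:
            failures.append({**case, "actual": actual})
    return failures


def toy_convert(text: str) -> str:
    """Stand-in converter so the sketch runs without any SDK installed."""
    table = {"干扰": "干擾", "软件": "軟體"}
    return table.get(text, text)


failures = run_golden(toy_convert, GOLDEN_CASES)
print(f"{len(failures)} golden case(s) failed")
```

Because the fixture is plain data, the same file can drive a test runner in each language, which is what makes cross-SDK drift visible.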

Confidence Scoring with the Lookup API

zhtw 3.3.0 introduced the zhtw lookup command, giving developers dictionary lookups and source metadata without reimplementing the dictionary themselves.

zhtw lookup 軟件 --json

It returns the mapped Traditional Chinese term, the source dictionary, and status.

You don’t need to maintain your own vocabulary list. But the mapping is a statistical probability, not semantic truth.

So I wrapped a Review Bot around it — piping Chinese strings from pull requests one by one into lookup_word or lookup_words, automating the scan.
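The first half of such a bot, pulling Chinese strings out of a pull request, might look like the sketch below. It assumes a unified diff as input, and `added_chinese_strings` is a name I made up. Each returned string would then go to lookup_word or lookup_words; I don't call them here because their exact signatures aren't shown in this post.

```python
import re

# Runs of CJK Unified Ideographs (basic block only; extension blocks are ignored).
CJK_RUN = re.compile(r"[\u4e00-\u9fff]+")


def added_chinese_strings(diff_text: str) -> list[str]:
    """Collect unique Chinese runs from lines added in a unified diff."""
    found: list[str] = []
    for line in diff_text.splitlines():
        # Added lines start with "+"; "+++" is the file header, not content.
        if line.startswith("+") and not line.startswith("+++"):
            for run in CJK_RUN.findall(line):
                if run not in found:
                    found.append(run)
    return found


SAMPLE_DIFF = """\
--- a/app.py
+++ b/app.py
@@ -1,2 +1,2 @@
-label = "Loading"
+label = "視頻載入中"
 title = "軟件"
"""

print(added_chinese_strings(SAMPLE_DIFF))  # ['視頻載入中']
```

Only added lines are scanned, so untouched legacy strings don't generate noise on every PR.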

A confidence score isn’t an answer; it’s a sorting tool. What I learned: routing anything below 0.8 to human review is more efficient than chasing 100% automation.
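The routing rule itself is small. Here is a sketch under the assumption that each finding carries a numeric confidence; the `Finding` fields are hypothetical and not the actual JSON schema returned by zhtw.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 0.8  # the cutoff mentioned above


@dataclass
class Finding:
    term: str          # the flagged string (hypothetical field name)
    suggestion: str    # the proposed Traditional Chinese form
    confidence: float  # assumed to be in the range 0.0 to 1.0


def triage(findings, threshold=REVIEW_THRESHOLD):
    """Split findings into (auto-handled, human-review) lists."""
    auto = [f for f in findings if f.confidence >= threshold]
    # Least confident first, so reviewers see the riskiest items at the top.
    review = sorted(
        (f for f in findings if f.confidence < threshold),
        key=lambda f: f.confidence,
    )
    return auto, review


findings = [
    Finding("軟件", "軟體", 0.95),
    Finding("干", "乾", 0.6),
]
auto, review = triage(findings)
print([f.term for f in auto], [f.term for f in review])  # ['軟件'] ['干']
```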

Automating Checks in CI/CD

Pipelines don’t get tired. People do.

I’ve seen too many teams rely on reviewers’ eyes to catch Simplified Chinese terms, spending human attention on a fatiguing task that could be automated.

The core problem zhtw 4.x is solving isn’t conversion — it’s attention.

The main command is:

zhtw check . --json

This scans the entire repository, flags Simplified Chinese or suspicious terms, and outputs structured JSON.
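If you want to consume that output from a script rather than read it by eye, a thin wrapper is enough. This sketch assumes the JSON report is written to stdout. The report's field names and the meaning of the exit code are not covered here, so the wrapper returns both untouched.

```python
import json
import subprocess


def run_json_command(cmd):
    """Run a command and return (exit_code, parsed_json_or_None)."""
    proc = subprocess.run(cmd, capture_output=True, text=True, encoding="utf-8")
    try:
        report = json.loads(proc.stdout)
    except json.JSONDecodeError:
        report = None  # the command printed nothing, or something that isn't JSON
    return proc.returncode, report


if __name__ == "__main__":
    # Requires zhtw on PATH; inspect the report shape before building on it.
    exit_code, report = run_json_command(["zhtw", "check", ".", "--json"])
    print(exit_code, type(report).__name__)
```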

The official docs cover GitHub Actions, GitLab CI, and other common platforms — each with copy-paste-ready examples.

If your repository is too large to scan fully, pair it with a Changed File Workflow to check only files modified in the PR. Use a .zhtwignore file for exclusions — same syntax as .gitignore.
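A changed-file workflow can be approximated in a few lines. This is a rough sketch: `changed_files` assumes a git checkout with an `origin/main` base branch, and `apply_ignore` uses simple glob matching, which covers only a subset of real .gitignore syntax and is not zhtw's own .zhtwignore parser.

```python
import subprocess
from fnmatch import fnmatchcase


def changed_files(base: str = "origin/main") -> list[str]:
    """List files changed between the merge base with `base` and HEAD."""
    out = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def apply_ignore(paths, patterns):
    """Drop paths matching any glob pattern (a simplification of gitignore rules)."""
    return [
        p for p in paths
        if not any(fnmatchcase(p, pattern) for pattern in patterns)
    ]


paths = ["src/app.py", "vendor/lib/x.py", "static/app.min.js"]
print(apply_ignore(paths, ["vendor/*", "*.min.js"]))  # ['src/app.py']
```

The surviving paths are the only ones that need to be checked for that PR, which keeps scan time proportional to the change rather than to the repository.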

On your local machine, set up a pre-commit hook and forget about it:

repos:
  - repo: https://github.com/rajatim/zhtw
    rev: v4.3.0
    hooks:
      - id: zhtw-check

Two hooks are available: zhtw-check only flags issues, while zhtw-fix modifies files directly.

My preference is zhtw-check only. I’d rather have a commit fail and review it myself before deciding what to change. That’s a personal preference: zhtw-fix is useful for bulk-processing legacy files, but in daily development, keeping room for human judgment tends to work better.

Not Just Python: Six Packages in Sync

Earlier versions of zhtw shipped only a Python package. The latest version aligns six packages to 4.3.0:

Language	Package / Path	Use Case
Python	`pip install zhtw`	CLI, CI/CD, data processing
TypeScript	`npm i zhtw-js`	Node backend, build pipeline
Java	`com.rajatim:zhtw`	JVM backend, enterprise integration
Go	`github.com/rajatim/zhtw/sdk/go/v4`	High-throughput workers, CLI binary
Rust	crates.io `zhtw`	Embedded, low-latency
WASM	`npm i zhtw-wasm`	Browser, Edge

The reason for this breadth: AI-generated Traditional Chinese doesn’t live only in Python environments.

Your frontend editor needs real-time hints as the user types — that’s TypeScript or WASM. Your customer service system calls the LLM through Java — that’s JVM territory. Your batch conversion runs in a Go worker — that’s where Go shines.

Using the same dictionary and the same Golden Test across languages is what keeps outputs consistent. For large teams, that’s the foundation of a coherent product experience.

Cross-language alignment comes at a maintenance cost. With multiple SDKs in parallel, consistency is the baseline — but the risk of version drift grows with it.
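One cheap guard against drift is a check that every package reports the same version before release. The sketch below is my own illustration, not part of zhtw's tooling, and the version numbers in the example are made up.

```python
def find_version_drift(versions: dict[str, str], expected: str) -> dict[str, str]:
    """Return the packages whose version differs from the expected one."""
    drift = {}
    for name, version in versions.items():
        # Go module tags carry a leading "v"; strip it before comparing.
        if version.removeprefix("v") != expected:
            drift[name] = version
    return drift


# Example values invented for illustration.
reported = {"python": "4.3.0", "go": "v4.3.0", "rust": "4.2.1"}
print(find_version_drift(reported, "4.3.0"))  # {'rust': '4.2.1'}
```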

Making the Case to Your Team

My PM tagged me on Slack: “We aligned on these terms last week — how are they different again?”

Once, to save time, I ran zhtw-fix — and it overwrote three key terms that didn’t match our team glossary.

That made me realize: the space for human review matters more than the automation rate.
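What I do now can be sketched as a glossary guard: any suggested change that touches a term the team has already approved gets held for review instead of being applied. The function name and the (original, replacement) pair format are hypothetical, and the glossary entry is an example, not our real list.

```python
def split_by_glossary(suggestions, glossary):
    """Split (original, replacement) pairs into (safe, held) lists.

    A suggestion is held when its original term is in the team glossary,
    meaning the team has already agreed to keep that wording.
    """
    safe, held = [], []
    for original, replacement in suggestions:
        if original in glossary:
            held.append((original, replacement))
        else:
            safe.append((original, replacement))
    return safe, held


team_glossary = {"雲端"}  # example entry only
suggestions = [("軟件", "軟體"), ("雲端", "雲")]
safe, held = split_by_glossary(suggestions, team_glossary)
print(safe)  # [('軟件', '軟體')]
print(held)  # [('雲端', '雲')]
```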

The most common pushback isn’t technical — it’s “will this slow us down?”

I tested on a mid-sized repository of roughly 500 files — a mix of Python scripts and frontend static assets. zhtw check took about 2–3 seconds on a MacBook Pro M2. The pre-commit latency is barely noticeable.

The real cost isn’t the check. It’s the fix after the fact. Correcting an i18n string is easy. Correcting a customer email that already went out is not.

Automated checks exist to catch errors before deploy, not after release.

“Do We Actually Have This Gap?”

Run zhtw check . --json and look at the output.

The signal is straightforward: hit count, confidence scores, and whether those terms appear in user-facing copy.

If your team uses AI-assisted writing regularly, or if team members come from different regions, the gap is usually larger than expected.

“Will It Flag False Positives?”

Yes. Ambiguous terms carry inherent judgment cost.

That’s why zhtw provides a confidence score. For low-confidence items, you route to human review rather than relying on automated correction.

Whether AI models will natively distinguish regional variants of Traditional Chinese is still an open question. It depends on how localized future training data becomes and how model architectures change.

Three Questions Before Adopting zhtw

Rather than a standardized decision framework, here are three questions to work through before deciding whether to bring in zhtw.

Does your product target Taiwan?
- Yes → this is a direction worth prioritizing. Localized experience is part of product competitiveness.
- No → you can defer.

Does your team use AI-assisted writing extensively?
- Yes → this is a direction worth prioritizing. Simplified Chinese bias in AI is systemic, and manual review costs scale with volume.
- No → if you don’t yet have enough data to decide, start with CI check mode to gather hit data first.

Does your product have multi-regional versions?
- Yes → recommended. zhtw can unify Simplified or Hong Kong Traditional input into Taiwan Traditional.
- No → start with CI check mode, then decide whether to simplify the rules.

The market signal isn’t clear yet: will future models have this correction built in? Maybe. But until then, this is a marginal gain we can control.

Closing: Language Quality as an Engineering Concern

When AI starts writing your copy, language quality moves from a cultural concern to an engineering one.

zhtw 4.3 offers an open-source solution for automating language checks.

Automation catches errors — and can also mask subtle semantic differences. Where does your team draw the line between efficiency and precision?

Human attention is a scarce resource — keep it for semantics and judgment. Let the pipeline handle dictionary lookups.

Are you comfortable with customers seeing 「視頻載入中」 in your product? That’s a conversation worth having with your team.

Sources

zhtw GitHub Repository — Source code and documentation
Python Package Index: zhtw — Python package installation and version info
zhtw CLI Advanced Guide — Advanced CLI parameters and CI/CD setup guide