I Compared MNIST-Style Digits Across Languages. Mandarin Chinese Was 4x Harder to Separate
I started with a joke about "Linear Mandarin" and ended up with a measurable result: in this PCA experiment, Mandarin Chinese digits were about 4x harder to separate than English MNIST.