Analyzing the Japanese Wikipedia

The Internet-based encyclopedia Wikipedia is potentially a very useful source of information, but intuitively it is difficult to have confidence in the quality of an encyclopedia that anyone can modify. For an encyclopedia, one aspect of correctness is writing style, and for Wikipedia in particular an inconsistent writing style would give a bad impression. If errors that any native speaker of a language can detect go uncorrected, how likely is it that errors only a subject expert can spot will be corrected? Japanese is a language in which honorific marking is very explicit: it involves different forms between which language users must, in some cases, choose every time a sentence is uttered or written. Some of these forms are easy enough to identify in a sentence that the identification can feasibly be performed by a computer. Applying this approach to the Japanese Wikipedia makes it possible to determine whether the writing style of its articles is consistent and correct. This book describes a project that undertook this task for all articles in the Japanese Wikipedia.
