Fact-Checking the Wikipedia
The drawbacks of having a one-stop reference source
Carlos Arturo Serrano Gomez 
Published 2008-02-09   
In 1999 I was especially interested in the Easter Island head statues, partly because I was attempting to write a historical novel about the mysterious Moai. When the first cybercafes opened in my hometown, I seized the chance to research the subject as widely as the resources available at the time allowed.

What is now inconceivable was then the ordinary state of affairs: sans Google, sans Wikipedia, where on earth could one find anything? But somehow the Web seemed to work. I meandered through site after site where the dispersed parts of the relevant information were scattered for me to gather and make sense of. I daresay it's a pity that the tiresome experience of browsing a lengthy list of sites, each with its portion of valuable information, which together added up to a huge amount of material (and lent more credence to the reputation of the Internet as an infinite reference source), has vanished.

Efficiency has come to stay, but at the cost of variety. Google and Wikipedia are marvelous tools, but some of you surely have wondered whether it isn't dangerous to have only one source for everything we want to know. True, there are hundreds of other Internet resources where you could find what you need, but how many of you are willing to bother taking the time to look for it by yourselves, particularly when it has already been researched, put together, edited, formatted, bookmarked, tagged and made available in one place?

In a similar way to the de facto monopoly that Wikipedia enjoys in the text department, for every media type there is a firmly established mecca it calls home. Want to see a video? YouTube. A map? Google Maps. A photo? Flickr. These associations come to mind instantly, and seldom do we lazy consumers pause to consider that there could -- and should -- be other places to search for content.

So it occurred to me to inspect how well the one-stop reference source fares in completeness and reliability. For this I picked a subject I both love and know enough about: my mother tongue, Spanish.

Let's Count

The English Wikipedia article on the Spanish language (as of Feb. 6, 2008) weighs 206 KB, which is flattering when compared to the entry on English, which is 227 KB in size. (The latter is about the same number of bytes Spanish gets in its own Wikipedia.) In embarrassing contrast, however, the article about the aliens that appear in Star Trek weighs 317 KB.

The puzzling thing is that Spanish receives much less attention in the Wikipedias of the languages most closely related to it: 122 KB in the Portuguese version of Wikipedia, 99 KB in the Romanian one, 90 KB in French, 89 KB in Italian, 66 KB in Catalan, 56 KB in Occitan, 53 KB in Galician, 46 KB in Latin, 45 KB in Aragonese, 44 KB in Asturian, 42 KB in Ladino, 41 KB in Sicilian, 35 KB in Piedmontese, 29 KB in Sardinian, 28 KB in Provençal, 26 KB in Corsican and 26 KB in Romansch. These numbers shrink further when one considers that many of these bytes are accounted for by navigation bars, hyperlinks and page formatting.

The distribution of information within each individual article merits deeper analysis. The English version employs 557 words in discussing the history of Spanish, 1,311 words for its geographical distribution, 813 for its dialectal variants, 534 for its writing, 512 for its phonology and 78 for its grammar. Although these topics each have their own separate entry, their allocation in the main article doesn't make sense. Anglophone readers wanting to learn about Spanish will have no use for such a lengthy digression on the countries where it is spoken, followed at the end by a couple of quick paragraphs that amount to a dictionary definition of Spanish grammar. I wonder why English-speaking "Wikipedians" chose to organize this information in proportions that don't help them.

While word count comparisons are tricky from one language to another, the Spanish Wikipedia devotes even more space to explaining the distribution and dialects of Spanish (presumably of more interest to Spanish-speaking Wikipedians), and includes a broad section on vocabulary. However, its demographics are outdated, and grammar is dispatched in two paragraphs as well.

The process of translation or parallel composition of this subject shows varying measures of organization among the other Wikipedias. To speakers of other Romance languages, the relevant facts to know first about Spanish would be the similar features that would help them learn it faster. Let's see: the French article offers a very good summary of important details without overwhelming the reader; the Portuguese one emphasizes geography and the nuances of our verb tenses (a vital concern for understanding Spanish sentences); and the Romanian one, which misquotes the Lord's Prayer, speaks mostly about the Spanish sounds and alphabet. These three sources could benefit from mutual contributions. If only their authors could talk to each other.

Quality starts to decline as one progresses in searching. The Italian, Catalan and Occitan articles make a brief mention of history and geography, but don't describe Spanish grammar, nor do they even provide a functioning link. Italian misquotes the Lord's Prayer too, and Galician discusses almost exclusively the differences between Spanish and other Romance languages. The Latin, Asturian and Sicilian Wikipedias offer only a quick description of Spanish; in fact, these three owe most of their KB size to a lengthy chart with numbers of Spanish speakers distributed by country. And the articles in Aragonese, Ladino, Piedmontese, Sardinian, Corsican, Provençal and Romansch merely state something to the effect that Spanish is what they speak in Spain.

At this point in my analysis, I started to feel I wasn't being fair to the authors. Perhaps this meager coverage of Spanish is not as indicative of the interest it deserves from those communities as of the degree of progress made in composing the different versions of Wikipedia. Yet those readers with a poor grasp of English still deserve to have equivalent information in their own languages.

All There Is to Know, but Nothing More

In order to make a fairer comparison, I proceeded to examine just the specific entries concerned with Spanish grammar. It didn't surprise me that the Spanish and English Wikipedias had them. But among the Romance languages, only a French version is referenced (it turned out there's also one in Romanian, but the aforementioned articles lack a link to it).

The main criticism of Wikipedia is the allegation that its collaborative model would imply that truth is whatever most people agree it is. Does Wikipedia describe the grammar of my mother tongue truthfully? Well, the 269 KB version written in Spanish (presumably by native Spanish speakers) should be the most reliable.

I found its introduction concise and to the point. The main sections on morphology and syntax are detailed enough; actually, they include more useful material than I had expected or even knew about: the kinds of verbal periphrases and speech markers, the stylistic criteria for placing adjectives and the newest additions to the inventory of prepositions are fully explained. Also, controversial usages, like the Iberian misuse of the indirect object pronoun "le" for direct object situations, the correct status of the reflexive passive voice, and the subtle difference between position and movement adverbs ("fuera" = "out", "afuera" = "outward"), are adjudicated the right way. Finally, our annoying verb tenses are given the extensive coverage they deserve, although it's weird that regular conjugations appear first in a table, then recur in a list and then again in another table.

A few other points need revision. The lists of Latin and Greek affixes are incomplete, as is the list of Spanish suffixes used in word formation. The subsection on vocabulary is composed of two huge paragraphs that are difficult to read through, and several times in the rest of the article I encountered cumbersome definitions that had little to do with the topic at hand. This is especially true of the final section on semantics, which is written in a clumsy style and contributes no specific Spanish features. The whole of it would be better replaced by a link to its relevant section in the main article on grammar.

Another case is the subsection that deals with the types of pronouns, which mentions the unusual case of the English "it." I don't know what that reference is doing there. "It" doesn't have a Spanish equivalent (we don't use the neuter pronoun "ello" for impersonal and cleft sentences as English does with "it"). And the parts of speech are described redundantly: once in the section on morphology, and again in the subsection on syntagmas, under syntax. In spite of these nuisances, Spanish-speaking readers may rest assured that their Wikipedia has its facts right about Spanish grammar.

Let's quickly review the other versions. The English article (41 KB) concentrates on the differences between Spanish and English, but does not bother with details; instead, it links to a cluster of separate entries dealing with the intricacies of Spanish verbs, conjugations, irregular verbs, nouns, adjectives, determiners, pronouns and prepositions. Given the difficulties English speakers have when studying Spanish, I think that splitting the important facts across various locations (that is, the numerous instances of "See the main article for further information") only complicates the process.

The version written in Romanian (99 KB) starts with a notice requesting that someone fix the formatting of the page. A bad start, I thought. The empty sections under the headings for nouns, articles and adjectives confirmed my first impression. Pronouns and verbs, however, are more than satisfactorily covered, and some peculiarities are explained in a level of detail that doesn't appear in the Spanish version, perhaps because the Spanish-speaking reader is already familiar with them. In the lengthy section on pronouns the only errors I could find are that the accented and unaccented pronouns pronounced "mi" (English "me" and "my," respectively) are mistaken for each other, and some sample sentences lack number agreement. If the section on verbs is divided into shorter paragraphs and has its headings revised, it will constitute a most useful guide for the Romanian-speaking reader.

According to the French Wikipedia article (43 KB), the main facts to know about Spanish grammar are the 27 letters of its alphabet (actually there are 29), the rules for placing the acute accent, the orthographic changes introduced in 1952 and the conjugation patterns of auxiliary and regular verbs. Not much. As a matter of fact, most of the article is plagiarized from parts of an appendix of the 1980 edition of the Larousse French-Spanish dictionary. I advise French-speaking readers who want to learn about Spanish to benefit from the full text in the book itself.


Currently, the Spanish language is discussed in 122 versions of the Wikipedia. I have limited my research to those written in languages I am able to read.

The same goes for the subject of Spanish grammar, which is available in eight languages -- the others are Hungarian, literary Norwegian, standard Norwegian, Swedish and Mandarin. I cannot read them and make no claim about the quality of those articles.

Wikipedia Articles Referenced in This Review

Spanish language:

Versions in English, Aragonese, Asturian, Catalan, Corsican, Spanish, French, Galician, Italian, Ladino, Latin, Occitan, Piedmontese, Portuguese, Romanian, Romansch, Sardinian and Sicilian.

Spanish grammar:

Versions in English, Spanish, French and Romanian.
