Correcting outdated facts in Wikidata
In Facts are Nuanced, I explained how even seemingly simple facts, such as the height of a mountain, are more complex than they may at first appear. Mountain heights, and the way we measure them, change. Old measurements become outdated, and new measurements replace them. This is okay, so long as we know which is the most recent official measurement, although as we will see, this isn’t always as obvious as it should be.
This blog post is intended for two types of people: 1) people who care deeply about the accuracy of information on the internet, and 2) people who design software systems to support people who care deeply about the accuracy of information on the internet. If you don’t fall into one of these categories, maybe just scroll to the end…
“Duty Calls” by Randall Munroe (xkcd), CC BY-NC 2.5.
Someone is wrong on the internet
Here’s a table of the highest mountains in each Australian state/territory according to Wikidata, English Wikipedia (specifically, the infobox at the top right of Wikipedia articles, which doesn’t always match the text), Encyclopædia Britannica, and the Geoscience Australia website at the time of writing this blog post:
State / Territory | Highest Mountain | Wikidata (m) | English Wikipedia Infobox (m) | Encyclopædia Britannica (m) | Geoscience Australia (m) |
---|---|---|---|---|---|
New South Wales | Mount Kosciuszko | 2,228 | 2,228 | 2,228 | 2,228 |
Victoria | Mount Bogong | 1,986 | 1,986 | 1,986 | 1,986 |
Australian Capital Territory | Bimberi Peak | 1,912 | **1,913** | **1,914** | 1,912 |
Tasmania | Mount Ossa | 1,617 | 1,617 | 1,617 | 1,617 |
Queensland | Mount Bartle Frere | **1,622** | **1,622** | 1,611 | 1,611 |
Northern Territory | Mount Zeil | 1,531 | 1,531 | **1,511** | 1,531 |
South Australia | Mount Woodroffe (Ngarutjaranya) | 1,435 | 1,435 | 1,435 | 1,435 |
Western Australia | Mount Meharry | **1,251** | **1,249** | 1,253 | 1,253 |
All heights are in metres, and I’ve highlighted values that differ from the measurements on the Geoscience Australia website in bold. One of the biggest inconsistencies is the height of Mount Bartle Frere, which differs by 11 metres depending on which source you use. But which is correct? Geoscience Australia’s website seems like the most authoritative, but cites its source as “Geoscience Australia National Geodetic database, 1993”, leaving it ambiguous as to whether this is the latest measurement or not.
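For the software-minded readers, the comparison in the table can be automated. Here is a minimal Python sketch that treats Geoscience Australia as the reference and reports every value that disagrees with it. The heights are transcribed from the table. The names `SOURCES`, `HEIGHTS` and `find_discrepancies` are my own and purely illustrative.

```python
# Heights in metres, transcribed from the table in this post.
# Tuple order matches SOURCES.
SOURCES = (
    "Wikidata",
    "English Wikipedia Infobox",
    "Encyclopædia Britannica",
    "Geoscience Australia",
)

HEIGHTS = {
    "Mount Kosciuszko": (2228, 2228, 2228, 2228),
    "Mount Bogong": (1986, 1986, 1986, 1986),
    "Bimberi Peak": (1912, 1913, 1914, 1912),
    "Mount Ossa": (1617, 1617, 1617, 1617),
    "Mount Bartle Frere": (1622, 1622, 1611, 1611),
    "Mount Zeil": (1531, 1531, 1511, 1531),
    "Mount Woodroffe": (1435, 1435, 1435, 1435),
    "Mount Meharry": (1251, 1249, 1253, 1253),
}


def find_discrepancies(heights, sources=SOURCES, reference="Geoscience Australia"):
    """Return {mountain: {source: difference_in_metres}} for every value
    that differs from the reference source."""
    ref_index = sources.index(reference)
    result = {}
    for mountain, values in heights.items():
        ref_value = values[ref_index]
        diffs = {
            source: value - ref_value
            for source, value in zip(sources, values)
            if value != ref_value
        }
        if diffs:
            result[mountain] = diffs
    return result


if __name__ == "__main__":
    for mountain, diffs in find_discrepancies(HEIGHTS).items():
        print(mountain, diffs)
```

Note that this only tells us where the sources disagree. It says nothing about which source is right, which is the hard part.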
Let’s check the source for the height of Mount Bartle Frere according to Wikidata. Wikidata states Mount Bartle Frere is 1,622 metres above sea level, and the reference indicates that this fact was automatically imported from German Wikipedia by a bot. This immediately raises a concern: why would a fact about a mountain in Australia, where English is the majority language, be imported from German Wikipedia?
Let’s try to trace this back further. The initial German Wikipedia article for Mount Bartle Frere created in 2011 listed the height of Mount Bartle Frere as 1,622 metres, and has been largely unchanged since. It doesn’t provide a citation but ends with a weblink to state8.net. State8 includes an upload of a Queensland Government trail map from July 2007, which shows the height of the summit as 1,622 metres.
Cool, so the most likely underlying source for this statement is the 2007 Bartle Frere trail map produced by the Queensland Government. I did a search for more recent versions, and can see that the 2023 Bartle Frere trail map shows the height of the summit as 1,611 metres, which is our first hint that perhaps 1,622 metres is an outdated measurement.
How about English Wikipedia? It doesn’t provide a citation for the height either, other than the same link at the end to State8, but the edit history reveals something interesting.
In 2016 the English Wikipedia article for Mount Bartle Frere was edited to update the height from 1,622 metres to 1,611 metres with the comment: “A survey conducted by the Queensland Department of Natural Resources and Mines in August 2016 found the height of the mountain is 1611. The mountain’s most accurate height is 1611.074 +/- 20mm”. However, the infobox was later changed back to 1,622 metres by a mobile edit on 10 Nov 2021 with the comment “1622m tall as per the Qld government sign”. The mobile edit only changed the infobox, leading to an inconsistency in the article where the infobox says 1,622 metres but the main text says 1,611 metres. The inconsistency seems to have been ignored up until now, perhaps because neither cites a reference, making it difficult for people to decide which is correct without doing some digging. Or perhaps people just don’t care as much as me about the accuracy of random facts on the internet.
Determined to find the survey mentioned in the page edit history, I did a search and found an article “Queensland’s highest mountain not as tall as first thought” by The Courier Mail, 23 Oct 2016. The full article is behind a paywall, but here’s the key part:
“Mt Bartle Frere, 50km south of Cairns, has for years been thought to stand at a height of 1622m but a recent survey revealed it is actually 11m shorter … Cairns-based Department of Natural Resources cadastral surveyor Jeff Pickford, who made the discovery, said it could mean other mountains were not as tall as thought … ‘The original measurements were done back in the 1960s. This time we took state-of-the-art GPS equipment and kept it there overnight to find the true height.’”
Using Queensland Globe, I was able to track down the survey report for the mark installed by J Pickford, which confirms the height to be (approximately) 1,611 metres.
Phew, so we finally have the answer! The elevation of Mount Bartle Frere is approximately 1,611 metres above the Australian Height Datum (which approximates mean sea level). The 1,622 metre measurement is outdated. But it took me about three days to trace everything back to the original sources to determine the correct measurement, and this blog post is the simplified version. I originally tried using digital elevation data captured by the NASA Shuttle Radar Topography Mission, before realising that it lacked the resolution and accuracy needed to reliably determine the highest point of a mountain, particularly considering all the trees. I also looked through geotagged photos of the summit on Flickr in the hopes of finding a recent picture of the sign mentioned in the mobile edit to the Wikipedia article. The official sign at the top doesn’t seem to mention the height, but there are (or were) signs along the way which could have caused confusion if they haven’t been updated. Ironically, it would have been faster to fly there and climb the mountain myself.
Kudos to the author of hikingtheworld blog who seems to have come across the same issue:
“Located 50km south of Cairns, Mount Bartle Frere is the highest mountain in Queensland with an elevation of 1,611 metres. To be precise, 1,611 metres above sea level is the official height of Mount Bartle Frere according to Geocience Australia; a survey in July 2016 revised the official height from 1,622m to 1611.2 metres. A large number of Web sites still have the mountain’s height at 1,622m – and just to add a bit more confusion, the topographic map shows the peak as 1,595m!”
Let’s fix it
Once I knew what the correct height was, making the edit to Wikidata was surprisingly easy. I first updated Wikidata with the new height, added some references, and marked the old measurement as deprecated. This was my first time using the Wikidata editor, but I found it a pleasant user experience, particularly compared to other knowledge graph editors I’ve used in the past.
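For readers who haven’t used Wikidata: each statement carries a rank (preferred, normal or deprecated), and consumers usually ask for the “best” statements, meaning the preferred ones if any exist, otherwise the normal ones, and never the deprecated ones. The Python sketch below is a toy model of that rule, not Wikidata’s actual code or API. The class names, field names and reference strings are hypothetical placeholders of my own.

```python
from dataclasses import dataclass, field
from enum import Enum


class Rank(Enum):
    PREFERRED = "preferred"
    NORMAL = "normal"
    DEPRECATED = "deprecated"


@dataclass
class Statement:
    """Toy stand-in for one elevation statement on an item."""
    value_m: float
    rank: Rank = Rank.NORMAL
    # Placeholder reference descriptions, not real Wikidata reference data.
    references: list[str] = field(default_factory=list)


def best_statements(statements):
    """Preferred statements if any exist, otherwise normal ones.
    Deprecated statements are never returned."""
    preferred = [s for s in statements if s.rank is Rank.PREFERRED]
    if preferred:
        return preferred
    return [s for s in statements if s.rank is Rank.NORMAL]


# The old measurement stays on record but is marked deprecated;
# the new measurement is added with references.
elevation = [
    Statement(1622, Rank.DEPRECATED, ["imported from German Wikipedia"]),
    Statement(1611, Rank.NORMAL, ["2016 survey (placeholder description)"]),
]

if __name__ == "__main__":
    print([s.value_m for s in best_statements(elevation)])
```

The appeal of this design is that the outdated value is not deleted. It stays visible, with its rank explaining why it should no longer be used.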
Once I’d updated Wikidata, updating Wikipedia was even easier. The Wikipedia mountain infobox will automatically fetch any missing fields from Wikidata by default, so all I had to do was delete the outdated elevation field from Wikipedia.
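The fallback behaviour is simple to picture in code. This Python sketch is my own illustration of the “local value wins, otherwise ask Wikidata” rule described above, not the actual infobox template logic. The function name and the `elevation_m` field name are assumptions made for the example.

```python
def resolve_infobox_field(name, local_fields, wikidata_fields):
    """Return the locally set value if present and non-empty,
    otherwise fall back to the value held in Wikidata (or None)."""
    local = local_fields.get(name)
    if local not in (None, ""):
        return local
    return wikidata_fields.get(name)


wikidata = {"elevation_m": "1611"}

# Before my edit: the outdated local value shadows Wikidata.
before = resolve_infobox_field("elevation_m", {"elevation_m": "1622"}, wikidata)

# After deleting the local field: the infobox falls back to Wikidata.
after = resolve_infobox_field("elevation_m", {}, wikidata)

if __name__ == "__main__":
    print(before, after)
```

The flip side is that any stale local value silently overrides a corrected Wikidata value, which is exactly how the infobox ended up out of date.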
As for the heights of the other mountains that had inconsistencies, I leave this as an exercise for the reader (translation: I don’t know).
How (not) to design a knowledge base
Documenting data provenance (where data comes from) is hard. Providing the ability to reference where a data point came from is a good start, but this only takes us one hop back in a long (and often poorly documented) chain. When the provenance is not clear, outdated information survives longer than it has any reasonable right to.
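To make the “one hop” problem concrete, here is a small Python sketch that models the chain I traced by hand for the 1,622 metre figure and walks it back to its root. The chain reflects my best guess at the most likely lineage, as described earlier in this post. The `Source` class and `trace_provenance` function are hypothetical names, not part of any real knowledge base API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Source:
    """One link in a provenance chain."""
    name: str
    value_m: int
    year: Optional[int] = None              # None where I don't know the date
    derived_from: Optional["Source"] = None


def trace_provenance(source):
    """Follow derived_from links back to the root, guarding against cycles."""
    chain = []
    seen = set()
    current = source
    while current is not None and id(current) not in seen:
        seen.add(id(current))
        chain.append(current)
        current = current.derived_from
    return chain


# The most likely lineage of the outdated figure, as traced in this post.
trail_map = Source("Queensland Government trail map", 1622, year=2007)
state8 = Source("state8.net upload", 1622, derived_from=trail_map)
dewiki = Source("German Wikipedia article", 1622, year=2011, derived_from=state8)
wikidata = Source("Wikidata statement (bot import)", 1622, derived_from=dewiki)

if __name__ == "__main__":
    for hop, src in enumerate(trace_provenance(wikidata)):
        print(hop, src.name, src.year)
```

A reference field gives you only `wikidata.derived_from`. Everything beyond that first hop I had to reconstruct manually.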
At the time of publishing this blog post, the 1,622 metre figure, despite having been outdated for over 8 years, still appears on various official sites including queensland.com and wettropics.gov.au. It’s also what’s currently in the Google Knowledge Graph, which shows up at the top of a Google search:
Google Knowledge Graph gives an outdated answer without a reference. Google’s support page states “facts in the Knowledge Graph come from a variety of sources that compile factual information”