The first time I saw SusunW’s article, “Why I write about women on Wikipedia”, I have to admit I just glanced at it and dismissed it as yet another Wikimedia blog puff piece. But someone has called my attention to it by email (see Friday Night Request Show), so I have looked at it again.

Like most of the Wikimedia blog guest posts, this one follows a testimonial form, which I understand is fairly effective as an advertising technique.

SusunW talks first about history, in particular, the women’s history programs at universities. Now I have heard a lot of criticisms of universities for doing this. If you have a university with a sizable black population, why are you steering the students to “black studies” programs instead of something more economically rewarding, like business? But how many jobs do you think there are going to be for black executives, or female executives, as opposed to diversity officers for HR departments? The criticism comes from white male professors, who do not understand bias, implicit bias, and all the rest.

Creating a special niche for black and female graduates, instead of displacing the white male graduates who end up in the more lucrative positions, is probably the best thing you could do for these students, instead of creating unrealistic expectations that will result in them having unsatisfying careers.  If they know their place, there will be no conflict or harassment, and they can always work for a better world where their children will be able to follow their dreams without regard for race or gender, but based on interest and aptitude.

And SusunW seems perfectly happy in a “soft” academic field, and in the Wikipedia women’s ghetto that is Women in Red. She knows her place. And to be honest, it is a useful project. Take for instance the article that SusunW is the most proud of, Women in brewing. I could only read the first 3 or 4 paragraphs before getting bogged down in the wall of text that is the “history” section. And what about all those blue links? Who can even read the thing with all those blue neon words clamoring for attention? The word “pub” is blue-linked, just in case you don’t know what that is, also honey, Japan, Japanese language, rye, saliva, Peru, BBC, and on and on. WP:Overlinking, anyone?

But you aren’t meant to read these things. Wikipedia is meant to be the free, crowd-sourced front end for Google. Take a look at some of the discussion surrounding the defunct Knowledge Engine. Google bots are all over Wikipedia all the time. Where a few years ago it took several days to have an article reach the Google search engine, today it is nearly instantaneous. You can literally create an article, then do a Google search for the topic and see your article come up in first or second place. And soon you won’t even need the article; Wikidata will do it all, bringing in information from all the language wikis.

Where Women in Red and the other diversity groups come in is finding hidden information. Let’s face it, the low-hanging fruit is already done. The Pokémon characters are all online, and all the important stuff like Star Trek, Star Wars, and Dr. Who. The only thing left is women, POC, and various language and cultural groups that can only be found in dead tree sources, or haven’t been written into a RS yet because minority.

The Manchester people may have a stranglehold on copyediting, but copyediting is not important any more, because no one is expecting these things to be read. They are just repositories of ever more carefully documented and cross-referenced reliable sources. Everyone thinks the “content creators” are indispensable, but why? Google does not need “content” or someone who can find the most carefully chosen adjective to convey subtle shades of meaning. It only needs data that is interlinked with other data in meaningful ways.

Every day they are looking for more cruft to jettison. The Simple English Wikipedia and portals have been unsuccessfully targeted, probably not for the last time, and they have actually gotten rid of NPP through ACTRIAL.

Wikipedia could probably get by without a new article on the front page every day or all those poorly written “Did you know” hooks. It’s not like Wikipedia needs clickbait for a search engine ranking. It only needs people who know how to fill in the data gaps with well-sourced stubs. The links cannot be curated by machine; it takes humans to make the selections.

So that’s why SusunW and Wikipedia are perfect for each other.


  1. Your view of Wikipedia as more data than content is interesting, and it’s certainly not something I particularly object to, from my own sourced stubs and data-wikifying (categories, navboxes, infoboxes, etc.) to the fact that I certainly read a lot more ledes, infoboxes, and Google results than I actually read article content.

  2. If you talk to librarians the first thing they do is go to the bottom of the page and read the references.

    I think you will also find some infoboxes are being quietly replaced with Wikidata-compliant infoboxes — by bot of course, so it’s nothing personal that could touch off the infobox freaks.

    1. I’m actually in favor of Wikidatafying metadata like infoboxes, categories, etc. Around WPVG we started trialing it for review scores and release dates a bit. It’s def a work in progress, both technically and in its adoption. Enwiki is very, very reluctant to integrate off-enwp content… so much so that I’m almost surprised there isn’t regular pushback against being able to integrate Commons-based files.

  3. I made the following remark at A Site You Don’t Like:

    Structured Data is an effort to get lots of volunteers to tag a huge library of images. Who benefits from having all that work done? Step forward any company that is interested in image recognition or computer vision. How nice for them to get all that done for them for free, when they could easily have afforded to pay for it. Still, you get what you pay for, so bad luck if it all turns out to be nonsense. No doubt your driverless car is going to be jolly good at recognising any genital organs or obscure sexual practices that it happens to encounter in the road.

  4. Oh, I have complete confidence in Commons to recognize any and all sexual practices, no matter how obscure; it’s their specialty.

    Structured Data is not just tagging or adding cats, not with a $3m budget; it is more like a Wikidata for Commons, using the same software. Might be very useful for bulk uploads by museums. Let’s just hope they don’t make the mistake of using Flow for their talk pages again. They had to create a separate non-Flow talk page just so ordinary users could communicate with their devs.

