Topic Maps, an ISO standard for semantic networks, relies on authorities to create and maintain Published Subject Indicators (PSIs), each uniquely linking a single topic to a single subject out there in the real world. TopicMaps.Org has, for example, published indicators for languages and countries. But who gets to claim authority over a particular set of topics? Conflicts between people and organizations with opposing interests are unavoidable, especially for more controversial topics. I believe Wikipedia may provide a solution, by acting as a democratic and deliberated set of PSIs for every topic worth writing about. The authority to create and maintain PSIs is then effectively granted to the global Internet community as a whole, since everybody is essentially allowed to edit the content of Wikipedia.
The problem with authority becomes evident with controversy. Who gets to decide on the truth when (groups of) people disagree? This is an inevitable problem for Wikipedia, as it is for any organization in charge of a communication channel. Wikipedia’s solution is as elegant as it is uncomplicated. Whenever a conflict over an article occurs, that article is marked with a warning about its controversial content (like this article on the 2008 South Ossetia war). The article authors are then responsible for working towards a version of the article that everybody involved can agree upon. This deliberative and democratic mechanism seems to be built into the very core of Wikipedia.
I believe Topic Maps can benefit from Wikipedia’s democratic mechanisms, by making Wikipedia the universal authority for Topic Map PSIs. If your topic map needs a new topic, check for an existing article on Wikipedia. If it doesn’t exist, author a new article, and use its URL as your new topic PSI. If nobody feels a need to modify your contribution, congratulations! Your new topic PSI has earned its right to exist. Should someone disagree with you, however, you need to engage in a little deliberative democracy in order to establish a common understanding. Once that agreement is reached, we’re all better off.
Wikipedia sports many of the features defined for the Topic Maps standard. Each article can serve as a topic definition, and each article URL can be thought of as that topic’s PSI. When topic maps that contain common topics are merged, Wikipedia’s disambiguation mechanism supports the requirement that merged topics must include the union of the individual topics. Even scoping can be handled using a more flexible approach to subdomains, not unlike how they are used for languages today (e.g. http://no.wikipedia.org for Norwegian).
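To make the merging requirement concrete, here is a minimal Python sketch. It is not taken from any Topic Maps engine: the Topic class and the merge_topics function are hypothetical names, and the sketch assumes that two topics are merged whenever they share at least one subject identifier, with the merged topic carrying the union of identifiers and names.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A hypothetical, stripped-down topic: identifiers plus names."""
    subject_identifiers: set = field(default_factory=set)
    names: set = field(default_factory=set)


def merge_topics(topics):
    """Merge topics that share at least one subject identifier.

    Each merged topic holds the union of the identifiers and names
    of the topics it was built from.
    """
    merged = []
    for topic in topics:
        current = Topic(set(topic.subject_identifiers), set(topic.names))
        remaining = []
        for other in merged:
            if current.subject_identifiers & other.subject_identifiers:
                # Same subject: absorb the other topic into this one.
                current.subject_identifiers |= other.subject_identifiers
                current.names |= other.names
            else:
                remaining.append(other)
        remaining.append(current)
        merged = remaining
    return merged


# Two maps describe Puccini under the same Wikipedia URL; a third
# topic about a different subject is left alone.
puccini = "http://en.wikipedia.org/wiki/Giacomo_Puccini"
tosca = "http://en.wikipedia.org/wiki/Tosca"
result = merge_topics([
    Topic({puccini}, {"Puccini"}),
    Topic({puccini}, {"Giacomo Puccini"}),
    Topic({tosca}, {"Tosca"}),
])
```

The sketch only works if both map authors picked the same URL for the subject, which is exactly what a shared set of PSIs is meant to ensure.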
Is the Topic Maps community willing to adopt Wikipedia as its democratic and user-generated repository of topic PSIs? What actions must Wikipedia take to fully accommodate the needs of topic map authors out there in the trenches? Let your voice be heard.

I don’t think this is the right way to go – but I’m willing to be convinced, especially if Wikipedia is willing to take certain actions. Let me just mention one problem:
I need an identifier for the Italian composer Giacomo Puccini. Which of the following Wikipedia URLs should I use as the PSI:
http://en.wikipedia.org/wiki/Puccini
http://en.wikipedia.org/wiki/Giacomo_Puccini
http://it.wikipedia.org/wiki/Giacomo_Puccini
http://no.wikipedia.org/wiki/Giacomo_Puccini
http://ko.wikipedia.org/wiki/자코모_푸치니
http://zh.wikipedia.org/wiki/贾科莫·普契尼
http://el.wikipedia.org/wiki/Τζιάκομο_Πουτσίνι
http://ka.wikipedia.org/wiki/ჯაკომო_პუჩინი
http://ja.wikipedia.org/wiki/ジャコモ・プッチーニ
[+ 50 or so more]
Steve, it’s a pleasure to hear from you. I enjoyed your tutorial last Wednesday at Topic Maps Norway 2009. Getting some of the fundamentals of topic maps in place was really helpful.
I guess you know better than me which actions Wikipedia could/should take to provide you with a proper PSI for Puccini. Here’s what I think, as I try to break down the problem.
The first URL is a nonstandard form of the second URL, which can be considered the canonical form. Would it be better if Wikipedia disallowed nonstandard article URLs, instead of just redirecting to the canonical one?
The other URLs belong to non-English Wikipedia sites. This is really off the top of my head, but wouldn’t it be possible to make scopes out of language subdomains, like the language scopes of your Italian opera TM? That way, your Puccini PSI would be a language-independent canonical article URL, with subdomains representing scopes, perhaps with English as the default scope.
How do you think these changes would improve the situation?
For Wikipedia URIs to be able to function as PSIs, Wikipedia must explicitly designate one URI as the canonical URI for each subject; otherwise users won’t know which one to use. This already happens (albeit implicitly) within a single language version of Wikipedia through redirection, as when …/Puccini redirects to …/Giacomo_Puccini. However, this doesn’t work across different languages.
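The implicit canonicalisation by redirection can be sketched in a few lines of Python. This is a toy model only: the REDIRECTS table is written by hand for illustration, the canonical_uri function is a hypothetical name, and nothing here talks to Wikipedia. It shows that redirects settle the choice within one language edition, while each other edition keeps its own separate URI.

```python
# Hand-written illustration of a redirect inside the English Wikipedia.
# A real table would have to come from Wikipedia itself.
REDIRECTS = {
    "http://en.wikipedia.org/wiki/Puccini":
        "http://en.wikipedia.org/wiki/Giacomo_Puccini",
}


def canonical_uri(uri, redirects=REDIRECTS):
    """Follow redirects until a URI with no further redirect is reached."""
    seen = set()
    while uri in redirects:
        if uri in seen:
            raise ValueError("redirect loop at " + uri)
        seen.add(uri)
        uri = redirects[uri]
    return uri


candidates = [
    "http://en.wikipedia.org/wiki/Puccini",
    "http://en.wikipedia.org/wiki/Giacomo_Puccini",
    "http://it.wikipedia.org/wiki/Giacomo_Puccini",
]
# The two English URIs collapse into one; the Italian URI stays separate.
canonical = {canonical_uri(u) for u in candidates}
```

With only these three candidates, two distinct "canonical" URIs remain for one composer, and nothing in the redirect table says they identify the same subject.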
I for one would not be comfortable for English to become the default, because of the cultural bias this involves. I think there needs to be one URI which is language independent. My first thought was to suggest:
http://psi.wikipedia.org/wiki/Giacomo_Puccini
but this would be unfair to the 54,000 speakers of Southern Pashayi in Afghanistan (ISO 639 language code “psi”) who might one day want their own Wikipedia (see http://www.ethnologue.org/show_language.asp?code=psi). An alternative might be:
http://wikipedia.org/psi/Giacomo_Puccini
If Wikipedia were to implement this as the canonical identifier, explicitly stated on each page devoted to the subject in question, my first concern would go away. (One could envisage the possibility of such a PSI automatically resolving to the natural language version of the user’s choice, although this could also raise new concerns.)
It wouldn’t alleviate my second concern, though:
When we developed the Published Subjects paradigm, it was generally agreed that the PSD (published subject descriptor, aka published subject indicator) – the resource that the PSI resolves to – should contain the _bare minimum_ of assertions necessary to unambiguously identify the subject. The thinking was that the more assertions one makes about a subject, the greater the chance that people will disagree, and the more likely that some people will refuse to use the PSI (and that, of course, defeats the whole point of having a PSI). Wikipedia articles typically make many assertions about each subject and although they strive for objectivity, in practice this is never fully attainable.
My third concern is that editorial policy at Wikipedia would prevent you from creating articles about some subjects. (I doubt, for example, that they would countenance an article about my mother’s dog, Barney, http://psi.ontopedia.net/Barney.) This means that Wikipedia could at best only provide a starter set of PSIs, which would have to be supplemented in other ways.
At Ontopedia we have experimented with different approaches, but we have mostly followed the convention of using our own “namespace” (http://psi.ontopedia.net/) together with the “local part” (e.g. Giacomo_Puccini) that Wikipedia uses as its (implicit) canonical URI.
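As a rough illustration of that convention, the following Python sketch derives an identifier in the Ontopedia namespace from a Wikipedia article URL. The function name ontopedia_psi is hypothetical, and the sketch assumes that the local part is copied unchanged from the URL path after /wiki/; which language edition supplies the local part is also an assumption (English is used here only because the examples in this thread do).

```python
from urllib.parse import urlsplit

ONTOPEDIA_NAMESPACE = "http://psi.ontopedia.net/"


def ontopedia_psi(wikipedia_url):
    """Combine the namespace with the local part of a Wikipedia URL."""
    path = urlsplit(wikipedia_url).path
    prefix = "/wiki/"
    if not path.startswith(prefix):
        raise ValueError("not a Wikipedia article URL: " + wikipedia_url)
    # Assumption: the local part is reused exactly as Wikipedia spells it.
    local_part = path[len(prefix):]
    return ONTOPEDIA_NAMESPACE + local_part


psi = ontopedia_psi("http://en.wikipedia.org/wiki/Giacomo_Puccini")
```

The identifier stays under the publisher's own control, while remaining easy to guess for anyone who knows the Wikipedia article name.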
I can only speculate, but I believe that a richer PSD could work as an incentive for people to adopt the PSI. I understand the rationale behind minimal and objective PSDs, but what if the lack of information and context makes me question the quality of the PSI, and its dependability as a lasting identifier for my topic?
I mean, which PSD/PSI feels more authoritative and dependable?
http://psi.ontopedia.net/Collective_intelligence
or
http://en.wikipedia.org/wiki/Collective_intelligence
I choose Wikipedia. It just feels more inviting and cared for. And care breeds confidence. Sure, people can disagree on the definition of a topic, and that happens all the time on Wikipedia. But most edit wars are eventually resolved, and the end product is close to the bare minimum of assertions people are willing to agree upon. Objective enough, you could say.
Let’s flip the question around…
Can I use Wikipedia article URLs as PSIs for my own topic maps? Would you advise against that for any reason?
Hei Vegard,
interesting and necessary discussion. You might also want to have a look at this paper on “Cool URIs for the Semantic Web”:
http://www.w3.org/TR/2007/WD-cooluris-20071217/
The idea of PSIs is very interesting and has strong points, although I prefer the idea of having several unique identifiers at different places on the web for a specific object/topic/idea. That way, selecting the “best” description really becomes a community-driven initiative. People might want to link pointers to the same object through constructs like owl:sameAs, which lets anybody state that two URIs refer to the same thing if they feel they do. An interesting question is how we can trace where such statements came from, so that we can (by integrating a FOAF/trusted peer concept) keep only those statements about such URIs that we really trust.
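A minimal Python sketch of that idea follows. It does not use an RDF library: each statement is a plain (uri_a, uri_b, source) tuple standing in for an owl:sameAs assertion with known provenance, and the function name identity_sets and the source names are hypothetical. Statements from untrusted sources are simply ignored, and the rest are grouped with a small union-find structure.

```python
def identity_sets(statements, trusted_sources):
    """Group URIs linked by sameAs statements from trusted sources.

    statements: iterable of (uri_a, uri_b, source) tuples.
    Returns a list of sets; each set holds the URIs taken to
    identify one subject. URIs that appear only in untrusted
    statements are left out.
    """
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for uri_a, uri_b, source in statements:
        if source in trusted_sources:
            parent[find(uri_a)] = find(uri_b)

    groups = {}
    for uri in list(parent):
        groups.setdefault(find(uri), set()).add(uri)
    return list(groups.values())


# Hypothetical statements; "alice", "bob" and "mallory" are made-up sources.
statements = [
    ("http://en.wikipedia.org/wiki/Giacomo_Puccini",
     "http://it.wikipedia.org/wiki/Giacomo_Puccini", "alice"),
    ("http://it.wikipedia.org/wiki/Giacomo_Puccini",
     "http://psi.ontopedia.net/Giacomo_Puccini", "bob"),
    ("http://en.wikipedia.org/wiki/Giacomo_Puccini",
     "http://example.org/Someone_else", "mallory"),
]
groups = identity_sets(statements, trusted_sources={"alice", "bob"})
```

Because the trust filter is applied per reader, two people with different trusted peers can end up with different identity sets from the same pool of statements.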
Hi Robert!
I understand more of your comment now, after listening to your presentation today about Topic Maps, Semantic Web (and search). The concept of “compound identifiers” using owl:sameAs is very interesting, together with the decentralized infrastructure for topic identification and description. I still don’t know enough about RDF, OWL, FOAF and the rest to really feel I can have a qualified opinion about what works or not, but what you’re saying resonates well with my general understanding.
Thanks for sharing!
Are Gulbrandsen comments on this post on his own blog Everythings a Subject:
He links to this blog post by Lars Marius Garshol: http://www.garshol.priv.no/blog/91.html
(Note: Backlinks have for some reason disappeared from my blog. I need to check out why.)
re: PSI for Puccini
How about using DBpedia[1]? It has already consolidated the different Wikipedia identifiers, with nice labels (“names”) and comments in many languages, plus media references and additional relations; and it is implemented with HTTP content negotiation. For instance:
http://dbpedia.org/resource/Giacomo_Puccini
[1] http://dbpedia.org/About
Today I came across an interesting example of why I wouldn’t want to use Wikipedia as a source of PSIs – or any other source, including DBpedia – unless they guarantee to maintain a policy of stability.
Browsing the article about “Paul Foot” [1] I clicked on the link to “International Socialists (UK)” [2] and was redirected to the page on the “Socialist Workers Party (Britain)” [3]. Although related, these are *not* the same subject. Even worse, the URL shown in the address bar of my browser for the SWP page was that of the IS page.
[1] http://en.wikipedia.org/wiki/Paul_Foot
[2] http://en.wikipedia.org/wiki/International_Socialists_(UK)
[3] http://en.wikipedia.org/wiki/Socialist_Workers_Party_(Britain)
It’s one thing to redirect from …/Puccini to …/Giacomo_Puccini (which one can understand are the same subject), but quite another to redirect from one subject to another.
Thank you for that example, Steve!
Re: Steve Pepper / International Socialists (UK)
Interesting point, which touches on the question of authority. As I see it (not being an expert on the British left), Wikipedia thinks that the International Socialists (UK) are indeed a (historical) part of another subject. Such decisions have to be made, I think, and they are often made for practical reasons.
There are for me two distinct issues: First, do I respect Wikipedia’s authority (as the best option for now, because of their peer-reviewed, open approach) in principle? Second, what kind of options do I have if I don’t agree with this primary source of PSIs or if I want to have a finer-grained conversation? I’d think this would be the occasion for my own PSIs.
Anyway, Wikipedia has a distinct PSI for the International Socialists (UK), just in case:
http://en.wikipedia.org/w/index.php?title=International_Socialists_(UK)&redirect=no
Sorry, please ignore the last paragraph of the previous post – it doesn’t matter, as the page still points to the other one for identification.
I think the best source for person entities as subjects (person PSIs) is the VIAF project (still in beta): “The Deutsche Nationalbibliothek, the Library of Congress, the Bibliothèque nationale de France, and OCLC are jointly conducting a project to match and link the authority records for personal names in the retrospective personal name authority files of the Deutsche Nationalbibliothek (dnb), the Library of Congress (LC), and the Bibliothèque nationale de France (BnF).” (from: http://www.oclc.org/research/projects/viaf/)
ISO 21127:2006, also known as the CIDOC CRM ontology, considers that everything could be a *subject* of an “information object”. So it’s necessary to create terminologies of specialized entities (like VIAF for persons, TGN for locations, etc.) that can be referenced with precision and without ambiguity. These global terminologies should be merged with national and local terminologies in each specific system implementation.
The Giacomo Puccini identifier at VIAF is http://viaf.org/137043 .
I proposed this some years back at http://en.wikipedia.org/w/index.php?title=Wikipedia:Village pump (policy)&oldid=17918242#Wikipedia_pages_as_Published_Subject_Indicators and got exactly nowhere, as you can see. Maybe someone with more clout in the Wikipedia community would have better luck.
@John: I think more is required than the insertion of published subjects boilerplate. At the very least you need a commitment to stability of URLs once published, including a commitment not to redirect from one previously independent page to another. I can’t see Wikipedia accepting that — and nor should they.