Thursday, March 31, 2005

Unique Name Assumption

I just read Andrew Newman's entry on the Unique Name Assumption (UNA). He thinks that not having a UNA is "weird, completely backwards and very non-intuitive". He continues that "It does seem perverse that the basis for this, the URI, is unique." He cites an OWL Flight paper that caused me quite some headache a few weeks ago (because there was so little in it that I found to like).

Andrew, whose blog I really like to read, makes one very valid point: "It doesn't really say, though, why you need non-unique names."

There is an OWL requirement that gives a short rationale for the UNA, but it seems it is not yet stated clearly enough.
Let's make a short jump to the near future: the Semantic Web is thriving, private homepages offer rich information sources about anything, and even the companies see the value of offering machine-processable information. Thus, ontologies and knowledge bases everywhere!

People want to say how they liked the movie they just saw. They enrich their movie review with an RDF statement that says http://semantic.nodix.net/movie#Ring_2 http://semantic.nodix.net/rating#rated http://semantic.nodix.net/rating#4_of_5. Or rather, their editor creates this statement automatically and publishes it along with the review.
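To make the shape of such a statement concrete, here is a minimal sketch that holds the triple from the review as a plain Python tuple and writes it out as one N-Triples line. The helper name to_ntriples is made up for illustration, and it only handles triples whose three terms are all URIs.

```python
# A triple is (subject, predicate, object); in this example all three are URIs.
MOVIE = "http://semantic.nodix.net/movie#"
RATING = "http://semantic.nodix.net/rating#"

review_statement = (MOVIE + "Ring_2", RATING + "rated", RATING + "4_of_5")


def to_ntriples(triple):
    """Serialise one all-URI triple as a single N-Triples line (hypothetical helper)."""
    return " ".join(f"<{term}>" for term in triple) + " ."


print(to_ntriples(review_statement))
```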

I'd be highly surprised if IMDb used the same URI to denote the movie. They would probably use an IMDb URI. And so could I, using the IMDb-specified URI for the movie. But I didn't, and I don't have to. If I want to state that this is the same movie, I can assert that explicitly. If I had UNA, I couldn't do that. The two knowledge bases could not work together.
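A rough sketch of what "asserting that explicitly" buys you: once one owl:sameAs statement links the two names, statements from both knowledge bases can be rewritten onto a single representative and queried together. The second URI and the ex: terms below are invented placeholders, not real IMDb identifiers, and the merge is a simple union-find, not a full OWL reasoner.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def canonicalise(triples):
    """Merge names linked by owl:sameAs and rewrite all other triples accordingly."""
    parent = {}

    def find(name):
        parent.setdefault(name, name)
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for s, p, o in triples:
        if p == OWL_SAME_AS:
            root_s, root_o = find(s), find(o)
            if root_s != root_o:
                # Pick the lexicographically smaller name as the representative.
                parent[max(root_s, root_o)] = min(root_s, root_o)
    return {(find(s), p, find(o)) for s, p, o in triples if p != OWL_SAME_AS}


mine = "http://semantic.nodix.net/movie#Ring_2"
theirs = "http://example.org/movies#ring-two"  # hypothetical stand-in for an IMDb URI

triples = [
    (mine, "http://semantic.nodix.net/rating#rated", "http://semantic.nodix.net/rating#4_of_5"),
    (theirs, "http://example.org/terms#type", "http://example.org/terms#Movie"),  # hypothetical
    (mine, OWL_SAME_AS, theirs),
]

merged = canonicalise(triples)
print(len({s for s, _, _ in merged}))  # both statements now share one subject
```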

With UNA, many knowledge bases relying on inverse functional properties would break as well. FOAF, for example, uses this, identifying persons with an IFP of their email hash. With UNA, this wouldn't work anymore.
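As an illustration of how an inverse functional property identifies people, the sketch below groups differently named nodes that carry the same email hash. The hash follows the FOAF convention of taking the SHA1 of the mailto: URI; the node URIs and email addresses are invented, and smush is a hypothetical helper name.

```python
import hashlib
from collections import defaultdict


def mbox_sha1sum(email):
    """SHA1 of the mailto: URI, as used for foaf:mbox_sha1sum."""
    return hashlib.sha1(f"mailto:{email}".encode("utf-8")).hexdigest()


def smush(descriptions):
    """Group node names that share a value of the inverse functional property."""
    by_hash = defaultdict(set)
    for node, email_hash in descriptions:
        by_hash[email_hash].add(node)
    return [names for names in by_hash.values() if len(names) > 1]


# Hypothetical descriptions harvested from two different homepages.
descriptions = [
    ("http://example.org/home#me", mbox_sha1sum("denny@example.org")),
    ("http://example.com/friends#denny", mbox_sha1sum("denny@example.org")),
    ("http://example.com/friends#jurica", mbox_sha1sum("jurica@example.org")),
]

print(smush(descriptions))
```

Without UNA, each group returned by smush can be read as "these names denote the same person". With UNA, the same data would be a contradiction instead.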

Let's take another example. On my mother's webpage there could be a statement saying she has three kids, Jurica, Rozana and Zdenko. I would state on my page that I am my mom's kid. My sister, being the social kind, tells the world about her mom and her two brothers, Jurica and Denny.
Now, if we have UNA, a reasoner would infer that one of us is lying. But all of us are very honest, trustworthy people. The problem here is that my name is Zdenko, but most people refer to me as Denny. UNA says that Denny and Zdenko are different persons. If we have no UNA, we wouldn't assume that. But we can still state distinctness explicitly where we want it: my mom could have said that she has three kids, Jurica, Rozana and Zdenko, and that those are mutually distinct. Problem solved.
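The family example can be checked mechanically. The following sketch is a brute-force consistency check, not how a real reasoner works: it asks whether the four names mentioned across our pages can denote at most three individuals, with and without UNA. The function name satisfiable and its parameters are made up for illustration.

```python
from itertools import combinations, product


def satisfiable(names, max_children, different=(), una=False):
    """Can `names` denote at most `max_children` distinct individuals?"""
    names = sorted(set(names))
    if una:
        # Every name is its own individual, so just count the names.
        return len(names) <= max_children
    # Without UNA, try every way of mapping names onto `max_children` individuals
    # and accept the first mapping that respects all explicit distinctness claims.
    for labels in product(range(max_children), repeat=len(names)):
        who = dict(zip(names, labels))
        if all(who[a] != who[b] for a, b in different):
            return True
    return False


mentioned = ["Jurica", "Rozana", "Zdenko", "Denny"]
mutually_distinct = list(combinations(["Jurica", "Rozana", "Zdenko"], 2))

print(satisfiable(mentioned, 3, una=True))                       # False: four names, three kids
print(satisfiable(mentioned, 3, mutually_distinct, una=False))   # True: Denny may be one of the three
```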

You could say, wait, if we had UNA we could still just claim that Zdenko owl:sameAs Denny, and the problem wouldn't arise. That is true. But then I would have to consider my mom's statements. That may be OK on a scale like this, but imagine this in the wilds of the web - you would have to consider every statement made about something before you may state something as well. Impossible! And you would introduce non-monotonic inferences, and you probably wouldn't really want that.

What does this mean? Let's take the following sequence of statements, and consider the answer to the question "Is Kain one of Adam's two sons?". So we know that Adam has two sons, and that there is an entity named Kain.

Adam fatherOf Abel.
UNA and non-UNA both answer: don't know.
Adam fatherOf Cain.
UNA says "No, Kain is no son of Adam". non-UNA says: "Sorry, I still don't know".
Cain sameAs Kain.
UNA says "Yes, Kain is a son of Adam (hope you didn't notice my little lie seconds before)". non-UNA says: "Yes, Kain is a son of Adam".

Assuming that, instead of the last statement, we claimed that
Adam fatherOf Kain.
UNA would say: "I'm messed up, I don't know anything, my database is inconsistent, sorry.", whereas non-UNA would answer: "Yes, Kain is a son of Adam (and by the way, maybe Kain and Abel are the same, or Kain and Cain, or Abel and Cain)."
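The dialogue above can be replayed with a toy function. This is only a sketch under simplifying assumptions: UNA is modelled the way the dialogue treats it, as "different names are different individuals unless a sameAs says otherwise", and the only knowledge is a list of asserted sons plus the fact that Adam has exactly two. The names is_son and merge_groups are hypothetical.

```python
def merge_groups(same_as):
    """Turn sameAs pairs into groups of names that denote one individual."""
    groups = []
    for a, b in same_as:
        touching = [g for g in groups if a in g or b in g]
        merged = {a, b}.union(*touching)
        groups = [g for g in groups if g not in touching] + [merged]
    return groups


def is_son(name, asserted_sons, same_as=(), expected_sons=2, una=False):
    """Answer 'yes', 'no', 'unknown' or 'inconsistent' for: is `name` a son?"""
    groups = merge_groups(same_as)

    def canon(n):
        # One representative per group of names declared the same.
        return next((min(g) for g in groups if n in g), n)

    known = {canon(s) for s in asserted_sons}
    if una and len(known) > expected_sons:
        return "inconsistent"  # more distinct sons than Adam is allowed to have
    if canon(name) in known:
        return "yes"
    if una and len(known) == expected_sons:
        return "no"            # all sons are accounted for, and Kain is not one of them
    return "unknown"


for una in (True, False):
    print("UNA" if una else "non-UNA",
          is_son("Kain", ["Abel"], una=una),
          is_son("Kain", ["Abel", "Cain"], una=una),
          is_son("Kain", ["Abel", "Cain"], [("Cain", "Kain")], una=una),
          is_son("Kain", ["Abel", "Cain", "Kain"], una=una))
```

Note how the UNA answers go from "no" to "yes" as statements are added, while the non-UNA answers only ever move from "unknown" to "yes".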

The problem is that in the setting of the Semantic Web you have a World Wide Web with thousands of facts, always changing, and you must assume that you didn't fetch all the information about a subject. You really can't know if you know everything there is about Adam. But you still want to be able to ask questions. And you want to get answers, and you want these answers to be monotonic. You don't want the Semantic Web to answer "No" one day, "Yes" the next and sometimes "I don't know", but you could be fine with having it provide either the correct answer or none at all.

OWL Flight and the proponents of UNA actually forgot that it's a Semantic Web, not just a Semantic Knowledge Base. If you want UNA, take your Prolog engine. The Semantic Web is more. And therefore it has to meet some requirements, and non-UNA is an astonishingly basic requirement of the Semantic Web. Don't forget, you can create local unique names if needed. But the other way around would be much harder.

Still, Andrew's arguments lead to a very important question: taking for granted that Andrew is an intelligent guy with quite some experience with this kind of stuff, how probable is it that Joe Random User will have really big problems grasping concepts such as non-UNA? How should the primers be written? How should the tools work in order to help users deal with this stuff - without requiring the user to study these ideas in advance?

Still a long way to go.

Tuesday, March 22, 2005

AIFB OWL Tools

Working with ontologies isn't yet as easy as it could be - especially because the number of little helpers is still far too small. After having written dlpconvert and owlrdf2owlxml (the tool with perhaps the clumsiest name in the history of the Semantic Web) I noticed how easy it would be to write some more tools based on Boris' KAON2 OWL ontology infrastructure.

And so I went ahead. First I integrated dlpconvert and owlrdf2owlxml (r2x for short) into it, then I added a simple ontology dumper and an axiom and entity counter. Want to know how many individuals are in your ontology? Simply type owl count myontology.owl -individual, and there you go. Want a list of all classes? Try owl print myontology.owl -owlclass. It's as easy as that.

I'm totally aware that this functionality may not be worth the effort of building a tool for. But this is just a beginning: I want to add more functionality to filter, merge, compare and much more. The point is to have, in the end, a handy little set of OWL tools you can work with. I have really missed that with OWL, and now here it is. At least, a beginning.

Grab your copy now at the AIFB OWL Tools site.

Monday, March 21, 2005

Philosophische Grundlagen (Philosophical Foundations)

I gave a talk on Philosophical Foundations of Ontologies last week at the AIFB. I prepared it in German (and thus, all the slides were in German) and just before I started I was asked if I could give the talk in English.
Having never heard a single lesson in philosophy in English and having read English philosophy only on Wikipedia before, I said yes. Nevertheless, the talk was very well received, and so I decided to upload it. It's pure evil PowerPoint, no semantic slides format, and I haven't yet managed to translate it to English. If anyone can offer me some help with that - I simply don't know many of the technical terms, and I don't have ready access to the sources - I would be very happy! Just drop me a note, please.

Philosophische Grundlagen der Ontologie ("Philosophical Foundations of Ontology"; PowerPoint, approx. 4.5 MB)

Friday, March 18, 2005

What's DLP?

OWL has some sublanguages which are all more or less connected to each other, and they don't make the mumbo jumbo of ontology languages any clearer. There is the almighty OWL Full, there's OWL DL, the easy* OWL Lite, and then there are numerous 'proprietary' extensions, which are more (OWL-E) or less (OWL Flight) compatible and useful.

We'd like to add another one, OWL DLP. Not because we think that there aren't enough already, but because we think this one makes a difference: it has some nice properties, like being fully translatable to logic programs, it is easy to use, and it is fully compatible with standard OWL, so you don't have to use any extra tools.
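To give a feeling for what "translatable to logic programs" means, here is a small sketch that turns a few simple axiom kinds into Prolog-style Horn rules. The tuple encoding of the axioms, the function name to_rule and the example vocabulary are all made up for illustration; this is neither the output format of any particular tool nor the full DLP fragment.

```python
def to_rule(axiom):
    """Translate one simple axiom (hypothetical tuple encoding) into a Horn rule."""
    kind = axiom[0]
    if kind == "SubClassOf":       # every instance of sub is an instance of sup
        _, sub, sup = axiom
        return f"{sup}(X) :- {sub}(X)."
    if kind == "SubPropertyOf":    # whatever is related by sub is related by sup
        _, sub, sup = axiom
        return f"{sup}(X, Y) :- {sub}(X, Y)."
    if kind == "Domain":           # subjects of prop belong to cls
        _, prop, cls = axiom
        return f"{cls}(X) :- {prop}(X, _)."
    if kind == "Range":            # objects of prop belong to cls
        _, prop, cls = axiom
        return f"{cls}(Y) :- {prop}(_, Y)."
    if kind == "Transitive":
        _, prop = axiom
        return f"{prop}(X, Z) :- {prop}(X, Y), {prop}(Y, Z)."
    raise ValueError(f"axiom kind not covered by this sketch: {kind}")


# Hypothetical mini ontology; names are lowercase so they are valid Prolog predicates.
ontology = [
    ("SubClassOf", "movie", "work"),
    ("Domain", "directed", "person"),
    ("Range", "directed", "movie"),
    ("Transitive", "sequel_of"),
]

for axiom in ontology:
    print(to_rule(axiom))
```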

If you want to read more, some colleagues at the AIFB and I wrote a short introduction to DLP (and the best thing is: if I say short, I mean short. Just two pages!). It's meant to be easy to understand as well - but if you have any comments on that, please provide them.

* whatever easy means here

Wednesday, March 16, 2005

New versions: owlrdf2owlxml, dlpconvert

New versions of owlrdf2owlxml and dlpconvert are out.

owlrdf2owlxml got renamed; it was formerly known as rdf2owlxml. But as a colleague pointed out, that name can easily be misunderstood as meaning that it transforms arbitrary RDF to OWL. It doesn't do that, it only transforms OWL to OWL, from RDF/XML serialisation to XML Presentation Syntax. And it seems to work quite stably: it can even transform the famous wine ontology. Version 0.4 is out now.

dlpconvert lost a lot of its bugs. And as most of you were feeding RDF/XML to it anyway, well, now you can do so officially (listen to the users). It reads both syntaxes, and creates a Prolog program out of your ontology. Version 0.7 is out.

They are both based on KAON2, the Karlsruhe Ontology Infrastructure module, written by Boris Motik. My little tools are just wrapped around KAON2 and use its functionality. To be honest, I'm thinking of writing quite a number of little tools like this that offer different functionality, thus providing you with a nice toolkit to handle ontologies efficiently. I don't lack ideas right now, it's just that I'm not sure that there's interest in this.

Well, maybe I should just start and we'll see...

By the way, both tools are not only available as web services; you may also download them as command line tools from their respective websites and play around with them on your PC. That's a bit more comfortable than using a browser as your operating system.

Friday, March 11, 2005

Unexpected problems

As you know, I'm a strong believer in the vision of the Semantic Web, and I actively pursue this goal. I am not too sure what it means, but I have hundreds of ideas floating through my head, about what will be possible in this future...

But the road seems longer than expected. For some time now I have had the dlpconvert and rdf2owlxml web services running. It is very enlightening and interesting to see what kind of ontologies were used for testing. And I most certainly don't mean the domain of the ontologies used, but rather the syntax.

Both services state very clearly what syntaxes you may use. dlpconvert allows only OWL XML presentation syntax, rather obscure, I admit. That's the main reason rdf2owlxml was offered. But most people didn't care, they just kept on using RDF - and not just OWL in RDF/XML serialisation, but much simpler, plain RDF.

Yeah, every RDF document is in OWL Full. But dlpconvert only deals with OWL DL. That's stated explicitly. And it works even less with Abstract Syntax or N3. All of this was tested.

I most definitely don't want to rant about users here. You should never rant about users (I mean, in public). Especially since everyone who uses a service like dlpconvert is probably quite intelligent and has some expertise in the field of the Semantic Web. It's not their fault. It isn't mine either, I wrote quite explicitly what is needed. Maybe it's the W3C's fault, or maybe politics is to blame.

The fine differences between RDF, RDFS, RDF(S), OWL, OWL Full, OWL DL, OWL Lite, DLP - yes, I said fine differences between RDF and OWL DL - it's just too much to cope with. If it is too much for us, what do we expect of the future user of the Semantic Web? The web as we know it grew to its present size because it was easy. It wasn't because of standards. For the first few years no one really cared about the HTML standard, I mean, not to the extent we do today in the Semantic Web. Even with tons of errors, pages would load and show nice results. It was a very forgiving system. And now, guess why it was so widely adopted?

The problem is: maybe we really need to be as strict as we are. But I hope we don't. I strongly believe in the virtue of "View source" - but this means understandable views on the source. Not RDF/XML serialisation. And still easy to copy. Only this way can the Semantic Web lift off from the roots, from the users. The users were creating the Web in the first years, not the companies. I don't know why everybody is turning to the companies today.

Oh, I should stop, it sounds like ranting again.