[Person-ontology] Person-ontology Digest, Vol 1, Issue 11

Paul Trevithick paul at socialphysics.org
Wed Oct 10 21:02:50 PDT 2007


Inline

> -----Original Message-----
> From: person-ontology-bounces at idcommons.net
>   [mailto:person-ontology-bounces at idcommons.net] On Behalf Of Parsa Mirhaji
> Sent: Wednesday, October 10, 2007 11:36 AM
> To: person-ontology at idcommons.net
> Subject: Re: [Person-ontology] Person-ontology Digest, Vol 1, Issue 11
> 
> > The duplication is driven by the need in identity management systems to
> > be able not only to make statements, but also to make statements about
> > these statements. In Higgins we call the former "Attributes" and the
> > latter "Metadata". For example, we want to make this Attribute statement
> > about Digital Subject Adam: "Adam hasHairColor blonde", but we also want
> > to say that this value "blonde" was provided by the Dept. of Motor
> > Vehicles and has an expiration date of Jan 1, 2010.
> >
> > If we just use a triple "Adam hasHairColor blonde" where blonde is a
> > literal, we've no ability to make statements about the value "blonde".
> > So what we do in Higgins is split it into two triples and hang the
> > metadata off the value node, shown loosely here:
> >
> >    Adam hasHairColor value1
> >    value1 value "blonde"
> >    value1 expires "Jan 1, 2010"
> >    value1 source "Dept of Motor Vehicles"
> > ...
> >- Paul
> .
> 
> If we want to assert some statements about a statement (reification), this
> is not how to do it. Your snippet doesn't say anything about the
> statement, only about value1.

Yes, I knew all along I wasn't following the "official RDF way" in this
aspect. Higgins essentially defines its own convention. That was fine when
Higgins was its own closed world, but it is now causing problems as we seek
to relate to other efforts.

I also realized, as you point out, that the Higgins approach isn't really
reification of the statement itself, just a way to make statements about the
value. However, in the Higgins set of use cases, we found that the
distinction wasn't material. Talking about the values was good enough.

I found that the "official" approach of turning a simple [s p o] triple into
three triples [[statement subject value], [statement predicate value],
[statement object value]] just so you could add the statements
[statement source DMV] and [statement expires date] was verbose and awkward.
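
As a rough illustration of the size difference, here is a minimal Python
sketch that builds both encodings as plain (subject, predicate, object)
tuples, with no RDF library involved. The node names "value1" and "stmt1"
and both function names are made up for this sketch; the reified form also
carries the rdf:type triple and the separately asserted original triple,
which is an assumption about how one would typically write it out.

```python
# Sketch only: triples are plain (subject, predicate, object) tuples.
# "value1" and "stmt1" are hypothetical node identifiers.

def higgins_style(subject, attribute, literal, metadata):
    """Higgins convention: the attribute points at a value node,
    and the metadata is attached to that value node."""
    node = "value1"
    triples = [(subject, attribute, node), (node, "value", literal)]
    triples += [(node, key, val) for key, val in metadata.items()]
    return triples


def rdf_reification(subject, attribute, literal, metadata):
    """RDF reification: a Statement node describes the triple,
    and the metadata is attached to the Statement node."""
    stmt = "stmt1"
    triples = [
        (subject, attribute, literal),        # the triple itself, asserted separately
        (stmt, "rdf:type", "rdf:Statement"),
        (stmt, "rdf:subject", subject),
        (stmt, "rdf:predicate", attribute),
        (stmt, "rdf:object", literal),
    ]
    triples += [(stmt, key, val) for key, val in metadata.items()]
    return triples


meta = {"source": "Dept of Motor Vehicles", "expires": "Jan 1, 2010"}
higgins = higgins_style("Adam", "hasHairColor", "blonde", meta)
reified = rdf_reification("Adam", "hasHairColor", "blonde", meta)
print(len(higgins), len(reified))
```

Under these assumptions the value-node form needs 4 triples and the reified
form needs 7 for the same hair-color attribute with two pieces of metadata.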

If these were my only points, then in the interest of bridging Higgins with
other worlds (such as the semantic folks on this list), we could abandon our
own "convention" and use the "official" approach. But there's one more
reason that is much more troubling to me than the above...

We wanted to allow Higgins developers to define their own ontologies based
on higgins.owl, wherein they could define their own OWL Classes (both
sub-classes of higgins:DigitalSubject and sub-classes of higgins:Value for
new kinds of attribute values) and their own OWL ObjectProperties
(sub-properties of higgins:attribute). [They can do this today to a lesser
extent in LDAP schemas, etc. We wanted something more expressive (as OWL
is).]

We wanted them to be able to express things like this by defining their own
OWL Classes and ObjectProperties:

- A Person is a class that is a sub-class of DigitalSubject.
- A Person has these attributes:
  - 1..1 hairColor
  - 1..2 eyeColor
  - 0..1 streetAddress
  - etc. 
- Define the ranges of these newly defined attributes (e.g. hairColor is
simply a higgins:StringSimpleValue) while inheriting the ability to attach
"metadata" to instances of these values.
- Define a streetAddress attribute AND define a new complex class
(StreetAddress) that would define the value of the streetAddress attribute
as being composed of street, city, and zip sub-parts.
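
The kind of definition listed above can be sketched in a few lines of
Python. This is not Higgins or OWL code: the class names AttributeDef and
ClassDef, the function cardinality_errors, and the sample values for Adam's
eye color and address are all made up for illustration. It also treats the
cardinalities as closed-world validation checks, which is a simplification,
since OWL restrictions are interpreted under the open-world assumption.

```python
from dataclasses import dataclass, field


@dataclass
class AttributeDef:
    name: str
    min_count: int   # lower bound of the cardinality, e.g. 1 in "1..2"
    max_count: int   # upper bound of the cardinality, e.g. 2 in "1..2"


@dataclass
class ClassDef:
    name: str
    parent: str
    attributes: list = field(default_factory=list)


# Person as a sub-class of DigitalSubject with the cardinalities listed above.
PERSON = ClassDef("Person", "DigitalSubject", [
    AttributeDef("hairColor", 1, 1),
    AttributeDef("eyeColor", 1, 2),
    AttributeDef("streetAddress", 0, 1),
])


def cardinality_errors(class_def, instance):
    """instance maps attribute name -> list of value nodes. Each value node
    is a dict holding the value plus any metadata attached to it."""
    errors = []
    for attr in class_def.attributes:
        count = len(instance.get(attr.name, []))
        if not attr.min_count <= count <= attr.max_count:
            errors.append(
                f"{attr.name}: expected {attr.min_count}..{attr.max_count}, found {count}"
            )
    return errors


# Made-up sample data; only the hair color comes from the example above.
adam = {
    "hairColor": [{"value": "blonde", "source": "Dept of Motor Vehicles",
                   "expires": "Jan 1, 2010"}],
    "eyeColor": [{"value": "blue"}],
    "streetAddress": [{"value": {"street": "1 Main St", "city": "Springfield",
                                 "zip": "00000"}}],
}
print(cardinality_errors(PERSON, adam))
```

Because every attribute value is a node rather than a bare literal, the
metadata rides along on the value while the class definition still
constrains the attribute directly.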

I hope I'm missing something, but if we use "official" reification, then we
lose the ability to define useful OWL Classes at all. Why? Because if every
attribute were represented as a Statement instance (as is required in
"official" reification), then every Class (whether a DigitalSubject
sub-class or an attribute "value" sub-class like StreetAddress) would say
effectively: "Person is a subclass of DigitalSubject and it is comprised of
some Statements but I can't describe anything about them." That means you
can't create an ontology at all. In summary, official reification
effectively means that none of OWL's expressiveness can be used;
we're back to RDFS.

> Here is a classic reification example using blond as
> xsd:string:
> 
> This is the statement about Adam's hair color:
> <rdf:Description rdf:about="#Adam">
>   <hasHairColor
>       rdf:datatype="http://www.w3.org/2001/XMLSchema#string">blond</hasHairColor>
> </rdf:Description>
> 
> And this is its reification stating expiration dates and source of data:
> <rdf:Description rdf:about="#Adam-hasHairColor">
>   <rdf:type
>       rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement"/>
>   <rdf:subject rdf:resource="#Adam"/>
>   <rdf:predicate rdf:resource="#hasHairColor"/>
>   <rdf:object
>       rdf:datatype="http://www.w3.org/2001/XMLSchema#string">blond</rdf:object>
>   <expiresAt
>       rdf:datatype="http://www.w3.org/2001/XMLSchema#string">10-10-2007</expiresAt>
>   <sourceOfData
>       rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Houston DPS</sourceOfData>
> </rdf:Description>
> 

Understood.

> 
> 
> 
> >
> > Naively, I wonder what you mean by value1 expires "Jan 1, 2010".
> > Does the blonde-ness of Adam's hair expire?
> >
> > If you mean that the hair color information is known to be good on
> > Jan 1, 2010, good for you, but I find even that confusing.
> >
> > In general, this seems like some category error of talking about a
> > value having an expiration date.
> > The best I can assume from this snippet is that Adam's driver's license
> > is expiring on Jan 1, 2010, which is totally different...
> >
> > David
> 
> I don't think one would evaluate the semantics of an assertion at face
> value. That is, what we as human beings infer from a statement has
> nothing to do with what a reasoner or application deduces from it, as that
> depends on the framework and logic system used to interpret and infer, and
> on the other rules that exist in the system.
> There might be valid use cases for making such assertions. For example, I
> think that US immigration expires your fingerprints every 18 months or so
> (even though they are supposed to be your unique lifelong companions). For
> such a system, any value that has passed an expiration date should be
> collected again. This representation can, at minimum, support that use
> case, although I agree in an awkward way....
> 

The awkwardness I could live with if it didn't mean throwing away pretty
much all the goodies in OWL too.

> With that said, I find this method of not expressing any value simply as a
> literal very useful (for reasons below), and we use it frequently. For
> example, one may tie the literal values of such String Nodes to an NLP tool
> and map its interpretation graph to the same node along with its literal
> value, and another rule may interpret that in the context of other
> knowledge and add its inferences to the same node. Hence, looking at a
> value node, you can see its literal value and all its implications through
> the same node...
> One example of my use case for this:
> 
> Rather than directly saying [Adam hasAge "20 years"] this is what we say:
> 
> <Person rdf:ID="Adam">
>     <hasAge rdf:resource="#adamsAge"/>
>   </Person>
> 
> <literalValue
>     rdf:datatype="http://www.w3.org/2001/XMLSchema#string">20 years</literalValue>
> 
> 
> <rdf:Description rdf:about="#Adam">
> 
>     <rdf:type rdf:resource="#Person"/>
> 
>     <hasAge rdf:resource="#adamsAge"/>
> 
> </rdf:Description>
> 
> 
> <rdf:Description rdf:about="#adamsAge">
>     <literalValue
>         rdf:datatype="http://www.w3.org/2001/XMLSchema#string">20 years</literalValue>
> </rdf:Description>
> 
> And this is what happens after inference/rule engine plus NLP service work
> on the preliminary graph:
> 
> <rdf:Description rdf:about="#Adam">
> 
>     <rdf:type rdf:resource="#Person"/>
> 
>     <hasAge rdf:resource="#adamsAge"/>
> 
> </rdf:Description>
> 
> 
> <rdf:Description rdf:about="#adamsAge">
>     <rdf:type rdf:resource="#Age"/>
> 
>     <rdf:type rdf:resource="#AgeGroup_16-25"/>
> 
>     <unit>Years</unit>
>     <numericValue>20</numericValue>
> 
>     <literalValue
>         rdf:datatype="http://www.w3.org/2001/XMLSchema#string">20 years</literalValue>
> 
> </rdf:Description>
> 
> This way, if a second system reports another age, or if 10 years later we
> measure a different age from the same system, discrepancies between
> reported values are contained and we know which inference came from
> where....
>

Thank you for this. Very helpful.
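
For my own understanding, here is a rough Python sketch of the enrichment
step described above. The NLP service and rule engine are replaced by a
simple regular expression, which is purely a stand-in; the function name
enrich_age and the dict layout are made up for illustration, while the
property and type names (literalValue, numericValue, unit, Age,
AgeGroup_16-25) follow the quoted example.

```python
import re


def enrich_age(node):
    """Stand-in for the NLP/rule step: parse the literal and add the
    inferences to a copy of the same node, leaving the literal untouched."""
    match = re.fullmatch(r"\s*(\d+)\s+(\w+)\s*", node["literalValue"])
    if match is None:
        return dict(node)  # nothing could be inferred from the literal
    enriched = dict(node)
    enriched["numericValue"] = int(match.group(1))
    enriched["unit"] = match.group(2).capitalize()
    types = ["Age"]
    # Age-group rule taken from the quoted example graph.
    if enriched["unit"] == "Years" and 16 <= enriched["numericValue"] <= 25:
        types.append("AgeGroup_16-25")
    enriched["types"] = types
    return enriched


adams_age = {"literalValue": "20 years"}
result = enrich_age(adams_age)
print(result)
```

The original literal stays on the node next to everything inferred from it,
so a later report of a different age would get its own node and its own
inferences.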

> One last note: I think redoing all XSD classes is overkill, and the
> intended effects could have been achieved in simpler ways (this as one
> example)...

Oh, I agree. All of this XSD "overkill" is a consequence of the more
fundamental approach that I've described above, which is so far the only way
I know to "allow the ability to make statements about values AND not
effectively prevent the ability to use OWL constructs". If we can solve the
fundamental problem in a new, better way, all this XSD stuff might also go
away.


