Extraction
Introduction
The simplest way to "teach" AWMT is to use the model composition API directly to insert knowledge that is already structured and reliable. However, the system can also learn from unstructured data, such as text or images.
Extraction consists of updating one or more existing abstractions with new information provided in an unstructured format.
Human language is ambiguous and often incomplete, so the system can ask for clarification or confirmation when needed.
Extraction can also be seen as the ability to "learn new knowledge" about one specific object. The ability to generalize that knowledge to all sibling objects is induction.
Extracting from text
mutation {
  create_model(label: "Football Club") {
    man_u: instantiate(label: "Manchester United") {
      model {
        label
      }
    }
    chelsea: instantiate(label: "Chelsea") {
      model {
        label
      }
    }
    arsenal: instantiate(label: "Arsenal") {
      model {
        label
      }
    }
    man_city: instantiate(label: "Manchester City") {
      model {
        label
      }
    }
  }
}
query {
  extract(text: "erling haaland plays for manchester") @stream {
    status
    kind
    label
    start {
      path
      label
    }
    end {
      path
      label
    }
    ambiguity {
      path
      options {
        path
        label
      }
    }
  }
}
{
  "data": {
    "extract": [
      {
        "status": "FOUND",
        "kind": "HAS_PROPERTY",
        "label": "plays for",
        "start": {
          "path": "football_player",
          "label": "Football player"
        },
        "end": {
          "path": "football_club",
          "label": "Football club"
        }
      },
      {
        "status": "CREATED",
        "kind": "INSTANCE_OF",
        "start": {
          "path": "erling_haaland",
          "label": "Erling Haaland"
        },
        "end": {
          "path": "football_player",
          "label": "Football player"
        }
      },
      {
        "status": "AMBIGUOUS",
        "kind": "REFERENCE",
        "ambiguity": {
          "path": "erling_haaland:plays_for",
          "options": [
            {
              "path": "manchester_united",
              "label": "Manchester United"
            },
            {
              "path": "manchester_city",
              "label": "Manchester City"
            }
          ]
        }
      }
    ]
  }
}
In this example, the information extracted from the text has been compared against the existing models.
The system knows what a football player is, what a football club is, and that a football player may play for a club, and it knows a few clubs by name.
It has extracted that there is a new football player called Erling Haaland and that he plays for a club, but it cannot tell which one, because several clubs match the text. It therefore asks for clarification between Manchester United and Manchester City, leaving it to the user to "force" the disambiguation.
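As an illustration of how a client might consume such a response, here is a minimal Python sketch that groups the extracted items by status and lists the choices the user still has to make. It only reads the fields shown in the example response above. The helper names `group_by_status` and `pending_questions` are hypothetical and not part of AWMT, and how the user's choice is sent back to the system is not covered here.

```python
from collections import defaultdict

# Sample response copied from the example above.
response = {
    "data": {
        "extract": [
            {"status": "FOUND", "kind": "HAS_PROPERTY", "label": "plays for",
             "start": {"path": "football_player", "label": "Football player"},
             "end": {"path": "football_club", "label": "Football club"}},
            {"status": "CREATED", "kind": "INSTANCE_OF",
             "start": {"path": "erling_haaland", "label": "Erling Haaland"},
             "end": {"path": "football_player", "label": "Football player"}},
            {"status": "AMBIGUOUS", "kind": "REFERENCE",
             "ambiguity": {
                 "path": "erling_haaland:plays_for",
                 "options": [
                     {"path": "manchester_united", "label": "Manchester United"},
                     {"path": "manchester_city", "label": "Manchester City"},
                 ]}},
        ]
    }
}


def group_by_status(items):
    """Group extraction items by their "status" field (hypothetical helper)."""
    groups = defaultdict(list)
    for item in items:
        groups[item["status"]].append(item)
    return dict(groups)


def pending_questions(items):
    """Return (path, option labels) for every AMBIGUOUS item (hypothetical helper)."""
    questions = []
    for item in items:
        if item["status"] != "AMBIGUOUS":
            continue
        ambiguity = item["ambiguity"]
        labels = [option["label"] for option in ambiguity["options"]]
        questions.append((ambiguity["path"], labels))
    return questions


items = response["data"]["extract"]
groups = group_by_status(items)
questions = pending_questions(items)

# Show the user what still needs to be disambiguated.
for path, labels in questions:
    print(f"{path}: choose one of {', '.join(labels)}")
```

Separating the ambiguous items from the ones that were found or created lets a client display what was already learned while prompting the user only for the remaining choices.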