I've been playing with using KIO::get() to make queries on the DBPedia SPARQL endpoint, parse the XML result set and convert it to be used by a Plasma Data Engine. I'll explain how it works as I think it is pretty useful and makes it very easy to link up applets with Semantic Web/Desktop data.
This is the basic SPARQL query. It takes the name of an artist and retrieves details of all the albums they've made (the album name, the URI of the album's DBPedia resource, the release date and the cover art picture):
PREFIX p: <http://dbpedia.org/property/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT * WHERE {
  ?album p:artist <http://dbpedia.org/resource/The_Velvet_Underground>.
  ?album rdf:type <http://dbpedia.org/class/yago/Album106591815>.
  OPTIONAL {?album p:cover ?cover}.
  OPTIONAL {?album p:name ?name}.
  OPTIONAL {?album p:released ?dateofrelease}.
}
I borrowed the example query from this article about making a timeline of albums. You post the query string to the URL of the DBPedia SPARQL endpoint, which is http://dbpedia.org/sparql, and the query results are returned in a simple-to-parse XML format. They look like this:
<?xml version="1.0" ?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2001/sw/DataAccess/rf1/result2.xsd">
  <head>
    <variable name="album"/>
    <variable name="cover"/>
    <variable name="name"/>
    <variable name="dateofrelease"/>
  </head>
  <results distinct="false" ordered="true">
    <result>
      <binding name="album"><uri>http://dbpedia.org/resource/1969:_The_Velvet_Underground_Live</uri></binding>
      <binding name="cover"><uri>http://upload.wikimedia.org/wikipedia/en/3/3c/1969Live.jpg</uri></binding>
      <binding name="name"><literal xml:lang="en">1969: The Velvet Underground Live</literal></binding>
      <binding name="dateofrelease"><literal datatype="http://www.w3.org/2001/XMLSchema#gYearMonth">1974-09-01 00:00:00.000000</literal></binding>
    </result>
    ...
  </results>
</sparql>
So if you're familiar with SQL queries, a SPARQL SELECT query is very similar. In order to make it work well with the Plasma Data Engine model you need to decide which of the values is the most important, and in this case it's the album name.
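To make the idea of a "most important" value concrete, here is a minimal sketch in plain Ruby, with no KDE bindings involved. It assumes the result set has already been parsed into one Hash per result row. The helper name index_by_primary and the second sample row (including its example.org URI) are made up for illustration, and skipping rows that lack the primary value is a choice of this sketch rather than something the engine below does.

```ruby
# Hypothetical sample rows, shaped like a parsed result set:
# one Hash per <result>, keyed by SPARQL variable name.
ROWS = [
  { 'album' => 'http://dbpedia.org/resource/1969:_The_Velvet_Underground_Live',
    'cover' => 'http://upload.wikimedia.org/wikipedia/en/3/3c/1969Live.jpg',
    'name' => '1969: The Velvet Underground Live',
    'dateofrelease' => '1974-09-01 00:00:00.000000' },
  # Made-up row: OPTIONAL variables such as cover may simply be absent.
  { 'album' => 'http://example.org/hypothetical-album',
    'name' => 'White Light/White Heat' }
]

# Index the rows by the chosen primary variable, so that each primary
# value (here the album name) maps to that row's attributes.
def index_by_primary(rows, primary)
  rows.each_with_object({}) do |row, sources|
    key = row[primary]
    next if key.nil? # the primary variable was not bound for this row
    sources[key] = row
  end
end

albums = index_by_primary(ROWS, 'name')
albums.each do |source, attributes|
  attributes.each { |key, value| puts "#{source} / #{key} = #{value}" }
end
```

Each printed line corresponds to one (source, key, value) triple, which is the shape a data engine hands to Plasma.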
The code to issue an HTTP request via KIO::get() is really short and simple. I wrote about using ActiveRDF to query in Get Semantic with DBPedia and ActiveRDF, and it was an interesting idea but didn't work very well. The open-uri get() call that the ActiveRDF SPARQL adapter uses would keep timing out even if you simplified the queries, and it was synchronous, which meant that a GUI app would just freeze while the query was being executed. KIO just chugs away in the background, calling the queryData() slot whenever some data arrives, until it calls queryCompleted() and the data is ready to parse.
# Standard library pieces used below; the KDE/Plasma Ruby bindings are
# assumed to be loaded already.
require 'cgi'
require 'rexml/document'

class SparqlDataEngine < Plasma::DataEngine
  slots 'queryData(KIO::Job*, QByteArray)',
        'queryCompleted(KJob*)'

  def initialize(parent, args, endpoint, query, primary_value)
    super(parent)
    setMinimumPollingInterval(120 * 1000)
    @endpoint = endpoint
    @query = query
    @primary_value = primary_value
  end

  def sourceRequestEvent(source_name)
    # Refuse a new request while a query is still running
    if @job
      return false
    end

    @source_name = source_name
    @sparql_results_xml = ""
    query_url = KDE::Url.new("#{@endpoint}?query=#{CGI.escape(@query % @source_name.gsub(' ', '_'))}")
    @job = KIO::get(query_url, KIO::Reload, KIO::HideProgressInfo)
    @job.addMetaData("accept", "application/sparql-results+xml" )
    connect(@job, SIGNAL('data(KIO::Job*, QByteArray)'), self,
            SLOT('queryData(KIO::Job*, QByteArray)'))
    connect(@job, SIGNAL('result(KJob*)'), self, SLOT('queryCompleted(KJob*)'))
    setData(@source_name, {})
    return true
  end

  # Called each time a chunk of the response arrives
  def queryData(job, data)
    @sparql_results_xml += data.to_s
  end

  # Called once the whole response has arrived
  def queryCompleted(job)
    @job.doKill
    @job = nil
    parser = SparqlResultParser.new
    REXML::Document.parse_stream(@sparql_results_xml, parser)
    parser.result.each do |binding|
      binding.each_pair do |key, value|
        # puts "#{key} --> #{value.inspect}"
        setData(binding[@primary_value].literal.variant.toString, key, Qt::Variant.fromValue(value))
      end
    end
  end

  def updateSourceEvent(source_name)
    sourceRequestEvent(source_name)
    return true
  end
end
I tweaked the XML parsing code in the ActiveRDF adapter to create Nepomuk Soprano nodes, and return a Ruby Array of Hashes, each hash having keys for the SPARQL query variables and Soprano::Nodes for the values. The code in the 'queryCompleted()' method above then walks through the results making Plasma setData() calls, which is how an engine submits its data. The first string of the setData() call is the album name, e.g. 'White Light/White Heat' for the Velvets, the second string is the particular attribute, such as date of release, and the third argument is the Soprano::Node with the value wrapped up in a Qt::Variant.
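The URL that sourceRequestEvent() builds can be tried out on its own, without KIO. The following is only a sketch using Ruby's standard cgi library: the helper name sparql_query_url and the shortened TEMPLATE query are made up for illustration, and it assumes the query template contains a single '%s' and no other '%' characters.

```ruby
require 'cgi'

# Fill the '%s' placeholder in the query template with the source name
# (spaces replaced by underscores), then URL-encode the whole query and
# append it to the endpoint as the 'query' parameter.
def sparql_query_url(endpoint, query_template, source_name)
  resource = source_name.gsub(' ', '_')
  "#{endpoint}?query=#{CGI.escape(query_template % resource)}"
end

# Shortened, illustrative query template (not the full album query).
TEMPLATE = 'SELECT * WHERE { ?album <http://dbpedia.org/property/artist> ' \
           '<http://dbpedia.org/resource/%s> }'

url = sparql_query_url('http://dbpedia.org/sparql', TEMPLATE, 'The Velvet Underground')
puts url
```

Decoding the query parameter with CGI.unescape gives back the filled-in query text, which is a quick way to check what will actually be sent to the endpoint.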
This is the code that parses the XML using the Ruby REXML library:
# Parser for SPARQL XML result set. Derived from the parser in the
# ActiveRDF SPARQL adapter code. Produces an Array of Hashes, each
# hash contains keys for each of the variables in the query, and
# values which are Soprano nodes.
#
class SparqlResultParser
  attr_reader :result

  def initialize
    @result = []
    @vars = []
    @current_type = nil
  end

  def tag_start(name, attrs)
    case name
    when 'variable'
      @vars << attrs['name']
    when 'result'
      @current_result = {}
    when 'binding'
      @current_binding = attrs['name']
    when 'bnode', 'uri'
      @current_type = name
    when 'literal', 'typed-literal'
      @current_type = name
      @datatype = attrs['datatype']
      @xmllang = attrs['xml:lang']
    end
  end

  def tag_end(name)
    if name == "result"
      @result << @current_result
    elsif name == 'bnode' || name == 'literal' || name == 'typed-literal' || name == 'uri'
      @current_type = nil
    end
  end

  def text(text)
    if !@current_type.nil?
      @current_result[@current_binding] = create_node(@current_type, @datatype, @xmllang, text)
    end
  end

  # create ruby objects for each RDF node
  def create_node(type, datatype, xmllang, value)
    case type
    when 'uri'
      Soprano::Node.new(Qt::Url.new(value))
    when 'bnode'
      Soprano::Node.new(value)
    when 'literal', 'typed-literal'
      if xmllang
        Soprano::Node.new(Soprano::LiteralValue.new(value), xmllang)
      elsif datatype
        Soprano::Node.new(Soprano::LiteralValue.fromString(value, Qt::Url.new(datatype)))
      else
        Soprano::Node.new(Soprano::LiteralValue.new(value))
      end
    end
  end

  # Silently ignore any stream parser callbacks not handled above
  def method_missing(*args)
  end
end
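To experiment with the stream parsing without the KDE bindings installed, a cut-down variant can store plain Ruby Hashes instead of Soprano::Nodes. This is only a sketch: the class name PlainSparqlResultParser, the shape of the value Hashes and the shortened SAMPLE_XML are assumptions made for illustration. It also includes REXML::StreamListener to get no-op defaults for unused callbacks, rather than relying on method_missing.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Stand-alone variant of the result parser: same callbacks, but each
# value is a plain Hash rather than a Soprano::Node.
class PlainSparqlResultParser
  include REXML::StreamListener # no-op defaults for unused callbacks

  attr_reader :result

  def initialize
    @result = []
    @current_type = nil
  end

  def tag_start(name, attrs)
    case name
    when 'result'
      @current_result = {}
    when 'binding'
      @current_binding = attrs['name']
    when 'uri', 'bnode', 'literal'
      @current_type = name
      @datatype = attrs['datatype']
      @xmllang = attrs['xml:lang']
    end
  end

  def tag_end(name)
    case name
    when 'result'
      @result << @current_result
    when 'uri', 'bnode', 'literal'
      @current_type = nil
    end
  end

  # Only text inside a uri, bnode or literal element is a value;
  # whitespace between elements arrives here too and is ignored.
  def text(text)
    return if @current_type.nil?
    @current_result[@current_binding] =
      { type: @current_type, value: text, datatype: @datatype, lang: @xmllang }
  end
end

# Shortened sample in the SPARQL XML results format.
SAMPLE_XML = <<~XML
  <?xml version="1.0" ?>
  <sparql xmlns="http://www.w3.org/2005/sparql-results#">
    <head><variable name="album"/><variable name="name"/></head>
    <results>
      <result>
        <binding name="album"><uri>http://dbpedia.org/resource/1969:_The_Velvet_Underground_Live</uri></binding>
        <binding name="name"><literal xml:lang="en">1969: The Velvet Underground Live</literal></binding>
      </result>
    </results>
  </sparql>
XML

parser = PlainSparqlResultParser.new
REXML::Document.parse_stream(SAMPLE_XML, parser)
parser.result.each { |row| puts row.inspect }
```

Swapping the Hash built in text() for a call like create_node() is all that separates this from the Soprano version above.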
The SparqlDataEngine and SparqlResultParser classes are pretty generic and could be used for any similar SPARQL query. You just need to subclass SparqlDataEngine to give it a specific query string and endpoint like this:
SPARQL_QUERY = <<-EOS
PREFIX p: <http://dbpedia.org/property/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT * WHERE {
  ?album p:artist <http://dbpedia.org/resource/%s>.
  ?album rdf:type <http://dbpedia.org/class/yago/Album106591815>.
  OPTIONAL {?album p:cover ?cover}.
  OPTIONAL {?album p:name ?name}.
  OPTIONAL {?album p:released ?dateofrelease}.
}
EOS

#
# Customize the use of the SparqlDataEngine by giving it the url of an endpoint,
# a query to execute, and the name of the most important (or primary) value.
# The '%s' in the query text above is replaced with the source name, with any
# spaces replaced by underscores.
#
class DbpediaAlbumsEngine < SparqlDataEngine
  def initialize(parent, args)
    super(parent, args, 'http://dbpedia.org/sparql', SPARQL_QUERY, 'name')
  end
end
It's very little work indeed compared with the way you normally have to issue standard HTTP requests and then parse the totally non-standard results. I had a look at some of the Weather applet's Ion code to get BBC forecasts and it was really very complicated; it would be vastly simpler if you could get weather data via SPARQL instead. The last step is to make a .desktop file for your new engine:
[Desktop Entry]
Name=DBPedia Albums Data Engine
Comment=DBPedia album data for Plasmoids
X-KDE-ServiceTypes=Plasma/DataEngine
Type=Service
Icon=
X-KDE-Library=krubypluginfactory
X-KDE-PluginKeyword=plasma-engine-dbpedia-albums/dbpedia_albums_engine.rb
X-Plasma-EngineName=dbpedia-albums
And a simple CMakeLists.txt file to install it:
install(FILES plasma-dataengine-dbpedia-albums.desktop DESTINATION ${SERVICES_INSTALL_DIR} )
install(FILES dbpedia_albums_engine.rb DESTINATION ${DATA_INSTALL_DIR}/plasma-engine-dbpedia-albums)
You can use the Plasma engine explorer to test engines, and I enhanced the Ruby version slightly so it can show the contents of Soprano::Nodes within Qt::Variants. Here is what the browser looks like testing a new engine:
[image:3401 size=preview]
I'll try and add some stuff to the TechBase wiki about writing Ruby Plasma data engines and applets once the API has settled down a bit again, but I hope I've explained enough to get people playing with SPARQL queries as I think there could be a lot of applications for the idea.

the only problem is...
the only problem is that these ruby plugins are not using the ScriptEngine mechanism. ScriptEngine is not just there to make it possible to add scripting for languages that need a direct shim, they also play a *very* important role in management of the plugins, API maintenance, etc.
while it's cool one can write plugins using ruby like this, i really hope this doesn't become the preferred mechanism for ruby addons to plasma since it is broken from the perspective of the libplasma api.
your screenshot reminds me of a small bug i need to fix in the engine explorer though =)
Re: the only problem is...
Well the Ruby Plasma bindings are generated directly from the Plasma headers, and so maintaining them with respect to API changes is pretty easy. The Ruby plugins use the same mechanism as C++ ones for packaging, and if there is a better way for scripting languages it sounds a good idea to change. I would prefer not to have to treat every KDE plugin api as a special case, to be maintained in its own (possibly idiosyncratic) way.
My understanding was that there were to be two sorts of apis for scripting languages in Plasma, a very simple sort aimed at non-professional 'consumer programmers', and another sort for people who want to do everything they can in C++ and more, but faster and easier.
That's why I am interested in asking questions about what sort of programmers and languages are we targeting, and what kind of development environments they might want. In my opinion, at present we don't have any central point where that kind of issue is addressed and discussed in the KDE project. It is spread across app specific lists like the Plasma one, sometimes on k-c-d, there was a large discussion recently on the release list, and then there is the kde-bindings list which is used by only some of the Qt and KDE bindings projects.