KDE Developer's Journals

oever's blog

oever's picture

Spring cleaning: Strigi becomes a meta-project

A couple of large commits changed the organization of the Strigi project. As you probably know, Strigi provides the code to extract data from files and also allows for fast searching for files. We have reorganized the project to be a meta project. It is now split into five projects that can be compiled independently: libstreams, libstreamanalyzer, strigidaemon, strigiclient and strigiutils. This move has been done to make it easier for other projects to use the library parts of Strigi. KDE, especially Nepomuk, depends on libstreamanalyzer, which in turn depends on libstreams.

This reorganization has brought along a big cleanup of the build files in the project. The resulting libraries and executables are essentially the same as they are in the last release: this reorganization just moves the files and changes the build system to make the libraries more prominent. The Tracker developers especially should benefit from this move: they have requested a way to use libstreams and libstreamanalyzer without needing the rest of Strigi.

The versioning and release schedule of the five Strigi components will stay the same. The next release will come as a big tarball and as five small tarballs. To get all five parts, run

  svn co svn://svn.kde.org/home/kde/trunk/kdesupport/strigi

To get just the libraries, run

  svn co svn://svn.kde.org/home/kde/trunk/kdesupport/strigi/libstreams
  svn co svn://svn.kde.org/home/kde/trunk/kdesupport/strigi/libstreamanalyzer

We are now considering how to best move the project to git.

I leave you with a picture of one of my chickens and a link to a nice Summer of Code idea.
Wyandotte chicken


SlideCompare: improving rendering of slides in KOffice

Rendering slides is a complicated business. Slides can contain tons of different features just like webpages can. People expect that presentations look the same in different programs. Perhaps not pixel-perfect but very similar nevertheless.

OpenOffice and KOffice (and the Maemo/Meego Office Viewer) both have ODF as their main file format. ODF is an open standard and this means exchanging data between these programs should be simple and lossless. To help the developers of these programs find differences in rendering of slides, I have written a program that loads a presentation and shows it as rendered by KOffice and OpenOffice.

As an added bonus, it also shows how these programs render PowerPoint files. PowerPoint files are converted to ODP first and then loaded into each of the two rendering engines. That gives four types of output:

  • Converted by OpenOffice to ODP and rendered by OpenOffice
  • Converted by KOffice to ODP and rendered by KOffice
  • Converted by KOffice to ODP and rendered by OpenOffice
  • Converted by OpenOffice to ODP and rendered by KOffice

You can see an example view in the screenshot and screencast below.

The code has been announced on the koffice mailing list.

Ogg Theora screencast of SlideCompare
Flash screencast of SlideCompare


Silent Metronome in QML

Tonight I could not attend band rehearsal so I used the time to play with the new QML language. There is a nice tutorial online and a good screencast.

QML allows one to write flashy applications with little code. My first QML program is a metronome. The N900 has a metronome program but it is rather boring. It does not look and feel like a real metronome. So I set out to write one in QML and managed to do so in 56 lines of QML. The interaction is simple: tap it to toggle between on and off and slide up and down to move the cross-bar on the metronome which will adjust the tempo in the range 40 to 208 beats per minute.

Without further ado here is the code. You can run it in qmlviewer. Two things are lacking at the moment: a nice SVG image of a metronome and of course the ticking sound. I am keen to find out how to make the metronome produce sound to make it useful.

import Qt 4.6

Rectangle {
    width: 640
    height: 480

    Rectangle { // metronome bar
        id: bar
        x: 320; y: 100; width: 30; height: 300
        color: "#aaaaaa"
        property double tempo: 120

        Rectangle { // weight on metronome bar that determines the tempo
            x: -15; y: parent.tempo; width: 60; height: 30
            color: "#aaaaaa"
        }

        transformOrigin: Item.Bottom
        rotation: 0
        rotation: SequentialAnimation {
            id: anim
            repeat: true
            NumberAnimation { to: 35; easing: "easeInOutQuad"; duration: 60000/bar.tempo }
            NumberAnimation { to: -35; easing: "easeInOutQuad"; duration: 60000/bar.tempo }
        }
    }

    Text { // tempo indicator
        x: 0; y: 0;
        font.pointSize: 24; font.bold: true
        text: bar.tempo
    }

    MouseRegion { // logic for tempo tuning and turning metronome on and off
        anchors.fill: parent
        property int start: -1
        property bool moved: false
        property bool wasrunning: true

        onReleased: { // start or stop the metronome
            anim.running = (moved) ?wasrunning :!wasrunning
            bar.rotation = 0
            start = -1
        }
        onPositionChanged: { // adjust the tempo
            moved = start != -1
            wasrunning = (moved) ?wasrunning :anim.running
            bar.tempo += (moved) ?(mouse.y - start) :0
            bar.tempo = (bar.tempo > 208) ?208 :bar.tempo
            bar.tempo = (bar.tempo < 40) ?40 :bar.tempo
            anim.running = false;
            bar.rotation = 0
            start = mouse.y
        }
    }
}
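The tempo arithmetic in the listing can be checked outside QML. Here is a minimal Python sketch, assuming the same range of 40 to 208 beats per minute and the same 60000/tempo duration per swing. The function names are made up for this illustration and are not part of the QML program.

```python
MIN_TEMPO = 40   # lowest tempo allowed by the QML listing (beats per minute)
MAX_TEMPO = 208  # highest tempo allowed by the QML listing


def clamp_tempo(tempo):
    """Keep the tempo in range, like the two ternaries in onPositionChanged."""
    return max(MIN_TEMPO, min(MAX_TEMPO, tempo))


def swing_duration_ms(tempo):
    """Milliseconds for one swing of the bar, as in 'duration: 60000/bar.tempo'."""
    return 60000 / clamp_tempo(tempo)


if __name__ == "__main__":
    for tempo in (30, 60, 120, 250):
        print(tempo, clamp_tempo(tempo), round(swing_duration_ms(tempo), 1))
```

At 120 beats per minute each swing takes half a second, so a full left-right cycle covers two beats.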



Alpha version of Office Viewer for Nokia N900 available

Today, Nokia released the first public version of the office document viewer for the Nokia N900 phone. It was uploaded to the Maemo repositories. This version supports text files, spreadsheets and presentations in OpenDocument format (ODF) and Microsoft Office formats. The viewer requires the latest update (PR1.1) to the N900 software. You can install 'Office Viewer' by adding the Maemo Extras-devel repository to your N900 catalogues:

Catalog name: Maemo Extras-devel
Web address: http://repository.maemo.org/extras-devel
Distribution: fremantle
Components: free

Then the application 'freoffice' will be available in the category 'Office'. The install is 9 megabytes.

With the viewer, you can open multiple files at once, open office documents from your e-mail, search in office files and copy and paste from your documents. A very nice feature is the ability to give presentations with the phone. Here are some screen shots of the viewer running on the N900.

Screenshots: Presentation, Spreadsheet, Text Document, Overview

The code for this viewer is available in the KOffice repository. New releases of the viewer will be uploaded to the repository as KOffice progresses towards version 2.2.

The viewer has a simple user interface and responds quickly to user input such as page changing and scrolling.


Strigi 0.7.1

This is just a quick note to tell the world about the newest Strigi release. It has version number 0.7.1 and is the recommended Strigi version for use with KDE 4.4 and Nepomuk.

Go get it.

0.7.1
- Support more fields from ODF documents
- Improved skipping behavior on streams for large files.
- Added album art support.
- Added support for ID3v1 tags.
- Added MP3 stream metadata extraction, UTF-16 support in tags.
- Extended the range of metadata extracted by ID3 analyzer.
- Added a FLAC audio file analyzer.
- Significantly unbreak the PDF analyzer.
- Fix scanning trees where permissions are insufficient to read some parts
- Check for multithreaded version of libxml2
- Require newer CLucene version (0.9.21)

Join us in #strigi for comments and questions.


Testing document conversion

Being able to properly read many different file formats is important for the success of KOffice. By 'read', I mean 'convert to ODF', because conversion and reading are strictly separated in KOffice. KWord will convert a .doc file to a .odt file before loading it into the internal rendering and editing structure. There is even a nice separate program called 'koconverter' that can convert files on the command line.

So far, there have been no decent tests to avoid regressions in our filters. I have written a small framework (well, a shell script, but framework sounds better) that makes it simple to write tests. There are a number of tests there now for converting ppt files, but it would be great to have them for other input formats too. And here is where I hope you will help. All you need is a small input file that highlights a feature or problem and a small XSL file. The XSL file contains the test.

Look at this small example. Suppose you have a file, it can be a .doc, .docx or another office format. The file contains only one image and you want to have an automated test to verify that the ODF that is created also has one image. The following XSL file tests this:

<?xml version="1.0" encoding="UTF-8"?>
<x:stylesheet
   xmlns:d="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
   xmlns:x="http://www.w3.org/1999/XSL/Transform" version="1.0"
>
<x:template match="/">
  <x:if test="count(//d:image) != 1">
   <x:message terminate="yes">
    Error: there should be exactly one image.
   </x:message>
  </x:if>
</x:template>
</x:stylesheet>

If the number of image elements is not exactly one, the XSL transformation will abort with an error message.

So you see that the framework is written in such a way that writing tests is easy and fast.
When reporting a bug in KOffice or koconverter you can help a lot by writing an XSL for our automated
tests. You will see that this will speed up fixing the bug and it will help
avoid regressions.

This way of testing is a bit unconventional: these are not unit tests but tests
of the whole conversion. Files are converted to ODF and the output file is
checked. It is not a small part that is tested but the complete conversion. A benefit is that the tests are independent of the programs doing the conversion: we just check the result. So the same method could be used on any program that writes ODF files.

Here is how our tests in KOffice work. First we convert the input file to ODF with
koconverter. An ODF is a zip file with many files and we usually want to check
the content of the XML files. So after conversion with koconverter, the ODF file
is uncompressed. Then an XSL transformation is run on the file content.xml.
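The same unzip-and-inspect idea can be sketched in Python for readers who want to poke at a converted file without XSL. This is only an illustration and not the KOffice test script: the function names and the tiny in-memory sample package are assumptions made for the example. Only the name content.xml and the drawing namespace come from the text.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Drawing namespace, the same one bound to the d: prefix in the XSL example.
DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"


def count_images(odf_bytes):
    """Count draw:image elements in content.xml of an ODF package."""
    with zipfile.ZipFile(io.BytesIO(odf_bytes)) as package:
        content = package.read("content.xml")
    root = ET.fromstring(content)
    return len(root.findall(".//{%s}image" % DRAW_NS))


def make_sample_package(image_count):
    """Build a tiny stand-in for a converted file (hypothetical test data)."""
    images = "<d:image/>" * image_count
    content = '<doc xmlns:d="%s">%s</doc>' % (DRAW_NS, images)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("content.xml", content)
    return buffer.getvalue()


if __name__ == "__main__":
    found = count_images(make_sample_package(1))
    if found != 1:
        raise SystemExit("Error: there should be exactly one image.")
    print("ok")
```

With a real file you would pass the bytes of the .odt or .odp written by koconverter instead of the sample package.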

In XSL one can report errors like this:

  <x:if test="string($style/s:graphic-properties/@d:fill-color) != '#bbe0e3'">
    <x:message terminate="yes">
      Error: draw:fill-color of the second frame should be '#bbe0e3'.
    </x:message>
  </x:if>

(You see that XML does not have to be too verbose.) The prefixes x:, s: and d: in
this snippet stand for http://www.w3.org/1999/XSL/Transform,
urn:oasis:names:tc:opendocument:xmlns:style:1.0 and
urn:oasis:names:tc:opendocument:xmlns:drawing:1.0 respectively.
The test checks whether the fill-color for a particular part of the output document
has the correct value. If not, an error message is printed and the
transformation is stopped.

You can replay this example by checking out the tests:

  svn checkout svn://anonsvn.kde.org/home/kde/trunk/tests/kofficetests/
  cd kofficetests/import/powerpoint
  make test

That was the overview of how the tests work. Now let us look into one more complicated test. It has two files: background.ppt and background.xsl. background.ppt is the input file and background.xsl
is the transformation that verifies the output of the conversion.

The file background.ppt has two frames, one of which must have a light blue
(#bbe0e3) background. At the moment the frame gets a background color, but it
is wrong. So when fixing this bug we first formulate what we want the result to be by
writing an XSL file.

One XSL file can contain multiple tests. This test is called
testSolidBackground:

  <x:template name="testSolidBackground">

We assign the second frame in content.xml to a variable:
  <x:variable name="frame"
    select="o:body/o:presentation/d:page/d:frame[position()=2]"/>

Now we find the name of the style for this frame:
  <x:variable name="stylename" select="$frame/@p:style-name"/>

And find the style with that name:
  <x:variable name="style"
    select="o:automatic-styles/s:style[@s:name=$stylename]"/>

Now we do a sanity check: do we even have a second frame?
  <x:if test="count($frame) != 1">
    <x:message terminate="yes">
      Error: there is no second frame on the first slide.
    </x:message>
  </x:if>

And do we even have a style?
  <x:if test="count($style) != 1">
    <x:message terminate="yes">
      Error: there is no style for the second frame.
    </x:message>
  </x:if>

Now we test if the background is 'solid':
  <x:if test="string($style/s:graphic-properties/@d:fill) != 'solid'">
    <x:message terminate="yes">
      Error: draw:fill of the second frame should be solid.
    </x:message>
  </x:if>

And we check the color:
  <x:if test="string($style/s:graphic-properties/@d:fill-color) != '#bbe0e3'">
    <x:message terminate="yes">
      Error: draw:fill-color of the second frame should be '#bbe0e3'.
    </x:message>
  </x:if>

That is all there is to it! Learning XSL, if you do not know it yet, takes some
effort, but it will pay off. Once you have the XSL you can run 'make test'
while fixing the bug. This calls the test for you, with the side effect
that the conversion is run and the ODF file unpacked.

I hope you will all start using this method for reporting and fixing filter bugs. I will close by starting you off with some links to XSL and XPath.


Getting an energy efficient small server

I'd like to run a small server at home for mirroring my backup drive, acting as a central data store for devices, playing music and serving as a webserver for experiments. I want this server to be energy efficient, easy to modify, robust, silent and able to run customizable free software. It should have at least 500 GB of storage, but 1 or 1.5 TB is better. You can buy very low-energy computers such as the Fit-PC 2 (6 watts) or the Linutop 2 (8 watts). Energy costs in euro per year for machines that run constantly can be roughly estimated by doubling the power draw in watts, so running a device that constantly uses 8 watts costs about 16 euro a year.
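The rule of thumb can be checked with a little arithmetic. The sketch below assumes an electricity price of 0.23 euro per kWh, which is my assumption for illustration and is not stated above; the function name is also made up. At that price one watt running all year costs about two euro.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years


def yearly_cost_euro(watts, euro_per_kwh):
    """Cost of running a device at a constant power draw for one year."""
    kwh_per_year = watts * HOURS_PER_YEAR / 1000
    return kwh_per_year * euro_per_kwh


if __name__ == "__main__":
    # 0.23 euro/kWh is an assumed price, chosen to match the doubling rule.
    for watts in (6, 8, 15):
        print(watts, "W:", round(yearly_cost_euro(watts, 0.23), 2), "euro/year")
```

With a different tariff the factor changes proportionally, so the doubling rule is only as good as the assumed price.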

Until recently the computer I used most was a Dell Latitude X1 laptop. That machine is now 4.5 years old. At the time, I chose it because it is a laptop with no fan and hence very silent. It is still better than any Atom-based netbook. So I would like to use this laptop as a server. UPS and screen are integrated, which is a nice plus. The machine has a 1.8" disk built in, and it is not possible to replace it with a disk of at least 500 GB. I wanted to know the energy cost of adding more storage to the X1, so I did some power measurements with a 2.5" external disk (Toshiba, 160 GB) and a 3.5" external disk (TrekStor, 500 GB). I also measured on my current main laptop, a Lenovo X200s.

Lenovo X200s (console, idle, low brightness unless otherwise specified)
Adapter only: 6 W
Console, low brightness: 19 W
Console, high brightness: 21 W
100% cpu and high brightness: 40 W
mounted 2.5" disk: 24 W
active (dd) 2.5" disk: 28 W
mounted 3.5" disk: 37 W
active (dd) 3.5" disk: 41 W

Dell Latitude X1 (console, idle, low brightness unless otherwise specified)
Adapter only: 0 W
Console, low brightness: 15 W
Console, high brightness: 19 W
100% cpu and high brightness: 23 W
mounted 2.5" disk: 17 W
active (dd) 2.5" disk: 21 W
mounted 3.5" disk: 32 W
active (dd) 3.5" disk: 37 W

The 2.5" disk uses USB for power. The 3.5" disk has a separate adapter which is included in the power measurements. The device used for the power measurements is a DEM1379.
Compared with the 2.5" drive, the idle 3.5" drive uses 13 to 15 watts more and the active drive uses 13 to 16 watts more. The difference is as large as the power usage of the entire server. So I am now wondering if there are more energy-efficient external 3.5" drives.
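The 13 to 16 watt figures are the differences between the 3.5" and 2.5" rows for the same machine and the same state. A small Python sketch recomputes them from the readings listed above; the dictionary layout and function name are just for illustration.

```python
# Power readings in watts, copied from the two measurement lists.
READINGS = {
    "Lenovo X200s": {
        "mounted 2.5": 24, "active 2.5": 28,
        "mounted 3.5": 37, "active 3.5": 41,
    },
    "Dell Latitude X1": {
        "mounted 2.5": 17, "active 2.5": 21,
        "mounted 3.5": 32, "active 3.5": 37,
    },
}


def extra_watts(machine, state):
    """Extra power drawn with the 3.5" disk compared with the 2.5" disk."""
    row = READINGS[machine]
    return row[state + " 3.5"] - row[state + " 2.5"]


if __name__ == "__main__":
    for machine in READINGS:
        for state in ("mounted", "active"):
            print(machine, state, extra_watts(machine, state), "W")
```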


Good karma

This weekend I visited my parents-in-law, because my wife's paternal grandmother celebrated her 90th birthday. I noticed that the laptop they use was still running Kubuntu Feisty with OpenOffice 2.2. On this machine, reading emails, managing photos, surfing the internet and working on office documents are most important. Digikam is used for photos. KMail and Konqueror from KDE 3.5 are installed, and a mix of OpenOffice and Microsoft Office 97 on Wine is in use for editing office documents. In short, a horribly outdated setup, more than two years old. IT is still moving fast. Feisty was not a long-term release and no longer receives updates.

So in a slightly reckless move I decided to update the machine to the latest Kubuntu: Karmic Koala. This meant going to KDE 4.3. To my relief the install went very well. All important settings for Digikam and KMail were migrated automatically. Dolphin is really nice and more intuitive for non-professional users. The KWin effects add a nice touch of class (translucent wobbly windows). Plasmoids on the desktop (photo frames and weather forecast) were very well received.

In short: good karma! Thank you very much, Kubuntu team.

(my last two blogs were written on the Nokia N900 which has a good keyboard)


Printing photo albums

One important feature for photo management is missing in the FOSS world: an application for creating photo albums that can be sent away for printing at a printing service. There is, however, a pretty slick closed source application that works on Linux. It can be found at, for example, Pixum (also in .nl and .de). It is based on Qt 4.4 and installs using a Perl script which downloads the artwork and the required libraries. The application is customized for different printing companies, which make these customized downloads available from their websites. Not all of them offer the Linux or even the Macintosh version. This is a shame and is probably done to limit the number of different questions users might have. A standard for these photo album ordering services would be great, but I'm not holding my breath and will recommend Pixum for now.


Strigi partial port to javascript

You may remember two of my recent blogs. One was about a project to parse powerpoint files and another one was about porting hexdump to the browser.

So how about a combination of those two topics: parsing powerpoint files in the browser. It is quite a feasible task. The powerpoint file format is largely described in an xml schema now. From this schema one would need to generate a parser like the ones that already exist for c++ and java. The parsers for java and c++ are both fewer than 700 lines of code.

We have not reached that stage yet and I do not have time to implement a powerpoint parser in javascript soon. I have written some requirements for it. To parse the individual data streams in a ppt file, one must parse the OLE2 file format. Currently we use pole for this in c++ and poifs in java. Now I could port either of these libraries to javascript, but there is another nice OLE parser: strigi.

In Strigi, the OLE file format is treated like other container formats such as zip, tar and mime. Porting parts of Strigi to javascript seemed like an interesting challenge. In Strigi, we use low level c++ to ensure speed. Most of the techniques used in the c++ are not available in javascript. So the javascript version is bound to be much slower. Still, I was curious what Strigi would look like in javascript.

And now it is ready. The parts required for reading OLE files have been ported. The result is one html page of 600 lines. It can read ppt files and list the streams in there. When clicking the streams, you see the stream in 'hexdump' style display. The speed is not even that bad. It takes about a second to parse a megabyte of file.
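For readers who have not seen a 'hexdump' style display, here is a minimal Python sketch of the idea. It is not the javascript code from the demo. The eight-byte signature is the standard one at the start of every OLE2 compound file; the function names and the 16-byte row width are choices made for this example.

```python
# Signature found in the first eight bytes of an OLE2 compound file.
OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")


def looks_like_ole2(data):
    """Return True if the data starts with the OLE2 signature."""
    return data[:8] == OLE2_MAGIC


def hexdump(data, width=16):
    """Format bytes as lines of: offset, hex values, printable characters."""
    pad = width * 3 - 1  # room for 'width' two-digit values plus separators
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join("%02x" % byte for byte in chunk)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append("%08x  %s  %s" % (offset, hex_part.ljust(pad), text_part))
    return lines


if __name__ == "__main__":
    sample = OLE2_MAGIC + b"Hello, stream!"
    print(looks_like_ole2(sample))
    print("\n".join(hexdump(sample)))
```

A real OLE2 reader goes on to parse the header and allocation tables to find the individual streams; this sketch only covers the signature check and the display.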

Enjoy the demo!
(firefox 3.5 or recent webkit browser required)
