![]()
As you might know, Okular supports a number of file formats. One of them is [w:DjVu|DjVu], as you can see in the screenshot.
The implementation works quite nicely, although page pixmap generation is still synchronous and we cannot extract text from DjVu documents yet. We are working on both problems and hope to fix them soon.
In the implementation I wrote, I was able to extract almost every kind of metadata in a document: for example the table of contents, hyperlinks, and text or line annotations (you did not know a DjVu document could have annotations, did you?).
What I still have to implement is extracting the author, year, title, etc. from the metadata. It's not particularly difficult; I simply lack a simple test document that carries this kind of information.
So, basically, I'm asking whether any of you has documents with this information.
Checking whether a DjVu document has this information is really simple: use a small DjVuLibre utility called djvused (usually packaged with DjVuLibre itself, or in a separate djvulibre-bin package, as on Debian/Ubuntu) this way:
djvused -e 'output-all' mydocument.djvu | grep '(metadata'
If you get any output, then that document might be a nice candidate! If the document is not private, you could send it to me. There's no real prize, just a big "Thanks!" and your name in the commit log of the feature.
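If you have a whole folder of DjVu files, the same check can be scripted. What follows is only a rough sketch in Python, assuming djvused is installed and on your PATH; the function names (has_metadata, dump_document, find_candidates) are made up for this example and are not part of DjVuLibre or Okular.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def has_metadata(dump: str) -> bool:
    """Mirror the grep above: True if the dump contains '(metadata'."""
    return "(metadata" in dump


def dump_document(path: Path) -> str:
    """Run djvused -e 'output-all' on one file and return its output as text."""
    result = subprocess.run(
        ["djvused", "-e", "output-all", str(path)],
        capture_output=True,
        text=True,
        errors="replace",  # do not fail on bytes that are not valid text
        check=True,        # raise if djvused exits with an error
    )
    return result.stdout


def find_candidates(folder: Path) -> list[Path]:
    """List the .djvu files in `folder` whose dump contains a metadata block."""
    return [p for p in sorted(folder.glob("*.djvu")) if has_metadata(dump_document(p))]


if __name__ == "__main__":
    # Print every candidate found in the current directory.
    for candidate in find_candidates(Path(".")):
        print(candidate)
```

Run from the directory that holds your documents, it prints the name of each file that looks like a candidate.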

NIPS has such documents
NIPS has an interface to search through their volumes by author/title. When I take a random document from there and check it with the command you gave above, I at least get the title.
If I do not grep but pipe the content produced by djvused through "| less" instead, I see for example:
(metadata
(title "Programmable Reinforcement Learning Agents")
(author "David Andre, Stuart J. Russell")
(booktitle "Advances in Neural Information Processing Systems, 2000 (NIPS'2000)")
(editor "Todd Leen, Tom Dietterich, Volker Tresp")
(publisher "MIT Press")
(year "2001")
(volume "13"))
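For anyone who wants to experiment with such a dump, here is a rough Python sketch that turns a block like the one above into key/value pairs. It is not how Okular does it; it only illustrates the shape of the data. The names metadata_block and parse_metadata are made up for the example, and it assumes every entry is a single (key "value") pair, as in the output above. Values are kept exactly as printed, so any backslash escapes are left undecoded.

```python
from __future__ import annotations

import re

# One (key "value") pair; the quoted value may contain backslash escapes.
PAIR = re.compile(r'\(\s*([^\s()"]+)\s+"((?:[^"\\]|\\.)*)"\s*\)')


def metadata_block(dump: str) -> str:
    """Return the balanced (metadata ...) expression from a dump, or ''."""
    start = dump.find("(metadata")
    if start == -1:
        return ""
    depth = 0
    in_string = False
    i = start
    while i < len(dump):
        ch = dump[i]
        if in_string:
            if ch == "\\":
                i += 1  # skip the escaped character
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return dump[start:i + 1]
        i += 1
    return dump[start:]  # unbalanced input: return what is there


def parse_metadata(dump: str) -> dict[str, str]:
    """Collect the (key "value") pairs found inside the metadata block."""
    return dict(PAIR.findall(metadata_block(dump)))
```

Parentheses inside quoted values, such as the "(NIPS'2000)" in the booktitle, are not counted as nesting because the scan tracks whether it is inside a string.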
That should help.
liquidat
Thanks!
It certainly helped: I was able to read that information successfully.
Great!
Thanks Pino, DjVu is very important to me! I have tons of scanned books in DjVu format, and some scientific archives like www.numdam.org offer the DjVu file format.
I tried the command and, unfortunately, none of my DjVu files seem to have metadata.
-- Benoit (bjacob)