I love creating software; this is my dream. My goal is to create as much open source software as possible. Help me by suggesting one.

Wednesday, May 31, 2006

#8 Desktop Optimization

Summary

XML Optimization is a set of methods that reformat XML metadata for use as an XML stream. The same process is used on websites to minimize network bandwidth consumption, and it frees memory for applications that store the files locally. XML metadata is used by modern applications such as OpenOffice.org, GNOME, Evolution, Rhythmbox, and GDM, and in SVG graphics rendering. By optimizing the XML metadata that those applications use, the applications that parse it will need less memory and less parsing time, which improves speed and responsiveness.

Rationale

OpenOffice.org alone contains 847 XML metadata files, and a fresh Ubuntu install contains thousands of XML files. XML metadata plays a big role in today's applications, yet optimizing those XML files to improve the user experience is not yet being done.

Scope

With XML Optimization, the metadata is preformatted by removing the whitespace between the tags and compacting the whole XML content into a single line, without changing the data inside the tags. This makes the file much smaller, which frees memory and makes the file easier for the XML parser to read.

Design

The first phase is to remove the whitespace, tabs, and newlines between tags. The second phase is to compact the whole XML data into a single line. The third phase is to repeat the process on the optimized XML metadata.
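The phases above can be sketched in Python roughly as follows. This is a minimal illustration and not the code of the packaged scripts: the function names `strip_blank_text` and `compact_xml` are made up for this example, and it assumes that whitespace-only text between tags is never significant, which is not true for files that rely on `xml:space="preserve"` or on mixed content. Newlines that are part of the data inside a tag are left alone, so the output is a single line only when the data itself has no line breaks.

```python
import xml.dom.minidom as minidom


def strip_blank_text(node):
    """Recursively drop text nodes that contain only whitespace."""
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        elif child.nodeType == child.ELEMENT_NODE:
            strip_blank_text(child)


def compact_xml(text):
    """Return the document with inter-tag whitespace removed.

    Text that holds real data (anything besides whitespace) is kept
    exactly as it was parsed.
    """
    doc = minidom.parseString(text)
    strip_blank_text(doc)
    # toxml() adds no indentation or newlines of its own.
    return doc.toxml()


sample = "<a>\n  <b>x y</b>\n\t<c/>\n</a>"
once = compact_xml(sample)
twice = compact_xml(once)  # repeating the process on optimized input
print(once)
print(once == twice)
```

In this sketch a second pass over already compacted input changes nothing, so repeating the process is safe but does not shrink the file further.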

Implementation

The optimized XML metadata should be preshipped with the application that uses it.

Benchmarks

No benchmark data has been produced yet. I would like to ask community members for help.
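For anyone who wants to help, a benchmark could start from something like the sketch below. It is only a starting point under stated assumptions: the document is synthetic (the `library`, `entry` and `title` element names are invented for the test), the parser is Python's `xml.dom.minidom` rather than the parsers the real applications use, and the helper names `build_sample` and `measure` are hypothetical. It prints the size and the best parse time for an indented and a compact serialization of the same data. No numbers are claimed here; they will vary by machine.

```python
import timeit
import xml.dom.minidom as minidom


def build_sample(entries=200):
    """Build a synthetic document; the element names are invented."""
    doc = minidom.Document()
    root = doc.createElement("library")
    doc.appendChild(root)
    for i in range(entries):
        entry = doc.createElement("entry")
        title = doc.createElement("title")
        title.appendChild(doc.createTextNode("Track %d" % i))
        entry.appendChild(title)
        root.appendChild(entry)
    return doc


def measure(xml_text, repeats=20):
    """Return (size in bytes, best parse time in seconds)."""
    data = xml_text.encode("utf-8")
    times = timeit.repeat(lambda: minidom.parseString(data),
                          number=1, repeat=repeats)
    return len(data), min(times)


doc = build_sample()
pretty_size, pretty_time = measure(doc.toprettyxml(indent="  "))
compact_size, compact_time = measure(doc.toxml())
print("indented:", pretty_size, "bytes,", pretty_time, "s")
print("compact :", compact_size, "bytes,", compact_time, "s")
```

A real benchmark should use the actual files listed in the Code section and the startup time of the applications themselves.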

Code

Code: rhythmbox-quickstart
Application: Rhythmbox
Affected: XML files in rhythmbox_database_dir="$HOME/.gnome2/rhythmbox"

Code: evolution-optimize
Application: Evolution Groupware
Affected: XML files in evolution_libdir="/usr/lib/evolution" evolution_sharedir="/usr/share/evolution"

Code: openoffice-optimizer
Application: OpenOffice.org 2
Affected: XML files in openoffice_dir="/usr/lib/openoffice"

Code: gnome-optimize
Application: GDM, SVG themes, Mime
Affected: XML files in themes_dir="/usr/share/themes" icons_dir="/usr/share/icons" gdm_dir="/usr/share/gdm" pixmaps_dir="/usr/share/pixmaps" mime_dir="/usr/share/mime"

Code: gconf-optimize
Application: Gconf
Affected: XML files in gconf_share_dir="/usr/share/gconf" gconf_etc_dir="/etc/gconf" gconf_home_gnome_dir="$HOME/.gnome2" gconf_home_gconf_dir="$HOME/.gconf"

Code: doc-optimize
Application: Shared documents, Yelp
Affected: XML files in doc_dir="/usr/share/doc" yelp_dir="/usr/share/yelp"
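Finding the XML files under directories like these could look roughly like the Python sketch below. It is an assumption about one possible approach, not how the packaged scripts do it: the helper name `find_xml_files` is hypothetical, and a file counts as XML only if it starts with an `<?xml` declaration, so XML files without a declaration are missed.

```python
import os


def find_xml_files(directories):
    """Yield paths of regular files that begin with an XML declaration."""
    for directory in directories:
        # expandvars lets entries such as "$HOME/.gnome2/rhythmbox" work.
        top = os.path.expandvars(directory)
        for root, _dirs, files in os.walk(top):
            for name in files:
                path = os.path.join(root, name)
                if not os.path.isfile(path):
                    continue  # skip broken links, sockets, FIFOs
                try:
                    with open(path, "rb") as handle:
                        head = handle.read(5)
                except OSError:
                    continue  # unreadable file, skip it
                if head == b"<?xml":
                    yield path


# Directories that do not exist are silently skipped by os.walk.
for path in find_xml_files(["$HOME/.gnome2/rhythmbox"]):
    print(path)
```

Listing the files first, without modifying anything, makes it easier to review what would be touched before any file is rewritten.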

Data preservation and migration

Outstanding issues

Since the XML files are compacted, the human readability of that metadata will be affected. We should check whether the gain in performance makes up for the loss in human readability.

BoF agenda and discussion

Benchmark the results.

  • One alternative to correct the drawback of human readability would be to create a Universal Metadata Reader Applet. It could be used on its own to read document metadata, and in the future it could be integrated into the desktop search bar to search document metadata through a call-out within the applet.

6 comments:

dravine said...

I don't see the need for that, when Conglomerate and MLView are both useful and practical for editing XML files.

The average user is not going to be concerned with the XML metadata. The developers who are concerned will most likely know how to edit XML, or how to reformat the files to be easier to work with (tidy, for example).

joelbryan said...

You can still edit those optimized files with a text editor, but they'll look very garbled and user-unfriendly. I believe that's what HTML/XML/SVG editors have in common: they aim to be as productive as possible compared to a normal text editor.

Ralf said...

Again, is this going to be part of gutsy? Perhaps as a cron job?

joelbryan said...

@ralf

This only works on Ubuntu Dapper, but you can test it out on Feisty and Gutsy by changing the "XML Document Text" to "XML 1.0 Document Text" within the scripts.

There have been so many changes in Ubuntu since Dapper, and I haven't tried it on newer releases yet, because I'm not as confident as I am with Dapper that it won't break things. The best thing is to try it out and report any problems.

But if the Ubuntu guys would include it, that would be really great! I really do believe that this simple optimization rule from networks also applies to the bus networks inside computers. I would really like my packages to be included in Ubuntu. And yes, I need guidance.

nzroller said...

Rhythmbox adds whitespace/newlines when you open it. This sort of optimization probably needs to be talked about in the Rhythmbox project.

My library was 12 MB before and 9 MB after...
Though I used:
`xmllint --noblanks rhythmdb.xml -o rhythmdb.xml'

About Me

joelbryan
Paranaque, Metro Manila, Philippines
I like creating software that makes people's life easier.