Tuesday, December 20, 2005

Web Archiving

I can't believe I forgot to mention a significant development regarding our digital permanence initiative.

A few weeks ago, our dp team viewed a piece of software, developed by a Ph.D. student here at McGill, that captures dynamically generated webpages such as those on the McGill University website.

The McGill University website includes both static webpages (HTML webpages that have text and graphics) and dynamic webpages (HTML webpages whose contents reside in a database and are populated upon a page request).

It's easy enough to capture static webpages with freely available software (e.g. HTTrack); however, it's another story entirely to capture websites that are both static and dynamic.

The software can capture static webpages as XHTML, and it can capture database content and package it as XML. We hope to tie the XHTML and XML together to create a seamless web archive of the McGill website. That's the plan.
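To give a rough idea of what "packaging database content into XML" could look like, here is a minimal Python sketch. It is not the student's software, whose internals I haven't described here. The `pages` table, the `rows_to_xml` function, and the `table`/`record`/`field` element names are all made up for illustration, and an in-memory SQLite database stands in for a real site's content database.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(conn, table):
    """Package every row of `table` into an XML string (illustrative only)."""
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [description[0] for description in cursor.description]
    root = ET.Element("table", name=table)
    for row in cursor:
        record = ET.SubElement(root, "record")
        for column, value in zip(columns, row):
            field = ET.SubElement(record, "field", name=column)
            field.text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical content table standing in for a website's database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.execute("INSERT INTO pages (title, body) VALUES (?, ?)", ("Home", "Welcome"))

xml_snapshot = rows_to_xml(conn, "pages")
print(xml_snapshot)
```

Each database row becomes a `record` element with one `field` per column, so the content that a dynamic page would normally pull in at request time is preserved in a form that doesn't depend on the database staying online.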

I'll post more as further details and new developments emerge.


about the author

I am an information professional, researcher, and writer with over eight years of experience in the information services field, including experience in information and communication technology.

I have a B.A. in History and a Master's in Library and Information Studies, and I am working on a Web and Multimedia Design certificate.

I believe that empowering people with information can enrich lives and transform the world.