Memento Guide - Determining Resource Type

Last updated: January 19, 2015

Is this resource a Memento, a TimeGate, or an Original Resource?

The datetime negotiation component of the Memento framework introduces three resource types: Original Resource (URI-R), Memento (URI-M), and TimeGate (URI-G). However, when stumbling upon a resource (URI-Q) out there on the Web, how does one determine which type it is? Answering that question is important, for example, when developing Memento client applications such as browser plug-ins.

This document details how the resource type can be determined by means of the HTTP headers returned in response to an HTTP HEAD issued against URI-Q. It shows that determining the type is straightforward when resources comply with the Memento specification, but may involve some heuristics when they don't.

The following conventions are used in the remainder of this document:
  • A header is shown in bold when its presence is necessary to determine a resource type;
  • A header is shown in bold strike-through when its absence is necessary to determine a resource type;
  • A header is shown in regular font to convey additional context.

[1]

The resource URI-Q is a Memento for the Original Resource URI-R.

The resource (URI-Q) can be unambiguously characterized as a Memento because both a Memento-Datetime header and an HTTP Link header with a Relation Type of "original" are included in its HTTP response headers.

The value of the Memento-Datetime header indicates the archival datetime of the resource, i.e. the datetime since which the resource has been stable and will remain stable. The "original" Link points at the Original Resource (URI-R) for which the current resource (URI-Q) is a Memento.

Further HTTP Link relationships may be included in the response header from URI-Q, specifically "timegate", "timemap" and "memento" Links.

For example, the resource http://mementoarchive.lanl.gov/ta/20100320000003/http://lanlsource.lanl.gov/hello is a Memento for the Original Resource http://lanlsource.lanl.gov/hello. This is the HTTP response header that allows recognizing the resource as a Memento:
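The check described in this section can be sketched in a few lines of Python. This is a minimal sketch, not a reference implementation: the function names parse_links and is_memento are hypothetical, the Link header parsing is simplified, and the headers are assumed to have already been collected into a dict by whatever HTTP client issued the HEAD request.

```python
import re

# One link-value: <target> followed by its ;name=value parameters.
# Simplified parsing; quoted values may contain commas (e.g. datetimes).
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w*-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)')
PARAM_RE = re.compile(r';\s*([\w*-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def parse_links(link_header):
    """Return a list of (target, set_of_relation_types) for a Link header value."""
    links = []
    for target, params in LINK_RE.findall(link_header or ""):
        rels = set()
        for name, quoted, bare in PARAM_RE.findall(params):
            if name.lower() == "rel":
                # A rel value may list several space-separated relation types.
                rels.update((quoted or bare).lower().split())
        links.append((target, rels))
    return links


def is_memento(headers):
    """True when both Memento-Datetime and an "original" Link are present.

    headers: dict mapping response header names (any letter case) to values.
    """
    h = {k.lower(): v for k, v in headers.items()}
    has_original = any("original" in rels for _, rels in parse_links(h.get("link")))
    return "memento-datetime" in h and has_original
```

A possible use, with illustrative header values that are assumptions rather than the archive's recorded response:

```python
headers = {
    "Memento-Datetime": "Sat, 20 Mar 2010 00:00:03 GMT",  # assumed value
    "Link": '<http://lanlsource.lanl.gov/hello>; rel="original"',
}
print(is_memento(headers))
```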


[2]

The resource URI-Q is a TimeGate for Original Resource URI-R.

The resource (URI-Q) can be unambiguously characterized as a TimeGate because its response includes a Vary header whose value contains "accept-datetime", which indicates that the resource supports HTTP content negotiation in the datetime dimension.

The response header will typically also include HTTP Link headers, specifically "original", "timemap", and "memento" Links. Also, additional values may be provided in the Vary header, for example, "accept" to indicate that HTTP content negotiation is available in the media type dimension too.

For example, the resource http://mementoarchive.lanl.gov/twa/timegate/http://lanlsource.lanl.gov/hello is a TimeGate for the Original Resource http://lanlsource.lanl.gov/hello. Below is the HTTP response header that allows recognizing the resource as a TimeGate. Note that the Location header points at a Memento for the Original Resource, and that the Link header contains links to the Original Resource, a TimeMap, and to several Mementos.
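The TimeGate test only needs the Vary header. A minimal Python sketch follows; the function name is_timegate is hypothetical and the headers are assumed to be available as a dict. Because Vary may carry several comma-separated values, the value is split before looking for "accept-datetime".

```python
def is_timegate(headers):
    """True when the Vary header lists "accept-datetime".

    headers: dict mapping response header names (any letter case) to values.
    """
    h = {k.lower(): v for k, v in headers.items()}
    # Vary is a comma-separated list; compare its members case-insensitively.
    vary_values = {v.strip().lower() for v in h.get("vary", "").split(",")}
    return "accept-datetime" in vary_values
```

Splitting the value rather than doing a plain substring search avoids a false match on an unrelated header name that merely contains the text "accept-datetime".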


[3]

The resource URI-Q is an Original Resource or a Memento.

In a perfectly Memento-ized Web, the absence of both a Memento-Datetime header and an "original" HTTP Link header would be sufficient to decide that URI-Q is an Original Resource.

Unfortunately, until further notice, the Web is not perfect that way. Many Web Archives and Content Management Systems do not yet include these crucial headers for their archived/stable resources. As a result, trying to determine whether a resource, which at first glance looks like an Original Resource, might actually be a Memento necessarily entails imperfect heuristics.

One way to approach the problem is by introducing a reference list of the URI syntaxes used for Mementos by major Web Archives and Content Management Systems. This is rather imperfect and definitely not very scalable. But all of this sorrow goes away if Web Archives and Content Management Systems implement the Memento-Datetime and the "original" HTTP Link headers. What are you waiting for? Ping that archive or CMS near you and get them going!

For example, Web Archives that use the Wayback software use fine-grained datetime indicators in the URIs of their Mementos (e.g. /20010911203610/), and MediaWiki installations consistently use the same query parameters in URIs of old resource versions (e.g. oldid and title). Such information, in combination with the base URI of these systems, can be included in a reference list. Then, it becomes possible to test whether URI-Q matches one of the URI syntaxes in the reference list. If it does, URI-Q can be categorized as a Memento, and the value of its Last-Modified header (if available) can be interpreted as the Memento's datetime, which, in a perfect world, would be expressed in the Memento-Datetime header.
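A reference list of this kind might be sketched as below. The base URIs (archive.example.org, wiki.example.org) and all function names are hypothetical placeholders, and the two rules only encode the syntaxes mentioned above: a 14-digit datetime path segment for Wayback-style archives, and the oldid and title query parameters for MediaWiki.

```python
import re
from urllib.parse import urlsplit, parse_qs

# A 14-digit datetime (YYYYMMDDhhmmss) as a path segment, e.g. /20010911203610/
WAYBACK_TIMESTAMP = re.compile(r"/\d{14}/")


def wayback_rule(uri):
    """Wayback-style Memento URI: the path contains a 14-digit datetime segment."""
    return WAYBACK_TIMESTAMP.search(urlsplit(uri).path) is not None


def mediawiki_rule(uri):
    """MediaWiki old version: both 'oldid' and 'title' query parameters present."""
    query = parse_qs(urlsplit(uri).query)
    return "oldid" in query and "title" in query


# Hypothetical reference list: (base URI, rule for that system's Memento URIs).
REFERENCE_LIST = [
    ("http://archive.example.org/", wayback_rule),
    ("http://wiki.example.org/", mediawiki_rule),
]


def matches_reference_list(uri_q):
    """True when URI-Q starts with a listed base URI and fits that system's syntax."""
    return any(uri_q.startswith(base) and rule(uri_q) for base, rule in REFERENCE_LIST)
```

Tying each rule to a base URI keeps a datetime-like path segment on an unrelated site from being misread as a Memento, at the cost of having to maintain the list by hand.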


[4]

The resource URI-Q is both a Memento and an Original Resource.

As per [1] above, the resource (URI-Q) can unambiguously be characterized as a Memento because it has the necessary Memento-Datetime HTTP response header.

In this particular case, the "original" link points at URI-Q itself, indicating that URI-Q is both a Memento and an Original Resource. This happens in cases where a resource is archived/stabilized at its original Web location and may not necessarily also be available at another Web location for archival purposes.

The presence of the Memento-Datetime header entails a promise by the custodian of the resource that the resource will not undergo any changes beyond the datetime expressed as the header's value.

For example, the entry page for the 2008 JCDL conference at http://jcdl2008.org/ has not changed since June 24th 2008, and will not change after that date. Hence, the resource is both an Original Resource and a Memento, and could return the below HTTP response header. Note that the response indicates that this resource is also aware of another archived version of itself that exists at http://jcdl.org/archived-conf-sites/jcdl2008/, and provides a "memento" link to it.
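Detecting this case comes down to comparing the target of the "original" link with URI-Q itself. The Python sketch below assumes simplified Link header parsing and an exact string comparison of URIs (no normalization); the function names original_targets and is_memento_and_original are hypothetical.

```python
import re

# One link-value: <target> followed by its ;name=value parameters (simplified).
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w*-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)')
REL_RE = re.compile(r';\s*rel\s*=\s*(?:"([^"]*)"|([^;,\s]+))', re.I)


def original_targets(link_header):
    """Return the targets of all links whose rel includes "original"."""
    targets = []
    for target, params in LINK_RE.findall(link_header or ""):
        m = REL_RE.search(params)
        rels = (m.group(1) or m.group(2) or "").lower().split() if m else []
        if "original" in rels:
            targets.append(target)
    return targets


def is_memento_and_original(uri_q, headers):
    """True when Memento-Datetime is present and "original" points at URI-Q itself."""
    h = {k.lower(): v for k, v in headers.items()}
    # Exact string match; a real client might normalize URIs before comparing.
    return "memento-datetime" in h and uri_q in original_targets(h.get("link"))
```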


[5]

The resource URI-Q is an intermediate resource.

An intermediate resource issues a redirect to a TimeGate, to a Memento, or to another intermediate resource, and thus plays an active role in the Memento infrastructure. This happens, for example, in Wayback archives that have a set of URIs of Mementos (URI-Mi, i=1..n) for a given Original Resource, all of which have the same representation. In that case, accessing a URI-Mi (i > 1) from that set will result in a redirect to the URI-M1 of the temporally first Memento with this shared representation, as it is the only one the archive stores.

In cases like this, the redirecting response from URI-Mi (i > 1) to URI-M1 includes an HTTP Link header that contains a link with an "original" relation type that points at the Original Resource (URI-R). It contains neither a Memento-Datetime header (as no Memento is served yet) nor a Vary header (as URI-Mi is not a TimeGate).

In this scenario, a client can determine it is still on a successful path towards a Memento, and can follow the redirect to the URI that is provided as the value of the Location header.
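The conditions for an intermediate resource can be combined into a single test, sketched here in Python. The function name is_intermediate is hypothetical, the status code and headers are assumed to come from a HEAD request that did not follow redirects automatically, and the "original" link is detected with a simplified regular expression rather than a full Link header parser.

```python
import re

# rel parameter whose value contains the relation type "original" (simplified).
REL_ORIGINAL = re.compile(r';\s*rel\s*=\s*"?(?:[^";,]*\s)?original(?=[\s";,]|$)', re.I)


def is_intermediate(status, headers):
    """True for a redirect that carries an "original" link but is neither a
    Memento (no Memento-Datetime) nor a TimeGate (no Vary: accept-datetime)."""
    h = {k.lower(): v for k, v in headers.items()}
    redirecting = 300 <= status < 400 and "location" in h
    has_original = REL_ORIGINAL.search(h.get("link", "")) is not None
    vary_values = {v.strip().lower() for v in h.get("vary", "").split(",")}
    is_timegate = "accept-datetime" in vary_values
    return (redirecting and has_original
            and "memento-datetime" not in h and not is_timegate)
```

When the test succeeds, the client would issue its next request against the URI given in the Location header and classify that response in turn.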

[6]

The resource URI-Q is excluded from datetime negotiation.

When delivering a Memento to a user agent, a web archive commonly enhances that Memento's archived content, for example, by including a banner that provides branding and highlights the archival status of the Memento. The resources that are involved in providing such system-specific functionality, often JavaScript or images, must be used in their current state.

In cases like this, the response from URI-Q will contain an HTTP Link header that contains a link with "type" as its relation type and http://mementoweb.org/terms/donotnegotiate as its Target IRI.

In this scenario, a client should not engage in datetime negotiation with URI-Q.

For example, the resource http://web.archive.org/static/images/toolbar/icon_alert.png is used by the Wayback Machine to appropriately render a Memento page. It returns the below HTTP response header, which includes an HTTP Link header that indicates that Memento clients should not datetime negotiate with this resource.
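A client might check for this signal before attempting datetime negotiation, as in the Python sketch below. The function name excluded_from_negotiation is hypothetical, the Link header parsing is simplified, and the headers are assumed to be available as a dict.

```python
import re

DO_NOT_NEGOTIATE = "http://mementoweb.org/terms/donotnegotiate"

# One link-value: <target> followed by its ;name=value parameters (simplified).
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w*-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)')
# rel parameter whose value contains the relation type "type".
REL_TYPE = re.compile(r';\s*rel\s*=\s*"?(?:[^";,]*\s)?type(?=[\s";,]|$)', re.I)


def excluded_from_negotiation(headers):
    """True when a Link with rel="type" targets the donotnegotiate IRI."""
    h = {k.lower(): v for k, v in headers.items()}
    for target, params in LINK_RE.findall(h.get("link", "")):
        if target == DO_NOT_NEGOTIATE and REL_TYPE.search(params):
            return True
    return False
```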