A curator who anticipates the need to collect a significant number of web publications from a certain portal over time
navigates to a publication that is representative of a class of publications in that portal. For example, if there
is a need to archive SlideShare presentations, the curator might navigate to the landing page for the
"Creating Pockets of Persistence" presentation. On that page, the curator activates the Memento Tracer browser extension
and records a Trace by interacting with the page. The extension does not record the actual resources or URLs that the curator traverses.
Rather, the extension's browser event listener captures mouse actions and records them abstractly, in terms that
uniquely identify the page elements being interacted with, e.g. by means of their class ID or XPath. Since all pages of
the same class are based on the same template, the resulting Trace applies across all pages of the class rather than to this specific page only.
Currently, in addition to recording simple mouse clicks, the extension can record, with a single interaction by the curator,
repeated clicks (e.g., navigating through all slides of a presentation) and clicks on all links in a certain user interface component.
For example, below is a Trace that results from the curator indicating that the "next slide" button should be clicked repeatedly.
Note that the Trace also indicates the URL pattern to which the Trace applies,
and provenance information including the resource on which the Trace was created and the user agent used to create it.
When the layout or affordances of a particular class of web publications change, a new Trace has to be recorded to
ensure that captures remain of high quality.
{
  "portal_url_match": "(slideshare.net)\/([^\/]+)\/([^\/]+)",
  "actions": [
    {
      "action_order": "1",
      "value": "div.j-next-btn.arrow-right",
      "type": "CSSSelector",
      "action": "repeated_click",
      "repeat_until": {
        "condition": "changes",
        "type": "resource_url"
      }
    },
    {
      "action_order": "2",
      "value": "div.notranslate.transcript.add-padding-right.j-transcript a",
      "type": "CSSSelector",
      "action": "click"
    }
  ],
  "resource_url": "https://www.slideshare.net/hvdsomp/creating-pockets-of-persistence",
  "user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3417.0 Safari/537.36"
}
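
As a rough illustration of how a Trace like the one above might be consumed, the following Python sketch checks whether a Trace applies to a given URL and replays its actions in order. It is not Memento Tracer's implementation. The `FakePage` class, the `max_clicks` safety limit, and the reading of `repeat_until` as "keep clicking until the page's URL changes" are assumptions made for illustration only.

```python
import json
import re

# Abbreviated copy of the Trace above (provenance fields omitted).
TRACE_JSON = r"""
{
  "portal_url_match": "(slideshare.net)\/([^\/]+)\/([^\/]+)",
  "actions": [
    {"action_order": "1", "value": "div.j-next-btn.arrow-right",
     "type": "CSSSelector", "action": "repeated_click",
     "repeat_until": {"condition": "changes", "type": "resource_url"}},
    {"action_order": "2",
     "value": "div.notranslate.transcript.add-padding-right.j-transcript a",
     "type": "CSSSelector", "action": "click"}
  ]
}
"""


def trace_applies(trace, url):
    """Return True if the Trace's URL pattern matches the given URL."""
    return re.search(trace["portal_url_match"], url) is not None


def ordered_actions(trace):
    """Return the actions sorted by their (string-valued) action_order."""
    return sorted(trace["actions"], key=lambda a: int(a["action_order"]))


class FakePage:
    """Hypothetical stand-in for a browser page, used only for this sketch.

    Clicking the "next" selector advances one slide; clicking it on the
    last slide is assumed to navigate away, which changes resource_url.
    """

    def __init__(self, url, slides, next_selector):
        self.resource_url = url
        self.slides = slides
        self.next_selector = next_selector
        self.current = 1
        self.clicks = []  # every selector that was clicked, in order

    def click(self, selector):
        self.clicks.append(selector)
        if selector == self.next_selector:
            if self.current < self.slides:
                self.current += 1
            else:
                self.resource_url = "https://www.slideshare.net/other/related"


def replay(trace, page, max_clicks=1000):
    """Replay a Trace against a page; max_clicks guards against endless loops."""
    for action in ordered_actions(trace):
        selector = action["value"]
        if action["action"] == "repeated_click":
            start_url = page.resource_url
            for _ in range(max_clicks):
                page.click(selector)
                if page.resource_url != start_url:
                    break  # assumed meaning of "resource_url changes"
        else:
            page.click(selector)


if __name__ == "__main__":
    trace = json.loads(TRACE_JSON)
    url = "https://www.slideshare.net/hvdsomp/creating-pockets-of-persistence"
    if trace_applies(trace, url):
        page = FakePage(url, slides=3, next_selector="div.j-next-btn.arrow-right")
        replay(trace, page)
        print(page.clicks)
```

Because the Trace stores selectors rather than URLs, the same `replay` call could in principle be pointed at any page whose URL matches `portal_url_match`, which is what lets one recording serve a whole class of pages.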