Open Web Analytics

Basic information

Open Web Analytics (OWA) is an open source web analytics platform built using the PHP programming language and the MySQL database system. It is currently used on over 40,000 websites. There is no hosted option; practitioners must download, install, and configure the software on their own infrastructure.

Though far less commonly used in GLAMR institutions than Google Analytics, Open Web Analytics does have some support among library professionals (Azim & Hasan, 2018).

How to use this tool for use/reuse assessment

Consult the Web Analytics data collection method guide for more general information about each of the following strategies.

Referrals

Practitioners can use a list of referrers to help determine the context of use or reuse. Strategies range from using URL patterns to segment different kinds of incoming traffic to visiting the referring pages and analyzing the links in context.

In Open Web Analytics, referral data is reported under Traffic > Referring Websites. The hostname can be displayed via a secondary dimension, “Referral Web Site.” OWA also tracks “Referral Link Text”: if the full URL of the external page where the link resides is known, OWA will crawl the referring page and capture the text of the link. This could make it easier to determine the context of use. However, as browsers have evolved in recent years, the full referring URL is now rarely reported in the HTTP header. In those cases, only the referring hostname is listed in web analytics reports, and the referring page cannot be crawled because its URL is unknown.
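The URL-pattern strategy described above can be sketched in a few lines of Python. This is a minimal illustration, not part of OWA: the segment names, the hostname suffixes, and the function names are all assumptions chosen for the example, and the input is assumed to be a list of referrer values (full URLs or bare hostnames) copied from a referral report.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical segments and hostname suffixes; adjust these to local needs.
SEGMENTS = {
    "search": ("google.com", "bing.com", "duckduckgo.com"),
    "encyclopedia": ("wikipedia.org",),
    "education": ("edu",),
}


def hostname_of(referrer):
    """Return the lowercase hostname of a referrer (full URL or bare hostname)."""
    value = referrer.strip().lower()
    if "://" not in value:
        # Prefix "//" so urlsplit treats a bare hostname as the network location.
        value = "//" + value
    host = urlsplit(value).hostname or ""
    return host[4:] if host.startswith("www.") else host


def segment_of(referrer):
    """Assign a referrer to the first segment whose suffix matches its hostname."""
    host = hostname_of(referrer)
    for segment, suffixes in SEGMENTS.items():
        if any(host == s or host.endswith("." + s) for s in suffixes):
            return segment
    return "other"


def count_segments(referrers):
    """Count how many referrers fall into each segment."""
    return Counter(segment_of(r) for r in referrers)
```

Matching on whole hostname suffixes (rather than substrings) keeps a host such as notgoogle.com out of the "search" segment. Because it works on hostnames alone, the sketch still applies when only the referring hostname is reported.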

Social Media

Traffic to digital objects originating via social media might signal a distinct type of sharing that institutions would consider to be reuse. 

In OWA reports, there is no distinction made between social media traffic and other website traffic. One could potentially identify hostnames that match known social media sites, and use filters on the “Referring Websites” report to isolate those URLs.
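As a rough sketch of that approach, the Python below totals visits from referring hostnames that match a list of known social media sites. The hostname list is a small illustrative assumption, not a complete or authoritative registry, and the input is assumed to be (hostname, visits) pairs transcribed from the “Referring Websites” report; none of the names come from OWA itself.

```python
# Hypothetical, deliberately incomplete mapping of hostnames to networks.
SOCIAL_HOSTS = {
    "facebook.com": "Facebook",
    "instagram.com": "Instagram",
    "pinterest.com": "Pinterest",
    "reddit.com": "Reddit",
    "t.co": "Twitter/X",
    "twitter.com": "Twitter/X",
    "x.com": "Twitter/X",
}


def social_network(hostname):
    """Return the network name for a referring hostname, or None if not social."""
    host = hostname.strip().lower().rstrip(".")
    for known, network in SOCIAL_HOSTS.items():
        # Match the site itself or any of its subdomains (e.g. m.facebook.com).
        if host == known or host.endswith("." + known):
            return network
    return None


def social_visits(rows):
    """Sum visits per social network from (hostname, visits) pairs."""
    totals = {}
    for hostname, visits in rows:
        network = social_network(hostname)
        if network is not None:
            totals[network] = totals.get(network, 0) + visits
    return totals
```

For example, `social_visits([("t.co", 5), ("twitter.com", 2), ("example.org", 9)])` groups the first two rows under one network and ignores the third.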

Event Tracking

Web analytics packages support granular, targeted tracking of specific interactions within a site. Practitioners may identify elements of their website user interface that signal reuse when clicked by a user (e.g., share, download, or export buttons), and track that data for reporting purposes.

OWA’s mechanism for this is called “Action Tracking”; it is similar in syntax and purpose to the “Event Tracking” feature in Google Analytics (UA version) and Matomo. One can provide values for three text fields (Name, Group, Label) and one numeric field (Value) to be logged with each event.
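To make the four fields concrete, the following Python sketch models a logged action and tallies actions for reporting. It is only an illustration of the data shape described above: the `Action` class, the `summarize` function, and the sample values are hypothetical, and nothing here reflects OWA's actual API or database schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """One tracked interaction: three text fields and one numeric field."""
    name: str           # e.g. "download"
    group: str          # e.g. "reuse"
    label: str = ""     # e.g. an item identifier
    value: float = 0.0  # optional numeric value


def summarize(actions):
    """Return {(group, name): (count, total_value)} for a list of actions."""
    totals = defaultdict(lambda: [0, 0.0])
    for action in actions:
        entry = totals[(action.group, action.name)]
        entry[0] += 1
        entry[1] += action.value
    return {key: (count, total) for key, (count, total) in totals.items()}
```

Grouping by (group, name) and leaving the label as free text is one possible convention; it lets a single "reuse" group collect share, download, and export clicks while the label identifies the object involved.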

Embedding

Some digital asset management software supports an “embed code” feature to empower users to reuse digital objects by putting interactive versions of them in external sites (often in an <iframe>). The service providing the source of the <iframe> should have a separate web analytics property. External sites embedding the objects are logged as referrers within that property.

OWA can support an unlimited number of tracked site profiles.

Internet Service Provider and Geolocation

Some GLAMR institutions have used IP-derived service provider data to distinguish digital object use from within academic or government institutions from other contexts.

Open Web Analytics does enable tracking some user network data that is derived from IP addresses, including geolocation (country and city) and Internet Service Provider. This functionality requires some additional setup. One must first activate OWA’s MaxMind GeoIP module. Supporting geolocation lookups requires downloading MaxMind’s free GeoLite2 City Database file and adding it to the server running OWA. MaxMind also licenses a more accurate GeoIP2 City data web service for a fee.

There is no free option for obtaining Internet Service Provider (ISP) information. However, practitioners may purchase a monthly license for MaxMind’s GeoIP2 ISP database, and configure OWA with their license key to look up and track service providers.

Ethical guidelines

Practitioners should follow the practices laid out in the “Ethical considerations and guidelines for the assessment of use and reuse of digital content.” The Guidelines are meant both to inform practitioners in their decision-making, and to model for users what they can expect from those who steward digital collections.

Additional guidelines for responsible practice

Open Web Analytics collects and stores visitors’ full Internet Protocol (IP) addresses by default. Its web interface includes a configurable option to “Anonymize IP Addresses,” which, as in Google Analytics, masks the final octet of the address (e.g., 12.214.31.XXX).
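The effect of masking the final octet can be illustrated with Python's standard `ipaddress` module. This is a sketch of the general technique, not OWA's implementation; the function name is hypothetical, and representing the masked octet as 0 is an assumption made for the example.

```python
import ipaddress


def anonymize_ipv4(address):
    """Zero the final octet of an IPv4 address by truncating it to its /24 network."""
    ip = ipaddress.IPv4Address(address)  # raises ValueError on invalid input
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return str(network.network_address)
```

Calling `anonymize_ipv4("12.214.31.77")` returns `"12.214.31.0"`. Up to 256 addresses share each masked value, which reduces but does not eliminate the identifiability of a visitor.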

See A National Forum on Web Privacy and Web Analytics: Action Handbook (2019, p. 5) for a Five-Point Plan for Privacy-Aware Analytics.

Strengths

  • The data collected belongs only to the practitioner’s organization and cannot be accessed by anyone else. Open Web Analytics never uses sampled data for reporting. Any report run represents 100% actual data.

  • As open source software, OWA’s source code is transparent. Implementers can see exactly how it works, make modifications to the code, or contribute development resources toward improving it. 

  • Because Open Web Analytics is installed on a local server, there are no limits on how much data can be collected or stored.

  • The tool natively supports tracking either on the client side (via JavaScript) or on the server side (via PHP scripting). Google Analytics and Matomo both require additional software components in order to track server-side.

  • OWA supports several optional advanced interaction tracking features, particularly through its “DOMstream” module. This includes click heatmaps, mouse movement recording and playback, and full automated click tracking on every element within an interface.

Weaknesses

  • Open Web Analytics requires particular technical knowledge and expertise to install and maintain. There is no hosted option.

  • Though there are Open Web Analytics plugins or modules available for easier integration with web content management systems (e.g., Drupal, WordPress, MediaWiki, etc.), there is no equivalent available for digital asset management platforms.

  • The user interface lacks export features (e.g., to save a report as a CSV or PDF file).

  • There is no configurable data retention schedule. If practitioners wish to purge data after a set period, this would likely require writing local scripts to query and remove records from the database.

  • Although an implementer retains full control over the data, running OWA with default settings and its built-in modules activated will track a vast amount of user data. This may be at odds with ethical guidelines and considerations in the profession, so careful attention is advised.
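For the data retention weakness noted above, a local purge script might look roughly like the Python sketch below. It is written against SQLite from the standard library purely so that it is self-contained; a real OWA installation uses MySQL and would need a MySQL client library instead. The table name `tracked_requests`, the `timestamp` column (assumed to hold Unix epoch seconds), and the function name are all hypothetical and must be replaced with the actual names from the local OWA schema.

```python
import sqlite3
import time

# Hypothetical table name; check the real OWA schema before adapting this.
ALLOWED_TABLES = {"tracked_requests"}

SECONDS_PER_DAY = 86400


def purge_older_than(conn, table, days, now=None):
    """Delete rows whose timestamp is more than `days` days old; return the count."""
    if table not in ALLOWED_TABLES:
        # Table names cannot be bound as SQL parameters, so restrict them explicitly.
        raise ValueError(f"unexpected table: {table}")
    current = int(time.time()) if now is None else int(now)
    cutoff = current - days * SECONDS_PER_DAY
    cursor = conn.execute(f"DELETE FROM {table} WHERE timestamp < ?", (cutoff,))
    conn.commit()
    return cursor.rowcount
```

A script like this could be run on a schedule (for example, nightly) with the retention period set by institutional policy. Testing against a copy of the database first is advisable, since deletions cannot be undone.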

Real world examples

  • Application of Open Web Analytics for usage data capture and analysis in recommender systems
    The paper reports the use of Open Web Analytics as a platform for collecting and analyzing usage data for a Software as a Service platform.

    Garcia, J. E., & Paiva, A. C. (2018, March). Manage software requirements specification using web analytics data. In World Conference on Information Systems and Technologies (pp. 257-266). Springer, Cham.

Additional resources

Azim, M., & Hasan, N. (2018, February). Web Analytics Tools Usage among Indian Library Professionals. In 2018 5th International Symposium on Emerging Trends and Technologies in Libraries and Information Services (ETTLIS) (pp. 31-35).

