Google Analytics

Basic information

Google Analytics is the most commonly used web analytics platform on the web, used by over 28 million websites. It is free to use and easy to set up. Because it is a Google-owned product, however, GLAMR institutions that choose to implement it should be especially diligent in considering user privacy concerns (Young et al., 2019).

In 2020, Google introduced a new version of the software, Google Analytics 4 (GA4), that differs significantly from the previous version, Universal Analytics (UA). New accounts created after 2020 use only GA4 by default; however, it is still possible to add new UA accounts or run both UA and GA4 concurrently within the same site. The versions differ in code syntax, reporting, and user interface, though in general the practices related to tracking use or re-use are similar in both.

There is also a more full-featured enterprise version of the tool, Google Analytics 360, that is aimed at large businesses and marketers. It would be prohibitively expensive for most GLAMR institutions to use. The rest of the information on this page will focus on the free versions.

How to use this tool for use/reuse assessment

Consult the Web Analytics data collection method guide for more general information about each of the following strategies.

Referrals

Practitioners can use a list of referrers to help determine the context of use or reuse. Strategies range from using URL patterns to segment different kinds of incoming traffic to visiting the referring pages and analyzing the links in context.

Referrer reports are available via:

  • GA4. Life Cycle > Acquisition > Traffic Acquisition. Search “referral” and filter to “Session source/medium”.
  • UA. Acquisition > All Traffic > Referrals

Note that most browsers now default to the strict-origin-when-cross-origin referrer policy, which sends only the origin (scheme and host) on cross-site requests, so the full referring path will often be unavailable.
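
The URL-pattern strategy can be sketched in a few lines of Python, run against a list of referrers exported from either report. This is a minimal illustration, not a Google Analytics feature: the segment names and host suffixes in SEGMENTS are hypothetical and should be replaced with patterns meaningful to the institution. Matching is done on the host only, since the path is often missing.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical segments for illustration; adjust to local needs.
# A suffix starting with "." matches any host ending in it (e.g. ".edu").
# Other suffixes match the domain itself and any of its subdomains.
SEGMENTS = {
    "wikipedia": ("wikipedia.org",),
    "education": (".edu",),
    "search": ("google.com", "bing.com", "duckduckgo.com"),
}


def host_matches(host: str, suffix: str) -> bool:
    """Return True when the host falls under the given suffix."""
    if suffix.startswith("."):
        return host.endswith(suffix)
    return host == suffix or host.endswith("." + suffix)


def segment_referrer(referrer: str) -> str:
    """Return the first matching segment name, or 'other'."""
    # Exports often list bare hosts; add "//" so urlsplit sees a host.
    if "//" not in referrer:
        referrer = "//" + referrer
    host = (urlsplit(referrer).hostname or "").lower()
    for name, suffixes in SEGMENTS.items():
        if any(host_matches(host, s) for s in suffixes):
            return name
    return "other"


def count_segments(referrers):
    """Tally how many referrers fall into each segment."""
    return Counter(segment_referrer(r) for r in referrers)
```

For example, count_segments(["https://en.wikipedia.org/wiki/Example", "library.example.edu"]) tallies one referrer each under "wikipedia" and "education". Referrers left in "other" are good candidates for visiting and reviewing in context.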

Social Media

Traffic to digital objects originating via social media might signal a distinct type of sharing that institutions would consider to be re-use. Google Analytics tracks social media separately from other referrals.

  • GA4. Life Cycle > Acquisition > Traffic Acquisition. Search “social” and filter to “Session source/medium”.
  • UA. Acquisition > Social > Overview
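
When working from an exported “Session source/medium” table instead of the built-in reports, social traffic can be totaled with a short script. The sketch below assumes rows of (source/medium string, session count) and values formatted like "t.co / referral". The SOCIAL_SOURCES set and the list of subdomain prefixes are hypothetical examples, not Google's own classification of social networks.

```python
from collections import defaultdict

# Hypothetical list of social hosts; extend as needed.
SOCIAL_SOURCES = {"facebook.com", "t.co", "instagram.com", "pinterest.com", "reddit.com"}

# Subdomain prefixes to strip before matching (assumed, not exhaustive).
SUBDOMAIN_PREFIXES = ("www.", "m.", "l.", "lm.")


def split_source_medium(value: str):
    """Split a 'source / medium' string into its two lowercase parts."""
    source, _, medium = value.partition(" / ")
    return source.strip().lower(), medium.strip().lower()


def social_sessions(rows):
    """Sum sessions per social source from (source_medium, sessions) rows."""
    totals = defaultdict(int)
    for source_medium, sessions in rows:
        source, _medium = split_source_medium(source_medium)
        for prefix in SUBDOMAIN_PREFIXES:
            if source.startswith(prefix):
                source = source[len(prefix):]
                break
        if source in SOCIAL_SOURCES:
            totals[source] += sessions
    return dict(totals)
```

Sources that do not appear in the set are silently skipped, so it is worth reviewing the unmatched rows occasionally for new platforms.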

Event Tracking

Web analytics packages support granular, targeted tracking of specific interactions within a site. Practitioners may identify elements of their web user interface that signal re-use when clicked by a user (e.g., share, download, or export buttons), and track that data for reporting purposes.

Data collection in GA4 is almost entirely performed via event tracking, whereas in UA, “Events” are more of an optional, ancillary feature for capturing page interactions. Regardless, configuring event tracking on interface elements such as links to share, download, or export objects can help to isolate user activity that may qualify as re-use.

How to set up events:

  • GA4. Some interaction events, such as file downloads and clicks on outbound links, are collected automatically via the default “Enhanced Measurement” feature. A practitioner may configure some additional events through the admin user interface. Highly customized events, with any number of custom properties, may also be defined using JavaScript.
  • UA. No events are tracked automatically. Use JavaScript to write event listeners (e.g., for clicks on links of interest). The data transmitted with each event is limited to three string fields (category, action, label) and an integer (value).
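
The difference in event data between the two versions can be made concrete by modeling both shapes as data structures. The Python sketch below is only a model for planning or migrating an event scheme; actual tracking is sent from the browser with JavaScript. The UAEvent class and to_ga4_event function are hypothetical names, and mapping the UA action to the GA4 event name is one possible convention, not a rule imposed by Google.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class UAEvent:
    """A UA event: three string fields and one integer, as described above."""
    category: str
    action: str
    label: Optional[str] = None
    value: Optional[int] = None


def to_ga4_event(event: UAEvent, **custom_params: Any) -> dict:
    """Re-express a UA event as a GA4-style event: a name plus free-form parameters.

    Assumption: the UA action becomes the GA4 event name, and the other
    UA fields are carried over as ordinary parameters.
    """
    params = {"event_category": event.category}
    if event.label is not None:
        params["event_label"] = event.label
    if event.value is not None:
        params["value"] = event.value
    # GA4 allows extra custom properties alongside the carried-over fields.
    params.update(custom_params)
    return {"name": event.action, "params": params}
```

For example, a UA download event with category "digital_object" and label "item-123" could be extended in GA4 with custom properties such as file_extension or collection, which have no dedicated field in UA.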

How to see event reports:

  • GA4. Life Cycle > Engagement > Events
  • UA. Behavior > Events > Overview

Embedding

Some digital asset management software supports an “embed code” feature to empower users to re-use digital objects by putting interactive versions of them in external sites (often in an <iframe>). The service providing the source of the <iframe> should have a separate web analytics property. External sites embedding the objects are logged as referrers within that property.

Both GA4 and UA make it possible to manage separate analytics properties within the same account. Having an Account > Property hierarchy helps keep reports organized and enables sharing some common configuration between properties.

Internet Service Provider

Some GLAMR institutions have used IP-derived service provider data in Google Analytics to measure digital collections traffic coming from academic or government institutions. In 2020, Google discontinued reporting Service Provider and Network Domain; data in these fields now appears only as “(not set).” As a result, this practice is no longer possible in any version of Google Analytics.

Ethical guidelines

Practitioners should follow the practices laid out in the “Ethical considerations and guidelines for the assessment of use and reuse of digital content.” The Guidelines are meant both to inform practitioners in their decision-making, and to model for users what they can expect from those who steward digital collections.

Additional guidelines for responsible practice

Privacy concerns around Google Analytics: 

  • All data that is collected by an institution legally belongs to Google. It resides on Google infrastructure, and will be used by the company and its partners in advertising products and other services.

  • Google does not consider IP addresses to be personally identifiable information (PII). In GA4, Google anonymizes IP addresses by default, masking the final part of the address, e.g., 12.214.31.XXX. In UA, however, a visitor’s full IP address is collected by default, and this can only be changed by modifying the JavaScript tracking code to activate the IP anonymization option (anonymizeIp in analytics.js, anonymize_ip in gtag.js); there is no way to change this setting in the UI.
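
To illustrate what masking the final part of an address means, the following Python sketch zeroes the last octet of an IPv4 address. It is an illustration of the idea only, not Google's implementation; the function name mask_ipv4 is hypothetical, and IPv6 addresses are deliberately left out of scope.

```python
import ipaddress


def mask_ipv4(address: str) -> str:
    """Zero the last octet of an IPv4 address (keep only the /24 network)."""
    ip = ipaddress.ip_address(address)
    if ip.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    # strict=False lets ip_network accept a host address and drop the host bits.
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return str(network.network_address)
```

With this masking, every visitor in the range 12.214.31.0 to 12.214.31.255 is recorded under the same value, so individual machines on that network cannot be told apart by address alone.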

See A National Forum on Web Privacy and Web Analytics: Action Handbook (2019, p. 3) for a complete list of configuration and implementation changes GLAMR practitioners should specifically consider if choosing to use Google Analytics.

Strengths

  • Easy setup: because Google Analytics is hosted software, no special skills, local storage space, or additional technologies are required. Google provides a property identifier, or alternatively a small tracking code HTML snippet to add to a page template for custom sites.

  • Cost: being a free, externally hosted service makes Google Analytics an attractive option, particularly for smaller institutions with limited budgets or without dedicated IT staff to set up and maintain web analytics software.

  • Ubiquity: Google Analytics is supported out-of-the-box for most digital asset management software, including DSpace, CONTENTdm, Hyrax/Hyku, Omeka, BePress, and more. An implementer typically only has to provide their property identifier in a configuration setting to activate tracking. Some platforms also have deeper integrations with Google Analytics using its API. For example, Hyrax/Hyku pulls statistics for views and downloads from Google to render an in-page interactive chart.

  • Google Analytics has a sleek user interface that will feel familiar to anyone with experience using other Google products and services.

  • Google Analytics integrates well with various Google products in ways that other tools cannot.

Weaknesses

  • The data collected belongs to Google. Google may use the data as it wishes, including sharing it with corporate partners or government agencies.

  • Google may modify data collection or retention policies at any time. New major versions of the software have been introduced every few years, each with its own unique syntax, policies, and other implementation considerations.

  • The Google Analytics reporting interfaces emphasize several e-commerce aspects (monetization, conversions, etc.) that are likely inapplicable to most GLAMR institutions. 

  • Google Analytics samples data to generate large reports, using data from a maximum of 500,000 sessions for its calculations. This can lead to inaccurate reports, and accuracy degrades as the sample covers a smaller share of the total sessions. An upgrade to the prohibitively expensive 360 version is required to avoid sampling.

  • Data exports are limited. The web interface has a 5,000 row limit for a single data export, while the reporting API has a 10,000 row limit per query.
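
The sampling cap and row limits above translate into simple arithmetic that can help when planning a report or export. The Python sketch below uses the figures quoted on this page as defaults; the function names are hypothetical, and the offsets are generic paging positions, not calls to any particular Google API.

```python
SESSION_SAMPLE_CAP = 500_000  # maximum sessions used for a sampled report
API_ROW_LIMIT = 10_000        # rows returned per reporting API query


def sampling_fraction(total_sessions: int, cap: int = SESSION_SAMPLE_CAP) -> float:
    """Return the share of sessions a report can be based on (1.0 = unsampled)."""
    if total_sessions <= 0:
        return 1.0
    return min(1.0, cap / total_sessions)


def page_offsets(total_rows: int, limit: int = API_ROW_LIMIT) -> list:
    """Return the starting row of each query needed to retrieve all rows."""
    return list(range(0, total_rows, limit))
```

A date range covering 2,000,000 sessions would be reported from at most a quarter of the data, while a 25,000-row result would need three API queries. Narrowing the date range is one way to stay under both limits.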

Additional resources

Young, S. W. H., Mannheimer, S., Clark, J. A., & Hinchliffe, L. J. (2019). A Roadmap for Achieving Privacy in the Age of Analytics: A White Paper from A National Forum on Web Privacy and Web Analytics. Montana State University. 
