Sample Queries

Intro

Here are some sample queries to get you started.

All of these queries can be run against the Akamai mPulse RUM dataset.

NOTE: It is recommended you always use a DATE filter to minimize query size and cost.

Unique Dimension Values

To understand the unique values available for each dimension, you can run a DISTINCT query:

SELECT  DISTINCT DEVICEMODEL
FROM    `akamai-mpulse-rumarchive.rumarchive.rumarchive_page_loads`
WHERE   DATE = '2022-09-01'
ORDER BY DEVICEMODEL ASC

Dimension Popularity

You can calculate the frequency of Dimension values by doing a GROUP BY and calculating the SUM(BEACONS) count for that date.

Ensure you're not just calculating row COUNT(*) for each group, as each row is weighted differently (by BEACONS count).

For example, this shows the counts of each Device Type in the mPulse dataset for 2022-09-01.

(NOTE: Data is sampled so these counts should only be used for relative percentages).

SELECT  DEVICETYPE,
        SUM(BEACONS) AS BEACONCOUNT
FROM    `akamai-mpulse-rumarchive.rumarchive.rumarchive_page_loads`
WHERE   DATE = '2022-09-01'
GROUP BY DEVICETYPE
ORDER BY BEACONCOUNT DESC

Dimension Popularity (as a percentage of total)

You can calculate the relative popularity of Dimension values by doing a GROUP BY and calculating the SUM(BEACONS) count for that date, compared to the overall SUM(BEACONS) for all rows.

For example, this shows the relative popularity of Device Types in the mPulse dataset for 2022-09-01:

SELECT  DEVICETYPE,
        SUM(BEACONS) AS BEACONCOUNT,
        (
          SUM(BEACONS) /
            (SELECT SUM(BEACONS)
             FROM `akamai-mpulse-rumarchive.rumarchive.rumarchive_page_loads`
             WHERE DATE = '2022-09-01')
        ) AS BEACONPCT
FROM    `akamai-mpulse-rumarchive.rumarchive.rumarchive_page_loads`
WHERE   DATE = '2022-09-01'
GROUP BY DEVICETYPE
ORDER BY BEACONCOUNT DESC

Page Load Time by Country

Using the PERCENTILE_APPROX function, you can calculate the approximate percentile (e.g. median) for a set of data.

NOTE: When using the PERCENTILE_APPROX function, it may run out of memory when run in BigQuery. It will often do this if you're executing a query over all of the data for one day (e.g. no filters have been applied).

The work around is shown below, to limit the dataset via WHERE filters, or the following sample which uses a subquery to reduce memory pressure.

SELECT  COUNTRY,
        SUM(BEACONS) AS BEACONCOUNT,
        `akamai-mpulse-rumarchive.rumarchive.PERCENTILE_APPROX`(
          ARRAY_AGG(PLTHISTOGRAM),
          [0.50],
          100,
          false) as MEDIAN
FROM    `akamai-mpulse-rumarchive.rumarchive.rumarchive_page_loads`
WHERE   DATE = "2022-09-01"
    AND BEACONTYPE = 'page view'
    AND USERAGENTFAMILY = 'Chrome'
GROUP BY COUNTRY
ORDER BY SUM(BEACONS) DESC

Page Load Time by Country (using a Subquery)