Calculated fields are one of the most powerful features of Google Data Studio. Based on the existing fields in your data source, calculated fields allow you to create new dimensions and metrics to augment your reporting and analysis.
Data Studio offers a variety of formulas that can be used to calculate ratios and rates, extract and reformat text, and apply logical comparisons.
Here are 5 useful examples to get you started with calculated fields in Data Studio:
Tip: Copy & paste any of the formulas below into the calculated field editor in Data Studio. Even though the editor is single-line, you can paste in multi-line expressions as well.
1. Combine hostname and page path
If you have multiple domains or subdomains within your website, it can be helpful to see the full URL of the page in your reports. A simple concatenation function will let you combine the Hostname and Page dimensions:
CONCAT(Hostname, Page)
2. Extract query parameter from URL
Query parameters are often used to trigger dynamic content, label traffic sources, or identify unique users. If you haven’t already moved these parameters into custom dimensions via your analytics setup, you can create a new dimension in Data Studio with a regular expression extract:
REGEXP_EXTRACT(Page, 'mode=([^&]+)')
Replace mode
with the name of the desired query parameter.
3. Clean up inconsistent campaign naming
Ideally, you should establish consistent campaign tagging across your organization. Using common naming conventions between marketing teams and agency partners will make the reporting and analysis process much more efficient. However, if you need to clean up historical data or a rogue campaign, Data Studio will be your friend:
The most common inconsistency is usually the use of different case, e.g. “email”, “Email”, and “EMAIL”. This is easily fixed by forcing all campaign dimensions to lowercase.
LOWER(Medium)
Similarly, you can use LOWER(Campaign)
and LOWER (Source).
When you use your calculated dimension, Data Studio will then aggregate the inconsistent values for you.
4. Create custom geographic regions
Google Analytics provides you with the standard geo dimensions of City, Region, and Country. But what if your organization uses your own regional grouping? Perhaps your sales team splits the country into West, Central, and East, for example.
Here’s a conditional function that you can use to define your own custom regions, based on the standard dimensions:
CASE WHEN REGEXP_MATCH(Region, "British Columbia|Alberta|Saskatchewan|Manitoba") THEN "West" WHEN REGEXP_MATCH(Region, "Ontario|Quebec") THEN "Central" WHEN REGEXP_MATCH(Region, "Nova Scotia|New Brunswick|Prince Edward Island|Newfoundland and Labrador") THEN "East" ELSE "Other" END
Note that Data Studio doesn’t support IF statements, so you must use the CASE function instead (which ends up being cleaner than nested IFs).
The general format is: CASE WHEN X THEN Y1 ELSE Y2 END
, where X
is a boolean expression and Y1
and Y2
are text, number, or boolean values. Refer to this CASE statement article in the Data Studio help centre for more details.
5. Create a custom channel grouping
Any custom channel groupings that you create in Google Analytics won’t be available in Data Studio. However, you can recreate your channel definitions in Data Studio using a CASE construct with a series of WHEN clauses.
Here is the Google Analytics default channel grouping represented as a CASE statement. Use this as a base for creating your own custom channel definitions:
CASE WHEN ((Source="direct" AND Medium="(not set)") OR Medium="(none)") THEN "Direct" WHEN Medium="organic" THEN "Organic Search" WHEN (Social Source Referral="Yes" OR REGEXP_MATCH(Medium,"^(social|social-network|social-media|sm|social network|social media)$")) THEN "Social" WHEN Medium="email" THEN "Email" WHEN Medium="affiliate" THEN "Affiliates" WHEN Medium="referral" THEN "Referral" WHEN (REGEXP_MATCH(Medium,"^(cpc|ppc|paidsearch)$") AND Ad Distribution Network!="Content") THEN "Paid Search" WHEN REGEXP_MATCH(Medium," ^(cpv|cpa|cpp|content-text)$") THEN "Other Advertising" WHEN (REGEXP_MATCH(Medium,"^(display|cpm|banner)$") OR Ad Distribution Network="Content") THEN "Display" ELSE "(Other)" END
Using calculated fields is a skill that every Data Studio user should master. It will make your data cleaner, make your reporting more relevant, and ultimately make your life easier.
Interested in learning more about Data Studio? Check out 7 Reasons to Use Data Studio with Google Analytics.
Let us know if there are any other specific aspects of Data Studio you would be interested in learning more about!
——
Looking for hands-on training in Data Studio? Come to our 1-day Data Studio course!
Related post: 7 Reasons to Use Data Studio with Google Analytics
I am tracking scroll depth (25%, 50%, 75%, 90%) and want to create an aggregate view of total events per URL. I also want to create a % of the total events per URL at each scroll depth interval. For example, I want to know the % of the total events for a URL at each interval. Is this possible in GDS?
Hi Skyler,
You can definitely accomplish this, but not with calculated fields. Data Studio will aggregate data based on the dimensions and metrics that you include in your chart. So, if you want to see an aggregate of total scroll depth events per URL, you could create a table with Page and Total Events and filter for scroll events only.
To get a “percent of total” metric, you can edit a metric in your report and apply the Percent of Total analytical function. Take a look at the help centre page here: https://support.google.com/datastudio/answer/7512288.
Hope that helps!
Hi, thanks for the article. You have wrong function name in 2. Extract query parameter from URL part it is REGEXP_EXTRACT. Have a nice day
Hello Ondra,
Thanks for letting us know! We have updated the post.
Jonathan
Very useful, Marc,
But I’m trying to filtering/categorize with a custom field with an example like this:
…
WHEN REGEXP_MATCH(Campaign, “BLU”) THEN “Colour Blue”
WHEN REGEXP_MATCH(Campaign, “GRE”) THEN “Colour Green”
WHEN REGEXP_MATCH(Campaign, “CAR”) THEN “Product Car”
WHEN REGEXP_MATCH(Campaign, “MOT”) THEN “Product Motorbike”
…
And works fine for campaigns with only one “tag”, like MOT or GRE, for example.
I have also campaigns with more than one tag, for example CAR+GRE, and the REGEXP only returns the first true condition, in this case, CAR. But I’m not able to categorize both, CAR and GRE.
Does anybody know how to achieve this?
Many thanks!
Hi Joan,
The CASE statement in your calculated field can only return one value. It will return the result of the first WHEN condition that evaluates to true.
When you are creating custom groupings in this manner, each dimension value (in this case, each campaign) can only be placed in one group. It won’t be possible to have a “CAR+GRE” campaign appear in both the CAR group and the GRE group. This is the same way that channel groupings and content groupings work in Google Analytics, i.e. it’s a one-to-one mapping.
If you have different types of tags (e.g. Product and Colour), and each campaign can have at most one value of each type (i.e. you could have CAR+GRE and MOT+BLU, but never BLU+GRE or CAR+MOT), then I suggest creating two separate groupings. You can create one calculated field that groups the campaigns based on colour and a second calculated field that groups based on product.
Hope that helps!
Marc
This is very helpful, however I am wondering how to split comma separated data into metrics?
For example having columns with following data: cat, dog, horse, bird.
Each matched text should be aggregated for summary how many times textbook occurred in the dataset.
Any ideas how to achieve this?
Hi Laz,
You can create a conditional count with a CASE statement. For example, if you wanted to count instances of “cat”, you can use the following:
CASE WHEN REGEXP_MATCH(Dimension, ".*cat.*") THEN 1 ELSE 0 END
The same could be done to count dog, horse, bird, or any other value in a comma-separated dimension value.
Hope that helps!
Hi – do NOT statements work? Example: I want a rule created that reflects my segment. Source = “paid_search” and Campaign [does not contain] “gmail”. How is a statement like this properly done?
Hi Arron,
You can use a NOT operator as follows:
WHEN REGEXP_MATCH(Source, "Paid Search") AND NOT REGEXP_MATCH(Campaign,"gmail") THEN "Paid Search"
Hope that works out for you!
Hi – I’m wondering if you can help me display a metric in Data Studio for percentage of overall traffic?
I want to show what my organic sessions from social networks are as a percentage of my total traffic. And then to drill down further and show that for each social network.
e.g If I had 1000 sessions total in a month, and 60 of those came from organic Facebook, then I want to show that figure (6%). and I can’t work out how to.
Sounds like it should be quite straightforward, I’m fairly new to Data Studio!
Cheers!
Hi Joe. A percent of total metric cannot be created as a calculated field since it is not a row-level calculation. You can however create a Percent of Total calculation in a table. See our other blog post on Calculating Percent of Total. You won’t be able to get a single scorecard metric, but you can get a table with the total and a breakdown.
Hi Soares, can you please throw some light to accomplish the below requirement.
We wanted to create a metric like [Last 25 days page views]/[Current Month]? I try to get the current month numeric number but failed and I was not able to create the Last 25 days page views. Please guide me the steps to accomplish this request.
Sincerely,
Vamsikrishna.
Hi Vamsikrishna,
To get the number of Pageviews for a specific date range, I suggest using a Scorecard chart and filtering by date.
If I understand correctly, you wish to create a calculated field for [Pageviews in the last 25 days] / [Pageviews for the current month]. Unfortunately, this type of aggregation is not supported in Data Studio right now. You would need to pre-process your data so that you have separate metrics for [PVs Last 25 Days] and [PVs for Current Month]. Then you will be able to calculate the ratio in Data Studio.
Marc
Hello Marc!
I am trying to have a table like this:
Channel | Page = ‘/’ (users) | Page =’/Plans’ (users) | Page=’/Cart’ (users) | Page=’Purchased’ (users)
I was wondering if that is possible with custom metrics. Please help!
Thank you,
Bruna
Hi Bruna,
You can accomplish this using Data Blending:
1) Create a separate table for each metric that you want to include. In each table, add Users as the metric and apply a filter for the desired page.
2) Select all 5 charts, right click and select “Blend data”. In the blended chart, you can then have a column for each metric.
This is similar to the method described in this post, except you’ll need to use filters instead of segments.
Marc
I used the code for custom channel grouping to include only Organic Search, Paid Search, Referral and Direct but when I enter the code and apply it another set of data values appear on my time series as an unnamed dimension. Do you know why this would happen?
Hi Catherine,
Did you include an ELSE condition at the end of your CASE statement? If not, any traffic that doesn’t match your WHEN conditions will be undefined. I suggest including the
ELSE "(Other)"
condition as shown above. If desired, you can then exclude (Other) from your time series.Marc
I’m trying to get a count of how many “share” events happen on a page.
I’ve worked out:
Facebook share = CASE WHEN (Event Category=”addthis” AND Event Action= “facebook”) THEN 1.0 ELSE 0 END
But when I try to create another field:
Count Facebook Shares = Sum (Facebook event)
Adding “Count of facebook shares” to my table with Page URLS- It doesn’t calculate correctly. In fact my whole “Page” Table gets totally out of whack and removes rows somehow. It’s not a filter, it’s a new column; any insights?
Hi Kate,
You shouldn’t need to create a new calculated field to sum your Facebook Share metric. Data Studio will automatically total the metric if you set the aggregation type to SUM. Try creating a table with Page and the Facebook Share metric.
Marc
Hi,
I’m trying to add a new custom field with just REGEXP_EXTRACT(Page, ‘mode=([^&]+)’) to get something working, and it tells me it doesn’t know what Page is. Do I need to enable something? What am I missing
Hi Chris,
The formula above assumes that Page is the name of a field in your data source. If you don’t have a “Page” field, then you will need to replace it in the formula with an existing field that you wish to extract from.
Marc
Hi!
Very good blog! I love your posts.
Guys, I have a doubt. I currently have two events, with action values of “start form” and “send form”. I would like to create a metric that allows calculating the rate of shipments between users who initiated the form and others who finally sent. How could I do it with a calculated metric and what kind of conditions would I have to use?
Thank you,
Fred
Hi Fred,
You can create a ratio metric by blending scorecards. Take a look at our blog post on Calculating Ratios with Filtered Metrics in Data Studio.
Hope that helps!
Marc
Hi Marc,
I’m trying to replicate the dimension “User-ID Status” within GDS which is unfortunately otherwise only available in GA when creating a segment. I’d hoped to do so with a calculated field. I created a new field with the following CASE statement because all our UserIDs are prefixed with “uid-“, however I only ended up with “Assigned” values. What happens with all the sessions where no UserID is assigned? Why are they not available in GDS as “(not set)”? Do you know of any GDS workaround without having to create more GA segments?
CASE WHEN CONTAINS_TEXT(UserID, “uid-“) THEN “Assigned” ELSE “Unassigned” END
Hi Steve,
Thanks for the question. Whenever you query a custom dimension from Google Analytics, the resulting report only contains rows that have a defined value for that custom dimension. This happens within the GA interface, as well as in external queries through the Reporting API or Data Studio.
So in this example, your CASE statement is not a problem. The reason you don’t see “Unassigned” is that rows without a UserID value are not in the dataset to begin with.
One way to deal with this behaviour of Google Analytics is to explicitly send “not set” or another default value in custom dimensions when they don’t have a set value. If you don’t need a row-level report, then a filter or segment that excludes sessions with a UserID value is probably the best approach.
Hope that helps!
Marc
Thank you Marc for your reply. It’s most helpful to know that this is a consistent issue that applies to all custom dimensions. I have considered the idea of sending a “not set” value, however we have a few questions/concerns:
(a) Currently, we only send a UserID value when a user completes a successful login. This means we’d need to figure out when it would be appropriate to reliably send a “not set” value.
(b) I’m not entirely sure what happens if a user has visited over many sessions over a long period of time without an account (i.e. sending “not set” to GA) and then the user creates an account on our website and logs in (i.e. we now send a UserID value to GA). Since we have session unification enabled, will that mean that all those sessions with a value of “not set” will now be “overwritten” with the new UserID?
(c) Similarly, for users who clear their cookies, if we initially send “not set” to GA, will this value be replaced by the accurate UserID value the next time they login.
(d) In cases where a previously known user returns to our website (i.e. we’ve previously sent a UserID value to GA), but they choose to not login, I don’t want to overwrite that user’s valid UserID stored at GA with a “not set” value.
About the only way I can see to overcome the above would be to set a more persistent cookie with the UserID value, independent of the user’s login status. That way, regardless of whether a user was actually logged in or not, we’d have a UserID value we could send to GA. However, for this approach, we’d need to run this by our Privacy colleagues.
In general, I’m able to use segments to either include or exclude traffic from known users. But each segment operates independently, so while I can hack it so that it looks like I have a table with both, I can’t calculate a percentage of total. So I can’t create a simple table in GDS that replicates the “User-ID Status” values otherwise only available in GA segments. A simple table that shows:
User-ID Status | Sessions | % Total Sessions
—-
Assigned | 2500 | 16.7%
Unassigned | 12500 | 83.3%
It also means that I can’t build a pie chart of all traffic, broken down by this User-ID Status.
And I forgot to mention the most important factor:
(e) The value sent to GA as the UserID is used to group all sessions as the same “user”. If we send simply “not set”, then all activity for every user who doesn’t login will be grouped together, somewhat compromising the concept of a “user”.
The only idea I can think of to overcome the above point would be to craft unique variations of “not set”, e.g. by appending the GA Client ID. This would still allow a GDS custom field to roll these sessions into an “Unassigned” status, and it still allows us to create segments in GA for all sessions that otherwise would have been equivalent to an User-ID Status of Unassigned.
Hi Steve,
Yes, these are common challenges, and as you’ve described very well, it’s difficult to resolve them easily with GA alone.
It would be feasible to persist a user ID in a cookie, even when the user isn’t logged in. Of course, this has privacy implications as you mention. Session unification only stitches together prior hits in the same session; It will not attach prior sessions to the user ID.
It would also be feasible to use the Client ID to create a proxy user ID when the user isn’t authenticated.
Without making any tracking changes, I think the best way to create the chart you’re looking for is to pre-process the data before connecting to Data Studio. You could use Google Sheets (with the GA Spreadsheet Add-on) or an ETL tool to extract and summarize your data into the Assigned/Unassigned status table.
There are rarely perfect solutions in our field. Hopefully this provides you with some direction.
Marc
For some of my data studio dashboards my regional CASE works when i use it as a filter control, but some times it does not. For example, when i want to narrow down the most common document downloads (event labels) for a region – the customer field filter does not work. The rub is, if i create individual filters for Country and Region, the downloads will show… So I can’t understand why my regional custom field case will not work.
Any help would be appreciated.
Hi Chad,
I’m not sure exactly what the issue may be based on your description. When you say that the custom field filter does not work, what are you seeing when the filter is applied? Does the data not change at all, does it get filtered in an unexpected way, or do you see an error?
Marc
How to calculate % in data studio .
for eq:
we have column
a1 a1%
1 1/10 * 100 = 10
2 2/10 * 100 = 20
3 3/10 * 100 = 30
4 4/10 * 100 = 40
after this i have to calculate % difference.
a1% – b1% .
Hi Pooja,
I’m not clear on exactly what you are looking to calculate. If you already have the percentage in a field in your data source, then you can calculate the difference between two percentage fields as a simple subtraction. If however you are applying a “percent of total” or “percent change” calculation in your chart, you will not be able to use these metrics in a calculated field. The in-chart comparison calculations are not available to be used in calculated fields.
Another way to create a calculation between two percentages is to use scorecards and data blending. We have a blog post about this here: Calculating Ratios with Filtered Metrics in Data Studio.
Marc