Google Data Studio: Average Number of Sessions based on selected country values - calculated-field

Let's say that for the dimension country I have 4 values and for each of the 4 I have the respective number of Sessions. E.g.
+---------+----------+
| country | Sessions |
+---------+----------+
| Italy   | 10       |
| France  | 12       |
| Germany | 14       |
| Spain   | 16       |
+---------+----------+
I want to compute and output in a scorecard the average number of Sessions, only for those specific countries. So, in the example, the output should be 13.
I tried with the following calculated field but it doesn't work:
Sessions * AVG(CASE
WHEN REGEXP_MATCH(country, '^Italy|France|Germany|Spain.*') THEN 1
ELSE 0 END)

Create a filter based on the country dimension using the Matching RegEx operator.
Then apply this filter to a scorecard that uses Sessions as its metric. In the Data tab on the right-hand side, you should be able to click the small pencil icon next to the metric and change the aggregation method from Sum to Average.
You may not have this option if you're using the GA connector. In this case, there should be an Average Session metric in the data source.

One way it can be achieved is by using a Filter Control and a Calculated Field:
1) Filter Control
Add the component with the Dimension set to Country and then add a Default Selection (a comma-separated list of the required countries):
Italy, France, Germany, Spain
2) Calculated Field (Scorecard)
Sessions / COUNT_DISTINCT(Country)
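The arithmetic behind this calculated field can be checked outside Data Studio. Below is a minimal Python sketch (not Data Studio syntax): the rows list copies the example table from the question, and average_sessions is a hypothetical helper name used only for illustration.

```python
# Example rows copied from the question's table: (country, Sessions).
rows = [("Italy", 10), ("France", 12), ("Germany", 14), ("Spain", 16)]


def average_sessions(rows, selected):
    """Total Sessions divided by the number of distinct selected countries."""
    kept = [(country, sessions) for country, sessions in rows if country in selected]
    distinct_countries = {country for country, _ in kept}
    if not distinct_countries:
        return None  # nothing selected, so no average is defined
    return sum(sessions for _, sessions in kept) / len(distinct_countries)


print(average_sessions(rows, {"Italy", "France", "Germany", "Spain"}))  # 13.0
```

With all four countries selected the total is 52 over 4 distinct countries, which matches the expected scorecard value of 13.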


How to group rows vertically in PowerBuilder?

I have this sample rows of plate nos with bay nos:
Plate no | Bay no
------------------
AAA111 | 1
AAA222 | 1
AAA333 | 2
BBB111 | 3
BBB222 | 3
CCC111 | 1
Is there a way to make it look like this in a datawindow in powerbuilder?
1      | 2      | 3
------------------------
AAA111 | AAA333 | BBB111
AAA222 |        | BBB222
CCC111 |        |
There isn't a simple answer, especially if you need the cells to be updatable.
Variable Column Count Strategy
If the number of columns across the top is unknown at development time, then you might get by with a "Crosstab" style datawindow, but it would be display-only. If you need updates, you'll need to do manual data manipulations and updates, as each cell would probably represent one row.
Fixed Column Count Strategy
If the number of columns is known (fixed) you could flatten the data at the database and use a standard tabular (or grid) datawindow control but you'll still need to get creative if updates are needed.
If you use Oracle to obtain the data, you can use the PIVOT and UNPIVOT operators to do what you are looking for. Here is an example of how to do it:
http://www.oracle.com/technetwork/es/articles/sql/caracteristicas-database11g-2108415-esa.html
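Whichever strategy is used, the reshaping itself amounts to grouping the plates by bay and padding the shorter columns. The following is a minimal Python sketch of that logic only (it is not PowerBuilder or Oracle code); the rows list copies the sample data from the question, and pivot_by_bay is a hypothetical helper name.

```python
from collections import defaultdict
from itertools import zip_longest

# Sample rows copied from the question: (plate no, bay no).
rows = [("AAA111", 1), ("AAA222", 1), ("AAA333", 2),
        ("BBB111", 3), ("BBB222", 3), ("CCC111", 1)]


def pivot_by_bay(rows):
    """Return the sorted bay numbers and a grid with one column per bay."""
    plates_by_bay = defaultdict(list)
    for plate, bay in rows:
        plates_by_bay[bay].append(plate)
    bays = sorted(plates_by_bay)
    # zip_longest pads shorter columns with empty strings.
    grid = list(zip_longest(*(plates_by_bay[bay] for bay in bays), fillvalue=""))
    return bays, grid


bays, grid = pivot_by_bay(rows)
print(" | ".join(f"{bay:<6}" for bay in bays))
for line in grid:
    print(" | ".join(f"{plate:<6}" for plate in line))
```

Each cell in the grid still corresponds to one source row, which is why updates through such a layout need manual handling.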

Oracle SQL - Give each row in a result set a unique identifier depending on a value in a column

I have a result set, being returned from a view, that returns a list of items and the country they originated from, an example would be:
ID | Description | Country_Name
------------------------------------
1 | Item 1 | United Kingdom
2 | Item 2 | France
3 | Item 3 | United Kingdom
4 | Item 4 | France
5 | Item 5 | France
6 | Item 6 | Germany
I wanted to query this data, returning all columns (There are more columns than ID, Description and Country_Name, I've omitted them for brevity's sake) with an extra one added on giving a unique value depending on the value that is inside the field Country_name
ID | Description | Country_Name | Country_Relation
---------------------------------------------------------
1 | Item 1 | United Kingdom | 1
2 | Item 2 | France | 2
3 | Item 3 | United Kingdom | 1
4 | Item 4 | France | 2
5 | Item 5 | France | 2
6 | Item 6 | Germany | 3
The reason behind this, is we're using a Jasper report and need to show these items with an asterisk next to it (Or in this case a number) explaining some details about the country. So the report would look like this:
Desc. Country
Item 1 United Kingdom(1)
Item 2 France(2)
Item 3 United Kingdom(1)
Item 4 France(2)
Item 5 France(2)
Item 6 Germany(3)
And then further down the report would be a field stating:
1: Here are some details about the UK
2: Here are some details about France
3: Here are some details about Germany
I'm having difficulty trying to generate a unique number to go alongside each country, starting at one each time the report is run, incrementing it when a new country is found and keeping track of where to assign it. I would hazard a guess at using temporary tables to do such a thing, but I feel that's overkill.
Question
Is this kind of thing possible in Oracle SQL or am I attempting to do something that is rather large and cumbersome?
Are there better ways of doing this inside of a Jasper report?
At the moment, I'm looking at just having the subtext underneath each individual item and repeating the same information several times, just to avoid this situation, rather than having them aggregated and having the subtext once. It's not clean, but it saves this rather odd hassle.
You are looking for dense_rank():
select t.*, dense_rank() over (order by country_name) as country_relation
from t;
I don't know if this can be done inside Jasper reports. However, it is easy enough to set up a view to handle this in Oracle.
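To see what numbering dense_rank() produces, here is a small Python sketch that mimics it on the sample rows from the question. It is only an illustration of the ranking logic, not Oracle or Jasper code, and dense_rank_by_country is a hypothetical helper name. Note that ordering by country_name numbers the countries in sorted order rather than in order of first appearance.

```python
# Sample rows copied from the question: (ID, Description, Country_Name).
items = [
    (1, "Item 1", "United Kingdom"),
    (2, "Item 2", "France"),
    (3, "Item 3", "United Kingdom"),
    (4, "Item 4", "France"),
    (5, "Item 5", "France"),
    (6, "Item 6", "Germany"),
]


def dense_rank_by_country(items):
    """Append a rank per distinct country, numbered 1..n in sorted order, no gaps."""
    distinct_countries = sorted({country for _, _, country in items})
    rank = {country: n for n, country in enumerate(distinct_countries, start=1)}
    return [(item_id, desc, country, rank[country]) for item_id, desc, country in items]


for row in dense_rank_by_country(items):
    print(row)
```

With this data France gets 1, Germany 2 and United Kingdom 3, so every row for the same country shares one number that the footnotes can refer to.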

SQLAlchemy getting label names out from columns

I want to use the same labels from a SQLAlchemy table, to re-aggregate some data (e.g. I want to iterate through mytable.c to get the column names exactly).
I have some spending data that looks like the following:
| name | region | date | spending |
| John | A | .... | 123 |
| Jack | A | .... | 20 |
| Jill | B | .... | 240 |
I'm then passing it to an existing function we have, that aggregates spending over 2 periods (using a case statement) and groups by region:
grouped table:
| Region | Total (this period) | Total (last period) |
| A | 3048 | 1034 |
| B | 2058 | 900 |
The function returns a SQLAlchemy query object that I can then use subquery() on to re-query e.g.:
subquery = get_aggregated_data(original_table)
region_A_results = session.query(subquery).filter(subquery.c.region == 'A')
I want to then re-aggregate this subquery (summing every column that can be summed, replacing the region column with a string 'other').
The problem is, if I iterate through subquery.c, I get labels that look like:
anon_1.region
anon_1.sum_this_period
anon_1.sum_last_period
Is there a way to get the textual label from a set of column objects, without the anon_1. prefix? Especially since I feel that the prefix may change depending on how SQLAlchemy decides to generate the query.
Split the name string and take the second part, and if you want to prepare for the chance that the name is not prefixed by the table name, put the code in a try - except block:
for col in subquery.c:
    try:
        print(col.name.split('.')[1])
    except IndexError:
        print(col.name)
Also, the result proxy (region_A_results) has a keys method, which returns a list of column names. Again, if you don't need the table names, you can easily strip them.
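If the labels are available as plain strings such as those shown in the question, the prefix can also be stripped without a try/except. This is a minimal sketch that works on strings only and does not call SQLAlchemy; bare_label is a hypothetical helper name.

```python
def bare_label(label):
    """Return the part after the last dot, or the whole label if there is no dot."""
    return label.rpartition(".")[2]


labels = ["anon_1.region", "anon_1.sum_this_period", "anon_1.sum_last_period", "region"]
print([bare_label(label) for label in labels])
```

Because rpartition returns the whole string as its last element when no separator is found, unprefixed names pass through unchanged.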

Date Join Query with Calculated Fields

I'm creating an Access 2010 database to replace an old Paradox one. Just now getting to queries, and there is no hiding that I am new to SQL.
What I am trying to do is set up a query to be used by a graph. The graph's Y axis is to be a simple percentage passed, and the X axis is a certain day. The graph will be created on form load and subsequent new records entered with a date range of "Between Date() And Date()-30" (30 days, rolling).
The database I'm working with can have multiple inspections per day with multiple passes and multiple fails. Each inspection is a separate record.
For instance, on 11/26/2012 there were 7 inspections done; 5 passed and 2 failed, a 71% ((5/7)*100%) acceptance. The "11/26/2012" and "71%" represent a data point on the graph. On 11/27/2012 there were 8 inspections done; 4 passed and 4 failed, a 50% acceptance. Etc.
Here is an example of a query with fields "Date" and "Disposition" of date range "11/26/2012 - 11/27/2012:"
SELECT Inspection.Date, Inspection.Disposition
FROM Inspection
WHERE (((Inspection.Date) Between #11/26/2012# And #11/27/2012#) AND ((Inspection.Disposition)="PASS" Or (Inspection.Disposition)="FAIL"));
Date | Disposition
11/26/2012 | PASS
11/26/2012 | FAIL
11/26/2012 | FAIL
11/26/2012 | PASS
11/26/2012 | PASS
11/26/2012 | PASS
11/26/2012 | PASS
11/27/2012 | PASS
11/27/2012 | PASS
11/27/2012 | FAIL
11/27/2012 | PASS
11/27/2012 | FAIL
11/27/2012 | PASS
11/27/2012 | FAIL
11/27/2012 | FAIL
*NOTE - The date field is of type "Date," and the Disposition field is of type "Text." There are days where no inspections are done, and these days are not to show up on the graph. The inspection disposition can also be listed as "NA," which refers to another type of inspection not to be graphed.
Here is the layout I want to create in another query (again, for brevity, only 2 days in range):
Date | # Insp | # Passed | # Failed | % Acceptance
11/26/2012 | 7 | 5 | 2 | 71
11/27/2012 | 8 | 4 | 4 | 50
What I think needs to be done is some type of join on the record dates themselves and "calculated fields" in the rest of the query results. The problem is that I haven't found out how to "flatten" the records by date AND maintain a count of the number of inspections and the number passed/failed all in one query. Do I need multiple layered queries for this? I prefer not to store any of the queries as tables as the only use of these numbers is in graphical form.
I was thinking of making new columns in the database to get around the "Disposition" field being textual by assigning a PASS "1" and a FAIL "0," but this seems like a cop-out. There has to be a way to make this work in SQL, but I just haven't found applicable examples.
Thanks for your help! Any input or suggestions are appreciated! Example databases with forms, queries, and graphs are also helpful!
You could group by Date, and then use aggregates like sum and count to calculate statistics for that group:
select Date
, count(*) as [# Insp]
, sum(iif(Disposition = 'PASS',1,0)) as [# Passed]
, sum(iif(Disposition = 'FAIL',1,0)) as [# Failed]
, 100.0 * sum(iif(Disposition = 'PASS',1,0)) / count(*) as [% Acceptance]
from YourTable
where Disposition in ('PASS', 'FAIL')
group by
Date
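The same grouping idea can be tried out quickly with Python's built-in sqlite3 module. This is only a sketch under assumptions: it runs on SQLite rather than Access, so CASE replaces iif, and the column name InspDate and the helper acceptance_by_date are hypothetical. The sample rows mirror the question's data, plus one "NA" row to show that it is excluded.

```python
import sqlite3

# Sample data mirroring the question, plus one "NA" row that must be ignored.
day1 = ["PASS", "FAIL", "FAIL", "PASS", "PASS", "PASS", "PASS"]
day2 = ["PASS", "PASS", "FAIL", "PASS", "FAIL", "PASS", "FAIL", "FAIL", "NA"]
rows = [("2012-11-26", d) for d in day1] + [("2012-11-27", d) for d in day2]


def acceptance_by_date(rows):
    """Group inspections by date and compute counts and percent passed."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Inspection (InspDate TEXT, Disposition TEXT)")
    conn.executemany("INSERT INTO Inspection VALUES (?, ?)", rows)
    query = """
        SELECT InspDate,
               COUNT(*) AS n_insp,
               SUM(CASE WHEN Disposition = 'PASS' THEN 1 ELSE 0 END) AS n_passed,
               SUM(CASE WHEN Disposition = 'FAIL' THEN 1 ELSE 0 END) AS n_failed,
               100.0 * SUM(CASE WHEN Disposition = 'PASS' THEN 1 ELSE 0 END)
                     / COUNT(*) AS pct_acceptance
        FROM Inspection
        WHERE Disposition IN ('PASS', 'FAIL')
        GROUP BY InspDate
        ORDER BY InspDate
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


for row in acceptance_by_date(rows):
    print(row)
```

Days with no inspections simply produce no group, so they do not appear in the result.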

Excel: filtering a time series graph

I have data that looks like the following:
ID | Location | Attendees | StartDate | EndDate
---------------------------------------------
Event1 | Bldg 1 | 10 | June 1 | June 5
Event2 | Bldg 2 | 15 | June 3 | June 6
Event3 | Bldg 1 | 5 | June 3 | June 10
I'd like to create a time series graph showing, for every given date, how many events were active on that date (i.e. started but haven't ended yet). For example, on June 1, there was 1 active event, and on June 4, there were 3 active events.
This should be simple enough to do by creating a new range where my first column consists of consecutive dates, and the second column consists of formulas like the following (I hardcoded June 8 in this example):
=COUNTIFS(Events[StartDate],"<=6/8/2009", Events[EndDate],">6/8/2009")
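For reference, the counting rule in that formula (started on or before the date, ending strictly after it) can be sketched in Python. The events list copies the sample table, the year 2009 is assumed from the hardcoded date in the formula, and active_count and active_series are hypothetical helper names.

```python
from datetime import date, timedelta

# Sample events copied from the table: (ID, Location, Attendees, StartDate, EndDate).
events = [
    ("Event1", "Bldg 1", 10, date(2009, 6, 1), date(2009, 6, 5)),
    ("Event2", "Bldg 2", 15, date(2009, 6, 3), date(2009, 6, 6)),
    ("Event3", "Bldg 1", 5, date(2009, 6, 3), date(2009, 6, 10)),
]


def active_count(events, day):
    """Count events with StartDate <= day and EndDate > day, like the COUNTIFS."""
    return sum(1 for _, _, _, start, end in events if start <= day < end)


def active_series(events, first, last):
    """Return (day, active count) pairs for every day from first to last inclusive."""
    n_days = (last - first).days + 1
    days = [first + timedelta(days=n) for n in range(n_days)]
    return [(day, active_count(events, day)) for day in days]


for day, count in active_series(events, date(2009, 6, 1), date(2009, 6, 10)):
    print(day.isoformat(), count)
```

Because the end-date comparison is strict, an event is not counted on its own EndDate.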
However, the challenge is that I'd like to be able to dynamically filter the time series graph based on various criteria. For example, I'd like to be able to quickly switch between seeing the above time series only for events in Bldg 1; or for Events with more than 10 attendees. I have at least 10 different criteria I'd like to be able to filter on.
What is the best way to do this? Does Excel have a built-in way to do this, or should I write the filtering code in VBA?
Although my answer is not programming-related, this is a prime example of a use case for a pivot table. Use one to show the data consolidated, e.g. for each day. Then you can play around with filtering as you like.
Your question is exactly what pivot tables are made for.