How to create a funnel visual/bar chart in Tableau by creating a calculated field using an existing column in the data source? - sql

In my data source, there's a column called 'Pool'.
That column contains 3 distinct values, for example:
| Pool |
| C |
| B |
| C |
| A |
So as you can see, there are 3 distinct values: A, B, and C. I want to create a funnel (essentially a bar chart) that counts the rows for each of those three values across the whole column. However, I know I can't just place the column itself in the sheet, since I also want a fourth bar that counts all the values as an "All" category.
The end result would be a visual showing the following (given here in tabular form to illustrate what I mean):
All | 20
A | 10
B | 5
C | 5

Please find an indicative answer in fiddle
You could UNION two results: one that returns the COUNT for each of your values and one that returns the COUNT for all your rows.
(SELECT Pool, COUNT(Pool) AS your_count
FROM your_table
GROUP BY Pool)
UNION
(SELECT 'ALL', COUNT(*) AS your_count
FROM your_table)
ORDER BY your_count DESC
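The query above can be tried out with a small Python sketch against an in-memory SQLite database. This is only an illustration under stated assumptions: the table name your_table and its four rows come from the sample in the question, SQLite stands in for whatever database actually feeds Tableau, and UNION ALL is swapped in for UNION because the two result sets cannot contain duplicates of each other here.

```python
import sqlite3

# Hypothetical in-memory table holding the four sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (Pool TEXT)")
conn.executemany(
    "INSERT INTO your_table (Pool) VALUES (?)",
    [("C",), ("B",), ("C",), ("A",)],
)

# Per-value counts plus one extra row that counts the whole column.
# SQLite does not accept parentheses around the SELECTs of a compound
# query, so they are dropped; ORDER BY refers to column positions.
query = """
SELECT Pool, COUNT(Pool) AS your_count
FROM your_table
GROUP BY Pool
UNION ALL
SELECT 'ALL', COUNT(*)
FROM your_table
ORDER BY 2 DESC, 1
"""

funnel_rows = conn.execute(query).fetchall()
for label, count in funnel_rows:
    print(label, count)
```

The extra sort key on the label only makes the order of tied counts deterministic; it is not part of the original answer.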

Related

How to unnest two lists from two columns in BigQuery without cross product, as individual rows

I have a table in BigQuery with two columns, each of which contains an array. For a given row, both columns contain arrays of the same length, but that length can vary from row to row:
WITH tbl AS (
select ['a','b','c'] AS one, [1,2,3] as two
union all
select ['a','x'] AS one, [10,20] as two
)
select * from tbl
So the table will look like:
row | one | two
-----------------------
1 | [a,b,c] | [1,2,3]
2 | [a,x] | [10,20]
I would like to unnest in such a way that each row in the new table has one element of the array from column one and the corresponding element from column two. So from the table above, I am looking to get:
row | one | two
---------
1 | a | 1
2 | b | 2
3 | c | 3
4 | a | 10
5 | x | 20
Any help would be much appreciated! Thanks!
The query below is for BigQuery Standard SQL:
#standardSQL
SELECT z.*
FROM `project.dataset.table` t,
UNNEST(ARRAY(
SELECT AS STRUCT one, two
FROM UNNEST(one) one WITH OFFSET
JOIN UNNEST(two) two WITH OFFSET
USING(OFFSET)
)
) z
You can test and play with the above using the sample data from your question. The result will be:
Row one two
1 a 1
2 b 2
3 c 3
4 a 10
5 x 20
I don't fully understand the syntax; could you please explain it?
Explanation:
Step 1
For each row in the table, the array below is calculated:
ARRAY(
SELECT AS STRUCT one, two
FROM UNNEST(one) one WITH OFFSET
JOIN UNNEST(two) two WITH OFFSET
USING(OFFSET)
)
The elements of this array are structs holding the respective values from the two columns. They are matched with each other by JOIN'ing on their positions in the initial arrays (OFFSET).
Step 2
Then this array gets UNNEST'ed and cross JOIN'ed with the respective row in the table. The rest of the row is ignored, and only that struct (z) is brought into the output.
Step 3
And finally, to output separate columns rather than a struct, z.* is used.
Hope this helped :o)
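The positional matching described in the steps above can also be pictured outside BigQuery. The following is a minimal Python sketch, not the BigQuery query itself: the list tbl mirrors the sample data from the question, and the helper name unnest_pairs is made up for illustration. zip plays the role of the JOIN ... USING(OFFSET), pairing elements that sit at the same position in both arrays.

```python
# Sample rows from the question: two arrays of equal length per row.
tbl = [
    {"one": ["a", "b", "c"], "two": [1, 2, 3]},
    {"one": ["a", "x"], "two": [10, 20]},
]


def unnest_pairs(rows):
    """Pair elements of 'one' and 'two' by position, one output row per pair."""
    out = []
    for row in rows:
        # zip matches elements by offset, so there is no cross product.
        for one, two in zip(row["one"], row["two"]):
            out.append((one, two))
    return out


pairs = unnest_pairs(tbl)
for one, two in pairs:
    print(one, two)
```

Each input row contributes as many output rows as its arrays have elements, which is why the two sample rows yield five pairs.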

SQL: reverse groupby : EDIT

Is there a built-in function in SQL to reverse the order in which GROUP BY works? I am trying to group by a certain key, but I would like the last inserted record returned rather than the first inserted record.
Changing the order with orderby does not affect this behaviour.
Thanx in advance!
EDIT:
this is the sample data:
id|value
-----
1 | A
2 | B
3 | B
4 | C
as return i want
1 | A
3 | B
4 | C
not
1 | A
2 | B
4 | C
When using group by id, I don't get the result I want.
The question here is how you identify the last inserted row. Based on your example, it looks like it is based on id. If id is auto-generated or comes from a sequence, then you can definitely do this:
select max(id),value
from your_table
group by value
Ideally, a table design includes a date column that holds the time each record was inserted, so that it is easy to order by that.
Use Max() as your aggregate function for your id:
SELECT max(id), value FROM <table> GROUP BY value;
This will return:
1 | A
3 | B
4 | C
As for Eloquent, I've not used it, but I think it would look like:
$myData = DB::table('yourtable')
->select('value', DB::raw('max(id) as maxid'))
->groupBy('value')
->get();
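The MAX(id) approach from both answers can be checked against the sample data with a short Python sketch. It assumes an in-memory SQLite database and a hypothetical table named your_table with the four rows from the question; an ORDER BY is added only so the output order is predictable.

```python
import sqlite3

# Hypothetical table with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO your_table (id, value) VALUES (?, ?)",
    [(1, "A"), (2, "B"), (3, "B"), (4, "C")],
)

# MAX(id) picks the highest (last inserted) id within each value group.
latest_rows = conn.execute(
    "SELECT MAX(id) AS id, value FROM your_table GROUP BY value ORDER BY value"
).fetchall()

for row_id, value in latest_rows:
    print(row_id, "|", value)
```

This only works as a stand-in for "last inserted" when id grows with insertion order, as the first answer points out.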

Google Big Query : Window Function Row Wise Cumulative Sum Across Columns

I am looking to calculate cumulative sum across columns in Google Big Query.
Assume there are five columns (NAME,A,B,C,D) with two rows of integers, for example:
NAME | A | B | C | D
----------------------
Bob | 1 | 2 | 3 | 4
Carl | 5 | 6 | 7 | 8
I am looking for a windowing function or UDF to calculate the cumulative sum across rows to generate this output:
NAME | A | B | C | D
-------------------------
Bob | 1 | 3 | 6 | 10
Carl | 5 | 11 | 18 | 26
Any thoughts or suggestions greatly appreciated!
I think there are a number of reasonable workarounds for your requirements, mostly in the area of designing your table better. It all really depends on how you input your data and, most importantly, how you then consume it.
Still, staying with the requirements as presented: the query below does not produce exactly the output you expect in your question, but it might be useful as an example:
SELECT name, GROUP_CONCAT(STRING(cum)) AS all FROM (
SELECT name,
SUM(INTEGER(num))
OVER(PARTITION BY name
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cum
FROM (
SELECT name, SPLIT(all) AS num FROM (
SELECT name,
CONCAT(STRING(a),',',STRING(b),',',STRING(c),',',STRING(d)) AS all
FROM yourtable
)
)
)
GROUP BY name
Output is:
name all
Bob 1,3,6,10
Carl 5,11,18,26
Depending on how you then consume this data, it can still work for you.
Note that you now avoid writing something like col1 + col2 + .. + col89 + col90, but you still need to explicitly mention each column just once.
If you have the "luxury" of implementing your requirements outside of the GBQ UI, in some client, you can use the BigQuery API to programmatically acquire the table schema, build your logic/query on the fly, and then execute it.
Take a look at below APIs to start with:
To get table schema - https://cloud.google.com/bigquery/docs/reference/v2/tables/get
To issue query job - https://cloud.google.com/bigquery/docs/reference/v2/jobs/insert
There's no need for a UDF:
SELECT name, a, a+b, a+b+c, a+b+c+d
FROM tab
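For the client-side route, the row-wise running total itself is straightforward. The following is a minimal Python sketch, assuming the rows have already been fetched into plain lists; the dictionary rows simply mirrors the sample data from the question and no BigQuery API call is made.

```python
from itertools import accumulate

# Sample data from the question: one list of column values (A..D) per name.
rows = {
    "Bob": [1, 2, 3, 4],
    "Carl": [5, 6, 7, 8],
}

# accumulate yields the running total across the columns of each row.
cumulative = {name: list(accumulate(values)) for name, values in rows.items()}

for name, totals in cumulative.items():
    print(name, totals)
```

Because the column values are handled as a list, the same code works for 4 columns or 90 without naming each one.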

Problems with a Sum in Crystal Reports 2008

my legacy stored procedure is bringing in the data like this:
Person ID | Location ID | Awesome Count (by Location ID)
1 | A | 2
2 | A | 2
3 | A | 2
4 | B | 3
5 | B | 3
6 | B | 3
So, since Awesome Count is by Location ID, it's repeated for each person in that location (A's actual awesome count is 2 of the 3 people, and B's is 3 of 3). The problem occurs when I try to sum the Awesome Count for all locations. In this example, Sum(Awesome Count, Location ID) yields 15 instead of 5 because it sums over all person IDs. Is there something like a Distinct Sum?
I also tried a 2-step formula, where the first formula was Maximum(Awesome Count, Location ID) and the second formula was Sum(1st formula), but the second formula says "This field cannot be summarized" when I hit save.
Any thoughts would be appreciated!
The first option would be to check the Select Distinct Records option in Database.
Otherwise, try the steps below. This solution works assuming Awesome Count is always the same within a Location ID:
a. Create a group with Location ID
b. Place the Awesome Count in detail section
c. Now create a formula #Result in group footer of the Location ID
Sum(Awesome Count, Location ID)/count(Awesome Count, Location ID)
I ended up creating 2 running totals for Awesome Count, one that reset every change in the Location ID group (total per location), and one that did not reset (grand total).
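The arithmetic behind the "distinct sum" can be sketched in Python. This is not Crystal Reports syntax, only an illustration of why the naive sum gives 15 and how taking one value per location gives 5; the tuple layout and variable names are assumptions made for the example.

```python
# Sample rows from the question: (person_id, location_id, awesome_count).
records = [
    (1, "A", 2), (2, "A", 2), (3, "A", 2),
    (4, "B", 3), (5, "B", 3), (6, "B", 3),
]

# Naive sum repeats each location's count once per person.
naive_total = sum(count for _, _, count in records)

# Keep a single count per location (it is the same on every row of a group),
# which is what Sum / Count within the group works out to.
per_location = {}
for _, location, count in records:
    per_location[location] = count

grand_total = sum(per_location.values())

print(naive_total, grand_total)
```

Like the Sum/Count formula, this relies on Awesome Count being identical on every row of a location.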

Transforming a 2 column SQL table into 3 columns, column 3 lagged on 2

Here's my problem: I want to write a query (that goes into a larger query) that takes a table like this;
ID | DATE
A | 1
A | 2
A | 3
B | 1
B | 2
and so on, and transforms it into;
ID | DATE1 | DATE2
A | 1 | 2
A | 2 | 3
A | 3 | NOW
B | 1 | 2
B | 2 | NOW
Where the numbers are dates, and NOW() is always appended to the most recent date. Given free rein I would do this in Python, but unfortunately this goes into a larger query. We're using SyBase's SQL Anywhere 12, I think? I interact with the database using SQuirreL SQL.
I'm very stumped. I thought (SQL query to transform a list of numbers into 2 columns) would help, but I'm afraid I don't know enough to make it work. I was thinking of JOINing the table to itself, but I don't know how to SELECT for only the A-1-2 rows instead of the A-1-3 rows as well, for instance, or how to insert the NOW() value into it. Does anyone have any ideas?
I put together an sqlfiddle.com example to outline a solution for your case. You mentioned dates but used integers, so I chose to do an integer example; it can be modified. I wrote it in PostgreSQL, so the coalesce() function can be substituted with nvl() or similar. Also, the parameter '0' can be substituted with any value, including now(), but you must then change the data type of the "i" column in the table to a date as well. Please let me know if you need further help with this.
select a.id, a.i, coalesce(min(b.i),'0') from
test a
left join test b on b.id=a.id and a.i<b.i
group by a.id,a.i
order by a.id, a.i
http://sqlfiddle.com/#!15/f1fba/6
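The self-join above can be run locally with a short Python sketch. It assumes an in-memory SQLite database rather than PostgreSQL or SQL Anywhere, reuses the table name test and column i from the answer, and keeps 0 as the placeholder that would be replaced by NOW() once the column holds real dates.

```python
import sqlite3

# Hypothetical copy of the sample table, with integers standing in for dates.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id TEXT, i INTEGER)")
conn.executemany(
    "INSERT INTO test (id, i) VALUES (?, ?)",
    [("A", 1), ("A", 2), ("A", 3), ("B", 1), ("B", 2)],
)

# For each row, MIN(b.i) is the smallest later value with the same id.
# Rows with no later value get the placeholder 0 via COALESCE.
query = """
SELECT a.id, a.i, COALESCE(MIN(b.i), 0)
FROM test a
LEFT JOIN test b ON b.id = a.id AND a.i < b.i
GROUP BY a.id, a.i
ORDER BY a.id, a.i
"""

lagged = conn.execute(query).fetchall()
for row in lagged:
    print(row)
```

The condition a.i < b.i combined with MIN is what keeps the A-1-2 pairing and discards A-1-3.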