Get a histogram of a query Bigquery - sql

The question is:
Show customer's transaction distribution for completed RIDE orders between 1st - 10th of April 2018 (Distribution of customers that have done 1 transaction, 2, 3,4,etc)
A preview of the table I'm querying:
My query is:
SELECT customer_no, COUNT(*) AS total_transaction FROM [bi-dwhdev-01:source.daily_order]
WHERE DATE(order_time) >= '2018-04-01' AND DATE(order_time) <= '2018-04-10'
GROUP BY customer_no
ORDER BY total_transaction DESC;
I'm wondering how to get a distribution in BigQuery (either Legacy or Standard SQL)?
Thanks in advance!

I think you want two levels of aggregation:
SELECT total_transaction, COUNT(*) AS num_customers
FROM (SELECT customer_no, COUNT(*) AS total_transaction
FROM [bi-dwhdev-01:source.daily_order]
WHERE DATE(order_time) >= '2018-04-01' AND DATE(order_time) <= '2018-04-10'
GROUP BY customer_no
) c
GROUP BY total_transaction
ORDER BY total_transaction DESC;
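To see the two levels of aggregation at work, here is a minimal sketch that runs the same query shape against an in-memory SQLite database instead of BigQuery. The sample rows and the `num_customers` alias are made up for illustration, and SQLite's `DATE()` is only a stand-in for BigQuery's.

```python
import sqlite3

# Hypothetical sample data: (customer_no, order_time). Not from the original post.
orders = [
    (1, "2018-04-01 08:00:00"),
    (1, "2018-04-03 09:30:00"),
    (2, "2018-04-02 10:00:00"),
    (3, "2018-04-05 11:00:00"),
    (3, "2018-04-06 12:00:00"),
    (4, "2018-04-10 23:59:00"),
    (4, "2018-04-11 00:01:00"),  # outside the date range, so it is filtered out
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_order (customer_no INTEGER, order_time TEXT)")
conn.executemany("INSERT INTO daily_order VALUES (?, ?)", orders)

# Inner query: transactions per customer. Outer query: customers per transaction count.
histogram = conn.execute(
    """
    SELECT total_transaction, COUNT(*) AS num_customers
    FROM (SELECT customer_no, COUNT(*) AS total_transaction
          FROM daily_order
          WHERE DATE(order_time) >= '2018-04-01' AND DATE(order_time) <= '2018-04-10'
          GROUP BY customer_no) c
    GROUP BY total_transaction
    ORDER BY total_transaction DESC
    """
).fetchall()

print(histogram)  # [(2, 2), (1, 2)]: two customers with 2 orders, two with 1
```

Each output row reads as "this many customers made exactly this many transactions", which is the distribution the question asks for.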

Related

How to get number of billable customers per month SQL

This is what my table looks like:
NOTE: Don't worry about the BMI field being empty in some rows. We assume that each row is a reading. I have omitted some columns for privacy reasons.
I want to get a count of the number of active customers per month. A customer is active if they have at least 18 readings in total (1 reading per day for 18 days in a given month). How do I write this SQL query? Assume the table name is 'cust'. I'm using SQL Server. Any help is appreciated.
Presumably a patient is a customer in your world. If so, you can use two levels of aggregation:
select yyyy, mm, count(*)
from (select year(createdat) as yyyy, month(createdat) as mm,
patient_id,
count(distinct convert(date, createdat)) as num_days
from cust
group by year(createdat), month(createdat), patient_id
) ymp
where num_days >= 18
group by yyyy, mm;
You need to group by patient and month, then group again by just the month:
SELECT
mth,
COUNT(*) NumPatients
FROM (
SELECT
EOMONTH(c.createdat) mth
FROM cust c
GROUP BY EOMONTH(c.createdat), c.patient_id
HAVING COUNT(*) >= 18
-- for distinct days you could change it to:
-- HAVING COUNT(DISTINCT CAST(c.createdat AS date)) >= 18
) c
GROUP BY mth;
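The difference between counting readings and counting distinct days is easy to miss, so here is a small sketch of the distinct-days variant. It uses SQLite rather than SQL Server (so `strftime('%Y-%m', ...)` stands in for `EOMONTH` or `YEAR`/`MONTH`), and the sample readings are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cust (patient_id INTEGER, createdat TEXT)")

rows = []
# Patient 1: one reading on each of 18 distinct days in January.
rows += [(1, f"2020-01-{d:02d} 08:00:00") for d in range(1, 19)]
# Patient 2: 18 readings, but spread over only 9 distinct days.
rows += [(2, f"2020-01-{d:02d} {h:02d}:00:00") for d in range(1, 10) for h in (8, 20)]
# Patient 3: 20 distinct days in January, 5 in February.
rows += [(3, f"2020-01-{d:02d} 08:00:00") for d in range(1, 21)]
rows += [(3, f"2020-02-{d:02d} 08:00:00") for d in range(1, 6)]
conn.executemany("INSERT INTO cust VALUES (?, ?)", rows)

# Inner level: one row per (month, patient) that has readings on at least 18 distinct days.
# Outer level: count those patients per month.
active_per_month = conn.execute(
    """
    SELECT mth, COUNT(*) AS num_patients
    FROM (SELECT strftime('%Y-%m', createdat) AS mth, patient_id
          FROM cust
          GROUP BY strftime('%Y-%m', createdat), patient_id
          HAVING COUNT(DISTINCT DATE(createdat)) >= 18) per_patient
    GROUP BY mth
    ORDER BY mth
    """
).fetchall()

print(active_per_month)  # [('2020-01', 2)]
```

With `HAVING COUNT(*) >= 18` instead, patient 2 would also count as active in January, even though their readings cover only 9 days.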

GROUP BY with a condition on WHERE clause

I have the following query:
SELECT
Group as [Grupo],
COUNT(*) as [Total]
FROM
Table
WHERE
Status NOT IN ('Closed', 'Cancelled', 'Resolved') AND
DATEDIFF(day,Submit_Date,GETDATE()) > 30
GROUP BY
Group,
DATEDIFF(day,Submit_Date,GETDATE())
The objective is to get tickets with aging above 30 days. The output is:
Group Total
Group A 4
Group A 1
Group A 2
Group A 2
Group B 1
Group B 1
What I'm hoping to see:
Group Total
Group A 9
Group B 2
I might be missing something dumb here... Can someone help me with this one? Thanks
It seems you just need to group by [Group] only (bracketed here because GROUP is a reserved word):
SELECT
[Group] as [Grupo],
COUNT(*) as [Total]
FROM
Table
WHERE
Status NOT IN ('Closed', 'Cancelled', 'Resolved') AND
DATEDIFF(day,Submit_Date,GETDATE()) > 30
GROUP BY
[Group]
You need to fix the GROUP BY. These keys define each row and apparently you want one row per group.
I would also suggest fixing the date logic:
SELECT [Group] as [Grupo], COUNT(*) as [Total]
FROM Table
WHERE Status NOT IN ('Closed', 'Cancelled', 'Resolved') AND
Submit_Date < DATEADD(DAY, -30, CONVERT(DATE, GETDATE()))
GROUP BY [Group];
Avoiding the function call on Submit_Date should help the optimizer produce the best execution plan.
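As a rough illustration of both points (the extra GROUP BY key and the date comparison on the bare column), here is a sketch in SQLite. The table name `tickets`, the column name `grp`, the fixed `TODAY` value, and the rows are all assumptions made for the example; SQLite's `DATE(:today, '-30 days')` plays the role of `DATEADD`.

```python
import sqlite3

TODAY = "2024-03-01"  # fixed "current date" so the example is reproducible

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (grp TEXT, status TEXT, submit_date TEXT)")
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?, ?)",
    [
        ("A", "Open", "2024-01-01"),
        ("A", "Open", "2024-01-01"),
        ("A", "Open", "2024-01-10"),
        ("A", "Closed", "2023-12-01"),   # excluded by status
        ("A", "Open", "2024-02-20"),     # only 10 days old
        ("B", "Pending", "2024-01-15"),
        ("B", "Open", "2024-01-31"),     # exactly 30 days old, not "above 30"
    ],
)

WHERE = """
    WHERE status NOT IN ('Closed', 'Cancelled', 'Resolved')
      AND submit_date < DATE(:today, '-30 days')
"""

# Grouping by the ticket age as well splits each group into one row per distinct age.
split_rows = conn.execute(
    "SELECT grp, COUNT(*) AS total FROM tickets" + WHERE +
    "GROUP BY grp, julianday(:today) - julianday(submit_date) "
    "ORDER BY grp, total DESC",
    {"today": TODAY},
).fetchall()

# Grouping by the group alone gives one row per group.
one_row_per_group = conn.execute(
    "SELECT grp, COUNT(*) AS total FROM tickets" + WHERE +
    "GROUP BY grp ORDER BY grp",
    {"today": TODAY},
).fetchall()

print(split_rows)         # [('A', 2), ('A', 1), ('B', 1)]
print(one_row_per_group)  # [('A', 3), ('B', 1)]
```

The filter compares the stored column directly against a precomputed cutoff, which is the shape that lets an index on the date column be used.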

Days Since Last Help Ticket was Filed

I am trying to create a report to show me the last date a customer filed a ticket.
Customers can file dozens of tickets. I want to know when the last ticket was filed and show how many days it's been since they have done so.
The fields I have are:
Customer,
Ticket_id,
Date_Closed
All from the Same table "Tickets"
I'm thinking I want to do a ranking of tickets by min date? I tried this query to grab something but it's giving me all the tickets from the customer. (I'm using SQL in a product called Domo)
select * from (select *, rank() over (partition by "Ticket_id"
order by "Date_Closed" desc) as date_order
from tickets ) zd
where date_order = 1
This should be simple enough:
SELECT customer,
MAX (date_closed) last_date,
ROUND((SYSDATE - MAX (date_closed)),0) days_since_last_ticket_logged
FROM tickets
GROUP BY customer
Alternatively, keep the ranking approach but partition by Customer rather than Ticket_id:
select zd.Customer, datediff(day, zd."Date_Closed", current_date) as days_since_last_tkt
from
(select *, rank() over (partition by Customer order by "Date_Closed" desc) as date_order
from tickets) zd
where zd.date_order = 1
Or you can simply do
select customer, datediff(day, max(Date_closed), current_date) as days_since_last_tkt
from tickets
group by customer
To select other fields
select t.*
from tickets t
join (select customer, max(Date_closed) as mxdate,
datediff(day, max(Date_closed), current_date) as days_since_last_tkt
from tickets
group by customer) tt
on t.customer = tt.customer and tt.mxdate = t.date_closed
I would do this with a simple sub-query to select the last closed date for the customer. Then compare this to today with datediff() to get the number of days since last closed.
Select
LastTicket.Customer,
LastTicket.LastClosedDate,
DateDiff(day,LastTicket.LastClosedDate,getdate()) as DaysSinceLastClosed
From
(select
tickets.customer,
max(tickets.Date_Closed) as LastClosedDate
from tickets
Group By tickets.Customer) as LastTicket
Based on the responses this is what I did:
select "Customer",
Max("date_closed") "last_date",
round(datediff(DAY, CURRENT_DATE, max("date_closed")), 0) as "Closed_date"
from tickets
group by "Customer"
ORDER BY "Customer"
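For anyone who wants to check the logic outside Domo, here is a small sketch of the same GROUP BY / MAX approach in SQLite. The customer names, dates, and the fixed `TODAY` value are invented, and `julianday` subtraction stands in for `datediff`, whose argument order differs between dialects.

```python
import sqlite3

TODAY = "2024-03-01"  # fixed "current date" so the result is reproducible

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (customer TEXT, ticket_id INTEGER, date_closed TEXT)")
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?, ?)",
    [
        ("acme", 1, "2024-01-01"),
        ("acme", 2, "2024-02-20"),
        ("globex", 3, "2023-12-31"),
    ],
)

# One row per customer: latest closed date and whole days elapsed since then.
last_ticket = conn.execute(
    """
    SELECT customer,
           MAX(date_closed) AS last_date,
           CAST(julianday(:today) - julianday(MAX(date_closed)) AS INTEGER) AS days_since
    FROM tickets
    GROUP BY customer
    ORDER BY customer
    """,
    {"today": TODAY},
).fetchall()

print(last_ticket)  # [('acme', '2024-02-20', 10), ('globex', '2023-12-31', 61)]
```

Subtracting the older date from the current date gives a positive day count; if your report shows negative numbers, the two date arguments are probably swapped for your dialect.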

Sum Column Results in SQL

How do you sum the results of a calculated column into one number in SQL?
SELECT
id, SUM(cost + r_cost) AS Revenue
FROM
revenue_table
WHERE
signup_date >= '2015-01-01'
GROUP BY
id
ORDER BY
Revenue DESC
LIMIT 20;
This query displays the revenue to date of the top 20 customers. How can I quickly do a total sum of the Revenue to get the total Revenue of the top 20 guys?
Assuming you're using MySQL:
-- Option 1: Simply put your query in the FROM clause and sum the result
select sum(Revenue)
from (select id, sum(cost + r_cost) as Revenue
from revenue_table
where signup_date >= '2015-01-01'
group by id
order by Revenue desc
limit 20) as a
-- Option 2: Use, as suggested by Siyual in his comment, ROLLUP.
-- You'll have to use a subquery too, because
-- LIMIT is applied after the ROLLUP
select id, sum(a.Revenue) as Revenue
from (select id, sum(cost + r_cost) as Revenue
from revenue_table
where signup_date >= '2015-01-01'
group by id
order by Revenue desc
limit 20) as a
GROUP BY id WITH ROLLUP
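Option 1 can be tried quickly with a sketch like the one below. It uses SQLite rather than MySQL, a top 2 instead of a top 20 to keep the data small, and made-up rows; the `top_n` variable is just a name chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE revenue_table (id INTEGER, cost INTEGER, r_cost INTEGER, signup_date TEXT)"
)
conn.executemany(
    "INSERT INTO revenue_table VALUES (?, ?, ?, ?)",
    [
        (1, 100, 50, "2015-02-01"),
        (1, 30, 20, "2015-02-01"),
        (2, 80, 10, "2015-03-01"),
        (3, 500, 0, "2014-12-31"),  # signed up before the cutoff, excluded
        (4, 10, 5, "2015-06-01"),
    ],
)

top_n = 2  # the original question uses 20

# The inner query ranks customers by revenue and keeps the top N;
# the outer query adds those N revenues into a single number.
total_top_revenue = conn.execute(
    """
    SELECT SUM(Revenue)
    FROM (SELECT id, SUM(cost + r_cost) AS Revenue
          FROM revenue_table
          WHERE signup_date >= '2015-01-01'
          GROUP BY id
          ORDER BY Revenue DESC
          LIMIT ?) AS a
    """,
    (top_n,),
).fetchone()[0]

print(total_top_revenue)  # 290 (customer 1: 200, customer 2: 90)
```

The LIMIT has to live inside the subquery; applied to the outer query it would only limit the single summed row.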

To find the last updated record of each month for each policy(another field)

I have a table named a with fields such as eff_date and policy_no.
Now for each policy, consider all the records, and take out the last updated one (eff_date) from each month.
So I need the last updated record for each month for each policy. How would I write a query for this?
I'm not 100 percent on Teradata syntax, but I believe you're after this:
SELECT policy_no,eff_date
FROM (SELECT policy_no,eff_date, ROW_NUMBER() OVER (PARTITION BY policy_no, EXTRACT(YEAR FROM eff_date),EXTRACT(MONTH FROM eff_date) ORDER BY eff_date DESC) as RowRank
FROM a) as sub
WHERE RowRank = 1
I'm assuming when you say by month you also want to differentiate by year, but if not, just remove the EXTRACT(YEAR FROM eff_date) from the PARTITION BY section.
Edit: Update for Teradata syntax.
SELECT * from a
qualify ROW_NUMBER() OVER (PARTITION BY policy_no, EXTRACT(YEAR FROM eff_date),
EXTRACT(MONTH FROM eff_date) ORDER BY eff_date DESC) = 1
The main difficulty is that the GROUP BY needs to cover the combination of policy_no and the month (extracted from the date). For example:
In MySQL:
SELECT policy_no,
month(eff_date),
year(eff_date),
max(eff_date)
FROM myTable
GROUP BY policy_no,
month(eff_date),
year(eff_date);
Update
I saw that derived tables are allowed in Teradata. Using a join to a derived table, here is how to access the full rows:
select * from a,
(SELECT policy_no,
month(eff_date),
year(eff_date),
max(eff_date) as MaxMonthDate
FROM a
GROUP BY policy_no,
month(eff_date),
year(eff_date)
) as b
where a.policy_no = b.policy_no and
a.eff_date = b.MaxMonthDate;
http://www.sqlfiddle.com/#!2/1f728/5
Update (Using Extract)
select * from a,
(SELECT a2.policy_no,
EXTRACT(MONTH FROM a2.eff_date),
EXTRACT(YEAR FROM a2.eff_date),
max(a2.eff_date) as MaxMonthDate
FROM a as a2
GROUP BY a2.policy_no,
EXTRACT(MONTH FROM a2.eff_date),
EXTRACT(YEAR FROM a2.eff_date)
) as b
where a.policy_no = b.policy_no and
a.eff_date = b.MaxMonthDate;
I'm going to suggest looking into window aggregate functions and the QUALIFY clause. I believe the following SQL will work.
SELECT Policy_No
, EXTRACT(MONTH FROM Eff_Date) AS Eff_Month_
, Eff_Date
FROM TableA
QUALIFY ROW_NUMBER() OVER (PARTITION BY Policy_No, EXTRACT(MONTH FROM Eff_Date)
ORDER BY Eff_Date DESC) = 1;
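To compare the approaches on real rows, here is a sketch of the derived-table join from the answers above, run in SQLite rather than Teradata. The `note` column and the sample rows are hypothetical, and `strftime('%Y-%m', ...)` replaces the two EXTRACT calls so that year and month are grouped together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (policy_no TEXT, eff_date TEXT, note TEXT)")
conn.executemany(
    "INSERT INTO a VALUES (?, ?, ?)",
    [
        ("P1", "2023-01-05", "v1"),
        ("P1", "2023-01-20", "v2"),  # last P1 record in Jan 2023
        ("P1", "2023-02-03", "v3"),
        ("P1", "2024-01-10", "v4"),  # January again, but a different year
        ("P2", "2023-01-02", "v6"),
        ("P2", "2023-01-15", "v5"),  # last P2 record in Jan 2023
    ],
)

# Derived table b: latest eff_date per policy and calendar month (year included).
# Joining back to a returns the full row for each of those dates.
last_per_month = conn.execute(
    """
    SELECT a.policy_no, a.eff_date, a.note
    FROM a
    JOIN (SELECT policy_no, MAX(eff_date) AS max_month_date
          FROM a
          GROUP BY policy_no, strftime('%Y-%m', eff_date)) b
      ON a.policy_no = b.policy_no AND a.eff_date = b.max_month_date
    ORDER BY a.policy_no, a.eff_date
    """
).fetchall()

for row in last_per_month:
    print(row)
```

If the partition or grouping used the month alone, the January 2023 and January 2024 rows for P1 would fall into the same bucket and only the 2024 row would survive. Note also that the join returns several rows for a month when a policy has duplicate eff_date values, whereas ROW_NUMBER picks exactly one.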