Two selects based on one select in one query - SQL

Can I do one select and then run different selects on its result, all in one query?
I want to do something like this (which is not working):
select
(select count(*), sum(amount) from view where amount > 5),
(select count(*), sum(amount) from view where amount < 5)
from
(select id, amount from warehouse where createDate = '2019-01-01') as view;
I don't want to select the view first and then run separate selects with additional filtering based on it.

You can use conditional aggregation:
select count(*),
sum(amount) filter (where amount > 5),
sum(amount) filter (where amount < 5)
from warehouse
where createdate = date '2019-01-01'
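If you also need the per-condition counts from the original query, COUNT(*) accepts the same FILTER clause. A minimal sketch, assuming PostgreSQL-style FILTER support:
-- per-condition counts and sums in one pass over the table
select count(*) filter (where amount > 5) as cnt_over,
sum(amount) filter (where amount > 5) as sum_over,
count(*) filter (where amount < 5) as cnt_under,
sum(amount) filter (where amount < 5) as sum_under
from warehouse
where createdate = date '2019-01-01'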

For the general syntax question, you could use a WITH clause:
with v as
(
select id, amount from warehouse where createDate = '2019-01-01'
)
select *
from
(select count(*) as cnt_over, sum(amount) as sum_over from v where amount > 5) as count1,
(select count(*) as cnt_under, sum(amount) as sum_under from v where amount < 5) as count2;
(I don't mean it will be faster; it's just a way to use an "inline" view).

Related

Averaging and Grouping in Google BigQuery

I have a table in Google BigQuery.
I just want to do the following:
Calculate category-wise total units sold
Calculate category-wise average selling price
Consider the below approach:
select 'category' as type, category as name, count(1) as units_sold,
       sum(sale_price) as total_sale, round(avg(sale_price), 2) as average_selling_price
from your_table
group by category
union all
select * from (
  select 'product' as type, product as name, count(1) as units_sold,
         sum(sale_price) as total_sale, round(avg(sale_price), 2) as average_selling_price
  from your_table
  group by product
  order by total_sale desc
  limit 10
)
union all
select * from (
  select 'order_date' as type, '' || order_date as name, count(1) as units_sold,
         sum(sale_price) as total_sale, round(avg(sale_price), 2) as average_selling_price
  from your_table
  group by order_date
  order by total_sale desc
  limit 5
)
order by type
Applied to sample/dummy data, the output would be one row per category, plus the top 10 products and the top 5 order dates, ordered by type.
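For reference, a minimal sketch of dummy data to try the category branch against; the column names category, product, order_date and sale_price are taken from the query above, and the values are made up:
-- BigQuery Standard SQL; replace the inline data with your real table
with your_table as (
  select 'electronics' as category, 'phone' as product, date '2023-01-01' as order_date, 499.0 as sale_price union all
  select 'electronics', 'laptop', date '2023-01-02', 1299.0 union all
  select 'clothing', 'shirt', date '2023-01-01', 29.0 union all
  select 'clothing', 'jeans', date '2023-01-03', 59.0
)
select 'category' as type, category as name, count(1) as units_sold,
       sum(sale_price) as total_sale, round(avg(sale_price), 2) as average_selling_price
from your_table
group by category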

conditional running sum

I'm trying to return the number of unique users that converted over time.
So I have the following query:
WITH CTE
As
(
SELECT '2020-04-01' as date,'userA' as user,1 as goals Union all
SELECT '2020-04-01','userB',0 Union all
SELECT '2020-04-01','userC',0 Union all
SELECT '2020-04-03','userA',1 Union all
SELECT '2020-04-05','userC',1 Union all
SELECT '2020-04-06','userC',0 Union all
SELECT '2020-04-06','userB',0
)
select date,
       COUNT(DISTINCT IF(goals >= 1, user, NULL)) AS cad_converters
from CTE
group by date
I'm trying to count distinct users, but I need a way to apply the distinct count cumulatively across dates. I probably need something like a cumulative sum...
The expected result would be something like this:
date, goals, total_unique_converted_users
'2020-04-01',1,1
'2020-04-01',0,1
'2020-04-01',0,1
'2020-04-03',1,2
'2020-04-05',1,2
'2020-04-06',0,2
'2020-04-06',0,2
Below is for BigQuery Standard SQL
#standardSQL
SELECT t.date, t.goals, total_unique_converted_users
FROM `project.dataset.table` t
LEFT JOIN (
SELECT a.date,
COUNT(DISTINCT IF(b.goals >= 1, b.user, NULL)) AS total_unique_converted_users
FROM `project.dataset.table` a
CROSS JOIN `project.dataset.table` b
WHERE a.date >= b.date
GROUP BY a.date
)
USING(date)
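For example, you can run it against the CTE from the question in place of project.dataset.table:
#standardSQL
WITH CTE AS (
SELECT '2020-04-01' AS date, 'userA' AS user, 1 AS goals UNION ALL
SELECT '2020-04-01', 'userB', 0 UNION ALL
SELECT '2020-04-01', 'userC', 0 UNION ALL
SELECT '2020-04-03', 'userA', 1 UNION ALL
SELECT '2020-04-05', 'userC', 1 UNION ALL
SELECT '2020-04-06', 'userC', 0 UNION ALL
SELECT '2020-04-06', 'userB', 0
)
SELECT t.date, t.goals, total_unique_converted_users
FROM CTE t
LEFT JOIN (
SELECT a.date,
COUNT(DISTINCT IF(b.goals >= 1, b.user, NULL)) AS total_unique_converted_users
FROM CTE a
CROSS JOIN CTE b
WHERE a.date >= b.date
GROUP BY a.date
)
USING(date)
ORDER BY t.date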
I would approach this by tagging when the first goal is scored for each user. Then simply do a cumulative sum:
select cte.* except (seqnum), countif(seqnum = 1) over (order by date)
from (select cte.*,
(case when goals = 1 then row_number() over (partition by user, goals order by date) end) as seqnum
from cte
) cte;
I realize this can be expressed without the case in the subquery:
select cte.* except (seqnum), countif(seqnum = 1 and goals = 1) over (order by date)
from (select cte.*,
row_number() over (partition by user, goals order by date) as seqnum
from cte
) cte;

SQLite Getting multiple results with LIMIT 1

I have the following problem.
Part of a task is to determine the visitor(s) with the most money spent between 2000 and 2020.
It just looks like this.
SELECT UserEMail FROM Visitor
JOIN Ticket ON Visitor.UserEMail = Ticket.VisitorUserEMail
where Ticket.Date> date('2000-01-01') AND Ticket.Date < date ('2020-12-31')
Group by Ticket.VisitorUserEMail
order by SUM(Price) DESC;
Is it possible to output more than one person if both have spent the same amount?
Use rank():
SELECT VisitorUserEMail
FROM (SELECT VisitorUserEMail, SUM(PRICE) as sum_price,
RANK() OVER (ORDER BY SUM(Price) DESC) as seqnum
FROM Ticket t
WHERE t.Date >= date('2000-01-01') AND t.Date < date('2021-01-01')
GROUP BY t.VisitorUserEMail
) t
WHERE seqnum = 1;
Note: You don't need the JOIN, assuming that ticket buyers are actually visitors. If that assumption is not true, then use the JOIN.
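If you do need the JOIN, it slots into the subquery without changing the rest; a sketch based on the tables from the question:
SELECT VisitorUserEMail
FROM (SELECT t.VisitorUserEMail, SUM(t.Price) as sum_price,
             RANK() OVER (ORDER BY SUM(t.Price) DESC) as seqnum
      FROM Ticket t
      JOIN Visitor v ON v.UserEMail = t.VisitorUserEMail
      WHERE t.Date >= '2000-01-01' AND t.Date < '2021-01-01'
      GROUP BY t.VisitorUserEMail
     ) t
WHERE seqnum = 1;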
Use a CTE that returns the total price for each email, and with NOT EXISTS select the rows with the top total price:
WITH cte AS (
SELECT VisitorUserEMail, SUM(Price) SumPrice
FROM Ticket
WHERE Date >= '2000-01-01' AND Date <= '2020-12-31'
GROUP BY VisitorUserEMail
)
SELECT c.VisitorUserEMail
FROM cte c
WHERE NOT EXISTS (
SELECT 1 FROM cte
WHERE SumPrice > c.SumPrice
)
or:
WITH cte AS (
SELECT VisitorUserEMail, SUM(Price) SumPrice
FROM Ticket
WHERE Date >= '2000-01-01' AND Date <= '2020-12-31'
GROUP BY VisitorUserEMail
)
SELECT VisitorUserEMail
FROM cte
WHERE SumPrice = (SELECT MAX(SumPrice) FROM cte)
Note that you don't need the function date() because the result of date('2000-01-01') is '2000-01-01'.
Also I think that the conditions in the WHERE clause should include the =, right?
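Applied to the query in the question, those two notes give something like this (still without any tie handling):
SELECT UserEMail FROM Visitor
JOIN Ticket ON Visitor.UserEMail = Ticket.VisitorUserEMail
WHERE Ticket.Date >= '2000-01-01' AND Ticket.Date <= '2020-12-31'
GROUP BY Ticket.VisitorUserEMail
ORDER BY SUM(Price) DESC;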

Group by in columns and rows, counts and percentages per day

I have a table that has data like following.
attr |time
----------------|--------------------------
abc |2018-08-06 10:17:25.282546
def |2018-08-06 10:17:25.325676
pqr |2018-08-05 10:17:25.366823
abc |2018-08-06 10:17:25.407941
def |2018-08-05 10:17:25.449249
I want to group and count by the attr column row-wise, and also create additional columns showing the counts per day and percentages, as shown below.
attr |day1_count| day1_%| day2_count| day2_%
----------------|----------|-------|-----------|-------
abc |2 |66.6% | 0 | 0.0%
def |1 |33.3% | 1 | 50.0%
pqr |0 |0.0% | 1 | 50.0%
I'm able to display one count using GROUP BY, but I can't figure out how to separate the days into multiple columns. I tried to generate the day1 percentage with:
SELECT attr, count(attr), count(attr) / sum(sub.day1_count) * 100 as percentage from (
SELECT attr, count(*) as day1_count FROM my_table WHERE DATEPART(week, time) = DATEPART(day, GETDate()) GROUP BY attr) as sub
GROUP BY attr;
But this is also not giving me the correct answer; I'm getting all zeroes for the percentage and 1 for the count. Any help is appreciated. I'm trying to do this in Redshift, which follows PostgreSQL syntax.
Let's nail the logic before presenting:
with CTE1 as
(
select attr, date_part(day, time) as theday, count(*) as thecount
from MyTable
group by attr, date_part(day, time)
)
, CTE2 as
(
select theday, sum(thecount) as daytotal
from CTE1
group by theday
)
select t1.attr, t1.theday, t1.thecount, t1.thecount::float / t2.daytotal as percentofday
from CTE1 t1
inner join CTE2 t2
on t1.theday = t2.theday
From here you can pivot into a day-by-day layout if you feel the need.
I am trying to enhance @johnHC's query. By the way, if you need 7 days, then you have to add those days in the CASE WHEN expressions:
with CTE1 as
(
select attr, time::date as theday, count(*) as thecount
from t group by attr,time::date
)
, CTE2 as
(
select theday, sum(thecount) as daytotal
from CTE1
group by theday
)
,
CTE3 as
(
select t1.attr, EXTRACT(DOW FROM t1.theday) as day_nmbr,t1.theday, t1.thecount, t1.thecount/t2.daytotal as percentofday
from CTE1 t1
inner join CTE2 t2
on t1.theday = t2.theday
)
select CTE3.attr,
max(case when day_nmbr=0 then CTE3.thecount end) as day1Cnt,
max(case when day_nmbr=0 then percentofday end) as day1,
max(case when day_nmbr=1 then CTE3.thecount end) as day2Cnt,
max( case when day_nmbr=1 then percentofday end) day2
from CTE3 group by CTE3.attr
http://sqlfiddle.com/#!17/54ace/20
In case you have only 2 days:
http://sqlfiddle.com/#!17/3bdad/3 (days descending as in your example from left to right)
http://sqlfiddle.com/#!17/3bdad/5 (days ascending)
The main idea is already mentioned in the other answers. Instead of joining the CTEs to calculate the values, I am using window functions, which is a bit shorter and, I think, more readable. The pivot is done the same way.
SELECT
attr,
COALESCE(max(count) FILTER (WHERE day_number = 0), 0) as day1_count, -- D
COALESCE(max(percent) FILTER (WHERE day_number = 0), 0) as day1_percent,
COALESCE(max(count) FILTER (WHERE day_number = 1), 0) as day2_count,
COALESCE(max(percent) FILTER (WHERE day_number = 1), 0) as day2_percent
/*
Add more days here
*/
FROM(
SELECT *, (count::float/count_per_day)::decimal(5, 2) as percent -- C
FROM (
SELECT DISTINCT
attr,
MAX(time::date) OVER () - time::date as day_number, -- B
count(*) OVER (partition by time::date, attr) as count, -- A
count(*) OVER (partition by time::date) as count_per_day
FROM test_table
)s
)s
GROUP BY attr
ORDER BY attr
A: counting the rows per day, and counting the rows per day AND attr.
B: for more readability I convert the dates into numbers. Here I take the difference between the current row's date and the maximum date available in the table, so I get a counter from 0 (first day) up to n - 1 (last day).
C: calculating the percentage and rounding.
D: pivot by filtering on the day numbers. The COALESCE turns the NULL values into 0. To add more days you can duplicate these columns.
Edit: Made the day counter more flexible for more days; new SQL Fiddle
Basically, I see this as conditional aggregation. But you need to get an enumerator for the date for the pivoting. So:
SELECT attr,
COUNT(*) FILTER (WHERE day_number = 1) as day1_count,
COUNT(*) FILTER (WHERE day_number = 1) / cnt as day1_percent,
COUNT(*) FILTER (WHERE day_number = 2) as day2_count,
COUNT(*) FILTER (WHERE day_number = 2) / cnt as day2_percent
FROM (SELECT attr,
DENSE_RANK() OVER (ORDER BY time::date DESC) as day_number,
1.0 * COUNT(*) OVER (PARTITION BY attr) as cnt
FROM test_table
) s
GROUP BY attr, cnt
ORDER BY attr;
Here is a SQL Fiddle.

Selecting only if at least one row matches condition

I have a select statement and want to return all rows only if at least one of them has a date at least 60 days before today.
The problem is that I have an OUTER APPLY which returns the column I want to compare, and the values come from different tables (one belongs to cash items, the other to card items).
Considering I have the following:
OUTER APPLY (
SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_cash_items(loans.id_cash) AS cash_item
HAVING loans.id_product_type = 1 --Cash
UNION
SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_card_items(loans.id_card) AS card_item
HAVING loans.id_product_type = 2 --Card
) AS items
I want to return all the rows only when DATEDIFF(DAY, MIN(items.item_date), GETDATE()) >= 60, but I want them all even if only one matches this condition.
What would be the best approach to do this?
EDIT
To make it clearer, I'll explain the use case:
I need to show the items of every loan, but only if the client is more than 60 days past the due date on any of them.
I am also not sure what you expect, but how about this:
WITH items
AS (SELECT Count(*) AS quantity,
Min(date) AS item_date
FROM dbo.Get_cash_items(loans.id_cash) AS cash_item
HAVING loans.id_product_type = 1
UNION
SELECT Count(*) AS quantity,
Min(date) AS item_date
FROM dbo.Get_card_items(loans.id_card) AS card_item
HAVING loans.id_product_type = 2)
SELECT a.*
FROM items AS a,
(SELECT TOP 1 *
FROM items AS b
WHERE Datediff(day, b.item_date, Getdate()) >= 60) AS c
It's a sort of CROSS JOIN, where table c will have one or zero rows depending on whether the condition is met; it will then join to every row in the other table.
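The same gate can also be written with EXISTS, which some find easier to read; a sketch using the same items CTE:
SELECT a.*
FROM items AS a
WHERE EXISTS (
    SELECT 1
    FROM items AS b
    WHERE DATEDIFF(DAY, b.item_date, GETDATE()) >= 60
);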
Have you tried something like this?
SELECT a.quantity, a.item_date
FROM
(SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_cash_items(loans.id_cash) AS cash_item
HAVING loans.id_product_type = 1
UNION
SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_card_items(loans.id_card) AS card_item
HAVING loans.id_product_type = 2) a
WHERE DATEDIFF(day, a.item_date, GETDATE()) >= 60
Typically I do this using a CTE to select the keys of the records I want, and then join on that. Below is an attempt at an example:
with LateClients as
(
SELECT LoanId FROM Payment WHERE /* payment date later than 60 days */
)
SELECT p.LoanId,
p.UserId
FROM Payment as p
INNER JOIN LateClients as LC
ON p.LoanId = lc.LoanId
ORDER BY p.LoanId, p.UserId
I know it's a bit different from the code you posted, but this is a simplified example that should explain the concept. Good luck!
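For completeness, here is one way the placeholder condition might be filled in, assuming a hypothetical DueDate column on Payment (T-SQL):
WITH LateClients AS
(
SELECT LoanId
FROM Payment
WHERE DATEDIFF(DAY, DueDate, GETDATE()) > 60 -- DueDate is a hypothetical column
)
SELECT p.LoanId,
p.UserId
FROM Payment AS p
INNER JOIN LateClients AS lc
ON p.LoanId = lc.LoanId
ORDER BY p.LoanId, p.UserId;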