SQL: calculate different columns with events and dates

I have these columns:
user_id (xxxx)
order_id (xxxx)
order_date (2020-07-01)
I would like to have per user_id the following calculated columns:
ordered at least 1 time between 2020-07-01 and 2020-12-31 (6m)
ordered at least 3 times between 2020-07-01 and 2020-12-31 (6m)
ordered at least 1 time between 2020-07-01 and 2020-09-30 (3m)
ordered at least 1 time between 2020-07-01 and 2020-08-31 (1m)
The result value could be e.g. "ordered" vs "not ordered" to populate the columns.
I'm using Redshift.

You can use GROUP BY and conditional aggregation as follows:
select user_id,
       case when count(case when order_date between xxxx1 and yyyy1 then 1 end) >= 1
             and count(case when order_date between xxxx2 and yyyy2 then 1 end) >= 3
             and count(case when order_date between xxxx3 and yyyy3 then 1 end) >= 1
             and count(case when order_date between xxxx4 and yyyy4 then 1 end) >= 1
            then 'Yes' else 'No' end as res_
from your_table -- where ... -- use a where condition to restrict the result if required
group by user_id
Replace xxxxn and yyyyn with your actual start and end dates.
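For example, with the question's date ranges filled in and a separate 'ordered' / 'not ordered' flag per requirement (the column names here are just illustrative), the same idea looks like this in Redshift:
select user_id,
       -- one calculated column per condition, as asked in the question
       case when count(case when order_date between '2020-07-01' and '2020-12-31' then 1 end) >= 1
            then 'ordered' else 'not ordered' end as ordered_1x_6m,
       case when count(case when order_date between '2020-07-01' and '2020-12-31' then 1 end) >= 3
            then 'ordered' else 'not ordered' end as ordered_3x_6m,
       case when count(case when order_date between '2020-07-01' and '2020-09-30' then 1 end) >= 1
            then 'ordered' else 'not ordered' end as ordered_1x_3m,
       case when count(case when order_date between '2020-07-01' and '2020-08-31' then 1 end) >= 1
            then 'ordered' else 'not ordered' end as ordered_1x_1m
from your_table
group by user_id;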

Related

Create a SQL query that merges rows

I have a table that stores the dates when an order was opened and closed. It's similar to this:
id | orderID | status | date
1  | 1       | opened | 2020-01-01
2  | 1       | closed | 2020-01-05
3  | 2       | opened | 2020-01-02
I need an SQL query that returns the following result:
orderId | openedDate | closedDate
1       | 2020-01-01 | 2020-01-05
2       | 2020-01-02 | NULL
This is what I've tried:
SELECT
orderId,
CASE WHEN status = 'opened' THEN date END AS openedDate,
CASE WHEN status = 'closed' THEN date END AS closedDate
FROM
orders
GROUP BY
orderId;
But I'm not getting the desired result.
You should get an error, because the columns in the SELECT list are inconsistent with the GROUP BY. Use aggregation:
SELECT orderId,
MAX(CASE WHEN status = 'opened' THEN date END) AS openedDate,
MAX(CASE WHEN status = 'closed' THEN date END) AS closedDate
FROM orders
GROUP BY orderId;
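If your database happens to be PostgreSQL 9.4 or later (an assumption, since the question doesn't name the RDBMS), the same pivot can also be written with the FILTER clause:
SELECT orderId,
       MAX(date) FILTER (WHERE status = 'opened') AS openedDate,
       MAX(date) FILTER (WHERE status = 'closed') AS closedDate
FROM orders
GROUP BY orderId;
Either way, an order with no 'closed' row (like orderID 2) comes back with a NULL closedDate, which matches the expected result.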

Calculating Percentages in Postgres

I'm completely new to PostgreSQL. I have the following table called my_table:
a b c date
1 0 good 2019-05-02
0 1 good 2019-05-02
1 1 bad 2019-05-02
1 1 good 2019-05-02
1 0 bad 2019-05-01
0 1 good 2019-05-01
1 1 bad 2019-05-01
0 0 bad 2019-05-01
I want to calculate the percentage of 'good' from column c for each date. I know how to get the number of 'good':
SELECT COUNT(c), date FROM my_table WHERE c != 'bad' GROUP BY date;
That returns:
count date
3 2019-05-02
1 2019-05-01
My goal is to get this:
date perc_good
2019-05-02 75
2019-05-01 25
So I tried the following:
SELECT date,
(SELECT COUNT(c)
FROM my_table
WHERE c != 'bad'
GROUP BY date) / COUNT(c) * 100 as perc_good
FROM my_table
GROUP BY date;
And I get an error saying
more than one row returned by a subquery used as an expression.
I found this answer but I'm not sure how, or whether, it applies to my case:
Calculating percentage in PostgreSql
How do I go about calculating the percentage for multiple rows?
avg() is convenient for this purpose:
select date,
avg( (c = 'good')::int ) * 100 as percent_good
from t
group by date
order by date;
How does this work? c = 'good' is a boolean expression. The ::int converts it to a number, with 1 for true and 0 for false. The average is then the average of a bunch of 1s and 0s -- and is the ratio of the true values.
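As a quick illustration of the cast (not part of the original answer):
select true::int as as_one, false::int as as_zero;   -- returns 1 and 0
Multiplying that average by 100, as in the query above, turns the ratio into a percentage.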
For this case you need to use conditional AVG():
SELECT
date,
100 * avg(case when c = 'good' then 1 else 0 end) perc_good
FROM my_table
GROUP BY date;
You could use a conditional sum to get the good count and a plain count for the total. Below is an exhaustive code sample:
select date
     , count(c) total
     , sum(case when c='good' then 1 else 0 end) total_good
     , sum(case when c='bad' then 1 else 0 end) total_bad
     , (sum(case when c='good' then 1 else 0 end) * 100.0 / count(c)) perc_good  -- 100.0 avoids integer division
     , (sum(case when c='bad' then 1 else 0 end) * 100.0 / count(c)) perc_bad
from my_table
group by date
and for your result:
select date
     , (sum(case when c='good' then 1 else 0 end) * 100.0 / count(c)) perc_good
from my_table
group by date
or, as suggested by a_horse_with_no_name, using count(*) filter():
select date
     , (count(*) filter (where c='good') * 100.0 / count(*)) perc_good
from my_table
group by date
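If you want whole-number percentages like the ones in the desired output, a small variation of the filter() version rounds after forcing numeric division:
select date
     , round(100.0 * count(*) filter (where c = 'good') / count(*)) as perc_good
from my_table
group by date;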

How to calculate the average per day for different years

I am trying to calculate the average number of times apple with an increment of 3 is shown per day in both 2017 and 2018. To do this I am trying to use setNum and ExNum, which have a difference of 3.
ID Year Text setNum ExNum
-------------------------------------------------
1 2018-01-21 apple 1 3
2 2017-08-03 apple 2 5
3 2018-03-02 banana 1 3
4 2018-05-22 apple 1 3
5 2018-12-12 apple 3 6
6 2017-04-13 apple 3 6
My current query to obtain this is:
SELECT
2017 = avg(case when Year BETWEEN '2017-01-01' AND '2017-12-31' then 1 else 0 end),
2018 = avg(case when Year BETWEEN '2018-01-01' AND '2018-12-31' then 1 else 0 end)
FROM
exampleTable
WHERE
Text LIKE '%apple%'
This currently outputs:
2017 2018
0 0
Note: The original table had a single text column Increment, which had values like 1-3. That is, the 1-3 represented a setNum of 1 and an ExNum of 3.
Your decision to store a numerical increment range as text is not a good one, and ideally you should be storing the two points of the increment in separate columns. That being said, we can do some string olympics to work around this:
SELECT
YEAR(Year) AS Year,
COUNT(CASE WHEN 3 BETWEEN CAST(LEFT(Increment, CHARINDEX('-', Increment)-1) AS int) AND
CAST(RIGHT(Increment, LEN(Increment) - CHARINDEX('-', Increment)) AS int)
THEN 1 END) AS apple_3_cnt
FROM exampleTable
WHERE
TEXT LIKE '%apple%'
GROUP BY
YEAR(year);
Here I am aggregating by year and then taking a conditional count of records, for each year, where the apple increment range contains 3. To do this, I separate out the two ends of the increment range and then convert them to integers.
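For example, on a sample value like '1-3', the two expressions split out the endpoints as follows (shown only to illustrate the string handling):
SELECT LEFT('1-3', CHARINDEX('-', '1-3') - 1)           AS range_start,  -- '1'
       RIGHT('1-3', LEN('1-3') - CHARINDEX('-', '1-3')) AS range_end;    -- '3'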
Edit:
Based on your updated table, we can try a simpler query:
SELECT
YEAR(Year) AS Year,
COUNT(CASE WHEN 3 BETWEEN setNum AND ExNum THEN 1 END) AS apple_3_cnt
FROM exampleTable
WHERE
TEXT LIKE '%apple%'
GROUP BY
YEAR(year);
Try the below:
SELECT
avg(case when Year BETWEEN '2017-01-01' AND '2017-12-31' then setNum+ExNum end) as 2017,
avg(case when Year BETWEEN '2018-01-01' AND '2018-12-31' then setNum+ExNum end) as 2018
FROM
exampleTable
WHERE
Text LIKE '%apple%'
Your query is fine; the only problem is how, and to what, you assign the results. Use this syntax instead:
SELECT
avg(case when Year BETWEEN '2017-01-01' AND '2017-12-31' then 1 else 0 end) as A2017,
avg(case when Year BETWEEN '2018-01-01' AND '2018-12-31' then 1 else 0 end) as A2018
FROM
exampleTable
WHERE
Text LIKE '%apple%'
Note that you can't use bare numbers as column aliases.

How to count rows matching a filter in aggregate operations

Sorry for the imprecise title; this is the best I could come up with.
I have a table like this:
date | type | qty
2018-03-21 03:30:00 | A | 3
2018-03-22 03:30:00 | A | 3
2018-03-22 04:57:00 | A | 1
2018-03-22 05:18:00 | B | 3
I do some aggregations on this table, e.g. sum of qty over day or over month.
In the same query I need to count how many rows are of type B, while retrieving the total qty on that day.
So,
select sum(qty), date_trunc('day', date) ... group by date_trunc('day', date);
Now, what I need to do next is to count how many rows are of type B. So the expected result is
day | Bcount | totqty
2018-03-21 | 0 | 3
2018-03-22 | 1 | 7
I thought of using partitions, but I'm not sure how to use them in this specific case.
Edit: thank you all, guys, for your answers. This was soooooooo easy 🙄
Since the 9.4 release, Postgres lets you replace the CASE WHEN expressions inside these aggregate functions with the new FILTER clause; use the query below:
select date_trunc('day', date) AS Day,
Count(TYPE) filter (where Type = 'B') AS BCount,
Sum(qty) AS TotalQty
FROM Table1 group by date_trunc('day', date);
For a demo, follow the link:
http://sqlfiddle.com/#!17/a5203/14
Before the Postgres 9.4 release, if you wanted to count a subset of records while executing an aggregate function, you had to use a CASE WHEN expression, like this:
SELECT date_trunc('day', date) AS Day,
SUM(CASE WHEN TYPE = 'B' THEN 1 ELSE 0 END) AS BCount,
Sum(qty) AS TotalQty
FROM Table1 group by date_trunc('day', date);
Use a case expression to do conditional aggregation:
select ...
sum(case when type = 'B' then 1 else 0 end) as Bcount
...
select date_trunc('day', date) ,sum(qty),
SUM (CASE WHEN type = 'B' THEN 1 ELSE 0 END) AS Bcount
FROM Table1
group by date_trunc('day', date);
Demo
http://sqlfiddle.com/#!17/a5203/13

How to deal with this issue in SQL with GROUP BY

I have this data called pdays:
id | time | date_time  | type_id
1  | 2    | 2016-03-05 | 1
2  | 5    | 2016-03-05 | 1
3  | 3    | 2016-03-06 | 2
4  | 7    | 2016-03-07 | 3
5  | 2    | 2016-03-10 | 1
6  | 1    | 2016-03-12 | 3
I would like to calculate the total time, SUM(time), for weekdays and for weekends, grouped by type_id.
The expected output looks like this:
type_id | weekday_time | weekends_time
1       | 7            | 2
2       | 3            | 0
3       | 7            | 1
These are my thoughts:
First, I need to extract the day number from date_time. Second, I need to check whether that day number falls into (5,6,12,13,19,20,26,27), which are the weekend day numbers (note: this data covers a single month, so I do not need to worry about the weekend day numbers changing in the next month). Finally, I do the aggregation, grouping on type_id.
CASE WHEN pday.date IN(5,6,12,13,19,20,26,27) THEN 'weekend' ELSE 'weekday' END
This is the case part I think I should use.
First, your expected output appears to be inconsistent: given your list of weekend day numbers, the weekday and weekend totals for some type_ids look swapped.
This should get you what you want in SQL Server, and it is very close for other RDBMSs. If you say which RDBMS you use, I'll adjust:
;with cte AS (
    select type_id,
           -- DAY(date_time) extracts the day-of-month the question's pday.date referred to
           CASE WHEN DAY(date_time) IN (5,6,12,13,19,20,26,27) THEN 'weekday' ELSE 'weekend' END AS day_type,
           SUM(time) AS time_sum
    FROM pdays
    GROUP BY
        type_id,
        CASE WHEN DAY(date_time) IN (5,6,12,13,19,20,26,27) THEN 'weekday' ELSE 'weekend' END
)
SELECT type_id,
       SUM(CASE WHEN day_type = 'weekday' THEN time_sum ELSE 0 END) AS weekday_time,
       SUM(CASE WHEN day_type = 'weekend' THEN time_sum ELSE 0 END) AS weekend_time
FROM cte
GROUP BY type_id
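If you were on Postgres rather than SQL Server (an assumption, not something stated in the question), a minimal sketch could derive the weekend flag from the actual day of week instead of a hard-coded list of day numbers. Note that it treats 2016-03-05, 2016-03-06 and 2016-03-12 as weekend days, so the type_id 1 and 2 totals follow the calendar rather than the expected output shown above:
-- EXTRACT(DOW ...) returns 0 = Sunday .. 6 = Saturday
SELECT type_id,
       SUM(CASE WHEN EXTRACT(DOW FROM date_time) NOT IN (0, 6) THEN time ELSE 0 END) AS weekday_time,
       SUM(CASE WHEN EXTRACT(DOW FROM date_time) IN (0, 6) THEN time ELSE 0 END) AS weekend_time
FROM pdays
GROUP BY type_id;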