Create a SQL query that merges rows - sql

I have a table that stores the dates when an order was opened and closed. It's similar to this:
id  orderID  status  date
1   1        opened  2020-01-01
2   1        closed  2020-01-05
3   2        opened  2020-01-02
I need an SQL query that returns the following result:
orderId  openedDate  closedDate
1        2020-01-01  2020-01-05
2        2020-01-02  NULL
This is what I've tried:
SELECT
orderId,
CASE WHEN status = 'opened' THEN date END AS openedDate,
CASE WHEN status = 'closed' THEN date END AS closedDate
FROM
orders
GROUP BY
orderId;
But I'm not getting the desired result.

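In most databases this query raises an error, because openedDate and closedDate are neither aggregated nor listed in the GROUP BY. (Some engines, such as MySQL with ONLY_FULL_GROUP_BY disabled, run it anyway and pick a value from an arbitrary row in each group.) Use aggregation: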
SELECT orderId,
MAX(CASE WHEN status = 'opened' THEN date END) AS openedDate,
MAX(CASE WHEN status = 'closed' THEN date END) AS closedDate
FROM orders
GROUP BY orderId;
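
To check the aggregated query against the sample rows from the question, here is a minimal sketch that runs it with Python's built-in sqlite3 module. The in-memory database and the TEXT column types are assumptions for illustration; the original DBMS is not stated.

```python
import sqlite3

# Hypothetical in-memory copy of the question's orders table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER, orderId INTEGER, status TEXT, date TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, 1, "opened", "2020-01-01"),
        (2, 1, "closed", "2020-01-05"),
        (3, 2, "opened", "2020-01-02"),
    ],
)

# MAX ignores NULLs, so each CASE collapses to the single matching date
# per order, or stays NULL when no row has that status.
rows = conn.execute(
    """
    SELECT orderId,
           MAX(CASE WHEN status = 'opened' THEN date END) AS openedDate,
           MAX(CASE WHEN status = 'closed' THEN date END) AS closedDate
    FROM orders
    GROUP BY orderId
    ORDER BY orderId
    """
).fetchall()

for row in rows:
    print(row)
```

Order 2 has no 'closed' row, so its closedDate comes back as NULL (None in Python).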

Related

Finding Cumulative Sum of column with string data type

I need to calculate a cumulative sum that counts only the tickets that are open. I have a table with id, open_date, ticket_status, and ticket_closed.
I'm not sure how to calculate the cumulative sum for only the open tickets when the status column is a string.
I have a table tb with the following structure:
id  open_date   ticket_status  ticket_closed
1   01-01-2022  open
2   01-01-2022  closed         01-02-2022
3   01-01-2022  open
4   01-02-2022  open
5   01-03-2022  open
I want the output to be the following:
id  open_date   ticket_status  ticket_closed  cumulative_sum
1   01-01-2022  open                          1
2   01-01-2022  closed         01-02-2022
3   01-01-2022  open                          2 (1+1)
4   01-02-2022  open                          3 (2+1)
5   01-03-2022  open                          4 (3+1)
I have tried the following code and it's not giving me the output I'm expecting
SELECT id, open_date,
SUM(CASE WHEN 'ticket_status' = 'open' THEN 1 ELSE NULL END) OVER (ORDER BY open_date ASC ROWS UNBOUNDED PRECEDING)
FROM tb
Any help would be appreciated!
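Try the following. The column name must not be wrapped in single quotes: 'ticket_status' is a string literal that never equals 'open', so the original CASE never matches.
SUM(CASE WHEN ticket_status = 'open' THEN 1 ELSE 0 END) OVER (ORDER BY open_date, id)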
It looks like your "id" field identifies the order of insertion for your records. If that's the case, you can use it inside the ORDER BY clause of a COUNT window function, and then show the running count only on rows where ticket_status = 'open'.
SELECT id, open_date,
CASE WHEN ticket_status = 'open'
THEN COUNT(CASE WHEN ticket_status = 'open' THEN 1 END) OVER (ORDER BY id)
END AS cumulative_sum
FROM tb
This was demonstrated in MySQL, although the query is likely to work on most common DBMSs.
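
As a quick check of the running-count approach, the sketch below loads the question's five rows into an in-memory SQLite database and runs the window query. SQLite (3.25 or later, for window functions) is an assumption used here only to make the example self-contained; the column alias cumulative_sum is taken from the desired output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tb (id INTEGER, open_date TEXT, "
    "ticket_status TEXT, ticket_closed TEXT)"
)
conn.executemany(
    "INSERT INTO tb VALUES (?, ?, ?, ?)",
    [
        (1, "01-01-2022", "open", None),
        (2, "01-01-2022", "closed", "01-02-2022"),
        (3, "01-01-2022", "open", None),
        (4, "01-02-2022", "open", None),
        (5, "01-03-2022", "open", None),
    ],
)

# The inner CASE feeds COUNT only the open rows; the outer CASE hides the
# running count on rows that are not open.
rows = conn.execute(
    """
    SELECT id, ticket_status,
           CASE WHEN ticket_status = 'open'
                THEN COUNT(CASE WHEN ticket_status = 'open' THEN 1 END)
                     OVER (ORDER BY id)
           END AS cumulative_sum
    FROM tb
    ORDER BY id
    """
).fetchall()

for row in rows:
    print(row)
```

Ordering by id rather than open_date sidesteps the fact that the MM-DD-YYYY strings do not sort chronologically as text.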

SQL: calculate different columns with events and dates

I have these columns:
user_id (xxxx)
order_id (xxxx)
order_date (2020-07-01)
I would like to have per user_id the following calculated columns:
ordered at least 1 time or more, between 2020-07-01 to 2020-12-31 (6m)
ordered at least 3 times or more, between 2020-07-01 to 2020-12-31 (6m)
ordered at least 1 time or more, between 2020-07-01 to 2020-09-30 (3m)
ordered at least 1 time or more, between 2020-07-01 to 2020-08-31 (1m)
The result value could be e.g. "ordered" vs "not ordered" to populate the columns.
I'm using Redshift.
You can use GROUP BY and conditional aggregation as follows:
select user_id,
case when
count(case when order_date between xxxx1 and yyyy1 then 1 end) >= 1
and count(case when order_date between xxxx2 and yyyy2 then 1 end) >= 3
and count(case when order_date between xxxx3 and yyyy3 then 1 end) >= 1
and count(case when order_date between xxxx4 and yyyy4 then 1 end) >= 1
then 'Yes' else 'No' end as res_
from your_table -- where ... -- use where condition to restrict the result if required
group by user_id
Replace xxxxn and yyyyn with the start and end dates of each range. As written, this returns a single flag that is 'Yes' only when all four conditions hold; to get one column per condition, give each count its own case expression.
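
Since the question asks for one column per condition, here is a minimal sketch of that variant, run through Python's sqlite3 module rather than Redshift. The table name orders, the column aliases, and the three sample users are hypothetical; dates are assumed to be stored as ISO-8601 text without a time part, so BETWEEN compares them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (user_id INTEGER, order_id INTEGER, order_date TEXT)"
)
# Hypothetical sample data: user 1 orders three times in the six-month
# window, user 2 once (in October), user 3 only after the window.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, 10, "2020-07-05"),
        (1, 11, "2020-08-10"),
        (1, 12, "2020-11-20"),
        (2, 20, "2020-10-15"),
        (3, 30, "2021-01-05"),
    ],
)

# One CASE per requested column; each COUNT only sees orders in its range.
rows = conn.execute(
    """
    SELECT user_id,
           CASE WHEN COUNT(CASE WHEN order_date BETWEEN '2020-07-01' AND '2020-12-31'
                                THEN 1 END) >= 1
                THEN 'ordered' ELSE 'not ordered' END AS min1_6m,
           CASE WHEN COUNT(CASE WHEN order_date BETWEEN '2020-07-01' AND '2020-12-31'
                                THEN 1 END) >= 3
                THEN 'ordered' ELSE 'not ordered' END AS min3_6m,
           CASE WHEN COUNT(CASE WHEN order_date BETWEEN '2020-07-01' AND '2020-09-30'
                                THEN 1 END) >= 1
                THEN 'ordered' ELSE 'not ordered' END AS min1_3m,
           CASE WHEN COUNT(CASE WHEN order_date BETWEEN '2020-07-01' AND '2020-08-31'
                                THEN 1 END) >= 1
                THEN 'ordered' ELSE 'not ordered' END AS min1_1m
    FROM orders
    GROUP BY user_id
    ORDER BY user_id
    """
).fetchall()

for row in rows:
    print(row)
```

Users with no orders at all do not appear in the result, because the query groups over the orders table only.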

Select start and end dates for changing values in SQL

I have a database with accounts and historical status changes
select Date, Account, OldStatus, NewStatus from HistoricalCodes
order by Account, Date
Date        Account  OldStatus  NewStatus
2020-01-01  12345    1          2
2020-10-01  12345    2          3
2020-11-01  12345    3          2
2020-12-01  12345    2          1
2020-01-01  54321    2          3
2020-09-01  54321    3          2
2020-12-01  54321    2          3
For every account I need to determine Start Date and End Date when Status = 2. An additional challenge is that the status can change back and forth multiple times. Is there a way in SQL to create something like this for at least first two timeframes when account was in 2? Any ideas?
Account  StartDt_1   EndDt_1     StartDt_2   EndDt_2
12345    2020-01-01  2020-10-01  2020-11-01  2020-12-01
54321    2020-09-01  2020-12-01
I would suggest putting this information in separate rows:
select t.*
from (select account, newstatus, date as startdate,
lead(date) over (partition by account order by date) as enddate
from t
) t
where newstatus = 2;
This produces a separate row for each period when an account has a status of 2. This is better than putting the dates in separate pairs of columns, because you do not need to know the maximum number of periods of status = 2 when you write the query.
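
The one-row-per-period query can be tried against the question's data with the sketch below. It uses Python's sqlite3 module with an in-memory table (SQLite 3.25 or later is assumed for LEAD); the table name HistoricalCodes comes from the question, and the subquery alias p is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE HistoricalCodes "
    "(Date TEXT, Account INTEGER, OldStatus INTEGER, NewStatus INTEGER)"
)
conn.executemany(
    "INSERT INTO HistoricalCodes VALUES (?, ?, ?, ?)",
    [
        ("2020-01-01", 12345, 1, 2),
        ("2020-10-01", 12345, 2, 3),
        ("2020-11-01", 12345, 3, 2),
        ("2020-12-01", 12345, 2, 1),
        ("2020-01-01", 54321, 2, 3),
        ("2020-09-01", 54321, 3, 2),
        ("2020-12-01", 54321, 2, 3),
    ],
)

# LEAD looks at the account's next status change, which is the date the
# current status ended. Filtering happens outside the subquery so that
# LEAD still sees every change, not just the ones into status 2.
periods = conn.execute(
    """
    SELECT p.Account, p.startdate, p.enddate
    FROM (SELECT Account, NewStatus, Date AS startdate,
                 LEAD(Date) OVER (PARTITION BY Account ORDER BY Date) AS enddate
          FROM HistoricalCodes
         ) AS p
    WHERE p.NewStatus = 2
    ORDER BY p.Account, p.startdate
    """
).fetchall()

for period in periods:
    print(period)
```

A period that is still open at the end of the data would get a NULL enddate, since LEAD has no following row to read.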
For a fixed maximum of status changes per account, you can use window functions and conditional aggregation:
select account,
max(case when rn = 1 then date end) as start_dt1,
max(case when rn = 1 then lead_date end) as end_dt1,
max(case when rn = 2 then date end) as start_dt2,
max(case when rn = 2 then lead_date end) as end_dt2
from (
select t.*,
row_number() over(partition by account, newstatus order by date) as rn,
lead(date) over(partition by account order by date) as lead_date
from mytable t
) t
where newstatus = 2
group by account
You can extend the select clause with more conditional expressions to handle more possible ranges per account.
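
For the fixed-column layout, a minimal sketch with Python's sqlite3 module is shown here, again on the question's rows. The table name mytable follows the answer; using SQLite (3.25 or later) instead of the asker's unstated DBMS is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable "
    "(date TEXT, account INTEGER, oldstatus INTEGER, newstatus INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        ("2020-01-01", 12345, 1, 2),
        ("2020-10-01", 12345, 2, 3),
        ("2020-11-01", 12345, 3, 2),
        ("2020-12-01", 12345, 2, 1),
        ("2020-01-01", 54321, 2, 3),
        ("2020-09-01", 54321, 3, 2),
        ("2020-12-01", 54321, 2, 3),
    ],
)

# rn numbers each account's entries into a given status; after filtering
# to status 2, rn = 1 is the first period and rn = 2 the second.
pivoted = conn.execute(
    """
    SELECT account,
           MAX(CASE WHEN rn = 1 THEN date END)      AS start_dt1,
           MAX(CASE WHEN rn = 1 THEN lead_date END) AS end_dt1,
           MAX(CASE WHEN rn = 2 THEN date END)      AS start_dt2,
           MAX(CASE WHEN rn = 2 THEN lead_date END) AS end_dt2
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY account, newstatus
                                    ORDER BY date) AS rn,
                 LEAD(date) OVER (PARTITION BY account ORDER BY date) AS lead_date
          FROM mytable t
         ) t
    WHERE newstatus = 2
    GROUP BY account
    ORDER BY account
    """
).fetchall()

for row in pivoted:
    print(row)
```

Account 54321 enters status 2 only once in the data, so its second pair of columns is NULL.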

Flagging active customers - At least one transaction every month

Once a customer is registered, if the customer has made at least one transaction in every month between date_registered and the current date, flag the customer as active; otherwise flag the customer as inactive.
Note: Every customer has a different date_registered.
I tried this, but it doesn't work because a few of the customers were onboarded in the middle of the year.
E.g.:
-------------------------------------
txn_id | txn_date | name | amount
-------------------------------------
101 2018-05-01 ABC 100
102 2018-05-02 ABC 200
-------------------------------------
(case when count(distinct case when txn_date >= '2018-05-01' and txn_date < '2019-06-01' then last_day(txn_date) end) = 13
then 'active' else 'inactive'
end) as flag
from t;
Final output
----------------
name | flag
----------------
ABC active
BCF inactive
You can use filtering on an aggregation query:
select customer,
count(distinct last_day(txn_date)) as num_months
from (select t.*, min(date_registered) over (partition by customer) as min_dr
from t
) t
group by customer, min_dr
having count(distinct last_day(txn_date)) = months_between(last_day(current_date), last_day(min_dr)) + 1;
Note: This may give unexpected results toward the beginning of a month, if customers do not all have transactions on the first day of the month.
EDIT:
If you want a flag, just move the HAVING logic to the SELECT:
select customer,
(case when count(distinct last_day(txn_date)) = months_between(last_day(current_date), last_day(min_dr)) + 1
then 'Active' else 'Inactive'
end) as active_flag
from (select t.*, min(date_registered) over (partition by customer) as min_dr
from t
) t
group by customer, min_dr;
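
The same month-counting idea can be sketched in a self-contained way with Python's sqlite3 module. SQLite has no last_day or months_between, so this version swaps in strftime('%Y-%m', ...) to identify months and computes the expected month count from year and month numbers. It also uses a plain MIN(date_registered) aggregate instead of the window function. The table layout, the sample rows, and the fixed AS_OF date (standing in for current_date so the result is reproducible) are all assumptions.

```python
import sqlite3

AS_OF = "2018-07-15"  # hypothetical stand-in for current_date

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (txn_id INTEGER, txn_date TEXT, "
    "customer TEXT, date_registered TEXT)"
)
# ABC transacts in May, June, and July; BCF skips May.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (101, "2018-05-01", "ABC", "2018-05-01"),
        (102, "2018-05-02", "ABC", "2018-05-01"),
        (103, "2018-06-10", "ABC", "2018-05-01"),
        (104, "2018-07-03", "ABC", "2018-05-01"),
        (105, "2018-04-25", "BCF", "2018-04-20"),
        (106, "2018-06-01", "BCF", "2018-04-20"),
    ],
)

# Active means: distinct months with a transaction equals the number of
# calendar months from the registration month through the AS_OF month.
flags = conn.execute(
    """
    SELECT customer,
           CASE WHEN COUNT(DISTINCT strftime('%Y-%m', txn_date)) =
                     (CAST(strftime('%Y', :as_of) AS INTEGER) * 12
                      + CAST(strftime('%m', :as_of) AS INTEGER))
                   - (CAST(strftime('%Y', MIN(date_registered)) AS INTEGER) * 12
                      + CAST(strftime('%m', MIN(date_registered)) AS INTEGER))
                   + 1
                THEN 'Active' ELSE 'Inactive'
           END AS active_flag
    FROM t
    GROUP BY customer
    ORDER BY customer
    """,
    {"as_of": AS_OF},
).fetchall()

for row in flags:
    print(row)
```

As with the original query, a customer who has not yet transacted in the current month is flagged inactive, which matters most early in a month.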

SQL - How to get columns from row values in the same column (SQL Server 2016)

I need to derive columns from the row values of one column.
Here's the row data.
CustomerID Activity Date
10001 Active 2018-06-21
10001 Inactive 2018-06-25
10001 Active 2018-08-22
10001 Inactive 2018-10-06
And here's the output that I am trying to get to:
CustomerID ActiveDate InactiveDate
10001 2018-06-21 2018-06-25
10001 2018-08-22 2018-10-06
Please help! Thanks!
You can assign a row number in a subquery, partitioned by CustomerID and Activity, and then use conditional aggregation grouped by CustomerID and that row number.
SELECT CustomerID,
MAX(CASE WHEN Activity = 'Active' THEN Date END) ActiveDate,
MAX(CASE WHEN Activity = 'Inactive' THEN Date END) InactiveDate
FROM (
SELECT *,ROW_NUMBER() OVER(PARTITION BY CustomerID,Activity ORDER BY Date ) rn
FROM T
)t1
group by CustomerID,rn
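
A minimal sketch of the row-number pairing, run with Python's sqlite3 module instead of SQL Server 2016 (an assumption made only so the example is self-contained; SQLite 3.25 or later is needed for ROW_NUMBER). The table name T follows the answer. The pairing relies on Active and Inactive rows strictly alternating for each customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (CustomerID INTEGER, Activity TEXT, Date TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [
        (10001, "Active", "2018-06-21"),
        (10001, "Inactive", "2018-06-25"),
        (10001, "Active", "2018-08-22"),
        (10001, "Inactive", "2018-10-06"),
    ],
)

# The n-th Active row and the n-th Inactive row share the same rn, so
# grouping by (CustomerID, rn) puts each pair on one output row.
pairs = conn.execute(
    """
    SELECT CustomerID,
           MAX(CASE WHEN Activity = 'Active'   THEN Date END) AS ActiveDate,
           MAX(CASE WHEN Activity = 'Inactive' THEN Date END) AS InactiveDate
    FROM (SELECT *,
                 ROW_NUMBER() OVER (PARTITION BY CustomerID, Activity
                                    ORDER BY Date) AS rn
          FROM T
         ) t1
    GROUP BY CustomerID, rn
    ORDER BY CustomerID, rn
    """
).fetchall()

for pair in pairs:
    print(pair)
```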
Your logic is a little unclear. If you want the next "inactive" date:
select CustomerID, date as active_date, inactive_date
from (select t.*,
min(case when activity = 'Inactive' then date end) over (partition by CustomerID order by date desc) as inactive_date
from t
) t
where activity = 'Active';
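
To see how the "next inactive date" query behaves, the sketch below runs it with Python's sqlite3 module (an assumption; the question targets SQL Server 2016). It uses the question's four rows plus one hypothetical trailing Active row that has no later Inactive row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (CustomerID INTEGER, activity TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (10001, "Active", "2018-06-21"),
        (10001, "Inactive", "2018-06-25"),
        (10001, "Active", "2018-08-22"),
        (10001, "Inactive", "2018-10-06"),
        (10001, "Active", "2018-11-01"),  # hypothetical extra row
    ],
)

# Ordering the window by date DESC makes the default frame cover the
# current row and every later row, so MIN returns the earliest Inactive
# date on or after the current row.
spans = conn.execute(
    """
    SELECT CustomerID, date AS active_date, inactive_date
    FROM (SELECT t.*,
                 MIN(CASE WHEN activity = 'Inactive' THEN date END)
                     OVER (PARTITION BY CustomerID ORDER BY date DESC)
                     AS inactive_date
          FROM t
         ) t
    WHERE activity = 'Active'
    ORDER BY CustomerID, active_date
    """
).fetchall()

for span in spans:
    print(span)
```

Unlike pairing by row number, this version does not depend on the rows alternating; an Active row with no later Inactive row simply gets a NULL inactive_date.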