SQL merging lines into one line - sql

I have a question about an SQL query.
I have a table with these columns (other columns are not shown here):
Date        route_id  Time      Name
2022-02-04  320       11:40:00  Taxi
2022-02-04  320       14:35:00  Taxi
I have made the following query:
Select
date,
LEFT(route_id_intern,4) as route_id,
CASE
WHEN time < '12:00:00' THEN 'Morning'
ELSE 'Free' END as 'Morning',
CASE
WHEN time > '12:00:00' THEN 'Afternoon'
ELSE 'Free' END as 'Afternoon',
Name
FROM [DW2].[dbo].[FCT_RITTEN]
where year(date) = 2022 and month(date)=02 and day(date) = 04
and LEFT(route_id_intern,4) = 3209
My query gives the following result:
Date        route_id  Morning  Afternoon  Name
2022-02-04  320       Morning  Free       Taxi
2022-02-04  320       Free     Afternoon  Taxi
The data stays separated in two lines, but I would like to have the following result:
Date        route_id  Morning  Afternoon  Name
2022-02-04  320       Morning  Afternoon  Taxi
I have tried several methods but I keep getting these separated lines.
Please note that the data in the samples above is anonymized, but the problem stays the same.
Update:
After the reply of @HoneyBadger, I have amended my query:
Select
date,
LEFT(route_id_intern,3) as route_id,
MAX(CASE
WHEN time <= '12:00:00' THEN '1'
ELSE '0' END) as 'Morning',
MAX(CASE
WHEN time > '12:00:00' THEN '1'
ELSE '0' END) as 'Afternoon'
FROM [DW2].[dbo].[FCT_RITTEN]
where date = '2022-02-04'
and LEFT(route_id_intern,4) = 320
group by date, route_id_intern
Unfortunately, the result is still not as needed:
Date        route_id  Morning  Afternoon
2022-02-04  320       1        0
2022-02-04  320       0        1

You are almost there, I guess. Remove the GROUP BY, put your query inside a CTE, and then work with the time column as below to get the desired result.
;with cte as(
Select
date,
LEFT(route_id_intern,4) as route_id,
CASE
WHEN max(time)over(partition by date,name order by date) > '12:00:00'
and count(1)over(partition by date,name order by date) > 1
THEN 'Morning-Afternoon'
WHEN max(time)over(partition by date,name order by date) = '12:00:00'
and count(1)over(partition by date,name order by date) > 1
THEN 'Morning-Lunch'
when max(time)over(partition by date,name order by date) > '12:00:00'
and count(1)over(partition by date,name order by date) = 1
then 'Free-Afternoon'
when max(time)over(partition by date,name order by date) < '12:00:00'
and count(1)over(partition by date,name order by date) = 1
then 'Morning-Free'
END as 'day',
Name
FROM [DW2].[dbo].[FCT_RITTEN]
where year(date) = 2022 and month(date)=02 and day(date) = 04
and LEFT(route_id_intern,4) = 3209
)
select distinct Date,route_id
,SUBSTRING([day],1,charindex('-',[day])-1) as [Morning]
,SUBSTRING([day],charindex('-',[day])+1,len([day])) as [Afternoon]
,name
from cte

One of the challenges here is that your sample data and query don't line up: the query references columns that are not in your sample data. This is likely because you obfuscated the real information so much that some details got lost. I made some best guesses, and I think what you need is conditional aggregation. At the very least, this returns the desired output based (mostly) on the sample provided.
create table FCT_RITTEN
(
MyDate date
, route_id int
, MyTime time
, Name varchar(10)
)
insert FCT_RITTEN
select '20220204', 320, '11:40:00', 'Taxi' union all
select '20220204', 320, '14:35:00', 'Taxi'
Select
MyDate
, route_id
, Morning = max(CASE WHEN MyTime < '12:00:00' THEN 'Morning' END)
, Afternoon = max(CASE WHEN MyTime > '12:00:00' THEN 'Afternoon' END)
, Name
FROM [dbo].[FCT_RITTEN]
where MyDate = '20220204'
and route_id like '320%' --Why are you doing a string comparison on an integer column?
group by MyDate
, route_id
, Name
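To see why the conditional aggregation above collapses the two rows into one, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made only so the example is runnable anywhere). The table and column names follow the guessed schema from this answer, not the real FCT_RITTEN table. A CASE without ELSE yields NULL for non-matching rows, and MAX ignores NULLs, so each group keeps its one non-NULL label.

```python
import sqlite3


def merge_rows():
    """Collapse the morning and afternoon rows into one row per date/route/name."""
    conn = sqlite3.connect(":memory:")
    # Guessed schema, mirroring the answer above; times are stored as HH:MM:SS text.
    conn.execute(
        "CREATE TABLE FCT_RITTEN (MyDate TEXT, route_id INTEGER, MyTime TEXT, Name TEXT)"
    )
    conn.executemany(
        "INSERT INTO FCT_RITTEN VALUES (?, ?, ?, ?)",
        [
            ("2022-02-04", 320, "11:40:00", "Taxi"),
            ("2022-02-04", 320, "14:35:00", "Taxi"),
        ],
    )
    rows = conn.execute(
        """
        SELECT MyDate,
               route_id,
               MAX(CASE WHEN MyTime < '12:00:00' THEN 'Morning' END) AS Morning,
               MAX(CASE WHEN MyTime > '12:00:00' THEN 'Afternoon' END) AS Afternoon,
               Name
        FROM FCT_RITTEN
        WHERE MyDate = '2022-02-04'
        GROUP BY MyDate, route_id, Name
        """
    ).fetchall()
    conn.close()
    return rows


print(merge_rows())
```

If a date has no morning ride, the Morning column comes back as NULL (None in Python) rather than 'Free'; wrapping the MAX in COALESCE(..., 'Free') would be one way to restore that label.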

Related

Query with group by with CustID and amounts daily and MTD

I have the following data:
ID  username  Cost  time
1   test1     1     2021-05-22 11:48:36.000
2   test2     2     2021-05-20 12:55:22.000
3   test3     5     2021-05-21 00:00:00.000
I would like to count the costs per username with a daily figure and a month-to-date figure in one table.
I have the following so far:
SELECT
username,
COUNT(*) cost
FROM
dbo.sent
WHERE
time BETWEEN {ts '2021-05-01 00:00:00'} AND {ts '2021-05-22 23:59:59'}
GROUP BY
username
ORDER BY
cost DESC;
This returns the monthly figures, and changing the range to {ts '2021-05-22 00:00:00'} AND {ts '2021-05-22 23:59:59'} gives the daily figures. However, I would like one table that shows daily and MTD together:
username  Daily  MTD
test1     1      1012
test2     2      500
test3     5      22
Any help or pointers would be fantastic. I am guessing that I need a temp table, then run the query again using the MTD range and update the temp table where the username is the same, then export and delete, but I have no idea where to start.
First, you need one row per user and date, not just one row per user.
Second, you should fix your date arithmetic so you are not missing a second.
Then, you can use aggregation and window functions:
SELECT username, CONVERT(DATE, time) as dte,
COUNT(*) as cost_per_day,
SUM(COUNT(*)) OVER (PARTITION BY username ORDER BY CONVERT(DATE, time)) as mtd
FROM dbo.sent s
WHERE time >= '2021-05-01' AND
time < '2021-05-23'
GROUP BY username, CONVERT(DATE, time)
ORDER BY username, dte;
You can learn more about window functions in a SQL tutorial (or, ahem, a book on SQL) or in the documentation.
EDIT:
If you only want the most recent date and MTD, then you can either filter the above query for the most recent date or use conditional aggregation:
SELECT username,
SUM(CASE WHEN CONVERT(DATE, time) = '2021-05-22' THEN 1 ELSE 0 END) as cost_per_most_recent_day,
COUNT(*) as MTD
FROM dbo.sent s
WHERE time >= '2021-05-01' AND
time < '2021-05-23'
GROUP BY username
ORDER BY username;
And, you can actually express the query using the current date so it doesn't have to be hardcoded. For this version:
SELECT username,
SUM(CASE WHEN CONVERT(DATE, time) = CONVERT(DATE, GETDATE()) THEN 1 ELSE 0 END) as cost_per_most_recent_day,
COUNT(*) as MTD
FROM dbo.sent s
WHERE time >= DATEFROMPARTS(YEAR(GETDATE()), MONTH(GETDATE()), 1) AND
time < DATEADD(DAY, 1, CONVERT(DATE, GETDATE()))
GROUP BY username
ORDER BY username;
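As a rough illustration of the conditional-aggregation version with hardcoded dates, the sketch below runs the same shape of query against an in-memory SQLite database from Python. SQLite stands in for SQL Server here, so date(time) replaces CONVERT(DATE, time); the sample rows and the function name daily_and_mtd are made up for the example.

```python
import sqlite3


def daily_and_mtd(rows, day, month_start, next_day):
    """Return (username, daily, mtd) tuples for one reporting day.

    rows: iterable of (username, 'YYYY-MM-DD HH:MM:SS') tuples.
    The month-to-date range is half-open: month_start <= time < next_day.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sent (username TEXT, time TEXT)")
    conn.executemany("INSERT INTO sent VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT username,
               SUM(CASE WHEN date(time) = ? THEN 1 ELSE 0 END) AS daily,
               COUNT(*) AS mtd
        FROM sent
        WHERE time >= ? AND time < ?
        GROUP BY username
        ORDER BY username
        """,
        (day, month_start, next_day),
    ).fetchall()
    conn.close()
    return result


# Hypothetical sample data, not the asker's real table.
sample = [
    ("test1", "2021-05-22 11:48:36"),
    ("test1", "2021-05-03 09:00:00"),
    ("test2", "2021-05-20 12:55:22"),
    ("test2", "2021-04-30 23:59:59"),  # previous month, filtered out by WHERE
    ("test3", "2021-05-22 00:00:00"),
]
print(daily_and_mtd(sample, "2021-05-22", "2021-05-01", "2021-05-23"))
```

The half-open range (>= start, < next day) is what avoids the missing last second mentioned at the top of this answer.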

Date filtering in SQL

The table below consists of 2 columns: a unique identifier and a date. I am trying to build a new column of episodes, where a new episode is triggered when there are >= 3 months between dates. This should be done for each unique EMID. In the table attached, the EMID ending in 98 would have only 1 episode, as there are no intervals > 2 months between rows in the date column. However, the EMID ending in 03 would have 2 episodes, as there is almost a 3-year gap between rows 12 and 13. I have tried the following code, which doesn't work.
Table:
SELECT TOP (1000) [EMID],[Date]
CASE
WHEN DATEDIFF(month, Date, LEAD Date) <3
THEN "1"
ELSE IF DATEDIFF(month, Date, LEAD Date) BETWEEN 3 AND 5
THEN "2"
ELSE "3"
END episode
FROM [res_treatment_escalation].[dbo].[cspine42920a]
EDIT: Using Microsoft SQL Server Management Studio.
EDIT 2: I have made some progress but the output is not exactly what I am looking for. Here is the query I used:
SELECT TOP (1000) [EMID],[visit_date_01],
CASE
WHEN DATEDIFF(DAY, visit_date_01, LAG(visit_date_01,1,getdate()) OVER (partition by EMID order by EMID)) <= 90 THEN '1'
WHEN DATEDIFF(DAY, visit_date_01, LAG(visit_date_01,1,getdate()) OVER (PARTITION BY EMID ORDER BY EMID)) BETWEEN 90 AND 179 THEN '2'
WHEN DATEDIFF(DAY, visit_date_01, LAG(visit_date_01,1,getdate()) OVER (PARTITION BY EMID order by EMID)) > 180 THEN '3'
END AS EPISODE
FROM [res_treatment_escalation].[dbo].['c-spine_full_dataset_4#29#20_wi$']
Here is the actual vs. expected output.
The partition by EMID does not seem to be working correctly: every time there is a new EMID, a new episode is triggered. I am using day instead of month as the unit in DATEDIFF, but this does not seem to recognize new episodes within the same EMID.
Hmmm: Use LAG() to get the previous date. Use a date comparison to assign a flag and then a cumulative sum:
select c.*,
sum(case when prev_date > dateadd(month, -3, date) then 0 else 1 end) over
(partition by emid order by date) as episode_number
from (select c.*, lag(date) over (partition by emid order by date) as prev_date
from res_treatment_escalation.dbo.cspine42920a c
) c;
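The three steps in the query above (previous date per EMID, a new-episode flag, a running sum of the flags) can be sketched in plain Python. This is only an illustration of the logic, not a replacement for the SQL. The helper minus_months imitates DATEADD(month, -3, ...) by clamping the day to the end of the target month, and the EMID values in the usage example are invented.

```python
import calendar
from datetime import date


def minus_months(d, months):
    """Shift a date back by whole months, clamping the day to the month's end."""
    index = d.year * 12 + (d.month - 1) - months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


def episode_numbers(visits):
    """visits: iterable of (emid, date). Returns {(emid, date): episode_number}."""
    by_emid = {}
    for emid, d in visits:
        by_emid.setdefault(emid, []).append(d)

    result = {}
    for emid, dates in by_emid.items():
        episode = 0
        prev = None  # plays the role of LAG(date): no previous row yet
        for d in sorted(dates):
            # New episode on the first visit, or when the previous visit is
            # not later than (this visit minus 3 months), i.e. gap >= 3 months.
            if prev is None or not (prev > minus_months(d, 3)):
                episode += 1
            result[(emid, d)] = episode  # running sum of the flags
            prev = d
    return result


# Hypothetical identifiers and dates.
visits = [
    ("A98", date(2019, 1, 10)),
    ("A98", date(2019, 2, 20)),
    ("A98", date(2019, 4, 1)),
    ("B03", date(2016, 5, 1)),
    ("B03", date(2019, 4, 15)),
    ("B03", date(2019, 5, 1)),
]
episodes = episode_numbers(visits)
print(episodes[("A98", date(2019, 4, 1))], episodes[("B03", date(2019, 5, 1))])
```

Ordering by the date within each EMID is the important part: the attempt in the question orders by EMID inside the window, which leaves the row order within a partition undefined.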

How to calculate a count of users who did a thing X times during a rolling 7-day period in SQL Server?

I want to calculate a count of unique users who have posted 5 or more times over the course of a 7-day rolling period. How do I do this?
I know how to calculate a count of users who have posted 1 or more times over the course of a 7-day rolling period. The query looks like this:
with PostsPerDay as (
select cast(CreationDate as Date) [Day]
, OwnerUserId [User]
, count(*) Post
from Posts
where CreationDate > '2017-07-01'
group by
cast(CreationDate as Date)
, OwnerUserId
)
select [Day], count(distinct [User]) DailyPosters, Rolling7DayCount
from PostsPerDay
outer apply (
select count(distinct [User]) Rolling7DayCount
from PostsPerDay ppd
where ppd.[Day] >= dateadd(dd, -7, PostsPerDay.[Day])
and ppd.[Day] < PostsPerDay.[Day]
) Rolling7DayCount
group by [Day], Rolling7DayCount
order by 1
Here it is at the Stack Exchange Data Explorer.
Desired Results
Ideally I'm looking for a four-column result: Day, DailyPosters, Rolling7DayCount, Rolling7DayCount5xPosters. (The sample query returns the first 3 columns.)
To be extra clear: I'm hoping to count users who have posted 5x over the course of any 7 day period ending on a specific date. So simply adding a Having to the CTE won't give me what I need.
Any performance tips would be appreciated, too!
In your "PostsPerDay" CTE change it to this:
SELECT
CAST(CreationDate AS DATE) [Day]
,OwnerUserId
,COUNT(*) Post
FROM
Posts
WHERE
CreationDate > '2017-07-01'
GROUP BY
CAST(CreationDate AS DATE)
,OwnerUserId
HAVING COUNT(*) >= 5
I only added the "HAVING" filter.
After some troubleshooting and a bit of help from Stack Overflow developer @BenjaminHodgson, I figured it out.
Here's the code:
DECLARE @Date1 DATE, @Date2 DATE
SET @Date1 = '20170601' -- start date
SET @Date2 = '20170726';-- end date
with Days as (
SELECT DATEADD(DAY,number+1,@Date1) [Date]
FROM master..spt_values
WHERE type = 'P'
AND DATEADD(DAY,number+1,@Date1) < @Date2
),
-- create calendar of days
cal as (
select *
from Days
),
data as (
select
cal.[Date]
, x.OwnerUserId [User]
, x.PostsLast7Days
from cal
cross apply (
select
OwnerUserId
, count(*) PostsLast7days
from Posts
where CreationDate between dateadd(dd, -7, cal.[Date]) and cal.[Date]
group by OwnerUserId
) x
)
select
distinct Date,
sum(case when PostsLast7days > 0 then 1 else 0 End) [Sent >= 1 Posts in Preceding 7 Days],
sum(case when PostsLast7days >= 5 then 1 else 0 End) [Sent >= 5 Posts in Preceding 7 Days],
sum(case when PostsLast7days >= 10 then 1 else 0 End) [Sent >= 10 Posts in Preceding 7 Days],
sum(case when PostsLast7days >= 20 then 1 else 0 End) [Sent >= 20 Posts in Preceding 7 Days]
from data
group by Date
order by 1 asc
You can see the code in action at the Stack Exchange Data Explorer.
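For readers who want to check the windowing logic outside the Data Explorer, here is a small Python sketch of the same idea: for each calendar day, count posts per user in the preceding 7-day window, then count the users who reach each threshold. The function name, its inclusive start/end parameters, and the sample posts are assumptions made for the example; the window bounds mirror the BETWEEN in the query above (both ends inclusive, with the day taken as midnight).

```python
from collections import Counter
from datetime import date, datetime, timedelta


def rolling_poster_counts(posts, start, end, thresholds=(1, 5)):
    """posts: list of (user_id, creation_datetime).

    Returns {day: {threshold: number_of_users}} for every day from start
    to end inclusive, using the 7 days up to midnight of that day.
    """
    results = {}
    day = start
    while day <= end:
        window_end = datetime.combine(day, datetime.min.time())
        window_start = window_end - timedelta(days=7)
        # Posts per user inside the window (the CROSS APPLY in the query).
        per_user = Counter(
            user for user, created in posts if window_start <= created <= window_end
        )
        # One conditional count per threshold (the SUM(CASE ...) columns).
        results[day] = {
            n: sum(1 for count in per_user.values() if count >= n)
            for n in thresholds
        }
        day += timedelta(days=1)
    return results


# Hypothetical data: user 1 posts once a day for five days, user 2 posts once.
posts = [(1, datetime(2017, 7, d, 12, 0)) for d in range(1, 6)]
posts.append((2, datetime(2017, 7, 3, 12, 0)))
counts = rolling_poster_counts(posts, date(2017, 7, 5), date(2017, 7, 11))
print(counts[date(2017, 7, 6)])
```

This rescans every post for every day, which is fine for a sketch but would need an index or a sliding window to scale to a large Posts table.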

How To Select Records in a Status Between Timestamps? T-SQL

I have a T-SQL Quotes table and need to be able to count how many quotes were in an open status during past months.
The dates I have to work with are an 'Add_Date' timestamp and an 'Update_Date' timestamp. Once a quote is put into a 'Closed_Status' of '1' it can no longer be updated. Therefore, the 'Update_Date' effectively becomes the Closed_Status timestamp.
I'm stuck because I can't figure out how to select all open quotes that were open in a particular month.
Here's a few example records:
Quote_No Add_Date Update_Date Open_Status Closed_Status
001 01-01-2016 NULL 1 0
002 01-01-2016 3-1-2016 0 1
003 01-01-2016 4-1-2016 0 1
The desired result would be:
Year Month Open_Quote_Count
2016 01 3
2016 02 3
2016 03 2
2016 04 1
I've hit a mental wall on this one, I've tried to do some case when filtering but I just can't seem to figure this puzzle out. Ideally I wouldn't be hard-coding in dates because this spans years and I don't want to maintain this once written.
Thank you in advance for your help.
You are doing this by month. So, three options come to mind:
A list of all months using left join.
A recursive CTE.
A number table.
Let me show the last:
with n as (
select row_number() over (order by (select null)) - 1 as n
from master..spt_values
)
select format(dateadd(month, n.n, q.add_date), 'yyyy-MM') as yyyymm,
count(*) as Open_Quote_Count
from quotes q join
n
on (closed_status = 1 and dateadd(month, n.n, q.add_date) <= q.update_date) or
(closed_status = 0 and dateadd(month, n.n, q.add_date) <= getdate())
group by format(dateadd(month, n.n, q.add_date), 'yyyy-MM')
order by yyyymm;
This does assume that each month has at least one open record. That seems reasonable for this purpose.
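The month-expansion idea can be sketched in Python to check it against the sample rows from the question. Two assumptions are made here: the dates 3-1-2016 and 4-1-2016 are read as 1 March and 1 April, and a quote no longer counts as open in the month whose first day is its close date. That strict comparison is chosen because it reproduces the desired output in the question. The function names are illustrative only.

```python
from datetime import date


def month_starts(first, last):
    """Yield the first day of every month from first's month through last's month."""
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        yield date(year, month, 1)
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)


def open_quotes_per_month(quotes, today):
    """quotes: list of (quote_no, add_date, update_date, closed_status).

    Returns {(year, month): open_quote_count} from the earliest add date to today.
    """
    first = min(q[1] for q in quotes)
    counts = {}
    for start in month_starts(first, today):
        open_count = 0
        for _, add_date, update_date, closed in quotes:
            added = (add_date.year, add_date.month) <= (start.year, start.month)
            # Assumption: closed quotes stop counting once the month start
            # reaches their close date; open quotes count up to today.
            still_open = closed == 0 or start < update_date
            if added and still_open:
                open_count += 1
        counts[(start.year, start.month)] = open_count
    return counts


quotes = [
    ("001", date(2016, 1, 1), None, 0),
    ("002", date(2016, 1, 1), date(2016, 3, 1), 1),
    ("003", date(2016, 1, 1), date(2016, 4, 1), 1),
]
print(open_quotes_per_month(quotes, today=date(2016, 4, 30)))
```

Because the months are generated from a calendar rather than from the quotes themselves, a month with zero open quotes would still appear with a count of 0.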
You can use datepart to extract parts of a date, so something like:
select datepart(year, add_date) as 'year',
datepart(month, add_date) as 'month',
count(1)
from theTable
where open_status = 1
group by datepart(year, add_date), datepart(month, add_date)
Note: this only counts quotes by their starting month; it is primarily here to show the use of datepart.
Updated, as I misunderstood the initial request.
Consider following test data:
DECLARE @test TABLE
(
Quote_No VARCHAR(3),
Add_Date DATE,
Update_Date DATE,
Open_Status INT,
Closed_Status INT
)
INSERT INTO @test (Quote_No, Add_Date, Update_Date, Open_Status, Closed_Status)
VALUES ('001', '20160101', NULL, 1, 0)
, ('002', '20160101', '20160301', 0, 1)
, ('003', '20160101', '20160401', 0, 1)
Here is a recursive solution that doesn't rely on system tables but also performs worse. As we are talking about year and month combinations, the number of recursions will not get out of hand.
;WITH YearMonths AS
(
SELECT YEAR(MIN(Add_Date)) AS [Year]
, MONTH(MIN(Add_Date)) AS [Month]
, MIN(Add_Date) AS YMDate
FROM @test
UNION ALL
SELECT YEAR(DATEADD(MONTH,1,YMDate))
, MONTH(DATEADD(MONTH,1,YMDate))
, DATEADD(MONTH,1,YMDate)
FROM YearMonths
WHERE YMDate <= SYSDATETIME()
)
SELECT [Year]
, [Month]
, COUNT(*) AS Open_Quote_Count
FROM YearMonths ym
INNER JOIN @test t
ON (
[Year] * 100 + [Month] <= CAST(FORMAT(t.Update_Date, 'yyyyMM') AS INT)
AND t.Closed_Status = 1
)
OR (
[Year] * 100 + [Month] <= CAST(FORMAT(SYSDATETIME(), 'yyyyMM') AS INT)
AND t.Closed_Status = 0
)
GROUP BY [Year], [Month]
ORDER BY [Year], [Month]
Statement is longer, also more readable and lists all year/month combinations to date.
Take a look at Date and Time Data Types and Functions for SQL-Server 2008+
and Recursive Queries Using Common Table Expressions

SQL Query to combine results and show in a PIVOT

Below are my two queries:
SELECT WINDOWS_NT_LOGIN, COUNT(DPS_NUMBER) as TotalDPS
FROM DispatcherProductivity
WHERE DPS_Processed_Time_Stamp>='12/04/2014 10:30 AM'
AND DPS_Processed_Time_Stamp<='12/05/2014 10:30 AM'
GROUP BY WINDOWS_NT_LOGIN
ORDER BY TotalDPS
SELECT STATUS, COUNT(DPS_NUMBER) AS TotalDPS
FROM DispatcherProductivity
WHERE DPS_Processed_Time_Stamp>='12/04/2014 10:30 AM'
AND DPS_Processed_Time_Stamp<='12/05/2014 10:30 AM'
GROUP BY STATUS
ORDER BY TotalDPS
Their respective Results are:
WINDOWS_NT_LOGIN TotalDPS
A_S 72
T_I_S 133
STATUS TotalDPS
ID 1
Can 2
NHD 3
SED 14
Ord 185
I would like to get the results in this format:
WINDOWS_NT_LOGIN  ID  Can  NHD  SED  Ord
A_S                             2    70
T_I_S             1   2    3    12   115
Thanks
You can use the PIVOT function for this:
SELECT pvt.WINDOWS_NT_LOGIN,
pvt.[ID],
pvt.[Can],
pvt.[NHD],
pvt.[SED],
pvt.[Ord]
FROM ( SELECT WINDOWS_NT_LOGIN, STATUS, DPS_NUMBER
FROM DispatcherProductivity
WHERE DPS_Processed_Time_Stamp>='20141204 10:30:00'
AND DPS_Processed_Time_Stamp<='20141205 10:30:00'
) AS t
PIVOT
( COUNT(DPS_NUMBER)
FOR STATUS IN ([ID], [Can], [NHD], [SED], [Ord])
) AS pvt;
N.B. I changed your dates to the culture-invariant format yyyyMMdd hh:mm:ss. However, I was not sure whether 12/04/2014 was supposed to be 12th April or 4th December (the exact problem with that format), so it is possible I have put the wrong dates in. I assumed you meant 4th December, as that is today.
For further reading read Bad habits to kick : mis-handling date / range queries
I tend to use conditional aggregation for pivots. In this case:
SELECT WINDOWS_NT_LOGIN, COUNT(DPS_NUMBER) as TotalDPS,
COUNT(CASE WHEN status = 'ID' THEN DPS_Number END) as ID,
COUNT(CASE WHEN status = 'Can' THEN DPS_Number END) as Can,
COUNT(CASE WHEN status = 'NHD' THEN DPS_Number END) as NHD,
COUNT(CASE WHEN status = 'SED' THEN DPS_Number END) as SED,
COUNT(CASE WHEN status = 'Ord' THEN DPS_Number END) as Ord
FROM DispatcherProductivity
WHERE DPS_Processed_Time_Stamp >= '12/04/2014 10:30 AM' AND
DPS_Processed_Time_Stamp <= '12/05/2014 10:30 AM'
GROUP BY WINDOWS_NT_LOGIN;
I would also recommend that you use YYYY-MM-DD format for your dates. I, for one, don't know if your dates are for December or April and May.
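A pivot by conditional aggregation can be tried out quickly with Python's sqlite3 module, used here as a stand-in for SQL Server. The sketch counts DPS numbers per login and status, one output column per status. The rows are invented, the timestamp filter is left out for brevity, and the generated column aliases (s0, s1, ...) are an artifact of building the query in a loop.

```python
import sqlite3


def pivot_status_counts(rows, statuses=("ID", "Can", "NHD", "SED", "Ord")):
    """rows: iterable of (login, status, dps_number).

    Returns one tuple per login: (login, count for each status in order).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE DispatcherProductivity "
        "(WINDOWS_NT_LOGIN TEXT, STATUS TEXT, DPS_NUMBER INTEGER)"
    )
    conn.executemany("INSERT INTO DispatcherProductivity VALUES (?, ?, ?)", rows)
    # Status values are bound as parameters; only the generated aliases
    # are interpolated into the SQL text.
    columns = ", ".join(
        f"COUNT(CASE WHEN STATUS = ? THEN DPS_NUMBER END) AS s{i}"
        for i in range(len(statuses))
    )
    sql = (
        f"SELECT WINDOWS_NT_LOGIN, {columns} "
        "FROM DispatcherProductivity "
        "GROUP BY WINDOWS_NT_LOGIN "
        "ORDER BY WINDOWS_NT_LOGIN"
    )
    result = conn.execute(sql, tuple(statuses)).fetchall()
    conn.close()
    return result


# Hypothetical rows; the DPS numbers are arbitrary identifiers.
rows = [
    ("A_S", "SED", 101),
    ("A_S", "Ord", 102),
    ("A_S", "Ord", 103),
    ("T_I_S", "ID", 104),
    ("T_I_S", "Ord", 105),
]
print(pivot_status_counts(rows))
```

COUNT over a CASE without ELSE counts only the rows where the status matches, so a login with no rows for a status gets 0 in that column rather than NULL.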