SQL Server 2008 multiple column filter - sql

I have stuck on sql query to bring the wanted data. I have table as following
I have tried cte table but did not work . I need the get source 'O' if available else 'T' with max sequence as above result table.
select district
, id
, building
, year
, date
, period
, sequence
, source from GetAttData gt with (nolock) where sequence in (select max(sequence) from GetAttData with (nolock)
where district = gt.district
and building = gt.building
and year = gt.year
and id= gt.id
group by district, id, building, year, date, period)
and source = 'O'

select
district, id, building, year, date, period, sequence, source
from (
select district, id, building, year, date, period, sequence, source,
row_number() over(partition by district, id, building, year, date, period
order by case when source = 'O' then 0 else 1 end, sequence desc
) as takeme
) foo
where takeme = 1

Related

Calculate time span between two specific statuses on the database for each ID

I have a table on the database that contains statuses updated on each vehicle I have, I want to calculate how many days each vehicle spends time between two specific statuses 'Maintenance' and 'Read'.
My table looks something like this
and I want to result to be like this, only show the number of days a vehicle spends in maintenance before becoming ready on a specific day
The code I written looks like this
drop table if exists #temps1
select
VehicleId,
json_value(VehiclesHistoryStatusID.text,'$.en') as VehiclesHistoryStatus,
VehiclesHistory.CreationTime,
datediff(day, VehiclesHistory.CreationTime ,
lead(VehiclesHistory.CreationTime ) over (order by VehiclesHistory.CreationTime ) ) as days,
lag(json_value(VehiclesHistoryStatusID.text,'$.en')) over (order by VehiclesHistory.CreationTime) as PrevStatus,
case
when (lag(json_value(VehiclesHistoryStatusID.text,'$.en')) over (order by VehiclesHistory.CreationTime) <> json_value(VehiclesHistoryStatusID.text,'$.en')) THEN datediff(day, VehiclesHistory.CreationTime , (lag(VehiclesHistory.CreationTime ) over (order by VehiclesHistory.CreationTime ))) else 0 end as testing
into #temps1
from fleet.VehicleHistory VehiclesHistory
left join Fleet.Lookups as VehiclesHistoryStatusID on VehiclesHistoryStatusID.Id = VehiclesHistory.StatusId
where (year(VehiclesHistory.CreationTime) > 2021 and (VehiclesHistory.StatusId = 140 Or VehiclesHistory.StatusId = 144) )
group by VehiclesHistory.VehicleId ,VehiclesHistory.CreationTime , VehiclesHistoryStatusID.text
order by VehicleId desc
drop table if exists #temps2
select * into #temps2 from #temps1 where testing <> 0
select * from #temps2
Try this
SELECT innerQ.VehichleID,innerQ.CreationDate,innerQ.Status
,SUM(DATEDIFF(DAY,innerQ.PrevMaintenance,innerQ.CreationDate)) AS DayDuration
FROM
(
SELECT t1.VehichleID,t1.CreationDate,t1.Status,
(SELECT top(1) t2.CreationDate FROM dbo.Test t2
WHERE t1.VehichleID=t2.VehichleID
AND t2.CreationDate<t1.CreationDate
AND t2.Status='Maintenance'
ORDER BY t2.CreationDate Desc) AS PrevMaintenance
FROM
dbo.Test t1 WHERE t1.Status='Ready'
) innerQ
WHERE innerQ.PrevMaintenance IS NOT NULL
GROUP BY innerQ.VehichleID,innerQ.CreationDate,innerQ.Status
In this query first we are finding the most recent 'maintenance' date before each 'ready' date in the inner most query (if exists). Then calculate the time span with DATEDIFF and sum all this spans for each vehicle.

SQL - find row with closest date but different column value

i'm new to SQL and i would need an help.
I have a TAB and I need to find for any item B in the TAB the item A with the closest date. In this case the A with 02.09.2021 04:25:30
Date.
Item
07.09.2021 05:02:05
A
06.09.2021 05:01:02
A
05.09.2021 05:00:02
A
04.09.2021 04:59:01
A
03.09.2021 04:58:03
A
02.09.2021 04:56:55
A
02.09.2021 04:33:56
B
02.09.2021 04:25:30
A
WITH CTE(DATE,ITEM)AS
(
SELECT '20210907 05:02:05' , 'A'UNION ALL
SELECT '20210906 05:01:02' , 'A'UNION ALL
SELECT '20210905 05:00:02' , 'A'UNION ALL
SELECT'20210904 04:59:01' , 'A'UNION ALL
SELECT'20210903 04:58:03' , 'A'UNION ALL
SELECT'20210902 04:56:55' , 'A'UNION ALL
SELECT'20210902 04:33:56' , 'B'UNION ALL
SELECT'20210902 04:25:30' , 'A'
)
SELECT
CAST(C.DATE AS DATETIME)X_DATE,C.ITEM,Q.CLOSEST
FROM CTE AS C
OUTER APPLY
(
SELECT TOP 1 CAST(X.DATE AS DATETIME)CLOSEST
FROM CTE AS X
WHERE X.ITEM='A'AND CAST(X.DATE AS DATETIME)<CAST(C.DATE AS DATETIME)
ORDER BY CAST(X.DATE AS DATETIME) ASC
)Q
WHERE C.ITEM='B'
You can use OUTER APPLY-approach as in the above query.
Please also take a look that datetime-column (DATE)is written in the ISO-compliant form
Your data has only two columns. If you want the only the closest A timestamp, then the fastest way is probably window functions:
select t.*,
(case when prev_a_date is null then next_a_date
when next_a_date is null then prev_a_date
when datediff(second, prev_a_date, date) <= datediff(second, date, next_a_date) then prev_a_date
else next_a_date
end) as a_date
from (select t.*,
max(case when item = 'A' then date end) over (order by date) as prev_a_date,
min(case when item = 'A' then date end) over (order by date desc) as next_a_date
from t
) t
where item = 'B';
This uses seconds to measure the time difference, but you can use a smaller unit if appropriate.
You can also do this using apply if you have more columns from the "A" rows that you want:
select tb.*, ta.*
from t b outer apply
(select top (1) ta.*
from t ta
where item = 'A'
order by abs(datediff(second, a.date, b.date))
) t
where item = 'B';

SQL query get first punch and lastpunch for every employee

I have the following data in SQL Server:
What I need is that for every day by employee (employeeId) I get in the follwing data:
AccessCode column means I = PunchIn and O = PunchOut and we have to filter by lunchtype = 'N'
So basically the result should return only one row per day and all the punch ins and punch outs in the middle of the first entrance and last exist shouldn't be considered.
Any clue?
You can do conditional aggregation :
select employeeid, In, Out,
dateadd(second, datediff(second, in, out), 0) as Hours
from(select employeeid,
min(case when AccessCode = 'I' then timestamp end) as In,
max(case when AccessCode = 'O' then timestamp end) as Out
from table t
where lunchtype = 'N'
group by employeeid, convert(date, times)
) t;
You can try this
with cte as
(select
*,
cast(times as date) as myda
from myTable
)
select
employeeid,
mn as punch_in,
mx as punch_out,
datediff(minute, mn, mx)/60.0 as hours
from
(select
employeeid,
min(times) over (partition by myda) as mn,
max(times) over (partition by myda) as mx
from cte
) t
group by
employeeid, mn, mx
Try this:
select employeeId,
min(case when accessCode = 'I' then timestamp end) punchIn,
max(case when accessCode = 'O' then timestamp end) punchOut
from myTable
where lunchtype = 'N'
group by employeeId

Calculate % of total - redshift / sql

I'm trying to calculate the percentage of one column over a secondary total column.
I wrote:
create temporary table screenings_count_2018 as
select guid,
datepart(y, screening_screen_date) as year,
sum(case when screening_package = 4 then 1 end) as count_package_4,
sum(case when screening_package = 3 then 1 end) as count_package_3,
sum(case when screening_package = 2 then 1 end) as count_package_2,
sum(case when screening_package = 1 then 1 end) as count_package_1,
sum(case when screening_package in (1, 2, 3, 4) then 1 end) as count_total_packages
from prod.leasing_fact
where year = 2018
group by guid, year;
That table establishes the initial count and total count columns. All columns look correct.
Then, I'm using ratio_to_report to calculate the percentage (referencing this tutorial):
create temporary table screenings_percentage as
select
guid,
year,
ratio_to_report(count_package_1) over (partition by count_total_packages) as percentage_package_1
from screenings_count_2018
group by guid, year,count_package_1,count_total_packages
order by percentage_package_1 desc;
I also tried:
select
guid,
year,
sum(count_package_1/count_total_packages) as percentage_package_1
-- ratio_to_report(count_package_1) over (partition by count_total_packages) as percentage_package_1
from screenings_count_2018
group by guid, year,count_package_1,count_total_packages
order by percentage_package_1 desc;
Unfortunately, percentage_package_1 just returns all null values (this is not correct - I'm expecting percentages). Neither are working.
What am I doing wrong?
Thanks!
Since you are already laid out the columns with components and a total, in creating screenings_count_2018, do you actually need to use ratio_to_report?
select
, guid
, year
, count_package_1/count_total_packages as percentage_package_1
, count_package_2/count_total_packages as percentage_package_2
, count_package_3/count_total_packages as percentage_package_3
, count_package_4/count_total_packages as percentage_package_4
from screenings_count_2018
That should work. NB are you guaranteed to never have count_total_packages be zero? If it can be zero you'll need to handle it. One way is with a case statement.
If you wish for the per-package percentages to appear in a single column, then you can use ratio_to_report -- it is a "window" analytic function and it will be something like this against the original table.
with count_table as (
select guid
, datepart(y, screening_screen_date) as year
, screening_package
, count(1) as count
from prod.leasing_fact
where year = 2018
group by guid
, datepart(y, screening_screen_date)
, screening_package
)
select guid
, year
, screening_package
, ratio_to_report(count) over(partition by guid, year, screening_package) as perc_of_total
from count_table
you will need round(100.0*count_package_1/count_total_packages,1) and so on as you already calculated the subtotal and total

Group by in columns and rows, counts and percentages per day

I have a table that has data like following.
attr |time
----------------|--------------------------
abc |2018-08-06 10:17:25.282546
def |2018-08-06 10:17:25.325676
pqr |2018-08-05 10:17:25.366823
abc |2018-08-06 10:17:25.407941
def |2018-08-05 10:17:25.449249
I want to group them and count by attr column row wise and also create additional columns in to show their counts per day and percentages as shown below.
attr |day1_count| day1_%| day2_count| day2_%
----------------|----------|-------|-----------|-------
abc |2 |66.6% | 0 | 0.0%
def |1 |33.3% | 1 | 50.0%
pqr |0 |0.0% | 1 | 50.0%
I'm able to display one count by using group by but unable to find out how to even seperate them to multiple columns. I tried to generate day1 percentage with
SELECT attr, count(attr), count(attr) / sum(sub.day1_count) * 100 as percentage from (
SELECT attr, count(*) as day1_count FROM my_table WHERE DATEPART(week, time) = DATEPART(day, GETDate()) GROUP BY attr) as sub
GROUP BY attr;
But this also is not giving me correct answer, I'm getting all zeroes for percentage and count as 1. Any help is appreciated. I'm trying to do this in Redshift which follows postgresql syntax.
Let's nail the logic before presenting:
with CTE1 as
(
select attr, DATEPART(day, time) as theday, count(*) as thecount
from MyTable
)
, CTE2 as
(
select theday, sum(thecount) as daytotal
from CTE1
group by theday
)
select t1.attr, t1.theday, t1.thecount, t1.thecount/t2.daytotal as percentofday
from CTE1 t1
inner join CTE2 t2
on t1.theday = t2.theday
From here you can pivot to create a day by day if you feel the need
I am trying to enhance the query #johnHC btw if you needs for 7days then you have to those days in case when
with CTE1 as
(
select attr, time::date as theday, count(*) as thecount
from t group by attr,time::date
)
, CTE2 as
(
select theday, sum(thecount) as daytotal
from CTE1
group by theday
)
,
CTE3 as
(
select t1.attr, EXTRACT(DOW FROM t1.theday) as day_nmbr,t1.theday, t1.thecount, t1.thecount/t2.daytotal as percentofday
from CTE1 t1
inner join CTE2 t2
on t1.theday = t2.theday
)
select CTE3.attr,
max(case when day_nmbr=0 then CTE3.thecount end) as day1Cnt,
max(case when day_nmbr=0 then percentofday end) as day1,
max(case when day_nmbr=1 then CTE3.thecount end) as day2Cnt,
max( case when day_nmbr=1 then percentofday end) day2
from CTE3 group by CTE3.attr
http://sqlfiddle.com/#!17/54ace/20
In case that you have only 2 days:
http://sqlfiddle.com/#!17/3bdad/3 (days descending as in your example from left to right)
http://sqlfiddle.com/#!17/3bdad/5 (days ascending)
The main idea is already mentioned in the other answers. Instead of joining the CTEs for calculating the values I am using window functions which is a bit shorter and more readable I think. The pivot is done the same way.
SELECT
attr,
COALESCE(max(count) FILTER (WHERE day_number = 0), 0) as day1_count, -- D
COALESCE(max(percent) FILTER (WHERE day_number = 0), 0) as day1_percent,
COALESCE(max(count) FILTER (WHERE day_number = 1), 0) as day2_count,
COALESCE(max(percent) FILTER (WHERE day_number = 1), 0) as day2_percent
/*
Add more days here
*/
FROM(
SELECT *, (count::float/count_per_day)::decimal(5, 2) as percent -- C
FROM (
SELECT DISTINCT
attr,
MAX(time::date) OVER () - time::date as day_number, -- B
count(*) OVER (partition by time::date, attr) as count, -- A
count(*) OVER (partition by time::date) as count_per_day
FROM test_table
)s
)s
GROUP BY attr
ORDER BY attr
A counting the rows per day and counting the rows per day AND attr
B for more readability I convert the date into numbers. Here I take the difference between current date of the row and the maximum date available in the table. So I get a counter from 0 (first day) up to n - 1 (last day)
C calculating the percentage and rounding
D pivot by filter the day numbers. The COALESCE avoids the NULL values and switched them into 0. To add more days you can multiply these columns.
Edit: Made the day counter more flexible for more days; new SQL Fiddle
Basically, I see this as conditional aggregation. But you need to get an enumerator for the date for the pivoting. So:
SELECT attr,
COUNT(*) FILTER (WHERE day_number = 1) as day1_count,
COUNT(*) FILTER (WHERE day_number = 1) / cnt as day1_percent,
COUNT(*) FILTER (WHERE day_number = 2) as day2_count,
COUNT(*) FILTER (WHERE day_number = 2) / cnt as day2_percent
FROM (SELECT attr,
DENSE_RANK() OVER (ORDER BY time::date DESC) as day_number,
1.0 * COUNT(*) OVER (PARTITION BY attr) as cnt
FROM test_table
) s
GROUP BY attr, cnt
ORDER BY attr;
Here is a SQL Fiddle.