SQL Top 1 query on multiple columns - sql

I have a script that returns the following table. If I wrap the script in a subquery and give it an alias, what query would return the top row by EVENT_DATE for each CARE_ID? This has to be compatible with SQL Server 2000. Thank you.
CARE_ID EVENT_ID EVENT_TYPE EVENT_DATE
3 18 B 13/07/2010 00:00
78 11 C 27/07/2009 00:00
78 9 T 28/07/2009 00:00
151 49 T 21/03/2010 00:00
217 102 C 30/03/2010 00:00
355 111 C 16/07/2010 00:00
355 56 T 17/07/2010 00:00
364 774 C 23/08/2012 00:00
369 117 C 28/07/2010 00:00
631 74 T 15/01/2010 00:00
631 148 C 02/02/2010 00:00
1066 91 T 15/11/2010 00:00
2123 280 T 10/07/2011 00:00
2265 448 C 31/05/2011 00:00
2512 183 B 04/02/2014 00:00
2691 906 C 12/01/2014 00:00
2694 307 T 15/06/2011 00:00
2694 544 C 02/07/2011 00:00
2892 85 B 19/12/2011 00:00
2892 641 C 13/02/2012 00:00
3038 660 C 09/08/2011 00:00
3162 407 T 15/04/2012 00:00
3178 780 C 01/09/2012 00:00
3311 175 B 27/01/2014 00:00
3344 869 C 01/10/2013 00:00
3426 474 T 13/07/2013 00:00
3606 479 T 03/01/2014 00:00
3770 917 C 11/01/2014 00:00

This is somewhat inefficient, but I see no better way to do it in SQL Server 2000:
select
t1.care_id,
t1.event_id,
t1.event_type,
t1.event_date
from TheTable t1
join TheTable t2
on t1.care_id = t2.care_id
and t1.event_date >= t2.event_date
group by
t1.care_id,
t1.event_id,
t1.event_type,
t1.event_date
having count(*) = 1
As written, the query returns the earliest record per care_id: a row survives the HAVING clause only when no other row for the same care_id has an earlier or equal event_date. If you need the most recent record instead, change the >= to <=.
SQLFiddle: http://www.sqlfiddle.com/#!3/98536/6
A potential issue with the query above is that if two records for a care_id share the same earliest (or, with <=, latest) event_date, it will return neither of them. Let me know if such cases are possible in your data set.
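To check which row the self-join with having count(*) = 1 actually keeps, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module, used only as a stand-in for SQL Server 2000. The rows are a few taken from the question, with dates rewritten as ISO strings so they compare correctly as text; care_id 9999 is a hypothetical addition to show the tie case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TheTable (care_id INT, event_id INT, event_type TEXT, event_date TEXT)"
)
conn.executemany(
    "INSERT INTO TheTable VALUES (?, ?, ?, ?)",
    [
        (3, 18, "B", "2010-07-13"),
        (78, 11, "C", "2009-07-27"),
        (78, 9, "T", "2009-07-28"),
        (355, 111, "C", "2010-07-16"),
        (355, 56, "T", "2010-07-17"),
        # Hypothetical care_id (not in the question): two rows on the same date.
        (9999, 1, "C", "2011-01-01"),
        (9999, 2, "T", "2011-01-01"),
    ],
)

# Same shape as the query above; only the comparison operator varies.
QUERY = """
select t1.care_id, t1.event_id, t1.event_type, t1.event_date
from TheTable t1
join TheTable t2
  on t1.care_id = t2.care_id
 and t1.event_date {op} t2.event_date
group by t1.care_id, t1.event_id, t1.event_type, t1.event_date
having count(*) = 1
order by t1.care_id
"""

earliest = conn.execute(QUERY.format(op=">=")).fetchall()
latest = conn.execute(QUERY.format(op="<=")).fetchall()

print(earliest)
print(latest)
```

In this run the >= form keeps the earliest row per care_id and the <= form keeps the latest. The tied care_id 9999 is missing from both results, which is the caveat described above.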

Try this, assuming the top row is the one with the earliest date:
select x.care_id,min(x.event_date) as FirstDate
from <table> x
group by x.care_id
To get all the columns, you need a bit more:
select x.care_id,a.event_id,a.event_type,x.firstDate as Event_date
from <table> a
join (select b.care_id,min(b.event_date) as FirstDate
from <table> b
group by b.care_id ) x
on a.care_id=x.care_id and a.event_date=x.firstDate
I typed this on the fly, so it is untested, but it should get you what you need.
One caveat: if a care_id has several rows with the same earliest event date, you will get one output row for each of them.
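The duplicate-row caveat can be seen, and one possible workaround tried, with a small sketch. It uses Python's sqlite3 module as a stand-in for SQL Server 2000 and sticks to joins on derived tables, which SQL Server 2000 also accepts. The table name events, the tied care_id 9999, and the choice of the lowest event_id as tie-breaker are all assumptions made for illustration; the tie-breaker only works if event_id is unique within a care_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (care_id INT, event_id INT, event_type TEXT, event_date TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        (78, 11, "C", "2009-07-27"),
        (78, 9, "T", "2009-07-28"),
        (631, 74, "T", "2010-01-15"),
        (631, 148, "C", "2010-02-02"),
        # Hypothetical care_id with two rows on the same earliest date.
        (9999, 2, "T", "2011-01-01"),
        (9999, 1, "C", "2011-01-01"),
    ],
)

# Join back to the per-care_id minimum date: ties produce several rows.
plain = conn.execute("""
select a.care_id, a.event_id, a.event_type, a.event_date
from events a
join (select care_id, min(event_date) as first_date
      from events
      group by care_id) x
  on a.care_id = x.care_id and a.event_date = x.first_date
order by a.care_id, a.event_id
""").fetchall()

# Second level: among the rows on the earliest date, keep the lowest event_id.
tie_broken = conn.execute("""
select a.care_id, a.event_id, a.event_type, a.event_date
from events a
join (select b.care_id, min(b.event_id) as first_event_id
      from events b
      join (select care_id, min(event_date) as first_date
            from events
            group by care_id) x
        on b.care_id = x.care_id and b.event_date = x.first_date
      group by b.care_id) y
  on a.care_id = y.care_id and a.event_id = y.first_event_id
order by a.care_id
""").fetchall()

print(plain)
print(tie_broken)
```

The first query returns both rows for care_id 9999, while the second returns exactly one row per care_id.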

Related

How to create a new SQL table with Mean, Median, and Mode?

OK, so I am new to SQL, which is why I am asking this question.
I have got a table called: kpi_notification_metrics_per_month
This table has 2 columns:
Date
NotificationCount
I want to create a brand new table that will show
Mean
Median
Mode
For the NotificationCount column.
Example table:
Date NotificationCount
01/04/2018 00:00 0
31/03/2018 00:00 0
25/03/2018 00:00 0
24/03/2018 00:00 0
22/03/2018 00:00 0
18/03/2018 00:00 0
17/03/2018 00:00 0
14/03/2018 00:00 0
11/03/2018 00:00 0
07/04/2018 00:00 1
26/03/2018 00:00 1
21/03/2018 00:00 1
15/03/2018 00:00 1
13/03/2018 00:00 1
12/03/2018 00:00 1
10/03/2018 00:00 1
08/04/2018 00:00 2
30/03/2018 00:00 2
09/03/2018 00:00 2
08/03/2018 00:00 2
20/03/2018 00:00 3
19/03/2018 00:00 4
02/04/2018 00:00 9
23/03/2018 00:00 11
27/03/2018 00:00 22
03/04/2018 00:00 28
28/03/2018 00:00 34
04/04/2018 00:00 39
05/04/2018 00:00 43
29/03/2018 00:00 47
06/04/2018 00:00 50
16/03/2018 00:00 140
Expected results:
Mean Median Mode
13.90625 1 0
Here is how to do this in Oracle:
select
avg(notificationcount) as statistic_mean,
median(notificationcount) as statistic_median,
stats_mode(notificationcount) as statistic_mode
from mytable;
No need for another table. You can (and should) always query the data ad hoc. For convenience you can create a view as jarlh has suggested in the request comments.
Mean: Use Avg()
Select Avg(NotificationCount)
From kpi_notification_metrics_per_month
Median: take the TOP 50 PERCENT ordered ascending and the TOP 50 PERCENT ordered descending, then average the two values where the halves meet.
Select ((
Select Top 1 NotificationCount
From (
Select Top 50 Percent NotificationCount
From kpi_notification_metrics_per_month
Where NotificationCount Is NOT NULL
Order By NotificationCount
) As A
Order By NotificationCount DESC) +
(
Select Top 1 NotificationCount
From (
Select Top 50 Percent NotificationCount
From kpi_notification_metrics_per_month
Where NotificationCount Is NOT NULL
Order By NotificationCount DESC
) As A
Order By NotificationCount Asc)) / 2
Mode: count the rows for each value and take the top row (with ties) ordered by that count descending.
SELECT TOP 1 with ties NotificationCount
FROM kpi_notification_metrics_per_month
WHERE NotificationCount IS Not NULL
GROUP BY NotificationCount
ORDER BY COUNT(*) DESC
All of these worked in SQL Server 2014.
Reference: http://blogs.lessthandot.com/index.php/datamgmt/datadesign/calculating-mean-median-and-mode-with-sq/
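As a cross-check of the three statistics against the question's sample data, here is a rough sketch using Python's sqlite3 module. It is not the SQL Server code above: the table name metrics is made up, and LIMIT stands in for TOP 50 PERCENT (which rounds the row count up). It also shows one thing to watch for: with an integer column, (low + high) / 2 is integer division, which would explain the median of 1 in the expected results, whereas dividing by 2.0 gives 1.5 for this data. As far as I know, SQL Server's AVG on an int column is likewise truncated to an int unless the column is cast first.

```python
import sqlite3

# The 32 NotificationCount values from the question's example table.
counts = [0] * 9 + [1] * 7 + [2] * 4 + [3, 4, 9, 11, 22, 28, 34, 39, 43, 47, 50, 140]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (NotificationCount INT)")  # hypothetical table name
conn.executemany("INSERT INTO metrics VALUES (?)", [(c,) for c in counts])

mean = conn.execute("SELECT AVG(NotificationCount) FROM metrics").fetchone()[0]

# Emulate TOP 50 PERCENT: half the rows, rounded up.
n = conn.execute("SELECT COUNT(*) FROM metrics").fetchone()[0]
half = (n + 1) // 2

# Largest value of the lower half and smallest value of the upper half.
low = conn.execute(
    "SELECT MAX(NotificationCount) FROM "
    "(SELECT NotificationCount FROM metrics ORDER BY NotificationCount ASC LIMIT ?)",
    (half,),
).fetchone()[0]
high = conn.execute(
    "SELECT MIN(NotificationCount) FROM "
    "(SELECT NotificationCount FROM metrics ORDER BY NotificationCount DESC LIMIT ?)",
    (half,),
).fetchone()[0]

median_int = conn.execute("SELECT (? + ?) / 2", (low, high)).fetchone()[0]     # integer division
median_real = conn.execute("SELECT (? + ?) / 2.0", (low, high)).fetchone()[0]  # decimal division

mode = conn.execute(
    "SELECT NotificationCount FROM metrics "
    "GROUP BY NotificationCount ORDER BY COUNT(*) DESC LIMIT 1"
).fetchone()[0]

print(mean, median_int, median_real, mode)
```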

MAX value of column with corresponding columns

I am using an old SQL Server 2000.
Here is some sample data:
ROOMDATE rate bus_id quantity
2018-09-21 00:00:00.000 129 346686 2
2018-09-21 00:00:00.000 162 354247 36
2018-09-21 00:00:00.000 159 382897 150
2018-09-21 00:00:00.000 120 556111 25
2018-09-22 00:00:00.000 129 346686 8
2018-09-22 00:00:00.000 162 354247 86
2018-09-22 00:00:00.000 159 382897 150
2018-09-22 00:00:00.000 120 556111 25
2018-09-23 00:00:00.000 129 346686 23
2018-09-23 00:00:00.000 162 354247 146
2018-09-23 00:00:00.000 159 382897 9
2018-09-23 00:00:00.000 94 570135 23
Essentially, what I want is the MAX quantity for each day, with its corresponding rate and bus_id.
For example, I would want the following rows from my sample data above:
ROOMDATE rate bus_id quantity
2018-09-21 00:00:00.000 159 382897 150
2018-09-22 00:00:00.000 159 382897 150
2018-09-23 00:00:00.000 162 354247 146
From what I have read, SQL Server 2000 did not support ROW_NUMBER. But we can phrase your query using a subquery which finds the max quantity for each day:
SELECT t1.*
FROM yourTable t1
INNER JOIN
(
SELECT
CONVERT(char(10), ROOMDATE, 120) AS ROOMDATE,
MAX(quantity) AS max_quantity
FROM yourTable
GROUP BY CONVERT(char(10), ROOMDATE, 120)
) t2
ON CONVERT(char(10), t1.ROOMDATE, 120) = t2.ROOMDATE AND
t1.quantity = t2.max_quantity
ORDER BY
t1.ROOMDATE;
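The join against a per-day MAX can be tried on the question's twelve sample rows with the sketch below, which uses Python's sqlite3 module as a stand-in for SQL Server 2000. The table name bookings is made up. Because every ROOMDATE in the sample has a midnight time part, this version groups on ROOMDATE directly instead of converting it to a string; that is an assumption about the data, and the CONVERT form is still needed if the column can hold other times of day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "bookings" is a hypothetical name standing in for yourTable.
conn.execute("CREATE TABLE bookings (ROOMDATE TEXT, rate INT, bus_id INT, quantity INT)")
conn.executemany(
    "INSERT INTO bookings VALUES (?, ?, ?, ?)",
    [
        ("2018-09-21 00:00:00.000", 129, 346686, 2),
        ("2018-09-21 00:00:00.000", 162, 354247, 36),
        ("2018-09-21 00:00:00.000", 159, 382897, 150),
        ("2018-09-21 00:00:00.000", 120, 556111, 25),
        ("2018-09-22 00:00:00.000", 129, 346686, 8),
        ("2018-09-22 00:00:00.000", 162, 354247, 86),
        ("2018-09-22 00:00:00.000", 159, 382897, 150),
        ("2018-09-22 00:00:00.000", 120, 556111, 25),
        ("2018-09-23 00:00:00.000", 129, 346686, 23),
        ("2018-09-23 00:00:00.000", 162, 354247, 146),
        ("2018-09-23 00:00:00.000", 159, 382897, 9),
        ("2018-09-23 00:00:00.000", 94, 570135, 23),
    ],
)

# Join each row to its day's maximum quantity and keep only the matching rows.
result = conn.execute("""
SELECT t1.ROOMDATE, t1.rate, t1.bus_id, t1.quantity
FROM bookings t1
INNER JOIN (
    SELECT ROOMDATE, MAX(quantity) AS max_quantity
    FROM bookings
    GROUP BY ROOMDATE
) t2
  ON t1.ROOMDATE = t2.ROOMDATE
 AND t1.quantity = t2.max_quantity
ORDER BY t1.ROOMDATE
""").fetchall()

for row in result:
    print(row)
```

If two rows on the same day share the maximum quantity, both are returned, since the join has no tie-breaker.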

PostgreSQL: How do I join two tables based on same start and end time (timestamp without time zone)?

Okay, I came across this relevant question but it is slightly different than my case.
Problem
I have two similar tables in my PostgreSQL 9.5 database, tbl1 and tbl2, each containing 1,274 rows. The structure and layout of table 1 is as follows:
Table 1:
id (integer) start_time end_time my_val1 (numeric)
51 1994-09-26 16:50:00 1994-10-29 13:30:00 3.7
52 1994-10-29 13:30:00 1994-11-27 12:30:00 2.4
53 1994-11-27 12:30:00 1994-12-29 09:25:00 7.6
54 1994-12-29 09:25:00 1994-12-31 23:59:59 2.9
54 1995-01-01 00:00:00 1995-02-05 13:50:00 2.9
55 1995-02-05 13:50:00 1995-03-12 11:10:00 1.6
56 1995-03-12 11:10:00 1995-04-11 09:05:00 2.2
171 1994-10-29 16:15:00 1994-11-27 19:10:00 6.9
172 1994-11-27 19:10:00 1994-12-29 11:40:00 4.2
173 1994-12-29 11:40:00 1994-12-31 23:59:59 6.7
173 1995-01-01 00:00:00 1995-02-05 15:30:00 6.7
174 1995-02-05 15:30:00 1995-03-12 09:45:00 3.2
175 1995-03-12 09:45:00 1995-04-11 11:30:00 1.2
176 1995-04-11 11:30:00 1995-05-11 15:30:00 2.7
321 1994-09-26 14:40:00 1994-10-30 14:30:00 0.2
322 1994-10-30 14:30:00 1994-11-27 14:45:00 7.8
323 1994-11-27 14:45:00 1994-12-29 14:20:00 4.6
324 1994-12-29 14:20:00 1994-12-31 23:59:59 4.1
324 1995-01-01 00:00:00 1995-02-05 14:35:00 4.1
325 1995-02-05 14:35:00 1995-03-12 11:30:00 8.2
326 1995-03-12 11:30:00 1995-04-11 09:45:00 1.2
.....
In some rows, start_time and end_time may look similar, but the whole time window may not be equal. For example,
id (integer) start_time end_time my_val1 (numeric)
54 1994-12-29 09:25:00 1994-12-31 23:59:59 2.9
173 1994-12-29 11:40:00 1994-12-31 23:59:59 6.7
Both start_time and end_time are timestamp without time zone. Each window has to fall within a single calendar year, so whenever a window crossed from 1994 into 1995 the row was split in two; that is why some IDs repeat in the id column. Table 2 (tbl2) contains similar start_time and end_time columns (timestamp without time zone) and a column my_val2 (numeric). For each row in table 1, I need to join the corresponding row of table 2 where start_time and end_time are similar.
What I have tried:
Select
a.id,
a.start_time, a.end_time,
a.my_val1,
b.my_val2
from tbl1 a
left join tbl2 b on
b.start_time = a.start_time
order by a.id;
The query returned 3,802 rows, which is not what I want. The desired result is the 1,274 rows of table 1, each joined with my_val2. I am aware of the Postgres DISTINCT ON clause, but I need to keep all the repeating ids of tbl1 and only add my_val2 from tbl2. Do I need a Postgres window function here? Can someone suggest how to join these two tables?
Why don't you add this condition to the ON clause?
ON b.start_time = a.start_time AND a.id = b.id
"For each row in table 1 I need to join corresponding row of table 2 where start_time and end_time are similar."
Given that requirement, the join condition should include end_time as well:
SELECT a.id,
a.start_time,
a.end_time,
a.my_val1,
b.my_val2
FROM tbl1 a
LEFT JOIN tbl2 b
ON b.start_time = a.start_time
AND b.end_time = a.end_time
ORDER BY a.id;
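To illustrate why matching on start_time alone multiplies rows, here is a small sketch using Python's sqlite3 module in place of PostgreSQL. The three tbl1 rows are the 1995-01-01 windows from the question, which share a start_time but have different end_time values; the tbl2 rows and their my_val2 values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl1 (id INT, start_time TEXT, end_time TEXT, my_val1 REAL)")
conn.execute("CREATE TABLE tbl2 (start_time TEXT, end_time TEXT, my_val2 REAL)")

conn.executemany(
    "INSERT INTO tbl1 VALUES (?, ?, ?, ?)",
    [
        (54, "1995-01-01 00:00:00", "1995-02-05 13:50:00", 2.9),
        (173, "1995-01-01 00:00:00", "1995-02-05 15:30:00", 6.7),
        (324, "1995-01-01 00:00:00", "1995-02-05 14:35:00", 4.1),
    ],
)
# Hypothetical tbl2 content: same windows, made-up my_val2 values.
conn.executemany(
    "INSERT INTO tbl2 VALUES (?, ?, ?)",
    [
        ("1995-01-01 00:00:00", "1995-02-05 13:50:00", 10.0),
        ("1995-01-01 00:00:00", "1995-02-05 15:30:00", 20.0),
        ("1995-01-01 00:00:00", "1995-02-05 14:35:00", 30.0),
    ],
)

JOIN = """
SELECT a.id, a.start_time, a.end_time, a.my_val1, b.my_val2
FROM tbl1 a
LEFT JOIN tbl2 b
  ON b.start_time = a.start_time {extra}
ORDER BY a.id, b.my_val2
"""

# Every tbl1 row matches all three tbl2 rows with the same start_time.
start_only = conn.execute(JOIN.format(extra="")).fetchall()
# Matching the whole window pairs each tbl1 row with exactly one tbl2 row.
both = conn.execute(JOIN.format(extra="AND b.end_time = a.end_time")).fetchall()

print(len(start_only), len(both))
```

With start_time only, the three rows fan out to nine; with both columns, the result has three rows. This only holds if each (start_time, end_time) pair is unique in tbl2, which the question does not state.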

Want to get sum till date for past 30 days

The following is the code I wrote to get the records:
SELECT run_time, SUM(rec_cnt) reg_cnt FROM(
select run_time,rec_cnt from
(select TO_DATE(TO_CHAR(LST_UPDT_TIME,'DD-MON-YYYY'),'DD-MON-YYYY') run_time,max(Running_Total) rec_cnt from (
SELECT
LST_UPDT_TIME,
(
SELECT COUNT(*)
FROM DM_REG_SMRY T2
WHERE T2.LST_UPDT_TIME <= T1.LST_UPDT_TIME AND REG_STS_ID = 14
) AS Running_Total
FROM
DM_REG_SMRY T1
order by T1.LST_UPDT_TIME
)
group by TO_DATE(TO_CHAR(LST_UPDT_TIME,'DD-MON-YYYY'),'DD-MON-YYYY')
order by TO_DATE(TO_CHAR(LST_UPDT_TIME,'DD-MON-YYYY'),'DD-MON-YYYY')
)
UNION
(SELECT TRUNC(SYSDATE+1 - ROWNUM) run_time , 0 as rec_cnt FROM DUAL CONNECT BY ROWNUM <= 30)
)GROUP BY run_time
ORDER BY run_time;
I got the following output:
18-06-2015 00:00 6
19-06-2015 00:00 7
20-06-2015 00:00 0
21-06-2015 00:00 0
22-06-2015 00:00 0
23-06-2015 00:00 0
24-06-2015 00:00 12
25-06-2015 00:00 0
26-06-2015 00:00 0
27-06-2015 00:00 0
28-06-2015 00:00 0
29-06-2015 00:00 0
30-06-2015 00:00 0
01-07-2015 00:00 0
02-07-2015 00:00 0
03-07-2015 00:00 49
04-07-2015 00:00 0
05-07-2015 00:00 0
06-07-2015 00:00 0
07-07-2015 00:00 0
08-07-2015 00:00 0
09-07-2015 00:00 0
10-07-2015 00:00 49
11-07-2015 00:00 0
12-07-2015 00:00 0
13-07-2015 00:00 65
14-07-2015 00:00 77
15-07-2015 00:00 101
16-07-2015 00:00 0
17-07-2015 00:00 0
But I want the last non-zero value to be repeated wherever a zero appears.
Please help.
I'm not 100% sure, but if I understand correctly, you want to count the rows in DM_REG_SMRY with REG_STS_ID = 14 accumulated over the past 30 days (starting with SYSDATE-(30-1) and ending today, SYSDATE-(1-1)), and you want that accumulated count reported per date.
This means that if you have on DM_REG_SMRY (with REG_STS_ID=14):
6 rows on 18-06-2015
1 row on 19-06-2015
5 rows on 24-06-2015
37 rows on 03-07-2015
16 rows on 13-07-2015
12 rows on 14-07-2015
24 rows on 15-07-2015
you really want this result:
18-06-2015 00:00 6
19-06-2015 00:00 7
20-06-2015 00:00 7
21-06-2015 00:00 7
22-06-2015 00:00 7
23-06-2015 00:00 7
24-06-2015 00:00 12
25-06-2015 00:00 12
26-06-2015 00:00 12
27-06-2015 00:00 12
28-06-2015 00:00 12
29-06-2015 00:00 12
30-06-2015 00:00 12
01-07-2015 00:00 12
02-07-2015 00:00 12
03-07-2015 00:00 49
04-07-2015 00:00 49
05-07-2015 00:00 49
06-07-2015 00:00 49
07-07-2015 00:00 49
08-07-2015 00:00 49
09-07-2015 00:00 49
10-07-2015 00:00 49
11-07-2015 00:00 49
12-07-2015 00:00 49
13-07-2015 00:00 65
14-07-2015 00:00 77
15-07-2015 00:00 101
16-07-2015 00:00 101
17-07-2015 00:00 101
If this is what you really want then a possible solution is:
SELECT runtime,
(SELECT COUNT(*)
FROM DM_REG_SMRY
WHERE LST_UPDT_TIME >= TRUNC(SYSDATE - (30-1))
AND LST_UPDT_TIME < (runtime+1)
AND REG_STS_ID = 14
) reg_cnt
FROM
(
SELECT TRUNC(SYSDATE - (LEVEL - 1)) runtime
FROM DUAL
CONNECT BY LEVEL <= 30
) dates
ORDER BY runtime;
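The idea behind this query (one row per calendar day, each with a count of everything from the start of the window up to the end of that day) can be tried with the sketch below. It uses Python's sqlite3 module rather than Oracle, so several things are stand-ins: a dates table filled from Python replaces CONNECT BY LEVEL, a fixed date of 2015-07-17 replaces SYSDATE, and the 09:30:00 time of day on the inserted rows is arbitrary. The daily row counts are the ones listed in the answer above.

```python
import sqlite3
from datetime import date, timedelta

today = date(2015, 7, 17)  # fixed stand-in for SYSDATE so the run is repeatable
window_start = today - timedelta(days=29)

# Rows per day with REG_STS_ID = 14, as listed in the answer.
rows_per_day = {
    "2015-06-18": 6, "2015-06-19": 1, "2015-06-24": 5, "2015-07-03": 37,
    "2015-07-13": 16, "2015-07-14": 12, "2015-07-15": 24,
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DM_REG_SMRY (LST_UPDT_TIME TEXT, REG_STS_ID INT)")
for day, n in rows_per_day.items():
    # Arbitrary time of day; only the date part matters for the daily totals.
    conn.executemany(
        "INSERT INTO DM_REG_SMRY VALUES (?, 14)", [(day + " 09:30:00",)] * n
    )

# Stand-in for the CONNECT BY LEVEL <= 30 date generator.
conn.execute("CREATE TABLE dates (runtime TEXT)")
conn.executemany(
    "INSERT INTO dates VALUES (?)",
    [((today - timedelta(days=i)).isoformat(),) for i in range(30)],
)

# For each day, count rows from the window start up to the end of that day.
result = conn.execute(
    """
    SELECT d.runtime,
           (SELECT COUNT(*)
            FROM DM_REG_SMRY s
            WHERE s.LST_UPDT_TIME >= ?
              AND s.LST_UPDT_TIME < date(d.runtime, '+1 day')
              AND s.REG_STS_ID = 14) AS reg_cnt
    FROM dates d
    ORDER BY d.runtime
    """,
    (window_start.isoformat(),),
).fetchall()

for runtime, reg_cnt in result:
    print(runtime, reg_cnt)
```

Because each day's count is computed from the window start rather than from that day's rows alone, days with no new rows repeat the previous total instead of showing 0.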

SQL Server 2000 - Row of data based on closest date

I have two tables as below and I want to return the rows for CARE_ID and WHO_STATUS where the MDT_DATE is the closest date that is <= the earliest SURGERY_DATE for each CARE_ID.
For instance for CARE_ID 5 the closest MDT_DATE which is <= the earliest SURGERY_DATE of 18/07/2009 is 17/07/2009 so the WHO_STATUS would be 2, and so on.
The script below works fine in SQL Server 2005 but it isn't backwards compatible with SQL Server 2000.
How could I rework this script so it will run in SQL Server 2000?
CARE_ID SURGERY_DATE
5 18/07/2009 00:00
5 23/07/2009 00:00
5 23/07/2009 00:00
5 23/07/2009 00:00
5 01/09/2009 00:00
5 03/09/2009 00:00
70 20/07/2009 00:00
70 21/07/2009 00:00
76 03/03/2010 00:00
78 08/07/2009 00:00
81 27/07/2009 00:00
82 27/07/2009 00:00
83 30/07/2009 00:00
86 29/07/2009 00:00
91 30/07/2009 00:00
103 03/08/2009 00:00
106 05/08/2009 00:00
125 07/08/2009 00:00
172 19/05/2010 00:00
CARE_ID MDT_DATE WHO_STATUS
5 17/07/2009 00:00 2
5 03/11/2009 00:00 1
70 23/03/2010 00:00 0
81 03/11/2009 00:00 1
81 18/11/2009 00:00 1
81 27/11/2009 00:00 3
81 27/03/2010 00:00 1
103 03/12/2008 00:00 0
103 04/01/2009 00:00 2
103 06/01/2010 00:00 1
103 08/02/2010 00:00 1
103 14/01/2013 00:00 1
172 20/07/2009 00:00 4
172 08/01/2010 00:00 3
172 25/09/2010 00:00 1
The query (working in SQL Server 2005):
SELECT t1.*,t2.WHO_STATUS
FROM (SELECT ROW_NUMBER() OVER (PARTITION BY CARE_ID ORDER BY SURGERY_DATE) AS Seq,*
FROM Table1)t1
CROSS APPLY(SELECT TOP 1 WHO_STATUS FROM Table2
WHERE CARE_ID = t1.CARE_ID
AND MDT_DATE < = t1.SURGERY_DATE
ORDER BY MDT_DATE DESC)t2
WHERE t1.Seq=1
You can use a correlated subquery for this:
select t1.*,
(select top 1 who_status
from table2 t2
where t2.care_id = t1.care_id and
t2.mdt_date <= t1.surgery_date
order by t2.mdt_date desc
) as who_status
from Table1 t1;
This will also work in SQL Server 2005.
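Note that the correlated subquery is evaluated for every row of Table1, whereas the original query kept only the earliest SURGERY_DATE per CARE_ID (Seq = 1). One possible way to restore that, sketched here with Python's sqlite3 module as a stand-in for SQL Server 2000, is to aggregate Table1 to its earliest surgery date first. The alias first_surgery is made up, LIMIT 1 stands in for TOP 1, and the data is a subset of the question's rows with dates rewritten as ISO strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (care_id INT, surgery_date TEXT)")
conn.execute("CREATE TABLE Table2 (care_id INT, mdt_date TEXT, who_status INT)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [
        (5, "2009-07-18"), (5, "2009-07-23"), (5, "2009-09-01"),
        (70, "2009-07-20"), (70, "2009-07-21"),
        (103, "2009-08-03"),
        (172, "2010-05-19"),
    ],
)
conn.executemany(
    "INSERT INTO Table2 VALUES (?, ?, ?)",
    [
        (5, "2009-07-17", 2), (5, "2009-11-03", 1),
        (70, "2010-03-23", 0),
        (103, "2008-12-03", 0), (103, "2009-01-04", 2), (103, "2010-01-06", 1),
        (172, "2009-07-20", 4), (172, "2010-01-08", 3), (172, "2010-09-25", 1),
    ],
)

# Reduce Table1 to the earliest surgery per care_id, then look up the latest
# MDT row on or before that date. In SQL Server 2000 the inner query would use
# "select top 1 ... order by t2.mdt_date desc" instead of LIMIT 1.
result = conn.execute("""
select s.care_id,
       s.first_surgery,
       (select t2.who_status
        from Table2 t2
        where t2.care_id = s.care_id
          and t2.mdt_date <= s.first_surgery
        order by t2.mdt_date desc
        limit 1) as who_status
from (select care_id, min(surgery_date) as first_surgery
      from Table1
      group by care_id) s
order by s.care_id
""").fetchall()

for row in result:
    print(row)
```

Unlike CROSS APPLY, a scalar subquery keeps care_ids with no qualifying MDT_DATE and returns NULL for them (care_id 70 here), so a filter on who_status may be needed to match the original output.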