tsql proc logic help - sql

I am weak in SQL and need some help working through some logic with my proc.
There are three pieces: a stored procedure, table1, and table2.
Table 1 stores most recent data for specific IDs
Customer_id status_dte status_cde app_dte
001 2010-04-19 Y 2010-04-19
Table 2 stores history of data for specific customer IDs:
For example:
Log_id customer_Id status_dte status_cde
01 001 2010-04-20 N
02 001 2010-04-19 Y
03 001 2010-04-19 N
04 001 2010-04-19 Y
The stored procedure currently throws an error if the status date from
table1 is earlier than app_dte in table1.
If #status_dte < app_date
Error
Note: #status_dte is a variable stored as the status_dte from table1
However, I want it to throw an error when the EARLIEST status_dte from
table 2 with a status_cde of 'Y' is less than the app_dte column in
table 1.
Keep in mind that this earliest date is not stored anywhere, the history
of data changes per customer. Another customer might have the following
history.
Log_id customer_Id status_dte status_cde
01 002 2010-04-20 N
02 002 2010-04-18 N
03 002 2010-04-19 Y
04 002 2010-04-19 Y
Any ideas on how I can approach this?

You can test this in one go per customer, checking whether the earliest 'Y' status date is earlier than app_dte, using this construct:
IF EXISTS (SELECT *
FROM
mytable M
JOIN
HistoryTable H ON M.customer_Id = H.customer_Id
WHERE
H.status_cde = 'Y'
GROUP BY
H.customer_Id, M.app_dte
HAVING
MIN(H.status_dte) < M.app_dte)
...error...
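To see the MIN/HAVING check behave on the question's data, here is a rough sketch that runs it in SQLite through Python's sqlite3 module rather than in SQL Server, so the T-SQL IF EXISTS becomes SELECT EXISTS. The table names table1 and table2, the helper name earliest_y_before_app, and the table1 row for customer 002 are assumptions made for illustration only; the question gives no app_dte for that customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (customer_id TEXT, status_dte TEXT, status_cde TEXT, app_dte TEXT);
CREATE TABLE table2 (log_id TEXT, customer_id TEXT, status_dte TEXT, status_cde TEXT);

INSERT INTO table1 VALUES ('001', '2010-04-19', 'Y', '2010-04-19');
-- Hypothetical row: the question shows no table1 entry for customer 002.
INSERT INTO table1 VALUES ('002', '2010-04-20', 'N', '2010-04-20');

INSERT INTO table2 VALUES
    ('01', '001', '2010-04-20', 'N'),
    ('02', '001', '2010-04-19', 'Y'),
    ('03', '001', '2010-04-19', 'N'),
    ('04', '001', '2010-04-19', 'Y'),
    ('01', '002', '2010-04-20', 'N'),
    ('02', '002', '2010-04-18', 'N'),
    ('03', '002', '2010-04-19', 'Y'),
    ('04', '002', '2010-04-19', 'Y');
""")


def earliest_y_before_app(conn, customer_id):
    """Return True when the earliest 'Y' status date precedes app_dte."""
    row = conn.execute(
        """
        SELECT EXISTS (
            SELECT 1
            FROM table1 AS m
            JOIN table2 AS h ON h.customer_id = m.customer_id
            WHERE h.status_cde = 'Y'
              AND m.customer_id = ?
            GROUP BY h.customer_id, m.app_dte
            HAVING MIN(h.status_dte) < m.app_dte
        )
        """,
        (customer_id,),
    ).fetchone()
    return bool(row[0])


# ISO-formatted date strings compare correctly as text in SQLite.
flags = {cid: earliest_y_before_app(conn, cid) for cid in ("001", "002")}
print(flags)
```

Customer 001 is not flagged because its earliest 'Y' date equals app_dte. The 2010-04-18 row for customer 002 is ignored because its status_cde is 'N', so the comparison uses 2010-04-19.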

If instead of a single customer, you wanted a list of customers with their earliest status date prior to the app_date, you could do something like:
;With
CustomerStatusDates As
(
Select T2.customer_id, Min(T2.status_dte) As MinDate
From Table2 As T2
Where T2.status_cde = 'Y'
Group By T2.customer_id
)
Select ....
From Table1 As T1
Join CustomerStatusDates As T2
On T2.Customer_Id = T1.Customer_Id
Where T2.MinDate < T1.app_dte

How to get the MAX value of unique column in sql and aggregate other?

I want to get the row with the max 'date', grouping only by the unique 'id' and without aggregating the other columns.
I tried this query:
(But it doesn't work because it modifies the other columns)
SELECT id,
MAX(num),
MAX(date),-- I just want the max of this column
MAX(product_name),
MAX(other_columns)
FROM TB
GROUP BY id
Table:
id num date product_name other_columns
123 0001 2021-12-01 exit 12315413
123 0002 2021-12-02 entry 65481328
333 0001 2021-12-03 entry 13848136
333 ASDV 2021-12-04 exit 1325165
Expected Result:
id num date product_name
123 0002 2021-12-02 entry
333 ASDV 2021-12-04 exit
How to do that?
A subquery with an inner join takes care of this in a fairly DBMS-agnostic way:
SELECT
t.ID
,t.num
,t.date
,t.product_name
,t.other_columns
FROM tb as t
INNER JOIN (
SELECT
id
,MAX(date) as date
FROM tb
GROUP BY id
) as s on t.id = s.id and t.date = s.date
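As a quick check of the join-back approach, the sketch below loads the question's sample rows into an in-memory SQLite database via Python's sqlite3 (a stand-in for whatever DBMS is actually in use) and selects the columns from the expected result. The column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tb (id INTEGER, num TEXT, date TEXT, product_name TEXT, other_columns INTEGER);
INSERT INTO tb VALUES
    (123, '0001', '2021-12-01', 'exit',  12315413),
    (123, '0002', '2021-12-02', 'entry', 65481328),
    (333, '0001', '2021-12-03', 'entry', 13848136),
    (333, 'ASDV', '2021-12-04', 'exit',  1325165);
""")

# Find the max date per id, then join back to fetch the whole matching row.
rows = conn.execute("""
    SELECT t.id, t.num, t.date, t.product_name
    FROM tb AS t
    INNER JOIN (
        SELECT id, MAX(date) AS date
        FROM tb
        GROUP BY id
    ) AS s ON t.id = s.id AND t.date = s.date
    ORDER BY t.id
""").fetchall()

for row in rows:
    print(row)
```

If two rows share the same id and the same max date, this returns both of them, so a tie-breaker is needed when dates are not unique per id.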

Join records only on first match

I'm trying to join two tables. I only want the first matching row to be joined; the others have to be null.
One of the tables contains daily records per User and the second table contains the goal for each user and day.
The joined result table should only join the first occurrence of User and Day and set the others to null. The Goal in the joined table can be interpreted as DailyGoal.
Example:
Table1 Table2
Id Day User Value Id Day User Goal
================================ ============================
01 01/01/2020 Bob 100 01 01/01/2020 Bob 300
02 01/01/2020 Bob 150 02 02/01/2020 Carl 170
03 01/01/2020 Bob 50
04 02/01/2020 Carl 200
05 02/01/2020 Carl 30
ResultTable
Day User Value Goal
============================================
01/01/2020 Bob 100 300
01/01/2020 Bob 150 (null)
01/01/2020 Bob 50 (null)
02/01/2020 Carl 200 170
02/01/2020 Carl 30 (null)
I tried using TOP 1, DISTINCT, and subqueries, but I can't find a way to do it. Is this possible?
One option uses window functions:
select t1.*, t2.goal
from (
select t1.*,
row_number() over(partition by day, user order by id) as rn
from table1 t1
) t1
left join table2 t2 on t2.day = t1.day and t2.user = t1.user and t1.rn = 1
A case expression is even simpler:
select t1.*,
case when row_number() over(partition by t1.day, t1.user order by t1.id) = 1
then t2.goal
end as goal
from table1 t1
left join table2 t2 on t2.day = t1.day and t2.user = t1.user
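Here is a small sketch of the case-expression variant run against the example data, using SQLite (3.25 or later, for window functions) through Python's sqlite3. It assumes table2 is left-joined on day and user, and it quotes the day and user column names in case they are reserved words in the target DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER, "day" TEXT, "user" TEXT, value INTEGER);
CREATE TABLE table2 (id INTEGER, "day" TEXT, "user" TEXT, goal INTEGER);
INSERT INTO table1 VALUES
    (1, '01/01/2020', 'Bob',  100),
    (2, '01/01/2020', 'Bob',  150),
    (3, '01/01/2020', 'Bob',   50),
    (4, '02/01/2020', 'Carl', 200),
    (5, '02/01/2020', 'Carl',  30);
INSERT INTO table2 VALUES
    (1, '01/01/2020', 'Bob',  300),
    (2, '02/01/2020', 'Carl', 170);
""")

# Every table1 row matches its goal row, but the goal is only kept for the
# first row (lowest id) of each day/user group; the others get NULL.
rows = conn.execute("""
    SELECT t1."day", t1."user", t1.value,
           CASE WHEN ROW_NUMBER() OVER (
                    PARTITION BY t1."day", t1."user" ORDER BY t1.id) = 1
                THEN t2.goal
           END AS goal
    FROM table1 AS t1
    LEFT JOIN table2 AS t2
           ON t2."day" = t1."day" AND t2."user" = t1."user"
    ORDER BY t1.id
""").fetchall()

for row in rows:
    print(row)
```

"First" here means the lowest id within each day and user, which matches the result table in the question.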

Select Most Recent Entry in SQL

I'm trying to select the most recent non zero entry from my data set in SQL. Most examples of this are satisfied with returning only the date and the group by variables, but I would also like to return the relevant Value. For example:
ID Date Value
----------------------------
001 2014-10-01 32
001 2014-10-05 10
001 2014-10-17 0
002 2014-10-03 17
002 2014-10-20 60
003 2014-09-30 90
003 2014-10-10 7
004 2014-10-06 150
005 2014-10-17 0
005 2014-10-18 9
Using
SELECT ID, MAX(Date) AS MDate FROM Table WHERE Value > 0 GROUP BY ID
Returns:
ID Date
-------------------
001 2014-10-05
002 2014-10-20
003 2014-10-10
004 2014-10-06
005 2014-10-18
But whenever I try to include Value as one of the selected columns, SQL Server raises an error:
"Column 'Value' is invalid in the select list because it is not
contained in either an aggregate function or the GROUP BY clause."
My desired result would be:
ID Date Value
----------------------------
001 2014-10-05 10
002 2014-10-20 60
003 2014-10-10 7
004 2014-10-06 150
005 2014-10-18 9
One solution I have thought of would be to look up the results back in the original Table and return the Value that corresponds to the relevant ID & Date (I have already trimmed down and so I know these are unique), but this seems to me like a messy solution. Any help on this would be appreciated.
NOTE: I do not want to group by Value as this is the result I am trying to pull out in the end (i.e. for each ID, I want the most recent Value). Further Example:
ID Date Value
----------------------------
001 2014-10-05 10
001 2014-10-06 10
001 2014-10-10 10
001 2014-10-12 8
001 2014-10-18 0
Here, I only want the last non zero entry. (001, 2014-10-12, 8)
SELECT ID, MAX(Date) AS MDate, Value FROM Table WHERE Value > 0 GROUP BY ID, Value
Would return:
ID Date Value
----------------------------
001 2014-10-10 10
001 2014-10-12 8
This can also be done using a window function, which is very often faster than a join on a grouped query:
select id, date, value
from (
select id,
date,
value,
row_number() over (partition by id order by date desc) as rn
from the_table
where value > 0
) t
where rn = 1
order by id;
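To check the window-function approach against the question's first data set, here is a sketch using SQLite (3.25 or later) through Python's sqlite3 as a stand-in for SQL Server. It assumes zero-valued rows are filtered out before numbering, since the question asks for the most recent non-zero entry.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE the_table (id TEXT, date TEXT, value INTEGER);
INSERT INTO the_table VALUES
    ('001', '2014-10-01', 32),
    ('001', '2014-10-05', 10),
    ('001', '2014-10-17', 0),
    ('002', '2014-10-03', 17),
    ('002', '2014-10-20', 60),
    ('003', '2014-09-30', 90),
    ('003', '2014-10-10', 7),
    ('004', '2014-10-06', 150),
    ('005', '2014-10-17', 0),
    ('005', '2014-10-18', 9);
""")

# Number the non-zero rows of each id from newest to oldest, keep the newest.
rows = conn.execute("""
    SELECT id, date, value
    FROM (
        SELECT id, date, value,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY date DESC) AS rn
        FROM the_table
        WHERE value > 0
    ) AS t
    WHERE rn = 1
    ORDER BY id
""").fetchall()

for row in rows:
    print(row)
```

Without the value > 0 filter, id 001 would come back with its 2014-10-17 row and a value of 0.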
Assuming you don't have repeated dates for the same ID in the table, this should work:
SELECT A.ID, A.Date, A.Value
FROM
T1 AS A
INNER JOIN (SELECT ID,MAX(Date) AS Date FROM T1 WHERE Value > 0 GROUP BY ID) AS B
ON A.ID = B.ID AND A.Date = B.Date
select a.id, a.date, a.value from Table1 a inner join (
select id, max(date) mydate from table1
where Value>0 group by ID) b on a.ID=b.ID and a.Date=b.mydate
Using a correlated subquery:
SELECT ID, Date AS MDate, VALUE
FROM table t1
where date = (Select max(date)
from table t2
where Value >0
and t1.id = t2.id
)
The answers provided are perfectly adequate, but here is one using a CTE:
;WITH cteTable
AS
(
SELECT
Table.ID [ID], MAX(Date) [MaxDate]
FROM
Table
WHERE
Table.Value > 0
GROUP BY
Table.ID
)
SELECT
cteTable.ID, cteTable.MaxDate, Table.Value
FROM
Table INNER JOIN cteTable ON (Table.ID = cteTable.ID AND Table.Date = cteTable.MaxDate)

How Do I Select All Parents and the Top Previous Child Record Based on Dates in SQL Server 2008

I'm using a vendor provided database running on SQL Server 2008. There are two tables that track tests. For every record in Table A there may be zero, one or multiple records in Table B. There can also be multiple tests in Table A for the same user. The relationship is TableA.UserID = TableB.UserID. Tests taken in Table B can occur before or after Table A.
I need to select all of the records in Table A and, if test(s) from Table B have been taken by the same user before the test in Table A, data from Table B but only from the last previous child record. Both tables are structured similarly:
TABLE A
TestID INTEGER PRIMARY KEY,
UserID INTEGER,
TestDate DATE,
Score INTEGER
TABLE B
TestID INTEGER PRIMARY KEY,
UserID INTEGER,
TestDate Date,
Score INTEGER
Sample Data
TABLE A
TestID UserID TestDate Score
1 100 2014-02-15 80
2 101 2014-02-20 100
3 102 2014-02-22 90
4 102 2014-03-10 70
TABLE B
TestID UserID TestDate Score
1000 100 2014-02-01 55
1007 100 2014-02-05 85
1012 100 2014-02-20 95
1034 102 2014-02-12 65
1205 102 2014-03-05 75
1986 101 2014-03-10 45
What I'd like returned would be:
UserID TestA_ID TestADate TestAScore TestB_ID TestBDate TestBScore
100 1 2014-02-15 80 1007 2014-02-05 85
101 2 2014-02-20 100 NULL NULL NULL
102 3 2014-02-22 90 1034 2014-02-12 65
102 4 2014-03-10 70 1205 2014-03-05 75
I know how to get all of the previous Table B rows joined to the Table A rows by using a LEFT OUTER JOIN and filtering by date in the WHERE clause, and I know how to get the top row from Table B, but I haven't been able to work out how to get the top child record that occurs before the date of the record in Table A. Any help would be appreciated. Thanks.
You can do this using OUTER APPLY in T-SQL.
For each record in TableA, we're looking for a record in TableB for the same user but with a test date prior to the test date in TableA and we're also ordering the test in TableB to ensure we're getting the most recent test from TableB (but still prior to the test date from TableA).
SELECT
A.[UserID],
A.[TestID] [TestA_ID],
A.[TestDate] [TestADate],
A.[Score] [TestAScore],
B.[TestB_ID],
B.[TestBDate],
B.[TestBScore]
FROM [TableA] A
OUTER APPLY
(
SELECT TOP 1
B1.[TestID] [TestB_ID],
B1.[TestDate] [TestBDate],
B1.[Score] [TestBScore]
FROM [TableB] B1
WHERE A.[UserID] = B1.[UserID]
AND A.[TestDate] > B1.[TestDate]
ORDER BY
B1.[TestDate] DESC
) B
Or another option might be to use the ROW_NUMBER() window function to find the record from TableB. I have a hunch this one wouldn't perform as well because it needs to hit TableA twice, but can't be sure without running tests.
SELECT
A.[UserID],
A.[TestID] [TestA_ID],
A.[TestDate] [TestADate],
A.[Score] [TestAScore],
B.[TestB_ID],
B.[TestBDate],
B.[TestBScore]
FROM [TableA] A
LEFT JOIN
(
SELECT
ROW_NUMBER() OVER (PARTITION BY A.[UserID], A.[TestID] ORDER BY B.[TestDate] DESC) [rn],
A.[UserID],
A.[TestID] [TestA_ID],
B.[TestID] [TestB_ID],
B.[TestDate] [TestBDate],
B.[Score] [TestBScore]
FROM [TableA] A
INNER JOIN [TableB] B
ON A.[UserID] = B.[UserID]
AND A.[TestDate] > B.[TestDate]
) B
ON A.[UserID] = B.[UserID]
AND A.[TestID] = B.[TestA_ID]
AND B.[rn] = 1
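The ROW_NUMBER() variant can be tried on the sample data outside SQL Server. The sketch below uses SQLite (3.25 or later) via Python's sqlite3, which has no OUTER APPLY, so only this second form is shown; the bracketed T-SQL aliases are rewritten with AS and the inner aliases a2 and b2 are made up for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableA (TestID INTEGER PRIMARY KEY, UserID INTEGER, TestDate TEXT, Score INTEGER);
CREATE TABLE TableB (TestID INTEGER PRIMARY KEY, UserID INTEGER, TestDate TEXT, Score INTEGER);
INSERT INTO TableA VALUES
    (1, 100, '2014-02-15', 80),
    (2, 101, '2014-02-20', 100),
    (3, 102, '2014-02-22', 90),
    (4, 102, '2014-03-10', 70);
INSERT INTO TableB VALUES
    (1000, 100, '2014-02-01', 55),
    (1007, 100, '2014-02-05', 85),
    (1012, 100, '2014-02-20', 95),
    (1034, 102, '2014-02-12', 65),
    (1205, 102, '2014-03-05', 75),
    (1986, 101, '2014-03-10', 45);
""")

# For each TableA test, rank the same user's earlier TableB tests from newest
# to oldest, then left join only the newest one (rn = 1).
rows = conn.execute("""
    SELECT a.UserID, a.TestID, a.TestDate, a.Score,
           b.TestB_ID, b.TestBDate, b.TestBScore
    FROM TableA AS a
    LEFT JOIN (
        SELECT ROW_NUMBER() OVER (
                   PARTITION BY a2.TestID ORDER BY b2.TestDate DESC) AS rn,
               a2.TestID   AS TestA_ID,
               b2.TestID   AS TestB_ID,
               b2.TestDate AS TestBDate,
               b2.Score    AS TestBScore
        FROM TableA AS a2
        INNER JOIN TableB AS b2
                ON b2.UserID = a2.UserID
               AND b2.TestDate < a2.TestDate
    ) AS b ON b.TestA_ID = a.TestID AND b.rn = 1
    ORDER BY a.TestID
""").fetchall()

for row in rows:
    print(row)
```

User 101 comes back with NULLs because that user's only Table B test was taken after the Table A test.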

How to find out if a field's values for a given item are only increasing?

For this problem, I'm using Access as a front end for SQL Server, and calling Access through Excel VBA, although I can use a direct ADO connection if there are some T-SQL specific functions that would be more useful here.
I have a table that logs state changes for a set of items, e.g.:
+-------+-------+------------+
| docID | state | date |
+-------+-------+------------+
| 103 | 5 | 10/15/2013 |
| 103 | 6 | 10/18/2013 |
| 102 | 3 | 10/22/2013 |
| 103 | 2 | 11/1/2013 |
| 102 | 7 | 11/8/2013 |
+-------+-------+------------+
For each unique docID, I want to figure out whether its state is only increasing from first date to last date, or if it ever decreases. In the above data set, 103 decreases and 102 only increases. We can assume that the entries will be in date order.
One way to find this would be to create an object for each docID and add these objects to a dictionary, loading each state change into a list and checking to see whether the state has decreased, something like this:
function isDecreasing(cl as changeList) as boolean
for c=2 to cl.count
if cl.item(c).state < cl.item(c-1).state then
isDecreasing=true
exit function
end if
next
isDecreasing=false
end function
But this will slow my query down a lot because I'll have to convert all the table data into objects. It also means I'll have to write a lot of additional code to create and manage the objects for calculating and generating reports.
Is there any way to write a query in SQL Server or Access that can perform the same type of analysis on the whole data set?
In his otherwise excellent answer, Gordon Linoff said:
"You have a problem using Access-only functionality"
Really?
For the given data, which I've put in a table called [StateChanges]:
docID state date
----- ----- ----------
103 5 2013-10-15
103 6 2013-10-18
102 3 2013-10-22
103 2 2013-11-01
102 7 2013-11-08
I can create the following saved query in Access named [PreviousDates]
SELECT t1.docID, t1.date, MAX(t2.date) AS PreviousDate
FROM
StateChanges t1
INNER JOIN
StateChanges t2
ON t2.docID = t1.docID
AND t2.date < t1.date
GROUP BY t1.docID, t1.date
It returns
docID date PreviousDate
----- ---------- ------------
102 2013-11-08 2013-10-22
103 2013-10-18 2013-10-15
103 2013-11-01 2013-10-18
Then I can use the following query to identify the [docID]'s where the [state] went down
SELECT curr.docID
FROM
(
PreviousDates pd
INNER JOIN
StateChanges curr
ON curr.docID = pd.docID
AND curr.date = pd.date
)
INNER JOIN
StateChanges prev
ON prev.docID = pd.docID
AND prev.date = pd.PreviousDate
WHERE curr.state < prev.state
It returns
docID
-----
103
In fact, both queries are so simple that we can combine them into a single query that does the whole thing in one shot:
SELECT curr.docID
FROM
(
(
SELECT t1.docID, t1.date, MAX(t2.date) AS PreviousDate
FROM
StateChanges t1
INNER JOIN
StateChanges t2
ON t2.docID = t1.docID
AND t2.date < t1.date
GROUP BY t1.docID, t1.date
) PreviousDates
INNER JOIN
StateChanges curr
ON curr.docID = PreviousDates.docID
AND curr.date = PreviousDates.date
)
INNER JOIN
StateChanges prev
ON prev.docID = PreviousDates.docID
AND prev.date = PreviousDates.PreviousDate
WHERE curr.state < prev.state
So where's the problem?
You have a problem using Access-only functionality. But, if you have SQL Server 2012, you can use lead()/lag() functionality. There is another way, just using row_number(), which is available since SQL Server 2005.
Here is the idea. Enumerate the rows within each docId first by state and also by date. If the enumerations are the same, then the sequence is non-decreasing (essentially increasing). If different, then there is a bump in the road. Here is the code:
select docid,
(case when sum(case when rn_ds <> rn_sd then 1 else 0 end) = 0 then 'Increasing'
else 'Decreasing'
end) as SequenceType
from (select d.*,
row_number() over (partition by docId order by date, state) as rn_ds,
row_number() over (partition by docId order by state, date) as rn_sd
from d
) d
group by docid;
Note that I've made the sort a little more complicated by using both fields. This handles the case when two dates in a row have the same state (probably not allowed, but might as well make the technique more stable).
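The lag() alternative mentioned at the start of this answer is not spelled out, so here is one possible sketch of it. It runs in SQLite (3.25 or later) through Python's sqlite3 rather than SQL Server 2012, keeps the table name d from the query above, and the alias prev_state is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE d (docID INTEGER, state INTEGER, date TEXT);
INSERT INTO d VALUES
    (103, 5, '2013-10-15'),
    (103, 6, '2013-10-18'),
    (102, 3, '2013-10-22'),
    (103, 2, '2013-11-01'),
    (102, 7, '2013-11-08');
""")

# Compare each row's state with the previous state of the same docID.
# The first row per docID has a NULL prev_state and never counts as a drop.
rows = conn.execute("""
    SELECT docID,
           CASE WHEN SUM(CASE WHEN state < prev_state THEN 1 ELSE 0 END) = 0
                THEN 'Increasing'
                ELSE 'Decreasing'
           END AS SequenceType
    FROM (
        SELECT docID, state,
               LAG(state) OVER (PARTITION BY docID ORDER BY date) AS prev_state
        FROM d
    ) AS x
    GROUP BY docID
    ORDER BY docID
""").fetchall()

for row in rows:
    print(row)
```

Like the row_number() version, this treats a repeated state as non-decreasing, so such a docID is still reported as 'Increasing'.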
Question:
For each unique docID, I want to figure out whether its state is only increasing from first date to last date, or if it ever decreases.
So what you want to know is: for a given record a, does there exist a record b for the same docID where the date of a is earlier but the state of b is lower?
So just ask that.
select distinct a.docID
from T a
where
exists (
select 1 from T b where b.docID = a.docID and b.date > a.date and b.state < a.state
)
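A small sketch of the EXISTS idea on the sample data, assuming the comparison has to be limited to rows of the same docID. It uses SQLite through Python's sqlite3 and also runs the check without that restriction to show why it matters. The table name T comes from the query above; the variable names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T (docID INTEGER, state INTEGER, date TEXT);
INSERT INTO T VALUES
    (103, 5, '2013-10-15'),
    (103, 6, '2013-10-18'),
    (102, 3, '2013-10-22'),
    (103, 2, '2013-11-01'),
    (102, 7, '2013-11-08');
""")

# Later row with a lower state, restricted to the same document.
per_doc = conn.execute("""
    SELECT DISTINCT a.docID
    FROM T AS a
    WHERE EXISTS (
        SELECT 1 FROM T AS b
        WHERE b.docID = a.docID
          AND b.date > a.date
          AND b.state < a.state
    )
    ORDER BY a.docID
""").fetchall()

# Same check without the docID condition: rows of other documents now count.
across_docs = conn.execute("""
    SELECT DISTINCT a.docID
    FROM T AS a
    WHERE EXISTS (
        SELECT 1 FROM T AS b
        WHERE b.date > a.date
          AND b.state < a.state
    )
    ORDER BY a.docID
""").fetchall()

print(per_doc)
print(across_docs)
```

Without the docID condition, document 102 is flagged too, because document 103's later state 2 is lower than 102's state 3.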