get the latest records - sql

I am currently still on my SQL educational journey and need some help!
The query I have is as below;
SELECT
Audit_Non_Conformance_Records.kf_ID_Client_Reference_Number,
Audit_Non_Conformance_Records.TimeStamp_Creation,
Audit_Non_Conformance_Records.Clause,
Audit_Non_Conformance_Records.NC_type,
Audit_Non_Conformance_Records.NC_Rect_Received,
Audit_Non_Conformance_Records.Audit_Num
FROM Audit_Non_Conformance_Records
I am trying to tweak this to show only the most recent results based on Audit_Non_Conformance_Records.TimeStamp_Creation
I have tried using MAX() but all this does is shows the latest date for all records.
basically the results of the above give me this;
But I only need the result with the date 02/10/2019 as this is the latest result. There may be multiple results however. So for example if 02/10/2019 had never happened I would need all of the idividual recirds from the 14/10/2019 ones.
Does that make any sense at all?

You can filter with a subquery:
SELECT
kf_ID_Client_Reference_Number,
TimeStamp_Creation,
Clause,
NC_type,
NC_Rect_Received,
Audit_Num
FROM Audit_Non_Conformance_Records a
where TimeStamp_Creation = (
select max(TimeStamp_Creation)
from Audit_Non_Conformance_Records
)
This will give you all whose TimeStamp_Creation is equal to the greater value available in the table.
If you want all records that have the greatest day (exluding time), then you can do:
SELECT
kf_ID_Client_Reference_Number,
TimeStamp_Creation,
Clause,
NC_type,
NC_Rect_Received,
Audit_Num
FROM Audit_Non_Conformance_Records a
where cast(TimeStamp_Creation as date) = (
select cast(max(TimeStamp_Creation) as date)
from Audit_Non_Conformance_Records
)
Edit
If you want the latest record per refNumber, then you can correlate the subquery, like so:
SELECT
kf_ID_Client_Reference_Number,
TimeStamp_Creation,
Clause,
NC_type,
NC_Rect_Received,
Audit_Num
FROM Audit_Non_Conformance_Records a
where TimeStamp_Creation = (
select max(TimeStamp_Creation)
from Audit_Non_Conformance_Records a1
where a1.refNumber = a.refNumber
)
For performance, you want an index on (refNumber, TimeStamp_Creation).

If you want the latest date in SQL Server, you can express this as:
SELECT TOP (1) WITH TIES ancr.kf_ID_Client_Reference_Number,
ancr.TimeStamp_Creation,
ancr.Clause,
ancr.NC_type,
ancr.NC_Rect_Received,
ancr.Audit_Num
FROM Audit_Non_Conformance_Records ancr
ORDER BY CONVERT(date, ancr.TimeStamp_Creation) DESC;
SQL Server is pretty good about handling dates with conversions, so I would not be surprised if this used an index on TimeStamp_Creation.

Related

SUM of database function returns much higher value than actual SUM

I have a database function that I am able to get the correct results from, but when I try to use SUM to total the results it returns a much higher value.
SELECT
SUM([DATABASE].[dbo].[fn_GetCharges]([TABLE1_DATE],[TABLE1_CUST],[TABLE1_SITE],[TABLE1_SERV])) AS [GROSS_REVENUE]
FROM [DATABASE].[dbo].[TABLE1]
WHERE
[TABLE1_ROUT] = '1234'
AND [TABLE1_DATE] = '2018-05-08'
This returns a SUM value of about 15,740.
If I remove the SUM and GROUP BY the function then it shows each of the returned values, which I manually totalled to about 750.
I'm using Microsoft SQL Server 2014.
We don't know what is in your function, but you could probably output your raw data first and then sum it next.
;WITH cte AS (
SELECT
[DATABASE].[dbo].[fn_GetCharges]([TABLE1_DATE],[TABLE1_CUST],[TABLE1_SITE],[TABLE1_SERV]) AS [GROSS_REVENUE]
FROM [DATABASE].[dbo].[TABLE1]
WHERE
[TABLE1_ROUT] = '1234'
AND [TABLE1_DATE] = '2018-05-08'
GROUP BY whatever_you_grouped_by_that_gave_correct_results
)
SELECT SUM(gross_revenue) AS gross_revenue
FROM cte
Wondering if you used group by clause in the query since I don't see one posted.
Group By (columns) Having (condition). What you specified in the 'where' goes in the 'Having' clause

SQL: Getting the latest date using Max() while using group by

I'm struggling to get the correct result with this query:
select max(kts.my_date), kts.name
join ktt on ktt.someId = kts.someOtherId
where ktt.someId = 'example'
group by kts.name;
I have two (possibly stupid) questions:
Will this max() take time into account? I know that order by does if the dates are the same. Does max do the same?
This is connected to my previous question, but when I run the query above, if the dates are same, it orders it by the name. I want the latest date at the top. Do I need to put an order by clause for the date in? If so, using Max is pointless, right?
Thanks for the help.
Yes,
--2
select max(kts.my_date) over (partition by kts.name) as maxdate, kts.name
from -- chose your table
join ktt on ktt.someId = kts.someOtherId
where ktt.someId = 'example'
order by --chose here your column
give this a try

Understanding a Correlated Subquery

I want to create a query that returns the most recent date for a date field and the highest value of a integer field for each "assessment" record. What I think is required is a correlated subquery and using the MAX function.
example data would be as follows
the date field could have duplicate dates for each assessment but each duplicate date group would have a different the integer in the integer field.
eg
1256 2/6/14 0
1256 2/6/14 1
1256 1/6/14 0
4534 3/6/14 0
4534 3/6/14 1
4534 3/6/14 2
select assessment, Max(correctnum) maxofcorrectnum, dateeffect
from lraassm outerassm
where dateeffect =
(select MAX(dateeffect) maxofdateeffect
from pthdbo.lraassm innerassm
innerassm.assessment = outerassm.assessment
group by innerassm.assessment)
group by assessment, dateeffect
so my theory is that the inner query executes and gives the outer query the criteria for the dateeffect field in the outer query and then the outer query would return the maximum of the correctnum field for this dateeffect and also return its corresponding assessment and the dateeffect.
Could someone please confirm this is correct. How does the subquery handle the rows? what other ways are there to solve this problem? thanks
Your query is doing the right thing, but granted, the correlated subquery is a little difficult to understand. What the subquery does is, it filters the records based on assessment from the outer query and then returns the maximum dateeffect for that assessment. In fact, you don't need the group by clause on the correlated query.
These types of queries are where common when working with data in ERP systems, when you're only interested in "latest" records, etc. This is also known as a "top segment" type of query (which the query optimizer is sometimes able to figure out by itself). I've found, that on SQL Server 2005 or newer, it is a lot easier to use the ROW_NUMBER() function. The following query should return the same as yours, namely one record from lraassm for each assessment, that has the highest value of dateeffect and correctnum.
select * from (
select
assessment, dateeffect, correctnum,
ROW_NUMBER() OVER (
PARTITION BY assessment,
ORDER BY dateeffect DESC, correctnum DESC
) AS segment
from lraassm) AS innerQuery
where segment = 1
This is the query I worked out using my tables. But it will get you on the right track and you should be able to substitute your fields/tables in.
Select * from Decode
where updated_time = (Select MAX(updated_time)from DECODE)
That Query gives you every record that has the most recent updated_time. The next query will return the greatest entry_id value as well as the most recent updated_time from those Records
Select MAX(entry_id), updated_time from Decode
where updated_time = (Select MAX(updated_time)from DECODE)
group by updated_time
The result is 2 columns 1 record, 1st column is the Maximum value of entry id, the second is the most recent updated_time. Is that what you wanted to return?

Get a Row if within certain time period of other row

I have a SQL statement that I am currently using to return a number of rows from a database:
SELECT
as1.AssetTagID, as1.TagID, as1.CategoryID,
as1.Description, as1.HomeLocationID, as1.ParentAssetTagID
FROM Assets AS as1
INNER JOIN AssetsReads AS ar ON as1.AssetTagID = ar.AssetTagID
WHERE
(ar.ReadPointLocationID='Readpoint1' OR ar.ReadPointLocationID='Readpoint2')
AND (ar.DateScanned between 'LastScan' AND 'Now')
AND as1.TagID!='000000000000000000000000'
I am wanting to do a query that will get the row with the oldest DateScanned from this query and also get another row from the database if there was one that was within a certain period of time from this row (say 5 seconds for an example). The oldest record would be relatively simple by selecting the first record in a descending sort, but how would I also get the second record if it was within a certain time period of the first?
I know I could do this process with multiple queries, but is there any way to combine this process into one query?
The database that I am using is SQL Server 2008 R2.
Also please note that the DateScanned times are just placeholders and I am taking care of that in the application that will be using this query.
Here is a fairly general way to approach it. Get the oldest scan date using min() as a window function, then use date arithmetic to get any rows you want:
select t.* -- or whatever fields you want
from (SELECT as1.AssetTagID, as1.TagID, as1.CategoryID,
as1.Description, as1.HomeLocationID, as1.ParentAssetTagID,
min(DateScanned) over () as minDateScanned, DateScanned
FROM Assets AS as1
INNER JOIN AssetsReads AS ar ON as1.AssetTagID = ar.AssetTagID
WHERE (ar.ReadPointLocationID='Readpoint1' OR ar.ReadPointLocationID='Readpoint2')
AND (ar.DateScanned between 'LastScan' AND 'Now')
AND as1.TagID!='000000000000000000000000'
) t
where datediff(second, minDateScanned, DateScanned) <= 5;
I am not really sure of sql server syntax, but you can do something like this
SELECT * FROM (
SELECT
TOP 2
as1.AssetTagID,
as1.TagID,
as1.CategoryID,
as1.Description,
as1.HomeLocationID,
as1.ParentAssetTagID ,
ar.DateScanned,
LAG(ar.DateScanned) OVER (order by ar.DateScanned desc) AS lagging
FROM
Assets AS as1
INNER JOIN AssetsReads AS ar
ON as1.AssetTagID = ar.AssetTagID
WHERE (ar.ReadPointLocationID='Readpoint1' OR ar.ReadPointLocationID='Readpoint2')
AND (ar.DateScanned between 'LastScan' AND 'Now')
AND as1.TagID!='000000000000000000000000'
ORDER BY
ar.DateScanned DESC
)
WHERE
lagging IS NULL or DateScanned - lagging < '5 SECONDS'
I have tried to sort the results by DateScanned desc and then just the top most 2 rows. I have then used the lag() function on DateScanned field, to get the DateScanned value for the previous row. For the topmost row the DateScanned shall be null as its the first record, but for the second one it shall be value of the first row. You can then compare both of these values to determine whether you wish to display the second row or not
more info on the lagging function: http://blog.sqlauthority.com/2011/11/15/sql-server-introduction-to-lead-and-lag-analytic-functions-introduced-in-sql-server-2012/

SQL merging result sets on a unique column value

I have 2 similar queries which both work on the same table, and I essentially want to combine their results such that the second query supplies default values for what the first query doesn't return. I've simplified the problem as much as possible here. I'm using Oracle btw.
The table has account information in it for a number of accounts, and there are multiple entries for each account with a commit_date to tell when the account information was inserted. I need get the account info which was current for a certain date.
The queries take a list of account ids and a date.
Here is the query:
-- Select the row which was current for the accounts for the given date. (won't return anything for an account which didn't exist for the given date)
SELECT actr.*
FROM Account_Information actr
WHERE actr.account_id in (30000316, 30000350, 30000351)
AND actr.commit_date <= to_date( '2010-DEC-30','YYYY-MON-DD ')
AND actr.commit_date =
(
SELECT MAX(actrInner.commit_date)
FROM Account_Information actrInner
WHERE actrInner.account_id = actr.account_id
AND actrInner.commit_date <= to_date( '2010-DEC-30','YYYY-MON-DD ')
)
This looks a little ugly, but it returns a single row for each account which was current for the given date. The problem is that it doesn't return anything if the account didn't exist until after the given date.
Selecting the earliest account info for each account is trival - I don't need to supply a date for this one:
-- Select the earliest row for the accounts.
SELECT actr.*
FROM Account_Information actr
WHERE actr.account_id in (30000316, 30000350, 30000351)
AND actr.commit_date =
(
SELECT MAX(actrInner .commit_date)
FROM Account_Information actrInner
WHERE actrInner .account_id = actr.account_id
)
But I want to merge the result sets in such a way that:
For each account, if there is account info for it in the first result set - use that.
Otherwise, use the account info from the second result set.
I've researched all of the joins I can use without success. Unions almost do it but they will only merge for unique rows. I want to merge based on the account id in each row.
Sql Merging two result sets - my case is obviously more complicated than that
SQL to return a merged set of results - I might be able to adapt that technique? I'm a programmer being forced to write SQL and I can't quite follow that example well enough to see how I could modify it for what I need.
The standard way to do this is with a left outer join and coalesce. That is, your overall query will look like this:
SELECT ...
FROM defaultQuery
LEFT OUTER JOIN currentQuery ON ...
If you did a SELECT *, each row would correspond to the current account data plus your defaults. With me so far?
Now, instead of SELECT *, for each column you want to return, you do a COALESCE() on matched pairs of columns:
SELECT COALESCE(currentQuery.columnA, defaultQuery.columnA) ...
This will choose the current account data if present, otherwise it will choose the default data.
You can do this more directly using analytic functions:
select *
from (SELECT actr.*, max(commit_date) over (partition by account_id) as maxCommitDate,
max(case when commit_date <= to_date( '2010-DEC-30','YYYY-MON-DD ') then commit_date end) over
(partition by account_id) as MaxCommitDate2
FROM Account_Information actr
WHERE actr.account_id in (30000316, 30000350, 30000351)
) t
where (MaxCommitDate2 is not null and Commit_date = MaxCommitDate2) or
(MaxCommitDate2 is null and Commit_Date = MaxCommitDate)
The subquery calculates two values, the two possibilities of commit dates. The where clause then chooses the appropriate row, using the logic that you want.
I've combined the other answers. Tried it out at apex.oracle.com. Here's some explanation.
MAX(CASE WHEN commit_date <= to_date('2010-DEC-30', 'YYYY-MON-DD')) will give us the latest date not before Dec 30th, or NULL if there isn't one. Combining that with a COALESCE, we get
COALESCE(MAX(CASE WHEN commit_date <= to_date('2010-DEC-30', 'YYYY-MON-DD') THEN commit_date END), MAX(commit_date)).
Now we take the account id and commit date we have and join them with the original table to get all the other fields. Here's the whole query that I came up with:
SELECT *
FROM Account_Information
JOIN (SELECT account_id,
COALESCE(MAX(CASE WHEN commit_date <=
to_date('2010-DEC-30', 'YYYY-MON-DD')
THEN commit_date END),
MAX(commit_date)) AS commit_date
FROM Account_Information
WHERE account_id in (30000316, 30000350, 30000351)
GROUP BY account_id)
USING (account_id, commit_date);
Note that if you do use USING, you have to use * instead of acrt.*.