I have a record with same ID but different data in both rows
While updating the final result should be the last record of that ID present in data.
Example
ID | Name | PermanentAddrss | CurrentLocation
1 | R1 | INDIA | USA
1 | R1 | INDIA | UK
Now for ID 1 the record which will be loaded in database
1|R1|INDIA|UK
How this can be done in SQL server for multiple records?
Please understand that SQL server does not store or fetch data in order of data insertion, so to find the latest/last record you should have some way to order the records.
This is typically a timestamp column like last_modified_date. Your current table is prime candidate for a slow changing dimension type 2; and you should consider implementing it.
See explanation on Kimball's group site.
If you are really not affected by any order and just need a row for each id you can try below query.
select
ID,
Name,
PermanentAddress,
CurrentLocation
from
(select
*,
row_number() over(partition by id order by (select null)) r
from yourtable)t
where r=1
You can identify the latest ID value by:
SELECT B.ID, A.NAME, A.PERMANENTADDRS, A.CURRENTLOCATION
FROM
(SELECT ID, NAME, PERMANENTADDRS, CURRENTLOCATION, MAX(RNUM) AS LATEST_ID FROM
(SELECT ID, NAME, PERMANENTADDRS, CURRENTLOCATION, ROW_NUMBER() OVER (PARTITION BY ID) AS RNUM FROM YOUR_TABLE)
GROUP BY ID, NAME, PERMANENTADDRS, CURRENTLOCATION) A
INNER JOIN
YOUR_TABLE B
ON A.LATEST_ID = B.ID;
This will take the last populated record for a given ID value. If the logic for latest record is different, it can be appropriately incorporated in the query.
Related
I have the following data:
ID Date Num ClientID Dest
--------------------------------------------------------
123 04/29/2021 -2222 H1234 -1
123 04/29/2021 1 H1234 3
345 04/29/2021 -2222 H3456 -1
345 04/29/2021 1 H8888 .1
BTW: this does not include all the fields, just what I'm currently using for my query.
For every ID in the above table I'll always have 2 records. There are 2 scenarios that can take place:
As for ID = 123, the ClientID is the same
As for ID = 345, the ClientID is different
I'm trying to return the following data:
ID Date Num ClientID Dest
123 04/29/2021 1 H1234 3
The reason I'm returning only this row is because:
I only want 1 row per ID, where the ClientID is the same for both rows
Only need the record that does not have -2222, where the CLientID is the same for both rows
If the ClientID is different for the same ID (ex: 345), then completely skip these records.
Now the numbers for DEST can vary, so we can't always rely that one will be -1 and the other will be positive, however the NUM field will always have -2222 and 1 for the 2nd row (which is the row that I'd want to be returned)
I'm not sure how best to do this, I guess I thought about the alternative of just creating a CTE, and then counting the ClientID and if Count = 2 then select the data. The problem I find is with DEST field, I know that I can do Max(NUM) but since DEST field can very I wouldn't know how to select it.
Here is what I tried:
WITH Ranking AS (
SELECT Rank() OVER (PARTITION BY c.ID,c.date1,c.ClientID ORDER BY num
asc)x, c.*
FROM cte c
)
SELECT * FROM Ranking WHERE x = 2
I'm not if this is a good apprach, I guess it does the job but any thoughts?
One solution is to use the ever-useful window functions with row_number to select which pair to use and lead to check if both ClientIds are the same. This would work regardless if the specific values you have should change, plus is more performant than hitting the table twice:
select id, date, num, clientid, Dest from (
select *,
Row_Number() over(partition by id order by num) rn,
case when Lead(clientid) over(partition by id order by num)=clientid then 1 else 0 end same
from t
)t
where rn=1 and same=1
Just select all rows where Num isn't -2222 and the ID is in a subquery grouped by id and having only one distinct client id.
SELECT *
FROM tbl
WHERE ID IN (SELECT ID FROM tbl GROUP BY ID HAVING COUNT(DISTINCT ClientId) = 1)
AND Num != -2222
I am trying to figure out how to do the following in SQL (I'm specifically working in Teradata). Given the following table of user IDs, transaction date, and item bought, I'm trying to figure out how for each user ID to get the last item bought, for example:
User ID Date Product
123 12/01/1996 A
123 12/02/1996 B
123 12/03/1996 C
124 12/01/1996 B
124 12/04/1996 A
123 12/05/1996 D
So the query would return in this case:
User ID Last Product Bought
123 D
124 A
And so forth. I tried using a Partition By or Window function in Teradata, but could not figure out how to implement it.
Thanks for your help.
Apply Teradata's proprietary syntax for filtering Windowed Aggregates:
select *
from tab
qualify
row_number()
over (partition by User_ID -- each user
order by Date_col desc) = 1 -- lastest row
In Teradata, you can use row_number() and qualify to solve this top-1-per-group problem:
select t.*
from mytable t
qualify row_number() over(partition by yser_id order by date desc) = 1
Does anyone happen to know a way of basically taking the 'Distinct' command but only using it on a single column. For lack of example, something similar to this:
Select (Distinct ID), Name, Term from Table
So it would get rid of row with duplicate ID's but still use the other column information. I would use distinct on the full query but the rows are all different due to certain columns data set. And I would need to output only the top most term between the two duplicates:
ID Name Term
1 Suzy A
1 Suzy B
2 John A
2 John B
3 Pete A
4 Carl A
5 Sally B
Any suggestions would be helpful.
select t.Id, t.Name, t.Term
from (select distinct ID from Table order by id, term) t
You can use row number for this
Select ID, Name, Term from(
Select ID, Name, Term, ROW_NUMBER ( )
OVER ( PARTITION BY ID order by Name) as rn from Table
Where rn = 1)
as tbl
Order by determines the order from which the first row will be picked.
I have a small database with the data from weather parameters being recorded daily. This is a Microsoft SQL Server 2008 express Database
My tables are as follows:
station (id, name, position)
reading (station_id, timestamp, value)
--station_id is the foreign key to id in station table
I want to Join them and get the result as below:
id | name | value | time
--------+---------+-------------------------
0 | lake | 73 |2013/08/16 02:00
1 | pier | 72 |2013/08/16 02:00
2 | gate | 81 |2013/08/16 02:00
Looking at question like Join to only the "latest" record with t-sql, I've been only able to get one row from the first table, and using Join two tables, only use latest value of right table, I've been able to get only the max time from second table.
How can I get the output that I want?
It can be done with a subquery
SELECT s.id,
s.name,
r.value,
r.timestamp
FROM station as s
INNER JOIN reading as r
on s.id = r.station_id
WHERE r.timestamp = (
SELECT max(timestamp)
FROM reading
where reading.station_id = s.station_id
)
SELECT STATION.ID,STATION.Name,T2.timestamp,T2.Value
FROM STATION
LEFT JOIN
(
SELECT station_id,timestamp, value
FROM
(
SELECT station_id,timestamp, value,
ROW_NUMBER() OVER (PARTITION BY station_id ORDER BY timestamp DESC) as rn
FROM reading
) as T1
WHERE RN=1
) as T2 on STATION.ID=T2.station_id
The database schema is organized as follows:
ID | GroupID | VALUE
--------------------
1 | 1 | A
2 | 1 | A
3 | 2 | B
4 | 3 | B
In this example, I want to GET all Rows with duplicate VALUE, but are not part of the same group. So the desired result set should be IDs (3, 4), because they are not in the same group (2, 3) but still have the same VALUE (B).
I'm having trouble writing a SQL Query and would appreciate any guidance. Thanks.
So far, I'm using SQL Count, but can't figure out what to do with the GroupId.
SELECT *
FROM TABLE T
HAVING COUNT(T.VALUE) > 1
GROUP BY ID, GroupId, VALUE
The simplest method for this is using EXISTS:
SELECT
ID
FROM
MyTable T1
WHERE
EXISTS (SELECT 1
FROM MyTable
WHERE Value = t1.Value
AND GroupID <> t1.GroupID)
Here is one method. First you have to identify the values that appear in more than one group and then use that information to find the right rows in the original table:
select *
from t
where value in (SELECT value
FROM TABLE T
GROUP BY VALUE
HAVING COUNT(distinct groupid) > 1
)
order by value
Actually, I prefer a slight variant in this case, by changing the HAVING clause:
HAVING min(groupid) <> max(groupid)
This works when you are looking for more than one group and should be faster than the COUNT DISTINCT version.
SELECT ALL_.*
FROM (SELECT *
FROM TABLE_
GROUP BY ID, GROUPID, VALUE
ORDER BY ID) GROUPED,
TABLE_ ALL_
WHERE GROUPED.VALUE = ALL_.VALUE
AND GROUPED.GROUPID <> ALL_.GROUPID