SQL select query to find the latest "destination_id" to track moves - sql

Hello – I am trying to construct an Oracle 11g query that will find the latest version of an entity by going through a table that has a history of moves. An example of this is that the table could contain a list of addresses that a person has lived at and different addresses that they have moved to.
For example, you might live at ADDRESS_ID 123, then move to ADDRESS_ID 456, and then move again to ADDRESS_ID 789.
It is also possible that you lived at ADDRESS_ID 123 the whole time and never moved, in which case you would never appear in the MOVE_LIST table.
The goal is that if I select ADDRESS_ID 123 in the first example above, the query returns the MOST RECENT ADDRESS_ID that the person is at (789).
The table is called MOVE_LIST and has the following columns:
MOVE_LIST_ID
ORIGINAL_ADDRESS_ID
DESTINATION_ADDRESS_ID
The query I have so far doesn’t complete this task since it doesn’t go through the list of moves:
Select DESTINATION_ADDRESS_ID
from MOVE_LIST
where ORIGINAL_ADDRESS_ID = '123'
Any tips on this query would be GREATLY appreciated.
Here is some sample data:
MOVED_LIST_ID ORIGINAL_ADDRESS_ID DESTINATION_ADDRESS_ID
1 123 456
2 456 789
Thank you

In your case, the data in the MOVE_LIST table forms a hierarchy. So, to find the last address a person moved to, a simple hierarchical query can be used:
with move_list(moved_list_id, original_address_id, destination_address_id) as(
select 1, 123, 456 from dual union all
select 2, 456, 789 from dual
)
select destination_address_id
from move_list
where connect_by_isleaf = 1
start with original_address_id = 123
connect by original_address_id = prior destination_address_id
Result:
DESTINATION_ADDRESS_ID
----------------------
789
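For readers without an Oracle instance at hand, the same chain walk can be sketched with a recursive CTE. This is only an illustration, not the query above: it uses Python's built-in sqlite3 module as a stand-in for Oracle, the helper name latest_address is hypothetical, and it assumes the move chain has no cycles. Unlike the CONNECT BY query, this sketch seeds the chain with the starting address itself, so an address that never appears in MOVE_LIST comes back unchanged instead of producing no rows.

```python
import sqlite3

# In-memory stand-in for the MOVE_LIST table (SQLite here, not Oracle).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE move_list (
        move_list_id INTEGER PRIMARY KEY,
        original_address_id INTEGER,
        destination_address_id INTEGER
    );
    INSERT INTO move_list VALUES (1, 123, 456);
    INSERT INTO move_list VALUES (2, 456, 789);
""")


def latest_address(conn, start_id):
    """Follow the moves starting at start_id and return the final address.

    Hypothetical helper; assumes the chain of moves contains no cycles.
    """
    row = conn.execute(
        """
        WITH RECURSIVE chain(address_id, depth) AS (
            SELECT ?, 0                      -- seed: the starting address
            UNION ALL
            SELECT m.destination_address_id, c.depth + 1
            FROM move_list m
            JOIN chain c ON m.original_address_id = c.address_id
        )
        SELECT address_id FROM chain ORDER BY depth DESC LIMIT 1
        """,
        (start_id,),
    ).fetchone()
    return row[0]


print(latest_address(conn, 123))  # 789
print(latest_address(conn, 999))  # 999 (never moved)
```

Seeding with the starting address is a design choice for the "never moved" case mentioned in the question; drop the seed row if you would rather get nothing back for such addresses.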

List of products with certain status at any timepoint

I have a table in Redshift containing id, status and the timestamp the status was set. The ids are not unique, for example a book was available at one point, then it was sold, then we received it as a return from the customer and the status is on cleaning, and after cleaning it will receive again the status available.
book_id  status     status_valid_from
101      available  2022-08-02 13:04:43.000
103      cleaning   2022-08-03 10:20:21.000
104      cleaning   2022-08-04 13:55:04.000
101      sold       2022-08-05 19:29:41.000
104      available  2022-08-06 06:14:33.000
105      cleaning   2022-08-07 15:43:12.000
108      available  2022-08-08 11:03:24.000
101      cleaning   2022-08-11 07:28:21.000
124      sold       2022-08-11 09:41:53.000
101      available  2022-08-11 16:49:34.000
Every time the book gets a new status, a new row is created in the table to record it.
The question: How do I select the list of ids having status "available" at a specific time point?
For example:
at 2022-08-03 10:20:21.000 the query should return as available only id 101.
at 2022-08-11 07:40:27.000 the query should return as available the ids 104, 108.
at 2022-08-11 16:51:25.000 the query should return as available the ids 104, 108, 101.
What I tried:
select *
from table
where status = 'available'
and status_valid_from >= specific_timestamp
However, it works for books that had the status set to available only once, and only if the status wasn't changed afterwards. What I'd like to find out is how to select the correct list using a code that's valid for all cases.
You should read up on window functions; they are very helpful in situations like this. https://docs.aws.amazon.com/redshift/latest/dg/c_Window_functions.html
As I understand your question, you want to find, for each id, the last status for that id, and then find all the ids that have status "available". Correct?
Window functions allow you to partition the data by id and find the last status by date. Doing this for a dataset that only has data up to the date of interest will give you the latest status for each id as of that point in time. Then just select those that are available.
select book_id, status
from (
select book_id, status,
row_number() over (partition by book_id order by status_valid_from desc) as row_num
from <table>
where status_valid_from <= <date-of-interest>
) latest
where row_num = 1
and status = 'available'
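To check the window-function approach against the three expected results in the question, here is a minimal sketch using Python's sqlite3 module in place of Redshift (it needs an SQLite build with window-function support, 3.25 or later). The table name book_status and the helper available_at are assumptions made for the example, and the timestamps are stored as text, which compares correctly only because every value uses the same format.

```python
import sqlite3

ROWS = [
    (101, "available", "2022-08-02 13:04:43.000"),
    (103, "cleaning", "2022-08-03 10:20:21.000"),
    (104, "cleaning", "2022-08-04 13:55:04.000"),
    (101, "sold", "2022-08-05 19:29:41.000"),
    (104, "available", "2022-08-06 06:14:33.000"),
    (105, "cleaning", "2022-08-07 15:43:12.000"),
    (108, "available", "2022-08-08 11:03:24.000"),
    (101, "cleaning", "2022-08-11 07:28:21.000"),
    (124, "sold", "2022-08-11 09:41:53.000"),
    (101, "available", "2022-08-11 16:49:34.000"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book_status (book_id INTEGER, status TEXT, status_valid_from TEXT)"
)
conn.executemany("INSERT INTO book_status VALUES (?, ?, ?)", ROWS)


def available_at(conn, ts):
    """Return the ids whose most recent status at time ts is 'available'."""
    sql = """
        SELECT book_id
        FROM (
            SELECT book_id, status,
                   ROW_NUMBER() OVER (
                       PARTITION BY book_id
                       ORDER BY status_valid_from DESC
                   ) AS row_num
            FROM book_status
            WHERE status_valid_from <= ?      -- ignore later status changes
        ) AS latest
        WHERE row_num = 1                     -- latest status per book
          AND status = 'available'
        ORDER BY book_id
    """
    return [r[0] for r in conn.execute(sql, (ts,))]


print(available_at(conn, "2022-08-03 10:20:21.000"))  # [101]
print(available_at(conn, "2022-08-11 07:40:27.000"))  # [104, 108]
print(available_at(conn, "2022-08-11 16:51:25.000"))  # [101, 104, 108]
```

The status filter is applied outside the subquery on purpose: filtering on status = 'available' before numbering the rows would pick the latest "available" row for each book rather than its latest status.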

How to get the set size, first and last record in a db2 ordered set with one call

I have a very big transaction table on DB2 v11, and I need to query a subset of it as efficiently as possible. All I need is the total count of the set (not known in advance; it's based on criteria, let's say 1 day), the ID of the first record, and the ID of the last record.
The old code was fetching the entire table, then using only the first record ID, the last record ID, and the size, and ignoring the rest. Now this code is timing out. It's a complex query with several joins.
Is there a way to fetch just the size of the set, the first record, and the last record in one select query?
I've read that reordering the list in order to fetch the 1st record(so fetch with Desc, then change to Asc) is not efficient.
sample table 1 TRANSACTION_RECORDS:
tdID TIMESTAMP name
-------------------------------
123 2020-03-31 john
234 2020-03-31 dan
456 2020-03-01 Eve
675 2020-04-01 joy
sample table 2 TRANSACTION_TYPE:
invoiceId tdID account
------------------------------
897 123 abc
898 123 def
877 234 mnc
899 456 opp
Sample query
select Min(tr.transaction_id), Max(tr.transaction_id)
from TRANSACTION_RECORDS TR
join TRANSACTION_TYPE TT
on TR.tdID=tt.tdID
WHERE Date(TR.TIMESTAMP) = '2020-03-31'
group by tr.tdID
order by TR.tdID ASC
This results in multiple rows (but it requires the group by):
123,123
234,234
456,456
What I want is:
123,456
As I mentioned in the comments, this query needs neither GROUP BY nor ORDER BY; just do:
select Min(tr.transaction_id), Max(tr.transaction_id)
from TRANSACTION_RECORDS TR
join TRANSACTION_TYPE TT
on TR.tdID=tt.tdID
WHERE Date(TR.TIMESTAMP) = '2020-03-31'
It should work as expected.
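The question also asks for the size of the set, which the query above does not return. As a hedged sketch, a COUNT can sit next to MIN and MAX in the same ungrouped select. The example below runs on Python's sqlite3 module rather than DB2, uses the tdID column from the sample tables, names the timestamp column ts, and counts distinct tdID values because the join repeats a transaction once per invoice; all of those choices, and the helper name summarize, are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transaction_records (tdID INTEGER, ts TEXT, name TEXT);
    CREATE TABLE transaction_type (invoiceId INTEGER, tdID INTEGER, account TEXT);
    INSERT INTO transaction_records VALUES
        (123, '2020-03-31', 'john'),
        (234, '2020-03-31', 'dan'),
        (456, '2020-03-01', 'Eve'),
        (675, '2020-04-01', 'joy');
    INSERT INTO transaction_type VALUES
        (897, 123, 'abc'),
        (898, 123, 'def'),
        (877, 234, 'mnc'),
        (899, 456, 'opp');
""")


def summarize(conn, day):
    """Return (set size, first id, last id) for one day in a single query."""
    return conn.execute(
        """
        SELECT COUNT(DISTINCT tr.tdID),  -- the join repeats 123, so count distinct ids
               MIN(tr.tdID),
               MAX(tr.tdID)
        FROM transaction_records tr
        JOIN transaction_type tt ON tr.tdID = tt.tdID
        WHERE tr.ts = ?
        """,
        (day,),
    ).fetchone()


print(summarize(conn, "2020-03-31"))  # (2, 123, 234)
```

With no GROUP BY, the aggregates collapse the whole filtered set into one row, so a day with no matching transactions yields a count of 0 and NULL (None in Python) for both ids.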

SQL Masking A Mapping Field In The Query

I am creating a view to extract data from a table and load that data into a fixed file which will be loaded into a system. The view will map the table column to a particular format.
There is one column, Account_Number, which needs to be masked as the column has sensitive information.
My logic to mask the value is to shift each digit to the next one on the number line,
so 0 becomes 1, 4 becomes 5, etc. I have not been able to come up with the logic in the view itself.
Any help would be appreciated.
CREATE OR REPLACE FORCE EDITIONABLE VIEW "Schema1"."VW_ActiveTraders" ("FUND", "NAME", "CITY", "ACN") AS
Select
TD_Fund as FUND,
Name as NAME,
City as CITY,
Account_Number as ACN
FROM Trader1 -- Table Name
Account Number
023457456
123456789
012345678
Masked Account Number
134568567
012345678
123456789
Please note that Account Number column has more than 1000 entries.
You may use TRANSLATE to shift the numbers
with dt as (
select '023457456' ACN from dual union all
select '123456789' ACN from dual union all
select '012345678' ACN from dual)
select ACN,
TRANSLATE(ACN,'0123456789','1234567890') as ACN_WEAK_MASK
from dt;
ACN ACN_WEAK_
--------- ---------
023457456 134568567
123456789 234567890
012345678 123456789
But note that this is not a real masking of sensitive information. It is very easy to unmask the information and get the original account ID.
An often used masking is e.g. 012345678 gets ******678.
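To make the weakness concrete, here is a small Python sketch of both ideas: the digit shift, which is trivially reversible, and the keep-the-last-digits mask mentioned above. The function names are hypothetical and the code is independent of Oracle; str.translate plays the role that TRANSLATE plays in the SQL.

```python
# Digit-shift tables mirroring TRANSLATE(ACN, '0123456789', '1234567890').
SHIFT = str.maketrans("0123456789", "1234567890")
UNSHIFT = str.maketrans("1234567890", "0123456789")


def shift_digits(acn: str) -> str:
    """Weak mask: replace every digit with the next one (9 wraps to 0)."""
    return acn.translate(SHIFT)


def unshift_digits(masked: str) -> str:
    """Undo shift_digits, showing how easily the weak mask is reversed."""
    return masked.translate(UNSHIFT)


def mask_keep_last(acn: str, visible: int = 3) -> str:
    """Replace all but the last `visible` characters with asterisks."""
    hidden = max(len(acn) - visible, 0)
    return "*" * hidden + acn[hidden:]


print(shift_digits("023457456"))      # 134568567
print(unshift_digits("134568567"))    # 023457456
print(mask_keep_last("012345678"))    # ******678
```

Anyone who knows or guesses the shift rule can apply the reverse mapping, whereas the asterisk mask discards the hidden digits entirely.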
@MarmiteBomber @Stilgar - Thanks so much for the clarification and help with the answer.
I just tweaked the query and it ran successfully.
Changed Query
------------------------------------------------------------------------------------------
CREATE OR REPLACE FORCE EDITIONABLE VIEW "Schema1"."VW_ActiveTraders" ("FUND", "NAME", "CITY", "ACN") AS
Select
TD_Fund as FUND,
Name as NAME,
City as CITY,
--Account_Number as ACN
TRANSLATE(Account_Number,'0123456789','1234567890') as ACN
FROM Trader1 -- Table Name
------------------------------------------------------------------------------------------

finding duplicate rows with different IDs based on multiple columns

Please forgive me if my jargon is off. I'm still learning!
I just started using Teradata, and to be honest it has been a lot of fun. However, I have hit a roadblock that has stumped me for a while.
I successfully selected a table from a database that looks like:
ID service date name
1 service1 1/5/15 john
2 service2 1/7/15 steve
3 service3 1/8/15 lola
4 service4 1/3/15 joan
5 service5 1/5/15 fred
6 service3 1/3/15 joan
7 service5 1/8/15 oscar
Now I want to search the database again to find any duplicates with a different ID (example: to see if service1 with date 1/5/15 and name john exists on another row with a different ID).
At first, I did something like this:
SELECT ID, service, date, name
FROM table
WHERE table.service = ANY('service1', 'service2', 'service3', 'service4', 'service5', 'service3', 'service5')
AND table.date = ANY('1/5/15', '1/7/15', '1/8/15', '1/3/15', '1/5/15', '1/3/15', '1/8/15')
AND table.name = ANY('john', 'steve', 'lola', 'joan', 'fred', 'joan', 'oscar');
But this is giving me more rows than I wanted.
example:
ID service date name
92 service3 1/8/15 steve
is of no use to me since I am looking for IDs that have the same combination of service, date, and name as of any of the other IDs in the above table.
something like this would be favorable:
ID service date name
609 service3 1/8/15 lola
since it matches that of ID 3.
I was curious to see if it were possible to treat the three columns (service, date, name) as a vector and maybe select the rows that match it that way?
ex
......
WHERE (table.service, table.date, table.name) = ANY((service3,1/8/15,lola), (service1, 1/5/15, john), ...etc)
My Teradata is down right now, so I have yet to try the above example. Nevertheless, any thoughts/feedback are greatly appreciated!
The following query may be what you are trying to achieve. This selects IDs for which the combination of service, date, and name appears more than once.
SELECT t1.ID
FROM yourTable t1
INNER JOIN
(
SELECT service, date, name
FROM yourTable
GROUP BY service, date, name
HAVING COUNT(*) > 1
) t2
ON t1.service = t2.service AND
t1.date = t2.date AND
t1.name = t2.name
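As a quick way to try the derived-table approach while Teradata is unavailable, the sketch below runs the same join on Python's sqlite3 module. The table name services, the column name svc_date, and the helper duplicate_ids are assumptions for the example; the rows are the seven from the question plus the two extra rows it mentions (IDs 92 and 609).

```python
import sqlite3

ROWS = [
    (1, "service1", "1/5/15", "john"),
    (2, "service2", "1/7/15", "steve"),
    (3, "service3", "1/8/15", "lola"),
    (4, "service4", "1/3/15", "joan"),
    (5, "service5", "1/5/15", "fred"),
    (6, "service3", "1/3/15", "joan"),
    (7, "service5", "1/8/15", "oscar"),
    (92, "service3", "1/8/15", "steve"),   # unwanted partial match
    (609, "service3", "1/8/15", "lola"),   # true duplicate of ID 3
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE services (id INTEGER, service TEXT, svc_date TEXT, name TEXT)"
)
conn.executemany("INSERT INTO services VALUES (?, ?, ?, ?)", ROWS)


def duplicate_ids(conn):
    """Return ids whose (service, svc_date, name) combination occurs more than once."""
    sql = """
        SELECT t1.id
        FROM services t1
        INNER JOIN (
            SELECT service, svc_date, name
            FROM services
            GROUP BY service, svc_date, name
            HAVING COUNT(*) > 1           -- keep only repeated combinations
        ) t2
          ON  t1.service = t2.service
          AND t1.svc_date = t2.svc_date
          AND t1.name = t2.name
        ORDER BY t1.id
    """
    return [r[0] for r in conn.execute(sql)]


print(duplicate_ids(conn))  # [3, 609]
```

ID 92 shares its service and date with ID 3 but has a different name, so it is correctly excluded: the three columns are compared together, not one at a time.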
This is a simple task for a Windowed Aggregate:
SELECT *
FROM tab
QUALIFY
COUNT(*) OVER (PARTITION BY service, date, name) > 1
This counts the number of rows with the same combination of values (like Tim Biegeleisen's Derived Table) but unlike a Standard Aggregate it keeps all rows. The QUALIFY is a nice Teradata syntax extension to avoid a Derived Table.
Don't hardcode values in your query unless you absolutely have to. Instead, take the query you already wrote and join to that.
SELECT dupes.*
FROM (your query) yourquery
JOIN table dupes
ON yourquery.service = dupes.service
AND yourquery.date = dupes.date
AND yourquery.name = dupes.name

SQL Server 2000: need to return record ID from a previous record in current query

I work on a help-desk and am doing some analysis of PC repair tickets.
I need to pull data from our call-log system showing the ticket history for issues on computers that were recently repaired by another team. We are simply trying to improve QA on deployed machines, and this data will help.
I have the query for the analysis of tickets, but I also want to return the ticket number of the last PC repair case.
My current query is as follows:
SELECT
CallLog.CallID,
CallLog.CustID,
Subset.Rep_num,
Subset.FirstName,
Subset.LastName,
CallLog.OpndetailCat,
CallLog.Tracker_Full,
CallLog.RecvdDate
FROM
heatPrd.dbo.CallLog CallLog,
heatPrd.dbo.Subset Subset
WHERE
CallLog.CallID = Subset.CallID AND
CallLog.RecvdDate>='2015-10-01' AND
CallLog.OpnAreaCat='back from repair'
ORDER BY
CallLog.CallID DESC
This returns
CallID CustID Rep_num FirstName LastName OpndetailCat Tracker_Full
2182375 1234 Sarah Doe Missing Email Folde
2181831 1235 JENNIFER Doe ZOTHER
2180815 1236 123 Jason Smith ZOTHER
2180790 1237 124 DARCY Doe Wrong Proxy Config
2180787 1239 125 Jason Smith ZOTHER
I want to add a column to the query that would return something to the effect of
select max(callid)
from calllog
where calltype = 'in_for_service_pc' and custid = '1234'
Here calltype is a column on the CallLog table, and the custid value would come from each row of the main query's result.
This is a lot of info, so I hope my request is clear.
Disclaimer: Data resides in SQL Server 2000 so some of the newer commands may not work.
Something like this should be pretty close.
SELECT
cl.CallID,
cl.CustID,
s.Rep_num,
s.FirstName,
s.LastName,
cl.OpndetailCat,
cl.Tracker_Full,
cl.RecvdDate,
x.MaxCallID
FROM heatPrd.dbo.CallLog cl
JOIN heatPrd.dbo.Subset s ON cl.CallID = s.CallID
left join
(
select max(cl2.callid) as MaxCallID
, cl2.custid
from calllog cl2
where cl2.calltype = 'in_for_service_pc'
group by cl2.custid
) x on x.custid = cl.custid
WHERE cl.RecvdDate >= '2015-10-01' AND
cl.OpnAreaCat = 'back from repair'
ORDER BY cl.CallID DESC
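The shape of that LEFT JOIN can be tried out on a toy dataset. The sketch below uses Python's sqlite3 module instead of SQL Server 2000, leaves out the Subset join for brevity, and invents a handful of rows; the table layout, the sample values, and the helper name tickets_with_last_repair are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE call_log (
        call_id INTEGER, cust_id INTEGER, call_type TEXT,
        opn_area_cat TEXT, recvd_date TEXT
    );
    -- Made-up rows: customer 1234 has two earlier repair tickets, 1235 has none.
    INSERT INTO call_log VALUES
        (100, 1234, 'in_for_service_pc', 'repair',           '2015-09-01'),
        (110, 1234, 'in_for_service_pc', 'repair',           '2015-09-20'),
        (200, 1234, 'incident',          'back from repair', '2015-10-05'),
        (210, 1235, 'incident',          'back from repair', '2015-10-06');
""")


def tickets_with_last_repair(conn, since):
    """Return (call_id, cust_id, last repair call_id) for follow-up tickets."""
    sql = """
        SELECT cl.call_id, cl.cust_id, x.max_call_id
        FROM call_log cl
        LEFT JOIN (
            SELECT cust_id, MAX(call_id) AS max_call_id
            FROM call_log
            WHERE call_type = 'in_for_service_pc'
            GROUP BY cust_id              -- one row per customer
        ) x ON x.cust_id = cl.cust_id
        WHERE cl.recvd_date >= ?
          AND cl.opn_area_cat = 'back from repair'
        ORDER BY cl.call_id DESC
    """
    return conn.execute(sql, (since,)).fetchall()


print(tickets_with_last_repair(conn, "2015-10-01"))
# [(210, 1235, None), (200, 1234, 110)]
```

The LEFT JOIN keeps tickets for customers with no repair case on record (the extra column is NULL for them). Note that the derived table returns each customer's highest repair CallID overall, not necessarily one that precedes the ticket in question.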