Update date fields from values returned by a subquery in Oracle SQL

I have an application that scans a table for IDs that do not have a processed date. When an ID is processed, a processed date is added to Table 1, and the entire record is copied to another table (Table 2) if it makes it through the application.
I need to compare the IDs in the two tables, looking for IDs from Table 1 that are not in Table 2 but do have a processed date. This is a sign that the ID was processed but never made it to the end of the application and failed along the way.
I then need to update these records in Table 1 by setting the processed date to NULL so the application picks them up on the next run.
Here is a query that gets the IDs I need:
SELECT Subject_Number
FROM Table1
WHERE NOT EXISTS (SELECT NULL
                  FROM Table2
                  WHERE Table2.Subject_Number = Table1.Subject_Number)
  AND Table1.Processed_Date IS NOT NULL
Now I just need to update the processed date to null for the IDs this returns.
Any help will be greatly appreciated.

The query will be something like this:
UPDATE <table> SET <fields> WHERE <table.id> IN (
    SELECT Subject_Number
    FROM Table1
    WHERE NOT EXISTS (
        SELECT NULL
        FROM Table2
        WHERE Table2.Subject_Number = Table1.Subject_Number)
      AND Table1.Processed_Date IS NOT NULL)
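Plugging in the names from the question, a minimal sketch would be the following; it assumes Table1 is the table whose Processed_Date must be cleared:
UPDATE Table1
SET Processed_Date = NULL
WHERE Processed_Date IS NOT NULL
  AND NOT EXISTS (SELECT NULL
                  FROM Table2
                  WHERE Table2.Subject_Number = Table1.Subject_Number);
Because Table1 is both the table being updated and the table the subquery filters, the NOT EXISTS condition can go directly in the UPDATE's WHERE clause instead of through an IN list.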

Related

Improving insert query for SCD2 solution

I have two insert statements. The first query inserts a new row if the id doesn't exist in the target table. The second query inserts into the target table only if the hash value for a joined id is different (indicating that the row has been updated in the source table) and the id in the source table is not null. These queries are meant for my SCD2 solution, which will be used for inserts of hundreds of thousands of rows. I'm trying not to use the MERGE statement as a matter of practice.
In the "Current" column, a value of 1 indicates that the row is new and 0 indicates that the row has expired. I use this information later to expire my rows in the target table with my update queries.
Besides indexing, is there a better and more efficient way to write my insert queries so that they behave like an SCD2 MERGE statement for inserting new/updated rows?
Query 1:
INSERT INTO TARGET
SELECT Name, Middlename, Age, 1 AS current, Row_HashValue, id
FROM Source s
WHERE s.id NOT IN (SELECT id FROM TARGET) AND s.id IS NOT NULL
Query 2:
INSERT INTO TARGET
SELECT Name, Middlename, Age, 1 AS current, Row_HashValue, id
FROM SOURCE s
LEFT JOIN TARGET t ON s.id = t.id
    AND s.Row_HashValue = t.Row_HashValue
WHERE t.Row_HashValue IS NULL AND s.ID IS NOT NULL
You can use WHERE NOT EXISTS, and have just one INSERT statement:
INSERT INTO TARGET
SELECT Name, Middlename, Age, 1 AS current, Row_HashValue, id
FROM SOURCE s
WHERE NOT EXISTS (
    SELECT 1
    FROM TARGET t
    WHERE s.id = t.id
      AND s.Row_HashValue = t.Row_HashValue)
  AND s.ID IS NOT NULL;
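For the expiration step the question mentions (flipping "Current" from 1 to 0 on superseded rows), a companion UPDATE along these lines is a common pattern; this is only a sketch using the question's column names and a SQL Server style UPDATE...FROM, and it is meant to run after the insert:
-- Expire the old version of any id whose hash no longer matches the source.
-- The freshly inserted version has a matching hash, so it stays Current = 1.
UPDATE t
SET t.[Current] = 0
FROM TARGET t
INNER JOIN SOURCE s
    ON s.id = t.id
WHERE t.[Current] = 1
  AND s.Row_HashValue <> t.Row_HashValue;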

Update a single row in a table in SQL

So, I am creating a new table that gets populated from another table. NewTableA.ColA is populated from an existing column, OldTableB.ColB.
Source query that populates NewTableA.ColA:
SELECT TOP (1) EXEC_END_TIME
FROM CR_STAT_EXECUTION AS cse
WHERE (EXEC_NAME = 'ETL')
ORDER BY EXEC_END_TIME DESC
Destination table (NewTableA.ColA) when scripted out:
SELECT TOP 1 [EXEC_END_TIME]
FROM [SSISHelper].[dbo].[ETLTimeCheck]
ORDER BY EXEC_END_TIME DESC
The problem I am facing is that I only want one row in NewTableA.ColA, which gets updated with the current value from the other table's column. I have already set up an SSIS job to populate the table every day from OldTableB.ColB... I just can't figure out how to update only that one row from OldTableB.ColB.
Thanks.
Use an IF EXISTS condition in SQL: check whether the destination table already has its row, then update it if it does and insert it if it does not.
Example:
IF EXISTS (SELECT * FROM [SSISHelper].[dbo].[ETLTimeCheck])
BEGIN
    -- ...update the existing row...
END
ELSE
BEGIN
    -- ...insert a new row...
END
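Combined with the source query from the question, a fuller sketch could look like this; the @LatestEnd variable and the DATETIME type are assumptions, while the table and column names come from the question:
DECLARE @LatestEnd DATETIME;

-- Grab the most recent ETL end time from the source table.
SELECT TOP (1) @LatestEnd = EXEC_END_TIME
FROM CR_STAT_EXECUTION
WHERE EXEC_NAME = 'ETL'
ORDER BY EXEC_END_TIME DESC;

IF EXISTS (SELECT 1 FROM [SSISHelper].[dbo].[ETLTimeCheck])
BEGIN
    -- The table already has its single row: overwrite it.
    UPDATE [SSISHelper].[dbo].[ETLTimeCheck]
    SET EXEC_END_TIME = @LatestEnd;
END
ELSE
BEGIN
    -- First run: seed the table with one row.
    INSERT INTO [SSISHelper].[dbo].[ETLTimeCheck] (EXEC_END_TIME)
    VALUES (@LatestEnd);
END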

Optimized way to check if record is present in table 1. If not then check table 2, else return default value

Asked in an interview:
I have 2 tables. One table has columns like ID, Name, Address, with id (the primary key) running from 1 to 10000000.
The other table has records from 10000001 to 20000000.
I have to check whether a particular ID is present in table 1 or table 2 and return the corresponding result.
Because the tables are big, I have to think of an optimized way to do this.
DECLARE @ID BIGINT
SET @ID = 10000000
IF EXISTS (SELECT ID FROM TABLE1 WHERE ID = @ID)
    SELECT ID, NAME, ADDRESS FROM TABLE1 WHERE ID = @ID
ELSE IF EXISTS (SELECT ID FROM TABLE2 WHERE ID = @ID)
    SELECT ID, NAME, ADDRESS FROM TABLE2 WHERE ID = @ID
ELSE
    SELECT @ID
A few ideas off the top of my mind:
In Hive, you can use a map-side join, which is much faster than a regular join when one table is large and the other is small (here the "small" side being the ID you are searching for).
You can also optimize the way you store the data: keep it sorted on the id column if such queries are frequent. A columnar format such as ORC keeps track of the range of id values in each file, which makes such lookups faster; see the sketch below.
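If the data lives in Hive, the storage-level suggestion might look roughly like this; the layout, bucket count and table name are assumptions, and only the column list comes from the question:
-- Keep the table bucketed and sorted by id and stored as ORC, so a point
-- lookup on id can skip files/stripes whose min/max id range excludes it.
CREATE TABLE table1_orc (
    id      BIGINT,
    name    STRING,
    address STRING
)
CLUSTERED BY (id) SORTED BY (id) INTO 64 BUCKETS
STORED AS ORC;

-- Let Hive convert joins against a small input into map-side joins.
SET hive.auto.convert.join = true;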

SQL Combining two different tables

(P.S. I am still learning SQL and you can consider me a newbie)
I have 2 sample tables as follows:
Table 1
|Profile_ID| |Img_Path|
Table 2
|Profile_ID| |UName| |Default_Title|
My scenario is: from the 2nd table, I need to fetch all the records that contain a certain word, for which I have the following query:
SELECT Profile_Id, UName
FROM Table2
WHERE CONTAINS(Default_Title, 'Test')
ORDER BY Profile_Id
OFFSET 5 ROWS
FETCH NEXT 20 ROWS ONLY
(Note that I am setting the OFFSET due to requirements.)
Now, the scenario is: as soon as I retrieve a record from the 2nd table, I need to fetch the record from the 1st table based on the Profile_Id.
So, I need to return the following 2 result sets in one single statement:
|Profile_Id| |Img_Path|
|Profile_Id| |UName|
And I need to return the results as side-by-side columns, like:
|Profile_Id| |Img_Path| |UName|
(Note I had to merge the 2 Profile_Id columns into one as they both contain the same data.)
I am still learning SQL and am reading about UNION, JOIN etc., but I am a bit confused as to which way to go.
You can use a join:
select t1.*, t2.UName
from table1 t1 join
     -- page Table2 first, then join that page back to Table1
     (select Profile_Id, UName
      from Table2
      where Contains(Default_Title, 'Test')
      order by Profile_Id
      offset 5 rows fetch next 20 rows only
     ) t2
     on t2.profile_id = t1.profile_id
Alternatively, with a plain inner join (note that this filters on an exact Default_Title match and does no paging):
SELECT a.Profile_Id, a.Img_Path, b.UName
FROM table1 a INNER JOIN table2 b ON a.Profile_Id = b.Profile_Id
WHERE b.Default_Title = 'Test'

SQL Server update query on a table with duplicate records

I would like to know what happens when I do an update with duplicated rows, for example:
Table 1:
Email            StatusID   Status
phil#gmail.com   NULL
dome#yahoo.es    1          Busy
phil#gmail.com   2          Online
dome#yahoo.es    NULL
Table 2:
Email            Name   RejectionStatusID   RejectionStatus
dome#yahoo.es    Dome   1
phil#gmail.com   Phil   2
Result?
Update Table2
SET RejectionStatusID = StatusID,
RejectionStatus = Status
FROM Table2 Inner Join Table1
ON Table2.Email = Table1.Email
I would like to know which of the duplicate rows gets used, and why. Of course I ran the query and know what happens, but why? I just want an explanation.
Thanks.
EDITED:
This is the example; this is what happens with NULL values:
http://sqlfiddle.com/#!6/6ee69/1/0
From BoL https://msdn.microsoft.com/en-us/library/ms177523(v=sql.110).aspx
The results of an UPDATE statement are undefined if the statement includes a FROM clause that is not specified in such a way that only one value is available for each column occurrence that is updated, that is if the UPDATE statement is not deterministic.
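If a deterministic result is needed, a common workaround is to reduce Table1 to exactly one row per Email before joining; the sketch below uses ROW_NUMBER with an arbitrary tie-break (highest StatusID first), which is an assumption and not part of the original question:
-- Pick one row per Email deterministically, then update from that set only.
WITH picked AS (
    SELECT Email, StatusID, Status,
           ROW_NUMBER() OVER (PARTITION BY Email
                              ORDER BY StatusID DESC) AS rn
    FROM Table1
)
UPDATE t2
SET RejectionStatusID = p.StatusID,
    RejectionStatus = p.Status
FROM Table2 t2
INNER JOIN picked p
    ON p.Email = t2.Email
WHERE p.rn = 1;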