SQL Query based on specific conditions - sql

Figure 1 denotes the current state of the TABLE A and TABLE B.
The current implementation fetches the new MIds into Table B and copies the SqlQuery from the base process, whether the market is new or already exists. The following query is used for this:
SELECT A.MId, B1.Loop, B1.Segment, B1.SqlQuery, B1.UseDefault
FROM TableB B1 WITH (NOLOCK)
INNER JOIN TableA A WITH (NOLOCK) ON B1.MId IN (100, 200)
AND B1.MId = A.BaseMarket
AND ISNULL(A.POCId, 0) > 0
LEFT JOIN TableB B2 WITH (NOLOCK) ON A.MId = B2.MId
WHERE B2.MId IS NULL
Figure 2 shows the updated data in Table A and the desired state of Table B. The required implementation would be:
To fetch the new MIds to Table B and copy the SqlQuery from Base Process, if it's a new market (XYZ Market - 2001, 2002)
If the market configuration already exists in Table B (Market ABC - 1001 and 1002), then copy the existing configuration's SqlQuery.
Here's the complete flow for Table A and B. The base configurations (100 and 200) in both tables were inserted manually initially including the loop and segments.
A new market is introduced and a new MId is created in Table A. Let's assume that to be 1001 and 1002 for Market ABC.
Corresponding records are inserted in Table B for each MId and it copies data from Base Configuration in Table B. Inserted Records (SqlId - 3 and 4)
SqlQuery column in Table B is updated manually due to a specific business request. (SqlId - 3 and 4). Hence, the different query.
Market ABC is updated in front end, which creates two new entries in Table A. (MId - 1003 and 1004). Also, new market XYZ (MId - 2001 and 2002) is created.
The corresponding entries created in Table B should refer to the Base Configuration for Market XYZ (SqlId - 7 and 8), since it's a new market, but should copy the existing configuration for Market ABC (MId - 1001 and 1002), since its configuration already existed.
I am looking for suggestions on whether a single query can implement this requirement using a CASE statement. I'd appreciate your help!

I guess by market configuration already exists you actually mean the combination of MarketName and Type. So here's the query
SELECT
A.NewId, B.Loop, B.Segment, B.SqlQuery, B.UseDefault
FROM (
SELECT
A1.MId AS NewId, A2.MId AS RefId
FROM
TableA A1
INNER JOIN
TableA A2
ON
(A1.MarketName = A2.MarketName AND A1.Type = A2.Type) -- use your market configuration logic here
OR
A1.BaseMarket = A2.BaseMarket
WHERE
A1.Mid NOT IN (SELECT MId FROM TableB)
) As A
INNER JOIN
TableB B
ON (A.RefId = B.MID)
First we self-join TableA to get the reference MId (RefId here). Then we join the resulting derived table with TableB.
Hope this helps. Thank you!

Related

Select rows from table where a certain value in a joined table does not exist

I have two tables, playgrounds and maintenance, which are linked with a foreign key. Whenever there is a maintenance on a playground, it will be saved in the table and connected to the respective playground.
Table A (playgrounds):
playground_number
Table B (maintenance):
playground_number (foreign key),
maintenance_type (3 different types),
date
What I now want is to retrieve all the playgrounds on which a certain type of maintenance has NOT been performed yet IN a certain year. For instance all playgrounds that do not have a maintenance_type = 1 in the year 2022 connected yet, although there could be multiple other maintenance_types because they are more frequent.
This is what I have tried (pseudo):
SELECT DISTINCT A.playground_number
FROM table A
JOIN table B ON A.playground_number = B.playground_number (FK)
WHERE NOT EXISTS (SELECT B.maintenance_type FROM table B
WHERE B.maintenance_type = 1 AND year(B.date) = 2022)
However, this returns nothing as soon as there is even one entry with maintenance_type 1 in the table.
I am struggling with this query for a while, so would appreciate some thoughts :) Many thanks.
You need to correlate the EXISTS subquery with the outer table (table A). Also, you don't even need the join.
SELECT DISTINCT a.playground_number
FROM table_a a
WHERE NOT EXISTS (
SELECT 1
FROM table_b b
WHERE b.playground_number = a.playground_number AND
b.maintenance_type = 1 AND
YEAR(b.date) = 2022
);
Please consider this. I don't think you need a JOIN.
SELECT DISTINCT A.playground_number
FROM table_a A
WHERE A.playground_number NOT IN (SELECT B.playground_number FROM table_b B
WHERE B.maintenance_type = 1 AND year(B.date) = 2022)
Please let me know if I have misunderstood the question.
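To make the difference between the two answers concrete, here is a minimal sketch that runs both forms against an in-memory SQLite database. The sample rows are invented for illustration, the date column is assumed to hold ISO-8601 text, and SQLite's strftime('%Y', ...) stands in for YEAR(), which SQLite does not provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (playground_number INTEGER PRIMARY KEY);
    CREATE TABLE table_b (
        playground_number INTEGER REFERENCES table_a(playground_number),
        maintenance_type  INTEGER,
        date              TEXT
    );
    -- Hypothetical sample data: playground 3 has no maintenance at all.
    INSERT INTO table_a VALUES (1), (2), (3);
    INSERT INTO table_b VALUES
        (1, 1, '2022-03-01'),
        (2, 2, '2022-04-10'),
        (2, 1, '2021-06-05');
""")

# Correlated NOT EXISTS: the subquery is evaluated per outer row of table_a.
not_exists_sql = """
    SELECT a.playground_number
    FROM table_a a
    WHERE NOT EXISTS (
        SELECT 1
        FROM table_b b
        WHERE b.playground_number = a.playground_number
          AND b.maintenance_type = 1
          AND strftime('%Y', b.date) = '2022'
    )
    ORDER BY a.playground_number
"""

# NOT IN: the subquery builds the list of playgrounds to exclude.
not_in_sql = """
    SELECT a.playground_number
    FROM table_a a
    WHERE a.playground_number NOT IN (
        SELECT b.playground_number
        FROM table_b b
        WHERE b.maintenance_type = 1
          AND strftime('%Y', b.date) = '2022'
    )
    ORDER BY a.playground_number
"""

via_not_exists = [row[0] for row in conn.execute(not_exists_sql)]
via_not_in = [row[0] for row in conn.execute(not_in_sql)]
print(via_not_exists)
print(via_not_in)
```

On this data both queries return playgrounds 2 and 3. Playground 2 qualifies because its only type 1 maintenance was in 2021.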

populating null rows in table column based on matching IDs via join or otherwise

Just to level set: I'm working within a Vertica database using SQL.
Let's say I have two tables: Table A and Table B. Let's also say that Table A is my final/master table used for data visualization within Tableau (or something similar), and that Table B feeds certain columns into Table A based on matches within a third table, Table C (which is not relevant to this conversation).
As is, Table A has columns:
ProgramName [varchar(50)]
CustomerName [varchar(50)]
Total_Cost [numeric(18,4)]
As is, Table B has columns:
CustomerCode [varchar(10)]
Total_Cost [numeric(18,4)]
What I would like to do is update Table A's CustomerName column to equal CustomerCode in Table B where the columns of total_cost_dollars equal each other across tables.
I've run this left join query to ensure that, when I do update Table A's CustomerName to equal CustomerCode, the total cost columns are exact/true matches for my entire data set.
SELECT
A.ProgramName,
A.CustomerName,
A.total_cost_dollars,
B.CustomerCode,
B.total_cost_dollars
FROM
TableA A
LEFT JOIN
TableB B
ON
B.total_cost_dollars = A.total_cost_dollars
WHERE
A.CustomerName IS NULL;
Any idea on how to solve this problem?
Since Vertica supports MERGE, you can use a MERGE statement:
merge into TableA A
using TableB B
ON (B.total_cost_dollars = A.total_cost_dollars)
when matched then
update
set
A.CustomerName = B.CustomerCode
where
A.CustomerName IS NULL;
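The effect of that statement can be tried out locally with a small sketch. SQLite has no MERGE, so the example below swaps in a correlated UPDATE that fills the same rows. It is not Vertica syntax, and the sample rows are invented. It also assumes each total_cost_dollars value matches at most one row in TableB.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ProgramName TEXT, CustomerName TEXT, total_cost_dollars NUMERIC);
    CREATE TABLE TableB (CustomerCode TEXT, total_cost_dollars NUMERIC);
    -- Hypothetical sample data.
    INSERT INTO TableA VALUES
        ('P1', NULL,   100.5),
        ('P2', 'Acme', 200.0),
        ('P3', NULL,   999.0);
    INSERT INTO TableB VALUES ('C100', 100.5), ('C200', 200.0);
""")

# Fill CustomerName only where it is NULL and a cost match exists in TableB.
conn.execute("""
    UPDATE TableA
    SET CustomerName = (
        SELECT B.CustomerCode
        FROM TableB B
        WHERE B.total_cost_dollars = TableA.total_cost_dollars
    )
    WHERE CustomerName IS NULL
      AND EXISTS (
        SELECT 1
        FROM TableB B
        WHERE B.total_cost_dollars = TableA.total_cost_dollars
    )
""")

rows = conn.execute(
    "SELECT ProgramName, CustomerName FROM TableA ORDER BY ProgramName"
).fetchall()
print(rows)
```

P1 picks up C100, P2 keeps its existing name, and P3 stays NULL because no cost matches.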

Change Data Capture Using Spark SQL

I have a few tables which are related as A -> Left Join -> B -> Left Join -> C. Let's call A the driving table and B & C the "supporting" tables. Each of these tables has a last-update-date column. My requirement is to identify the records that changed since the last processing date (available as a parameter), not only in the driving table but also when a change to any column occurs in the supporting table(s).
Table A
------
empid|salary|last_updt_dt
123|20000|05/14/2019
Table B
-------
empid|fname|lname|last_updt_date
123|John|Taylor|05/16/2019
Table C
-------
empid|address|last_updt_dt
123|Maryland|05/17/2019
Assume last_exec_date = 05/10/2019
So, assuming executing job on Day 1 (05/20/2019) output should be:
empid|fname|lname|salary|address|last_exec_date
-----------------------------------------------
123|John|Taylor|20000|Maryland|05/20/2019
Now, let's assume that on Day 2 (05/21/2019), the address got changed from Maryland to California. So, on Day 2, the output table should look like:
empid|fname|lname|salary|address|last_exec_date
-----------------------------------------------
123|John|Taylor|20000|Maryland|05/20/2019
123|John|Taylor|20000|California|05/21/2019
561|Peter|Anderson|50000|Missouri|05/21/2019
The point to note is that on Day 2 a change in any "supporting table" (Table-C 'address' column in this case) triggered insertion of another record which was already processed earlier yesterday, but now with Updated value in address column. Also note, on Day 2 other inserts will happen as-is as regular inserts for any other qualifying record (if any) e.g. empid=561.
SELECT
A.empid, B.fname, B.lname, A.salary, C.address, current_date() as last_exec_date
from A
left outer join B
on A.empid = B.empid
left outer join C
on B.empid = C.empid
where to_date(A.last_updt_dt, 'yyyyMMdd') > {last_exec_date}
OR to_date(B.last_updt_date, 'yyyyMMdd') > {last_exec_date}
OR to_date(C.last_updt_dt, 'yyyyMMdd') > {last_exec_date}
My challenge is how to trigger and propagate any changes from any of the participating supporting tables, even when that change pertains to a record which had been processed and inserted to the target table earlier, so that a new record with the updated value shows in the target table.
In other words, how can I trigger a record when the change comes from any of the supporting (non-driver) tables?
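One way to read the requirement is that a row qualifies when the update date in any of the three tables is later than the last processing date. The sketch below tries that idea on the sample data using an in-memory SQLite database rather than Spark SQL. It assumes dates are stored as ISO-8601 text so that plain string comparison orders them correctly, and the dates for empid 561 are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (empid INTEGER, salary INTEGER, last_updt_dt TEXT);
    CREATE TABLE B (empid INTEGER, fname TEXT, lname TEXT, last_updt_date TEXT);
    CREATE TABLE C (empid INTEGER, address TEXT, last_updt_dt TEXT);
    INSERT INTO A VALUES (123, 20000, '2019-05-14');
    INSERT INTO B VALUES (123, 'John', 'Taylor', '2019-05-16');
    INSERT INTO C VALUES (123, 'Maryland', '2019-05-17');
""")

# A row is picked up if ANY of the three tables changed after the last run.
CHANGED_SQL = """
    SELECT A.empid, B.fname, B.lname, A.salary, C.address
    FROM A
    LEFT JOIN B ON A.empid = B.empid
    LEFT JOIN C ON B.empid = C.empid
    WHERE A.last_updt_dt   > :last_exec_date
       OR B.last_updt_date > :last_exec_date
       OR C.last_updt_dt   > :last_exec_date
    ORDER BY A.empid
"""

def changed_since(last_exec_date):
    """Return joined rows touched in any table after last_exec_date."""
    return conn.execute(CHANGED_SQL, {"last_exec_date": last_exec_date}).fetchall()

# Day 1: everything is newer than 2019-05-10.
day1 = changed_since("2019-05-10")

# Day 2: only supporting table C changes for 123, and employee 561 arrives.
conn.execute(
    "UPDATE C SET address = 'California', last_updt_dt = '2019-05-21' WHERE empid = 123"
)
conn.execute("INSERT INTO A VALUES (561, 50000, '2019-05-21')")
conn.execute("INSERT INTO B VALUES (561, 'Peter', 'Anderson', '2019-05-21')")
conn.execute("INSERT INTO C VALUES (561, 'Missouri', '2019-05-21')")
day2 = changed_since("2019-05-20")

print(day1)
print(day2)
```

Appending each run's result to the target table with the run date would then give the history shown in the expected output, under the assumptions stated above.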

Read Rows But search first

I want to build an import system that looks into one data source and copies new records into another data source.
Every month I want to copy data from some tables in one data source to another.
SourceTableName : srcTable
DestinationTableName : destTable
Suppose that in the first month the source table contains:
Id Name
1 John
3 Rahul
5 Andrew
All three rows will be copied into destTable.
Suppose that in the second month the source table contains:
Id Name
1 John
3 Rahul
5 Andrew
6 Vikas
7 Sonam
8 Divya
First, the SQL should get the last row of destTable, match that row in srcTable, then extract all newer records from srcTable and copy them into destTable.
.....
Please let me know how I can write a query for this purpose. If there is a shorter approach, that would be helpful too.
Since you only care about adding new records, and don't need to handle updates or deletes, you can simply add each record from the source table if it doesn't exist in the destination table:
INSERT INTO destTable (ID, Name)
SELECT s.ID, s.Name
FROM
srcTable s
LEFT OUTER JOIN destTable d ON d.ID = s.ID
WHERE
d.ID IS NULL
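As a quick check of this anti-join insert, here is a small sketch that runs it twice against an in-memory SQLite database, once per "month" from the question. The table and column names follow the answer. The use of SQLite is an assumption made only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE srcTable  (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE destTable (ID INTEGER PRIMARY KEY, Name TEXT);
    INSERT INTO srcTable VALUES (1, 'John'), (3, 'Rahul'), (5, 'Andrew');
""")

# Copy only the source rows whose ID has no match in the destination.
COPY_NEW_SQL = """
    INSERT INTO destTable (ID, Name)
    SELECT s.ID, s.Name
    FROM srcTable s
    LEFT OUTER JOIN destTable d ON d.ID = s.ID
    WHERE d.ID IS NULL
"""

def dest_count():
    return conn.execute("SELECT COUNT(*) FROM destTable").fetchone()[0]

# Month 1: destTable is empty, so all three rows are copied.
conn.execute(COPY_NEW_SQL)
after_month1 = dest_count()

# Month 2: three new rows appear in the source; only those are copied.
conn.execute("INSERT INTO srcTable VALUES (6, 'Vikas'), (7, 'Sonam'), (8, 'Divya')")
conn.execute(COPY_NEW_SQL)
after_month2 = dest_count()

dest_ids = [row[0] for row in conn.execute("SELECT ID FROM destTable ORDER BY ID")]
print(after_month1, after_month2, dest_ids)
```

Because the statement compares IDs rather than looking only at the last row, it can be re-run safely and also picks up IDs that fill gaps, such as a later-added ID 2.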
You can write a stored procedure to do this and execute it whenever you want.
For this you can use the query below:
(Part 1 inserts new data, Part 2 updates changed data)
Insert Into DestinationTable(ID, Name)
Select ID, Name
From SourceTable
Where Not Exists
(Select *
From DestinationTable
Where DestinationTable.ID = SourceTable.ID)
Go
Update DestinationTable
Set DestinationTable.Name = SourceTable.Name
From DestinationTable, SourceTable
Where DestinationTable.ID = SourceTable.ID
I hope it's helpful.

SQL Query - Ensure a row exists for each value in ()

I'm currently struggling to find a way to validate two tables efficiently (Table A has lots of rows).
I have two tables
Table A
ID
A
B
C
Table matched
ID Number
A 1
A 2
A 9
B 1
B 9
C 2
I am trying to write a SQL Server query that checks that, for every value in Table A, there exists a row for each of a variable set of values (1, 2, 9).
The example above is incorrect because, for every record in A, there should be a corresponding record in Table matched for each value (1, 2, 9). The end goal is:
Table matched
ID Number
A 1
A 2
A 9
B 1
B 2
B 9
C 1
C 2
C 9
I know it's confusing, but in general, for every X in (some set) there should be a corresponding record in Table matched. I have obviously simplified things.
Please let me know if you all need clarification.
Use:
SELECT a.id
FROM TABLE_A a
JOIN TABLE_B b ON b.id = a.id
WHERE b.number IN (1, 2, 9)
GROUP BY a.id
HAVING COUNT(DISTINCT b.number) = 3
The DISTINCT in the COUNT prevents duplicates (i.e. A having two records in TABLE_B with the value "2") from being falsely counted as a correct record. It can be omitted if the number column is covered by a unique or primary key constraint.
The HAVING COUNT(...) must equal the number of values provided in the IN clause.
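The following sketch runs that query on the question's sample rows in an in-memory SQLite database. The required values are passed as parameters so that the count in the HAVING clause always equals the size of the IN list. The table names follow the answer, and the `required` tuple is a name introduced here for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE_A (id TEXT PRIMARY KEY);
    CREATE TABLE TABLE_B (id TEXT, number INTEGER);
    INSERT INTO TABLE_A VALUES ('A'), ('B'), ('C');
    INSERT INTO TABLE_B VALUES
        ('A', 1), ('A', 2), ('A', 9),
        ('B', 1), ('B', 9),
        ('C', 2);
""")

required = (1, 2, 9)  # the variable set of values every id must have
placeholders = ", ".join("?" for _ in required)

# Build the IN list from placeholders; the final "?" receives len(required).
sql = f"""
    SELECT a.id
    FROM TABLE_A a
    JOIN TABLE_B b ON b.id = a.id
    WHERE b.number IN ({placeholders})
    GROUP BY a.id
    HAVING COUNT(DISTINCT b.number) = ?
    ORDER BY a.id
"""
complete_ids = [row[0] for row in conn.execute(sql, (*required, len(required)))]
print(complete_ids)
```

Only A is returned here, since B lacks 2 and C lacks 1 and 9. Note that this query lists the ids that are complete, not the rows that are missing.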
Create a temp table of values you want. You can do this dynamically if the values 1, 2 and 9 are in some table you can query from.
Then, SELECT FROM tempTable WHERE NOT IN (SELECT * FROM TableMatched)
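One possible reading of the temp-table idea is sketched below: cross join every ID in Table A with the required values, then keep the pairs that have no row in TableMatched. The temp table name RequiredNumbers is hypothetical, and SQLite is used only so the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ID TEXT PRIMARY KEY);
    CREATE TABLE TableMatched (ID TEXT, Number INTEGER);
    -- Hypothetical temp table holding the values each ID must have.
    CREATE TEMP TABLE RequiredNumbers (Number INTEGER PRIMARY KEY);

    INSERT INTO TableA VALUES ('A'), ('B'), ('C');
    INSERT INTO TableMatched VALUES
        ('A', 1), ('A', 2), ('A', 9),
        ('B', 1), ('B', 9),
        ('C', 2);
    INSERT INTO RequiredNumbers VALUES (1), (2), (9);
""")

# Every (ID, Number) pair that should exist but has no row in TableMatched.
missing = conn.execute("""
    SELECT a.ID, r.Number
    FROM TableA a
    CROSS JOIN RequiredNumbers r
    WHERE NOT EXISTS (
        SELECT 1
        FROM TableMatched m
        WHERE m.ID = a.ID AND m.Number = r.Number
    )
    ORDER BY a.ID, r.Number
""").fetchall()
print(missing)
```

The result lists exactly the rows that would need to be inserted to reach the end goal in the question.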
I had this situation one time. My solution was as follows.
In addition to TableA and TableMatched, there was a table that defined the rows that should exist in TableMatched for each row in TableA. Let’s call it TableMatchedDomain.
The application then accessed TableMatched through a view that controlled the returned rows, like this:
create view TableMatchedView as
select a.ID,
d.Number,
m.OtherValues
from TableA a
cross join TableMatchedDomain d
left join TableMatched m on m.ID = a.ID and m.Number = d.Number
This way, the rows returned were always correct. If rows were missing from TableMatched, the Numbers were still returned, but with OtherValues as null. If there were extra values in TableMatched, they were not returned at all, as though they didn't exist. By changing the rows in TableMatchedDomain, this behavior could be controlled very easily. If a value were removed from TableMatchedDomain, it would disappear from the view. If it were added back again in the future, the corresponding OtherValues would appear again as they were before.
The reason I designed it this way was that I felt that establishing an invariant on the row configuration in TableMatched was too brittle and, even worse, introduced redundancy. So I removed the restriction from groups of rows (in TableMatched) and instead made the entire contents of another table (TableMatchedDomain) define the correct form of the data.
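The behavior described above can be checked with a small sketch. It assumes the domain table is combined with TableA through a cross join, uses invented sample rows, and runs on an in-memory SQLite database rather than SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ID TEXT PRIMARY KEY);
    CREATE TABLE TableMatchedDomain (Number INTEGER PRIMARY KEY);
    CREATE TABLE TableMatched (ID TEXT, Number INTEGER, OtherValues TEXT);

    -- Hypothetical data: B is missing Number 2 and has an extra Number 7.
    INSERT INTO TableA VALUES ('A'), ('B');
    INSERT INTO TableMatchedDomain VALUES (1), (2);
    INSERT INTO TableMatched VALUES
        ('A', 1, 'a1'), ('A', 2, 'a2'),
        ('B', 1, 'b1'), ('B', 7, 'extra');

    -- One row per (ID, domain Number); stored values are attached if present.
    CREATE VIEW TableMatchedView AS
    SELECT a.ID, d.Number, m.OtherValues
    FROM TableA a
    CROSS JOIN TableMatchedDomain d
    LEFT JOIN TableMatched m ON m.ID = a.ID AND m.Number = d.Number;
""")

view_rows = conn.execute(
    "SELECT ID, Number, OtherValues FROM TableMatchedView ORDER BY ID, Number"
).fetchall()
print(view_rows)
```

The missing pair (B, 2) appears with a null OtherValues, and the extra row (B, 7) is not returned, which matches the description.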