MS SQL - Two tables, simple flag calculation to check if a value is present in another table


Long time viewer and my first question. Please be gentle.
I am having issues writing a query that incorporates tables with 1-1 / 1-M relationships.
To keep it simple - I have two tables
Tables
Query - Return the entire list of cases from Table 1 and add a new column holding a Y/N flag that shows whether the case has a car in Table 2, while keeping one row per case (the 1-1 relationship).
Outputs

Try using exists logic to check, for each table 1 record, if it has a matching car record in the second table:
SELECT
t1.caseno,
CASE WHEN EXISTS (SELECT 1 FROM Table2 t2
WHERE t1.caseno = t2.caseno AND t2.Product = 'Car')
THEN 'Y' ELSE 'N' END AS car_flag
FROM Table1 t1
ORDER BY
t1.caseno;
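As a quick check of the EXISTS approach above, here is a minimal sketch that runs the same query against SQLite through Python's sqlite3 module. The sample rows are made up for illustration (the original tables were not preserved), and SQLite stands in for SQL Server, so treat it as a demonstration of the pattern rather than of the exact platform.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (caseno INTEGER PRIMARY KEY);
    CREATE TABLE Table2 (caseno INTEGER, Product TEXT);
    INSERT INTO Table1 VALUES (1), (2), (3);
    -- Hypothetical data: case 1 has two cars (1-M), case 2 has no car,
    -- case 3 has no rows in Table2 at all.
    INSERT INTO Table2 VALUES (1, 'Car'), (1, 'Car'), (1, 'Boat'), (2, 'Boat');
""")

flags = conn.execute("""
    SELECT t1.caseno,
           CASE WHEN EXISTS (SELECT 1 FROM Table2 t2
                             WHERE t2.caseno = t1.caseno
                               AND t2.Product = 'Car')
                THEN 'Y' ELSE 'N' END AS car_flag
    FROM Table1 t1
    ORDER BY t1.caseno
""").fetchall()

# One row per case, even though case 1 matches two car rows.
print(flags)
```

Because EXISTS only tests for the presence of a match, it cannot multiply rows the way a plain join to Table2 would.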

Related

Get the "most" optimal row in a JOIN

Problem
I have two tables, and I would like the entries from table 2 (let's call it table_2) to be matched up with the entries in table 1 (table_1) such that no row of table_2 is used more than once in the match-up.
Discussion
Specifically, each table has a datetime stamp (the field is utcdatetime). For each row in table_1, I want to find the row in table_2 that has the closest utcdatetime to the table_1 utcdatetime, such that the table_2 utcdatetime is older than the table_1 utcdatetime and within 30 minutes of it. Here is the catch: I do not want any repeats. If a row in table_2 gets gobbled up in a match on an earlier row in table_1, then I do not want it considered for a match later.
This has currently been implemented in a Python routine, but it is slow to iterate over all of the rows in table 1 as it is large. I thought I was there with a single SQL statement, but I found that my current SQL results in duplicate table 2 rows in the output data.

I would recommend using a nested select to get whatever results you're looking for.
For instance:
select *
from person p
where p.name_first = 'SCCJS'
and not exists (select 'x' from person p2 where p2.person_id != p.person_id
and p.name_first = 'SCCJS' and p.name_last = 'SC')
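For the timestamp matching described in the question, a single linear pass may be enough to replace the slow row-by-row loop. The sketch below is one possible reading of the rule, not the asker's actual routine: it assumes "earlier row in table 1" means an earlier utcdatetime, and the function name match_closest_earlier and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def match_closest_earlier(t1_times, t2_times, window=timedelta(minutes=30)):
    """Pair each table_1 time with the latest unused, strictly earlier
    table_2 time that lies within `window`; unmatched rows get None."""
    t2_sorted = sorted(t2_times)
    unused = []  # stack of earlier table_2 times not yet matched, latest on top
    j = 0
    pairs = []
    for t1 in sorted(t1_times):
        # Move every table_2 time older than this table_1 time onto the stack.
        while j < len(t2_sorted) and t2_sorted[j] < t1:
            unused.append(t2_sorted[j])
            j += 1
        if unused and t1 - unused[-1] <= window:
            pairs.append((t1, unused.pop()))  # consume the row: no repeats
        else:
            # The closest candidate is already too old, so all older ones
            # are too old for this and every later table_1 time.
            unused.clear()
            pairs.append((t1, None))
    return pairs


t1 = [datetime(2020, 1, 1, 10, 0), datetime(2020, 1, 1, 10, 5),
      datetime(2020, 1, 1, 11, 0)]
t2 = [datetime(2020, 1, 1, 9, 20), datetime(2020, 1, 1, 9, 50),
      datetime(2020, 1, 1, 9, 58)]
pairs = match_closest_earlier(t1, t2)
for left, right in pairs:
    print(left, "->", right)
```

After sorting, each row of either table is pushed and popped at most once, so the cost is dominated by the two sorts.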

SQL (VBA/ADO) command to populate NULL fields with corresponding non-NULL values from duplicate records

I have a database where several hundred records have been duplicated. However the duplicated information is not the same across all fields. For any two lines, the first line will contain information in some fields while the duplicate line's fields are blank; but then for other fields, the duplicate (second) line will contain information while the first line's fields are blank. For example, it looks like this:
ID   Deleted  Reference  Name   Case_Date   Outcome  Outcome_Date
100  False    A123       Chris  2000-01-01  Yes
101  False    A123       Chris                       2000-03-31
The ID column is a unique primary key for the record.
The Reference column is the one by which I can identify the duplicates.
However as you can see, the first record (100) contains information in Case_Date and Outcome, but the second record (101) contains an Outcome_Date.
What I want to do is to copy as much information as possible into just one of each pair of records, and then mark the duplicate as deleted (I use a soft-delete, not actually removing records from the table but just flagging the Deleted column as True). With the above example, I want it to look like this:
ID   Deleted  Reference  Name   Case_Date      Outcome  Outcome_Date
100  False    A123       Chris  2000-01-01     Yes      2000-03-31
101  True     A123       Chris  (2000-01-01)*  (Yes)*   2000-03-31
Technically it will not be necessary to also copy information into the blank fields of the record which will be marked as deleted, but I figure it's easier to just copy everything and then mark the "second" record as the duplicate, rather than trying to work out which one contains more information and which one contains less.
I am also aware that it will be easier to run a separate SQL command for each column than to try to do them all at once. The columns shown above are a simplified example, and the information which may or may not be present across each column differs.
My select query for the record set of duplicates is:
SELECT *
FROM [Cohorts]
WHERE [Deleted] = False
AND ([CaseType] = "Female" OR [CaseType] = "Family")
AND [Reference] Is Not Null
And [Reference] In (SELECT [Reference] FROM [Cohorts] As Tmp
WHERE [Deleted] = False
AND ([CaseType] = "Female" OR [CaseType]="Family")
GROUP BY [Reference]
HAVING Count(*) > 1)
ORDER BY [Reference];
This will return all (Female/Family) records in the table [Cohorts] where there exists more than one record with the same Reference (and where the records have not been marked as deleted).
I'm running my queries from VBA via ADO, so can execute UPDATE statements. My database is an Access-compatible .mdb using the JET engine.
Grateful if anyone could suggest a suitable SQL command which I can run per column in order to populate the NULL fields with the values of the non-NULL fields from the relevant duplicate records. It's a bit beyond my SQL understanding at present! Thanks.

My first UPDATE JOIN ever, hope it works (untested):
update t1
set t1.name = coalesce(t1.name, t2.name),
t1.Case_Date = coalesce(t1.Case_Date, t2.Case_Date),
t1.Outcome = coalesce(t1.Outcome, t2.Outcome),
t1.Outcome_Date = coalesce(t1.Outcome_Date, t2.Outcome_Date),
t1.deleted = case when t1.id < t2.id then FALSE else TRUE end
from Cohorts t1
join Cohorts t2 on t1.Reference = t2.Reference and t1.id <> t2.id
Edit: Alternative solution:
Create a copy table, do insert select:
insert into CohortsCopy (Deleted, Reference, Name, Case_Date, Outcome, Outcome_Date)
select case when t1.id < t2.id or t2.id is null then FALSE else TRUE end,
coalesce(t1.Reference, t2.Reference),
coalesce(t1.name, t2.name),
coalesce(t1.Case_Date, t2.Case_Date),
coalesce(t1.Outcome, t2.Outcome),
coalesce(t1.Outcome_Date, t2.Outcome_Date)
from Cohorts t1
left join Cohorts t2 on t1.Reference = t2.Reference and t1.id <> t2.id
Then either rename, or copy back to original table.
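The per-column approach mentioned in the question can be sketched with a correlated subquery instead of an UPDATE JOIN. The example below runs against SQLite via Python's sqlite3 as a stand-in, so it only illustrates the logic: JET has its own syntax (for instance IIf rather than COALESCE or CASE), and Deleted is stored here as 0/1 as an assumption of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Cohorts (
        ID INTEGER PRIMARY KEY, Deleted INTEGER, Reference TEXT, Name TEXT,
        Case_Date TEXT, Outcome TEXT, Outcome_Date TEXT
    );
    INSERT INTO Cohorts VALUES
        (100, 0, 'A123', 'Chris', '2000-01-01', 'Yes', NULL),
        (101, 0, 'A123', 'Chris', NULL, NULL, '2000-03-31');
""")

# One UPDATE per column. The column names come from a fixed tuple, so
# formatting them into the SQL text is safe here.
for col in ("Name", "Case_Date", "Outcome", "Outcome_Date"):
    conn.execute(f"""
        UPDATE Cohorts
        SET {col} = (SELECT MAX(d.{col})      -- MAX ignores NULLs
                     FROM Cohorts d
                     WHERE d.Reference = Cohorts.Reference
                       AND d.ID <> Cohorts.ID
                       AND d.Deleted = 0)
        WHERE {col} IS NULL AND Deleted = 0
    """)

# Soft-delete every record that is not the lowest ID for its Reference.
conn.execute("""
    UPDATE Cohorts
    SET Deleted = 1
    WHERE Deleted = 0
      AND ID > (SELECT MIN(d.ID) FROM Cohorts d
                WHERE d.Reference = Cohorts.Reference AND d.Deleted = 0)
""")

rows = conn.execute("""
    SELECT ID, Deleted, Case_Date, Outcome, Outcome_Date
    FROM Cohorts ORDER BY ID
""").fetchall()
print(rows)
```

Only NULL fields are written, so existing values are never overwritten. If two duplicates hold different non-NULL values for the same column, each simply keeps its own.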

condition that applies to multiple lines before update is made in another table

I have two tables, one table is linked by a foreign key to the other.
I need to create an update statement that would update table one based on a condition in table two. However, the conditions have to relate to one or more lines in table two.
Example - Table Two
Order A
line 1 = open (status column)
line 2 = closed (status column)
When line 1 is also closed (same order number - in this case A), the condition is met, so the order will then be closed (updated to closed) in the other table. Table one only has header information (no multiple lines).
I am having trouble writing a condition that looks at multiple lines in table two (all lines have to be closed) before the update to table one is made.
Any helpful suggestions would be appreciated.

Assuming that (1) the key name is ordernum, and (2) EVERY table1 entry has at least one entry in table2, this is a simple query that should work for you. Basically, the not exists clause is testing that there are no open lines in the second table.
update table1
set table1.status = 'closed'
where table1.status = 'open'
and not exists
(select 1
from table2
where table2.ordernum = table1.ordernum
and table2.status = 'open')
This may need to be tweaked further based on your business requirements.
Update based on user request: You could try this, but performance may take a hit, I've not tested it:
update table1
set table1.status = 'closed',
table1.count_lines =
(select count(1)
from table2 y
where y.ordernum = table1.ordernum
)
where table1.status = 'open'
and not exists
(select 1
from table2 z
where z.ordernum = table1.ordernum
and z.status = 'open')
Update 2: Try this, based on your last comment. You want to update the ZORDER table with the sum of all the line item prices, so you must UPDATE ZORDER, not ZORDERLINE. The total is found by summing EXTENDED_PRICE on all ZORDERLINE rows that match the order id. There may be some additional business logic needed in the first sub-query (e.g., if you need to exclude certain statuses like canceled items), but this should get you very close.
UPDATE zorder
SET zorder.status = 3,
zorder.total =
(SELECT SUM(Y.EXTENDED_PRICE)
FROM ZORDERLINE Y
WHERE Y.order_id = zorder.order_id)
WHERE zorder.status = 1
and not exists
(select 1
from zorderline z
where z.order_id = zorder.order_id
and z.status = 1)
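To see how the NOT EXISTS condition from the first update behaves, here is a small sketch run against SQLite through Python's sqlite3 (a stand-in for whatever database is actually in use). The sample orders are invented; order C, which has no lines at all, is included to show why assumption (2) above matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ordernum TEXT PRIMARY KEY, status TEXT);
    CREATE TABLE table2 (ordernum TEXT, line INTEGER, status TEXT);
    -- Hypothetical data: A has all lines closed, B still has an open line,
    -- C has no lines at all.
    INSERT INTO table1 VALUES ('A', 'open'), ('B', 'open'), ('C', 'open');
    INSERT INTO table2 VALUES
        ('A', 1, 'closed'), ('A', 2, 'closed'),
        ('B', 1, 'open'),   ('B', 2, 'closed');
""")

# Close an order only when none of its lines is still open.
conn.execute("""
    UPDATE table1
    SET status = 'closed'
    WHERE status = 'open'
      AND NOT EXISTS (SELECT 1
                      FROM table2
                      WHERE table2.ordernum = table1.ordernum
                        AND table2.status = 'open')
""")

statuses = conn.execute(
    "SELECT ordernum, status FROM table1 ORDER BY ordernum"
).fetchall()
print(statuses)
```

Order C is closed as well, because an order with no lines has no open lines. If that is not wanted, an additional EXISTS check for at least one line would be needed.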

Update a table based on a results of a group by

I've got a tricky update problem I'm trying to solve. There are two tables that contain the same three columns plus additional varied columns, looking like this:
Table1 {pers_id, loc_id, pos, ... }
Table2 {pers_id, loc_id, pos, ... }
None of the fields are unique. The first two fields collectively identify the records in a table (or tables) as belonging to the same entity. Table1 could have 15 records belonging to an entity, and table2 could have 4 records belonging to the same entity. The third column 'pos' is an index from 0 to whatever, and this is the column that I'm trying to update.
In Table1 and in Table2, the pos column begins at 0, and increments based on user selection, so that in the example (15 records in table1 and 4 records in table2), table1 contains 'pos' values of 0 - 14, and Table2 contains 'pos' values of 0-3.
I want to increment the pos field in Table1 with the results of the count of similar entities in Table2. This is the sql statement that correctly gives me the results from table2:
select table2.pers_id, table2.loc_id, count(*) as pos_increment from table2 group by table2.pers_id, table2.loc_id;
The end result of the update, in the example (15 records in table1 and 4 records in table2), would be all records in Table1 of the same entity being incremented by 4 (the result of the group by for that specific entity). 0 would be changed to 4, 14 to 18, etc.
Is this achievable in a single statement?

Since you only need to increment the pos field the solution is really simple:
update table1 t1
set t1.pos = t1.pos +
(select count(1)
from table2 t2
where t2.pers_id = t1.pers_id
and t2.loc_id = t1.loc_id)
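A minimal sketch of that correlated-subquery update, run against SQLite through Python's sqlite3 with invented sample rows. SQLite does not accept a qualified column such as t1.pos in SET, so the table name is used directly; the database in the question may allow the alias form shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (pers_id INTEGER, loc_id INTEGER, pos INTEGER);
    CREATE TABLE table2 (pers_id INTEGER, loc_id INTEGER, pos INTEGER);
    -- Hypothetical data: entity (1, 1) has 3 rows in table1 and 2 in table2;
    -- entity (2, 1) has no rows in table2.
    INSERT INTO table1 VALUES (1, 1, 0), (1, 1, 1), (1, 1, 2), (2, 1, 0);
    INSERT INTO table2 VALUES (1, 1, 0), (1, 1, 1);
""")

# Shift every pos in table1 by the number of table2 rows for the same entity.
conn.execute("""
    UPDATE table1
    SET pos = pos + (SELECT COUNT(*)
                     FROM table2 t2
                     WHERE t2.pers_id = table1.pers_id
                       AND t2.loc_id = table1.loc_id)
""")

positions = conn.execute(
    "SELECT pers_id, loc_id, pos FROM table1 ORDER BY pers_id, loc_id, pos"
).fetchall()
print(positions)
```

COUNT returns 0 rather than NULL when an entity has no rows in table2, so those rows keep their original pos.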

Yes, this is possible. You can use MERGE for some of these updates, and there are ways to relate values between the update and the subselect. I have done this in the past, but it's tricky and I don't have an existing example.
You can find several examples on this site, some for Oracle and some for other databases, that will work with slight modifications.

SQL Server 2005 - How to take a record from one table and loop through another table looking for match

I have 2 tables. For this example I will use only one user's records.
The first table has the user name and an evaluation date as such:
USER EVALDATE
--------------
bobr 6/7/2010
bobr 9/20/2010
bobr 9/21/2010
The above table needs to be joined against this user history table, which has the history of the IDs and the dates they were valid, to look for a match (the NULL date means current):
USER STARTDATE ENDDATE
----------------------------
bobr 2/20/2006 4/18/2010
bobr2 4/19/2010 9/7/2010
bobr 9/8/2010 null
What I'm trying to do in SQL Server 2005 is take the first record from the first table, loop it through the second table, and if the EVALDATE is within one of these date ranges and the IDs match, flag that record from the first table as valid.
The current code takes each record from the first table, runs it against all rows of the second table, and kicks out a record for every history row whose date range does not contain the EVALDATE. So a record is flagged as invalid because its EVALDATE is not between the dates of the first record in the history table, even though the record is fine: its EVALDATE falls between the start and end dates of the third record in the history table.
I hope this makes sense! In something like SAS I can create an array and loop through checking against each record in the history table. How do I do this in SQL? What I was trying to do was just update the first table with a flag if the record's dates are invalid. Any ideas? Thanks!!!

Try this:
SELECT [USER]
      ,[EVALDATE]
      ,CASE WHEN ( SELECT COUNT(*)
                   FROM [UserStartEndDates] b
                   WHERE [a].[USER] = [b].[User]
                     AND [EVALDATE] BETWEEN [STARTDATE]
                                        AND COALESCE([ENDDATE],[EVALDATE])
                 ) > 0 THEN 1
            ELSE 0
       END AS [IsValid]
FROM [Evaluations] a

You can try something like
Select *
FROM Users u INNER JOIN
UserHistory uh ON u.[User] = uh.[User]
AND u.EvalDate BETWEEN uh.StartDate
AND ISNULL(uh.EndDate, u.EvalDate)
EDIT
Try this for all values from Users (USER is a reserved word in SQL Server, so the column name is bracketed):
Select u.*,
CASE
WHEN uh.[User] IS NULL
THEN 'Invalid'
ELSE 'Valid'
END Validity
FROM Users u LEFT JOIN
UserHistory uh ON u.[User] = uh.[User]
AND u.EvalDate BETWEEN uh.StartDate
AND ISNULL(uh.EndDate, u.EvalDate)

Try something like:
select * from [table2] t2
join [table1] t1 on t1.[user] = t2.[user]
--or better yet the foreign key
where t1.evaldate
between t2.startdate and isnull(t2.enddate, t1.evaldate)
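The range check used in the answers above can be tried out on the question's sample data. The sketch below uses SQLite through Python's sqlite3 rather than SQL Server 2005, so a few details are assumptions of the sketch: dates are stored as ISO strings so that BETWEEN compares them correctly, the column is called username to sidestep the reserved word, and the table names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Evaluations (username TEXT, evaldate TEXT);
    CREATE TABLE UserHistory (username TEXT, startdate TEXT, enddate TEXT);
    INSERT INTO Evaluations VALUES
        ('bobr', '2010-06-07'), ('bobr', '2010-09-20'), ('bobr', '2010-09-21');
    INSERT INTO UserHistory VALUES
        ('bobr',  '2006-02-20', '2010-04-18'),
        ('bobr2', '2010-04-19', '2010-09-07'),
        ('bobr',  '2010-09-08', NULL);       -- NULL end date means current
""")

validity = conn.execute("""
    SELECT e.username, e.evaldate,
           CASE WHEN EXISTS (
                    SELECT 1
                    FROM UserHistory h
                    WHERE h.username = e.username
                      AND e.evaldate BETWEEN h.startdate
                                         AND COALESCE(h.enddate, e.evaldate))
                THEN 1 ELSE 0 END AS is_valid
    FROM Evaluations e
    ORDER BY e.evaldate
""").fetchall()
print(validity)
```

EXISTS returns one row per evaluation no matter how many history ranges match, whereas a join would repeat an evaluation if two ranges for the same user overlapped.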
