Finding difference between two tables in SQL - sql

I have two tables with exactly same schema. One is a week old and other is current. Let new record be new_data and the old be old_data. Both have a column called opportunity_id which is the primary key, sales stages(1,2,3,4..etc) and sales_values. Now in a week a sales stage may change for an opportunity or sales values of an opportunity may change. Even a new Opportunity_id may be added. I need all the data that has changed.
I am trying INNER_JOIN but that only works when opportunity_ids match.I need newly added opportunities also.
I am using MS Access, so please provide only sql solution.
Thanks

What you're looking for is the EXISTS clause. It returns rows that match on criteria (or doesn't match). Your query would look something like this:
SELECT *
FROM new_data AS n
WHERE NOT EXISTS (SELECT *
FROM old_data AS o
WHERE n.opportunity_id = o.opportunity_id
AND n.sales_stages = o.sales_stages
AND n.sales_values = o.sales_values)

You should use LEFT OUTER JOIN instead (providing the table on the left is New_Data)
SELECT n.opportunity_id, n.sales_stage, n.sales_values, [o].sales_stage, [o].sales_values
FROM new_data AS n LEFT JOIN old_data AS o
ON n.opportunity_id = [o].opportunity_id
WHERE (((n.sales_stage)<>[o].[sales_stage]))
OR (((n.sales_values)<>[o].[sales_values]))
OR ((([o].opportunity_id) Is Null));

If you truly need ALL changes, there's one more case to consider.
What about deleted keys? Meaning keys that exist in old_data and not in new_data. To get these, use the same general approach (in my limited experience OUTER JOIN seems to perform better in Access):
SELECT [o].opportunity_id
FROM new_data AS n RIGHT JOIN old_data AS o
ON n.opportunity_id = [o].opportunity_id
WHERE ((([o].opportunity_id) Is Not Null))
AND ((([n].opportunity_id) Is Null));

Related

Abap subquery Where Cond [duplicate]

I have a requirement to pull records, that do not have history in an archive table. 2 Fields of 1 record need to be checked for in the archive.
In technical sense my requirement is a left join where right side is 'null' (a.k.a. an excluding join), which in abap openSQL is commonly implemented like this (for my scenario anyways):
Select * from xxxx //xxxx is a result for a multiple table join
where xxxx~key not in (select key from archive_table where [conditions] )
and xxxx~foreign_key not in (select key from archive_table where [conditions] )
Those 2 fields are also checked against 2 more tables, so that would mean a total of 6 subqueries.
Database engines that I have worked with previously usually had some methods to deal with such problems (such as excluding join or outer apply).
For this particular case I will be trying to use ABAP logic with 'for all entries', but I would still like to know if it is possible to use results of a sub-query to check more than than 1 field or use another form of excluding join logic on multiple fields using SQL (without involving application server).
I have tested quite a few variations of sub-queries in the life-cycle of the program I was making. NOT EXISTS with multiple field check (shortened example below) to exclude based on 2 keys works in certain cases.
Performance acceptable (processing time is about 5 seconds), although, it's noticeably slower than the same query when excluding based on 1 field.
Select * from xxxx //xxxx is a result for a multiple table inner joins and 1 left join ( 1-* relation )
where NOT EXISTS (
select key from archive_table
where key = xxxx~key OR key = XXXX-foreign_key
)
EDIT:
With changing requirements (for more filtering) a lot has changed, so I figured I would update this. The construct I marked as XXXX in my example contained a single left join ( where main to secondary table relation is 1-* ) and it appeared relatively fast.
This is where context becomes helpful for understanding the problem:
Initial requirement: pull all vendors, without financial records in 3
tables.
Additional requirements: also exclude based on alternative
payers (1-* relationship). This is what example above is based on.
More requirements: also exclude based on alternative payee (*-* relationship between payer and payee).
Many-to-many join exponentially increased the record count within the construct I labeled XXXX, which in turn produces a lot of unnecessary work. For instance: a single customer with 3 payers, and 3 payees produced 9 rows, with a total of 27 fields to check (3 per row), when in reality there are only 7 unique values.
At this point, moving left-joined tables from main query into sub-queries and splitting them gave significantly better performance.
than any smarter looking alternatives.
select * from lfa1 inner join lfb1
where
( lfa1~lifnr not in ( select lifnr from bsik where bsik~lifnr = lfa1~lifnr )
and lfa1~lifnr not in ( select wyt3~lifnr from wyt3 inner join t024e on wyt3~ekorg = t024e~ekorg and wyt3~lifnr <> wyt3~lifn2
inner join bsik on bsik~lifnr = wyt3~lifn2 where wyt3~lifnr = lfa1~lifnr and t024e~bukrs = lfb1~bukrs )
and lfa1~lifnr not in ( select lfza~lifnr from lfza inner join bsik on bsik~lifnr = lfza~empfk where lfza~lifnr = lfa1~lifnr )
)
and [3 more sets of sub queries like the 3 above, just checking different tables].
My Conclusion:
When exclusion is based on a single field, both not in/not exits work. One might be better than the other, depending on filters you use.
When exclusion is based on 2 or more fields and you don't have many-to-many join in main query, not exists ( select .. from table where id = a.id or id = b.id or... ) appears to be the best.
The moment your exclusion criteria implements a many-to-many relationship within your main query, I would recommend looking for an optimal way to implement multiple sub-queries instead (even having a sub-query for each key-table combination will perform better than a many-to-many join with 1 good sub-query, that looks good).
Anyways, any additional insight into this is welcome.
EDIT2: Although it's slightly off topic, given how my question was about sub-queries, I figured I would post an update. After over a year I had to revisit the solution I worked on to expand it. I learned that proper excluding join works. I just failed horribly at implementing it the first time.
select header~key
from headers left join items on headers~key = items~key
where items~key is null
if it is possible to use results of a sub-query to check more than
than 1 field or use another form of excluding join logic on multiple
fields
No, it is not possible to check two columns in subquery, as SAP Help clearly says:
The clauses in the subquery subquery_clauses must constitute a scalar
subquery.
Scalar is keyword here, i.e. it should return exactly one column.
Your subquery can have multi-column key, and such syntax is completely legit:
SELECT planetype, seatsmax
FROM saplane AS plane
WHERE seatsmax < #wa-seatsmax AND
seatsmax >= ALL ( SELECT seatsocc
FROM sflight
WHERE carrid = #wa-carrid AND
connid = #wa-connid )
however you say that these two fields should be checked against different tables
Those 2 fields are also checked against two more tables
so it's not the case for you. Your only choice seems to be multi-join.
P.S. FOR ALL ENTRIES does not support negation logic, you cannot just use some sort of NOT IN FOR ALL ENTRIES, it won't be that easy.

Complex SQL View with Joins & Where clause

My SQL skill level is pretty basic. I have certainly written some general queries and done some very generic views. But once we get into joins, I am choking to get the results that I want, in the view I am creating.
I feel like I am almost there. Just can't get the final piece
SELECT dbo.ics_supplies.supplies_id,
dbo.ics_supplies.old_itemid,
dbo.ics_supplies.itemdescription,
dbo.ics_supplies.onhand,
dbo.ics_supplies.reorderlevel,
dbo.ics_supplies.reorderamt,
dbo.ics_supplies.unitmeasure,
dbo.ics_supplies.supplylocation,
dbo.ics_supplies.invtype,
dbo.ics_supplies.discontinued,
dbo.ics_supplies.supply,
dbo.ics_transactions.requsitionnumber,
dbo.ics_transactions.openclosed,
dbo.ics_transactions.transtype,
dbo.ics_transactions.originaldate
FROM dbo.ics_supplies
LEFT OUTER JOIN dbo.ics_orders
ON dbo.ics_supplies.supplies_id = dbo.ics_orders.suppliesid
LEFT OUTER JOIN dbo.ics_transactions
ON dbo.ics_orders.requisitionnumber =
dbo.ics_transactions.requsitionnumber
WHERE ( dbo.ics_transactions.transtype = 'PO' )
When I don't include the WHERE clause, I get 17,000+ records in my view. That is not correct. It's doing this because we are matching on a 1 to many table. Supplies table is 12,000 records. There should always be 12,000 records. Never more. Never less.
The pieces that I am missing are:
I only need ONE matching record from the ICS_Transactions Table. Ideally, the one that I want is the most current 'ICS_Transactions.OriginalDate'.
I only want the ICS_Transactions Table fields to populate IF ICS_Transacions.Type = 'PO'. Otherwise, these fields should remain null.
Sample code or anything would help a lot. I have done a lot of research on joins and it's still very confusing to get what I need for results.
EDIT/Update
I feel as if I asked my question in the wrong way, or didn't give a good overall view of what I am asking. For that, I apologize. I am still very new to SQL, but trying hard.
ICS_Supplies Table has 12,810 records
ICS_Orders Table has 3,666 records
ICS_Transaction Table has 4,701 records
In short, I expect to see a result of 12,810 records. No more and no less. I am trying to create a View of ALL records from the ICS_Supplies table.
Not all records in Supply Table are in Orders and or Transaction Table. But still, I want to see all 12,810 records, regardless.
My users have requested that IF any of these supplies have an open PO (ICS_Transactions.OpenClosed = 'Open' and ICS_Transactions.InvType = 'PO') Then, I also want to see additional fields from ICS_Transactions (ICS_Transactions.OpenClosed, ICS_Transactions.InvType, ICS_Transactions.OriginalDate, ICS_Transactions.RequsitionNumber).
If there are no open PO's for supply record, then these additional fields should be blank/null (regardless to what data is in these added fields, they should display null if they don't meet the criteria).
The ICS_Orders Table is nly needed to hop from the ICS_Supplies to the ICS_Transactions (I first, need to obtain the Requisition Number from the Orders field, if there is one).
I am sorry if I am not doing a good job to explain this. Please ask if you need clarification.
Here's a simplified version of Ross Bush's answer (It removes a join from the CTE to keep things more focussed, speed things up, and cut down the code).
;WITH
ordered_ics_transactions AS
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY requisitionnumber
ORDER BY originaldate DESC
)
AS seq_id
FROM
dbo.ics_transactions
)
SELECT
s.supplies_id, s.old_itemid,
s.itemdescription, s.onhand,
s.reorderlevel, s.reorderamt,
s.unitmeasure, s.supplylocation,
s.invtype, s.discontinued,
s.supply,
t.requsitionnumber, t.openclosed,
t.transtype, t.originaldate
FROM
dbo.ics_supplies AS s
LEFT OUTER JOIN
dbo.ics_orders AS o
ON o.supplies_id = s.suppliesid
LEFT OUTER JOIN
ordered_ics_transactions AS t
ON t.requisitionnumber = o.requisitionnumber
AND t.transtype = 'PO'
AND t.seq_id = 1
This will only join the most recent transaction record for each requisitionnumber, and only if it has transtype = 'PO'
IF you want to reverse that (joining only transaction records that have transtype = 'PO', and of those only the most recent one), then move the transtype = 'PO' filter to be a WHERE clause inside the ordered_ics_transactions CTE.
You can possibly work with the query below to get what you need.
1. I only need ONE matching record from the ICS_Transactions Table. Ideally, the one that I want is the most current 'ICS_Transactions.OriginalDate'.
I would solve this by creating a CTE with all the ICS_Transaction fields needed in the query, rank-ordered by OPriginalDate, partitioned by suppliesid.
2. I only want the ICS_Transactions Table fields to populate IF ICS_Transacions.Type = 'PO'. Otherwise, these fields should remain null.
If you move the condition from the WHERE clause to the LEFT JOIN then ICS_Transactions not matching the criteria will be peeled and replaced with null values with the rest of the query records.
;
WITH ReqNumberRanked AS
(
SELECT
dbo.ICS_Orders.SuppliesID,
dbo.ICS_Transactions.RequisitionNumber,
dbo.ICS_Transactions.TransType,
dbo.ICS_Transactions.OriginalDate,
dbo.ICS_Transactions.OpenClosed,
RequisitionNumberRankReversed = RANK() OVER(PARTITION BY dbo.ICS_Orders.SuppliesID, dbo.ICS_Transactions.RequisitionNumber ORDER BY dbo.ICS_Transactions.OriginalDate DESC)
FROM
dbo.ICS_Orders
LEFT OUTER JOIN dbo.ICS_Transactions ON dbo.ICS_Orders.RequisitionNumber = dbo.ICS_Transactions.RequsitionNumber
)
SELECT
dbo.ICS_Supplies.Supplies_ID, dbo.ICS_Supplies.Old_ItemID,
dbo.ICS_Supplies.ItemDescription, dbo.ICS_Supplies.OnHand,
dbo.ICS_Supplies.ReorderLevel, dbo.ICS_Supplies.ReorderAmt,
dbo.ICS_Supplies.UnitMeasure,
dbo.ICS_Supplies.SupplyLocation, dbo.ICS_Supplies.InvType,
dbo.ICS_Supplies.Discontinued, dbo.ICS_Supplies.Supply,
ReqNumberRanked.RequsitionNumber,
ReqNumberRanked.OpenClosed,
ReqNumberRanked.TransType,
ReqNumberRanked.OriginalDate
FROM
dbo.ICS_Supplies
LEFT OUTER JOIN dbo.ICS_Orders ON dbo.ICS_Supplies.Supplies_ID = dbo.ICS_Orders.SuppliesID
LEFT OUTER JOIN ReqNumberRanked ON ReqNumberRanked.RequisitionNumber = dbo.ICS_Transactions.RequsitionNumber
AND (ReqNumberRanked.TransType = 'PO')
AND ReqNumberRanked.RequisitionNumberRankReversed = 1

Issue with joins in a SQL query

SELECT
c.ConfigurationID AS RealflowID, c.companyname,
c.companyphone, c.ContactEmail, COUNT(k.caseid)
FROM
dbo.Configuration c
INNER JOIN
dbo.cases k ON k.SiteID = c.ConfigurationId
WHERE
EXISTS (SELECT * FROM dbo.RepairEstimates
WHERE caseid = k.caseid)
AND c.AccountStatus = 'Active'
AND c.domainid = 46
GROUP BY
c.configurationid,c.companyname, c.companyphone, c.ContactEmail
I have this query - I am using the configuration table to get the siteid of the cases in the cases table. And if the case exists in the repair estimates table pull the company details listed and get a count of how many cases are in the repair estimator table for that siteid.
I hope that is clear enough of a description.
But the issue here is the count is not correct with the data that is being pulled. Is there something I could do differently? Different join? Remove the exists add another join? I am not sure I have tried many different things.
Realized I was using the wrong table. The query was correct.

Why does my left join in Access have fewer rows than the left table?

I have two tables in an MS Access 2010 database: TBLIndividuals and TblIndividualsUpdates. They have a lot of the same data, but the primary key may not be the same for a given person's record in both tables. So I'm doing a join between the two tables on names and birthdates to see which records correspond. I'm using a left join so that I also get rows for the people who are in TblIndividualsUpdates but not in TBLIndividuals. That way I know which records need to be added to TBLIndividuals to get it up to date.
SELECT TblIndividuals.PersonID AS OldID,
TblIndividualsUpdates.PersonID AS UpdateID
FROM TblIndividualsUpdates LEFT JOIN TblIndividuals
ON ( (TblIndividuals.FirstName = TblIndividualsUpdates.FirstName)
and (TblIndividuals.LastName = TblIndividualsUpdates.LastName)
AND (TblIndividuals.DateBorn = TblIndividualsUpdates.DateBorn
or (TblIndividuals.DateBorn is null
and (TblIndividuals.MidName is null and TblIndividualsUpdates.MidName is null
or TblIndividuals.MidName = TblIndividualsUpdates.MidName))));
TblIndividualsUpdates has 4149 rows, but the query returns only 4103 rows. There are about 50 new records in TblIndividualsUpdates, but only 4 rows in the query result where OldID is null.
If I export the data from Access to PostgreSQL and run the same query there, I get all 4149 rows.
Is this a bug in Access? Is there a difference between Access's left join semantics and PostgreSQL's? Is my database corrupted (Compact and Repair doesn't help)?
ON (
TblIndividuals.FirstName = TblIndividualsUpdates.FirstName
and
TblIndividuals.LastName = TblIndividualsUpdates.LastName
AND (
TblIndividuals.DateBorn = TblIndividualsUpdates.DateBorn
or
(
TblIndividuals.DateBorn is null
and
(
TblIndividuals.MidName is null
and TblIndividualsUpdates.MidName is null
or TblIndividuals.MidName = TblIndividualsUpdates.MidName
)
)
)
);
What I would do is systematically remove all the join conditions except the first two until you find the records drop off. Then you will know where your problem is.
This should never happen. Unless rows are being inserted/deleted in the meantime,
the query:
SELECT *
FROM a LEFT JOIN b
ON whatever ;
should never return less rows than:
SELECT *
FROM a ;
If it happens, it's a bug. Are you sure the queries are exactly like this (and you have't omitted some detail, like a WHERE clause)? Are you sure that the first returns 4149 rows and the second one 4103 rows? You could make another check by changing the * above to COUNT(*).
Drop any indexes from both tables which include those JOIN fields (FirstName, LastName, and DateBorn). Then see whether you get the expected
4,149 rows with this simplified query.
SELECT
i.PersonID AS OldID,
u.PersonID AS UpdateID
FROM
TblIndividualsUpdates AS u
LEFT JOIN TblIndividuals AS i
ON
(
(i.FirstName = u.FirstName)
AND (i.LastName = u.LastName)
AND (i.DateBorn = u.DateBorn)
);
For whatever it is worth, since this seems to be a deceitful bug and any additional information could help resolving it, I have had the same problem.
The query is too big to post here and I don't have the time to reduce it now to something suitable, but I can report what I found. In the below, all joins are left joins.
I was gradually refining and changing my query. It had a derived table in it (D). And the whole thing was made into a derived table (T) and then joined to a last table (L). In any case, at one point in its development, no field in T that originated in D participated in the join to L. It was then the problem occurred, the total number of rows mysteriously became less than the main table, which should be impossible. As soon as I again let a field from D participate (via T) in the join to L, the number increased to normal again.
It was as if the join condition to D was moved to a WHERE clause when no field in it was participating (via T) in the join to L. But I don't really know what the explanation is.

SQL JOIN limit results to rows where specific value does not exist

I am joining two tables using SQL. I'm joining a table which contains charter flight information and a table which contains the crew assigned. In my results, I only want to display the rows that only have a value of "Pilot" in the the crew table and not "Copilot" or both.
SELECT * FROM TABLE_A JOIN TABLE_B ON (TABLE_A.Value = TABLE_B.Value) WHERE TABLE_A.OtherValue = 'Pilot'
This is off the top of my head, so some syntax may be off. The main point is the WHERE clause. You can specify the value that you are looking for in the column (in your case you are looking for Pilot).
EDIT: To prevent a value you can do something like WHERE TABLE.VALUE != 'Copilot' != may need to be written as <> depending on the what SQL it is.
EDIT2: My SQL-Server is throwing a hissy and not connecting, so this is also entirely off the top of my head and I think it's a bit of a hack-job, but I think it'll do the job. :)
SELECT [CHARTER].*, COUNT(*) as Tally FROM [CHARTER] JOIN [CREW] ON ([CHARTER].[CHAR_TRIP] = [CREW].[CHAR_TRIP]) WHERE [CREW].[CREW_JOB] = 'PILOT' OR [CREW].[CREW_JOB] = 'COPILOT' GROUP BY [CHARTER].* HAVING Tally = 1
This assume that all flights have a pilot, but not all flights have a co-pilot. To get the exact display you want, you might have to use it as a sub-query (to remove the Tally column).
SELECT *
FROM charter ch
JOIN crew cr ON ch.char_trip = cr.char_trip
WHERE NOT EXISTS(SELECT *
FROM crew cr2
WHERE cr2.char_trip = ch.char_trip
AND cr2.crew_job != 'PILOT')
I think that should do the trick. The join to the crew table in line 3 is optional, and only if you need results from that table. The NOT EXISTS anti-join is what evaluates all crew for a given trip and checks for any that are not pilots.
You should really help us out here with the schema for us to provide you with a decent query. I think the most important thing here is how do you determine who is a pilot and/or copilot and how do you relate each person to the flight.
I think something like this might help:
SELECT * FROM Charter C
INNER JOIN Crew ON (Charter.CHAR_TRIP = Crew.CHAR_TRIP)
WHERE Crew.Crew_Job = 'PILOT' AND (SELECT COUNT(*) FROM Charter
INNER JOIN Crew ON (Charter.CHAR_TRIP = Crew.CHAR_TRIP)
WHERE Crew.Crew_Job = 'CoPilot'
AND Charter.Chart_Trip = C.ChartTrip) = 0
Although this might not be the cleanest solution.. it should do the work.