Find entryno set from multiple set of records - sql

I have two SQL temp tables #Temp1 and #Temp2.
I want to get the entryNo values in #Temp1 whose rows contain a complete set from #Temp2.
For example: #Temp2 has 8 records. I want to find the entries in #Temp1 that contain a set of records from #Temp2.
CREATE TABLE #Temp1 (entryNo INT, setid INT, measurid INT,measurvalueid int)
CREATE TABLE #Temp2(setid INT, measurid INT,measurvalueid int)
INSERT INTO #Temp1 (entryNo,setid,measurid,measurvalueid )
VALUES (1,400001,1,1),
(1,400001,2,110),
(1,400001,3,1001),
(1,400001,4,1100),
(2,400002,5,100),
(2,400002,6,102),
(2,400002,7,1003),
(2,400002,8,10004),
(3,400001,1,1),
(3,400001,2,110),
(3,400001,3,1001),
(3,400001,4,1200)
INSERT INTO #Temp2 (setid,measurid,measurvalueid )
VALUES (400001,1,1),
(400001,2,110),
(400001,3,1001),
(400001,4,1100),
(400002,5,100),
(400002,6,102),
(400002,7,1003),
(400002,8,10004)
I want output
EntryNo
1
2
It contains two sets.
One is:
(400001,1,1),
(400001,2,110),
(400001,3,1001),
(400001,4,1100)
The second is:
(400002,5,100),
(400002,6,102),
(400002,7,1003),
(400002,8,10004)

Try this:
WITH DataSourceInialData AS
(
SELECT *
,COUNT(*) OVER (PARTITION BY [entryNo], [setid]) AS [GroupCount]
FROM #Temp1
), DataSourceFilteringData AS
(
SELECT *
,COUNT(*) OVER (PARTITION BY [setid]) AS [GroupCount]
FROM #Temp2
)
SELECT A.[entryNo]
FROM DataSourceInialData A
INNER JOIN DataSourceFilteringData B
ON A.[setid] = B.[setid]
AND A.[measurid] = B.[measurid]
AND A.[measurvalueid] = B.[measurvalueid]
-- we are interested in groups which are passed completely by the filtering groups
AND A.[GroupCount] = B.[GroupCount]
GROUP BY A.[entryNo]
-- after joining, the number of matched rows must equal the number of filtering rows
HAVING COUNT(A.[setid]) = MAX(B.[GroupCount]);
The algorithm is simple:
we count how many rows exist per data group
we count how many rows exist per filtering group
we join the initial data and the filtering data
after the join we count how many rows are left in the initial data and check that this count equals the filtering count for the given group
With the sample data, the result is entryNo 1 and 2.
Note that I am checking for an exact match of each group. For example, if in your sample data there is one more row for entryNo = 1, it won't be included in the result. To change this behavior, comment out these lines:
-- we are interested in groups which are passed completely by the filtering groups
AND A.[GroupCount] = B.[GroupCount]
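For comparison, a relational-division style sketch of the relaxed check (extra rows in an entry are tolerated) could look like the query below. It is not the query from the answer above, and its semantics differ slightly: it assumes an entry qualifies as soon as it contains every #Temp2 row for at least one of its own setid values.

```sql
-- Sketch only: keep an entry when no #Temp2 row of the same setid
-- is missing from that entry's rows in #Temp1.
SELECT DISTINCT t1.entryNo
FROM #Temp1 AS t1
WHERE NOT EXISTS
(
    -- all filtering rows for this setid ...
    SELECT t2.setid, t2.measurid, t2.measurvalueid
    FROM #Temp2 AS t2
    WHERE t2.setid = t1.setid
    EXCEPT
    -- ... minus the rows this entry actually has
    SELECT x.setid, x.measurid, x.measurvalueid
    FROM #Temp1 AS x
    WHERE x.entryNo = t1.entryNo
);
```

With the sample data above this should again return entryNo 1 and 2, because entryNo 3 lacks the row (400001,4,1100).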

Find the most recently updated rows according to a multi-column grouping

I'm using SQL Server and T-SQL.
Sample Data:
I have data similar to the following readily consumable test data.
--===== Set the proper date format for the test data.
SET DATEFORMAT dmy
;
--===== Create and populate the Test Table
DROP TABLE IF EXISTS #TestTable
;
CREATE TABLE #TestTable
(
Item VARCHAR(10) NOT NULL
,GroupA TINYINT NOT NULL
,GroupB SMALLINT NOT NULL
,Updated DATE NOT NULL
,Idx INT NOT NULL
)
;
INSERT INTO #TestTable WITH (TABLOCK)
(Item,GroupA,GroupB,Updated,Idx)
VALUES ('ABC',7,2020,'14/11/2019',8) --Return this row
,('ABC',7,2020,'10/11/2019',7)
,('ABC',6,2019,'14/11/2019',6) --Return this row
,('ABC',5,2018,'13/11/2019',5) --Return this row
,('ABC',5,2018,'12/11/2019',4)
,('ABC',7,2018,'14/11/2019',3) --Return this row
,('ABC',7,2019,'25/11/2019',2) --Return this row
,('ABC',7,2019,'18/11/2019',1)
;
--===== Display the test data
SELECT * FROM #TestTable
;
Problem Description:
I need help in writing a query that will return the rows marked as "--Return this row". I know how to write a basic SELECT but have no idea how to pull this off.
The basis of the problem is to return the latest updated row for each "group" of rows. A "group" of rows is determined by the combination of the Item, GroupA, and GroupB columns and I need to return the full rows found.
Use row_number():
select t.*
from (select t.*, row_number() over (partition by item, groupa, groupb order by updated desc) as seq
from #TestTable t
) t
where seq = 1;
-- Alternative: join each row back to the latest Updated value of its group
select t.Item, t.GroupA, t.GroupB, t.Updated, t.Idx
from (select Item, GroupA, GroupB, max(Updated) as Updated
from #TestTable
group by Item, GroupA, GroupB) a
inner join #TestTable t
on (a.Item = t.Item and a.GroupA = t.GroupA and a.GroupB = t.GroupB and
a.Updated = t.Updated);
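If two rows of the same group could share the same Updated date, the row_number() approach picks one of them arbitrarily and the max() join returns both. A minimal sketch of a deterministic variant, assuming that the higher Idx should win a tie (that rule is an assumption, not something stated in the question):

```sql
-- Sketch only: TOP (1) WITH TIES keeps every row whose ROW_NUMBER() is 1,
-- i.e. one row per (Item, GroupA, GroupB) group.
-- Idx DESC is an assumed tie-breaker for rows with equal Updated dates.
SELECT TOP (1) WITH TIES t.*
FROM #TestTable AS t
ORDER BY ROW_NUMBER() OVER (PARTITION BY t.Item, t.GroupA, t.GroupB
                            ORDER BY t.Updated DESC, t.Idx DESC);
```

This form avoids the extra seq column in the output, at the cost of being less obvious to read than the derived-table version.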

Generating Lines based on a value from a column in another table

I have the following table:
EventID=00002,DocumentID=0005,EventDesc=ItemsReceived
I have the quantity in another table
DocumentID=0005,Qty=20
I want to generate a result of 20 lines (depending on the quantity) with an auto generated column which will have a sequence of:
ITEM_TAG_001,
ITEM_TAG_002,
ITEM_TAG_003,
ITEM_TAG_004,
..
ITEM_TAG_020
Here's your sql query.
with cte as (
select 1 as ctr, t2.Qty, t1.EventID, t1.DocumentId, t1.EventDesc from tableA t1
inner join tableB t2 on t2.DocumentId = t1.DocumentId
union all
select ctr + 1, Qty, EventID, DocumentId, EventDesc from cte
where ctr < Qty -- stop once ctr reaches Qty, so exactly Qty rows are produced
)select *, concat('ITEM_TAG_', right('000'+ cast(ctr AS varchar(3)),3)) from cte
option (maxrecursion 0);
Output:
The best approach is to introduce a numbers table, which is very handy in many places...
Something along these lines:
Create some test data:
DECLARE @MockNumbers TABLE(Number BIGINT);
DECLARE @YourTable1 TABLE(DocumentID INT,ItemTag VARCHAR(100),SomeText VARCHAR(100));
DECLARE @YourTable2 TABLE(DocumentID INT, Qty INT);
INSERT INTO @MockNumbers SELECT TOP 100 ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values;
INSERT INTO @YourTable1 VALUES(1,'FirstItem','qty 5'),(2,'SecondItem','qty 7');
INSERT INTO @YourTable2 VALUES(1,5), (2,7);
--The query
SELECT CONCAT(t1.ItemTag,'_',REPLACE(STR(A.Number,3),' ','0'))
FROM @YourTable1 t1
INNER JOIN @YourTable2 t2 ON t1.DocumentID=t2.DocumentID
CROSS APPLY(SELECT Number FROM @MockNumbers WHERE Number BETWEEN 1 AND t2.Qty) A;
The result
FirstItem_001
FirstItem_002
[...]
FirstItem_005
SecondItem_001
SecondItem_002
[...]
SecondItem_007
The idea in short:
We use an INNER JOIN to get the quantity joined to the item.
Now we use APPLY, which is a row-wise action, to bind as many rows to the set as we need.
The first item will return with 5 lines, the second with 7. The trick with STR() and REPLACE() is one way to create a padded number. You might use FORMAT() (v2012+), but it is rather slow...
The table @MockNumbers is a declared table variable containing a list of numbers from 1 to 100. This answer provides an example of how to create a physical numbers and date table. Any database should have such a table...
If you don't want to create a numbers table, you can search for a tally table or a tally on the fly. There are many answers showing approaches to creating a list of running numbers...
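As an illustration of a "tally on the fly", here is a minimal sketch that builds 100 running numbers from stacked CTEs and uses them to produce the ITEM_TAG_ sequence from the question. The variable @Qty and the CTE names E1, E2 and Tally are hypothetical and stand in for the quantity looked up from the second table.

```sql
DECLARE @Qty INT = 20;  -- hypothetical quantity; in practice it comes from the Qty column

WITH E1(N) AS
(   -- 10 rows
    SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS v(N)
),
E2(N) AS
(   -- 10 x 10 = 100 rows
    SELECT 1 FROM E1 AS a CROSS JOIN E1 AS b
),
Tally(Number) AS
(   -- running numbers 1..100
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E2
)
SELECT CONCAT('ITEM_TAG_', RIGHT('000' + CAST(Number AS VARCHAR(3)), 3)) AS ItemTag
FROM Tally
WHERE Number <= @Qty
ORDER BY Number;
```

The sketch is limited to 100 numbers; adding another CROSS JOIN level would extend the range, and three-digit padding only makes sense for quantities below 1000.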

Database Trigger that finds nearest record from another table after insert

I have two tables(a,b), both with a shape or geometry field. I want a trigger to run after an insert on table a to find the (single) nearest spatial record from table b. I have looked into the STDistance function with little luck. Table a is unique.
AFTER INSERT
Table a
OBJECTID,RoadID
12345,NULL
Table b
AssetID
RD12345
RD12233
RD12333
RD12222
STDistance would say Table a.OBJECTID 12345 is nearest to Table b.AssetID = RD12222
Result
Table a
OBJECTID,RoadID
12345,RD12222
I have completed some preliminary testing which returns all matching records (from both tables) but I am trying to condense it down to only the matching record with the lowest distance, hence the aggregate function(MIN) on STDistance.
SELECT TableA.AssetID,MIN(TableA.Shape.STDistance(TableB.Shape)) AS DIST, TableB.AssetID AS RoadID
FROM TableA, TableB
GROUP BY TableA.AssetID, TableB.AssetID
HAVING MIN(TableA.Shape.STDistance(TableB.Shape)) < 250
ORDER BY AssetID
The result I get is a many to many relationship by distance for all records. If I apply the aggregate function(MIN) I can reduce it significantly however the Table a unique id's still duplicate. The plan is once the select statement worked I would translate it into my trigger - I would prefer the answer to be based on how it would be implemented in a trigger.
You may use CROSS APPLY ... SELECT TOP 1.... ORDER BY distance to cross join two tables and select the nearest record:
SELECT A.OBJECTID, NearestB.B_ID, NearestB.Distance
FROM TableA A
CROSS APPLY(
select TOP 1
A.Shape.STDistance(B.Shape) AS distance,
B.AssetID as B_ID
from TableB B
order by 1
) NearestB
And the trigger might be:
CREATE TRIGGER TableA_Insertion_SetNearestB ON TableA
INSTEAD OF INSERT
AS
BEGIN
INSERT INTO TableA (
OBJECTID,
Shape,
RoadID
) SELECT
INSERTED.OBJECTID,
INSERTED.Shape,
NearestB.B_ID
FROM
INSERTED
CROSS APPLY(
select TOP 1
INSERTED.Shape.STDistance(B.Shape) AS distance,
B.AssetID as B_ID
from TableB B
order by 1
) NearestB
END
GO
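Since the question asks for an AFTER INSERT trigger, a sketch of that variant might look like the following. It assumes OBJECTID uniquely identifies a row in TableA (the question says table a is unique) and reuses the table and column names from the answer above; the trigger name is hypothetical.

```sql
-- Sketch only: let the insert happen, then fill RoadID for the new rows.
CREATE TRIGGER TableA_AfterInsert_SetNearestB ON TableA
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE A
    SET RoadID = NearestB.AssetID
    FROM TableA AS A
    INNER JOIN inserted AS I
        ON I.OBJECTID = A.OBJECTID          -- only the rows just inserted
    CROSS APPLY (
        SELECT TOP (1) B.AssetID
        FROM TableB AS B
        ORDER BY A.Shape.STDistance(B.Shape) -- nearest record first
    ) AS NearestB;
END
GO
```

Unlike the INSTEAD OF version, this one does not need to list every column of TableA, so it keeps working if columns are added later.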

Doing a join only if count is greater than one

I wonder whether the following, somewhat contrived, example is possible without using intermediary variables and a conditional clause.
Consider an intermediary query which can produce a result set that contains either no rows, one row or multiple rows. Most of the time it produces just one row, but when there are multiple rows, the resulting rows should be joined to another table to prune them down to either one or no rows. After this, if there is one row (as opposed to no rows), one would want to return multiple columns as produced by the original intermediary query.
I have in mind something like the following. It obviously won't work (multiple columns in a CASE expression, no join, etc.), but maybe it illustrates the point. What I would like is to return just what is currently in the SELECT clause when @@ROWCOUNT = 1 or, when it is greater, do an INNER JOIN to Auxilliary, which prunes x down to either one row or no rows, and then return that. I don't want to search Main more than once, and I want to search Auxilliary only when x contains more than one row.
SELECT x.MainId, x.Data1, x.Data2, x.Data3,
CASE
WHEN @@ROWCOUNT IS NOT NULL AND @@ROWCOUNT = 1 THEN
1
WHEN @@ROWCOUNT IS NOT NULL AND @@ROWCOUNT > 1 THEN
-- Use here @id or MainId to join to Auxilliary and there
-- FilteringCondition = @filteringCondition to prune x to either
-- one or zero rows.
END
FROM
(
SELECT
MainId,
Data1,
Data2,
Data3
FROM Main
WHERE
MainId = @id
) AS x;
CREATE TABLE Main
(
-- This Id may introduce more than one row, so it is joined to
-- Auxilliary for further pruning with the given conditions.
MainId INT,
Data1 NVARCHAR(MAX) NOT NULL,
Data2 NVARCHAR(MAX) NOT NULL,
Data3 NVARCHAR(MAX) NOT NULL,
AuxilliaryId INT NOT NULL
);
CREATE TABLE Auxilliary
(
AuxilliaryId INT IDENTITY(1, 1) PRIMARY KEY,
FilteringCondition NVARCHAR(1000) NOT NULL
);
Would this be possible in one query without a temporary table variable and a conditional? Without using a CTE?
Some sample data would be
INSERT INTO Auxilliary(FilteringCondition)
VALUES
(N'SomeFilteringCondition1'),
(N'SomeFilteringCondition2'),
(N'SomeFilteringCondition3');
INSERT INTO Main(MainId, Data1, Data2, Data3, AuxilliaryId)
VALUES
(1, N'SomeMainData11', N'SomeMainData12', N'SomeMainData13', 1),
(1, N'SomeMainData21', N'SomeMainData22', N'SomeMainData23', 2),
(2, N'SomeMainData31', N'SomeMainData32', N'SomeMainData33', 3);
And a sample query, which actually behaves as I'd like it to behave with the caveat I'd want to do the join only if querying Main directly with the given ID produces more than one result.
DECLARE @id AS INT = 1;
DECLARE @filteringCondition AS NVARCHAR(1000) = N'SomeFilteringCondition1';
SELECT *
FROM
Main
INNER JOIN Auxilliary AS aux ON aux.AuxilliaryId = Main.AuxilliaryId
WHERE MainId = @id AND aux.FilteringCondition = @filteringCondition;
You don't usually use a join to reduce the result set of the left table. To limit a result set you'd use the where clause instead. In combination with another table this would be WHERE [NOT] EXISTS.
So let's say this is your main query:
select * from main where main.col1 = 1;
It returns one of the following results:
no rows, then we are done
one row, then we are also done
more than one row, then we must extend the where clause
The query with the extended where clause:
select * from main where main.col1 = 1
and exists (select * from other where other.col2 = main.col3);
which returns one of the following results:
no rows, which is okay
one row, which is okay
more than one row - you say this is not possible
So the task is to do this in one step instead. I count records and look for a match in the other table for every record. Then ...
if the count is zero we get no result anyway
if it is one I take that row
if it is greater than one, I take the row for which exists a match in the other table or none when there is no match
Here is the full query:
select *
from
(
select
main.*,
count(*) over () as cnt,
case when exists (select * from other where other.col2 = main.col3) then 1 else 0 end
as other_exists
from main
where main.col1 = 1
) counted_and_checked
where cnt = 1 or other_exists = 1;
UPDATE: I understand that you want to avoid unnecessary access to the other table. This is rather difficult to do however.
In order to only use the subquery when necessary, we could move it outside:
select *
from
(
select
main.*,
count(*) over () as cnt
from main
where main.col1 = 1
) counted_and_checked
where cnt = 1 or exists (select * from other where other.col2 = counted_and_checked.col3);
This looks much better in my opinion. However, there is no guaranteed evaluation order between the two expressions left and right of an OR. So the DBMS may still execute the subselect on every record before evaluating cnt = 1.
The only operation I know of that evaluates left to right, i.e. doesn't look further once a condition on the left-hand side is matched, is COALESCE. So we could do the following:
select *
from
(
select
main.*,
count(*) over () as cnt
from main
where main.col1 = 1
) counted_and_checked
where coalesce( case when cnt = 1 then 1 else null end ,
(select count(*) from other where other.col2 = counted_and_checked.col3)
) > 0;
This may look a bit strange, but should prevent the subquery from being executed, when cnt is 1.
You may try something like
select * from Main m
where mainId=@id
and @filteringCondition = case when (select count(*) from Main m2 where m2.mainId=@id) > 1
then (select filteringCondition from Auxilliary a where a.AuxilliaryId = m.AuxilliaryId) else @filteringCondition end
but it's hardly a very fast query. I'd rather use a temp table or just an IF and two queries.
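The "temp table plus IF" route mentioned above could be sketched roughly as follows. The temp table name #x is hypothetical; the rest uses the Main and Auxilliary tables and the sample values from the question. Main is read once, and Auxilliary is touched only when more than one row came back.

```sql
DECLARE @id INT = 1;
DECLARE @filteringCondition NVARCHAR(1000) = N'SomeFilteringCondition1';

-- Read Main exactly once and keep the candidate rows.
SELECT MainId, Data1, Data2, Data3, AuxilliaryId
INTO #x
FROM Main
WHERE MainId = @id;

-- @@ROWCOUNT still holds the number of rows written by SELECT ... INTO.
IF @@ROWCOUNT > 1
    -- Only now consult Auxilliary and drop the rows that do not match.
    DELETE x
    FROM #x AS x
    WHERE NOT EXISTS (SELECT 1
                      FROM Auxilliary AS aux
                      WHERE aux.AuxilliaryId = x.AuxilliaryId
                        AND aux.FilteringCondition = @filteringCondition);

SELECT MainId, Data1, Data2, Data3
FROM #x;

DROP TABLE #x;
```

This does use an intermediate table and a conditional, which the question hoped to avoid, but it makes the "only when needed" access to Auxilliary explicit instead of relying on the optimizer.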

SQL Join on sequence number

I have 2 tables (A, B). They each have a different column that is basically an order or a sequence number. Table A has 'Sequence' and the values range from 0 to 5. Table B has 'Index' and the values are 16740, 16744, 16759, 16828, 16838, and 16990. Unfortunately I do not know the significance of these values. But I do believe they will always match in sequential order. I want to join these tables on these numbers where 0 = 16740, 1 = 16744, etc. Any ideas?
Thanks
You could use a case expression to convert table a's values to table b's values (or vice versa) and join on that:
SELECT *
FROM a
JOIN b ON a.[sequence] = CASE b.[index] WHEN 16740 THEN 0
WHEN 16744 THEN 1
WHEN 16759 THEN 2
WHEN 16828 THEN 3
WHEN 16838 THEN 4
WHEN 16990 THEN 5
ELSE NULL
END;
@Mureinik has a great example. If down the road you end up adding more numbers, putting this information into a new table might be a good idea.
CREATE TABLE C(
AInfo INT,
BInfo INT
)
INSERT INTO C(AInfo,BInfo) VALUES(0,16740)
INSERT INTO C(AInfo,BInfo) VALUES(1,16744)
etc
Then you can Join all the tables.
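A minimal sketch of that join, assuming the mapping table C has been filled for all six pairs and that the tables are named A and B as in the question (the aliases are arbitrary):

```sql
-- Sketch only: C translates A.[Sequence] into the matching B.[Index].
SELECT ta.*, tb.*
FROM A AS ta
INNER JOIN C AS m  ON m.AInfo   = ta.[Sequence]
INNER JOIN B AS tb ON tb.[Index] = m.BInfo;
```

Rows in A or B without a mapping in C drop out of the result; a LEFT JOIN would keep them if that is preferable.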
If the values are in ascending order as per your example, you can use the ROW_NUMBER() function to achieve this:
;with cte AS (SELECT *, ROW_NUMBER() OVER(ORDER BY [Index])-1 RN
FROM B)
SELECT *
FROM cte