Logic to check if exact ids (3+ records) are present in a group in SQL Server - sql

I have some sample data like:
INSERT INTO mytable
([FK_ID], [TYPE_ID])
VALUES
(10, 1),
(11, 1), (11, 2),
(12, 1), (12, 2), (12, 3),
(14, 1), (14, 2), (14, 3), (14, 4),
(15, 1), (15, 2), (15, 4)
Now, here I am trying to check if in each group by FK_ID we have exact match of TYPE_ID values for 1, 2 & 3.
So, the expected output is like:
(10, 1) this should fail
As in group FK_ID = 10 we only have one record
(11, 1), (11, 2) this should also fail
As in group FK_ID = 11 we have two records.
(12, 1), (12, 2), (12, 3) this should pass
As in group FK_ID = 12 we have three records.
And all the TYPE_ID are exactly matching 1, 2 & 3 values.
(14, 1), (14, 2), (14, 3), (14, 4) this should also fail
As we have 4 records here.
(15, 1), (15, 2), (15, 4) this should also fail
Even though we have three records, it should fail as the TYPE_ID here (1, 2, 4) are not matching with required match (1, 2, 3).
Here is my attempt:
select * from mytable t1
where exists (select COUNT(t2.TYPE_ID)
from mytable t2 where t2.FK_ID = t1.FK_ID
and t2.TYPE_ID IN (1, 2, 3)
group by t2.FK_ID having COUNT(t2.TYPE_ID) = 3);
This is not working as expected, because it also passes for FK_ID = 14, which has four records.
Demo: SQL Fiddle
Also, how can we make it generic, so that if we need to check for 4 or more TYPE_ID values like (1,2,3,4) or (1,2,3,4,5), we can do that easily by updating a few values?

The following query will do what you want:
select fk_id
from t
group by fk_id
having sum(case when type_id in (1, 2, 3) then 1 else 0 end) = 3 and
sum(case when type_id not in (1, 2, 3) then 1 else 0 end) = 0;
This assumes that you have no duplicate pairs. Depending on how you want to handle duplicates, the fix might be as easy as selecting from (select distinct * from t) t instead of t.
As for "genericness", you need to update the two in lists and the 3.
If you want something more generic, keep the values in one place and join to them (SQL Server does not allow a subquery inside an aggregate function, so the list is joined rather than tested with in):
with vals as (
select id
from (values (1), (2), (3)) v(id)
)
select t.fk_id
from t left join
vals v
on v.id = t.type_id
group by t.fk_id
having count(v.id) = (select count(*) from vals) and
count(*) = count(v.id);
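The exact-match check can be tried against the question's sample data. The following is a minimal sketch, not the answerer's code: it assumes SQLite (through Python's sqlite3 module) as a stand-in for SQL Server, and the function name exact_match_groups and the helper table vals are made up for illustration. It joins each row to the list of wanted values and keeps the groups where every row matched and the number of matches equals the size of the list.

```python
import sqlite3

# Sample rows (FK_ID, TYPE_ID) from the question.
ROWS = [(10, 1), (11, 1), (11, 2), (12, 1), (12, 2), (12, 3),
        (14, 1), (14, 2), (14, 3), (14, 4), (15, 1), (15, 2), (15, 4)]

def exact_match_groups(rows, wanted):
    """Return the FK_IDs whose TYPE_IDs are exactly the values in `wanted`.

    Assumes there are no duplicate (FK_ID, TYPE_ID) pairs.
    """
    con = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server
    con.execute("CREATE TABLE t (fk_id INTEGER, type_id INTEGER)")
    con.executemany("INSERT INTO t VALUES (?, ?)", rows)
    con.execute("CREATE TABLE vals (id INTEGER PRIMARY KEY)")
    con.executemany("INSERT INTO vals VALUES (?)", [(v,) for v in set(wanted)])
    query = """
        SELECT t.fk_id
        FROM t LEFT JOIN vals v ON v.id = t.type_id
        GROUP BY t.fk_id
        HAVING COUNT(v.id) = (SELECT COUNT(*) FROM vals)  -- as many matches as wanted values
           AND COUNT(*) = COUNT(v.id)                     -- and no row outside the list
        ORDER BY t.fk_id
    """
    result = [row[0] for row in con.execute(query)]
    con.close()
    return result

print(exact_match_groups(ROWS, [1, 2, 3]))     # [12]
print(exact_match_groups(ROWS, [1, 2, 3, 4]))  # [14]
```

Changing the list of wanted values is the only edit needed to check a different set.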

You can use this code:
SELECT y.fk_id FROM
(SELECT x.fk_id, COUNT(x.type_id) AS count, SUM(x.type_id) AS sum
FROM mytable x GROUP BY (x.fk_id)) AS y
WHERE y.count = 3 AND y.sum = 6
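To make it generic, compare y.count with N and y.sum with N*(N+1)/2, where N is the largest value you are looking for (1, 2, ..., N). This only works for the consecutive values 1 to N, and it assumes the TYPE_ID values within a group are distinct positive integers.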
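The count-and-sum check relies on the fact that N distinct positive integers sum to N*(N+1)/2 only when they are exactly 1 to N. The sketch below illustrates why the "distinct" part matters. It assumes SQLite via Python's sqlite3 module instead of SQL Server; the function name sum_check and the group with FK_ID = 20 are hypothetical, added only to show a false positive.

```python
import sqlite3

# Group 12 is from the question; group 20 is a made-up group with duplicates.
ROWS = [(12, 1), (12, 2), (12, 3), (20, 2), (20, 2), (20, 2)]

def sum_check(rows, n, require_distinct=False):
    """Return FK_IDs with n rows whose TYPE_IDs sum to n*(n+1)/2."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server
    con.execute("CREATE TABLE mytable (fk_id INTEGER, type_id INTEGER)")
    con.executemany("INSERT INTO mytable VALUES (?, ?)", rows)
    query = """
        SELECT fk_id
        FROM mytable
        GROUP BY fk_id
        HAVING COUNT(type_id) = :n
           AND SUM(type_id) = :total
           AND (:strict = 0 OR COUNT(DISTINCT type_id) = :n)
        ORDER BY fk_id
    """
    params = {"n": n, "total": n * (n + 1) // 2, "strict": int(require_distinct)}
    result = [row[0] for row in con.execute(query, params)]
    con.close()
    return result

print(sum_check(ROWS, 3))                         # [12, 20]: (2, 2, 2) also sums to 6
print(sum_check(ROWS, 3, require_distinct=True))  # [12]
```

Adding COUNT(DISTINCT ...) to the condition closes the gap for duplicate rows; zero or negative TYPE_ID values would still need separate handling.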

You can try this query. COUNT with DISTINCT is used so that duplicate records are not counted twice.
SELECT
[FK_ID]
FROM
#mytable T
GROUP BY
[FK_ID]
HAVING
COUNT(DISTINCT CASE WHEN [TYPE_ID] IN (1,2,3) THEN [TYPE_ID] END) = 3
AND COUNT(CASE WHEN [TYPE_ID] NOT IN (1,2,3) THEN [TYPE_ID] END) = 0

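Try this (note that it only checks that 1, 2 and 3 are all present, so a group with extra values, such as FK_ID = 14, also passes):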
select FK_ID,count(distinct TYPE_ID) from mytable
where TYPE_ID<=3
group by FK_ID
having count(distinct TYPE_ID)=3

You can use CTEs and pass in the required count, as mentioned in the question.
WITH CTE
AS (
SELECT FK_ID,
COUNT(*) CNT
FROM #mytable
GROUP BY FK_ID
HAVING COUNT(*) = 3), -- pass the required number of records here
CTE1
AS (
SELECT T.[ID],
T.[FK_ID],
T.[TYPE_ID],
ROW_NUMBER() OVER(PARTITION BY T.[FK_ID] ORDER BY
(
SELECT NULL
)) RN
FROM #mytable T
INNER JOIN CTE C ON C.FK_ID = T.FK_ID),
CTE2
AS (
SELECT C1.FK_ID
FROM CTE1 C1
GROUP BY C1.FK_ID
HAVING SUM(C1.TYPE_ID) = SUM(C1.RN))
SELECT TT1.*
FROM CTE2 C2
INNER JOIN #mytable TT1 ON TT1.FK_ID = C2.FK_ID;
With 3 passed as the value, the query above produces:
ID FK_ID TYPE_ID
4 12 1
5 12 2
6 12 3

Related

SQL - Getting Sum of 'X' Consecutive Values where X is an Integer in another Row (With Categories)

Say for example, I wanted to SUM all the values from the current row until the provided count. See table below:
For example:
Category A, Row 1: 10+15+25 = 50 (because it adds Rows 1 to 3 due to Count)
Category A, Row 2: 15+25+30+40 = 110 (because it adds Rows 2 to 5 due to count)
Category A, Row 5: 40+60 = 100 (because it adds Rows 5 and 6: the count is 5, but the category ends at Row 6, so it sums all available data, which is Rows 5 and 6 only, giving 100).
Same goes for Category B.
How do I do this?
You can do this using window functions:
with tt as (
select t.*,
sum(quantity) over (partition by category order by rownumber) as running_quantity,
max(rownumber) over (partition by category) as max_rownumber
from t
)
select tt.*,
coalesce(tt2.running_quantity, ttlast.running_quantity) - tt.running_quantity + tt.quantity
from tt left join
tt tt2
on tt2.category = tt.category and
tt2.rownumber = tt.rownumber + tt.count - 1 left join
tt ttlast
on ttlast.category = tt.category and
ttlast.rownumber = ttlast.max_rownumber
order by category, rownumber;
I can imagine that under some circumstances this would be much faster -- particularly if the count values are relatively large. For small values of count, the lateral join is probably faster, but it is worth checking if performance is important.
Actually, a pure window functions approach is probably the best approach:
with tt as (
select t.*,
sum(quantity) over (partition by category order by rownumber) as running_quantity
from t
)
select tt.*,
(coalesce(lead(tt.running_quantity, tt.count - 1) over (partition by tt.category order by tt.rownumber),
first_value(tt.running_quantity) over (partition by tt.category order by tt.rownumber desc)
) - tt.running_quantity + tt.quantity
)
from tt
order by category, rownumber;
Here is a db<>fiddle.
Try this:
DECLARE @DataSource TABLE
(
[Category] CHAR(1)
,[Row Number] BIGINT
,[Quantity] INT
,[Count] INT
);
INSERT INTO @DataSource ([Category], [Row Number], [Quantity], [Count])
VALUES ('A', 1, 10, 3)
,('A', 2, 15, 4)
,('A', 3, 25, 2)
,('A', 4, 30, 1)
,('A', 5, 40, 5)
,('A', 6, 60, 2)
--
,('B', 1, 12, 2)
,('B', 2, 13, 3)
,('B', 3, 17, 1)
,('B', 4, 11, 2)
,('B', 5, 10, 5)
,('B', 6, 7, 3);
-- For each row, sum the quantities from the current row up to [Row Number] + [Count] - 1 within the same category
SELECT *
FROM @DataSource E
CROSS APPLY
(
SELECT SUM(I.[Quantity])
FROM @DataSource I
WHERE I.[Row Number] <= E.[Row Number] + E.[Count] - 1
AND I.[Row Number] >= E.[Row Number]
AND E.[Category] = I.[Category]
) DS ([Sum]);
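To check the expected totals from the question, the same range condition can be run as a correlated subquery. This is a minimal sketch assuming SQLite via Python's sqlite3 module rather than SQL Server (SQLite has no CROSS APPLY); the table name data and the column names row_num and cnt are hypothetical stand-ins for [Row Number] and [Count].

```python
import sqlite3

# (category, row_num, quantity, cnt) rows taken from the answer's sample data.
ROWS = [("A", 1, 10, 3), ("A", 2, 15, 4), ("A", 3, 25, 2),
        ("A", 4, 30, 1), ("A", 5, 40, 5), ("A", 6, 60, 2),
        ("B", 1, 12, 2), ("B", 2, 13, 3), ("B", 3, 17, 1),
        ("B", 4, 11, 2), ("B", 5, 10, 5), ("B", 6, 7, 3)]

def windowed_sums(rows):
    """Return (category, row_num, total) where total sums `cnt` rows from the current one."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server
    con.execute(
        "CREATE TABLE data (category TEXT, row_num INTEGER, quantity INTEGER, cnt INTEGER)"
    )
    con.executemany("INSERT INTO data VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT e.category, e.row_num,
               (SELECT SUM(i.quantity)
                FROM data i
                WHERE i.category = e.category
                  AND i.row_num BETWEEN e.row_num AND e.row_num + e.cnt - 1) AS total
        FROM data e
        ORDER BY e.category, e.row_num
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

for category, row_num, total in windowed_sums(ROWS):
    print(category, row_num, total)
```

Rows past the end of a category simply do not exist, so the sum stops at the last available row without any special handling.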

Group elements of a column into multiple subgroups SQL

I am looking at different breeds of cattle and their AnimalTypeCode, BreedCategoryID and resultant Growth.
I have the following query
SELECT DATEPART(yyyy,[KillDate])
,[AnimalTypeCode]
,AVG([Growth])
,[BreedCategoryID]
FROM [dbo].[tblAnimal]
WHERE (AnimalTypeCode='C'
or AnimalTypeCode= 'E')
GROUP BY DATEPART(yyyy,[KillDate])
,[AnimalTypeCode]
,[BreedCategoryID]
GO
This query is good and gives me almost what I want, but BreedCategoryID is numbered 1 through 7 and I would like to group them:
(1 = Pure Dairy),
(2 and 3 = Dairy)
(4, 5, 6 and 7 = Beef)
So instead of getting the mean Growthrate for each BreedCategoryID I would like to get the average for Pure Dairy, Dairy, and Beef.
Any help greatly appreciated!
You can assign a new "variable" using cross apply in the from clause:
SELECT YEAR(a.KillDate) AS KillYear, a.AnimalTypeCode, v.grp,
AVG(a.Growth) AS AvgGrowth
FROM [dbo].[tblAnimal] a CROSS APPLY
(VALUES (CASE WHEN a.BreedCategoryID IN (1) THEN 'Pure Dairy'
WHEN a.BreedCategoryID IN (2, 3) THEN 'Dairy'
WHEN a.BreedCategoryID IN (4, 5, 6, 7) THEN 'Beef'
END)
) as v(grp)
WHERE a.AnimalTypeCode IN ('C', 'E')
GROUP BY YEAR(a.KillDate), a.AnimalTypeCode, v.grp;
Note that I also introduced table aliases and qualified all the column references.
Do the calculations in a derived table (the subquery). GROUP BY its result:
select killyear, [AnimalTypeCode], AVG([Growth]), BreedCat
from
(
SELECT DATEPART(yyyy,[KillDate]) killyear
,[AnimalTypeCode]
,[Growth]
,case when [BreedCategoryID] = 1 then 'Pure Dairy'
when [BreedCategoryID] in (2, 3) then 'Dairy'
when [BreedCategoryID] in (4, 5, 6, 7) then 'Beef'
end BreedCat
FROM [dbo].[tblAnimal]
WHERE (AnimalTypeCode='C'
or AnimalTypeCode= 'E')
) dt
GROUP BY killyear
,[AnimalTypeCode]
,BreedCat
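The derived-table approach can be sketched end to end on a few rows. This example assumes SQLite via Python's sqlite3 module in place of SQL Server; the table name animals, the integer kill_year column (used instead of DATEPART on a date) and every data row are invented for illustration.

```python
import sqlite3

# Made-up rows: (kill_year, animal_type_code, breed_category_id, growth).
ROWS = [(2020, "C", 1, 1.0), (2020, "C", 2, 2.0), (2020, "C", 3, 4.0),
        (2020, "C", 4, 5.0), (2020, "C", 7, 7.0), (2020, "X", 5, 9.0)]

def growth_by_breed_group(rows):
    """Average growth per year, animal type and breed group."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server
    con.execute(
        "CREATE TABLE animals (kill_year INTEGER, animal_type_code TEXT,"
        " breed_category_id INTEGER, growth REAL)"
    )
    con.executemany("INSERT INTO animals VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT kill_year, animal_type_code, breed_group, AVG(growth)
        FROM (
            -- Map the seven category ids onto three groups before aggregating.
            SELECT kill_year, animal_type_code, growth,
                   CASE WHEN breed_category_id = 1 THEN 'Pure Dairy'
                        WHEN breed_category_id IN (2, 3) THEN 'Dairy'
                        WHEN breed_category_id IN (4, 5, 6, 7) THEN 'Beef'
                   END AS breed_group
            FROM animals
            WHERE animal_type_code IN ('C', 'E')
        ) AS dt
        GROUP BY kill_year, animal_type_code, breed_group
        ORDER BY kill_year, animal_type_code, breed_group
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

for row in growth_by_breed_group(ROWS):
    print(row)
```

Computing the group label in the inner query means the outer GROUP BY can refer to it by name instead of repeating the CASE expression.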

Best SQL query to retrieve the data which has all required data

I have a transaction table with item details for each company. I want to write a query to retrieve the companies only having item numbers 1,2 and 3 (according to my sample code in below). Selected companies should have all 1,2,3 items. If some company has only item 1, then it shouldn't come. How can I write this?
CREATE TABLE #TmpTran
(
ID BIGINT IDENTITY,
COMPANY_ID BIGINT,
ITEM_NAME VARCHAR(50),
ITEM_NUMBER INT
)
INSERT INTO #TmpTran (COMPANY_ID, ITEM_NAME, ITEM_NUMBER)
VALUES (1, 'ABC', 1), (1, 'DEF', 2), (1, 'HIJ', 3),
(2, 'KLM', 4), (2, 'KLM', 5), (2, 'ABC', 1)
How can I get only Company 1 data using WHERE or JOIN query?
You can do this with group by and having:
select company_id
from #tmptran tt
where item_number in (1, 2, 3)
group by company_id
having count(distinct item_number) = 3;
Another way (a more flexible approach):
select company_id
from #tmptran tt
group by company_id
having count(case when item_number = 1 then 1 end) > 0
and count(case when item_number = 2 then 1 end) > 0
and count(case when item_number = 3 then 1 end) > 0;
Alternatively, with one flag per required item:
select tt.company_id
from #tmptran tt
where tt.item_number in (1, 2, 3)
group by tt.company_id
having max(case tt.item_number when 1 then 1 end) +
max(case tt.item_number when 2 then 1 end) +
max(case tt.item_number when 3 then 1 end) = 3
You said you have a lot of fields. Probably the easiest for the reader to follow would be something like:
select distinct tt.company_id
from #tmptran tt
where tt.item_number in (1, 2, 3)
and exists(select 1
from #tmptran ttSub
where ttSub.company_id = tt.company_id and ttSub.item_number = 1)
and exists(select 1
from #tmptran ttSub
where ttSub.company_id = tt.company_id and ttSub.item_number = 2)
and exists(select 1
from #tmptran ttSub
where ttSub.company_id = tt.company_id and ttSub.item_number = 3)
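The group by / having variant is the easiest one to parameterize. Below is a minimal sketch, assuming SQLite via Python's sqlite3 module as a stand-in for SQL Server; the table name tmp_tran and the function name companies_with_all are hypothetical. It filters to the wanted item numbers and keeps the companies that have all of them.

```python
import sqlite3

# (company_id, item_name, item_number) rows from the question.
ROWS = [(1, "ABC", 1), (1, "DEF", 2), (1, "HIJ", 3),
        (2, "KLM", 4), (2, "KLM", 5), (2, "ABC", 1)]

def companies_with_all(rows, wanted):
    """Return company ids that have every item number in `wanted`."""
    wanted = sorted(set(wanted))
    con = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server
    con.execute(
        "CREATE TABLE tmp_tran (company_id INTEGER, item_name TEXT, item_number INTEGER)"
    )
    con.executemany("INSERT INTO tmp_tran VALUES (?, ?, ?)", rows)
    placeholders = ", ".join("?" for _ in wanted)  # one bind parameter per wanted item
    query = f"""
        SELECT company_id
        FROM tmp_tran
        WHERE item_number IN ({placeholders})
        GROUP BY company_id
        HAVING COUNT(DISTINCT item_number) = ?
        ORDER BY company_id
    """
    result = [row[0] for row in con.execute(query, wanted + [len(wanted)])]
    con.close()
    return result

print(companies_with_all(ROWS, [1, 2, 3]))  # [1]
```

Unlike the exact-match questions elsewhere on this page, this check allows a company to have extra items beyond the wanted ones.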

Logic to check if exact ids are present in a group in SQL Server

I have some sample data like:
INSERT INTO mytable ([ID], [FK_ID], [TYPE_ID])
VALUES
(1, 10, 1),
(2, 11, 1), (3, 11, 2),
(4, 12, 1), (5, 12, 2), (6, 12, 3),
(7, 14, 2), (8, 14, 3)
Now, here I am trying to check if in each group by FK_ID we have exact match of TYPE_ID values 1 & 2.
So, the expected output is like:
(1, 10, 1) this should fail
As in group FK_ID = 10 we only have one record
(2, 11, 1), (3, 11, 2) this should pass
As in group FK_ID = 11 we have two records.
And both the TYPE_ID are matching 1 & 2 values.
(4, 12, 1), (5, 12, 2), (6, 12, 3) this should also fail
As we have 3 records here.
(7, 14, 2), (8, 14, 3) this should also fail
Even though we have exact two records, it should fail as the TYPE_ID here are not matching with 1 & 2 values.
Here is my attempt:
select *
from mytable t1
where exists (select count(t2.TYPE_ID)
from mytable t2
where t2.FK_ID = t1.FK_ID
and t2.TYPE_ID in (1, 2)
group by t2.FK_ID
having count(t2.TYPE_ID) = 2);
This is not working as expected, because it also passes for FK_ID = 12, which has three records.
Demo: SQL Fiddle
There are probably several different ways of doing this. One could be:
SELECT FK_ID
FROM mytable
GROUP BY FK_ID
HAVING COUNT(*) = 2
AND MIN(TYPE_ID) = 1
AND MAX(TYPE_ID) = 2
We can add min and max to the group by query
select t1.* from mytable t1,
( select fk_id, count(*) As cnt from mytable
Group by fk_id
Having count(*) = 2
AND max(type_id)=2
ANd min(Type_id) = 1) As t2
Where t1.fk_id = t2.fk_id
Another way, but less optimal than Nenad's, is to use SELECT INTO (with output to temporary table) and then with another query SELECT only these rows that have proper TYPE_ID values.
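The MIN/MAX check from the answers above extends to any run of consecutive integers. Here is a minimal sketch on the question's sample data, assuming SQLite via Python's sqlite3 module instead of SQL Server; the function name exact_range is made up, and the extra COUNT(DISTINCT ...) condition is an addition that guards against duplicate TYPE_ID values, which the original answers do not consider.

```python
import sqlite3

# (ID, FK_ID, TYPE_ID) rows from the question.
ROWS = [(1, 10, 1), (2, 11, 1), (3, 11, 2),
        (4, 12, 1), (5, 12, 2), (6, 12, 3),
        (7, 14, 2), (8, 14, 3)]

def exact_range(rows, lo, hi):
    """Return FK_IDs whose TYPE_IDs are exactly the integers lo..hi."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server
    con.execute("CREATE TABLE mytable (id INTEGER, fk_id INTEGER, type_id INTEGER)")
    con.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    query = """
        SELECT fk_id
        FROM mytable
        GROUP BY fk_id
        HAVING COUNT(*) = :n
           AND COUNT(DISTINCT type_id) = :n  -- added guard against duplicates
           AND MIN(type_id) = :lo
           AND MAX(type_id) = :hi
        ORDER BY fk_id
    """
    params = {"n": hi - lo + 1, "lo": lo, "hi": hi}
    result = [row[0] for row in con.execute(query, params)]
    con.close()
    return result

print(exact_range(ROWS, 1, 2))  # [11]
```

With n distinct integers between lo and hi, and n equal to hi - lo + 1, every value in the range must be present, which is why three cheap aggregates are enough here. The approach does not apply to non-consecutive sets such as (1, 2, 4).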

Delete rows in table that are sum of other rows per group

Group the rows by T. In each group, find the row whose value is the sum of the other rows in that group (this is the largest value, or the smallest if the values are negative) and delete that row, one per group. If a group does not have enough rows to form a sum, or none of its rows equals the sum of the others, nothing should happen.
CREATE TABLE Test (
T varchar(10),
V int
);
INSERT INTO Test
VALUES ('A', 4),
('B', -5),
('C', 5),
('A', 2),
('B', -1),
('C', 10),
('A', 2),
('B', -4),
('C', 5),
('D', 0);
expected result:
A 2
A 2
B -1
B -4
C 5
C 5
D 0
As the comments note, the requirements seem strange. The code below assumes that the sum rows are already present in the data and merely removes the largest/smallest value per group, as long as the highest absolute value is not 0.
if object_id('tempdb..#test') is not null
drop table #test
CREATE TABLE #Test (
T varchar(10),
V int
);
INSERT INTO #Test
VALUES ('A', 4), ('B', -5), ('C', 5), ('A', 2), ('B', -1), ('C', 10), ('A', 2), ('B', -4), ('C', 5), ('D', 0);
if object_id('tempdb..#test2') is not null
drop table #test2
SELECT
T,
V,
ABS(V) as absV
INTO #TEST2
FROM #TEST
SELECT * FROM #TEST2
if object_id('tempdb..#max') is not null
drop table #max
SELECT
T,
MAX(absV) AS MaxAbsV
INTO #Max
FROM #TEST2
GROUP BY T
HAVING MAX(AbsV) != 0
DELETE #TEST2
FROM #TEST2
INNER JOIN #MAX ON #TEST2.T = #MAX.T AND #TEST2.absV = #Max.MaxAbsV
SELECT * FROM #TEST2
ORDER BY T ASC
; with cte as
(
select T, V,
R = row_number() over (partition by T order by ABS(V) desc),
C = count(*) over (partition by T)
from Test
)
delete c
from cte c
inner join
(
select T, S = sum(V)
from cte
where R <> 1
group by T
) s on c.T = s.T
where c.C >= 3
and c.R = 1
and c.V = s.S
Using ABS and NOT Exists
DECLARE @Test TABLE (
T varchar(10),
V int
);
INSERT INTO @Test
VALUES ('A', 4), ('B', -5), ('C', 5), ('A', 2), ('B', -1), ('C', 10), ('A', 2), ('B', -4), ('C', 5), ('D', 0);
;WITH CTE as (
select T,max(ABS(v ))v from @Test
WHERE V <> 0
GROUP BY T )
SELECT T,V FROM @Test T where NOT exists (Select 1 FROM cte WHERE T = T.T AND v = ABS(T.V) )
ORDER BY T.T
Determine first if the rows are positive or negative by checking if SUM(V) is positive. And then determine if the smallest or largest value is equal to the SUM of the other rows, by subtracting from SUM(V) the MIN(V) if negative or MAX(V) if positive:
DELETE t
FROM Test t
INNER JOIN (
SELECT
T,
SUM(V) - CASE WHEN SUM(V) >= 0 THEN MAX(V) ELSE MIN(V) END AS ToDelete
FROM Test
GROUP BY T
HAVING COUNT(*) >= 3
) a
ON a.T = t.T
AND a.ToDelete = t.V
ONLINE DEMO
You can use the below query to get the required output :-
select * into #t1 from test
select * from
(
select TT.T as T,TT.V as V
from test TT
JOIN
(select T,max(abs(V)) as V from #t1
group by T) P
on TT.T=P.T
where abs(TT.V) <> P.V
UNION ALL
select A.T as T,A.V as V from test A
JOIN(
select T,count(T) as Tcount from test
group by T
having count(T)=1) B on A.T=B.T
) X order by T
drop table #t1
You are looking for a value per group that is the sum of all the group's other values. E.g. 4 of (2,2,4) or -5 of (-5,-4,-1).
This is usually only one record per group. But it can be multiple times the same number. Here are examples for ties: (0,0) or (-2,2,4,4), or (-2,-2,4,4,4) or (-10,3,3,3,3,4).
As you see, you are looking in any way for values that equal half of the group's total sum. (Of course. We are looking for n+n, where one n is in one record and the other n is the sum of all the other records.)
The only special case is when there is only one value in the group which is zero. That we don't want to delete of course.
Here is a delete statement that cannot deal with ties: when several rows qualify, it deletes all of them instead of just one:
delete from test
where 2 * v =
(
select case when count(*) = 1 then null else sum(v) end
from test fullgroup
where fullgroup.t = test.t
);
In order to deal with ties you would need artificial row numbers, so as to delete only one record of all candidates.
with candidates as
(
select t, v, row_number() over (partition by t order by t) as rn
from
(
select
t, v,
sum(v) over (partition by t) as sumv,
count(*) over (partition by t) as cnt
from test
) comparables
where sumv = 2 * v and cnt > 1
)
delete
from candidates
where rn = 1;
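The "value equals half of the group's sum" idea can be tried on the question's data. The sketch below assumes SQLite via Python's sqlite3 module (a build with window functions, SQLite 3.25 or later) rather than SQL Server. SQLite cannot delete through a CTE, so this version picks the rows by their implicit rowid instead, which is a substitution and not the answer's exact statement; the function name is hypothetical.

```python
import sqlite3

# (T, V) rows from the question.
ROWS = [("A", 4), ("B", -5), ("C", 5), ("A", 2), ("B", -1),
        ("C", 10), ("A", 2), ("B", -4), ("C", 5), ("D", 0)]

def delete_sum_rows(rows):
    """Delete one row per group whose value equals the sum of the group's other rows."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server
    con.execute("CREATE TABLE test (t TEXT, v INTEGER)")
    con.executemany("INSERT INTO test VALUES (?, ?)", rows)
    con.execute("""
        DELETE FROM test
        WHERE rowid IN (
            SELECT rid
            FROM (
                -- Number the candidates so that ties lose only one row.
                SELECT rid, ROW_NUMBER() OVER (PARTITION BY t ORDER BY rid) AS rn
                FROM (
                    SELECT rowid AS rid, t, v,
                           SUM(v) OVER (PARTITION BY t) AS sumv,
                           COUNT(*) OVER (PARTITION BY t) AS cnt
                    FROM test
                )
                WHERE sumv = 2 * v AND cnt > 1
            )
            WHERE rn = 1
        )
    """)
    result = con.execute("SELECT t, v FROM test ORDER BY t, v").fetchall()
    con.close()
    return result

for row in delete_sum_rows(ROWS):
    print(row)
```

The cnt > 1 condition protects single-row groups such as ('D', 0), where the value trivially equals twice itself.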
SQL fiddle: http://sqlfiddle.com/#!6/6d97e/1
See if below query helps:
DELETE [Audit].[dbo].[Test] FROM [Audit].[dbo].[Test] as AA
INNER JOIN (select T,
CASE
WHEN MAX(V) < 0 THEN MIN(V)
WHEN MIN(V) > 0 THEN MAX(V) ELSE MAX(V)
END as MAX_V,
CASE
WHEN SUM(V) > 0 THEN SUM(V) - MAX(V)
WHEN SUM(V) < 0 THEN SUM(V) - MIN(V) ELSE SUM(V)
END as SUM_V_REST
from [Audit].[dbo].[Test]
Group by T
Having Count(V) > 1) as BB ON AA.T = BB.T and AA.V = BB.MAX_V