SQL-Query - finding pattern of another table - sql

I have a table with colors:
COLORS
idColor Name
------- ------
4 Yellow
5 Green
6 Red
And I have another table with data:
PRODUCTS
idProduct idCategory idColor
--------- ---------- -------
1 1 4
2 1 5
3 1 6
4 2 10
5 2 11
6 2 12
7 3 4
8 3 5
9 3 8
10 4 4
11 4 5
12 4 6
13 5 4
14 6 4
15 6 5
I just want to return rows from Products when the idColor values from table Colors (4, 5, 6) are present in the second table and the idCategory has exactly 3 elements, with exactly the idColor values 4, 5, 6.
For this example, the query should return:
IdCategory
----------
1
4

Try this:
SELECT idCategory
FROM PRODUCTS
GROUP BY idCategory
HAVING COUNT(*) = 3
AND COUNT(DISTINCT CASE WHEN idColor IN (4,5,6) THEN idColor END) = 3
Here is a demo for you to try.
UPDATED
If you want to filter the results dynamically, based on the values in the COLORS table:
SELECT idCategory
FROM PRODUCTS P
LEFT JOIN (SELECT idColor, COUNT(*) OVER() TotalColors
FROM COLORS) C
ON P.idColor = C.idColor
GROUP BY idCategory
HAVING COUNT(*) = MIN(C.TotalColors)
AND COUNT(DISTINCT C.idColor) = MIN(C.TotalColors)
Here is a fiddle with this example.

You can use aggregates to make sure it has all 3 colors, and also to make sure it DOESN'T have any other colors. Something like this:
SELECT *
FROM
(
SELECT idCategory
, SUM(CASE WHEN idColor IN (4, 5, 6) THEN 1 ELSE 0 END) AS GoodColors
, SUM(CASE WHEN idColor NOT IN (4, 5, 6) THEN 1 ELSE 0 END) AS BadColors
FROM Products
GROUP BY idCategory
) t0
WHERE GoodColors = 3 AND BadColors = 0
Note: if any of the colors 4, 5, 6 can appear more than once per idCategory, a different technique must be employed. From your example, that doesn't appear to be the case.
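If duplicates are possible, one option is to count distinct matching colors instead of rows. The following is only a sketch under assumptions: the table variable @Products and its sample rows (categories 7 and 8) are made up for illustration, and it assumes that a category repeating one of the wanted colors should still qualify.

```sql
-- Hypothetical sample data: category 7 repeats color 4, category 8 has an extra color (9).
DECLARE @Products TABLE (idProduct int, idCategory int, idColor int);
INSERT INTO @Products (idProduct, idCategory, idColor)
VALUES (1, 1, 4), (2, 1, 5), (3, 1, 6),
       (4, 7, 4), (5, 7, 4), (6, 7, 5), (7, 7, 6),
       (8, 8, 4), (9, 8, 5), (10, 8, 6), (11, 8, 9);

SELECT idCategory
FROM @Products
GROUP BY idCategory
HAVING COUNT(DISTINCT CASE WHEN idColor IN (4, 5, 6) THEN idColor END) = 3  -- all three wanted colors are present
   AND SUM(CASE WHEN idColor NOT IN (4, 5, 6) THEN 1 ELSE 0 END) = 0;       -- and no other color appears
-- Returns categories 1 and 7; category 8 is excluded because of color 9.
```

If a repeated color should disqualify the category instead, adding COUNT(*) = 3 to the HAVING clause would enforce that.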

I am guessing that you would like to perform this task based on data in a table, rather than hardcoding the values 4, 5, and 6 (like in some of the answers given). To that end, in my solution I created a dbo.ColorSets table that you can fill with as many different sets of colors as you want, then run the query and see all the product Categories that match those color Sets. The reason I didn't just use your dbo.Color table is that it appeared to be the lookup table, complete with color names, so it didn't seem like the right one to be picking out a particular set of colors rather than the entire list possible.
I used a technique that will maintain good performance even on huge amounts of data, as compared to other query methods that use aggregates exclusively. No matter what method one uses, this task will pretty much always require a scan of the entire Products table because you can't compare all the rows without, well, comparing all the rows. But the JOIN is on indexable columns and is only for the candidates that have a very good chance of being proper matches, so the amount of work required is greatly reduced.
Here's what the ColorSets table looks like:
CREATE TABLE dbo.ColorSets (
idSet int NOT NULL,
idColor int NOT NULL,
CONSTRAINT PK_ColorSet PRIMARY KEY CLUSTERED (idSet, idColor)
);
INSERT dbo.ColorSets
VALUES
(1, 4),
(1, 5),
(1, 6), -- your color set: yellow, green, and red
(2, 4),
(2, 5),
(2, 8) -- an additional color set: yellow, green, and purple
;
And the query (see this working in a SqlFiddle):
WITH Sets AS (
SELECT
idSet,
Grp = Checksum_Agg(idColor)
FROM
dbo.ColorSets
GROUP BY
idSet
), Categories AS (
SELECT
idCategory,
Grp = Checksum_Agg(idColor)
FROM
dbo.Products
GROUP BY
idCategory
)
SELECT
S.idSet,
C.idCategory
FROM
Sets S
INNER JOIN Categories C
ON S.Grp = C.Grp
WHERE
NOT EXISTS (
SELECT *
FROM
(
SELECT *
FROM dbo.ColorSets CS
WHERE CS.idSet = S.idSet
) CS
FULL JOIN (
SELECT *
FROM dbo.Products P
WHERE P.idCategory = C.idCategory
) P
ON CS.idColor = P.idColor
WHERE
CS.idColor IS NULL
OR P.idColor IS NULL
)
;
Result:
idSet idCategory
1 1
2 3
1 4

If I understand your question, this should do it
select distinct idCategory
from Products
where idColor in (4,5,6)

Related

SQL Server (terminal result) hierarchy map

In SQL Server 2016, I have a table with the following chaining structure:
dbo.Item
OriginalItem ItemID
------------ ------
NULL         7
1            2
NULL         1
5            6
3            4
NULL         8
NULL         5
9            11
2            3
EDIT NOTE: Bold numbers were added in response to @lemon's comments below.
Importantly, this example is a trivialized version of the real data, and the neatly ascending entries are not something that is present in the actual data. I'm just doing that to make it easier to understand.
I've constructed a query to get what I'm calling the TerminalItemID, which in this example case is ItemID 4, 6, and 7, and populated that into a temporary table #TerminalItems, the resultset of which would look like:
#TerminalItems
TerminalItemID
4
6
7
8
11
What I need, is a final mapping table that would look something like this (using the above example -- note that it also contains for 4, 6, and 7 mapping to themselves, this is needed by the business logic):
#Mapping
ItemID TerminalItemID
------ --------------
1      4
2      4
3      4
4      4
5      6
6      6
7      7
8      8
9      11
11     11
What I need help with is how to build this last #Mapping table. Any assistance in this direction is greatly appreciated!
This should do:
with MyTbl as (
select *
from (values
(NULL, 1 )
,(1, 2 )
,(2, 3 )
,(3, 4 )
,(NULL, 5 )
,(5, 6 )
,(NULL, 7 )
) T(OriginalItem, ItemID)
)
, TerminalItems as (
/* Find all leaf level items: those not appearing under OriginalItem column */
select LeafItem=ItemId, ImmediateOriginalItem=M.OriginalItem
from MyTbl M
where M.ItemId not in
(select distinct OriginalItem
from MyTbl AllParn
where OriginalItem is not null
)
), AllLevels as (
/* Use a recursive CTE to find and report all parents */
select ThisItem=LeafItem, ParentItem=ImmediateOriginalItem
from TerminalItems
union all
select ThisItem=AL.ThisItem, M.OriginalItem
from AllLevels AL
inner join
MyTbl M
on M.ItemId=AL.ParentItem
)
select ItemId=coalesce(ParentItem,ThisItem), TerminalItemId=ThisItem
from AllLevels
order by 1,2
Beware of the MAXRECURSION setting: by default, SQL Server allows at most 100 levels of recursion, which means the depth of your tree (the number of nodes between a terminal item and its ultimate original item) can be at most 100. The limit can be raised with OPTION (MAXRECURSION nnn), where nnn can be adjusted as needed. It can also be removed entirely by using 0, but this is not recommended because bad data can then cause infinite loops.
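As a minimal illustration of where the hint goes, here is a standalone sketch. The Numbers CTE is a made-up example, not part of the query above; it simply needs more than 100 levels of recursion.

```sql
-- Counts from 1 to 500 with a recursive CTE.
-- With the default limit of 100 levels this would fail with an error,
-- so the limit is raised for this statement only.
WITH Numbers AS (
    SELECT 1 AS n
    UNION ALL
    SELECT n + 1 FROM Numbers WHERE n < 500
)
SELECT MAX(n) AS LastValue   -- 500
FROM Numbers
OPTION (MAXRECURSION 1000);
```

The hint is attached to the final statement that uses the CTE, not to the CTE definition itself.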
This is a typical gaps-and-islands problem, and it can also be solved without recursion in three steps:
1. assign 1 at the beginning of each partition
2. compute a running sum over your flag value (generated at step 1)
3. extract the max "ItemID" on your partition (generated at step 2)
WITH cte1 AS (
SELECT *, CASE WHEN OriginalItem IS NULL THEN 1 ELSE 0 END AS changepartition
FROM Item
), cte2 AS (
SELECT *, SUM(changepartition) OVER(ORDER BY ItemID) AS parts
FROM cte1
)
SELECT ItemID, MAX(ItemID) OVER(PARTITION BY parts) AS TerminalItemID
FROM cte2
Check the demo here.
Assumption: Your terminal id items correspond to the "ItemID" value preceding a NULL "OriginalItem" value.
EDIT: "Fixing orphaned records."
The query works correctly when records are not orphaned. The only way to deal with orphaned records is to get the missing records back, so that the query can work correctly on the full data.
This is carried out by an extra subquery (done at the beginning) that applies a UNION ALL between:
- the available records of the original table
- the missing records
WITH fix_orphaned_records AS(
SELECT * FROM Item
UNION ALL
SELECT NULL AS OriginalItem,
i1.OriginalItem AS ItemID
FROM Item i1
LEFT JOIN Item i2 ON i1.OriginalItem = i2.ItemID
WHERE i1.OriginalItem IS NOT NULL AND i2.ItemID IS NULL
), cte AS (
...
Missing records correspond to "OriginalItem" values that are never found within the "ItemID" field. A self left join will uncover these missing records.
Check the demo here.
You can use a recursive CTE to compute the last item in the sequence. For example:
with
n (orig_id, curr_id, lvl) as (
select itemid, itemid, 1 from item
union all
select n.orig_id, i.itemid, n.lvl + 1
from n
join item i on i.originalitem = n.curr_id
)
select *
from (
select *, row_number() over(partition by orig_id order by lvl desc) as rn from n
) x
where rn = 1
Result:
orig_id curr_id lvl rn
-------- -------- ---- --
1 4 4 1
2 4 3 1
3 4 2 1
4 4 1 1
5 6 2 1
6 6 1 1
7 7 1 1
See running example at db<>fiddle.

Using recursive CTE to generate hierarchy results ordered by depth without the use of hierarchyid

I would like to query hierarchy results ordered depth-first without using SQL Server's built-in hierarchyid data type. Essentially, I am hoping to accomplish the depth ordering without any fancy functions.
I have provided a temp table below that contains these records:
Id p_Id order1 name1
-- ---- ------ ------
1  null 1      josh
2  null 2      mary
3  null 3      george
4  1    1      joe
5  1    2      jeff
6  2    1      marg
7  2    2      moore
8  2    3      max
9  3    1      gal
10 3    2      guy
11 4    1      tod
12 4    2      ava
13 9    1      ron
14 9    2      bill
15 9    100    pat
where p_Id is the id of the parent record, and order1 is the order in which sibling records should appear in the depth-first output. To show why my query does not fully work, I made the order1 of the last record 100 instead of, say, 3. This should not ultimately matter, since 100 and 3 both come after the previous order1 value, 2.
An example of a correct result table is shown below:
Id p_Id order1 name1  Descendants
-- ---- ------ ------ ---------------
1  null 1      josh   josh
4  1    1      joe    josh/joe
11 4    1      tod    josh/joe/tod
12 4    2      ava    josh/joe/ava
5  1    2      jeff   josh/jeff
2  null 2      mary   mary
6  2    1      marg   mary/marg
7  2    2      moore  mary/moore
8  2    3      max    mary/max
3  null 3      george george
9  3    1      gal    george/gal
13 9    1      ron    george/gal/ron
14 9    2      bill   george/gal/bill
15 9    100    pat    george/gal/pat
10 3    2      guy    george/guy
My current results are shown below:
Id p_Id order1 name1  Descendants     levels
-- ---- ------ ------ --------------- --------
1  null 1      josh   josh            .1
4  1    1      joe    josh/joe        .1.1
11 4    1      tod    josh/joe/tod    .1.1.1
12 4    2      ava    josh/joe/ava    .1.1.2
5  1    2      jeff   josh/jeff       .1.2
2  null 2      mary   mary            .2
6  2    1      marg   mary/marg       .2.1
7  2    2      moore  mary/moore      .2.2
8  2    3      max    mary/max        .2.3
3  null 3      george george          .3
9  3    1      gal    george/gal      .3.1
13 9    1      ron    george/gal/ron  .3.1.1
15 9    100    pat    george/gal/pat  .3.1.100
14 9    2      bill   george/gal/bill .3.1.2
10 3    2      guy    george/guy      .3.2
where I have created a levels column that essentially concatenates the order1 values and separates them with a period. This almost returns the correct results, but due to the fact that I am ordering by this string (of numbers and periods), the levels value of .3.1.100 will come before .3.1.2 , which is not what the desired output should look like. I am sure there is a different method to return the correct depth order. See below for the code that generates a temp table, and the code that I used to generate the incorrect output that I have so far.
if object_id('tempdb..#t1') is not null drop table #t1
CREATE TABLE #t1 (Id int, p_Id int, order1 int, name1 varchar(150))
INSERT into #t1 VALUES
(1, null, 1, 'josh'),
(2, null, 2, 'mary'),
(3, null, 3, 'george'),
(4, 1, 1, 'joe'),
(5, 1, 2, 'jeff'),
(6, 2, 1, 'marg'),
(7, 2, 2, 'moore'),
(8, 2, 3, 'max'),
(9, 3, 1, 'gal'),
(10, 3, 2, 'guy'),
(11, 4, 1, 'tod'),
(12, 4, 2, 'ava'),
(13, 9, 1, 'ron'),
(14, 9, 2, 'bill'),
(15, 9, 100, 'pat');
select * from #t1
-- Looking to generate hierarchy results ordered by depth --
; with structure as (
-- Non-recursive term.
-- Select the records where p_Id is null
select p.Id,
p.p_Id,
p.order1,
p.name1,
cast(p.name1 as varchar(64)) as Descendants,
cast(concat('.', p.order1) as varchar(150)) as levels
from #t1 p
where p.p_Id is null
union all
-- Recursive term.
-- Treat the records from previous iteration as parents.
-- Stop when none of the current records have any further sub records.
select c.Id,
c.p_Id,
c.order1,
c.name1,
cast(concat(p.Descendants, '/', c.name1) as varchar(64)) as Descendants,
cast(concat(p.levels, '.', c.order1) as varchar(150)) as levels
from #t1 c -- c being the 'child' records
inner join structure p -- p being the 'parent' records
on c.p_Id = p.Id
)
select *
from structure
order by replace(levels, '.', '') asc
Take II. As pointed out by the OP, my original answer fails for more than 10 children. So what we can do (the OP's suggestion) is pad the values with zeros to a constant length. But what length? We take the largest number of children under a node and add it to the largest value of order1, so for the example provided this is 100 + 3 = 103. We then take the length of that (3) and pad every order value with zeros to 3 digits. This means we will always be ordering as desired.
declare @PadLength int = 0;
select @PadLength = max(children)
from (
select len(convert(varchar(12),max(order1)+count(*))) children
from #t1
group by p_Id
) x;
-- Looking to generate hierarchy results ordered by depth --
with structure as (
-- Non-recursive term
-- Select the records where p_Id is null
select
p.Id [Id]
, p.p_Id [ParentId]
, p.order1 [OrderBy]
, p.name1 [Name]
, cast(p.name1 as varchar(64)) Descendants
, concat('.', right(replicate('0',@PadLength) + convert(varchar(12),p.order1), @PadLength)) Levels
from #t1 p
where p.p_Id is null
union all
-- Recursive term
-- Treat the records from previous iteration as parents.
-- Stop when none of the current records have any further sub records.
select
c.Id,
c.p_Id,
c.order1,
c.name1,
cast(concat(p.Descendants, '/', c.name1) as varchar(64)),
concat(p.levels, '.', right(replicate('0',@PadLength) + convert(varchar(12),c.order1), @PadLength))
from #t1 c -- c being the 'child' records
inner join structure p on c.p_Id = p.Id -- p being the 'parent' records
)
select *
from structure
order by replace(levels, '.', '') asc;
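To see the effect of the padding in isolation, here is a small standalone check. The table variable @Orders and the fixed width of 3 are assumptions made for this example; the query above computes the width from the data instead.

```sql
-- Hypothetical order values, including one with more digits than the others.
DECLARE @Orders TABLE (order1 int);
INSERT INTO @Orders (order1) VALUES (1), (2), (100);

SELECT order1,
       CONVERT(varchar(12), order1) AS AsText,                              -- '1', '2', '100'
       RIGHT(REPLICATE('0', 3) + CONVERT(varchar(12), order1), 3) AS Padded -- '001', '002', '100'
FROM @Orders
ORDER BY Padded;   -- rows come back as 1, 2, 100; ordering by AsText instead gives 1, 100, 2
```

Because every padded value has the same length, comparing them as strings gives the same order as comparing the numbers.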
Note: This answer fails when there are more than 10 children under a particular node. Leaving it here for interest.
The issue you have run into is that you are ordering by a string, not a number, so the string 100 comes before the string 2. But you need to order by a string to take care of the hierarchy, so one solution is to replace order1 with row_number() based on the order1 column while it's still a number, and use the row_number() to build your ordering string.
So you replace:
cast(concat(p.levels, '.', c.order1) as varchar(150)) as levels
with
cast(concat(p.levels, '.', row_number() over (order by c.Order1)) as varchar(150))
giving a full query of
with structure as (
-- Non-recursive term.
-- Select the records where p_Id is null
select p.Id,
p.p_Id,
p.order1,
p.name1,
cast(p.name1 as varchar(64)) as Descendants,
cast(concat('.', p.order1) as varchar(150)) as levels
from #t1 p
where p.p_Id is null
union all
-- Recursive term.
-- Treat the records from previous iteration as parents.
-- Stop when none of the current records have any further sub records.
select c.Id,
c.p_Id,
c.order1,
c.name1,
cast(concat(p.Descendants, '/', c.name1) as varchar(64)) as Descendants,
cast(concat(p.levels, '.', row_number() over (order by c.Order1)) as varchar(150))
from #t1 c -- c being the 'child' records
inner join structure p -- p being the 'parent' records
on c.p_Id = p.Id
)
select *
from structure
order by replace(levels, '.', '') asc;
Which returns the desired results.
Note: good question, well written.

Getting two different types of sums with only one row

I have a table that looks like this:
id code total
1 2 30
1 4 60
1 2 31
2 2 10
2 4 11
What I'd like to do, is basically get one row per id for the sum of records for code 2 and the sum of records for all codes for that id. So something like this:
id code2_total overall
1 61 121
2 10 21
I've tried the following:
select id
, abs(sum(total) over (partition by id)) as overall
, (select sum(total) from table where code = '2' group by id) as code2_total
from table limit 1
But I'm getting multiple items in the subquery error. How can I achieve something like this?
Use group by with a regular sum and a conditional sum (i.e. using a case expression).
declare @MyTable table (id int, code int, total int);
insert into @MyTable (id, code, total)
values
(1, 2, 30),
(1, 4, 60),
(1, 2, 31),
(2, 2, 10),
(2, 4, 11);
select id
, sum(case when code = 2 then total else 0 end) code2_total
, sum(total) overall
from @MyTable
group by id
order by id;
Returns
id code2_total overall
-- ----------- -------
1  61          121
2  10          21
Note: limit 1 is MySQL syntax, not SQL Server, and it wouldn't help you here anyway.
Note also that providing the DDL+DML as I have shown here makes it much easier for people to assist.

Matching multiple rows in where clause for filter

I have two tables as the below:
Table 1 : Product_Information
Information_ID Product_Name
-------------- ------------
1              A
2              B
3              C
4              D
5              E
Table 2 : Discriptor_Values
Information_ID Descriptor_ID Descriptor_Value
-------------- ------------- ----------------
1              1             98
1              2             142
1              3             29.66
2              1             50
2              2             11
2              3             14
3              1             17
3              2             76
3              3             85
4              1             59
4              2             48
4              3             35
5              1             48
5              2             12
5              3             19
Using the above tables, I am creating a filter page like the ones on online shopping sites. For mobile phones, for example, the descriptors would be things like price and internal storage, each with a min and max value.
In the same way, I select descriptors and give min and max values for them, and the matching products should be the result.
If I pass any filter range, the filtered list of products should be shown; otherwise all the records should be shown.
I am trying the query below but not getting the correct output. I am getting the union of rows that match any of the passed rows (#tblFilter).
CREATE TABLE #tblFilter(
[descriptor_id] [int] NULL,
[min_value] [decimal](18, 0) NULL,
[max_value] [decimal](18, 0) NULL
)
insert into #tblFilter values (1, 40.33, 70.33)
insert into #tblFilter values (2, 100.33, 150.33)
insert into #tblFilter values (3, 10, 60)
select p.*
from Product_Information p
inner join Discriptor_Values dv on p.Information_ID = dv.Information_ID
left join #tblFilter t1 on t1.descriptor_id = dv.Descriptor_id
WHERE ((dv.Descriptor_ID = t1.descriptor_id
and convert(decimal, dv.Descriptor_Value)
between CONVERT(decimal, t1.min_value) and CONVERT(decimal, t1.max_value))
or not exists (select 1 from #tblFilter))
drop TABLE #tblFilter
Please help me narrow the result list by the filter, and show all records if there is no row in the filter table (#tblFilter).
I believe you want:
select p.*
from Product_Information p join
Discriptor_Values dv
on p.Information_ID = dv.Information_ID left join
#tblFilter t1
on t1.descriptor_id = dv.Descriptor_id
where dv.Descriptor_Value between t1.min_value and t1.max_value or
dv.Descriptor_id is null;
I removed the conversions to decimals. You might actually need them, but in the question the values look like numbers and the question doesn't specify that they are stored as strings.
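Since the question asks for products that satisfy every filter row rather than any of them, another way to express that is to count the matched filters per product. This is only a sketch under assumptions: table variables stand in for the real tables, the two-row filter is made up for illustration, the filter is assumed to hold at most one row per descriptor, and the filter columns are declared decimal(18, 2) so the bounds keep their fractional part.

```sql
-- Sample values copied from the question's Discriptor_Values table.
DECLARE @Discriptor_Values TABLE (Information_ID int, Descriptor_ID int, Descriptor_Value decimal(18, 2));
INSERT INTO @Discriptor_Values (Information_ID, Descriptor_ID, Descriptor_Value)
VALUES (1, 1, 98), (1, 2, 142), (1, 3, 29.66),
       (2, 1, 50), (2, 2, 11),  (2, 3, 14),
       (3, 1, 17), (3, 2, 76),  (3, 3, 85),
       (4, 1, 59), (4, 2, 48),  (4, 3, 35),
       (5, 1, 48), (5, 2, 12),  (5, 3, 19);

-- Hypothetical filter: descriptor 1 between 40.33 and 70.33, descriptor 3 between 10 and 60.
DECLARE @Filter TABLE (descriptor_id int, min_value decimal(18, 2), max_value decimal(18, 2));
INSERT INTO @Filter (descriptor_id, min_value, max_value)
VALUES (1, 40.33, 70.33), (3, 10, 60);

SELECT dv.Information_ID
FROM @Discriptor_Values dv
LEFT JOIN @Filter f
    ON f.descriptor_id = dv.Descriptor_ID
   AND dv.Descriptor_Value BETWEEN f.min_value AND f.max_value
GROUP BY dv.Information_ID
-- Keep a product only if it satisfied as many filters as exist.
-- With an empty filter table both sides are 0, so every product is kept.
HAVING COUNT(f.descriptor_id) = (SELECT COUNT(*) FROM @Filter)
ORDER BY dv.Information_ID;
-- Returns Information_ID 2, 4 and 5 for this sample filter.
```

The resulting ids can then be joined back to Product_Information to get the product rows.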

Order by followed by dependent order by

I have a table with data like that shown below. I have ordered it by edition_id.
Now there is a further requirement to order by language_id, and that order depends on the value of edition_id.
Edition_id refers to the city from which a newspaper is published.
Language_id refers to the different languages in which the newspaper is published.
So suppose edition_id = 5; it means New Delhi.
For New Delhi the language_id values are 13 (English), 5 (Hindi), 1 (Telugu), 4 (Urdu).
What I want for New Delhi is to display all English articles first, followed by Hindi, followed by Telugu, followed by Urdu.
If edition_id=1 then the order of language_id should be 13,1,2.
Similarly, if edition_id=5 then the order of language_id should be 13,5,1,4.
Right now what I have is
Edition_id | Language_id
1 1
1 2
1 13
1 1
1 13
1 2
5 4
5 1
5 1
5 4
5 13
5 5
5 13
What is required
Edition_id | Language_id
1 13
1 13
1 1
1 1
1 2
1 2
5 13
5 13
5 5
5 1
5 1
5 4
5 4
How can I do this? Please help.
Is something like this possible?
Select * from <table>
order by edition_id ,
case when edition=6 then <order specified for language_id ie 13,5,1,4>
I would create a supplementary ranking table. I would then JOIN to provide your sort order. Eg:
EDITION_SORT_ORDER
EDITION_ID LANGUAGE_ID RANK
---------- ----------- ----
1 13 1
1 1 2
1 2 3
5 13 1
5 5 2
5 1 3
5 4 4
Using this table in a query might look like this:
SELECT E.EDITION_ID, E.LANGUAGE_ID
FROM <TABLE> E LEFT OUTER JOIN EDITION_SORT_ORDER S ON
E.EDITION_ID = S.EDITION_ID AND
E.LANGUAGE_ID = S.LANGUAGE_ID
ORDER BY E.EDITION_ID, S.RANK
This way you can add other rules in future, and it isn't a huge mess of CASE logic.
Alternatively, if you want to avoid a JOIN, you could create a stored function which did a similar lookup and returned a rank (based on passed parameters of EDITION_ID and LANGUAGE_ID).
If you must use CASE, then I'd confine it to a function so you can re-use the logic elsewhere.
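A self-contained sketch of the ranking-table approach follows. The table variables @Articles and @EditionSortOrder and the column name SortRank are hypothetical stand-ins for your real table and the EDITION_SORT_ORDER table described above; the sample rows are taken from the question.

```sql
-- A few article rows from the question (stand-in for the real table).
DECLARE @Articles TABLE (Edition_id int, Language_id int);
INSERT INTO @Articles (Edition_id, Language_id)
VALUES (1, 1), (1, 2), (1, 13), (5, 4), (5, 1), (5, 13), (5, 5);

-- One row per edition and language, holding the desired position.
DECLARE @EditionSortOrder TABLE (Edition_id int, Language_id int, SortRank int);
INSERT INTO @EditionSortOrder (Edition_id, Language_id, SortRank)
VALUES (1, 13, 1), (1, 1, 2), (1, 2, 3),
       (5, 13, 1), (5, 5, 2), (5, 1, 3), (5, 4, 4);

SELECT a.Edition_id, a.Language_id
FROM @Articles a
LEFT JOIN @EditionSortOrder s
    ON s.Edition_id = a.Edition_id
   AND s.Language_id = a.Language_id
ORDER BY a.Edition_id,
         CASE WHEN s.SortRank IS NULL THEN 1 ELSE 0 END,  -- languages without a rank go last
         s.SortRank;
-- Edition 1 comes back as 13, 1, 2 and edition 5 as 13, 5, 1, 4.
```

Sorting by the edition first keeps each edition's rows together; the rank only decides the order within an edition.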
If there is no mathematical logic behind it, I would insert another column that can be used for proper sorting.
If you cannot do this, you can simply type out the rules for the relation like this:
Order By Edition_Id,
case Edition_id
when 1 then
case Language_id
when 13 then 1
when 1 then 2
when 2 then 3
end
when 5 then
case Language_id
when 13 then 1
when 5 then 2
when 1 then 3
when 4 then 4
end
end
Without a fixed order column you could do something like the following, but the logic is not easy to follow.
Assuming the first criterion is the length of Language_id, the second is Edition_id = Language_id, and the rest is the order of Language_id, it could work this way:
Declare @t table(Edition_id int, Language_id int)
insert into @t values
(1, 1),
(1, 2),
(1, 13),
(1, 1),
(1, 13),
(1, 2),
(5, 4),
(5, 1),
(5, 1),
(5, 4),
(5, 13),
(5, 5),
(5, 13);
Select * from @t
order by Edition_id,Case when len (Cast(Language_ID as Varchar(10)))=1 then '1' else '0' end
+case when Edition_id=Language_id then '0' else '1' end
,Language_ID
You've probably considered this, but if your desired ordering is always based on the alphabetical name of the language, then there would usually be a table with the language description that you could join with and then sort by. I base this on your quote below.
...English articles first , followed by hindi , followed by Telugu
followed by Urdu.
SELECT E.EDITION_ID, E.LANGUAGE_ID, LN.LANGUAGE_NAME
FROM <TABLE> E LEFT OUTER JOIN <LANGUAGE_NAMES> LN ON
E.LANGUAGE_ID = LN.LANGUAGE_ID
ORDER BY 1, 3