Filter a join based on multiple rows

I'm trying to write a query that filters a join based on several rows in another table. Hard to put it into words, so I'll provide a cut-back simple example.
Parent Child
P1 C1
P1 C2
P1 C3
P2 C1
P2 C2
P2 C4
P3 C1
P3 C3
P3 C5
Essentially, all rows are stored in the same table, with a ParentID column that links an item to another (parent) row.
The stored procedure takes a comma-delimited list of "child" codes, and based on whatever is in this list, I need to provide a list of potential siblings.
For example, if the comma-delimited list is empty, the returned rows should be C1, C2, C3, C4, C5. If the list is "C2", the returned rows would be C1, C3, C4, and if the list is "C1, C2", the returned rows would be C3, C4.
Sample query:
SELECT [S].[ID]
FROM utItem [P]
INNER JOIN utItem [C]
    ON [P].[ID] = [C].[ParentID]
INNER JOIN
(
    -- Encapsulated to simplify sample.
    SELECT [ID]
    FROM udfListToRows( @ChildList )
    GROUP BY
        [ID]
) [DT]
    ON [DT].[ID] = [C].[ID]
/*
In the event where I passed in "C2", this would work, it would return C1, C3, C4.
However this falls apart the moment there is more than 1 value in @ChildList. If I pass in "C2, C3", it would return siblings for either. But I only want siblings of BOTH.
**/
INNER JOIN [utItem] [S]
    ON [C].[ParentID] = [S].[ParentID]
    AND [C].[ID] <> [S].[ID]
WHERE
    @ChildList IS NOT NULL
GROUP BY
    [S].[ID]
UNION ALL
-- In the event that no @ChildList values are provided, return a full list of possible children (e.g. 1,2,3,4,5).
SELECT [C].[ID]
FROM [utItem] [P]
INNER JOIN [utItem] [C]
    ON [P].[ID] = [C].[ParentID]
WHERE
    @ChildList IS NULL
GROUP BY
    [C].[ID]

Firstly, you can split your data into a table variable for ease of use:
DECLARE @input TABLE (NodeId varchar(2));
INSERT @input (NodeId)
SELECT [ID]
FROM udfListToRows( @ChildList ); -- or STRING_SPLIT or whatever
Assuming you already had your data in a proper table variable (rather than a comma-separated list), you can do this:
DECLARE @totalCount int = (SELECT COUNT(*) FROM @input);
SELECT DISTINCT
    t.Child
FROM (
    SELECT
        t.Parent,
        t.Child,
        i.NodeId,
        COUNT(i.NodeId) OVER (PARTITION BY t.Parent) matches
    FROM YourTable t
    LEFT JOIN @input i ON i.NodeId = t.Child
) t
WHERE t.matches = @totalCount
    AND t.NodeId IS NULL;
This is a kind of relational division:
Left-join the input to the main table.
Using a window function, calculate how many matches you get per Parent.
Keep only the parents that have as many matches as there are inputs.
Take the distinct Child values, excluding the original inputs.
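The steps above can be sketched end to end as a runnable example. This is a minimal sketch under stated assumptions, not the answer's exact query: it uses Python's built-in sqlite3 module with the sample Parent/Child rows from the question, it swaps the window function for an equivalent GROUP BY ... HAVING count check, and the names item, input and potential_siblings are hypothetical. It also assumes each (parent, child) pair appears only once.

```python
import sqlite3

# Sample Parent/Child rows from the question.
ROWS = [
    ("P1", "C1"), ("P1", "C2"), ("P1", "C3"),
    ("P2", "C1"), ("P2", "C2"), ("P2", "C4"),
    ("P3", "C1"), ("P3", "C3"), ("P3", "C5"),
]


def potential_siblings(child_codes):
    """Return children of every parent that has ALL of child_codes as children."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (parent TEXT, child TEXT)")
    conn.executemany("INSERT INTO item VALUES (?, ?)", ROWS)
    # Hypothetical stand-in for the table variable holding the split list.
    conn.execute("CREATE TABLE input (node_id TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT OR IGNORE INTO input VALUES (?)", [(c,) for c in child_codes]
    )
    sql = """
        SELECT DISTINCT t.child
        FROM item AS t
        WHERE t.parent IN (
            -- Relational division: parents whose match count equals the input count.
            SELECT t2.parent
            FROM item AS t2
            LEFT JOIN input AS i ON i.node_id = t2.child
            GROUP BY t2.parent
            HAVING COUNT(i.node_id) = (SELECT COUNT(*) FROM input)
        )
        -- Exclude the codes that were passed in.
        AND t.child NOT IN (SELECT node_id FROM input)
        ORDER BY t.child
    """
    result = [row[0] for row in conn.execute(sql)]
    conn.close()
    return result
```

With an empty list the input count is zero, so every parent qualifies and all children come back, which matches the behaviour the question asks for.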

Related

What's the trick for unnesting an array in Snowflake?

I have some complex array logic I'm using in postgres but now need to transfer to snowflake. The issue I'm having is with the second column syntax of connected_users, specifically the unnest. I know flatten(table(input => is the solution to unnesting in snowflake, but I just can't seem to get the syntax right.
What this logic currently does is:
concatenates two different arrays (from c2 users and c1 users)
unnests the concatenated array into a flattened table
re-arrays only distinct values (e) from the flattened table where c0 user is not equal to any of the values
the query:
select
c0.user_id as user
, array(select distinct e from unnest(array_cat(
array_agg(distinct c2.user_id order by c2.user_id desc),
array_agg(distinct c1.user_id order by c1.user_id desc)
)) as a(e) where c0.user_id != e) as connected_users
from device_logs c0
left join device_logs c1 ON (
c0.device_id = c1.device_id
)
left join device_logs c11 ON (
c1.user_id = c11.user_id
)
left join device_logs c2 ON (
c2.device_id = c11.device_id
)
group by 1, 2
I was hoping that I could easily just replace the unnest function with flatten(table(input =>, however that still produces errors

SqlServer - search if a list of values is fully contained on table

I have a sql table and a list of values to search.
If the list contains at least all the elements stored in the table for a ticket, then I must return the Ticket Id (meaning I will update that record). Otherwise, I must return null (meaning it will be a new registration).
For example
Use cases:
If I search for these elements: C1, C3, C6, it will be an update and I will get ticketid 1
If I search for these elements: C8, C3, C6, C10, it will be a create and I will get null as the return value
The list of values is a predefined table type with one column: in this case, @ElementsToSearch with a column named Value
SELECT T.Id
FROM
    Ticket t
INNER JOIN
    TicketValue TL ON TL.TicketId = T.Id
LEFT OUTER JOIN
    @ElementsToSearch ES ON ES.Value = TL.Value
WHERE
    ES.Value is null
thank you

Depending on what you want to return, just interchange NULL and the ID in the CASE expression:
declare @ElementsToSearch as Table(value varchar(10))
insert into @ElementsToSearch values('C1'),('C2'),('C3')
SELECT
    CASE WHEN (COUNT(CASE WHEN ES.value IS NULL then 1 end)>0) then NULL else T.id end as output
FROM
    Ticket t
INNER JOIN
    TicketValue TL ON TL.TicketId = T.Id
LEFT OUTER JOIN
    @ElementsToSearch ES ON ES.Value = TL.Value
group by T.id
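To see the counting idea in action, here is a minimal sketch using Python's built-in sqlite3 module. The sample data is hypothetical (the question's table is not shown): ticket 1 stores C1 and C3, ticket 2 stores C8 and C9. Unlike the query above, which returns NULL per non-matching ticket, this variant filters with HAVING so only fully contained tickets are returned; the table and function names are assumptions for illustration.

```python
import sqlite3

# Hypothetical sample data: (ticket_id, value).
TICKET_VALUES = [(1, "C1"), (1, "C3"), (2, "C8"), (2, "C9")]


def tickets_contained_in(search_values):
    """Return IDs of tickets whose stored values all appear in search_values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ticket_value (ticket_id INTEGER, value TEXT)")
    conn.executemany("INSERT INTO ticket_value VALUES (?, ?)", TICKET_VALUES)
    conn.execute("CREATE TABLE elements_to_search (value TEXT)")
    conn.executemany(
        "INSERT INTO elements_to_search VALUES (?)", [(v,) for v in search_values]
    )
    sql = """
        SELECT tv.ticket_id
        FROM ticket_value AS tv
        LEFT JOIN elements_to_search AS es ON es.value = tv.value
        GROUP BY tv.ticket_id
        -- A NULL on the right side means a stored value was not in the list.
        HAVING COUNT(CASE WHEN es.value IS NULL THEN 1 END) = 0
        ORDER BY tv.ticket_id
    """
    ids = [row[0] for row in conn.execute(sql)]
    conn.close()
    return ids
```

An empty result corresponds to the "create" case from the question, and a returned ID to the "update" case.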

Group by on two tables and perform Left join on outcome VBA ADODB SQL Query

I want to perform Group BY on two csv files and then perform Left Join on the outcome of both tables through VBA ADO Query in Excel. My end motive is to print the recordset.
Here is what I have done so far.
SELECT * FROM (
SELECT f1.[c3],
f1.[c4],
f1.[c5],
f1.[c6],
Sum(f1.[c8]) AS SUMDATA
FROM test1.csv F1
GROUP BY f1.[c3],
f1.[c4],
f1.[c5],
f1.[c6]) AS f3
LEFT JOIN SELECT * FROM (
SELECT f2.[c3],
f2.[c4],
f2.[c5],
f2.[c6],
Sum(f2.[c8]) AS SUMDATA
FROM test2.csv f2
GROUP BY f2.[c3],
f2.[c4],
f2.[c5],
f2.[c6]) AS f4
on f3.[c3]+ f3.[c4]+ f3.[c5]+ f3.[c6] = f4.[c3]+ f4.[c4]+ f4.[c5]+ f4.[c6]
WHERE f3.[SUMDATA] <> f4.[SUMDATA]
This shows a syntax error. How to implement this? Any help is much appreciated. TIA.
An update:
I managed to implement 1 LEFT JOIN and 2 GROUP BYs between 2 tables. As requested, here are a few details regarding my dataset.
It consists of fields - c1, c2 .... c8.
c8 is my target field.
My expected output - I do not need c7, c1 and c2 in output sheet. The info of c7, c1 and c2 is irrelevant. I need to do 5 things with my data.
Group Sum the c8 field based on c3, c4, c5 and c6 fields in CSV file 1 and store target field as SUMDATA
Group Sum the c8 field based on c3, c4, c5 and c6 fields in CSV file 2 and store target field as SUMDATA
Find the mismatched SUMDATA field entries between CSV1 and CSV2 (I used LEFT JOIN for this on concatenated c3, c4, c5, c6 fields)
Find the entries which are present in CSV1 but not in CSV2
Find the entries which are present in CSV2 but not in CSV1
Currently, I have managed to write code that works up to step 3. I need to temporarily store the grouped tables from steps 1 and 2 in order to perform steps 4 and 5, which can be done through 2 more UNION, LEFT JOIN, and WHERE combinations. This is where I am stuck right now.

This isn't really an answer but the formatting is important for readability.
It looks like there's a lot wrong with your SQL.
The syntax should look like this (assuming querying a csv works like you are thinking):
SELECT SUB1.Field1,
       SUB1.AggField AS Agg1,
       SUB2.AggField AS Agg2
FROM (SELECT Field1,
             MAX(Field2) AS AggField
      FROM Table1 T1
      GROUP
         BY Field1
     ) SUB1
LEFT
JOIN (SELECT Field1,
             MAX(Field2) AS AggField
      FROM Table2 T2
      GROUP
         BY Field1
     ) SUB2
  ON SUB1.Field1 = SUB2.Field1
WHERE SUB1.AggField <> SUB2.AggField;
Also, you are missing a comma here: F1.[c5] F1.[c6] in the first chunk.
Try fixing the SQL syntax like this and see where that gets you.
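The same shape (two grouped subqueries joined with a LEFT JOIN) can be tried out in a small runnable sketch. This uses Python's built-in sqlite3 module rather than ADO against CSV files, groups on only two key columns (c3, c4) for brevity, and uses hypothetical table names csv1 and csv2, so treat it as an illustration of the query structure, not of the ADO text driver's behaviour. The extra IS NULL check also surfaces groups present in the first file but missing from the second.

```python
import sqlite3


def mismatched_sums(rows1, rows2):
    """Compare grouped SUM(c8) between two tables keyed on (c3, c4)."""
    conn = sqlite3.connect(":memory:")
    for name, rows in (("csv1", rows1), ("csv2", rows2)):
        conn.execute(f"CREATE TABLE {name} (c3 TEXT, c4 TEXT, c8 INTEGER)")
        conn.executemany(f"INSERT INTO {name} VALUES (?, ?, ?)", rows)
    sql = """
        SELECT f3.c3, f3.c4, f3.sumdata, f4.sumdata
        FROM (SELECT c3, c4, SUM(c8) AS sumdata
              FROM csv1
              GROUP BY c3, c4) AS f3
        LEFT JOIN (SELECT c3, c4, SUM(c8) AS sumdata
                   FROM csv2
                   GROUP BY c3, c4) AS f4
          ON f3.c3 = f4.c3 AND f3.c4 = f4.c4
        -- NULL on the right: group missing from csv2; otherwise sums differ.
        WHERE f4.sumdata IS NULL OR f3.sumdata <> f4.sumdata
        ORDER BY f3.c3, f3.c4
    """
    result = conn.execute(sql).fetchall()
    conn.close()
    return result
```

Joining on each key column separately, instead of on a concatenation of the columns, avoids false matches such as 'ab' + 'c' equalling 'a' + 'bc'.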

SQL Server Recursive CTE for finding all the dependencies

I have a table with the following data.
Road City
R1 C1
R2 C2
R3 C1
R3 C3
R4 C3
R4 C5
R5 C5
If R1 is the input I need to get R1, R3, R4 and R5 as the output. This is because R1 belongs to C1 and C1 has R3 and R3 also belongs to C3 which has R4 and similarly R5.
I was trying to make use of CTE recursion but was not able to get it to work. I tried a recursive stored procedure call, but it only goes 30 levels deep.
with tmp1 as (
select ROAD, CITY, 1 as Level from table R1 WHERE ROAD = 1712
UNION ALL
select R2.ROAD, R2.CITY,Level + 1 as Level
from tmp1 INNER JOIN table R2 ON tmp1.CITY = R2.CITY and tmp1.ROAD <> R2.ROAD
)
select * from tmp1
OPTION (maxrecursion 0)
Any thoughts greatly appreciated!

A recursive CTE will not work without some way of breaking cycles. Other database vendors have specific features for disallowing a row to be added twice. Unless something was added in the latest releases, Microsoft SQL Server does not.
The following does not work, because the recursive clause is referring to the CTE twice. (Or it contains a subquery)
WITH recur AS (SELECT Road, City
               FROM @Map
               WHERE Road = @StartingRoad
               --
               UNION ALL
               --
               SELECT next.Road, next.City
               FROM @Map next
               INNER JOIN recur
                   ON (recur.City = next.City AND recur.Road <> next.Road)
                   OR (recur.City <> next.City AND recur.Road = next.Road)
               WHERE NOT EXISTS (SELECT NULL
                                 FROM recur test
                                 WHERE test.Road = next.Road AND test.City = next.City))
SELECT *
FROM recur;
Msg 253, Level 16, State 1, Line 36
Recursive member of a common table expression 'recur' has multiple recursive references.
It is possible with a straightforward loop, which you could put in a stored procedure:
DECLARE @Map TABLE (Road VARCHAR(2), City VARCHAR(2));
INSERT INTO @Map (Road, City)
VALUES ('R1', 'C1')
     , ('R2', 'C2')
     , ('R3', 'C1')
     , ('R3', 'C3')
     , ('R4', 'C3')
     , ('R4', 'C5')
     , ('R5', 'C5');
DECLARE @StartingRoad VARCHAR(2) = 'R1';
DECLARE @Results TABLE (Road VARCHAR(2), City VARCHAR(2));
INSERT INTO @Results (Road, City)
SELECT Road, City
FROM @Map
WHERE Road = @StartingRoad;
WHILE (1=1) BEGIN
    INSERT INTO @Results (Road, City)
    SELECT next.Road, next.City
    FROM @Map next
    INNER JOIN @Results r
        ON (r.City = next.City AND r.Road <> next.Road)
        OR (r.City <> next.City AND r.Road = next.Road)
    WHERE NOT EXISTS (SELECT NULL
                      FROM @Results test
                      WHERE test.Road = next.Road AND test.City = next.City);
    IF @@ROWCOUNT = 0
        BREAK;
END;
SELECT DISTINCT Road
FROM @Results
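As an illustration of the point about other database engines, here is a minimal sketch using SQLite through Python's built-in sqlite3 module. In SQLite, joining the two halves of a recursive CTE with UNION instead of UNION ALL discards rows that were already produced, which breaks the cycle. This is not SQL Server syntax; the table name map and the function name connected_roads are hypothetical, and the data is the Road/City sample from the question.

```python
import sqlite3

# Road/City sample rows from the question.
MAP_ROWS = [
    ("R1", "C1"), ("R2", "C2"), ("R3", "C1"), ("R3", "C3"),
    ("R4", "C3"), ("R4", "C5"), ("R5", "C5"),
]


def connected_roads(starting_road):
    """Return every road reachable from starting_road via shared cities."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE map (road TEXT, city TEXT)")
    conn.executemany("INSERT INTO map VALUES (?, ?)", MAP_ROWS)
    sql = """
        WITH RECURSIVE reach(road, city) AS (
            SELECT road, city FROM map WHERE road = ?
            UNION  -- UNION (not UNION ALL) drops rows already produced
            SELECT nxt.road, nxt.city
            FROM map AS nxt
            JOIN reach AS r
              ON r.city = nxt.city OR r.road = nxt.road
        )
        SELECT DISTINCT road FROM reach ORDER BY road
    """
    roads = [row[0] for row in conn.execute(sql, (starting_road,))]
    conn.close()
    return roads
```

Because duplicates are dropped, the recursion stops once no new (road, city) pair can be reached, so no explicit level limit is needed in this engine.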

This might get you partially there. I am doing a partial Cartesian join to generate the city/road combinations, and artificially restricting the recursion to 20 levels. I can't help but think there is a better way.
WITH cte(road,
city,
connected_city,
connected_road)
AS (
SELECT a.road,
a.city,
b.city AS connected_city,
b.Road connected_road
FROM deleteme a
INNER JOIN deleteme b ON a.City = b.City
WHERE a.road <> b.road)
SELECT DISTINCT
road
FROM cte;
road
R1
R3
R4
R5

How to get all Contract no against Leads in oracle sql query?

I need to create a sql query for below scenario:
Table name is remark
Columns are contractno and leadid.
1 contractno can have multiple leadid.
Similarly,
1 leadid can be assigned to multiple contractno.
Let's assume:
C1 --> L1
C2 --> L1, L2
C3 --> L2
I will get only one contractno i.e. C1 as parameter.
Now I have to find all Contracts against C1 through leadid.
Please help me out how I can achieve this.
Thank you.

SELECT r1.contractno
FROM remark r1
JOIN remark r2
ON r1.leadid = r2.leadid
WHERE r2.contractno = 'C1'
AND r1.contractno <> 'C1'
This assumes your table has this format:
contractno leadid
C1 L1
C2 L1
C2 L2
C3 L2
If it doesn't, then you need to split the csv value into rows first:
Turning a Comma Separated string into individual rows
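The self-join can be checked quickly against the mapping from the question (C1 → L1, C2 → L1, L2, C3 → L2). This is a minimal sketch using Python's built-in sqlite3 module rather than Oracle; the function name is hypothetical, and DISTINCT is added here as an assumption so that a contract sharing several leads is listed once.

```python
import sqlite3

# One row per (contractno, leadid) pair, following the question's mapping.
REMARK_ROWS = [("C1", "L1"), ("C2", "L1"), ("C2", "L2"), ("C3", "L2")]


def contracts_sharing_a_lead(contract_no):
    """Return other contracts that share at least one leadid with contract_no."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE remark (contractno TEXT, leadid TEXT)")
    conn.executemany("INSERT INTO remark VALUES (?, ?)", REMARK_ROWS)
    sql = """
        SELECT DISTINCT r1.contractno
        FROM remark AS r1
        JOIN remark AS r2 ON r1.leadid = r2.leadid
        WHERE r2.contractno = ?
          AND r1.contractno <> ?
        ORDER BY r1.contractno
    """
    result = [row[0] for row in conn.execute(sql, (contract_no, contract_no))]
    conn.close()
    return result
```

Note that this finds only directly linked contracts: for C1 it returns C2, but not C3, which is connected to C1 only through C2.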

You can use a LISTAGG if you have to group list of contracts. Here too it is assumed that your table has linear format and not comma separated leadids
WITH cn
AS (SELECT DISTINCT leadid
FROM remark
WHERE contractno = 'C1')
SELECT Listagg(r.contractno, ',')
within GROUP (ORDER BY ROWNUM) contractno_C1
FROM remark r
join cn
ON r.leadid = cn.leadid
WHERE r.contractno <> 'C1'
GROUP BY cn.leadid;
http://sqlfiddle.com/#!4/54e48/1/0
