Using multiple, interdependent CROSS APPLY clauses across multiple tables - sql

How can I use either CROSS APPLY (or INNER JOIN) to get data from one table based on the values in other tables?
For example, I have the following tables:
Table Descriptions:
ProdID | Description | TrackNum
361    | Test 1      | 499
388    | Test 2      | 003
004    |             | 5599
238    | Test 3      | 499
361    | Test 10     | 555
004    | Test 40     | 555
Table Products:
ProdID | ProductName | Price
361    | P1          | 5.00
388    | P2          | 5.00
004    | P3          | 12.00
238    | P4          | 6.00
515    | P5          | 7.00
636    | P6          | 7.00
775    | P7          | 7.00
858    | P8          | 8.00
Table Invoices:
ProdID | TrackNum | InvoiceID
361    | 499      | 718
388    | 199      | 718
004    | 499      | 718
238    | 499      | 718
361    | 555      | 333
004    | 555      | 444
361    | 111      | 444
388    | 222      | 333
616    | 116      | 565
717    | 116      | 565
361    | 003      | 221
388    | 003      | 221
004    | 5599     | 728
What I need my query to do is to:
1. Go into the Invoices table first and get only the records that match the specified InvoiceID and TrackNum.
2. Then go into the Products table and get only the rows whose ProdID matches a ProdID from the data pulled out in step 1.
3. Finally, get all columns from the Descriptions table, but only for the rows from step 2 that match on ProdID.
What I need to have at the end is something like this (if I get more columns that is fine, but I do not want to get more rows):
ProdID | Description | TrackNum
361    | Test 1      | 499
004    |             | 5599
238    | Test 3      | 499
I have the following query (I have tried both INNER JOIN and CROSS APPLY), but it returns more rows than I need:
SELECT * FROM [Descriptions] AS [DES]
CROSS APPLY
(
select * from [Invoices] AS [INV] where [INV].[TrackNum] = '499' AND [INV].[InvoiceID] = '718'
) [INV]
CROSS APPLY
(
select * from [Products] AS [GP]
WHERE [GP].[ProdID] = [INV].[ProdID]
) [GP2]
WHERE
[DES].[ProdID] = [GP2].[ProdID]
order by [DES].[ProdID] asc

SELECT
*
FROM
invoices AS i
LEFT JOIN
descriptions AS d
ON d.prodid = i.prodid
AND d.tracknum = i.tracknum -- you don't have this, but I think it's required.
LEFT JOIN
products AS p
ON p.prodid = i.prodid
WHERE
i.invoiceid = 718
AND i.tracknum = 499
ORDER BY
i.prodid
One thing that concerns me is that both the invoices and descriptions have a column named tracknum, but your query and expected data indicate that you don't want to include that in the join? That's very confusing and either a poor column name, or a mistake in your query and example results.

Based on what you describe you want the following, start with your Invoices table and a where clause to get the right rows, then join on Products and Descriptions.
I'm also guessing that you want to match the Description on TrackNum? Since it appears you have a unique Description per ProdId/TrackNum combination.
select [INV].[ProdID], [DES].[Description], [INV].[TrackNum]
from [Invoices] as [INV]
inner join [Products] as [GP] on [GP].[ProdID] = [INV].[ProdID]
inner join [Descriptions] as [DES] on [DES].[ProdID] = [GP].[ProdID] and [DES].[TrackNum] = [INV].[TrackNum]
where [INV].[TrackNum] = '499' AND [INV].[InvoiceID] = '718'
order by [DES].[ProdID] asc;
Note: You normally only use a 'CROSS APPLY' for queries where you want to run/evaluate something per row in your main table.

In this case an INNER JOIN is sufficient. You don't need to use CROSS APPLY.
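To see how the join conditions discussed above change the row count, here is a minimal sketch that loads the sample tables into an in-memory SQLite database through Python's `sqlite3` module. SQLite is only a stand-in for SQL Server here, and storing the ID columns as text (so that `004` keeps its leading zeros) is an assumption, as is treating the blank description as NULL.

```python
import sqlite3

# Sketch only: SQLite stands in for SQL Server; ID columns are assumed to be text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Descriptions (ProdID TEXT, Description TEXT, TrackNum TEXT);
CREATE TABLE Products     (ProdID TEXT, ProductName TEXT, Price REAL);
CREATE TABLE Invoices     (ProdID TEXT, TrackNum TEXT, InvoiceID TEXT);
INSERT INTO Descriptions VALUES
  ('361','Test 1','499'), ('388','Test 2','003'), ('004',NULL,'5599'),
  ('238','Test 3','499'), ('361','Test 10','555'), ('004','Test 40','555');
INSERT INTO Products VALUES
  ('361','P1',5.0), ('388','P2',5.0), ('004','P3',12.0), ('238','P4',6.0),
  ('515','P5',7.0), ('636','P6',7.0), ('775','P7',7.0), ('858','P8',8.0);
INSERT INTO Invoices VALUES
  ('361','499','718'), ('388','199','718'), ('004','499','718'),
  ('238','499','718'), ('361','555','333'), ('004','555','444'),
  ('361','111','444'), ('388','222','333'), ('616','116','565'),
  ('717','116','565'), ('361','003','221'), ('388','003','221'),
  ('004','5599','728');
""")

# Joining Descriptions on ProdID alone: every description of a product matches,
# which is why the original query returns more rows than wanted.
prod_only_count = conn.execute("""
SELECT COUNT(*)
FROM Invoices AS inv
INNER JOIN Products AS gp ON gp.ProdID = inv.ProdID
INNER JOIN Descriptions AS des ON des.ProdID = gp.ProdID
WHERE inv.TrackNum = '499' AND inv.InvoiceID = '718'
""").fetchone()[0]

# INNER JOIN on ProdID and TrackNum: only invoice rows with a matching description.
inner_rows = conn.execute("""
SELECT inv.ProdID, des.Description, inv.TrackNum
FROM Invoices AS inv
INNER JOIN Products AS gp ON gp.ProdID = inv.ProdID
INNER JOIN Descriptions AS des
        ON des.ProdID = gp.ProdID AND des.TrackNum = inv.TrackNum
WHERE inv.TrackNum = '499' AND inv.InvoiceID = '718'
ORDER BY inv.ProdID
""").fetchall()

# LEFT JOIN on the same keys: keeps invoice rows that have no matching description.
left_rows = conn.execute("""
SELECT inv.ProdID, des.Description, inv.TrackNum
FROM Invoices AS inv
INNER JOIN Products AS gp ON gp.ProdID = inv.ProdID
LEFT JOIN Descriptions AS des
       ON des.ProdID = gp.ProdID AND des.TrackNum = inv.TrackNum
WHERE inv.TrackNum = '499' AND inv.InvoiceID = '718'
ORDER BY inv.ProdID
""").fetchall()

print(prod_only_count)  # rows when Descriptions is joined on ProdID only
print(inner_rows)
print(left_rows)
```

On this sample data the ProdID-only join yields five rows, the INNER JOIN on both keys yields two (products 238 and 361), and the LEFT JOIN yields three, with a NULL description for product 004. Which of these matches the intent depends on whether TrackNum is really part of the key in Descriptions.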

Related

Is there a way to get the SUM() of a column without GROUP BY when joining multiple tables in SQL Server

I am getting the SUM() of Amount in the CrowdfundedUser table with GROUP BY CrowdfundID, but it is difficult to get the SUM() because all columns are unique.
Crowdfund:
CrowdfundID | GoalAmount | StartedDate
9           | 10000      | 09/02/2022
5           | 20000      | 10/02/2022
55          | 350000     | 11/02/2022
444         | 541256     | 12/02/2022
54          | 78458      | 13/02/2022
CrowdfundedUser:
ID  | User ID | CrowdfundID | Amount
744 | 12214   | 9           | 1000
745 | 4124    | 5           | 8422
746 | 12214   | 55          | 784
747 | 12214   | 444         | 874
748 | 64554   | 54          | 652
CrowdfundiPaymentTransaction:
CrowdfundedUserID | Invoice    | Amount | PaymentDate
744               | RA45A14124 | 1000   | 09/02/2022
745               | RA45A12412 | 8422   | 10/02/2022
746               | RA45U14789 | 784    | 11/02/2022
747               | RA45F12457 | 874    | 12/02/2022
748               | RA45M00124 | 652    | 13/02/2022
My query :
SELECT
c.CrowdfundID,
SUM(cu.Amount),
SUM(cpt.Amount)
FROM
Crowdfund c
INNER JOIN
CrowdfundedUser cu ON c.CrowdfundID = cu.CrowdfundID
INNER JOIN
CrowdfundiPaymentTransaction cpt ON cu.ID = cpt.CrowdfundedUserID
GROUP BY
c.CrowdfundID
SELECT c.CrowdfundID,
SUM(cu.Amount) OVER (
ORDER BY c.CrowdfundID) Amount,
SUM(cpt.Amount) OVER (
ORDER BY c.CrowdfundID) CptAmount
FROM Crowdfund c
INNER JOIN CrowdfundedUser cu ON c.CrowdfundID = cu.CrowdfundID
INNER JOIN CrowdfundiPaymentTransaction cpt ON cu.ID = cpt.CrowdfundedUserID
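As a rough illustration of what the windowed SUM() does, the sketch below contrasts `OVER (ORDER BY ...)`, `OVER (PARTITION BY ...)` and `OVER ()` on the CrowdfundedUser sample rows. It runs on an in-memory SQLite database (3.25 or newer, for window functions) as a stand-in for SQL Server, and it renames the `User ID` column to `UserID` for convenience. Both are assumptions made for the example.

```python
import sqlite3

# Sketch only: SQLite stands in for SQL Server; "User ID" is renamed to UserID.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CrowdfundedUser (ID INTEGER, UserID INTEGER, CrowdfundID INTEGER, Amount INTEGER);
INSERT INTO CrowdfundedUser VALUES
  (744, 12214,   9, 1000),
  (745,  4124,   5, 8422),
  (746, 12214,  55,  784),
  (747, 12214, 444,  874),
  (748, 64554,  54,  652);
""")

rows = conn.execute("""
SELECT CrowdfundID,
       Amount,
       SUM(Amount) OVER (ORDER BY CrowdfundID)     AS RunningTotal,  -- accumulates row by row
       SUM(Amount) OVER (PARTITION BY CrowdfundID) AS PerCrowdfund,  -- total per CrowdfundID
       SUM(Amount) OVER ()                         AS GrandTotal     -- total over all rows
FROM CrowdfundedUser
ORDER BY CrowdfundID
""").fetchall()

for row in rows:
    print(row)
```

With `ORDER BY` inside the window, the sum is a running total, so it only equals the per-group total when `PARTITION BY` is used instead. No rows are collapsed in either case, which is the difference from GROUP BY.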

Merge rows by adding the deviating value as new columns

I want to output for each person that has filled out a specific questionnaire their personal data as well as the data from the specific questionnaire/assessment. The values are stored in the table criteriaDF.
The result of my current query:
KlientId | Name1        | Name2        | CriteriaName | Result
335      | Name1Person1 | Name2Person1 | IF1          | Yes
335      | Name1Person1 | Name2Person1 | IF2          | Yes
335      | Name1Person1 | Name2Person1 | IF3          | No
336      | Name1Person2 | Name2Person2 | IF1          | Yes
336      | Name1Person2 | Name2Person2 | IF2          | Yes
336      | Name1Person2 | Name2Person2 | IF3          | No
What I want to have:
KlientId | Name1        | Name2        | IF1 | IF2 | IF3
335      | Person1Name1 | Person1Name2 | Yes | Yes | No
336      | Person2Name1 | Person2Name2 | Yes | No  | No
So each criterion should get its own column, and the rows referencing the same person should be merged into one, based on the same KlientId.
The query I used to get my current result uses a few joins in order to get from person to client, to process to assessment to criteria, where the CriteriaName and Result lies. The other tables are just used with their foreign keys to get to these values "IF1: Yes" etc.
SELECT Client.Id, Person.Name1, Person.Name2, Person.Birthday, CriteriaDF.CriteriaName, CriteriaDF.Result
FROM Client
INNER JOIN Person ON Client.PersonId=Person.Id
INNER JOIN Process ON Client.Id=Process.ClientId
INNER JOIN AssessmentDF ON Process.Id=AssessmentDF.ProcessId
INNER JOIN CriteriaDF ON AssessmentDF.Id=CriteriaDF.AssessmentDfId
WHERE AssessmentDF.Name='RightAssessmentName' AND AssessmentDF.Date > DATEADD(day, -180, GETDATE())
Edit: Query using aliases:
SELECT t1.Id, t2.Name1, t2.Name2, t2.Birthday, t5.CriteriaName, t5.Result
FROM Client AS t1
INNER JOIN Person AS t2 ON t1.PersonId=t2.Id
INNER JOIN Process AS t3 ON t1.Id=t3.ClientId
INNER JOIN AssessmentDF AS t4 ON t3.Id=t4.ProcessId
INNER JOIN CriteriaDF AS t5 ON t4.Id=t5.AssessmentDfId
WHERE t4.Name='RightAssessmentName' AND t4.Date > DATEADD(day, -180, GETDATE())
My main question is how to convert the (CriteriaName, Result) tuples into one new column per unique CriteriaName and fill each cell with the Result.
I don't think they're important, but here are minimal versions of all the tables used to get from the person to the criteria (the Ids might not fit the result and the desired output perfectly; they are only meant to show how the data is stored):
Table Person:
Id   | Name1        | Name2
2766 | Person1Name1 | Person1Name2
2767 | Person2Name2 | Person2Name2
2768 | Person3Name2 | Person3Name2
Table Klient:
Id  | PersonId
1   | 2766
335 | 2767
336 | 2768
Table Process:
Id   | KlientId
2485 | 335
2515 | 336
2519 | 336
Table AssessmentDF:
Id | ProcessId | Date       | Name
43 | 2485      | 2022-04-18 | RightAssessmentName
44 | 2515      | 2022-05-18 | RightAssessmentName
45 | 2519      | 2022-06-18 | RightAssessmentName
Table CriteriaDF:
In reality there are criteria IF1 through IF19.
Id  | AssessmentDfId | CriteriaName | ProcessId | Result
551 | 43             | IF1          | 2485      | Yes
552 | 43             | IF2          | 2485      | Yes
553 | 43             | IF3          | 2485      | No
554 | 44             | IF1          | 2515      | Yes
555 | 44             | IF2          | 2515      | Yes
556 | 44             | IF3          | 2515      | No
557 | 45             | IF1          | 2519      | Yes
558 | 45             | IF2          | 2519      | No
559 | 45             | IF3          | 2519      | No
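One common way to turn (CriteriaName, Result) rows into columns is conditional aggregation: group by the person columns and pick each criterion's value with `MAX(CASE ...)`. The sketch below applies it to the "current result" rows shown in the question, loaded into a hypothetical table named `CurrentResult` in an in-memory SQLite database. The table name and the use of SQLite instead of SQL Server are assumptions made for illustration.

```python
import sqlite3

# Sketch only: CurrentResult is a hypothetical table holding the output of the
# existing join query; SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CurrentResult (KlientId INTEGER, Name1 TEXT, Name2 TEXT,
                            CriteriaName TEXT, Result TEXT);
INSERT INTO CurrentResult VALUES
  (335, 'Name1Person1', 'Name2Person1', 'IF1', 'Yes'),
  (335, 'Name1Person1', 'Name2Person1', 'IF2', 'Yes'),
  (335, 'Name1Person1', 'Name2Person1', 'IF3', 'No'),
  (336, 'Name1Person2', 'Name2Person2', 'IF1', 'Yes'),
  (336, 'Name1Person2', 'Name2Person2', 'IF2', 'Yes'),
  (336, 'Name1Person2', 'Name2Person2', 'IF3', 'No');
""")

# One output row per KlientId; each CASE keeps only the matching criterion's
# Result (NULL otherwise) and MAX collapses the group to that single value.
pivoted = conn.execute("""
SELECT KlientId, Name1, Name2,
       MAX(CASE WHEN CriteriaName = 'IF1' THEN Result END) AS IF1,
       MAX(CASE WHEN CriteriaName = 'IF2' THEN Result END) AS IF2,
       MAX(CASE WHEN CriteriaName = 'IF3' THEN Result END) AS IF3
FROM CurrentResult
GROUP BY KlientId, Name1, Name2
ORDER BY KlientId
""").fetchall()

for row in pivoted:
    print(row)
```

The same SELECT list should carry over to the original five-table join by replacing `CurrentResult` with that join and extending the CASE expressions to IF19. If a client can have several matching assessments in the date window, MAX would silently merge them (by string order), so the query may first need to restrict each client to a single assessment.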

Query returns a few extra records

I have these tables and this query in an Access database:
samples
hole_id | depth_from | depth_to |
DH001 100 105
DH001 105 120
DH001 110 115
DH001 115 120
overlapping_samples (and therefore the correct output)
hole_id | depth_from | depth_to |
DH001 110 115
DH001 115 120
query
SELECT a.*
FROM samples AS a
INNER JOIN overlapping_samples AS o
ON a.hole_id=o.hole_id
WHERE a.hole_id=o.hole_id AND a.depth_to=o.depth_to
;
results
hole_id | depth_from | depth_to |
DH001 100 105
DH001 110 115
DH001 115 120
It's very simple. The result is almost ok, but it includes some extra records from the left table (i.e. samples). In fact, in the example above it may not necessarily return the extra row. Only a small percentage are.
If not obvious, I want to return all the records from the left table that match to the right table. The right table is actually a subset of the left, and therefore the query should have the same number of records. It's intended for a DELETE statement, but
I've changed your query to:
SELECT a.hole_id as ahole_id, a.depth_from as adepth_from, a.depth_to as adepth_to,o.hole_id as ohole_id, o.depth_from as odepth_from, o.depth_to as odepth_to
FROM samples AS a
LEFT JOIN overlapping_samples AS o ON a.hole_id=o.hole_id AND a.depth_to=o.depth_to AND a.depth_from=o.depth_from
WHERE a.hole_id=o.hole_id AND a.depth_to=o.depth_to;
and it gave me this result:
ahole_id | adepth_from | adepth_to | ohole_id | odepth_from | odepth_to |
DH001 110 115 DH001 110 115
DH001 115 120 DH001 115 120
Is that what you were looking for?
This may work:
SELECT a.*
FROM samples a
JOIN overlapping_samples o ON a.hole_id = o.hole_id
WHERE a.depth_from = o.depth_from
AND a.depth_to = o.depth_to;
I fixed a problem in the WHERE clause, changing it from:
a.hole_id=o.hole_id
to:
a.depth_from = o.depth_from
The hole_id comparison is already present in JOIN ... ON a.hole_id = o.hole_id.
If you still don't get the correct count, you may need to look at your data and add an extra condition in either the WHERE or the JOIN clause.
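The effect of the missing depth_from condition can be reproduced on the sample rows. The sketch below uses an in-memory SQLite database rather than Access (an assumption made so the example is runnable) and compares the original join with one that matches on all three columns.

```python
import sqlite3

# Sketch only: SQLite stands in for Access; data is the sample from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE samples             (hole_id TEXT, depth_from INTEGER, depth_to INTEGER);
CREATE TABLE overlapping_samples (hole_id TEXT, depth_from INTEGER, depth_to INTEGER);
INSERT INTO samples VALUES
  ('DH001', 100, 105), ('DH001', 105, 120), ('DH001', 110, 115), ('DH001', 115, 120);
INSERT INTO overlapping_samples VALUES
  ('DH001', 110, 115), ('DH001', 115, 120);
""")

# Original condition: hole_id and depth_to only. Two different samples that end
# at the same depth (105-120 and 115-120) both match the 115-120 overlap row.
original_rows = conn.execute("""
SELECT a.*
FROM samples AS a
INNER JOIN overlapping_samples AS o ON a.hole_id = o.hole_id
WHERE a.depth_to = o.depth_to
ORDER BY a.depth_from, a.depth_to
""").fetchall()

# Matching on all three columns returns exactly the rows in overlapping_samples.
fixed_rows = conn.execute("""
SELECT a.*
FROM samples AS a
INNER JOIN overlapping_samples AS o
        ON a.hole_id = o.hole_id
       AND a.depth_from = o.depth_from
       AND a.depth_to = o.depth_to
ORDER BY a.depth_from, a.depth_to
""").fetchall()

print(original_rows)
print(fixed_rows)
```

On this data the original condition returns three rows, including the extra 105-120 sample, while the three-column match returns only the two overlapping rows.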

Convert This SQL Query to ANSI SQL

I would like to convert this SQL query into ANSI SQL. I am having trouble wrapping my head around the logic of this query.
I use Snowflake Data Warehouse, but it does not understand this query because of the 'delete' statement right before join, so I am trying to break it down. From my understanding the row number column is giving me the order from 1 to N based on timestamp and placing it in C. Then C is joined against itself on the rows other than the first row (based on id) and placed in C1. Then C1 is deleted from the overall data, which leaves only the first row.
I may be understanding the logic incorrectly, but I am not used to seeing the 'delete' statement right before a join. Let me know if I got the logic right, or point me in the right direction.
This query was copy/pasted from THIS stackoverflow question which has the exact situation I am trying to solve, but on a much larger scale.
with C as
(
select ID,
row_number() over(order by DT) as rn
from YourTable
)
delete C1
from C as C1
inner join C as C2
on C1.rn = C2.rn-1 and
C1.ID = C2.ID
The specific problem I am trying to solve is this. Let's assume I have this table. I need to partition the rows by primary key combinations (primKey 1 & 2) while maintaining timestamp order.
ID primKey1 primKey2 checkVar1 checkVar2 theTimestamp
100 1 2 302 423 2001-07-13
101 3 6 506 236 2005-10-25
100 1 2 302 423 2002-08-15
101 3 6 506 236 2008-12-05
101 3 6 300 100 2010-06-10
100 1 2 407 309 2005-09-05
100 1 2 302 423 2012-05-09
100 1 2 302 423 2003-07-24
Once the rows are partitioned and the timestamp is ordered within each partition, I need to delete the duplicate checkVar combination (checkVar 1 & 2) rows until the next change. Thus leaving me with the earliest unique row. The rows with asterisks are the ones which need to be removed since they are duplicates.
ID primKey1 primKey2 checkVar1 checkVar2 theTimestamp
100 1 2 302 423 2001-07-13
*100 1 2 302 423 2002-08-15
*100 1 2 302 423 2003-07-24
100 1 2 407 309 2005-09-05
100 1 2 302 423 2012-05-09
101 3 6 506 236 2005-10-25
*101 3 6 506 236 2008-12-05
101 3 6 300 100 2010-06-10
This is the final result. As you can see for ID=100, even though the 1st and 3rd record are the same, the checkVar combination changed in between, which is fine. I am only removing the duplicates until the values change.
ID primKey1 primKey2 checkVar1 checkVar2 theTimestamp
100 1 2 302 423 2001-07-13
100 1 2 407 309 2005-09-05
100 1 2 302 423 2012-05-09
101 3 6 506 236 2005-10-25
101 3 6 300 100 2010-06-10
If you want to keep the earliest row for each id, then you can use:
delete from yourtable yt
where yt.dt > (select min(yt2.dt)
               from yourtable yt2
               where yt2.id = yt.id
              );
Your query would not do this, if that is your intent.
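For the behaviour described in the question (drop consecutive duplicates of the checkVar pair within each primKey partition, keeping the earliest row of each run), one option is to compare every row with its predecessor using LAG(). The sketch below runs that idea on the sample rows in an in-memory SQLite database (3.25 or newer). SQLite as a stand-in for Snowflake and the table name `YourTable` for this data are assumptions, and the query only selects the surviving rows instead of deleting anything.

```python
import sqlite3

# Sketch only: SQLite stands in for Snowflake; rows are the sample from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE YourTable (ID INTEGER, primKey1 INTEGER, primKey2 INTEGER,
                        checkVar1 INTEGER, checkVar2 INTEGER, theTimestamp TEXT);
INSERT INTO YourTable VALUES
  (100, 1, 2, 302, 423, '2001-07-13'),
  (101, 3, 6, 506, 236, '2005-10-25'),
  (100, 1, 2, 302, 423, '2002-08-15'),
  (101, 3, 6, 506, 236, '2008-12-05'),
  (101, 3, 6, 300, 100, '2010-06-10'),
  (100, 1, 2, 407, 309, '2005-09-05'),
  (100, 1, 2, 302, 423, '2012-05-09'),
  (100, 1, 2, 302, 423, '2003-07-24');
""")

# For each row, look up the checkVar pair of the previous row in the same
# (primKey1, primKey2) partition, ordered by timestamp. A row is kept when it
# is the first in its partition or when the pair differs from the previous row.
# Assumes checkVar1/checkVar2 are never NULL.
kept = conn.execute("""
WITH flagged AS (
    SELECT ID, primKey1, primKey2, checkVar1, checkVar2, theTimestamp,
           LAG(checkVar1) OVER (PARTITION BY primKey1, primKey2
                                ORDER BY theTimestamp) AS prevVar1,
           LAG(checkVar2) OVER (PARTITION BY primKey1, primKey2
                                ORDER BY theTimestamp) AS prevVar2
    FROM YourTable
)
SELECT ID, primKey1, primKey2, checkVar1, checkVar2, theTimestamp
FROM flagged
WHERE prevVar1 IS NULL
   OR prevVar1 <> checkVar1
   OR prevVar2 <> checkVar2
ORDER BY ID, theTimestamp
""").fetchall()

for row in kept:
    print(row)
```

The selected rows match the final result listed in the question. Turning this into an actual delete (or into a table rewrite from the selected rows) depends on what the target engine allows.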

Quantifying the unique number of records using fuzzy matching

I am currently inner joining a customer table using the mds.mdq.similarity function in SQL Server to fuzzy match the customer name records:
Select a.CUST_ID as a_CUST_ID
,a.CU_NAME as a_CU_NAME
,b.CUST_ID as b_CUST_ID
,b.CU_NAME as b_CU_NAME
from #tmp a
inner join #tmp b
on a.CUST_ID > b.CUST_ID
and (mds.mdq.Similarity (a.CU_NAME, b.CU_NAME, 2, 0, 0)) > 0.9
Now, running this query gives me the following sample table:
a_CUST_ID a_CU_NAME b_CUST_ID b_CU_NAME
112 abc 111 abbc
113 abc- 111 abbc
111 abbc 110 abc_
112 abc 110 abc_
114 xyz 115 xyz-
What I would like to find is a way to quantify the number of "unique" CU_NAMEs from this ("unique" being as per the mds.mdq.similarity matching logic).
In the above sample, we would say 110 ~ 111 ~ 112 ~ 113 and 114 ~ 115. Hence, there would be 2 "unique" CU_NAMEs. The expected outcome would be:
Number_of_Unique_CU_NAME
2
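Treating every matched pair as an edge between two customers, the number of "unique" names is the number of connected components in that graph. A minimal sketch of this, done outside SQL in Python with a small union-find over the sample pairs (the pairs are copied from the table above; doing the grouping client-side is an assumption, not something the question specifies):

```python
# Matched (a_CUST_ID, b_CUST_ID) pairs from the sample output above.
pairs = [(112, 111), (113, 111), (111, 110), (112, 110), (114, 115)]

parent = {}  # union-find forest: customer id -> parent id


def find(x):
    """Return the representative of x's group, compressing paths on the way."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x


def union(a, b):
    """Merge the groups containing a and b."""
    root_a, root_b = find(a), find(b)
    if root_a != root_b:
        parent[root_a] = root_b


for a, b in pairs:
    union(a, b)

# Collect the members of each group under its representative.
groups = {}
for cust_id in list(parent):
    groups.setdefault(find(cust_id), set()).add(cust_id)

number_of_unique_cu_name = len(groups)
print(number_of_unique_cu_name)
print(sorted(sorted(g) for g in groups.values()))
```

On the sample pairs this groups 110, 111, 112 and 113 together and 114 with 115, giving 2. Customers that never appear in any matched pair are not part of this input, so they would have to be counted separately as one name each.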