Inner Join without repeating value, keeping the rows of the first table

I have a table with the discount for each invoice. For example:
Invoice Number|Discount
------------------------
1 | 3
2 | 5
3 | 6
I need to pull these discounts into the invoice lines table (they apply only to the invoice total, not to a particular line), and I cannot lose any line.
Example: if invoice 1 has 5 lines, I need all 5 lines to show up, but I want the discount to appear only once (on the first line, for example).
Expected:
Invoice Number|Discount
------------------------
1 | 3
1 | null
1 | null
1 | null
1 | null
If I have an Invoice table and an InvoiceLines table that can be joined on the invoice number, how can I get the result I need?
I tried this query without success:
Select
ROW_NUMBER() over(order by v.num_fra)as Rank,
l.*,
v.ctdrap_div as discount
from ffac_vta v --(invoicetable)
join ffac_hla l --(invoice lines table)
ON v.num_fra = l.num_fra
Can you help me?

Here is another way to do this. The subquery pulls the line-item info and numbers the rows within each invoice (partitioned by invoice number). You then LEFT OUTER JOIN that subset to the table with the discount value ONLY when the row number = 1. This approach doesn't require a CASE expression, since the LEFT OUTER JOIN gives you NULL for every row number above 1.
SELECT Sub.*,
V.ctdrap_div AS [Discount]
FROM
(
SELECT *,
-- order by the line key instead if a specific line should carry the discount
ROW_NUMBER() OVER(PARTITION BY L.num_fra ORDER BY L.num_fra) AS [Row Number]
FROM ffac_hla L
) Sub
LEFT OUTER JOIN ffac_vta V
ON V.num_fra = Sub.num_fra
AND Sub.[Row Number] = 1
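To see the effect on a small data set, here is a minimal sketch of the same LEFT JOIN idea. The rows and the `line_id` column are made up for illustration (the real key of `ffac_hla` is not shown in the question), and the two tables are simulated with CTEs so the query can be run on its own.

```sql
-- Hypothetical sample data: two invoices and five invoice lines.
WITH invoice (num_fra, ctdrap_div) AS (
    SELECT 1, 3 UNION ALL
    SELECT 2, 5
),
invoice_line (num_fra, line_id) AS (   -- line_id is an assumed line key
    SELECT 1, 10 UNION ALL
    SELECT 1, 11 UNION ALL
    SELECT 1, 12 UNION ALL
    SELECT 2, 20 UNION ALL
    SELECT 2, 21
),
numbered AS (
    -- Number the lines inside each invoice, starting again at 1 per invoice.
    SELECT l.num_fra,
           l.line_id,
           ROW_NUMBER() OVER (PARTITION BY l.num_fra ORDER BY l.line_id) AS rn
    FROM invoice_line AS l
)
SELECT n.num_fra,
       n.line_id,
       v.ctdrap_div AS discount       -- NULL unless this is line 1 of the invoice
FROM numbered AS n
LEFT JOIN invoice AS v
       ON v.num_fra = n.num_fra
      AND n.rn = 1
ORDER BY n.num_fra, n.line_id;
```

With this data, all five lines come back, and only lines 10 and 20 (the first line of each invoice) carry a discount; the other three show NULL.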

It seems like you should be able to just change your join to a left join.
Select
ROW_NUMBER() over(order by v.num_fra)as Rank,
l.*,
v.ctdrap_div as discount
from ffac_vta v --(invoicetable)
left join ffac_hla l --(invoice lines table)
ON v.num_fra = l.num_fra

You will need to order the rows by the key of your lines table (or some other column). Something like this. ROW_NUMBER() has been available since SQL Server 2005, so it works on SQL Server 2008.
SELECT T.num_fra,
CASE WHEN T.rank = 1 THEN T.Discount
ELSE NULL END AS Discount
FROM
(
Select
ROW_NUMBER() over(PARTITION BY v.num_fra ORDER BY <<ADD THE KEY OF INVOICE LINES HERE>>)as Rank,
l.*,
v.ctdrap_div as discount
from ffac_vta v --(invoicetable)
join ffac_hla l --(invoice lines table)
ON v.num_fra = l.num_fra
) AS T

Change to
;WITH cte AS (SELECT ROW_NUMBER() over(PARTITION BY l.num_fra ORDER BY l.num_fra) as Rank, -- restart the numbering for each invoice
l.*,
v.ctdrap_div as discount
FROM ffac_vta v --(invoicetable)
JOIN ffac_hla l --(invoice lines table)
ON v.num_fra = l.num_fra
)
SELECT num_fra
, CASE WHEN Rank = 1 THEN discount ELSE 0 END AS discount_once
, *
FROM cte;

I think you could use a left join.
For more information about joins, see this site: https://www.quora.com/SQL-What-is-the-difference-between-various-types-of-joins
I hope this helps.

Try this:
select i.InvoiceID,Discount
from invoicedetail i
left join invoicediscount id on i.invoiceID=id.invoiceid and i.linenumber=1

Related

How to show ONLY the max value of a inner join table column?

I used INNER JOIN on two tables :
Transactions
- transaction_id (PK)
- ticket_id (FK), references ticketsforsale
Ticketsforsale :
- ticket_id (PK)
- type
- price
(there are more columns in each table but serve no purpose for this question)
The query I tried is the following:
SELECT ticketsforsale.type , SUM(ticketsforsale.price) AS TotalProfit
FROM ticketsforsale INNER JOIN transactions
ON ticketsforsale.ticket_id = transactions.ticket_id
GROUP BY ticketsforsale.type
The result is :
Sports | 300
Cruise | 600
Theater| 100
I tried using this line in the query
WHERE TotalProfit = SELECT(MAX(TotalProfit)
But I can't figure out the right place for this line.
What I want is for the query to show only the row containing the max value of "TotalProfit". I am just missing the right way to use MAX in this query. Thanks!

Use ORDER BY and limit the result set to one row:
SELECT tfs.type , SUM(tfs.price) AS TotalProfit
FROM ticketsforsale tfs INNER JOIN
transactions t
ON tfs.ticket_id = t.ticket_id
GROUP BY tfs.type
ORDER BY TotalProfit DESC
FETCH FIRST 1 ROW ONLY;
Note that I introduced table aliases as well, so the query is easier to write and to read.
If every ticket appears in exactly one transaction, you don't need the JOIN at all (otherwise the two queries can return different totals):
SELECT tfs.type , SUM(tfs.price) AS TotalProfit
FROM ticketsforsale tfs
GROUP BY tfs.type
ORDER BY TotalProfit DESC
FETCH FIRST 1 ROW ONLY;

You can use a CTE and pick only one row based on the TotalProfit values.
with cte as (
SELECT ticketsforsale.type , SUM(ticketsforsale.price) AS TotalProfit
FROM ticketsforsale INNER JOIN transactions
ON ticketsforsale.ticket_id = transactions.ticket_id
GROUP BY ticketsforsale.type
)
select *
from cte
order by TotalProfit desc
limit 1
If you want to use max(), you can do something like this:
with cte as (
SELECT ticketsforsale.type , SUM(ticketsforsale.price) AS TotalProfit
FROM ticketsforsale INNER JOIN transactions
ON ticketsforsale.ticket_id = transactions.ticket_id
GROUP BY ticketsforsale.type
)
select *
from cte
where TotalProfit = (select max(TotalProfit) from cte)
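One practical difference between the two variants shows up when two types tie for the highest total. The sketch below uses made-up totals (Theater is set to 600 here purely to force a tie) in place of the grouped join, and assumes a database that accepts a CTE built from `SELECT ... UNION ALL` without a FROM clause.

```sql
-- Hypothetical totals standing in for the GROUP BY result; two types tie at 600.
WITH cte (type, TotalProfit) AS (
    SELECT 'Sports', 300 UNION ALL
    SELECT 'Cruise', 600 UNION ALL
    SELECT 'Theater', 600
)
SELECT type, TotalProfit
FROM cte
-- The scalar subquery yields the single highest total; every row equal to it is kept.
WHERE TotalProfit = (SELECT MAX(TotalProfit) FROM cte)
ORDER BY type;
```

Here the max() form returns both Cruise and Theater, while ordering by TotalProfit and keeping one row would return only one of them, chosen arbitrarily.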

Last two row from multiple records assistance

I'm having a little problem that I'm not sure how to get around, and I'm hoping someone here can assist. I need to run a SELECT over multiple records and retrieve the last two records for each p.CustID. When I enter a single p.CustID, the code works fine. However, I need to remove the WHERE clause and have it retrieve the last two records for each p.CustID (about 14,000 records in total). When I remove the WHERE clause, it returns only two records in total, which are the top two records from the table in my FROM clause, [DB_User].[dbo].[P1ASellers]. I tried using this in a CTE but still cannot get it to return what I need.
The code I'm using is below:
SELECT TOP (2)
cbc.StorePartnerCustConfigID,
p.CustID,
cbc.ConfigID,
cbc.EffectiveDate,
ROW_NUMBER() OVER (ORDER BY cbc.StorePartnerID DESC) AS RowNum
FROM [DB_User].[dbo].[P1ASellers] p
INNER JOIN [ACA].dbo.tblConfig_StorePartnerConfig BP
ON BP.EntityID=CAST(p.CustID AS VARCHAR)
INNER JOIN [ACA].dbo.tblConfig_StorePartner CBP
ON CBP.StorePartnerID=BP.StorePartnerID
INNER JOIN [ACA].dbo.tblConfig_StorePartnerCustConfig CBC
ON CBP.StorePartnerID=CBC.StorePartnerID
AND cbc.ProcessConfigID IN (1,2,3,4)
INNER JOIN [ACA].dbo.tblConfig_StorePartnerCustConfig CBC2
ON CBC.StorePartnerID=CBC2.StorePartnerID
AND cbc2.ConfigID IN (1,2,3,4)
WHERE p.CustID=55555 -- <- need to remove this WHERE
ORDER BY cbc.StorePartnerID DESC
The results from the query
StorePartnerCustConfigID CustID ConfigID EffectiveDate RowNum
15031 55555 4 2015-06-25 1
15032 55555 1 2015-06-25 2
What I actually get after I remove the where clause:
StorePartnerCustConfigID CustID ConfigID EffectiveDate RowNum
68995 89566 2 2011-03-02 1
68996 89566 1 2011-03-02 2
what I expect after I remove the where clause:
StorePartnerCustConfigID CustID ConfigID EffectiveDate RowNum
15031 55555 4 2015-06-25 1
15032 55555 1 2015-06-25 2
64584 65486 2 2013-04-16 1
64585 65486 1 2013-04-16 2
So on and so on.......
Any input greatly appreciated, thanks!!

I think you are looking for top 2 records for each customer which you can get as below:
SELECT TOP (2) with ties
cbc.StorePartnerCustConfigID,
p.CustID,
cbc.ConfigID,
cbc.EffectiveDate,
ROW_NUMBER() OVER (PARTITION BY p.CustID ORDER BY cbc.StorePartnerID DESC) AS RowNum
FROM [DB_User].[dbo].[P1ASellers] p
INNER JOIN [ACA].dbo.tblConfig_StorePartnerConfig BP
ON BP.EntityID=CAST(p.CustID AS VARCHAR)
INNER JOIN [ACA].dbo.tblConfig_StorePartner CBP
ON CBP.StorePartnerID=BP.StorePartnerID
INNER JOIN [ACA].dbo.tblConfig_StorePartnerCustConfig CBC
ON CBP.StorePartnerID=CBC.StorePartnerID
AND cbc.ProcessConfigID IN (1,2,3,4)
INNER JOIN [ACA].dbo.tblConfig_StorePartnerCustConfig CBC2
ON CBC.StorePartnerID=CBC2.StorePartnerID
AND cbc2.ConfigID IN (1,2,3,4)
ORDER BY (ROW_NUMBER() OVER (PARTITION BY p.CustID ORDER BY cbc.StorePartnerID DESC) - 1) / 2 + 1
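If the WITH TIES trick feels too clever, a more common pattern is to number the rows per customer in a CTE and filter on that number. This is only a sketch: the rows are invented, the four-way join is replaced by a single hypothetical `config` CTE, and ordering by StorePartnerCustConfigID is an assumption about what "last two" means.

```sql
-- Hypothetical stand-in for the joined result: three config rows per customer.
WITH config (StorePartnerCustConfigID, CustID, ConfigID) AS (
    SELECT 15030, 55555, 2 UNION ALL
    SELECT 15031, 55555, 4 UNION ALL
    SELECT 15032, 55555, 1 UNION ALL
    SELECT 64583, 65486, 3 UNION ALL
    SELECT 64584, 65486, 2 UNION ALL
    SELECT 64585, 65486, 1
),
ranked AS (
    -- PARTITION BY restarts the numbering for every CustID.
    SELECT c.StorePartnerCustConfigID,
           c.CustID,
           c.ConfigID,
           ROW_NUMBER() OVER (PARTITION BY c.CustID
                              ORDER BY c.StorePartnerCustConfigID DESC) AS RowNum
    FROM config AS c
)
SELECT StorePartnerCustConfigID, CustID, ConfigID, RowNum
FROM ranked
WHERE RowNum <= 2          -- keep the two most recent rows of each customer
ORDER BY CustID, RowNum;
```

With this data the query returns two rows per customer (15032 and 15031 for 55555, 64585 and 64584 for 65486), numbered 1 and 2 within each customer.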

Count number of rows that were returned in a join

This query successfully returns only the first row of a join that could potentially have more than one row.
WITH RevisionProducts AS (
SELECT
qr.LeadID,
p.Code,
ROW_NUMBER() OVER(PARTITION BY qr.LeadID ORDER BY qr.LeadID DESC) rownumber
FROM
QuoteRevisions qr
JOIN ...
)
SELECT
l.LeadID,
...
co.Name,
rp1.Code,
0 AS CodeCount
FROM
Leads l
JOIN Companies co on co.CompanyID = l.CompanyID
JOIN RevisionProducts rp1 ON rp1.LeadID = l.ID AND rp1.rownumber = 1
What I want to do now is replace...
0 AS CodeCount
...with the actual number of rows that would have been returned in the join, had we allowed them all. Can't figure out how to do this.
I'm not sure I need the CTE, but I figured it might be handy since I most likely need to reference the same query again for the count?
EDIT:
OK, it looks like I need to be clearer. Take the query within the CTE and run it with a WHERE clause:
SELECT
qr.LeadID,
p.Code,
ROW_NUMBER() OVER(PARTITION BY qr.LeadID ORDER BY qr.LeadID DESC) rownumber
FROM
QuoteRevisions qr
JOIN ...
WHERE
qr.LeadID = 151
And let's say that query returns 5 rows. So, in the first query, if we DID NOT limit the join to the first row only, then the join would have returned 5 rows when Lead.LeadID got to 151. So in the final dataset, there would have been 5 rows that were identical except for the rp1.Code column.
I have already limited the number of rows to 1, which is what I wanted. But now, I want to know how many rows would have been returned.
I hope that makes more sense.

How about something like this?
WITH RevisionProducts AS (
SELECT
qr.LeadID,
p.Code,
ROW_NUMBER() OVER(PARTITION BY qr.LeadID ORDER BY qr.LeadID DESC) rownumber,
COUNT(*) OVER(PARTITION BY qr.LeadID ) totalrows -- ROWCOUNT is a reserved word in SQL Server, so use another alias
FROM
QuoteRevisions qr
JOIN ...
)
SELECT
l.ID,
...
co.Name,
rp1.Code,
rp1.totalrows AS CodeCount
FROM
Leads l
JOIN Companies co on co.CompanyID = l.CompanyID
JOIN RevisionProducts rp1 ON rp1.LeadID = l.ID AND rp1.rownumber = 1
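As a quick check of the idea, here is a small self-contained sketch. The rows are invented and the elided join is replaced by a hypothetical `RevisionProducts` CTE with literal values; the point is only that COUNT(*) OVER puts the per-lead total on every row, so it survives the rownumber = 1 filter.

```sql
-- Hypothetical rows: lead 151 has three codes, lead 152 has one.
WITH RevisionProducts (LeadID, Code) AS (
    SELECT 151, 'A' UNION ALL
    SELECT 151, 'B' UNION ALL
    SELECT 151, 'C' UNION ALL
    SELECT 152, 'D'
),
numbered AS (
    SELECT LeadID,
           Code,
           ROW_NUMBER() OVER (PARTITION BY LeadID ORDER BY Code) AS rownumber,
           COUNT(*)     OVER (PARTITION BY LeadID)               AS CodeCount
    FROM RevisionProducts
)
SELECT LeadID, Code, CodeCount
FROM numbered
WHERE rownumber = 1        -- one row per lead, but CodeCount still reflects all rows
ORDER BY LeadID;
```

Lead 151 comes back once with Code 'A' and CodeCount 3, and lead 152 once with Code 'D' and CodeCount 1.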

Returning ID's from two other tables or null if no IDs found using using a left join SQL Server

I am wondering if someone could help me. I am trying to join two tables and return an ID if one exists; if there is no ID, I want NULL, but the row for that product should still be returned rather than dropped. My query below returns twice as many records as expected, and I cannot figure out why.
SELECT
T2.ProductID, FirstChild.SupplierID, SecondChild.AccountID
FROM
Products T2
LEFT OUTER JOIN
(
SELECT TOP(1) SupplierID, Reference,CompanyID, Row_Number() OVER (Partition By SupplierID Order By SupplierID) AS RowNo FROM Suppliers
)
FirstChild ON T2.SupplierReference = FirstChild.Reference AND RowNo = 1 AND FirstChild.CompanyID = T2.CompanyID
LEFT OUTER JOIN
(
SELECT TOP(1) AccountID, SageKey,CompanyID, Row_Number() OVER (Partition By AccountID Order By AccountID) AS RowNo2 FROM Accounts
)
SecondChild ON T2.ProductAccountReference = SecondChild.Reference AND RowNo2 = 1 AND SecondChild.CompanyID =T2.CompanyID
Example of what I am trying to do
ProductID SupplierID AccountID
1 5 2
2 6 NULL
3 NULL NULL

OUTER APPLY and ditching the ROW_NUMBER seems like a better choice here:
SELECT
p.ProductId
,FirstChild.SupplierId
,SecondChild.AccountId
FROM
Products p
OUTER APPLY (SELECT TOP (1) s.SupplierId
FROM
Suppliers s
WHERE
p.SupplierReference = s.Reference
AND p.CompanyId = s.CompanyId
ORDER BY
s.SupplierId
) FirstChild
OUTER APPLY (SELECT TOP (1) a.AccountId
FROM
Accounts a
WHERE
p.ProductAccountReference = a.Reference
AND p.CompanyId = a.CompanyId
ORDER BY
a.AccountID
) SecondChild
The way your query is written, the derived tables are not correlated with the outer table. That means each one returns a single row, whichever SupplierID (or AccountID) SQL Server happens to pick, and if that row doesn't match the current product you get no value. You need to correlate the derived table with the outer row and select the top 1; adding an ORDER BY inside it then plays the role of choosing which row number you want.
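A small sketch of the correlated version, for SQL Server only (OUTER APPLY and TOP are T-SQL). The rows and reference values are invented, the tables are simulated with CTEs, and only the Suppliers lookup is shown; the Accounts lookup would follow the same shape.

```sql
-- Hypothetical data: product 1 matches two suppliers, product 3 matches none.
WITH Products (ProductID, CompanyID, SupplierReference) AS (
    SELECT 1, 10, 'S-A' UNION ALL
    SELECT 2, 10, 'S-B' UNION ALL
    SELECT 3, 10, 'S-X'
),
Suppliers (SupplierID, CompanyID, Reference) AS (
    SELECT 5, 10, 'S-A' UNION ALL
    SELECT 7, 10, 'S-A' UNION ALL
    SELECT 6, 10, 'S-B'
)
SELECT p.ProductID,
       FirstChild.SupplierID
FROM Products AS p
-- The inner query is evaluated once per product row, so TOP (1) is "top 1 for this product".
OUTER APPLY (SELECT TOP (1) s.SupplierID
             FROM Suppliers AS s
             WHERE s.Reference = p.SupplierReference
               AND s.CompanyID = p.CompanyID
             ORDER BY s.SupplierID) AS FirstChild
ORDER BY p.ProductID;
```

Each product appears exactly once: product 1 gets supplier 5 (the lowest matching ID), product 2 gets supplier 6, and product 3 gets NULL because OUTER APPLY keeps rows with no match.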

If it's just showing duplicate records, wouldn't an inelegant solution just be to add distinct in the select line?

Find MAX with JOIN where Field also shows up in another Table

I have 3 tables: Master, Paper and iCodes. For a certain set of Master.Ref's, I need to find Max(Paper.Date), where the Paper.Code is also in the iCodes table (i.e., Paper.Code is a type of iCode). Master is joined to Paper by the File field.
EDIT:
I only need the Max(Paper.Date) and its corresponding Code; I do not need all of the Codes.
I wrote the following but it is very slow. I have a few hundred ref #'s to look for. What is a better way to do this?
SELECT Master.Ref,
Paper.Code,
mp.MaxDate
FROM ( SELECT p.File ,
MAX(p.Date) AS MaxDate
FROM Paper AS p
LEFT JOIN Master AS m ON p.File = m.File
WHERE m.Ref IN ('ref1', 'ref2', 'ref3', 'ref4', 'ref5', 'ref6'... )
AND p.Code IN ( SELECT DISTINCT i.iCode
FROM iCodes AS i
)
GROUP BY p.File
) AS mp
LEFT JOIN Master ON mp.File = Master.File
LEFT JOIN Paper ON Master.File = Paper.File
AND mp.MaxDate = Paper.Date
WHERE Paper.Code IN ( SELECT DISTINCT iCodes.iCode
FROM iCodes
)

Does this do what you want?
SELECT m.Ref, p.Code, max(p.date)
FROM Master m LEFT JOIN
Paper p
ON m.File = p.File
WHERE p.Code IN (SELECT DISTINCT iCodes.iCode FROM iCodes) and
m.Ref IN ('ref1','ref2','ref3','ref4','ref5','ref6'...)
GROUP BY m.Ref, p.Code;
EDIT:
To get the code on the max date, then use window functions:
select ref, code, date
from (SELECT m.Ref, p.Code, p.date,
row_number() over (partition by m.Ref order by p.date desc) as seqnum
FROM Master m LEFT JOIN
Paper p
ON m.File = p.File
WHERE p.Code IN (SELECT DISTINCT iCodes.iCode FROM iCodes) and
m.Ref IN ('ref1','ref2','ref3','ref4','ref5','ref6'...)
) mp
where seqnum = 1;
The function row_number() assigns a sequential number starting at 1 to a group of rows. The groups are defined by the partition by clause, so in this case everything with the same m.Ref value would be in a single group. Within the group, rows are assigned the number based on the order by clause. So, the one with the biggest date gets the value of 1. That is the row you want.
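The numbering described above can be seen on a tiny example. The rows are invented, the Master/Paper join is replaced by a hypothetical `joined` CTE, and dates are written as ISO strings so they sort correctly as text.

```sql
-- Hypothetical joined rows: ref1 has two papers, ref2 has one.
WITH joined (Ref, Code, PaperDate) AS (
    SELECT 'ref1', 'C1', '2020-01-05' UNION ALL
    SELECT 'ref1', 'C2', '2021-03-10' UNION ALL
    SELECT 'ref2', 'C3', '2019-07-01'
),
numbered AS (
    -- Within each Ref, the newest date gets seqnum 1, the next newest 2, and so on.
    SELECT Ref,
           Code,
           PaperDate,
           ROW_NUMBER() OVER (PARTITION BY Ref ORDER BY PaperDate DESC) AS seqnum
    FROM joined
)
SELECT Ref, Code, PaperDate
FROM numbered
WHERE seqnum = 1
ORDER BY Ref;
```

This returns C2 (2021-03-10) for ref1 and C3 (2019-07-01) for ref2, that is, the code that belongs to the latest date of each Ref.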
