VLookup in SQL? - Joining to only pick out the top row

I am trying to get just the first row from a JOIN in SQL, something similar to VLOOKUP in Excel.
I have the following tables:
CREATE TABLE customer_lookup (
customer_product varchar(50),
supplier_product varchar(50),
customer_code varchar(10)
)
CREATE TABLE supplier (
part_number varchar(50)
)
INSERT INTO customer_lookup (
customer_product,
supplier_product,
customer_code ) VALUES ('CONTAINER', 'BOX', 'CUST01')
INSERT INTO customer_lookup (
customer_product,
supplier_product,
customer_code ) VALUES ('CONTAINER', 'BOX', 'CUST02')
INSERT INTO customer_lookup (
customer_product,
supplier_product,
customer_code ) VALUES ('FABRIC', 'MATERIAL', 'CUST01')
INSERT INTO supplier ( part_number ) VALUES ('FABRIC')
INSERT INTO supplier ( part_number ) VALUES ('CONTAINER')
INSERT INTO supplier ( part_number ) VALUES ('PAINT')
and my query is
SELECT
s.part_number, c.supplier_product, c.customer_code
FROM
supplier s
LEFT JOIN
(
SELECT * FROM customer_lookup t
) c
ON s.part_number = c.customer_product
http://sqlfiddle.com/#!6/716b5/1
The result I am trying to get is
part_number supplier_product customer_code
FABRIC MATERIAL CUST01
CONTAINER BOX CUST01
PAINT (null) (null)
but the above SQL query produces
part_number supplier_product customer_code
FABRIC MATERIAL CUST01
CONTAINER BOX CUST01
CONTAINER BOX CUST02
PAINT (null) (null)
I don't care that the CONTAINER row for customer_code CUST02 is missing. I just need the top one.
I have tried
SELECT
s.part_number, c.supplier_product, c.customer_code
FROM
supplier s
LEFT JOIN
(
SELECT TOP 1 * FROM customer_lookup t
) c
ON s.part_number = c.customer_product
but this just nulls out both the FABRIC and PAINT rows.
Any help would be appreciated.

You can use GROUP BY and MAX to achieve what you're looking for
SELECT
s.part_number, c.supplier_product, MAX(c.customer_code)
FROM
supplier s
LEFT JOIN
(
SELECT * FROM customer_lookup t
) c
ON s.part_number = c.customer_product
GROUP BY s.part_number, c.supplier_product
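This returns, for each unique combination of part_number and supplier_product, the highest customer_code value. Note that for CONTAINER this yields CUST02; use MIN instead of MAX if you want CUST01, as in the expected output.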
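To see how the choice of aggregate changes which customer_code survives, here is a minimal sketch using Python's sqlite3 module. SQLite stands in for SQL Server purely so the example is runnable (that is an assumption of this sketch, not part of the question), and the helper name one_row_per_part is hypothetical.

```python
import sqlite3

# SQLite is used here only as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_lookup (customer_product TEXT, supplier_product TEXT, customer_code TEXT);
CREATE TABLE supplier (part_number TEXT);
INSERT INTO customer_lookup VALUES
    ('CONTAINER', 'BOX', 'CUST01'),
    ('CONTAINER', 'BOX', 'CUST02'),
    ('FABRIC', 'MATERIAL', 'CUST01');
INSERT INTO supplier VALUES ('FABRIC'), ('CONTAINER'), ('PAINT');
""")

def one_row_per_part(aggregate):
    """Collapse the joined rows to one per part using MIN or MAX."""
    if aggregate not in ("MIN", "MAX"):
        raise ValueError("aggregate must be MIN or MAX")
    sql = f"""
        SELECT s.part_number, c.supplier_product, {aggregate}(c.customer_code)
        FROM supplier s
        LEFT JOIN customer_lookup c ON s.part_number = c.customer_product
        GROUP BY s.part_number, c.supplier_product
        ORDER BY s.part_number
    """
    return conn.execute(sql).fetchall()

rows_max = one_row_per_part("MAX")  # CONTAINER comes back with CUST02
rows_min = one_row_per_part("MIN")  # CONTAINER comes back with CUST01
```

PAINT has no match, so both variants return it with NULL (None in Python) in the looked-up columns.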

If you don't care which row qualifies as the top row, as long as it returns one row at most, then you can use the row_number window function with an arbitrary ordering. SQL Server does not accept a constant such as NULL directly in a window ORDER BY, so the usual idiom is order by (select null).
SELECT s.part_number, c.supplier_product, c.customer_code
FROM supplier s
LEFT JOIN (SELECT *,
row_number() over (partition by customer_product order by (select null)) as rn
FROM customer_lookup) c
ON s.part_number = c.customer_product
AND c.rn = 1
If you do care which row gets picked, then just modify the order by clause accordingly.
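As a rough illustration of the row_number approach with an explicit ordering, here is a sketch run against SQLite through Python's sqlite3 module (it needs SQLite 3.25 or newer for window functions). SQLite is an assumption made for runnability, and ordering by customer_code is my choice so that the picked row is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_lookup (customer_product TEXT, supplier_product TEXT, customer_code TEXT);
CREATE TABLE supplier (part_number TEXT);
INSERT INTO customer_lookup VALUES
    ('CONTAINER', 'BOX', 'CUST01'),
    ('CONTAINER', 'BOX', 'CUST02'),
    ('FABRIC', 'MATERIAL', 'CUST01');
INSERT INTO supplier VALUES ('FABRIC'), ('CONTAINER'), ('PAINT');
""")

# Number the lookup rows within each customer_product, then join only row 1.
first_match = conn.execute("""
    SELECT s.part_number, c.supplier_product, c.customer_code
    FROM supplier s
    LEFT JOIN (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY customer_product
                                  ORDER BY customer_code) AS rn
        FROM customer_lookup
    ) c
      ON s.part_number = c.customer_product
     AND c.rn = 1
    ORDER BY s.part_number
""").fetchall()
```

Because rn = 1 is part of the join condition rather than a WHERE filter, PAINT is still returned with NULLs.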

You can use APPLY to get your results; the main benefit here is that you are not using aggregation (GROUP BY). Note that CROSS APPLY behaves like an inner join and would drop PAINT, which has no match, so OUTER APPLY is used here to keep it with NULLs.
SELECT
s.part_number, c.supplier_product, c.customer_code
FROM
supplier s
OUTER APPLY
(
SELECT TOP 1 * FROM customer_lookup t
WHERE s.part_number = t.customer_product
ORDER BY t.customer_code
) c
You should also add an ORDER BY inside the subquery so that TOP 1 picks a predictable row (I have added this in for you).
You should also list the columns you need rather than using an asterisk (*), but that's up to you (I've left this as is for now).
http://sqlfiddle.com/#!6/716b5/17

If you're wanting the results you showed and in the order you showed them
SELECT
s.part_number, c.supplier_product, MIN(c.customer_code)
FROM
supplier s
LEFT JOIN
(
SELECT * FROM customer_lookup t
) c
ON s.part_number = c.customer_product
GROUP BY s.part_number, c.supplier_product
ORDER BY c.supplier_product DESC

Related

Write Cross Apply to select last row with condition

I'm trying to make request for SQL Table which looks like:
CREATE TABLE StudentMark (
Id int NOT NULL IDENTITY(1,1),
StudentId int NOT NULL,
Mark int
);
Is it possible to select, for each student, the last StudentMark row with a mark greater than 4?
I'm trying to accomplish that with:
SELECT *
FROM [dbo].StudentMark outer
CROSS APPLY (
SELECT TOP(1) *
FROM [dbo].StudentMark inner
WHERE inner.StudentId= outer.StudentId AND inner.Mark>4
) cApply
But that doesn't do what's needed. Could anyone help?
I am curious if something like this will actually get what you are looking for:
SELECT Data.StudentId,
Data.Id,
Data.Mark
FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY StudentMark.StudentId ORDER BY StudentMark.Id DESC) AS RowNumber,
StudentMark.StudentId,
StudentMark.Id,
StudentMark.Mark
FROM dbo.StudentMark
) AS Data
WHERE Data.RowNumber = 1
This numbers each student's rows from newest to oldest (ORDER BY Id DESC) and then keeps only row number 1, which is effectively the last row entered for each StudentId. To consider only marks greater than 4, add WHERE StudentMark.Mark > 4 to the inner query.
Presumably, "last row" is based on id. If so:
SELECT *
FROM [dbo].StudentMark sm CROSS APPLY
(SELECT TOP (1) sm2.*
FROM [dbo].StudentMark sm2
WHERE sm2.StudentId = sm.StudentId AND sm2.Mark > 4
ORDER BY sm2.id DESC
) sm2;
EDIT:
If you only want the last row per student with Mark > 4, then use filtering:
select sm.*
from dbo.StudentMark sm
where sm.id = (select max(sm2.id)
from dbo.StudentMark sm2
where sm2.studentId = sm.studentId and sm2.Mark > 4
);
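A small sketch of the MAX(id) filter above, run against SQLite via Python's sqlite3 module. SQLite and the sample marks are assumptions for illustration only; the table layout follows the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE StudentMark (
        Id INTEGER PRIMARY KEY AUTOINCREMENT,
        StudentId INTEGER NOT NULL,
        Mark INTEGER
    )
""")
# Hypothetical data: student 1 has marks 5, 5, 3; student 2 has 2, 4; student 3 has 5.
conn.executemany(
    "INSERT INTO StudentMark (StudentId, Mark) VALUES (?, ?)",
    [(1, 5), (1, 5), (1, 3), (2, 2), (2, 4), (3, 5)],
)

# For each student, keep the row whose Id is the highest among rows with Mark > 4.
last_good = conn.execute("""
    SELECT sm.Id, sm.StudentId, sm.Mark
    FROM StudentMark sm
    WHERE sm.Id = (SELECT MAX(sm2.Id)
                   FROM StudentMark sm2
                   WHERE sm2.StudentId = sm.StudentId AND sm2.Mark > 4)
    ORDER BY sm.StudentId
""").fetchall()
```

Student 2 has no mark above 4, so the subquery yields NULL for that student and no row is returned.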
SELECT *
FROM StudentMark A
CROSS APPLY
(
SELECT TOP 1 Mark
FROM StudentMark B
WHERE A.StudentId = B.StudentId
ORDER BY B.Id DESC
)M
WHERE M.Mark > 4

Identify duplicates records and insert into another table

I have a table that contains duplicate records. My requirement is to identify the duplicates and store them in another table, Customer_duplicate,
and to store the distinct records in a separate table.
Existing query:
Create proc usp_store_duplicate_into_table
as
begin
insert into Customer_Duplicate
select *
from Customer C
group by cid
having count(cid) > 1
What you have is close, except that you can't select columns that are not in your GROUP BY. For example, you could do:
insert into Customer_Duplicate
select cid, count(*)
from Customer C
group by cid
having count(cid) > 1
Whether that fits depends on what Customer_Duplicate looks like. If you really need to include all the rows, then something like this might work for you:
insert into Customer_Duplicate
select *
from customer c
where c.cid in
(
select cid
from Customer
group by cid
having count(cid) > 1
)
You can use the ROW_NUMBER() ranking function with PARTITION BY in SQL Server to identify duplicate rows.
In PARTITION BY, list the columns that define a duplicate.
For example, I am using NAME and NO; replace them with your own column names.
insert into Customer_Duplicate
SELECT * FROM (
select * , ROW_NUMBER() OVER(PARTITION BY NAME,NO ORDER BY NAME,NO) AS RNK
from Customer C
) AS d
WHERE rnk > 1
For finding the duplicates, you can use the code below.
insert into Customer_Duplicate
SELECT c.name, c.othercolumns
FROM
(select c.name,c.othercolumns, ROW_NUMBER() OVER(PARTITION BY cid ORDER BY (SELECT NULL)) AS rnk
from Customer C
) AS c
WHERE c.rnk >1;
If you want to insert distinct records into another table, you can use the code below.
insert into Customer_Distinct
SELECT c.name, c.othercolumns
FROM
(select c.name,c.othercolumns, ROW_NUMBER() OVER(PARTITION BY cid ORDER BY (SELECT NULL)) AS rnk
from Customer C
) AS c
WHERE c.rnk = 1;
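The split into distinct and duplicate rows can be sketched end to end as follows. This runs on SQLite (3.25 or newer) through Python's sqlite3 module as a stand-in for SQL Server, and the two-column Customer table with its sample rows is a hypothetical simplification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (cid INTEGER, name TEXT);
CREATE TABLE Customer_Distinct (cid INTEGER, name TEXT);
CREATE TABLE Customer_Duplicate (cid INTEGER, name TEXT);
INSERT INTO Customer VALUES
    (1, 'Ann'), (1, 'Ann'), (2, 'Bob'), (3, 'Cy'), (3, 'Cy'), (3, 'Cy');

-- The first row of each cid goes to the distinct table.
INSERT INTO Customer_Distinct
SELECT cid, name
FROM (SELECT cid, name,
             ROW_NUMBER() OVER (PARTITION BY cid ORDER BY cid) AS rnk
      FROM Customer) AS d
WHERE rnk = 1;

-- Every further row of the same cid counts as a duplicate.
INSERT INTO Customer_Duplicate
SELECT cid, name
FROM (SELECT cid, name,
             ROW_NUMBER() OVER (PARTITION BY cid ORDER BY cid) AS rnk
      FROM Customer) AS d
WHERE rnk > 1;
""")

distinct_rows = conn.execute(
    "SELECT cid, name FROM Customer_Distinct ORDER BY cid").fetchall()
duplicate_rows = conn.execute(
    "SELECT cid, name FROM Customer_Duplicate ORDER BY cid").fetchall()
```

With this split, the two target tables together hold exactly as many rows as the source table.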

ERROR: ORA-00923: FROM keyword not found where expected

I tried to fetch data from an Oracle SQL table along with the count of records. I tried the following:
SELECT *,
(COUNT(BRAND_ID) AS TOTAL)
FROM
(
SELECT BRAND_ID,
BRAND_CODE,
BRAND_TITLE
FROM BRAND
WHERE ACTIVE = '1'
ORDER BY BRAND_TITLE ASC
OFFSET 10 ROWS
FETCH NEXT 10 ROWS ONLY
) BRAND
LEFT JOIN
((
SELECT PRODUCT_ID,
PRODUCT_SKU_ID,
PRODUCT_WEB_ID,
PRODUCT_TITLE,
PRODUCT_SALES_PRICE,
PRODUCT_REGULAR_PRICE,
PRODUCT_RATING
FROM PRODUCT
WHERE
(
PRODUCT_TYPE='B'
OR PRODUCT_TYPE='R'
)
AND AVAILABILITY='1'
) PRDUCT ) ON BRAND.BRAND_CODE= PRDUCT.BRAND_CODE
When I'm executing this I got the following error,
ERROR: ORA-00923: FROM keyword not found where expected
How may I fix this.
Thanks in Advance!
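I guess you should remove the * from the SELECT list on the first line, and also drop the parentheses around the aliased COUNT, because a column alias cannot appear inside the parentheses. Try the one below.
SELECT COUNT(BRAND_ID) AS TOTAL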
FROM
(
SELECT BRAND_ID,
BRAND_CODE,
BRAND_TITLE
FROM BRAND
WHERE ACTIVE = '1'
ORDER BY BRAND_TITLE ASC
OFFSET 10 ROWS
FETCH NEXT 10 ROWS ONLY
) BRAND
LEFT JOIN
((
SELECT PRODUCT_ID,
PRODUCT_SKU_ID,
PRODUCT_WEB_ID,
PRODUCT_TITLE,
PRODUCT_SALES_PRICE,
PRODUCT_REGULAR_PRICE,
PRODUCT_RATING
FROM PRODUCT
WHERE
(
PRODUCT_TYPE='B'
OR PRODUCT_TYPE='R'
)
AND AVAILABILITY='1'
) PRDUCT ) ON BRAND.BRAND_CODE= PRDUCT.BRAND_CODE
You are using an aggregate function in the SELECT statement, so you cannot simply use SELECT * for the other columns.
First, give the columns selected in the inner queries aliases for convenience.
Then select those columns in the outer SELECT.
Since one of the selected columns uses an aggregate function, the query must GROUP BY all the other columns in the SELECT.
Here I named the columns c2, c3, ... for convenience; rename them as you like.
If no alias is given, you can refer to the column by its original name.
SELECT c2,c3,c4,c5,c6,c7,c8,c9,c10,
COUNT(BRAND_ID) AS TOTAL
FROM
(
SELECT BRAND_ID ,
BRAND_CODE AS c2,
BRAND_TITLE AS c3
FROM BRAND
WHERE ACTIVE = '1'
ORDER BY BRAND_TITLE ASC
OFFSET 10 ROWS
FETCH NEXT 10 ROWS ONLY
) BRAND
LEFT JOIN
((
SELECT PRODUCT_ID AS c4,
PRODUCT_SKU_ID AS c5,
PRODUCT_WEB_ID AS c6,
PRODUCT_TITLE AS c7,
PRODUCT_SALES_PRICE AS c8,
PRODUCT_REGULAR_PRICE AS c9,
PRODUCT_RATING AS c10
FROM PRODUCT
WHERE
(
PRODUCT_TYPE='B'
OR PRODUCT_TYPE='R'
)
AND AVAILABILITY='1'
) PRDUCT ) ON BRAND.BRAND_CODE= PRDUCT.BRAND_CODE
Group By c2,c3,c4,c5,c6,c7,c8,c9,c10
I don't have 12c, so can't test, but maybe this is what you're after?
SELECT *
FROM
(
SELECT BRAND_ID,
BRAND_CODE,
BRAND_TITLE,
TOTAL
FROM (select b.*,
count(brand_id) over () total
from BRAND b
WHERE ACTIVE = '1'
)
ORDER BY BRAND_TITLE ASC
OFFSET 10 ROWS
FETCH NEXT 10 ROWS ONLY
) BRAND
LEFT JOIN
((
SELECT PRODUCT_ID,
PRODUCT_SKU_ID,
PRODUCT_WEB_ID,
PRODUCT_TITLE,
PRODUCT_SALES_PRICE,
PRODUCT_REGULAR_PRICE,
PRODUCT_RATING
FROM PRODUCT
WHERE
(
PRODUCT_TYPE='B'
OR PRODUCT_TYPE='R'
)
AND AVAILABILITY='1'
) PRDUCT ) ON BRAND.BRAND_CODE= PRDUCT.BRAND_CODE;
This uses an analytic function to get the count of all brand_ids across every row that matches the WHERE clause, before the OFFSET/FETCH pagination is applied. I'm not sure if you wanted the count per brand_id (count(*) over (partition by brand_id)) or perhaps the count of distinct brand_ids (count(distinct brand_id) over ()), though, so you'll have to play around with the count function to get the results you're after.
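To check the claim that the analytic count is computed before the page is cut, here is a minimal sketch on SQLite (3.25 or newer) via Python's sqlite3 module. SQLite's LIMIT/OFFSET is swapped in for Oracle 12c's OFFSET ... FETCH, and the 30 sample brands are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE brand (brand_id INTEGER, brand_title TEXT, active TEXT)")
# Hypothetical data: 30 brands, of which the first 25 are active.
conn.executemany(
    "INSERT INTO brand VALUES (?, ?, ?)",
    [(i, f"Brand {i:02d}", "1" if i <= 25 else "0") for i in range(1, 31)],
)

# COUNT(...) OVER () sees every row that passes the WHERE clause;
# LIMIT/OFFSET only trims the rows that are finally returned.
page = conn.execute("""
    SELECT brand_id, brand_title, COUNT(brand_id) OVER () AS total
    FROM brand
    WHERE active = '1'
    ORDER BY brand_title
    LIMIT 10 OFFSET 10
""").fetchall()
```

Each of the ten rows on the page carries the total number of active brands, which is usually what a paginated listing needs.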

Get Products supporting all delivery modes in SQL

I have a table ProductDeliveryModes as:
ProductId DeliveryId
P101 D1
P101 D2
P101 D3
P102 D1
P102 D2
P102 D3
P103 D1
I need to get products which support all delivery modes (D1, D2, D3). From looking at the table the products should be: P101 and P102.
The query that I formed to get the solution is:
SELECT ProductId
FROM (SELECT DISTINCT ProductId,
DeliveryId
FROM ProductDeliveryModes) X
WHERE X.DeliveryId IN ( 'D1', 'D2', 'D3' )
GROUP BY ProductId
HAVING COUNT(*) = 3
The problem that I see in my solution is that one should know the count of the total number of delivery modes. We could make the count dynamic by getting the count from Sub-query.
Is there a better solution ?
I believe you can use DISTINCT with COUNT function to get the same result:
SELECT [ProductID]
FROM ProductDeliveryModes
GROUP BY [ProductID]
HAVING COUNT(DISTINCT [DeliveryId]) = 3
Check the example.
You can simply store the distinct delivery count in a variable and use it. If you need to do this in a single query, this is one possible way:
WITH CTE (DeliveryCount) AS
(
SELECT COUNT(DISTINCT [DeliveryID])
FROM DataSource
)
SELECT [ProductID]
FROM DataSource
CROSS APPLY CTE
GROUP BY [ProductID]
,CTE.DeliveryCount
HAVING COUNT(DISTINCT [DeliveryID]) = DeliveryCount
See the example.
You can use the query below, which avoids counting and may perform better.
;WITH CTE_Product
AS
(
SELECT DISTINCT ProductID
FROM ProductDeliveryModes
),CTE_Delivery
AS
(
SELECT DISTINCT DeliveryId
FROM ProductDeliveryModes
)
SELECT *
FROM CTE_Product C
WHERE NOT EXISTS
(
SELECT 1
FROM CTE_Delivery D
LEFT JOIN ProductDeliveryModes T ON T.DeliveryId = D.DeliveryId AND T.ProductId=C.ProductId
WHERE T.ProductID IS NULL
)
You can modify your query just a bit to get the actual count of distinct delivery methods:
SELECT ProductID
FROM ProductDeliveryModes
GROUP BY ProductID
HAVING COUNT(DISTINCT DeliveryId) =
(SELECT COUNT (DISTINCT DeliveryId) FROM ProductDeliveryModes)
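Here is a short sketch of the dynamic-count version using the sample data from the question. It runs on SQLite through Python's sqlite3 module, which is an assumption made only so the example is runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ProductDeliveryModes (ProductId TEXT, DeliveryId TEXT)")
conn.executemany(
    "INSERT INTO ProductDeliveryModes VALUES (?, ?)",
    [("P101", "D1"), ("P101", "D2"), ("P101", "D3"),
     ("P102", "D1"), ("P102", "D2"), ("P102", "D3"),
     ("P103", "D1")],
)

# A product qualifies when it has as many distinct delivery modes
# as exist in the whole table, so the count is never hard-coded.
full_coverage = [row[0] for row in conn.execute("""
    SELECT ProductId
    FROM ProductDeliveryModes
    GROUP BY ProductId
    HAVING COUNT(DISTINCT DeliveryId) =
           (SELECT COUNT(DISTINCT DeliveryId) FROM ProductDeliveryModes)
    ORDER BY ProductId
""")]
```

Counting DISTINCT DeliveryId on both sides keeps the comparison correct even if the table contains repeated (ProductId, DeliveryId) rows.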

SQL Group BY SUM one column and select of first row of grouped items

I have a part table with 5 fields. I want to sum the QTY per MFGPN while showing the first returned row for the other 3 fields (Manufacturer, DateCode, Description). I initially thought of using the MIN function as follows, but that doesn't really help me because the data is not an int data type. How would I go about doing this? Right now I'm stuck at the following query:
SELECT SUM([QTY]) AS QTY
,[MFGPN]
,MIN([MANUFACTURER]) AS MANUFACTURER
,MIN([DATECODE]) AS DateCode
,MIN([DESCRIPTION]) AS DESCRIPTION
INTO part
GROUP BY MFGPN, MANUFACTURER, DATECODE, description
ORDER BY mfgpn ASC
Would CROSS APPLY work for you?
SELECT
SUM(a.[QTY]) AS QTY
,a.[MFGPN]
,c.[MANUFACTURER]
,c.[DATECODE]
,c.[DESCRIPTION]
FROM part a
CROSS APPLY (SELECT TOP 1 * FROM part b WHERE a.[MFGPN] = b.[MFGPN]) c
GROUP BY
a.[MFGPN]
,c.[MANUFACTURER]
,c.[DATECODE]
,c.[DESCRIPTION]
Tested with the following:
DECLARE @T1 AS TABLE (
[QTY] int
,[MFGPN] NVARCHAR(50)
,[MANUFACTURER] NVARCHAR(50)
,[DATECODE] DATE
,[DESCRIPTION] NVARCHAR(50));
INSERT @T1 VALUES
(2, 'MFGPN-1', 'MANUFACTURER-A', '20120101', 'A-1'),
(4, 'MFGPN-1', 'MANUFACTURER-B', '20120102', 'B-1'),
(3, 'MFGPN-1', 'MANUFACTURER-C', '20120103', 'C-1'),
(1, 'MFGPN-2', 'MANUFACTURER-A', '20120101', 'A-2'),
(5, 'MFGPN-2', 'MANUFACTURER-B', '20120101', 'B-2')
SELECT
SUM(a.[QTY]) AS QTY
,a.[MFGPN]
,c.[MANUFACTURER]
,c.[DATECODE]
,c.[DESCRIPTION]
FROM @T1 a
CROSS APPLY (SELECT TOP 1 * FROM @T1 b WHERE a.[MFGPN] = b.[MFGPN]) c
GROUP BY
a.[MFGPN]
,c.[MANUFACTURER]
,c.[DATECODE]
,c.[DESCRIPTION]
Produces
QTY MFGPN MANUFACTURER DATECODE DESCRIPTION
9 MFGPN-1 MANUFACTURER-A 2012-01-01 A-1
6 MFGPN-2 MANUFACTURER-A 2012-01-01 A-2
This can be easily managed with a windowed SUM():
WITH summed_and_ranked AS (
SELECT
MFGPN,
MANUFACTURER,
DATECODE,
DESCRIPTION,
QTY = SUM(QTY) OVER (PARTITION BY MFGPN),
RNK = ROW_NUMBER() OVER (
PARTITION BY MFGPN
ORDER BY DATECODE -- or which column should define the order?
)
FROM atable
)
SELECT
MFGPN,
MANUFACTURER,
DATECODE,
DESCRIPTION,
QTY
INTO parts
FROM summed_and_ranked
WHERE RNK = 1
;
For every row, the total group quantity and the ranking within the group is calculated. When actually getting rows for inserting into the new table (the main SELECT), only rows with RNK values of 1 are pulled. Thus you get a result set containing group totals as well as details of certain rows.
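The windowed SUM plus ROW_NUMBER idea can be tried out with the sample rows from the earlier answer. This sketch uses SQLite (3.25 or newer) through Python's sqlite3 module in place of SQL Server; the TOTAL_QTY alias and the MANUFACTURER tie-breaker in the ordering are my own additions so that the picked row is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE part (QTY INTEGER, MFGPN TEXT, MANUFACTURER TEXT,
                       DATECODE TEXT, DESCRIPTION TEXT)
""")
conn.executemany(
    "INSERT INTO part VALUES (?, ?, ?, ?, ?)",
    [(2, "MFGPN-1", "MANUFACTURER-A", "20120101", "A-1"),
     (4, "MFGPN-1", "MANUFACTURER-B", "20120102", "B-1"),
     (3, "MFGPN-1", "MANUFACTURER-C", "20120103", "C-1"),
     (1, "MFGPN-2", "MANUFACTURER-A", "20120101", "A-2"),
     (5, "MFGPN-2", "MANUFACTURER-B", "20120101", "B-2")],
)

# Every row gets its group's total and a rank; only rank 1 is kept.
summary = conn.execute("""
    WITH summed_and_ranked AS (
        SELECT MFGPN, MANUFACTURER, DATECODE, DESCRIPTION,
               SUM(QTY) OVER (PARTITION BY MFGPN) AS TOTAL_QTY,
               ROW_NUMBER() OVER (PARTITION BY MFGPN
                                  ORDER BY DATECODE, MANUFACTURER) AS RNK
        FROM part
    )
    SELECT TOTAL_QTY, MFGPN, MANUFACTURER, DATECODE, DESCRIPTION
    FROM summed_and_ranked
    WHERE RNK = 1
    ORDER BY MFGPN
""").fetchall()
```

The totals match the CROSS APPLY output shown earlier (9 for MFGPN-1 and 6 for MFGPN-2), with the detail columns taken from the first-ranked row of each group.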