SQL Filter unique results


I am trying to write an SQL statement that outputs each part number exactly once, i.e. no duplicates.
Purchased should be the default type; when a part has no Purchased row, it should fall back to Manufactured. NOTE: all parts can be purchased.
The result I require shows only unique part numbers,
e.g. Parts 1 to 10, with Purchased as the default Type.
Table1
Part_number | Type | Cost
Part 1 | Manufactured | £1.00
Part 1 | Purchased | £0.56
Part 2 | Manufactured | £1.26
Part 2 | Purchased | £0.94
Part 3 | Manufactured | £0.36
Part 3 | Purchased | £0.16
Part 4 | Manufactured | £1.00
Part 4 | Purchased | £1.50
Part 5 | Manufactured | £1.65
Part 6 | Manufactured | £1.98
Part 7 | Manufactured | £0.15
Part 8 | Manufactured | £0.45
Part 9 | Manufactured | £1.20
Part 9 | Purchased | £0.80
Part 10 | Manufactured | £1.00
This is the result I am hoping to get back
Part_number | Type | Cost
Part 1 | Purchased | £0.56
Part 2 | Purchased | £0.94
Part 3 | Purchased | £0.16
Part 4 | Purchased | £1.50
Part 5 | Manufactured | £1.65
Part 6 | Manufactured | £1.98
Part 7 | Manufactured | £0.15
Part 8 | Manufactured | £0.45
Part 9 | Purchased | £0.80
Part 10 | Manufactured | £1.00
I have tried loads of different techniques but am not getting the result.
I am guessing that I will need to create temp tables that are filtered and then join the tables together but I really don't know.
Any help will be appreciated.

You could also just grab the first row in each group by sorting them. This would make it easier when there are other columns of data to bring back.
with data as (
    select t.*, row_number() over (
        partition by part_number
        -- Purchased rows sort first, so they get rn = 1 when present
        order by case when t.type = 'Purchased' then 1 else 2 end) as rn
    from t
)
select * from data where rn = 1;
If there are other types this would work as well although you would want to tweak it if there are more than two per part.
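As a quick way to try the row_number() idea without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The table name t follows the answer; the subset of sample rows, the numeric cost column, and the variable name best_rows are assumptions made for illustration, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# A few rows from the question; costs are stored as plain numbers (assumption).
rows = [
    ("Part 1", "Manufactured", 1.00), ("Part 1", "Purchased", 0.56),
    ("Part 2", "Manufactured", 1.26), ("Part 2", "Purchased", 0.94),
    ("Part 5", "Manufactured", 1.65),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table t (part_number text, type text, cost real)")
conn.executemany("insert into t values (?, ?, ?)", rows)

query = """
with data as (
    select t.*,
           row_number() over (
               partition by part_number
               order by case when type = 'Purchased' then 1 else 2 end
           ) as rn
    from t
)
select part_number, type, cost
from data
where rn = 1
order by part_number
"""

# One row per part: the Purchased row if it exists, otherwise the Manufactured one.
best_rows = conn.execute(query).fetchall()
for row in best_rows:
    print(row)
```

Because the ranking lives in the ORDER BY of the window, adding a third type would only mean extending the CASE expression.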

One method uses not exists. Assuming you have at most two rows per part:
select t.*
from t
where t.type = 'Purchased' or
(t.type = 'Manufactured' and
not (exists (select 1 from t t2 where t2.part_number = t.part_number and t2.type = 'Purchased')
)
);
There are other fun ways to handle this. For instance, an aggregation approach:
select part_number,
       max(type) as type,  -- 'Purchased' sorts after 'Manufactured'
       coalesce(max(case when type = 'Purchased' then cost end),
                max(cost)
               ) as cost
from t
group by part_number;
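To check that the two queries above agree, the following sketch runs both against an in-memory SQLite database from Python. The chosen sample rows, the numeric cost column, and the variable names are assumptions for illustration; Part 4 is included because its Purchased cost is higher than its Manufactured cost, which is the case where a plain max(cost) would pick the wrong row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (part_number text, type text, cost real)")
conn.executemany("insert into t values (?, ?, ?)", [
    ("Part 4", "Manufactured", 1.00), ("Part 4", "Purchased", 1.50),
    ("Part 5", "Manufactured", 1.65),
    ("Part 9", "Manufactured", 1.20), ("Part 9", "Purchased", 0.80),
])

# Keep Purchased rows, plus Manufactured rows that have no Purchased sibling.
not_exists_sql = """
select t.part_number, t.type, t.cost
from t
where t.type = 'Purchased'
   or (t.type = 'Manufactured'
       and not exists (select 1 from t t2
                       where t2.part_number = t.part_number
                         and t2.type = 'Purchased'))
order by t.part_number
"""

# One group per part; prefer the Purchased cost when there is one.
aggregation_sql = """
select part_number,
       max(type) as type,
       coalesce(max(case when type = 'Purchased' then cost end), max(cost)) as cost
from t
group by part_number
order by part_number
"""

not_exists_rows = conn.execute(not_exists_sql).fetchall()
aggregation_rows = conn.execute(aggregation_sql).fetchall()
print(not_exists_rows == aggregation_rows)
```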

Related

Linking 2 columns, same table to a different table

First-time poster, with a little background: I am not the most experienced SQL user and most of my knowledge is self-taught, but I am really struggling to get the results I am looking for here, so I am hoping someone can point me in the right direction.
In the simplest form
I have a table that has all of our Item_ID's. Each of those item numbers has a Universal_ID associated with it stored in the same table structure. Most of the time these numbers match, except in the example below Item_ID 2 has a Universal_ID of 1
Item_ID | Universal_ID
1 | 1
2 | 1
We then have an inventory table, which can be linked on the ItemID to show the QTY
Item_ID | Item_Qty | Item_Code
1 | 10 | 2/2/2021
1 | 20 | 2/3/2021
2 | 30 | 2/2/2021
If the Item_ID and Universal_ID are the same, it is quite easy to obtain the inventory
However I am struggling to get inventories for both when they do not match.
For example, if I wanted to find the QTY of Item_ID 1, I would be returned 2 results
Item_ID | Item_Qty | Item_Code
1 | 10 | 2/2/2021
1 | 20 | 2/3/2021
Problem: if I am specifically interested in Item_ID 2, how can I link it to the inventory table to see not only Item_ID 2's available qty but also Item_ID 1's, since the Universal_ID does not match the Item_ID?
So I would like the results to be just like the 2nd block of code I posted.
Item_ID | Item_Qty | Item_Code
1 | 10 | 2/2/2021
1 | 20 | 2/3/2021
2 | 30 | 2/2/2021
What is the best way to set up views or my select query to make this happen? If I need to add any more info I can!

You can use a left join and filtering:
select i.*
from inventory i left join
universal u
on i.item_id = u.item_id
where 1 in (u.universal_id, i.item_id);
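The query above hard-codes the value 1. As a hedged variation (not the answer's exact query), the sketch below looks up the universal ID of whichever item is requested and returns the inventory of every item that shares it. The table names universal and inventory follow the answer; the function name inventory_for and the use of Python's sqlite3 module are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table universal (item_id integer, universal_id integer);
create table inventory (item_id integer, item_qty integer, item_code text);
insert into universal values (1, 1), (2, 1);
insert into inventory values
    (1, 10, '2/2/2021'), (1, 20, '2/3/2021'), (2, 30, '2/2/2021');
""")

def inventory_for(item_id):
    """Return inventory rows for every item sharing item_id's universal_id."""
    sql = """
    select i.item_id, i.item_qty, i.item_code
    from inventory i
    join universal u on u.item_id = i.item_id
    where u.universal_id = (select universal_id
                            from universal
                            where item_id = ?)
    order by i.item_id, i.item_qty
    """
    return conn.execute(sql, (item_id,)).fetchall()

# Item 2 has universal ID 1, so the rows of items 1 and 2 both come back.
for row in inventory_for(2):
    print(row)
```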

Make a 1 to 1 multi-field SQL join where only some of the values match

I am trying to build a table that will be used as a conversion chart. I aim to make a simple join with this conversion table on multiple fields (8 in my case), and get a result. I will try to simplify the examples as much as I can because the original chart is a 40x10 matrix.
Let's say that I have these two (I know they don't make much sense and have bad design but they are just examples):
supply_conversion_chart
---
supply (integer)
customer_id (integer)
product_id (integer)
size (varchar)
purchase_type (varchar)
purchases
---
customer_id (integer)
product_id (integer)
size (varchar)
purchase_type (varchar)
and conversion chart would look something like this:
| supply | customer_id | product_id | size | purchase_type |
|--------|--------------|------------|----------|---------------|
| 100 | 1 | anything | anything | online |
| 101 | 1 | anything | anything | offline |
| 102 | other than 1 | anything | anything | online |
| 103 | 1 | 5 | XXL | online |
The main goal was to get an exact supply value by simply doing a join by doing something like:
SELECT supply
FROM purchases p
JOIN supply_conversion_chart scc ON
p.customer_id = scc.customer_id AND
p.product_id = scc.product_id AND
p.size = scc.size AND
p.purchase_type = scc.purchase_type;
Let's say that these are the records on purchases table:
| customer_id | product_id | size | purchase_type |
|-------------|------------|------|---------------|
| 1 | 3 | M | online |
| 1 | 5 | S | offline |
| 12345 | 4 | XL | online |
| 1 | 5 | XXL | online |
| 4353 | null | M | online |
I would expect the first record's supply value to be 100, the second's 101, the third's 102, the fourth's 103, and the fifth's 102. However, as far as I know, SQL won't be able to do a proper join on any of these records except the fourth one, which fully matches supply 103 in the supply_conversion_chart table. I don't know if it is even possible to join on multiple fields when some of those fields do not fully match.
My approach is probably faulty and there are better ways to get the results I am trying to achieve but I don't even know where to start. What should I do?
Note that the original chart is much bigger than the provided example, and I will be joining on 8 different fields.

One approach is a lateral join:
select p.*, scc.*
from purchases p left join lateral
     (select scc.*
      from supply_conversion_chart scc
      where (scc.customer_id = p.customer_id or scc.customer_id is null) and
            (scc.product_id = p.product_id or scc.product_id is null) and
            (scc.size = p.size or scc.size is null) and
            (scc.purchase_type = p.purchase_type or scc.purchase_type is null)
      -- after the filter, a non-NULL chart column is an exact match; NULL is a wildcard
      order by ( (scc.customer_id is not null)::int +
                 (scc.product_id is not null)::int +
                 (scc.size is not null)::int +
                 (scc.purchase_type is not null)::int
               ) desc
      limit 1
     ) scc on true;
Note: This represents "everything" as NULL. It doesn't have special logic for "customer other than 1". However, it does show you how to implement basically what you are trying to do.
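The "most specific row wins" idea can be tried out in SQLite from Python as well. SQLite has no lateral join, so this sketch swaps in a correlated subquery with the same filter and ordering; it is an illustration under assumptions, not the answer's query. NULL stands for "anything", the "other than 1" chart row is left out, and the three sample purchases are a subset of the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table supply_conversion_chart (
    supply integer, customer_id integer, product_id integer,
    size text, purchase_type text);
create table purchases (
    customer_id integer, product_id integer, size text, purchase_type text);
-- NULL means "anything"; the "other than 1" rule is not modelled here.
insert into supply_conversion_chart values
    (100, 1, null, null, 'online'),
    (101, 1, null, null, 'offline'),
    (103, 1, 5, 'XXL', 'online');
insert into purchases values
    (1, 3, 'M', 'online'),
    (1, 5, 'S', 'offline'),
    (1, 5, 'XXL', 'online');
""")

query = """
select p.customer_id, p.product_id, p.size, p.purchase_type,
       (select scc.supply
        from supply_conversion_chart scc
        where (scc.customer_id = p.customer_id or scc.customer_id is null)
          and (scc.product_id = p.product_id or scc.product_id is null)
          and (scc.size = p.size or scc.size is null)
          and (scc.purchase_type = p.purchase_type or scc.purchase_type is null)
        -- count the non-wildcard columns; the most specific chart row wins
        order by (scc.customer_id is not null) + (scc.product_id is not null)
               + (scc.size is not null) + (scc.purchase_type is not null) desc
        limit 1) as supply
from purchases p
order by p.rowid
"""

matched = conn.execute(query).fetchall()
for row in matched:
    print(row)
```

The last purchase matches both supply 100 (two specific columns) and supply 103 (four specific columns), and the ordering picks 103.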

SQL/Power BI Joins without common column

So I have the following problem:
I have 2 tables, one containing different bids for a product_type, and one containing the price, date etc. to which the product was sold.
The tables look like this:
Table bids:
+----------+---------------------+---------------------+--------------+-------+
| Bid_id | Start_time | End_time | Product_type | price |
+----------+---------------------+---------------------+--------------+-------+
| 1 | 18.01.2020 06:00:00 | 18.01.2020 06:02:33 | blue | 5 € |
| 2 | 18.01.2020 06:00:07 | 18.01.2020 06:00:43 | blue | 7 € |
| 3 | 18.01.2020 06:01:10 | 19.01.2020 15:03:15 | red | 3 € |
| 4 | 18.01.2020 06:02:20 | 18.01.2020 06:05:44 | blue | 6 € |
| | | | | |
+----------+---------------------+---------------------+--------------+-------+
Table sells:
+---------+---------------------+--------------+--------+
| Sell_id | Sell_time | Product_type | Price |
+---------+---------------------+--------------+--------+
| 1 | 18.01.2020 06:00:31 | Blue | 6,50 € |
| 2 | 18:01.2020 06:51:03 | Red | 2,50 € |
| | | | |
+---------+---------------------+--------------+--------+
The sell_id and the bid_id have no relation with each other.
What I want to find out is, what is the maximum bid to the time we sold the product_type. So if we take sell_id 1, it should check, which bids for this specific product_type were active during the sell_time (in this case bid_id 1 and 2) and give back the higher price (in this case bid_id 2).
I tried to solve this problem in Power Bi, however, I was not able to get a solution. I assume, that I have to work with SQL-Joins to solve it.
Is it possible, to join based on criteria instead of matching columns? Something like:
SELECT bids.start_time, bids.end_time, bids.product_type, MAX(bids.price), sells.sell_time, sells.product_type, sells.price
FROM sells
INNER JOIN bids ON bids.start_time<sells.sell_time AND bids.end_time > sells.sell_time;
I am sorry if this question is confusing, I am still new to this sorry. Thanks in advance for ANY help!

In your sample data, the second Sell_time should be 18.01.2020, right? You can try this code (it can be resource-intensive on large data because of the Cartesian joins). If you are sure that the sell day is always the bid's start day, you can add a date column to your tables and use an additional TREATAS ( VALUES ( bids[day] ), sells[day] ) filter.
Test =
VAR __tretasfilter =
TREATAS ( VALUES ( bids[Product_type] ), sells[Product_type] )
RETURN
SUMMARIZE (
FILTER (
SUMMARIZECOLUMNS (
sells[Sell_id],
bids[Price],
bids[Start_time],
sells[Sell_time],
bids[End_time],
sells[Product_type],
__tretasfilter
),
[Start_time] <= [Sell_time]
&& [End_time] >= [Sell_time]
),
sells[Sell_id],
"MaxPrice", MAX ( bids[Price] )
)
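On the SQL side of the question, a join condition does not have to be an equality, so joining on a time range is possible. The sketch below shows one way to do it in SQLite from Python. It rests on several assumptions: timestamps are rewritten as ISO strings so that they compare correctly as text, prices are plain numbers, product types are lower-cased on both sides, and sell 2 is taken to be on 18.01.2020.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table bids (bid_id integer, start_time text, end_time text,
                   product_type text, price real);
create table sells (sell_id integer, sell_time text,
                    product_type text, price real);
insert into bids values
    (1, '2020-01-18 06:00:00', '2020-01-18 06:02:33', 'blue', 5),
    (2, '2020-01-18 06:00:07', '2020-01-18 06:00:43', 'blue', 7),
    (3, '2020-01-18 06:01:10', '2020-01-19 15:03:15', 'red', 3),
    (4, '2020-01-18 06:02:20', '2020-01-18 06:05:44', 'blue', 6);
insert into sells values
    (1, '2020-01-18 06:00:31', 'blue', 6.5),
    (2, '2020-01-18 06:51:03', 'red', 2.5);
""")

# Join each sale to the bids of the same product type that were active
# at the sell time, then keep the highest bid per sale.
query = """
select s.sell_id, s.sell_time, s.product_type, max(b.price) as max_bid
from sells s
left join bids b
  on b.product_type = s.product_type
 and b.start_time <= s.sell_time
 and b.end_time >= s.sell_time
group by s.sell_id, s.sell_time, s.product_type
order by s.sell_id
"""

max_bids = conn.execute(query).fetchall()
for row in max_bids:
    print(row)
```

The left join keeps sales that had no active bid; for those, max_bid comes back as NULL.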

Returning singular row/value from joined table date based on closest date

I have a Production Table and a Standing Data table. The relationship of Production to Standing Data is actually Many-To-Many which is different to how this relationship is usually represented (Many-to-One).
The standing data table holds a list of tasks and the score each task is worth. Tasks can appear multiple times with different "ValidFrom" dates for changing the score at different points in time. What I am trying to do is query the Production Table so that the TaskID is looked up in the table and uses the date it was logged to check what score it should return.
Here's an example of how I want the data to look:
Production Table:
+----------+------------+-------+-----------+--------+-------+
| RecordID | Date | EmpID | Reference | TaskID | Score |
+----------+------------+-------+-----------+--------+-------+
| 1 | 27/02/2020 | 1 | 123 | 1 | 1.5 |
| 2 | 27/02/2020 | 1 | 123 | 1 | 1.5 |
| 3 | 30/02/2020 | 1 | 123 | 1 | 2 |
| 4 | 31/02/2020 | 1 | 123 | 1 | 2 |
+----------+------------+-------+-----------+--------+-------+
Standing Data
+----------+--------+----------------+-------+
| RecordID | TaskID | DateActiveFrom | Score |
+----------+--------+----------------+-------+
| 1 | 1 | 01/02/2020 | 1.5 |
| 2 | 1 | 28/02/2020 | 2 |
+----------+--------+----------------+-------+
I have tried the below code but unfortunately due to multiple records meeting the criteria, the production data duplicates with two different scores per record:
SELECT p.[RecordID],
p.[Date],
p.[EmpID],
p.[Reference],
p.[TaskID],
s.[Score]
FROM ProductionTable as p
LEFT JOIN StandingDataTable as s
ON s.[TaskID] = p.[TaskID]
AND s.[DateActiveFrom] <= p.[Date];
What is the correct way to return the correct and singular/scalar Score value for this record based on the date?

You can use APPLY:
SELECT p.[RecordID], p.[Date], p.[EmpID], p.[Reference], p.[TaskID], s.[Score]
FROM ProductionTable AS p OUTER APPLY
     ( SELECT TOP (1) s.[Score]
       FROM StandingDataTable AS s
       WHERE s.[TaskID] = p.[TaskID] AND
             s.[DateActiveFrom] <= p.[Date]
       -- the most recent score that was already active on the production date
       ORDER BY s.[DateActiveFrom] DESC
     ) s;
You might want the score to be based on the record level instead; if so, change the WHERE clause inside the APPLY.
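OUTER APPLY with TOP (1) is SQL Server syntax. To illustrate the same "latest row on or before the date" lookup in a form that runs anywhere Python does, here is a sketch that swaps in a correlated subquery in SQLite. The trimmed-down columns, the ISO date strings, and the three sample production dates are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table ProductionTable (RecordID integer, Date text, TaskID integer);
create table StandingDataTable (RecordID integer, TaskID integer,
                                DateActiveFrom text, Score real);
insert into StandingDataTable values
    (1, 1, '2020-02-01', 1.5),
    (2, 1, '2020-02-28', 2);
insert into ProductionTable values
    (1, '2020-02-27', 1),   -- before the score change
    (2, '2020-02-29', 1),   -- after the score change
    (3, '2020-01-15', 1);   -- before any score was active
""")

query = """
select p.RecordID, p.Date,
       (select s.Score
        from StandingDataTable s
        where s.TaskID = p.TaskID
          and s.DateActiveFrom <= p.Date
        order by s.DateActiveFrom desc
        limit 1) as Score
from ProductionTable p
order by p.RecordID
"""

scored = conn.execute(query).fetchall()
for row in scored:
    print(row)
```

Each production row gets exactly one score, and a row dated before the first DateActiveFrom gets NULL, which mirrors what OUTER APPLY returns when the inner query is empty.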

How to combine two tables allocating Sold amounts vs Demand without loops/cursor

My task is to combine two tables in a specific way. I have a table Demands that contains demands of some goods (tovar). Each record has its own ID, Tovar, Date of demand and Amount. And I have another table Unloads that contains unloads of tovar. Each record has its own ID, Tovar, Order of unload and Amount. Demands and Unloads are not corresponding to each other and amounts in demands and unloads are not exactly equal. One demand may be with 10 units and there can be two unloads with 4 and 6 units. And two demands may be with 3 and 5 units and there can be one unload with 11 units.
The task is to get a table that shows how demands are covered by unloads. I have a solution (SQL Fiddle) but I think that there is a better one. Can anybody tell me how such tasks are usually solved?
What I have:
------------------------------------------
| DemandNumber | Tovar | Amount | Order |
------------------------------------------
| Demand#1 | Meat | 2 | 1 |
| Demand#2 | Meat | 3 | 2 |
| Demand#3 | Milk | 6 | 1 |
| Demand#4 | Eggs | 1 | 1 |
| Demand#5 | Eggs | 5 | 2 |
| Demand#6 | Eggs | 3 | 3 |
------------------------------------------
------------------------------------------
| SaleNumber | Tovar | Amount | Order |
------------------------------------------
| Sale#1 | Meat | 6 | 1 |
| Sale#2 | Milk | 2 | 1 |
| Sale#3 | Milk | 1 | 2 |
| Sale#4 | Eggs | 2 | 1 |
| Sale#5 | Eggs | 1 | 2 |
| Sale#6 | Eggs | 4 | 3 |
------------------------------------------
What I want to receive
-------------------------------------------------
| DemandNumber | SaleNumber | Tovar | Amount |
-------------------------------------------------
| Demand#1 | Sale#1 | Meat | 2 |
| Demand#2 | Sale#1 | Meat | 3 |
| Demand#3 | Sale#2 | Milk | 2 |
| Demand#3 | Sale#3 | Milk | 1 |
| Demand#4 | Sale#4 | Eggs | 1 |
| Demand#5 | Sale#4 | Eggs | 1 |
| Demand#5 | Sale#5 | Eggs | 1 |
| Demand#5 | Sale#6 | Eggs | 3 |
| Demand#6 | Sale#6 | Eggs | 1 |
-------------------------------------------------
Here is additional explanation from author's comment:
Demand#1 needs 2 Meat and it can take them from Sale#1.
Demand#2 needs 3 Meat and can take them from Sale#1.
Demand#3 needs 6 Milk but there are only 2 Milk in Sale#2 and 1 Milk in Sale#3, so we show only the available amounts.
And so on.
The field Order in the example determines the order of calculations. We have to process Demands according to their Order: Demand#1 must be processed before Demand#2. Sales must also be allocated according to their Order number: we cannot assign eggs from a sale while a sale with a lower Order still has non-allocated eggs.
The only way I can get this is using loops. Is it possible to avoid loops and solve this task with T-SQL only?

If the Amount values are int and not too large (not millions), then I'd use a table of numbers to generate as many rows as the value of each Amount.
Here is a good article describing how to generate it.
Then it is easy to join Demand with Sale and group and sum as needed.
Otherwise, a plain straight-forward cursor (in fact, two cursors) would be simple to implement, easy to understand and with O(n) complexity. If Amounts are small, set-based variant is likely to be faster than cursor. If Amounts are large, cursor may be faster. You need to measure performance with actual data.
Here is a query that uses a table of numbers. To understand how it works run each query in the CTE separately and examine its output.
WITH
CTE_Demands
AS
(
SELECT
D.DemandNumber
,D.Tovar
,ROW_NUMBER() OVER (PARTITION BY D.Tovar ORDER BY D.SortOrder, CA_D.Number) AS rn
FROM
Demands AS D
CROSS APPLY
(
SELECT TOP(D.Amount) Numbers.Number
FROM Numbers
ORDER BY Numbers.Number
) AS CA_D
)
,CTE_Sales
AS
(
SELECT
S.SaleNumber
,S.Tovar
,ROW_NUMBER() OVER (PARTITION BY S.Tovar ORDER BY S.SortOrder, CA_S.Number) AS rn
FROM
Sales AS S
CROSS APPLY
(
SELECT TOP(S.Amount) Numbers.Number
FROM Numbers
ORDER BY Numbers.Number
) AS CA_S
)
SELECT
CTE_Demands.DemandNumber
,CTE_Sales.SaleNumber
,CTE_Demands.Tovar
,COUNT(*) AS Amount
FROM
CTE_Demands
INNER JOIN CTE_Sales ON
CTE_Sales.Tovar = CTE_Demands.Tovar
AND CTE_Sales.rn = CTE_Demands.rn
GROUP BY
CTE_Demands.Tovar
,CTE_Demands.DemandNumber
,CTE_Sales.SaleNumber
ORDER BY
CTE_Demands.DemandNumber
,CTE_Sales.SaleNumber
;
Having said all this, usually it is better to perform this kind of processing on the client using procedural programming language. You still have to transmit all rows from Demands and Sales to the client. So, by joining the tables on the server you don't reduce the amount of bytes that must go over the network. In fact, you increase it, because original row may be split into several rows.
This kind of processing is sequential in nature, not set-based, so it is easy to do with arrays, but tricky in SQL.
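To illustrate the client-side, array-based alternative mentioned above, here is a minimal Python sketch that walks demands and sales in their Order sequence. The function name allocate and the tuple layout (number, tovar, amount, order) are assumptions for illustration; the sample data is taken from the question.

```python
from collections import defaultdict

def allocate(demands, sales):
    """Cover demands with sales per tovar, both processed in Order sequence.

    demands and sales are lists of (number, tovar, amount, order) tuples.
    Returns a list of (demand_number, sale_number, tovar, amount) rows.
    """
    # Per tovar: sales in Order sequence, each with its remaining amount.
    stock = defaultdict(list)
    for number, tovar, amount, _ in sorted(sales, key=lambda s: s[3]):
        stock[tovar].append([number, amount])

    position = defaultdict(int)  # first sale that may still have units, per tovar
    result = []
    for number, tovar, needed, _ in sorted(demands, key=lambda d: (d[1], d[3])):
        queue = stock[tovar]
        i = position[tovar]
        while needed > 0 and i < len(queue):
            sale_number, available = queue[i]
            if available == 0:
                i += 1  # this sale is used up, move to the next one
                continue
            taken = min(needed, available)
            result.append((number, sale_number, tovar, taken))
            needed -= taken
            queue[i][1] = available - taken
        position[tovar] = i
    return result

demands = [
    ("Demand#1", "Meat", 2, 1), ("Demand#2", "Meat", 3, 2),
    ("Demand#3", "Milk", 6, 1), ("Demand#4", "Eggs", 1, 1),
    ("Demand#5", "Eggs", 5, 2), ("Demand#6", "Eggs", 3, 3),
]
sales = [
    ("Sale#1", "Meat", 6, 1), ("Sale#2", "Milk", 2, 1),
    ("Sale#3", "Milk", 1, 2), ("Sale#4", "Eggs", 2, 1),
    ("Sale#5", "Eggs", 1, 2), ("Sale#6", "Eggs", 4, 3),
]

allocation = allocate(demands, sales)
for row in sorted(allocation):
    print(row)
```

Each demand and each sale is visited a bounded number of times, so apart from the initial sorts the walk is linear in the number of rows, regardless of how large the amounts are.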

I have no idea what your requirements are or what the business rules are or what the goals are but I can say this -- you are doing it wrong.
This is SQL. In SQL you do not do loops. In SQL you work with sets. Sets are defined by select statements.
If this problem is not resolved with a select statement (maybe with sub-selects) then you probably want to implement this in another way. (C# program? Some other ETL system?).
However, I can also say there is probably a way to do this with a single select statement. However you have not given enough information for me to know what that statement is. To say you have a working example and that should be enough fails on this site because this site is about answering questions about problems and you don't have a problem you have some code.
Re-phrase the question with inputs, expected outputs, what you have tried and what your question is. This is covered well in the FAQ.
Or if you have working code you want reviewed, it may be appropriate for the code review site.

I see additional 2 possible ways:
1. for 'advanced' data processing and calculations you can use cursors.
2. you can use SELECT with CASE construction
