Linking 2 columns, same table to a different table - sql

First-time poster. A little background: I am not the most experienced SQL user, and most of my knowledge is self-taught, but I am really struggling to get the results I am looking for here, so I am hoping someone can point me in the right direction.
In the simplest form
I have a table that has all of our Item_ID's. Each of those item numbers has a Universal_ID associated with it stored in the same table structure. Most of the time these numbers match, except in the example below Item_ID 2 has a Universal_ID of 1
Item_ID | Universal_ID
1 | 1
2 | 1
We then have an inventory table, which can be linked on the ItemID to show the QTY
Item_ID | Item_Qty | Item_Code
1 | 10 | 2/2/2021
1 | 20 | 2/3/2021
2 | 30 | 2/2/2021
If the Item_ID and Universal_ID are the same, it is quite easy to obtain the inventory
However I am struggling to get inventories for both when they do not match.
For example, if I wanted to find the QTY of Item_ID 1, I would be returned 2 results
Item_ID | Item_Qty | Item_Code
1 | 10 | 2/2/2021
1 | 20 | 2/3/2021
Problem: if I am specifically interested in Item_ID 2, how can I link it to the inventory table to see not only Item_ID 2's available qty but also Item_ID 1's, since Item_ID 2's Universal_ID does not match its Item_ID?
So I would like the results to be just like the 2nd block of code I posted.
Item_ID | Item_Qty | Item_Code
1 | 10 | 2/2/2021
1 | 20 | 2/3/2021
2 | 30 | 2/2/2021
What is the best way to set up views or my select query to make this happen? If I need to add any more info I can!

You can use a left join and filtering:
select i.*
from inventory i left join
universal u
on i.item_id = u.item_id
where 1 in (u.universal_id, i.item_id);
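If the starting point is an Item_ID rather than a known Universal_ID, the same idea can be sketched by first looking up that item's Universal_ID. This is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in database; the table names `universal` and `inventory` follow the query above, and the subquery lookup is an assumption about what the asker wants, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table universal (item_id integer, universal_id integer);
    create table inventory (item_id integer, item_qty integer, item_code text);
    insert into universal values (1, 1), (2, 1);
    insert into inventory values
        (1, 10, '2/2/2021'), (1, 20, '2/3/2021'), (2, 30, '2/2/2021');
""")

# Assumption: "interested in Item_ID 2" means "show inventory for every item
# that shares Item_ID 2's Universal_ID".
rows = conn.execute("""
    select i.item_id, i.item_qty, i.item_code
    from inventory i
    join universal u on i.item_id = u.item_id
    where u.universal_id = (select universal_id from universal where item_id = ?)
    order by i.item_id, i.item_qty
""", (2,)).fetchall()

for row in rows:
    print(row)
```

With the sample data from the question, this returns the two rows for Item_ID 1 and the one row for Item_ID 2.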

Make a 1 to 1 multi-field SQL join where only some of the values match

I am trying to build a table that will be used as a conversion chart. I aim to make a simple join with this conversion table on multiple fields (8 in my case), and get a result. I will try to simplify the examples as much as I can because the original chart is a 40x10 matrix.
Let's say that I have these two (I know they don't make much sense and have bad design but they are just examples):
supply_conversion_chart
---
supply (integer)
customer_id (integer)
product_id (integer)
size (varchar)
purchase_type (varchar)
purchases
---
customer_id (integer)
product_id (integer)
size (varchar)
purchase_type (varchar)
and conversion chart would look something like this:
| supply | customer_id | product_id | size | purchase_type |
|--------|--------------|------------|----------|---------------|
| 100 | 1 | anything | anything | online |
| 101 | 1 | anything | anything | offline |
| 102 | other than 1 | anything | anything | online |
| 103 | 1 | 5 | XXL | online |
The main goal was to get an exact supply value by simply doing a join by doing something like:
SELECT supply
FROM purchases p
JOIN supply_conversion_chart scc ON
p.customer_id = scc.customer_id AND
p.product_id = scc.product_id AND
p.size = scc.size AND
p.purchase_type = scc.purchase_type;
Let's say that these are the records on purchases table:
| customer_id | product_id | size | purchase_type |
|-------------|------------|------|---------------|
| 1 | 3 | M | online |
| 1 | 5 | S | offline |
| 12345 | 4 | XL | online |
| 1 | 5 | XXL | online |
| 4353 | null | M | online |
I would expect the first record's supply value to be 100, the second's 101, the third's 102, the fourth's 103, and the fifth's 102. However, as far as I know, SQL won't be able to do a proper join on any of these records except the fourth one, which fully matches supply 103 in the supply_conversion_chart table. I don't know if it is even possible to join on multiple fields when some of those fields do not fully match.
My approach is probably faulty and there are better ways to get the results I am trying to achieve but I don't even know where to start. What should I do?
The original chart is much bigger than the provided example, and I will be doing a join on 8 different fields.
One approach is a lateral join:
select p.*, scc.*
from purchases p left join lateral
(select scc.*
from supply_conversion_chart scc
where (scc.customer_id = p.customer_id or scc.customer_id is null) and
(scc.product_id = p.product_id or scc.product_id is null) and
(scc.size = p.size or scc.size is null) and
(scc.purchase_type = p.purchase_type or scc.purchase_type is null)
order by ( coalesce((scc.customer_id = p.customer_id)::int, 0) +
coalesce((scc.product_id = p.product_id)::int, 0) +
coalesce((scc.size = p.size)::int, 0) +
coalesce((scc.purchase_type = p.purchase_type)::int, 0)
) desc
limit 1
) scc on true;
Note: This represents "anything" as NULL. It doesn't have special logic for "customer other than 1". However, it does show you how to implement basically what you are trying to do.

SQL Filter unique results

I am trying to write an SQL statement that outputs each part number exactly once, i.e. no duplicates.
Purchased should be the default type; when a part has no Purchased row, it should fall back to Manufactured. NOTE: all parts can be purchased.
The result I require shows only unique part numbers,
e.g. Part 1 to Part 10, with Purchased as the default Type.
Table1
Part_number | Type | Cost
Part 1 | Manufactured | £1.00
Part 1 | Purchased | £0.56
Part 2 | Manufactured | £1.26
Part 2 | Purchased | £0.94
Part 3 | Manufactured | £0.36
Part 3 | Purchased | £0.16
Part 4 | Manufactured | £1.00
Part 4 | Purchased | £1.50
Part 5 | Manufactured | £1.65
Part 6 | Manufactured | £1.98
Part 7 | Manufactured | £0.15
Part 8 | Manufactured | £0.45
Part 9 | Manufactured | £1.20
Part 9 | Purchased | £0.80
Part 10| Manufactured | £1.00
This is the result I am hoping to get back
Part_number | Type | Cost
Part 1 | Purchased | £0.56
Part 2 | Purchased | £0.94
Part 3 | Purchased | £0.16
Part 4 | Purchased | £1.50
Part 5 | Manufactured | £1.65
Part 6 | Manufactured | £1.98
Part 7 | Manufactured | £0.15
Part 8 | Manufactured | £0.45
Part 9 | Purchased | £0.80
Part 10| Manufactured | £1.00
I have tried loads of different techniques but am not getting the result.
I am guessing that I will need to create temp tables that are filtered and then join the tables together but I really don't know.
Any help will be appreciated.
You could also just grab the first row in each group by sorting them. This would make it easier when there are other columns of data to bring back.
with data as (
select *, row_number() over (
partition by part_number
order by case when t.type = 'Purchased' then 1 else 2 end) as rn
from t
)
select * from data where rn = 1;
If there are other types this would work as well although you would want to tweak it if there are more than two per part.
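As a quick check of the row_number() approach, here is a minimal sketch using Python's built-in sqlite3 module with a subset of the question's rows. It assumes a SQLite build with window-function support (3.25 or later) and a table named `t` with columns `part_number`, `type`, and `cost`, as in the query above; the costs are stored as plain numbers rather than currency strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (part_number text, type text, cost real)")
conn.executemany("insert into t values (?, ?, ?)", [
    ("Part 1", "Manufactured", 1.00), ("Part 1", "Purchased", 0.56),
    ("Part 4", "Manufactured", 1.00), ("Part 4", "Purchased", 1.50),
    ("Part 5", "Manufactured", 1.65),
])

# Rank rows within each part so that 'Purchased' sorts first, then keep rank 1.
rows = conn.execute("""
    with data as (
        select t.*, row_number() over (
            partition by part_number
            order by case when type = 'Purchased' then 1 else 2 end) as rn
        from t
    )
    select part_number, type, cost from data where rn = 1
    order by part_number
""").fetchall()

for row in rows:
    print(row)
```

Parts 1 and 4 come back as Purchased, while Part 5, which has no Purchased row, falls back to Manufactured.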
One method uses not exists. Assuming you have at most two rows per part:
select t.*
from t
where t.type = 'Purchased' or
(t.type = 'Manufactured' and
not (exists (select 1 from t t2 where t2.part_number = t.part_number and t2.type = 'Purchased')
)
);
There are other fun ways to handle this. For instance, an aggregation approach:
select part_number,
max(type) as type,
coalesce(max(case when type = 'Purchased' then cost end),
max(cost)
) as cost
from t
group by part_number;
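The not exists and aggregation variants can be compared side by side. The following is a rough sketch using Python's built-in sqlite3 module and a subset of the question's rows; the table name `t` follows the queries above. Note that `max(type)` only picks 'Purchased' because that string happens to sort after 'Manufactured', so the aggregation variant depends on these two particular type names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (part_number text, type text, cost real)")
conn.executemany("insert into t values (?, ?, ?)", [
    ("Part 3", "Manufactured", 0.36), ("Part 3", "Purchased", 0.16),
    ("Part 7", "Manufactured", 0.15),
    ("Part 9", "Manufactured", 1.20), ("Part 9", "Purchased", 0.80),
])

# Variant 1: keep Purchased rows, plus Manufactured rows with no Purchased sibling.
not_exists = conn.execute("""
    select t.part_number, t.type, t.cost
    from t
    where t.type = 'Purchased'
       or (t.type = 'Manufactured'
           and not exists (select 1 from t t2
                           where t2.part_number = t.part_number
                             and t2.type = 'Purchased'))
    order by t.part_number
""").fetchall()

# Variant 2: one row per part via aggregation.
aggregated = conn.execute("""
    select part_number,
           max(type) as type,
           coalesce(max(case when type = 'Purchased' then cost end),
                    max(cost)) as cost
    from t
    group by part_number
    order by part_number
""").fetchall()

print(not_exists)
print(aggregated)
```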

SQL: efficiently get the last record

I have a table order which contains order date.
WarehouseId | OrderId | ItemId | OrderDate
-------------------------------------------
1 | 1 | 1 | 2016-08-01
1 | 2 | 2 | 2016-08-02
1 | 3 | 5 | 2016-08-10
2 | 1 | 1 | 2016-08-05
3 | 1 | 6 | 2016-08-06
(table is simplified and only shown required fields)
How can I efficiently select the last order for a particular warehouse? I am currently doing:
SELECT TOP 1 * FROM tblOrder WHERE WarehouseId = 1 ORDER BY OrderDate DESC
My concern is, when I have a million (or more) orders for particular warehouse, by doing sorting and select the first record, it will be too slow (I think?).
Is there any more efficient way to select the last order record?
Thanks
If you're going to do it a lot, you could consider setting an index on the OrderDate field. That will speed things up (but be aware it might have an impact on other queries against this table - it's a complicated topic, talk to a DBA!).
Otherwise, your query is fine, unless you're worried about the ordering when there are identical dates, in which case you should decide on a secondary field to order by as well, such as OrderID (which you suggested in the comments).
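To experiment with the indexing advice, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server, so `LIMIT 1` replaces `TOP 1`. The index name `ix_order_warehouse_date` and the choice of a composite index on (WarehouseId, OrderDate, OrderId) are assumptions for illustration, not something the answer prescribes; the OrderId tie-breaker follows the suggestion about identical dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table tblOrder (
        WarehouseId integer, OrderId integer, ItemId integer, OrderDate text);
    insert into tblOrder values
        (1, 1, 1, '2016-08-01'), (1, 2, 2, '2016-08-02'), (1, 3, 5, '2016-08-10'),
        (2, 1, 1, '2016-08-05'), (3, 1, 6, '2016-08-06');
    -- Hypothetical index: filter column first, then the sort columns.
    create index ix_order_warehouse_date
        on tblOrder (WarehouseId, OrderDate, OrderId);
""")

query = """
    select WarehouseId, OrderId, ItemId, OrderDate
    from tblOrder
    where WarehouseId = ?
    order by OrderDate desc, OrderId desc
    limit 1
"""

last_order = conn.execute(query, (1,)).fetchone()
print(last_order)

# Print the plan so you can check whether the engine uses the index.
for step in conn.execute("explain query plan " + query, (1,)):
    print(step[-1])
```

Whether such an index pays off on a real table depends on the write load and the other queries against it, so treat this only as a way to inspect the plan.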

SQL: Bug in Joining two tables

I have an item table from which I want to get the sum of item quantity.
Query:
Select item_id, Sum(qty) from item_tbl group by item_id
Result:
==================
| ID | Quantity |
===================
| 1 | 10 |
| 2 | 20 |
| 3 | 5 |
| 4 | 20 |
The second table is invoice table from which i am getting the item quantity which is sold. I am joining these two tables as
Query:
Select item_tbl.item_id, Sum(item_tbl.qty) as [item_qty],
-isnull(Sum(invoice.qty),0) as [invoice_qty]
from item_tbl
left join invoice on item_tbl.item_id = invoice.item_id group by item_tbl.item_id
Result:
=================================
| ID | item_qty | invoice_qty |
=================================
| 1 | 10 | -5 |
| 2 | 20 | -20 |
| 3 | 10 | -25 | <------ item_qty raised from 5 to 10 ??
| 4 | 20 | -20 |
I don't know if i am joining these tables in right way. Because i want to get everything from item table and available things from invoice table to maintain the inventory. So i use left join. Help please..
Modification
when i added group by item_id, qty i got this:
=================================
| ID | item_qty | invoice_qty |
=================================
| 1 | 10 | -5 |
| 2 | 20 | -20 |
| 3 | 5 | -5 |
| 3 | 5 | -20 |
| 4 | 20 | -20 |
As it's a view, the ID is repeated. What should I do to avoid this?
Clearing things up, my answer from the comments explained:
While using left join operation (A left join B) - a record will be created for every matching B record to an A record, also - a record will be created for any A record that has no matching B record, using null values wherever needed to complement the fields from B.
I would advise reading up on Using Joins in SQL when approaching such problems.
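The row multiplication described above can be reproduced in a few lines. This is a minimal sketch using Python's built-in sqlite3 module, so `ifnull` stands in for SQL Server's `isnull`; the invoice rows are an assumption chosen to reproduce the totals shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table item_tbl (item_id integer, qty integer);
    create table invoice (item_id integer, qty integer);
    insert into item_tbl values (1, 10), (2, 20), (3, 5), (4, 20);
    -- Item 3 has two invoice rows, so its item row appears twice after the join.
    insert into invoice values (1, 5), (2, 20), (3, 5), (3, 20), (4, 20);
""")

# The raw join result for item 3: one output row per matching invoice row.
joined = conn.execute("""
    select item_tbl.item_id, item_tbl.qty, invoice.qty
    from item_tbl left join invoice on item_tbl.item_id = invoice.item_id
    where item_tbl.item_id = 3
    order by invoice.qty
""").fetchall()
print(joined)

# Summing over that result counts item 3's qty of 5 twice.
naive = conn.execute("""
    select item_tbl.item_id, sum(item_tbl.qty), -ifnull(sum(invoice.qty), 0)
    from item_tbl left join invoice on item_tbl.item_id = invoice.item_id
    group by item_tbl.item_id
    order by item_tbl.item_id
""").fetchall()
print(naive)
```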
Below are 2 possible solutions, using different assumptions.
Solution A
Without any assumptions regarding primary key:
We have to sum up the item quantity column to determine the total quantity, resulting in two sums that need to be performed, I would advise using a sub query for readability and simplicity.
select item_tbl.item_id, Sum(item_tbl.qty) as [item_qty], -isnull(Sum(invoice_grouped.qty),0) as [invoice_qty]
from item_tbl left join
(select invoice.item_id as item_id, Sum(invoice.qty) as qty from invoice group by item_id) invoice_grouped
on (invoice_grouped.item_id = item_tbl.item_id)
group by item_tbl.item_id
Solution B
Assuming item_id is primary key for item_tbl:
Now we know we can rely on the fact that there is only one quantity for each item_id, so we can do without the sub query by selecting any (max) of the item quantities in the join result, resulting in a quicker execution plan.
select item_tbl.item_id, Max(item_tbl.qty) as [item_qty], -isnull(Sum(invoice.qty),0) as [invoice_qty]
from item_tbl left join invoice on (invoice.item_id = item_tbl.item_id)
group by item_tbl.item_id
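Both solutions can be checked against the same sample data. The sketch below uses Python's built-in sqlite3 module, with `ifnull` in place of `isnull`; the invoice rows and the extra item 5 (which has no invoices) are hypothetical and added only to exercise the null handling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table item_tbl (item_id integer primary key, qty integer);
    create table invoice (item_id integer, qty integer);
    insert into item_tbl values (1, 10), (2, 20), (3, 5), (4, 20), (5, 7);
    insert into invoice values (1, 5), (2, 20), (3, 5), (3, 20), (4, 20);
""")

# Solution A: aggregate the invoices first, then join one row per item.
solution_a = conn.execute("""
    select item_tbl.item_id, sum(item_tbl.qty), -ifnull(sum(g.qty), 0)
    from item_tbl left join
         (select item_id, sum(qty) as qty from invoice group by item_id) g
         on g.item_id = item_tbl.item_id
    group by item_tbl.item_id
    order by item_tbl.item_id
""").fetchall()

# Solution B: item_id is unique, so max(qty) undoes the row duplication.
solution_b = conn.execute("""
    select item_tbl.item_id, max(item_tbl.qty), -ifnull(sum(invoice.qty), 0)
    from item_tbl left join invoice on invoice.item_id = item_tbl.item_id
    group by item_tbl.item_id
    order by item_tbl.item_id
""").fetchall()

print(solution_a)
print(solution_b)
```

On this data both queries report item 3 with an item quantity of 5 and an invoice quantity of -25.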
If your database design is following the common rules, item_tbl.item_id must be unique.
So just change your query:
Select item_tbl.item_id, item_tbl.qty as [item_qty],
-isnull(Sum(invoice.qty),0) as [invoice_qty]
from item_tbl
left join invoice on item_tbl.item_id = invoice.item_id group by item_tbl.item_id, item_tbl.qty

Randomly Populating Foreign Key In Sample Data Set

I'm generating test data for a new database, and I'm having trouble populating one of the foreign key fields. I need to create a relatively large number (1000) of entries in a table (SurveyResponses) that has a foreign key to a table with only 6 entries (Surveys)
The database already has a Schools table that has a few thousand records. For argument's sake, let's say it looks like this:
Schools
+----+-------------+
| Id | School Name |
+----+-------------+
| 1 | PS 1 |
| 2 | PS 2 |
| 3 | PS 3 |
| 4 | PS 4 |
| 5 | PS 5 |
+----+-------------+
I'm creating a new Survey table. It will only have about 3 rows.
Survey
+----+-------------+
| Id | Col2 |
+----+-------------+
| 1 | 2014 Survey |
| 2 | 2015 Survey |
| 3 | 2016 Survey |
+----+-------------+
SurveyResponses simply ties a school to a survey.
Survey Responses
+----+----------+----------+
| Id | SchoolId | SurveyId |
+----+----------+----------+
| 1 | 1 | 1 |
| 2 | 2 | 2 |
| 3 | 3 | 1 |
| 4 | 4 | 3 |
| 5 | 5 | 2 |
+----+----------+----------+
Populating the SurveyId field is what's giving me the most trouble. I can randomly select 1000 Schools, but I haven't figured out a way to generate 1000 random SurveyIds. I've been trying to avoid a while loop, but maybe that's the only option?
I've been using Red Gate SQL Data Generator to generate some of my test data, but in this case I'd really like to understand how this can be done with raw SQL.
Here is one way, using a correlated subquery to get a random survey associated with each school:
select s.schoolid,
(select top 1 surveyid
from surveys
order by newid()
) as surveyid
from schools s;
Note: This doesn't seem to work. Here is a SQL Fiddle showing the non-workingness. I am quite surprised it doesn't work, because newid() should be a
EDIT:
If you know the survey ids have no gaps and start with 1, you can do:
select 1 + abs(checksum(newid()) % 3) as surveyid
I did check that this does work.
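The modulo trick can be tried outside SQL Server as well. This is a rough sketch using Python's built-in sqlite3 module, where SQLite's `random()` stands in for `checksum(newid())`; like the query above, it assumes the survey ids are exactly 1, 2, and 3 with no gaps, and the `schools` table layout is simplified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table schools (id integer primary key, school_name text);
    insert into schools values
        (1, 'PS 1'), (2, 'PS 2'), (3, 'PS 3'), (4, 'PS 4'), (5, 'PS 5');
""")

# random() % 3 lies in -2..2, so abs(...) gives 0..2 and adding 1 gives 1..3.
rows = conn.execute("""
    select s.id as schoolid, 1 + abs(random() % 3) as surveyid
    from schools s
    order by s.id
""").fetchall()

for schoolid, surveyid in rows:
    print(schoolid, surveyid)
```

The output differs from run to run; the only guarantee is that each school gets a survey id between 1 and 3.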
EDIT II:
This appears to be overly aggressive optimization (in my opinion). Correlating the query appears to fix the problem. So, something like this should work:
select s.schoolid,
(select top 1 surveyid
from surveys s2
where s2.surveyid = s.schoolid or s2.surveyid <> s.schoolid -- nonsensical condition to prevent over optimization
order by newid()
) as surveyid
from schools s;
Here is a SQL Fiddle demonstrating this.