Joining a table to two one-to-many relationship tables in SQL Server - sql

Happy Friday folks,
I'm trying to write an SSRS report displaying data from three (actually about 12, but only three relevant) tables that have awkward relationships, and the SQL query behind the data is proving difficult.
There are three entities involved - a Purchase Order, a Sales Order, and a Delivery. The problem is that a Purchase Order can have many sales orders, and also many deliveries which are NOT linked to the sales orders...that would be too easy.
Both the Sales Order and Delivery tables can be linked to the Purchase Order table by foreign keys and an intermediate table each.
Basically, I need to list each Purchase Order with its sales orders and its deliveries side by side, with NULLs for any fields that don't apply, so that the output reads correctly in SSRS and to a human. For example, for a purchase order with 2 sales orders and 4 delivery dates:
PO SO Delivery
1234 ABC 05/10
1234 DEF 09/10
1234 NULL 10/12
1234 NULL 14/12
The above (when grouped by PO) will tell the users there are two sales orders and four (unlinked) delivery dates.
Likewise if there are more SOs than deliveries, we need NULLs in the Delivery column;
PO SO Delivery
1234 ABC 03/08
1234 DEF NULL
1234 GHI NULL
1234 JKL NULL
Above would be the case with 4 SOs and one delivery date.
Using Left Outer joins alone gives too much duplication - in this case 8 rows, as it gives 4 delivery dates for each match on the sales order;
PO SO Delivery
1234 ABC 05/10
1234 ABC 09/10
1234 ABC 10/12
1234 ABC 14/12
1234 DEF 05/10
1234 DEF 09/10
1234 DEF 10/12
1234 DEF 14/12
It's fine that the PO column is duplicated as SSRS can visually group that - but the SO/Delivery fields can't be allowed to duplicate as this can't be got rid of in the report - if I group the column in SSRS by SO then it still spits out 4 delivery dates for each one.
The only situation where our query works nicely is when there is just one SO per PO. In that case the single PO and SO numbers are duplicated together for x deliveries and can both be neatly grouped in SSRS. Unfortunately this is a rare occurrence in the data.
I've thought of trying to use some sort of windowing function or CROSS APPLY but both fall down as they will repeat for every PO number listed and end up spitting out too much data.
I'm at the point of thinking this just isn't set-based enough to be doable in SQL. I know the data is horrible.
Any help much appreciated.
EDIT - basic sqlfiddle link to the table schemas. Omitted many columns which aren't relevant. http://sqlfiddle.com/#!2/5ba16
Example data...
Purchase Order
PO_Number Style
1001 Black work boots
1002 Green hat
1006 Red Scarf
Sales Order
Sales_order_number PO_number Qty Retailer
A100-21 1001 15 Walmart
A100-22 1001 29 Walmart
A200-31 1006 1000 Asda
Delivery
Delivery_ID Delivery_Date PO_number
1543285 10/05/2014 1001
1543286 12/05/2014 1001
1543287 17/05/2014 1001
1543288 21/05/2014 1002

If you assign row numbers to the elements in salesorders and deliveries, you can link on that.
Something like this
declare @salesorders table (po int, so varchar(10))
declare @deliveries table (po int, delivery date)
declare @purchaseorders table (po int)
insert @purchaseorders values (123),(456)
insert @salesorders values (123,'a'),(123,'b'),(456,'c')
insert @deliveries values (123,'2014-1-1'),(456,'2014-2-1'),(456,'2014-2-1')
-- Number the sales orders and the deliveries within each PO, then pair them up by position
select *
from
(
    select numbers.number, p.po, so.so, d.delivery
    from @purchaseorders p
    -- spt_values with type='p' is used here as a ready-made list of consecutive integers
    cross join (select number from master..spt_values where type='p') numbers
    left join (select *, ROW_NUMBER() over (partition by po order by so) sor from @salesorders) so
        on p.po = so.po and numbers.number = so.sor
    left join (select *, ROW_NUMBER() over (partition by po order by delivery) dor from @deliveries) d
        on p.po = d.po and numbers.number = d.dor
) v
-- keep only the positions that matched a sales order or a delivery
where so is not null or delivery is not null
order by po, number
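To see the row-number pairing on the question's own example (one PO with 2 sales orders and 4 deliveries), here is a minimal sketch that runs the same idea through Python's built-in sqlite3 module rather than SQL Server. It is a variation, not the query above: instead of master..spt_values it builds the list of positions from the row numbers themselves. The table names, the CTE names (numbered_so, numbered_d, slots) and the ISO date strings are assumptions made for the illustration, and it needs an SQLite build with window functions (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE salesorders (po INTEGER, so TEXT);
CREATE TABLE deliveries  (po INTEGER, delivery TEXT);
-- Hypothetical sample rows: PO 1234 with 2 sales orders and 4 deliveries
INSERT INTO salesorders VALUES (1234, 'ABC'), (1234, 'DEF');
INSERT INTO deliveries  VALUES (1234, '2014-10-05'), (1234, '2014-10-09'),
                               (1234, '2014-12-10'), (1234, '2014-12-14');
""")

query = """
WITH numbered_so AS (
    -- position of each sales order within its PO
    SELECT po, so, ROW_NUMBER() OVER (PARTITION BY po ORDER BY so) AS rn
    FROM salesorders
),
numbered_d AS (
    -- position of each delivery within its PO
    SELECT po, delivery, ROW_NUMBER() OVER (PARTITION BY po ORDER BY delivery) AS rn
    FROM deliveries
),
slots AS (
    -- every (po, position) that is used by either side
    SELECT po, rn FROM numbered_so
    UNION
    SELECT po, rn FROM numbered_d
)
SELECT slots.po, s.so, d.delivery
FROM slots
LEFT JOIN numbered_so s ON s.po = slots.po AND s.rn = slots.rn
LEFT JOIN numbered_d  d ON d.po = slots.po AND d.rn = slots.rn
ORDER BY slots.po, slots.rn
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

This should give four rows for PO 1234, with the sales order column NULL (None in Python) on the last two, which is the shape asked for in the question.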

Related

Newbie struggling to join 3 tables

I have 3 tables of purchases in the following format:
date | company_id | apple_txn_amt
date | company_id | orange_txn_amt
date | company_id | pear_txn_amt
There are multiple purchases/sales daily for many companies. I'm trying to join and group so there is only 1 date per company along with total fruit balance:
date | company_id | total_apple_balance | total_orange_balance | total_pear_balance
I have built a query for a similar case earlier, and used 2 joins. But this was for only one company's data so I was only joining on date=date for each table. Process for each table was: gather buys, sells, union those two, union to a new table with generate_series() to insert 0s for days missing, calculate daily delta, and group by day to have a running total. Then something like:
SELECT
apple.day,
apple.total,
orange.total,
pear.total,
(apple.total + orange.total + pear.total) AS total_fruit
FROM apple
JOIN orange ON orange.date = apple.date
JOIN pear ON pear.date = apple.date
ORDER BY day
It's like I need to JOIN ON date and company id but from what I can tell this isn't possible.
Should I approach this in a different way?
Sure, you can add company_id to the join conditions, like this:
SELECT
apple.day,
apple.total,
orange.total,
pear.total,
(apple.total + orange.total + pear.total) AS total_fruit
FROM apple
JOIN orange ON orange.date = apple.date AND orange.company_id = apple.company_id
JOIN pear ON pear.date = apple.date AND pear.company_id = apple.company_id
ORDER BY apple.day
That said, unless circumstances require it, this database design isn't ideal.
Rather than 3 tables, you would normally have a single table with the fruit type as another column to differentiate the rows.
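As a rough illustration of the single-table design, here is a small sketch using Python's sqlite3 module. The table name fruit_txn, its columns and the sample rows are all made up for the example; the question's real schema may differ. It produces one row per date and company with a total per fruit, using conditional aggregation instead of joins. Note that these are per-day totals, not the running balances described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical single table: one row per transaction, fruit type as a column
CREATE TABLE fruit_txn (
    txn_date   TEXT,
    company_id INTEGER,
    fruit      TEXT,
    amount     INTEGER
);
INSERT INTO fruit_txn VALUES
    ('2020-01-01', 1, 'apple',   5),
    ('2020-01-01', 1, 'apple',  -2),
    ('2020-01-01', 1, 'orange',  4),
    ('2020-01-01', 2, 'pear',    7),
    ('2020-01-02', 1, 'pear',    1);
""")

query = """
SELECT txn_date,
       company_id,
       SUM(CASE WHEN fruit = 'apple'  THEN amount ELSE 0 END) AS apple_total,
       SUM(CASE WHEN fruit = 'orange' THEN amount ELSE 0 END) AS orange_total,
       SUM(CASE WHEN fruit = 'pear'   THEN amount ELSE 0 END) AS pear_total,
       SUM(amount)                                            AS total_fruit
FROM fruit_txn
GROUP BY txn_date, company_id      -- one output row per date and company
ORDER BY txn_date, company_id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because everything lives in one table, a date on which a company traded only one kind of fruit still shows up, with 0 for the others, which an inner join across three tables would drop.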

POSTGRESQL - Finding specific product when

I've attempted to write a query but I've not managed to get it working correctly.
I'm attempting to retrieve where a specific product has been bought but where it also has been bought with other products. In the case below, I want to find where product A01 has been bought but also when it was bought with other products.
Data (extracted from tables for illustration):
Order | Product
123456 | A01
123457 | A01
123457 | B02
123458 | C03
123459 | A01
123459 | C03
Query which will return all orders with product A01 without showing other products:
SELECT
O.NUMBER,
O.DATE,
P.NUMBER
FROM
ORDERS O
JOIN PRODUCTS P on P.ID = O.ID
WHERE
P.NUMBER = 'A01'
I've tried to create a sub query which brings back just orders of product A01 but I don't know how to place it in the query for it to return all orders containing product A01 as well as any other product ordered with it.
Any help on this would be greatly appreciated.
Thanks in advance.
You can use a conditional SUM to detect whether an order group contains one or more 'A01' rows:
CREATE TABLE orders
("Order" int, "Product" varchar(3))
;
INSERT INTO orders
("Order", "Product")
VALUES
(123456, 'A01'),
(123457, 'A01'),
(123457, 'B02'),
(123458, 'C03'),
(123459, 'A01'),
(123459, 'C03')
;
SELECT "Order"
FROM orders
GROUP BY "Order"
HAVING SUM(CASE WHEN "Product" = 'A01' THEN 1 ELSE 0 END) > 0
I appreciated Juan's including the DDL to create the database on my system. By the time I saw it, I'd already done all the same work, except that I got around the reserved word problem by naming that field Order1.
Sadly, I don't think either of the offered queries gave the wanted result on my system. I used MySQL.
The first one returned the A01 lines of the two orders on which other products were ordered too. I took Alex's purpose to include seeing all items of all orders that included A01. (Perhaps he wants to tell future customers what other products other customers have ordered with A01, and generate sales that way.)
The second one returned the three A01 lines.
Maybe Alex wants:
select *
from orders
where Order1 in (select Order1
from orders
where Product = 'A01')
It outputs all lines of all orders that include A01. The subquery makes a list of all orders with A01. The first query returns all lines of those orders.
In a big database, you might not want to run two queries, but this is the only way I see to get the result I understood Alex wanted. If that is what he wanted, he would have to run a second query once armed with output from the queries offered, so there's no real gain.
Good discussion. Thanks to all!
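For anyone who wants to try the IN-subquery approach on the question's sample data, here is a minimal sketch using Python's sqlite3 module. The table name order_lines and the column names order_no and product are assumptions chosen to avoid the reserved word; the database is SQLite rather than PostgreSQL or MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_lines (order_no INTEGER, product TEXT);
-- Sample data from the question
INSERT INTO order_lines VALUES
    (123456, 'A01'),
    (123457, 'A01'),
    (123457, 'B02'),
    (123458, 'C03'),
    (123459, 'A01'),
    (123459, 'C03');
""")

query = """
SELECT order_no, product
FROM order_lines
WHERE order_no IN (SELECT order_no          -- orders that contain A01
                   FROM order_lines
                   WHERE product = 'A01')
ORDER BY order_no, product
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Order 123458 is left out because it has no A01 line, while every line of the other three orders is returned, including B02 and C03.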
Use a GROUP BY clause along with HAVING to find the orders that contain more than one product, like
select "order"
from data
group by "order"
having count(distinct product) > 1;

DB2 Select from two tables when one table requires sum

In a DB2 Database, I want to do the following simple mathematics using a SQL query:
AvailableStock = SupplyStock - DemandStock
SupplyStock is stored in 1 table in 1 row, let's call this table the Supply table.
So the Supply table has this data:
ProductID | SupplyStock
---------------------
109 10
244 7 edit: exclude this product from the search
DemandStock is stored in a separate table Demand, where demand is logged as each customer logs demand during a customer order journey. Example data from the Demand table:
ProductID | DemandStock
------------------------
109 1
244 4 edit: exclude this product
109 6
109 2
So in our heads, if I want to calculate the AvailableStock for product '109', Supply is 10, Demand for product 109 totals to 9, and so Available stock is 1.
How do I do this in one select query in DB2 SQL?
The knowledge I have so far of some of the imagined steps in PseudoCode:
I select SupplyStock where product ID = '109'
I select sum(DemandStock) where product ID = '109'
I subtract the summed DemandStock from SupplyStock
I present this as a resulting AvailableStock
The results will look like this:
Product ID | AvailableStock
109 1
I'd love to get this selected in one SQL select query.
Edit: I've since received an answer (that was almost perfect) and realised the question missed out some information.
This information:
We need to exclude data from products we don't want to select data for, and we also need to specifically select product 109.
My apologies, this was omitted from the original question.
I've since added a 'where' to select the product and this works for me. But for future sake, perhaps the answer should include this information too.
You do this using a join to bring the tables together and group by to aggregate the results of the join:
select s.ProductId, s.SupplyStock, sum(d.DemandStock),
(s.SupplyStock - sum(d.DemandStock)) as Available
from Supply s left join
Demand d
on s.ProductId = d.ProductId
where s.ProductId = 109
group by s.ProductId, s.SupplyStock;
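Here is a small check of that join-and-group query against the question's sample data, run through Python's sqlite3 module rather than DB2, so treat it as a sketch of the logic only. Two things are additions of mine and not part of the answer above: the COALESCE, which turns a missing demand into 0 instead of NULL, and product 300, a made-up row with supply but no demand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Supply (ProductID INTEGER, SupplyStock INTEGER);
CREATE TABLE Demand (ProductID INTEGER, DemandStock INTEGER);
-- Sample data from the question, plus hypothetical product 300 with no demand
INSERT INTO Supply VALUES (109, 10), (244, 7), (300, 5);
INSERT INTO Demand VALUES (109, 1), (244, 4), (109, 6), (109, 2);
""")

query = """
SELECT s.ProductID,
       s.SupplyStock,
       COALESCE(SUM(d.DemandStock), 0)                 AS TotalDemand,
       s.SupplyStock - COALESCE(SUM(d.DemandStock), 0) AS AvailableStock
FROM Supply s
LEFT JOIN Demand d ON d.ProductID = s.ProductID
WHERE s.ProductID = ?
GROUP BY s.ProductID, s.SupplyStock
"""

row_109 = conn.execute(query, (109,)).fetchone()
row_300 = conn.execute(query, (300,)).fetchone()
print(row_109)
print(row_300)
```

For product 109 this gives supply 10, demand 9 and available stock 1, matching the arithmetic in the question. Without the COALESCE, product 300 would come back with NULL as its available stock.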

Sql views vs jdbc select-join, where to abstract?

I have 3 tables (see below), Table A describes a product, Table B holds inventory information for different dates, and Table C holds the price of each product for different dates.
Table A
------------------
product_id product_name
1 book
2 pencil
3 stapler
... ...
Table B
------------------
product_id date_id quantity
1 2012-12-01 100
1 2012-12-02 110
1 2012-12-03 90
2 2012-12-01 98
2 2012-12-02 50
... ... ...
Table C
-------------------
product_id date_id price
1 2012-12-01 10.29
1 2012-12-02 12.12
2 2012-12-02 32.98
3 2012-12-01 10.12
In many parts of my Java application I would like to know the dollar value of each product, so I end up running the following query
select
a.product_name,
b.date_id,
b.quantity * c.price as total
from A a
join B b on a.product_id = b.product_id
join C c on a.product_id = c.product_id and b.date_id = c.date_id
where b.date_id = ${date_input}
I had an idea today that I could make the query above be a view (minus the date condition), then query the view for a specific date so my queries would look like
select * from view where date_id = ${date_input}
I'm not sure where the appropriate level of abstraction for such logic is. Should it be in java code (read from a pref file), or encoded into a view in the database?
The only reason I don't want to put it as a view is that as time goes by the join will become expensive as there will be more and more dates to cover, and I'm usually only interested in the past month's worth of data. Perhaps a stored proc is better? Would that be a good place to abstract this logic?
If views are implemented correctly, you should never see worse performance in a case like this, where the query would be the same without the view. More dates will not make it slower just because you have this view.
Make the view, it is the correct abstraction in this case.
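A quick way to convince yourself that the view is only a named query is to compare it with the hand-written join. The sketch below does that with Python's sqlite3 module and the sample rows from the question. The names table_a, table_b, table_c and product_value are assumptions for the example, and SQLite stands in for whatever database the application really uses, so it says nothing about that database's query planner.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (product_id INTEGER, product_name TEXT);
CREATE TABLE table_b (product_id INTEGER, date_id TEXT, quantity INTEGER);
CREATE TABLE table_c (product_id INTEGER, date_id TEXT, price REAL);
INSERT INTO table_a VALUES (1, 'book'), (2, 'pencil'), (3, 'stapler');
INSERT INTO table_b VALUES (1, '2012-12-01', 100), (1, '2012-12-02', 110),
                           (1, '2012-12-03', 90),  (2, '2012-12-01', 98),
                           (2, '2012-12-02', 50);
INSERT INTO table_c VALUES (1, '2012-12-01', 10.29), (1, '2012-12-02', 12.12),
                           (2, '2012-12-02', 32.98), (3, '2012-12-01', 10.12);

-- The join from the question, minus the date condition, stored as a view
CREATE VIEW product_value AS
SELECT a.product_name, b.date_id, b.quantity * c.price AS total
FROM table_a a
JOIN table_b b ON a.product_id = b.product_id
JOIN table_c c ON a.product_id = c.product_id AND b.date_id = c.date_id;
""")

date_input = "2012-12-02"

# Query through the view, filtering by date from the application side
via_view = conn.execute(
    "SELECT product_name, date_id, total FROM product_value "
    "WHERE date_id = ? ORDER BY product_name",
    (date_input,),
).fetchall()

# The same join written out by hand, as the application does today
direct = conn.execute(
    "SELECT a.product_name, b.date_id, b.quantity * c.price AS total "
    "FROM table_a a "
    "JOIN table_b b ON a.product_id = b.product_id "
    "JOIN table_c c ON a.product_id = c.product_id AND b.date_id = c.date_id "
    "WHERE b.date_id = ? ORDER BY a.product_name",
    (date_input,),
).fetchall()

print(via_view == direct)
print(via_view)
```

The date filter stays a parameter in the Java code either way; only the join logic moves into the database.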

SQL SUM with Repeating Sub Entries - Best Practice?

I hit this issue regularly but here is an example....
I have Order and Delivery tables. Each order can have one to many deliveries.
I need to report totals based on the Order Table but also show deliveries line by line.
I can write the SQL and associated Access Report for this with ease ....
SELECT xxx
FROM
Order
LEFT OUTER JOIN
Delivery on Delivery.OrderNO = Order.OrderNo
until I get to the summing element. I obviously only want to sum each Order once, not the 1-many times there are deliveries for that order.
e.g. The SQL might return the following based on 2 Orders (ignore the banalness of the report, this is very much simplified)
Region OrderNo Value Delivery Date
North 1 £100 12-04-2012
North 1 £100 14-04-2012
North 2 £73 01-05-2012
North 2 £73 03-05-2012
North 2 £73 07-05-2012
South 3 £50 23-04-2012
I would want to report:
Total Sales North - £173
Delivery 12-04-2012
Delivery 14-04-2012
Delivery 01-05-2012
Delivery 03-05-2012
Delivery 07-05-2012
Total Sales South - £50
Delivery 23-04-2012
The bit I'm referring to is the calculation of the £173 and £50; the first of these obviously shouldn't be £419!
In the past I've used things like MAX (for a given Order) but that seems like a fudge.
Surely there must be a regular answer to this seemingly common problem but I can't find one.
I don't necessarily need the code - just a helpful point in the right direction.
Many thanks,
Chris.
A ROLLUP operator may not look pretty. However, it would do the regular aggregates that you see now, and it would also show the subtotals of the orders. This is what you're looking for.
SELECT xxx
FROM
Order
LEFT OUTER JOIN
Delivery on Delivery.OrderNO = Order.OrderNo
GROUP BY xxx
WITH ROLLUP;
I'm not exactly sure how the rest of your query is set up, but it would look something like this:
Region OrderNo Value Delivery Date
North 1 £100 12-04-2012
North 1 £100 14-04-2012
North 2 £73 01-05-2012
North 2 £73 03-05-2012
North 2 £73 07-05-2012
NULL NULL £419 NULL
I believe what you want is called a windowing function for your aggregate operation. It looks like the following:
SELECT xxx, SUM(Value) OVER (PARTITION BY Order.Region) as OrderTotal
FROM
Order
LEFT OUTER JOIN
Delivery on Delivery.OrderNO = Order.OrderNo
Here's the MSDN article. The PARTITION BY tells the SUM to be done separately for each distinct Order.Region.
Edit: I just noticed that I missed what you said about orders being counted multiple times. One thing you could do is SUM() the values before joining, as a CTE (guessing at your schema a bit):
WITH RegionOrders AS (
SELECT Region, OrderNo, Value, SUM(Value) OVER (PARTITION BY Region) AS RegionTotal
FROM Order
)
SELECT RO.Region, RO.OrderNo, RO.Value, D.DeliveryDate, RO.RegionTotal
FROM RegionOrders RO
INNER JOIN Delivery D on D.OrderNo = RO.OrderNo
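To check the sum-before-joining idea against the numbers in the question, here is a minimal sketch using Python's sqlite3 module. The table and column names (orders, deliveries, region_total and so on) are guesses at the schema, "Order" is avoided because it is a reserved word, and it needs an SQLite build with window functions (3.25 or later). The naive query is included only to reproduce the inflated £419.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders     (region TEXT, order_no INTEGER, value INTEGER);
CREATE TABLE deliveries (order_no INTEGER, delivery_date TEXT);
-- Sample data from the question, with dates written as ISO strings
INSERT INTO orders VALUES ('North', 1, 100), ('North', 2, 73), ('South', 3, 50);
INSERT INTO deliveries VALUES
    (1, '2012-04-12'), (1, '2012-04-14'),
    (2, '2012-05-01'), (2, '2012-05-03'), (2, '2012-05-07'),
    (3, '2012-04-23');
""")

# Summing after the join counts each order once per delivery
naive = conn.execute("""
SELECT o.region, SUM(o.value)
FROM orders o
JOIN deliveries d ON d.order_no = o.order_no
GROUP BY o.region
ORDER BY o.region
""").fetchall()

# Summing per region before the join keeps each order counted once
rows = conn.execute("""
WITH region_orders AS (
    SELECT region, order_no, value,
           SUM(value) OVER (PARTITION BY region) AS region_total
    FROM orders
)
SELECT ro.region, ro.order_no, ro.value, d.delivery_date, ro.region_total
FROM region_orders ro
JOIN deliveries d ON d.order_no = ro.order_no
ORDER BY ro.order_no, d.delivery_date
""").fetchall()

print(naive)
for row in rows:
    print(row)
```

The naive version reports 419 for North, while the second query carries 173 on every North row and 50 on the South row, so the report can print the region total once per group and still list each delivery.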