How to make a query to get average weeks? - sql

I have the following 2 tables, which record every purchase and sale of goods as a datetime:
1- Selling Table
Date_of_Selling
------------------------------------------------------
15-09-2006
25-08-2007
13-08-2009
16-01-2009
22-01-2010
..
..
and here is the 2nd table:
2-Buying Table
Date_of_Buying
----------------------------------------------------
22-09-2004
25-16-2006
01-08-2010
22-08-2010
...
..
..
How can I find the average number of weeks between selling and buying,
using a timestamp function or whatever?
The result should look like: 5.3 weeks between selling and buying.
Thanks in advance.

I think you want to know how fast a product sells, in other words, how many days (or weeks) it takes to sell a product. If so, this SQL will give you the average number of days; divide it by 7 to get weeks. (I hope it works; I didn't try it, I'm just improvising. It assumes both tables have a ProductID column and uses MySQL's DATEDIFF.)
-- Average days between buying and selling, matched on ProductID (MySQL syntax, untested)
SELECT AVG(DATEDIFF(ST.Date_of_Selling, BT.Date_of_Buying)) AS AverageDays
FROM SellingTable ST
INNER JOIN BuyingTable BT
ON ST.ProductID = BT.ProductID
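To make the idea concrete, here is a minimal sketch that runs the same kind of join in an in-memory SQLite database from Python. It is not the asker's actual setup: the ProductID column, the ISO 'YYYY-MM-DD' date format, and the helper name average_weeks are all assumptions made for illustration, and SQLite's julianday() stands in for whatever date-difference function your database offers.

```python
import sqlite3


def average_weeks(pairs):
    """Average weeks between buying and selling.

    pairs: iterable of (product_id, date_of_buying, date_of_selling),
    with dates as ISO 'YYYY-MM-DD' strings (an assumption of this sketch).
    Returns None when there are no matching rows.
    """
    con = sqlite3.connect(":memory:")
    # Hypothetical schema: the original tables show no key column,
    # so ProductID is assumed here to pair a purchase with its sale.
    con.execute("CREATE TABLE BuyingTable (ProductID INTEGER, Date_of_Buying TEXT)")
    con.execute("CREATE TABLE SellingTable (ProductID INTEGER, Date_of_Selling TEXT)")
    for pid, bought, sold in pairs:
        con.execute("INSERT INTO BuyingTable VALUES (?, ?)", (pid, bought))
        con.execute("INSERT INTO SellingTable VALUES (?, ?)", (pid, sold))
    # julianday() returns a day count, so the difference is in days;
    # dividing the average by 7.0 converts it to weeks.
    row = con.execute(
        """
        SELECT AVG(julianday(ST.Date_of_Selling) - julianday(BT.Date_of_Buying)) / 7.0
        FROM SellingTable ST
        INNER JOIN BuyingTable BT ON ST.ProductID = BT.ProductID
        """
    ).fetchone()
    con.close()
    return row[0]
```

For example, one product held for 14 days and another held for 21 days average out to 17.5 days, i.e. 2.5 weeks.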

Related

Having trouble understanding DBMS Query I/O cost calculation

Can someone please explain how they worked out the total IO cost in both queries?
I'll take a stab at it.
I/O cost comes from the number of rows you have to do work on at each step.
PLAN A:
You select all of Product and all of Vendor. That is 7300 rows (7000 products + 300 vendors) you have to do work on. This plan then joins them, so each vendor gets every product, which produces 300 * 7000 = 2100000 rows.
You then have to do work on those 2100000 rows to apply your vendor code filter.
You then have to do work on 7000 rows to apply your state filter, since the previous step filtered the result down to your initial product list.
PLAN B:
There's a typo, no closing parenthesis for IN().
You run your IN first, since your query needs that dataset to complete. You know you have 300 vendors, so you are doing work on 300 vendors to filter them down to 10.
You then select all of your products and all of your Florida vendors, so you are working with 7010 rows (7000 products plus the 10 vendors from the previous step) in order to perform the join, which gives each vendor 7000 products.
10 * 7000 means you are now working with 70000 rows to perform the final filter on v_code.
Total I/O cost is the sum over all processing steps.
I hope that makes sense.
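The arithmetic above can be summed up in a few lines. This is only a sketch of the reasoning in this answer, using the row counts it quotes (7000 products, 300 vendors, 10 Florida vendors); the variable and function names are made up for illustration.

```python
PRODUCTS = 7000
VENDORS = 300
FLORIDA_VENDORS = 10


def total_io_cost(step_costs):
    """Total I/O cost is the sum of the rows touched in each step."""
    return sum(step_costs)


# Plan A: read both tables, filter the full cross join on vendor code,
# then filter the surviving rows (one per product) on state.
plan_a = [
    PRODUCTS + VENDORS,   # 7300 rows read to build the join
    PRODUCTS * VENDORS,   # 2100000 rows scanned by the vendor code filter
    PRODUCTS,             # 7000 rows scanned by the state filter
]

# Plan B: filter vendors on state first, then join and filter on vendor code.
plan_b = [
    VENDORS,                      # 300 rows scanned by the IN subquery
    PRODUCTS + FLORIDA_VENDORS,   # 7010 rows read to build the join
    PRODUCTS * FLORIDA_VENDORS,   # 70000 rows scanned by the vendor code filter
]

plan_a_total = total_io_cost(plan_a)
plan_b_total = total_io_cost(plan_b)
```

With these numbers Plan A comes to 2114300 and Plan B to 77310, which is why filtering the vendors before joining is so much cheaper.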

SQL-sum over dynamic period

I have 2 tables: Customers and Actions, where each customer has a unique ID (which can be found in each table).
Some of the customers became club members at a specific date (the date differs between customers). I'm trying to sum their purchases up to that date, and to get those who purchased more than (for example) 200 before they became club members.
For example, I can have the following customer:
custID purchDate purchAmount
1 2015-05-12 100
1 2015-07-12 150
1 2015-12-29 320
Now, assume that custID=1 became a club member on 2015-12-25; in that case, I'd like to get SUM(purchAmount)=250 (note that I'd like to get this customer because 250>200).
I tried the following:
SELECT cust.custID, SUM(purchAmount)totAmount
FROM customers cust
JOIN actions act
ON cust.custID=act.custID
WHERE act.clubMember=1
AND cust.purchDate<act.clubMemberDate
GROUP BY cust.custID
HAVING totAmount>200;
Is this the right way to "attack" this question, or should I use something like a while loop over clubMemberDate (which, to tell the truth, I don't know how to do)?
I'm working with Teradata.
Your help will be appreciated.
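For what it's worth, a plain join with a date filter and GROUP BY/HAVING is enough here; no loop is needed, because each customer's own clubMemberDate is compared row by row. Below is a minimal sketch that checks this on the sample data. It runs on SQLite from Python rather than Teradata, and it assumes (this is a guess, the question does not show the schema) that clubMember and clubMemberDate live in customers while purchDate and purchAmount live in actions. The function name is hypothetical.

```python
import sqlite3


def big_spenders_before_membership(customers, actions, threshold=200):
    """Return (custID, total) for club members who spent more than
    `threshold` before their own membership date.

    customers: rows of (custID, clubMember, clubMemberDate)
    actions:   rows of (custID, purchDate, purchAmount)
    Dates are ISO 'YYYY-MM-DD' strings, so string comparison orders them.
    """
    con = sqlite3.connect(":memory:")
    # Assumed schema, see the note above the code.
    con.execute("CREATE TABLE customers (custID INTEGER, clubMember INTEGER, clubMemberDate TEXT)")
    con.execute("CREATE TABLE actions (custID INTEGER, purchDate TEXT, purchAmount REAL)")
    con.executemany("INSERT INTO customers VALUES (?, ?, ?)", customers)
    con.executemany("INSERT INTO actions VALUES (?, ?, ?)", actions)
    rows = con.execute(
        """
        SELECT cust.custID, SUM(act.purchAmount) AS totAmount
        FROM customers cust
        JOIN actions act ON cust.custID = act.custID
        WHERE cust.clubMember = 1
          AND act.purchDate < cust.clubMemberDate  -- per-customer cutoff
        GROUP BY cust.custID
        HAVING SUM(act.purchAmount) > ?
        ORDER BY cust.custID
        """,
        (threshold,),
    ).fetchall()
    con.close()
    return rows
```

With the three sample purchases and a membership date of 2015-12-25, the 320 purchase on 2015-12-29 is excluded and customer 1 comes back with a total of 250.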

powerpivot average of an aggregate

I may be missing an obvious solution but I am looking for a way to take the average of a summed value. For example I have profit at an item# level where each item is on a bill as well and I want average of the bills profit.
Item# | Bill# | Profit
1 1 100
2 1 200
1 2 100
2 2 200
If I just take the avg of profit I get 150 but I want the avg of the bill total which would be 300. Is it possible to do this? I was thinking something like Calculate(Average(Profit),Bill# = Bill#) but that is always true?
Thanks in advance!
It's not totally clear how you intend to use your measure, but there are some powerful iterator functions in PowerPivot that do this kind of thing. Assuming your table is called tbl, this formula iterates over each bill# and averages the sum of the profit:
= AVERAGEX(VALUES(tbl[bill#]), CALCULATE(SUM(tbl[profit])))
The first argument simply creates a 'column' of the unique bill#s, and the second sums the profit per bill#. The CALCULATE wrapper is needed so that the SUM is evaluated for the current bill# rather than for the whole table.
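To see what "average of the per-bill sums" means outside of PowerPivot, here is a small sketch in plain Python. It only illustrates the calculation, not how DAX evaluates it; the function name and the (item, bill, profit) row layout are assumptions for the example.

```python
from collections import defaultdict


def average_bill_profit(rows):
    """Average of the per-bill profit totals.

    rows: iterable of (item_number, bill_number, profit).
    Returns None when there are no rows.
    """
    totals = defaultdict(float)
    # Step 1: sum the profit per bill (the inner SUM in the formula).
    for _item, bill, profit in rows:
        totals[bill] += profit
    if not totals:
        return None
    # Step 2: average those sums over the distinct bills (the AVERAGEX part).
    return sum(totals.values()) / len(totals)
```

For the four sample rows, each bill totals 300, so the result is 300 rather than the row-level average of 150.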

Optimal selection for ordering multiple items (parts) from multiple suppliers (vendors)

The task here is to define the optimal (as detailed below) way of ordering items (parts) from suppliers.
The relevant parts of the table schema (with some sample data) are
Items
ID NUMBER
1 Item0001
2 Item0002
3 Item0003
Suppliers
ID NAME DELIVERY DISCOUNT
1 Supplier0001 0 0
2 Supplier0002 0 0.025
3 Supplier0003 20 0
DELIVERY is the delivery charge (in dollars) levied by that supplier on each delivery. DISCOUNT is the settlement discount allowed by that supplier for on-time payment, stored as a fraction (i.e. 0.025 means 2.5% for ID=2 above).
SupplierItems
SUPPLIER_ID ITEM_ID PRICE
1 2 21.67
1 5 45.54
1 7 32.97
This is the many-to-many join between suppliers and items with the price that supplier charges for that item (in dollars). Every item has at least 1 supplier but some have more than one. A supplier may have no items.
PartsRequests
ID ITEM_ID QUANTITY LOCATION_ID ORDER_ID
1 59 4 2 (null)
2 89 5 2 (null)
3 42 4 2 (null)
This table is a request from a field site for parts to be ordered and delivered by the supplier to that site. A delivery of any number of items to a site attracts a delivery charge. When the parts are ordered, the ORDER_ID is inserted into the table so we are only concerned with those where ORDER_ID IS NULL
The question is: what is the optimal way to order these parts for each LOCATION? There are 3 optimal solutions that need to be presented to the user for selection:
1. The combination of orders with the least number of suppliers.
2. The combination of orders with the lowest total cost, i.e. the sum of QUANTITY*PRICE for each item plus the DELIVERY for each order, summed over all orders, ignoring DISCOUNT.
3. As item 2 but accounting for DISCOUNT.
Clearly I need to determine the combinations of orders that are available and then determining the optimal ones becomes trivial but I am a bit stuck on an efficient way to deal with building the combinations.
I have built some SQL fiddles in SQL Server 2008 with random data. This one has 100 items, 10 suppliers and 100 requests. This one has 1000 items, 50 suppliers and 250 requests. The table schema is the same.
Update
I reasoned that the solution had to be recursive, and I built a nice table-valued function to get them, but I ran into the hard limit of 32 nesting levels in SQL Server. I was uncomfortable with it anyway because it hinted more at a procedural-language solution than an RDBMS one.
So I am now playing with CTE recursion.
The root query is:
SELECT DISTINCT
'' SOLUTION_ID
,LOCATION_ID
,SUPPLIER_ID
,(subquery I haven't quite worked out) SOLE_SUPPLIER
FROM PartsRequests pr
INNER JOIN
SupplierItems si ON pr.ITEM_ID=si.ITEM_ID
WHERE pr.ORDER_ID IS NULL
This gets all the suppliers that can supply the required items and is certainly a solution, probably not optimal. The subquery sets a flag if the supplier is the sole supplier of any product required for that location; if so they must be part of any solution.
The recursive part is to remove suppliers one by one by means of CTE.SUPPLIER_ID<>CTE.SUPPLIER_ID and add them if they still cover all the items. The SOLUTION_ID will be a CSV list of the suppliers removed, partly to uniquely identify each solution and partly to check against so I get combinations instead of permutations.
Still working on the details, the purpose of this update was to allow the Community to say "Yay, looks like that will work" or, alternatively "You moron, that won't work because ..."
Thanks
This is a more general answer (as in, not sql) as I think solving this problem will require something more powerful. Your first scenario is to select a minimum number of suppliers. This problem can be seen as a set cover problem as you are trying to cover all demands per site with the suppliers. This problem is already NP-complete.
Your third scenario seems to be basically the same as the second. You just have to take the discount into account in the prices, assuming you pay on time for every order.
The second scenario is at least NP-hard as I see a lot of resemblance with the facility location problem. You are trying to decide which suppliers (facilities) to use (open) to cover your orders (demands) based on their prices and delivery costs (opening costs).
Enumerating your possible solutions seems infeasible as with 10 suppliers, you have 2^10 possibilities of using them, further complicated by the distribution of demands internally.
I would suggest some dynamic programming: first select the suppliers that you have to use (they are the only ones that deliver a specific item), then eliminate some possibilities (e.g. if the cost for supplier A + delivery cost A < cost for supplier B), and then try to expand your set of possible solutions. Linear programming is also a valid train of thought.
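As a rough illustration of the first scenario only (fewest suppliers), the sketch below first fixes the suppliers that are the sole source of some required item and then tries ever larger combinations of the remaining ones. It is plain exhaustive search, not dynamic or linear programming, so it is only workable for a small number of suppliers; it also ignores prices, quantities and delivery charges. The function name and the dictionary layout are assumptions for the example.

```python
from itertools import combinations


def min_supplier_cover(required_items, supplier_items):
    """Smallest set of suppliers that together stock every required item.

    required_items: iterable of item ids needed at one location.
    supplier_items: dict mapping supplier id -> set of item ids it sells.
    Returns a set of supplier ids, or None if some item has no supplier.
    """
    required = set(required_items)

    # Suppliers that are the only source of a required item are in every solution.
    mandatory = set()
    for item in required:
        sources = [s for s, items in supplier_items.items() if item in items]
        if not sources:
            return None
        if len(sources) == 1:
            mandatory.add(sources[0])

    covered = set().union(*(supplier_items[s] for s in mandatory))
    remaining = required - covered
    optional = sorted(s for s in supplier_items if s not in mandatory)

    # Try 0, 1, 2, ... extra suppliers; the first hit uses the fewest.
    for size in range(len(optional) + 1):
        for combo in combinations(optional, size):
            stocked = set().union(*(supplier_items[s] for s in combo))
            if remaining <= stocked:
                return mandatory | set(combo)
    return None
```

For instance, if supplier 1 is the only one selling item 2, it is fixed first, and the search then only has to find the smallest group of other suppliers that covers what supplier 1 does not stock.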

Minimum price selection

The situation looks like this:
We have product 'A123' and we have to remember the lowest price for it.
Prices for one product come from an arbitrary number of shops, and there is no way to tell when shop x will send us their price for 'A123'.
So I have an SQL table with columns:
product_number
price
shop (from which shop this price comes)
An SQL function for updating product price looks like this (this is SQL pseudo-code, syntax doesn't matter):
function update_product(in_shop, in_product_number, in_price)
    select price, shop into productRow from products where product_number = in_product_number;
    if found then
        if (productRow.price > in_price) or (productRow.price < in_price and productRow.shop = in_shop) then
            update row with new price and new shop
        end if;
    else
        insert new product that we didn't have before
    end if;
The (productRow.price < in_price and productRow.shop = in_shop) condition is there to handle a situation like this:
In products table we have
A123 22.5 amazon
then comes info from amazon again:
A123 25 amazon
Thanks to the above condition we update the price to the higher one, which is the correct behavior.
But algorithm fails in this situation: again we have a row in the products table:
A123 22.5 amazon
then comes info from merlin
A123 23 merlin (we don't update because price is higher)
then comes info from amazon
A123 35 amazon
and we update table and now we have:
A123 35 amazon
but this is wrong, because merlin earlier sent a lower price for that product.
Any idea how to avoid this situation?
The only way you are going to solve your problem is to keep track of the price per shop and then return only the lowest current price. For example, you would need a table like the one you already have, but with one row per product and shop, and then select out of it with something like:
select min(price)
from products
where product_number = :my_product
Personally if it were me, I would keep a time stamp of when you receive the product price updates so you would be able to ascertain when you got the update.
To make this work you should maintain a table that contains the following:
Product
Supplier
LatestPrice
Then identify the current best supplier by querying that table. You can do this either when requested or when the table is updated; either way you simplify the problem at the price of a slightly more complex schema and queries.
Additional (following comment):
Ok, this is going to mean that you need to store more data, but you don't have a huge amount of choice. The data is required to solve the problem, so you either: a) have to update prices from all suppliers concurrently and then choose the best price from that snapshot, or b) store the prices as you get them and pick the best price from the data you've got. The former implies a fairly hefty overhead in terms of fetching and processing data, whereas the latter is basically a fairly modest storage problem and something any decent database will cope with easily.
Basically, the problem is that you only store the lowest price from 1 vendor. You have to keep records of prices of all vendors, and use a selection query to select the minimum.
For example, If you have:
A123 22.5 Amazon
and you got:
A123 23 Merlin
You have to insert it, even if it is with higher price, because it's a different vendor. So you'll have:
A123 22.5 Amazon
A123 23 Merlin
When you get the new price from Amazon, for example: 25, you just update it. So you'll get:
A123 25 Amazon
A123 23 Merlin
then select the lowest price, Merlin, in this case.
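The approach described in these answers can be sketched as follows, using an in-memory SQLite database from Python. The table name prices, the class PriceBook and its method names are hypothetical; the point is only that there is one row per (product, shop), each update overwrites that shop's own row, and the minimum is computed at query time.

```python
import sqlite3


class PriceBook:
    """Keeps the latest price per (product, shop) and reports the lowest."""

    def __init__(self):
        self.con = sqlite3.connect(":memory:")
        # One row per product and shop; the composite key makes
        # a new price from the same shop replace the old one.
        self.con.execute(
            """
            CREATE TABLE prices (
                product_number TEXT,
                shop TEXT,
                price REAL,
                PRIMARY KEY (product_number, shop)
            )
            """
        )

    def update_price(self, shop, product_number, price):
        # Insert the shop's price, or overwrite it if the shop already has one.
        self.con.execute(
            "INSERT OR REPLACE INTO prices (product_number, shop, price) VALUES (?, ?, ?)",
            (product_number, shop, price),
        )

    def lowest(self, product_number):
        # Returns (price, shop) for the cheapest current offer, or None.
        return self.con.execute(
            "SELECT price, shop FROM prices WHERE product_number = ? "
            "ORDER BY price LIMIT 1",
            (product_number,),
        ).fetchone()
```

Replaying the failing scenario from the question (amazon 22.5, then merlin 23, then amazon 35) leaves two rows in the table, and the lowest price reported is merlin's 23.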