Group by sku, max date SQL - sql

I know this is asked quite a bit here, and I have tried to use other examples to incorporate into my own, but I can't seem to make this work.
I have columns for sku, date, and cost, and I want to view all 3 columns, but only by max date, grouped by sku.
Currently:
Sku Date Cost
1 06/24/15 .01
1 02/22/14 .02
2 06/24/15 .04
2 02/22/14 .05
Need:
Sku Date Cost
1 06/24/15 .01
2 06/24/15 .04
This is what my SQL looks like:
SELECT dbo_SKU.PROD_CODE AS Sku, dbo_LOTS.REC_DATE AS [Last Date],
dbo_LOT_ITEM.COST AS Cost
FROM (dbo_LOTS INNER JOIN dbo_SKU ON dbo_LOTS.SKU_ID = dbo_SKU.SKU_ID)
INNER JOIN dbo_LOT_ITEM ON dbo_LOTS.LOT_ID = dbo_LOT_ITEM.LOT_ID;
Here is what the design view looks like (I'm more of a visual person):
Design View
This is week 2 of teaching myself how to operate Access and how it all works, so if we could break this down in crayon on how I make this work correctly, that would be great.

You can add additional logic to get the last date. One method is to add a correlated subquery in the WHERE clause:
SELECT s.PROD_CODE AS Sku, l.REC_DATE AS [Last Date], li.COST AS Cost
FROM (dbo_LOTS as l INNER JOIN
dbo_SKU as s
ON l.SKU_ID = s.SKU_ID
) INNER JOIN
dbo_LOT_ITEM as li
ON l.LOT_ID = li.LOT_ID
WHERE l.REC_DATE = (SELECT MAX(l2.REC_DATE)
FROM dbo_LOTS as l2
WHERE l2.SKU_ID = l.SKU_ID
);
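
To see the correlated subquery in isolation, here is a minimal sketch that runs the same pattern against the sample rows from the question. It uses Python's built-in sqlite3 module rather than Access, and the single table lots with columns sku, rec_date and cost is a simplification assumed for illustration (the real data is spread over dbo_LOTS, dbo_SKU and dbo_LOT_ITEM). Dates are stored as ISO strings so that MAX compares them correctly.

```python
import sqlite3

# In-memory database; table and column names are hypothetical stand-ins.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lots (sku INTEGER, rec_date TEXT, cost REAL)")
conn.executemany(
    "INSERT INTO lots VALUES (?, ?, ?)",
    [
        (1, "2015-06-24", 0.01),
        (1, "2014-02-22", 0.02),
        (2, "2015-06-24", 0.04),
        (2, "2014-02-22", 0.05),
    ],
)

# Keep only the rows whose date equals the latest date for the same sku.
latest = conn.execute(
    """
    SELECT l.sku, l.rec_date, l.cost
    FROM lots AS l
    WHERE l.rec_date = (SELECT MAX(l2.rec_date)
                        FROM lots AS l2
                        WHERE l2.sku = l.sku)
    ORDER BY l.sku
    """
).fetchall()

print(latest)
```

If two rows for the same sku share the latest date, both are returned, so a tie-breaker would be needed in that case.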

This worked:
SELECT dbo_SKU.PROD_CODE AS Sku, dbo_LOTS.REC_DATE AS [Last Date], dbo_LOTS.COSTPERSKU AS Cost
FROM dbo_LOTS INNER JOIN dbo_SKU ON dbo_LOTS.SKU_ID = dbo_SKU.SKU_ID
WHERE (((dbo_LOTS.REC_DATE)=(SELECT MAX(l2.REC_DATE)
FROM dbo_LOTS as l2 WHERE l2.SKU_ID = dbo_LOTS.SKU_ID)));

Related

Different results in SQL based on what columns I display

I am trying to run a query to gather the total items on hand in our database, but I seem to be getting incorrect data. I am selecting just the amount field and summing it, using joins to separate tables based on certain parameters. However, if I display additional fields such as order number and date, I suddenly get different data, even though those fields are already being used as filters in the query. Is it because they are not in the select statement? If they need to be in the select statement, is it possible to not display them?
Here are the two queries.
-- Items On Hand
select CONVERT(decimal(25, 2), SUM(tw.amount)) as 'Amt'
from [Sales Header] sh
join
(
select *
from TWAllOrders
where [Status] like 'Released'
) tw
on tw.[Order Nb] = sh.No_
join
(
select *
from OnHand
) oh
on tw.No_ = oh.[Item No_]
where sh.[Requested Delivery Date] < getdate()
HAVING SUM(tw.Quantity) <= SUM(oh.Qty)
providing a sum of 21667457.20
and with the added columns
-- Items On Hand
select CONVERT(decimal(25, 2), SUM(tw.amount)) as 'Amt', [Requested Delivery Date], sh.No_, tw.[Status]
from [Sales Header] sh
join
(
select *
from TWAllOrders
where [Status] like 'Released'
) tw
on tw.[Order Nb] = sh.No_
join
(
select *
from OnHand
) oh
on tw.No_ = oh.[Item No_]
where sh.[Requested Delivery Date] < getdate()
group by sh.[Requested Delivery Date], sh.No_, tw.[Status]
HAVING SUM(tw.Quantity) <= SUM(oh.Qty)
order by sh.[Requested Delivery Date] ASC
Providing a sum of 12319998
I'm self taught in SQL so I may be misunderstanding something obvious, thanks for the help.
With no sample data, I am going to have to demonstrate this in principle. In the latter query you have a GROUP BY meaning the scope of the values in the HAVING will differ, and thus the filtering from said HAVING will be different.
Let's take the following sample data:
CREATE TABLE dbo.MyTable (Grp char(1),
Quantity int,
Required int);
INSERT INTO dbo.MyTable (Grp, Quantity, [Required])
VALUES('a',2,7),
('a',14,2),
('b',4, 7),
('b',3,4),
('c',17,5);
Now we'll perform an overly simplified version of your query:
SELECT SUM(Quantity)
FROM dbo.MyTable
HAVING SUM(Quantity) > SUM(Required);
This brings back the value 40, which is the SUM of all the values in Quantity. A value is returned because the total SUM of Required is 25, which is less than 40, so the HAVING condition is met.
Now let's add a GROUP BY like your second query:
SELECT SUM(Quantity)
FROM dbo.MyTable
GROUP BY Grp
HAVING SUM(Quantity) > SUM(Required);
Now we have 2 rows, with the values 16 and 17, giving a total value of 33. That's because the rows where Grp has a value of 'b' are filtered out, as the SUM of Quantity (7) is lower than the SUM of Required (11) for 'b'.
The same is happening in your data; in the grouped data you have groups where the HAVING condition isn't met, so those rows aren't returned.
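
The grouped case can be reproduced with a small sketch. It uses Python's built-in sqlite3 module instead of SQL Server (an assumption made only so the example is self-contained), with the same sample rows as above. It also shows one possible way to get a single total over just the groups that pass the HAVING filter, by wrapping the grouped query in an outer SUM.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Grp TEXT, Quantity INTEGER, Required INTEGER)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [("a", 2, 7), ("a", 14, 2), ("b", 4, 7), ("b", 3, 4), ("c", 17, 5)],
)

# HAVING is evaluated once per group, so group 'b' (7 vs 11) is dropped.
grouped_sql = """
    SELECT Grp, SUM(Quantity) AS Qty
    FROM MyTable
    GROUP BY Grp
    HAVING SUM(Quantity) > SUM(Required)
"""
per_group = conn.execute(grouped_sql + " ORDER BY Grp").fetchall()

# Outer SUM over the surviving groups gives one total without extra columns.
total_of_groups = conn.execute(
    f"SELECT SUM(Qty) FROM ({grouped_sql})"
).fetchone()[0]

print(per_group, total_of_groups)
```

The grouping columns only need to appear in the inner query, which is one way to filter per group without displaying those columns in the final result.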

getting avg of column based on the result set

I have a select statement that divides the count of sales by country, priceBanding (see example below)
The select statement looks like follows:
SELECT p.[Price Band]
,t.[Country]
,COUNT(o.[Order]) as [Order Count]
FROM #price p -- temp table
INNER JOIN country t ON p.CountryCode = t.countryCode
INNER JOIN sales o ON o.salesValue >= p.startPrice and o.salesValue < p.endPrice
GROUP BY p.[Price Band], t.[Country]
What I want to be able to do, based on this result, is get an avg of the unit count, i.e. for all orders that are under 20, what is the avg unit count, and the same for all the other bands. How can I do this?
It's most likely simple, but I can't think it through.
What I am after:
So as you can see, in the price band <20 in the UK the order count is 50, and the avg units for those orders is 2. As I mentioned earlier, I want the avg units of all orders that are under 20 (of which there are 50 in the picture).
Is that clearer?
Thanks in advance.
EDIT:
The first table: assume it to be the source
And the second table gets the avg, that's what I am after.
Wouldn't you just use avg()?
SELECT p.[Price Band], t.[Country],
COUNT(*) as [Order Count],
AVG(Items)
FROM #price p INNER JOIN
country t
ON p.CountryCode = t.countryCode INNER JOIN
sales o
ON o.salesValue >= p.startPrice and o.salesValue < p.endPrice
GROUP BY p.[Price Band], t.[Country]
ORDER BY t.[Country], p.[Price Band]
Note: SQL Server does integer division on integers (so 3/2 = 1, not 1.5), and AVG() of an integer column likewise returns an integer. If Items is an integer column, convert it to a decimal before averaging; an easy way is to use AVG(Items * 1.0).
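
As a rough illustration of the range join plus GROUP BY, here is a sketch using Python's built-in sqlite3 module. The table layouts, the column name items, the band boundaries and all the rows are invented for the example, since the question does not show its data. Note that SQLite's AVG() always returns a floating-point value, so the * 1.0 is kept only to mirror the SQL Server advice above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data, loosely modelled on the question.
conn.executescript("""
    CREATE TABLE price (band TEXT, countryCode TEXT, startPrice REAL, endPrice REAL);
    CREATE TABLE country (countryCode TEXT, country TEXT);
    CREATE TABLE sales (salesValue REAL, items INTEGER);
    INSERT INTO price VALUES ('<20', 'GB', 0, 20), ('20+', 'GB', 20, 1000);
    INSERT INTO country VALUES ('GB', 'UK');
    INSERT INTO sales VALUES (10, 2), (15, 3), (19.99, 1), (25, 1), (30, 4);
""")

# Each sale falls into exactly one band; count and average per band.
rows = conn.execute("""
    SELECT p.band, t.country,
           COUNT(*) AS order_count,
           AVG(o.items * 1.0) AS avg_units
    FROM price AS p
    INNER JOIN country AS t ON p.countryCode = t.countryCode
    INNER JOIN sales AS o
            ON o.salesValue >= p.startPrice AND o.salesValue < p.endPrice
    GROUP BY p.band, t.country
""").fetchall()

by_band = {band: (n, avg) for band, _country, n, avg in rows}
print(by_band)
```

With more than one country, the sales table would also need a country column in the join condition, otherwise every sale is counted once per country.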

Count the number of occurrences grouped by some rows

I have made a query to bring me the number of products that have not been in stock (I know that by looking at the orders which the manufacturer returned with some status code), by product, date and storage, that looks like this:
SELECT count(*) as out_of_stock,
prod.id as product_id,
ped.data_envio::date as date,
opl.id as storage_id
from sub_produtos_pedidos spp
left join cad_produtos prod ON spp.ean_produto = prod.cod_ean
left join sub_pedidos sp ON spp.id_pedido = sp.id
left join pedidos ped ON sp.id_pedido = ped.id
left join op_logisticos opl ON sp.id_op_logistico = opl.id
where spp.motivo = '201' -- this is the code that means 'not in inventory'
group by storage_id,product_id,date
That produces an answer like this:
out_of_stock | product_id | date | storage_id
--------------|------------|-------------|-------------
1 | 5 | 2012-10-16 | 1
5 | 4 | 2012-10-16 | 2
Now I need to get the number of occurrences, by product and storage, of products that have been out of stock for 2 or more days, 5 or more days and so on.
So I guess I need to do a new count on the first query, aggregating the resultant rows in some defined day intervals.
I tried looking at the datetime functions in Postgres (http://www.postgresql.org/docs/7.3/static/functions-datetime.html), but couldn't find what I need.
Maybe I misunderstood your question, but it looks like you need a subquery. You wrote:
"Now I need to get the number of occurrences, by product and storage, of products that have been out of stock for 2 or more days"
So:
SELECT COUNT(*), date, product_id FROM ( YOUR BIG QUERY GOES HERE ) a
WHERE a.date < (CURRENT_DATE - interval '2' day)
GROUP BY date, product_id
Since you seem to want every row in the result individually, you cannot aggregate. Use a window function instead to get the count per day. The well-known aggregate function count() can also serve as a window aggregate function:
SELECT current_date - ped.data_envio::date AS days_out_of_stock
,count(*) OVER (PARTITION BY ped.data_envio::date)
AS count_per_days_out_of_stock
,ped.data_envio::date AS date
,p.id AS product_id
,opl.id AS storage_id
FROM sub_produtos_pedidos spp
LEFT JOIN cad_produtos p ON p.cod_ean = spp.ean_produto
LEFT JOIN sub_pedidos sp ON sp.id = spp.id_pedido
LEFT JOIN op_logisticos opl ON opl.id = sp.id_op_logistico
LEFT JOIN pedidos ped ON ped.id = sp.id_pedido
WHERE spp.motivo = '201' -- code for 'not in inventory'
ORDER BY ped.data_envio::date, p.id, opl.id
Sort order: Products having been out of stock for the longest time first.
Note, you can just subtract dates to get an integer in Postgres.
If you want a running count in the sense of "n rows have been out of stock for this number of days or more", use:
count(*) OVER (ORDER BY ped.data_envio::date) -- ascending order!
AS running_count_per_days_out_of_stock
You get the same count for the same day, because peers (rows with the same date) are lumped together.
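
The difference between the partitioned count and the running count can be seen in a small sketch. It uses Python's built-in sqlite3 module in place of Postgres and assumes the bundled SQLite is version 3.25 or later, which is when window functions were added. The table out_of_stock and its four rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, already-filtered rows: one row per out-of-stock order line.
conn.execute("CREATE TABLE out_of_stock (product_id INTEGER, day TEXT)")
conn.executemany(
    "INSERT INTO out_of_stock VALUES (?, ?)",
    [(5, "2012-10-14"), (4, "2012-10-14"), (5, "2012-10-15"), (4, "2012-10-16")],
)

rows = conn.execute("""
    SELECT day,
           product_id,
           count(*) OVER (PARTITION BY day) AS count_per_day,
           count(*) OVER (ORDER BY day)     AS running_count
    FROM out_of_stock
    ORDER BY day, product_id
""").fetchall()

for row in rows:
    print(row)
```

Both rows for 2012-10-14 get a running count of 2, because with the default window frame rows that tie on the ORDER BY value are counted together.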

Selecting the rows with min value for a field, where the rest of fields differ

I am trying to write SQL (Access 2010) to select parts which have a minimum price from a table where the parts can repeat, as some of the other fields are different.
The table looks like this:
Dist Part Num | Ven Part Num | Dist | Desc              | Price
--------------|--------------|------|-------------------|------
DD7777QED     | 7777QED      | DD   | Product A         | 10
IM7777QED     | 7777QED      | IM   | This is Product A | 12
SY7777QED     | 7777QED      | SY   | Product A Desc    | 15
DD8888QED     | 8888QED      | DD   | Product B         | 15
IM8888QED     | 8888QED      | IM   | This is Product B | 10
SY8888QED     | 8888QED      | SY   | Product B Desc    | 12
IM999ABC      | 999ABC       | IM   | Product C Desc    | 15
I am trying to extract all details for each row that has the min price for that Ven Part Num that repeats. In essence all details for the supplier's row that has the cheapest price for that Vendor Part Number.
The result from the above sample data should be this:
Dist Part Num | Ven Part Num | Dist | Desc              | Price
--------------|--------------|------|-------------------|------
DD7777QED     | 7777QED      | DD   | Product A         | 10
IM8888QED     | 8888QED      | IM   | This is Product B | 10
IM999ABC      | 999ABC       | IM   | Product C Desc    | 15
Thanks
EDIT: Thank you jurgen d for your answer, although I think you meant to use Ven Part Num (instead of Dist Part Num). I have amended it to this query, which almost does what I want:
SELECT T1.*
FROM My_Table T1
INNER JOIN
(
SELECT [Ven Part Num], MIN(Price) AS MPrice
FROM My_Table
GROUP BY [Ven Part Num]
) T2 ON T1.[Ven Part Num] = T2.[Ven Part Num] AND T1.Price = T2.MPrice
The challenge now is that if two Dist have the same MIN price for the same Ven Part Num, the resulting extract contains 2 rows for that Ven Part Num, but I want just one; either will do. I tried TOP 1, but then the whole query returns only one row, and I am expecting around 40K rows! How do I extract only one of these two rows in the final report?
Thanks again!
select t1.*
from your_table t1
inner join
(
select [Dist Part Num], min(price) as mprice
from your_table
group by [Dist Part Num]
) t2 on t1.[Dist Part Num] = t2.[Dist Part Num] and t1.price = t2.mprice
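
For the tie case raised in the question's edit, one possible approach is to pick, among the rows that share the minimum price for a Ven Part Num, the one with the smallest Dist Part Num. The sketch below runs that idea in SQLite through Python's built-in sqlite3 module; it has not been run in Access, and it assumes Dist Part Num is unique per row. The table name parts, the snake_case column names and the extra SY999ABC row (added to create a tie) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parts (dist_part_num TEXT PRIMARY KEY, ven_part_num TEXT,"
    " dist TEXT, descr TEXT, price INTEGER)"
)
conn.executemany(
    "INSERT INTO parts VALUES (?, ?, ?, ?, ?)",
    [
        ("DD7777QED", "7777QED", "DD", "Product A", 10),
        ("IM7777QED", "7777QED", "IM", "This is Product A", 12),
        ("SY7777QED", "7777QED", "SY", "Product A Desc", 15),
        ("DD8888QED", "8888QED", "DD", "Product B", 15),
        ("IM8888QED", "8888QED", "IM", "This is Product B", 10),
        ("SY8888QED", "8888QED", "SY", "Product B Desc", 12),
        ("IM999ABC", "999ABC", "IM", "Product C Desc", 15),
        ("SY999ABC", "999ABC", "SY", "Product C alt", 15),  # invented tie row
    ],
)

# Inner MIN finds the cheapest price per vendor part; the outer MIN breaks
# ties by taking the alphabetically first dist_part_num at that price.
cheapest = conn.execute("""
    SELECT t1.dist_part_num, t1.ven_part_num, t1.price
    FROM parts AS t1
    WHERE t1.dist_part_num = (
        SELECT MIN(t2.dist_part_num)
        FROM parts AS t2
        WHERE t2.ven_part_num = t1.ven_part_num
          AND t2.price = (SELECT MIN(t3.price)
                          FROM parts AS t3
                          WHERE t3.ven_part_num = t1.ven_part_num)
    )
    ORDER BY t1.ven_part_num
""").fetchall()

print(cheapest)
```

Only MIN and correlated subqueries are used, so no TOP or LIMIT is involved, and each Ven Part Num yields exactly one row.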

SQL Query with count, sum and group by

Quick question if anyone has time to answer. The current query works great, but I need to also get a total count of the orders and the total shipping. I know the numbers are getting thrown off because of the joins.
I know that my count and sum will be:
count(DISTINCT orders.id) AS num_orders,
SUM(orders.shipping_cost_ex_tax) AS shipping
I think I need to use the count and sum in the original select and handle the rest in the join, but for the life of me I can't get this right.
Any help would be appreciated, even if it's "run a separate query". Thanks everyone.
Current query:
SELECT
IF(products.categories LIKE '68', 'Shirts', 'Books') AS group_key,
CONCAT(order_products.name) AS product_name,
brands.name AS author,
SUM(order_products.quantity) AS num_units,
CASE WHEN products.sku LIKE '%-WB' THEN 'Combo'
WHEN products.sku LIKE '%-BO' THEN 'Box'
ELSE ''
END AS item_type,
SUM(IF(order_products.discount IS NULL, order_products.price_ex_tax, (order_products.price_ex_tax - order_products.discount))) AS income
FROM orders
INNER JOIN order_products ON order_products.bc_order_id = orders.bc_id
INNER JOIN products ON order_products.bc_product_id = products.bc_id
INNER JOIN brands ON products.brand_id = brands.bc_id
WHERE (orders.created_at BETWEEN '2012-01-28 00:00:00' and '2012-02-21 23:00:00')
GROUP BY group_key,
case when products.brand_id = '68'
then products.name
else products.sku
end
Looking at the comment provided, and not having your full schema in front of me, would something like this work?
Table Report
(
id,
countOrders,
countSales,
countShipping,
countTax,
datePublished
)
Table SoldProducts
(
id,
price,
tax,
shippingPrice,
datePurchased
)
In this instance you would generate the report by querying SoldProducts and then persist the generated report in the Report table.
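
On the question's point that the joins throw the numbers off, here is a small sketch of why, and of the "run a separate query" option. It uses Python's built-in sqlite3 module rather than MySQL, and the two cut-down tables and their rows are invented for the example. Joining orders to order lines repeats each order once per line, so a SUM over an order-level column is inflated, while COUNT(DISTINCT ...) is not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema: order 1 has two lines, order 2 has one.
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, shipping_cost_ex_tax REAL);
    CREATE TABLE order_products (order_id INTEGER, quantity INTEGER);
    INSERT INTO orders VALUES (1, 5.0), (2, 3.0);
    INSERT INTO order_products VALUES (1, 1), (1, 2), (2, 4);
""")

# Joined: order 1's shipping is counted twice (5 + 5 + 3 = 13).
joined = conn.execute("""
    SELECT COUNT(DISTINCT o.id), SUM(o.shipping_cost_ex_tax), SUM(op.quantity)
    FROM orders AS o
    INNER JOIN order_products AS op ON op.order_id = o.id
""").fetchone()

# Separate query on orders alone: shipping is counted once per order.
separate = conn.execute(
    "SELECT COUNT(*), SUM(shipping_cost_ex_tax) FROM orders"
).fetchone()

print(joined, separate)
```

In the real query the separate statement would need the same created_at filter as the main one, so that both cover the same set of orders.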