T-SQL Get all years from - to - sql

I am trying to produce a list of the sales of each shop for each year from a database. But I also want to print the years with 0 sales (for some years/shops there will be no data in my database).
I have written the following query, but it only returns the years and shops that have non-zero values.
SELECT YEAR(temp.INVOICE_DATE) AS Year, Shop.Name, SUM(temp.QTY * Product.Price) AS Total
FROM (
SELECT pci.INVOICE_ID, ci.STORE_ID, ci.INVOICE_DATE, pci.PRODUCT_ID, pci.QTY
FROM [Product_Customer Invoice] pci, [Customer Invoice] ci
WHERE pci.INVOICE_ID = ci.INVOICE_ID
)AS temp, Product, Shop
WHERE Product.PRODUCT_ID = temp.PRODUCT_ID AND Shop.STORE_ID = temp.STORE_ID
GROUP BY YEAR(temp.INVOICE_DATE), Shop.Name
ORDER BY Year ASC
I get the following result
I would like to ask for any ideas on how to include 0 for the years or shops in which no sales were made.

Use a cross join to generate the rows and then left join to bring in the data you want:
SELECT y.Year, s.Name,
       COALESCE(SUM(pci.QTY * p.Price), 0) AS Total
FROM Shop s CROSS JOIN
     (SELECT DISTINCT YEAR(INVOICE_DATE) AS Year FROM [Customer Invoice]
     ) y LEFT JOIN
     [Customer Invoice] ci
     ON YEAR(ci.INVOICE_DATE) = y.Year AND
        ci.STORE_ID = s.STORE_ID LEFT JOIN
     [Product_Customer Invoice] pci
     ON pci.INVOICE_ID = ci.INVOICE_ID LEFT JOIN
     Product p
     ON pci.PRODUCT_ID = p.PRODUCT_ID
GROUP BY y.Year, s.Name
ORDER BY y.Year, s.Name;
Notice that this also fixes your join syntax and removes the unnecessary subquery. The invoice date and store id are taken from [Customer Invoice], because that is where your original query reads them from.
This assumes that all the years you want are in [Customer Invoice]. If you want a different set of years, either explicitly list them using VALUES(), use a recursive CTE to generate them, or use a calendar table.
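As a rough sketch of the recursive-CTE option, the snippet below runs the same cross join / left join pattern through Python's built-in sqlite3 module. The `shop` and `sale` tables, their columns and the 2020-2022 range are made up for illustration and are not the asker's schema. SQLite spells the CTE `WITH RECURSIVE`, whereas SQL Server uses plain `WITH`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shop (store_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sale (store_id INTEGER, sale_year INTEGER, amount REAL);
    INSERT INTO shop VALUES (1, 'North'), (2, 'South');
    -- 'South' has no sales in 2020, and nobody sold anything in 2021.
    INSERT INTO sale VALUES (1, 2020, 100.0), (1, 2022, 50.0), (2, 2022, 75.0);
""")

rows = conn.execute("""
    WITH RECURSIVE years(y) AS (
        SELECT 2020                      -- first year of the report (assumed)
        UNION ALL
        SELECT y + 1 FROM years WHERE y < 2022
    )
    SELECT years.y, shop.name, COALESCE(SUM(sale.amount), 0) AS total
    FROM years
    CROSS JOIN shop                      -- every (year, shop) combination
    LEFT JOIN sale                       -- bring in sales where they exist
           ON sale.sale_year = years.y
          AND sale.store_id = shop.store_id
    GROUP BY years.y, shop.name
    ORDER BY years.y, shop.name
""").fetchall()

for row in rows:
    print(row)
```

Every year/shop pair appears in the output, and pairs without sales get a total of 0 instead of being dropped.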

Related

SQL - Create complete matrix with all variables even if null

Please provide some assistance/guidance to solve the following:
I have 1 main table which indicates sales volumes by sales person per different product type.
Where a salesperson did not sell a particular product on a particular day, there is no record.
The intention is to create null value records for salesmen that did not sell a product on a specific day. The query must be dynamic as there are many more salesmen with sales over many days.
Thanks in advance
Just generate records for all sales persons, days, and products using cross join and then bring in the existing data:
select p.salesperson, d.salesdate, st.salestype,
coalesce(t.sales_volume, 0)
from (select distinct salesperson from t) p cross join
(select distinct salesdate from t) d cross join
(select distinct salestype from t) st left join
t
on t.salesperson = p.salesperson and
t.salesdate = d.salesdate and
t.salestype = st.salestype;
Note: You may have other tables that have lists of sales people, dates, and types -- and those can be used instead of the select distinct queries.

Cross Selling Matrix - In Snowflake

I am trying to build a cross selling matrix with the following structure pivoted as seen below where X is the % of frequency in a basket with the other product:
I need to pivot this data in Excel or another tool afterwards, so I assume the query in Snowflake needs to output a tabular dataset ready for pivoting, and I am struggling with its logic.
This is what I have so far:
SELECT FCT.TRANSACTION_ID,
PRD.PRODUCT_TYPE,
COUNT(DISTINCT FCT.PRODUCT_ID),
COUNT(DISTINCT FCT1.PRODUCT_ID)
FROM TRANSACTION_ORDERS FCT
INNER JOIN DIM_PRODUCT PRD ON FCT.PRODUCT_ID = PRD.PRODUCT_ID
LEFT JOIN FACT_TRANSACTION_ORDERS FCT1 ON FCT.TRANSACTION_ID = FCT1.TRANSACTION_ID
AND FCT.PRODUCT_ID != FCT1.PRODUCT_ID
GROUP BY FCT.TRANSACTION_ID, FCT.PRODUCT_ID, FCT1.PRODUCT_ID
Is the joining even correct? Or should I be doing a cross join? Also, how to capture percent frequency of both products in the same basket?
Many thanks!
EDIT: I am trying to capture the frequency of different product types appearing in the same basket.
The values are the same for combinations in both directions. ProductType1 intersection with column ProductType2 is the same value as column Product Type1 row ProductType2.
When in a basket cross analysis they should vary. It is not the same per direction. In other words, baskets with ProductType1 may have ProductType2 X % of the time but baskets with ProductType2 should have ProductType1 with Y% of the time.
You want a self join. I would expect the products to be in the same order, but you seem to be using the same transaction. In any case, this is the structure of the query:
WITH TP AS (
SELECT T.*, P.PRODUCT_TYPE
FROM TRANSACTION_ORDERS T JOIN
DIM_PRODUCT P
ON T.PRODUCT_ID = P.PRODUCT_ID
)
SELECT TP.PRODUCT_TYPE, TP2.PRODUCT_TYPE,
COUNT(DISTINCT TP.TRANSACTION_ID) as NUM_ORDERS
FROM TP JOIN
TP TP2
ON TP2.TRANSACTION_ID = TP.TRANSACTION_ID
GROUP BY TP.PRODUCT_TYPE, TP2.PRODUCT_TYPE;
If this were per order, you would just change the ON clause in the outer query to use the order id.
Note that this uses COUNT(DISTINCT) rather than COUNT(*) because a transaction/order could have multiple products of the same type. Presumably, you want that counted only once.
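To make the difference between the two counts concrete, here is a small sketch using Python's built-in sqlite3 module. The `tp` table stands in for the TP CTE above, and its rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tp (transaction_id INTEGER, product_id INTEGER, product_type TEXT);
    -- Transaction 1 holds two 'Food' products and one 'Drink'.
    INSERT INTO tp VALUES (1, 10, 'Food'), (1, 11, 'Food'), (1, 20, 'Drink'),
                          (2, 10, 'Food');
""")

rows = conn.execute("""
    SELECT a.product_type, b.product_type,
           COUNT(*) AS pair_rows,                         -- counts product pairs
           COUNT(DISTINCT a.transaction_id) AS num_orders -- counts transactions
    FROM tp a
    JOIN tp b ON b.transaction_id = a.transaction_id
    GROUP BY a.product_type, b.product_type
    ORDER BY a.product_type, b.product_type
""").fetchall()

for row in rows:
    print(row)
```

For the Food/Food pair, COUNT(*) reports 5 joined rows while only 2 transactions contain food, which is why the distinct count is the one to use here.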
EDIT:
If you want to divide by the number of transactions that have either product type (which makes sense to me), then I would approach this as:
WITH TP AS (
SELECT DISTINCT T.TRANSACTION_ID, P.PRODUCT_TYPE
FROM TRANSACTION_ORDERS T JOIN
DIM_PRODUCT P
ON T.PRODUCT_ID = P.PRODUCT_ID
)
SELECT TP.PRODUCT_TYPE, TP2.PRODUCT_TYPE,
COUNT(*) as NUM_ORDERS,
( MAX(CASE WHEN TP.PRODUCT_TYPE = TP2.PRODUCT_TYPE THEN COUNT(*) END) OVER (PARTITION BY TP.PRODUCT_TYPE) +
MAX(CASE WHEN TP.PRODUCT_TYPE = TP2.PRODUCT_TYPE THEN COUNT(*) END) OVER (PARTITION BY TP2.PRODUCT_TYPE) -
COUNT(*)
) as Num_Orders_Either,
( COUNT(*) * 1.0 /
( MAX(CASE WHEN TP.PRODUCT_TYPE = TP2.PRODUCT_TYPE THEN COUNT(*) END) OVER (PARTITION BY TP.PRODUCT_TYPE) +
MAX(CASE WHEN TP.PRODUCT_TYPE = TP2.PRODUCT_TYPE THEN COUNT(*) END) OVER (PARTITION BY TP2.PRODUCT_TYPE) -
COUNT(*)
)
) as ratio
FROM TP JOIN
TP TP2
ON TP2.TRANSACTION_ID = TP.TRANSACTION_ID
GROUP BY TP.PRODUCT_TYPE, TP2.PRODUCT_TYPE;
This calculates the total orders containing either product as the sum of the orders containing each product minus the number containing both.
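The arithmetic can be checked on a toy data set. The sketch below uses Python's built-in sqlite3 module and a different formulation from the query above: a per-type totals CTE joined to the pair counts instead of windowed aggregates. The `tp` table, its rows and the `share_of_a` column are assumptions added for illustration; `share_of_a` shows one way to get the directional percentage the question's edit asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tp (transaction_id INTEGER, product_type TEXT);
    -- One row per (transaction, product type), i.e. already DISTINCT.
    INSERT INTO tp VALUES (1, 'Food'), (1, 'Drink'),
                          (2, 'Food'),
                          (3, 'Food'), (3, 'Drink'),
                          (4, 'Drink'), (4, 'Toys');
""")

rows = conn.execute("""
    WITH totals AS (
        SELECT product_type, COUNT(*) AS n FROM tp GROUP BY product_type
    ),
    pairs AS (
        SELECT a.product_type AS type_a, b.product_type AS type_b,
               COUNT(*) AS n_both
        FROM tp a
        JOIN tp b ON b.transaction_id = a.transaction_id
        WHERE a.product_type <> b.product_type
        GROUP BY a.product_type, b.product_type
    )
    SELECT p.type_a, p.type_b, p.n_both,
           ta.n + tb.n - p.n_both AS n_either,              -- A + B - both
           1.0 * p.n_both / (ta.n + tb.n - p.n_both) AS ratio_either,
           1.0 * p.n_both / ta.n AS share_of_a              -- directional
    FROM pairs p
    JOIN totals ta ON ta.product_type = p.type_a
    JOIN totals tb ON tb.product_type = p.type_b
    ORDER BY p.type_a, p.type_b
""").fetchall()

by_pair = {(r[0], r[1]): r for r in rows}
for row in rows:
    print(row)
```

The either-based ratio is symmetric (Drink/Toys and Toys/Drink give the same value), while `share_of_a` differs per direction because it divides by the baskets containing the first type only.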

Select most Occurred Value SQL with Inner Join

I am using this query to get the following data from different linked tables. But let's say there were three vendors for an item. In the result I want to show the vendor that occurred most often. I mean, if item ABC was supplied by 3 different vendors many times, I want to get the vendor who supplied item ABC the most times.
My query is this.
use iBusinessFlex;
SELECT Items.Name,
Max(Items.ItemID) as ItemID ,
MAX(Items.Description)as Description,
MAX(ItemsStock.CurrentPrice) as UnitPrice,
MAX(ItemsStock.Quantity) as StockQuantiity,
MAX(Vendors.VendorName) as VendorName,
SUM(ItemReceived.Quantity) as TotalQuantity
From ItemReceived
INNER JOIN Items ON ItemReceived.ItemId=Items.ItemID
INNER JOIN ItemsStock ON ItemReceived.ItemId=ItemsStock.ItemID
INNER JOIN PurchaseInvoices ON PurchaseInvoices.PurchaseInvoiceId = ItemReceived.PurchaseInvoiceId
INNER JOIN Vendors ON Vendors.VendorId = PurchaseInvoices.VendorId
Group By Items.Name
EDIT: I have included this subquery, but I am not sure whether it shows the correct result, i.e. for each item the vendor who provided that item the most times.
use iBusinessFlex;
SELECT Items.Name,
Max(Items.ItemID) as ItemID ,
MAX(Items.Description)as Description,MAX(ItemsStock.CurrentPrice) as UnitPrice,
MAX(ItemsStock.Quantity) as StockQuantiity,MAX(Vendors.VendorName) as VendorName,
SUM(ItemReceived.Quantity) as TotalQuantity
From ItemReceived
INNER JOIN Items ON ItemReceived.ItemId=Items.ItemID INNER JOIN ItemsStock
ON ItemReceived.ItemId=ItemsStock.ItemID INNER JOIN PurchaseInvoices
ON PurchaseInvoices.PurchaseInvoiceId = ItemReceived.PurchaseInvoiceId INNER JOIN Vendors
ON Vendors.VendorId IN (
SELECT Top 1 MAX(PurchaseInvoices.VendorId) as VendorOccur
FROM PurchaseInvoices INNER JOIN Vendors ON Vendors.VendorId=PurchaseInvoices.VendorId
GROUP BY PurchaseInvoices.VendorId
ORDER BY COUNT(*) DESC
And the Result Looks like this.
First, I would start with who supplied which thing the most. But the MOST is based on what... the most quantity? Price? Number of times? If you order from one vendor 6 times with a qty of 10, you have 60 things. But if you order 1 time from another vendor for a qty of 100, which one wins? You have to decide the basis of MOST, but I will go with most times, per your original question.
So everything comes from PurchaseInvoices, which has a vendor ID. I don't care who the vendor is, just their ID, so there is no need to join. Also, I don't need the item name if I am just looking for my counts. The query below will show, per item, each vendor with their respective times ordered and quantities ordered. I added the Items and Vendors table joins just to show the names.
select
      IR.ItemID,
      PI.VendorID,
      max( I.Name ) Name,
      max( V.VendorName ) VendorName,
      count(*) as TimesOrderedFrom,
      SUM( IR.Quantity ) as QuantityFromVendor
   from
      ItemReceived IR
         JOIN PurchaseInvoices PI
            on IR.PurchaseInvoiceID = PI.PurchaseInvoiceID
         JOIN Items I
            on IR.ItemID = I.ItemID
         JOIN Vendors V
            on PI.VendorID = V.VendorID
   group by
      IR.ItemID,
      PI.VendorID
   order by
      -- Per item
      IR.ItemID,
      -- Most count ordered first
      count(*) desc,
      -- If multiple vendors, same count, largest total quantity first
      sum( IR.Quantity ) desc
Now, to get only 1 per item, turn this into a correlated subquery and add 'TOP 1' so that only the first row for each item is returned. In SQL Server a derived table in a plain JOIN cannot reference the outer Items row, so the subquery is attached with CROSS APPLY. Since the aggregate of count is already done, you can then get the vendor contact info.
select
      I.Name,
      V.VendorName,
      TopVendor.TimesOrderedFrom,
      TopVendor.QuantityFromVendor
   from
      Items I
         CROSS APPLY ( select TOP 1
                     PI.VendorID,
                     count(*) as TimesOrderedFrom,
                     SUM( IR.Quantity ) as QuantityFromVendor
                  from
                     ItemReceived IR
                        JOIN PurchaseInvoices PI
                           on IR.PurchaseInvoiceID = PI.PurchaseInvoiceID
                  where
                     -- correlated subquery based on the outer-most item
                     IR.ItemID = I.ItemID
                  group by
                     PI.VendorID
                  order by
                     -- Most count ordered first
                     count(*) desc,
                     -- If multiple vendors, same count, largest total quantity first
                     sum( IR.Quantity ) desc ) TopVendor
         JOIN Vendors V
            on TopVendor.VendorID = V.VendorID
No sense in having the INNER Subquery joining on the vendor and items just for the names. Get those once and only at the end when the top vendor is selected.
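The "most times" rule can be tried out on a few invented rows. This sketch uses Python's built-in sqlite3 module, so it is not the T-SQL above: SQLite has neither TOP nor CROSS APPLY, and a scalar correlated subquery with LIMIT 1 is swapped in to play the same role. The table names follow the question; the data is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Items (ItemID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Vendors (VendorID INTEGER PRIMARY KEY, VendorName TEXT);
    CREATE TABLE PurchaseInvoices (PurchaseInvoiceID INTEGER PRIMARY KEY, VendorID INTEGER);
    CREATE TABLE ItemReceived (ItemID INTEGER, PurchaseInvoiceID INTEGER, Quantity INTEGER);
    INSERT INTO Items VALUES (1, 'ABC'), (2, 'XYZ');
    INSERT INTO Vendors VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO PurchaseInvoices VALUES (100, 1), (101, 1), (102, 2), (103, 2);
    -- ABC: two small deliveries from Acme, one large one from Globex.
    -- XYZ: a single delivery from Globex.
    INSERT INTO ItemReceived VALUES (1, 100, 10), (1, 101, 10), (1, 102, 100),
                                    (2, 103, 5);
""")

rows = conn.execute("""
    SELECT i.Name,
           (SELECT v.VendorName
              FROM ItemReceived ir
              JOIN PurchaseInvoices pi
                ON pi.PurchaseInvoiceID = ir.PurchaseInvoiceID
              JOIN Vendors v
                ON v.VendorID = pi.VendorID
             WHERE ir.ItemID = i.ItemID          -- correlated on the outer item
             GROUP BY v.VendorID, v.VendorName
             ORDER BY COUNT(*) DESC, SUM(ir.Quantity) DESC
             LIMIT 1) AS TopVendor
    FROM Items i
    ORDER BY i.Name
""").fetchall()

for row in rows:
    print(row)
```

With this data Acme wins for item ABC on number of deliveries even though Globex delivered the larger quantity, so ordering by SUM(ir.Quantity) first would give a different answer.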

SQL Sum returning wrong number

I am adding up the amount of tickets sold for a sporting event, the answer should be under 100 but my answer is in the thousands.
SELECT Stubhub.Active.Opponent,
SUM(Stubhub.Active.Qty) AS AQty, SUM(Stubhub.Sold.Qty) AS SQty
FROM Stubhub.Active INNER JOIN
Stubhub.Sold ON Stubhub.Active.Opponent = Stubhub.Sold.Opponent
GROUP BY Stubhub.Active.Opponent
This type of problem occurs because you are getting a Cartesian product between the two tables for each opponent. The solution is to pre-aggregate by opponent:
SELECT a.Opponent, a.AQty, s.SQty
FROM (SELECT a.Opponent, SUM(a.Qty) as AQty
FROM Stubhub.Active a
GROUP BY a.Opponent
) a INNER JOIN
(SELECT s.Opponent, SUM(s.QTY) as SQty
FROM Stubhub.Sold s
GROUP BY s.Opponent
) s
ON a.Opponent = s.Opponent;
Notice that in this case, you do not need the aggregation in the outer query.
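A tiny reproduction shows how the sums get inflated. This is a sketch with Python's built-in sqlite3 module; the `active` and `sold` tables mirror the question's Stubhub.Active and Stubhub.Sold, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE active (opponent TEXT, qty INTEGER);
    CREATE TABLE sold (opponent TEXT, qty INTEGER);
    INSERT INTO active VALUES ('Lions', 2), ('Lions', 3);
    INSERT INTO sold VALUES ('Lions', 1), ('Lions', 4), ('Lions', 5);
""")

# Joining first pairs every active row with every sold row (2 x 3 = 6 rows),
# so each quantity is added several times.
naive = conn.execute("""
    SELECT a.opponent, SUM(a.qty), SUM(s.qty)
    FROM active a
    JOIN sold s ON a.opponent = s.opponent
    GROUP BY a.opponent
""").fetchone()

# Aggregating each table first leaves one row per opponent on both sides.
fixed = conn.execute("""
    SELECT a.opponent, a.aqty, s.sqty
    FROM (SELECT opponent, SUM(qty) AS aqty FROM active GROUP BY opponent) a
    JOIN (SELECT opponent, SUM(qty) AS sqty FROM sold GROUP BY opponent) s
      ON a.opponent = s.opponent
""").fetchone()

print(naive)
print(fixed)
```

The naive query reports 15 and 20 where the true totals are 5 and 10: each active quantity is counted once per sold row and vice versa.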

Complex SQL Query Help

I am trying to do rather complex SQL query to produce a report. This is a database used by an inventory and accounting system.
Essentially I need to produce a report with the following columns
Month / Year (group results by month / year)
Reseller (order results by reseller within the month / year group)
Total sales - Sales - Hardware
Total sales - Sales - Consumables
The following tables will need to be used in the report:
Invoice
Reseller
Job
JobStockItem
Stock
Essentially the query would need to start as:
1. Select all invoices from Invoice
2. Get the reseller name from Reseller.Name (join on Reseller.ID with Invoice.CustomerID)
3. Get the associated job ID from Job table (join on Job.InvoiceID with Invoice.ID)
4. Get each component of the invoice from JobStockItems (join JobStockItem.JobID on Job.ID)
5. Get the stock item in the job from Stock (join JobStockItems.StockId on Stock.ID) and see if the category (Stock.Category1) is either Hardware or Consumables
6. If the stock item is hardware or consumables, use the sale price in the JobStockItem (JobStockItem.PriceExTax) and add it towards the total for the month of the resellers purchases
The month and year come from the invoice date (Invoice.InvoiceDate).
Now I could produce this result myself by executing a bunch of queries and processing myself, one each for the above steps, but it's going to end up slow and I'm sure there'd have to be a query out there that could wrap all those requirements up and do it in one?
I have not attempted to do the query yet as to be honest, I don't know where to start - it's a lot more complex than anything I've done in the past.
I am just using Management Studio, not using Reporting Services, Crystal Reports or anything. My aim is to dump the output to HTML when I have it working.
Thanks heaps in advance.
I think if you left join into the JobStockItem table twice (once for hardware, and once for consumables), you can manage all that in one query. The final query will look something like this (I don't have my editor up right now, so apologies for any typos):
SELECT DATEPART(m, Invoice.InvoiceDate) month,
       DATEPART(yy, Invoice.InvoiceDate) year,
       Reseller.Name,
       SUM(jobstockitems_hardware.Price) sales_hardware,
       SUM(jobstockitems_consumables.Price) sales_consumables
FROM Invoice
INNER JOIN Reseller
   ON Invoice.CustomerID = Reseller.ID
INNER JOIN Job
   ON Invoice.ID = Job.InvoiceID
-- hardware total per job
LEFT JOIN (SELECT JobID, SUM(PriceExTax) Price
           FROM JobStockItem
           INNER JOIN Stock
              ON JobStockItem.StockID = Stock.ID
             AND Stock.Category1 = 'Hardware'
           GROUP BY JobID) jobstockitems_hardware
   ON Job.ID = jobstockitems_hardware.JobID
-- consumables total per job
LEFT JOIN (SELECT JobID, SUM(PriceExTax) Price
           FROM JobStockItem
           INNER JOIN Stock
              ON JobStockItem.StockID = Stock.ID
             AND Stock.Category1 = 'Consumables'
           GROUP BY JobID) jobstockitems_consumables
   ON Job.ID = jobstockitems_consumables.JobID
GROUP BY DATEPART(m, Invoice.InvoiceDate),
         DATEPART(yy, Invoice.InvoiceDate),
         Reseller.Name
ORDER BY DATEPART(yy, Invoice.InvoiceDate) ASC,
         DATEPART(m, Invoice.InvoiceDate) ASC,
         Reseller.Name ASC
I'm assuming that Reseller has a column called Name that you want to return as well; feel free to change that to ID or whatever else you'd rather return.
Edit: Fixed query to remove duplicates
Partly you want to PIVOT your results. So you can simply join your data together and let the PIVOT do the selecting and summarizing of categories:
select *
from (
select Year = datepart(year, i.InvoiceDate),
Month = datepart(month, i.InvoiceDate),
ResellerName = r.name,
StockCategory = s.Category1,
jsi.PriceExTax
from Invoice i
inner join Reseller r
on r.ID = i.CustomerID
inner join Job j
on j.InvoiceID = i.ID
inner join JobStockItem jsi
on jsi.JobID = j.ID
inner join Stock s
on s.ID = jsi.StockID
) d
pivot (sum(PriceExTax) for StockCategory in (Hardware, Consumables)) p
order by Year, Month, ResellerName;
In the end you'll need some conditions in the inner query (e.g. where i.InvoiceDate between ...).
Perhaps you have to multiply the PriceExTax with the amount in JobStockItem...
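PIVOT is specific to some engines. The same summarizing can be written as conditional aggregation, which also works where PIVOT is not available. The sketch below uses Python's built-in sqlite3 module; the flat `line` table stands in for the joined inner query `d`, and its columns and rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE line (yr INTEGER, mon INTEGER, reseller TEXT,
                       category TEXT, price REAL);
    INSERT INTO line VALUES
        (2011, 1, 'R1', 'Hardware', 100.0),
        (2011, 1, 'R1', 'Hardware', 50.0),
        (2011, 1, 'R1', 'Consumables', 20.0),
        (2011, 1, 'R1', 'Services', 999.0),   -- not one of the reported categories
        (2011, 2, 'R2', 'Consumables', 5.0);
""")

rows = conn.execute("""
    SELECT yr, mon, reseller,
           -- each CASE keeps only the rows of one category; the rest are NULL
           SUM(CASE WHEN category = 'Hardware' THEN price END) AS Hardware,
           SUM(CASE WHEN category = 'Consumables' THEN price END) AS Consumables
    FROM line
    GROUP BY yr, mon, reseller
    ORDER BY yr, mon, reseller
""").fetchall()

for row in rows:
    print(row)
```

A group with no rows in a category gets NULL (None in Python) in that column; wrap the SUM in COALESCE(..., 0) if zeros are preferred.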
DPMattingley has produced a good query, but with one limitation - if a time period has no results, it won't show a row at all. This may well be an unlikely case in the specific example here but where it does happen it's annoying to find the report hides the zero results month!
My standard solution to this involves taking the month from a numbers table, which forces all time periods to appear. Using DPMattingley's query as a starting point -
select
mths.month,
mths.year,
d.Name,
d.sales_hardware,
d.sales_consumables
from
/*All the in-range months on which to report*/
(select
datepart(m,dateadd(m,number,minDate)) month,
datepart(yy,dateadd(m,number,minDate)) year
from
numbers n
inner join (select min(invoice.invoicedate) as minDate from invoice) m
on n.number between 0 and datediff(m,minDate,getdate())) mths
left join
/*Data for each month*/
(SELECT DATEPART(m, Invoice.InvoiceDate) month,
DATEPART(yy, Invoice.InvoiceDate) year,
Reseller.Name,
SUM(jobstockitems_hardware.Price) sales_hardware,
SUM(jobstockitems_consumables.Price) sales_consumables
FROM
Invoice
INNER JOIN Reseller
ON Invoice.CustomerID = Reseller.ID
INNER JOIN Job
ON Invoice.ID = Job.InvoiceID
LEFT JOIN (SELECT JobID, SUM(PriceExTax) Price
FROM JobStockItem
INNER JOIN Stock
ON JobStockItem.StockID = Stock.ID
AND Stock.Category1 = 'Hardware'
GROUP BY JobID) jobstockitems_hardware
ON Job.ID = jobstockitems_hardware.JobID
LEFT JOIN (SELECT JobID, SUM(PriceExTax) Price
FROM JobStockItem
INNER JOIN Stock
ON JobStockItem.StockID = Stock.ID
AND Stock.Category1 = 'Consumables'
GROUP BY JobID) jobstockitems_consumables
ON Job.ID = jobstockitems_consumables.JobID
GROUP BY DATEPART(m, Invoice.InvoiceDate),
DATEPART(yy, Invoice.InvoiceDate),
Reseller.Name) d
on mths.month=d.month
and mths.year=d.year
ORDER BY mths.year ASC,
mths.month ASC,
d.Name ASC