SQL - Create complete matrix with all variables even if null

Please provide some guidance on the following.
I have one main table that records sales volumes by salesperson and product type.
Where a salesperson did not sell a particular product on a particular day, there is no record.
The intention is to create null-value records for salespeople who did not sell a product on a specific day. The query must be dynamic, as there are many more salespeople with sales over many days.
Thanks in advance

Just generate records for all sales persons, days, and products using cross join and then bring in the existing data:
select p.salesperson, d.salesdate, st.salestype,
coalesce(t.sales_volume, 0)
from (select distinct salesperson from t) p cross join
(select distinct salesdate from t) d cross join
(select distinct salestype from t) st left join
t
on t.salesperson = p.salesperson and
t.salesdate = d.salesdate and
t.salestype = st.salestype;
Note: You may have other tables that have lists of sales people, dates, and types -- and those can be used instead of the select distinct queries.
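As a rough, runnable illustration of the cross join plus left join idea above, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in database, and the two sample rows are hypothetical, not the asker's data.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (salesperson TEXT, salesdate TEXT, salestype TEXT, sales_volume INTEGER);
    INSERT INTO t VALUES
        ('Ann', '2024-01-01', 'A', 5),
        ('Bob', '2024-01-02', 'B', 7);
""")

# Build every salesperson/date/type combination, then attach the real rows.
rows = conn.execute("""
    SELECT p.salesperson, d.salesdate, st.salestype,
           COALESCE(t.sales_volume, 0) AS sales_volume
    FROM (SELECT DISTINCT salesperson FROM t) p
    CROSS JOIN (SELECT DISTINCT salesdate FROM t) d
    CROSS JOIN (SELECT DISTINCT salestype FROM t) st
    LEFT JOIN t
           ON t.salesperson = p.salesperson
          AND t.salesdate = d.salesdate
          AND t.salestype = st.salestype
    ORDER BY p.salesperson, d.salesdate, st.salestype
""").fetchall()

for row in rows:
    print(row)
```

With 2 salespeople, 2 dates and 2 types the result has 8 rows; the 6 combinations with no sale show 0. Drop the COALESCE if you want NULL instead of 0.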

Related

Lookup with duplicate values

This is kind of a complicated question. I have three tables:
A PRODUCTS table
ProductID
ProductName
Product A
Edwardian Desk
Product B
Edwardian Lamp
And a GROUPS table
ProductGroup
ProductID
Group A
Product A
Group A
Product B
Group B
Product C
And a SALES table
Product ID
Sales
Product A
1000
Product B
500
And I need to show the total of Sales per Product Group.
This part I understand; I wrote the query:
SELECT Groups.ProductGroup, SUM(Sales) AS TotalSales
FROM Groups
JOIN Sales
ON Groups.ProductID=Sales.ProductID
GROUP BY Groups.ProductGroup
This is the part that confuses me though: for each group, I need to pull in one of the names of the products in the group. However, it does not matter which name is pulled. So the final data could show:
Group A, Edwardian Desk, 1500
or
Group A, Edwardian Lamp, 1500
How can I pull the name of the product into my query?
I am working in Microsoft SQL Server
There are a number of ways to bring in one of your products' names. Two options are an aggregation with a correlated subquery, or an apply.
Note that I've used aliases for your table names; doing so is good practice and makes queries more compact and easier to read. Also, presumably this is a contrived example and not your actual tables, but generally it's not good practice to have a column name identical to the table name, so if Sales on table Sales represents a quantity, then just call it Quantity!
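-- The subquery correlates on ProductGroup (the grouped column);
-- g.ProductId cannot be referenced once the rows are grouped.
select g.ProductGroup,
(select Min(p.ProductName)
from Products p
join Groups g2 on g2.ProductId=p.ProductId
where g2.ProductGroup=g.ProductGroup) FirstProductAlphabetically,
Sum(s.Sales) as TotalSales
from Groups g
join Sales s on s.ProductID=g.ProductID
group by g.ProductGroup;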
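-- The apply picks one product per group, so every row of a group gets the
-- same ProductName and it can safely be added to the GROUP BY.
select g.ProductGroup,
p.ProductName as FirstProductById,
Sum(s.Sales) as TotalSales
from Groups g
join Sales s on s.ProductID=g.ProductID
cross apply (
select top (1) p.ProductName
from Products p
join Groups g2 on g2.ProductId=p.ProductId
where g2.ProductGroup=g.ProductGroup
order by p.ProductId
)p
group by g.ProductGroup, p.ProductName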
You can add products to the JOIN and use an aggregation function:
SELECT g.ProductGroup, SUM(s.Sales) AS TotalSales,
MIN(p.ProductName)
FROM Groups g JOIN
Sales s
ON g.ProductID = s.ProductID JOIN
Products p
ON p.ProductID = s.ProductId
GROUP BY g.ProductGroup;
Note: I often add two columns, MIN() and MAX(), to get two sample names.
I should add that your sample data has ProductIDs that are not in Products. That suggests a problem with either the question (more likely) or the data model. If you actually have references to non-existent products, then use a LEFT JOIN to Products rather than an inner join.
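To make the join-and-aggregate approach concrete, here is a minimal sketch against the question's sample data. It runs on SQLite through Python's sqlite3 module rather than SQL Server, and it renames the Groups table to ProductGroups (a hypothetical name, chosen only to avoid any keyword clash in SQLite). The LEFT JOIN to Products follows the note about ProductIDs that are missing from Products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (ProductID TEXT, ProductName TEXT);
    CREATE TABLE ProductGroups (ProductGroup TEXT, ProductID TEXT);  -- "Groups" in the question
    CREATE TABLE Sales (ProductID TEXT, Sales INTEGER);
    INSERT INTO Products VALUES ('Product A', 'Edwardian Desk'), ('Product B', 'Edwardian Lamp');
    INSERT INTO ProductGroups VALUES
        ('Group A', 'Product A'), ('Group A', 'Product B'), ('Group B', 'Product C');
    INSERT INTO Sales VALUES ('Product A', 1000), ('Product B', 500);
""")

# MIN() picks one arbitrary-but-deterministic name per group.
rows = conn.execute("""
    SELECT g.ProductGroup,
           MIN(p.ProductName) AS SampleProductName,
           SUM(s.Sales) AS TotalSales
    FROM ProductGroups g
    JOIN Sales s ON s.ProductID = g.ProductID
    LEFT JOIN Products p ON p.ProductID = g.ProductID
    GROUP BY g.ProductGroup
""").fetchall()

print(rows)
```

Group B does not appear because Product C has no rows in Sales and the join to Sales is an inner join.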

How to find missing data in table Sql

This is similar to How to find missing data rows using SQL? and How to find missing rows (dates) in a mysql table? but a bit more complex, so I'm hitting a wall.
I have a data table with the noted Primary key:
country_id (PK)
product_id (PK)
history_date (PK)
amount
I have a products table with all products, a countries table, and a calendar table with all valid dates.
I'd like to find all countries, dates and products for which there are missing products, with this wrinkle:
I only care about dates for which there are entries for a country for at least one product (i.e. if the country has NOTHING on that day, I don't need to find it) - so, by definition, there is an entry in the history table for every country and date I care about.
I know it's going to involve some joins maybe a cross join, but I'm hitting a real wall in finding missing data.
I tried this (pretty sure it wouldn't work):
SELECT h.history_date, h.product_id, h.country_id, h.amount
FROM products p
LEFT JOIN history h ON (p.product_id = h.product_id)
WHERE h.product_id IS NULL
No Joy.
I tried this too:
WITH allData AS (SELECT h1.country_id, p.product_id, h1.history_date
FROM products p
CROSS JOIN (SELECT DISTINCT country_id, history_date FROM history) h1)
SELECT f.history_date, f.product_id, f.country_id
FROM allData f
LEFT OUTER JOIN history h ON (f.country_id = h.country_id AND f.history_date = h.history_date AND f.product_id = h.product_id)
WHERE h.product_id IS NULL
AND h.country_id IS NOT NULL
AND h.history_date IS NOT null
also no luck. The CTE does get me every product on every date that there is also data, but the rest returns nothing.
"I only care about dates for which there are entries for a country for at least one product (i.e. if the country has NOTHING on that day, I don't need to find it)"
So we care about this combination:
from (select distinct country_id, history_date from history) country_date
cross join products p
Then it's just a matter of checking for existence:
select *
from (select distinct country_id, history_date from history) country_date
cross join products p
where not exists (select null
from history h
where country_date.country_id = h.country_id
and country_date.history_date = h.history_date
and p.product_id = h.product_id
)
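Here is a small runnable sketch of the NOT EXISTS query, using Python's sqlite3 module and hypothetical sample rows. It also runs the LEFT JOIN form from the question with only the `h.product_id IS NULL` test: on an unmatched row every `h` column is NULL, so the extra `IS NOT NULL` conditions in the question's attempt filter out exactly the rows being looked for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY);
    CREATE TABLE history (
        country_id INTEGER, product_id INTEGER, history_date TEXT, amount REAL,
        PRIMARY KEY (country_id, product_id, history_date));
    -- Hypothetical data: country 10 lacks product 2 on Jan 2,
    -- country 20 lacks product 1 on Jan 1.
    INSERT INTO products VALUES (1), (2);
    INSERT INTO history VALUES
        (10, 1, '2024-01-01', 5.0),
        (10, 2, '2024-01-01', 6.0),
        (10, 1, '2024-01-02', 7.0),
        (20, 2, '2024-01-01', 8.0);
""")

base = """
    SELECT cd.country_id, cd.history_date, p.product_id
    FROM (SELECT DISTINCT country_id, history_date FROM history) cd
    CROSS JOIN products p
"""
order = " ORDER BY cd.country_id, cd.history_date, p.product_id"

# Variant 1: NOT EXISTS, as in the answer.
missing = conn.execute(base + """
    WHERE NOT EXISTS (SELECT NULL FROM history h
                      WHERE h.country_id = cd.country_id
                        AND h.history_date = cd.history_date
                        AND h.product_id = p.product_id)
""" + order).fetchall()

# Variant 2: LEFT JOIN anti-join, keeping only the IS NULL test.
anti = conn.execute(base + """
    LEFT JOIN history h
           ON h.country_id = cd.country_id
          AND h.history_date = cd.history_date
          AND h.product_id = p.product_id
    WHERE h.product_id IS NULL
""" + order).fetchall()

print(missing)
```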

SQL JOIN AND SUB QUERY

Below are two tables: the first holds customer information, and the second holds each customer's loans and repayments. I want to join these two tables and retrieve the expected result shown in the picture. Please note that in the second table one customer can have more than one loan and can repay it in several installments, so a single customer is stored across many rows. I want to join the two tables and retrieve the clients who still owe the company.
I think the expected result is a good solution for this purpose, but I could not produce it.
If you have a better solution for my purpose than the expected result, please share it.
Example:
I tried the query below, but it does not work: client ID 1 was disbursed a loan three times, so when the query shows the three purposes it also shows the loan total three times.
SELECT DISTINCT
CUSTOMER_AIB_INFO_TABLE.ID,
CUSTOMER_AIB_INFO_TABLE.NAME,
(
SELECT SUM (CUSTOMER_AIB_LOAN_TABLE.LOAN) AS "Total Loan"
FROM CUSTOMER_AIB_LOAN_TABLE
WHERE CUSTOMER_AIB_INFO_TABLE.ID = CUSTOMER_AIB_LOAN_TABLE.FK_ID
) AS "Loan",
(
SELECT SUM (CUSTOMER_AIB_LOAN_TABLE.REPAYMENT) FROM CUSTOMER_AIB_LOAN_TABLE
WHERE CUSTOMER_AIB_INFO_TABLE.ID = CUSTOMER_AIB_LOAN_TABLE.FK_ID
) AS Repayment ,
CUSTOMER_AIB_LOAN_TABLE.PURPOSE
FROM CUSTOMER_AIB_INFO_TABLE
INNER JOIN CUSTOMER_AIB_LOAN_TABLE
ON CUSTOMER_AIB_INFO_TABLE.ID = CUSTOMER_AIB_LOAN_TABLE.FK_ID
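Instead of the correlated subqueries, you could use an inner join on a derived table that is grouped by the customer id:
SELECT DISTINCT
CUSTOMER_AIB_INFO_TABLE.ID,
CUSTOMER_AIB_INFO_TABLE.NAME,
T.Total_Loan AS "Loan",
T.Total_Rep AS Repayment,
CUSTOMER_AIB_LOAN_TABLE.PURPOSE
FROM CUSTOMER_AIB_INFO_TABLE
INNER JOIN (
SELECT CUSTOMER_AIB_LOAN_TABLE.FK_ID AS FK_ID,
SUM (CUSTOMER_AIB_LOAN_TABLE.LOAN) AS Total_Loan,
SUM (CUSTOMER_AIB_LOAN_TABLE.REPAYMENT) AS Total_Rep
FROM CUSTOMER_AIB_LOAN_TABLE
GROUP BY CUSTOMER_AIB_LOAN_TABLE.FK_ID
) T ON T.FK_ID = CUSTOMER_AIB_INFO_TABLE.ID
-- PURPOSE lives in the loan table, so that table must be joined as well.
-- Drop PURPOSE and this join to get exactly one row per customer.
INNER JOIN CUSTOMER_AIB_LOAN_TABLE
ON CUSTOMER_AIB_LOAN_TABLE.FK_ID = CUSTOMER_AIB_INFO_TABLE.ID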
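A minimal sketch of the pre-aggregation idea, run on SQLite through Python's sqlite3 module. The column list and the sample rows are assumptions, since the question's tables were only shown as a picture. It returns one row per customer and keeps only customers whose repayments are still below their total loans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed columns and made-up rows; the real tables were only shown as an image.
    CREATE TABLE CUSTOMER_AIB_INFO_TABLE (ID INTEGER PRIMARY KEY, NAME TEXT);
    CREATE TABLE CUSTOMER_AIB_LOAN_TABLE (
        FK_ID INTEGER, LOAN INTEGER, REPAYMENT INTEGER, PURPOSE TEXT);
    INSERT INTO CUSTOMER_AIB_INFO_TABLE VALUES (1, 'Ali'), (2, 'Sara');
    INSERT INTO CUSTOMER_AIB_LOAN_TABLE VALUES
        (1, 1000, 0, 'Car'),
        (1, 500, 0, 'House'),
        (1, 0, 300, 'Car'),
        (2, 800, 0, 'Shop'),
        (2, 0, 800, 'Shop');
""")

# Aggregate the loan table once per customer, then join the totals back.
rows = conn.execute("""
    SELECT c.ID, c.NAME, t.Total_Loan, t.Total_Rep,
           t.Total_Loan - t.Total_Rep AS Outstanding
    FROM CUSTOMER_AIB_INFO_TABLE c
    INNER JOIN (
        SELECT FK_ID,
               SUM(LOAN) AS Total_Loan,
               SUM(REPAYMENT) AS Total_Rep
        FROM CUSTOMER_AIB_LOAN_TABLE
        GROUP BY FK_ID
    ) t ON t.FK_ID = c.ID
    WHERE t.Total_Loan > t.Total_Rep
    ORDER BY c.ID
""").fetchall()

print(rows)
```

Because the totals are computed before the join, a customer with three loan rows still appears once.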

Excluding multiple results in specific column (SQL JOIN)

I'm taking my first steps in terms of practical SQL use in real life.
I have a few tables with contractual and financial information and the query works exactly as I need - to a certain point. It looks more or less like that:
SELECT /some columns/ from CONTRACTS
Linked 3 extra tables with INNER JOIN to add things like department names, product information etc. This all works but they all have simplish one-to-one relationship (one contract related to single department in Department table, one product information entry in the corresponding table etc).
Now this is my challenge:
I also need to add contract invoicing information doing something like:
inner join INVOICES on CONTRACTS.contnoC = INVOICES.contnoI
(and selecting also the Invoice number linked to the Contract number, although that's partly optional)
The problem I'm facing is that unlike with other tables where there's always one-to-one relationship when joining tables, INVOICES table can have multiple (or none at all) entries that correspond to a single contract no. The result is that I will get multiple query results for a single contract no (with different invoice numbers presented), needlessly crowding the query results.
Essentially I'm looking to add INVOICES table to a query to just identify if the contract no is present in the INVOICES table (contract has been invoiced or not). Invoice number itself could be presented (it is with INNER JOIN), however it's not critical as long it's somehow marked. Invoice number fields remains blank in the result with the INNER JOIN function, which is also necessary (i.e. to have the row presented even if the match is not found in INVOICES table).
SELECT DISTINCT would look to do what I need, but I seemed to face the problem that I need to levy DISTINCT criteria only for column representing contract numbers, NOT any other column (there can be same values presented, but all those should be presented).
Unfortunately I'm not totally aware of what database system I am using.
It seems the question is still getting some attention, so in an effort to provide some explanation, here are a few techniques.
If you just want the contracts that have at least one invoice, with details from the one-to-one tables, you can do it similarly to what you have described, the key being NOT to include any column from the Invoices table in the column list.
SELECT
DISTINCT Contract, Department, ProductId .....(nothing from Invoices Table!!!)
FROM
Contracts c
INNER JOIN Departments D
ON c.departmentId = d.Department
INNER JOIN Product p
ON c.ProductId = p.ProductId
INNER JOIN Invoices i
ON c.contnoC = i.contnoI
Perhaps a little cleaner would be to use IN or EXISTS, like so:
SELECT
Contract, Department, ProductId .....(nothing from Invoices Table!!!)
FROM
Contracts c
INNER JOIN Departments D
ON c.departmentId = d.Department
INNER JOIN Product p
ON c.ProductId = p.ProductId
WHERE
EXISTS (SELECT 1 FROM Invoices i WHERE i.contnoI = c.contnoC )
SELECT
Contract, Department, ProductId .....(nothing from Invoices Table!!!)
FROM
Contracts c
INNER JOIN Departments D
ON c.departmentId = d.Department
INNER JOIN Product p
ON c.ProductId = p.ProductId
WHERE
contnoC IN (SELECT contnoI FROM Invoices)
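Don't use NOT IN if the subquery can return a NULL: a single NULL in the list makes NOT IN return no rows at all. Plain IN still returns the matching rows, but NOT EXISTS is the safer way to negate the test.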
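The NULL caveat bites hardest with NOT IN. The following sketch (SQLite via Python's sqlite3 module, with hypothetical one-column versions of the Contracts and Invoices tables) compares IN, NOT IN and NOT EXISTS when the subquery returns a NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical one-column tables, just enough to show the NULL behaviour.
    CREATE TABLE Contracts (contnoC INTEGER);
    CREATE TABLE Invoices (contnoI INTEGER);
    INSERT INTO Contracts VALUES (1), (2), (3);
    INSERT INTO Invoices VALUES (1), (NULL);
""")

def contract_numbers(where_clause):
    """Return the contract numbers that satisfy the given WHERE clause."""
    sql = "SELECT contnoC FROM Contracts c WHERE " + where_clause + " ORDER BY contnoC"
    return [row[0] for row in conn.execute(sql)]

# IN still finds the real match even though the list contains a NULL.
in_rows = contract_numbers("c.contnoC IN (SELECT contnoI FROM Invoices)")

# NOT IN evaluates to UNKNOWN for every non-matching row once a NULL is present.
not_in_rows = contract_numbers("c.contnoC NOT IN (SELECT contnoI FROM Invoices)")

# NOT EXISTS is unaffected by the NULL.
not_exists_rows = contract_numbers(
    "NOT EXISTS (SELECT 1 FROM Invoices i WHERE i.contnoI = c.contnoC)")

print(in_rows, not_in_rows, not_exists_rows)
```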
If you actually want all of the contracts and just need to know whether a contract has been invoiced, you can use aggregation and a case expression:
SELECT
Contract, Department, ProductId, CASE WHEN COUNT(i.contnoI) = 0 THEN 0 ELSE 1 END as Invoiced
FROM
Contracts c
INNER JOIN Departments D
ON c.departmentId = d.Department
INNER JOIN Product p
ON c.ProductId = p.ProductId
LEFT JOIN Invoices i
ON c.contnoC = i.contnoI
GROUP BY
Contract, Department, ProductId
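A cut-down sketch of the LEFT JOIN plus COUNT flag, run on SQLite through Python's sqlite3 module. The Departments and Product joins are left out for brevity, and the sample rows and the InvoiceNo column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Made-up rows: contract 1 has two invoices, contract 2 none, contract 3 one.
    CREATE TABLE Contracts (contnoC INTEGER PRIMARY KEY);
    CREATE TABLE Invoices (contnoI INTEGER, InvoiceNo TEXT);
    INSERT INTO Contracts VALUES (1), (2), (3);
    INSERT INTO Invoices VALUES (1, 'INV-1'), (1, 'INV-2'), (3, 'INV-3');
""")

# COUNT(i.contnoI) ignores the NULLs produced by unmatched LEFT JOIN rows,
# so it is 0 exactly when the contract has no invoice.
flags = conn.execute("""
    SELECT c.contnoC,
           CASE WHEN COUNT(i.contnoI) = 0 THEN 0 ELSE 1 END AS Invoiced
    FROM Contracts c
    LEFT JOIN Invoices i ON c.contnoC = i.contnoI
    GROUP BY c.contnoC
    ORDER BY c.contnoC
""").fetchall()

print(flags)
```

Each contract appears once regardless of how many invoices it has.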
Then if you actually want to return details about a particular invoice, you can use a technique similar to that of cybercentic87 if your RDBMS supports it, or you could use a correlated subquery in the select list with TOP or LIMIT, depending on your system. No GROUP BY is needed here, because the subquery returns at most one value per contract.
SELECT
Contract, Department, ProductId, (SELECT TOP 1 InvoiceNo FROM invoices i WHERE c.contnoC = i.contnoI ORDER BY CreateDate DESC) as LatestInvoiceNo
FROM
Contracts c
INNER JOIN Departments D
ON c.departmentId = d.Department
INNER JOIN Product p
ON c.ProductId = p.ProductId
I would do it this way:
with mainquery as(
<<here goes your main query>>
),
invoices_rn as(
select *,
ROW_NUMBER() OVER (PARTITION BY contnoI order by
<<some column to decide which invoice you want to take eg. date>>) as rn
from INVOICES
),
first_invoices as (
select * from invoices_rn where rn = 1
)
select * from mainquery
left join first_invoices i on contnoC = i.contnoI
This gives you an ability to get all of the invoice details to your query, also it gives you full control of which invoice you want see in your main query. Please read more about CTEs; they are pretty handy and much easier to understand / read than nested selects.
I still don't know what database you are using. If ROW_NUMBER is not available, I will figure out something else :)
Also, with a left join you should use the COALESCE function, for example:
COALESCE(i.invoice_number,'0')
Of course, this gives you some more possibilities. For example, in your main select you could do:
CASE WHEN i.invoice_number is null then 'NOT INVOICED'
else 'INVOICED'
END as isInvoiced
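The ROW_NUMBER approach can be tried out with the sketch below. It uses SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later). The InvoiceNo and CreateDate columns and all sample rows are assumptions, and the main query is reduced to the Contracts table alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical columns and rows; only contnoC / contnoI come from the question.
    CREATE TABLE Contracts (contnoC INTEGER PRIMARY KEY);
    CREATE TABLE Invoices (contnoI INTEGER, InvoiceNo TEXT, CreateDate TEXT);
    INSERT INTO Contracts VALUES (1), (2), (3);
    INSERT INTO Invoices VALUES
        (1, 'INV-1', '2024-01-05'),
        (1, 'INV-2', '2024-02-10'),
        (3, 'INV-3', '2024-03-01');
""")

# Number each contract's invoices newest-first, keep number 1,
# then LEFT JOIN so contracts without invoices still appear.
latest = conn.execute("""
    WITH invoices_rn AS (
        SELECT i.*,
               ROW_NUMBER() OVER (PARTITION BY contnoI
                                  ORDER BY CreateDate DESC) AS rn
        FROM Invoices i
    ),
    latest_invoice AS (
        SELECT * FROM invoices_rn WHERE rn = 1
    )
    SELECT c.contnoC,
           COALESCE(l.InvoiceNo, 'NOT INVOICED') AS LatestInvoiceNo
    FROM Contracts c
    LEFT JOIN latest_invoice l ON c.contnoC = l.contnoI
    ORDER BY c.contnoC
""").fetchall()

print(latest)
```

Changing the ORDER BY inside the window decides which invoice is kept for each contract.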
You can use
SELECT ..., invoiced = 'YES' ... where exists ...
union
SELECT ..., invoiced = 'NO' ... where not exists ...
or you can use a column like "invoiced" with a subquery into invoices to set its value depending on whether you get a hit or not.

Access 2002 SQL for joining three tables

I have been trying to get this to work for a while now. I have 3 tables. First table has the Sales for customers which include the CustomerID, DateOfSales (Which always has the first of the month). The second table has the CustomerName, CustomerID. The third table has which customers buy what product lines. They are stored by CustomerID, ProductID.
I want to get a list (from one SQL hopefully) that has ALL the customers that are listed as buying a certain ProductID AND the maxDate from the Sales. I can get all of them IF there are sales for that customer. How the heck do I get ALL customers that buy the certain ProductID AND the maxDate from Sales or NULL if there is no sales found?
SalesList |CustomerList|WhoBuysWhat
----------|------------|-----------
maxDate |CustomerID |CustomerID
CustomerID| |ProductID=17
This is as close as I got. It gets all max dates but only if there have been sales. I want the CustomerID and a NULL for the maxDate if there were no sales recorded yet.
SELECT WhoBuysWhat.CustomerID, CustomerList.CustomerName,
Max(SalesList.MonthYear) AS MaxOfMonthYear FROM (CustomerList INNER
JOIN SalesList ON CustomerList.CustomerID = SalesList.CustomerID) INNER
JOIN WhoBuysWhat ON CustomerList.CustomerID = WhoBuysWhat.CustomerID
WHERE (((SalesList.ProductID)=17)) GROUP BY WhoBuysWhat.CustomerID,
CustomerList.CustomerName;
Is it possible or do I need to use multiple SQL statements? I know we should get something newer than Access 2002 but that is what they have.
You want LEFT JOINs:
SELECT cl.CustomerID, cl.CustomerName,
Max(sl.MonthYear) AS MaxOfMonthYear
FROM (CustomerList as cl LEFT JOIN
(SELECT sl.*
FROM SalesList sl
WHERE sl.ProductID = 17
) as sl
ON cl.CustomerID = sl.CustomerID
) LEFT JOIN
WhoBuysWhat wbw
ON cl.CustomerID = wbw.CustomerID
GROUP BY cl.CustomerID, cl.CustomerName;
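As a hedged illustration of how a LEFT JOIN keeps customers with no sales, here is a sketch on SQLite through Python's sqlite3 module, not Access 2002. It assumes ProductID is stored in WhoBuysWhat, as in the question's table sketch, and the sample rows are made up. Under that assumption the product filter goes on WhoBuysWhat and only SalesList is left-joined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed layout: ProductID lives in WhoBuysWhat; all rows are hypothetical.
    CREATE TABLE CustomerList (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
    CREATE TABLE SalesList (CustomerID INTEGER, MonthYear TEXT);
    CREATE TABLE WhoBuysWhat (CustomerID INTEGER, ProductID INTEGER);
    INSERT INTO CustomerList VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Cody');
    INSERT INTO WhoBuysWhat VALUES (1, 17), (2, 17), (3, 42);
    INSERT INTO SalesList VALUES
        (1, '2003-01-01'), (1, '2003-03-01'), (3, '2003-02-01');
""")

# INNER JOIN limits the list to buyers of product 17;
# LEFT JOIN keeps those buyers even when they have no sales rows.
rows = conn.execute("""
    SELECT cl.CustomerID, cl.CustomerName,
           MAX(sl.MonthYear) AS MaxOfMonthYear
    FROM WhoBuysWhat wbw
    INNER JOIN CustomerList cl ON cl.CustomerID = wbw.CustomerID
    LEFT JOIN SalesList sl ON sl.CustomerID = cl.CustomerID
    WHERE wbw.ProductID = 17
    GROUP BY cl.CustomerID, cl.CustomerName
    ORDER BY cl.CustomerID
""").fetchall()

print(rows)
```

Bolt buys product 17 but has no sales yet, so its MaxOfMonthYear comes back as NULL (None in Python). Access needs its own parenthesised join syntax, so treat this only as a check of the logic.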