MS Access duplicate values using SUM - sql

I'm having trouble writing a query in Microsoft Access 2016 that will show the sum of an Expense for a particular event, the sum of the signs that event produced, along with the year, event description and company name.
I think I am missing something simple, and am going to feel ridiculous once someone points it out. Hopefully I managed to format my question well enough that it is easy to spot!
Here are the tables involved, along with the dummy data I am testing with.
All_Company Company_Event
------------------ ---------------------------
| ID | Company | | ID | EventDescription |
|------|---------| |----|--------------------|
| 1 | Crapple | | 1 | Concert |
| 2 | Rito | | 2 | Party |
------------------ ---------------------------
Company_Target_Actual
----------------------------------------------------------------
| All_CompanyID | Company_EventID | Year | Quarter | Signed |
|----------------|-------------------|------|---------|--------|
| 1 | 2 | 2015 | 1 | 1 |
| 1 | 2 | 2015 | 2 | 0 |
| 1 | 2 | 2015 | 3 | 3 |
| 1 | 2 | 2015 | 4 | 1 |
----------------------------------------------------------------
Budget_Company_Expense
---------------------------------------------------------------------------------
| ID | All_CompanyID | Company_EventID | Year | Category | SubCategory| Expense |
---------------------------------------------------------------------------------
| 1 | 1 | 2 | 2015 | ABCD | 123 | 40 |
| 2 | 1 | 2 | 2015 | ABCD | cat | 113 |
| 3 | 1 | 2 | 2015 | ABCD | dog | 71 |
---------------------------------------------------------------------------------
This is my code for the query. I broke it up from Access's ugly long lines to make it easier to read.
SELECT DISTINCTROW All_Company.Company, Budget_Company_Expense.Year,
Budget_Company_Expense.Company_EventID, Company_Event.EventDescription,
Sum(Budget_Company_Expense.Expense) AS [Sum Of Expense USD],
Sum(Company_Target_Actual.Signed) AS [Sum Of Signed]
FROM Company_Event
INNER JOIN ((All_Company
INNER JOIN Company_Target_Actual
ON All_Company.[ID] = Company_Target_Actual.[All_CompanyID])
INNER JOIN Budget_Company_Expense
ON All_Company.[ID] = Budget_Company_Expense.[All_CompanyID])
ON Company_Event.[ID] = Budget_Company_Expense.[Company_EventID]
GROUP BY All_Company.Company, Budget_Company_Expense.Year,
Budget_Company_Expense.Company_EventID, Company_Event.EventDescription;
and here is the result from running my query
Result
-------------------------------------------------------------------------------------------
| Company | Year | Company_EventID | EventDescription | Sum of Expense USD | Sum of Signed|
-------------------------------------------------------------------------------------------
| Crapple | 2015 | 2 | Party | $896.00 | 15 |
-------------------------------------------------------------------------------------------
As you can see, it is summing as if the total signs (5) happened 3 times (the number of entries in the Budget_Company_Expense table), and vice versa for the Expense. Any help on my issue would be greatly appreciated,
and if I forgot any information that may help find my mistake, please let me know what else I can provide!

Consider splitting the query into two aggregations, one to sum Signed in Company_Target_Actual and the other to sum Expense in Budget_Company_Expense. Then, join the two queries by Company, Event, and Year, which are the grouping factors.
The query below uses two derived tables (subqueries in the FROM/JOIN clauses). However, you could just as well save either one as a separate query and then join them in the final query:
SELECT t1.Company, t1.Year, t1.Company_EventID, t1.EventDescription,
t2.[Sum Of Expense USD], t1.[Sum of Signed]
FROM
(SELECT ac.ID AS CompanyID, ac.Company, ca.Year, ca.Company_EventID, ev.EventDescription,
SUM(ca.Signed) AS [Sum Of Signed]
FROM (Company_Target_Actual ca
INNER JOIN Company_Event ev
ON ca.Company_EventID = ev.ID)
INNER JOIN All_Company ac
ON ca.All_CompanyID = ac.ID
GROUP BY ac.ID, ac.Company, ca.Year, ca.Company_EventID, ev.EventDescription) AS t1
INNER JOIN
(SELECT ac.ID AS CompanyID, ac.Company, be.Year, be.Company_EventID, ev.EventDescription,
SUM(be.Expense) AS [Sum Of Expense USD]
FROM (Budget_Company_Expense be
INNER JOIN Company_Event ev
ON be.Company_EventID = ev.ID)
INNER JOIN All_Company ac
ON be.All_CompanyID = ac.ID
GROUP BY ac.ID, ac.Company, be.Year, be.Company_EventID, ev.EventDescription) AS t2
ON t1.CompanyID = t2.CompanyID
AND t1.Company_EventID = t2.Company_EventID
AND t1.Year = t2.Year
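
To see the row multiplication concretely, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for Access (an assumption: the dialects differ, and only the columns needed for the two sums are recreated from the question's dummy data).

```python
import sqlite3

# In-memory SQLite stands in for Access here (an assumption made for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Company_Target_Actual
    (All_CompanyID INT, Company_EventID INT, Year INT, Quarter INT, Signed INT);
CREATE TABLE Budget_Company_Expense
    (ID INT, All_CompanyID INT, Company_EventID INT, Year INT, Expense INT);
INSERT INTO Company_Target_Actual VALUES
    (1, 2, 2015, 1, 1), (1, 2, 2015, 2, 0), (1, 2, 2015, 3, 3), (1, 2, 2015, 4, 1);
INSERT INTO Budget_Company_Expense VALUES
    (1, 1, 2, 2015, 40), (2, 1, 2, 2015, 113), (3, 1, 2, 2015, 71);
""")

# Joining the two detail tables directly pairs every Signed row with every
# Expense row (4 x 3 = 12 rows), so both sums are inflated.
naive = conn.execute("""
    SELECT SUM(be.Expense), SUM(ca.Signed)
    FROM Company_Target_Actual ca
    JOIN Budget_Company_Expense be ON be.All_CompanyID = ca.All_CompanyID
""").fetchone()

# Aggregating each table first leaves one row per group on each side,
# so the join can no longer multiply rows.
fixed = conn.execute("""
    SELECT e.total_expense, s.total_signed
    FROM (SELECT All_CompanyID, Company_EventID, Year, SUM(Signed) AS total_signed
          FROM Company_Target_Actual
          GROUP BY All_CompanyID, Company_EventID, Year) s
    JOIN (SELECT All_CompanyID, Company_EventID, Year, SUM(Expense) AS total_expense
          FROM Budget_Company_Expense
          GROUP BY All_CompanyID, Company_EventID, Year) e
      ON e.All_CompanyID = s.All_CompanyID
     AND e.Company_EventID = s.Company_EventID
     AND e.Year = s.Year
""").fetchone()

print(naive)  # (896, 15)
print(fixed)  # (224, 5)
```

The inflated pair matches the numbers in the question: 224 counted 4 times and 5 counted 3 times.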

Related

SQL Group By and Join based on a weird client table

I have 3 tables that I want to join together and group it to get client membership info. My code works for grouping the base table together but it breaks at the join part and I can't figure out why.
BASE TABLE : sales_detail
+-------+-----------+-----------+-----------------------------------------+
| order_date | transaction_id| product_cost | payment_type | country
+-------+-----------+-----------+------------------------------------------+
| 10/1 | 12345 | 20 | mastercard | usa
| 10/1 | 12345 | 50 | mastercard | usa
| 10/5 | 82456 | 50 | mastercard | usa
| 10/9 | 64789 | 30 | visa | canada
| 10/15 | 08546 | 20 | mastercard | usa
| 10/15 | 08546 | 90 | mastercard | usa
| 10/17 | 65898 | 50 | mastercard | usa
+-------+-----------+-----------+-------------------------------------+
table : client_information
+-------+-----------+-----------+-------------------+
| other_id | client_Type | item
+-------+-----------+-----------+----------+
| 112341 | new | hola |
| 112341 | old | mango |
| 145634 | old | pine |
| 879547 | old | vip |
| 745688 | new | unio |
| 745688 | old | dog |
| 147899 | new | cat |
| 124589 | new | amigo |
+-------+-----------+-----------+-----------+
table : connector
+-------+-----------+-----------+-------------------+
| transaction_ID | other_id | item
+-------+-----------+-----------+----------+
| 12345 | 112341 | hola |
| 82456 | 145634 | pine |
| 08157 | 879547 | unio |
| 08546 | 745688 | dog |
| 65898 | 147899 | cat |
| 06587 | 124589 | amigo |
+-------+-----------+-----------+-----------+
I want the output to look something like this:
IDEAL OUTPUT
+-------+-----------+-----------+--------------------------------+
| order_date | transaction_ID | product_cost | client_Type|
+-------+-----------+-----------+--------------------------------+
| 10/1 | 12345 | 70 | new |
| 10/5 | 82456 | 70 | old |
| 10/15 | 08546 | 110 | old |
| 10/17 | 65898 | 50 | new |
+-------+-----------+-----------+----------------------------------+
I am trying to join my base table to the connector table by transaction ID to get other_id and item, which I then match against client_information to get client_type.
This is the code I used, but it failed to compile after I added the left joins:
select t1.transaction_id, sum(t1.product_cost), t1.order_date, t3.client_type
from sales_detail t1
left join (select DISTINCT transaction_ID, other_id, fruits from connector) t2
ON t1.transaction_ID=t2.transaction_ID
left join (select DISTINCT order_id, client_type, fruits from client information) t3
ON t2.other_id=t3.other_id and t2.item=t3.item
where t1.payment_type='mastercard' and t1.order_Date between '2020-10-01' and'2020-10-31'
and country != 'canada'
GROUP BY t1.transaction_id, t1.order_date, t3.client_type;
Thanks in advance! I am a beginner so still learning the ins and outs of sql! (am using hive)
I think that's joins and aggregation. For more efficiency, you can pre-aggregate in a subquery, then join:
select sd.*, ci.client_type
from (
select order_date, transaction_id, sum(product_cost) product_cost
from sales_detail
where
payment_type = 'mastercard'
and order_date >= '2020-10-01'
and order_date < '2020-11-01'
and country <> 'canada'
group by order_date, transaction_id
) sd
inner join connector c on c.transaction_id = sd.transaction_id
inner join client_information ci on ci.other_id = c.other_id
Note that I rewrote the filter on order_date to use half-open intervals rather than between. This properly handles the case when your dates have a time portion.
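A small sketch of why the half-open interval matters, using Python's sqlite3 with timestamps stored as text (an assumption: Hive's types and comparison rules are not identical, and the three sample rows are made up for illustration).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_detail (transaction_id TEXT, order_date TEXT)")
conn.executemany(
    "INSERT INTO sales_detail VALUES (?, ?)",
    [
        ("a", "2020-10-01 00:00:00"),
        ("b", "2020-10-31 14:30:00"),  # last day of the month, with a time portion
        ("c", "2020-11-01 00:00:00"),  # first instant of the next month
    ],
)

# BETWEEN with a bare end date stops at the very start of Oct 31.
between_ids = [r[0] for r in conn.execute(
    "SELECT transaction_id FROM sales_detail "
    "WHERE order_date BETWEEN '2020-10-01' AND '2020-10-31' "
    "ORDER BY transaction_id")]

# The half-open interval keeps everything before Nov 1, whatever the time of day.
half_open_ids = [r[0] for r in conn.execute(
    "SELECT transaction_id FROM sales_detail "
    "WHERE order_date >= '2020-10-01' AND order_date < '2020-11-01' "
    "ORDER BY transaction_id")]

print(between_ids)    # ['a']
print(half_open_ids)  # ['a', 'b']
```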
From what I understand, your code works with an INNER JOIN (although not the way you would like) and fails once you add a LEFT JOIN. I think the failure comes from the NULL values. To handle a NULL value without getting an error, use a function that replaces NULL with 0.
One such function is T-SQL's ISNULL(yourColumn, 0).
I can see that in the result table you only need clients who used mastercard, so you should use an inner join there so that only those clients are considered. The rest of the query looks okay, I guess; the main problem was the join on client information.
I think GMB's answer also needs to join on the item column, otherwise you will get multiple rows in the output.
select sd.*, ci.client_type
from (
select order_date, transaction_id, sum(product_cost) product_cost
from sales_detail
group by order_date, transaction_id
) sd
inner join connector c on c.transaction_id = sd.transaction_id
inner join client_information ci on ci.other_id = c.other_id and ci.item = c.item
Just modify with your filters and you should be sorted.
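As a rough check of the point about the item column, the sketch below loads the connector rows that have matching sales in the question, plus their client_information rows, into SQLite via Python's sqlite3 (an assumption: SQLite instead of Hive, and only a subset of the sample rows) and compares the two join conditions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE connector (transaction_id TEXT, other_id INT, item TEXT);
CREATE TABLE client_information (other_id INT, client_type TEXT, item TEXT);
INSERT INTO connector VALUES
    ('12345', 112341, 'hola'), ('82456', 145634, 'pine'),
    ('08546', 745688, 'dog'),  ('65898', 147899, 'cat');
INSERT INTO client_information VALUES
    (112341, 'new', 'hola'), (112341, 'old', 'mango'), (145634, 'old', 'pine'),
    (745688, 'new', 'unio'), (745688, 'old', 'dog'),   (147899, 'new', 'cat');
""")

# other_id alone is not unique in client_information, so two transactions
# pick up both a 'new' and an 'old' row.
by_id_only = conn.execute("""
    SELECT c.transaction_id, ci.client_type
    FROM connector c
    JOIN client_information ci ON ci.other_id = c.other_id
    ORDER BY c.transaction_id
""").fetchall()

# Adding item to the join condition leaves exactly one match per transaction.
by_id_and_item = conn.execute("""
    SELECT c.transaction_id, ci.client_type
    FROM connector c
    JOIN client_information ci
      ON ci.other_id = c.other_id AND ci.item = c.item
    ORDER BY c.transaction_id
""").fetchall()

print(len(by_id_only))   # 6
print(by_id_and_item)
# [('08546', 'old'), ('12345', 'new'), ('65898', 'new'), ('82456', 'old')]
```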

Duplicate records upon joining table

I am still very new to SQL and Tableau however I am trying to work myself towards achieving a personal project of mine.
Table A; shows a table which contains the defect quantity per product category and when it was raised
+--------+-------------+--------------+-----------------+
| Issue# | Date_Raised | Category_ID# | Defect_Quantity |
+--------+-------------+--------------+-----------------+
| PCR12 | 11-Jan-2019 | Product#1 | 14 |
| PCR13 | 12-Jan-2019 | Product#1 | 54 |
| PCR14 | 5-Feb-2019 | Product#1 | 5 |
| PCR15 | 5-Feb-2019 | Product#2 | 7 |
| PCR16 | 20-Mar-2019 | Product#1 | 76 |
| PCR17 | 22-Mar-2019 | Product#2 | 5 |
| PCR18 | 25-Mar-2019 | Product#1 | 89 |
+--------+-------------+--------------+-----------------+
Table B; shows the consumption quantity of each product by month
+-------------+--------------+-------------------+
| Date_Raised | Category_ID# | Consumed_Quantity |
+-------------+--------------+-------------------+
| 5-Jan-2019 | Product#1 | 100 |
| 17-Jan-2019 | Product#1 | 200 |
| 5-Feb-2019 | Product#1 | 100 |
| 8-Feb-2019 | Product#2 | 50 |
| 10-Mar-2019 | Product#1 | 100 |
| 12-Mar-2019 | Product#2 | 50 |
+-------------+--------------+-------------------+
END RESULT
I would like to create a table/bar chart in Tableau that shows Defect_Quantity/Consumed_Quantity per month, per Category_ID#, so something like this below:
+----------+-----------+-----------+
| Month | Product#1 | Product#2 |
+----------+-----------+-----------+
| Jan-2019 | 23% | |
| Feb-2019 | 5% | 14% |
| Mar-2019 | 89% | 10% |
+----------+-----------+-----------+
WHAT I HAVE TRIED SO FAR
Unfortunately I have not really done anything; I am struggling to understand how to get rid of the duplicates when joining the tables on Category_ID#.
Appreciate all the help I can receive here.
I can think of doing left joins on both product1 and 2.
-- defects per month divided by consumption per month, as a percentage
select to_char(to_date(t1.Date_Raised,'dd-mon-yyyy'),'mon-yyyy') as month_raised
, sum(case when t1.category_id='Product#1' then t1.Defect_Quantity else 0 end)/max(p1.product1) * 100 as product1_pct
, sum(case when t1.category_id='Product#2' then t1.Defect_Quantity else 0 end)/max(p2.product2) * 100 as product2_pct
from tableA t1
left join
(select to_char(to_date(Date_Raised,'dd-mon-yyyy'),'mon-yyyy') Date_Raised
, sum(Consumed_Quantity) as product1
from tableB
where category_id = 'Product#1'
group by to_char(to_date(Date_Raised,'dd-mon-yyyy'),'mon-yyyy')) p1
on p1.Date_Raised = to_char(to_date(t1.Date_Raised,'dd-mon-yyyy'),'mon-yyyy')
left join
(select to_char(to_date(Date_Raised,'dd-mon-yyyy'),'mon-yyyy') Date_Raised
, sum(Consumed_Quantity) as product2
from tableB
where category_id = 'Product#2'
group by to_char(to_date(Date_Raised,'dd-mon-yyyy'),'mon-yyyy')) p2
on p2.Date_Raised = to_char(to_date(t1.Date_Raised,'dd-mon-yyyy'),'mon-yyyy')
group by to_char(to_date(t1.Date_Raised,'dd-mon-yyyy'),'mon-yyyy')
You can remove duplicate rows by using ROW_NUMBER() OVER (PARTITION BY ORDER BY ) as RN. For your end result, extract the month from the date and use a pivot.
I would do this as:
select to_char(date_raised, 'YYYY-MM'),
(sum(case when product = 'Product#1' then defect_quantity end) /
sum(case when product = 'Product#1' then consumed_quantity end)
) as product1,
(sum(case when product = 'Product#2' then defect_quantity end) /
sum(case when product = 'Product#2' then consumed_quantity end)
) as product2
from ((select date_raised, product, defect_quantity, 0 as consumed_quantity
from a
) union all
(select date_raised, product, 0 as defect_quantity, consumed_quantity
from b
)
) ab
group by to_char(date_raised, 'YYYY-MM')
order by min(date_raised);
(I changed the date format because I much prefer YYYY-MM, but that is irrelevant to the logic.)
Why do I prefer this method? This will include all months where there is a row in either table. I don't have to worry that some months are inadvertently filtered out, because there are missing production or defects in one month.
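For anyone who wants to try the UNION ALL approach on the sample data, here is a minimal sketch in Python with sqlite3 (assumptions: SQLite rather than the original database, ISO-formatted dates, strftime in place of to_char, and the table and column names a, b and product taken from the query above). The 1.0 * factor is there only because SQLite would otherwise do integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (date_raised TEXT, product TEXT, defect_quantity INT);
CREATE TABLE b (date_raised TEXT, product TEXT, consumed_quantity INT);
INSERT INTO a VALUES
    ('2019-01-11', 'Product#1', 14), ('2019-01-12', 'Product#1', 54),
    ('2019-02-05', 'Product#1', 5),  ('2019-02-05', 'Product#2', 7),
    ('2019-03-20', 'Product#1', 76), ('2019-03-22', 'Product#2', 5),
    ('2019-03-25', 'Product#1', 89);
INSERT INTO b VALUES
    ('2019-01-05', 'Product#1', 100), ('2019-01-17', 'Product#1', 200),
    ('2019-02-05', 'Product#1', 100), ('2019-02-08', 'Product#2', 50),
    ('2019-03-10', 'Product#1', 100), ('2019-03-12', 'Product#2', 50);
""")

# Stack both tables into one row set, then aggregate once per month.
rows = conn.execute("""
    SELECT strftime('%Y-%m', date_raised) AS month,
           1.0 * SUM(CASE WHEN product = 'Product#1' THEN defect_quantity END)
               / SUM(CASE WHEN product = 'Product#1' THEN consumed_quantity END) AS product1,
           1.0 * SUM(CASE WHEN product = 'Product#2' THEN defect_quantity END)
               / SUM(CASE WHEN product = 'Product#2' THEN consumed_quantity END) AS product2
    FROM (SELECT date_raised, product, defect_quantity, 0 AS consumed_quantity FROM a
          UNION ALL
          SELECT date_raised, product, 0 AS defect_quantity, consumed_quantity FROM b) ab
    GROUP BY strftime('%Y-%m', date_raised)
    ORDER BY MIN(date_raised)
""").fetchall()

for month, p1, p2 in rows:
    print(month, p1, p2)  # a ratio is None when the product has no rows that month
```

No join is involved, so there is nothing to duplicate: each source row contributes to exactly one month.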

Relational database - adding products

I have the following situation, namely I need to make a database,
in which I will store products that the user added to breakfast,
lunch, midday meal and dinner ON A SPECIFIC DAY.
I have a problem with the construction of such a relational database.
I currently have this combination of two tables:
It seems to me that I need 3 tables here in which
the products themselves will be placed, but I have no idea how
I can combine these 3 tables to get queries
products depending on the type of meal (breakfast, lunch ..) and date (the day they were added)
Yes, you should have a Products table that should have the 5 last columns you are showing in your second table (Orders?). And remove them from the Orders table such that it only has the IDs referencing the Meal and Product and the Date.
Then you can do the following:
SELECT o.Date, m.Meal_Name, p.Product_Name, p.Carbohydrates,
p.Protein, p.Fat, p.Calories
FROM Orders o
INNER JOIN Meals m ON o.MealID = m.MealID
INNER JOIN Products p ON o.ProductID = p.ProductID
ORDER BY o.date, m.Meal_Name, p.Product_Name
Note that this will allow you to easily change the parameters (such as Fat or Carbohydrates) for a Product and have the change appear in all records for that product.
While there is certainly plenty of room to interpretation here and you may only want to go so far in normalizing your data, I think a better option would be:
meals:
id | user_id | category_id | date
1 | 1 | 1 | 2019-09-03
meal_category
id | name
1 | breakfast
2 | lunch
3 | dinner
products
id | name | carbs | protein | fat | calories
1 | apple| 10 | 5 | 0 | 30
2 | cat | 0 | 20 | 5 | 80
3 | ham | 10 | 30 | 10 | 160
meal_products
meal_id | product_id
1 | 1
1 | 2
Bringing this together:
SELECT meals.id, meals.user_id, meals.date, meal_category.name, products.name, products.carbs, products.protein, products.fat, products.calories
FROM meals
INNER JOIN meal_category ON meals.category_id = meal_category.id
INNER JOIN meal_products ON meals.id = meal_products.meal_id
INNER JOIN products ON meal_products.product_id = products.id
Which would yield
+----------+---------------+------------+--------------------+---------------+----------------+------------------+--------------+-------------------+
| meals.id | meals.user_id | meals.date | meal_category.name | products.name | products.carbs | products.protein | products.fat | products.calories |
+----------+---------------+------------+--------------------+---------------+----------------+------------------+--------------+-------------------+
| 1        | 1             | 9/3/2019   | breakfast          | apple         | 10             | 5                | 0                | 30                |
| 1        | 1             | 9/3/2019   | breakfast          | cat           | 0              | 20               | 5                | 80                |
+----------+---------------+------------+--------------------+---------------+----------------+------------------+--------------+-------------------+
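A runnable sketch of this four-table layout, using Python's sqlite3 (assumptions: SQLite as the engine, and a hypothetical helper products_for that is not part of the answer). It also shows the lookup the question asked for: products by meal type and date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE meal_category (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT,
                       carbs INT, protein INT, fat INT, calories INT);
CREATE TABLE meals (id INTEGER PRIMARY KEY, user_id INT,
                    category_id INT REFERENCES meal_category(id), date TEXT);
-- junction table: one row per product eaten in a meal
CREATE TABLE meal_products (meal_id INT REFERENCES meals(id),
                            product_id INT REFERENCES products(id));
INSERT INTO meal_category VALUES (1, 'breakfast'), (2, 'lunch'), (3, 'dinner');
INSERT INTO products VALUES (1, 'apple', 10, 5, 0, 30), (2, 'cat', 0, 20, 5, 80),
                            (3, 'ham', 10, 30, 10, 160);
INSERT INTO meals VALUES (1, 1, 1, '2019-09-03');
INSERT INTO meal_products VALUES (1, 1), (1, 2);
""")

def products_for(meal_name, day):
    """Hypothetical helper: products eaten at a given meal type on a given day."""
    return conn.execute("""
        SELECT products.name, products.calories
        FROM meals
        INNER JOIN meal_category ON meals.category_id = meal_category.id
        INNER JOIN meal_products ON meals.id = meal_products.meal_id
        INNER JOIN products ON meal_products.product_id = products.id
        WHERE meal_category.name = ? AND meals.date = ?
        ORDER BY products.name
    """, (meal_name, day)).fetchall()

breakfast = products_for("breakfast", "2019-09-03")
lunch = products_for("lunch", "2019-09-03")
print(breakfast)  # [('apple', 30), ('cat', 80)]
print(lunch)      # []
```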

Repeat all rows in left table for each unique ID in other table

I have a team of people who are scored on up to three metrics; sales, leads and Hours.
I have a table (tblScores) in MS Access which holds these scores, but only where a score exists (e.g. if someone had no sales, there would be no Sales entry for them).
| USERID | Metric | Score |
----------------------------------
| 20511 | Sales | 12 |
| 20511 | Leads | 9 |
| 20511 | Hours | 8 |
| 20694 | Sales | 10 |
| 20694 | Hours | 7.5 |
I am trying to create an SQL query that will output three records (each possible metric) for each User in the above table including null values where they don't have an entry for that metric. e.g
| USERID | Metric | Score |
----------------------------------
| 20511 | Sales | 12 |
| 20511 | Leads | 9 |
| 20511 | Hours | 8 |
| 20694 | Sales | 10 |
| 20694 | Leads | Null |
| 20694 | Hours | 7.5 |
I have set up another table (tblMetrics) with just these 3 metrics
| Metric |
---------------
| Sales |
| Leads |
| Hours |
and tried to do a left join on the metric table against the score table
SELECT tblMetrics.*, TblScores.UserID, TblScores.Score
FROM tblMetrics LEFT JOIN TblScores ON tblMetrics.Metric = TblScores.Metric;
but it is still not giving the desired output. Does anyone know if this is possible?
You need to do a CROSS JOIN first to generate all combinations, then a LEFT JOIN to find which ones are missing and assign NULL.
I checked the Access syntax, and the CROSS JOIN should be written like this:
SELECT DISTINCT M.Metric, S.USERID
FROM tblMetrics M, tblScores S
And the LEFT JOIN should be:
SELECT userMetric.*, S.Score
FROM ( SELECT DISTINCT M.Metric, S.USERID
FROM tblMetrics M, tblScores S
) userMetric
LEFT JOIN tblScores S
ON ( userMetric.USERID = S.USERID
AND userMetric.Metric = S.Metric )
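
The same cross-join-then-left-join idea can be tried outside Access. This is a minimal sketch with Python's sqlite3 (an assumption: SQLite in place of Access, with tblScores and tblMetrics recreated from the question's sample data).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblMetrics (Metric TEXT);
CREATE TABLE tblScores (USERID INT, Metric TEXT, Score REAL);
INSERT INTO tblMetrics VALUES ('Sales'), ('Leads'), ('Hours');
INSERT INTO tblScores VALUES
    (20511, 'Sales', 12), (20511, 'Leads', 9), (20511, 'Hours', 8),
    (20694, 'Sales', 10), (20694, 'Hours', 7.5);
""")

# Step 1 (inner query): every metric paired with every distinct user.
# Step 2 (outer query): left join back to the scores, so missing pairs get NULL.
rows = conn.execute("""
    SELECT um.USERID, um.Metric, s.Score
    FROM (SELECT DISTINCT m.Metric, s.USERID
          FROM tblMetrics m, tblScores s) um
    LEFT JOIN tblScores s
      ON um.USERID = s.USERID AND um.Metric = s.Metric
    ORDER BY um.USERID, um.Metric
""").fetchall()

for row in rows:
    print(row)  # (20694, 'Leads', None) is the row the plain LEFT JOIN could not produce
```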

Data aggregation with left-outer join

I am trying to pull some data with transaction counts, by branch, by week, which will later be used to feed some dynamic .Net charts.
I have a calendar table, I have a branch table and I have a transaction table.
Here is my DB info (only relevant columns included):
Branch Table:
ID (int), Branch (varchar)
Calendar Table:
Date (datetime), WeekOfYear(int)
Transaction Table:
Date (datetime), Branch (int), TransactionCount(int)
So, I want to do something like the following:
Select b.Branch, c.WeekOfYear, sum(TransactionCount)
FROM BranchTable b
LEFT OUTER JOIN TransactionTable t
on t.Branch = b.ID
JOIN Calendar c
on t.Date = c.Date
WHERE YEAR(c.Date) = @Year -- (SP accepts this parameter)
GROUP BY b.Branch, c.WeekOfYear
Now, this works EXCEPT when a branch doesn't have any transactions for a week, in which case NO RECORD is returned for that branch on that week. What I WANT is to get that branch, that week and "0" for the sum. I tried isnull(sum(TransactionCount), 0) - but that didn't work, either. So I will get the following (making up sums for illustration purposes):
+--------+------------+-----+
| Branch | WeekOfYear | Sum |
+--------+------------+-----+
| 1 | 1 | 25 |
| 2 | 1 | 37 |
| 3 | 1 | 19 |
| 4 | 1 | 0 | //THIS RECORD DOES NOT GET RETURNED, BUT I NEED IT!
| 1 | 2 | 64 |
| 2 | 2 | 34 |
| 3 | 2 | 53 |
| 4 | 2 | 11 |
+--------+------------+-----+
So, why doesn't the left-outer join work? Isn't that supposed to
Any help will be greatly appreciated. Thank you!
EDIT: SAMPLE TABLE DATA:
Branch Table:
+----+---------------+
| ID | Branch |
+----+---------------+
| 1 | First Branch |
| 2 | Second Branch |
| 3 | Third Branch |
| 4 | Fourth Branch |
+----+---------------+
Calendar Table:
+------------+------------+
| Date | WeekOfYear |
+------------+------------+
| 01/01/2015 | 1 |
| 01/02/2015 | 1 |
+------------+------------+
Transaction Table
+------------+--------+--------------+
| Date | Branch | Transactions |
+------------+--------+--------------+
| 01/01/2015 | 1 | 12 |
| 01/01/2015 | 1 | 9 |
| 01/01/2015 | 2 | 4 |
| 01/01/2015 | 2 | 2 |
| 01/01/2015 | 2 | 23 |
| 01/01/2015 | 3 | 42 |
| 01/01/2015 | 3 | 19 |
| 01/01/2015 | 3 | 7 |
+------------+--------+--------------+
If you want to return a query that contains each Branch and each week, then you'll need to first create a full list of that, then use a LEFT JOIN to the transactions to get the count. The code will be similar to:
select bc.Branch,
bc.WeekOfYear,
TotalTransaction = coalesce(sum(t.TransactionCount), 0)
from
(
select b.id, b.branch, c.WeekOfYear, c.date
from branch b
cross join Calendar c
-- if you want to limit the number of rows returned use a WHERE to limit the weeks
-- so far in the year or using the date column
WHERE c.date <= getdate()
and YEAR(c.Date) = @Year -- (SP accepts this parameter)
) bc
left join TransactionTable t
on t.Date = bc.Date
and bc.id = t.branch
GROUP BY bc.Branch, bc.WeekOfYear
This code will create in your subquery a full list of each branch with each date. Once you have this list, then you can JOIN to the transactions to get your total transaction count and you'd return each date as you want.
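Here is a small sketch of that pattern run against the sample data from the question's edit, using Python's sqlite3 (assumptions: SQLite instead of SQL Server, ISO date strings, and the year/getdate() filters left out to keep it short).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BranchTable (ID INT, Branch TEXT);
CREATE TABLE Calendar (Date TEXT, WeekOfYear INT);
CREATE TABLE TransactionTable (Date TEXT, Branch INT, TransactionCount INT);
INSERT INTO BranchTable VALUES
    (1, 'First Branch'), (2, 'Second Branch'), (3, 'Third Branch'), (4, 'Fourth Branch');
INSERT INTO Calendar VALUES ('2015-01-01', 1), ('2015-01-02', 1);
INSERT INTO TransactionTable VALUES
    ('2015-01-01', 1, 12), ('2015-01-01', 1, 9),
    ('2015-01-01', 2, 4),  ('2015-01-01', 2, 2),  ('2015-01-01', 2, 23),
    ('2015-01-01', 3, 42), ('2015-01-01', 3, 19), ('2015-01-01', 3, 7);
""")

# The derived table bc holds every (branch, date) pair; the LEFT JOIN then
# attaches transactions where they exist, and COALESCE turns an empty sum into 0.
rows = conn.execute("""
    SELECT bc.Branch, bc.WeekOfYear,
           COALESCE(SUM(t.TransactionCount), 0) AS TotalTransaction
    FROM (SELECT b.ID, b.Branch, c.WeekOfYear, c.Date
          FROM BranchTable b
          CROSS JOIN Calendar c) bc
    LEFT JOIN TransactionTable t
      ON t.Date = bc.Date AND t.Branch = bc.ID
    GROUP BY bc.ID, bc.Branch, bc.WeekOfYear
    ORDER BY bc.ID
""").fetchall()

for row in rows:
    print(row)  # the last row is ('Fourth Branch', 1, 0)
```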
Bring in the Calendar before you bring in the transactions:
SELECT b.Branch, c.WeekOfYear, coalesce(sum(t.TransactionCount), 0)
FROM BranchTable b
INNER JOIN CalendarTable c ON YEAR(c.Date) = @Year
LEFT JOIN TransactionTable t ON t.Branch = b.ID AND t.Date = c.Date
GROUP BY b.Branch, c.WeekOfYear
ORDER BY c.WeekOfYear, b.Branch