I am still very new to SQL and Tableau, but I am working towards a personal project of mine.
Table A contains the defect quantity per product category and the date each defect was raised:
+--------+-------------+--------------+-----------------+
| Issue# | Date_Raised | Category_ID# | Defect_Quantity |
+--------+-------------+--------------+-----------------+
| PCR12 | 11-Jan-2019 | Product#1 | 14 |
| PCR13 | 12-Jan-2019 | Product#1 | 54 |
| PCR14 | 5-Feb-2019 | Product#1 | 5 |
| PCR15 | 5-Feb-2019 | Product#2 | 7 |
| PCR16 | 20-Mar-2019 | Product#1 | 76 |
| PCR17 | 22-Mar-2019 | Product#2 | 5 |
| PCR18 | 25-Mar-2019 | Product#1 | 89 |
+--------+-------------+--------------+-----------------+
Table B shows the consumption quantity of each product by month:
+-------------+--------------+-------------------+
| Date_Raised | Category_ID# | Consumed_Quantity |
+-------------+--------------+-------------------+
| 5-Jan-2019 | Product#1 | 100 |
| 17-Jan-2019 | Product#1 | 200 |
| 5-Feb-2019 | Product#1 | 100 |
| 8-Feb-2019 | Product#2 | 50 |
| 10-Mar-2019 | Product#1 | 100 |
| 12-Mar-2019 | Product#2 | 50 |
+-------------+--------------+-------------------+
END RESULT
I would like to create a table or bar chart in Tableau that shows Defect_Quantity / Consumed_Quantity per month, per Category_ID#, something like this:
+----------+-----------+-----------+
| Month | Product#1 | Product#2 |
+----------+-----------+-----------+
| Jan-2019 | 23% | |
| Feb-2019 | 5% | 14% |
| Mar-2019 | 89% | 10% |
+----------+-----------+-----------+
WHAT I HAVE TRIED SO FAR
Unfortunately I have not really done anything yet. I am struggling to understand how to get rid of the duplicates that appear when I join the tables on Category_ID#.
I appreciate any help I can get here.
One option is to left join a monthly consumption subquery for each of Product#1 and Product#2.
select to_char(to_date(t1.Date_Raised,'dd-mon-yyyy'),'mon-yyyy') as month_raised
, sum(case when t1.category_id='Product#1' then t1.Defect_Quantity else 0 end)/max(p1.product1) * 100 as product1_pct
, sum(case when t1.category_id='Product#2' then t1.Defect_Quantity else 0 end)/max(p2.product2) * 100 as product2_pct
from tableA t1
left join
(select to_char(to_date(Date_Raised,'dd-mon-yyyy'),'mon-yyyy') month_raised
, sum(Consumed_Quantity) as product1
from tableB
where category_id = 'Product#1'
group by to_char(to_date(Date_Raised,'dd-mon-yyyy'),'mon-yyyy')) p1
on p1.month_raised = to_char(to_date(t1.Date_Raised,'dd-mon-yyyy'),'mon-yyyy')
left join
(select to_char(to_date(Date_Raised,'dd-mon-yyyy'),'mon-yyyy') month_raised
, sum(Consumed_Quantity) as product2
from tableB
where category_id = 'Product#2'
group by to_char(to_date(Date_Raised,'dd-mon-yyyy'),'mon-yyyy')) p2
on p2.month_raised = to_char(to_date(t1.Date_Raised,'dd-mon-yyyy'),'mon-yyyy')
group by to_char(to_date(t1.Date_Raised,'dd-mon-yyyy'),'mon-yyyy')
You can remove duplicate rows by using ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) AS rn and keeping only the rows where rn = 1. As for your end result, extract the month from the date and use a pivot to get one column per product.
I would do this as:
select to_char(date_raised, 'YYYY-MM') as yyyymm,
       (sum(case when category_id = 'Product#1' then defect_quantity end) /
        nullif(sum(case when category_id = 'Product#1' then consumed_quantity end), 0)
       ) as product1,
       (sum(case when category_id = 'Product#2' then defect_quantity end) /
        nullif(sum(case when category_id = 'Product#2' then consumed_quantity end), 0)
       ) as product2
from ((select date_raised, category_id, defect_quantity, 0 as consumed_quantity
       from a
      ) union all
      (select date_raised, category_id, 0 as defect_quantity, consumed_quantity
       from b
      )
     ) ab
group by to_char(date_raised, 'YYYY-MM')
order by min(date_raised);
(I changed the date format because I much prefer YYYY-MM, but that is irrelevant to the logic.)
Why do I prefer this method? It includes every month that has a row in either table. I don't have to worry that some months are inadvertently filtered out because the consumption or defect rows are missing for that month.
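To check the union all idea against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the dates are stored as ISO strings (YYYY-MM-DD), that the tables are called a and b, and that the product column is called category_id; SQLite's strftime stands in for to_char. It returns one row per month and category rather than one column per product, on the assumption that a long layout is easier to pivot in Tableau.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (date_raised TEXT, category_id TEXT, defect_quantity INTEGER);
CREATE TABLE b (date_raised TEXT, category_id TEXT, consumed_quantity INTEGER);
-- Sample rows from the question, with dates rewritten as ISO strings (assumption).
INSERT INTO a VALUES
  ('2019-01-11', 'Product#1', 14), ('2019-01-12', 'Product#1', 54),
  ('2019-02-05', 'Product#1', 5),  ('2019-02-05', 'Product#2', 7),
  ('2019-03-20', 'Product#1', 76), ('2019-03-22', 'Product#2', 5),
  ('2019-03-25', 'Product#1', 89);
INSERT INTO b VALUES
  ('2019-01-05', 'Product#1', 100), ('2019-01-17', 'Product#1', 200),
  ('2019-02-05', 'Product#1', 100), ('2019-02-08', 'Product#2', 50),
  ('2019-03-10', 'Product#1', 100), ('2019-03-12', 'Product#2', 50);
""")

# Stack both tables into one shape, then aggregate once per month and category.
# NULLIF avoids a division by zero when a month has defects but no consumption.
query = """
SELECT strftime('%Y-%m', date_raised) AS month,
       category_id,
       ROUND(100.0 * SUM(defect_quantity)
             / NULLIF(SUM(consumed_quantity), 0)) AS defect_pct
FROM (SELECT date_raised, category_id, defect_quantity, 0 AS consumed_quantity FROM a
      UNION ALL
      SELECT date_raised, category_id, 0, consumed_quantity FROM b) ab
GROUP BY month, category_id
ORDER BY month, category_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the sample rows, Product#1 in March comes out at 165% (76 + 89 defects against 100 consumed), so the rate can exceed 100% when defects are raised against stock consumed in an earlier month.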
I need a query that calculates the total amount in dollars of stolen goods for each month, for restricted and neutral items.
I have 2 tables
first
|   | UPC | item         | in_stock | price | ship_day    | class        |
| 1 | 101 | 'generator'  | 16       | 5999  | '12-1-2065' | 'restricted' |
| 2 | 102 | 'blank tape' | 30       | 3000  | '12-1-2065' | 'neutral'    |
second
|   | UPC | unit_stolen |
| 1 | 101 | 4           |
| 1 | 401 | 2           |
If I understand correctly, this is basically a join and group by:
select date_trunc('month', f.ship_day) as yyyymm,
       sum(f.price * s.unit_stolen) filter (where f.class = 'restricted') as restricted_total,
       sum(f.price * s.unit_stolen) filter (where f.class = 'neutral') as neutral_total
from first f join
     second s
     on f.upc = s.upc
group by date_trunc('month', f.ship_day);
I have 3 tables that I want to join together and group it to get client membership info. My code works for grouping the base table together but it breaks at the join part and I can't figure out why.
BASE TABLE : sales_detail
+------------+----------------+--------------+--------------+---------+
| order_date | transaction_id | product_cost | payment_type | country |
+------------+----------------+--------------+--------------+---------+
| 10/1       | 12345          | 20           | mastercard   | usa     |
| 10/1       | 12345          | 50           | mastercard   | usa     |
| 10/5       | 82456          | 50           | mastercard   | usa     |
| 10/9       | 64789          | 30           | visa         | canada  |
| 10/15      | 08546          | 20           | mastercard   | usa     |
| 10/15      | 08546          | 90           | mastercard   | usa     |
| 10/17      | 65898          | 50           | mastercard   | usa     |
+------------+----------------+--------------+--------------+---------+
table : client_information
+----------+-------------+-------+
| other_id | client_Type | item  |
+----------+-------------+-------+
| 112341   | new         | hola  |
| 112341   | old         | mango |
| 145634   | old         | pine  |
| 879547   | old         | vip   |
| 745688   | new         | unio  |
| 745688   | old         | dog   |
| 147899   | new         | cat   |
| 124589   | new         | amigo |
+----------+-------------+-------+
table : connector
+----------------+----------+-------+
| transaction_ID | other_id | item  |
+----------------+----------+-------+
| 12345          | 112341   | hola  |
| 82456          | 145634   | pine  |
| 08157          | 879547   | unio  |
| 08546          | 745688   | dog   |
| 65898          | 147899   | cat   |
| 06587          | 124589   | amigo |
+----------------+----------+-------+
**I want the output to look something like this:**
IDEAL OUTPUT
+------------+----------------+--------------+-------------+
| order_date | transaction_ID | product_cost | client_Type |
+------------+----------------+--------------+-------------+
| 10/1       | 12345          | 70           | new         |
| 10/5       | 82456          | 70           | old         |
| 10/15      | 08546          | 110          | old         |
| 10/17      | 65898          | 50           | new         |
+------------+----------------+--------------+-------------+
**I am trying to join my base table to the connector table by transaction ID to get other_id and item, and then match those to client_type.**
This is the code I used, but it failed to compile after I added the left joins:
select t1.transaction_id, sum(t1.product_cost), t1.order_date, t3.client_type
from sales_detail t1
left join (select DISTINCT transaction_ID, other_id, fruits from connector) t2
ON t1.transaction_ID=t2.transaction_ID
left join (select DISTINCT order_id, client_type, fruits from client information) t3
ON t2.other_id=t3.other_id and t2.item=t3.item
where t1.payment_type='mastercard' and t1.order_Date between '2020-10-01' and'2020-10-31'
and country != 'canada'
GROUP BY t1.transaction_id, t1.order_date, t3.client_type;
Thanks in advance! I am a beginner, so I am still learning the ins and outs of SQL! (I am using Hive.)
I think that's joins and aggregation. For more efficiency, you can pre-aggregate in a subquery, then join:
select sd.*, ci.client_type
from (
select order_date, transaction_id, sum(product_cost) product_cost
from sales_detail
where
payment_type = 'mastercard'
and order_date >= '2020-10-01'
and order_date < '2020-11-01'
and country <> 'canada'
group by order_date, transaction_id
) sd
inner join connector c on c.transaction_id = sd.transaction_id
inner join client_information ci on ci.other_id = c.other_id
Note that I rewrote the filter on order_date to use half-open intervals rather than between. This properly handles the case when your dates have a time portion.
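To illustrate the point about time portions, here is a small sketch using Python's built-in sqlite3 module. The timestamps and the transaction 99999 are made up for the illustration (the question only shows day-level dates), and order_date is stored as an ISO string, which is an assumption about the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_detail (order_date TEXT, transaction_id TEXT);
-- Hypothetical timestamps: the question only shows day-level dates.
INSERT INTO sales_detail VALUES
  ('2020-10-01 09:15:00', '12345'),
  ('2020-10-31 18:30:00', '65898'),
  ('2020-11-01 00:00:00', '99999');
""")

def ids(where_clause):
    """Return the transaction ids matching the given WHERE clause, oldest first."""
    sql = f"SELECT transaction_id FROM sales_detail WHERE {where_clause} ORDER BY order_date"
    return [row[0] for row in conn.execute(sql)]

# BETWEEN stops at '2020-10-31', which sorts before any time later that day.
between_ids = ids("order_date BETWEEN '2020-10-01' AND '2020-10-31'")
# The half-open interval keeps all of October 31 and excludes November 1.
half_open_ids = ids("order_date >= '2020-10-01' AND order_date < '2020-11-01'")

print(between_ids)    # ['12345']
print(half_open_ids)  # ['12345', '65898']
```

The 18:30 sale on October 31 is silently dropped by BETWEEN but kept by the half-open filter.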
From what I understand, your code works with an INNER JOIN, although not the way you would like, and it fails once you add a LEFT JOIN. I think the failure is caused by the NULL values that the LEFT JOIN produces. To handle a NULL value without getting an error, you can use a function that replaces NULL with 0.
One such function is T-SQL's ISNULL(yourColumn, 0); see its documentation.
I can see that the result table only needs clients who used mastercard, so you should use an inner join there so that only those clients are considered. The rest of the query looks okay, I guess; the main problem was the join on client information.
I think that in GMB's answer you also need to join on the item column; otherwise you will get multiple rows in the output.
select sd.*, ci.client_type
from (
select order_date, transaction_id, sum(product_cost) product_cost
from sales_detail
group by order_date, transaction_id
) sd
inner join connector c on c.transaction_id = sd.transaction_id
inner join client_information ci on ci.other_id = c.other_id and ci.item = c.item
Just modify with your filters and you should be sorted.
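As a quick check of why the item column matters, here is a minimal sketch using Python's built-in sqlite3 module with a subset of the sample rows from the question. The column types are assumptions; only the two join conditions differ between the queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE connector (transaction_id TEXT, other_id INTEGER, item TEXT);
CREATE TABLE client_information (other_id INTEGER, client_type TEXT, item TEXT);
-- A subset of the question's sample rows.
INSERT INTO connector VALUES ('12345', 112341, 'hola'), ('08546', 745688, 'dog');
INSERT INTO client_information VALUES
  (112341, 'new', 'hola'), (112341, 'old', 'mango'),
  (745688, 'new', 'unio'), (745688, 'old', 'dog');
""")

base = """
SELECT c.transaction_id, ci.client_type
FROM connector c
JOIN client_information ci ON {condition}
ORDER BY c.transaction_id, ci.client_type
"""

# Joining on other_id alone matches every item the client ever had.
by_id = conn.execute(base.format(condition="ci.other_id = c.other_id")).fetchall()
# Adding item narrows each transaction to a single client row.
by_id_and_item = conn.execute(
    base.format(condition="ci.other_id = c.other_id AND ci.item = c.item")
).fetchall()

print(by_id)           # 4 rows: each transaction appears as both 'new' and 'old'
print(by_id_and_item)  # [('08546', 'old'), ('12345', 'new')]
```

With the extra condition, each transaction maps to exactly one client_type, which matches the ideal output in the question.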
Hi, my query below sums multiple values based on #cropseason in my table. Since I have 4 crop seasons (1, 2, 3 or 4), it seems to be multiplying the values by 4. All I want is the values for 1 crop season. Can anyone assist? I have crop season in both tables.
With Summary as (
Select B_NAME as Branch, LOC as Location
,SUM(payment) as Gallons
,SUM(case when printed = 1 THEN Fee ELSE NULL END) as FeeCollected
,SUM(case when printed = 0 THEN Fee ELSE NULL END) as FeeNotCollected
,SUM(case when printed = 1 THEN Payment ELSE NULL END) as GallonsIssued
,SUM(case when printed = 0 THEN Payment ELSE NULL END) as GallonsNotIssued
From SicbWeeklyDeliveriesFuelArchive F Inner Join FarmerGroups G ON G.BSI_CODE = F.BSI_CODE
Where F.CROP_SEASON = #cropseason
Group By B_NAME, LOC
)
SELECT Branch
,Location
,Gallons
,GallonsIssued
,GallonsNotIssued
,FeeCollected
,FeeNotCollected
,((GallonsIssued/Gallons) * 100) as pct_GallonsCollected
FROM Summary
Order by Location, Branch
SicbWeeklyDeliveriesFuelArchive
+-------+----------+-------------+-----+---------+------+-------------+---------+
| ID | BSI_CODE | B_NAME | LOC | PAYMENT | FEE | CROP_SEASON | PRINTED |
+-------+----------+-------------+-----+---------+------+-------------+---------+
| 18735 | 2176 | SAN NARCISO | CZ | 85 | 8.5 | 4 | 0 |
| 18738 | 2176 | SAN NARCISO | CZ | 65 | 6.5 | 4 | 0 |
| 18739 | 10494 | SAN NARCISO | CZ | 85 | 8.5 | 3 | 0 |
+-------+----------+-------------+-----+---------+------+-------------+---------+
FarmerGroups
+-------+----------+-------------+-------------+
| ID | BSI_CODE | CROP_SEASON | BRANCH |
+-------+----------+-------------+-------------+
| 10473 | 2176 | 4 | SAN NARCISO |
| 11478 | 2176 | 3 | SAN NARCISO |
| 12787 | 10494 | 4 | SAN ROMAN |
+-------+----------+-------------+-------------+
It seems your join criteria are incomplete. The tables share BSI_CODE and CROP_SEASON, so I guess you want:
FROM sicbweeklydeliveriesfuelarchive f
JOIN farmergroups g ON g.bsi_code = f.bsi_code AND g.crop_season = f.crop_season
WHERE f.crop_season = #cropseason
But that's just guessing. Only you know how the tables are really related, what their rows represent, what columns make a row unique and what result you are actually after. Why do you join farmergroups at all? It looks like you are not really using the table in your query.
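To see the multiplication on the sample rows, here is a minimal sketch using Python's built-in sqlite3 module. Only the columns needed for the sum are created, and the crop season value 4 is just an example parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SicbWeeklyDeliveriesFuelArchive
  (id INTEGER, bsi_code INTEGER, payment INTEGER, crop_season INTEGER);
CREATE TABLE FarmerGroups (id INTEGER, bsi_code INTEGER, crop_season INTEGER);
-- Sample rows from the question (unused columns omitted).
INSERT INTO SicbWeeklyDeliveriesFuelArchive VALUES
  (18735, 2176, 85, 4), (18738, 2176, 65, 4), (18739, 10494, 85, 3);
INSERT INTO FarmerGroups VALUES
  (10473, 2176, 4), (11478, 2176, 3), (12787, 10494, 4);
""")

base = """
SELECT SUM(f.payment)
FROM SicbWeeklyDeliveriesFuelArchive f
JOIN FarmerGroups g ON {condition}
WHERE f.crop_season = ?
"""

# BSI_CODE 2176 has one FarmerGroups row per season, so each delivery matches twice.
gallons_bsi_only = conn.execute(
    base.format(condition="g.bsi_code = f.bsi_code"), (4,)
).fetchone()[0]
# Matching on crop season as well leaves one group row per delivery.
gallons_bsi_and_season = conn.execute(
    base.format(condition="g.bsi_code = f.bsi_code AND g.crop_season = f.crop_season"), (4,)
).fetchone()[0]

print(gallons_bsi_only)        # 300
print(gallons_bsi_and_season)  # 150
```

The two season-4 deliveries add up to 150 gallons; joining on BSI_CODE alone doubles that because the group has rows for two seasons.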
I'm having trouble writing a query in Microsoft Access 2016 that will show the sum of an Expense for a particular event, the sum of the signs that event produced, along with the year, event description and company name.
I think I am missing something simple, and am going to feel ridiculous once someone points it out. Hopefully I managed to format my question well enough that it is easy to spot!
Here are the tables involved, along with the dummy data I am testing with.
All_Company Company_Event
------------------ ---------------------------
| ID | Company | | ID | EventDescription |
|------|---------| |----|--------------------|
| 1 | Crapple | | 1 | Concert |
| 2 | Rito | | 2 | Party |
------------------ ---------------------------
Company_Target_Actual
----------------------------------------------------------------
| All_CompanyID | Company_EventID | Year | Quarter | Signed |
|----------------|-------------------|------|---------|--------|
| 1 | 2 | 2015 | 1 | 1 |
| 1 | 2 | 2015 | 2 | 0 |
| 1 | 2 | 2015 | 3 | 3 |
| 1 | 2 | 2015 | 4 | 1 |
----------------------------------------------------------------
Budget_Company_Expense
---------------------------------------------------------------------------------
| ID | All_CompanyID | Company_EventID | Year | Category | SubCategory| Expense |
---------------------------------------------------------------------------------
| 1 | 1 | 2 | 2015 | ABCD | 123 | 40 |
| 2 | 1 | 2 | 2015 | ABCD | cat | 113 |
| 3 | 1 | 2 | 2015 | ABCD | dog | 71 |
---------------------------------------------------------------------------------
This is my code for the query, I broke it up from the ugly Access long lines of code to make it easier to read.
SELECT DISTINCTROW All_Company.Company, Budget_Company_Expense.Year,
Budget_Company_Expense.Company_EventID, Company_Event.EventDescription,
Sum(Budget_Company_Expense.Expense) AS [Sum Of Expense USD],
Sum(Company_Target_Actual.Signed) AS [Sum Of Signed]
FROM Company_Event
INNER JOIN ((All_Company
INNER JOIN Company_Target_Actual
ON All_Company.[ID] = Company_Target_Actual.[All_CompanyID])
INNER JOIN Budget_Company_Expense
ON All_Company.[ID] = Budget_Company_Expense.[All_CompanyID])
ON Company_Event.[ID] = Budget_Company_Expense.[Company_EventID]
GROUP BY All_Company.Company, Budget_Company_Expense.Year,
Budget_Company_Expense.Company_EventID, Company_Event.EventDescription;
and here is the result from running my query
Result
-------------------------------------------------------------------------------------------
| Company | Year | Company_EventID | EventDescription | Sum of Expense USD | Sum of Signed|
-------------------------------------------------------------------------------------------
| Crapple | 2015 | 2 | Party | $896.00 | 15 |
-------------------------------------------------------------------------------------------
As you can see, it sums Signed as if the total (5) happened 3 times (the number of entries in the Budget_Company_Expense table), and vice versa for Expense: the total (224) is counted 4 times, once per entry in the Company_Target_Actual table. Any help with my issue would be greatly appreciated,
and if I forgot any information that may help find my mistake, please let me know what else I can provide!
Consider splitting the query into two aggregations, one to sum Signed in Company_Target_Actual and the other to sum Expense in Budget_Company_Expense. Then join the two queries by Company, Event, and Year, which are the grouping factors.
The query below uses two derived tables (subqueries in the FROM/JOIN clause). However, you can just as well save either one as a separate query and then join them in the final query:
SELECT t1.Company, t1.Year, t1.Company_EventID, t1.EventDescription,
t2.[Sum Of Expense USD], t1.[Sum of Signed]
FROM
(SELECT ac.ID AS CompanyID, ac.Company, ca.Year, ca.Company_EventID, ev.EventDescription,
SUM(ca.Signed) AS [Sum Of Signed]
FROM (Company_Target_Actual ca
INNER JOIN Company_Event ev
ON ca.Company_EventID = ev.ID)
INNER JOIN All_Company ac
ON ca.All_CompanyID = ac.ID
GROUP BY ac.ID, ac.Company, ca.Year, ca.Company_EventID, ev.EventDescription) AS t1
INNER JOIN
(SELECT ac.ID AS CompanyID, ac.Company, be.Year, be.Company_EventID, ev.EventDescription,
SUM(be.Expense) AS [Sum Of Expense USD]
FROM (Budget_Company_Expense be
INNER JOIN Company_Event ev
ON be.Company_EventID = ev.ID)
INNER JOIN All_Company ac
ON be.All_CompanyID = ac.ID
GROUP BY ac.ID, ac.Company, be.Year, be.Company_EventID, ev.EventDescription) AS t2
ON t1.CompanyID = t2.CompanyID
AND t1.Company_EventID = t2.Company_EventID
AND t1.Year = t2.Year
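The effect of aggregating before joining can be checked with a minimal sketch using Python's built-in sqlite3 module and the dummy data from the question. SQLite syntax is used instead of Access SQL, and the company and event lookup tables are left out to keep the example short, so this is an illustration of the idea rather than the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Company_Target_Actual
  (All_CompanyID INTEGER, Company_EventID INTEGER, Year INTEGER, Quarter INTEGER, Signed INTEGER);
CREATE TABLE Budget_Company_Expense
  (ID INTEGER, All_CompanyID INTEGER, Company_EventID INTEGER, Year INTEGER, Expense INTEGER);
INSERT INTO Company_Target_Actual VALUES
  (1, 2, 2015, 1, 1), (1, 2, 2015, 2, 0), (1, 2, 2015, 3, 3), (1, 2, 2015, 4, 1);
INSERT INTO Budget_Company_Expense VALUES
  (1, 1, 2, 2015, 40), (2, 1, 2, 2015, 113), (3, 1, 2, 2015, 71);
""")

# Joining the detail rows first pairs every quarter with every expense (4 x 3 rows).
naive = conn.execute("""
SELECT SUM(be.Expense), SUM(ca.Signed)
FROM Company_Target_Actual ca
JOIN Budget_Company_Expense be ON be.All_CompanyID = ca.All_CompanyID
""").fetchone()

# Aggregating each table to one row per company, event and year before joining
# keeps the join one-to-one, so nothing is counted twice.
pre_aggregated = conn.execute("""
SELECT e.total_expense, s.total_signed
FROM (SELECT All_CompanyID, Company_EventID, Year, SUM(Signed) AS total_signed
      FROM Company_Target_Actual
      GROUP BY All_CompanyID, Company_EventID, Year) s
JOIN (SELECT All_CompanyID, Company_EventID, Year, SUM(Expense) AS total_expense
      FROM Budget_Company_Expense
      GROUP BY All_CompanyID, Company_EventID, Year) e
  ON e.All_CompanyID = s.All_CompanyID
 AND e.Company_EventID = s.Company_EventID
 AND e.Year = s.Year
""").fetchone()

print(naive)           # (896, 15)
print(pre_aggregated)  # (224, 5)
```

The naive totals reproduce the $896 and 15 from the question, while the pre-aggregated join gives 224 and 5.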
Can you please help me build an SQL query to retrieve data from a history table?
I'm a newbie with only one week of coding experience. I've been trying simple SELECT statements so far but have hit a stumbling block.
My football club's database has three tables. The first one links balls to players:
BallDetail
| BallID | PlayerID | TeamID |
|--------|----------|--------|
| 1      | 11       | 21     |
| 2      | 12       | 22     |
The second one lists things that happen to the balls:
BallEventHistory
| BallID | Event | EventDate |
|--------|-------|------------|
| 1 | Pass | 2012-01-01 |
| 1 | Shoot | 2012-02-01 |
| 1 | Miss | 2012-03-01 |
| 2 | Pass | 2012-01-01 |
| 2 | Shoot | 2012-02-01 |
And the third one is a history change table. After a ball changes hands, history is recorded:
HistoryChanges
| BallID | ColumnName | ValueOld | ValueNew |
|--------|------------|----------|----------|
| 2 | PlayerID | 11 | 12 |
| 2 | TeamID | 21 | 22 |
I'm trying to obtain a table that lists every Pass and Shoot event Player 11 made with any ball before that ball went to another player. Like this:
| PlayerID | BallID | Event | Month |
|----------|--------|-------|-------|
| 11 | 1 | Pass | Jan |
| 11 | 1 | Shoot | Feb |
| 11 | 2 | Pass | Jan |
I started with this:
SELECT PlayerID, BallID, Event, DateName(month, EventDate)
FROM BallDetail bd INNER JOIN BallEventHistory beh ON bd.BallID = beh.BallID
WHERE PlayerID = 11 AND Event IN (Pass, Shoot) ...
But how can I make sure that Ball 2 is also included, even though it is with another player now?
Select PlayerID, BallID, Event, datename(month, EventDate) as Month, Count(*) as cnt
from
(
  Select
    Coalesce(
      (Select ValueNew
       from #HistoryChanges
       where ChangeDate = (Select max(ChangeDate)
                           from #HistoryChanges h2
                           where h2.BallID = h.BallID
                             and ColumnName = 'PlayerID'
                             and ChangeDate <= EventDate)
         and BallID = h.BallID
         and ColumnName = 'PlayerID'),
      (Select PlayerID from #BallDetail where BallID = h.BallID)
    ) as PlayerID,
    h.BallID, h.Event, EventDate
  from #BallEventHistory h
) a
Group by PlayerID, BallID, Event, datename(month, EventDate)
SELECT d.PlayerID, d.BallID, h.Event, DATENAME(mm, h.EventDate) AS Month
FROM BallDetail d JOIN BallEventHistory h ON d.BallID = h.BallID
WHERE h.Event IN ('Pass', 'Shoot')
  -- parentheses are needed because AND binds more tightly than OR
  AND (d.PlayerID = 11
       OR EXISTS (SELECT 1
                  FROM dbo.HistoryChanges c
                  WHERE c.BallID = d.BallID AND c.ValueOld = 11 AND c.ValueNew = d.PlayerID
                    AND c.ColumnName = 'PlayerID' AND c.ChangeDate = h.EventDate))
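For comparison, here is one possible way to work out who held a ball at the time of each event, sketched with Python's built-in sqlite3 module. It relies on an assumption that is not in the question: HistoryChanges has a ChangeDate column, and the change for Ball 2 happened on 2012-01-15 (a made-up date between its Pass and Shoot events). The holder at event time is taken as the ValueOld of the first PlayerID change after the event, falling back to the current PlayerID in BallDetail when no later change exists. The month is shown as YYYY-MM because SQLite has no DATENAME.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BallDetail (BallID INTEGER, PlayerID INTEGER, TeamID INTEGER);
CREATE TABLE BallEventHistory (BallID INTEGER, Event TEXT, EventDate TEXT);
-- ChangeDate is an assumed column; the question does not show one.
CREATE TABLE HistoryChanges
  (BallID INTEGER, ColumnName TEXT, ValueOld INTEGER, ValueNew INTEGER, ChangeDate TEXT);
INSERT INTO BallDetail VALUES (1, 11, 21), (2, 12, 22);
INSERT INTO BallEventHistory VALUES
  (1, 'Pass', '2012-01-01'), (1, 'Shoot', '2012-02-01'), (1, 'Miss', '2012-03-01'),
  (2, 'Pass', '2012-01-01'), (2, 'Shoot', '2012-02-01');
-- Hypothetical change date, chosen to fall between Ball 2's Pass and Shoot.
INSERT INTO HistoryChanges VALUES
  (2, 'PlayerID', 11, 12, '2012-01-15'), (2, 'TeamID', 21, 22, '2012-01-15');
""")

query = """
SELECT e.PlayerID, e.BallID, e.Event, e.Month
FROM (
  SELECT COALESCE(
           -- holder before the first change that happened after this event
           (SELECT c.ValueOld
            FROM HistoryChanges c
            WHERE c.BallID = h.BallID
              AND c.ColumnName = 'PlayerID'
              AND c.ChangeDate > h.EventDate
            ORDER BY c.ChangeDate
            LIMIT 1),
           -- no later change: the current holder also held the ball back then
           (SELECT d.PlayerID FROM BallDetail d WHERE d.BallID = h.BallID)
         ) AS PlayerID,
         h.BallID, h.Event, substr(h.EventDate, 1, 7) AS Month
  FROM BallEventHistory h
) e
WHERE e.PlayerID = ? AND e.Event IN ('Pass', 'Shoot')
ORDER BY e.BallID, e.Month
"""
events = conn.execute(query, (11,)).fetchall()
for event in events:
    print(event)
```

Under these assumptions the result contains Ball 1's Pass and Shoot plus Ball 2's Pass for Player 11, and leaves out Ball 2's Shoot, which happened after the ball changed hands.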