SQL - 2 table values to be grouped by third unconnected value

I want to create a graph that pulls data from 2 user questions generated from within an SQL database.
The issue is that the user questions are stored in the same table, as are the answers. The only connection is that the question string includes a year value, which I extract using the LEFT command so that I output a column called 'YEAR' with a list of integer values running from 2013 to 2038 (25 year period).
I then want to pull the corresponding answers ('forecast' and 'actual') from each 'YEAR' so that I can plot a graph with a couple of values from each year (sorry if this isn't making any sense). The graph should show a forecast line covering the 25 year period with a second line (or column) showing the actual value as it gets populated over the years. I'll then be able to visualise if our actual value is close to our original forecast figures (long term goal!)
CODE BELOW
SELECT CAST((LEFT(F_TASK_ANS.TA_ANS_QUESTION,4)) AS INTEGER) AS YEAR,
-- first select takes the left 4 characters of the question as a string, then converts the value to a whole number.
CAST((CASE WHEN F_TASK_ANS.TA_ANS_QUESTION LIKE '%forecast' THEN F_TASK_ANS.TA_ANS_ANSWER END) AS NUMERIC(9,2)) AS 'FORECAST',
CAST((CASE WHEN F_TASK_ANS.TA_ANS_QUESTION LIKE '%actual' THEN ISNULL(F_TASK_ANS.TA_ANS_ANSWER,0) END) AS NUMERIC(9,2)) AS 'ACTUAL'
-- actual value will be null until filled in each year therefore ISNULL added to replace null with 0.00.
FROM F_TASK_ANS INNER JOIN F_TASKS ON F_TASK_ANS.TA_ANS_FKEY_TA_SEQ = F_TASKS.TA_SEQ
WHERE TA_ANS_ANSWER <> ''
AND (TA_TASK_ID LIKE '%6051' OR TA_TASK_ID LIKE '%6052')
-- The two numbers above refer to separate PPM questions that the user enters a value into
I tried GROUP BY 'YEAR' but I get an error: "Each GROUP BY expression must contain at least one column that is not an outer reference", which I assume is because I haven't linked the 2 tables in any way...
Should I be adding a UNION so the tables are joined?
What I want to see is something like the following output (which I'll graph up later)
YEAR FORECAST ACTUAL
2013 135000 127331
2014 143000 145102
2015 149000 0
2016 158000 0
2017 161000 0
2018... etc
Any help or guidance would be hugely appreciated.
Thanks

Although the syntax is pretty hairy, this seems like a fairly simple query. You are in fact linking your two tables (with the JOIN statement) and you don't need a UNION.
Try something like this (using a common table expression, or CTE, to make the grouping clearer, and changing the syntax for slightly greater clarity):
WITH data
AS (
SELECT YEAR = CAST((LEFT(A.TA_ANS_QUESTION,4)) AS INTEGER)
, FORECAST = CASE WHEN A.TA_ANS_QUESTION LIKE '%forecast'
THEN CONVERT(NUMERIC(9,2), A.TA_ANS_ANSWER)
ELSE CONVERT(NUMERIC(9,2), 0)
END
, ACTUAL = CASE WHEN A.TA_ANS_QUESTION LIKE '%actual'
THEN CONVERT(NUMERIC(9,2), ISNULL(A.TA_ANS_ANSWER,0) )
ELSE CONVERT(NUMERIC(9,2), 0)
END
FROM F_TASK_ANS A
INNER JOIN F_TASKS T
ON A.TA_ANS_FKEY_TA_SEQ = T.TA_SEQ
-- It sounded like you wanted to include the ones where the answer was null. If
-- that's wrong, get rid of the test for NULL.
WHERE (A.TA_ANS_ANSWER <> '' OR A.TA_ANS_ANSWER IS NULL)
AND (TA_TASK_ID LIKE '%6051' OR TA_TASK_ID LIKE '%6052')
)
SELECT YEAR
, FORECAST = SUM(data.Forecast)
, ACTUAL = SUM(data.Actual)
FROM data
GROUP BY YEAR
ORDER BY YEAR
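As a rough, self-contained illustration of the conditional-aggregation idea used above (one CASE per output column, then SUM per year), here is a sketch using Python's built-in sqlite3 module. The `answers` table and its rows are hypothetical stand-ins for F_TASK_ANS, and SQLite's SUBSTR takes the place of T-SQL's LEFT, so treat it as a demonstration of the pattern rather than the exact query to run on SQL Server.

```python
import sqlite3

# Hypothetical, simplified stand-in for F_TASK_ANS: one row per question/answer pair.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE answers (question TEXT, answer TEXT)")
conn.executemany(
    "INSERT INTO answers VALUES (?, ?)",
    [
        ("2013 forecast", "135000"),
        ("2013 actual", "127331"),
        ("2014 forecast", "143000"),
        ("2014 actual", "145102"),
        ("2015 forecast", "149000"),  # no actual entered yet for 2015
    ],
)

# Each CASE routes a row's answer into exactly one column (0 in the other),
# so SUM per year collapses the forecast row and the actual row into one row.
rows = conn.execute(
    """
    SELECT CAST(SUBSTR(question, 1, 4) AS INTEGER) AS year,
           SUM(CASE WHEN question LIKE '%forecast'
                    THEN CAST(answer AS INTEGER) ELSE 0 END) AS forecast,
           SUM(CASE WHEN question LIKE '%actual'
                    THEN CAST(answer AS INTEGER) ELSE 0 END) AS actual
    FROM answers
    GROUP BY CAST(SUBSTR(question, 1, 4) AS INTEGER)
    ORDER BY year
    """
).fetchall()

for row in rows:
    print(row)
```

Note that the GROUP BY repeats the year expression rather than a quoted name: grouping by the string literal 'YEAR' groups by a constant, which is what SQL Server complains about in the question.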

Try something like this ...
SELECT CAST((LEFT(F_TASK_ANS.TA_ANS_QUESTION,4)) AS INT) AS [YEAR]
,SUM(CAST((CASE WHEN F_TASK_ANS.TA_ANS_QUESTION LIKE '%forecast'
THEN F_TASK_ANS.TA_ANS_ANSWER ELSE 0 END) AS NUMERIC(9,2))) AS [FORECAST]
,SUM(CAST((CASE WHEN F_TASK_ANS.TA_ANS_QUESTION LIKE '%actual'
THEN F_TASK_ANS.TA_ANS_ANSWER ELSE 0 END) AS NUMERIC(9,2))) AS [ACTUAL]
FROM F_TASK_ANS INNER JOIN F_TASKS
ON F_TASK_ANS.TA_ANS_FKEY_TA_SEQ = F_TASKS.TA_SEQ
WHERE TA_ANS_ANSWER <> ''
AND (TA_TASK_ID LIKE '%6051' OR TA_TASK_ID LIKE '%6052')
GROUP BY CAST((LEFT(F_TASK_ANS.TA_ANS_QUESTION,4)) AS INT)

Related

Improve CASE WHEN Performance

I want to calculate customer retention week over week. My sales_orders table has columns order_date, and customer_name. Basically I want to check if a customer in this week also had an order the previous week. To do this, I have used CASE WHEN and subquery as follows (I have extracted order_week in a cte I've called weekly_customers and gotten distinct customer names within each week):
SELECT wc.order_week,
wc.customer,
CASE
WHEN wc.customer IN (
SELECT sq.customer
FROM weekly_customers sq
WHERE sq.order_week = (wc.order_week - 1))
THEN 'YES'
ELSE 'NO'
END AS present_in_previous_week
from weekly_customers wc
The query returns the correct data. My issue is that the table is really huge, with about 15000 distinct weekly values, which leads to a very long execution time. Is there a way I can improve this loop, or is there an alternative to the loop altogether?
Something like this:
SELECT
wca.order_week,
wca.customer,
CASE WHEN wcb.customer IS NOT NULL THEN 'YES' ELSE 'NO' END AS present_in_previous_week
FROM weekly_customers AS wca
LEFT JOIN
weekly_customers AS wcb
ON
wca.customer = wcb.customer
AND wca.order_week - 1 = wcb.order_week
This joins all of the customer data onto the customer data from a week ago. If there is a record for a week ago then wcb.customer will not be null, and we can set the flag to "YES". Otherwise, we set the flag to "NO".
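To see the self-join behave on a tiny data set, here is a minimal sketch using Python's built-in sqlite3 module. The rows in `weekly_customers` are made up for illustration, and it assumes (as the question states) that each customer appears at most once per week; otherwise the join would produce duplicate rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weekly_customers (order_week INTEGER, customer TEXT)")
# Hypothetical sample data: distinct (week, customer) pairs.
conn.executemany(
    "INSERT INTO weekly_customers VALUES (?, ?)",
    [(1, "alice"), (1, "bob"), (2, "alice"), (2, "carol"), (3, "carol")],
)

# wca is "this week"; wcb is the same customer one week earlier, if any.
rows = conn.execute(
    """
    SELECT wca.order_week,
           wca.customer,
           CASE WHEN wcb.customer IS NOT NULL THEN 'YES' ELSE 'NO' END
               AS present_in_previous_week
    FROM weekly_customers AS wca
    LEFT JOIN weekly_customers AS wcb
           ON wca.customer = wcb.customer
          AND wca.order_week - 1 = wcb.order_week
    ORDER BY wca.order_week, wca.customer
    """
).fetchall()

for row in rows:
    print(row)
```

On a real table, an index covering (customer, order_week) would probably help the join, though that depends on the database engine and is not something the original answer claims.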

Compare data in the same table

I have a table that stores monthly data and I would like to create a comparison between the quantity movement within a period.
Here is an example of the table.
The SQL statement below is meant to return any changes that have happened within a period: any partial or total loss of a fund/policy, as well as any partial or total gain. I have been battling with it for a while, and any help would be much appreciated.
I currently have 5 sets of unions - (where the policies and funds match and there's a difference in quantities held, where the policies exist in the previous and not in the current and vice versa and where the securities exist in the previous and not in the current and vice versa) but the other unions work save for the last couple (where the securities exist in the previous and not in the current and vice versa). It doesn't seem to return every occurrence.
SELECT distinct pc.[Client]
,pc.Policy
,cast(pc.Qty as decimal) AS CurrQ
,0 AS PrevQ
,cast(pc.Qty as decimal) - 0 AS QtyDiff
,CASE WHEN cast(pc.Qty as decimal) - 0 > 0 THEN 'Bought Units'
WHEN cast(pc.Qty as decimal) - 0 < 0 THEN 'Sold Units'
ELSE 'Unknown'
END AS TransactionType
,convert(varchar,cast(pc.[ValDate] as date),103) AS CurrValDate
,'' AS PrevValDate
FROM table pc
WHERE convert(varchar,cast(pc.[ValDate] as date),103) = convert(varchar,getdate(),103)
AND pc.Policy IN (SELECT policy
FROM table
WHERE convert(varchar(10),[ValDate],103) = convert(varchar(10),getdate()-1,103)
AND pc.[Fund] NOT IN (SELECT PM.[Fund]
FROM table pc
LEFT JOIN table pm ON pc.policy = pm.policy
WHERE convert(varchar,cast(pc.[ValDate] as date),103) = convert(varchar,getdate(),103))
AND convert(varchar,cast(pm.[ValDate] as date),103) = convert(varchar,getdate()-1,103))
As @Larnu rightly pointed out in the comments, the extra conditions in the query effectively turned the LEFT JOIN into an INNER JOIN. I changed the code to put policy, fund and date in the ON clause:
FROM table pc
LEFT JOIN table pm ON (pc.policy = pm.policy
AND pc.fund = pm.fund
AND pc.[ValDate]-1 = pm.[ValDate])
and got rid of the sub queries.
Thanks again Larnu.
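For anyone wondering why the placement of the conditions matters, here is a small sketch using Python's built-in sqlite3 module. The `holdings` table, its columns and its rows are hypothetical (an integer `val_day` stands in for ValDate); the point is only to contrast a filter on the right-hand table in the ON clause with the same filter in the WHERE clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE holdings (policy TEXT, fund TEXT, val_day INTEGER, qty INTEGER)"
)
conn.executemany(
    "INSERT INTO holdings VALUES (?, ?, ?, ?)",
    [
        ("P1", "F1", 1, 10),
        ("P1", "F1", 2, 15),
        ("P1", "F2", 2, 5),  # fund F2 is new on day 2, no previous-day row
    ],
)

# Date condition in ON: unmatched current rows survive with NULLs on the pm side.
in_on = conn.execute(
    """
    SELECT pc.policy, pc.fund, pc.qty, COALESCE(pm.qty, 0)
    FROM holdings pc
    LEFT JOIN holdings pm
           ON pc.policy = pm.policy
          AND pc.fund = pm.fund
          AND pc.val_day - 1 = pm.val_day
    WHERE pc.val_day = 2
    ORDER BY pc.fund
    """
).fetchall()

# Same date condition in WHERE: rows where pm is NULL fail the test and vanish,
# so the LEFT JOIN behaves like an INNER JOIN.
in_where = conn.execute(
    """
    SELECT pc.policy, pc.fund, pc.qty, pm.qty
    FROM holdings pc
    LEFT JOIN holdings pm
           ON pc.policy = pm.policy
          AND pc.fund = pm.fund
    WHERE pc.val_day = 2
      AND pm.val_day = 1
    ORDER BY pc.fund
    """
).fetchall()

print(in_on)     # includes the newly bought fund F2 with a previous quantity of 0
print(in_where)  # F2 is missing
```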

Converting SQL pivot table to T-SQL for Report Builder 3.0

I've been having a spot of bother importing a rather long-winded SQL pivot table dataset into SQL Server Report Builder 3.0 in a format which allows me to add a parameter to the report outcome. I understand that this requires the query to be T-SQL friendly.
The context, in case it helps, is that I'm building a report to give a view over various market research panels' eligibility statuses, and I'd like to be able to present a drop-down menu to let users flick between panels. So the end @parameter will be on PanelCode / PanelName. It's a composite query:
SELECT
ELT.PanelCode,
ELR.PanelName,
ELR.Year,
ELT.PeriodType,
ELT.PeriodValue,
ELT.TotalPanelists,
ELT.EligiblePanelists,
ELR.TotalEligible,
ELR.TotalVacation,
ELR.TotalExcused,
ELR.TotalInactive,
ELR.TotalConnection,
ELR.TotalCompliance
FROM --the Ineligibility Reason Pivot Table (ELR)
(SELECT
PanelCode,
PanelName,
Year,
PeriodType,
PeriodValue,
Max([Eligible]) as TotalEligible,
Max([Vacation]) as TotalVacation,
Max([Excuse]) as TotalExcused,
Max([Inactive]) as TotalInactive,
Max([Connection]) as TotalConnection,
Max([Compliance]) as TotalCompliance
FROM
(SELECT
PanelCode,
PanelName,
Year,
PeriodType,
PeriodValue,
EligibilityFailureReason
FROM FullPanellistEligibilityView) FPR
Pivot
(count(EligibilityFailureReason) FOR EligibilityFailureReason IN ([Eligible], [Vacation], [Excuse], [Inactive], [Connection], [Compliance])) AS PVT
WHERE PeriodType <> '4 week period' and Year > 2012
GROUP BY PanelCode, PanelName, PeriodType, Year, PeriodValue) as ELR
, -- And the Eligibility Totals Query, ELT
(
SELECT
PanelCode,
PanelName,
Year,
PeriodType,
PeriodValue,
Count(Poll1s) as TotalPanelists,
Sum(Poll1s) as EligiblePanelists
FROM
(SELECT
PanelCode,
PanelName,
Year,
PeriodType,
PeriodValue,
CAST(isEligible as INT) as Poll1s
FROM FullPanellistEligibilityView) FPR
GROUP BY PanelCode, PeriodType, PeriodValue) ELT
WHERE (ELT.PeriodValue=ELR.PeriodValue) and (ELT.PanelCode=ELR.PanelCode)
I've been really struggling to find resources online which suggest how to take larger queries and make them parameterisable in Report Builder 3. What do I need to add, in addition to WHERE PanelName = @PanelName, to make this run?
EDIT1: I don't doubt that I've made this query far more complicated than necessary; I'm self-teaching. The schema isn't really necessary, as all this data is pulled from a single existing view, FullPanellistEligibilityView. Sample data, stripped down and mocked up from the view, can be found here
There are two things you need to do in order to set up a data driven parameter selection.
Firstly, you need to create a dataset to populate your parameter drop down menu. This needs to list all the values you want your user to be able to select, in the correct order. This can return a column each for the Label shown to the user and the value passed to the query:
select distinct PanelCode -- Parameter Value
,PanelName -- Parameter Label
from FullPanellistEligibilityView
order by PanelName
Create a Parameter and set the available values to this dataset, with the appropriate column used for the Label and Value properties.
Secondly, you need to add a filter to your dataset. I have taken the liberty of re-writing your query above to use a derived table/common table expression/cte instead of your PIVOT. The code below includes the reference to the SSRS parameter which will insert the 'Value' for the parameter once selected. This code is obviously not tested as I don't have your schema, but the design should be easy enough to understand:
with t
as
(
select PanelCode
,PeriodValue
,count(isEligible) as TotalPanelists -- I'm assuming this is a BIT column, in which case it shouldn't have any null values. If it does, you will need to handle this with count(isnull(isEligible,0))
,Sum(CAST(isEligible as INT)) as EligiblePanelists
from FullPanellistEligibilityView
where PanelCode = @PanelCode -- This will filter your data due to the INNER JOIN below.
group by PanelCode
,PeriodType
,PeriodValue
)
select e.PanelCode
,e.PanelName
,e.Year
,e.PeriodType
,e.PeriodValue
,t.TotalPanelists
,t.EligiblePanelists
,sum(case when e.EligibilityFailureReason = 'Eligible' then 1 else 0 end) as TotalEligible
,sum(case when e.EligibilityFailureReason = 'Vacation' then 1 else 0 end) as TotalVacation
,sum(case when e.EligibilityFailureReason = 'Excuse' then 1 else 0 end) as TotalExcused
,sum(case when e.EligibilityFailureReason = 'Inactive' then 1 else 0 end) as TotalInactive
,sum(case when e.EligibilityFailureReason = 'Connection' then 1 else 0 end) as TotalConnection
,sum(case when e.EligibilityFailureReason = 'Compliance' then 1 else 0 end) as TotalCompliance
from FullPanellistEligibilityView e
inner join t
on(e.PanelCode = t.PanelCode
and e.PeriodValue = t.PeriodValue
)
where e.PeriodType <> '4 week period'
and e.Year > 2012
group by e.PanelCode
,e.PanelName
,e.Year
,e.PeriodType
,e.PeriodValue
,t.TotalPanelists
,t.EligiblePanelists

SQL Query - combine 2 rows into 1 row

I have the following query (a view) in SQL Server. The query produces a result set that is needed to populate a grid. However, a new requirement has come up where the users would like to see the data on one row in our app. The tblTasks table can produce 1 or 2 rows. The issue arises when there are two rows that have the same job_number but different fldProjectContextId (1 or 31). I need to get the MechApprovalOut and ElecApprovalOut columns on one row instead of two.
I've tried restructuring the query using CTE and over partition and haven't been able to get the necessary results I need.
SELECT TOP (100) PERCENT
CAST(dbo.Job_Control.job_number AS int) AS Job_Number,
dbo.tblTasks.fldSalesOrder, dbo.tblTaskCategories.fldTaskCategoryName,
dbo.Job_Control.Dwg_Sent, dbo.Job_Control.Approval_done,
dbo.Job_Control.fldElecDwgSent, dbo.Job_Control.fldElecApprovalDone,
CASE WHEN DATEDIFF(day, dbo.Job_Control.Dwg_Sent, GETDATE()) > 14
AND dbo.Job_Control.Approval_done IS NULL
AND dbo.tblProjectContext.fldProjectContextID = 1
THEN 1 ELSE 0
END AS MechApprovalOut,
CASE WHEN DATEDIFF(day, dbo.Job_Control.fldElecDwgSent, GETDATE()) > 14
AND dbo.Job_Control.fldElecApprovalDone IS NULL
AND dbo.tblProjectContext.fldProjectContextID = 31
THEN 1 ELSE 0
END AS ElecApprovalOut,
dbo.tblProjectContext.fldProjectContextName,
dbo.tblProjectContext.fldProjectContextId, dbo.Job_Control.Drawing_Info,
dbo.Job_Control.fldElectricalAppDwg
FROM dbo.tblTaskCategories
INNER JOIN dbo.tblTasks
ON dbo.tblTaskCategories.fldTaskCategoryId = dbo.tblTasks.fldTaskCategoryId
INNER JOIN dbo.Job_Control
ON dbo.tblTasks.fldSalesOrder = dbo.Job_Control.job_number
INNER JOIN dbo.tblProjectContext
ON dbo.tblTaskCategories.fldProjectContextId = dbo.tblProjectContext.fldProjectContextId
WHERE (dbo.tblTaskCategories.fldTaskCategoryName = N'Approval'
OR dbo.tblTaskCategories.fldTaskCategoryName = N'Re-Approval')
AND (CASE WHEN DATEDIFF(day, dbo.Job_Control.Dwg_Sent, GETDATE()) > 14
AND dbo.Job_Control.Approval_done IS NULL
AND dbo.tblProjectContext.fldProjectContextID = 1
THEN 1 ELSE 0
END = 1)
OR (dbo.tblTaskCategories.fldTaskCategoryName = N'Approval'
OR dbo.tblTaskCategories.fldTaskCategoryName = N'Re-Approval')
AND (CASE WHEN DATEDIFF(day, dbo.Job_Control.fldElecDwgSent, GETDATE()) > 14
AND dbo.Job_Control.fldElecApprovalDone IS NULL
AND dbo.tblProjectContext.fldProjectContextID = 31
THEN 1 ELSE 0
END = 1)
ORDER BY dbo.Job_Control.job_number, dbo.tblTaskCategories.fldProjectContextId
The above query gives me the following result set:
I've created a workaround in code (which I don't like, but it works for now) that populates a "temp" table the way I need the data displayed: one record per job number, so that duplicate job numbers get the MechApprovalOut and ElecApprovalOut columns on one row (see the first record in the following screenshot).
Example:
With the desired result set and one row per job_number, this is how the form looks with the data and how I am using the result set.
Any help restructuring my query to combine duplicate rows with the same job number, so that the MechApprovalOut and ElecApprovalOut columns are on one row, is greatly appreciated! I'd much prefer to use a view in SQL than code in the app to populate a temp table.
Thanks,
Jimmy
What I would do is LEFT JOIN the main table to itself at the beginning of the query, matching on Job Number and Sales Order, such that the left side of the join is only looking at Approval task categories and the right side of the join is only looking at Re-Approval task categories. Then I would make extensive use of the COALESCE() function to select data from the correct side of the join for use later on and in the select clause. This may also be the piece you were missing to make a CTE work.
There is probably also a solution that uses a ranking/windowing function (maybe not RANK itself, but something in that category) along with the PARTITION BY clause. However, as those are fairly new to SQL Server, I haven't used them enough personally to be comfortable writing an example solution for you without direct access to the data to play with, and it would still take me a little more time to get right than I can devote to this right now. Maybe this paragraph will motivate someone else to do that work.

T-SQL Sum Values of Like Rows

I currently use this select statement in SSRS to report Recent Demand and Days of Inventory to end users.
select Issue.MATERIAL_NUMBER,
SUM(Issue.SHIPPED_QTY)AS DEMAND_QTY,
Main.QUANTITY_TOTAL_STOCK / SUM(Issue.SHIPPED_QTY) * 122 AS [DOI]
From AGS_DATAMART.dbo.GOODS_ISSUE AS Issue
join AGS_DATAMART.dbo.OPR_MATERIAL_DIM AS MAT on MAT.MATERIAL_NUMBER = Issue.MATERIAL_NUMBER
join AGS_DATAMART.dbo.SCE_ECC_MAIN_FINAL_INV_FACT AS MAIN on MAT.MATERIAL_SID = MAIN.MATERIAL_SID
join AGS_DATAMART.dbo.SCE_PLANT_DIM AS PLANT on PLANT.PLANT_SID = MAIN.PLANT_SID
Where Issue.SHIP_TO_CUSTOMER_ID = @CUSTID
and Issue.ACTUAL_PGI_DATE > GETDATE() - 122
and PLANT.PLANT_CODE = @CUSTPLANT
and MAIN.STORAGE_LOCATION = '0001'
Group by Issue.MATERIAL_NUMBER,Main.QUANTITY_TOTAL_STOCK
Pretty simple.
But it has come to my attention that they have similar Material Numbers whose values need to be combined.
Material | Qty
0242-55161W 1
0242-55161 3
The two Material Numbers above should be combined and reported as 0242-55161 Qty 4.
How do I combine rows like this? This is just 1 of many queries that will need to be adjusted. Is it possible?
EDIT - The similar material will always be the base number plus the "W", if that matters.
Please note I am brand new to SQL and SSRS, and this is my first time posting here.
Let me know if I need to include any other details.
Thanks in advance.
Answer:
Using just replace, it kept returning 2 unique lines even when using SUM.
I was able to get the desired result using the following. Can you see anything wrong with this method?
with Issue_Con AS
(
select replace(Issue.MATERIAL_NUMBER,'W','') As [MATERIAL_NUMBER],
Issue.SHIPPED_QTY AS [SHIPPED_QTY]
From AGS_DATAMART.dbo.GOODS_ISSUE AS Issue
Where Issue.SHIP_TO_CUSTOMER_ID = @CUSTSHIP
and Issue.SALES_ORDER_TYPE_CODE = 'ZTPC'
and Issue.ACTUAL_PGI_DATE > GETDATE() - 122
)
select Issue_Con.MATERIAL_NUMBER,
SUM(Issue_Con.SHIPPED_QTY)AS [DEMAND_QTY],
Main_Con.QUANTITY_TOTAL_STOCK / SUM(Issue_Con.SHIPPED_QTY) * 122 AS [DOI]
From Issue_Con
join Main_Con on Main_Con.MATERIAL_Number = Issue_Con.MATERIAL_Number
Group By Issue_Con.MATERIAL_NUMBER, Main_Con.QUANTITY_TOTAL_STOCK;
You need to replace Issue.MATERIAL_NUMBER in the select and group by with something else. What that something else is depends on your data.
If it's always 10 characters with anything afterwards ignored, then you can use SUBSTRING(Issue.MATERIAL_NUMBER, 1, 10)
If the extraneous character is always W and there are no Ws in the proper number, then you can use REPLACE(Issue.MATERIAL_NUMBER, 'W', '')
If it's anything from the first alphabetic character onwards, then you can use CASE WHEN PATINDEX('%[A-Za-z]%', Issue.MATERIAL_NUMBER) = 0 THEN Issue.MATERIAL_NUMBER ELSE SUBSTRING(Issue.MATERIAL_NUMBER, 1, PATINDEX('%[A-Za-z]%', Issue.MATERIAL_NUMBER) - 1) END
You could group your data by this expression instead of MATERIAL_NUMBER:
CASE SUBSTRING(MATERIAL_NUMBER, LEN(MATERIAL_NUMBER), 1)
WHEN 'W' THEN LEFT(MATERIAL_NUMBER, LEN(MATERIAL_NUMBER) - 1)
ELSE MATERIAL_NUMBER
END
That is, check if the last character is W. If it is, return all but the last character, otherwise return the entire value.
To avoid repeating the same expression twice (once in GROUP BY and once in SELECT) you could use a subselect, for example like this:
select Issue.MATERIAL_NUMBER_GROUP,
SUM(Issue.SHIPPED_QTY)AS DEMAND_QTY,
Main.QUANTITY_TOTAL_STOCK / SUM(Issue.SHIPPED_QTY) * 122 AS [DOI]
From (
SELECT
*,
CASE SUBSTRING(MATERIAL_NUMBER, LEN(MATERIAL_NUMBER), 1)
WHEN 'W' THEN LEFT(MATERIAL_NUMBER, LEN(MATERIAL_NUMBER) - 1)
ELSE MATERIAL_NUMBER
END AS MATERIAL_NUMBER_GROUP
FROM AGS_DATAMART.dbo.GOODS_ISSUE
) AS Issue
join AGS_DATAMART.dbo.OPR_MATERIAL_DIM AS MAT on MAT.MATERIAL_NUMBER = Issue.MATERIAL_NUMBER
join AGS_DATAMART.dbo.SCE_ECC_MAIN_FINAL_INV_FACT AS MAIN on MAT.MATERIAL_SID = MAIN.MATERIAL_SID
join AGS_DATAMART.dbo.SCE_PLANT_DIM AS PLANT on PLANT.PLANT_SID = MAIN.PLANT_SID
Where Issue.SHIP_TO_CUSTOMER_ID = @CUSTID
and Issue.ACTUAL_PGI_DATE > GETDATE() - 122
and PLANT.PLANT_CODE = @CUSTPLANT
and MAIN.STORAGE_LOCATION = '0001'
Group by Issue.MATERIAL_NUMBER_GROUP,Main.QUANTITY_TOTAL_STOCK
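A quick way to check the "strip only a trailing W" grouping on its own is the sketch below, which uses Python's built-in sqlite3 module. The `goods_issue` table and its rows are hypothetical, and SQLite's SUBSTR/LENGTH stand in for T-SQL's SUBSTRING/LEFT/LEN, so this only illustrates the grouping expression, not the full report query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE goods_issue (material_number TEXT, shipped_qty INTEGER)")
# Hypothetical rows; the last one has a W inside the number, not at the end.
conn.executemany(
    "INSERT INTO goods_issue VALUES (?, ?)",
    [("0242-55161W", 1), ("0242-55161", 3), ("0W17-20000", 2)],
)

# Drop the final character only when it is 'W'; leave every other value untouched.
rows = conn.execute(
    """
    SELECT CASE WHEN SUBSTR(material_number, -1) = 'W'
                THEN SUBSTR(material_number, 1, LENGTH(material_number) - 1)
                ELSE material_number
           END AS material_group,
           SUM(shipped_qty) AS demand_qty
    FROM goods_issue
    GROUP BY material_group
    ORDER BY material_group
    """
).fetchall()

for row in rows:
    print(row)
```

Unlike a blanket REPLACE(..., 'W', ''), this leaves the made-up number 0W17-20000 intact, which matters only if real material numbers can contain a W elsewhere.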