Missing Data in Query Results

I've inherited a database from someone who has since left my office, and I'm having trouble correcting an issue that's come up with some of the data being pulled.
I've been able to narrow it down to the query that aggregates how many entries there were for a business and calculates the percentage of those entries.
As the first two images show, there is data for both the General Returns and the NOC entries:
[NOC Entries]
[General Returns]
But when applying the same date range, no data comes up in the percentage calculation, when I believe there should be.
There are two query steps that make up how the percentage data is pulled. Step one pulls the following:
[Step 1]
But step 2 only pulls a very small part of the data:
[Step 2]
Here's the SQL Code for Part 1:
SELECT TBL_ACH_Customers.CUST_ACHName
, TBL_ACH_Returns_and_NOCs_Transactions.ACHReturns_Type
, COUNT(TBL_ACH_Returns_and_NOCs_Transactions.ACHReturns_Type) AS CountOfACHReturns_Type
FROM TBL_ACH_Returns_and_NOCs_Transactions
RIGHT JOIN TBL_ACH_Customers
ON TBL_ACH_Returns_and_NOCs_Transactions.CUST_ACHName = TBL_ACH_Customers.CUST_ACHName
WHERE (((TBL_ACH_Returns_and_NOCs_Transactions.ACHReturns_Date)
BETWEEN [Forms]![FRM_ACH_RPT_Return_Percentages]![startdate]
AND [Forms]![FRM_ACH_RPT_Return_Percentages]![enddate]))
GROUP BY TBL_ACH_Customers.CUST_ACHName
, TBL_ACH_Returns_and_NOCs_Transactions.ACHReturns_Type
ORDER BY TBL_ACH_Customers.CUST_ACHName;
And Part 2:
SELECT QRY_ACH_RPT_Return_Percentages_Step_1.CUST_ACHName
, QRY_ACH_RPT_Return_Percentages_Step_1.ACHReturns_Type
, QRY_ACH_RPT_Return_Percentages_Step_1.CountOfACHReturns_Type
, COUNT(TBL_ACH_Transactions.ACHTRAN_Amount) AS CountOfACHTRAN_Amount
, SUM(TBL_ACH_Transactions.ACHTRAN_NumEntries) AS SumOfACHTRAN_NumEntries
FROM TBL_ACH_Transactions
INNER JOIN QRY_ACH_RPT_Return_Percentages_Step_1
ON TBL_ACH_Transactions.CUST_ACHName = QRY_ACH_RPT_Return_Percentages_Step_1.CUST_ACHName
WHERE (((TBL_ACH_Transactions.ACHTRAN_EffectiveDate)
BETWEEN [Forms]![FRM_ACH_RPT_Return_Percentages]![startdate]
AND [Forms]![FRM_ACH_RPT_Return_Percentages]![enddate]))
GROUP BY QRY_ACH_RPT_Return_Percentages_Step_1.CUST_ACHName
, QRY_ACH_RPT_Return_Percentages_Step_1.ACHReturns_Type
, QRY_ACH_RPT_Return_Percentages_Step_1.CountOfACHReturns_Type;
Any help or guidance would be appreciated, and feel free to ask questions if you need more information!
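One thing worth checking in the two steps above, sketched here with Python's built-in sqlite3 module. The tables customers, returns and transactions, their columns, and all rows are made up for illustration and are not the real TBL_ACH_* schema. The sketch shows a general behaviour rather than a confirmed diagnosis: a date filter in WHERE on the outer-joined table, and the INNER JOIN in Step 2, both drop customers that have no matching rows in the date range, whereas a LEFT JOIN with the date test in the ON clause keeps them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers    (name TEXT PRIMARY KEY);
CREATE TABLE returns      (name TEXT, return_type TEXT, return_date TEXT);
CREATE TABLE transactions (name TEXT, num_entries INTEGER, effective_date TEXT);

INSERT INTO customers VALUES ('Acme'), ('Bolt'), ('Cask');
INSERT INTO returns VALUES
  ('Acme', 'NOC',    '2017-01-10'),
  ('Bolt', 'Return', '2017-01-12'),
  ('Cask', 'Return', '2016-06-01');   -- outside the date range
INSERT INTO transactions VALUES
  ('Acme', 100, '2017-01-05'),
  ('Bolt',  50, '2016-12-20');        -- outside the date range
""")

# Step 1 analogue: the WHERE on the outer-joined table removes customers
# whose returns fall outside the range, so the outer join acts like an inner join.
step1 = """
SELECT c.name, r.return_type, COUNT(r.return_type) AS return_count
FROM customers AS c
LEFT JOIN returns AS r ON r.name = c.name
WHERE r.return_date BETWEEN :start AND :end
GROUP BY c.name, r.return_type
"""

# Step 2 analogue as posted: INNER JOIN plus WHERE on the transaction date.
step2_inner = f"""
SELECT s.name, s.return_count, SUM(t.num_entries) AS entries
FROM ({step1}) AS s
INNER JOIN transactions AS t ON t.name = s.name
WHERE t.effective_date BETWEEN :start AND :end
GROUP BY s.name, s.return_count
ORDER BY s.name
"""

# Variant: LEFT JOIN with the date test moved into the ON clause.
step2_left = f"""
SELECT s.name, s.return_count, SUM(t.num_entries) AS entries
FROM ({step1}) AS s
LEFT JOIN transactions AS t
       ON t.name = s.name
      AND t.effective_date BETWEEN :start AND :end
GROUP BY s.name, s.return_count
ORDER BY s.name
"""

params = {"start": "2017-01-01", "end": "2017-01-31"}
inner_rows = conn.execute(step2_inner, params).fetchall()
left_rows = conn.execute(step2_left, params).fetchall()
print(inner_rows)  # Bolt is missing: it has a return but no transaction in range
print(left_rows)   # Bolt is kept, with a NULL (None) entry total
```

In this toy data, Bolt has a return in the range but no transaction in the range, so the inner-join version loses it. Access accepts a LEFT JOIN here as well, though compound ON conditions may need to be written in SQL view rather than the query designer.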

Related

COUNT Clicks/Opens for Engagement Scoring

I am a bit rusty on SQL so any assistance is appreciated. I am also referencing my SQL textbook but I thought I would try this out.
I am developing a lead scoring model starting with engagement scoring. I created a data extension to house the results and used the following query to populate:
SELECT a.[opportunityid],
a.[first name],
a.[last name],
a.[anticipatedentryterm],
a.[funnelstage],
a.[programofinterest],
a.[opportunitystage],
a.[opportunitystatus],
a.[createdon],
a.[ownerfirstname],
a.[ownerlastname],
a.[f or j visa student],
a.[donotbulkemail],
a.[statecode],
Count(DISTINCT c.[subscriberkey]) AS 'Clicks',
Count(DISTINCT b.[subscriberkey]) AS 'Opens',
Count(DISTINCT b.[subscriberkey]) * 1.5 +
Count(DISTINCT c.[subscriberkey]) * 3 AS 'Probability'
FROM [ug_all_time_joined] a
INNER JOIN [open] b
ON a.[opportunityid] = b.[subscriberkey]
INNER JOIN [click] c
ON a.[opportunityid] = c.[subscriberkey]
GROUP BY a.[opportunityid],
a.[first name],
a.[last name],
a.[anticipatedentryterm],
a.[funnelstage],
a.[programofinterest],
a.[opportunitystage],
a.[opportunitystatus],
a.[createdon],
a.[ownerfirstname],
a.[ownerlastname],
a.[f or j visa student],
a.[donotbulkemail],
a.[statecode]
Something is wrong with my COUNT functions: the query populates the same value in both Clicks and Opens, and I don't think it's accurate. The result I am aiming for is how many times a subscriber ID appears (which would correspond to the individual clicks/opens, since each row is one action).
Thank you!

Why is that surprising?
You have two joins that, taken to their logical conclusion, imply that
b.[SubscriberKey] = c.[SubscriberKey]
Hence, the two distinct counts will always be the same.
You have not provided sample data or desired results. I can speculate, though, that you intend LEFT JOINs, so that you keep values in one table that are not matched in the other.
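To make the point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names leads, opens and clicks and the rows in them are hypothetical stand-ins for the data extensions in the question. The first query mirrors the posted joins. The second is one possible way (an assumption about the intent, not the only option) to count rows per subscriber: aggregate each table on its own and LEFT JOIN the totals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE leads  (opportunityid INTEGER PRIMARY KEY);
CREATE TABLE opens  (subscriberkey INTEGER);
CREATE TABLE clicks (subscriberkey INTEGER);
INSERT INTO leads  VALUES (1), (2);
INSERT INTO opens  VALUES (1), (1), (1), (2), (2);  -- lead 1: 3 opens, lead 2: 2 opens
INSERT INTO clicks VALUES (1);                      -- lead 1: 1 click, lead 2: none
""")

# Same shape as the posted query: both keys equal a.opportunityid,
# so each DISTINCT count can only ever be 1, and lead 2 (no clicks) disappears.
joined_rows = conn.execute("""
SELECT a.opportunityid,
       COUNT(DISTINCT b.subscriberkey) AS opens,
       COUNT(DISTINCT c.subscriberkey) AS clicks
FROM leads AS a
INNER JOIN opens  AS b ON a.opportunityid = b.subscriberkey
INNER JOIN clicks AS c ON a.opportunityid = c.subscriberkey
GROUP BY a.opportunityid
ORDER BY a.opportunityid
""").fetchall()

# Count rows per subscriber in each table separately, then LEFT JOIN the totals
# so the two tables cannot multiply or filter each other.
per_table_rows = conn.execute("""
SELECT a.opportunityid,
       COALESCE(o.opens, 0)  AS opens,
       COALESCE(c.clicks, 0) AS clicks
FROM leads AS a
LEFT JOIN (SELECT subscriberkey, COUNT(*) AS opens
           FROM opens GROUP BY subscriberkey) AS o
       ON o.subscriberkey = a.opportunityid
LEFT JOIN (SELECT subscriberkey, COUNT(*) AS clicks
           FROM clicks GROUP BY subscriberkey) AS c
       ON c.subscriberkey = a.opportunityid
ORDER BY a.opportunityid
""").fetchall()

print(joined_rows)     # [(1, 1, 1)]
print(per_table_rows)  # [(1, 3, 1), (2, 2, 0)]
```

Pre-aggregating also avoids the row multiplication you would get from joining the raw opens and clicks tables to the same lead at once.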

When you do an inner join between a and b, your data is already filtered by the time you join a and c, which will give you incorrect results. Having no view of your data and no background on your tables, this is the best guess I have.

Defaulting missing data

I have a complex schema that I am trying to pull data out of for a report. The query joins a bunch of tables together, and I am specifically looking to pull a subset of data where everything for it might be null. The original relations for the tables look like this:
Location.DeptFK
Dept.PK
Section.DeptFK
Subsection.SectionFK
Question.SubsectionFK
Answer.QuestionFK, SubmissionFK
Submission.PK, LocationFK
From here my problems begin to compound a little.
SELECT Section.StepNumber + '-' + Question.QuestionNumber AS QuestionNumberVar,
Question.Question,
Subsection.Name AS Subsection,
Section.Name AS Section,
SUM(CASE WHEN (Answer.Answer = 0) THEN 1 ELSE 0 END) AS NA,
SUM(CASE WHEN (Answer.Answer = 1) THEN 1 ELSE 0 END) AS AnsNo,
SUM(CASE WHEN (Answer.Answer = 2) THEN 1 ELSE 0 END) AS AnsYes,
(SELECT COUNT(DISTINCT Location.Abbreviation) FROM Department INNER JOIN Location ON Location.DepartmentFK = Department.PK WHERE (Department.Name = 'InsertParameter'))
AS total
FROM Department inner join
section on Department.PK = section.DepartmentFK inner JOIN
subsection on Subsection.SectionFK = Section.PK INNER JOIN
question on Question.SubsectionFK = Subsection.PK INNER JOIN
Answer on Answer.QuestionFK = question.PK inner JOIN
Submission on Submission.PK = Answer.SubmissionFK inner join
Location on Location.DepartmentFK = Department.PK AND Location.PK = Submission.LocationFK
WHERE (Department.Name = 'InsertParameter') AND (Submission.MonthTested = '1/1/2017')
GROUP BY Question.Question, QuestionNumberVar, Subsection.Name, Section.Name, Section.StepNumber
ORDER BY QuestionNumberVar;
There are 15 locations in total, but with this query I get only 12. If I remove the Location relation from the join, I get all 15 locations, but my answer data gets multiplied by 15. My issue is that not all locations are required to test at the same time, so their answers should default to NA. They don't get records placed in the DB, so the relationship between Location and Submission is absent.
I have a workaround almost in place via the select count distinct, but the second part is a query for finding what each location answered instead of a sum, which brings the problem right back around. It also has to be dynamic, because the input parameters for a department won't bring back a static number of locations each time.
I am still learning SQL, so any additional material to look at for building this query would also be appreciated. So I guess the big question here is: how would I go about creating default data in this query for any time the Location/Submission relation has a null value?
Edit: Dummy Data
QuestionNumberVar | Section | Subsection | Question | AnsYes | AnsNo | NA (expected)
1-1.1 | Math | Algebra | Did you do your homework? | 10 | 1 | 1 (4)
1-1.2 | Math | Algebra | Did your dog eat it? | 9 | 3 | 0 (3)
2-1.1 | English | Greek | Did you do your homework? | 8 | 0 | 4 (7)
I have tried making left joins at various applicable portions of the code, to no avail: none of them had any effect on the output. This query feeds into the dataset for an SSRS report. There are a couple of workarounds for this particular section, such as an expression that takes the total number of Locations and subtracts AnsYes and AnsNo to get the true NA value, but as explained above that doesn't help with my next query.
Edit: SQL Server 2012 for those who asked
Edit: my attempt at an isnull() on the missing data returns nothing, I suspect because the query already eliminates the "null/missing" data. Left joining while doing this has also failed. The point of failure is on Submission: if we bind it to Location, there are locations missing, but if we don't bind it, there are multiplied duplicates, because Department has a one-to-many relationship with Location and not vice versa. I am unable to make any schema changes to improve this process.
There is a previous report that I am trying to emulate/update. It used C# logic to process data and ran multiple queries to obtain the same data. I don't have this luxury (the previous report exports directly to Excel instead of SSRS). Here is the previous logic used:
select PK from Department where Name = 'InsertParameter';
select PK from Submission where LocationFK = 'Location.PK_var' and MonthTested = '1/1/2017'
Then it runs those through a loop where it processes nulls into NA using C# logic.
EDIT (Mediocre Solution): I ended up doing the workaround of making a calculated field that subtracts Yes and No from the total number of Locations in that Dept. This is a mediocre solution because I didn't solve my original problem and made three datasets for what should have been displayed as a single dataset: one for question info, one for each location's answer, and one for locations that didn't participate. If a true answer comes up I will check its validity, but for now, problem pseudo-solved.
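For anyone landing here later, one commonly used pattern for "default the missing rows" is sketched below with Python's built-in sqlite3 module. It is not the approach used in the workaround above, and it has not been run against the real schema. The tables and columns are simplified, hypothetical versions of the ones in the question (no Department, Section or Subsection), and it assumes at most one submission per location per month and one answer per question per submission. The idea is to build every question/location pair first with a CROSS JOIN, then LEFT JOIN the submissions and answers with the month test in the ON clause, and finally treat a missing answer as NA (0) with COALESCE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE location   (pk INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE question   (pk INTEGER PRIMARY KEY, text TEXT);
CREATE TABLE submission (pk INTEGER PRIMARY KEY, location_fk INTEGER, month_tested TEXT);
CREATE TABLE answer     (question_fk INTEGER, submission_fk INTEGER, answer INTEGER);

INSERT INTO location VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO question VALUES (1, 'Did you do your homework?'), (2, 'Did your dog eat it?');
-- Locations A and B tested in January 2017; C only tested in an earlier month.
INSERT INTO submission VALUES (1, 1, '2017-01-01'), (2, 2, '2017-01-01'), (3, 3, '2016-12-01');
-- answer codes as in the question: 0 = NA, 1 = No, 2 = Yes
INSERT INTO answer VALUES (1, 1, 2), (2, 1, 1),
                          (1, 2, 2), (2, 2, 0),
                          (1, 3, 1), (2, 3, 1);
""")

rows = conn.execute("""
SELECT q.text,
       SUM(CASE WHEN a.answer = 2 THEN 1 ELSE 0 END)              AS ans_yes,
       SUM(CASE WHEN a.answer = 1 THEN 1 ELSE 0 END)              AS ans_no,
       SUM(CASE WHEN COALESCE(a.answer, 0) = 0 THEN 1 ELSE 0 END) AS na
FROM question AS q
CROSS JOIN location AS l                  -- one row per question per location
LEFT JOIN submission AS s
       ON s.location_fk = l.pk
      AND s.month_tested = :month         -- filter in ON, so untested locations survive
LEFT JOIN answer AS a
       ON a.submission_fk = s.pk
      AND a.question_fk = q.pk
GROUP BY q.pk, q.text
ORDER BY q.pk
""", {"month": "2017-01-01"}).fetchall()

for row in rows:
    print(row)
```

Dropping the GROUP BY and selecting l.name with COALESCE(a.answer, 0) instead gives the per-location view from the same join, which is the second query described above.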

Most recent transaction date against a Works Order?

Apologies in advance for what will probably be a very stupid question but I've been using Google to teach myself SQL after making the move from years of using Crystal Reports.
We have Works Orders which can have numerous transactions against them. I want to find the most recent one and have it returned against the Works Order number (which is a unique ID). I attempted to use MAX, but that just returns whatever the Transaction Date for that record is.
I think my struggles may be caused by a lack of understanding of grouping in SQL. In Crystal it was just 'choose what to group by' but for some reason in SQL I seem to be forced to group by all selected fields.
My ultimate goal is to be able to compare the planned end date of the Works Order ("we need to finish this job by then") vs when the last transaction was booked against the Works Order, so that I can create an OTIF KPI.
I've attached an image of what I'm currently seeing in SQL Server 2014 Management Studio and below is my attempt at the query.
SELECT wip.WO.WO_No
, wip.WO.WO_Type
, stock.Stock_Trans_Log.Part_No
, stock.Stock_Trans_Types.Description
, stock.Stock_Trans_Log.Qty_Change
, stock.Stock_Trans_Log.Trans_Date
, wip.WO.End_Date
, wip.WO.Qty - wip.WO.Qty_Stored AS 'Qty remaining'
, MAX(stock.Stock_Trans_Log.Trans_Date) AS 'Last Production Receipt'
FROM stock.Stock_Trans_Log
INNER JOIN production.Part
ON stock.Stock_Trans_Log.Part_No = production.Part.Part_No
INNER JOIN wip.WO
ON stock.Stock_Trans_Log.WO_No = wip.WO.WO_No
INNER JOIN stock.Stock_Trans_Types
ON stock.Stock_Trans_Log.Tran_Type = stock.Stock_Trans_Types.Type
WHERE (stock.Stock_Trans_Types.Type = 10)
AND (stock.Stock_Trans_Log.Store_Code <> 'BI')
GROUP BY wip.WO.WO_No
, wip.WO.WO_Type
, stock.Stock_Trans_Log.Part_No
, stock.Stock_Trans_Types.Description
, stock.Stock_Trans_Log.Qty_Change
, stock.Stock_Trans_Log.Trans_Date
, wip.WO.End_Date
, wip.WO.Qty - wip.WO.Qty_Stored
HAVING (stock.Stock_Trans_Log.Part_No BETWEEN N'2Z' AND N'9A')
[Query + results]

If my paraphrase is correct, you could use something along the following lines...
WITH
  sequenced_filtered_stock_trans_log AS
(
  SELECT
    *,
    ROW_NUMBER() OVER (PARTITION BY WO_No
                           ORDER BY Trans_Date DESC) AS reversed_sequence_id
  FROM
    stock.Stock_Trans_Log
  WHERE
        Tran_Type = 10   -- the log table's column is Tran_Type; Type belongs to Stock_Trans_Types
    AND Store_Code <> 'BI'
    AND Part_No BETWEEN N'2Z' AND N'9A'
)
SELECT
  <stuff>
FROM
  sequenced_filtered_stock_trans_log AS stock_trans_log
INNER JOIN
  <your joins>
WHERE
  stock_trans_log.reversed_sequence_id = 1
First, this will apply the WHERE clause to filter the log table.
After the WHERE clause is applied, a sequence id is calculated, restarting from one for each partition (each WO_No) and starting from the highest Trans_Date.
Finally, that can be used in your outer query with a WHERE clause that specifies that you only want the records with sequence id one; this is the most recent row per WO_No. The rest of the joins on to that table would proceed as normal.
If there is any other filtering that should be done (through joins or any other means) that should all be done before the application of the ROW_NUMBER().
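A small runnable sketch of the same pattern, using Python's built-in sqlite3 module (window functions need SQLite 3.25 or later). The single stock_trans_log table and its rows are invented for illustration and only mimic the columns used in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stock_trans_log (
  wo_no TEXT, tran_type INTEGER, store_code TEXT, trans_date TEXT
);
INSERT INTO stock_trans_log VALUES
  ('WO1', 10, 'MAIN', '2017-01-01'),
  ('WO1', 10, 'MAIN', '2017-03-01'),
  ('WO1', 20, 'MAIN', '2017-04-01'),   -- wrong transaction type, filtered out
  ('WO2', 10, 'MAIN', '2017-02-15'),
  ('WO2', 10, 'BI',   '2017-02-20');   -- excluded store, filtered out
""")

latest = conn.execute("""
WITH sequenced AS (
  SELECT wo_no,
         trans_date,
         -- number rows per works order, newest first, after the WHERE filter
         ROW_NUMBER() OVER (PARTITION BY wo_no
                            ORDER BY trans_date DESC) AS reversed_sequence_id
  FROM stock_trans_log
  WHERE tran_type = 10
    AND store_code <> 'BI'
)
SELECT wo_no, trans_date
FROM sequenced
WHERE reversed_sequence_id = 1
ORDER BY wo_no
""").fetchall()

print(latest)  # [('WO1', '2017-03-01'), ('WO2', '2017-02-15')]
```

Note that WO1's newest row overall (2017-04-01) is not returned, because the filter runs before the rows are numbered.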

Oracle Subquery - How to?

I have 2 tables that look like:
GL_ROLLUP
Responsibility Responsibility_L3
15500 15B
15445 15C
15515 15B
15494 15C
15600 15D
GL_Detail
Responsibility Expense Amount
15500 6501 30.51
15445 6508 75.60
15515 6535 45.68
15494 6508 65.50
15600 6505 84.39
My query right now is pulling the data from the GL_Detail table; however, what I want to be able to do is use the search parameter Responsibility_L3 found in GL_ROLLUP, so I can query all Responsibilities within this rolled-up value (instead of doing a range or a " = '15500' or....."). I tried doing this a few times, but my query seems to get stuck and never brings data back.
Any help would be greatly appreciated!

SELECT d.Responsibility, r.Responsibility_L3, d.Expense, d.Amount
FROM GL_Detail d LEFT JOIN GL_ROLLUP r ON r.Responsibility = d.Responsibility
ORDER BY Responsibility_L3
Does that get you what you are looking for?

The required query is a simple one:
select d.responsibility
, d.expense
, d.amount
from GL_ROLLUP r
inner join GL_Detail d
on d.responsibility = r.responsibility
where r.responsibility_l3 = :l3_search_param -- use appropriate notation
"my query seems to get stuck and never brings data back"
If the above query suffers the same fate, that's a performance issue. You will need to post an explain plan and details of data volumes, indexes, etc.
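As a quick check of the join against the sample rows from the question, here is a sketch using Python's built-in sqlite3 module in place of Oracle. The column types are assumptions, and '15B' is just an example value for the bind parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE GL_ROLLUP (responsibility TEXT, responsibility_l3 TEXT);
CREATE TABLE GL_Detail (responsibility TEXT, expense TEXT, amount REAL);

INSERT INTO GL_ROLLUP VALUES
  ('15500', '15B'), ('15445', '15C'), ('15515', '15B'),
  ('15494', '15C'), ('15600', '15D');
INSERT INTO GL_Detail VALUES
  ('15500', '6501', 30.51), ('15445', '6508', 75.60), ('15515', '6535', 45.68),
  ('15494', '6508', 65.50), ('15600', '6505', 84.39);
""")

rows = conn.execute("""
SELECT d.responsibility, d.expense, d.amount
FROM GL_ROLLUP AS r
INNER JOIN GL_Detail AS d
        ON d.responsibility = r.responsibility
WHERE r.responsibility_l3 = :l3_search_param
ORDER BY d.responsibility
""", {"l3_search_param": "15B"}).fetchall()

print(rows)  # [('15500', '6501', 30.51), ('15515', '6535', 45.68)]
```

Both responsibilities that roll up to 15B come back, without listing them individually.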

Type conversion failure in update query

I'm fairly new to Access so this is driving me a little crazy.
I'm creating an inventory database and want to count the number of items in stock to update an ordering form. Received items are assigned an order code, and I want to count the number of instances of each order code found within the master table. I have a make table query which does this just fine:
SELECT PrimerList.PrimerName
, First(Primer_Master.FR) AS FR
, Primer_Master.OrderCode
, Count(Primer_Master.OrderCode) AS InStock
INTO PrimerOrder
FROM PrimerList
LEFT JOIN Primer_Master ON PrimerList.ID = Primer_Master.PrimerName
GROUP BY PrimerList.PrimerName
, Primer_Master.OrderCode
, Primer_Master.PrimerName
, Primer_Master.FR
, Primer_Master.Finished
HAVING ((([Primer_Master]![Finished])=No));
I want to use PrimerOrder to update an order list table PrimerOrderList which has all of the different possible order codes, updating the InStock value for records with matching OrderCode:
UPDATE PrimerOrderList
SET PrimerOrderList.InStock = PrimerOrder.InStock
WHERE (((PrimerOrderList.OrderCode)=[PrimerOrder].[OrderCode]));
However, when I try to run it, I get parameter boxes that pop up asking for PrimerOrder.OrderCode and PrimerOrderList.OrderCode. Even if I put in a valid value for each, I get a type conversion failure. I've checked the data types for both tables and don't see how there could be a type conversion failure: both are set to text.
Any insight would be greatly appreciated! Thanks in advance!

You haven't included the PrimerOrder table in your query. It should be:
UPDATE PrimerOrderList INNER JOIN PrimerOrder
ON PrimerOrderList.OrderCode = PrimerOrder.OrderCode
SET PrimerOrderList.InStock = PrimerOrder.InStock;
