Report Builder 3.0 - How can I run this report with a large data set? - sql

I am new to developing reports with large amounts of data, so I am looking for some advice with a problem I am having.
I am developing a SSRS report with 6 parameters. Each Parameter has its' own data set that specifies a distinct list of values for the parameters.
The user will be able to choose as many values as they want for each parameter, except for 1 (the date).
The query looks something like;
SELECT [CURR].[PERIOD]
,[CURR].[PERIOD_MTD]
,[CURR].[CATEGORY1]
,[CURR].[CATEGORY2]
,[CURR].[CATEGORY3]
,[CURR].[CATEGORY4]
,[CURR].[CATEGORY5]
,[CURR].[CALCULATION1_ITD]
,[CURR].[CALCULATION2_ITD]
,[CURR].[CALCULATION3_ITD]
,[CURR].[CALCULATION4_ITD]
,[CURR].[CALCULATION1_MTD]
,[CURR].[CALCULATION2_MTD]
,[CURR].[CALCULATION3_MTD]
,[CURR].[CALCULATION4_MTD]
FROM [BIG_TABLE] [CURR] LEFT OUTER JOIN
[BIG_TABLE] [PREV M]
ON [CURR].[PERIOD_MTD] = [PREV M].[PERIOD]
AND [CURR].[CATEGORY1] = [PREV M].[CATEGORY1]
AND [CURR].[CATEGORY2] = [PREV M].[CATEGORY2]
AND [CURR].[CATEGORY3] = [PREV M].[CATEGORY3]
AND [CURR].[CATEGORY4] = [PREV M].[CATEGORY4]
AND [CURR].[CATEGORY5] = [PREV M].[CATEGORY5]
WHERE [CURR].[PERIOD] = #YYYYMM
AND [CURR].[CATEGORY1] IN (#PARAMETER1)
AND [CURR].[CATEGORY2] IN (#PARAMETER2)
AND [CURR].[CATEGORY3] IN (#PARAMETER3)
AND [CURR].[CATEGORY4] IN (#PARAMETER4)
AND [CURR].[CATEGORY5] IN (#PARAMETER5)
This is the problem I am having;
1 parameter has a distinct list of over 5,500 values that the user can choose from. When all of the values are selected, I notice that the parameter field does not populate like the others (Image below).
When the report runs, I get the following error:
This message is pretty ambiguous, but I isolated it to the fact that the report will run with fewer values in this parameter, but not all of them.
I am not sure what else to try. I am thinking this may be a matter of getting a large volume a data through the main data set.
Extra information:
Datasource is accessed with a shared connection to SharePoint
The table that is queried for this report has no indexes. I wonder if this is important because the table has roughly 26.5mill rows.

I would suggest your approach is wrong.
If you are presenting a user with a list of 5000+ items to choose from I'm guessing they would either choose small number of them or want to select all of them, it would be unlikely that they would sit there and choose 100 items from a list.
if this is the case then I would suggest appending an "ALL" option to the list (UNION to your original list) and then amend the query something like this...
WHERE [CURR].[PERIOD] = #YYYYMM
AND (#PARAMETER1 = 'ALL' OR [CURR].[CATEGORY1] IN (#PARAMETER1))
...
...
You might want to also read these previous SO questions and MS Article.
"IN" clause limitation in Sql Server
https://learn.microsoft.com/en-us/sql/t-sql/language-elements/in-transact-sql?view=sql-server-ver15

Related

Calculated Field in subreport

TL:DR I want a whole column of SQL equivalent of excel's COUNTIFS() Function.
I'm still quite new to MS Access, but I'm quite proficient with Excel. I've inherited an Access Database that tracks reasons for delays in a large logistics group, and am trying to build a report showing which reasons come up most.
So, I can output an SQL report showing all the reasons (they're in a table called 'Reasons', so that bit's easy). What I want is a calculated field next to that column, showing how many times each reason has been cited on the [Master Data] Table (field called 'Lateness Reason') in a given date range. And for double extra bonus points, the percentage they come up would be extremely handy.
I've looked online and found COUNTIFS equivalents for a single set of criteria, but in this case I want it calculating on each row of a report. I've also tried a few things myself, but the closest I could figure was:
SELECT Reasons.Reason, Count([Master Data].[ID]) AS Num
FROM Reasons INNER JOIN [Master Data] ON Reasons.[ID] = [Master Data].[Lateness Reason]
WHERE ((([Master Data].[Lateness Reason])=[Reasons].[Reason]));
which has incorrect syntax and possibly a few other problems (that WHERE clause has to apply equally to all lines, doesn't it?). My only other option might be to do a separate calculation for each line and 'union' them together, but that is likely to cause other problems in the future if more reasons get added (and there are quite a lot already).
Firstly, Is it Possible?
Secondly, If so, How??
Many Thanks in advance
EDIT: In response to comments, table structure is as follows;
Table "Reasons" has two columns; "Reason" and "ID"
Table "Master Data" has many columns, the ones I'd be bothered about are "Date", "ID" and "Lateness Reason" (Lateness reason is equivalent to an ID from table 'Reasons')
Table Reasons
ID Reason
___________________
1 | Stock Shortage
2 | Courier Problems
etc | etc
Master Data
ID Date Reason
__________________
1 | 01/01/1980 | 2
2 | 03/05/2020 | 2
etc
This is how I think your query should look like based on the information you provided.
SELECT Reasons.Reason, Count([Master Data].ID) AS Num
FROM Reasons INNER JOIN [Master Data] ON Reasons.ID = [Master Data].[Lateness Reason]
WHERE ((([Master Data].Date) Between [BeginDate] And [EndDate]))
GROUP BY Reasons.Reason;
Replace [BeginDate] and [EndDate] if you're using form control references.
As for the percentages based on the total, you can use text boxes as #June7 has suggested:
Use a total count in the header or footer of the report =Count(*) and then in the detail add a text box with the following control source: =[Num]/Count(*), where Num is the name of the textbox in your report which holds the counted values per reason. In this case, the control source for Num will be Num based on the query given.
Just a side note, naming your field Date is not advised since it's a reserved keyword in MS Access. It can cause unintended issues along the road.

How to query only old and duplicate data from a database in SQL

I'm trying to query my database to pull only duplicate/old data to write to a scratch section in excel (Using a macro passing SQL to the DB).
For now, I'm currently testing in Access alone to only filter out the old data.
First, I'm trying to filter my database by a specifed WorkOrder, RunNumber, and Row.
The code below only filters by Work Order, RunNumber, and Row. ...but SQL doesn't like when I tack on a 2nd AND statement; so this currently isn't working.
SELECT *
FROM DataPoints
WHERE (((DataPoints.[WorkOrder])=[WO2]) AND ((DataPoints.[RunNumber])=6) AND ((DataPoints.[Row]=1)
Once I figure that portion out....
Then if there is only 1 entry with specified WorkOrder, RunNumber, and Row, then I want filter it out. (its not needed in the scratch section, because its data is already written to the main section of my report)
If there are 2 or more entries with said criteria(WO, RN, and Row), then I want to filter out the newest entry based on RunDate and RunTime, and only keep all older entries.
For instance, in the clip below. The only item remaining in my filtered query will be the top entry with the timestamp 11:47:00AM.
.
Are there any recommended commands to complete this problem? Any ideas are helpful. Thank you.
I would suggest something along the lines of the following:
select t.*
from datapoints t
where
t.workorder = [WO2] and
t.runnumber = 6 and
t.row = 1 and
exists
(
select 1
from datapoints u
where
u.workorder = t.workorder and
u.runnumber = t.runnumber and
u.row = t.row and
(u.rundate > t.rundate or (u.rundate = t.rundate and u.runtime > t.runtime))
)
Here, if the correlated subquery within the where clause finds a record with the same workorder, runnumber and row, but with either a later rundate or the same rundate and a later runtime, then the record is returned by the main query.
You need two more )'s at the end of your code snippet. Or you can delete the parentheses completely in this example, MS Access will ad them back in as it deems necessary.
M.S. Access SQL can be tricky as it is not standards compliant and either doesn't allow for super complex queries, or it needs an ugly work around, like having a parentheses nesting nightmare when trying to join more than two tables.
For these reasons, I suggest using multiple Access queries to produce your results.

Query with Totals Query results as criteria returns the expected number of results squared

Background
I am building an Access 2010 database that has a table [ControllerAdjustments] that keeps track of all adjustments made to controllers with an [AdjustmentID] autonumber field, a [ControllerID] field, an [AdjustmentDate] field, [Setpoint] field, and a [Power] field. The [Power] field represents the power level when the adjustment was made. Ultimately I need two queries to return two sets of results, one query should return the current status of all controllers (basically the most recent adjustment made on each controller) and the other should return the most recent adjustment made on each controller where power level is 100%. I plan to use each of these queries to feed a report. Note: field names changed slightly for convenience when typing, full names given in the code blocks...
Method
I focused on the Current Query first, and figured I would just copy it and make necessary changes to create a 100% Query. I started with a totals query on the [ControllerAdjustments] table, that had [ControllerID] as a Group By field and [AdjustmentDate] as a field that returned the Max value. This query returns exactly the number of records I expected, and after reviewing the sample bogus data I put in the table to check it, it seems to return exactly the records I need. I then created a Select Query that returned all the fields I want in my Current Report, namely the [ControllerAdjustments] table and the related records in upstream related tables. I then set the criteria for the [ControllerID] field in my Select Query to equal [Total_CurrentContAdjs]![ControllerID] and the [AdjustmentDate] in the Select Query to [Total_CurrentContAdjs]![MaxOfAdjustmentDate]. Running this query returns exactly what I want. The SQL for this query is below:
SELECT List_Units.UnitID, List_EDTanks.TankNameShort, List_Controllers.ControllerType, ControllerAdjustments.AdjustmentDate, ControllerAdjustments.ControllerSetpoint, ControllerAdjustments.RxPower
FROM Total_ContAdjsCurrent, ((List_Units INNER JOIN List_EDTanks ON List_Units.UnitID = List_EDTanks.UnitID) INNER JOIN List_Controllers ON List_EDTanks.EDTankID = List_Controllers.EDTankID) INNER JOIN ControllerAdjustments ON List_Controllers.ControllerID = ControllerAdjustments.ControllerID
WHERE (((ControllerAdjustments.AdjustmentDate)=[Total_ContAdjsCurrent]![MaxOfAdjustmentDate]) AND ((ControllerAdjustments.ControllerID)=[Total_ContAdjsCurrent]![ControllerID]))
ORDER BY List_Units.Unit, List_EDTanks.TankSortOrder, List_Controllers.ControllerType DESC;
I then copied the Totals query and added a column for Power, selected Where, unchecked show, and put in 100 for criteria. This works as expected. I then copied my select query, and changed the criteria fields to direct to my new 100% Totals query. This is where my problems begin.
Problem
The second 100% Query does not seem to like the criteria, as it initially throws out the familiar parameter window. This is the SQL Statement for the second query, virtually the same except for referring to the 100% Totals query:
SELECT List_Units.UnitID, List_EDTanks.TankNameShort, List_Controllers.ControllerType, ControllerAdjustments.AdjustmentDate, ControllerAdjustments.ControllerSetpoint, ControllerAdjustments.RxPower
FROM Total_ContAdjsCurrent, Total_ContAdjsStdyState, ((List_Units INNER JOIN List_EDTanks ON List_Units.UnitID = List_EDTanks.UnitID) INNER JOIN List_Controllers ON List_EDTanks.EDTankID = List_Controllers.EDTankID) INNER JOIN ControllerAdjustments ON List_Controllers.ControllerID = ControllerAdjustments.ControllerID
WHERE (((ControllerAdjustments.AdjustmentDate)=[Total_ContAdjsStdyState]![MaxOfAdjustmentDate]) AND ((ControllerAdjustments.ControllerID)=[Total_ContAdjsStdyState]![ControllerID]))
ORDER BY List_Units.Unit, List_EDTanks.TankSortOrder, List_Controllers.ControllerType DESC;
Initially, Access did not add my Totals query into the show table box in design view, because its results were not directly used in the Select Query. So, I added the Totals query to the top, and that allowed my query to run without asking for parameters, but now it returns the number of results I was expecting squared. Basically if I am expecting 3 records: 1, 2, and 3, it is giving me: 1, 1, 1, 2, 2, 2, 3, 3, and 3. For the life of me I cannot figure out why it is doing this, especially because the exact same setup for my Current Query returns exactly what is expected... I thought maybe the where clause in my totals query had something to do with it, so I created a Select Query for the [ControllerAdjustments] table that returned all records with 100 for power. I then used this query for my totals query instead of the totals query itself, but this did not do anything different. I am at a loss, and not sure what else I can do to get the results I want. Any suggestions welcome, thank you!
I solved this by simply starting over from scratch and rebuilding my 100% query. Reviewing the SQL statements, they look identical, however for some reason my query now returns the right number of records. I have no idea why this worked, and am still curious what went wrong in the first place if anyone with time available cares to dig into it, but the original problem statement has been corrected--even if I have no idea how I did it haha...

SQL add up rows in a column

I'm running SQL queries in Orion Report Writer for Solarwinds Netflow Traffic Analyzer and am trying to add up data usage for specific conversations coming from the same general sources. In this case it is netflix. I've made some progress with my query.
SELECT TOP 10000 FlowCorrelation_Source_FlowCorrelation.FullHostname AS Full_Hostname_A,
SUM(NetflowConversationSummary.TotalBytes) AS SUM_of_Bytes_Transferred,
SUM(NetflowConversationSummary.TotalBytes) AS Total_Bytes
FROM
((NetflowConversationSummary LEFT OUTER JOIN FlowCorrelation FlowCorrelation_Source_FlowCorrelation ON (NetflowConversationSummary.SourceIPSort = FlowCorrelation_Source_FlowCorrelation.IPAddressSort)) LEFT OUTER JOIN FlowCorrelation FlowCorrelation_Dest_FlowCorrelation ON (NetflowConversationSummary.DestIPSort = FlowCorrelation_Dest_FlowCorrelation.IPAddressSort)) INNER JOIN Nodes ON (NetflowConversationSummary.NodeID = Nodes.NodeID)
WHERE
( DateTime BETWEEN 41539 AND 41570 )
AND
(
(FlowCorrelation_Source_FlowCorrelation.FullHostname LIKE 'ipv4_1.lagg0%')
)
GROUP BY FlowCorrelation_Source_FlowCorrelation.FullHostname, FlowCorrelation_Dest_FlowCorrelation.FullHostname, Nodes.Caption, Nodes.NodeID, FlowCorrelation_Source_FlowCorrelation.IPAddress
So I've got an output that filters everything but netflix sessions (Full_Hostname_A) and their total usage for each session (Sum_Of_Bytes_Transferred)
I want to add up Sum_Of_Bytes_Transferred to get a total usage for all netflix sessions
listed, which will output to Total_Bytes. I created the column Total_Bytes, but don't know how to output a total to it.
For some asked clarification, here is the output from the above query:
I want the Total_Bytes Column to be all added up into one number.
I have no familiarity with the reporting tool you are using.
From reading your post I'm thinking you want the the first 2 columns of data that you've got, plus at a later point in the report, a single figure being the sum of the total_bytes column you're already producing.
Your reporting tool probably has some means of totalling a column, but you may need to get the support people for the reporting tool to tell you how to do that.
Aside from this, if you can find a way of calling a separate query in a latter section of the report, or if you embed a new report inside your existing report, after the detail section, and use that to run a separate query then you should be able to get the data you want with this:
SELECT Sum(Total_Bytes) as [Total Total Bytes]
FROM ( yourExistingQuery ) x
yourExistingQuery means the query you've already got, in full (doesnt have to be put on one line), the paretheses are required, and so is the "x". (The latter provides a syntax-required name for the virtual table which your query defines).
Hope this helps.

SQL MIN() returns multiple values?

I am using SQL server 2005, querying with Web Developer 2010, and the min function appears to be returning more than one value (for each ID returned, see below). Ideally I would like it to just return the one for each ID.
SELECT Production.WorksOrderOperations.WorksOrderNumber,
MIN(Production.WorksOrderOperations.OperationNumber) AS Expr1,
Production.Resources.ResourceCode,
Production.Resources.ResourceDescription,
Production.WorksOrderExcel_ExcelExport_View.PartNumber,
Production.WorksOrderOperations.PlannedQuantity,
Production.WorksOrderOperations.PlannedSetTime,
Production.WorksOrderOperations.PlannedRunTime
FROM Production.WorksOrderOperations
INNER JOIN Production.Resources
ON Production.WorksOrderOperations.ResourceID = Production.Resources.ResourceID
INNER JOIN Production.WorksOrderExcel_ExcelExport_View
ON Production.WorksOrderOperations.WorksOrderNumber = Production.WorksOrderExcel_ExcelExport_View.WorksOrderNumber
WHERE Production.WorksOrderOperations.WorksOrderNumber IN
( SELECT WorksOrderNumber
FROM Production.WorksOrderExcel_ExcelExport_View AS WorksOrderExcel_ExcelExport_View_1
WHERE (WorksOrderSuffixStatus = 'Proposed'))
AND Production.Resources.ResourceCode IN ('1303', '1604')
GROUP BY Production.WorksOrderOperations.WorksOrderNumber,
Production.Resources.ResourceCode,
Production.Resources.ResourceDescription,
Production.WorksOrderExcel_ExcelExport_View.PartNumber,
Production.WorksOrderOperations.PlannedQuantity,
Production.WorksOrderOperations.PlannedSetTime,
Production.WorksOrderOperations.PlannedRunTime
If you can get your head around it, I am selecting certain columns from multiple tables where the WorksOrderNumber is also contained within a subquery, and numerous other conditions.
Result set looks a little like this, have blurred out irrelevant data.
http://i.stack.imgur.com/5UFIp.png (Wouldn't let me embed image).
The highlighted rows are NOT supposed to be there, I cannot explicitly filter them out, as this result set will be updated daily and it is likely to happen with a different record.
I have tried casting and converting the OperationNumber to numerous other data types, varchar type returns '100' instead of the '30'. Also tried searching search engines, no one seems to have the same problem.
I did not structure the tables (they're horribly normalised), and it is not possible to restructure them.
Any ideas appreciated, many thanks.
The MIN function returns the minimum within the group.
If you want the minimum for each ID you need to get group on just ID.
I assume that by "ID" you are referring to Production.WorksOrderOperations.WorksOrderNumber.
You can add this as a "table" in your SQL:
(SELECT Production.WorksOrderOperations.WorksOrderNumber,
MIN(Production.WorksOrderOperations.OperationNumber)
FROM Production.WorksOrderOperations
GROUP BY Production.WorksOrderOperations.WorksOrderNumber)