SQL Query - combine 2 rows into 1 row - sql

I have the following query below (view) in SQL Server. The query produces a result set that is needed to populate a grid. However, a new requirement has come up where the users would like to see data on one row in our app. The tblTasks table can produce 1 or 2 rows. The issue becomes when they're is two rows that have the same job_number but different fldProjectContextId (1 or 31). I need to get the MechApprovalOut and ElecApprovalOut columns on one row instead of two.
I've tried restructuring the query using CTE and over partition and haven't been able to get the necessary results I need.
SELECT TOP (100) PERCENT
CAST(dbo.Job_Control.job_number AS int) AS Job_Number,
dbo.tblTasks.fldSalesOrder, dbo.tblTaskCategories.fldTaskCategoryName,
dbo.Job_Control.Dwg_Sent, dbo.Job_Control.Approval_done,
dbo.Job_Control.fldElecDwgSent, dbo.Job_Control.fldElecApprovalDone,
CASE WHEN DATEDIFF(day, dbo.Job_Control.Dwg_Sent, GETDATE()) > 14
AND dbo.Job_Control.Approval_done IS NULL
AND dbo.tblProjectContext.fldProjectContextID = 1
THEN 1 ELSE 0
END AS MechApprovalOut,
CASE WHEN DATEDIFF(day, dbo.Job_Control.fldElecDwgSent, GETDATE()) > 14
AND dbo.Job_Control.fldElecApprovalDone IS NULL
AND dbo.tblProjectContext.fldProjectContextID = 31
THEN 1 ELSE 0
END AS ElecApprovalOut,
dbo.tblProjectContext.fldProjectContextName,
dbo.tblProjectContext.fldProjectContextId, dbo.Job_Control.Drawing_Info,
dbo.Job_Control.fldElectricalAppDwg
FROM dbo.tblTaskCategories
INNER JOIN dbo.tblTasks
ON dbo.tblTaskCategories.fldTaskCategoryId = dbo.tblTasks.fldTaskCategoryId
INNER JOIN dbo.Job_Control
ON dbo.tblTasks.fldSalesOrder = dbo.Job_Control.job_number
INNER JOIN dbo.tblProjectContext
ON dbo.tblTaskCategories.fldProjectContextId = dbo.tblProjectContext.fldProjectContextId
WHERE (dbo.tblTaskCategories.fldTaskCategoryName = N'Approval'
OR dbo.tblTaskCategories.fldTaskCategoryName = N'Re-Approval')
AND (CASE WHEN DATEDIFF(day, dbo.Job_Control.Dwg_Sent, GETDATE()) > 14
AND dbo.Job_Control.Approval_done IS NULL
AND dbo.tblProjectContext.fldProjectContextID = 1
THEN 1 ELSE 0
END = 1)
OR (dbo.tblTaskCategories.fldTaskCategoryName = N'Approval'
OR dbo.tblTaskCategories.fldTaskCategoryName = N'Re-Approval')
AND (CASE WHEN DATEDIFF(day, dbo.Job_Control.fldElecDwgSent, GETDATE()) > 14
AND dbo.Job_Control.fldElecApprovalDone IS NULL
AND dbo.tblProjectContext.fldProjectContextID = 31
THEN 1 ELSE 0
END = 1)
ORDER BY dbo.Job_Control.job_number, dbo.tblTaskCategories.fldProjectContextId
The above query gives me the following result set:
I've created a work around via code (which I don't like but it works for now) where i've used code to populate a "temp" table the way i need it to display the data, that is, one record if duplicate job numbers to get the MechApprovalOut and ElecApprovalOut columns on one row (see first record in following screen shot).
Example:
With the desired result set and one row per job_number, this is how the form looks with the data and how I am using the result set.
Any help restructuring my query to combine duplicate rows with the same job number where MechApprovalOut and ElecApproval out columns are on one row is greatly appreciated! I'd much prefer to use a view on SQL then code in the app to populate a temp table.
Thanks,
Jimmy

What I would do is LEFT JOIN the main table to itself at the beginning of the query, matching on Job Number and Sales Order, such that the left side of the join is only looking at Approval task categories and the right side of the join is only looking at Re-Approval task categories. Then I would make extensive use of the COALESCE() function to select data from the correct side of the join for use later on and in the select clause. This may also be the piece you were missing to make a CTE work.
There is probably also a solution that uses a ranking/windowing function (maybe not RANK itself, but something that category) along with the PARTITION BY clause. However, as those are fairly new to Sql Server I haven't used them enough personally to be comfortable writing an example solution for you without direct access to the data to play with, and it would still take me a little more time to get right than I can devote to this right now. Maybe this paragraph will motivate someone else to do that work.

Related

Is hiding all rows of duplicated data possible?

I am building a report using a SQL Stored Parameter that I created. It pulls data that may have 1 line per unique number, or 2 lines per unique number. I want to only show rows where there is only 1 line.
I have tried changing visibility, but the closest I have come to is
=IIF(Fields!uniquenumber.Value = previous(Fields!uniquenumber.value,True,False)
but this hides 1 of the rows, and not both. I have also tried using CASE WHEN in the query to identify if there are more than 1 line, but I still haven't been able to hide when there is more than 1 line. My query is below (redacted quite a few lines of extra criteria that aren't relevant to my question here. There are quite a few instances where the first part of my WHERE statement will have data that also meets the criteria of the 2nd part.
SELECT
Reorders.LastRxNo,
Reorders_1.LastRxNo AS Reorders1LastRxNo,
Rxs_2.RxBatch AS Rxs2RxBatch,
Reorders.FacID,
KeyIdentifiers.GPI,
Reorders.LastFillDt,
Rxs_1.RxBatch as Rxs1RxBatch,
CASE
WHEN (Reorders.LastRxNo<>Reorders_1.LastRxNo) THEN 1
ELSE 0
END AS Duplicate
FROM Reorders
LEFT OUTER JOIN Rxs AS Rxs_1
ON Reorders.LastRXNo = Rxs_1.RxNo
RIGHT OUTER JOIN KeyIdentifiers
ON Reorders.NDC = KeyIdentifiers.NDC
INNER JOIN Patients
ON Reorders.FacID = Patients.FacID
AND Reorders.PatID = Patients.PatID
LEFT OUTER JOIN Reorders AS Reorders_1
ON Reorders.FacID = Reorders_1.FacID
AND Reorders.PatID = Reorders_1.PatID
AND KeyIdentifiers.NDC = Reorders_1.NDC
LEFT OUTER JOIN Rxs AS Rxs_2
ON Reorders_1.LastRxNo = Rxs_2.RxNo
WHERE
Reorders.ProfileOnly = 1
AND Rxs_1.RxBatch IS NULL
AND Reorders.CutOffDt IS NULL
AND Reorders.PackType LIKE 'PHDEF%'
AND Reorders.Auto = 1
AND Reorders.PhRxStatus IS NULL
AND Reorders.LastFillDt > '01/01/2019'
AND (Reorders.LastRxNo = Reorders_1.LastRxNo
OR Reorders.LastRxNo <> Reorders_1.LastRxNo AND Rxs_2.RxBatch IN ('CF','GONE'))
ORDER BY Reorders.FacID, Patients.PatLName, Patients.PatFName
Yields Results Similar to:
LastRxNo Reorders1LastRxNo Rxs2RxBatch Duplicate
111 111 null 0
222 222 null 0
222 444 CF 1
In the above example, I only need to see the lines like line 1. Since "LastRxNo" 222 has 2 lines, I do not want to see it on my final report. (However, it is queried, because when LastRxNo =222 AND Reorders1LastRxNo = 222 and Rxs2RxBatch IS NULL would pull if I didn't account for it in my WHERE statement, I think. I am happy doing this any way possible, whether it be in the query or in SSRS report.
Replace your CASE ... as duplicate with
COUNT(*) OVER(PARTITION BY LastRxNo) AS LastRxNoCount
then
wrap the whole query in another select like this...
SELECT * FROM
(
your original query here including the change stated above
) q
WHERE q.LastRxNoCount =1
Just filter out all rows where the number of rows is greater than 1:
SELECT LastRxNo, RxBatch
FROM Reorders
WHERE RxBatch IN ('CF','GONE')
AND Reorders.LastRxNo IN
(SELECT LastRxNo FROM Reorders GROUP BY LastRxNo HAVING COUNT(LastRxNo) = 1)
You should just be able to add the last two lines into your WHERE clause.
Alternatively, you could add it to your JOIN conditions:
SELECT
Reorders.LastRxNo
FROM Reorders
INNER JOIN (
SELECT LastRxNo FROM Reorders
GROUP BY LastRxNo
HAVING COUNT(LastRxNo) = 1
) AS UniqueRxNo ON UniqueRxNo.LastRxNo = Reorders.LastRxNo
Yes, you can do it one of 2 ways:
is in your data source, do a distinct select.
Set the "Hide Duplicates" property on the detail line in the report properties

Count of how many times id occurs in table SQL regexp

Hi I have a redshift table of articles that has a field on it that can contain many accounts. So there is a one to many relationship between articles to accounts.
However I want to create a new view where it lists the partner id's in one column and in another column a count of how many times the partner id appears in the articles table.
I've attempted to do this using regex and created a new redshift view, but am getting weird results where it doesn't always build properly. So one day it will say a partner appears 15 times, then the next 17, then the next 15, when the partner id count hasn't actually changed.
Any help would be greatly appreciated.
SELECT partner_id,
COUNT(DISTINCT id)
FROM (SELECT id,
partner_ids,
SPLIT_PART(partner_ids,',',i) partner_id
FROM positron_articles a
LEFT JOIN util.seq_0_to_500 s
ON s.i < regexp_count (partner_ids,',') + 2
OR s.i = 1
WHERE i > 0
AND regexp_count (partner_ids,',') = 0
ORDER BY id)
GROUP BY 1;
Let's start with some of the more obvious things and see if we can start to glean other information.
Next GROUP BY 1 on your outer query needs to be GROUP BY partner_id.
Next you don't need an order by in your INNER query and the database engine will probably do a better job optimizing performance without it so remove ORDER BY id.
If you want your final results to be ordered then add an ORDER BY partner_id or similar clause after your group by of your OUTER query.
It looks like there are also problems with how you are splitting a partnerid from partnerids but I am not positive about that because I need to understand your view and the data it provides to know how that affects your record count for partnerid.
Next your LEFT JOIN statement on the util.seq_0_to_500 I am pretty sure you can drop off the s.i = 1 as the first condition will satisfy that as well because 2 is greater than 1. However your left join really acts more like an inner join because you then exclude any non matches from positron_articles that don't have a s.i > 0.
Oddly then your entire join and inner query gets kind of discarded because you only want articles that have no commas in their partnerids: regexp_count (partner_ids,',') = 0
I would suggest posting the code for your util.seq_0_to_500 and if you have a partner table let use know about that as well because you can probably get your answer a lot easier with that additional table depending on how regexp_count works. I suspect regex_count(partnerids,partnerid) exampleregex_count('12345,678',1234) will return greater than 0 at which point you have no choice but to split the delimited strings into another table before counting or building a new matching function.
If regex_count only matches exact between commas and you have a partner table your query could be as easy as this:
SELECT
p.partner_id
,COUNT(a.id) AS ArticlesAppearedIn
FROM
positron_articles a
LEFT JOIN PARTNERTABLE p
ON regexp_count(a.partnerids,p.partnerid) > 0
GROUP BY
p.partner_id
I will actually correct myself as I just thought of a way to join a partner table without regexp_count. So if you have a partner table this might work for you. If not you will need to split strings. It basically tests to see if the partnerid is the entire partnerids, at the beginning, in the middle, or at the end of partnerids. If one of those is met then the records is returned.
SELECT
p.partner_id
,COUNT(a.id) AS ArticlesAppearedIn
FROM
PARTNERTABLE p
INNER JOIN positron_articles a
ON
(
CASE
WHEN a.partnerids = CAST(p.partnerid AS VARCHAR(100)) THEN 1
WHEN a.partnerids LIKE p.partnerid + ',%' THEN 1
WHEN a.partnerids LIKE '%,' + p.partnerid + ',%' THEN 1
WHEN a.partnerids LIKE '%,' + p.partnerid THEN 1
ELSE 0
END
) = 1
GROUP BY
p.partner_id

Using Case Statement to replace value in one field based on value in a seperate field

I'm creating a report in SSRS and I can't use the IIF statement within the report to get the following done. I'm getting aggregate errors when I try to sum within the SSRS report.
`IIF(Fields!Period=0,0,IIF(Period=13,0,Balance/12))`
Works fine up until the moment I try to Sum.. get a silly aggregate error "Aggregate functions other than First, Last, Previous, Count, and Count Distinct can only aggregate data of a single data type"... These are all integers.
Basically I have a value in Master.Balance that I need to divide by 12 only when Secondary.Period equals 0 or 13. If Secondary.Period equals 0 or 13 then the value should be 0. I know my problem has to do with including the relationship between the tables, but I just don't know how to write that in.
Here is what I'm trying to use:
`CASE
WHEN Secondary.Period=0 OR Secondary.Period=13
THEN 0
ELSE Master.Balance/12
End As BudByPer`
Here is how the two tables are related to each other:
`FROM Master LEFT OUTER JOIN Secondary
ON Master.Project = Secondary.Project
AND Master.object = Secondary.object
AND Master.org = Secondary.org
`
How do I get the above into this:
SELECT DISTINCT Master.Project, Master.Object, Master.Fund, Master.Segment, Master.Balance, Secondary.project, Secondary.object, Secondary.org, Secondary.Period, Secondary.object, Secondary.Project.
FROM Master LEFT OUTER JOIN Secondary
ON Master.Project = Secondary.Project
AND Master.object = Secondary.object
AND Master.org = Secondary.org
WHERE (Master.object>=600000)
ORDER BY [Master.Fund]
You just need a select, it looks fine to me...
SELECT
Master.account,
Master.segment,
Secondary.desc,
Secondary.bud,
Segment.Num,
Segment.office,
CASE
WHEN Secondary.Period=0 OR Secondary.Period=13 THEN 0
ELSE Master.Balance/12
End As BudByPer
FROM Master
LEFT JOIN Secondary
ON Master.Project = Secondary.Project
AND Master.object = Secondary.object
AND Master.org = Secondary.org

SQL SUM function doubling the amount it should using multiple tables

My query below is doubling the amount on the last record it returns. I have 3 tables - activities, bookings and tempbookings. The query needs to list the activities and attached information and pull the total number (using the SUM) of places booked (as BookingTotal) from the booking table by each activity and then it needs to calculate the same for tempbookings (as tempPlacesReserved) providing the reservedate field inside that table is in the future.
However the first issue is that if there are no records for an activity in the tempbookings table it does not return any records for that activity at all, to get around this i created dummy records in the past so that it still returns the record, but if I can make it so I don't have to do this I would prefer it!
The main issue I have is that on the final record of the returned results it doubles the booking total and the places reserved which of course makes the whole query useless.
I know that I am doing something wrong I just haven't been able to sort it, I have searched similar issues online but am unable to apply them to my situation correctly.
Any help would be appreciated.
P.S. I'm aware that normally you wouldn't need to fully label all the paths to the databases, tables and fields as I have but for the program I am planning to use it in I have to do it this way.
Code:
SELECT [LeisureActivities].[dbo].[activities].[activityID],
[LeisureActivities].[dbo].[activities].[activityName],
[LeisureActivities].[dbo].[activities].[activityDate],
[LeisureActivities].[dbo].[activities].[activityPlaces],
[LeisureActivities].[dbo].[activities].[activityPrice],
SUM([LeisureActivities].[dbo].[bookings].[bookingPlaces]) AS 'bookingTotal',
SUM (CASE WHEN[LeisureActivities].[dbo].[tempbookings].[tempReserveDate] > GetDate() THEN [LeisureActivities].[dbo].[tempbookings].[tempPlaces] ELSE 0 end) AS 'tempPlacesReserved'
FROM [LeisureActivities].[dbo].[activities],
[LeisureActivities].[dbo].[bookings],
[LeisureActivities].[dbo].[tempbookings]
WHERE ([LeisureActivities].[dbo].[activities].[activityID]=[LeisureActivities].[dbo].[bookings].[activityID]
AND [LeisureActivities].[dbo].[activities].[activityID]=[LeisureActivities].[dbo].[tempbookings].[tempActivityID])
AND [LeisureActivities].[dbo].[activities].[activityDate] > GetDate ()
GROUP BY [LeisureActivities].[dbo].[activities].[activityID],
[LeisureActivities].[dbo].[activities].[activityName],
[LeisureActivities].[dbo].[activities].[activityDate],
[LeisureActivities].[dbo].[activities].[activityPlaces],
[LeisureActivities].[dbo].[activities].[activityPrice];
Your current query is using an INNER JOIN between each of the tables so if the tempBookings table has no records, you will not return anything.
I would advise that you start to use JOIN syntax. You might also need to use subqueries to get the totals.
SELECT a.[activityID],
a.[activityName],
a.[activityDate],
a.[activityPlaces],
a.[activityPrice],
coalesce(b.bookingTotal, 0) bookingTotal,
coalesce(t.tempPlacesReserved, 0) tempPlacesReserved
FROM [LeisureActivities].[dbo].[activities] a
LEFT JOIN
(
select activityID,
SUM([bookingPlaces]) AS bookingTotal
from [LeisureActivities].[dbo].[bookings]
group by activityID
) b
ON a.[activityID]=b.[activityID]
LEFT JOIN
(
select tempActivityID,
SUM(CASE WHEN [tempReserveDate] > GetDate() THEN [tempPlaces] ELSE 0 end) AS tempPlacesReserved
from [LeisureActivities].[dbo].[tempbookings]
group by tempActivityID
) t
ON a.[activityID]=t.[tempActivityID]
WHERE a.[activityDate] > GetDate();
Note: I am using aliases because it is easier to read
Use new SQL-92 Join syntax, and make join to tempBookings an outer join. Also clean up your sql with table aliases. Makes it easier to read. As to why last row has doubled values, I don't know, but on off chance that it is caused by extra dummy records you entered. get rid of them. That problem is fixed by using outer join to tempBookings. The other possibility is that the join conditions you had to the tempBookings table(t.tempActivityID = a.activityID) is insufficient to guarantee that it will match to only one record in activities table... If, for example, it matches to two records in activities, then the rows from Tempbookings would be repeated twice in the output, (causing the sum to be doubled)
SELECT a.activityID, a.activityName, a.activityDate,
a.activityPlaces, a.activityPrice,
SUM(b.bookingPlaces) bookingTotal,
SUM (CASE WHEN t.tempReserveDate > GetDate()
THEN t.tempPlaces ELSE 0 end) tempPlacesReserved
FROM LeisureActivities.dbo.activities a
Join LeisureActivities.dbo.bookings b
On b.activityID = a.activityID
Left Join LeisureActivities.dbo.tempbookings t
On t.tempActivityID = a.activityID
WHERE a.activityDate > GetDate ()
GROUP BY a.activityID, a.activityName,
a.activityDate, a.activityPlaces,
a.activityPrice;

SQL Server Update via Select Statement

I have the following sql statement and I want to update a field on the rows returned from the select statement. Is this possible with my select? The things I have tried are not giving me the desired results:
SELECT
Flows_Flows.FlowID,
Flows_Flows.Active,
Flows_Flows.BeatID,
Flows_Flows.FlowTitle,
Flows_Flows.FlowFileName,
Flows_Flows.FlowFilePath,
Flows_Users.UserName,
Flows_Users.DisplayName,
Flows_Users.ImageName,
Flows_Flows.Created,
SUM(CASE WHEN [Like] = 1 THEN 1 ELSE 0 END) AS Likes,
SUM(CASE WHEN [Dislike] = 1 THEN 1 ELSE 0 END) AS Dislikes
FROM Flows_Flows
INNER JOIN Flows_Users ON Flows_Users.UserID = Flows_Flows.UserID
LEFT JOIN Flows_Flows_Likes_Dislikes ON
Flows_Flows.FlowID=Flows_Flows_Likes_Dislikes.FlowID
WHERE Flows_Flows.Active = '1' AND Flows_Flows.Created < DATEADD(day, -60, GETDATE())
Group By Flows_Flows.FlowID, Flows_Flows.Active, Flows_Flows.BeatID,
Flows_Flows.FlowTitle, Flows_Flows.FlowFileName, Flows_Flows.FlowFilePath,
Flows_Users.UserName, Flows_Users.DisplayName, Flows_Users.ImageName,
Flows_Flows.Created
Having SUM(CASE WHEN [Like] = 1 THEN 1 ELSE 0 END) = '0' AND SUM(CASE WHEN [Dislike] = 1
THEN 1 ELSE 0 END) >= '0'
This select statement returns exactly what I need but I want to change the Active field from 1 to 0.
yes - the general structure might be like this: (note you don't declare your primary key)
UPDATE mytable
set myCol = 1
where myPrimaryKey in (
select myPrimaryKey from mytable where interesting bits happen here )
Because you haven't made your question more clear in what result you want to achieve, I'll provide an answer with my own assumptions.
Assumption
You have a select statement that gives you stuffs, and it works as desired. What you want it to do is to make it return results and update those selected rows on the fly - basically like saying "find X, tell me about X and make it Y".
Anwser
If my assumption is correct, unfortunately I don't think there is any way you can do that. A select does not alter the table, it can only fetch information. Similarly, an update does not provide more detail than the number of rows updated.
But don't give up yet, depending on the result you want to achieve, you have alternatives.
Alternatives
If you just want to update the rows that you have selected, you can
simply write an UPDATE statement to do that, and #Randy has provided
a good example of how it will be written.
If you want to reduce calls to server, meaning you want to make just
one call to the server and get result, as well as to update the
rows, you can write store procedures to do that.
Store procedures are like functions you wrote in programming languages. It essentially defines a set of sql operations and gives them a name. Each time you call that store procedure, the set of operations gets executed with supplied inputs, if any.
So if you want to learn more about store procedures you can take a look at:
http://www.mysqltutorial.org/introduction-to-sql-stored-procedures.aspx
If I understand correctly you are looking for a syntax to be able to select the value of Active to be 0 if it is 1. The syntax for something like that is
SELECT
Active= CASE WHEN Active=1 THEN 0 ELSE Active END
FROM
<Tables>
WHERE
<JOIN Conditions>