Sorting by newest date in joined query - sql

I have a query in MSSQL that needs modification but I am unable to get it working properly. The query now is the following:
SELECT Computer.Id AS ComputerId,
Concat(HardDisk.Id, ' ') disks
FROM Computer
LEFT JOIN HardDisk ON Computer.Id = HardDisk.ComputerId
LEFT JOIN DiskOperationLog ON DiskOperationLog.HardDiskId = HardDisk.Id
I need it to also check the EndTime column in the DiskOperationLog table: if two DiskOperationLog rows exist for the same HardDisk.Id, it should select only the one with the newest date. Is this something that can be done? I suspect it can be done using max(DiskOperationLog.EndTime), but I am unable to include it properly in my selection.
Any help is highly appreciated!

I need it to also check the EndTime column in the DiskOperationLog table: if two DiskOperationLog rows exist for the same HardDisk.Id, it should select only the one with the newest date.
Your query doesn't seem to use DiskOperationLog -- not for filtering (the query uses LEFT JOIN) and not selecting any columns. Let me assume this is an oversight in the question.
In SQL Server, the simplest method to do what you want uses OUTER APPLY:
SELECT c.Id AS ComputerId, Concat(hd.Id, ' ') disks
FROM Computer c LEFT JOIN
HardDisk hd
ON c.Id = hd.ComputerId OUTER APPLY
(SELECT TOP (1) dol.*
FROM DiskOperationLog dol
WHERE dol.HardDiskId = hd.Id
ORDER BY dol.EndTime DESC
) dol;
APPLY implements a lateral join, which is a lot like a correlated subquery, with the following differences:
The logic is in the FROM clause.
More than one column can be returned.
More than one row can be returned.

You can use the ROW_NUMBER() window function. You would want to partition by HardDisk.Id and order by DiskOperationLog.EndTime descending.
With Qry1 As (
SELECT Computer.Id AS ComputerId,
Concat(HardDisk.Id, ' ') disks,
DiskOperationLog.EndTime,
ROW_NUMBER() OVER(PARTITION BY HardDisk.Id ORDER BY DiskOperationLog.EndTime DESC) As Seq
FROM Computer
LEFT JOIN HardDisk
ON Computer.Id = HardDisk.ComputerId
LEFT JOIN DiskOperationLog
ON DiskOperationLog.HardDiskId = HardDisk.Id
)
SELECT ComputerId,
disks,
EndTime
FROM Qry1
WHERE Seq = 1
BTW, if you're trying to get a list of comma-separated disk numbers in column #2, that is definitely not the way to do it.
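To see the ROW_NUMBER() approach run end to end, here is a minimal sketch in Python using the standard-library sqlite3 module rather than SQL Server (an assumption made so the example is self-contained; window functions need SQLite 3.25 or later). The table layout follows the question, but the sample rows and the DiskId alias are made up for illustration.

```python
import sqlite3

# In-memory database with hypothetical sample data:
# one computer, two disks, and two log rows for disk 10.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Computer (Id INTEGER PRIMARY KEY);
CREATE TABLE HardDisk (Id INTEGER PRIMARY KEY, ComputerId INTEGER);
CREATE TABLE DiskOperationLog (Id INTEGER PRIMARY KEY, HardDiskId INTEGER, EndTime TEXT);
INSERT INTO Computer VALUES (1);
INSERT INTO HardDisk VALUES (10, 1), (11, 1);
INSERT INTO DiskOperationLog VALUES
  (100, 10, '2020-01-01'),
  (101, 10, '2020-03-01'),
  (102, 11, '2020-02-01');
""")

# Number the log rows per disk, newest first, then keep only row 1.
# Note that the outer SELECT uses the CTE's column aliases, not the base tables.
latest_logs = conn.execute("""
WITH Qry1 AS (
  SELECT c.Id AS ComputerId, hd.Id AS DiskId, dol.EndTime,
         ROW_NUMBER() OVER (PARTITION BY hd.Id ORDER BY dol.EndTime DESC) AS Seq
  FROM Computer c
  LEFT JOIN HardDisk hd ON c.Id = hd.ComputerId
  LEFT JOIN DiskOperationLog dol ON dol.HardDiskId = hd.Id
)
SELECT ComputerId, DiskId, EndTime
FROM Qry1
WHERE Seq = 1
ORDER BY DiskId
""").fetchall()

print(latest_logs)
```

Each disk appears once, paired with its newest EndTime; the older log row for disk 10 is dropped.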

Related

Sum fields of an Inner join

How I can add two fields that belong to an inner join?
I have this code:
select
SUM(ACT.NumberOfPlants ) AS NumberOfPlants,
SUM(ACT.NumOfJornales) AS NumberOfJornals
FROM dbo.AGRMastPlanPerformance MPR (NOLOCK)
INNER JOIN GENRegion GR ON (GR.intGENRegionKey = MPR.intGENRegionLink )
INNER JOIN AGRDetPlanPerformance DPR (NOLOCK) ON
(DPR.intAGRMastPlanPerformanceLink =
MPR.intAGRMastPlanPerformanceKey)
INNER JOIN vwGENPredios P (NOLOCK) ON ( DPR.intGENPredioLink =
P.intGENPredioKey )
INNER JOIN AGRSubActivity SA (NOLOCK) ON (SA.intAGRSubActivityKey =
DPR.intAGRSubActivityLink)
LEFT JOIN (SELECT RA.intGENPredioLink, AR.intAGRActividadLink,
AR.intAGRSubActividadLink, SUM(AR.decNoPlantas) AS
intPlantasTrabajads, SUM(AR.decNoPersonas) AS NumOfJornales,
SUM(AR.decNoPlants) AS NumberOfPlants
FROM AGRRecordActivity RA WITH (NOLOCK)
INNER JOIN AGRActividadRealizada AR WITH (NOLOCK) ON
(AR.intAGRRegistroActividadLink = RA.intAGRRegistroActividadKey AND
AR.bitActivo = 1)
INNER JOIN AGRSubActividad SA (NOLOCK) ON (SA.intAGRSubActividadKey
= AR.intAGRSubActividadLink AND SA.bitEnabled = 1)
WHERE RA.bitActive = 1 AND
AR.bitActive = 1 AND
RA.intAGRTractorsCrewsLink IN(2)
GROUP BY RA.intGENPredioLink,
AR.decNoPersons,
AR.decNoPlants,
AR.intAGRAActivityLink,
AR.intAGRSubActividadLink) ACT ON (ACT.intGENPredioLink IN(
DPR.intGENPredioLink) AND
ACT.intAGRAActivityLink IN( DPR.intAGRAActivityLink) AND
ACT.intAGRSubActivityLink IN( DPR.intAGRSubActivityLink))
WHERE
MPR.intAGRMastPlanPerformanceKey IN(4) AND
DPR.intAGRSubActivityLink IN( 1153)
GROUP BY
P.vchRegion,
ACT.NumberOfFloors,
ACT.NumOfJournals
ORDER BY ACT.NumberOfFloors DESC
However, it does not perform the complete sum. It only retrieves all the values of the columns and adds them one by one, instead of summing the whole column.
For example, the query returns these results:
What I expect are the final sums: NumberOfPlants should total 163,237 and NumberJornales should total 61.
How can I do this?
First of all the (nolock) hints are probably not accomplishing the benefit you hope for. It's not an automatic "go faster" option, and if such an option existed you can be sure it would be already enabled. It can help in some situations, but the way it works allows the possibility of reading stale data, and the situations where it's likely to make any improvement are the same situations where risk for stale data is the highest.
That out of the way, with that much code in the question we're better served with a general explanation and solution for you to adapt.
The issue here is GROUP BY. When you use a GROUP BY in SQL, you're telling the database you want to see separate results per group for any aggregate functions like SUM() (and COUNT(), AVG(), MAX(), etc).
So if you have this:
SELECT Sum(ColumnB) As SumB
FROM [Table]
GROUP BY ColumnA
You will get a separate row per ColumnA group, even though it's not in the SELECT list.
If you don't really care about that, you can do one of two things:
1. Remove the GROUP BY. If there are no grouped columns in the SELECT list, the GROUP BY clause is probably not accomplishing anything important.
2. Nest the query. If option 1 is somehow not possible (say, the original is actually a view) you could do this:
SELECT SUM(SumB)
FROM (
SELECT Sum(ColumnB) As SumB
FROM [Table]
GROUP BY ColumnA
) t
Note in both cases any JOIN is irrelevant to the issue.
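The difference between the grouped and ungrouped sums can be checked with a small sketch. It uses Python's standard-library sqlite3 module rather than SQL Server, and the table T and its three rows are hypothetical stand-ins for [Table].

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T (ColumnA TEXT, ColumnB INTEGER);
INSERT INTO T VALUES ('x', 10), ('x', 5), ('y', 7);
""")

# With GROUP BY: one SUM per ColumnA value, even though ColumnA is not selected.
per_group = conn.execute(
    "SELECT SUM(ColumnB) AS SumB FROM T GROUP BY ColumnA ORDER BY ColumnA"
).fetchall()

# Option 1: drop the GROUP BY to get a single total.
no_group = conn.execute("SELECT SUM(ColumnB) AS SumB FROM T").fetchall()

# Option 2: keep the grouped query and sum its results in an outer query.
nested = conn.execute("""
SELECT SUM(SumB) FROM (
  SELECT SUM(ColumnB) AS SumB FROM T GROUP BY ColumnA
) t
""").fetchall()

print(per_group, no_group, nested)
```

The grouped query returns two rows (15 and 7), while both alternatives return the single total 22.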

Can we select first row of data from column in sql?

I have a table with multiple data for same ID. I want to get the first row data for the ID.
I have added the below SQL that I have tried.
SELECT
"client"."id",
"client"."company_name",
"client_details"."address"
from Client
LEFT OUTER JOIN "client_details" ON ("client"."id" = "client_details"."client_id")
Since I have multiple address for the same ID, can we get only the first id?
Currently the output I get is 2 rows with different addresses.
You can add LIMIT 1 to your query and, if you want to be sure of the order, also add an ORDER BY...
You can use distinct on:
select distinct on (c.id) c.id, c.company_name, cd.address
from Client c left join
client_details cd
on c.id = cd.client_id
order by c.id, ?;
The ? is for the column that specifies the ordering (the definition of "first"). I am guessing that cd.id is what you want.
Note that this query removes the double quotes and introduces table aliases. This is easier on both the eyes (to read) and the fingers (to type).
Use row_number():
select * from
(
SELECT
"client"."id",
"client"."company_name",
"client_details"."address",row_number() over(partition by "client"."id" order by "client_details"."address") as rn
from Client
LEFT OUTER JOIN "client_details" ON "client"."id" = "client_details"."client_id"
)A where rn=1
If there is a field you can order the results by you could use a lateral join e.g.
SELECT
"client"."id",
"client"."company_name",
"client_details"."address"
from Client
left join lateral (
select *
from client_details cd
where cd.client_id = client.id
order by [some_ordering_field]
limit 1
) "client_details" on true

SQL Server JOINS

Can someone help explain to me how, when I have 12 rows in table A and 10 in B and I do an inner join, I would get more rows than in both A and B?
Same with left and right joins...
This is just a simplified example. Let me share one of my issues with you.
I have 2 views, which were originally SQL on 2 base tables, Culture and Trials.
Then, when attempting to add another table, Culture Steps, one of the team members separated the SQL into 2 views.
Since this produces an error when updating (modification cannot be done as it affects multiple base tables), I would like to go back to changing the SQL so that I no longer use the views but achieve the same results.
One of the views has
SELECT some columns
FROM dbo.Culture RIGHT JOIN
dbo.Trial ON dbo.Culture.cultureID = dbo.Trial.CultureID LEFT OUTER JOIN
dbo.TrialCultureSteps_view_part1 ON dbo.Culture.cultureID = dbo.TrialCultureSteps_view_part1.cultureID
The other view, TrialCultureSteps_view_part1, has
SELECT DISTINCT dbo.Culture.cultureID,
(SELECT TOP (1) WeekNr
FROM dbo.CultureStep
WHERE (CultureID = dbo.Culture.cultureID)
ORDER BY CultureStepID) AS normalstartweek
FROM dbo.Culture INNER JOIN
dbo.CultureStep AS CultureStep_1 ON dbo.Culture.cultureID = CultureStep_1.CultureID
So how can I combine the joins to achieve the same results using SQL only on tables, without the need for views?
Welcome to StackOverflow! This link might be a good place to start in your understanding of JOINs. Essentially, the 'problem' you describe boils down to the fact that one or more of your sources (Trial, Culture, or the TrialCultureSteps view) has more than one record per CultureID - in other words, the same CultureID (#1) shows up on multiple rows.
Based solely on that ID, I'd execute the following three queries. Anything that is returned by them is the 'cause' of your duplications - the culture ID shows up more than once, so you'll have to JOIN on more than just CultureID. If, as I half-suspect, your view is the one that has multiple Culture IDs, you'll need to modify it to only return one record, or change the way that you JOIN to it.
SELECT *
FROM Trial
WHERE CultureID IN
(
SELECT CultureID
FROM Trial
GROUP BY CultureID
HAVING COUNT(*) > 1
)
ORDER BY CultureID
SELECT *
FROM Culture
WHERE CultureID IN
(
SELECT CultureID
FROM Culture
GROUP BY CultureID
HAVING COUNT(*) > 1
)
ORDER BY CultureID
SELECT *
FROM TrialCultureSteps_view_part1
WHERE CultureID IN
(
SELECT CultureID
FROM TrialCultureSteps_view_part1
GROUP BY CultureID
HAVING COUNT(*) > 1
)
ORDER BY CultureID
Let me know if any of these return values!
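A small sketch shows how repeated key values multiply rows in a join. It uses Python's standard-library sqlite3 module rather than SQL Server, and the two tables and their rows are hypothetical, cut down to the join column only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Trial (TrialID INTEGER PRIMARY KEY, CultureID INTEGER);
CREATE TABLE CultureStep (CultureStepID INTEGER PRIMARY KEY, CultureID INTEGER);
INSERT INTO Trial VALUES (1, 1), (2, 1);             -- 2 rows, same CultureID
INSERT INTO CultureStep VALUES (1, 1), (2, 1), (3, 1); -- 3 rows, same CultureID
""")

# Every Trial row matches every CultureStep row with the same CultureID.
joined_count = conn.execute("""
SELECT COUNT(*) FROM Trial t
INNER JOIN CultureStep cs ON cs.CultureID = t.CultureID
""").fetchone()[0]

# The diagnostic from the answer: which key values occur more than once?
repeated_keys = conn.execute("""
SELECT CultureID, COUNT(*) FROM CultureStep
GROUP BY CultureID HAVING COUNT(*) > 1
""").fetchall()

print(joined_count, repeated_keys)
```

Two rows joined to three rows on the same key give six rows, more than either table holds, and the HAVING COUNT(*) > 1 query points at the key responsible.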
The comments explain the JOIN issues. As for rewriting, any views could be replaced with CTEs.
Another way to rewrite the query would be the following (though having sample data and the expected result would make it easier to confirm that it's correct):
;with TrialCultureSteps_view_part1 AS
(
Select Row_number() OVER (Partition BY CultureID ORDER BY CultureStepID) RowNumber
, WeekNr
, CultureID
FROM dbo.CultureStep
)
SELECT some columns
FROM dbo.Trial LEFT OUTER JOIN
dbo.Culture ON dbo.Culture.cultureID = dbo.Trial.CultureID LEFT OUTER JOIN
TrialCultureSteps_view_part1 ON dbo.Culture.cultureID = TrialCultureSteps_view_part1.cultureID and RowNumber=1
As for Access, I'm less familiar with the syntax, but I know that Row_Number() isn't available and I don't believe it has CTE syntax either. So we'd need to put in some more nested derived tables.
SELECT some columns
FROM dbo.Trial LEFT OUTER JOIN
dbo.Culture ON dbo.Culture.cultureID = dbo.Trial.CultureID LEFT OUTER JOIN
( Select cs.CultureID, cs.WeekNr FROM
( SELECT CultureID, MIN(CultureStepID) CultureStepID
FROM dbo.CultureStep
GROUP BY CultureID
) Fcs INNER JOIN
CultureStep cs ON fcs.cultureStepID=cs.CultureStepID
) TrialCultureSteps_view_part1 ON dbo.Culture.cultureID = TrialCultureSteps_view_part1.cultureID
The assumption here is that CultureStepID is the PK of CultureStep. There is no assumption that a step must exist for each Culture entry.
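The derived-table approach (MIN per group, joined back to the table) can be tried without window functions. This is a minimal sketch using Python's standard-library sqlite3 module rather than Access or SQL Server; the sample rows and the first_step alias are made up for illustration, and the Trial table is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Culture (cultureID INTEGER PRIMARY KEY);
CREATE TABLE CultureStep (CultureStepID INTEGER PRIMARY KEY, CultureID INTEGER, WeekNr INTEGER);
INSERT INTO Culture VALUES (1), (2), (3);                          -- culture 3 has no steps
INSERT INTO CultureStep VALUES (1, 1, 12), (2, 1, 14), (3, 2, 30);
""")

# fcs finds the lowest CultureStepID per culture; joining back on the
# primary key fetches that step's WeekNr. The LEFT JOIN keeps cultures
# that have no steps at all.
first_weeks = conn.execute("""
SELECT c.cultureID, first_step.WeekNr
FROM Culture c
LEFT JOIN (
  SELECT cs.CultureID, cs.WeekNr
  FROM (
    SELECT CultureID, MIN(CultureStepID) AS CultureStepID
    FROM CultureStep
    GROUP BY CultureID
  ) fcs
  INNER JOIN CultureStep cs ON fcs.CultureStepID = cs.CultureStepID
) first_step ON c.cultureID = first_step.CultureID
ORDER BY c.cultureID
""").fetchall()

print(first_weeks)
```

Each culture appears exactly once, with NULL (None in Python) for the culture that has no steps.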

Joining a derived table based on specific data from the outside query

I am trying to join one record in a table to another using a derived table and am having a bit of trouble figuring out the correct query to do so. What I want to do is have a JOIN of a derived table to a query where the derived table uses where statements depending on data from the outer query that is being joined to. So here is the current code that I am working on:
SELECT a.viewerid, a.id, v.id AS entry, a.jobid, v.sourceid, v.cost, a.applicant
FROM a_views a,
JOIN (
SELECT TOP 1 id, sourceid, cost FROM a_views vt
WHERE vt.viewerid = a.viewerid
AND vt.viewed_at <= a.viewed_at
AND vt.referrer NOT LIKE '%' + vt.hostName + '%'
ORDER BY viewed_at DESC
) v
The derived table is a query of the same table that the outer query uses, and viewerid is a FK to itself across the table where id is a unique auto-incrementing PK. I need to get the latest record in the a_views table where the viewer id's match, the datestamp (viewed_at) is less than the outer datestamp and the referrer column doesn't contain the hostName column.
Sounds like you need APPLY:
SELECT a.viewerid, a.id, v.id AS entry, a.jobid, v.sourceid, v.cost, a.applicant
FROM a_views a
CROSS APPLY (
SELECT TOP 1 id, sourceid, cost FROM a_views vt
WHERE vt.viewerid = a.viewerid
AND vt.viewed_at <= a.viewed_at
AND vt.referrer NOT LIKE '%' + vt.hostName + '%'
ORDER BY viewed_at DESC
) v
Since your query has JOIN I've gone for CROSS APPLY, but you may need OUTER APPLY depending on your exact requirements.

How to select DISTINCT results in an SQL JOIN query

This is my query; please check it. It executes successfully, but DISTINCT is not working:
SELECT
DISTINCT(ticket_message.ticket_id),
support_ticket.user_id,
support_ticket.priority,
support_ticket.subject,
support_ticket.status,
ticket_message.message
FROM
support_ticket
LEFT OUTER JOIN ticket_message ON support_ticket.ticket_id = ticket_message.ticket_id
LEFT OUTER JOIN assign_ticket ON ticket_message.ticket_id = assign_ticket.ticket_id
The word distinct is a modifier to the keyword SELECT. So you need to think of it as SELECT DISTINCT and it ALWAYS operates across the entire row. It simply ignores the parentheses seen in the following:
select distinct(ticket_message.ticket_id)
because distinct is NOT a function.
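That the parentheses make no difference can be checked with a short sketch. It uses Python's standard-library sqlite3 module as a stand-in for the asker's database (which is not stated), with a hypothetical two-column version of ticket_message.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ticket_message (ticket_id INTEGER, message TEXT);
INSERT INTO ticket_message VALUES (1, 'first'), (1, 'second'), (1, 'second');
""")

# The parentheses only wrap the first column expression;
# DISTINCT still compares whole rows.
with_parens = conn.execute(
    "SELECT DISTINCT(ticket_id), message FROM ticket_message ORDER BY message"
).fetchall()
without_parens = conn.execute(
    "SELECT DISTINCT ticket_id, message FROM ticket_message ORDER BY message"
).fetchall()

# Only when ticket_id is the sole column does each ticket appear once.
only_id = conn.execute("SELECT DISTINCT ticket_id FROM ticket_message").fetchall()

print(with_parens, without_parens, only_id)
```

Both two-column queries return the same two rows for ticket 1, because the rows differ in message; only the exact duplicate row is removed.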
So. What we appear to have is a support ticket with associated messages. There are usually multiple messages per support ticket, so I suspect what you want is more complex. For example you might want just the most recent message for each support ticket.
To get the most recent message we need a timestamp (or "datetime") column, and we also need to know if your database supports "window functions". Let's assume you have a timestamp column called message_at and your database does support window functions; then this would reduce the number of rows:
SELECT
support_ticket.ticket_id
, support_ticket.user_id
, support_ticket.support_section
, support_ticket.priority
, support_ticket.subject
, support_ticket.status
, tm.file
, tm.message
, assign_ticket.section_id
, assign_ticket.section_admin_id
FROM support_ticket
LEFT OUTER JOIN (
SELECT
ticket_id
, file
, message
, ROW_NUMBER() OVER (PARTITION BY ticket_id ORDER BY message_at DESC) AS row_num
FROM ticket_message
) tm ON support_ticket.ticket_id = tm.ticket_id
AND tm.row_num = 1
LEFT OUTER JOIN assign_ticket ON tm.ticket_id = assign_ticket.ticket_id
ROW_NUMBER() OVER (PARTITION BY ticket_id ORDER BY message_at DESC) assigns the number 1 to the most recent message, and later we ignore all rows that are > 1 thus removing unwanted repetition in the results.
So, we really need to know much more about your actual data, the database (and version) you are using and your real needs. It is almost certain that select distinct is NOT the right technique for what you are trying to achieve.
I suggest you read these: Provide a Minimal Complete Verifiable Example (MCVE) and Why should I provide a MCVE
Use this statement:
SELECT DISTINCT
ticket_message.ticket_id
FROM
support_ticket
LEFT OUTER JOIN ticket_message ON
support_ticket.ticket_id = ticket_message.ticket_id
LEFT OUTER JOIN assign_ticket ON
ticket_message.ticket_id = assign_ticket.ticket_id
As soon as you add more columns to your query, DISTINCT takes them into account as well.