Can some expert please explain what this query is doing? [duplicate] - sql

This question already has answers here:
Get top 1 row of each group
(19 answers)
Closed 6 months ago.
This is the query in one of the Reports that I am trying to fix. What is being done here?
select
*
from
(
SELECT
[dbo].[RegistrationHistory].[AccountNumber],
[dbo].[RegistrationHistory].[LinkedBP],
[dbo].[RegistrationHistory].[SerialNumber],
[dbo].[RegistrationHistory].[StockCode],
[dbo].[RegistrationHistory].[RegistrationDate],
[dbo].[RegistrationHistory].[CoverExpiry],
[dbo].[RegistrationHistory].[LoggedDate] as 'CoverExpiryNew',
ROW_NUMBER() OVER(PARTITION BY [dbo].[RegistrationHistory].[SerialNumber]
ORDER BY
LoggedDate asc) AS seq,
[dbo].[Registration].[StockCode] as 'CurrentStockCode'
FROM
[SID_Repl].[dbo].[RegistrationHistory]
LEFT JOIN
[SID_Repl].[dbo].[Registration]
on [dbo].[RegistrationHistory].[SerialNumber] = [dbo].[Registration].[SerialNumber]
where
[dbo].[RegistrationHistory].[StockCode] in
(
'E4272HL1',
'E4272HL2',
'E4272HL3',
'E4272H3',
'OP45200HA',
'OP45200HM',
'EOP45200HA',
'EOP45200HM',
'4272HL1',
'4272HL2',
'4272HL3',
'4272H3'
)
)
as t
where
t.seq = 1
and CurrentStockCode in
(
'E4272HL1',
'E4272HL2',
'E4272HL3',
'E4272H3',
'OP45200HA',
'OP45200HM',
'EOP45200HA',
'EOP45200HM',
'4272HL1',
'4272HL2',
'4272HL3',
'4272H3'
)
I am looking for a simple way to break this query into steps, so that I can see where it is going wrong.

ROW_NUMBER in a subquery, combined with a filter on it in the outer query, is an idiom for keeping only the first row of each group. So here
ROW_NUMBER() OVER(PARTITION BY [dbo].[RegistrationHistory].[SerialNumber]
ORDER BY
LoggedDate asc) AS seq
numbers the rows within each SerialNumber by LoggedDate: the earliest-logged row gets 1, the next gets 2, and so on, and the numbering restarts for every serial number. Then later
where
t.seq = 1
keeps only the earliest-logged row for each serial number.
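To see the idiom step by step, the inner query can be run on its own first, and the seq = 1 filter added afterwards. The sketch below is only an illustration: it uses Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for SQL Server, a cut-down RegistrationHistory table, and made-up rows.

```python
import sqlite3

# Made-up sample rows; SQLite stands in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE RegistrationHistory (SerialNumber TEXT, StockCode TEXT, LoggedDate TEXT)"
)
conn.executemany(
    "INSERT INTO RegistrationHistory VALUES (?, ?, ?)",
    [
        ("S1", "E4272HL1", "2020-03-01"),
        ("S1", "E4272HL2", "2020-01-15"),
        ("S2", "OP45200HA", "2020-02-10"),
        ("S2", "OP45200HA", "2020-05-20"),
    ],
)

numbered_sql = """
    SELECT SerialNumber, StockCode, LoggedDate,
           ROW_NUMBER() OVER (PARTITION BY SerialNumber
                              ORDER BY LoggedDate ASC) AS seq
    FROM RegistrationHistory
"""

# Step 1: run the inner query alone to inspect every row with its seq value.
numbered = conn.execute(numbered_sql + " ORDER BY SerialNumber, seq").fetchall()

# Step 2: wrap it and keep only seq = 1, the earliest row per serial number.
first_rows = conn.execute(
    "SELECT SerialNumber, StockCode, LoggedDate FROM ("
    + numbered_sql
    + ") AS t WHERE t.seq = 1 ORDER BY SerialNumber"
).fetchall()

for row in numbered:
    print(row)
print(first_rows)
```

Running step 1 against the real tables, before adding the outer filters, is one way to check whether the row that gets seq = 1 is the row the report is supposed to show.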

Related

computed column based on ranked position

I am trying to add a new (computed) column that assigns points based on the values in the positions column, as in this image.
I tried the query below, but it did not work. Please help.
(image: query in SQL Server)
You can't do rank using apply like that. You'll always get "1". Use a subquery:
select . . .
from (select ae.*, rank() over (order by averagemark desc) as position
from agriculturalentries ae
) ae cross apply
(values (case when position <= 13 then 150 - position * 10 end) ) as v(pointsearned);
I find the arithmetic easier to type than your case, but you can use the more verbose form.
You might ask why rank() always returns "1" in your query. That is because apply only considers one row at a time (as written). The rank over that row is necessarily "1".
WITH AllMarks AS
(
    SELECT
        CompetitorID,
        ApiEntryId,
        AverageMark,
        -- DESC so that the highest mark gets rank 1
        RANK() OVER (ORDER BY AverageMark DESC) AS RankPosition
        -- Add other columns
    FROM ApicultureEntries
)
SELECT
    a.*,
    (CASE WHEN a.RankPosition < 14 THEN 150 - a.RankPosition * 10 ELSE NULL END) AS PositionEarned
FROM AllMarks AS a
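A small runnable sketch of the same approach, for illustration only. It assumes SQLite (through Python's sqlite3 module, version 3.25 or later) instead of SQL Server, a hypothetical entries table, and made-up marks. The last query also shows why ranking a single row on its own always gives 1.

```python
import sqlite3

# Hypothetical table and made-up marks; SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (competitor TEXT, averagemark REAL)")
conn.executemany(
    "INSERT INTO entries VALUES (?, ?)",
    [("A", 90.0), ("B", 85.0), ("C", 85.0), ("D", 70.0)],
)

# Rank over the whole table in a subquery, then derive the points from the rank.
rows = conn.execute("""
    SELECT competitor, position,
           CASE WHEN position <= 13 THEN 150 - position * 10 END AS pointsearned
    FROM (SELECT competitor,
                 RANK() OVER (ORDER BY averagemark DESC) AS position
          FROM entries) AS ranked
    ORDER BY position, competitor
""").fetchall()

# Ranking a one-row set: the only row is necessarily ranked 1.
single = conn.execute("""
    SELECT RANK() OVER (ORDER BY averagemark DESC)
    FROM (SELECT * FROM entries WHERE competitor = 'D') AS one_row
""").fetchone()[0]

print(rows)
print(single)
```

Note that RANK gives tied marks the same position and then skips the next one, so B and C share position 2 and D lands on 4.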

Oracle SQL: Get the max record for each duplicate ID in self join table [duplicate]

This question already has answers here:
Fetch the rows which have the Max value for a column for each distinct value of another column
(35 answers)
GROUP BY with MAX(DATE) [duplicate]
(6 answers)
Oracle SQL query: Retrieve latest values per group based on time [duplicate]
(2 answers)
Closed 5 years ago.
It's been marked as a duplicate and seems to be explained a bit in the linked questions, but I'm still trying to get the separate DEBIT and CREDIT columns on the same row.
I've created a View and I am currently self joining it. I'm trying to get the max Header_ID for each date.
My SQL is currently:
SELECT DISTINCT
TAB1.id,
TAB1.glperiods_id,
MAX(TAB2.HEADER_ID),
TAB1.batch_date,
TAB1.debit,
TAB2.credit,
TAB1.descrip
FROM
IQMS.V_TEST_GLBATCH_GJ TAB1
LEFT OUTER JOIN
IQMS.V_TEST_GLBATCH_GJ TAB2
ON
TAB1.ID = TAB2.ID AND TAB1.BATCH_DATE = TAB2.BATCH_DATE AND TAB1.GLPERIODS_ID = TAB2.GLPERIODS_ID AND TAB1.DESCRIP = TAB2.DESCRIP AND TAB1.DEBIT <> TAB2.CREDIT
WHERE
TAB1.ACCT = '3648-00-0'
AND
TAB1.DESCRIP NOT LIKE '%INV%'
AND TAB1.DEBIT IS NOT NULL
GROUP BY
TAB1.id,
TAB1.glperiods_id,
TAB1.batch_date,
TAB1.debit,
TAB2.credit,
TAB1.descrip
ORDER BY TAB1.batch_date
And the output for this is (37 rows in total):
I'm joining the table onto itself to get DEBIT and CREDIT on the same line. How do I select only the rows with the max HEADER_ID per BATCH_DATE ?
Update
For @sagi
Those highlighted with the red box are the rows I want and the ones in blue would be the ones I'm filtering out.
Fixed mistake
I recently noticed I had joined my table onto itself without making sure TAB2 ACCT='3648-00-0'.
The corrected SQL is here:
SELECT DISTINCT
TAB1.id,
TAB1.glperiods_id,
Tab1.HEADER_ID,
TAB1.batch_date,
TAB1.debit,
TAB2.credit,
TAB1.descrip
FROM
IQMS.V_TEST_GLBATCH_GJ TAB1
LEFT OUTER JOIN
IQMS.V_TEST_GLBATCH_GJ TAB2
ON
TAB1.ID = TAB2.ID AND TAB1.BATCH_DATE = TAB2.BATCH_DATE AND TAB2.ACCT = '3648-00-0' AND TAB1.GLPERIODS_ID = TAB2.GLPERIODS_ID AND TAB1.DESCRIP = TAB2.DESCRIP AND TAB1.DEBIT <> TAB2.CREDIT
WHERE
TAB1.ACCT = '3648-00-0'
AND
TAB1.DESCRIP NOT LIKE '%INV%'
AND TAB1.DEBIT IS NOT NULL
ORDER BY TAB1.BATCH_DATE
Use a window function such as ROW_NUMBER():
SELECT s.* FROM (
SELECT t.*,
ROW_NUMBER() OVER(PARTITION BY t.batch_date ORDER BY t.header_id DESC) as rnk
FROM YourTable t
WHERE t.ACCT = '3648-00-0'
AND t.DESCRIP NOT LIKE '%INV%'
AND t.DEBIT IS NOT NULL) s
WHERE s.rnk = 1
This is an analytic function that numbers your records according to the clauses given in OVER:
PARTITION BY - defines the group
ORDER BY - decides who comes first within the group (the first row gets 1, the second gets 2, etc.)
It is usually more efficient than a self join (your problem could have been solved in many ways), because it reads the table only once.
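As a quick check on the technique, the sketch below picks the row with the highest header_id per batch_date and compares the result with a plain GROUP BY / MAX. It is an illustration under assumptions: SQLite (via Python's sqlite3, version 3.25 or later) stands in for Oracle, and the gl table and its rows are made up.

```python
import sqlite3

# Hypothetical table and made-up rows; SQLite stands in for Oracle.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gl (header_id INTEGER, batch_date TEXT, debit REAL)")
conn.executemany(
    "INSERT INTO gl VALUES (?, ?, ?)",
    [
        (101, "2016-01-04", 50.0),
        (105, "2016-01-04", 75.0),
        (110, "2016-01-05", 20.0),
    ],
)

# Number rows per batch_date, highest header_id first, and keep number 1.
latest = conn.execute("""
    SELECT header_id, batch_date, debit
    FROM (SELECT g.*,
                 ROW_NUMBER() OVER (PARTITION BY g.batch_date
                                    ORDER BY g.header_id DESC) AS rnk
          FROM gl g) s
    WHERE s.rnk = 1
    ORDER BY batch_date
""").fetchall()

# GROUP BY returns the same ids, but not the other columns of the winning row.
max_ids = conn.execute(
    "SELECT batch_date, MAX(header_id) FROM gl GROUP BY batch_date ORDER BY batch_date"
).fetchall()

print(latest)
print(max_ids)
```

The window-function form keeps the whole winning row (here including debit), which is what GROUP BY alone cannot do without joining back to the table.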

How to select the first row from group by date [duplicate]

This question already has answers here:
Select first row in each GROUP BY group?
(20 answers)
Closed 8 years ago.
I am writing a program for amateur radio. Some callsigns will appear more than once in the data but the qsodate will be different. I only want the first occurrence of a call sign after a given date.
The query
select distinct
a.callsign,
a.SKCC_Number,
a.qsodate,
b.name,
a.SPC,
a.Band
from qso a, skccdata b
where SKCC_Number like '%[CTS]%'
AND QSODate >= '2014-08-01'
and b.callsign = a.callsign
order by a.QSODate
The problem:
Because contacts occur on different dates, I get all of the contacts - I have tried adding min(a.qsodate) to get only the first but then I run into all sorts of issues regarding grouping.
This query will be in a stored procedure, so creating temp tables or cursors will not be a problem.
You can use the ROW_NUMBER() to get the first row with the first date, like this:
WITH CTE
AS
(
select
a.callsign,
a.SKCC_Number,
a.qsodate,
b.name,
a.SPC,
a.Band,
ROW_NUMBER() OVER(PARTITION BY a.callsign ORDER BY a.QSODate) AS RN
from qso a,skccdata b
where SKCC_Number like '%[CTS]%'
AND QSODate >= '2014-08-01'
and b.callsign = a.callsign
)
SELECT *
FROM CTE
WHERE RN = 1;
ROW_NUMBER() OVER(PARTITION BY a.callsign ORDER BY a.QSODate) numbers the rows of each callsign in QSODate order, starting again at 1 for every callsign. The WHERE RN = 1 filter then drops every row except the first one per callsign, which is the one with the minimum QSODate.
Have you tried starting your query with SELECT TOP 1 ...(fields)? Then you will get only one row. You can use TOP x for x rows, or TOP 50 PERCENT for the top half of the rows, and so on. In that case you can also drop DISTINCT.
EDIT: I misunderstood the question. How about this?
select
a.callsign,
a.SKCC_Number,
-- earliest contact per group; assumes SPC and Band do not vary per callsign
min(a.qsodate) as qsodate,
(SELECT TOP 1 b.name FROM skccdata b WHERE b.callsign = a.callsign) as NAME,
a.SPC,
a.Band
from qso a
where a.SKCC_Number like '%[CTS]%'
AND a.QSODate >= '2014-08-01'
GROUP BY a.callsign, a.SKCC_Number, a.SPC, a.Band
order by min(a.qsodate)
Add a callsign condition to the WHERE clause if you want to isolate particular callsigns.

Why would the query show data from the wrong month?

I have a query:
;with date_cte as(
SELECT r.starburst_dept_name,r.monthly_past_date as PrevDate,x.monthly_past_date as CurrDate,r.starburst_dept_average - x.starburst_dept_average as Average
FROM
(
SELECT *,ROW_NUMBER() OVER(PARTITION BY starburst_dept_name ORDER BY monthly_past_date) AS rowid
FROM intranet.dbo.cse_reports_month
) r
JOIN
(
SELECT *,ROW_NUMBER() OVER(PARTITION BY starburst_dept_name ORDER BY monthly_past_date) AS rowid
FROM intranet.dbo.cse_reports_month
Where month(monthly_past_date) > month(DATEADD(m,-2,monthly_past_date))
) x
ON r.starburst_dept_name = x.starburst_dept_name AND r.rowid = x.rowid+1
Where r.starburst_dept_name is NOT NULL
)
Select *
From date_cte
Order by Average DESC
While testing, I altered the data in some columns to see why the query returns what it does. I don't know why, when I run the query, it returns a date from "january" (row 4) that should not be there, as in the picture below:
The table has more rows with exactly the same date, '2014-01-25 00:00:00.000', so I'm not sure why it picks only that row and compares its average.
Before running the query I did change the date in that row, but I'm not sure whether that has anything to do with it.
UPDATE:
I have added the SQL Fiddle.
What I would like to get is the average from last month minus the average from two months ago.
It was actually working until I altered the data. I made the changes to test a certain situation, which obviously showed that the query has flaws.
Based on your SQL Fiddle, this keeps rows from earlier than two months ago out of the join, so they no longer show up.
SELECT
thismonth.starburst_dept_name
,lastmonth.monthtly_past_date [PrevDate]
,thismonth.monthtly_past_date [CurrDate]
,thismonth.starburst_dept_average - lastmonth.starburst_dept_average as Average
FROM dbo.cse_reports thismonth
inner join dbo.cse_reports lastmonth on
thismonth.starburst_dept_name = lastmonth.starburst_dept_name
AND month(DATEADD(MONTH,-1,thismonth.monthtly_past_date))=month(lastmonth.monthtly_past_date)
WHERE MONTH(thismonth.monthtly_past_date)=month(DATEADD(MONTH,-1,GETDATE()))
Order by thismonth.starburst_dept_average - lastmonth.starburst_dept_average DESC
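One thing worth keeping in mind with both queries is that MONTH() discards the year, so comparisons on month numbers wrap around at year boundaries. The sketch below illustrates this in plain Python; add_months is a hypothetical helper that only roughly imitates DATEADD(MONTH, n, d), and the dates are made up apart from the 2014-01-25 value mentioned in the question. Whether this is the actual cause of the stray January row is an assumption, not something confirmed by the question.

```python
from datetime import date


def add_months(d: date, n: int) -> date:
    """Rough stand-in for DATEADD(MONTH, n, d); the day is clamped to 28 for simplicity."""
    idx = d.year * 12 + (d.month - 1) + n
    return date(idx // 12, idx % 12 + 1, min(d.day, 28))


def passes_filter(d: date) -> bool:
    """The question's filter: month(d) > month(DATEADD(m, -2, d))."""
    return d.month > add_months(d, -2).month


jan = date(2014, 1, 25)
mar = date(2014, 3, 25)

# January minus two months is November, and 1 > 11 is false,
# so January (and February) rows never pass this filter.
print(passes_filter(jan), passes_filter(mar))

# A month-number match also pairs rows that are a year apart.
same_month_number = add_months(mar, -1).month == date(2013, 2, 25).month
print(same_month_number)
```

If the data can span more than one year, comparing year and month together (or comparing against a date range) avoids both effects.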

SQL Calculate Difference Between Current and Last Value By Timestamp Column

I am looking to calculate the difference between the current and the previous Value, ordered by the timestamp column.
My table is organised as follows:
MeterID (PK, FK, int, not null), ActualTimeStamp (smalldatetime, not null), Value (float, null)
MeterID  ActualTimeStamp      Value
312514   2013-01-01 08:08:00  72
312514   2013-01-01 08:07:00  12
So my answer should be 72 - 12 = 60
The only solutions I can find use Row Number, which I don't have the option of using. If anyone can assist I'd really appreciate it, as it's busting my brain.
Here's a query that can help you. Just modify this to fit your need/table names/etc.
with sub as (
select meterid,
actualtimestamp,
value,
row_number() over (partition by meterid order by actualtimestamp desc) as rn
from test
)
select meterid,
actualtimestamp,
value,
value - isnull((select value
from sub
where s.meterid = meterid
and rn = s.rn + 1), value) as answer
from sub s
order by meterid, actualtimestamp desc;
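Basically, it numbers the rows of each meter from newest to oldest using the row_number() window function. For each row, the subquery then looks up the row numbered rn + 1, which is the previous (older) entry, and subtracts its value. The oldest row has no previous entry, so isnull makes it subtract its own value and the answer is 0.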
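The same query can be tried on the two sample rows from the question. This is a sketch under assumptions: it runs on SQLite (via Python's sqlite3, version 3.25 or later) rather than SQL Server, so COALESCE replaces isnull, and the table name test follows the answer above.

```python
import sqlite3

# The two sample rows from the question; SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (meterid INTEGER, actualtimestamp TEXT, value REAL)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        (312514, "2013-01-01 08:08:00", 72),
        (312514, "2013-01-01 08:07:00", 12),
    ],
)

rows = conn.execute("""
    WITH sub AS (
        SELECT meterid, actualtimestamp, value,
               ROW_NUMBER() OVER (PARTITION BY meterid
                                  ORDER BY actualtimestamp DESC) AS rn
        FROM test
    )
    SELECT s.meterid, s.actualtimestamp, s.value,
           -- rn + 1 is the next-older row; fall back to the row's own value
           s.value - COALESCE((SELECT p.value
                               FROM sub p
                               WHERE p.meterid = s.meterid
                                 AND p.rn = s.rn + 1),
                              s.value) AS answer
    FROM sub s
    ORDER BY s.meterid, s.actualtimestamp DESC
""").fetchall()

for row in rows:
    print(row)
```

The newest row yields 72 - 12 = 60, and the oldest row yields 0 because it has nothing to compare against.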
In SQL Server 2008, I would recommend OUTER APPLY. Here is a short query that finds the difference you need:
select t.*, isnull((t.value - tprev.value),0) as diff
from test t outer apply
(select top 1 tprev.*
from test tprev
where tprev.meterid = t.meterid and
tprev.actualtimestamp < t.actualtimestamp
order by tprev.actualtimestamp desc
)tprev