How to use “Partition By” or “Max”? for SQL Server

I have a very similar question to what was asked here for an Oracle DB (but I have an SQL Server 2012). The example I have used as a starter is based on the answer given here.
What I have is these four columns:
[L2], [DateofReporting], [L3] and [ServerName]. Data is added to that table on more or less random days; when it is, the new rows always share the same [L2], [DateofReporting] and [L3] but have a different [ServerName].
Now I want to extract all the servers [ServerName] that were added last in each month and year, grouped by L2, L3 and the related month and year (taken from [DateofReporting]).
SELECT [ID],[L2],[DateofReporting],[L3],[ServerName]
FROM (
select *,
max([DateofReporting])
OVER (PARTITION BY YEAR([DateofReporting]),
Month([DateofReporting])) maxdate
from [EADATAGOV].[Governance].[ToDos]
)max_date
where [DateofReporting] = maxdate
The problem I am facing is that the returned data is incomplete, so there is obviously a bug in my statement. By now I can't see the forest for the trees. Could you please help me clean up that SQL statement? If there is a smarter way of doing it, I am open to suggestions.
I was thinking about using ROW_NUMBER() to mark the relevant entries and then doing a select on them, but I have never worked with that before.
thx Jan
example of output:
ID L2 DateofReporting L3 name
18214 Summer 2017-09-20 cloud BINHAS01105 <--
18215 Summer 2017-09-20 lightbulb BINHAS60276 <--
18217 Summer 2017-09-20 lightbulb CNAHAS62003 <--
15297 Summer 2017-09-15 cloud CINHAS01105
15298 Summer 2017-09-15 boat CINHAS60277
15300 Summer 2017-09-15 lightbulb DNAHAS62003
10512 Summer 2017-08-20 lightbulb DNAHAS62003 <--
The rows marked with an arrow are the ones I would expect to see in the result. Boat, for example, has no entry newer than the one from 09-15.
new approach:
Select [L2],
MAX([DateofReporting]) LDateOfTest
from [EADATAGOV].[Governance].[ToDos]
group by [L2], YEAR([DateofReporting]), Month([DateofReporting]) ,[DType]
having DType= 'test'
order by LDateOfTest desc, L2 desc
This gives me (correctly) the latest date for each L2 for every month. Now in theory I should be able to run another query on the very same table where L2 and LDateOfTest match.
My idea of a subselect does not work, as I can only pass one criterion, not two. I don't know how that would work; can you help me with the join(?)?

It's difficult to understand your request, since you didn't post any sample data (input).
As far as I understand it, maybe we can start from this query. Can you try it and let me know?
SELECT ID
,L2
,DATEOFREPORTING
,L3
,SERVERNAME
FROM(
SELECT ID
,L2
,DATEOFREPORTING
,L3
,SERVERNAME
,ROW_NUMBER() OVER (PARTITION BY SERVERNAME ORDER BY DATEOFREPORTING DESC) RN
FROM TODOS
) A
WHERE RN = 1;

Select [ID],[L2],[DateofReporting],[L3],[ServerName]
From(
Select [ID],[L2],[DateofReporting],[L3],[ServerName],
Row_NUmber() Over(Partition BY [ServerName],[L3] Order BY [DateofReporting] Desc) as Row_Num
from [EADATAGOV].[Governance].[ToDos]
) Temp
Where Row_Num = 1

That's the solution I have come up with after some hours of struggling. I had to completely reset my approach.
IF OBJECT_ID('tempdb..#tmp_table') IS NOT NULL DROP TABLE #tmp_table
Select [L2],
MAX([DateofReporting]) LDateOfTest
into #tmp_table --(L2t, LDateOfTest)
from [EADATAGOV].[Governance].[ToDos]
group by [L2], YEAR([DateofReporting]), Month([DateofReporting]) ,[DType]
having DType = 'test'
order by LDateOfTest desc, L2 desc
SELECT [ID]
,[EADATAGOV].[Governance].[ToDos].[L2] L2f
,YEAR([DateofReporting]) YoT, Month([DateofReporting]) MoT
,[L3]
,[ServerName]
FROM [EADATAGOV].[Governance].[ToDos]
right join #tmp_table tt on tt.L2 = [EADATAGOV].[Governance].[ToDos].[L2] and tt.LDateOfTest = ToDos.DateofReporting
where DType = 'test'
order by DateofReporting desc, L3 asc
DROP TABLE #tmp_table
It probably isn't the prettiest solution, but it gets me the results I was hoping for.
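For what it's worth, the temp table is probably not needed: the same "latest date per L2 and month" aggregate can be used as a derived table and joined back on both columns at once. Below is a minimal sketch of that idea. Assumptions: it runs on SQLite through Python's sqlite3 module rather than on SQL Server, strftime('%Y-%m', ...) stands in for YEAR()/MONTH(), the rows are the sample output from the question, and the DType filter is left out because the sample has no such column.

```python
import sqlite3

# SQLite stands in for SQL Server; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ToDos (ID INTEGER, L2 TEXT, DateofReporting TEXT, L3 TEXT, ServerName TEXT)"
)
conn.executemany(
    "INSERT INTO ToDos VALUES (?, ?, ?, ?, ?)",
    [
        (18214, "Summer", "2017-09-20", "cloud", "BINHAS01105"),
        (18215, "Summer", "2017-09-20", "lightbulb", "BINHAS60276"),
        (18217, "Summer", "2017-09-20", "lightbulb", "CNAHAS62003"),
        (15297, "Summer", "2017-09-15", "cloud", "CINHAS01105"),
        (15298, "Summer", "2017-09-15", "boat", "CINHAS60277"),
        (15300, "Summer", "2017-09-15", "lightbulb", "DNAHAS62003"),
        (10512, "Summer", "2017-08-20", "lightbulb", "DNAHAS62003"),
    ],
)

query = """
SELECT t.ID, t.L2, t.DateofReporting, t.L3, t.ServerName
FROM ToDos AS t
JOIN (
    -- latest reporting date per L2 and calendar month
    SELECT L2, MAX(DateofReporting) AS LDateOfTest
    FROM ToDos
    GROUP BY L2, strftime('%Y-%m', DateofReporting)
) AS latest
  ON latest.L2 = t.L2                       -- first join criterion
 AND latest.LDateOfTest = t.DateofReporting -- second join criterion
ORDER BY t.DateofReporting DESC, t.ID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On the sample rows this returns the four entries marked with an arrow in the question.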

Related

Fetch latest date records with group by

I am trying to fetch the latest record for each unique report_id (col_1_0_).
Ideally I would group by col_1_0_ and fetch the latest record of each group, using the timestamp column col_10_0_.
I searched this forum for a way to do it, but nothing I found worked for me. It would be helpful if anyone could point me to a similar thread or help me get the results.
col_1_0_|ppi_event_id|col_2_0_ |col_10_0_ |
--------|------------|--------------------------|-----------------------|
149056| 3249|Draft |2020-08-25 13:01:49.016|
149056| 3249|Submitted | 2020-08-25 13:10:22.01|
149056| 3249|Submitted to administrator|2020-08-25 13:12:39.367|
149056| 3249|Validated |2020-08-25 13:13:28.879|
149060| 3249|Submitted to administrator|2020-08-25 13:32:41.924|
The expected result is
col_1_0_|ppi_event_id|col_2_0_ |col_10_0_ |
--------|------------|--------------------------|-----------------------|
149056| 3249|Validated |2020-08-25 13:13:28.879|
149060| 3249|Submitted to administrator|2020-08-25 13:32:41.924|
Can anyone help in this regard?
Update: I have tried the solution mentioned below, but sometimes it shows the first record "Draft" rather than "Validated".
Is there any other option to try?
In Postgres, I would recommend using distinct on: that's a nice extension to the SQL standard, that was built exactly for the purpose you describe.
select distinct on (col_1_0_) t.*
from mytable t
order by col_1_0_, col_10_0_ desc
Traditional SQL format is to use row_number()
SELECT * FROM (
SELECT A.*,
ROW_NUMBER() OVER( PARTITION BY col_1_0_ ORDER BY col_10_0_ DESC) AS RN
FROM TABLE1 A ) X WHERE RN = 1;
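The sort direction inside OVER(...) decides which row of each group gets RN = 1, which is the likely reason for seeing "Draft" instead of "Validated". A small sketch to check this, assuming SQLite (via Python's sqlite3) as a stand-in database, timestamps stored as text, and the sample rows from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (col_1_0_ INTEGER, ppi_event_id INTEGER, col_2_0_ TEXT, col_10_0_ TEXT)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [
        (149056, 3249, "Draft", "2020-08-25 13:01:49.016"),
        (149056, 3249, "Submitted", "2020-08-25 13:10:22.010"),
        (149056, 3249, "Submitted to administrator", "2020-08-25 13:12:39.367"),
        (149056, 3249, "Validated", "2020-08-25 13:13:28.879"),
        (149060, 3249, "Submitted to administrator", "2020-08-25 13:32:41.924"),
    ],
)


def first_per_report(direction):
    """Return (report id, status) of the row numbered 1 in each group."""
    if direction not in ("ASC", "DESC"):  # only these literals are interpolated
        raise ValueError(direction)
    sql = f"""
    SELECT col_1_0_, col_2_0_ FROM (
        SELECT a.*,
               ROW_NUMBER() OVER (PARTITION BY col_1_0_
                                  ORDER BY col_10_0_ {direction}) AS rn
        FROM table1 AS a
    ) AS x
    WHERE rn = 1
    ORDER BY col_1_0_
    """
    return conn.execute(sql).fetchall()


oldest = first_per_report("ASC")   # rn = 1 is the earliest row per report
latest = first_per_report("DESC")  # rn = 1 is the most recent row per report
print(oldest)
print(latest)
```

With ascending order report 149056 comes back as "Draft"; with descending order it comes back as "Validated", matching the expected result.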

I am stuck on getting a previous value

I have been working on this SQL code for a bit and I cannot get it to display the way I want. I have an operation where we send parts outside of our business, but there is no timestamp for when that operation was sent out.
I am taking the previous operation's last labor date and the purchase order creation date to try to find out how long it takes that department to issue a purchase order.
I have tried adding LAST_VALUE to my query. I have even played with LAG and couldn't get anything but errors.
SELECT
JobOpDtl.JobNum,
JobOpDtl.OprSeq,
JobOpDtl.OpDtlDesc,
LastValue.ClockInDate,
LastValue.LastValue
FROM Erp.JobOpDtl
LEFT OUTER JOIN Erp.LaborDtl ON
LaborDtl.JobNum = JobOpDtl.JobNum
and LaborDtl.OprSeq = JobOpDtl.OprSeq
LEFT OUTER JOIN (
Select
LaborDtl.JobNum,
LaborDtl.OprSeq,
MAX(LaborDtl.ClockInDate) as ClockInDate,
LAST_VALUE (LaborDtl.ClockInDate) OVER (PARTITION BY OprSeq ORDER BY JobNum) as LastValue
FROM Erp.LaborDtl
GROUP BY
LaborDtl.JobNum,
LaborDtl.OprSeq,
LaborDtl.ClockInDate
) as LastValue ON
JobOpDtl.JobNum = LastValue.JobNum
and JobOpDtl.OprSeq = LastValue.OprSeq
WHERE JobOpDtl.JobNum = 'PA8906'
GROUP BY
JobOpDtl.JobNum,
LastValue.OprSeq,
JobOpDtl.OpDtlDesc,
JobOpDtl.OprSeq,
LastValue.ClockInDate,
LastValue.LastValue
No errors, it is just not displaying the way I want.
I would like it to display each OprSeq with the previous OprSeq's last transaction date.
The basic function you want is LAG (as you suggested) but you need to wrap it in a COALESCE. Here is a sample code that illustrates the concept
SELECT * INTO #Jobs
FROM (VALUES ('P1','Step1', '2019-04-01'), ('P1','Step2', '2019-04-02')
, ('P1','Step3', '2019-04-03'), ('P1','Step4', NULL),
('P2','Step1', '2019-04-01'), ('P2','Step2', '2019-04-03')
, ('P2','Step3', '2019-04-06'), ('P2','Step4', NULL)
) as JobDet(JobNum, Descript, LastDate)
SELECT *
, COALESCE( LastDate, LAG(LastDate,1)
OVER(PARTITION BY JobNum
ORDER BY COALESCE(LastDate,GETDATE()))) as LastValue
FROM #Jobs
ORDER BY JobNum, Descript
DROP TABLE #Jobs
To apply it to your specific problem, I'd suggest using a COMMON TABLE EXPRESSION that replaces LastValue and using that instead of the raw table for your queries.
Your example picture doesn't match any tables you reference in your code (it would help us significantly if you included code that created temp tables matching those referenced in your code) so this is a guess, but it will be something like this:
;WITH cteJob as (
SELECT JobNum, OprSeq, OpDtlDesc, ClockInDate
, COALESCE( LastValue, LAG(LastValue,1)
OVER(PARTITION BY JobNum
ORDER BY COALESCE(LastValue,GETDATE()))) as LastValue
FROM Erp.JobOpDtl
) SELECT *
FROM cteJob as J
LEFT OUTER JOIN Erp.LaborDtl as L
on J.JobNum = L.JobNum
AND J.OprSeq = L.OprSeq
BTW, if you clean up your question to provide a better example of your data (i.e. SELECT INTO statements like the one at the start of my answer, producing tables that correspond to the tables in your code, instead of an image of an Excel file), I might be able to get you closer to what you need. Hopefully this is enough to get you on the right track; it's the best I can do with what you've provided so far.
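To make the LAG idea concrete, here is a minimal sketch that first reduces the labor rows to one "last clock-in" per operation and then pulls the previous operation's value onto each row. Assumptions: SQLite (through Python's sqlite3) stands in for SQL Server, the column names JobNum, OprSeq and ClockInDate are taken from the question, and the labor rows themselves are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE LaborDtl (JobNum TEXT, OprSeq INTEGER, ClockInDate TEXT)")
conn.executemany(
    "INSERT INTO LaborDtl VALUES (?, ?, ?)",
    [
        # hypothetical labor transactions for one job
        ("PA8906", 10, "2019-04-01"),
        ("PA8906", 10, "2019-04-02"),
        ("PA8906", 20, "2019-04-03"),
        ("PA8906", 20, "2019-04-05"),
        ("PA8906", 30, "2019-04-08"),
    ],
)

query = """
WITH LastLabor AS (
    -- one row per operation with its last labor date
    SELECT JobNum, OprSeq, MAX(ClockInDate) AS LastClockIn
    FROM LaborDtl
    GROUP BY JobNum, OprSeq
)
SELECT JobNum, OprSeq, LastClockIn,
       -- last labor date of the preceding operation within the same job
       LAG(LastClockIn) OVER (PARTITION BY JobNum ORDER BY OprSeq) AS PrevOpLastClockIn
FROM LastLabor
ORDER BY JobNum, OprSeq
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The first operation of a job has no predecessor, so its PrevOpLastClockIn is NULL; that is the case the COALESCE in the answer is meant to cover.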

TSQL - Reduce the number of records with intelligence - patterns (crash impact data)

I have some data that contains measurements from crash impact tests.
When the object is not moving, the measurements contain many rows of the same data; when the object is moving and shaking, they can register quite big fluctuations.
Problem: I have hundreds of millions of lines of this data, and to use it in reporting (mostly plotting) I have to find a way to simplify everything and especially reduce the number of records.
Sometimes I have 20 times exactly the same value (=ChannelValue)
An example of the data is the following:
idMetaData;TimeStamp;SampleNumber;ChannelValue
3;0,5036500;12073;0.4573468975
3;0,5037000;12074;0.4418814526
3;0,5037500;12075;0.4109505628
3;0,5038000;12076;0.4109505628
3;0,5038500;12077;0.4264160077
3;0,5038999;12078;0.4573468975
3;0,5039499;12079;0.4573468975
3;0,5039999;12080;0.4109505628
3;0,5040500;12081;0.3336233382
3;0,5041000;12082;0.2408306686
3;0,5041500;12083;0.1789688889
3;0,5042000;12084;0.1789688889
3;0,5042500;12085;0.2253652237
3;0,5042999;12086;0.3026924483
3;0,5043499;12087;0.3645542280
3;0,5044000;12088;0.3954851178
3;0,5044500;12089;0.3645542280
3;0,5045000;12090;0.3026924483
3;0,5045500;12091;0.2253652237
3;0,5046000;12092;0.1635034440
3;0,5046499;12093;0.1325725541
3;0,5046999;12094;0.1480379991
3;0,5047500;12095;0.1789688889
3;0,5048000;12096;0.1944343338
3;0,5048500;12097;0.2098997788
3;0,5049000;12098;0.1944343338
3;0,5049500;12099;0.1635034440
3;0,5049999;12100;0.1171071092
3;0,5050499;12101;0.0861762194
3;0,5051000;12102;0.0707107744
3;0,5051500;12103;0.0707107744
3;0,5052000;12104;0.0861762194
3;0,5052500;12105;0.1171071092
3;0,5053000;12106;0.1635034440
idMetaData;TimeStamp;SampleNumber;ChannelValue
50;0,8799999;19600;-0.7106432894
50;0,8800499;19601;-0.7484265845
50;0,8801000;19602;-0.7232377211
50;0,8801500;19603;-0.6098878356
50;0,8802000;19604;-0.6098878356
50;0,8802500;19605;-0.6476711307
50;0,8802999;19606;-0.7232377211
50;0,8803499;19607;-0.7988043114
50;0,8803999;19608;-0.8617764701
50;0,8804500;19609;-0.8491820384
50;0,8805000;19610;-0.8617764701
50;0,8805500;19611;-0.7988043114
50;0,8806000;19612;-0.8239931749
50;0,8806499;19613;-0.7988043114
50;0,8806999;19614;-0.7736154480
50;0,8807499;19615;-0.6602655625
50;0,8807999;19616;-0.5972934038
50;0,8808500;19617;-0.6602655625
50;0,8809000;19618;-0.7484265845
50;0,8809500;19619;-0.8365876066
50;0,8809999;19620;-0.7862098797
50;0,8810499;19621;-0.8113987432
50;0,8810999;19622;-0.7988043114
50;0,8811499;19623;-0.6980488576
50;0,8812000;19624;-0.7232377211
50;0,8812500;19625;-0.7484265845
50;0,8813000;19626;-0.7232377211
50;0,8813500;19627;-0.8239931749
50;0,8813999;19628;-0.8491820384
50;0,8814499;19629;-0.8617764701
50;0,8814999;19630;-0.8365876066
50;0,8815500;19631;-0.8365876066
50;0,8816000;19632;-0.7988043114
50;0,8816500;19633;-0.8113987432
50;0,8817000;19634;-0.8113987432
50;0,8817499;19635;-0.7736154480
50;0,8817999;19636;-0.7232377211
50;0,8818499;19637;-0.6728599942
50;0,8819000;19638;-0.7232377211
50;0,8819500;19639;-0.7610210163
50;0,8820000;19640;-0.7106432894
50;0,8820500;19641;-0.6602655625
50;0,8820999;19642;-0.6602655625
50;0,8821499;19643;-0.6854544259
50;0,8821999;19644;-0.7736154480
50;0,8822500;19645;-0.8113987432
50;0,8823000;19646;-0.8869653335
50;0,8823500;19647;-0.8743709018
50;0,8824000;19648;-0.7988043114
50;0,8824499;19649;-0.8491820384
50;0,8824999;19650;-0.8239931749
50;0,8825499;19651;-0.8239931749
50;0,8825999;19652;-0.7232377211
50;0,8826500;19653;-0.6854544259
50;0,8827000;19654;-0.6728599942
50;0,8827500;19655;-0.6854544259
50;0,8827999;19656;-0.7232377211
50;0,8828499;19657;-0.7232377211
50;0,8828999;19658;-0.6980488576
50;0,8829499;19659;-0.6980488576
50;0,8830000;19660;-0.7106432894
50;0,8830500;19661;-0.6854544259
50;0,8831000;19662;-0.7484265845
50;0,8831499;19663;-0.7484265845
50;0,8831999;19664;-0.7736154480
50;0,8832499;19665;-0.7610210163
50;0,8832999;19666;-0.7610210163
50;0,8833500;19667;-0.7988043114
50;0,8834000;19668;-0.8617764701
50;0,8834500;19669;-0.9121541970
50;0,8835000;19670;-0.8869653335
50;0,8835499;19671;-0.8743709018
50;0,8835999;19672;-0.9121541970
50;0,8836499;19673;-0.8491820384
50;0,8837000;19674;-0.7988043114
50;0,8837500;19675;-0.7736154480
50;0,8838000;19676;-0.7106432894
50;0,8838500;19677;-0.6980488576
50;0,8838999;19678;-0.7484265845
50;0,8839499;19679;-0.8491820384
50;0,8839999;19680;-0.8491820384
50;0,8840500;19681;-0.7610210163
50;0,8841000;19682;-0.7106432894
50;0,8841500;19683;-0.7232377211
50;0,8842000;19684;-0.7962098797
50;0,8842499;19685;-0.7358321528
50;0,8842999;19686;-0.7232377211
50;0,8843499;19687;-0.7484265845
50;0,8844000;19688;-0.6728599942
50;0,8844500;19689;-0.6854544259
50;0,8845000;19690;-0.7106432894
50;0,8845500;19691;-0.7232377211
50;0,8845999;19692;-0.7862098797
50;0,8846499;19693;-0.7862098797
idMetaData;TimeStamp;SampleNumber;ChannelValue
15;0,3148000;8296;1.5081626404
15;0,3148500;8297;1.5081626404
15;0,3149000;8298;1.5727382554
15;0,3149500;8299;1.5081626404
15;0,3150000;8300;1.4920187367
15;0,3150500;8301;1.4435870254
15;0,3151000;8302;1.4274431217
15;0,3151500;8303;1.5243065442
15;0,3152000;8304;1.4920187367
15;0,3152500;8305;1.5081626404
15;0,3153000;8306;1.4920187367
15;0,3153500;8307;1.5565943516
15;0,3154000;8308;1.5081626404
15;0,3154500;8309;1.5404504479
15;0,3155000;8310;1.5081626404
15;0,3155500;8311;1.5727382554
15;0,3156000;8312;1.5404504479
15;0,3156500;8313;1.3951553142
15;0,3157000;8314;1.4758748329
15;0,3157500;8315;1.4435870254
15;0,3158000;8316;1.4920187367
15;0,3158500;8317;1.4920187367
15;0,3159000;8318;1.5081626404
15;0,3159500;8319;1.4597309292
15;0,3160000;8320;1.4274431217
15;0,3160500;8321;1.4274431217
15;0,3161000;8322;1.4597309292
15;0,3161500;8323;1.5565943516
15;0,3162000;8324;1.5888821591
15;0,3162500;8325;1.5565943516
15;0,3163000;8326;1.5243065442
15;0,3163500;8327;1.5404504479
15;0,3164000;8328;1.5404504479
15;0,3164500;8329;1.5404504479
15;0,3165000;8330;1.5404504479
I want to reduce the number of records by a factor of 10 or 20.
One solution would be to keep the average of 20 rows, but then there is one problem: when there is a peak, it will 'evaporate' in the average.
What I'd need is an average of 20 rows ('ChannelValue'), but when a value is a 'peak' (definition: it differs by more than 10%, positive or negative, from the last (2?) value(s)), then for this one do not take the average but the peak value, and from there continue the averages. This is the intelligence I mean in the title.
I could also use some sort of 'distinct' logic that would reduce the number of records by a factor of 8 to 10.
I read about the NTILE function, but this is all new to me.
Partition by idMetadata, order by id (there is an id column which I did not include right now)
Thanks so much in advance!
Here's one way. In SQL Server 2012 I'd use LEAD() or LAG(), but since you are on 2008 we can use ROW_NUMBER() with a CTE and then limit on the variation.
declare @test table (idMetaData int, TimeStamp varchar(64), SampleNumber bigint, ChannelValue decimal(16,10))
insert into @test
values
(3,'0,5036500',12073,0.4573468975),
(3,'0,5037000',12074,0.4418814526),
(3,'0,5037500',12075,0.4109505628),
(3,'0,5038000',12076,0.4109505628),
(3,'0,5038500',12077,0.4264160077),
(3,'0,5038999',12078,0.4573468975),
(3,'0,5039499',12079,0.4573468975),
(3,'0,5039999',12080,0.4109505628),
(3,'0,5040500',12081,0.3336233382),
(3,'0,5041000',12082,0.2408306686),
(3,'0,5041500',12083,0.1789688889),
(3,'0,5042000',12084,0.1789688889)
--set the minimum variation you want to keep. Rows that differ from the next row by this much or less are removed
declare @variation decimal(16,10) = 0.0000000010
--apply an order with row_number()
;with cte as(
select
idMetaData
,TimeStamp
,SampleNumber
,ChannelValue
,row_number() over (partition by idMetadata order by SampleNumber) as RN
from @test),
--self join to itself adding the next row (of the same idMetaData) as additional columns
cte2 as(
select
c.*
,c2.TimeStamp as C2TimeStamp
,c2.SampleNumber as C2SampleNumber
,c2.ChannelValue as C2ChannelValue
from cte c
left join cte c2 on c2.idMetaData = c.idMetaData and c2.rn = c.rn + 1)
--only return the results where the variation is met. Change the variation to see this in action
--abs() so that changes in either direction count
select
idMetaData
,TimeStamp
,SampleNumber
,ChannelValue
from
cte2
where
abs(ChannelValue - C2ChannelValue) > @variation or C2ChannelValue is null
This doesn't take an "average" which would have to be a running average but what it allows you to do is to use a variance measurement to say that any consecutive measurements which only vary by n amount, treat as a single measurement. The higher the variance you choose, the more rows that will be "removed" or treated equally. It's a way to cluster your points in order to remove some noise without using something like K-Means which is hard in SQL.
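For comparison, this is roughly what the same thresholding looks like with LAG(), which is available from SQL Server 2012 on. It is only a sketch under a few assumptions: it runs on SQLite through Python's sqlite3 module, the table name samples is made up, the rows are the first twelve from the question, and each row is compared with the previous one, so the first row of a run of near-equal values is kept rather than the last.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE samples (idMetaData INTEGER, SampleNumber INTEGER, ChannelValue REAL)"
)
conn.executemany(
    "INSERT INTO samples VALUES (3, ?, ?)",
    [
        (12073, 0.4573468975), (12074, 0.4418814526), (12075, 0.4109505628),
        (12076, 0.4109505628), (12077, 0.4264160077), (12078, 0.4573468975),
        (12079, 0.4573468975), (12080, 0.4109505628), (12081, 0.3336233382),
        (12082, 0.2408306686), (12083, 0.1789688889), (12084, 0.1789688889),
    ],
)

variation = 0.0000000010  # minimum change that is worth keeping

query = """
WITH with_prev AS (
    SELECT idMetaData, SampleNumber, ChannelValue,
           LAG(ChannelValue) OVER (PARTITION BY idMetaData
                                   ORDER BY SampleNumber) AS PrevValue
    FROM samples
)
SELECT SampleNumber, ChannelValue
FROM with_prev
-- keep the first row of each series and every row that moved by more than the threshold
WHERE PrevValue IS NULL OR ABS(ChannelValue - PrevValue) > ?
ORDER BY SampleNumber
"""
kept = conn.execute(query, (variation,)).fetchall()
print(len(kept), "of 12 rows kept")
```

With this tiny threshold only exact repeats (samples 12076, 12079 and 12084) are dropped; raising the threshold removes more rows.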
Just for fun. I modified a stored procedure which generates dynamic stats for any table/query/measure. This has been tailored to be stand-alone.
This will generate a series of analytical items for groups of 10 ... an arbitrary value.
Just a side note: If there is no true MODE, ModeR1 and ModeR2 will represent the series range. When ModeR1 = ModeR2 then that would be the true mode.
Example
;with cteBase as (Select GroupBy = [idMetaData]
,Item = Row_Number() over (Partition By [idMetaData] Order By SampleNumber) / 10
,RowNr = Row_Number() over (Partition By [idMetaData] Order By SampleNumber)
,Measure = ChannelValue
,TimeStamp
,SampleNumber
From #YourTable
),
cteMean as (Select GroupBy,Item,Mean=Avg(Measure),Rows=Count(*),MinRow=min(RowNr),MaxRow=max(RowNr) From cteBase Group By GroupBy,Item),
cteMedn as (Select GroupBy,Item,MedRow1=ceiling(Rows/2.0),MedRow2=ceiling((Rows+1)/2.0) From cteMean),
cteMode as (Select GroupBy,Item,Mode=Measure,ModeHits=count(*),ModeRowNr=Row_Number() over (Partition By GroupBy,Item Order By Count(*) Desc) From cteBase Group By GroupBy,Item,Measure)
Select idMetaData = A.GroupBy
,Bin = A.Item+1
,TimeStamp1 = min(TimeStamp)
,TimeStamp2 = max(TimeStamp)
,SampleNumber1 = min(SampleNumber)
,SampleNumber2 = max(SampleNumber)
,Records = count(*)
,StartValue = sum(case when RowNr=B.MinRow then Measure end)
,EndValue = sum(case when RowNr=B.MaxRow then Measure end)
,UniqueVals = count(Distinct A.Measure)
,MinVal = min(A.Measure)
,MaxVal = max(A.Measure)
,Mean = max(B.Mean)
,Median = isnull(Avg(IIF(RowNr between MedRow1 and MedRow2,Measure,null)),avg(A.Measure))
,ModeR1 = isnull(max(IIf(ModeHits>1,D.Mode,null)),min(A.Measure))
,ModeR2 = isnull(max(IIf(ModeHits>1,D.Mode,null)),max(A.Measure))
,StdDev = Stdev(A.Measure)
From cteBase A
Join cteMean B on (A.GroupBy=B.GroupBy and A.Item=B.Item)
Join cteMedn C on (A.GroupBy=C.GroupBy and A.Item=C.Item)
Join cteMode D on (A.GroupBy=D.GroupBy and A.Item=D.Item and ModeRowNr=1)
Group By A.GroupBy,A.Item
Order By A.GroupBy,A.Item

Select query with max date

I have this query
SQL query: selecting by branch and machine code, order by branch and date
SELECT
mb.machine_id AS 'MachineId',
MAX(mb.date) AS 'Date',
mi.branch_id AS 'BranchId',
b.branch AS 'Branch',
b.branch_code AS 'BranchCode'
FROM
dbo.machine_beat mb
LEFT JOIN dbo.machine_ids mi
ON mb.machine_id = mi.machine_id
LEFT JOIN dbo.branches b
ON mi.branch_id = b.lookup_key
GROUP BY
mb.machine_id,
mi.branch_id,
b.branch,
b.branch_code
ORDER BY
b.branch, [Date] DESC
Query result:
|==========|=======================|=========|==========|==========|
|MachineId |Date |BranchId |Branch |BranchCode|
|==========|=======================|=========|==========|==========|
|SS10000005|2014-03-31 19:10:17.110|3 |Mamamama |MMMM |
|SS10000043|2014-03-31 17:16:32.760|3 |Mamamama |MMMM |
|SS10000005|2014-02-17 14:58:42.523|3 |Mamamama |MMMM |
|==================================================================|
My problem: how do I select only the most recently updated machine code? Expected query result:
|==========|=======================|=========|==========|==========|
|MachineId |Date |BranchId |Branch |BranchCode|
|==========|=======================|=========|==========|==========|
|SS10000005|2014-03-31 19:10:17.110|3 |Mamamama |MMMM |
|==================================================================|
Update
I created sqlfiddle. I also added data, aside from MMMM. I need the updated date for each branch. So probably, my result will be:
|==========|=======================|=========|==========|==========|
|MachineId |Date |BranchId |Branch |BranchCode|
|==========|=======================|=========|==========|==========|
|SS10000343|2014-06-03 13:43:40.570|1 |Cacacaca |CCCC |
|SS30000033|2014-03-31 18:59:42.153|8 |Fafafafa |FFFF |
|SS10000005|2014-03-31 19:10:17.110|3 |Mamamama |MMMM |
|==================================================================|
Try using Row_number with partition by
select * from
(
SELECT
mb.machine_id AS 'MachineId',
mb.date AS 'Date',
mi.branch_id AS 'BranchId',
b.branch AS 'Branch',
b.branch_code AS 'BranchCode',rn=row_number()over(partition by mb.machine_id order by mb.date desc)
FROM
dbo.machine_beat mb
LEFT JOIN dbo.machine_ids mi
ON mb.machine_id = mi.machine_id
LEFT JOIN dbo.branches b
ON mi.branch_id = b.lookup_key
WHERE
branch_code = 'MMMM'
/*
GROUP BY
mb.machine_id,
mi.branch_id,
b.branch,
b.branch_code
*/
)x
where x.rn=1
The previous answer is looking in the right direction - this is a greatest-n-per-group question. Yours is more complicated than the usual because the greatest portion is coming from a different table than the group portion. The only issue with that answer is that you hadn't provided sufficient information to finish it correctly.
The following solves the problem:
WITH Most_Recent_Beat AS (SELECT Machine.branch_id,
Beat.machine_id, Beat.date,
ROW_NUMBER() OVER(PARTITION BY Machine.branch_id
ORDER BY Beat.date DESC) AS rn
FROM machine_ids Machine
JOIN machine_beat Beat
ON Beat.machine_id = Machine.machine_id)
SELECT Beat.machine_id, Beat.date,
Branches.lookup_key, Branches.branch, Branches.branch_code
FROM Branches
JOIN Most_Recent_Beat Beat
ON Beat.branch_id = Branches.lookup_key
AND Beat.rn = 1
ORDER BY Branches.branch, Beat.date DESC
(and corrected SQL Fiddle for testing. You shouldn't be using a different RDBMS for the example, especially as there were syntax errors for the db you say you're using.)
Which yields your expected results.
So what's going on here? The key is the ROW_NUMBER()-function line. This function itself simply generates a number series. The OVER(...) clause defines what's known as a window, over which the function will be run. PARTITION BY is akin to GROUP BY - every time a new group occurs (new Machine.branch_id value), the function restarts. The ORDER BY inside the parentheses simply says that, per group, the function should be run on the entries in that order. So, the greatest date (most recent, assuming all dates are in the past) gets 1, the next 2, etc.
This is done in a CTE here (it could also be done as part of a subquery table-reference) because only the most recent date is required - where the generated row number is 1; as SQL Server doesn't allow you to put SELECT-clause aliases into the WHERE clause, it needs to be wrapped in another level to be able to reference it that way.
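To see the numbering itself, it can help to look at the rows before the rn = 1 filter is applied. The sketch below does that and then applies the filter in an outer query. Assumptions: SQLite (through Python's sqlite3) is used instead of SQL Server, the two tables are cut down to the columns needed, and the rows and the machine-to-branch mapping are taken from the result tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE machine_ids  (machine_id TEXT, branch_id INTEGER);
CREATE TABLE machine_beat (machine_id TEXT, date TEXT);
INSERT INTO machine_ids VALUES
    ('SS10000005', 3), ('SS10000043', 3), ('SS10000343', 1), ('SS30000033', 8);
INSERT INTO machine_beat VALUES
    ('SS10000005', '2014-03-31 19:10:17.110'),
    ('SS10000043', '2014-03-31 17:16:32.760'),
    ('SS10000005', '2014-02-17 14:58:42.523'),
    ('SS10000343', '2014-06-03 13:43:40.570'),
    ('SS30000033', '2014-03-31 18:59:42.153');
""")

# Number the beats inside each branch, newest first.
numbered_sql = """
SELECT mi.branch_id, mb.machine_id, mb.date,
       ROW_NUMBER() OVER (PARTITION BY mi.branch_id
                          ORDER BY mb.date DESC) AS rn
FROM machine_ids AS mi
JOIN machine_beat AS mb ON mb.machine_id = mi.machine_id
"""
numbered = conn.execute(numbered_sql + " ORDER BY branch_id, rn").fetchall()
for row in numbered:
    print(row)

# The alias rn is only visible one level up, hence the wrapping CTE.
latest = conn.execute(
    "WITH numbered AS (" + numbered_sql + ") "
    "SELECT branch_id, machine_id, date FROM numbered WHERE rn = 1 ORDER BY branch_id"
).fetchall()
print(latest)
```

The counter restarts at 1 for every branch_id, and keeping only rn = 1 leaves one row per branch: the most recent beat.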

SQL Server Join with Latest 2 Entries

I know the title of the post is bad but hear me out. A question like this arose the other day at work, and while I found a way around it, the problem still haunts me.
Let's assume Stack Overflow has only 3 tables.
Users ( username )
Comments ( comment, creationdate )
UsersCommentsJoin , this is the join table between the first 2 tables.
Now let's say I want to make a query that returns all the users with their 2 most recent comments. So the result set would look like this.
|username| most recent comment | second most recent comment|
How on earth do I go about creating that query ? I solved this problem earlier by simply only returning the most recent comment and not even trying to get the second one, and boy, let me tell you it seemed a WHOLE lot more involved than when I thought with subselects, TOP and other weird DB acrobatics.
Bonus Round Why do some queries which seem easy logically, turn out to be monster queries, at least from my rookie perspective ?
EDIT: I was using an MS SQL server.
You can use a crosstab query pivoting on ROW_NUMBER
WITH UC
AS (SELECT UCJ.userId,
C.comment,
ROW_NUMBER() OVER (PARTITION BY userId
ORDER BY creationdate DESC) RN
FROM UsersCommentsJoin UCJ
JOIN Comments C
ON C.commentId = UCJ.commentId)
SELECT username,
MAX(CASE
WHEN RN = 1 THEN comment
END) AS MostRecent,
MAX(CASE
WHEN RN = 2 THEN comment
END) AS SecondMostRecent
FROM Users U
JOIN UC
ON UC.userId = U.userId
WHERE UC.RN <= 2
GROUP BY UC.userId, username
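A quick way to try the pivot is to run it against a few rows. This is only a sketch: it uses SQLite through Python's sqlite3 module instead of SQL Server, the key columns userId and commentId are assumptions (the question only lists username, comment and creationdate), and the users and comments are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (userId INTEGER, username TEXT);
CREATE TABLE Comments (commentId INTEGER, comment TEXT, creationdate TEXT);
CREATE TABLE UsersCommentsJoin (userId INTEGER, commentId INTEGER);
INSERT INTO Users VALUES (1, 'alice'), (2, 'bob');
INSERT INTO Comments VALUES
    (1, 'first',  '2021-01-01'),
    (2, 'second', '2021-01-02'),
    (3, 'third',  '2021-01-03'),
    (4, 'only',   '2021-02-01');
INSERT INTO UsersCommentsJoin VALUES (1, 1), (1, 2), (1, 3), (2, 4);
""")

query = """
WITH UC AS (
    -- number each user's comments, newest first
    SELECT UCJ.userId, C.comment,
           ROW_NUMBER() OVER (PARTITION BY UCJ.userId
                              ORDER BY C.creationdate DESC) AS RN
    FROM UsersCommentsJoin AS UCJ
    JOIN Comments AS C ON C.commentId = UCJ.commentId
)
SELECT U.username,
       MAX(CASE WHEN RN = 1 THEN comment END) AS MostRecent,
       MAX(CASE WHEN RN = 2 THEN comment END) AS SecondMostRecent
FROM Users AS U
JOIN UC ON UC.userId = U.userId
WHERE UC.RN <= 2
GROUP BY U.userId, U.username
ORDER BY U.username
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

A user with a single comment still appears, with NULL in the second column, because the CASE for RN = 2 never matches. Users with no comments at all drop out of the inner join; a LEFT JOIN with the RN condition moved into the ON clause would keep them.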