SQL select distinct row by a column's highest value - sql

I am having an issue trying to select one row per city name. This is the following collection I am getting:
This is my query so far:
select pl.PlaceId,
pl.Name,
pop.NumberOfPeople,
pop.Year
from dbo.Places pl
inner join dbo.Populations pop
on pop.PlaceId = pl.PlaceId
where pop.NumberOfPeople >= 1000
and pop.NumberOfPeople <= 99999
I am trying to get it to where it only selects a city one time, but uses the most recent date. So in the above picture, I would only see Abbeville for 2016 and not 2015. I believe I need to do either a group by or do a sub query to flatten the results. If anybody has any advice on how I can handle this, it will be greatly appreciated.

Assuming you are using SQL Server, you can use ROW_NUMBER():
;with cte
as
(select pl.PlaceId,
pl.Name,
pop.NumberOfPeople,
pop.Year,
row_number() over(partition by pl.Name order by year desc) as rownum
from dbo.Places pl
inner join dbo.Populations pop
on pop.PlaceId = pl.PlaceId
where pop.NumberOfPeople >= 1000
and pop.NumberOfPeople <= 99999
)
select * from cte where rownum=1

The following query serves the purpose.
CREATE TABLE #TEMP_TEST
(
PlaceId INT,
Name VARCHAR(50),
NumberOfPeople INT,
YEAR INT
)
INSERT INTO #TEMP_TEST
SELECT 1,'Abbeville',2603,2016
UNION
SELECT 5,'Alabester',32948,2016
UNION
SELECT 9,'Aubum',63118,2016
UNION
SELECT 1,'Abbeville',2402,2015
UNION
SELECT 5,'Alabester',67902,2017
SELECT PlaceId, Name, NumberOfPeople, YEAR FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY PlaceId ORDER BY YEAR DESC) RNO,
PlaceId, Name, NumberOfPeople, YEAR
FROM #TEMP_TEST
)T
WHERE RNO = 1
DROP TABLE #TEMP_TEST
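The question also mentions a group by or subquery as a possible route. A minimal sketch of that alternative is below, joining each place to its MAX(Year). It runs against an in-memory SQLite database through Python's sqlite3 module purely so it can be executed anywhere; SQLite standing in for SQL Server, the table names without the dbo schema, and the sample rows (borrowed from the temp-table answer above) are all assumptions.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; sample rows mirror the answer above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Places (PlaceId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Populations (PlaceId INTEGER, NumberOfPeople INTEGER, Year INTEGER);
INSERT INTO Places VALUES (1, 'Abbeville'), (5, 'Alabester'), (9, 'Aubum');
INSERT INTO Populations VALUES
    (1, 2603, 2016), (1, 2402, 2015),
    (5, 32948, 2016), (5, 67902, 2017),
    (9, 63118, 2016);
""")

# Find the latest qualifying year per place, then join back to fetch that row.
latest_rows = conn.execute("""
SELECT pl.PlaceId, pl.Name, pop.NumberOfPeople, pop.Year
FROM Places pl
INNER JOIN Populations pop ON pop.PlaceId = pl.PlaceId
INNER JOIN (
    SELECT PlaceId, MAX(Year) AS MaxYear
    FROM Populations
    WHERE NumberOfPeople BETWEEN 1000 AND 99999
    GROUP BY PlaceId
) latest ON latest.PlaceId = pop.PlaceId AND latest.MaxYear = pop.Year
WHERE pop.NumberOfPeople BETWEEN 1000 AND 99999
ORDER BY pl.PlaceId
""").fetchall()

for row in latest_rows:
    print(row)
```

Unlike ROW_NUMBER(), this form returns more than one row for a place if it has several qualifying rows in its latest year.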

Related

Turn these temp tables into one longer subquery (can't use Temp tables in Power BI)

Currently I have created these temp tables to get the desired output I need. However, Power BI doesn't allow the use of temp tables so I need to get this all into 1 query using inner selects.
drop table if exists #RowNumber
Select Date, ID, ListID
, row_number() over (partition by ID order by ID) as rownum
into #RowNumber
from Table
where Date= cast(getdate()-1 as date)
group by Date, ID, ListID
order by ID
drop table if exists #1stListIDs
select ListID as FirstID, ID, Date
into #1stListIDs
from #RowNumber
where rownum = 1
drop table if exists #2ndlistids
Select ListID as SecondListID, ID, Date
into #2ndlistids
from #RowNumber
where rownum = 2
--Joins the Two Tables back together to allow the listids to be in the same row
drop table if exists #FinalTableWithTwoListIDs
select b.FirstID as FirstListID, a.SecondListID, a.ID, a.Date
into #FinalTableWithTwoListIDs
from #2ndlistids a
join #1stListIDs b on a.ID= b.ID
order by a.ID
This code is simple and straightforward. However, I can't seem to figure out how to do it using a subquery. Here is what I have. It works for the FirstListID select statement, but not the SecondListID portion. I believe this is because you can't reference the innermost select statement from multiple different outer select statements, but I could be wrong.
Select a.ListId as SecondListID, a.ID, a.Date
from (
select a.ListId as FirstListID, a.ID, a.Date
from (
Select Date, ID, ListId
, row_number() over (partition by ID order by ID) as rownum
from Table
where Date = cast(getdate()-1 as date)
group by Date, ID, ListId
order by ID) a
where a.rownum = 1) b
where a.rownum = 2) c
Just to show, for completeness, how you could use CTEs to replace the #temp tables, it would be something along the lines of
with RowNumber as (
select Date, ID, ListID
, row_number() over (partition by ID order by ID) as rownum
from Table
where Date= cast(dateadd(day,-1,getdate()) as date)
group by Date, ID, ListID
),
FirstListIDs as (
select ListID as FirstID, ID, Date
from RowNumber
where rownum = 1
),
SecondListIDs as (
select ListID as SecondID, ID, Date
from RowNumber
where rownum = 2
)
select f.FirstID, s.SecondID, s.ID, s.Date
from SecondListIDs s
join FirstListIDs f on s.ID=f.ID
order by s.ID
Note the use of dateadd, which is recommended over the ambiguous date +/- value (where the value is assumed to be days), and, where relevant, meaningful table aliases.
You could do it with a CTE and joining the two together, but that is inefficient and unnecessary.
It looks like you just need LAG to get the previous ListID
I note that PARTITION BY ID ORDER BY ID is non-deterministic and the ordering will be random. I strongly suggest you find a deterministic ordering.
SELECT
PrevID AS FirstListID,
ListID AS SecondListID,
ID,
Date
FROM (
SELECT
Date,
ID,
ListID,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) AS rownum,
LAG(ListID) OVER (PARTITION BY ID ORDER BY ID) AS PrevID
from [Table]
where Date = cast(getdate() - 1 as date)
group by Date, ID, ListID
) AS WithRowAndLag
WHERE rownum = 2
ORDER BY ID;
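As a rough illustration of the deterministic-ordering advice, here is a sketch that orders each partition by ListID instead of ID, so the "first" and "second" list IDs are reproducible. It runs on an in-memory SQLite database via Python's sqlite3; the table name ListLog, the sample rows, and the choice of ListID as the ordering column are all assumptions, and the real ordering column should be whatever actually defines "first" in your data.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; ListLog and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ListLog (Date TEXT, ID INTEGER, ListID INTEGER);
INSERT INTO ListLog VALUES
    ('2024-01-01', 1, 20), ('2024-01-01', 1, 10),
    ('2024-01-01', 2, 40), ('2024-01-01', 2, 30),
    ('2024-01-01', 3, 50);
""")

# Ordering by ListID (not ID) makes both ROW_NUMBER and LAG deterministic.
pairs = conn.execute("""
SELECT PrevID AS FirstListID, ListID AS SecondListID, ID, Date
FROM (
    SELECT Date, ID, ListID,
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ListID) AS rownum,
           LAG(ListID)  OVER (PARTITION BY ID ORDER BY ListID) AS PrevID
    FROM ListLog
) AS WithRowAndLag
WHERE rownum = 2
ORDER BY ID
""").fetchall()

for row in pairs:
    print(row)
```

IDs with only one row (ID 3 here) drop out, which matches the behaviour of the inner join between the first and second temp tables.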

Select the newest record

I would like to run a select statement that returns only the newest record, by the Recored_timestamp field, for the keys teacher_id and student_id. Each time it runs, it should return only one record. How could I do this, please? The output can omit the Recored_timestamp field. Thanks
Using a window function, partitioned by teacher_id and student_id and sorted by recorded_timestamp, will give you the desired result.
select *
from (
select teacher_id, student_id, teacher_name, comment, recorded_timestamp,
row_number() over (partition by teacher_id, student_id order by recorded_timestamp desc) as rownum
from temp0607
) out1
where rownum = 1
Also, you may have to look at how recorded_timestamp is stored. If it is stored as a string, you can convert it into a timestamp using from_unixtime(unix_timestamp(recorded_timestamp,'dd/MM/yyyy HH:mm'),'dd/MM/yyyy HH:mm')
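To see why the storage type matters, here is a small sketch in plain Python (not Hive) comparing the "newest" value when timestamps are compared as text versus as parsed datetimes. The three sample values are made up, and the dd/MM/yyyy HH:mm layout is assumed from the pattern in the conversion call above.

```python
from datetime import datetime

# Hypothetical values in the assumed dd/MM/yyyy HH:mm layout.
raw = ["04/06/2021 09:00", "11/05/2021 07:00", "08/05/2021 07:00"]

# Compared as text, the day-of-month digits dominate the ordering.
newest_as_text = max(raw)

# Compared as parsed datetimes, the calendar order is respected.
newest_parsed = max(raw, key=lambda s: datetime.strptime(s, "%d/%m/%Y %H:%M"))

print(newest_as_text)
print(newest_parsed)
```

Text comparison picks 11 May because "11" sorts after "04", while the genuinely newest value is 4 June; the same effect would make an ORDER BY on a string column choose the wrong row.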
First, arrange the record by datetime
SELECT *,RANK() OVER (PARTITION BY student_id ORDER BY Recored_timestamp desc) as ranking
FROM #temp
Then, if you want the newest record together with a student_id that is not NULL, you can use OUTER APPLY to add a column holding the non-NULL student_id.
OUTER APPLY (SELECT student_id
FROM #temp
WHERE #temp.teacher_id = ranktable.teacher_id
AND student_id IS NOT NULL
) AS jointable
Here is an example:
Create Table #temp
(
teacher_id int
,student_id int
,teacher_name varchar(40)
,comment varchar(100)
,Recored_timestamp datetime
)
INSERT INTO #temp
VALUES
(449,111,'lucy','Could be better','2021-05-04 07:00:00.000')
,(449,null,'lucy','smily','2021-05-11 07:00:00.000')
,(449,111,'lucy','not listening','2021-05-08 07:00:00.000')
,(448,null,'Toni','Good','2021-06-04 09:00:00.000')
,(448,222,'Toni','not doing as expected','2021-06-04 08:00:00.000')
SELECT DISTINCT teacher_id,
jointable.student_id,
teacher_name,
comment,
Recored_timestamp,
ranking
FROM
(
SELECT *,RANK() OVER (PARTITION BY teacher_id ORDER BY Recored_timestamp DESC) AS ranking
FROM #temp
) AS ranktable
OUTER APPLY (SELECT student_id
FROM #temp
WHERE #temp.teacher_id = ranktable.teacher_id
AND student_id IS NOT NULL
) AS jointable
WHERE ranking = 1 --only newest record will be extracted
Drop table #temp
You can build on this query to get the newest data.
SELECT TOP 1 * FROM tablename T1
INNER JOIN(SELECT teacher_id, Max(Recored_timestamp) as MaxDate from tablename GROUP BY teacher_id) T2 ON T2.teacher_id = T1.teacher_id AND T1.Recored_timestamp = T2.MaxDate

SQL - Find not distinct values with two criteria

Pretty much it is easier to show, than to explain:
I have the following table:
The idea is, that I need only the "Objekts", for which I have entered the Datum within the same month.
E.g., "aaa" is needed, because I have data for August twice. "bbb" is not needed, because I have one entry for August and one for September, which is OK.
This is what I've tried so far:
SELECT objekt,count(*) as counter
FROM tempt_report
GROUP BY objekt
HAVING count(*)>1
But obviously, I do not mention the requirement for the "Datum", and thus I do not get what I want.
Thanks! :)
Not sure if I'm missing something, but you want any objekt with more than one row in the same month of the same year:
SELECT objekt,year(datum),month(datum),count(*) as counter
FROM tempt_report
GROUP BY objekt, year(datum),month(datum)
HAVING count(*)>1
SELECT objekt,dateadd(month,DATEDIFF(MONTH, 0, datum),0) m
FROM tempt_report
GROUP BY objekt,DATEDIFF(MONTH, 0, datum)
HAVING count(*)>1
select CAST(MONTH(Datum) AS varchar(2)) + ' ' + CAST(YEAR(Datum) AS varchar(4)) AS Datum,
objekt,
COUNT(1) from #tempt_report
GROUP by objekt,YEAR(Datum), MONTH(Datum)
HAVING count(1) > 1
You can try this query.
You might try this like this:
DECLARE @tbl TABLE(YourDate DATE,YourObjekt VARCHAR(100));
INSERT INTO @tbl VALUES
({d'2016-08-01'},'aaa')
,({d'2016-08-31'},'aaa')
,({d'2016-08-31'},'bbb')
,({d'2016-09-01'},'aaa')
,({d'2016-09-02'},'bbb');
WITH PartitionedCounted AS
(
SELECT ROW_NUMBER() OVER(PARTITION BY YourObjekt,YearAndMonth ORDER BY YourDate) AS Nr
,YearAndMonth
,YourDate
,YourObjekt
FROM @tbl AS tbl
CROSS APPLY(SELECT CONVERT(VARCHAR(7),YourDate,102) AS YearAndMonth) AS A
)
SELECT pc.YearAndMonth,pc.YourObjekt,tbl.YourDate
FROM PartitionedCounted AS pc
INNER JOIN @tbl AS tbl ON tbl.YourObjekt=pc.YourObjekt AND CONVERT(VARCHAR(7),tbl.YourDate,102)=pc.YearAndMonth
WHERE pc.Nr > 1;
UPDATE
Since you are using SQL Server 2014 you can use EOMONTH:
WITH PartitionedCounted AS
(
SELECT ROW_NUMBER() OVER(PARTITION BY YourObjekt,EOMONTH(YourDate) ORDER BY YourDate) AS Nr
,EOMONTH(YourDate) AS EOM
,YourDate
,YourObjekt
FROM @tbl AS tbl
)
SELECT pc.EOM,pc.YourObjekt,tbl.YourDate
FROM PartitionedCounted AS pc
INNER JOIN @tbl AS tbl ON tbl.YourObjekt=pc.YourObjekt AND EOMONTH(tbl.YourDate)=pc.EOM
WHERE pc.Nr > 1
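For a quick check of the group-by-month idea from the answers above, here is a sketch that runs on an in-memory SQLite database via Python's sqlite3. SQLite standing in for SQL Server, strftime('%Y-%m', ...) replacing YEAR()/MONTH(), and dates stored as ISO text are assumptions; the five sample rows are the ones from the table-variable example.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; dates are stored as ISO text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tempt_report (objekt TEXT, datum TEXT);
INSERT INTO tempt_report VALUES
    ('aaa', '2016-08-01'), ('aaa', '2016-08-31'), ('bbb', '2016-08-31'),
    ('aaa', '2016-09-01'), ('bbb', '2016-09-02');
""")

# Group by objekt and calendar month; keep groups with more than one row.
duplicates = conn.execute("""
SELECT objekt, strftime('%Y-%m', datum) AS year_month, COUNT(*) AS counter
FROM tempt_report
GROUP BY objekt, strftime('%Y-%m', datum)
HAVING COUNT(*) > 1
""").fetchall()

for row in duplicates:
    print(row)
```

Only "aaa" in August 2016 is reported, which agrees with the expected result described in the question.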

Sorting top ten vendors and showing remained vendors as "other"

Please consider a table of vendors having two columns: VendorName and PayableAmount
I'm looking for a query which returns top ten vendors sorted by PayableAmount descending and sum of other payable amounts as "other" in 11th row.
Obviously, sum of PayableAmount from Vendors table should be equal to sum of PayableAmount from Query.
Technically, it's possible to do in one query:
declare @t table (
Name varchar(50) primary key,
Amount money not null
);
-- Dummy data
insert into @t (Name, Amount)
select top (20) sq.*
from (
select name, max(number) as [Amount]
from master.dbo.spt_values
where number between 100 and 100000
and name is not null
group by name
) sq
order by newid();
-- The table itself, for verification
select * from @t order by Amount desc;
-- Actual query
select top (11)
case when sq.RN > 10 then '<All others>' else sq.Name end as [VendorName],
case
when sq.RN > 10 then sum(sq.Amount) over(partition by case when sq.RN > 10 then 1 else 0 end)
else sq.Amount
end as [Value]
from (
select t.Name, t.Amount, row_number() over(order by t.Amount desc) as [RN]
from @t t
) sq
order by sq.RN;
It will even work on any SQL Server version starting with 2005. But, in real life, I would prefer to calculate these 2 parts separately and then UNION them.
This performs the query you're looking for: it first extracts the top 10 vendors, then UNIONs that result with the sum of the lower-ranked vendors (rn > 10), labelled 'Other'.
WITH rank AS (SELECT
VendorName,
PayableAmount,
ROW_NUMBER() OVER (ORDER BY PayableAmount DESC) AS rn
FROM vendors)
SELECT VendorName,
rn,
PayableAmount
FROM
rank WHERE rn <= 10
UNION
SELECT VendorName, 11 AS rn, PayableAmount
FROM
(
SELECT 'Other' AS VendorName,
SUM(PayableAmount) AS PayableAmount
FROM
rank WHERE rn > 10
) X11
ORDER BY rn
This has been tested in SQLFiddle.
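The requirement that the report total equals the table total can be checked with a small sketch like the one below. It uses an in-memory SQLite database via Python's sqlite3 rather than SQL Server, a top 3 instead of a top 10 to keep the sample short, and five made-up vendors; the CTE name ranked and the TOP_N parameter are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; vendors and amounts are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Vendors (VendorName TEXT PRIMARY KEY, PayableAmount INTEGER);
INSERT INTO Vendors VALUES
    ('A', 500), ('B', 400), ('C', 300), ('D', 200), ('E', 100);
""")

TOP_N = 3  # hypothetical; the question asks for 10

# Top-N rows as they are, plus one 'Other' row summing everything ranked below.
report = conn.execute("""
WITH ranked AS (
    SELECT VendorName, PayableAmount,
           ROW_NUMBER() OVER (ORDER BY PayableAmount DESC) AS rn
    FROM Vendors
)
SELECT VendorName, PayableAmount, rn FROM ranked WHERE rn <= :n
UNION ALL
SELECT 'Other', SUM(PayableAmount), :n + 1 FROM ranked WHERE rn > :n
ORDER BY rn
""", {"n": TOP_N}).fetchall()

total = conn.execute("SELECT SUM(PayableAmount) FROM Vendors").fetchone()[0]

for row in report:
    print(row)
print("report total:", sum(amount for _, amount, _ in report), "table total:", total)
```

UNION ALL is used here because the two halves cannot overlap, so there is nothing to deduplicate.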
This is for the 11th row. I didn't check it.
declare @i int
set @i =
(select sum(x.PayableAmount)
from
(select * from Vendors
except
select * from (select top 10 * from Vendors order by PayableAmount desc) as top10
) as x)
select 'another', @i

Get each patients' highest bill

I have a table where an ID can be associated with more than one bill. What I need to do is find the MAX billing amount, the ID and the date of their highest (MAX) bill. The problem is that there can be thousands of bills per person, and hundreds on any given date.
My query
select patientID, max(amountPaid) as maxPaid
from myTable
group by patientID
gives me what I need, minus the date. My attempt at fixing this is
select t.patientID, t.maxPaid, myTable.billDate
from myTable
inner join
(
select patientid, max(amountPaid) as maxPaid
from myTable
group by patientID
) as t on t.patientID=myTable.patientID and t.maxPaid=myTable.maxPaid
The error given is invalid column name maxPaid. I tried not giving the calculated field an alias but SQL Server wouldn't accept myTable.max(amountPaid) either. What's the quickest way to fix this? thanks in advance.
The problem with your current approach is that if a patient has two bills with the maximum amount, you will get both of them.
Try this instead:
SELECT
patientid,
amountPaid AS max_paid,
billDate
FROM
(
SELECT
patientid,
amountPaid,
billDate,
ROW_NUMBER() OVER (PARTITION BY patientid
ORDER BY amountpaid DESC) AS RowNumber
FROM myTable
) T1
WHERE T1.RowNumber = 1
This will always return one row per patient even if a patient has two bills that both have the same maximum amountpaid.
;WITH x AS (SELECT PatientID, BillDate, AmountPaid,
rn = ROW_NUMBER() OVER (PARTITION BY PatientID ORDER BY AmountPaid DESC)
FROM dbo.myTable
)
SELECT PatientID, BillDate, AmountPaid
FROM x
WHERE rn = 1;
Based on your description, I think you meant this:
select t1.patientID, t2.maxPaid, t1.billDate
from myTable t1
inner join
(
select patientid, max(amountPaid) as maxPaid
from myTable
group by patientID
) t2
on t1.patientID=t2.patientID
and t1.amountPaid=t2.maxPaid
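The difference between the join on MAX(amountPaid) and the ROW_NUMBER() approach only shows up when a patient has tied maximum bills. Here is a minimal sketch of that case, run on an in-memory SQLite database via Python's sqlite3 rather than SQL Server. The sample bills are made up, and the billDate DESC tie-breaker is an assumption added so that the row picked among ties is predictable.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; patient 1 has two tied maximum bills.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE myTable (patientID INTEGER, amountPaid INTEGER, billDate TEXT);
INSERT INTO myTable VALUES
    (1, 100, '2020-01-01'), (1, 250, '2020-02-01'), (1, 250, '2020-03-01'),
    (2, 80, '2020-01-15');
""")

# Join on the per-patient maximum: every tied bill is returned.
join_rows = conn.execute("""
SELECT t1.patientID, t2.maxPaid, t1.billDate
FROM myTable t1
INNER JOIN (
    SELECT patientID, MAX(amountPaid) AS maxPaid
    FROM myTable
    GROUP BY patientID
) t2 ON t1.patientID = t2.patientID AND t1.amountPaid = t2.maxPaid
ORDER BY t1.patientID, t1.billDate
""").fetchall()

# ROW_NUMBER keeps exactly one row per patient; billDate DESC breaks ties.
row_number_rows = conn.execute("""
SELECT patientID, amountPaid, billDate
FROM (
    SELECT patientID, amountPaid, billDate,
           ROW_NUMBER() OVER (PARTITION BY patientID
                              ORDER BY amountPaid DESC, billDate DESC) AS rn
    FROM myTable
) ranked
WHERE rn = 1
ORDER BY patientID
""").fetchall()

print(join_rows)
print(row_number_rows)
```

If every tied bill should be listed, the join (or RANK() instead of ROW_NUMBER()) is the better fit; if exactly one row per patient is required, ROW_NUMBER() with an explicit tie-breaker is.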