SELECT based on multiple fields in MS-SQL - sql

I have a table with 4 columns:
AcctNumb | PeriodEndingDate | WaterConsumption | ReadingType
There are multiple records for each AcctNumb, with the date that each record was recorded.
What I want to do is grab the most recent date, consumption reading, and reading type for each account.
I have tried using MAX(PeriodEndingDate) and GROUP BY AcctNumb, but I would need to aggregate all the other values, and none of the aggregate functions help me for the WaterConsumption, etc.
Can anyone point me in the right direction?
Thanks
EDIT
Here is a sample table
+----------+------------------+------------------+-------------+
| AcctNumb | PeriodEndingDate | WaterConsumption | ReadingType |
+----------+------------------+------------------+-------------+
| 1000 | 2018-03-31 | 122230 | A |
| 1001 | 2018-03-31 | 24850 | A |
| 1002 | 2018-03-31 | 88540 | A |
| 1000 | 2017-12-31 | 123800 | A |
| 1001 | 2017-12-31 | 3000 | E |
+----------+------------------+------------------+-------------+
The ReadingType is whether it's an actual (A) reading, or an estimate (E).

Try this
SELECT
    AcctNumb,
    PeriodEndingDate,
    WaterConsumption,
    ReadingType
FROM (SELECT
          AcctNumb,
          PeriodEndingDate,
          WaterConsumption,
          ReadingType,
          ROW_NUMBER() OVER (PARTITION BY AcctNumb ORDER BY PeriodEndingDate DESC) AS MostRecentRecord
      FROM <TableName>) dt
WHERE MostRecentRecord = 1
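To check this kind of query against the sample table from the question, here is a minimal sketch in Python. It makes two assumptions: an in-memory SQLite database (3.25 or newer, for window functions) stands in for SQL Server, and the table name Readings is hypothetical, since the question does not name the table.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server so the example is runnable.
# The table name "Readings" is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Readings ("
    "AcctNumb INTEGER, PeriodEndingDate TEXT, "
    "WaterConsumption INTEGER, ReadingType TEXT)"
)
conn.executemany(
    "INSERT INTO Readings VALUES (?, ?, ?, ?)",
    [
        (1000, "2018-03-31", 122230, "A"),
        (1001, "2018-03-31", 24850, "A"),
        (1002, "2018-03-31", 88540, "A"),
        (1000, "2017-12-31", 123800, "A"),
        (1001, "2017-12-31", 3000, "E"),
    ],
)

# Number the rows of each account from newest to oldest, then keep row 1.
latest_readings = conn.execute(
    """
    SELECT AcctNumb, PeriodEndingDate, WaterConsumption, ReadingType
    FROM (SELECT r.*,
                 ROW_NUMBER() OVER (PARTITION BY AcctNumb
                                    ORDER BY PeriodEndingDate DESC) AS rn
          FROM Readings r) dt
    WHERE rn = 1
    ORDER BY AcctNumb
    """
).fetchall()

for row in latest_readings:
    print(row)
```

Each account comes back once, with the consumption and reading type taken from the same row as its latest PeriodEndingDate, so no aggregate is needed for the other columns.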

This can be done using ROW_NUMBER. It has been asked and answered thousands of times, but the query is easier to write than to find a duplicate.
select *
from
(
select *
, RowNum = ROW_NUMBER() over(partition by AcctNumb order by PeriodEndingDate desc)
from YourTable
) x
where x.RowNum = 1

SELECT DQ.* FROM
(SELECT *,
Row_Number() OVER (PARTITION BY AcctNumb ORDER BY PeriodEndingDate DESC) AS RN
FROM YourTable
) AS DQ
WHERE DQ.RN = 1

Related

SQL Server Add row number each group

I am working on a query for SQL Server 2016. The rows are ordered by serial_no and grouped by pay_type, and I would like to add a row number that restarts with each group, as in the example below:
row_no | pay_type | serial_no
1 | A | 4000118445
2 | A | 4000118458
3 | A | 4000118461
4 | A | 4000118473
5 | A | 4000118486
1 | B | 4000118499
2 | B | 4000118506
3 | B | 4000118519
4 | B | 4000118521
1 | A | 4000118534
2 | A | 4000118547
3 | A | 4000118550
1 | B | 4000118562
2 | B | 4000118565
3 | B | 4000118570
4 | B | 4000118572
Help me please..
SELECT
    ROW_NUMBER() OVER (PARTITION BY pay_type ORDER BY serial_no) AS row_no,
    pay_type, serial_no
FROM YourTable
ORDER BY serial_no
You can assign groups to adjacent pay types that are the same and then use row_number(). For this purpose, the difference of row numbers is a good way to determine the groups:
select row_number() over (partition by pay_type, seqnum - seqnum_2 order by serial_no) as row_no,
t.*
from (select t.*,
row_number() over (order by serial_no) as seqnum,
row_number() over (partition by pay_type order by serial_no) as seqnum_2
from t
) t;
This type of problem is one example of a gaps-and-islands problem. Why does the difference of row numbers work? I find that the simplest way to understand is to look at the results of the subquery.
Here is a db<>fiddle.
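To see why the difference of row numbers identifies the groups, it helps to print seqnum, seqnum_2, and their difference next to each row. The sketch below does that for the sample data from the question. It assumes an in-memory SQLite database (3.25 or newer) in place of SQL Server, and the table name payments and the column alias island are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; "payments" is a hypothetical name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (pay_type TEXT, serial_no INTEGER)")

serials = [
    4000118445, 4000118458, 4000118461, 4000118473, 4000118486,
    4000118499, 4000118506, 4000118519, 4000118521,
    4000118534, 4000118547, 4000118550,
    4000118562, 4000118565, 4000118570, 4000118572,
]
pay_types = "AAAAABBBBAAABBBB"  # one letter per serial number, in order
conn.executemany(
    "INSERT INTO payments VALUES (?, ?)", list(zip(pay_types, serials))
)

# seqnum counts all rows; seqnum_2 counts rows within one pay_type.
# Their difference stays constant inside a run of adjacent equal pay types.
rows = conn.execute(
    """
    SELECT pay_type, serial_no, seqnum, seqnum_2,
           seqnum - seqnum_2 AS island,
           ROW_NUMBER() OVER (PARTITION BY pay_type, seqnum - seqnum_2
                              ORDER BY serial_no) AS row_no
    FROM (SELECT p.*,
                 ROW_NUMBER() OVER (ORDER BY serial_no) AS seqnum,
                 ROW_NUMBER() OVER (PARTITION BY pay_type
                                    ORDER BY serial_no) AS seqnum_2
          FROM payments p) s
    ORDER BY serial_no
    """
).fetchall()

for row in rows:
    print(row)
```

In this data the difference is 0 for the first run of A, 5 for the first run of B, 4 for the second run of A, and 8 for the second run of B. Partitioning by pay_type together with that difference therefore restarts row_no at every run.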
Add this to your select list:
ROW_NUMBER() OVER ( ORDER BY (SELECT 1) )
Since the query already sorts the rows, the window function does not need to sort them again, which uses less CPU.

Select the highest value of column 2 per column 1

Given the following table P_PROV
+----+-----------+-----------+
| id | date | person_id |
+----+-----------+-----------+
| 1 |19/06/2019 | 1 |
| 2 |18/07/2010 | 2 |
| 3 |19/06/2020 | 1 |
| 4 |17/06/2020 | 2 |
| 5 |28/06/2020 | 3 |
+----+-----------+-----------+
I want this output
+----+-----------+-----------+
| id | date | person_id |
+----+-----------+-----------+
| 3 |19/06/2020 | 1 |
| 4 |17/06/2020 | 2 |
| 5 |28/06/2020 | 3 |
+----+-----------+-----------+
Putting this in words, I want to return per person the maximum date. I tried something like this
SELECT DISTINCT pp.date, pp.id FROM P_PROV pp
WHERE (SELECT MAX(aa.date)
FROM P_PROV aa) = pp.date;
This one is only returning one row (of course, because the MAX will return the maximum date only), but I really don't know how to approach this issue, any kind of help would be appreciated
ROW_NUMBER provides one way to handle this:
SELECT id, date, person_id
FROM
(
SELECT t.*, ROW_NUMBER() OVER (PARTITION BY person_id ORDER BY date DESC) rn
FROM yourTable t
) t
WHERE rn = 1;
Oracle has a fun way to do this using aggregation:
select max(id) keep (dense_rank first order by date desc) as id,
max(date) as date, person_id
from P_PROV
group by person_id;
Given that your ids are increasing, this probably also does what you want:
select max(id) as id, max(date) as date, person_id
from P_PROV
group by person_id;
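The "ids are increasing" condition matters because MAX(id) and MAX(date) are computed independently and can come from different rows. The sketch below compares the two approaches after adding one hypothetical row (id 6, an older date for person 3) that breaks the condition. It assumes SQLite (3.25 or newer) in place of the original database, ISO-formatted dates so that string ordering matches date ordering, and a column renamed to prov_date to avoid the keyword.

```python
import sqlite3

# Assumptions: SQLite stand-in, ISO date strings, column renamed to prov_date.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE P_PROV (id INTEGER, prov_date TEXT, person_id INTEGER)"
)
conn.executemany(
    "INSERT INTO P_PROV VALUES (?, ?, ?)",
    [
        (1, "2019-06-19", 1),
        (2, "2010-07-18", 2),
        (3, "2020-06-19", 1),
        (4, "2020-06-17", 2),
        (5, "2020-06-28", 3),
        (6, "2018-01-01", 3),  # hypothetical row: higher id, older date
    ],
)

# ROW_NUMBER keeps whole rows, so id and date always belong together.
by_row_number = conn.execute(
    """
    SELECT id, prov_date, person_id
    FROM (SELECT p.*,
                 ROW_NUMBER() OVER (PARTITION BY person_id
                                    ORDER BY prov_date DESC) AS rn
          FROM P_PROV p) t
    WHERE rn = 1
    ORDER BY person_id
    """
).fetchall()

# Independent aggregates can pair an id with a date from another row.
by_max = conn.execute(
    """
    SELECT MAX(id), MAX(prov_date), person_id
    FROM P_PROV
    GROUP BY person_id
    ORDER BY person_id
    """
).fetchall()

print(by_row_number)
print(by_max)
```

For person 3 the ROW_NUMBER query returns id 5 with 2020-06-28, while the aggregate query returns id 6 paired with 2020-06-28, a combination that does not exist in the table.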

SQL: Select single item per name with multiple criteria

I'm trying to select a single item per value in a "Name" column according to several criteria.
The criteria I want to use look like this:
Only include results where IsEnabled = 1
Return the single result with the lowest priority (we're using 1 to mean "top priority")
In case of a tie, return the result with the newest Timestamp
I've seen several other questions that ask about returning the newest timestamp for a given value, and I've been able to adapt that to return the minimum value of Priority - but I can't figure out how to filter off of both Priority and Timestamp.
Here is the question that's been most helpful in getting me this far.
Sample data:
+------+------------+-----------+----------+
| Name | Timestamp | IsEnabled | Priority |
+------+------------+-----------+----------+
| A | 2018-01-01 | 1 | 1 |
| A | 2018-03-01 | 1 | 5 |
| B | 2018-01-01 | 1 | 1 |
| B | 2018-03-01 | 0 | 1 |
| C | 2018-01-01 | 1 | 1 |
| C | 2018-03-01 | 1 | 1 |
| C | 2018-05-01 | 0 | 1 |
| C | 2018-06-01 | 1 | 5 |
+------+------------+-----------+----------+
Desired output:
+------+------------+-----------+----------+
| Name | Timestamp | IsEnabled | Priority |
+------+------------+-----------+----------+
| A | 2018-01-01 | 1 | 1 |
| B | 2018-01-01 | 1 | 1 |
| C | 2018-03-01 | 1 | 1 |
+------+------------+-----------+----------+
What I've tried so far (this gets me only enabled items with lowest priority, but does not filter for the newest item in case of a tie):
SELECT DATA.Name, DATA.Timestamp, DATA.IsEnabled, DATA.Priority
From MyData AS DATA
INNER JOIN (
SELECT MIN(Priority) Priority, Name
FROM MyData
GROUP BY Name
) AS Temp ON DATA.Name = Temp.Name AND DATA.Priority = TEMP.Priority
WHERE IsEnabled=1
Here is a SQL fiddle as well.
How can I enhance this query to only return the newest result in addition to the existing filters?
Use row_number():
select d.*
from (select d.*,
row_number() over (partition by name order by priority, timestamp desc) as seqnum
from mydata d
where isenabled = 1
) d
where seqnum = 1;
The most effective way that I've found for these problems is using CTEs and ROW_NUMBER()
WITH CTE AS(
SELECT *, ROW_NUMBER() OVER( PARTITION BY Name ORDER BY Priority, TimeStamp DESC) rn
FROM MyData
WHERE IsEnabled = 1
)
SELECT Name, Timestamp, IsEnabled, Priority
From CTE
WHERE rn = 1;
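As a quick check that ordering by Priority and then by Timestamp descending produces the desired output, here is a minimal sketch that runs a CTE of this shape against the sample data. It assumes an in-memory SQLite database (3.25 or newer) in place of SQL Server; the CTE name ranked is hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server so the example is runnable.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyData ("
    "Name TEXT, Timestamp TEXT, IsEnabled INTEGER, Priority INTEGER)"
)
conn.executemany(
    "INSERT INTO MyData VALUES (?, ?, ?, ?)",
    [
        ("A", "2018-01-01", 1, 1),
        ("A", "2018-03-01", 1, 5),
        ("B", "2018-01-01", 1, 1),
        ("B", "2018-03-01", 0, 1),
        ("C", "2018-01-01", 1, 1),
        ("C", "2018-03-01", 1, 1),
        ("C", "2018-05-01", 0, 1),
        ("C", "2018-06-01", 1, 5),
    ],
)

# Filter disabled rows first, then rank: lowest Priority wins,
# and the newest Timestamp breaks ties.
result = conn.execute(
    """
    WITH ranked AS (
        SELECT d.*,
               ROW_NUMBER() OVER (PARTITION BY Name
                                  ORDER BY Priority, Timestamp DESC) AS rn
        FROM MyData d
        WHERE IsEnabled = 1
    )
    SELECT Name, Timestamp, IsEnabled, Priority
    FROM ranked
    WHERE rn = 1
    ORDER BY Name
    """
).fetchall()

for row in result:
    print(row)
```

The row for C is the one dated 2018-03-01: the disabled 2018-05-01 row is filtered out before ranking, and the 2018-06-01 row loses on priority.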

SQL - How to do something like value.Contains?

Can someone help me? I need to exclude some repeated values. The current result is:
Some rows have null values; in that case I show 'No Informado'.
In rows 26 to 32, value1 and value2 are the same, but value3 is different.
I need this result:
id | name | user
0x00E281759429DD4B807F467F8B2319E3 | PC_XBPOX0112 | llopez
0x00F37F5DA2C8854699EFBA30F7102DDD | PC_BSCTY1312 | No Informado
0x00F53DBE60CFF343942E3893ABA809EB | PC_SVCTY6834 | ntapia
0x00FDB75C00B8D84E8A1862A56C71A766 | NB_TSCTY06606 | jogonzalez
0x010029519191B34BB498E7F9FEAE3E21 | PC_BSCTY3229 | kfuentes
0x011506756396BC4588E705BFCFA84847 | PC_BSCTY3134 | csepulveda
0x0120BE537B242C4EB01C4F94E82E64BF | PC_BSCTY1296 | eaviles
0x01322ABEC4F19E41B2139291952838EE | PC_VSCTY6535 | vbravo
0x0133C6B80B50E44A928AF770510856E3 | PC_FSCTY0084 | mcarreno
0x01463ECF32DEBD41943330EC7C1822D4 | PC_BSCTY3220 | fegonzalez
0x01610C718C04264A8349FAEA6676363F | PC-FSCTY0543 | fcastro
Can someone help me?
Thanks in advance!
Another option is the WITH TIES clause in concert with Row_Number()
Example
Select Top 1 With Ties *
From YourTable
Order by Row_Number() over (Partition By ID Order by Date Desc)
Returns
id name date
1 name1 2018-01-01
2 name2 2018-01-01
3 name5 2018-02-01
SELECT Id
, MAX(name) AS Name
, MAX([date]) AS [date]
FROM TableName
GROUP BY Id

SQL - How to find which page is the first for users?

I have a table like this:
+----------+-------------------------------------+----------------------------------+
| user_id | time | url |
+----------+-------------------------------------+----------------------------------+
| 1 | 02.04.2017 8:56 | www.landingpage.com/ |
| 1 | 02.04.2017 8:57 | www.landingpage.com/about-us |
| 1 | 02.04.2017 8:58 | www.landingpage.com/faq |
| 2 | 02.04.2017 6:34 | www.landingpage.com/about-us |
| 2 | 02.04.2017 6:35 | www.landingpage.com/how-to-order |
| 3 | 03.04.2017 9:11 | www.landingpage.com/ |
| 3 | 03.04.2017 9:12 | www.landingpage.com/contact |
| 3 | 03.04.2017 9:13 | www.landingpage.com/about-us |
| 3 | 03.04.2017 9:14 | www.landingpage.com/our-legacy |
| 3 | 03.04.2017 9:15 | www.landingpage.com/ |
+----------+-------------------------------------+----------------------------------+
I want to figure out which page is the first for most users (the first page a user sees when they come to the site) and count the number of times it is viewed as the first page.
Is there a way to write a query to do this? I guess I need to use
MIN(time)
in conjunction with grouping but I don't know how.
So regarding the sample I provided it should be like:
url url_count
---------------------------------------------------
www.landingpage.com/ 2
www.landingpage.com/about-us 1
Thanks!
You're correct, you'll need to use the min() aggregate function within a subselect.
select
my_table.url
from
my_table
where
my_table.time = (
select
min(t.time)
from
my_table t
where
t.user_id = my_table.user_id
)
Replace my_table with whatever your table is actually named.
To include how many pages the user has seen, you'll need something like this:
select
my_table.url
, (
select
count(t.url)
from
my_table t
where
t.user_id = my_table.user_id
) as url_count
from
my_table
where
my_table.time = (
select
min(t.time)
from
my_table t
where
t.user_id = my_table.user_id
)
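SELECT *
FROM my_table
WHERE time IN
(
    -- group by user_id (not url) to get each user's first visit time;
    -- this assumes no two users share the same timestamp
    SELECT min(time)
    FROM my_table
    GROUP BY user_id
);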
You can query as below:
Select top (1) with ties *
from yourtable
order by row_number() over(partition by user_id order by [time])
You can use outer query to get the same as below:
Select * from (
Select *, RowN = row_number() over(partition by user_id order by [time]) from yourtable) a
Where a.RowN = 1
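To get the url and url_count columns the question asks for, the first-row-per-user result can be grouped by url. The sketch below is one way to do that, under several assumptions: an in-memory SQLite database (3.25 or newer) stands in for SQL Server, the table name visits and the column name visit_time are hypothetical, and the timestamps are rewritten in ISO format so that string ordering matches chronological ordering.

```python
import sqlite3

# Assumptions: SQLite stand-in, hypothetical names, ISO-formatted timestamps.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (user_id INTEGER, visit_time TEXT, url TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        (1, "2017-04-02 08:56", "www.landingpage.com/"),
        (1, "2017-04-02 08:57", "www.landingpage.com/about-us"),
        (1, "2017-04-02 08:58", "www.landingpage.com/faq"),
        (2, "2017-04-02 06:34", "www.landingpage.com/about-us"),
        (2, "2017-04-02 06:35", "www.landingpage.com/how-to-order"),
        (3, "2017-04-03 09:11", "www.landingpage.com/"),
        (3, "2017-04-03 09:12", "www.landingpage.com/contact"),
        (3, "2017-04-03 09:13", "www.landingpage.com/about-us"),
        (3, "2017-04-03 09:14", "www.landingpage.com/our-legacy"),
        (3, "2017-04-03 09:15", "www.landingpage.com/"),
    ],
)

# Keep each user's earliest visit, then count how often each url appears.
first_page_counts = conn.execute(
    """
    SELECT url, COUNT(*) AS url_count
    FROM (SELECT v.*,
                 ROW_NUMBER() OVER (PARTITION BY user_id
                                    ORDER BY visit_time) AS rn
          FROM visits v) first_visits
    WHERE rn = 1
    GROUP BY url
    ORDER BY url_count DESC, url
    """
).fetchall()

for url, url_count in first_page_counts:
    print(url, url_count)
```

With the sample data this yields www.landingpage.com/ with a count of 2 and www.landingpage.com/about-us with a count of 1, matching the expected result in the question.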