SQL - How to do something like value.Contains? - sql

Can someone help me? I need to exclude some repeated values from my result.
Some rows have null values, and in that case I show 'No Informado'.
In rows 26 to 32, value1 and value2 are the same, but value3 is different.
I need this result:
id | name | user
0x00E281759429DD4B807F467F8B2319E3 | PC_XBPOX0112 | llopez
0x00F37F5DA2C8854699EFBA30F7102DDD | PC_BSCTY1312 | No Informado
0x00F53DBE60CFF343942E3893ABA809EB | PC_SVCTY6834 | ntapia
0x00FDB75C00B8D84E8A1862A56C71A766 | NB_TSCTY06606 | jogonzalez
0x010029519191B34BB498E7F9FEAE3E21 | PC_BSCTY3229 | kfuentes
0x011506756396BC4588E705BFCFA84847 | PC_BSCTY3134 | csepulveda
0x0120BE537B242C4EB01C4F94E82E64BF | PC_BSCTY1296 | eaviles
0x01322ABEC4F19E41B2139291952838EE | PC_VSCTY6535 | vbravo
0x0133C6B80B50E44A928AF770510856E3 | PC_FSCTY0084 | mcarreno
0x01463ECF32DEBD41943330EC7C1822D4 | PC_BSCTY3220 | fegonzalez
0x01610C718C04264A8349FAEA6676363F | PC-FSCTY0543 | fcastro
Can someone help me?
Thanks in advance!

Another option is the WITH TIES clause in concert with Row_Number()
Example
Select Top 1 With Ties *
From YourTable
Order by Row_Number() over (Partition By ID Order by Date Desc)
Returns
id | name | date
1 | name1 | 2018-01-01
2 | name2 | 2018-01-01
3 | name5 | 2018-02-01
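Why this works: the first row of every partition gets row number 1, so all of those rows tie for first place and WITH TIES keeps them all. Here is a rough sketch of the same idea using Python's built-in sqlite3 module as a stand-in. SQLite has no TOP ... WITH TIES, so the sketch filters on the row number instead; the sample rows are made up for illustration, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical sample rows (id, name, date); they are not from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, name TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [
        (1, "name1", "2018-01-01"),
        (1, "name0", "2017-06-01"),
        (2, "name2", "2018-01-01"),
        (3, "name4", "2018-01-15"),
        (3, "name5", "2018-02-01"),
    ],
)

# Keep every row whose row number is 1. These are the rows that would tie
# for first place under ORDER BY Row_Number() in the WITH TIES version.
rows = conn.execute(
    """
    SELECT id, name, date
    FROM (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY date DESC) AS rn
        FROM YourTable
    )
    WHERE rn = 1
    ORDER BY id
    """
).fetchall()

for row in rows:
    print(row)
```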

SELECT Id
, MAX(name) AS Name
, MAX([date]) AS [date]
FROM TableName
GROUP BY Id
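One caveat with the aggregate version: each MAX() is computed independently, so the name and the date in one output row can come from different source rows. A small sketch of that effect, again using Python's sqlite3 as a stand-in and two made-up rows:

```python
import sqlite3

# Two hypothetical rows for the same id: the newest row does not carry
# the alphabetically largest name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableName (id INTEGER, name TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO TableName VALUES (?, ?, ?)",
    [
        (3, "name5", "2018-02-01"),
        (3, "name9", "2018-01-01"),
    ],
)

# Each MAX() is evaluated on its own column.
mixed = conn.execute(
    "SELECT id, MAX(name), MAX(date) FROM TableName GROUP BY id"
).fetchall()

# The combination (name9, 2018-02-01) does not exist in the table.
print(mixed)
```

If the columns must all come from the same row, a ROW_NUMBER() filter is the safer choice.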

Related

SQL: Select single item per name with multiple criteria

I'm trying to select a single item per value in a "Name" column according to several criteria.
The criteria I want to use look like this:
Only include results where IsEnabled = 1
Return the single result with the lowest priority (we're using 1 to mean "top priority")
In case of a tie, return the result with the newest Timestamp
I've seen several other questions that ask about returning the newest timestamp for a given value, and I've been able to adapt that to return the minimum value of Priority - but I can't figure out how to filter off of both Priority and Timestamp.
Here is the question that's been most helpful in getting me this far.
Sample data:
+------+------------+-----------+----------+
| Name | Timestamp | IsEnabled | Priority |
+------+------------+-----------+----------+
| A | 2018-01-01 | 1 | 1 |
| A | 2018-03-01 | 1 | 5 |
| B | 2018-01-01 | 1 | 1 |
| B | 2018-03-01 | 0 | 1 |
| C | 2018-01-01 | 1 | 1 |
| C | 2018-03-01 | 1 | 1 |
| C | 2018-05-01 | 0 | 1 |
| C | 2018-06-01 | 1 | 5 |
+------+------------+-----------+----------+
Desired output:
+------+------------+-----------+----------+
| Name | Timestamp | IsEnabled | Priority |
+------+------------+-----------+----------+
| A | 2018-01-01 | 1 | 1 |
| B | 2018-01-01 | 1 | 1 |
| C | 2018-03-01 | 1 | 1 |
+------+------------+-----------+----------+
What I've tried so far (this gets me only enabled items with lowest priority, but does not filter for the newest item in case of a tie):
SELECT DATA.Name, DATA.Timestamp, DATA.IsEnabled, DATA.Priority
From MyData AS DATA
INNER JOIN (
SELECT MIN(Priority) Priority, Name
FROM MyData
GROUP BY Name
) AS Temp ON DATA.Name = Temp.Name AND DATA.Priority = TEMP.Priority
WHERE IsEnabled=1
Here is a SQL fiddle as well.
How can I enhance this query to only return the newest result in addition to the existing filters?
Use row_number():
select d.*
from (select d.*,
row_number() over (partition by name order by priority, timestamp desc) as seqnum
from mydata d
where isenabled = 1
) d
where seqnum = 1;
The most effective way that I've found for these problems is using CTEs and ROW_NUMBER()
WITH CTE AS(
SELECT *, ROW_NUMBER() OVER( PARTITION BY Name ORDER BY Priority, TimeStamp DESC) rn
FROM MyData
WHERE IsEnabled = 1
)
SELECT Name, Timestamp, IsEnabled, Priority
From CTE
WHERE rn = 1;
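To check the CTE against the sample data from the question, here is a minimal sketch that loads the rows into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for SQL Server here (window functions need SQLite 3.25 or newer); the query text is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyData "
    "(Name TEXT, Timestamp TEXT, IsEnabled INTEGER, Priority INTEGER)"
)
# Sample data copied from the question.
conn.executemany(
    "INSERT INTO MyData VALUES (?, ?, ?, ?)",
    [
        ("A", "2018-01-01", 1, 1),
        ("A", "2018-03-01", 1, 5),
        ("B", "2018-01-01", 1, 1),
        ("B", "2018-03-01", 0, 1),
        ("C", "2018-01-01", 1, 1),
        ("C", "2018-03-01", 1, 1),
        ("C", "2018-05-01", 0, 1),
        ("C", "2018-06-01", 1, 5),
    ],
)

# Disabled rows are removed first; within each name the lowest priority
# wins, and the newest timestamp breaks ties.
rows = conn.execute(
    """
    WITH CTE AS (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY Name
                   ORDER BY Priority, Timestamp DESC
               ) AS rn
        FROM MyData
        WHERE IsEnabled = 1
    )
    SELECT Name, Timestamp, IsEnabled, Priority
    FROM CTE
    WHERE rn = 1
    ORDER BY Name
    """
).fetchall()

for row in rows:
    print(row)
```

The DESC on Timestamp is what picks the 2018-03-01 row for C; without it the 2018-01-01 row would be returned.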

SELECT based on multiple fields in MS-SQL

I have a table with 4 columns:
AcctNumb | PeriodEndingDate | WaterConsumption | ReadingType
There are multiple records for each AcctNumb, with the date that each record was recorded.
What I want to do is grab the most recent date, consumption reading, and reading type for each account.
I have tried using MAX(PeriodEndingDate) and GROUP BY AcctNumb, but I would need to aggregate all the other values, and none of the aggregate functions help me for the WaterConsumption, etc.
Can anyone point me in the right direction?
Thanks
EDIT
Here is a sample table
+----------+------------------+------------------+-------------+
| AcctNumb | PeriodEndingDate | WaterConsumption | ReadingType |
+----------+------------------+------------------+-------------+
| 1000 | 2018-03-31 | 122230 | A |
| 1001 | 2018-03-31 | 24850 | A |
| 1002 | 2018-03-31 | 88540 | A |
| 1000 | 2017-12-31 | 123800 | A |
| 1001 | 2017-12-31 | 3000 | E |
+----------+------------------+------------------+-------------+
The ReadingType is whether it's an actual (A) reading, or an estimate (E).
Try this
SELECT
AcctNumb,
PeriodEndingDate,
WaterConsumption,
ReadingType
FROM (SELECT
AcctNumb,
PeriodEndingDate,
WaterConsumption,
ReadingType,
ROW_NUMBER() OVER (PARTITION BY AcctNumb ORDER BY PeriodEndingDate DESC) AS MostrecentRecord
FROM <TableName>) dt
WHERE MostrecentRecord= 1
This can be done using ROW_NUMBER. It has been asked and answered thousands of times, but the query is easier to write than to find a duplicate.
select *
from
(
select *
, RowNum = ROW_NUMBER() over(partition by AcctNumb order by PeriodEndingDate desc)
from YourTable
) x
where x.RowNum = 1
SELECT DQ.* FROM
(SELECT *,
Row_Number() OVER (PARTITION BY AcctNumb ORDER BY PeriodEndingDate DESC) AS RN
FROM YourTable
) AS DQ
WHERE DQ.RN = 1
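For anyone who wants to try this on the sample table from the question, here is a small sketch using Python's sqlite3 module as a stand-in for MS-SQL (window functions need SQLite 3.25 or newer). The table name YourTable is a placeholder, as in the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (AcctNumb INTEGER, PeriodEndingDate TEXT, "
    "WaterConsumption INTEGER, ReadingType TEXT)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?, ?)",
    [
        (1000, "2018-03-31", 122230, "A"),
        (1001, "2018-03-31", 24850, "A"),
        (1002, "2018-03-31", 88540, "A"),
        (1000, "2017-12-31", 123800, "A"),
        (1001, "2017-12-31", 3000, "E"),
    ],
)

# Number the readings of each account from newest to oldest and keep
# only the newest one.
rows = conn.execute(
    """
    SELECT AcctNumb, PeriodEndingDate, WaterConsumption, ReadingType
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY AcctNumb
                   ORDER BY PeriodEndingDate DESC
               ) AS RN
        FROM YourTable
    ) AS DQ
    WHERE DQ.RN = 1
    ORDER BY AcctNumb
    """
).fetchall()

for row in rows:
    print(row)
```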

SQL Server partition by gives duplicate records

I have following table:
Date | ID | firstname
---------+----+------------
20161128 | 1 | Adam
20161128 | 2 | Steve
20161128 | 2 | Steve
20161128 | 3 | Aaron
20161129 | 1 | Adam
20161129 | 2 | Steve
20161129 | 2 | Steve
20161129 | 3 | Aaron
I want to get the first row by ID for one particular date.
So what I had was:
SELECT *
FROM tableA
WHERE Date = 20161128
This, however, gives all records, so I used row_number() with a partition:
SELECT
*,
row_number() over(partition by ID order by Date desc)
FROM tableA
WHERE Date = 20161128
In this case, I get following result:
Date | ID | firstname | rownum
---------+----+-----------+-------
20161129 | 1 | Adam | 1
20161129 | 1 | Adam | 2
20161129 | 2 | Steve | 1
20161129 | 2 | Steve | 2
20161129 | 2 | Steve | 3
20161129 | 2 | Steve | 4
20161129 | 2 | Steve | 5
20161129 | 2 | Steve | 6
20161129 | 3 | Aaron | 1
20161129 | 3 | Aaron | 2
As you can see, most IDs appear twice (ID 2 even appears 6 times). In other cases, I see a record appear 10 times even though it would only appear once if I used the first query.
Any idea why this happens and how this can be fixed? My guess would be the date/WHERE clause, but I don't see how this can affect the result this much.
You need an outer WHERE clause on the row number if you want to filter the records:
SELECT a.*
FROM (SELECT a.*,
row_number() over(partition by ID order by Date desc) as seqnum
FROM tableA a
WHERE a.Date = '20161128'
) a
WHERE seqnum = 1;
This will return one row per date per id number.
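A quick way to see the effect on the sample table from the question is the sketch below. It uses Python's sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or newer) and stores Date as an integer, which is an assumption about the real column type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (Date INTEGER, ID INTEGER, firstname TEXT)")
# Sample rows copied from the question, including the duplicate Steve rows.
conn.executemany(
    "INSERT INTO tableA VALUES (?, ?, ?)",
    [
        (20161128, 1, "Adam"),
        (20161128, 2, "Steve"),
        (20161128, 2, "Steve"),
        (20161128, 3, "Aaron"),
        (20161129, 1, "Adam"),
        (20161129, 2, "Steve"),
        (20161129, 2, "Steve"),
        (20161129, 3, "Aaron"),
    ],
)

# The inner WHERE restricts the rows to one date; the outer WHERE keeps
# only the first numbered row of each ID.
rows = conn.execute(
    """
    SELECT a.Date, a.ID, a.firstname
    FROM (
        SELECT a.*,
               ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Date DESC) AS seqnum
        FROM tableA a
        WHERE a.Date = 20161128
    ) a
    WHERE seqnum = 1
    ORDER BY a.ID
    """
).fetchall()

for row in rows:
    print(row)
```

Adding the row number as a column does not remove any rows by itself; only the outer filter on seqnum does.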
You can replace
SELECT *,
row_number() over(partition by ID order by Date desc)
FROM tableA
WHERE Date = 20161128
with
SELECT *
FROM tableA
WHERE ID = (select min(ID) from tableA )
This will only display the first instance.
Select * from
(SELECT *,
rownum=row_number() over(partition by PersonID_EXT order by SnapshotDate desc)
FROM tableA
WHERE Date = 20161128)x where rownum =1

SQL : Getting duplicate rows along with other variables

I am working with Teradata SQL. I would like to get the duplicate fields with their count and other variables as well. I can only find ways to get the count, but not the variables as well.
Available input
+---------+----------+----------------------+
| id | name | Date |
+---------+----------+----------------------+
| 1 | abc | 21.03.2015 |
| 1 | def | 22.04.2015 |
| 2 | ajk | 22.03.2015 |
| 3 | ghi | 23.03.2015 |
| 3 | ghi | 23.03.2015 |
Expected output :
+---------+----------+----------------------+
| id | name | count | // Other fields
+---------+----------+----------------------+
| 1 | abc | 2 |
| 1 | def | 2 |
| 2 | ajk | 1 |
| 3 | ghi | 2 |
| 3 | ghi | 2 |
What am I looking for :
I am looking for all duplicate rows, where duplication is decided by ID, and I want to retrieve the duplicate rows themselves as well.
All I have till now is :
SELECT
id, name, other-variables, COUNT(*)
FROM
Table_NAME
GROUP BY
id, name
HAVING
COUNT(*) > 1
This is not showing correct data. Thank you.
You could use a window aggregate function, like this:
SELECT *
FROM (
SELECT id, name, other-variables,
COUNT(*) OVER (PARTITION BY id) AS duplicates
FROM users
) AS sub
WHERE duplicates > 1
Using a Teradata extension to ISO SQL syntax, you can simplify the above to:
SELECT id, name, other-variables,
COUNT(*) OVER (PARTITION BY id) AS duplicates
FROM users
QUALIFY duplicates > 1
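To try the window-aggregate idea without a Teradata system, here is a rough sketch with Python's sqlite3 module (window functions need SQLite 3.25 or newer). SQLite has no QUALIFY, so the sketch uses the subquery form; the table name users follows the answer, and the date is stored as plain text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, date TEXT)")
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        (1, "abc", "21.03.2015"),
        (1, "def", "22.04.2015"),
        (2, "ajk", "22.03.2015"),
        (3, "ghi", "23.03.2015"),
        (3, "ghi", "23.03.2015"),
    ],
)

# COUNT(*) OVER attaches the per-id row count to every row instead of
# collapsing the group, so the other columns stay available.
rows = conn.execute(
    """
    SELECT id, name, duplicates
    FROM (
        SELECT id, name,
               COUNT(*) OVER (PARTITION BY id) AS duplicates
        FROM users
    ) AS sub
    WHERE duplicates > 1
    ORDER BY id, name
    """
).fetchall()

for row in rows:
    print(row)
```

Dropping the outer WHERE would also list id 2 with a count of 1, which matches the expected output shown in the question.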
As an alternative to the accepted and perfectly correct answer, you can use:
SELECT {all your required 'variables' (they are not variables, but attributes)}
, cnt.Count_Dups
FROM Table_NAME TN
INNER JOIN (
SELECT id
, COUNT(1) Count_Dups
FROM Table_NAME
GROUP BY id
HAVING COUNT(1) > 1 -- If you want only duplicates
) cnt
ON cnt.id = TN.id
edit: According to your edit, duplicates are on id only. Edited my query accordingly.
Try this:
SELECT
id, COUNT(id)
FROM
Table_NAME
GROUP BY
id
HAVING
COUNT(id) > 1

SQL query to select the latest records with a distinct subject

I am using SQL Server and have a table set up like below:
| id | subject | content | moreContent | modified |
| 1 | subj1 | aaaa | aaaaaaaaaaa | 03/03/2015 |
| 2 | subj1 | bbbb | aaaaaaaaaaa | 03/05/2015 |
| 3 | subj2 | cccc | aaaaaaaaaaa | 03/03/2015 |
| 4 | subj1 | dddd | aaaaaaaaaaa | 03/01/2015 |
| 5 | subj2 | eeee | aaaaaaaaaaa | 07/02/2015 |
I want to select the latest record for each subject heading, so the records to be returned would be:
| id | subject | content | moreContent | modified |
| 2 | subj1 | bbbb | aaaaaaaaaaa | 03/05/2015 |
| 3 | subj2 | cccc | aaaaaaaaaaa | 03/03/2015 |
SELECT Subject, MAX(Modified) FROM [CareManagement].[dbo].[Careplans] GROUP BY Subject
I could do a query like the one above, but I want to preserve all of the content from the selected rows. To return the content columns I would need to apply an aggregate function, or add them to the group by clause which wouldn't give me the desired effect.
I have also looked at nested queries but not found a successful solution yet. If anyone could assist that would be great.
You can use ROW_NUMBER():
SELECT id, subject, content, moreContent, modified
FROM (
SELECT id, subject, content, moreContent, modified,
ROW_NUMBER() OVER (PARTITION BY subject
ORDER BY modified DESC) AS rn
FROM [CareManagement].[dbo].[Careplans] ) t
WHERE rn = 1
rn = 1 will return each record having the latest modified date per subject. In case there are two or more records sharing the same 'latest' date and you want all of these records returned, then you might have a look at RANK() window function.
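To illustrate the difference between the two functions on ties, here is a small sketch using Python's sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or newer). The dates are rewritten in ISO format so they sort correctly as text, assuming the question's dates are day/month/year, and row 6 is a made-up extra row that ties with row 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Careplans (id INTEGER, subject TEXT, content TEXT, modified TEXT)"
)
conn.executemany(
    "INSERT INTO Careplans VALUES (?, ?, ?, ?)",
    [
        (1, "subj1", "aaaa", "2015-03-03"),
        (2, "subj1", "bbbb", "2015-05-03"),
        (3, "subj2", "cccc", "2015-03-03"),
        (4, "subj1", "dddd", "2015-01-03"),
        (5, "subj2", "eeee", "2015-02-07"),
        (6, "subj1", "ffff", "2015-05-03"),  # hypothetical tie with id 2
    ],
)


def latest_ids(window_fn):
    """Return ids ranked first per subject by ROW_NUMBER or RANK."""
    sql = f"""
        SELECT id
        FROM (
            SELECT id,
                   {window_fn}() OVER (
                       PARTITION BY subject ORDER BY modified DESC
                   ) AS rn
            FROM Careplans
        ) t
        WHERE rn = 1
        ORDER BY id
    """
    return [row[0] for row in conn.execute(sql)]


row_number_ids = latest_ids("ROW_NUMBER")  # exactly one id per subject
rank_ids = latest_ids("RANK")              # every id tied for the latest date

print(row_number_ids)
print(rank_ids)
```

With ROW_NUMBER() the choice between the tied rows 2 and 6 is arbitrary unless the ORDER BY gets a tiebreaker column such as id.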
Using ROW_NUMBER this becomes pretty simple.
with myCTE as
(
select id
, Subject
, content
, morecontent
, Modified
, ROW_NUMBER() over (PARTITION BY [Subject] order by Modified desc) as RowNum
from [CareManagement].[dbo].[Careplans]
)
select id
, Subject
, content
, morecontent
, Modified
from myCTE
where RowNum = 1
You could use the rank window function to retrieve only the latest record:
SELECT id, subject, content, moreContent, modified
FROM (SELECT id, subject, content, moreContent, modified,
RANK() OVER (PARTITION BY subject ORDER BY modified DESC) AS rk
FROM [CareManagement].[dbo].[Careplans]) t
WHERE rk = 1