How to create a unique ID based on conditions in SQL?

I would like to generate a new ID, in any format (11, 12, 13... in the example below),
based on the following condition:
Every time the days column value is greater than 1 and not null, the current row and all following rows get the same ID, until a new value meets the condition.
This applies within the same email.
Below you can see the expected ID (in the format of XX).
I thought about using two conditions, in the following order:
1. Every time the days column value is greater than 1, all following rows get the same ID until a new value meets the condition.
2. AND when the lag (previous) value is equal to 0/1/null.

Assuming you have an EmailDate column over which you're ordering (a DATETIME field, really), try something like this:
WITH
TableNameWithEmailDateIDs AS (
    -- number every row so that ranges of rows can be joined later
    SELECT
        *,
        ROW_NUMBER() OVER (
            ORDER BY
                Email DESC,
                EmailDate
        ) AS EmailDateID
    FROM
        TableName
),
IDs AS (
    SELECT
        *,
        -- EmailDateID of the next key row; must use the same ordering as EmailDateID
        LEAD(EmailDateID, 1) OVER (
            ORDER BY
                Email DESC,
                EmailDate
        ) AS LeadEmailDateID
    FROM
        (
            SELECT
                *,
                -- REMOVE +10 if you don't want 11 to be starting ID
                ROW_NUMBER() OVER (
                    ORDER BY
                        Email DESC,
                        EmailDate
                )+10 AS ID
            FROM
                TableNameWithEmailDateIDs
            WHERE
                Days > 1
                OR Days IS NULL
        ) X
)
SELECT
    COALESCE(TableName.EmailDate, IDs.EmailDate) AS EmailDate,
    IDs.Email,
    COALESCE(TableName.Days, IDs.Days) AS Days,
    IDs.ID
FROM
    IDs
    LEFT JOIN TableNameWithEmailDateIDs TableName
        ON IDs.Email = TableName.Email
        AND TableName.EmailDateID BETWEEN
            IDs.EmailDateID
            -- the last key row has no LEAD value, so its range is left open-ended
            AND COALESCE(IDs.LeadEmailDateID-1, TableName.EmailDateID)
ORDER BY
    ID DESC,
    TableName.EmailDate DESC
;
First, create a CTE that generates IDs for each distinct Email/Date combo (helpful for LEFT JOIN condition later). Then, create a CTE that generates IDs for rows that meet your condition (i.e. the important rows). Finally, LEFT JOIN your main table onto that CTE to fill in the "gaps", so to speak.
I suggest running each of the components of this query independently to fully understand what's going on.
Hope it helps!
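For comparison, the same grouping can also be done without a self-join, by flagging each row where Days > 1 and taking a running sum of the flags per Email. This is only a sketch, not the method used in the query above: it runs through Python's sqlite3 module (SQLite 3.25+ for window functions) so it can be tried without a server, the sample rows are made up, and it follows the condition as stated in the question (NULL does not start a new group).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableName (Email TEXT, EmailDate TEXT, Days INTEGER)")
# Made-up sample data: Days is NULL on the first row of each email.
rows = [
    ("a@x.com", "2020-01-01", None),
    ("a@x.com", "2020-01-02", 1),
    ("a@x.com", "2020-01-05", 3),
    ("a@x.com", "2020-01-06", 1),
    ("a@x.com", "2020-01-10", 4),
    ("b@x.com", "2020-01-01", None),
    ("b@x.com", "2020-01-03", 2),
    ("b@x.com", "2020-01-04", 0),
]
conn.executemany("INSERT INTO TableName VALUES (?, ?, ?)", rows)

query = """
WITH Flagged AS (
    SELECT *,
           -- running count of rows with Days > 1, restarted for every Email
           SUM(CASE WHEN Days > 1 THEN 1 ELSE 0 END) OVER (
               PARTITION BY Email ORDER BY EmailDate
           ) AS Grp
    FROM TableName
)
SELECT Email, EmailDate, Days,
       -- +10 only so that numbering starts at 11, as in the question
       DENSE_RANK() OVER (ORDER BY Email, Grp) + 10 AS ID
FROM Flagged
ORDER BY Email, EmailDate
"""
result = conn.execute(query).fetchall()
ids = [row[3] for row in result]
print(ids)  # [11, 11, 12, 12, 13, 14, 15, 15]
```

Rows that come before the first qualifying row of an email share their own ID (group 0 for that email).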

Related

Getting MAX of a column and adding one more

I'm trying to write an SQL query that returns the greatest number from a column together with its respective ID.
For more information: I have two columns, ID and NUMBER. Both of them have 2 entries, and I want to get the highest number with the ID next to it. This is what I tried, without success:
SELECT ID, MAX(NUMBER) AS MAXNUMB
FROM TABLE1
GROUP BY ID, MAXNUMB;
The problem I'm experiencing is that it just shows ALL the entries, and if I add a "where" expression it shows the same thing (all entries [ids+numbers]).
P.S.: Yes, I got what I wanted, but only with one column (number); if I add another column (ID) to the select, it breaks.
Try:
SELECT
ID,
A_NUMBER
FROM TABLE1
WHERE A_NUMBER = (
SELECT MAX(A_NUMBER)
FROM TABLE1);
Presuming you want the IDs* of the row with the highest number (and not, instead, the highest number for each ID -- if IDs were not unique in your table, for example).
* there may be more than one ID returned if there are two or more IDs with equal maximum numbers
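To see the tie behaviour mentioned in the footnote, here is a small sketch using Python's sqlite3 module. The three sample rows are made up, and SQLite stands in for whatever database you are using.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TABLE1 (ID INTEGER, A_NUMBER INTEGER)")
# Made-up data: IDs 2 and 3 share the maximum value.
conn.executemany("INSERT INTO TABLE1 VALUES (?, ?)", [(1, 10), (2, 25), (3, 25)])

top_rows = conn.execute("""
    SELECT ID, A_NUMBER
    FROM TABLE1
    WHERE A_NUMBER = (SELECT MAX(A_NUMBER) FROM TABLE1)
    ORDER BY ID
""").fetchall()
print(top_rows)  # [(2, 25), (3, 25)]
```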
You can try this:
Select ID,maxNumber
From
(
SELECT
ID,
(Select Max(NUMBER) from Tmp where Id = t.Id) maxNumber
FROM
Tmp t
)T1
Group By ID,maxNumber
The query you posted has an illegal column name (number) and groups by the alias for the max value, which is illegal and also doesn't make sense; you can't include the unaliased max() in the group-by either. So it's likely you're actually doing something like:
select id, max(numb) as maxnumb
from table1
group by id;
which will give one row per ID, with the maximum numb (which is the new name I've made up for your numeric column) for each ID. Or as you said you get "ALL the entries" you might have group by id, numb, which would show all rows from the table (unless there are duplicate combinations).
To get the maximum numb and the corresponding id you could group by id only, order by descending maxnumb, and then return the first row only:
select id, max(numb) as maxnumb
from table1
group by id
order by maxnumb desc
fetch first 1 row only
If there are two IDs with the same maxnumb then you would only get one of them - and which one is indeterminate unless you modify the order by - but in that case you might prefer to use first 1 row with ties to see them all.
You could achieve the same thing with a subquery and an analytic function to generate a ranking, and have the outer query return the highest-ranking row(s):
select id, numb as maxnumb
from (
select id, numb, dense_rank() over (order by numb desc) as rnk
from table1
)
where rnk = 1
You could also use keep to get the same result as first 1 row only:
select max(id) keep (dense_rank last order by numb) as id, max(numb) as maxnumb
from table1
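The difference between returning all tied rows and returning a single row can be tried out with the sketch below. It uses Python's sqlite3 module with made-up rows; SQLite has neither fetch first nor keep, so LIMIT 1 is used here as a stand-in for the single-row variants, and only the dense_rank() query is the same as above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, numb INTEGER)")
# Made-up data: ids 2 and 3 tie for the highest numb.
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [(1, 7), (2, 9), (3, 9)])

# Ranking version: every row tied for first place is returned.
ranked = conn.execute("""
    SELECT id, numb AS maxnumb
    FROM (
        SELECT id, numb, DENSE_RANK() OVER (ORDER BY numb DESC) AS rnk
        FROM table1
    )
    WHERE rnk = 1
    ORDER BY id
""").fetchall()

# Single-row version: the extra sort key (id DESC) makes the choice deterministic.
single = conn.execute("""
    SELECT id, numb AS maxnumb
    FROM table1
    ORDER BY numb DESC, id DESC
    LIMIT 1
""").fetchall()

print(ranked)  # [(2, 9), (3, 9)]
print(single)  # [(3, 9)]
```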

SQL query for filtering duplicate rows of a column by the minimum DateTime of those corresponding rows

I have a SQL database table, "Helium_Test_Data", that has multiple entries based on the KeyID column (the KeyID represents a single tested part ). I need to query the entries and only show one entry per KeyID (part) based on the earliest creation date-time (format example is 2018-12-29 08:22:11.123). This is because the same part was tested several times but the first reading is the one I need to use. Here is the query currently tried:
SELECT mt.*
FROM Helium_Test_Data mt
INNER JOIN
(
SELECT
KeyID,
MIN(DateTime) AS DateTime
FROM Helium_Test_Data
WHERE PSNo='11166565'
GROUP BY KeyID
) t ON mt.KeyID = t.KeyID AND mt.DateTime = t.DateTime
WHERE PSNo='11167197'
AND (mt.DateTime > '2018-12-29 07:00')
AND (mt.DateTime < '2018-12-29 18:00') AND OK=1
ORDER BY KeyId,DateTime
It returns only the rows that have no duplicate KeyID present in the table whereas I need one row per every single KeyID (duplicate or not). And for the duplicate ones, I need the earliest date.
Thanks in advance for the help.
Use the row_number() window function, which most DBMSs support:
select * from
(
select *,row_number() over(partition by KeyID order by DateTime) rn
from Helium_Test_Data
) t where t.rn=1
Or you could use a correlated subquery:
select t1.* from Helium_Test_Data t1
where t1.DateTime= (select min(DateTime)
from Helium_Test_Data t2
where t2.KeyID=t1.KeyID
)
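A quick way to check the row_number() approach is the sketch below, run through Python's sqlite3 module. The Reading column and the sample rows are hypothetical, and the PSNo/time filters from the question are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Helium_Test_Data (KeyID INTEGER, DateTime TEXT, Reading REAL)"
)
conn.executemany(
    "INSERT INTO Helium_Test_Data VALUES (?, ?, ?)",
    [
        (1, "2018-12-29 08:22:11.123", 0.5),
        (1, "2018-12-29 09:10:00.000", 0.7),  # retest of part 1
        (2, "2018-12-29 08:30:00.000", 0.4),  # part tested only once
        (3, "2018-12-29 10:00:00.000", 0.9),
        (3, "2018-12-29 09:45:00.000", 0.8),  # earlier test, inserted later
    ],
)

first_tests = conn.execute("""
    SELECT KeyID, DateTime, Reading
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY KeyID ORDER BY DateTime) AS rn
        FROM Helium_Test_Data
    ) t
    WHERE t.rn = 1
    ORDER BY KeyID
""").fetchall()

for row in first_tests:
    print(row)
```

Each KeyID appears exactly once, whether or not it was retested, and the row kept is the one with the earliest DateTime.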

Select last item for each unique column value

I have a table containing message logs. Each conversation has a conversation ID.
I want to select distinct conversation IDs, and for each of them, find the latest message with that conversation ID and join it into the row.
This is what I tried, but it doesn't add any data into the result except the two columns (conversationId and id). I want to get all columns from that table for each row with the latest message.
SELECT
logs.conversationId,
-- latest message id
MAX(logs.id) AS id
FROM [dbo].[Logs] AS logs
-- trying to get the remaining columns for the last message with that conversation ID
LEFT JOIN [dbo].[Logs] AS logs2 ON logs.id = logs2.id
WHERE
-- only conversations for last month
logs.timestamp >= DATEADD(month, -1, GETDATE())
GROUP BY logs.conversationId
When I try to add another column into SELECT, I get the error saying I need to add that column into the GROUP BY clause. But that causes the statement to run for an extremely long time, over 20 seconds for just a few dozen rows in the result.
Use the row_number() function:
select *
from (
select *,
row_number() over(partition by conversationId order by id desc) as rn
from logs
) as t where t.rn=1
First get the max log id per conversation from logs, then apply a left join:
select * from
(SELECT
    logs.conversationId,
    MAX(logs.id) AS id
 FROM [dbo].[Logs] AS logs group by logs.conversationId) a
left join [dbo].[Logs] AS logs2 ON a.id = logs2.id and a.conversationId = logs2.conversationId
I would use a correlated subquery in the WHERE clause:
select *
from logs t
where t.id = (
SELECT MAX(tt.id)
from logs tt
WHERE tt.conversationId = t.conversationId
GROUP BY tt.conversationId
)
Note:
If you create an index on id, this might be faster than the row_number version.
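Here is a small sketch of the subquery version, run through Python's sqlite3 module. The message column, the sample rows, and the composite index on (conversationId, id) are my own assumptions for illustration; whether an index actually helps depends on your data and DBMS, so check the query plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs (id INTEGER PRIMARY KEY, conversationId TEXT, message TEXT)"
)
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?)",
    [(1, "c1", "hello"), (2, "c2", "hi"), (3, "c1", "how are you?"), (4, "c2", "bye")],
)
# Hypothetical index so the per-conversation MAX(id) lookup can use an index.
conn.execute("CREATE INDEX idx_logs_conv_id ON logs (conversationId, id)")

latest = conn.execute("""
    SELECT t.*
    FROM logs t
    WHERE t.id = (
        SELECT MAX(tt.id)
        FROM logs tt
        WHERE tt.conversationId = t.conversationId
    )
    ORDER BY t.conversationId
""").fetchall()
print(latest)  # [(3, 'c1', 'how are you?'), (4, 'c2', 'bye')]
```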

Microsoft Access 2010: Select most recent max Record ID for each LANID

I need to filter out this data based on some criteria.
For every unique LANID, a user can have up to 2 records. Some users will only have 1 record.
I need to select the max Record ID for each LANID.
So create one query to determine the max(recordID) when grouped by LANID, then a second query that uses the first as its data source, joining it back to your table on LANID and max(recordID).
Assuming the last update date is not duplicated for a given row, then one method is to use a correlated subquery to get the last date and then get the rest of the columns in the row:
select sd.*
from sampleData as sd
where sd.RecordId = (select max(sd2.RecordId)
from sampleData as sd2
where sd2.lanId = sd.lanId
);
EDIT:
If you wanted the largest record id for the most recent update date:
select sd.*
from sampleData as sd
where sd.RecordId = (select top 1 sd2.RecordId
from sampleData as sd2
where sd2.lanId = sd.lanId
order by sd2.lastUpdateDate desc, sd2.RecordId desc
);
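The two queries can return different rows when the highest RecordId is not the most recently updated one. The sketch below shows this with Python's sqlite3 module; the sample rows are made up, and since SQLite has no TOP, LIMIT 1 is swapped in for Access's top 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sampleData (RecordId INTEGER, lanId TEXT, lastUpdateDate TEXT)"
)
conn.executemany(
    "INSERT INTO sampleData VALUES (?, ?, ?)",
    [
        (1, "u1", "2020-01-05"),  # lower id, but updated more recently
        (2, "u1", "2020-01-03"),
        (3, "u2", "2020-02-01"),  # user with a single record
    ],
)

by_max_id = conn.execute("""
    SELECT sd.RecordId, sd.lanId
    FROM sampleData AS sd
    WHERE sd.RecordId = (SELECT MAX(sd2.RecordId)
                         FROM sampleData AS sd2
                         WHERE sd2.lanId = sd.lanId)
    ORDER BY sd.lanId
""").fetchall()

by_latest_update = conn.execute("""
    SELECT sd.RecordId, sd.lanId
    FROM sampleData AS sd
    WHERE sd.RecordId = (SELECT sd2.RecordId
                         FROM sampleData AS sd2
                         WHERE sd2.lanId = sd.lanId
                         ORDER BY sd2.lastUpdateDate DESC, sd2.RecordId DESC
                         LIMIT 1)  -- LIMIT 1 stands in for Access's TOP 1
    ORDER BY sd.lanId
""").fetchall()

print(by_max_id)         # [(2, 'u1'), (3, 'u2')]
print(by_latest_update)  # [(1, 'u1'), (3, 'u2')]
```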

SQL Max returns duplicates if values are equal

I've got a view that contains a document ID column and a date column as well as a dozen other columns that aren't relevant to this problem. There can be multiple rows with the same document ID, but the dates are usually different. This signifies that it's the same document, just a revision of it. The problem is if I have two rows where the document ID and the date are the same, I get both. I just want to get one. It doesn't matter which one, as long as I only get one.
The following has duplicates where the document ID and date are the same.
SELECT FSD.*
FROM vFSD FSD
INNER JOIN
(
SELECT InternalID, MAX(FileLastUploadedDate) AS FileLastUploadedDate
FROM vFSD
GROUP BY InternalID
) gFSD ON FSD.InternalID = gFSD.InternalID AND FSD.FileLastUploadedDate = gFSD.FileLastUploadedDate
I've also tried it with DISTINCT, but it didn't fix the problem.
SELECT DISTINCT FSD.*
FROM vFSD FSD
INNER JOIN
(
SELECT DISTINCT InternalID, MAX(FileLastUploadedDate) AS FileLastUploadedDate
FROM vFSD
GROUP BY InternalID
) gFSD ON FSD.InternalID = gFSD.InternalID AND FSD.FileLastUploadedDate = gFSD.FileLastUploadedDate
You can use ROW_NUMBER to bring back only one arbitrary row in the event that two are tied with the same greatest FileLastUploadedDate for an InternalID:
WITH CTE
AS (SELECT *,
ROW_NUMBER() OVER (PARTITION BY InternalID
ORDER BY FileLastUploadedDate DESC) AS RN
FROM vFSD)
SELECT InternalID,
FileLastUploadedDate
/*Other desired columns*/
FROM CTE
WHERE RN = 1
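To compare the two approaches side by side, here is a sketch using Python's sqlite3 module. vFSD is created as a plain table rather than a view, and the FileName column and sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vFSD (InternalID INTEGER, FileLastUploadedDate TEXT, FileName TEXT)"
)
conn.executemany(
    "INSERT INTO vFSD VALUES (?, ?, ?)",
    [
        (1, "2020-01-01", "a.pdf"),
        (1, "2020-03-01", "b.pdf"),  # tied with c.pdf for the latest date
        (1, "2020-03-01", "c.pdf"),
        (2, "2020-02-01", "d.pdf"),
    ],
)

# Join on MAX(date): both tied rows for InternalID 1 come back.
joined = conn.execute("""
    SELECT FSD.*
    FROM vFSD FSD
    INNER JOIN (
        SELECT InternalID, MAX(FileLastUploadedDate) AS FileLastUploadedDate
        FROM vFSD
        GROUP BY InternalID
    ) gFSD ON FSD.InternalID = gFSD.InternalID
          AND FSD.FileLastUploadedDate = gFSD.FileLastUploadedDate
""").fetchall()

# ROW_NUMBER: exactly one row per InternalID; which tied row wins is arbitrary.
deduped = conn.execute("""
    WITH CTE AS (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY InternalID
                                  ORDER BY FileLastUploadedDate DESC) AS RN
        FROM vFSD
    )
    SELECT InternalID, FileLastUploadedDate
    FROM CTE
    WHERE RN = 1
    ORDER BY InternalID
""").fetchall()

print(len(joined))  # 3
print(deduped)      # [(1, '2020-03-01'), (2, '2020-02-01')]
```

If you care which of the tied rows is kept, add a further column to the ORDER BY inside the OVER clause as a tie-breaker.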