Selecting last row for unique index ordered by column - sql

Objective
I have a list of payment records that contain account_uuid, price, type & created_at. I need to get a list of the latest payment record specific to each account_uuid where the type = 0.
What I have tried
My first attempt was to ORDER BY created_at so that the latest row came last, then GROUP BY account_uuid. The problem is that I would have to add both account_uuid and created_at to the GROUP BY expression, and rows are only grouped when both values match, which never happens, so I would still get multiple records per account_uuid.
My second attempt was to SELECT DISTINCT ON account_uuid. This didn't work for the same reason: it complains that my ORDER BY column must be included in the DISTINCT ON expression, which would yield the same result.
Sample Data
account_uuid                          price   type  created_at (↑)
aa4dd27e-b72a-40fd-bdab-94810e585734  8.96    0     1649840899215
5c5625af-65e5-43d3-a39d-b896cd4d02a3  14.58   0     1649841117203
aa4dd27e-b72a-40fd-bdab-94810e585734  null    2     1649843706217
d8a106f9-dbf2-42f1-ac6b-a17e88700fab  3.939   0     1650434747192
aa4dd27e-b72a-40fd-bdab-94810e585734  14.58   0     1650438658596
Sample Result (Desired)
account_uuid                          price   type (=0)  created_at (↑)
5c5625af-65e5-43d3-a39d-b896cd4d02a3  14.58   0          1649841117203
d8a106f9-dbf2-42f1-ac6b-a17e88700fab  3.939   0          1650434747192
aa4dd27e-b72a-40fd-bdab-94810e585734  14.58   0          1650438658596
Problem / Question
What I am trying to achieve is the sample result above: only the latest row for each account_uuid where type is 0, sorted by created_at ascending. Ideally I would like to do it without any joins or subqueries, but I am happy just to get it working.
Thank You

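Edit: A simpler solution would be:
select distinct on (account_uuid)
     max(created_at)
    ,price
    ,type
    ,account_uuid
from tableName
where type = 0
group by price, type, account_uuid
-- DISTINCT ON keeps the first row per account_uuid, so sort the latest group first
order by account_uuid, max(created_at) desc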
You can achieve this in Postgres without joins or subqueries by combining DISTINCT ON with a window function. This assumes that you truly want "the latest row where type = 0" and not "the latest row, but only if its type = 0".
select distinct on (account_uuid)
     max(created_at) over (partition by account_uuid order by created_at desc)
    ,price
    ,type
    ,account_uuid
from tableName
where type = 0
-- without this ORDER BY, DISTINCT ON may keep any row of the account
order by account_uuid, tableName.created_at desc

You need to compute ROW_NUMBER() in a subquery, and then in the outer WHERE clause keep only the rows where RN = 1.
There are many code examples of using ROW_NUMBER() if you search for it.
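As a rough sketch of the ROW_NUMBER() approach, the snippet below loads the question's sample rows into an in-memory SQLite database (used here only as a stand-in for Postgres; it needs SQLite 3.25 or newer for window functions). The table name payments is an assumption, since the question does not name the table.

```python
import sqlite3

# Rows copied from the question's sample data; the table name "payments" is hypothetical.
rows = [
    ("aa4dd27e-b72a-40fd-bdab-94810e585734", 8.96, 0, 1649840899215),
    ("5c5625af-65e5-43d3-a39d-b896cd4d02a3", 14.58, 0, 1649841117203),
    ("aa4dd27e-b72a-40fd-bdab-94810e585734", None, 2, 1649843706217),
    ("d8a106f9-dbf2-42f1-ac6b-a17e88700fab", 3.939, 0, 1650434747192),
    ("aa4dd27e-b72a-40fd-bdab-94810e585734", 14.58, 0, 1650438658596),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments (account_uuid TEXT, price REAL, type INTEGER, created_at INTEGER)"
)
conn.executemany("INSERT INTO payments VALUES (?, ?, ?, ?)", rows)

# Number the type = 0 rows of each account from newest to oldest, keep number 1,
# then sort the survivors by created_at ascending as the question asks.
latest = conn.execute("""
    SELECT account_uuid, price, type, created_at
    FROM (
        SELECT p.*,
               ROW_NUMBER() OVER (PARTITION BY account_uuid ORDER BY created_at DESC) AS rn
        FROM payments p
        WHERE type = 0
    )
    WHERE rn = 1
    ORDER BY created_at
""").fetchall()

for row in latest:
    print(row)
```

The filter on type sits inside the subquery so that rows of other types never compete for row number 1.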

Related

need to pull a specific record

There is one record with duplicate values in every column except one, which holds x and y:
record status
XXXXXXXXXX A
XXXXXXXXXX B
I need to pull A only and drop the duplicate row with B.
Select record
case
when status in ("'a', 'b'") then ('a')
from xyz
Suppose you have data like the sample below, where Status repeats for the first column
but you are only interested in the status with the lowest value.
In this case the following SQL may help. Here we partition on the key field and order by Status, so that we can filter on the rank to get the desired result.
WITH sampleData AS
 (SELECT '1234' as Field1,  'A' as STATUS UNION ALL 
  SELECT '1234',  'C' UNION ALL
  SELECT '5678', 'A' UNION ALL 
  SELECT '5678',  'B' )
 select * except(rank) from (
 select *, rank() over (partition by Field1 order by STATUS ASC) rank from sampleData)
 where rank = 1
 order by Field1
Consider the approach below:
select * from sampledata
qualify 1 = row_number() over win
window win as (partition by field1 order by if(status='A',1,2) )
Applied to the sample data in your question, this keeps one row per field1, preferring the row whose status is 'A'.
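The priority-ordering idea can also be tried outside BigQuery. The sketch below is an assumption-laden translation to SQLite (3.25 or newer), which has no QUALIFY or IF(), so it uses a subquery and a CASE expression instead; the data is the sampleData set from the earlier answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sampledata (field1 TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO sampledata VALUES (?, ?)",
    [("1234", "A"), ("1234", "C"), ("5678", "A"), ("5678", "B")],
)

# Rank status 'A' ahead of everything else within each field1, then keep the first row.
picked = conn.execute("""
    SELECT field1, status
    FROM (
        SELECT field1, status,
               ROW_NUMBER() OVER (
                   PARTITION BY field1
                   ORDER BY CASE WHEN status = 'A' THEN 1 ELSE 2 END
               ) AS rn
        FROM sampledata
    )
    WHERE rn = 1
    ORDER BY field1
""").fetchall()

print(picked)
```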

Counting how many times one specific value changed to another specific value in order of date range and grouped by ID

I have a table like the one below. For each ID, I need to count how many times the stage went from 'Waste Sale' directly to 'On Stop' in the very next row, ordered by ascending date. If there are no such instances, the count should be 0.
ID  Stage name  Stage Changed Date
1   Waste Sale  06-05-2022
1   On Stop     08-06-2022
1   Cancelled   09-02-2022
2   Waste Sale  06-05-2022
2   On Stop     07-05-2022
2   Waste Sale  08-06-2022
2   On Stop     10-07-2022
3   Cancelled   10-07-2022
3   On Stop     11-07-2022
The result I would be looking for based on the above table would be something like this:
ID  Count of 'Waste Sales to On Stops'
1   1
2   2
3   0
ID 1 having a count of 1 because there was one instance of 'Waste Sale' changing to 'On Stop' in the very next value based on date range
ID 3 having a count of 0 because even though the stage name changed to 'On Stop' the previous value based on date range wasn't 'Waste Sale'.
I have a hunch I would have to use something like LEAD() and GROUP BY/ ORDER BY but since I'm so new to SQL would really appreciate some help on the specific syntax and coding. Any version of SQL is okay.
We can use window function lead to take a peek at the next value of the query result.
select distinct id,
    (
        select count(*)
        from
        (
            select *,
                lead(stage_name)
                    over(
                        partition by id
                        order by stage_changed_date)
                    as stage_next
            from sales s2
        ) s3
        where s3.id = s1.id
            and s3.stage_name = 'waste sale'
            and s3.stage_next = 'on stop'
    ) as count_of_waste_sales_to_on_stop
from sales s1
order by id;
The query above uses lead(stage_name) over(partition by id order by stage_changed_date) to fetch the next stage_name within each id, ordered by stage_changed_date. You can check the query on DB Fiddle.
Note:
I have no experience with Zoho, so I'm not sure the query will work there as is. They say it supports ANSI SQL, but there may be some differences from MySQL.
The column names are not exactly the same as in the question, because the query was only tested on DB Fiddle.
There may be a better query out there waiting to be written.
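For anyone who wants to try the LEAD() idea locally, here is a small sketch against an in-memory SQLite database (3.25 or newer). It is a variation on the query above, not the same query: it counts with a conditional SUM and GROUP BY instead of a correlated subquery. The table name sales and the snake_case column names follow the answer; storing the dates as ISO strings and reading the question's dates as month-day-year are my assumptions, made so that the rows sort in the order they are listed.

```python
import sqlite3

# Question data; dates rewritten as ISO strings (assumed month-day-year in the question).
rows = [
    (1, "Waste Sale", "2022-06-05"),
    (1, "On Stop", "2022-08-06"),
    (1, "Cancelled", "2022-09-02"),
    (2, "Waste Sale", "2022-06-05"),
    (2, "On Stop", "2022-07-05"),
    (2, "Waste Sale", "2022-08-06"),
    (2, "On Stop", "2022-10-07"),
    (3, "Cancelled", "2022-10-07"),
    (3, "On Stop", "2022-11-07"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, stage_name TEXT, stage_changed_date TEXT)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# Pair every row with the next stage of the same id, then count the
# 'Waste Sale' -> 'On Stop' pairs; ids without such a pair still appear with 0.
counts = conn.execute("""
    SELECT id,
           SUM(CASE WHEN stage_name = 'Waste Sale' AND stage_next = 'On Stop'
                    THEN 1 ELSE 0 END) AS waste_sale_to_on_stop
    FROM (
        SELECT id, stage_name,
               LEAD(stage_name) OVER (PARTITION BY id ORDER BY stage_changed_date) AS stage_next
        FROM sales
    )
    GROUP BY id
    ORDER BY id
""").fetchall()

print(counts)
```

Note that SQLite compares strings case-sensitively by default, so the literals here match the stored capitalisation.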

Need to result of column based on available column in SQL

I have a view that returns the following result:
I need to add an identifier column, as in the image below.
Required output:
Explanation of output: In the first image, release 1 has 3 dates. I need the identifier to be 1 for the row with MAX(IMPL_DATE). For RELEASE_ID = 1 we have 08/20/2016, 08/09/2016 and 10/31/2016; 10/31/2016 is the largest date, so that row needs identifier 1 and the other two get 0. The same goes for RELEASE_ID 2: it has 2 dates, of which 01/13/2017 is the largest, so that row needs 1 and the other gets 0.
Thanks In advance...
You can do this with window functions:
select t.*,
       (case when rank() over (partition by portfolio_id, release_id
                               order by impl_date desc
                              ) = 1
             then 1 else 0
        end) as identifier
from t;
The above will assign "1" to all rows with the maximum date. If you want to ensure that only one row is assigned a value (even when there are ties), then use row_number() instead of rank().
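To see the difference between rank() and row_number() on ties, here is a minimal sketch using an in-memory SQLite database (3.25 or newer). The three dates for release 1 and the 01/13/2017 date for release 2 come from the question; the portfolio_id value, the second date of release 2, and the whole of release 3 (a deliberate tie) are made up for illustration. Dates are stored as ISO strings so they sort correctly.

```python
import sqlite3

rows = [
    (1, 1, "2016-08-20"),
    (1, 1, "2016-08-09"),
    (1, 1, "2016-10-31"),
    (1, 2, "2016-12-15"),  # hypothetical second date for release 2
    (1, 2, "2017-01-13"),
    (1, 3, "2017-03-01"),  # hypothetical release with two tied maximum dates
    (1, 3, "2017-03-01"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (portfolio_id INTEGER, release_id INTEGER, impl_date TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

# flag_rank marks every row that shares the maximum date;
# flag_row_number marks exactly one row per release, even when dates tie.
flags = conn.execute("""
    SELECT release_id, impl_date,
           CASE WHEN RANK() OVER (PARTITION BY portfolio_id, release_id
                                  ORDER BY impl_date DESC) = 1
                THEN 1 ELSE 0 END AS flag_rank,
           CASE WHEN ROW_NUMBER() OVER (PARTITION BY portfolio_id, release_id
                                        ORDER BY impl_date DESC) = 1
                THEN 1 ELSE 0 END AS flag_row_number
    FROM t
    ORDER BY release_id, impl_date
""").fetchall()

for row in flags:
    print(row)
```

With row_number(), which of the tied rows gets the 1 is arbitrary unless the ORDER BY includes a tie-breaking column.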

How to write an SQL query to have alternating pattern between rows (like -1 and 1)?

I cannot get an alternating pattern of 1 and -1 with my database.
This explains what I am trying to do.
ID Purpose Date Val
1 Derp 4/1/1969 1
1 Derp 4/1/1969 -1
2 Derp 4/2/2011 1
2 Derp 4/2/2011 -1
From a database that is something like
ID Purpose Date
1 Derp 4/1/1969
1 Herp 4/1/1911
2 Woot 4/2/1311
2 Wall 4/2/211
Here is my attempt:
SELECT
ID
,Purpose
,Date
,Val as 1
FROM (
SELECT FIRST(Purpose)
FROM DerpTable WHERE Purpose LIKE '%DERP%'
GROUP BY ID, DATE) as HerpTable, DerpTable
WHERE HerpTable.ID = DerpTable.ID AND DerpTable.ID = HerpTable.ID
This query does not work for me because SSMS does not recognize 'FIRST' or 'FIRST_VALUE' as built-in functions. Thus, I have no way of numbering the first incident of derp and giving it a value.
Problems:
I am using sql2012 and thus cannot use First.
I tried using last_value and first_value as seen here but get errors indicating that function is not found
A bunch of sql queries. I've been staring at the MSDN T-SQL help pages
What I need is a fresh perspective and assistance. Am I making this too hard?
Use a subquery along with ROW_NUMBER and the modulo operator:
select
    ID,
    Purpose,
    Date,
    case when rownum % 2 = 0 then 1 else -1 end as Val
from (
    SELECT
        DerpTable.ID
        ,DerpTable.Purpose
        ,DerpTable.Date
        -- number the rows so that even and odd rows can be told apart
        ,ROW_NUMBER() over (order by DerpTable.ID) as rownum
    FROM (
        SELECT
            ID,
            Date
        FROM DerpTable WHERE Purpose LIKE '%DERP%'
        GROUP BY ID, Date) as HerpTable, DerpTable
    WHERE HerpTable.ID = DerpTable.ID
) [t1]
ROW_NUMBER will assign a value to each row, in this case it's an incrementing value. Using the modulus with 2 allows us to check if it's even or odd and assign 1 or -1.
Note: I don't know if this query will run since I don't know the architecture of your database, but the idea should get you there.
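The row-number-plus-modulo idea can be checked on a toy table. The sketch below uses an in-memory SQLite database (3.25 or newer) rather than SQL Server, and it makes two choices of its own: rows are numbered per ID (PARTITION BY ID) so the pattern restarts for each ID, and odd row numbers map to 1 so that the first row of each ID gets 1. The table and column names follow the question; ordering by Purpose within an ID is an arbitrary tie-breaker picked for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DerpTable (ID INTEGER, Purpose TEXT)")
conn.executemany(
    "INSERT INTO DerpTable VALUES (?, ?)",
    [(1, "Derp"), (1, "Herp"), (2, "Woot"), (2, "Wall")],
)

# Odd row numbers within an ID become 1, even row numbers become -1.
alternating = conn.execute("""
    SELECT ID, Purpose,
           CASE WHEN ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Purpose) % 2 = 1
                THEN 1 ELSE -1 END AS Val
    FROM DerpTable
    ORDER BY ID, Purpose
""").fetchall()

print(alternating)
```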
You can use first_value() in SQL Server 2012. I'm not sure what the WHERE condition is in your query, but the following should return your desired results:
SELECT ID,
       FIRST_VALUE(Purpose) OVER (PARTITION BY ID ORDER BY DATE) as Purpose,
       DATE,
       -- odd row numbers give 1, even row numbers give -1
       2 * (ROW_NUMBER() OVER (PARTITION BY ID ORDER BY DATE) % 2) - 1 as Val
FROM DERPTABLE
Why not add an incremental column, update the table using modulo to determine if it's even or odd, then drop the column?

Missing gaps in recurring series within a group

We have a table with following data
Id,ItemId,SeqNumber,DateTimeTrx
1,100,254,2011-12-01 09:00:00
2,100,1,2011-12-01 09:10:00
3,200,7,2011-12-02 11:00:00
4,200,5,2011-12-02 10:00:00
5,100,255,2011-12-01 09:05:00
6,200,3,2011-12-02 09:00:00
7,300,0,2011-12-03 10:00:00
8,300,255,2011-12-03 11:00:00
9,300,1,2011-12-03 10:30:00
Id is an identity column.
The sequence for an ItemId starts from 0 and goes till 255 and then resets to 0. All this information is stored in a table called Item. The order of sequence number is determined by the DateTimeTrx but such data can enter any time into the system. The expected output is as shown below-
ItemId,PrevorNext,SeqNumber,DateTimeTrx,MissingNumber
100,Previous,255,2011-12-01 09:05:00,0
100,Next,1,2011-12-01 09:10:00,0
200,Previous,3,2011-12-02 09:00:00,4
200,Next,5,2011-12-02 10:00:00,4
200,Previous,5,2011-12-02 10:00:00,6
200,Next,7,2011-12-02 11:00:00,6
300,Previous,1,2011-12-03 10:30:00,2
300,Next,255,2011-12-03 16:30:00,2
We need to get those rows one before and one after the missing sequence. In the above example for ItemId 300 - the record with sequence 1 has entered first (2011-12-03 10:30:00) and then 255(2011-12-03 16:30:00), hence the missing number here is 2. So 1 is previous and 255 is next and 2 is the first missing number. Coming to ItemId 100, the record with sequence 255 has entered first (2011-12-02 09:05:00) and then 1 (2011-12-02 09:10:00), hence 255 is previous and then 1, hence 0 is the first missing number.
In the expected result above, the MissingNumber column shows the first occurring missing number, just to illustrate the example.
We will not have a case where a complete series resets at one time, i.e. it can be either a series rundown from 255 to 0, as for ItemId 100, or 0 to 255, as for ItemId 300. Hence we need to identify missing sequence numbers both in ascending order (0,1,...255) and in descending order (254,254,0,2) etc.
How can we accomplish this in T-SQL?
Could work like this:
;WITH b AS (
SELECT *
,row_number() OVER (ORDER BY ItemId, DateTimeTrx, SeqNumber) AS rn
FROM tbl
), x AS (
SELECT
b.Id
,b.ItemId AS prev_Itm
,b.SeqNumber AS prev_Seq
,c.ItemId AS next_Itm
,c.SeqNumber AS next_Seq
FROM b
JOIN b c ON c.rn = b.rn + 1 -- next row
WHERE c.ItemId = b.ItemId -- only with same ItemId
AND c.SeqNumber <> (b.SeqNumber + 1)%256 -- Seq cycles modulo 256
)
SELECT Id, prev_Itm, 'Previous' AS PrevNext, prev_Seq
FROM x
UNION ALL
SELECT Id, next_Itm ,'Next', next_Seq
FROM x
ORDER BY Id, PrevNext DESC
Produces exactly the requested result.
See a complete working demo on data.SE.
This solution takes gaps in the Id column into consideration, as there is no mention of a gapless sequence of Ids in the question.
Edit2: Answer to updated question:
I updated the CTE in the query above to match your latest version, or so I think.
Use the columns that define the sequence of rows. Add as many columns to your ORDER BY clause as necessary to break ties.
The explanation in your latest update is not entirely clear to me, but I think you only need to squeeze in DateTimeTrx to achieve what you want. I additionally have SeqNumber in the ORDER BY to break ties left by identical DateTimeTrx values. I edited the query above.
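For experimenting with the approach, here is a cut-down sketch that runs the same row_number() self-join against an in-memory SQLite database (3.25 or newer) instead of SQL Server. It uses the question's sample rows and the table name tbl from the query above. Returning one row per gap with a first_missing column, computed as (previous SeqNumber + 1) % 256, is my own simplification and not part of the original query.

```python
import sqlite3

# Sample rows from the question: Id, ItemId, SeqNumber, DateTimeTrx.
rows = [
    (1, 100, 254, "2011-12-01 09:00:00"),
    (2, 100, 1, "2011-12-01 09:10:00"),
    (3, 200, 7, "2011-12-02 11:00:00"),
    (4, 200, 5, "2011-12-02 10:00:00"),
    (5, 100, 255, "2011-12-01 09:05:00"),
    (6, 200, 3, "2011-12-02 09:00:00"),
    (7, 300, 0, "2011-12-03 10:00:00"),
    (8, 300, 255, "2011-12-03 11:00:00"),
    (9, 300, 1, "2011-12-03 10:30:00"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (Id INTEGER, ItemId INTEGER, SeqNumber INTEGER, DateTimeTrx TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?)", rows)

# Number the rows in sequence order, join each row to its successor, and keep
# the pairs of the same ItemId where the successor is not "previous + 1 modulo 256".
gaps = conn.execute("""
    WITH b AS (
        SELECT *,
               ROW_NUMBER() OVER (ORDER BY ItemId, DateTimeTrx, SeqNumber) AS rn
        FROM tbl
    )
    SELECT p.ItemId,
           p.SeqNumber AS prev_seq,
           n.SeqNumber AS next_seq,
           (p.SeqNumber + 1) % 256 AS first_missing
    FROM b AS p
    JOIN b AS n ON n.rn = p.rn + 1
    WHERE n.ItemId = p.ItemId
      AND n.SeqNumber <> (p.SeqNumber + 1) % 256
    ORDER BY p.rn
""").fetchall()

for gap in gaps:
    print(gap)
```

Each returned row corresponds to one Previous/Next pair in the expected output.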