How to compare the current row with all the others in PostgreSQL? - sql

I have a table like this
| id | state | updatedate |
|:--------|:---------------|:------------|
| 1 | state_review | 1668603529 |
| 1 | state_review | 1668601821 |
| 1 | state_review_2 | 1668601821 |
| 2 | state_review | 1668601709 |
| 2 | state_review | 1668600822 |
| 2 | state_review_2 | 1668600747 |
| 3 | state_review | 1668559849 |
| 3 | state_review_2 | 1668539849 |
| 3 | state_review | 1668529849 |
| 3 | state_review_2 | 1661599849 |
| 3 | state_review | 1668599849 |
I'm trying to count the first occurrence of a state change for each id, based on two provided states: from (state_review) and to (state_review_2).
In this particular case there would be only three state changes going
from state_review -> state_review_2.
The resulting table would look like this:
| amount |
|:--------|
| 3 |
I suspect a window function might help with this, but I'm not sure how to compare the current state with all the others. States have to be ordered by id.
I was trying to use this query, but it doesn't work: instead of counting only the latest unique transitions, it counts all of them. If the first transition found for an id doesn't match the given states, the entire section for that id should be skipped.
SELECT
    COUNT(DISTINCT (
        CASE
            WHEN (
                q.state = 'state_review'
                AND 'state_review' != 'state_review_2'
            )
            THEN ID
        END
    )) AS amount
FROM (
    SELECT id, state
    FROM states_table
    WHERE updatedate >= 1668603529
      AND updatedate <= 1671599849
      AND (
          state = 'state_review'
          OR state = 'state_review_2'
      )
    ORDER BY id, updatedate DESC
) AS q

Transitions between two predefined states can be obtained with the LAG function.
Example with state_review and state_review_2:
SELECT *
FROM (
    SELECT ID, LAG(State) OVER (PARTITION BY ID ORDER BY updatedate) AS FromState, State, updatedate
    FROM States_table
) T
WHERE FromState = 'state_review' AND State = 'state_review_2'
You can do variations of the above:
To avoid double-counting when an id transitioned from state S1 to state S2 several times, use DISTINCT in the sub-query and drop updatedate, like so: SELECT DISTINCT ID, LAG(State) OVER (PARTITION BY ID ORDER BY updatedate) AS FromState, State
And of course, use SELECT COUNT(*) instead of SELECT * if all you want is the count.
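To see the difference between counting every transition and counting each id once, here is a minimal sketch that runs the LAG query through Python's built-in sqlite3 module (it needs an SQLite build with window functions, 3.25 or later). The rows are made-up illustration data, not the table from the question, and the from_state alias is my own naming.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE states_table (id INTEGER, state TEXT, updatedate INTEGER)")
# Hypothetical rows: id 1 makes the transition twice, id 2 never, id 3 once.
conn.executemany(
    "INSERT INTO states_table VALUES (?, ?, ?)",
    [
        (1, "state_review", 10), (1, "state_review_2", 20),
        (1, "state_review", 30), (1, "state_review_2", 40),
        (2, "state_review_2", 10), (2, "state_review", 20),
        (3, "state_review", 10), (3, "state_review", 20),
        (3, "state_review_2", 30),
    ],
)

# Pair each row with the previous state of the same id, then keep
# only the rows where the pair matches the requested from/to states.
transitions_sql = """
SELECT id, from_state, state
FROM (
    SELECT id, state,
           LAG(state) OVER (PARTITION BY id ORDER BY updatedate) AS from_state
    FROM states_table
) AS t
WHERE from_state = ? AND state = ?
"""
params = ("state_review", "state_review_2")

# Every matching transition, including repeats within one id.
all_transitions = conn.execute(
    f"SELECT COUNT(*) FROM ({transitions_sql})", params
).fetchone()[0]

# Each id counted at most once.
ids_with_transition = conn.execute(
    f"SELECT COUNT(DISTINCT id) FROM ({transitions_sql})", params
).fetchone()[0]

print(all_transitions, ids_with_transition)
```

With this data the first count includes both transitions of id 1, while the second counts id 1 only once.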

Related

Get some values from the table by selecting

I have a table:
| id | Number | Address |
|----|--------|---------|
| 1 | 0 | NULL |
| 1 | 1 | NULL |
| 1 | 2 | 50 |
| 1 | 3 | NULL |
| 2 | 0 | 10 |
| 3 | 1 | 30 |
| 3 | 2 | 20 |
| 3 | 3 | 20 |
| 4 | 0 | 75 |
| 4 | 1 | 22 |
| 4 | 2 | 30 |
| 5 | 0 | NULL |
I need to get: the NUMBER of the last ADDRESS change for each ID.
I wrote this select:
select dh.id, dh.number from table dh where dh =
(select max(min(t.history)) from table t where t.id = dh.id group by t.address)
But this select does not correctly handle the case where the address first changed and then changed back to a previous value. For example, for id=1 the group by returns:
| Number |
| -------- |
| NULL |
| 50 |
I have been thinking about this select for several days, and I will be happy to receive any help.
You can do this using row_number() -- twice:
select t.id, min(number)
from (select t.*,
row_number() over (partition by id order by number desc) as seqnum1,
row_number() over (partition by id, address order by number desc) as seqnum2
from t
) t
where seqnum1 = seqnum2
group by id;
What this does is enumerate the rows by number in descending order:
Once per id.
Once per id and address.
These two values are the same only for the rows in the final run of the most recent address, that is, rows where every later row for the id has the same address. Aggregation then pulls back the earliest row in this group.
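The double row_number() trick can be checked against the sample table from the question. The following is a sketch using Python's built-in sqlite3 module (SQLite 3.25 or later for window functions); the table name t follows the answer, and the inner alias x is my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, number INTEGER, address INTEGER)")
# Sample rows copied from the question; None stands for NULL.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, 0, None), (1, 1, None), (1, 2, 50), (1, 3, None),
        (2, 0, 10),
        (3, 1, 30), (3, 2, 20), (3, 3, 20),
        (4, 0, 75), (4, 1, 22), (4, 2, 30),
        (5, 0, None),
    ],
)

last_change = conn.execute(
    """
    SELECT id, MIN(number)
    FROM (
        SELECT t.*,
               -- position counted from the newest row of the id
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY number DESC) AS seqnum1,
               -- position counted from the newest row with the same address
               ROW_NUMBER() OVER (PARTITION BY id, address ORDER BY number DESC) AS seqnum2
        FROM t
    ) AS x
    WHERE seqnum1 = seqnum2   -- only the trailing run of the latest address
    GROUP BY id
    ORDER BY id
    """
).fetchall()

print(last_change)
```

For id=1 only the row with number 3 survives the filter, because the earlier NULL rows are separated from it by the row with address 50. For id=3 the rows with numbers 2 and 3 both survive, and MIN picks 2.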
I answered my own question. If anyone needs it, here is my solution:
select * from table dh1 where dh1.number = (
select max(x.number)
from (
select
dh2.id, dh2.number, dh2.address, lag(dh2.address) over(order by dh2.number asc) as prev
from table dh2 where dh1.id=dh2.id
) x
where NVL(x.address, 0) <> NVL(x.prev, 0)
);

Oracle SQL: Counting how often an attribute occurs for a given entry and choosing the attribute with the maximum number of occurs

I have a table that has a number column and an attribute column like this:
1.
+-----+-----+
| num | att |
+-----+-----+
| 1   | a   |
| 1   | b   |
| 1   | a   |
| 2   | a   |
| 2   | b   |
| 2   | b   |
+-----+-----+
I want to make the number unique, and the attribute to be whichever attribute occurred most often for that number, like this (this is the end product I'm interested in):
2.
+-----+-----+
| num | att |
+-----+-----+
| 1   | a   |
| 2   | b   |
+-----+-----+
I have been working on this for a while and managed to write myself a query that looks up how many times an attribute occurs for a given number like this:
3.
+-----+-----+-------+
| num | att | count |
+-----+-----+-------+
| 1   | a   | 2     |
| 1   | b   | 1     |
| 2   | a   | 1     |
| 2   | b   | 2     |
+-----+-----+-------+
But I can't think of a way to select only those rows from the above table where the count is the highest (for each number, of course).
So basically what I am asking is: given table 3, how do I select only the rows with the highest count for each number? (Of course, an answer describing a way to get from table 1 to table 2 directly also works.)
You can use aggregation and window functions:
select num, att
from (
select num, att, row_number() over(partition by num order by count(*) desc, att) rn
from mytable
group by num, att
) t
where rn = 1
For each num, this brings the most frequent att; if there are ties, the smaller att is retained.
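As a quick check of the ranking idea, here is a sketch that runs it on table 1 from the question with Python's built-in sqlite3 module (SQLite 3.25 or later). It is a slight restructuring rather than the exact query above: the counts are computed first in a CTE, which I named counts, and then ranked, which avoids relying on an aggregate inside the window ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (num INTEGER, att TEXT)")
# Rows of table 1 from the question.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "a"), (2, "a"), (2, "b"), (2, "b")],
)

most_frequent = conn.execute(
    """
    WITH counts AS (
        -- one row per (num, att) with its number of occurrences
        SELECT num, att, COUNT(*) AS cnt
        FROM mytable
        GROUP BY num, att
    )
    SELECT num, att
    FROM (
        SELECT num, att,
               -- highest count first; ties broken by the smaller att
               ROW_NUMBER() OVER (PARTITION BY num ORDER BY cnt DESC, att) AS rn
        FROM counts
    ) AS ranked
    WHERE rn = 1
    ORDER BY num
    """
).fetchall()

print(most_frequent)
```

The result matches table 2 from the question: a for num 1 and b for num 2.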
Oracle has an aggregation function that does this, stats_mode():
select num, stats_mode(att)
from t
group by num;
In statistics, the most common value is called the mode -- hence the name of the function.
Here is a db<>fiddle.
You can use group by and count as below
select id, col, count(col) as count
from
df_b_sql
group by id, col

SQL group by but order is important

is there any option to group by items but order of grouping is important?
Let's assume I have table with hardware and it's assigned to some users. And this hardware has some states like broken, ok, service. I want to group this table to have information, how long user had this item, but state is not important.
What I have:
+----+-------+--------+------------+------------+
| id | owner | state | from | to |
+----+-------+--------+------------+------------+
| 1 | ow1 | ok | 01.02.2019 | 04.06.2019 |
| 2 | ow1 | broken | 04.06.2019 | 12.06.2019 |
| 3 | srvc | fixing | 12.06.2019 | 17.06.2019 |
| 4 | ow1 | ok | 17.06.2019 | null | -- null - still has
+----+-------+--------+------------+------------+
But I want to have:
+-------+------------+------------+
| owner | from | to |
+-------+------------+------------+
| ow1 | 01.02.2019 | 12.06.2019 | -- here we have min and max dates before state changed
| srvc | 12.06.2019 | 17.06.2019 |
| ow1 | 17.06.2019 | null | -- null - still has
+-------+------------+------------+
How to write query to achieve this result?
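This looks like a gaps and islands problem. One solution is as follows:
1. Mark rows where the owner changes (differs from the previous row) with the value 1, and all other rows with 0.
2. Take a running sum of these marks, so that each 1 and the 0s that follow it share the same group number.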
I usually do this:
WITH cte1 AS (
SELECT *
, CASE WHEN owner = LAG(owner) OVER (PARTITION BY hardware_id ORDER BY [from]) THEN 0 ELSE 1 END AS chg
FROM t
), cte2 AS (
SELECT *
, SUM(chg) OVER (PARTITION BY hardware_id ORDER BY [from]) AS grp
FROM cte1
)
SELECT owner
, hardware_id
, grp
, MIN([from])
, MAX([to])
FROM cte2
GROUP BY owner, hardware_id, grp
I have assumed that you want separate results for every piece of hardware; remove the hardware_id column if that is not the case.
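The mark-and-sum steps can be tried on the rows from the question. This is a sketch using Python's built-in sqlite3 module (SQLite 3.25 or later), under a few assumptions of mine: the table is called hardware_log, every row belongs to hardware_id 1, the from/to columns are renamed from_date/to_date to avoid reserved words, and dates are stored as ISO strings so they sort correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hardware_log "
    "(id INTEGER, hardware_id INTEGER, owner TEXT, state TEXT, "
    "from_date TEXT, to_date TEXT)"
)
# Rows from the question, with an assumed hardware_id of 1 and ISO dates.
conn.executemany(
    "INSERT INTO hardware_log VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 1, "ow1", "ok", "2019-02-01", "2019-06-04"),
        (2, 1, "ow1", "broken", "2019-06-04", "2019-06-12"),
        (3, 1, "srvc", "fixing", "2019-06-12", "2019-06-17"),
        (4, 1, "ow1", "ok", "2019-06-17", None),  # None: still assigned
    ],
)

periods = conn.execute(
    """
    WITH cte1 AS (
        -- chg = 1 on the first row and wherever the owner differs from the previous row
        SELECT *,
               CASE WHEN owner = LAG(owner) OVER (PARTITION BY hardware_id ORDER BY from_date)
                    THEN 0 ELSE 1 END AS chg
        FROM hardware_log
    ), cte2 AS (
        -- running sum of chg gives one group number per island
        SELECT *,
               SUM(chg) OVER (PARTITION BY hardware_id ORDER BY from_date) AS grp
        FROM cte1
    )
    SELECT owner, MIN(from_date), MAX(to_date)
    FROM cte2
    GROUP BY owner, hardware_id, grp
    ORDER BY hardware_id, grp
    """
).fetchall()

print(periods)
```

One caveat: MAX ignores NULLs, so the open-ended period comes back as NULL here only because its island consists of a single row. An island that mixes closed rows with a still-open row would need extra handling.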
Demo on db<>fiddle
Try this below option with union all.
SELECT owner,from,to
FROM your_table
WHERE to IS NULL
UNION ALL
SELECT owner,MIN(from),MAX(to)
FROM your_table
WHERE to IS NOT NULL
GROUP BY owner

SQL: Select single item per name with multiple criteria

I'm trying to select a single item per value in a "Name" column according to several criteria.
The criteria I want to use look like this:
Only include results where IsEnabled = 1
Return the single result with the lowest priority (we're using 1 to mean "top priority")
In case of a tie, return the result with the newest Timestamp
I've seen several other questions that ask about returning the newest timestamp for a given value, and I've been able to adapt that to return the minimum value of Priority - but I can't figure out how to filter off of both Priority and Timestamp.
Here is the question that's been most helpful in getting me this far.
Sample data:
+------+------------+-----------+----------+
| Name | Timestamp | IsEnabled | Priority |
+------+------------+-----------+----------+
| A | 2018-01-01 | 1 | 1 |
| A | 2018-03-01 | 1 | 5 |
| B | 2018-01-01 | 1 | 1 |
| B | 2018-03-01 | 0 | 1 |
| C | 2018-01-01 | 1 | 1 |
| C | 2018-03-01 | 1 | 1 |
| C | 2018-05-01 | 0 | 1 |
| C | 2018-06-01 | 1 | 5 |
+------+------------+-----------+----------+
Desired output:
+------+------------+-----------+----------+
| Name | Timestamp | IsEnabled | Priority |
+------+------------+-----------+----------+
| A | 2018-01-01 | 1 | 1 |
| B | 2018-01-01 | 1 | 1 |
| C | 2018-03-01 | 1 | 1 |
+------+------------+-----------+----------+
What I've tried so far (this gets me only enabled items with lowest priority, but does not filter for the newest item in case of a tie):
SELECT DATA.Name, DATA.Timestamp, DATA.IsEnabled, DATA.Priority
From MyData AS DATA
INNER JOIN (
SELECT MIN(Priority) Priority, Name
FROM MyData
GROUP BY Name
) AS Temp ON DATA.Name = Temp.Name AND DATA.Priority = TEMP.Priority
WHERE IsEnabled=1
Here is a SQL fiddle as well.
How can I enhance this query to only return the newest result in addition to the existing filters?
Use row_number():
select d.*
from (select d.*,
row_number() over (partition by name order by priority, timestamp desc) as seqnum
from mydata d
where isenabled = 1
) d
where seqnum = 1;
The most effective way that I've found for these problems is using CTEs and ROW_NUMBER()
WITH CTE AS(
SELECT *, ROW_NUMBER() OVER( PARTITION BY Name ORDER BY Priority, TimeStamp DESC) rn
FROM MyData
WHERE IsEnabled = 1
)
SELECT Name, Timestamp, IsEnabled, Priority
From CTE
WHERE rn = 1;
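To confirm that ordering by Priority and then by Timestamp descending gives the desired output, here is a sketch that runs the ROW_NUMBER() approach on the sample data from the question with Python's built-in sqlite3 module (SQLite 3.25 or later). Storing the dates as ISO text is my assumption so that they sort chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyData (Name TEXT, Timestamp TEXT, IsEnabled INTEGER, Priority INTEGER)"
)
# Sample data copied from the question.
conn.executemany(
    "INSERT INTO MyData VALUES (?, ?, ?, ?)",
    [
        ("A", "2018-01-01", 1, 1), ("A", "2018-03-01", 1, 5),
        ("B", "2018-01-01", 1, 1), ("B", "2018-03-01", 0, 1),
        ("C", "2018-01-01", 1, 1), ("C", "2018-03-01", 1, 1),
        ("C", "2018-05-01", 0, 1), ("C", "2018-06-01", 1, 5),
    ],
)

best_rows = conn.execute(
    """
    WITH CTE AS (
        SELECT *,
               -- lowest Priority first; among ties the newest Timestamp wins
               ROW_NUMBER() OVER (
                   PARTITION BY Name
                   ORDER BY Priority, Timestamp DESC
               ) AS rn
        FROM MyData
        WHERE IsEnabled = 1      -- disabled rows never compete
    )
    SELECT Name, Timestamp, IsEnabled, Priority
    FROM CTE
    WHERE rn = 1
    ORDER BY Name
    """
).fetchall()

print(best_rows)
```

For C, the disabled 2018-05-01 row is filtered out before ranking, so the newest enabled row with priority 1 is 2018-03-01.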

Update an ordinal column based on the alphabetic ordering of another column

I have a table representing a system of folders and sub-folders with an ordinal m_order column.
Sometimes sub-folders are sorted alphanumerically, others are sorted by date or by importance.
I recently had to delete some sub-folders of a particular parent folder and add a few new ones. I also had to switch the ordering scheme to alphanumeric. This needed to be reflected in the m_order column.
Here's an example of the table:
+-----+-----------+-----------+------------+
| ID | parent | title | m_order |
+-----+-----------+-----------+------------+
| 100 | 1 | docs | 3 |
| 101 | 1 | reports | 2 |
| 102 | 1 | travel | 1 |
| 103 | 1 | weekly | 4 |
| 104 | 1 | briefings | 5 |
| ... | ... | ... | ... |
+-----+-----------+-----------+------------+
And here is what I want:
+-----+-----------+-----------+------------+
| ID | parent | title | m_order |
+-----+-----------+-----------+------------+
| 100 | 1 | docs | 3 |
| 101 | 1 | reports | 4 |
| 102 | 1 | travel | 5 |
| 200 | 1 | contacts | 2 |
| 201 | 1 | admin | 1 |
| ... | ... | ... | ... |
+-----+-----------+-----------+------------+
I would do this with a simple update:
with toupdate as (
select m.*, row_number() over (partition by parent order by title) as seqnum
from menu m
)
update toupdate
set m_order = toupdate.seqnum;
This restarts the ordering for each parent. If you have a particular parent in mind, use a WHERE clause:
where parent = @parentID and m_order <> toupdate.seqnum
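The updatable CTE above is SQL Server syntax. To illustrate the same renumbering idea in a form that can be run anywhere, here is a sketch using Python's built-in sqlite3 module (SQLite 3.25 or later) with a correlated subquery instead. The starting m_order values and the two rows under parent 2 are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE menu (id INTEGER PRIMARY KEY, parent INTEGER, title TEXT, m_order INTEGER)"
)
# Folders from the desired result in the question, with arbitrary starting
# m_order values, plus two hypothetical folders under a second parent.
conn.executemany(
    "INSERT INTO menu VALUES (?, ?, ?, ?)",
    [
        (100, 1, "docs", 3), (101, 1, "reports", 2), (102, 1, "travel", 1),
        (200, 1, "contacts", 5), (201, 1, "admin", 4),
        (300, 2, "zeta", 1), (301, 2, "alpha", 2),
    ],
)

# Number the folders alphabetically within each parent, then copy that
# number into m_order by matching on the primary key.
conn.execute(
    """
    UPDATE menu
    SET m_order = (
        SELECT r.seqnum
        FROM (
            SELECT m.id,
                   ROW_NUMBER() OVER (PARTITION BY m.parent ORDER BY m.title) AS seqnum
            FROM menu AS m
        ) AS r
        WHERE r.id = menu.id
    )
    """
)

reordered = conn.execute("SELECT id, title, m_order FROM menu ORDER BY id").fetchall()
print(reordered)
```

The numbering restarts at 1 for parent 2, and the rows under parent 1 end up with the m_order values shown in the desired table.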
After deleting the old folders and inserting the new records, I accomplished the reordering by using MERGE INTO and ROW_NUMBER():
DECLARE @parentID INT
...
MERGE INTO menu
USING (
    SELECT ROW_NUMBER() OVER (ORDER BY title) AS rowNumber, ID
    FROM menu
    WHERE parent = @parentID
) AS reordered
ON menu.ID = reordered.ID
WHEN MATCHED THEN
    UPDATE
    SET menu.m_order = reordered.rowNumber;
I needed both t-sql and Oracle versions of this. To save future readers the struggle with the subtle differences in the ORA UPDATE syntax, here it is, shamelessly ripped off Gordon Linoff's answer:
update (
with toupdate as (
select
m.primarykey,
row_number() over(partition by parent order by title) as seqnum
from menu m
)
select m.primarykey, t.seqnum from menu m inner join toupdate t on t.primarykey=m.primarykey
)
set m_order = t.seqnum;