I need to update the rows that share the same condition (status = 1),
except for the latest row in each group.
This is the table design.
--------------------------------------------------
|idx | status | var1 | date
--------------------------------------------------
| 2 | 1 | cat | 2018-06-17 15:41:32.110
| 3 | 1 | dog | 2018-06-17 11:41:32.110
| 2 | 1 | lamb | 2018-06-17 11:41:32.110
| 2 | 1 | pc | 2018-06-17 09:41:32.110
| 3 | 1 | doll | 2018-06-17 09:41:32.110
What I want is to find all rows
that share the same idx and have status = 1, and
set their status to 0, except for the most recent row in each group.
In this case, there are 3 rows which have idx of 2 and status = 1,
and 2 rows which have idx of 3 and status = 1.
After the query, the table should look like this
--------------------------------------------------
|idx | status | var1 | date
--------------------------------------------------
| 2 | 1 | cat | 2018-06-17 15:41:32.110
| 3 | 1 | dog | 2018-06-17 11:41:32.110
| 2 | 0 | lamb | 2018-06-17 11:41:32.110
| 2 | 0 | pc | 2018-06-17 09:41:32.110
| 3 | 0 | doll | 2018-06-17 09:41:32.110
I have no idea how to do this, so I tried to at least display
the groups that contain more than one matching row, and came up with this query:
select Idx, status, COUNT(Idx) as count from table
group by Idx, status
having COUNT(Idx) > 1 and status = 1
order by Idx
This shows how many rows share the same condition,
but I would also like each row to display var1 and date,
and I don't know how to do that.
Since I am working in .NET, I could collect the idx values
into a list and run an update for each idx in a for loop,
but I would love to learn how to solve this in SQL.
We can try updating with a CTE:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY idx ORDER BY date DESC) rn
FROM yourTable
)
UPDATE cte
SET status = 0
WHERE rn > 1 AND status = 1;
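As a quick check, here is a minimal sketch that replays this idea on the sample rows from the question, using Python's built-in sqlite3 module. SQLite cannot update through a CTE, so the sketch selects the rows to change by rowid instead; the table name tbl and the rowid-based rewrite are assumptions of this sketch, not part of the answer above. It needs SQLite 3.25 or later for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (idx INTEGER, status INTEGER, var1 TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [
        (2, 1, "cat", "2018-06-17 15:41:32.110"),
        (3, 1, "dog", "2018-06-17 11:41:32.110"),
        (2, 1, "lamb", "2018-06-17 11:41:32.110"),
        (2, 1, "pc", "2018-06-17 09:41:32.110"),
        (3, 1, "doll", "2018-06-17 09:41:32.110"),
    ],
)

# Number the rows of each idx from newest to oldest, then reset every
# status = 1 row that is not the newest one (rn > 1).
conn.execute(
    """
    UPDATE tbl
    SET status = 0
    WHERE status = 1
      AND rowid IN (
          SELECT rid
          FROM (
              SELECT rowid AS rid,
                     ROW_NUMBER() OVER (PARTITION BY idx ORDER BY date DESC) AS rn
              FROM tbl
          )
          WHERE rn > 1
      )
    """
)

rows = conn.execute("SELECT idx, status, var1 FROM tbl ORDER BY rowid").fetchall()
for row in rows:
    print(row)
```

Only 'cat' and 'dog' should keep status = 1, which matches the table the question asks for.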
You can also achieve it without the CTE:
UPDATE t SET status = 0 FROM tbl t WHERE NOT EXISTS
( SELECT 1 FROM tbl GROUP BY idx HAVING MAX(date)=t.date AND idx=t.idx );
see here: http://rextester.com/BVAS22315
The difference between Tim's and my solution would be that in case of two records with the same idx having exactly the same date, Tim's command would leave only one record unchanged (status=1) while my command would keep them both unchanged.
And, using the window function ROW_NUMBER(), you can also do it like this:
UPDATE t SET status=0 FROM
(SELECT *, ROW_NUMBER() OVER (PARTITION BY idx ORDER BY date DESC) rn
FROM tbl) t
WHERE rn>1
This second version will behave exactly like Tim's solution, see here: http://rextester.com/MFRAMR93418
(Note the identical dates for 'dog' and 'lamb' and only one gets updated.)
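The tie behaviour described above can be reproduced with a small sketch using Python's sqlite3 module. The three sample rows (two of them sharing the newest date), the helper name rows_still_active, and the SQLite-flavoured rewrites of both updates are assumptions made for illustration; the MAX(date) variant is written as a correlated comparison that should be equivalent to the NOT EXISTS form.

```python
import sqlite3

ROWS = [
    (2, 1, "cat", "2018-06-17 15:41:32.110"),
    (2, 1, "lamb", "2018-06-17 15:41:32.110"),  # same date as 'cat': a tie
    (2, 1, "pc", "2018-06-17 09:41:32.110"),
]

# ROW_NUMBER() gives the tied rows different numbers, so only one keeps rn = 1.
ROW_NUMBER_UPDATE = """
    UPDATE tbl SET status = 0
    WHERE rowid IN (
        SELECT rid FROM (
            SELECT rowid AS rid,
                   ROW_NUMBER() OVER (PARTITION BY idx ORDER BY date DESC) AS rn
            FROM tbl
        ) WHERE rn > 1
    )
"""

# Comparing against MAX(date) leaves every row that carries the newest date alone.
MAX_DATE_UPDATE = """
    UPDATE tbl SET status = 0
    WHERE date < (SELECT MAX(t2.date) FROM tbl AS t2 WHERE t2.idx = tbl.idx)
"""


def rows_still_active(update_sql):
    """Load ROWS into a fresh in-memory table, run the update, count status = 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (idx INTEGER, status INTEGER, var1 TEXT, date TEXT)")
    conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?)", ROWS)
    conn.execute(update_sql)
    (count,) = conn.execute("SELECT COUNT(*) FROM tbl WHERE status = 1").fetchone()
    conn.close()
    return count


print(rows_still_active(ROW_NUMBER_UPDATE))  # one of the tied rows survives
print(rows_still_active(MAX_DATE_UPDATE))    # both tied rows survive
```

Which of the two tied rows survives the ROW_NUMBER() version is not determined by the query; adding a second ORDER BY key would make it deterministic.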
Related
I have a table that has a number column and an attribute column like this:
1.
+-----+-----+
| num | att |
-------------
| 1 | a |
| 1 | b |
| 1 | a |
| 2 | a |
| 2 | b |
| 2 | b |
+------------
I want to make the number unique, and the attribute to be whichever attribute occurred most often for that number, like this (this is the end product I'm interested in):
2.
+-----+-----+
| num | att |
-------------
| 1 | a |
| 2 | b |
+------------
I have been working on this for a while and managed to write myself a query that looks up how many times an attribute occurs for a given number like this:
3.
+-----+-----+-----+
| num | att |count|
------------------+
| 1 | a | 2 |
| 1 | b | 1 |
| 2 | a | 1 |
| 2 | b | 2 |
+-----------------+
But I can't think of a way to only select those rows from the above table where the count is the highest (for each number of course).
So basically what I am asking is: given table 3, how do I select only the rows with the highest count for each number? (Of course, an answer describing a way to get from table 1 to table 2 directly also works :) )
You can use aggregation and window functions:
select num, att
from (
select num, att, row_number() over(partition by num order by count(*) desc, att) rn
from mytable
group by num, att
) t
where rn = 1
For each num, this brings the most frequent att; if there are ties, the smaller att is retained.
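For readers who want to try it, here is a minimal sketch of the same idea run through Python's sqlite3 module on the data from table 1. It splits the work into two visible steps (count first, then rank), which is a variation assumed for readability rather than the exact query above; the column alias cnt is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (num INTEGER, att TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "a"), (2, "a"), (2, "b"), (2, "b")],
)

# Step 1 (innermost query): count each (num, att) pair, i.e. build table 3.
# Step 2 (middle query): rank the pairs within each num by count, ties by att.
# Step 3 (outer query): keep only the top-ranked pair per num.
query = """
    SELECT num, att
    FROM (
        SELECT num, att,
               ROW_NUMBER() OVER (PARTITION BY num ORDER BY cnt DESC, att) AS rn
        FROM (
            SELECT num, att, COUNT(*) AS cnt
            FROM mytable
            GROUP BY num, att
        )
    )
    WHERE rn = 1
    ORDER BY num
"""
result = conn.execute(query).fetchall()
print(result)
```

The result should contain one row per num, matching table 2.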
Oracle has an aggregation function that does this, stats_mode():
select num, stats_mode(att)
from t
group by num;
In statistics, the most common value is called the mode -- hence the name of the function.
Here is a db<>fiddle.
You can use group by and count as below
select id, col, count(col) as count
from
df_b_sql
group by id, col
I have following table in SQL
id,date,records
1,2019-03-28 01:22:12,5
2,2019-03-29 01:23:23,5
3,2019-03-30 01:28:54,5
4,2019-03-28 01:12:21,2
5,2019-03-12 01:08:11,1
6,2019-03-28 01:01:21,12
7,2019-03-12 01:02:11,1
What I am trying to achieve is a batch number that increments whenever the moving sum would exceed 15, at which point the moving sum resets as well. In other words, I want to group the records into batches whose running total is at most 15.
For example, once the moving sum reaches 15, the next row should start a new batch, which gives me batches of rows totalling up to 15.
So the output I am looking for is
id,date,records, moving_sum,batch_number
1,2019-03-28 01:22:12,5,5,1
2,2019-03-29 01:23:23,5,10,1
3,2019-03-30 01:28:54,5,15,1
4,2019-03-28 01:12:21,2,2,2
5,2019-03-12 01:08:11,1,3,2
6,2019-03-28 01:01:21,12,15,2
7,2019-03-12 01:02:11,1,1,3
You need a recursive query for this:
with
tab as (select t.*, row_number() over(order by id) rn from mytable t),
cte as (
select
id,
date,
records,
records moving_sum,
1 batch_number,
rn
from tab
where rn = 1
union all
select
t.id,
t.date,
t.records,
case when c.moving_sum + t.records > 15 then t.records else c.moving_sum + t.records end,
case when c.moving_sum + t.records > 15 then c.batch_number + 1 else c.batch_number end,
t.rn
from cte c
inner join tab t on t.rn = c.rn + 1
)
select id, date, records, moving_sum, batch_number from cte order by id
The syntax for recursive common table expressions slightly varies across databases, so you might need to adapt that a little depending on your database.
Also note that if ids start at 1 and always increment without gaps, you don't actually need the common table expression tab, and you can replace rn with id in the second common table expression.
Demo on DB Fiddle:
id | date | records | moving_sum | batch_number
-: | :--------- | ------: | ---------: | -----------:
1 | 2019-03-28 | 5 | 5 | 1
2 | 2019-03-29 | 5 | 10 | 1
3 | 2019-03-30 | 5 | 15 | 1
4 | 2019-03-28 | 2 | 2 | 2
5 | 2019-03-12 | 1 | 3 | 2
6 | 2019-03-28 | 12 | 15 | 2
7 | 2019-03-12 | 1 | 1 | 3
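The recursion can also be tried locally with Python's sqlite3 module. This is a sketch under the assumption stated above that ids start at 1 and have no gaps, so it joins on id directly and skips the tab expression; the WITH RECURSIVE spelling is SQLite's and may differ in other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, date TEXT, records INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, "2019-03-28 01:22:12", 5),
        (2, "2019-03-29 01:23:23", 5),
        (3, "2019-03-30 01:28:54", 5),
        (4, "2019-03-28 01:12:21", 2),
        (5, "2019-03-12 01:08:11", 1),
        (6, "2019-03-28 01:01:21", 12),
        (7, "2019-03-12 01:02:11", 1),
    ],
)

# Anchor: the first row opens batch 1 with its own record count.
# Recursive step: visit the next id; if adding its records would push the
# running sum past 15, restart the sum and move on to the next batch.
query = """
    WITH RECURSIVE cte (id, records, moving_sum, batch_number) AS (
        SELECT id, records, records, 1
        FROM mytable
        WHERE id = 1
        UNION ALL
        SELECT t.id,
               t.records,
               CASE WHEN c.moving_sum + t.records > 15
                    THEN t.records ELSE c.moving_sum + t.records END,
               CASE WHEN c.moving_sum + t.records > 15
                    THEN c.batch_number + 1 ELSE c.batch_number END
        FROM cte AS c
        JOIN mytable AS t ON t.id = c.id + 1
    )
    SELECT id, records, moving_sum, batch_number FROM cte ORDER BY id
"""
batches = conn.execute(query).fetchall()
for row in batches:
    print(row)
```

Because each step depends on the running sum of the previous row, the rows are processed one at a time; this is why a plain window function is not enough here.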
I'm trying to select a single item per value in a "Name" column according to several criteria.
The criteria I want to use look like this:
Only include results where IsEnabled = 1
Return the single result with the lowest priority (we're using 1 to mean "top priority")
In case of a tie, return the result with the newest Timestamp
I've seen several other questions that ask about returning the newest timestamp for a given value, and I've been able to adapt that to return the minimum value of Priority - but I can't figure out how to filter off of both Priority and Timestamp.
Here is the question that's been most helpful in getting me this far.
Sample data:
+------+------------+-----------+----------+
| Name | Timestamp | IsEnabled | Priority |
+------+------------+-----------+----------+
| A | 2018-01-01 | 1 | 1 |
| A | 2018-03-01 | 1 | 5 |
| B | 2018-01-01 | 1 | 1 |
| B | 2018-03-01 | 0 | 1 |
| C | 2018-01-01 | 1 | 1 |
| C | 2018-03-01 | 1 | 1 |
| C | 2018-05-01 | 0 | 1 |
| C | 2018-06-01 | 1 | 5 |
+------+------------+-----------+----------+
Desired output:
+------+------------+-----------+----------+
| Name | Timestamp | IsEnabled | Priority |
+------+------------+-----------+----------+
| A | 2018-01-01 | 1 | 1 |
| B | 2018-01-01 | 1 | 1 |
| C | 2018-03-01 | 1 | 1 |
+------+------------+-----------+----------+
What I've tried so far (this gets me only enabled items with lowest priority, but does not filter for the newest item in case of a tie):
SELECT DATA.Name, DATA.Timestamp, DATA.IsEnabled, DATA.Priority
From MyData AS DATA
INNER JOIN (
SELECT MIN(Priority) Priority, Name
FROM MyData
GROUP BY Name
) AS Temp ON DATA.Name = Temp.Name AND DATA.Priority = TEMP.Priority
WHERE IsEnabled=1
Here is a SQL fiddle as well.
How can I enhance this query to only return the newest result in addition to the existing filters?
Use row_number():
select d.*
from (select d.*,
row_number() over (partition by name order by priority, timestamp desc) as seqnum
from mydata d
where isenabled = 1
) d
where seqnum = 1;
The most effective way that I've found for these problems is using CTEs and ROW_NUMBER()
WITH CTE AS(
SELECT *, ROW_NUMBER() OVER( PARTITION BY Name ORDER BY Priority, TimeStamp DESC) rn
FROM MyData
WHERE IsEnabled = 1
)
SELECT Name, Timestamp, IsEnabled, Priority
From CTE
WHERE rn = 1;
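To see the ordering at work, here is a minimal sketch that loads the sample data from the question into an in-memory SQLite database through Python's sqlite3 module and runs the same kind of query. The trailing ORDER BY Name is an addition of this sketch so that the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyData (Name TEXT, Timestamp TEXT, IsEnabled INTEGER, Priority INTEGER)"
)
conn.executemany(
    "INSERT INTO MyData VALUES (?, ?, ?, ?)",
    [
        ("A", "2018-01-01", 1, 1),
        ("A", "2018-03-01", 1, 5),
        ("B", "2018-01-01", 1, 1),
        ("B", "2018-03-01", 0, 1),
        ("C", "2018-01-01", 1, 1),
        ("C", "2018-03-01", 1, 1),
        ("C", "2018-05-01", 0, 1),
        ("C", "2018-06-01", 1, 5),
    ],
)

# Disabled rows are filtered out before numbering. Within each Name the
# lowest Priority comes first, and ties are broken by the newest Timestamp.
query = """
    WITH cte AS (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY Name ORDER BY Priority, Timestamp DESC
               ) AS rn
        FROM MyData
        WHERE IsEnabled = 1
    )
    SELECT Name, Timestamp, IsEnabled, Priority
    FROM cte
    WHERE rn = 1
    ORDER BY Name
"""
picked = conn.execute(query).fetchall()
for row in picked:
    print(row)
```

Note that the DESC on Timestamp is what selects the 2018-03-01 row for C; with ascending order the 2018-01-01 row would win the tie instead.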
I have the following table:
+----+--------+-----+
| id | fk_did | pos |
+----+--------+-----+
This table contains hundreds of rows, each of them referencing another table with fk_did. The value in pos is currently always zero which I want to change.
Basically, for each group of fk_did, the pos-column should start at zero and be ascending. It doesn't matter how the rows are ordered.
Example output (select * from table order by fk_did, pos) that I wanna get:
+----+--------+-----+
| id | fk_did | pos |
+----+--------+-----+
| xx | 0 | 0 |
| xx | 0 | 1 |
| xx | 0 | 2 |
| xx | 1 | 0 |
| xx | 1 | 1 |
| xx | 1 | 2 |
| xx | 4 | 0 |
| xx | 8 | 0 |
| xx | 8 | 1 |
| xx | 8 | 2 |
+----+--------+-----+
There must be no two rows that have the same combination of fk_did and pos
pos must be ascending for each fk_did
If there is a row with pos > 0, there must also be a row with the same fk_did and a lower pos.
Can this be done with a single update query?
You can do this using a window function:
update the_table
set pos = t.rn - 1
from (
select id,
row_number() over (partition by fk_did) as rn
from the_table
) t
where t.id = the_table.id;
The ordering of pos will be more or less random, as there is no order by, but you said that doesn't matter.
This assumes that id is unique; if it is not, you can use the internal column ctid instead.
If id is the PK of your table, then you can use the following query to update your table:
UPDATE mytable
SET pos = t.rn
FROM (
SELECT id, fk_did, pos,
ROW_NUMBER() OVER (PARTITION BY fk_did ORDER BY id) - 1 AS rn
FROM mytable) AS t
WHERE mytable.id = t.id
The ROW_NUMBER window function, used with a PARTITION BY clause, generates sequence numbers starting from 1 for each fk_did slice; subtracting 1 makes pos start at 0.
Demo here
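The numbering can be checked with a small sketch using Python's sqlite3 module. The ten sample rows are made up for illustration, and the update is written with a correlated subquery instead of UPDATE ... FROM so that it also runs on older SQLite versions; both are assumptions of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, fk_did INTEGER, pos INTEGER DEFAULT 0)"
)
# Hypothetical data: ids 1..10 spread over the groups 0, 1, 4 and 8.
conn.executemany(
    "INSERT INTO mytable (id, fk_did) VALUES (?, ?)",
    [(1, 0), (2, 1), (3, 0), (4, 8), (5, 1), (6, 0), (7, 4), (8, 8), (9, 1), (10, 8)],
)

# For every row, look up its zero-based position inside its fk_did group.
conn.execute(
    """
    UPDATE mytable
    SET pos = (
        SELECT t.rn
        FROM (
            SELECT id,
                   ROW_NUMBER() OVER (PARTITION BY fk_did ORDER BY id) - 1 AS rn
            FROM mytable
        ) AS t
        WHERE t.id = mytable.id
    )
    """
)

positions = conn.execute(
    "SELECT fk_did, pos FROM mytable ORDER BY fk_did, pos"
).fetchall()
for row in positions:
    print(row)
```

Each group should come out numbered 0, 1, 2, ... without gaps or duplicates, which covers the three requirements listed in the question.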
I'd suggest creating a temporary table (if the id column is not unique):
create temp table tmp_table as
select id, fk_did, row_number() over (partition by fk_did) - 1 pos
from table_name
Then truncate the current table and insert the records from the temp table.
I'm not looking for the answer as much as what to search for as I think this is possible. I have a query where the result can be as such:
| ID | CODE | RANK |
I want to base the rank on the code so that I get these results:
| 1 | A | 1 |
| 1 | B | 1 |
| 2 | A | 1 |
| 2 | C | 1 |
| 3 | B | 2 |
| 3 | C | 2 |
| 4 | C | 3 |
Basically, for each group of IDs, if any of the CODEs equals a certain value, I want to adjust the rank of the whole group so that I can order by rank first and then by other columns. I'm never sure how to phrase things in SQL.
I tried
CASE WHEN CODE = 'A' THEN 1 WHEN CODE = 'B' THEN 2 ELSE 3 END rank
ORDER BY rank DESC
But I want to keep the ids together; I don't want them broken apart. If I can't solve it another way, I was thinking of giving all rows of an id the same rank, based on the highest-priority code in the group.
Is there a SQL function I should look at?
You could use the MIN() OVER() analytic function to get the minimum rank value per group, and just order by that;
WITH cte AS (
SELECT id, code,
MIN(CASE WHEN code='A' THEN 1 WHEN code='B' THEN 2 ELSE 3 END)
OVER (PARTITION BY id) rank
FROM mytable
)
SELECT * FROM cte
ORDER BY rank, id, code
An SQLfiddle to test with.
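The same query can be tried with Python's sqlite3 module; this is a minimal sketch on the rows from the question. The alias grp_rank is a hypothetical name chosen here to avoid clashing with the RANK() function name in some databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, code TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "A"), (1, "B"), (2, "A"), (2, "C"), (3, "B"), (3, "C"), (4, "C")],
)

# Every row of an id receives the best (lowest) rank found among that id's
# codes, so sorting by it keeps the rows of one id together.
query = """
    WITH cte AS (
        SELECT id, code,
               MIN(CASE WHEN code = 'A' THEN 1 WHEN code = 'B' THEN 2 ELSE 3 END)
                   OVER (PARTITION BY id) AS grp_rank
        FROM mytable
    )
    SELECT id, code, grp_rank FROM cte
    ORDER BY grp_rank, id, code
"""
ranked = conn.execute(query).fetchall()
for row in ranked:
    print(row)
```

Unlike a GROUP BY, the window form of MIN() keeps every input row and only attaches the per-group value to it.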