how to not count 0 when doing row_number - sql

I am doing a row_number() over some data.
My problem is that I do not want rows where the data is 0 to be taken into account.
Here's my data sample:
+-----------+--------+
| name | number |
+-----------+--------+
| Alex | 1 |
| Liza | 2 |
| Harry | 0 |
| Marge | 24 |
| Bla | 0 |
| Something | 234 |
+-----------+-----+
Here's what I want:
+-----------+--------+------------+
| name | number | row_number |
+-----------+--------+------------+
| Harry | 0 | 0 |
| Bla | 0 | 0 |
| Something | 234 | 1 |
| Marge | 24 | 2 |
| Liza | 2 | 3 |
| Alex | 1 | 4 |
+-----------+--------+------------+
As you can see, the third column is the row_number().
This is what I have so far:
select name, number, row_number() over (partition by name order by name,number)
from myTable
How do I get the query above to return 0 for all 0's in the source data and not count the 0's at all towards the row_number sequence?

Try this SQL Fiddle:
SELECT name, number, 0 as [row_number]
FROM myTable
WHERE number = 0
UNION ALL
SELECT name, number, row_number() OVER (ORDER BY number DESC)
FROM myTable
WHERE number <> 0

Try this:
select name, number,
(case when number = 0 then 0
else row_number() over (partition by iszero order by number desc)
end) as row_number
from (select t.*, (case when number = 0 then 0 else 1 end) as iszero
from myTable t
) t
Since you are ordering by numbers descending, the following works if no numbers are negative:
select name, number,
(case when number = 0 then 0
else row_number() over (order by number desc)
end) as row_number
from myTable t
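To check the CASE approach against the sample data, here is a minimal sketch that runs it through Python's sqlite3 module. It assumes the bundled SQLite is 3.25 or later (needed for window functions), and it partitions on the expression `number = 0` directly instead of a derived `iszero` column, which is my own shortcut rather than what the answers above wrote. The final ORDER BY is only there to make the output deterministic.

```python
import sqlite3

# In-memory copy of the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (name TEXT, number INTEGER)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [("Alex", 1), ("Liza", 2), ("Harry", 0),
     ("Marge", 24), ("Bla", 0), ("Something", 234)],
)

# Zeros get 0; all other rows are numbered inside the non-zero partition,
# so the zeros never consume a position in the sequence.
query = """
SELECT name, number,
       CASE WHEN number = 0 THEN 0
            ELSE ROW_NUMBER() OVER (PARTITION BY number = 0
                                    ORDER BY number DESC)
       END AS rn
FROM myTable
ORDER BY rn, name
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because the zeros sit in their own partition, this variant should also hold up when negative numbers are present, which the unpartitioned version does not.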

Related

Get some values from the table by selecting

I have a table:
| id | Number |Address
| -----| ------------|-----------
| 1 | 0 | NULL
| 1 | 1 | NULL
| 1 | 2 | 50
| 1 | 3 | NULL
| 2 | 0 | 10
| 3 | 1 | 30
| 3 | 2 | 20
| 3 | 3 | 20
| 4 | 0 | 75
| 4 | 1 | 22
| 4 | 2 | 30
| 5 | 0 | NULL
I need to get: the NUMBER of the last ADDRESS change for each ID.
I wrote this select:
select dh.id, dh.number from table dh where dh =
(select max(min(t.history)) from table t where t.id = dh.id group by t.address)
But this select does not correctly handle the case where the address changed and then changed back to a previous value. For example, for id=1 the group by returns:
| Number |
| -------- |
| NULL |
| 50 |
I have been thinking about this select for several days, and I will be happy to receive any help.
You can do this using row_number() -- twice:
select t.id, min(number)
from (select t.*,
row_number() over (partition by id order by number desc) as seqnum1,
row_number() over (partition by id, address order by number desc) as seqnum2
from t
) t
where seqnum1 = seqnum2
group by id;
What this does is enumerate the rows by number in descending order:
Once per id.
Once per id and address.
These two values are equal only on the trailing run of rows that carry the most recent address in the data (the newest row has the value 1 in both). The aggregation then pulls back the earliest row in this group, which is where the last change happened.
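A quick way to convince yourself is to run the double row_number() query on the sample table. The sketch below does that with Python's sqlite3 module, assuming SQLite 3.25 or later for window functions; the table name `t` follows the answer, and the alias `last_change` is my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, number INTEGER, address INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 0, None), (1, 1, None), (1, 2, 50), (1, 3, None),
     (2, 0, 10),
     (3, 1, 30), (3, 2, 20), (3, 3, 20),
     (4, 0, 75), (4, 1, 22), (4, 2, 30),
     (5, 0, None)],
)

# seqnum1 counts back from the newest row per id; seqnum2 does the same
# per (id, address). They agree only while the address has not changed yet.
query = """
SELECT id, MIN(number) AS last_change
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY id
                                ORDER BY number DESC) AS seqnum1,
             ROW_NUMBER() OVER (PARTITION BY id, address
                                ORDER BY number DESC) AS seqnum2
      FROM t) AS s
WHERE seqnum1 = seqnum2
GROUP BY id
ORDER BY id
"""
result = conn.execute(query).fetchall()
print(result)
```

For id=1 this returns number 3 (the switch from 50 back to NULL), which is the case the original max(min(...)) attempt got wrong.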
I answered my own question. In case anyone needs it, here is my solution:
select * from table dh1 where dh1.number = (
select max(x.number)
from (
select
dh2.id, dh2.number, dh2.address, lag(dh2.address) over(order by dh2.number asc) as prev
from table dh2 where dh1.id=dh2.id
) x
where NVL(x.address, 0) <> NVL(x.prev, 0)
);

Single query to split out data of one column, into two columns, from the same table based on different criteria [SQL]

I have the following data in a table. The table has multiple columns, but only the data from this single column needs to be pulled into a two-column output using a query:
+----------------+--+
| DataText | |
| 1 DEC20 DDD | |
| 1 JUL20 DDD | |
| 1 JAN21 DDD | |
| 1 JUN20 DDD500 | |
| 1 JUN20 DDD500 | |
| 1 JUN20DDDD500 | |
| 1 JUN20DDDD500 | |
| 1 JUL20 DDD800 | |
| 1 JUL20 DDD800 | |
| 1 JUL20DDDD800 | |
| 1 JUL20DDDD400 | |
| 1 JUL20DDDD400 | |
+----------------+--+
Required result: distinct values based on the first 13 characters of the data, split into two columns based on "long data", and "short data", BUT only giving the first 13 characters in output for both columns:
+-------------+-------------+
| ShortData | LongData |
| 1 DEC20 DDD | 1 JUN20 DDD |
| 1 JUL20 DDD | 1 JUN20DDDD |
| 1 JAN21 DDD | 1 JUL20 DDD |
| | 1 JUL20DDDD |
+-------------+-------------+
Something like:
Select
(Select DISTINCT LEFT(DataText,13)
From myTable)
Where LEN(DataText)=13) As ShortData
,
(Select DISTINCT LEFT(DataText,13)
From myTable)
Where LEN(DataText)>13) As LongData
I would also like to query/"scan" the table only once if possible. I can't get any of the SO examples modified to make such a query work.
This is quite ugly, but doable. As a starter, you need a column that defines the order of the rows - I assumed that you have such a column, and that is called id.
Then you can select the distinct texts, put them in separate groups depending on their length, and finally pivot:
select
max(case when grp = 0 then dataText end) shortData,
max(case when grp = 1 then dataText end) longData
from (
select
dataText,
grp,
row_number() over(partition by grp order by id) rn
from (
select
id,
case when len(dataText) <= 13 then 0 else 1 end grp,
substring(dataText, 1, 13) dataText
from (select min(id) id, dataText from mytable group by dataText) t
) t
) t
group by rn
If you are content with ordering the records by the string column itself, it is a bit simpler (and, for your sample data, it produces the same results):
select
max(case when grp = 0 then dataText end) shortData,
max(case when grp = 1 then dataText end) longData
from (
select
dataText,
grp,
row_number() over(partition by grp order by dataText) rn
from (
select distinct
case when len(dataText) <= 13 then 0 else 1 end grp,
substring(dataText, 1, 13) dataText
from mytable
) t
) t
group by rn
Demo on DB Fiddle:
shortData | longData
:---------- | :------------
1 DEC20 DDD | 1 JUL20 DDD80
1 JAN21 DDD | 1 JUL20DDDD40
1 JUL20 DDD | 1 JUL20DDDD80
null | 1 JUN20 DDD50
null | 1 JUN20DDDD50
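For anyone who wants to try the simpler variant locally, here is a rough sketch using Python's sqlite3 module (SQLite 3.25 or later assumed). Two things differ from the answer and are my own assumptions: SQLite spells the functions `length` and `substr`, and the cutoff is 11 instead of 13, because the sample strings as pasted in the question are 11 characters long in their short form. The alias `txt` is also mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (DataText TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [(s,) for s in [
        "1 DEC20 DDD", "1 JUL20 DDD", "1 JAN21 DDD",
        "1 JUN20 DDD500", "1 JUN20 DDD500",
        "1 JUN20DDDD500", "1 JUN20DDDD500",
        "1 JUL20 DDD800", "1 JUL20 DDD800",
        "1 JUL20DDDD800", "1 JUL20DDDD400", "1 JUL20DDDD400",
    ]],
)

CUTOFF = 11  # assumption: matches the pasted sample, not the 13 in the text

# Distinct prefixes are split into a short and a long group, numbered
# within each group, then pivoted side by side on that row number.
query = f"""
SELECT MAX(CASE WHEN grp = 0 THEN txt END) AS shortData,
       MAX(CASE WHEN grp = 1 THEN txt END) AS longData
FROM (SELECT txt, grp,
             ROW_NUMBER() OVER (PARTITION BY grp ORDER BY txt) AS rn
      FROM (SELECT DISTINCT
                   CASE WHEN length(DataText) <= {CUTOFF} THEN 0 ELSE 1 END AS grp,
                   substr(DataText, 1, {CUTOFF}) AS txt
            FROM mytable) AS d) AS r
GROUP BY rn
ORDER BY rn
"""
pivot = conn.execute(query).fetchall()
for row in pivot:
    print(row)
```

The rows come out ordered by the string itself, so the pairing of a short value with a long value on the same line carries no meaning beyond position.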

Partition & consecutive in SQL

fellow stackers
I have a data set like so:
+---------+------+--------+
| user_id | date | metric |
+---------+------+--------+
| 1 | 1 | 1 |
| 1 | 2 | 1 |
| 1 | 3 | 1 |
| 2 | 1 | 1 |
| 2 | 2 | 1 |
| 2 | 3 | 0 |
| 2 | 4 | 1 |
+---------+------+--------+
I am looking to flag those customers who have 3 consecutive "1"s in the metric column. I have a solution as below.
select distinct user_id
from (
select user_id
,metric +
ifnull( lag(metric, 1) OVER (PARTITION BY user_id ORDER BY date), 0 ) +
ifnull( lag(metric, 2) OVER (PARTITION BY user_id ORDER BY date), 0 )
as consecutive_3
from df
) b
where consecutive_3 = 3
While it works, it is not scalable: one can imagine what the above query would look like if I were looking for 50 consecutive 1s.
May I ask if there is a scalable solution? Any cloud SQL will do. Thank you.
If you only want such users, you can use a sum(). Assuming that metric is only 0 or 1:
select user_id,
(case when max(metric_3) = 3 then 1 else 0 end) as flag_3
from (select df.*,
sum(metric) over (partition by user_id
order by date
rows between 2 preceding and current row
) as metric_3
from df
) df
group by user_id;
By using a windowing clause, you can easily expand to as many adjacent 1s as you like.
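To illustrate how the window frame scales with the run length, here is a small sketch in Python's sqlite3 module (SQLite 3.25 or later assumed). The helper name `flag_users` and the alias `run_sum` are hypothetical, and like the answer it assumes metric is only ever 0 or 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE df (user_id INTEGER, date INTEGER, metric INTEGER)")
conn.executemany(
    "INSERT INTO df VALUES (?, ?, ?)",
    [(1, 1, 1), (1, 2, 1), (1, 3, 1),
     (2, 1, 1), (2, 2, 1), (2, 3, 0), (2, 4, 1)],
)

def flag_users(conn, n):
    """Return (user_id, flag) pairs; flag is 1 if the user has n consecutive 1s."""
    n = int(n)  # the frame size is part of the SQL text, so force an integer
    query = f"""
    SELECT user_id,
           CASE WHEN MAX(run_sum) = {n} THEN 1 ELSE 0 END AS flag
    FROM (SELECT user_id,
                 SUM(metric) OVER (PARTITION BY user_id
                                   ORDER BY date
                                   ROWS BETWEEN {n - 1} PRECEDING
                                            AND CURRENT ROW) AS run_sum
          FROM df) AS w
    GROUP BY user_id
    ORDER BY user_id
    """
    return conn.execute(query).fetchall()

print(flag_users(conn, 3))
print(flag_users(conn, 2))
```

Only the number in the frame clause changes between a run of 3 and a run of 50, so the query text stays the same size.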

Efficient ROW_NUMBER increment when column matches value

I'm trying to find an efficient way to derive the column Expected below from only Id and State. What I want is for the number Expected to increase each time State is 0 (ordered by Id).
+----+-------+----------+
| Id | State | Expected |
+----+-------+----------+
| 1 | 0 | 1 |
| 2 | 1 | 1 |
| 3 | 0 | 2 |
| 4 | 1 | 2 |
| 5 | 4 | 2 |
| 6 | 2 | 2 |
| 7 | 3 | 2 |
| 8 | 0 | 3 |
| 9 | 5 | 3 |
| 10 | 3 | 3 |
| 11 | 1 | 3 |
+----+-------+----------+
I have managed to accomplish this with the following SQL, but the execution time is very poor when the data set is large:
WITH Groups AS
(
SELECT Id, ROW_NUMBER() OVER (ORDER BY Id) AS GroupId FROM tblState WHERE State=0
)
SELECT S.Id, S.[State], S.Expected, G.GroupId FROM tblState S
OUTER APPLY (SELECT TOP 1 GroupId FROM Groups WHERE Groups.Id <= S.Id ORDER BY Id DESC) G
Is there a simpler and more efficient way to produce this result? (In SQL Server 2012 or later)
Just use a cumulative sum:
select s.*,
sum(case when state = 0 then 1 else 0 end) over (order by id) as expected
from tblState s;
Another method uses a correlated subquery (note the <=, so that a row with State 0 counts itself):
select t.*,
(select count(*)
from tblState t1
where t1.id <= t.id and t1.state = 0
) as expected
from tblState t;
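Both approaches can be compared on the sample data. The sketch below uses Python's sqlite3 module (SQLite 3.25 or later assumed for the window function) and writes the correlated subquery with `<=` so that a row whose State is 0 is included in its own count, which is what the Expected column in the question requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblState (Id INTEGER PRIMARY KEY, State INTEGER)")
states = [0, 1, 0, 1, 4, 2, 3, 0, 5, 3, 1]
conn.executemany(
    "INSERT INTO tblState VALUES (?, ?)",
    list(enumerate(states, start=1)),
)

# Running count of State = 0 rows up to and including the current Id.
window_sql = """
SELECT Id,
       SUM(CASE WHEN State = 0 THEN 1 ELSE 0 END) OVER (ORDER BY Id) AS Expected
FROM tblState
ORDER BY Id
"""

# Same result via a correlated subquery; this rescans the table per row.
subquery_sql = """
SELECT s.Id,
       (SELECT COUNT(*)
        FROM tblState t1
        WHERE t1.Id <= s.Id AND t1.State = 0) AS Expected
FROM tblState s
ORDER BY s.Id
"""

by_window = conn.execute(window_sql).fetchall()
by_subquery = conn.execute(subquery_sql).fetchall()
print([expected for _, expected in by_window])
```

The window version reads the table once in Id order, which is why it tends to be the better fit for large data sets; the subquery is shown only for comparison.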

How to sort by column and next row and order them by another column?

I have a table with four columns: ID, isError, SolidLine and HighestError.
Each row is related to another row by SolidLine column. So we have two related rows in the table.
For example, rows with ID 1 and 2 have relation by SolidLine(5).
----------------------------------------------------------------------
| ID | isError | SolidLine | HighestError
----------------------------------------------------------------------
| 1 | 0 | 5 | 1
| 2 | 0 | 5 | 1
| 3 | 0 | 8 | 1
| 4 | 0 | 8 | 1
| 5 | 1 | 10 | 50
| 6 | 0 | 10 | 1
| 7 | 1 | 4 | 80
| 8 | 0 | 4 | 1
| 9 | 1 | 7 | 80
| 10 | 0 | 7 | 1
| 11 | 0 | 3 | 1
| 12 | 0 | 3 | 1
----------------------------------------------------------------------
I would like to sort a table by the following condition:
If isError is 1, take the next row by SolidLine, then order by
HighestError
So the desired result should look like this:
----------------------------------------------------------------------
| ID | isError | SolidLine | HighestError
----------------------------------------------------------------------
| 7 | 1 | 4 | 80
| 8 | 0 | 4 | 1
| 9 | 1 | 7 | 80
| 10 | 0 | 7 | 1
| 5 | 1 | 10 | 50
| 6 | 0 | 10 | 1
| 1 | 0 | 5 | 1
| 2 | 0 | 5 | 1
| 3 | 0 | 8 | 1
| 4 | 0 | 8 | 1
| 11 | 0 | 3 | 1
| 12 | 0 | 3 | 1
----------------------------------------------------------------------
The row with ID = 7 comes first because it has the maximum HighestError among the rows where isError equals 1. The row with ID = 8 comes next because it has the same SolidLine value as the row with ID = 7.
SolidLine pairs always belong together, regardless of the isError column.
So the pair of rows tied by SolidLine should always be kept together.
I tried the following queries, but they give the wrong result:
--it breaks SolidLine ordering.
SELECT ID, isError, SolidLine, HighestError
FROM SolidThreads
ORDER BY SolidLine, isError, HighestError desc, id
and:
SELECT
ROW_NUMBER() OVER (PARTITION BY SolidLine ORDER BY isError DESC) [RowNumber],
ID, isError, SolidLine, HighestError
FROM SolidThreads
ORDER BY HighestError desc, id
What am I doing wrong? Or how can I do it?
As you describe it, you should be able to do this by...
adding a column for "This Solid Line Includes an Error Row"
adding a column for "The max error for this Solid Line"
using CASE expressions to change the sorting based on error state
http://sqlfiddle.com/#!18/84e7a/1
WITH
SolidThreadsSummary AS
(
SELECT
*,
MAX(isError ) OVER (PARTITION BY SolidLine) AS SolidLineHasError,
MAX(highestError) OVER (PARTITION BY SolidLine) AS SolidLineMaxError
FROM
SolidThreads
)
SELECT
*
FROM
SolidThreadsSummary
ORDER BY
SolidLineHasError DESC, -- Not really necessary for your data
SolidLineMaxError DESC,
CASE WHEN SolidLineHasError > 0 THEN SolidLine ELSE 1 END,
isError DESC,
id
This may be a little more robust if pairs are not always consecutive by id (for the pairs containing no error)...
http://sqlfiddle.com/#!18/84e7a/2
WITH
SolidThreadsSummary AS
(
SELECT
*,
MAX(isError ) OVER (PARTITION BY SolidLine) AS SolidLineHasError,
MAX(highestError) OVER (PARTITION BY SolidLine) AS SolidLineMaxError,
MIN(id ) OVER (PARTITION BY SolidLine) AS SolidLineMinID
FROM
SolidThreads
)
SELECT
*
FROM
SolidThreadsSummary
ORDER BY
SolidLineHasError DESC,
SolidLineMaxError DESC,
CASE WHEN SolidLineHasError > 0 THEN SolidLine ELSE 1 END,
isError DESC,
SolidLineMinID,
id
;
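The second query can be checked against the desired output with a short sketch in Python's sqlite3 module (SQLite 3.25 or later assumed; the original was written for SQL Server, so treat this as a port rather than the answerer's exact setup). The ORDER BY is copied from the query above; only the ID column is selected to keep the comparison short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SolidThreads "
    "(ID INTEGER, isError INTEGER, SolidLine INTEGER, HighestError INTEGER)"
)
conn.executemany(
    "INSERT INTO SolidThreads VALUES (?, ?, ?, ?)",
    [(1, 0, 5, 1), (2, 0, 5, 1), (3, 0, 8, 1), (4, 0, 8, 1),
     (5, 1, 10, 50), (6, 0, 10, 1), (7, 1, 4, 80), (8, 0, 4, 1),
     (9, 1, 7, 80), (10, 0, 7, 1), (11, 0, 3, 1), (12, 0, 3, 1)],
)

# Per-SolidLine summary columns drive the sort, so each pair moves as a unit.
query = """
WITH SolidThreadsSummary AS (
    SELECT *,
           MAX(isError)      OVER (PARTITION BY SolidLine) AS SolidLineHasError,
           MAX(HighestError) OVER (PARTITION BY SolidLine) AS SolidLineMaxError,
           MIN(ID)           OVER (PARTITION BY SolidLine) AS SolidLineMinID
    FROM SolidThreads
)
SELECT ID
FROM SolidThreadsSummary
ORDER BY SolidLineHasError DESC,
         SolidLineMaxError DESC,
         CASE WHEN SolidLineHasError > 0 THEN SolidLine ELSE 1 END,
         isError DESC,
         SolidLineMinID,
         ID
"""
ordered_ids = [row[0] for row in conn.execute(query)]
print(ordered_ids)
```

The printed ID order can be read off against the desired table in the question: error pairs first by their maximum HighestError, then the remaining pairs by their lowest ID.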
It seems like you want to sort keeping the SolidLines together, and ordering those groups first by HighestError then by the lowest ID in the group, then within the group show errors first. Assuming that's what you want, I would do this with a derived table:
SELECT ID, isError, SolidThreads.SolidLine, HighestError
FROM SolidThreads INNER JOIN
(SELECT SolidLine, MAX(Highesterror) as sorting_HighestError, MIN(ID) as Sorting_Id
FROM SolidThreads GROUP BY SolidLine) as Sorting_DT
ON Sorting_DT.SolidLine = SolidThreads.SolidLine
ORDER BY sorting_HighestError DESC, Sorting_Id, isError Desc, Id
If the ID is always sequential for each SolidLine pair, you can simply do this:
SELECT T.*
FROM yourTable T
JOIN (SELECT SolidLine, MAX(HighestError) MaxError
FROM yourTable
GROUP BY SolidLine) T2 ON T.SolidLine = T2.SolidLine
ORDER BY MaxError DESC, ID