SQL: Count rows where column value changed from previous row [duplicate]

Suppose I have an Oracle or PostgreSQL database.
ID         IdExample  OrderByColumn  What I want
---------- ---------- -------------- -----------
1          1          1300           1
2          1          2450           1
3          2          5000           2
4          2          4800           2
5          1          5100           3
6          1          6000           3
7          4          7000           4
8          1          8000           5
How do I count the changes in IdExample when the data is sorted by OrderByColumn?
I need a new output column like the one shown under "What I want".
Pay attention to the value 1 in IdExample: it repeats, but the counter should still increment each time it reappears after a different value.
The query should run quickly on a table with tens of thousands of records.
Thanks

You need to use the LAG and SUM analytic functions as follows:
-- Both windows are ordered by id here; order by OrderByColumn instead if that column defines the sequence.
select t.*,
       sum(case when lg is null or lg <> idexample then 1 else 0 end)
           over (order by id) as result
from (select t.*,
             lag(idexample) over (order by id) as lg
      from your_table t) t
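
As a quick check of the LAG plus running SUM idea, here is a minimal sketch that runs the same query against the sample rows from the question. It uses Python's built-in sqlite3 module with an in-memory database purely for illustration (the question is about Oracle or PostgreSQL, and SQLite needs version 3.25 or later for window functions); the function name count_changes is hypothetical.

```python
import sqlite3

# Sample rows from the question: (id, idexample, orderbycolumn).
ROWS = [
    (1, 1, 1300), (2, 1, 2450), (3, 2, 5000), (4, 2, 4800),
    (5, 1, 5100), (6, 1, 6000), (7, 4, 7000), (8, 1, 8000),
]

# LAG fetches the previous row's idexample; the running SUM adds 1
# on the first row and whenever the value differs from the previous one.
QUERY = """
SELECT id,
       SUM(CASE WHEN lg IS NULL OR lg <> idexample THEN 1 ELSE 0 END)
           OVER (ORDER BY id) AS result
FROM (SELECT t.*,
             LAG(idexample) OVER (ORDER BY id) AS lg
      FROM your_table t) AS s
ORDER BY id
"""


def count_changes(rows):
    """Return (id, result) pairs, where result grows each time idexample changes."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE your_table (id INTEGER, idexample INTEGER, orderbycolumn INTEGER)"
    )
    con.executemany("INSERT INTO your_table VALUES (?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result
```

Calling count_changes(ROWS) should give the pairs (1, 1), (2, 1), (3, 2), (4, 2), (5, 3), (6, 3), (7, 4), (8, 5), which matches the "What I want" column.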

Related

Delete duplicate rows based on two values in sql [duplicate]

I'm new to SQL and I can't work out how to delete duplicate rows. I have a table like this called 'till_total':
till_id  total
1        80
1        80
1        60
2        30
2        30
2        50
I want to delete only fully duplicated rows, so that the table ends up like this:
till_id  total
1        80
1        60
2        30
2        50
I wrote this code to try and do it
SELECT till_id, total, COUNT(*) AS CNT
FROM till_total
GROUP BY till_id, total
HAVING COUNT(*) > 1
ORDER BY till_id;
But that seems to delete all rows where the till_id is repeated. Could anyone help me with this?
Good, old ROWID approach:
Before:
SQL> select * from till_total;

   TILL_ID      TOTAL
---------- ----------
         1         80
         1         80
         1         60
         2         30
         2         30
         2         50

6 rows selected.

Delete duplicates:
SQL> delete from till_total a
  2  where a.rowid > (select min(b.rowid)
  3                   from till_total b
  4                   where b.till_id = a.till_id
  5                     and b.total = a.total
  6                  );

2 rows deleted.

After:
SQL> select * from till_total;

   TILL_ID      TOTAL
---------- ----------
         1         80
         1         60
         2         30
         2         50

SQL>
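
The same ROWID idea can be tried outside Oracle. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module, relies on SQLite's implicit rowid (which plays a role similar to Oracle's ROWID here), and the function name dedupe is hypothetical.

```python
import sqlite3

# Sample rows from the question: (till_id, total).
ROWS = [(1, 80), (1, 80), (1, 60), (2, 30), (2, 30), (2, 50)]

# Keep the row with the smallest rowid in each (till_id, total) group
# and delete every later copy.
DELETE_DUPLICATES = """
DELETE FROM till_total
WHERE till_total.rowid > (SELECT MIN(b.rowid)
                          FROM till_total b
                          WHERE b.till_id = till_total.till_id
                            AND b.total = till_total.total)
"""


def dedupe(rows):
    """Return (number of deleted rows, remaining rows in insertion order)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE till_total (till_id INTEGER, total INTEGER)")
    con.executemany("INSERT INTO till_total VALUES (?, ?)", rows)
    deleted = con.execute(DELETE_DUPLICATES).rowcount
    remaining = con.execute(
        "SELECT till_id, total FROM till_total ORDER BY rowid"
    ).fetchall()
    con.close()
    return deleted, remaining
```

With the sample data, dedupe(ROWS) should report 2 deleted rows and leave (1, 80), (1, 60), (2, 30), (2, 50).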
DELETE FROM till_total
WHERE rowid IN (
    SELECT rid
    FROM (SELECT rowid AS rid,
                 row_number() OVER (PARTITION BY till_id, total ORDER BY rowid) AS rn
          FROM till_total)
    WHERE rn > 1
);
This might work for you: it numbers the rows within each group of identical (till_id, total) values and deletes every row after the first one in its group.

Resetting a Count in SQL

I have data that looks like this:
ID  num_of_days
1   0
2   0
2   8
2   9
2   10
2   15
3   10
3   20
I want to add another column that increments in value only if the num_of_days column is divisible by 5 or the ID number increases so my end result would look like this:
ID  num_of_days  row_num
1   0            1
2   0            2
2   8            2
2   9            2
2   10           3
2   15           4
3   10           5
3   20           6
Any suggestions?
Edit #1:
num_of_days represents the number of days since the customer last saw a doctor between 1 visit and the next.
A customer can see a doctor 1 time or they can see a doctor multiple times.
If it's the first time visiting, the num_of_days = 0.
SQL tables represent unordered sets. Based on your question, I'll assume that the combination of id/num_of_days provides the ordering.
You can use a cumulative sum together with lag():
select t.*,
       sum(case when prev_id = id and num_of_days % 5 <> 0
                then 0 else 1
           end) over (order by id, num_of_days) as row_num
from (select t.*,
             lag(id) over (order by id, num_of_days) as prev_id
      from t
     ) t;
If you have a different ordering column, then just use that in the order by clauses.
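
To see how the cumulative sum behaves on the sample data, here is a small sketch that runs the query through Python's built-in sqlite3 module. It assumes, as the answer does, that (id, num_of_days) defines the row order; the function name numbered_rows is hypothetical, and SQLite 3.25 or later is needed for window functions.

```python
import sqlite3

# Sample rows from the question: (id, num_of_days).
ROWS = [(1, 0), (2, 0), (2, 8), (2, 9), (2, 10), (2, 15), (3, 10), (3, 20)]

# The counter stays flat only when the id is unchanged AND num_of_days is
# not a multiple of 5; in every other case (including the first row,
# where prev_id is NULL) it is incremented.
QUERY = """
SELECT id, num_of_days,
       SUM(CASE WHEN prev_id = id AND num_of_days % 5 <> 0 THEN 0 ELSE 1 END)
           OVER (ORDER BY id, num_of_days) AS row_num
FROM (SELECT t.*,
             LAG(id) OVER (ORDER BY id, num_of_days) AS prev_id
      FROM t) AS s
ORDER BY id, num_of_days
"""


def numbered_rows(rows):
    """Return (id, num_of_days, row_num) tuples for the given rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (id INTEGER, num_of_days INTEGER)")
    con.executemany("INSERT INTO t VALUES (?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result
```

For the sample rows the row_num column should come out as 1, 2, 2, 2, 3, 4, 5, 6, matching the desired result in the question.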

How to get average runs for each over in SQL?

The first six balls make up the first over, the next six balls the second over, and so on. How do I get the average runs for each over?
Input:
Ball no  Runs
1        4
2        6
3        3
4        2
5        6
6        1
1        2
2        4
3        6
4        3
5        1
6        1
1        2
Output should be:
Over no  avg runs
1        3.66
2        2.83
As Gordon Linoff suggested, SQL tables represent unordered sets, so you need a column in your table that defines the order. If you have such a column, you can use the query below:
SELECT Over_no, AVG(Runs) avg_runs
FROM (SELECT Ball_no, Runs,
             CEIL(ROW_NUMBER() OVER (ORDER BY ORDER_COLUMN, Ball_no) / 6) Over_no
      FROM YOUR_TABLE)
GROUP BY Over_no;
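
The bucketing step can be sketched as follows, using Python's built-in sqlite3 module on the sample data. Several things here are assumptions rather than part of the original answer: the table name deliveries, the ordering column seq (standing in for ORDER_COLUMN), the function name average_runs_per_over, and the use of integer division instead of CEIL, since CEIL is not available in every SQLite build.

```python
import sqlite3

# Runs per ball from the question, in delivery order.
RUNS = [4, 6, 3, 2, 6, 1, 2, 4, 6, 3, 1, 1, 2]

# ROW_NUMBER gives each ball a position 1..n; (position - 1) / 6 + 1
# with integer division maps positions 1-6 to over 1, 7-12 to over 2, etc.
QUERY = """
SELECT over_no,
       ROUND(AVG(runs), 2) AS avg_runs,
       COUNT(*) AS balls
FROM (SELECT runs,
             (ROW_NUMBER() OVER (ORDER BY seq) - 1) / 6 + 1 AS over_no
      FROM deliveries) AS d
GROUP BY over_no
ORDER BY over_no
"""


def average_runs_per_over(runs):
    """Return (over_no, avg_runs, balls) tuples, one per over."""
    con = sqlite3.connect(":memory:")
    # seq is a hypothetical ordering column recording the delivery order.
    con.execute("CREATE TABLE deliveries (seq INTEGER PRIMARY KEY, runs INTEGER)")
    con.executemany(
        "INSERT INTO deliveries VALUES (?, ?)", list(enumerate(runs, start=1))
    )
    result = con.execute(QUERY).fetchall()
    con.close()
    return result
```

Note that ROUND yields about 3.67 for the first over (22 / 6), whereas the question shows 3.66, which looks truncated. The trailing single ball also forms an incomplete third over; adding HAVING COUNT(*) = 6 would drop it if only full overs are wanted.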
I have managed to solve my problem with the following query:
SELECT ROWNUM OVER_NO, AVG_RUNS
FROM(
SELECT ROWNUM RN,
ROUND(AVG(RUNS)OVER(ORDER BY ROWNUM RANGE BETWEEN CURRENT ROW AND 5 FOLLOWING),2) AVG_RUNS
FROM TABLE_NAME
)
WHERE RN=1 OR RN=7;

SQL involving MAX of two colums and Group BY [duplicate]

I have a table like this:
id  group  number  year
1   1      1       2000
2   1      2       2000
3   1      1       2001
4   2      1       2000
5   2      2       2000
6   2      1       2001
7   2      2       2001
8   2      3       2001
I need to select, for each group, the row with the highest number within the latest year. So I expect the result for this example to be:
3   1      1       2001
8   2      3       2001
Any ideas?
Note: I am using Postgres.
SELECT *
FROM (
    SELECT *,
           row_number() over (partition by "group" order by "year" desc, "number" desc) AS rn
    FROM table1
) ranked
WHERE rn = 1;
demo: http://sqlfiddle.com/#!15/cd78e/2
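If you only need one row per group, you can use DISTINCT ON, which keeps the first row of each group according to the ORDER BY. If you want independent maximums for each column rather than one existing row, you could use GROUP BY instead.
SELECT DISTINCT ON ("group") *
FROM tbl
ORDER BY "group", "year" DESC, "number" DESC;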
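
The row_number() approach is portable beyond Postgres, so it can be checked with a minimal sketch using Python's built-in sqlite3 module (DISTINCT ON, by contrast, is Postgres-specific and is not used here). The function name top_row_per_group is hypothetical, and SQLite 3.25 or later is assumed for window functions.

```python
import sqlite3

# Sample rows from the question: (id, group, number, year).
ROWS = [
    (1, 1, 1, 2000), (2, 1, 2, 2000), (3, 1, 1, 2001), (4, 2, 1, 2000),
    (5, 2, 2, 2000), (6, 2, 1, 2001), (7, 2, 2, 2001), (8, 2, 3, 2001),
]

# Within each group, rank rows by latest year first, then highest number,
# and keep only the top-ranked row.
QUERY = """
SELECT id, "group", "number", "year"
FROM (SELECT *,
             ROW_NUMBER() OVER (PARTITION BY "group"
                                ORDER BY "year" DESC, "number" DESC) AS rn
      FROM table1) AS ranked
WHERE rn = 1
ORDER BY "group"
"""


def top_row_per_group(rows):
    """Return one (id, group, number, year) tuple per group."""
    con = sqlite3.connect(":memory:")
    con.execute(
        'CREATE TABLE table1 (id INTEGER, "group" INTEGER, "number" INTEGER, "year" INTEGER)'
    )
    con.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result
```

For the sample data this should return the rows with id 3 and id 8, as expected in the question.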

Inserting a new indicator column to tell if a given row maximizes another column in SQL

I currently have a table in SQL that looks like this
PRODUCT_ID_1  PRODUCT_ID_2  SCORE
1             2             10
1             3             100
1             10            3000
2             10            10
3             35            100
3             2             1001
That is, (PRODUCT_ID_1, PRODUCT_ID_2) is the primary key for this table.
What I would like to do is add a column to this table that tells whether or not the current row is the one that maximizes SCORE for its value of PRODUCT_ID_1.
In other words, what I would like to get is the following table:
PRODUCT_ID_1  PRODUCT_ID_2  SCORE  IS_MAX_SCORE_FOR_ID_1
1             2             10     0
1             3             100    0
1             10            3000   1
2             10            10     1
3             35            100    0
3             2             1001   1
I am wondering how I can compute the IS_MAX_SCORE_FOR_ID_1 column and insert it into the table without having to create a new table.
You can try it like this:
Select PRODUCT_ID_1, PRODUCT_ID_2, SCORE,
       (Case when b.SCORE =
                  (Select Max(a.SCORE) from TableName a where a.PRODUCT_ID_1 = b.PRODUCT_ID_1)
             then 1 else 0 End) as IS_MAX_SCORE_FOR_ID_1
from TableName b
You can use a window function for this:
select product_id_1,
       product_id_2,
       score,
       case
         when score = max(score) over (partition by product_id_1) then 1
         else 0
       end as is_max_score_for_id_1
from the_table
order by product_id_1;
(The above is ANSI SQL and should run on any modern DBMS)
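
As a quick way to try the window-function version, the sketch below runs it on the sample data with Python's built-in sqlite3 module (SQLite 3.25 or later). The function name flag_max_scores is hypothetical, and the ORDER BY adds score as a tiebreaker only to make the output order predictable. It computes the flag in a query; it does not store the column in the table.

```python
import sqlite3

# Sample rows from the question: (product_id_1, product_id_2, score).
ROWS = [
    (1, 2, 10), (1, 3, 100), (1, 10, 3000),
    (2, 10, 10), (3, 35, 100), (3, 2, 1001),
]

# MAX(score) OVER (PARTITION BY ...) attaches each group's maximum to every
# row of that group, so a plain comparison yields the 0/1 indicator.
QUERY = """
SELECT product_id_1,
       product_id_2,
       score,
       CASE
         WHEN score = MAX(score) OVER (PARTITION BY product_id_1) THEN 1
         ELSE 0
       END AS is_max_score_for_id_1
FROM the_table
ORDER BY product_id_1, score
"""


def flag_max_scores(rows):
    """Return rows extended with a 0/1 flag marking the per-group maximum."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE the_table (product_id_1 INTEGER, product_id_2 INTEGER, "
        "score INTEGER, PRIMARY KEY (product_id_1, product_id_2))"
    )
    con.executemany("INSERT INTO the_table VALUES (?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result
```

On the sample data the flag column should read 0, 0, 1, 1, 0, 1 in that order. If several rows tie for the maximum within one PRODUCT_ID_1, all of them are flagged with 1.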