Find subsequent occurrence of a value in a table

I have a table that looks like this:
ID  SubmittedValue  ApprovedValue
1   25.9            0
1   29              29
1   25.9            25.9
1   50              0
1   45              0
1   10              0
1   10              10
Expected result:
ID  SubsequentlyApproved(CNT)  Total_Amt_sub_aprvd
1   2                          35.9
The total is 25.9 + 10 = 35.9, because those two submitted values are repeated in a later row where they are approved.
How can I get VLOOKUP-like functionality for this scenario? I tried the following subquery:
SELECT a.id,
SUM(CASE WHEN a.ApprovedValue=0 THEN 1 ELSE 0 END) AS SUB_COUNT
FROM myTable a
join (select id, sum( case when SubmittedValue=ApprovedValue then 1 end) as check_value from myTable) b
on b.id=a.id and SUB_COUNT=check_value
but this is not giving me the expected result.

You seem to want to count rows where the values are the same and the first value appears more than once. If so, you can use window functions and aggregation:
select id, count(*), sum(ApprovedValue)
from (select t.*, count(*) over (partition by id, SubmittedValue) as cnt
from t
) t
where cnt > 1 and SubmittedValue = ApprovedValue
group by id

Without window functions, using a semi-join:
select id, count(*), sum(submittedvalue)
from test t1
where submittedvalue=approvedvalue
and exists (select 1
from test t2
where t1.id=t2.id and t1.submittedvalue=t2.submittedvalue
group by id, submittedvalue
having count(*)>1)
group by id;
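
As a quick check of the two answers above, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database and runs both queries. Running this through Python's sqlite3 module is my assumption (the question does not name a database), and the window-function query needs SQLite 3.25 or later. The table name myTable is taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (ID INTEGER, SubmittedValue REAL, ApprovedValue REAL)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?)",
    [(1, 25.9, 0), (1, 29, 29), (1, 25.9, 25.9), (1, 50, 0),
     (1, 45, 0), (1, 10, 0), (1, 10, 10)],
)

# Window-function version: count how often each submitted value occurs per ID,
# then keep approved rows whose value occurs more than once.
window_sql = """
SELECT ID, COUNT(*), SUM(ApprovedValue)
FROM (SELECT t.*, COUNT(*) OVER (PARTITION BY ID, SubmittedValue) AS cnt
      FROM myTable t) t
WHERE cnt > 1 AND SubmittedValue = ApprovedValue
GROUP BY ID
"""

# Semi-join version: the EXISTS subquery plays the role of the window count.
semi_join_sql = """
SELECT ID, COUNT(*), SUM(SubmittedValue)
FROM myTable t1
WHERE SubmittedValue = ApprovedValue
  AND EXISTS (SELECT 1
              FROM myTable t2
              WHERE t1.ID = t2.ID AND t1.SubmittedValue = t2.SubmittedValue
              GROUP BY t2.ID, t2.SubmittedValue
              HAVING COUNT(*) > 1)
GROUP BY ID
"""

window_rows = conn.execute(window_sql).fetchall()
semi_join_rows = conn.execute(semi_join_sql).fetchall()
print(window_rows)
print(semi_join_rows)
```

Both queries should return one row for ID 1 with a count of 2 and a total of about 35.9 (the values are floating point, so compare with a tolerance rather than exact equality).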

Related

How do i select all columns, plus the result of the sum

I have this select:
"Select * from table" that returns:
Id  Value
1   1
1   1
2   10
2   10
My goal is to add the sum of Value for each Id to every row, like this:
Id  Value  Sum
1   1      2
1   1      2
2   10     20
2   10     20
I have tried things like:
SELECT Id,Value, (SELECT SUM(Value) FROM Table V2 WHERE V2.Id= V.Id GROUP BY IDRNC ) FROM Table v;
But this is not grouping by Id:
Id  Value  Sum
1   1      1
1   1      1
2   10     10
2   10     10

Aggregation collapses rows, reducing the number of records in the output. In this case you want to attach the result of a computation to each of your records, which is the job of the corresponding window function.
SELECT table.*, SUM(Value) OVER(PARTITION BY Id) AS sum_
FROM table
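
To see the window-function approach on the question's data, here is a small sketch using Python's sqlite3 module. The engine choice is my assumption (SQLite 3.25 or later), and the table name readings is hypothetical, chosen because "table" is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "readings" is a hypothetical table name standing in for the asker's table.
conn.execute("CREATE TABLE readings (Id INTEGER, Value INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(1, 1), (1, 1), (2, 10), (2, 10)],
)

# SUM(...) OVER (PARTITION BY Id) keeps every row and adds the per-Id total.
rows = conn.execute(
    """
    SELECT Id, Value, SUM(Value) OVER (PARTITION BY Id) AS sum_
    FROM readings
    ORDER BY Id
    """
).fetchall()

for row in rows:
    print(row)
```

Every input row survives, and each one carries the total for its Id (2 for Id 1, 20 for Id 2), which matches the table the asker wanted.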

Your attempt looks almost correct.
Try the query below, which groups by ID instead of IDRNC.
It works for me:
SELECT Id, Value,
(SELECT SUM(Value) FROM Table V2 WHERE V2.Id= V.Id GROUP BY ID) as sum
FROM Table v;

You can do it with an inner join against a subquery grouped by id:
select t.*, sum
from _table t
inner join (
select id, sum(Value) as sum
from _table
group by id
) as s on s.id = t.id

Your select is OK if you adjust it just a little:
SELECT Id,Value, (SELECT SUM(Value) FROM Table V2 WHERE V2.Id= V.Id GROUP BY IDRNC ) FROM Table v;
GROUP BY IDRNC is a mistake and should be GROUP BY ID.
You should give the sum column an alias.
The subquery that selects the sum does not need its own table alias in order to be compared with the aliased outer query (this is not a mistake; it works either way).
Test:
WITH
a_table (ID, VALUE) AS
(
Select 1, 1 From Dual Union All
Select 1, 1 From Dual Union All
Select 2, 10 From Dual Union All
Select 2, 10 From Dual
)
SELECT ID, VALUE, (SELECT SUM(VALUE) FROM a_table WHERE ID = v.ID GROUP BY ID) "ID_SUM" FROM a_table v;
ID VALUE ID_SUM
---------- ---------- ----------
1 1 2
1 1 2
2 10 20
2 10 20

COUNT() OVER possible using DISTINCT and WINDOWING IN HIVE

I want to calculate the number of distinct port numbers that exist between the current row and the X previous rows (sliding window), where x can be any integer number.
For instance,
If the input is:
ID PORT
1 21
2 22
3 23
4 25
5 25
6 21
The output should be:
ID PORT COUNT
1 21 1
2 22 2
3 23 3
4 25 4
5 25 4
6 21 4
I am using Hive, over RapidMiner and I have tried the following:
select id, port,
count (*) over (partition by srcport order by id rows between 5 preceding and current row)
This must work for big data and when X is a large integer.
Any feedback would be appreciated.

I don't think there is an easy way. One method uses lag():
select ( (case when port_5 is not null then 1 else 0 end) +
(case when port_4 is not null and port_4 not in (port_5) then 1 else 0 end) +
(case when port_3 is not null and port_3 not in (port_5, port_4) then 1 else 0 end) +
(case when port_2 is not null and port_2 not in (port_5, port_4, port_3) then 1 else 0 end) +
(case when port_1 is not null and port_1 not in (port_5, port_4, port_3, port_2) then 1 else 0 end) +
(case when port is not null and port not in (port_5, port_4, port_3, port_2, port_1) then 1 else 0 end)
) as cumulative_distinct_count
from (select t.*,
lag(port, 5) over (partition by srcport order by id) as port_5,
lag(port, 4) over (partition by srcport order by id) as port_4,
lag(port, 3) over (partition by srcport order by id) as port_3,
lag(port, 2) over (partition by srcport order by id) as port_2,
lag(port, 1) over (partition by srcport order by id) as port_1
from t
) t
This is a complicated query, but the performance should be ok.
Note: I assume port and srcport are the same thing, but this borrows from your query. Also be aware that NOT IN evaluates to NULL when the list contains a NULL and no match, so the first rows of each partition (where some lagged values are NULL) are undercounted unless the lagged values are wrapped in coalesce() with a sentinel that is not a real port.

One way to do it is with a self join as distinct isn't supported in window functions.
select t1.id,count(distinct t2.port) as cnt
from tbl t1
join tbl t2 on t1.id-t2.id>=0 and t1.id-t2.id<=5 --change this number per requirements
group by t1.id
order by t1.id
This assumes the ids are consecutive, with no gaps.
If they are not, compute row numbers first and apply the same logic to them:
with rownums as (select id,port,row_number() over(order by id) as rnum
from tbl)
select r1.id,count(distinct r2.port)
from rownums r1
join rownums r2 on r1.rnum-r2.rnum>=0 and r1.rnum-r2.rnum<=5
group by r1.id
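
The row-number self join above can be tried on the question's sample data with the sketch below. It runs in SQLite through Python's sqlite3 module rather than Hive, which is my assumption for the sake of a runnable example (SQLite 3.25 or later), and the WINDOW constant is a hypothetical name for the asker's X.

```python
import sqlite3

WINDOW = 5  # hypothetical name for X: how many previous rows to look back

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER, port INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [(1, 21), (2, 22), (3, 23), (4, 25), (5, 25), (6, 21)],
)

# Number the rows, join each row to itself and its WINDOW predecessors,
# then count the distinct ports among the joined rows.
sql = """
WITH rownums AS (
    SELECT id, port, ROW_NUMBER() OVER (ORDER BY id) AS rnum
    FROM tbl
)
SELECT r1.id, r1.port, COUNT(DISTINCT r2.port) AS cnt
FROM rownums r1
JOIN rownums r2 ON r1.rnum - r2.rnum BETWEEN 0 AND ?
GROUP BY r1.id, r1.port
ORDER BY r1.id
"""
rows = conn.execute(sql, (WINDOW,)).fetchall()

for row in rows:
    print(row)
```

With X = 5 this reproduces the counts 1, 2, 3, 4, 4, 4 from the question. The join produces up to X + 1 rows per input row, so the cost grows with X; that trade-off matters for the "big data, large X" requirement.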

Calculation filtered sets in SQL

I have the following table containing 3 sets, and I want to calculate the number of sets in which there is no row with TaskId = 4. How can I achieve that?
SetId TaskId
1 0
1 1
1 4
2 0
2 2
2 3
3 0
3 2
3 4

Use conditional aggregation:
SELECT SetId
FROM yourTable
GROUP BY SetId
HAVING SUM(CASE WHEN TaskId = 4 THEN 1 ELSE 0 END) = 0;
The basic idea here is to scan each SetId group of records and count the number of times a TaskId value of 4 occurs. The HAVING clause retains only the groups in which the value 4 never occurs.
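
The question asks for the number of such sets, so one option is to wrap the HAVING query in an outer COUNT(*). The sketch below shows this on the sample data; using SQLite through Python's sqlite3 module is my assumption, and the table name yourTable follows the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (SetId INTEGER, TaskId INTEGER)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?)",
    [(1, 0), (1, 1), (1, 4), (2, 0), (2, 2), (2, 3), (3, 0), (3, 2), (3, 4)],
)

# Sets in which TaskId = 4 never occurs.
having_sql = """
SELECT SetId
FROM yourTable
GROUP BY SetId
HAVING SUM(CASE WHEN TaskId = 4 THEN 1 ELSE 0 END) = 0
"""
set_ids = [row[0] for row in conn.execute(having_sql)]

# Count those sets by treating the query above as a derived table.
set_count = conn.execute(f"SELECT COUNT(*) FROM ({having_sql})").fetchone()[0]

print(set_ids, set_count)
```

On the sample data only SetId 2 has no TaskId of 4, so the count is 1.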

Use a CASE expression to check whether the TaskId value is 4. And use SUM function with grouping SetId.
Query
select [SetId],
SUM(case [TaskId] when 4 then 0 else 1 end) as [sum]
from [your_table_name]
group by [SetId];

I think you are looking for something like
SELECT *
FROM mytable t1
WHERE t1.SetId NOT IN (SELECT t2.SetId FROM mytable t2 WHERE t2.TaskId = 4)
(select the full sets that have no TaskId=4)
or
SELECT distinct SetId
FROM mytable t1
WHERE t1.SetId NOT IN (SELECT t2.SetId FROM mytable t2 WHERE t2.TaskId = 4)
(select just the SetIds that have no TaskId=4)

Running total (COUNT) SQL Server

I currently have this result
ID Code
1 AAA12
2 F5
3 GOFK568
4 G77
5 JLKJ4
6 FOG0
What I want to do is create a third column that keeps a running total of the codes that are more than 4 characters long.
I have this code, which gives me the total number of codes longer than 4 characters:
SELECT * ,
SUM(CASE WHEN LENGTH(CODE) > 4 THEN 1 ELSE 0 END) AS [Count]
FROM Table1;
But this gives me this result
ID Code Count
1 AAA12 3
I am looking for a result like this
ID Code Running_Total
1 AAA12 1
2 F5 1
3 GOFK568 2
4 G77 2
5 JLKJ4 3
6 FOG0 3
I was working on something similar to this
SELECT * ,
CASE WHEN LENGTH(CODE) > 4 THEN (SUM(Code) OVER (PARTITION BY ID)) ELSE END
AS [Count]
FROM Table1;
But it still doesn't give me a running total.
I have an SQL Fiddle page
http://sqlfiddle.com/#!9/2746c/18
Any help would be great

Put the case in the sum:
SELECT Table1.* ,
SUM(case when len(Code) > 4 then 1 else 0 end) OVER (order BY ID) as counted
FROM Table1;
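
Here is a minimal sketch of the same running total on the question's data. It runs in SQLite through Python's sqlite3 module, which is my assumption for a runnable example (SQLite 3.25 or later); SQLite spells the function LENGTH() where SQL Server uses LEN().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (ID INTEGER, Code TEXT)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [(1, "AAA12"), (2, "F5"), (3, "GOFK568"),
     (4, "G77"), (5, "JLKJ4"), (6, "FOG0")],
)

# The CASE yields 1 for codes longer than 4 characters; SUM ... OVER (ORDER BY ID)
# accumulates those 1s from the first row up to the current row.
rows = conn.execute(
    """
    SELECT ID, Code,
           SUM(CASE WHEN LENGTH(Code) > 4 THEN 1 ELSE 0 END)
               OVER (ORDER BY ID) AS Running_Total
    FROM Table1
    ORDER BY ID
    """
).fetchall()

for row in rows:
    print(row)
```

The ORDER BY inside OVER is what turns the sum into a running total; with PARTITION BY ID, as in the asker's attempt, every row would be its own partition.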

In SQL Server 2012+ you can use the SUM() OVER (ORDER BY ...) window function:
SELECT Sum(CASE WHEN Len(code) > 4 THEN 1 ELSE 0 END)
OVER(ORDER BY id)
FROM Yourtable
For older versions:
SELECT *
FROM Yourtable a
CROSS apply (SELECT Count(*)
FROM Yourtable b
WHERE a.ID >= b.ID
AND Len(code) > 4) cs (runn)
ANSI SQL method:
SELECT ID,Code,
(SELECT count(*)
FROM Yourtable b
WHERE a.ID >= b.ID and char_length(code) > 4) AS runn
FROM Yourtable a

There are some good and efficient answers here.
But in case you want to try a different approach, try the following query:
SELECT
t1.*,
(Select sum(r.cnt) from
(SELECT COUNT(t2.code) as cnt FROM table1 AS t2
WHERE t2.Id <= t1.Id
group by t2.code
having len(t2.code) > 4) r
) AS Count
FROM table1 AS t1;
Hope it helps!

Count consecutive duplicate values in SQL

I have a table like so
ID OrdID Value
1 1 0
2 2 0
3 1 1
4 2 1
5 1 1
6 2 0
7 1 0
8 2 0
9 2 1
10 1 0
11 2 0
I want to get the count of consecutive value where the value is 0. Using the example above the result will be 3 (Rows 6, 7 and 8). I am using sql server 2008 r2.

I am going to presume that id is unique and increasing. You can get counts of consecutive values by using the difference of row numbers. The following counts all sequences:
select grp, value, min(id), max(id), count(*) as cnt
from (select t.*,
(row_number() over (order by id) - row_number() over (partition by value order by id)
) as grp
from table t
) t
group by grp, value;
If you want the longest sequence of 0s:
select top 1 grp, value, min(id), max(id), count(*) as cnt
from (select t.*,
(row_number() over (order by id) - row_number() over (partition by value order by id)
) as grp
from table t
) t
group by grp, value
having value = 0
order by count(*) desc
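
The row-number difference trick can be checked against the question's data with the sketch below. It uses SQLite through Python's sqlite3 module (my assumption; the question targets SQL Server 2008 R2), so TOP 1 becomes LIMIT 1, and the table name runs is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "runs" is a hypothetical table name; the columns follow the question.
conn.execute("CREATE TABLE runs (id INTEGER, ordid INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO runs VALUES (?, ?, ?)",
    [(1, 1, 0), (2, 2, 0), (3, 1, 1), (4, 2, 1), (5, 1, 1), (6, 2, 0),
     (7, 1, 0), (8, 2, 0), (9, 2, 1), (10, 1, 0), (11, 2, 0)],
)

# Within a run of equal values both row numbers advance together, so their
# difference (grp) is constant for the run and changes when the run breaks.
longest_zero_run = conn.execute(
    """
    SELECT MIN(id), MAX(id), COUNT(*) AS cnt
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (ORDER BY id)
                 - ROW_NUMBER() OVER (PARTITION BY value ORDER BY id) AS grp
          FROM runs t) t
    WHERE value = 0
    GROUP BY grp
    ORDER BY cnt DESC
    LIMIT 1
    """
).fetchone()

print(longest_zero_run)
```

On the sample data the zero runs are ids 1-2, 6-8 and 10-11, so the longest one spans ids 6 to 8 with a count of 3, as the question expects.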

A query using not exists to find consecutive 0s
select top 1 min(t2.id), max(t2.id), count(*)
from mytable t
join mytable t2 on t2.id <= t.id
where not exists (
select 1 from mytable t3
where t3.id between t2.id and t.id
and t3.value <> 0
)
group by t.id
order by count(*) desc
http://sqlfiddle.com/#!3/52989/3

We Keep Coding