Count rows in table that are the same in a sequence

I have a table that looks like this:
+----+------------+------+
| ID | Session_ID | Type |
+----+------------+------+
|  1 |          1 |    2 |
|  2 |          1 |    4 |
|  3 |          1 |    2 |
|  4 |          2 |    2 |
|  5 |          2 |    2 |
|  6 |          3 |    2 |
|  7 |          3 |    1 |
+----+------------+------+
I would like to count all occurrences of a type that appear in a sequence.
The output should look something like this:
+------------+------+-----+
| Session_ID | Type | cnt |
+------------+------+-----+
|          1 |    2 |   1 |
|          1 |    4 |   1 |
|          1 |    2 |   1 |
|          2 |    2 |   2 |
|          3 |    2 |   1 |
|          3 |    1 |   1 |
+------------+------+-----+
A simple group by like
SELECT session_id, type, COUNT(type)
FROM table
GROUP BY session_id, type
doesn't work, since I need to group only rows that are "touching".
Is this possible with a single SQL SELECT, or will I need some procedural code, such as a stored procedure or application-side code?
UPDATE (what counts as a sequence):
If the following row (ordered by ID) has the same type, it should be counted in the same group.
The sequence is determined by the ID within a Session_ID, since I only want to group rows that share the same Session_ID.
So suppose there are 3 rows in one session:
the row with ID 1 has type 1,
the row with ID 2 has type 1,
and the row with ID 3 has type 2.
Input:
+----+------------+------+
| ID | Session_ID | Type |
+----+------------+------+
|  1 |          1 |    1 |
|  2 |          1 |    1 |
|  3 |          1 |    2 |
+----+------------+------+
The sequence is row 1 to row 2. These three rows should output:
Output:
+------------+------+-------+
| Session_ID | Type | count |
+------------+------+-------+
|          1 |    1 |     2 |
|          1 |    2 |     1 |
+------------+------+-------+

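You can use the difference between id and row_number() to identify the islands of adjacent rows, and then count per island. Within one (session_id, type) partition, id and row_number() both grow by 1 while the rows are adjacent, so their difference stays constant; it changes as soon as a row of another type sits in between. Note that this relies on the ids being consecutive (no gaps) within a session.
;with cte as
(
    -- grp is constant for each run of adjacent rows with the same session_id and type
    select *, id - row_number() over (partition by session_id, type order by id) as grp
    from table
)
select session_id, type, count(*) as cnt
from cte
group by session_id, type, grp
order by max(id)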
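To try the idea end to end, here is a minimal sketch that runs a variant of the query on the question's sample data with Python's built-in sqlite3 module. It assumes an SQLite build with window functions (3.25 or later); the table name events and the helper count_runs are made up for illustration. Instead of id - row_number(), it uses the difference of two row_number() values, which should give the same grouping even when the ids are not consecutive.

```python
import sqlite3

# Sample rows from the question: (id, session_id, type).
SAMPLE = [(1, 1, 2), (2, 1, 4), (3, 1, 2), (4, 2, 2), (5, 2, 2), (6, 3, 2), (7, 3, 1)]

# "events" is a hypothetical table name used only for this sketch.
QUERY = """
WITH cte AS (
    SELECT id, session_id, type,
           -- position within the session minus position within (session, type):
           -- constant across a run of adjacent rows that share the same type
           ROW_NUMBER() OVER (PARTITION BY session_id ORDER BY id)
         - ROW_NUMBER() OVER (PARTITION BY session_id, type ORDER BY id) AS grp
    FROM events
)
SELECT session_id, type, COUNT(*) AS cnt
FROM cte
GROUP BY session_id, type, grp
ORDER BY MAX(id)
"""


def count_runs(rows):
    """Load (id, session_id, type) rows into an in-memory table and count each run."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER, session_id INTEGER, type INTEGER)")
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in count_runs(SAMPLE):
        print(row)
```

With the sample rows this yields one row per run, and only session 2 gets a count of 2.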

Related

Query to sum calls

I have a table containing call durations for a telecom company.
For example:
Table 1
| callerid | receiverid | call duration
| 1 | 2 | 5
| 1 | 2 | 2
| 2 | 3 | 4
| 1 | 5 | 2
I need to query the table above so that the result looks like this:
Table 2
| callerid | receiverid | call duration
| 1 | 2 | 7
| 2 | 3 | 4
| 1 | 5 | 2

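Use the query below:
select callerid, receiverid, sum(call_duration) as call_duration
from your_table
group by callerid, receiverid
Applied to the sample data in your question, this returns one row per (callerid, receiverid) pair with the summed duration, i.e. the rows of Table 2.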
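For a quick check, here is a minimal sketch that runs the same GROUP BY on the question's sample rows using Python's built-in sqlite3 module. The helper name total_durations is made up for illustration, and an ORDER BY is added only so the result order is predictable.

```python
import sqlite3

# Sample rows from the question: (callerid, receiverid, call_duration).
CALLS = [(1, 2, 5), (1, 2, 2), (2, 3, 4), (1, 5, 2)]


def total_durations(calls):
    """Sum call_duration per (callerid, receiverid) pair in an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE your_table (callerid INTEGER, receiverid INTEGER, call_duration INTEGER)"
    )
    conn.executemany("INSERT INTO your_table VALUES (?, ?, ?)", calls)
    result = conn.execute(
        """
        SELECT callerid, receiverid, SUM(call_duration) AS call_duration
        FROM your_table
        GROUP BY callerid, receiverid
        ORDER BY callerid, receiverid
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in total_durations(CALLS):
        print(row)
```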

SQL create a new field sessions given the value of another field

I have problems approaching the following task.
Given a table like
| user_id | hit_id | new_session |
|---------------|--------------|--------------|
| 1 | 1 | 0 |
| 1 | 2 | 0 |
| 1 | 3 | 1 |
| 1 | 4 | 0 |
| ... | ... | ... |
| 5 | 19 | 0 |
where
- the combination of user_id and hit_id is unique
- new_session is a boolean that indicates whether the hit started a new session for this particular user
I want to create a new column, session_number, that splits hit_ids into sessions, taking into account that:
- the first row for each user_id, once ordered by hit_id asc, gets a value of 1 for the new column session_number
- as long as new_session is 0, the value of session_number stays the same
- when new_session is 1, session_number is incremented by 1
- the logic works over a partition by user_id ordered by hit_id asc, so once the user_id changes, the session count is reset
I have created a db-fiddle with some example data
The expected output for user_id = 1 (which cover multiple corner cases) would be:
| user_id | hit_id | new_session | session_number |
|---------------|--------------|--------------|----------------|
| 1 | 1 | 0 | 1 |
| 1 | 2 | 0 | 1 |
| 1 | 3 | 1 | 2 |
| 1 | 4 | 0 | 2 |
| 1 | 5 | 0 | 2 |
| 1 | 6 | 1 | 3 |
| 1 | 7 | 0 | 3 |
| 1 | 8 | 1 | 4 |
| 1 | 9 | 1 | 5 |
I have tried a combination of lag(), rank(), and dense_rank(), but I always find a corner case that breaks the attempt. I am also sure there is a very easy approach that I am overlooking.

You can use a cumulative sum:
select pv.*,
(1 + sum(new_session) over (partition by user_id order by hit_id)) as session_number
from pageviews pv;
Here is a db-fiddle.
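As a minimal sketch of the cumulative-sum idea, the snippet below runs the query with Python's built-in sqlite3 module (assuming SQLite 3.25 or later for window functions). The helper session_numbers is made up for illustration, and the last sample row is taken to be hit_id 9, on the assumption that the repeated 8 in the question is a typo.

```python
import sqlite3

# (user_id, hit_id, new_session); user 1 follows the question's expected output.
HITS = [
    (1, 1, 0), (1, 2, 0), (1, 3, 1), (1, 4, 0), (1, 5, 0),
    (1, 6, 1), (1, 7, 0), (1, 8, 1), (1, 9, 1),
    (5, 19, 0),
]


def session_numbers(hits):
    """Return (user_id, hit_id, new_session, session_number) using a running sum."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pageviews (user_id INTEGER, hit_id INTEGER, new_session INTEGER)"
    )
    conn.executemany("INSERT INTO pageviews VALUES (?, ?, ?)", hits)
    result = conn.execute(
        """
        SELECT pv.*,
               -- running total of session starts per user, shifted so numbering begins at 1
               1 + SUM(new_session) OVER (PARTITION BY user_id ORDER BY hit_id) AS session_number
        FROM pageviews pv
        ORDER BY user_id, hit_id
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in session_numbers(HITS):
        print(row)
```

Note that a user whose very first hit already has new_session = 1 would start at 2 with this formula; the sample data does not contain that case.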

Filter SQL Server data according to its max value

I have one SQL Server 2008 table like:
+------+-------+--------------------------------------+
| id | level | content |
+------+-------+--------------------------------------+
| 1 | 1 | ... |
| 2 | 2 | ... |
| 1 | 2 | ... |
| 1 | 3 | ... |
| 2 | 1 | ... |
| 1 | 4 | ... |
| 3 | 1 | ... |
+------+-------+--------------------------------------+
Each id may have two, three, or four levels stored in the table, as above. How can I get the data for every id so that:
every id has at most three records in final table
if the max level of an id is higher than 3, the three records kept are the levels from max down to max-2;
if the max level of an id is less than or equal to 3, just keep its records as they are.
so the final table which I would like to get is:
+------+-------+--------------------------------------+
| id | level | content |
+------+-------+--------------------------------------+
| 1 | 1 | ... |
| 2 | 2 | ... |
| 1 | 2 | ... |
| 1 | 3 | ... |
| 2 | 1 | ... |
| 3 | 1 | ... |
+------+-------+--------------------------------------+
How can I select these rows? Thanks a lot.

I think you want the 3 latest levels per id. If so, you can use window functions like so:
select *
from (
    select t.*, row_number() over (partition by id order by level desc) as rn
    from mytable t
) t
where rn <= 3
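Here is a minimal sketch of that query run against the question's sample rows with Python's built-in sqlite3 module (assuming SQLite 3.25 or later; the question itself is about SQL Server, so this only illustrates the logic). The helper top_levels and its parameter n are made up for illustration.

```python
import sqlite3

# (id, level, content) rows from the question; content is just a placeholder.
ROWS = [
    (1, 1, "..."), (2, 2, "..."), (1, 2, "..."), (1, 3, "..."),
    (2, 1, "..."), (1, 4, "..."), (3, 1, "..."),
]


def top_levels(rows, n=3):
    """Keep the n highest levels per id, numbered with ROW_NUMBER()."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER, level INTEGER, content TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT id, level
        FROM (
            SELECT t.*,
                   ROW_NUMBER() OVER (PARTITION BY id ORDER BY level DESC) AS rn
            FROM mytable t
        ) AS ranked
        WHERE rn <= ?
        ORDER BY id, level DESC
        """,
        (n,),
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in top_levels(ROWS):
        print(row)
```

For id 1 this keeps levels 4, 3 and 2 and drops level 1, while ids 2 and 3 keep all of their rows.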

Using LAG function with higher offset

Suppose we have the following input table
cat | value | position
------------------------
1 | A | 1
1 | B | 2
1 | C | 3
1 | D | 4
2 | C | 1
2 | B | 2
2 | A | 3
2 | D | 4
As you can see, the values A, B, C, D change position from one category to the next. I want to track this change by adding a change column next to each value. The output should look like this:
cat | value | position | change
---------------------------------
1 | A | 1 | NULL
1 | B | 2 | NULL
1 | C | 3 | NULL
1 | D | 4 | NULL
2 | C | 1 | 2
2 | B | 2 | 0
2 | A | 3 | -2
2 | D | 4 | 0
For example, C was in position 3 in category 1 and moved to position 1 in category 2, and therefore has a change of 2. I tried implementing this using the LAG() function with an offset of 4, but I failed. How can I write this query?

Use lag() - with the proper partition by clause:
select
t.*,
lag(position) over(partition by value order by cat) - position change
from mytable t

You can use lag and then order by to maintain original order. Here is the demo.
select
*,
lag(position) over (partition by value order by cat) - position as change
from yourTable
order by
cat, position
output:
| cat | value | position | change |
| --- | ----- | -------- | ------ |
| 1 | A | 1 | null |
| 1 | B | 2 | null |
| 1 | C | 3 | null |
| 1 | D | 4 | null |
| 2 | C | 1 | 2 |
| 2 | B | 2 | 0 |
| 2 | A | 3 | -2 |
| 2 | D | 4 | 0 |

I think you just want lag() with the right partition by:
select t.*,
(lag(position) over (partition by value order by cat) - position) as change
from t;
Here is a db<>fiddle.
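To see why partitioning by value removes the need for a larger offset, here is a minimal sketch that runs the query on the question's sample rows with Python's built-in sqlite3 module (assuming SQLite 3.25 or later). The helper position_changes is made up for illustration; the table name mytable is borrowed from one of the answers.

```python
import sqlite3

# (cat, value, position) rows from the question.
ROWS = [
    (1, "A", 1), (1, "B", 2), (1, "C", 3), (1, "D", 4),
    (2, "C", 1), (2, "B", 2), (2, "A", 3), (2, "D", 4),
]


def position_changes(rows):
    """Return (cat, value, position, change); change is NULL/None in the first category."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (cat INTEGER, value TEXT, position INTEGER)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT cat, value, position,
               -- previous position of the same value (default offset 1) minus the current one
               LAG(position) OVER (PARTITION BY value ORDER BY cat) - position AS change
        FROM mytable
        ORDER BY cat, position
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in position_changes(ROWS):
        print(row)
```

Because each partition holds a single value, the previous row is always that value's row in the preceding category, so the default offset of 1 is enough.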

Semi-transposing a table in Oracle

I am having trouble semi-transposing the table below based on the 'LENGTH' column. I am using an Oracle database, sample data:
+-----------+-----------+--------+------+
| PERSON_ID | PERIOD_ID | LENGTH | FLAG |
+-----------+-----------+--------+------+
| 1 | 1 | 4 | 1 |
| 1 | 2 | 3 | 0 |
| 2 | 1 | 4 | 1 |
+-----------+-----------+--------+------+
I would like to lengthen this table based on the LENGTH column; basically duplicating each row once for every value from 1 up to its LENGTH.
See the desired output table below:
+-----------+-----------+--------+------+
| PERSON_ID | PERIOD_ID | NUMBER | FLAG |
+-----------+-----------+--------+------+
| 1 | 1 | 1 | 1 |
| 1 | 1 | 2 | 1 |
| 1 | 1 | 3 | 1 |
| 1 | 1 | 4 | 1 |
| 1 | 2 | 1 | 0 |
| 1 | 2 | 2 | 0 |
| 1 | 2 | 3 | 0 |
| 2 | 1 | 1 | 1 |
| 2 | 1 | 2 | 1 |
| 2 | 1 | 3 | 1 |
| 2 | 1 | 4 | 1 |
+-----------+-----------+--------+------+
I typically work in Postgres, so Oracle is new to me.
I've found some solutions using the connect by statement, but they seem overly complicated, particularly when compared to the simple generate_series() command from Postgres.

A recursive CTE that subtracts 1 from length until 1 is reached should work. (This works in Postgres too, BTW, should you need something cross-platform, although Postgres needs WITH RECURSIVE instead of plain WITH.)
WITH cte (person_id,
          period_id,
          number_,
          flag)
AS
(
    -- anchor: start every row at its full length
    SELECT person_id,
           period_id,
           length number_,
           flag
    FROM elbat
    UNION ALL
    -- recursive step: count down until 1 is reached
    SELECT person_id,
           period_id,
           number_ - 1 number_,
           flag
    FROM cte
    WHERE number_ > 1
)
SELECT *
FROM cte
ORDER BY person_id,
         period_id,
         number_;
db<>fiddle
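The countdown can also be tried outside Oracle. The following is a minimal sketch that runs the same recursive CTE on the question's sample rows with Python's built-in sqlite3 module; SQLite, like Postgres, is written here with WITH RECURSIVE. The helper expand_lengths is made up for illustration, and elbat is the table name used in the answer.

```python
import sqlite3

# (person_id, period_id, length, flag) rows from the question.
ROWS = [(1, 1, 4, 1), (1, 2, 3, 0), (2, 1, 4, 1)]


def expand_lengths(rows):
    """Emit one row per number from 1 to length for every input row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE elbat (person_id INTEGER, period_id INTEGER, length INTEGER, flag INTEGER)"
    )
    conn.executemany("INSERT INTO elbat VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        WITH RECURSIVE cte (person_id, period_id, number_, flag) AS (
            -- anchor: start every row at its full length
            SELECT person_id, period_id, length, flag
            FROM elbat
            UNION ALL
            -- recursive step: count down until 1 is reached
            SELECT person_id, period_id, number_ - 1, flag
            FROM cte
            WHERE number_ > 1
        )
        SELECT person_id, period_id, number_, flag
        FROM cte
        ORDER BY person_id, period_id, number_
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in expand_lengths(ROWS):
        print(row)
```

The three input rows have lengths 4, 3 and 4, so the result has 11 rows, matching the desired output table in the question.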
