SQL SERVER COUNT LEAD Condition Group BY


I am trying to find a way to count based on groups, and I have not been able to figure out a way without using a cursor. Since a cursor will be relatively slow, I was hoping there might be a better way.
Simplified, the data is structured as follows:
+----+--------+-------+--------+
| ID | NEXTID | RowNo | Status |
+----+--------+-------+--------+
| 1 | 2 | 1 | 1 |
| 2 | 3 | 1 | 1 |
| 3 | 4 | 1 | 0 |
| 4 | | 1 | 1 |
| 1 | 2 | 2 | 0 |
| 2 | 3 | 2 | 1 |
| 3 | 4 | 2 | 1 |
| 4 | | 2 | 1 |
| 1 | 2 | 3 | 1 |
| 2 | 3 | 3 | 1 |
| 3 | 4 | 3 | 1 |
| 4 | | 3 | 1 |
+----+--------+-------+--------+
I now want to COUNT the Status column in groups, i.e. count each consecutive run of Status = 1 within a RowNo, resulting in:
+-----+-------------+
| Row | StatusCount |
+-----+-------------+
| 1 | 2 |
| 1 | 1 |
| 2 | 3 |
| 3 | 4 |
+-----+-------------+
For testing purposes I created the following code:
SELECT
ID,
NEXTID,
RowNo,
Status,
LEAD(ID,1,0)
OVER (ORDER BY RowNo,ID) AS LEADER
INTO #TestTable
FROM
(
VALUES
(1, 2, 1, 1),
(2, 3, 1, 1),
(3, 4, 1, 0),
(4, '', 1, 1),
(1, 2, 2, 0),
(2, 3, 2, 1),
(3, 4, 2, 1),
(4, '', 2, 1),
(1, 2, 3, 1),
(2, 3, 3, 1),
(3, 4, 3, 1),
(4, '', 3, 1)
)
AS TestTable(
ID,
NEXTID,
RowNo,
Status);
GO
SELECT
RowNo,
Count(Status) AS StatusCount
FROM #TestTable
WHERE
Status = 1
GROUP BY
RowNo
This results in
+-----+-------------+
| Row | StatusCount |
+-----+-------------+
| 1 | 3 |
| 2 | 3 |
| 3 | 4 |
+-----+-------------+
This does not split RowNo 1 into two groups. I do realise that I need another GROUP BY condition, but I cannot figure out the appropriate one.
Thank you very much for your help. If this has already been answered, I was unable to find the topic, and hints will also be appreciated.
With kind regards
freubau

You can identify the groups by doing a cumulative sum of the zeros up to each number. Then, the rest is just aggregation:
select rowno, count(*)
from (select t.*,
             sum(case when status = 0 then 1 else 0 end) over (partition by rowno order by id) as grp
      from #TestTable t
     ) t
where status = 1
group by rowno, grp
order by rowno, grp;
Here is a rex tester for it.
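To make the grouping visible, here is a minimal sketch that runs the same cumulative-sum idea against an in-memory SQLite database from Python. It is an illustration under assumptions, not the answerer's test setup: it uses SQLite (3.25 or later, for window functions) instead of SQL Server, a plain table named TestTable instead of the temp table, and it leaves out the NEXTID and LEADER columns because the query does not need them.

```python
import sqlite3

# Sample rows from the question as (ID, RowNo, Status).
ROWS = [
    (1, 1, 1), (2, 1, 1), (3, 1, 0), (4, 1, 1),
    (1, 2, 0), (2, 2, 1), (3, 2, 1), (4, 2, 1),
    (1, 3, 1), (2, 3, 1), (3, 3, 1), (4, 3, 1),
]

QUERY = """
SELECT RowNo, COUNT(*) AS StatusCount
FROM (
    SELECT ID, RowNo, Status,
           -- running count of zeros: each Status = 0 row starts a new group
           SUM(CASE WHEN Status = 0 THEN 1 ELSE 0 END)
               OVER (PARTITION BY RowNo ORDER BY ID) AS grp
    FROM TestTable
) AS t
WHERE Status = 1
GROUP BY RowNo, grp
ORDER BY RowNo, grp
"""


def count_status_runs(rows):
    """Load the rows into SQLite and count each run of Status = 1 per RowNo."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE TestTable (ID INTEGER, RowNo INTEGER, Status INTEGER)"
    )
    con.executemany("INSERT INTO TestTable VALUES (?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


status_counts = count_status_runs(ROWS)
print(status_counts)  # [(1, 2), (1, 1), (2, 3), (3, 4)]
```

The zero row itself belongs to the group it opens, which is why the outer query filters on Status = 1 before counting.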

Related

Adding unique identifier based on repeating values

Using MS-SQL, I have the following table excerpt:
-----------------------------------
Market | Cycle | Milestone | Sale |
A | NULL | NULL | NULL |
A | 1 | NULL | NULL |
B | NULL | NULL | NULL |
B | 3 | 4 | NULL |
B | 3 | 4 | 5 |
A | 1 | 2 | NULL |
A | 1 | 2 | 1 |
NULL | C | 6 | 7 |
NULL | C | NULL | NULL |
D | 8 | NULL | NULL |
D | 8 | 9 | NULL |
Each row represents a new stage in the product life-cycle.
If the first stage of product C was Cycle, the next row for it will have values in Cycle and Milestone, and so forth.
I need to add an identifier for each group, based on the first not-null column for each value.
The required output for the above table would be as follows:
-------------------------------------------
Market | Cycle | Milestone | Sale | Group
A | NULL | NULL | NULL | 1
A | 1 | NULL | NULL | 1
B | NULL | NULL | NULL | 2
B | 3 | 4 | NULL | 2
B | 3 | 4 | 5 | 2
A | 1 | 2 | NULL | 1
A | 1 | 2 | 1 | 1
NULL | C | 6 | 7 | 3
NULL | C | NULL | NULL | 3
D | 8 | NULL | NULL | 4
D | 8 | 9 | NULL | 4
If a new row will be added with Market "D", it will receive Group 1.
If a new row is added with Market NULL and a Cycle that has not appeared yet, it will start a new group, 5. Future rows with the same Cycle will also receive 5.
Hopefully this is clear enough...
Any assistance with SQL-Server code for this will be helpful.
Thank you!

This will set the Group value in the way you want, although it seems inelegant (Group is a reserved word in SQL Server, so it is bracketed here):
UPDATE tblGroup
SET [Group] = ASCII(COALESCE(Market, Cycle, Milestone, Sale)) - 64
...or select it via:
SELECT *, ASCII(COALESCE(Market, Cycle, Milestone, Sale)) - 64 AS [Group]
FROM tblGroup

You could use a window function as
SELECT *, DENSE_RANK() OVER(ORDER BY Market) [Group]
-- Or DENSE_RANK() OVER(ORDER BY COALESCE(Market, Cycle)) [Group] to get the exact results
FROM
(
VALUES
('A', NULL, NULL, NULL),
('A', '1', NULL, NULL),
('B', NULL, NULL, NULL),
('B', '3', 4, NULL),
('B', '3', 4, 5 ),
('A', '1', 2, NULL),
('A', '1', 2, 1 ),
(NULL, 'C', 6, 7 ),
(NULL, 'C', NULL, NULL),
('D', '8', NULL, NULL),
('D', '8', 9, NULL)
)T(Market, Cycle, Milestone, Sale)
Online Demo
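The COALESCE variant mentioned in the comment can be checked with a small sketch. It assumes SQLite 3.25 or later run from Python rather than SQL Server. The Seq column and the GroupNo alias are hypothetical names added here: Seq only preserves insertion order, and GroupNo avoids the reserved word Group.

```python
import sqlite3

# Rows from the question as (Market, Cycle, Milestone, Sale).
ROWS = [
    ("A", None, None, None),
    ("A", "1", None, None),
    ("B", None, None, None),
    ("B", "3", 4, None),
    ("B", "3", 4, 5),
    ("A", "1", 2, None),
    ("A", "1", 2, 1),
    (None, "C", 6, 7),
    (None, "C", None, None),
    ("D", "8", None, None),
    ("D", "8", 9, None),
]


def assign_groups(rows):
    """Number rows by the first non-NULL of Market and Cycle."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE tblGroup (Seq INTEGER PRIMARY KEY, Market TEXT, "
        "Cycle TEXT, Milestone INTEGER, Sale INTEGER)"
    )
    con.executemany(
        "INSERT INTO tblGroup (Market, Cycle, Milestone, Sale) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    result = con.execute(
        """
        SELECT Market, Cycle,
               DENSE_RANK() OVER (ORDER BY COALESCE(Market, Cycle)) AS GroupNo
        FROM tblGroup
        ORDER BY Seq
        """
    ).fetchall()
    con.close()
    return result


groups = assign_groups(ROWS)
print([g[2] for g in groups])  # [1, 1, 2, 2, 2, 1, 1, 3, 3, 4, 4]
```

Two caveats follow from how DENSE_RANK works. Groups are numbered by the sort order of the key, not by first appearance, so a new key that sorts earlier would renumber existing groups. A Market value and a Cycle value that happen to be equal would also land in the same group.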

SQL Update a table column with a sequence of values

I have a situation where I am required to create a copy of the data of one table within itself with a different range of foreign key in one of the columns. For example:
--------------------------------------------------------------
|TYPES |ITEMS |SUBITEMS |
|--------------|----------------------|----------------------|
| ID | VALUE | ID | VALUE | TYPEID | ID | VALUE | ITEMID |
|----|---------|----|--------|--------|----|--------|--------|
| 1 | TYPE1 | 1 | ITEMA | 1 | 1 | SUB1 | 1 |
| 2 | TYPE2 | 2 | ITEMB | 1 | 2 | SUB2 | 2 |
| | | 3 | ITEMC | 1 | 3 | SUB3 | 3 |
| | | 4 | ITEMD | 2 | | | |
| | | 5 | ITEME | 2 | | | |
| | | 6 | ITEMF | 2 | | | |
--------------------------------------------------------------
Here I have to copy from SUBITEMS and insert back but with ITEMIDs that have TYPEID as 2 resulting in the following example:
--------------------------------------------------------------
|TYPES |ITEMS |SUBITEMS |
|--------------|----------------------|----------------------|
| ID | VALUE | ID | VALUE | TYPEID | ID | VALUE | ITEMID |
|----|---------|----|--------|--------|----|--------|--------|
| 1 | TYPE1 | 1 | ITEMA | 1 | 1 | SUB1 | 1 |
| 2 | TYPE2 | 2 | ITEMB | 1 | 2 | SUB2 | 2 |
| | | 3 | ITEMC | 1 | 3 | SUB3 | 3 |
| | | 4 | ITEMD | 2 | 4 | SUB1 | 4 |
| | | 5 | ITEME | 2 | 5 | SUB2 | 5 |
| | | 6 | ITEMF | 2 | 6 | SUB3 | 6 |
--------------------------------------------------------------
EDIT 2: If the amount of rows differ in either of the tables (4 Items while 3 SubItems or 3 Items while 4 SubItems) then only those rows should be considered that are enough for a 1:1 relation between the two tables (3 result since that is the least count among either) as shown in the following example.
--------------------------------------------------------------
|TYPES |ITEMS |SUBITEMS |
|--------------|----------------------|----------------------|
| ID | VALUE | ID | VALUE | TYPEID | ID | VALUE | ITEMID |
|----|---------|----|--------|--------|----|--------|--------|
| 1 | TYPE1 | 1 | ITEMA | 1 | 1 | SUB1 | 1 |
| 2 | TYPE2 | 2 | ITEMB | 1 | 2 | SUB2 | 2 |
| | | 3 | ITEMC | 1 | 3 | SUB3 | 3 |
| | | 4 | ITEMD | 2 | 4 | SUB1 | 4 |
| | | 5 | ITEME | 2 | 5 | SUB2 | 5 |
| | | 6 | ITEMF | 2 | 6 | SUB3 | 6 |
| | | 7 | ITEMG | 2 | | | |
--------------------------------------------------------------
Of course the actual data isn't as simple: it has many other types, items and subitems, and the required IDs would not be sequential (10001, 10008, 40042, etc.), with many other columns defining what data is being copied and which IDs need to be assigned to it. It's just a matter of how each data row obtained should get mapped 1:1 to each ID obtained (assuming both are in their own temp tables before the moment of this merger). Following is a sample of what I am able to do so far:
CREATE TABLE #SubItemsTemp (Value VARCHAR(100))
CREATE TABLE #ItemIDsTemp (ItemID INT)
INSERT INTO #SubItemsTemp (Value)
SELECT
    SI.Value
FROM
    SubItems SI
    JOIN Items IT ON SI.ItemID = IT.ID
WHERE
    IT.TypeID = 1
-- The column list must match the column declared above
INSERT INTO #ItemIDsTemp (ItemID)
SELECT IT.ID
FROM Items IT
WHERE IT.TypeID = 2
--What next?
EDIT 1: I forgot to mention the actual question: how do I insert them together into the SUBITEMS table so that the second example comes to fruition?
Footnote: This is an extreme simplification of the actual queries, which have several joins to get to "TYPE".

Try this query. It assumes that the ID column in the SUBITEMS table is an identity column, and it works only with TypeIDs 1 and 2.
declare @TYPES table(ID int, VALUE varchar(100))
declare @ITEMS table(ID int, VALUE varchar(100), TYPEID int)
declare @SUBITEMS table(ID int identity(1,1), VALUE varchar(100), ITEMID int)
insert into @TYPES values (1, 'TYPE1'), (2, 'TYPE2')
insert into @ITEMS values (1, 'ITEMA', 1), (2, 'ITEMB', 1), (3, 'ITEMC', 1), (4, 'ITEMD', 2), (5, 'ITEME', 2), (6, 'ITEMF', 2), (7, 'ITEMG', 2)
insert into @SUBITEMS values ('SUB1', 1), ('SUB2', 2), ('SUB3', 3)
-- Number the source subitems (TYPEID = 1) in item order
; with cte_1 as (
    select
        s.VALUE, rn = row_number() over (order by i.ID)
    from
        @ITEMS i
        join @SUBITEMS s on s.ITEMID = i.ID
    where
        i.TYPEID = 1
)
-- Number the target items (TYPEID = 2) in ID order
, cte_2 as (
    select
        ID, rn = row_number() over (order by ID)
    from
        @ITEMS
    where
        TYPEID = 2
)
-- Pair source and target rows 1:1 by row number; unmatched rows are dropped
insert into @SUBITEMS
select
    a.VALUE, b.ID
from
    cte_1 a
    join cte_2 b on a.rn = b.rn
select * from @SUBITEMS
Output
ID Value ItemId
------------------
1 SUB1 1
2 SUB2 2
3 SUB3 3
4 SUB1 4
5 SUB2 5
6 SUB3 6
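The "least count wins" behaviour from EDIT 2 falls out of the inner join on the two row numbers. A minimal sketch of that pairing, assuming SQLite 3.25 or later driven from Python in place of SQL Server table variables (the table and CTE names src and dst are chosen here for illustration):

```python
import sqlite3

ITEMS = [
    (1, "ITEMA", 1), (2, "ITEMB", 1), (3, "ITEMC", 1),
    (4, "ITEMD", 2), (5, "ITEME", 2), (6, "ITEMF", 2), (7, "ITEMG", 2),
]
SUBITEMS = [("SUB1", 1), ("SUB2", 2), ("SUB3", 3)]

COPY_SQL = """
WITH src AS (   -- subitems of TypeID 1, numbered in item order
    SELECT s.Value, ROW_NUMBER() OVER (ORDER BY i.ID) AS rn
    FROM Items i
    JOIN SubItems s ON s.ItemID = i.ID
    WHERE i.TypeID = 1
),
dst AS (        -- target items of TypeID 2, numbered in ID order
    SELECT ID, ROW_NUMBER() OVER (ORDER BY ID) AS rn
    FROM Items
    WHERE TypeID = 2
)
INSERT INTO SubItems (Value, ItemID)
SELECT src.Value, dst.ID
FROM src
JOIN dst ON src.rn = dst.rn   -- inner join keeps only rows with a partner
ORDER BY src.rn
"""


def copy_subitems(items, subitems):
    """Copy TypeID 1 subitems onto TypeID 2 items, paired by row number."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Items (ID INTEGER, Value TEXT, TypeID INTEGER)")
    con.execute(
        "CREATE TABLE SubItems (ID INTEGER PRIMARY KEY AUTOINCREMENT, "
        "Value TEXT, ItemID INTEGER)"
    )
    con.executemany("INSERT INTO Items VALUES (?, ?, ?)", items)
    con.executemany(
        "INSERT INTO SubItems (Value, ItemID) VALUES (?, ?)", subitems
    )
    con.execute(COPY_SQL)
    result = con.execute(
        "SELECT ID, Value, ItemID FROM SubItems ORDER BY ID"
    ).fetchall()
    con.close()
    return result


sub_items = copy_subitems(ITEMS, SUBITEMS)
for row in sub_items:
    print(row)
```

With four target items and three source subitems, item 7 (ITEMG) finds no partner row number and is skipped, so only three rows are inserted.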

SQL server rank rows by group and condition

I need help with ranking of rows in one table.
+-----+-------+-------------+------------+-------+------+
| ID | group | typeInGroup | rankOfType | score | Rank |
+-----+-------+-------------+------------+-------+------+
| 1 | a | type1 | 1 | 40 | |
| 2 | a | type2 | 2 | 55 | |
| 3 | a | type1 | 1 | 20 | |
| 4 | b | type3 | 3 | 80 | |
| 5 | b | type2 | 2 | 60 | |
| 6 | b | type1 | 1 | 70 | |
| 7 | b | type1 | 1 | 70 | |
+-----+-------+-------------+------------+-------+------+
I am basically looking for a solution that gives me the order for the last column, "Rank".
Each "group" has up to 9 "typeInGroup" values, which are ranked 1-9 in the column "rankOfType". Each "typeInGroup" has a "score". When calculating the last column, "Rank", I look at the "score" and "rankOfType" columns.
The row with the highest score should be ranked first, unless there is a row with a lower "rankOfType" value whose score is no more than 15 points below the score we have been looking at. The order of rows with the same "score" and "rankOfType" is not important.
I should do this check for every single row in a group and end up with something like this:
+-----+-------+-------------+------------+-------+------+
| ID | group | typeInGroup | rankOfType | score | Rank |
+-----+-------+-------------+------------+-------+------+
| 1 | a | type1 | 1 | 40 | 1 |
| 2 | a | type2 | 2 | 55 | 2 |
| 3 | a | type1 | 1 | 20 | 3 |
| 4 | b | type3 | 3 | 80 | 3 |
| 5 | b | type2 | 2 | 60 | 4 |
| 6 | b | type1 | 1 | 70 | 1 |
| 7 | b | type1 | 1 | 70 | 2 |
+-----+-------+-------------+------------+-------+------+
Any idea how to do this?

The CROSS APPLY query checks, for each row, whether its group contains another row with a higher rankOfType whose score is no more than 15 points higher. If such a row exists, the current row gets higher priority.
Try it out with a larger data set and verify the result.
declare @tbl table
(
    ID int,
    Grp char,
    typeInGrp varchar(5),
    rankOfType int,
    score int
)
insert into @tbl select 1, 'a', 'type1', 1, 40
insert into @tbl select 2, 'a', 'type2', 2, 55
insert into @tbl select 3, 'a', 'type1', 1, 20
insert into @tbl select 4, 'b', 'type3', 3, 80
insert into @tbl select 5, 'b', 'type2', 2, 60
insert into @tbl select 6, 'b', 'type1', 1, 70
insert into @tbl select 7, 'b', 'type1', 1, 70
select *,
       [rank] = row_number() over (partition by Grp
                                   order by case when cnt > 0 then 1 else 2 end,
                                            score desc)
from @tbl t
cross apply
(
    -- count the rows in the same group that this row is allowed to outrank
    select cnt = count(*)
    from @tbl x
    where x.Grp = t.Grp
      and x.ID <> t.ID
      and x.rankOfType > t.rankOfType
      and x.score - t.score <= 15
) s
order by ID
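The priority rule can be verified against the expected Rank column with a short sketch. This is an adaptation, not the answer's exact query: it assumes SQLite 3.25 or later run from Python, and because SQLite has no CROSS APPLY it computes cnt with a correlated subquery in a CTE (named flagged here). ID is added as a final tie-breaker so the two tied rows get a reproducible order.

```python
import sqlite3

# (ID, Grp, typeInGrp, rankOfType, score) from the question.
ROWS = [
    (1, "a", "type1", 1, 40),
    (2, "a", "type2", 2, 55),
    (3, "a", "type1", 1, 20),
    (4, "b", "type3", 3, 80),
    (5, "b", "type2", 2, 60),
    (6, "b", "type1", 1, 70),
    (7, "b", "type1", 1, 70),
]

QUERY = """
WITH flagged AS (
    SELECT t.*,
           -- rows in the same group with a higher rankOfType whose score
           -- is no more than 15 points above this row's score
           (SELECT COUNT(*)
            FROM tbl x
            WHERE x.Grp = t.Grp
              AND x.ID <> t.ID
              AND x.rankOfType > t.rankOfType
              AND x.score - t.score <= 15) AS cnt
    FROM tbl t
)
SELECT ID,
       ROW_NUMBER() OVER (
           PARTITION BY Grp
           ORDER BY CASE WHEN cnt > 0 THEN 1 ELSE 2 END, score DESC, ID
       ) AS rnk
FROM flagged
ORDER BY ID
"""


def rank_rows(rows):
    """Return {ID: rank} using the priority rule described above."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE tbl (ID INTEGER, Grp TEXT, typeInGrp TEXT, "
        "rankOfType INTEGER, score INTEGER)"
    )
    con.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?, ?)", rows)
    result = dict(con.execute(QUERY).fetchall())
    con.close()
    return result


ranks = rank_rows(ROWS)
print(ranks)  # {1: 1, 2: 2, 3: 3, 4: 3, 5: 4, 6: 1, 7: 2}
```

Note that the condition `x.score - t.score <= 15` is also true when the other row's score is lower, so a row with a low rankOfType is prioritised over any higher-typed row it already beats on score.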

Transpose sequential times data to derive duration

I have a table of actions within a session and duration (milliseconds) between each step:
+-----------------------------------------------------------------------+
| | userid | sessionid | action sequence | action | milliseconds | |
| +--------+-----------+-----------------+-------------+--------------+ |
| | 1 | 1 | 1 | event start | 0 | |
| | 1 | 1 | 2 | other | 188114 | |
| | 1 | 1 | 3 | event end | 248641 | |
| | 1 | 1 | 4 | other | 398215 | |
| | 1 | 1 | 5 | event start | 488284 | |
| | 1 | 1 | 6 | other | 528445 | |
| | 1 | 1 | 7 | other | 572711 | |
| | 1 | 1 | 8 | event end | 598123 | |
| | 1 | 2 | 1 | event start | 0 | |
| | 1 | 2 | 2 | event end | 54363 | |
| | 2 | 1 | 1 | other | 0 | |
| | 2 | 1 | 2 | other | 2345 | |
| | 2 | 1 | 1 | other | 75647 | |
| | 3 | 1 | 2 | other | 0 | |
| | 3 | 1 | 3 | event start | 34678 | |
| | 3 | 1 | 4 | other | 46784 | |
| | 3 | 1 | 5 | other | 78905 | |
| | 4 | 1 | 1 | event start | 0 | |
| | 4 | 1 | 2 | other | 7454 | |
| | 4 | 1 | 3 | other | 11245 | |
| | 4 | 1 | 4 | event end | 24567 | |
| | 4 | 1 | 5 | other | 29562 | |
| | 4 | 1 | 6 | other | 43015 | |
| +--------+-----------+-----------------+-------------+--------------+ |
I would like to capture complete events -- sessions containing both an event start and end (some may have a start but no end, an end but no start, or neither -- I don't want those), and their start and end times. Ultimately I want to track duration by transposing the sequential rows of times into columns so I can calculate a difference. The above data table would ideally be transposed into:
+--------+-----------+---------------+--------+--------+
| userid | sessionid | full event id | start | end |
+--------+-----------+---------------+--------+--------+
| 1 | 1 | 1 | 0 | 248641 |
| 1 | 1 | 2 | 488284 | 598123 |
| 1 | 2 | 1 | 0 | 54363 |
| 4 | 1 | 1 | 0 | 24567 |
+--------+-----------+---------------+--------+--------+
I attempted something like:
select a.userid, a.sessionid, a.milliseconds as start, b.milliseconds as end
from #table a
inner join #table b
on a.userid=b.userid
and a.sessionid=b.sessionid
and a.action='event start'
and b.action='event end'
However, that doesn't work since some users may have multiple event starts and ends in one session (like userid 1). I am stuck on how best to transpose the times data for each event. Thanks for your help!

So, given your above data:
CREATE TABLE test_table (
`userid` int,
`sessionid` int,
`actionSequence` int,
`action` varchar(11),
`milliseconds` int
);
INSERT INTO test_table
(`userid`, `sessionid`, `actionSequence`, `action`, `milliseconds`)
VALUES
(1, 1, 1, 'event start', 0),
(1, 1, 2, 'other', 188114),
(1, 1, 3, 'event end', 248641),
(1, 1, 4, 'other', 398215),
(1, 1, 5, 'event start', 488284),
(1, 1, 6, 'other', 528445),
(1, 1, 7, 'other', 572711),
(1, 1, 8, 'event end', 598123),
(1, 2, 1, 'event start', 0),
(1, 2, 2, 'event end', 54363),
(2, 1, 1, 'other', 0),
(2, 1, 2, 'other', 2345),
(2, 1, 1, 'other', 75647),
(3, 1, 2, 'other', 0),
(3, 1, 3, 'event start', 34678),
(3, 1, 4, 'other', 46784),
(3, 1, 5, 'other', 78905),
(4, 1, 1, 'event start', 0),
(4, 1, 2, 'other', 7454),
(4, 1, 3, 'other', 11245),
(4, 1, 4, 'event end', 24567),
(4, 1, 5, 'other', 29562),
(4, 1, 6, 'other', 43015);
The following query should get you where you want to be (you were on the right track):
SELECT
tt1.userid,
tt1.sessionid,
tt1.actionSequence,
tt1.milliseconds AS startMS,
MIN(tt2.milliseconds) AS endMS,
MIN(tt2.milliseconds) - tt1.milliseconds AS totalMS
FROM test_table tt1
INNER JOIN test_table tt2
ON tt2.userid = tt1.userid
AND tt2.sessionid = tt1.sessionid
AND tt2.actionSequence > tt1.actionSequence
AND tt2.action = 'event end'
WHERE tt1.action = 'event start'
GROUP BY tt1.userid, tt1.sessionid, tt1.actionSequence, startMS
Giving you this result set:
userid sessionid actionSequence startMS endMS totalMS
1 1 1 0 248641 248641
1 1 5 488284 598123 109839
1 2 1 0 54363 54363
4 1 1 0 24567 24567
The GROUP BY is important, because there are two rows with action = 'event end' and sequence > 1 for sessionid = 1 and userid = 1, so (I assume) we want the one closest to the current sequence, i.e. the MIN(milliseconds). As you can see, it also allows you to go ahead and take the difference of the two columns in this result set, saving you the extra step you were planning :]
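Here is a SQLFiddle of this query in action on MySQL 5.6. You did not specify an RDBMS; the logic of this query should be simple enough to port to most SQL engines, although the backtick quoting and the use of the alias startMS in GROUP BY are MySQL-specific. In SQL Server, for example, drop the backticks and group by tt1.milliseconds instead.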
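The question also asked for a "full event id" per session, which the query above does not produce. One possible way to add it, sketched here under assumptions, is to number the paired rows with ROW_NUMBER. The sketch runs on SQLite 3.25 or later from Python; the CTE name paired and the column alias fullEventId are made up for illustration.

```python
import sqlite3

# (userid, sessionid, actionSequence, action, milliseconds) from the question.
ROWS = [
    (1, 1, 1, "event start", 0), (1, 1, 2, "other", 188114),
    (1, 1, 3, "event end", 248641), (1, 1, 4, "other", 398215),
    (1, 1, 5, "event start", 488284), (1, 1, 6, "other", 528445),
    (1, 1, 7, "other", 572711), (1, 1, 8, "event end", 598123),
    (1, 2, 1, "event start", 0), (1, 2, 2, "event end", 54363),
    (2, 1, 1, "other", 0), (2, 1, 2, "other", 2345),
    (2, 1, 1, "other", 75647), (3, 1, 2, "other", 0),
    (3, 1, 3, "event start", 34678), (3, 1, 4, "other", 46784),
    (3, 1, 5, "other", 78905), (4, 1, 1, "event start", 0),
    (4, 1, 2, "other", 7454), (4, 1, 3, "other", 11245),
    (4, 1, 4, "event end", 24567), (4, 1, 5, "other", 29562),
    (4, 1, 6, "other", 43015),
]

QUERY = """
WITH paired AS (
    -- pair every start with the earliest end that follows it
    SELECT s.userid, s.sessionid, s.actionSequence,
           s.milliseconds AS startMS,
           MIN(e.milliseconds) AS endMS
    FROM test_table s
    JOIN test_table e
      ON e.userid = s.userid
     AND e.sessionid = s.sessionid
     AND e.actionSequence > s.actionSequence
     AND e.action = 'event end'
    WHERE s.action = 'event start'
    GROUP BY s.userid, s.sessionid, s.actionSequence, s.milliseconds
)
SELECT userid, sessionid,
       ROW_NUMBER() OVER (PARTITION BY userid, sessionid
                          ORDER BY actionSequence) AS fullEventId,
       startMS, endMS
FROM paired
ORDER BY userid, sessionid, fullEventId
"""


def complete_events(rows):
    """Return (userid, sessionid, fullEventId, startMS, endMS) tuples."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE test_table (userid INTEGER, sessionid INTEGER, "
        "actionSequence INTEGER, action TEXT, milliseconds INTEGER)"
    )
    con.executemany("INSERT INTO test_table VALUES (?, ?, ?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


events = complete_events(ROWS)
for event in events:
    print(event)
```

Like the original query, this relies on milliseconds increasing with actionSequence, and two starts in a row before a single end would both be paired with that same end.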

SQL Server - highlight sequence of rows with specific conditions [closed]

From my query I get:
----------
| Val | Avg |
----------
| 1 | 7 |
----------
| 5 | 7 |
----------
| 2 | 7 |
----------
| 5 | 7 |
----------
| 6 | 7 |
----------
| 5 | 7 |
Assume the above is in some table "t". I want to check whether there are three or more rows where the value is less than the corresponding average. If three or more rows satisfy that condition, they should be highlighted in a result like this:
----------
| Val | Avg | BelowAvg |
----------
| 8 | 7 | 0 |
----------
| 7 | 7 | 0 |
----------
| 9 | 7 | 0 |
----------
| 5 | 7 | 1 |
----------
| 6 | 7 | 1 |
----------
| 5 | 7 | 1 |
Any suggestions?

This works with your previous data. A problem will arise when another set of Val rows has the same average:
SQL Fiddle
MS SQL Server 2012 Schema Setup:
CREATE TABLE t
([Val] int, [Avg] int)
;
INSERT INTO t
([Val], [Avg])
VALUES
(1, 3),
(5, 3),
(2, 3),
(5, 7),
(6, 7),
(5, 7)
;
Query 1:
SELECT t.*,
CASE WHEN t2.cnt >= 3 THEN 1 ELSE 0 END as BelowAvg
FROM t
LEFT OUTER JOIN (SELECT avg, count(*) as cnt
FROM t
WHERE val < avg
GROUP BY avg) t2 ON t.avg = t2.avg
Results:
| VAL | AVG | BELOWAVG |
|-----|-----|----------|
| 1 | 3 | 0 |
| 5 | 3 | 0 |
| 2 | 3 | 0 |
| 5 | 7 | 1 |
| 6 | 7 | 1 |
| 5 | 7 | 1 |
EDIT: Assuming that this is related to a question, you can have something like this :
SQL Fiddle
MS SQL Server 2012 Schema Setup:
CREATE TABLE t
([QuestionID] int, [Val] int, [Avg] int)
;
INSERT INTO t
([QuestionID], [Val], [Avg])
VALUES
(1, 1, 3),
(1, 5, 3),
(1, 2, 3),
(2, 5, 7),
(2, 6, 7),
(2, 5, 7)
;
Query 1:
SELECT t.*,
CASE WHEN t2.cnt >= 3 THEN 1 ELSE 0 END as BelowAvg
FROM t
LEFT OUTER JOIN (SELECT QuestionID, count(*) as cnt
FROM t
WHERE val < avg
GROUP BY QuestionID) t2 ON t.QuestionID = t2.QuestionID
Results:
| QUESTIONID | VAL | AVG | BELOWAVG |
|------------|-----|-----|----------|
| 1 | 1 | 3 | 0 |
| 1 | 5 | 3 | 0 |
| 1 | 2 | 3 | 0 |
| 2 | 5 | 7 | 1 |
| 2 | 6 | 7 | 1 |
| 2 | 5 | 7 | 1 |
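As an alternative to the self-join, the same flag can be computed with a conditional SUM used as a window function. This is a sketch of a different technique from the query above, run on SQLite 3.25 or later from Python; it counts below-average rows per QuestionID and does not check whether they are consecutive.

```python
import sqlite3

# (QuestionID, Val, Avg) from the second schema above.
ROWS = [
    (1, 1, 3), (1, 5, 3), (1, 2, 3),
    (2, 5, 7), (2, 6, 7), (2, 5, 7),
]

QUERY = """
SELECT QuestionID, Val, "Avg",
       CASE WHEN SUM(CASE WHEN Val < "Avg" THEN 1 ELSE 0 END)
                     OVER (PARTITION BY QuestionID) >= 3
            THEN 1 ELSE 0 END AS BelowAvg
FROM t
ORDER BY QuestionID
"""


def flag_below_avg(rows):
    """Flag every row of a question that has 3+ below-average values."""
    con = sqlite3.connect(":memory:")
    con.execute(
        'CREATE TABLE t (QuestionID INTEGER, Val INTEGER, "Avg" INTEGER)'
    )
    con.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


flagged = flag_below_avg(ROWS)
print([row[3] for row in flagged])  # [0, 0, 0, 1, 1, 1]
```

Because the window has no ORDER BY, the sum covers the whole partition, so every row of a question receives the same flag.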
