row_number over multiple columns based on trigger - sql

I'm trying to factor in multiple conditions to a dataset I'm working with. Row_number seems like the way to go with lag function in a second query but I can't quite get it 100%.
Here is how my data is structured:
CREATE TABLE emailhell(
mainID INTEGER NOT NULL PRIMARY KEY
,acctID VARCHAR(4) NOT NULL
,emailID VARCHAR(2) NOT NULL
,type INTEGER NOT NULL
,created DATETIME NOT NULL
);
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (1,'1234','1',6,'1/1/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (2,'1234','1',11,'1/1/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (3,'1234','2',6,'1/2/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (4,'1234','3',6,'1/3/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (5,'1234','4',6,'1/4/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (6,'ABC','89',6,'1/5/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (7,'ABC','90',6,'1/6/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (8,'ABC','90',11,'1/7/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (9,'258','22',6,'1/7/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (10,'258','1',6,'1/10/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (11,'258','2',6,'1/30/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (12,'258','3',6,'1/31/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (13,'258','29',6,'2/15/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (14,'258','29',11,'2/16/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (15,'258','31',6,'3/1/2018');
and my desired output
+--------+--------+---------+------+-----------+-------+------------+
| mainID | acctID | emailID | type | created | index | touchcount |
+--------+--------+---------+------+-----------+-------+------------+
| 1 | 1234 | 1 | 6 | 1/1/2018 | 1 | |
| 2 | 1234 | 1 | 11 | 1/1/2018 | 2 | 1 |
| 3 | 1234 | 2 | 6 | 1/2/2018 | 1 | |
| 4 | 1234 | 3 | 6 | 1/3/2018 | 2 | |
| 5 | 1234 | 4 | 6 | 1/4/2018 | 3 | |
| 6 | ABC | 89 | 6 | 1/5/2018 | 1 | |
| 7 | ABC | 90 | 6 | 1/6/2018 | 2 | |
| 8 | ABC | 90 | 11 | 1/7/2018 | 3 | 2 |
| 9 | 258 | 22 | 6 | 1/7/2018 | 1 | |
| 10 | 258 | 1 | 6 | 1/10/2018 | 2 | |
| 11 | 258 | 2 | 6 | 1/30/2018 | 3 | |
| 12 | 258 | 3 | 6 | 1/31/2018 | 4 | |
| 13 | 258 | 29 | 6 | 2/15/2018 | 5 | |
| 14 | 258 | 29 | 11 | 2/16/2018 | 6 | 5 |
| 15 | 258 | 31 | 6 | 3/1/2018 | 1 | |
+--------+--------+---------+------+-----------+-------+------------+
Here's what I was working with but It's having issues for some reason when the activity looks like, Type 6 followed by an 11 followed by a 6, 11, etc. Here's my start of the query and I'm sure there's a better way to do this. I am then doing a similar query with the LAG function to grab the times where type 11 appeared.
SELECT dm.TABLE.*,
row_number() over(partition by dm.acctId, dm.type order by dm.acctId, dm.created_date) as index into dm.table2
from dm.TABLE with (NOLOCK)

You are defining groups by acctId and 11. Then for the 11s, you want one less than the size of the group. So, cumulative sum and some other stuff:
select t.*,
row_number() over (partition by acctId, grp order by mainId) as index,
(case when type = 11
then count(*) over (partition by acctId, grp ) - 1
end) as touchcount
from (select t.*,
sum(case when type = 11 then 1 else 0 end) over (partition by acctId order by mainId desc) as grp
from t
) t;
I should note that the definition of group requires counting backwards, rather than forwards. That is because 11 is included in the "previous" group rather than the first record in the "next" group.

Related

Count without using functions (like count) oracle

I have two tables:
TABLE A :
CREATE TABLE z_ostan ( id NUMBER PRIMARY KEY,
name VARCHAR2(30) NOT NULL CHECK (upper(name)=name)
);
TABLE B:
CREATE TABLE z_shahr ( id NUMBER PRIMARY KEY,
name VARCHAR2(30) NOT NULL CHECK (upper(name)=name),
ref_ostan NUMBER,
CONSTRAINT fk_ref_ostan FOREIGN KEY (ref_ostan) REFERENCES z_ostan(id)
);
How can I find the second and third place "id" from -Table A- The least used table B in the table? Without using predefined functions like "count()"
This only processes existing references to Table A.
Updated for oracle (used 12c)
Without using any aggregate or window functions:
Sample data for Table: tblb
+----+---------+---------+
| id | name | tbla_id |
+----+---------+---------+
| 1 | TBLB_01 | 1 |
| 2 | TBLB_02 | 1 |
| 3 | TBLB_03 | 1 |
| 4 | TBLB_04 | 1 | 4 rows
| 5 | TBLB_05 | 2 |
| 6 | TBLB_06 | 2 |
| 7 | TBLB_07 | 2 | 3 rows
| 8 | TBLB_08 | 3 |
| 9 | TBLB_09 | 3 |
| 10 | TBLB_10 | 3 |
| 11 | TBLB_11 | 3 |
| 12 | TBLB_12 | 3 |
| 13 | TBLB_13 | 3 | 6 rows
| 14 | TBLB_14 | 4 |
| 15 | TBLB_15 | 4 |
| 16 | TBLB_16 | 4 | 3 rows
| 17 | TBLB_17 | 5 | 1 row
| 18 | TBLB_18 | 6 |
| 19 | TBLB_19 | 6 | 2 rows
| 20 | TBLB_20 | 7 | 1 row
+----+---------+---------+
There are many ways to express this logic.
Step by step with CTE terms.
The intent is (for each set of tbla_id rows in tblb)
generate a row_number (n) for the rows in each partition.
We would normally use window functions for this.
But I assume these are not allowed.
Use this row_number (n) to determine the count of rows in each tbla_id partition.
To find that count per partition, find the last row in each partition (from step 1).
Order the results of step 2 by n of these last rows.
Choose the 2nd and 3rd row of this result
Done.
WITH first AS ( -- Find the first row per tbla_id
SELECT t1.*
FROM tblb t1
LEFT JOIN tblb t2
ON t1.id > t2.id
AND t1.tbla_id = t2.tbla_id
WHERE t2.id IS NULL
)
, rnum (id, name, tbla_id, n) AS ( -- Generate a row_number (n) for each tbla_id partition
SELECT f.*, 1 FROM first f UNION ALL
SELECT n.id, n.name, n.tbla_id, c.n+1
FROM rnum c
JOIN tblb n
ON c.tbla_id = n.tbla_id
AND c.id < n.id
LEFT JOIN tblb n2
ON n.tbla_id = n2.tbla_id
AND c.id < n2.id
AND n.id > n2.id
WHERE n2.id IS NULL
)
, last AS ( -- Find the last row in each partition to obtain the count of tbla_id references
SELECT t1.*
FROM rnum t1
LEFT JOIN rnum t2
ON t1.id < t2.id
AND t1.tbla_id = t2.tbla_id
WHERE t2.id IS NULL
)
SELECT * FROM last
ORDER BY n, tbla_id OFFSET 1 ROWS FETCH NEXT 2 ROWS ONLY
;
Final Result, where n is the count of references to tbla:
+------+---------+---------+------+
| id | name | tbla_id | n |
+------+---------+---------+------+
| 20 | TBLB_20 | 7 | 1 |
| 19 | TBLB_19 | 6 | 2 |
+------+---------+---------+------+
Some intermediate results...
last CTE term result. The 2nd and 3rd rows of this become the final result.
+------+---------+---------+------+
| id | name | tbla_id | n |
+------+---------+---------+------+
| 17 | TBLB_17 | 5 | 1 |
| 20 | TBLB_20 | 7 | 1 |
| 19 | TBLB_19 | 6 | 2 |
| 7 | TBLB_07 | 2 | 3 |
| 16 | TBLB_16 | 4 | 3 |
| 4 | TBLB_04 | 1 | 4 |
| 13 | TBLB_13 | 3 | 6 |
+------+---------+---------+------+
rnum CTE term result. This provides the row_number over tbla_id partitions ordered by id
+------+---------+---------+------+
| id | name | tbla_id | n |
+------+---------+---------+------+
| 1 | TBLB_01 | 1 | 1 |
| 2 | TBLB_02 | 1 | 2 |
| 3 | TBLB_03 | 1 | 3 |
| 4 | TBLB_04 | 1 | 4 |
| 5 | TBLB_05 | 2 | 1 |
| 6 | TBLB_06 | 2 | 2 |
| 7 | TBLB_07 | 2 | 3 |
| 8 | TBLB_08 | 3 | 1 |
| 9 | TBLB_09 | 3 | 2 |
| 10 | TBLB_10 | 3 | 3 |
| 11 | TBLB_11 | 3 | 4 |
| 12 | TBLB_12 | 3 | 5 |
| 13 | TBLB_13 | 3 | 6 |
| 14 | TBLB_14 | 4 | 1 |
| 15 | TBLB_15 | 4 | 2 |
| 16 | TBLB_16 | 4 | 3 |
| 17 | TBLB_17 | 5 | 1 |
| 18 | TBLB_18 | 6 | 1 |
| 19 | TBLB_19 | 6 | 2 |
| 20 | TBLB_20 | 7 | 1 |
+------+---------+---------+------+
There are a few other ways to tackle this problem in just SQL.

Copy value into rows below until greater value is found in SQL

I have been working on copying first sequential value in "episode" until another value > than itself is found(see column "episode_final" below) without too much luck. The logic should partition the data by id ordered by date in SQL server 2012. Any help will be appreciated.
You can try to use LEAD window function get the episode next value.
Then use CASE WHEN check episode> nextVal does increase 1.
CREATE TABLE T(
id varchar(50),
date date,
episode int
);
INSERT INTO T VALUES (123,'2018-01-01',1);
INSERT INTO T VALUES (123,'2018-01-02',1);
INSERT INTO T VALUES (123,'2018-01-10',1);
INSERT INTO T VALUES (123,'2018-01-11',1);
INSERT INTO T VALUES (123,'2018-01-12',1);
INSERT INTO T VALUES (123,'2018-01-20',2);
INSERT INTO T VALUES (123,'2018-03-20',1);
INSERT INTO T VALUES (123,'2018-05-01',1);
INSERT INTO T VALUES (123,'2018-05-10',3);
INSERT INTO T VALUES (123,'2018-05-20',1);
INSERT INTO T VALUES (345,'2018-06-20',1);
INSERT INTO T VALUES (345,'2018-07-21',1);
INSERT INTO T VALUES (345,'2018-07-22',2);
Query 1:
SELECT t1.Id,
t1.Date,
t1.episode,
(SUM(CASE WHEN episode> coalesce(nextVal,preVal) THEN 1 ELSE 0 END) over (partition by id order by [date]) + 1) episode_final
FROM (
SELECT T.*,LEAD(episode) over (partition by id order by [date]) nextVal,
LAG(episode) over (partition by id order by [date]) preVal
FROM T
)t1
Results:
| Id | Date | episode | episode_final |
|-----|------------|---------|---------------|
| 123 | 2018-01-01 | 1 | 1 |
| 123 | 2018-01-02 | 1 | 1 |
| 123 | 2018-01-10 | 1 | 1 |
| 123 | 2018-01-11 | 1 | 1 |
| 123 | 2018-01-12 | 1 | 1 |
| 123 | 2018-01-20 | 2 | 2 |
| 123 | 2018-03-20 | 1 | 2 |
| 123 | 2018-05-01 | 1 | 2 |
| 123 | 2018-05-10 | 3 | 3 |
| 123 | 2018-05-20 | 1 | 3 |
| 345 | 2018-06-20 | 1 | 1 |
| 345 | 2018-07-21 | 1 | 1 |
| 345 | 2018-07-22 | 2 | 2 |

Get the Id of the matched data from other table. No duplicates of ID from both tables

Here is my table A.
| Id | GroupId | StoreId | Amount |
| 1 | 20 | 7 | 15000 |
| 2 | 20 | 7 | 1230 |
| 3 | 20 | 7 | 14230 |
| 4 | 20 | 7 | 9540 |
| 5 | 20 | 7 | 24230 |
| 6 | 20 | 7 | 1230 |
| 7 | 20 | 7 | 1230 |
Here is my table B.
| Id | GroupId | StoreId | Credit |
| 12 | 20 | 7 | 1230 |
| 14 | 20 | 7 | 15000 |
| 15 | 20 | 7 | 14230 |
| 16 | 20 | 7 | 1230 |
| 17 | 20 | 7 | 7004 |
| 18 | 20 | 7 | 65523 |
I want to get this result without getting duplicate Id of both table.
I need to get the Id of table B and A where the Amount = Credit.
| A.ID | B.ID | Amount |
| 1 | 14 | 15000 |
| 2 | 12 | 1230 |
| 3 | 15 | 14230 |
| 4 | null | 9540 |
| 5 | null | 24230 |
| 6 | 16 | 1230 |
| 7 | null | 1230 |
My problem is when I have 2 or more same Amount in table A, I get duplicate ID of table B. which should be null. Please help me. Thank you.
I think you want a left join. But this is tricky because you have duplicate amounts, but you only want one to match. The solution is to use row_number():
select . . .
from (select a.*, row_number() over (partition by amount order by id) as seqnum
from a
) a left join
(select b.*, row_number() over (partition by credit order by id) as seqnum
from b
)b
on a.amount = b.credit and a.seqnum = b.seqnum;
Another approach, I think simplier and shorter :)
select ID [A.ID],
(select top 1 ID from TABLE_B where Credit = A.Amount) [B.ID],
Amount
from TABLE_A [A]

How to get values of rows and columns

I have two tables.
Student Table
Property Table
Result Table
How can I get the value of Student Table and the property ID of the column fron the Property table and merge that into the Result table?
Any advice would be helpful.
Update #1:
I tried using Christian Moen 's suggestion, this is what i get.
You need to UNPIVOT the Student's columns first, to get the columns (properties names) in one column as rows. Then join with the Property table based on the property name like this:
WITH UnPivoted
AS
(
SELECT ID, value,col
FROM
(
SELECT ID,
CAST(Name AS NVARCHAR(50)) AS Name,
CAST(Class AS NVARCHAR(50)) AS Class,
CAST(ENG AS NVARCHAR(50)) AS ENG,
CAST(TAM AS NVARCHAR(50)) AS TAM,
CAST(HIN AS NVARCHAR(50)) AS HIN,
CAST(MAT AS NVARCHAR(50)) AS MAT,
CAST(PHY AS NVARCHAR(50)) AS PHY
FROM Student
) AS s
UNPIVOT
(value FOR col IN
([Name], [class], [ENG], [TAM], [HIN], [MAT], [PHY])
)AS unpvt
)
SELECT
ROW_NUMBER() OVER(ORDER BY u.ID,PropertyID) AS ID,
p.PropertyID,
u.Value,
u.ID AS StudID
FROM Property AS p
INNER JOIN UnPivoted AS u ON p.PropertyName = u.col;
For the first ID, I used the ranking function ROW_NUMBER() to generate this sequence id.
This will give the exact results that you are looking for.
Results:
| ID | PropertyID | Value | StudID |
|----|------------|--------|--------|
| 1 | 1 | Jack | 1 |
| 2 | 2 | 10 | 1 |
| 3 | 3 | 89 | 1 |
| 4 | 4 | 88 | 1 |
| 5 | 5 | 45 | 1 |
| 6 | 6 | 100 | 1 |
| 7 | 7 | 98 | 1 |
| 8 | 1 | Jill | 2 |
| 9 | 2 | 10 | 2 |
| 10 | 3 | 89 | 2 |
| 11 | 4 | 99 | 2 |
| 12 | 5 | 100 | 2 |
| 13 | 6 | 78 | 2 |
| 14 | 7 | 91 | 2 |
| 15 | 1 | Trevor | 3 |
| 16 | 2 | 12 | 3 |
| 17 | 3 | 100 | 3 |
| 18 | 4 | 50 | 3 |
| 19 | 5 | 49 | 3 |
| 20 | 6 | 94 | 3 |
| 21 | 7 | 100 | 3 |
| 22 | 1 | Jim | 4 |
| 23 | 2 | 8 | 4 |
| 24 | 3 | 100 | 4 |
| 25 | 4 | 91 | 4 |
| 26 | 5 | 92 | 4 |
| 27 | 6 | 100 | 4 |
| 28 | 7 | 100 | 4 |
Other option is to use of apply if you don't want to go unpivot way
select row_number() over (order by (select 1)) ID, p.PropertyID [PropID], a.Value, a.StuID
from Student s
cross apply
(
values (s.ID, 'Name', s.Name),
(s.ID, 'Class', cast(s.Class as varchar)),
(s.ID, 'ENG', cast(s.ENG as varchar)),
(s.ID, 'TAM', cast(s.TAM as varchar)),
(s.ID, 'HIN', cast(s.HIN as varchar)),
(s.ID, 'MAT', cast(s.MAT as varchar)),
(s.ID, 'PHY', cast(s.PHY as varchar))
) as a(StuID, Property, Value)
join Property p on p.PropertyName = a.Property

Sequential Group By in sql server

For this Table:
+----+--------+-------+
| ID | Status | Value |
+----+--------+-------+
| 1 | 1 | 4 |
| 2 | 1 | 7 |
| 3 | 1 | 9 |
| 4 | 2 | 1 |
| 5 | 2 | 7 |
| 6 | 1 | 8 |
| 7 | 1 | 9 |
| 8 | 2 | 1 |
| 9 | 0 | 4 |
| 10 | 0 | 3 |
| 11 | 0 | 8 |
| 12 | 1 | 9 |
| 13 | 3 | 1 |
+----+--------+-------+
I need to sum sequential groups with the same Status to produce this result.
+--------+------------+
| Status | Sum(Value) |
+--------+------------+
| 1 | 20 |
| 2 | 8 |
| 1 | 17 |
| 2 | 1 |
| 0 | 15 |
| 1 | 9 |
| 3 | 1 |
+--------+------------+
How can I do that in SQL Server?
NB: The values in the ID column are contiguous.
Per the tag I added to your question this is a gaps and islands problem.
The best performing solution will likely be
WITH T
AS (SELECT *,
ID - ROW_NUMBER() OVER (PARTITION BY [STATUS] ORDER BY [ID]) AS Grp
FROM YourTable)
SELECT [STATUS],
SUM([VALUE]) AS [SUM(VALUE)]
FROM T
GROUP BY [STATUS],
Grp
ORDER BY MIN(ID)
If the ID values were not guaranteed contiguous as stated then you would need to use
ROW_NUMBER() OVER (ORDER BY [ID]) -
ROW_NUMBER() OVER (PARTITION BY [STATUS] ORDER BY [ID]) AS Grp
Instead in the CTE definition.
SQL Fiddle