Merging data in a single SQL table without a Cursor - sql

I have a table with an ID column and another column with a number. One ID can have multiple numbers. For example
ID | Number
1 | 25
1 | 26
1 | 30
1 | 24
2 | 4
2 | 8
2 | 5
Now, based on this data, I want a new table that looks like this:
ID | Low | High
1 | 24 | 26
1 | 30 | 30
2 | 4 | 5
2 | 8 | 8
As you can see, I want to merge rows whose numbers are consecutive, like 24, 25, 26: the low is 24, the high is 26, and 30 remains a separate range. I am dealing with large amounts of data, so I would prefer not to use a cursor for performance's sake (that is what I was doing before, and it slowed things down quite a bit). What is the best way to achieve this? I'm no SQL pro, so I'm not sure whether there is a function that would make this easier, or what the fastest approach would be.
Thanks for the help.

The key observation is that if you subtract an increasing sequence (1, 2, 3, ...) from a run of consecutive numbers, the difference is constant within each run. We can generate that sequence using row_number. The difference then identifies all the groups:
select id, MIN(number) as low, MAX(number) as high
from (select t.*,
(number - ROW_NUMBER() over (partition by id order by number) ) as groupnum
from t
) t
group by id, groupnum
The rest is just aggregation.
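To see the row_number trick on the question's sample data, here is a minimal sketch in Python. It assumes SQLite (3.25 or later, for window functions) as a stand-in for the asker's database; the table name t and the alias s are only illustrative.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, number INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 25), (1, 26), (1, 30), (1, 24), (2, 4), (2, 8), (2, 5)],
)

# number - ROW_NUMBER() stays constant inside a run of consecutive numbers,
# so it can serve as the grouping key for each range.
ranges = conn.execute(
    """
    SELECT id, MIN(number) AS low, MAX(number) AS high
    FROM (
        SELECT id, number,
               number - ROW_NUMBER() OVER (PARTITION BY id ORDER BY number) AS groupnum
        FROM t
    ) AS s
    GROUP BY id, groupnum
    ORDER BY id, low
    """
).fetchall()

for row in ranges:
    print(row)
# (1, 24, 26)
# (1, 30, 30)
# (2, 4, 5)
# (2, 8, 8)
```

For id 1 the sorted numbers 24, 25, 26, 30 get row numbers 1 to 4, so the differences are 23, 23, 23, 26: two groups, as expected. Note that this relies on the numbers being unique within an id.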

Solution with CTE and recursion:
WITH CTE AS (
SELECT T.ID, T.NUMBER, T.NUMBER AS GRP
FROM T
LEFT OUTER JOIN T T2 ON T.ID = T2.ID AND T.NUMBER -1 = T2.NUMBER
WHERE T2.ID IS NULL
UNION ALL
SELECT T.ID, T.NUMBER, GRP
FROM CTE
INNER JOIN T
ON T.ID = CTE.ID AND T.NUMBER = CTE.NUMBER + 1
)
SELECT ID, MIN(NUMBER) AS Low, MAX(NUMBER) AS High
FROM CTE
GROUP BY ID, GRP

I'd suggest using a WHILE loop structure with a table variable instead of the cursor.
For example,
DECLARE @TableVariable TABLE
(
MyID int IDENTITY (1, 1) PRIMARY KEY NOT NULL,
[ID] int,
[Number] int
)
DECLARE @Count int, @Max int
INSERT INTO @TableVariable (ID, Number)
SELECT ID, Number
FROM YourSourceTable
SELECT @Count = 1, @Max = MAX(MyID)
FROM @TableVariable
WHILE @Count <= @Max
BEGIN
-- ...do your processing here...
SET @Count = @Count + 1
END

CREATE TABLE Table1
([ID] int, [Number] int)
;
INSERT INTO Table1
([ID], [Number])
VALUES
(1, 25),
(1, 26),
(1, 30),
(1, 24),
(2, 4),
(2, 8),
(2, 5)
;
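-- Note: this matches the expected output only when each ID has a single
-- consecutive run followed by one isolated highest value, as in the sample.
select ID,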
MIN(Number)
,(SELECT MIN(Number)
FROM (SELECT TOP 2 Number from Table1 WHERE ID =
T1.Id ORDER BY Number DESC) as DT)
from Table1 as T1
GROUP BY ID
UNION
SELECT ID, MAX(Number), MAX(Number)
FROM Table1 as T1
GROUP BY ID;

Related

SQL count number of records where value remains constant

I need to find the count of tracker_id values whose position remains 1 throughout the table.
tracker_id | position
---------------------
5 | 1
11 | 1
4 | 1
4 | 2
5 | 2
4 | 1
4 | 1
11 | 1
14 | 1
9 | 2
Here, the output should be 2, since the position of tracker_id 11 and 14 remains 1 throughout the table.
You can use not exists
select count(distinct a.tracker_id) from tbl a
where not exists(select 1
from tbl b
where a.tracker_id = b.tracker_id
and a.position <> b.position )
and a.position = 1
Output: 2
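One detail worth checking with NOT EXISTS is what gets counted. The sketch below runs the idea on the sample data; it assumes SQLite as a stand-in database and a table named tbl. COUNT(*) counts matching rows, so a tracker that appears twice is counted twice, while COUNT(DISTINCT tracker_id) counts trackers.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (tracker_id INTEGER, position INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [(5, 1), (11, 1), (4, 1), (4, 2), (5, 2),
     (4, 1), (4, 1), (11, 1), (14, 1), (9, 2)],
)

# Shared filter: rows at position 1 whose tracker never has another position.
where_clause = """
    WHERE a.position = 1
      AND NOT EXISTS (SELECT 1
                      FROM tbl b
                      WHERE b.tracker_id = a.tracker_id
                        AND b.position <> a.position)
"""

row_count = conn.execute(
    "SELECT COUNT(*) FROM tbl a" + where_clause
).fetchone()[0]
tracker_count = conn.execute(
    "SELECT COUNT(DISTINCT a.tracker_id) FROM tbl a" + where_clause
).fetchone()[0]

print(row_count)      # 3: tracker 11 contributes two rows, tracker 14 one
print(tracker_count)  # 2: trackers 11 and 14
```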
declare @table1 as table (tracker_id int, position int)
insert into @table1 values (5,1)
insert into @table1 values (11,1)
insert into @table1 values (4,1)
insert into @table1 values (4,2)
insert into @table1 values (5,2)
insert into @table1 values (4,1)
insert into @table1 values (4,1)
insert into @table1 values (11,1)
insert into @table1 values (14,1)
insert into @table1 values (9,2)
select count(tracker_id),tracker_id,position from @table1 group by tracker_id,position
You can also do:
select ( count(distinct tracker_id) -
count(distinct tracker_id) filter (where position <> 1)
) as num_all_1s
from t;
Using uncorrelated subquery
select count(distinct tracker_id)
from t
where position=1
and tracker_id not in (select tracker_id from t where position<>1);
Using window function
select count(distinct tracker_id)
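-- position * 1.0 avoids integer averaging (in SQL Server, AVG over an int column returns an int)
from (select *, avg(position * 1.0) over (partition by tracker_id) as avg_pos from t) a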
where avg_pos=1;
This one is just for giggles
select distinct count(*) over ()
from t
group by tracker_id
having count(*) = sum(position);
And if you really want to have fun
select count(distinct tracker_id)-count(distinct case when position<>1 then tracker_id end)
from t;
If you want the trackers whose position is always 1, you can use this: it first finds every tracker_id that has only a single distinct position value, and then keeps only those whose position is 1:
WITH agg AS
(
SELECT
tracker_id
, p = MAX(position)
FROM table1
GROUP BY tracker_id
HAVING COUNT(DISTINCT position) = 1
)
SELECT COUNT(tracker_id)
FROM agg
WHERE p = 1

Is it possible to find (in an ordered table) multiple rows in sequence? [duplicate]

If I have a table ordered by ID like so:
|---------------------|------------------|
| ID | Key |
|---------------------|------------------|
| 1 | Foo |
|---------------------|------------------|
| 2 | Bar |
|---------------------|------------------|
| 3 | Test |
|---------------------|------------------|
| 4 | Test |
|---------------------|------------------|
Is there a way to detect two rows that match a where clause in sequence?
For example, in the table above, I would like to see if any two rows in succession have a Key of 'test'.
Is this possible in SQL?
Another option is a variation of Gaps-and-Islands
Example
Declare @YourTable Table ([ID] int,[Key] varchar(50))
Insert Into @YourTable Values
(1,'Foo')
,(2,'Bar')
,(3,'Test')
,(4,'Test')
Select ID_R1 = min(ID)
,ID_R2 = max(ID)
,[Key]
From (
Select *
,Grp = ID-Row_Number() over(Partition By [Key] Order by ID)
From @YourTable
) A
Group By [Key],Grp
Having count(*)>1
Returns
ID_R1 ID_R2 Key
3 4 Test
EDIT - Just in case the IDs are NOT Sequential
Select ID_R1 = min(ID)
,ID_R2 = max(ID)
,[Key]
From (
Select *
,Grp = Row_Number() over(Order by ID)
-Row_Number() over(Partition By [Key] Order by ID)
From @YourTable
) A
Group By [key],Grp
Having count(*)>1
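To illustrate why the difference of two row numbers works even when the IDs have gaps, here is a small sketch. It assumes SQLite (3.25 or later) as a stand-in, and the table name your_table and the sample IDs (10, 20, 30, 45, 50) are made up for the illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE your_table (ID INTEGER, "Key" TEXT)')
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    # Hypothetical data: non-sequential IDs, and one 'Test' row that is
    # not adjacent to the other two.
    [(10, "Foo"), (20, "Test"), (30, "Bar"), (45, "Test"), (50, "Test")],
)

# The overall row number minus the per-key row number is constant only
# while rows with the same key follow each other without interruption.
islands = conn.execute(
    """
    SELECT MIN(ID) AS id_r1, MAX(ID) AS id_r2, "Key"
    FROM (
        SELECT ID, "Key",
               ROW_NUMBER() OVER (ORDER BY ID)
               - ROW_NUMBER() OVER (PARTITION BY "Key" ORDER BY ID) AS grp
        FROM your_table
    ) AS a
    GROUP BY "Key", grp
    HAVING COUNT(*) > 1
    """
).fetchall()

print(islands)  # [(45, 50, 'Test')]
```

The 'Test' row with ID 20 gets grp 1, while IDs 45 and 50 both get grp 2, so only the adjacent pair is reported.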
You can use the ROW_NUMBER window function to detect the gaps.
SELECT [Key]
FROM (
SELECT *,ROW_NUMBER() OVER(ORDER BY ID) -
ROW_NUMBER() OVER(PARTITION BY [Key] ORDER BY ID) grp
FROM T
)t1
GROUP BY [Key], grp
HAVING COUNT(*) >= 2
You can do a self join as
CREATE TABLE T(
ID INT,
[Key] VARCHAR(45)
);
INSERT INTO T VALUES
(1, 'Foo'),
(2, 'Bar'),
(3, 'Test'),
(4, 'Test');
SELECT MIN(T1.ID) One,
MAX(T2.ID) Two,
T1.[Key] OnKey
FROM T T1 JOIN T T2
ON T1.[Key] = T2.[Key]
AND
T2.ID = T1.ID + 1 -- adjacent rows only; assumes the IDs have no gaps
GROUP BY T1.[Key];
Or a CROSS JOIN as
SELECT MIN(T1.ID) One,
MAX(T2.ID) Two,
T1.[Key] OnKey
FROM T T1 CROSS JOIN T T2
WHERE T1.[Key] = T2.[Key]
AND
T2.ID = T1.ID + 1 -- adjacent rows only; assumes the IDs have no gaps
GROUP BY T1.[Key]
You can use the LEAD() window function, as in:
with
x as (
select
id, [key],
lead(id) over(order by id) as next_id,
lead([key]) over(order by id) as next_key
from my_table
)
select id, next_id from x where [key] = 'test' and next_key = 'test'

How can I select distinct by one column?

I have a table with the columns below. If a COD is duplicated, I need the row with the non-NULL VALUE; if it is not duplicated, the VALUE may be NULL. For example:
I'm using SQL SERVER.
This is what I get:
COD ID VALUE
28 1 NULL
28 2 Supermarket
29 1 NULL
29 2 School
29 3 NULL
30 1 NULL
This is what I want:
COD ID VALUE
28 2 Supermarket
29 2 School
30 1 NULL
What I'm tryin' to do:
;with A as (
(select DISTINCT COD,ID,VALUE from CodId where ID = 2)
UNION
(select DISTINCT COD,ID,NULL from CodId where ID != 2)
)select * from A order by COD
You can try this.
DECLARE @T TABLE (COD INT, ID INT, VALUE VARCHAR(20))
INSERT INTO @T
VALUES(28, 1, NULL),
(28, 2 ,'Supermarket'),
(29, 1 ,NULL),
(29, 2 ,'School'),
(29, 3 ,NULL),
(30, 1 ,NULL)
;WITH CTE AS (
SELECT *, RN= ROW_NUMBER() OVER (PARTITION BY COD ORDER BY VALUE DESC) FROM @T
)
SELECT COD, ID ,VALUE FROM CTE
WHERE RN = 1
Result:
COD ID VALUE
----------- ----------- --------------------
28 2 Supermarket
29 2 School
30 1 NULL
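The ROW_NUMBER approach depends on NULLs sorting after real values under ORDER BY VALUE DESC. That holds in SQL Server, but PostgreSQL, for example, puts NULLs first on a descending sort by default. The sketch below makes the NULL handling explicit with a CASE expression; it assumes SQLite (3.25 or later) as a stand-in, and the table name cod_id is illustrative.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cod_id (cod INTEGER, id INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO cod_id VALUES (?, ?, ?)",
    [(28, 1, None), (28, 2, "Supermarket"),
     (29, 1, None), (29, 2, "School"), (29, 3, None),
     (30, 1, None)],
)

# Rank rows per cod so that non-NULL values come first, regardless of how
# the database orders NULLs by default, then keep the top row of each cod.
picked = conn.execute(
    """
    SELECT cod, id, value
    FROM (
        SELECT cod, id, value,
               ROW_NUMBER() OVER (
                   PARTITION BY cod
                   ORDER BY CASE WHEN value IS NULL THEN 1 ELSE 0 END,
                            value DESC
               ) AS rn
        FROM cod_id
    ) AS ranked
    WHERE rn = 1
    ORDER BY cod
    """
).fetchall()

for row in picked:
    print(row)
# (28, 2, 'Supermarket')
# (29, 2, 'School')
# (30, 1, None)
```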
Another option is to use the WITH TIES clause in concert with Row_Number()
Example
Select top 1 with ties *
from YourTable
Order By Row_Number() over (Partition By [COD] order by Value Desc)
Returns
COD ID VALUE
28 2 Supermarket
29 2 School
30 1 NULL
I would use GROUP BY and JOIN. If there is no non-NULL value for a COD, then it is resolved by the OR in the JOIN clause.
SELECT your_table.*
FROM your_table
JOIN (
SELECT COD, MAX(value) value
FROM your_table
GROUP BY COD
) gt ON your_table.COD = gt.COD and (your_table.value = gt.value OR gt.value IS NULL)
If you may have more than one non null value for a COD this will work
drop table MyTable
CREATE TABLE MyTable
(
COD INT,
ID INT,
VALUE VARCHAR(20)
)
INSERT INTO MyTable
VALUES (28,1, NULL),
(28,2,'Supermarket'),
(28,3,'School'),
(29,1,NULL),
(29,2,'School'),
(29,3,NULL),
(30,1,NULL);
WITH Dups AS
(SELECT COD FROM MyTable GROUP BY COD HAVING count (*) > 1 )
SELECT MyTable.COD,MyTable.ID,MyTable.VALUE FROM MyTable
INNER JOIN dups ON MyTable.COD = Dups.COD
WHERE value IS NOT NULL
UNION
SELECT MyTable.COD,MyTable.ID,MyTable.VALUE FROM MyTable
LEFT JOIN dups ON MyTable.COD = Dups.COD
WHERE dups.cod IS NULL

Compare two number SQL

In SQL, I am trying to compare two numbers in the same field. The two numbers carry different information, but for some technical reason they are the same. The problem arises when one number has length 5, another has length 4, and the last 4 digits of both are the same. In that case I want to get the first one, with length 5.
Example:
--------------------------------
|ID | Number| Description |
---------------------------------
| 1 | 12345 | Project X,Ready |
---------------------------------
| 2 | 2345 | Project X,onDesign |
---------------------------------
I should always get 12345 (or the biggest one) if there are numbers with the same last 4 digits. Is there a CASE or CTE statement that gives an easy solution to this issue?
Try this:
SELECT Id
,Number
,Description
FROM (
SELECT Id
,Number
,Description
,rank() OVER (PARTITION BY right(cast([Number] AS VARCHAR(20)), 4) ORDER BY Number DESC) AS Ranking
FROM YourTable
) InnerTable
WHERE ranking = 1
Here is an example with not exists:
DECLARE @t TABLE
(
ID INT ,
Number INT ,
Description VARCHAR(100)
)
INSERT INTO @t
VALUES ( 1, 12345, 'Project 1' ),
( 2, 2345, 'Project 2' ),
( 3, 77777, 'Project 3' ),
( 4, 7777, 'Project 4' ),
( 5, 88888, 'Project 5' ),
( 6, 9999, 'Project 6' )
SELECT * FROM @t t1
WHERE NOT EXISTS(SELECT * FROM @t t2
WHERE t2.ID <> t1.ID AND
CAST(t2.Number AS VARCHAR(10)) LIKE '%' + CAST(t1.Number AS VARCHAR(10)))
Output:
ID Number Description
1 12345 Project 1
3 77777 Project 3
5 88888 Project 5
6 9999 Project 6
So you need to join using last 4 digits. You could do this by using simple MOD operator. It's used as a percentage sign in SQL Server.
SELECT 12345 % 10000;
This outputs 2345. Exactly what we are looking for.
So we could build the following query to use that calculation:
DECLARE @Test TABLE
(
ID INT
, Number INT
, Description VARCHAR(500)
);
INSERT INTO @Test(ID, Number, Description)
VALUES (1, 12345, 'Project X,Ready')
, (2, 2345, 'Project X,onDesign');
SELECT T1.*
FROM @Test AS T1
INNER JOIN @Test AS T2
ON T2.Number = T1.Number % 10000
WHERE T2.Number <> T1.Number;
Output:
╔════╦════════╦═════════════════╗
║ ID ║ Number ║ Description ║
╠════╬════════╬═════════════════╣
║ 1 ║ 12345 ║ Project X,Ready ║
╚════╩════════╩═════════════════╝
Note that I've added WHERE T2.Number <> T1.Number. It eliminates equal numbers, because SELECT 2345 % 10000 is 2345 as well.
Update
This could be done using ROW_NUMBER()
;WITH Data (ID, Number, Description, RN)
AS (
SELECT ID
, Number
, Description
, ROW_NUMBER() OVER (PARTITION BY Number % 10000 ORDER BY Number DESC)
FROM @Test
)
SELECT *
FROM Data
WHERE RN = 1;
This will do the classic row_number stuff. It will partition windows by Number % 10000, which means that 12345 and 2345 will fall under same window and the highest number will always come first.
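As a quick check of the partitioning idea, here is a sketch that runs the same kind of query in Python. It assumes SQLite (3.25 or later) as a stand-in, the table name projects is illustrative, and the third row (9999, 'Project Y') is a made-up example of a number with no longer counterpart.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE projects (id INTEGER, number INTEGER, description TEXT)"
)
conn.executemany(
    "INSERT INTO projects VALUES (?, ?, ?)",
    [(1, 12345, "Project X,Ready"),
     (2, 2345, "Project X,onDesign"),
     (3, 9999, "Project Y")],  # hypothetical extra row
)

# Numbers sharing the same last four digits land in one partition;
# ordering by number DESC puts the largest of them at rn = 1.
winners = conn.execute(
    """
    SELECT id, number, description
    FROM (
        SELECT id, number, description,
               ROW_NUMBER() OVER (
                   PARTITION BY number % 10000
                   ORDER BY number DESC
               ) AS rn
        FROM projects
    ) AS ranked
    WHERE rn = 1
    ORDER BY id
    """
).fetchall()

for row in winners:
    print(row)
# (1, 12345, 'Project X,Ready')
# (3, 9999, 'Project Y')
```

Unlike the self join, this version also returns numbers that have no matching counterpart, which may or may not be what you want.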
Try this:
SELECT DISTINCT A.*
FROM [Tablename] AS A
INNER JOIN [Tablename] AS B
ON B.Number =RIGHT(A.Number,4)
WHERE B.Number <> A.Number;
RIGHT(A.Number, 4) returns the last 4 digits of A.Number, which the join then compares with B.Number.
The query might be RDBMS-specific. For example, with MSSQL you can do it like this:
SELECT *
FROM myTable AS d1
WHERE NOT EXISTS ( SELECT *
FROM myTable AS d2
WHERE SUBSTRING(d2.number, 2, 4) = d1.number );
EDIT: Ah, you edited and it is an INT! Then you can use the % operator instead of substring.
Sample with CTE:
DECLARE @dummy TABLE
(
id INT IDENTITY
PRIMARY KEY ,
number INT ,
[description] VARCHAR(20)
);
INSERT @dummy ( [number], [description] )
VALUES ( 12345, 'P' ),
( 22345, 'P' ),
( 2345, 'P' ),
( 3456, 'P' ),
( 13456, 'P' ),
( 4567, 'P' );
WITH d AS (
SELECT MAX(number) AS maxNum
FROM @dummy AS [d]
GROUP BY [d].[number] % 10000
)
SELECT d1.*
FROM @dummy AS [d1]
INNER JOIN d ON d.[maxNum] = d1.[number];

SQL - Identify Distinct Values Including Count For ALL Columns In A Table

I want to be able to identify the distinct values including a count of the value for each column in a table.
I reviewed - Get distinct records with counts
And it shows me how to do this for an individual column and works great. However, I have a table with over 600 columns, and coding each column would be incredibly time consuming.
Is there a way to code my sql where I could get these same results for all columns in a table, without having to individually input each column?
So to use the example from the link:
personid, msg
-------------
1, 'msg1'
2, 'msg2'
2, 'msg3'
3, 'msg4'
1, 'msg2'
My results would be:
personid, count | msg, count
-----------------------------
1, 2 | msg1, 1
2, 2 | msg2, 2
3, 1 | msg3, 1
_, _ | msg4, 1
Is this possible? I've tried getting at it using distincts and wildcards (*) but no luck.
Apologize if this isn't detailed enough, this is my first post and I'm no SQL expert, and Googling hasn't found an answer. Thanks.
I am not sure that it is convenient, but you can do it like this:
CREATE TABLE #temp (
personid int,
message nvarchar(max)
);
GO
INSERT INTO #temp
SELECT 1, 'msg1' UNION ALL
SELECT 2, 'msg2' UNION ALL
SELECT 2, 'msg3' UNION ALL
SELECT 3, 'msg4' UNION ALL
SELECT 1, 'msg2';
GO
SELECT
isnull(t1.rn, t2.rn) as rn,
t1.personid as personid, t1.cnt as personid_cnt,
t2.message as message, t2.cnt as message_cnt
FROM
(SELECT personid, count(*) as cnt,
ROW_NUMBER() over (order by personid) as rn
FROM #temp GROUP BY personid) t1
FULL JOIN
(SELECT message, count(*) as cnt,
ROW_NUMBER() over (order by message) as rn
FROM #temp GROUP BY message) t2
ON t1.rn = t2.rn
ORDER BY rn
DROP table #temp;
result:
rn personid personid_cnt message message_cnt
1 1 2 msg1 1
2 2 2 msg2 2
3 3 1 msg3 1
4 NULL NULL msg4 1
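For a table with hundreds of columns, one way to avoid typing each column is to read the column names from the database's metadata and generate one GROUP BY query per column. The sketch below shows the idea in Python against SQLite, which is an assumption made for the illustration; the helper names quote_ident and distinct_counts and the table name messages are hypothetical. In SQL Server the column list would come from a catalog view such as INFORMATION_SCHEMA.COLUMNS instead of PRAGMA table_info.

```python
import sqlite3


def quote_ident(name):
    """Quote an identifier for SQLite, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def distinct_counts(conn, table):
    """Return {column: [(value, count), ...]} for every column of the table."""
    table_q = quote_ident(table)
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table_q})")]
    result = {}
    for col in columns:
        col_q = quote_ident(col)
        result[col] = conn.execute(
            f"SELECT {col_q}, COUNT(*) FROM {table_q} "
            f"GROUP BY {col_q} ORDER BY {col_q}"
        ).fetchall()
    return result


# Assumption: an in-memory SQLite table with the question's sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (personid INTEGER, msg TEXT)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [(1, "msg1"), (2, "msg2"), (2, "msg3"), (3, "msg4"), (1, "msg2")],
)

counts = distinct_counts(conn, "messages")
for column, values in counts.items():
    print(column, values)
# personid [(1, 2), (2, 2), (3, 1)]
# msg [('msg1', 1), ('msg2', 2), ('msg3', 1), ('msg4', 1)]
```

This returns one result list per column rather than the side-by-side layout from the question. Each column costs one full scan of the table, so with 600 columns on a large table it may be slow.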