Linking Related IDs together through two other ID columns - sql

I have a table of about 100k rows with the following layout:
+----+-----------+------------+-------------------+
| ID | PIN | RAID | Desired Output ID |
+----+-----------+------------+-------------------+
| 1 | 80602627 | 1737852-1 | 1 |
| 2 | 80602627 | 34046655-1 | 1 |
| 3 | 351418172 | 33661 | 2 |
| 4 | 351418172 | 33661 | 2 |
| 5 | 351418172 | 33661 | 2 |
| 6 | 351418172 | 34443321-1 | 2 |
| 7 | 491863017 | 26136 | 3 |
| 8 | 491863017 | 34575 | 3 |
| 9 | 491863017 | 34575 | 3 |
| 10 | 661254727 | 26136 | 3 |
| 11 | 661254727 | 26136 | 3 |
| 12 | NULL | 7517 | 4 |
| 13 | NULL | 7517 | 4 |
| 14 | NULL | 7517 | 4 |
| 15 | NULL | 7517 | 4 |
| 16 | NULL | 7517 | 4 |
| 17 | 554843813 | 33661 | 2 |
| 18 | 554843813 | 33661 | 2 |
+----+-----------+------------+-------------------+
The ID column has unique values, with the PIN and RAID columns being two separate identifying numbers used to group linked IDs together. The Desired Output ID column is what I would like SQL to do, essentially looking at both the PIN and RAID columns to spot where there are any relationships between them.
For example, where Desired Output ID = 2, IDs 3-6 match on PIN = 351418172, and IDs 17-18 are then linked as well, because their RAID of 33661 also appears in the rows for IDs 3-5.
To add as well, NULLs will be in the PIN Column but not in any others.
I did spot a similar question, however as it is in BigQuery I wasn't sure it would help.
Have been trying to crack this one for a while with no luck, any help massively appreciated.

I suppose DENSE_RANK can solve your problem. Not sure what the combination of PIN and RAID should be, but I think you'll be able to figure it out how to do it like this:
SELECT *,DENSE_RANK( ) over (ORDER BY isnull(pin,id) ),DENSE_RANK( ) over (ORDER BY raid)
FROM accounts

I believe I have found a bit of a bodged solution to this. It runs very slowly as it goes row by row and will only go two links deep on PIN/RAID, but this should be sufficient for 99%+ cases.
Would appreciate any suggestions to speeding it up if anything is immediately obvious.
ID in post above is DebtorNo in Code:
DECLARE @Counter INT = 1
DECLARE @EndCounter INT = 0

-- Working copy; a NULL PIN is replaced by DebtorNo so it only matches itself
IF OBJECT_ID('Tempdb..#OrigACs') IS NOT NULL
BEGIN
    DROP TABLE #OrigACs
END
SELECT DebtorNo,
       Name,
       PostCode,
       DOB,
       RAJoin,
       COALESCE(PIN,DebtorNo COLLATE DATABASE_DEFAULT) AS PIN,
       RelatedAssets,
       RAID,
       PINRelatedAssets
INTO #OrigACs
FROM MIReporting..HC_RA_Test_Data RA

-- Same rows with a sequential row number to drive the loop
IF OBJECT_ID('Tempdb..#Accounts') IS NOT NULL
BEGIN
    DROP TABLE #Accounts
END
SELECT *,
       ROW_NUMBER() OVER (ORDER BY CAST(RA.DebtorNo AS INT)) AS Row
INTO #Accounts
FROM #OrigACs RA
ORDER BY CAST(RA.DebtorNo AS INT)

CREATE INDEX Temp_HC_Index ON #OrigACs (RAID,PIN)

SET @EndCounter = (SELECT MAX(Row) FROM #Accounts)

WHILE @Counter <= @EndCounter
BEGIN
    -- Link 1: rows sharing the RAID of the current row
    IF OBJECT_ID('Tempdb..#RAID1') IS NOT NULL
    BEGIN
        DROP TABLE #RAID1
    END
    SELECT *
    INTO #RAID1
    FROM #OrigACs A
    WHERE A.RAID IN (SELECT RAID FROM #Accounts WHERE [Row] = @Counter)

    -- Link 2: rows sharing a PIN with any of those
    IF OBJECT_ID('Tempdb..#PIN1') IS NOT NULL
    BEGIN
        DROP TABLE #PIN1
    END
    SELECT *
    INTO #PIN1
    FROM #OrigACs A
    WHERE A.PIN IN (SELECT PIN FROM #RAID1)

    -- Second pass over RAID, then PIN
    IF OBJECT_ID('Tempdb..#RAID2') IS NOT NULL
    BEGIN
        DROP TABLE #RAID2
    END
    SELECT *
    INTO #RAID2
    FROM #OrigACs A
    WHERE A.RAID IN (SELECT RAID FROM #PIN1)

    IF OBJECT_ID('Tempdb..#PIN2') IS NOT NULL
    BEGIN
        DROP TABLE #PIN2
    END
    SELECT *
    INTO #PIN2
    FROM #OrigACs A
    WHERE A.PIN IN (SELECT PIN FROM #RAID2)

    -- Store the linked rows under the next free group number (FRAID)
    INSERT INTO MIReporting..HC_RA_Final_ACs
    SELECT DebtorNo,
           Name,
           PostCode,
           DOB,
           RAJoin,
           CASE
               WHEN PIN = DebtorNo COLLATE DATABASE_DEFAULT THEN NULL
               ELSE PIN
           END AS PIN,
           RelatedAssets,
           RAID,
           PINRelatedAssets,
           COALESCE((SELECT MAX(FRAID) FROM MIReporting..HC_RA_Final_ACs),0) + 1 AS FRAID
    FROM #PIN2

    -- Jump to the first row not yet assigned; NULL (nothing left) ends the loop
    SET @Counter = (SELECT MIN([ROW]) FROM #Accounts O WHERE O.DebtorNo NOT IN (SELECT DebtorNo FROM MIReporting..HC_RA_Final_ACs));
END;
SELECT *
FROM MIReporting..HC_RA_Final_ACs
DROP TABLE #OrigACs
DROP TABLE #Accounts
DROP TABLE #RAID1
DROP TABLE #PIN1
DROP TABLE #RAID2
DROP TABLE #PIN2
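The loop above only follows links two levels deep. One set-based alternative, offered as a sketch under assumptions rather than as anything from the thread, is to treat this as a connected-components problem: every row starts with its own ID as a group label, and the smallest label is repeatedly spread across rows sharing a PIN or a RAID until a pass changes nothing. The table #Links, the column GroupId and the variable @Changed are hypothetical names, and the rows are copied from the sample in the question.

```sql
-- Hypothetical working table; GroupId starts out equal to ID.
CREATE TABLE #Links
(
    ID      INT PRIMARY KEY,
    PIN     VARCHAR(20) NULL,
    RAID    VARCHAR(20) NOT NULL,
    GroupId INT NOT NULL
);

INSERT INTO #Links (ID, PIN, RAID, GroupId) VALUES
(1,  '80602627',  '1737852-1',  1),  (2,  '80602627',  '34046655-1', 2),
(3,  '351418172', '33661',      3),  (4,  '351418172', '33661',      4),
(5,  '351418172', '33661',      5),  (6,  '351418172', '34443321-1', 6),
(7,  '491863017', '26136',      7),  (8,  '491863017', '34575',      8),
(9,  '491863017', '34575',      9),  (10, '661254727', '26136',      10),
(11, '661254727', '26136',      11), (12, NULL,        '7517',       12),
(13, NULL,        '7517',       13), (14, NULL,        '7517',       14),
(15, NULL,        '7517',       15), (16, NULL,        '7517',       16),
(17, '554843813', '33661',      17), (18, '554843813', '33661',      18);

DECLARE @Changed INT = 1;

WHILE @Changed > 0
BEGIN
    -- Spread the smallest label within each PIN (NULL PINs never link rows).
    UPDATE L
    SET    GroupId = M.MinGroup
    FROM   #Links AS L
    JOIN  (SELECT PIN, MIN(GroupId) AS MinGroup
           FROM   #Links
           WHERE  PIN IS NOT NULL
           GROUP BY PIN) AS M ON M.PIN = L.PIN
    WHERE  M.MinGroup < L.GroupId;
    SET @Changed = @@ROWCOUNT;

    -- Spread the smallest label within each RAID.
    UPDATE L
    SET    GroupId = M.MinGroup
    FROM   #Links AS L
    JOIN  (SELECT RAID, MIN(GroupId) AS MinGroup
           FROM   #Links
           GROUP BY RAID) AS M ON M.RAID = L.RAID
    WHERE  M.MinGroup < L.GroupId;
    SET @Changed = @Changed + @@ROWCOUNT;
END;

-- Turn the surviving labels into consecutive group numbers.
SELECT ID, PIN, RAID,
       DENSE_RANK() OVER (ORDER BY GroupId) AS OutputId
FROM   #Links
ORDER BY ID;

DROP TABLE #Links;
```

Each pass is two set-based updates, and the loop stops once a pass changes no rows, so chains of any depth are followed. On the sample rows OutputId should line up with the Desired Output ID column. At 100k rows, indexes on PIN and RAID would probably be worth adding.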

Related

Counting columns if certain Id

I have a table tblTitles that I am attempting to run a select query on. For each IdState I would like to count how many reports there are, and also count how many of those titles are on sale, which is the case when IsOnSaleId is 1.
Here is an example of the table:
+---------+----------+-------------------+-----------------+
| IdState | RegionId | Title | IsOnSaleId |
+---------+----------+-------------------+-----------------+
| 22 | 1 | Online Shopping | 0 |
| 22 | 1 | Retail Shopping | 1 |
| 22 | 1 | Pick Up | 0 |
| | | | |
+---------+----------+-------------------+-----------------+
My expected outcome should read that IdState of 22 has 3 reports and 1 report is onSale due to the 1 integer in the second row. Which would look similar to this:
+---------+-------------+---------------+
| IdState | ReportCount | IsOnSaleCount |
+---------+-------------+---------------+
| 22 | 3 | 1 |
+---------+-------------+---------------+
I am having issues when doing a select statement with this count. The IsOnSaleCount is identical to the ReportCount number which they should not be.
I believe this is the case due to my line of code of case when count(i.IsOnSaleId) > 0 THEN count(1) Else 0 End as IsOnSaleCount
Is this something that I can do in a SELECT query?
Here is an example of my query :
select
i.IdState,
count(i.RegionId) as ReportCount,
case when count(i.IsOnSaleId) > 0 THEN count(1) Else 0 End as IsOnSaleCount,
0 as EnterpriseReportCount,
i.IdReportCollection_PK_PrimaryCollection
from IBIS_Local.dbo.tblindustry i
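COUNT(i.IsOnSaleId) counts every non-NULL value, zeros included, which is why it comes out equal to ReportCount. If you want the count of rows that are on sale:
sum(case when i.IsOnSaleId = 1 then 1 else 0 end) as IsOnSaleCount,
If you just want a 0/1 flag, you could do:
sign(sum(case when i.IsOnSaleId = 1 then 1 else 0 end)) as IsOnSaleCount,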
IF OBJECT_ID('stack.report') IS NOT NULL DROP TABLE stack.report
CREATE TABLE stack.report ( IdState TINYINT, RegionID TINYINT, Title VARCHAR(50), IsOnSaleId INT)
INSERT INTO stack.report
VALUES
(22,1,'Online Shopping', 0)
, (22,1,'Retail Shopping', 1)
, (22,1,'Pick Up', 0)
SELECT *, CONVERT(TINYINT, IsOnSaleId) isonsaleint FROM stack.report
SELECT IdState, COUNT(*) ReportCount, SUM(IsOnSaleId) OnSaleCount
FROM stack.report
GROUP BY IdState
ORDER BY IdState
Result
IdState | ReportCount | OnSaleCount
22 | 3 | 1
The SUM works if IsOnSaleId is an INT, SMALLINT or TINYINT. If the IsOnSaleId datatype is BIT (commonly used for flags), then you will need to convert it to one of the int types, like this: SUM(CONVERT(INT, IsOnSaleId))
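As a small illustration of the BIT case, the sketch below uses a hypothetical table variable @Report whose flag column is declared as BIT. SUM cannot be applied to a BIT column directly, so the value is converted first.

```sql
-- Hypothetical table variable with a BIT flag column
DECLARE @Report TABLE
(
    IdState  TINYINT,
    Title    VARCHAR(50),
    IsOnSale BIT
);

INSERT INTO @Report (IdState, Title, IsOnSale) VALUES
(22, 'Online Shopping', 0),
(22, 'Retail Shopping', 1),
(22, 'Pick Up',         0);

-- Convert the BIT to INT before summing
SELECT IdState,
       COUNT(*)                     AS ReportCount,
       SUM(CONVERT(INT, IsOnSale))  AS OnSaleCount
FROM   @Report
GROUP BY IdState;
```

For the three sample rows this gives a ReportCount of 3 and an OnSaleCount of 1 for IdState 22.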

How to delete the rows with three same data columns and one different data column

I have a table "MARK_TABLE" as below.
How can I delete the rows with same "STUDENT", "COURSE" and "SCORE" values?
| ID | STUDENT | COURSE | SCORE |
|----|---------|--------|-------|
| 1 | 1 | 1 | 60 |
| 3 | 1 | 2 | 81 |
| 4 | 1 | 3 | 81 |
| 9 | 2 | 1 | 80 |
| 10 | 1 | 1 | 60 |
| 11 | 2 | 1 | 80 |
Now I already filtered the data I want to KEEP, but without the "ID"...
SELECT student, course, score FROM mark_table
INTERSECT
SELECT student, course, score FROM mark_table
The output:
| STUDENT | COURSE | SCORE |
|---------|--------|-------|
| 1 | 1 | 60 |
| 1 | 2 | 81 |
| 1 | 3 | 81 |
| 2 | 1 | 80 |
Use the following query to delete the desired rows:
DELETE FROM MARK_TABLE M
WHERE
EXISTS (
SELECT
1
FROM
MARK_TABLE M_IN
WHERE
M.STUDENT = M_IN.STUDENT
AND M.COURSE = M_IN.COURSE
AND M.SCORE = M_IN.SCORE
AND M.ID < M_IN.ID
)
use distinct
SELECT distinct student, course, score FROM mark_table
Assuming you don't just want to select the unique data you want to keep (you mention you've already done this), you can proceed as follows:
Create a temporary table to hold the data you want to keep
Insert the data you want to keep into the temporary table
Empty the source table
Re-Insert the data you want to keep into the source table.
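The four steps above might look roughly like the following. This is a sketch, not the answerer's code: #MARK_TABLE stands in for the real table, #Keep is a hypothetical name, and keeping the lowest ID of each duplicate set is an assumption about which row should survive.

```sql
-- Stand-in for MARK_TABLE with the sample rows from the question
CREATE TABLE #MARK_TABLE (ID INT, STUDENT INT, COURSE INT, SCORE INT);
INSERT INTO #MARK_TABLE (ID, STUDENT, COURSE, SCORE) VALUES
(1, 1, 1, 60), (3, 1, 2, 81), (4, 1, 3, 81),
(9, 2, 1, 80), (10, 1, 1, 60), (11, 2, 1, 80);

-- Steps 1 and 2: create the temporary table and fill it with the rows to keep
SELECT MIN(ID) AS ID, STUDENT, COURSE, SCORE
INTO   #Keep
FROM   #MARK_TABLE
GROUP BY STUDENT, COURSE, SCORE;

BEGIN TRANSACTION;

    -- Step 3: empty the source table
    DELETE FROM #MARK_TABLE;

    -- Step 4: re-insert the rows that were kept
    INSERT INTO #MARK_TABLE (ID, STUDENT, COURSE, SCORE)
    SELECT ID, STUDENT, COURSE, SCORE
    FROM   #Keep;

COMMIT TRANSACTION;

SELECT * FROM #MARK_TABLE ORDER BY ID;

DROP TABLE #Keep;
DROP TABLE #MARK_TABLE;
```

Wrapping the delete and re-insert in one transaction means the table is never left empty if something fails halfway. On a real table, identity columns and foreign keys pointing at it would need separate handling.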
select * from
(
select row_number() over (partition by student,course,score order by score)
rn,student,course,score from mark_table
) t
where rn=1
Use CTE with RowNumber
create table #MARK_TABLE (ID int, STUDENT int, COURSE int, SCORE int)
insert into #MARK_TABLE
values
(1,1,1,60),
(3,1,2,81),
(4,1,3,81),
(9,2,1,80),
(10,1,1,60),
(11,2,1,80)
;with cteDeleteID as(
Select id, row_number() over (partition by student,course,score order by id) [row_number] from #MARK_TABLE
)
delete from #MARK_TABLE where id in
(
select id from cteDeleteID where [row_number] != 1
)
select * from #MARK_TABLE
drop table #MARK_TABLE

Temp table - group by - delete - keep top 10

I have a temp table with 50 000 records. If I do a GROUP BY, with COUNT, it will look like this:
+--------+--------+
|GrpById | Count |
+--------+--------+
| 1 | 10000 |
| 2 | 8000 |
| 3 | 12000 |
| 4 | 9000 |
| 5 | 11000 |
+--------+--------+
I would like to delete some records, so from each Id's (1,2,3,4,5) I would have only 10 records left after deletion.
So eventually If I would make a new GROUP BY with COUNT, I would have something like this:
+--------+--------+
|GrpById | Count |
+--------+--------+
| 1 | 10 |
| 2 | 10 |
| 3 | 10 |
| 4 | 10 |
| 5 | 10 |
+--------+--------+
Can I do it without FETCH NEXT ?
To just preserve an arbitrary 10 per group you can use
WITH CTE AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY GrpById ORDER BY GrpById) AS RN
FROM YourTable
)
DELETE FROM
CTE WHERE RN > 10;
Change the ORDER BY if you need something less arbitrary.
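As an illustration of a less arbitrary ORDER BY, the sketch below keeps the newest rows of each group. The table #Demo and its Id and CreatedAt columns are hypothetical, and it keeps 2 rows per group instead of 10 so the sample stays small.

```sql
-- Hypothetical sample table
CREATE TABLE #Demo
(
    Id        INT IDENTITY(1,1) PRIMARY KEY,
    GrpById   INT NOT NULL,
    CreatedAt DATETIME2 NOT NULL
);

INSERT INTO #Demo (GrpById, CreatedAt) VALUES
(1, '2020-01-01'), (1, '2020-01-02'), (1, '2020-01-03'),
(2, '2020-02-01'), (2, '2020-02-02'), (2, '2020-02-03'), (2, '2020-02-04');

-- Number rows newest first within each group; Id breaks ties
WITH CTE AS
(
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY GrpById
                              ORDER BY CreatedAt DESC, Id DESC) AS RN
    FROM   #Demo
)
DELETE FROM CTE
WHERE  RN > 2;   -- use 10 for the case in the question

SELECT GrpById, COUNT(*) AS [Count]
FROM   #Demo
GROUP BY GrpById;

DROP TABLE #Demo;
```

After the delete each group is left with its 2 most recent rows.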
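declare @id int;
declare @count int;
set @id = 1;
select @count = count(1) from YourTable where GrpById = @id;
delete top (@count - 10) from YourTable where GrpById = @id;
Run the above query for each GrpById value in turn, setting the variable @id each time. It assumes every group holds at least 10 rows, since TOP with a negative value raises an error.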

Inserting four columns into one

Good morning,
I have a table TestSeed that stores a multiple choices test with the following structure:
QNo QText QA1 QA2 QA3 QA4
It already contains data.
I would like to move some of the columns to a temp table with the following structure:
QNo QA
Where QNo will store the question number from the first table and QA will store QA1, QA2, QA3 and QA4 over four rows of data.
I am trying to do it in a SQL stored procedure. And it got down to the following situation:
I want to create a nested loop where I can go through the TestSeed table rows in the outer loop and then go through the four QA fields and insert them in the inner loop.
So my code will look something like this:
Declare @TempAnswers as table
(
    [QNo] int,
    [QAnswer] [nvarchar](50) NULL
)
DECLARE @QNO int
DECLARE QROW CURSOR LOCAL FOR select QNo from #TempSeed
OPEN QROW
FETCH NEXT FROM QROW into @QNO
WHILE @@FETCH_STATUS = 0
BEGIN
    DECLARE @i INT
    SET @i = 1
    WHILE (@i <=4)
    Begin
        insert into @TempAnswers
        (
            [QNo],
            [QAnswer]
        )
        select QNo, 'QA'+@i --This is the part I need
        from #TempSeed
        SET @i = @i +1
    END
    FETCH NEXT FROM QROW into @QNO
END
CLOSE QROW
DEALLOCATE QROW
So I guess my question is: can I use a concatenated string to refer to a column name in SQL? and if so how?
I am sort of a beginner. I would appreciate any help I can.
No need for loop, you can simply use the UNPIVOT table operator to do this:
INSERT INTO temp
SELECT
QNO,
val
FROM Testseed AS t
UNPIVOT
(
val
FOR col IN([QA1], [QA2], [QA3], [QA4])
) AS u;
For example, if you have the following sample data:
| QNO | QTEXT | QA1 | QA2 | QA3 | QA4 |
|-----|-------|-----|-----|-----|-----|
| 1 | q1 | a | b | c | d |
| 2 | q2 | b | c | d | e |
| 3 | q3 | e | a | b | c |
| 4 | q4 | a | c | d | e |
| 5 | q5 | c | d | e | a |
The previous query will fill the temp table with:
| QNO | QA |
|-----|----|
| 1 | a |
| 1 | b |
| 1 | c |
| 1 | d |
| 2 | b |
| 2 | c |
| 2 | d |
| 2 | e |
| 3 | e |
| 3 | a |
| 3 | b |
| 3 | c |
| 4 | a |
| 4 | c |
| 4 | d |
| 4 | e |
| 5 | c |
| 5 | d |
| 5 | e |
| 5 | a |
The UNPIVOT table operator converts the values of the four columns [QA1], [QA2], [QA3], [QA4] into rows of a single column, one row per value.
Then you can put that query inside a stored procedure.
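One thing to be aware of is that UNPIVOT leaves out rows whose value is NULL. If unanswered options need to be kept, a CROSS APPLY over a VALUES list is a common alternative. The sketch below is not from the answer; the table variable @TestSeed and its sample rows are hypothetical.

```sql
-- Hypothetical stand-in for TestSeed, with one NULL answer
DECLARE @TestSeed TABLE
(
    QNo   INT,
    QText NVARCHAR(100),
    QA1   NVARCHAR(50),
    QA2   NVARCHAR(50),
    QA3   NVARCHAR(50),
    QA4   NVARCHAR(50)
);

INSERT INTO @TestSeed (QNo, QText, QA1, QA2, QA3, QA4) VALUES
(1, N'q1', N'a', N'b', N'c', N'd'),
(2, N'q2', N'b', N'c', NULL, N'e');

-- One output row per answer column, NULLs included
SELECT t.QNo,
       v.QA
FROM   @TestSeed AS t
CROSS APPLY (VALUES (t.QA1), (t.QA2), (t.QA3), (t.QA4)) AS v(QA)
ORDER BY t.QNo;
```

This returns four rows per question, so question 2 still yields a row for its NULL third answer, which UNPIVOT would drop.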
So, to answer your last question, you can use Dynamic SQL which involves creating your query as a STRING and then executing it, in case you really want to stick to the method you already started.
You will have to declare a variable to store the text of your query:
DECLARE @query NVARCHAR(MAX)
SET @query = N'SELECT QNo, QA' + CAST(@i AS NVARCHAR(10)) + N' FROM #TempSeed'
EXEC sp_executesql @query
Note that @i is an INT, so it has to be cast to a string before it can be concatenated. Setting the text of the query and executing it has to be repeated every time you build a query to be executed.
If you want something simpler, there are other answers here which will work.
Try this:
Declare @TempAnswers as table
(
    [QNo] int,
    [QAnswer] [nvarchar](50) NULL
);
-- UNION ALL keeps every answer; plain UNION would drop identical answers within a question
INSERT INTO @TempAnswers(QNo, QAnswer)
SELECT QNo, QA
FROM (SELECT QNo, QA1 AS QA FROM TestSeed
      UNION ALL
      SELECT QNo, QA2 AS QA FROM TestSeed
      UNION ALL
      SELECT QNo, QA3 AS QA FROM TestSeed
      UNION ALL
      SELECT QNo, QA4 AS QA FROM TestSeed
     ) AS A
ORDER BY QNo;

Create New Table From Other Table After Grouping

How can I insert into one table values obtained by "grouping" another table?
I have 2 tables with different structures.
The table ORDRE already contains data:
Table ORDRE:
ORDRE ID | CODE_DEST |
-------------------------
1 | a |
2 | b |
3 | c |
4 | a |
5 | a |
6 | b |
7 | g |
I want to INSERT the value FROM Table ORDRE INTO TABLE VOIT:
ID_VOIT | ORDRE ID | CODE_DEST |
---------------------------------------
1 | 1 | a |
1 | 4 | a |
1 | 5 | a |
2 | 2 | b |
2 | 6 | b |
3 | 3 | c |
4 | 7 | g |
This is my best guess on what you need using only the info available.
declare @Ordre table
(
    ordre_id int,
    code_dest char(1)
)
declare @Voit table
(
    id_voit int,
    ordre_id int,
    code_dest char(1)
)
insert into @Ordre values
(1,'a'),
(2,'b'),
(3,'c'),
(4,'a'),
(5,'a'),
(6,'b'),
(7,'g')
-- Number each distinct code_dest, then join the numbers back onto the rows
insert into @Voit
select id_voit, ordre_id, rsOrdre.code_dest
from @Ordre rsOrdre
inner join
(
    select code_dest, ROW_NUMBER() over (order by code_dest) as id_voit
    from @Ordre
    group by code_dest
) rsVoit on rsVoit.code_dest = rsOrdre.code_dest
order by id_voit, ordre_id
select * from @Voit
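The same numbering can probably be had without the self-join by using DENSE_RANK, which gives every row with the same code_dest the same number. This is a sketch under that assumption, reusing the hypothetical table variable and sample rows from the answer above.

```sql
-- Sample rows matching table ORDRE in the question
DECLARE @Ordre TABLE
(
    ordre_id  INT,
    code_dest CHAR(1)
);

INSERT INTO @Ordre (ordre_id, code_dest) VALUES
(1, 'a'), (2, 'b'), (3, 'c'), (4, 'a'), (5, 'a'), (6, 'b'), (7, 'g');

-- Rows sharing a code_dest share an id_voit
SELECT DENSE_RANK() OVER (ORDER BY code_dest) AS id_voit,
       ordre_id,
       code_dest
FROM   @Ordre
ORDER BY id_voit, ordre_id;
```

With the sample rows, codes a, b, c and g receive id_voit 1, 2, 3 and 4 respectively, matching the target layout of VOIT. The SELECT can be fed straight into an INSERT INTO VOIT.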
For the specific data you give as an example, this works:
insert into VOIT
select
case code_dest
when 'a' then 1
when 'b' then 2
when 'c' then 3
when 'g' then 4
else 0
end, orderId, code_dest from ORDRE order by code_dest, orderId
But it kind of sucks because it requires hard-coding in a huge case statement.
Test is here - https://data.stackexchange.com/stackoverflow/q/119442/
What I like more is moving the VOIT ID / Code_Dest associations to a new table, so then you could do an inner join instead.
insert into VOIT
select voit_id, orderId, t.code_dest
from ORDRE t
join Voit_CodeDest t2 on t.code_dest = t2.code_dest
order by code_dest, orderId
Working example of that here - https://data.stackexchange.com/stackoverflow/q/119443/