stuck with the logic in looping - sql

I am stuck on the logic, so I have posted sample data that represents my setup.
MAP and NET_P hold the same values.
DROP TABLE #TEMP_O
CREATE TABLE #TEMP_O
([A] INT, [B] INT, [RESET] INT, [P] NUMERIC(22,6), [MAP] NUMERIC(22,6), [NET_P] NUMERIC(22,6), [MAP_PER] INT, [PRICE_RESET] INT, [RESET_P_VALUE] INT, [PRICE] NUMERIC(22,6), [PT] INT)
;
INSERT INTO #TEMP_O
([A], [B], [RESET], [P], [MAP], [NET_P], [MAP_PER], [PRICE_RESET], [RESET_P_VALUE], [PRICE], [PT])
VALUES
(1404592, 1, NULL, '39', '165.18', '165.18', 1, 1, 50, 39, 10),
(1404592, 2, NULL, '39', '165.18', '165.18', 1, 1, 50, 39, 10),
(1404592, 3, NULL, '39', '165.18', '165.18', 1, 1, 50, 39, 10),
(1404592, 4, NULL, NULL, NULL, NULL, 1, 2, 60, 42, 10),
(1404592, 5, NULL, NULL, NULL, NULL, 1, 2, 60, 48, 10),
(1404592, 6, NULL, NULL, NULL, NULL, 2, 3, 60, 49, 10),
(1404592, 7, NULL, NULL, NULL, NULL, 2, 3, 70, 56, 10),
(1404592, 8, NULL, NULL, NULL, NULL, 2, 3, 70, 65, 10),
(1404592, 9, NULL, NULL, NULL, NULL, 2, 4, 70, 69, 10),
(1404676, 1, NULL, '70', '165.18', '165.18', 1, 1, 52, 70, 10),
(1404676, 2, NULL, '70', '165.18', '165.18', 1, 1, 52, 79, 10),
(1404676, 3, NULL, '70', '165.18', '165.18', 1, 1, 52, 89, 10),
(1404676, 4, NULL, NULL, NULL, NULL, 2, 2, 56, 90, 10),
(1404676, 5, NULL, NULL, NULL, NULL, 2, 2, 56, 97, 10),
(1404676, 6, NULL, NULL, NULL, NULL, 2, 2, 56, 97, 10),
(1404676, 7, NULL, NULL, NULL, NULL, 3, 3, 63, 98, 10),
(1404676, 8, '1', NULL, NULL, NULL, 4, 4, 63, 98, 10),
(1404676, 9, NULL, NULL, NULL, NULL, 4, 4, 63, 99, 10)
;
I got the result, but it took a long time because I used loops.
The required output is as follows.

Average with same id

INSERT INTO `iot_athlete_ind_cont_non_cur_chlng_consp`
(`aicicc_id`, `aicid_id`, `aicidl_id`, `aica_id`, `at_id`, `aicicc_type`, `aicicc_tp`, `aicicc_attempt`, `aicicc_lastposition`, `aicicc_status`, `pan_percentile`, `age_percentile`, `created_at`, `updated_at`) VALUES
(270, 3, 14, 17, 7, 'Time', 50, 1, 5, 'Active', NULL, NULL, '2022-11-15 08:34:40', '2022-11-15 08:34:40'),
(271, 3, 14, 20, 7, 'Time', 60, 1, 231, 'Active', NULL, NULL, '2022-11-15 08:34:45', '2022-11-15 08:35:21'),
(272, 3, 14, 21, 7, 'Time', 70, 1, 20, 'Active', NULL, NULL, '2022-11-15 08:34:45', '2022-11-15 08:35:21'),
(273, 3, 14, 17, 7, 'Time', 90, 2, 5, 'Active', NULL, NULL, '2022-11-15 08:34:45', '2022-11-15 12:13:42'),
(274, 3, 14, 20, 7, 'Time', 40, 2, 231, 'Active', NULL, NULL, '2022-11-15 08:34:45', '2022-11-15 08:35:21'),
(275, 3, 14, 21, 7, 'Time', 70, 2, 20, 'Active', NULL, NULL, '2022-11-15 08:34:45', '2022-11-15 08:35:21'),
(276, 3, 10, 17, 3, 'Time', 80, 1, 5, 'Active', NULL, NULL, '2022-11-15 08:34:45', '2022-11-15 12:10:25'),
(277, 3, 10, 20, 3, 'Time', 60, 1, 231, 'Active', NULL, NULL, '2022-11-15 08:34:45', '2022-11-15 12:10:43'),
(278, 3, 10, 21, 3, 'Time', 60, 1, 20, 'Active', NULL, NULL, '2022-11-15 08:34:45', '2022-11-15 12:11:03');
I need 3 rows from this table with the average, like this:
at_id aicicc_attempt average
7 1 60
7 2 66.66
3 1 66.66
My query is:
SELECT DISTINCT at_id, AVG(aicicc_tp) OVER (PARTITION BY aicicc_attempt) as average
FROM iot_athlete_ind_cont_non_cur_chlng_consp
WHERE aicid_id = '3';
But it's not working properly: the average calculated by my query is wrong.
First of all, this is basically just a combination of AVG and GROUP BY:
SELECT at_id, aicicc_attempt,
AVG(aicicc_tp) AS average
FROM iot_athlete_ind_cont_non_cur_chlng_consp
GROUP BY at_id, aicicc_attempt
ORDER BY at_id DESC;
(I'm not sure the ORDER BY clause is necessary; without it, the last row appears first.)
The "problem" might be that the "average" column is not shown exactly as you wanted. For example, let's assume the column "aicicc_tp" has been declared as data type int and you are using a SQL Server database. In this case, the average in your result will also be an integer:
at_id aicicc_attempt average
7 1 60
7 2 66
3 1 66
You will need to cast the column as float and format the outcome to generate exactly the desired result (of course, the correct average for your sample data is 66.67, not 66.66):
SELECT at_id, aicicc_attempt,
FORMAT(AVG(CAST(aicicc_tp AS float)), '##.##') AS average
FROM iot_athlete_ind_cont_non_cur_chlng_consp
GROUP BY at_id, aicicc_attempt
ORDER BY at_id DESC;
at_id aicicc_attempt average
7 1 60
7 2 66.67
3 1 66.67
If you are using another DB type, the concrete query will differ. That's why it's recommended to tag your DB type.
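The integer-versus-float point above can be checked with a small sketch. This is only an illustration in Python with the standard sqlite3 module, not SQL Server: SQLite's own AVG() always returns a float, so the truncation is emulated here with integer SUM / COUNT, and the table name t and its short column names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical mini table holding only the three rows of at_id 7, attempt 2.
conn.execute("CREATE TABLE t (at_id INTEGER, attempt INTEGER, tp INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)",
                 [(7, 2, 90), (7, 2, 40), (7, 2, 70)])

# Integer / integer truncates (the same effect as an integer average);
# multiplying by 1.0 first forces floating-point division.
int_avg, float_avg = conn.execute(
    "SELECT SUM(tp) / COUNT(*), ROUND(SUM(tp) * 1.0 / COUNT(*), 2) FROM t"
).fetchone()

print(int_avg, float_avg)
```

The first value is the truncated average, the second the rounded floating-point one, which is why the cast is needed before averaging.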
EDIT because you changed the question:
Adding the WHERE clause that is now mentioned in your question does not change the result:
SELECT at_id, aicicc_attempt,
FORMAT(AVG(CAST(aicicc_tp AS float)), '##.##') AS average
FROM iot_athlete_ind_cont_non_cur_chlng_consp
WHERE aicid_id = 3
GROUP BY at_id, aicicc_attempt
ORDER BY at_id DESC;
The outcome is still the same and correct:
at_id aicicc_attempt average
7 1 60
7 2 66.67
3 1 66.67
And a last note: If you want to keep your PARTITION BY idea, your query must be like this:
SELECT DISTINCT at_id, aicicc_attempt,
AVG(aicicc_tp)
OVER (PARTITION BY at_id, aicicc_attempt) AS average
FROM iot_athlete_ind_cont_non_cur_chlng_consp
WHERE aicid_id = 3
ORDER BY at_id DESC;
But I don't recommend this because of some disadvantages:
The usage of DISTINCT often slows down the query.
It does not solve the issue that the average may be computed as an integer, which is not what you want according to your description.
Window functions often differ between DB types, while GROUP BY works the same on each DB type.
You can verify all these things here: db<>fiddle
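To see why the original window query gives the wrong averages, here is a minimal sketch using Python's sqlite3 module (SQLite 3.25 or later is assumed for window functions). The table name consp is a shortened, hypothetical stand-in for the real table, and only the three relevant columns from the question's sample rows are loaded.

```python
import sqlite3

# (at_id, aicicc_attempt, aicicc_tp) taken from the question's sample rows.
rows = [
    (7, 1, 50), (7, 1, 60), (7, 1, 70),
    (7, 2, 90), (7, 2, 40), (7, 2, 70),
    (3, 1, 80), (3, 1, 60), (3, 1, 60),
]
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE consp (at_id INTEGER, aicicc_attempt INTEGER, aicicc_tp INTEGER)"
)
conn.executemany("INSERT INTO consp VALUES (?, ?, ?)", rows)

# Partitioning by attempt only mixes both at_id values into one average.
by_attempt = conn.execute(
    "SELECT DISTINCT at_id, "
    "ROUND(AVG(aicicc_tp) OVER (PARTITION BY aicicc_attempt), 2) AS average "
    "FROM consp ORDER BY at_id DESC, average"
).fetchall()

# Grouping by both columns yields one average per (at_id, attempt) pair.
grouped = conn.execute(
    "SELECT at_id, aicicc_attempt, ROUND(AVG(aicicc_tp), 2) AS average "
    "FROM consp GROUP BY at_id, aicicc_attempt "
    "ORDER BY at_id DESC, aicicc_attempt"
).fetchall()

print(by_attempt)
print(grouped)
```

With the attempt-only partition, all six attempt-1 rows are averaged together regardless of at_id, so neither at_id gets its own figure for attempt 1.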
For the average, you may use AVG()
Solution below is for sql-server, as you didn't tag your rdbms.
SELECT at_id, aicicc_attempt, AVG(aicicc_tp) FROM iot_athlete_ind_cont_non_cur_chlng_consp
GROUP BY at_id, aicicc_attempt
with cte (aicicc_id, aicid_id,aicidl_id, aica_id, at_id, aicicc_type, aicicc_tp, aicicc_attempt, aicicc_lastposition, aicicc_status,
pan_percentile, age_percentile, created_at, updated_at) as (
values (270, 3, 14, 17, 7, 'Time', 50, 1, 5, 'Active', NULL, NULL, '2022-11-15 08:34:40', '2022-11-15 08:34:40'),
(271, 3, 14, 20, 7, 'Time', 60, 1, 231, 'Active', NULL, NULL, '2022-11-15 08:34:45', '2022-11-15 08:35:21'),
(272, 3, 14, 21, 7, 'Time', 70, 1, 20, 'Active', NULL, NULL, '2022-11-15 08:34:45', '2022-11-15 08:35:21'),
(273, 3, 14, 17, 7, 'Time', 90, 2, 5, 'Active', NULL, NULL, '2022-11-15 08:34:45', '2022-11-15 12:13:42'),
(274, 3, 14, 20, 7, 'Time', 40, 2, 231, 'Active', NULL, NULL, '2022-11-15 08:34:45', '2022-11-15 08:35:21'),
(275, 3, 14, 21, 7, 'Time', 70, 2, 20, 'Active', NULL, NULL, '2022-11-15 08:34:45', '2022-11-15 08:35:21'),
(276, 3, 10, 17, 3, 'Time', 80, 1, 5, 'Active', NULL, NULL, '2022-11-15 08:34:45', '2022-11-15 12:10:25'),
(277, 3, 10, 20, 3, 'Time', 60, 1, 231, 'Active', NULL, NULL, '2022-11-15 08:34:45', '2022-11-15 12:10:43'),
(278, 3, 10, 21, 3, 'Time', 60, 1, 20, 'Active', NULL, NULL, '2022-11-15 08:34:45', '2022-11-15 12:11:03')
)
select at_id, aicicc_attempt, AVG(aicicc_tp) from cte group by 1,2
Result :

SQL help for selecting most recent non-Null value for a unique plant

I have a SQL Server table with data on various factories (plants), with rows identified by a root plant ID, and a sub plant ID. The root ID is the same for the facility for its entire life. And the sub ID is added each time the plant data is changed with the regulatory agency.
Sometimes when the plant data was re-filed with the regulator, only the changed data was submitted, and other fields were left blank (Null).
I'm looking for an elegant way to write a query that will return all of the data from the most recent sub ID record, except that for Capacity, it will pull the most recent sub for which a non-Null Capacity was actually specified.
Assume that these are the fields in the Plant table:
RecordId (primary key)
RootId
SubId
Fuel
Capacity
Here is the SQL for selecting the data for the most recent SubId:
SELECT p1.* FROM Plant as p1
WHERE
p1.SubId = (
SELECT TOP 1 p2.SubId FROM Plant as p2
WHERE p1.RootId = p2.RootId
ORDER BY p2.SubId DESC)
I've been thinking about this for a while, but haven't come up with an approach. Even just a push in the right direction would be appreciated. Here is some SQL code to generate sample data:
CREATE TABLE Plant (
RecordId INTEGER PRIMARY KEY,
RootId VARCHAR(12) not null,
SubID INTEGER not null,
Fuel INTEGER not null,
Capacity DECIMAL(10,4)
);
INSERT INTO Plant
VALUES
(451, 'PLT03-39', 3, 1, 4399.67),
(471, 'PLT03-39', 4, 1, 4399.67),
(1809, 'PLT03-39', 5, 1, 4399.67),
(4888, 'PLT03-39', 6, 1, Null),
(6111, 'PLT03-39', 7, 1, Null),
(450, 'PLT03-40', 3, 1, 15531.67),
(472, 'PLT03-40', 4, 1, Null),
(1810, 'PLT03-40', 5, 1, 14767.61),
(4882, 'PLT03-40', 6, 1, Null),
(6113, 'PLT03-40', 7, 1, Null),
(454, 'PLT03-41', 5, 1, 23726.34),
(455, 'PLT03-41', 6, 1, 23726.34),
(469, 'PLT03-41', 7, 1, 23726.34),
(1807, 'PLT03-41', 8, 1, 22850.96),
(4884, 'PLT03-41', 9, 1, 22850.96),
(6110, 'PLT03-41', 10, 1, 22850.96),
(452, 'PLT03-42', 3, 1, 9120.65),
(470, 'PLT03-42', 4, 1, Null),
(1808, 'PLT03-42', 5, 1, 9120.65),
(4883, 'PLT03-42', 6, 1, 9120.65),
(6109, 'PLT03-42', 7, 1, Null),
(449, 'PLT03-43', 4, 1, 7923.96),
(474, 'PLT03-43', 5, 1, 7923.96),
(1811, 'PLT03-43', 6, 1, 7357.24),
(4881, 'PLT03-43', 7, 1, Null),
(5107, 'PLT03-43', 7, 1, 7711.44),
(5133, 'PLT03-43', 7, 1, Null),
(6112, 'PLT03-43', 8, 1, 7711.44),
(98, 'PLT05-25', 2, 18, 26.565),
(528, 'PLT05-25', 2, 18, 26033.7),
(139, 'PLT05-25', 2, 18, 26565),
(380, 'PLT05-25', 2, 18, Null),
(381, 'PLT05-25', 2, 18, 51854.88),
(7398, 'PLT06-143', 0, 18, 4091.01),
(4112, 'PLT06-143', 1, 18, 4091.01),
(5309, 'PLT06-143', 2, 18, 4091.01),
(73982, 'PLT06-143', 2, 18, 4091.01),
(73981, 'PLT06-143', 3, 18, Null),
(7397, 'PLT06-145', 0, 18, 4091.01),
(73971, 'PLT06-145', 1, 18, 4091.01),
(4109, 'PLT06-145', 1, 18, Null),
(5314, 'PLT06-145', 2, 18, 4091.01),
(73972, 'PLT06-145', 2, 18, Null),
(73973, 'PLT06-145', 3, 18, 4091.01),
(177, 'PLT06-342', 2, 1, 35420),
(1307, 'PLT06-342', 3, 1, 30360),
(5946, 'PLT06-342', 4, 1, 30360),
(6220, 'PLT06-342', 5, 1, Null),
(13264, 'PLT06-342', 6, 1, Null),
(1312, 'PLT06-344', 2, 1, 15180),
(5106, 'PLT06-344', 3, 1, 15180),
(5945, 'PLT06-344', 4, 1, 15180),
(6218, 'PLT06-344', 5, 1, Null),
(10550, 'PLT06-344', 6, 1, 10120),
(13271, 'PLT06-344', 7, 1, 10120),
(2724, 'PLT06-87', 2, 6, 143.451),
(5039, 'PLT06-87', 3, 6, 143.451),
(5886, 'PLT06-87', 4, 6, Null),
(10586, 'PLT06-87', 5, 6, 143.451),
(22759, 'PLT06-87', 6, 6, Null),
(158, 'PLT07-234', 1, 18, 21274.77),
(341, 'PLT07-234', 2, 18, 21274.77),
(7813, 'PLT07-234', 3, 18, 21274.77),
(24562, 'PLT07-234', 4, 18, Null),
(24584, 'PLT07-234', 4, 18, 2488.508),
(5965, 'PLT07-328', 2, 1, 19607.5),
(6073, 'PLT07-328', 2, 1, 19607.5),
(5996, 'PLT07-328', 2, 1, 19607.5),
(6644, 'PLT07-328', 3, 1, 19607.5),
(6701, 'PLT07-328', 3, 1, Null),
(7664, 'PLT07-328', 4, 1, Null),
(227, 'PLT07-39', 2, 18, 50347),
(1269, 'PLT07-39', 3, 18, 50258.45),
(1821, 'PLT07-39', 4, 18, 50258.45),
(1976, 'PLT07-39', 4, 18, 50258.45),
(5282, 'PLT07-39', 5, 18, Null),
(374, 'PLT08-25', 2, 18, 55331.1),
(135, 'PLT08-25', 2, 18, 30.36),
(134, 'PLT08-25', 2, 18, 56.925),
(533, 'PLT08-25', 2, 18, 55.7865),
(93, 'PLT08-25', 2, 18, 56.925),
(4081, 'PLT08-437', 1, 18, 5206.74),
(4241, 'PLT08-437', 2, 18, 5206.74),
(4242, 'PLT08-437', 3, 18, 5206.74),
(4532, 'PLT08-437', 4, 18, 4946.656),
(24344, 'PLT08-437', 5, 18, Null),
(460, 'PLT10-574', 0, 18, 198207.284),
(943, 'PLT10-574', 2, 18, 198207.284),
(1248, 'PLT10-574', 3, 18, 198207.284),
(2371, 'PLT10-574', 4, 18, 198207.284),
(6173, 'PLT10-574', 5, 18, 198207.284),
(17787, 'PLT10-574', 6, 18, 198207.284),
(23533, 'PLT10-574', 7, 18, 198207.284)
;
And here is the expected result of the query I'm seeking:
RecordId RootId SubId Fuel Capacity
6111 PLT03-39 7 1 4399.67
6113 PLT03-40 7 1 14767.61
6110 PLT03-41 10 1 22850.96
6109 PLT03-42 7 1 9120.65
6112 PLT03-43 8 1 7711.44
381 PLT05-25 2 18 51854.88
7398 PLT06-143 3 18 4091.01
7397 PLT06-145 3 18 4091.01
13264 PLT06-342 6 1 30360
13271 PLT06-344 7 1 10120
22759 PLT06-87 6 6 143.451
24584 PLT07-234 4 18 2488.508
7664 PLT07-328 4 1 19607.5
5282 PLT07-39 5 18 50258.45
93 PLT08-25 2 18 56.925
24344 PLT08-437 5 18 4946.656
23533 PLT10-574 7 18 198207.284
Below is one solution to this problem. I used a CTE and the MAX aggregate to determine the latest RecordId for each RootId, which assumes that the highest RecordId belongs to the most recent filing. After joining that back to the Plant table, I used an OUTER APPLY to retrieve the most recent non-NULL capacity.
WITH LATEST AS
(
SELECT RootId, MAX(RecordId) AS RecordId
FROM Plant
GROUP BY RootId
)
SELECT
P.RecordId
, P.RootId
, P.SubID
, P.Fuel
, CAP.Capacity
FROM
LATEST AS L
JOIN Plant AS P
ON L.RecordId = P.RecordId
OUTER APPLY
(
SELECT TOP 1 Capacity
FROM Plant
WHERE RootId = P.RootId AND Capacity IS NOT NULL
ORDER BY SubID DESC
) AS CAP
ORDER BY
L.RootId
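The same idea can be tried out without a SQL Server instance. The following is a rough sketch in Python with sqlite3, under some assumptions: SQLite has no OUTER APPLY, so a correlated scalar subquery with LIMIT 1 stands in for it; the latest row is picked by SubId (with RecordId as a tie-breaker) rather than by MAX(RecordId); and only a handful of the sample rows are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Plant (RecordId INTEGER PRIMARY KEY, RootId TEXT NOT NULL, "
    "SubId INTEGER NOT NULL, Fuel INTEGER NOT NULL, Capacity REAL)"
)
# A subset of the question's sample data.
conn.executemany("INSERT INTO Plant VALUES (?, ?, ?, ?, ?)", [
    (451, "PLT03-39", 3, 1, 4399.67),
    (4888, "PLT03-39", 6, 1, None),
    (6111, "PLT03-39", 7, 1, None),
    (450, "PLT03-40", 3, 1, 15531.67),
    (1810, "PLT03-40", 5, 1, 14767.61),
    (6113, "PLT03-40", 7, 1, None),
])

query = """
SELECT p.RecordId, p.RootId, p.SubId, p.Fuel,
       -- most recent filing of this plant that actually specified a capacity
       (SELECT c.Capacity FROM Plant AS c
         WHERE c.RootId = p.RootId AND c.Capacity IS NOT NULL
         ORDER BY c.SubId DESC, c.RecordId DESC LIMIT 1) AS Capacity
FROM Plant AS p
-- keep only the most recent filing of each plant
WHERE p.RecordId = (SELECT l.RecordId FROM Plant AS l
                     WHERE l.RootId = p.RootId
                     ORDER BY l.SubId DESC, l.RecordId DESC LIMIT 1)
ORDER BY p.RootId
"""
result = conn.execute(query).fetchall()
print(result)
```

Each plant appears once, with the identifying columns from its latest filing and the capacity carried forward from the last filing that supplied one.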

How can I prevent SQL Server from squaring the number of rows scanned?

I'm running a query over a table variable that holds 22 227 rows. The query used to take 2-3 seconds to complete (which I still think is too slow) but since I added another field to the ORDER BY clause in DENSE_RANK() it now completes in 4.5 minutes!
If I include [t2].[aisdt] with or without [t2].[aiID], the execution plan shows that it's scanning 494 039 529 rows, which is 22 227 squared. The following query generates the correct results, just much too slowly to be useful.
SELECT MAX([t].[SetNum]) OVER (PARTITION BY NULL) AS [MaxSet]
,*
FROM (
SELECT DENSE_RANK() OVER (ORDER BY [t2].[aisdt], [t2].[aiID]) AS [SetNum]
,[t2].*
FROM (
SELECT [aiID]
,COUNT(DISTINCT [acID]) AS [noac]
FROM @Temp
GROUP BY [aiID]
) [t1]
JOIN @Temp [t2]
ON [t2].[aiID] = [t1].[aiID]
WHERE [t1].[noac] < [t2].[asm]
) [t]
Just to be clear, the culprit is the [t2].[aisdt] field in "DENSE_RANK() OVER (ORDER BY [t2].[aisdt], [t2].[aiID])". Removing this field (which needs to remain) drops the execution time back down to 2-3 seconds. I think it might have something to do with joining the table to itself on [aiID] but not [aisdt].
How can I speed this query up to complete in the same time as before, or less?
EDIT
Table definition:
DECLARE @Temp TABLE (
[aiID] INT NOT NULL INDEX [IX_Temp_aiID] -- not unique
,[aisdt] DATETIME NOT NULL INDEX [IX_Temp_aisdt] -- not unique
,[asm] INT NOT NULL
,[cpcID] INT NULL
,[cpce] VARCHAR(10) NULL
,[acID] INT NULL
,[ctvID] INT NULL
,[ct] VARCHAR(100) NULL
,[_36_other_non_matched_fields_] VARCHAR(MAX)
,UNIQUE ([aiID], [cpcID], [cpce], [acID], [ctvID], [ct])
)
[aisdt] is unique per [aiID], but there can be multiple [aiID]s with the same [aisdt].
INSERT INTO @Temp
VALUES (64, '2017-03-23 10:00:00', 1, 17, '', NULL, NULL, NULL, 'blah')
,(64, '2017-03-23 10:00:00', 1, 34, '', NULL, NULL, NULL, 'blah')
,(99, '2017-04-08 09:00:00', 1, 25, 'Y', NULL, NULL, NULL, 'blah')
,(99, '2017-04-08 09:00:00', 1, 16, 'Y', NULL, NULL, NULL, 'blah')
,(99, '2017-04-08 09:00:00', 1, 76, 'Y', NULL, NULL, NULL, 'blah')
,(99, '2017-04-08 09:00:00', 1, 82, 'Y', NULL, NULL, NULL, 'blah')
,(42, '2017-04-14 16:00:00', 2, 32, '', 32, NULL, NULL, 'blah')
,(42, '2017-04-14 16:00:00', 2, 32, '', 47, NULL, NULL, 'blah')
,(42, '2017-04-14 16:00:00', 2, 47, '', 32, NULL, NULL, 'blah')
,(42, '2017-04-14 16:00:00', 2, 47, '', 47, NULL, NULL, 'blah')
,(54, '2017-03-23 10:00:00', 1, 17, '', NULL, NULL, NULL, 'blah')
,(54, '2017-03-23 10:00:00', 1, 34, '', NULL, NULL, NULL, 'blah')
,(89, '2017-04-08 09:00:00', 1, 25, 'Y', NULL, NULL, NULL, 'blah')
,(89, '2017-04-08 09:00:00', 1, 16, 'Y', NULL, NULL, NULL, 'blah')
,(89, '2017-04-08 09:00:00', 1, 76, 'Y', NULL, NULL, NULL, 'blah')
,(89, '2017-04-08 09:00:00', 1, 82, 'Y', NULL, NULL, NULL, 'blah')
,(32, '2017-04-14 16:00:00', 3, 32, '', 32, NULL, NULL, 'blah')
,(32, '2017-04-14 16:00:00', 3, 32, '', 47, NULL, NULL, 'blah')
,(32, '2017-04-14 16:00:00', 3, 47, '', 32, NULL, NULL, 'blah')
,(32, '2017-04-14 16:00:00', 3, 47, '', 47, NULL, NULL, 'blah')
It must be sorted by [aisdt] (datetime) first, then [aiID], then numbered into sets based on [aiID].
I want to see:
5, 1, 54, '2017-03-23 10:00:00', 1, 17, '', NULL, NULL, NULL, 'blah'
5, 1, 54, '2017-03-23 10:00:00', 1, 34, '', NULL, NULL, NULL, 'blah'
5, 2, 64, '2017-03-23 10:00:00', 1, 17, '', NULL, NULL, NULL, 'blah'
5, 2, 64, '2017-03-23 10:00:00', 1, 34, '', NULL, NULL, NULL, 'blah'
5, 3, 89, '2017-04-08 09:00:00', 1, 25, 'Y', NULL, NULL, NULL, 'blah'
5, 3, 89, '2017-04-08 09:00:00', 1, 16, 'Y', NULL, NULL, NULL, 'blah'
5, 3, 89, '2017-04-08 09:00:00', 1, 76, 'Y', NULL, NULL, NULL, 'blah'
5, 3, 89, '2017-04-08 09:00:00', 1, 82, 'Y', NULL, NULL, NULL, 'blah'
5, 4, 99, '2017-04-08 09:00:00', 1, 25, 'Y', NULL, NULL, NULL, 'blah'
5, 4, 99, '2017-04-08 09:00:00', 1, 16, 'Y', NULL, NULL, NULL, 'blah'
5, 4, 99, '2017-04-08 09:00:00', 1, 76, 'Y', NULL, NULL, NULL, 'blah'
5, 4, 99, '2017-04-08 09:00:00', 1, 82, 'Y', NULL, NULL, NULL, 'blah'
5, 5, 32, '2017-04-14 16:00:00', 3, 32, '', 32, NULL, NULL, 'blah'
5, 5, 32, '2017-04-14 16:00:00', 3, 32, '', 47, NULL, NULL, 'blah'
5, 5, 32, '2017-04-14 16:00:00', 3, 47, '', 32, NULL, NULL, 'blah'
5, 5, 32, '2017-04-14 16:00:00', 3, 47, '', 47, NULL, NULL, 'blah'
The main idea is taken from "Partition Function COUNT() OVER possible using DISTINCT", which @Jayvee pointed out, with a small addition that makes it work when acID has NULL values.
Most likely you can remove all indexes from your @Temp table variable. The server will have to sort it in several different ways for the different window functions anyway, but there is no self-join, so it should be faster.
The plan will have many sorts, and they can also be slow, especially when the engine underestimates the number of rows in a table. A table variable is exactly this case: the optimiser assumes that a table variable has only 1 row. So I'd recommend using a classic #Temp table here, even without indexes.
An index on (aiID, acID) should help, but there will be other sorts anyway.
WITH
CTE_Counts
AS
(
SELECT
*
-- use DENSE_RANK() to calculate COUNT(DISTINCT)
, DENSE_RANK() OVER (PARTITION BY [aiID] ORDER BY [acID])
+ DENSE_RANK() OVER (PARTITION BY [aiID] ORDER BY [acID] DESC)
-- subtract extra 1 if acID has NULL values within the partition
- MAX(CASE WHEN [acID] IS NULL THEN 1 ELSE 0 END) OVER (PARTITION BY [aiID])
- 1 AS [noac]
FROM #Temp
)
,CTE_SetNum
AS
(
SELECT
*
, DENSE_RANK() OVER (ORDER BY [aisdt], [aiID]) AS [SetNum]
FROM CTE_Counts
WHERE [noac] < [asm]
)
SELECT
*
, MAX([SetNum]) OVER () AS [MaxSet]
FROM CTE_SetNum
ORDER BY
[aisdt]
,[aiID]
,[SetNum]
;
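The DENSE_RANK() trick used in CTE_Counts can be checked against a plain COUNT(DISTINCT ...) on a few rows. This is a hedged sketch in Python with sqlite3 (SQLite 3.25 or later is assumed for window functions); the table name temp_rows is made up, and the rows for aiID 7 are hypothetical additions that mix NULL and non-NULL acID values in one partition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temp_rows (aiID INTEGER, acID INTEGER)")
conn.executemany("INSERT INTO temp_rows VALUES (?, ?)", [
    (42, 32), (42, 47), (42, 32), (42, 47),   # two distinct acID values
    (64, None), (64, None),                   # only NULLs: count should be 0
    (7, None), (7, 5),                        # hypothetical mixed partition
])

# Ascending rank + descending rank - 1 equals the number of distinct values
# in the partition, counting NULL as one value, so subtract 1 more if the
# partition contains any NULL.
windowed = dict(conn.execute("""
    SELECT DISTINCT aiID,
           DENSE_RANK() OVER (PARTITION BY aiID ORDER BY acID)
         + DENSE_RANK() OVER (PARTITION BY aiID ORDER BY acID DESC)
         - MAX(CASE WHEN acID IS NULL THEN 1 ELSE 0 END)
               OVER (PARTITION BY aiID)
         - 1 AS noac
    FROM temp_rows
""").fetchall())

# Reference result using the aggregate form.
grouped = dict(conn.execute(
    "SELECT aiID, COUNT(DISTINCT acID) FROM temp_rows GROUP BY aiID"
).fetchall())

print(windowed, grouped)
```

Both dictionaries should agree for every aiID, which is what lets the window form replace the self-join against the grouped subquery.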
An index as suggested in the comments would definitely play a major part, but I also think you can rewrite the query without the self-join, like this:
SELECT MAX([t].[SetNum]) OVER (PARTITION BY NULL) AS [MaxSet]
,*
FROM (
SELECT DENSE_RANK() OVER (ORDER BY [t1].[aisdt], [t1].[aiID]) AS [SetNum]
,[t1].*
FROM (
SELECT * ,dense_rank() over(partition by aiID order by [acID]) +
dense_rank() over(partition by aiID order by [acID] desc) - 1 AS [noac] -- ascending + descending rank - 1 = distinct count (NULL counts as a value)
FROM #Temp
) [t1]
WHERE [t1].[noac] < [t1].[asm]
) [t]

How to select unique subsequences in SQL?

In generic terms, I have a sequence of events from which I'd like to select the unique, non-repeated sequences using MS SQL Server 2008 R2.
Specifically, in this case each test has a series of recordings, each of which has a specific sequence of stimuli. I'd like to select the unique sequences of stimuli from the recordings of one test, insert them into another table, and assign the sequence group ID to the original table.
DECLARE @Sequence TABLE
([ID] INT
,[TestID] INT
,[StimulusID] INT
,[RecordingID] INT
,[PositionInRecording] INT
,[SequenceGroupID] INT
)
INSERT @Sequence
VALUES
(1, 1, 101, 1000, 1, NULL),
(2, 1, 102, 1000, 2, NULL),
(3, 1, 103, 1000, 3, NULL),
(4, 1, 103, 1001, 1, NULL),
(5, 1, 103, 1001, 2, NULL),
(6, 1, 101, 1001, 3, NULL),
(7, 1, 102, 1002, 1, NULL),
(8, 1, 103, 1002, 2, NULL),
(9, 1, 101, 1002, 3, NULL),
(10, 1, 102, 1003, 1, NULL),
(11, 1, 103, 1003, 2, NULL),
(12, 1, 101, 1003, 3, NULL),
(13, 2, 106, 1004, 1, NULL),
(14, 2, 107, 1004, 2, NULL),
(15, 2, 107, 1005, 1, NULL),
(16, 2, 106, 1005, 2, NULL)
After correctly identifying the unique sequences, the results should look like this:
DECLARE @SequenceGroup TABLE
([ID] INT
,[TestID] INT
,[SequenceGroupName] NVARCHAR(50)
)
INSERT @SequenceGroup VALUES
(1, 1, '101-102-103'),
(2, 1, '103-103-101'),
(3, 1, '102-103-101'),
(4, 2, '106-107'),
(5, 2, '107-106')
DECLARE @OutcomeSequence TABLE
([ID] INT
,[TestID] INT
,[StimulusID] INT
,[RecordingID] INT
,[PositionInRecording] INT
,[SequenceGroupID] INT
)
INSERT @OutcomeSequence
VALUES
(1, 1, 101, 1000, 1, 1),
(2, 1, 102, 1000, 2, 1),
(3, 1, 103, 1000, 3, 1),
(4, 1, 103, 1001, 1, 2),
(5, 1, 103, 1001, 2, 2),
(6, 1, 101, 1001, 3, 2),
(7, 1, 102, 1002, 1, 3),
(8, 1, 103, 1002, 2, 3),
(9, 1, 101, 1002, 3, 3),
(10, 1, 102, 1003, 1, 3),
(11, 1, 103, 1003, 2, 3),
(12, 1, 101, 1003, 3, 3),
(13, 2, 106, 1004, 1, 4),
(14, 2, 107, 1004, 2, 4),
(15, 2, 107, 1005, 1, 5),
(16, 2, 106, 1005, 2, 5)
This is fairly easy to do in MySQL and other databases that support some version of GROUP_CONCAT functionality. It's apparently a good deal harder in SQL Server. Here's a stackoverflow question that discusses one technique. Here's another with some information about SQL Server 2008 specific solutions that might also get you started.
This will do it. I had to add a column to @SequenceGroup.
DECLARE @Sequence TABLE
([ID] INT
,[TestID] INT
,[StimulusID] INT
,[RecordingID] INT
,[PositionInRecording] INT
,[SequenceGroupID] INT
)
INSERT @Sequence
VALUES
(1, 1, 101, 1000, 1, NULL),
(2, 1, 102, 1000, 2, NULL),
(3, 1, 103, 1000, 3, NULL),
(4, 1, 103, 1001, 1, NULL),
(5, 1, 103, 1001, 2, NULL),
(6, 1, 101, 1001, 3, NULL),
(7, 1, 102, 1002, 1, NULL),
(8, 1, 103, 1002, 2, NULL),
(9, 1, 101, 1002, 3, NULL),
(10, 1, 102, 1003, 1, NULL),
(11, 1, 103, 1003, 2, NULL),
(12, 1, 101, 1003, 3, NULL),
(13, 2, 106, 1004, 1, NULL),
(14, 2, 107, 1004, 2, NULL),
(15, 2, 107, 1005, 1, NULL),
(16, 2, 106, 1005, 2, NULL)
DECLARE @SequenceGroup TABLE
([ID] INT IDENTITY(1, 1)
,[TestID] INT
,[SequenceGroupName] NVARCHAR(50)
,[RecordingID] INT
)
-- build one '-'-separated name per recording
-- (note: identical sequences in different recordings still get separate rows)
insert into @SequenceGroup (TestID, SequenceGroupName, RecordingID)
select TestID, (stuff((select '-' + cast([StimulusID] as nvarchar(100))
from @Sequence t1
where t2.RecordingID = t1.RecordingID
order by t1.PositionInRecording -- keep the stimuli in recording order
for xml path('')), 1, 1, '')), RecordingID
from @Sequence t2
group by RecordingID, TestID
order by RecordingID
select * from @SequenceGroup
-- write the generated group ID back to the original rows
update s
set SequenceGroupID = sg.ID
from @Sequence s
join @SequenceGroup sg on s.RecordingID=sg.RecordingID and s.TestID=sg.TestID
select * from @Sequence
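For comparison, the grouping that the question's expected output describes (one ID per distinct sequence within a test, shared by recordings with the same sequence) can be sketched outside the database. This is plain Python, not T-SQL, and the variable names are made up for the illustration; the rows are the question's sample data without the SequenceGroupID column.

```python
from collections import defaultdict

# (ID, TestID, StimulusID, RecordingID, PositionInRecording)
rows = [
    (1, 1, 101, 1000, 1), (2, 1, 102, 1000, 2), (3, 1, 103, 1000, 3),
    (4, 1, 103, 1001, 1), (5, 1, 103, 1001, 2), (6, 1, 101, 1001, 3),
    (7, 1, 102, 1002, 1), (8, 1, 103, 1002, 2), (9, 1, 101, 1002, 3),
    (10, 1, 102, 1003, 1), (11, 1, 103, 1003, 2), (12, 1, 101, 1003, 3),
    (13, 2, 106, 1004, 1), (14, 2, 107, 1004, 2),
    (15, 2, 107, 1005, 1), (16, 2, 106, 1005, 2),
]

# Collect (position, stimulus) pairs per recording.
by_recording = defaultdict(list)
for _id, test_id, stimulus_id, recording_id, position in rows:
    by_recording[(test_id, recording_id)].append((position, stimulus_id))

group_ids = {}        # (TestID, sequence name) -> SequenceGroupID
recording_group = {}  # RecordingID -> SequenceGroupID
for (test_id, recording_id), items in sorted(by_recording.items()):
    # Order the stimuli by position and join them into a name like '101-102-103'.
    name = "-".join(str(stimulus) for _, stimulus in sorted(items))
    key = (test_id, name)
    if key not in group_ids:
        group_ids[key] = len(group_ids) + 1  # next free group ID
    recording_group[recording_id] = group_ids[key]

print(group_ids)
print(recording_group)
```

Here recordings 1002 and 1003 end up in the same group because their stimuli appear in the same order, which is the behaviour the expected @OutcomeSequence table shows.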

How do I write this MySQL query to get the correct information? (Subquery, multiple subqueries)

Here is the query that I am working on:
SELECT `unitid`, `name` FROM apartmentunits
WHERE aptid = (
SELECT `aptid` FROM rentconditionsmap WHERE rentcondid = 1 AND condnum = 1
)
What I am having trouble figuring out is how to extend this with more rent-condition limiters, so that the list is filtered down further.
SELECT `aptid` FROM rentconditionsmap WHERE rentcondid = 1 AND condnum = 1
Data:
CREATE TABLE IF NOT EXISTS `rentconditionsmap` (
`rcid` bigint(10) unsigned NOT NULL AUTO_INCREMENT,
`rentcondid` int(3) unsigned NOT NULL,
`condnum` tinyint(3) unsigned NOT NULL,
`aptid` bigint(10) unsigned DEFAULT NULL,
PRIMARY KEY (`rcid`), KEY `aptid` (`aptid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=18 ;
INSERT INTO `rentconditionsmap`
(`rcid`, `rentcondid`, `condnum`, `aptid`)
VALUES
(1, 1, 1, 1),
(2, 2, 1, 1),
(3, 3, 0, 1),
(4, 4, 1, 1),
(5, 8, 0, 1);
CREATE TABLE IF NOT EXISTS `apartmentunits` (
`unitid` bigint(10) NOT NULL AUTO_INCREMENT,
`aptid` bigint(10) NOT NULL,
`name` varchar(6) NOT NULL,
`verified` tinyint(1) NOT NULL DEFAULT '0',
`rentcost` int(4) unsigned DEFAULT NULL,
`forrent` tinyint(1) NOT NULL DEFAULT '0',
`unittypekey` varchar(2) DEFAULT NULL,
`sqft` smallint(6) DEFAULT NULL,
PRIMARY KEY (`unitid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=121 ;
INSERT INTO `apartmentunits`
(`unitid`, `aptid`, `name`, `verified`, `rentcost`, `forrent`, `unittypekey`, `sqft`)
VALUES
(1, 1, '3', 1, 540, 0, '2B', NULL),
(2, 1, '5', 1, NULL, 0, '2B', NULL),
(3, 1, '7', 1, NULL, 0, '2B', NULL),
(53, 1, '1', 1, NULL, 0, '2B', NULL),
(54, 1, '2', 1, NULL, 0, '2B', NULL),
(55, 1, '4', 1, 570, 0, '2B', NULL),
(56, 1, '6', 1, NULL, 0, '2B', NULL),
(57, 1, '8', 1, NULL, 0, '2B', NULL),
(58, 1, '9', 1, NULL, 0, '2B', NULL),
(59, 1, '10', 1, NULL, 0, '2B', NULL),
(60, 1, '11', 1, NULL, 0, '2B', NULL);
As Eric J said as a comment:
Try changing = to IN
SELECT `unitid`, `name` FROM apartmentunits
WHERE `aptid` IN (
SELECT `aptid` FROM rentconditionsmap WHERE rentcondid = 1 AND condnum = 1
)
why not:
SELECT unitid, name
FROM apartmentunits a
INNER JOIN rentconditionsmap r on a.aptid = r.aptid
WHERE (rentcondid = 1 and condnum = 1) OR (rentcondid = 2 and condnum = 2)
Using ANSI-92 join syntax:
SELECT au.unitid,
au.name
FROM APARTMENTUNITS au
JOIN RENTCONDITIONSMAP rcm ON rcm.aptid = au.aptid
AND rcm.rentcondid = 1
AND rcm.condnum = 1
Use a JOIN (below is the older implicit comma-join syntax, which both T-SQL and MySQL accept; you can also use an explicit INNER JOIN).
SELECT apartmentunits.unitid, apartmentunits.name
FROM apartmentunits, rentconditionsmap
WHERE apartmentunits.aptid = rentconditionsmap.aptid
AND rentconditionsmap.rentcondid = 1
AND rentconditionsmap.condnum = 1
-- AND whatever else...
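If the goal is that an apartment must satisfy several conditions at once, one common pattern is to match any of the wanted (rentcondid, condnum) pairs and then keep only the aptid values that matched all of them. The sketch below tries this in Python with sqlite3 rather than MySQL, with trimmed-down table definitions; apartment 2, its single condition row, and unit 100 are hypothetical additions so that there is something to filter out, and the second condition pair (2, 1) is just an example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rentconditionsmap (
    rcid INTEGER PRIMARY KEY, rentcondid INTEGER, condnum INTEGER, aptid INTEGER);
CREATE TABLE apartmentunits (
    unitid INTEGER PRIMARY KEY, aptid INTEGER, name TEXT);
INSERT INTO rentconditionsmap VALUES
    (1, 1, 1, 1), (2, 2, 1, 1), (3, 3, 0, 1), (4, 4, 1, 1), (5, 8, 0, 1),
    (6, 1, 1, 2);   -- hypothetical apartment 2: meets only condition 1
INSERT INTO apartmentunits VALUES
    (1, 1, '3'), (2, 1, '5'),
    (100, 2, 'A');  -- hypothetical unit in apartment 2
""")

# One condition: both apartments qualify.
one_condition = conn.execute("""
    SELECT unitid, name FROM apartmentunits
    WHERE aptid IN (SELECT aptid FROM rentconditionsmap
                    WHERE rentcondid = 1 AND condnum = 1)
    ORDER BY unitid
""").fetchall()

# Two conditions: an apartment must match both pairs, hence COUNT(...) = 2.
two_conditions = conn.execute("""
    SELECT unitid, name FROM apartmentunits
    WHERE aptid IN (SELECT aptid FROM rentconditionsmap
                    WHERE (rentcondid = 1 AND condnum = 1)
                       OR (rentcondid = 2 AND condnum = 1)
                    GROUP BY aptid
                    HAVING COUNT(DISTINCT rentcondid) = 2)
    ORDER BY unitid
""").fetchall()

print(one_condition)
print(two_conditions)
```

Adding a further limiter then means adding one more OR pair and raising the number in the HAVING clause to match.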