Retrieving a subset of a group of data in SQL or SAS

My dataset looks like the table below.
ARR INST DUE_DATE
1 1 1-Dec
1 2 8-Dec
1 3 15-Dec
1 4 22-Dec
2 1 1-Dec
2 2 8-Dec
3 1 5-Dec
3 2 12-Dec
3 3 19-Dec
4 1 6-Nov
4 2 13-Nov
4 3 20-Nov
4 4 27-Nov
4 5 4-Dec
4 6 11-Dec
5 1 1-Jan
5 2 7-Jan
5 3 13-Jan
5 4 20-Jan
5 5 27-Jan
5 6 3-Feb
5 7 10-Feb
5 8 17-Feb
5 9 23-Feb
5 10 24-Feb
I need to retrieve data for each arrangement based on the number of installments paid.
E.g., if the total number of installments for a particular arrangement is less than or equal to 4, the output should contain all installments up to the 4th.
If it is greater than four, the output should contain only the most recent set of four (installments 5 to 8 when there are 5 to 8 installments, 9 to 12 when there are 9 to 12, and so on).
The output should be something like:
ARR INST DUE_DATE
1 1 1-Dec
1 2 8-Dec
1 3 15-Dec
1 4 22-Dec
2 1 1-Dec
2 2 8-Dec
3 1 5-Dec
3 2 12-Dec
3 3 19-Dec
4 5 4-Dec
4 6 11-Dec
5 9 23-Feb
5 10 24-Feb
How can I get this output in either SQL Server or SAS Enterprise Guide?
Thanks.

You can use this.
-- Table variable holding the sample data
DECLARE @MyTable TABLE (ARR INT, INST INT, DUE_DATE VARCHAR(10))
INSERT INTO @MyTable VALUES
(1 , 1 , '1-Dec '),
(1 , 2 , '8-Dec '),
(1 , 3 , '15-Dec'),
(1 , 4 , '22-Dec'),
(2 , 1 , '1-Dec '),
(2 , 2 , '8-Dec '),
(3 , 1 , '5-Dec '),
(3 , 2 , '12-Dec'),
(3 , 3 , '19-Dec'),
(4 , 1 , '6-Nov '),
(4 , 2 , '13-Nov'),
(4 , 3 , '20-Nov'),
(4 , 4 , '27-Nov'),
(4 , 5 , '4-Dec '),
(4 , 6 , '11-Dec'),
(5 , 1 , '1-Jan '),
(5 , 2 , '7-Jan '),
(5 , 3 , '13-Jan'),
(5 , 4 , '20-Jan'),
(5 , 5 , '27-Jan'),
(5 , 6 , '3-Feb '),
(5 , 7 , '10-Feb'),
(5 , 8 , '17-Feb'),
(5 , 9 , '23-Feb'),
(5 , 10, '24-Feb'),
(5 , 11, '25-Feb'),
(5 , 12, '26-Feb'),
(6 , 1, '27-Feb')
-- Chunk size: installments are reported in sets of this many rows
DECLARE @numofinst INT = 4
SELECT ARR, INST, DUE_DATE FROM (
SELECT *,
RN = ROW_NUMBER() OVER(PARTITION BY ARR ORDER BY INST ),
CNT = COUNT(*) OVER(PARTITION BY ARR )
FROM @MyTable
) AS T
WHERE
-- keep only rows past the last complete chunk boundary before the final chunk
RN > (( CEILING( CAST( CNT AS decimal(18,2) ) / CAST( @numofinst AS decimal(18,2) )) - 1 ) * @numofinst)
Result:
ARR INST DUE_DATE
----------- ----------- ----------
1 1 1-Dec
1 2 8-Dec
1 3 15-Dec
1 4 22-Dec
2 1 1-Dec
2 2 8-Dec
3 1 5-Dec
3 2 12-Dec
3 3 19-Dec
4 5 4-Dec
4 6 11-Dec
5 9 23-Feb
5 10 24-Feb
5 11 25-Feb
5 12 26-Feb
6 1 27-Feb
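The filter in the WHERE clause above may be easier to follow outside SQL. The following is a minimal Python sketch of the same arithmetic, not part of the original answer; the function name last_chunk and the (ARR, INST) tuple layout are assumptions made for illustration.

```python
import math
from itertools import groupby

def last_chunk(rows, chunk_size=4):
    """Keep, per ARR, only the rows in the final (possibly partial) chunk.

    rows is a list of (arr, inst) tuples; the layout is an assumption.
    """
    kept = []
    for arr, group in groupby(sorted(rows), key=lambda r: r[0]):
        group = list(group)
        cnt = len(group)
        # Same arithmetic as the WHERE clause: rows with RN > threshold survive.
        threshold = (math.ceil(cnt / chunk_size) - 1) * chunk_size
        kept.extend(group[threshold:])
    return kept

rows = (
    [(1, i) for i in range(1, 5)]     # 4 installments: all are kept
    + [(4, i) for i in range(1, 7)]   # 6 installments: 5 and 6 are kept
    + [(5, i) for i in range(1, 11)]  # 10 installments: 9 and 10 are kept
)
print(last_chunk(rows))
```

With 6 rows the threshold is (ceil(6/4) - 1) * 4 = 4, so only rows 5 and 6 pass, which matches the result table above.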

For the case of sorted SAS data sets, or a remote data source delivering ordered data, the following DATA Step example shows how a double DOW loop can identify and output the rows belonging to the final 4-row chunk of each id:
data want(label="Rows from each ids last 4-row chunk");
do _n_ = 0 by 1 until (last.id);
set have;
by id sequence; %* by sequence not strictly necessary, but enforces the expectation of increasing sequence within id;
end;
_out_from_n = floor ( _n_ / 4 ) * 4;
do _n_ = 0 to _n_;
set have;
if _n_ >= _out_from_n then OUTPUT;
end;
drop _:;
run;
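The DATA Step counts rows from zero, so it uses floor(_n_ / 4) * 4 where the SQL Server answer uses (CEILING(CNT / 4) - 1) * 4. As a quick sanity check, here is a small Python sketch (the helper names are hypothetical) showing that the two expressions give the same starting offset for every group size tried:

```python
import math

def start_offset_zero_based(last_index, chunk=4):
    # last_index mirrors _n_ after the first loop: the row count minus one.
    return (last_index // chunk) * chunk

def start_offset_from_count(count, chunk=4):
    # Mirrors the SQL expression, which works from the 1-based row count.
    return (math.ceil(count / chunk) - 1) * chunk

for count in range(1, 41):
    assert start_offset_zero_based(count - 1) == start_offset_from_count(count)

print(start_offset_zero_based(5))   # group of 6 rows -> 4
print(start_offset_from_count(6))   # same group, SQL-style -> 4
```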

Alternatively, you could modify the code by Richard to use random read access with the SET Statement POINT= option as follows:
data want;
retain point 1;
drop point;
do _n_ = 0 by 1 until (last.arr);
set have;
by arr inst;
end;
do point = point+(floor(_n_/4)*4) to point+_n_;
set have point=point;
output;
end;
run;

Related

SQL Server Pivoting - No middle column, no aggregation

--EDIT: original table sample, requested in comments
job_id change_id change
1 1 5□6□
1 2 7□8□
1 3 9□10□
2 4 1□3□
This is a C# reflection of an object to serialise the data in the Change field.
The desired result is the following:
Job ID  Change ID  Change from  Change to
1       1          5            6
1       2          7            8
1       3          9            10
2       4          1            3
I managed to identify the delimiter character as CHAR(1), which let me split the field using the following query. It produces the unpivoted table below, which may or may not be useful (apparently not, as per the comments below, since the order of the split values is not guaranteed):
SELECT job_id, change_id, VALUE change
FROM change_table
CROSS APPLY STRING_SPLIT(change,CHAR(1))
Job ID  Change ID  Changes
1       1          5
1       1          6
1       1
1       2          7
1       2          8
1       2
1       3          9
1       3          10
1       3
2       4          1
2       4          3
2       4
It's kind of painful when delimited data has a trailing delimiter. Here is a simple solution using PARSENAME. I had to add an extra space back on the end because the PARSENAME function gets confused when the last character is a period.
declare @Changes table
(
job_id int
, change_id int
, change varchar(20)
)
insert @Changes values
(1, 1, '5 6 ')
, (1, 2, '7 8 ')
, (1, 3, '9 10 ')
, (2, 4, '1 3 ')
-- turn the delimiter into periods so PARSENAME can address each part
select c.job_id
, c.change_id
, ChangeFrom = parsename(replace(c.change, ' ', '.') + ' ', 3)
, ChangeTo = parsename(replace(c.change, ' ', '.') + ' ', 2)
from @Changes c
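For comparison, here is a minimal Python sketch of the same idea: split on the delimiter and discard the empty piece that the trailing delimiter leaves behind. It assumes the delimiter is CHAR(1) ("\x01"), as identified in the question, and that every value holds exactly two parts; the function name split_change is hypothetical.

```python
def split_change(change, delimiter="\x01"):
    """Split 'from<delim>to<delim>' into a (from, to) pair."""
    parts = change.split(delimiter)
    # A trailing delimiter produces a final empty string; drop it.
    if parts and parts[-1] == "":
        parts.pop()
    # Unpacking raises ValueError if the value does not hold exactly two parts.
    change_from, change_to = parts
    return change_from, change_to

print(split_change("5\x016\x01"))   # ('5', '6')
print(split_change("9\x0110\x01"))  # ('9', '10')
```

Splitting each row as a unit keeps the "from" and "to" values paired by position, which avoids the ordering problem of the unpivoted STRING_SPLIT result.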
Assuming the Changes value of the last of each set of three rows is '',
does this work for you?
SELECT
*,
'' blank
FROM (
SELECT
job_id,
change_id,
changes AS changes_from,
LEAD(changes) OVER (PARTITION BY job_id, change_id ORDER BY job_id) AS changes_to
FROM jobs
) j
WHERE changes_from != '' AND changes_to != ''
Output
job_id change_id changes_from changes_to blank
1 1 5 6
1 1 7 8
1 2 9 10
2 3 1 3
db<>fiddle here

Get data with condition

declare @t table (
Id int ,
Section int,
Moment date
);
insert into @t values
( 1 , 1 , '2014-01-01'),
( 2 , 1 , '2014-01-02'),
( 3 , 1 , '2014-01-03'),
( 4 , 1 , '2014-01-04'),
( 5 , 1 , '2014-01-05'),
( 6 , 2 , '2014-02-06'),
( 7 , 2 , '2014-02-07'),
( 8 , 2 , '2014-02-08'),
( 9 , 2 , '2014-02-09'),
( 10 , 2 , '2014-02-10'),
( 11 , 3 , '2014-03-11'),
( 12 , 3 , '2014-03-12'),
( 13 , 3 , '2014-03-13'),
( 14 , 3 , '2014-03-14'),
( 15 , 3 , '2014-03-15');
Selecting from it returns data like this:
select * from @t
Id Section Moment
1 1 2014-01-01
2 1 2014-01-02
3 1 2014-01-03
4 1 2014-01-04
5 1 2014-01-05
6 2 2014-02-06
7 2 2014-02-07
8 2 2014-02-08
9 2 2014-02-09
10 2 2014-02-10
11 3 2014-03-11
12 3 2014-03-12
13 3 2014-03-13
14 3 2014-03-14
15 3 2014-03-15
But I want the data like this, grouped in threes within each Section.
If any Section has 5 rows, it should produce 2 groups.
Id Section Moment Group by 3
1 1 1/1/2014 1
2 1 1/2/2014 1
3 1 1/3/2014 1
4 1 1/4/2014 2
5 1 1/5/2014 2
6 2 2/6/2014 3
7 2 2/7/2014 3
8 2 2/8/2014 3
9 2 2/9/2014 4
10 2 2/10/2014 4
11 3 3/11/2014 5
12 3 3/12/2014 5
13 3 3/13/2014 5
14 3 3/14/2014 6
15 3 3/15/2014 6
You can use window functions and arithmetic. The following enumerates within each section:
select (row_number() over (partition by section order by moment) + 2) / 3, t.*
from @t t;
Then applying dense_rank() gets what you want:
select dense_rank() over (order by section, tempcol) as group3,
t.*
from (select (row_number() over (partition by section order by moment) + 2) / 3 as tempcol, t.*
from @t t
) t
order by id
Here is a db<>fiddle.
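To make the two steps concrete, here is a small Python sketch of the same logic, written as an illustration rather than as part of the answer. The function name group_by_three and the (Id, Section, Moment) tuple layout are assumptions; the integer division (rn + 2) // 3 plays the role of the row_number() arithmetic, and the dictionary of sorted distinct keys plays the role of dense_rank().

```python
from itertools import groupby

def group_by_three(rows, size=3):
    """Return (group, id, section, moment) tuples, numbering groups globally."""
    ordered = sorted(rows, key=lambda r: (r[1], r[2]))
    keyed = []
    for section, members in groupby(ordered, key=lambda r: r[1]):
        for rn, row in enumerate(members, start=1):
            # Rows 1-3 of a section map to 1, rows 4-6 map to 2, and so on.
            tempcol = (rn + size - 1) // size
            keyed.append(((section, tempcol), row))
    # dense_rank(): consecutive integers over the distinct, sorted keys.
    distinct_keys = sorted({key for key, _ in keyed})
    ranks = {key: i for i, key in enumerate(distinct_keys, start=1)}
    return [(ranks[key], *row) for key, row in keyed]

# Same shape as the sample data: 3 sections of 5 rows each.
rows = [(i, (i - 1) // 5 + 1, f"2014-{(i - 1) // 5 + 1:02d}-{i:02d}")
        for i in range(1, 16)]
for line in group_by_three(rows):
    print(line)
```

Each section of 5 rows yields one full group of 3 and one partial group of 2, so the 15 sample rows fall into groups 1 to 6.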
I may have misinterpreted the question, but as far as I understand the problem, I believe this is what you are looking for. I have done it using a cursor. Hope it helps.
DECLARE @i int = 0 -- row count
DECLARE @GroupCount int = 1
DECLARE @Id int
DECLARE @Section int
DECLARE @Moment DateTime
declare @temp table (
SNO int,
Id int ,
Section int,
Moment date,
GroupedIn nvarchar(200)
);
DECLARE db_cursor CURSOR FOR
SELECT Id,Section,Moment
FROM @t
WHERE Section = 3 --suppose
ORDER BY Id -- fetch rows in a defined order
OPEN db_cursor
FETCH NEXT FROM db_cursor INTO @Id,@Section,@Moment
WHILE @@FETCH_STATUS = 0
BEGIN
Set @i=@i+1
Insert into @temp values(@i,@Id,@Section,@Moment,'G'+CONVERT(nvarchar(20),@GroupCount))
-- move to the next group after every third row
IF @i % 3 = 0 SET @GroupCount = @GroupCount + 1
FETCH NEXT FROM db_cursor INTO @Id,@Section,@Moment
END
CLOSE db_cursor
DEALLOCATE db_cursor
Select * from @temp

SQL Recursive CTE unexpectedly returns alternating sets

I am trying to use a recursive CTE to repeat the same pattern over and over, resetting when "Scenario" increases in value. RowNumber repeats 1-21 (as desired), but whenever "Scenario" is an even number, there are too few items in the "Vals" column to feed into "Value". I can't figure out which part of the code is causing me to be 1 short for only even Scenarios.
Below are the results of the code I'm using at the bottom.
Scenario RowNumber Value Vals
1 1 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C
1 2 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C
1 3 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C
1 4 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C
1 5 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C
1 6 A A,A,A,A,A,A,A,A,A,A,A,A,A,B,C
1 7 A A,A,A,A,A,A,A,A,A,A,A,A,B,C
1 8 A A,A,A,A,A,A,A,A,A,A,A,B,C
1 9 A A,A,A,A,A,A,A,A,A,A,B,C
1 10 A A,A,A,A,A,A,A,A,A,B,C
1 11 A A,A,A,A,A,A,A,A,B,C
1 12 A A,A,A,A,A,A,A,B,C
1 13 A A,A,A,A,A,A,B,C
1 14 A A,A,A,A,A,B,C
1 15 A A,A,A,A,B,C
1 16 A A,A,A,B,C
1 17 A A,A,B,C
1 18 A A,B,C
1 19 A B,C
1 20 B C
1 21 C
2 1 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,C
2 2 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,C
2 3 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,C
2 4 A A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,C
2 5 A A,A,A,A,A,A,A,A,A,A,A,A,B,B,C
2 6 A A,A,A,A,A,A,A,A,A,A,A,B,B,C
2 7 A A,A,A,A,A,A,A,A,A,A,B,B,C
2 8 A A,A,A,A,A,A,A,A,A,B,B,C
2 9 A A,A,A,A,A,A,A,A,B,B,C
2 10 A A,A,A,A,A,A,A,B,B,C
2 11 A A,A,A,A,A,A,B,B,C
2 12 A A,A,A,A,A,B,B,C
2 13 A A,A,A,A,B,B,C
2 14 A A,A,A,B,B,C
2 15 A A,A,B,B,C
2 16 A A,B,B,C
2 17 A B,B,C
2 18 B B,C
2 19 B C
2 20 C
2 21 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,C
3 1 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C,C
3 2 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C,C
3 3 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C,C
3 4 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,C,C
3 5 A A,A,A,A,A,A,A,A,A,A,A,A,A,B,C,C
3 6 A A,A,A,A,A,A,A,A,A,A,A,A,B,C,C
3 7 A A,A,A,A,A,A,A,A,A,A,A,B,C,C
3 8 A A,A,A,A,A,A,A,A,A,A,B,C,C
3 9 A A,A,A,A,A,A,A,A,A,B,C,C
3 10 A A,A,A,A,A,A,A,A,B,C,C
3 11 A A,A,A,A,A,A,A,B,C,C
3 12 A A,A,A,A,A,A,B,C,C
3 13 A A,A,A,A,A,B,C,C
3 14 A A,A,A,A,B,C,C
3 15 A A,A,A,B,C,C
3 16 A A,A,B,C,C
3 17 A A,B,C,C
3 18 A B,C,C
3 19 B C,C
3 20 C C
3 21 C
4 1 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,B,C
4 2 A A,A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,B,C
4 3 A A,A,A,A,A,A,A,A,A,A,A,A,A,B,B,B,C
4 4 A A,A,A,A,A,A,A,A,A,A,A,A,B,B,B,C
4 5 A A,A,A,A,A,A,A,A,A,A,A,B,B,B,C
4 6 A A,A,A,A,A,A,A,A,A,A,B,B,B,C
4 7 A A,A,A,A,A,A,A,A,A,B,B,B,C
4 8 A A,A,A,A,A,A,A,A,B,B,B,C
4 9 A A,A,A,A,A,A,A,B,B,B,C
4 10 A A,A,A,A,A,A,B,B,B,C
4 11 A A,A,A,A,A,B,B,B,C
4 12 A A,A,A,A,B,B,B,C
4 13 A A,A,A,B,B,B,C
4 14 A A,A,B,B,B,C
4 15 A A,B,B,B,C
4 16 A B,B,B,C
4 17 B B,B,C
4 18 B B,C
4 19 B C
4 20 C
This is the code I used to generate the above sample. Where am I going wrong?
CREATE TABLE #temp3
(
Scenario INT
,Vals VARCHAR(64)
,LEN INT
)
;
WITH vals AS
(
SELECT
v.*
FROM
(VALUES ('A'), ('B'), ('C')) v(x)
),
CTE AS
(
SELECT CAST('A' AS VARCHAR(MAX)) AS STR, 0 AS LEN
UNION ALL
SELECT (CTE.STR + ',' + vals.x), CTE.LEN + 1
FROM
CTE
JOIN vals
ON vals.x >= RIGHT(CTE.STR, 1)
WHERE CTE.LEN < 19
)
INSERT INTO #temp3
SELECT
ROW_NUMBER() OVER(ORDER BY STR + ',C') AS Scenario
,STR + ',C' AS Vals
,LEN
FROM
CTE
WHERE
STR + 'C' LIKE '%B%'
AND LEN = 19
;
-- Split strings created above into individual characters
WITH cte(Scenario, Value, Vals) AS
(
SELECT
Scenario
,CAST(LEFT(Vals, CHARINDEX(',',Vals+',')-1) AS VARCHAR(10)) AS Value
,STUFF(Vals, 1, CHARINDEX(',',Vals+','), '') AS Vals
FROM #temp3
UNION ALL
SELECT
Scenario
,CAST(LEFT(Vals, CHARINDEX(',',Vals+',')-1) AS VARCHAR(10))
,STUFF(Vals, 1, CHARINDEX(',',Vals+','), '')
FROM cte
WHERE Vals > ''
)
SELECT
Scenario
,ROW_NUMBER() OVER (PARTITION BY Scenario ORDER BY Scenario) RowNumber
,Value
,Vals
FROM cte t
I'm not exactly sure what the problem you are describing is, but the ROW_NUMBER() should use an ORDER BY clause that completely orders the rows in each partition.
When you use "PARTITION BY Scenario ORDER BY Scenario" the order in which the ROW_NUMBER() values are assigned is undefined. Try something like
WITH cte(Scenario, depth, Value, Vals) AS
(
SELECT
Scenario, 0 depth
,CAST(LEFT(Vals, CHARINDEX(',',Vals+',')-1) AS VARCHAR(10)) AS Value
,STUFF(Vals, 1, CHARINDEX(',',Vals+','), '') AS Vals
FROM #temp3
UNION ALL
SELECT
Scenario, depth+1
,CAST(LEFT(Vals, CHARINDEX(',',Vals+',')-1) AS VARCHAR(10))
,STUFF(Vals, 1, CHARINDEX(',',Vals+','), '')
FROM cte
WHERE Vals > ''
)
SELECT
Scenario
,depth
,ROW_NUMBER() OVER (PARTITION BY Scenario ORDER BY depth ) RowNumber
,Value
,Vals
FROM cte t

Postgres query that returns the row offset of the last different value

I want a query that returns, for each row, the row offset from the current row to the last different value.
NUMBER takes only two values (1 and -1).
Table A is given
ROWNUM NUMBER
1 1
2 1
3 1
4 1
5 -1
6 -1
7 1
8 1
9 -1
10 -1
11 -1
Expected result from Table A:
ROWNUM NUMBER LASTDIFFERENT
1 1 5
2 1 4
3 1 3
4 1 2
5 -1 3
6 -1 2
7 1 3
8 1 2
9 -1
10 -1
11 -1
This might fall into the category of "just because you can doesn't mean you should." I don't see any elegant solutions to your problem, but this is a working solution, at least for your sample data:
with switches as(
select
rownum, number,
case
when lag(number) over (order by rownum) = number then 0
else 1
end switch
from TableA
),
groups as (
select
rownum, number, sum (switch) over (order by rownum) group_id
from switches
)
select
rownum, number, -- group_id,
max (rownum) over (partition by group_id) - rownum + 2 as last_different
from groups
I ran this on your sample data and got these results:
rownum number last_different
1 1 5
2 1 4
3 1 3
4 1 2
5 -1 3
6 -1 2
7 1 3
8 1 2
9 -1 4
10 -1 3
11 -1 2
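The query above follows the usual "gaps and islands" pattern: flag each row where the value changes, take a running sum of the flags to get a group id, then measure each row against the last row of its group. Here is a minimal Python sketch of those three steps; the function name last_different is hypothetical, and the values are assumed to arrive already ordered by ROWNUM.

```python
from itertools import accumulate

def last_different(values):
    """Mirror the switches / groups / final select steps of the query."""
    # Step 1: 1 where the value differs from the previous row (or on row 1).
    switches = [1 if i == 0 or values[i] != values[i - 1] else 0
                for i in range(len(values))]
    # Step 2: the running sum of the flags gives a group id per run.
    group_ids = list(accumulate(switches))
    # Step 3: max(rownum) per group, then max - rownum + 2.
    max_row = {}
    for rownum, gid in enumerate(group_ids, start=1):
        max_row[gid] = rownum
    return [max_row[gid] - rownum + 2
            for rownum, gid in enumerate(group_ids, start=1)]

values = [1, 1, 1, 1, -1, -1, 1, 1, -1, -1, -1]
print(last_different(values))  # [5, 4, 3, 2, 3, 2, 3, 2, 4, 3, 2]
```

As in the query's results, the final run (rows 9 to 11) still gets numbers, whereas the expected output in the question leaves those rows blank because no different value follows them.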

Convert row to column using sql server 2008?

The table name is Lookupvalue
id Ptypefield Value
1 1 D
2 1 E
3 1 F
4 1 G
5 1 H
6 2 FL
7 2 IF
8 2 VVS1
9 2 VVS2
10 2 VS1
11 2 VS2
12 3 0.50
13 3 1.00
14 3 1.50
15 3 2.00
16 4 Marquise
17 4 Round
18 4 Pear
19 4 Radiant
20 4 Princess
I want to convert the Lookupvalue table's rows to columns depending on Ptypefield,
like this:
id 1 id 2 id 3 id 4
1 D 6 fl 12 0.50 16 Marquise
2 E 7 If 13 1 17 Round....
3 F 8 vvs2 14 1.5
4 G 9 vvs2 15 2
5 H 10 vs1
11 vs2
Thanks
In your sample output, it is not clear why values from columns 1 and 2 would be related to columns 3 and 4. However, here is a possible solution:
;With RowNumbers As
(
Select Id, PTypeField, Value
, Row_Number() Over( Partition By PTypeField Order By Id ) As Rownum
From #Test
)
Select RowNum
, Min( Case When PTypeField = 1 Then Id End ) As Id
, Min( Case When PTypeField = 1 Then Value End ) As [1]
, Min( Case When PTypeField = 2 Then Id End ) As Id
, Min( Case When PTypeField = 2 Then Value End ) As [2]
, Min( Case When PTypeField = 3 Then Id End ) As Id
, Min( Case When PTypeField = 3 Then Value End ) As [3]
, Min( Case When PTypeField = 4 Then Id End ) As Id
, Min( Case When PTypeField = 4 Then Value End ) As [4]
From RowNumbers
Group By RowNum
If you wanted to dynamically generate the columns, the only way to do that in SQL is to use some fugly dynamic SQL. T-SQL was not designed for this sort of output and instead you should use a reporting tool or do the crosstabbing in a middle tier component or class.
This data schema looks like an EAV which would explain why retrieving the data you want is so difficult.
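If the crosstab is done in a middle tier instead, the same row-number-then-pivot idea is short to write. The following Python sketch is one possible shape for it, under the assumption that rows arrive as (id, ptypefield, value) tuples; the function name pivot_by_type is hypothetical.

```python
from collections import defaultdict

def pivot_by_type(rows):
    """Return {rownum: {ptypefield: (id, value)}} for (id, ptypefield, value) rows."""
    counters = defaultdict(int)
    table = defaultdict(dict)
    for id_, ptype, value in sorted(rows):
        # Equivalent of Row_Number() Over (Partition By PTypeField Order By Id).
        counters[ptype] += 1
        table[counters[ptype]][ptype] = (id_, value)
    return dict(table)

rows = [
    (1, 1, "D"), (2, 1, "E"),
    (6, 2, "FL"), (7, 2, "IF"), (8, 2, "VVS1"),
    (12, 3, "0.50"),
]
result = pivot_by_type(rows)
for rownum in sorted(result):
    print(rownum, result[rownum])
```

Because the output is keyed by whatever Ptypefield values appear in the data, new types become new columns without any dynamic SQL.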