PostgreSQL Order by stepping numbers - sql

I need to order records from a table by a column. The old system the customer was using manually selected level 1 items, then all the children of level 1 items for level 2, then so on and so forth through level 5. That is horrible IMHO, as it requires hundreds of queries and calls to the DB.
So in the new DB structure I'm trying to make it all one query to the DB if possible and have it order it correctly the first time. The customer wants it displayed to them this way so I have no choice but to figure out a way to order this way.
This is an example of the items and their level codes (1 being the single digit codes, 2 the 2 digit codes, 3 for 4 digit codes, 4 for 6 digit codes and level 5 for 8 digit codes):
It's supposed to order basically everything that starts with a 5 goes under Code 5. Everything that starts with a 51 goes under code 51. If you look at the column n_mad_id it links to the "Mother" ID of the code that is the mother of that code, so code 51's mother is code 5. Code 5101's mother is code 51. Code 5201's mother is code 52. And so on and so forth.
Then the n_nivel column is the level that the code belongs to. Each code has a level and a mother. The top level codes (i.e. 1, 2, 3, 4, 5) are all level 1 since they are only one digit.
I was hoping that there might be an easy ORDER BY way to do this. I've been playing with it for two days and can't seem to get it to obey.

The absolutely simplest way would be to cast the n_cod field to text and then order on that:
SELECT *
FROM mytable
WHERE left(n_cod::text, 1) = '5' -- optional
ORDER BY n_cod::text;
Not pretty, but functional.
You could consider changing your table definition to make n_cod of type char(8) because you do not use it as a number anyway (in the sense of performing calculations). That would make the query a lot faster.

Interesting task. As I understand that you want to get result in order like
n_id n_cod n_nivel n_mad_id
10 5 1 0
11 51 2 10
12 5101 3 11
14 510101 4 12
...
13 52 2 10
...
?
If yes then it may do the trick:
with recursive
tt(n_id, n_mad_id, n_cod, x) as (
select t.n_id, t.n_mad_id, t.n_cod, array[t.n_id]
from yourtable t where t.n_mad_id = 0
union all
select t.n_id, t.n_mad_id, t.n_cod, x || t.n_id
from tt join yourtable t on t.n_mad_id = tt.n_id)
select * from tt order by x;
Here is my original test query:
create table t(id, parent) as values
(1, null),
(3, 1),
(7, 3),
(5, 3),
(6, 5),
(2, null),
(8, 2),
(4, 2);
with recursive
tt(id, parent, x) as (
select t.id, t.parent, array[t.id] from t where t.parent is null
union all
select t.id, t.parent, x || t.id from tt join t on t.parent = tt.id)
select * from tt order by x;
and its result:
id | parent | x
----+--------+-----------
1 | (null) | {1}
3 | 1 | {1,3}
5 | 3 | {1,3,5}
6 | 5 | {1,3,5,6}
7 | 3 | {1,3,7}
2 | (null) | {2}
4 | 2 | {2,4}
8 | 2 | {2,8}
(8 rows)
Read about recursive queries.

Related

Using recursive CTE to generate hierarchy results ordered by depth without the use of heiarchyid

I would like to query hierarchy results ordered by depth first without the use of SQL's heiarchyid built in function. Essentially, I am hoping to accomplish the depth ordering without any fancy functions.
I have provided a temp table below that contains these records:
Id
p_Id
order1
name1
1
null
1
josh
2
null
2
mary
3
null
3
george
4
1
1
joe
5
1
2
jeff
6
2
1
marg
7
2
2
moore
8
2
3
max
9
3
1
gal
10
3
2
guy
11
4
1
tod
12
4
2
ava
13
9
1
ron
14
9
2
bill
15
9
100
pat
where p_Id is the id of the parent record, and order1 is essentially just the ordering of which the depth first output should be displayed. To show why my query does not fully work, I made the order1 of the last record 100 instead of say, 3. However this should not ultimately matter since 100 and 3 both come after the previous order1 value, 2.
An example of a correct result table is shown below:
Id
p_Id
order1
name1
Descendants
1
null
1
josh
josh
4
1
1
joe
josh/joe
11
4
1
tod
josh/joe/tod
12
4
2
ava
josh/joe/ava
5
1
2
jeff
josh/jeff
2
null
2
mary
mary
6
2
1
marg
mary/marg
7
2
2
moore
mary/moore
8
2
3
max
mary/max
3
null
3
george
george
9
3
1
gal
george/gal
13
9
1
ron
george/gal/ron
15
9
2
bill
george/gal/bill
14
9
100
pat
george/gal/pat
10
3
2
guy
george/guy
Where an example of my results are shown below:
Id
p_Id
order1
name1
Descendants
levels
1
null
1
josh
josh
.1
4
1
1
joe
josh/joe
.1.1
11
4
1
tod
josh/joe/tod
.1.1.1
12
4
2
ava
josh/joe/ava
.1.1.2
5
1
2
jeff
josh/jeff
.1.2
2
null
2
mary
mary
.2
6
2
1
marg
mary/marg
.2.1
7
2
2
moore
mary/moore
.2.2
8
2
3
max
mary/max
.2.3
3
null
3
george
george
.3
9
3
1
gal
george/gal
.3.1
13
9
1
ron
george/gal/ron
.3.1.1
15
9
100
pat
george/gal/pat
.3.1.100
14
9
2
bill
george/gal/bill
.3.1.2
10
3
2
guy
george/guy
.3.2
where I have created a levels column that essentially concatenates the order1 values and separates them with a period. This almost returns the correct results, but due to the fact that I am ordering by this string (of numbers and periods), the levels value of .3.1.100 will come before .3.1.2 , which is not what the desired output should look like. I am sure there is a different method to return the correct depth order. See below for the code that generates a temp table, and the code that I used to generate the incorrect output that I have so far.
if object_id('tempdb..#t1') is not null drop table #t1
CREATE TABLE #t1 (Id int, p_Id int, order1 int, name1 varchar(150))
INSERT into #t1 VALUES
(1, null, 1, 'josh'),
(2, null, 2, 'mary'),
(3, null, 3, 'george'),
(4, 1, 1, 'joe'),
(5, 1, 2, 'jeff'),
(6, 2, 1, 'marg'),
(7, 2, 2, 'moore'),
(8, 2, 3, 'max'),
(9, 3, 1, 'gal'),
(10, 3, 2, 'guy'),
(11, 4, 1, 'tod'),
(12, 4, 2, 'ava'),
(13, 9, 1, 'ron'),
(14, 9, 2, 'bill'),
(100, 9, 100, 'pat');
select * from #t1
-- Looking to generate heiarchy results ordered by depth --
; with structure as (
-- Non-recursive term.
-- Select the records where p_Id is null
select p.Id,
p.p_Id,
p.order1,
p.name1,
cast(p.name1 as varchar(64)) as Descendants,
cast(concat('.', p.order1) as varchar(150)) as levels
from #t1 p
where p.p_Id is null
union all
-- Recursive term.
-- Treat the records from previous iteration as parents.
-- Stop when none of the current records have any further sub records.
select c.Id,
c.p_Id,
c.order1,
c.name1,
cast(concat(p.Descendants, '/', c.name1) as varchar(64)) as Descendants,
cast(concat(p.levels, '.', c.order1) as varchar(150)) as levels
from #t1 c -- c being the 'child' records
inner join structure p -- p being the 'parent' records
on c.p_Id = p.Id
)
select *
from structure
order by replace(levels, '.', '') asc
Take II. As pointed out by OP my original answer fails for more than 10 children. So what we can do (OP's suggestion) is pad the values out with zeros to a constant length. But what length? We need to take the largest number of children under a node and add this to the largest value or order, so for the example provided this is 100 + 3, and then take the length of that (3) and pad every order with zeros to 3 digits long. This means we will always be ordering as desired.
declare #PadLength int = 0;
select #PadLength = max(children)
from (
select len(convert(varchar(12),max(order1)+count(*))) children
from #t1
group by p_Id
) x;
-- Looking to generate heiarchy results ordered by depth --
with structure as (
-- Non-recursive term
-- Select the records where p_Id is null
select
p.Id [Id]
, p.p_Id [ParentId]
, p.order1 [OrderBy]
, p.name1 [Name]
, cast(p.name1 as varchar(64)) Descendants
, concat('.', right(replicate('0',#Padlength) + convert(varchar(12),p.order1), #PadLength)) Levels
from #t1 p
where p.p_Id is null
union all
-- Recursive term
-- Treat the records from previous iteration as parents.
-- Stop when none of the current records have any further sub records.
select
c.Id,
c.p_Id,
c.order1,
c.name1,
cast(concat(p.Descendants, '/', c.name1) as varchar(64)),
concat(p.levels, '.', right(replicate('0',#Padlength) + convert(varchar(12),c.order1), #PadLength))
from #t1 c -- c being the 'child' records
inner join structure p on c.p_Id = p.Id -- p being the 'parent' records
)
select *
from structure
order by replace(levels, '.', '') asc;
Note: This answer fails in the case when there are more than 10 children under a particular node. Leaving for interest.
So this issue you have run into is that you are ordering by a string not a number. So the string 100 comes before the string 2. But you need to order by a string to take care of the hierarchy, so one solution is to replace order1 with row_number() based on the order1 column while its still a number and use the row_number() to build your ordering string.
So you replace:
cast(concat(p.levels, '.', c.order1) as varchar(150)) as levels
with
cast(concat(p.levels, '.', row_number() over (order by c.Order1)) as varchar(150))
giving a full query of
with structure as (
-- Non-recursive term.
-- Select the records where p_Id is null
select p.Id,
p.p_Id,
p.order1,
p.name1,
cast(p.name1 as varchar(64)) as Descendants,
cast(concat('.', p.order1) as varchar(150)) as levels
from #t1 p
where p.p_Id is null
union all
-- Recursive term.
-- Treat the records from previous iteration as parents.
-- Stop when none of the current records have any further sub records.
select c.Id,
c.p_Id,
c.order1,
c.name1,
cast(concat(p.Descendants, '/', c.name1) as varchar(64)) as Descendants,
cast(concat(p.levels, '.', row_number() over (order by c.Order1)) as varchar(150))
from #t1 c -- c being the 'child' records
inner join structure p -- p being the 'parent' records
on c.p_Id = p.Id
)
select *
from structure
order by replace(levels, '.', '') asc;
Which returns the desired results.
Note: good question, well written.

Minimum transfer Sql query

I have two Tables and I need to fulfill table 2 Need_qty with minimum movement from table 1 sending_qty.
Table 1
sending_QTY STORE_ID_A
30 30105
16 21168
10 21032
9 30118
6 30011
5 21190
2 21016
Table 2
Need_QTY STORE_ID_B
15 21005
10 30019
11 21006
16 30001
11 21015
7 21004
Expected output
STORE_ID_A |STORE_ID_B |TRANSFERRED_QTY_FROM_A |
-----------|-----------|-----------------------|
30105 |21005 |15 |
30105 |30019 |10 |
30105 |21006 |5 |
21168 |21006 |6 |
21168 |30001 |10 |
21032 |30001 |6 |
21032 |21015 |4 |
30118 |21015 |7 |
30118 |21004 |2 |
30011 |21004 |5 |
There are several other combinations to achieve this but I need to find minimum possible transfer so that table 2 need_qty gets fullfill
Is there any way to achieve this without procedural approach?
So far I have tried to cross join to find the combinations but didn't help much
This can be solved using regular SQL by utilizing an assisting table, containing partitions of integers. For the sake of this example, let's assume this assisting table has three columns: partitions, rank and number. For any given number, there will be rows of possible partitions, each with its rank. If the number is 5, selecting all the rows from this table where number is 5 will bring up:
partitions rank number
5 1 5
4, 1 2 5
3, 2 2 5
3, 1, 1 3 5
2, 2, 1 3 5
2, 1, 1, 1 4 5
1, 1, 1, 1, 1 5 5
The rank is the number of partitions used in the row, and it's important for the problem you provided because it allows us to select the minimum transfers.
For the number 5 we have 7 rows representing partitions. For higher numbers the rows returned will be much higher - the number 12 will have 77 partitions! - but in the scale we work in with databases, this assisting table of partitions is easy to query in numbers 1 through 99, such as the example provided. Higher numbers are a question of scalability.
If you want instructions to create such a table, I'll be happy to provide you - but since this is a long solution, let's set the generation of the assisting table aside for now.
Let's look at Store ID A, which had quantity to send. Their quantities are:
30
16
10
9
6
5
2
For each store quantity we can query the partitions assisting table, and get the various partitions for this quantity and their rank. We can then create out own combination of partitions. For instance, 30 will bring many rows, one of which will be:
partitions rank number
15,10,5 3 30
and 10 will bring in, among many others:
partitions rank number
6,4 2 10
You could build a Cartesian product of all possible candidates by a cross join between the results, and for each row of this product have the partitions ordered in ascending order and the rank be the sum of the partition ranks.
On the other side, you have Store ID B which needs quantities. You just do the same exact treatment, ending up with yet another quite large Cartesian product of ranked, ordered partitions. Congratulations on getting so far.
Now, you only need to see the partition rows in which Store ID B partition collection is completely contained within Store ID A partition collection. This will thin the large collection considerably to a few rows of potential transfers. One of the rows from Store ID B (given the example above) will be:
partitions rank
15,10,10,7,6,6,5,5,4,2 10
Since it appears in both Store ID A and Store ID B. in Store ID A it will be a combination of:
30 = 15,10,5 rank 3
16 = 10,6 rank 2
10 = 6,4 rank 2
9 = 7,2 rank 2
6 = 5,1 rank 2
5 = 5 rank 1
2 = 2 rank 1
giving you the line:
partitions rank
15,10,10,7,6,6,5,5,5,4,2,2,1 13
The last step is to select the row with the lowest rank on the Store ID B. This will be the minimal number of transfers, and you can output it just like you did above.
BONUS for getting this far: If you want to see if We can completely deplete Store ID A's entire inventory (rather then fulfil Store ID B), reverse the containment relation: Make sure the partition collection A is completely contained in partition collection B. To see the minimal transfers for moving exactly each item from A to B fulfilling B and depleting A, look for identical partitions in both collections.
And some actual SQL to simulate the algorithm, at least in part:
-- this function handles the sorting. It's not necessary but it help make the result look better.
WITH
FUNCTION SORT_PARTITIONS(p_id IN VARCHAR2) RETURN VARCHAR2 IS
result VARCHAR2(100);
BEGIN
select rtrim(XMLAGG(XMLELEMENT(E,str||',')).EXTRACT('//text()'),',') into result
from (
with temp as (
select p_id num from dual
)
SELECT trim(regexp_substr(str, '[^,]+', 1, level)) str
FROM (SELECT num str FROM temp) t
CONNECT BY instr(str, ',', 1, level - 1) > 0
order by to_number(str)
);
return result;
END;
-- this function handles containment - when we want to fulfil store ID B, and not necessarily deplete store ID A, or visa-versa.
FUNCTION PARTITION_CONTAINED(seta_partition IN VARCHAR2, setb_partition IN VARCHAR2) RETURN NUMBER IS
result NUMBER;
BEGIN
with seta as
(select str, count(str) cnt from (
SELECT trim(regexp_substr(str, '[^,]+', 1, level)) str
FROM (SELECT num str FROM (select SETA_PARTITION num from dual)) t
CONNECT BY instr(str, ',', 1, level - 1) > 0)
group by str),
setb as
(select str, count(str) cnt from (
SELECT trim(regexp_substr(str, '[^,]+', 1, level)) str
FROM (SELECT num str FROM (select SETB_PARTITION num from dual)) t
CONNECT BY instr(str, ',', 1, level - 1) > 0)
group by str),
lenab as (select count(1) strab from seta, setb where seta.str=setb.str and seta.cnt>=setb.cnt),
lenb as (select count(1) strb from setb)
select strb-strab into result from lenb,lenab;
RETURN result;
END;
-- this store_a simply represents a small Cartesian product of two stores from the stores ID A table - one with quantity 5, the other with quantity 4. I found this was easier to set up.
store_a as (select SORT_PARTITIONS(n1||','||n2) partitions_sending, rank1+rank2 rank_sending from (select num_partitions n1, rank rank1 from n_partitions where num=5),(select num_partitions n2, rank rank2 from n_partitions where num=4)),
-- this store_b represents the stores ID B's Cartesian product of partitions, again for simplicity. The receiving quantities are 3, 3 and 3.
store_b as (select SORT_PARTITIONS(n1||','||n2||','||n3) partitions_receive, rank1+rank2+rank3 rank_receive from (select num_partitions n1, rank rank1 from n_partitions where num=3),(select num_partitions n2, rank rank2 from n_partitions where num=3),(select num_partitions n3, rank rank3 from n_partitions where num=3))
-- and finally, the filtering that provides all possible transfers - with both "=" (which works for deplete and fulfil) and "partition_contained" which allows for fulfil or deplete. You can choose to leave both or just use partition contained, as it is more flexible.
SELECT * from store_a, store_b where store_a.partitions_sending=store_b.partitions_receive or partition_contained(store_a.partitions_sending,store_b.partitions_receive)=0 order by store_b.rank_receive, store_a.rank_sending asc;

Order by followed by dependent order by

I have a table with part of data like below . I have done order by on edition_id .
Now there is further requirement of ordering laungauge_id which depends on value of edition_id.
Edition_id refers to city from which a newspaper is published.
Language_id refers to different languages in which newspaper is
published.
So suppose edition_id = 5 it means New Delhi.
For New Delhi language_id are 13(English ), 5 (Hindi) ,1(Telugu ),4(Urdu).
What i want is to display for New Delhi , is display all English articles first , followed by hindi , followed by Telugu followed by Urdu.
If edition_id=1 then order of language_id should be 13,1,2.
Similarly ,
If edition_id=5 then order of language_id should be 13,5,1,4
Right now what I have is
Edition_id | Language_id
1 1
1 2
1 13
1 1
1 13
1 2
5 4
5 1
5 1
5 4
5 13
5 5
5 13
What is required
Edition_id | Language_id
1 13
1 13
1 1
1 1
1 2
1 2
5 13
5 13
5 5
5 1
5 1
5 4
5 4
How to do this ? Please help.
Is something like this possibe
Select * from <table>
order by edition_id ,
case when edition=6 then <order specified for language_id ie 13,5,1,4>
I would create a supplementary ranking table. I would then JOIN to provide your sort order. Eg:
EDITION_SORT_ORDER
EDITION_ID LANGUAGE_ID RANK
---------- ----------- ----
1 13 1
1 1 2
1 2 3
5 13 1
5 5 2
5 1 3
5 4 4
Using this table in a query might look like this:
SELECT E.EDITION_ID, E.LANGUAGE_ID
FROM <TABLE> E LEFT OUTER JOIN EDITION_SORT_ORDER S ON
E.EDITION_ID = S.EDITION_ID AND
E.LANGUAGE_ID = S.LANGUAGE_ID
ORDER BY S.RANK
This way you can add other rules in future, and it isn't a huge mess of CASE logic.
Alternatively, if you want to avoid a JOIN, you could create a stored function which did a similar lookup and returned a rank (based on passed parameters of EDITION_ID and LANGUAGE_ID).
If you must use CASE, then I'd confine it to a function so you can re-use the logic elsewhere.
If there is no mathematical logic behind it, I would insert another column that can be used for proper sorting.
If you cannot do this, you can simply type out the rules for the relation like this:
Order By Edition_Id,
case Edition_id
when 1 then
case Language_id
when 13 then 1
when 1 then 2
when 2 then 3
end
when 5 then
case Language_id
when 13 then 1
when 5 then 2
when 1 then 3
when 4 then 4
end
end
without a fixed order colum you could things like that, but the logic is not comprehensible.
Assuming first criteria is length of Language_id,
Second is Edition_id= Language_id,
rest is order of Language_id it could or work this way:
Declare #t table(Edition_id int, Language_id int)
insert into #t values
(1, 1),
(1, 2),
(1, 13),
(1, 1),
(1, 13),
(1, 2),
(5, 4),
(5, 1),
(5, 1),
(5, 4),
(5, 13),
(5, 5),
(5, 13);
Select * from #t
order by Edition_id,Case when len (Cast(Language_ID as Varchar(10)))=1 then '1' else '0' end
+case when Edition_id=Language_id then '0' else '1' end
,Language_ID
You've probably considered this but if your desired ordering is always based of the actual alphabetical name of the language then there would usually be a table with the language description that you could join with and then sort by. I base this on your quote below.
...English articles first , followed by hindi , followed by Telugu
followed by Urdu.
SELECT E.EDITION_ID, E.LANGUAGE_ID, LN.LANGUAGE_NAME
FROM <TABLE> E LEFT OUTER JOIN <LANGUAGE_NAMES> LN ON
E.LANGUAGE_ID = LN.LANGUAGE_ID
ORDER BY 1, 3

How to track how many times a column changed its value?

I have a table called crewWork as follows :
CREATE TABLE crewWork(
FloorNumber int, AptNumber int, WorkType int, simTime int )
After the table was populated, I need to know how many times a change in apt occurred and how many times a change in floor occurred. Usually I expect to find 10 rows on each apt and 40-50 on each floor.
I could just write a scalar function for that, but I was wondering if there's any way to do that in t-SQL without having to write scalar functions.
Thanks
The data will look like this:
FloorNumber AptNumber WorkType simTime
1 1 12 10
1 1 12 25
1 1 13 35
1 1 13 47
1 2 12 52
1 2 12 59
1 2 13 68
1 1 14 75
1 4 12 79
1 4 12 89
1 4 13 92
1 4 14 105
1 3 12 115
1 3 13 129
1 3 14 138
2 1 12 142
2 1 12 150
2 1 14 168
2 1 14 171
2 3 12 180
2 3 13 190
2 3 13 200
2 3 14 205
3 3 14 216
3 4 12 228
3 4 12 231
3 4 14 249
3 4 13 260
3 1 12 280
3 1 13 295
2 1 14 315
2 2 12 328
2 2 14 346
I need the information for a report, I don't need to store it anywhere.
If you use the accepted answer as written now (1/6/2023), you get correct results with the OP dataset, but I think you can get wrong results with other data.
CONFIRMED: ACCEPTED ANSWER HAS A MISTAKE (as of 1/6/2023)
I explain the potential for wrong results in my comments on the accepted answer.
In this db<>fiddle, I demonstrate the wrong results. I use a slightly modified form of accepted answer (my syntax works in SQL Server and PostgreSQL). I use a slightly modified form of the OP's data (I change two rows). I demonstrate how the accepted answer can be changed slightly, to produce correct results.
The accepted answer is clever but needs a small change to produce correct results (as demonstrated in the above db<>fiddle and described here:
Instead of doing this as seen in the accepted answer COUNT(DISTINCT AptGroup)...
You should do thisCOUNT(DISTINCT CONCAT(AptGroup, '_', AptNumber))...
DDL:
SELECT * INTO crewWork FROM (VALUES
-- data from question, with a couple changes to demonstrate problems with the accepted answer
-- https://stackoverflow.com/q/8666295/1175496
--FloorNumber AptNumber WorkType simTime
(1, 1, 12, 10 ),
-- (1, 1, 12, 25 ), -- original
(2, 1, 12, 25 ), -- new, changing FloorNumber 1->2->1
(1, 1, 13, 35 ),
(1, 1, 13, 47 ),
(1, 2, 12, 52 ),
(1, 2, 12, 59 ),
(1, 2, 13, 68 ),
(1, 1, 14, 75 ),
(1, 4, 12, 79 ),
-- (1, 4, 12, 89 ), -- original
(1, 1, 12, 89 ), -- new , changing AptNumber 4->1->4 ges)
(1, 4, 13, 92 ),
(1, 4, 14, 105 ),
(1, 3, 12, 115 ),
...
DML:
;
WITH groupedWithConcats as (SELECT
*,
CONCAT(AptGroup,'_', AptNumber) as AptCombo,
CONCAT(FloorGroup,'_',FloorNumber) as FloorCombo
-- SQL SERVER doesnt have TEMPORARY keyword; Postgres doesn't understand # for temp tables
-- INTO TEMPORARY groupedWithConcats
FROM
(
SELECT
-- the columns shown in Andriy's answer:
-- https://stackoverflow.com/a/8667477/1175496
ROW_NUMBER() OVER ( ORDER BY simTime) as RN,
-- AptNumber
AptNumber,
ROW_NUMBER() OVER (PARTITION BY AptNumber ORDER BY simTime) as RN_Apt,
ROW_NUMBER() OVER ( ORDER BY simTime)
- ROW_NUMBER() OVER (PARTITION BY AptNumber ORDER BY simTime) as AptGroup,
-- FloorNumber
FloorNumber,
ROW_NUMBER() OVER (PARTITION BY FloorNumber ORDER BY simTime) as RN_Floor,
ROW_NUMBER() OVER ( ORDER BY simTime)
- ROW_NUMBER() OVER (PARTITION BY FloorNumber ORDER BY simTime) as FloorGroup
FROM crewWork
) grouped
)
-- if you want to see how the groupings work:
-- SELECT * FROM groupedWithConcats
-- otherwise just run this query to see the counts of "changes":
SELECT
COUNT(DISTINCT AptCombo)-1 as CountAptChangesWithConcat_Correct,
COUNT(DISTINCT AptGroup)-1 as CountAptChangesWithoutConcat_Wrong,
COUNT(DISTINCT FloorCombo)-1 as CountFloorChangesWithConcat_Correct,
COUNT(DISTINCT FloorGroup)-1 as CountFloorChangesWithoutConcat_Wrong
FROM groupedWithConcats;
ALTERNATIVE ANSWER
The accepted-answer may eventually get updated to remove the mistake. If that happens I can remove my warning but I still want leave you with this alternative way to produce the answer.
My approach goes like this: "check the previous row, if the value is different in previous row vs current row, then there is a change". SQL doesn't have idea or row order functions per se (at least not like in Excel for example; )
Instead, SQL has window functions. With SQL's window functions, you can use the window function RANK plus a self-JOIN technique as seen here to combine current row values and previous row values so you can compare them. Here is a db<>fiddle showing my approach, which I pasted below.
The intermediate table, showing the columns which has a value 1 if there is a change, 0 otherwise (i.e. FloorChange, AptChange), is shown at the bottom of the post...
DDL:
...same as above...
DML:
;
WITH rowNumbered AS (
SELECT
*,
ROW_NUMBER() OVER ( ORDER BY simTime) as RN
FROM crewWork
)
,joinedOnItself AS (
SELECT
rowNumbered.*,
rowNumberedRowShift.FloorNumber as FloorShift,
rowNumberedRowShift.AptNumber as AptShift,
CASE WHEN rowNumbered.FloorNumber <> rowNumberedRowShift.FloorNumber THEN 1 ELSE 0 END as FloorChange,
CASE WHEN rowNumbered.AptNumber <> rowNumberedRowShift.AptNumber THEN 1 ELSE 0 END as AptChange
FROM rowNumbered
LEFT OUTER JOIN rowNumbered as rowNumberedRowShift
ON rowNumbered.RN = (rowNumberedRowShift.RN+1)
)
-- if you want to see:
-- SELECT * FROM joinedOnItself;
SELECT
SUM(FloorChange) as FloorChanges,
SUM(AptChange) as AptChanges
FROM joinedOnItself;
Below see the first few rows of the intermediate table (joinedOnItself). This shows how my approach works. Note the last two columns, which have a value of 1 when there is a change in FloorNumber compared to FloorShift (noted in FloorChange), or a change in AptNumber compared to AptShift (noted in AptChange).
floornumber
aptnumber
worktype
simtime
rn
floorshift
aptshift
floorchange
aptchange
1
1
12
10
1
0
0
2
1
12
25
2
1
1
1
0
1
1
13
35
3
2
1
1
0
1
1
13
47
4
1
1
0
0
1
2
12
52
5
1
1
0
1
1
2
12
59
6
1
2
0
0
1
2
13
68
7
1
2
0
0
Note instead of using the window function RANK and JOIN, you could use the window function LAG to compare values in the current row to the previous row directly (no need to JOIN). I don't have that solution here, but it is described in the Wikipedia article example:
Window functions allow access to data in the records right before and after the current record.
If I am not missing anything, you could use the following method to find the number of changes:
determine groups of sequential rows with identical values;
count those groups;
subtract 1.
Apply the method individually for AptNumber and for FloorNumber.
The groups could be determined like in this answer, only there's isn't a Seq column in your case. Instead, another ROW_NUMBER() expression could be used. Here's an approximate solution:
;
WITH marked AS (
SELECT
FloorGroup = ROW_NUMBER() OVER ( ORDER BY simTime)
- ROW_NUMBER() OVER (PARTITION BY FloorNumber ORDER BY simTime),
AptGroup = ROW_NUMBER() OVER ( ORDER BY simTime)
- ROW_NUMBER() OVER (PARTITION BY AptNumber ORDER BY simTime)
FROM crewWork
)
SELECT
FloorChanges = COUNT(DISTINCT FloorGroup) - 1,
AptChanges = COUNT(DISTINCT AptGroup) - 1
FROM marked
(I'm assuming here that the simTime column defines the timeline of changes.)
UPDATE
Below is a table that shows how the distinct groups are obtained for AptNumber.
AptNumber RN RN_Apt AptGroup (= RN - RN_Apt)
--------- -- ------ ---------
1 1 1 0
1 2 2 0
1 3 3 0
1 4 4 0
2 5 1 4
2 6 2 4
2 7 3 4
1 8 5 => 3
4 9 1 8
4 10 2 8
4 11 3 8
4 12 4 8
3 13 1 12
3 14 2 12
3 15 3 12
1 16 6 10
… … … …
Here RN is a pseudo-column that stands for ROW_NUMBER() OVER (ORDER BY simTime). You can see that this is just a sequence of rankings starting from 1.
Another pseudo-column, RN_Apt contains values produces by the other ROW_NUMBER, namely ROW_NUMBER() OVER (PARTITION BY AptNumber ORDER BY simTime). It contains rankings within individual groups of identical AptNumber values. You can see that, for a newly encountered value, the sequence starts over, and for a recurring one, it continues where it stopped last time.
You can also see from the table that if we subtract RN from RN_Apt (could be the other way round, doesn't matter in this situation), we get the value that uniquely identifies every distinct group of same AptNumber values. You might as well call that value a group ID.
So, now that we've got these IDs, it only remains for us to count them (count distinct values, of course). That will be the number of groups, and the number of changes is one less (assuming the first group is not counted as a change).
add an extra column changecount
CREATE TABLE crewWork(
FloorNumber int, AptNumber int, WorkType int, simTime int ,changecount int)
increment changecount value for each updation
if want to know count for each field then add columns corresponding to it for changecount
Assuming that each record represents a different change, you can find changes per floor by:
select FloorNumber, count(*)
from crewWork
group by FloorNumber
And changes per apartment (assuming AptNumber uniquely identifies apartment) by:
select AptNumber, count(*)
from crewWork
group by AptNumber
Or (assuming AptNumber and FloorNumber together uniquely identifies apartment) by:
select FloorNumber, AptNumber, count(*)
from crewWork
group by FloorNumber, AptNumber

Recursive SQL CTE's and Custom Sort Ordering

Image you are creating a DB schema for a threaded discussion board. Is there an efficient way to select a properly sorted list for a given thread? The code I have written works but does not sort the way I would like it too.
Let's say you have this data:
ID | ParentID
-----------------
1 | null
2 | 1
3 | 2
4 | 1
5 | 3
So the structure is supposed to look like this:
1
|- 2
| |- 3
| | |- 5
|- 4
Ideally, in the code, we want the result set to appear in the following order: 1, 2, 3, 5, 4
PROBLEM: With the CTE I wrote it is actually being returned as: 1, 2, 4, 3, 5
I know this would be easy to group/order by using LINQ but I am reluctant to do this in memory. It seems like the best solution at this point though...
Here is the CTE I am currently using:
with Replies as (
select c.CommentID, c.ParentCommentID 1 as Level
from Comment c
where ParentCommentID is null and CommentID = #ParentCommentID
union all
select c.CommentID, c.ParentCommentID, r.Level + 1 as Level
from Comment c
inner join Replies r on c.ParentCommentID = r.CommentID
)
select * from Replies
Any help would be appreciated; Thanks!
I'm new to SQL and had not heard about hierarchyid datatype before. After reading about it from this comment I decided I may want to incorporate this into my design. I will experiment with this tonight and post more information if I have success.
Update
Returned result from my sample data, using dance2die's suggestion:
ID | ParentID | Level | DenseRank
-------------------------------------
15 NULL 1 1
20 15 2 1
21 20 3 1
17 22 3 1
22 15 2 2
31 15 2 3
32 15 2 4
33 15 2 5
34 15 2 6
35 15 2 7
36 15 2 8
I am sure that you will love this.
I recently find out about Dense_Rank() function, which is for "ranking within the partition of a result set" according to MSDN
Check out the code below and how "CommentID" is sorted.
As far as I understand, you are trying to partition your result set by ParentCommentID.
Pay attention to "denserank" column.
with Replies (CommentID, ParentCommentID, Level) as
(
select c.CommentID, c.ParentCommentID, 1 as Level
from Comment c
where ParentCommentID is null and CommentID = 1
union all
select c.CommentID, c.ParentCommentID, r.Level + 1 as Level
from Comment c
inner join Replies r on c.ParentCommentID = r.CommentID
)
select *,
denserank = dense_rank() over (partition by ParentCommentID order by CommentID)
from Replies
order by denserank
Result below
You have to use hierarchyid (sql2008 only) or a bunch of string (or byte) concatenation.
Hmmmm - I am not sure if your structure is the best suited for this problem. Off the top of my head I cannot think of anyway to sort the data as you want it within the above query.
The best I can think of is if you have a parent table that ties your comments together (eg. a topic table). If you do you should be able to simply join your replies onto that (you will need to include the correct column obviously), and then you can sort by the topicID, Level to get the sort order you are after (or whatever other info on the topic table represents a good value for sorting).
Consider storing the entire hierarchy (with triggers to update it if it changes ) in a field.
This field in your example would have:
1
1.2
1.2.3
1.2.5
1.4
then you just have to sort on that field, try this and see:
create table #temp (test varchar (10))
insert into #temp (test)
select '1'
union select '1.2'
union select '1.2.3'
union select '1.2.5'
union select '1.4'
select * from #temp order by test asc