SQL Query for sorting in particular order - sql

I have this data:
parent_id  comment_id  comment_level
NULL       0xB2        1
NULL       0xB6        1
NULL       0xBA        1
NULL       0xBE        1
NULL       0xC110      1
NULL       0xC130      1
123        0xC13580    2
456        0xC135AC    3
123        0xC13680    2
I want the result ordered so that rows with comment_level = 1 are in descending order by comment_id, while the other rows (i.e. where comment_level != 1) are in ascending order by comment_id, slotted in according to the order of the level-1 rows. In other words: the rows with comment_level = 1 stay in descending order, and each row with comment_level != 1 is inserted, in ascending order, directly after the level-1 row whose comment_id is just below its own.
Result should look like this
NULL 0xC130   1
123  0xC13580 2
456  0xC135AC 3
123  0xC13680 2
NULL 0xC110   1
NULL 0xBE     1
NULL 0xBA     1
NULL 0xB6     1
NULL 0xB2     1
Note that the rows with comment_level > 1 above (the second through fourth rows) sort by comment_id in ascending order, but they come right after their "main" row (with comment_level = 1), while the main rows themselves sort DESC by comment_id.
I tried creating two queries for the different comment levels and sorting each side of a UNION, but it didn't work out: two different ORDER BY clauses aren't allowed. I tried the approach from "Using different order by with union" but it gave me an error, and even if it had worked it still might not have given me the whole answer.

I think I understand what you're going for, and a UNION will not be able to do it.
To accomplish this, each row needs to match with a specific "parent" row that has 1 for the comment_level. If the comment_level is already 1, the row is its own parent. Then we can sort first by the comment_id from that parent record DESC, and then sort ascending by the local comment_id within a given group of matching parent records.
You'll need something like this:
SELECT t0.*
FROM [MyTable] t0
CROSS APPLY (
    SELECT TOP 1 comment_id
    FROM [MyTable] t1
    WHERE t1.comment_level = 1 AND t1.comment_id <= t0.comment_id
    ORDER BY t1.comment_id DESC
) parents
ORDER BY parents.comment_id DESC,
    CASE WHEN t0.comment_level = 1 THEN 0 ELSE 1 END,
    t0.comment_id
See it work here:
https://dbfiddle.uk/qZBb3YjO
There's probably also a solution using a windowing function that will be more efficient.
And here it is:
SELECT parent_id, comment_id, comment_level
FROM (
    SELECT t0.*, t1.comment_id AS t1_comment_id,
        row_number() OVER (PARTITION BY t0.comment_id ORDER BY t1.comment_id DESC) AS rn
    FROM [MyTable] t0
    LEFT JOIN [MyTable] t1 ON t1.comment_level = 1 AND t1.comment_id <= t0.comment_id
) d
WHERE rn = 1
ORDER BY t1_comment_id DESC,
    CASE WHEN comment_level = 1 THEN 0 ELSE 1 END,
    comment_id
See it here:
https://dbfiddle.uk/me1vGNdM

Those varbinary values aren't arbitrary: they're hierarchyid values. And at a guess, they're probably typed that way in the schema (it's too much to be a coincidence). We can use this fact to do what we're looking for.
with d_raw as (
select * from (values
(NULL, 0xC130 , 1),
(123 , 0xC13580, 2),
(456 , 0xC135AC, 3),
(123 , 0xC13680, 2),
(NULL, 0xC110 , 1),
(NULL, 0xBE , 1),
(NULL, 0xBA , 1),
(NULL, 0xB6 , 1),
(NULL, 0xB2 , 1)
) as x(parent_id, comment_id, comment_level)
),
d as (
select parent_id, comment_id = cast(comment_id as hierarchyid), comment_level
from d_raw
)
select *,
comment_id.ToString(),
comment_id.GetAncestor(comment_id.GetLevel() - 1).ToString()
from d
order by comment_id.GetAncestor(comment_id.GetLevel() - 1) desc,
comment_id
Note: the CTEs I'm using are just to get the data into the right format, and the last SELECT adds extra columns just to show what's going on; you can omit them from your query without consequence. I think the only interesting part of this solution is using comment_id.GetAncestor(comment_id.GetLevel() - 1) to get the root-level node.
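If the column really is already typed hierarchyid in your schema (as guessed above), the cast falls away entirely and the whole thing reduces to a single ORDER BY; a sketch, assuming the table name from the earlier answers:
SELECT parent_id, comment_id, comment_level
FROM [MyTable]
ORDER BY comment_id.GetAncestor(comment_id.GetLevel() - 1) DESC,  -- root-level ancestor first
         comment_id;  -- then depth-first order within that root's subtree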

One way to do this, and possibly the only one, is to create two separate queries and union them together. Something like:
(select * from table where comment_level = 1 order by comment_id desc)
union
(select * from table where not comment_level = 1 order by comment_id asc)

Related

Group and count by another column's value

I have a table like below:
CREATE TABLE public.test_table
(
"ID" serial PRIMARY KEY NOT NULL,
"CID" integer NOT NULL,
"SEG" integer NOT NULL,
"DDN" character varying(3) NOT NULL
)
and data looks like this:
ID CID SEG DDN
1 1 1 "711"
2 1 2 "800"
3 1 3 "124"
4 2 1 "711"
5 3 1 "711"
6 3 2 "802"
7 4 1 "799"
8 5 1 "799"
9 5 2 "804"
10 6 1 "799"
I need to group this data by the CID column and produce counts keyed on each CID's first DDN value, but the count has to be split into two figures: CIDs with only one row, and CIDs with more than one.
I'm really sorry if I couldn't explain it clearly. Let me show you what I need...
DDN END TRA
711 1 2
799 2 1
As you can see, DDN 711 has 1 CID with a single SEG count (ID: 4); this is the END column.
But it occurs 2 times with multiple SEG counts (IDs 1-3 and IDs 5-6); this is the TRA column.
I can't even be sure which column should go in the GROUP BY clause!
My solution:
I just found a solution, shown below:
WITH x AS (
SELECT
(SELECT t1."DDN" FROM public.test_table AS t1
WHERE t1."CID"=t."CID" AND t1."SEG"=1) AS ddn,
COUNT("CID") AS seg_count
FROM public.test_table AS t
GROUP BY "CID"
)
SELECT ddn, COUNT(seg_count) AS "TOTAL",
SUM(CASE WHEN x.seg_count=1 THEN 1 ELSE 0 END) as "END",
SUM(CASE WHEN x.seg_count>1 THEN 1 ELSE 0 END) as "TRA"
FROM x
GROUP BY ddn;
Equivalent, faster query:
SELECT "DDN"
, COUNT(*) AS "TOTAL"
, COUNT(*) FILTER (WHERE seg_count = 1) AS "END"
, COUNT(*) FILTER (WHERE seg_count > 1) AS "TRA"
FROM (
SELECT DISTINCT ON ("CID")
"DDN" -- assuming min "SEG" is always 1
, COUNT(*) OVER (PARTITION BY "CID") AS seg_count
FROM test_table
ORDER BY "CID", "SEG"
) sub
GROUP BY "DDN";
db<>fiddle here
Notes
CTEs act as optimization barriers (always materialized before Postgres 12), so they are typically slower and should only be used where needed in Postgres.
This is equivalent to the query in the question assuming that the minimum "SEG" per "CID" is always 1, since this query returns the row with the minimum "SEG" while your query returns the one with "SEG" = 1. Typically you want the "first" segment, and my query implements that requirement more reliably, but the question doesn't make the intent clear.
COUNT(*) is slightly faster than COUNT(column) and equivalent as long as no NULL values are involved (the case here). Related:
PostgreSQL: running count of rows for a query 'by minute'
About DISTINCT ON:
Select first row in each GROUP BY group?
The aggregate FILTER syntax requires Postgres 9.4+:
Conditional SQL count
Here is the solution I propose; the query can probably be simplified.
CREATE TABLE test_table
(
ID serial PRIMARY KEY NOT NULL,
CID integer NOT NULL,
SEG integer NOT NULL,
DDN character varying(3) NOT NULL
);
insert into test_table(CID,SEG,DDN)
values
( 1, 1, '711'),
( 1, 2, '800'),
( 1, 3, '124'),
( 2, 1, '711'),
( 3, 1, '711'),
( 3, 2, '802'),
( 4, 1, '799'),
( 5, 1, '799'),
( 5, 2, '804'),
( 6, 1, '799');
with ddn_t as (
    select cid, ddn,
           row_number() over (partition by cid order by seg) as rn
    from test_table
),
summary as (
    select a.cid, count(distinct a.ddn) as cnt, b.ddn
    from ddn_t a
    join ddn_t b on b.cid = a.cid and b.rn = 1
    group by a.cid, b.ddn
)
select ddn,
       sum(case when cnt > 1 then 1 else 0 end) as "TRA",
       sum(case when cnt = 1 then 1 else 0 end) as "END"
from summary
group by ddn;

Select Distinct values once from multiple columns in this table preserving original order?

I have a (subquery) table that lists meal preferences for my friends. Each meal can only be taken once, and each person can only eat one meal.
row_number person_id meal_id
1 1 3
2 2 1
3 2 2
4 2 3
5 3 1
6 3 2
7 3 3
The picking order is determined by the original order of the table, so I would like the result to be:
person_id meal_id
1 3
2 1
3 2
Because meal 1 is taken by user 2, user 3 gets meal 2. I think this could be solved by selecting distinct values in both columns based on their original order, but I cannot figure out how to write that query. Any help appreciated.
Update: added row_number to the original table.
If I understand correctly, this is a rather complicated graph-walking problem. I should first note that there is no guarantee of an optimal solution without lots and lots of work. But you can implement a greedy algorithm using recursive CTEs:
with recursive t as (
select v.*
from (values (1, 1, 3), (2, 2, 1), (3, 2, 2), (4, 2, 3), (5, 3, 1), (6, 3, 2), (7, 3, 3)
) v(row_number, person_id, meal_id)
),
cte (row_number, person_id, meal_id, rows, persons, meals, lev) as (
select row_number, person_id, meal_id, array[row_number], array[person_id], array[meal_id], 1 as lev
from t
where row_number = 1
union all
select t.row_number, t.person_id, t.meal_id,
(case when t.person_id = any(cte.persons) or t.meal_id = any(cte.meals)
then cte.rows
else array_append(cte.rows, t.row_number)
end),
(case when t.person_id = any(cte.persons) or t.meal_id = any(cte.meals)
then cte.persons
else array_append(cte.persons, t.person_id)
end),
(case when t.person_id = any(cte.persons) or t.meal_id = any(cte.meals)
then cte.meals
else array_append(cte.meals, t.meal_id)
end),
cte.lev + 1
from cte
join t on t.row_number = cte.row_number + 1
)
select t.*
from t
cross join (select rows from cte order by lev desc fetch first 1 row only) as last1
where t.row_number = any (last1.rows);
Here is a db<>fiddle.

sort table with null as "insignificant"

I have a table with two columns: col_order (int) and name (text). I would like to retrieve ordered rows such that, when col_order is not null, it determines the order, but when it's null, name determines the order. I thought of an ORDER BY clause such as
order by coalesce( col_order, name )
However, this won't work because the two columns have different types. I am considering converting both into bytea, but: 1) to convert the integer, is there a better method than just looping, modding by 256, and stacking up individual bytes in a function? And 2) how do I convert "name" to ensure a sane collation order? (Assuming name has an order... citext would be nice, but I haven't bothered to rebuild to get that; UTF8 for the moment.)
Even if all this is possible (suggestions on details welcome) it seems like a lot of work. Is there a better way?
EDIT
I got an excellent answer from Gordon, but it shows I didn't phrase the question correctly. I want a sort order by name where col_order marks places where this order is overridden. This isn't a well-posed problem, but here is one acceptable solution:
col_order| name
----------------
null | a
1 | foo
null | foo1
2 | bar
I.e., here, if col_order is null, the name should be inserted after the name closest in alphabetical order but less than it. Otherwise, this could be gotten by:
order by col_order nulls last, name
EDIT 2
Ok ... to get your creative juices flowing, this seems to be going in the right direction:
with x as ( select *,
case when col_order is null then null else row_number() over (order by col_order) end as ord
from temp )
select col_order, name, coalesce( ord, lag(ord,1) over (order by name) + .5) as ord from x;
It gets the order from the previous row, sorted by name, when there is no col_order. It isn't right in general, though; I'd have to go back to the first preceding row with a non-null col_order. The SQL standard has IGNORE NULLS for window functions, which might do this, but it isn't implemented in Postgres. Any suggestions?
EDIT 3
The following would seem close -- but doesn't work. Perhaps window evaluation is a bit strange with recursive queries.
with recursive x(col_order, name, n) as (
select col_order, name, case when col_order is null then null
else row_number() over (order by col_order) * t end from temp, tot
union all
select col_order, name,
case when row_number() over (order by name) = 1 then 0
else lag(n,1) over (order by name) + 1 end from x
where x.n is null ),
tot as ( select count(*) as t from temp )
select * from x;
Just use multiple clauses:
order by (case when col_order is not null then 1 else 2 end),
col_order,
name
When col_order is not null, then 1 is assigned for the first sort key; when it is null, 2 is assigned. Hence, the not-nulls come first. For the sample data above this yields foo (1), bar (2), a, foo1: the non-null rows in col_order order, then the rest by name.
OK... the following seems to work. I'll leave the question "unanswered", though, pending criticism or better suggestions:
Using the last_agg aggregate from here:
with
tot as ( select count(*) as t from temp ),
x as (
select col_order, name,
case when col_order is null then null
else (row_number() over (order by col_order)) * t end as n,
row_number() over (order by name) - 1 as i
from temp, tot )
select x.col_order, x.name, coalesce(x.n,last_agg(y.n order by y.i)+x.i, 0 ) as n
from x
left join x as y on y.name < x.name
group by x.col_order, x.n, x.name, x.i
order by n;
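For what it's worth, the missing LAG(... IGNORE NULLS) can also be emulated in Postgres with a running count that starts a new group at every non-null value; here is a sketch against the same temp(col_order, name) table (only the grouping trick is new, the rest follows the EDIT 2 idea):
with x as (
    select *,
           case when col_order is not null
                then row_number() over (order by col_order) end as ord
    from temp
), y as (
    select *,
           count(ord) over (order by name) as grp  -- running count of non-nulls: bumps at each anchor row
    from x
)
select col_order, name,
       coalesce(max(ord) over (partition by grp), 0) as eff_ord  -- carry each group's anchor ord forward
from y
order by eff_ord, name;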

how to normalize / update an "order" column

I have a table "mydata" with some data:
id name position
===========================
4 foo -3
6 bar -2
1 baz -1
3 knork -1
5 lift 0
2 pitcher 0
I fetch the table ordered using ORDER BY position ASC;
the position column value may be non-unique (for some reason not described here :-) and is used to provide a custom order during SELECT.
What I want to do:
I want to normalize the "position" column by associating a unique position with each row without destroying the order. Furthermore, the highest position after normalizing should be -1.
Desired resulting table contents:
id name position
===========================
4 foo -6
6 bar -5
1 baz -4
3 knork -3
5 lift -2
2 pitcher -1
I tried several ways but failed to implement the correct UPDATE statement.
I guess that using
generate_series( -(select count(*) from mydata), -1)
is a good starting point to get the new values for the position column, but I have no clue how to merge that generated column data into the UPDATE statement.
Hope somebody can help me out :-)
Something like:
with renumber as (
select id,
-1 * row_number() over (order by position desc, id) as rn
from foo
)
update foo
set position = r.rn
from renumber r
where foo.id = r.id
and position <> r.rn;
SQLFiddle Demo
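For completeness, the generate_series idea from the question can also be made to work by pairing each id (in the desired order) with a generated position; a sketch, assuming the mydata table from the question (ties broken by id, adjust as needed):
with ordered as (
    select id, row_number() over (order by position, id) as rn
    from mydata
), series as (
    select gs as new_pos, row_number() over (order by gs) as rn
    from generate_series(-(select count(*) from mydata), -1) as gs  -- -n .. -1
)
update mydata m
set position = s.new_pos
from ordered o
join series s using (rn)  -- pair the k-th row with the k-th generated value
where m.id = o.id;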
Try this one -
Query:
CREATE TABLE temp
(
id INT
, name VARCHAR(10)
, position INT
)
INSERT INTO temp (id, name, position)
VALUES
(4, 'foo', -3),
(6, 'bar', -2),
(1, 'baz', -1),
(3, 'knork', -1),
(5, 'lift', 0),
(2, 'pitcher', 0)
SELECT
id
, name
, position = -ROW_NUMBER() OVER (ORDER BY position DESC, id)
FROM temp
ORDER BY position
Update:
UPDATE temp
SET position = t.rn
FROM (
SELECT id, rn = - ROW_NUMBER() OVER (ORDER BY position DESC, id)
FROM temp
) t
WHERE temp.id = t.id
Output:
id name position
----------- ---------- --------------------
4 foo -6
6 bar -5
3 knork -4
1 baz -3
5 lift -2
2 pitcher -1
@a_horse_with_no_name is really near the truth; thank you!
UPDATE temp
SET position=t.rn
FROM (SELECT
id, name,
-((select count( *)
FROM temp)
+1-row_number() OVER (ORDER BY position ASC)) as rn
FROM temp) t
WHERE temp.id=t.id;
SELECT * FROM temp ORDER BY position ASC;
see http://sqlfiddle.com/#!1/d1770/6
This variant uses MySQL's user-variable syntax (it won't run as-is on Postgres):
update mydata temp1,
       (select a.*, @var := @var - 1 as sno
        from mydata a, (select @var := 0) b
        order by position desc, id asc) temp2
set temp1.position = temp2.sno
where temp1.id = temp2.id;

Tricky SQL. Consolidating rows

I have a (in my opinion) tricky SQL problem.
I have a table with subscriptions. Each subscription has an ID and a set of attributes that change over time. When an attribute value changes, a new row is created with the subscription key and the new values, but ONLY for the changed attributes; the values for the attributes that weren't changed are left empty. It looks something like this (I left out the ValidTo and ValidFrom dates that I use to sort the result correctly):
SubID  Att1  Att2
1      J
1            L
1      B
1            H
1      A     H
I need to transform this table so I can get the following result:
SubID  Att1  Att2
1      J
1      J     L
1      B     L
1      B     H
1      A     H
So basically; if an attribute is empty then take the previous value for that attribute.
Any solution goes... I mean it doesn't matter what I have to do to get the result: a view on top of the table, an SSIS package to create a new table, or something else entirely.
You can do this with a correlated subquery:
select t.subid,
       (select t2.att1 from t t2 where t2.rowid <= t.rowid and t2.att1 is not null order by t2.rowid desc limit 1) as att1,
       (select t2.att2 from t t2 where t2.rowid <= t.rowid and t2.att2 is not null order by t2.rowid desc limit 1) as att2
from t
This assumes that you have a rowid or equivalent (such as a date-time created) that specifies the ordering of the rows. It also uses limit to restrict the results; in other databases this might use top instead (and Oracle uses a slightly more complex expression).
I would write this using ValidTo. However, because there are both ValidTo and ValidFrom, the actual expression is much more complicated; the question would need to clarify the rules for using these values with respect to imputing values at other times.
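For SQL Server specifically, a minimal sketch of the same idea using TOP, assuming ValidFrom supplies the row ordering and correlating per subscription (names are illustrative):
select t.SubID,
       (select top 1 t2.Att1 from MyTable t2
        where t2.SubID = t.SubID and t2.ValidFrom <= t.ValidFrom and t2.Att1 is not null
        order by t2.ValidFrom desc) as Att1,
       (select top 1 t2.Att2 from MyTable t2
        where t2.SubID = t.SubID and t2.ValidFrom <= t.ValidFrom and t2.Att2 is not null
        order by t2.ValidFrom desc) as Att2
from MyTable t
order by t.SubID, t.ValidFrom;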
This one works in Oracle 11g:
select SUBID
,NVL(ATT1,LAG(ATT1) over(order by ValidTo)) ATT1
,NVL(ATT2,lag(ATT2) over(order by ValidTo)) ATT2
from table_name
I agree with Gordon Linoff and Jack Douglas. That code has a limitation when multiple consecutive records with NULLs are inserted,
but the code below will handle that:
select SUBID
,NVL(ATT1,LAG(ATT1 ignore nulls) over(order by VALIDTO)) ATT1
,NVL(ATT2,LAG(ATT2 ignore nulls) over(order by VALIDTO)) ATT2
from Table_name
Please see the SQL Fiddle:
http://sqlfiddle.com/#!4/3b530/4
Assuming SQL Server (based on the fact that you mentioned SSIS), you can use OUTER APPLY to get the previous row:
DECLARE #T TABLE (SubID INT, Att1 CHAR(1), Att2 CHAR(2), ValidFrom DATETIME);
INSERT #T VALUES
(1, 'J', '', '20121201'),
(1, '', 'L', '20121202'),
(1, 'B', '', '20121203'),
(1, '', 'H', '20121204'),
(1, 'A', 'H', '20121205');
SELECT T.SubID,
Att1 = COALESCE(NULLIF(T.att1, ''), prev.Att1, ''),
Att2 = COALESCE(NULLIF(T.att2, ''), prev.Att2, '')
FROM #T T
OUTER APPLY
( SELECT TOP 1 Att1, Att2
FROM #T prev
WHERE prev.SubID = T.SubID
AND prev.ValidFrom < t.ValidFrom
ORDER BY ValidFrom DESC
) prev
ORDER BY T.ValidFrom;
(I've had to add random values for ValidFrom to ensure the order by is correct)
EDIT
The above won't work if you have multiple consecutive rows with blank values - e.g.
DECLARE #T TABLE (SubID INT, Att1 CHAR(1), Att2 CHAR(2), ValidFrom DATETIME);
INSERT #T VALUES
(1, 'J', '', '20121201'),
(1, '', 'L', '20121202'),
(1, 'B', '', '20121203'),
(1, '', 'H', '20121204'),
(1, '', 'J', '20121205'),
(1, 'A', 'H', '20121206');
If this is likely to happen you will need two OUTER APPLYs:
SELECT T.SubID,
Att1 = COALESCE(NULLIF(T.att1, ''), prevAtt1.Att1, ''),
Att2 = COALESCE(NULLIF(T.att2, ''), prevAtt2.Att2, '')
FROM #T T
OUTER APPLY
( SELECT TOP 1 Att1
FROM #T prev
WHERE prev.SubID = T.SubID
AND prev.ValidFrom < t.ValidFrom
AND COALESCE(prev.Att1 , '') != ''
ORDER BY ValidFrom DESC
) prevAtt1
OUTER APPLY
( SELECT TOP 1 Att2
FROM #T prev
WHERE prev.SubID = T.SubID
AND prev.ValidFrom < t.ValidFrom
AND COALESCE(prev.Att2 , '') != ''
ORDER BY ValidFrom DESC
) prevAtt2
ORDER BY T.ValidFrom;
However, since each OUTER APPLY only returns one value, I would change this to a correlated subquery, because the above will evaluate prevAtt1.Att1 and prevAtt2.Att2 for every row whether required or not. If you change it to this:
SELECT T.SubID,
Att1 = COALESCE(
NULLIF(T.att1, ''),
( SELECT TOP 1 Att1
FROM #T prev
WHERE prev.SubID = T.SubID
AND prev.ValidFrom < t.ValidFrom
AND COALESCE(prev.Att1 , '') != ''
ORDER BY ValidFrom DESC
), ''),
Att2 = COALESCE(
NULLIF(T.att2, ''),
( SELECT TOP 1 Att2
FROM #T prev
WHERE prev.SubID = T.SubID
AND prev.ValidFrom < t.ValidFrom
AND COALESCE(prev.Att2 , '') != ''
ORDER BY ValidFrom DESC
), '')
FROM #T T
ORDER BY T.ValidFrom;
The subquery will only be evaluated when required (i.e. when Att1 or Att2 is blank) rather than for every row. The execution plan does not show this, and in fact the "Actual Execution Plan" of the latter looks more intensive, but it almost certainly won't be. As always, the key is testing: run both on your data, see which performs best, and check the IO statistics for reads etc.
I have never touched SQL Server, but I read that it supports analytic functions just like Oracle.
> select * from MYTABLE order by ValidFrom;
SUBID A A VALIDFROM
---------- - - -------------------
1 J 2012-12-06 15:14:51
2 j 2012-12-06 15:15:20
1 L 2012-12-06 15:15:31
2 l 2012-12-06 15:15:39
1 B 2012-12-06 15:15:48
2 b 2012-12-06 15:15:55
1 H 2012-12-06 15:16:03
2 h 2012-12-06 15:16:09
1 A H 2012-12-06 15:16:20
2 a h 2012-12-06 15:16:29
select
t.SubID
,last_value(t.Att1 ignore nulls)over(partition by t.SubID order by t.ValidFrom rows between unbounded preceding and current row) as Att1
,last_value(t.Att2 ignore nulls)over(partition by t.SubID order by t.ValidFrom rows between unbounded preceding and current row) as Att2
,t.ValidFrom
from MYTABLE t;
SUBID A A VALIDFROM
---------- - - -------------------
1 J 2012-12-06 15:45:33
1 J L 2012-12-06 15:45:41
1 B L 2012-12-06 15:45:49
1 B H 2012-12-06 15:45:58
1 A H 2012-12-06 15:46:06
2 j 2012-12-06 15:45:38
2 j l 2012-12-06 15:45:44
2 b l 2012-12-06 15:45:53
2 b h 2012-12-06 15:46:02
2 a h 2012-12-06 15:46:09
with Tricky1 as (
Select SubID, Att1, Att2, row_number() over(order by ValidFrom) As rownum
From Tricky
)
select T1.SubID, T1.Att1, T2.Att2
from Tricky1 T1
cross join Tricky1 T2
where (ABS(T1.rownum-T2.rownum) = 1 or (T1.rownum = 1 and T2.rownum = 1))
and T1.Att1 is not null
;
Also, have a look at accessing the previous value when SQL has no notion of a "previous value", here.
I was at it for quite a while. I found a rather simple way of doing it; not the best solution as such, as I know there must be other ways, but here it goes.
I had to consolidate duplicates too, on 2008 R2.
Try to create a table which contains one set of the duplicate records;
following your example, create one table where Att1 is blank. Then use UPDATE queries with an INNER JOIN on SubID to populate the data that you need.
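A rough sketch of what that could look like (all names here are illustrative, assuming the blank-Att1 rows were copied into a work table Blanks and the full history lives in Subscriptions):
UPDATE b
SET b.Att1 = s.Att1
FROM Blanks b
INNER JOIN Subscriptions s
        ON s.SubID = b.SubID
       AND s.ValidFrom = (SELECT MAX(prev.ValidFrom)        -- latest earlier row...
                          FROM Subscriptions prev
                          WHERE prev.SubID = b.SubID
                            AND prev.ValidFrom < b.ValidFrom
                            AND prev.Att1 <> '')            -- ...that actually has Att1
WHERE b.Att1 = '';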