I have a table with two columns: col_order (int) and name (text). I would like to retrieve rows ordered such that, when col_order is not null, it determines the order, but when it is null, name determines the order. I thought of an ORDER BY clause such as
order by coalesce( col_order, name )
However, this won't work because the two columns have different types. I am considering converting both into bytea, but: 1) to convert the integer, is there a better method than just looping, taking mod 256, and stacking up the individual bytes in a function, and 2) how do I convert name to ensure some sort of sane collation order (assuming name has an order ... well, citext would be nice, but I haven't bothered to rebuild to get that ... UTF8 for the moment)?
Even if all this is possible (suggestions on details welcome) it seems like a lot of work. Is there a better way?
EDIT
I got an excellent answer from Gordon, but it shows that I didn't phrase the question correctly. I want a sort order by name, where col_order marks places where this order is overridden. This isn't a well-posed problem, but here is one acceptable solution:
col_order| name
----------------
null | a
1 | foo
null | foo1
2 | bar
I.e. -- here, if col_order is null, the name should be inserted after the name closest in alphabetical order but less than it. Otherwise, the order could be obtained by:
order by col_order nulls last, name
EDIT 2
Ok ... to get your creative juices flowing, this seems to be going in the right direction:
with x as ( select *,
case when col_order is null then null else row_number() over (order by col_order) end as ord
from temp )
select col_order, name, coalesce( ord, lag(ord,1) over (order by name) + .5) as ord from x;
It gets the order from the previous row, sorted by name, when there is no col_order. It isn't right in general ... I guess I'd have to go back to the first preceding row with a non-null col_order ... the SQL standard has an IGNORE NULLS option for window functions which might do this, but it isn't implemented in Postgres. Any suggestions?
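The missing IGNORE NULLS lag can be emulated outside SQL with a simple carry-forward. A minimal plain-Python sketch (sample rows hypothetical; `ord` stands for `row_number() over (order by col_order)`):

```python
# Walk the rows in name order and carry forward the last non-null ord,
# emulating lag(ord) IGNORE NULLS; unordered rows get last_seen + 0.5
# to slot in just after the carried row.
rows = [
    {"col_order": None, "name": "a",    "ord": None},
    {"col_order": 2,    "name": "bar",  "ord": 2},
    {"col_order": 1,    "name": "foo",  "ord": 1},
    {"col_order": None, "name": "foo1", "ord": None},
]

last_seen = None
for row in sorted(rows, key=lambda r: r["name"]):
    if row["ord"] is not None:
        last_seen = row["ord"]          # remember the last non-null ord
    else:
        row["ord"] = None if last_seen is None else last_seen + 0.5
```

This is only the carry-forward idea, not a full solution: like the SQL above, it reaches back just one step of carried state, and a row before any ordered row stays null.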
EDIT 3
The following would seem close -- but doesn't work. Perhaps window evaluation is a bit strange with recursive queries.
with recursive x(col_order, name, n) as (
select col_order, name, case when col_order is null then null
else row_number() over (order by col_order) * t end from temp, tot
union all
select col_order, name,
case when row_number() over (order by name) = 1 then 0
else lag(n,1) over (order by name) + 1 end from x
where x.n is null ),
tot as ( select count(*) as t from temp )
select * from x;
Just use multiple clauses:
order by (case when col_order is not null then 1 else 2 end),
col_order,
name
When col_order is not null, then 1 is assigned for the first sort key. When it is null, then 2 is assigned. Hence, the not-nulls will be first.
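The three-key ORDER BY can be sanity-checked against an in-memory SQLite table, using the sample data and the `temp` table name from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table temp (col_order int, name text)")
conn.executemany("insert into temp values (?, ?)",
                 [(None, "a"), (1, "foo"), (None, "foo1"), (2, "bar")])

rows = conn.execute("""
    select col_order, name
    from temp
    order by (case when col_order is not null then 1 else 2 end),
             col_order,
             name
""").fetchall()

# non-null col_order rows come first (ordered by col_order), then the rest by name
print([name for _, name in rows])  # ['foo', 'bar', 'a', 'foo1']
```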
Ok .. the following seems to work -- I'll leave the question "unanswered" though pending criticism or better suggestions:
Using the last_agg aggregate from here:
with
tot as ( select count(*) as t from temp ),
x as (
select col_order, name,
case when col_order is null then null
else (row_number() over (order by col_order)) * t end as n,
row_number() over (order by name) - 1 as i
from temp, tot )
select x.col_order, x.name, coalesce(x.n,last_agg(y.n order by y.i)+x.i, 0 ) as n
from x
left join x as y on y.name < x.name
group by x.col_order, x.n, x.name, x.i
order by n;
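For reference, the placement rule this query implements can be sketched in plain Python (same sample rows as the edit above): rows with col_order get keys scaled by the row count so intermediate slots never collide, and each unordered row takes the key of the nearest preceding fixed name plus its own name-rank, mirroring `last_agg(y.n order by y.i) + x.i`:

```python
rows = [(None, "a"), (1, "foo"), (None, "foo1"), (2, "bar")]
total = len(rows)

# n = row_number() over (order by col_order) * count(*), fixed rows only
fixed = sorted((r for r in rows if r[0] is not None), key=lambda r: r[0])
key_of = {name: (idx + 1) * total for idx, (_, name) in enumerate(fixed)}

# i mirrors row_number() over (order by name) - 1
by_name = sorted(rows, key=lambda r: r[1])
sort_key = {}
for i, (col_order, name) in enumerate(by_name):
    if col_order is not None:
        sort_key[name] = key_of[name]
    else:
        # last fixed key among names sorted before this one, else 0
        prev = [key_of[n] for _, n in by_name[:i] if n in key_of]
        sort_key[name] = (prev[-1] if prev else 0) + i

ordered = sorted(rows, key=lambda r: sort_key[r[1]])
print([name for _, name in ordered])  # ['a', 'foo', 'foo1', 'bar']
```

This reproduces the acceptable solution from the first edit: a, foo, foo1, bar.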
Related
I have this data:
parent_id | comment_id | comment_level
----------|------------|--------------
NULL      | 0xB2       | 1
NULL      | 0xB6       | 1
NULL      | 0xBA       | 1
NULL      | 0xBE       | 1
NULL      | 0xC110     | 1
NULL      | 0xC130     | 1
123       | 0xC13580   | 2
456       | 0xC135AC   | 3
123       | 0xC13680   | 2
I want the result such that rows where comment_level = 1 are in descending order by comment_id, and the other rows (i.e. where comment_level != 1) are in ascending order by comment_id, but placed according to the order of the level-1 rows: each comment_level > 1 row should follow the level-1 row whose comment_id is closest below its own.
Result should look like this
NULL 0xC130 1
123 0xC13580 2
456 0xC135AC 3
123 0xC13680 2
NULL 0xC110 1
NULL 0xBE 1
NULL 0xBA 1
NULL 0xB6 1
NULL 0xB2 1
Note that the comment_level > 1 rows above (shown in bold in the original post) sort by comment_id in ascending order, but they come after their "main" row (with comment_level = 1), while the main rows themselves sort DESC by comment_id.
I tried creating two tables for the different comment levels and sorting each before a union, but it didn't work, because two different ORDER BY clauses don't combine that way. I also tried the approach from Using different order by with union, but it gave me an error, and even if it had worked it still might not have given me the whole answer.
I think I understand what you're going for, and a UNION will not be able to do it.
To accomplish this, each row needs to match with a specific "parent" row that does have 1 for the comment_level. If the comment_level is already 1, the row is its own parent. Then we can sort first by the comment_id from that parent record DESC, and then sort ascending by the local comment_id within a given group of matching parent records.
You'll need something like this:
SELECT t0.*
FROM [MyTable] t0
CROSS APPLY (
SELECT TOP 1 comment_id
FROM [MyTable] t1
WHERE t1.comment_level = 1 AND t1.comment_id <= t0.comment_id
ORDER BY t1.comment_id DESC
) parents
ORDER BY parents.comment_id DESC,
case when t0.comment_level = 1 then 0 else 1 end,
t0.comment_id
See it work here:
https://dbfiddle.uk/qZBb3YjO
There's probably also a solution using a windowing function that will be more efficient.
And here it is:
SELECT parent_id, comment_id, comment_level
FROM (
SELECT t0.*, t1.comment_id as t1_comment_id
, row_number() OVER (PARTITION BY t0.comment_id ORDER BY t1.comment_id desc) rn
FROM [MyTable] t0
LEFT JOIN [MyTable] t1 ON t1.comment_level = 1 and t1.comment_id <= t0.comment_id
) d
WHERE rn = 1
ORDER BY t1_comment_id DESC,
case when comment_level = 1 then 0 else 1 end,
comment_id
See it here:
https://dbfiddle.uk/me1vGNdM
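The grouping idea behind both queries can be sketched in plain Python (hex strings stand in for the varbinary values; their lexicographic order happens to match the byte order for this sample): each row is grouped under the closest level-1 comment_id at or below its own, groups sort descending, and members sort ascending within each group.

```python
rows = [  # (parent_id, comment_id, comment_level) from the question
    (None, "0xB2", 1), (None, "0xB6", 1), (None, "0xBA", 1),
    (None, "0xBE", 1), (None, "0xC110", 1), (None, "0xC130", 1),
    (123, "0xC13580", 2), (456, "0xC135AC", 3), (123, "0xC13680", 2),
]

level1_ids = [cid for _, cid, lvl in rows if lvl == 1]

def parent(cid):
    # closest level-1 comment_id at or below this one (the TOP 1 subquery)
    return max(p for p in level1_ids if p <= cid)

# within a group: the level-1 row first, then children ascending ...
ordered = sorted(rows, key=lambda r: (r[2] != 1, r[1]))
# ... and the groups themselves descending by their parent comment_id
# (Python's sort is stable, so the within-group order survives)
ordered.sort(key=lambda r: parent(r[1]), reverse=True)

print([cid for _, cid, _ in ordered])
```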
Those varbinary values aren't arbitrary, they're hierarchyid values. And just a guess, they're probably typed that way in the schema (it's too much to be a coincidence). We can use this fact to do what we're looking to do.
with d_raw as (
select * from (values
(NULL, 0xC130 , 1),
(123 , 0xC13580, 2),
(456 , 0xC135AC, 3),
(123 , 0xC13680, 2),
(NULL, 0xC110 , 1),
(NULL, 0xBE , 1),
(NULL, 0xBA , 1),
(NULL, 0xB6 , 1),
(NULL, 0xB2 , 1)
) as x(parent_id, comment_id, comment_level)
),
d as (
select parent_id, comment_id = cast(comment_id as hierarchyid), comment_level
from d_raw
)
select *,
comment_id.ToString(),
comment_id.GetAncestor(comment_id.GetLevel() - 1).ToString()
from d
order by comment_id.GetAncestor(comment_id.GetLevel() - 1) desc,
comment_id
Note - the CTEs that I'm using are just to get the data into the right format and the last SELECT adds additional columns just to show what's going on; you can omit them from your query without consequence. I think the only interesting part of this solution is using comment_id.GetAncestor(comment_id.GetLevel() - 1) to get the root-level node.
One way to do this, and possibly the only one, is to create two separate queries and union them together. Something like:
(select * from table where comment_level = 1 order by comment_id desc)
union
(select * from table where not comment_level = 1 order by comment_id asc)
I’m trying to write a query in BigQuery that produces the count of the unique transactions and the combination of column names populated.
I have a table:
TRAN CODE | Full Name | Given Name | Surname | DOB | Phone
The result set I’m after is:
TRAN CODE | UNIQUE TRANSACTIONS | NAME OF POPULATED COLUMNS
A         | 3                   | Full Name
A         | 4                   | Full Name,Phone
B         | 5                   | Given Name,Surname
B         | 10                  | Given Name,Surname,DOB,Phone
The result set shows that for TRAN CODE A
3 distinct customers provided Full Name
4 distinct customers provided Full Name and Phone #
For TRAN CODE B
5 distinct customers provided Given Name and Surname
10 distinct customers provided Given Name, Surname, DOB, Phone #
Currently to produce my results I’m doing it manually.
I tried using ARRAY_AGG but couldn’t get it working.
Any advice would be appreciated.
Thank you.
I think you want something like this:
select tran_code,
array_to_string(array[case when full_name is not null then 'full_name' end,
case when given_name is not null then 'given_name' end,
case when surname is not null then 'surname' end,
case when dob is not null then 'dob' end,
case when phone is not null then 'phone' end
], ','),
count(*)
from t
group by 1, 2
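The shape of this query translates to a few lines of Python (sample rows hypothetical, since the question gives no raw data): build the populated-columns label per row, then count per (tran_code, label) group:

```python
# Build a "which columns are populated" label per row and count the
# rows that share the same (tran_code, label) combination.
from collections import Counter

rows = [
    {"tran_code": "A", "full_name": "Jo", "given_name": None,
     "surname": None, "dob": None, "phone": None},
    {"tran_code": "A", "full_name": "Al", "given_name": None,
     "surname": None, "dob": None, "phone": "555-1234"},
    {"tran_code": "B", "full_name": None, "given_name": "Bo",
     "surname": "Li", "dob": None, "phone": None},
]

cols = ["full_name", "given_name", "surname", "dob", "phone"]
counts = Counter(
    (r["tran_code"], ",".join(c for c in cols if r[c] is not None))
    for r in rows
)
print(counts)
```

Note that, like the query above, this counts rows per combination; if "unique transactions" means distinct customers, a distinct key would have to be counted instead.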
Consider below approach - no any dependency on column names rather than TRAN_CODE - quite generic!
select TRAN_CODE,
count(distinct POPULATED_VALUES) as UNIQUE_TRANSACTIONS,
POPULATED_COLUMNS
from (
select TRAN_CODE,
( select as struct
string_agg(col, ', ' order by offset) POPULATED_COLUMNS,
string_agg(val order by offset) POPULATED_VALUES,
string_agg(cast(offset as string) order by offset) pos
from unnest(regexp_extract_all(to_json_string(t), r'"([^"]+?)":')) col with offset
join unnest(regexp_extract_all(to_json_string(t), r'"[^"]+?":("[^"]+?"|null)')) val with offset
using(offset)
where val != 'null'
and col != 'TRAN_CODE'
).*
from `project.dataset.table` t
)
group by TRAN_CODE, POPULATED_COLUMNS
order by TRAN_CODE, any_value(pos)
(An example of the output was shown as a screenshot in the original answer.)
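The regex-over-JSON trick at the heart of this query can be imitated in Python to see what it extracts for a single row (row contents hypothetical):

```python
# Serialize the row to JSON, then read off the keys and values with the
# same regex idea the query uses; non-null keys (other than TRAN_CODE)
# are the populated columns, in column order.
import json
import re

row = {"TRAN_CODE": "A", "Full_Name": "Jo", "Phone": None}
s = json.dumps(row)  # '{"TRAN_CODE": "A", "Full_Name": "Jo", "Phone": null}'

cols = re.findall(r'"([^"]+?)":', s)
vals = re.findall(r'"[^"]+?": ?("[^"]*"|null)', s)
populated = [c for c, v in zip(cols, vals) if v != "null" and c != "TRAN_CODE"]
print(populated)  # ['Full_Name']
```

As with the BigQuery original, this relies on the values themselves not containing `":` sequences that would confuse the regex.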
#Gordon_Linoff's solution is the best, but an alternative would be to do the following:
SELECT
TRAN_CODE,
COUNT(TRAN_ROW) AS unique_transactions,
populated_columns
FROM (
SELECT
TRAN_CODE,
TRAN_ROW,
# COUNT(value) AS unique_transactions,
STRING_AGG(field, ",") AS populated_columns
FROM (
SELECT
* EXCEPT(DOB),
CAST(DOB AS STRING ) AS DOB,
ROW_NUMBER() OVER () AS TRAN_ROW
FROM
sample) UNPIVOT(value FOR field IN (Full_name,
Given_name,
Surname,
DOB,
Phone))
GROUP BY
TRAN_CODE,
TRAN_ROW )
GROUP BY
TRAN_CODE,
populated_columns
But this should be more expensive...
I am trying to filter substrings from a string, which I achieve with the query below. I can print a count value, but it is always 1; I need the real count.
#standardSQL
WITH `project.dataset.table` AS (
select term from(
select LOWER(REGEXP_EXTRACT(textPayload,"Search term:(.*)")) as term from `log_dataset.backend_*`
where REGEXP_CONTAINS(textPayload, "Search term:.*")=true
)
group by term
order by count(*) desc
), temp AS (
SELECT term, COUNT(1) `count`
FROM `project.dataset.table`
GROUP BY term
)
SELECT term , `count` FROM (
SELECT term, `count`, STARTS_WITH(prev_str, term) AND
ARRAY_LENGTH(REGEXP_EXTRACT_ALL(term, r' ')) = ARRAY_LENGTH(REGEXP_EXTRACT_ALL(prev_str, r' ')) AS flag
FROM (
SELECT term, `count`, LAG(term) OVER(ORDER BY term DESC) AS prev_str
FROM temp
)
)
WHERE NOT IFNULL(flag, FALSE)
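The LAG/STARTS_WITH filter above can be sketched in plain Python over a few of the question's terms: walking the distinct terms in descending order, a term is dropped when the previous (longer) term extends it without adding a word:

```python
from collections import Counter

terms = (["understand it yes it"] * 2
         + ["understand it yes", "understand it ye", "understand it y"]
         + ["understand it"] * 2
         + ["understand"])

counts = Counter(terms)                      # the grouped `temp` CTE
keep, prev = {}, None
for term in sorted(counts, reverse=True):    # ORDER BY term DESC
    flag = (prev is not None
            and prev.startswith(term)                # STARTS_WITH(prev_str, term)
            and prev.count(" ") == term.count(" "))  # same number of spaces
    if not flag:                             # WHERE NOT IFNULL(flag, FALSE)
        keep[term] = counts[term]
    prev = term

print(keep)
```

The space-count check is what lets "understand it" (1 space) survive even though "understand it y" (2 spaces) starts with it.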
Here is a list of terms:
anderstand
anderstan
andersta
anderst
understand
understan
understa
underst
unders
under
understand i
understand i
understand it
understand it
understand it y
understand it ye
understand it yes
understand it yes it
understand it yes it
Desired output is
Row str count
1 understand it yes it 2
2 anderstand 1
3 understand it yes 1
4 understand 1
5 understand it 2
To obtain the desired output you can employ a GROUP BY statement as follows:
SELECT
str,
COUNT(*) AS count
FROM
`project_id.dataset.table`
GROUP BY
str
In addition, the LIKE operator can be used to filter the words in the str field.
I need to create some kind of a uniform query for multiple tables. Some tables contain a certain column with a type. If this is the case, I need to apply filtering to it. I don't know how to do this.
I have for example two tables
table_customer_1
CustomerId, CustomerType
1, 1
2, 1
3, 2
Table_customer_2
Customerid
4
5
6
The query needs to be something like the one below and should work for both tables (the table name will be replaced by the customer that uses the query):
With input1 as(
SELECT
(CASE WHEN exists(customerType) THEN customerType ELSE "0" END) as customerType, *
FROM table_customer_1)
SELECT * from input1
WHERE customerType != 2
Below is for BigQuery Standard SQL
#standardSQL
SELECT *
FROM `project.dataset.table` t
WHERE SAFE_CAST(IFNULL(JSON_EXTRACT_SCALAR(TO_JSON_STRING(t), '$.CustomerType'), '0') AS INT64) != 2
or as a simplification you can ignore casting to INT64 and use comparison to STRING
#standardSQL
SELECT *
FROM `project.dataset.table` t
WHERE IFNULL(JSON_EXTRACT_SCALAR(TO_JSON_STRING(t), '$.CustomerType'), '0') != '2'
above will work for whatever table you put instead of project.dataset.table: either project.dataset.table_customer_1 or project.dataset.table_customer_2 - so quite generic I think
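The same trick can be imitated in Python to see why it is schema-agnostic: serializing the row and reading the field back with a default means tables with and without CustomerType go through identical code (sample rows taken from the question):

```python
import json

def customer_type(row):
    # mirrors IFNULL(JSON_EXTRACT_SCALAR(TO_JSON_STRING(t),
    #                '$.CustomerType'), '0')
    return str(json.loads(json.dumps(row)).get("CustomerType", "0"))

table_customer_1 = [{"CustomerId": 1, "CustomerType": 1},
                    {"CustomerId": 2, "CustomerType": 1},
                    {"CustomerId": 3, "CustomerType": 2}]
table_customer_2 = [{"CustomerId": 4}, {"CustomerId": 5}, {"CustomerId": 6}]

for table in (table_customer_1, table_customer_2):
    kept = [r for r in table if customer_type(r) != "2"]
    print([r["CustomerId"] for r in kept])
```

Rows with CustomerType = 2 are filtered from the first table; the second table, which lacks the column, passes through untouched.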
I can think of no good reason for doing this. However, it is possible by playing with the scoping rules for subqueries:
SELECT t.*
FROM (SELECT t.*,
(SELECT customerType -- will choose from tt if available, otherwise x
FROM table_customer_1 tt
WHERE tt.Customerid = t.Customerid
) as customerType
FROM (SELECT t.* EXCEPT (Customerid)
FROM table_customer_1 t
) t CROSS JOIN
(SELECT 0 as customerType) x
) t
WHERE customerType <> 2
I have a VARCHAR column in a SQL Server 2000 database that can contain either letters or numbers. It depends on how the application is configured on the front-end for the customer.
When it does contain numbers, I want it to be sorted numerically, e.g. as "1", "2", "10" instead of "1", "10", "2". Fields containing just letters, or letters and numbers (such as 'A1') can be sorted alphabetically as normal. For example, this would be an acceptable sort order.
1
2
10
A
B
B1
What is the best way to achieve this?
One possible solution is to pad the numeric values with a character in front so that all are of the same string length.
Here is an example using that approach:
select MyColumn
from MyTable
order by
case IsNumeric(MyColumn)
when 1 then Replicate('0', 100 - Len(MyColumn)) + MyColumn
else MyColumn
end
The 100 should be replaced with the actual length of that column.
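The padding idea carries over directly to a plain-Python sort key. Note that `str.isdigit` is a stricter test than T-SQL's IsNumeric, which also accepts things like '$' and 'e0':

```python
# Left-pad numeric strings with zeros to a fixed width so that plain
# string comparison orders them numerically; padded numbers start with
# '0' and therefore sort before any alphabetic value.
def sort_key(value, width=100):
    if value.isdigit():                   # stands in for IsNumeric(...) = 1
        return value.rjust(width, "0")    # Replicate('0', ...) + value
    return value

data = ["1", "10", "2", "A", "B1", "B"]
print(sorted(data, key=sort_key))  # ['1', '2', '10', 'A', 'B', 'B1']
```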
There are a few possible ways to do this.
One would be
SELECT
...
ORDER BY
CASE
WHEN ISNUMERIC(value) = 1 THEN CONVERT(INT, value)
ELSE 9999999 -- or something huge
END,
value
the first part of the ORDER BY converts everything to an int (with a huge value for non-numerics, to sort last) then the last part takes care of alphabetics.
Note that the performance of this query is probably at least moderately ghastly on large amounts of data.
select
Field1, Field2...
from
Table1
order by
isnumeric(Field1) desc,
case when isnumeric(Field1) = 1 then cast(Field1 as int) else null end,
Field1
This will return values in the order you gave in your question.
Performance won't be too great with all that casting going on, so another approach is to add another column to the table in which you store an integer copy of the data and then sort by that first and then the column in question. This will obviously require some changes to the logic that inserts or updates data in the table, to populate both columns. Either that, or put a trigger on the table to populate the second column whenever data is inserted or updated.
SELECT *, CONVERT(int, your_column) AS your_column_int
FROM your_table
ORDER BY your_column_int
OR
SELECT *, CAST(your_column AS int) AS your_column_int
FROM your_table
ORDER BY your_column_int
Both are fairly portable I think.
You can always convert your varchar column to BIGINT, as INT might be too short...
select cast([yourvarchar] as BIGINT)
but you should always take care with alpha characters:
where ISNUMERIC([yourvarchar] +'e0') = 1
the +'e0' comes from http://blogs.lessthandot.com/index.php/DataMgmt/DataDesign/isnumeric-isint-isnumber
this would lead to your statement
SELECT
*
FROM
Table
ORDER BY
ISNUMERIC([yourvarchar] +'e0') DESC
, LEN([yourvarchar]) ASC
the first sorting column will put numeric on top.
the second sorts by length, so 10 will precede 0001 (which is stupid?!)
this leads to the second version:
SELECT
*
FROM
Table
ORDER BY
ISNUMERIC([yourvarchar] +'e0') DESC
, RIGHT('00000000000000000000'+[yourvarchar], 20) ASC
the second column now gets left-padded with '0'-chars, so natural sorting puts integers with leading zeros (0, 01, 10, 0100, ...) in correct order (correct!) - but all alphas would be padded with '0'-chars as well (a performance cost)
so third version:
SELECT
*
FROM
Table
ORDER BY
ISNUMERIC([yourvarchar] +'e0') DESC
, CASE WHEN ISNUMERIC([yourvarchar] +'e0') = 1
       THEN RIGHT('00000000000000000000' + [yourvarchar], 20)
       ELSE LTRIM(RTRIM([yourvarchar]))
  END ASC
Now numbers first get padded with '0'-chars (of course, the length of 20 could be increased) - which sorts numbers correctly - and alphas only get trimmed.
I solved it in a very simple way by writing this in the ORDER BY part:
ORDER BY (
sr.codice +0
)
ASC
This seems to work very well, in fact I had the following sorting:
16079 Customer X
016082 Customer Y
16413 Customer Z
So the leading 0 in front of 016082 is handled correctly.
This seems to work:
select your_column
from your_table
order by
case when isnumeric(your_column) = 1 then cast(your_column as int) else 999999999 end,
your_column
This query may be helpful for you. In it, a column with data type varchar is arranged in a sensible order. For example, if the column contains G1, G34, G10, G3, then after running this query you see the results G1, G3, G10, G34.
SELECT *,
(CASE WHEN ISNUMERIC(column_name) = 1 THEN 0 ELSE 1 END) IsNum
FROM table_name
ORDER BY IsNum, LEN(column_name), column_name;
This may help you; I tried it when I ran into the same issue.
SELECT *
FROM tab
ORDER BY IIF(TRY_CAST(val AS INT) IS NULL, 1, 0),TRY_CAST(val AS INT);
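TRY_CAST's null-on-failure behaviour maps naturally onto a Python tuple sort key; a sketch with the sample values from the earlier answers:

```python
# Numbers sort first, by numeric value; everything else gets a 1 in the
# first tuple slot so it sorts after, alphabetically - the same effect
# as IIF(TRY_CAST(val AS INT) IS NULL, 1, 0), TRY_CAST(val AS INT).
def try_cast_key(value):
    try:
        return (0, int(value), "")
    except ValueError:
        return (1, 0, value)

data = ["10", "B", "2", "A", "1", "B1"]
print(sorted(data, key=try_cast_key))  # ['1', '2', '10', 'A', 'B', 'B1']
```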
The easiest and most efficient way to get the job done is using TRY_CAST:
SELECT my_column
FROM my_table
WHERE <condition>
ORDER BY TRY_CAST(my_column AS NUMERIC) DESC
This will sort all numbers in descending order and push all non-numeric values to the bottom.
SELECT FIELD FROM TABLE
ORDER BY
isnumeric(FIELD) desc,
CASE ISNUMERIC(FIELD)
WHEN 1 THEN CAST(CAST(FIELD AS MONEY) AS INT)
ELSE NULL
END,
FIELD
As per this link you need to cast to MONEY then INT to avoid ordering '$' as a number.
SELECT *,
ROW_NUMBER()OVER(ORDER BY CASE WHEN ISNUMERIC (ID)=1 THEN CONVERT(NUMERIC(20,2),SUBSTRING(Id, PATINDEX('%[0-9]%', Id), LEN(Id)))END DESC)Rn ---- numerical
FROM
(
SELECT '1'Id UNION ALL
SELECT '25.20' Id UNION ALL
SELECT 'A115' Id UNION ALL
SELECT '2541' Id UNION ALL
SELECT '571.50' Id UNION ALL
SELECT '67' Id UNION ALL
SELECT 'B48' Id UNION ALL
SELECT '500' Id UNION ALL
SELECT '147.54' Id UNION ALL
SELECT 'A-100' Id
)A
ORDER BY
CASE WHEN ISNUMERIC (ID)=0 /* alphabetical sort */
THEN CASE WHEN PATINDEX('%[0-9]%', Id)=0
THEN LEFT(Id,PATINDEX('%[0-9]%',Id))
ELSE LEFT(Id,PATINDEX('%[0-9]%',Id)-1)
END
END DESC