I want to display all the numbers (even / odd / mixed) between two numbers (1-9; 2-10; 11-20) in one (or two) columns.
Example initial data:
| rang  |      | r1 | r2 |
|-------|      |----|----|
| 1-9   |      |  1 |  9 |
| 2-10  |  or  |  2 | 10 |
| 11-20 |      | 11 | 20 |
CREATE TABLE initialtableone(rang TEXT);
INSERT INTO initialtableone(rang) VALUES
('1-9'),
('2-10'),
('11-20');
CREATE TABLE initialtabletwo(r1 NUMERIC, r2 NUMERIC);
INSERT INTO initialtabletwo(r1, r2) VALUES
('1', '9'),
('2', '10'),
('11', '20');
Result:
| output                        |
|-------------------------------|
| 1,3,5,7,9                     |
| 2,4,6,8,10                    |
| 11,12,13,14,15,16,17,18,19,20 |
Something like this:
create table ranges (range varchar);
insert into ranges
values
('1-9'),
('2-10'),
('11-20');
with bounds as (
  select row_number() over (order by range) as rn,
         range,
         (regexp_split_to_array(range,'-'))[1]::int as start_value,
         (regexp_split_to_array(range,'-'))[2]::int as end_value
  from ranges
)
select rn, range, string_agg(i::text, ',' order by i.ordinality)
from bounds b
  cross join lateral generate_series(b.start_value, b.end_value) with ordinality i
group by rn, range
This outputs:
rn | range | string_agg
---+-------+------------------------------
3 | 2-10 | 2,3,4,5,6,7,8,9,10
1 | 1-9 | 1,2,3,4,5,6,7,8,9
2 | 11-20 | 11,12,13,14,15,16,17,18,19,20
Building on your first example, simplified, but with PK:
CREATE TABLE tbl1 (
tbl1_id serial PRIMARY KEY -- optional
, rang text -- can be NULL ?
);
Use split_part() to extract the lower and upper bounds (regexp_split_to_array() would be needlessly expensive and error-prone), and generate_series() to generate the numbers.
Use a LATERAL join and aggregate the set immediately to simplify aggregation. An ARRAY constructor is fastest in this case:
SELECT t.tbl1_id, a.output -- array; added id is optional
FROM (
SELECT tbl1_id
, split_part(rang, '-', 1)::int AS a
, split_part(rang, '-', 2)::int AS z
FROM tbl1
) t
, LATERAL (
SELECT ARRAY( -- preserves rows with NULL
SELECT g FROM generate_series(a, z, CASE WHEN (z-a)%2 = 0 THEN 2 ELSE 1 END) g
) AS output
) a;
AIUI, you want every number in the range only if upper and lower bound are a mix of even and odd numbers. Else, only return every 2nd number, resulting in even / odd numbers for those cases. This expression implements the calculation of the interval:
CASE WHEN (z-a)%2 = 0 THEN 2 ELSE 1 END
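To see how that expression plays out for the sample data, here is a quick check (the diff/step column names are just for this illustration):

SELECT rang
     , split_part(rang, '-', 2)::int
     - split_part(rang, '-', 1)::int AS diff
     , CASE WHEN (split_part(rang, '-', 2)::int
                - split_part(rang, '-', 1)::int) % 2 = 0
            THEN 2 ELSE 1 END AS step
FROM   tbl1;

-- rang  | diff | step  -> generated numbers
-- 1-9   |    8 |    2  -> 1,3,5,7,9
-- 2-10  |    8 |    2  -> 2,4,6,8,10
-- 11-20 |    9 |    1  -> 11,12,...,20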
Result as desired:
output
-----------------------------
1,3,5,7,9
2,4,6,8,10
11,12,13,14,15,16,17,18,19,20
You do not need WITH ORDINALITY in this case, because the order of elements is guaranteed.
The aggregate function array_agg() makes the query slightly shorter (but slower) - or use string_agg() to produce a string directly, depending on your desired output format:
SELECT a.output -- string
FROM (
SELECT split_part(rang, '-', 1)::int AS a
, split_part(rang, '-', 2)::int AS z
FROM tbl1
) t
, LATERAL (
SELECT string_agg(g::text, ',') AS output
FROM generate_series(a, z, CASE WHEN (z-a)%2 = 0 THEN 2 ELSE 1 END) g
) a;
Note a subtle difference when using an aggregate function or ARRAY constructor in the LATERAL subquery: Normally, rows with rang IS NULL are excluded from the result because the LATERAL subquery returns no row.
If you aggregate the result immediately, "no row" is transformed to one row with a NULL value, so the original row is preserved. I added demos to the fiddle.
SQL Fiddle.
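For readers without the fiddle handy, a minimal sketch of that difference (it assumes one extra row with a NULL rang):

INSERT INTO tbl1(rang) VALUES (NULL);

-- A plain set-returning LATERAL drops the row: generate_series(NULL, NULL)
-- returns no row, so the join eliminates the original row entirely.
SELECT t.tbl1_id, g
FROM  (SELECT tbl1_id
            , split_part(rang, '-', 1)::int AS a
            , split_part(rang, '-', 2)::int AS z
       FROM   tbl1) t
     , generate_series(a, z) g;

-- With ARRAY(...) or string_agg(...) in the subquery, "no row" becomes one
-- row with an empty / NULL result, so the original row is preserved.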
You do not need a CTE for this, which would be more expensive.
Aside: The type conversion to integer removes leading / trailing white space automatically, so a string like this works as well for rang: ' 1 - 3'.
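A quick demonstration of that aside:

SELECT split_part(' 1 - 3', '-', 1)::int AS a  -- ' 1 ' -> 1
     , split_part(' 1 - 3', '-', 2)::int AS z; -- ' 3'  -> 3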
Say I have a table Schema.table with these columns
id | json_col
of the form, e.g.
id = 1
json_col = {"names":["John","Peter"],"ages":["31","40"]}
The lengths of names and ages are always equal but may vary from id to id (the size is at least 1, with no upper limit).
How do we get an "exploded" table - one with a row for each name/age pair - e.g.
id | names | ages
---+-------+------
1 | John | 31
1 | Peter | 40
2 | Jim | 17
3 | Foo | 2
...
I have tried OPENJSON and CROSS APPLY, but the following gives every combination of names and ages, which is not correct, so I need to do a lot of filtering afterwards:
SELECT *
FROM Schema.table
CROSS APPLY OPENJSON(Schema.table,'$.names')
CROSS APPLY OPENJSON(Schema.table,'$.ages')
Here's my suggestion:
DECLARE @tbl TABLE(id INT, json_col NVARCHAR(MAX));
INSERT INTO @tbl VALUES(1,N'{"names":["John","Peter"],"ages":["31","40"]}')
                      ,(2,N'{"names":["Jim"],"ages":["17"]}');
SELECT t.id
,B.[key] As ValueIndex
,B.[value] AS PersonName
,JSON_VALUE(A.ages,CONCAT('$[',B.[key],']')) AS PersonAge
FROM @tbl t
CROSS APPLY OPENJSON(t.json_col)
WITH(names NVARCHAR(MAX) AS JSON
,ages NVARCHAR(MAX) AS JSON) A
CROSS APPLY OPENJSON(A.names) B;
The idea in short:
We use OPENJSON with a WITH clause to read names and ages as JSON arrays.
We use one more OPENJSON to "explode" the names-array
As the key is the value's position within the array, we can use JSON_VALUE() to read the corresponding age value by its position (see the standalone sketch below).
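To illustrate just that positional lookup in isolation, a minimal sketch (the @ages variable is invented for the demo; like the query above, a non-literal path in JSON_VALUE() needs SQL Server 2017+):

DECLARE @ages NVARCHAR(MAX) = N'["31","40"]';

SELECT JSON_VALUE(@ages, '$[0]') AS age_at_0                 -- 31
     , JSON_VALUE(@ages, CONCAT('$[', 1, ']')) AS age_at_1;  -- 40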
One general remark: If this JSON is under your control, you should change it to an entity-centered approach (an array of objects). Such position-dependent storage can be quite error-prone... Try something like
{"persons":[{"name":"John","age":"31"},{"name":"Peter","age":"40"}]}
Conditional aggregation along with CROSS APPLY can also be used:
SELECT id,
MAX(CASE WHEN RowKey = 'names' THEN value END) AS names,
MAX(CASE WHEN RowKey = 'ages' THEN value END) AS ages
FROM
(
SELECT id, Q0.[value] AS RowArray, Q0.[key] AS RowKey
FROM tab
CROSS APPLY OPENJSON(JsonCol) AS Q0
) r
CROSS APPLY OPENJSON(r.RowArray) v
GROUP BY id, v.[key]
ORDER BY id, v.[key]
id | names | ages
---+-------+------
1 | John | 31
1 | Peter | 40
2 | Jim | 17
3 | Foo | 2
Demo
Note that the first argument to OPENJSON must be a JSON value (e.g. the json_col column), not the table itself.
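For illustration, the original attempt with the column passed in instead (brackets added because table is a reserved word); note that it still yields every combination of names and ages, which is exactly why the answers above pair the arrays by [key]:

SELECT *
FROM [Schema].[table] t
CROSS APPLY OPENJSON(t.json_col, '$.names') n
CROSS APPLY OPENJSON(t.json_col, '$.ages') a;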
I have the following data:
cte
=================
gp_id | m_ids
------|----------
1 | {123}
2 | {432,222}
3 | {123,222}
And a function with a signature like this (which in fact returns not a table but a couple of ids):
FUNCTION foo(m_ids integer[])
RETURNS TABLE (
first_id integer,
second_id integer
)
Now, I've got to iterate over each row and perform some calculations with that function, so I would get something like this:
gp_id | first_id | second_id
------|----------|-----------
1 | 25 | 25
2 | 13 | 24
3 | 25 | 11
To achieve that I tried the following code:
SELECT gp_id,
       (SELECT *
        FROM foo((SELECT m_ids
                  FROM cte c2
                  WHERE c2.gp_id = c1.gp_id))
        LIMIT 1)
FROM cte c1
The problem is in the SELECT * statement. If I use SELECT first_id, everything works well (except that I'd have to repeat the subquery for each output column, which I'd like to avoid, obviously), but in the former case I'm getting the error
subquery must return only one column
which is somewhat expected.
So how can I correctly iterate over the table in one single query?
Use the function in a lateral join:
select gp_id, first_id, second_id
from cte,
lateral foo(m_ids);
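Equivalent with explicit syntax and column aliases (in Postgres a function call in the FROM list may reference earlier FROM items even without the keyword, so LATERAL is effectively a noise word here):

select c.gp_id, f.first_id, f.second_id
from cte c
cross join lateral foo(c.m_ids) as f;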
I need to SORT all the digits from some string values in Postgres.
For instance, if I have two strings, e.g.
"70005" ==> "00057"
"70001" ==> "00017"
"32451" ==> "12345"
I can't cast the strings to integer or bigint due to limitations in my logic. Is it possible to do this?
Use a recursive CTE. Take the first char; if it is '0', ignore it, otherwise prepend it to the target string.
Then use LPAD to left-pad the result with '0' until you reach length 10.
SQL DEMO
WITH RECURSIVE cte (id, source, target) as (
SELECT 1 as id, '70001' as source , '' as target
UNION
SELECT 2 as id, '70005' as source , '' as target
UNION ALL
SELECT id,
substring(source from 2 for length(source)-1) as source,
CASE WHEN substring(source from 1 for 1) = '0' THEN target
ELSE substring(source from 1 for 1) || target
END
FROM cte
WHERE length(source) > 0
), reverse as (
SELECT id,
target,
row_number() over (partition by id
order by length(target) desc) rn
FROM cte
)
SELECT id, LPAD(target::text, 10, '0')
FROM reverse
WHERE rn = 1
OUTPUT
| id | lpad |
|----|------------|
| 1 | 0000000017 |
| 2 | 0000000057 |
Assuming that your data is organized like this:
Table: strings
| id | string |
|----+---------|
| 1 | '70005' |
| 2 | '70001' |
etc...
Then you can use a query like this:
SELECT all_digits.id,
array_to_string(array_agg(all_digits.digit ORDER BY all_digits.digit), '')
FROM (
SELECT strings.id, digits.digit
FROM strings, unnest(string_to_array(strings.string, NULL)) digits(digit)
) all_digits
GROUP BY all_digits.id
What this query does is split each string into one row per character, and then aggregate the characters back into a single string in sorted order.
There's a SQL fiddle here: http://sqlfiddle.com/#!15/7f7fb0/14
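Essentially the same idea, slightly shorter with string_agg() - a sketch, assuming Postgres 9.1+ for string_to_array() with a NULL separator:

SELECT s.id, string_agg(d.digit, '' ORDER BY d.digit) AS sorted
FROM strings s
   , unnest(string_to_array(s.string, NULL)) AS d(digit)
GROUP BY s.id;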
How to sort this table in Oracle 9:
START | END | VALUE
A | F | 1
D | H | 9
F | C | 8
C | D | 12
To make it look like this:
START | END | VALUE
A | F | 1
F | C | 8
C | D | 12
D | H | 9
The goal is for every row to start with the END value of the previous row.
This cannot be done with the ORDER BY clause alone, as we'd have to find the record without a predecessor first, then find the next record by comparing the END and START columns of the two records, and so on. This is an iterative process, for which you need a recursive query.
That recursive query would find the first record, then the next and so on, giving them sequence numbers. Then you'd use the result and order by those generated numbers.
Here is how to do it in standard SQL. This is supported from Oracle 11g onwards only, however. In Oracle 9 you'll have to use CONNECT BY with which I am not familiar. Hopefully you or someone else can convert the query for you:
with chain(startkey, endkey, value, pos) as
(
select startkey, endkey, value, 1 as pos
from mytable
where not exists (select * from mytable prev where prev.endkey = mytable.startkey)
union all
select mytable.startkey, mytable.endkey, mytable.value, chain.pos + 1 as pos
from chain
join mytable on mytable.startkey = chain.endkey
)
select startkey, endkey, value
from chain
order by pos;
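For what it's worth, a hedged sketch of what that CONNECT BY conversion might look like (untested; it assumes exactly one row has no predecessor):

select startkey, endkey, value, level as pos
from mytable
start with startkey not in (select endkey from mytable)
connect by startkey = prior endkey
order by level;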
UPDATE: As you say the data is cyclic, you'd have to change the above query so as to start with an arbitrarily chosen row and stop when it comes full circle:
with chain(startkey, endkey, value, pos) as
(
select startkey, endkey, value, 1 as pos
from mytable
where rownum = 1
union all
select mytable.startkey, mytable.endkey, mytable.value, chain.pos + 1 as pos
from chain
join mytable on mytable.startkey = chain.endkey
)
cycle startkey set cycle to '1' default '0'
select startkey, endkey, value
from chain
where cycle = '0'
order by pos;
I have a giant table that has billions of records like this:
ID | H | N | Q | other
-----+-----+------+-----+--------
AAAA | 0 | 7 | Y | ...
BBBB | 1 | 5 | Y | ...
CCCC | 0 | 11 | N | ...
DDDD | 3 | 123 | N | ...
EEEE | 6 | 4 | Y | ...
These four columns are part of an index. What I want to do is construct a query that gives me the 1st row, followed by the row at 10%, 20%, 30%, 40%, ... so that the query will always give me 10 rows regardless of how big the table is (as long as # rows >= 10).
Is this even possible with SQL? If so, how would I do it? What kind of performance characteristics does it have?
One option would be
SELECT id,
h,
n,
q
FROM (
SELECT id,
h,
n,
q,
row_number() over (partition by decile order by id, n) rn
FROM (
SELECT id,
h,
n,
q,
ntile(10) over (order by id, n) decile
FROM your_table
)
)
WHERE rn = 1
There is probably a more efficient approach using PERCENTILE_DISC or CUME_DIST that isn't clicking for me at the moment. But this should work.
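To see what the inner ntile() step does in isolation, a small sanity check (sketch):

-- ntile(10) deals the ordered rows into 10 buckets of (nearly) equal size;
-- row_number() above then keeps the first row of each bucket.
SELECT n, ntile(10) OVER (ORDER BY n) AS decile
FROM (SELECT level AS n FROM dual CONNECT BY level <= 100);
-- n = 1..10 -> decile 1, n = 11..20 -> decile 2, ..., n = 91..100 -> decile 10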
You can use a histogram to get this information. The huge downside is that the results will only be approximate, and it's very difficult to say how approximate they will be. And you'll need to gather table statistics to refresh the results, but you're probably already doing that. On the positive side, the query to get the results will be very fast. And using statistics instead of a query would be so cool.
Here's a quick demo:
--Create a table with the IDs AA - ZZ.
create table test(id varchar2(100), h number, n number, q varchar2(100)
,other varchar2(100));
insert into test
select letter1||letter2 letters, row_number() over (order by letter1||letter2), 1, 1, 1
from
(select chr(65+level-1) letter1 from dual connect by level <= 26) letters1
cross join
(select chr(65+level-1) letter2 from dual connect by level <= 26) letters2
;
commit;
--Gather stats, create a histogram with 10 buckets, which stores 11 endpoints (we'll only use the first 10)
begin
dbms_stats.gather_table_stats(user, 'TEST', cascade=>true,
method_opt=>'FOR ALL COLUMNS SIZE AUTO, FOR COLUMNS SIZE 10 ID');
end;
/
--Getting the values from user_histograms is kinda tricky, especially for varchars.
--There are problems with rounding, so some of the values may not actually exist.
--
--This query is from Jonathan Lewis:
-- http://jonathanlewis.wordpress.com/2010/10/05/frequency-histogram-4/
select
endpoint_number,
endpoint_number - nvl(prev_endpoint,0) frequency,
hex_val,
chr(to_number(substr(hex_val, 2,2),'XX')) ||
chr(to_number(substr(hex_val, 4,2),'XX')) ||
chr(to_number(substr(hex_val, 6,2),'XX')) ||
chr(to_number(substr(hex_val, 8,2),'XX')) ||
chr(to_number(substr(hex_val,10,2),'XX')) ||
chr(to_number(substr(hex_val,12,2),'XX')),
endpoint_actual_value
from (
select
endpoint_number,
lag(endpoint_number,1) over(
order by endpoint_number
) prev_endpoint,
to_char(endpoint_value,'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX') hex_val,
endpoint_actual_value
from
user_histograms
where table_name = 'TEST'
and column_name = 'ID'
)
where
endpoint_number < 10
order by
endpoint_number
;
Here's a comparison of the histogram results with the real results from @Justin Cave's query:
Histogram:   Real results:
A#           AA
CP           CQ
FF           FG
HV           HW
KL           KM
NB           NC
PR           PS
SG           SH
UU           UW
XK           XL