I have been trying to find a solution to this problem for some time, without success, so any help would be much appreciated. A list of IDs needs to be compared against a table to find out which records exist (along with one of their values) and which do not. The list of IDs is in text format:
100,
200,
300
a DB table:
ID(PK) value01 value02 value03 .....
--------------------------------------
100 Ann
102 Bob
300 John
304 Marry
400 Jane
and the output I need is:
100 Ann
200 (missing, empty, or whatever indication)
300 John
The obvious solution is to create a table and join, but I have only read access (the DB is a closed vendor product; I'm just a user). Writing a PL/SQL function also seems complicated, because the table has 200+ columns and 100k+ records, and I had no luck creating a dynamic array of records. Also, the list of IDs to be checked contains hundreds of IDs, and I need to do this periodically, so any solution where each ID has to be changed on a separate line of code wouldn't be very useful.
Database is Oracle 10g.
There are many built-in public collection types; you can leverage one of them like this:
with ids as (
  select /*+ cardinality(a, 1) */ column_value id
  from table(UTL_NLA_ARRAY_INT(100, 200, 300)) a
)
select ids.id,
       case when m.id is null then '**NO MATCH**' else m.value end value
from ids
left outer join my_table m
  on m.id = ids.id;
To see a list of the public collection types on your DB, run:
select owner, type_name, coll_type, elem_type_name, upper_bound, precision, scale
from all_coll_types
where elem_type_name in ('FLOAT', 'INTEGER', 'NUMBER', 'DOUBLE PRECISION');
The hint
/*+ cardinality(a, 1) */
is just used to tell Oracle how many elements are in the array (if not specified, the optimizer assumes a default of about 8k elements). Just set it to a reasonably accurate number.
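For example, SYS.ODCINUMBERLIST is a public VARRAY of NUMBER that should show up in that query's output on most installations, so the same idea could be written as follows (an untested sketch; my_table, id, and value01 are names assumed from the question, and the cardinality is set to the actual element count):

with ids as (
  select /*+ cardinality(a, 3) */ column_value id
  from table(sys.odcinumberlist(100, 200, 300)) a
)
select ids.id,
       case when m.id is null then '**NO MATCH**' else m.value01 end value01
from ids
left outer join my_table m
  on m.id = ids.id;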
You can transform a variable into a query using CONNECT BY (tested on 11g; it should work on 10g+):
SQL> WITH DATA AS (SELECT '100,200,300' txt FROM dual)
2 SELECT regexp_substr(txt, '[^,]+', 1, LEVEL) item FROM DATA
3 CONNECT BY LEVEL <= length(txt) - length(REPLACE(txt, ',', '')) + 1;
ITEM
--------------------------------------------
100
200
300
You can then join this result to the table as if it were a standard view:
SQL> WITH DATA AS (SELECT '100,200,300' txt FROM dual)
2 SELECT v.id, dbt.value01
3 FROM dbt
4 RIGHT JOIN
5 (SELECT to_number(regexp_substr(txt, '[^,]+', 1, LEVEL)) ID
6 FROM DATA
7 CONNECT BY LEVEL <= length(txt) - length(REPLACE(txt, ',', '')) + 1) v
8 ON dbt.id = v.id;
ID VALUE01
---------- ----------
100 Ann
300 John
200
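Since the asker wanted an explicit indication for missing IDs, the value can also be wrapped in NVL (an untested variation of the query above; dbt and value01 are as in the example):

WITH data AS (SELECT '100,200,300' txt FROM dual)
SELECT v.id, NVL(dbt.value01, '**MISSING**') AS value01
FROM dbt
RIGHT JOIN
  (SELECT to_number(regexp_substr(txt, '[^,]+', 1, LEVEL)) id
   FROM data
   CONNECT BY LEVEL <= length(txt) - length(REPLACE(txt, ',', '')) + 1) v
  ON dbt.id = v.id
ORDER BY v.id;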
One way of tackling this is to dynamically create a common table expression that can then be included in the query. The final syntax you'd be aiming for is:
with list_of_values as (
select 100 val from dual union all
select 200 val from dual union all
select 300 val from dual union all
...)
select
lov.val,
...
from
list_of_values lov left outer join
other_data t on (lov.val = t.val)
It's not very elegant, particularly for large sets of values, but it is highly compatible with a database on which you have few privileges.
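For the asker's concrete data, that would look something like the following (my_table, id, and value01 are names assumed from the question); the hundreds of select ... from dual lines can be generated mechanically from the comma-separated ID list with a text editor's find-and-replace:

with list_of_values as (
  select 100 val from dual union all
  select 200 val from dual union all
  select 300 val from dual
)
select
  lov.val,
  nvl(t.value01, '**MISSING**') as value01
from
  list_of_values lov left outer join
  my_table t on (lov.val = t.id)
order by lov.val;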
Is there a simple way to delete a STRUCT from a nested and repeated field in BigQuery (BQ table column Type: RECORD, Mode: REPEATED)?
Let's say I have the following tables:
wishlist
name toy.id toy.priority
Alice 1 high
2 medium
3 low
Kazik 3 high
1 medium
toys
id name available
1 car 0
2 doll 1
3 bike 1
I'd like to DELETE from wishlist toys that are not available (toys.available==0). In this case, it's toy.id==1.
As a result, the wishlist would look like this:
name toy.id toy.priority
Alice 2 medium
3 low
Kazik 3 high
I know how to select it:
WITH `project.dataset.wishlist` AS
(
SELECT 'Alice' name, [STRUCT<id INT64, priority STRING>(1, 'high'), (2, 'medium'), (3, 'low')] toy UNION ALL
SELECT 'Kazik' name, [STRUCT<id INT64, priority STRING>(3, 'high'), (1, 'medium')]
), toys AS (
SELECT 1 id, 'car' name, 0 available UNION ALL
SELECT 2 id, 'doll' name, 1 available UNION ALL
SELECT 3 id, 'bike' name, 1 available
)
SELECT wl.name, ARRAY_AGG(STRUCT(unnested_toy.id, unnested_toy.priority)) as toy
FROM `project.dataset.wishlist` wl, UNNEST (toy) as unnested_toy
LEFT JOIN toys t ON unnested_toy.id=t.id
WHERE t.available != 0
GROUP BY name
But I don't know how to remove structs <toy.id, toy.priority> from wishlist when toys.available==0.
There are very similar questions, like How to delete/update nested data in bigquery or How to Delete rows from Structure in bigquery, but the answers are either unclear to me in terms of deletion or suggest copying the whole wishlist to a new table using the selection statement. My wishlist is huge and toys.availability changes often, so copying it seems very inefficient.
Could you please suggest a solution aligned with BQ best practices?
Thank you!
... since row deletion was implemented in BQ, I thought that STRUCT deletion inside a row would also be possible.
You can use UPDATE DML for this (not DELETE, which deletes whole rows; UPDATE can be used to modify the rows in place):
update `project.dataset.wishlist` wl
set toy = ((
select array_agg(struct(unnested_toy.id, unnested_toy.priority))
from unnest(toy) as unnested_toy
left join `project.dataset.toys` t on unnested_toy.id=t.id
where t.available != 0
))
where true;
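One caveat worth noting (my observation, not part of the original answer): because of the LEFT JOIN, the t.available != 0 filter also drops any toy that has no matching row in toys at all, since the join yields NULL there. If such unknown toys should be kept, a hedged variant:

update `project.dataset.wishlist` wl
set toy = ((
  select array_agg(struct(unnested_toy.id, unnested_toy.priority))
  from unnest(toy) as unnested_toy
  left join `project.dataset.toys` t on unnested_toy.id = t.id
  -- keep available toys, and also toys with no row in `toys` at all
  where t.available != 0 or t.id is null
))
where true;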
You can UNNEST() and reaggregate:
SELECT wl.name,
(SELECT ARRAY_AGG(t)
FROM UNNEST(wl.toy) t JOIN
toys
ON toys.id = t.id
WHERE toys.available <> 0
) as available_toys
FROM `project.dataset.wishlist` wl;
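Since the asker was worried about copying a huge table while toys.availability changes often, this SELECT could also be wrapped in a view, so readers always see the current availability without the base table ever being rewritten (a sketch; the view name is made up):

CREATE OR REPLACE VIEW `project.dataset.wishlist_available` AS
SELECT wl.name,
       (SELECT ARRAY_AGG(t)
        FROM UNNEST(wl.toy) t
        JOIN `project.dataset.toys` toys
          ON toys.id = t.id
        WHERE toys.available <> 0
       ) AS toy
FROM `project.dataset.wishlist` wl;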
This is the sample data in the column. I want to dynamically extract only the entries whose value is 5.
'{"2113":5,"2112":5,"2114":4,"2511":5}'
The final result should be 3 rows of keys and values.
I tried the JSON extract functions, but that did not help. Thanks.
The final result I want:
key  | value
2113 | 5
2112 | 5
2511 | 5
So, what you need to do is unnest the JSON object (have a key-value pair per row). Unnesting in Redshift is tricky: one needs a sequence table, and then a CROSS JOIN with a proper filter condition. Usually unnesting is done on an array, and then it's easier, since indices are easy to generate. To unnest a key-value map (a JSON object) one needs to know all the keys in advance (Redshift cannot enumerate them). Your example is lucky, since the keys are integers and their cardinality is relatively low.
This is a sketched-out solution. Please note that you will have to change the way the sequence table is created:
WITH input(json) AS (
SELECT '{"2113":5,"2112":5,"2114":4,"2511":5}'::varchar
)
, sequence(idx) AS (
-- instead of the below you should use sequence table
SELECT 2113
UNION ALL
SELECT 2112
UNION ALL
SELECT 2114
UNION ALL
SELECT 2511
UNION ALL
SELECT 2512
UNION ALL
SELECT 2513
UNION ALL
SELECT 2514
)
, unnested(key, val) AS (
SELECT idx::varchar as key,
json_extract_path_text(json, key) as val
FROM input
CROSS JOIN sequence
WHERE val IS NOT NULL
)
SELECT *
FROM unnested
WHERE val = 5
key | val
2113 | 5
2112 | 5
2511 | 5
How to generate a large sequence in Redshift:
...
sequence(idx) AS (
SELECT row_number() OVER ()
FROM arbitrary_table_having_enough_rows
limit 10000
)
...
Another option is to have a specialized sequence table; there's an idea on how to build one here: http://www.silota.com/docs/recipes/redshift-sequential-generate-series-numbers-time.html
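Putting that together, a one-time setup might look like this (a sketch; some_big_table stands for any table with enough rows, and the column is named i to match the split_part answer below):

create table numbers as
select row_number() over () as i
from some_big_table  -- assumption: any table with at least 10000 rows
limit 10000;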
I achieved the result using multiple splits:
SELECT DISTINCT
  split_part(split_part(replace(replace(replace(json_field,'{',''),'}',''),'"',''),',',i),':',1) as key,
  split_part(split_part(replace(replace(replace(json_field,'{',''),'}',''),'"',''),',',i),':',2) as value
FROM table
JOIN schema.seq_1_to_100 as numbers
  ON i <= regexp_count(json_field,':')
In the image above, which represents an SQL table, I would like to search for 1111 and retrieve its last replacement number, which should be 4444; here 1111 is just a single number replaced by a single number, 4444. Then I would like to search for 5555, which should return (6666, 9999, 8888). NB: 9999 had replaced 7777.
So 1111 was a single part number replaced multiple times, and 5555 was a group number with a multiple-part breakdown containing one replaced number within it (7777 >> 9999).
What would be the fastest and most efficient method? If possible, a solution in SQL for efficiency; if it can't be done within SQL, then from within PHP.
What I have tried:
1) A while loop, but it needs to access the database 1000 times for 1000 replaced numbers. ## too inefficient
2)
SELECT C.RPLPART
FROM TABLE A
LEFT JOIN TABLE B ON A.RPLPART = B.PART#
LEFT JOIN TABLE C ON B.RPLPART = C.PART#
WHERE A.PART# = '1111'  ## unable to know when the last number is reached
A recursive Common Table Expression (CTE) would seem to be the ticket.
Something like so...
with rcte (lvl, topPart, part#, rplpart) as
(select 1 as lvl, part#, part#, rplpart
from MYTABLE
union all
select p.lvl + 1, p.topPart, c.part#, c.rplpart
from rcte p, MYTABLE c
where p.rplpart = c.part#
)
select topPart, rplpart
from rcte
where toppart = 1111
order by lvl desc
fetch first row only;
You can do this using a recursive CTE that generates the complete replacement chain for a given starting part id, and then limit the result to just those parts that don't appear in the part# column:
WITH cte(part) AS
(SELECT replpart FROM parts WHERE part# = 1111
UNION ALL
SELECT parts.replpart FROM parts, cte WHERE parts.part# = cte.part)
SELECT DISTINCT part
FROM cte
WHERE part NOT IN (SELECT part# FROM parts);
Fiddle example
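Since the question's table was only shown as an image, here is assumed sample data consistent with the description (group 5555 broken into 6666, 7777, and 8888, with 7777 later replaced by 9999), and the same query run for 5555:

-- assumed contents of parts (a guess; the real table was an image):
-- part# | replpart
-- 5555  | 6666
-- 5555  | 7777
-- 5555  | 8888
-- 7777  | 9999

WITH cte(part) AS
  (SELECT replpart FROM parts WHERE part# = 5555
   UNION ALL
   SELECT parts.replpart FROM parts, cte WHERE parts.part# = cte.part)
SELECT DISTINCT part
FROM cte
WHERE part NOT IN (SELECT part# FROM parts);
-- expected: 6666, 8888, 9999 (7777 is excluded because it was itself replaced)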
In Oracle 12c, I have a view which takes a little time to run. When I add the WHERE clause, it returns exactly one row of interest. The row has columns/values like this...
I need this flipped so that I can see one row per EACH "set". I need the SQL to return something like
I know I can do a UNION ALL for each of the entry sets, but the view takes a little while to run, and there are about 30 different sets (I only showed 3: Car, Boat, and Truck).
Is there a better way of doing this? I have looked at PIVOT/UNPIVOT, but I didn't see how to make this work.
I think you are looking for UNPIVOT:
WITH TEMP_DATA (ID1, CarPrice, CarTax, BoatPrice, BoatTax, TruckPrice, TruckTax)
AS (
select 'AAA', 1, 2, 3, 4, 5, 6 from dual )
select TYPE, PRICE, TAX
from temp_data
unpivot
(
(PRICE, TAX)
for TYPE IN
(
(CarPrice, CarTax) as 'CAR',
(BoatPrice, BoatTax) as 'BOAT',
(TruckPrice, TruckTax) as 'TRUCK'
)
)
;
OUTPUT:
TYPE PRICE TAX
----- ---------- ----------
CAR 1 2
BOAT 3 4
TRUCK 5 6
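Applied to the asker's actual view, the shape stays the same, and the slow view is scanned only once instead of ~30 times as with the UNION ALL approach (a sketch; my_slow_view, ID1, and the bind variable are placeholders):

select type, price, tax
from my_slow_view
unpivot
(
  (price, tax)
  for type in
  (
    (CarPrice, CarTax)     as 'CAR',
    (BoatPrice, BoatTax)   as 'BOAT',
    (TruckPrice, TruckTax) as 'TRUCK'
    -- ...and so on for the remaining ~27 sets
  )
)
where id1 = :the_one_row_key;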
I've got a table with 20 columns which I'd like to categorize like this:
0-25 --> 1
25-50 --> 2
50-75 --> 3
75-100 --> 4
I'd prefer not to use 20 CASE ... WHEN statements. Does anyone know how to do this more dynamically and efficiently? It can be SQL or PL/SQL.
I tried some PL/SQL, but I didn't see a simple method to use the column names as variables.
Many thanks.
Frans
Your example is a bit confusing, but assuming you want to put a certain value into those categories, the function width_bucket might be what you are after:
Something like this:
with sample_data as (
select trunc(dbms_random.value(1,100)) as val
from dual
connect by level < 10
)
select val, width_bucket(val, 0, 100, 4) as category
from sample_data;
This will assign the numbers 1-4 to the (random) values from sample_data. The 0, 100 defines the range from which to build the buckets, and the final parameter 4 says how many (equally wide) buckets that range should be split into. The result of the function is the bucket into which the value val falls.
SQLFiddle example: http://sqlfiddle.com/#!4/d41d8/10721
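A quick check of the boundary behaviour (values picked for illustration):

select width_bucket(10, 0, 100, 4)  as b1,  -- 10  falls in [0,25)   -> bucket 1
       width_bucket(30, 0, 100, 4)  as b2,  -- 30  falls in [25,50)  -> bucket 2
       width_bucket(99, 0, 100, 4)  as b3,  -- 99  falls in [75,100) -> bucket 4
       width_bucket(100, 0, 100, 4) as b4   -- values >= the upper bound land in overflow bucket 5
from dual;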
The case statement is probably the most efficient way of doing it. A more dynamic way would be to create a table using the with statement. Here is an example of the code:
with ref as (
  select 0 as lower, 25 as higher, 1 as val from dual union all
  select 25, 50, 2 from dual union all
  select 50, 75, 3 from dual union all
  select 75, 100, 4 from dual
)
select ref.val
from t left outer join ref
  on t.col >= ref.lower and t.col < ref.higher
That said, this particular lookup could be done with arithmetic:
select trunc((t.col - 1) / 25) + 1 as val
from t
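A quick sanity check of that formula (note it puts the boundary value 25 in bucket 1, i.e. the ranges behave as 1-25, 26-50, and so on):

select col, trunc((col - 1) / 25) + 1 as val
from (select 1 col from dual union all
      select 25 from dual union all
      select 26 from dual union all
      select 100 from dual);
-- 1 -> 1, 25 -> 1, 26 -> 2, 100 -> 4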
And, if your problem is managing the different columns, you might consider unpivot. However, I think it is probably easier just to write the code and modify the column names in a text editor or Excel.