Error with SELECT statement in a FOR loop - sql

When I try to compile the procedure below, I get this error:
Error(9,4): PLS-00103: Encountered the symbol "SELECT" when expecting
one of the following: ( ) - + case mod new not null table continue avg count current exists max min prior sql
stddev sum variance execute multiset the both leading trailing
forall merge year month day hour minute second timezone_hour
timezone_minute timezone_region timezone_abbr time timestamp
interval date
CREATE OR REPLACE PROCEDURE PROC_LIST_SIMILAR_TVSERIES
(seriesName IN SERIES.NAME%TYPE)
AS
CURSOR series IS (SELECT IDS FROM SERIES WHERE NAME = seriesName);
allSeries SERIES%ROWTYPE;
BEGIN
FOR series IN allSeries
(SELECT 2* ( SELECT COUNT(*)
FROM DICT d
WHERE d.idt IN ( SELECT DISTINCT IDT
FROM POSTING
WHERE IDS = series
INTERSECT
SELECT DISTINCT IDT
FROM POSTING
WHERE IDS = allSeries.IDS
)
)
/ ( ( SELECT DISTINCT COUNT(IDT)
FROM POSTING
WHERE IDS = series
) +
( SELECT DISTINCT COUNT(IDT)
FROM POSTING
WHERE IDS = allSeries.IDS )
)
INTO similarity
FROM SERIES s1
SERIES s2
WHERE s1.IDS = series
AND s2.IDS != series
);
IF similarity > 0.7 THEN
DBMS_OUTPUT.PUT_LINE('ok');
END LOOP;
END;
/
What the code does is take in a name, find its ID, and compare it to other IDs (while avoiding a comparison with the same ID). I'm trying to print out "ok" whenever the similarity calculation is over 0.7. I have no idea why this doesn't work.

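First, if you have a cursor named the same as the table, what is series%ROWTYPE going to refer to: the cursor or the table? Bad idea.
Second, you never open and fetch from the cursor to get the ID, so your subsequent cursor loop is looking for records that match allSeries.IDS, which is null because you haven't populated it.
Third, the PLS-00103 error itself comes from the loop header: a cursor FOR loop takes a cursor name or a parenthesized query followed by LOOP, so FOR series IN allSeries (SELECT ... does not parse. The SELECT ... INTO inside that query, the missing LOOP keyword, and the missing END IF are syntax errors as well.
Try this as a starting point, although I'm guessing that you will still have work to do on your cursor query. At least it points you to the right code structures: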
CREATE OR REPLACE PROCEDURE PROC_LIST_SIMILAR_TVSERIES
(seriesName IN SERIES.NAME%TYPE)
AS
CURSOR seriesCur IS (SELECT IDS FROM SERIES WHERE NAME = seriesName);
allSeries seriesCur%ROWTYPE;
BEGIN
OPEN seriesCur;
FETCH seriesCur INTO allSeries;
IF seriesCur%NOTFOUND
THEN
CLOSE seriesCur;
raise_application_error(-20001,'Your SeriesName does not exist');
END IF;
CLOSE seriesCur;
FOR seriesRec IN
-- this query is a mess! Tried to fix up some aspects of it according to what I THINK you're trying to do:
-- compare the postings of the named series (allSeries.IDS) with those of every other series (s2.IDS).
(SELECT 2*
(SELECT COUNT(*) FROM DICT d WHERE d.idt IN (
SELECT DISTINCT IDT FROM POSTING WHERE IDS = allSeries.IDS
INTERSECT
SELECT DISTINCT IDT FROM POSTING WHERE IDS = s2.IDS))
/ ((SELECT COUNT(DISTINCT IDT) FROM POSTING WHERE IDS = allSeries.IDS) +
(SELECT COUNT(DISTINCT IDT) FROM POSTING WHERE IDS = s2.IDS) ) similarity
FROM SERIES s1, SERIES s2
WHERE s1.IDS = allSeries.IDS
AND s2.IDS != allSeries.IDS)
LOOP
IF seriesRec.similarity > 0.7 THEN
DBMS_OUTPUT.PUT_LINE('ok');
END IF;
END LOOP;
END;
/
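For what it's worth, the calculation the query seems to be aiming for is 2 * |A ∩ B| / (|A| + |B|) over the distinct terms of two series. Here is a minimal Python sketch of that logic, not of the PL/SQL itself. The `postings` dictionary and both function names are made up for illustration and only stand in for the POSTING table (IDS mapped to IDT values).

```python
def dice_similarity(terms_a, terms_b):
    """Return 2*|A intersect B| / (|A| + |B|) over distinct terms, or None if both are empty."""
    a, b = set(terms_a), set(terms_b)
    total = len(a) + len(b)
    if total == 0:
        return None  # avoid dividing by zero when neither series has postings
    return 2 * len(a & b) / total


def similar_series(target_id, postings, threshold=0.7):
    """Return the ids of all other series whose similarity to target_id exceeds threshold."""
    result = []
    for other_id, terms in postings.items():
        if other_id == target_id:
            continue  # never compare a series with itself
        score = dice_similarity(postings[target_id], terms)
        if score is not None and score > threshold:
            result.append(other_id)
    return result


# Hypothetical data: series id -> term ids (stand-in for POSTING.IDS / POSTING.IDT).
postings = {
    1: [10, 11, 12, 13],
    2: [10, 11, 12, 14],
    3: [20, 21],
}

print(similar_series(1, postings))
```

The point of the sketch is that one side of the comparison must be the named series and the other side must vary over the remaining series. If both sides use the same ID, the ratio is always 1.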

I am still trying to understand the logic in the SQL statement, but I have tried to remove the syntax errors. Hope it helps.
CREATE OR REPLACE PROCEDURE PROC_LIST_SIMILAR_TVSERIES(
seriesName IN SERIES.NAME%TYPE)
AS
similarity NUMBER; -- NUMBER rather than PLS_INTEGER, because the ratio is fractional
BEGIN
FOR i IN
(SELECT IDS FROM SERIES WHERE NAME = seriesName
)
LOOP
-- I am still not sure about the intended logic; this compares the named
-- series with every other series, one at a time, so that each
-- SELECT ... INTO returns exactly one row.
FOR o IN
(SELECT IDS FROM SERIES WHERE IDS != i.IDS
)
LOOP
SELECT 2 *
(SELECT COUNT(*)
FROM DICT d
WHERE d.idt IN
( SELECT DISTINCT IDT FROM POSTING WHERE IDS = i.IDS
INTERSECT
SELECT DISTINCT IDT FROM POSTING WHERE IDS = o.IDS
)
) / NULLIF(
(SELECT COUNT(DISTINCT IDT) FROM POSTING WHERE IDS = i.IDS
) +
(SELECT COUNT(DISTINCT IDT) FROM POSTING WHERE IDS = o.IDS
), 0) -- NULLIF avoids ORA-01476 when neither series has postings
INTO similarity
FROM DUAL;
IF similarity > 0.7 THEN
DBMS_OUTPUT.PUT_LINE('ok');
END IF;
END LOOP;
END LOOP;
END;
/

Related

Perform loop and calculation on BigQuery Array type

In my original data, B is an array of INT64:
I want to calculate the difference B[n+1] - B[n] for each pair of consecutive elements, resulting in a new table as follows:
I figured out I can somehow achieve this by using LOOP and IF condition:
DECLARE x INT64 DEFAULT 0;
LOOP
SET x = x + 1
IF(x < array_length(table.B))
THEN INSERT INTO newTable (SELECT A, B[OFFSET(x+1)] - B[OFFSET(x)]) from table
END IF;
END LOOP;
The problem is that the idea above doesn't work on each row of my data, because I still need to loop through every row in my table, and I can't find a way to integrate the scripting part into a normal query such as
SELECT A, [calculation script] from table
Can someone point out how I can do this, or suggest a better way to solve the problem?
Thank you.
The query below works in BigQuery:
select * replace(
array(select diff from (
select offset, lead(el) over(order by offset) - el as diff
from unnest(B) el with offset
) where not diff is null
order by offset
) as B
)
from `project.dataset.table` t
When applied to the sample data in your question, the output is
You can use unnest() with offset for this purpose:
select id, a,
array_agg(b_el - prev_b_el order by n) as b_diffs
from (select t.*, b_el, lag(b_el) over (partition by t.id order by n) as prev_b_el
from t cross join
unnest(b) b_el with offset n
) t
where prev_b_el is not null
group by t.id, t.a
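If it helps to see what both queries compute per row, here is a small Python sketch of the same idea: pair each element with its successor and subtract. The sample rows and the function name are invented for illustration, since the original sample data is not shown here.

```python
def consecutive_diffs(values):
    """Return [values[1] - values[0], values[2] - values[1], ...]."""
    # zip stops at the shorter input, so the last element has no partner,
    # which mirrors dropping the NULL produced by lead()/lag() in the SQL.
    return [nxt - cur for cur, nxt in zip(values, values[1:])]


# Hypothetical rows standing in for the table with columns A and B.
rows = [
    {"A": "x", "B": [1, 4, 9, 16]},
    {"A": "y", "B": [5]},
]

result = [{"A": row["A"], "B": consecutive_diffs(row["B"])} for row in rows]
print(result)
```

An array with n elements yields n - 1 differences, so a single-element array comes back empty.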

Update Oracle table based on nested select incl. package function

I have a query which calls a function in a package to create a percentage value as a sum of about 30 columns.
What I'd like to do is update each row based on the "sum of counted columns" as a percentage.
The select query is:
SELECT
checklist_id,
row_status,
eba_cm_checklist_std.get_row_percent_complete(pc.id,pc.checklist_id,pc.max_col_num) AS percent_complete
FROM
(
SELECT
(
SELECT
COUNT(id)
FROM
eba_cm_checklist_columns
WHERE
checklist_id = r.checklist_id
) AS max_col_num,
r.*
FROM
eba_cm_checklist_rows r
ORDER BY
r.row_order,
r.name
) pc
and the package.function that creates the percentage is "eba_cm_checklist_std.get_row_percent_complete".
The query outputs the following:
checklist_id                            row_status  percent_complete
97176759931088640236098007249022291412  Red         0
97176759931071715274623402440576404948  Red         0
97176759931071715274623402440576404948  Red         0
97176759931071715274623402440576404948  Red         0
97176759931088640236098007249022291412  Red         0
97176759931088640236098007249022291412  Red         0
97176759931081386681180319473974054356  Grey        100
97176759931051163535689953744606399956  Grey        100
The difficulty I'm having is that the expression is using a nested select statement based on the output of a function and I can't get my head around how to update the physical "eba_cm_checklist_rows" table based on the output of the query.
Basically, I want to do the following:
update set row_status = 'Green' where percent_complete = 100
Another option is to use bulk select/update:
declare
type t_rid_arr is table of rowid index by pls_integer;
type t_row_status_arr is table of eba_cm_checklist_rows.row_status%type index by pls_integer;
type t_percent_complete_arr is table of number index by pls_integer;
l_rid_arr t_rid_arr;
l_row_status_arr t_row_status_arr;
l_percent_complete_arr t_percent_complete_arr;
begin
SELECT
rid,
row_status,
eba_cm_checklist_std.get_row_percent_complete(pc.id,pc.checklist_id,pc.max_col_num) AS percent_complete
bulk collect into
l_rid_arr,
l_row_status_arr,
l_percent_complete_arr
FROM (
SELECT r.rowid as rid,
r.*,
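-- number of columns defined for the checklist, as in the original query
(SELECT COUNT(id) FROM eba_cm_checklist_columns WHERE checklist_id = r.checklist_id) AS max_col_num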
FROM eba_cm_checklist_rows r
) pc;
forall i in 1..l_rid_arr.count
update eba_cm_checklist_rows
set row_status = case when l_percent_complete_arr(i) = 100 then 'Green' else row_status end
where rowid = l_rid_arr(i);
end;
/
You could also try a CTE with the SUM() OVER() function; I hope you find it useful.
Try something like this:
merge into eba_cm_checklist_rows trg
using (
SELECT
rid,
checklist_id,
row_status,
eba_cm_checklist_std.get_row_percent_complete(pc.id,pc.checklist_id,pc.max_col_num) AS percent_complete
FROM
(
SELECT r.rowid as rid,
r.*,
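-- number of columns defined for the checklist, as in the original query
(SELECT COUNT(id) FROM eba_cm_checklist_columns WHERE checklist_id = r.checklist_id) AS max_col_num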
FROM eba_cm_checklist_rows r
) pc
) src
on (trg.rowid = src.rid)
when matched then update set row_status = 'Green' where percent_complete = 100;
I don't like the idea of calling a function inside the SQL; it could probably be replaced with plain SQL.

What should I use: dynamic SQL, static SQL or other solution in Oracle database

I'm a senior JEE developer, but I'm really new to PL/SQL development and I need help from senior Oracle developers.
My problem is:
I need to update 3 million rows in a database.
I wrote a procedure that worked well in the lab database.
But in the production database it is really slow, and I really don't want to lock the database.
My first attempt was this procedure:
CREATE OR REPLACE
PROCEDURE "PR_UPDATE_R" AS
BEGIN
DBMS_OUTPUT.PUT_LINE ('1.Find measures...') ;
FOR obj IN (
SELECT SC.ID ID
FROM SIMET.TB_SC SC
JOIN SIMET.TB_D D ON D.HASH = SC.HASH
JOIN (
SELECT
T1.ID_D_FK,
MIN (T1.UPDATE_DATE) LIMIT_DATE
FROM
TB_SBVH T1
WHERE
T1. VERSION > 10000
GROUP BY
T1.ID_D_FK
) SBVH ON SBVH.ID_D_FK = D .ID_D_PK
WHERE
(
SBVH.LIMIT_DATE IS NULL OR SC.TIMESTAMP_CREATION < SBVH.LIMIT_DATE
)
AND
sc.TIMESTAMP_CREATION >= TO_TIMESTAMP ('2012-08-27 00:00:00.000000','yyyy-mm-dd hh24:mi:ss.ff')
AND
sc.MEASURE_TIMESTAMP_CREATION <= TO_TIMESTAMP ('2015-07-06 00:00:00.00000','yyyy-mm-dd hh24:mi:ss.ff')
)LOOP
UPDATE TB_R SET total = total + other_field WHERE id_m_fk = obj.ID ;
UPDATE tb_sc
SET TOTAL_PCT = (
SELECT
CASE WHEN SUM (r.total_sent) > 0 THEN (
100 * SUM (r.total_lost) / SUM (r.total_sent)
)
ELSE
0
END
FROM
tb_r r
WHERE
ID = obj.ID
)
WHERE
ID = obj.ID;
END LOOP;
DBMS_OUTPUT.PUT_LINE ('2.Finished!!!') ;
END ;
But this was not a good idea, because the main query of this procedure (the one in the FOR loop) took 29 minutes on its own, locking the database.
Then,
my second idea was to split the main query into smaller ones, as a kind of batch, and run them in small loops.
Run for a single day at a time, this new query takes 3 seconds.
The idea is to put this fast daily query into a loop and process 30 days per run:
SELECT
*
FROM
(
SELECT
SC.ID_M_FK,
SC.TIMESTAMP_CREATION,
D.ID_D_PK
FROM
TB_SC SC
JOIN TB_D D ON D.HASH = SC.HASH
WHERE
SC.TIMESTAMP_CREATION >= TIMESTAMP '2013-02-06 00:00:00.000'
AND SC.TIMESTAMP_CREATION < TIMESTAMP '2013-02-07 00:00:00.000'
AND SC.IS_ACTIVE = 1
) CD -- CACHE_DATA
JOIN (
SELECT
T1.ID_D_FK,
MIN (T1.UPDATE_DATE) LIMIT_DATE
FROM
TB_SBVH T1
WHERE
T1. VERSION > 10000
GROUP BY
T1.ID_D_FK
) SBVH ON SBVH.ID_D_FK = CD.ID_D_PK
WHERE
(
SBVH.LIMIT_DATE IS NULL
OR
CD.TIMESTAMP_CREATION < SBVH.LIMIT_DATE
)
;
But in the question "I can use a timestamp variable on query of cursor in a store procedure?"
many people told me that this is not a good solution.
Could someone help me, please?
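To make the batching idea concrete, here is a minimal Python sketch of how the full date range could be cut into 30-day windows, each of which would drive one run of the fast query and its updates. The function name is made up, the start and end dates are taken from the first procedure, and using half-open windows is an assumption chosen so that no timestamp is processed twice.

```python
from datetime import date, timedelta


def date_windows(start, end, days=30):
    """Yield half-open (lo, hi) windows that together cover [start, end)."""
    step = timedelta(days=days)
    lo = start
    while lo < end:
        hi = min(lo + step, end)  # the last window may be shorter
        yield lo, hi
        lo = hi


# Date range taken from the first procedure above.
windows = list(date_windows(date(2012, 8, 27), date(2015, 7, 6)))

for lo, hi in windows[:3]:
    # Each pair would be bound as TIMESTAMP_CREATION >= lo AND TIMESTAMP_CREATION < hi.
    print(lo, hi)
```

Because each window ends exactly where the next one begins, the filter `>= lo AND < hi` touches every row once, and each run stays short.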

postgresql if else select query

I am trying to write a query that first checks a condition:
it compares the current year with
the maximum year in the view view_delinquency_allquarter.
If the condition holds, it should execute the first query;
otherwise, the second query.
BEGIN
IF
select max(ctaxyear) as ctaxyear,
(select cast ( (SELECT EXTRACT(QUARTER FROM TIMESTAMP 'now()')) as int ) as yearnow) as yearnow
from view_delinquency_allquarter
where ctaxyear > year_next
THEN
select * from view_delinquency_allquarter;
ELSE
select * from view_delinquency;
END IF;
END
There are plenty of answers as well as documentation on this: declare a variable, assign it with var := (your query result), and then test that variable in the IF.

plpgsql Error: RETURN cannot have a parameter in function returning void

I am trying to extract the count of records corresponding to a specific date and user_ids which do not have corresponding user_ids for the next later date in the database. This is the way I am trying to accomplish it (using plpgsql but not defining a function):
DO
$BODY$
DECLARE
a date[]:= array(select distinct start_of_period from monthly_rankings where balance_type=2);
res int[] = '{}';
BEGIN
FOR i IN array_lower(a,1) .. array_upper(a,1)-1
LOOP
res:=array_append(res,'SELECT COUNT(user_id) from (select user_id from monthly_rankings where start_of_period=a[i] except select user_id from monthly_rankings where start_of_period=a[i+1]) as b');
i:=i+1;
END LOOP;
RETURN res;
$BODY$ language plpgsql
I get an Error: could not Retrieve the result : ERROR: RETURN cannot have a parameter in function returning void
LINE 11: RETURN res;
I am new to this procedural language and cannot spot why the function is returning void. I do assign the values to variables , and I declared empty - not NULL - arrays. Is there a syntax or a more significant reasoning mistake?
1.) You cannot RETURN a value from a DO statement; a DO block always returns void. You would have to CREATE FUNCTION instead.
2.) You don't need any of this. Use this query, which should be much faster than looping:
WITH x AS (
SELECT DISTINCT start_of_period
,dense_rank() OVER (ORDER BY start_of_period) AS rn -- dense_rank(): rank() is computed before DISTINCT and would leave gaps
FROM monthly_rankings
WHERE balance_type = 2
)
SELECT x.start_of_period, count(*) AS user_ct
FROM x
JOIN monthly_rankings m USING (start_of_period)
WHERE NOT EXISTS (
SELECT 1
FROM x x1
JOIN monthly_rankings m1 USING (start_of_period)
WHERE x1.rn = x.rn + 1
-- AND m1.balance_type = 2 -- only with matching criteria?
AND m1.user_id = m.user_id
)
-- AND balance_type = 2 -- all user_id from these dates?
GROUP BY x.start_of_period
ORDER BY x.start_of_period
This includes the last qualifying start_of_period, you may want to exclude it like in your plpgsql code.
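As a cross-check of the logic, here is a small Python sketch of what the loop in the question was trying to compute: for each period, how many users are absent from the next one. The function name and sample rows are made up. Unlike the query above, it counts distinct users and leaves out the last period, as the plpgsql loop does.

```python
from collections import defaultdict


def dropped_user_counts(rows):
    """rows: iterable of (start_of_period, user_id) pairs.

    Returns {period: number of users in that period missing from the next period}.
    """
    users_by_period = defaultdict(set)
    for period, user_id in rows:
        users_by_period[period].add(user_id)
    periods = sorted(users_by_period)
    # Pair each period with its successor; the last period has none and is skipped.
    return {
        cur: len(users_by_period[cur] - users_by_period[nxt])
        for cur, nxt in zip(periods, periods[1:])
    }


# Hypothetical rows standing in for monthly_rankings (ISO date strings sort correctly).
rows = [
    ("2013-01-01", 1), ("2013-01-01", 2),
    ("2013-02-01", 2), ("2013-02-01", 3),
    ("2013-03-01", 3),
]

print(dropped_user_counts(rows))
```

The set difference here plays the role of EXCEPT in the original loop and of NOT EXISTS in the query.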