Update table using with temp CTE - sql

%sql
with temp1 as
(
select req_id from table1 order by timestamp desc limit 8000000
)
update table1 set label = '1' where req_id in temp1 and req_query like '%\<\/script\>%'
update table1 set label = '1' where req_id in temp1 and req_query like '%aaaaa%'
update table1 set label = '1' where req_id in temp1 and req_query like '%bbbb%'
I'm getting this error:
com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: org.apache.spark.sql.catalyst.parser.ParseException:
mismatched input 'in' expecting {, ';'}(line 6, pos 93)
Can someone advise? Is there a less costly way than asking the database the same question each time?
select req_id from table1 order by timestamp desc limit 8000000

This is not allowed: where req_id in temp1. temp1 is not a list, but the result of a query and should be used like another table.
You might rather write something like this:
update table1
set label = '1'
where req_id in (select req_id from table1 order by timestamp desc limit 8000000)
and req_query like '%\<\/script\>%'
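As a rough sketch of the subquery form above, the following runs the same kind of UPDATE against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in here (it is not Databricks or Delta), and the column name ts, the helper label_recent, and the sample rows are all hypothetical. The three LIKE patterns are OR-ed together so the "newest N rows" subquery is evaluated by one statement instead of three.

```python
import sqlite3

def label_recent(conn, n_recent, patterns):
    """Set label = '1' on the n_recent newest rows whose req_query matches any pattern.

    Assumes `patterns` is non-empty; `ts` is a hypothetical timestamp column.
    """
    # One LIKE predicate per pattern, OR-ed so a single UPDATE covers them all.
    like_clause = " OR ".join("req_query LIKE ?" for _ in patterns)
    sql = f"""
        UPDATE table1
        SET label = '1'
        WHERE req_id IN (SELECT req_id FROM table1
                         ORDER BY ts DESC LIMIT ?)
          AND ({like_clause})
    """
    cur = conn.execute(sql, [n_recent, *patterns])
    conn.commit()
    return cur.rowcount

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (req_id INTEGER PRIMARY KEY, ts INTEGER, "
    "req_query TEXT, label TEXT)"
)
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, '0')", [
    (1, 10, "q=<script>x</script>"),  # oldest row, outside the newest 3
    (2, 20, "q=aaaaa"),
    (3, 30, "q=harmless"),
    (4, 40, "q=<script>y</script>"),
])

updated = label_recent(conn, 3, ["%</script>%", "%aaaaa%", "%bbbb%"])
labels = dict(conn.execute("SELECT req_id, label FROM table1"))
```

Row 1 matches a pattern but is not among the newest three rows, so it keeps its old label. Whether a single statement is actually cheaper on a given engine is something to measure rather than assume.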

You can use an update with a join instead of an IN clause.
Delta does not support updating tables through an inner join, but I think you can use MERGE. Something like this:
WITH temp1 AS (
SELECT req_id FROM table1 ORDER BY timestamp DESC LIMIT 8000000
)
MERGE INTO table1 a
USING temp1 b
ON (a.req_id = b.req_id)
WHEN MATCHED THEN
UPDATE SET a.label = CASE WHEN a.req_query LIKE '%\<\/script\>%' THEN '1'
WHEN a.req_query LIKE '%aaaaa%' THEN '2'
WHEN a.req_query LIKE '%bbbb%' THEN '3'
ELSE a.label
END

Related

DB2 - SQL UPDATE statement using JOINS and SELECT statement

Morning,
I'm running the following SELECT statement on a DB2 server (IBM Power System) and it returns the latest record from tableB based on a Timestamp (all good).
SELECT * FROM library1/tableA
JOIN library1/tableB on tableB.PRDCOD = tableA.NPROD
WHERE tableB.PRDCOD = '5520' and tableA.SPRTXT01 <> '0/9'
ORDER BY tableB.timstp DESC FETCH NEXT 1 ROWS ONLY
I now need to change this statement to update tableA and set field SPRTXT01 = '0/9', but only if tableB.SRVRSP= 'SUCCESSFUL' i.e. the latest record from tableB has a response of 'SUCCESSFUL'.
But I don't know how to format this statement correctly. Can anyone assist please?
I've tried the below, but this updated ALL rows in the table
UPDATE library1/tableA
SET tableA.SPRTXT01 = '0/9'
Where exists (
Select '1'
FROM library1/tableA
JOIN library1/tableB on
tableB.PRDCOD = tableA.NPROD
WHERE tableB.PRDCOD = '5520' and tableB.SRVRSP = 'SUCCESSFUL'
and tableA.SPRTXT01 <> '0/1'
ORDER BY tableB.timstp DESC FETCH NEXT 1 ROWS ONLY)
and I don't think it's applying the selection correctly, i.e. rather than selecting the latest record from tableB and then applying the SRVRSP = 'SUCCESSFUL' check, it is only selecting the latest record from tableB where SRVRSP = 'SUCCESSFUL'.
Thanks
Try this:
CREATE TABLE tableA (NPROD VARCHAR (10), SPRTXT01 VARCHAR (3));
CREATE TABLE tableB (PRDCOD VARCHAR (10), SRVRSP VARCHAR (20), timstp TIMESTAMP);
INSERT INTO tableA (NPROD, SPRTXT01)
VALUES ('5520', ''), ('XXXX', '');
INSERT INTO tableB (PRDCOD, SRVRSP, timstp)
VALUES
('5520', 'SUCCESSFUL', CURRENT TIMESTAMP)
, ('5520', 'UNSUCCESSFUL', CURRENT TIMESTAMP + 1 SECOND)
-- Comment out the next row to make it NOT update the SPRTXT01 column
, ('5520', 'SUCCESSFUL', CURRENT TIMESTAMP + 2 SECOND)
;
UPDATE tableA
SET tableA.SPRTXT01 = '0/9'
Where tableA.NPROD = '5520' AND tableA.SPRTXT01 <> '0/9'
AND exists
(
SELECT 1
FROM tableB
JOIN (SELECT PRDCOD, MAX (timstp) AS timstp FROM tableB GROUP BY PRDCOD) G
ON (G.PRDCOD, G.timstp) = (tableB.PRDCOD, tableB.timstp)
WHERE tableB.PRDCOD = tableA.NPROD
AND tableB.SRVRSP = 'SUCCESSFUL'
);
SELECT * FROM tableA;
NPROD | SPRTXT01
5520  | 0/9
XXXX  |
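The idea in the answer above (find the newest tableB row first, then test its response) can be tried out with a small sketch. It uses SQLite through Python's sqlite3 module instead of DB2, stores timstp as a plain integer, and replaces the row-value join with a correlated MAX subquery. Those substitutions, and the helper run_case, are assumptions made for illustration only.

```python
import sqlite3

UPDATE_SQL = """
UPDATE tableA
SET SPRTXT01 = '0/9'
WHERE NPROD = '5520' AND SPRTXT01 <> '0/9'
  AND EXISTS (
      SELECT 1 FROM tableB b
      WHERE b.PRDCOD = tableA.NPROD
        AND b.SRVRSP = 'SUCCESSFUL'
        -- keep only the newest tableB row for this product
        AND b.timstp = (SELECT MAX(timstp) FROM tableB
                        WHERE PRDCOD = b.PRDCOD)
  )
"""

def run_case(responses):
    """responses: SRVRSP values for product 5520, oldest first.

    Returns {NPROD: SPRTXT01} after running the UPDATE.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tableA (NPROD TEXT, SPRTXT01 TEXT)")
    conn.execute(
        "CREATE TABLE tableB (PRDCOD TEXT, SRVRSP TEXT, timstp INTEGER)"
    )
    conn.execute("INSERT INTO tableA VALUES ('5520', ''), ('XXXX', '')")
    # The list index stands in for the timestamp: a larger index is newer.
    conn.executemany(
        "INSERT INTO tableB VALUES ('5520', ?, ?)",
        [(resp, i) for i, resp in enumerate(responses)],
    )
    conn.execute(UPDATE_SQL)
    return dict(conn.execute("SELECT NPROD, SPRTXT01 FROM tableA"))

latest_ok = run_case(["SUCCESSFUL", "UNSUCCESSFUL", "SUCCESSFUL"])
latest_failed = run_case(["SUCCESSFUL", "UNSUCCESSFUL"])
```

When the newest response is UNSUCCESSFUL, the older SUCCESSFUL row no longer satisfies the timestamp condition, so nothing is updated. That is the behaviour the question asked for.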

Find the difference between 1 column depending on date

When I run this:
SELECT NAME FROM T1
WHERE _LOAD_DATETIME::date = '2022-01-31'
I see 62 rows
but when I do
SELECT NAME FROM T1
WHERE _LOAD_DATETIME::date = '2022-02-01'
I see 59
I want to see which NAMEs are missing from the run for _LOAD_DATETIME::date = '2022-02-01'.
I thought this would work but it doesn't:
SELECT NAME FROM table
WHERE _LOAD_DATETIME::date = '2022-02-01'
AND NOT EXISTS (
SELECT NAME FROM
table
WHERE _LOAD_DATETIME::date = '2022-01-31')
You have to use MINUS for your purposes:
SELECT NAME FROM T1
WHERE _LOAD_DATETIME::date = '2022-01-31'
MINUS
SELECT NAME FROM T1
WHERE _LOAD_DATETIME::date = '2022-02-01'
If we are talking about PostgreSQL, you have to use EXCEPT instead of MINUS.
There are two set operators you can use, MINUS and EXCEPT (they are aliases for each other):
SELECT column1 FROM values (1),(2),(3),(4)
MINUS
SELECT column1 FROM values (2),(3),(4),(5);
This gives 1; if you want to see 5, you need to flip the order of the SELECTs.
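For comparison, here is a small sketch of why the NOT EXISTS attempt in the question returns nothing, next to two forms that do work. It runs on SQLite through Python's sqlite3 module, which supports EXCEPT but not MINUS. The column LOAD_DATE (a plain text date instead of _LOAD_DATETIME::date) and the sample names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T1 (NAME TEXT, LOAD_DATE TEXT)")
conn.executemany("INSERT INTO T1 VALUES (?, ?)", [
    ("a", "2022-01-31"), ("b", "2022-01-31"), ("c", "2022-01-31"),
    ("a", "2022-02-01"), ("c", "2022-02-01"),  # "b" is missing on 02-01
])

def names(sql):
    """Run a query and return its first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

# Set difference: names loaded on 01-31 minus names loaded on 02-01.
missing_except = names("""
    SELECT NAME FROM T1 WHERE LOAD_DATE = '2022-01-31'
    EXCEPT
    SELECT NAME FROM T1 WHERE LOAD_DATE = '2022-02-01'
""")

# Same result with NOT EXISTS, correlated on NAME.
missing_not_exists = names("""
    SELECT d1.NAME FROM T1 AS d1
    WHERE d1.LOAD_DATE = '2022-01-31'
      AND NOT EXISTS (SELECT 1 FROM T1 AS d2
                      WHERE d2.LOAD_DATE = '2022-02-01'
                        AND d2.NAME = d1.NAME)
""")

# Shape of the attempt in the question: the subquery is not correlated,
# so NOT EXISTS is false as soon as any 01-31 row exists at all.
uncorrelated = names("""
    SELECT NAME FROM T1
    WHERE LOAD_DATE = '2022-02-01'
      AND NOT EXISTS (SELECT NAME FROM T1
                      WHERE LOAD_DATE = '2022-01-31')
""")
```

The correlated form needs the outer query to read the earlier date and the subquery to compare on NAME. The original attempt did neither.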

How to isolate result from parent string?

Do you have any ideas how I could isolate the result in the following task?
I have a column which contains the following values:
col_1
10001A
10001A10002A
10001A10002A10003A
10004A
10004A10005B
10006A
10007A
10007A10008A
I should select only the rows which don't have any offspring:
col_1
10001A10002A10003A
10004A10005B
10006A
10007A10008A
You need a like condition to find those rows:
select *
from the_table t1
where not exists (select *
from the_table t2
where t2.col_1 like t1.col_1||'%'
and t1.col_1 <> t2.col_1);
Online example: https://rextester.com/GVGVV77242
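A quick way to check the query against the sample data from the question is the sketch below, which uses SQLite through Python's sqlite3 module as a stand-in engine (an assumption, since the question does not name its database). Note that a value containing % or _ would be treated as a wildcard by LIKE, which this sketch does not handle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (col_1 TEXT)")
rows = [
    "10001A", "10001A10002A", "10001A10002A10003A",
    "10004A", "10004A10005B",
    "10006A",
    "10007A", "10007A10008A",
]
conn.executemany("INSERT INTO the_table VALUES (?)", [(r,) for r in rows])

# A row is kept when no other row starts with its value,
# i.e. when it is not the prefix of a longer string.
leaves = [r[0] for r in conn.execute("""
    SELECT col_1
    FROM the_table t1
    WHERE NOT EXISTS (SELECT 1
                      FROM the_table t2
                      WHERE t2.col_1 LIKE t1.col_1 || '%'
                        AND t2.col_1 <> t1.col_1)
    ORDER BY col_1
""")]
```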

NULL id in SQL statement

Select *
from tbl
where id = '1fa3bcdc9a1cf60f02a2ae774e2cf166'
or matching_id = 'ea74c270-65fd-46d0-898b-faf1a7bf7e16'
My id column below comes back as NULL for one of my events. The events are related through the matching_id key. Is there a way I can write a CASE statement to populate that id field so it's not NULL?
You can use coalesce() and in case of null id fetch a value (any value?) for the id matching the matching_id like this:
Select
coalesce(
id,
(select max(id) from tbl where matching_id = 'ea74c270-65fd-46d0-898b-faf1a7bf7e16')
) as id,
matching_id,
event_name
from tbl
where
id = '1fa3bcdc9a1cf60f02a2ae774e2cf166'
or
matching_id = 'ea74c270-65fd-46d0-898b-faf1a7bf7e16'
Edit.
Try this in case of unknown values:
Select
coalesce(
t.id,
(select max(id) from tbl where matching_id = t.matching_id)
) as id,
coalesce(
t.matching_id,
(select max(matching_id) from tbl where id = t.id)
) as matching_id,
t.event_name
from tbl t
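The correlated-subquery variant above can be sketched as follows, using SQLite through Python's sqlite3 module as a stand-in engine. The short ids ("id-1", "m-1") and event names are made-up sample data, not values from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id TEXT, matching_id TEXT, event_name TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", [
    ("id-1", "m-1", "created"),
    (None,   "m-1", "updated"),   # id is NULL, but matching_id links it to id-1
    ("id-2", "m-2", "created"),
])

# For each row, fall back to the id of any row sharing the same matching_id.
filled = conn.execute("""
    SELECT COALESCE(
               t.id,
               (SELECT MAX(id) FROM tbl WHERE matching_id = t.matching_id)
           ) AS id,
           t.matching_id,
           t.event_name
    FROM tbl t
    ORDER BY t.rowid
""").fetchall()
```

MAX(id) is just one way to pick a single value. If several different ids can share one matching_id, the choice of which one to keep would need a real rule.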
I ended up writing something like this:
with find_event AS (
Select tbl1.id, tbl1.matching_id
from tbl1
where tbl1.id IS NOT NULL
)
Select
CASE WHEN find_event.id IS NOT NULL
THEN find_event.id
WHEN find_event.id IS NULL
THEN tbl2.id
ELSE tbl2.id
END as final_id,
tbl2.matching_id,
tbl2.id,
tbl2.event_name,
find_event.id as find_canonical_id
from tbl2
left JOIN find_event on find_event.matching_id = tbl2.matching_id
where
tbl2.transaction_canonical_id = '1fa3bcdc9a1cf60f02a2ae774e2cf166'
or
tbl2.matching_id = 'ea74c270-65fd-46d0-898b-faf1a7bf7e16'
If you want to replace the column values with the values from your parameters, you could use something like this:
with ids (id, mid) as (
values
('1fa3bcdc9a1cf60f02a2ae774e2cf166', 'ea74c270-65fd-46d0-898b-faf1a7bf7e16')
)
Select coalesce(tbl.id, ids.id),
tbl.matching_id,
tbl.event_name
from tbl
join ids on ids.id = tbl.id or ids.mid = tbl.matching_id;
You may be able to utilize the ISNULL() function for this. Maybe something like:
Select
ISNULL(id, matching_id) AS [myid],
matching_id,
event_name
from tbl
where id = '1fa3bcdc9a1cf60f02a2ae774e2cf166'
or matching_id = 'ea74c270-65fd-46d0-898b-faf1a7bf7e16'

#1222 - The used SELECT statements have a different number of columns

Why am I getting "#1222 - The used SELECT statements have a different number of columns"?
I am trying to load wall posts from this user's friends and from the user himself.
SELECT u.id AS pid, b2.id AS id, b2.message AS message, b2.date AS date FROM
(
(
SELECT b.id AS id, b.pid AS pid, b.message AS message, b.date AS date FROM
wall_posts AS b
JOIN Friends AS f ON f.id = b.pid
WHERE f.buddy_id = '1' AND f.status = 'b'
ORDER BY date DESC
LIMIT 0, 10
)
UNION
(
SELECT * FROM
wall_posts
WHERE pid = '1'
ORDER BY date DESC
LIMIT 0, 10
)
ORDER BY date DESC
LIMIT 0, 10
) AS b2
JOIN Users AS u
ON b2.pid = u.id
WHERE u.banned='0' AND u.email_activated='1'
ORDER BY date DESC
LIMIT 0, 10
The wall_posts table structure looks like id date privacy pid uid message
The Friends table structure looks like Fid id buddy_id invite_up_date status
pid stands for profile id. I am not really sure what's going on.
The first statement in the UNION returns four columns:
SELECT b.id AS id,
b.pid AS pid,
b.message AS message,
b.date AS date
FROM wall_posts AS b
The second one returns six, because the * expands to include all the columns from WALL_POSTS:
SELECT b.id,
b.date,
b.privacy,
b.pid,
b.uid,
b.message
FROM wall_posts AS b
The UNION and UNION ALL operators require that:
The same number of columns exist in all the statements that make up the UNION'd query
The data types have to match at each position/column
Use:
FROM ((SELECT b.id AS id,
b.pid AS pid,
b.message AS message,
b.date AS date
FROM wall_posts AS b
JOIN Friends AS f ON f.id = b.pid
WHERE f.buddy_id = '1' AND f.status = 'b'
ORDER BY date DESC
LIMIT 0, 10)
UNION
(SELECT id,
pid,
message,
date
FROM wall_posts
WHERE pid = '1'
ORDER BY date DESC
LIMIT 0, 10))
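The column-count rule is easy to reproduce in isolation. The sketch below uses SQLite through Python's sqlite3 module instead of MySQL, so the error text differs from #1222, but the cause is the same: a four-column SELECT combined with SELECT * on a six-column table. The single sample row is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wall_posts (id INTEGER, date TEXT, privacy TEXT, "
    "pid INTEGER, uid INTEGER, message TEXT)"
)
conn.execute(
    "INSERT INTO wall_posts VALUES (1, '2011-01-01', 'public', 1, 1, 'hello')"
)

# 4 columns on the left, 6 (via *) on the right: the engine rejects the query.
try:
    conn.execute(
        "SELECT id, pid, message, date FROM wall_posts "
        "UNION SELECT * FROM wall_posts"
    )
    mismatch_error = None
except sqlite3.OperationalError as exc:
    mismatch_error = str(exc)

# Listing the same 4 columns on both sides makes the UNION valid.
rows = conn.execute(
    "SELECT id, pid, message, date FROM wall_posts "
    "UNION "
    "SELECT id, pid, message, date FROM wall_posts WHERE pid = 1"
).fetchall()
```

Spelling out the columns instead of using * also keeps the query working if columns are later added to wall_posts.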
You're taking the UNION of a 4-column relation (id, pid, message, and date) with a 6-column relation (* = the 6 columns of wall_posts). SQL doesn't let you do that.
(
SELECT b.id AS id, b.pid AS pid, b.message AS message, b.date AS date FROM
wall_posts AS b
JOIN Friends AS f ON f.id = b.pid
WHERE f.buddy_id = '1' AND f.status = 'b'
ORDER BY date DESC
LIMIT 0, 10
)
UNION
(
SELECT id, pid , message , date
FROM
wall_posts
WHERE pid = '1'
ORDER BY date DESC
LIMIT 0, 10
)
You were selecting 4 in the first query and 6 in the second, so match them up.
Besides the answer given by @omg-ponies, I just want to add that this error can also occur in variable assignment. In my case I ran an insert, and a trigger was attached to that insert. I mistakenly assigned a different number of fields to the variables. The details of my case are below.
INSERT INTO tab1 (event, eventTypeID, fromDate, toDate, remarks)
-> SELECT event, eventTypeID,
-> fromDate, toDate, remarks FROM rrp group by trainingCode;
ERROR 1222 (21000): The used SELECT statements have a different number of columns
So, as you can see, I got this error from an INSERT statement rather than a UNION. What was different in my case:
I issued a bulk insert,
i.e. insert into tab1 (field, ...) as select field, ... from tab2
tab2 had an on-insert trigger that basically rejects duplicates
It turned out that I had an error in the trigger: I fetched a record based on the new input data and assigned its fields to the wrong number of variables.
DELIMITER @@
DROP TRIGGER trgInsertTrigger @@
CREATE TRIGGER trgInsertTrigger
BEFORE INSERT ON training
FOR EACH ROW
BEGIN
SET @recs = 0;
SET @trgID = 0;
SET @trgDescID = 0;
SET @trgDesc = '';
SET @district = '';
SET @msg = '';
-- Bug: the SELECT returns 5 columns but INTO lists 6 variables (@proj is left over)
SELECT COUNT(*), t.trainingID, td.trgDescID, td.trgDescName, t.trgDistrictID
INTO @recs, @trgID, @trgDescID, @proj, @trgDesc, @district
from training as t
left join trainingDistrict as tdist on t.trainingID = tdist.trainingID
left join trgDesc as td on t.trgDescID = td.trgDescID
WHERE
t.trgDescID = NEW.trgDescID
AND t.venue = NEW.venue
AND t.fromDate = NEW.fromDate
AND t.toDate = NEW.toDate
AND t.gender = NEW.gender
AND t.totalParticipants = NEW.totalParticipants
AND t.districtIDs = NEW.districtIDs;
IF @recs > 0 THEN
SET @msg = CONCAT('Error: Duplicate Training: previous ID ', CAST(@trgID AS CHAR CHARACTER SET utf8) COLLATE utf8_bin);
SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = @msg;
END IF;
END @@
DELIMITER ;
As you can see, I am fetching 5 fields but assigning them to 6 variables. (Totally my fault: I forgot to delete the variable after editing.)
You are using MySQL Union.
UNION is used to combine the result from multiple SELECT statements into a single result set.
The column names from the first SELECT statement are used as the column names for the results returned. Selected columns listed in corresponding positions of each SELECT statement should have the same data type. (For example, the first column selected by the first statement should have the same type as the first column selected by the other statements.)
Reference: MySQL Union
Your first SELECT statement has 4 columns and the second has 6, since, as you said, wall_posts has 6 columns.
Both statements should have the same number of columns, in the same order;
otherwise you get an error or wrong data.