I have a requirement where an address table stores the addresses of many people, and a person's address can change. The table is insert-only: every time a person's address changes, we insert a new row holding the new address. A common_id column identifies the person the row belongs to, so each new row carries the same common_id as that person's original row. I wrote a query to get the latest address (by time_created) with status N and type L for every person, but it fails:
select *
from address wi
where type = 'L'
and status = 'N'
and time_created = (
select time_created
from (
select *
from address w1
where wi.common_id = w1.common_id
order by time_created desc
) t
where rownum = 1
)
The query above fails, but the query below runs and returns the expected result:
select *
from address wi
where type = 'L'
and status = 'N'
and time_created = (
select max(time_created)
from address t
where t.common_id = wi.common_id
)
I am puzzled why the first query fails with ORA-00904 (invalid identifier) on wi.common_id. Kindly help me understand.
Your problem is that you are trying to correlate two levels down, and you can only do it one level down. That is why your second query works and the first does not: in the first query wi.common_id sits inside two nested subqueries (the ones used to find the latest date), so it is no longer recognized.
...
and time_created = (
select time_created -- Here it would still be recognized
from (
select * -- Here it won't anymore
...
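For comparison, here is a minimal sketch of the one-level correlation that the working query relies on. It uses Python's built-in sqlite3 module and made-up rows (the addr column and all values are assumptions for illustration). SQLite is not Oracle and will not reproduce ORA-00904; the sketch only shows the query shape in which the inner query references wi exactly one level down.

```python
import sqlite3

# In-memory table with made-up rows; only the column names used in the
# question (common_id, type, status, time_created) come from the thread.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE address (common_id INTEGER, addr TEXT, type TEXT,"
    " status TEXT, time_created TEXT)"
)
conn.executemany(
    "INSERT INTO address VALUES (?, ?, ?, ?, ?)",
    [
        (1, "1 Old Road", "L", "N", "2020-01-01"),
        (1, "2 New Road", "L", "N", "2020-06-01"),
        (2, "9 Only Street", "L", "N", "2020-03-01"),
    ],
)

# The subquery refers to wi.common_id directly, one level below the
# outer query, instead of from inside a second nested inline view.
latest_rows = conn.execute(
    """
    SELECT wi.common_id, wi.addr
    FROM address wi
    WHERE wi.type = 'L'
      AND wi.status = 'N'
      AND wi.time_created = (
            SELECT MAX(t.time_created)
            FROM address t
            WHERE t.common_id = wi.common_id
          )
    ORDER BY wi.common_id
    """
).fetchall()
print(latest_rows)
```

Each person appears once, with the row that has the greatest time_created for their common_id.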
In Google BigQuery, I would like to delete a subset of records, based on the value of a specific column. It's a query that I need to run repeatedly and that I would like to run automatically.
The problem is that this specific column is of the form STRUCT<column_1 ARRAY (STRING), column_2 ARRAY (STRING), ... >, and I don't know how to use such a column in the where-clause when using the delete-command.
Here is basically what I am trying to do (this code does not work):
DELETE
FROM dataset.table t
LEFT JOIN UNNEST(t.category.column_1) AS type
WHERE t.partition_date = '2020-07-22'
AND type = 'some_value'
The error that I'm getting is: Syntax error: Expected end of input but got keyword LEFT at [3:1]
If I replace the DELETE with SELECT *, it does work:
SELECT *
FROM dataset.table t
LEFT JOIN UNNEST(t.category.column_1) AS type
WHERE t.partition_date = '2020-07-22'
AND type = 'some_value'
Does somebody know how to use such a column to delete a subset of records?
EDIT:
Here is some code to create a reproducible example with some silly data (fill in your own dataset and table name in all queries):
Suppose you want to delete all rows where category.type contains the value 'food'.
1 - create a table:
CREATE TABLE <DATASET>.<TABLE_NAME>
(
article STRING,
category STRUCT<
color STRING,
type ARRAY<STRING>
>
);
2 - Insert data into the new table:
INSERT <DATASET>.<TABLE_NAME>
SELECT "apple" AS article, STRUCT('red' AS color, ['fruit','food'] as type) AS category
UNION ALL
SELECT "cabbage" AS article, STRUCT('blue' AS color, ['vegetable', 'food'] as type) AS category
UNION ALL
SELECT "book" AS article, STRUCT('red' AS color, ['object'] as type) AS category
UNION ALL
SELECT "dog" AS article, STRUCT('green' AS color, ['animal', 'pet'] as type) AS category;
3 - Show that select works (return all rows where category.type contains the value 'food'; these are the rows I want to delete):
SELECT *
FROM <DATASET>.<TABLE_NAME>
LEFT JOIN UNNEST(category.type) type
WHERE type = 'food'
Initial Result
4 - My attempt at deleting rows where category.type contains 'food' does not work:
DELETE
FROM <DATASET>.<TABLE_NAME>
LEFT JOIN UNNEST(category.type) type
WHERE type = 'food'
Syntax error: Unexpected keyword LEFT at [3:1]
Desired Result
This is the code I used to delete the desired records (the records where category.type contains the value 'food'.)
DELETE
FROM <DATASET>.<TABLE_NAME> t1
WHERE EXISTS(SELECT 1 FROM UNNEST(t1.category.type) t2 WHERE t2 = 'food')
The embarrassing thing is that I had seen this kind of answer on similar questions (for example, on update queries). But I come from Oracle SQL, where I believe you have to connect the subquery to the main query in the subquery's WHERE clause (i.e. connect t1 with t2), so I didn't understand those answers. That's why I posted this question.
However, I learned that BigQuery works out how to connect table t1 and 'table' t2 by itself, because UNNEST(t1.category.type) already refers to the current row of t1; you don't have to connect them explicitly.
It is still possible to connect them explicitly (perhaps even recommended?):
DELETE
FROM <DATASET>.<TABLE_NAME> t1
WHERE EXISTS (SELECT 1 FROM <DATASET>.<TABLE_NAME> t2 LEFT JOIN UNNEST(t2.category.type) AS type WHERE type = 'food' AND t1.article=t2.article)
A second difficulty for me, though, was that the ID in my actual data is buried in an array>struct construction, so I got stuck connecting t1 and t2. Fortunately, this connection is not always necessary.
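To see why no explicit join condition is needed, it may help to read the EXISTS over UNNEST as a per-row check on that row's own array. The following is a plain-Python analogy using the sample data from the question, not BigQuery code; the function name should_delete is made up for illustration.

```python
# Sample rows from the reproducible example, as Python dictionaries.
rows = [
    {"article": "apple", "category": {"color": "red", "type": ["fruit", "food"]}},
    {"article": "cabbage", "category": {"color": "blue", "type": ["vegetable", "food"]}},
    {"article": "book", "category": {"color": "red", "type": ["object"]}},
    {"article": "dog", "category": {"color": "green", "type": ["animal", "pet"]}},
]


def should_delete(row, value="food"):
    # Mirrors: EXISTS(SELECT 1 FROM UNNEST(t1.category.type) t2 WHERE t2 = value)
    # The array being scanned belongs to the row under test, which is
    # why the subquery needs no extra condition linking t1 and t2.
    return any(t2 == value for t2 in row["category"]["type"])


# Rows that would survive the DELETE.
remaining = [row["article"] for row in rows if not should_delete(row)]
print(remaining)
```

The apple and cabbage rows are dropped because their own type arrays contain 'food'; book and dog remain.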
Since you did not provide any sample data I am going to explain using some dummy data. In case you add your sample data, I can update the answer.
Firstly, according to your description, you have only a STRUCT, not an Array[Struct <col_1, col_2>]. For this reason, you do not need UNNEST to access the values within the data. Below is an example of how to access particular data within a STRUCT.
WITH data AS (
SELECT 1 AS id, STRUCT("Alex" AS name, 30 AS age, "NYC" AS city) AS info UNION ALL
SELECT 1 AS id, STRUCT("Leo" AS name, 18 AS age, "Sydney" AS city) AS info UNION ALL
SELECT 1 AS id, STRUCT("Robert" AS name, 25 AS age, "Paris" AS city) AS info UNION ALL
SELECT 1 AS id, STRUCT("Mary" AS name, 28 AS age, "London" AS city) AS info UNION ALL
SELECT 1 AS id, STRUCT("Ralph" AS name, 45 AS age, "London" AS city) AS info
)
SELECT * FROM data
WHERE info.city = "London"
Notice that the STRUCT is named info, and the field we accessed and used in the WHERE clause is city.
Now, in order to delete the rows that contain a specific value within the STRUCT (in your case I assume it would be your_struct.column_1), you can use DELETE or MERGE with DELETE. I have saved the above data in a table to execute the examples below, which have the same output:
First method: DELETE
DELETE FROM `project.dataset.table`
WHERE info.city = "Sydney"
Second method: MERGE and DELETE
MERGE `project.dataset.table` a
USING (SELECT * from `project.dataset.table` WHERE info.city ="Sydney") b
ON a.info.city =b.info.city
WHEN matched and b.id=1 then
Delete
And the output of both queries:
Row  id  info.name  info.age  info.city
1    1   Alex       30        NYC
2    1   Robert     25        Paris
3    1   Ralph      45        London
4    1   Mary       28        London
As you can see, the row where info.city = "Sydney" was deleted in both cases.
It is important to point out that the rows are permanently removed from your source table, so you should be careful.
Note: Since you want to run this process every day, you could use a scheduled query in the BigQuery console, appending or overwriting the results after each run. Also, it is good practice not to delete data from your source table. Consider instead creating a new table from your source table without the rows you do not want.
I have a table that can store multiple descriptions for each code. However there is a flag in that table that is to indicate which of those is the main or primary description. In some instances, we have codes that have more than one with this flag set to Y which is not correct.
I am having trouble coming up with the SQL to get all the rows in that table that have more than one description set to Y.
I've used this SQL to identify rows that do not have ANY dsp_fg = 'Y'
select *
from table A
where dsp_fg = 'N'
and not exists (select 1 FROM table where cod_int_id = A.cod_int_id AND dsp_fg = 'Y')
But I am having trouble writing the SQL to get the cod_int_ids that have more than one Y record. Can someone help?
SELECT cod_int_id FROM table
WHERE dsp_fg = 'Y'
GROUP BY cod_int_id
HAVING count(*) > 1
This is not perfect, but it identifies what I need.
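If the full rows are needed rather than just the ids, the grouped query can be used as a filter. Below is a minimal sketch using Python's built-in sqlite3 module; the table name descriptions, the descr column, and all rows are made up for illustration, and only cod_int_id and dsp_fg come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE descriptions (cod_int_id INTEGER, descr TEXT, dsp_fg TEXT)"
)
conn.executemany(
    "INSERT INTO descriptions VALUES (?, ?, ?)",
    [
        (1, "main", "Y"),
        (1, "alt", "N"),
        (2, "first", "Y"),   # code 2 wrongly has two primary rows
        (2, "second", "Y"),
        (3, "only", "N"),
    ],
)

# Codes with more than one primary description.
duplicated = conn.execute(
    """
    SELECT cod_int_id, COUNT(*) AS y_count
    FROM descriptions
    WHERE dsp_fg = 'Y'
    GROUP BY cod_int_id
    HAVING COUNT(*) > 1
    """
).fetchall()

# The same grouped query used as a filter to return the offending rows.
offending_rows = conn.execute(
    """
    SELECT cod_int_id, descr
    FROM descriptions
    WHERE dsp_fg = 'Y'
      AND cod_int_id IN (
            SELECT cod_int_id
            FROM descriptions
            WHERE dsp_fg = 'Y'
            GROUP BY cod_int_id
            HAVING COUNT(*) > 1
          )
    ORDER BY descr
    """
).fetchall()
print(duplicated, offending_rows)
```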
This is my first ever question.
Below is what I am trying to execute:
update SRM_SR_AuditLog
set MODIFIED_DATE = '1426816800'
, USER_X = 'Vaibhav via DB'
where REQUEST_ID in (
select max(REQUEST_ID) from SRM_SR_AuditLog
where ORIGINAL_REQUEST_ID = (
select SYSREQUESTID from SRM_Request
where REQUEST_NUMBER in (
'ASREQ0000136770', 'ASREQ0000137758', 'ASREQ0000138174',
'ASREQ0000138175', 'ASREQ0000138176', 'ASREQ0000138177',
'ASREQ0000138178', 'ASREQ0000138180', 'ASREQ0000138181',
'ASREQ0000138238', 'ASREQ0000138319', 'ASREQ0000138349',
'ASREQ0000139486', 'ASREQ0000140292', 'ASREQ0000140295',
'ASREQ0000140299', 'ASREQ0000140334', 'ASREQ0000140403',
'ASREQ0000140637', 'ASREQ0000140692' )
)
);
I know the below wouldn't work:
ORIGINAL_REQUEST_ID = (
select SYSREQUESTID from SRM_Request where REQUEST_NUMBER in
Because the query (select SYSREQUESTID from SRM_Request where REQUEST_NUMBER = "XYZ") will return more than one record, and for each of those records in SRM_Request there is more than one record in the table "SRM_SR_AuditLog". I want the latest/biggest request ID from the "SRM_SR_AuditLog" table for each SYSREQUESTID returned by the query above.
Hope this makes sense.
I want to execute outer query for each value returned by inner query.
How can I proceed on this please ?
Thanks heaps
Vab
If I am understanding correctly, then I think what you want is this:
update SRM_SR_AuditLog set
MODIFIED_DATE = '1426816800',
USER_X = 'Vaibhav via DB' where
REQUEST_ID in
(
select max(REQUEST_ID) from SRM_SR_AuditLog where
ORIGINAL_REQUEST_ID IN
(
select SYSREQUESTID from SRM_Request where REQUEST_NUMBER in
(
'ASREQ0000136770', 'ASREQ0000137758', 'ASREQ0000138174', 'ASREQ0000138175', 'ASREQ0000138176', 'ASREQ0000138177', 'ASREQ0000138178', 'ASREQ0000138180', 'ASREQ0000138181', 'ASREQ0000138238', 'ASREQ0000138319', 'ASREQ0000138349', 'ASREQ0000139486', 'ASREQ0000140292', 'ASREQ0000140295', 'ASREQ0000140299', 'ASREQ0000140334', 'ASREQ0000140403', 'ASREQ0000140637', 'ASREQ0000140692'
)
)
group by ORIGINAL_REQUEST_ID
)
This will find all request IDs in SRM_Request for the given request numbers; find all rows in SRM_SR_AuditLog whose original request ID is in those request IDs; find the maximum request ID for each unique original request ID; and update the rows with those request IDs.
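The effect of the GROUP BY can be checked on a small data set. The sketch below uses Python's built-in sqlite3 module with made-up ids and only two of the request numbers, plus one made-up request ('OTHER-REQ') that should stay untouched. It is an illustration of the statement's shape, not a test against the real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE SRM_Request (SYSREQUESTID INTEGER, REQUEST_NUMBER TEXT);
    CREATE TABLE SRM_SR_AuditLog (
        REQUEST_ID INTEGER, ORIGINAL_REQUEST_ID INTEGER,
        MODIFIED_DATE TEXT, USER_X TEXT
    );
    INSERT INTO SRM_Request VALUES
        (10, 'ASREQ0000136770'), (20, 'ASREQ0000137758'), (30, 'OTHER-REQ');
    INSERT INTO SRM_SR_AuditLog VALUES
        (1, 10, NULL, NULL), (2, 10, NULL, NULL),
        (3, 20, NULL, NULL), (4, 20, NULL, NULL), (5, 20, NULL, NULL),
        (6, 30, NULL, NULL);
    """
)

# GROUP BY makes the subquery return one MAX(REQUEST_ID) per
# ORIGINAL_REQUEST_ID instead of a single overall maximum.
conn.execute(
    """
    UPDATE SRM_SR_AuditLog
    SET MODIFIED_DATE = '1426816800', USER_X = 'Vaibhav via DB'
    WHERE REQUEST_ID IN (
        SELECT MAX(REQUEST_ID)
        FROM SRM_SR_AuditLog
        WHERE ORIGINAL_REQUEST_ID IN (
            SELECT SYSREQUESTID
            FROM SRM_Request
            WHERE REQUEST_NUMBER IN ('ASREQ0000136770', 'ASREQ0000137758')
        )
        GROUP BY ORIGINAL_REQUEST_ID
    )
    """
)

updated_ids = [
    row[0]
    for row in conn.execute(
        "SELECT REQUEST_ID FROM SRM_SR_AuditLog"
        " WHERE USER_X = 'Vaibhav via DB' ORDER BY REQUEST_ID"
    )
]
print(updated_ids)
```

One audit row is updated per original request: the latest one for each of the two listed requests, and none for the unlisted request.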
Thanks for the reply Dave.
SRM_Request has one to many mapping with SRM_SR_AuditLog table.
Changing to "IN" will scan all records and find just one record with max(REQUEST_ID).
select SYSREQUESTID from SRM_Request where REQUEST_NUMBER in
(
'ASREQ0000136770', 'ASREQ0000137758', 'ASREQ0000138174', 'ASREQ0000138175', 'ASREQ0000138176', 'ASREQ0000138177', 'ASREQ0000138178', 'ASREQ0000138180', 'ASREQ0000138181', 'ASREQ0000138238', 'ASREQ0000138319', 'ASREQ0000138349', 'ASREQ0000139486', 'ASREQ0000140292', 'ASREQ0000140295', 'ASREQ0000140299', 'ASREQ0000140334', 'ASREQ0000140403', 'ASREQ0000140637', 'ASREQ0000140692'
)
This will return 20 references.
For each of these 20 references I want the matching reference from SRM_SR_AuditLog via max(REQUEST_ID), so 20 references in total.
With "select max(REQUEST_ID) from SRM_SR_AuditLog where ", neither "IN" nor "=" would help.
I came across the following table structure and I need to perform a certain type of query upon it.
id
first_name
last_name
address
email
audit_parent_id
audit_entry_type
audit_change_date
The last three fields are for the audit trail. There is a convention that says: all original entries have the value "0" for "audit_parent_id" and the value "master" for "audit_entry_type". All the modified entries have the value of their parent id for "audit_parent_id" and the value "modified" for "audit_entry_type".
Now what I want to do is get the original value and the modified value for a field, and I want to do this with as few queries as possible.
Any ideas? Thank you.
Assuming a simple case where you want to get the latest address change for the record with id 50, this query fits your needs.
select
p.id,
p.address as original_address,
(select p1.address from persons p1 where p1.audit_parent_id = p.id order by audit_change_date desc limit 1) as latest_address
from
persons p -- Assuming it's the table name
where
p.id = 50
But this assumes that every modified row stores the full address, even when that audit entry did not change the address.
Here's another example, showing all persons that had an address change:
select
p.id,
p.address as original_address,
(select p1.address from persons p1 where p1.audit_parent_id = p.id order by audit_change_date desc limit 1) as latest_address
from
persons p -- Assuming it's the table name
where
p.audit_parent_id = 0
and
p.address <> (select p1.address from persons p1 where p1.audit_parent_id = p.id order by audit_change_date desc limit 1)
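The second query can be tried on a few rows to see which persons it reports. This is a minimal sketch using Python's built-in sqlite3 module; the table name persons is the same assumption as in the queries above, and the rows are made up. It compares with <> rather than NOT LIKE, so that characters such as % in an address are not treated as wildcards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE persons (id INTEGER PRIMARY KEY, address TEXT,"
    " audit_parent_id INTEGER, audit_entry_type TEXT, audit_change_date TEXT)"
)
conn.executemany(
    "INSERT INTO persons VALUES (?, ?, ?, ?, ?)",
    [
        (50, "1 First St", 0, "master", "2020-01-01"),
        (51, "2 Second St", 50, "modified", "2020-02-01"),
        (52, "3 Third St", 50, "modified", "2020-03-01"),
        (60, "7 Quiet Ln", 0, "master", "2020-01-05"),  # never modified
    ],
)

# Correlated scalar subquery: latest modified address for each master row.
LATEST = (
    "(SELECT p1.address FROM persons p1"
    " WHERE p1.audit_parent_id = p.id"
    " ORDER BY p1.audit_change_date DESC LIMIT 1)"
)

changed = conn.execute(
    f"""
    SELECT p.id, p.address AS original_address, {LATEST} AS latest_address
    FROM persons p
    WHERE p.audit_parent_id = 0
      AND p.address <> {LATEST}
    ORDER BY p.id
    """
).fetchall()
print(changed)
```

Person 60 has no modified rows, so the subquery yields NULL, the comparison is not true, and the row is left out.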
This can be solved with pure SQL in modern Postgres using WITH RECURSIVE.
For PostgreSQL 8.3, this plpgsql function does the job, and it is also a decent solution for modern PostgreSQL. You want to ..
get the original value and the modified value for a field
The demo picks first_name as the field:
CREATE OR REPLACE FUNCTION f_get_org_val(integer
, OUT first_name_curr text
, OUT first_name_org text) AS
$func$
DECLARE
_parent_id int;
BEGIN
SELECT INTO first_name_curr, first_name_org, _parent_id
first_name, first_name, audit_parent_id
FROM tbl
WHERE id = $1;
WHILE _parent_id <> 0
LOOP
SELECT INTO first_name_org, _parent_id
first_name, audit_parent_id
FROM tbl
WHERE id = _parent_id;
END LOOP;
END
$func$ LANGUAGE plpgsql;
COMMENT ON FUNCTION f_get_org_val(int) IS 'Get current and original values for id.
$1 .. id';
Call:
SELECT * FROM f_get_org_val(123);
This assumes that all trees have a root node with audit_parent_id = 0 and that there are no circular references; otherwise you will end up with an endless loop. You might want to add a counter and exit the loop after x iterations.
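The counter idea can be sketched outside the database. Below is a small Python illustration of the same parent walk with an iteration limit. The function name, the in-memory dictionary standing in for the table, and the max_depth parameter are all made up for the example; it is not a translation of the plpgsql function into something you would deploy.

```python
def get_original_value(rows, row_id, field="first_name", max_depth=100):
    """Return (current, original) values of `field` for `row_id`.

    `rows` maps id -> dict with the field and an 'audit_parent_id' key.
    Walks up the parent chain until audit_parent_id is 0 and raises if
    more than `max_depth` steps are needed (for example, on a cycle).
    """
    row = rows[row_id]
    current = original = row[field]
    parent_id = row["audit_parent_id"]
    depth = 0
    while parent_id != 0:
        depth += 1
        if depth > max_depth:
            raise RuntimeError("parent chain too deep or circular")
        parent = rows[parent_id]
        original = parent[field]
        parent_id = parent["audit_parent_id"]
    return current, original


# Made-up data: row 2 is a modification of master row 1.
people = {
    1: {"first_name": "Ann", "audit_parent_id": 0},
    2: {"first_name": "Anne", "audit_parent_id": 1},
}
print(get_original_value(people, 2))
```

With a circular chain the function raises instead of looping forever, which is the behaviour the counter is meant to provide.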
The 2 records in the image above are from the DB. The constraint on the table is on (SID, LINE_ITEM_ID);
together, the SID and LINE_ITEM_ID columns identify a unique record.
My issue:
I am looking for a query that fetches a record from the DB depending on these conditions
when I search for PART_NUMBER = 'PAU43-IMB-P6':
1. If there is only one record, under either SID = 1 or SID = 2, it should fetch that record, no matter which SID the item belongs to.
2. If there are 2 items, one under SID = 1 and the other under SID = 2, it should fetch only the record under SID = 2.
In short, I am looking for a query that searches for a given part_number across both SID 1 and 2, returns the value under SID = 2, and returns the value under SID = 1 only if there are no records under SID = 2 (the query has to withstand a search over millions of records).
Thank you
Select *
from Table
where SID||LINE_ITEM_ID = (
select Max(SID)||Max(LINE_ITEM_ID)
from table
where PART_NUMBER = 'PAU43-IMB-P6'
);
If I understand correctly, for each considered LINE_ITEM_ID you want to return only the one with the largest value for SID. This is a common requirement and, as with most things in SQL, can be written in many different ways; the best performing will depend on many factors, not least of which is the SQL product you are using.
Here's one possible approach:
SELECT DISTINCT * -- use a column list
FROM YourTable AS T1
INNER JOIN (
SELECT T2.LINE_ITEM_ID,
MAX(T2.SID) AS max_SID
FROM YourTable AS T2
GROUP
BY T2.LINE_ITEM_ID
) AS DT1 (LINE_ITEM_ID, max_SID)
ON T1.LINE_ITEM_ID = DT1.LINE_ITEM_ID
AND T1.SID = DT1.max_SID;
That said, I don't recall seeing one that relies on the UNION relational operator. You could easily rewrite the above using the INTERSECT relational operator but it would be more verbose.
Well in my case it worked something like this:
select LINE_ITEM_ID,SID,price_1,part_number from (
(select LINE_ITEM_ID,SID,price_1,part_number from Table where SID = 2)
UNION
(select LINE_ITEM_ID,SID,price_1,part_number from Table where SID = 1 and line_item_id NOT IN (select LINE_ITEM_ID from Table where SID = 2)))
This query solved my issue.
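The UNION approach can be checked against the two cases from the question. The sketch below uses Python's built-in sqlite3 module; the table name items, the helper find_part, the prices, and the second part number ('OTHER-PART') are made up for illustration, and the part_number filter is added so the query answers a single search.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (SID INTEGER, LINE_ITEM_ID INTEGER, price_1 REAL,"
    " part_number TEXT, PRIMARY KEY (SID, LINE_ITEM_ID))"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [
        (1, 100, 5.0, "PAU43-IMB-P6"),  # same item exists under both SIDs
        (2, 100, 6.0, "PAU43-IMB-P6"),
        (1, 200, 7.0, "OTHER-PART"),    # exists only under SID = 1
    ],
)

# SID = 2 rows first; SID = 1 rows only when the same LINE_ITEM_ID has
# no SID = 2 counterpart.
QUERY = """
    SELECT LINE_ITEM_ID, SID, price_1, part_number
    FROM items
    WHERE SID = 2 AND part_number = ?
    UNION
    SELECT LINE_ITEM_ID, SID, price_1, part_number
    FROM items
    WHERE SID = 1 AND part_number = ?
      AND LINE_ITEM_ID NOT IN (SELECT LINE_ITEM_ID FROM items WHERE SID = 2)
"""


def find_part(part_number):
    # The part number is bound twice, once per branch of the UNION.
    return conn.execute(QUERY, (part_number, part_number)).fetchall()


print(find_part("PAU43-IMB-P6"))
print(find_part("OTHER-PART"))
```

The first search returns only the SID = 2 row; the second falls back to the SID = 1 row because no SID = 2 row exists for that item.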