SQLite Update Syntax for Correlated Subquery with Condition - sql

I'm working with GTFS data in a SQLite database. I have a StopTimes table with columns trip_id, stop_sequence, and departure_time (among others). I want to null the last departure_time (in the tuple with the largest stop_sequence) for each trip.
The most obvious way I could think of to do this was this query:
UPDATE StopTimes AS A
SET departure_time = NULL
WHERE NOT EXISTS
(
SELECT * FROM StopTimes B
WHERE B.stop_sequence > A.stop_sequence
)
Unfortunately, it looks like I can't use an alias for a table in an UPDATE statement in SQLite. The query fails with (in Python) "sqlite3.OperationalError: near "AS": syntax error", and this other style isn't allowed in SQLite either:
UPDATE A
SET departure_time = NULL
FROM StopTimes AS A
WHERE NOT EXISTS
(
SELECT * FROM StopTimes B
WHERE B.stop_sequence > A.stop_sequence
)
I've tried a couple other variants, such as using ">= ALL (SELECT stop_sequence... WHERE trip_id = ? ...)", but I can't fill in that question mark.
I've also tried this one, but it doesn't look like this is valid SQL:
UPDATE StopTimes
SET departure_time = NULL
WHERE (trip_id, stop_sequence) IN
(
SELECT trip_id, MAX(stop_sequence)
FROM StopTimes
GROUP BY trip_id
)
How can I reference an outer table's attributes in a subquery in an UPDATE query, with syntax that SQLite will accept? Is there some way I can reformulate my query to get around this limitation?

You cannot put an alias on the table updated in an UPDATE statement.
However, you can rename any other table in a subquery, so the table names will still be unambiguous:
UPDATE StopTimes
SET departure_time = NULL
WHERE NOT EXISTS
(
SELECT 1 FROM StopTimes AS B
WHERE B.stop_sequence > StopTimes.stop_sequence
)

Related

SQL Using subquery in where clause and use values in select

I can not figure out how to get it done where I have a main select list, in which I want to use values which I select in a sub query in where clause..My query have join statements as well..loosely code will look like this
if object_id('tempdb..#tdata') is not null drop table #tdata;
go
create table #tdata(
machine_id varchar(12),
temestamp datetime,
commit_count int,
amount decimal(6,2)
);
if object_id('tempdb..#tsubqry') is not null drop table #tsubqry;
go
--Edit:this is just to elaborate question, it will be a query that
--will return data which I want to use as if it was a temp table
--based upon condition in where clause..hope makes sense
create table #tsubqry(
machine_id varchar(12),
temestamp datetime,
amount1 decimal(6,2),
amount2 decimal(6,2)
);
insert into #tdata select 'Machine1','2018-01-02 13:03:18.000',1,3.95;
insert into #tdata select 'Machine1','2018-01-02 02:11:19.000',1,3.95;
insert into #tdata select 'Machine1','2018-01-01 23:18:16.000',1,3.95;
select m1.machine_id, m1.commit_count,m1.amount,***tsub***.amount1,***tsub***.amount2
from #tdata m1, (select amount1,amount2 from #tsubqry where machine_id=#tdata.machine_id) as ***tsub***
left join sometable1 m2 on m1.machine_id=m2.machine_id;
Edit: I have tried join but am getting m1.timestamp could not be bound as I need to compare these dates as well, here is my join statement
from #tdata m1
left join (
select amount1,amount2 from #tsubqry where cast(temestamp as date)<=cast(m1.temestamp as date)
) tt on m1.machine_id=tt.machine_id
Problem is I want to use some values which has to be brought in from another table matching a criteria of main query and on top of that those values from another table has to be in the column list of main query..
Hope it made some sense.
Thanks in advance
There seems to be several things wrong here but I think I see where you are trying to go with this.
The first thing I think you are missing is is the temestamp on the #tsubqry table. Since you are referencing it later I'm assuming it should be there. So, your table definition needs to include that field:
create table #tsubqry(
machine_id varchar(12),
amount1 decimal(6,2),
amount2 decimal(6,2),
temestamp datetime
);
Now, in your query I think you were trying to use some fields from #tdata in your suquery... Fine in a where clause, but not a from clause.
Also, I'm thinking you will not want to duplicate all the data from #tdata for each matching #tsubqry, so you probably want to group by. Based on these assumptions, I think your query needs to look something like this:
select m1.machine_id, m1.commit_count, m1.amount, sum(tt.amount1), sum(tt.amount2)
from #tdata m1
left join #tsubqry tt on m1.machine_id=tt.machine_id
and cast(tt.temestamp as date)<=cast(m1.temestamp as date)
group by m1.machine_id, m1.commit_count, m1.amount
MS SQL Server actually has a built-in programming construct that I think would be useful here, as an alternative solution to joining on a subquery:
-- # ###
-- # Legends
-- # ###
-- #
-- # Table Name and PrimaryKey changes (IF machine_id is NOT the primary key in table 2,
-- # suggest make one and keep machine_ie column as an index column).
-- #
-- #
-- # #tdata --> table_A
-- # #tsubqry --> table_B
-- #
-- =====
-- SOLUTION 1 :: JOIN on Subquery
SELECT
m1.machine_id,
m1.commit_count,
m1.amount,
m2.amount1,
m2.amount2
FROM table_A m1
INNER JOIN (
SELECT machine_id, amount1, amount2, time_stamp
FROM table_B
) AS m2 ON m1.machine_id = m2.machine_id
WHERE m1.machine_id = m2.machine_id
AND CAST(m2.time_stamp AS DATE) <= CAST(m1.time_stamp AS DATE);
-- SOLUTION 2 :: Use a CTE, which is specific temporary table in MS SQL Server
WITH table_subqry AS
(
SELECT machine_id, amount1, amount2, time_stamp
FROM table_B
)
SELECT
m1.machine_id,
m1.commit_count,
m1.amount,
m2.amount1,
m2.amount2
FROM table_A m1
LEFT JOIN table_subqry AS m2 ON m1.machine_id = m2.machine_id
WHERE m1.machine_id = m2.machine_id
AND CAST(m2.time_stamp AS DATE) <= CAST(m1.time_stamp AS DATE);
Also, I created an SQLFiddle in case it's helpful. I don't know what all your data looks like, but at least this fiddle has your schema and runs the CTE query qithout any errors.
Let me know if you need any more help!
SQL Fiddle
Source: Compare Time SQL Server
SQL SERVER Using a CTE
Cheers.

Hive Query with a large WHERE Condition

I am writing a HIVE query to pull about 2,000 unique keys from a table.
I keep getting this error - java.lang.StackOverflowError
My query is basic but looks like this:
SELECT * FROM table WHERE (Id = 1 or Id = 2 or Id = 3 Id = 4)
my WHERE clause goes all the way up to 2000 unique id's and I receive the error above. Does anyone know of a more efficient way to do this or get this query to work?
Thanks!
You may use the SPLIT and EXPLODE to convert the comma separated string to rows and then use IN or EXISTS.
using IN
SELECT * FROM yourtable t WHERE
t.ID IN
(
SELECT
explode(split('1,2,3,4,5,6,1998,1999,2000',',')) as id
) ;
Using EXISTS
SELECT * FROM yourtable t WHERE
EXISTS
(
SELECT 1 FROM (
SELECT
explode(split('1,2,3,4,5,6,1998,1999,2000',',')) as id
) s
WHERE s.id = t.id
);
Make use of the Between clause instead of specifying all unique ids:
SELECT ID FROM table WHERE ID BETWEEN 1 AND 2000 GROUP BY ID;
i you can create a table for these IDs and after use the condition of exist in the new table to get only your specific IDs

Firebird select from table distinct one field

The question I asked yesterday was simplified but I realize that I have to report the whole story.
I have to extract the data of 4 from 4 different tables into a Firebird 2.5 database and the following query works:
SELECT
PRODUZIONE_T t.CODPRODUZIONE,
PRODUZIONE_T.NUMEROCOMMESSA as numeroco,
ANGCLIENTIFORNITORI.RAGIONESOCIALE1,
PRODUZIONE_T.DATACONSEGNA,
PRODUZIONE_T.REVISIONE,
ANGUTENTI.NOMINATIVO,
ORDINI.T_DATA,
FROM PRODUZIONE_T
LEFT OUTER JOIN ORDINI_T ON PRODUZIONE_T.CODORDINE=ORDINI_T.CODORDINE
INNER JOIN ANGCLIENTIFORNITORI ON ANGCLIENTIFORNITORI.CODCLIFOR=ORDINI_T.CODCLIFOR
LEFT OUTER JOIN ANGUTENTI ON ANGUTENTI.IDUTENTE = PRODUZIONE_T.RESPONSABILEUC
ORDER BY right(numeroco,2) DESC, left(numeroco,3) desc
rows 1 to 500;
However the query returns me double (or more) due to the REVISIONE column.
How do I select only the rows of a single NUMEROCOMMESSA with the maximum REVISIONE value?
This should work:
select COD, ORDER, S.DATE, REVISION
FROM TAB1
JOIN
(
select ORDER, MAX(REVISION) as REVISION
FROM TAB1
Group By ORDER
) m on m.ORDER = TAB1.ORDER and m.REVISION = TAB1.REVISION
Here you go - http://sqlfiddle.com/#!6/ce7cf/4
Sample Data (as u set it in your original question):
create table TAB1 (
cod integer primary key,
n_order varchar(10) not null,
s_date date not null,
revision integer not null );
alter table tab1 add constraint UQ1 unique (n_order,revision);
insert into TAB1 values ( 1, '001/18', '2018-02-01', 0 );
insert into TAB1 values ( 2, '002/18', '2018-01-31', 0 );
insert into TAB1 values ( 3, '002/18', '2018-01-30', 1 );
The query:
select *
from tab1 d
join ( select n_ORDER, MAX(REVISION) as REVISION
FROM TAB1
Group By n_ORDER ) m
on m.n_ORDER = d.n_ORDER and m.REVISION = d.REVISION
Suggestions:
Google and read the classic book: "Understanding SQL" by Martin Gruber
Read Firebird SQL reference: https://www.firebirdsql.org/file/documentation/reference_manuals/fblangref25-en/html/fblangref25.html
Here is yet one more solution using Windowed Functions introduced in Firebird 3 - http://sqlfiddle.com/#!6/ce7cf/13
I do not have Firebird 3 at hand, so can not actually check if there would not be some sudden incompatibility, do it at home :-D
SELECT * FROM
(
SELECT
TAB1.*,
ROW_NUMBER() OVER (
PARTITION BY n_order
ORDER BY revision DESC
) AS rank
FROM TAB1
) d
WHERE rank = 1
Read documentation
https://community.modeanalytics.com/sql/tutorial/sql-window-functions/
https://www.firebirdsql.org/file/documentation/release_notes/html/en/3_0/rnfb30-dml-windowfuncs.html
Which of the three (including Gordon's one) solution would be faster depends upon specific database - the real data, the existing indexes, the selectivity of indexes.
While window functions can make the join-less query, I am not sure it would be faster on real data, as it maybe can just ignore indexes on order+revision cortege and do the full-scan instead, before rank=1 condition applied. While the first solution would most probably use indexes to get maximums without actually reading every row in the table.
The Firebird-support mailing list suggested a way to break out of the loop, to only use a single query: The trick is using both windows functions and CTE (common table expression): http://sqlfiddle.com/#!18/ce7cf/2
WITH TMP AS (
SELECT
*,
MAX(revision) OVER (
PARTITION BY n_order
) as max_REV
FROM TAB1
)
SELECT * FROM TMP
WHERE revision = max_REV
If you want the max revision number in Firebird:
select t.*
from tab1 t
where t.revision = (select max(t2.revision) from tab1 t2 where t2.order = t.order);
For performance, you want an index on tab1(order, revision). With such an index, performance should be competitive with any other approach.

How to fill Joining date and id based on following requirement?

I want to fill the joining date and id by creating a new view and output should be like second image
you might be looking for something like:
UPDATE mytable
SET tofill.ID = fillvalues.ID
,tofill.JOININGDATE = fillvalues.JOININGDATE
FROM mytable tofill
INNER JOIN
( SELECT DISTINCT ID, JOININGDATE, NAME
FROM mytable
WHERE ID IS NOT NULL
AND JOININGDATE IS NOT NULL
) fillvalues
ON tofill.NAME = fillvalues.NAME
WHERE tofill.ID IS NULL
OR tofill.JOININGDATE IS NULL
;
I am not familiar with Oracle, but statement should be teh same or similiar

Tricky MS Access SQL query to remove surplus duplicate records

I have an Access table of the form (I'm simplifying it a bit)
ID AutoNumber Primary Key
SchemeName Text (50)
SchemeNumber Text (15)
This contains some data eg...
ID SchemeName SchemeNumber
--------------------------------------------------------------------
714 Malcolm ABC123
80 Malcolm ABC123
96 Malcolms Scheme ABC123
101 Malcolms Scheme ABC123
98 Malcolms Scheme DEF888
654 Another Scheme BAR876
543 Whatever Scheme KJL111
etc...
Now. I want to remove duplicate names under the same SchemeNumber. But I want to leave the record which has the longest SchemeName for that scheme number. If there are duplicate records with the same longest length then I just want to leave only one, say, the lowest ID (but any one will do really). From the above example I would want to delete IDs 714, 80 and 101 (to leave only 96).
I thought this would be relatively easy to achieve but it's turning into a bit of a nightmare! Thanks for any suggestions. I know I could loop it programatically but I'd rather have a single DELETE query.
See if this query returns the rows you want to keep:
SELECT r.SchemeNumber, r.SchemeName, Min(r.ID) AS MinOfID
FROM
(SELECT
SchemeNumber,
SchemeName,
Len(SchemeName) AS name_length,
ID
FROM tblSchemes
) AS r
INNER JOIN
(SELECT
SchemeNumber,
Max(Len(SchemeName)) AS name_length
FROM tblSchemes
GROUP BY SchemeNumber
) AS w
ON
(r.SchemeNumber = w.SchemeNumber)
AND (r.name_length = w.name_length)
GROUP BY r.SchemeNumber, r.SchemeName
ORDER BY r.SchemeName;
If so, save it as qrySchemes2Keep. Then create a DELETE query to discard rows from tblSchemes whose ID value is not found in qrySchemes2Keep.
DELETE
FROM tblSchemes AS s
WHERE Not Exists (SELECT * FROM qrySchemes2Keep WHERE MinOfID = s.ID);
Just beware, if you later use Access' query designer to make changes to that DELETE query, it may "helpfully" convert the SQL to something like this:
DELETE s.*, Exists (SELECT * FROM qrySchemes2Keep WHERE MinOfID = s.ID)
FROM tblSchemes AS s
WHERE (((Exists (SELECT * FROM qrySchemes2Keep WHERE MinOfID = s.ID))=False));
DELETE FROM Table t1
WHERE EXISTS (SELECT 1 from Table t2
WHERE t1.SchemeNumber = t2.SchemeNumber
AND Length(t2.SchemeName) > Length(t1.SchemeName)
)
Depend on your RDBMS you may use function different from Length (Oracle - length, mysql - length, sql server - LEN)
delete ShortScheme
from Scheme ShortScheme
join Scheme LongScheme
on ShortScheme.SchemeNumber = LongScheme.SchemeNumber
and (len(ShortScheme.SchemeName) < len(LongScheme.SchemeName) or (len(ShortScheme.SchemeName) = len(LongScheme.SchemeName) and ShortScheme.ID > LongScheme.ID))
(SQL Server flavored)
Now updated to include the specified tie resolution. Although, you may get better performance doing it in two queries: first deleting the schemes with shorter names as in my original query and then going back and deleting the higher ID where there was a tie in name length.
I'd do this in multiple steps. Large delete operations done in a single step make me too nervous -- what if you make a mistake? There's no sql 'undo' statement.
-- Setup the data
DROP Table foo;
DROP Table bar;
DROP Table bat;
DROP Table baz;
CREATE TABLE foo (
id int(11) NOT NULL,
SchemeName varchar(50),
SchemeNumber varchar(15),
PRIMARY KEY (id)
);
insert into foo values (714, 'Malcolm', 'ABC123' );
insert into foo values (80, 'Malcolm', 'ABC123' );
insert into foo values (96, 'Malcolms Scheme', 'ABC123' );
insert into foo values (101, 'Malcolms Scheme', 'ABC123' );
insert into foo values (98, 'Malcolms Scheme', 'DEF888' );
insert into foo values (654, 'Another Scheme ', 'BAR876' );
insert into foo values (543, 'Whatever Scheme ', 'KJL111' );
-- Find all the records that have dups, find the longest one
create table bar as
select max(length(SchemeName)) as max_length, SchemeNumber
from foo
group by SchemeNumber
having count(*) > 1;
-- Find the one we want to keep
create table bat as
select min(a.id) as id, a.SchemeNumber
from foo a join bar b on a.SchemeNumber = b.SchemeNumber
and length(a.SchemeName) = b.max_length
group by SchemeNumber;
-- Select into this table all the rows to delete
create table baz as
select a.id from foo a join bat b where a.SchemeNumber = b.SchemeNumber
and a.id != b.id;
This will give you a new table with only records for rows that you want to remove.
Now check these out and make sure that they contain only the rows you want deleted. This way you can make sure that when you do the delete, you know exactly what to expect. It should also be pretty fast.
Then when you're ready, use this command to delete the rows using this command.
delete from foo where id in (select id from baz);
This seems like more work because of the different tables, but it's safer probably just as fast as the other ways. Plus you can stop at any step and make sure the data is what you want before you do any actual deletes.
If your platform supports ranking functions and common table expressions:
with cte as (
select row_number()
over (partition by SchemeNumber order by len(SchemeName) desc) as rn
from Table)
delete from cte where rn > 1;
try this:
Select * From Table t
Where Len(SchemeName) <
(Select Max(Len(Schemename))
From Table
Where SchemeNumber = t.SchemeNumber )
And Id >
(Select Min (Id)
From Table
Where SchemeNumber = t.SchemeNumber
And SchemeName = t.SchemeName)
or this:,...
Select * From Table t
Where Id >
(Select Min(Id) From Table
Where SchemeNumber = t.SchemeNumber
And Len(SchemeName) <
(Select Max(Len(Schemename))
From Table
Where SchemeNumber = t.SchemeNumber))
if either of these selects the records that should be deleted, just change it to a delete
Delete
From Table t
Where Len(SchemeName) <
(Select Max(Len(Schemename))
From Table
Where SchemeNumber = t.SchemeNumber )
And Id >
(Select Min (Id)
From Table
Where SchemeNumber = t.SchemeNumber
And SchemeName = t.SchemeName)
or using the second construction:
Delete From Table t Where Id >
(Select Min(Id) From Table
Where SchemeNumber = t.SchemeNumber
And Len(SchemeName) <
(Select Max(Len(Schemename))
From Table
Where SchemeNumber = t.SchemeNumber))