Flagging records that meet a specific condition using SQL on a Firebird Database - sql

I am currently working with a Firebird database and I am trying to identify/flag records that have a zero pos_buildcount value and where the record immediately before the zero record has a pos_buildcount value that is not 255. I can do it in Excel, but I want to do it in a SQL query because Excel can only handle a limited number of records. Essentially I want my results to look like the following image:
I have tried the following links to select individual records, but most solutions use an id number, which my database doesn't have (I don't know why), or use the row_number() command, which Firebird does not have.
Selecting the last record that meets a condition
Is there a way to access the “previous row” value in a SELECT statement?

use an id number which my database doesn't have
Then add it.
SQL is a language of sets: https://en.wikipedia.org/wiki/Set_(mathematics)
By definition, the elements of a set are not located to the left or right of one another, to the north or south, or before or after. They just exist, in no particular order.
So, if you want your rows ordered, you have to ADD some ordering value (field, column) to the table and then populate it. Then, when reading from the table, you can ask for the results to be ordered by that value.
As of now, Firebird has every right to read rows in any order it likes, and even to change the order of the rows on disk (it will not, but that is an implementation detail and may change in the future).
You have to add an ID column and then populate it.
This would make "last record", "prior record" or "row above" meaningful idioms: "the object with an ID one less than the ID of the current object". As of now, in SQL terms, "last record" has no meaning, and reliably formulating a query is not possible.
After that, changing the flag column in the table becomes a trivial MERGE statement.
MERGE INTO MyTable T1
USING MyTable T2
-- T2 is the row just before T1; T1 itself must be a zero record
ON (T2.ID = T1.ID - 1) AND (T2.pos_buildcount <> 255)
   AND (T1.pos_buildcount = 0)
WHEN MATCHED THEN
UPDATE SET T1.flag = 1
http://www.firebirdsql.org/file/documentation/reference_manuals/fblangref25-en/html/fblangref25-dml-merge.html
Or, without adding the flag column to the table, you can make a "virtual" column in the query by joining the table to itself.
https://en.wikipedia.org/wiki/Join_(SQL)#Left_outer_join
SELECT T1.*, T2.Vehicle as Flag
FROM MyTable T1
LEFT JOIN MyTable T2
-- Flag is non-NULL only for zero records whose prior row is not 255
ON (T2.ID = T1.ID - 1) AND (T2.pos_buildcount <> 255)
   AND (T1.pos_buildcount = 0)
Granted, this only works if the ID column is filled with integers with no gaps. Otherwise "prior record" needs a definition other than T2.ID = T1.ID - 1, but the idea holds: define what "prior" means in terms of real data in real columns, and then you can compare the table with itself.
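The self-join idea can be tried out on a toy table. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for Firebird, the sample rows are invented, and it assumes a gap-free integer ID column as discussed above. The flag is computed with a CASE expression, which is an illustrative choice rather than something taken from the question.

```python
import sqlite3

def flag_zero_after_non_255(rows):
    """rows: (ID, pos_buildcount) pairs with gap-free integer IDs (assumed).
    Returns (ID, pos_buildcount, flag) tuples ordered by ID."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE MyTable (ID INTEGER PRIMARY KEY, pos_buildcount INTEGER)"
    )
    con.executemany("INSERT INTO MyTable VALUES (?, ?)", rows)
    result = con.execute(
        """
        SELECT T1.ID, T1.pos_buildcount,
               -- flag a zero record only when a prior non-255 row was joined
               CASE WHEN T1.pos_buildcount = 0 AND T2.ID IS NOT NULL
                    THEN 1 ELSE 0 END AS flag
        FROM MyTable T1
        LEFT JOIN MyTable T2
          ON T2.ID = T1.ID - 1 AND T2.pos_buildcount <> 255
        ORDER BY T1.ID
        """
    ).fetchall()
    con.close()
    return result

# Invented sample data: row 4 follows a 255 row, so it is not flagged.
sample = [(1, 254), (2, 0), (3, 255), (4, 0), (5, 0)]
flagged = flag_zero_after_non_255(sample)
```

With this data, rows 2 and 5 are flagged, because each is a zero record whose predecessor is not 255.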

Related

How does a DELETE FROM with a SELECT in the WHERE work?

I am looking at an application and I found this SQL:
DELETE FROM Phrase
WHERE Modified < (SELECT Modified FROM PhraseSource WHERE Id = Phrase.PhraseId)
The intention of the SQL is to delete rows from Phrase where there are more recent rows in the PhraseSource table.
Now I know the tables Phrase and PhraseSource have the same columns and Modified holds the number of seconds since 1970, but I cannot understand how/why this works or what it is doing. When I look at it, it seems that on the left of the < there is just one column, while on the right side of the < there would be many rows. Does it even make any sense?
The two tables are identical and have the following structure
Id - GUID primary key
...
...
...
Modified int
the ... columns are about ten columns containing text and numeric data. The PhraseSource table may or may not contain more recent rows with a higher number in the Modified column and different text and numeric data.
The SELECT statement in parentheses is a sub-query (nested query).
What happens is that for each row in Phrase, the Modified column value is compared with the result of the sub-query (which is run once for each of the rows in the Phrase table).
The sub-query has a WHERE clause, so it finds the row whose Id matches the PhraseId of the Phrase row currently being evaluated and returns its Modified value (a single row, and in fact a single scalar value).
The two Modified values are compared, and if the Phrase row was modified before the corresponding row in PhraseSource, it is deleted.
As you can see, this approach is not efficient, because it requires the database to run a separate query for each row in the Phrase table (although I imagine that some databases might be smart enough to optimize this a little bit).
A better solution
The more efficient solution would be to use INNER JOIN:
DELETE p FROM Phrase p
INNER JOIN PhraseSource ps
ON p.PhraseId=ps.Id
WHERE p.Modified < ps.Modified
This should do the exact same thing as your query, but using the efficient JOIN mechanism (this DELETE ... JOIN form is accepted by MySQL and SQL Server, but it is not standard SQL). INNER JOIN uses the ON clause to choose how to "match" rows in two different tables (which is done very efficiently by the DB) and then compares the Modified values of the matching rows.
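The behaviour described in this answer can be checked on a small dataset. This is a sketch using Python's sqlite3 module with trimmed-down, invented tables (only the columns the query touches). SQLite does not accept the DELETE ... JOIN syntax, so an EXISTS form is swapped in here as the equivalent of the join version.

```python
import sqlite3

def remaining_ids(delete_sql):
    """Build the toy tables, run one DELETE, return the surviving PhraseIds."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE PhraseSource (Id TEXT PRIMARY KEY, Modified INTEGER);
        CREATE TABLE Phrase (PhraseId TEXT PRIMARY KEY, Modified INTEGER);
        INSERT INTO PhraseSource VALUES ('a', 200), ('b', 100);
        INSERT INTO Phrase VALUES ('a', 100), ('b', 100), ('c', 50);
        """
    )
    con.execute(delete_sql)
    ids = [r[0] for r in con.execute("SELECT PhraseId FROM Phrase ORDER BY PhraseId")]
    con.close()
    return ids

# The query from the question: the subquery yields one scalar per Phrase row.
CORRELATED = """
    DELETE FROM Phrase
    WHERE Modified < (SELECT Modified FROM PhraseSource
                      WHERE Id = Phrase.PhraseId)
"""

# Stand-in for the join version, written with EXISTS for SQLite.
EXISTS_FORM = """
    DELETE FROM Phrase
    WHERE EXISTS (SELECT 1 FROM PhraseSource ps
                  WHERE ps.Id = Phrase.PhraseId
                    AND Phrase.Modified < ps.Modified)
"""

after_correlated = remaining_ids(CORRELATED)
after_exists = remaining_ids(EXISTS_FORM)
```

Row 'a' is deleted because its source is newer. Row 'b' stays because the timestamps are equal, and row 'c' stays because it has no source row: the subquery returns NULL and the comparison is not true.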

Behaviour of SQL update using one-to-many join

Imagine I have two tables, t1 and t2. t1 has two fields, one containing unique values called a and another field called value. Table t2 has a field that does not contain unique values called b and a field also called value.
Now, if I use the following update query (this is using MS Access btw):
UPDATE t1
INNER JOIN t2 ON t1.a=t2.b
SET t1.value=t2.value
If I have the following data
t1                  t2
a   | value         b   | value
------------        ------------
'm' | 0.0           'm' | 1.1
                    'm' | 0.2
and run the query, what value ends up in t1.value? I ran some tests but couldn't find consistent behaviour, so I'm guessing it might just be undefined. Or is this kind of update query something that just shouldn't be done? There is a long boring story about why I've had to do it this way, but it's irrelevant to the technical nature of my enquiry.
This is known as a non-deterministic query: exactly as you have found, you can run it multiple times with no changes to the query or the underlying data and get different results.
In practice, the value will be updated with the last record encountered, so in your case it will be updated twice, with the first update overwritten by the last. What you have absolutely no control over is the order in which the SQL engine accesses the records. It will access them in whatever order it deems fit: this could be a simple clustered index scan from the beginning, or it could use other indexes and access the clustered index in a different order. You have no way of knowing this. It is quite likely that running the update multiple times would yield the same result, because with no changes to the data the SQL optimiser will use the same query plan. But again there is no guarantee, so you should not rely on a non-deterministic query to get deterministic results.
EDIT
To update the value in T1 to the Maximum corresponding value in T2 you can use DMax:
UPDATE T1
SET Value = DMax("Value", "T2", "b='" & T1.a & "'");
When you execute the query as you’ve indicated, the “value” that ends up in “t1” for the row ‘m’ will be, effectively, random, due to the fact that “t2” has multiple rows for the identity value ‘m’.
Unless you explicitly specify that you want the maximum (max function), minimum (min function) or some other aggregate of the collection of rows with the identity ‘m’, the database has no way to make a defined choice and returns whatever value it first comes across, hence the inconsistent behaviour.
Hope this helps.
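Both answers point to the same fix: pick the value with an aggregate so the result no longer depends on row order. A minimal sketch of that, using Python's sqlite3 module instead of Access (so DMax is replaced by a correlated MAX subquery), with an extra row 'n' added purely for illustration:

```python
import sqlite3

def update_with_max():
    """Set t1.value to the largest matching t2.value; leave unmatched rows alone."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE t1 (a TEXT PRIMARY KEY, value REAL);
        CREATE TABLE t2 (b TEXT, value REAL);
        INSERT INTO t1 VALUES ('m', 0.0), ('n', 5.0);  -- 'n' has no match in t2
        INSERT INTO t2 VALUES ('m', 1.1), ('m', 0.2);
        """
    )
    con.execute(
        """
        UPDATE t1
        SET value = (SELECT MAX(t2.value) FROM t2 WHERE t2.b = t1.a)
        -- without this guard, unmatched rows would be set to NULL
        WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.b = t1.a)
        """
    )
    rows = con.execute("SELECT a, value FROM t1 ORDER BY a").fetchall()
    con.close()
    return rows

updated = update_with_max()
```

Because MAX collapses the two 'm' rows into one value, the update gives the same answer whatever order the engine reads t2 in.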

Is there a way to include a query that is non updateable in an UPDATE query? [duplicate]

This question already has an answer here:
Access SQL Update One Table In Join Based on Value in Same Table
(1 answer)
Closed 10 years ago.
For the following query:
UPDATE tempSpring_ASN AS t
SET t.RECORD_TYPE = (
SELECT TOP 1 RECORD_TYPE
FROM (
SELECT "A" AS RECORD_TYPE
FROM TABLE5
UNION ALL
SELECT "B" AS RECORD_TYPE
FROM TABLE5
)
);
I'm getting, "Operation must use an updateable query." I don't understand. I'm not trying to update a union query. I'm just trying to update an otherwise updatable recordset with the output (single value) of a union query.
(The solution provided at Access SQL Update One Table In Join Based on Value in Same Table (which is also provided below) does not work for this situation, contrary to what is indicated on the top of this page.)
This question is a reference to a previous question, data and code examples posted here:
Access SQL Update One Table In Join Based on Value in Same Table
Hi AYS,
In Access, an Update query needs to be run on a table.
As a UNION query is a combination of multiple sets of records, the result set is no longer a table, and cannot be the object of an Update query as the records in the result set are no longer uniquely identified with any one particular table (even if they theoretically could be). Access is hard-coded to treat every UNION query as read-only, which makes sense when there are multiple underlying tables. There are a number of other conditions (such as a sub-query in the SELECT statement) that also trigger this condition.
Think of it this way: if you were not using TOP 1 and your UNION query returned multiple results, how would JET know which result to apply to the unique record in your table? As such, JET treats all such cases the same.
Unfortunately, this is the case even when all of the data is being derived from the same table. In this case, it is likely that the JET optimizer is simply not smart enough to realize that this is the case and re-phrase the query in a manner that does not use UNION.
In this case, you can still get what you want by re-stating your query in such a way that everything references your base table. For example, you can use the following as a SELECT query to get the PO_NUM value of the previous SHP_CUSTOM_5 record:
SELECT
    t1.SHP_CUSTOM_5
    , t1.PO_NUM
    , t1.SHP_CUSTOM_5 - 1 AS PREV_RECORD
    , (SELECT
           t2.PO_NUM
       FROM
           tempSpring_ASN AS t2
       WHERE
           t2.SHP_CUSTOM_5 = (t1.SHP_CUSTOM_5 - 1)
      ) AS PREV_PO
FROM
    tempSpring_ASN AS t1
;
You can then phrase this as an Update query as follows in order to perform the "LIN" updates:
UPDATE
    tempSpring_ASN AS t1
SET
    t1.RECORD_TYPE = "LIN"
WHERE
    t1.PO_NUM =
    (
        SELECT
            t2.PO_NUM
        FROM
            tempSpring_ASN AS t2
        WHERE
            t2.SHP_CUSTOM_5 = (t1.SHP_CUSTOM_5 - 1)
    )
;
This code was successful in the tests I ran with dummy data.
Regarding your "HDR" updates, you are really performing two separate updates.
1) If the PO_NUM matches the previous record's PO_NUM, set RECORD_TYPE to "LIN"
2) If it is the first record, set RECORD_TYPE to "HDR"
It is not clear to me why there would be a benefit to performing these actions within one query. I would recommend performing the HDR update using the "TOP 1" by SHP_CUSTOM_5 method you used in your original SELECT query example, as this will be a relatively simple UPDATE query. It is possible to use IIF() within an Update query, but I do not know what additional benefit you would gain from the additional time and complexity that would be required (it would most likely only be much less readable).
Best of luck!
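The two separate updates described in this answer can be sketched as follows. This uses Python's sqlite3 module rather than Access, the sample rows are invented, and the "first record" is picked with MIN(SHP_CUSTOM_5) instead of TOP 1, which is a substitution made for SQLite.

```python
import sqlite3

def classify_records(rows):
    """rows: (SHP_CUSTOM_5, PO_NUM) pairs with consecutive SHP_CUSTOM_5 values.
    Returns (SHP_CUSTOM_5, RECORD_TYPE) tuples after both updates."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE tempSpring_ASN ("
        "SHP_CUSTOM_5 INTEGER PRIMARY KEY, PO_NUM TEXT, RECORD_TYPE TEXT)"
    )
    con.executemany(
        "INSERT INTO tempSpring_ASN (SHP_CUSTOM_5, PO_NUM) VALUES (?, ?)", rows
    )
    # Update 1: same PO_NUM as the previous record -> "LIN".
    con.execute(
        """
        UPDATE tempSpring_ASN
        SET RECORD_TYPE = 'LIN'
        WHERE PO_NUM = (SELECT t2.PO_NUM
                        FROM tempSpring_ASN AS t2
                        WHERE t2.SHP_CUSTOM_5 = tempSpring_ASN.SHP_CUSTOM_5 - 1)
        """
    )
    # Update 2: the first record overall -> "HDR".
    con.execute(
        """
        UPDATE tempSpring_ASN
        SET RECORD_TYPE = 'HDR'
        WHERE SHP_CUSTOM_5 = (SELECT MIN(SHP_CUSTOM_5) FROM tempSpring_ASN)
        """
    )
    result = con.execute(
        "SELECT SHP_CUSTOM_5, RECORD_TYPE FROM tempSpring_ASN ORDER BY SHP_CUSTOM_5"
    ).fetchall()
    con.close()
    return result

sample_rows = [(1, "PO1"), (2, "PO1"), (3, "PO1"), (4, "PO2"), (5, "PO2")]
classified = classify_records(sample_rows)
```

In this sample, row 4 starts a new PO_NUM and is touched by neither update, so its RECORD_TYPE stays NULL. Whether such rows should also become "HDR" depends on the original requirements, which are not restated here.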

Using correlated subquery in SQL Server update statement gives unexpected result

I'm introducing a primary key column to a table that doesn't have one yet. After I have added a normal field Id (int) with a default value of 0 I tried using the following update statement to create unique values for each record:
update t1
set t1.id = (select count(*) from mytable t2 where t2.id <> t1.id)
from mytable t1
I would expect the subquery to be executed for each row because I'm referencing t1. Each time the subquery would be executed the count should be one less but it doesn't work.
The result is that Id is still 0 in every record. I have used this before on other DBMS with success. I'm using SQL Server 2008 here.
How do I generate unique values for each record and update the Id field?
Trying to explain why it doesn't work as you expect:
I would expect the subquery to be executed for each row because I'm referencing t1.
It is executed and it can affect all rows. But an UPDATE statement is one statement, and it is executed as one statement that affects a whole table (or a part of it if you have a WHERE clause).
Each time the subquery would be executed the count should be one less but it doesn't work.
You are expecting the UPDATE to be executed with one evaluation of the subquery per row. But it is one statement that is first evaluated - for all affected rows - and then the rows are changed (updated). (A DBMS may do it otherwise but the result should be nonetheless as if it was doing it this way).
The result is that Id is still 0 in every record.
That's the correct and expected behaviour of this statement when all rows have the same 0 value before execution. The COUNT(*) is 0.
I have used this before on other DBMS with success.
My "wild" guess is that you have used it in MySQL. (Correction/Update: my guess was wrong, this syntax for Update is not valid for MySQL, apparently the query was working "correctly" in Firebird). The UPDATE does not work in the standard way in that DBMS. It works - as you have learned - row by row, not with the full table.
I'm using SQL Server 2008 here.
This DBMS works correctly with UPDATE. You can write a different Update statement that would have the wanted results or, even better, use an autogenerated IDENTITY column, as others have advised.
The SQL sets every row's Id to the number of records whose Id differs from its own. As every row's Id equals 0, no rows differ, so the count is 0 for every row and nothing appears to change.
Try looking at this answer here:
Adding an identity to an existing column
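A counting subquery can still produce unique numbers if it is correlated on something that is already unique before the update starts. The sketch below shows this with Python's sqlite3 module, leaning on SQLite's implicit rowid. That is an assumption specific to SQLite: SQL Server has no rowid, and there an IDENTITY column, as suggested above, is the more direct route.

```python
import sqlite3

def assign_sequential_ids(names):
    """Insert rows with id defaulting to 0, then number them 1..n by rowid."""
    con = sqlite3.connect(":memory:")
    # id is a plain column here, so SQLite keeps a separate implicit rowid.
    con.execute("CREATE TABLE mytable (name TEXT, id INTEGER DEFAULT 0)")
    con.executemany("INSERT INTO mytable (name) VALUES (?)", [(n,) for n in names])
    con.execute(
        """
        UPDATE mytable
        SET id = (SELECT COUNT(*)
                  FROM mytable AS t2
                  -- rowid is unique and is not changed by this update
                  WHERE t2.rowid <= mytable.rowid)
        """
    )
    ids = [r[0] for r in con.execute("SELECT id FROM mytable ORDER BY rowid")]
    con.close()
    return ids

new_ids = assign_sequential_ids(["x", "y", "z"])
```

The count depends only on rowid, never on the id values being written, so the result is the same whether the engine evaluates the statement set-wise or row by row.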

Access: DELETE query with WHERE <> clause: "At most one record can be returned by this subquery"

I want to run this DELETE query:
DELETE * FROM Table1 WHERE Table1.ID <> (SELECT Table1.ID FROM Table1 WHERE ....)
The query in brackets returns all the IDs I want to keep in Table1 (this query works on its own, I tested it). But as soon as I add the DELETE part I get the following error: "At most one record can be returned by this subquery". I then tried this code instead:
DELETE * FROM Table1 WHERE Table1.ID NOT IN (SELECT Table1.ID FROM Table1 WHERE ....)
But now my database hangs and doesn't do anything anymore...
Thank you for your help!
Actually, * is not necessary in DELETE statements, because DELETE removes the entire row that matches the WHERE condition.
Generally, <> (not equal to) is used to compare against a single value, as below:
DELETE FROM Table1 WHERE Table1.ID <> 1
But your subquery returns multiple rows/records (with the selected columns). Hence you got the error: At most one record can be returned by this subquery
As for the NOT IN clause that you are using, it is the right way to do it, but NOT IN can become slow when the subquery returns a large number of rows, as NOT IN effectively does a join under the hood.
The best way in your case is to use NOT(condition), as you already know the condition that selects the table IDs you want to keep, as below:
DELETE FROM Table1 WHERE NOT(condition)
This does the job quickly, as no join is involved, unlike in the earlier case.
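To see that the two forms delete the same rows, here is a small sketch using Python's sqlite3 module. The Status column and the condition Status = 'keep' are hypothetical placeholders, since the real condition is elided in the question.

```python
import sqlite3

def ids_after_delete(delete_sql):
    """Build a toy Table1, run one DELETE, return the surviving IDs."""
    con = sqlite3.connect(":memory:")
    # Status is a hypothetical column standing in for the elided condition.
    con.execute("CREATE TABLE Table1 (ID INTEGER PRIMARY KEY, Status TEXT NOT NULL)")
    con.executemany(
        "INSERT INTO Table1 VALUES (?, ?)",
        [(1, "keep"), (2, "drop"), (3, "keep"), (4, "drop")],
    )
    con.execute(delete_sql)
    ids = [r[0] for r in con.execute("SELECT ID FROM Table1 ORDER BY ID")]
    con.close()
    return ids

NOT_IN_FORM = """
    DELETE FROM Table1
    WHERE ID NOT IN (SELECT ID FROM Table1 WHERE Status = 'keep')
"""
NOT_CONDITION_FORM = "DELETE FROM Table1 WHERE NOT (Status = 'keep')"

kept_not_in = ids_after_delete(NOT_IN_FORM)
kept_not_condition = ids_after_delete(NOT_CONDITION_FORM)
```

One caveat: the two forms can differ when the condition evaluates to NULL for some row. NOT(condition) is then also NULL and the row is kept, whereas NOT IN deletes it because its ID is missing from the keep-list. The NOT NULL constraint in this sketch avoids that case.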