I have a table that I fill dynamically with data I want to create some statistics for. One column contains values that follow a certain pattern, so I created an additional column where I map those values to other values, which lets me group them.
Now, before I run my statistics, I need to check whether I have to remap these values, which means checking whether there are null values in that column.
I can do a select like this:
select distinct 1
from my_table t
where t.status_rd is not null
;
The disadvantage is that this returns exactly one row, but it has to perform a full select. Is there some way to stop the select at the first match? I'm not interested in the exact row: when there is at least one, I have to run an update on all of them anyway, but I would like to avoid running the update unnecessarily every time.
In Oracle I would do it with rownum, but rownum doesn't exist in SQLite:
select 1
from my_table t
where t.status_rd is not null
and rownum <= 1
;
Use LIMIT 1 to select the first row returned:
SELECT 1
FROM my_table t
WHERE t.status_rd IS NULL
LIMIT 1
Note: I changed the where clause from IS NOT NULL to IS NULL based on your problem description. This may or may not be correct.
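If you would rather always get back exactly one row (1 or 0) instead of one row or none, wrapping the check in EXISTS is an alternative; SQLite stops scanning at the first match here as well. A small sketch, using the same table and column:
SELECT EXISTS (
    SELECT 1
    FROM my_table t
    WHERE t.status_rd IS NULL
);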
Related
A common mistake when writing update statements is to forget the where clause, or to write it incorrectly, so that more rows than expected get updated. Is there a way to specify in the update statement itself that it should only update one row (and to fail if it would update more)?
Catching an error in the number of rows updated requires thinking ahead: using a transaction, first rewriting the statement as a select to check the row count, and then actually catching the error. It would be useful to be able to state the expected number of rows in one place.
Combining a few facts, I found a working solution for Postgres.
A select will fail when it uses = to compare against a subquery that returns more than one row (where x = (select ...)).
Values can be returned from an update statement, using the returning clause. An update cannot be used as a subquery, but it can be used as a CTE, which can be used in a subquery.
Example:
create table foo (id int not null primary key, x int not null);
insert into foo (id, x) values (1,5), (2,5);
with updated as (update foo set x = 4 where x = 5 returning id)
select id from foo where id = (select id from updated);
The query containing the update fails with ERROR: more than one row returned by a subquery used as an expression, and the updates are not applied. If the update's where clause is adjusted to only match one row, the update succeeds.
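For instance, narrowing the update's where clause to the primary key makes the subquery return at most one row, so the same statement succeeds:
create table foo (id int not null primary key, x int not null);
insert into foo (id, x) values (1,5), (2,5);
with updated as (update foo set x = 4 where id = 1 returning id)
select id from foo where id = (select id from updated);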
I have a database table with about 100 columns (bulky, I know). About half of these columns need to be updated iteratively, setting Null or "" values to "TBD".
I compiled all 50-some columns that need updating into one update query, with Access SQL code that looked something like this...
UPDATE tablename
SET tablename.column1="TBD", tablename.column2="TBD", tablename.column3="TBD"....
WHERE tablename.column1 Is Null OR tablename.column1="" OR tablename.column2 Is Null OR tablename.column2="" OR tablename.column3 Is Null OR tablename.column3=""....
Two issues: first, with 50 columns this query triggers a "query is too complex" error.
Second, the query is also functionally wrong: I'm losing data in these columns because of the WHERE clause. As soon as any one column matches one of the OR conditions, every listed column in that record gets overwritten, including values I did not want to update.
My question is how can I go about updating all of these columns and setting their null or empty values to a particular value (in this case, "TBD")?
I know that I can just use a select query to pull the columns I need to update, run it, and use CTRL+H to find & replace "" with "TBD". However, I'm worried about the potential for this to introduce errors into my dataset. I also know I could go through column by column and update these values with individual update queries. However, this would be quite time consuming with 50+ columns and the iterative updates I need to run on the entire dataset.
I'm leaning towards this latter route, but I'm still wondering whether there are any other scripted options I can build into a query to overcome this issue, and that leads me here to you.
Thank you!
You could just run 50 queries:
UPDATE table SET column1="TBD" WHERE column1 IS NULL OR column1 = "";
An optimization could be:
Create a temporary table that determines which rows actually need an update: concatenate all column values such that a single Null or empty value results in a record in your temp table. This way you only have to scan the base table once.
Use the keys from that table to focus the updates on those rows only (a rough sketch follows below).
That is safe and only updates your empty values (whereas your previous query would have updated all columns unless you had checked every value first, as the IIf approach below does).
This query style also does not run into the "query is too complex" issue.
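A rough sketch of the temp-table idea in Access SQL, assuming the table has a primary key named ID (the temp-table name is hypothetical, and Nz is only available when the query runs inside Access). The two statements must be run as separate queries:
SELECT tablename.ID INTO TempNeedsUpdate
FROM tablename
WHERE Len(Nz(tablename.column1, "")) = 0
   OR Len(Nz(tablename.column2, "")) = 0;

UPDATE tablename INNER JOIN TempNeedsUpdate
   ON tablename.ID = TempNeedsUpdate.ID
SET tablename.column1 = IIf(Len(Nz(tablename.column1, "")) = 0, "TBD", tablename.column1);
Extended to all 50 columns, only the make-table query carries the long OR list; each per-column update then only touches rows known to need work.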
You could issue one query as:
UPDATE tablename
SET column1 = iif(column1 is null or column1 = "", "TBD", column1),
column2 = iif(column2 is null or column2 = "", "TBD", column2),
. . .;
Because each iif() returns the existing value when the column is already filled, a where clause isn't strictly needed; if you don't mind potentially updating all rows, you can leave it out, as above.
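If you do want a where clause so that only rows with at least one empty value are touched, it would mirror the same conditions (a sketch for two columns; with all 50 columns this long OR list may hit the same "query is too complex" limit as before):
UPDATE tablename
SET column1 = iif(column1 is null or column1 = "", "TBD", column1),
    column2 = iif(column2 is null or column2 = "", "TBD", column2)
WHERE (column1 is null or column1 = "")
   OR (column2 is null or column2 = "");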
For a table like this one:
CREATE TABLE Users(
id SERIAL PRIMARY KEY,
name TEXT UNIQUE
);
What would be the correct one-query insert for the following operation:
Given a user name, insert a new record and return the new id. But if the name already exists, just return the id.
I am aware of the new syntax within PostgreSQL 9.5 for ON CONFLICT(column) DO UPDATE/NOTHING, but I can't figure out how, if at all, it can help, given that I need the id to be returned.
It seems that RETURNING id and ON CONFLICT do not belong together.
The UPSERT implementation is hugely complex in order to be safe against concurrent write access. Take a look at the Postgres Wiki page that served as a log during initial development. The Postgres hackers decided not to include "excluded" rows in the RETURNING clause for the first release in Postgres 9.5. They might build something in for a later release.
This is the crucial statement in the manual to explain your situation:
"The syntax of the RETURNING list is identical to that of the output list of SELECT. Only rows that were successfully inserted or updated will be returned. For example, if a row was locked but not updated because an ON CONFLICT DO UPDATE ... WHERE clause condition was not satisfied, the row will not be returned."
The emphasis belongs on the second sentence: only rows that were successfully inserted or updated are returned.
For a single row to insert:
Without concurrent write load on the same table
WITH ins AS (
   INSERT INTO users(name)
   VALUES ('new_usr_name')   -- input value
   ON CONFLICT(name) DO NOTHING
   RETURNING users.id
   )
SELECT id FROM ins
UNION ALL
SELECT id FROM users         -- 2nd SELECT never executed if INSERT successful
WHERE name = 'new_usr_name'  -- input value a 2nd time
LIMIT 1;
With possible concurrent write load on the table
Consider this instead (for single row INSERT):
Is SELECT or INSERT in a function prone to race conditions?
To insert a set of rows:
How to use RETURNING with ON CONFLICT in PostgreSQL?
How to include excluded rows in RETURNING from INSERT ... ON CONFLICT
All three with very detailed explanation.
For a single row insert and no update:
with i as (
   insert into users (name)
   select 'the name'
   where not exists (
      select 1
      from users
      where name = 'the name'
   )
   returning id
)
select id
from users
where name = 'the name'
union all
select id from i
From the manual, about the primary query and the WITH sub-statements:
The primary query and the WITH queries are all (notionally) executed at the same time
Although that sounds like "same snapshot" to me, I'm not sure, since I don't know what notionally means in that context.
But there is also:
The sub-statements in WITH are executed concurrently with each other and with the main query. Therefore, when using data-modifying statements in WITH, the order in which the specified updates actually happen is unpredictable. All the statements are executed with the same snapshot
If I understand correctly, that same snapshot bit prevents a race condition. But again, I'm not sure whether by all the statements it refers only to the statements in the WITH subqueries, excluding the main query. To avoid any doubt, move the select from the previous query into a WITH subquery:
with s as (
   select id
   from users
   where name = 'the name'
), i as (
   insert into users (name)
   select 'the name'
   where not exists (select 1 from s)
   returning id
)
select id from s
union all
select id from i
I'm phrasing the question title poorly, as I'm not sure what to call what I'm trying to do, but it really should be simple.
I've a link / join table with two ID columns. I want to run a check before saving new rows to the table.
The user can save attributes through a webpage, but I need to check that the same combination doesn't already exist before saving it. With one record it's easy: you just check whether that attributeId is already in the table, and if it is, don't allow them to save it again.
However, if the user chooses a combination of that attribute and another one, then they should be allowed to save it.
An example of what I mean: if a user now tried to save an attribute with an ID of 1, it should stop them; but I need it to also stop them if they tried IDs of 1 and 10, so long as both 1 and 10 had the same productAttributeId.
I'm muddling this in my explanation, but I hope the example clarifies what I need to do.
This should be simple so I presume I'm missing something.
If I understand the question properly, you want to prevent the combination of AttributeId and ProductAttributeId from being reused. If that's the case, simply make them a combined primary key, which is by nature UNIQUE.
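A minimal sketch of that constraint, assuming SQL Server and the table/column names used further down (both columns must be NOT NULL for a primary key; otherwise a UNIQUE constraint does the same job):
ALTER TABLE MyJoinTable
ADD CONSTRAINT PK_MyJoinTable PRIMARY KEY (AttributeId, ProductAttributeId);
Any attempt to insert a duplicate pair then fails with a key violation.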
If that's not feasible, create a stored procedure that runs a query against the join for instances of the AttributeId. If the query returns 0 instances, insert the row.
Here's some light code to present the idea (may need to be modified to work with your database):
DECLARE @cnt int;
-- capture the count itself; @@ROWCOUNT after a SELECT COUNT(1) would always be 1
SELECT @cnt = COUNT(1) FROM MyJoinTable WHERE AttributeId = @RequestedID;
IF @cnt = 0
BEGIN
    INSERT INTO MyJoinTable ...
END
You can control your inserts via a stored procedure. My understanding is that
users can select a combination of Attributes, such as
just 1
1 and 10 together
1,4,5,10 (4 attributes)
These need to enter the table as a single "batch" against a (new?) productAttributeId
So if (1,10) was chosen, this needs to be blocked because 1-2 and 10-2 already exist.
What I suggest
The stored procedure should take the attributes as a single list, e.g. '1,2,3' (comma separated, no spaces, just integers)
You can then use a string splitting UDF or an inline XML trick (as shown below) to break it into rows of a derived table.
Test table
create table attrib (attributeid int, productattributeid int)
insert attrib select 1,1
insert attrib select 1,2
insert attrib select 10,2
Here I use a variable, but you can incorporate as a SP input param
declare @t nvarchar(max) set @t = '1,2,10'

select top(1)
       t.productattributeid,
       count(t.productattributeid) count_attrib,
       count(*) over () count_input
from   (select convert(xml,'<a>' + replace(@t,',','</a><a>') + '</a>') x) x
cross apply x.x.nodes('a') n(c)
cross apply (select n.c.value('.','int')) a(attributeid)
left join attrib t on t.attributeid = a.attributeid
group by t.productattributeid
order by count_attrib desc
Output
productattributeid count_attrib count_input
2 2 3
The 1st column gives you the productattributeid that has the most matches
The 2nd column gives you how many attributes were matched using the same productattributeid
The 3rd column is how many attributes exist in the input
If you compare the last two columns:
match - you can use the productattributeid to attach to the product which has all these attributes
don't match - you need to do an insert to create a new combination
I am using an Oracle database.
While inserting a row into a table, I need to find the max value of a column, increment it by 1, and use that value in the row I am inserting.
INSERT INTO dts_route
(ROUTE_ID, ROUTE_UID, ROUTE_FOLDER)
VALUES (
(SELECT MAX(ROUTE_ID) + 1 FROM route) ,
ROUTE_UID,
ROUTE_FOLDER)
This works fine if there is at least one entry in the table, but it returns null when there are no entries.
How can I get a default value of 1 when there are no entries in the table?
SELECT COALESCE(MAX(ROUTE_ID),0) ...
This is not a safe way of creating an auto-increment field: two concurrent sessions can read the same MAX and generate the same id. You can use an Oracle sequence to achieve this goal instead.
As for the null, you can use NVL to give a default value (say, 0) in case the function returns null.
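For example, applied to the question's query (using the question's table name):
SELECT NVL(MAX(ROUTE_ID), 0) + 1 FROM route
On an empty table, MAX(ROUTE_ID) is null, NVL turns it into 0, and the first generated id becomes 1.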
Use a sequence for the ID. You need to create the sequence first; see the link below:
http://www.basis.com/onlinedocs/documentation/b3odbc/sql_sequences.htm
Use:
INSERT INTO dts_route
(ROUTE_ID)
SELECT COALESCE(MAX(r.route_id), 0) +1
FROM ROUTE r
...but you really should be using a sequence to populate the value with a sequential numeric value:
CREATE SEQUENCE dts_route_seq;
...
INSERT INTO dts_route
(ROUTE_ID)
SELECT dts_route_seq.NEXTVAL
FROM DUAL;
Set a default for NULL
SELECT NVL(MAX(ROUTE_ID),0)
though using a sequence might be easier if you don't mind the odd gaps in your route ids
Select 0 when the max is null; then 0 + 1 gives a correct number, whereas null + 1 stays null:
SELECT NVL(MAX(ROUTE_ID), 0) + 1 FROM route
If you are concerned about there being gaps in your route ids, then create the sequence with the NOCACHE clause:
CREATE SEQUENCE dts_route_seq NOCACHE;
Note that there is a performance hit, because Oracle now has to write the sequence's current value to the data dictionary each time it is incremented. (And this only avoids the gaps caused by discarded cached values; rolled-back transactions will still leave gaps.)