PostgreSQL - Compare a string with a null value

I'm a bit puzzled, because I believe the answer to this question is fairly simple, but I've searched and tried several options and couldn't find the right answer.
The database is PostgreSQL 13.1.
I am using an API which sends a JSON object to a stored function in the database as follows:
select * from api.car_model_create( '{
"payload": {
"manufacturer": "ai3ZV7PzbP5dNo2fb9q9QGjj2nS5aWJm",
"name": "SR22",
"variant": "G2",
"subname": null
},
"stk": "YbtmjypXMqXb1U5WOq53DxkaxrbIxl4X"
}'::json
);
The function queries a table with the following structure:
CREATE TABLE app.car_models (
id INTEGER NOT NULL
, public_id CHARACTER(32) DEFAULT (api.random_string(32))
, name CHARACTER VARYING(64) NOT NULL
, variant CHARACTER VARYING(64)
, subname CHARACTER VARYING(64)
, designator CHARACTER VARYING(16)
, manufacturer INTEGER NOT NULL
, car_type INTEGER
, status INTEGER NOT NULL DEFAULT 1
) WITHOUT OIDS TABLESPACE app;
Inside the function is a query like this:
SELECT count(*)
FROM app.car_models am, app.business_entities be
WHERE am.manufacturer=be.id
AND be.public_id=$1
AND lower(am.name) = lower($2)
AND lower(am.variant) = lower($3)
AND lower(am.subname) = lower($4);
Everything works as expected until one of the values of "variant" or "subname" is passed as NULL.
These two are the $3 and $4 in the query. The table accepts null values for these two columns.
If the value of "variant" or "subname" passed by the JSON object is null the query doesn't return any result even if the row exists in the table. I must be missing something really simple or basic. But I can't find it.
EDIT TO ADD A MINIMUM REPRODUCIBLE EXAMPLE:
CREATE TABLE car_models (
id INTEGER NOT NULL
, name CHARACTER VARYING(64) NOT NULL
, variant CHARACTER VARYING(64)
, subname CHARACTER VARYING(64)
);
INSERT INTO car_models VALUES (1, 'Name 1', 'Variant 1', 'Subname 1');
INSERT INTO car_models VALUES (2, 'Name 2', 'Variant 2', 'Subname 2');
INSERT INTO car_models VALUES (3, 'Name 3', NULL, 'Subname 3');
INSERT INTO car_models VALUES (4, 'Name 4', 'Variant 4', NULL);
SELECT count(*)
FROM car_models
WHERE lower(name) = lower('Name 4')
AND lower(variant) = lower('Variant 4')
AND lower(subname) = lower(null);

Postgres supports the standard null-safe comparison operator IS [NOT] DISTINCT FROM, which does exactly what you ask for. The problem with your original query is that any comparison with NULL using = evaluates to NULL (not true), so rows where variant or subname is NULL are filtered out:
SELECT count(*)
FROM car_models
WHERE lower(name) IS NOT DISTINCT FROM lower('Name 4')
AND lower(variant) IS NOT DISTINCT FROM lower('Variant 4')
AND lower(subname) IS NOT DISTINCT FROM lower(null);
Demo on DB Fiddle:
| count |
| ----: |
| 1 |
Side note: do you really need lower() here? It is not obvious from your sample data that you do. Note that wrapping a column in this function prevents the database from using a plain index on that column (unless you create an index on that specific expression).
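A minimal sketch of such an expression index, with an assumed index name:

```sql
-- Hypothetical expression index so lookups on lower(name) can use it
CREATE INDEX car_models_lower_name_idx ON app.car_models (lower(name));
```

With this in place, a predicate like lower(name) = lower($2) can be satisfied by the index instead of a sequential scan.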

Related

how can I reference another column inside postgres insert query?

I have the following table:
CREATE TABLE tab (
id SERIAL PRIMARY KEY,
code TEXT NOT NULL,
data TEXT
)
In some cases, I'd like to insert a new row ensuring that the code column is generated by the id column. In other cases the code is provided by the user.
For example:
INSERT INTO tab(code, data) VALUES ('code: ' || id::TEXT, 'my data');
The expected result is something like:
| id | code | data |
| -- | ------- | ---- |
| 1 | code: 1 | abc |
| 2 | code: 2 | xyz |
INSERT INTO tab(code, data) VALUES ('user.provided.code', 'my data');
The expected result is something like:
| id | code | data |
| -- | ------------------ | ---- |
| 1 | code: 1 | abc |
| 2 | code: 2 | xyz |
| 3 | user.provided.code | xyz |
Is it possible in one statement?
It sounds like you want to default the code column to something based on the id. Unfortunately, this doesn't work in Postgres:
create table tab (
id integer primary key generated always as identity,
code text not null default ('code '||id::text),
data text
);
One option is a single statement that does both an insert and update:
with i as (
insert into tab (code, data)
values ('', 'my data')
returning *
)
update tab
set code = 'code: ' || id::TEXT
where tab.id in (select i.id from i);
Another is to use a trigger that assigns the value.
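A sketch of that trigger approach (the trigger and function names are assumptions, not from the original answer; EXECUTE FUNCTION requires PostgreSQL 11+):

```sql
-- Hypothetical BEFORE INSERT trigger that fills in code from the
-- generated id when the caller did not supply one. Column defaults
-- (including the identity sequence) are evaluated before BEFORE ROW
-- triggers fire, so NEW.id is already assigned here.
CREATE FUNCTION tab_default_code() RETURNS trigger AS $$
BEGIN
    IF NEW.code IS NULL OR NEW.code = '' THEN
        NEW.code := 'code: ' || NEW.id::text;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tab_default_code_trg
    BEFORE INSERT ON tab
    FOR EACH ROW EXECUTE FUNCTION tab_default_code();
```

Because code is NOT NULL, callers would insert an empty string (or rely on a column default of '') and let the trigger replace it.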
Use INSERT INTO .. SELECT as follows:
INSERT INTO tab(code, data)
select 'code: ' || id::TEXT, 'my data' from tab;
Note: the insert above omits the id column, which I assume is auto-generated.

Insert NULL values into INT & STRING columns

I need to insert NULL values into integer and string columns. The data set I receive contains "---" for string columns and "NA" for integer columns, and when those values appear they should be inserted as NULL. I'm using SQL Server and my query looks like this:
INSERT INTO BOEMIC01
(MICRO_DATE, MICRO_YEAR, MICRO_MONTH, MICRO_WEEK, MICRO_DIVISION, MICRO_SUBDIVISION, MICRO_CODE_COUNTRY, MICRO_COUNTRY, MICRO_CODE_CENTER, MICRO_CENTER, MICRO_FREQ, MICRO_TOTAL_M, MICRO_TOTAL_Y, MICRO_TOTAL_Z, MICRO_ID_PROCESS, MICRO_DESC_PROCESS, MICRO_TOTAL_A, MICRO_TOTAL_B, MICRO_TOTAL_C, MICRO_ID_POINT, MICRO_DESC_POINT, MICRO_CODE_MATERIAL, MICRO_DESC_MATERIAL, MICRO_TOTAL_D, MICRO_TOTAL_E, MICRO_TOTAL_F) VALUES
(
'2019-01-15',
'2019',
'1',
'3',
'X',
'Y',
'P001',
'USA',
'USA1',
'USA2',
'Daily',
'2',
'2',
'0',
'158',
'Enva',
'2',
'2',
'0',
'344',
'2',
'---', --NULL
'---', --NULL
'NA', --NULL
'NA', --NULL
'NA' --NULL
)
To insert NULL values use the NULL keyword. As in:
insert into t (col)
values (null);
To insert the default value, which is usually null, just leave the column out of the column list entirely:
insert into t (col1)
values ('happy value');
col2 will be set to its default value -- which is NULL if no other default is defined.
If you are inserting values from another source, then use try_convert() or nullif(). For example:
insert into t (col_str, col_int)
values (nullif(@col_str, '---'), try_convert(int, @col_int));
Also, as a matter of standard practice, you should always use query parameters to supply any literal values to your queries, to avoid "SQL injection" issues. For instance, your query would now read:
INSERT INTO BOEMIC01
(MICRO_DATE, MICRO_YEAR, MICRO_MONTH, [...])
VALUES(?, ?, ? [...])
Notice the ? symbols and notice also that they are not in quotes.
Then, when you execute the query, you supply both the SQL string and, separately, an array of values that are to be substituted for each ? in order of occurrence. Now, SQL cannot misinterpret any value as "part of the SQL," because it isn't. Different sets of parameter values can be supplied to the same SQL string each time.
You can use functions such as NULLIF() as mentioned in BJones' comment: NULLIF('---', ?) ... the parameter's value will be passed to the NULLIF function as its second argument. I think that's a fine way to handle your requirement (and it should have been offered as "an answer").
It really depends on where the values are coming from, but if, for example, this insert is inside a stored procedure and the values come in via parameters, then the following shows how to ensure NULL values for the cases specified (irrelevant columns left out for brevity):
INSERT INTO BOEMIC01 (... MICRO_CODE_MATERIAL, MICRO_DESC_MATERIAL, MICRO_TOTAL_D, MICRO_TOTAL_E, MICRO_TOTAL_F)
select ...
, case when @MICRO_CODE_MATERIAL != '---' then @MICRO_CODE_MATERIAL else null end
, case when @MICRO_DESC_MATERIAL != '---' then @MICRO_DESC_MATERIAL else null end
, try_convert(int, @MICRO_TOTAL_D)
, try_convert(int, @MICRO_TOTAL_E)
, try_convert(int, @MICRO_TOTAL_F)
However, if you are passing this data from a client application, then convert it client-side.

How to write a WHERE clause for NULL value in ARRAY type column?

I created a table which contains a column of string ARRAY type as:
CREATE TABLE test
(
id integer NOT NULL,
list text[] COLLATE pg_catalog."default",
CONSTRAINT test_pkey PRIMARY KEY (id)
)
I then added rows which contain various values for that array, including an empty array and missing data (null):
insert into test (id, list) values (1, array['one', 'two', 'three']);
insert into test (id, list) values (2, array['four']);
insert into test (id, list) values (3, array['']);
insert into test (id, list) values (4, array[]::text[]); -- empty array
insert into test (id, list) values (5, null); -- missing value
I am trying to get a row which contains a null value ([null]) in the list column but:
select * from test where list = null;
...returns no rows and:
select * from test where list = '{}';
...returns row with id = 4.
How to write WHERE clause which would target NULL value for column of ARRAY type?
demo:db<>fiddle
... WHERE list IS NULL
select * from test where list IS null;
Like this:
select * from test where list IS NULL;

PostgreSQL filter JSON column with mixed boolean and JSON values in it

My schema looks like this:
create table mytable
(
id integer not null,
value json not null
);
And the value column contains mixed data, both JSON and booleans like this:
id | value
----------
1 | {"key1": 1, "key2": 2}
2 | false
3 | {"key2": 3}
4 | true
These mixed values are accepted just fine by PostgreSQL.
Now I want to select all rows that either contain some json data or are true. I.e. I want to select rows 1, 3 and 4.
Here's the SQL query which I could come up with:
SELECT mytable.value
FROM mytable
WHERE CAST(CAST(mytable.value AS TEXT) AS BOOLEAN) IS NOT false;
But it fails with this message:
ERROR: argument of IS NOT FALSE must be type boolean, not type json
Your value false is not boolean but varchar (insert into mytable (id, value) values (4, true); fails, while insert into mytable (id, value) values (4, 'true'); works fine).
You can select all values that are not 'false' like this:
SELECT mytable.value FROM mytable WHERE mytable.value::text != 'false';
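An alternative sketch (not from the original answer) that inspects the JSON type explicitly with json_typeof(), available since PostgreSQL 9.4, so objects are kept regardless of their text form:

```sql
-- Keep rows whose value is a JSON object, or the JSON boolean true.
SELECT mytable.value
FROM mytable
WHERE json_typeof(mytable.value) = 'object'
   OR mytable.value::text = 'true';
```

This avoids comparing the raw text of JSON objects and only falls back to a text comparison for the scalar boolean case.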

Advice on a complex SQL query for a BIRT dataset

I have the following (simplified) PostgreSQL database table containing info about maintenance done on a certain device:
id bigint NOT NULL,
"time" timestamp(0) with time zone,
action_name text NOT NULL,
action_info text NOT NULL DEFAULT ''::text,
The action_name field can have four values of interest:
MAINTENANCE_START
DEVICE_DEFECT
DEVICE_REPAIRED
MAINTENANCE_STOP
<other (irrelevant) values>
I have to do a BIRT report using the information from this table. I should have an entry in a table each time a MAINTENANCE_STOP action is encountered. If between this MAINTENANCE_STOP action and its corresponding MAINTENANCE_START action (which should be the MAINTENANCE_START action with the greatest "time" value smaller than that of the MAINTENANCE_STOP action) I encounter a DEVICE_DEFECT or DEVICE_REPAIRED action, I should write the string "Device not available" in a table cell; otherwise I should write "Device available".
Also, I should compute the duration of the maintenance as the time difference between the MAINTENANCE_STOP action and the MAINTENANCE_START action.
I first attempted to do this in the SQL query, but now I'm not sure it's possible. What approach do you recommend?
My working snippet:
CREATE TABLE "log"
(
id bigint NOT NULL,
time timestamp(0) with time zone,
action_name text NOT NULL,
action_info text NOT NULL DEFAULT ''::text
);
insert into log(id,time,action_name,action_info) values ( 1, '2011-01-01', 'MAINTENANCE_START', 'maintenance01start');
insert into log(id,time,action_name,action_info) values ( 2, '2011-02-01', 'MAINTENANCE_START', 'maintenance02start');
insert into log(id,time,action_name,action_info) values ( 3, '2011-03-01', 'MAINTENANCE_START', 'maintenance03start');
insert into log(id,time,action_name,action_info) values ( 4, '2011-04-01', 'MAINTENANCE_START', 'maintenance04start');
insert into log(id,time,action_name,action_info) values ( 5, '2011-01-10', 'MAINTENANCE_STOP', 'maintenance01stop');
insert into log(id,time,action_name,action_info) values ( 6, '2011-02-10', 'MAINTENANCE_STOP', 'maintenance02stop');
insert into log(id,time,action_name,action_info) values ( 7, '2011-03-10', 'MAINTENANCE_STOP', 'maintenance03stop');
--insert into log(id,time,action_name,action_info) values ( 8, '2011-04-10', 'MAINTENANCE_STOP', 'maintenance04stop');
insert into log(id,time,action_name,action_info) values ( 9, '2011-02-05', 'DEVICE_DEFECT', 'maintenance02defect');
insert into log(id,time,action_name,action_info) values ( 10, '2011-03-05', 'DEVICE_REPAIRED', 'maintenance03repaired');
select
maintenance.start as start
, maintenance.stop as stop
, count (device_action.*) as device_actions
from (select
l_start.time as start
, (select time
from log l_stop
where l_stop.time > l_start.time
and l_stop.action_name = 'MAINTENANCE_STOP'
order by time asc limit 1) as stop
from log l_start
where l_start.action_name='MAINTENANCE_START' order by l_start.time asc) maintenance
left join log device_action
on device_action.time > maintenance.start
and device_action.time < maintenance.stop
and device_action.action_name like 'DEVICE_%'
group by maintenance.start
, maintenance.stop
order by maintenance.start asc
;
Be careful with performance: if Postgres doesn't optimize the nested query, it will take O(n^2) time.
If you can:
Change the structure, e.g. one table DEVICE_MAINTENANCES with a maintenance ID, and a second table DEVICE_MAINTENANCE_ACTIONS with a foreign key to DEVICE_MAINTENANCES.ID. Queries will be simpler and faster.
If not, treat time as the primary key (implicit index).
If not, create an index on the time column.
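A window-function alternative to the correlated subquery, sketched against the log table above (not from the original answer; it assumes MAINTENANCE_START and MAINTENANCE_STOP strictly alternate in time):

```sql
-- Pair each MAINTENANCE_START with the next maintenance event using
-- lead(), then count DEVICE_* actions that fall between the two.
WITH maintenance AS (
    SELECT time AS start,
           lead(time) OVER (ORDER BY time) AS stop,
           action_name
    FROM log
    WHERE action_name IN ('MAINTENANCE_START', 'MAINTENANCE_STOP')
)
SELECT m.start,
       m.stop,
       count(d.*) AS device_actions,
       m.stop - m.start AS duration   -- maintenance duration as an interval
FROM maintenance m
LEFT JOIN log d
       ON d.time > m.start
      AND d.time < m.stop
      AND d.action_name LIKE 'DEVICE_%'
WHERE m.action_name = 'MAINTENANCE_START'
GROUP BY m.start, m.stop
ORDER BY m.start;
```

This scans the log once for the pairing instead of running one subquery per MAINTENANCE_START row, and the device_actions count can be turned into the "Device available" / "Device not available" string in the report layer.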