Need to replace all the null values with '0' in Vertica - sql

I have a Vertica table, "CUSTOMER", which contains around 10 columns. Each column contains a few null values, so I have to write one query that replaces all the null values with '0'.
Is it possible to do this in Vertica? Can anyone please help me with that?

You use coalesce():
select coalesce(col1, 0) as col1, . . .
from t;
You can incorporate similar logic into an update as well.

In a SELECT, as @GordonLinoff says, you use COALESCE(), or the slightly faster NVL(), IFNULL() or ISNULL() functions. These three are synonyms of each other and take exactly two arguments, while COALESCE() is more flexible, at a cost: it takes a variable-length argument list and returns the first non-null value in it.
For updating, strive to update only the rows you need to update, and go, for each column:
UPDATE t SET col1=0 WHERE col1 IS NULL;
UPDATE t SET col2=0 WHERE col2 IS NULL;
In an extreme case, you might end up updating the same row once per column, and then you have gained nothing, but it is still worth planning to minimise how often you update.
Or, you might consider:
UPDATE t SET
col1 = NVL(col1,0)
, col2 = NVL(col2,0)
, col3 = NVL(col3,0)
[...]
WHERE col1 IS NULL
OR col2 IS NULL
OR col3 IS NULL
[...]
;
Because Vertica is columnar, and because each UPDATE in Vertica is a DELETE plus an INSERT anyway, it makes no difference whether you update just one column or all columns.
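As a rough illustration of the single-statement variant, the sketch below runs the same pattern from Python against an in-memory SQLite database. SQLite is only a stand-in for Vertica so that the example is self-contained; the table t and its columns are made up, and COALESCE() is used because SQLite has no NVL().

```python
import sqlite3

# SQLite is only a stand-in for Vertica here; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, col1 INTEGER, col2 INTEGER, col3 INTEGER)"
)
conn.executemany(
    "INSERT INTO t (id, col1, col2, col3) VALUES (?, ?, ?, ?)",
    [(1, None, 5, None), (2, 7, 8, 9), (3, None, None, None)],
)

# One pass over the table: every row with at least one NULL is rewritten once,
# and rows without any NULL are left alone by the WHERE clause.
cur = conn.execute(
    """
    UPDATE t SET
        col1 = COALESCE(col1, 0),
        col2 = COALESCE(col2, 0),
        col3 = COALESCE(col3, 0)
    WHERE col1 IS NULL
       OR col2 IS NULL
       OR col3 IS NULL
    """
)
touched = cur.rowcount  # number of rows the UPDATE modified
rows = conn.execute("SELECT id, col1, col2, col3 FROM t ORDER BY id").fetchall()
print(touched, rows)
```

Row 2 has no NULLs, so the WHERE clause keeps it out of the update entirely.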

Related

How to know which column has changed on UPDATE?

In a statement like this:
update tab1 set (col1,col2)=(val1,val2) returning "?"
I send the whole row for update with new values, and RETURNING * gives back the whole row, but is there a way to check exactly which column has changed while the others remained the same?
I understand that UPDATE rewrites the values, but maybe there is some built-in function for such comparison?
Basically, you need the pre-UPDATE values of the updated rows to compare against. That's kind of hard, as RETURNING only returns the post-UPDATE state, but it can be worked around. See:
Return pre-UPDATE column values using SQL only
So this does the basic trick:
WITH input(col1, col2) AS (
   SELECT 1, text 'post_up'  -- "whole row"
   )
, pre_upd AS (
   UPDATE tab1 x
   SET    (col1, col2) = (i.col1, i.col2)
   FROM   input i
   JOIN   tab1 y USING (col1)
   WHERE  x.col1 = y.col1
   RETURNING y.*
   )
TABLE  pre_upd
UNION ALL
TABLE  input;
This is assuming that col1 in your example is the PRIMARY KEY. We need some way to identify rows unambiguously.
Note that this is not safe against race conditions between concurrent writes. You need to do more to be safe. See related answer above.
The explicit cast to text I added in the CTE above is redundant as text is the default type for string literals anyway. (Like integer is the default for simple numeric literals.) For other data types, explicit casting may be necessary. See:
Casting NULL type when updating multiple rows
Also be aware that all updates write a new row version, even if nothing changes at all. Typically, you'd want to suppress such costly empty updates with appropriate WHERE clauses. See:
How do I (or can I) SELECT DISTINCT on multiple columns?
While "passing whole rows", you'll have to check on all columns that might change, to achieve that.
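Once both row states are available on the client side, finding the changed columns is a plain comparison. A minimal sketch in Python, assuming the pre- and post-UPDATE rows have been fetched as dictionaries with the same column names as keys (the helper name changed_columns and the sample values are hypothetical):

```python
def changed_columns(pre_row, post_row):
    """Return the names of columns whose value differs between the two row states.

    None compares equal to None here, so a column that was NULL and stays NULL
    is reported as unchanged.
    """
    return [col for col in post_row if pre_row[col] != post_row[col]]


# Hypothetical pre- and post-UPDATE states of one row of tab1.
pre = {"col1": 1, "col2": "pre_up", "col3": None}
post = {"col1": 1, "col2": "post_up", "col3": None}
result = changed_columns(pre, post)
print(result)
```

The same list can also drive the decision to skip an empty update when it comes back empty.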

UPDATING a table which is selected using SELECT query in SQL

I want to use the UPDATE keyword with a SELECT, something like
UPDATE(select col1,col2,col3 from UNPIVOTED_TABLE)
SET col1=0
WHERE col1 IS NULL
SET col2=0
WHERE col2 is NULL
SET col3=0
WHERE col3 is NULL
I know my syntax is not right, but this is basically what I am trying to achieve.
I am selecting 3 columns, and there are some null values which I want to update and set to 0.
Also, I cannot update the table itself, since the original table was UNPIVOTED and I am PIVOTING it in the select statement, and I need the pivoted result (that is, the columns I have selected: col1, col2, col3).
Also, I am using Amazon Athena, if that is relevant.
If I followed you correctly, you just want coalesce():
select
coalesce(col1, 0) col1,
coalesce(col2, 0) col2,
coalesce(col3, 0) col3
from unpivoted_table
coalesce() checks whether the first argument is null: if it is not, it returns the original value as-is; otherwise it returns the value given as the second argument instead.
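A small sketch of that behaviour, run from Python against an in-memory SQLite database rather than Athena (an assumption made only to keep the example runnable; the table contents are invented). The SELECT substitutes 0 in the result while the stored rows keep their NULLs:

```python
import sqlite3

# SQLite stands in for Athena here; the rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE unpivoted_table (col1 INTEGER, col2 INTEGER, col3 INTEGER)")
conn.executemany(
    "INSERT INTO unpivoted_table VALUES (?, ?, ?)",
    [(1, None, 3), (None, None, 6)],
)

# NULLs are replaced only in the query result.
shown = conn.execute(
    """
    SELECT COALESCE(col1, 0) AS col1,
           COALESCE(col2, 0) AS col2,
           COALESCE(col3, 0) AS col3
    FROM unpivoted_table
    ORDER BY rowid
    """
).fetchall()

# The underlying table is untouched.
stored = conn.execute(
    "SELECT col1, col2, col3 FROM unpivoted_table ORDER BY rowid"
).fetchall()
print(shown, stored)
```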
Since you are using Athena, I assume you have read-only access and cannot really update the original data.
In your case, if you'd like to represent the nulls as 0, you can use IFNULL(column, 0).
For more information about IFNULL you can read here

PL/SQL Oracle condition equals

I think I'm encountering a fairly simple problem in PL/SQL on an Oracle Database(10g) and I'm hoping one of you guys can help me out.
I'm trying to explain this as clearly as possible, but it's hard for me.
When I try to compare varchar2 values from 2 different tables, to check whether I need to create a new record or can re-use the ID of the existing one, the DB (or I) compares these values in the wrong way. All is fine when both fields contain a value: this results in 'a' = 'a', which it understands. But when both fields are NULL (or '', which Oracle turns into NULL), it cannot compare the fields.
I found a 'solution' to this problem but I'm certain there is a better way.
rowTable1 ROWTABLE1%ROWTYPE;
iReUsableID INT;
SELECT * INTO rowTable1
FROM TABLE1
WHERE TABLE1ID = 'someID';
SELECT TABLE2ID INTO iReUsableID
FROM TABLE2
WHERE NVL(SOMEFIELDNAME,' ') = NVL(rowTable1.SOMEFIELDNAME,' ');
So NVL changes the null value to ' ', after which the comparison works as expected.
Thanks in advance,
Dennis
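You can use the LNNVL function (http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions078.htm) and reverse the condition. Be aware that LNNVL returns TRUE whenever the condition is FALSE or UNKNOWN, so the query below also matches rows where only one of the two values is NULL: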
SELECT TABLE2ID INTO iReUsableID
FROM TABLE2
WHERE LNNVL(SOMEFIELDNAME != rowTable1.SOMEFIELDNAME);
Your method is fine, unless one of the values could be a space. The "standard" way of doing the comparison is to explicitly compare to NULL:
WHERE col1 = col2 OR (col1 IS NULL AND col2 IS NULL)
In Oracle, comparisons on strings are encumbered by the fact that Oracle treats the empty string as NULL. This is a peculiarity of Oracle and not a problem in other databases.
In Oracle (or any RDBMS, I believe), one NULL is not equal to another NULL. Therefore, you need the workaround you have described if you want to force two NULL values to be considered the same. Additionally, you might want to default NULL values to '' (empty) rather than ' ' (space), although in Oracle itself '' is treated as NULL, so that substitution only helps in other databases.
From Wikipedia (originally the ISO spec, but I couldn't access it):
Since Null is not a member of any data domain, it is not considered a "value", but rather a marker (or placeholder) indicating the absence of value. Because of this, comparisons with Null can never result in either True or False, but always in a third logical result, Unknown.
As mentioned by Jan Spurny, you can use LNNVL for comparison. However, it would be wrong to say that a comparison is actually being made when both values being compared are NULL.
This is indeed a simple and usable way to compare nulls.
You cannot compare NULLs directly, since NULL is not equal to NULL.
You must provide your own logic for how you would like to compare them, which is what you've done with NVL().
Keep in mind that you are treating NULLs as a space, so ' ' in one table would be equal to NULL in the other table in your case.
There are some other ways (e.g. LNNVL), but I don't think they are a "better" way.
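To make the difference between the approaches concrete, here is a minimal sketch that evaluates them from Python against an in-memory SQLite database. SQLite is an assumption made for runnability: unlike Oracle it has no NVL() (COALESCE() is used instead) and it does not turn '' into NULL. The helper names explicit_match and sentinel_match are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# NULL = NULL evaluates to NULL (unknown), which sqlite3 hands back as None.
null_eq_null = conn.execute("SELECT NULL = NULL").fetchone()[0]


def explicit_match(a, b):
    """Null-safe equality spelled out with IS NULL checks."""
    row = conn.execute(
        "SELECT (:a = :b) OR (:a IS NULL AND :b IS NULL)", {"a": a, "b": b}
    ).fetchone()
    return row[0] == 1


def sentinel_match(a, b):
    """Same comparison using a one-space sentinel, as in the question."""
    row = conn.execute(
        "SELECT COALESCE(:a, ' ') = COALESCE(:b, ' ')", {"a": a, "b": b}
    ).fetchone()
    return row[0] == 1


print(null_eq_null)
print(explicit_match(None, None), explicit_match("a", None))
print(sentinel_match(None, " "), explicit_match(None, " "))
```

The last line shows the collision mentioned above: the sentinel version treats NULL and a real ' ' as equal, while the explicit version does not.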

How should I deal with null parameters in a PL/SQL stored procedure when I want to use them in comparisons?

I have a stored procedure with a parameter name which I want to use in a where clause to match the value of a column i.e. something like
where col1 = name
Now of course this fails to match null to null because of the way null works. Do I need to do
where ((name is null and col1 is null) or col1 = name)
in situations like this or is there a more concise way of doing it?
You can use the DECODE function in the following fashion:
where decode(col1, name, 0) is not null
Quoting the SQL reference:
In a DECODE function, Oracle considers two nulls to be equivalent.
I think your own suggestion is the best way to do it.
What you have done is correct. There is a more concise way, but it isn't really better:
where nvl(col1,'xx') = nvl(name,'xx')
The trouble is, you have to make sure that the value you use for nulls ('xx' is my example) couldn't actually be a real value in the data.
If col1 is indexed, it would be best (performance-wise) to split the query in two:
SELECT *
FROM mytable
WHERE col1 = name
UNION ALL
SELECT *
FROM mytable
WHERE name IS NULL AND col1 IS NULL
This way, Oracle can optimize both queries independently, so only the first or the second part is actually executed, depending on whether the name passed is NULL or not.
Oracle, though, does not store NULL values in a single-column B-tree index, so searching for a NULL value will always result in a full table scan.
If your table is large, holds few NULL values and you search for them frequently, you can create a function-based index:
CREATE INDEX ix_mytable_col1__null ON mytable (CASE WHEN col1 IS NULL THEN 1 END)
and use it in a query:
SELECT *
FROM mytable
WHERE col1 = name
UNION ALL
SELECT *
FROM mytable
WHERE CASE WHEN col1 IS NULL THEN 1 END = CASE WHEN name IS NULL THEN 1 END
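The split only changes how the optimizer can plan the query, not what it returns. As a sanity check, the sketch below compares the OR form and the UNION ALL form from Python on an in-memory SQLite database. SQLite, the sample rows, and the helper ids are assumptions for illustration, and the example says nothing about Oracle's execution plans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, col1 TEXT)")
conn.executemany(
    "INSERT INTO mytable (id, col1) VALUES (?, ?)",
    [(1, "a"), (2, None), (3, "b"), (4, None)],
)

# Single statement with the null-safe OR condition.
OR_FORM = """
    SELECT id FROM mytable
    WHERE (:name IS NULL AND col1 IS NULL) OR col1 = :name
    ORDER BY id
"""

# The same logic split into two branches, at most one of which can match.
UNION_FORM = """
    SELECT id FROM mytable WHERE col1 = :name
    UNION ALL
    SELECT id FROM mytable WHERE :name IS NULL AND col1 IS NULL
    ORDER BY id
"""


def ids(sql, name):
    """Run one of the statements with the given parameter and collect the ids."""
    return [row[0] for row in conn.execute(sql, {"name": name})]


print(ids(OR_FORM, "a"), ids(UNION_FORM, "a"))
print(ids(OR_FORM, None), ids(UNION_FORM, None))
```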
Keep it the way you have it. It's more intuitive, less buggy, works in any database, and is faster. The concise way is not always the best. See (PLSQL) What is the simplest expression to test for a changed value in an Oracle on-update trigger?
SELECT * FROM table
WHERE parameter IS NULL OR column = parameter;

Optional Parameters In Stored Procs - CASE vs. OR

Is there any difference performance-wise between these filter methods?
Method 1: WHERE (@Col1 IS NULL OR t.column = @Col1)
Method 2: WHERE 1 = case when @col1 is null then 1 else case when col1 = @col1 then 1 else 0 end end
If you know your Col1 column doesn't itself contain any null values, you can do this:
WHERE Col1 = COALESCE(@Col1, Col1)
Otherwise your CASE statement should typically do a little better than the OR. I add emphasis to "typically" because every table is different. You should always profile to know for sure.
Unfortunately, typically the fastest way is to use dynamic sql to exclude the condition from the query in the first place if the parameter is null. But of course save that as an optimization of last resort.
Why not use Coalesce?
Where Col1 = Coalesce(@Col1, Col1)
EDIT: (thx to Joel's comment below) This works only if col1 does not allow nulls, or if it does allow nulls and you want the null values excluded when @Col1 is null or absent. So, if it allows nulls and you want them included when the @Col1 parameter is null or absent, then modify as follows:
Where Coalesce(@Col1, Col1) Is Null Or Col1 = Coalesce(@Col1, Col1)
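The difference between these filters only shows up when the column itself holds NULLs. A minimal sketch, run from Python on an in-memory SQLite database instead of SQL Server (an assumption for runnability; the table, the named parameter :p and the helper ids are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, col1 TEXT)")
conn.executemany(
    "INSERT INTO t (id, col1) VALUES (?, ?)",
    [(1, "x"), (2, None), (3, "y")],
)

# The three filter styles discussed above, with :p as the optional parameter.
COALESCE_FILTER = "col1 = COALESCE(:p, col1)"
OR_FILTER = "(:p IS NULL OR col1 = :p)"
EXTENDED_FILTER = "COALESCE(:p, col1) IS NULL OR col1 = COALESCE(:p, col1)"


def ids(where, param):
    """Return the ids selected by one of the fixed filter strings above."""
    sql = f"SELECT id FROM t WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(sql, {"p": param})]


# With the parameter absent, the plain COALESCE filter drops the NULL row.
print(ids(COALESCE_FILTER, None))
print(ids(OR_FILTER, None))
print(ids(EXTENDED_FILTER, None))
# With a value supplied, all three agree.
print(ids(COALESCE_FILTER, "x"), ids(OR_FILTER, "x"), ids(EXTENDED_FILTER, "x"))
```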
Yes. CASE has a guaranteed execution order while OR does not. Many programmers rely on OR short-circuit and are surprised to learn that a set oriented declarative language like SQL does not guarantee boolean operator short-circuit.
That being said, using OR and CASE in WHERE clauses is a bad practice. Separate the condition into a clear IF statement and have separate queries on each branch:
IF @col1 IS NOT NULL
SELECT ... WHERE col1 = @col1;
ELSE
SELECT ... WHERE <alternatecondition>;
Placing the condition inside the WHERE usually defeats the optimizer, which cannot guess what @col1 will be and produces a bad plan involving a full scan.
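The same branching can also live in the calling code. The sketch below is a Python illustration of the idea, using an in-memory SQLite database as a stand-in for SQL Server; the table, the function find_ids, and the choice of "no filter" as the alternate condition are all assumptions, since the original leaves that condition open.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, col1 TEXT)")
conn.executemany(
    "INSERT INTO t (id, col1) VALUES (?, ?)",
    [(1, "x"), (2, None), (3, "x")],
)


def find_ids(col1=None):
    """Pick one of two fixed statements instead of one catch-all WHERE clause."""
    if col1 is not None:
        # Plain equality predicate that can be planned on its own.
        cur = conn.execute("SELECT id FROM t WHERE col1 = ? ORDER BY id", (col1,))
    else:
        # Hypothetical alternate condition: no filter at all.
        cur = conn.execute("SELECT id FROM t ORDER BY id")
    return [row[0] for row in cur]


print(find_ids("x"))
print(find_ids())
```

Each statement stays parameterised, so the branching does not require concatenating user input into the SQL text.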
Update
Since I got tired of explaining again and again that boolean short-circuit is not guaranteed in SQL, I decided to write a full blog column about it: SQL Server boolean operator short-circuit. There you'll find a simple counter example showing that boolean short-circuit is not only not guaranteed, but relying on it can actually be very dangerous as it can result in run time errors.