SQL - conditionally set column values to NULL - sql

I have a table, some_table, with a number of columns. Some of them contain an invalid value in some rows, which needs to be transformed into NULL.
I cannot use the statement below: mutating the original table is not allowed by my permissions, and it would also have to be repeated for every column name.
UPDATE some_table SET column_name = NULL WHERE column_name = 'invalid value';
So it needs to be a SELECT operation that creates a new table with the invalid values converted to NULL. Is there a quick way to do this?
Updating with an answer from @Jonny below:
NULLIF is a good option. However, is there a way to apply it to all columns rather than having to do it for each column separately? Sometimes the number of columns is pretty huge.

You could use NULLIF.
Have a look at 9.16.3. NULLIF
https://www.postgresql.org/docs/current/static/functions-conditional.html
NULLIF(a, b) returns NULL when a equals b and returns a otherwise, so the column goes first:
SELECT NULLIF(column_name, 'invalid value')
FROM some_table
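To apply NULLIF to every column without typing each one, the select list can be generated from the table's metadata. Here is a minimal sketch in Python, using the standard-library sqlite3 module as a stand-in for the real database (an assumption: on PostgreSQL the column names would come from information_schema.columns rather than PRAGMA table_info). The helper name nullif_select and the sample rows are made up for illustration.

```python
import sqlite3

def nullif_select(conn, table, invalid):
    """Build a SELECT that wraps every column of `table` in NULLIF(col, invalid).

    Hypothetical helper; `table` is assumed to be a trusted identifier.
    """
    # PRAGMA table_info is SQLite-specific; row[1] is the column name.
    cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    select_list = ", ".join(f'NULLIF("{c}", ?) AS "{c}"' for c in cols)
    sql = f"SELECT {select_list} FROM {table} ORDER BY rowid"
    # One bound parameter per column, all carrying the same invalid value.
    return sql, [invalid] * len(cols)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (a TEXT, b TEXT)")
conn.executemany("INSERT INTO some_table VALUES (?, ?)",
                 [("x", "invalid value"), ("invalid value", "y")])

sql, params = nullif_select(conn, "some_table", "invalid value")
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```

The original table is only read, never updated, which matches the permission constraint in the question.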

How about something like:
INSERT INTO some_table2 (column_name, ...) SELECT * FROM some_table WHERE column_name <> 'invalid value';
INSERT INTO some_table2 (column_name, ...) SELECT null, ... FROM some_table WHERE column_name = 'invalid value';

Related

SQL joining huge tables by excluding just one column in select statement [duplicate]

I'm trying to use a select statement to get all of the columns from a certain MySQL table except one. Is there a simple way to do this?
EDIT: There are 53 columns in this table (NOT MY DESIGN)
Actually there is a way, you need to have permissions of course for doing this ...
SET @sql = CONCAT('SELECT ', (SELECT REPLACE(GROUP_CONCAT(COLUMN_NAME), '<columns_to_omit>,', '') FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = '<table>' AND TABLE_SCHEMA = '<database>'), ' FROM <table>');
PREPARE stmt1 FROM @sql;
EXECUTE stmt1;
Replacing <table>, <database> and <columns_to_omit>
(Do not try this on a big table, the result might be... surprising !)
TEMPORARY TABLE
DROP TABLE IF EXISTS temp_tb;
CREATE TEMPORARY TABLE temp_tb ENGINE=MEMORY SELECT * FROM orig_tb;
ALTER TABLE temp_tb DROP col_a, DROP col_f,DROP col_z; #// MySQL
SELECT * FROM temp_tb;
DROP syntax may vary between databases (@Denis Rozhnev)
Would a View work better in this case?
CREATE VIEW vwTable
as
SELECT
col1
, col2
, col3
, col..
, col53
FROM table
You can do:
SELECT column1, column2, column4 FROM table WHERE whatever
without getting column3, though perhaps you were looking for a more general solution?
If you are looking to exclude the value of a field, e.g. for security concerns / sensitive info, you can retrieve that column as null.
e.g.
SELECT *, NULL AS salary FROM users
To the best of my knowledge, there isn't. You can do something like:
SELECT col1, col2, col3, col4 FROM tbl
and manually choose the columns you want. However, if you want a lot of columns, then you might just want to do a:
SELECT * FROM tbl
and just ignore what you don't want.
In your particular case, I would suggest:
SELECT * FROM tbl
unless you only want a few columns. If you only want four columns, then:
SELECT col3, col6, col45, col52 FROM tbl
would be fine, but if you want 50 columns, then any code that makes the query would become (too?) difficult to read.
While trying the solutions by @Mahomedalid and @Junaid I found a problem, so I thought I'd share it. If a column name contains spaces or hyphens, like check-in, the query will fail. The simple workaround is to put backticks around the column names. The modified query is below:
SET @SQL = CONCAT('SELECT ', (SELECT GROUP_CONCAT(CONCAT("`", COLUMN_NAME, "`")) FROM
INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'users' AND COLUMN_NAME NOT IN ('id')), ' FROM users');
PREPARE stmt1 FROM @SQL;
EXECUTE stmt1;
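The same idea (read the column names from metadata, quote each one, skip the unwanted ones) can be sketched outside the database as well. This is only an illustration in Python with the standard-library sqlite3 module, which also accepts MySQL-style backticks around identifiers; the helper name select_except and the sample table are assumptions, not part of the answers above.

```python
import sqlite3

def select_except(conn, table, omit):
    """Return a SELECT listing every column of `table` except those in `omit`.

    Hypothetical helper; `table` is assumed to be a trusted identifier.
    """
    # row[1] of PRAGMA table_info is the column name, in table order.
    cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    kept = [c for c in cols if c not in omit]
    # Backtick-quote each name so that hyphens and spaces survive.
    return "SELECT " + ", ".join(f"`{c}`" for c in kept) + f" FROM `{table}`"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, `check-in` TEXT, `full name` TEXT)")
conn.execute("INSERT INTO users VALUES (1, '2020-01-01', 'Ann Lee')")

query = select_except(conn, "users", {"id"})
result = conn.execute(query).fetchall()
print(query)
print(result)
```

Without the backticks the generated statement would read check-in as a subtraction and fail, which is the problem described above.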
If the column that you didn't want to select had a massive amount of data in it, and you didn't want to include it due to speed issues and you select the other columns often, I would suggest that you create a new table with the one field that you don't usually select with a key to the original table and remove the field from the original table. Join the tables when that extra field is actually required.
You could use DESCRIBE my_table and use the results of that to generate the SELECT statement dynamically.
My main problem is the many columns I get when joining tables. While this is not the answer to your question (how to select all but certain columns from one table), I think it is worth mentioning that you can specify table.* to get all columns from a particular table, instead of just specifying *.
Here is an example of how this could be very useful:
select users.*, phone.meta_value as phone, zipcode.meta_value as zipcode
from users
left join user_meta as phone
on ( (users.user_id = phone.user_id) AND (phone.meta_key = 'phone') )
left join user_meta as zipcode
on ( (users.user_id = zipcode.user_id) AND (zipcode.meta_key = 'zipcode') )
The result is all the columns from the users table, and two additional columns which were joined from the meta table.
I liked the answer from @Mahomedalid, apart from the issue pointed out in the comment from @Bill Karwin. The possible problem raised by @Jan Koritak is real; I ran into it, but I found a trick for it and want to share it here for anyone facing the issue.
We can replace the REPLACE function with a WHERE clause in the sub-query of the prepared statement, like this:
Using my table and column name:
SET @SQL = CONCAT('SELECT ', (SELECT GROUP_CONCAT(COLUMN_NAME) FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'users' AND COLUMN_NAME NOT IN ('id')), ' FROM users');
PREPARE stmt1 FROM @SQL;
EXECUTE stmt1;
So, this is going to exclude only the field id but not company_id
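Why the REPLACE approach trips over company_id can be seen with plain string handling. A small Python sketch (the column names are just the ones from this example):

```python
columns = ["id", "company_id", "name"]
omit = "id"

# REPLACE-style: delete every occurrence of the substring "id," from the joined list.
joined = ",".join(columns)
replaced = joined.replace(omit + ",", "")

# WHERE-style: drop the unwanted column before joining.
filtered = ",".join(c for c in columns if c != omit)

print(replaced)  # the tail of company_id was eaten as well
print(filtered)  # only the id column is gone
```

The substring match cannot tell a whole column name from the tail of another one, while filtering on the name itself can.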
Yes, though it can be I/O-heavy depending on the table. Here is a workaround I found for it:
SELECT *
INTO #temp
FROM table
ALTER TABLE #temp DROP COLUMN column_name
SELECT *
FROM #temp
It is good practice to specify the columns that you are querying even if you query all the columns.
So I would suggest you write the name of each column in the statement (excluding the one you don't want).
SELECT
col1
, col2
, col3
, col..
, col53
FROM table
I agree with the "simple" solution of listing all the columns, but this can be burdensome, and typos can cause lots of wasted time. I use a function "getTableColumns" to retrieve the names of my columns suitable for pasting into a query. Then all I need to do is to delete those I don't want.
CREATE FUNCTION `getTableColumns`(tablename varchar(100))
RETURNS varchar(5000) CHARSET latin1
BEGIN
DECLARE done INT DEFAULT 0;
DECLARE res VARCHAR(5000) DEFAULT "";
DECLARE col VARCHAR(200);
DECLARE cur1 CURSOR FOR
select COLUMN_NAME from information_schema.columns
where TABLE_NAME=tablename AND TABLE_SCHEMA="yourdatabase" ORDER BY ORDINAL_POSITION;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
OPEN cur1;
REPEAT
FETCH cur1 INTO col;
IF NOT done THEN
set res = CONCAT(res,IF(LENGTH(res)>0,",",""),col);
END IF;
UNTIL done END REPEAT;
CLOSE cur1;
RETURN res;
END
Your result returns a comma delimited string, for example...
col1,col2,col3,col4,...col53
I agree that SELECT * isn't good enough here: if the column you don't need is a BLOB, as mentioned elsewhere, you don't want that overhead to creep in.
I would create a view with the required data, then you can Select * in comfort --if the database software supports them. Else, put the huge data in another table.
At first I thought you could use regular expressions, but as I've been reading the MYSQL docs it seems you can't. If I were you I would use another language (such as PHP) to generate a list of columns you want to get, store it as a string and then use that to generate the SQL.
Based on @Mahomedalid's answer, I have made some improvements to support "select all columns except some" in MySQL:
SET @database = 'database_name';
SET @tablename = 'table_name';
SET @cols2delete = 'col1,col2,col3';
SET @sql = CONCAT(
'SELECT ',
(
SELECT GROUP_CONCAT( IF(FIND_IN_SET(COLUMN_NAME, @cols2delete), NULL, COLUMN_NAME ) )
FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = @tablename AND TABLE_SCHEMA = @database
),
' FROM ',
@tablename);
SELECT @sql;
If you have a lot of columns, use this statement to raise group_concat_max_len:
SET @@group_concat_max_len = 2048;
I agree with @Mahomedalid's answer, but I didn't want to do something like a prepared statement and I didn't want to type all the fields, so what I had was a silly solution.
Go to the table in phpmyadmin->sql->select, it dumps the query: copy, replace and done! :)
While I agree with Thomas' answer (+1 ;)), I'd like to add the caveat that I'll assume the column that you don't want contains hardly any data. If it contains enormous amounts of text, xml or binary blobs, then take the time to select each column individually. Your performance will suffer otherwise. Cheers!
Just do
SELECT * FROM table WHERE whatever
Then drop the column in your favourite programming language, e.g. PHP:
while (($data = mysql_fetch_array($result, MYSQL_ASSOC)) !== FALSE) {
unset($data["id"]);
foreach ($data as $k => $v) {
echo"$v,";
}
}
The answer posted by Mahomedalid has a small problem:
Inside the REPLACE function, the code replaces "<columns_to_delete>," with "". This fails if the field to remove is the last one in the concatenated string, because the last field is not followed by a comma and so is not removed from the string.
My proposal:
SET @sql = CONCAT('SELECT ', (SELECT REPLACE(GROUP_CONCAT(COLUMN_NAME),
'<columns_to_delete>', '\'FIELD_REMOVED\'')
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = '<table>'
AND TABLE_SCHEMA = '<database>'), ' FROM <table>');
Replacing <table>, <database> and `
The removed column is replaced by the string "FIELD_REMOVED". In my case this works because I was trying to save memory. (The field I was removing is a BLOB of around 1MB.)
You can use SQL to generate SQL if you like and evaluate the SQL it produces. This is a general solution as it extracts the column names from the information schema. Here is an example from the Unix command line.
Substituting
MYSQL with your mysql command
TABLE with the table name
EXCLUDEDFIELD with excluded field name
echo $(echo 'select concat("select ", group_concat(column_name) , " from TABLE") from information_schema.columns where table_name="TABLE" and column_name != "EXCLUDEDFIELD" group by "t"' | MYSQL | tail -n 1) | MYSQL
You really only need to extract the column names this way once, to construct the column list excluding that column, and then just reuse the query you have constructed.
So something like:
column_list=$(echo 'select group_concat(column_name) from information_schema.columns where table_name="TABLE" and column_name != "EXCLUDEDFIELD" group by "t"' | MYSQL | tail -n 1)
Now you can reuse the $column_list string in queries you construct.
I wanted this too so I created a function instead.
public function getColsExcept($table,$remove){
$res =mysql_query("SHOW COLUMNS FROM $table");
while($arr = mysql_fetch_assoc($res)){
$cols[] = $arr['Field'];
}
if(is_array($remove)){
$newCols = array_diff($cols,$remove);
return "`".implode("`,`",$newCols)."`";
}else{
$length = count($cols);
for($i=0;$i<$length;$i++){
if($cols[$i] == $remove)
unset($cols[$i]);
}
return "`".implode("`,`",$cols)."`";
}
}
So how it works is that you enter the table, then a column you don't want or as in an array: array("id","name","whatevercolumn")
So in select you could use it like this:
mysql_query("SELECT ".$db->getColsExcept('table',array('id','bigtextcolumn'))." FROM table");
or
mysql_query("SELECT ".$db->getColsExcept('table','bigtextcolumn')." FROM table");
Maybe I have a solution to the discrepancy that Jan Koritak pointed out:
SELECT CONCAT('SELECT ',
( SELECT GROUP_CONCAT(t.col)
FROM
(
SELECT CASE
WHEN COLUMN_NAME = 'eid' THEN NULL
ELSE COLUMN_NAME
END AS col
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'employee' AND TABLE_SCHEMA = 'test'
) t
WHERE t.col IS NOT NULL) ,
' FROM employee' );
Table :
SELECT table_name,column_name
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'employee' AND TABLE_SCHEMA = 'test'
================================
table_name column_name
employee eid
employee name_eid
employee sal
================================
Query Result:
'SELECT name_eid,sal FROM employee'
I use this workaround, although it may be "off topic", using MySQL Workbench and the query builder:
Open the columns view
Shift-select all the columns you want in your query (in your case all but one, which is what I do)
Right click and select Send to SQL Editor -> Name Short.
Now you have the list, and you can copy and paste the query wherever you need it.
If it's always the same one column, then you can create a view that doesn't have it in it.
Otherwise, no I don't think so.
I would like to add another point of view on this problem, especially if you have a small number of columns to remove.
You could use a DB tool like MySQL Workbench to generate the select statement for you, so you just have to manually remove those columns from the generated statement and copy it to your SQL script.
In MySQL Workbench the way to generate it is:
Right click on the table -> send to Sql Editor -> Select All Statement.
The accepted answer has several shortcomings.
It fails where the table or column names requires backticks
It fails if the column you want to omit is last in the list
It requires listing the table name twice (once for the select and another for the query text) which is redundant and unnecessary
It can potentially return column names in the wrong order
All of these issues can be overcome by simply including backticks in the SEPARATOR for your GROUP_CONCAT and using a WHERE condition instead of REPLACE(). For my purposes (and I imagine many others') I wanted the column names returned in the same order that they appear in the table itself. To achieve this, here we use an explicit ORDER BY clause inside of the GROUP_CONCAT() function:
SELECT CONCAT(
'SELECT `',
GROUP_CONCAT(COLUMN_NAME ORDER BY `ORDINAL_POSITION` SEPARATOR '`,`'),
'` FROM `',
`TABLE_SCHEMA`,
'`.`',
TABLE_NAME,
'`;'
)
FROM INFORMATION_SCHEMA.COLUMNS
WHERE `TABLE_SCHEMA` = 'my_database'
AND `TABLE_NAME` = 'my_table'
AND `COLUMN_NAME` != 'column_to_omit';
I have a suggestion but not a solution.
If some of your columns hold large amounts of data, then you could try the following:
SELECT *, LEFT(col1, 0) AS col1, LEFT(col2, 0) as col2 FROM table
If you use MySQL Workbench you can right-click your table, click Send to SQL Editor and then Select All Statement. This will create a statement where all fields are listed, like this:
SELECT `purchase_history`.`id`,
`purchase_history`.`user_id`,
`purchase_history`.`deleted_at`
FROM `fs_normal_run_2`.`purchase_history`;
SELECT * FROM fs_normal_run_2.purchase_history;
Now you can just remove those that you don't want.

Dynamic UPDATE statement to update values in columns returned by a previous SELECT

In essence, what I want to do is:
find all tables and their columns that match a specific query,
update values in these columns.
So say I have something like
SELECT COLUMN_NAME, TABLE_NAME, TABLE_SCHEMA
FROM INFORMATION_SCHEMA.COLUMNS
WHERE
(
TABLE_SCHEMA = 'PUBLIC'
) AND (
COLUMN_NAME LIKE '%SOMETHING%'
OR COLUMN_NAME LIKE '%SOMETHINGELSE%'
) AND (
DATA_TYPE = 'BIGINT' OR
DATA_TYPE = 'TINYINT' OR
DATA_TYPE = 'SMALLINT' OR
DATA_TYPE = 'INTEGER'
)
Or for Oracle something like:
SELECT COLUMN_NAME, TABLE_NAME
FROM USER_TAB_COLS
WHERE
(
COLUMN_NAME LIKE '%SOMETHING%'
OR COLUMN_NAME LIKE '%SOMETHINGELSE%'
) AND
DATA_TYPE IN ('NUMBER')
I want to then do an UPDATE on all resulting columns similar to:
UPDATE _RESULTING_COLUMN_NAMES_HERE_THEORETICALLY_
SET
_SINGLE_COLUMN_NAME_ = _SOME_NEW_VALUE_
WHERE _SINGLE_COLUMN_NAME_ = _SOME_OLD_VALUE_;
Well obviously that does not work or even exist, but I hope you understand what I want to achieve.
I could see a way where you generate an UPDATE statement for each matching table from the SELECT resultset, but I don't really see how to achieve this.
To make things more fun, I'd need to do that for a list of old_value to new_value transformations.
Any ideas are welcome.
I am trying to have this work on HSQLDB and Oracle as my 2 requirements, but supporting additional platforms would be a pretty good bonus.
Anytime you think you need to use dynamic SQL, you should stop, take a step back and see if there's another way to do it, or if you REALLY need to do what you're doing .
I'd probably seriously question your base "requirement" of:
updating all columns for all tables matching some string, and of type integer (or variations thereof).
Something still smells "funny" ... I'd be VERY careful about what you're doing - make sure you know what the results are going to be, test test test .. and TEST again ... on a DEV box somewhere ...
that said, anytime I need to resort to dynamic SQL, I have found the simplest way is to start with a "template":
So in your case, the final UPDATE you want to fire is as you put it:
UPDATE _RESULTING_COLUMN_NAMES_HERE_THEORETICALLY_
SET
_SINGLE_COLUMN_NAME_ = _SOME_NEW_VALUE_
WHERE _SINGLE_COLUMN_NAME_ = _SOME_OLD_VALUE_;
Ok, I'd probably re-write that as a string now, and start a query using the WITH clause:
WITH w_template AS ( select
rtrim(q'[ UPDATE _RESULTING_COLUMN_NAMES_HERE_THEORETICALLY_ ]')||CHR(10)||
rtrim(q'[ SET ]')||CHR(10)||
rtrim(q'[ _SINGLE_COLUMN_NAME_ = _SOME_NEW_VALUE_ ]')||CHR(10)||
rtrim(q'[ WHERE _SINGLE_COLUMN_NAME_ = _SOME_OLD_VALUE_; ]')
template from dual
)
Note I haven't changed anything in your query (yet). All I did was wrap some "q'[" and "]'" around it ... an rtrim, a CHR(10) and put it in a WITH clause.
1) q'[ some string ]' is an alternate way to do a string. The advantage of it is you can have single quotes inside that string without any real issue:
ie q'[ some 'string' ]' works just fine ... prints " some 'string' "
2) RTRIM - I left spaces at end of line in there as cosmetic so it's easier for us to read. However, due to length restrictions of strings, those spaces can grow that string really big, really fast with a larger query. So RTRIM is a habit I've gotten into . Keep the cosmetic spaces, but tell Oracle not to use them ;) they're just for us.
3) CHR(10) - cosmetic only - you can leave this off if you want. I like it as if you want to dump the query during testing, you can easily read the query and see what it built.
Next we'll change the names of your dynamic values there so we can more easily spot them and substitute them:
WITH w_template AS ( select
rtrim(q'[ UPDATE <table_name> ]')||CHR(10)||
rtrim(q'[ SET ]')||CHR(10)||
rtrim(q'[ <col_name> = <col_new_val> ]')||CHR(10)||
rtrim(q'[ WHERE <col_name> = <col_old_val>; ]')
template from dual
)
All I did was create some easily identified "strings" that I'll use to substitute values in later.
Note that if your columns were strings, you might need quotes in there: <col_name> = '<col_new_val>'
but seems you're dealing with integer data .. so I think we're ok ...
Now we need to pull your data ... so we go back to your original query:
SELECT COLUMN_NAME, TABLE_NAME
FROM USER_TAB_COLS
WHERE
(
COLUMN_NAME LIKE '%SOMETHING%'
OR COLUMN_NAME LIKE '%SOMETHINGELSE%'
) AND
DATA_TYPE IN ('NUMBER')
Hmm, I'll have to trust you in your query there, I'm not sure that'll run on Oracle, but you know your query better than I do ;) So I'll trust your query "as is" for this example - as long as it picks out the data you want, and includes the table name, column name, and the before/after values you want (which it currently doesn't) we're ok.
So all we need to do is tack those two together ... we'll do this:
WITH w_template AS ( select
rtrim(q'[ UPDATE <table_name> ]')||CHR(10)||
rtrim(q'[ SET ]')||CHR(10)||
rtrim(q'[ <col_name> = <col_new_val> ]')||CHR(10)||
rtrim(q'[ WHERE <col_name> = <col_old_val>; ]')
template from dual
),
w_data AS (
SELECT COLUMN_NAME, TABLE_NAME
FROM USER_TAB_COLS
WHERE
(
COLUMN_NAME LIKE '%SOMETHING%'
OR COLUMN_NAME LIKE '%SOMETHINGELSE%'
) AND
DATA_TYPE IN ('NUMBER')
)
Then we just need to add the final query, using REPLACE to substitute values ..
(note: not sure where you get "some_new_value" and "some_old_value" from ??? you'll have to join that into your query .. )
WITH w_template AS ( select
rtrim(q'[ UPDATE <table_name> ]')||CHR(10)||
rtrim(q'[ SET ]')||CHR(10)||
rtrim(q'[ <col_name> = <col_new_val> ]')||CHR(10)||
rtrim(q'[ WHERE <col_name> = <col_old_val>; ]')
template from dual
),
w_data AS (
SELECT COLUMN_NAME, TABLE_NAME
FROM USER_TAB_COLS
WHERE
(
COLUMN_NAME LIKE '%SOMETHING%'
OR COLUMN_NAME LIKE '%SOMETHINGELSE%'
) AND
DATA_TYPE IN ('NUMBER')
)
SELECT REPLACE ( REPLACE ( REPLACE ( REPLACE (
wt.template, '<table_name>',
wd.table_name ),
'<col_name>', wd.column_name ),
'<col_new_val>', ??? ),
'<col_old_val>', ??? ) query
FROM w_template wt,
w_data wd
I left ??? there for the old / new values, since you didn't indicate where they'd come from ??
but if you run that, it should spit out some update statements .. ;)
Once you're comfortable with those, pushing them through execute immediate is the easy work.
Again, I would advise to be cautious of this approach, this is ok for a 1 off migration, or such, however, it is not advised for a daily job to be running on a regular basis. ;)
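For comparison, the template-and-substitute idea looks like this when the statements are generated by a small program instead of by SQL. This is a sketch in Python only: the table names, column names and the old/new value pairs are made up, since the question does not say where they come from.

```python
# Same placeholders as in the SQL template above.
TEMPLATE = (
    "UPDATE <table_name>\n"
    "SET <col_name> = <col_new_val>\n"
    "WHERE <col_name> = <col_old_val>;"
)

def render_updates(columns, transformations):
    """Return one UPDATE statement per (table, column) and per (old, new) pair."""
    statements = []
    for table, column in columns:
        for old, new in transformations:
            sql = (TEMPLATE.replace("<table_name>", table)
                           .replace("<col_name>", column)
                           .replace("<col_new_val>", str(new))
                           .replace("<col_old_val>", str(old)))
            statements.append(sql)
    return statements

# Hypothetical result of the data-dictionary query, plus hypothetical value mappings.
columns = [("ORDERS", "SOMETHING_ID"), ("ITEMS", "SOMETHINGELSE_NO")]
transformations = [(1, 100), (2, 200)]

statements = render_updates(columns, transformations)
for statement in statements:
    print(statement, end="\n\n")
```

As with the SQL version, it is worth reading the generated statements before running any of them.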
find all tables and their columns that match a specific query,
update values in these columns.
With HSQLDB, it is not possible to do this with just SQL. You need to write a short Java program to list the required table names and their column names, then construct an UPDATE statement per table and execute it.
With Oracle, you could write the same program in PL/SQL. But the Java language solution is compatible with both database engines.
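The shape of such a program (list the matching columns, then build and run one UPDATE per column and value pair) can be sketched as follows. The answer suggests Java; this sketch uses Python with the standard-library sqlite3 module purely as an illustration, so the metadata calls (sqlite_master, PRAGMA table_info) are SQLite's, not HSQLDB's or Oracle's, and all table and column names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, something_code INTEGER, note TEXT);
    CREATE TABLE items  (id INTEGER, somethingelse_no INTEGER);
    INSERT INTO orders VALUES (1, 1, 'a'), (2, 2, 'b');
    INSERT INTO items  VALUES (1, 1), (2, 3);
""")

def matching_columns(conn, needles, types):
    """Yield (table, column) pairs whose name contains a needle and whose declared type is in `types`."""
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        # Each PRAGMA row is (cid, name, declared type, notnull, default, pk).
        for _, name, decl_type, *_ in conn.execute(f"PRAGMA table_info({table})"):
            if decl_type.upper() in types and any(n in name.upper() for n in needles):
                yield table, name

# Hypothetical old -> new mappings; they must not chain (no new value is also an old value).
transformations = [(1, 100), (2, 200)]
targets = list(matching_columns(conn, ["SOMETHING", "SOMETHINGELSE"], {"INTEGER"}))

with conn:  # one transaction for all updates
    for table, column in targets:
        for old, new in transformations:
            conn.execute(
                f'UPDATE "{table}" SET "{column}" = ? WHERE "{column}" = ?', (new, old))

orders = conn.execute("SELECT id, something_code FROM orders ORDER BY id").fetchall()
items = conn.execute("SELECT id, somethingelse_no FROM items ORDER BY id").fetchall()
print(targets, orders, items)
```

Identifiers cannot be bound as parameters, so they are interpolated into the statement text, while the values are bound normally.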

ISDATE Function in Oracle

I am developing a web application that gets its data from an Oracle DB. The select statements are created dynamically. What I want is that, whenever I select a date field in a table, it is returned as a string in the format dd.mm.yyyy.
What I need is basically a function like isdate(COLUMN_NAME, true stmt, false stmt):
SELECT ISDATE(FirstColumn, to_char(FirstColumn,'dd.mm.yyyy'), FirstColumn)
FROM ANYTABLE
is there a way for this?
You can check to see what the data type is for that table using the data dictionary, and connect multiple versions of the same query to handle whatever data type it might be.
For example let's say you had this table:
create table tbl_char (dt varchar2(10));
insert into tbl_char values ('01.03.2013');
And then ran:
select to_char(dt, 'dd.mm.yyyy')
from tbl_char
where exists (select 'x'
from all_tab_cols
where table_name = 'TBL_CHAR'
and column_name = 'DT'
and data_type = 'DATE')
union all
select dt
from tbl_char
where exists (select 'x'
from all_tab_cols
where table_name = 'TBL_CHAR'
and column_name = 'DT'
and data_type = 'VARCHAR2')
You would get one row, "01.03.2013", as output, because only the 2nd query actually ran. The first would have returned an error if not for the filter resulting from the EXISTS subquery. Now, if we were to change that varchar field over to a date, we would get exactly the same output, only the result would technically be from the first query. The second would run and return no rows.
sql fiddle: http://sqlfiddle.com/#!4/0001d/1/0
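Since the statements are built dynamically anyway, another way to use the data dictionary is to look up the column type first and only then decide which expression to put in the select list. A rough sketch in Python with the standard-library sqlite3 module (an assumption: on Oracle the lookup would go against all_tab_cols and the formatting would use to_char, whereas here it is PRAGMA table_info and strftime; the helper name select_expr is made up):

```python
import sqlite3

def select_expr(conn, table, column):
    """Return a SQL expression that formats `column` as dd.mm.yyyy only if it is declared DATE."""
    # Each PRAGMA row is (cid, name, declared type, notnull, default, pk).
    for _, name, decl_type, *_ in conn.execute(f"PRAGMA table_info({table})"):
        if name == column:
            if decl_type.upper() == "DATE":
                return f"strftime('%d.%m.%Y', {column})"
            return column  # any other type is returned unchanged
    raise KeyError(column)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_date (dt DATE)")
conn.execute("CREATE TABLE tbl_char (dt VARCHAR(10))")
conn.execute("INSERT INTO tbl_date VALUES ('2013-03-01')")  # ISO text, as SQLite stores dates
conn.execute("INSERT INTO tbl_char VALUES ('01.03.2013')")

results = {}
for table in ("tbl_date", "tbl_char"):
    expr = select_expr(conn, table, "dt")
    results[table] = conn.execute(f"SELECT {expr} FROM {table}").fetchone()[0]
print(results)
```

Both tables end up returning the same string, which mirrors the behaviour of the UNION ALL query above.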

Oracle/PL SQL/SQL null comparison on where clause

Just a question about dealing with null values in a query.
For example I have the following table with the following fields and values
TABLEX
Column1
1
2
3
4
5
---------
Column2
null
A
B
C
null
I'm passing a variableY on a specific procedure. Inside the procedure is a cursor like this
CURSOR c_results IS
SELECT * FROM TABLEX where column2 = variableY
Now the problem is that variableY can be either null, A, B or C.
If variableY is null I want to select all records where column2 is null; otherwise, the records where column2 is A, B or C respectively.
I cannot use the above cursor/query, because if variableY is null it won't work: the comparison would have to be
CURSOR c_results IS
SELECT * FROM TABLEX where column2 IS NULL
What cursor/query should I use that will accommodate either a null or a string variable?
Sorry if my question is a bit confusing. I'm not that good in explaining things. Thanks in advance.
Either produce different SQL depending on the contents of that parameter, or alter your SQL like this:
WHERE (column2 = variableY) OR (variableY IS NULL AND column2 IS NULL)
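A quick way to check this predicate against the sample data from the question is shown below. It is a sketch in Python with the standard-library sqlite3 module standing in for Oracle (an assumption); the bound parameter :y plays the role of variableY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablex (column1 INTEGER, column2 TEXT)")
conn.executemany("INSERT INTO tablex VALUES (?, ?)",
                 [(1, None), (2, "A"), (3, "B"), (4, "C"), (5, None)])

QUERY = """
    SELECT column1 FROM tablex
    WHERE column2 = :y OR (:y IS NULL AND column2 IS NULL)
    ORDER BY column1
"""

def find(variable_y):
    """Return column1 of every row whose column2 matches variable_y, treating NULL as equal to NULL."""
    return [row[0] for row in conn.execute(QUERY, {"y": variable_y})]

null_matches = find(None)
a_matches = find("A")
print(null_matches, a_matches)
```

When the parameter is NULL the first comparison is never true, so only the second branch can match; when it is a string, only the first can.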
Oracle's Ask Tom says:
where decode( col1, col2, 1, 0 ) = 0 -- finds differences
or
where decode( col1, col2, 1, 0 ) = 1 -- finds sameness - even if both NULL
Safely Comparing NULL Columns as Equal
You could use something like:
SELECT * FROM TABLEX WHERE COALESCE(column2, '') = COALESCE(variableY, '')
(COALESCE takes the first non NULL value)
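Note this will only work when the column content cannot be '' (empty string). Otherwise the statement gives wrong results, because NULL will match '' (empty string). Also note that Oracle treats '' as NULL, so there COALESCE(column2, '') is still NULL and this form never matches two NULLs.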
(edit)
You could also consider:
SELECT * FROM TABLEX WHERE COALESCE(column2, 'a string that never occurs') = COALESCE(variableY, 'a string that never occurs')
This avoids the '' failure case.
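The difference between the two sentinels shows up as soon as the data contains an empty string. A small sketch in Python with the standard-library sqlite3 module (an assumption: SQLite keeps '' and NULL distinct, unlike Oracle, which makes the false match visible); the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablex (column1 INTEGER, column2 TEXT)")
conn.executemany("INSERT INTO tablex VALUES (?, ?)",
                 [(1, None), (2, ""), (3, "A")])

def find(sentinel, variable_y):
    """Compare column2 with variable_y after replacing NULL on both sides by `sentinel`."""
    sql = ("SELECT column1 FROM tablex "
           "WHERE COALESCE(column2, ?) = COALESCE(?, ?) ORDER BY column1")
    return [row[0] for row in conn.execute(sql, (sentinel, variable_y, sentinel))]

with_empty = find("", None)
with_unlikely = find("a string that never occurs", None)
print(with_empty)     # the empty-string row is matched as if it were NULL
print(with_unlikely)  # only the genuinely NULL row
```

The second form is only as safe as the guarantee that the sentinel never appears in the data.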
Below is similar to "top" answer but more concise:
WHERE ((column2 = variableY ) or COALESCE( column2, variableY) IS NULL)
May not be appropriate depending on the data you're looking at, but one trick I've seen (and used) is to compare NVL(fieldname,somenonexistentvalue).
For example, if AGE is an optional column, you could use:
if nvl(table1.AGE,-1) = nvl(table2.AGE,-1)
This relies on there being a value that you know will never be allowed. Age is a good example, as are salary, sequence numbers, and other numerics that can't be negative. Strings may be trickier of course: you may say that you'll never have anyone named 'xyzzymaryhadalittlelamb' or something like that, but the day you run with that assumption you KNOW they'll hire someone with that name!!
All that said: "where a = b or (a is null and b is null)" is the traditional way to solve it. Which is unfortunate, as even experienced programmers forget that part of it sometimes.
Try using the ISNULL() function: you can check whether the variable is null and, if so, substitute a default value. Comparing null to null directly is not really possible. Remember: null = null is never true.
WHERE variableY is null or column2 = variableY
for example:
create table t_abc (
id number(19) not null,
name varchar(20)
);
insert into t_abc(id, name) values (1, 'name');
insert into t_abc(id, name) values (2, null);
commit;
select * from t_abc where null is null or name = null;
--get all records
select * from t_abc where 'name' is null or name = 'name';
--get one record with name = 'name'
You could use DUMP:
SELECT *
FROM TABLEX
WHERE DUMP(column2) = DUMP(variableY);
Warning: This is not a SARGable expression, so no index will be used.
With this approach you don't need to find a value that won't exist in your data (as you do with NVL/COALESCE).

Using a function in a where clause that includes null search

I currently have a prepared statement in Java which uses the following SQL statement in the WHERE clause of my query, but I would like to re-write this into a function to limit the user parameters passed to it and possibly make it more efficient.
(
(USER_PARAM2 IS NULL AND
( COLUMN_NAME = nvl(USER_PARAM1, COLUMN_NAME) OR
(nvl(USER_PARAM1, COLUMN_NAME) IS NULL)
)
)
OR
(USER_PARAM2 IS NOT NULL AND COLUMN_NAME IS NULL)
)
USER_PARAM1 and USER_PARAM2 are passed into the prepared statement by the user.
USER_PARAM1 represents what the application user wants to search this particular COLUMN_NAME for. If the user does not include this parameter, it will default to NULL.
USER_PARAM2 was my way to allow a user to request a NULL value only search on this COLUMN_NAME. Additionally I have some server logic that sets USER_PARAM2 to 'true' if passed in by the user or NULL if it wasn't specified by the user.
The intended behavior is that if USER_PARAM2 was declared then only COLUMN_NAME values of NULL are returned. If USER_PARAM2 wasn't declared and USER_PARAM1 was declared then only COLUMN_NAME = USER_PARAM1 are returned. If neither user params are declared then all rows are returned.
Could anyone help me out on this?
Thanks in advance...
EDIT:
Just to clarify this is how my current query looks (without the other WHERE clause statements..)
SELECT *
FROM TABLE_NAME
WHERE (
(USER_PARAM2 IS NULL AND
( COLUMN_NAME = nvl(USER_PARAM1, COLUMN_NAME) OR
(nvl(USER_PARAM1, COLUMN_NAME) IS NULL)
)
)
OR
(USER_PARAM2 IS NOT NULL AND COLUMN_NAME IS NULL)
)
... and this is where I would like to get to...
SELECT *
FROM TABLE_NAME
WHERE customSearchFunction(USER_PARAM1, USER_PARAM2, COLUMN_NAME)
EDIT #2:
OK, so another co-worker helped me out with this...
CREATE OR REPLACE function searchNumber (pVal IN NUMBER, onlySearchForNull IN CHAR, column_value IN NUMBER)
RETURN NUMBER
IS
BEGIN
IF onlySearchForNull IS NULL THEN
IF pVal IS NULL THEN
RETURN 1;
ELSE
IF pVal = column_value THEN
RETURN 1;
ELSE
RETURN 0;
END IF;
END IF;
ELSE
IF column_value IS NULL THEN
RETURN 1;
ELSE
RETURN 0;
END IF;
END IF;
END;
... this seems to work in my initial trials..
SELECT *
FROM TABLE_NAME
WHERE 1=searchNumber(USER_PARAM1, USER_PARAM2, COLUMN_NAME);
... the only issues I have with it would be
1)possible performance concerns vs the complex SQL statement I started with.
2)that I would have to create similar functions for each data type.
However, the latter would be less of an issue for me.
EDIT #3 2012.02.01
So we ended up going with the solution I chose below, while using the function based approach where code/query cleanliness trumps performance. We found that the function based approach performed roughly 6x worse than using pure SQL.
Thanks everyone for the great input!
EDIT #4 2012.02.14
So looking back I noticed that applying the virtual table concept in @Alan's solution with the clarity of @danihp's solution gives a very nice overall solution in terms of clarity and performance. Here's what I now have:
WITH params AS (SELECT user_param1 AS param, user_param2 AS param_nullsOnly FROM DUAL)
SELECT *
FROM table_name, params p
WHERE ( nvl(p.param_nullsOnly, p.param) IS NULL --1)
OR p.param_nullsOnly IS NOT NULL AND column_name IS NULL --2)
OR p.param IS NOT NULL AND column_name = p.param --3)
)
-- 1) Test if all rows should be returned
-- 2) Test if only NULL values should be returned
-- 3) Test if param equals the column value
Thanks again for the suggestions and comments!
There's a simple way to pass your parameters only once and refer to them as many times as needed, using common table expressions:
WITH params AS (SELECT user_param1 AS up1, user_param2 AS up2 FROM DUAL)
SELECT *
FROM table_name, params p
WHERE ((p.up2 IS NULL
AND (column_name = NVL(p.up1, column_name)
OR (NVL(p.up1, column_name) IS NULL)))
OR (p.up2 IS NOT NULL AND column_name IS NULL))
In effect, you're creating a virtual table, where the columns are your parameters, that is populated with a single row.
Conveniently, this also ensures that all of your parameters are collected in the same place and can be specified in an arbitrary order (as opposed to the order in which they naturally appear in the query).
There are a couple of big advantages to this over a function-based approach. First, this will not prevent the use of indexes (as pointed out by @Bob Jarvis). Second, this keeps the query's logic in the query, rather than hidden in functions.
I don't know if my approach performs better, but it is the most readable:
By sending 2 additional parameters to the query, you can rewrite it like this:
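To see the single-row parameter table in action, here is a sketch in Python with the standard-library sqlite3 module (an assumption: SQLite has no DUAL and no NVL, so the CTE selects without a FROM clause and uses COALESCE; the predicate follows the three-branch version from EDIT #4 of the question, and the sample rows are invented).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (id INTEGER, column_name INTEGER)")
conn.executemany("INSERT INTO table_name VALUES (?, ?)",
                 [(1, 10), (2, 20), (3, None)])

QUERY = """
    WITH params AS (SELECT :p1 AS param, :p2 AS param_nulls_only)
    SELECT t.id
    FROM table_name t, params p
    WHERE COALESCE(p.param_nulls_only, p.param) IS NULL                 -- 1) no parameters: all rows
       OR (p.param_nulls_only IS NOT NULL AND t.column_name IS NULL)    -- 2) NULL values only
       OR (p.param IS NOT NULL AND t.column_name = p.param)             -- 3) equality match
    ORDER BY t.id
"""

def search(p1=None, p2=None):
    """Each parameter is bound once and referenced through the params CTE."""
    return [row[0] for row in conn.execute(QUERY, {"p1": p1, "p2": p2})]

all_rows = search()
equal_rows = search(p1=20)
null_rows = search(p2="true")
print(all_rows, equal_rows, null_rows)
```

Each value is bound exactly once, however many times the predicate refers to it.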
where
( P_ALL_RESULTS is not null
OR
P_ONLY_NULLS is not null AND COLUMN_NAME IS NULL
OR
P_USE_P1 is not null AND COLUMN_NAME = USER_PARAM1
)
Disclaimer: answered before OP question clarification