So I am comparing two Oracle databases by grabbing random rows in database A, and searching for these rows in database B based off their key columns. Then I compare the rows which are returned in java.
I am using the following query to find rows in database B using the key columns from database A:
select * from mytable
Where (Key_Column_A,Key_Column_B,Key_Column_C)
in (('1','A', 'cat'),('2','B', 'dog'),('3','C', ''));
This works just fine for the first two sets of keys, but the third key('3','C', '') does not work because there is a null value in the third column. Changing the statement to ('3','C', NULL) or changing the SQL to
select * from mytable
Where (Key_Column_A,Key_Column_B,Key_Column_C)
in ((('1','A', 'cat'),('2','B', 'dog'),('3','C', ''))
OR (Key_Column_A,Key_Column_B,Key_Column_C) IS NULL);
will not work either.
Is there a way to include a null column in an IN clause? And if not, is there a way to efficiently do the same thing? (My only solution currently is to create a check to make sure there are no nullable columns in my keys which would make this process rather unefficient and somewhat messy).
You can use it this way. I think it would work.
select * from mytable
Where (NVL(Key_Column_A,''),NVL(Key_Column_B,''),NVL(Key_Column_C,''))
in (('1','A', 'cat'),('2','B', 'dog'),('3','C', ''));
I am not sure about this (Key_Column_A,Key_Column_B,Key_Column_C) IS NULL. Wouldn't this imply that all of the columns (A,B,C) are NULL ?
Related
I have been trying to find an optimal solution to select unique values from each column. My problem is I don't know column names in advance since different table has different number of columns. So first, I have to find column names and I could use below query to do it:
select column_name from information_schema.columns
where table_name='m0301010000_ds' and column_name like 'c%'
Sample output for column names:
c1, c2a, c2b, c2c, c2d, c2e, c2f, c2g, c2h, c2i, c2j, c2k, ...
Then I would use returned column names to get unique/distinct value in each column and not just distinct row.
I know a simplest and lousy way is to write select distict column_name from table where column_name = 'something' for every single column (around 20-50 times) and its very time consuming too. Since I can't use more than one distinct per column_name, I am stuck with this old school solution.
I am sure there would be a faster and elegant way to achieve this, and I just couldn't figure how. I will really appreciate any help on this.
You can't just return rows, since distinct values don't go together any more.
You could return arrays, which can be had simpler than you may have expected:
SELECT array_agg(DISTINCT c1) AS c1_arr
,array_agg(DISTINCT c2a) AS c2a_arr
,array_agg(DISTINCT c2b) AS c2ba_arr
, ...
FROM m0301010000_ds;
This returns distinct values per column. One array (possibly big) for each column. All connections between values in columns (what used to be in the same row) are lost in the output.
Build SQL automatically
CREATE OR REPLACE FUNCTION f_build_sql_for_dist_vals(_tbl regclass)
RETURNS text AS
$func$
SELECT 'SELECT ' || string_agg(format('array_agg(DISTINCT %1$I) AS %1$I_arr'
, attname)
, E'\n ,' ORDER BY attnum)
|| E'\nFROM ' || _tbl
FROM pg_attribute
WHERE attrelid = _tbl -- valid, visible table name
AND attnum >= 1 -- exclude tableoid & friends
AND NOT attisdropped -- exclude dropped columns
$func$ LANGUAGE sql;
Call:
SELECT f_build_sql_for_dist_vals('public.m0301010000_ds');
Returns an SQL string as displayed above.
I use the system catalog pg_attribute instead of the information schema. And the object identifier type regclass for the table name. More explanation in this related answer:
PLpgSQL function to find columns with only NULL values in a given table
If you need this in "real time", you won't be able to archive it using a SQL that needs to do a full table scan to archive it.
I would advise you to create a separated table containing the distinct values for each column (initialized with SQL from #Erwin Brandstetter ;) and maintain it using a trigger on the original table.
Your new table will have one column per field. # of row will be equals to the max number of distinct values for one field.
For on insert: for each field to maintain check if that value is already there or not. If not, add it.
For on update: for each field to maintain that has old value != from new value, check if the new value is already there or not. If not, add it. Regarding the old value, check if any other row has that value, and if not, remove it from the list (set field to null).
For delete : for each field to maintain, check if any other row has that value, and if not, remove it from the list (set value to null).
This way the load mainly moved to the trigger, and the SQL on the value list table will super fast.
P.S.: Make sure to pass all you SQL from trigger to explain plan to make sure they use best index and execution plan as possible. For update/deletion, just check if old value exists (limit 1).
I am stuck in the following query. This was working properly on mySQL but it gives error on MSSQL-2005. The main purpose of the query is to copy data from one table to another without duplicates based on multiple columns comparison from both tables.
I can do this to compare one column for duplication, but I can't do when I compare more then one column for duplication.
Here is my query.
INSERT INTO eBayStockTaking (OrderLineItemID,Qty,SKU,SubscriberID,eBayUserID)
SELECT OrderLineItemID,Qty,SKU,SubscriberID,eBayUserID
FROM tempEBayStockTaking WHERE (OrderLineItemID,SubscriberID,eBayUserID)
Not In (SELECT OrderLineItemID,SubscriberID,eBayUserID FROM eBayStockTaking)
Note: I have been through many similar questions but all in vain.
Thanks
Rather try NOT EXISTS
Something like
INSERT INTO eBayStockTaking (OrderLineItemID,Qty,SKU,SubscriberID,eBayUserID)
SELECT OrderLineItemID,
Qty,
SKU,
SubscriberID,
eBayUserID
FROM tempEBayStockTaking t
WHERE Not EXISTS (
SELECT *
FROM eBayStockTaking e
WHERE e.OrderLineItemID = t.OrderLineItemID
AND e.SubscriberID = t.SubscriberID
AND e.eBayUserID = t.eBayUserID)
)
I know MySQL allows Row Subqueries, nut SQL Server does not allow this.
I have a unique column. I also have a known set of elements that are possible values for the column. I need to know which of the possible values are not already in the table, and as such, are suitable for insertion.
Is this possible with SQL or is post processing required?
Currently, I am using the "in" operator to select all rows where the column value equals an element in my set. Then I remove all matched elements from my set via post processing.
Stick the allowed values in a temporary table allowed, then use a subquery using NOT IN:
SELECT *
FROM allowed
WHERE allowed.val NOT IN (
SELECT maintable.val
)
Some DBs will allow you to build up a table "in-place", instead of having to create a separate table. E.g. in PostgreSQL (any version):
SELECT *
FROM (
SELECT 'foo'
UNION ALL SELECT 'bar'
UNION ALL SELECT 'baz' -- etc.
) inplace_allowed
WHERE inplace_allowed.val NOT IN (
SELECT maintable.val
)
More modern versions of PostgreSQL (and perhaps other DBs) will let you use the slightly nicer VALUES syntax to do the same thing.
To do this entirely in SQL you will need to create a separate table with one column. Each row holds one value from the known set of elements. Assuming the table is called ElementList and the other table is called Existing:
SELECT * FROM ElementList WHERE Element NOT IN
(SELECT DISTINCT Element FROM Existing)
Depending on what database engine you're using you may be able to use a temporary table to create and hold the list without saving it permanently in the database. However, storing the list of allowed elements is valuable for constraining the Element column in the Existing table (and for presenting the user with allowed Elements in the user interface).
I have 2 sqlite databases, and I'm trying to insert data from one database to another. For example, "db-1.sqlite" has a table '1table' with 2 columns ('name', 'state'). Also, "db-2.sqlite" has a table '2table' with 2 columns ('name', 'url'). Both tables contain a list of 'name' values that are mostly common with each other but randomized, so the id of each row does not match.
I want to insert the values for the 'url' column into the db-1's table, but I want to make sure each url value goes to its corresponding 'name' value.
So far, I have done this:
> sqlite3 db-1.sqlite
sqlite> alter table 1table add column url;
sqlite> attach database 'db-2.sqlite' as db2;
Now, the part I'm not sure about:
sqlite> insert into 1table(url) select db2.2table.url from db2.2table where 1table.name==db2.2table.name
If you look at what I wrote above, you can tell what I'm trying to accomplish, but it is incorrect. If I can get any help on the matter, I'd be very grateful!!
The equality comparison operator in SQL is =, not ==.
Also, I suspect that you should be updating 1table, rather then inserting in it.
Finally, your table names start with digits, so you need to escape them.
This SQL should work better:
update `1table`
set url = (select db2.`2table`.url
from db2.`2table`
where `1table`.name = db2.`2table`.name);
imagine there are 50 columns. I dont wan't any row that includes a null value. Are there any tricky way?
SQL 2005 server
Sorry, not really. All 50 columns have to be checked in one form or another.
Column1 IS NOT NULL AND ... AND Column50 IS NOT NULL
Of course, under these conditions why not disallow NULLs in the first place by having NOT NULL in the table definition
If it's SQL Server 2005+ you can do something like:
SELECT fields
FROM MyTable
WHERE stuff
EXCEPT -- This excludes the below results
SELECT fields
FROM MyTable
WHERE (Col1 + Col2 + Col3....) IS NULL
Adding a null to a value results in a null, so the sum of all your columns will be NULL.
This may need to change based on your data types, but adding NULL to either a char/varchar or a number will result in another NULL.
If you are looking at the values not being null, you can do this in the select statement.
SELECT ISNULL(firstname,''), ISNULL(lastname,'') FROM TABLE WHERE SOMETHING=1
This will replace nulls with string blanks. If you want another value use: ISNULL(firstname,'empty') for example. You can use anything where the word empty is.
I prefer this query
select *
from table
where column1>''
and column2>''
and (column3>'' or column3<'')
Allows sql server to use an index seek if the proper index/es exist. you would have to do the syntext for column 3 for any numeric values that could be negative.