Null Handling in SQL Queries - sql

I have the following table:
All records
I want to get records satisfying some set of conditions.
Below is the query and it returns 3 results:
Records satisfying query
Now I want the remaining records (i.e. the records that did not satisfy the conditions in the previous query), and I expect 4 rows in return. So I execute this query:
Query for remaining records
However, record number 4 is not returned, and I know that the null value in col2 is causing this problem.
I even tried the NVL and COALESCE functions, but without any luck:
nvl_coalesce_queries
So basically, I want 4 rows from the 'NOT' query.
Please let me know if you have any suggestions.

Use something like
select *
from tmp_dbg3
where col0 not in (select col0 from tmp_dbg3 where <your 'satisfying' condition>)

Try using the NOT IN operator:
https://technet.microsoft.com/en-us/library/ms189062(v=sql.105).aspx
SELECT * FROM database
WHERE id NOT IN (SELECT id FROM database WHERE *your conditions*)
I am assuming that the first column in your table is the auto-increment key. I would personally prefer to use that, rather than col1, to identify which rows do not match your conditions.
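Why the plain NOT query loses a row, and why the NOT IN approach gets it back, can be sketched with a small script. The rows and the condition below are hypothetical (the original data was posted as images), and SQLite through Python's sqlite3 module stands in for the asker's database. The point it illustrates is standard SQL three-valued logic: a comparison with NULL yields NULL, NOT NULL is still NULL, and WHERE only keeps rows where the condition is true.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tmp_dbg3 (col0 INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT)"
)
# Hypothetical rows: col2 is NULL in record 4.
conn.executemany(
    "INSERT INTO tmp_dbg3 VALUES (?, ?, ?)",
    [(1, "a", "x"), (2, "a", "x"), (3, "a", "x"),
     (4, "b", None), (5, "b", "y"), (6, "c", "y"), (7, "c", "z")],
)

# Hypothetical 'satisfying' condition.
cond = "col2 = 'x'"


def ids(sql):
    """Return the col0 values produced by a query, in order."""
    return [row[0] for row in conn.execute(sql)]


satisfying = ids("SELECT col0 FROM tmp_dbg3 WHERE " + cond + " ORDER BY col0")

# For record 4 the condition is NULL, and NOT (NULL) is NULL, so it is dropped.
negated = ids(
    "SELECT col0 FROM tmp_dbg3 WHERE NOT (" + cond + ") ORDER BY col0"
)

# Excluding by key returns every row that the first query did not return.
remaining = ids(
    "SELECT col0 FROM tmp_dbg3 WHERE col0 NOT IN "
    "(SELECT col0 FROM tmp_dbg3 WHERE " + cond + ") ORDER BY col0"
)

print(satisfying, negated, remaining)
```

This relies on col0 never being NULL. If the column used in the NOT IN subquery can contain NULL, NOT IN itself returns no rows, so a key column is the safer choice.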

Related

select count(*)+count(*): does this SQL statement produce a result or an error?

I faced this question in an interview, with the options: error, 1, 2, 3.
I have now run it and got 2 as the result:
select count(*)+COUNT(*)
result is 2
Normally all selects are of the form SELECT [columns, scalar computations on columns, grouped computations on columns, or scalar computations] FROM [table or joins of tables, etc]
Because this allows plain scalar computations we can do something like SELECT 1 + 1 FROM SomeTable and it will return a recordset with the value 2 for every row in the table SomeTable.
Now, if we didn't care about any table, but just wanted to do our scalar computation, we might want to write something like SELECT 1 + 1. This isn't allowed by the standard, but it is useful and most databases allow it (Oracle traditionally did not, requiring SELECT 1 + 1 FROM DUAL instead).
Hence such bare SELECTs are treated as if they had a FROM clause that specified a table with one row and no columns (impossible of course, but it does the trick). Hence SELECT 1 + 1 becomes SELECT 1 + 1 FROM ImaginaryTableWithOneRow, which returns a single row with a single column with the value 2.
Mostly we don't think about this, we just get used to the fact that bare SELECTs give results and don't even think about the fact that there must be some one-row thing selected to return one row.
In doing SELECT COUNT(*) you did the equivalent of SELECT COUNT(*) FROM ImaginaryTableWithOneRow, which of course returns 1. So COUNT(*) + COUNT(*) is 1 + 1, which is 2.
Reference: Why MySQL COUNT without table name gives 1
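The one-row behaviour described above can be checked with a short script. This is only a sketch using SQLite through Python's sqlite3 module (the interview question did not name a database), and some_table is a made-up table used for the per-row case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A bare SELECT behaves as if it read from an imaginary one-row table.
bare_sum = conn.execute("SELECT 1 + 1").fetchone()[0]
bare_count = conn.execute("SELECT count(*)").fetchone()[0]
count_twice = conn.execute("SELECT count(*) + count(*)").fetchone()[0]

# With a real table, the scalar expression is evaluated once per row.
conn.execute("CREATE TABLE some_table (x INTEGER)")
conn.executemany("INSERT INTO some_table VALUES (?)", [(10,), (20,), (30,)])
per_row = [row[0] for row in conn.execute("SELECT 1 + 1 FROM some_table")]

print(bare_sum, bare_count, count_twice, per_row)
```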

SQL - Two DISTINCTs performing very poorly

I've got two tables containing a column with the same name. I try to find out which distinct values exist in Table2 but don't exist in Table1. For that I have two SELECTs:
SELECT DISTINCT Field
FROM Table1
SELECT DISTINCT Field
FROM Table2
Both SELECTs finish within 2 seconds and return about 10 rows each. If I restructure my query to find out which values are missing in Table1, the query takes several minutes to finish:
SELECT DISTINCT Field
FROM Table1
WHERE Field NOT IN
(
SELECT DISTINCT Field
FROM Table2
)
My temporary workaround is inserting the results of the second DISTINCT into a temporary table and comparing against it. But the performance still isn't great.
Does anyone know why this happens? I guess it is because SQL Server keeps recalculating the second DISTINCT, but why would it? Shouldn't SQL Server optimize this somehow?
Not sure if this will improve performance, but I'd use EXCEPT:
SELECT Field
FROM Table1
EXCEPT
SELECT Field
FROM Table2
There is no need to use DISTINCT because EXCEPT is a set operator that removes duplicates.
EXCEPT returns distinct rows from the left input query that aren’t output by the right input query.
The number and the order of the columns must be the same in all queries.
The data types must be compatible.
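To see that EXCEPT removes duplicates on its own, here is a minimal sketch with made-up data. It runs on SQLite through Python's sqlite3 module rather than SQL Server, so it says nothing about the performance difference in the question; it only compares the results of the two query shapes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Field TEXT)")
conn.execute("CREATE TABLE Table2 (Field TEXT)")
# Made-up values, with duplicates on both sides.
conn.executemany(
    "INSERT INTO Table1 VALUES (?)", [("a",), ("a",), ("b",), ("c",), ("c",)]
)
conn.executemany("INSERT INTO Table2 VALUES (?)", [("b",), ("b",), ("d",)])

EXCEPT_SQL = "SELECT Field FROM Table1 EXCEPT SELECT Field FROM Table2"
NOT_IN_SQL = (
    "SELECT DISTINCT Field FROM Table1 "
    "WHERE Field NOT IN (SELECT DISTINCT Field FROM Table2)"
)


def fields(sql):
    """Run a query and return its single column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


except_rows = fields(EXCEPT_SQL)
not_in_rows = fields(NOT_IN_SQL)

# A single NULL in Table2 makes NOT IN return nothing; EXCEPT is unaffected.
conn.execute("INSERT INTO Table2 VALUES (NULL)")
except_rows_with_null = fields(EXCEPT_SQL)
not_in_rows_with_null = fields(NOT_IN_SQL)

print(except_rows, not_in_rows, except_rows_with_null, not_in_rows_with_null)
```

The two forms agree as long as Table2.Field has no NULLs, which is worth checking before swapping one for the other.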

The Subquery which returns multiple rows in Oracle SQL

I have a complex SQL query with multiple subqueries. The query returns a large amount of data. The tables are dynamic and get updated every day. Yesterday the query failed to execute because one of the subqueries returned multiple rows.
The subquery would be something like this.
Select Value1 from Table1 where Table1.ColumnName = 123456
Table1.ColumnName will be fetched dynamically, nothing will be hardcoded. Table1.ColumnName will be fetched from another subquery which runs perfectly.
My question is:
How can I find which value in the particular subquery returned two rows?
You need to check, for each subquery, whether it returns a single row or multiple rows for a given value. You can use the COUNT function to verify this:
select column_name, count(*) from table_name
group by column_name
having count(*) > 1
Run the query above against the table used in the subquery: it counts the rows for each value, and any value that returns more than one row is the culprit.
Once you know which subquery and which column is the culprit, you could then use ROWNUM or analytic functions to limit the number of rows.
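As a rough illustration of the duplicate check, the script below builds a tiny table and runs the GROUP BY / HAVING query against it. The data is invented, and SQLite through Python's sqlite3 module stands in for Oracle; the query itself is plain SQL and should carry over.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (ColumnName INTEGER, Value1 TEXT)")
# Invented rows: the key 123456 appears twice, the others once.
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [(123456, "first"), (123456, "second"), (111, "only"), (222, "only")],
)

# Every key for which the scalar subquery would return more than one row.
culprits = conn.execute(
    "SELECT ColumnName, count(*) FROM Table1 "
    "GROUP BY ColumnName HAVING count(*) > 1"
).fetchall()

print(culprits)
```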

Python/SQL query to use ROWID to realign out of sync data in SQLite3 db

I have an SQLite3 table with 4 columns, "date", "price", "price2" and "vol". There are about 200k lines of data, but the last 2 columns are out of sync by 798 rows. That is, the values of the last two columns in row 1 actually correspond to the values of the first two columns at row 798.
I am using Python 2.7.
I was thinking there must be a way of using the ROWID column as a unique identifier, where I can extract the first two columns, then extract the last two columns, and rejoin them based upon "ROWID+798" or something like that.
Is this possible, and if so, would anyone know how to do this?
I'm curious how your database could get corrupted like that, and sceptical about your assessment that you know exactly what is wrong. If something like this could happen, it seems likely that there are many other problems.
In any case, the query you describe should look like this, if I understood correctly.
In most DBMSs you could do this with a single subquery using the row-value syntax (col1, col2) = (SELECT ...). SQLite did not accept that syntax when this was written (row values were only added in SQLite 3.15), so each column gets its own subquery:
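-- Written without an alias on the UPDATE target (SQLite may reject
-- "UPDATE table1 t"); the inner copy of the table is aliased as src.
UPDATE table1 SET
col1 =
(SELECT src.col1 FROM table1 AS src
WHERE src.rowid = table1.rowid + 798),
col2 =
(SELECT src.col2 FROM table1 AS src
WHERE src.rowid = table1.rowid + 798)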
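The rejoin idea from the question can also be sketched as a self-join on rowid, which reads the misaligned table without updating it in place. This is only a sketch under assumptions: the table name prices, the sample values, and the small OFFSET of 2 (standing in for 798) are all hypothetical, and it assumes the price2/vol values stored at rowid r belong to the date/price stored at rowid r + OFFSET. Rows without a partner are dropped by the inner join.

```python
import sqlite3

OFFSET = 2  # hypothetical; the question's table would use its own offset

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prices (date TEXT, price REAL, price2 REAL, vol INTEGER)"
)
# Invented data: price2/vol in row r really belong to row r + OFFSET.
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?, ?)",
    [
        ("d1", 1.0, 30.0, 300),
        ("d2", 2.0, 40.0, 400),
        ("d3", 3.0, 50.0, 500),
        ("d4", 4.0, None, None),
        ("d5", 5.0, None, None),
    ],
)

# Pair date/price from row a with price2/vol from the row OFFSET earlier.
fixed = conn.execute(
    "SELECT a.date, a.price, b.price2, b.vol "
    "FROM prices AS a "
    "JOIN prices AS b ON b.rowid = a.rowid - ? "
    "ORDER BY a.rowid",
    (OFFSET,),
).fetchall()

# Store the realigned rows in a separate table, leaving the original intact.
conn.execute(
    "CREATE TABLE prices_fixed (date TEXT, price REAL, price2 REAL, vol INTEGER)"
)
conn.executemany("INSERT INTO prices_fixed VALUES (?, ?, ?, ?)", fixed)

print(fixed)
```

Writing into a new table keeps the original data available for comparison, which seems prudent given the doubts about how the table got out of sync in the first place.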

Apply the same aggregate to every column in a table

I am using a proprietary MPP database that was forked from PostgreSQL 8.3. I want to apply a simple count to a wide table (around 450 columns), and I was wondering whether the best way to do this is a simple SQL function. I am just counting the number of distinct values in a given column as well as the number of null values in that column.
For example, to run the query against the column names, I write:
select
count(distinct names) d_names,
sum(case when names is not null then 1 else 0 end) n_s_ip
from table;
How do I generalize the query above to iterate through every column in the table, given that there are 450 columns, without writing out each column name by hand?
First, since COUNT() only counts non-null values, your query can be simplified:
SELECT count(DISTINCT names) AS unique_names
,count(names) AS names_not_null
FROM table;
But that's the number of non-null values and contradicts your description:
count of the number of null values in the column
For that you would use:
count(*) - count(names) AS names_null
This works because count(*) counts all rows, while count(names) counts only the rows with non-null names.
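The three counts can be compared side by side on a few rows. The table people and its contents are invented for this sketch, and SQLite through Python's sqlite3 module stands in for the PostgreSQL-derived database; count() skips NULLs in the same way there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (names TEXT)")
# Invented rows: two distinct names, three non-null values, two NULLs.
conn.executemany(
    "INSERT INTO people VALUES (?)",
    [("ann",), ("bob",), ("ann",), (None,), (None,)],
)

unique_names, names_not_null, names_null = conn.execute(
    "SELECT count(DISTINCT names), "
    "       count(names), "
    "       count(*) - count(names) "
    "FROM people"
).fetchone()

print(unique_names, names_not_null, names_null)
```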
Removed inferior alternative after hint by @Andriy.
To automate that for all columns build an SQL statement off of the catalog table pg_attribute dynamically. You can use EXECUTE in a PL/pgSQL function to execute it immediately. Find full code examples with links to the manual and explanation under these closely related questions:
How to perform the same aggregation on every column, without listing the columns?
postgresql - count (no null values) of each column in a table
You can generate the repetitive part of the query by using information_schema.columns.
select 'count(distinct '||column_name||') d_'||column_name||', sum(case when '||column_name||' is not null then 1 else 0 end) n_'||column_name||','
from information_schema.columns where table_name='table'
order by ordinal_position;
The above query will generate a count(...) and a sum(...) expression for each column of the table. The result can be used as the select list for your query. You can copy and paste the result into the following query:
select
-- paste here
from table;
After pasting, you have to remove the trailing comma.
In this way, you can avoid writing the select list for 450 columns by hand.
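The same generate-then-run idea can be scripted end to end, which avoids the copy-and-paste step and the trailing comma. The sketch below is an illustration under assumptions, not the asker's setup: it uses SQLite through Python's sqlite3 module, where PRAGMA table_info plays the role of information_schema.columns, and the three-column table wide is hypothetical. It emits the null count as count(*) - count(col).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the 450-column table.
conn.execute("CREATE TABLE wide (a TEXT, b INTEGER, c TEXT)")
conn.executemany(
    "INSERT INTO wide VALUES (?, ?, ?)",
    [("x", 1, None), ("x", 2, None), ("y", None, "k")],
)

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(wide)")]

parts = []
for col in columns:
    # Quote the identifier so unusual column names do not break the SQL.
    quoted = '"' + col.replace('"', '""') + '"'
    parts.append("count(DISTINCT {q})".format(q=quoted))
    parts.append("count(*) - count({q})".format(q=quoted))

# Joining with ", " means there is no trailing comma to remove.
sql = "SELECT " + ", ".join(parts) + " FROM wide"
row = conn.execute(sql).fetchone()

# Map each column to (distinct non-null values, number of NULLs).
stats = {col: (row[2 * i], row[2 * i + 1]) for i, col in enumerate(columns)}

print(stats)
```

On PostgreSQL the column list would come from information_schema.columns or pg_attribute instead, as the answers above describe.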