Reject a row based on 2 column values - sql

Below is the output of a simple join query. The three columns each come from a different table.
Col1    Col2   Col3
Manual  Y-Yes  Include
MC      Y-Yes  Include
Manual  Y-Yes  Exclude
Manual  Y-Yes  Exclude
I need to get the rows with 'Include' only if there is no 'Exclude' for the same Col1 value.
If there is no 'Exclude' for the Col1 value, then it's fine to display 'Include'.
So the query should not display the first row, since the Col1 value 'Manual' also has 'Exclude' rows.

Your SQL query should look a lot like your question would in English:
You want all the rows where there is no row for the same col1 value that has 'Exclude' in col3, right?
I cannot give exact SQL since you did not provide table or column names, but if all three columns were in the same table, it would look like this:
Select * from mytable t
where not exists
(select * from mytable x
where x.col1 = t.col1
and x.col3 = 'Exclude')
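As a minimal sketch of the NOT EXISTS approach, the snippet below loads the four sample rows into an in-memory SQLite database through Python's sqlite3 module and runs the filter. The single table `mytable`, the alias names and the choice of SQLite are assumptions for illustration; in the question the columns come from a join of three tables, so the real query would apply the same filter to that join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 TEXT, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("Manual", "Y-Yes", "Include"),
        ("MC", "Y-Yes", "Include"),
        ("Manual", "Y-Yes", "Exclude"),
        ("Manual", "Y-Yes", "Exclude"),
    ],
)

# Keep a row only if no row with the same col1 is marked 'Exclude'.
rows = conn.execute(
    """
    SELECT t.col1, t.col2, t.col3
    FROM mytable t
    WHERE NOT EXISTS (
        SELECT 1
        FROM mytable x
        WHERE x.col1 = t.col1
          AND x.col3 = 'Exclude'
    )
    """
).fetchall()

print(rows)  # [('MC', 'Y-Yes', 'Include')]
```

Only the 'MC' row survives, because every 'Manual' row shares its col1 value with at least one 'Exclude' row.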

Related

DISTINCT for large table

I have a very large table, and one column holds strings like TypeA, TypeB, etc. I would like to write a query with a CASE expression on that column:
CASE WHEN col1 = 'TypeA' Then '25'
WHEN col1 = 'TypeB' Then '28'
...
WHEN col1 = '????' Then '15'
END
but I do not know how many unique values that column has or what they are (they are words or phrases of up to 3 words).
I know I could find those unique values with
SELECT DISTINCT col1 FROM table1
or
SELECT col1 FROM table1 GROUP BY col1
but because of the size of the table, the query runs seemingly forever.
Can I do this efficiently? I only want to find all the unique values of a single column.
It seems better to create a separate table holding the unique values. You can then join on that table, so the set of values stays open-ended, and you can replace the field in the original table with a reference to the value table.
DISTINCT on the original table is slow because there does not seem to be an index on col1. Put an index or primary key on col1 of the value table.
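A rough sketch of the value-table idea, using SQLite through Python's sqlite3 module. The table name `col1_values`, the column `mapped`, the sample rows and the value 'TypeC' are all hypothetical; note that building the value table still needs one full scan of the large table, so the gain is for every later query, not for the first one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, col1 TEXT)")
conn.executemany(
    "INSERT INTO table1 (col1) VALUES (?)",
    [("TypeA",), ("TypeB",), ("TypeA",), ("TypeC",), ("TypeB",)],
)

# One-time cost: build the lookup table of distinct values.
# col1 is the primary key here, so later lookups and joins can use its index.
conn.execute("CREATE TABLE col1_values (col1 TEXT PRIMARY KEY, mapped TEXT)")
conn.execute("INSERT INTO col1_values (col1) SELECT DISTINCT col1 FROM table1")

# Store the mapping that the CASE expression would otherwise hard-code.
conn.executemany(
    "UPDATE col1_values SET mapped = ? WHERE col1 = ?",
    [("25", "TypeA"), ("28", "TypeB"), ("15", "TypeC")],
)

# The join takes the place of the CASE expression.
rows = conn.execute(
    """
    SELECT t.id, t.col1, v.mapped
    FROM table1 t
    JOIN col1_values v ON v.col1 = t.col1
    ORDER BY t.id
    """
).fetchall()

for row in rows:
    print(row)
```

New values only need a new row in the value table, not an edit to every query that contains the CASE expression.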

Best way to compare three columns in sql Hive

I need to compare 3 columns containing string dates ('yyyy-mm-dd') in Hive SQL. Please take into consideration that the table has more than 2 million records.
Consider three columns (col1; col2; col3) from table T1, I must guarantee that:
col1 = col2, and both, or at least one is different from col3.
My best regards,
Logically, the condition can be simplified.
col1 = col2
Therefore, if col1 != col3, then col2 != col3.
So it is really enough to use:
select * from T1 where col1 = col2 and col1 != col3;
This filter can be applied map-side, so a WHERE clause is likely good enough.
If you instead wanted to require that 2 out of the 3 match, you could use GROUP BY with HAVING to reduce comparisons.
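The simplified filter can be checked on a few hand-made rows. This is only a sketch: it runs on SQLite through Python's sqlite3 module rather than on Hive, and the sample dates are made up. The last row illustrates a point the answer does not mention: under standard SQL NULL semantics, a comparison with NULL is neither true nor false, so a row with a NULL col3 is dropped as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T1 (col1 TEXT, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO T1 VALUES (?, ?, ?)",
    [
        ("2020-01-01", "2020-01-01", "2020-02-01"),  # col1 = col2, differs from col3: kept
        ("2020-01-01", "2020-01-01", "2020-01-01"),  # all three equal: dropped
        ("2020-01-01", "2020-03-01", "2020-02-01"),  # col1 != col2: dropped
        ("2020-01-01", "2020-01-01", None),          # col3 is NULL: dropped
    ],
)

rows = conn.execute(
    "SELECT * FROM T1 WHERE col1 = col2 AND col1 != col3"
).fetchall()

print(rows)  # [('2020-01-01', '2020-01-01', '2020-02-01')]
```

If rows with a NULL col3 should count as "different", the condition needs an explicit `OR col3 IS NULL`.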

SQL - get only columns from a table where not all values are nulls

SQL question:
How do I get all column values from columns where not all values are null?
Table A
COL1  COL2  COL3  COL4   COL5
---------------------------------------
abc   1     NULL  NULL   NULL
def   2     NULL  testA  NULL
NULL  3     NULL  testB  NULL
jkl   4     NULL  NULL   NULL
I want to get
COL1  COL2  COL4
-----------------------
abc   1     NULL
def   2     testA
NULL  3     testB
jkl   4     NULL
Is there a SQL or PL/SQL solution to achieve this?
To avoid answers that are irrelevant: assume I have a million rows.
I want the result to be a view or a result table, not written output.
I found a similar question, but it does not satisfy my need:
How to select columns from a table which have non null values?
The column names can be quickly grabbed through this query
select column_name
from all_tab_columns
where lower(table_name)='tablea' and num_distinct > 0;
I understand I could create a script with a cursor and then loop through it, adding the result to a new table or view.
This is not what I need. I wondered if this could be done using a single query, using pivot/unpivot or something else.
What you are asking for is not possible in plain SQL, unless you know ahead of time which columns contain only NULL. (It seems that you don't want to assume that you know that.)
Which columns are included in the output (how many columns, their names, and the order in which they appear) must be hard-coded in the SELECT clause; it can't be determined at runtime. On the other hand, you will know which columns are all-NULL only after reading the data (meaning, at runtime), or else you must have that information from an external source.
The only way to do what you seem to want is with dynamic SQL. That is an advanced topic, and a technique generally considered a poor business practice.
WHY do you not want to show columns with all-NULL values? Are you sure that requirement is meaningful?
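To make the dynamic SQL point concrete, here is a hedged sketch in Python against SQLite rather than Oracle. The view name `TableA_non_null` is hypothetical, and `PRAGMA table_info` is SQLite's catalog lookup, standing in for a query on `all_tab_columns`. The point is the two-phase shape: the data must be read first, and only then can the text of the SELECT list be assembled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TableA (COL1 TEXT, COL2 INTEGER, COL3 TEXT, COL4 TEXT, COL5 TEXT)"
)
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?, ?, ?, ?)",
    [
        ("abc", 1, None, None, None),
        ("def", 2, None, "testA", None),
        (None, 3, None, "testB", None),
        ("jkl", 4, None, None, None),
    ],
)

# Phase 1a: read the column names from the catalog (field 1 is the name).
columns = [row[1] for row in conn.execute("PRAGMA table_info(TableA)")]

# Phase 1b: COUNT(col) ignores NULLs, so a count of 0 means the column is all-NULL.
counts_sql = "SELECT " + ", ".join(f'COUNT("{c}")' for c in columns) + " FROM TableA"
counts = conn.execute(counts_sql).fetchone()
kept = [c for c, n in zip(columns, counts) if n > 0]

# Phase 2: only now can the SELECT list be written; this is the dynamic part.
# The names come from the catalog, not from user input, and are quoted.
view_sql = (
    "CREATE VIEW TableA_non_null AS SELECT "
    + ", ".join(f'"{c}"' for c in kept)
    + " FROM TableA"
)
conn.execute(view_sql)

rows = conn.execute("SELECT * FROM TableA_non_null ORDER BY COL2").fetchall()
print(kept)
for row in rows:
    print(row)
```

The view's column list is frozen at creation time, so it goes stale if an all-NULL column later receives a value; that is one reason to question the requirement itself.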
Try these steps; they may help:
CREATE TABLE temp AS SELECT * FROM TableA;
DECLARE
  NbrRows       NUMBER(10);
  CountNullRows NUMBER(10);
BEGIN
  SELECT COUNT(*) INTO NbrRows FROM TableA;
  -- Use COUNT(*) here: COUNT(COL1) skips NULLs and would always return 0
  SELECT COUNT(*) INTO CountNullRows FROM TableA WHERE COL1 IS NULL;
  IF NbrRows = CountNullRows THEN
    -- DDL cannot run directly inside PL/SQL, so use EXECUTE IMMEDIATE
    EXECUTE IMMEDIATE 'ALTER TABLE temp DROP COLUMN COL1';
  END IF;
  SELECT COUNT(*) INTO CountNullRows FROM TableA WHERE COL2 IS NULL;
  IF NbrRows = CountNullRows THEN
    EXECUTE IMMEDIATE 'ALTER TABLE temp DROP COLUMN COL2';
  END IF;
  SELECT COUNT(*) INTO CountNullRows FROM TableA WHERE COL3 IS NULL;
  IF NbrRows = CountNullRows THEN
    EXECUTE IMMEDIATE 'ALTER TABLE temp DROP COLUMN COL3';
  END IF;
  -- ...etc...
END;
/
Do the same thing for all your columns.
You then have the desired result in the temp table.

Create column with values from multiple columns in SQL

I want to add a column to a table that includes value from one of two columns, depending on which row contains the value.
For instance,
SELECT
concat("Selecting Col1 or 2", cast("Col1" OR "Col2" as string)) AS relevantinfo,
FROM table
I do not know much SQL and I know this query does not work. Is this even possible to do?
Col1 Col2
1
4
3
4
5
FINAL RESULT
Col1 Col2 relevantinfo
1 1
4 4
3 3
4 4
5 5
You can use the COALESCE function, which will return the first non-null value in the list.
SELECT col1, col2, COALESCE(col1, col2) AS col3
FROM t1;
Working example: http://sqlfiddle.com/#!9/05a83/1
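For readers who cannot open the fiddle, a small local sketch of the same COALESCE query, run on SQLite through Python's sqlite3 module. Which column holds each value is an assumption, since the question's sample table lost its blank cells.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (col1 INTEGER, col2 INTEGER)")

# Assumed layout: each row has a value in exactly one of the two columns.
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?)",
    [(1, None), (None, 4), (3, None), (None, 4), (5, None)],
)

# COALESCE returns its first non-NULL argument.
rows = conn.execute(
    "SELECT col1, col2, COALESCE(col1, col2) AS relevantinfo FROM t1 ORDER BY rowid"
).fetchall()

for row in rows:
    print(row)
```

If both columns held a value in the same row, COALESCE would return col1 and silently ignore col2.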
I wouldn't alter the table structure to add redundant information that can be retrieved with a simple query.
I would rather use a query with IFNULL/ISNULL(ColumnName, 'Value to use if the col is null'):
--This will work only if there can't be a value in both columns at the same time
--MySQL
SELECT CONCAT(IFNULL(Col1,''),IFNULL(Col2,'')) as relevantinfo FROM Table
--SQL Server
SELECT CONCAT(ISNULL(Col1,''),ISNULL(Col2,'')) as relevantinfo FROM Table

INSERT INTO Table WHERE

I am working with Postgres and Python (psycopg2).
I am looking for a way to INSERT data into a table.
Assume a table with 10 rows, with id going from 1 to 10. If I pick one row with a WHERE condition (e.g. id = 3), all of its columns hold a value except two, col3 and col4. That is, col1, col2 and col5 have values, while col3 and col4 are NULL, which is why they are empty in the first place.
I would like to fill these 2 columns with some data.
I am looking for something like:
INSERT INTO table_a (col3,col4) WHERE id = 3 VALUES ...
Bottom line: I would like to find the row and fill its two empty columns with my data.
Sounds like you're looking for an update statement, not an insert statement:
UPDATE table_a
SET col3 = 'some_value', col4 = 'some_other_value'
WHERE id = 3
I think you want UPDATE, not INSERT:
UPDATE table_a
SET col3 = ?,
col4 = ?
WHERE id = 3;
INSERT inserts new rows into a table. UPDATE updates existing rows.
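A minimal sketch of the parameterized UPDATE from Python. It uses the standard-library sqlite3 module so that it runs without a database server; this is a stand-in, not the asker's setup, and the sample values are made up. With psycopg2 the placeholder is `%s` instead of `?`, and the connection comes from `psycopg2.connect(...)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_a ("
    "id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT, col3 TEXT, col4 TEXT, col5 TEXT)"
)
# Ten rows with col3 and col4 left NULL, as in the question.
conn.executemany(
    "INSERT INTO table_a (id, col1, col2, col5) VALUES (?, ?, ?, ?)",
    [(i, "a", "b", "e") for i in range(1, 11)],
)

# UPDATE changes the existing row; values are passed as parameters,
# never formatted into the SQL string.
cur = conn.execute(
    "UPDATE table_a SET col3 = ?, col4 = ? WHERE id = ?",
    ("some_value", "some_other_value", 3),
)
conn.commit()

updated = cur.rowcount  # number of rows the UPDATE touched
row = conn.execute("SELECT col3, col4 FROM table_a WHERE id = 3").fetchone()
total = conn.execute("SELECT COUNT(*) FROM table_a").fetchone()[0]

print(updated, row, total)
```

The row count stays at 10: no row was added, and only the row with id = 3 changed.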
If the row already exists, you are looking for an update rather than an insert.
UPDATE table
SET col3 = valueY,
col4 = valueX
WHERE id = 3
Does this make sense to you?