How can I return a list of row names and column names where the value is greater than 0 in SQL?

I've put together a reconciliation tool in SQL Server which identifies the number of record breaks by field (col 2 - col 4) between two identical (data types/structure) sources. The output returned is in the format below, grouped on col 1.
Col1  Col2  Col3  Col4
X     0     0     1
Y     0     1     1
Z     1     0     1
I am trying to manipulate the output so that it lists the Col1 identifier alongside the name of each column (Col2 - Col4) that has breaks (value > 0).
The expected output based on the above data would look like this.
Col1  FieldBreak
X     Col4
Y     Col3
Y     Col4
Z     Col2
Z     Col4
I'm new to SQL (6 months of professional experience) and am stuck. Any help would be much appreciated!

In any database, you can use:
select col1, 'col2' as col
from t
where col2 > 0
union all
select col1, 'col3' as col
from t
where col3 > 0
union all
select col1, 'col4' as col
from t
where col4 > 0;
There are probably more efficient methods, but those depend on the database. And for a small table efficiency may not be a concern.
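The UNION ALL approach is easy to sanity-check on the question's own rows, here in SQLite via Python (the break condition is written as > 0 to match the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 INT, col3 INT, col4 INT)")
conn.executemany("INSERT INTO t VALUES (?,?,?,?)",
                 [("X", 0, 0, 1), ("Y", 0, 1, 1), ("Z", 1, 0, 1)])

# One branch per column; each branch keeps only rows where that column breaks.
rows = conn.execute("""
    SELECT col1, 'col2' AS col FROM t WHERE col2 > 0
    UNION ALL
    SELECT col1, 'col3' AS col FROM t WHERE col3 > 0
    UNION ALL
    SELECT col1, 'col4' AS col FROM t WHERE col4 > 0
    ORDER BY col1, col
""").fetchall()
print(rows)
```

This matches the expected output above: X breaks on Col4, Y on Col3 and Col4, Z on Col2 and Col4.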
In SQL Server, you would unpivot using apply:
select t.col1, v.col
from t cross apply
     (values ('col2', t.col2), ('col3', t.col3), ('col4', t.col4)
     ) v(col, val)
where v.val > 0;
If you have a lot of columns, you can construct the expression using a SQL statement (from INFORMATION_SCHEMA.COLUMNS) and/or using a spreadsheet.
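As a sketch of generating that VALUES list from metadata: in SQL Server you would read column names from INFORMATION_SCHEMA.COLUMNS; SQLite has no INFORMATION_SCHEMA, so PRAGMA table_info plays the same role in this Python example. The generated statement is T-SQL and is only built here, not executed:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 INT, col3 INT, col4 INT)")

# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk) per column.
cols = [r[1] for r in conn.execute("PRAGMA table_info(t)") if r[1] != "col1"]
values_list = ", ".join(f"('{c}', t.{c})" for c in cols)

# The generated T-SQL (not runnable in SQLite, which lacks CROSS APPLY):
sql = ("SELECT t.col1, v.col\n"
       "FROM t CROSS APPLY (VALUES " + values_list + ") v(col, val)\n"
       "WHERE v.val > 0")
print(sql)
```

The same string-building step could just as well be done in a spreadsheet, as the answer suggests.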

Related

What happens if I compare a column with itself and it is NULL?

What happens if I compare a column with itself and it is NULL? Is this similar to floating-point values, where x == x is false only if the value is a NaN?
It depends on the comparison you do with itself.
If you do
WHERE Col = Col
any rows where Col IS NULL will have the WHERE clause evaluate to UNKNOWN (rather than TRUE or FALSE) and the row will not be returned.
So WHERE Col = Col is equivalent to WHERE Col IS NOT NULL.
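A quick SQLite check (via Python) confirms that WHERE Col = Col filters out NULL rows exactly like IS NOT NULL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col INT)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (None,), (2,)])

# col = col evaluates to UNKNOWN for the NULL row, so that row is dropped.
self_eq = conn.execute("SELECT col FROM t WHERE col = col").fetchall()
not_null = conn.execute("SELECT col FROM t WHERE col IS NOT NULL").fetchall()
print(self_eq, not_null)
```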
If you do (standard SQL, but not available in every RDBMS)
WHERE Col IS NOT DISTINCT FROM Col
then this will evaluate to TRUE for every row.
The comparison with = is not NULL-safe, which means the result is UNKNOWN.
So check your database for a NULL-safe comparison:
On MySQL it is <=>
Postgres uses col1 IS NOT DISTINCT FROM col2
SQL Server didn't have one until version 2022, so there you can use the last expression in the query below. Since version 2022 it also supports col1 IS NOT DISTINCT FROM col2
SELECT col1 = col2,
       col1 <=> col2,
       col1 = col2 OR (col1 IS NULL AND col2 IS NULL)
FROM tab1
Result (for a row where both columns are NULL):
col1 = col2   col1 <=> col2   col1 = col2 OR (col1 IS NULL AND col2 IS NULL)
null          1               1
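The same contrast can be seen in SQLite (via Python), where the NULL-safe comparison is spelled IS rather than <=> or IS NOT DISTINCT FROM:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# = propagates NULL (UNKNOWN), while IS is SQLite's NULL-safe comparison.
row = conn.execute("SELECT NULL = NULL, NULL IS NULL, 1 IS 1, 1 IS 2").fetchone()
print(row)
```

The first expression comes back as NULL (None in Python), while IS returns a definite true or false.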

Is there a way to compare 2 columns of a table in SQL Server

Can somebody give me an idea how to solve this? I want to find which columns in a table contain the same data, without knowing in advance which columns might match.
Can I move partial data into Excel to check?
I have about 39 columns and about 2 billion rows.
Col1 is equal to col3
col2 is equal to col6
col4 is not equal to col5
The output should show only the columns that are common.
NULL values are bothering me.
Thanks in advance.
If you have 39 columns and want to evaluate each against every other, that gives you 741 column pairings to evaluate. This is possible to do in a concise manner, but I wouldn't recommend it for 2 billion rows!
SELECT V1.name,
       V2.name
FROM   YourTable T WITH (TABLOCKX)
       CROSS APPLY (SELECT (SELECT T.*
                            FOR xml path('row'), elements, type)) CA(X)
       CROSS APPLY CA.X.nodes('/row/*') N1(n)
       CROSS APPLY CA.X.nodes('/row/*') N2(n)
       CROSS APPLY (VALUES(n1.n.value('local-name(.)', 'sysname'), n1.n.value('.', 'nvarchar(4000)'))) V1(name, val)
       CROSS APPLY (VALUES(n2.n.value('local-name(.)', 'sysname'), n2.n.value('.', 'nvarchar(4000)'))) V2(name, val)
WHERE  V2.name < V1.name
       AND V1.val = V2.val
GROUP  BY V1.name,
          V2.name
HAVING COUNT(*) = (SELECT COUNT(*)
                   FROM YourTable)
You should first profile the values in your columns. Get the MIN, MAX and COUNT of all columns (and potentially other aggregate data too for numeric columns). Discard any columns where COUNT is not equal to the whole row count, as these won't match anything given your desired treatment of NULLs, and identify sets of columns with the same MIN and MAX for further investigation.
If you do that with your example data you will see that the only pairs worth investigating are Col1 <-> col3 and Col2 <-> col6. So you can then do a much more focused query to determine whether this is actually the case.
SELECT COUNT(*),
SUM(CASE WHEN col1 = col3 THEN 1 ELSE 0 END) AS count_rows_same_1_3,
SUM(CASE WHEN col2 = col6 THEN 1 ELSE 0 END) AS count_rows_same_2_6
FROM YourTable T
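Here is how that focused query behaves on a small made-up sample (SQLite via Python); the data is an assumption, chosen so that col1/col3 and col2/col6 match on every row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable "
             "(col1 INT, col2 INT, col3 INT, col4 INT, col5 INT, col6 INT)")
conn.executemany("INSERT INTO YourTable VALUES (?,?,?,?,?,?)",
                 [(1, 10, 1, 5, 6, 10), (2, 20, 2, 7, 8, 20)])

# If a per-pair count equals COUNT(*), the two columns agree on every row.
total, same13, same26 = conn.execute("""
    SELECT COUNT(*),
           SUM(CASE WHEN col1 = col3 THEN 1 ELSE 0 END),
           SUM(CASE WHEN col2 = col6 THEN 1 ELSE 0 END)
    FROM YourTable
""").fetchone()
print(total, same13, same26)
```

A pair matches across the whole table exactly when its count equals the total row count.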

SQL: Select the minimum value from multiple columns with null values

I have a table like this one
ID Col1 Col2 Col3
-- ---- ---- ----
1 7 NULL 12
2 2 46 NULL
3 NULL NULL NULL
4 245 1 792
I wanted a query that yields the following result
ID Col1 Col2 Col3 MIN
-- ---- ---- ---- ---
1 7 NULL 12 7
2 2 46 NULL 2
3 NULL NULL NULL NULL
4 245 1 792 1
I mean, I wanted a column containing the minimum values out of Col1, Col2, and Col3 for each row, ignoring NULL values. In a previous question (What's the best way to select the minimum value from multiple columns?) there is an answer for non-NULL values. I need a query that is as efficient as possible for a huge table.
Note that a simple CASE comparison doesn't account for NULLs:
Select Id,
       Case When Col1 < Col2 And Col1 < Col3 Then Col1
            When Col2 < Col1 And Col2 < Col3 Then Col2
            Else Col3
       End As MIN
From YourTableNameHere
Any comparison with NULL evaluates to UNKNOWN, so a row like (7, NULL, 12) falls through to the Else branch and returns Col3 instead of the true minimum.
Assuming you can define some "max" value (I'll use 9999 here) that your real values will never exceed:
Select Id,
Case When Col1 < COALESCE(Col2, 9999)
And Col1 < COALESCE(Col3, 9999) Then Col1
When Col2 < COALESCE(Col1, 9999)
And Col2 < COALESCE(Col3, 9999) Then Col2
Else Col3
End As MIN
From YourTableNameHere;
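The sentinel version can be checked on the question's data in SQLite (via Python); 9999 is the same assumed "max" value as above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INT, col1 INT, col2 INT, col3 INT)")
conn.executemany("INSERT INTO t VALUES (?,?,?,?)",
                 [(1, 7, None, 12), (2, 2, 46, None),
                  (3, None, None, None), (4, 245, 1, 792)])

# COALESCE replaces NULL with the sentinel so the comparisons stay TRUE/FALSE.
rows = conn.execute("""
    SELECT id,
           CASE WHEN col1 < COALESCE(col2, 9999)
                 AND col1 < COALESCE(col3, 9999) THEN col1
                WHEN col2 < COALESCE(col1, 9999)
                 AND col2 < COALESCE(col3, 9999) THEN col2
                ELSE col3
           END AS row_min
    FROM t ORDER BY id
""").fetchall()
print(rows)
```

The all-NULL row still produces NULL, as the question's expected output requires.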
You didn't specify which version of Teradata you're using. If you're using version 14+ then you can use least.
Unfortunately least will return null if any of its arguments are null. From the docs:
LEAST supports 1-10 numeric values.
If numeric_value is the data type of the first argument, the return
data type is numeric. The remaining arguments in the input list must
be the same or compatible types. If either input parameter is NULL,
NULL is returned.
But you can get around that by using coalesce as Joe did in his answer.
select id,
least(coalesce(col1,9999),coalesce(col2,9999),coalesce(col3,9999))
from mytable
This might work if your platform's least ignores NULLs (Teradata's does not, as noted above):
Select id, Col1, Col2, Col3, least(Col1, Col2, Col3) as MIN From YourTableNameHere
This way you don't need to check for NULLs; just use MIN (which ignores NULLs) and a subquery:
select tbl.id, tbl.col1, tbl.col2, tbl.col3,
       (select min(t.col)
        from (select col1 as col from tbl_name t where t.id = tbl.id
              union all
              select col2 as col from tbl_name t where t.id = tbl.id
              union all
              select col3 as col from tbl_name t where t.id = tbl.id
             ) t
       )
from tbl_name tbl
Output:
1 7 NULL 12 7
2 2 46 NULL 2
3 NULL NULL NULL NULL
4 245 1 792 1
Just modify your query with coalesce():
Select Id,
(Case When Col1 <= coalesce(Col2, col3, col1) And
Col1 <= coalesce(Col3, col2, col1)
Then Col1
When Col2 <= coalesce(Col1, col3, col2) And
Col2 <= coalesce(Col3, col1, col2)
Then Col2
Else Col3
End) As MIN
From YourTableNameHere;
This doesn't require inventing a "magic" number or over-complicating the logic.
I found this solution to be more efficient than using multiple case statement clauses, which can get extremely lengthy when evaluating data from several columns across one row.
Also, I can't take credit for this solution as I found it on some website a year or so ago. Today I needed a refresh on this logic, and I couldn't find it anywhere. I found my old code and decided to share it in this forum now.
Creating your test table:
create table #testTable(ID int, Col1 int, Col2 int, Col3 int)
Insert into #testTable values(1,7,null,12)
Insert into #testTable values(2,2,46,null)
Insert into #testTable values(3,null,null,null)
Insert into #testTable values(4,245,1,792)
Finding min value in row data:
Select ID, Col1, Col2, Col3,
       (SELECT Min(v) FROM (VALUES (Col1), (Col2), (Col3)) AS value(v)) AS [MIN]
from #testTable
order by ID
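SQLite has no correlated VALUES/APPLY, but its multi-argument scalar min() combined with COALESCE and NULLIF gives the same row-wise result; a sketch, with 9999 as an assumed sentinel larger than any real value:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INT, col1 INT, col2 INT, col3 INT)")
conn.executemany("INSERT INTO t VALUES (?,?,?,?)",
                 [(1, 7, None, 12), (2, 2, 46, None),
                  (3, None, None, None), (4, 245, 1, 792)])

# SQLite's scalar min(x, y, ...) returns NULL if any argument is NULL,
# so NULLs are mapped to the sentinel first, then NULLIF maps it back.
rows = conn.execute("""
    SELECT id,
           NULLIF(MIN(COALESCE(col1, 9999),
                      COALESCE(col2, 9999),
                      COALESCE(col3, 9999)), 9999) AS row_min
    FROM t ORDER BY id
""").fetchall()
print(rows)
```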

SQL Query - aliasing + different periods + different tables

I am using Toad for Oracle and I have run into several issues.
Aliasing - when I want to use the same column twice?!
Let us assume that we have a table x with columns col1, col2, col3. Col1 contains customer contact numbers (211, 212, 213, and more).
There is another table, y, with columns col1, col4, col5. Col1 in both tables is the same key. Col4 shows whether a number is main or secondary.
Table y
(Col1,col4,col5)
(211,Main,v)
(212,Secondary,s)
(213,Secondary,w)
What I want to do is as follow :
SELECT col2, col1 as mainNumber, col1 as secondNumber
FROM x
WHERE mainNumber IN (SELECT col1
FROM y
WHERE col4 = 'main')
AND SecondNumber IN (SELECT col1
FROM y
WHERE col4 = "secondary")
But it states that there is a problem!?
There are several problems with your code.
Perhaps this is what you want:
SELECT x.col2,
       CASE WHEN col4 = 'main' THEN x.col1 END AS mainNumber,
       CASE WHEN col4 = 'secondary' THEN x.col1 END AS secondNumber
FROM x
JOIN y
  ON x.col1 = y.col1
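The CASE-over-join shape can be sanity-checked in SQLite (via Python); the col2 values here are made up, since the question doesn't say what col2 holds:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x (col1 INT, col2 TEXT);
    CREATE TABLE y (col1 INT, col4 TEXT, col5 TEXT);
    INSERT INTO x VALUES (211, 'a'), (212, 'b');
    INSERT INTO y VALUES (211, 'main', 'v'), (212, 'secondary', 's');
""")

# Each joined row lands in exactly one CASE branch; the other stays NULL.
rows = conn.execute("""
    SELECT x.col2,
           CASE WHEN col4 = 'main' THEN x.col1 END AS mainNumber,
           CASE WHEN col4 = 'secondary' THEN x.col1 END AS secondNumber
    FROM x JOIN y ON x.col1 = y.col1
    ORDER BY x.col1
""").fetchall()
print(rows)
```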
You don't say what col2 is, but you are taking the same column (col1) from the same row of the same table and trying to assign different meanings to it (main_number and second_number)
SELECT col2, col1 as mainNumbet, col1 as secondNumber
FROM x
If COL1 is unique on y, then it can only be the main OR the secondary, so this should work (note that number is a reserved word in Oracle, so the alias is num here):
SELECT col2, col1 as num, (select col4 from y where y.col1=x.col1) type
FROM x
If COL1 is NOT unique on y, then it can be both a main and a secondary, so this should work:
SELECT col2, col1 as num,
       (select col4 from y where y.col1=x.col1 and col4 = 'main' and rownum=1) m_ind,
       (select col4 from y where y.col1=x.col1 and col4 = 'secondary' and rownum=1) s_ind
FROM x
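The scalar-subquery variant can be sketched in SQLite (via Python), with Oracle's rownum=1 replaced by LIMIT 1 and made-up col2 values:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x (col1 INT, col2 TEXT);
    CREATE TABLE y (col1 INT, col4 TEXT, col5 TEXT);
    INSERT INTO x VALUES (211, 'cust_a'), (212, 'cust_b');
    INSERT INTO y VALUES (211, 'main', 'v'), (212, 'secondary', 's');
""")

# A scalar subquery with no match simply yields NULL for that indicator.
rows = conn.execute("""
    SELECT col2, col1,
           (SELECT col4 FROM y
            WHERE y.col1 = x.col1 AND col4 = 'main' LIMIT 1) AS m_ind,
           (SELECT col4 FROM y
            WHERE y.col1 = x.col1 AND col4 = 'secondary' LIMIT 1) AS s_ind
    FROM x ORDER BY col1
""").fetchall()
print(rows)
```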

SQL query grouped parameter maximum

Let's say I have two columns in a database table, col1 and col2. Col2 is a time, col1 some identifier. In my query, I want to do the following:
I want to SELECT * from my table and group the results by col1. However, I only want those entries where, for the grouped col1, there is no value of col2 higher than a certain value. Meaning, I only want those col1-s for which col2 never exceeds a certain value.
If, for instance, I had four rows, as follows:
ROW1: col1 = val1, col2 = 3
ROW2: col1 = val1, col2 = 5
ROW3: col1 = val2, col2 = 3
ROW4: col1 = val2, col2 = 4
If I do not want col2 for any of them to exceed 4, then as a result I would only want ROW3 and ROW4 (which one does not matter, since col1 is the same and is grouped). Rows 1 and 2 are grouped by col1's value "val1", and in one of them col2 DOES exceed 4, therefore I do not want either of them.
SELECT col1 FROM table GROUP BY col1 HAVING MAX(col2) <= 4
Because you want only the common value (col1) from the group, you can use GROUP BY. When you do a GROUP BY (aggregate) query, you can use the HAVING clause to apply a filter to the aggregated data set.
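This is easy to verify on the question's own rows (SQLite via Python):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 INT)")
conn.executemany("INSERT INTO t VALUES (?,?)",
                 [("val1", 3), ("val1", 5), ("val2", 3), ("val2", 4)])

# HAVING filters whole groups: val1 is dropped because one of its rows has col2 = 5.
rows = conn.execute(
    "SELECT col1 FROM t GROUP BY col1 HAVING MAX(col2) <= 4"
).fetchall()
print(rows)
```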
I am not sure I got the point (my English is not good).
I think a sub-query is the best choice.
Note: this example should work with MySQL...
SELECT *
FROM table
WHERE col1 IN
(SELECT col1 FROM table GROUP BY col1 HAVING MAX(col2) <= 4)
ORDER BY col1
CREATE TABLE x (
t TIME NOT NULL,
v INT NOT NULL );
INSERT INTO x VALUES
('13:14:00', 24),
('13:14:00', 27),
('13:14:00', 29),
('17:12:00', 14),
('17:12:00', 20),
('17:12:00', 24);
SELECT t, MAX(v) AS mv FROM x
GROUP BY t
HAVING mv <= 25;
Or do I misunderstand the question?