Select (show) only different columns from almost similar rows - sql

I have a table with many columns (50+). To make decisions, I need to analyze any data that varies between rows.
My current query:
SELECT maincol, count(maincol) FROM table where (conditions) group by maincol having count(maincol) > 1
then:
SELECT * FROM table where (conditions) and maincol = (previous result)
The second query displays all the rows, and I have to compare them one by one:
col1, col2, col3, col4, col5, col6, manycolumns..., colN
5 7 1 13 341 9 123
5 7 2 13 341 5 123
I want to get:
col3, col6
1 9
2 5
because it's difficult searching manually column by column.
- N columns could be different
- I don't have access to the credentials, so I can't use a programming language to process the results.
- Working on DB2

This will be a little tedious but worth it. This assumes that col1 through coln are all of the same type. If not, cast each to character in the select clause.
The result set will identify the maincol values that occur more than once that also have one or more columns with differing values. The columns that differ will be named.
Select maincol, colname, count(distinct colvalue)
From (
Select maincol, 'column1' as colname, col1 as colvalue
from table
Union
Select maincol, 'column2' as colname, col2 as colvalue
from table
Union
Select maincol, 'column3' as colname, col3 as colvalue
from table
-- Repeat this pattern for remaining columns
) as u
Group by maincol, colname
Having count(distinct colvalue) > 1
You could even join the result set from above with the original table to show the entire row including the name of the columns that differ:
Select b.colname, a.*
From table a, (
-- include entire query from above
) as b
Where a.maincol = b.maincol
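The unpivot-and-count idea above can be tried on a small scale. The following is only a sketch: it uses Python's built-in sqlite3 as a stand-in for DB2, and the table name t, its three columns, and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical miniature of the wide table: 'A' appears twice and differs
# in col2 and col3; 'B' appears only once.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (maincol TEXT, col1 INT, col2 INT, col3 INT);
INSERT INTO t VALUES ('A', 7, 1, 9), ('A', 7, 2, 5), ('B', 4, 4, 4);
""")

# Turn every column into (maincol, colname, colvalue) rows, then keep the
# (maincol, colname) groups that hold more than one distinct value.
differing = conn.execute("""
SELECT maincol, colname, COUNT(DISTINCT colvalue) AS n_values
FROM (
  SELECT maincol, 'col1' AS colname, col1 AS colvalue FROM t
  UNION ALL
  SELECT maincol, 'col2', col2 FROM t
  UNION ALL
  SELECT maincol, 'col3', col3 FROM t
) AS u
GROUP BY maincol, colname
HAVING COUNT(DISTINCT colvalue) > 1
ORDER BY maincol, colname
""").fetchall()

print(differing)  # [('A', 'col2', 2), ('A', 'col3', 2)]
```

UNION ALL is used here instead of UNION because COUNT(DISTINCT ...) already ignores repeated values, so the extra de-duplication step is not needed.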

Related

Oracle SQL How to find duplicate values in different columns?

I have a set of rows with many columns. For example,
ID | Col1 | Col2 | Col3 | Duplicate
------------------------------------
81 | 101 | 102 | 101 | YES
82 | 101 | 103 | 104 | NO
I need to calculate the "Duplicate" column. A row is a duplicate when it has the same value in two columns, as with Col1 and Col3 in the first row. I know there is the LEAST function, which is similar to the MIN function but works across columns. Does something similar exist to achieve this?
The approach I have in mind is to write all possible combinations in a case like this:
SELECT ID, col1, col2, col3,
CASE WHEN col1 = col2 or col1 = col3 or col2 = col3 then 1 else 0 end as Duplicate
FROM table
But I wish to avoid that, since in some cases I have too many columns and it is very prone to errors.
What is the best way to solve this?
Hmmm. You are looking for within-row duplicates. This is painful. More recent versions of Oracle support lateral joins. But for just a handful of non-NULL columns, you can do:
select id, col1, col2, col3,
(case when col1 in (col2, col3) or col2 in (col3) then 1 else 0 end) as Duplicate
from t;
For each additional column, you need to add one more IN comparison and update the other IN lists.
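Since writing every pairwise comparison by hand is the error-prone part, one option is to generate the expression. This is a minimal sketch, assuming the column names come from a trusted list and the columns are non-NULL; the helper name duplicate_case_expr is hypothetical, and sqlite3 stands in for Oracle only to check the generated text.

```python
from itertools import combinations
import sqlite3


def duplicate_case_expr(columns):
    """Build a CASE expression that is 1 when any two of the columns are equal."""
    if len(columns) < 2:
        raise ValueError("need at least two columns to compare")
    # One equality test per unordered pair of columns.
    pairs = " OR ".join(f"{a} = {b}" for a, b in combinations(columns, 2))
    return f"CASE WHEN {pairs} THEN 1 ELSE 0 END"


expr = duplicate_case_expr(["col1", "col2", "col3"])
print(expr)

# Check the generated expression against the two sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INT, col1 INT, col2 INT, col3 INT);
INSERT INTO t VALUES (81, 101, 102, 101), (82, 101, 103, 104);
""")
rows = conn.execute(f"SELECT id, {expr} AS duplicate FROM t ORDER BY id").fetchall()
print(rows)  # [(81, 1), (82, 0)]
```

The number of comparisons grows quadratically with the number of columns, so for very wide rows the unpivot-based answers may be easier to maintain.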
Something like this... note that in the lateral clause we still need to unpivot, but that is one row at a time - resulting in possibly much faster execution than simple unpivot and standard aggregation.
with
input_data ( id, col1, col2, col3 ) as (
select 81, 101, 102, 101 from dual union all
select 82, 101, 103, 104 from dual
)
-- End of simulated input data (for testing purposes only).
-- Solution (SQL query) begins BELOW THIS LINE.
select i.id, i.col1, i.col2, i.col3, l.duplicates
from input_data i,
lateral ( select case when count (distinct val) = count(val)
then 'NO' else 'YES'
end as duplicates
from input_data
unpivot ( val for col in ( col1, col2, col3 ) )
where id = i.id
) l
;
ID COL1 COL2 COL3 DUPLICATES
-- ---- ---- ---- ----------
81 101 102 101 YES
82 101 103 104 NO
You can do this by unpivoting and then counting the distinct values per id and checking whether that count equals the number of rows for that id. Equal means there are no duplicates. Then left join this result to the original table to calculate the duplicate column.
SELECT t.*,
CASE WHEN x.id IS NOT NULL THEN 'Yes' ELSE 'No' END AS duplicate
FROM t
LEFT JOIN
(SELECT id
FROM
(SELECT *
FROM t
unpivot (val FOR col IN (col1,col2,col3)) u
) t
GROUP BY id
HAVING count(*)<>count(DISTINCT val)
) x ON x.id=t.id
The best way† is to avoid storing repeating groups of columns. If you have multiple columns that essentially store comparable data (i.e. a multi-valued attribute), move the data to a dependent table, and use one column.
CREATE TABLE child (
ref_id INT,
col INT
);
INSERT INTO child VALUES
(81, 101), (81, 102), (81, 101),
(82, 101), (82, 103), (82, 104);
Then it's easier to find cases where a value occurs more than once:
SELECT ref_id, col, COUNT(*)
FROM child
GROUP BY ref_id, col
HAVING COUNT(*) > 1;
If you can't change the structure of the table, you could simulate it using UNIONs:
SELECT id, col, COUNT(*)
FROM (
SELECT id, col1 AS col FROM mytable
UNION ALL SELECT id, col2 FROM mytable
UNION ALL SELECT id, col3 FROM mytable
-- ... for more columns ...
) t
GROUP BY id, col
HAVING COUNT(*) > 1;
† Best for the query you are trying to run. A denormalized storage strategy might be better for some other types of queries.
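To see what the UNION ALL simulation returns, here is a small sketch using Python's sqlite3 in place of Oracle. The table name mytable follows the answer, and the data is the two sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (id INT, col1 INT, col2 INT, col3 INT);
INSERT INTO mytable VALUES (81, 101, 102, 101), (82, 101, 103, 104);
""")

# Stack the three columns into one (id, col) list, then report every value
# that occurs more than once within the same id.
repeated = conn.execute("""
SELECT id, col, COUNT(*) AS occurrences
FROM (
  SELECT id, col1 AS col FROM mytable
  UNION ALL SELECT id, col2 FROM mytable
  UNION ALL SELECT id, col3 FROM mytable
) AS t
GROUP BY id, col
HAVING COUNT(*) > 1
""").fetchall()

print(repeated)  # [(81, 101, 2)]
```

Only id 81 is returned, because 101 appears in two of its columns; id 82 has three different values and is filtered out by the HAVING clause.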
SELECT ID, col1, col2,
NVL2(NULLIF(col1, col2), 'Not duplicate', 'Duplicate')
FROM table;
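NULLIF(col1, col2) returns NULL when the two values are equal, so NVL2 then falls through to 'Duplicate'. If you want to compare more than two columns, you can implement the same logic with COALESCE.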
I think you want data that does not contain any duplicate values in the table. If that is right, use a SELECT DISTINCT statement like
SELECT DISTINCT * FROM TABLE_NAME
It will contain duplicate-free data.
Note: this also applies to a particular column, like
SELECT DISTINCT col1 FROM TABLE_NAME

Oracle SQL - Join 2 table columns in 1 row

I have 2 SQL queries and both return the correct results. There is no relation between the two queries, but I want to see their results side by side in a single row.
e.g.
Select col1,col2,sum(col3) as col3 from table a
select col4,col5 from table b
I would like the result to be
col1 col2 col3 col4 col5
If there is no equivalent row for either table a or table b replace with zeroes.
Could someone help me with this? Thanks.
Since you didn't provide any information such as the table structure or the data in each table, you can cross join both tables.
select t.col1,t.col2,t.col3,t1.col1,t1.col2 from tab1 t,tab2 t1;
In both select statements, add a column based on rownum or row_number(), then full join the results using this column:
select nvl(col1, 0) col1, nvl(col2, 0) col2, nvl(col3, 0) col3,
nvl(col4, 0) col4, nvl(col5, 0) col5
from
(select rownum rn, col1, col2, col3 from (
select col1, col2, sum(col3) col3 from tableA group by col1, col2)) a
full join (select rownum rn, col4, col5 from tableB) b using (rn)
I guess a UNION could be a pragmatic solution since the 2 queries are not related. They are just 2 data sets that should be retrieved in one statement:
Select col1,col2,sum(col3) as col3 from table a group by col1,col2
UNION
select col4,col5, to_number(null) col6 from table b
Note col6 in the example. SQL requires every branch of a UNION to return the same number of columns, and it is good practice for corresponding columns to have exactly the same datatype. Since sum(col3) yields a number, col6 should too.
The outcome of col4 and col5 will be shown in col1 and col2.

merge two queries with different where and different grouping into 1

Sorry, I asked this question earlier and got some good answers, but then I realised I had made a mistake in the query. Changing the original post could invalidate those answers, so I'm posting again with the right query this time. Please forgive me; I hope this is acceptable.
DECLARE #Temp TABLE
(MeasureDate, col1, col2, type)
INSERT INTO #Temp
SELECT MeasureDate, col1, col2, 1
FROM Table1
WHERE Col3 = 1
INSERT INTO #Temp
SELECT MeasureDate, col1, col2, 3
FROM Table1
WHERE Col3 = 1
AND Col4 = 7000
SELECT SUM(col1) / SUM(col2) AS Percentage, MeasureDate, Type
FROM #Temp
GROUP BY MeasureDate, Type
I do two inserts into the temp table. The 2nd insert has an extra WHERE condition and a different type, but uses the same columns from the same table. Then I do SUM(col1) / SUM(col2) on the temp table to return the result I need per MeasureDate and type. Is there a way to merge all these inserts and selects into one statement, so that I don't use a temp table and do a single select from Table1? Or, if I still need the temp table, can I merge the two separate selects into one? The stored procedure works fine as it is; I'm just looking for a way to shorten it.
Thanks.
Sure can. I might start with combining the two queries from your inserts using UNION ALL (this variation of UNION will not remove duplicates), wrapped up in a CTE from which you can perform your final query:
WITH MeasureData(MeasureDate, col1, col2, type) AS (
SELECT MeasureDate, col1, col2, 1
FROM Table1
WHERE Col3 = 1
UNION ALL
SELECT MeasureDate, col1, col2, 3
FROM Table1
WHERE Col3 = 1
AND Col4 = 7000
)
SELECT SUM(col1) / SUM(col2) AS Percentage, MeasureDate, Type
FROM MeasureData
GROUP BY MeasureDate, Type
That's it, no more table variable or insert statements.
No real need for a UNION, you can handle this with a CASE statement:
SELECT SUM(col1) / SUM(col2) AS Percentage, MeasureDate, Type
FROM (
SELECT MeasureDate, col1, col2, case when Col4 = 7000 then 3 else 1 end type
FROM Table1
WHERE Col3 = 1
) t
GROUP BY MeasureDate, Type
Edit: as Gordon correctly points out, for Type = 1 this query wouldn't produce the same results. Here's a variation on Gordon's good answer, using a CROSS JOIN and IF logic, that might be easier to understand visually:
SELECT T1.MeasureDate,
T.Type,
SUM(IF(T.Type=1,T1.Col1,IF(T.Type=3 AND T1.Col4=7000,T1.Col1,0))) /
SUM(IF(T.Type=1,T1.Col2,IF(T.Type=3 AND T1.Col4=7000,T1.Col2,0))) AS Percentage
FROM Table1 T1
CROSS JOIN (SELECT 1 Type UNION SELECT 3) T
WHERE T1.Col3 = 1
GROUP BY T1.MeasureDate, T.Type
Your method is double counting cases where col3 = 1 and col4 = 7000. Here is a method that takes this into account, without union on the overall table:
select SUM(t1.col1) / SUM(t1.col2) AS Percentage, t1.MeasureDate, t.Type
from table1 t1 join
(select 1 as type union all
select 3 as type
) t
on t.type = 1 or t1.col4 = 7000
where t1.col3 = 1
group by t1.MeasureDate, t.Type;
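The difference between the approaches in this thread is easiest to see on a tiny data set. The sketch below uses Python's sqlite3 rather than the asker's database, and the three sample rows are hypothetical. It runs the UNION ALL version, the CASE-only version, and the join-on-condition version side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (MeasureDate TEXT, col1 REAL, col2 REAL, Col3 INT, Col4 INT);
INSERT INTO Table1 VALUES
  ('2020-01-01', 1.0, 2.0, 1, 7000),  -- counts for type 1 and type 3
  ('2020-01-01', 1.0, 8.0, 1, 5000),  -- counts for type 1 only
  ('2020-01-01', 9.0, 9.0, 2, 7000);  -- excluded by Col3 = 1
""")

# Original behaviour: two filtered selects stacked with UNION ALL.
union_all = conn.execute("""
WITH MeasureData(MeasureDate, col1, col2, type) AS (
  SELECT MeasureDate, col1, col2, 1 FROM Table1 WHERE Col3 = 1
  UNION ALL
  SELECT MeasureDate, col1, col2, 3 FROM Table1 WHERE Col3 = 1 AND Col4 = 7000
)
SELECT type, SUM(col1) / SUM(col2) FROM MeasureData
GROUP BY MeasureDate, type ORDER BY type
""").fetchall()

# CASE-only variant: each row lands in exactly one type, so type 1 loses
# the Col4 = 7000 rows.
case_only = conn.execute("""
SELECT type, SUM(col1) / SUM(col2)
FROM (
  SELECT MeasureDate, col1, col2,
         CASE WHEN Col4 = 7000 THEN 3 ELSE 1 END AS type
  FROM Table1 WHERE Col3 = 1
) AS t
GROUP BY MeasureDate, type ORDER BY type
""").fetchall()

# Join-on-condition variant: a Col4 = 7000 row matches both type rows.
join_on = conn.execute("""
SELECT t.type, SUM(t1.col1) / SUM(t1.col2)
FROM Table1 t1
JOIN (SELECT 1 AS type UNION ALL SELECT 3) t
  ON t.type = 1 OR t1.Col4 = 7000
WHERE t1.Col3 = 1
GROUP BY t1.MeasureDate, t.type ORDER BY t.type
""").fetchall()

print(union_all)  # [(1, 0.2), (3, 0.5)]
print(case_only)  # [(1, 0.125), (3, 0.5)]
print(join_on)    # [(1, 0.2), (3, 0.5)]
```

On this data the join-on-condition query reproduces the UNION ALL result, while the CASE-only query gives a different Type = 1 percentage, which is the discrepancy discussed above.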

Create SQL summary using union

I currently have some SQL that is used to create an excel report in the following format:
COL1 COL2 COL3
2 1 8
3 7 9
1 2 4
What I am trying to do is sum up the total of each column and insert the result as a row at the bottom using UNION ALL (unless, of course, there is a better way).
The values in each column are already generated by sums. The concept I can't grasp is how to sum all the values for the final row, if this is even possible.
So the output should look like so:
COL1 COL2 COL3
2 1 8
3 7 9
1 2 4
6 10 21
Thanks!
It looks like you want to add
WITH ROLLUP
to the end of your query
eg:
Select sum(a) as col1, sum(b) as col2
from yourtable
group by something
with rollup
Depending on the full nature of your query, you may prefer to use with cube, which is similar. See http://technet.microsoft.com/en-us/library/ms189305(v=sql.90).aspx
select
col1
,col2
,col3
from tableA
union
select
sum(col1)
,sum(col2)
,sum(col3)
from tableA
order by col1,col2,col3
SELECT COL1, COL2, COL3
FROM SomeTable
UNION ALL
SELECT SUM(COL1), SUM(COL2), SUM(COL3)
FROM SomeTable
Note: there is also a ROLLUP clause, but I think the above would be a simpler solution in this case.
http://technet.microsoft.com/en-us/library/ms189305%28v=sql.90%29.aspx
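One detail the UNION ALL answers leave open is row order: without an ORDER BY, nothing guarantees that the totals row comes last. Below is a minimal sketch of one way to pin it, using Python's sqlite3 and a hypothetical helper column named is_total. The table name SomeTable follows the answer above, and the data is the sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SomeTable (COL1 INT, COL2 INT, COL3 INT);
INSERT INTO SomeTable VALUES (2, 1, 8), (3, 7, 9), (1, 2, 4);
""")

# is_total is 0 for detail rows and 1 for the summary row, so sorting on it
# keeps the totals at the bottom without showing the flag in the output.
rows = conn.execute("""
SELECT COL1, COL2, COL3
FROM (
  SELECT 0 AS is_total, COL1, COL2, COL3 FROM SomeTable
  UNION ALL
  SELECT 1, SUM(COL1), SUM(COL2), SUM(COL3) FROM SomeTable
) AS u
ORDER BY is_total
""").fetchall()

print(rows[-1])  # (6, 10, 21)
```

The order of the detail rows among themselves is still unspecified here; add further sort keys after is_total if the report needs a fixed order.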

Select column in query based on other table

I have a table called A whose records contain some column names of table B.
table A
Id, columnName
1 col1
2 col2
3 col3
table B
ID, col1, col2, col3, col4, col5
I want to select columns of B based on the value of table A.
Example
Select col1, col2, col3
from B
If record number 3 in table A were deleted, the SQL statement would be:
Select col1, col2
from B
You need a join. A basic SQL construct.
http://www.w3schools.com/sql/sql_join_inner.asp
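If the goal is for the select list itself to follow the rows of table A, a single static SQL statement cannot do that, because the column list is fixed when the statement is written. A different technique from the join suggested above is to build the statement in a client program. The following is only a sketch under that assumption, using Python's sqlite3; the function name select_listed_columns and the sample row in B are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (Id INTEGER, columnName TEXT);
INSERT INTO A VALUES (1, 'col1'), (2, 'col2'), (3, 'col3');
CREATE TABLE B (ID INTEGER, col1, col2, col3, col4, col5);
INSERT INTO B VALUES (1, 'a', 'b', 'c', 'd', 'e');
""")


def select_listed_columns(conn):
    """Build and run a SELECT on B using the column names stored in A."""
    wanted = [r[0] for r in conn.execute("SELECT columnName FROM A ORDER BY Id")]
    # Keep only names that really are columns of B, so that stored text is
    # never pasted into the statement unchecked.
    actual = {r[1] for r in conn.execute("PRAGMA table_info(B)")}
    cols = [c for c in wanted if c in actual]
    if not cols:
        raise ValueError("table A lists no valid columns of B")
    sql = f"SELECT {', '.join(cols)} FROM B"
    return sql, conn.execute(sql).fetchall()


sql_all, rows_all = select_listed_columns(conn)
print(sql_all)  # SELECT col1, col2, col3 FROM B

# Deleting record 3 from A shrinks the generated column list.
conn.execute("DELETE FROM A WHERE Id = 3")
sql_two, rows_two = select_listed_columns(conn)
print(sql_two)  # SELECT col1, col2 FROM B
```

Other databases offer their own ways to run generated statements on the server side; which one applies depends on the product, which the question does not name.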