Taking out common data - sql

I want to compare two columns from two different tables and extract the values that are present in both table 1 and table 2.
table 1      table 2            result
mobnum A     mobnum B
-----------------------------------------
988123456    988124567201718    988123456
988124567    988123456201718    988123457
944123456    988623456201718

I'm not quite sure, since your data isn't formatted clearly, but I think the query below will give you what you want. I put the second table in a subquery inside the WHERE ... IN (...) clause so that only matching values are selected. If you need the full rows, change "select Num" to select the unique IDs and go from there.
Table Test_1:
Num
988123456
988124
988124567
944123456
Table Test_2:
Num
988123456
988123457
9881234
9886234
Query:
select Num from Test_1 where Num in (Select Num from Test_2)
Output:
Num
988123456
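As a quick check, the query can be run against an in-memory SQLite database from Python. This is only a sketch: SQLite stands in for whatever database is actually used, and the INTEGER column type is an assumption.

```python
import sqlite3

# In-memory SQLite database as a stand-in; table and column names follow the answer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Test_1 (Num INTEGER);
    CREATE TABLE Test_2 (Num INTEGER);
    INSERT INTO Test_1 VALUES (988123456), (988124), (988124567), (944123456);
    INSERT INTO Test_2 VALUES (988123456), (988123457), (9881234), (9886234);
""")

# Keep only the Test_1 values that also appear in Test_2.
common = conn.execute(
    "SELECT Num FROM Test_1 WHERE Num IN (SELECT Num FROM Test_2)"
).fetchall()
print(common)
```

Only exact matches are returned, so values that merely share a prefix (such as 988124 and 9881234) are not treated as common.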

Related

How to find out if a row of one table exists in the values of at least one row of another table?

I have two SQL tables, example below:
Table 1 (column types varchar, integer, numeric)
A      B    C       D
--------------------------
A007   22   14.02   _Z 1
A008   36   15.06   _Z 1
Table 2 (column types varchar)
A                B          C                   D
----------------------------------------------------------
A009,A010,A011   33,35,36   16.06,17.06         _Z 1,_Z 2
A003,A007,A009   14,22,85   13.01,17.05,14.02   _Z 1
Is there a way to compare individual rows of the first table with the rows of the second table and find out which row of the first table does not occur in the values of any row of the second table?
As can be seen, the first row of table 1 occurs in the values of the second row of table 2.
However, the second row of table 1 does not occur in the values of the rows of table 2, therefore the desired output is row 2 of table 1.
Desired output table:
A      B    C       D
--------------------------
A008   36   15.06   _Z 1
What I have tried so far:
My solution was to create a table containing all possible combinations of column values for each row of the second table (with the same column data types as the columns of the first table) and then use SELECT * FROM TABLE1 EXCEPT SELECT * FROM TABLE2 to get the difference rows.
The solution worked (for relatively small tables) but I am currently in a situation where generating all combinations of column values for each row of the second table (which in my case has 500 rows) results in a table containing millions of rows, so I am looking for another solution, where I can use the original table with 500 rows.
Thank you in advance for any possible answer, preferably one that could also work in the IBM DB2 database.
We can use a LIKE trick here along with string concatenation:
SELECT t1.*
FROM Table1 t1
WHERE NOT EXISTS (
    SELECT 1
    FROM Table2 t2
    WHERE ',' || t2.A || ',' LIKE '%,' || t1.A || ',%'
);
Note that it would be a preferable table design for Table2 to not store CSV values in this way. Instead, get every A value onto a separate row.
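The query above tests only column A. Below is a minimal sketch of the same LIKE trick extended to all four columns, run against an in-memory SQLite database from Python. SQLite is a stand-in here, not DB2: SQLite converts the numeric columns to text implicitly during concatenation, whereas DB2 may need explicit casts to VARCHAR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (A TEXT, B INTEGER, C NUMERIC, D TEXT);
    CREATE TABLE Table2 (A TEXT, B TEXT, C TEXT, D TEXT);
    INSERT INTO Table1 VALUES ('A007', 22, 14.02, '_Z 1'), ('A008', 36, 15.06, '_Z 1');
    INSERT INTO Table2 VALUES
        ('A009,A010,A011', '33,35,36', '16.06,17.06', '_Z 1,_Z 2'),
        ('A003,A007,A009', '14,22,85', '13.01,17.05,14.02', '_Z 1');
""")

# A Table1 row is kept only if no single Table2 row contains all four of its values.
# Caveat: '_' and '%' inside a value act as LIKE wildcards, so values such as
# '_Z 1' can match more loosely than an exact comparison would.
query = """
    SELECT t1.*
    FROM Table1 t1
    WHERE NOT EXISTS (
        SELECT 1
        FROM Table2 t2
        WHERE ',' || t2.A || ',' LIKE '%,' || t1.A || ',%'
          AND ',' || t2.B || ',' LIKE '%,' || t1.B || ',%'
          AND ',' || t2.C || ',' LIKE '%,' || t1.C || ',%'
          AND ',' || t2.D || ',' LIKE '%,' || t1.D || ',%'
    )
"""
unmatched = conn.execute(query).fetchall()
print([row[0] for row in unmatched])
```

This works directly on the original 500-row table, with no need to expand every combination of values first.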

In Snowflake, I want to count duplicates in a table based on all the columns in the table without typing out every column name

I have a table with 60 columns in it. I would like to identify how many duplicates there are in the table based on all the columns being identical.
I don't want to have to type out every field name in the SELECT or GROUP BY clauses. Is there a way to do that?
You can use an approach like this for each table:
SELECT
    MD5(OBJECT_CONSTRUCT(SRC.*)::VARCHAR) AS DUP_MD5,
    SUM(1) AS TOTAL_COUNT
FROM <table> SRC
GROUP BY 1
HAVING SUM(1) > 1;
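The idea behind that query is to collapse each row into a single hash and then group on the hash. A rough plain-Python illustration of the same idea follows; it is not Snowflake code, and the sample rows and column names are invented for the example.

```python
import hashlib
import json
from collections import Counter

# Hypothetical sample rows; in Snowflake these would come from the table itself.
rows = [
    {"id": 1, "name": "a", "amount": 10},
    {"id": 2, "name": "b", "amount": 20},
    {"id": 1, "name": "a", "amount": 10},  # exact duplicate of the first row
]

def row_md5(row):
    # Serialize the whole row with sorted keys so identical rows hash identically,
    # loosely analogous to MD5(OBJECT_CONSTRUCT(SRC.*)::VARCHAR).
    return hashlib.md5(json.dumps(row, sort_keys=True).encode("utf-8")).hexdigest()

counts = Counter(row_md5(r) for r in rows)
# Keep only hashes seen more than once, like HAVING SUM(1) > 1.
duplicates = {h: n for h, n in counts.items() if n > 1}
print(len(duplicates), list(duplicates.values()))
```

No column name is typed in the grouping step, which is the point of the approach.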

How can I separate same column values to a variable based on value in another column?

Suppose I have the table below:
A   B
-----------
1   one
2   two
1   three
2   four
1   last
For A = 1, I need the output one;three;last.
How can I query this in Oracle SQL?
If you care whether you get the string "one;three;last" or "three;one;last" or some other combination of the three values, you'd need some additional column to order the results by (a database table is inherently unordered). If there is an id column that you're not showing, for example, that could do that, you'd order by id in the listagg.
If you don't care what order the values appear in the result, you could do something like this
select listagg( b, ';' ) within group (order by a)
from your_table
where a = 1
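For experimenting without an Oracle instance, here is a minimal sketch using SQLite from Python. SQLite's group_concat is used as a stand-in for listagg (an assumption of this example, not Oracle syntax), and as in the query above the order of the joined values is not guaranteed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE your_table (a INTEGER, b TEXT);
    INSERT INTO your_table VALUES
        (1, 'one'), (2, 'two'), (1, 'three'), (2, 'four'), (1, 'last');
""")

# group_concat(b, ';') plays the role of listagg(b, ';'); the order of the
# concatenated values is not guaranteed here.
(joined,) = conn.execute(
    "SELECT group_concat(b, ';') FROM your_table WHERE a = 1"
).fetchone()
print(sorted(joined.split(";")))
```

The values are sorted before printing only so the check does not depend on the concatenation order.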

SQL deleting one of two duplicate records?

I have a database in which every record exists twice: the two copies have different IDs, but their two data columns are identical. Is there a good way to write a DELETE statement that selects the records whose two data columns match but whose IDs differ, and deletes one of each pair (it doesn't matter which one)?
If so, could you give me a code example?
Delete from ...
where id in (select max(id)
             from ...
             group by data1, data2
             having count(*) > 1)
The idea is to select the largest id in each group of duplicate rows, by grouping the rows on the columns that are the same and keeping only the groups that contain more than one row (the HAVING clause).
delete from your_table
where id not in
(
    select min(id)
    from your_table
    group by col2
)
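A minimal sketch of the "keep the smallest id per group" approach, run against SQLite from Python. It groups on both data columns; the names data1 and data2 are borrowed from the first answer, and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE your_table (id INTEGER PRIMARY KEY, data1 TEXT, data2 TEXT);
    INSERT INTO your_table VALUES
        (1, 'x', 'y'), (2, 'x', 'y'),
        (3, 'p', 'q'), (4, 'p', 'q'),
        (5, 'u', 'v');
""")

# Delete every row whose id is not the smallest id of its (data1, data2) group.
conn.execute("""
    DELETE FROM your_table
    WHERE id NOT IN (
        SELECT MIN(id) FROM your_table GROUP BY data1, data2
    )
""")
remaining = [row[0] for row in conn.execute("SELECT id FROM your_table ORDER BY id")]
print(remaining)
```

Unlike the MAX(id) variant, this form also cleans up groups with three or more copies in a single pass. Some databases restrict subqueries on the table being deleted from, so the statement may need adjusting there.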

SQL Server Sum multiple rows into one - no temp table

I would like to see the most concise way to do what is outlined in this SO question: Sum values from multiple rows into one row,
that is, combine multiple rows while summing a column.
But how do I then delete the duplicates? In other words, I have data like this:
Person Value
--------------
1 10
1 20
2 15
And I want to sum the values for any duplicates (on the Person col) into a single row and get rid of the other duplicates on the Person value. So my output would be:
Person Value
-------------
1 30
2 15
I would like to do this without using a temp table. I think I'll need to use OVER (PARTITION BY ...), but I'm not sure. I'm just trying to challenge myself not to do it the temp-table way. I'm working with SQL Server 2008 R2.
Simply put, I'm looking for a concise statement that gets from my input to my output in the same table. If my table is named People, then select * from People before the operation should return the first set above, and select * from People after the operation should return the second set.
I'm not sure why you don't want to use a temp table, but here's one way to avoid it (though IMHO this is overkill):
UPDATE MyTable
SET Value = (SELECT SUM(Value) FROM MyTable MT WHERE MT.Person = MyTable.Person);

WITH DUP_TABLE AS
(
    SELECT ROW_NUMBER() OVER (PARTITION BY Person ORDER BY Person) AS ROW_NO
    FROM MyTable
)
DELETE FROM DUP_TABLE WHERE ROW_NO > 1;
First query updates every duplicate person to the summary value. Second query removes duplicate persons.
Demo: http://sqlfiddle.com/#!3/db7aa/11
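To see which rows the DELETE step targets, the ROW_NUMBER() numbering can be inspected on its own. The sketch below runs it in SQLite from Python as a stand-in for SQL Server; it assumes an SQLite build with window-function support (3.25 or later) and only reads the numbering, since deleting through a CTE is SQL Server behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MyTable (Person INTEGER, Value INTEGER);
    INSERT INTO MyTable VALUES (1, 10), (1, 20), (2, 15);
""")

# Number the rows within each Person; the first row of each Person gets ROW_NO = 1.
numbered = conn.execute("""
    SELECT Person,
           ROW_NUMBER() OVER (PARTITION BY Person ORDER BY Person) AS ROW_NO
    FROM MyTable
""").fetchall()

# Rows with ROW_NO > 1 are the ones the DELETE in the answer would remove.
to_delete = [row for row in numbered if row[1] > 1]
print(sorted(numbered), to_delete)
```

Because the UPDATE has already set every row of a person to the same total, it does not matter which of that person's rows survives.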
All you're asking for is a simple SUM() aggregate function and a GROUP BY
SELECT Person, SUM(Value)
FROM myTable
GROUP BY Person
The SUM() by itself would sum up the values in a column, but when you add a secondary column and GROUP BY it, SQL will show distinct values from the secondary column and perform the aggregate function by those distinct categories.
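A minimal sketch of that query on the question's sample data, using SQLite from Python as a stand-in for SQL Server. The ORDER BY is added here only to make the printed order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myTable (Person INTEGER, Value INTEGER);
    INSERT INTO myTable VALUES (1, 10), (1, 20), (2, 15);
""")

# One output row per distinct Person, with that person's values summed.
totals = conn.execute(
    "SELECT Person, SUM(Value) FROM myTable GROUP BY Person ORDER BY Person"
).fetchall()
print(totals)
```

Note that this only returns the summed rows; it does not change the stored table, which is what the question asks for.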