Need help with a SQL query join operation

I have the following test table and data:
Create table test
(
col1 int,
col2 int
)
Sample data:
col1   col2
-----------
1      4
1      5
2      4
3      5
3      4
Now I want all col1 values that have both 4 and 5 as col2 values.
Expected output: 1 and 3, since those col1 values have both 4 and 5 in col2.

SELECT *
FROM test
WHERE col1 IN
(
    SELECT col1
    FROM test
    WHERE col2 = 4
)
AND col2 = 5
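Against the sample data, this returns the rows (1, 5) and (3, 5), i.e. col1 values 1 and 3. An equivalent formulation with EXISTS, if you prefer it over IN (a sketch of my own against the same test table, not from the original answer):
SELECT *
FROM test t
WHERE t.col2 = 5
  AND EXISTS (SELECT 1 FROM test x WHERE x.col1 = t.col1 AND x.col2 = 4)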

If you only need the col1 values, you can do:
select col1 from test
where col2 in (4, 5)             -- take only rows with 4 or 5 in col2
group by col1
having count(distinct col2) = 2  -- make sure both a 4 and a 5 are present
Of course, the value in the HAVING clause (2 in this case) depends on the number of elements in the IN list (so if you have 3, 4, 5, you'll need to put 3 in the HAVING clause).
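If you would rather not keep the HAVING count in sync with the IN list, set intersection does the same job. This is a sketch of my own in standard SQL (INTERSECT is supported by SQL Server, PostgreSQL and others), using the test table from the question:
select col1 from test where col2 = 4
intersect
select col1 from test where col2 = 5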

Related

How can I find groups with more than one row and list the rows in each such group?

I have a table "mytable" in a database.
Given a subset of the table's columns, I would like to group by that subset and find the groups with more than one row.
For example, if the table is
col1  col2  col3
1     1     1
1     1     2
1     2     1
2     2     1
2     2     3
2     1     1
I am interested in grouping by col1 and col2 and finding the groups with more than one row, which are:
col1  col2  col3
1     1     1
1     1     2
and
col1  col2  col3
2     2     1
2     2     3
I was wondering how to write a SQL query for that purpose?
Is the following the best way to do that?
First get the col1 and col2 values of such groups:
SELECT col1, col2, COUNT(*)
FROM mytable
GROUP BY col1, col2
HAVING COUNT(*) > 1
Then based on the output of the previous query, manually write a query for each group:
SELECT *
FROM mytable
WHERE col1 = val1 AND col2 = val2
If there are many such groups, then I will have to manually write many queries, which can be a disadvantage.
I am using SQL Server.
Thanks.
This is a common problem. One solution is to get the "keys" in a derived table and join to that to get the rows.
declare @test table (col1 int, col2 int, col3 int)
insert into @test values (1,1,1), (1,1,2), (1,2,1), (2,2,1), (2,2,3), (2,1,1)

select t.*
from @test t
inner join (
    select col1, col2
    from @test
    group by col1, col2
    having count(*) > 1
) k
    on k.col1 = t.col1 and k.col2 = t.col2
col1        col2        col3
----------- ----------- -----------
1           1           1
1           1           2
2           2           1
2           2           3
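If you prefer to avoid the join, a correlated subquery that counts the rows in each group gives the same result. This is a sketch of my own against the same table variable, not part of the original answer:
select t.*
from @test t
where (
    select count(*)
    from @test d
    where d.col1 = t.col1 and d.col2 = t.col2
) > 1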
The window function sum() over() may help here
Example
with cte as (
    Select *
          ,Cnt = sum(1) over (partition by Col1, Col2)
    From YourTable
)
Select *
From cte
Where Cnt >= 2
Results:
Col1  Col2  Col3  Cnt
1     1     1     2
1     1     2     2
2     2     1     2
2     2     3     2
Another option (less performant)
Select top 1 with ties *
From YourTable
Order By case when sum(1) over (partition by Col1,Col2) > 1 then 1 else 2 end
Results: the same four rows as above, without the Cnt column.
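As a side note (my own observation, not from the answer), count(*) over (...) computes the same window count as sum(1) over (...) and may read more naturally:
with cte as (
    select *,
           count(*) over (partition by Col1, Col2) as Cnt
    from YourTable
)
select *
from cte
where Cnt >= 2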

Drop duplicates on some columns and keep the other columns' values

I have the following table in Postgres:
Id  Col1  Col2  Col3
1   A     1     x
2   A     0     y
3   A     0     z
4   B     0     x
5   B     1     y
6   C     0     z
As part of a select query, I want to drop duplicates in Col1, keeping the row with the highest Col2 value (there will never be multiple highest values per Col1 value) along with its corresponding Col2 and Col3 values.
Desired output:
Id  Col1  Col2  Col3
1   A     1     x
5   B     1     y
6   C     0     z
In Postgres, you can use distinct on:
select distinct on (col1) t.*
from t
order by col1, col2 desc;
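If you need something portable beyond Postgres, a row_number() window function gives the same result. A sketch of my own, assuming the same table name t as the answer above:
select id, col1, col2, col3
from (
    select t.*,
           row_number() over (partition by col1 order by col2 desc) as rn
    from t
) ranked
where rn = 1;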

How to create a new Hive table from two existing ones with the same columns

I have two Hive tables:
> T1exp
Col1  Col2  Col3
1     5     7
3     4     6
4     2     1
and the table
> T2exp
Col1  Col2  Col3
0     5     4
1     2     2
4     3     1
I need to get one table by merging both:
> FinalTable
Col1  Col2  Col3
1     5     7
3     4     6
4     2     1
0     5     4
1     2     2
4     3     1
I tried using this statement:
create TableRDH as (select * from T2exp as t1 left.join FinalTable as t2 on t1.Col1 = t2.Col1 );
But it gives this error
FAILED: ParseException line 1:7 cannot recognize input near 'create'
'TableRDH' 'as' in ddl statement
How can I resolve this ?
There is a simple way to achieve your objective:
create table FinalTable as
select * from T1exp
union
select * from T2exp;
You need to do a union to merge the table data. Try:
create table <new_table> as
select col1, col2, col3 from <table1>
union
select col1, col2, col3 from <table2>;
Note: the error you are getting is a syntax issue; the table keyword is missing after create (it should be create table TableRDH as ...), and left.join is not valid syntax. Hive may also reject the parentheses around the SELECT in a CREATE TABLE ... AS statement.
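If FinalTable already exists and you only need to append the second table's rows, INSERT INTO ... SELECT also works in Hive. This is a sketch under that assumption (and if duplicate rows must be preserved in the CREATE TABLE AS variants above, use union all instead of union):
insert into table FinalTable
select col1, col2, col3 from T2exp;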

How to select a column that doesn't exist in a table and return a NULL result for all rows returned

I have a query in which I'd like to select a column that doesn't exist and just fill it with NULL in the results. The data is being exported to a .CSV file for import into another database which has the column. For example:
My query is:
Select col1, col2, col3
from table1
Output is:
col1  col2  col3
1     5     9
2     6     10
3     7     11
4     8     12
I'd like the output to be:
col1  col2  col3  col4
1     5     9     NULL
2     6     10    NULL
3     7     11    NULL
4     8     12    NULL
You can select a null literal:
SELECT col1, col2, col3, NULL AS col4
FROM mytable
Select col1, col2, col3, null as col4 from table1
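Depending on the database and the export tool, you may want the placeholder column to carry an explicit type rather than an untyped NULL. A sketch, where the INT type is only an assumption about the target column:
SELECT col1, col2, col3, CAST(NULL AS INT) AS col4
FROM table1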

Removing rows in SQL that have a duplicate column value

I have looked high and low on SO for an answer over the last couple of hours (subqueries, CTEs, left joins with derived tables), but none of the solutions really meet my criteria.
I have a table with data like this:
COL1  COL2  COL3
1     A     0
2     A     1
3     A     1
4     B     0
5     B     0
6     B     0
7     B     0
8     B     1
COL1 is the primary key and is an int, COL2 is nvarchar(max), and COL3 is an int. I have determined that by using this query:
select name, COUNT(name) as 'count'
FROM [dbo].[AppConfig]
group by Name
having COUNT(name) > 3
I can return the total counts of A, B and C, but only when they occur more than 3 times. What I am now trying to do is remove every row after the first occurrence of each COL2/COL3 value combination. The sample table I provided would then look like this:
COL1  COL2  COL3
1     A     0
2     A     1
4     B     0
8     B     1
Could anyone assist me with this?
If all you want is the first row for each ColB-ColC combination, the following will do it:
select min(id) as id, colB, colC
from tbl
group by colB, colC
order by id
SQL Fiddle
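If the real table had additional columns to keep alongside the group keys, you could join back on the minimum id. This is a sketch of my own, using the same generic names as the answer above:
select t.*
from tbl t
inner join (
    select min(id) as id
    from tbl
    group by colB, colC
) firsts on firsts.id = t.id
order by t.id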
This should work:
;WITH numbered_rows as (
SELECT
Col1,
Col2,
Col3,
ROW_NUMBER() OVER(PARTITION BY Col2, Col3 ORDER BY Col1) as row
FROM AppConfig)
SELECT
Col1,
Col2,
Col3
FROM numbered_rows
WHERE row = 1
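Since the question is ultimately about removing the extra rows rather than just selecting the survivors, the same ROW_NUMBER idea can drive a DELETE. This is a sketch of my own (SQL Server allows deleting through an updatable CTE), not part of the original answer:
;WITH numbered_rows as (
    SELECT Col1, Col2, Col3,
           ROW_NUMBER() OVER(PARTITION BY Col2, Col3 ORDER BY Col1) as rn
    FROM AppConfig
)
DELETE FROM numbered_rows
WHERE rn > 1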
SELECT MIN(COL1) AS COL1, COL2, COL3
FROM AppConfig
GROUP BY COL2, COL3
ORDER BY COL1