Selecting rows with shared values in two distinct fields - sql

I searched around for this but all I can find are answers on how to select rows with the same value in both fields. I'm trying to select rows using PostgreSQL that share values in two fields with any other row in the table.
As an example:
id col1 col2
1 A X
2 A Y
3 A X
4 B Y
5 B Y
6 B X
In this case I'd want to select rows 1, 3, 4 and 5. Thanks in advance!

Use window functions:
select t.*
from (select t.*, count(*) over (partition by col1, col2) as cnt
from t
) t
where cnt > 1;
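To check the window-function query against the sample rows from the question, here is a minimal sketch. It assumes SQLite (3.25 or later, for window functions) through Python's sqlite3 module as a stand-in for PostgreSQL; the helper name rows_sharing_both_fields and the inner alias counted are made up for illustration.

```python
import sqlite3


def rows_sharing_both_fields():
    """Return ids of rows whose (col1, col2) pair occurs more than once."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT)")
    # Sample data copied from the question.
    conn.executemany(
        "INSERT INTO t VALUES (?, ?, ?)",
        [(1, "A", "X"), (2, "A", "Y"), (3, "A", "X"),
         (4, "B", "Y"), (5, "B", "Y"), (6, "B", "X")],
    )
    # Count how many rows share each (col1, col2) pair, then keep the
    # rows whose pair appears at least twice.
    query = """
        SELECT id
        FROM (SELECT t.*, COUNT(*) OVER (PARTITION BY col1, col2) AS cnt
              FROM t) AS counted
        WHERE cnt > 1
        ORDER BY id
    """
    ids = [row[0] for row in conn.execute(query)]
    conn.close()
    return ids


print(rows_sharing_both_fields())  # [1, 3, 4, 5]
```

Rows 2 and 6 drop out because their (col1, col2) pairs, (A, Y) and (B, X), each occur only once.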

Related

SQL Sum amount for column with unique values

Update
I realised I was doing it correctly after all. The issue was that my data for Col1 wasn't as expected: some Col1 values are associated with multiple Col0 values, when Col1:Col0 was supposed to be a 1:1 relationship. That's why the query didn't seem to work as intended.
Original Question
I'm using a SQL query to sum the revenue for each distinct value of one of the columns, and to return a table that combines those totals with other attributes.
Here's my table:
Col0 Col1 Col2(unique) Revenue
X 1 A 10
X 1 B 20
X 1 C 0
X 2 D 5
X 2 E 8
Y 3 F 3
Y 3 G 0
Y 3 H 50
Desired output:
Col0 Col1 Revenue
X 1 30
X 2 13
Y 3 53
I tried:
WITH
rev_calc AS (
SELECT
Col0,
Col1,
Col2, ##this is for further steps to combine other tables for mapping after this
SUM(Revenue) AS total_revenue, ##total rev by Col1
FROM table_input
GROUP BY Col1, Col0, Col2 ##Have to group by Col0 and Col2 too as it raised error because of 'list expression'
)
SELECT DISTINCT
table2.mappedOfCol0,
rev_calc.Col1,
rev_calc.Col2,
rev_calc.total_revenue,
FROM another_table AS table2
LEFT JOIN rev_calc
ON rev_calc.Col0 = table2.mappedOfCol0
But the actual output has multiple revenue rows for a given Col1.
For example, when I filter by Col1 = 1 in the output table, I still get a list of different revenue amounts:
Col1 total_revenue
1 10
1 20
1 0
I thought the GROUP BY would sum the revenue for each distinct Col1. What did I miss here? I also tried querying FROM (SELECT DISTINCT Col1 ...) first, but SUM(Revenue) still produces a list of different revenue amounts.
I'm new to SQL, so I'd appreciate any insights. Thanks.
Don't you just want aggregation?
select col0, col1, sum(revenue) as revenue
from mytable
group by col0, col1
I don't understand what you are trying to do with col2 in the query. This produces the result you want for the data you showed, that contains a single table.
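The difference between the two groupings can be reproduced on the question's sample data. The following is a minimal sketch, assuming SQLite through Python's sqlite3 module as a stand-in for the asker's database; the helper name compare_groupings is hypothetical, and the table name table_input is taken from the question.

```python
import sqlite3


def compare_groupings():
    """Return (totals grouped by Col0, Col1; totals also grouped by Col2)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table_input (Col0 TEXT, Col1 INTEGER, Col2 TEXT, Revenue INTEGER)"
    )
    # Sample data copied from the question.
    conn.executemany(
        "INSERT INTO table_input VALUES (?, ?, ?, ?)",
        [("X", 1, "A", 10), ("X", 1, "B", 20), ("X", 1, "C", 0),
         ("X", 2, "D", 5), ("X", 2, "E", 8),
         ("Y", 3, "F", 3), ("Y", 3, "G", 0), ("Y", 3, "H", 50)],
    )
    # Grouping by Col0 and Col1 only: one total per (Col0, Col1).
    coarse = conn.execute(
        "SELECT Col0, Col1, SUM(Revenue) FROM table_input "
        "GROUP BY Col0, Col1 ORDER BY Col0, Col1"
    ).fetchall()
    # Adding the unique Col2 to GROUP BY makes every row its own group,
    # so SUM(Revenue) just echoes each row's revenue.
    fine = conn.execute(
        "SELECT Col1, SUM(Revenue) FROM table_input WHERE Col1 = 1 "
        "GROUP BY Col0, Col1, Col2 ORDER BY Col2"
    ).fetchall()
    conn.close()
    return coarse, fine


coarse, fine = compare_groupings()
print(coarse)  # [('X', 1, 30), ('X', 2, 13), ('Y', 3, 53)]
print(fine)    # [(1, 10), (1, 20), (1, 0)]
```

The second result matches the unexpected output in the question: because Col2 is unique, including it in the GROUP BY leaves nothing to sum.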
Based on your explanation, I think you want to aggregate the revenue of only those records whose Col2 values map to another table. If that is the case, you could try the following query.
WITH
rev_calc AS (
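SELECT DISTINCT
table_input.Col2 -- qualified, because Col2 exists in both tables
FROM table_input
INNER JOIN another_table -- inner join keeps only Col2 values that have a match
ON another_table.Col2 = table_input.Col2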
)
SELECT
Col0,
Col1,
SUM(Revenue) AS total_revenue
FROM table_input
WHERE Col2 in (select Col2 from rev_calc)
GROUP BY Col0, Col1;

Top n distinct values of one column in Oracle

I'm using a query where a part of it gets the top 3 of a certain column.
It creates a subquery of the column's distinct values, limited to 3 rows, and then filters the main query by those values to get the top 3.
WITH subquery AS (
SELECT col FROM (
SELECT DISTINCT col
FROM tbl
) WHERE ROWNUM <= 3
)
SELECT col
FROM tbl
WHERE tbl.col IN (SELECT col FROM subquery)
So the original table is like this:
col
-----
a
a
a
b
b
b
c
d
d
e
f
f
f
f
And the query returns the top 3 of the column (not the top 3 rows which would only be a):
col
-----
a
a
a
b
b
b
c
I'm trying to learn if there is a more correct way of doing this as the real query is big and duplicating its size with a subquery that looks almost the same just to get the top 3 is hard to work with and understand/modify.
Is there a better way to do the top first 3 distinct values of one column in Oracle?
Yes, you can use dense_rank and avoid duplicated code:
select col
from (select col, dense_rank() over (order by col) rnk from tbl)
where rnk <= 3
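The dense_rank approach can be tried on the sample column from the question. This is a minimal sketch, assuming SQLite (3.25 or later) through Python's sqlite3 module as a stand-in for Oracle; the helper name top_distinct_values and the alias ranked are made up for illustration.

```python
import sqlite3


def top_distinct_values(limit=3):
    """Return every row whose col is among the first `limit` distinct values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (col TEXT)")
    # Sample data copied from the question.
    conn.executemany(
        "INSERT INTO tbl VALUES (?)",
        [(c,) for c in "aaabbbcddeffff"],
    )
    # DENSE_RANK gives equal values the same rank with no gaps, so
    # rnk <= 3 keeps all rows of the first three distinct values.
    query = """
        SELECT col
        FROM (SELECT col, DENSE_RANK() OVER (ORDER BY col) AS rnk
              FROM tbl) AS ranked
        WHERE rnk <= ?
        ORDER BY col
    """
    values = [row[0] for row in conn.execute(query, (limit,))]
    conn.close()
    return values


print(top_distinct_values())  # ['a', 'a', 'a', 'b', 'b', 'b', 'c']
```

Unlike the ROWNUM version, the table is read once and the ordering that defines "top" is explicit in the OVER clause.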

Query to add rows until multiple of 10

I need a query with a row-number column (probably using ROW_NUMBER()). If the result has, for example, 8 rows, the query should return 10 rows, with rows 9 and 10 blank except for the row number. If the result has 15 rows, it should return 20 rows, and so on.
Is it possible?
Normally, something like this would be done in the application layer. However, you can do this in SQL:
select t.*
from table t
union all
select nulls.*
from (select 1 as n union all select 2 union all . . . select 10
) n cross join
(select count(*) cnt from table) cnt left join
table nulls
on 1 = 0
-- add blank rows only up to the next multiple of 10
where cnt.cnt + n.n <= 10 * floor((cnt.cnt + 9) / 10);
The first subquery gets all your data. The second gets the additional rows with NULL values. It uses a left join with "false" condition to get all the columns.
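The padding idea described above can be sketched end to end as follows. This assumes SQLite through Python's sqlite3 module, a two-column table t invented for the example, a recursive CTE in place of the written-out list of numbers, and SQLite's integer division in place of floor; the names padded_rows, nums, total and blanks are all hypothetical.

```python
import sqlite3


def padded_rows(n_rows):
    """Return the table's rows plus all-NULL rows up to the next multiple of 10."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, val TEXT)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?)",
        [(i, f"v{i}") for i in range(1, n_rows + 1)],
    )
    query = """
        WITH RECURSIVE nums(n) AS (      -- the numbers 1..10
            SELECT 1
            UNION ALL
            SELECT n + 1 FROM nums WHERE n < 10
        ),
        total AS (SELECT COUNT(*) AS cnt FROM t)
        SELECT t.id, t.val FROM t
        UNION ALL
        SELECT blanks.id, blanks.val     -- always NULL: the join never matches
        FROM nums
        CROSS JOIN total
        LEFT JOIN t AS blanks ON 1 = 0
        -- (cnt + 9) / 10 is integer division, i.e. cnt / 10 rounded up
        WHERE total.cnt + nums.n <= 10 * ((total.cnt + 9) / 10)
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


print(len(padded_rows(8)))   # 10
print(len(padded_rows(15)))  # 20
print(len(padded_rows(10)))  # 10
```

A table that already holds an exact multiple of 10 rows gets no padding, which is the edge case the WHERE condition has to handle.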

get subset of a table in SQL

I want to get a subset of a table, here's the example:
1 A
2 A
3 B
4 B
5 C
6 D
7 D
8 D
I want to get the unique record, but with the smallest id:
1 A
3 B
5 C
6 D
How can I write the SQL in SQL Server? Thanks!
Use a common-table expression like this:
;WITH DataCTE AS
(
SELECT ID, OtherCol,
ROW_NUMBER() OVER(PARTITION BY OtherCol ORDER BY ID) AS RowNum
FROM dbo.YourTable
)
SELECT *
FROM DataCTE
WHERE RowNum = 1
This "partitions" your data by the second column you have (A, B, C) and orders by the ID (1, 2, 3) - smallest ID first.
Therefore, for each "partition" (i.e. each value of your second column), the entry with RowNum = 1 is the one with the smallest ID for each value of the second column.
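The partitioning described above can be checked on the sample rows from the question. This is a minimal sketch, assuming SQLite (3.25 or later) through Python's sqlite3 module as a stand-in for SQL Server; the table name YourTable and the column names come from the answer, while the helper name smallest_id_per_value is made up.

```python
import sqlite3


def smallest_id_per_value():
    """Return (ID, OtherCol) for the smallest ID within each OtherCol value."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE YourTable (ID INTEGER, OtherCol TEXT)")
    # Sample data copied from the question.
    conn.executemany(
        "INSERT INTO YourTable VALUES (?, ?)",
        [(1, "A"), (2, "A"), (3, "B"), (4, "B"),
         (5, "C"), (6, "D"), (7, "D"), (8, "D")],
    )
    # Number the rows inside each OtherCol partition by ascending ID,
    # then keep the first row of every partition.
    query = """
        WITH DataCTE AS (
            SELECT ID, OtherCol,
                   ROW_NUMBER() OVER (PARTITION BY OtherCol ORDER BY ID) AS RowNum
            FROM YourTable
        )
        SELECT ID, OtherCol
        FROM DataCTE
        WHERE RowNum = 1
        ORDER BY ID
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


print(smallest_id_per_value())  # [(1, 'A'), (3, 'B'), (5, 'C'), (6, 'D')]
```

The ROW_NUMBER form is most useful when the table has further columns that should come along with the chosen row.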
select min(id), othercol
from thetable
group by othercol
and maybe with
order by othercol
... at the end, if that's important
Try this:
SELECT MIN(Id) AS Id, Name
FROM MyTable
GROUP BY Name
select min(id), column2
from table
group by column2
It helps if you provide the table information in the question - I've just guessed at the column names...

Select database rows in range

I want to select the rows between A and B from a table. The table has at least A rows but it might have less than B rows.
For example if A = 2, B = 5 and the table has 3 rows it should return rows 2 and 3.
How could I get the rows in such a range?
I am using Microsoft SQL Server 2008.
You can use something similar to what's being described in this SO question.
For example:
SELECT * FROM (
SELECT *, ROW_NUMBER() OVER (ORDER BY YOUR_ORDERED_FIELD) as row FROM YOUR_TABLE
) a WHERE row >= 5 and row <= 10
Here A = 5 and B = 10; substitute your own bounds. Both ends are inclusive.
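SELECT *
FROM (
SELECT *, ROW_NUMBER() OVER (ORDER BY ordercol) AS rn
FROM YOUR_TABLE
) numbered -- the alias rn cannot be filtered on in the same SELECT that defines it
WHERE rn BETWEEN @a AND @b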
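The asker's own case (A = 2, B = 5, a table with only 3 rows) can be run as a minimal sketch. It assumes SQLite (3.25 or later) through Python's sqlite3 module as a stand-in for SQL Server 2008; the table name items, its contents, and the helper name rows_in_range are invented for the example.

```python
import sqlite3


def rows_in_range(total_rows, a, b):
    """Return (ordercol, rn) for row numbers a..b inclusive, ordered by ordercol."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (ordercol INTEGER)")
    conn.executemany(
        "INSERT INTO items VALUES (?)",
        [(i * 10,) for i in range(1, total_rows + 1)],
    )
    # The row number is computed in a subquery so that the outer WHERE
    # clause can filter on it.
    query = """
        SELECT ordercol, rn
        FROM (SELECT ordercol,
                     ROW_NUMBER() OVER (ORDER BY ordercol) AS rn
              FROM items) AS numbered
        WHERE rn BETWEEN ? AND ?
        ORDER BY rn
    """
    rows = conn.execute(query, (a, b)).fetchall()
    conn.close()
    return rows


print(rows_in_range(3, 2, 5))  # [(20, 2), (30, 3)]
```

When the table has fewer than B rows, the filter simply returns what exists from row A onward, which is the behaviour the question asks for.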