Finding unique values with multiple columns using certain condition - sql

ID? A B C
--- -- -- --
1 J 1 B
2 J 1 S
3 M 1 B
4 M 1 S
5 M 2 B
6 M 2 S
7 T 1 B
8 T 2 S
9 C 1 B
10 C 1 S
11 C 2 B
12 N 1 S
13 N 2 S
14 N 3 S
15 Q 1 S
16 Q 1 S
17 Z 1 B
I need to find unique values across multiple columns, subject to some conditions. A unique value is a combination of columns A, B, and C.
If a value in column A has only two rows (like records 1 and 2), column B is the same in both, and column C differs, then I don't need those records.
If a value in column A has multiple rows (like records 3 to 6) with different B and C combinations, we want to see those rows.
If a value in column A has multiple rows (like records 7 to 8) with different B and C combinations, we want to see those rows.
If a value in column A has multiple rows (like records 9 to 11) with similar or different B and C combinations, we want to see those rows.
If a value in column A has multiple rows (like records 12 onwards) with the same C and the same or different B, we don't need those rows.
If a value appears in a single row, like row 17, there is no need to display it either.
I have tried a lot but can't get the exact answer. Any help is greatly appreciated.

Going through all the logic, I think you want all rows where the values of both columns B and C differ within a group of A. An easy way to see whether values differ is to compare the min and max values, and you can do this using analytic functions:
select A, B, C
from (select t.*,
             count(*) over (partition by A) as Acnt,
             min(B) over (partition by A) as Bmin,
             max(B) over (partition by A) as Bmax,
             min(C) over (partition by A) as Cmin,
             max(C) over (partition by A) as Cmax
      from t
     ) t
where Bmin <> Bmax and Cmin <> Cmax
Your example data does not have any actual duplicates, so I don't think a count(distinct) is necessary. Your rules say nothing about what to do when A only appears once. This version will filter those rows out.
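To check a query like this against the sample data, here is a minimal sketch in Python using the standard-library sqlite3 module. The table name t, the function name kept_ids, and the choice of SQLite are assumptions for illustration (SQLite needs version 3.25 or later for window functions). It runs the min/max comparison with the two conditions joined by either `and` or `or`, so you can see which IDs each reading of the rules keeps.

```python
import sqlite3

# Sample rows from the question: (ID, A, B, C).
ROWS = [
    (1, "J", 1, "B"), (2, "J", 1, "S"),
    (3, "M", 1, "B"), (4, "M", 1, "S"), (5, "M", 2, "B"), (6, "M", 2, "S"),
    (7, "T", 1, "B"), (8, "T", 2, "S"),
    (9, "C", 1, "B"), (10, "C", 1, "S"), (11, "C", 2, "B"),
    (12, "N", 1, "S"), (13, "N", 2, "S"), (14, "N", 3, "S"),
    (15, "Q", 1, "S"), (16, "Q", 1, "S"),
    (17, "Z", 1, "B"),
]


def kept_ids(combine="and"):
    """Return the IDs kept when the B and C tests are joined by `combine`."""
    if combine not in ("and", "or"):
        raise ValueError("combine must be 'and' or 'or'")
    query = f"""
        select id
        from (select t.*,
                     min(B) over (partition by A) as Bmin,
                     max(B) over (partition by A) as Bmax,
                     min(C) over (partition by A) as Cmin,
                     max(C) over (partition by A) as Cmax
              from t) t
        where Bmin <> Bmax {combine} Cmin <> Cmax
        order by id
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (id integer, A text, B integer, C text)")
    conn.executemany("insert into t values (?, ?, ?, ?)", ROWS)
    ids = [row[0] for row in conn.execute(query)]
    conn.close()
    return ids


if __name__ == "__main__":
    print("and:", kept_ids("and"))
    print("or: ", kept_ids("or"))
```

On the sample data, `and` keeps IDs 3 to 11 (groups M, T, and C), while `or` also keeps the J and N groups, which the rules in the question exclude.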

Related

Remove duplicate values from the column of a PostgreSQL table and maintain only one value per column for duplicated rows

I would like to format the data of the table below as follows. In the value column, I want to keep only one value for each group of duplicated rows.
input table
code value
A 10
A 10
A 10
B 20
B 20
B 20
C 30
C 30
D 40
Expected result
code value
A 10
A
A
B 20
B
B
C 30
C
D 40
A combination of CASE and a window function can solve your problem:
select code,
       case when t.rn = 1 then value else null end as value
from (
    select row_number() over (partition by code, value order by value) as rn,
           code, value
    from your_table
) t
-- order so the populated row comes first within each code
order by code, t.rn
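As a quick check of the CASE plus row_number() idea, here is a small sketch that runs it in SQLite through Python's sqlite3 module. The function name blank_repeats and the use of SQLite (3.25 or later) instead of PostgreSQL are assumptions for illustration; an explicit order by is added so the populated row comes first within each code.

```python
import sqlite3

# Input table from the question: (code, value).
INPUT_ROWS = [
    ("A", 10), ("A", 10), ("A", 10),
    ("B", 20), ("B", 20), ("B", 20),
    ("C", 30), ("C", 30),
    ("D", 40),
]


def blank_repeats(rows):
    """Keep value on the first row of each (code, value) group, NULL elsewhere."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table your_table (code text, value integer)")
    conn.executemany("insert into your_table values (?, ?)", rows)
    result = conn.execute(
        """
        select code,
               case when rn = 1 then value end as value
        from (select code, value,
                     row_number() over (partition by code, value
                                        order by value) as rn
              from your_table) t
        order by code, rn
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for code, value in blank_repeats(INPUT_ROWS):
        print(code, "" if value is None else value)
```

SQL NULL comes back as Python None, which the loop prints as an empty cell to match the expected result in the question.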

SQL query to find the entries corresponding to the maximum count of each type

I have a table X in Postgres with the following entries
A B C
2 3 1
3 3 1
0 4 1
1 4 1
2 4 1
3 4 1
0 5 1
1 5 1
2 5 1
3 5 1
0 2 2
1 2 3
I would like to find the entries having the maximum of column C for every kind of A and B, i.e. (group by B), with the most efficient query possible, and return the corresponding A and B.
Expected Output:
A B C
1 2 3
2 3 1
0 4 1
0 5 1
Please help me with this problem. Thank you.
Using DISTINCT ON:
SELECT DISTINCT ON (B)
A, B, C
FROM
my_table
ORDER BY B, C DESC, A
DISTINCT ON gives you exactly the first row of each ordered group. In this case the groups are defined by B.
After ordering by B (which is necessary), we order by C DESC to bring the maximum C to the top of each group. Then, if there are tied MAX(C) values, we order by A to bring the minimum A to the top.
This looks like a greatest-n-per-group problem:
WITH cte AS (
SELECT *, RANK() OVER (PARTITION BY B ORDER BY C DESC, A ASC) AS rnk
FROM t
)
SELECT *
FROM cte
WHERE rnk = 1
It's not clear which A should be returned when several rows tie on C; the above returns the row with the smallest A.
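To see the ranking approach on the question's data, here is a minimal sketch that runs the same kind of CTE in SQLite via Python's sqlite3 module. The table name t, the function name top_rows_per_b, and SQLite itself (3.25 or later, standing in for Postgres) are assumptions made for illustration.

```python
import sqlite3

# Table X from the question: (A, B, C).
X_ROWS = [
    (2, 3, 1), (3, 3, 1),
    (0, 4, 1), (1, 4, 1), (2, 4, 1), (3, 4, 1),
    (0, 5, 1), (1, 5, 1), (2, 5, 1), (3, 5, 1),
    (0, 2, 2), (1, 2, 3),
]


def top_rows_per_b(rows):
    """For each B, return the row with the largest C, breaking ties by smallest A."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (A integer, B integer, C integer)")
    conn.executemany("insert into t values (?, ?, ?)", rows)
    result = conn.execute(
        """
        with cte as (
            select A, B, C,
                   rank() over (partition by B order by C desc, A asc) as rnk
            from t
        )
        select A, B, C
        from cte
        where rnk = 1
        order by B
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in top_rows_per_b(X_ROWS):
        print(row)
```

Because A is part of the window ordering and is unique within each B in this data, rank() cannot tie here; on data where (C, A) can repeat within a B, row_number() would be the safer choice if exactly one row per B is required.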
It seems to me you need max():
select A,B, max(c) from table_name
group by A,B
This ranks the rows within each (A, B) pair by C and keeps the top-ranked ones:
select *
from (select t.*,
             rank() over (partition by A, B order by C desc) as rnk
      from tablename t
     ) ranked
where rnk = 1;

Create multiple rows based on 1 column

I currently have a table with a quantity in it.
ID Code Quantity
1 A 1
2 B 3
3 C 2
4 D 1
Is there any way to write a SQL statement that would get me
ID Code Quantity
1 A 1
2 B 1
2 B 1
2 B 1
3 C 1
3 C 1
4 D 1
I need to break out the quantity and get that many rows.
Thanks
Here's one option using a numbers table to join to:
with numberstable as (
    select 1 AS Number
    union all
    select Number + 1 from numberstable where Number<100
)
select t.id, t.code, 1
from yourtable t
    join numberstable n on t.quantity >= n.number
order by t.id
Please note, depending on which database you are using, this may not be the correct approach to creating the numbers table. This works in most databases supporting common table expressions. But the key to the answer is the join and the on criteria.
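To illustrate the join against a numbers table, here is a small sketch that runs the idea in SQLite through Python's sqlite3 module. SQLite and the function name expand_quantities are assumptions for illustration; the recursive keyword is written out because some databases, PostgreSQL among them, require it for a self-referencing CTE, while others such as SQL Server do not accept it.

```python
import sqlite3

# Table from the question: (ID, Code, Quantity).
QUANTITY_ROWS = [(1, "A", 1), (2, "B", 3), (3, "C", 2), (4, "D", 1)]


def expand_quantities(rows, max_quantity=100):
    """Emit one row with quantity 1 for each unit of the original quantity."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table yourtable (id integer, code text, quantity integer)")
    conn.executemany("insert into yourtable values (?, ?, ?)", rows)
    result = conn.execute(
        """
        with recursive numberstable(number) as (
            select 1
            union all
            select number + 1 from numberstable where number < ?
        )
        select t.id, t.code, 1 as quantity
        from yourtable t
            join numberstable n on t.quantity >= n.number
        order by t.id
        """,
        (max_quantity,),
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in expand_quantities(QUANTITY_ROWS):
        print(row)
```

The upper bound of the numbers table (100 by default here, as in the answer) caps how many rows a single quantity can expand into, so it needs to be at least as large as the biggest quantity in the table.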
One way would be to generate an array with X elements (where X is the quantity). So for rows
ID Code Quantity
1 A 1
2 B 3
3 C 2
you would get
ID Code Quantity ArrayVar
1 A 1 [1]
2 B 3 [1,2,3]
3 C 2 [1,2]
using a sequence function (e.g., in PrestoDB, sequence(start, stop) -> array(bigint)).
Then unnest the array so that each ID gets X rows, and set the quantity to 1. I'm not sure which SQL dialect you're using, but this should work!
You can use a connect by statement to cross join tables in order to get your desired output.
Check my solution; it works pretty robustly.
select
    "ID",
    "Code",
    1 QUANTITY
from Table1, table(cast(multiset
    (select level from dual
     connect by level <= Table1."Quantity") as sys.OdciNumberList));

SQL - after sorting, return only rows with certain consecutive values in a column

I have columns name, timestamp, doing. I've already sorted by name, then by timestamp, and I expect that moving down the doing column within a group with the same name looks like A, A, A, B, B, A, A, ... - alternating series of A and B. I need to get only the rows which comprise the first B row after a transition from A to B within a group with the same name.
name timestamp doing
1 1 A
1 2 A
1 3 B
1 4 B
1 5 A
2 2 B
2 4 A
2 6 B
2 8 A
I would like to return
name timestamp doing
1 3 B
2 6 B
But not
2 2 B
because it is not a transition from A to B within name = 2
I think you just want lag():
select t.*
from (select t.*,
             lag(doing) over (partition by name order by timestamp) as prev_doing
      from t
     ) t
where prev_doing = 'A' and doing = 'B';
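Here is a minimal sketch that runs the lag() query on the sample rows using SQLite through Python's sqlite3 module. SQLite (3.25 or later), the table name t, and the function name a_to_b_transitions are assumptions for illustration.

```python
import sqlite3

# Sample rows from the question: (name, timestamp, doing).
EVENTS = [
    (1, 1, "A"), (1, 2, "A"), (1, 3, "B"), (1, 4, "B"), (1, 5, "A"),
    (2, 2, "B"), (2, 4, "A"), (2, 6, "B"), (2, 8, "A"),
]


def a_to_b_transitions(rows):
    """Return the first B row after each A-to-B transition within a name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (name integer, timestamp integer, doing text)")
    conn.executemany("insert into t values (?, ?, ?)", rows)
    result = conn.execute(
        """
        select name, timestamp, doing
        from (select t.*,
                     lag(doing) over (partition by name
                                      order by timestamp) as prev_doing
              from t) t
        where prev_doing = 'A' and doing = 'B'
        order by name, timestamp
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in a_to_b_transitions(EVENTS):
        print(row)
```

The row (2, 2, B) is dropped because it is the first row for name = 2, so lag() returns NULL there and the comparison with 'A' is not true.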

SQL combining of a COUNT with a WHERE in single query

Here is the data, call it table T
A B
-- --
1 14
2 15
3 16
4 1
4 3
4 6
4 9
4 12
4 15
I would like to get the value of A that has only one value and a B value of 15.
There are two rows where B=15 but there are 6 rows where A=4 and only one row where A=2.
So the correct SQL should return me the 2.
I have tried this but it returns both rows.
select A from T group by A,B having Count(A) = 1 and B = 15
This similarly fails:
select A from T where B = 15 group by A having count( A ) = 1
Try this:
select A
from T
group by A
having Count(A) = 1 and Max(B) = 15;
Your problem seems to be that you are grouping by both columns. You only want to group by A.
Admittedly, your query has group by A, T, but I think that is a typo, based on the described behavior.
You can check the count of B after grouping by A.
select A
from T
group by A
having Count(B) = 1 and max(B) = 15
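To compare the attempts with the group-by-A version, here is a small sketch that runs all three queries on the sample data in SQLite via Python's sqlite3 module. SQLite, the helper name run, and the constant names are assumptions for illustration; an order by is added only to make the output deterministic.

```python
import sqlite3

# Table T from the question: (A, B).
T_ROWS = [
    (1, 14), (2, 15), (3, 16),
    (4, 1), (4, 3), (4, 6), (4, 9), (4, 12), (4, 15),
]

# First attempt from the question: grouping by both columns.
GROUP_BY_BOTH = (
    "select A from T group by A, B having count(A) = 1 and B = 15 order by A"
)
# Second attempt: filtering on B before counting.
FILTER_FIRST = (
    "select A from T where B = 15 group by A having count(A) = 1 order by A"
)
# Suggested fix: group by A only, then test the count and the single B value.
GROUP_BY_A = (
    "select A from T group by A having count(A) = 1 and max(B) = 15 order by A"
)


def run(query):
    """Run one query against a fresh copy of T and return the A values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table T (A integer, B integer)")
    conn.executemany("insert into T values (?, ?)", T_ROWS)
    values = [row[0] for row in conn.execute(query)]
    conn.close()
    return values


if __name__ == "__main__":
    for label, query in [("group by A, B", GROUP_BY_BOTH),
                         ("where B = 15", FILTER_FIRST),
                         ("group by A", GROUP_BY_A)]:
        print(label, "->", run(query))
```

Both attempts return 2 and 4 because they count rows only after B = 15 has already been isolated, either by the extra grouping column or by the where clause; grouping by A alone counts every row for that A first.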