Remove duplicate values from the column of a PostgreSQL table and maintain only one value per column for duplicated rows - sql

I would like to format the data of the table below as follows. In the value column, I want to keep only one value for each group of duplicated rows.
input table
code value
A 10
A 10
A 10
B 20
B 20
B 20
C 30
C 30
D 40
Expected result
code value
A 10
A
A
B 20
B
B
C 30
C
D 40

A combination of CASE and window function can solve your problem
select code,
case when t.rn = 1 then value else null end value
from (
select row_number() over (partition by code, value order by value) rn,
code, value
from your_table
) t
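As a rough check of the query above, here is a minimal sketch that runs it against the sample data. It assumes Python's built-in sqlite3 module (SQLite 3.25 or later for window functions) as a stand-in for PostgreSQL, and it adds an outer ORDER BY that is not in the original answer so the kept value appears first in each group.

```python
import sqlite3

# In-memory SQLite is only a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (code TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [("A", 10), ("A", 10), ("A", 10),
     ("B", 20), ("B", 20), ("B", 20),
     ("C", 30), ("C", 30), ("D", 40)],
)

query = """
SELECT code,
       CASE WHEN t.rn = 1 THEN value ELSE NULL END AS value
FROM (
    SELECT row_number() OVER (PARTITION BY code, value ORDER BY value) AS rn,
           code, value
    FROM your_table
) t
ORDER BY code, rn  -- added here so the output order is deterministic
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)  # NULL comes back as None
```

Without an outer ORDER BY the database is free to return the rows in any order, so the blanked rows would not be guaranteed to follow the row that keeps its value.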

Related

Fill empty values in running sum

I have to calculate a ratio between several categories every time any category's value changes.
If I have the following table:
Block Category Value
1 A 30
1 B 20
1 C 50
2 A 40
4 B 10
I'd need a way to get the running sum filling empty values (Category A without changes since block 1) with the nearest preceding one:
Block Category Value
1 A 30
1 B 20
1 C 50
2 A 70
2 B 20
2 C 50
4 A 70
4 B 30
4 C 50
I'm using a query like the following one:
SELECT category, block_number,
SUM(block_sum) OVER(PARTITION BY category ORDER BY block_number)
FROM block_table
But in the case there's no value from a given category in a block_number, then I won't get that row in the results.
Generate the rows using a cross join. Use a left join to bring in the values that exist and then use window functions:
select b.block, c.category,
sum(bt.value) over (partition by c.category order by b.block) as value
from (select distinct block from block_table) b cross join
(select distinct category from block_table) c left join
block_table bt
on b.block = bt.block and c.category = bt.category;
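The cross join plus left join approach can be tried on the sample data with the sketch below. It assumes Python's sqlite3 module (SQLite 3.25 or later) rather than the asker's actual database, uses the column names block, category and value from the sample table, and adds an ORDER BY for readable output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in database (assumption)
conn.execute(
    "CREATE TABLE block_table (block INTEGER, category TEXT, value INTEGER)"
)
conn.executemany(
    "INSERT INTO block_table VALUES (?, ?, ?)",
    [(1, "A", 30), (1, "B", 20), (1, "C", 50), (2, "A", 40), (4, "B", 10)],
)

query = """
SELECT b.block, c.category,
       SUM(bt.value) OVER (PARTITION BY c.category ORDER BY b.block) AS value
FROM (SELECT DISTINCT block FROM block_table) b
CROSS JOIN (SELECT DISTINCT category FROM block_table) c
LEFT JOIN block_table bt
       ON b.block = bt.block AND c.category = bt.category
ORDER BY b.block, c.category
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The left join produces a NULL value for every block and category pair that has no row, and SUM ignores NULLs, so the running total simply carries the previous value forward.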

To pull 1 record out of multiple records having same data in a field based on other fields

A | B | C | D | E
a y 6 12 21
b n 3 10 5
c n 4 12 12
c n 7 12 2
c y 1 12 22
d n 6 10 32
d n 7 10 32
OUTPUT TABLE:
A | B | C | F
a y 6 21
b n 3 12
c y 1 22
d n 6 10
I have a table that contains certain fields. From that table I want to remove duplicate records in A and produce the output table.
The field F is calculated from field C when a value of A has no duplicates. If there is only one record for a value in A and C>5, then column F (output table) takes the value from column E. If the record has C<5, as b does, then column F takes the value from column D. I have been able to achieve this using a case statement.
However, when there are duplicate records in column A, I want only one of the records, chosen by column B. Only the record that has the value 'y' in column B should be pulled, with column F taking the value from column E. If none of the duplicate records in A has a value of 'y' in the B column, then pull any one of the records, with column D as column F in the output table. I am not able to figure out this part.
Please let me know if anything is not clear.
Code I am using:
SELECT A,B,C,
CASE
WHEN (SELECT COUNT(*) FROM MyTable t2 WHERE t1.A=t2.A)>1
THEN (SELECT TOP 1 CASE WHEN b='y' THEN E ELSE D END
FROM MyTable t3
WHERE t3.A=t1.A
ORDER BY CASE WHEN b='y' THEN 0 ELSE 1 END)
ELSE
CASE WHEN cast(C as float) >= 5.00 THEN (CASE WHEN E = '0.00' THEN D ELSE E END)
WHEN cast(C as float) < 5.00 THEN D END
END AS F
FROM MyTable t1
You might want to encapsulate this logic in a Function to make it look cleaner, but the logic would go like this:
IF the record count of rows in the table with the same value for A as the current row is greater than 1, THEN SELECT the TOP 1 record with this value for A ORDER BY CASE WHEN b='y' THEN 0 ELSE 1 END
Use another CASE WHEN b='y' to determine if you will use column E or D for output column F.
And ELSE (the record count is not greater than 1), use your existing CASE expression.
EDIT: Here is a more pseudo-codey explanation:
WITH cte AS (SELECT A,B,C,D,E,
ROW_NUMBER() OVER (PARTITION BY A ORDER BY CASE WHEN b='y' THEN 0 ELSE 1 END) rn
FROM MyTable
)
SELECT A,B,C,
CASE
WHEN (SELECT COUNT(*) FROM MyTable t2 WHERE t1.A=t2.A)>1
THEN CASE WHEN b='y' THEN E ELSE D END
ELSE
CASE WHEN cast(C as float) >= 5.00 THEN (CASE WHEN E = '0.00' THEN D ELSE E END)
WHEN cast(C as float) < 5.00 THEN D END
END AS F
FROM cte t1
WHERE rn=1
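The ROW_NUMBER idea can be exercised on the sample table with the sketch below. Several things here are assumptions rather than part of the answer: it runs on Python's sqlite3 module (SQLite 3.25 or later) instead of SQL Server, it swaps the correlated COUNT(*) subquery for COUNT(*) OVER (PARTITION BY A), it compares E with the number 0 instead of the string '0.00', and it adds C as a tie-break so the row picked for d is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for SQL Server (assumption)
conn.execute(
    "CREATE TABLE MyTable (A TEXT, B TEXT, C INTEGER, D INTEGER, E INTEGER)"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?, ?)",
    [("a", "y", 6, 12, 21), ("b", "n", 3, 10, 5),
     ("c", "n", 4, 12, 12), ("c", "n", 7, 12, 2), ("c", "y", 1, 12, 22),
     ("d", "n", 6, 10, 32), ("d", "n", 7, 10, 32)],
)

query = """
WITH cte AS (
    SELECT A, B, C, D, E,
           COUNT(*) OVER (PARTITION BY A) AS cnt,  -- replaces the subquery
           ROW_NUMBER() OVER (
               PARTITION BY A
               ORDER BY CASE WHEN B = 'y' THEN 0 ELSE 1 END,
                        C  -- assumed tie-break, not in the original
           ) AS rn
    FROM MyTable
)
SELECT A, B, C,
       CASE
           WHEN cnt > 1 THEN CASE WHEN B = 'y' THEN E ELSE D END
           WHEN C >= 5 THEN CASE WHEN E = 0 THEN D ELSE E END
           ELSE D
       END AS F
FROM cte
WHERE rn = 1
ORDER BY A
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

For b this yields the D value 10, which is what the stated rule (C below 5 takes D) produces for that row.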

SQL combining of a COUNT with a WHERE in single query

Here is the data, call it table T
A B
-- --
1 14
2 15
3 16
4 1
4 3
4 6
4 9
4 12
4 15
I would like to get the value of A that has only one value and a B value of 15.
There are two rows where B=15 but there are 6 rows where A=4 and only one row where A=2.
So the correct SQL should return me the 2.
I have tried this but it returns both rows.
select A from T group by A,B having Count(A) = 1 and B = 15
This similarly fails:
select A from T where B = 15 group by A having count( A ) = 1
Try this:
select A
from T
group by A
having Count(A) = 1 and Max(B) = 15;
Your problem seems to be that you are grouping by both columns. You only want to group by A.
Admittedly, your query has group by A, T, but I think that is a typo, based on the described behavior.
You can check the count of B after grouping by A.
select A
from T
group by A
having Count(B) = 1 and max(B) = 15
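To see why filtering in HAVING differs from filtering in WHERE, the sketch below runs both forms on the sample data. It assumes Python's sqlite3 module as a stand-in for whatever database the asker uses, and adds ORDER BY A only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in database (assumption)
conn.execute("CREATE TABLE T (A INTEGER, B INTEGER)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?)",
    [(1, 14), (2, 15), (3, 16),
     (4, 1), (4, 3), (4, 6), (4, 9), (4, 12), (4, 15)],
)

# Group on A alone, then test both conditions per group.
having_rows = conn.execute(
    "SELECT A FROM T GROUP BY A "
    "HAVING COUNT(*) = 1 AND MAX(B) = 15 ORDER BY A"
).fetchall()

# Filtering B = 15 first hides the other A = 4 rows from the count.
where_rows = conn.execute(
    "SELECT A FROM T WHERE B = 15 GROUP BY A "
    "HAVING COUNT(A) = 1 ORDER BY A"
).fetchall()

print(having_rows)
print(where_rows)
```

The WHERE version discards the five other rows with A=4 before grouping, so that group also appears to contain a single row.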

How to add two values of the same column in a table

Consider the following table:
ID COL VALUE
1 A 10
2 B 10
3 C 10
4 D 10
5 E 10
Output:
ID COL VALUE
1 A 10
2 B 20
3 C 30
4 D 40
5 E 50
Based on your (deleted) comment, "in output it is taking up the sum of the upper values", it sounds like you want a cumulative SUM().
You can do this with a windowed function:
Select Id, Col, Sum(Value) Over (Order By Id) As Value
From YourTable
Output
Id Col Value
1 A 10
2 B 20
3 C 30
4 D 40
5 E 50
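A quick way to reproduce that output is the sketch below, which assumes Python's sqlite3 module (SQLite 3.25 or later) in place of SQL Server and reuses the table name YourTable from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in database (assumption)
conn.execute("CREATE TABLE YourTable (Id INTEGER, Col TEXT, Value INTEGER)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(1, "A", 10), (2, "B", 10), (3, "C", 10), (4, "D", 10), (5, "E", 10)],
)

# With ORDER BY and no explicit frame, the window runs from the first row
# up to the current row, which gives a running total.
rows = conn.execute(
    "SELECT Id, Col, SUM(Value) OVER (ORDER BY Id) AS Value "
    "FROM YourTable ORDER BY Id"
).fetchall()

for row in rows:
    print(row)
```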
Please use the code below to obtain the cumulative sum. The code works as expected with SQL Server 2012.
DECLARE @Table TABLE (ID int, COL CHAR(2), VALUE int)
INSERT @Table
(ID,COL,[VALUE])
VALUES
(1,'A',10),
(2,'B',10),
(3,'C',10),
(4,'D',10),
(5,'E',10)
SELECT t.ID,t.COL,SUM(VALUE) OVER (ORDER BY t.ID) AS VALUE
FROM @Table t
Not really sure what you are asking for. If my assumption is correct, you want to SUM the contents of a column and group it.
Select sum(value), col
from YourTable
group by col

Finding unique values with multiple columns using certain condition

ID? A B C
--- -- -- --
1 J 1 B
2 J 1 S
3 M 1 B
4 M 1 S
5 M 2 B
6 M 2 S
7 T 1 B
8 T 2 S
9 C 1 B
10 C 1 S
11 C 2 B
12 N 1 S
13 N 2 S
14 N 3 S
15 Q 1 S
16 Q 1 S
17 Z 1 B
I need to find unique values across multiple columns, subject to some added conditions. A unique value is a combination of columns A, B and C.
If column A has only two rows (like records 1 and 2), column B is the same in both, and column C differs, then I don't need those records.
If column A has multiple rows (like records 3 to 6) with different column B and C combinations, we want to see those values.
If column A has multiple rows (like records 7 to 8) with different column B and C combinations, we want to see those values.
If column A has multiple rows (like records 9 to 11) with similar or different column B and C combinations, we want to see those values.
If column A has multiple rows (like records 12 onwards) with the same column C and the same or different column B, we don't need those values.
If there is a single row, like row 17, there is no need to display it either.
I have tried a lot but am not getting the exact answer. Any help is greatly appreciated.
Trying to go through all the logic, I think you want all rows where the values of both columns B and C differ within a value of A. An easy way to see whether records differ is to compare the min and max values, and you can do this using analytic functions:
select A, B, C
from (select t.*,
count(*) over (partition by A) as Acnt,
min(B) over (partition by A) as Bmin,
max(B) over (partition by A) as Bmax,
min(C) over (partition by A) as Cmin,
max(C) over (partition by A) as Cmax
from t
) t
where (Bmin <> Bmax and Cmin <> Cmax)
Your example data does not have any actual duplicates, so I don't think a count(distinct) is necessary. Your rules say nothing about what to do when A only appears once. This version will filter those rows out.
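The min/max comparison can be tried on the sample rows with the sketch below. It assumes Python's sqlite3 module (SQLite 3.25 or later), a table named t with an id column, and it runs the filter twice, once joining the two conditions with AND and once with OR, so the effect of each choice on the sample data is visible. The helper name ids_where is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in database (assumption)
conn.execute("CREATE TABLE t (id INTEGER, A TEXT, B INTEGER, C TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, "J", 1, "B"), (2, "J", 1, "S"),
     (3, "M", 1, "B"), (4, "M", 1, "S"), (5, "M", 2, "B"), (6, "M", 2, "S"),
     (7, "T", 1, "B"), (8, "T", 2, "S"),
     (9, "C", 1, "B"), (10, "C", 1, "S"), (11, "C", 2, "B"),
     (12, "N", 1, "S"), (13, "N", 2, "S"), (14, "N", 3, "S"),
     (15, "Q", 1, "S"), (16, "Q", 1, "S"),
     (17, "Z", 1, "B")],
)

template = """
SELECT id
FROM (SELECT t.*,
             MIN(B) OVER (PARTITION BY A) AS Bmin,
             MAX(B) OVER (PARTITION BY A) AS Bmax,
             MIN(C) OVER (PARTITION BY A) AS Cmin,
             MAX(C) OVER (PARTITION BY A) AS Cmax
      FROM t) t
WHERE (Bmin <> Bmax {op} Cmin <> Cmax)
ORDER BY id
"""

def ids_where(op):
    """Return the ids kept when the two conditions are joined by `op`."""
    return [r[0] for r in conn.execute(template.format(op=op))]

both_differ = ids_where("AND")    # B and C must both vary within A
either_differs = ids_where("OR")  # it is enough for one of them to vary

print(both_differ)
print(either_differs)
```

On this data the AND form keeps records 3 to 11 (groups M, T and C), which lines up with the rules listed in the question, while the OR form also keeps J and N.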