How to rewrite some values in a query? (PostgreSQL)

I have a query whose result is in table1:
-------------------------------
| column1 | column2 | column3 |
-------------------------------
| v1      | 30      | 40      |
| v1      | 34      | 41      |
| v1      | 35      | 42      |
| v2      | 30      | 40      |
| v2      | 34      | 41      |
| v2      | 35      | 42      |
-------------------------------
I want to change the duplicated values in the first column to NULL, like this:
-------------------------------
| column1 | column2 | column3 |
-------------------------------
| v1      | 30      | 40      |
| null    | 34      | 41      |
| null    | 35      | 42      |
| v2      | 30      | 40      |
| null    | 34      | 41      |
| null    | 35      | 42      |
-------------------------------
What should I do with table1?
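For anyone who wants to follow along, here is a minimal setup sketch (the names table1/column1..column3 come from the question; the column types are my assumptions):
-- Hypothetical setup matching the sample data above
CREATE TABLE table1 (
    column1 text,
    column2 integer,
    column3 integer
);

INSERT INTO table1 (column1, column2, column3) VALUES
    ('v1', 30, 40),
    ('v1', 34, 41),
    ('v1', 35, 42),
    ('v2', 30, 40),
    ('v2', 34, 41),
    ('v2', 35, 42);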

You can use the row_number() function (ordering by column2 within each partition, so that which row keeps its value is deterministic):
select (case when row_number() over (partition by column1 order by column2) > 1
             then null else column1
        end) as column1,
       column2, column3
from table1 t;
Alternatively, you can use the lag() function:
select (case when lag(column1) over (partition by column1 order by column2) = column1
             then null else column1
        end) as column1,
       column2, column3
from table1 t;
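One caveat worth adding (my note, not part of the original answer): the outer query blanks column1, so if you also want the rows displayed in group order you have to sort on the inner, unblanked column, e.g.:
-- Sort on the subquery's intact column1 (t.column1), not the
-- blanked output column, to keep each group's rows together.
select (case when rn > 1 then null else t.column1 end) as column1,
       t.column2, t.column3
from (
    select column1, column2, column3,
           row_number() over (partition by column1 order by column2) as rn
    from table1
) t
order by t.column1, t.column2;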

Try this:
SELECT CASE WHEN D.RN = 1 THEN column1 ELSE NULL END AS column1,
       column2,
       column3
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY column1 ORDER BY column1) AS RN
    FROM Your_table
) D

You can try this query. It uses the ROW_NUMBER function to number the rows, then a CASE expression to show column1 only on the first row of each group and NULL otherwise. ORDER BY (SELECT 1) tells the window function that no particular ordering is needed (note that it does not actually guarantee the original order of the data).
select CASE WHEN rn = 1 THEN column1 ELSE NULL END AS column1,
       column2,
       column3
from
(
    SELECT column1,
           column2,
           column3,
           ROW_NUMBER() OVER (PARTITION BY column1 ORDER BY (SELECT 1)) AS rn
    FROM T
) t
sqlfiddle: http://sqlfiddle.com/#!15/fcf3d/1

SELECT CASE WHEN RN = 1 THEN "column1" ELSE NULL END AS "column1",
       "column2", "column3"
FROM
(
    SELECT "column1", "column2", "column3",
           ROW_NUMBER() OVER (PARTITION BY "column1" ORDER BY "column1") AS RN
    FROM Table1
) AS T
Output:
column1 | column2 | column3
--------+---------+--------
v1      | 30      | 40
(null)  | 34      | 41
(null)  | 35      | 42
v2      | 30      | 40
(null)  | 34      | 41
(null)  | 35      | 42
Demo: http://sqlfiddle.com/#!17/5b2e1/9

Related

Filter out rows from the final result, while still utilizing some of their values?

To give an example, let's say I have a view that returns the following result:
| id | foreignkey | value1 | value2 |
|----|------------|--------|--------|
| 1  | 500        | -100   | 0      |
| 2  | 500        | 900    | 15     |
| 3  | 500        | 570    | 25     |
| 4  | 999        | 100    | 57     |
| 5  | 999        | 150    | 0      |
The logic I'm trying to implement is as follows -
Filter out all rows that have value2 = 0.
But for rows that have value2 = 0, I need to add its value1 to the value1 of all other rows with the same foreign key where value2 != 0. If there are no other rows with the same foreign key, then the rows with value2 = 0 simply get filtered out.
So in this example, I want the final result to be
| id | foreignkey | value1 | value2 |
|----|------------|--------|--------|
| 2  | 500        | 800    | 15     |
| 3  | 500        | 470    | 25     |
| 4  | 999        | 250    | 57     |
Any ideas? I was thinking something with group by might be possible but haven't been able to come up with a solution yet.
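A minimal setup sketch if you want to experiment locally (the name tablename mirrors the first answer below; the column types are my assumptions):
-- Hypothetical table standing in for the view's output
CREATE TABLE tablename (
    id         integer,
    foreignkey integer,
    value1     integer,
    value2     integer
);

INSERT INTO tablename VALUES
    (1, 500, -100,  0),
    (2, 500,  900, 15),
    (3, 500,  570, 25),
    (4, 999,  100, 57),
    (5, 999,  150,  0);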
With a SUM() window function:
select id, foreignkey, value1 + coalesce(total, 0) as value1, value2
from (
    select *,
           sum(case when value2 = 0 then value1 end) over (partition by foreignkey) as total
    from tablename
) t
where value2 <> 0
See the demo.
Results:
| id | foreignkey | value1 | value2 |
|----|------------|--------|--------|
| 2  | 500        | 800    | 15     |
| 3  | 500        | 470    | 25     |
| 4  | 999        | 250    | 57     |
Hmmm . . . assuming this doesn't filter out all rows, you can use window functions like this (the sum collects value1 from the value2 = 0 rows and adds it to every remaining row in the group):
select id, foreignkey, value1 + coalesce(value1_0, 0) as value1, value2
from (select t.*,
             sum(case when value2 = 0 then value1 end) over (partition by foreignkey) as value1_0
      from t
     ) t
where value2 <> 0;
One way is to treat all zero rows as one group and all the others as another group (based on foreignkey), then join the two groups, add the values, and finally select only the required rows:
with cte as
(
    select id, foreignkey, value1, value2,
           dense_rank() over (partition by foreignkey
                              order by (case when value2 = 0 then 0 else 1 end)) as rn
    from tablename t1
),
cte2 as
(
    select t1.id, t1.foreignkey,
           t1.value1 + coalesce(t2.value1, 0) as value1,
           t1.value2
    from cte t1
    left join cte t2 on (t2.foreignkey = t1.foreignkey and t1.rn <> t2.rn)
)
select * from cte2
where value2 <> 0
Please find the db<>fiddle here.

Select most recent rows - last 24 hours

I have a table that looks like this:
col1 | col2 | col3 | t_insert
-----+------+------+-------------------------------
1    | z    |      | 2018-04-25 17:23:46.686816+10
1    | zy   |      | 2018-04-26 18:53:46.686816+10
2    | f    |      | 2018-04-26 19:23:46.686816+10
3    | g    |      | 2018-04-27 17:23:46.686816+10
2    | z    |      | 2018-04-27 18:23:46.686816+10
4    | z    |      | 2018-04-27 20:13:46.686816+10
Where there are duplicate values in col1 I want to select by most recent timestamp and create a new column (col4) and insert the string 'update'.
Where there are not duplicate values in col1 I want to select the value and insert the string 'new' into col4.
Also I only want to select rows that have a timestamp from the last 24 hours.
The expected result (note: this sample output doesn't reflect the last-24-hours filter):
col1 | col2 | col3 | t_insert                       | col4
-----+------+------+--------------------------------+-------
1    | zy   |      | 2018-04-26 18:53:46.686816+10  | update
3    | g    |      | 2018-04-27 17:23:46.686816+10  | new
2    | z    |      | 2018-04-27 18:23:46.686816+10  | update
4    | z    |      | 2018-04-27 20:13:46.686816+10  | new
Thanks in advance,
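A quick setup sketch for reproducing this locally (the table name t follows the answer below; the types are my assumptions):
-- Hypothetical table matching the sample rows above
CREATE TABLE t (
    col1     integer,
    col2     text,
    col3     text,
    t_insert timestamptz
);

INSERT INTO t (col1, col2, t_insert) VALUES
    (1, 'z',  '2018-04-25 17:23:46.686816+10'),
    (1, 'zy', '2018-04-26 18:53:46.686816+10'),
    (2, 'f',  '2018-04-26 19:23:46.686816+10'),
    (3, 'g',  '2018-04-27 17:23:46.686816+10'),
    (2, 'z',  '2018-04-27 18:23:46.686816+10'),
    (4, 'z',  '2018-04-27 20:13:46.686816+10');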
Hmmm, window functions can help here:
select col1, col2, col3, t_insert,
       (case when cnt > 1 then 'update' else 'new' end) as col4
from (select t.*,
             count(*) over (partition by col1) as cnt,
             row_number() over (partition by col1 order by t_insert desc) as seqnum
      from t
      where t_insert >= now() - interval '24 hours'
     ) t
where seqnum = 1;
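As a PostgreSQL-specific alternative (my sketch, not from the original thread), DISTINCT ON can pick the latest row per col1 directly, while the count(*) window still supplies the new/update flag:
-- DISTINCT ON keeps the first row per col1 according to the
-- ORDER BY; window functions are evaluated before DISTINCT,
-- so cnt still counts every row that passed the WHERE clause.
select distinct on (col1)
       col1, col2, col3, t_insert,
       case when count(*) over (partition by col1) > 1
            then 'update' else 'new'
       end as col4
from t
where t_insert >= now() - interval '24 hours'
order by col1, t_insert desc;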

Exclude rows with the same values in some columns

I have a table like the following:
id | col1 | col2 | col3  | col4
---+------+------+-------+-----------
1  | abc  | 23   | data1 | otherdata1
2  | def  | 41   | data2 | otherdata2
3  | ghi  | 41   | data3 | otherdata3
4  | jkl  | 58   | data4 | otherdata4
5  | mno  | 23   | data1 | otherdata5
6  | pqr  | 41   | data3 | otherdata6
7  | stu  | 76   | data2 | otherdata7
How can I quickly select the rows where the (col2, col3) pair has no duplicates? There are over 15 million rows in the table, so a join may not be suitable.
Final result should look like this:
id | col1 | col2 | col3  | col4
---+------+------+-------+-----------
2  | def  | 41   | data2 | otherdata2
4  | jkl  | 58   | data4 | otherdata4
7  | stu  | 76   | data2 | otherdata7
Not sure how fast this will be, but this should work:
select id, col1, col2, col3, col4
from (
    select id, col1, col2, col3, col4,
           count(*) over (partition by col2, col3) as cnt
    from the_table
) t
where cnt = 1
order by id;
Window functions are definitely one possibility. But if you care about performance, it is also worth trying another approach and comparing the speed.
NOT EXISTS comes to mind:
select t.*
from the_table t
where not exists (select 1
                  from the_table t2
                  where t2.col2 = t.col2 and
                        t2.col3 = t.col3 and
                        t2.id <> t.id
                 );
This can take advantage of an index on the_table(col2, col3).
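For reference, such an index would look something like this (the index name is arbitrary):
-- Hypothetical index supporting the NOT EXISTS probe on (col2, col3)
CREATE INDEX idx_the_table_col2_col3 ON the_table (col2, col3);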
Try this as well (numbering each (col2, col3) group from both ends, so only groups with a single row qualify):
select id, col1, col2, col3, col4
from
(
    select id, col1, col2, col3, col4,
           row_number() over (partition by col2, col3 order by id asc)  as rn_first,
           row_number() over (partition by col2, col3 order by id desc) as rn_last
    from the_table
) x
where rn_first = 1 and rn_last = 1;

Compare two dates from columns SQL

I am having some issues with getting this working. I have a table with this data in it.
| DateStarted         | Field9 | Field2 | ID | Field6 |
|---------------------|--------|--------|----|--------|
| 2013-04-15 09:23:00 | TEST1  | TEST2  | 1  | 2000   |
| 2013-04-08 09:23:00 | TEST1  | TEST2  | 2  | 180    |
| 2013-04-15 09:23:00 | TEST2  | TEST3  | 3  | 1000   |
| 2013-04-04 09:23:00 | TEST2  | TEST3  | 7  | 80     |
| 2013-04-03 09:23:00 | TEST2  | TEST4  | 5  | 70     |
My end goal is to have the last two dates for each value of Field9 returned, so that I can subtract the Field6 values for each unique instance of Field9. Below is an example of the return.
| DateStarted         | Field1 | Field2 | ID | SUB  |
|---------------------|--------|--------|----|------|
| 2013-04-15 09:23:00 | TEST1  | TEST2  | 1  | 1820 |
| 2013-04-15 09:23:00 | TEST2  | TEST3  | 3  | 920  |
So for the second row it took the two most recent dates for TEST2 and subtracted their Field6 values (1000 - 80 = 920), returning just the one row.
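A setup sketch for testing (the name your_table_name follows the first answer below; the types are my assumptions):
-- Hypothetical table matching the sample data above
CREATE TABLE your_table_name (
    DateStarted timestamp,
    Field9      text,
    Field2      text,
    ID          integer,
    Field6      integer
);

INSERT INTO your_table_name VALUES
    ('2013-04-15 09:23:00', 'TEST1', 'TEST2', 1, 2000),
    ('2013-04-08 09:23:00', 'TEST1', 'TEST2', 2, 180),
    ('2013-04-15 09:23:00', 'TEST2', 'TEST3', 3, 1000),
    ('2013-04-04 09:23:00', 'TEST2', 'TEST3', 7, 80),
    ('2013-04-03 09:23:00', 'TEST2', 'TEST4', 5, 70);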
You can get the latest row for each unique value of Field9 by using partitioned windowing functions.
;WITH x AS
(
    SELECT DateStarted, Field9, Field2, ID, Field6,
           rn = ROW_NUMBER() OVER (PARTITION BY Field9 ORDER BY DateStarted DESC)
    FROM dbo.your_table_name
),
y AS
(
    SELECT x.*, [SUB] = x.Field6 - COALESCE(y.Field6, 0)
    FROM x LEFT OUTER JOIN x AS y
      ON x.Field9 = y.Field9
     AND x.rn = 1 AND y.rn = 2
)
SELECT DateStarted, Field1 = Field9, Field2, ID, [SUB]
FROM y
WHERE rn = 1
ORDER BY Field1;
SQL fiddle demo
One way to get the difference is to identify the two rows and then aggregate them together:
select MAX(case when seqnum = 1 then DateStarted end) as DateStarted,
       Field9 as Field1,
       MAX(case when seqnum = 1 then Field2 end) as Field2,
       MAX(case when seqnum = 1 then ID end) as ID,
       MAX(case when seqnum = 1 then Field6 end) - MAX(case when seqnum = 2 then Field6 end) as SUB
from (SELECT DateStarted, Field9, Field2, ID, Field6,
             ROW_NUMBER() OVER (PARTITION BY Field9 ORDER BY DateStarted DESC) as seqnum
      FROM t
     ) t
group by Field9
This uses conditional aggregation to get the difference.

SQL: check if ungrouped column values match

I have a SQL table with the following values:
| col1 | col2 | source | values |
|------|------|--------|--------|
| 1    | 2    | A      | null   |
| 1    | 2    | B      | 1.0    |
| 1    | 2    | C      | null   |
| 1    | 4    | A      | 2.0    |
| 1    | 4    | B      | 2.0    |
| 1    | 4    | C      | 2.0    |
| 1    | 5    | A      | null   |
| 1    | 5    | B      | null   |
| 1    | 5    | C      | null   |
How can I get an output grouped by col1 and col2 with a flag where:
all values in a group match (flag = 1)
all values are null (flag = 2)
some values are null (flag = 3)
Output:
| col1 | col2 | flag |
|------|------|------|
| 1    | 2    | 3    |
| 1    | 4    | 1    |
| 1    | 5    | 2    |
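A setup sketch for the answers below (the name Table1 and the quoted "values" column follow the answers; the types are my assumptions):
-- Hypothetical table matching the sample data;
-- "values" is quoted because VALUES is a reserved word.
CREATE TABLE Table1 (
    col1     integer,
    col2     integer,
    source   text,
    "values" numeric
);

INSERT INTO Table1 VALUES
    (1, 2, 'A', NULL),
    (1, 2, 'B', 1.0),
    (1, 2, 'C', NULL),
    (1, 4, 'A', 2.0),
    (1, 4, 'B', 2.0),
    (1, 4, 'C', 2.0),
    (1, 5, 'A', NULL),
    (1, 5, 'B', NULL),
    (1, 5, 'C', NULL);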
Based on your updated question:
SELECT col1,
       col2,
       SUM(CASE WHEN SomeConditionHere THEN 1 ELSE 0 END) AS Flag
FROM Table1
GROUP BY col1, col2;
SQL Fiddle Demo
This will give you:
| COL1 | COL2 | FLAG |
|------|------|------|
| 1    | 2    | 2    |
| 1    | 4    | 0    |
| 1    | 5    | 3    |
Note: I assumed the flag counts the NULL values in the "values" column, so I used "Values" IS NULL in place of SomeConditionHere. I couldn't work out how the flag should be computed from the expected results you posted; substitute whatever predicate defines your flag for "Values" IS NULL.
Update:
Try this:
WITH Flags AS
(
    SELECT col1, col2,
           COUNT(*) AS ValuesCount,
           SUM(CASE WHEN "Values" IS NULL THEN 1 ELSE 0 END) AS NULLValues
    FROM Table1
    GROUP BY col1, col2
)
SELECT col1,
       col2,
       CASE WHEN ValuesCount = NULLValues THEN 2
            WHEN NULLValues = 0
             AND ValuesCount = (SELECT COUNT(*)
                                FROM Table1 t2
                                WHERE t1.col1 = t2.col1
                                  AND t1.col2 = t2.col2) THEN 1
            ELSE 3
       END AS Flag
FROM Flags t1;
Updated SQL Fiddle Demo
This will give you:
| COL1 | COL2 | FLAG |
|------|------|------|
| 1    | 2    | 3    |
| 1    | 4    | 1    |
| 1    | 5    | 2    |
In SQL Server 2005+:
;WITH cte AS
(
    SELECT col1, col2, [values],
           COUNT(CASE WHEN [values] IS NULL THEN 1 END) OVER (PARTITION BY col1, col2) AS cntNULL,
           COUNT(*) OVER (PARTITION BY col1, col2) AS cntCol
    FROM dbo.test5
)
SELECT col1, col2,
       MAX(CASE WHEN cntNULL = 0 THEN 1
                WHEN cntNULL = cntCol THEN 2
                ELSE 3
           END) AS flag
FROM cte
GROUP BY col1, col2
Demo on SQLFiddle
...And a solution without a CTE, if you want more portable SQL:
select col1,
       col2,
       case
           when DistinctValuesWithoutNulls = 1 and NullCount = 0 then 1
           when DistinctValuesWithoutNulls = 0 then 2
           when NullCount > 0 then 3
       end as flag
from
(
    select col1,
           col2,
           count(distinct "values") as DistinctValuesWithoutNulls,
           sum(case when "values" is null then 1 else 0 end) as NullCount
    from Table1
    group by col1, col2
) tmp