How to set an incrementing flag column for related rows? - sql

I am trying to create a flag column called "Related" to use in reporting to highlight specific rows that are related based on the ID column (1 = related, NULL = not related). The original table "table1" looks like below:
Name ID Related
--------------------------------
Jack 101 NULL
John 101 NULL
Pat 105 NULL
Ben 106 NULL
Jordan 106 NULL
George 300 NULL
Alan 500 NULL
Bill 200 NULL
Bob 200 NULL
I then used this UPDATE statement below:
UPDATE a
SET Related = 1
FROM table1 a
JOIN (SELECT ID FROM table1 GROUP BY ID HAVING COUNT(*) > 1) b
ON a.ID = b.ID
Below is the result of this update statement:
Name ID Related
--------------------------------
Jack 101 1
John 101 1
Pat 105 NULL
Ben 106 1
Jordan 106 1
George 300 NULL
Alan 500 NULL
Bill 200 1
Bob 200 1
This gets me close, but instead of assigning the number 1 to every related row, I need the number to increment for each set of related rows, based on their distinct ID values.
Desired result:
Name ID Related
--------------------------------
Jack 101 1
John 101 1
Pat 105 NULL
Ben 106 2
Jordan 106 2
George 300 NULL
Alan 500 NULL
Bill 200 3
Bob 200 3

One possible solution uses DENSE_RANK to number the related values, together with an updatable CTE (here t stands for your table):
with r as (
select id
from t
group by id having Count(*) > 1
),
n as (
select t.id, t.related, Dense_Rank() over (order by r.id) r
from r
join t on t.id = r.id
)
update n set related = r

You can do this without a self-join, just using window functions in a CTE, and updating the CTE directly:
WITH tCounted AS (
SELECT
t.id,
t.related,
c = COUNT(*) OVER (PARTITION BY t.id)
FROM t
),
tWithRelated AS (
SELECT
id,
related,
rn = DENSE_RANK() OVER (ORDER BY id)
FROM tCounted
WHERE c > 1
)
UPDATE tWithRelated
SET related = rn;

Use an updateable CTE - comments explain the logic.
with cte1 as (
select [Name], ID, Related
-- Get the count within the id partition, less 1 as specified
, count(*) over (partition by id) - 1 cnt
-- Get the row number within the id partition
, row_number() over (partition by id order by id) rn
from #Test
), cte2 as (
select [Name], ID, Related, cnt, rn
-- Add 1 *only* if the count is > 0 *and* it's the first row in the id partition
, case when cnt > 0 then sum(case when cnt > 0 and rn = 1 then 1 else 0 end) over (order by id) else null end NewRelated
from cte1
)
update cte2 set Related = NewRelated;
This doesn't assume Related is already null and works for more than 2 rows for any given ID.
It does assume that one can order by the ID column - even though the data provided doesn't do that.
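For anyone who wants to try the numbering logic outside SQL Server, here is a minimal sketch in Python using the standard sqlite3 module. It is an illustration under assumptions, not one of the answers above: SQLite has no updatable CTEs, so the DENSE_RANK result is written back through a correlated subquery instead, and the function name flag_related_groups is made up for this example. Window functions need SQLite 3.25 or later.

```python
import sqlite3


def flag_related_groups(rows):
    """Number each ID that occurs more than once; leave the other rows NULL.

    rows is a list of (name, id) pairs. Hypothetical helper for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (Name TEXT, ID INTEGER, Related INTEGER)")
    conn.executemany("INSERT INTO table1 (Name, ID) VALUES (?, ?)", rows)
    conn.execute(
        """
        WITH counted AS (
            -- how many rows share each ID
            SELECT ID, COUNT(*) OVER (PARTITION BY ID) AS c
            FROM table1
        ),
        ranked AS (
            -- rank only the IDs that occur more than once
            SELECT DISTINCT ID, DENSE_RANK() OVER (ORDER BY ID) AS rn
            FROM counted
            WHERE c > 1
        )
        -- IDs missing from ranked get NULL from the scalar subquery
        UPDATE table1
        SET Related = (SELECT rn FROM ranked WHERE ranked.ID = table1.ID)
        """
    )
    result = conn.execute(
        "SELECT Name, ID, Related FROM table1 ORDER BY rowid"
    ).fetchall()
    conn.close()
    return result
```

Because the scalar subquery yields NULL for IDs that are not in ranked, this sketch also resets any stale flag on unrelated rows, which matches the third answer's point about not assuming Related is already NULL.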

Related

How to delete records with lower version in big query?

Let's say my table contains the following data:
id name version
1 Rahul 1
1 Rahul 2
2 John 1
3 Mike 1
2 John 2
4 Rubel 1
5 David 1
1 Rahul 3
I need to filter out the duplicate records that have a lower version. How can this be done?
The output should essentially be:
id name version
1 Rahul 3
2 John 2
3 Mike 1
4 Rubel 1
5 David 1
For this dataset, aggregation seems sufficient:
select id, name, max(version) as max_version
from mytable
group by id, name
You can use NOT EXISTS as follows:
SELECT id, name, version
FROM your_table t
WHERE NOT EXISTS
(SELECT 1 FROM your_table tt
WHERE tt.id = t.id AND tt.version > t.version)
Or you can use the analytic function ROW_NUMBER as follows:
SELECT id, name, version FROM
(SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY version DESC) AS rn
FROM your_table t) t
WHERE rn = 1
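To check the ROW_NUMBER approach against the sample data without a BigQuery project, here is a small sketch using Python's built-in sqlite3 module. SQLite is only standing in for BigQuery here, and the function name latest_versions is an assumption made for this example.

```python
import sqlite3


def latest_versions(rows):
    """Keep only the highest-version row per id.

    rows is a list of (id, name, version) tuples. Hypothetical helper.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER, name TEXT, version INTEGER)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT id, name, version
        FROM (
            -- rn = 1 marks the newest version within each id
            SELECT t.*,
                   ROW_NUMBER() OVER (PARTITION BY id ORDER BY version DESC) AS rn
            FROM mytable AS t
        ) AS ranked
        WHERE rn = 1
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return result
```

Unlike the plain MAX(version) aggregation, this form keeps whole rows, so it still works when the table has further columns that differ between versions.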

Duplicate id rows with few columns to unique id row with many columns Oracle SQL

I have a pole table in which each pole can have one to four streetlights. Each row has a pole ID and the type (a description) of streetlight. I need the IDs to be unique, with a column for each of the possible streetlights. The type/description can be any one of 26 strings.
I have something like this:
ID Description
----------------
1 S 400
1 M 200
1 HPS 1000
1 S 400
2 M 400
2 S 250
3 S 300
What I need:
ID Description_1 Description_2 Description_3 Description_4
------------------------------------------------------------------
1 S 400 M 200 HPS 1000 S 400
2 M 400 S 250
3 S 300
The order in which the descriptions populate the description columns is not important; e.g. for ID = 1, the HPS 1000 value could be in description column 1, 2, 3, or 4, so long as all values are present.
I tried to pivot it but I don't think that is the right tool.
select * from table t
pivot (
max(Description) for ID in (1, 2, 3))
Because there are ~3000 IDs I would end up with a table that is ~3001 columns wide...
I also looked at this Oracle SQL Cross Tab Query, but it is not quite the same situation.
What is the right way to solve this problem?
You can use row_number() and conditional aggregation:
select
id,
max(case when rn = 1 then description end) description_1,
max(case when rn = 2 then description end) description_2,
max(case when rn = 3 then description end) description_3,
max(case when rn = 4 then description end) description_4
from (
select t.*, row_number() over(partition by id order by description) rn
from mytable t
) t
group by id
This handles up to 4 descriptions per id. To handle more, you can just expand the select clause with more conditional max()s.
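Here is a runnable sketch of the same row_number() plus conditional aggregation idea, using Python's sqlite3 module in place of Oracle. The function name pivot_descriptions is hypothetical; the table and column names follow the answer above.

```python
import sqlite3


def pivot_descriptions(rows):
    """Turn (id, description) rows into one row per id with four columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER, description TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT
            id,
            MAX(CASE WHEN rn = 1 THEN description END) AS description_1,
            MAX(CASE WHEN rn = 2 THEN description END) AS description_2,
            MAX(CASE WHEN rn = 3 THEN description END) AS description_3,
            MAX(CASE WHEN rn = 4 THEN description END) AS description_4
        FROM (
            -- number the streetlights on each pole 1..n
            SELECT t.*,
                   ROW_NUMBER() OVER (PARTITION BY id ORDER BY description) AS rn
            FROM mytable AS t
        ) AS numbered
        GROUP BY id
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return result
```

Each CASE expression is non-NULL for at most one row per id, so MAX simply picks that value; poles with fewer than four streetlights get NULL in the unused columns.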

postgresql - filter out double rows (but not the first and last one)

I have a PostgreSQL problem.
I have a table that looks like this:
id name level timestamp
1 pete 1 100
2 pete 1 200
3 pete 1 500
4 pete 5 900
7 pete 5 1000
9 pete 5 1200
15 pete 2 700
Now I want to delete the rows I don't need. I only want to know the first row where he reaches a new level and the last row where he still has that level.
id name level timestamp
1 pete 1 100
3 pete 1 500
15 pete 2 700
4 pete 5 900
9 pete 5 1200
(There are many more columns, like realmpoints and so on.)
I have a solution that works if the timestamp is only increasing:
SELECT id, name, level, timestamp
FROM player_testing
WHERE id IN (
SELECT MAX(dup.id)
FROM player_testing AS dup
GROUP BY dup.name, dup.level
UNION
SELECT MIN(dup.id)
FROM player_testing AS dup
GROUP BY dup.name, dup.level
)
ORDER BY timestamp
But I can't find a way to make it work for my problem.
select id, name, level, timestamp
from (
select id,name,level,timestamp,
row_number() over (partition by name, level order by timestamp) as rn,
count(*) over (partition by name, level) as max_rn
from player_testing
) t
where rn = 1 or rn = max_rn;
Btw: timestamp is a horrible name for a column. For one thing it's a reserved word, but more importantly it doesn't document what the column contains. Is it a start_timestamp, an end_timestamp, a valid_until_timestamp, ...?
Here is an alternative to @a_horse_with_no_name's solution that avoids OVER (PARTITION BY ...) and is thus more generic SQL:
select *
from player_testing as A
where id = (
select min(id)
from player_testing as B
where A.name = B.name
and A.level = B.level
)
or id = (
select max(id)
from player_testing as B
where A.name = B.name
and A.level = B.level
)
Here is the fiddle to show it working: http://sqlfiddle.com/#!2/47bd44/1
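The window-function answer can be tried locally with a short sketch like the one below, which uses Python's sqlite3 module as a stand-in for PostgreSQL. Two things are assumptions of this example rather than part of the answers: the timestamp column is renamed to ts (following the naming advice above), and the function name first_and_last_per_level is made up.

```python
import sqlite3


def first_and_last_per_level(rows):
    """Keep the earliest and latest row for each (name, level) pair.

    rows is a list of (id, name, level, ts) tuples. Hypothetical helper.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE player_testing (id INTEGER, name TEXT, level INTEGER, ts INTEGER)"
    )
    conn.executemany("INSERT INTO player_testing VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT id, name, level, ts
        FROM (
            SELECT id, name, level, ts,
                   -- position of the row within its (name, level) group, by time
                   ROW_NUMBER() OVER (PARTITION BY name, level ORDER BY ts) AS rn,
                   -- size of the group, i.e. the position of its last row
                   COUNT(*) OVER (PARTITION BY name, level) AS max_rn
            FROM player_testing
        ) AS t
        WHERE rn = 1 OR rn = max_rn
        ORDER BY ts
        """
    ).fetchall()
    conn.close()
    return result
```

A level with a single row satisfies both conditions at once but is still returned only once, since the filter is applied per row.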

How to declare a row as an Alternate Row

id Name claim priority
1 yatin 70 5
6 yatin 1 10
2 hiren 30 3
3 pankaj 40 2
4 kavin 50 1
5 jigo 10 4
7 jigo 1 10
This is my table, and I want to arrange it as shown below:
id Name claim priority AlternateFlag
1 yatin 70 5 0
6 yatin 1 10 0
2 hiren 30 3 1
3 pankaj 40 2 0
4 kavin 50 1 1
5 jigo 10 4 0
7 jigo 1 10 0
The flag alternates between groups of rows that share the same name.
I am using SQL Server 2005. The alternate flag starts at '0'. In my example, the first record has the name "yatin", so its AlternateFlag is '0'.
The second record has the same name, "yatin", so its alternate flag is also '0'.
The third record, with the name "hiren", is a single record, so it is assigned '1'.
In short, I want to identify alternating groups of rows with the same name.
I hope you understand my problem.
Thanks in advance.
Try
SELECT t.*, f.AlternateFlag
FROM tbl t
JOIN (
SELECT [name],
AlternateFlag = ~CAST(ROW_NUMBER() OVER(ORDER BY MIN(ID)) % 2 AS BIT)
FROM tbl
GROUP BY name
) f ON f.name = t.name
You could probably use the aggregate function COUNT() with HAVING, and then UNION ALL the two result sets, like:
SELECT id, A.Name, Claim, Priority, 0 as AlternateFlag
FROM YourTable
INNER JOIN (
SELECT Name, COUNT(*) as NameCount
FROM YourTable
GROUP BY Name
HAVING COUNT(*) > 1 ) A
ON YourTable.Name = A.Name
UNION ALL
SELECT id, B.Name, Claim, Priority, 1 as AlternateFlag
FROM YourTable
INNER JOIN (
SELECT Name, COUNT(*) as NameCount
FROM YourTable
GROUP BY Name
HAVING COUNT(*) = 1 ) B
ON YourTable.Name = B.Name
Note that this assumes the names are unique, meaning that a name like Yatin, although it appears twice, is associated with only one person.
See my SqlFiddle Demo
You can use the ROW_NUMBER() function with OVER to enumerate the rows, then take the remainder of integer division by 2, so you'll get 1s and 0s in your SELECT or in the view.
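The ROW_NUMBER-and-remainder idea can be sketched as follows, using Python's sqlite3 module rather than SQL Server 2005. This is an illustration under assumptions: the groups are numbered by their smallest id, as in the first answer, the bitwise NOT on a BIT is replaced by plain arithmetic because SQLite has no BIT type, and the function name alternate_flags is made up.

```python
import sqlite3


def alternate_flags(rows):
    """Return (id, name, flag) where the flag flips between name groups.

    rows is a list of (id, name, claim, priority) tuples. Hypothetical helper.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl (id INTEGER, name TEXT, claim INTEGER, priority INTEGER)"
    )
    conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT t.id, t.name, f.alternate_flag
        FROM tbl AS t
        JOIN (
            -- number the name groups 1, 2, 3, ... and map them to 0, 1, 0, ...
            SELECT name,
                   first_id,
                   (ROW_NUMBER() OVER (ORDER BY first_id) + 1) % 2 AS alternate_flag
            FROM (
                SELECT name, MIN(id) AS first_id
                FROM tbl
                GROUP BY name
            ) AS g
        ) AS f ON f.name = t.name
        ORDER BY f.first_id, t.id
        """
    ).fetchall()
    conn.close()
    return result
```

Adding 1 before taking the remainder makes the first group come out as 0, which is what the question asks for.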

Update column value of one row from other rows

I have the following table:
sno name pid amount total
1 Arif 0 100 null
2 Raj 1 200 null
3 Ramesh 2 100 null
4 Pooja 2 100 null
5 Swati 3 200 null
6 King 4 100 null
I want the total for each person to be the sum of that person's own amount and the amounts of all of their descendants.
For example:
for Raj: total = amount of (Raj + Ramesh + Pooja + Swati + King)
for Swati: total = amount of Swati only.
You could try something like this:
WITH hierarchified AS (
SELECT
sno,
amount,
hierarchyID = CAST(sno AS varchar(500))
FROM yourTable
WHERE pid = 0
UNION ALL
SELECT
t.sno,
t.amount,
hierarchyID = CAST(h.hierarchyID + '/' + RTRIM(t.sno) AS varchar(500))
FROM yourTable t
INNER JOIN hierarchified h ON t.pid = h.sno
)
UPDATE yourTable
SET total = t.amount + ISNULL(
(
SELECT SUM(amount)
FROM hierarchified
WHERE hierarchyID LIKE h.hierarchyID + '/%'
),
0
)
FROM yourTable t
INNER JOIN hierarchified h ON t.sno = h.sno;
Note that this query (which you can try on SQL Fiddle) would probably not be very efficient on a large dataset. It might do as a one-off query, and then it would likely be better to organise updating the totals each time the table is updated, i.e. using triggers.
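For experimenting with the path-based approach, here is a minimal sketch using Python's sqlite3 module instead of SQL Server. It is an adaptation under assumptions, not the query above verbatim: SQLite needs WITH RECURSIVE, uses || for string concatenation and COALESCE instead of ISNULL, and writes the result back through a correlated subquery. The column name path and the function name subtree_totals are made up for this example.

```python
import sqlite3


def subtree_totals(rows):
    """Set total = own amount + amounts of all descendants.

    rows is a list of (sno, name, pid, amount) tuples; pid = 0 marks a root.
    Hypothetical helper for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE yourTable "
        "(sno INTEGER PRIMARY KEY, name TEXT, pid INTEGER, amount INTEGER, total INTEGER)"
    )
    conn.executemany(
        "INSERT INTO yourTable (sno, name, pid, amount) VALUES (?, ?, ?, ?)", rows
    )
    conn.execute(
        """
        WITH RECURSIVE hierarchified(sno, amount, path) AS (
            -- roots start a path such as '1'
            SELECT sno, amount, CAST(sno AS TEXT)
            FROM yourTable
            WHERE pid = 0
            UNION ALL
            -- children extend the parent's path, e.g. '1/2/3'
            SELECT t.sno, t.amount, h.path || '/' || t.sno
            FROM yourTable AS t
            JOIN hierarchified AS h ON t.pid = h.sno
        )
        UPDATE yourTable
        SET total = amount + COALESCE(
            (
                -- descendants are the rows whose path starts with this row's path
                SELECT SUM(d.amount)
                FROM hierarchified AS self
                JOIN hierarchified AS d ON d.path LIKE self.path || '/%'
                WHERE self.sno = yourTable.sno
            ),
            0
        )
        """
    )
    result = conn.execute(
        "SELECT sno, name, total FROM yourTable ORDER BY sno"
    ).fetchall()
    conn.close()
    return result
```

The trailing '/' in the LIKE pattern matters: it keeps a path such as '1/2' from matching an unrelated branch like '1/20/...'.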