Sybase, SQL: Adding a counter to duplicate values - sql

I am trying to handle the following case. I have a list of entries in the db:
Col 1 | Col 2|
------|------|
aaaa | x |
aaaa | x |
bbbb | y |
cccc | z |
cccc | z |
The goal is to identify the duplicates in Col 1 and append a number to each value. For duplicates the number should be incremented so that every entry becomes unique, and the counter should start from 1 again for each new value.
Col 1 | Col 2 |
--------|-------|
aaaa-1 | x |
aaaa-2 | x |
bbbb-1 | y |
cccc-1 | z |
cccc-2 | z |
Do you have any idea how to manage this?
Best regards,
Dirk

Hi Dirk,
WITH cte AS (
    SELECT Col1, Col2,
           ROW_NUMBER() OVER (PARTITION BY Col1, Col2 ORDER BY (SELECT 0)) AS RN
    FROM tableName
)
SELECT Col1 + '-' + CONVERT(varchar, RN) AS Col1, Col2
FROM cte
Thanks :)

I would try this. I hope that solves your issue.
SELECT COL1 + ' - ' + ROW_NUMBER() OVER(PARTITION BY COL1 ORDER BY COL1) as COL_A
, COL2 AS COL_B
FROM TABLENAME
Depending on the DBMS you are working with, you might have to CAST the ROW_NUMBER() result to a string before concatenating. I tried the following variant on Oracle 11.2 and it worked properly:
SELECT COL1 || ' - ' || ROW_NUMBER() OVER(PARTITION BY COL1 ORDER BY COL1) AS COL_A
, COL2 AS COL_B
FROM TABLENAME
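To make the intended result concrete, here is a minimal Python sketch of what ROW_NUMBER() with PARTITION BY on the first column computes for the sample rows. The helper name add_counters is made up for illustration, and numbering duplicates in input order is an assumption, since neither query defines an order among identical rows.

```python
from collections import defaultdict


def add_counters(rows):
    """Append a per-value counter to the first column of each row.

    Hypothetical helper: mimics ROW_NUMBER() OVER (PARTITION BY col1),
    numbering duplicates in the order they are encountered (an assumption).
    """
    seen = defaultdict(int)  # how many times each col1 value has appeared so far
    result = []
    for col1, col2 in rows:
        seen[col1] += 1
        result.append((f"{col1}-{seen[col1]}", col2))
    return result


rows = [("aaaa", "x"), ("aaaa", "x"), ("bbbb", "y"), ("cccc", "z"), ("cccc", "z")]
for row in add_counters(rows):
    print(row)
```

The counter restarts at 1 for every new value because each value keeps its own tally, which is the role PARTITION BY plays in the SQL versions.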

Related

ORACLE conditional COUNT query

001 | 9441 | P021948
001 | 9442 | P021948
001 | 9443 | P021950
001 | 9444 | P021951
001 | 9445 | P021952
001 | 9446 | P021948
In the above table I am looking to COUNT the third column so long as it is outside of the second column's value by (+/- 1).
In other words, I am trying to achieve a count of 2 for P021948 because values 9441 and 9442 are within 1 of each other and record 9446 is outside of that range. My intent is to achieve a total count of 5 given these conditions.
How could I go about querying?
Any advice is greatly appreciated!
Hmmm, I'm thinking you want to count the "islands" that are separated by a value of more than 1. If so:
select count(*)
from (select t.*, lag(col2) over (partition by col1, col3 order by col2) as prev_col2
from t
) t
where prev_col2 is null or col2 - prev_col2 > 1;
Here is a rextester illustration of the query and the result.
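For readers who want to check the "islands" logic by hand, here is a rough Python sketch of the same idea as the LAG query: within each (col1, col3) group, a row starts a new island when it has no predecessor or when its col2 value is more than 1 above the previous one. The function name count_islands is only illustrative.

```python
from itertools import groupby


def count_islands(rows):
    """Count runs of consecutive col2 values within each (col1, col3) group."""
    # Order like: PARTITION BY col1, col3 ORDER BY col2
    ordered = sorted(rows, key=lambda r: (r[0], r[2], r[1]))
    islands = 0
    for _, group in groupby(ordered, key=lambda r: (r[0], r[2])):
        prev = None  # plays the role of LAG(col2); None for the first row
        for _, col2, _ in group:
            if prev is None or col2 - prev > 1:
                islands += 1
            prev = col2
    return islands


rows = [
    ("001", 9441, "P021948"),
    ("001", 9442, "P021948"),
    ("001", 9443, "P021950"),
    ("001", 9444, "P021951"),
    ("001", 9445, "P021952"),
    ("001", 9446, "P021948"),
]
print(count_islands(rows))  # 5
```

P021948 contributes 2 islands (9441-9442 and 9446) and each of the other three values contributes 1, giving the total of 5 asked for in the question.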
select column1, column3,
sum(case when lag(column3, 1, 0) over(order by column3)=column3 or
lead(column3, 1, 0) over(order by column3)=column3 then 1 else 0 end)
from yourtable
group by column1, column3

How to get certain word from a column value

Column A
/Site/Test1/mysite/Do?id=90
/Site/Test2/mysite/Done?id=10
/NewSite/Site/Test3/mysite/Do?id=90
/Site/Test3/mysite/Done?id=1901
What I am trying to do is get the Test# from each row as well as the # after the =.
I tried the following:
Select
SUBSTRING([Column A], CHARINDEX('/', [Column A], 1) + 7, LEN([Column A])),
SUBSTRING([Column A], CHARINDEX('=', [Column A], 1) + 1, LEN([Column A])),
[Column A]
from
Table1
I am able to get the # after the = but how can I get the Test# from each row.
UPDATE: Test# is an example, it can be anything in there. What is for certain is Site and NewSite.
UPDATE #2:
Updated Table:
Column A
/Site/My%20Web%20Site/mysite/Do?id=90
/Site/Test%20It%20Out/mysite/Do?id=101
/Site/Test1/dummy/Done?id=1000
/NewSite/Site/No%20Way/thesite/Do?id=909
Result:
Col1 Col2
My%20Web%20Site 90
Test%20It%20Out 101
Test1 1000
No%20Way 909
select
Col1 = substring(a
, charindex('/Site/', a)+6
, charindex('/', a,(charindex('/Site/', a)+6))-(charindex('/Site/', a)+6)
)
, Col2 = substring(a
, charindex('=', a, 1) + 1
, len(a))
from t
rextester demo: http://rextester.com/DEBB37305
returns:
+-----------------+------+
| Col1 | Col2 |
+-----------------+------+
| My%20Web%20Site | 90 |
| Test%20It%20Out | 101 |
| Test1 | 1000 |
| No%20Way | 909 |
+-----------------+------+
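The index arithmetic in that query can be easier to follow outside SQL. Below is a small Python sketch of the same steps: find '/Site/', take everything up to the next '/', and separately take everything after the '='. The name extract_parts is hypothetical, and Python indexes from 0 where CHARINDEX counts from 1, so the offsets differ slightly from the SQL.

```python
def extract_parts(path):
    """Return (segment after '/Site/', text after '=') for one path."""
    marker = "/Site/"
    start = path.index(marker) + len(marker)  # first character after '/Site/'
    end = path.index("/", start)              # the next slash ends the segment
    col1 = path[start:end]
    col2 = path[path.index("=") + 1:]         # everything after the first '='
    return col1, col2


paths = [
    "/Site/My%20Web%20Site/mysite/Do?id=90",
    "/Site/Test%20It%20Out/mysite/Do?id=101",
    "/Site/Test1/dummy/Done?id=1000",
    "/NewSite/Site/No%20Way/thesite/Do?id=909",
]
for p in paths:
    print(extract_parts(p))
```

Searching for '/Site/' with both slashes is what lets the '/NewSite/Site/...' row work: '/NewSite/' does not contain that exact marker, so the match lands on the second path segment.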
This should work:
select SUBSTRING(col,CHARINDEX('Test',col),5)
To test it with one example:
select SUBSTRING('/Site/Test1/mysite/Do?id=90',CHARINDEX('Test','/Site/Test1/mysite/Do?id=90'),5)

TSQL Number Rows Based on change in fieldvalue and sorted on date with incremented numbers on duplicates

Say I have a data like the following:
X | 2/2/2000
X | 2/3/2000
B | 2/4/2000
B | 2/10/2000
B | 2/10/2000
J | 2/11/2000
X | 3/1/2000
I would like to get a dataset like this:
1 | X | 2/2/2000
1 | X | 2/3/2000
2 | B | 2/4/2000
2 | B | 2/10/2000
2 | B | 2/10/2000
3 | J | 2/11/2000
4 | X | 3/1/2000
So far, everything I have tried has either reset the count on each change of field value or, as in the example, numbered the last X as 1 again.
This is a gaps and islands problem. You can use a difference of row numbers:
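-- island_start is the earliest col2 of each island, so islands are numbered in date order
select dense_rank() over (order by island_start) as col0,
col1, col2
from (select t.*,
min(col2) over (partition by col1, seqnum_1 - seqnum_2) as island_start
from (select t.*,
row_number() over (order by col2) as seqnum_1,
row_number() over (partition by col1 order by col2) as seqnum_2
from t
) t
) t;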
Explaining why this works is a bit cumbersome. If you run the subquery, you will see how the sequence numbers are assigned and why the difference is what you want.
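As a worked illustration of the difference-of-row-numbers idea, here is a Python sketch run against the sample data. It computes the two sequence numbers, uses (value, difference) to identify each island, and then numbers the islands in the order they first appear by date. That last step is my assumption about how to get the numbering shown in the question, and the name number_islands is made up.

```python
from collections import defaultdict
from datetime import date


def number_islands(rows):
    """Number consecutive runs of the same value, in date order."""
    ordered = sorted(rows, key=lambda r: r[1])  # ORDER BY col2 (stable for ties)
    per_value = defaultdict(int)
    island_no = {}
    result = []
    for seqnum_1, (value, day) in enumerate(ordered, start=1):
        per_value[value] += 1
        seqnum_2 = per_value[value]          # row number within this value
        key = (value, seqnum_1 - seqnum_2)   # constant within one island
        if key not in island_no:
            # Rows arrive in date order, so islands are numbered by first date.
            island_no[key] = len(island_no) + 1
        result.append((island_no[key], value, day))
    return result


rows = [
    ("X", date(2000, 2, 2)), ("X", date(2000, 2, 3)),
    ("B", date(2000, 2, 4)), ("B", date(2000, 2, 10)), ("B", date(2000, 2, 10)),
    ("J", date(2000, 2, 11)), ("X", date(2000, 3, 1)),
]
for number, value, day in number_islands(rows):
    print(number, value, day)
```

For the sample rows the differences come out as 0, 0, 2, 2, 2, 5, 4, so the two runs of X get different keys (0 and 4) even though the value is the same.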
you can query like this:
SELECT dense_rank() over(order by yourcolumn1), * from yourtable

SQL Query - Row to columns not really a pivot

I'm trying to move certain fields of an ID into columns, but it doesn't appear to match all the pivot examples I am finding. All the examples I can find use some form of a grouping on a field value. I want to use more of a placement regardless of the value in the field. I want to do this in a query without looping via code. Data source example (sorry couldn't figure out how to format a table on the post so I used a code snippet):
+----+--------+--------+
| ID | Field1 | Field2 |
+----+--------+--------+
| 1 | NULL | NULL |
| 2 | Jim | 321 |
| 2 | Jack | 54 |
| 2 | Sue | 985 |
| 2 | Gary | 654 |
| 3 | Herb | 332 |
| 3 | Chevy | 10 |
+----+--------+--------+
Result set I'm trying to generate:
+----+------+------+-------+------+------+------+
| ID | Col1 | Col2 | Col3 | Col4 | Col5 | Col6 |
+----+------+------+-------+------+------+------+
| 1 | NULL | NULL | | | | |
| 2 | Jim | 321 | Jack | 54 | Sue | 985 |
| 3 | Herb | 332 | Chevy | 10 | | |
+----+------+------+-------+------+------+------+
SQL Fiddle: http://sqlfiddle.com/#!3/a225a/1
;with cte as (
select id
, field1
, field2
, ROW_NUMBER() over (partition by id order by field1, field2) r
from #t
)
select c1.id
, c1.field1 col1
, c1.field2 col2
, c2.field1 col3
, c2.field2 col4
, c3.field1 col5
, c3.field2 col6
from cte c1
left outer join cte c2 on c2.id = c1.id and c2.r = c1.r + 1
left outer join cte c3 on c3.id = c1.id and c3.r = c1.r + 2
where (c1.r % 3) = 1
Explanation
ROW_NUMBER() over (partition by id order by field1, field2) r. This line ensures that we have a column counting up from 1 for each id. This allows us to distinguish between the multiple rows.
The CTE is used to save typing the same statement for c1, c2 and c3.
The joins ensure that all items in a row have the same id, and that data for col1, col3 and col5 (likewise for col2, col4 and col6) is taken from consecutive rows. We're using left outer joins because there may not be rows in the source table for these columns.
The WHERE clause says to take the first row of each set of 3 for the data in c1 (with c2 and c3 thus being the second and third of each set, thanks to the earlier joins).
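The steps in that explanation can be sketched in Python to see what the self-join query yields on the sample data. This is only an approximation under stated assumptions: rows_to_columns is a made-up name, rows are ordered by field1 then field2 with NULLs first (as in the ROW_NUMBER() ordering), and each id's rows are cut into sets of 3.

```python
from itertools import groupby


def rows_to_columns(rows, per_row=3):
    """Spread each id's rows across columns, per_row source rows per output row."""
    def sort_key(r):
        _, f1, f2 = r
        # Put None first, mirroring NULL ordering in SQL Server (assumption).
        return (f1 is not None, f1 or "", f2 is not None, f2 or 0)

    out = []
    for id_, group in groupby(sorted(rows, key=lambda r: r[0]), key=lambda r: r[0]):
        members = sorted(group, key=sort_key)
        for i in range(0, len(members), per_row):       # like WHERE r % 3 = 1
            chunk = members[i:i + per_row]               # like joining on r+1, r+2
            flat = [v for _, f1, f2 in chunk for v in (f1, f2)]
            flat += [None] * (2 * per_row - len(flat))   # left joins give NULLs
            out.append((id_, *flat))
    return out


rows = [
    (1, None, None),
    (2, "Jim", 321), (2, "Jack", 54), (2, "Sue", 985), (2, "Gary", 654),
    (3, "Herb", 332), (3, "Chevy", 10),
]
for row in rows_to_columns(rows):
    print(row)
```

Note that ordering by field1 puts Gary first for id 2, and the fourth row (Sue) spills into a second output row. Keeping the original insertion order would need an extra ordering column in the source table.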
Here's a solution using dynamic sql that works though I'm sure there's a better way to do it. Caution, it's a bit painful. First it builds the list of columns to pivot and select, builds the dynamic sql and runs it.
DECLARE @PivotColumns as varchar(max), @SelectColumns as varchar(max), @sql as varchar(max)
SELECT @PivotColumns = ISNULL(@PivotColumns + ',', '') + ColNum,
@SelectColumns = ISNULL(@SelectColumns + ',', '') + 'NULLIF(' + ColNum + ', ''NULL'') as ' + ColNum
from (select distinct 'Col' + cast(ROW_NUMBER() OVER (partition by id order by id) as varchar) as ColNum
from (select id,
isnull(field1,'NULL') as field1,
isnull(field2,'NULL') as field2
from weirdpivot) cols
unpivot
(
value
for col in (field1, field2)
) unpivoted) DistinctColumns
set @sql = '
select id, + ' + @SelectColumns + '
from (select
''Col'' + cast(ROW_NUMBER() OVER (partition by id order by id) as varchar) as colnum
,id
,value
from (select id,
isnull(field1,''NULL'') as field1,
isnull(field2,''NULL'') as field2
from weirdpivot) cols
unpivot
(
value
for col in (field1, field2)
) u) unpivoted
pivot
(
max(value)
for colnum in (' + @PivotColumns + ')
) p'
exec (@sql)

SQL rank/dense_rank and how to query/calculate with the result

So I have a table where it dense_ranks my rows.
Here is the table:
COL1 | COL2 | COL3 | DENSE_RANK |
a | b | c | 1 |
a | s | r | 1 |
a | w | f | 1 |
b | b | c | 2 |
c | f | r | 3 |
c | q | d | 3 |
So now I want to select any rows where the rank was only represented once, so the 2 is all alone, but not the 1 or 3. I want to select all the rows where this occurs, but how do I do that?
Some ideas:
-COUNT DISTINCT (RANK())
-COUNT RANK()
but neither of those are working, any ideas? please and thank you!
happy hacking
actual code:
SELECT events.event_type AS "event",
DENSE_RANK() OVER (ORDER BY bw_user_event.pad_id) as rank
FROM user_event
WHERE (software_events.software_id = '8' OR software_events.software_id = '14')
AND (software_events.event_type = 'install')
WITH Dense_ranked_table as (
-- Your select query that generates the table with dense ranks
)
SELECT DENSE_RANK
FROM Dense_ranked_table
GROUP BY DENSE_RANK
HAVING COUNT(DENSE_RANK) = 1;
I don't have SQL Server to test this. So please let me know whether this works or not.
I would think you can add a COUNT(*) OVER (PARTITION BY XXXXX) where XXXXX is what you include in your dense rank.
Then wrap this in a Common Table Expression and select where your new Count is = 1.
Something like this fiddle:
http://sqlfiddle.com/#!6/ae774/1
Code included here as well:
CREATE TABLE T
(
COL1 CHAR,
COL2 CHAR,
COL3 CHAR
);
INSERT INTO T
VALUES
('a','b','c'),
('a','s','r'),
('a','w','f'),
('b','b','c'),
('c','f','r'),
('c','q','d');
WITH CTE AS (
SELECT COL1 ,
COL2 ,
COL3,
DENSE_RANK() OVER (ORDER BY COL1) AS DR,
COUNT(*) OVER (PARTITION BY COL1) AS C
FROM dbo.T AS t
)
SELECT COL1, COL2, COL3, DR
FROM CTE
WHERE C = 1
Would return just the
b, b, c, 2
row from your test data.
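The same idea, a dense rank plus a per-group count and then a filter on count = 1, can be sketched in Python against the test data. The name rows_with_unique_rank is only for illustration, and the rank here is derived from the sorted distinct COL1 values, which is what DENSE_RANK() OVER (ORDER BY COL1) amounts to.

```python
from collections import Counter


def rows_with_unique_rank(rows):
    """Keep rows whose COL1 value (and hence dense rank) occurs exactly once."""
    counts = Counter(col1 for col1, _, _ in rows)  # COUNT(*) OVER (PARTITION BY COL1)
    # Dense rank: position of each distinct COL1 in sorted order, starting at 1.
    ranks = {value: n for n, value in enumerate(sorted(counts), start=1)}
    return [
        (c1, c2, c3, ranks[c1])
        for c1, c2, c3 in rows
        if counts[c1] == 1
    ]


rows = [
    ("a", "b", "c"), ("a", "s", "r"), ("a", "w", "f"),
    ("b", "b", "c"),
    ("c", "f", "r"), ("c", "q", "d"),
]
print(rows_with_unique_rank(rows))  # [('b', 'b', 'c', 2)]
```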