SQL: Increment ID only for new rows based on the count

Requirement: generate a new ID, continuing from the MAX ID, for each Name that does not exist in the Target table and has count > 1.
Below is the Source data. The rows highlighted in yellow are new. Those with count > 1 get a new incremented ID, and those with count = 1 default to FM00000001.
The expected result is highlighted in yellow in the Target table.
I generated the existing IDs manually once, but I have to automate this as a daily job, so I need to generate incremental IDs from the MAX ID for the rows with count > 1.
with src as (
select '121 MEDICAL PLACE' as Label , 1 as cnt
union all
select '16TH STREET COMMUNITY' as Label , 1 as cnt
union all
select '19TH AVENUE CLINIC' as Label , 2 as cnt
union all
select '1ST CLASS URGENT CARE' as Label , 3 as cnt
union all
select '160 DORADO BCH' as Label , 2 as cnt
union all
select 'APPLETREE LN' as Label , 4 as cnt
union all
select 'KNOLLWOOD LN' as Label , 1 as cnt
)
select * from src
with tgt as (
select '121 MEDICAL PLACE' as Label , 'FM00000001' as ID
union all
select '16TH STREET COMMUNITY' as Label , 'FM00000001'as ID
union all
select '19TH AVENUE CLINIC' as Label , 'FM00000002'as ID
union all
select '1ST CLASS URGENT CARE' as Label , 'FM00000003' as ID
)
select * from tgt

OK, if I understand correctly, here is how you can do it (the string concatenation operator and any cast that LPAD needs vary by dialect):
select *
, 'FM' || LPAD(case when cnt = 1 then cnt else count(case when cnt > 1 then 1 end) over (order by OrderColumn ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) + 1 end, 8, '0') as ID
from table
For reference: Window Functions
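The running-count idea can be tried end to end with the sample data from the question. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module (SQLite 3.25 or later for window functions) rather than the asker's database, it orders new rows by Label because the question does not name an ordering column, and it continues from the numeric part of the MAX ID in tgt, which is one reading of the requirement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE src (Label TEXT, cnt INTEGER);
CREATE TABLE tgt (Label TEXT, ID TEXT);
INSERT INTO src VALUES
  ('121 MEDICAL PLACE', 1), ('16TH STREET COMMUNITY', 1),
  ('19TH AVENUE CLINIC', 2), ('1ST CLASS URGENT CARE', 3),
  ('160 DORADO BCH', 2), ('APPLETREE LN', 4), ('KNOLLWOOD LN', 1);
INSERT INTO tgt VALUES
  ('121 MEDICAL PLACE', 'FM00000001'), ('16TH STREET COMMUNITY', 'FM00000001'),
  ('19TH AVENUE CLINIC', 'FM00000002'), ('1ST CLASS URGENT CARE', 'FM00000003');
""")

query = """
WITH mx AS (
    -- numeric part of the highest existing ID; 1 if the target is empty
    SELECT COALESCE(MAX(CAST(SUBSTR(ID, 3) AS INTEGER)), 1) AS max_id FROM tgt
),
new_rows AS (
    -- only labels that are not in the target yet
    SELECT s.Label, s.cnt
    FROM src s
    WHERE NOT EXISTS (SELECT 1 FROM tgt t WHERE t.Label = s.Label)
)
SELECT n.Label,
       'FM' || printf('%08d',
           CASE WHEN n.cnt = 1 THEN 1
                -- running count of cnt > 1 rows, ordered by Label (assumption)
                ELSE mx.max_id
                     + SUM(CASE WHEN n.cnt > 1 THEN 1 ELSE 0 END)
                       OVER (ORDER BY n.Label)
           END) AS ID
FROM new_rows n CROSS JOIN mx
ORDER BY n.Label
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this data the two new labels with cnt > 1 continue after FM00000003, and the new label with cnt = 1 keeps the default FM00000001.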

Related

Bigquery uncommon elements in array

Say you have a column of arrays like the one below. I am trying to group the rows based on the count of uncommon elements: once the number of distinct uncommon elements reaches 5, the row goes into the next group.
In the example below, the first three rows form group 1 because their uncommon elements are ['3','4','6','7'], which is 4 in length. If you add the next row to the group, the array of distinct uncommon elements becomes ['1','3','4','5','6','7'], which exceeds the limit of 5 distinct uncommon elements.
with arr as (
select 1 ord, ['1','2','3','4'] as ar
union all
select 2, ['1','2','3']
union all
select 3,['1','2','6','7']
union all
select 4,['2','4','5','7']
union all
select 5, ['string1','5','6','7','8']
)
select * from arr
I am looking for an output like below
Here is the code I have written so far. It is definitely missing a big piece, but I am adding it in case it is helpful.
with arr as (
select 1 ord, ['1','2','3','4'] as ar,1 subclass
union all
select 2, ['1','2','3'],1
union all
select 3,['1','2','6','7'],1
union all
select 4,['2','4','5','7'],1
union all
select 5, ['string1','5','6','7','8'],1
)
, history_t as (
select a.* ,
ARRAY_AGG(struct(ar)) OVER (PARTITION BY SUBCLASS ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as history
from arr a )
, tem2 as (
select a.* except(history,ar),
(SELECT COUNT(1) FROM UNNEST(history) AS col ) AS array_cnt
,b.ar unnest1
from history_t a
,unnest(history) b
)
, tem3 as (
select a.* except(unnest1),sku_lst
from tem2 a , unnest(unnest1) sku_lst
)
, all_sku_freq as (
select
ord, array_cnt , sku_lst , subclass,count(*) sku_freq
from tem3
group by 1,2,3,4 )
, uncommon_sku_cnt as (
select ord, subclass, count( sku_lst) uncommon_sku_count from all_sku_freq where sku_freq <> array_cnt group by 1,2 )
,rolling_uncomm_sku_cnt as (
select a.*, sum(uncommon_sku_count) over(partition by subclass order by ord asc range between unbounded preceding and current row ) roll_uncomm_sku_cnt
from uncommon_sku_cnt a
)
select a.* from rolling_uncomm_sku_cnt a
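To make the grouping rule itself concrete, here is a small sketch in plain Python rather than BigQuery SQL. It treats the "uncommon" elements of a group as the union of its arrays minus their intersection, which reproduces both counts given in the example. The function name group_by_uncommon and the limit parameter are hypothetical, and starting a new group only when the count goes above the limit (rather than when it equals it) is an assumption, since the question uses both "reaches" and "exceeds".

```python
def group_by_uncommon(arrays, limit=5):
    """Assign a group number to each array, in order.

    A row opens a new group when adding it would push the number of
    distinct uncommon elements (union minus intersection) above `limit`.
    """
    groups = []
    group_id = 1
    union, common = set(), None  # running state of the current group
    for arr in arrays:
        items = set(arr)
        new_union = union | items
        new_common = items if common is None else common & items
        if common is not None and len(new_union - new_common) > limit:
            # Too many uncommon elements: this row starts the next group.
            group_id += 1
            new_union, new_common = set(items), set(items)
        union, common = new_union, new_common
        groups.append(group_id)
    return groups


rows = [
    ['1', '2', '3', '4'],
    ['1', '2', '3'],
    ['1', '2', '6', '7'],
    ['2', '4', '5', '7'],
    ['string1', '5', '6', '7', '8'],
]
result = group_by_uncommon(rows)
print(result)
```

Because each row's group depends on the running state of the group before it, this is a sequential computation; a set-based SQL version would likely need a recursive query or a scripted loop.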

Selecting the latest record within a table

I have an Oracle v11 database, and whilst I do not have the schema definition of the tables, I have illustrated what I am trying to achieve below.
This is what the table looks like
I am trying to transform the data by selecting only the latest rows. The table keeps a history of changes, but I am not interested in the changes, only in the latest value for every issue present.
This is what I have so far.
select issueno,
case when fieldname = 'name' then string_value end name,
case when fieldname = 'point' then string_value end point
from issues
where issueno = 1234
The issue with the query above is that it returns 4 rows; I would like to return only a single row.

You can get the latest value by using the LAST ... ORDER BY clause within MAX() KEEP (...), ordered by the transition_date column (or the load_date column, depending on which you mean; replace it within the query), such as
WITH i AS
(
SELECT CASE WHEN fieldname = 'name' THEN
MAX(string_value) KEEP (DENSE_RANK LAST ORDER BY transition_date)
OVER (PARTITION BY issue_no, fieldname)
END AS name,
CASE WHEN fieldname = 'point' THEN
MAX(string_value) KEEP (DENSE_RANK LAST ORDER BY transition_date)
OVER (PARTITION BY issue_no, fieldname)
END AS point
FROM issues
)
SELECT MAX(name) AS name, MAX(point) AS point
FROM i
But if ties (equal values) occur for the related date values, then consider using the DENSE_RANK() function to pick the rows ranked 1, along with ROW_NUMBER() to provide a key for the JOIN clause in the main query, such as
WITH i AS
(
SELECT i.*,
DENSE_RANK() OVER ( PARTITION BY issue_no, fieldname
ORDER BY transition_date DESC) AS dr,
ROW_NUMBER() OVER ( PARTITION BY issue_no, fieldname
ORDER BY transition_date DESC) AS rn
FROM issues i
)
SELECT i1.string_value AS name, i2.string_value AS point
FROM ( SELECT string_value, rn FROM i WHERE dr = 1 AND fieldname = 'name' ) i1
FULL JOIN ( SELECT string_value, rn FROM i WHERE dr = 1 AND fieldname = 'point' ) i2
ON i2.rn = i1.rn

Assuming that you want the latest record by the column load_date:
select issueno,
max(case when fieldname = 'name' then string_value end) name,
max(case when fieldname = 'point' then string_value end) point
from issues
where issueno = 1234 and
(fieldname , load_date) in (select fieldname ,max(load_date) from issues where issueno=1234 group by fieldname)
group by issueno

I would use a subquery plus a window function to achieve what you asked for (assuming you are using load_date to determine the latest record):
select issueno,
max(case when fieldname = 'name' then string_value end) name,
max(case when fieldname = 'point' then string_value end) point
from
(
SELECT issueno, fieldname, string_value, ROW_NUMBER() OVER(PARTITION BY ISSUENO, FIELDNAME ORDER BY LOAD_DATE DESC) RN
FROM issues
)
where issueno = 1234
AND RN = 1
group by issueno
ROW_NUMBER() OVER ([query_partition_clause] order_by_clause) is a window function that assigns a sequential number to each row, restarting within each partition declared in [query_partition_clause] and following the order given in order_by_clause.
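The ROW_NUMBER() approach can be checked on a small data set. This is a minimal sketch that uses Python's sqlite3 module as a stand-in for Oracle (SQLite 3.25 or later for window functions); the four sample rows are made up for illustration and dates are stored as ISO text so that they sort correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE issues (issueno INTEGER, fieldname TEXT,
                     string_value TEXT, load_date TEXT);
-- illustrative rows: two changes per field for one issue
INSERT INTO issues VALUES
  (1234, 'name',  NULL,  '2021-01-02'),
  (1234, 'name',  'Tom', '2021-02-12'),
  (1234, 'point', '0',   '2021-02-05'),
  (1234, 'point', '5',   '2021-02-11');
""")

query = """
SELECT issueno,
       MAX(CASE WHEN fieldname = 'name'  THEN string_value END) AS name,
       MAX(CASE WHEN fieldname = 'point' THEN string_value END) AS point
FROM (
    -- rn = 1 marks the newest row per (issueno, fieldname)
    SELECT issueno, fieldname, string_value,
           ROW_NUMBER() OVER (PARTITION BY issueno, fieldname
                              ORDER BY load_date DESC) AS rn
    FROM issues
)
WHERE issueno = 1234 AND rn = 1
GROUP BY issueno
"""
latest = conn.execute(query).fetchall()
print(latest)
```

The inner query keeps one row per field, and the outer MAX(CASE ...) with GROUP BY folds those rows into the single row the question asks for.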

See whether something like this helps; read comments within code.
SQL> with issues (issueno, fieldname, string_value,
2 transition_date, transition_id, load_date)
3 as
4 -- sample data; you have it in a table, don't type that
5 (select 1234, 'name', null , date '2021-01-01', 1, date '2021-01-02' from dual union all
6 select 1234, 'name', 'Tom', date '2021-02-11', 2, date '2021-02-12' from dual union all
7 select 1234, 'point', '0' , date '2021-02-04', 3, date '2021-02-05' from dual union all
8 select 1234, 'point', '5' , date '2021-02-10', 5, date '2021-02-11' from dual
9 ),
10 -- query you need begins here
11 temp as
12 -- rank values partitioned by ISSUENO and FIELDNAME, sorted by TRANSITION_ID
13 (select issueno, fieldname, string_value,
14 row_number() over (partition by issueno, fieldname
15 order by transition_id desc) rn
16 from issues
17 )
18 select issueno,
19 max(case when fieldname = 'name' then string_value end) name,
20 max(case when fieldname = 'point' then string_value end) point
21 from temp
22 where rn = 1
23 group by issueno;
ISSUENO NAME POINT
---------- ---------- ----------
1234 Tom 5
SQL>

how to find all column records are same or not in group by column in SQL

How can I find whether all values of a column are the same within each group of rows in a table?
CREATE TABLE #Temp (ID int,Value char(1))
insert into #Temp (ID ,Value ) ( Select 1 ,'A' union all Select 1 ,'W' union all Select 1 ,'I' union all Select 2 ,'I' union all Select 2 ,'I' union all Select 3 ,'A' union all Select 3 ,'B' union all Select 3 ,'1' )
select * from #Temp
Sample Table:
How can I find out whether all values of the 'Value' column are the same when grouping by the 'ID' column?
Ex: select ID from #Temp group by ID
For ID 1 - Value column records are A, W, I - Not Same
For ID 2 - Value column records are I, I - Same
For ID 3 - Value column records are A, B, 1 - Not Same
I want the query to get a result like below

When all items in the group are the same, COUNT(DISTINCT Value) would be 1:
SELECT Id
, CASE WHEN COUNT(DISTINCT Value)=1 THEN 'Same' ELSE 'Not Same' END AS Result
FROM MyTable
GROUP BY Id

If you're using T-SQL, perhaps this will work for you:
SELECT t.ID,
CASE WHEN MAX(t.RN) = COUNT(*) THEN 'Same' ELSE 'Not Same' END AS GroupResults
FROM(
SELECT *, ROW_NUMBER() OVER(PARTITION BY ID, VALUE ORDER BY ID) RN
FROM #Temp
) t
GROUP BY t.ID

Usually that's rather easy: aggregate per ID and count distinct values, or compare the minimum and maximum value.
However, neither COUNT(DISTINCT value) nor MIN(value) nor MAX(value) takes nulls into consideration. So for an ID having the values 'A' and null, these would report the values as the same. Maybe this is what you want, or nulls don't even occur in your data.
But if you want nulls to count as a value, then select the distinct values first (where null gets a row too) and then count:
select id, case when count(*) = 1 then 'same' else 'not same' end as result
from (select distinct id, value from #temp) dist
group by id
order by id;
Rextester demo: http://rextester.com/KCZD88697
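The difference between the two approaches only shows up when a group contains a null. The sketch below runs both queries through Python's sqlite3 module (an assumption made for runnability; the original is T-SQL) against the question's sample rows plus a hypothetical ID 4 holding 'A' and NULL. The table name temp_values is also made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE temp_values (ID INTEGER, Value TEXT);
INSERT INTO temp_values VALUES
  (1,'A'),(1,'W'),(1,'I'),
  (2,'I'),(2,'I'),
  (3,'A'),(3,'B'),(3,'1'),
  (4,'A'),(4,NULL);          -- hypothetical group with a null
""")

# COUNT(DISTINCT ...) skips nulls, so ID 4 looks uniform.
count_distinct = conn.execute("""
SELECT ID, CASE WHEN COUNT(DISTINCT Value) = 1
                THEN 'Same' ELSE 'Not Same' END
FROM temp_values
GROUP BY ID
ORDER BY ID
""").fetchall()

# SELECT DISTINCT keeps a row for null, so ID 4 has two distinct rows.
nulls_counted = conn.execute("""
SELECT ID, CASE WHEN COUNT(*) = 1
                THEN 'Same' ELSE 'Not Same' END
FROM (SELECT DISTINCT ID, Value FROM temp_values) dist
GROUP BY ID
ORDER BY ID
""").fetchall()

print(count_distinct)
print(nulls_counted)
```

The two result sets agree for IDs 1 to 3 and differ only for ID 4, which is the case the answer warns about.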

delete duplicates records leaving unique in group with priority

I have a table that is generated by a procedure I cannot modify, and it returns data like so:
USER_ID ACTIVE_STREET STREET
----------- ----------- -----------------
1 1 STREET1
1 0 STREET1
1 0 OTHER STREET
2 0 OTHER USER STREET
2 0 OTHER USER STREET
2 0 OTHER USER STREET
2 1 OTHER USER STREET
I need to remove records from this table following these rules:
Every user has only one active street.
I must delete duplicates, but only remove those that have ACTIVE_STREET set to 0.
So I'd like to keep only these records:
USER_ID ACTIVE_STREET STREET
----------- ----------- -----------------
1 1 STREET1
1 0 OTHER STREET
2 1 OTHER USER STREET
I've tried grouping, but there is no id column, so I can't get ids to delete.
How can I delete those duplicates without altering the original table structure?
EDIT - based on Gordon's answer
This is really close, but there is a little difference:
IF OBJECT_ID( 'tempdb..#MY_TMP' ) IS NOT NULL
BEGIN
DROP TABLE #MY_TMP;
END;
SELECT * INTO #MY_TMP
FROM(
SELECT 1 AS USER_ID,
1 AS ACTIVE_STREET,
'STREET1' AS STREET
UNION ALL
SELECT 2 AS USER_ID,
1 AS active,
'OTHER USER STREET' AS STREET
UNION ALL
SELECT 1 AS USER_ID,
0 AS active,
'STREET1' AS STREET
UNION ALL
SELECT 1 AS USER_ID,
0 AS active,
'OTHER STREET' AS STREET
UNION ALL
SELECT 2 AS USER_ID,
0 AS active,
'OTHER USER STREET' AS STREET
UNION ALL
SELECT 2 AS USER_ID,
0 AS active,
'OTHER USER STREET 2' AS STREET ) X;
SELECT *
FROM #MY_TMP ORDER BY USER_ID, ACTIVE_STREET desc;
SELECT * FROM (
select USER_ID, MAX(ACTIVE_STREET) AS a, STREET
from #MY_TMP
group by USER_ID, STREET ) X ORDER BY USER_ID, a desc
;with todelete as (
select row_number() over (partition by user_id, ACTIVE_STREET
order by street) as seqnum
from #MY_TMP t
)
delete todelete
where seqnum > 1;
SELECT *
FROM #MY_TMP ORDER BY USER_ID, ACTIVE_STREET desc;

Does this do what you want?
select user_id, active_street, min(street) as street
from atable t
group by user_id, active_street;
It returns the results that you specify.
If you actually want to delete rows from the table, you can use row_number():
with todelete as (
select t.*, row_number() over (partition by user_id, active_street
order by street) as seqnum
from atable t
)
delete todelete
where seqnum > 1;
Here is a SQL Fiddle that demonstrates the code.
EDIT:
Ooops, I think I misunderstood the logic. You want to delete all streets that are the same as the active street with the flag = 0. If so, this is the query:
delete t from my_tmp t
where active_street = 0 and
exists (select 1
from my_tmp t2
where t2.user_id = t.user_id and
t2.street = t.street and
t2.active_street = 1
);
And here is the SQL Fiddle for this one.
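The EXISTS-based delete can be verified against the seven rows from the question. This is a sketch only: it uses Python's sqlite3 module instead of SQL Server, so the statement is written as DELETE FROM my_tmp with the table name in the correlated subquery, because SQLite does not accept the delete t from ... t alias form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE my_tmp (user_id INTEGER, active_street INTEGER, street TEXT);
INSERT INTO my_tmp VALUES
  (1, 1, 'STREET1'),
  (1, 0, 'STREET1'),
  (1, 0, 'OTHER STREET'),
  (2, 0, 'OTHER USER STREET'),
  (2, 0, 'OTHER USER STREET'),
  (2, 0, 'OTHER USER STREET'),
  (2, 1, 'OTHER USER STREET');
""")

# Remove inactive rows whose street also exists as the user's active street.
conn.execute("""
DELETE FROM my_tmp
WHERE active_street = 0
  AND EXISTS (SELECT 1
              FROM my_tmp t2
              WHERE t2.user_id = my_tmp.user_id
                AND t2.street = my_tmp.street
                AND t2.active_street = 1)
""")

remaining = conn.execute("""
SELECT user_id, active_street, street
FROM my_tmp
ORDER BY user_id, active_street DESC
""").fetchall()
for row in remaining:
    print(row)
```

Note that this statement only removes inactive copies of an active street. Duplicate inactive rows for a street that has no active row would stay, which does not occur in the sample data.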

Create a temporary table. Move data to temp table, using GROUP BY:
insert into temptable
select USER_ID, MAX(ACTIVE_STREET), STREET
from tablename
group by USER_ID, STREET
When done, delete from original table and copy from temptable to it.

Maybe these variants will be applicable to your task?
-- Create table with sample data
IF OBJECT_ID('tempdb..#MY_TMP') IS NOT NULL
DROP TABLE #MY_TMP
;
SELECT * INTO #MY_TMP
FROM (
VALUES ( 1, 1, 'STREET1' )
, ( 1, 0, 'STREET1' )
, ( 1, 0, 'OTHER STREET' )
, ( 2, 0, 'OTHER USER STREET' )
, ( 2, 0, 'OTHER USER STREET' )
, ( 2, 0, 'OTHER USER STREET' )
, ( 2, 1, 'OTHER USER STREET' )
) T([USER_ID], [ACTIVE_STREET], [STREET]);
Variant using a temp table:
1 - fill the temp table with the required results;
2 - truncate the source table;
3 - insert the data from the temp table into the source table:
IF OBJECT_ID('tempdb..#ToBeInserted') IS NOT NULL
DROP TABLE #ToBeInserted
SELECT [USER_ID]
, [ACTIVE_STREET]
, [STREET]
INTO #ToBeInserted
FROM (SELECT *, RN = ROW_NUMBER() OVER (PARTITION BY [USER_ID], [STREET]
ORDER BY [STREET],[ACTIVE_STREET] DESC)
FROM #MY_TMP) AS T
WHERE RN = 1
TRUNCATE TABLE #MY_TMP
INSERT INTO #MY_TMP ( [USER_ID], [ACTIVE_STREET], [STREET] )
SELECT [USER_ID]
, [ACTIVE_STREET]
, [STREET]
FROM #ToBeInserted
Variant using CTE
WITH CTE
AS
(SELECT *, RN = ROW_NUMBER() OVER (PARTITION BY [USER_ID],[STREET]
ORDER BY [STREET],[ACTIVE_STREET] DESC)
FROM #MY_TMP)
DELETE CTE
WHERE RN > 1;

Multiple row's columns in one row's multiple columns

Table Schema
ID Status Patient
1 critical Gabriel
1 moderate Frank
1 critical Dorin
2 low Peter
3 critical Noman
3 moderate Johnson
Expected OutPut
ID Patient1 Patient2
1 Gabriel Dorin
3 Noman Null
Here I have to show only those patients whose status is critical.
I found the similar question "Multiple column values in a single row", but it's in SQL and the columns are hard-coded.
Thanks!

First step is to select the critical patients and order them:
select id, patient, row_number() over (partition by id order by patient) as rnk
from your_table
where status='critical';
After this you can select first two critical patients in this manner:
select id,
max(case when rnk=1 then patient end) as Patient1,
max(case when rnk=2 then patient end) as Patient2
from (
select id,
patient,
row_number() over (partition by id order by patient) as rnk
from your_table
where status='critical'
)
group by id;
If you want a more flexible solution you can try a query like the one below, but you have to choose the number of ranks before runtime:
with your_table as
(select 1 as id, 'critical' as status, 'Gabriel' as patient from dual
union all
select 1, 'moderate', 'Frank' from dual union all
select 1, 'critical', 'Dorin' from dual union all
select 1, 'critical', 'Vasile' from dual union all
select 2, 'low', 'Peter' from dual union all
select 3, 'critical', 'Noman' from dual union all
select 3, 'moderate', 'Johnson' from dual )
select * from (
select id, patient, row_number() over (partition by id order by patient) as rnk
from your_table
where status='critical'
)
pivot (max(patient) for rnk in (1, 2, 3))
order by 1 ;
(This is for three patients.)
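The two-column version with MAX(CASE WHEN rnk = ...) from this answer can be run against the question's six rows. The sketch below uses Python's sqlite3 module in place of Oracle (an assumption for runnability, SQLite 3.25 or later). Because rows are ranked by patient name, Patient1 and Patient2 come out alphabetically, which differs from the order shown in the expected output; the table has no column that records insertion order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE your_table (id INTEGER, status TEXT, patient TEXT);
INSERT INTO your_table VALUES
  (1, 'critical', 'Gabriel'),
  (1, 'moderate', 'Frank'),
  (1, 'critical', 'Dorin'),
  (2, 'low',      'Peter'),
  (3, 'critical', 'Noman'),
  (3, 'moderate', 'Johnson');
""")

query = """
SELECT id,
       MAX(CASE WHEN rnk = 1 THEN patient END) AS Patient1,
       MAX(CASE WHEN rnk = 2 THEN patient END) AS Patient2
FROM (
    -- number the critical patients within each id, by name
    SELECT id, patient,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY patient) AS rnk
    FROM your_table
    WHERE status = 'critical'
)
GROUP BY id
ORDER BY id
"""
pivoted = conn.execute(query).fetchall()
for row in pivoted:
    print(row)
```

ID 2 has no critical patient, so it does not appear, and ID 3 gets a null in Patient2.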

Try building the query dynamically and opening a cursor on the result.
SET SERVEROUTPUT ON
DECLARE
v_fact NUMBER := 1;
v_max_cnt number:=1;
V_query CLOB:='';
BEGIN
select max(RNum) into v_max_cnt from(
select row_number() over (partition by ID order by ID) RNum from PATIENTSTATUS where status='critical'
)x;
FOR v_counter IN 1..v_max_cnt LOOP
V_query := V_query||v_fact||' as Patient'||v_fact||(case when v_fact=v_max_cnt then '' else ',' end);
v_fact:=v_fact+1;
END LOOP;
DBMS_OUTPUT.PUT_LINE ('select * from (
select id, patient, row_number() over (partition by id order by patient) as rnk
from PATIENTSTATUS
where status=''critical'')
pivot (max(patient) for rnk in ('||V_query||'))
order by 1;');
END;
From a procedure, the data can be fetched through a cursor by
OPEN CUR_Your_Cursor FOR V_query;
