How to find most frequent code(varchar) from a table

I would like to find the most frequent Code for each CodeID, counting only rows that are in the same Code_group.
For example, given this original table:
ID CodeID Name Code Code_group
1 1 A 101 0
2 1 A 102 0
3 1 B 102 0
4 2 C 201 0
5 2 C 201 0
6 2 D 202 0
7 2 E 202 0
8 3 F 101 1
9 3 G 103 1
10 3 G 104 1
11 3 G 104 1
I want output like this:
ID CodeID Name Code Code_group Selected_code
1 1 A 101 0 102
2 1 A 102 0 102
3 1 B 102 0 102
4 2 C 201 0 NULL
5 2 C 201 0 NULL
6 2 D 202 0 NULL
7 2 E 202 0 NULL
8 3 F 101 1 104
9 3 G 103 1 104
10 3 H 104 1 104
11 3 H 104 1 104
Even though the Code of ID 8 (101) also appears under CodeID 1, it is not in the same Code_group.
So for CodeID 1, Selected_code would be 102.
Codes must be counted only within exactly the same Code_group.
=======================================
I have tried the query below. I should not use ID for this.
From TableA
with m as
(
select
CodeID,
Name,
Code,
Code_group,
cnt,
Selected_code = ROW_NUMBER() over (partition by Code_group order by cnt desc)
from( select CodeID, Name, Code,Code_group
,count(*) over (partition by Code,CodeID) as cnt from tableA
group by CodeID, Name, Code, Code_group,
) as t
group by CodeID,
Name,
Code,
Code_group, cnt
)
select a.CodeID,
a.Name,
a.Code,
a.Code_group, b.Code as Selected_code, cnt
from(select
CodeID,
Name,
Code,
Code_group,Selected_code,
cnt
from m) as a left outer join
(select CodeID,
Name,
Code,
Code_group,Selected_code,
cnt
from m where selected_Code=1) as b on a.CodeID = b.CodeID and a.Code_Group = b.Code_Group
order by a.CodeID, a.Code_Group
The problems with this are:
The WITH statement makes my table distinct. It shows only one row when rows contain exactly the same data, such as ID 1,2.
Also, I cannot get NULL when the frequencies are exactly the same.
What should I add to get my desired output?
Or is there a better approach for this?

The CTE cte finds the highest-frequency Code per Code_group and CodeID using dense_rank().
The CTE selected checks whether several Codes share that highest frequency and excludes those groups.
The final query just selects from the original table and LEFT JOINs selected.
with
cte as
(
select Code_group, CodeID, Code
from
(
select Code_group, CodeID, Code,
r = dense_rank() over (partition by Code_Group, CodeID
order by count(*) desc)
from tableA
group by Code_group, CodeID, Code
) c
where c.r = 1
),
selected as
(
select Code_group, CodeID, Code
from
(
select Code_group, CodeID, Code,
cnt = count(*) over (partition by Code_group, CodeID)
from cte
) s
where s.cnt = 1
)
select a.*,
Selected_Code = s.Code
from tableA a
left join selected s on a.Code_Group = s.Code_Group
and a.CodeID = s.CodeID;
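To check the idea against the question's sample data, here is a minimal sketch that runs an equivalent query through Python's built-in sqlite3 module. SQLite stands in for SQL Server here (an assumption made only so the example is self-contained), so the query uses RANK() plus GROUP BY ... HAVING COUNT(*) = 1 instead of the T-SQL alias = expression syntax. The CTE names counted and ranked are illustrative.

```python
import sqlite3

# Sample rows from the question: (ID, CodeID, Name, Code, Code_group)
rows = [
    (1, 1, "A", "101", 0), (2, 1, "A", "102", 0), (3, 1, "B", "102", 0),
    (4, 2, "C", "201", 0), (5, 2, "C", "201", 0), (6, 2, "D", "202", 0),
    (7, 2, "E", "202", 0), (8, 3, "F", "101", 1), (9, 3, "G", "103", 1),
    (10, 3, "G", "104", 1), (11, 3, "G", "104", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tableA (ID INTEGER, CodeID INTEGER, Name TEXT, "
    "Code TEXT, Code_group INTEGER)"
)
conn.executemany("INSERT INTO tableA VALUES (?, ?, ?, ?, ?)", rows)

query = """
WITH counted AS (
    -- frequency of each Code within (Code_group, CodeID)
    SELECT Code_group, CodeID, Code, COUNT(*) AS cnt
    FROM tableA
    GROUP BY Code_group, CodeID, Code
),
ranked AS (
    -- rank 1 = highest frequency; tied codes share rank 1
    SELECT Code_group, CodeID, Code,
           RANK() OVER (PARTITION BY Code_group, CodeID ORDER BY cnt DESC) AS r
    FROM counted
),
selected AS (
    -- keep a group only if exactly one Code holds rank 1 (no tie)
    SELECT Code_group, CodeID, MIN(Code) AS Code
    FROM ranked
    WHERE r = 1
    GROUP BY Code_group, CodeID
    HAVING COUNT(*) = 1
)
SELECT a.ID, s.Code AS Selected_code
FROM tableA a
LEFT JOIN selected s
       ON a.Code_group = s.Code_group AND a.CodeID = s.CodeID
ORDER BY a.ID
"""

# Map each ID to its Selected_code (None where the top frequency is tied)
selected_by_id = dict(conn.execute(query).fetchall())
for row_id, code in selected_by_id.items():
    print(row_id, code)
```

Because the join goes back to the original table, duplicate rows such as ID 4 and 5 are preserved, and tied groups come out as NULL (None in Python).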

Related

Select quantity on a 1st table based on a total quantity the 2nd table

Table 1
ID Grp Qty
1 A 5
2 A 4
3 B 5
4 B 3
5 B 2
6 C 14
7 D 1
8 D 1
9 E 2
10 E 2
11 E 1
12 E 1
Table 2
ID Grp Qty
1 A 7
2 B 9
3 C 13
4 D 1
5 E 4
Select/Output
ID Grp Qty
1 A 0
2 A 2
3 B 0
4 B 0
5 B 1
6 C 1
7 D 0
8 D 1
9 E 0
10 E 0
11 E 1
12 E 1
I want to select rows from the 1st table with a specific quantity based on the total quantity in the 2nd table. The expected result is shown in the 3rd table. Please see the sample tables above. I would really appreciate any help, and sorry, this is my first time asking a question here.
I have tried this code on both tables:
WITH tbl AS(
SELECT ID,
Qty,
Grp,
ROW_NUMBER() OVER (PARTITION BY Grp)AS Rown,
SUM(Qty) OVER (PARTITION BY Grp)AS Total
FROM Table1
)
SELECT * FROM tbl WHERE Rown = 1
But I am not able to select the specific rows from Table 1, because it only selects the 1st row and totals the quantity. Every row in Table 1 has its own quantity.

You could use a cumulative windowed aggregate and then a CASE expression to achieve this:
--Sample Data
WITH Table1 AS(
SELECT *
FROM (VALUES(1,'A',5),
(2,'A',4),
(3,'B',5),
(4,'B',3),
(5,'B',2),
(6,'C',14))V(ID,Grp,Qty)),
Table2 AS(
SELECT *
FROM (VALUES(1,'A',7),
(2,'B',9),
(3,'C',13))V(ID,Grp,Qty)),
--Solution
CTE AS(
SELECT T1.ID,
T1.Grp,
T1.Qty,
SUM(T1.Qty) OVER (PARTITION BY T1.Grp ORDER BY T1.Id
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningQty,
T2.Qty AS T2Qty
FROM Table1 T1
JOIN Table2 T2 ON T1.Grp = T2.Grp)
SELECT C.ID,
C.Grp,
CASE WHEN C.RunningQty <= C.T2Qty THEN C.Qty
ELSE C.T2Qty - LAG(C.RunningQty,1,0) OVER (PARTITION BY C.Grp ORDER BY C.ID)
END AS Qty
FROM CTE C;
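The CASE query above returns the quantity consumed from each row. The question's expected output, however, looks like the quantity left over on each row once Table 2's total has been used up in ID order. That reading is an assumption, but it matches all twelve expected rows. A minimal sketch of that variant follows, run on SQLite through Python's sqlite3 module as a stand-in for SQL Server. The column names PrevQty and Leftover are illustrative, and the two-argument MAX/MIN scalar functions are SQLite-specific (in T-SQL a CASE expression would do the same job).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (ID INTEGER, Grp TEXT, Qty INTEGER);
CREATE TABLE Table2 (ID INTEGER, Grp TEXT, Qty INTEGER);
INSERT INTO Table1 VALUES
  (1,'A',5),(2,'A',4),(3,'B',5),(4,'B',3),(5,'B',2),(6,'C',14),
  (7,'D',1),(8,'D',1),(9,'E',2),(10,'E',2),(11,'E',1),(12,'E',1);
INSERT INTO Table2 VALUES (1,'A',7),(2,'B',9),(3,'C',13),(4,'D',1),(5,'E',4);
""")

query = """
WITH running AS (
    SELECT T1.ID, T1.Grp, T1.Qty, T2.Qty AS T2Qty,
           -- total of all earlier rows in the group (0 for the first row)
           COALESCE(SUM(T1.Qty) OVER (PARTITION BY T1.Grp ORDER BY T1.ID
                    ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS PrevQty
    FROM Table1 T1
    JOIN Table2 T2 ON T1.Grp = T2.Grp
)
SELECT ID, Grp,
       -- consumed = whatever budget is still left, clamped to [0, Qty]
       Qty - MAX(0, MIN(Qty, T2Qty - PrevQty)) AS Leftover
FROM running
ORDER BY ID
"""

leftover = conn.execute(query).fetchall()
for row in leftover:
    print(row)
```

Clamping the consumed amount to the range 0..Qty also covers groups such as E, where the budget runs out before the last rows are reached.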

How to update records belongs to same partition SQL Server

I have a table in a database containing the following data:
GroupId ExceptionId ParentExceptionId row
1 101 NULL 1
1 102 NULL 2
1 103 NULL 3
2 104 NULL 1
2 105 NULL 2
2 106 NULL 3
3 107 NULL 1
3 108 NULL 2
I wrote the following query to get the row numbers above:
with CTE_RN as
(
SELECT a.[GroupId], a.[SolId], a.[id],ParentExceptionId,
ROW_NUMBER() OVER(PARTITION BY a.[GroupId] ORDER BY a.[GroupId]) AS [row]
FROM [dbo].[trn_Report6_Zone1_Exception] a)
select * from cte_rn
Expected output:
Update ParentExceptionId with the ExceptionId of the first record in the same group, and keep the ParentExceptionId of that first record NULL.
GroupId ExceptionId ParentExceptionId row
1 101 NULL 1
1 102 101 2
1 103 101 3
2 104 Null 1
2 105 104 2
2 106 104 3
3 107 NULL 1
3 108 107 2

You can use the first_value function:
select GroupId, ExceptionId,
(case when f_value <> ExceptionId then f_value end) as ParentExceptionId, row
from (select *, first_value(ExceptionId) over (partition by GroupId order by ExceptionId) f_value
from [dbo].[trn_Report6_Zone1_Exception] a
) a;
In the same way, you can use an updatable CTE:
with a as (
select *, first_value(ExceptionId) over (partition by GroupId order by ExceptionId) f_value
from [dbo].[trn_Report6_Zone1_Exception] a
)
update a
set ParentExceptionId = f_value
where f_value <> ExceptionId;
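As a quick check of the first_value approach on the question's data, here is a minimal sketch using Python's sqlite3 module. SQLite is an assumption made for the sake of a self-contained example, and the table name exceptions is hypothetical. Only the SELECT form is shown, because SQLite does not support updating through a CTE the way SQL Server does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE exceptions (GroupId INTEGER, ExceptionId INTEGER,
                         ParentExceptionId INTEGER);
INSERT INTO exceptions (GroupId, ExceptionId) VALUES
  (1,101),(1,102),(1,103),(2,104),(2,105),(2,106),(3,107),(3,108);
""")

query = """
SELECT GroupId, ExceptionId,
       -- the first row of each group keeps NULL; the others get the first ExceptionId
       CASE WHEN f_value <> ExceptionId THEN f_value END AS ParentExceptionId
FROM (SELECT GroupId, ExceptionId,
             FIRST_VALUE(ExceptionId)
                 OVER (PARTITION BY GroupId ORDER BY ExceptionId) AS f_value
      FROM exceptions) AS e
ORDER BY ExceptionId
"""

parents = conn.execute(query).fetchall()
for row in parents:
    print(row)
```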

Try it like this:
SELECT * INTO #TAB FROM
(select 1,101,NULL,1 UNION ALL
select 1,102,NULL,2 UNION ALL
select 1,103,NULL,3 UNION ALL
select 2,104,NULL,1 UNION ALL
select 2,105,NULL,2 UNION ALL
select 2,106,NULL,3 UNION ALL
select 3,107,NULL,1 UNION ALL
select 3,108,NULL,2
)AS TABLEA(GroupId,ExceptionId,ParentExceptionId,rowW)
;WITH CTE AS(
SELECT GroupId, MIN(ExceptionId) MIN_ExceptionId
FROM #TAB
GROUP BY GroupId
)
UPDATE T SET T.ParentExceptionId = C.MIN_ExceptionId
FROM #TAB T
INNER JOIN CTE C ON T.GroupId = C.GroupId
WHERE rowW <>1

SQL Server Query group by analytic

I have a table like this
Id scid name namesuffix nameId namesuffixid fullname
--------------------------------------------------------
1 1 a a 100 100 a
2 1 a b 100 101 ab
3 1 b c 101 102 abc
4 1 c d 102 103 abcd
5 2 e e 104 104 e
6 2 e f 104 105 ef
7 2 f g 105 106 efg
8 3 i i 107 107 i
9 3 i j 107 108 ij
10 3 j k 108 109 ijk
11 3 k l 109 110 ijkl
12 3 l m 110 111 ijklm
For each scid (group by scid), I want to select
the fullname of the first row and
the fullname of the last row.
Expected output
id scid fullname
-------------------
1 1 a
4 1 abcd
5 2 e
7 2 efg
8 3 i
12 3 ijklm
I tried the first_value and last_value analytic functions, but the rows are repeated and I didn't get the expected result.
Any help appreciated.

Another option is to use ROW_NUMBER() and COUNT
select
id, scid, fullname
from (
select
*, row_number() over (partition by scid order by id) rn
, count(*) over (partition by scid) cnt
from
myTable
) t
where
rn = 1
or rn = cnt
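A minimal sketch of this ROW_NUMBER() and COUNT approach, run on the id, scid and fullname columns of the question's data through Python's sqlite3 module (SQLite is assumed here in place of SQL Server). Window functions return a value for every row, which is why the outer WHERE is needed to keep only the first and last row of each scid.

```python
import sqlite3

# (id, scid, fullname) taken from the question's table
data = [
    (1, 1, "a"), (2, 1, "ab"), (3, 1, "abc"), (4, 1, "abcd"),
    (5, 2, "e"), (6, 2, "ef"), (7, 2, "efg"),
    (8, 3, "i"), (9, 3, "ij"), (10, 3, "ijk"), (11, 3, "ijkl"), (12, 3, "ijklm"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (id INTEGER, scid INTEGER, fullname TEXT)")
conn.executemany("INSERT INTO myTable VALUES (?, ?, ?)", data)

query = """
SELECT id, scid, fullname
FROM (SELECT id, scid, fullname,
             ROW_NUMBER() OVER (PARTITION BY scid ORDER BY id) AS rn,
             COUNT(*)     OVER (PARTITION BY scid)             AS cnt
      FROM myTable) AS t
WHERE rn = 1 OR rn = cnt   -- first row, or last row (position equals group size)
ORDER BY id
"""

first_last = conn.execute(query).fetchall()
for row in first_last:
    print(row)
```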

You could use FIRST_VALUE and LAST_VALUE as you proposed:
SELECT scid,
FIRST_VALUE(id) OVER(PARTITION BY scid ORDER BY id
ROWS UNBOUNDED PRECEDING) AS id,
FIRST_VALUE(fullname) OVER(PARTITION BY scid ORDER BY id
ROWS UNBOUNDED PRECEDING) AS fullname
FROM tab_name
UNION
SELECT scid,
LAST_VALUE(id) OVER(PARTITION BY scid ORDER BY id
RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS id,
LAST_VALUE(fullname) OVER(PARTITION BY scid ORDER BY id
RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS fullname
FROM tab_name
ORDER BY scid, id;

There are other ways to do this without window functions:
select t.*
from t join
(select min(id) as min_id, max(id) as max_id
from t
group by scid
) tt
on t.id in (min_id, max_id);
I only suggest this because there are many ways to do what you want. If performance is an issue, you may want to experiment with different methods.

Query is not working accordingly in postgresql

Select sum(num) as num, sum(numbr) as numbr
from
(
(Select 0 as num)
union all
(Select 1 as num)
) t,
(
(Select 2 as numbr)
union all
(Select 3 as numbr)
) t1
giving result:
num numbr
2 10
But the correct result should be
num numbr
1 5

You are doing the cross product of a table containing 0 and 1, and a table containing 2 and 3. Try removing the sums:
Select num, numbr as numbr from
(
(Select 0 as num)
union all
(Select 1 as num))t,
((Select 2 as numbr)
union all
(Select 3 as numbr)
)t1
This gives you:
0;2
0;3
1;2
1;3
Which will correctly sum to 2 and 10.

That happens because you are CROSS JOINing: every record is connected to every other record without a join condition, which means that in this case your join becomes this:
NUM | NUMBR
0 2
0 3
1 2
1 3
This gives SUM(NUM) = 2 and SUM(NUMBR) = 10.
When joining, you have to specify the join condition unless a cross product is what you want.
Note: You are using implicit join syntax (comma-separated). You should avoid that and use the explicit syntax, which helps make sure you specify a join condition (in the ON clause):
Select sum(num) as num, sum(numbr) as numbr
from
(
(Select 0 as num)
union all
(Select 1 as num)
) t
INNER JOIN
(
(Select 2 as numbr)
union all
(Select 3 as numbr)
) t1
ON(t.<Col> = t1.<Col1>)

Select num, numbr as numbr
from
(
(Select 0 as num)
union all
(Select 1 as num)
) t,
(
(Select 2 as numbr)
union all
(Select 3 as numbr)
) t1
gives you the Cartesian product of the tables.
| Num | Number |
|-----|--------|
| 0 | 2 |
| 0 | 3 |
| 1 | 2 |
| 1 | 3 |
Therefore the sums are 2 and 10.

It is working correctly as written. If you want the result you expected, try this:
Select sum(distinct num) as num, sum(distinct numbr) as numbr
from
(
(Select 0 as num)
union all
(Select 1 as num)
) t,
(
(Select 2 as numbr)
union all
(Select 3 as numbr)
) t1
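Note that SUM(DISTINCT ...) gives 1 and 5 only because the values inside each set happen to be unique. A more general option is to aggregate each set on its own so that no cross product is formed. The sketch below shows the cross product, its sums, and the separately aggregated sums. It runs on SQLite through Python's sqlite3 module, which is an assumption made for a self-contained example, since the question is about PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The comma join is a cross join: every num is paired with every numbr
pairs = conn.execute("""
    SELECT num, numbr
    FROM (SELECT 0 AS num UNION ALL SELECT 1) AS t
    CROSS JOIN (SELECT 2 AS numbr UNION ALL SELECT 3) AS t1
    ORDER BY num, numbr
""").fetchall()

# Summing over the cross product counts each value twice
cross_sums = conn.execute("""
    SELECT SUM(num), SUM(numbr)
    FROM (SELECT 0 AS num UNION ALL SELECT 1) AS t
    CROSS JOIN (SELECT 2 AS numbr UNION ALL SELECT 3) AS t1
""").fetchone()

# Aggregating each set in its own scalar subquery avoids the duplication
separate_sums = conn.execute("""
    SELECT (SELECT SUM(num)   FROM (SELECT 0 AS num   UNION ALL SELECT 1) AS t),
           (SELECT SUM(numbr) FROM (SELECT 2 AS numbr UNION ALL SELECT 3) AS t1)
""").fetchone()

print(pairs)
print(cross_sums)
print(separate_sums)
```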

MS Sql Server, same column with a different row neighbors

I need a little help on a SQL query. I could not get the result that I wanted.
ID I10 H 10NS HNS CC NSCC
0 1 1 1 1 14 14
1 0 1 0 1 6 2
1 0 2 0 2 12 2
1 0 3 0 3 17 4
1 0 3 0 3 18 4
1 0 3 0 3 19 4
1 0 3 0 3 20 4
What I want is one row for each ID: the one with the highest CC.
For example,
ID I10 H 10NS HNS CC NSCC
0 1 1 1 1 14 14
1 0 3 0 3 20 4
I tried with this code:
SELECT a.ID, b.name, a.i10 as[i-10-index], a.h as[h-index], 10ns as[i-10-index based on non-self-citation], a.hns as [h-index based on non-self-citation],
max(a.[Citation Count]), (a.[Non-Self-Citation Count])
FROM tbl_lpNumerical as a
join tbl_lpAcademician as b
on a.ID= (b.ID-1)
GROUP BY a.ID, b.name, a.i10, a.h, a.10ns, a.hns,
a.[Non-Self-Citation Count]
order by a.ID desc
However, I could not get the desired results.
Thank you for your time.

You can simply select every row for which no other row with the same ID has a higher CC:
SELECT n.*
FROM tbl_lpNumerical n
WHERE NOT EXISTS ( SELECT 'b'
FROM tbl_lpNumerical n2
WHERE n2.ID = n.ID
AND n2.CC > n.CC
)

In SQL Server, you can use row_number() for this. Based on your sample data, something like:
select sd.*
from (select sd.*, row_number() over (partition by id order by cc desc) as seqnum
from sampledata sd
) sd
where seqnum = 1;
I have no idea what your query has to do with the sample data. If it generates the data, then you can use a CTE:
with sampledata as (
<some query here>
)
select sd.*
from (select sd.*, row_number() over (partition by id order by cc desc) as seqnum
from sampledata sd
) sd
where seqnum = 1;

The following query will select a single row from each ID partition: the one with the highest CC value:
SELECT *
FROM (SELECT *,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY CC DESC) AS rn
FROM mytable) t
WHERE t.rn = 1
If there can be multiple rows having the same CC max value and you want all of them selected, then you can replace ROW_NUMBER() with RANK().
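The difference between the two functions shows up only when the maximum is tied. The sketch below uses the ID, H and CC columns from the question plus two hypothetical rows for ID 2 that share the same CC (these rows are not in the original data and exist only to create a tie). It runs on SQLite through Python's sqlite3 module as a stand-in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (ID INTEGER, H INTEGER, CC INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (0, 1, 14), (1, 1, 6), (1, 2, 12), (1, 3, 17),
        (1, 3, 18), (1, 3, 19), (1, 3, 20),
        (2, 1, 9), (2, 2, 9),  # hypothetical tie on the highest CC
    ],
)

# Same query twice; only the ranking function changes
template = """
SELECT ID, H, CC
FROM (SELECT ID, H, CC,
             {fn}() OVER (PARTITION BY ID ORDER BY CC DESC) AS rn
      FROM mytable) AS t
WHERE rn = 1
ORDER BY ID, H
"""

one_per_id = conn.execute(template.format(fn="ROW_NUMBER")).fetchall()
with_ties = conn.execute(template.format(fn="RANK")).fetchall()

print(one_per_id)  # exactly one row per ID; which tied row is kept is arbitrary
print(with_ties)   # every row that shares the highest CC
```

With ROW_NUMBER(), adding a tie-breaker column to the ORDER BY makes the choice among tied rows deterministic.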
