Postgres: plpgsql aggregate function that filters rows by length within each group

I need help with a plpgsql aggregate function and am not sure whether this can be done. Thanks in advance for your help.
Table
_id group_id content num len
0 2 tab 1 3
1 2 name 2 4
2 1 tag 1 3
3 1 bag 2 3
4 1 a 3 1
5 2 b 3 1
6 1 bo 4 2
7 2 an 4 2
I want to implement an aggregate function that aggregates by group_id and processes the rows in order of num. Inside the function, a row is skipped if its len is less than or equal to 2, and each group then returns only the specified number of rows (required_num).
Example:
with sorted_table as(select * from Table order by num)
select my_func(content, len, 2 /* required_num */) from sorted_table group by group_id;
Expected result:
_id group_id content num len
0 2 tab 1 3
1 2 name 2 4
2 1 tag 1 3
3 1 bag 2 3
For example, I need the top 10 (required_num) rows in each group, ordered by the num of each group, comparing the contents of the candidates in turn. If the similarity is too high (I can check this with a similarity query), the row is filtered out, and so on until each group reaches 10 rows. The result could also look like this:
group_id result
2 [{"num":1,"content":"tab","len":3,"_id":0},{"num":2,"content":"name","len":4,"_id":1}]
1 [{"num":1,"content":"tag","len":3,"_id":2},{"num":2,"content":"bag","len":3,"_id":3}]

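As far as I understand the question, you don't really need a custom aggregate. A filtered jsonb_agg() that keeps only the rows with len greater than 2, ordered by num, should do (note that it does not limit the number of rows per group):
select group_id,
jsonb_agg(t order by t.num) filter (where t.len > 2) as result
from the_table t
group by group_id;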
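To make the intended logic concrete, here is a minimal Python sketch of how I read the question: skip rows whose len is 2 or less, walk the rest in num order, and keep at most required_num rows per group. The helper name top_per_group and the min_len parameter are my own assumptions, not anything from Postgres.

```python
from collections import defaultdict

# Sample rows from the question: (_id, group_id, content, num, len).
rows = [
    (0, 2, "tab", 1, 3), (1, 2, "name", 2, 4), (2, 1, "tag", 1, 3),
    (3, 1, "bag", 2, 3), (4, 1, "a", 3, 1), (5, 2, "b", 3, 1),
    (6, 1, "bo", 4, 2), (7, 2, "an", 4, 2),
]

def top_per_group(rows, required_num, min_len=3):
    """Hypothetical helper: keep rows with len >= min_len, in num order,
    at most required_num rows per group_id."""
    groups = defaultdict(list)
    for _id, group_id, content, num, length in sorted(rows, key=lambda r: r[3]):
        if length < min_len:
            continue  # skip rows that are too short (len <= 2 by default)
        if len(groups[group_id]) < required_num:
            groups[group_id].append(
                {"num": num, "content": content, "len": length, "_id": _id})
    return dict(groups)

result = top_per_group(rows, required_num=2)
```

With the sample data this leaves tab and name for group 2, and tag and bag for group 1, which is the shape of the JSON result shown in the question.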

Related

Row Number with specific window size

I want to group records by row numbers.
For example, rows 1-3 go into group 1, rows 4-6 into group 2, rows 7-9 into group 3, and so on.
Suppose below is the table structure:
Row Number Data Value
1 A 10
2 A 5
3 A 1
4 A 33
5 A 2
6 A 127
1 B 1
2 B 0
3 B 7
4 B 7
5 B 5
6 B 8
7 B 1
8 B 0
I want an output like this:
Group Value
1 10
1 5
1 1
2 33
2 2
2 127
1 1
1 0
1 7
2 7
2 5
2 8
3 1
3 0
I am using Oracle 11g.
I can achieve this using PL/SQL, but I have to use SQL only, because the query will run in a reporting tool.
If this is a duplicate question, please provide a link to the answered question.
Subtract 1 from the column "RowNumber" and divide by 3.
Then use TRUNC() to get the integer part:
SELECT TRUNC(("RowNumber" - 1) / 3) + 1 "Group",
"Value"
FROM tablename
See the demo.
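The arithmetic is easy to check outside the database. This is only a Python sketch of the same formula (the function name bucket and the size parameter are made up for illustration); it also shows that the result agrees with rounding RowNumber / 3 up.

```python
import math

def bucket(row_number, size=3):
    # Same arithmetic as TRUNC((RowNumber - 1) / 3) + 1, using integer division.
    return (row_number - 1) // size + 1

# Data "A" has row numbers 1..6, data "B" has row numbers 1..8.
groups_a = [bucket(rn) for rn in range(1, 7)]
groups_b = [bucket(rn) for rn in range(1, 9)]

# The ceiling form gives the same group numbers for positive row numbers.
same_as_ceil = all(bucket(rn) == math.ceil(rn / 3) for rn in range(1, 100))
```

For the sample this gives groups 1, 1, 1, 2, 2, 2 for A and 1, 1, 1, 2, 2, 2, 3, 3 for B.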
I would assume the name of the first column is ordering.
You can do:
select
1 + trunc((row_number() over(partition by data order by ordering) - 1) / 3) as grp,
value
from t
What you show looks like the output from something like this:
select ceil(rn/3) as grp, value
from your_table
order by rn;
Note that "row number" and "group" are reserved words/phrases which should not be used as column names. I used rn and grp instead.
I think the ceiling function is the simplest way to arrive at what you want. If you want to base it on the RowNumber column:
select ceil( RowNumber / 3.0) as grouping
If you want to calculate it yourself using row_number():
select ceil( row_number() over (order by RowNumber) / 3.0 ) as grouping

Resetting a Count in SQL

I have data that looks like this:
ID num_of_days
1 0
2 0
2 8
2 9
2 10
2 15
3 10
3 20
I want to add another column that increments in value only if the num_of_days column is divisible by 5 or the ID number increases so my end result would look like this:
ID num_of_days row_num
1 0 1
2 0 2
2 8 2
2 9 2
2 10 3
2 15 4
3 10 5
3 20 6
Any suggestions?
Edit #1:
num_of_days represents the number of days since the customer last saw a doctor between 1 visit and the next.
A customer can see a doctor 1 time or they can see a doctor multiple times.
If it's the first time visiting, the num_of_days = 0.
SQL tables represent unordered sets. Based on your question, I'll assume that the combination of id/num_of_days provides the ordering.
You can use a cumulative sum . . . with lag():
select t.*,
sum(case when prev_id = id and num_of_days % 5 <> 0
then 0 else 1
end) over (order by id, num_of_days) as row_num
from (select t.*,
lag(id) over (order by id, num_of_days) as prev_id
from t
) t;
Here is a db<>fiddle.
If you have a different ordering column, then just use that in the order by clauses.
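If it helps to see what the lag() plus cumulative sum is doing, here is a rough Python equivalent. It assumes, as above, that (id, num_of_days) defines the order; the function name running_row_num is just for illustration.

```python
# Sample rows from the question: (ID, num_of_days).
rows = [(1, 0), (2, 0), (2, 8), (2, 9), (2, 10), (2, 15), (3, 10), (3, 20)]

def running_row_num(rows):
    """Increment the counter unless the ID is unchanged and
    num_of_days is not divisible by 5."""
    out = []
    counter = 0
    prev_id = None  # plays the role of lag(id); None for the first row
    for id_, days in sorted(rows):
        if not (prev_id == id_ and days % 5 != 0):
            counter += 1
        out.append((id_, days, counter))
        prev_id = id_
    return out

numbered = running_row_num(rows)
```

On the sample data the third column comes out as 1, 2, 2, 2, 3, 4, 5, 6, matching the desired row_num.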

Delete rows that are duplicated and follow each other consecutively

It's hard to formulate, so I'll just show an example, and you are welcome to edit my question and title.
Suppose I have a table
flag id value datetime
0 b 1 343 13
1 a 1 23 12
2 b 1 21 11
3 b 1 32 10
4 c 2 43 11
5 d 2 43 10
6 d 2 32 9
7 c 2 1 8
For each id I want to squeeze the table by the flag column, such that all duplicate flag values that follow each other collapse into one row with sum aggregation. Desired result:
flag id value
0 b 1 343
1 a 1 23
2 b 1 53
3 c 2 75
4 d 2 32
5 c 2 1
P.S.: I found functions like CONDITIONAL_CHANGE_EVENT, which seem to be able to do that, but the examples in the docs don't work for me.
Use the difference of row numbers approach to assign groups based on consecutive rows having the same flag. Then sum the values within each group. The group number is only unique per id and flag, so partition by both:
select distinct id,flag,grp,sum(value) over(partition by id,flag,grp) as finalvalue
from (
select t.*,row_number() over(partition by id order by datetime)-row_number() over(partition by id,flag order by datetime) as grp
from tbl t
) t
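For comparison, the same "collapse consecutive duplicates" idea can be sketched in plain Python with itertools.groupby, which only groups adjacent items. I'm assuming rows are ordered by datetime descending within each id, as they are listed in the question; collapse_runs is a made-up name.

```python
from itertools import groupby

# Sample rows from the question: (flag, id, value, datetime).
rows = [
    ("b", 1, 343, 13), ("a", 1, 23, 12), ("b", 1, 21, 11), ("b", 1, 32, 10),
    ("c", 2, 43, 11), ("d", 2, 43, 10), ("d", 2, 32, 9), ("c", 2, 1, 8),
]

def collapse_runs(rows):
    """Sum value over each run of consecutive rows sharing id and flag."""
    # Assumed ordering: by id, then datetime descending.
    ordered = sorted(rows, key=lambda r: (r[1], -r[3]))
    out = []
    # groupby starts a new group whenever (id, flag) changes between neighbours.
    for (id_, flag), run in groupby(ordered, key=lambda r: (r[1], r[0])):
        out.append((flag, id_, sum(r[2] for r in run)))
    return out

collapsed = collapse_runs(rows)
```

For id 1 this gives b 343, a 23, b 53 as in the desired result. For id 2 the sample rows as listed give c 43, d 75, c 1, so the c/d values in the desired result may have been swapped in the question.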
Here's an approach which uses CONDITIONAL_CHANGE_EVENT:
select
flag,
id,
sum(value) value
from (
select
conditional_change_event(flag) over (order by datetime desc) part,
flag,
id,
value
from so
) t
group by part, flag, id
order by part;
The result is different from your desired result stated in the question because of order by datetime. Adding a separate column for the row number and sorting on that gives the correct result.

What is the best way to initialize a SortOrder column (e.g. 0, 1, 2, 3) where there are multiple groups based on another field?

I have a table of list items. There is a ListID column used as an identifier to group the list items together. Is there a sane way to give every item a sort order, starting at 0 per list and incrementing by one per item?
Basically, I need to populate the following SortOrder Column values for a large number of entries/ListIDs.
ID ListID SortOrder
1 1 0
2 0 0
3 1 1
4 0 1
5 1 2
6 0 2
7 2 0
8 2 1
9 2 2
You can use ROW_NUMBER() with a PARTITION on the ListId field for this:
Select Id, ListId,
Row_Number() Over (Partition By ListId Order By Id) -1 As SortOrder
From YourTable
Order By Id
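What the partitioned ROW_NUMBER() minus 1 computes can be sketched in Python as one counter per ListID. This is only an illustration of the numbering, assuming ID defines the order inside each list; assign_sort_order is a hypothetical name.

```python
from collections import defaultdict

# Sample rows from the question: (ID, ListID).
rows = [(1, 1), (2, 0), (3, 1), (4, 0), (5, 1), (6, 0), (7, 2), (8, 2), (9, 2)]

def assign_sort_order(rows):
    """Give each row a zero-based position within its ListID, in ID order."""
    next_pos = defaultdict(int)  # next free SortOrder per ListID, starts at 0
    out = []
    for id_, list_id in sorted(rows):
        out.append((id_, list_id, next_pos[list_id]))
        next_pos[list_id] += 1
    return out

ordered = assign_sort_order(rows)
```

The third column comes out as 0, 0, 1, 1, 2, 2, 0, 1, 2, the same SortOrder values as in the table in the question.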
I think you want:
WITH toupdate as (
SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY ListId Order By id) - 1 as new_SortOrder
FROM tableName t
)
UPDATE toupdate
SET SortOrder = new_SortOrder;
SQL Server has the nice ability to update a subquery or CTE under some circumstances.
Do you need to persist the order of lists containing items that are shared between lists? If so, perhaps variations on this schema would work for you.
Item
id label
1 A
2 B
3 C
4 D
List
id listName
1 abc list
2 cbd list
3 aaa list
ListMembership
id listId itemId order
1 1 1 1
2 1 2 2
3 1 3 3
4 2 2 2
5 2 3 1
6 2 4 3
7 3 1 1
8 3 1 2
9 3 1 3
usage:
select i.label from listMembership as lm
join Item as i on i.id=lm.itemId
where lm.listId=2
order by lm."order"
yields:
label
C
B
D

Search string value from mssql column, regex, group by

These data:
ID Desc
1 CUSTSEG
2 CUSTSEG;CARDMNU;CRC;CRCBISOA;CARDMNU;CRC;CRCBISOA
3 CUSTSEG;HKM
4 CUSTSEG;HKM;HKM
5 CUSTSEG;HKM;HKM;HKM;HKM;HKM;HKM;HKM
6 CUSTSEG;PHPM
7 CUSTSEG;PHPM;CARDMNU
8 CUSTSEG;PHPM;CARDMNU;ATM
must be queried into this format:
COUNT Desc
1 ATM
4 CARDMNU
2 CRC
2 CRCBISOA
8 CUSTSEG
10 HKM
3 PHPM
How can I achieve this? With a substring? I've tried this:
SELECT COUNT(*), CallTraversalLog
FROM [IVR].[dbo].[tblReportData]
WHERE CallTraversalLog Like '%CUSTSEG%'
GROUP BY CallTraversalLog
But the resultset I got is
COUNT Desc
1 CUSTSEG;PHPM;CARDMNU;CRC;ATM
1 CUSTSEG;PHPM;CARDMNU;CRC;CARDMNU;CRC
1 CUSTSEG;PHPM;CARDMNU;CRC;CARDMNU;CRC;CRCBISOA
2 CUSTSEG;PHPM;CARDMNU;CRC;CC
3 CUSTSEG;PHPM;CARDMNU;CRC;CRC
2 CUSTSEG;PHPM;CARDMNU;CRC;CRC;CARDMNU;CRC
1 CUSTSEG;PHPM;CARDMNU;CRC;CRC;CRC;CRC;CARDMNU;CRC
25 CUSTSEG;PHPM;CARDMNU;CRC;CRCACTIVATION
4 CUSTSEG;PHPM;CARDMNU;CRC;CRCACTIVATION;CRCENROLL
55 CUSTSEG;PHPM;CARDMNU;CRC;CRCAPST
I would split the strings and count the items. You need a table valued function that splits a string by delimiter. If you don't want to write your own function you can easily google one. Then CROSS APPLY the function to your table and count the items.
SELECT s.item, count(*)
FROM [IVR].[dbo].[tblReportData] d
CROSS APPLY dbo.fnSplitString(d.CallTraversalLog, ';') s
GROUP BY s.item
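To sanity-check the expected counts, here is the same split-and-count idea as a small Python sketch over the sample rows from the question (the variable names are mine). Each delimited item is counted once per occurrence, which is what the CROSS APPLY plus GROUP BY does.

```python
from collections import Counter

# The Desc values from the question, one string per row.
descs = [
    "CUSTSEG",
    "CUSTSEG;CARDMNU;CRC;CRCBISOA;CARDMNU;CRC;CRCBISOA",
    "CUSTSEG;HKM",
    "CUSTSEG;HKM;HKM",
    "CUSTSEG;HKM;HKM;HKM;HKM;HKM;HKM;HKM",
    "CUSTSEG;PHPM",
    "CUSTSEG;PHPM;CARDMNU",
    "CUSTSEG;PHPM;CARDMNU;ATM",
]

# Split every row on ';' and count each item across all rows.
counts = Counter(item for desc in descs for item in desc.split(";"))

# Print in the COUNT / Desc layout used in the question, sorted by item.
for item in sorted(counts):
    print(counts[item], item)
```

This reproduces the requested counts, for example 10 for HKM and 8 for CUSTSEG.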