PostgreSQL: chained inserts throw "column does not exist"

I have a function that should insert one row into the first table for every input row, and then insert exactly two linked rows into the second table, using the id generated by the first insert.
begin
with participants as(
select (first, second)
from get_contest_participants_candidates(competition_id_input)
), new_contest as(
insert into contests (competition_id)
select competition_id_input
from participants
returning id
)
insert into "contestParticipants" (contest_id, contestant_id)
values(
(select c.id, p.first from new_contest c, participants p),
(select c.id, p.second from new_contest c, participants p)
);
end;
The function get_contest_participants_candidates() returns:
first  second
1      2
1      3
Expected result:
table contests:
id  competition_id
1   1
2   1
table contestParticipants:
id  contest_id  participant_id
1   1           1
2   1           2
3   2           1
4   2           3
But all it returns is:
Failed to run sql query: column "first" does not exist
If I change (first, second) to first, second it returns:
Failed to run sql query: subquery must return only one column
What am I doing wrong here?

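Your syntax is incorrect. Each parenthesized subquery inside VALUES is a scalar subquery, so it must return exactly one column (hence "subquery must return only one column"). Also, select (first, second) builds a single composite row-type column rather than two columns, which is why "first" does not exist; the CTE should select first, second without the parentheses. The incorrect statement is: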
insert into "contestParticipants" (contest_id, contestant_id)
values(
(select c.id, p.first from new_contest c, participants p),
(select c.id, p.second from new_contest c, participants p)
);
Maybe you want to write this:
insert into "contestParticipants" (contest_id, contestant_id)
select c.id, p.first from new_contest c, participants p
union all
select c.id, p.second from new_contest c, participants p;

SQL recursively creating matching groups based on reference table

Imagine you had a data source like:
Id  Val  Data_Date
1   A    2022-01-01
2   B    2022-01-05
3   C    2022-01-09
4   D    2022-01-31
5   E    2022-02-01
With a reference table matching values in this way:
Target_Val  Matching_Val  Valid_Start  Valid_End
B           A             2022-01-04   2022-01-06
C           B             2022-01-09   2022-01-09
D           A             2022-01-31   2022-01-31
Imagine you want to create a table grouping values together where there is a match in the reference table within X days, say 4.
And you want to apply this matching recursively.
Output would be something like this:
Group_Id  Id
1         1
1         2
1         3
2         4
3         5
The logic here would be that C matches to B in the appropriate date range, and B matches to A in the appropriate date range, therefore they are all one group.
But although D matches to A, it is too far apart (greater than 4 days). And E doesn't match to anything.
There could be any depth (A > B > C > D ...)
Is there an appropriate algorithm in SQL to accomplish this? The values of the group IDs are unimportant and just meant to group data points together.
Here's my attempt. You do indeed need a recursive CTE, but you need to join the source table to groups table and then join back to the source table to ensure that the child fits within the parent's 4 day window. E.g. in the case of D and A, as you mention, they match, but they aren't close enough to be counted.
Then I added a calculated column to work out which rows belong to a valid hierarchy and used that for the recursive join, because we can exclude anything that is not part of a hierarchy.
After that we need to order the records by their depth so we know which parent record is first, e.g. in the case of A > B > C.
Then DENSE_RANK over the results to get your final groups. This will need some testing with deeper levels of recursion though, but this should point you in the right direction:
CREATE TABLE SourceData
(
Id INTEGER,
Val CHAR(1),
Data_Date DATE
);
CREATE TABLE Groups
(
Target_Val CHAR(1),
Matching_Val CHAR(1),
Valid_Start DATE,
Valid_End DATE
);
INSERT INTO SourceData (Id, Val, Data_Date) VALUES (1,'A','2022-01-01');
INSERT INTO SourceData (Id, Val, Data_Date) VALUES (2,'B','2022-01-05');
INSERT INTO SourceData (Id, Val, Data_Date) VALUES (3,'C','2022-01-09');
INSERT INTO SourceData (Id, Val, Data_Date) VALUES (4,'D','2022-01-31');
INSERT INTO SourceData (Id, Val, Data_Date) VALUES (5,'E','2022-02-01');
INSERT INTO Groups (Target_Val, Matching_Val, Valid_Start, Valid_End ) VALUES ('B','A','2022-01-04','2022-01-06');
INSERT INTO Groups (Target_Val, Matching_Val, Valid_Start, Valid_End ) VALUES ('C','B','2022-01-09','2022-01-09');
INSERT INTO Groups (Target_Val, Matching_Val, Valid_Start, Valid_End ) VALUES ('D','A','2022-01-31','2022-01-31');
WITH sourceCTE AS
(
SELECT sd.Id, sd.Val, sd.Data_Date, g.Valid_Start, g.Valid_End, IIF(s.Val IS NULL, sd.Val, g.Matching_Val) [ParentVal], CAST(NULL AS DATE) [start], CAST(NULL AS DATE) [end], 1 [Depth],
IIF(s.Val IS NULL, 0, 1) IsHeirarchy
FROM SourceData sd
LEFT JOIN Groups g ON g.Target_Val = sd.Val AND sd.Data_Date BETWEEN g.Valid_Start AND g.Valid_End
LEFT JOIN SourceData s ON s.Val = g.Matching_Val AND ABS(DATEDIFF(DAY, s.Data_Date, sd.Data_Date)) < 5
UNION ALL
SELECT s.Id, s.Val, s.Data_Date, g.Valid_Start, g.Valid_End, g.Matching_Val, g.Valid_Start, g.Valid_End, s.[Depth] + 1, 1
FROM sourceCTE s
INNER JOIN Groups g ON g.Target_Val = s.[ParentVal] AND s.IsHeirarchy = 1
),
ResultCTE AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY Id ORDER BY [Depth] DESC) [RNum]
FROM sourceCTE
)
SELECT DENSE_RANK() OVER (ORDER BY ParentVal) [Group_Id], Id
FROM ResultCTE
WHERE [RNum] = 1
Here's a working fiddle.
I can't promise this is the best solution, because just like the query optimiser I gave up after about 2 hours, ha.
Also, for any future questions, please provide sample data in script format to save time creating the structure.
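As a cross-check for the recursive CTE above, here is a minimal sketch of the same grouping done outside SQL. It swaps the recursive query for a union-find (disjoint set) pass in Python, which is a different technique from the answer's. It assumes, as the T-SQL does, that the target row's date must fall inside the Valid_Start/Valid_End window and that the two rows must be at most 4 days apart. The names source, groups_ref, MAX_DAYS, find and union are made up for this illustration.

```python
from datetime import date

# Sample rows from the question: (Id, Val, Data_Date)
source = [
    (1, "A", date(2022, 1, 1)),
    (2, "B", date(2022, 1, 5)),
    (3, "C", date(2022, 1, 9)),
    (4, "D", date(2022, 1, 31)),
    (5, "E", date(2022, 2, 1)),
]
# Reference rows: (Target_Val, Matching_Val, Valid_Start, Valid_End)
groups_ref = [
    ("B", "A", date(2022, 1, 4), date(2022, 1, 6)),
    ("C", "B", date(2022, 1, 9), date(2022, 1, 9)),
    ("D", "A", date(2022, 1, 31), date(2022, 1, 31)),
]
MAX_DAYS = 4  # assumed meaning of "within X days"

# Every Id starts out as its own group.
parent = {row_id: row_id for row_id, _, _ in source}

def find(x):
    """Return the representative Id of the group containing x."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x

def union(x, y):
    """Merge the groups of x and y, keeping the smaller Id as representative."""
    rx, ry = find(x), find(y)
    if rx != ry:
        parent[max(rx, ry)] = min(rx, ry)

# Link a target row to a matching row when the reference table allows it
# and the two dates are close enough.
for t_id, t_val, t_date in source:
    for target, matching, start, end in groups_ref:
        if target != t_val or not (start <= t_date <= end):
            continue
        for m_id, m_val, m_date in source:
            if m_val == matching and abs((t_date - m_date).days) <= MAX_DAYS:
                union(t_id, m_id)

# Number the groups in order of their smallest Id.
roots = sorted({find(i) for i in parent})
group_id = {root: n for n, root in enumerate(roots, start=1)}
result = [(group_id[find(i)], i) for i in sorted(parent)]
print(result)
```

Because merging is transitive, chains of any depth (A > B > C > D ...) end up in one group without an explicit depth counter.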

Need help writing an SQL query to count non duplicate rows (not a distinct count)

I have a table like below. I'm trying to do a count of IDs that are not duplicated. I don't mean a distinct count. A distinct count would return a result of 7 (a, b, c, d, e, f, g). I want it to return a count of 4 (a, c, d, f). These are the IDs that do not have multiple type codes. I've tried the following queries but got counts of 0 (the result should be a count in the millions).
select ID, count (ID) as number
from table
group by ID
having count (ID) = 1
Select count (distinct ID)
From table
Having count (ID) = 1
ID|type code
a|111
b|222
b|333
c|444
d|222
e|111
e|333
e|555
f|444
g|333
g|444
Thanks to @scaisEdge! The first query you provided gave me exactly what I was looking for in the question above. Now that that's figured out, my leaders have asked for it to be taken a step further: show, per type code, how many IDs have only that single type code. For example, we want to see
type code|count
111|1
222|1
444|2
There are 2 instances of IDs that have a single type code of 444 (c, f), there is one instance of an ID that has a single type code of 111 (a), and 222 (d). I've tried modifying the query as such, but have been coming across errors when running the query
select count(admin_sys_tp_cd) as number
from (
select cont_id from
imdmadmp.contequiv
group by cont_id
having count(*) =1) t
group by admin_sys_tp_cd
If you want the count, it could be:
select count(*) from (
select id from
my_table
group by id
having count(*) =1
) t
If you want the ids:
select id from
my_table
group by id
having count(*) =1
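To make the two queries above easy to try, here is a small sketch that runs them against the sample data from the question using Python's built-in sqlite3 module. The table name my_table follows the answer; the column name type_code is an assumption standing in for "type code". The second query is one possible way to handle the follow-up (count per type code): the modified query in the question fails because the outer query can only group by a column that the inner query actually returns, so the sketch carries the type code through with MIN(), which is safe here because each remaining id has exactly one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id TEXT, type_code INTEGER)")
rows = [
    ("a", 111), ("b", 222), ("b", 333), ("c", 444), ("d", 222), ("e", 111),
    ("e", 333), ("e", 555), ("f", 444), ("g", 333), ("g", 444),
]
conn.executemany("INSERT INTO my_table VALUES (?, ?)", rows)

# Count of ids that appear on exactly one row.
single_count = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT id FROM my_table GROUP BY id HAVING COUNT(*) = 1
    ) AS t
""").fetchone()[0]

# Follow-up: for ids with exactly one row, count them per type code.
# MIN(type_code) simply returns that single row's type code.
per_type = conn.execute("""
    SELECT type_code, COUNT(*) AS id_count
    FROM (
        SELECT id, MIN(type_code) AS type_code
        FROM my_table
        GROUP BY id
        HAVING COUNT(*) = 1
    ) AS t
    GROUP BY type_code
    ORDER BY type_code
""").fetchall()

print(single_count)  # 4  (a, c, d, f)
print(per_type)      # [(111, 1), (222, 1), (444, 2)]
```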
How about looping over a temporary table?
-- copy the source rows into a working temp table
select
    *
into #control
from tablename
declare @acum as int
declare @code as char(3)
declare @id as char(1)
declare @id2 as int
select @acum = 0
while exists (select * from #control)
begin
    -- take the next id and one of its codes
    select @code = (select top 1 code from #control order by id)
    select @id = (select top 1 id from #control order by id)
    -- count rows for the same id that carry a different code
    select @id2 = count(id) from #control where id in (select id from tablename where id = @id and code <> @code)
    if @id2 = 0
    begin
        select @acum = @acum + 1
    end
    delete #control
    where id = @id --and code = @code
end
drop table #control
print @acum

T-SQL to generate expected result

I have a case in SQL. The source table has three columns: Id, Cate, Type.
Within the same Cate, the Types (A, A-) or (B, B-) eliminate each other, and the rows that remain (those with the highest Id) are returned.
E.g.:
With Cate = AM001: Id = 1, 2, 3; Ids 1 and 2 eliminate each other --> keep Id = 3.
With Cate = AM003: Id = 4, 6, Type = B --> keep both.
With Cate = AM005: Id = 7, 8, 9; rows 7 and 8 eliminate each other --> keep Id = 9.
With Cate = AM006: Id = 10, 11, Type = A --> keep both.
Expected result:
I'm using a cursor, but it is quite hard to get right that way. Is there any other way to solve this in T-SQL?
Assuming that I understand the problem:
you have a number of rows with sections ("Cates") and symbols ("Type");
if there are any symbols ending in a minus sign then these indicate a row without a minus sign should be removed;
symbols are never "mixed" per section, i.e. a section can never have "A" and "B-";
there will always be a row to remove if there is a type with a minus;
rows should be removed starting with the lowest Id.
Then this should work:
DECLARE @data TABLE (
    Id INT,
    Cate VARCHAR(5),
    [Type] VARCHAR(2));
INSERT INTO @data SELECT 1, 'AM001', 'A';
INSERT INTO @data SELECT 2, 'AM001', 'A-';
INSERT INTO @data SELECT 3, 'AM001', 'A';
INSERT INTO @data SELECT 4, 'AM003', 'B';
INSERT INTO @data SELECT 6, 'AM003', 'B';
INSERT INTO @data SELECT 7, 'AM005', 'B';
INSERT INTO @data SELECT 8, 'AM005', 'B-';
INSERT INTO @data SELECT 9, 'AM005', 'B';
INSERT INTO @data SELECT 10, 'AM006', 'A';
INSERT INTO @data SELECT 11, 'AM006', 'A';
INSERT INTO @data SELECT 12, 'AM011', 'B';
INSERT INTO @data SELECT 13, 'AM011', 'B-';
INSERT INTO @data SELECT 14, 'AM011', 'B';
WITH NumberToRemove AS (
    -- how many rows to drop per Cate: one for each minus-sign symbol
    SELECT
        Cate,
        COUNT(*) AS TakeOff
    FROM
        @data
    WHERE
        [Type] LIKE '_-'
    GROUP BY
        Cate),
Ordered AS (
    -- number the rows without a minus sign in Id order within each Cate
    SELECT
        Id,
        Cate,
        [Type],
        ROW_NUMBER() OVER (PARTITION BY Cate ORDER BY Id) AS RowId
    FROM
        @data
    WHERE
        [Type] NOT LIKE '_-')
SELECT
    d.*
FROM
    @data d
    LEFT JOIN NumberToRemove m ON m.Cate = d.Cate
    INNER JOIN Ordered o ON o.Id = d.Id
WHERE
    o.RowId > ISNULL(m.TakeOff, 0);
The query works by first counting the number of rows to remove from each section ("Cate") by tallying up the number of symbols with a minus sign per section. Next it sorts the rows where the symbols don't have a minus sign and assigns each row a number in Id order ("row number"), starting back at 1 for each new section ("Cate").
Finally I just pick the rows without a minus sign symbol, where the row number is greater than the number of rows that were to be removed. Note that if a section has no rows to remove then the join returns NULL for the number to remove, so I transform this to 0, because ALL rows in that section will have a row number greater than 0.
My results were:
Id Cate Type
3 AM001 A
4 AM003 B
6 AM003 B
9 AM005 B
10 AM006 A
11 AM006 A
14 AM011 B
If my assumptions were incorrect then this script could easily be amended to suit...
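For anyone who wants to try the idea without a SQL Server instance, here is a rough port of the query to SQLite, driven from Python's built-in sqlite3 module. It is only a sketch under the same assumptions listed above; the table name data is made up, COALESCE stands in for ISNULL, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (Id INTEGER, Cate TEXT, Type TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [
        (1, "AM001", "A"), (2, "AM001", "A-"), (3, "AM001", "A"),
        (4, "AM003", "B"), (6, "AM003", "B"),
        (7, "AM005", "B"), (8, "AM005", "B-"), (9, "AM005", "B"),
        (10, "AM006", "A"), (11, "AM006", "A"),
        (12, "AM011", "B"), (13, "AM011", "B-"), (14, "AM011", "B"),
    ],
)

query = """
WITH NumberToRemove AS (
    -- one row to drop per minus-sign symbol in each Cate
    SELECT Cate, COUNT(*) AS TakeOff
    FROM data
    WHERE Type LIKE '_-'
    GROUP BY Cate
),
Ordered AS (
    -- rows without a minus sign, numbered by Id within each Cate
    SELECT Id, Cate, Type,
           ROW_NUMBER() OVER (PARTITION BY Cate ORDER BY Id) AS RowId
    FROM data
    WHERE Type NOT LIKE '_-'
)
SELECT o.Id, o.Cate, o.Type
FROM Ordered o
LEFT JOIN NumberToRemove m ON m.Cate = o.Cate
WHERE o.RowId > COALESCE(m.TakeOff, 0)
ORDER BY o.Id
"""
kept = conn.execute(query).fetchall()
for row in kept:
    print(row)
```

The port selects straight from Ordered instead of joining back to the base table, since Ordered already carries all three columns.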

sql query logic

I have following data set
a b c
`1` 2 3
3 6 9
9 2 11
As you can see column a's first value is fixed (i.e. 1), but from second row it picks up the value of column c of previous record.
Column b's values are random and column c's value is calculated as c = a + b
I need to write a sql query which will select this data in above format. I tried writing using lag function but couldn't achieve.
Please help.
Edit:
Only column b exists in the table; a and c need to be calculated based on the values of b.
Hanumant
SQL> select a
     , b
     , c
     from dual
     model
     dimension by (0 i)
     measures (0 a, 0 b, 0 c)
     rules iterate (5)
     ( a[iteration_number] = nvl(c[iteration_number-1],1)
     , b[iteration_number] = ceil(dbms_random.value(0,10))
     , c[iteration_number] = a[iteration_number] + b[iteration_number]
     )
     order by i
     /
A B C
---------- ---------- ----------
1 4 5
5 8 13
13 8 21
21 2 23
23 10 33
5 rows selected.
Regards,
Rob.
Without knowing the relation between the rows, how can we add the previous row's a and b to get the current row's a? I have added two more columns, id and parent, to the table to express the relation between rows.
parent is the column that points to the previous row, and id is the primary key of the row.
create table test1 (a number ,b number ,c number ,id number ,parent number);
Insert into TEST1 (A, B, C, ID) Values (1, 2, 3, 1);
Insert into TEST1 (B, PARENT, ID) Values (6, 1, 2);
Insert into TEST1 (B, PARENT, ID) Values (4, 2, 3);
WITH recursive (a, b, c,rn) AS
(SELECT a,b,c,id rn
FROM test1
WHERE parent IS NULL
UNION ALL
SELECT (rec.a+ rec.b) a
,t1.b b
,(rec.a+ rec.b+t1.b) c
,t1.id rn
FROM recursive rec,test1 t1
WHERE t1.parent = rec.rn
)
SELECT a,b,c
FROM recursive;
The WITH keyword defines the name recursive for the subquery that is to follow
WITH recursive (a, b, c,rn) AS
Next comes the first part of the named subquery
SELECT a,b,c,id rn
FROM test1
WHERE parent IS NULL
The named subquery is a UNION ALL of two queries. This first query defines the starting point for the recursion. As in my CONNECT BY query, I want to know which record is the START WITH record.
Next up is the part that was most confusing:
SELECT (rec.a+ rec.b) a
,t1.b b
,(rec.a+ rec.b+t1.b) c
,t1.id rn
FROM recursive rec,test1 t1
WHERE t1.parent = rec.rn
This is how the WITH query works:
1. The parent query executes:
SELECT a,b,c
FROM recursive;
This triggers execution of the named subquery.
2. The first query in the subquery's union executes, giving us a seed row with which to begin the recursion:
SELECT a,b,c,id rn
FROM test1
WHERE parent IS NULL
The seed row in this case will be the one with id = 1, whose parent is null. Let's refer to the seed row from here on out as the "new results", new in the sense that we haven't finished processing them yet.
3. The second query in the subquery's union executes:
SELECT (rec.a+ rec.b) a
,t1.b b
,(rec.a+ rec.b+t1.b) c
,t1.id rn
FROM recursive rec,test1 t1
WHERE t1.parent = rec.rn
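The seed-then-recurse behaviour described in these steps can be observed with a small sketch that runs the same recursive query in SQLite through Python's built-in sqlite3 module. The data is the test1 sample from above; the CTE is renamed to rec here (an arbitrary name) because SQLite spells the construct WITH RECURSIVE, and the ORDER BY is added only to make the output order explicit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test1 (a INTEGER, b INTEGER, c INTEGER, id INTEGER, parent INTEGER)"
)
conn.execute("INSERT INTO test1 (a, b, c, id) VALUES (1, 2, 3, 1)")
conn.execute("INSERT INTO test1 (b, parent, id) VALUES (6, 1, 2)")
conn.execute("INSERT INTO test1 (b, parent, id) VALUES (4, 2, 3)")

rows = conn.execute("""
    WITH RECURSIVE rec (a, b, c, rn) AS (
        -- seed row: the record without a parent
        SELECT a, b, c, id
        FROM test1
        WHERE parent IS NULL
        UNION ALL
        -- each step derives a and c from the previous row, b from the table
        SELECT rec.a + rec.b, t1.b, rec.a + rec.b + t1.b, t1.id
        FROM rec
        JOIN test1 t1 ON t1.parent = rec.rn
    )
    SELECT a, b, c
    FROM rec
    ORDER BY rn
""").fetchall()

for row in rows:
    print(row)
# (1, 2, 3)
# (3, 6, 9)
# (9, 4, 13)
```

Each pass joins only the rows produced by the previous pass to test1, and the recursion stops once a pass produces no new rows.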

SQL query to get the value that appeared the most for each category

I have a table like this:
Category Reply
---------+---------------+
M 1
F 2
M 1
M 3
M 1
M 3
F 2
F 1
F 2
F 5
F 2
I'm looking for an SQL query to return the following results:
Category Total Number Best Reply Number
---------+---------------+------------------+---------------+
M 5 1 3
F 6 2 4
Total Number: the number of appearances of that category (I know how to get this)
Best Reply: the Reply that was chosen the most for that category
Number: the number of times the "Best Reply" was chosen
You don't specify your database, so I avoided using common table expressions which would make this clearer. It could still be cleaned up a bit. I did my work on SQL Server 2008.
select rsTotalRepliesByCategory.Category,
TotalRepliesByCategory,
rsCategoryReplyCount.Reply,
rsMaxReplies.MaxReplies
from
(
--calc total replies
select Category, COUNT(*) as TotalRepliesByCategory
from CategoryReply
group by Category
) rsTotalRepliesByCategory
INNER JOIN
(
--calc number of replies by category and reply
select Category, Reply, COUNT(*) as CategoryReplyCount
from CategoryReply
group by Category, Reply
) rsCategoryReplyCount on rsCategoryReplyCount.Category = rsTotalRepliesByCategory.Category
INNER JOIN
(
--calc the max replies
select Category, MAX(CategoryReplyCount) as MaxReplies
from
(
select Category, Reply, COUNT(*) as CategoryReplyCount
from CategoryReply
group by Category, Reply
) rsCategoryReplyCount2
group by Category
) rsMaxReplies on rsMaxReplies.Category = rsTotalRepliesByCategory.Category and rsMaxReplies.MaxReplies = rsCategoryReplyCount.CategoryReplyCount
Here is the setup I used to play around with this:
create table CategoryReply
(
Category char(1),
Reply int
)
insert into CategoryReply values ('M',1)
insert into CategoryReply values ('F',2)
insert into CategoryReply values ('M',1)
insert into CategoryReply values ('M',3)
insert into CategoryReply values ('M',1)
insert into CategoryReply values ('M',3)
insert into CategoryReply values ('F',2)
insert into CategoryReply values ('F',1)
insert into CategoryReply values ('F',2)
insert into CategoryReply values ('F',5)
insert into CategoryReply values ('F',2)
And finally, the output:
Category TotalRepliesByCategory Reply MaxReplies
F 6 2 4
M 5 1 3
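Since the answer mentions that common table expressions would make this clearer, here is one possible CTE version, run against the same sample data in SQLite through Python's built-in sqlite3 module. It is a sketch, not the query above: it swaps the self-joins for window functions (SQLite 3.25 or newer), and the names ReplyCounts, Ranked and rn are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CategoryReply (Category TEXT, Reply INTEGER)")
sample = [
    ("M", 1), ("F", 2), ("M", 1), ("M", 3), ("M", 1), ("M", 3),
    ("F", 2), ("F", 1), ("F", 2), ("F", 5), ("F", 2),
]
conn.executemany("INSERT INTO CategoryReply VALUES (?, ?)", sample)

query = """
WITH ReplyCounts AS (
    -- number of rows per category and reply
    SELECT Category, Reply, COUNT(*) AS ReplyCount
    FROM CategoryReply
    GROUP BY Category, Reply
),
Ranked AS (
    -- total per category, plus a rank that puts the most chosen reply first
    SELECT Category, Reply, ReplyCount,
           SUM(ReplyCount) OVER (PARTITION BY Category) AS TotalReplies,
           ROW_NUMBER() OVER (PARTITION BY Category
                              ORDER BY ReplyCount DESC, Reply) AS rn
    FROM ReplyCounts
)
SELECT Category, TotalReplies, Reply, ReplyCount
FROM Ranked
WHERE rn = 1
ORDER BY Category
"""
best = conn.execute(query).fetchall()
print(best)  # [('F', 6, 2, 4), ('M', 5, 1, 3)]
```

One difference in behaviour: ROW_NUMBER() returns a single row per category, so when two replies tie for first place only the lower Reply value is shown, whereas the join-based query returns every tied reply.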
SELECT Category, TotalNumber, Row_Number() over (order by TotalNumber)
FROM(
SELECT Category, Sum(Reply) as TotalNumber, Count(Reply) as Number
From Table
Group By Category) as temp
It would be something like that.