Display other column values only for distinct values in derived column - sql

These are my table structures -
Rule
Ruleset_rule_map
I am trying to join the two tables on rule.id = ruleset_rule_map.rule_id.
I want to fetch data like this -
As you can see, I want to remove duplicates from the derived column rule_bucket.
At the same time, I want to display rule_order only for the distinct values of rule_bucket.
I have written this query for it -
select DISTINCT rrm.RULE_ORDER, CONCAT(r.PARENT_RULE,
IIF(r.CHILD_RULE IS NOT NULL, CONCAT('|', r.CHILD_RULE), r.CHILD_RULE),
IIF(r.SUB_CHILD_RULE IS NOT NULL, CONCAT('|', r.SUB_CHILD_RULE), r.SUB_CHILD_RULE),
IIF(r.SUB_SUB_CHILD_RULE IS NOT NULL, CONCAT('|', r.SUB_SUB_CHILD_RULE), r.SUB_SUB_CHILD_RULE) )
AS RULE_BUCKET from [RULE] r
inner join RULESET_RULE_MAP rrm on rrm.RULE_ID = r.ID
and rrm.RULESET_ID = 'AAE97A62-F37E-4454-A008-FF40A102BB25'
and r.PARENT_RULE <> 'N/A' order by rrm.RULE_ORDER;
but with the above query I can only get a result like this.
Can someone please help me write the correct query? Samples that help me solve this are also welcome.

If you really want one row per rule bucket, just use aggregation:
SELECT MIN(RULE_ORDER), RULE_BUCKET
FROM (SELECT rrm.RULE_ORDER,
CONCAT(r.PARENT_RULE,
IIF(r.CHILD_RULE IS NOT NULL, CONCAT('|', r.CHILD_RULE), r.CHILD_RULE),
IIF(r.SUB_CHILD_RULE IS NOT NULL, CONCAT('|', r.SUB_CHILD_RULE), r.SUB_CHILD_RULE),
IIF(r.SUB_SUB_CHILD_RULE IS NOT NULL, CONCAT('|', r.SUB_SUB_CHILD_RULE), r.SUB_SUB_CHILD_RULE)
) AS RULE_BUCKET
FROM [RULE] r JOIN
RULESET_RULE_MAP rrm
ON rrm.RULE_ID = r.ID AND
rrm.RULESET_ID = 'AAE97A62-F37E-4454-A008-FF40A102BB25' AND
r.PARENT_RULE <> 'N/A'
) r
GROUP BY RULE_BUCKET
ORDER BY MIN(RULE_ORDER);
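The aggregation idea can be tried on toy data. The following is only a sketch: it runs the SQL through Python's built-in sqlite3 module, the tables `rules` and `ruleset_rule_map` and their rows are made up for illustration, and because SQLite is not SQL Server, the CONCAT/IIF expression is swapped for `||` with COALESCE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified tables: two rules share the bucket 'A|x'.
conn.executescript("""
    CREATE TABLE rules (id INTEGER PRIMARY KEY, parent_rule TEXT, child_rule TEXT);
    CREATE TABLE ruleset_rule_map (rule_id INTEGER, rule_order INTEGER);
    INSERT INTO rules VALUES (1, 'A', 'x'), (2, 'A', 'x'), (3, 'B', NULL);
    INSERT INTO ruleset_rule_map VALUES (1, 10), (2, 20), (3, 30);
""")

rows = conn.execute("""
    SELECT MIN(rule_order) AS rule_order, rule_bucket
    FROM (SELECT rrm.rule_order,
                 -- '|' || NULL is NULL, so COALESCE drops the separator too
                 r.parent_rule || COALESCE('|' || r.child_rule, '') AS rule_bucket
          FROM rules r
          JOIN ruleset_rule_map rrm ON rrm.rule_id = r.id) t
    GROUP BY rule_bucket
    ORDER BY MIN(rule_order)
""").fetchall()

print(rows)  # one row per bucket, with the smallest rule_order of that bucket
```

Each bucket appears once, and the outer query only refers to columns exposed by the derived table, not to the inner alias rrm.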

You can use the analytical function as follows:
select * from
(select t.*, row_number() over (partition by RULE_BUCKET order by RULE_ORDER desc) as rn
from
(select rrm.RULE_ORDER,
CONCAT(r.PARENT_RULE,
IIF(r.CHILD_RULE IS NOT NULL,
CONCAT('|', r.CHILD_RULE), r.CHILD_RULE),
IIF(r.SUB_CHILD_RULE IS NOT NULL,
CONCAT('|', r.SUB_CHILD_RULE), r.SUB_CHILD_RULE),
IIF(r.SUB_SUB_CHILD_RULE IS NOT NULL,
CONCAT('|', r.SUB_SUB_CHILD_RULE), r.SUB_SUB_CHILD_RULE) )
AS RULE_BUCKET
from [RULE] r
inner join RULESET_RULE_MAP rrm on rrm.RULE_ID = r.ID
and rrm.RULESET_ID = 'AAE97A62-F37E-4454-A008-FF40A102BB25'
and r.PARENT_RULE <> 'N/A') t) t
where rn = 1
order by RULE_ORDER;


Duplicate rows returned even though group by is used

This is my query
SELECT p.book FROM customers_books p
INNER JOIN books b ON p.book = b.id
INNER JOIN bookprices bp ON bp.book = p.book
WHERE b.status = 'PUBLISHED' AND bp.currency_code = 'GBP'
AND p.book NOT IN (SELECT cb.book FROM customers_books cb WHERE cb.customer = 1)
GROUP BY p.book, p.created_date ORDER BY p.created_date DESC
This is the data in my customers_books table,
I expect only book IDs 8, 6, 1 to be returned, but the query returns 8, 6, 1, 1.
The table structures are:
CREATE TABLE "public"."customers_books" (
"id" int8 NOT NULL,
"created_date" timestamp(6),
"book" int8,
"customer" int8
);
CREATE TABLE "public"."books" (
"id" int8 NOT NULL,
"created_date" timestamp(6),
"status" varchar(255) COLLATE "pg_catalog"."default"
);
CREATE TABLE "public"."bookprices" (
"id" int8 NOT NULL,
"currency_code" varchar(255) COLLATE "pg_catalog"."default",
"book" int8
);
What do you think I am doing wrong here?
I really don't want to use p.created_date in the GROUP BY, but I was forced to because of the ORDER BY.
You have too many joins in the outer query:
SELECT b.id
FROM books b INNER JOIN
bookprices bp
ON bp.book = b.id
WHERE b.status = 'PUBLISHED' AND bp.currency_code = 'GBP' AND
NOT EXISTS (SELECT 1
FROM customers_books cb
WHERE cb.book = b.id AND cb.customer = 1
) ;
Note that I replaced the NOT IN with NOT EXISTS. I strongly, strongly discourage you from using NOT IN with a subquery. If the subquery returns any NULL values, then NOT IN returns no rows at all. It is better to sidestep this issue just by using NOT EXISTS.
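To see the NULL behaviour concretely, here is a minimal sketch using Python's built-in sqlite3 module. The tables are cut down to the columns that matter and the rows are invented; in particular, the row with a NULL book is an assumption added to trigger the problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (id INTEGER PRIMARY KEY);
    CREATE TABLE customers_books (book INTEGER, customer INTEGER);
    INSERT INTO books VALUES (1), (2), (3);
    -- customer 1 owns book 1 and also has a row whose book is NULL
    INSERT INTO customers_books VALUES (1, 1), (NULL, 1);
""")

# NOT IN: the NULL in the subquery makes the test unknown for books 2 and 3.
not_in = conn.execute("""
    SELECT b.id FROM books b
    WHERE b.id NOT IN (SELECT cb.book FROM customers_books cb WHERE cb.customer = 1)
    ORDER BY b.id
""").fetchall()

# NOT EXISTS: the NULL row simply never matches, so books 2 and 3 are returned.
not_exists = conn.execute("""
    SELECT b.id FROM books b
    WHERE NOT EXISTS (SELECT 1 FROM customers_books cb
                      WHERE cb.book = b.id AND cb.customer = 1)
    ORDER BY b.id
""").fetchall()

print(not_in)      # []
print(not_exists)  # [(2,), (3,)]
```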

SQL query : SELECT

CREATE TABLE WRITTEN_BY
( Re_Id CHAR(15) NOT NULL,
Pub_Number INT NOT NULL,
PRIMARY KEY(Re_Id, Pub_Number),
FOREIGN KEY(Re_Id) REFERENCES RESEARCHER(Re_Id),
FOREIGN KEY(Pub_Number) REFERENCES PUBLICATION(Pub_Number));
CREATE TABLE WORKING_ON
( Re_Id CHAR(15) NOT NULL,
Pro_Code CHAR(15) NOT NULL,
PRIMARY KEY(Re_Id, Pro_Code, Subpro_Code),
FOREIGN KEY(Re_Id) REFERENCES RESEARCHER(Re_Id));
Re_Id stands for ID of a researcher
Pub_Number stands for ID of a publication
Pro_Code stands for ID of a project
Written_by table stores a publication's ID and its author
Working_on table stores a project's ID and who is working on it
Now, I have this query to write:
For each project, find the researcher who wrote the most publications.
This is what I've done so far:
SELECT Pro_Code, WORK.Re_Id
FROM WORKING_ON AS WORK , WRITTEN_BY AS WRITE
WHERE WORK.Re_Id = WRITE.Re_Id
So I got a table containing the researcher ID and project ID of every researcher who has at least 1 publication. But what's next? How do I solve this problem?
You haven't said which platform you're on but try this. It handles the case where there are ties as well.
select g.Pro_Code, g.Re_Id, g.numpublished
from
(
SELECT work.Pro_Code, WORK.Re_Id, count(WRITE.pub_number) as numpublished
FROM WORKING_ON WORK JOIN WRITTEN_BY AS WRITE ON WORK.Re_Id = WRITE.Re_Id
GROUP BY work.Pro_Code, WORK.Re_Id
) g
inner join
(
select Pro_code, max(numpublished) as maxpublished
from (
SELECT work.Pro_Code, WORK.Re_Id, count(WRITE.pub_number) numpublished
FROM WORKING_ON WORK JOIN WRITTEN_BY AS WRITE ON WORK.Re_Id = WRITE.Re_Id
GROUP BY work.Pro_Code, WORK.Re_Id
) g2
group by Pro_code
) m
on m.Pro_code = g.Pro_Code and m.maxpublished = g.numpublished
Some platforms will allow you to write it this way:
with g as (
SELECT work.Pro_Code, WORK.Re_Id, count(WRITE.pub_number) as numpublished
FROM WORKING_ON WORK JOIN WRITTEN_BY AS WRITE ON WORK.Re_Id = WRITE.Re_Id
GROUP BY work.Pro_Code, WORK.Re_Id
)
select g.Pro_Code, g.Re_Id, g.numpublished
from g
inner join
(
select Pro_code, max(numpublished) as maxpublished
from g
group by Pro_code
) m
on m.Pro_code = g.Pro_Code and m.maxpublished = g.numpublished
I think that you are looking for something like the following :
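select
tm.pro_code as pro_code,
-- re_id is not in the GROUP BY: most databases reject this, and engines that
-- accept it (e.g. MySQL with ONLY_FULL_GROUP_BY disabled) may return an re_id
-- that does not belong to max_pub
tm.re_id as re_id,
max(total) as max_pub
from (
select *
from (
select
wo.pro_code as pro_code,
wr.re_id as re_id,
count(wr.pub_number) as total
from
written_by wr,
working_on wo
where
wr.re_id = wo.re_id
group by wr.re_id,wo.pro_code
) t
) tm
group by pro_code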
If you are using MS SQL, this should work:
With cte as (
select b.Pro_Code, a.Re_Id, COUNT(distinct a.Pub_Number) as pubs
from WRITTEN_BY a
inner join WORKING_ON b
on a.Re_Id = b.Re_Id
group by b.Pro_Code, a.Re_Id)
SELECT Pro_Code, Re_Id, pubs from cte c
WHERE pubs = (SELECT MAX(pubs) from cte where Pro_Code = c.Pro_Code)

query optimisation with join on ordered data

My question concerns the optimization of an SQL query.
My query retrieves a list of members and their last training.
To get the latest training, I join on the result of a subquery that returns the complete list of trainings for all members.
This query works but it is very slow. I'm really interested to know if someone has a way to make it execute faster.
My query (about 16s):
SELECT
m.nom,
m.prenom,
m.ville,
m.maj,
mbf.libelle,
mbf.datefin,
m.id as idmb
FROM
membres m
LEFT JOIN (
select *
from membreform
where idformation = 1
order by datefin DESC
) as mbf ON mbf.idmembre = m.id
WHERE
role > 0 AND visible = 1
group by m.id
ORDER BY m.maj DESC
limit 0 , 20
My data structure :
membreform (1000 entries)
id int(11) NOT NULL AUTO_INCREMENT,
idmembre int(11) NOT NULL,
libelle varchar(128) NOT NULL,
idformation int(11) NOT NULL,
datedebut date NOT NULL,
datefin date NOT NULL DEFAULT '0000-00-00',
descript text NOT NULL,
KEY id (id),
KEY idmembre (idmembre),
KEY idformation (idformation)
membres (500 entries)
id int(3) NOT NULL AUTO_INCREMENT,
nom varchar(255) NOT NULL,
prenom varchar(255) NOT NULL,
ville varchar(255) NOT NULL,
email varchar(255) NOT NULL,
maj datetime NOT NULL,
role tinyint(4) NOT NULL DEFAULT '1',
PRIMARY KEY (id),
KEY role (role),
KEY maj (maj)
I tested this other way (about 0.40s), but I don't find it really clean:
SELECT
m.nom,
m.prenom,
m.ville,
m.maj,
m.id as idmb,
(select
libelle
from
membreform
where
idformation = 1
AND m.id = membreform.idmembre
order by datefin DESC
limit 1
) libelle,
(select
datefin
from
membreform
where
idformation = 1
AND m.id = membreform.idmembre
order by datefin DESC
limit 1
) datefin
FROM
membres m
WHERE
role > 0 AND visible = 1
group by m.id
ORDER BY m.maj DESC
limit 0 , 20
I'm open to any suggestions because I am a bit stuck.
Thank you.
You can use a temporary table to increase performance. E.g. (MS SQL):
Step 1 - Load the membreform data into a temp table, adding all relevant WHERE conditions
select *
INTO #TempMembreform
from membreform
where idformation = 1 ...
order by datefin DESC
Step 2 - Do the left outer join as in your first example above.
E.g.
SELECT
m.nom,
m.prenom,
m.ville,
m.maj,
mbf.libelle,
mbf.datefin,
m.id as idmb
FROM
membres m
LEFT JOIN #TempMembreForm as mbf
ON mbf.idmembre = m.id
WHERE
role > 0 AND visible = 1
group by m.id
ORDER BY m.maj DESC
limit 0 , 20
Step 3 - Add the NOLOCK hint if the data is not mission-critical (unlike, e.g., bank transactions)
select *
INTO #TempMembreform
from membreform WITH(NOLOCK)
where idformation = 1
order by datefin DESC
SELECT
m.nom,
m.prenom,
m.ville,
m.maj,
mbf.libelle,
mbf.datefin,
m.id as idmb
FROM
membres m with(nolock)
LEFT JOIN #TempMembreForm as mbf
ON mbf.idmembre = m.id
WHERE
role > 0 AND visible = 1
group by m.id
ORDER BY m.maj DESC
limit 0 , 20
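A different technique from the temporary table, sketched here only as an option: reduce membreform to one row per member (the latest datefin for idformation = 1) before joining, instead of joining every training and relying on GROUP BY to pick one. The sketch runs through Python's built-in sqlite3 module; the tables are trimmed to a few columns, the rows are invented, and the alias `latest` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE membres (id INTEGER PRIMARY KEY, nom TEXT, maj TEXT);
    CREATE TABLE membreform (id INTEGER PRIMARY KEY, idmembre INTEGER,
                             libelle TEXT, idformation INTEGER, datefin TEXT);
    INSERT INTO membres VALUES (1, 'Durand', '2013-03-01'), (2, 'Martin', '2013-02-01');
    INSERT INTO membreform VALUES
        (1, 1, 'old course', 1, '2011-06-30'),
        (2, 1, 'new course', 1, '2012-06-30'),
        (3, 1, 'other type', 2, '2013-01-01');
""")

rows = conn.execute("""
    SELECT m.nom, mbf.libelle, mbf.datefin
    FROM membres m
    -- latest end date per member, for formation 1 only
    LEFT JOIN (SELECT idmembre, MAX(datefin) AS datefin
               FROM membreform
               WHERE idformation = 1
               GROUP BY idmembre) latest
           ON latest.idmembre = m.id
    -- fetch the full training row that carries that date
    LEFT JOIN membreform mbf
           ON mbf.idmembre = latest.idmembre
          AND mbf.datefin = latest.datefin
          AND mbf.idformation = 1
    ORDER BY m.maj DESC
    LIMIT 20
""").fetchall()

print(rows)  # members without a matching training keep NULLs
```

If a member has two trainings with the same datefin, this returns both rows, so a tie-breaker (for example the highest membreform.id) would be needed. Whether it is faster than the original query depends on the engine and the indexes; a composite index on (idformation, idmembre, datefin) is one thing to try.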

ordering by sql on fields not in projection

Is it possible to order the results of an SQL query, on a field that is not in the projection itself?
See example below - I am taking the distinct ID of a product table, but I want it ordered by title. I don't want to include the title because I am using NHibernate to generate a query, and page the results. I am then using this distinct ID resultset, to load the actual results.
SELECT
DISTINCT this_.`ID` AS y0
FROM
`Product` this_
LEFT OUTER JOIN
`Brand` brand3_
ON this_.BrandId=brand3_.ID
INNER JOIN
`Product_CultureInfo` productcul2_
ON this_.ID=productcul2_.ProductID
AND (
(
(
productcul2_.`Deleted` = 0
OR productcul2_.`Deleted` IS NULL
)
AND (
productcul2_.`_Temporary_Flag` = 0
OR productcul2_.`_Temporary_Flag` IS NULL
)
)
)
INNER JOIN
`ProductCategory` aliasprodu1_
ON this_.ID=aliasprodu1_.ProductID
AND (
(
(
aliasprodu1_.`Deleted` = 0
OR aliasprodu1_.`Deleted` IS NULL
)
AND (
aliasprodu1_.`_Temporary_Flag` = 0
OR aliasprodu1_.`_Temporary_Flag` IS NULL
)
)
)
WHERE
(
this_._Temporary_Flag =FALSE
OR this_._Temporary_Flag IS NULL
)
AND this_.Published = TRUE
AND (
this_.Deleted = FALSE
OR this_.Deleted IS NULL
)
AND (
this_._ComputedDeletedValue = FALSE
OR this_._ComputedDeletedValue IS NULL
)
AND (
(
this_._TestItemSessionGuid IS NULL
OR this_._TestItemSessionGuid = ''
)
)
AND (
productcul2_._ActualTitle LIKE '%silver%'
OR brand3_.Title LIKE '%silver%'
OR aliasprodu1_.CategoryId IN (
47906817 , 47906818 , 47906819 , 47906816 , 7012353 , 44662785
)
)
AND this_.Published = TRUE
AND this_.Published = TRUE
ORDER BY
this_.Priority ASC,
productcul2_._ActualTitle ASC,
this_.Priority ASC LIMIT 25;
I don't know if there's a better solution, but how about a nested select where the outer query excludes the field that you're not interested in?
So, something like this on a "random" table:
SELECT a,b,c from (SELECT a,b,c,d from myTable order by d) t
Obviously a solution built into the language would be better, because this way you have to do two projections and one of them is useless.
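Another option, offered only as a sketch and not taken from the answer above: replace DISTINCT with GROUP BY on the ID and order by an aggregate of the column that is not selected. It is shown here with Python's built-in sqlite3 module on invented, heavily simplified versions of the Product and Product_CultureInfo tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, priority INTEGER);
    CREATE TABLE product_cultureinfo (productid INTEGER, title TEXT);
    INSERT INTO product VALUES (1, 0), (2, 0), (3, 0);
    -- product 1 has two culture rows, so a plain join would duplicate its id
    INSERT INTO product_cultureinfo VALUES
        (1, 'silver ring'), (1, 'silver ring (fr)'),
        (2, 'silver bracelet'), (3, 'silver chain');
""")

rows = conn.execute("""
    SELECT p.id
    FROM product p
    JOIN product_cultureinfo pc ON pc.productid = p.id
    GROUP BY p.id                            -- one row per product, like DISTINCT
    ORDER BY MIN(p.priority), MIN(pc.title)  -- sort keys are not in the projection
    LIMIT 25
""").fetchall()

ids = [r[0] for r in rows]
print(ids)
```

Because each sort key is an aggregate, there is exactly one value per ID, which sidesteps the ambiguity of ordering DISTINCT rows by a column that can differ between the duplicates.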

get the last record of table in select query

This is a follow-up to another problem I had: getting-the-last-record-inserted-into-a-select-query
I am trying to adapt a query that Andrea was kind enough to help me with yesterday. It works fine for one page, but I am trying to create a similar query without much luck.
What I need is, for every board, to display the board name, the count of topics and messages linked to that board, and the user, topic and date of the last message (which does work)
What I need is to get the board name and the topic and message counts
This is my table structure
CREATE TABLE `boards` (
`boardid` int(2) NOT NULL auto_increment,
`boardname` varchar(255) NOT NULL default '',
PRIMARY KEY (`boardid`)
);
CREATE TABLE `messages` (
`messageid` int(6) NOT NULL auto_increment,
`topicid` int(4) NOT NULL default '0',
`message` text NOT NULL,
`author` varchar(255) NOT NULL default '',
`date` datetime(14) NOT NULL,
PRIMARY KEY (`messageid`)
);
CREATE TABLE `topics` (
`topicid` int(4) NOT NULL auto_increment,
`boardid` int(2) NOT NULL default '0',
`topicname` varchar(255) NOT NULL default '',
`author` varchar(255) NOT NULL default '',
PRIMARY KEY (`topicid`)
);
Below is the query I have come up with, based on the one Andrea wrote for me. It outputs the board name, the number of topics and messages (which says 1 even though there are 5), the topic author and message count (which aren't needed), and the author and date of the last post (which are needed), but not the topic name, which is needed.
SELECT b.boardname, count( DISTINCT t.topicname ) AS topics, count( lm.message ) AS message, t.author as tauthor,
(select count(message) from messages m where m.topicid = t.topicid) AS messagecount,
lm.author as lauthor, lm.date
FROM topics t
INNER JOIN messages lm
ON lm.topicid = t.topicid AND lm.date = (SELECT max(m2.date) from messages m2)
INNER JOIN boards b
ON b.boardid = t.boardid
GROUP BY t.topicname
This is my original query, which does what I want but gets the first post, not the last:
SELECT b.boardid, b.boardname, count( DISTINCT t.topicname ) AS topics, count( m.message ) AS message, m.author AS author, m.date AS date, t.topicname AS topic
FROM boards b
INNER JOIN topics t ON t.boardid = b.boardid
INNER JOIN messages m ON t.topicid = m.topicid
INNER JOIN (
SELECT topicid, MAX( date ) AS maxdate
FROM messages
GROUP BY topicid
) test ON test.topicid = t.topicid
GROUP BY boardname
ORDER BY boardname
Any help with this is much appreciated.
You need to define "LAST", in terms of an ORDER BY clause. Once you do that, you can just reverse the direction of your order and add LIMIT 1 to the query.
SELECT b.*, m.*, t.*,
(
SELECT COUNT(*)
FROM topics ti
WHERE ti.boardid = b.boardid
) AS topiccount,
(
SELECT COUNT(*)
FROM topics ti, messages mi
WHERE ti.boardid = b.boardid
AND mi.topicid = ti.topicid
) AS messagecount
FROM boards b
LEFT JOIN
messages m
ON m.messageid = (
SELECT mii.messageid
FROM topics tii, messages mii
WHERE tii.boardid = b.boardid
AND mii.topicid = tii.topicid
ORDER BY
mii.date DESC
LIMIT 1
)
LEFT JOIN
topics t
ON t.topicid = m.topicid
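The query above can be exercised on a small data set. This is a sketch using Python's built-in sqlite3 module, with the tables trimmed to the columns involved, invented rows, explicit JOIN syntax in the subqueries, and a narrower select list than b.*, m.*, t.* so the output stays readable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE boards (boardid INTEGER PRIMARY KEY, boardname TEXT);
    CREATE TABLE topics (topicid INTEGER PRIMARY KEY, boardid INTEGER, topicname TEXT);
    CREATE TABLE messages (messageid INTEGER PRIMARY KEY, topicid INTEGER,
                           author TEXT, date TEXT);
    INSERT INTO boards VALUES (1, 'General'), (2, 'Empty');
    INSERT INTO topics VALUES (1, 1, 'Hello'), (2, 1, 'News');
    INSERT INTO messages VALUES
        (1, 1, 'ann', '2009-01-01 10:00:00'),
        (2, 2, 'bob', '2009-01-03 10:00:00'),
        (3, 1, 'cat', '2009-01-02 10:00:00');
""")

rows = conn.execute("""
    SELECT b.boardname,
           (SELECT COUNT(*) FROM topics ti
             WHERE ti.boardid = b.boardid) AS topiccount,
           (SELECT COUNT(*) FROM topics ti
              JOIN messages mi ON mi.topicid = ti.topicid
             WHERE ti.boardid = b.boardid) AS messagecount,
           t.topicname, m.author, m.date
    FROM boards b
    -- "last" is defined by ORDER BY date DESC LIMIT 1 within the board
    LEFT JOIN messages m
           ON m.messageid = (SELECT mii.messageid
                             FROM topics tii
                             JOIN messages mii ON mii.topicid = tii.topicid
                             WHERE tii.boardid = b.boardid
                             ORDER BY mii.date DESC
                             LIMIT 1)
    LEFT JOIN topics t ON t.topicid = m.topicid
    ORDER BY b.boardid
""").fetchall()

for row in rows:
    print(row)
```

The LEFT JOINs keep boards that have no messages at all; they come back with zero counts and NULLs for the last-post columns.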