sqlite3 JOIN, GROUP_CONCAT using distinct with custom separator - sql

I have a table of "events" where each event may be associated with zero or more "speakers" and zero or more "terms", with those records linked to the events through join tables. I need to produce a table of all events in which each row has a "speaker_names" column and a "term_names" column listing the speakers and terms associated with that event.
However, when I run my query, I have duplication in the speaker_names and term_names values, since the join tables produce a row per association for each of the speakers and terms of the events:
1|Soccer|Bobby|Ball
2|Baseball|Bobby - Bobby - Bobby|Ball - Bat - Helmets
3|Football|Bobby - Jane - Bobby - Jane|Ball - Ball - Helmets - Helmets
The group_concat aggregate function accepts DISTINCT, which removes the duplication, but sadly SQLite does not support DISTINCT together with a custom separator, which I really need. With DISTINCT and the default comma separator I am left with these results:
1|Soccer|Bobby|Ball
2|Baseball|Bobby|Ball,Bat,Helmets
3|Football|Bobby,Jane|Ball,Helmets
My question is this: Is there a way I can form the query or change the data structures in order to get my desired results?
Keep in mind this is a sqlite3 query I need, and I cannot add custom C aggregate functions, as this is for an Android deployment.
I have created a gist which makes it easy for you to test a possible solution: https://gist.github.com/4072840

Look up the speaker/term names independently from each other:
SELECT _id,
name,
(SELECT GROUP_CONCAT(name, ';')
FROM events_speakers
JOIN speakers
ON events_speakers.speaker_id = speakers._id
WHERE events_speakers.event_id = events._id
) AS speaker_names,
(SELECT GROUP_CONCAT(name, ';')
FROM events_terms
JOIN terms
ON events_terms.term_id = terms._id
WHERE events_terms.event_id = events._id
) AS term_names
FROM events
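A minimal sketch of the correlated-subquery approach above, run through Python's built-in sqlite3 module against an in-memory database. The table and column names follow the question; the sample rows and the ' - ' separator are assumptions chosen for illustration (the gist's data is not reproduced here). Because each list is built by its own subquery, the speaker rows never multiply the term rows, so no DISTINCT is needed.

```python
import sqlite3

# In-memory database; the schema follows the question, the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events   (_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE speakers (_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE terms    (_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE events_speakers (event_id INTEGER, speaker_id INTEGER);
CREATE TABLE events_terms    (event_id INTEGER, term_id INTEGER);

INSERT INTO events   VALUES (1, 'Soccer'), (2, 'Baseball'), (3, 'Football');
INSERT INTO speakers VALUES (1, 'Bobby'), (2, 'Jane');
INSERT INTO terms    VALUES (1, 'Ball'), (2, 'Bat'), (3, 'Helmets');
INSERT INTO events_speakers VALUES (1, 1), (2, 1), (3, 1), (3, 2);
INSERT INTO events_terms    VALUES (1, 1), (2, 1), (2, 2), (2, 3), (3, 1), (3, 3);
""")

# One correlated subquery per list, each with its own custom separator.
QUERY = """
SELECT _id, name,
  (SELECT GROUP_CONCAT(speakers.name, ' - ')
     FROM events_speakers
     JOIN speakers ON events_speakers.speaker_id = speakers._id
    WHERE events_speakers.event_id = events._id) AS speaker_names,
  (SELECT GROUP_CONCAT(terms.name, ' - ')
     FROM events_terms
     JOIN terms ON events_terms.term_id = terms._id
    WHERE events_terms.event_id = events._id) AS term_names
FROM events
ORDER BY _id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print("|".join(str(col) for col in row))
```

The order of names inside each concatenated list is not guaranteed by SQLite, so code that consumes the result should not rely on it.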

I ran across this problem as well, but came up with a method that I found a bit easier to comprehend. Since SQLite reports SQLite3::SQLException: DISTINCT aggregates must have exactly one argument, the problem seems to be related not so much to GROUP_CONCAT itself as to using DISTINCT within GROUP_CONCAT...
When you wrap the DISTINCT 'subquery' in a REPLACE call that actually does nothing, you get the relative simplicity of nawfal's suggestion without the drawback of only being able to concatenate comma-less strings properly.
SELECT events._id, events.name,
(group_concat(replace(distinct speakers.name),'',''), ' - ') AS speaker_names,
(group_concat(replace(distinct terms.name),'',''), ' - ') AS term_names
FROM events
LEFT JOIN
(SELECT et.event_id, ts.name
FROM terms ts
JOIN events_terms et ON ts._id = et.term_id
) terms ON events._id = terms.event_id
LEFT JOIN
(SELECT sp._id, es.event_id, sp.name
FROM speakers sp
JOIN events_speakers es ON sp._id = es.speaker_id
) speakers ON events._id = speakers.event_id
GROUP BY events._id;
But actually I would consider this a SQLite bug / inconsistency, or am I missing something?

It's strange that SQLite doesn't support that!
At the risk of being downvoted, I offer this only in case it helps:
You can use REPLACE(X, Y, Z), but you have to be sure that your columns will never contain a ',' as part of a valid value.
SELECT events._id, events.name,
REPLACE(group_concat(distinct speakers.name), ',', ' - ') AS speaker_names,
REPLACE(group_concat(distinct terms.name), ',', ' - ') AS term_names
FROM events
LEFT JOIN
(SELECT et.event_id, ts.name
FROM terms ts
JOIN events_terms et ON ts._id = et.term_id
) terms ON events._id = terms.event_id
LEFT JOIN
(SELECT sp._id, es.event_id, sp.name
FROM speakers sp
JOIN events_speakers es ON sp._id = es.speaker_id
) speakers ON events._id = speakers.event_id
GROUP BY events._id;

The problem arises only with the group_concat(X,Y) expression, not with the group_concat(X) expression.
group_concat(distinct X) works well.
So, if the ',' is good for you, there is no problem, but if you want a ';' instead of ',' (and you are sure no ',' is in your original text) you can do:
replace(group_concat(distinct X), ',', ';')
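A small sketch of that replace() workaround, using Python's built-in sqlite3 module. The table t, its columns and the sample values are made up for illustration. The second query shows the caveat mentioned above: a comma inside a value is rewritten too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (grp INTEGER, x TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "Ball"), (1, "Ball"), (1, "Helmets"), (1, "Helmets")],
)

# DISTINCT with the default ',' separator, then swap the separator afterwards.
(joined,) = conn.execute(
    "SELECT replace(group_concat(DISTINCT x), ',', ';') FROM t WHERE grp = 1"
).fetchone()
print(joined)

# Caveat: a comma that is part of a value gets replaced as well.
conn.execute("INSERT INTO t VALUES (2, 'Bats, wooden')")
(mangled,) = conn.execute(
    "SELECT replace(group_concat(DISTINCT x), ',', ';') FROM t WHERE grp = 2"
).fetchone()
print(mangled)
```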

Just to put a proper workaround (murb's answer is strangely parenthesized).
problem:
group_concat(distinct column_name, 'custom_separator') takes custom_separator as a part of distinct.
solution:
We need some no-op to let SQLite know that distinct has finished (to wrap distinct and its arguments).
The no-op can be replace() with an empty string as the second parameter (see the documentation for replace).
group_concat(replace(distinct column_name, '', ''), 'custom_separator')
edit:
I just found that it does not work :-( The expression can be called, but distinct no longer has any effect.

There is a special case that does not work in SQLite: group_concat(DISTINCT X, Y)
Whereas in MySQL you can use group_concat(DISTINCT X SEPARATOR Y), in SQLite you can't.
This example: SELECT group_concat(DISTINCT column1, '|') FROM example_table GROUP BY column2;
gives the result: DISTINCT aggregates must have exactly one argument At line 1:
The solution:
select rtrim(replace(group_concat(DISTINCT column1||'#!'), '#!,', '|'),'#!') from example_table
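Here is a hedged sketch of that marker trick using Python's built-in sqlite3 module. The sample rows are made up, and the GROUP BY column2 from the failing example is added back. It assumes that no value contains the marker '#!'. Note also that rtrim(X, '#!') strips every trailing '#' or '!' character, so a value that itself ends in one of those characters would lose it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example_table (column1 TEXT, column2 INTEGER)")
conn.executemany(
    "INSERT INTO example_table VALUES (?, ?)",
    [("a,b", 1), ("a,b", 1), ("c", 1), ("d", 2)],  # made-up rows
)

# Append a marker to every value, let DISTINCT use the default ',' separator,
# turn each "marker + comma" into the real separator, then trim the last marker.
rows = conn.execute("""
    SELECT column2,
           rtrim(replace(group_concat(DISTINCT column1 || '#!'), '#!,', '|'), '#!')
      FROM example_table
     GROUP BY column2
     ORDER BY column2
""").fetchall()

for column2, joined in rows:
    print(column2, joined)
```

Unlike the plain replace(..., ',', ...) variant, a comma inside a value ('a,b' here) survives, because only a comma that directly follows the marker is treated as a separator.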

Related

SQL replace function error while using it to replace one string

I have event table in which there are two fields named as sport, event_name .
This was values such as:
{sport:"Athletic"; event_name:"Athletic 100 meter"}
What I want is to use replace function to replace the string in event_name that matches string in sport with nothing.
so the final output will be such :
{sport:"Athletic"; event_name:"100 meter"}
I have also joined it with another table so that only the IDs that are present in that other table are affected.
So I used REPLACE in the following code, but it shows an error: "Expected item: < result-column > ". Thank you
SELECT
ae.id ,
ae.city AS event_city,
ae.sport,
REPLACE(ae.event,ae.sport,' ') AS event_name ,
FROM
athlete_events ae
inner join
players_personalinfo pp on
pp.id=ae.id
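The error comes from the trailing comma after event_name: the parser expects another result column before FROM. Either remove the comma or add a column after it: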
SELECT ae.id AS event_id, ae.city AS event_city, ae.sport,
REPLACE(ae.event, ae.sport, ' ') AS event_name ,
ae.event
FROM athlete_events ae JOIN
players_personalinfo pp
ON pp.id = ae.id;
I would also advise you to trim the result:
TRIM(REPLACE(ae.event, ae.sport, ' ')) AS event_name,
This will remove leading and trailing spaces.
The REPLACE function is case sensitive. Try to check the data to make sure that the capitalization of each is the same.
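To illustrate the two points above (trimming the result, and case sensitivity), here is a small sketch using SQLite through Python's built-in sqlite3 module. The question does not say which database engine is in use, so treat SQLite as a stand-in; the helper name strip_sport is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def strip_sport(event, sport):
    # REPLACE swaps the sport name for a space; TRIM drops the leftover edge spaces.
    (result,) = conn.execute(
        "SELECT TRIM(REPLACE(?, ?, ' '))", (event, sport)
    ).fetchone()
    return result

matched = strip_sport("Athletic 100 meter", "Athletic")
unmatched = strip_sport("Athletic 100 meter", "athletic")  # different case

print(repr(matched))
print(repr(unmatched))
```

With matching case the sport prefix is removed; with different capitalization SQLite's REPLACE finds nothing to replace and the string comes back unchanged.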
The prior answers work, but you need to modify one of the field names in your query. In your description, you mentioned the field name is "event_name", but in your query, you reference just "event" (ae.event).
Also, I'm a little surprised that an event_id would join to a player's profile id. Seems a bit odd.
At any rate, I confirmed this SQL works in both postgres and oracle databases...
SELECT
ae.id AS event_id,
ae.city AS event_city,
ae.sport,
ae.event_name as event_name_original,
REPLACE(ae.event_name,ae.sport,' ') AS event_name_kinda_ugly,
TRIM(REPLACE(ae.event_name,ae.sport,' ')) AS event_name_clean
FROM
athlete_events ae
inner join
players_personalinfo pp on pp.id=ae.id

Passing value from one parameter to another

I am creating a report in SSRS. Queries are working fine. I am getting the results if I hard coded the input values.
Now I have added three parameters:
YearMonths
SUGName
collection
YearMonths - Data is coming from the SQL query directly. No issues in that.
SUGName -
select cia.AssignmentID,CIA.Collectionid, concat(grp.Title,' -- ', CIA.CollectionName) as deploymentName from
v_CIAssignment cia
inner join v_CIAssignmentToGroup atg on cia.AssignmentType=5 and atg.AssignmentID=cia.AssignmentID
inner join v_AuthListInfo grp on cia.AssignmentType=5 and grp.CI_ID=atg.AssignedUpdateGroup
where concat(datepart(yyyy, grp.DateCreated), '-', RIGHT('0' + RTRIM(MONTH(grp.DateCreated)), 2)) = @YearMonths
Order By grp.Title desc
This is also working.
collection -
select cia.AssignmentID,CIA.Collectionid, concat(grp.Title,' -- ', CIA.CollectionName) as deploymentName from
v_CIAssignment cia
inner join v_CIAssignmentToGroup atg on cia.AssignmentType=5 and atg.AssignmentID=cia.AssignmentID
inner join v_AuthListInfo grp on cia.AssignmentType=5 and grp.CI_ID=atg.AssignedUpdateGroup
where cia.AssignmentID = @SUGName
Order By grp.Title desc
It is not working and is giving an error. The query is working fine. I checked that by putting in SUGName manually.
Below is the error I am getting.
System.Web.Services.Protocols.SoapException:
The Value expression for the query parameter ‘@SUGName’ refers to a non-existing report parameter ‘SUGname’. Letters in the names of parameters must use the correct case.
Parameter references in SSRS are case-sensitive. Wherever the dataset maps a query parameter to a report parameter, make sure the name is written exactly as the report parameter is defined (SUGName, not SUGname).

Access syntax error when using Join

I have these two tables (Members and Now), and I just need to make sure that no one in Members is actually in Now. The tables have different structures but can be joined on first name, last name and postal code.
So I tried this (in access)
SELECT Members.Prenom, Members.Nom, Members.Adresse, Members.[Adresse 2], Members.ville, Members.Province, Members.CodePostal
FROM Members
Left JOIN now ON (members.prenom = now.firstname AND members.nom = now.lastname
AND members.codepostal = now.postcode) WHERE now.id IS NULL
And it gives me a wonderful error message
invalid use of '.' ' ' or '()'. in query expression
May someone enlighten me on what I did wrong?
I'm pretty sure you cannot use 'now' as a table name: it is one of the reserved words in MS Access (here because of the Now() function, and I guess the error message is telling you that the parentheses are missing). You could try enclosing it in square brackets, but I would strongly recommend renaming the table. A convention I find useful is to prefix object names, such as tblTableName, qryQueryName, rptReportName and frmFormName, but use whatever works for you.
SELECT Members.Prenom, Members.Nom, Members.Adresse, Members.[Adresse 2],
Members.ville, Members.Province, Members.CodePostal
FROM Members
Left JOIN [now] a ON (members.prenom = a.firstname AND members.nom = a.lastname
AND members.codepostal = a.postcode) WHERE a.id IS NULL
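For reference, the same "bracket the name, alias it, and keep the rows with no match" pattern can be tried outside Access. The sketch below uses SQLite through Python's built-in sqlite3 module, which also accepts square-bracket identifiers. It is only a stand-in: now is not a reserved word in SQLite, so this does not reproduce the Access error, and the sample rows and the reduced column set are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Members (Prenom TEXT, Nom TEXT, CodePostal TEXT);
CREATE TABLE [now] (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT, postcode TEXT);

INSERT INTO Members VALUES ('Anne', 'Roy', 'H2X 1Y4'), ('Marc', 'Cote', 'G1R 2B5');
INSERT INTO [now] (firstname, lastname, postcode) VALUES ('Anne', 'Roy', 'H2X 1Y4');
""")

# LEFT JOIN, then keep only the Members rows that found no partner in [now].
rows = conn.execute("""
    SELECT Members.Prenom, Members.Nom, Members.CodePostal
      FROM Members
      LEFT JOIN [now] a
        ON (Members.Prenom = a.firstname
            AND Members.Nom = a.lastname
            AND Members.CodePostal = a.postcode)
     WHERE a.id IS NULL
""").fetchall()

print(rows)
```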

SQL Server Compact won't allow subselect but inner join with groupby not allowed on text datatype

I have the following sql syntax that I used in my database query (SQL Server)
SELECT Nieuwsbrief.ID
, Nieuwsbrief.Titel
, Nieuwsbrief.Brief
, Nieuwsbrief.NieuwsbriefTypeCode
, (SELECT COUNT(*) AS Expr1
FROM NieuwsbriefCommentaar
WHERE (Nieuwsbrief.ID = NieuwsbriefCommentaar.NieuwsbriefID
AND NieuwsbriefCommentaar.Goedgekeurd = 1)) AS AantalCommentaren
FROM Nieuwsbrief
I'm changing now to sql-server-ce (compact edition) which won't allow me to have subqueries like this. Proposed solution : inner join. But as I only need a count of the subtable 'NieuwsbriefCommentaar', I have to use a 'group by' clause on my base table attributes to avoid doubles in the result set.
However the 'Nieuwbrief.Brief' attribute is of datatype 'text'. Group by clauses are not allowed on 'text' datatype in sql-server-ce. 'Text' datatype is deprecated, but sql-server-ce doesn't support 'nvarchar(max)' yet...
Any idea how to solve this? Thx for your help.
I think the solution could be simpler. I don't know exactly what your schema looks like, but I think this code could fit your requirements by simply using a LEFT JOIN.
SELECT Nieuwsbrief.ID
, Nieuwsbrief.Titel
, Nieuwsbrief.Brief
, Nieuwsbrief.NieuwsbriefTypeCode
, COUNT(NieuwsbriefCommentaar.NieuwsbriefID) AS AantalCommentaren
FROM Nieuwsbrief
LEFT JOIN NieuwsbriefCommentaar ON (Nieuwsbrief.ID = NieuwsbriefCommentaar.NieuwsbriefID)
WHERE NieuwsbriefCommentaar.Goedgekeurd = 1
Edited: 2ndOption
SELECT N.ID, N.Titel, N.Brief, N.NieuwsbriefTypeCode, G.AantalCommentaren
FROM Nieuwsbrief AS N
LEFT JOIN
(SELECT NieuwsbriefID, COUNT(*) AS AantalCommentaren
FROM NieuwsbriefCommentaar
WHERE Goedgekeurd = 1
GROUP BY NieuwsbriefID
) AS G ON (N.ID = G.NieuwsbriefID)
Please, let me know if this code works in order to find out another workaround..
regards,
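The idea behind the second option (count the approved comments per newsletter first, then LEFT JOIN the small result, so the text column never appears in a GROUP BY) can be checked with a quick sketch. It uses SQLite through Python's built-in sqlite3 module purely as a stand-in; whether SQL Server Compact accepts the same derived table is not verified here. The sample rows are made up, NieuwsbriefTypeCode is left out for brevity, and COALESCE is an addition so that newsletters without approved comments show 0 instead of NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Nieuwsbrief (ID INTEGER PRIMARY KEY, Titel TEXT, Brief TEXT);
CREATE TABLE NieuwsbriefCommentaar (NieuwsbriefID INTEGER, Goedgekeurd INTEGER);

INSERT INTO Nieuwsbrief VALUES (1, 'Jan', 'long text'), (2, 'Feb', 'long text');
INSERT INTO NieuwsbriefCommentaar VALUES (1, 1), (1, 1), (1, 0), (2, 0);
""")

# Aggregate inside the derived table; the outer query needs no GROUP BY at all.
rows = conn.execute("""
    SELECT N.ID, N.Titel, N.Brief, COALESCE(G.AantalCommentaren, 0)
      FROM Nieuwsbrief AS N
      LEFT JOIN (SELECT NieuwsbriefID, COUNT(*) AS AantalCommentaren
                   FROM NieuwsbriefCommentaar
                  WHERE Goedgekeurd = 1
                  GROUP BY NieuwsbriefID) AS G
        ON N.ID = G.NieuwsbriefID
     ORDER BY N.ID
""").fetchall()

for row in rows:
    print(row)
```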

naming columns in excel with Complex sql

I'm trying to run this SQL using Get External Data in Excel.
It works, but when I try to name the sub-query columns (or anything else, for that matter), the alias gets removed.
I tried AS followed by the name, AS with the name in '', and AS with the name in "",
and the same with a space instead of AS. What is the right way to do that?
Relevant SQL:
SELECT list_name, app_name,
(SELECT fname + ' ' + lname
FROM dbo.d_agent_define map
WHERE map.agent_id = tac.agent_id) as agent_login,
input, CONVERT(varchar,DATEADD(ss,TAC_BEG_tstamp,'01/01/1970'))
FROM dbo.maps_report_list list
JOIN dbo.report_tac_agent tac ON (tac.list_id = list.list_id)
WHERE input = 'SYS_ERR'
AND app_name = 'CHARLOTT'
AND convert(VARCHAR,DATEADD(ss,day_tstamp,'01/01/1970'),101) = '09/10/2008'
AND list_name LIKE 'NRBAD%'
ORDER BY agent_login,CONVERT(VARCHAR,DATEADD(ss,TAC_BEG_tstamp,'01/01/1970'))
You could get rid of your dbo.d_agent_define subquery and just add in a join to the agent define table.
Would this code work?
select list_name, app_name,
map.fname + ' ' + map.lname as agent_login,
input,
convert(varchar,dateadd(ss,TAC_BEG_tstamp,'01/01/1970')) as tac_seconds
from dbo.maps_report_list list
join dbo.report_tac_agent tac
on (tac.list_id = list.list_id)
join dbo.d_agent_define map
on (map.agent_id = tac.agent_id)
where input = 'SYS_ERR'
and app_name = 'CHARLOTT'
and convert(varchar,dateadd(ss,day_tstamp,'01/01/1970'),101) = '09/10/2008'
and list_name LIKE 'NRBAD%'
order by agent_login,convert(varchar,dateadd(ss,TAC_BEG_tstamp,'01/01/1970'))
Note that I named your dateadd column because it did not have a name. I also tried to keep your convention for writing joins. There are a few things that I would do differently with this query to make it more readable, but I only focused on getting rid of the subquery problem.
I did not do this, but I would recommend that you qualify all of your columns with the table from which you are getting them.
To remove the sub query in the SELECT statement I suggest the following:
SELECT list_name, app_name, map.fname + ' ' + map.lname as agent_login, input, convert(varchar,dateadd(ss, TAC_BEG_tstamp, '01/01/1970'))
FROM dbo.maps_report_list as list inner join
(dbo.report_tac_agent as tac inner join dbo.d_agent_define as map ON (tac.agent_id=map.agent_id)) ON list.list_id = tac.list_id
WHERE input = 'SYS_ERR' and app_name = 'CHARLOTT' and convert(varchar,dateadd(ss,day_tstamp,'01/01/1970'),101) = '09/10/2008'
and list_name LIKE 'NRBAD%' order by agent_login,convert(varchar,dateadd(ss,TAC_BEG_tstamp,'01/01/1970'))
I used parentheses to create the inner join between dbo.report_tac_agent and dbo.d_agent_define first. This is now a set of join data.
The combination of those tables is then joined to your list table, which I am assuming is the driving table here. If I understand what you are trying to do with your sub-select, this should work for you.
As the other poster said, you should qualify your columns with table names (e.g. map.fname); it just makes things easier to understand. I didn't in my example because I am not 100% sure which columns go with which tables. Please let me know if this doesn't do it for you and how the data it returns is wrong. That will make it easier to solve if needed.