SQL Server 2008 - Union of 4 queries and ordering by relevance - sql

My query does not order by relevance, even though I declared a MATCH column and try to order by it.
I'm trying to create a stored procedure using UNION.
The query has to follow some rules because I need to return 3 related articles. Each rule has its own query, and I tried to combine them.
Let me explain the rules:
1. Articles that have the same TAG and belong to the same project (CampanhaId)
2. Public articles that have the same TAG but are not in the same project
3. Recent articles in the same project
4. Recent public articles
I need to apply these rules in priority order and take the first three articles they return.
So, if the first rule yields fewer than 3 articles, the second rule fills the remaining slots. The third and fourth rules work the same way.
I tried to create a query like this:
CREATE PROCEDURE [dbo].[SP_GetNoticiaRelacionada]
(@Tag VARCHAR(50), @ExtranetId INT, @CampanhaAreaId INT, @NoticiaId INT)
AS
BEGIN
SELECT TOP 3 *
FROM
(SELECT DISTINCT
ArtigoId, CategoriaId, Titulo, Conteudo,
Subtitulo, Categoria, FotoCompacta, QtdResposta,
0 AS MATCH, DataAlteracao
FROM
(SELECT
A.ArtigoId, A.CategoriaId, A.Titulo, A.Conteudo,
A.Subtitulo, C.Nome AS Categoria,
A.ImgAlt AS FotoCompacta,
(SELECT COUNT(*) FROM Comentario C
WHERE C.GenericAreaId = A.ArtigoId) AS QtdResposta,
1 AS MATCH, A.DataAlteracao
FROM
Artigo A
JOIN
ArtigoCategoria C ON A.CategoriaId = C.CategoriaId
WHERE
A.Apagado = 0
AND A.TAG COLLATE Latin1_General_CI_AI LIKE '%' + @Tag + '%'
AND A.CampanhaAreaId = @CampanhaAreaId
AND A.ArtigoId <> @NoticiaId
UNION
SELECT A.ArtigoId
,A.CategoriaId
,A.Titulo
,A.Conteudo
,A.Subtitulo
,C.Nome AS Categoria
,A.ImgAlt AS FotoCompacta
,(SELECT COUNT(*) FROM Comentario C WHERE C.GenericAreaId = A.ArtigoId) AS QtdResposta
,2 AS MATCH
,A.DataAlteracao
FROM Artigo A
JOIN ArtigoCategoria C ON A.CategoriaId = C.CategoriaId
WHERE A.Apagado = 0 AND A.TAG COLLATE Latin1_General_CI_AI LIKE '%' + @Tag + '%' AND A.CampanhaId = @ExtranetId AND A.ArtigoId <> @NoticiaId
UNION
SELECT A.ArtigoId
,A.CategoriaId
,A.Titulo
,A.Conteudo
,A.Subtitulo
,C.Nome AS Categoria
,A.ImgAlt AS FotoCompacta
,(SELECT COUNT(*) FROM Comentario C WHERE C.GenericAreaId = A.ArtigoId) AS QtdResposta
,3 AS MATCH
,A.DataAlteracao
FROM Artigo A
JOIN ArtigoCategoria C ON A.CategoriaId = C.CategoriaId
WHERE A.Apagado = 0 AND A.CampanhaAreaId = @CampanhaAreaId AND A.ArtigoId <> @NoticiaId
UNION
SELECT A.ArtigoId
,A.CategoriaId
,A.Titulo
,A.Conteudo
,A.Subtitulo
,C.Nome AS Categoria
,A.ImgAlt AS FotoCompacta
,(SELECT COUNT(*) FROM Comentario C WHERE C.GenericAreaId = A.ArtigoId) AS QtdResposta
,4 AS MATCH
,A.DataAlteracao
FROM Artigo A
JOIN ArtigoCategoria C ON A.CategoriaId = C.CategoriaId
WHERE A.Apagado = 0 AND A.CampanhaId = @ExtranetId AND A.ArtigoId <> @NoticiaId
) AS T
GROUP BY
ArtigoId
,CategoriaId
,Titulo
,Conteudo
,Subtitulo
,Categoria
,FotoCompacta
,QtdResposta
,MATCH
,DataAlteracao) AS T2
ORDER BY T2.MATCH ASC, T2.DataAlteracao DESC
END
The first query returns only articles with the same TAG in the same project.
The second one returns all articles matching the same TAG.
The third one matches all articles in the same project.
The last one matches all published articles.
My real problem, I guess, is that the results don't respect that order.
If I have two articles with the same TAG, they should come first as related articles, but instead the query returns whichever article I updated most recently, which should not be first in the list.
When I execute this procedure, SQL Server always returns the Match column with a value of 0.
I think the problem is in this Match column, since I cannot order by it.
If anyone needs more information, please let me know. I'd appreciate any help.
I don't have any further actions I need to take.

You are doing "SELECT 0 AS MATCH" in your outer query, which means it is over-writing any values in your inner query.
In other words, to expose the issue, your code could be simplified to this:
SELECT 0 AS Match
FROM (
SELECT 1 AS Match
UNION
SELECT 2 AS Match
UNION
SELECT 3 AS Match
UNION
SELECT 4 AS Match
) AS T
ORDER BY Match
Since you are using Match 1-4 in the inner query, but then stating "SELECT 0 AS Match" in the outer query that selects from the inner query, all the rows are going to have 0 for Match.
Instead of getting "0 AS Match" in the outer query, you should just get Match from the inner query.
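A minimal sketch of that fix, using a table variable with made-up rows in place of the four UNION branches. One assumption goes beyond the answer: because the same article can satisfy several rules, the sketch keeps only its best (lowest) Match with MIN before taking the top 3, so an article is not listed twice.

```sql
-- Hypothetical stand-in for the combined result of the four UNION branches.
DECLARE @Candidates TABLE (ArtigoId INT, [Match] INT, DataAlteracao DATETIME);

INSERT INTO @Candidates (ArtigoId, [Match], DataAlteracao) VALUES
    (10, 1, '2015-01-01'),  -- same tag, same project
    (10, 3, '2015-01-01'),  -- the same article also satisfies rule 3
    (20, 2, '2015-03-01'),  -- same tag, public
    (30, 4, '2015-06-01'),  -- recent public article
    (40, 4, '2015-05-01');  -- older public article

SELECT TOP 3
       ArtigoId,
       MIN([Match])       AS [Match],         -- taken from the inner rows, not a constant 0
       MAX(DataAlteracao) AS DataAlteracao
FROM @Candidates
GROUP BY ArtigoId
ORDER BY MIN([Match]) ASC, MAX(DataAlteracao) DESC;
```

With these sample rows the query returns articles 10, 20 and 30 in that order: relevance decides first, and the modification date only breaks ties within the same Match value.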

Related

Don't select rows where column A is duplicated AND any row of column B is a specific value

I'm working on generating a report that merges multiple tables. The report must show only projects that do not have any document marked 'Not Received'. These document markings are stored in a table that lists each document on its own row, so when it is joined to my other table it creates multiple rows for the same project. For example, the following table:
Project Number  ChecklistValue
565             Received
565             Not Received
465             Received
465             Not Applicable
As you can see, only two projects are listed in this table, and the desired output is:
Project Number  Other Info
465             etc
I do not need the checklist value on the actual report, so I can use GROUP BY to combine all the good rows. My issue is that this would still include project 565, even if I add something like WHERE ChecklistValue <> 'Not Received'. Project 565 needs to be hidden from the report entirely, because one of its rows contains 'Not Received'.
So that's my actual question: how do I exclude every project number that has any row containing 'Not Received'?
I'm adding the entire query with generalized names below:
SELECT
Project Number
,Name
,Contractor
,ABS(DATEDIFF(day,(ActualDate),(EstDate))) AS DelayPeriod
,S.NoteDate
,S.FinalAppDate
,Status
,S.ONE
,S.TWO
,S.THREE
,S.FOUR
,CH.ChecklistValue
FROM [DB1] A
INNER JOIN [DB2] C ON A.Contractor = C.Contractor
INNER JOIN [DB3] S ON A.AppID = S.AppID
INNER JOIN [DB4] LS ON S.StatusID = LS.StatusID
LEFT OUTER JOIN [DB5] CH ON A.AppID = CH.AppID AND CH.OtherID = 1
WHERE C.TypeID = 4 AND A.YEAR = 2022 AND S.THING = 1 AND
(CH.CheckListValue IS NULL OR A.AppID NOT IN (SELECT * FROM [DB5] WHERE
CheckListValue = 'Not Received'))
GROUP BY Project Number,Name,Contractor,ABS(DATEDIFF(day,(ActualDate),(EstDate))),S.NoteDate,S.FinalAppDate,Status,S.ONE,S.TWO,S.THREE,S.FOUR
The last portion of the WHERE clause was added from a suggestion, but I'm clearly not implementing it correctly as it errors
You can use not in like:
create table test(
num int,
description varchar(20)
);
insert into test(num,description)
values(565,'Received'),
(565,'Not Received'),
(465,'Received'),
(465,'Not Applicable');
select *
from test
where num not in
(
select num -- Only select one column here
from test
where description = 'Not Received'
);
Results:
+-----+----------------+
| num | description    |
+-----+----------------+
| 465 | Received       |
| 465 | Not Applicable |
+-----+----------------+
db<>fiddle: this demo runs on SQL Server, but the query works on other DBMSs as well.
So in your query you should have (in my understanding):
OR A.AppID NOT IN
(
SELECT AppID -- Not select *
FROM [DB5]
WHERE CheckListValue = 'Not Received'
)
Another way to do it is with a CTE, although it looks complicated at first glance:
with x as(
select num
from test
where description = 'Not Received'
)
select t.num, t.description
from test t
left join x
on t.num = x.num
where x.num is null
I first create a CTE on the num column where description = 'Not Received'. Then I select everything from the test table and left join it to the CTE, keeping only the rows whose num is not in the CTE by using where x.num is null. This returns only 465.
Which one is better? I don't know: sometimes the join is faster and sometimes NOT IN is. You can find more on this in the linked post.
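A third variant worth knowing is NOT EXISTS. The sketch below is not from the answer; it uses two hypothetical demo tables (project_demo, checklist_demo) to show one practical difference: if the NOT IN subquery can return a NULL, NOT IN filters out every row, while NOT EXISTS is unaffected.

```sql
-- Hypothetical demo tables; checklist_demo.num is nullable on purpose.
create table project_demo(num int);
create table checklist_demo(num int, description varchar(20));

insert into project_demo(num) values (565), (465);
insert into checklist_demo(num, description)
values (565, 'Received'),
       (565, 'Not Received'),
       (465, 'Received'),
       (465, 'Not Applicable'),
       (null, 'Not Received');   -- orphan row with a NULL key

-- NOT EXISTS: returns 465
select p.num
from project_demo p
where not exists
(
    select 1
    from checklist_demo c
    where c.num = p.num
      and c.description = 'Not Received'
);

-- NOT IN: returns no rows, because 465 <> NULL evaluates to UNKNOWN
select p.num
from project_demo p
where p.num not in
(
    select c.num
    from checklist_demo c
    where c.description = 'Not Received'
);
```

If the key column is declared NOT NULL, both forms return the same rows and the choice is mostly a matter of taste and of what the optimizer does with each.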

Change existing sql to left join only on first match

Adding back some original info for historical purposes as I thought simplifying would help but it didn't. We have this stored procedure, in this part it is selecting records from table A (calldetail_reporting_agents) and doing a left join on table B (Intx_Participant). Apparently there are duplicate rows in table B being pulled that we DON'T want. Is there any easy way to change this up to only pick the first match on table B? Or will I need to rewrite the whole thing?
SELECT 'Agent Calls' AS CallType,
CallDate,
CallTime,
RemoteNumber,
DialedNumber,
RemoteName,
LocalUserId,
CallDurationSeconds,
Answered,
AnswerSpeed,
InvalidCall,
Intx_Participant.Duration
FROM calldetail_reporting_agents
LEFT JOIN Intx_Participant ON calldetail_reporting_agents.CallID = Intx_Participant.CallIDKey
WHERE DialedNumber IN ( SELECT DialedNumber
FROM #DialedNumbers )
AND ConnectedDate BETWEEN @LocStartDate AND @LocEndDate
AND (@LocQueue IS NULL OR AssignedWorkGroup = @LocQueue)
Simpler version: how to change below to select only first matching row from table B:
SELECT columnA, columnB FROM TableA LEFT JOIN TableB ON someColumn
I changed to this per the first answer and all data seems to look exactly as expected now. Thank you to everyone for the quick and attentive help.
SELECT 'Agent Calls' AS CallType,
CallDate,
CallTime,
RemoteNumber,
DialedNumber,
RemoteName,
LocalUserId,
CallDurationSeconds,
Answered,
AnswerSpeed,
InvalidCall,
Intx_Participant.Duration
FROM calldetail_reporting_agents
OUTER APPLY (SELECT TOP 1
*
FROM Intx_Participant ip
WHERE calldetail_reporting_agents.CallID = ip.CallIDKey
AND calldetail_reporting_agents.RemoteNumber = ip.ConnValue
AND ip.HowEnded = '9'
AND ip.Recorded = '0'
AND ip.Duration > 0
AND ip.Role = '1') Intx_Participant
WHERE DialedNumber IN ( SELECT DialedNumber
FROM #DialedNumbers )
AND ConnectedDate BETWEEN @LocStartDate AND @LocEndDate
AND (@LocQueue IS NULL OR AssignedWorkGroup = @LocQueue)
You can try to OUTER APPLY a subquery getting only one matching row.
...
FROM calldetail_reporting_agents
OUTER APPLY (SELECT TOP 1
*
FROM intx_Participant ip
WHERE ip.callidkey = calldetail_reporting_agents.callid) intx_participant
WHERE ...
You should add an ORDER BY in the subquery. Otherwise it isn't deterministic which row is taken as the first. Or maybe that's not an issue.
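To illustrate the ORDER BY point, here is a small self-contained sketch with table variables standing in for the real tables. The column set is reduced, and the rule "keep the participant row with the longest Duration" is only an assumption chosen for the demo; pick whatever ordering defines "first" in your data.

```sql
-- Hypothetical, simplified stand-ins for the two tables in the question.
DECLARE @Calls TABLE (CallID INT);
DECLARE @Participants TABLE (CallIDKey INT, Duration INT);

INSERT INTO @Calls (CallID) VALUES (1), (2);
INSERT INTO @Participants (CallIDKey, Duration)
VALUES (1, 30), (1, 95);              -- call 1 has two matches, call 2 has none

SELECT c.CallID, p.Duration
FROM @Calls c
OUTER APPLY (SELECT TOP 1 ip.Duration
             FROM @Participants ip
             WHERE ip.CallIDKey = c.CallID
             ORDER BY ip.Duration DESC) p;   -- assumed tie-break rule: longest duration wins
```

This returns one row per call: call 1 with Duration 95, and call 2 with NULL, since OUTER APPLY keeps left-side rows that have no match.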

OracleSQL: How do I add a specific AND is not null OR is not null to my query

Backstory:
I have three tables I'm working with: a directory table (directory), a general attribute table (attribute1table) and a specific attribute table (attribute2table). The general attribute table holds attribute names (e.g. Last Name) under attribute IDs (attrid = 2). The specific attribute table holds the data for these attributes (e.g. Doe).
I needed to transpose rows to columns. I had tried using pivot, and max(decode) before but all options gave me the wrong string value- so I used a sub select within the select statement. This worked well- it did transpose the rows into columns but gave me a bunch of null values. See query at the bottom for steps.
Then I added in a general 'stringval IS NOT NULL' to eliminate any of the other attribute1table.attrid's (ex. 4, 5, 6). This worked.
This is the output I was getting at this point. The ? are null values.
Name DataID LastName FirstName
File10 1290 ? Jane
File10 1290 Doe ?
Then I wanted to add a further condition: include only the rows where LastName is not null OR FirstName is not null. I found that someone had recommended this in a previous question, "Eliminating specific null values in sql select", although their situation was different.
I was able to include one statement or the other but could not add in both. Instead of getting an error I just got a horrifically long run time with no foreseeable result (note that I am using software which lets you input oracle queries within the interface to query the database). It works if I run the query up until the ** (see code) but as soon as I add in the OR condition, it doesn't work anymore. I think this is because I have multiple WHERE conditions. In all cases I want the directory ID and general stringval conditions to apply but I want to have a third condition where either lastname is not null or first name is not null. I'm not sure if I'm missing something obvious- please help?
Here is my current query:
SELECT directory.name, directory.dataid,
(SELECT max(stringval) FROM attribute2table WHERE attribute1table.attrid = 2) as LastName,
(SELECT max(stringval) FROM attribute2table WHERE attribute1table.attrid = 3) as FirstName
FROM attribute2table
JOIN directory ON directory.dataid = attribute2table.id
JOIN attribute1table ON attribute1table.id = directory.dataid
WHERE directory.dataid = 1290
AND stringval IS NOT NULL
AND (SELECT max(valstr) FROM attribute1table WHERE attribute1table.attrid = 2) IS NOT NULL
**OR (SELECT max(valstr) FROM attribute1table WHERE attribute1table.attrid = 3) IS NOT NULL**
Basically I just need to get rid of the null values and want my table to look like....
Name DataID LastName FirstName
File10 1290 Doe Jane
This appears to be a parenthesization issue. If I understand the issue, you need to put the two IS NOT NULL conditions in parentheses:
SELECT directory.name,
directory.dataid,
m2.LastName,
m3.FirstName
FROM attribute2table
INNER JOIN directory
ON directory.dataid = attribute2table.id
INNER JOIN attribute1table
ON attribute1table.id = directory.dataid
LEFT OUTER JOIN (SELECT max(valstr) AS LASTNAME
FROM attribute1table
WHERE attribute1table.attrid = 2) m2
ON 1 = 1
LEFT OUTER JOIN (SELECT max(valstr) AS FIRSTNAME
FROM attribute1table
WHERE attribute1table.attrid = 3) m3
ON 1 = 1
WHERE directory.dataid = 1290 AND
stringval IS NOT NULL AND
(m2.LASTNAME IS NOT NULL OR
m3.FIRSTNAME IS NOT NULL)
I also rewrote the query using joins instead of subselects as I think it's a bit clearer.
Note also that in the M2 and M3 joins I used LEFT OUTER with a condition of 1 = 1 rather than CROSS JOIN, because I've noticed that CROSS JOIN acts like an INNER JOIN if the query being cross-joined returns no rows: it causes the entire SELECT to return no data. A dbfiddle demonstrates this situation.
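The parenthesization point can be seen in isolation with a tiny hypothetical table (precedence_demo is invented for this sketch). AND binds more tightly than OR, so without parentheses the OR branch escapes the ID filter. That may also explain the very long run time in the question, since the unparenthesized condition no longer restricts the query to one dataid.

```sql
-- Hypothetical table, only to show operator precedence.
create table precedence_demo (id int, last_name varchar(20), first_name varchar(20));
insert into precedence_demo values (1290, null, 'Jane');
insert into precedence_demo values (9999, null, 'Other');

-- Parsed as (id = 1290 AND last_name IS NOT NULL) OR first_name IS NOT NULL:
-- the id filter is bypassed, so both rows are returned.
select id
from precedence_demo
where id = 1290
  and last_name is not null
   or first_name is not null;

-- With parentheses the id filter always applies: only 1290 is returned.
select id
from precedence_demo
where id = 1290
  and (last_name is not null or first_name is not null);
```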
I'm pretty sure you just need conditional aggregation:
SELECT d.name, d.dataid,
MAX(CASE WHEN a1.attrid = 2 THEN a2.stringval END) as LastName,
MAX(CASE WHEN a1.attrid = 3 THEN a2.stringval END) as FirstName
FROM directory d JOIN
attribute2table a2
ON a2.id = d.dataid JOIN
attribute1table a1
ON a1.id = d.dataid
WHERE d.dataid = 1290
GROUP BY d.name, d.dataid
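For readers who have not seen conditional aggregation before, here is a stripped-down sketch. It assumes a single hypothetical table attr_demo that holds the attribute ID and the value on the same row, which is simpler than the three-table layout in the question.

```sql
-- Hypothetical single-table layout: one row per (object, attribute).
create table attr_demo (dataid int, attrid int, stringval varchar(20));
insert into attr_demo values (1290, 2, 'Doe');
insert into attr_demo values (1290, 3, 'Jane');
insert into attr_demo values (1290, 4, 'ignored');

-- Each CASE yields the value for one attribute and NULL otherwise;
-- MAX ignores the NULLs and collapses the rows into one per dataid.
select dataid,
       max(case when attrid = 2 then stringval end) as lastname,
       max(case when attrid = 3 then stringval end) as firstname
from attr_demo
where dataid = 1290
group by dataid;
```

The result is a single row (1290, Doe, Jane), so no extra IS NOT NULL filtering is needed to merge the half-empty rows.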

SQL Join / Union

I have two statements that I want to merge into one output.
Statement One:
select name from auxiliary_variable_inquiry
where inquiry_idbr_code = '063'
Returns the following list of names:
Name
------------
Affiliates
NetBookValue
Parents
Worldbase
Statement Two:
select name, value from auxiliary_variable_value
where inquiry_idbr_code = '063'
and ru_ref = 20120000008
and period = 200912
Returns the following:
Name Value
-------------------
Affiliates 112
NetBookValue 225.700
I would like to have an output like this:
Name Value
-------------------
Affiliates 112
NetBookValue 225.700
Parents 0
Worldbase 0
So basically, if the second query only returns 2 names and values, I'd still like to display the complete set of names from the first query, with no values. If all four values were returned by both queries, then all four would be displayed.
Sorry, I must add: I'm using Ingres SQL, so I'm unable to use the ISNULL function.
You can do a left join. This ensures that all records from the first table will stay included. Where value is null, no child record was found, and we use coalesce to display 0 in these cases.
select i.name, COALESCE(v.Value,0) from auxiliary_variable_inquiry i
left join auxiliary_variable_value v
on v.inquiry_idbr_code = i.inquiry_idbr_code
and v.name = i.name
and v.ru_ref = 20120000008
and v.period = 200912
where i.inquiry_idbr_code = '063'
I'd recommend a JOIN using the LEFT OUTER JOIN syntax. Include your 'extra' conditions from the second query in the JOIN condition, while the first conditions stay in the WHERE, like this:
select a.name, CASE WHEN b.Value IS NULL THEN 0 ELSE b.Value END AS Value
from
auxiliary_variable_inquiry a
LEFT JOIN
auxiliary_variable_value b ON
a.name = b.name and -- replace this with your real ID-based JOIN
a.inquiry_idbr_code = b.inquiry_idbr_code AND
b.ru_ref = 20120000008 AND
b.period = 200912
where a.inquiry_idbr_code = '063'
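Why the extra conditions belong in the JOIN condition rather than in the WHERE clause can be shown with two small hypothetical tables (inquiry_demo and value_demo, with a shortened column list and val in place of value). This is only an illustration of the principle, not the real schema.

```sql
-- Hypothetical, reduced versions of the two tables.
create table inquiry_demo (name varchar(20));
create table value_demo (name varchar(20), period int, val int);
insert into inquiry_demo values ('Affiliates');
insert into inquiry_demo values ('Parents');
insert into value_demo values ('Affiliates', 200912, 112);

-- Filter in ON: unmatched names survive with NULL, shown as 0.
-- Returns Affiliates 112 and Parents 0.
select i.name, coalesce(v.val, 0) as val
from inquiry_demo i
left join value_demo v
  on v.name = i.name
 and v.period = 200912;

-- Filter in WHERE: the NULL-extended Parents row fails v.period = 200912
-- and is dropped, so the LEFT JOIN behaves like an INNER JOIN.
-- Returns only Affiliates 112.
select i.name, coalesce(v.val, 0) as val
from inquiry_demo i
left join value_demo v
  on v.name = i.name
where v.period = 200912;
```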
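If I understood correctly, you should use something like this (the filters on v go in the ON clause, so that the LEFT JOIN keeps names without a value):
SELECT i.NAME,
v.NAME,
v.value
FROM auxiliary_variable_inquiry i
LEFT JOIN auxiliary_variable_value v
ON i.inquiry_idbr_code = v.inquiry_idbr_code
AND i.NAME = v.NAME
AND v.ru_ref = 20120000008
AND v.period = 200912
WHERE i.inquiry_idbr_code = '063'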

querying 2 tables with the same spec for the differences

I recently had to solve this problem and find I've needed this info many times in the past so I thought I would post it. Assuming the following table def, how would you write a query to find all differences between the two?
table def:
CREATE TABLE feed_tbl
(
code varchar(15),
name varchar(40),
status char(1),
update char(1), -- UPDATE is a reserved word in most dialects; quote or rename it
CONSTRAINT feed_tbl_PK PRIMARY KEY (code)
);
CREATE TABLE data_tbl
(
code varchar(15),
name varchar(40),
status char(1),
update char(1), -- same caveat as above
CONSTRAINT data_tbl_PK PRIMARY KEY (code)
);
Here is my solution, as a view using three queries joined by unions. The diff_type indicates how the record needs to be updated: deleted from _data (2), updated in _data (1), or added to _data (0).
CREATE VIEW delta_vw AS (
SELECT feed_tbl.code, feed_tbl.name, feed_tbl.status, feed_tbl.update, 0 as diff_type
FROM feed_tbl LEFT OUTER JOIN
data_tbl ON feed_tbl.code = data_tbl.code
WHERE (data_tbl.code IS NULL)
UNION
SELECT feed_tbl.code, feed_tbl.name, feed_tbl.status, feed_tbl.update, 1 as diff_type
FROM data_tbl RIGHT OUTER JOIN
feed_tbl ON data_tbl.code = feed_tbl.code
where (feed_tbl.name <> data_tbl.name) OR
(data_tbl.status <> feed_tbl.status) OR
(data_tbl.update <> feed_tbl.update)
UNION
SELECT data_tbl.code, data_tbl.name, data_tbl.status, data_tbl.update, 2 as diff_type
FROM data_tbl LEFT OUTER JOIN
feed_tbl ON data_tbl.code = feed_tbl.code
WHERE (feed_tbl.code IS NULL)
)
UNION will remove duplicates, so just UNION the two together, then search for anything with more than one entry. Given "code" as a primary key, you can say:
edit 0: modified to include differences in the PK field itself
edit 1: if you use this in real life, be sure to list the actual column names. Don't use dot-star, since the UNION operation requires result sets to have exactly matching columns. This example would break if you added or removed a column from one of the tables.
select dt.*
from
data_tbl dt
,(
select code
from
(
select * from feed_tbl
union
select * from data_tbl
) u -- some dialects (e.g. SQL Server) require an alias on a derived table
group by code
having count(*) > 1
) diffs --"diffs" will return all differences *except* those in the primary key itself
where diffs.code = dt.code
union --plus the ones that are only in feed, but not in data
select * from feed_tbl ft where not exists(select code from data_tbl dt where dt.code = ft.code)
union --plus the ones that are only in data, but not in feed
select * from data_tbl dt where not exists(select code from feed_tbl ft where ft.code = dt.code)
I would use a minor variation in the second union:
where (ISNULL(feed_tbl.name, 'NONAME') <> ISNULL(data_tbl.name, 'NONAME')) OR
(ISNULL(data_tbl.status, 'NOSTATUS') <> ISNULL(feed_tbl.status, 'NOSTATUS')) OR
(ISNULL(data_tbl.update, '12/31/2039') <> ISNULL(feed_tbl.update, '12/31/2039'))
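The reason is that NULL does not equal NULL in SQL: any comparison involving NULL evaluates to UNKNOWN rather than TRUE, so rows where one side is NULL would otherwise be missed. This is standard SQL behavior, not something specific to SQL Server.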
You could also use a FULL OUTER JOIN and a CASE ... END expression on the diff_type column, along with the aforementioned WHERE clause.
That would probably achieve the same results, but in one query.
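A rough sketch of that single-query idea, under some assumptions: the tables are reduced to two hypothetical columns (feed_demo and data_demo with code and name), the diff_type codes follow the view above, and the dialect supports FULL OUTER JOIN (MySQL, for example, does not).

```sql
-- Hypothetical reduced tables: only the key and one comparable column.
create table feed_demo (code varchar(15) primary key, name varchar(40));
create table data_demo (code varchar(15) primary key, name varchar(40));

insert into feed_demo values ('A', 'alpha');      -- only in feed -> 0 (add)
insert into feed_demo values ('B', 'beta-new');   -- changed      -> 1 (update)
insert into feed_demo values ('D', 'delta');      -- identical, not reported
insert into data_demo values ('B', 'beta-old');
insert into data_demo values ('C', 'gamma');      -- only in data -> 2 (delete)
insert into data_demo values ('D', 'delta');

select coalesce(f.code, d.code) as code,
       case when d.code is null then 0   -- missing from data: add
            when f.code is null then 2   -- missing from feed: delete
            else 1                       -- present in both but different: update
       end as diff_type
from feed_demo f
full outer join data_demo d
  on d.code = f.code
where d.code is null
   or f.code is null
   or f.name <> d.name
   or (f.name is null and d.name is not null)   -- NULL-safe comparison without sentinel values
   or (f.name is not null and d.name is null);
```

With the sample rows this reports A with diff_type 0, B with 1 and C with 2, while the unchanged row D is left out. For the real tables, each compared column (name, status, update) would need its own set of three comparison lines.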