Require help on Inner join query on single table - sql

I have a single table called TEST, as follows:
job_id input_id match_outcome
101 1 MATCH
101 2 NO_MATCH
201 1 NO_MATCH
201 2 MATCH
Expected outcome:
job_id input_id match_outcome
201 1 NO_MATCH
101 2 NO_MATCH
Query I used:
select *
from ( select * from TEST where job_id = '101') q1 join
(select * from TEST where job_id = '201') q2
where q1.match_outcome= 'MATCH' and q2.match_outcome= 'NO_MATCH' OR
q2.match_outcome= 'MATCH' and q1.match_outcome= 'NO_MATCH'
Overall objective:
I need input_id and other data which is MATCH with one job_id and the input id which is NO MATCH in another set of job id. But this query takes a long time, since the table contains millions of records, and I haven't seen the result yet. (FYI, I am using Hive tables.) Is there a more efficient or otherwise better way to do this? Thanks.

If I try to understand this: "I need input_id . . . which is MATCH with one job_id and and the input id which is NO MATCH in another set of job id", then you can use aggregation:
select input_id
from test
group by input_id
having sum(case when match_outcome = 'MATCH' then 1 else 0 end) > 0 and
sum(case when match_outcome = 'NO_MATCH' then 1 else 0 end) > 0;
This assumes that input_id is not duplicated for a given job_id, which seems consistent with the data in the question.
If you want the original data row, then you can join this into the query:
select t.*
from test t join
(select input_id
from test
group by input_id
having sum(case when match_outcome = 'MATCH' then 1 else 0 end) > 0 and
sum(case when match_outcome = 'NO_MATCH' then 1 else 0 end) > 0
) i
on t.input_id = i.input_id;
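As a rough check of the aggregation approach, here is a minimal sketch that runs it against the question's four sample rows. It uses SQLite through Python's sqlite3 module rather than Hive, so it only illustrates the logic, not Hive performance. The extra filter on match_outcome = 'NO_MATCH' is an assumption added here to reproduce the expected outcome shown in the question; the function name no_match_rows is hypothetical.

```python
import sqlite3

# Sample rows copied from the question: (job_id, input_id, match_outcome).
SAMPLE_ROWS = [
    (101, 1, "MATCH"),
    (101, 2, "NO_MATCH"),
    (201, 1, "NO_MATCH"),
    (201, 2, "MATCH"),
]

# Inner query: input_ids that have at least one MATCH and one NO_MATCH.
# Outer query: join back to the table; the WHERE filter is an assumption
# added to match the question's expected outcome (NO_MATCH rows only).
QUERY = """
    SELECT t.job_id, t.input_id, t.match_outcome
    FROM test t
    JOIN (
        SELECT input_id
        FROM test
        GROUP BY input_id
        HAVING SUM(CASE WHEN match_outcome = 'MATCH' THEN 1 ELSE 0 END) > 0
           AND SUM(CASE WHEN match_outcome = 'NO_MATCH' THEN 1 ELSE 0 END) > 0
    ) i ON t.input_id = i.input_id
    WHERE t.match_outcome = 'NO_MATCH'
    ORDER BY t.input_id, t.job_id
"""


def no_match_rows(rows):
    """Return NO_MATCH rows whose input_id also has a MATCH somewhere."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE test (job_id INTEGER, input_id INTEGER, match_outcome TEXT)"
        )
        conn.executemany("INSERT INTO test VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in no_match_rows(SAMPLE_ROWS):
        print(row)
```

The table is scanned once for the grouping and once for the join, which avoids the cross join between the two job_id subsets in the original query.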

Select * from TEST
Where match_outcome='NO_MATCH'

Related

Sum a column and perform more calculations on the result? [duplicate]

In my query below I am counting occurrences in a table based on the Status column. I also want to perform calculations based on the counts I am returning. For example, let's say I want to add 100 to the Snoozed value... how do I do this? Below is what I thought would do it:
SELECT
pu.ID Id, pu.Name Name,
COUNT(*) LeadCount,
SUM(CASE WHEN Status = 'Working' THEN 1 ELSE 0 END) AS Working,
SUM(CASE WHEN Status = 'Uninterested' THEN 1 ELSE 0 END) AS Uninterested,
SUM(CASE WHEN Status = 'Converted' THEN 1 ELSE 0 END) AS Converted,
SUM(CASE WHEN SnoozedId > 0 THEN 1 ELSE 0 END) AS Snoozed,
Snoozed + 100 AS Test
FROM
Prospects p
INNER JOIN
ProspectsUsers pu on p.OwnerId = pu.SalesForceId
WHERE
p.Store = '108'
GROUP BY
pu.Name, pu.Id
ORDER BY
Name
I get this error:
Invalid column name 'Snoozed'.
How can I take the value of the previous SUM statement, add 100 to it, and return it as another column? What I was aiming for is an additional column labeled Test that has the Snooze count + 100.
You can't use one column to create another column in the same way that you are attempting. You have 2 options:
Do the full calculation (as @forpas mentioned in the comments above).
Use a temp table or table variable to store the data. That way you can get the first 5 columns and then either add the last column or select from the temp table and do the last column's calculation there.
You cannot use an alias as a column reference in the same SELECT list. The correct script is:
SELECT
pu.ID Id, pu.Name Name,
COUNT(*) LeadCount,
SUM(CASE WHEN Status = 'Working' THEN 1 ELSE 0 END) AS Working,
SUM(CASE WHEN Status = 'Uninterested' THEN 1 ELSE 0 END) AS Uninterested,
SUM(CASE WHEN Status = 'Converted' THEN 1 ELSE 0 END) AS Converted,
SUM(CASE WHEN SnoozedId > 0 THEN 1 ELSE 0 END)+100 AS Snoozed
FROM
Prospects p
INNER JOIN
ProspectsUsers pu on p.OwnerId = pu.SalesForceId
WHERE
p.Store = '108'
GROUP BY
pu.Name, pu.Id
ORDER BY
Name
MSSQL does not allow you to reference fields (or aliases) in the SELECT statement from within the same SELECT statement.
To work around this:
Use a CTE. Define the columns you want to select from in the CTE, and then select from them outside the CTE.
;WITH OurCte AS (
SELECT
5 + 5 - 3 AS OurInitialValue
)
SELECT
OurInitialValue / 2 AS OurFinalValue
FROM OurCte
Use a temp table. This is very similar in functionality to using a CTE, however, it does have different performance implications.
SELECT
5 + 5 - 3 AS OurInitialValue
INTO #OurTempTable
SELECT
OurInitialValue / 2 AS OurFinalValue
FROM #OurTempTable
Use a subquery. This tends to be more difficult to read than the above. I'm not certain what the advantage is to this - maybe someone in the comments can enlighten me.
SELECT
OurInitialValue / 2 AS OurFinalValue
FROM (
SELECT
5 + 5 - 3 AS OurInitialValue
) OurSubquery
Embed your calculations. (Opinion warning.) This is really sloppy and not a great approach: you end up having to duplicate code, and the columns can easily get out of sync if you update the calculation in one location and not the other.
SELECT
5 + 5 - 3 AS OurInitialValue
, (5 + 5 - 3) / 2 AS OurFinalValue
You can't use a column alias in the same SELECT list. Column aliases have no precedence or sequence among themselves; they are all assigned when the SELECT list is evaluated, which logically happens after GROUP BY and before ORDER BY.
You must repeat the code:
SELECT
pu.ID Id,pu.Name Name,
COUNT(*) LeadCount,
SUM(CASE WHEN Status = 'Working' THEN 1 ELSE 0 END) AS Working,
SUM(CASE WHEN Status = 'Uninterested' THEN 1 ELSE 0 END) AS Uninterested,
SUM(CASE WHEN Status = 'Converted' THEN 1 ELSE 0 END) AS Converted,
SUM(CASE WHEN SnoozedId > 0 THEN 1 ELSE 0 END) AS Snoozed,
SUM(CASE WHEN SnoozedId > 0 THEN 1 ELSE 0 END)+ 100 AS Test
FROM
Prospects p
INNER JOIN
ProspectsUsers pu on p.OwnerId = pu.SalesForceId
WHERE
p.Store = '108'
GROUP BY
pu.Name, pu.Id
ORDER BY
Name
If you don't want to repeat the code, use a subquery:
SELECT
ID, Name, LeadCount, Working, Uninterested,Converted, Snoozed, Snoozed +100 AS test
FROM
(SELECT
pu.ID Id,pu.Name Name,
COUNT(*) LeadCount,
SUM(CASE WHEN Status = 'Working' THEN 1 ELSE 0 END) AS Working,
SUM(CASE WHEN Status = 'Uninterested' THEN 1 ELSE 0 END) AS Uninterested,
SUM(CASE WHEN Status = 'Converted' THEN 1 ELSE 0 END) AS Converted,
SUM(CASE WHEN SnoozedId > 0 THEN 1 ELSE 0 END) AS Snoozed
FROM Prospects p
INNER JOIN ProspectsUsers pu on p.OwnerId = pu.SalesForceId
WHERE p.Store = '108'
GROUP BY pu.Name, pu.Id) t
ORDER BY Name
Alternatively, wrap the inner query in a view.
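To make the workaround concrete, here is a minimal sketch of the CTE variant applied to a cut-down version of the question's aggregation. It runs on SQLite through Python's sqlite3 module rather than SQL Server, and the table prospects, its two columns, and the function name snoozed_with_offset are hypothetical stand-ins for the question's schema.

```python
import sqlite3

# Hypothetical sample data: (owner, snoozed_id). A snoozed_id > 0 counts as snoozed.
SAMPLE_LEADS = [
    ("ann", 7),
    ("ann", 0),
    ("ann", 3),
    ("bob", 0),
]

# The CTE computes the aggregate once and names it `snoozed`;
# the outer SELECT can then reuse that name in a further calculation.
QUERY = """
    WITH counts AS (
        SELECT owner,
               COUNT(*) AS lead_count,
               SUM(CASE WHEN snoozed_id > 0 THEN 1 ELSE 0 END) AS snoozed
        FROM prospects
        GROUP BY owner
    )
    SELECT owner, lead_count, snoozed, snoozed + ? AS test
    FROM counts
    ORDER BY owner
"""


def snoozed_with_offset(rows, offset=100):
    """Return (owner, lead_count, snoozed, snoozed + offset) per owner."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE prospects (owner TEXT, snoozed_id INTEGER)")
        conn.executemany("INSERT INTO prospects VALUES (?, ?)", rows)
        return conn.execute(QUERY, (offset,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in snoozed_with_offset(SAMPLE_LEADS):
        print(row)
```

The SUM(CASE ...) expression appears only once, so a later change to the snoozed rule cannot drift out of sync with the derived column.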

How to filter rows based on group values in SQL?

I have a table with the following format
serialnumber,test,result
-------------------------
ABC 1 "TOO HIGH"
ABC 2 "PASS"
ABC 3 "TOO LOW"
DEF 1 "PASS"
DEF 2 "PASS"
DEF 3 "PASS"
I need to do two operations:
1) for each serial number that has all pass records, I need to roll it up into a single record
2) for every serial that contains a "TOO HIGH" or "TOO LOW" record, I need to exclude all "PASS" records for that serial number
How would I go about doing this in Teradata 15, preferably in a single statement?
SELECT *
FROM tab
QUALIFY
-- #1, only PASS
( SUM(CASE WHEN result <> 'PASS' THEN 1 ELSE 0 end)
OVER (PARTITION BY serialnumber) = 0
AND ROW_NUMBER()
OVER (PARTITION BY serialnumber
ORDER BY test) = 1
)
OR
-- #2
( SUM(CASE WHEN result <> 'PASS' THEN 1 ELSE 0 end)
OVER (PARTITION BY serialnumber) > 0
AND result <> 'PASS'
)
Consider a union query combining both conditions, using an aggregate query for #1 and an inner join against a derived table for #2. Hopefully, Teradata's dialect supports the syntax:
SELECT TableName.SerialNumber,
Min(TableName.Test) As Test,
Min(TableName.Result) As Result
FROM TableName
GROUP BY TableName.SerialNumber
HAVING Sum(CASE WHEN TableName.Result='"PASS"' THEN 1 ELSE 0 END) = Count(*)
UNION
SELECT TableName.SerialNumber,
TableName.Test,
TableName.Result
FROM TableName
INNER JOIN
(SELECT DISTINCT SerialNumber FROM TableName
WHERE Result IN ('"TOO HIGH"', '"TOO LOW"')) AS failSub
ON TableName.SerialNumber = failSub.SerialNumber
WHERE TableName.Result <> '"PASS"';
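The two rules can be checked against the question's sample data with a small sketch. It is written for SQLite via Python's sqlite3 module, which has no QUALIFY clause, so it uses the union formulation rather than the Teradata-specific one. It also assumes the double quotes in the sample table are display artifacts and stores the values as plain PASS, TOO HIGH, and TOO LOW; the function name roll_up is hypothetical.

```python
import sqlite3

# Sample rows from the question: (serialnumber, test, result).
# Assumption: the quotes shown in the question are not part of the stored value.
SAMPLE_TESTS = [
    ("ABC", 1, "TOO HIGH"),
    ("ABC", 2, "PASS"),
    ("ABC", 3, "TOO LOW"),
    ("DEF", 1, "PASS"),
    ("DEF", 2, "PASS"),
    ("DEF", 3, "PASS"),
]

# Branch 1: serial numbers with no non-PASS rows, rolled up to one record.
# Branch 2: every non-PASS row (its serial number necessarily has a failure).
QUERY = """
    SELECT serialnumber, MIN(test) AS test, 'PASS' AS result
    FROM tab
    GROUP BY serialnumber
    HAVING SUM(CASE WHEN result <> 'PASS' THEN 1 ELSE 0 END) = 0
    UNION ALL
    SELECT serialnumber, test, result
    FROM tab
    WHERE result <> 'PASS'
    ORDER BY 1, 2
"""


def roll_up(rows):
    """Collapse all-PASS serials to one row; keep only failures otherwise."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE tab (serialnumber TEXT, test INTEGER, result TEXT)")
        conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in roll_up(SAMPLE_TESTS):
        print(row)
```

UNION ALL is safe here because the two branches can never produce the same row: one yields only PASS records and the other only non-PASS records.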

Returning only id's of records that meet criteria

I need to return the distinct IDs of records that meet both of the following conditions at the same time:
they must have records with field reason_of_creation = 1
and they must NOT have records with field reason_of_creation = 0 or null.
While I was able to do it, I keep wondering whether there is a more elegant (or even recommended) way of doing it.
Here is an anonymized version of what I have:
select distinct st.some_id from (
select st.some_id, wanted.wanted_count as wanted, unwanted.unwanted_count as unwanted
from some_table st
left join (
select st.some_id, count(st.reason_of_creation) as wanted_count
from some_table st
where st.reason_of_creation=1
group by st.some_id
) wanted on wanted.some_id = st.some_id
left join (
select st.some_id, count(st.reason_of_creation) as unwanted_count
from some_table st
where st.reason_of_creation=0
group by st.some_id
) unwanted on unwanted.some_id = st.some_id
where wanted.wanted_count >0 and (unwanted.unwanted_count = 0 or unwanted.unwanted_count is null)
) st;
Sample data :
some_id reason_of_creation
1 1
1 0
2 1
3 null
4 0
4 1
5 1
The desired result would be the list of records with some_id = 2, 5.
It seems to me your query is overkill; all you need is some post-aggregation filtering:
SELECT some_id FROM t
GROUP BY some_id
HAVING SUM(CASE WHEN reason_of_creation = 1 THEN 1 ELSE 0 END)>0
AND SUM(CASE WHEN reason_of_creation = 0 OR reason_of_creation IS NULL THEN 1 ELSE 0 END)=0
I think a more elegant query exists. It relies on the assumption that reason_of_creation is an integer, so its smallest possible value greater than 0 is 1.
This is for possible negative values of reason_of_creation:
select some_id from some_table
where reason_of_creation is null or reason_of_creation != -1
group by some_id
having(min(nvl(abs(reason_of_creation), 0)) = 1)
or
select some_id from some_table
group by some_id
having(min(nvl(abs(case when reason_of_creation = -1 then -2 else reason_of_creation end), 0)) = 1)
And this one applies if reason_of_creation is a non-negative integer:
select some_id from some_table
group by some_id
having(min(nvl(reason_of_creation, 0)) = 1)
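As a quick sanity check, the post-aggregation filter can be run against the question's sample data. This is a minimal sketch using SQLite through Python's sqlite3 module; the function name wanted_ids is hypothetical, and an ORDER BY is added only to make the output deterministic.

```python
import sqlite3

# Sample rows from the question: (some_id, reason_of_creation); None is SQL NULL.
SAMPLE_REASONS = [
    (1, 1),
    (1, 0),
    (2, 1),
    (3, None),
    (4, 0),
    (4, 1),
    (5, 1),
]

# Keep ids with at least one reason 1 and no reason that is 0 or NULL.
QUERY = """
    SELECT some_id
    FROM some_table
    GROUP BY some_id
    HAVING SUM(CASE WHEN reason_of_creation = 1 THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN reason_of_creation = 0
                      OR reason_of_creation IS NULL THEN 1 ELSE 0 END) = 0
    ORDER BY some_id
"""


def wanted_ids(rows):
    """Return the ids that have a wanted reason and no unwanted one."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE some_table (some_id INTEGER, reason_of_creation INTEGER)")
        conn.executemany("INSERT INTO some_table VALUES (?, ?)", rows)
        return [some_id for (some_id,) in conn.execute(QUERY)]
    finally:
        conn.close()


if __name__ == "__main__":
    print(wanted_ids(SAMPLE_REASONS))
```

The explicit IS NULL test matters: a comparison such as reason_of_creation = 0 is never true for NULL, so without it id 3 would only be excluded by the first condition and an id with rows (1, NULL) would slip through.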

Return records that all match and all records where at least one doesn't match

Given a table of exam results, where 1 == PASS and 0 == FAIL
ID Name Test Result
--------------------
1 John MATH 1
2 John ENGL 1
3 Mary MATH 1
4 Mary PSYC 0
EDIT: assume that the name is unique.
I need to get all records for people who
1) passed all tests
2) failed at least one test
So, the 1st query should return John and all his records, and the 2nd query should return Mary and all her records (including the ones with PASS).
I'm trying to do a LEFT OUTER JOIN with itself and compare counts, but don't seem to get a working query.
SELECT * FROM Results R1
LEFT OUTER JOIN Results R2 on R1.ID=R2.ID and R2.Result=1
WHERE ??? count of rows from R1 is compared to count of non-null rows from R2
This is a "poster-child" exercise for the EXISTS clause:
At least one failed result:
select * from Results r
where exists (select * from Results rr where rr.Name=r.Name AND rr.Result=0)
All passed:
select * from Results r
where not exists (select * from Results rr where rr.Name=r.Name AND rr.Result=0)
See how these queries work on your data set at sqlfiddle.com.
All passed
SELECT Name FROM Results R1
GROUP BY NAME
HAVING SUM(RESULT) = COUNT(RESULT)
Some failed
SELECT Name FROM Results R1
GROUP BY NAME
HAVING SUM(RESULT) < COUNT(RESULT)
Hope it helps
Edit
All passed
SELECT Name FROM Results R1
GROUP BY NAME
HAVING SUM(1-RESULT) = 0
Some failed
SELECT Name FROM Results R1
GROUP BY NAME
HAVING SUM(1-RESULT) > 0
(This might run faster)
One way
Select Name,
Case failCount When 0 then 'X' Else '' End PassedAll,
Case failCount When 0 then '' Else 'X' End FailedOneOrMore
From (Select name,
Sum(Case Result when 0 Then 1 Else 0 End) failCount
From Results R
Group By Name) Z
To get all the records, just join to this:
Select zz.Name, zz.PassedAll, zz.FailedOneOrMore,
r.Test, r.Result
From (Select Name,
Case failCount When 0 then 'X' Else '' End PassedAll,
Case failCount When 0 then '' Else 'X' End FailedOneOrMore
From (Select name,
Sum(Case Result when 0 Then 1 Else 0 End) failCount
From Results R
Group By Name) Z) ZZ
Left Join Results r On r.Name = zz.Name
This query uses a subquery to return all records (pass & fail) for people who have passed at least one of the Tests:
select * from Results where Name in (select Name from Results where Result = '1' group by Name);
Results exclude those who failed to pass any of the tests.
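The EXISTS and NOT EXISTS variants can be tried on the question's four rows with the following minimal sketch. It uses SQLite through Python's sqlite3 module and relies on the question's stated assumption that names are unique per person; the function name split_results is hypothetical.

```python
import sqlite3

# Sample rows from the question: (id, name, test, result), 1 = PASS, 0 = FAIL.
SAMPLE_RESULTS = [
    (1, "John", "MATH", 1),
    (2, "John", "ENGL", 1),
    (3, "Mary", "MATH", 1),
    (4, "Mary", "PSYC", 0),
]

# All records of people with no failed test.
ALL_PASSED = """
    SELECT id, name, test, result FROM results r
    WHERE NOT EXISTS (SELECT 1 FROM results rr
                      WHERE rr.name = r.name AND rr.result = 0)
    ORDER BY id
"""

# All records (passes included) of people with at least one failed test.
SOME_FAILED = """
    SELECT id, name, test, result FROM results r
    WHERE EXISTS (SELECT 1 FROM results rr
                  WHERE rr.name = r.name AND rr.result = 0)
    ORDER BY id
"""


def split_results(rows):
    """Return (records of all-pass people, records of people with a failure)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE results (id INTEGER, name TEXT, test TEXT, result INTEGER)")
        conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?)", rows)
        passed = conn.execute(ALL_PASSED).fetchall()
        failed = conn.execute(SOME_FAILED).fetchall()
        return passed, failed
    finally:
        conn.close()


if __name__ == "__main__":
    passed, failed = split_results(SAMPLE_RESULTS)
    print("all passed:", passed)
    print("some failed:", failed)
```

Because the two predicates are exact negations of each other, every record lands in exactly one of the two result sets.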

Get the distinct count of values from a table with multiple where clauses

My table structure is this
id last_mod_dt nr is_u is_rog is_ror is_unv
1 x uuid1 1 1 1 0
2 y uuid1 1 0 1 1
3 z uuid2 1 1 1 1
I want the count of rows with:
is_ror=1 or is_rog =1
is_u=1
is_unv=1
All in a single query. Is it possible?
The problem I am facing is that there can be same values for nr as is the case in the table above.
Case statements provide mondo flexibility...
SELECT
sum(case
when is_ror = 1 or is_rog = 1 then 1
else 0
end) FirstCount
,sum(case
when is_u = 1 then 1
else 0
end) SecondCount
,sum(case
when is_unv = 1 then 1
else 0
end) ThirdCount
from MyTable
You can use UNION ALL to get multiple results, e.g.
select count(*) from table1 where is_ror=1 or is_rog=1
union all
select count(*) from table1 where is_u=1
union all
select count(*) from table1 where is_unv=1
Then the result set will contain three rows, each with one of the counts. (A plain UNION would collapse rows whose counts happen to be equal.)
Sounds pretty simple if "all in a single query" does not disqualify subselects:
SELECT
(SELECT COUNT(DISTINCT nr) FROM table1 WHERE is_ror=1 OR is_rog=1) cnt_ror_reg,
(SELECT COUNT(DISTINCT nr) FROM table1 WHERE is_u=1) cnt_u,
(SELECT COUNT(DISTINCT nr) FROM table1 WHERE is_unv=1) cnt_unv;
how about something like
SELECT
SUM(IF(is_u > 0 AND is_rog > 0, 1, 0)) AS count_something,
...
from table
group by nr
I think it will do the trick
I am of course not sure what you want exactly, but I believe you can use the logic to produce your desired result.
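Since the question is ambiguous about whether rows or distinct nr values should be counted, here is a minimal sketch that computes both in one pass over the question's sample rows. It uses SQLite through Python's sqlite3 module; the table name mytable follows the first answer, and the function name flag_counts is hypothetical. The COUNT(DISTINCT CASE ... THEN nr END) form is a swapped-in alternative to the three subselects, relying on COUNT ignoring the NULL that CASE yields when no branch matches.

```python
import sqlite3

# Sample rows from the question:
# (id, last_mod_dt, nr, is_u, is_rog, is_ror, is_unv)
SAMPLE_FLAGS = [
    (1, "x", "uuid1", 1, 1, 1, 0),
    (2, "y", "uuid1", 1, 0, 1, 1),
    (3, "z", "uuid2", 1, 1, 1, 1),
]

# First three columns count rows; last three count distinct nr values.
QUERY = """
    SELECT
        SUM(CASE WHEN is_ror = 1 OR is_rog = 1 THEN 1 ELSE 0 END),
        SUM(CASE WHEN is_u = 1 THEN 1 ELSE 0 END),
        SUM(CASE WHEN is_unv = 1 THEN 1 ELSE 0 END),
        COUNT(DISTINCT CASE WHEN is_ror = 1 OR is_rog = 1 THEN nr END),
        COUNT(DISTINCT CASE WHEN is_u = 1 THEN nr END),
        COUNT(DISTINCT CASE WHEN is_unv = 1 THEN nr END)
    FROM mytable
"""


def flag_counts(rows):
    """Return (row counts, distinct-nr counts), each a 3-tuple per condition."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE mytable (id INTEGER, last_mod_dt TEXT, nr TEXT, "
            "is_u INTEGER, is_rog INTEGER, is_ror INTEGER, is_unv INTEGER)"
        )
        conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
        result = conn.execute(QUERY).fetchone()
        return result[:3], result[3:]
    finally:
        conn.close()


if __name__ == "__main__":
    row_counts, distinct_counts = flag_counts(SAMPLE_FLAGS)
    print("rows:", row_counts)
    print("distinct nr:", distinct_counts)
```

On the sample data the two sets of counts differ for the first two conditions, because uuid1 appears in two rows.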