SQL COUNT with JOIN is returning double results

I have two tables, "event" and "soundType". I am trying to count the number of events that have specific sound types.
This is my query:
SELECT Count(*) AS nb
FROM event
INNER JOIN soundtype
ON event.id = soundtype.eventid
WHERE ( soundtype.NAME = 'pop'
OR soundtype.NAME = 'rock' )
AND ( event.partytype = 'wedding'
OR event.partytype = 'Corporate evening'
OR event.partytype = 'birthday' )
Example of tables below:
event Table
id userId partyType
----------------------------
249 30 birthday
250 30 wedding
SoundType Table
id eventId name
-----------------------
1 249 pop
2 249 rock
3 250 pop
The result
nb
---
3
The result I expect
nb
---
2
Thank you for your help

You might find that exists is more efficient than count(distinct):
SELECT COUNT(*) AS nb
FROM event e
WHERE e.partytype IN ('wedding', 'Corporate evening' , 'birthday') AND
EXISTS (SELECT 1
FROM soundtype st
WHERE st.eventid = e.id AND
st.NAME IN ('pop', 'rock')
) ;
Your problem is (presumably) arising because some events have multiple sound types. You just need to match one of them. Multiplying out all the rows just to use COUNT(DISTINCT) is inefficient, when EXISTS (or IN) prevents the duplicates in the first place.
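To see the difference on the sample data from the question, here is a minimal sketch that loads the two tables into an in-memory SQLite database through Python's sqlite3 module. The choice of SQLite is an assumption made only so the example is runnable; the queries themselves are the ones discussed above.

```python
import sqlite3

# In-memory database seeded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, userId INTEGER, partyType TEXT);
    CREATE TABLE soundType (id INTEGER PRIMARY KEY, eventId INTEGER, name TEXT);
    INSERT INTO event VALUES (249, 30, 'birthday'), (250, 30, 'wedding');
    INSERT INTO soundType VALUES (1, 249, 'pop'), (2, 249, 'rock'), (3, 250, 'pop');
""")

# Plain join: event 249 matches both 'pop' and 'rock', so it is counted twice.
join_count = conn.execute("""
    SELECT COUNT(*)
    FROM event e
    INNER JOIN soundType st ON e.id = st.eventId
    WHERE st.name IN ('pop', 'rock')
      AND e.partyType IN ('wedding', 'Corporate evening', 'birthday')
""").fetchone()[0]

# EXISTS: each event is tested once, however many sound types match.
exists_count = conn.execute("""
    SELECT COUNT(*)
    FROM event e
    WHERE e.partyType IN ('wedding', 'Corporate evening', 'birthday')
      AND EXISTS (SELECT 1
                  FROM soundType st
                  WHERE st.eventId = e.id
                    AND st.name IN ('pop', 'rock'))
""").fetchone()[0]

# COUNT(DISTINCT ...): the join still multiplies rows, the aggregate collapses them.
distinct_count = conn.execute("""
    SELECT COUNT(DISTINCT e.id)
    FROM event e
    INNER JOIN soundType st ON e.id = st.eventId
    WHERE st.name IN ('pop', 'rock')
      AND e.partyType IN ('wedding', 'Corporate evening', 'birthday')
""").fetchone()[0]

print(join_count, exists_count, distinct_count)  # prints: 3 2 2
```

Which of the two correct forms is faster depends on the database and its indexes, so it is worth checking the execution plan on real data.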

You are counting every row the join produces, but you need to count distinct events. So use COUNT(DISTINCT ...):
SELECT COUNT(distinct event.id) AS nb
FROM event
INNER JOIN soundType ON event.id = soundType.eventId
WHERE soundType.name in('pop', 'rock')
AND event.partyType in('wedding', 'Corporate evening', 'birthday')

Related

How to combine multiple complex queries?

There is a query that displays data by id (R_PERS_ACCOUNT_ID) and date (MAX(RBS.CREATE_DATE))
Select rpao.r_pers_account_id, max(rbs.create_date)
from r_base_trans rbs
join r_pers_acc_operation rpao on rbs.r_base_trans_id = rpao.r_base_trans_id
where rbs.create_date between to_date('01.12.2017', 'dd.mm.yyyy') and to_date('31.12.2020', 'dd.mm.yyyy')
and rbs.M_BASE_TRANS_TYPE_ID NOT IN 26
and ROWNUM < 100
group by rpao.r_pers_account_id;
There is a second query that needs to take its input from the previous select. In the WHERE clause, pa.r_pers_account_id should receive the id from the previous result, and to_date('31-01-2018', 'dd-mm-yyyy') should be replaced by the date from the previous result. (In my case, I manually inserted a single value.)
select TP.IIN_BIN,
pa.r_pers_account_id,
pa.close_date,
kbk.kbk_code,
org.code_nk,
org.CODE_TPK,
op.m_operation_type_id,
pa.open_date,
sum(op.amount)
from r_pers_account pa
join r_tax_payer tp on pa.r_tax_payer_id = tp.r_tax_payer_id
join r_pers_acc_operation op on op.r_pers_account_id = pa.r_pers_account_id
join m_kbk kbk on kbk.m_kbk_id = pa.m_kbk_id
join m_tax_org org on org.m_tax_org_id = pa.m_tax_org_id
where pa.r_pers_account_id in (16616864)
and is_charge_fine = 0
and trunc(op.actual_date, 'fmdd') <= to_date('31-01-2018', 'dd-mm-yyyy')
and op.m_operation_type_id = 1
group by tp.IIN_BIN, pa.r_pers_account_id, pa.close_date, kbk.kbk_code, op.m_operation_type_id, org.code_nk,
org.code_tpk, pa.open_date;
This third select also needs the id and date from the first query.
select TP.IIN_BIN,
pa.r_pers_account_id,
pa.close_date,
kbk.kbk_code,
org.code_nk,
org.CODE_TPK,
op.m_operation_type_id,
pa.open_date,
sum(op.amount)
from r_pers_account pa
join r_tax_payer tp on pa.r_tax_payer_id = tp.r_tax_payer_id
join r_pers_acc_operation op on op.r_pers_account_id = pa.r_pers_account_id
join m_kbk kbk on kbk.m_kbk_id = pa.m_kbk_id
join m_tax_org org on org.m_tax_org_id = pa.m_tax_org_id
where pa.r_pers_account_id in (16616864)
and is_charge_fine = 0
and trunc(op.actual_date, 'fmdd') <= to_date('31-01-2018', 'dd-mm-yyyy')
and op.m_operation_type_id = 2
group by tp.IIN_BIN, pa.r_pers_account_id, pa.close_date, kbk.kbk_code, op.m_operation_type_id, org.code_nk,
org.code_tpk, pa.open_date;
These three queries need to be combined into a single select.
In addition, after combining them, rows should be returned only if sum(op.amount) from the second select is negative and sum(op.amount) from the third select is zero or positive.

I don't quite understand what the queries you posted do (2nd and 3rd look just the same to me), but, generally speaking, if you want to "reuse" one query in the queries that follow, there's a useful option: a CTE (common table expression, i.e. the WITH clause, which Oracle calls subquery factoring).
Simplified, it would look like this; I hope you'll manage to apply it to your code:
with
first_query as
(select rpao.r_pers_account_id, ...
from ...
where ...
),
second_query as
(select tp.iin_bin, ...
from first_query f1 join r_pers_account pa on ...
-- this is new: FIRST_QUERY is used here as if it were a table
join ...
where ...
),
third_query as
(select tp.iin_bin, ...
from ...
-- use SECOND_QUERY (and/or FIRST_QUERY, if you have to)
where ...
)
-- Finally: extract data you really need
select ...
from third_query t3
where ...
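As a concrete illustration of the skeleton above, here is a minimal runnable sketch using Python's sqlite3 module. The ops table, its columns and its rows are hypothetical, heavily simplified stand-ins for the question's tables, and SQLite is assumed only to keep the example self-contained. The final filter applies the condition from the question: the type 1 sum must be negative and the type 2 sum must be zero or positive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-in for the question's tables.
    CREATE TABLE ops (account_id INTEGER, op_type INTEGER, amount REAL, op_date TEXT);
    INSERT INTO ops VALUES
        (1, 1, -50, '2018-01-10'), (1, 2, 20, '2018-01-15'),
        (2, 1,  30, '2018-01-05'), (2, 2, 10, '2018-01-20');
""")

rows = conn.execute("""
    WITH last_op AS (      -- plays the role of the first query: id and latest date
        SELECT account_id, MAX(op_date) AS last_date
        FROM ops
        GROUP BY account_id
    ),
    type1 AS (             -- second query, fed with id and date from the first
        SELECT o.account_id, SUM(o.amount) AS total
        FROM ops o
        JOIN last_op l ON l.account_id = o.account_id
                      AND o.op_date <= l.last_date
        WHERE o.op_type = 1
        GROUP BY o.account_id
    ),
    type2 AS (             -- third query, same shape with the other operation type
        SELECT o.account_id, SUM(o.amount) AS total
        FROM ops o
        JOIN last_op l ON l.account_id = o.account_id
                      AND o.op_date <= l.last_date
        WHERE o.op_type = 2
        GROUP BY o.account_id
    )
    SELECT t1.account_id, t1.total AS type1_total, t2.total AS type2_total
    FROM type1 t1
    JOIN type2 t2 ON t2.account_id = t1.account_id
    WHERE t1.total < 0 AND t2.total >= 0
    ORDER BY t1.account_id
""").fetchall()

print(rows)  # prints: [(1, -50.0, 20.0)]
```

Account 2 is filtered out because its type 1 sum is positive. In the real schema each CTE would carry the extra joins and columns from the original queries.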

use distinct within case statement

I have a query that uses multiple left joins, and I am trying to get a SUM of values from one of the joined columns.
SELECT
SUM( case when session.usersessionrun =1 then 1 else 0 end) new_unique_session_user_count
FROM session
LEFT JOIN appuser ON appuser.appid = '6279df3bd2d3352aed591583'
AND appuser.userid = session.userid
LEFT JOIN userdevice ON userdevice.appid = '6279df3bd2d3352aed591583'
AND userdevice.userid = appuser.userid
WHERE session.appid = '6279df3bd2d3352aed591583'
AND (session.uploadedon BETWEEN '2022-04-18 08:31:26' AND '2022-05-18 08:31:26')
But this obviously over-counts session.usersessionrun = 1, since the joins repeat session rows in the result set.
Here the logic was to mark the user as new if the sessionrun for that record is 1.
I grouped by userid and usersessionrun and it shows that the records are repeated.
userid sessionrun count
628212 1 2
627a01 1 4
So what I was trying to do was something like
SUM(CASE distinct(session.userid) AND WHEN session.usersessionrun = 1 THEN 1 ELSE 0 END) new_unique_session_user_count
i.e. for every unique user count, session.usersessionrun = 1 should only be done once.

As you have discovered, JOIN operations can generate combinatorial explosions of data.
You need a subquery to count your sessions by userid. Then you can treat the subquery as a virtual table and JOIN it to the other tables to get the information you need in your result set.
The subquery (nothing in my answer is debugged):
SELECT COUNT(*) new_unique_session_user_count,
session.userid
FROM session
WHERE session.appid = '6279df3bd2d3352aed591583'
AND session.uploadedon BETWEEN '2022-04-18 08:31:26'
AND '2022-05-18 08:31:26'
AND session.usersessionrun = 1
GROUP BY session.userid
This subquery summarizes your session table and has one row per userid. The trick to avoiding JOIN-created combinatorial explosions is using subqueries that generate results with only one row per data item mentioned in a JOIN's ON-clause.
Then, you join it with the other tables like this
SELECT summary.new_unique_session_user_count
FROM (
SELECT COUNT(*) new_unique_session_user_count,
session.userid
FROM session
WHERE session.appid = '6279df3bd2d3352aed591583'
AND session.uploadedon BETWEEN '2022-04-18 08:31:26'
AND '2022-05-18 08:31:26'
AND session.usersessionrun = 1
GROUP BY session.userid
) summary
JOIN appuser ON appuser.appid = '6279df3bd2d3352aed591583'
AND appuser.userid = summary.userid
JOIN userdevice ON userdevice.appid = '6279df3bd2d3352aed591583'
AND userdevice.userid = appuser.userid
There may be better ways to structure this query, but it's hard to guess at them without more information about your table definitions and business rules.
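To make the fan-out concrete, here is a minimal sketch using Python's sqlite3 module. The two tables are hypothetical, stripped-down versions of the question's session and userdevice tables (appuser, appid and the date filter are left out), and SQLite is assumed only so the example runs on its own. It compares the naive SUM over joined rows with two ways of counting each new user once: aggregating to one row per user before counting, and a COUNT(DISTINCT CASE ...), which is close to what the question was reaching for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified versions of the question's tables.
    CREATE TABLE session (userid TEXT, usersessionrun INTEGER);
    CREATE TABLE userdevice (userid TEXT, device TEXT);
    INSERT INTO session VALUES ('u1', 1), ('u1', 2), ('u2', 1), ('u3', 2);
    INSERT INTO userdevice VALUES ('u1', 'phone'), ('u1', 'tablet'), ('u2', 'phone');
""")

# Naive SUM over the joined rows: u1 has two devices, so its first session
# shows up twice and is counted twice.
naive = conn.execute("""
    SELECT SUM(CASE WHEN s.usersessionrun = 1 THEN 1 ELSE 0 END)
    FROM session s
    LEFT JOIN userdevice d ON d.userid = s.userid
""").fetchone()[0]

# Aggregate first: one row per new user, counted before any join can repeat it.
pre_aggregated = conn.execute("""
    SELECT COUNT(*)
    FROM (SELECT userid
          FROM session
          WHERE usersessionrun = 1
          GROUP BY userid) AS summary
""").fetchone()[0]

# DISTINCT inside the aggregate: the CASE yields the userid for first sessions
# and NULL otherwise; COUNT(DISTINCT ...) ignores NULLs and repeated userids.
distinct_case = conn.execute("""
    SELECT COUNT(DISTINCT CASE WHEN s.usersessionrun = 1 THEN s.userid END)
    FROM session s
    LEFT JOIN userdevice d ON d.userid = s.userid
""").fetchone()[0]

print(naive, pre_aggregated, distinct_case)  # prints: 3 2 2
```

Note that joining a one-row-per-user summary to a table with several rows per user (such as userdevice here) would still repeat the summary row, so the count has to be taken before that join or protected with DISTINCT.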

Filter for combination of column values in SQL

I want to find all people who have the same attribute value for certain attributes as another person.
I have the following query:
SELECT
p1.keyValue,
p1.Displayname,
p2.keyValue,
p2.Displayname,
p1.ImportantAttrName,
p1.ImportantAttrValue
FROM Person p1 WITH (NOLOCK)
JOIN Person p2 WITH (NOLOCK)
ON p1.ImportantAttr = p2.ImportantAttr
WHERE p1.keyValue != p2.keyValue
AND p1.ImportantAttrValue = p2.ImportantAttrValue
With this query I get every match twice, because each person appears once as p1 and once as p2.
So the result looks like this:
I123 Freddy Krüger A123 The Horsemen Moviecategorie Horror
A123 The Horsemen I123 Freddy Krüger Moviecategorie Horror
But for analysis purposes it would be nice if I could get each combination of p1.keyValue and p2.keyValue only once, regardless of which of the two columns each value ends up in.
So far I have done this by exporting to Excel and cleaning up there, but is there a way to fix the query so that it does not return these "duplicates"?

Use where p1.keyValue < p2.keyValue:
SELECT
p1.keyValue,
p1.Displayname,
p2.keyValue,
p2.Displayname,
p1.ImportantAttrName,
p1.ImportantAttrValue
FROM Person p1 WITH (NOLOCK)
INNER JOIN Person p2 WITH (NOLOCK)
ON p1.ImportantAttr = p2.ImportantAttr
WHERE
p1.keyValue < p2.keyValue AND -- change is here
p1.ImportantAttrValue = p2.ImportantAttrValue;
This will ensure that you do not see duplicate pairs. To understand numerically why this works, consider two key values, 1 and 2. Using the condition !=, both 1-2 and 2-1 meet the criterion. But using < results in only 1-2.
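A minimal sketch of the same comparison on the question's sample rows, using Python's sqlite3 module. SQLite is assumed only to make the example runnable, so the SQL Server WITH (NOLOCK) hint is dropped, and the Person table is reduced to the four columns the join needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (keyValue TEXT, Displayname TEXT,
                         ImportantAttr TEXT, ImportantAttrValue TEXT);
    INSERT INTO Person VALUES
        ('I123', 'Freddy Krüger', 'Moviecategorie', 'Horror'),
        ('A123', 'The Horsemen', 'Moviecategorie', 'Horror'),
        ('P321', 'The Ironman', 'Generalcategorie', 'Test');
""")

# Same self-join, parameterised only by the comparison operator.
PAIRS_SQL = """
    SELECT p1.keyValue, p2.keyValue
    FROM Person p1
    JOIN Person p2 ON p1.ImportantAttr = p2.ImportantAttr
    WHERE p1.keyValue {op} p2.keyValue
      AND p1.ImportantAttrValue = p2.ImportantAttrValue
    ORDER BY p1.keyValue
"""

both_orders = conn.execute(PAIRS_SQL.format(op="!=")).fetchall()
one_order = conn.execute(PAIRS_SQL.format(op="<")).fetchall()

print(both_orders)  # prints: [('A123', 'I123'), ('I123', 'A123')]
print(one_order)    # prints: [('A123', 'I123')]
```

The < comparison relies on keyValue being unique per person; two rows sharing a keyValue would never be paired with each other.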

You can turn:
on p1.ImportantAttr = p2.ImportantAttr
to:
on p1.ImportantAttr = p2.ImportantAttr and p1.keyValue < p2.keyValue
The whole query could look like this:
SELECT
p1.keyValue,
p1.Displayname,
p2.keyValue,
p2.Displayname,
p1.ImportantAttrName,
p1.ImportantAttrValue
FROM Person p1 WITH (NOLOCK)
JOIN Person p2 WITH (NOLOCK)
ON p1.ImportantAttr = p2.ImportantAttr
AND p1.keyValue < p2.keyValue
WHERE p1.ImportantAttrValue = p2.ImportantAttrValue

This is a different approach, but it can also produce the expected result.
Using COUNT(*) with a PARTITION BY window:
select count(*) over(partition by Attr) as RepeatCount, * from (
select keyValue,DisplayName,ImportantAttr + ' ' +ImportantAttrValue as Attr
from tblTest) tblTemp
The query above returns a result like this:
RepeatCount keyValue DisplayName Attr
------------------------------------------------------
1 P321 The Ironman Generalcategorie Test
2 I123 Freddy Krüger Moviecategorie Horror
2 A123 The Horsemen Moviecategorie Horror
From this result you can keep only the rows where RepeatCount > 1.
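Here is a minimal sketch of that final filtering step, using Python's sqlite3 module with the sample rows shown above. It assumes an SQLite build with window function support (3.25 or later), and it partitions by the two attribute columns directly instead of concatenating them, which is a small variation on the query above. The table is named Person here rather than tblTest.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (keyValue TEXT, DisplayName TEXT,
                         ImportantAttr TEXT, ImportantAttrValue TEXT);
    INSERT INTO Person VALUES
        ('P321', 'The Ironman', 'Generalcategorie', 'Test'),
        ('I123', 'Freddy Krüger', 'Moviecategorie', 'Horror'),
        ('A123', 'The Horsemen', 'Moviecategorie', 'Horror');
""")

# The window count must be computed in a derived table, because a window
# function cannot be referenced in the WHERE clause of the same SELECT.
shared = conn.execute("""
    SELECT keyValue, RepeatCount
    FROM (
        SELECT keyValue,
               COUNT(*) OVER (PARTITION BY ImportantAttr, ImportantAttrValue) AS RepeatCount
        FROM Person
    ) AS counted
    WHERE RepeatCount > 1
    ORDER BY keyValue
""").fetchall()

print(shared)  # prints: [('A123', 2), ('I123', 2)]
```

This lists each person once together with the size of their group, rather than producing pairs, so it scales better when many people share the same attribute value.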

SQL Query returning multiple Duplicate Results

Scenario: I have three tables (Prisoners, AddPaymentTransaction, WithdrawPaymentTransation).
Data in the tables: I have one row in Prisoners with PrisonerID=5 and two rows in each of the other two tables.
I have written a query to return a prisoner's data if they have added a payment to their account or withdrawn a payment from it, on the same day or on different dates.
Here is my query:
select at.PrisonerID ,at.Amount as AAmount,at.Date as ADate,wt.Amount as WAmount,wt.Date as WDate
from Prisoners p, AddPaymentTransaction at,WithdrawPaymentTransation wt
where p.PrisonerID=at.PrisonerID and p.PrisonerID=wt.PrisonerID and at.PrisonerID=wt.PrisonerID and at.PrisonerID=5
But it gives me 4 rows, and 9 rows when I have 3 rows of data in each table, and so on.
I want the rows of data without duplicates. Any suggestions or help will be highly appreciated.

It looks like at.PrisonerID = wt.PrisonerID in your query might be what is causing all of the duplicates. I am guessing AddPaymentTransaction and WithdrawPaymentTransation should not be linked together. So, how about the following:
SELECT at.PrisonerID, at.Amount as AAmount, at.Date as ADate,
wt.Amount as WAmount, wt.Date as WDate
FROM Prisoners p
INNER JOIN AddPaymentTransaction at ON p.PrisonerID = at.PrisonerID
INNER JOIN WithdrawPaymentTransation wt ON p.PrisonerID = wt.PrisonerID
WHERE at.PrisonerID = 5
but this probably isn't going to give you exactly what you are looking for either. So maybe something like the following:
SELECT * FROM
(
SELECT p.PrisonerID, 'AddPayment' AS Type,
apt.Amount as TransAmount, apt.Date AS TransDate
FROM Prisoners p
INNER JOIN AddPaymentTransaction apt ON p.PrisonerID = apt.PrisonerID
WHERE apt.PrisonerID = 5
UNION
SELECT p.PrisonerID, 'WithdrawPayment' AS Type,
wt.Amount as TransAmount, wt.Date as TransDate
FROM Prisoners p
INNER JOIN WithdrawPaymentTransation wt ON p.PrisonerID = wt.PrisonerID
WHERE wt.PrisonerID = 5
) AS mq
ORDER BY mq.TransDate DESC
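The row counts from the question can be reproduced with a minimal sketch in Python's sqlite3 module. The amounts and dates are made up for illustration, and SQLite is assumed only so the example runs on its own. With three deposits and three withdrawals, joining both transaction tables to the prisoner pairs every deposit with every withdrawal (3 x 3 rows), while stacking them returns one row per transaction (3 + 3 rows). The sketch uses UNION ALL rather than UNION, on the assumption that two identical transactions on the same date should both be kept; plain UNION would merge them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Prisoners (PrisonerID INTEGER PRIMARY KEY);
    CREATE TABLE AddPaymentTransaction (PrisonerID INTEGER, Amount INTEGER, Date TEXT);
    CREATE TABLE WithdrawPaymentTransation (PrisonerID INTEGER, Amount INTEGER, Date TEXT);
    INSERT INTO Prisoners VALUES (5);
    -- Made-up sample transactions.
    INSERT INTO AddPaymentTransaction VALUES
        (5, 100, '2020-01-01'), (5, 200, '2020-01-02'), (5, 300, '2020-01-03');
    INSERT INTO WithdrawPaymentTransation VALUES
        (5, 50, '2020-01-04'), (5, 60, '2020-01-05'), (5, 70, '2020-01-06');
""")

# Joining both child tables to the parent pairs every deposit with every withdrawal.
joined = conn.execute("""
    SELECT apt.Amount, wt.Amount
    FROM Prisoners p
    INNER JOIN AddPaymentTransaction apt ON p.PrisonerID = apt.PrisonerID
    INNER JOIN WithdrawPaymentTransation wt ON p.PrisonerID = wt.PrisonerID
    WHERE p.PrisonerID = 5
""").fetchall()

# Stacking the two tables keeps one row per transaction.
stacked = conn.execute("""
    SELECT * FROM (
        SELECT p.PrisonerID, 'AddPayment' AS Type,
               apt.Amount AS TransAmount, apt.Date AS TransDate
        FROM Prisoners p
        INNER JOIN AddPaymentTransaction apt ON p.PrisonerID = apt.PrisonerID
        WHERE apt.PrisonerID = 5
        UNION ALL
        SELECT p.PrisonerID, 'WithdrawPayment' AS Type,
               wt.Amount AS TransAmount, wt.Date AS TransDate
        FROM Prisoners p
        INNER JOIN WithdrawPaymentTransation wt ON p.PrisonerID = wt.PrisonerID
        WHERE wt.PrisonerID = 5
    ) AS mq
    ORDER BY mq.TransDate DESC
""").fetchall()

print(len(joined), len(stacked))  # prints: 9 6
print(stacked[0])                 # prints: (5, 'WithdrawPayment', 70, '2020-01-06')
```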

SQL to select parent that contains child specific value

I am actually creating a crystal reports v12 (2008) report but can't find the method, using Crystal, to extract the following. I thought if someone might answer in SQL language, I could piece it together.
2 Tables: hbmast, ddmast
SELECT hbmast.custno, hbmast.id, ddmast.name, ddmast.status
WHERE hbmast.custno = ddmast.custno
GROUP BY hbmast.id
Pseudocode: show all hbmast values that have ddmast.status = '2'
Sample output:
J0001, 111222, PAUL JONES, 1
111222, PAUL JONES, 2
111222, PAUL JONES, 1
K0001, 555333, PETER KING, 3
555333, PETER KING, 1
I would like to have Paul show on the report with all child records but Peter should not be returned on the report since he has no child records with '2' for ddmast.status field.
Thanks for the help

I think you're looking for this:
select hb.custno, hb.id, dd.name, dd.status from hbmast hb
join ddmast dd on hb.custno = dd.custno
where hb.custno in (
select custno from ddmast
where status = '2'
)
Let me know if this returns your expected result.
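A quick way to check this against the sample output from the question is the following minimal sketch with Python's sqlite3 module. SQLite and the column types are assumptions made for the sake of a runnable example; the rows are the ones listed in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hbmast (custno TEXT, id INTEGER);
    CREATE TABLE ddmast (custno TEXT, name TEXT, status TEXT);
    INSERT INTO hbmast VALUES ('J0001', 111222), ('K0001', 555333);
    INSERT INTO ddmast VALUES
        ('J0001', 'PAUL JONES', '1'),
        ('J0001', 'PAUL JONES', '2'),
        ('J0001', 'PAUL JONES', '1'),
        ('K0001', 'PETER KING', '3'),
        ('K0001', 'PETER KING', '1');
""")

# Keep every child row of a parent, provided at least one child has status '2'.
report = conn.execute("""
    SELECT hb.custno, hb.id, dd.name, dd.status
    FROM hbmast hb
    JOIN ddmast dd ON hb.custno = dd.custno
    WHERE hb.custno IN (SELECT custno FROM ddmast WHERE status = '2')
    ORDER BY dd.rowid
""").fetchall()

for row in report:
    print(row)
# prints:
# ('J0001', 111222, 'PAUL JONES', '1')
# ('J0001', 111222, 'PAUL JONES', '2')
# ('J0001', 111222, 'PAUL JONES', '1')
```

Because the status test sits in a subquery rather than in a second join, a customer with several status '2' rows would still be listed without repeated child rows.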

The way to achieve this in Crystal would be to have your hb and dd tables then a second alias of the dd table.
So you would filter your dd alias table where status = 2 then join to your hb table and back to your dd table (not the alias). The SQL would end up looking like:
select hb.custno, hb.id, dd.name, dd.status from hbmast hb
inner join ddmast dd on hb.custno = dd.custno
inner join ddmast dd2 on hb.custno = dd2.custno
where dd2.status = '2'
Andomar makes a valid point about duplicate records appearing if there is more than 1 record per group with a status of 2. If that is the case you can either group by primary key and show row information at group footer level OR use a sql expression with a subquery in your selection formula instead of the double join method.
SQL Expression: (select count(*) from ddmast where custno = "hbmast.custno" and status = '2')
Then record selection expert: {%sqlexpression} > 0

And a different way to get the same...
SELECT hb.custno, hb.id, dd.name, dd.status
FROM hbmast hb
INNER JOIN ddmast dd
on hb.custno = dd.custno
INNER JOIN ddmast dd2 -- second alias of the same ddmast table
on dd2.custno = hb.custno
AND dd2.status = '2'
