Group by in SQL Server giving wrong count - sql

I have a query which works, goes like this:
Select
count(InsuranceOrderLine.AntallPotensiale) as potensiale,
COUNT(InsuranceOrderLine.AntallSolgt) as Solgt,
InsuranceProduct.Name,
InsuranceProductCategory.Name as Kategori
From
InsuranceOrderLine, InsuranceProduct, InsuranceProductCategory
where
InsuranceOrderLine.FKInsuranceProductId = InsuranceProduct.InsuranceProductID
and InsuranceProduct.FKInsuranceProductCategory = InsuranceProductCategory.InsuranceProductCategoryID
Group by
InsuranceProduct.name, InsuranceProductCategory.Name
This query over returns what I need, but when I try to add more table (InsuranceOrder) to be able to get the regardingUser column, then all the count values are way high.
Select
count(InsuranceOrderLine.AntallPotensiale) as Potensiale,
COUNT(InsuranceOrderLine.AntallSolgt) as Solgt,
InsuranceProduct.Name,
InsuranceProductCategory.Name as Kategori,
RegardingUser
From
InsuranceOrderLine, InsuranceProduct, InsuranceProductCategory, InsuranceSalesLead
where
InsuranceOrderLine.FKInsuranceProductId = InsuranceProduct.InsuranceProductID
and InsuranceProduct.FKInsuranceProductCategory = InsuranceProductCategory.InsuranceProductCategoryID
Group by
InsuranceProduct.name, InsuranceProductCategory.Name,RegardingUser
Thanks in advance

You're adding one more table to your FROM statement, but you don't specify any JOIN condition for that table - so your previous result set will do a FULL OUTER JOIN (cartesian product) with your new table! Of course you'll get duplication of data....
That's one of the reasons that I'm recommending never to use that old, legacy style JOIN - do not simply list a comma-separated bunch of tables in your FROM statement.
Always use the new ANSI standard JOIN syntax with INNER JOIN, LEFT OUTER JOIN and so on:
SELECT
count(iol.AntallPotensiale) as Potensiale,
COUNT(iol.AntallSolgt) as Solgt,
ip.Name,
ipc.Name as Kategori,
isl.RegardingUser
FROM
dbo.InsuranceOrderLine iol
INNER JOIN
dbo.InsuranceProduct ip ON iol.FKInsuranceProductId = ip.InsuranceProductID
INNER JOIN
dbo.InsuranceProductCategory ipc ON ip.FKInsuranceProductCategory = ipc.InsuranceProductCategoryID
INNER JOIN
dbo.InsuranceSalesLead isl ON ???????? -- JOIN condition missing here !!
When you do this, you first of all see right away that you're missing a JOIN condition here - how is this new table InsuranceSalesLead linked to any of the other tables already used in this SQL statement??
And secondly, your intent is much clearer, since the JOIN conditions linking the tables are where they belong - right with the JOIN - and don't clutter up your WHERE clauses ...

It looks like you added the table join which slightly multiplies count of rows - make sure, that you properly joining the table. And be careful with aggregate functions over several joined tables - joins very often lead to duplicates

Related

ORA-01417 - Two outer joins error. New join syntax?

I never got my head around the "new" SQL join syntax, and therefore use the "old" join system, with the (+). I know it's about time I learned it - however I just find the old syntax a lot more intuitive, especially when working with multiple tables with multiple joins.
However I now have an operation which requires two outer joins on the same table. My code is:
SELECT
C.ID,
R.VALUE,
R.LOG_ID,
LOG.ACTION
FROM
C,
R,
LOG
WHERE
C.DELETED IS NULL
AND R.DELETED IS NULL
-- Two joins below
AND R.C_ID(+) = C.ID
AND R.LOG_ID(+) = LOG.ID
However this results in an error:
ORA-01417 - A table may be outer joined to at most one table.
Searching for this error I find that the solution is to use the new syntax For example this answer on SO:
Outer join between three tables causing Oracle ORA-01417 error
So I am aware that some may consider this question a duplicate as it technically already has an answer. However the "old" syntax posed in that question does not contain exactly the same number of tables and joins as I have here, and try as I might, I'm not sure how I would factor this in to my own code.
Is anyone able to assist? Thanks.
I think you want:
SELECT C.ID, R.VALUE, R.LOG_ID, LOG.ACTION
FROM C LEFT JOIN
R
ON R.C_ID = C.ID LEFT JOIN
LOG
ON R.LOG_ID = LOG.I
WHERE C.DELETED IS NULL AND
R.DELETED IS NULL;
The "new" (it is 25 years old) outer join syntax is actually very easy to follow, particularly for a simple example with just LEFT JOIN.
The idea is you want to keep all rows from one table (perhaps subject to filters in the WHERE clause). That is the first table. Then you use a chain of LEFT JOIN to bring in other tables.
All rows from the first table are in the result set. If there are matching rows in the other tables, then columns from those tables come from matching rows. If there are no matches, then the row from the first table is kept.

Access SQL subquery access field from parent

I have a SQL query that works in Access 2016:
SELECT
Count(*) AS total_tests,
Sum(IIf(score>=securing_threshold And score<mastering_threshold,1,0)) AS total_securing,
Sum(IIf(score>=mastering_threshold,1,0)) AS total_mastering,
total_securing/Count(*) AS percent_securing,
total_mastering/Count(*) AS percent_mastering,
(Count(*)-total_securing-total_mastering)/Count(*) AS percent_below,
subjects.subject,
students.year_entered,
IIf(Month(Date())<9,Year(Date())-students.year_entered+6,Year(Date())-students.year_entered+7) AS current_form,
groups.group
FROM
((subjects
INNER JOIN tests ON subjects.ID = tests.subject)
INNER JOIN (students
INNER JOIN test_results ON students.ID = test_results.student) ON tests.ID = test_results.test)
LEFT JOIN
(SELECT * FROM group_membership LEFT JOIN groups ON group_membership.group = groups.ID) As g
ON students.ID = g.student
GROUP BY subjects.subject, students.year_entered, groups.group;
However, I wish to filter out irrelevant groups before joining them to my table. The table groups has a column subject which is a foreign key.
When I try changing ON students.ID = g.student to ON students.ID = g.student And subjects.ID = g.subject I get the error 'JOIN expression not supported'.
Alternatively, when I try adding WHERE subjects.ID = groups.subject to the subquery, it asks me for the parameter value of subjects.ID, although it is a column in the parent query.
Googling reveals many similar errors but they were all resolved by changing the brackets. That didn't help.
Just in case the table relationships help:
Thank you.
EDIT: Sample database at https://www.dropbox.com/s/yh80oooem6gsni7/student%20tracker.ACCDB?dl=0
MS Access queries with many joins are difficult to update by SQL alone as parenetheses pairings are required unlike other RDBMS's and these pairings must follow an order. Moreover, some pairings can even be nested. Hence, for beginners it is advised to build queries with complex and many joins using the query design GUI in the MS Office application and let it build out the SQL.
For a simple filter on the g derived table, you could filter subject on the derived table, g, but likely you want want all subjects:
...
(SELECT * FROM group_membership
LEFT JOIN groups ON group_membership.group = groups.ID
WHERE groups.subject='Earth Science') As g
...
So for all subjects, consider re-building query from scratch in GUI that nearly mirrors your table relationships which actually auto-links joins in the GUI. Then, drop unneeded tables.
Usually you want to begin with the join table or set like groups and group_membership or tests and test_results. In fact, consider saving the g derived table as its own query.
Then add the distinct record primary source tables like students and subjects.
You may even need to play around with order in FROM and JOIN clauses to attain desired results, and maybe even add the same table in query. And be careful with adding join tables like group_membership (two one-to-many links), to GROUP BY queries as it leads to the duplicate record aggregation. So you may need to join aggregates queries by subject.
Unless you can post content of all tables, from our perspective it is difficult to help from here.
Your subquery g uses a LEFT JOIN, but there is a enforced 1:n relation between the two tables, so there will always be a matching group. Use a INNER JOIN instead.
With g.subject you are trying to join on a column that is on the right side of a left join, that cannot really work.
Also you shouldn't use SELECT * on a join of tables with identical column names. Include only the qualified column names that you need.
LEFT JOIN
(SELECT group_membership.student, groups.group, groups.subject
FROM group_membership INNER JOIN groups
ON group_membership.group = groups.ID) As g
ON (students.ID = g.student AND subjects.ID = g.subject)
I would call the columns in group_membership group_ID and student_ID to avoid confusion.
I don't have the database to test, but I would use subject table as subquery:
(SELECT * FROM subject WHERE filter out what you don't need) Subj
Then INNER JOIN this new Subj Table in your query which would exclude irrelevant groups.
Also I would never create join in WHERE clause (WHERE subjects.ID = groups.subject), what this does it creates cartesian product (table with all the possible combinations of subjects.ID and groups.subject) then it filters out records to satisfy your join. When dealing with huge data it might take forever or crash.
Error related to "Join expression may not be supported"; do datatypes match in those fields?
I solved it by (a lot of trial and error and) taking the advice here to make queries in the GUI and joining them. The end result is 4 queries deep! If the database was bigger, performance would be awful... but now its working I can tweak it eventually.
Thank you everybody!

multiple sql joins not producing desired results

I'm new to sql and trying to tweak someone else's huge stored procedure to get a subset of the results. The code below is maybe 10% of the whole procedure. I added the lp.posting_date, last left join, and the where clause. Trying to get records where the posting date is between the start date and the end date. Am I doing this right? Apparently not because the results are unaffected by the change. UPDATE: I CHANGED THE LAST JOIN. The results are correct if there's only one area allocation term. If there is more than one area allocation term, the results are duplicated for each term.
SELECT Distinct
l.lease_id ,
l.property_id as property_id,
l.lease_number as LeaseNumber,
l.name as LeaseName,
lty.name as LeaseType,
lst.name as LeaseStatus,
l.possession_date as PossessionDate,
l.rent as RentCommencementDate,
l.store_open_date as StoreOpenDate,
msr.description as MeasureUnit,
l.comments as Comments ,
lat.start_date as atStartDate,
lat.end_date as atEndDate,
lat.rentable_area as Rentable,
lat.usable_area as Usable,
laat.start_date as aatStartDate,
laat.end_date as aatEndDate,
MK.Path as OrgPath,
CAST(laa.percentage as numeric(9,2)) as Percentage,
laa.rentable_area as aaRentable,
laa.usable_area as aaUsable,
laa.headcounts as Headcount,
laa.area_allocation_term_id,
lat.area_term_id,
laa.area_allocation_id,
lp.posting_date
INTO #LEASES FROM la_tbl_lease l
INNER JOIN #LEASEID on l.lease_id=#LEASEID.lease_id
INNER JOIN la_tbl_lease_term lt on lt.lease_id=l.lease_id and lt.IsDeleted=0
LEFT JOIN la_tlu_lease_type lty on lty.lease_type_id=l.lease_type_id and lty.IsDeleted=0
LEFT JOIN la_tlu_lease_status lst on lst.status_id= l.status_id
LEFT JOIN la_tbl_area_group lag on lag.lease_id=l.lease_id
LEFT JOIN fnd_tlu_unit_measure msr on msr.unit_measure_key=lag.unit_measure_key
LEFT JOIN la_tbl_area_term lat on lat.lease_id=l.lease_id and lat.isDeleted=0
LEFT JOIN la_tbl_area_allocat_term laat on laat.area_term_id=lat.area_term_id and laat.isDeleted=0
LEFT JOIN dbo.la_tbl_area_allocation laa on laa.area_allocation_term_id=laat.area_allocation_term_id and laa.isDeleted=0
LEFT JOIN vw_FND_TLU_Menu_Key MK on menu_type_id_key=2 and isActive=1 and id=laa.menu_id_key
INNER JOIN la_tbl_lease_projection lp on lp.lease_projection_id = #LEASEID.lease_projection_id
where lp.posting_date <= laat.end_date and lp.posting_date >= laat.start_date
As may have already been hinted at you should be careful when using the WHERE clause with an OUTER JOIN.
The idea of the OUTER JOIN is to optionally join that table and provide access to the columns.
The JOINS will generate your set and then the WHERE clause will run to restrict your set. If you are using a condition in the WHERE clause that says one of the columns in your outer joined table must exist / equal a value then by the nature of your query you are no longer doing a LEFT JOIN since you are only retrieving rows where that join occurs.
Shorten it and copy it out as a new query in ssms or whatever you are using for testing. Use an inner join unless you want to preserve the left side set even when there is no matching lp.lease_id. Try something like
if object_id('tempdb..#leases) is not null
drop table #leases;
select distinct
l.lease_id
,l.property_id as property_id
,lp.posting_date
into #leases
from la_tbl_lease as l
inner join la_tbl_lease_projection as lp on lp.lease_id = l.lease_id
where lp.posting_date <= laat.end_date and lp.posting_date >= laat.start_date
select * from #leases
drop table #leases
If this gets what you want then you can work from there and add the other left joins to the query (getting rid of the select * and 'drop table' if you copy it back into your proc). If it doesn't then look at your Boolean date logic or provide more detail for us. If you are new to sql and its procedural extensions, try using the object explorer to examine the properties of the columns you are querying, and try selecting the top 1000 * from the tables you are using to get a feel for what the data looks like when building queries. -Mike
You can try the BETWEEN operator as well
Where lp.posting_date BETWEEN laat.start_date AND laat.end_date
Reasoning: You can have issues wheres there is no matching values in a table. In that instance on a left join the table will populate with null. Using the 'BETWEEN' operator insures that all returns have a value that is between the range and no nulls can slip in.
As it turns out, the problem was easier to solve and it was in a different place in the stored procedure. All I had to do was add one line to one of the cursors to include area term allocations by date.

Successive LEFT SQL joins back to original

I need to redo sql statement in legacy Foxpro application and don't understand whether it is meaningful at all. Syntax is a bit specific - it extracts data from temporary table into the same temporary table ( overwriting) with some joins.
SELECT aa.*,b.spa_date FROM (ALIAS()) aa INNER JOIN jobs ON aa.seq=jobs.seq ;
LEFT JOIN job2 ON jobs.job_no=job2.rucjob;
left join jobs b on b.job_no=job2.job_no;
WHERE jobs.qty1<>0 INTO CURSOR (ALIAS())
Since only one field is added from joined tables ( spa_date ) is there any point in 2 left joins or I am missing something. Isn't it equivalent to
SELECT aa.*,jobs.spa_date FROM (ALIAS()) aa INNER JOIN jobs ON aa.seq=jobs.seq ;
WHERE jobs.qty1<>0 INTO CURSOR (ALIAS())
They are different because b.spa_date come from the second left join. You may be missing filtered rows without both left joins.
You would need to know the intent of the original query and perhaps rewrite it to make more sense but I'd say the two queries are different.

Possible to combine two tables without losing all data by using JOINS

I have a table as below and I would like to know if I can still join them together, without losing existing data from both tables when they are combined by referencing JOIN methods.
Table details - VIEW Table
SELECT
r.domainid,
r.DomainStart,
r.Domain_End,
r.ddid,
r.confid,
r.pdbcode,
r.chainid,
d.pdbcode AS "CATH_PDBCODE",
d.cathbegin AS "CATH_BEGIN",
d.cathend AS "CATH_END"
FROM dyndom_domain_table r
JOIN cath_domains d ON d.pdbcode::character(4) = r.pdbcode
ORDER BY confid ASC;
As you can see, dyndom_domain_table is a VIEW Table that I have created to make it easier for me to use JOIN clauses with the other table that has the same pdbcode.
So far it just returns all of the data that matches with the PDB Code. What I would like to do is return all of the data that both matches and doesn't match each other's PDB Code.
Is there a rule in which I can apply it to? Or is it not possible?
I believe you want a FULL OUTER JOIN rather than just a JOIN (which is, by default, an INNER JOIN). In a FULL OUTER JOIN, every row in each table will correspond to some row in the result table; rows from one table that don't match the other will be extended with NULLs to fill the missing column.
So FULL OUTER JOIN instead of just JOIN, and that should do you.
I think you're asking for a left join, but I'm not sure.
SELECT
r.domainid,
r.DomainStart,
r.Domain_End,
r.ddid,
r.confid,
r.pdbcode,
r.chainid,
d.pdbcode AS "CATH_PDBCODE",
d.cathbegin AS "CATH_BEGIN",
d.cathend AS "CATH_END"
FROM dyndom_domain_table r
LEFT JOIN cath_domains d ON d.pdbcode::character(4) = r.pdbcode
ORDER BY confid ASC;