GROUP BY shows the same group more than once when using CASE - sql

I'm having an issue with a CASE expression in T-SQL.
Here is the query:
Select
CASE WHEN cri.ChartRetrievalMethodID IS NULL THEN wfseg.SiteEventGroupID
ELSE cri.ChartRetrievalMethodID END as Type,
count(distinct c.chartid) TotalCharts
From Sites s LEFT JOIN Charts c ON s.SiteID=c.SiteID
LEFT JOIN ChartRetrievalInformation cri ON c.ChartID=cri.ChartID
LEFT JOIN WFSiteEvents wfse ON wfse.SiteID=s.siteid
LEFT JOIN WFSiteEventTypes wfset ON wfset.EventTypeID=wfse.EventTypeID
LEFT JOIN WFSiteEventGroups wfseg ON wfset.SiteEventGroupID=wfseg.SiteEventGroupID
Where
wfse.EventStatusID in (1,2)
and s.ProjectID=110
group by
cri.ChartRetrievalMethodID, wfseg.SiteEventGroupID
I'm getting multiple rows for the same Type instead of one combined row. For example:
+------+--------------+
| Type | Total Charts |
+------+--------------+
| 3 | 28 |
| 3 | 3 |
+------+--------------+
Ideally I would like these two rows mashed together to be just one:
+------+--------------+
| Type | Total Charts |
+------+--------------+
| 3 | 31 |
+------+--------------+
I'm sure there is something I'm writing incorrectly, but I can't seem to see what it is.

If you add cri.ChartRetrievalMethodID and wfseg.SiteEventGroupID to the column list of your select statement, you will see why there are multiple rows: the query groups by that pair of columns, and two different pairs can produce the same Type value.
What you want to do is group by the value you're calling Type. In some other DBMSs (MySQL and PostgreSQL, for example) this would be as simple as GROUP BY Type, but in SQL Server you must repeat the full CASE expression in the GROUP BY clause.
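To make the difference concrete, here is a minimal sketch that runs both groupings against an in-memory SQLite database. The charts table, its columns, and the three sample rows are made up for illustration, and SQLite only stands in for SQL Server here; the point is the shape of the GROUP BY clause, not the engine.

```python
import sqlite3

# Hypothetical, simplified data: one row per chart with the two candidate keys.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE charts (chart_id INTEGER, method_id INTEGER, group_id INTEGER);
    INSERT INTO charts VALUES
        (1, 3, 1),
        (2, 3, 1),
        (3, NULL, 3);
""")

# Grouping by the raw columns: (3, 1) and (NULL, 3) are separate groups,
# even though the CASE expression evaluates to 3 for both.
by_columns = conn.execute("""
    SELECT CASE WHEN method_id IS NULL THEN group_id ELSE method_id END AS Type,
           COUNT(DISTINCT chart_id) AS TotalCharts
    FROM charts
    GROUP BY method_id, group_id
    ORDER BY TotalCharts
""").fetchall()

# Grouping by the repeated CASE expression: one row per Type value.
by_expression = conn.execute("""
    SELECT CASE WHEN method_id IS NULL THEN group_id ELSE method_id END AS Type,
           COUNT(DISTINCT chart_id) AS TotalCharts
    FROM charts
    GROUP BY CASE WHEN method_id IS NULL THEN group_id ELSE method_id END
""").fetchall()

print(by_columns)     # two rows, both with Type 3
print(by_expression)  # a single row for Type 3
```

With this sample data the first query returns two rows for Type 3 and the second returns one, which mirrors the 28 + 3 versus 31 situation in the question.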

Related

Having two SQL Server related tables, select complete rows of one and partial of other

In MS Access this was very easy to accomplish, but I'm having trouble with SQL Server.
I have this query:
SELECT Organigrama.Item, Organigrama.Id, Organigrama.ParentItem, Rol_Menu.Cod_Rol
FROM Rol_Menu RIGHT JOIN
Organigrama ON Rol_Menu.Cod_Menu = Organigrama.Id
WHERE (Rol_Menu.Cod_Rol = '5')
The goal is to get every item in Organigrama: items that have a matching row in Rol_Menu with Cod_Rol = 5 should show 5, and all the others should show NULL.
I need this to fill a menu structure into a treeview.
When the user selects another role, only the nodes that role has access to should be checked.
I decide whether to check a node by testing whether Cod_Rol is non-null in the row, so the query needs to return
something like this:
Item | Id | ParentItem | Cod_Rol
A | 3 | null | 5
B | 4 | A | 5
C | 5 | A | null
D | 6 | B | 5
E | 7 | C | null
F | 8 | E | null
I think you just need to include the extra restriction in the join criteria rather than the where clause. The join criteria are evaluated before the outer join adds the null columns. The where clause is evaluated afterwards, and eliminates the rows with nulls.
select
Organigrama.Item,
Organigrama.Id,
Organigrama.ParentItem,
Rol_Menu.Cod_Rol
from
Rol_Menu
right join
Organigrama
on Rol_Menu.Cod_Menu = Organigrama.Id and
Rol_Menu.Cod_Rol = '5'
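Either that, or add "or Rol_Menu.Cod_Rol is null" to the end of the where clause. Note that this alternative still drops any Organigrama item whose only Rol_Menu rows belong to other roles, so the join-criteria version is the safer one.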
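Here is a rough sketch of the three variants side by side, run against an in-memory SQLite database. The sample rows (including the role '7') are invented for illustration, and the join is written as a LEFT JOIN with Organigrama on the left, since older SQLite versions do not support RIGHT JOIN; the logic is the same as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Organigrama (Item TEXT, Id INTEGER, ParentItem TEXT);
    CREATE TABLE Rol_Menu (Cod_Rol TEXT, Cod_Menu INTEGER);
    INSERT INTO Organigrama VALUES ('A', 3, NULL), ('B', 4, 'A'), ('C', 5, 'A');
    -- Hypothetical data: item C is linked only to a different role ('7').
    INSERT INTO Rol_Menu VALUES ('5', 3), ('5', 4), ('7', 5);
""")

base = """
    SELECT o.Item, r.Cod_Rol
    FROM Organigrama o
    LEFT JOIN Rol_Menu r ON r.Cod_Menu = o.Id {on_extra}
    {where}
    ORDER BY o.Id
"""

# Filter in WHERE: applied after the outer join, so non-matching items vanish.
filter_in_where = conn.execute(
    base.format(on_extra="", where="WHERE r.Cod_Rol = '5'")).fetchall()

# Filter in ON: applied while matching, so every Organigrama row survives.
filter_in_on = conn.execute(
    base.format(on_extra="AND r.Cod_Rol = '5'", where="")).fetchall()

# WHERE ... OR IS NULL: C matched a row for role '7', so it is still dropped.
where_or_null = conn.execute(
    base.format(on_extra="",
                where="WHERE r.Cod_Rol = '5' OR r.Cod_Rol IS NULL")).fetchall()

print(filter_in_where)
print(filter_in_on)
print(where_or_null)
```

Only the version with the condition in the ON clause returns item C (with a NULL role) for this data.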

prevent from double/triple SUMing when JOINing

I am joining two tables: accn_demographics and accn_payments. The relationship between the two tables is one-to-many, between accn_demographics.accn_id and accn_payments.accn_id.
My question is: when I sum PAID_AMT and COPAY_AMT, I get double, triple, or quadruple the number that I should be getting.
Is there an obvious problem with my join condition?
select sum(p.paid_amt) as SumPaidAmount
, sum(p.copay_amt) as SumCoPay
, p.pmt_date
, d.load_Date
, p.ACCN_ID
from accn_payments p
join
(
select distinct load_date, accn_id
from accn_demographics
) d
on p.ACCN_ID=d.ACCN_ID
where p.POSTED='Y'
and p.pmt_date between '20120701' and '20120731'
group by p.pmt_date, d.load_Date,p.ACCN_ID
order by 3 desc
thanks so much for your guidance.
You need to do the summation in a subquery:
select sum(p.SumPaidAmount) as SumPaidAmount, sum(p.SumCoPay) as SumCoPay,
p.pmt_date, d.load_Date, p.ACCN_ID
from (select p.accn_id, p.pmt_date, sum(p.paid_amt) as SumPaidAmount, -- alias must match the outer reference
sum(p.copay_amt) as SumCoPay
from accn_payments p
where p.POSTED='Y' and
p.pmt_date between '20120701' and '20120731'
group by p.accn_id, p.pmt_date
) p join
(select distinct load_date, accn_id from accn_demographics) d
on p.ACCN_ID=d.ACCN_ID
group by p.pmt_date, d.load_Date, p.ACCN_ID
order by 3 desc
Question: do you really intend for pmt_date to be in the final results? It looks like you want to remove it from both the outer SELECT and the subquery.
The only thing I can see is that (select distinct load_date, accn_id from accn_demographics) might return several matches per account. Look at your data and run a separate query:
select distinct load_date, accn_id from accn_demographics WHERE accn_id=SomeID
where SomeID is one of the result accounts that is returning double/triple values. That should pinpoint your problem.
Yes, but it's not so obvious for beginners. What happens is that for every accn_payments record, you're matching on ONLY the accn_id, which means if there are multiple records in accn_demographics for that particular accn_id, then you will get duplicate accn_payment records due to the join. Is there another limiting field on accn_demographics to join back to the payments?
Ultimately, think of it this way:
accn_payments (p):
accn_id | paid_amt | copay_amt | ...
--------------------------------------
1       | 100.00   | 20.00     | ...
accn_demographics (d):
accn_id | load_date  | ...
---------------------------
1       | 2012/01/01 | ...
1       | 2012/03/05 | ...
1       | 2012/06/23 | ...
After joining, your results will look like this:
p.accn_id | p.paid_amt | p.copay_amt | p... | d.accn_id | d.load_date | d...
-----------------------------------------------------------------------------
1         | 100.00     | 20.00       | ...  | 1         | 2012/01/01  | ...
1         | 100.00     | 20.00       | ...  | 1         | 2012/03/05  | ...
1         | 100.00     | 20.00       | ...  | 1         | 2012/06/23  | ...
As you can see, the same row from accn_payments gets replicated for every matching accn_demographics record, since you specified only the accn_id column as the join criteria. The DB engine can't limit the results any further, so it says "Hey, look, this p record matches all these d records, this must be what he was asking for!" That is obviously not what was intended: when you sum p.paid_amt and p.copay_amt, the sum runs over ALL ROWS, duplicates included.
Ultimately, see if you can limit the join criteria for accn_demographics even further (by some date, perhaps), that way you limit the number of duplicate payment records during the join.
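The replication described above can be reproduced with a small sketch using an in-memory SQLite database. The tables are cut down to the columns from the illustration, and collapsing accn_demographics to one row per accn_id is just one possible way to limit the join (an assumption on my part, not necessarily what fits the real data).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accn_payments (accn_id INTEGER, paid_amt REAL, copay_amt REAL);
    CREATE TABLE accn_demographics (accn_id INTEGER, load_date TEXT);
    INSERT INTO accn_payments VALUES (1, 100.0, 20.0);
    INSERT INTO accn_demographics VALUES
        (1, '2012-01-01'), (1, '2012-03-05'), (1, '2012-06-23');
""")

# One payment row joined to three demographics rows is summed three times.
inflated = conn.execute("""
    SELECT SUM(p.paid_amt), SUM(p.copay_amt)
    FROM accn_payments p
    JOIN accn_demographics d ON p.accn_id = d.accn_id
""").fetchone()

# Collapsing demographics to one row per account removes the fan-out.
deduplicated = conn.execute("""
    SELECT SUM(p.paid_amt), SUM(p.copay_amt)
    FROM accn_payments p
    JOIN (SELECT DISTINCT accn_id FROM accn_demographics) d
      ON p.accn_id = d.accn_id
""").fetchone()

print(inflated)      # three times the real totals
print(deduplicated)  # the real totals
```

If load_date has to appear in the output, the join condition needs some extra column that picks a single demographics row per payment, as suggested above.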

SQL: how to make a GROUP BY query considerin' also empty fields

I have to group some data into categories, based on the column "qualifier" that could be 1,2,3,4, or empty.
The problem is that "empty" is not considered into the "group by" categories.
Here's my query:
SELECT m.id, m.name, COUNT (*)
FROM _gialli_g2bff_distinct AS g
INNER JOIN flag.qualifier_flags AS m ON g.qualifier = m.id
GROUP BY m.name, m.id
ORDER BY m.id;
Here's the answer:
1 | "NOT_CONTRIBUTES_TO" | 2
2 | "CONTRIBUTES_TO" | 411
3 | "COLOCALIZES_WITH" | 200
4 | "NOT" | 983
The problem of this answer is that it does not take into account all the elements that have qualifier field EMPTY.
Here's what I would like to have as answer:
1 | "NOT_CONTRIBUTES_TO" | 2
2 | "CONTRIBUTES_TO" | 411
3 | "COLOCALIZES_WITH" | 200
4 | "NOT" | 983
5 | | 1854
How could I modify my query?
Thanks
Your problem is not occurring at the GROUP BY level, but rather in the JOIN. The rows with a NULL qualifier cannot be matched by the join condition and, because you're using INNER JOIN, they fall out of the result set.
Use LEFT OUTER JOIN to see all the rows.
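As a quick illustration, the sketch below compares the two join types on an in-memory SQLite database. The annotations table and its six rows are made up, and only two of the four flags are included to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE qualifier_flags (id INTEGER, name TEXT);
    CREATE TABLE annotations (qualifier INTEGER);  -- hypothetical stand-in table
    INSERT INTO qualifier_flags VALUES (1, 'NOT_CONTRIBUTES_TO'), (2, 'CONTRIBUTES_TO');
    INSERT INTO annotations VALUES (1), (2), (2), (NULL), (NULL), (NULL);
""")

query = """
    SELECT m.id, m.name, COUNT(*)
    FROM annotations AS g
    {join} qualifier_flags AS m ON g.qualifier = m.id
    GROUP BY m.name, m.id
    ORDER BY m.id IS NULL, m.id  -- put the NULL group last
"""

# INNER JOIN: rows with a NULL qualifier never match, so they are dropped.
inner_rows = conn.execute(query.format(join="INNER JOIN")).fetchall()

# LEFT JOIN: unmatched rows are kept and form their own (NULL, NULL) group.
left_rows = conn.execute(query.format(join="LEFT OUTER JOIN")).fetchall()

print(inner_rows)
print(left_rows)
```

The extra group comes back with NULL for both id and name; a COALESCE in the select list could give it a label if needed.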

My SQL query within a query

I have 2 tables that I am trying to combine in a specific way
Table 1: ‘part_defs’ Table 2 Items_part_values
in value_defs:
ID | Name
-------------
1 | color
2 | size
3 | weight
in Items_part_values
ItemID | valueID | Value
-------------------------
10 | 1 | red
11 | 1 | blue
What I need is a query where for a given item all the rows from value_defs appear and if they have a value in Items_part_values the value.
So for Item 11 I want
ID | Name | Value
--------------------
1 | color | blue
2 | size | NULL
3 | weight | NULL
I’m new to MySQL. In Access I would have created a subquery with the ItemID as a parameter and then done a left join with value_defs on the result.
Is there a way of doing something similar in MySQL?
Thanks
Use:
SELECT p.id,
p.name,
ipv.value
FROM PART_DEFS p
LEFT JOIN ITEMS_PART_VALUES ipv ON ipv.valueid = p.id
AND ipv.itemid = ?
Replace the "?" with the itemid you want to search for.
This means all the PART_DEFS rows will be returned, and if ITEMS_PART_VALUES.valueid matches PART_DEFS.id for the item you are looking for, then ITEMS_PART_VALUES.value will be displayed. If there's no match, the value column will be NULL for that record.
There's a difference in OUTER JOINs, when specifying criteria in the JOIN vs the WHERE clause. In the JOIN, the criteria is applied before the JOIN occurs while in the WHERE clause the criteria is applied after the JOIN.
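For what it's worth, here is a small sketch of how the "?" placeholder might be bound from application code. It uses Python's sqlite3 module with an in-memory database rather than MySQL, and parts_for_item is a hypothetical helper name, but the query is the one shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part_defs (id INTEGER, name TEXT);
    CREATE TABLE items_part_values (itemid INTEGER, valueid INTEGER, value TEXT);
    INSERT INTO part_defs VALUES (1, 'color'), (2, 'size'), (3, 'weight');
    INSERT INTO items_part_values VALUES (10, 1, 'red'), (11, 1, 'blue');
""")

def parts_for_item(item_id):
    """Return every part definition, with the item's value or None."""
    return conn.execute("""
        SELECT p.id, p.name, ipv.value
        FROM part_defs p
        LEFT JOIN items_part_values ipv
               ON ipv.valueid = p.id
              AND ipv.itemid = ?   -- item filter stays in the join criteria
        ORDER BY p.id
    """, (item_id,)).fetchall()

print(parts_for_item(11))
print(parts_for_item(99))  # unknown item: all three parts, no values
```

Because the item filter sits in the ON clause, even an item with no values at all still gets one row per part definition.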
Use a left join:
SELECT * FROM Table1 LEFT JOIN Table2 USING (ID);
Edit:
SELECT * FROM part_defs LEFT JOIN Items_part_values ON part_defs.ID = Items_part_values.valueID;

Using multiple left joins to calculate averages and counts

I am trying to figure out how to use multiple left outer joins to calculate average scores and number of cards. I have the following schema and test data. Each deck has 0 or more scores and 0 or more cards. I need to calculate an average score and card count for each deck. I'm using mysql for convenience, I eventually want this to run on sqlite on an Android phone.
mysql> select * from deck;
+----+-------+
| id | name |
+----+-------+
| 1 | one |
| 2 | two |
| 3 | three |
+----+-------+
mysql> select * from score;
+---------+-------+---------------------+--------+
| scoreId | value | date | deckId |
+---------+-------+---------------------+--------+
| 1 | 6.58 | 2009-10-05 20:54:52 | 1 |
| 2 | 7 | 2009-10-05 20:54:58 | 1 |
| 3 | 4.67 | 2009-10-05 20:55:04 | 1 |
| 4 | 7 | 2009-10-05 20:57:38 | 2 |
| 5 | 7 | 2009-10-05 20:57:41 | 2 |
+---------+-------+---------------------+--------+
mysql> select * from card;
+--------+-------+------+--------+
| cardId | front | back | deckId |
+--------+-------+------+--------+
| 1 | fron | back | 2 |
| 2 | fron | back | 1 |
| 3 | f1 | b2 | 1 |
+--------+-------+------+--------+
I run the following query...
mysql> select deck.name, sum(score.value)/count(score.value) "Ave",
-> count(card.front) "Count"
-> from deck
-> left outer join score on deck.id=score.deckId
-> left outer join card on deck.id=card.deckId
-> group by deck.id;
+-------+-----------------+-------+
| name | Ave | Count |
+-------+-----------------+-------+
| one | 6.0833333333333 | 6 |
| two | 7 | 2 |
| three | NULL | 0 |
+-------+-----------------+-------+
... and I get the right answer for the average, but the wrong answer for the number of cards. Can someone tell me what I am doing wrong before I pull my hair out?
Thanks!
John
It's running what you're asking--it's joining card 2 and 3 to scores 1, 2, and 3--creating a count of 6 (2 * 3). In card 1's case, it joins to scores 4 and 5, creating a count of 2 (1 * 2).
If you just want a count of cards, like you're currently doing, use COUNT(DISTINCT card.cardId).
select deck.name, coalesce(x.ave, 0) as ave,
count(card.cardId) as count -- count a card column, not count(*): a deck with no cards still produces one joined row
from deck
left join -- flatten the score rows to one average per deck first
(
select deckId, sum(value)/count(*) as ave -- count(*) counts the score rows for the deck
from score
group by deckId
) as x on x.deckId = deck.id
left outer join card on card.deckId = deck.id -- then join the flattened result to card
group by deck.id, x.ave, deck.name
order by deck.id
[EDIT]
SQL has a built-in average function, so the subquery can simply be:
select deckId, avg(value) as ave
from score
group by deckId
What's going wrong is that you're creating a Cartesian product between score and card.
Here's how it works: when you join deck to score, you may have multiple rows match. Then each of these multiple rows is joined to all of the matching rows in card. There's no condition preventing that from happening, and the default join behavior when no condition restricts it is to join all rows in one table to all rows in another table.
To see it in action, try this query, without the group by:
select *
from deck
left outer join score on deck.id=score.deckId
left outer join card on deck.id=card.deckId;
You'll see a lot of repeated data in the columns that come from score and card. When you calculate the AVG() over data that has repeats in it, the redundant values magically disappear (as long as the values are repeated uniformly). But when you COUNT() or SUM() them, the totals are way off.
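The effect is easy to measure. The sketch below loads the sample data from the question into an in-memory SQLite database (which the question mentions as the eventual target) and counts the joined rows per deck; the query itself is my own illustration, not part of the original post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE deck (id INTEGER, name TEXT);
    CREATE TABLE score (scoreId INTEGER, value REAL, deckId INTEGER);
    CREATE TABLE card (cardId INTEGER, front TEXT, back TEXT, deckId INTEGER);
    INSERT INTO deck VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO score VALUES (1, 6.58, 1), (2, 7, 1), (3, 4.67, 1), (4, 7, 2), (5, 7, 2);
    INSERT INTO card VALUES (1, 'fron', 'back', 2), (2, 'fron', 'back', 1), (3, 'f1', 'b2', 1);
""")

rows = conn.execute("""
    SELECT deck.name,
           COUNT(*)                    AS joined_rows,     -- scores x cards per deck
           COUNT(card.cardId)          AS inflated_count,  -- repeats each card per score
           COUNT(DISTINCT card.cardId) AS real_count,
           ROUND(AVG(score.value), 2)  AS ave              -- unaffected by uniform repeats
    FROM deck
    LEFT OUTER JOIN score ON deck.id = score.deckId
    LEFT OUTER JOIN card  ON deck.id = card.deckId
    GROUP BY deck.id
    ORDER BY deck.id
""").fetchall()

for row in rows:
    print(row)
```

Deck "one" has 3 scores and 2 cards, so it produces 6 joined rows; the plain count reports 6 while the distinct count reports 2, and the average is the same either way.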
There may be remedies for inadvertent Cartesian products. In your case, you can use COUNT(DISTINCT) to compensate:
select deck.name, avg(score.value) "Ave", count(DISTINCT card.cardId) "Count"
from deck
left outer join score on deck.id=score.deckId
left outer join card on deck.id=card.deckId
group by deck.id;
This solution doesn't solve all cases of inadvertent Cartesian products. The more general-purpose solution is to break it up into two separate queries:
select deck.name, avg(score.value) "Ave"
from deck
left outer join score on deck.id=score.deckId
group by deck.id;
select deck.name, count(card.front) "Count"
from deck
left outer join card on deck.id=card.deckId
group by deck.id;
Not every task in database programming must be done in a single query. It can even be more efficient (as well as simpler, easier to modify, and less error-prone) to use individual queries when you need multiple statistics.
Using left joins isn't a good approach, in my opinion. Here's a standard SQL query for the result you want.
select
name,
(select avg(value) from score where score.deckId = deck.id) as Ave,
(select count(*) from card where card.deckId = deck.id) as "Count"
from deck;
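As a sanity check, this sketch runs the correlated-subquery version against the sample data from the question, using an in-memory SQLite database. An ORDER BY is added only so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE deck (id INTEGER, name TEXT);
    CREATE TABLE score (scoreId INTEGER, value REAL, deckId INTEGER);
    CREATE TABLE card (cardId INTEGER, front TEXT, back TEXT, deckId INTEGER);
    INSERT INTO deck VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO score VALUES (1, 6.58, 1), (2, 7, 1), (3, 4.67, 1), (4, 7, 2), (5, 7, 2);
    INSERT INTO card VALUES (1, 'fron', 'back', 2), (2, 'fron', 'back', 1), (3, 'f1', 'b2', 1);
""")

# Each subquery looks at a single child table, so the two child tables
# are never multiplied against each other.
result = conn.execute("""
    SELECT name,
           (SELECT AVG(value) FROM score WHERE score.deckId = deck.id) AS Ave,
           (SELECT COUNT(*)   FROM card  WHERE card.deckId  = deck.id) AS "Count"
    FROM deck
    ORDER BY deck.id
""").fetchall()

for name, ave, count in result:
    print(name, ave, count)
```

The card counts come out as 2, 1 and 0, and the deck without scores gets NULL (None in Python) for the average.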