Comparison of two aggregate functions without a GROUP BY clause - sql

Suppose we have three tables; the first is the master table. Here are the details:
MasterTable(Id,col1,col2)
ChildTable1(Id,MasterId,Amount1)
ChildTable2(Id,MasterId,Amount2)
I have to find the Ids from MasterTable where SUM(ChildTable1.Amount1) is greater than SUM(ChildTable2.Amount2).
I have written the query using GROUP BY and a HAVING clause, which gives the right data.
Is there any other way to write the same query without GROUP BY, keeping performance in mind if there are billions of records in the tables?
The following is my query:
select Mastertable.Id
from Mastertable,Childtable1,ChildTable2
where Mastertable.Id=Childtable1.MasterId
and ChildTable1.MasterId=ChildTable2.MasterId
group by MasterTable.Id
having SUM(Childtable1.Amount1) > SUM(Childtable2.Amount2)

To answer your question: no.
The reason is simple: without a GROUP BY, SUM() can only return one total over all the rows the query sees (a grand total), not one total per master row.
(EDIT: Well, maybe a stored procedure with loops, cursors and the like, but that would definitely be an overkill! ;) )
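For completeness, one way to avoid the GROUP BY keyword (though not the per-master aggregation itself) is a pair of correlated subqueries, each computing a "grand total" over the child rows of one master row. This is only a sketch run against an in-memory SQLite database through Python's sqlite3 module; the sample rows are made up, and there is no reason to assume it is faster than grouping on a large table.

```python
import sqlite3

# In-memory database with made-up sample rows (assumption: integer amounts).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MasterTable (Id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT);
    CREATE TABLE ChildTable1 (Id INTEGER PRIMARY KEY, MasterId INTEGER, Amount1 INTEGER);
    CREATE TABLE ChildTable2 (Id INTEGER PRIMARY KEY, MasterId INTEGER, Amount2 INTEGER);
    INSERT INTO MasterTable (Id) VALUES (1), (2);
    INSERT INTO ChildTable1 (MasterId, Amount1) VALUES (1, 10), (1, 5), (2, 3);
    INSERT INTO ChildTable2 (MasterId, Amount2) VALUES (1, 7), (2, 4), (2, 4);
""")

# Each subquery sums the child rows belonging to the current master row.
# No GROUP BY appears, but the engine still aggregates once per master row.
ids_without_group_by = [row[0] for row in conn.execute("""
    SELECT M.Id
    FROM MasterTable M
    WHERE (SELECT SUM(C1.Amount1) FROM ChildTable1 C1 WHERE C1.MasterId = M.Id)
        > (SELECT SUM(C2.Amount2) FROM ChildTable2 C2 WHERE C2.MasterId = M.Id)
    ORDER BY M.Id
""")]

print(ids_without_group_by)
```

With these rows, master 1 has totals 15 and 7, and master 2 has totals 3 and 8, so only master 1 qualifies.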

Well, you're going to have to group at some point, so I don't know why you don't want to group at all.
You could try grouping by the IDs in the child tables:
SELECT M.Id
FROM Mastertable M
INNER JOIN (SELECT MasterID,
SUM(Amount1) Amount
FROM Childtable1
GROUP BY MasterID) C1
ON M.Id = C1.MasterID
INNER JOIN (SELECT MasterID,
SUM(Amount2) Amount
FROM Childtable2
GROUP BY MasterID) C2
ON M.Id = C2.MasterID
WHERE C1.Amount > C2.Amount
If there's either a foreign key back to MasterTable or an index on MasterID in the child tables this is likely going to be the fastest way without pre-aggregating via snapshots or some other method.
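Aggregating each child table separately also matters for correctness, not just speed. If a master row has several rows in both child tables, joining all three tables first pairs every ChildTable1 row with every ChildTable2 row, so both sums are inflated before they are compared. The sketch below (in-memory SQLite via Python's sqlite3, with made-up rows) runs the question's join and the pre-aggregated form side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MasterTable (Id INTEGER PRIMARY KEY);
    CREATE TABLE ChildTable1 (Id INTEGER PRIMARY KEY, MasterId INTEGER, Amount1 INTEGER);
    CREATE TABLE ChildTable2 (Id INTEGER PRIMARY KEY, MasterId INTEGER, Amount2 INTEGER);
    INSERT INTO MasterTable (Id) VALUES (1), (2);
    -- Master 1: Amount1 total 10, Amount2 total 12 (should NOT qualify).
    -- Master 2: Amount1 total 40, Amount2 total 5  (should qualify).
    INSERT INTO ChildTable1 (MasterId, Amount1) VALUES (1, 10), (2, 20), (2, 20);
    INSERT INTO ChildTable2 (MasterId, Amount2) VALUES (1, 4), (1, 4), (1, 4), (2, 5);
""")

# The question's query: the three-way join repeats each Amount1 once per
# matching ChildTable2 row (and vice versa) before summing.
joined_ids = [r[0] for r in conn.execute("""
    SELECT M.Id
    FROM MasterTable M, ChildTable1 C1, ChildTable2 C2
    WHERE M.Id = C1.MasterId
      AND C1.MasterId = C2.MasterId
    GROUP BY M.Id
    HAVING SUM(C1.Amount1) > SUM(C2.Amount2)
    ORDER BY M.Id
""")]

# Pre-aggregated derived tables: one row per MasterId on each side of the join.
preaggregated_ids = [r[0] for r in conn.execute("""
    SELECT M.Id
    FROM MasterTable M
    INNER JOIN (SELECT MasterId, SUM(Amount1) AS Amount
                FROM ChildTable1 GROUP BY MasterId) C1 ON M.Id = C1.MasterId
    INNER JOIN (SELECT MasterId, SUM(Amount2) AS Amount
                FROM ChildTable2 GROUP BY MasterId) C2 ON M.Id = C2.MasterId
    WHERE C1.Amount > C2.Amount
    ORDER BY M.Id
""")]

print("three-way join:", joined_ids)
print("pre-aggregated:", preaggregated_ids)
```

For master 1 the three-way join compares 30 against 12 instead of 10 against 12, so it is reported even though its real Amount1 total is smaller.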

Related

Subtracting values of columns from two different tables

I would like to take values from one table column and subtract those values from another column from another table.
I was able to achieve this by joining those tables and then subtracting both columns from each other.
Data from first table:
SELECT max_participants FROM courses ORDER BY id;
Data from second table:
SELECT COUNT(id) FROM participations GROUP BY course_id ORDER BY course_id;
Here is some code:
SELECT max_participants - participations AS free_places FROM
(
SELECT max_participants, COUNT(participations.id) AS participations
FROM courses
INNER JOIN participations ON participations.course_id = courses.id
GROUP BY courses.max_participants, participations.course_id
ORDER BY participations.course_id
) AS course_places;
In general, it works, but I was wondering if there is some way to make it simpler, or whether my approach is incorrect and this code will fail under some conditions. Maybe it needs to be optimized.
I've read that you should not rely on the natural order of a result set in databases, and that raised my doubts.
If you want the values per course, I would recommend:
SELECT c.id, (c.max_participants - COUNT(p.id)) AS free_places
FROM courses c LEFT JOIN
participations p
ON p.course_id = c.id
GROUP BY c.id, c.max_participants
ORDER BY 1;
Note the LEFT JOIN to be sure all courses are included, even those with no participants.
The overall number is a little trickier. One method is to use the above as a subquery. Alternatively, you can pre-aggregate each table:
select c.max_participants - p.num_participants
from (select sum(max_participants) as max_participants from courses) c cross join
(select count(*) as num_participants from participations) p;
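To see the difference the LEFT JOIN makes, here is a small sketch using an in-memory SQLite database through Python's sqlite3 module. The rows are invented for illustration: course 3 has no participations, so an INNER JOIN drops it while the LEFT JOIN reports all of its places as free.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE courses (id INTEGER PRIMARY KEY, max_participants INTEGER);
    CREATE TABLE participations (id INTEGER PRIMARY KEY, course_id INTEGER);
    INSERT INTO courses VALUES (1, 10), (2, 5), (3, 8);
    INSERT INTO participations (course_id) VALUES (1), (1), (1), (2);
""")

# Per-course free places; COUNT(p.id) is 0 for a course with no matching rows.
per_course = conn.execute("""
    SELECT c.id, c.max_participants - COUNT(p.id) AS free_places
    FROM courses c
    LEFT JOIN participations p ON p.course_id = c.id
    GROUP BY c.id, c.max_participants
    ORDER BY c.id
""").fetchall()

# The same grouping with an INNER JOIN silently loses the empty course.
inner_join_ids = [r[0] for r in conn.execute("""
    SELECT c.id
    FROM courses c
    INNER JOIN participations p ON p.course_id = c.id
    GROUP BY c.id
    ORDER BY c.id
""")]

# Overall free places: aggregate each table on its own, then combine.
overall = conn.execute("""
    SELECT c.max_participants - p.num_participants
    FROM (SELECT SUM(max_participants) AS max_participants FROM courses) c
    CROSS JOIN (SELECT COUNT(*) AS num_participants FROM participations) p
""").fetchone()[0]

print(per_course)
print(inner_join_ids)
print(overall)
```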

SQL Query to count the records

I am making up a SQL query which will get all the transaction types from one table, and from the other table it will count the frequency of that transaction type.
My query is this:
with CTE as
(
select a.trxType,a.created,b.transaction_key,b.description,a.mode
FROM transaction_data AS a with (nolock)
RIGHT JOIN transaction_types b with (nolock) ON b.transaction_key = a.trxType
)
SELECT COUNT (trxType) AS Frequency, description as trxType,mode
from CTE where created >='2017-04-11' and created <= '2018-04-13'
group by trxType ,description,mode
The transaction_types table contains all the types of transactions only and transaction_data contains the transactions which have occurred.
The problem I am facing is that even though it's the RIGHT join, it does not select all the records from the transaction_types table.
I need to select all the transactions from the transaction_types table and show the number of counts for each transaction, even if it's 0.
Please help.
LEFT JOIN is so much easier to follow.
I think you want:
select tt.transaction_key, tt.description, t.mode, count(t.trxType)
from transaction_types tt left join
transaction_data t
on tt.transaction_key = t.trxType and
t.created >= '2017-04-11' and t.created <= '2018-04-13'
group by tt.transaction_key, tt.description, t.mode;
Notes:
Use reasonable table aliases! a and b mean nothing. t and tt are abbreviations of the table name, so they are easier to follow.
t.mode will be NULL for non-matching rows.
The condition on dates needs to be in the ON clause. Otherwise, the outer join is turned into an inner join.
LEFT JOIN is easier to follow (at least for people whose native language reads left-to-right) because it means "keep all the rows in the table you have already read".
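The point about the ON clause is easy to check on a toy data set. The sketch below uses an in-memory SQLite database through Python's sqlite3 module; the rows and the text-typed created column are assumptions made for the demo (ISO-formatted date strings compare correctly as text). It counts per transaction type only, leaving out mode to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transaction_types (transaction_key TEXT PRIMARY KEY, description TEXT);
    CREATE TABLE transaction_data (trxType TEXT, created TEXT, mode TEXT);
    INSERT INTO transaction_types VALUES
        ('DEP', 'Deposit'), ('WDR', 'Withdrawal'), ('TRF', 'Transfer');
    INSERT INTO transaction_data VALUES
        ('DEP', '2017-06-01', 'online'),
        ('DEP', '2017-07-15', 'online'),
        ('WDR', '2016-01-01', 'branch');  -- outside the date range
""")

# Date filter in the ON clause: unmatched types survive with a count of 0.
filter_in_on = conn.execute("""
    SELECT tt.transaction_key, COUNT(t.trxType)
    FROM transaction_types tt
    LEFT JOIN transaction_data t
        ON tt.transaction_key = t.trxType
       AND t.created >= '2017-04-11' AND t.created <= '2018-04-13'
    GROUP BY tt.transaction_key
    ORDER BY tt.transaction_key
""").fetchall()

# Date filter in WHERE: rows where t.created is NULL or out of range are
# discarded after the join, so the outer join behaves like an inner join.
filter_in_where = conn.execute("""
    SELECT tt.transaction_key, COUNT(t.trxType)
    FROM transaction_types tt
    LEFT JOIN transaction_data t
        ON tt.transaction_key = t.trxType
    WHERE t.created >= '2017-04-11' AND t.created <= '2018-04-13'
    GROUP BY tt.transaction_key
    ORDER BY tt.transaction_key
""").fetchall()

print(filter_in_on)
print(filter_in_where)
```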

How to find the most frequent value in a select statement as a subquery?

I am trying to get the most frequent Zip_Code for the Location_ID from table B. Table A (Transaction) has one A.Zip_Code per transaction, but table B (Location) has multiple Zip_Codes for one area or city. I am trying to get the most frequent B.Zip_Code for the account using Location_ID, which is present in both tables. I have simplified my code and changed the names of the columns for easy understanding, but this is the logic of my query so far. Any help would be appreciated. Thanks in advance.
Select
A.Account_Number,
A.Utility_Type,
A.Sum(usage),
A.Sum(Cost),
A.Zip_Code,
( select B.zip_Code from B where A.Location_ID= B.Location_ID having count(*)= max(count(B.Zip_Code)) as Location_Zip_Code,
A.Transaction_Date
From
Transaction_Table as A Left Join
Location Table as B On A.Location_ID= B.Location_ID
Group By
A.Account_Number,
A.Utility_Type,
A.Zip_Code,
A.Transaction_Date
This is what I come up with:
Select tt.Account_Number, tt.Utility_Type, Sum(tt.usage), Sum(tt.Cost),
tt.Zip_Code,
(select TOP 1 l.zip_Code
from Location_Table l
where tt.Location_ID = l.Location_ID
group by l.zip_code
order by count(*) desc
) as Location_Zip_Code,
tt.Transaction_Date
From Transaction_Table tt
Group By tt.Account_Number, tt.Utility_Type, tt.Zip_Code, tt.Transaction_Date;
Notes:
Table aliases are a good thing. However, they should be abbreviations for the tables referenced, rather than arbitrary letters.
The table alias qualifies the column name, not the function. Hence sum(tt.usage) rather than tt.sum(usage).
There is no need for a join in the outer query. You are doing all the work in the subquery.
An order by with top seems the way to go to get the most common zip code (which, incidentally, is called the mode in statistics).
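Here is a rough, runnable version of the same idea against an in-memory SQLite database (via Python's sqlite3). Several things are assumptions for the demo: SQLite has no TOP, so LIMIT 1 stands in for it; the tables are cut down to a few columns; the rows are invented; and the extra ORDER BY key on Zip_Code is only there to make ties deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Transaction_Table (Account_Number TEXT, Location_ID INTEGER, Cost REAL);
    CREATE TABLE Location_Table (Location_ID INTEGER, Zip_Code TEXT);
    INSERT INTO Transaction_Table VALUES ('A1', 1, 10.0), ('A1', 1, 5.0), ('B2', 2, 7.5);
    INSERT INTO Location_Table VALUES
        (1, '10001'), (1, '10001'), (1, '10002'), (2, '60601');
""")

# The correlated subquery groups the location rows for the current
# Location_ID by zip code and keeps the group with the highest count.
rows = conn.execute("""
    SELECT tt.Account_Number,
           SUM(tt.Cost) AS total_cost,
           (SELECT l.Zip_Code
            FROM Location_Table l
            WHERE l.Location_ID = tt.Location_ID
            GROUP BY l.Zip_Code
            ORDER BY COUNT(*) DESC, l.Zip_Code
            LIMIT 1) AS Location_Zip_Code
    FROM Transaction_Table tt
    GROUP BY tt.Account_Number, tt.Location_ID
    ORDER BY tt.Account_Number
""").fetchall()

print(rows)
```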

What is the best way to find the max record of a table per a foreign key?

At work, I often have to find the max status per a foreign key. I have for the most part always used a correlated sub-query on the join to get the right record. This is assuming the highest primary key is the most recent. Here is a little demo
select
c.plate_number, o.name
from
Car c
inner join Owner o
on o.owner_id = (
select max(owner_id)
from Owner
where owner_type = 'PRIMARY'
)
This is pretty fast in most queries I use, not to mention being able to put extra criteria in the sub-query for type columns. I have tried using NOT EXISTS clauses to make sure there are no higher records, but can't find anything else. Can someone suggest anything better and, if so, why?
I recommend using the standard windowing functions:
;with cte as (
select c.plateNumber, o.name,
row_number() over (partition by c.ownerId order by purchaseDate desc) rw
from car c
inner join owner o
on o.ownerid = c.ownerid
)
select *
from cte
where rw=1;
This allows you to select whatever columns you want from either table and still get only one record per partition.
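A quick way to try the row_number() approach is an in-memory SQLite database through Python's sqlite3 module (window functions need SQLite 3.25 or later). The schema below is hypothetical: the column names follow the answer's query, including a purchaseDate column that does not appear in the question, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE owner (ownerId INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE car (plateNumber TEXT, ownerId INTEGER, purchaseDate TEXT);
    INSERT INTO owner VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO car VALUES
        ('AAA111', 1, '2019-01-01'),
        ('BBB222', 1, '2021-05-05'),
        ('CCC333', 2, '2020-03-03');
""")

# Number the cars within each owner, newest purchase first, then keep row 1.
latest_per_owner = conn.execute("""
    WITH cte AS (
        SELECT c.plateNumber, o.name,
               ROW_NUMBER() OVER (PARTITION BY c.ownerId
                                  ORDER BY c.purchaseDate DESC) AS rw
        FROM car c
        INNER JOIN owner o ON o.ownerId = c.ownerId
    )
    SELECT plateNumber, name
    FROM cte
    WHERE rw = 1
    ORDER BY plateNumber
""").fetchall()

print(latest_per_owner)
```

Changing the PARTITION BY column changes what "one record per key" means, and changing the ORDER BY changes which record wins.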

Does this query use any sub-queries or temporary tables? [duplicate]

I am not allowed to use sub-queries or any kind of temporary tables as a part of an answer, does this query use any of those?
SELECT DISTINCT item_no, avg_price
FROM Prices
NATURAL JOIN (SELECT AVG(totalamount) avg_price FROM Prices GROUP BY price) av
WHERE sum > aurnover ORDER BY avg_price DESC , branch;
You are using two sub-queries on these lines:
NATURAL JOIN (SELECT branch, AVG(totalamount) avg_price FROM Prices GROUP BY branch) av
NATURAL JOIN (SELECT item_no,branch, SUM(totalamount) sum FROM Prices GROUP BY branch, table_no) su
Well, you've definitely got subqueries (you can see that clearly with the nested SELECT calls).
You'd likely need to inspect the output of EXPLAIN-ing that (just prepend EXPLAIN to the query for your database) to know for sure if it'll generate a temporary table. In MySQL's EXPLAIN output, look for the text "Using temporary" in the "Extra" column.
What you have are derived tables.
A derived table is similar to selecting from a view
Consider this:
SELECT
*
FROM (SELECT * FROM TABLEA) A
Here the table you are selecting from in the outer query (A) is derived from the inner SELECT.
Note that, as a derived table, the inner SELECT will only run once.
This is an example of a subquery that will run once for every TableA record:
Select
(Select ID from TABLEB WHERE tableB.Code = TableA.Code) AS TableBID
FROM TableA
This is also a subquery
Select
*
FROM TableA
Where TableA.Code IN (Select TABLEB.Code From TableB)
Subqueries are generally used in WHERE clauses and column lists, but can cause serious performance issues.
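When a per-row subquery becomes a bottleneck, one common alternative is to express it as a join. The sketch below (in-memory SQLite via Python's sqlite3, invented rows) runs the correlated column subquery, a LEFT JOIN that returns the same rows, and the IN form. It assumes Code is unique in TableB; with duplicate codes the join would return extra rows and the scalar subquery would pick just one of them. Whether the join is actually faster depends on the database and its optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (Code TEXT);
    CREATE TABLE TableB (ID INTEGER PRIMARY KEY, Code TEXT UNIQUE);
    INSERT INTO TableA VALUES ('x'), ('y'), ('z');
    INSERT INTO TableB VALUES (1, 'x'), (2, 'y');
""")

# Correlated subquery in the column list: evaluated per TableA row.
correlated = conn.execute("""
    SELECT TableA.Code,
           (SELECT ID FROM TableB WHERE TableB.Code = TableA.Code) AS TableBID
    FROM TableA
    ORDER BY TableA.Code
""").fetchall()

# Join form: same rows here, with NULL where TableB has no match.
joined = conn.execute("""
    SELECT TableA.Code, TableB.ID AS TableBID
    FROM TableA
    LEFT JOIN TableB ON TableB.Code = TableA.Code
    ORDER BY TableA.Code
""").fetchall()

# Subquery in the WHERE clause: keeps only codes that exist in TableB.
in_filter = [r[0] for r in conn.execute("""
    SELECT Code
    FROM TableA
    WHERE Code IN (SELECT Code FROM TableB)
    ORDER BY Code
""")]

print(correlated)
print(joined)
print(in_filter)
```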