SQL Multiple SUM analysis - sql

(First, sorry for my English; I'm writing this from France.)
I've read a lot of solutions here, but I'm quite lost!
Here's the problem: I have three tables:
Budgets, Invoice, and Paiements.
The relation between Invoice and Budgets is one-to-many (there's always at least one budget), and the relation between Invoice and Paiements is zero-to-many (i.e. an invoice may have no payment at all).
I'm trying to find every Invoice that is unpaid OR only partially paid!
Let's have an example
Then... I've written an SQL statement for it as the following:
Select sum(budget.amount) as m1, sum(Paiements.amount),budget.code
from
budget left outer join paiements
on
budget.code=Paiements.code
group by budget.code
I get this answer:
Now I'm trying to get only the rows where C2 is 0 or C1 is not equal to C2.
How should I modify my SQL statement?

You have to use the HAVING clause to specify conditions that filter which groups appear in the results.
The WHERE clause filters individual rows before they are grouped, whereas the HAVING clause filters the groups created by the GROUP BY clause, so it can test aggregates such as SUM.
So your query will be as follows:
Select sum(budget.amount) as m1, coalesce(sum(paiements.amount),0), budget.code
from
budget left outer join paiements
on
budget.code = paiements.code
group by budget.code
Having coalesce(sum(paiements.amount),0) = 0
or sum(budget.amount) <> coalesce(sum(paiements.amount),0)
coalesce(sum(paiements.amount),0) replaces the NULL that SUM returns for a budget with no matching payments with zero.
See a demo from db-fiddle.
Note: it's good practice to use aliases for the tables and columns in your query; doing so makes the query more readable.
Consider the following query:
select sum(B.amount) c1, coalesce(sum(P.amount),0) c2 ,B.code c3
from
budget B left outer join paiements P
on
B.code=P.code
group by B.code
having coalesce(sum(P.amount),0) =0
Or sum(B.amount) <> coalesce(sum(P.amount),0)
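As a quick check of the HAVING filter above, here is a minimal sketch that runs the aliased query against an in-memory SQLite database from Python. The sample rows are made up for illustration, and they deliberately contain at most one budget row and one payment row per code: with several rows on both sides, the join would repeat amounts and inflate the sums, so this only demonstrates the filtering itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE budget    (code TEXT, amount INTEGER);
    CREATE TABLE paiements (code TEXT, amount INTEGER);
    -- Hypothetical sample data: A is fully paid, B partially paid, C unpaid.
    INSERT INTO budget    VALUES ('A', 100), ('B', 200), ('C', 300);
    INSERT INTO paiements VALUES ('A', 100), ('B', 50);
""")

rows = conn.execute("""
    SELECT B.code                    AS c3,
           SUM(B.amount)             AS c1,
           COALESCE(SUM(P.amount), 0) AS c2
    FROM budget B
    LEFT OUTER JOIN paiements P ON B.code = P.code
    GROUP BY B.code
    -- Keep only groups with no payment, or with a payment total
    -- that differs from the budget total.
    HAVING COALESCE(SUM(P.amount), 0) = 0
        OR SUM(B.amount) <> COALESCE(SUM(P.amount), 0)
    ORDER BY B.code
""").fetchall()

for row in rows:
    print(row)
```

With this data, code A (fully paid) is filtered out, while B and C remain.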

Related

SQL Query to remove duplicated data and take single column sum

I have the following table resulted from
SELECT m.MedName as [Medicine],m.MedSellPrice as [RetailPrice],m.MedType as [Type],
m.SoldQuantity as [Sold],m.Quantity as [Available],b.BillAmount as [Total Bill],b.BillDate
FROM BillMedicine AS bm LEFT JOIN
Medicine AS m
ON bm.MedicineID=m.id LEFT JOIN
Bill AS b
ON bm.BilIID = b.ID
but now I want to collapse the repeated rows into one and keep only the sum of 'Total Bill'.
Use GROUP BY:
SELECT
m.MedName AS [Medicine],
m.MedSellPrice AS [RetailPrice],
m.MedType AS [Type],
m.SoldQuantity AS [Sold],
m.Quantity AS [Available],
SUM(b.BillAmount) AS [Total Bill]
FROM BillMedicine AS bm
LEFT JOIN Medicine AS m
ON bm.MedicineID = m.id
LEFT JOIN Bill AS b
ON bm.BilIID = b.ID
GROUP BY
m.MedName,
m.MedSellPrice,
m.MedType,
m.SoldQuantity,
m.Quantity;
Note that for the billing date, the two "duplicate" records you have highlighted have different dates. It is not clear which date, if any, you want to report here. I have omitted this column.
GROUP BY is the best option for removing duplicate rows while computing a SUM:
SELECT column1, column2, ....., SUM(Total) AS Total FROM Tablename GROUP BY column1, column2, .....
You seem to want most (or all) columns from m and then the sum from another table. One method is a lateral join or correlated subquery:
SELECT m.*, -- or whatever columns you want,
(SELECT SUM(b.BillAmount)
FROM BillMedicine bm JOIN
Bill b
ON bm.BilIID = b.ID
WHERE bm.MedicineID = m.id
) as [Total Bill]
FROM Medicine m ;
I suggest this approach for several reasons.
This is often more efficient than an outer aggregation.
You have LEFT JOINs but they do not look correct. I suspect you want to start with the Medicine table.
You are including a date/time in the results, but clearly that is not appropriate when combining multiple rows.
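A minimal sketch of the correlated-subquery approach, run on SQLite from Python rather than on the asker's database (the bracketed aliases suggest SQL Server or Access, so treat the dialect as an assumption). The tables are cut down to the columns the query needs, and all rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Medicine     (id INTEGER PRIMARY KEY, MedName TEXT);
    CREATE TABLE Bill         (ID INTEGER PRIMARY KEY, BillAmount INTEGER);
    CREATE TABLE BillMedicine (MedicineID INTEGER, BilIID INTEGER);
    -- Hypothetical data: Aspirin is on two bills, Ibuprofen on one,
    -- and the third medicine has never been billed.
    INSERT INTO Medicine     VALUES (1, 'Aspirin'), (2, 'Ibuprofen'), (3, 'Unsold');
    INSERT INTO Bill         VALUES (10, 50), (11, 30), (12, 20);
    INSERT INTO BillMedicine VALUES (1, 10), (1, 11), (2, 12);
""")

totals = conn.execute("""
    SELECT m.MedName,
           (SELECT SUM(b.BillAmount)
            FROM BillMedicine bm
            JOIN Bill b ON bm.BilIID = b.ID
            WHERE bm.MedicineID = m.id) AS total_bill
    FROM Medicine m
    ORDER BY m.id
""").fetchall()

for name, total in totals:
    print(name, total)
```

Because the query starts from Medicine, a medicine with no bills still appears, with a NULL total (None in Python).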

LEFT JOIN help in sql

I have to make a list of customers who do not have any invoice but have paid an invoice … maybe twice.
But my code (below) returns everything from the left join, whereas I only need the lines highlighted in green.
How should I write the query so that it returns only the 2 highlighted rows?
Select paymentsfrombank.invoicenumber,paymentsfrombank.customer,paymentsfrombank.value
FROM paymentsfrombank
LEFT OUTER JOIN debtors
ON debtors.value = paymentsfrombank.value
You only want to select columns from paymentsfrombank. So why do you even join?
select invoicenumber, customer, value from paymentsfrombank
except
select invoicenumber, customer, value from debtors;
(This requires exact matches as in your example, i.e. same amount for the invoice/customer).
There are two issues in your SQL. First, you need to join on Invoice number, not on value, as joining on value is pointless. Second, you need to only pick those payments where there are no corresponding debts, i.e. when you left-join, the table on the right has "null" in the joining column. The SQL would be something like this:
SELECT paymentsfrombank.invoicenumber,paymentsfrombank.customer,paymentsfrombank.value
FROM paymentsfrombank
LEFT OUTER JOIN debtors
ON debtors.InvoiceNumber = paymentsfrombank.InvoiceNumber
WHERE debtors.InvoiceNumber is NULL
In MySQL, the usual way to do this is to filter the left join for rows that have no match on the right side:
Select paymentsfrombank.invoicenumber,paymentsfrombank.customer,paymentsfrombank.value
FROM paymentsfrombank
LEFT OUTER JOIN debtors
ON debtors.value = paymentsfrombank.value where debtors.value is null
You can use NOT EXISTS:
SELECT p.*
FROM paymentsfrombank p
WHERE NOT EXISTS (SELECT 1 FROM debtors d WHERE d.invoicenumber = p.invoicenumber);
However, the LEFT OUTER JOIN also works if you add a WHERE clause that keeps only the payments with no matching invoice:
SELECT p.invoicenumber, p.customer, p.value
FROM paymentsfrombank P LEFT OUTER JOIN
debtors d
ON d.InvoiceNumber = p.InvoiceNumber
WHERE d.InvoiceNumber IS NULL;
Note: I have used table aliases (p and d), which make the query easier to read and write.
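To compare the anti-join variants from the answers above, here is a small sketch on SQLite from Python. The column name invoicenumber follows the question's query; the debtors columns and all rows are assumptions made up for the example. One payment (Dave) shares its amount with an unrelated debt, which shows why joining on value can hide rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE debtors          (invoicenumber INTEGER, customer TEXT, value INTEGER);
    CREATE TABLE paymentsfrombank (invoicenumber INTEGER, customer TEXT, value INTEGER);
    -- Hypothetical data: payments 1003 and 1004 have no invoice in debtors.
    INSERT INTO debtors          VALUES (1001, 'Alice', 100), (1002, 'Bob', 200);
    INSERT INTO paymentsfrombank VALUES (1001, 'Alice', 100),
                                        (1003, 'Carol', 150),
                                        (1004, 'Dave', 100);
""")

# Variant 1: NOT EXISTS on the invoice number.
not_exists = conn.execute("""
    SELECT p.invoicenumber, p.customer, p.value
    FROM paymentsfrombank p
    WHERE NOT EXISTS (SELECT 1 FROM debtors d
                      WHERE d.invoicenumber = p.invoicenumber)
    ORDER BY p.invoicenumber
""").fetchall()

# Variant 2: LEFT JOIN on the invoice number, keeping unmatched rows.
left_join = conn.execute("""
    SELECT p.invoicenumber, p.customer, p.value
    FROM paymentsfrombank p
    LEFT OUTER JOIN debtors d ON d.invoicenumber = p.invoicenumber
    WHERE d.invoicenumber IS NULL
    ORDER BY p.invoicenumber
""").fetchall()

# Variant 3: the same pattern joined on value. Dave's payment of 100
# matches Alice's debt of 100, so it is wrongly treated as matched.
on_value = conn.execute("""
    SELECT p.invoicenumber, p.customer, p.value
    FROM paymentsfrombank p
    LEFT OUTER JOIN debtors d ON d.value = p.value
    WHERE d.value IS NULL
    ORDER BY p.invoicenumber
""").fetchall()

print(not_exists)
print(left_join)
print(on_value)
```

On this data the first two variants return the same two payments, while the join on value returns only Carol's.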

SQL Query to count the records

I am writing a SQL query that gets all the transaction types from one table and counts how often each type occurs in the other table.
My query is this:
with CTE as
(
select a.trxType,a.created,b.transaction_key,b.description,a.mode
FROM transaction_data AS a with (nolock)
RIGHT JOIN transaction_types b with (nolock) ON b.transaction_key = a.trxType
)
SELECT COUNT (trxType) AS Frequency, description as trxType,mode
from CTE where created >='2017-04-11' and created <= '2018-04-13'
group by trxType ,description,mode
The transaction_types table contains all the types of transactions only and transaction_data contains the transactions which have occurred.
The problem I am facing is that even though it's the RIGHT join, it does not select all the records from the transaction_types table.
I need to select all the transactions from the transaction_types table and show the number of counts for each transaction, even if it's 0.
Please help.
LEFT JOIN is so much easier to follow.
I think you want:
select tt.transaction_key, tt.description, t.mode, count(t.trxType)
from transaction_types tt left join
transaction_data t
on tt.transaction_key = t.trxType and
t.created >= '2017-04-11' and t.created <= '2018-04-13'
group by tt.transaction_key, tt.description, t.mode;
Notes:
Use reasonable table aliases! a and b mean nothing. t and tt are abbreviations of the table name, so they are easier to follow.
t.mode will be NULL for non-matching rows.
The condition on dates needs to be in the ON clause. Otherwise, the outer join is turned into an inner join.
LEFT JOIN is easier to follow (at least for people whose native language reads left-to-right) because it means "keep all the rows in the table you have already read".
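The point about the ON clause can be checked with a small sketch on SQLite from Python (the question's WITH (NOLOCK) hints suggest SQL Server, so the engine here is a stand-in). The tables are reduced to the columns needed, the mode column is left out for brevity, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transaction_types (transaction_key INTEGER, description TEXT);
    CREATE TABLE transaction_data  (trxType INTEGER, created TEXT);
    -- Hypothetical data: deposits fall inside the date range, the only
    -- withdrawal is older, and there are no transfers at all.
    INSERT INTO transaction_types VALUES (1, 'deposit'), (2, 'withdrawal'), (3, 'transfer');
    INSERT INTO transaction_data  VALUES (1, '2017-05-01'), (1, '2017-06-01'), (2, '2016-01-01');
""")

# Date filter in the ON clause: every type is kept, unmatched ones count 0.
in_on = conn.execute("""
    SELECT tt.description, COUNT(t.trxType)
    FROM transaction_types tt
    LEFT JOIN transaction_data t
           ON tt.transaction_key = t.trxType
          AND t.created >= '2017-04-11' AND t.created <= '2018-04-13'
    GROUP BY tt.transaction_key, tt.description
    ORDER BY tt.transaction_key
""").fetchall()

# Date filter in WHERE: rows where t.created is NULL fail the test,
# so the outer join behaves like an inner join.
in_where = conn.execute("""
    SELECT tt.description, COUNT(t.trxType)
    FROM transaction_types tt
    LEFT JOIN transaction_data t ON tt.transaction_key = t.trxType
    WHERE t.created >= '2017-04-11' AND t.created <= '2018-04-13'
    GROUP BY tt.transaction_key, tt.description
    ORDER BY tt.transaction_key
""").fetchall()

print(in_on)
print(in_where)
```

Only the first query lists all three transaction types; the second drops the types with no transaction in the range.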

How can I join 3 tables and calculate the correct sum of fields from 2 tables, without duplicate rows?

I have tables A, B, C. Table A is linked to B, and table A is linked to C. I want to join the 3 tables and find the sum of B.cost and the sum of C.clicks. However, it is not giving me the expected value, and when I select everything without the group by, it is showing duplicate rows. I am expecting the row values from B to roll up into a single sum, and the row values from C to roll up into a single sum.
My query looks like
select A.*, sum(B.cost), sum(C.clicks) from A
join B
left join C
group by A.id
having sum(cost) > 10
I tried to group by B.a_id and C.another_field_in_a also, but that didn't work.
Here is a DB fiddle with all of the data and the full query:
http://sqlfiddle.com/#!9/768745/13
Notice how the sum fields are greater than the sum of the individual tables? I'm expecting the sums to be equal, containing only the rows of the table B and C once. I also tried adding distinct but that didn't help.
I'm using Postgres. (The fiddle is set to MySQL though.) Ultimately I will want to use a having clause to select the rows according to their sums. This query will be for millions of rows.
If I understand the logic correctly, the problem is the Cartesian product caused by the two joins. Your query is a bit hard to follow, but I think the intent is better handled with correlated subqueries:
select k.*,
(select sum(cost)
from ad_group_keyword_network n
where n.event_date >= '2015-12-27' and
n.ad_group_keyword_id = 1210802 and
k.id = n.ad_group_keyword_id
) as cost,
(select sum(clicks)
from keyword_click c
where (c.date is null or c.date >= '2015-12-27') and
k.keyword_id = c.keyword_id
) as clicks
from ad_group_keyword k
where k.status = 2 ;
Here is the corresponding SQL Fiddle.
EDIT:
The subselect should be faster than the group by on the unaggregated data. However, you need the right indexes: ad_group_keyword_network(ad_group_keyword_id, event_date, cost) and keyword_click(keyword_id, date, clicks).
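To see the Cartesian-product effect on a small scale, here is a sketch using the question's simplified tables A, B and C on SQLite from Python. The join conditions (B.a_id = A.id and C.keyword_id = A.keyword_id) are assumptions, since the question's shortened query omits them, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, keyword_id INTEGER);
    CREATE TABLE B (a_id INTEGER, cost INTEGER);
    CREATE TABLE C (keyword_id INTEGER, clicks INTEGER);
    -- Hypothetical data: one A row with 2 cost rows and 3 click rows.
    INSERT INTO A VALUES (1, 10);
    INSERT INTO B VALUES (1, 5), (1, 7);
    INSERT INTO C VALUES (10, 3), (10, 4), (10, 1);
""")

# Joining both child tables first yields 2 x 3 = 6 rows for the single
# A row, so each cost is counted 3 times and each click count twice.
naive = conn.execute("""
    SELECT A.id, SUM(B.cost), SUM(C.clicks)
    FROM A
    JOIN B ON B.a_id = A.id
    LEFT JOIN C ON C.keyword_id = A.keyword_id
    GROUP BY A.id
""").fetchall()

# Correlated subqueries aggregate each child table on its own,
# so no row is ever multiplied.
correlated = conn.execute("""
    SELECT A.id,
           (SELECT SUM(cost)   FROM B WHERE B.a_id = A.id)             AS cost,
           (SELECT SUM(clicks) FROM C WHERE C.keyword_id = A.keyword_id) AS clicks
    FROM A
""").fetchall()

print(naive)
print(correlated)
```

The real totals here are 12 for cost and 8 for clicks; the naive join reports both inflated.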
I found this (MySQL joining tables group by sum issue) and created a query like this
select *
from A
join (select B.a_id, sum(B.cost) as cost
from B
group by B.a_id) B on A.id = B.a_id
left join (select C.keyword_id, sum(C.clicks) as clicks
from C
group by C.keyword_id) C on A.keyword_id = C.keyword_id
group by A.id
having sum(cost) > 10
I don't know if it's efficient though. I don't know if it's more or less efficient than Gordon's. I ran both queries and this one seemed faster, 27s vs. 2m35s. Here is a fiddle: http://sqlfiddle.com/#!15/c61c74/10
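A reduced sketch of the pre-aggregation idea, again on SQLite from Python with invented rows and the same assumed join columns as the query above. Since each derived table has exactly one row per key, this version skips the outer GROUP BY and filters the summed cost with a plain WHERE; that simplification is a choice made for this example, not the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, keyword_id INTEGER);
    CREATE TABLE B (a_id INTEGER, cost INTEGER);
    CREATE TABLE C (keyword_id INTEGER, clicks INTEGER);
    -- Hypothetical data: A row 1 costs 12 in total, A row 2 only 4.
    INSERT INTO A VALUES (1, 10), (2, 20);
    INSERT INTO B VALUES (1, 5), (1, 7), (2, 4);
    INSERT INTO C VALUES (10, 3), (10, 4);
""")

result = conn.execute("""
    SELECT A.id, bs.cost, cs.clicks
    FROM A
    -- Each derived table is collapsed to one row per key before joining.
    JOIN (SELECT a_id, SUM(cost) AS cost
          FROM B
          GROUP BY a_id) bs ON A.id = bs.a_id
    LEFT JOIN (SELECT keyword_id, SUM(clicks) AS clicks
               FROM C
               GROUP BY keyword_id) cs ON A.keyword_id = cs.keyword_id
    WHERE bs.cost > 10
    ORDER BY A.id
""").fetchall()

print(result)
```

Only the first A row passes the cost filter, and its sums match the per-table totals.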
Simply split the aggregate of the second table into a subquery as follows:
http://sqlfiddle.com/#!9/768745/27
select ad_group_keyword.*, SumCost, sum(keyword_click.clicks)
from ad_group_keyword
left join keyword_click on ad_group_keyword.keyword_id = keyword_click.keyword_id
left join (select ad_group_keyword.id, sum(cost) SumCost
from ad_group_keyword join ad_group_keyword_network on ad_group_keyword.id = ad_group_keyword_network.ad_group_keyword_id
where event_date >= '2015-12-27'
group by ad_group_keyword.id
having sum(cost) > 20
) Cost on Cost.id=ad_group_keyword.id
where
(keyword_click.date is null or keyword_click.date >= '2015-12-27')
and status = 2
group by ad_group_keyword.id

Difference between Two Queries - Join vs IN

I have the following two queries. Query1 returns a row count of 1000, whereas Query2 returns a row count of 4000. Can someone please explain the difference between the two queries? I was hoping both would return the same count.
Query1:
SELECT COUNT(*)
FROM TableA A
WHERE A.VIN IN (
SELECT VIN
FROM TableB B, TableC C
WHERE B.MODEL_YEAR = '2014' AND B.VIN_NBR = C.VIN
)
Query2:
SELECT COUNT(*)
FROM TABLEA A, TableB B, TableC C
WHERE B.MODEL_YEAR = '2014' AND B.VIN_NBR = C.VIN AND A.VIN = C.VIN
In many cases, they will return the same answer, but not necessarily. The first counts the number of rows in A that match the conditions -- each row is counted only once, regardless of the number of matches. The second does a join, which can multiply the number of rows.
The second query would be equivalent in results if it used count(distinct A.id), where id is unique or a primary key.
That said, although they are similar in functionality, how they are executed can be quite different. Different SQL engines might do a better job of optimizing one version or the other.
By the way, you should avoid the archaic join syntax that you are using. Since 1992, explicit joins have been part of SQL syntax.
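The difference between the two counts can be reproduced with a tiny sketch on SQLite from Python. The column types, the id column on TableA and all rows are assumptions for illustration, and the second and third queries are written with explicit JOIN syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (id INTEGER PRIMARY KEY, VIN TEXT);
    CREATE TABLE TableB (VIN_NBR TEXT, MODEL_YEAR TEXT);
    CREATE TABLE TableC (VIN TEXT);
    -- Hypothetical data: V1 appears twice in TableB and twice in TableC.
    INSERT INTO TableA VALUES (1, 'V1'), (2, 'V2');
    INSERT INTO TableB VALUES ('V1', '2014'), ('V1', '2014'), ('V2', '2013');
    INSERT INTO TableC VALUES ('V1'), ('V1');
""")

# IN: each TableA row is counted at most once, however many matches exist.
count_in = conn.execute("""
    SELECT COUNT(*)
    FROM TableA A
    WHERE A.VIN IN (SELECT C.VIN
                    FROM TableB B
                    JOIN TableC C ON B.VIN_NBR = C.VIN
                    WHERE B.MODEL_YEAR = '2014')
""").fetchone()[0]

# JOIN: the single matching TableA row is repeated for every
# combination of matching TableB and TableC rows (2 x 2).
count_join = conn.execute("""
    SELECT COUNT(*)
    FROM TableA A
    JOIN TableC C ON A.VIN = C.VIN
    JOIN TableB B ON B.VIN_NBR = C.VIN
    WHERE B.MODEL_YEAR = '2014'
""").fetchone()[0]

# Counting distinct TableA keys removes the multiplication again.
count_distinct = conn.execute("""
    SELECT COUNT(DISTINCT A.id)
    FROM TableA A
    JOIN TableC C ON A.VIN = C.VIN
    JOIN TableB B ON B.VIN_NBR = C.VIN
    WHERE B.MODEL_YEAR = '2014'
""").fetchone()[0]

print(count_in, count_join, count_distinct)
```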