Sql query with sum of sum - sql

I know that this type of question has been answered many times here, but I couldn't apply any of those answers to my problem, so please help. Here is my problem.
table 1
ID1 CustID Owe
Table 2
ID2 CustID Paid
I need a simple thing: in one SQL query I need sum(TotalOwe - TotalPaid) as Result where 1.custID=2.custID=@custID (this is not an example of my query, don't correct it, it's just an explanation). Or, even simpler: the customer with ID = 112 has a TotalOwe of xxx and has already paid TotalPaid, so he now owes TotalOwe - TotalPaid.
This looks really simple, and I am even a little embarrassed to ask, but I really don't have any more time for experimenting. I was close at one point, but the values of TotalOwe and TotalPaid were doubled; I don't know why, but that is another matter.

SELECT COALESCE(TotalOwed,0) - COALESCE(TotalPaid,0)
FROM ( SELECT CustID,
SUM(Owe) TotalOwed
FROM table1
GROUP BY CustID) T1
FULL JOIN ( SELECT CustID,
SUM(Paid) TotalPaid
FROM table2
GROUP BY CustID) T2
ON T1.CustID = T2.CustID
WHERE COALESCE(T1.CustID,T2.CustID) = 112
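A minimal sketch of why the totals can come out doubled, and why aggregating each table before combining them avoids it. It uses Python's built-in sqlite3 module only so that it can be run as-is; the sample rows are invented for illustration, and scalar subqueries stand in for the FULL JOIN because older SQLite versions do not support it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ID1 INTEGER PRIMARY KEY, CustID INTEGER, Owe REAL);
    CREATE TABLE table2 (ID2 INTEGER PRIMARY KEY, CustID INTEGER, Paid REAL);
    -- Invented sample data: customer 112 owes 150 in total and has paid 70.
    INSERT INTO table1 (CustID, Owe) VALUES (112, 100), (112, 50), (113, 20);
    INSERT INTO table2 (CustID, Paid) VALUES (112, 30), (112, 40);
""")

# Joining the raw rows pairs every Owe row with every Paid row, so each
# amount is summed once per matching row on the other side (here: twice).
naive = conn.execute("""
    SELECT SUM(t1.Owe) - SUM(t2.Paid)
    FROM table1 t1
    JOIN table2 t2 ON t1.CustID = t2.CustID
    WHERE t1.CustID = 112
""").fetchone()[0]

# Summing each table separately first gives one total per side.
balance = conn.execute("""
    SELECT (SELECT COALESCE(SUM(Owe), 0)  FROM table1 WHERE CustID = :c)
         - (SELECT COALESCE(SUM(Paid), 0) FROM table2 WHERE CustID = :c)
""", {"c": 112}).fetchone()[0]

print(naive, balance)
```

With this data the row-level join reports 300 owed and 140 paid, while the pre-aggregated version reports the intended 150 - 70.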

Related

I have duplicate rows with one Id, and have to select and show only one with the latest date, here is my query:

I edited my question to make it easier to understand.
I have 3 tables
1. st_kalk
id_artikli | cena_neto | iddg
1066       | 25000     | 34323
808        | 231933    | 25234
718        | 22999     | 34244
718        | 22965     | 23212
2. _artikli
id_artikli
1066
808
718
718
3. dok
iddg  | datum
34323 | 4/22/2022
25234 | 2/16/2021
23212 | 1/29/2022
34244 | 2/2/2022
The column id_artikli appears in both st_kalk and _artikli,
and st_kalk is related to dok by iddg, as you can see in my query.
I have to join these 3 tables to get the correct price, because some of my products have multiple prices (cena_neto) and the correct one is the price with the latest date (datum).
The query below works, but only if I set id_artikla to a specific value; it won't return every article with its latest price and date, which is what I need.
I am sorry if this is confusing; it's my first time writing here and I am still learning SQL. Thank you.
SELECT top 1
k.id_artikla,
k.maxdatum,
cena_neto
FROM ( SELECT id_artikla,
MAX(datum) AS maxdatum,
cena_neto
FROM st_kalk INNER JOIN dok ON st_kalk.iddg=dok.iddg
GROUP BY id_artikla,cena_neto) k
INNER JOIN _artikli ON k.id_artikla=k.id_artikla
WHERE k.id_artikla=718
ORDER BY k.maxdatum DESC
Thank you in advance for your help
select distinct t.* from table t
inner join (select id, max(date) date from table group by id) t1
on t.id = t1.id and t.date = t1.date;
I used DISTINCT in the SELECT to avoid returning duplicate records when an Id has several rows with the same latest date.
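As a rough illustration of the same "join back to MAX(date) per id" pattern applied to the tables in the question, here is a sketch using Python's sqlite3 module. The dates are rewritten in ISO format (an assumption made here so that MAX works on text in SQLite), and the CTE name priced is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE st_kalk (id_artikli INTEGER, cena_neto INTEGER, iddg INTEGER);
    CREATE TABLE dok (iddg INTEGER, datum TEXT);
    INSERT INTO st_kalk VALUES (1066, 25000, 34323), (808, 231933, 25234),
                               (718, 22999, 34244), (718, 22965, 23212);
    INSERT INTO dok VALUES (34323, '2022-04-22'), (25234, '2021-02-16'),
                           (23212, '2022-01-29'), (34244, '2022-02-02');
""")

latest_prices = conn.execute("""
    WITH priced AS (              -- every price together with its document date
        SELECT k.id_artikli, k.cena_neto, d.datum
        FROM st_kalk k
        JOIN dok d ON d.iddg = k.iddg
    )
    SELECT DISTINCT p.id_artikli, p.datum, p.cena_neto
    FROM priced p
    JOIN (SELECT id_artikli, MAX(datum) AS maxdatum
          FROM priced
          GROUP BY id_artikli) m  -- latest date per article, price NOT in the GROUP BY
      ON m.id_artikli = p.id_artikli AND m.maxdatum = p.datum
    ORDER BY p.id_artikli
""").fetchall()

for row in latest_prices:
    print(row)
```

The point of the design is that cena_neto stays out of the GROUP BY, so each article gets exactly one latest date and the price is picked up by joining back on that date.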

subquery the same table in select statement

I have a restaurant db and I need to total up the value of all the items sold individually. So if I sold a hamburger that has a base price of $10.00 with bacon which costs $1.00, and a hamburger (again $10.00) with avocado that costs $0.50, I need to get $21.50 returned. My invoice table looks like this:
invoice_num  item_num  price  item_id  parent_item_id
111          hmbg      10.00  guid_1   ''
111          bacn       1.00  guid_2   guid_1
112          hmbg      10.00  guid_3   ''
112          avcd       0.50  guid_4   guid_3
I can get the sum of all the parent items like this:
SELECT item_num, SUM(price) FROM invoices WHERE parent_item_id = '' GROUP BY item_num
It is adding the toppings that confuses me. I feel like I need a subquery inside the SUM, but I'm not sure how to write it and how to reference the outer query so that it can use the item_id.
SELECT item_num, sum(i.price) + sum(nvl(x.ingred_price,0))
FROM invoices i
LEFT OUTER JOIN
(SELECT parent_item_id
, sum(price) ingred_price
FROM invoices
WHERE parent_item_id IS NOT NULL
GROUP BY parent_item_id) x
ON x.parent_item_id = i.item_id
WHERE i.parent_item_id IS NULL
GROUP BY item_num
Here's a SQL Fiddle that proves the above code works. I used Oracle, but you should be able to adapt it to whatever DB you are using.
Assumption: You don't have more than one level in a parent child relationship. E.g. A can have a child B, but B won't have any other children.
It's not clear from your question (see my comment), but as I understand it a simple GROUP BY will give you what you want. If not, please explain (in the original question) why this query does not work: what is it missing from your requirements?
SELECT item_num, SUM(price)
FROM invoices
GROUP BY item_num
Hard to say, but it looks like you need a recursive CTE.
Here's an example for PostgreSQL:
with recursive cte as (
select
t.invoice_num, t.price, t.item_id, t.item_num
from Table1 as t
where t.parent_item_id is null
union all
select
t.invoice_num, t.price, t.item_id, c.item_num
from Table1 as t
inner join cte as c on c.item_id = t.parent_item_id
)
select invoice_num, item_num, sum(price)
from cte
group by invoice_num, item_num
sql fiddle demo
I've used null for an empty parent_item_id (it's a better solution than using empty strings), but you can change this to ''.
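For anyone who wants to try the recursive CTE without a PostgreSQL instance, here is a small sketch using Python's sqlite3 module, which also supports WITH RECURSIVE. It assumes the bacon row's parent is guid_1 (the hamburger on the same invoice) and stores NULL for items without a parent; the table name invoices follows the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (invoice_num INTEGER, item_num TEXT, price REAL,
                           item_id TEXT, parent_item_id TEXT);
    INSERT INTO invoices VALUES
        (111, 'hmbg', 10.00, 'guid_1', NULL),
        (111, 'bacn',  1.00, 'guid_2', 'guid_1'),  -- assumed parent: guid_1
        (112, 'hmbg', 10.00, 'guid_3', NULL),
        (112, 'avcd',  0.50, 'guid_4', 'guid_3');
""")

totals = conn.execute("""
    WITH RECURSIVE cte AS (
        -- anchor: top-level items keep their own item_num
        SELECT invoice_num, price, item_id, item_num
        FROM invoices
        WHERE parent_item_id IS NULL
        UNION ALL
        -- recursive step: children inherit the item_num of their root item
        SELECT t.invoice_num, t.price, t.item_id, c.item_num
        FROM invoices t
        JOIN cte c ON c.item_id = t.parent_item_id
    )
    SELECT item_num, SUM(price)
    FROM cte
    GROUP BY item_num
""").fetchall()

print(totals)
```

Because every child row carries the item_num of its root, grouping by item_num folds the toppings into the hamburger total, at any nesting depth.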

Count Response once in 30 days SQL

If I have a customer respond to the same survey in 30 days more than once, I only want to count it once. Can someone show me code to do that please?
create table #Something
(
CustID Char(10),
SurveyId char(5),
ResponseDate datetime
)
insert #Something
select 'Cust1', '100', '5/6/13' union all
select 'Cust1', '100', '5/13/13' union all
select 'Cust2', '100', '4/20/13' union all
select 'Cust2', '100', '5/22/13'
select distinct custid, SurveyId, Count(custid) as CountResponse from #Something
group by CustID, SurveyId
The above code only gives me the total count of Response, not sure how to code to count only once per 30 day period.
The output I'm looking for should be like this:
CustomerID SurveyId CountResponse
Cust1 100 1
Cust2 100 2
Going on the theory that you want your periods calculated as 30 days from the first time a survey is submitted, here is a (gross) solution.
create table #Something
(
CustID Char(10),
SurveyId char(5),
ResponseDate datetime
)
insert #Something
select 'Cust1', '100', '5/6/13' union all
select 'Cust1', '100', '5/13/13' union all
select 'Cust1', '100', '7/13/13' union all
select 'Cust2', '100', '4/20/13' union all
select 'Cust2', '100', '5/22/13' union all
select 'Cust2', '100', '7/20/13' union all
select 'Cust2', '100', '7/24/13' union all
select 'Cust2', '100', '9/28/13'
--SELECT CustID,SurveyId,COUNT(*) FROM (
select a.CustID,a.SurveyId,b.ResponseStart,--CONVERT(int,a.ResponseDate-b.ResponseStart),
CASE
WHEN CONVERT(int,a.ResponseDate-b.ResponseStart) > 30
THEN ((CONVERT(int,a.ResponseDate-b.ResponseStart))-(CONVERT(int,a.ResponseDate-b.ResponseStart) % 30))/30+1
ELSE 1
END CustomPeriod -- defines periods 30 days out from first entry of survey
from #Something a
inner join
(select CustID,SurveyId,MIN(ResponseDate) ResponseStart
from #Something
group by CustID,SurveyId) b
on a.SurveyId=b.SurveyId
and a.CustID=b.CustID
group by a.CustID,a.SurveyId,b.ResponseStart,
CASE
WHEN CONVERT(int,a.ResponseDate-b.ResponseStart) > 30
THEN ((CONVERT(int,a.ResponseDate-b.ResponseStart))-(CONVERT(int,a.ResponseDate-b.ResponseStart) % 30))/30+1
ELSE 1
END
--) x GROUP BY CustID,SurveyId
At the very least you'd probably want to move the CASE expression into a function so it reads a bit cleaner. Better would be defining explicit windows in a separate table. This may not be feasible if you want to avoid situations like a survey returned at the end of period one followed by another in period two a couple of days later.
You should consider handling this on input if possible. For example, if you are identifying a customer in an online survey, reject attempts to fill out the same survey again within 30 days. Or if someone is mailing these in, make the data entry person reject it if one has come in within the last 30 days.
Or, along the same lines as "wild and crazy", add a bit and an INSERT trigger. Only turn the bit on if no surveys of that type for that customer found within the time period.
Overall, phrasing the issue a little more completely would be helpful. However I do appreciate the actual coded example.
I'm not a SQL Server guy, but in Oracle, if you subtract an integer from a date you're effectively subtracting days, so something like this could work:
SELECT custid, surveyid
FROM Something a
WHERE NOT EXISTS (
SELECT 1
FROM Something b
WHERE a.custid = b.custid
AND a.surveyid = b.surveyid
AND b.responseDate >= a.responseDate - 30 AND b.responseDate < a.responseDate
);
To get your counts (if I understand what you're asking for):
-- Count of times custID returned surveyID, not counting same
-- survey within 30 day period.
SELECT custid, surveyid, count(*) countResponse
FROM Something a
WHERE NOT EXISTS (
SELECT 1
FROM Something b
WHERE a.custid = b.custid
AND a.surveyid = b.surveyid
AND b.responseDate >= a.responseDate - 30 AND b.responseDate < a.responseDate
)
GROUP BY custid, surveyid
UPDATE: Per the case raised below, this actually wouldn't quite work. What you should probably do is iterate through your Something table and insert the rows for the surveys you want to keep into a results table, then compare each new row against the results table to see whether a counted survey was already received in the last 30 days. I could show you how to do something like this in Oracle PL/SQL, but I don't know the syntax offhand for SQL Server. Maybe someone else who knows SQL Server wants to steal this strategy to code up an answer for you, or maybe this is enough for you to go on.
Call me wild and crazy, but I would solve this problem by storing more state with each survey. The approach I would take is to add a bit type column that indicates whether a particular survey should be counted (i.e., a Countable column). This solves the tracking of state problem inherent in solving this relationally.
I would set values in Countable to 1 upon insertion, if no survey with the same CustID/SurveyId can be found in the preceding 30 days with a Countable set to 1. I would set it to 0, otherwise.
Then the problem becomes trivially solvable. Just group by CustID/SurveyId and sum up the values in the Countable column.
One caveat of this approach is that it imposes that surveys must be added in chronological order and cannot be deleted without a recalculation of Countable values.
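The Countable rule described above can be sketched procedurally. The following Python is only an illustration of the logic, not SQL Server code: the function name count_responses is hypothetical, and it assumes that a response exactly 30 days after the last counted one still falls inside the window.

```python
from datetime import date, timedelta

def count_responses(responses, window_days=30):
    """Count responses per (customer, survey), ignoring any response that
    arrives within window_days of the last *counted* response."""
    last_counted = {}  # (cust_id, survey_id) -> date of last counted response
    counts = {}
    # Process in chronological order, as the caveat above requires.
    for cust, survey, when in sorted(responses, key=lambda r: r[2]):
        key = (cust, survey)
        prev = last_counted.get(key)
        if prev is None or when - prev > timedelta(days=window_days):
            last_counted[key] = when             # this response is "countable"
            counts[key] = counts.get(key, 0) + 1
    return counts

sample = [
    ("Cust1", "100", date(2013, 5, 6)),
    ("Cust1", "100", date(2013, 5, 13)),
    ("Cust2", "100", date(2013, 4, 20)),
    ("Cust2", "100", date(2013, 5, 22)),
]
result = count_responses(sample)
print(result)
```

On the sample data from the question this gives 1 for Cust1 and 2 for Cust2, matching the requested output.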
Here's one way to handle it I believe. I tested quickly, and it worked on the small sample of records so I'm hopeful it will help you out. Best of luck.
SELECT s.CustID, COUNT(s.SurveyID) AS SurveyCount
FROM #something s
INNER JOIN (SELECT CustID, SurveyId, ResponseDate
FROM (SELECT #Something.*,
ROW_NUMBER() OVER (PARTITION BY custid ORDER BY ResponseDate ASC) AS RN
FROM #something) AS t
WHERE RN = 1 ) f ON s.CustID = f.CustID
WHERE s.ResponseDate BETWEEN f.ResponseDate AND f.ResponseDate+30
GROUP BY s.CustID
HAVING COUNT(s.SurveyID) > 1
Your question is ambiguous, which may be the source of your difficulty.
insert #Something values
('Cust3', '100', '1/1/13'),
('Cust3', '100', '1/20/13'),
('Cust3', '100', '2/10/13')
Should the count for Cust3 be 1 or 2? Is the '2/10/13' response invalid because it was less than 30 days after the '1/20/13' response? Or is the '2/10/13' response valid because the '1/20/13' is invalidated by the '1/1/13' response and therefore more than 30 days after the previous valid response?
The code below is one approach which yields your example output. However, if you add a select 'Cust1', '100', '4/20/13', the result will still be Cust1 100 1 because they are all within 30 days of each prior survey response and so only the first one would be counted. Is this the desired behavior?
SELECT CustID, SurveyID, COUNT(*) AS CountResponse
FROM #SurveysTaken
WHERE (NOT EXISTS
(SELECT 1
FROM #SurveysTaken AS PriorSurveys
WHERE (CustID = #SurveysTaken.CustID)
AND (SurveyId = #SurveysTaken.SurveyId)
AND (ResponseDate >= DATEADD(d, - 30, #SurveysTaken.ResponseDate))
AND (ResponseDate < #SurveysTaken.ResponseDate)))
GROUP BY CustID, SurveyID
Alternatively, you could break the year into arbitrary 30 day periods, resetting with each new year.
SELECT CustID, SurveyID, COUNT(*) AS CountResponse
FROM (SELECT DISTINCT CustID, SurveyID, YEAR(ResponseDate) AS ResponseYear,
DATEPART(DAYOFYEAR, ResponseDate) / 30 AS ThirtyDayPeriod
FROM #SurveysTaken) AS SurveysByPeriod
GROUP BY CustID, SurveyID
You could also just go by month.
SELECT CustID, SurveyID, COUNT(*) AS CountResponse
FROM (SELECT DISTINCT CustID, SurveyID, YEAR(ResponseDate) AS ResponseYear,
MONTH(ResponseDate) AS ResponseMonth
FROM #SurveysTaken) AS SurveysByMonth
GROUP BY CustID, SurveyID
You could use 30 day periods from an arbitrary epoch date. (Perhaps by pulling the date the survey was first created from another query?)
SELECT CustID, SurveyID, COUNT(*) AS CountResponse
FROM (SELECT DISTINCT CustID, SurveyID, DATEDIFF(D, '1/1/2013', ResponseDate) / 30 AS ThirtyDayPeriod
FROM #SurveysTaken) AS SurveysByPeriod
GROUP BY CustID, SurveyID
One final variation on arbitrary thirty-day periods is to base them on the first time the customer ever responded to the survey in question.
SELECT CustID, SurveyID, COUNT(*) AS CountResponse
FROM (SELECT DISTINCT CustID, SurveyID, DATEDIFF(DAY,
(SELECT MIN(ResponseDate)
FROM #SurveysTaken AS FirstSurvey
WHERE (CustID = #SurveysTaken.CustID)
AND (SurveyId = #SurveysTaken.SurveyId)), ResponseDate) / 30 AS ThirtyDayPeriod
FROM #SurveysTaken) AS SurveysByPeriod
GROUP BY CustID, SurveyID
There is one issue that you run into with the epoch/period trick: the counted surveys occur only once per period, but they aren't necessarily 30 days apart.
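A tiny worked example of that caveat, written in Python rather than T-SQL purely for illustration. The epoch of 1/1/2013 is taken from the query above; the two response dates are made up.

```python
from datetime import date

epoch = date(2013, 1, 1)                               # same epoch as the DATEDIFF query
responses = [date(2013, 1, 29), date(2013, 2, 1)]      # invented dates

# Same bucketing as DATEDIFF(D, epoch, ResponseDate) / 30 with integer division.
periods = {(d - epoch).days // 30 for d in responses}
gap_days = (responses[1] - responses[0]).days

print(sorted(periods), gap_days)
```

The two responses land in different 30-day buckets (0 and 1) and would both be counted, even though they are only three days apart.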

Getting SUM from 2 different tables into one result

I have been trying to get this to work for 12 hrs now and I cannot :-( Can someone please show me how I can get the ssnumber to group and get the total for each ssnumber.
Here is what I have now. In Table number 1 I have this code
SELECT
UNIT_NO, SUM(RATEB) AS TOTALRTE
FROM TABLE1
WHERE
TRUCK_PAID = 1
AND PICK_UP_DATE >= '(fromdate)'
AND PICK_UP_DATE <= '(todate)'
GROUP BY
UNIT_NO
ORDER BY
UNIT_NO
But table number 2 is where the ssnumber column is. What I'm trying to do is sum rateB over all of the loads for each unit_no in table number 1, then use table number 2 to look up the ssnumber for each unit number, and finally group by ssnumber and sum the rateB from table number 1.
Something like this (see below), but it's not working :-(
SELECT
UNIT_NO, SUM(RATEB)
FROM
TABLE1
WHERE
TRUCK_PAID = 1
AND PICK_UP_DATE >= '(fromdate)'
AND PICK_UP_DATE <= '(todate)'
GROUP BY
UNIT_NO
JOIN
TABLE TABLE1.UNIT_NO = TABLE2.UNIT_NO GROUP BY TABLE2.SS_NUM
or
SELECT
UNIT_NO, SUM(RATEB) AS TOTALRATE
FROM
TABLE1
GROUP BY
UNIT_NO
JOIN
TRUCKS ON (TABLE1.UNIT_NO = TABLE2.UNIT_NO)
GROUP BY
TABLE2.SSNUMBER
Thank you guys so much for any help...
As requested, it is hard to really understand what you are trying to accomplish without more info about table2 and maybe an example of what you are expecting. However, what I got from your description is that you are trying to accomplish something like this?
SELECT UNIT_NO, TOTALRTE, TOTALLDSRTE
FROM
(
SELECT UNIT_NO,SUM(RATEB) AS TOTALRTE
FROM LOADS
GROUP BY UNIT_NO
) AS tbl1
JOIN
(
SELECT SS_NUM, SUM(RATEB) AS TOTALLDSRTE
FROM LOADS
GROUP BY SS_NUM
) AS tbl2
ON tbl1.UNIT_NO = tbl2.SS_NUM
I would suggest that, instead of fetching the data from two SELECT queries inside one SELECT query, you run them as separate queries, which can save time. Alternatively, you can create a table for the result and write the result of each query into that table.
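One possible reading of the question is that the loads in table 1 should be joined to table 2 on UNIT_NO first and only then grouped by the SS number. The sketch below shows that shape using Python's sqlite3 module. The column types, the sample rows, the ISO date format, and the assumption that each UNIT_NO appears exactly once in TABLE2 are all invented here for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE1 (UNIT_NO INTEGER, RATEB REAL,
                         TRUCK_PAID INTEGER, PICK_UP_DATE TEXT);
    -- Assumption: one row per unit, so the join cannot duplicate loads.
    CREATE TABLE TABLE2 (UNIT_NO INTEGER PRIMARY KEY, SSNUMBER TEXT);
    INSERT INTO TABLE1 VALUES
        (1, 100, 1, '2013-03-01'),
        (1, 200, 1, '2013-04-01'),
        (2,  50, 1, '2013-05-01'),
        (2, 999, 1, '2012-12-31'),   -- outside the date range
        (3,  70, 0, '2013-05-02');   -- not paid
    INSERT INTO TABLE2 VALUES (1, 'SS-A'), (2, 'SS-A'), (3, 'SS-B');
""")

totals = conn.execute("""
    SELECT t2.SSNUMBER, SUM(t1.RATEB) AS TOTALRATE
    FROM TABLE1 t1
    JOIN TABLE2 t2 ON t2.UNIT_NO = t1.UNIT_NO   -- JOIN comes before WHERE / GROUP BY
    WHERE t1.TRUCK_PAID = 1
      AND t1.PICK_UP_DATE >= :fromdate
      AND t1.PICK_UP_DATE <= :todate
    GROUP BY t2.SSNUMBER
    ORDER BY t2.SSNUMBER
""", {"fromdate": "2013-01-01", "todate": "2013-12-31"}).fetchall()

print(totals)
```

Note the clause order: in the attempts shown in the question the JOIN is written after GROUP BY, which is a syntax error; the join belongs in the FROM clause.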

SQL Select Statement

I think this is a pretty basic question and I have looked around on the site but I am not sure what to search on to find the answer.
I have an SQL table that looks like:
studentId period class
1 1 math
1 2 english
2 1 math
2 2 history
I am looking for a SELECT statement that finds the studentId that is taking math 1st period and english 2nd period. I have tried something like SELECT studentID WHERE ( period = 1 AND class= "math" ) AND ( period = 2 AND class = "english" ) but that has not worked.
I have also thought about changing my table to be:
studentId period1 period2 period3 period4 period5 etc
But I think I want to be adding things besides classes like after school activities and wanted to be able to expand easily without constantly having to add columns.
Thanks for any help you can give me.
try something like:
select studentid from table
where ( period = 1 AND class = "math" ) or ( period = 2 AND class = "english" )
group by studentid having count(*) >= 2
The idea is to select all rows that meet the first criterion or the second, group them by student, and check from the number of grouped rows that both were met.
You can use subqueries to do each individually and get only results where both subqueries match.
Select StudentId FROM table WHERE
StudentId IN
(SELECT studentID FROM table WHERE ( period = 1 AND class= "math" ) )
AND
StudentId IN
(SELECT studentID FROM table WHERE ( period = 2 AND class= "english" ) )
Edit - added
I have not tested this myself, but I was curious about performance considerations, so I looked it up. I found this quote:
Many Transact-SQL statements that include subqueries can be alternatively formulated as joins. Other questions can be posed only with subqueries. In Transact-SQL, there is usually no performance difference between a statement that includes a subquery and a semantically equivalent version that does not. However, in some cases where existence must be checked, a join yields better performance. Otherwise, the nested query must be processed for each result of the outer query to ensure elimination of duplicates. In such cases, a join approach would yield better results. The following is an example showing both a subquery SELECT and a join SELECT that return the same result set:
here: http://technet.microsoft.com/en-us/library/ms189575.aspx
You could also do a self join
SELECT t1.studentID
FROM table t1
JOIN table t2 ON t1.studentID = t2.studentID
WHERE ( t1.period = 1 AND t1.class= "math" )
AND ( t2.period = 2 AND t2.class = "english" )
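To compare the approaches on the sample data from the question, here is a short sketch using Python's sqlite3 module. The table name schedule is a stand-in chosen here (table is a reserved word), and string literals use single quotes, which is the standard SQL form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE schedule (studentId INTEGER, period INTEGER, class TEXT);
    INSERT INTO schedule VALUES
        (1, 1, 'math'), (1, 2, 'english'),
        (2, 1, 'math'), (2, 2, 'history');
""")

# OR the two conditions, then require that both rows were found per student.
by_group = conn.execute("""
    SELECT studentId
    FROM schedule
    WHERE (period = 1 AND class = 'math')
       OR (period = 2 AND class = 'english')
    GROUP BY studentId
    HAVING COUNT(*) >= 2
""").fetchall()

# Self join: one alias per required row.
by_self_join = conn.execute("""
    SELECT t1.studentId
    FROM schedule t1
    JOIN schedule t2 ON t1.studentId = t2.studentId
    WHERE t1.period = 1 AND t1.class = 'math'
      AND t2.period = 2 AND t2.class = 'english'
""").fetchall()

print(by_group, by_self_join)
```

Both queries return only student 1. The original attempt returns nothing because a single row can never have period = 1 and period = 2 at the same time.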