SQL - Group by with having Result - sql

For the following table, I need to fetch the users who made at least 2 distinct transactions and whose net sales sum to 20 or more.
Everything needs to be in a single SELECT; I can't use a temp table. I am using the query below, but the result is ambiguous:
select z.customer_nbr, transaction_nbr
from sales_transaction,
(select customer_nbr
from sales_transaction
group by customer_nbr
having count(transaction_nbr) >=2) z
group by z.customer_nbr, transaction_nbr
having sum(net_sales_rtl)>20
Below is the result.
The ambiguity in the result: the customer number shown has no transaction with number 16.

By "user", I assume you mean the entity referred to by customer_nbr.
Your query is only looking at the net sales for a single transaction, not for the entire customer.
You seem to want aggregation and having:
select st.customer_nbr
from sales_transaction st
group by st.customer_nbr
having count(distinct st.transaction_nbr) >= 2 and
sum(st.net_sales_rtl) >= 20;
If you wanted all transactions to follow the 20 minimum, then two levels of aggregation would be appropriate:
select ct.customer_nbr
from (select st.customer_nbr, st.transaction_nbr,
sum(st.net_sales_rtl) as transaction_net_sales
from sales_transaction st
group by st.customer_nbr, st.transaction_nbr
) ct
group by ct.customer_nbr
having count(*) >= 2 and
min(ct.transaction_net_sales) >= 20;
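The single-level aggregation above can be tried end to end with a minimal sketch. It uses Python's built-in sqlite3 module and made-up sample rows; the data is an assumption, and the column name net_sales_rtl is taken from the question's query, so this illustrates the technique rather than the asker's actual table.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_transaction ("
    "customer_nbr INTEGER, transaction_nbr INTEGER, net_sales_rtl REAL)"
)
rows = [
    (1, 10, 15.0), (1, 11, 10.0),  # customer 1: 2 transactions, total 25
    (2, 12, 30.0), (2, 12, 5.0),   # customer 2: 1 transaction (two lines), total 35
    (3, 13, 5.0), (3, 14, 5.0),    # customer 3: 2 transactions, total 10
]
conn.executemany("INSERT INTO sales_transaction VALUES (?, ?, ?)", rows)

# Both conditions are checked per customer, after grouping.
query = """
SELECT customer_nbr
FROM sales_transaction
GROUP BY customer_nbr
HAVING COUNT(DISTINCT transaction_nbr) >= 2
   AND SUM(net_sales_rtl) >= 20
ORDER BY customer_nbr
"""
qualifying = [r[0] for r in conn.execute(query)]
print(qualifying)
```

Only customer 1 passes both tests here: customer 2 has a single distinct transaction, and customer 3 does not reach the total of 20.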

I think what is missing here is a join between the results from sales_transaction and the subquery z.
If the subquery z also returns transaction_nbr, you can join on that column, with something like this:
select z.customer_nbr, s.transaction_nbr
from sales_transaction s,
(select customer_nbr, transaction_nbr
from sales_transaction
group by customer_nbr, transaction_nbr
having count(transaction_nbr) >=2) z
where z.transaction_nbr = s.transaction_nbr
group by z.customer_nbr, s.transaction_nbr
having sum(net_sales_rtl)>20

Related

Count Distinct values in one column based on other column

I am trying to count the distinct values of Z_l for a given value of X, using a WITH clause. A sample data exercise is included below.
Please look at the picture: the distinct values of Z_l for X='ny'.
with distincz_l as (select ny.X, ny.z_l o.cnt From HOPL ny join (select X, count(*) as cnt from HOPL group by X) o on (ny.X = o.Z_l)) select * from HOPL;
You don't even need a WITH clause, since a single statement is enough:
SELECT z_l, count(1)
FROM hopl
WHERE x='ny'
GROUP BY z_l
;
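As a minimal sketch of that statement, here it is run against a few made-up rows with Python's built-in sqlite3 module. The sample data is an assumption. The second query, using COUNT(DISTINCT ...), is not part of the answer above; it is one way to get a single number of distinct Z_l values, if that is what is wanted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hopl (x TEXT, z_l TEXT)")
# Hypothetical rows: three for 'ny', two for 'la'.
conn.executemany(
    "INSERT INTO hopl VALUES (?, ?)",
    [("ny", "a"), ("ny", "a"), ("ny", "b"), ("la", "a"), ("la", "c")],
)

# Row count per z_l value, restricted to x = 'ny'.
per_value = conn.execute(
    "SELECT z_l, COUNT(*) FROM hopl WHERE x = 'ny' GROUP BY z_l ORDER BY z_l"
).fetchall()

# Number of distinct z_l values for x = 'ny'.
n_distinct = conn.execute(
    "SELECT COUNT(DISTINCT z_l) FROM hopl WHERE x = 'ny'"
).fetchone()[0]

print(per_value, n_distinct)
```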

Count query with timestamp value

I would like to create a count query (in Postgres) which counts data.data_name dependent on data.todb_date.
What I want is for the query to count all the rows with a date later than the one required in the WHERE clause. I tried Count(data.data_name) and Count(*), but they didn't work.
My planned result looks like this:
todb_date: 2016-01-01
data.data_name : test1
count: 150
todb_date: 2017-01-01
data.data_name : test1
count: 130
This is the query I have tried:
SELECT data.data_name, parentdata.data_id,
data.data_id, parentdata.todb_date,
COUNT (data.data_name)
FROM parentdata, data
WHERE parentdata.data_id = data.data_id
AND parentdata.todb_date > '2016-01-01'
GROUP BY parentdata.data_id, data.data_id, data.data_name, parentdata.todb_date
As @Usagi Miyamoto suggested, you should use the date_trunc() function to group your results by a time increment (here: per year):
SELECT d.data_name nam, date_trunc('year',p.todb_date) yr, COUNT(*) cnt
FROM parentdata p
INNER JOIN data d ON p.data_id = d.data_id AND p.todb_date > '2016-01-01'
GROUP BY d.data_name,date_trunc('year',p.todb_date)
ORDER BY nam, yr
If you replace 'year' with 'day' you will get daily counts.

Select TOP predicate

I have a table with the fields StudentID, ClassID, ExamID, SubjectID and Scores.
I am trying to get the sum of the top 7 scores from the attempted subjects for every student. The SQL statement below instead gives me the sum of the scores of all subjects for the top 7 students:
SELECT TOP 7 Sum(tblScores.Scores) AS Total, tblScores.AdmissionID
FROM tblScores
WHERE (((tblScores.ExamID)=[Forms]![frmReports]![lstC]) AND ((tblScores.ClassID)=[Forms]![frmReports]![lstB]))
GROUP BY tblScores.AdmissionID
ORDER BY Sum(tblScores.Scores) DESC;
The Class and Exam criteria are read from the form "frmReports".
Can anyone help me out?
Consider a correlated subquery to calculate a running rank of scores. Then, nest this select query in a derived table for Score aggregation, filtered by each student's top 7 scores (including ties):
SELECT main.AdmissionID, Sum(main.Scores) As [Total]
FROM
(SELECT tblScores.AdmissionID, tblScores.Scores,
(SELECT Count(*) FROM tblScores sub
WHERE sub.AdmissionID = tblScores.AdmissionID
AND sub.Scores >= tblScores.Scores) As ScoreRank
FROM tblScores
WHERE (((tblScores.ExamID)=[Forms]![frmReports]![lstC])
AND ((tblScores.ClassID)=[Forms]![frmReports]![lstB]))
) As main
WHERE main.ScoreRank <= 7
GROUP BY main.AdmissionID
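To see the ranking idea in isolation, here is a minimal sketch using Python's built-in sqlite3 module instead of Access. The sample scores are made up, the form-based ExamID and ClassID filters are left out, and the cutoff is 2 rather than 7 to keep the data small; all of these are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblScores (AdmissionID INTEGER, Scores INTEGER)")
# Hypothetical data: student 1 has three scores, student 2 has two.
conn.executemany(
    "INSERT INTO tblScores VALUES (?, ?)",
    [(1, 90), (1, 80), (1, 70), (2, 60), (2, 50)],
)

top_n = 2  # the question uses 7

# ScoreRank = how many of the same student's scores are >= this one.
query = """
SELECT main.AdmissionID, SUM(main.Scores) AS Total
FROM (
    SELECT t.AdmissionID, t.Scores,
           (SELECT COUNT(*) FROM tblScores sub
            WHERE sub.AdmissionID = t.AdmissionID
              AND sub.Scores >= t.Scores) AS ScoreRank
    FROM tblScores t
) AS main
WHERE main.ScoreRank <= ?
GROUP BY main.AdmissionID
ORDER BY main.AdmissionID
"""
totals = conn.execute(query, (top_n,)).fetchall()
print(totals)
```

The sample scores are all distinct within a student. Tied scores receive the same rank under this scheme, so it is worth checking how ties at the cutoff behave on real data.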
Try this:
SELECT TOP 7 Sum(tblScores.Scores) AS Total, tblScores.AdmissionID
FROM tblScores
HAVING (((tblScores.ExamID)=[Forms]![frmReports]![lstC]) AND ((tblScores.ClassID)=[Forms]![frmReports]![lstB]))
GROUP BY tblScores.AdmissionID
ORDER BY Sum(tblScores.Scores) DESC;
@Parfait - Thank you for your guidance. I have tweaked your solution to get exactly the result I needed. Here is the SQL:
SELECT Dupe.AdmissionID, Dupe.Scores, Dupe.ScoreRank
FROM (SELECT qryFilteredScores.AdmissionID, qryFilteredScores.Scores,
(SELECT Count(*) FROM qryFilteredScores AS sub
WHERE sub.AdmissionID = qryFilteredScores.AdmissionID
AND sub.Scores > qryFilteredScores.Scores) + 1 AS ScoreRank
FROM qryFilteredScores
WHERE (((qryFilteredScores.ExamID)=Forms!frmReports!lstC)
AND ((qryFilteredScores.ClassID)=Forms!frmReports!lstB))) AS Dupe
WHERE (((Dupe.Scores)<>0) AND ((Dupe.ScoreRank)<=7));
Alternatively, you can use a derived table or stored query (where you save derived table as separate object referenced in this query) and avoid inline subqueries:
SELECT Dupe.AdmissionID, Total.TotalScore
FROM qryFilteredScores AS Dupe
INNER JOIN
(SELECT sub.AdmissionID, Sum(sub.Scores) As TotalScore
FROM qryFilteredScores sub
WHERE (((sub.ClassID)=[Forms]![frmReports]![lstB])
AND ((sub.ExamID)=[Forms]![frmReports]![lstC]))
GROUP BY sub.AdmissionID) AS Total
ON Dupe.AdmissionID = Total.AdmissionID
GROUP BY Dupe.AdmissionID, Total.TotalScore
ORDER BY Dupe.AdmissionID;

SQL Nested Select -Subquery returned more than 1 value-

I have a table Sales with columns SalesID, SalesName, SalesCity, SalesState.
I am trying to come up with a query that only shows salesName where there is one SalesName per SalesCity. So for example, if SaleA is in Houston and SaleB is in Houston, SaleA and SaleB will not be returned.
select
SalesName, SalesCity, SalesState
from
Sales
where
(select count(*) from Sales group by SalesCity) = 1;
I am not entirely sure how to link the inner select back out. I need another column in the nested select to identify the SalesID. I am currently stuck and have made no progress.
You can get the names of cities that have only 1 sale by using GROUP BY and HAVING operators. Then use these results in your where clause:
SELECT SalesName, SalesCity, SalesState
FROM Sales WHERE SalesCity IN
(
SELECT SalesCity
FROM Sales
GROUP BY SalesCity
HAVING COUNT(SalesCity) = 1
)
You can do this without a subquery:
select MIN(SalesName) as SalesName, SalesCity, MIN(SalesState) as SalesState
from Sales
group by SalesCity
having count(*) = 1;
If there is only one row for the city, then the min() will return the value on that row.
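A minimal sketch comparing the two answers on made-up rows, using Python's built-in sqlite3 module (the sample data is an assumption). With SaleA and SaleB both in Houston, only the Austin and Denver rows should come back from either query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales ("
    "SalesID INTEGER, SalesName TEXT, SalesCity TEXT, SalesState TEXT)"
)
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?, ?)",
    [
        (1, "SaleA", "Houston", "TX"),
        (2, "SaleB", "Houston", "TX"),
        (3, "SaleC", "Austin", "TX"),
        (4, "SaleD", "Denver", "CO"),
    ],
)

# Variant 1: subquery finds the cities that have exactly one row.
with_subquery = conn.execute("""
    SELECT SalesName, SalesCity, SalesState
    FROM Sales
    WHERE SalesCity IN (
        SELECT SalesCity FROM Sales
        GROUP BY SalesCity
        HAVING COUNT(*) = 1
    )
    ORDER BY SalesCity
""").fetchall()

# Variant 2: no subquery; MIN() just picks the value of the only row.
without_subquery = conn.execute("""
    SELECT MIN(SalesName), SalesCity, MIN(SalesState)
    FROM Sales
    GROUP BY SalesCity
    HAVING COUNT(*) = 1
    ORDER BY SalesCity
""").fetchall()

print(with_subquery)
print(without_subquery)
```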

How do I get the top 10 results of a query?

I have a PostgreSQL query like this:
with r as (
select
1 as reason_type_id,
rarreason as reason_id,
count(*) over() count_all
from
workorderlines
where
rarreason != 0
and finalinsdate >= '2012-12-01'
)
select
r.reason_id,
rt.desc,
count(r.reason_id) as num,
round((count(r.reason_id)::float / (select count(*) as total from r) * 100.0)::numeric, 2) as pct
from r
left outer join
rtreasons as rt
on
r.reason_id = rt.rtreason
and r.reason_type_id = rt.rtreasontype
group by
r.reason_id,
rt.desc
order by r.reason_id asc
This returns a table of results with 4 columns: the reason id, the description associated with that reason id, the number of entries having that reason id, and the percent of the total that number represents.
This table looks like this:
What I would like to do is only display the top 10 results based off the total number of entries having a reason id. However, whatever is leftover, I would like to compile into another row with a description called "Other". How would I do this?
with r2 as (
...everything before the select list...
dense_rank() over(order by count(r.reason_id) desc) cause_rank
...the rest of your query...
)
select * from r2 where cause_rank < 11
union
select
NULL as reason_id,
'Other' as desc,
sum(r2.num) over() as num,
sum(r2.pct) over() as pct,
11 as cause_rank
from r2
where cause_rank >= 11
As said above, use LIMIT for the top 10, and use OFFSET to skip those rows and get the rest.
Not sure about PostgreSQL, but SELECT TOP 10 ... (or its equivalent there, LIMIT 10) should do the trick if you sort correctly.
As for the second part: you might use a right join for this. Join the top 10 result with the whole table data and keep only the records that do not appear on the left side. If you calculate the sum of those, you should get your "sum of the rest" result.
I assume that vw_my_top_10 is the view showing you the top 10 records. vw_all_records shows all records (including the top 10).
Like this:
SELECT SUM(a_field)
FROM vw_my_top_10
RIGHT JOIN vw_all_records
ON (vw_my_top_10.Key = vw_all_records.Key)
WHERE vw_my_top_10.Key IS NULL
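Here is a minimal sketch of the "top N plus Other" idea, run with Python's built-in sqlite3 module. The table reason_counts, its rows, and the cutoff of 2 (instead of 10) are all hypothetical. It also swaps the sides of the join and uses LEFT JOIN, since older SQLite versions lack RIGHT JOIN; the anti-join logic is otherwise the same as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical pre-aggregated table: one row per reason with its count.
conn.execute("CREATE TABLE reason_counts (reason_id INTEGER, num INTEGER)")
conn.executemany(
    "INSERT INTO reason_counts VALUES (?, ?)",
    [(1, 50), (2, 30), (3, 15), (4, 5)],
)

# The top rows, highest count first (LIMIT 2 stands in for LIMIT 10).
top_rows = conn.execute(
    "SELECT reason_id, num FROM reason_counts ORDER BY num DESC LIMIT 2"
).fetchall()

# Everything that is NOT in the top set: join all rows to the top set
# and keep only those with no match, then add them up.
other_total = conn.execute("""
    WITH top_n AS (
        SELECT reason_id FROM reason_counts ORDER BY num DESC LIMIT 2
    )
    SELECT SUM(a.num)
    FROM reason_counts a
    LEFT JOIN top_n t ON t.reason_id = a.reason_id
    WHERE t.reason_id IS NULL
""").fetchone()[0]

result = top_rows + [("Other", other_total)]
print(result)
```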