PostgreSQL query not fetching correct result for the conditional comparison of aggregate function

I have a products table with the following values, where id is INT and profit is numeric:
id profit
1 6.00
2 3.00
3 2.00
4 3.00
5 2.00
6 8.00
7 4.00
8 3.00
9 1.00
10 4.00
11 10.00
12 3.00
13 6.00
14 5.00
15 2.00
16 7.00
17 6.00
18 5.00
19 2.00
20 16.00
21 3.00
22 6.00
23 5.00
24 5.00
25 1.00
26 4.00
27 1.00
28 7.00
29 11.00
30 2.00
31 1.00
32 3.00
33 2.00
34 5.00
35 4.00
I want to fetch the ids whose profit is greater than the average profit.
My query:
SELECT product_id,profit
FROM products
GROUP BY product_id,profit
HAVING profit > AVG(profit)::INT
But the above query returns an empty result.

When a GROUP BY query runs, the rows are first grouped by the listed columns, and only then is the HAVING clause applied to each group.
Here the first group is id 1 with profit 6.00, so its AVG(profit) is also 6.00. The condition profit > AVG(profit) then compares 6.00 with itself, so the group is rejected.
The same holds for every other row, which is why you get an empty result set: no number can be greater than itself.
Based on your description, though, you can get what you want by computing the average in a subquery:
select * from products where profit > (select avg(profit) from products)
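The difference between the two queries can be reproduced with a small script. This is only a sketch under a few assumptions: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, loads just the first six rows of the table above, leaves out the ::INT cast (PostgreSQL syntax), and uses id where the question's query says product_id. The function names are made up for the illustration.

```python
import sqlite3

# First six rows of the products table from the question (id, profit).
ROWS = [(1, 6.0), (2, 3.0), (3, 2.0), (4, 3.0), (5, 2.0), (6, 8.0)]


def make_db():
    """Create an in-memory database holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER, profit REAL)")
    conn.executemany("INSERT INTO products VALUES (?, ?)", ROWS)
    return conn


def grouped_having(conn):
    """The question's approach: each group is a single row,
    so AVG(profit) equals that row's own profit."""
    sql = """
        SELECT id, profit
        FROM products
        GROUP BY id, profit
        HAVING profit > AVG(profit)
    """
    return conn.execute(sql).fetchall()


def above_average(conn):
    """The answer's approach: compare each row with the table-wide average."""
    sql = """
        SELECT id, profit
        FROM products
        WHERE profit > (SELECT AVG(profit) FROM products)
        ORDER BY id
    """
    return conn.execute(sql).fetchall()


conn = make_db()
print(grouped_having(conn))  # no rows: nothing is greater than itself
print(above_average(conn))   # ids 1 and 6 (the six-row average is 4.0)
```

The subquery is evaluated once over the whole table, so every row is compared with the same average instead of with its own one-row group.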

Related

Converting Columns into Rows in Pentaho

I have a table that looks like the current one below. How can I move the values for date, count2 and amount2 from columns into rows just ONCE?
current:
date type count amount type2 date count2 amount2
1/1/20 00 5 13.49 ZZZ 1/1/20 0 5.00
1/1/20 12 6 14.69 ZZZ 1/1/20 0 5.00
1/1/20 11 10 20.66 ZZZ 1/1/20 0 5.00
expected
date type count amount
1/1/20 12 6 14.69
1/1/20 12 6 14.69
1/1/20 11 10 20.66
1/1/20 ZZZ 0 5.00
What you can do is split the data flow into two Select values steps.
In the first Select values step, keep the first group of columns (date, type, count, amount in your example).
In the second Select values step, keep the remaining columns, and use a Filter rows step to remove the redundant rows. Then add a Dummy step and link both Select values steps to it.
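To make the idea of the split concrete, here is a rough plain-Python analogy of that flow. It is not Pentaho code: the function name unpivot_once is invented, and the second date column is called date2 here only because a Python dict cannot hold two keys with the same name.

```python
# Sample rows from the "current" table in the question.
rows = [
    {"date": "1/1/20", "type": "00", "count": 5, "amount": 13.49,
     "type2": "ZZZ", "date2": "1/1/20", "count2": 0, "amount2": 5.00},
    {"date": "1/1/20", "type": "12", "count": 6, "amount": 14.69,
     "type2": "ZZZ", "date2": "1/1/20", "count2": 0, "amount2": 5.00},
    {"date": "1/1/20", "type": "11", "count": 10, "amount": 20.66,
     "type2": "ZZZ", "date2": "1/1/20", "count2": 0, "amount2": 5.00},
]


def unpivot_once(rows):
    """Split each row into two streams, de-duplicate the second, then append."""
    # Stream 1: the first group of columns, one output row per input row.
    first = [(r["date"], r["type"], r["count"], r["amount"]) for r in rows]

    # Stream 2: the second group of columns, keeping each distinct row once.
    second = []
    for r in rows:
        rec = (r["date2"], r["type2"], r["count2"], r["amount2"])
        if rec not in second:
            second.append(rec)

    # Joining both streams corresponds to linking them to one target step.
    return first + second


for rec in unpivot_once(rows):
    print(rec)
```

With the three sample rows this gives the three original detail rows followed by a single ZZZ row.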

Left Outer Join SSMS

I am attempting a left outer join between two tables.
select id, startdate, name, code, email from edw.dbo.starts
id startdate name code email yearfiled
15 2/4/2018 SO 1083 sql#gmail.com 2018
17 3/4/2018 SO 1083 ssms#gmail.com 2018
19 4/4/2018 SO 1083 ssrs#gmail.com 2018
21 5/4/2018 SO 1083 ssas#gmail.com 2018
21 5/5/2017 SO 1083 who#gmail.com 2017
select customer, return_year, revenue, code from sql.dbo.paid
customer return_year revenue code
15 2018 15.00 1083
17 2018 25.00 1083
21 2018 35.00 1083
21 2017 35.00 1083
select
    month(os.startdate) as startmonth
    ,os.name
    ,os.code
    ,coalesce(s.revenue, 0) as revenue
    ,count(os.email) as commission
from
    edw.dbo.starts as os
left outer join
    sql.dbo.paid as s
on
    os.id = s.customer
    and os.yearfiled = s.return_year
where
    os.yearfiled = 2018
    and os.code = '1083'
    and os.startdate is not null
group by
    month(os.startdate)
    ,os.name
    ,os.code
    ,coalesce(s.revenue, 0);
startmonth name code revenue commission
2 SO 1083 15.00 1
3 SO 1083 25.00 1
4 SO 1083 0.00 1
5 SO 1083 0.00 1
The issue:
Customer = 21 from sql.dbo.paid shows zero for the revenue in the joined query even though it had a reported $35.00 revenue in the table.
Requested:
startmonth name code revenue commission
2 SO 1083 15.00 1
3 SO 1083 25.00 1
4 SO 1083 0.00 1
5 SO 1083 35.00 1
Please try this one.
create table starts
(
    id        int,
    startdate date,
    name      varchar(10),
    code      int,
    email     varchar(20),
    yearfiled int
);
create table paid
(
    customer    int,
    return_year int,
    revenue     decimal(10,2),
    code        int
);
insert into starts values
    (15,'2/4/2018','SO',1083,'sql#gmail.com',2018),
    (17,'3/4/2018','SO',1083,'ssms#gmail.com',2018),
    (19,'4/4/2018','SO',1083,'ssrs#gmail.com',2018),
    (21,'5/4/2018','SO',1083,'ssas#gmail.com',2018),
    (21,'5/5/2017','SO',1083,'who#gmail.com',2017);
insert into paid values
    (15,2018,15.00,1083),
    (17,2018,25.00,1083),
    (21,2018,35.00,1083),
    (21,2017,35.00,1083);
select month(a.startdate) as startmonth, a.name, a.code,
       case when b.revenue is null then 0 else b.revenue end as revenue,
       count(a.email) as commission
from starts a
left join paid b on a.yearfiled = b.return_year and a.id = b.customer
where a.yearfiled = 2018 and a.code = 1083
group by month(a.startdate), a.name, a.code, b.revenue
/*
startmonth name code revenue commission
----------- ---------- ----------- --------------------------------------- -----------
2 SO 1083 15.00 1
3 SO 1083 25.00 1
4 SO 1083 0.00 1
5 SO 1083 35.00 1
*/
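If you want to check the join without a SQL Server instance, the script below is a hedged re-creation using Python's built-in sqlite3 module. Several things are assumptions made for the port: the dates are rewritten in ISO form (reading 2/4/2018 as February 4, which matches startmonth 2 in the output above), month() is replaced by SQLite's strftime, and the function name revenue_by_month is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE starts (id INTEGER, startdate TEXT, name TEXT,
                         code INTEGER, email TEXT, yearfiled INTEGER);
    CREATE TABLE paid (customer INTEGER, return_year INTEGER,
                       revenue REAL, code INTEGER);
    INSERT INTO starts VALUES
        (15, '2018-02-04', 'SO', 1083, 'sql#gmail.com',  2018),
        (17, '2018-03-04', 'SO', 1083, 'ssms#gmail.com', 2018),
        (19, '2018-04-04', 'SO', 1083, 'ssrs#gmail.com', 2018),
        (21, '2018-05-04', 'SO', 1083, 'ssas#gmail.com', 2018),
        (21, '2017-05-05', 'SO', 1083, 'who#gmail.com',  2017);
    INSERT INTO paid VALUES
        (15, 2018, 15.00, 1083),
        (17, 2018, 25.00, 1083),
        (21, 2018, 35.00, 1083),
        (21, 2017, 35.00, 1083);
""")


def revenue_by_month(conn):
    """Left join starts to paid on both customer id and year."""
    sql = """
        SELECT CAST(strftime('%m', a.startdate) AS INTEGER) AS startmonth,
               a.name,
               a.code,
               COALESCE(b.revenue, 0) AS revenue,
               COUNT(a.email) AS commission
        FROM starts AS a
        LEFT JOIN paid AS b
               ON a.id = b.customer
              AND a.yearfiled = b.return_year
        WHERE a.yearfiled = 2018 AND a.code = 1083
        GROUP BY CAST(strftime('%m', a.startdate) AS INTEGER),
                 a.name, a.code, b.revenue
        ORDER BY startmonth
    """
    return conn.execute(sql).fetchall()


for row in revenue_by_month(conn):
    print(row)
```

Customer 19 has no row in paid, so the left join leaves its revenue NULL and COALESCE turns that into 0. Customer 21 matches the 2018 row and keeps its 35.00.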

Pandas Group By With Running Total

My granny has some strange ideas. Every birthday she takes me shopping.
She has some strict rules. If I buy a present for less than $20 she won't contribute anything. If I spend over $20 she will contribute the amount above $20, up to $30 in total.
So if a present costs $27 she would contribute $7.
That leaves $23 of her $30 for extra presents that birthday; the same rules as above apply to any additional presents.
Once the $30 is spent there are no more contributions from granny and I must pay the rest myself.
Here is an example table of my 11th, 12th and 13th birthday.
                        DollarsSpent  granny_pays
BirthDayAge PresentNum
11          1                  25.00         5.00  -- I used up $5
            2                 100.00        25.00  -- I used up the last $25
            3                  10.00         0.00
            4                  50.00         0.00
12          1                  39.00        19.00  -- I used up $19, only $11 left
            2                   7.00         0.00
            3                  32.00        11.00  -- I used up the last $11 despite $12 of $32 being above the $20 starting point
            4                  19.00         0.00
13          1                  21.00         1.00  -- used up $1
            2                  27.00         7.00  -- used up $7, total used up $8 and never spent the last $22
So in pandas I have gotten this far.
import numpy as np
import pandas as pd

granny_wont_pay_first = 20.
granny_limit = 30.
df = pd.DataFrame({'BirthDayAge' : ['11','11','11','11','12','12','12','12','13','13']
                  ,'PresentNum' : [1,2,3,4,1,2,3,4,1,2]
                  ,'DollarsSpent' : [25.,100.,10.,50.,39.,7.,32.,19.,21.,27.]
                  })
df.set_index(['BirthDayAge','PresentNum'],inplace=True)
# amount above the $20 threshold for each present (may be negative)
df['granny_pays'] = df['DollarsSpent'] - granny_wont_pay_first
df['granny_limit'] = granny_limit
df['zero'] = 0.0
# the median of (amount over threshold, 0, limit) clamps the value into [0, limit]
df['granny_pays'] = df[['granny_pays','zero','granny_limit']].apply(np.median,axis=1)
df.drop(['granny_limit','zero'], axis=1, inplace=True)
print(df.head(len(df)))
And this is the output. Using the median on the 3 numbers is a nice way to work out what granny will contribute.
The problem is that you can see each present is treated in isolation and I don't correctly erode my $30 each present within each BirthDayAge.
                        DollarsSpent  granny_pays
BirthDayAge PresentNum
11          1                  25.00         5.00
            2                 100.00        30.00  -- should be 25.0
            3                  10.00         0.00
            4                  50.00        30.00  -- should be 0.0
12          1                  39.00        19.00
            2                   7.00         0.00
            3                  32.00        12.00  -- should be 11.0
            4                  19.00         0.00
13          1                  21.00         1.00
            2                  27.00         7.00
Trying to think of a nice pandas way to do this erosion.
Hopefully no loops please.
I don't know if there is a more concise way, but this should work and does avoid loops as requested.
df['per_gift'] = df.DollarsSpent - 20
df['per_gift'] = np.where( df.per_gift > 0, df.per_gift, 0 )
df['per_bday'] = df.groupby('BirthDayAge').per_gift.cumsum()
df['per_bday'] = np.where( df.per_bday > 30, 30, df.per_bday )
df['granny_pays'] = df.groupby('BirthDayAge').per_bday.diff()
df['granny_pays'] = df.granny_pays.fillna(df.per_bday)
Note that 'per_gift' ignores the maximum subsidy of $30 and 'per_bday' is the cumulative subsidy (capped at $30) per 'BirthDayAge'.
BirthDayAge DollarsSpent PresentNum per_gift per_bday granny_pays
0 11 25 1 5 5 5
1 11 100 2 80 30 25
2 11 10 3 0 30 0
3 11 50 4 30 30 0
4 12 39 1 19 19 19
5 12 7 2 0 19 0
6 12 32 3 12 30 11
7 12 19 4 0 30 0
8 13 21 1 1 1 1
9 13 27 2 7 8 7
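For anyone who wants to run this end to end, here is one possible self-contained variant of the same idea (cumulative sum per birthday, cap it, then take differences). It is a sketch rather than the only way to do it: the function name granny_contributions and the two constants are made up here, and it uses Series.clip instead of np.where.

```python
import pandas as pd

GRANNY_THRESHOLD = 20.0  # granny pays nothing towards the first $20 of a present
GRANNY_LIMIT = 30.0      # most granny will pay per birthday


def granny_contributions(df):
    """Return granny's contribution per present, eroding the limit per birthday."""
    # Uncapped amount granny would pay for each present taken on its own.
    over = (df["DollarsSpent"] - GRANNY_THRESHOLD).clip(lower=0)
    # Running total per birthday, capped at the yearly limit.
    capped = over.groupby(df["BirthDayAge"]).cumsum().clip(upper=GRANNY_LIMIT)
    # Each present's share is the increase in the capped running total;
    # the first present of each birthday has no predecessor, so fill it in.
    return capped.groupby(df["BirthDayAge"]).diff().fillna(capped)


df = pd.DataFrame({
    "BirthDayAge": ["11", "11", "11", "11", "12", "12", "12", "12", "13", "13"],
    "PresentNum": [1, 2, 3, 4, 1, 2, 3, 4, 1, 2],
    "DollarsSpent": [25., 100., 10., 50., 39., 7., 32., 19., 21., 27.],
})
df["granny_pays"] = granny_contributions(df)
print(df)
```

Capping the running total before differencing is what erodes the $30: once the cap is reached the capped total stops growing, so every later difference is 0.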

How to select top 4 records of amount from 2 tables [closed]

I have two tables, with document details as shown below:
Table1: contains the customer and the total document amount.
DocEntry CustID CustName City DocAmount
1 GF002 Raffy N London 120.00
2 GF025 Jhon Liverpool 50.00
3 GF120 Keng London 125.25
4 GF055 Tung L. London 30.00
5 GF020 Lee H. Manchester 60.00
Table2: contains the items and the item price for each document.
DocEntry LineNum ItemID ItemName ItemPrice Qty LineAmount
1 0 I0001 Mouse 6.00 5 30.00
1 1 I0002 Key Broad 6.00 5 30.00
1 2 I0200 Monitor 60.00 1 60.00
2 0 I0501 Ext.HDD1 50.00 1 50.00
3 0 I0665 Printer 125.00 1 125.00
4 0 I0002 Key Broad 6.00 4 24.00
4 1 I0001 Mouse 6.00 1 6.00
5 0 I0050 ODD 12.00 1 12.00
5 1 I0001 Mouse 6.00 8 48.00
I would like to select the top 3 documents from Table1 with the highest DocAmount and, for those 3 documents, show the line details from Table2.
The result should be:
Row DocEntry CustID CustName DocAmount ItemID ItemName ItemPrice Qty LineAmount
1 3 GF120 Keng 125.25 I0665 Printer 125.00 1 125.00
2 1 GF002 Raffy N 120.00 I0001 Mouse 6.00 5 30.00
3 1 GF002 Raffy N 120.00 I0002 Key Broad 6.00 5 30.00
4 1 GF002 Raffy N 120.00 I0200 Monitor 60.00 1 60.00
5 5 GF020 Lee H. 60.00 I0050 ODD 12.00 1 12.00
6 5 GF020 Lee H. 60.00 I0001 Mouse 6.00 8 48.00
select Table2.DocEntry, CustID, CustName, DocAmount, ItemID, ItemName,
       ItemPrice, Qty, LineAmount
from (select top 3 * from Table1 order by DocAmount desc) TopDocs
join Table2 on TopDocs.DocEntry = Table2.DocEntry
order by DocAmount desc
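The same pattern (pick the top documents in a derived table, then join the lines) can be tried locally with the sketch below. It runs on Python's built-in sqlite3 module, so TOP 3 is swapped for LIMIT 3, only the columns needed for the illustration are loaded, and a secondary sort on LineNum is added to keep the line order stable. The function name top_docs_with_lines is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (DocEntry INTEGER, CustID TEXT, CustName TEXT,
                         DocAmount REAL);
    CREATE TABLE Table2 (DocEntry INTEGER, LineNum INTEGER, ItemID TEXT,
                         LineAmount REAL);
    INSERT INTO Table1 VALUES
        (1, 'GF002', 'Raffy N', 120.00),
        (2, 'GF025', 'Jhon',     50.00),
        (3, 'GF120', 'Keng',    125.25),
        (4, 'GF055', 'Tung L.',  30.00),
        (5, 'GF020', 'Lee H.',   60.00);
    INSERT INTO Table2 VALUES
        (1, 0, 'I0001',  30.00),
        (1, 1, 'I0002',  30.00),
        (1, 2, 'I0200',  60.00),
        (2, 0, 'I0501',  50.00),
        (3, 0, 'I0665', 125.00),
        (4, 0, 'I0002',  24.00),
        (4, 1, 'I0001',   6.00),
        (5, 0, 'I0050',  12.00),
        (5, 1, 'I0001',  48.00);
""")


def top_docs_with_lines(conn, n=3):
    """Take the n documents with the highest DocAmount, then join their lines."""
    sql = """
        SELECT t2.DocEntry, d.CustName, t2.ItemID, t2.LineAmount
        FROM (SELECT * FROM Table1 ORDER BY DocAmount DESC LIMIT ?) AS d
        JOIN Table2 AS t2 ON d.DocEntry = t2.DocEntry
        ORDER BY d.DocAmount DESC, t2.LineNum
    """
    return conn.execute(sql, (n,)).fetchall()


for row in top_docs_with_lines(conn):
    print(row)
```

Limiting inside the derived table matters: applying the limit after the join would cut off line rows instead of documents.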

How to sum Accounts by account code length?

I have 2 tables: DimAccounts and FactBudget.
DimAccounts example:
AccountKey AccountCode AccountName AccountGroup AccountType
1.6 1 6 1 NN 6 S
1.6 10 6 10 MMM 6 S
1.6 101 6 101 TTT 6 S
1.6 1010 6 1010 IIII 6 B
1.6 1011 6 1011 OOOO 6 B
1.6 1012 6 1012 KKK 6 B
FactBudget example:
TimeKey   AccountKey  Debit  Credit
20110719  1.6 1010    20.00  5.00
20110719  1.6 1011    15.00  0.00
20110719  1.6 1000     5.00  0.00
20110719  1.6 1012    10.00  5.00
20110719  1.6 1112    10.00  0.00
FactBudget contains many accounts, all of type B. I need to get the Debit and Credit sums for accounts of type S (Sum).
Solution example for example data:
TimeKey   AccountKey  Debit  Credit
20110719  1.6 1       60.00  10.00
20110719  1.6 10      50.00  10.00
20110719  1.6 101     45.00  10.00
To calculate debit and credit for the sum account 1.6 101 (7 characters including the space), we take the first 7 characters of every account in FactBudget (1.6 1012 -> 1.6 101, 1.6 1112 -> 1.6 111, 1.6 1011 -> 1.6 101), keep the rows where the result equals the sum account (1.6 101 = 1.6 101), then group by TimeKey and sum Debit and Credit.
To calculate debit and credit for the sum account 1.6 1 (5 characters including the space), we take the first 5 characters of every account in FactBudget (1.6 1012 -> 1.6 1, 1.6 1112 -> 1.6 1, 1.6 1011 -> 1.6 1), keep the rows where the result equals the sum account (1.6 1 = 1.6 1), then group by TimeKey and sum Debit and Credit, and so on.
So, how do I get the Debit and Credit sums for S accounts by TimeKey and AccountKey?
Basically, you could take this answer and just change one of the join conditions:
SELECT
    f.TimeKey,
    s.AccountKey,
    SUM(f.Debit) AS Debit,
    SUM(f.Credit) AS Credit
FROM DimAccounts s
    INNER JOIN DimAccounts b ON b.AccountCode LIKE s.AccountCode + '%'
    /* alternatively: ON s.AccountCode = LEFT(b.AccountCode, LEN(s.AccountCode)) */
    INNER JOIN FactBudget f ON f.AccountKey = b.AccountKey
WHERE s.AccountType = 'S'
    AND b.AccountType = 'B'
GROUP BY
    f.TimeKey,
    s.AccountKey
SELECT F.TimeKey,
       D.AccountKey,
       SUM(F.Debit) Debit,
       SUM(F.Credit) Credit
FROM DimAccounts D
    INNER JOIN FactBudget F ON F.AccountKey LIKE D.AccountKey + '%'
WHERE D.AccountType = 'S'
GROUP BY F.TimeKey,
         D.AccountKey
Maybe something like this:
SELECT
    FactBudget.TimeKey,
    DimAccounts.AccountKey,
    SUM(FactBudget.Debit) AS Debit,
    SUM(FactBudget.Credit) AS Credit
FROM
    DimAccounts
    JOIN FactBudget
        ON DimAccounts.AccountKey =
           SUBSTRING(FactBudget.AccountKey, 0, DATALENGTH(DimAccounts.AccountKey) + 1)
WHERE
    DimAccounts.AccountType = 'S'
GROUP BY
    FactBudget.TimeKey,
    DimAccounts.AccountKey
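The prefix-matching join can be checked against the example data with the sketch below. It assumes Python's built-in sqlite3 module in place of SQL Server, so string concatenation is written || instead of +, and DimAccounts is reduced to the two columns the join needs. The function name sum_account_totals is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DimAccounts (AccountKey TEXT, AccountType TEXT);
    CREATE TABLE FactBudget (TimeKey INTEGER, AccountKey TEXT,
                             Debit REAL, Credit REAL);
    INSERT INTO DimAccounts VALUES
        ('1.6 1',    'S'),
        ('1.6 10',   'S'),
        ('1.6 101',  'S'),
        ('1.6 1010', 'B'),
        ('1.6 1011', 'B'),
        ('1.6 1012', 'B');
    INSERT INTO FactBudget VALUES
        (20110719, '1.6 1010', 20.00, 5.00),
        (20110719, '1.6 1011', 15.00, 0.00),
        (20110719, '1.6 1000',  5.00, 0.00),
        (20110719, '1.6 1012', 10.00, 5.00),
        (20110719, '1.6 1112', 10.00, 0.00);
""")


def sum_account_totals(conn):
    """Sum every fact row whose AccountKey starts with an S account's key."""
    sql = """
        SELECT f.TimeKey, d.AccountKey, SUM(f.Debit), SUM(f.Credit)
        FROM DimAccounts AS d
        JOIN FactBudget AS f
          ON f.AccountKey LIKE d.AccountKey || '%'
        WHERE d.AccountType = 'S'
        GROUP BY f.TimeKey, d.AccountKey
        ORDER BY d.AccountKey
    """
    return conn.execute(sql).fetchall()


for row in sum_account_totals(conn):
    print(row)
```

One caveat with the LIKE form: % and _ are wildcards, so it relies on the account keys never containing those characters.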