I am attempting to select the max requester_req for each customer group, but after trying numerous different approaches, my result set continues to display every row instead of the max for the customer group.
The query:
SELECT
x2.customer,
x.customer_req,
x2.requester_name,
MAX(x2.requester_req) AS requester_req
FROM x, x2
WHERE x.customer = x2.customer
GROUP BY x2.customer, x2.requester_name, x.customer_req
ORDER BY x2.customer
A sample result set:
customer customer_req requester_name requester_req
Bob's Burgers 7 Bob 9
Bob's Burgers 7 Jon 12
Hello Kitty 9 Jane 3
Hello Kitty 9 Luke 7
Expected result set:
customer customer_req requester_name requester_req
Bob's Burgers 7 Jon 12
Hello Kitty 9 Luke 7
Have I screwed up something in my group by clause? I can't count how many times I've switched things up and get the same result set.
Thank you very much for your help!
select the max requester_req for each customer group
Don't aggregate. Instead, you can filter with a correlated subquery:
select
x2.customer,
x.customer_req,
x2.requester_name,
x2.requester_req
from x
inner join x2 on x.customer = x2.customer
where x2.requester_req = (
select max(x20.requester_req) from x2 x20 where x20.customer = x2.customer
)
order by x2.customer
Side note: always use explicit, standard joins (with the ON keyword) instead of old-school implicit joins (with commas in the FROM clause). The implicit syntax has not been recommended for more than 20 years, mostly because it is harder to follow.
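For anyone who wants to try the correlated-subquery filter on the sample rows from the question, here is a minimal sketch. It uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in for whatever engine you are running; the column types in the CREATE TABLE statements are my assumptions, since the question does not show the schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: the question only shows column names, not types.
conn.executescript("""
    CREATE TABLE x  (customer TEXT, customer_req INTEGER);
    CREATE TABLE x2 (customer TEXT, requester_name TEXT, requester_req INTEGER);
    INSERT INTO x  VALUES ('Bob''s Burgers', 7), ('Hello Kitty', 9);
    INSERT INTO x2 VALUES
        ('Bob''s Burgers', 'Bob', 9), ('Bob''s Burgers', 'Jon', 12),
        ('Hello Kitty', 'Jane', 3),   ('Hello Kitty', 'Luke', 7);
""")

# Keep only the x2 row whose requester_req equals the max for its customer.
rows = conn.execute("""
    SELECT x2.customer, x.customer_req, x2.requester_name, x2.requester_req
    FROM x
    INNER JOIN x2 ON x.customer = x2.customer
    WHERE x2.requester_req = (
        SELECT MAX(x20.requester_req) FROM x2 x20 WHERE x20.customer = x2.customer
    )
    ORDER BY x2.customer
""").fetchall()

for row in rows:
    print(row)
```

Note that if two requesters tie for the maximum within one customer, this filter returns both rows.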
I am hoping someone can advise on the below please?
I have some code (below) that pulls the data I need with no issues. I have been trying (in vain) to add a COUNT function in here somewhere. The output I am looking for would be a count of how many orders are assigned to each agent. I tried a few different things based on other questions but can't seem to get it correct. I think I am placing the COUNT 'Agent' statement and the GROUP BY in the wrong place. Please can someone advise? (I am using Oracle SQL Developer).
select
n.ordernum as "Order",
h.employee as "Name"
from ordermgmt n, orderheader h
where h.ordernum = n.ordernum
and h.employee_group IN ('ORDER.MGMT')
and h.employee is NOT NULL
and n.percentcomplete = '0'
and h.order_status !='CLOSED'
Output I am looking for would be, for example:
Name Orders Assigned
Bob 3
Peter 6
John 2
Thank you in advance
Name Total
(blank) 49
(blank) 49
(blank) 49
(blank) 49
(blank) 49
John 4
John 4
John 4
John 4
Peter 2
Peter 2
Bob 3
Bob 3
Bob 3
For example. So there are 49 rows with a blank name, and each shows 49 in the Total column. I did not include all 49 blank rows, to save space.
Would be easier with sample data and expected output, but maybe you are looking for something like this
select
n.ordernum as "Order",
h.employee as "Name",
count(*) over (partition by h.employee) as OrdersAssigned
from ordermgmt n, orderheader h
where h.ordernum = n.ordernum
and h.employee_group IN ('ORDER.MGMT')
and h.employee is NOT NULL
and n.percentcomplete = '0'
and h.order_status !='CLOSED'
Using COUNT (like any other aggregate function) is simple.
If you add an aggregate function, every non-aggregated column in the SELECT list must appear in the GROUP BY clause.
So the SELECT can contain field1, field2, count(1) and so on, but you must then add GROUP BY field1, field2 (after the WHERE conditions).
Try this:
select
h.employee as "Name",
count(1) as "total"
from ordermgmt n, orderheader h
where h.ordernum = n.ordernum
and h.employee_group IN ('ORDER.MGMT')
and h.employee is NOT NULL
and n.percentcomplete = '0'
and h.order_status !='CLOSED'
GROUP BY h.employee
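To see how the two suggestions differ, here is a small sketch that runs both against made-up data. Everything about the data is hypothetical (the rows, the column types, the 'OPEN' status value), and SQLite via Python's sqlite3 module stands in for Oracle here; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows: orders 6, 7 and 8 are there to be filtered out.
conn.executescript("""
    CREATE TABLE ordermgmt (ordernum INTEGER, percentcomplete TEXT);
    CREATE TABLE orderheader (ordernum INTEGER, employee TEXT,
                              employee_group TEXT, order_status TEXT);
    INSERT INTO ordermgmt VALUES
        (1,'0'),(2,'0'),(3,'0'),(4,'0'),(5,'0'),(6,'0'),(7,'0'),(8,'50');
    INSERT INTO orderheader VALUES
        (1,'Bob','ORDER.MGMT','OPEN'),   (2,'Bob','ORDER.MGMT','OPEN'),
        (3,'Bob','ORDER.MGMT','OPEN'),   (4,'Peter','ORDER.MGMT','OPEN'),
        (5,'Peter','ORDER.MGMT','OPEN'), (6,NULL,'ORDER.MGMT','OPEN'),
        (7,'Bob','ORDER.MGMT','CLOSED'), (8,'Peter','ORDER.MGMT','OPEN');
""")

# Same filter as in the question, written with an explicit join.
source = """
    FROM ordermgmt n
    JOIN orderheader h ON h.ordernum = n.ordernum
    WHERE h.employee_group IN ('ORDER.MGMT')
      AND h.employee IS NOT NULL
      AND n.percentcomplete = '0'
      AND h.order_status != 'CLOSED'
"""

# GROUP BY collapses the orders to one row per employee.
grouped = conn.execute(
    'SELECT h.employee AS "Name", COUNT(*) AS "Orders Assigned"'
    + source + " GROUP BY h.employee ORDER BY h.employee"
).fetchall()

# The window function keeps one row per order and repeats the count.
windowed = conn.execute(
    'SELECT n.ordernum, h.employee, COUNT(*) OVER (PARTITION BY h.employee)'
    + source + " ORDER BY n.ordernum"
).fetchall()

print(grouped)
print(windowed)
```

The GROUP BY version matches the "Name / Orders Assigned" layout asked for in the question; the window version is the one to pick if you still need the individual order numbers in the output.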
I have a table like this :
### Table name: studentresult ###
Name Cls Roll Mark result Rank
Jubayer 10 1 600 Pass
Jewel 10 2 620 Pass
James 10 3 590 Pass
Jemi 10 4 590 Pass
Kalis 10 5 449 Fail
Lelin 10 6 600 Pass
I want to generate the ranks of the students automatically. The rank will depend on the mark (higher mark implies better rank). If two students (or more) have the same mark, the roll will determine the relative ranking (lower roll implies better rank). Finally, if a student has failed, he will not be considered in the ranking.
In my example, the result would be like this :
Name Cls Roll Mark result Rank
Jubayer 10 1 600 Pass 2
Jewel 10 2 620 Pass 1
James 10 3 590 Pass 4
Jemi 10 4 590 Pass 5
Kalis 10 5 449 Fail **
Lelin 10 6 600 Pass 3
SELECT
name,
Cls,
Roll,
Mark,
CASE WHEN t.result LIKE 'Pass' THEN a.pos+1
WHEN t.result LIKE 'Fail' THEN '**'
END as Rank
FROM table t
CROSS JOIN (SELECT posexplode(split(repeat(',', 70), ','))) a
ORDER BY t.Roll
This will create 70 rows if you have 70 students.
I have just modified the code from Gordon Linoff's answer. Try the query below:
SELECT sr.*, IIf(sr.result="Pass",(select count(*)
from Table1 as sr2
where sr2.result = "Pass" and
(sr2.Mark*10000-sr2.roll >= sr.Mark*10000-sr.roll and sr2.roll = sr2.roll
)
),Null) AS ranking
FROM Table1 AS sr;
You can use the generic function RowRank from my project RowNumbers.
Too much code to post here, but go to paragraph 5. Rank for the details.
Example:
A bit tricky as Mark is sorted Desc while Roll is sorted Asc:
SELECT
studentresult.Name,
studentresult.Cls,
studentresult.Roll,
studentresult.Mark,
studentresult.Pass,
IIf([Pass]='Pass',RowRank("[Mark],-[Roll]","(select * from studentresult where Pass='Pass')",[Mark],-[Roll],2),Null) AS Rank
FROM
studentresult;
In MS Access, you can use a correlated subquery:
select sr.*,
iif(sr.result = "Pass",
(select count(*)
from studentresult as sr2
where sr2.result = "Pass" and
(sr2.Mark > sr.Mark or
sr2.Mark = sr.Mark and sr2.roll <= sr.roll
)
),
NULL
) as ranking
from studentresult as sr;
Note that this uses NULL for the missing value for fails. This is much more convenient than '**'. You can use the latter but you need to convert the number to a string so the entire column is a string. NULL just works.
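Here is a quick way to check the counting logic against the sample table. This is only a sketch: it runs on SQLite through Python's sqlite3 module rather than MS Access, so a standard CASE expression stands in for IIf, and the column types are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample rows from the question; column types are assumptions.
conn.executescript("""
    CREATE TABLE studentresult (Name TEXT, Cls INTEGER, Roll INTEGER,
                                Mark INTEGER, result TEXT);
    INSERT INTO studentresult VALUES
        ('Jubayer', 10, 1, 600, 'Pass'), ('Jewel', 10, 2, 620, 'Pass'),
        ('James',   10, 3, 590, 'Pass'), ('Jemi',  10, 4, 590, 'Pass'),
        ('Kalis',   10, 5, 449, 'Fail'), ('Lelin', 10, 6, 600, 'Pass');
""")

# A student's rank is the number of passing students who are at least as
# good: a higher mark, or the same mark with a lower-or-equal roll.
rows = conn.execute("""
    SELECT sr.Name,
           CASE WHEN sr.result = 'Pass' THEN (
               SELECT COUNT(*)
               FROM studentresult AS sr2
               WHERE sr2.result = 'Pass'
                 AND (sr2.Mark > sr.Mark
                      OR (sr2.Mark = sr.Mark AND sr2.Roll <= sr.Roll))
           ) END AS ranking
    FROM studentresult AS sr
    ORDER BY sr.Roll
""").fetchall()

for row in rows:
    print(row)
```

The failed student comes back with a NULL ranking (None in Python), as described in the note above.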
I have a query that collects many different columns, and I want to include a column that sums the price of every component in an order. Right now, I already have a column that simply shows the price of every component of an order, but I am not sure how to create this new column.
I would think that the code would go something like this, but I am not really clear on what an aggregate function is or why I get an error regarding the aggregate function when I try to run this code.
SELECT ID, Location, Price, (SUM(PriceDescription) FROM table GROUP BY ID WHERE PriceDescription LIKE 'Cost.%' AS Summary)
FROM table
When I say each component, I mean that every ID I have has many different items that make up the general price. I only want to find out how much money I spend on the supplies that I need for my pressure washers, which is why I said `WHERE PriceDescription LIKE 'Cost.%'`.
To further explain, I have receipts of every customer I've worked with, and in these receipts I write down my cost for the soap that I use and the tools for the pressure washer that I rent. I label all of these with 'Cost.' so it looks like (Cost.Water), (Cost.Soap), (Cost.Gas), (Cost.Tools). I would like a column that, for Order 1, sums all the Cost._ prices for that order, and for Order 2 sums all the Cost._ prices for that order. I should also mention that each Order does not have the same number of Costs (sometimes when I use my power washer I might not have to buy gas and occasionally soap).
I hope this makes sense, if not please let me know how I can explain further.
`ID Location Price PriceDescription
1 Park 10 Cost.Water
1 Park 8 Cost.Gas
1 Park 11 Cost.Soap
2 Tom 20 Cost.Water
2 Tom 6 Cost.Soap
3 Matt 15 Cost.Tools
3 Matt 15 Cost.Gas
3 Matt 21 Cost.Tools
4 College 32 Cost.Gas
4 College 22 Cost.Water
4 College 11 Cost.Tools`
I would like for my query to create a column like such
`ID Location Price Summary
1 Park 10 29
1 Park 8
1 Park 11
2 Tom 20 26
2 Tom 6
3 Matt 15 51
3 Matt 15
3 Matt 21
4 College 32 65
4 College 22
4 College 11 `
But if the 'Summary' was printed on every line instead of just at the top one, that would be okay too.
You just need SUM(Price) OVER (PARTITION BY Location), which gives the total for each location on every row, as below:
SELECT ID, Location, Price, SUM(Price) over(Partition by Location) AS Summed_Price
FROM yourtable
WHERE PriceDescription LIKE 'Cost.%'
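A small sketch of the window-function approach on the rows from the question, run on SQLite through Python's sqlite3 module as a stand-in engine. Two assumptions on my part: the column types, and partitioning by ID instead of Location, since the question asks for a total per order (in the sample data the two give the same result).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Rows copied from the question; column types are assumptions.
conn.executescript("""
    CREATE TABLE yourtable (ID INTEGER, Location TEXT, Price INTEGER,
                            PriceDescription TEXT);
    INSERT INTO yourtable VALUES
        (1,'Park',10,'Cost.Water'),    (1,'Park',8,'Cost.Gas'),
        (1,'Park',11,'Cost.Soap'),     (2,'Tom',20,'Cost.Water'),
        (2,'Tom',6,'Cost.Soap'),       (3,'Matt',15,'Cost.Tools'),
        (3,'Matt',15,'Cost.Gas'),      (3,'Matt',21,'Cost.Tools'),
        (4,'College',32,'Cost.Gas'),   (4,'College',22,'Cost.Water'),
        (4,'College',11,'Cost.Tools');
""")

# Every row keeps its own Price and also gets the total for its order.
rows = conn.execute("""
    SELECT ID, Location, Price,
           SUM(Price) OVER (PARTITION BY ID) AS Summary
    FROM yourtable
    WHERE PriceDescription LIKE 'Cost.%'
    ORDER BY ID
""").fetchall()

for row in rows:
    print(row)
```

Because the WHERE clause is applied before the window function, only the 'Cost.' rows are both listed and summed.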
First, if your Price column really contains values that match 'Cost.%', then you can not apply SUM() over it. SUM() expects a number (e.g. INT, FLOAT, REAL or DECIMAL). If it is text then you need to explicitly convert it to a number by adding a CAST or CONVERT clause inside the SUM() call.
Second, your query syntax is wrong: you need GROUP BY, and the SELECT fields are not specified correctly. And you want to SUM() the Price field, not the PriceDescription field (which you can't even sum as I explained)
Assuming that Price is numeric (see my first remark), then this is how it can be done:
SELECT ID
, Location
, Price
, (SELECT SUM(Price)
FROM table
WHERE ID = T1.ID AND Location = T1.Location
) AS Summed_Price
FROM table AS T1
To get the exact result posted in the question:
Select
T.ID,
T.Location,
T.Price,
CASE WHEN (R) = 1 then RN ELSE NULL END Summary
from (
select
ID,
Location,
Price ,
SUM(Price)OVER(PARTITION BY Location)RN,
ROW_number()OVER(PARTITION BY Location ORDER BY ID )R
from Table
)T
order by T.ID
This is what my query results currently look like. How can I get the MAX() value for each unique id?
I.e.,
for 5267139 it is 8,
for 5267145 it is 4.
5267136 5
5267137 8
5267137 2
5267139 8
5267139 5
5267139 3
5267141 4
5267141 3
5267145 4
5267145 3
5267146 1
5267147 2
5267152 3
5267153 3
5267155 8
SELECT DISTINCT st.ScoreID, st.ScoreTrackingTypeID
FROM ScoreTrackingType stt
LEFT JOIN ScoreTracking st
ON stt.ScoreTrackingTypeID = st.ScoreTrackingTypeID
ORDER BY st.ScoreID, st.ScoreTrackingTypeID DESC
GROUP BY will partition your table into separate blocks based on the column(s) you specify. You can then apply an aggregate function (MAX in this case) against each of the blocks -- this behavior applies by default with the below syntax:
SELECT First_column, MAX(Second_column) AS Max_second_column
FROM Table
GROUP BY First_column
EDIT: Based on the query above, it looks like you don't really need the ScoreTrackingType table at all, but leaving it in place, you could use:
SELECT st.ScoreID, MAX(st.ScoreTrackingTypeID) AS ScoreTrackingTypeID
FROM ScoreTrackingType stt
LEFT JOIN ScoreTracking st ON stt.ScoreTrackingTypeID = st.ScoreTrackingTypeID
GROUP BY st.ScoreID
ORDER BY st.ScoreID
The GROUP BY will obviate the need for DISTINCT, MAX will give you the value you are looking for, and the ORDER BY will still apply, but since there will only be a single ScoreTrackingTypeID value for each ScoreID you can pull it out of the ordering.
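If you want to verify this against the rows shown in the question, here is a minimal sketch using Python's sqlite3 module with an in-memory database (a stand-in for your actual server). It queries ScoreTracking alone, following the remark that the ScoreTrackingType table is not really needed; the column types are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ScoreTracking (ScoreID INTEGER, ScoreTrackingTypeID INTEGER)"
)
# The (ScoreID, ScoreTrackingTypeID) pairs listed in the question.
pairs = [
    (5267136, 5), (5267137, 8), (5267137, 2), (5267139, 8), (5267139, 5),
    (5267139, 3), (5267141, 4), (5267141, 3), (5267145, 4), (5267145, 3),
    (5267146, 1), (5267147, 2), (5267152, 3), (5267153, 3), (5267155, 8),
]
conn.executemany("INSERT INTO ScoreTracking VALUES (?, ?)", pairs)

# One output row per ScoreID, carrying the largest type id in that group.
rows = conn.execute("""
    SELECT st.ScoreID, MAX(st.ScoreTrackingTypeID) AS ScoreTrackingTypeID
    FROM ScoreTracking st
    GROUP BY st.ScoreID
    ORDER BY st.ScoreID
""").fetchall()

for row in rows:
    print(row)
```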
I have user data:
user store item cost
1 10 100 5
1 10 101 3
1 11 102 7
2 10 101 3
2 12 103 4
2 12 104 5
I want a table which will tell me for each user how much he bought from each store and how much he bought in total:
user store cost_this_store cost_total
1 10 8 15
1 11 7 15
2 10 3 12
2 12 9 12
I can do this with two group by and a join:
select s.user, s.store, s.cost_this_store, u.cost_total
from (select user, store, sum(cost) as cost_this_store
from my_data
group by user, store) s
join (select user, sum(cost) as cost_total
from my_data
group by user) u
on s.user = u.user
However, this is definitely not how I would do this if I were writing this in any other language (join is clearly avoidable, and the two group by are not independent).
Is it possible to avoid the join in sql?
PS. I need the solution to work in hive.
You can do this with a windowing function... which Hive added support for last year:
select distinct
user,
store,
sum(cost) over (partition by user, store) as cost_this_store,
sum(cost) over (partition by user) as cost_total
from my_data
However, I'd argue that there wasn't anything glaringly wrong with your original implementation. You've essentially got two different sets of data, which you're combining through a JOIN.
The duplication might look like a code smell in a different language, but this isn't necessarily the wrong approach in SQL, and often you'll have to take approaches such as this that duplicate a portion of a query between two intermediate result sets for performance reasons.
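To compare the two formulations side by side, here is a sketch that runs both on the sample data and checks that they agree. It uses SQLite through Python's sqlite3 module instead of Hive, so treat it as an illustration of the SQL semantics only; the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_data (user INTEGER, store INTEGER, item INTEGER, cost INTEGER)"
)
# Sample rows from the question.
conn.executemany(
    "INSERT INTO my_data VALUES (?, ?, ?, ?)",
    [(1, 10, 100, 5), (1, 10, 101, 3), (1, 11, 102, 7),
     (2, 10, 101, 3), (2, 12, 103, 4), (2, 12, 104, 5)],
)

# Window version: two partitions over one scan, DISTINCT removes repeats.
windowed = conn.execute("""
    SELECT DISTINCT user, store,
           SUM(cost) OVER (PARTITION BY user, store) AS cost_this_store,
           SUM(cost) OVER (PARTITION BY user)        AS cost_total
    FROM my_data
    ORDER BY user, store
""").fetchall()

# Original version: two GROUP BY subqueries joined on user.
joined = conn.execute("""
    SELECT s.user, s.store, s.cost_this_store, u.cost_total
    FROM (SELECT user, store, SUM(cost) AS cost_this_store
          FROM my_data GROUP BY user, store) s
    JOIN (SELECT user, SUM(cost) AS cost_total
          FROM my_data GROUP BY user) u
      ON s.user = u.user
    ORDER BY s.user, s.store
""").fetchall()

print(windowed)
print(windowed == joined)
```

Which of the two performs better depends on the engine and the data, so it is worth checking the query plan on your own cluster before settling on one.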