average query ORA-00936 error - sql

SQL> SELECT consignmentNo, VoyageNo, Weight
2 (SELECT (AVG(WEIGHT) FROM consignment), AS AVERAGE,
3 WHERE Weight = 650,
4 FROM consignment;
(SELECT (AVG(WEIGHT) FROM consignment), AS AVERAGE,
*
ERROR at line 2:
ORA-00936: missing expression
I want the average weight for a particular ship, along with the consignments for that ship, but I am unable to identify the error.

Are you simply looking for group by?
SELECT VoyageNo, AVG(Weight)
FROM consignment
GROUP BY VoyageNo;
If you want the average along with the detailed information, you want a window function:
SELECT c.*, AVG(Weight) OVER (PARTITION BY VoyageNo)
FROM consignment c;
This assumes that VoyageNo is what you mean by ship.
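To compare the two queries above, here is a minimal sketch that runs both against an in-memory SQLite database from Python. The sample rows are invented for illustration, and SQLite (3.25 or later, needed for window functions) stands in for Oracle here, so treat it as a demonstration of the query shape rather than of Oracle itself.

```python
import sqlite3

# In-memory database with hypothetical sample data (two voyages).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE consignment (consignmentNo INTEGER, VoyageNo INTEGER, Weight REAL)"
)
conn.executemany(
    "INSERT INTO consignment VALUES (?, ?, ?)",
    [(1, 10, 600.0), (2, 10, 700.0), (3, 20, 500.0)],
)

# GROUP BY collapses each voyage to a single row.
grouped = conn.execute(
    "SELECT VoyageNo, AVG(Weight) FROM consignment "
    "GROUP BY VoyageNo ORDER BY VoyageNo"
).fetchall()

# The window function keeps every consignment and repeats the voyage average.
windowed = conn.execute(
    """
    SELECT consignmentNo, VoyageNo, Weight,
           AVG(Weight) OVER (PARTITION BY VoyageNo) AS avg_weight
    FROM consignment
    ORDER BY consignmentNo
    """
).fetchall()

print(grouped)
print(windowed)
```

The grouped result has one row per voyage, while the windowed result has one row per consignment with the voyage average alongside it.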

You seem to want:
SELECT consignmentNo, VoyageNo, Weight, avg.AVERAGE
FROM consignment CROSS JOIN
(SELECT AVG(WEIGHT) AS AVERAGE FROM consignment) avg
WHERE Weight = 650;

You have an extra comma in your query (before AS AVERAGE), you are missing a comma after Weight, and the subquery has an unbalanced opening parenthesis. Also, FROM and WHERE are in the wrong order, and neither clause should end with a comma. Try this:
SELECT consignmentNo, VoyageNo, Weight,
(SELECT AVG(WEIGHT) FROM consignment) AS AVERAGE
FROM consignment
WHERE Weight = 650;
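The clause order and comma placement described above can be checked with a small sketch. It runs a scalar-subquery version of the query against an in-memory SQLite database from Python. The rows are made up, and SQLite is only a stand-in for Oracle, so this shows the shape of a valid statement rather than Oracle's exact behaviour.

```python
import sqlite3

# Hypothetical data: three consignments, two of them weighing 650.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE consignment (consignmentNo INTEGER, VoyageNo INTEGER, Weight REAL)"
)
conn.executemany(
    "INSERT INTO consignment VALUES (?, ?, ?)",
    [(1, 10, 500.0), (2, 10, 650.0), (3, 20, 650.0)],
)

# Select list first (comma-separated, no trailing comma), then FROM, then WHERE.
# The scalar subquery computes the average over the whole table, not only
# over the rows that pass the WHERE filter.
rows = conn.execute(
    """
    SELECT consignmentNo, VoyageNo, Weight,
           (SELECT AVG(Weight) FROM consignment) AS AVERAGE
    FROM consignment
    WHERE Weight = 650
    ORDER BY consignmentNo
    """
).fetchall()

print(rows)
```

Note that the average here is 600, not 650, because the subquery ignores the outer WHERE clause.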

Related

Group by after a partition by in MS SQL Server

I am working on some car accident data and am stuck on how to get the data in the form I want.
select
sex_of_driver,
accident_severity,
count(accident_severity) over (partition by sex_of_driver, accident_severity)
from
SQL.dbo.accident as accident
inner join SQL.dbo.vehicle as vehicle on
accident.accident_index = vehicle.accident_index
This is my code, which counts the accidents for each sex and each severity. I know I can do this with group by, but I wanted to use partition by in order to work out percentages too.
However, I get a very large table (I assume one row per source row for each sex/severity). When I do the following:
select
sex_of_driver,
accident_severity,
count(accident_severity) over (partition by sex_of_driver, accident_severity)
from
SQL.dbo.accident as accident
inner join SQL.dbo.vehicle as vehicle on
accident.accident_index = vehicle.accident_index
group by
sex_of_driver,
accident_severity
I get this:
sex_of_driver accident_severity (No column name)
------------- ----------------- ----------------
1             1                 1
1             2                 1
-1            2                 1
-1            1                 1
1             3                 1
I won't give you the whole table, but basically, the group by has caused the count to just be 1.
I can't figure out why group by isn't working. Is this an MS SQL-Server thing?
I want to get the same result as below (obv without the CASE etc)
select
accident.accident_severity,
count(accident.accident_severity) as num_accidents,
vehicle.sex_of_driver,
CASE vehicle.sex_of_driver WHEN '1' THEN 'Male' WHEN '2' THEN 'Female' end as sex_col,
CASE accident.accident_severity WHEN '1' THEN 'Fatal' WHEN '2' THEN 'Serious' WHEN '3' THEN 'Slight' end as serious_col
from
SQL.dbo.accident as accident
inner join SQL.dbo.vehicle as vehicle on
accident.accident_index = vehicle.accident_index
where
sex_of_driver != 3
and
sex_of_driver != -1
group by
accident.accident_severity,
vehicle.sex_of_driver
order by
accident.accident_severity
You seem to have a misunderstanding here.
GROUP BY will reduce your rows to a single row per grouping (i.e. per pair of sex_of_driver, accident_severity values). Any normal aggregates you use with this, such as COUNT(*), will return the aggregate value within that group.
OVER, by contrast, gives you a windowed aggregate, which is calculated after GROUP BY has reduced your rows. Therefore when you write count(accident_severity) over (partition by sex_of_driver, accident_severity) the aggregate only receives a single row in each partition, because the rows have already been reduced.
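A small sketch of that effect, using Python's sqlite3 module as a stand-in for SQL Server (SQLite 3.25 or later is assumed for window functions). The single table crash and its rows are hypothetical and replace the accident/vehicle join to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, already-joined data: one row per vehicle involved in an accident.
conn.execute("CREATE TABLE crash (sex_of_driver INTEGER, accident_severity INTEGER)")
conn.executemany(
    "INSERT INTO crash VALUES (?, ?)",
    [(1, 1), (1, 3), (1, 3), (1, 3), (2, 3), (2, 3), (2, 3), (2, 3)],
)

# The windowed COUNT runs after GROUP BY, so each partition holds exactly one
# (already grouped) row. The plain COUNT(*) still sees the rows of the group.
rows = conn.execute(
    """
    SELECT sex_of_driver,
           accident_severity,
           COUNT(accident_severity)
               OVER (PARTITION BY sex_of_driver, accident_severity) AS windowed_count,
           COUNT(*) AS group_count
    FROM crash
    GROUP BY sex_of_driver, accident_severity
    ORDER BY sex_of_driver, accident_severity
    """
).fetchall()

print(rows)
```

The windowed count is 1 on every row, while the ordinary aggregate returns the real group sizes.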
You say "I know I can do this with group by but I wanted to use a partition by in order to work out % too." but you are misunderstanding how to do that. You don't need PARTITION BY to work out percentage. All you need to calculate a percentage over the whole resultset is COUNT(*) * 1.0 / SUM(COUNT(*)) OVER (), in other words a windowed aggregate over a normal aggregate.
Note also that count(accident_severity) does not give you the number of distinct accident_severity values; it gives you the number of non-null values, which is probably not what you intend. You also have a very strange join predicate; you probably want something like a.vehicle_id = v.vehicle_id.
So you want something like this:
select
sex_of_driver,
accident_severity,
count(*) as Count,
count(*) * 1.0 /
sum(count(*)) over (partition by sex_of_driver) as PercentOfSex,
count(*) * 1.0 /
sum(count(*)) over () as PercentOfTotal
from
dbo.accident as a
inner join dbo.vehicle as v on
a.vehicle_id = v.vehicle_id
group by
sex_of_driver,
accident_severity;
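The "windowed aggregate over a normal aggregate" pattern can be tried out with the sketch below. It uses Python's sqlite3 module in place of SQL Server (SQLite 3.25 or later assumed), and a hypothetical pre-joined table crash with invented rows, so the numbers are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: 4 rows for sex 1 (one fatal, three slight), 4 rows for sex 2.
conn.execute("CREATE TABLE crash (sex_of_driver INTEGER, accident_severity INTEGER)")
conn.executemany(
    "INSERT INTO crash VALUES (?, ?)",
    [(1, 1), (1, 3), (1, 3), (1, 3), (2, 3), (2, 3), (2, 3), (2, 3)],
)

# COUNT(*) is the per-group aggregate. SUM(COUNT(*)) OVER (...) then adds up
# those group counts across the partition (or the whole result for OVER ()).
rows = conn.execute(
    """
    SELECT sex_of_driver,
           accident_severity,
           COUNT(*) AS cnt,
           COUNT(*) * 1.0 / SUM(COUNT(*)) OVER (PARTITION BY sex_of_driver) AS pct_of_sex,
           COUNT(*) * 1.0 / SUM(COUNT(*)) OVER () AS pct_of_total
    FROM crash
    GROUP BY sex_of_driver, accident_severity
    ORDER BY sex_of_driver, accident_severity
    """
).fetchall()

for row in rows:
    print(row)
```

Multiplying by 1.0 forces floating-point division; without it, integer division would truncate the ratios to 0.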

Oracle SQL query, getting a maximum of a sum

Hey, guys. I'm struggling to solve one query and just can't get my head around it.
Basically, I've got some tables from a data mart:
DimTheatre(TheatreId(PK), TheatreNo, Name, Address, MainTel);
DimTrow(TrowId(PK), TrowNo, RowName, RowType);
DimProduction(ProductionId(PK), ProductionNo, Title, ProductionDir, PlayAuthor);
DimTime(TimeId(PK), Year, Month, Day, Hour);
TicketPurchaseFact( TheatreId(FK), TimeId(FK), TrowId(FK),
PId(FK), TicketAmount);
What I'm trying to achieve in Oracle: I need to retrieve the most popular row type in each theatre by value of ticket sales.
What I'm doing now is:
SELECT dthr.theatreid, dthr.name, max(tr.rowtype) keep(dense_rank last order
by tpf.ticketamount), sum(tpf.ticketamount) TotalSale
FROM TicketPurchaseFact tpf, DimTheatre dthr, DimTrow tr
WHERE dthr.theatreid = tpf.theatreid
GROUP BY dthr.theatreid, dthr.name;
It does give me output, but the 'TotalSale' column is way off; it gives much higher numbers than it should. How could I approach this issue?
I am not sure how MAX() KEEP () would help your case if I understand the problem correctly. But the below approach should work:
SELECT x.theatreid, x.name, x.rowtype, x.total_sale
FROM
(SELECT z.theatreid, z.name, z.rowtype, z.total_sale, DENSE_RANK() OVER (PARTITION BY z.theatreid, z.name ORDER BY z.total_sale DESC) as popular_row_rank
FROM
(SELECT dthr.theatreid, dthr.name, tr.rowtype, SUM(tpf.ticketamount) as total_sale
FROM TicketPurchaseFact tpf, DimTheatre dthr, DimTrow tr
WHERE dthr.theatreid = tpf.theatreid AND tr.trowid = tpf.trowid
GROUP BY dthr.theatreid, dthr.name, tr.rowtype) z
) x
WHERE x.popular_row_rank = 1;
You want the row type per theatre with the highest ticket amount. So join purchases and rows, then aggregate to get the total per row type. Use RANK to rank the row types per theatre and keep only the best-ranked ones. Finally, join with the theatre table to get the theatre name.
select
theatreid,
t.name,
tr.rowtype
from
(
select
p.theatreid,
r.rowtype,
rank() over (partition by p.theatreid order by sum(p.ticketamount) desc) as rn
from ticketpurchasefact p
join dimtrow r using (trowid)
group by p.theatreid, r.rowtype
) tr
join dimtheatre t using (theatreid)
where tr.rn = 1;
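Here is a minimal runnable sketch of the aggregate-then-rank approach, using Python's sqlite3 module in place of Oracle (SQLite 3.25 or later assumed). The theatre names, row types and amounts are invented, only the columns needed for the query are created, and the joins use ON rather than USING purely to keep the column references explicit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dimtheatre (theatreid INTEGER, name TEXT);
    CREATE TABLE dimtrow (trowid INTEGER, rowtype TEXT);
    CREATE TABLE ticketpurchasefact (theatreid INTEGER, trowid INTEGER, ticketamount INTEGER);

    -- Hypothetical sample data.
    INSERT INTO dimtheatre VALUES (1, 'Globe'), (2, 'Lyric');
    INSERT INTO dimtrow VALUES (1, 'Stalls'), (2, 'Circle');
    INSERT INTO ticketpurchasefact VALUES
        (1, 1, 100), (1, 1, 50), (1, 2, 120),
        (2, 1, 30),  (2, 2, 80);
    """
)

# Inner query: total per (theatre, row type), ranked within each theatre.
# Outer query: keep rank 1 and attach the theatre name.
rows = conn.execute(
    """
    SELECT t.theatreid, t.name, tr.rowtype, tr.total_sale
    FROM (
        SELECT p.theatreid,
               r.rowtype,
               SUM(p.ticketamount) AS total_sale,
               RANK() OVER (PARTITION BY p.theatreid
                            ORDER BY SUM(p.ticketamount) DESC) AS rn
        FROM ticketpurchasefact p
        JOIN dimtrow r ON r.trowid = p.trowid
        GROUP BY p.theatreid, r.rowtype
    ) tr
    JOIN dimtheatre t ON t.theatreid = tr.theatreid
    WHERE tr.rn = 1
    ORDER BY t.theatreid
    """
).fetchall()

print(rows)
```

Because RANK assigns the same rank to ties, a theatre with two equally popular row types would return both.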

Weighted Average in BigQuery

I am using imported GA data to calculate the average product position on a page. I currently do this by averaging the item position by SKU. Is there a way to calculate this as a weighted average within my query? A product could display 10 times in position 1 and once in position 10, and I wouldn't want the average to be 5.
Here is my query so far:
SELECT hits.product.productSKU AS SKU, AVG(hits.product.productListPosition) AS Average_Position
FROM (TABLE_DATE_RANGE([***.ga_sessions_], TIMESTAMP('2016-04-24'), TIMESTAMP('2016-04-30')))
GROUP BY SKU
ORDER BY Average_Position ASC
I tested this query and it worked here:
SELECT
sku,
nom / den avg_position from(
SELECT
sku,
SUM(position * freq) nom,
SUM(freq) den from(
SELECT
prods.productsku sku,
prods.productlistposition position,
COUNT(prods.productlistposition) freq
FROM
`project_id.dataset_id.ga_sessions_*`,
UNNEST(hits) AS hits,
UNNEST(hits.product) prods
WHERE
1 = 1
AND PARSE_TIMESTAMP('%Y%m%d', REGEXP_EXTRACT(_table_suffix, r'.*_(.*)')) BETWEEN TIMESTAMP('2016-04-24') AND TIMESTAMP('2016-04-30')
AND prods.productlistposition > 0
GROUP BY
sku,
position )
GROUP BY
sku )
Note that I used BigQuery Standard SQL, as it is highly recommended.
If you must use Legacy SQL, adapting this query should be easy (provided you don't need the FLATTEN operation).
You said you want to consider the positions on a given page. This can be done as well by adding the following condition to the first WHERE clause:
and hits.page.pagepath = 'your page url'
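The arithmetic behind the nested query (count each position's frequency, then divide the sum of position times frequency by the sum of frequencies) can be sketched in plain Python. The SKU name and impression data are hypothetical and mirror the example from the question.

```python
from collections import Counter

# Hypothetical impressions for one SKU: position 1 shown ten times,
# position 10 shown once.
impressions = [("SKU-A", 1)] * 10 + [("SKU-A", 10)]

# Inner step: frequency of each (sku, position) pair.
freq = Counter(impressions)

# Outer step: per SKU, nom = sum(position * freq), den = sum(freq).
nom = Counter()
den = Counter()
for (sku, position), n in freq.items():
    nom[sku] += position * n
    den[sku] += n

avg_position = {sku: nom[sku] / den[sku] for sku in nom}

# For contrast: the unweighted mean of the distinct positions.
distinct_positions = [position for (sku, position) in freq if sku == "SKU-A"]
naive = sum(distinct_positions) / len(distinct_positions)

print(avg_position["SKU-A"], naive)
```

The weighted result is 20/11 (about 1.82), compared with 5.5 when each distinct position counts once.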

How to get a percentile rank based on a computation

there are four tables as :
T_SALES has columns like
CUST_KEY,
ITEM_KEY,
SALE_DATE,
SALES_DLR_SALES_QTY,
ORDER_QTY.
T_CUST has columns like
CUST_KEY,
CUST_NUM,
PEER_GRP_ID
T_PEER_GRP has columns like
PEER_GRP_ID,
PEER_GRP_DESC,
PRNT_PEER_GRP_ID
T_PRNT_PEEER has columns like
PRNT_PEER_GRP_ID,
PRNT_PEER_DESC
For the above tables, I need to generate a percentile rank of the customer based on the computation fillrate = SALES_QTY / ORDER_QTY * 100, by peer group within a parent peer.
Could someone please help with this?
You can use the analytic function PERCENT_RANK() to calculate the percentile rank, as below:
SELECT
t_s.cust_key,
t_c.cust_num,
PERCENT_RANK() OVER (ORDER BY (t_s.SALES_DLR_SALES_QTY / ORDER_QTY) DESC) as pr
FROM t_sales t_s
INNER JOIN t_cust t_c ON t_s.cust_key = t_c.cust_key
ORDER BY pr;
Reference:
PERCENT_RANK on Oracle® Database SQL Reference
If by "percentile rank" you mean "percent rank" (documented here), then the harder part is the joins. I think this is the basic data that you want for the percentile rank:
select t.PEER_GRP_ID, t.PRNT_PEER_GRP_ID,
sum(s.SALES_DLR_SALES_QTY) / sum(s.ORDER_QTY) * 100 as fillrate
from t_sales s join
t_cust c
on s.CUST_KEY = c.cust_key join
t_peer_grp t
on t.PEER_GRP_ID = c.PEER_GRP_ID
group by t.PEER_GRP_ID, t.PRNT_PEER_GRP_ID;
You can then calculate the percent rank (0 to 1; multiply by 100 for a 0 to 100 scale) as:
select t.PEER_GRP_ID, t.PRNT_PEER_GRP_ID,
sum(s.SALES_DLR_SALES_QTY) / sum(s.ORDER_QTY) * 100 as fillrate,
percent_rank() over (partition by t.PRNT_PEER_GRP_ID
order by sum(s.SALES_DLR_SALES_QTY) / sum(s.ORDER_QTY)
) as pr
from t_sales s join
t_cust c
on s.CUST_KEY = c.cust_key join
t_peer_grp t
on t.PEER_GRP_ID = c.PEER_GRP_ID
group by t.PEER_GRP_ID, t.PRNT_PEER_GRP_ID;
Note that this mixes analytic functions with aggregation functions. This can look awkward when you first learn about it.
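To make the mix of analytic and aggregate functions concrete, here is a minimal sketch using Python's sqlite3 module as a stand-in for Oracle (SQLite 3.25 or later assumed). It uses the fill-rate ratio from the question, computed per peer group as total sales quantity over total order quantity, which is one possible reading of the requirement. All rows are invented, and only the columns the query needs are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t_sales (cust_key INTEGER, sales_dlr_sales_qty REAL, order_qty REAL);
    CREATE TABLE t_cust (cust_key INTEGER, peer_grp_id INTEGER);
    CREATE TABLE t_peer_grp (peer_grp_id INTEGER, prnt_peer_grp_id INTEGER);

    -- Hypothetical data: peer groups 1-3 under parent 1, peer group 4 under parent 2.
    INSERT INTO t_peer_grp VALUES (1, 1), (2, 1), (3, 1), (4, 2);
    INSERT INTO t_cust VALUES (1, 1), (2, 2), (3, 3), (4, 4);
    INSERT INTO t_sales VALUES (1, 5, 10), (2, 3, 4), (3, 8, 8), (4, 1, 2);
    """
)

# The aggregate (SUM ... GROUP BY) is evaluated first; PERCENT_RANK then ranks
# the grouped rows within each parent peer group.
rows = conn.execute(
    """
    SELECT t.peer_grp_id,
           t.prnt_peer_grp_id,
           SUM(s.sales_dlr_sales_qty) / SUM(s.order_qty) * 100 AS fill_rate,
           PERCENT_RANK() OVER (
               PARTITION BY t.prnt_peer_grp_id
               ORDER BY SUM(s.sales_dlr_sales_qty) / SUM(s.order_qty)
           ) AS pr
    FROM t_sales s
    JOIN t_cust c ON s.cust_key = c.cust_key
    JOIN t_peer_grp t ON t.peer_grp_id = c.peer_grp_id
    GROUP BY t.peer_grp_id, t.prnt_peer_grp_id
    ORDER BY t.prnt_peer_grp_id, t.peer_grp_id
    """
).fetchall()

for row in rows:
    print(row)
```

PERCENT_RANK is (rank - 1) / (rows in partition - 1), so a partition with a single row always gets 0.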

SQL query count divided by a distinct count of same query

Having some trouble with some SQL.
Take the following result for instance:
LOC_CODE CHANNEL
------------ --------------------
3ATEST-01 CHAN2
3ATEST-01 CHAN3
3ATEST-02 CHAN4
What I need to do is get a count of the above query, grouped by channel, but I want that count to be divided by the number of times the "LOC_CODE" appears.
Example of the result I am after is:
CHANNEL COUNT
---------------- ----------
CHAN2 0.5
CHAN3 0.5
CHAN4 1
To explain the above: CHAN2 appears next to "3ATEST-01", but the LOC_CODE "3ATEST-01" appears twice, so the count should be divided by 2.
I know I can do this by basically duplicating the query with a distinct count, but the underlying query is quite complex and don't really want to harm performance.
Please let me know if you would like more information!
Try:
select channel,
count(*) over (partition by channel, loc_code)
/ count(*) over (partition by loc_code) as count_ratio
from my_table
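A quick way to check the window-function version against the sample data from the question is the sketch below. It uses Python's sqlite3 module (SQLite 3.25 or later assumed) rather than the asker's database, and multiplies by 1.0 because SQLite would otherwise perform integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (loc_code TEXT, channel TEXT)")
# The three rows shown in the question.
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("3ATEST-01", "CHAN2"), ("3ATEST-01", "CHAN3"), ("3ATEST-02", "CHAN4")],
)

# Numerator: rows for this channel at this location.
# Denominator: all rows at this location.
rows = conn.execute(
    """
    SELECT channel,
           COUNT(*) OVER (PARTITION BY channel, loc_code) * 1.0
             / COUNT(*) OVER (PARTITION BY loc_code) AS count_ratio
    FROM my_table
    ORDER BY channel
    """
).fetchall()

print(rows)
```

Window functions do not collapse rows, so if a channel can appear more than once per location the result would contain repeated rows and need a DISTINCT or a further GROUP BY.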
SELECT t.CHANNEL, COUNT(*) / gr.TotalCount
FROM my_table t JOIN (
SELECT LOC_CODE, COUNT(*) TotalCount
FROM my_table
GROUP BY LOC_CODE
) gr USING(LOC_CODE)
GROUP BY t.LOC_CODE, t.CHANNEL
Create an index on (LOC_CODE, CHANNEL).
If there are no duplicate channels, replace COUNT(*) / gr.TotalCount with 1 / gr.TotalCount and remove the GROUP BY clause.
First, find a query that gets you the correct results. Then, see if it can be optimised. My guess is that it's hard to optimise as you require two different groupings, one per Channel and one per Loc_Code.
I'm not even sure that this fits your description:
SELECT t.CHANNEL
, COUNT(*) / SUM(grp.TotalCount)
FROM my_table t
JOIN
( SELECT LOC_CODE
, COUNT(*) TotalCount --- or is it perhaps?:
--- COUNT(DISTINCT CHANNEL)
FROM my_table
GROUP BY LOC_CODE
) grp
ON grp.LOC_CODE = t.LOC_CODE
GROUP BY t.CHANNEL
Your requirements are still a bit unclear to me when it comes to duplicate CHANNELs, but this should work if you want grouping on both CHANNEL and LOC_CODE to sum up later;
SELECT L1.CHANNEL, 1.0/COUNT(L2.LOC_CODE)
FROM Locations L1
LEFT JOIN Locations L2 ON L1.LOC_CODE = L2.LOC_CODE
GROUP BY L1.CHANNEL, L1.LOC_CODE
Demo here.