In Oracle SQL, how do you query the proportion of records of a certain value?

Say, you have a query like
SELECT COUNT(*), date FROM ORDERS GROUP BY date ORDER BY date
but you also want a third "phantom/dummy field" that tells you the fraction of orders each day that are of a particular type (let's say "Utensils" and "Perishables").
I should say that there is an additional column in the ORDERS table that has the type of the order:
order_type
The third dummy column should take the count of orders on a date that have the "Utensils" or the "Perishables" type (not XOR), divide it by the total count of orders for that day, round to 2 decimal places, and append a percentage sign.
The last few formatting things aren't really important; all I really need to know is how to apply the logic in valid PL/SQL syntax.
Example output
4030 2012-02-02 34.43%
4953 2012-02-03 16.66%

You can do something like
SELECT COUNT(*),
dt,
round( SUM( CASE WHEN order_type = 'Utensils'
THEN 1
ELSE 0
END) * 100 / COUNT(*),2) fraction_of_utensils_orders
FROM ORDERS
GROUP BY dt
ORDER BY dt
If you find it easier to follow, you could also
SELECT COUNT(*),
dt,
round( COUNT( CASE WHEN order_type = 'Utensils'
THEN 1
ELSE NULL
END) * 100/ COUNT(*), 2) fraction_of_utensils_orders
FROM ORDERS
GROUP BY dt
ORDER BY dt
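The question asks for orders that are "Utensils" or "Perishables", so here is a minimal sketch of the same conditional aggregation with both types in the CASE. It runs on SQLite through Python's sqlite3 module rather than on Oracle, the sample rows are invented, and the percent sign is appended on the client side; in Oracle the || operator could append it inside the query instead.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (dt TEXT, order_type TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("2012-02-02", "Utensils"),
        ("2012-02-02", "Perishables"),
        ("2012-02-02", "Furniture"),
        ("2012-02-02", "Furniture"),
        ("2012-02-03", "Utensils"),
        ("2012-02-03", "Furniture"),
        ("2012-02-03", "Furniture"),
    ],
)

rows = conn.execute(
    """
    SELECT COUNT(*) AS total,
           dt,
           ROUND(SUM(CASE WHEN order_type IN ('Utensils', 'Perishables')
                          THEN 1 ELSE 0 END) * 100.0 / COUNT(*), 2) AS pct
    FROM orders
    GROUP BY dt
    ORDER BY dt
    """
).fetchall()

# Format the rounded number as text and append the percent sign.
report = [(total, dt, f"{pct:.2f}%") for total, dt, pct in rows]
for line in report:
    print(line)
```

Multiplying by 100.0 rather than 100 matters in SQLite, where integer division would otherwise truncate the result.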

To add the count of orders of the same type to the query:
select
o.*,
(
select count(o2.order_type)
from ORDERS o2
where o2.order_type = o.order_type
) as NumberOfOrdersOfThisType
from ORDERS o
To add the fraction of orders of the same type to the query:
(A scalar subquery for the total keeps this in plain SQL, so no PL/SQL variable is needed.)
select
o.*,
(
select count(o2.order_type)
from ORDERS o2
where o2.order_type = o.order_type
) / (select count(*) from ORDERS) as FractionOfOrdersOfThisType
from ORDERS o

Related

How to return first record and totalizers

How can I return first purchase date, first salesman and first store by customer along with his total expenses?
select
bi.biifclie as customer,
aux.salesman as first_salesman,
aux.store as first_store,
aux.date_ as first_date,
CAST(SUM(bi.biifpliq) as float64) as total_bought,
CAST((SUM(bi.biifptab)-SUM(bi.biifpliq)) as float64) as discount,
CAST(SUM(bi.biifpliq)-SUM(bi.biifcrep)-SUM(bi.biifvari + bi.biifcomb + bi.biifcomc + bi.biificmc)-SUM(bi.biiffixo) as float64) as rentability,
COUNT(DISTINCT bi.biifcodi) AS orders,
MAX(bi.biifdata) AS last_purchase_date,
MIN(bi.biifdata) AS first_purchase_date,
DATE_DIFF(MAX(bi.biifdata),current_date(),month)*-1 as inactivity_time,
FROM yyyyyyy.gix.bi_biif bi
LEFT JOIN
(
SELECT
aux0.biifclie as customer,
aux0.biifvend as salesman,
aux0.biifempe as store,
aux0.biifdata as date_
FROM yyyyyyy.gix.bi_biif aux0 ORDER BY date_ ASC LIMIT 1
) AS aux ON aux.customer = bi.biifclie
GROUP BY customer,first_salesman,first_store,first_date
I tried to do that with a LEFT JOIN subquery ordered by date (so that I can return the first date), but these fields
aux.salesman as first_salesman,
aux.store as first_store,
aux.date_ as first_date,
all returned null.
Am I doing something wrong, or is the logic incorrect?
Thanks!
Consider below
select biifclie as customer,
array_agg(struct(biifvend as salesman, biifempe as store, biifdata as date) order by biifdata limit 1)[offset(0)].*,
cast(sum(biifpliq) as float64) as total_bought
from `yyyyyyy.gix.bi_biif` t
group by customer
The solution above 1) groups by customer, 2) for each customer, takes all of that customer's rows and keeps only the first one when ordered by date, and 3) then converts the result from an array element into separate columns.
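The nulls in the question come from the subquery itself: ORDER BY ... LIMIT 1 yields a single row for the whole table, so at most one customer can ever match the join. As a rough illustration of the "first row per customer plus totals" idea, here is a sketch that uses a correlated MIN subquery instead of BigQuery's array_agg. It runs on SQLite via Python, the sample rows are invented, and it assumes each customer has exactly one row on their earliest date (ties would duplicate that customer).

```python
import sqlite3

# Hypothetical sample rows; column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bi_biif (biifclie TEXT, biifvend TEXT, biifempe TEXT,"
    " biifdata TEXT, biifpliq REAL)"
)
conn.executemany(
    "INSERT INTO bi_biif VALUES (?, ?, ?, ?, ?)",
    [
        ("c1", "ann", "north", "2021-03-01", 10.0),
        ("c1", "bob", "south", "2021-01-15", 20.0),
        ("c2", "cat", "east", "2021-02-10", 5.0),
    ],
)

rows = conn.execute(
    """
    SELECT t.biifclie      AS customer,
           f.biifvend      AS first_salesman,
           f.biifempe      AS first_store,
           f.biifdata      AS first_date,
           SUM(t.biifpliq) AS total_bought
    FROM bi_biif t
    JOIN bi_biif f
      ON f.biifclie = t.biifclie
     -- keep only this customer's earliest row as the "first purchase"
     AND f.biifdata = (SELECT MIN(m.biifdata)
                       FROM bi_biif m
                       WHERE m.biifclie = t.biifclie)
    GROUP BY t.biifclie, f.biifvend, f.biifempe, f.biifdata
    ORDER BY t.biifclie
    """
).fetchall()

for row in rows:
    print(row)
```

Customer c1 gets bob/south as the first purchase even though that row was inserted second, because the match is on the minimum date, not on row order.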

Summing a column over a date range in a CTE?

I'm trying to sum a certain column over a certain date range. The kicker is that I want this to be a CTE, because I'll have to use it multiple times as part of a larger query. Since it's a CTE, it has to have the date column as well as the sum and ID columns, meaning I have to group by date AND ID. That will cause my results to be grouped by ID and date, giving me not a single sum over the date range, but a bunch of sums, one for each day.
To make it simple, say we have:
create table orders (
id int primary key,
itemID int foreign key references items.id,
datePlaced datetime,
salesRep int foreign key references salesReps.id,
price int,
amountShipped int);
Now, we want to get the total money a given sales rep made during a fiscal year, broken down by item. That is, ignoring the fiscal year bit:
select itemName, sum(price) as totalSales, sum(totalShipped) as totalShipped
from orders
join items on items.id = orders.itemID
where orders.salesRep = '1234'
group by itemName
Simple enough. But when you add anything else, even the price, the query spits out way more rows than you wanted.
select itemName, price, sum(price) as totalSales, sum(totalShipped) as totalShipped
from orders
join items on items.id = orders.itemID
where orders.salesRep = '1234'
group by itemName, price
Now, each group is (name, price) instead of just (name). This is kind of pseudocode, but in my database, just this change causes my result set to jump from 13 to 32 rows. Add to that the date range, and you really have a problem:
select itemName, price, sum(price) as totalSales, sum(totalShipped) as totalShipped
from orders
join items on items.id = orders.itemID
where orders.salesRep = '1234'
and orderDate between 150101 and 151231
group by itemName, price
This is identical to the last example. The trouble is making it a CTE:
with totals as (
select itemName, price, sum(price) as totalSales, sum(totalShipped) as totalShipped, orderDate as startDate, orderDate as endDate
from orders
join items on items.id = orders.itemID
where orders.salesRep = '1234'
and orderDate between startDate and endDate
group by itemName, price, startDate, endDate
)
select totals_2015.itemName as itemName_2015, totals_2015.price as price_2015, ...
totals_2016.itemName as itemName_2016, ...
from (
select * from totals
where startDate = 150101 and endDate = 151231
) totals_2015
join (
select *
from totals
where startDate = 160101 and endDate = 160412
) totals_2016
on totals_2015.itemName = totals_2016.itemName
Now the grouping in the CTE is way off, more than adding the price made it. I've thought about breaking the price query into its own subquery inside the CTE, but I can't escape needing to group by the dates in order to get the date range. Can anyone see a way around this? I hope I've made things clear enough. This is running against an IBM iSeries machine. Thank you!
Depending on what you are looking for, this might be a better approach:
select 'by sales rep' breakdown
, salesRep
, '' year
, sum(price * amountShipped) amount
from etc
group by salesRep
union
select 'by sales rep and year' breakdown
, salesRep
, convert(char(4),orderDate, 120) year
, sum(price * amountShipped) amount
from etc
group by salesRep, convert(char(4),orderDate, 120)
etc
When possible, group by the ID columns or foreign keys: those columns are usually indexed already, so the grouping tends to be faster. This holds for most databases.
with cte as (
select id,rep, sum(sales) sls, count(distinct itemid) did, count(*) cnt from somewhere
where date between x and y
group by id,rep
) select * from cte order by rep
or more fancy
with cte as (
select id,rep, sum(sales) sls, count(distinct itemid) did, count(*) cnt from somewhere
where date between x and y
group by id,rep
) select * from cte join reps on cte.rep = reps.rep order by sls desc
I eventually found a solution, and it doesn't need a CTE at all. I wanted the CTE to avoid code duplication, but this works almost as well. Here's a thread explaining summing conditionally that does exactly what I was looking for.
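The linked thread is not reproduced here, so the following is only a guess at what "summing conditionally" looks like for this case: each date range becomes its own SUM(CASE ...) column, and the grouping stays on the item alone. The sketch runs on SQLite via Python instead of the iSeries, with a flattened, invented orders table and the integer date format from the question.

```python
import sqlite3

# Hypothetical flattened table: item name, YYMMDD integer date, price.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (itemName TEXT, orderDate INTEGER, price INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("widget", 150310, 10),
        ("widget", 150922, 15),
        ("widget", 160105, 20),
        ("gadget", 151201, 7),
        ("gadget", 160301, 3),
        ("gadget", 160410, 4),
    ],
)

rows = conn.execute(
    """
    SELECT itemName,
           -- one conditional sum per date range; no date in GROUP BY
           SUM(CASE WHEN orderDate BETWEEN 150101 AND 151231
                    THEN price ELSE 0 END) AS sales_2015,
           SUM(CASE WHEN orderDate BETWEEN 160101 AND 160412
                    THEN price ELSE 0 END) AS sales_2016
    FROM orders
    GROUP BY itemName
    ORDER BY itemName
    """
).fetchall()

for row in rows:
    print(row)
```

Because the date only appears inside the CASE expressions, the result has one row per item with both ranges side by side, which is what the self-join of the CTE was trying to produce.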

Grouping multiple selects within a SQL query

I have a table Supplier with two columns, TotalStock and Date. I'm trying to write a single query that will give me stock totals by week / month / year for a list of suppliers.
So results will look like this..
SUPPLIER WEEK MONTH YEAR
SupplierA 50 100 2000
SupplierB 60 150 2500
SupplierC 15 25 200
So far I've been playing around with multiple selects but I can't get any further than this:
SELECT Supplier,
(
SELECT Sum(TotalStock)
FROM StockBreakdown
WHERE Date >= '2014-5-12'
GROUP BY Supplier
) AS StockThisWeek,
(
SELECT Sum(TotalStock)
FROM StockBreakdown
WHERE Date >= '2014-5-1'
GROUP BY Supplier
) AS StockThisMonth,
(
SELECT Sum(TotalStock)
FROM StockBreakdown
WHERE Date >= '2014-1-1'
GROUP BY Supplier
) AS StockThisYear
This query throws an error because each individual grouping returns multiple results. I feel that I'm close to the solution but can't work out where to go from here.
You don't have to use subqueries to achieve what you want :
SELECT Supplier
, SUM(CASE WHEN Date >= CAST('2014-05-12' as DATE) THEN TotalStock END) AS StockThisWeek
, SUM(CASE WHEN Date >= CAST('2014-05-01' as DATE) THEN TotalStock END) AS StockThisMonth
, SUM(CASE WHEN Date >= CAST('2014-01-01' as DATE) THEN TotalStock END) AS StockThisYear
FROM StockBreakdown
GROUP BY Supplier
You may need to make the selects for the columns return only a single result. You could try this (not tested currently):
SELECT Supplier,
(
SELECT TOP 1 StockThisWeek FROM
(
SELECT Supplier, Sum(TotalStock) AS StockThisWeek
FROM StockBreakdown
WHERE Date >= '2014-5-12'
GROUP BY Supplier
) tmp1
WHERE tmp1.Supplier = Supplier
) AS StockThisWeek,
(
SELECT TOP 1 StockThisMonth FROM
(
SELECT Supplier, Sum(TotalStock) AS StockThisMonth
FROM StockBreakdown
WHERE Date >= '2014-5-1'
GROUP BY Supplier
) tmp2
WHERE tmp2.Supplier = Supplier
) AS StockThisMonth,
...
This selects the supplier and then builds the two columns StockThisWeek and StockThisMonth by taking the first entry from each of the grouped selects you wrote before. Because of the GROUP BY there should be only one entry per supplier, so you don't lose any data.

Using a column in sql join without adding it to group by clause

My actual table structures are much more complex but following are two simplified table definitions:
Table invoice
CREATE TABLE invoice (
id integer NOT NULL,
create_datetime timestamp with time zone NOT NULL,
total numeric(22,10) NOT NULL
);
id create_datetime total
----------------------------
100 2014-05-08 1000
Table payment_invoice
CREATE TABLE payment_invoice (
invoice_id integer,
amount numeric(22,10)
);
invoice_id amount
-------------------
100 100
100 200
100 150
I want to select the data by joining above 2 tables and selected data should look like:-
month total_invoice_count outstanding_balance
05/2014 1 550
The query I am using:
select
to_char(date_trunc('month', i.create_datetime), 'MM/YYYY') as month,
count(i.id) as total_invoice_count,
(sum(i.total) - sum(pi.amount)) as outstanding_balance
from invoice i
join payment_invoice pi on i.id=pi.invoice_id
group by date_trunc('month', i.create_datetime)
order by date_trunc('month', i.create_datetime);
Above query is giving me incorrect results as sum(i.total) - sum(pi.amount) returns (1000 + 1000 + 1000) - (100 + 200 + 150) = 2550.
I want it to return (1000) - (100 + 200 + 150) = 550
And I cannot change it to i.total - sum(pi.amount), because then I am forced to add i.total column to group by clause and that I don't want to do.
You need a single row per invoice, so aggregate payment_invoice first - best before you join.
When the whole table is selected, it's typically fastest to aggregate first and join later:
SELECT to_char(date_trunc('month', i.create_datetime), 'MM/YYYY') AS month
, count(*) AS total_invoice_count
, (sum(i.total) - COALESCE(sum(pi.paid), 0)) AS outstanding_balance
FROM invoice i
LEFT JOIN (
SELECT invoice_id AS id, sum(amount) AS paid
FROM payment_invoice pi
GROUP BY 1
) pi USING (id)
GROUP BY date_trunc('month', i.create_datetime)
ORDER BY date_trunc('month', i.create_datetime);
LEFT JOIN is essential here. You do not want to lose invoices that have no corresponding rows in payment_invoice (yet), which would happen with a plain JOIN.
Accordingly, use COALESCE() for the sum of payments, which might be NULL.
SQL Fiddle with improved test case.
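To make the arithmetic concrete, here is a small sketch that runs both the naive join and the aggregate-first version against the question's data, plus one extra unpaid invoice (id 101, an invented row) to show what the LEFT JOIN preserves. It uses SQLite via Python rather than PostgreSQL, and stores the month as preformatted text to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE invoice (id INTEGER, create_month TEXT, total REAL);
    CREATE TABLE payment_invoice (invoice_id INTEGER, amount REAL);
    -- invoice 101 is a hypothetical invoice with no payments yet
    INSERT INTO invoice VALUES (100, '05/2014', 1000), (101, '05/2014', 400);
    INSERT INTO payment_invoice VALUES (100, 100), (100, 200), (100, 150);
    """
)

# Naive join: invoice 100 is repeated once per payment row.
naive = conn.execute(
    """
    SELECT i.create_month, COUNT(i.id), SUM(i.total) - SUM(p.amount)
    FROM invoice i
    JOIN payment_invoice p ON i.id = p.invoice_id
    GROUP BY i.create_month
    """
).fetchall()

# Aggregate payments to one row per invoice first, then LEFT JOIN.
fixed = conn.execute(
    """
    SELECT i.create_month, COUNT(*), SUM(i.total) - COALESCE(SUM(p.paid), 0)
    FROM invoice i
    LEFT JOIN (
        SELECT invoice_id, SUM(amount) AS paid
        FROM payment_invoice
        GROUP BY invoice_id
    ) p ON p.invoice_id = i.id
    GROUP BY i.create_month
    """
).fetchall()

print(naive)
print(fixed)
```

The naive version counts invoice 100 three times (3000 - 450 = 2550) and silently drops invoice 101; the aggregate-first version counts two invoices and reports 1400 - 450 = 950 outstanding.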
Do the aggregation in two steps. First aggregate to a single line per invoice, then to a single line per month:
select
to_char(date_trunc('month', t.create_datetime), 'MM/YYYY') as month,
count(*) as total_invoice_count,
(sum(t.total) - sum(t.amount)) as outstanding_balance
from (
select i.create_datetime, i.total, sum(pi.amount) amount
from invoice i
join payment_invoice pi on i.id=pi.invoice_id
group by i.id, i.create_datetime, i.total
) t
group by date_trunc('month', t.create_datetime)
order by date_trunc('month', t.create_datetime);
See sqlFiddle
SELECT TO_CHAR(invoice.create_datetime, 'MM/YYYY') as month,
COUNT(invoice.create_datetime) as total_invoice_count,
invoice.total - payments.sum_amount as outstanding_balance
FROM invoice
JOIN
(
SELECT invoice_id, SUM(amount) AS sum_amount
FROM payment_invoice
GROUP BY invoice_id
) payments
ON invoice.id = payments.invoice_id
GROUP BY TO_CHAR(invoice.create_datetime, 'MM/YYYY'),
invoice.total - payments.sum_amount

Working out total from sub total and amount

I have a table with purchased orders data.
Each row contains the amount of a certain item purchased, the cost per item, and the order number. Each different item purchased is a new row with the same order number.
I basically want to return the total cost for that order. I have tried the following but am getting nowhere:
SELECT order_number, SUM( sub_total ) AS `total`
FROM
SELECT order_number, SUM( SUM( amount ) * SUM( cost_per_item ) ) AS `sub_total`
FROM `ecom_orders`
WHERE member_id = '4'
GROUP BY order_number
ORDER BY purchase_date DESC
Pretty much any SQL-92 compliant RDBMS will take this:
SELECT
order_number
,SUM(amount * cost_per_item) AS total
,purchase_date
FROM
ecom_orders
WHERE member_id = '4'
GROUP BY order_number,purchase_date
ORDER BY purchase_date DESC
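The key difference from the attempt in the question is that the multiplication happens per row, before summing. A quick sketch with invented rows (SQLite via Python, not the asker's database) shows how SUM(amount) * SUM(cost_per_item) diverges from SUM(amount * cost_per_item) as soon as an order has more than one line:

```python
import sqlite3

# Hypothetical order lines: order 1 has two items, order 2 has one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ecom_orders (order_number INTEGER, amount INTEGER, cost_per_item REAL)"
)
conn.executemany(
    "INSERT INTO ecom_orders VALUES (?, ?, ?)",
    [(1, 2, 5.0), (1, 3, 1.0), (2, 1, 4.0)],
)

rows = conn.execute(
    """
    SELECT order_number,
           SUM(amount * cost_per_item)       AS total,        -- per-row product, then sum
           SUM(amount) * SUM(cost_per_item)  AS wrong_total   -- product of two sums
    FROM ecom_orders
    GROUP BY order_number
    ORDER BY order_number
    """
).fetchall()

for row in rows:
    print(row)
```

For order 1 the correct total is 2*5 + 3*1 = 13, while the product of sums gives (2+3) * (5+1) = 30; the two only agree for single-line orders such as order 2.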