Get Sum of quantities from multiple tables? - sql

I have at least 8 tables in which I need to match the customer name, fetch the quantities, and get the sum of all the quantities fetched from these 8 tables. I am trying to write code that ignores any customer whose sum of quantities is zero.
For example, let's take two tables, purchase_sugar and sales_sugar. I have tried a lot of queries, but only this one returns any result, and that result is wrong.
SELECT sum(purchase_sugar.qty + sales_sugar.qty) AS Total_Amount from purchase_sugar inner join sales_sugar on purchase_sugar.supplier = sales_sugar.customer WHERE purchase_sugar.supplier = "+str(x.id)+"
The Table structures are like:
purchase_sugar has two columns: supplier and qty.
sales_sugar has two columns: customer and qty.
How can I get the sum of quantities across these tables if I provide one name, search for it in each table, and collect the quantities? The other thing is that the customer does not have to be found in all the tables. If it is found in only one table, we should just get the quantity from that one table, and for that reason I don't think a JOIN is useful, or maybe I am wrong.

To take care of the situation where a supplier/customer is not in all the tables, you can use union all and group by:
select name, sum(p_qty) as sum_p, sum(s_qty) as sum_s,
sum(p_qty) + sum(s_qty)
from ((select ps.supplier as name, ps.qty as p_qty, 0 as s_qty
from purchase_sugar ps
) union all
(select ss.customer as name, 0, ss.qty
from sales_sugar ss
)
) s
group by name;
Notes:
This query gets results for all names. You can use a where clause to restrict the results to one name.
You don't have to split the quantities into two (or eight) different columns, if you just want the overall sum.
You can aggregate before the union all, but that is not necessary.
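The notes above can be sketched as a small runnable example. This is only an illustration, assuming SQLite through Python's sqlite3 module and made-up sample rows; the helper name total_quantities and the :name parameter are hypothetical. It stacks the tables with UNION ALL, sums once, drops zero totals with HAVING, and binds the name as a parameter instead of concatenating it into the SQL string.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchase_sugar (supplier TEXT, qty INTEGER);
    CREATE TABLE sales_sugar (customer TEXT, qty INTEGER);
    INSERT INTO purchase_sugar VALUES ('alice', 10), ('alice', 5), ('bob', 0);
    INSERT INTO sales_sugar VALUES ('alice', 7), ('carol', 3);
""")

# Stack every table into one (name, qty) stream, then aggregate once.
# A name only needs to appear in one of the tables to be counted.
# HAVING removes names whose overall sum is zero.
TOTALS_SQL = """
    SELECT name, SUM(qty) AS total_qty
    FROM (
        SELECT supplier AS name, qty FROM purchase_sugar
        UNION ALL
        SELECT customer AS name, qty FROM sales_sugar
    ) AS s
    WHERE :name IS NULL OR name = :name
    GROUP BY name
    HAVING SUM(qty) <> 0
    ORDER BY name
"""


def total_quantities(connection, name=None):
    """Return (name, total_qty) rows; pass a name to restrict the result."""
    return connection.execute(TOTALS_SQL, {"name": name}).fetchall()


all_totals = total_quantities(conn)            # every name with a non-zero sum
alice_total = total_quantities(conn, "alice")  # restricted to one name
```

With eight tables, the inner query would simply gain six more UNION ALL branches of the same shape.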

You should JOIN the sums, not sum the join:
select t1.purchase_sum + sales_sum as Total_Amount
from (
select purchase_sugar.supplier, sum(purchase_sugar.qty) as purchase_sum
from purchase_sugar
group by purchase_sugar.supplier
) t1
inner join (
select sales_sugar.customer, sum(sales_sugar.qty) as sales_sum
from sales_sugar
group by sales_sugar.customer
) t2 on t1.supplier = t2.customer and t1.supplier = "+str(x.id)+"

Related

For a given product, for each store, sum daily sales at all nearby stores

Given a daily_summary table containing columns {order_date, store_code, product_id, sales} and a stores table containing columns {store_code,latitude,longitude}, how can I:
For a given product_id (eg "1234"), for each store_code, get the daily SUM(sales) for the same product at nearby stores (within a 10km radius)? Output is a table with columns {store_code, order_date, sales_at_nearby_stores}, and I'm asking specifically for BigQuery.
My current query works, but is too slow. I'm sure there's a faster way to do it. Here's what I have so far:
WITH store_distances AS (
SELECT
t1.store_code store1,
t2.store_code store2,
ST_DISTANCE(
ST_GEOGPOINT(t1.longitude,t1.latitude),
ST_GEOGPOINT(t2.longitude,t2.latitude)
) AS distance_meters
FROM stores t1
CROSS JOIN stores t2
WHERE t1.store_code != t2.store_code
), nearby_stores_table AS (
SELECT
t1.store1 AS store_code,
STRING_AGG(DISTINCT t2.store2) AS nearby_stores
FROM store_distances t1
LEFT JOIN store_distances t2 USING (store1)
WHERE t2.distance_meters < 10000
GROUP BY t1.store1
ORDER BY t1.store1
), ds_with_nearby_stores AS (
SELECT
order_date, store_code, nearby_stores, sales
FROM daily_summary
LEFT JOIN nearby_stores_table USING (store_code)
WHERE product_id="1234"
)
SELECT DISTINCT
store_code, order_date,
(
SELECT SUM(sales)
FROM ds_with_nearby_stores t2
WHERE t2.store_code IN UNNEST(SPLIT(t1.nearby_stores)) AND t1.order_date=t2.order_date
) AS sales_at_nearby_stores,
FROM ds_with_nearby_stores t1
ORDER BY store_code, order_date
The first part of the query generates a table with {store1, store2, distance_meters}, where distance_meters is the distance between the two stores. The second part generates a table with {store_code, nearby_stores}, where nearby_stores is a comma-separated string of nearby stores. The third part joins the second table with daily_summary (filtered on product_id), which gives a table with {order_date, store_code, nearby_stores, sales}. Finally, the last part unpacks the string of nearby_stores and adds up the sales from those stores, giving {store_code, order_date, sales_at_nearby_stores}.
It is hard to say exactly what is slow here without the data and without the query explanation that is displayed after the query finishes (if it finishes at all). Please add the query explanation.
One reason it might be slow is that it computes the pairwise distances between all stores, which creates a large join and computes a huge number of distances. BigQuery has an optimized spatial JOIN that can do this much faster using the ST_DWithin predicate, which filters by a given distance. The first two CTEs can be rewritten as:
WITH stores_with_loc AS (
SELECT
store_code store,
ST_GEOGPOINT(longitude,latitude) loc
FROM stores
), nearby_stores_table AS (
SELECT
t1.store AS store_code,
ARRAY_AGG(DISTINCT IF(t2.store <> t1.store, t2.store, NULL) IGNORE NULLS) AS nearby_stores
FROM stores_with_loc t1
JOIN stores_with_loc t2
ON ST_DWithin(t1.loc, t2.loc, 10000)
GROUP BY t1.store
)
select * from nearby_stores_table
Other tweaks:
I used ARRAY_AGG, which should be faster than converting to strings.
I used a regular JOIN rather than a LEFT JOIN, because BigQuery currently optimizes only inner spatial joins. A store always joins with itself, so this is OK; the self-reference is dropped later inside the ARRAY_AGG expression.
Don't use ORDER BY in subqueries; it doesn't change the final result anyway.

Aggregating based on GROUPING of multiple columns

I am trying to subquery and aggregate in SQL after doing an initial query with multiple joins. My ultimate goal is to get a count (or a sum) of specimens tested based on a grouping of multiple columns. This is slightly different from SQL Server query - Selecting COUNT(*) with DISTINCT and SQL Server: aggregate error on grouping.
The three tables that I use (PERSON, SPECIMEN, TEST), have 1-many relationships. So PERSON has many SPECIMENS and those SPECIMENS have many TESTS. I did three inner joins to combine these tables plus an additional table (ANALYSIS).
WITH TALLY as (
SELECT PERSON.NAME, PERSON.PHASE, TEST.DATE_STARTED, TEST.ANALYSIS, SPECIMEN.GROUP, TEST.STATUS,
ANALYSIS.ANALYSIS_TYPE, SPECIMEN.SPECIMEN_NUMBER
FROM DB.TEST
INNER JOIN
DB.SPECIMEN ON
TEST.SPECIMEN_NUMBER = SPECIMEN.SPECIMEN_NUMBER
INNER JOIN
DB.PERSON ON
SPECIMEN.PERSON = PERSON.NAME
INNER JOIN
DB.ANALYSIS ON
TEST.ANALYSIS = ANALYSIS.NAME
WHERE PERSON.NAME = 'Joe'
AND TEST.DATE_STARTED >= '20-DEC-16' AND TEST.DATE_STARTED <='01-APR-18'
AND PERSON.PHASE = 'PHASE1'
ORDER BY TEST.DATE_STARTED)
SELECT COUNT(DISTINCT ANALYSIS) as SPECIMEN_COUNT, DATE_STARTED, ANALYSIS, STATUS, GROUP, ANALYSIS_TYPE
FROM TALLY
GROUP BY DATE_STARTED, ANALYSIS, STATUS, GROUP, ANALYSIS_TYPE
ORDER BY DATE_STARTED;
This gives me the repeated columns: first grouping repeated 4 times
What I am trying to see is: aggregated first grouping with total count
Any thoughts as to what is missing? SUM instead of COUNT or in addition to COUNT creates an error. Thanks in advance!
9/17/2020 Update: I have tried adding a subquery because I also need to use a new column of metadata (ANALYSIS_TYPE_ALIAS) which is created in the first query through a CASE STATEMENT(...). I have also tried using another subquery with inner join to count based on those conditions to a temp table, but still cannot seem to aggregate to flatten the table. Here is my current attempt:
WITH TALLY as (
SELECT PERSON.NAME, PERSON.PHASE, TEST.DATE_STARTED, TEST.ANALYSIS, SPECIMEN.GROUP, TEST.STATUS,
ANALYSIS.ANALYSIS_TYPE...
FROM DB.TEST
INNER JOIN
DB.SPECIMEN ON
TEST.SPECIMEN_NUMBER = SPECIMEN.SPECIMEN_NUMBER
INNER JOIN
DB.PERSON ON
SPECIMEN.PERSON = PERSON.NAME
INNER JOIN
DB.ANALYSIS ON
TEST.ANALYSIS = ANALYSIS.NAME
WHERE PERSON.NAME = 'Joe'
AND TEST.DATE_STARTED >= '20-DEC-16' AND TEST.DATE_STARTED <='01-APR-18'
AND PERSON.PHASE = 'PHASE1'
ORDER BY TEST.DATE_STARTED),
SUMMARY_COMBO AS (SELECT DISTINCT(CONCAT(CONCAT(CONCAT(CONCAT(ANALYSIS, DATE_STARTED),STATUS), GROUP), ANALYSIS_TYPE_ALIAS))AS UUID,
TALLY.NAME, TALLY.PHASE, TALLY.DATE_STARTED, TALLY.ANALYSIS, TALLY.GROUP, TALLY.STATUS, TALLY.ANALYSIS_TYPE_ALIAS
FROM TALLY)
SELECT SUMMARY_COMBO.NAME, SUMMARY_COMBO.PHASE, SUMMARY_COMBO.DATE_STARTED, SUMMARY_COMBO.ANALYSIS,SUMMARY_COMBO.GROUP, SUMMARY_COMBO.STATUS, SUMMARY_COMBO.ANALYSIS_TYPE_ALIAS,
COUNT(SUMMARY_COMBO.ANALYSIS) OVER (PARTITION BY SUMMARY_COMBO.UUID) AS SPECIMEN_COUNT
FROM SUMMARY_COMBO
ORDER BY SUMMARY_COMBO.DATE_STARTED;
This gave me a table that shows aggregated counts but doesn't aggregate based on the unique UUID. Is there a way to take the sum of the count? I've tried storing the count in a subquery and then referencing that count, but I am missing something in how to group the 8 columns of data that I want to show plus the count of that combination of columns.
Thanks!
Just remove ANALYSIS from the GROUP BY clause, since that's the column whose distinct values you want to count. Otherwise, the query generates more groups than you need (and the count of distinct ANALYSIS values in each group is always 1). ANALYSIS also has to leave the SELECT list, because most databases reject a non-aggregated column that is not in the GROUP BY.
WITH TALLY as ( ...)
SELECT COUNT(DISTINCT ANALYSIS) as SPECIMEN_COUNT, DATE_STARTED, STATUS, GROUP, ANALYSIS_TYPE
FROM TALLY
GROUP BY DATE_STARTED, STATUS, GROUP, ANALYSIS_TYPE
ORDER BY DATE_STARTED;
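The effect of grouping by the counted column can be seen in a small sketch. It assumes SQLite through Python's sqlite3 module, a flattened hypothetical tally table standing in for the TALLY CTE, made-up rows, and a column named grp because GROUP is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the TALLY CTE from the question.
    CREATE TABLE tally (
        date_started TEXT, analysis TEXT, status TEXT,
        grp TEXT, analysis_type TEXT
    );
    INSERT INTO tally VALUES
        ('2017-01-05', 'PH',    'DONE', 'G1', 'CHEM'),
        ('2017-01-05', 'PH',    'DONE', 'G1', 'CHEM'),
        ('2017-01-05', 'ASSAY', 'DONE', 'G1', 'CHEM'),
        ('2017-01-05', 'KF',    'DONE', 'G1', 'CHEM');
""")

# Grouping by the counted column: one group per analysis, each count is 1.
with_analysis = conn.execute("""
    SELECT COUNT(DISTINCT analysis) AS specimen_count, date_started, analysis
    FROM tally
    GROUP BY date_started, analysis, status, grp, analysis_type
    ORDER BY analysis
""").fetchall()

# Leaving the counted column out of GROUP BY: one flattened row per combination.
without_analysis = conn.execute("""
    SELECT COUNT(DISTINCT analysis) AS specimen_count,
           date_started, status, grp, analysis_type
    FROM tally
    GROUP BY date_started, status, grp, analysis_type
    ORDER BY date_started
""").fetchall()
```

In this sample the first query returns three rows with a count of 1 each, while the second returns a single row with a count of 3.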

How to get the sum of transaction amount happened in one date?

I am trying to write a query that gives me the sum of the transaction amounts for one date. The problem is that when I add the date column to my query, I get individual values, not their sum. The requirement for this query is to have one entry for each merchant, but I am getting multiple rows for one merchant.
SELECT SUBSTR(m.MERCHANTLASTNAME, 1, 36) Name1,
m.MERCHANTBANKBSB MerchantAccbsb,
m.MERCHANTBANKACCNR Merchant_act,
m.MERCHANTID merchantid,
t.transactiondate date1,
sum(t.TRANSACTIONAMOUNT) as total
FROM fss_merchant m
JOIN fss_terminal term
ON m.MERCHANTID = term.MERCHANTID
JOIN FSS_DAILY_TRANSACTION t
ON term.TERMINALID = t.TERMINALID
group by t.transactiondate, SUBSTR(m.MERCHANTLASTNAME, 1, 36), m.MERCHANTID, m.MERCHANTBANKBSB, m.MERCHANTBANKACCNR,
m.MERCHANTLASTNAME
Output of my query:
I want to get one entry per merchant with the sum of the transaction amounts for one day, not multiple rows of transactions for that day.
You can calculate the total amount in a separate inner query using the truncated date and join it with the FSS_MERCHANT table, so that the issues described by @SatishSK and @mangusta are taken care of.
You can use the following query:
SELECT
SUBSTR(M.MERCHANTLASTNAME, 1, 36) NAME1,
M.MERCHANTBANKBSB MERCHANTACCBSB,
M.MERCHANTBANKACCNR MERCHANT_ACT,
M.MERCHANTID MERCHANTID,
M_DATA.TRANSACTIONDATE DATE1,
M_DATA.TOTAL AS TOTAL
FROM
FSS_MERCHANT M
INNER JOIN (
SELECT
TERM.MERCHANTID MERCHANTID,
TRUNC(T.TRANSACTIONDATE) TRANSACTIONDATE,
SUM(T.TRANSACTIONAMOUNT) AS TOTAL
FROM
FSS_TERMINAL TERM
JOIN FSS_DAILY_TRANSACTION T ON TERM.TERMINALID = T.TERMINALID
GROUP BY
TERM.MERCHANTID,
TRUNC(T.TRANSACTIONDATE)
) M_DATA ON ( M.MERCHANTID = M_DATA.MERCHANTID );
Good luck!!
The t.transactiondate column might contain date+time values. Use TRUNC(t.transactiondate) wherever you are using just t.transactiondate. You will then get sum(transaction amount) per date for each merchant.
OR
Filter rows on the date value in the WHERE clause to retrieve data for a specific date.
Probably the reason is that you have included both m.MERCHANTLASTNAME and SUBSTR(m.MERCHANTLASTNAME,1,36) in the group by clause.
If there are entries with the same SUBSTR(m.MERCHANTLASTNAME,1,36) but a different m.MERCHANTLASTNAME, this is going to yield duplicates. You need to remove m.MERCHANTLASTNAME from the group by clause.
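The date-truncation point can be illustrated with a minimal sketch. It assumes SQLite through Python's sqlite3 module, where date() is used as a stand-in for Oracle's TRUNC on a date+time value; the sample rows and the reduced column set are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced, hypothetical versions of the tables from the question.
    CREATE TABLE fss_merchant (merchantid INTEGER, merchantlastname TEXT);
    CREATE TABLE fss_terminal (terminalid TEXT, merchantid INTEGER);
    CREATE TABLE fss_daily_transaction (
        terminalid TEXT, transactiondate TEXT, transactionamount INTEGER
    );
    INSERT INTO fss_merchant VALUES (1, 'Smith'), (2, 'Jones');
    INSERT INTO fss_terminal VALUES ('T1', 1), ('T2', 1), ('T3', 2);
    INSERT INTO fss_daily_transaction VALUES
        ('T1', '2020-05-01 09:15:00', 10),
        ('T1', '2020-05-01 17:40:00', 20),
        ('T2', '2020-05-01 12:00:00', 5),
        ('T1', '2020-05-02 08:00:00', 7),
        ('T3', '2020-05-01 10:00:00', 4);
""")

# Aggregate per merchant and calendar day first, then join the merchant details.
# date(...) strips the time part here, playing the role of TRUNC(...) in Oracle.
daily_totals = conn.execute("""
    SELECT m.merchantid,
           substr(m.merchantlastname, 1, 36) AS name1,
           d.txn_day,
           d.total
    FROM fss_merchant m
    JOIN (
        SELECT term.merchantid,
               date(t.transactiondate) AS txn_day,
               SUM(t.transactionamount) AS total
        FROM fss_terminal term
        JOIN fss_daily_transaction t ON term.terminalid = t.terminalid
        GROUP BY term.merchantid, date(t.transactiondate)
    ) d ON m.merchantid = d.merchantid
    ORDER BY m.merchantid, d.txn_day
""").fetchall()
```

Each merchant then appears once per day, regardless of how many terminals or timestamps contributed to that day's total.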

Subtract two SUM GROUP BY fields

I have two tables itemOrders and itemUsage.
itemOrders has two fields: item_number and qty_ordered
itemUsage has two fields: item_number and qty_used
I'm trying to word my SQL query so that it sums up the quantity of each item number in both tables then subtracts the totals in itemUsage from itemOrders
I've come up with this so far:
SELECT itemOrders.item_number
,(
SELECT sum(qty_ordered)
FROM itemOrders
GROUP BY itemOrders.item_number
) - (
SELECT sum(qty_used)
FROM itemUsage
GROUP BY itemUsage.item_number
) AS item_total
FROM itemOrders
INNER JOIN itemUsage
ON itemOrders.item_number = itemUsage.item_number
GROUP BY itemOrders.item_number
What happens here is that all fields come out to 0.
For example, if item number "A" has a total quantity ordered of 3 across all instances of "A" in the itemOrders table, and a total quantity used of only 1 across all instances of "A" in the itemUsage table, the SQL should show 2 in the item_total field next to a single instance of "A" in the item_number field.
The problem is that you are creating a CARTESIAN PRODUCT and repeating the values in the SUM. Just calculate each total separately and then LEFT JOIN both. If no items are used, COALESCE will convert the NULL to 0.
SELECT io_total.item_number,
order_total - COALESCE(used_total, 0) as item_total
FROM (SELECT io.item_number, sum(io.qty_ordered) as order_total
FROM itemOrders io
GROUP BY io.item_number
) io_total
LEFT JOIN (SELECT iu.item_number, sum(iu.qty_used) as used_total
FROM itemUsage iu
GROUP BY iu.item_number
) as iutotal
ON io_total.item_number = iutotal.item_number
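A small sketch can show why the totals are computed before the join. It assumes SQLite through Python's sqlite3 module and made-up rows: item "A" is ordered twice (1 + 2) and used once (1), and item "B" is ordered but never used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE itemOrders (item_number TEXT, qty_ordered INTEGER);
    CREATE TABLE itemUsage (item_number TEXT, qty_used INTEGER);
    INSERT INTO itemOrders VALUES ('A', 1), ('A', 2), ('B', 4);
    INSERT INTO itemUsage VALUES ('A', 1);
""")

# Aggregate each table on its own, then LEFT JOIN the two totals.
aggregate_then_join = conn.execute("""
    SELECT io_total.item_number,
           order_total - COALESCE(used_total, 0) AS item_total
    FROM (SELECT item_number, SUM(qty_ordered) AS order_total
          FROM itemOrders GROUP BY item_number) io_total
    LEFT JOIN (SELECT item_number, SUM(qty_used) AS used_total
               FROM itemUsage GROUP BY item_number) iu_total
      ON io_total.item_number = iu_total.item_number
    ORDER BY io_total.item_number
""").fetchall()

# For contrast: joining the raw rows first repeats the single usage row for
# each order row of 'A', so qty_used is counted twice, and 'B' disappears.
join_then_aggregate = conn.execute("""
    SELECT o.item_number, SUM(o.qty_ordered) - SUM(u.qty_used) AS item_total
    FROM itemOrders o
    INNER JOIN itemUsage u ON o.item_number = u.item_number
    GROUP BY o.item_number
    ORDER BY o.item_number
""").fetchall()
```

In this sample the first query gives 2 for "A" and 4 for "B", while the join-first variant gives 1 for "A" and has no row for "B".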
Well it looks like you are making three queries, two separate ones, one for each sum, and one with an inner join that isn't being used.
Try
Select itemOrders.item_number, sum(itemOrders.qty_ordered - itemUsage.qty_used) as item_total
from itemOrders INNER JOIN itemUsage
On itemOrders.item_number = itemUsage.item_number
GROUP BY itemOrders.item_number

SQL Server : create a view with Union with data from first query

My SQL is quite rusty, so much so that I have not created a view before and I am not entirely sure how to do what I need. Perhaps I need a stored procedure. Here is the deal.
We have a database of ticket history (purchases). We want to filter on a certain SKU, but we want all line items from each ticket that has that SKU. For instance, someone buys a shirt and a hat. I want to filter on the shirt to find everyone who wants a shirt, but display the entire ticket showing the shirt and the hat.
I thought my query would be something like this but I don't think it would work.
select
ticket_id, post_date, qty_sold, total_price, sales_total
from
ticket_history
where
sku = 'xxxx'
Union
select
sku as trans_sku, qty_sold as trans_qty_sold, desc as trans_desc, total_price as trans_total_price
from
ticket_history
where
ticket_id = <the ticket id in first query>
Perhaps a sub-select is what is needed, but I don't really understand how to do that either.
Any suggestions would be great.
I am not sure what you are trying to do here or whether UNION is what you are looking for.
In your query the columns are different and don't match between the two queries. Anyway, you can use a common table expression so that you can reuse the subquery; this should solve your problem:
WITH FirstQuery
AS
(
select
ticket_id,
post_date,
qty_sold,
total_price,
sales_total
from ticket_history
where sku = 'xxxx'
)
SELECT *
FROM FirstQuery
UNION
SELECT
... -- You should select the same number of columns
... -- and with the same data types to match the first columns
from ticket_history
where ticket_id IN(SELECT ticket_id FROM FirstQuery);
Here FirstQuery acts like a subquery, but you can reuse it later, as we did in the WHERE clause.
But, again, the columns you selected in the first query:
ticket_id,
post_date,
qty_sold,
total_price,
sales_total
are different than the columns you selected in the second query:
sku as trans_sku,
qty_sold as trans_qty_sold,
desc as trans_desc,
total_price as trans_total_price
These columns should match (in count and in data types). Otherwise you will get an error.
Things to note about UNION:
The column count should be the same in the two queries.
The column names are taken from the first query.
When doing a UNION, the selected columns must match between the two select's. (Same number of columns, and matching data types.)
Maybe you want a self join instead?
select th1.ticket_id, th1.post_date, th1.qty_sold, th1.total_price, th1.sales_total,
th2.sku as trans_sku, th2.qty_sold as trans_qty_sold,
th2.[desc] as trans_desc, th2.total_price as trans_total_price
from ticket_history th1
left join ticket_history th2 on th2.ticket_id = th1.ticket_id
where th1.sku = 'xxxx'
The LEFT JOIN returns th1 rows even if there is no matching th2 row.
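Both suggestions come down to the same idea: find the tickets that contain the SKU, then return every line item on those tickets. A minimal sketch, assuming SQLite through Python's sqlite3 module, a reduced ticket_history table, and made-up rows in which each SKU appears at most once per ticket:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced, hypothetical version of the table from the question.
    CREATE TABLE ticket_history (ticket_id INTEGER, sku TEXT, qty_sold INTEGER);
    INSERT INTO ticket_history VALUES
        (1, 'SHIRT', 1), (1, 'HAT', 1),
        (2, 'HAT', 2),
        (3, 'SHIRT', 2), (3, 'SOCKS', 3);
""")

# Variant 1: a subquery picks the matching tickets, the outer query lists
# all of their line items.
via_subquery = conn.execute("""
    SELECT ticket_id, sku, qty_sold
    FROM ticket_history
    WHERE ticket_id IN (SELECT ticket_id FROM ticket_history WHERE sku = ?)
    ORDER BY ticket_id, sku
""", ("SHIRT",)).fetchall()

# Variant 2: a self join; th1 finds the SKU, th2 supplies the whole ticket.
via_self_join = conn.execute("""
    SELECT th2.ticket_id, th2.sku, th2.qty_sold
    FROM ticket_history th1
    JOIN ticket_history th2 ON th2.ticket_id = th1.ticket_id
    WHERE th1.sku = ?
    ORDER BY th2.ticket_id, th2.sku
""", ("SHIRT",)).fetchall()
```

Ticket 2 contains no shirt, so none of its line items are returned by either variant.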