simple finance database, need some statistics, how to query? - sql

I want to use a simple private cashflow database. Nothing special.
Therefore I use a table "finance_flow", where I record all my income and expenses.
Income -> amount > 0
outcome -> amount < 0
Table structure
table "finance_flow" with example-data
id   category_id  amount  date       note
int  int          float   timestamp  varchar
1    1            +60,00  5.2.23     use for xy
2    2            -10,00  8.2.23     to Tom for school
3    3            -8,96   8.2.23     milk, bread, cheese
table "category"
id  name
1   tips
2   kids
3   shop
Of course there is a correct foreign-key-constraint.
What I want:
I want some statistical data, for example:
- current status of my money
- total outcomes for each category
- percentage values of those would be nice
I know how to get the total current state:
SELECT sum(amount) as total FROM finance_flow
I know how to get the total per category
SELECT abs(sum(amount)) as total_per_cat, category.name
FROM finance_flow
LEFT JOIN category ON finance_flow.category_id = category.id
[WHERE date = 'february']
GROUP BY category_id
(Here I use the function abs(x) because I am not interested in the sign.)
The WHERE clause is optional; once the basics are correct, I want to use it for monthly reports.
How do I get the percentage values?
Can I get all this stuff with one query? :)
Expected result:
procentual_per_cat = total_per_cat / total_income(february) * 100
where
total_per_cat = abs(sum(amount)) for category x
total_income(february) = 60
resulting table:
name  total_per_cat  procentual_per_cat
kids  10             16.67 %
shop  8.96           14.93 %

Calculating the percentage is the same as dividing by the total income (the sum of all positive amounts) and multiplying by 100:
SELECT
abs(sum(amount)) as total_per_cat,
abs(sum(amount))/(select sum(amount) from finance_flow where amount>0) *100 as percentage,
category.name
FROM finance_flow
LEFT JOIN category ON finance_flow.category_id = category.id
WHERE amount<0
GROUP BY category.name
see: DBFIDDLE
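As a rough, self-contained check of the query above, here is a minimal Python sketch using the standard sqlite3 module. SQLite only stands in for whatever engine you actually use. The ISO-formatted dates and the :start/:end parameters are assumptions added to show the optional monthly filter; they are not part of the original answer.

```python
import sqlite3

# In-memory database with the example data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE finance_flow (
    id INTEGER PRIMARY KEY,
    category_id INTEGER REFERENCES category(id),
    amount REAL,
    date TEXT,   -- assumption: ISO dates stored as text
    note TEXT
);
INSERT INTO category VALUES (1, 'tips'), (2, 'kids'), (3, 'shop');
INSERT INTO finance_flow VALUES
    (1, 1,  60.00, '2023-02-05', 'use for xy'),
    (2, 2, -10.00, '2023-02-08', 'to Tom for school'),
    (3, 3,  -8.96, '2023-02-08', 'milk, bread, cheese');
""")

# Expenses per category as a percentage of the income in the same period.
rows = conn.execute("""
SELECT c.name,
       abs(sum(f.amount)) AS total_per_cat,
       round(abs(sum(f.amount)) * 100.0 /
             (SELECT sum(amount) FROM finance_flow
              WHERE amount > 0 AND date BETWEEN :start AND :end), 2) AS percentage
FROM finance_flow f
JOIN category c ON c.id = f.category_id
WHERE f.amount < 0 AND f.date BETWEEN :start AND :end
GROUP BY c.name
ORDER BY c.name
""", {"start": "2023-02-01", "end": "2023-02-28"}).fetchall()

for name, total, pct in rows:
    print(name, total, pct)
```

With the sample rows this lists kids and shop with roughly 16.67 and 14.93 percent, matching the expected result in the question. Note that the sub-query needs the same date filter as the outer query, otherwise the percentages are taken against the all-time income.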

Related

COUNT with multiple LEFT joins [duplicate]

I am having some troubles with a count function. The problem is given by a left join that I am not sure I am doing correctly.
Variables are:
Customer_name (buyer)
Product_code (what the customer buys)
Store (where the customer buys)
The datasets are:
Customer_df (list of customers and product codes of their purchases)
Store1_df (list of product codes per week, for Store 1)
Store2_df (list of product codes per day, for Store 2)
Final output desired:
I would like to have a table with:
col1: Customer_name;
col2: Count of items purchased in store 1;
col3: Count of items purchased in store 2;
Filters: date range
My query looks like this:
SELECT
DISTINCT
C.customer_name,
C.product_code,
COUNT(S1.product_code) AS s1_sales,
COUNT(S2.product_code) AS s2_sales,
FROM customer_df C
LEFT JOIN store1_df S1 USING(product_code)
LEFT JOIN store2_df S2 USING(product_code)
GROUP BY
customer_name, product_code
HAVING
S1_sales > 0
OR S2_sales > 0
The output I expect is something like this:
Customer_name  Product_code  Store1_weekly_sales  Store2_weekly_sales
Luigi          120012        4                    8
James          100022        6                    10
But instead, I get:
Customer_name  Product_code  Store1_weekly_sales  Store2_weekly_sales
Luigi          120012        290                  60
James          100022        290                  60
It works when I use COUNT(DISTINCT product_code) instead of COUNT(product_code), but I would like to avoid that because I want to be able to aggregate over different timespans (e.g. if I count distinct values and include more than one week of data, I will not get the right numbers).
My hypotheses are:
I am joining the tables in the wrong way
There is a problem when joining two datasets with different time aggregations
What am I doing wrong?
The reason, as Philipxy indicated, is a common one. You are getting a Cartesian result from your data, which bloats your numbers. To simplify, let's consider just a single customer purchasing one item from two stores. The first store has 3 purchases, the second store has 5 purchases. Your total count is 3 * 5 = 15, because each entry in the first store is joined to every entry with the same customer id in the second. So the 1st purchase is joined to second-store purchases 1-5, then the 2nd purchase is joined to second-store purchases 1-5, and so on, and you can see the bloat.
If instead each store is pre-aggregated per customer in its own sub-query, there will be AT MOST one record per customer per store (and per product, as per your desired outcome).
select
    c.customer_name,
    AllCustProducts.Product_Code,
    coalesce( PQStore1.SalesEntries, 0 ) Store1SalesEntries,
    coalesce( PQStore2.SalesEntries, 0 ) Store2SalesEntries
from
    customer_df c
    -- now, we need all possible UNIQUE instances of
    -- a given customer and product to prevent duplicates
    -- for subsequent queries of sales per customer and store
    JOIN
    ( select distinct customerid, product_code
        from store1_df
      union
      select distinct customerid, product_code
        from store2_df ) AllCustProducts
        on c.customerid = AllCustProducts.customerid
    -- NOW, we can join to a pre-query of sales at store 1
    -- by customer id and product code. You may also want to
    -- get sum( SalesDollars ) if available, just add respectively
    -- to each sub-query below.
    LEFT JOIN
    ( select
            s1.customerid,
            s1.product_code,
            count(*) as SalesEntries
        from
            store1_df s1
        group by
            s1.customerid,
            s1.product_code ) PQStore1
        on AllCustProducts.customerid = PQStore1.customerid
        AND AllCustProducts.product_code = PQStore1.product_code
    -- now, same pre-aggregation to store 2
    LEFT JOIN
    ( select
            s2.customerid,
            s2.product_code,
            count(*) as SalesEntries
        from
            store2_df s2
        group by
            s2.customerid,
            s2.product_code ) PQStore2
        on AllCustProducts.customerid = PQStore2.customerid
        AND AllCustProducts.product_code = PQStore2.product_code
No GROUP BY or HAVING is needed on the outer query, since each pre-aggregate returns at most 1 record per unique combination. As for filtering by date ranges, I would just add a WHERE clause within each of the AllCustProducts, PQStore1, and PQStore2 sub-queries.
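The 3 * 5 bloat described above is easy to reproduce. The following is a minimal Python sketch using the standard sqlite3 module; the two-column table layout (customerid, product_code) is an assumption made for illustration, since the question does not show the real columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE store1_df (customerid INTEGER, product_code INTEGER);
CREATE TABLE store2_df (customerid INTEGER, product_code INTEGER);
""")
# One customer, one product: 3 purchases in store 1, 5 purchases in store 2.
conn.executemany("INSERT INTO store1_df VALUES (?, ?)", [(1, 120012)] * 3)
conn.executemany("INSERT INTO store2_df VALUES (?, ?)", [(1, 120012)] * 5)

# Joining the raw rows pairs every store-1 row with every store-2 row.
naive = conn.execute("""
SELECT COUNT(s1.product_code), COUNT(s2.product_code)
FROM store1_df s1
JOIN store2_df s2
  ON s1.customerid = s2.customerid
 AND s1.product_code = s2.product_code
""").fetchone()

# Aggregating each store first leaves one row per customer and product,
# so the join can no longer multiply the counts.
pre = conn.execute("""
SELECT a.n, b.n
FROM (SELECT customerid, product_code, COUNT(*) AS n
      FROM store1_df GROUP BY customerid, product_code) a
JOIN (SELECT customerid, product_code, COUNT(*) AS n
      FROM store2_df GROUP BY customerid, product_code) b
  ON a.customerid = b.customerid
 AND a.product_code = b.product_code
""").fetchone()

print("naive join:", naive)
print("pre-aggregated:", pre)
```

The naive join reports 15 for both stores, while the pre-aggregated version reports 3 and 5.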

Grouping and Summing Totals in a Joined Table

I have two tables, MEDICATION and INVENTORY. I'm trying to SELECT all the details below from both tables, but the INVENTORY table has multiple rows per medication ID, each with a different BRANCH_NO (the primary key of INVENTORY is the composite key BRANCH_NO, MEDICATION_ID).
I need to total up the stock for each MEDICATION_ID and also join the tables in one SELECT command, displaying all the information for each med (there are 5) with the total for each med at the end of its row. But I'm getting all muddled trying GROUP BY and SUM, and at one point PARTITION. Help please, I'm new to this.
Below is the latest non-working version, but it doesn't display
Medication Name
Medication Desc
Manufacturer
Pack Size
as I hoped it might.
SELECT I.MEDICATION_ID,
SUM(I.STOCK_LEVEL)
FROM INVENTORY I
INNER JOIN (SELECT MEDICATION_NAME, SUBSTR(MEDICATION_DESC,1,20) "Medication Description",
MANUFACTURER, PACK_SIZE FROM MEDICATION) M ON MEDICATION_ID=I.MEDICATION_ID
GROUP BY I.MEDICATION_ID;
For the data imagine I want this sort of output:
MEDICATION_ID MEDICATION_NAME STOCK_LEVEL OtherColumns.....
1 Alpha 10
2 Bravo 20
3 Charlie 20
1 Alpha 30
4 Delta 10
5 Echo 20
5 Echo 40
2 Bravo 10
grouping and totalling into this:
MEDICATION_ID MEDICATION_NAME STOCK_LEVEL OtherColumns.....
1 Alpha 40
2 Bravo 30
3 Charlie 20
4 Delta 10
5 Echo 60
I can get this when it's just one table, but when I'm trying to join tables and also SELECT other columns, it's just not working.
Thanks in advance guys. I appreciate it may be a simple solution, but it will be a big help.
You need to list every non-aggregated column explicitly in both the SELECT and GROUP BY lists (by the way, there is no need for a nested query, and if you do keep it, note that the MEDICATION_ID column is missing from it):
SELECT I.MEDICATION_ID, M.MEDICATION_NAME, SUM(I.STOCK_LEVEL) AS STOCK_LEVEL,
SUBSTR(M.MEDICATION_DESC,1,20) "Medication Description", M.MANUFACTURER, M.PACK_SIZE
FROM INVENTORY I
JOIN MEDICATION M ON M.MEDICATION_ID = I.MEDICATION_ID
GROUP BY I.MEDICATION_ID, M.MEDICATION_NAME, SUBSTR(M.MEDICATION_DESC,1,20),
M.MANUFACTURER, M.PACK_SIZE;
This way, you'll be able to return all the listed columns.
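To see the rule in action, here is a minimal Python sketch using the standard sqlite3 module with the stock levels from the question. The branch numbers and manufacturer names are made up for illustration, and SQLite stands in for the asker's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MEDICATION (
    MEDICATION_ID INTEGER PRIMARY KEY,
    MEDICATION_NAME TEXT,
    MANUFACTURER TEXT            -- hypothetical values below
);
CREATE TABLE INVENTORY (
    BRANCH_NO INTEGER,           -- hypothetical branch numbers below
    MEDICATION_ID INTEGER REFERENCES MEDICATION(MEDICATION_ID),
    STOCK_LEVEL INTEGER,
    PRIMARY KEY (BRANCH_NO, MEDICATION_ID)
);
INSERT INTO MEDICATION VALUES
    (1, 'Alpha', 'Maker A'), (2, 'Bravo', 'Maker B'), (3, 'Charlie', 'Maker C'),
    (4, 'Delta', 'Maker D'), (5, 'Echo', 'Maker E');
INSERT INTO INVENTORY VALUES
    (1, 1, 10), (1, 2, 20), (1, 3, 20), (2, 1, 30),
    (1, 4, 10), (1, 5, 20), (2, 5, 40), (2, 2, 10);
""")

# Every non-aggregated column in SELECT also appears in GROUP BY.
rows = conn.execute("""
SELECT I.MEDICATION_ID, M.MEDICATION_NAME, M.MANUFACTURER,
       SUM(I.STOCK_LEVEL) AS STOCK_LEVEL
FROM INVENTORY I
JOIN MEDICATION M ON M.MEDICATION_ID = I.MEDICATION_ID
GROUP BY I.MEDICATION_ID, M.MEDICATION_NAME, M.MANUFACTURER
ORDER BY I.MEDICATION_ID
""").fetchall()

for row in rows:
    print(row)
```

The totals come out as 40, 30, 20, 10 and 60 for medications 1 to 5, as in the desired output.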

Total Sum SQL Server

I have a query that collects many different columns, and I want to include a column that sums the price of every component in an order. Right now, I already have a column that simply shows the price of every component of an order, but I am not sure how to create this new column.
I would think that the code would go something like this, but I am not really clear on what an aggregate function is or why I get an error regarding the aggregate function when I try to run this code.
SELECT ID, Location, Price, (SUM(PriceDescription) FROM table GROUP BY ID WHERE PriceDescription LIKE 'Cost.%' AS Summary)
FROM table
When I say each component, I mean that every ID I have has many different items that make up the general price. I only want to find out how much money I spend on the supplies I need for my pressure washers, which is why I wrote `WHERE PriceDescription LIKE 'Cost.%'`.
To further explain, I have receipts for every customer I've worked with, and in these receipts I write down my cost for the soap that I use and the tools for the pressure washer that I rent. I label all of these with 'Cost.' so it looks like (Cost.Water), (Cost.Soap), (Cost.Gas), (Cost.Tools). I would like a column that sums all the Cost._ prices for Order 1, and likewise sums all the Cost._ prices for Order 2. I should also mention that each order does not have the same number of costs (sometimes when I use my power washer I might not have to buy gas, and occasionally soap).
I hope this makes sense, if not please let me know how I can explain further.
ID  Location  Price  PriceDescription
1   Park      10     Cost.Water
1   Park      8      Cost.Gas
1   Park      11     Cost.Soap
2   Tom       20     Cost.Water
2   Tom       6      Cost.Soap
3   Matt      15     Cost.Tools
3   Matt      15     Cost.Gas
3   Matt      21     Cost.Tools
4   College   32     Cost.Gas
4   College   22     Cost.Water
4   College   11     Cost.Tools
I would like for my query to create a column like such
ID  Location  Price  Summary
1   Park      10     29
1   Park      8
1   Park      11
2   Tom       20     26
2   Tom       6
3   Matt      15     51
3   Matt      15
3   Matt      21
4   College   32     65
4   College   22
4   College   11
But if the 'Summary' was printed on every line instead of just at the top one, that would be okay too.
You just need SUM(Price) OVER (PARTITION BY Location), which gives the total per location (each order ID has exactly one location in your sample data), as below:
SELECT ID, Location, Price, SUM(Price) over(Partition by Location) AS Summed_Price
FROM yourtable
WHERE PriceDescription LIKE 'Cost.%'
First, if your Price column really contains values that match 'Cost.%', then you cannot apply SUM() to it. SUM() expects a number (e.g. INT, FLOAT, REAL or DECIMAL). If it is text, then you need to explicitly convert it to a number with CAST or CONVERT inside the SUM() call.
Second, your query syntax is wrong: you need GROUP BY, and the SELECT fields are not specified correctly. And you want to SUM() the Price field, not the PriceDescription field (which you can't even sum, as I explained).
Assuming that Price is numeric (see my first remark), then this is how it can be done:
SELECT ID
, Location
, Price
, (SELECT SUM(Price)
FROM table
WHERE ID = T1.ID AND Location = T1.Location
) AS Summed_Price
FROM table AS T1
To get the exact result posted in the question (the total shown only on the first row of each group):
SELECT
    T.ID,
    T.Location,
    T.Price,
    CASE WHEN R = 1 THEN RN ELSE NULL END AS Summary
FROM (
    SELECT
        ID,
        Location,
        Price,
        SUM(Price) OVER (PARTITION BY Location) AS RN,
        ROW_NUMBER() OVER (PARTITION BY Location ORDER BY ID) AS R
    FROM Table
) T
ORDER BY T.ID
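Both variants (the total on every row, and the total on the first row only) can be tried out with the sample data. This is a minimal Python sketch using the standard sqlite3 module rather than SQL Server; it assumes an SQLite build with window functions (3.25 or newer). The table name orders is a placeholder, and partitioning by ID instead of Location is a choice made here because the question asks for a total per order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE orders (ID INTEGER, Location TEXT, Price INTEGER, PriceDescription TEXT)
""")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", [
    (1, "Park", 10, "Cost.Water"), (1, "Park", 8, "Cost.Gas"),
    (1, "Park", 11, "Cost.Soap"),
    (2, "Tom", 20, "Cost.Water"), (2, "Tom", 6, "Cost.Soap"),
    (3, "Matt", 15, "Cost.Tools"), (3, "Matt", 15, "Cost.Gas"),
    (3, "Matt", 21, "Cost.Tools"),
    (4, "College", 32, "Cost.Gas"), (4, "College", 22, "Cost.Water"),
    (4, "College", 11, "Cost.Tools"),
])

rows = conn.execute("""
SELECT ID, Location, Price,
       -- total repeated on every row of the order
       SUM(Price) OVER (PARTITION BY ID) AS Summed_Price,
       -- total shown only on the first row of the order
       CASE WHEN ROW_NUMBER() OVER (PARTITION BY ID ORDER BY rowid) = 1
            THEN SUM(Price) OVER (PARTITION BY ID)
       END AS Summary
FROM orders
WHERE PriceDescription LIKE 'Cost.%'
ORDER BY ID, rowid
""").fetchall()

for row in rows:
    print(row)
```

Ordering by rowid keeps the rows of each order in insertion order, so the "first row" is deterministic; in SQL Server you would need some other column to order by.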

Postgresql: Query to know which fraction of the values are larger/smaller

I would like to query my database to know which fraction/percentage of the elements of a table are larger/smaller than a given value.
For instance, let's say I have a table shopping_list with the following schema:
id integer
name text
price double precision
with contents:
id name price
1 banana 1
2 book 20
3 chicken 5
4 chocolate 3
I am now going to buy a new item with price 4, and I would like to know where this new item will be ranked in the shopping list. In this case the element will be greater than 50% of the elements.
I know I can run two queries and count the number of elements, e.g.:
-- returns = 4
SELECT COUNT(*)
FROM shopping_list;
-- returns = 2
SELECT COUNT(*)
FROM shopping_list
WHERE price > 4;
But I would like to do it with a single query to avoid post-processing the results.
If you just want them in a single query, use UNION:
SELECT COUNT(*), 'total'
FROM shopping_list
UNION
SELECT COUNT(*),'greater'
FROM shopping_list
WHERE price > 4;
The simplest way is to use avg():
SELECT AVG( (price > 4)::float)
FROM shopping_list;
One way to get both results is as follows:
select count(*) as total,
(select count(*) from shopping_list where price > 4) as greater
from shopping_list
It will get both results in a single row, with the names you specified. It does, however, involve a query within a query.
I found the aggregate function PERCENT_RANK which does exactly what I wanted:
SELECT PERCENT_RANK(4) WITHIN GROUP (ORDER BY price)
FROM shopping_list;
-- returns 0.5
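For comparison, the counting and averaging approaches can be combined into one pass over the table. This is a minimal Python sketch using the standard sqlite3 module, so it runs on SQLite rather than PostgreSQL; the CASE expression is used instead of the (price > 4)::float cast because it is portable across engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shopping_list (id INTEGER, name TEXT, price REAL)")
conn.executemany("INSERT INTO shopping_list VALUES (?, ?, ?)", [
    (1, "banana", 1), (2, "book", 20), (3, "chicken", 5), (4, "chocolate", 3),
])

# One scan: total rows, rows above the threshold, and their fraction.
total, greater, fraction = conn.execute("""
SELECT COUNT(*),
       SUM(CASE WHEN price > :p THEN 1 ELSE 0 END),
       AVG(CASE WHEN price > :p THEN 1.0 ELSE 0.0 END)
FROM shopping_list
""", {"p": 4}).fetchone()

print(total, greater, fraction)
```

With the sample data this gives 4 rows in total, 2 of them more expensive than 4, and a fraction of 0.5.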

Access SQL: How to specify which record to return based on the "more important" condition?

I have 2 tables (MS ACCESS):
Table "Orders"
OrderID Product Product_Group Client Client_Group Revenue
1 Cars Vehicles Men People 10 000
2 Houses NC_Assets Women People 15 000
3 Houses NC_Assets Partnersh Companies 12 000
4 Cars Vehicles Corps Companies 3 000
Table "Grouping"
Product  Product_Group  Client  Client_Group  Tax rate
Cars     -              -       Companies     Taxable 30%
-        Vehicles       -       Companies     Taxable 15%
Houses   -              -       People        Taxable 13%
Houses   -              Women   -             Taxable 15%
I want to join these tables to see which orders will fall into which taxable group. As you can see some products/clients are mapped differently than their groups -> if that is the case, the query should return only one record for this pair and exclude any pairing containing their groups. In pseudo-code:
If there's a product - client grouping, return this record, else
If there's a product - client group grouping, return this record, else
If there's a product group - client grouping, return this record, else
If there's a product group - client group grouping, return this record
End if * 4
In that order.
Now my query (pseudo):
SELECT [Orders].*, [Grouping].* FROM [Orders] LEFT JOIN [Grouping] ON
(([Orders].Product = [Grouping].Product OR [Orders].Product_Group = [Grouping].Product_Group) AND
([Orders].Client = [Grouping].Client OR [Orders].Client_Group = [Grouping].Client_Group))
Returns both Cars-Companies and Vehicles-Companies. I'm out of ideas how to set it up to get only the most granular records from each combination. UNION? NOT EXISTS?
Any help appreciated.
I want to join these tables to see how many orders qualify as good,
mediocre etc.
Sounds like you want counts of the particular conditions...Assuming you have a SUM and CASE (I haven't written queries for MS Access in about 10 years...), here's some pseudo-code that should get you started:
SELECT SUM(CASE WHEN {mediocre-conditions} THEN 1 ELSE 0 END) AS MediocreCount,
SUM(CASE WHEN {good-conditions} THEN 1 ELSE 0 END) AS GoodCount,
SUM(CASE WHEN {great-conditions} THEN 1 ELSE 0 END) AS GreatCount
FROM [Orders] LEFT JOIN [Grouping] ON (([Orders].Product = [Grouping].Product OR [Orders].Product_Group = [Grouping].Product_Group) AND ([Orders].Client = [Grouping].Client OR [Orders].Client_Group = [Grouping].Client_Group))
[update] I don't like giving bad answers, so did a quick look...based on this link: Does MS Access support "CASE WHEN" clause if connect with ODBC?, it appears you may be able to do:
SELECT SUM(IIF({mediocre-conditions},1,0)) AS MediocreCount,
SUM(IIF({good-conditions},1,0)) AS GoodCount,
SUM(IIF({great-conditions},1,0)) AS GreatCount