Select value based on computation of column value and aggregate - sql

I'm trying to build a Grafana dashboard to understand which SQL queries are processed by my PostgreSQL server. I'm using the pg_stat_statements extension.
This is the query I currently have:
SELECT
query,
calls
FROM pg_stat_statements
ORDER BY calls DESC LIMIT 3;
Which gets me these results:
query | calls
---------+--------
Query 1 | 500000
Query 2 | 250000
Query 3 | 250000
Now I'd like to select one more value alongside calls: the share of each row's calls relative to sum(calls) over all rows. This is the expected output:
query | calls | share
---------+--------+------ # 1 000 000 total calls
Query 1 | 500000 | 0.5 # 500 000 / 1 000 000
Query 2 | 250000 | 0.25 # 250 000 / 1 000 000
Query 3 | 250000 | 0.25 # 250 000 / 1 000 000
Is it possible to do that and if yes, how can I rewrite my query to get this output?

WITH sum_query AS MATERIALIZED
(select sum(calls) as call_sum from pg_stat_statements)
select
ps.query,
sum(ps.calls),
avg(round((ps.total_time/ps.calls)::numeric,2)) as mean_time,
sum(ps.calls) / (select call_sum from sum_query) as "share"
from pg_stat_statements ps
group by ps.query
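In this query, I use WITH ... AS MATERIALIZED (available in PostgreSQL 12 and later) so that the total is computed separately rather than inlined into the outer query. Note that on PostgreSQL 13 and later the total_time column is named total_exec_time.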
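For anyone who wants to try the share calculation without a PostgreSQL server at hand, here is a minimal sketch using Python's built-in sqlite3 module. The `stats` table is a hypothetical stand-in for pg_stat_statements, filled with the sample rows from the question. The `* 1.0` is only there because SQLite would otherwise do integer division.

```python
import sqlite3

# Hypothetical stand-in for pg_stat_statements, using the question's sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stats (query TEXT, calls INTEGER)")
conn.executemany(
    "INSERT INTO stats VALUES (?, ?)",
    [("Query 1", 500000), ("Query 2", 250000), ("Query 3", 250000)],
)

# Divide each row's calls by the total over ALL rows (scalar subquery),
# then keep the three most-called queries.
rows = conn.execute(
    """
    SELECT query,
           calls,
           calls * 1.0 / (SELECT SUM(calls) FROM stats) AS share
    FROM stats
    ORDER BY calls DESC, query
    LIMIT 3
    """
).fetchall()

for row in rows:
    print(row)
```

The total is taken over the whole table, not just the three rows kept by LIMIT, which matches the "sum(calls) on all rows" wording in the question.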

Related

KQL summarize by count and then filter

The goal of my query is to see if at any given minute we have more than 500 logs.
I have this line at the end | summarize count() by bin(env_time, 1m), but now I want to know if I can add filtering beyond that to only see rows with more than 500 results. Something along the lines of:
| totals = summarize count() by bin(env_time, 1m)
| where totals>500
Is there a way to do this correctly in KQL?
TIA
let t = materialize(range i from 1 to 9700 step 1 | extend env_time = ago(20m * rand()));
t
| summarize count() by bin(env_time, 1m)
| where count_ > 500
env_time | count_
---------------------+-------
2023-01-08T09:54:00Z | 531
2023-01-08T09:56:00Z | 501
2023-01-08T09:57:00Z | 501
2023-01-08T10:00:00Z | 510
2023-01-08T10:03:00Z | 502
Or, with an alias for count():
let t = materialize(range i from 1 to 9700 step 1 | extend env_time = ago(20m * rand()));
t
| summarize rows_per_minute = count() by bin(env_time, 1m)
| where rows_per_minute > 500
env_time | rows_per_minute
---------------------+----------------
2023-01-08T09:51:00Z | 539
2023-01-08T09:57:00Z | 501
2023-01-08T10:02:00Z | 516
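The same "aggregate first, then filter on the aggregate" idea can be sketched outside KQL. This is only an illustration in Python: `busy_minutes` and the sample events are hypothetical names and data, not part of the original query.

```python
from collections import Counter
from datetime import datetime


def busy_minutes(timestamps, threshold):
    """Count events per one-minute bin, then keep bins above the threshold."""
    # Equivalent of: summarize count() by bin(env_time, 1m)
    per_minute = Counter(ts.replace(second=0, microsecond=0) for ts in timestamps)
    # Equivalent of: where count_ > threshold
    return {minute: n for minute, n in sorted(per_minute.items()) if n > threshold}


# Hypothetical sample: three events in 09:54 and one event in 09:55.
events = [
    datetime(2023, 1, 8, 9, 54, 5),
    datetime(2023, 1, 8, 9, 54, 20),
    datetime(2023, 1, 8, 9, 54, 59),
    datetime(2023, 1, 8, 9, 55, 1),
]
result = busy_minutes(events, threshold=2)
print(result)
```

The filter runs on the output of the aggregation, which is why the KQL version has to refer to the aggregated column (count_ or its alias) and not to the raw rows.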

I need the top x most recent (by SALEDT) rows grouped by neighborhood (NBHD)

I'm using Microsoft Access and I need a SQL query to return the top x (40 in my case) most recent sales for each neighborhood (NBHD). My data looks something like this:
PARID PRICE SALEDT SALEVAL NBHD
04021000 140000 1/29/2016 11 700
04021000 160000 2/16/2016 11 700
04018470 250000 4/23/2015 08 701
04018470 300000 4/23/2015 08 701
04016180 40000 5/9/2017 11 705
04023430 600000 6/12/2017 19 700
What I need is the top 40 most recent SALEDT entries for each NBHD, and if the same PARID would show up in that top 40 twice or more, I only want the most recent one. If the rows have the same PARID and the same SALEDT, I need only the most expensive one. For this small set of sample data, I would get:
PARID PRICE SALEDT SALEVAL NBHD
04021000 160000 2/16/2016 11 700
04023430 600000 6/12/2017 19 700
04018470 300000 4/23/2015 08 701
04016180 40000 5/9/2017 11 705
I get row 2 (as it has a later SALEDT than row 1), row 4 (as it has a higher PRICE than row 3), row 5, and row 6. Hopefully that is clear. Also, I'm using MS Access SQL to do this, but wouldn't be opposed to a VBA solution if that is easier. Thanks in advance.
Here you go:
select a.parid, max(a.price) as price, a.saledt, a.saleval, a.nbhd
from #table a join (
select parid, max(saledt) as saledt from #table
group by parid ) b on a.parid=b.parid and a.saledt=b.saledt
group by a.parid, a.saledt, a.saleval, a.nbhd
order by a.nbhd
In MS Access, you can do the following to get the 40 most recent entries for each neighborhood:
select t.*
from t
where t.saledt in (select top 40 t2.saledt
from t as t2
where t2.nbhd = t.nbhd
order by t2.saledt desc
);
Your additional constraints are rather confusing. I'm not sure I fully follow them because I don't know what the columns really refer to.
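Since the question is open to non-SQL solutions, here is one possible reading of the full requirement as a small Python sketch. It assumes that duplicates are removed per PARID first (latest SALEDT wins, ties go to the higher PRICE) and that the top N per NBHD is taken afterwards, and that a PARID belongs to a single neighborhood. `recent_sales` is a hypothetical helper name.

```python
from collections import defaultdict
from datetime import date


def recent_sales(rows, top_n):
    """Dedupe per PARID, then keep the top_n most recent sales per NBHD."""
    # Step 1: per PARID keep the latest sale; on equal dates keep the higher price.
    best = {}
    for row in rows:
        current = best.get(row["PARID"])
        if current is None or (row["SALEDT"], row["PRICE"]) > (
            current["SALEDT"],
            current["PRICE"],
        ):
            best[row["PARID"]] = row

    # Step 2: group the survivors by neighborhood.
    by_nbhd = defaultdict(list)
    for row in best.values():
        by_nbhd[row["NBHD"]].append(row)

    # Step 3: within each neighborhood, most recent first, cut at top_n.
    result = []
    for nbhd in sorted(by_nbhd):
        ranked = sorted(by_nbhd[nbhd], key=lambda r: r["SALEDT"], reverse=True)
        result.extend(ranked[:top_n])
    return result


# Sample data from the question.
sales = [
    {"PARID": "04021000", "PRICE": 140000, "SALEDT": date(2016, 1, 29), "SALEVAL": "11", "NBHD": 700},
    {"PARID": "04021000", "PRICE": 160000, "SALEDT": date(2016, 2, 16), "SALEVAL": "11", "NBHD": 700},
    {"PARID": "04018470", "PRICE": 250000, "SALEDT": date(2015, 4, 23), "SALEVAL": "08", "NBHD": 701},
    {"PARID": "04018470", "PRICE": 300000, "SALEDT": date(2015, 4, 23), "SALEVAL": "08", "NBHD": 701},
    {"PARID": "04016180", "PRICE": 40000, "SALEDT": date(2017, 5, 9), "SALEVAL": "11", "NBHD": 705},
    {"PARID": "04023430", "PRICE": 600000, "SALEDT": date(2017, 6, 12), "SALEVAL": "19", "NBHD": 700},
]
top = recent_sales(sales, top_n=40)
for row in top:
    print(row["PARID"], row["PRICE"], row["SALEDT"], row["NBHD"])
```

On the sample data this keeps the same four rows as the expected output in the question, with each neighborhood listed most recent sale first.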

sum revenue based on criteria from another table Powerpivot

I have a model with a Revenue table that has a revenue2016 column,
and another table, Programs, where I have
program | min
I would like to add a calculated column to the Programs table that sums the revenue greater than or equal to min, like so:
=CALCULATE(SUM(Revenue[revenue2016]),Revenue[revenue2016]>=Programs[min])
This gave me an error.
The data should look like this
#Revenue
Revenue
10
10
10
10
10
100
100
100
100
100
1000
1000
1000
1000
1000
#Programs
program | min | summed rev
a | 10 | 5550
b | 100 | 5500
c | 1000 | 5000
Just after I posted this I found the answer. I'll share it in case someone else comes across the same issue:
=calculate(sum(Revenue[revenue2016]),filter(Revenue,Revenue[revenue2016]>=Programs[Min]))
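As a sanity check on the expected numbers, the logic of the formula can be sketched in plain Python: for each Programs row, go over every Revenue row and sum the values that are at least that program's min. The variable names here are hypothetical; the data is the sample from the question.

```python
# Revenue[revenue2016]: five rows each of 10, 100 and 1000.
revenue = [10] * 5 + [100] * 5 + [1000] * 5

# Programs table as program -> min.
programs = {"a": 10, "b": 100, "c": 1000}

# For each program, sum the revenue rows that meet its minimum,
# mirroring FILTER(Revenue, Revenue[revenue2016] >= Programs[Min]).
summed_rev = {
    name: sum(value for value in revenue if value >= minimum)
    for name, minimum in programs.items()
}
print(summed_rev)
```

This reproduces the "summed rev" column from the question (5550, 5500, 5000).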

Filter SQL query results by aggregate

I need a query that shows the JobIDs where the Worker has not been paid BUT where the Company has been paid. Below are the table columns and sample data:
tblInvoices columns:
-------------------
JobID
InvoiceID
WorkerPaidAmountTotal
CompanyPaidAmountTotal
Sample data
-----------
JobID | InvoiceID | WorkerPaidAmountTotal | CompanyPaidAmountTotal
1 30 100 150
1 31 0 100
2 32 0 75
3 33 25 50
3 34 10 30
4 35 0 0
I know how to get the SUM of the amounts paid to either a Worker or the Company. The results look like this:
JobID Worker Company
1 100 250
2 0 75
3 35 80
4 0 0
But what I need are the results of just the JobIDs where the Worker has got 0 and the company >0. The results I want should be this, but I can't figure out the query to do so:
JobID Worker Company
2 0 75
Use a HAVING clause to filter the groups. Try this:
SELECT jobid,
Worker=Sum(WorkerPaidAmountTotal),
Company=Sum(CompanyPaidAmountTotal)
FROM tablename
GROUP BY jobid
HAVING Sum(WorkerPaidAmountTotal) = 0
AND Sum(CompanyPaidAmountTotal) > 0
select JobID, WorkerPaidAmountTotal, CompanyPaidAmountTotal from tblInvoices where WorkerPaidAmountTotal = 0 and CompanyPaidAmountTotal > 0
This seems too plain to be the answer, so maybe I didn't understand the question.
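To see the difference between filtering groups with HAVING and filtering individual rows with WHERE, here is a small sketch that loads the question's sample data into an in-memory SQLite database via Python. SQLite is only a stand-in for whatever database the asker uses, so the alias syntax is written as `AS Worker` instead of `Worker=`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblInvoices ("
    "JobID INTEGER, InvoiceID INTEGER, "
    "WorkerPaidAmountTotal INTEGER, CompanyPaidAmountTotal INTEGER)"
)
# Sample data from the question.
conn.executemany(
    "INSERT INTO tblInvoices VALUES (?, ?, ?, ?)",
    [(1, 30, 100, 150), (1, 31, 0, 100), (2, 32, 0, 75),
     (3, 33, 25, 50), (3, 34, 10, 30), (4, 35, 0, 0)],
)

# Filter on the per-job totals: HAVING runs after GROUP BY.
grouped = conn.execute(
    """
    SELECT JobID,
           SUM(WorkerPaidAmountTotal) AS Worker,
           SUM(CompanyPaidAmountTotal) AS Company
    FROM tblInvoices
    GROUP BY JobID
    HAVING SUM(WorkerPaidAmountTotal) = 0
       AND SUM(CompanyPaidAmountTotal) > 0
    """
).fetchall()

# Filter on individual invoices: WHERE runs before any grouping.
row_level = conn.execute(
    """
    SELECT DISTINCT JobID
    FROM tblInvoices
    WHERE WorkerPaidAmountTotal = 0 AND CompanyPaidAmountTotal > 0
    ORDER BY JobID
    """
).fetchall()

print(grouped)
print(row_level)
```

On this data the row-level filter also picks up JobID 1, because invoice 31 has a worker amount of 0 even though the job's worker total is 100. Only the HAVING version returns JobID 2 alone.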

SQL: how to separate combined row into individual rows

I have a database table like this:
id | check_number | amount
1 | 1001]1002]1003 | 200]300]100
2 | 2001]2002 | 500]1000
3 | 3002]3004]3005]3007 | 100]300]600]200
I want to separate the records into something like this:
id | check_number | amount
1 | 1001 | 200
2 | 1002 | 300
3 | 1003 | 100
. | . | .
. | . | .
. | . | .
How do I do this just using SQL in Oracle and SQL Server?
Thanks,
Milo
In Oracle only, you can use the CONNECT BY LEVEL method, with several caveats:
select rownum, id,
substr(']'||check_number||']'
,instr(']'||check_number||']',']',1,level)+1
,instr(']'||check_number||']',']',1,level+1)
- instr(']'||check_number||']',']',1,level) - 1) C1VALUE,
substr(']'||amount||']'
,instr(']'||amount||']',']',1,level)+1
,instr(']'||amount||']',']',1,level+1)
- instr(']'||amount||']',']',1,level) - 1) C2VALUE
from table
connect by id = prior id and prior dbms_random.value is not null
and level <= length(check_number) - length(replace(check_number,']')) + 1
ROWNUM ID C1VALUE C2VALUE
1 1 1001 200
2 1 1002 300
3 1 1003 100
4 2 2001 500
5 2 2002 1000
6 3 3002 100
7 3 3004 300
8 3 3005 600
9 3 3007 200
Essentially, we expand each row using Oracle's hierarchical query features and then extract the substring for each element inside the check_number and amount columns.
Major caveat: both columns must contain the same number of elements in a given row, since we use the first column to count the number of items to be split out.
I have tested this on 11gR2; YMMV depending on the DBMS version. Note the need for the PRIOR operator, which prevents Oracle from going into an infinite CONNECT BY loop.
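If doing the split outside the database is an option, the same pairing logic is short to write. This is a minimal Python sketch, not a replacement for the SQL above; `split_rows` is a hypothetical helper, and it raises an error when the two columns disagree on the number of elements, which is the caveat mentioned earlier.

```python
def split_rows(rows, sep="]"):
    """Expand each (id, check_number, amount) row into one row per element."""
    out = []
    for row_id, check_numbers, amounts in rows:
        checks = check_numbers.split(sep)
        amts = amounts.split(sep)
        # Both columns must hold the same number of elements.
        if len(checks) != len(amts):
            raise ValueError(f"row {row_id}: element counts differ")
        # Pair the n-th check number with the n-th amount.
        out.extend((row_id, check, int(amt)) for check, amt in zip(checks, amts))
    return out


# Sample data from the question.
table = [
    (1, "1001]1002]1003", "200]300]100"),
    (2, "2001]2002", "500]1000"),
    (3, "3002]3004]3005]3007", "100]300]600]200"),
]
expanded = split_rows(table)
for line in expanded:
    print(line)
```

The result keeps the original id next to each pair, like the ID column in the Oracle output above.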