SQL: Avoid count multiplication on inner joins that return several rows

OK, not the best title, but I could not explain it better.
I have a SQL query with a line like this:
count(PRStatusChangesLog.EffectiveMinutes) as timeInHandoverExternal
It works so far, but I also want to add something like this:
COUNT (distinct a.ActionId) as 'Number Of Actions',
which requires this join:
INNER JOIN PRAction a on a.PrId = PRHeader.prid
Now the problem, which I am sure some of you have already spotted: the previous count is now multiplied by the number of actions.
I can see why this happens (each status-log row is repeated once per matching action row), but I am not sure how best to get both the number of actions and the correct count without the multiplier.
Simplified full query:
SELECT
PRHeader.PrId,
COUNT (distinct a.ActionId) AS 'Number Of Actions',
COUNT (PRStatusChangesLog.EffectiveMinutes) AS timeInHandoverExternal
FROM
PRHeader
LEFT JOIN
PRStatusChangesLog ON PRStatusChangesLog.PrId = PRHeader.PrId
AND PRStatusChangesLog.StatusId = 4100
INNER JOIN
PRAction a ON a.PrId = PRHeader.prid
WHERE
DATEDIFF(mm, prheader.ClosedDate, getdate()) = 1
AND (PRHeader.siteId = 74)
AND prheader.PRTypeId IN (17, 19)
AND PRHeader.tmpStatusId <> 6010
GROUP BY
PRHeader.PrId

You can count a unique column with DISTINCT, for example COUNT(DISTINCT PRStatusChangesLog.id), so that rows duplicated by the join are counted only once.
If that is not possible, use a subquery to count the actions and drop the join to PRAction. In the SELECT clause you would write something like: (SELECT COUNT(DISTINCT a.ActionId) FROM PRAction a WHERE a.PrId = PRHeader.PrId) AS action_count
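To make the multiplication and both suggested fixes concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The table layouts, the Id column on PRStatusChangesLog, and the sample rows are assumptions made up for illustration; only the table and column names used in the question are taken from it.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PRHeader (PrId INTEGER PRIMARY KEY);
CREATE TABLE PRStatusChangesLog (
    Id INTEGER PRIMARY KEY,   -- hypothetical unique key
    PrId INTEGER, StatusId INTEGER, EffectiveMinutes INTEGER);
CREATE TABLE PRAction (ActionId INTEGER PRIMARY KEY, PrId INTEGER);
INSERT INTO PRHeader VALUES (1);
INSERT INTO PRStatusChangesLog VALUES (10, 1, 4100, 5), (11, 1, 4100, 7);
INSERT INTO PRAction VALUES (100, 1), (101, 1), (102, 1);
""")

# Naive version: 2 log rows x 3 actions = 6 joined rows, so the plain
# COUNT over the log column is inflated.
naive = conn.execute("""
SELECT h.PrId,
       COUNT(DISTINCT a.ActionId) AS actions,
       COUNT(s.EffectiveMinutes)  AS log_rows
FROM PRHeader h
LEFT JOIN PRStatusChangesLog s ON s.PrId = h.PrId AND s.StatusId = 4100
INNER JOIN PRAction a ON a.PrId = h.PrId
GROUP BY h.PrId
""").fetchall()

# Fix 1: count a unique column of the log table with DISTINCT.
distinct_fix = conn.execute("""
SELECT h.PrId,
       COUNT(DISTINCT a.ActionId) AS actions,
       COUNT(DISTINCT s.Id)       AS log_rows
FROM PRHeader h
LEFT JOIN PRStatusChangesLog s ON s.PrId = h.PrId AND s.StatusId = 4100
INNER JOIN PRAction a ON a.PrId = h.PrId
GROUP BY h.PrId
""").fetchall()

# Fix 2: drop the join to PRAction and count actions in a correlated subquery.
subquery_fix = conn.execute("""
SELECT h.PrId,
       (SELECT COUNT(*) FROM PRAction a WHERE a.PrId = h.PrId) AS actions,
       COUNT(s.EffectiveMinutes) AS log_rows
FROM PRHeader h
LEFT JOIN PRStatusChangesLog s ON s.PrId = h.PrId AND s.StatusId = 4100
GROUP BY h.PrId
""").fetchall()

print(naive, distinct_fix, subquery_fix)
```

One difference worth keeping in mind: the original INNER JOIN also removes headers that have no actions at all, whereas the correlated subquery keeps them with a count of 0.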

Compute each count separately in its own subquery (derived table) inside the joins, then pick both counts up in the final outer SELECT statement.
SELECT
PRHeader.PrId, Count1 'Number Of Actions', Count2 timeInHandoverExternal
FROM
PRHeader
JOIN
(SELECT PrId, COUNT (ActionId) Count1
FROM PRAction
GROUP BY PrId) A ON A.PrId = PRHeader.prid
LEFT JOIN
(SELECT
COUNT(PRStatusChangesLog.EffectiveMinutes) Count2, PrId, StatusId
FROM
PRStatusChangesLog
WHERE
StatusId = 4100
GROUP BY
PrId, StatusId) B ON B.PrId = PRHeader.PrId
WHERE
DATEDIFF(mm, prheader.ClosedDate, getdate()) = 1
AND (PRHeader.siteId = 74 )
AND prheader.PRTypeId IN (17,19)
AND PRHeader.tmpStatusId <> 6010
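The pre-aggregation idea can be tried out with a small sketch, again with sqlite3 standing in for SQL Server. The sample rows, the second header without status-log rows, and the COALESCE around Count2 are assumptions added for illustration; the filters on dates, site and type from the question are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PRHeader (PrId INTEGER PRIMARY KEY);
CREATE TABLE PRStatusChangesLog (PrId INTEGER, StatusId INTEGER, EffectiveMinutes INTEGER);
CREATE TABLE PRAction (ActionId INTEGER PRIMARY KEY, PrId INTEGER);
INSERT INTO PRHeader VALUES (1), (2);
INSERT INTO PRStatusChangesLog VALUES (1, 4100, 5), (1, 4100, 7), (1, 9999, 3);
INSERT INTO PRAction VALUES (100, 1), (101, 1), (102, 1), (103, 2);
""")

# Each derived table is grouped down to one row per PrId before the join,
# so the join can no longer multiply rows.
rows = conn.execute("""
SELECT h.PrId,
       A.Count1              AS "Number Of Actions",
       COALESCE(B.Count2, 0) AS timeInHandoverExternal
FROM PRHeader h
JOIN (SELECT PrId, COUNT(ActionId) AS Count1
      FROM PRAction
      GROUP BY PrId) A ON A.PrId = h.PrId
LEFT JOIN (SELECT PrId, COUNT(EffectiveMinutes) AS Count2
           FROM PRStatusChangesLog
           WHERE StatusId = 4100
           GROUP BY PrId) B ON B.PrId = h.PrId
ORDER BY h.PrId
""").fetchall()

print(rows)
```

Header 2 has no status-log rows, so the LEFT JOIN yields NULL for Count2; the COALESCE turns that into 0, which matches what COUNT would have returned in the original single-level query.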

Related

SQL Queries Returning Non-equivalent Results and Different Counts Every Run

I have two SQL queries that compute the same quantity: n_orders in the first and orders_count in the second should return exactly the same results. The problem is:
1. The two columns are not equal for all rows.
2. The number of rows that differ changes on every run.
For example, the first run might show 20 rows with different (n_orders, orders_count) values, the second run 56, and so on, with a different count each time.
Query 1:
SELECT product_id,
packing_unit_id,
count(DISTINCT product_sales_order.sales_order_id)
FROM product_sales_order
WHERE product_sales_order.created_at::date BETWEEN '{start}' AND '{end}'
GROUP BY 1, 2
ORDER BY product_id, packing_unit_id
Query 2:
select kpis.*, lr.lr
FROM
(SELECT product_sales_order.product_id,
product_sales_order.packing_unit_id,
count(DISTINCT product_sales_order.sales_order_id) AS orders_count,
count(DISTINCT sales_orders.retailer_id) AS retailers_count,
count(DISTINCT product_sales_order.sales_order_id)*1.0 / count(DISTINCT sales_orders.retailer_id) AS frequency,
(count(DISTINCT sales_orders.retailer_id)*1.0 /(SELECT count(DISTINCT sales_orders.retailer_id) AS month_retailers
FROM sales_orders
JOIN retailers on retailers.id = sales_orders.retailer_id
WHERE sales_orders.created_at::Date BETWEEN '{start}' AND '{end}'
AND sales_orders.sales_order_status_id = 6
AND retailers.is_market_type_private = false)) AS reach,
sum(product_sales_order.total_price) AS nmv,
(sum(product_sales_order.total_price)*1.0 / (SELECT sum(product_sales_order.total_price) AS month_nmv
FROM product_sales_order
WHERE product_sales_order.created_at::Date BETWEEN '{start}' AND '{end}'
AND product_sales_order.purchased_item_count <> 0)) AS contribution,
sum(product_sales_order.purchased_item_count * product_sales_order.basic_unit_count) AS bskt_size,
sum(product_sales_order.total_price)*1.0 / count(DISTINCT product_sales_order.sales_order_id) AS avg_ts,
sum(product_sales_order.total_price)*1.0 / count(DISTINCT sales_orders.retailer_id) AS nmv_p_retailer
FROM product_sales_order
LEFT JOIN sales_orders ON sales_orders.id = product_sales_order.sales_order_id
LEFT JOIN products ON products.id = product_sales_order.product_id
LEFT JOIN retailers on retailers.id = sales_orders.retailer_id
WHERE product_sales_order.created_at::date BETWEEN '{start}' AND '{end}'
GROUP BY 1,2
ORDER BY product_sales_order.product_id, product_sales_order.packing_unit_id, orders_count
) as kpis
LEFT JOIN (
SELECT performance.lost_revenue.product_id,
sum(performance.lost_revenue.lost_revenue) as lr
FROM performance.lost_revenue
WHERE performance.lost_revenue.created_at::Date between '{start}' AND '{end}'
GROUP BY 1
)as lr on lr.product_id = kpis.product_id
What should I change in the structure of the second query so that it yields the same results for orders_count?
Why does the number of differing values change on every run?

Make sum of num_importe by the entidad code sql

My code:
select distinct entidad, sum(cast(num_importe as float))
from envio_remesa
inner join remesa
on envio_remesa.id = remesa.envio_remesa_id
where remesa.envio_remesa_id = 3 and remesa.tipo_doc='201';
For example, I have two different "entidades" (say 18 and 21, but it can be any number), and I want one record with the sum of "num_importe" for "entidad" 18 and another with the sum of "num_importe" for "entidad" 21. How could I do it?
The output I want:
entidad   num_importe
18        92.300,00
21        56.000,20
432       120.000,32
12        12.232,12
You should use GROUP BY (by the way, DISTINCT is useless here):
select entidad, sum(cast(num_importe as float))
from envio_remesa
inner join remesa
on envio_remesa.id = remesa.envio_remesa_id
where remesa.envio_remesa_id = 3 and remesa.tipo_doc='201'
group by entidad;
You can use aggregation:
select entidad, sum(cast(num_importe as float))
from envio_remesa er inner join
remesa r
on er.id = r.envio_remesa_id
where r.envio_remesa_id = 3 and r.tipo_doc = '201'
group by entidad;
Note: You should qualify entidad and num_importe so it is clear what table they come from.
Also, I added table aliases into the query. They make the query easier to write and to read.
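A small runnable sketch of the GROUP BY answer, using Python's sqlite3 module. It assumes, purely for illustration, that entidad and num_importe live in remesa and that num_importe is stored as text (which would explain the cast); the sample amounts are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE envio_remesa (id INTEGER PRIMARY KEY);
-- Assumption: entidad and num_importe are columns of remesa.
CREATE TABLE remesa (envio_remesa_id INTEGER, tipo_doc TEXT,
                     entidad INTEGER, num_importe TEXT);
INSERT INTO envio_remesa VALUES (3);
INSERT INTO remesa VALUES
    (3, '201', 18, '100.50'),
    (3, '201', 18, '200.25'),
    (3, '201', 21, '50.00'),
    (3, '999', 21, '1000.00');   -- filtered out by tipo_doc
""")

# One output row per entidad; columns are qualified with table aliases.
totals = conn.execute("""
SELECT r.entidad, SUM(CAST(r.num_importe AS REAL)) AS total
FROM envio_remesa er
INNER JOIN remesa r ON er.id = r.envio_remesa_id
WHERE r.envio_remesa_id = 3 AND r.tipo_doc = '201'
GROUP BY r.entidad
ORDER BY r.entidad
""").fetchall()

print(totals)
```

Because GROUP BY already produces one row per entidad, adding DISTINCT on top would not change the result.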

How could I join these queries together?

I have 2 queries. One includes a subquery and the other is a pretty basic query. I can't figure out how to combine them to return a table with name, workdate, minutes_inactive, and hoursworked.
Below is the code I have tried. The simple query is lines 1, 2, and the last 5 lines. I also added a join clause (join punchclock p on p.servrepID = l.repid) to it.
Both queries run on their own, so the only problem is combining them.
select
sr.sr_name as liaison, cast(date_created as date) workdate,
(count(date_created) * 4) as minutes_inactive,
(select
sr_name, cast(punchintime as date) as workdate,
round(sum(cast(datediff(minute,punchintime, punchouttime) as real) / 60), 2) as hoursworked,
count(*) as punches
from
(select
sr_name, punchintime = punchdatetime,
punchouttime = isnull((select top 1 pc2.punchdatetime
from punchclock pc2
where pc2.punchdatetime > pc.punchdatetime
and pc.servrepid = pc2.servrepid
and pc2.inout = 0
order by pc2.punchdatetime), getdate())
from
punchclock pc
join
servicereps sr on pc.servrepid = sr.servrepid
where
punchyear >= 2017 and pc.inout = 1
group by
sr_name, cast(punchintime as date)))
from
tbl_liga_popup_log l
join
servicereps sr on sr.servrepID = l.repid
join
punchclock p on p.servrepID = l.repid collate latin1_general_bin
group by
cast(l.date_created as date), sr.sr_name
I get this error:
Msg 102, Level 15, State 1, Line 19
Incorrect syntax near ')'
I keep getting this error, and more errors appear if I adjust that part.
I don't know that we'll fix everything here, but there are a few issues with your query:
1. You have to alias your sub-query (technically a derived table, but whatever).
2. You have two froms in your outer query.
3. You have to join to the derived table.
Here's a crude example:
select
<some stuff>
from
(select col1 from table1) t1
inner join t2
on t1.col1 = t2.col2
The main error here is that you are placing queries in the select list (before the from). You can only do this if the query returns a single value. Otherwise, you have to put the query in parentheses (you have done this) in the from clause, give it an alias, and join it accordingly.
Your group bys also look misplaced: the one in the sub-query sits inside the innermost derived table, not at the level of the sum() and count(*) it is meant to group.
My best bet is that you are looking for the following query:
select
sr.sr_name as liaison
,cast(l.date_created as date) workdate
,count(distinct l.date_created) * 4 as minutes_inactive
,cast(pc.punchdatetime as date) as workdate
,round(sum(cast(datediff(minute,pc.punchdatetime,isnull(pc2_query.punchouttime,getdate())) as real) / 60), 2) as hoursworked
,count(*) as punches
from
punchclock pc
inner join servicereps sr on pc.servrepid = sr.servrepid
-- outer apply keeps punch-ins without a later punch-out, so isnull() can fall back to getdate()
outer apply
(
select top 1 pc2.punchdatetime as punchouttime
from punchclock pc2
where pc2.punchdatetime > pc.punchdatetime
and pc.servrepid = pc2.servrepid
and pc2.inout = 0
order by pc2.punchdatetime
) pc2_query
inner join tbl_liga_popup_log l on sr.servrepID = l.repid
where pc.punchyear >= 2017 and pc.inout = 1
-- every non-aggregated column in the select list has to be grouped
group by sr.sr_name, cast(l.date_created as date), cast(pc.punchdatetime as date)
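The points about derived tables can be illustrated with a reduced sketch in Python's sqlite3 module. It is not the asker's schema: the shifts and popup_log tables, their columns, and the sample rows are hypothetical simplifications (punch-in/punch-out pairing is replaced by a ready-made minutes_worked column), and SQLite stands in for SQL Server. It shows a multi-column subquery being rejected in the select list, and the same data working once each subquery is placed in the from clause, aliased, and joined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE servicereps (servrepid INTEGER PRIMARY KEY, sr_name TEXT);
-- Hypothetical simplified tables, not the asker's real schema.
CREATE TABLE shifts (servrepid INTEGER, workdate TEXT, minutes_worked INTEGER);
CREATE TABLE popup_log (repid INTEGER, workdate TEXT);
INSERT INTO servicereps VALUES (1, 'Ann');
INSERT INTO shifts VALUES (1, '2017-05-01', 240), (1, '2017-05-01', 180);
INSERT INTO popup_log VALUES (1, '2017-05-01'), (1, '2017-05-01'), (1, '2017-05-01');
""")

# A subquery in the select list may only produce a single value;
# one that returns two columns is rejected.
error = None
try:
    conn.execute(
        "SELECT sr_name, (SELECT workdate, minutes_worked FROM shifts) "
        "FROM servicereps"
    )
except sqlite3.OperationalError as exc:
    error = str(exc)

# Each subquery moved to the from clause, given an alias (p, w), and joined.
# Both are aggregated before the join, so neither side multiplies the other.
rows = conn.execute("""
SELECT sr.sr_name AS liaison,
       p.workdate,
       p.popups * 4 AS minutes_inactive,
       ROUND(w.minutes / 60.0, 2) AS hoursworked
FROM servicereps sr
JOIN (SELECT repid, workdate, COUNT(*) AS popups
      FROM popup_log
      GROUP BY repid, workdate) p
  ON p.repid = sr.servrepid
JOIN (SELECT servrepid, workdate, SUM(minutes_worked) AS minutes
      FROM shifts
      GROUP BY servrepid, workdate) w
  ON w.servrepid = sr.servrepid AND w.workdate = p.workdate
""").fetchall()

print(error)
print(rows)
```

Aggregating each side first matters here: joining the raw popup rows to the raw punch rows would repeat every punch once per popup and inflate both the hours and the punch count.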

How to force postgres to return 0 even if there are no rows matching query, using coalesce, group by and join

I've been trying hopelessly to get the following SQL statement to return the query results and default to 0 if there are no rows matching the query.
This is the intended result:
vol | year
-------+------
0 | 2018
Instead I get:
vol | year
-----+------
(0 rows)
Here is the sql statement:
select coalesce(vol,0) as vol, year
from (select sum(vol) as vol, year
from schema.fact_data
join schema.period_data
on schema.fact_data.period_tag = schema.period_data.tag
join schema.product_data
on schema.fact_data.product_tag =
schema.product_data.tag
join schema.market_data
on schema.fact_data.market_tag = schema.market_data.tag
where "retailer"='MadeUpRetailer'
and "product_tag"='FakeProductTag'
and "year"='2018' group by year
) as DerivedTable;
I know the query works because it returns data when there is data. Just doesn't default to 0 as intended...
Any help in finding why this is the case would be much appreciated!
Using your subquery DerivedTable, you could write:
SELECT coalesce(DerivedTable.vol, 0) AS vol,
y.year
FROM (VALUES ('2018'::text)) AS y(year)
LEFT JOIN (SELECT ...) AS DerivedTable
ON DerivedTable.year = y.year;
Remove the GROUP BY (and the outer query):
select 2018 as year, coalesce(sum(vol), 0) as vol
from schema.fact_data f join
schema.period_data p
on f.period_tag = p.tag join
schema.product_data pr
on f.product_tag = pr.tag join
schema.market_data m
on f.market_tag = m.tag
where "retailer" = 'MadeUpRetailer' and
"product_tag" = 'FakeProductTag' and
"year" = '2018';
An aggregation query with no GROUP BY always returns exactly one row, so this should do what you want.
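The one-row guarantee is easy to check with a minimal sketch. It uses Python's sqlite3 module as a stand-in for PostgreSQL and a single hypothetical fact_data table that holds both year and vol, which is an assumption since the question does not say where year lives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumption: year and vol in one table, no rows for 2018.
CREATE TABLE fact_data (year TEXT, vol INTEGER);
INSERT INTO fact_data VALUES ('2017', 40);
""")

# With GROUP BY there is no group for 2018, so no row comes back.
grouped = conn.execute("""
SELECT SUM(vol) AS vol, year
FROM fact_data
WHERE year = '2018'
GROUP BY year
""").fetchall()

# Without GROUP BY the aggregate runs over the (empty) filtered set and
# still returns one row; SUM gives NULL there, which COALESCE turns into 0.
ungrouped = conn.execute("""
SELECT '2018' AS year, COALESCE(SUM(vol), 0) AS vol
FROM fact_data
WHERE year = '2018'
""").fetchall()

print(grouped, ungrouped)
```

Note that the COALESCE has to wrap the SUM inside the aggregate query; wrapping a column of an outer query, as in the question, cannot help when the inner query returns no rows at all.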
EDIT:
The query would look something like this:
select v.yyyy as year, coalesce(sum(vol), 0) as vol
from (values (2018), (2019)) v(yyyy) left join
schema.fact_data f
on f.year = v.yyyy left join -- this is just an example. I have no idea where year is coming from
schema.period_data p
on f.period_tag = p.tag left join
schema.product_data pr
on f.product_tag = pr.tag left join
schema.market_data m
on f.market_tag = m.tag
group by v.yyyy
However, you have to move the where conditions to the appropriate on clauses. I have no idea where the columns are coming from.
From the code you posted it is not clear in which table you have the year column.
You can use UNION to fetch just 1 row in case there are no rows in that table for the year 2018 like this:
select sum(vol) as vol, year
from schema.fact_data inner join schema.period_data
on schema.fact_data.period_tag = schema.period_data.tag
inner join schema.product_data
on schema.fact_data.product_tag = schema.product_data.tag
inner join schema.market_data
on schema.fact_data.market_tag = schema.market_data.tag
where
"retailer"='MadeUpRetailer' and
"product_tag"='FakeProductTag' and
"year"='2018'
group by "year"
union
select 0 as vol, '2018' as year
where not exists (
select 1 from tablename where "year" = '2018'
)
If there are rows for the year 2018, the 2nd query fetches nothing.
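A minimal sketch of the UNION fallback, again with sqlite3 standing in for PostgreSQL and a single hypothetical fact_data table (an assumption, as above). The helper name vol_for and the named parameter :y are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumption: year and vol in one table.
CREATE TABLE fact_data (year TEXT, vol INTEGER);
INSERT INTO fact_data VALUES ('2017', 40), ('2017', 2);
""")

def vol_for(year):
    """Return the summed vol for a year, or a single 0 row if none exist."""
    return conn.execute("""
    SELECT SUM(vol) AS vol, year
    FROM fact_data
    WHERE year = :y
    GROUP BY year
    UNION ALL
    SELECT 0 AS vol, :y AS year
    WHERE NOT EXISTS (SELECT 1 FROM fact_data WHERE year = :y)
    """, {"y": year}).fetchall()

print(vol_for('2017'))
print(vol_for('2018'))
```

The NOT EXISTS check should apply the same filters as the first branch. If it only tests the year while the first branch also filters on retailer and product, a year that has rows for other retailers would make both branches return nothing.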

Display rows that have a zero count

I am trying to display rows even if their count is zero, but with no luck so far.
I tried using a left join:
select
a.Month,
count(b.InsuranceFromJob) [Number of Participants without Insurance]
from
hsAdmin.ReportPeriodLkup a
left join hsAdmin.ClientReport b on
b.ReportPeriod = a.ReportPeriodId
where
b.insurancefromjob = 2 and
a.reportperiodid between (#lastReportId - 11) and #lastReportId
group by
a.Month
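Because the condition on ClientReport is in the WHERE clause, only periods that have a matching row in ClientReport will be in the result set: for unmatched periods b.insurancefromjob is NULL, so the filter removes them.
Move the check to the join condition and you will get the desired result: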
select
a.Month,
count(b.InsuranceFromJob) [Number of Participants without Insurance]
from
hsAdmin.ReportPeriodLkup a
left join hsAdmin.ClientReport b on
b.ReportPeriod = a.ReportPeriodId
and b.insurancefromjob = 2
where
a.reportperiodid between (#lastReportId - 11) and #lastReportId
group by
a.Month
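The difference between filtering in WHERE and filtering in ON can be seen side by side in this small sketch. It uses Python's sqlite3 module as a stand-in for SQL Server, drops the hsAdmin schema prefix and the report-period range filter, and uses made-up sample rows; all of that is assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ReportPeriodLkup (ReportPeriodId INTEGER PRIMARY KEY, Month TEXT);
CREATE TABLE ClientReport (ReportPeriod INTEGER, InsuranceFromJob INTEGER);
INSERT INTO ReportPeriodLkup VALUES (1, 'Jan'), (2, 'Feb');
-- Feb has a report row, but none with InsuranceFromJob = 2.
INSERT INTO ClientReport VALUES (1, 2), (1, 2), (1, 1), (2, 1);
""")

# Filter in WHERE: runs after the join and discards Feb entirely.
in_where = conn.execute("""
SELECT a.Month, COUNT(b.InsuranceFromJob)
FROM ReportPeriodLkup a
LEFT JOIN ClientReport b ON b.ReportPeriod = a.ReportPeriodId
WHERE b.InsuranceFromJob = 2
GROUP BY a.ReportPeriodId, a.Month
ORDER BY a.ReportPeriodId
""").fetchall()

# Filter in ON: Feb survives with NULLs on the right side, and
# COUNT(column) ignores NULLs, giving 0.
in_on = conn.execute("""
SELECT a.Month, COUNT(b.InsuranceFromJob)
FROM ReportPeriodLkup a
LEFT JOIN ClientReport b ON b.ReportPeriod = a.ReportPeriodId
                        AND b.InsuranceFromJob = 2
GROUP BY a.ReportPeriodId, a.Month
ORDER BY a.ReportPeriodId
""").fetchall()

print(in_where, in_on)
```

Conditions on the left-hand table, such as the reportperiodid range, can stay in WHERE because they do not depend on whether a match was found.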