Solved: Multiplying the result of a SUM() on one table by the result of a COUNT() from another table - sql

I'm trying to multiply the sum computed in one table by the count of values found in another table linked to it.
I have a table with entries like this:
Table_A
Id
SubId
Start
Stop
For each Id there can be several SubId. The Id is linked through a whole series of other tables to a second table which basically looks like this:
Table_B
Id
Val
Id in Table_B can occur from one to multiple times.
The idea is that I want to multiply the sum of time spans found in Table_A for each SubId by the count of distinct Val found in Table_B, linked through the Id.
I already have the sum of time spans from Table_A like this (I'm no developer, so the code may be shoddy; I apologize in advance):
SELECT SUM(duration) FROM (
SELECT (mod_stop - mod_start) AS duration FROM (
SELECT
(CASE
WHEN date_start < '#begin' THEN '#begin'
ELSE date_start
END) AS mod_start,
(CASE
WHEN date_stop > '#end' THEN '#end'
WHEN date_stop = '0' THEN '#end'
ELSE date_stop
END) AS mod_stop
FROM table_a
WHERE (
state = 'Launching' OR
state = 'Running' OR
state = 'Finishing'
)
AND table_a_id IN (
SELECT
DISTINCT(table_a_id)
FROM table_a
WHERE
(to_timestamp(date_start), to_timestamp(date_stop)) OVERLAPS (to_timestamp('#begin'), to_timestamp('#end'))
AND
(state = 'Running' OR state = 'Launching' OR state = 'Finishing')
AND
NOT date_stop = '0'
)
) AS t
) AS d
;
This works, but I now need to multiply each duration by the number of Val associated with its Id, and I can't work out how to do this.
I thought that another AND ... IN clause in the WHERE, with all the table-linking mechanisms returning the individual Val count, would do it. With that clause the query returns nothing despite running for over an hour, whereas without it the query returns in approximately ten minutes (there is no index on date_* and Table_A holds several tens of millions of records, which explains why it is so slow), so I fear I have got something wrong.
Is there a way to do this? Thanks!

Thanks to the comments posted and what I managed to find I've cobbled a solution together. It might be crude but it works. I build a temp table whose values I use to build a CTE. Feel free to comment if you think this can be done better.
BEGIN;
CREATE TEMP TABLE my_jsl ON COMMIT DROP AS
SELECT job_id, job_state, mod_start, mod_stop, duration FROM (
SELECT job_id, job_state, mod_start, mod_stop, mod_stop - mod_start duration FROM (
SELECT job_id, job_state,
(CASE
WHEN date_start < :my_begin THEN :my_begin
ELSE date_start
END) AS mod_start,
(CASE
WHEN date_stop > :my_end THEN :my_end
WHEN date_stop = '0' THEN :my_end
ELSE date_stop
END) AS mod_stop
FROM job_state_logs
WHERE (
job_state = 'Launching' OR
job_state = 'Running' OR
job_state = 'Finishing'
)
AND job_state_log_id IN (
SELECT
DISTINCT(job_state_log_id)
FROM job_state_logs
WHERE
(to_timestamp(date_start), to_timestamp(date_stop)) OVERLAPS (to_timestamp(:my_begin), to_timestamp(:my_end))
AND
(job_state = 'Running' OR job_state = 'Launching' OR job_state = 'Finishing')
AND
NOT date_stop = '0'
)
) AS job_temp_log
) AS job_log
;
WITH my_jr AS (
SELECT COUNT(a.resource_id) jrc, j.job_id
FROM assigned_resources a
INNER JOIN moldable_job_descriptions m ON m.moldable_id = a.moldable_job_id
INNER JOIN jobs j ON j.assigned_moldable_job = m.moldable_id
WHERE j.job_id IN (
SELECT DISTINCT(job_id) FROM my_jsl
)
GROUP BY j.job_id
),
my_surf AS (
SELECT my_jr.job_id, my_jsl.duration, my_jr.jrc, (my_jsl.duration * my_jr.jrc) AS surface
FROM my_jr, my_jsl
WHERE my_jr.job_id = my_jsl.job_id
)
SELECT SUM(surface) job_surface FROM my_surf
;
COMMIT;
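For anyone who wants to try the pattern on a small scale, here is a minimal sketch of the same idea: count the resources per job in a CTE, join that to the per-job durations, and sum the products. It runs against an in-memory SQLite database through Python's sqlite3 module rather than PostgreSQL. The tables job_durations and job_resources and their rows are made up for illustration; they stand in for my_jsl and for the assigned_resources join chain.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-ins: one row per clipped time span, one row per assigned resource.
CREATE TABLE job_durations (job_id INTEGER, duration INTEGER);
CREATE TABLE job_resources (job_id INTEGER, resource_id INTEGER);
INSERT INTO job_durations VALUES (1, 10), (1, 5), (2, 20);
INSERT INTO job_resources VALUES (1, 100), (1, 101), (2, 200), (2, 201), (2, 202);
""")

query = """
WITH res_count AS (
    -- one row per job, so the join below cannot duplicate durations
    SELECT job_id, COUNT(DISTINCT resource_id) AS n_res
    FROM job_resources
    GROUP BY job_id
)
SELECT SUM(d.duration * r.n_res)
FROM job_durations AS d
JOIN res_count AS r ON r.job_id = d.job_id
"""
job_surface = conn.execute(query).fetchone()[0]
print(job_surface)
```

Job 1 contributes (10 + 5) * 2 and job 2 contributes 20 * 3. Aggregating the counts before the join is what keeps each duration from being multiplied more than once.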

Related

Sum only when they have the same date in SQL Server

I am writing a query in SQL Server where I have to show the sum of the records of a column that share the same date, but for now it adds up all the records regardless of date.
How can I make it sum only the rows with the same date, rather than all rows together?
SELECT
FechaHoraReg,
(SELECT SUM(CantidaIngLamina)
FROM MovimientoMaterial_Produccion
WHERE IdTipoMov = '1') AS FilminaI,
(SELECT SUM(CantidaIngresa)
FROM MovimientoMaterial_Produccion
WHERE IdTipoMov = '1') AS PapelI
FROM
MovimientoMaterial_Produccion
GROUP BY
FechaHoraReg
The following query will give you the sum of each of the 2 columns independently:
WITH CTE_FILMINAI
AS
(
SELECT
FECHAHORAREG,
SUM(CANTIDAINGLAMINA) AS FILMINAI
FROM MOVIMIENTOMATERIAL_PRODUCCION
WHERE 1=1
AND IDTIPOMOV = '1'
GROUP BY FECHAHORAREG
), CTE_PAPELI
AS
(
SELECT
FECHAHORAREG,
SUM(CANTIDAINGRESA) AS PAPELI
FROM MOVIMIENTOMATERIAL_PRODUCCION
WHERE 1=1
AND IDTIPOMOV = '1'
GROUP BY FECHAHORAREG
)
SELECT
MP.FECHAHORAREG,
CF.FILMINAI,
CP.PAPELI
FROM MOVIMIENTOMATERIAL_PRODUCCION MP
LEFT JOIN CTE_FILMINAI CF ON CF.FECHAHORAREG = MP.FECHAHORAREG
LEFT JOIN CTE_PAPELI CP ON CP.FECHAHORAREG = MP.FECHAHORAREG
This should allow you to find the sum of each of the two columns independently and to change each WHERE clause however you need. Additionally, it uses a left join from the main table in case a FECHAHORAREG value is somehow missing from one of the CTEs. This could also be changed depending on what you need it for.
Just filter out the values to ignore via a case expression. There's no reason to involve other joins or subqueries:
select FechaHoraReg,
sum(case when IdTipoMov = '1' -- should this actually be a numeric comparison?
then CantidaIngLamina else 0 end) as FilminaI,
sum(case when IdTipoMov = '1' -- should this actually be a numeric comparison?
then CantidaIngresa else 0 end) as PapelI
from MovimientoMaterial_Produccion
group by FechaHoraReg
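A small sketch of the case-expression approach, run against an in-memory SQLite database through Python's sqlite3 module instead of SQL Server. The table movimientos, its shortened column names, and its rows are invented for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified stand-in for MovimientoMaterial_Produccion.
CREATE TABLE movimientos (fecha TEXT, id_tipo_mov TEXT, lamina INTEGER, papel INTEGER);
INSERT INTO movimientos VALUES
  ('2021-01-01', '1', 5, 10),
  ('2021-01-01', '1', 7, 20),
  ('2021-01-01', '2', 100, 100),
  ('2021-01-02', '1', 1, 2);
""")

rows = conn.execute("""
SELECT fecha,
       SUM(CASE WHEN id_tipo_mov = '1' THEN lamina ELSE 0 END) AS filmina_i,
       SUM(CASE WHEN id_tipo_mov = '1' THEN papel  ELSE 0 END) AS papel_i
FROM movimientos
GROUP BY fecha
ORDER BY fecha
""").fetchall()

for row in rows:
    print(row)
```

The row with id_tipo_mov = '2' contributes 0 to both sums, so each date only totals its own type '1' rows.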
Each subquery needs to filter by both the id and the date, where the date must match the date of the outer query.
Try it:
SELECT
origin.FechaHoraReg,
(SELECT SUM(CantidaIngLamina)
FROM MovimientoMaterial_Produccion AS sub
WHERE sub.IdTipoMov = '1'
AND sub.FechaHoraReg = origin.FechaHoraReg) AS FilminaI,
(SELECT SUM(CantidaIngresa)
FROM MovimientoMaterial_Produccion AS subdos
WHERE subdos.IdTipoMov = '1'
AND subdos.FechaHoraReg = origin.FechaHoraReg) AS PapelI
FROM
MovimientoMaterial_Produccion AS origin
GROUP BY
origin.FechaHoraReg

How to update the parent table twice if the child table returns the same parent id twice in mysql using IN

This is the query I'm using. It works fine, but the problem is that if (select u.ChannelId from u) returns the same ChannelInfo id twice, IN only updates ChannelInfo.Amount once, whereas Amount has to be reduced twice.
For example, if (select u.ChannelId from u) returns (3, 3), ChannelInfo.Amount has to be reduced twice, but it is only reduced once.
How can I solve this problem? Can anyone help me, please?
The ChannelInfo table has a one-to-many relation with Reporting, so the Reporting table can contain the same ChannelInfo id 2 or 3 times.
with u as (
update "Reporting"
set "Status" = 'run'
where "Status" = 'a'
RETURNING "ChannelId"
)
update "ChannelInfo"
set "Amount"= (CASE WHEN "Duration" = '60' THEN "Amount" - 12.25 ELSE "Amount" - 6.13 END)
where "id" in (select u.ChannelId from u);
I'm not 100% sure what you are asking for, but I think it is to subtract the count of the rows in u from the amount. That is, the constant would be multiplied by a count.
You can use a join to get a count and use that for the calculation:
with u as (
update "Reporting"
set "Status" = 'run'
where "Status" = 'a'
RETURNING "ChannelId"
)
update "ChannelInfo" ci
set "Amount"= (CASE WHEN ci."Duration" = '60' THEN ci."Amount" - uc.cnt * 12.25 ELSE ci."Amount" - uc.cnt * 6.13 END)
-- "ChannelId" was created as a quoted identifier, so it must stay quoted here
from (select u."ChannelId", count(*) as cnt
from u
group by u."ChannelId"
) uc
where ci."id" = uc."ChannelId";
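To see why the count matters, here is a rough sketch on an in-memory SQLite database via Python's sqlite3 module. It is not the PostgreSQL statement itself: the table touched is a hypothetical stand-in for the rows returned by the data-modifying CTE, and a correlated subquery replaces the join so the sketch does not depend on UPDATE ... FROM support. All names and amounts are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE channel_info (id INTEGER PRIMARY KEY, duration TEXT, amount REAL);
-- Stand-in for the "ChannelId" values returned by the Reporting update.
CREATE TABLE touched (channel_id INTEGER);
INSERT INTO channel_info VALUES (3, '60', 100.0), (4, '30', 100.0), (5, '60', 100.0);
INSERT INTO touched VALUES (3), (3), (4);
""")

conn.execute("""
UPDATE channel_info
SET amount = amount
    - (SELECT COUNT(*) FROM touched t WHERE t.channel_id = channel_info.id)
    * CASE WHEN duration = '60' THEN 12.25 ELSE 6.13 END
WHERE id IN (SELECT channel_id FROM touched)
""")

amounts = dict(conn.execute("SELECT id, amount FROM channel_info ORDER BY id"))
print(amounts)
```

Channel 3 appears twice in touched, so it loses 2 * 12.25; channel 4 loses 6.13 once; channel 5 is not listed and keeps its amount.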

Check whether an employee is present on three consecutive days

I have a table called tbl_A with the following schema:
After insert, I have the following data in tbl_A:
Now the question is how to write a query for the following scenario:
Put (1) in front of any employee who was present three days consecutively
Put (0) in front of employee who was not present three days consecutively
The output screenshot:
I think we should use a CASE statement, but I am not able to check for three consecutive days from the date. I hope someone can help me with this.
Thank you
select name, case when max(cons_days) >= 3 then 1 else 0 end as presence
from (
select name, count(*) as cons_days
from tbl_A, (values (0),(1),(2)) as a(dd)
group by name, adate + dd
)x
group by name
With a self-join on name and available = 'Y', we create an inner table with the different combinations of dates for a given name, and count the entries in which the date from the second instance of the table is at most 2 days after the date from the first. That is, for each date adate, it checks for entries with adate itself as well as adate + 1 and adate + 2. If all 3 entries are present, the count will be 3, and the outer query sets the flag to 1 for such names. Try the query below:
SELECT Z.NAME,
CASE WHEN MAX(Z.CONSEQ_AVAIL) >= 3 THEN 1 ELSE 0 END AS YOUR_FLAG
FROM
(
-- one count per (name, start date), so counts from different dates are not added together
SELECT A.NAME, A.ADATE,
SUM(CASE WHEN B.ADATE >= A.ADATE AND B.ADATE <= A.ADATE + 2 THEN 1 ELSE 0 END) AS CONSEQ_AVAIL
FROM
TBL_A A INNER JOIN TBL_A B
ON A.NAME = B.NAME AND A.AVAILABLE = 'Y' AND B.AVAILABLE = 'Y'
GROUP BY A.NAME, A.ADATE
) Z
GROUP BY Z.NAME;
Due to the complexity of the problem, I have not been able to test it out. If something is really wrong, please let me know and I will be happy to take down my answer.
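Since that query was untested, here is a small runnable sketch of the self-join idea on an in-memory SQLite database via Python's sqlite3 module. It makes several assumptions: the table attendance and its rows are invented, dates are plain integer day numbers (aday) to keep the arithmetic simple, and the count is taken per name and start day before the maximum is taken per name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical data; aday is an integer day number, not a real date.
CREATE TABLE attendance (name TEXT, aday INTEGER, available TEXT);
INSERT INTO attendance VALUES
  ('ann', 1, 'Y'), ('ann', 2, 'Y'), ('ann', 3, 'Y'), ('ann', 5, 'Y'),
  ('bob', 1, 'Y'), ('bob', 2, 'N'), ('bob', 3, 'Y'), ('bob', 4, 'Y'),
  ('cy',  1, 'Y'), ('cy',  3, 'Y'), ('cy',  5, 'Y'), ('cy',  7, 'Y');
""")

rows = conn.execute("""
SELECT name,
       CASE WHEN MAX(run_len) >= 3 THEN 1 ELSE 0 END AS presence
FROM (
    -- for each present day, count present days in the window [aday, aday + 2]
    SELECT a.name, a.aday, COUNT(DISTINCT b.aday) AS run_len
    FROM attendance a
    JOIN attendance b
      ON b.name = a.name
     AND b.available = 'Y'
     AND b.aday BETWEEN a.aday AND a.aday + 2
    WHERE a.available = 'Y'
    GROUP BY a.name, a.aday
)
GROUP BY name
ORDER BY name
""").fetchall()

for row in rows:
    print(row)
```

Only ann has three present days in a row (days 1 to 3). cy has four present days in total but never three in a row, which is why the count has to be taken per start day rather than per name.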
-- Below is my approach
select Name,
case when Max_Count >= 3 then 1 else 0 end as Presence
from
(
select Name, max(Coun) as Max_Count
from
(
select Name, (count(*) over (partition by Name,Ref_Date)) as Coun from
(
select Name,adate + row_number() over (partition by Name order by Adate desc) as Ref_Date
from temp
where available='Y'
)
) group by Name
);
select name as employee, case when max(diff) = 1 then 1 else 0 end as presence
from
-- diff = 1 when the second-next present day of the same employee is exactly two days later
-- (assumes at most one row per employee per day)
(select id, name, Available, Adate,
case when datediff(day, Adate, lead(Adate, 2) over (partition by name order by Adate)) = 2 then 1 else 0 end as diff
from tbl_A
where Available = 'Y') A
group by name;

Constructing A Query In BigQuery With CASE Statements

So I'm trying to construct a query in BigQuery that I'm struggling with for a final part.
As of now I have:
SELECT
UNIQUE(Name) as SubscriptionName,
ID,
Interval,
COUNT(mantaSubscriptionIdmetadata) AS SubsPurchased,
SUM(RevenueGenerated) as RevenueGenerated
FROM (
SELECT
mantaSubscriptionIdmetadata,
planIdmetadata,
INTEGER(Amount) as RevenueGenerated
FROM
[sample_internal_data.charge0209]
WHERE
revenueSourcemetadata = 'new'
AND
Status = 'Paid'
GROUP BY
mantaSubscriptionIdmetadata,
planIdmetadata,
RevenueGenerated
)a
JOIN (
SELECT
id,
Name,
Interval
FROM
[sample_internal_data.subplans]
WHERE
id in ('150017','150030','150033','150019')
GROUP BY
id,
Name,
Interval )b
ON
a.planIdmetadata = b.id
GROUP BY
ID,
Interval,
Name
ORDER BY
Interval ASC
The query result looks like this
Which is exactly what I'm looking for up to that point.
Here is what I'm stuck on. There is another column I need to add called SalesRepName. The resulting field will either be null or not null. If it's null, it means the subscription was sold online. If it's not null, it means it was sold via telephone. What I want to do is create two additional columns that show how many were sold via telesales and how many online. The sum of the two columns will always equal the SubsPurchased total.
Can anyone help?
You can include case statements within aggregate functions. Here you could choose sum(case when SalesRepName is null then 1 else 0 end) as online and sum(case when SalesRepName is not null then 1 else 0 end) as telesales.
count(case when SalesRepName is null then 1 end) as online would give the same result. Using sum in these situations is simply my personal preference.
Note that omitting the else clause is equivalent to setting else null, and null isn't counted by count. This can be very useful in combination with exact_count_distinct, which has no equivalent in terms of sum.
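A quick way to convince yourself that the sum and count variants agree is to run both side by side. The sketch below uses an in-memory SQLite database through Python's sqlite3 module rather than BigQuery; the table charges, its columns, and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical charges: a NULL sales_rep_name means the sale was made online.
CREATE TABLE charges (plan_id TEXT, sales_rep_name TEXT);
INSERT INTO charges VALUES
  ('150017', NULL), ('150017', 'Dana'), ('150017', NULL),
  ('150030', 'Lee');
""")

rows = conn.execute("""
SELECT plan_id,
       COUNT(*) AS subs_purchased,
       SUM(CASE WHEN sales_rep_name IS NULL THEN 1 ELSE 0 END) AS online_sum,
       COUNT(CASE WHEN sales_rep_name IS NULL THEN 1 END)       AS online_count,
       SUM(CASE WHEN sales_rep_name IS NOT NULL THEN 1 ELSE 0 END) AS telesales
FROM charges
GROUP BY plan_id
ORDER BY plan_id
""").fetchall()

for row in rows:
    print(row)
```

In each row online_sum equals online_count, and online plus telesales adds up to subs_purchased.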
Try the query below.
It assumes your SalesRepName field is in the [sample_internal_data.charge0209] table,
and it uses a "tiny version" of SUM(CASE WHEN ... END), which works when the value to be summed is 0 or 1:
SUM(SalesRepName IS NULL) AS onlinesales,
SUM(NOT SalesRepName IS NULL) AS telesales
SELECT
UNIQUE(Name) AS SubscriptionName,
ID,
Interval,
COUNT(mantaSubscriptionIdmetadata) AS SubsPurchased,
SUM(RevenueGenerated) AS RevenueGenerated,
SUM(SalesRepName IS NULL) AS onlinesales,
SUM(NOT SalesRepName IS NULL) AS telesales
FROM (
SELECT SalesRepName, mantaSubscriptionIdmetadata, planIdmetadata, INTEGER(Amount) AS RevenueGenerated
FROM [sample_internal_data.charge0209]
WHERE revenueSourcemetadata = 'new'
AND Status = 'Paid'
GROUP BY SalesRepName, mantaSubscriptionIdmetadata, planIdmetadata, RevenueGenerated
)a
JOIN (
SELECT id, Name, Interval
FROM [sample_internal_data.subplans]
WHERE id IN ('150017','150030','150033','150019')
GROUP BY id, Name, Interval
)b
ON a.planIdmetadata = b.id
GROUP BY ID, Interval, Name
ORDER BY Interval ASC

How do I determine if a group of data exists in a table, given the data that should appear in the group's rows?

I am writing data to a table and allocating a "group-id" for each batch of data that is written. To illustrate, consider the following table.
GroupId Value
------- -----
1 a
1 b
1 c
2 a
2 b
3 a
3 b
3 c
3 d
In this example, there are three groups of data, each with similar but varying values.
How do I query this table to find a group that contains a given set of values? For instance, if I query for (a,b,c) the result should be group 1. Similarly, a query for (b,a) should result in group 2, and a query for (a, b, c, e) should result in the empty set.
I can write a stored procedure that performs the following steps:
select distinct GroupId from Groups -- and store locally
for each distinct GroupId: perform a set-difference (except) between the input and table values (for the group), and vice versa
return the GroupId if both set-difference operations produced empty sets
This seems a bit excessive, and I am hoping to leverage some other commands in SQL to simplify it. Is there a simpler way to perform a set comparison in this context, or to select the group ID that contains exactly the input values for the query?
This is a set-within-sets query. I like to solve it using group by and having:
select groupid
from GroupValues gv
group by groupid
having sum(case when value = 'a' then 1 else 0 end) > 0 and
sum(case when value = 'b' then 1 else 0 end) > 0 and
sum(case when value = 'c' then 1 else 0 end) > 0 and
sum(case when value not in ('a', 'b', 'c') then 1 else 0 end) = 0;
The first three conditions in the having clause check that each element exists. The last condition checks that there are no other values. This method is quite flexible and handles various inclusion and exclusion conditions on the values you are looking for.
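Here is the having approach run against the sample data from the question, as a minimal sketch using an in-memory SQLite database through Python's sqlite3 module (the original is presumably SQL Server; the table name GroupValues follows the answer).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE GroupValues (groupid INTEGER, value TEXT);
INSERT INTO GroupValues VALUES
  (1,'a'),(1,'b'),(1,'c'),
  (2,'a'),(2,'b'),
  (3,'a'),(3,'b'),(3,'c'),(3,'d');
""")

matches = [row[0] for row in conn.execute("""
SELECT groupid
FROM GroupValues
GROUP BY groupid
HAVING SUM(CASE WHEN value = 'a' THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN value = 'b' THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN value = 'c' THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN value NOT IN ('a', 'b', 'c') THEN 1 ELSE 0 END) = 0
""")]
print(matches)
```

Group 2 fails the check for 'c', and group 3 fails the last condition because of 'd', so only group 1 remains.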
EDIT:
If you want to pass in a list, you can use:
with thelist as (
select 'a' as value union all
select 'b' union all
select 'c'
)
select groupid
from GroupValues gv left outer join
thelist
on gv.value = thelist.value
group by groupid
having count(distinct gv.value) = (select count(*) from thelist) and
count(distinct (case when gv.value = thelist.value then gv.value end)) = count(distinct gv.value);
Here the having clause counts the number of matching values and makes sure that this is the same size as the list.
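If the list comes from application code, the same counting idea can be parameterized. The sketch below is one possible way to do it, not the query above verbatim: it uses Python's sqlite3 module with an in-memory SQLite database, replaces the CTE and join with an IN list of bound parameters, and the helper name find_groups is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE GroupValues (groupid INTEGER, value TEXT);
INSERT INTO GroupValues VALUES
  (1,'a'),(1,'b'),(1,'c'),
  (2,'a'),(2,'b'),
  (3,'a'),(3,'b'),(3,'c'),(3,'d');
""")

def find_groups(conn, wanted):
    """Return ids of groups whose distinct values are exactly set(wanted)."""
    wanted = sorted(set(wanted))
    if not wanted:
        return []
    placeholders = ",".join("?" for _ in wanted)
    sql = f"""
        SELECT groupid
        FROM GroupValues
        GROUP BY groupid
        HAVING COUNT(DISTINCT value) = ?
           AND COUNT(DISTINCT CASE WHEN value IN ({placeholders})
                                   THEN value END) = ?
        ORDER BY groupid
    """
    # parameter order follows the ? order: size, the wanted values, size again
    params = [len(wanted), *wanted, len(wanted)]
    return [row[0] for row in conn.execute(sql, params)]

print(find_groups(conn, ["a", "b", "c"]))
print(find_groups(conn, ["b", "a"]))
print(find_groups(conn, ["a", "b", "c", "e"]))
```

The first condition requires the group to have as many distinct values as the list, and the second requires all of them to be in the list, which together mean the two sets are equal.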
EDIT:
The query failed to compile because a table alias was missing. Updated with the right table alias.
This is kind of ugly, but it works. On larger datasets I'm not sure what performance would look like, but the nested instances of #GroupValues key off GroupID in the main table so I think as long as you have a good index on GroupID it probably wouldn't be too horrible.
If Object_ID('tempdb..#GroupValues') Is Not Null Drop Table #GroupValues
Create Table #GroupValues (GroupID Int, Val Varchar(10));
Insert #GroupValues (GroupID, Val)
Values (1,'a'),(1,'b'),(1,'c'),(2,'a'),(2,'b'),(3,'a'),(3,'b'),(3,'c'),(3,'d');
If Object_ID('tempdb..#FindValues') Is Not Null Drop Table #FindValues
Create Table #FindValues (Val Varchar(10));
Insert #FindValues (Val)
Values ('a'),('b'),('c');
Select Distinct gv.GroupID
From (Select Distinct GroupID
From #GroupValues) gv
Where Not Exists (Select 1
From #FindValues fv2
Where Not Exists (Select 1
From #GroupValues gv2
Where gv.GroupID = gv2.GroupID
And fv2.Val = gv2.Val))
And Not Exists (Select 1
From #GroupValues gv3
Where gv3.GroupID = gv.GroupID
And Not Exists (Select 1
From #FindValues fv3
Where gv3.Val = fv3.Val))
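For a quick check outside SQL Server, the same double NOT EXISTS shape can be tried on an in-memory SQLite database through Python's sqlite3 module. This is only a sketch: ordinary tables named GroupValues and FindValues replace the temp tables, and the search set here is (b, a) from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE GroupValues (GroupID INTEGER, Val TEXT);
INSERT INTO GroupValues VALUES
  (1,'a'),(1,'b'),(1,'c'),(2,'a'),(2,'b'),(3,'a'),(3,'b'),(3,'c'),(3,'d');
CREATE TABLE FindValues (Val TEXT);
INSERT INTO FindValues VALUES ('b'), ('a');
""")

sql = """
SELECT g.GroupID
FROM (SELECT DISTINCT GroupID FROM GroupValues) AS g
WHERE NOT EXISTS (      -- no wanted value is missing from the group
        SELECT 1 FROM FindValues f
        WHERE NOT EXISTS (SELECT 1 FROM GroupValues v
                          WHERE v.GroupID = g.GroupID AND v.Val = f.Val))
  AND NOT EXISTS (      -- the group holds no value outside the wanted set
        SELECT 1 FROM GroupValues v
        WHERE v.GroupID = g.GroupID
          AND NOT EXISTS (SELECT 1 FROM FindValues f WHERE f.Val = v.Val))
"""
exact_groups = [row[0] for row in conn.execute(sql)]
print(exact_groups)
```

Groups 1 and 3 both contain 'a' and 'b' but also hold extra values, so the second NOT EXISTS removes them and only group 2 is returned.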