I have these two tables:
TIME (this table contains the time_id, which in turn gives details like the day, month, year, etc.)
|time_id|hour|day|month|year|
|-------|----|---|-----|----|
|1234   |1   |6  |9    |2013|
|1235   |2   |7  |9    |2013|
|1223   |2   |4  |8    |2014|
|1227   |2   |8  |8    |2014|
SUM_JOBS_PROCESSED (this table contains the time_id and the number of jobs processed for that particular time_id.)
|time_id|sum_of_jobs_processed|
|-------|---------------------|
|1234   |5                    |
|1235   |6                    |
|1223   |4                    |
|1227   |4                    |
I am trying to write a query which should display something like this
|month|year|sum_of_jobs_processed|
|-----|----|---------------------|
|9    |2013|11                   |
|8    |2014|8                    |
It should display the total number of jobs processed for each month.
Could anyone please help me with this? I am able to find the total number of jobs processed for a day, but I can't get the total for a month.
Not sure I fully understood what you're trying to do, but I think this query should give you the desired result:
SELECT t.month,
       t.year,
       SUM(s.sum_of_jobs_processed)
FROM bspm_dim_time t
JOIN bspm_sum_jobs_day s
  ON t.time_id = s.time_id
GROUP BY t.month,
         t.year
ORDER BY t.year,
         t.month
Try this:
SELECT month,
       year,
       SUM(sum_of_jobs_processed)
FROM TIME
INNER JOIN SUM_JOBS_PROCESSED
        ON TIME.time_id = SUM_JOBS_PROCESSED.time_id
GROUP BY month,
         year
ORDER BY year,
         month
Mark as answer if correct.
Background
I'm a novice Postgres user running a local server on a Windows 10 machine. I've got a dataset g that looks like this:
+--+---------+----------------+
|id|treatment|outcome_category|
+--+---------+----------------+
|a |1 |cardiovascular |
|a |0 |cardiovascular |
|b |0 |metabolic |
|b |0 |sensory |
|c |1 |NULL |
|c |0 |cardiovascular |
|c |1 |sensory |
|d |1 |NULL |
|d |0 |cns |
+--+---------+----------------+
The Problem
I'd like to get a count of rows per outcome_category for those id's who are "ever treated" -- defined as "id's who have any row where treatment=1".
Here's the desired result:
+----------------+-----+
|outcome_category|count|
+----------------+-----+
|cardiovascular  |3    |
|sensory         |1    |
|cns             |1    |
+----------------+-----+
It would also be fine if the result contained metabolic, like so:
+----------------+-----+
|outcome_category|count|
+----------------+-----+
|cardiovascular  |3    |
|metabolic       |0    |
|sensory         |1    |
|cns             |1    |
+----------------+-----+
Obviously I don't need the rows to be in any particular order, though descending would be nice.
What I've tried
Here's a query I've written:
select treatment, outcome_category, sum(outcome_ct)
from (select max(treatment) as treatment,
             outcome_category,
             count(outcome_category) as outcome_ct
      from g
      group by outcome_category) as sub
group by outcome_category, sub.treatment;
But it's a mishmash result:
+---------+----------------+---+
|treatment|outcome_category|sum|
+---------+----------------+---+
|1 |cardiovascular |3 |
|1 |sensory |2 |
|0 |metabolic |1 |
|1 |NULL |0 |
|0 |cns |1 |
+---------+----------------+---+
I'm trying to identify the "ever exposed" id's using that first line in the subquery: select max(treatment) as treatment. But I'm not quite getting at the rest of it.
EDIT
I realized that the toy dataset g I originally gave you above doesn't correspond to the idiosyncrasies of my real dataset. I've updated g to reflect that many id's who are "ever treated" won't have a non-null outcome_category next to a row with treatment=1.
Interesting little problem. You can do:
select outcome_category,
       count(x.id) as count
from g
left join (
    select distinct id from g where treatment = 1
) x on x.id = g.id
where outcome_category is not null
group by outcome_category
order by count desc
Result:
outcome_category count
----------------- -----
cardiovascular 3
sensory 1
cns 1
metabolic 0
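If you want to verify this against the toy dataset, here is a small Python/SQLite sketch of the same left-join approach (SQLite stands in for Postgres here; the query itself is portable):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE g (id TEXT, treatment INT, outcome_category TEXT);
INSERT INTO g VALUES
 ('a',1,'cardiovascular'), ('a',0,'cardiovascular'),
 ('b',0,'metabolic'),      ('b',0,'sensory'),
 ('c',1,NULL),             ('c',0,'cardiovascular'), ('c',1,'sensory'),
 ('d',1,NULL),             ('d',0,'cns');
""")

# Left join against the set of ever-treated ids; COUNT(x.id) ignores the
# NULLs produced for never-treated ids, so metabolic lands at 0.
rows = conn.execute("""
SELECT outcome_category, COUNT(x.id) AS cnt
FROM g
LEFT JOIN (SELECT DISTINCT id FROM g WHERE treatment = 1) x
       ON x.id = g.id
WHERE outcome_category IS NOT NULL
GROUP BY outcome_category
ORDER BY cnt DESC
""").fetchall()
print(rows)
```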
This would appear to be just a simple aggregation,
select outcome_category, Count(*) count
from t
where treatment=1
group by outcome_category
order by Count(*) desc
I am working with an inpatients' data table that looks like the following:
|ID  |AdmissionDate|DischDate |LOS|Readmitted30days|
|----|-------------|----------|---|----------------|
|001 |2014-01-01   |2014-01-12|11 |1               |
|101 |2014-02-05   |2014-02-12|7  |1               |
|001 |2014-02-18   |2018-02-27|9  |1               |
|001 |2018-02-01   |2018-02-13|12 |0               |
|212 |2014-01-28   |2014-02-12|15 |1               |
|212 |2014-03-02   |2014-03-15|13 |0               |
|212 |2016-12-23   |2016-12-29|4  |0               |
|1011|2017-06-10   |2017-06-21|11 |0               |
|401 |2018-01-01   |2018-01-11|10 |0               |
|401 |2018-10-01   |2018-10-10|9  |0               |
I want to create another table from the above in which the total length of stay (LOS) is summed up for those who have been readmitted within 30 days. The table I want to create looks like the following:
|ID  |Total LOS|
|----|---------|
|001 |39       |
|212 |28       |
|212 |4        |
|1011|11       |
|401 |10       |
|401 |9        |
I am using SQL Server Version 17.
Could anyone help me do this?
Thanks in advance
The Readmitted30days column seems irrelevant to the question and a complete red herring. What you seem to want is to aggregate rows which are within 30 days of each other.
This is a type of gaps-and-islands problem. There are a number of solutions, here is one:
- We use LAG to check whether the previous DischDate is within 30 days of this AdmissionDate.
- Based on that we assign a grouping ID by doing a running count.
- Then we simply group by ID and our grouping ID, and sum.
The dates and LOS don't seem to match up, so I've given you both:
WITH StartPoints AS (
    SELECT *,
        IsStart = CASE WHEN DATEADD(day, -30, AdmissionDate) <
                            LAG(DischDate) OVER (PARTITION BY ID ORDER BY DischDate)
                  THEN NULL ELSE 1 END
    FROM YourTable
),
Groupings AS (
    SELECT *,
        GroupId = COUNT(IsStart) OVER (PARTITION BY ID ORDER BY DischDate ROWS UNBOUNDED PRECEDING)
    FROM StartPoints
)
SELECT
    ID,
    TotalBasedOnDates = SUM(DATEDIFF(day, AdmissionDate, DischDate)), -- do you need to add 1 within the sum?
    TotalBasedOnLOS = SUM(LOS)
FROM Groupings
GROUP BY ID, GroupId;
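For anyone who wants to experiment, here is a sketch of the same gaps-and-islands idea in Python with SQLite. It is an adaptation rather than the exact T-SQL: SQLite has no DATEADD, so the date() modifier is used instead, and ISO date strings are compared directly. A row starts a new island (IsStart = 1) unless it was admitted within 30 days of the previous discharge; the running COUNT of starts is the group id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stays (ID TEXT, AdmissionDate TEXT, DischDate TEXT, LOS INT);
INSERT INTO stays VALUES
 ('001','2014-01-01','2014-01-12',11),
 ('101','2014-02-05','2014-02-12',7),
 ('001','2014-02-18','2018-02-27',9),
 ('001','2018-02-01','2018-02-13',12),
 ('212','2014-01-28','2014-02-12',15),
 ('212','2014-03-02','2014-03-15',13),
 ('212','2016-12-23','2016-12-29',4),
 ('1011','2017-06-10','2017-06-21',11),
 ('401','2018-01-01','2018-01-11',10),
 ('401','2018-10-01','2018-10-10',9);
""")

rows = conn.execute("""
WITH StartPoints AS (
    -- a row starts a new island unless admitted within
    -- 30 days of the previous discharge for the same ID
    SELECT *,
           CASE WHEN date(AdmissionDate, '-30 days') <
                     LAG(DischDate) OVER (PARTITION BY ID ORDER BY DischDate)
                THEN NULL ELSE 1 END AS IsStart
    FROM stays
),
Groupings AS (
    -- running count of island starts = group id
    SELECT *,
           COUNT(IsStart) OVER (PARTITION BY ID ORDER BY DischDate
                                ROWS UNBOUNDED PRECEDING) AS GroupId
    FROM StartPoints
)
SELECT ID, SUM(LOS) AS TotalLOS
FROM Groupings
GROUP BY ID, GroupId
ORDER BY ID, GroupId
""").fetchall()
print(rows)
```

On this data, ID 212 comes out as two islands (28 and 4) and ID 401 as two (10 and 9), matching the desired output for those patients; the 001 rows don't match because of the stray 2018 discharge date in the sample.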
If I understand correctly:
select Id, sum(LOS)
from tablename
where Readmitted30days = 1
group by Id
You want to use aggregation:
select id, sum(los)
from t
group by id
having max(Readmitted30days) = 1;
This filters after the aggregation so all los values are included in the sum.
EDIT:
I think I understand. Every occasion where Readmitted30days = 0, you want a row in the result set that combines that row with the following rows up to the next matching row.
If that interpretation is correct, you can construct groups using a cumulative sum and then aggregate:
select id, sum(los)
from (select t.*,
             sum(case when Readmitted30days = 0 then 1 else 0 end)
                 over (partition by id order by admissiondate) as grp
      from t
     ) t
group by id, grp;
I'm trying to group the following table (example) by a period of time:
------------------
|month|year|value|
------------------
|7 |2019|1.2 |
|8 |2019|1.7 |
|9 |2019|1.5 |
|10 |2019|0.7 |
|11 |2019|0.2 |
|12 |2019|1.7 |
|1 |2020|1.0 |
|2 |2020|0.1 |
|3 |2020|2.1 |
|4 |2020|1.2 |
|5 |2020|1.2 |
|6 |2020|1.7 |
|7 |2020|2.1 |
|8 |2020|1.7 |
|9 |2020|1.5 |
|10 |2020|0.7 |
|11 |2020|0.2 |
|12 |2020|1.7 |
|1 |2021|1.0 |
|2 |2021|0.1 |
|3 |2021|2.1 |
|4 |2021|1.2 |
|5 |2021|1.7 |
|6 |2021|1.5 |
Etc..
I have to group every 12 months, from July (7) to June (6) of the next year.
I already tried some solutions found online but nothing worked for me. Does anyone have a solution?
I'm using Postgresql.
Thanks in advance
One way is to use arithmetic:
select floor((year * 12 + month - 7) / 12) as effective_year, avg(value)
from t
group by effective_year;
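The arithmetic can be checked quickly in Python with SQLite, where / on integers truncates (which matches floor for these positive values). July 2019 maps to (2019*12 + 7 - 7)/12 = 2019, and so does everything through June 2020:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (month INT, year INT, value REAL)")
data = [(7,2019,1.2),(8,2019,1.7),(9,2019,1.5),(10,2019,0.7),(11,2019,0.2),(12,2019,1.7),
        (1,2020,1.0),(2,2020,0.1),(3,2020,2.1),(4,2020,1.2),(5,2020,1.2),(6,2020,1.7),
        (7,2020,2.1),(8,2020,1.7),(9,2020,1.5),(10,2020,0.7),(11,2020,0.2),(12,2020,1.7),
        (1,2021,1.0),(2,2021,0.1),(3,2021,2.1),(4,2021,1.2),(5,2021,1.7),(6,2021,1.5)]
conn.executemany("INSERT INTO t VALUES (?,?,?)", data)

# Months July..June of the following year all collapse to one effective_year.
rows = conn.execute("""
SELECT (year * 12 + month - 7) / 12 AS effective_year, AVG(value)
FROM t
GROUP BY effective_year
ORDER BY effective_year
""").fetchall()
print(rows)
```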
#GordonLinoff has the appropriate solution, missing only the actual period covered by the effective_year. However, that period is easily derived using effective_year and a couple of built-in functions: daterange and make_date.
select daterange(make_date(effective_year, 7, 1)
                ,make_date(effective_year + 1, 6, 30)
                ,'[]'
                )
     , avg_value
from (select floor((year * 12 + month - 7) / 12)::integer as effective_year
           , avg(value) as avg_value
      from test_data
      group by effective_year
     ) da
order by effective_year;
My apologies if I'm not wording the question correctly, and that's why I can't find any previous question/answers on this.....
My specific situation can be generalized as:
I have a table containing records of bed assignments for patients at a system of hospitals. A patient's placement into a bed is tagged with a date, and a reason for their placement there.
Patient |Hospital |Bed |Reason |Date
--------|---------|----|-------|--------
1234    |HOSP1    |111 |A      |1/1/2016
5678    |HOSP1    |222 |A      |2/1/2016
9012    |HOSP2    |333 |B      |3/1/2016
3456    |HOSP3    |444 |C      |3/1/2016
2345    |HOSP3    |555 |A      |3/1/2016
7890    |HOSP1    |111 |D      |4/1/2016
Based on the very small sample set above, I need to get a count of the "Reasons", per Hospital, given an "as of" date. So given an "as of" date of 3/15/2016:
As of Date: 3/15/2016
Hospital|Reason |Count
--------|---------|-----
HOSP1 |A |2
HOSP2 |B |1
HOSP3 |A |1
HOSP3 |C |1
But when changing the "as of" date to 4/15/2016, I would hope to see the following:
As of Date: 4/15/2016
Hospital|Reason |Count
--------|---------|-----
HOSP1 |A |1
HOSP1 |D |1
HOSP2 |B |1
HOSP3 |A |1
HOSP3 |C |1
Any suggestions on the best route to accomplish this without melting my CPU or the servers? (My real record set is about 36m rows, going back 15 years.) My ultimate end goal is to determine yearly averages of "reason" counts at each "hospital", but I know the first step is to get these initial counts finalized (or is it???).
What you want is the most recent record before a certain date. This is pretty easy to do using window functions:
select hospital, reason, count(*)
from (select t.*,
             row_number() over (partition by hospital, bed order by date desc) as seqnum
      from t
      where date <= '2016-03-15'
     ) t
where seqnum = 1
group by hospital, reason;
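A quick way to test this is with Python and SQLite against the sample rows. Two stand-in choices, not requirements of the approach: the dates are stored as ISO strings so the <= comparison works lexicographically, and the Date column is renamed PlacedOn to avoid clashing with a keyword:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE beds (Patient INT, Hospital TEXT, Bed TEXT, Reason TEXT, PlacedOn TEXT);
INSERT INTO beds VALUES
 (1234,'HOSP1','111','A','2016-01-01'),
 (5678,'HOSP1','222','A','2016-02-01'),
 (9012,'HOSP2','333','B','2016-03-01'),
 (3456,'HOSP3','444','C','2016-03-01'),
 (2345,'HOSP3','555','A','2016-03-01'),
 (7890,'HOSP1','111','D','2016-04-01');
""")

def reasons_as_of(as_of):
    # Keep only the latest placement per (hospital, bed) on or before
    # as_of, then count reasons per hospital.
    return conn.execute("""
        SELECT Hospital, Reason, COUNT(*)
        FROM (SELECT b.*,
                     ROW_NUMBER() OVER (PARTITION BY Hospital, Bed
                                        ORDER BY PlacedOn DESC) AS seqnum
              FROM beds b
              WHERE PlacedOn <= ?) ranked
        WHERE seqnum = 1
        GROUP BY Hospital, Reason
        ORDER BY Hospital, Reason
    """, (as_of,)).fetchall()

print(reasons_as_of('2016-03-15'))  # HOSP1/A counted twice, bed 111 still reason A
print(reasons_as_of('2016-04-15'))  # bed 111 now reason D, so HOSP1 splits into A and D
```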
I have a database table containing two cost columns. I want to find the distinct costs across these two columns, along with the number of times each cost appears. The table may look like
|id|cost1|cost2|
|--|-----|-----|
|1 |50   |60   |
|2 |20   |50   |
|3 |50   |70   |
|4 |20   |30   |
|5 |50   |60   |
In this case I want a result that is distinct over both columns and count the number of times that appears. So the result I would like is
|distinctCost|count|
|20 |2 |
|30 |1 |
|50 |4 |
|60 |2 |
|70 |1 |
and ideally ordered
|distinctCost|count|
|50 |4 |
|60 |2 |
|20 |2 |
|70 |1 |
|30 |1 |
I can get the distinct over two columns by doing something like
select DISTINCT c FROM (SELECT cost1 AS c FROM my_costs UNION SELECT cost2 AS c FROM my_costs);
and I can get the count for each column by doing
select cost1, count(*)
from my_costs
group by cost1
order by count(*) desc;
My problem is how can I get the count for both columns? I am stuck on how to do the count over each individual column and then add it up.
Any pointers would be appreciated.
I am using Oracle DB.
Thanks
By combining your two queries:
select cost, count(*)
from
(
    SELECT id, cost1 AS cost FROM my_costs
    UNION ALL
    SELECT id, cost2 AS cost FROM my_costs
) v
group by cost
order by count(*) desc;
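The UNION ALL version can be verified against the sample data with a quick Python/SQLite sketch (a tiebreak on cost is added to the ORDER BY so the result order is deterministic; the original query leaves ties in arbitrary order):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE my_costs (id INT, cost1 INT, cost2 INT);
INSERT INTO my_costs VALUES (1,50,60), (2,20,50), (3,50,70), (4,20,30), (5,50,60);
""")

# Stack cost1 and cost2 into one column, then count occurrences of each value.
rows = conn.execute("""
SELECT cost, COUNT(*) AS cnt
FROM (SELECT id, cost1 AS cost FROM my_costs
      UNION ALL
      SELECT id, cost2 AS cost FROM my_costs) v
GROUP BY cost
ORDER BY cnt DESC, cost
""").fetchall()
print(rows)  # [(50, 4), (20, 2), (60, 2), (30, 1), (70, 1)]
```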
(If, when a row has cost1 and cost2 equal, you want to count it once rather than twice, change the UNION ALL to a UNION.)
You can use the UNPIVOT clause:
select *
from
(
    SELECT cost, count(*) as num_of_costs
    FROM my_costs
    UNPIVOT
    (
        cost FOR cost_num IN (cost1, cost2)
    )
    group by cost
)
order by num_of_costs desc;