Grouping by some value and time periods - sql

I'm using Entity Framework.
There is table A with columns id, someInt, someDateTime.
For example:
id | someInt | someDateTime
1 | 2 | 2014-03-11
2 | 2 | 2013-01-01
3 | 2 | 2013-01-02
4 | 1 | 2014-03-05
5 | 1 | 2014-03-06
Now I want some statistics: for each someInt value I want to know how many rows are at most 24 hours, 1 week, or 1 month old (for example). From the values above I would get:
someInt | 24h | 1 week | 1 month | any time
1 | 0 | 2 | 2 | 2
2 | 1 | 1 | 1 | 3
Is it possible in SQL and if so is it possible in Entity Framework? How should I make queries like this?

Group the records by someInt and then use sub-queries to count the rows for each period of time you are interested in:
DateTime now = DateTime.Now;
DateTime yesterday = now.AddDays(-1);
DateTime weekAgo = now.AddDays(-7);
DateTime monthAgo = now.AddDays(-30); // just assume you need 30 days
DateTime yearAgo = now.AddYears(-1); // AddYears does not throw on Feb 29
var query = from a in db.A
            group a by a.someInt into g
            select new {
                someInt = g.Key,
                lastDay = g.Count(a => a.someDateTime >= yesterday),
                lastWeek = g.Count(a => a.someDateTime >= weekAgo),
                lastMonth = g.Count(a => a.someDateTime >= monthAgo),
                lastYear = g.Count(a => a.someDateTime >= yearAgo),
                anyTime = g.Count()
            };
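As for the plain-SQL half of the question: the same counts can be produced with conditional aggregation, i.e. one SUM(CASE ...) per time window inside a single GROUP BY. The sketch below is only an illustration and not necessarily what Entity Framework generates. It assumes Python's built-in sqlite3 module, dates stored as ISO-8601 text, and a fixed "now" of 2014-03-11 12:00 (chosen so the sample rows fall into the windows shown in the question).

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER PRIMARY KEY, someInt INTEGER, someDateTime TEXT)")
conn.executemany(
    "INSERT INTO A VALUES (?, ?, ?)",
    [
        (1, 2, "2014-03-11"),
        (2, 2, "2013-01-01"),
        (3, 2, "2013-01-02"),
        (4, 1, "2014-03-05"),
        (5, 1, "2014-03-06"),
    ],
)

# Assumption: a fixed reference time instead of the real clock, so the output is repeatable.
now = datetime(2014, 3, 11, 12, 0, 0)
cutoffs = {
    "day": (now - timedelta(days=1)).isoformat(sep=" "),
    "week": (now - timedelta(days=7)).isoformat(sep=" "),
    "month": (now - timedelta(days=30)).isoformat(sep=" "),
}

# One pass over the table: each SUM counts only the rows newer than its cutoff.
rows = conn.execute(
    """
    SELECT someInt,
           SUM(CASE WHEN someDateTime >= :day   THEN 1 ELSE 0 END) AS last_day,
           SUM(CASE WHEN someDateTime >= :week  THEN 1 ELSE 0 END) AS last_week,
           SUM(CASE WHEN someDateTime >= :month THEN 1 ELSE 0 END) AS last_month,
           COUNT(*) AS any_time
    FROM A
    GROUP BY someInt
    ORDER BY someInt
    """,
    cutoffs,
).fetchall()

for row in rows:
    print(row)
```

With the sample data this gives the same numbers as the table in the question: (1, 0, 2, 2, 2) and (2, 1, 1, 1, 3).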

Related

SQL: Selecting data from multiple tables with multiple records to select and join

I have three tables: VolunteerRelationships, Organizations, and CampaignDates. I'm trying to write a query that will give me the organization id and name, and the org's start and end campaign dates for the current campaign year <#CampaignYear>, based on the selected volunteer <#SelectedInd>.
Dates are stored as separate column values for day, month and year, which I'm trying to cast into a properly formatted date value. If I can get this, I'd also like to use a case statement to get the status of the campaign based on whether the campaign dates are upcoming, currently running, or already closed, but I need to get the first part of the query working first.
Sorry if I'm leaving a lot of needed info out, this is my first time posting a question to this forum. Thank you!
VolunteerRelationships
id | name | managesId |expiryDate
1 | john | 1 |
2 | jack | 2 |6/30/2020
3 | jerry| 3 |12/31/2021
Organizations
id | name1
1 | ACME
CampaignDates
orgId | dateDay | dateMonth | dateYear | dateType | Campaign Year
1 | 5 | 11 | 2020 | Start | 2020
1 | 15 | 11 | 2020 | End | 2020
Result
orgId | orgName | startDate | endDate | Status
1 | ACME | 2020-11-05| 2020-11-15 | Closed
select
v.MANAGEDACCOUNT,
o.Name1,
select * from
(select cast(cast dateyear*1000 + datemonth*100 + dateday as varchar(255)) as date as date1 from <#Schema>.CampaignDates where datetype = 'Start' and campaignyear = <#CampaignYear> and orgaccountnumber = v.MANAGEDACCOUNT) d1,
(select cast(cast dateyear*1000 + datemonth*100 + dateday as varchar(255)) as date as date2 from <#Schema>.CampaignDates where datetype = 'End' and campaignyear = <#CampaignYear> and orgaccountnumber = v.MANAGEDACCOUNT) d2
from <#Schema>.VolunteerRelationships v
inner join <#Schema>.organizations o
on o.accountnumber=v.MANAGEDACCOUNT
where v.VOLUNTEERACCOUNT = <#SelectedInd> and ( v.EXPIRYDATE IS NULL OR v.EXPIRYDATE > <#Today> )
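One possible shape for this query is to join the three tables once and pivot the Start/End rows into two columns with MAX(CASE ...), building each date from its day, month and year parts. The sketch below is an illustration under several assumptions, not a drop-in answer: it runs on SQLite through Python's sqlite3 module, uses the simplified column names from the sample tables (not the MANAGEDACCOUNT/accountnumber names in the query above), stores expiryDate as ISO text, and passes "today", the campaign year and the volunteer id as ordinary parameters in place of the <#...> placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE VolunteerRelationships (id INTEGER, name TEXT, managesId INTEGER, expiryDate TEXT);
CREATE TABLE Organizations (id INTEGER, name1 TEXT);
CREATE TABLE CampaignDates (orgId INTEGER, dateDay INTEGER, dateMonth INTEGER,
                            dateYear INTEGER, dateType TEXT, campaignYear INTEGER);
INSERT INTO VolunteerRelationships VALUES (1, 'john', 1, NULL), (2, 'jack', 2, '2020-06-30');
INSERT INTO Organizations VALUES (1, 'ACME');
INSERT INTO CampaignDates VALUES (1, 5, 11, 2020, 'Start', 2020),
                                 (1, 15, 11, 2020, 'End', 2020);
""")

QUERY = """
SELECT orgId, orgName, startDate, endDate,
       CASE WHEN :today < startDate THEN 'Upcoming'
            WHEN :today > endDate   THEN 'Closed'
            ELSE 'Running' END AS status
FROM (
    -- Pivot the Start and End rows of each organization into two columns.
    SELECT o.id AS orgId, o.name1 AS orgName,
           MAX(CASE WHEN c.dateType = 'Start'
                    THEN printf('%04d-%02d-%02d', c.dateYear, c.dateMonth, c.dateDay) END) AS startDate,
           MAX(CASE WHEN c.dateType = 'End'
                    THEN printf('%04d-%02d-%02d', c.dateYear, c.dateMonth, c.dateDay) END) AS endDate
    FROM VolunteerRelationships v
    JOIN Organizations o ON o.id = v.managesId
    JOIN CampaignDates c ON c.orgId = o.id AND c.campaignYear = :year
    WHERE v.id = :volunteer
      AND (v.expiryDate IS NULL OR v.expiryDate > :today)
    GROUP BY o.id, o.name1
)
"""

# Hypothetical parameter values standing in for <#Today>, <#CampaignYear> and <#SelectedInd>.
rows = conn.execute(QUERY, {"today": "2021-01-01", "year": 2020, "volunteer": 1}).fetchall()
print(rows)
```

Because the dates are built as zero-padded YYYY-MM-DD text, plain string comparison is enough for the status check here; on SQL Server the date construction step would look different (for example DATEFROMPARTS).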

Select distinct values from one table and join with another table

I have a problem that I've spent way too much time trying to figure out, with close to no success at all. I'll try to describe the problem as well as I can, and use an example, which is the solution I use right now.
I have two different MS SQL tables.
Table 1:
itemNumber - 192031, 533853 etc.
date - the date the database post was added
quantity - the amount of items for each item number
Table 2:
MTITNO - also item number, contains many different item numbers (more than Table 1)
MTTRDT - the date the database post was added
MTTTYP - transaction type. I will be looking for MTTTYP = 11
MTTRQT - transaction quantity. I will be looking for MTTRQT < 0
So what I want to do is get the DISTINCT itemNumber values between two dates from Table 1. Once I have those item numbers, I would like to join Table 2 on item number, restricted to the same dates that I use in the query for Table 1. I also need to get only the rows from Table 2 where MTTTYP = 11 and MTTRQT < 0, and then sum MTTRQT.
I've solved this by using loops in Java code, which isn't that good to be honest. What I do is this:
SELECT DISTINCT itemNumber "itemNumber"
FROM Table 1
WHERE date BETWEEN #fromDate AND #toDate;
Take the top value from this result (that is the first item number) and then:
SELECT Sum(MTTRQT) "SUM_MTTRQT_"
FROM Table 2
WHERE MTITNO = "the first item number from the result query from above"
AND MTTTYP = 11
AND MTTRDT BETWEEN #fromDate AND #toDate
AND MITTRA.MTTRQT < 0
Add the result to a new list. Remove the item number used
Loop through all the item numbers in the list and run step 3 and 4 for every single item number (this is the bad part).
Surely there must be a SQL query that produces the same result!?
Appreciate any help I can get!
Update:
This is the data I have.
Table 1
|Item number | Quantity | date
192031 | 1 | 20190521
192031 | 1 | 20190522
19192301 | 2 | 20190521
19189507 | 1 | 20190523
19189507 | 1 | 20190521
19189507 | 1 | 20190524
Table 2
|MTITNO | MTTRDT | MTTTYP | MTTRQT
192031 | 20190520 | 11 | -1
192031 | 20190520 | 11 | -1
192031 | 20190520 | 11 | -1
192031 | 20190520 | 11 | -1
19189507 | 20190520 | 11 | -1
19189507 | 20190520 | 11 | -1
19189507 | 20190520 | 11 | -1
19189507 | 20190520 | 11 | -1
19189507 | 20190521 | 11 | -1
19189507 | 20190521 | 11 | -1
19189507 | 20190521 | 11 | -1
Table 2 contains all sorts of item numbers (that is item numbers that you can find in Table 2, but not in Table 1), and many more posts. There can be posts in Table 1 and no posts in Table 2 for one or more item numbers.
I want to summarise the MTTRQT for all items where the item number is in both Table 1 and Table 2 and within the date span I have set. The "amount used" in the desired result below is MTTRQT added up for every single item number.
Desired result
So if I look for all the item numbers with date between 20190520 - 20190524, I should get the list below.
"Item number" is supposed to be DISTINCT item numbers from Table 1.
"Amount used" is the SUM function, that sums MTTRQT where all the conditions are met.
|Item Number | Amount used
192031 | -4
19189507 | -7
Reading between the lines a bit, but is this not what you're after?
SELECT SUM(T2.MTTRQT) AS [SUM_MTTRQT_]
FROM [Table 2] T2
     JOIN (SELECT TOP (1)
                  T1.ItemNumber
           FROM [Table 1] T1
           WHERE T1.[date] BETWEEN #fromDate AND #toDate --Note, if [date] has a time portion, this is unlikely to work as you expect
           ORDER BY T1.ItemNumber) T1 ON T2.MTITNO = T1.ItemNumber --Assumed ORDER BY clause
WHERE T2.MTTTYP = 11
  AND T2.MTTRDT BETWEEN #fromDate AND #toDate --Note, if MTTRDT has a time portion, this is unlikely to work as you expect
  AND T2.MTTRQT < 0;
If I am following your logic correctly:
select sum(t2.mttrqt)
from table2 t2
where t2.mtitno in (select t1.itemNumber
                    from table1 t1
                    where t1.date >= #date1 and t1.date <= #date2
                   ) and
      t2.mttrdt >= #date1 and
      t2.mttrdt <= #date2 and
      t2.mtttyp = 11 and
      t2.mttrqt < 0;
Have you tried this:
SELECT Sum(MTTRQT) "SUM_MTTRQT_"
FROM [Table 2]
WHERE MTITNO IN (SELECT DISTINCT itemNumber
                 FROM [Table 1]
                 WHERE date BETWEEN #fromDate AND #toDate)
  AND MTTTYP = 11
  AND MTTRDT BETWEEN #fromDate AND #toDate
  AND MTTRQT < 0
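The queries above each return one grand total, whereas the desired result lists one row per item number. Adding the item number to the SELECT list together with a GROUP BY should give that shape. Below is a minimal sketch using the sample rows from the question; it assumes SQLite via Python's sqlite3 module, the table names Table1/Table2 (without the space), dates stored as yyyymmdd text, and named parameters in place of #fromDate/#toDate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (itemNumber INTEGER, quantity INTEGER, date TEXT);
CREATE TABLE Table2 (MTITNO INTEGER, MTTRDT TEXT, MTTTYP INTEGER, MTTRQT INTEGER);
""")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [
        (192031, 1, "20190521"),
        (192031, 1, "20190522"),
        (19192301, 2, "20190521"),
        (19189507, 1, "20190523"),
        (19189507, 1, "20190521"),
        (19189507, 1, "20190524"),
    ],
)
conn.executemany(
    "INSERT INTO Table2 VALUES (?, ?, ?, ?)",
    [(192031, "20190520", 11, -1)] * 4
    + [(19189507, "20190520", 11, -1)] * 4
    + [(19189507, "20190521", 11, -1)] * 3,
)

# Filter Table2 by type, sign and date, keep only items that also appear
# in Table1 within the same date span, then sum per item number.
rows = conn.execute(
    """
    SELECT t2.MTITNO AS item_number, SUM(t2.MTTRQT) AS amount_used
    FROM Table2 t2
    WHERE t2.MTTTYP = 11
      AND t2.MTTRQT < 0
      AND t2.MTTRDT BETWEEN :from_date AND :to_date
      AND t2.MTITNO IN (SELECT t1.itemNumber
                        FROM Table1 t1
                        WHERE t1.date BETWEEN :from_date AND :to_date)
    GROUP BY t2.MTITNO
    ORDER BY t2.MTITNO
    """,
    {"from_date": "20190520", "to_date": "20190524"},
).fetchall()

for item_number, amount_used in rows:
    print(item_number, amount_used)
```

Item 19192301 drops out because it has no matching rows in Table 2; if such items should appear with a total of 0, a LEFT JOIN from the distinct Table 1 items would be the variant to try.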

Correlate Sequences of Independent Events - Calculate Time Intersection

We are building a PowerBI reporting solution and I (well Stack) solved one problem and the business came up with a new reporting idea. Not sure of the best way to approach it as I know very little about PowerBI and the business seems to want quite complex reports.
We have two sequences of events from separate data sources. They both contain independent events occurring to vehicles. One describes what location a vehicle is within; the other describes incident events, each of which has a reason code for the incident. The business wants to report on time spent in each location for each reason. Vehicles can change location totally independently of the incident events occurring, and the events are actually datetimes that occur at random points throughout the day. Each type of event has a start time, an end time and a VehicleID.
Vehicle Location Events
+------------------+-----------+------------+-----------------+----------------+
| LocationDetailID | VehicleID | LocationID | StartDateTime | EndDateTime |
+------------------+-----------+------------+-----------------+----------------+
| 1 | 1 | 1 | 2012-1-1 | 2016-1-1 |
| 2 | 1 | 2 | 2016-1-1 | 2016-4-1 |
| 3 | 1 | 1 | 2016-4-1 | 2016-11-1 |
| 4 | 2 | 1 | 2011-1-1 | 2016-11-1 |
+------------------+-----------+------------+-----------------+----------------+
Vehicle Status Events
+---------+---------------+-------------+-----------+--------------+
| EventID | StartDateTime | EndDateTime | VehicleID | ReasonCodeID |
+---------+---------------+-------------+-----------+--------------+
| 1 | 2012-1-1 | 2013-1-1 | 1 | 1 |
| 2 | 2013-1-1 | 2015-1-1 | 1 | 3 |
| 3 | 2015-1-1 | 2016-5-1 | 1 | 4 |
| 4 | 2016-5-1 | 2016-11-1 | 1 | 2 |
| 5 | 2015-9-1 | 2016-2-1 | 2 | 1 |
+---------+---------------+-------------+-----------+--------------+
Is there any way I can correlate the two streams and calculate total time per Vehicle per ReasonCode per Location? This would seem to require me to relate the two kinds of event to each other, since a change of location may occur part way through a given ReasonCode.
Calculation Example ReasonCodeID 4
VehicleID 1 is in LocationID 1 from 2012-1-1 to 2016-1-1 and from 2016-4-1 to 2016-11-1.
VehicleID 1 is in LocationID 2 from 2016-1-1 to 2016-4-1.
VehicleID 1 has ReasonCodeID 4 from 2015-1-1 to 2016-5-1.
Therefore the first period in location 1 intersects with 365 days of ReasonCodeID 4 (2015-1-1 to 2016-1-1), and the second period in location 1 intersects with 30 days (2016-4-1 to 2016-5-1), giving 395 days in total.
The period in location 2 intersects with 91 days of ReasonCodeID 4 (2016-1-1 to 2016-4-1).
Desired output would be the below.
+-----------+--------------+------------+------------+
| VehicleID | ReasonCodeID | LocationID | Total Days |
+-----------+--------------+------------+------------+
| 1 | 1 | 1 | 366 |
| 1 | 3 | 1 | 730 |
| 1 | 4 | 1 | 395 |
| 1 | 4 | 2 | 91 |
| 1 | 2 | 1 | 184 |
| 2 | 1 | 1 | 154 |
+-----------+--------------+------------+------------+
I have created a SQL fiddle that shows the structure here
Vehicles have related tables and I'm sure the business will want them grouped by vehicle class etc but if I can understand how to calculate the intersection points in this case that would give me the basis for rest of reporting.
I think this solution requires a CROSS JOIN implementation. The relationship between both tables is many-to-many, which would normally imply creating a third table that bridges the LocationEvents and VehicleStatusEvents tables, so I think specifying the relationship in the expression is easier.
I use a CROSS JOIN between both tables, then filter the results to keep only the rows whose VehicleID values are the same in both tables. I also keep only the rows where the VehicleStatusEvents date range intersects the LocationEvents date range.
Once the filtering is done I add a column that calculates the number of days in each intersection. Finally, the measure sums up the days for each VehicleID, ReasonCodeID and LocationID.
In order to implement the CROSS JOIN you will have to rename the VehicleID, StartDateTime and EndDateTime columns in both tables. This is necessary to avoid ambiguous column name errors.
I rename the columns as follows:
VehicleID : LocationVehicleID and StatusVehicleID
StartDateTime : LocationStartDateTime and StatusStartDateTime
EndDateTime : LocationEndDateTime and StatusEndDateTime
After this you can use CROSSJOIN in the Total Days measure:
Total Days =
SUMX (
    FILTER (
        ADDCOLUMNS (
            FILTER (
                CROSSJOIN ( LocationEvents, VehicleStatusEvents ),
                LocationEvents[LocationVehicleID] = VehicleStatusEvents[StatusVehicleID]
                    && LocationEvents[LocationStartDateTime] <= VehicleStatusEvents[StatusEndDateTime]
                    && LocationEvents[LocationEndDateTime] >= VehicleStatusEvents[StatusStartDateTime]
            ),
            "CountOfDays", IF (
                [LocationStartDateTime] <= [StatusStartDateTime]
                    && [LocationEndDateTime] >= [StatusEndDateTime],
                DATEDIFF ( [StatusStartDateTime], [StatusEndDateTime], DAY ),
                IF (
                    [LocationStartDateTime] > [StatusStartDateTime]
                        && [LocationEndDateTime] >= [StatusEndDateTime],
                    DATEDIFF ( [LocationStartDateTime], [StatusEndDateTime], DAY ),
                    IF (
                        [LocationStartDateTime] <= [StatusStartDateTime]
                            && [LocationEndDateTime] <= [StatusEndDateTime],
                        DATEDIFF ( [StatusStartDateTime], [LocationEndDateTime], DAY ),
                        IF (
                            [LocationStartDateTime] >= [StatusStartDateTime]
                                && [LocationEndDateTime] <= [StatusEndDateTime],
                            DATEDIFF ( [LocationStartDateTime], [LocationEndDateTime], DAY ),
                            BLANK ()
                        )
                    )
                )
            )
        ),
        LocationEvents[LocationID] = [LocationID]
            && VehicleStatusEvents[ReasonCodeID] = [ReasonCodeID]
    ),
    [CountOfDays]
)
Then in Power BI you can build a matrix (or any other visualization) using this measure.
If the measure expression is not completely clear, here is the T-SQL translation:
SELECT
    dt.VehicleID,
    dt.ReasonCodeID,
    dt.LocationID,
    SUM(dt.Diff) [Total Days]
FROM
(
    SELECT
        CASE
            WHEN a.StartDateTime <= b.StartDateTime AND a.EndDateTime >= b.EndDateTime -- Inside range
                THEN DATEDIFF(DAY, b.StartDateTime, b.EndDateTime)
            WHEN a.StartDateTime > b.StartDateTime AND a.EndDateTime >= b.EndDateTime -- |-----|*****|....|
                THEN DATEDIFF(DAY, a.StartDateTime, b.EndDateTime)
            WHEN a.StartDateTime <= b.StartDateTime AND a.EndDateTime <= b.EndDateTime -- |...|****|-----|
                THEN DATEDIFF(DAY, b.StartDateTime, a.EndDateTime)
            WHEN a.StartDateTime >= b.StartDateTime AND a.EndDateTime <= b.EndDateTime -- |---|****|-----
                THEN DATEDIFF(DAY, a.StartDateTime, a.EndDateTime)
        END Diff,
        a.VehicleID,
        b.ReasonCodeID,
        a.LocationID --a.StartDateTime, a.EndDateTime, b.StartDateTime, b.EndDateTime
    FROM LocationEvents a
        CROSS JOIN VehicleStatusEvents b
    WHERE a.VehicleID = b.VehicleID
        AND
        (
            (a.StartDateTime <= b.EndDateTime)
            AND (a.EndDateTime >= b.StartDateTime)
        )
) dt
GROUP BY dt.VehicleID,
    dt.ReasonCodeID,
    dt.LocationID
Note in T-SQL you could use an INNER JOIN operator too.
Let me know if this helps.
select coalesce(l.VehicleID,s.VehicleID) as VehicleID
,s.ReasonCodeID
,l.LocationID
,sum
(
datediff
(
day
,case when s.StartDateTime > l.StartDateTime then s.StartDateTime else l.StartDateTime end
,case when s.EndDateTime < l.EndDateTime then s.EndDateTime else l.EndDateTime end
)
) as TotalDays
from VehicleLocationEvents as l
full join VehicleStatusEvents as s
on s.VehicleID =
l.VehicleID
and case when s.StartDateTime > l.StartDateTime then s.StartDateTime else l.StartDateTime end <=
case when s.EndDateTime < l.EndDateTime then s.EndDateTime else l.EndDateTime end
group by coalesce(l.VehicleID,s.VehicleID)
,s.ReasonCodeID
,l.LocationID
or
select VehicleID
,ReasonCodeID
,LocationID
,sum (datediff (day,max_StartDateTime,min_EndDateTime)) as TotalDays
from (select coalesce(l.VehicleID,s.VehicleID) as VehicleID
,s.ReasonCodeID
,l.LocationID
,case when s.StartDateTime > l.StartDateTime then s.StartDateTime else l.StartDateTime end as max_StartDateTime
,case when s.EndDateTime < l.EndDateTime then s.EndDateTime else l.EndDateTime end as min_EndDateTime
from VehicleLocationEvents as l
full join VehicleStatusEvents as s
on s.VehicleID =
l.VehicleID
) ls
where max_StartDateTime <= min_EndDateTime
group by VehicleID
,ReasonCodeID
,LocationID
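All of the queries above boil down to the same interval arithmetic: two periods overlap when the later of the two starts is before the earlier of the two ends, and the overlap length is the difference between those two values. The following is a small sketch of just that idea on the sample data, assuming SQLite via Python's sqlite3 module, ISO-formatted date text, and the two-argument scalar MAX()/MIN() and julianday() functions that SQLite provides (T-SQL would use CASE expressions and DATEDIFF as shown above).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LocationEvents (VehicleID INTEGER, LocationID INTEGER,
                             StartDateTime TEXT, EndDateTime TEXT);
CREATE TABLE VehicleStatusEvents (VehicleID INTEGER, ReasonCodeID INTEGER,
                                  StartDateTime TEXT, EndDateTime TEXT);
INSERT INTO LocationEvents VALUES
  (1, 1, '2012-01-01', '2016-01-01'),
  (1, 2, '2016-01-01', '2016-04-01'),
  (1, 1, '2016-04-01', '2016-11-01'),
  (2, 1, '2011-01-01', '2016-11-01');
INSERT INTO VehicleStatusEvents VALUES
  (1, 1, '2012-01-01', '2013-01-01'),
  (1, 3, '2013-01-01', '2015-01-01'),
  (1, 4, '2015-01-01', '2016-05-01'),
  (1, 2, '2016-05-01', '2016-11-01'),
  (2, 1, '2015-09-01', '2016-02-01');
""")

# Overlap of two periods = [later start, earlier end); keep it only if non-empty.
rows = conn.execute("""
SELECT l.VehicleID, s.ReasonCodeID, l.LocationID,
       CAST(SUM(julianday(MIN(l.EndDateTime, s.EndDateTime))
              - julianday(MAX(l.StartDateTime, s.StartDateTime))) AS INTEGER) AS TotalDays
FROM LocationEvents l
JOIN VehicleStatusEvents s
  ON s.VehicleID = l.VehicleID
 AND MAX(l.StartDateTime, s.StartDateTime) < MIN(l.EndDateTime, s.EndDateTime)
GROUP BY l.VehicleID, s.ReasonCodeID, l.LocationID
ORDER BY l.VehicleID, s.ReasonCodeID, l.LocationID
""").fetchall()

for row in rows:
    print(row)
```

For vehicle 1 this reproduces the totals in the desired output (366, 184, 730, 395 and 91 days). For vehicle 2 the exclusive-end day count from 2015-09-01 to 2016-02-01 comes out as 153 rather than the 154 listed in the question, so it is worth deciding whether end dates should be counted inclusively.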

SQLlite strftime function to get grouped data by months

I have a table with the following structure and data:
I would like to get the data grouped by month for a given date range (for example from 2014-01-01 to 2014-12-31). Some months may have no data, but I still need a row in the result showing that the count for that month is 0.
Result should have following format:
MONTH | DIALS_CNT | APPT_CNT | CONVERS_CNT | CANNOT_REACH_CNT |
2014-01 | 100 | 50 | 20 | 30 |
2014-02 | 100 | 40 | 30 | 30 |
2014-03 | 0 | 0 | 0 | 0 |
etc..
WHERE
APPT_CNT = WHERE call.result = APPT
CONVERS_CNT = WHERE call.result = CONV_NO_APPT
CANNOT_REACH_CNT = WHERE call.result = CANNOT_REACH
How can I do this using the strftime function?
Many thanks for any help or example.
SELECT Month,
(SELECT COUNT(*)
FROM MyTable
WHERE date LIKE Month || '%'
) AS Dials_Cnt,
(SELECT SUM(Call_Result = 'APPT')
FROM MyTable
WHERE date LIKE Month || '%'
) AS Appt_Cnt,
...
FROM (SELECT '2014-01' AS Month UNION ALL
SELECT '2014-02' UNION ALL
SELECT '2014-03' UNION ALL
...
SELECT '2014-12')
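Since the question asks about strftime specifically, here is an alternative sketch that generates the month list with a recursive CTE instead of spelling out twelve SELECTs, and matches rows with strftime('%Y-%m', ...). It is an illustration under assumptions: the table and column names (MyTable, date, Call_Result) are taken from the query above rather than from the original table, the three sample rows are made up, dates are stored as ISO text, and it runs through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MyTable (date TEXT, Call_Result TEXT);
-- Hypothetical sample rows, only to make the sketch runnable.
INSERT INTO MyTable VALUES
  ('2014-01-05', 'APPT'),
  ('2014-01-20', 'CANNOT_REACH'),
  ('2014-02-11', 'CONV_NO_APPT');
""")

rows = conn.execute("""
WITH RECURSIVE months(first_day) AS (
    -- First day of each month in the requested range.
    SELECT date(:first, 'start of month')
    UNION ALL
    SELECT date(first_day, '+1 month')
    FROM months
    WHERE first_day < date(:last, 'start of month')
)
SELECT strftime('%Y-%m', m.first_day)                       AS Month,
       COUNT(c.date)                                        AS Dials_Cnt,
       COALESCE(SUM(c.Call_Result = 'APPT'), 0)             AS Appt_Cnt,
       COALESCE(SUM(c.Call_Result = 'CONV_NO_APPT'), 0)     AS Convers_Cnt,
       COALESCE(SUM(c.Call_Result = 'CANNOT_REACH'), 0)     AS Cannot_Reach_Cnt
FROM months m
LEFT JOIN MyTable c
       ON strftime('%Y-%m', c.date) = strftime('%Y-%m', m.first_day)
GROUP BY m.first_day
ORDER BY m.first_day
""", {"first": "2014-01-01", "last": "2014-12-31"}).fetchall()

for row in rows:
    print(row)
```

The LEFT JOIN keeps months without any calls, and COUNT(c.date) plus COALESCE turn the missing values into zeros, so every month from 2014-01 to 2014-12 appears exactly once.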

Postgres: How to use value of field in subquery

I have a DB table like this:
id | user | service | pay_date | status
---+------+---------+--------------+------------
1 | john | 1 | 2014-08-20 | 1
2 | john | 3 | 2014-07-24 | 0
I am trying to build a query which returns all users who have not paid for service 1 or service 3 between two dates and who have no active services (status = 0 for service 1 and service 2). The problem is that the query returns service 3, which is not correct because service 1 is active.
SELECT *
FROM services
WHERE date > 2014-07-01
AND date < 2014-07-30
AND service = 1 OR service = 3
My idea is to check whether service 3 is active while I'm checking service 1. Something like this:
SELECT *
FROM services
WHERE date > 2014-07-01
AND date < 2014-07-30
AND ((service = 1 AND "SERVICE 3 IS NOT ACTIVE") OR service = 3)
but I can't check whether "SERVICE 3 IS NOT ACTIVE" holds for the current user.
I think you want NOT EXISTS, i.e. checking that another active record does not exist for the same user, possibly something like:
SELECT *
FROM Services AS s
WHERE s.Service IN (1, 3)
AND s.pay_date > '2014-07-01'
AND s.pay_date < '2014-07-30'
AND NOT EXISTS
    ( SELECT 1
      FROM services AS s2
      WHERE s2."user" = s."user"
      AND s2.Status = 1
      AND s2.Service IN (1, 3)
      AND s2.pay_date > '2014-07-01'
      AND s2.pay_date < '2014-07-30'
    );
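To see the NOT EXISTS pattern run end to end, here is a small sketch on the two sample rows plus one made-up row for a second user. It assumes SQLite via Python's sqlite3 module rather than Postgres, quotes the user column because user is a reserved word in Postgres, and, as a guess at the intent, leaves the date filter out of the subquery so that an active service paid outside the date range still excludes the user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE services (id INTEGER, "user" TEXT, service INTEGER,
                       pay_date TEXT, status INTEGER);
INSERT INTO services VALUES
  (1, 'john', 1, '2014-08-20', 1),
  (2, 'john', 3, '2014-07-24', 0);
-- Hypothetical extra user with no active service at all.
INSERT INTO services VALUES
  (3, 'mary', 3, '2014-07-10', 0);
""")

rows = conn.execute("""
SELECT s.id, s."user", s.service
FROM services AS s
WHERE s.service IN (1, 3)
  AND s.pay_date >= '2014-07-01'
  AND s.pay_date <  '2014-08-01'
  -- Exclude users who have any active row for service 1 or 3.
  AND NOT EXISTS (
        SELECT 1
        FROM services AS s2
        WHERE s2."user" = s."user"
          AND s2.service IN (1, 3)
          AND s2.status = 1
  )
ORDER BY s.id
""").fetchall()

print(rows)
```

Here john's service 3 row is filtered out because his service 1 is active, while mary's row is returned. If the active service should only count when it was paid inside the date range, the date conditions can be repeated inside the subquery.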
There is no single record that meets all those criteria.
The id 1 record cannot be both service = 1 and service = 3 at the same time:
id | user | service | pay_date | status
---+------+---------+--------------+------------
1 | john | 1 | 2014-08-20 | 1
The id 2 record cannot be both service = 1 and service = 3 at the same time:
id | user | service | pay_date | status
---+------+---------+--------------+------------
2 | john | 3 | 2014-08-24 | 0 -- nb: I changed this date!
So your criteria require comparing record 1 to record 2 (or, more broadly, comparing across records). Below I use EXISTS to compare across records.
SELECT
*
FROM services
WHERE (pay_date >= '2014-08-01'
AND pay_date < '2014-08-30')
AND service = 1
AND EXISTS (
SELECT
NULL
FROM services s3
WHERE (s3.pay_date >= '2014-08-01'
AND s3.pay_date < '2014-08-30')
AND s3.service = 3
and s3.status = 1
AND s3.user = services.user
)
see SQLfiddle Demo
By the way, it is more conventional to express a date range with >= and <.