Access 2010 here.
Back with another puzzler. I have this query:
SELECT DischargeDatabase.Date, Avg([pH]) AS [pH Value], Avg([Temperature]) AS [Temperature (°C)], Avg([ZincLevel]) AS [Zinc (mg/l)], Sum([Effluent]) AS [Discharge (gal)], Count(*) AS [# Discharges]
FROM DischargeDatabase
WHERE DischargeDatabase.Date Between Forms!QueryForm!TextCriteriaQ0A And Forms!QueryForm!TextCriteriaQ0B
GROUP BY DischargeDatabase.Date;
from a waste water treatment database that I've been building. This gives a by-day summary of waste water discharges, averaging the pH, Temperature, and zinc levels, and summing the discharge volume (effluent). The user selects a range in two text boxes on the "QueryForm" with date pickers, and runs the query.
What is shown is discharges, grouped by day, for the date range, and only days that had discharges are listed. What a user has requested is for every day in the range selected to be shown, and those days without records in the "DischargeDatabase" just have zeros for the field values.
i.e. from this (date range 4/11/2013 to 4/16/2013, over a weekend):
Date      | pH Value | Temperature (°C) | Zinc (mg/l) | Discharge (gal) | # Discharges
4/11/2013 | 9.5      | 18.6             | 0.89        | 5000            | 5
4/12/2013 | 9.1      | 17.9             | 1.68        | 3000            | 2
4/15/2013 | 8.9      | 19.6             | 1.47        | 10000           | 7
4/16/2013 | 9.6      | 18.2             | 0.35        | 1500            | 1
to this:
Date      | pH Value | Temperature (°C) | Zinc (mg/l) | Discharge (gal) | # Discharges
4/11/2013 | 9.5      | 18.6             | 0.89        | 5000            | 5
4/12/2013 | 9.1      | 17.9             | 1.68        | 3000            | 2
4/13/2013 | 0.0      | 0.0              | 0.0         | 0               | 0
4/14/2013 | 0.0      | 0.0              | 0.0         | 0               | 0
4/15/2013 | 8.9      | 19.6             | 1.47        | 10000           | 7
4/16/2013 | 9.6      | 18.2             | 0.35        | 1500            | 1
This is all so that the user can paste the query results into an Excel spreadsheet without issue. I'm not even sure that this is possible, or within the scope of a query (you are "selecting" records that don't exist). What might work is some sort of join with a bogus table/query pre-filled with zeros?
Thank you for the help and any ideas!
This could be fairly easy with a calendar table. You can build your own using custom CreateTable_calendar and LoadCalendar procedures.
Create a query which filters the calendar table based on the date range and LEFT JOIN it to your other table. (I simplified the SELECT field list in this example.)
SELECT
c.the_date,
Count(ddb.Date) AS [# Discharges]
FROM
tblCalendar AS c
LEFT JOIN DischargeDatabase AS ddb
ON c.the_date = ddb.Date
WHERE
c.the_date Between
Forms!QueryForm!TextCriteriaQ0A
And Forms!QueryForm!TextCriteriaQ0B
GROUP BY c.the_date;
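Extending the simplified query back to the full field list from the question might look like the sketch below. It assumes the tblCalendar table and the_date column from this answer, and it uses Nz(), the Access function that substitutes a value for Null, so that days with no discharges show zeros. Nz() is only available when the query runs inside Access itself.

```sql
SELECT
    c.the_date,
    Nz(Avg(ddb.pH), 0) AS [pH Value],
    Nz(Avg(ddb.Temperature), 0) AS [Temperature (°C)],
    Nz(Avg(ddb.ZincLevel), 0) AS [Zinc (mg/l)],
    Nz(Sum(ddb.Effluent), 0) AS [Discharge (gal)],
    Count(ddb.[Date]) AS [# Discharges]  -- counts only matched rows, so empty days give 0
FROM
    tblCalendar AS c
    LEFT JOIN DischargeDatabase AS ddb
    ON c.the_date = ddb.[Date]
WHERE
    c.the_date Between
        Forms!QueryForm!TextCriteriaQ0A
        And Forms!QueryForm!TextCriteriaQ0B
GROUP BY c.the_date;
```

Count(ddb.[Date]) rather than Count(*) matters here: Count(*) would report 1 for a calendar day that has no matching discharge rows.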
You are on the right track with your last statement! This kind of thing - fill all the groups even when there is no data - can be done with what we call a numbers table or tally table, which can be a CTE, or a real table, whatever you want to do. You can expand the CTE to generate dates...
declare @Max int
set @Max = 30 --how many numbers to generate
;WITH CTE AS (
SELECT 1 as Num
UNION ALL
SELECT Num + 1 FROM CTE WHERE Num < @Max
)
SELECT * FROM CTE
This pattern can be expanded to generate your dates...
declare @startDate datetime
set @startDate = getdate() --to start from today
;WITH CTE AS (
SELECT @startDate as myDate
UNION ALL
SELECT dateadd(day, 1, myDate) as myDate FROM CTE WHERE myDate < dateadd(day, 30, @startDate)
)
SELECT myDate FROM CTE
Now, you can use that CTE as the left table in a left outer join. Access does not support CTEs, so there this will need to be a real table. Just create one and manually populate it with numbers; you only have to do this once.
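In Access, turning such a numbers table into the list of dates for the selected range could be sketched roughly as follows. The table tblNumbers with a single column Num holding 1, 2, 3, ... is a hypothetical name, not something from the question; the form controls are the ones used in the original query.

```sql
SELECT
    DateAdd("d", n.Num - 1, CDate(Forms!QueryForm!TextCriteriaQ0A)) AS the_date
FROM
    tblNumbers AS n
WHERE
    -- one row per day in the range, inclusive of both end dates
    n.Num <= DateDiff("d", CDate(Forms!QueryForm!TextCriteriaQ0A),
                           CDate(Forms!QueryForm!TextCriteriaQ0B)) + 1;
```

Saved as its own query, this can serve as the left side of the join to DischargeDatabase. The numbers table needs at least as many rows as the longest date range a user is expected to pick.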
I have a list of unique ID's in one table that has a date column. Example:
TABLE1
ID Date
0 2018-01-01
1 2018-01-05
2 2018-01-15
3 2018-01-06
4 2018-01-09
5 2018-01-12
6 2018-01-15
7 2018-01-02
8 2018-01-04
9 2018-02-25
Then in another table I have a list of different values that appear multiple times for each ID with various dates.
TABLE 2
ID Value Date
0 18 2017-11-28
0 24 2017-12-29
0 28 2018-01-06
1 455 2018-01-03
1 468 2018-01-16
2 55 2018-01-03
3 100 2017-12-27
3 110 2018-01-04
3 119 2018-01-10
3 128 2018-01-30
4 223 2018-01-01
4 250 2018-01-09
4 258 2018-01-11
etc
I want to find the value in table 2 that is closest to the unique date in table 1.
Sometimes table 2 does contain a value that matches the date exactly and I have had no problem in pulling through those values. But I can't work out the code to pull through the value closest to the date requested from table 1.
My desired result based on the examples above would be
ID Value Date
0 24 2017-12-29
1 455 2018-01-03
2 55 2018-01-03
3 110 2018-01-04
4 250 2018-01-09
Since I can easily find the ID's with an exact match, one thing I have tried is taking the ID's that don't have an exact date match and placing them with their corresponding values into a temporary table. Then trying to find the values where I need the closest possible match, but it's here that I'm not sure where to begin on the coding of that.
Apologies if I'm missing a basic function or clause for this, I'm still learning!
The below would be one method:
WITH Table1 AS(
SELECT ID, CONVERT(date, datecolumn) DateColumn
FROM (VALUES (0,'20180101'),
(1,'20180105'),
(2,'20180115'),
(3,'20180106'),
(4,'20180109'),
(5,'20180112'),
(6,'20180115'),
(7,'20180102'),
(8,'20180104'),
(9,'20180225')) V(ID, DateColumn)),
Table2 AS(
SELECT ID, [value], CONVERT(date, datecolumn) DateColumn
FROM (VALUES (0,18 ,'2017-11-28'),
(0,24 ,'2017-12-29'),
(0,28 ,'2018-01-06'),
(1,455,'2018-01-03'),
(1,468,'2018-01-16'),
(2,55 ,'2018-01-03'),
(3,100,'2017-12-27'),
(3,110,'2018-01-04'),
(3,119,'2018-01-10'),
(3,128,'2018-01-30'),
(4,223,'2018-01-01'),
(4,250,'2018-01-09'),
(4,258,'2018-01-11')) V(ID, [Value],DateColumn))
SELECT T1.ID,
T2.[Value],
T2.DateColumn
FROM Table1 T1
CROSS APPLY (SELECT TOP 1 *
FROM Table2 ca
WHERE T1.ID = ca.ID
ORDER BY ABS(DATEDIFF(DAY, ca.DateColumn, T1.DateColumn))) T2;
Note that if the difference in days is the same, the row returned is arbitrary (and could differ each time the query is run). For example, if Table1 had the date 20180804 and Table2 had the dates 20180803 and 20180805, both would have the value 1 for ABS(DATEDIFF(DAY, ca.DateColumn, T1.DateColumn)). You might therefore need to include additional logic in your ORDER BY to ensure consistent results.
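One possible tie-breaker, sketched under the assumption that the earlier of two equally distant dates should win (the question does not say which is preferred), is to add the date itself as a second sort key. Table1 and Table2 are assumed to be available as in the query above.

```sql
SELECT T1.ID,
       T2.[Value],
       T2.DateColumn
FROM Table1 T1
     CROSS APPLY (SELECT TOP 1 *
                  FROM Table2 ca
                  WHERE T1.ID = ca.ID
                  ORDER BY ABS(DATEDIFF(DAY, ca.DateColumn, T1.DateColumn)),
                           ca.DateColumn  -- assumption: on a tie, take the earlier date
                 ) T2;
```

Sorting by ca.DateColumn DESC instead would prefer the later date. If Table2 can hold several rows for the same ID and date, a further key would still be needed for a fully deterministic result.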
dude.
I'll say a couple of things here for you to consider, since SQL Server is not my comfort zone, while SQL itself is.
First of all, I'd join TABLE1 with TABLE2 per ID. That way, I can specify on my SELECT clause the following tuple:
SELECT T1.ID, T2.Value, DateDiff(d, T1.Date, T2.Date) qt_diff_days
Obviously, depending on the precision of the dates kept there, whether they have times or not, you can change the datepart argument of the DateDiff function.
Going forward, I'd also make this date difference an absolute number (to resolve positive / negative differences and consider only the elapsed time).
After that, and that's where it gets tricky because I don't know the SQL Server version you're using, but basically I'd use a ROW_NUMBER window function to rank all my lines per difference. Something like the following:
SELECT
T1.ID, T2.Value, Abs(DateDiff(d, T1.Date, T2.Date)) qt_diff_days,
ROW_NUMBER() OVER(PARTITION BY T1.ID ORDER BY Abs(DateDiff(d, T1.Date, T2.Date)) ASC) nu_row
ROW_NUMBER (Transact-SQL)
Numbers the output of a result set. More specifically, returns the sequential number of a row within a partition of a result set, starting at 1 for the first row in each partition.
If you can run ROW_NUMBER properly, you should notice that the query ranks its data per ID, starting at 1 and increasing the rank as the difference between the two dates grows, resetting the rank to 1 when the ID changes.
After that, all you need to do is select only those lines where nu_row equals 1. I'd use a CTE for that.
WITH common_table_expression (Transact-SQL)
Specifies a temporary named result set, known as a common table expression (CTE).
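Putting those pieces together, a complete query might look like the sketch below. It assumes the tables are named TABLE1 and TABLE2 with the ID, Value, and Date columns shown in the question, and a SQL Server version that supports ROW_NUMBER (2005 or later); the CTE name ranked is made up for illustration.

```sql
WITH ranked AS (
    SELECT T1.ID,
           T2.[Value],
           T2.[Date],
           -- rank each ID's Table2 rows by distance from the Table1 date
           ROW_NUMBER() OVER (PARTITION BY T1.ID
                              ORDER BY ABS(DATEDIFF(DAY, T1.[Date], T2.[Date])) ASC) AS nu_row
    FROM TABLE1 T1
         INNER JOIN TABLE2 T2
             ON T2.ID = T1.ID
)
SELECT ID, [Value], [Date]
FROM ranked
WHERE nu_row = 1;  -- keep only the closest row per ID
```

As with any ranking on the day difference alone, two rows at the same distance are ordered arbitrarily unless a second ORDER BY key is added.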
I am trying to translate a query that I wrote in T-SQL to DAX and am having a tough time figuring out how to use a subquery in DAX.
Here is my source SQL code:
Select [DayOfWeekNumber], [DayOfWeek], Ratio=Avg (100*Opened/ Sent) From
(select [DayOfWeekNumber], [DayOfWeek], e.Schedule_ID
, Count (e.opened) as Opened
, Sum (e.NoOfEmailSent) as Sent
from
Events e
join dim_date D on d.ID_Date = e.ID_Date
where
e.AccountNumber = 1
group by d.[DayOfWeek], [DayOfWeekNumber], e.Schedule_ID
having
Sum (e.NoOfEmailSent) > Count (e.opened)
) OBSHour
group by [DayOfWeekNumber], [DayOfWeek]
Order by [DayOfWeekNumber]
The purpose of this query is to calculate open ratio of number of opened emails vs number of emails sent by first calculating opened / Sent for each email schedule and weekdays and then taking average of all ratios for the same weekday.
For example if data is like this for a Sunday:
Schedule_ID Open Sent Ratio
123 10 100 .1
125 2 10 .2
129 1 4 .25
Then final ratio for Sunday will become (.1+.2+.25)/3=.18
I appreciate your help.
Given the following table (much simplified for the purposes of this question):
id perPeriod actuals createdDate
---------------------------------------------------------
1 14 22 2011-10-04 00:00:00.000
2 14 9 2011-10-04 00:00:00.000
3 14 3 2011-10-03 00:00:00.000
4 14 5 2011-10-03 00:00:00.000
I need a query that gives me the average daily "actuals" figure. Note, however, that there are TWO RECORDS PER DAY (often more), so I can't just do AVG(actuals).
Also, if the daily "actuals" average exceeds the daily "perPeriod" average, I want to take the perPeriod value instead of the "average" value. Thus, in the case of the first two records: The actuals average for 4th October is (22+9) / 2 = 15.5. And the perPeriod average for the same day is (14 + 14) / 2 = 14. Now, 15.5 is greater than 14, so the daily "actuals" average for that day should be the "perPeriod" average.
Hope that makes sense. Any pointers greatly appreciated.
EDIT
I need an overall daily average, not an average per date. As I said, I would love to just do AVG(actuals) on the entire table, but the complicating factor is that a particular day can occupy more than one row, which would skew the results.
Is this what you want?
First, if the second payperiod average needed to be the average across a different grouping (It doesn't in this case), then you would need to use a subquery like this:
Select t.CreatedDate,
Case When Avg(actuals) < p.PayPeriodAvg
Then Avg(actuals) Else p.PayPeriodAvg End Average
From table1 t Join
(Select CreatedDate, Avg(perPeriod) PayPeriodAvg
From table1
Group By CreatedDate) as p
On p.CreatedDate = t.CreatedDate
Group By t.CreatedDate, p.PayPeriodAvg
or, in this case, since the PayPeriod Average is grouped on the same thing, (CreatedDate) as the actuals average, you don't need a subquery, so even easier:
Select t.CreatedDate,
Case When Avg(actuals) < Avg(perPeriod)
Then Avg(actuals) Else Avg(perPeriod) End Average
From table1 t
Group By t.CreatedDate
with your sample data, both of these return
CreatedDate Average
----------------------- -----------
2011-10-03 00:00:00.000 4
2011-10-04 00:00:00.000 14
SELECT DAY(createdDate), MONTH(createdDate), YEAR(createdDate),
CASE WHEN AVG(actuals) > MAX(perPeriod) THEN MAX(perPeriod) ELSE AVG(actuals) END
FROM MyTable
GROUP BY DAY(createdDate), MONTH(createdDate), YEAR(createdDate)
Try this out:
select createdDate,
case
when AVG(actuals) > max(perPeriod) then max(perPeriod)
else AVG(actuals)
end
from SomeTestTable
group by createdDate
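The queries above return one row per date, while the EDIT in the question asks for a single overall daily average. One way to get there, sketched here with the SomeTestTable name from the query above, is to wrap the per-day query in a derived table and average its result. Multiplying by 1.0 is an addition that avoids integer division, on the assumption that actuals and perPeriod are integer columns.

```sql
SELECT AVG(daily.DailyValue) AS OverallDailyAverage
FROM (
    -- one capped value per day
    SELECT createdDate,
           CASE
               WHEN AVG(1.0 * actuals) > AVG(1.0 * perPeriod) THEN AVG(1.0 * perPeriod)
               ELSE AVG(1.0 * actuals)
           END AS DailyValue
    FROM SomeTestTable
    GROUP BY createdDate
) AS daily;
```

With the four sample rows, the per-day values are 4 (3 October) and 14 (4 October, capped from 15.5), so the overall average comes out at 9.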
Let's say there is this table which stores the number of visitors for each day.
When I want to query the table and create a graph from it a problem arises.
The days without activity have no corresponding rows on the table.
For example
Day1 - 7
Day2 - 8
Day4 - 7
And the graph generated would not be correct. Since it needs a 0 for Day3.
Now, without using anything other than SQL is it possible to create those values for the inactivity days?
I thought of creating another table which would hold all the dates for the next 30 days, refreshed each time the script gets executed, and that would fix the problem, but I'm wondering if there is a more practical solution.
Thanks in advance.
Your solution of creating a table with the 30 days is a very simple and practical solution.
You can do it without an extra table if you really want to, but it's not pleasant. SQL is not really designed to allow you to select data that doesn't exist in your database. A much easier solution in general is to add the missing rows client-side rather than trying to write a complex SQL statement to do this.
Using SQL Server 2005+ and a recursive CTE (common table expression), you could try:
DECLARE @Table TABLE(
DateVal DATETIME,
Visists INT
)
INSERT INTO @Table SELECT '01 Jan 2010', 10
INSERT INTO @Table SELECT '03 Jan 2010', 1
INSERT INTO @Table SELECT '05 Jan 2010', 30
INSERT INTO @Table SELECT '10 Jan 2010', 50
;WITH MinMax AS (
SELECT MIN(DateVal) Startdate,
MAX(DateVal) EndDate
FROM @Table
),
DateRange AS(
SELECT StartDate DateVal
FROM MinMax
UNION ALL
SELECT DateRange.DateVal + 1
FROM DateRange,
MinMax
WHERE DateRange.DateVal + 1 <= MinMax.EndDate
)
SELECT DateRange.DateVal,
ISNULL(t.Visists,0) TotalVisits
FROM DateRange LEFT JOIN
@Table t ON DateRange.DateVal = t.DateVal
With output as
DateVal TotalVisits
----------------------- -----------
2010-01-01 00:00:00.000 10
2010-01-02 00:00:00.000 0
2010-01-03 00:00:00.000 1
2010-01-04 00:00:00.000 0
2010-01-05 00:00:00.000 30
2010-01-06 00:00:00.000 0
2010-01-07 00:00:00.000 0
2010-01-08 00:00:00.000 0
2010-01-09 00:00:00.000 0
2010-01-10 00:00:00.000 50
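One caveat with the recursive approach: SQL Server stops a recursive CTE after 100 levels by default, so a range longer than about 100 days needs a MAXRECURSION hint. A minimal sketch, with made-up variable names and a one-year range chosen purely for illustration:

```sql
DECLARE @StartDate DATETIME, @EndDate DATETIME;
SET @StartDate = '20100101';
SET @EndDate   = '20101231';

WITH DateRange AS (
    SELECT @StartDate AS DateVal
    UNION ALL
    SELECT DATEADD(DAY, 1, DateVal)
    FROM DateRange
    WHERE DateVal < @EndDate
)
SELECT DateVal
FROM DateRange
OPTION (MAXRECURSION 366);  -- raise the default limit of 100; 0 means no limit
```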
I wouldn't call this SQL only, since it uses a PostgreSQL specific function - but there may be something similar in whatever database you're using.
PostgreSQL has a nice function: generate_series
You can use this function to create a series of 30 days.
select current_date + s.a as dates from generate_series(0,30) as s(a);
dates
------------
2010-04-22
2010-04-23
2010-04-24
(.. etc ..)
You can then use that in a query, something like:
select vpd.visits, temp.dates
from (select current_date + s.a as dates from generate_series(0,30) as s(a)) as temp
left outer join visits_per_day vpd on vpd.day = temp.dates
visits | dates
--------+------------
10 | 2010-04-22
| 2010-04-23
20 | 2010-04-24
| 2010-04-25
| 2010-04-26
30 | 2010-04-27
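Since the question asks for a 0 on inactive days rather than an empty value, the same query can wrap the count in coalesce. This sketch keeps the visits_per_day table and column names used above:

```sql
select temp.dates,
       coalesce(vpd.visits, 0) as visits  -- 0 for days with no row
from (select current_date + s.a as dates from generate_series(0,30) as s(a)) as temp
left outer join visits_per_day vpd on vpd.day = temp.dates
order by temp.dates;
```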
No, there is no standard way using only SQL to add an indeterminate number of missing rows into the result of an SQL query without first storing those rows in a table.
Either you can have a single table which contains all the dates over which your application will operate or you can have a table into which you put only the dates that your current query will use. If you choose the second solution, remember to plan for different users executing the same query with different date ranges at the same time — you'll want the table to be temporary and user-specific if your DBMS supports that.
If I have a table containing schedule information that implies particular dates, is there a SQL statement that can be written to convert that information into actual rows, using some sort of CROSS JOIN, perhaps?
Consider a payment schedule table with these columns:
StartDate - the date the schedule begins (1st payment is due on this date)
Term - the length in months of the schedule
Frequency - the number of months between recurrences
PaymentAmt - the payment amount :-)
SchedID StartDate Term Frequency PaymentAmt
-------------------------------------------------
1 05-Jan-2003 48 12 1000.00
2 20-Dec-2008 42 6 25.00
Is there a single SQL statement to allow me to go from the above to the following?
Running
SchedID Payment Due Expected
Num Date Total
--------------------------------------
1 1 05-Jan-2003 1000.00
1 2 05-Jan-2004 2000.00
1 3 05-Jan-2005 3000.00
1 4 05-Jan-2006 4000.00
1 5 05-Jan-2007 5000.00
2 1 20-Dec-2008 25.00
2 2 20-Jun-2009 50.00
2 3 20-Dec-2009 75.00
2 4 20-Jun-2010 100.00
2 5 20-Dec-2010 125.00
2 6 20-Jun-2011 150.00
2 7 20-Dec-2011 175.00
I'm using MS SQL Server 2005 (no hope for an upgrade soon) and I can already do this using a table variable and while loop, but it seemed like some sort of CROSS JOIN would apply but I don't know how that might work.
Your thoughts are appreciated.
EDIT: I'm actually using SQL Server 2005 though I initially said 2000. We aren't quite as backwards as I thought. Sorry.
I cannot test the code right now, so take it with a pinch of salt, but I think that something looking more or less like the following should answer the question:
with q(SchedId, PaymentNum, DueDate, RunningExpectedTotal) as
(select SchedId,
1 as PaymentNum,
StartDate as DueDate,
PaymentAmt as RunningExpectedTotal
from PaymentScheduleTable
union all
select q.SchedId,
1 + q.PaymentNum as PaymentNum,
DATEADD(month, s.Frequency, q.DueDate) as DueDate,
q.RunningExpectedTotal + s.PaymentAmt as RunningExpectedTotal
from q
inner join PaymentScheduleTable s
on s.SchedId = q.SchedId
where q.PaymentNum <= s.Term / s.Frequency)
select *
from q
order by SchedId, PaymentNum
Try using a table of integers (or better this: http://www.sql-server-helper.com/functions/integer-table.aspx) and a little date math, e.g. start + int * freq
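As a rough sketch of that idea, assuming a hypothetical table Numbers with an integer column Num holding 0, 1, 2, ... and the PaymentScheduleTable name used in the recursive answer above:

```sql
SELECT s.SchedID,
       n.Num + 1 AS PaymentNum,
       DATEADD(MONTH, n.Num * s.Frequency, s.StartDate) AS DueDate,
       (n.Num + 1) * s.PaymentAmt AS RunningExpectedTotal  -- payments are constant, so no running sum is needed
FROM PaymentScheduleTable s
     INNER JOIN Numbers n
         ON n.Num * s.Frequency <= s.Term  -- assumption: a payment falling exactly at the end of the term is included
ORDER BY s.SchedID, PaymentNum;
```

Whether the boundary payment belongs in the schedule is a guess: the sample output in the question includes it for schedule 1 but not for schedule 2, so the comparison may need to be < instead of <=.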
I've used table-valued functions to achieve a similar result. Basically the same as using a table variable I know, but I remember being really pleased with the design.
The usage ends up reading very well, in my opinion:
/* assumes @startdate and @enddate schedule limits */
SELECT
p.paymentid,
ps.paymentnum,
ps.duedate,
ps.ret
FROM
payment p
CROSS APPLY dbo.FUNC_get_payment_schedule(p.paymentid, @startdate, @enddate) ps
ORDER BY p.paymentid, ps.paymentnum
A typical solution is to use a Calendar table. You can expand it to fit your own needs, but it would look something like:
CREATE TABLE Calendar
(
calendar_date DATETIME NOT NULL,
is_holiday BIT NOT NULL DEFAULT(0),
CONSTRAINT PK_Calendar PRIMARY KEY CLUSTERED (calendar_date)
)
In addition to is_holiday, you can add other columns that are relevant for you. You can write a script to populate the table up through the next 10 or 100 or 1000 years and you should be all set. It makes queries like the one you're trying to write much simpler and can give you additional functionality.
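A population script for that table could be as simple as the loop below. The start and end dates are arbitrary placeholders, and is_holiday is left at its default of 0 for every row.

```sql
DECLARE @d DATETIME;
SET @d = '20000101';          -- first date to load (placeholder)

WHILE @d < '20300101'         -- stop before this date (placeholder)
BEGIN
    INSERT INTO Calendar (calendar_date) VALUES (@d);
    SET @d = DATEADD(DAY, 1, @d);
END
```

A row-by-row loop is slow for very long ranges but only has to run once; holidays can be flagged afterwards with UPDATE statements.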