How to get MS Access to group by date column - sql

I'm having difficulty grouping by date in Microsoft Access. Any guidance would be great. Thank you.
Fake Table - schedules:
Date     Location   Appointment ID   People
----------------------------------------------
1/3/23   East       98765            3
1/4/23   West       983746           2
1/3/23   East       09382            5
Query I'm using:
SELECT schedules.Date, schedules.Location, COUNT(schedules.[Appointment ID]) AS [Appointment Count], SUM(schedules.[People]) AS [People Served]
FROM schedules
GROUP BY schedules.Date, schedules.Location;
I'd like the following table:
Date     Location   Appointment Count   People Served
-------------------------------------------------------
1/3/23   East       2                   8
1/4/23   West       1                   2
But my query won't group on the date for whatever reason. The field is a short date without a time component. With my code, the first table is returned back to me unchanged. If I omit the date, it will group by location.

Base your groups on DateValue(<your Date/Time field>) to group all records from the same day together regardless of their time component.
SELECT
DateValue(s.Date),
s.Location,
COUNT(s.[Appointment ID]) AS [Appointment Count],
SUM(s.[People]) AS [People Served]
FROM schedules AS s
GROUP BY DateValue(s.Date), s.Location;
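As a side note, Date is a reserved word in Access, so it is safer to wrap it in brackets. If the stored values really carry no time portion (an assumption based on your description), grouping on the bracketed field alone may also be enough:
SELECT
s.[Date],
s.Location,
COUNT(s.[Appointment ID]) AS [Appointment Count],
SUM(s.[People]) AS [People Served]
FROM schedules AS s
GROUP BY s.[Date], s.Location;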

How to find if there is a match in a certain time interval between two different date fields SQL?

I have a column in my fact table that defines whether a Supplier is old or new based on the following case-statement:
CASE
WHEN SUBSTRING([Reg date], 1, 6) = SUBSTRING([Invoice date], 1, 6)
THEN 'New supplier'
ELSE 'Old supplier'
END AS [Old/New supplier]
So, for example, if a Supplier was registered in 201910 and the Invoice date was 201910, then the Supplier would be considered a 'New supplier' that month. Now I want to calculate the number of Old/New suppliers for each month by doing a distinct count on Supplier no, which is not a problem. The last step is where it gets tricky: now I want to count the number of New/Old suppliers over a 12-month period (i.e., whether there has been a match on Invoice date and Reg date in any of the trailing 12 months). So I create the following MDX expression:
aggregate(parallelperiod([D Time].[Year-Month-Day].[Year],1,[D Time].[Year-Month-Day].currentmember).lead(1) : [D Time].[Year-Month-Day].currentmember ,[Measures].[Supplier No Distinct Count])
The issue I am facing is that it will count Supplier no "1234" twice, since it has been both new and old during that time period. What I want is that, if it finds one match, the supplier is considered a "New" supplier for that 12-month period.
This is how the result ends up looking, but I want it to be zero for "Old": since Reg date and Invoice date matched once during that 12-month period, the supplier should be considered new for the whole rolling 12 months on 201910.
Any help, possible approaches or ideas are highly appreciated.
Best regards,
Rubrix
Aggregate first at the supplier level and then at the type level:
select type, count(*)
from (select supplierid,
(case when min(substring(regdate, 1, 6)) = min(substring(invoicedate, 1, 6))
then 'new' else 'old'
end) as type
from t
group by supplierid
) s
group by type;
Note: I assume your date columns are in some obscure string format for your code to work. Otherwise, you should be using appropriate date functions.
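If the columns are real date/datetime types rather than strings, a hedged variant of the inner classification (assuming SQL Server 2012+ for FORMAT, and the same table and column names as above) could compare the year-month directly:
select supplierid,
       (case when min(format(regdate, 'yyyyMM')) = min(format(invoicedate, 'yyyyMM'))
             then 'new' else 'old'
        end) as type
from t
group by supplierid;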
SELECT COUNT(*) OVER () AS TotalCount
FROM Facts
WHERE Regdate BETWEEN olddate AND newdate
   OR InvoiceDate BETWEEN olddate AND newdate
GROUP BY
Supplier
The above query will return all the suppliers within that time period and then group them. Thus COUNT(*) will only include unique suppliers.
You might want to change the WHERE clause because I didn't quite understand how you are getting the 12-month period. Generally, if your WHERE clause returns the suppliers within that time period (they don't have to be unique), then the GROUP BY and COUNT will handle the rest.

How to add custom YoY field to output?

I'm attempting to determine the YoY growth by month, 2017 to 2018, for number of Company bookings per property.
I've tried casting and windowed functions but am not obtaining the correct result.
Example Table 1: Bookings
BookID Amnt BookType InDate OutDate PropertyID Name Status
-----------------------------------------------------------------
789555 $1000 Company 1/1/2018 3/1/2018 22111 Wendy Active
478141 $1250 Owner 1/1/2017 2/1/2017 35825 John Cancelled
There are only two book types (e.g., Company, Owner) and two Book Status (e.g., Active and Cancelled).
Example Table 2: Properties
Property ID State Property Start Date Property End Date
---------------------------------------------------------------------
33111 New York 2/3/2017
35825 Michigan 7/21/2016
The Property End Date is blank when the company still owns it.
Example Table 3: Months
Start of Month End of Month
-------------------------------------------
1/1/2018 1/31/2018
The previous developer created this table which includes a row for each month from 2015-2020.
I've tried many different iterations of my current code and can't even come close.
Desired Outcome
I need to find the YoY growth by month, 2017 to 2018, for number of Company bookings per property. The stakeholder has requested the output to have the below columns:
Month Name Bookings_Per_Property_2017 Bookings_Per_Property_2018 YoY
-----------------------------------------------------------------------
The number of Company bookings per property in a month should be calculated by counting the total number of active Company bookings made in a month divided by the total number of properties active in the month.
Here is a solution that should be close to what you need. It works by:
LEFT JOINing the three tables; the important part is to properly check the overlaps in date ranges between months (StartOfMonth, EndOfMonth), bookings (InDate, OutDate) and properties (PropertyStartDate, PropertyEndDate): you can have a look at this reference post for general discussion on how to proceed efficiently.
aggregating by month, and using conditional COUNT(DISTINCT ...) to count the number of properties and bookings in each month and year. The logic implicitly relies on the fact that this aggregate function ignores NULL values. Since we are using LEFT JOINs, we also need to handle the possibility that a denominator could have a 0 value.
Notes:
you did not provide expected results so this cannot be tested
also, you did not explain how to compute the YoY column, so I left it alone; I assume that you can easily compute it from the other columns
Query:
SELECT
MONTH(m.StartOfMonth) AS [Month],
COUNT(DISTINCT CASE WHEN YEAR(StartOfMonth) = 2017 THEN b.BookID END)
/ NULLIF(COUNT(DISTINCT CASE WHEN YEAR(StartOfMonth) = 2017 THEN p.PropertyID END), 0)
AS Bookings_Per_Property_2017,
COUNT(DISTINCT CASE WHEN YEAR(StartOfMonth) = 2018 THEN b.BookID END)
/ NULLIF(COUNT(DISTINCT CASE WHEN YEAR(StartOfMonth) = 2018 THEN p.PropertyID END), 0)
AS Bookings_Per_Property_2018
FROM months m
LEFT JOIN bookings b
ON m.StartOfMonth <= b.OutDate
AND m.EndOfMonth >= b.InDate
AND b.status = 'Active'
AND b.BookType = 'Company'
LEFT JOIN properties p
ON m.StartOfMonth <= COALESCE(p.PropertyEndDate, m.StartOfMonth)
AND m.EndOfMonth >= p.PropertyStartDate
GROUP BY MONTH(m.StartOfMonth)
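Since the YoY formula was not specified, here is one hedged way to add it, treating YoY as the relative change from 2017 to 2018 (an assumption) and wrapping the aggregate query in a derived table; the * 1.0 factors are only there to avoid integer division in SQL Server:
SELECT
    t.[Month],
    t.Bookings_Per_Property_2017,
    t.Bookings_Per_Property_2018,
    (t.Bookings_Per_Property_2018 - t.Bookings_Per_Property_2017)
        / NULLIF(t.Bookings_Per_Property_2017, 0) AS YoY
FROM (
    SELECT
        MONTH(m.StartOfMonth) AS [Month],
        COUNT(DISTINCT CASE WHEN YEAR(StartOfMonth) = 2017 THEN b.BookID END) * 1.0
            / NULLIF(COUNT(DISTINCT CASE WHEN YEAR(StartOfMonth) = 2017 THEN p.PropertyID END), 0)
            AS Bookings_Per_Property_2017,
        COUNT(DISTINCT CASE WHEN YEAR(StartOfMonth) = 2018 THEN b.BookID END) * 1.0
            / NULLIF(COUNT(DISTINCT CASE WHEN YEAR(StartOfMonth) = 2018 THEN p.PropertyID END), 0)
            AS Bookings_Per_Property_2018
    FROM months m
    LEFT JOIN bookings b
        ON m.StartOfMonth <= b.OutDate
        AND m.EndOfMonth >= b.InDate
        AND b.status = 'Active'
        AND b.BookType = 'Company'
    LEFT JOIN properties p
        ON m.StartOfMonth <= COALESCE(p.PropertyEndDate, m.StartOfMonth)
        AND m.EndOfMonth >= p.PropertyStartDate
    GROUP BY MONTH(m.StartOfMonth)
) AS t;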

How to get the historical weather for any city with BigQuery?

BigQuery has NOAA's gsod data loaded as a public dataset - starting in 1929: https://www.reddit.com/r/bigquery/comments/2ts9wo/noaa_gsod_weather_data_loaded_into_bigquery/
How can I retrieve the historical data for any city?
Update 2019: For convenience
SELECT *
FROM `fh-bigquery.weather_gsod.all`
WHERE name='SAN FRANCISCO INTERNATIONAL A'
ORDER BY date DESC
Updated daily - or report it here if it isn't.
For example, to get the hottest days for San Francisco stations since 1980:
SELECT name, state, ARRAY_AGG(STRUCT(date,temp) ORDER BY temp DESC LIMIT 5) top_hot, MAX(date) active_until
FROM `fh-bigquery.weather_gsod.all`
WHERE name LIKE 'SAN FRANC%'
AND date > '1980-01-01'
GROUP BY 1,2
ORDER BY active_until DESC
Note that this query processed only 28MB thanks to a clustered table.
And similar, but instead of using the station name I'll use a location and a table clustered by the location:
WITH city AS (SELECT ST_GEOGPOINT(-122.465, 37.807))
SELECT name, state, ARRAY_AGG(STRUCT(date,temp) ORDER BY temp DESC LIMIT 5) top_hot, MAX(date) station_until
FROM `fh-bigquery.weather_gsod.all_geoclustered`
WHERE EXTRACT(YEAR FROM date) > 1980
AND ST_DISTANCE(point_gis, (SELECT * FROM city)) < 40000
GROUP BY name, state
HAVING EXTRACT(YEAR FROM station_until)>2018
ORDER BY ST_DISTANCE(ANY_VALUE(point_gis), (SELECT * FROM city))
LIMIT 5
Update 2017: Standard SQL and up-to-date tables:
SELECT TIMESTAMP(CONCAT(year,'-',mo,'-',da)) day, AVG(min) min, AVG(max) max, AVG(IF(prcp=99.99,0,prcp)) prcp
FROM `bigquery-public-data.noaa_gsod.gsod2016`
WHERE stn='722540' AND wban='13904'
GROUP BY 1
ORDER BY day
Additional example, to show the coldest days in Chicago in this decade:
#standardSQL
SELECT year, FORMAT('%s%s', mo, da) day, min
FROM `fh-bigquery.weather_gsod.stations` a
JOIN `bigquery-public-data.noaa_gsod.gsod201*` b
ON a.usaf=b.stn AND a.wban=b.wban
WHERE name='CHICAGO/O HARE ARPT'
AND min!=9999.9
AND mo<'03'
ORDER BY 1,2
To retrieve the historical weather for any city, first we need to find what station reports in that city. The table [fh-bigquery:weather_gsod.stations] contains the name of known stations, their state (if in the US), country, and other details.
So to find all the stations in Austin, TX, we would use a query like this:
SELECT state, name, lat, lon
FROM [fh-bigquery:weather_gsod.stations]
WHERE country='US' AND state='TX' AND name CONTAINS 'AUST'
LIMIT 10
This approach has 2 problems that need to be solved:
Not every known station is present in that table - I need to get an updated version of this file. So don't give up if you don't find the station you are looking for here.
Not every station found in this file has been operating every year - so we need to find stations that have data during the year we are looking for.
To solve the second problem, we need to join the stations table with the actual data we are looking for. The following query looks for stations around Austin, and the column c looks at how many days during 2015 have actual data:
SELECT state, name, FIRST(a.wban) wban, FIRST(a.stn) stn, COUNT(*) c, INTEGER(SUM(IF(prcp=99.99,0,prcp))) rain, FIRST(lat) lat, FIRST(lon) long
FROM [fh-bigquery:weather_gsod.gsod2015] a
JOIN [fh-bigquery:weather_gsod.stations] b
ON a.wban=b.wban
AND a.stn=b.usaf
WHERE country='US' AND state='TX' AND name CONTAINS 'AUST'
GROUP BY 1,2
LIMIT 10
That's good! We found 4 stations with data for Austin during 2015.
Note that we had to treat "rain" in a special way: When a station doesn't monitor for rain, instead of null, it marks it as 99.99. Our query filters those values out.
Now that we know the stn and wban numbers for these stations, we can pick any of them and visualize the results:
SELECT TIMESTAMP(CONCAT('2015','-',mo,'-',da)) day, AVG(min) min, AVG(max) max, AVG(IF(prcp=99.99,0,prcp)) prcp
FROM [fh-bigquery:weather_gsod.gsod2015]
WHERE stn='722540' AND wban='13904'
GROUP BY 1
ORDER BY day
There's now an official set of the NOAA data on BigQuery in addition to Felipe's "official" public dataset. There's a blog post describing it.
An example getting minimum temperatures for August 15, 2016:
SELECT
name,
value/10 AS min_temperature,
latitude,
longitude
FROM
[bigquery-public-data:ghcn_d.ghcnd_stations] AS stn
JOIN
[bigquery-public-data:ghcn_d.ghcnd_2016] AS wx
ON
wx.id = stn.id
WHERE
wx.element = 'TMIN'
AND wx.qflag IS NULL
AND STRING(wx.date) = '2016-08-15'
Thanks for pulling in the data and making it a public table. Here is a BigQuery that returns the total rainfall in 2014 for every station in Texas:
SELECT FIRST(name) AS station_name, stn, SUM(prcp) AS annual_precip
FROM [fh-bigquery:weather_gsod.gsod2014] gsod
JOIN [fh-bigquery:weather_gsod.stations] stations
ON gsod.wban=stations.wban AND gsod.stn=stations.usaf
WHERE state='TX' AND prcp != 99.99
GROUP BY stn
Pulling in the number of rainy days at every location, and sorting the results based on this:
SELECT FIRST(name) AS station_name, stn, SUM(prcp) AS annual_precip, COUNT(prcp) AS rainy_days
FROM [fh-bigquery:weather_gsod.gsod2014] gsod
JOIN [fh-bigquery:weather_gsod.stations] stations
ON gsod.wban=stations.wban AND gsod.stn=stations.usaf
WHERE state='TX' AND prcp != 99.99 AND prcp > 0
GROUP BY stn
ORDER BY rainy_days DESC
Using the station name is unreliable. Also, it's hard to use a geospatial query with the new BigQuery capabilities, because the boundaries of cities do not have clear shapes (like a circle or a rectangle).
Therefore the best solution I found to your problem is to use reverse geocoding, asking the Google Maps API to produce the address, state, city and county for each station, using its lat/lon coordinates.
Here is the resulting CSV (StationNumber,Lat,Lon,Address,State,City,County,Zip) for the US (you will notice 98% of stations exist there):
https://gist.github.com/orcaman/a3e23c47489705dff93aace2e35f57d3
Here's the code in case you want to re-run it over stations outside the US (golang):
https://gist.github.com/orcaman/8de55f14f1c70ef5b0c124cf2fb7d9d1

Getting repeated rows for where with or condition

I am trying to find employees that worked during a specific time period and the hours they worked during that period. My query has to join the employee table, which has employee ID as its primary key and uses effective_date and expiration_date as time measures for the employee's position, to the timekeeping table, which has a pay period ID number as its primary key and also uses effective and expiration dates.
The problem with the expiration date in the employee table is that if the employee is currently employed, the date is '12/31/9999'. I am looking for employees that worked in a certain year, plus current employees, as well as the hours they worked separated by pay periods.
When I account for this in the WHERE clause with an OR condition, I get duplicates: employees that worked in the time period I am looking for and beyond, as well as duplicate records for the '12/31/9999' rows and the valid rows for that time period.
This is the query I am using:
SELECT
J.EMPL_ID
,J.DEPT
,J.UNIT
,J.LAST_NM
,J.FIRST_NM
,J.TITLE
,J.EFF_DT
,J.EXP_DT
,TM1.PPRD_ID
,TM1.EMPL_ID
,TM1.EXP_DT
,TM1.EFF_DT
--PULLING IN THE DAILY HRS WORKED
,(SELECT NVL(SUM(((to_number(SUBSTR(TI.DAY_1, 1
,INSTR(TI.DAY_1, ':', 1, 1)-1),99))*60)+
(TO_NUMBER(SUBSTR(TI.DAY_1
,INSTR(TI.DAY_1,':', -1, 1)+1),99))),0)
FROM PPRD_LINE TI
WHERE
TI.PPRD_ID=TM1.PPRD_ID
) "DAY1"
---AND THE REST OF THE DAYS FOR THE WORK PERIOD
FROM PPRD_LINE TM1
JOIN EMPL J ON TM1.EMPL_ID=J.EMPL_ID
WHERE
J.EMPL_ID='some id number' --for test purposes, will need to break down to depts-
AND
J.EFF_DT >=TO_DATE('1/1/2012','MM/DD/YYYY')
AND
(
J.EXP_DT<=TO_DATE('12/31/2012','MM/DD/YYYY')
OR
J.EXP_DT=TO_DATE('12/31/9999','MM/DD/YYYY') --I think the problem might be here???
)
GROUP BY
J.EMPL_ID
,J.DEPT
,J.UNIT
,J.LAST_NM
,J.FIRST_NM
,J.TITLE
,J.EFF_DT
,J.EXP_DT
,TM1.PPRD_ID
,TM1.EMPL_ID
,TM1.DOC_ID
,TM1.EXP_DT
,TM1.EFF_DT
ORDER BY
J.EFF_DT
,TM1.EFF_DT
,TM1.EXP_DT
I'm pretty sure I'm missing something simple but at this point I can't see the forest for the trees. Can anyone out there point me in the right direction?
An example of the duplicate records, for employee 1 for the year 2012:
Empl_ID  Dept  Unit  Last     First    Title    Eff Date  Exp Date    PPRD ID  Empl_ID  Exp Date_1  Eff Date_1
---------------------------------------------------------------------------------------------------------------
00001    04    012   Babbage  Charles  Somejob  4/1/2012  10/15/2012  0407123  00001    4/15/2012   4/1/2012
This record repeats 3 times, and the results run past the pay periods in 2012 into the current pay period in 2013.
The subquery is what I use to convert the time so that hours and minutes can be added together and compared down the line.
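In other words, each DAY_n value is an HH:MM string and the expression converts it to total minutes; a minimal illustration with a made-up value of '7:30' (= 450 minutes):
SELECT TO_NUMBER(SUBSTR('7:30', 1, INSTR('7:30', ':', 1, 1) - 1)) * 60
     + TO_NUMBER(SUBSTR('7:30', INSTR('7:30', ':', -1, 1) + 1)) AS total_minutes
FROM dual;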
I'm going to take a wild guess and see if this is what you want; remember, I could not test it, so there may be typos.
Whether or not this is what you want, you should read the FAQ about how to ask good questions. If the question had been clear, it could have been answered within about 10 minutes; because it was not clear what you were asking, no one could answer it.
You should include inputs, outputs, and EXPECTED output in your question. The data you gave was not the output of the select statement (it did not have the DAY1 column).
SELECT
J.EMPL_ID
,J.DEPT
,J.UNIT
,J.LAST_NM
,J.FIRST_NM
,J.TITLE
,J.EFF_DT
,J.EXP_DT
,TM1.PPRD_ID
,TM1.EMPL_ID
-- ,TM1.EXP_DT Can't have these if you are summing across multiple records.
-- ,TM1.EFF_DT
--PULLING IN THE DAILY HRS WORKED
,NVL(SUM(((to_number(SUBSTR(TM1.DAY_1, 1,INSTR(TM1.DAY_1, ':', 1, 1)-1),99))*60)+
(TO_NUMBER(SUBSTR(TM1.DAY_1,INSTR(TM1.DAY_1,':', -1, 1)+1),99))),0)
"DAY1"
---AND THE REST OF THE DAYS FOR THE WORK PERIOD
FROM PPRD_LINE TM1
JOIN EMPL J ON TM1.EMPL_ID=J.EMPL_ID
WHERE
J.EMPL_ID='some id number' --for test purposes, will need to break down to depts-
AND J.EFF_DT >=TO_DATE('1/1/2012','MM/DD/YYYY')
AND(J.EXP_DT<=TO_DATE('12/31/2012','MM/DD/YYYY') OR J.EXP_DT=TO_DATE('12/31/9999','MM/DD/YYYY'))
GROUP BY
J.EMPL_ID
,J.DEPT
,J.UNIT
,J.LAST_NM
,J.FIRST_NM
,J.TITLE
,TM1.PPRD_ID
,TM1.EMPL_ID
,TM1.DOC_ID
ORDER BY
MIN(J.EFF_DT)
,MAX(TM1.EFF_DT)
,MAX(TM1.EXP_DT)
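If pay periods from 2013 are still coming back, a hedged refinement (assuming you only want pay periods that overlap calendar year 2012) is to also constrain the timekeeping rows themselves, for example by extending the WHERE clause:
WHERE
J.EMPL_ID='some id number'
AND J.EFF_DT >= TO_DATE('1/1/2012','MM/DD/YYYY')
AND (J.EXP_DT <= TO_DATE('12/31/2012','MM/DD/YYYY') OR J.EXP_DT = TO_DATE('12/31/9999','MM/DD/YYYY'))
-- hypothetical extra filter: keep only pay periods that overlap 2012
AND TM1.EFF_DT <= TO_DATE('12/31/2012','MM/DD/YYYY')
AND TM1.EXP_DT >= TO_DATE('01/01/2012','MM/DD/YYYY')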

select all records from one table and return null values where they do not have another record in second table

I have looked high and low for this particular query and have not seen it.
We have two tables: an Accounts table and a Visit table. I want to return the complete list of account names and fill in the corresponding fields with either null or the correct year, etc. This data is used in a matrix report in SSRS.
sample:
Accounts:
AccountName AccountGroup Location
Brown Jug Brown Group Auckland
Top Shop Top Group Wellington
Super Shop Super Group Christchurch
Visit:
AcccountName VisitDate VisitAction
Brown Jug 12/12/2012 complete
Super Shop 1/10/2012 complete
I need to select weekly visits and show those that have had a complete visit and then the accounts that did not have a visit.
e.g. for the week of 10/12/2012 it should show:
Year   Week   AccountName   VisitStatus
------------------------------------------
2012   50     Brown Jug     complete
2012   50     Top Group     not complete
2012   50     Super Shop    not complete
e.g. for the week of 1/10/2012 it should show:
Year   Week   AccountName   VisitStatus
------------------------------------------
2012   2      Brown Jug     not complete
2012   2      Top Group     not complete
2012   2      Super Shop    complete
Please correct me if I am wrong.
select to_char(v.visitdate,'YYYY') year,
       to_char(v.visitdate,'WW') week, a.accountname, v.visitaction
from accounts a, visit v
where a.accountname = v.acccountname
  and to_char(v.visitdate,'WW') = to_char(sysdate,'WW')
union all
select to_char(sysdate,'YYYY') year,
       to_char(sysdate,'WW') week, a.accountname, 'not complete'
from accounts a
where a.accountname not in (select v.acccountname
                            from visit v
                            where to_char(v.visitdate,'WW') = to_char(sysdate,'WW'));
The following answer assumes that
A) You want to see every week within a given range, whether any accounts were visited in that week or not.
B) You want to see all accounts for each week
C) For accounts that were visited in a given week, show their actual VisitAction.
D) For accounts that were NOT visited in a given week, show "not completed" as the VisitAction.
If all those are the case then the following query may do what you need. There is a functioning sqlfiddle example that you can play with here: http://sqlfiddle.com/#!3/4aac0/7
--First, get all the dates in the current year.
--This uses a Recursive CTE to generate a date
--for each week between a start date and an end date
--In SSRS you could create report parameters to replace
--these values.
WITH WeekDates AS
(
SELECT CAST('1/1/2012' AS DateTime) AS WeekDate
UNION ALL
SELECT DATEADD(WEEK,1,WeekDate) AS WeekDate
FROM WeekDates
WHERE DATEADD(WEEK,1,WeekDate) <= CAST('12/31/2012' AS DateTime)
),
--Next, add meta data to the weeks from above.
--Get the WeekYear and WeekNumber for each week.
--Note, you could skip this as a separate query
--and just included these in the next query,
--I've included it this way for clarity
Weeks AS
(
SELECT
WeekDate,
DATEPART(Year,WeekDate) AS WeekYear,
DATEPART(WEEK,WeekDate) AS WeekNumber
FROM WeekDates
),
--Cross join the weeks data from above with the
--Accounts table. This will make sure that we
--get a row for each account for each week.
--Be aware, this will be a large result set
--if there are a lot of weeks & accounts (weeks * account)
AccountWeeks AS
(
SELECT
*
FROM Weeks AS W
CROSS JOIN Accounts AS A
)
--Finally LEFT JOIN the AccountWeek data from above
--to the Visits table. This will ensure that we
--see each account/week, and we'll get nulls for
--the visit data for any accounts that were not visited
--in a given week.
SELECT
A.WeekYear,
A.WeekNumber,
A.AccountName,
A.AccountGroup,
IsNull(V.VisitAction,'not complete') AS VisitAction
FROM AccountWeeks AS A
LEFT JOIN Visits AS V
ON A.AccountName = V.AccountName
AND A.WeekNumber = DATEPART(WEEK,V.VisitDate)
--Set the maxrecursion number to a number
--larger than the number of weeks you will return
OPTION (MAXRECURSION 200);
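One hedged refinement: the join above matches on week number only, so a visit from the same week number in a different year would also count; if the Visits table ever spans multiple years, matching the year as well (an addition not in the original) may be safer:
LEFT JOIN Visits AS V
ON A.AccountName = V.AccountName
AND A.WeekNumber = DATEPART(WEEK,V.VisitDate)
AND A.WeekYear = DATEPART(YEAR,V.VisitDate)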
I hope that helps.