MS Access SQL: find the year a value first appears

Hi, I'm trying to run a query on a species database. I want to query unique values but also find the year each species was first found.
So far I have this:
SELECT DISTINCT [Genus_HeTR] & " " & [Species_HeTR] AS Species
FROM HeTR_Rec
WHERE [Species_HeTR] <> "sp."
UNION SELECT DISTINCT [Genus_HeOP] & " " & [Species_HeOP] AS Species
FROM HeOP_Rec
WHERE [Species_HeOP] <> ""
AND [Species_HeOP] <> "sp.";
I'm concatenating the genus and species names and combining data from two different tables (hence the UNION). This provides a species list, but I would also like to know the year each species was first seen at this site.

I will hazard a guess that both your source tables include a Date/Time field which stores the date of each observation. If that is so, you can UNION data from the 2 tables and use that as a subquery source in a GROUP BY query where you derive the minimum observation year for each species.
SELECT
sub.Species,
Min(sub.observation_year) AS first_sighting_year
FROM
(
SELECT
[Genus_HeTR] & " " & [Species_HeTR] AS Species,
Year(observation_date) AS observation_year
FROM HeTR_Rec
WHERE [Species_HeTR] <> "sp."
UNION ALL
SELECT
[Genus_HeOP] & " " & [Species_HeOP] AS Species,
Year(observation_date) AS observation_year
FROM HeOP_Rec
WHERE [Species_HeOP] <> ""
AND [Species_HeOP] <> "sp."
) AS sub
GROUP BY sub.Species;
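The query above can be tried out without Access. Below is a minimal sketch using Python's built-in sqlite3 module as a stand-in. The observation_date column is the same guess as in the answer, the sample rows are invented, and SQLite's strftime and || replace Access's Year() and &.

```python
import sqlite3

# In-memory SQLite database standing in for the Access file.
# observation_date is an assumed column; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE HeTR_Rec (Genus_HeTR TEXT, Species_HeTR TEXT, observation_date TEXT);
CREATE TABLE HeOP_Rec (Genus_HeOP TEXT, Species_HeOP TEXT, observation_date TEXT);
INSERT INTO HeTR_Rec VALUES
  ('Anax', 'imperator', '2012-06-01'),
  ('Anax', 'imperator', '2009-07-15'),
  ('Anax', 'sp.',       '2005-05-05');
INSERT INTO HeOP_Rec VALUES
  ('Anax',   'imperator', '2007-08-20'),
  ('Lestes', 'sponsa',    '2011-06-30'),
  ('Lestes', '',          '2001-01-01');
""")

# UNION ALL keeps every observation; the outer GROUP BY then picks the
# earliest year per concatenated species name.
first_sightings = conn.execute("""
SELECT sub.Species, MIN(sub.observation_year) AS first_sighting_year
FROM (
    SELECT Genus_HeTR || ' ' || Species_HeTR AS Species,
           CAST(strftime('%Y', observation_date) AS INTEGER) AS observation_year
    FROM HeTR_Rec
    WHERE Species_HeTR <> 'sp.'
    UNION ALL
    SELECT Genus_HeOP || ' ' || Species_HeOP,
           CAST(strftime('%Y', observation_date) AS INTEGER)
    FROM HeOP_Rec
    WHERE Species_HeOP <> '' AND Species_HeOP <> 'sp.'
) AS sub
GROUP BY sub.Species
ORDER BY sub.Species
""").fetchall()

print(first_sightings)
```

The "sp." and empty-species rows are filtered out before the union, so they never reach the grouping step.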

Related

SQL - select 5 columns where 3 are distinct using multiple JOINs

I searched Stack Overflow before asking and looked through the 5 most relevant questions, but they did not seem to answer this.
I need to select 5 columns from a table using multiple INNER JOINs, but I only want to get the records where 3 of the 5 columns are distinct. If I use:
"select DISTINCT pos_segments.pos, ",
" pos_segments.org, ",
" pos_segments.obj, ",
" pos_segments.proja, ",
" pos_segments.eff_dt ",
" from pos_segments ",
"INNER JOIN PersonnelPositions ",
"ON pos_segments.pos = PersonnelPositions.Position ",
"AND pos_segments.eff_dt = PersonnelPositions.EffectiveDate ",
"INNER JOIN Accounts ",
"ON PersonnelPositions.PayrollGLAccountId = Accounts.Id ",
" WHERE <where clause here>
I get back over 22k records. I need to get the pos_segments columns only where the combination of the following three columns is distinct:
pos_segments.pos, pos_segments.proja, pos_segments.eff_dt
These three columns taken together serve as a unique key for this table. How can I only get back the records which are distinct based on the combination of these three columns?
P.S. - we are using MS SQL Server
Thanks!
Edit: As Sean pointed out, the ORDER BY statement is not optional. I never tried without it so I wasn't sure. And it would appear that the code block below is more of a suggestion than a copy paste so take it with a grain of salt. However, this approach should still work with some tweaking to fit your application. Since you mentioned that it may be arbitrary which distinct row is grabbed, you can put either of the other two columns after the ORDER BY statement.
I very recently needed to solve a very similar issue. I was able to accomplish what I needed by using a partition in conjunction with a WHERE clause. Your code would be modified to look like this. The derived table needs an alias in SQL Server; numbered is an arbitrary name.
"select * from ( ",
"select ROW_NUMBER() OVER (PARTITION BY pos_segments.pos, ",
" pos_segments.proja, pos_segments.eff_dt ORDER BY pos_segments.org) ",
" AS row_num_throwaway, ",
" pos_segments.pos, ",
" pos_segments.org, ",
" pos_segments.obj, ",
" pos_segments.proja, ",
" pos_segments.eff_dt ",
" from pos_segments ",
"INNER JOIN PersonnelPositions ",
"ON pos_segments.pos = PersonnelPositions.Position ",
"AND pos_segments.eff_dt = PersonnelPositions.EffectiveDate ",
"INNER JOIN Accounts ",
"ON PersonnelPositions.PayrollGLAccountId = Accounts.Id ",
" WHERE <where clause here>) AS numbered ",
"WHERE row_num_throwaway = 1"
The ROW_NUMBER() line partitions the rows by the three specified columns and assigns a row number that counts up within each distinct combination. By keeping only the rows where row_num_throwaway = 1, you get exactly one row per combination. To control which row that is, change the column in the ORDER BY inside the OVER clause.
Hope this helps!
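To see the row-number filter in isolation, here is a small sketch using Python's sqlite3 module (window functions need SQLite 3.25 or later). The joins and the original WHERE clause are left out, the sample rows are invented, and the alias numbered is just an illustrative name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pos_segments (pos TEXT, org TEXT, obj TEXT, proja TEXT, eff_dt TEXT);
-- Made-up rows: the first two share the same (pos, proja, eff_dt) key.
INSERT INTO pos_segments VALUES
  ('P1', 'B', 'o1', 'X', '2020-01-01'),
  ('P1', 'A', 'o2', 'X', '2020-01-01'),
  ('P2', 'C', 'o3', 'Y', '2020-02-01');
""")

# Number the rows inside each (pos, proja, eff_dt) group, ordered by org,
# then keep only the first row of every group.
one_per_key = conn.execute("""
SELECT pos, org, obj, proja, eff_dt
FROM (
    SELECT ROW_NUMBER() OVER (
               PARTITION BY pos, proja, eff_dt
               ORDER BY org
           ) AS row_num_throwaway,
           pos, org, obj, proja, eff_dt
    FROM pos_segments
) AS numbered
WHERE row_num_throwaway = 1
ORDER BY pos
""").fetchall()

print(one_per_key)
```

With ORDER BY org, the row with the lowest org value survives in each group; ordering by a different column changes which duplicate is kept.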

Access VBA How to use a Union ALL statement

My UNION ALL statement is not returning what I hoped it would. I am putting products into a location (73) and taking them out of the same location. I would like to know how many are remaining in that location. I am trying to figure this out by adding the amount in and subtracting the amount out. I am storing my transactions in tblWarehouseTransfer.
I would like to have one line for each product with the total. What I am getting is one line with the sum of the amount put into the location and one line with the sum of the amount taken out (as a negative number).
I am using a list box to display the list of all my products.
Me.lstCutWipers.RowSource = "SELECT tblProducts.ProductID, tblProducts.ProductName, Sum(tblWarehouseTransfer.Qty) AS SumOfQty " _
& " FROM tblWarehouseTransfer INNER JOIN tblProducts ON tblWarehouseTransfer.ProductID = tblProducts.ProductID " _
& " GROUP BY tblProducts.Productid, tblProducts.ProductName, tblWarehouseTransfer.LocationTo " _
& " HAVING (((tblWarehouseTransfer.LocationTo) = 73)) " _
& " UNION ALL SELECT tblProducts.ProductID, tblProducts.ProductName, -Sum(tblWarehouseTransfer.Qty) AS SumOfQty " _
& " FROM tblWarehouseTransfer INNER JOIN tblProducts ON tblWarehouseTransfer.ProductID = tblProducts.ProductID " _
& " GROUP BY tblProducts.Productid, tblProducts.ProductName, tblWarehouseTransfer.LocationFrom " _
& " HAVING (((tblWarehouseTransfer.LocationFrom)= 73))"
Can someone help me to join the 'in' and the 'out' as one total?
This example joins two subqueries which allows your two different sums to be added together, whereas a UNION only lists rows of the two queries together.
One downside of subqueries is that the query cannot be fully edited in Design View; you need SQL View to edit the whole thing. However, you could save each subquery as its own query and then join those saved queries in a third query. Then you could edit each part separately in Design View.
Also notice that I changed the HAVING clause to a WHERE clause. WHERE clauses can be more efficient if you are applying criteria to source values before they are aggregated (i.e. grouped and summed). HAVING applies the criteria after aggregating the data. If the criteria involve aggregate expressions, then they must appear in the HAVING clause.
Changing to a WHERE clause also means that you don't have to group on that field: every row contributing to the subquery already has the value named in the WHERE clause, so grouping on it adds nothing. The difference in speed may be negligible and the results should be the same. Just be aware that if you change the query at all, you need to consider which clause is the proper place for the criteria.
EDIT: Changed to LEFT JOIN and handled NULL in TotalSum with call to nz().
SELECT ToQuery.ProductID, ToQuery.ProductName, (ToQuery.SumOfQty + nz(FromQuery.SumOfQty, 0.0)) As TotalSum
FROM
(SELECT tblProducts.ProductID, tblProducts.ProductName, Sum(tblWarehouseTransfer.Qty) AS SumOfQty
FROM tblWarehouseTransfer INNER JOIN tblProducts ON tblWarehouseTransfer.ProductID = tblProducts.ProductID
WHERE tblWarehouseTransfer.LocationTo = 73
GROUP BY tblProducts.Productid, tblProducts.ProductName) AS ToQuery
LEFT JOIN
(SELECT tblProducts.ProductID, tblProducts.ProductName, -Sum(tblWarehouseTransfer.Qty) AS SumOfQty
FROM tblWarehouseTransfer INNER JOIN tblProducts ON tblWarehouseTransfer.ProductID = tblProducts.ProductID
WHERE tblWarehouseTransfer.LocationFrom = 73
GROUP BY tblProducts.Productid, tblProducts.ProductName) AS FromQuery
ON ToQuery.ProductID = FromQuery.ProductID
To be complete, this assumes that ProductID is a primary key and that ProductName is unique to each ProductID. If that is not true, you will need to change the outer query ON expression to match ProductName values as well (i.e. add AND ToQuery.ProductName = FromQuery.ProductName).
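The join of the two aggregated subqueries can be checked with a small sketch in Python's sqlite3 module. The sample products and transfers are invented, and SQLite's COALESCE stands in for Access's nz().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblProducts (ProductID INTEGER PRIMARY KEY, ProductName TEXT);
CREATE TABLE tblWarehouseTransfer (
    ProductID INTEGER, LocationFrom INTEGER, LocationTo INTEGER, Qty INTEGER);
-- Made-up data: product 1 has 50 in and 20 out of location 73,
-- product 2 has 8 in and nothing out.
INSERT INTO tblProducts VALUES (1, 'Wiper A'), (2, 'Wiper B');
INSERT INTO tblWarehouseTransfer VALUES
  (1, 10, 73, 50),
  (1, 73, 20, 15),
  (1, 73, 20, 5),
  (2, 10, 73, 8);
""")

# One row per product: quantity moved in, plus the (negative) quantity
# moved out. COALESCE covers products that were never taken out.
remaining = conn.execute("""
SELECT ToQuery.ProductID, ToQuery.ProductName,
       ToQuery.SumOfQty + COALESCE(FromQuery.SumOfQty, 0) AS TotalSum
FROM (
    SELECT p.ProductID, p.ProductName, SUM(t.Qty) AS SumOfQty
    FROM tblWarehouseTransfer AS t
    INNER JOIN tblProducts AS p ON t.ProductID = p.ProductID
    WHERE t.LocationTo = 73
    GROUP BY p.ProductID, p.ProductName
) AS ToQuery
LEFT JOIN (
    SELECT t.ProductID, -SUM(t.Qty) AS SumOfQty
    FROM tblWarehouseTransfer AS t
    WHERE t.LocationFrom = 73
    GROUP BY t.ProductID
) AS FromQuery
ON ToQuery.ProductID = FromQuery.ProductID
ORDER BY ToQuery.ProductID
""").fetchall()

print(remaining)
```

Because the LEFT JOIN starts from the "in" side, a product that was only ever taken out of location 73 would not appear in the result.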

Count instances of consecutive dates for associated name (VBA, SQL)

Good morning all,
I am trying to determine instances of consecutive dates (excluding Sunday) from a data set. The data is stored in Access and I am pulling the required dates into Excel. I am then trying to determine how many instances each person has in the data provided. Example below.
Data example:
| Name | Date of absence|
| Bob | 02/01/17 |
| Jill | 02/01/17 |
| Bob | 03/01/17 |
| Jill | 04/01/17 |
Result example:
Bob - 1 Instance, 2 days
Jill - 2 Instances, 2 days
I started trying to work through this with VBA in Excel using loops to rotate through each instance of absence until all people had been completed/ticked off, however the code was becoming really cumbersome and it felt very inefficient, not to mention how slow it was getting for larger data sets! I wonder if it is possible to query the database for the info or to write something a bit more efficient.
Any help or suggestions would be appreciated!
Update:
Testing Tom's suggestion;
Sql = "SELECT Absence.Racf,count(RecordDate) as dups"
Sql = Sql & " FROM Absence"
Sql = Sql & " left outer join"
Sql = Sql & " (select Racf, [RecordDate]+IIf(Weekday([RecordDate],7)=1,2,1) as date1 from Absence) t1"
Sql = Sql & " on Absence.RecordDate=t1.date1 and Absence.Racf=t1.Racf"
Sql = Sql & " where date1 Is Not Null"
Sql = Sql & " group by Absence.Racf"
But unfortunately on the list of dates below it returns 7, instead of 5.
Dates:
23-Feb-16,24-Feb-16,08-Aug-16,09-Aug-16,10-Aug-16,31-Aug-16,24-Oct-16,25-Oct-16,26-Oct-16,25-Jan-17,26-Jan-17,27-Jan-17
So this is how the SQL might actually look in an Access query
SELECT table1.name,count(date) as dups
FROM Table1
left outer join
(select name, [date]+IIf(Weekday([Date],7)=1,2,1) as date1 from table1) t1
on table1.date=t1.date1 and table1.name=t1.name
where date1 is not null
group by table1.name
;
If you want to run this from Excel using a macro, here is a useful reference.
I lifted the code from there and changed the lines which set up the SQL query string to
SQL = "SELECT table1.name,count(date) as dups"
SQL = SQL & " FROM table1"
SQL = SQL & " left outer join"
SQL = SQL & " (select name, [date]+IIf(Weekday([Date],7)=1,2,1) as date1 from table1) t1"
SQL = SQL & " on table1.date=t1.date1 and table1.name=t1.name"
SQL = SQL & " where date1 Is Not Null"
SQL = SQL & " group by table1.name"
and it worked fine.
Try this if you want to get sequences with length greater than one
SELECT Absence.Racf, Count(Absence.RecordDate) AS CountOfRecordDate
FROM (Absence LEFT JOIN (select Racf, RecordDate+IIf(Weekday([RecordDate],7)=1,2,1) as RecordDate1 from Absence) AS t1 ON (Absence.RecordDate = t1.RecordDate1) AND (Absence.Racf = t1.Racf))
LEFT JOIN (select Racf, [RecordDate]-IIf(Weekday([RecordDate],2)=1,2,1) as RecordDate2 from Absence) AS t2 ON (Absence.RecordDate = t2.RecordDate2) AND (Absence.Racf = t2.Racf)
WHERE (((t1.RecordDate1) Is Not Null) AND ((t2.RecordDate2) Is Null))
GROUP BY Absence.Racf;
Or this if you want to get sequences of one or more consecutive dates
SELECT Absence.Racf, Count(Absence.RecordDate) AS CountOfRecordDate
FROM Absence LEFT JOIN (select Racf, [RecordDate]+IIf(Weekday([RecordDate],7)=1,2,1) as RecordDate2 from Absence) AS t2 ON (Absence.RecordDate = t2.RecordDate2) AND (Absence.Racf = t2.Racf)
WHERE (((t2.RecordDate2) Is Null))
GROUP BY Absence.Racf;
adding to the SQL string as before.
This can be done using array formulas in Excel. In column D I have =INDEX($A2:$A$15,MATCH(0,COUNTIF($D$1:$D1,$A2:$A$15),0)) to get the unique employees, then in column E I have the following to count the instances: =SUM(--(($A$1:$A$15=D1)*(OFFSET($A$1:$A$15,1,0)=D1)*(OFFSET($B$1:$B$15,1,0)-$B$1:$B$15)=1)). You'll need to add another criterion based on weekday (I will adjust a little later as I'm running low on time). This relies on the data being in date order.
EDIT : I understand this is not the full answer and will require modification, a starting point :o)
Covering the Sunday absence (will still need weekday check):
=D1 & " " & COUNTIF($A$1:$A$15,D1) &" instances " & SUM(--(--($A$1:$A$15=D1)*--(OFFSET($A$1:$A$15,1,0)=D1))*--(--(OFFSET($B$1:$B$15,1,0)-$B$1:$B$15=1)+--(OFFSET($B$1:$B$15,1,0)-$B$1:$B$15=2)))&" Consecutive"
Checking the weekday also
=D2 & " " & COUNTIF($A$1:$A$15,D2) &" instances " & SUM(--(--($A$1:$A$15=D2)*--(OFFSET($A$1:$A$15,1,0)=D2))*--(--(OFFSET($B$1:$B$15,1,0)-$B$1:$B$15=1)+--(WEEKDAY(OFFSET($B$1:$B$15,1,0),2)=1)*((OFFSET($B$1:$B$15,1,0)-$B$1:$B$15=2)))) & " Consecutive"
A SQL approach would be something along the lines of the following, based on a table 000Absence that holds the example data in the fields EEName and AbsDate.
SELECT abs1.EEName, abs1.AbsDate,
(select count(abs2.EEName) from 000Absence as abs2 where abs2.[EEName]=abs1.[EEName]) AS INSTANCES,
(select count(abs3.EEName) from 000Absence as abs3 where abs3.[EEName]=abs1.[EEName] and abs3.[AbsDate]=abs1.[AbsDate]+iif(weekday(abs3.[AbsDate],7)=1,2,1)) AS CONSECUTIVE
FROM 000Absence AS abs1;
Where the output can be got from the query, grouping by Employee etc.
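If the rows are pulled out of the database first, the counting itself is short in a general-purpose language. The sketch below is not one of the approaches above. It is a plain Python illustration, assuming that an absence "instance" is a run of dates where each date is the next calendar day, or the Monday after a Saturday since Sundays are excluded. The function names are made up for the example.

```python
from datetime import date


def is_next_working_day(previous, current):
    """True if current directly follows previous, skipping Sunday."""
    gap = (current - previous).days
    # date.weekday(): Monday is 0, so Saturday is 5.
    return gap == 1 or (gap == 2 and previous.weekday() == 5)


def absence_runs(dates):
    """Return (instances, days) for one person's absence dates."""
    days = sorted(set(dates))
    instances = 0
    previous = None
    for current in days:
        # A new instance starts whenever the chain of consecutive days breaks.
        if previous is None or not is_next_working_day(previous, current):
            instances += 1
        previous = current
    return instances, len(days)


# The example data from the question (dd/mm/yy).
records = [
    ("Bob", date(2017, 1, 2)),
    ("Jill", date(2017, 1, 2)),
    ("Bob", date(2017, 1, 3)),
    ("Jill", date(2017, 1, 4)),
]

by_person = {}
for name, day in records:
    by_person.setdefault(name, []).append(day)

summary = {name: absence_runs(days) for name, days in by_person.items()}
print(summary)  # {'Bob': (1, 2), 'Jill': (2, 2)}
```

Sorting once per person and walking the list a single time avoids the nested loops that made the original VBA slow. On the twelve dates listed in the update, this counts 5 instances.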

How do I code (SQL) a combo box to display data from a field in a table, minus data which is also in another table?

I'm using SQL to code a combobox which will display names from one table (students), minus any students in my other table (Bookings) who have made a booking on a specific day.
In my "students" table I have one field, "Name". In my "Bookings" table I have two fields, "Name" and "Day".
Here is the code I am using to tell what is going in my combobox;
Combobox.RowSource = "SELECT Name FROM Students" & _
"LEFT OUTER JOIN Bookings " & _
"ON Students.Name = Bookings.Name" & _
"WHERE Bookings.Day = 'Monday';"
To my knowledge this should display all of the names in Students.Name minus any names in Bookings which are on the day Monday. But it's just opening an error message "Syntax error in FROM clause."
Thanks in advance for any advice.
Update:
I found out how to make it work. Instead of a "Left Outer Join" you have to use a "Not In" clause.
Here's the code I used.
combobox.RowSource = "SELECT Name FROM Students " & _
"WHERE Name " & _
"NOT IN (SELECT Bookings.Name FROM Bookings WHERE Bookings.Day = ""Monday"") "

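Two notes on the thread above. The "Syntax error in FROM clause" in the first attempt most likely came from the missing spaces where the string pieces are joined (they produce "StudentsLEFT OUTER JOIN" and "Bookings.NameWHERE"). And even with the spaces fixed, filtering on Bookings.Day in the WHERE clause would keep only students who do have a Monday booking. The sketch below uses Python's sqlite3 module with invented names to compare NOT IN with a LEFT JOIN that keeps only unmatched rows. It runs on SQLite; Access may need the day filter moved into a saved query or subquery for the join variant.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (Name TEXT);
CREATE TABLE Bookings (Name TEXT, Day TEXT);
-- Made-up sample data: only Ann is booked on Monday.
INSERT INTO Students VALUES ('Ann'), ('Ben'), ('Cal');
INSERT INTO Bookings VALUES ('Ann', 'Monday'), ('Ben', 'Tuesday');
""")

# Variant 1: exclude every name that appears in a Monday booking.
not_in_rows = conn.execute("""
SELECT Name FROM Students
WHERE Name NOT IN (SELECT Name FROM Bookings WHERE Day = 'Monday')
ORDER BY Name
""").fetchall()

# Variant 2: left join to Monday bookings only, keep students with no match.
left_join_rows = conn.execute("""
SELECT Students.Name
FROM Students
LEFT JOIN Bookings
  ON Students.Name = Bookings.Name AND Bookings.Day = 'Monday'
WHERE Bookings.Name IS NULL
ORDER BY Students.Name
""").fetchall()

print(not_in_rows, left_join_rows)
```

Both variants return the students without a Monday booking. Note that NOT IN returns no rows at all if the subquery yields a NULL name, which the join variant does not suffer from.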
SQL String: Sum of maximum of distinct values (group by)

Assume the following table:
ID COMPANY SUBSIDIARY NR_LIVES INSURANCE_LINE FACTOR_CALC
1 COMPANY_X SUB_1 860 LIFE YES
2 COMPANY_X SUB_1 860 DISABILITY YES
3 COMPANY_X SUB_1 860 MEDICAL YES
4 COMPANY_X SUB_2 46 LIFE YES
5 COMPANY_X SUB_2 689 MEDICAL YES
6 COMPANY_X SUB_3 852 LIFE YES
I need an SQL string that returns to me the value 2401.
This is the sum of the highest NR_LIVES per subsidiary where FACTOR_CALC = YES (860 + 689 + 852).
I probably would know how to do it loading everything in a recordset and then using VBA, but I would appreciate it if it were possible in one SQL command.
UPDATE:
The current query:
sSQL_Select = "SELECT SUM(NR_LIVES) FROM (SELECT SUBSIDIARY, MAX(NR_LIVES) FROM T_WILMA WHERE PARENT=" & lParent & " AND ACC_YEAR=" & lAcc_Year & _
" AND FACTOR_CALCULATION=TRUE GROUP BY SUBSIDIARY);"
throws an error: Too few parameters, expected 1.
The subquery on its own works as expected.
Thanks for the replies so far, but I haven't managed to make it work yet.
You can determine the maximum per subsidiary in a subquery. The outer query can then sum the maximums.
select sum(MaxLives)
from (
select company
, subsidiary
, max(nr_lives) as MaxLives
from YourTable
where factor_calc = 'yes'
group by
company
, subsidiary
) as SubQueryAlias
I'll suggest you include some aliasing to see whether that helps unconfuse the db engine.
sSQL_Select = "SELECT SUM(sub.MaxOfNR_LIVES) AS NR_LIVES" & vbcrlf & _
"FROM (" & vbCrLf & _
"SELECT SUBSIDIARY, MAX(NR_LIVES) AS MaxOfNR_LIVES" & vbCrLf & _
"FROM T_WILMA WHERE PARENT=" & lParent & _
" AND ACC_YEAR=" & lAcc_Year & _
" AND FACTOR_CALCULATION=TRUE GROUP BY SUBSIDIARY) AS sub;"
Debug.Print sSQL_Select
SELECT SUM(NR_LIVES)
from(
SELECT SUBSIDIARY,MAX(NR_LIVES) as NR_LIVES
from <Table>
where FACTOR_CALC='YES'
group by SUBSIDIARY)a
You need to let the system know which NR_LIVES it's trying to add up. On your table (and taking out the extra stuff in the WHERE that is not in your example), this returns 2401:
SELECT Sum(MAXNR_LIVES) AS Expr1
FROM (SELECT SUBSIDIARY, MAX(NR_LIVES) AS MAXNR_LIVES FROM T_WILMA
WHERE FACTOR_CALC=TRUE GROUP BY SUBSIDIARY);
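The aliasing point can be verified against the sample table from the question. This is a minimal sketch using Python's sqlite3 module, with FACTOR_CALC stored as the text 'YES' as shown in the question's listing, and without the PARENT and ACC_YEAR filters that the listing does not include.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T_WILMA (
    ID INTEGER, COMPANY TEXT, SUBSIDIARY TEXT,
    NR_LIVES INTEGER, INSURANCE_LINE TEXT, FACTOR_CALC TEXT);
INSERT INTO T_WILMA VALUES
  (1, 'COMPANY_X', 'SUB_1', 860, 'LIFE',       'YES'),
  (2, 'COMPANY_X', 'SUB_1', 860, 'DISABILITY', 'YES'),
  (3, 'COMPANY_X', 'SUB_1', 860, 'MEDICAL',    'YES'),
  (4, 'COMPANY_X', 'SUB_2', 46,  'LIFE',       'YES'),
  (5, 'COMPANY_X', 'SUB_2', 689, 'MEDICAL',    'YES'),
  (6, 'COMPANY_X', 'SUB_3', 852, 'LIFE',       'YES');
""")

# The inner query names its aggregate (MaxOfNR_LIVES) so the outer SUM
# has a real column to refer to.
total = conn.execute("""
SELECT SUM(sub.MaxOfNR_LIVES)
FROM (
    SELECT SUBSIDIARY, MAX(NR_LIVES) AS MaxOfNR_LIVES
    FROM T_WILMA
    WHERE FACTOR_CALC = 'YES'
    GROUP BY SUBSIDIARY
) AS sub
""").fetchone()[0]

print(total)  # 2401
```

In the original query the inner MAX(NR_LIVES) had no alias, so the outer SUM(NR_LIVES) referred to a column the subquery did not expose. Access treats an unknown name as a parameter, which matches the "Too few parameters" error.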