Using a range of cells as conditions in a SQL query

My company uses a SQL Server database.
Is it possible to use a range of cells as a condition in a SQL query if it equals ANY of those values? Can it even use date ranges on the same rows?
Reference Example:
Data Example:
Output Desired:
Question 1:
Can I reference an entire column?
SELECT ID, sum(units) FROM sales WHERE ID = any ID in Column A
Question 2:
Can I specify just a cell range?
SELECT ID, sum(units) FROM table WHERE ID = any value in A2:A10
Question 3:
Can I add a date range cell reference with the possibility that the same ID may appear more than once but have a different date range (see 747375 in sample) and return results for both ranges separately?
SELECT ID, sum(units) FROM table WHERE ID = any value in A2:A10 AND DATE >= date found in column B that is next to ID in the same row AND DATE <= date found in column C that is next to ID in the same row

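You can join the reference table to the data and filter with BETWEEN, as follows. Group by the date range as well as the ID, so that an ID listed with two ranges (such as 747375) returns one row per range:
select
r.id,
r.[start],
r.[end],
sum(d.units) as units
from reference r
join data d
on r.id = d.id
where d.date between r.[start] and r.[end]
group by
r.id,
r.[start],
r.[end]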
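As a way to try the join idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. The table names (ref_ranges, sales), the column names (start_date, end_date, sale_date) and the sample rows are all made up for illustration; only the ID 747375 comes from the question. Dates are stored as ISO text so that BETWEEN compares them correctly.

```python
import sqlite3


def units_per_range(conn):
    """Sum units for each (id, start_date, end_date) row of ref_ranges."""
    sql = """
        SELECT r.id, r.start_date, r.end_date, SUM(s.units) AS units
        FROM ref_ranges AS r
        JOIN sales AS s
          ON s.id = r.id
         AND s.sale_date BETWEEN r.start_date AND r.end_date
        GROUP BY r.id, r.start_date, r.end_date
        ORDER BY r.id, r.start_date
    """
    return conn.execute(sql).fetchall()


# Hypothetical sample data: 747375 appears with two separate date ranges.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ref_ranges (id INTEGER, start_date TEXT, end_date TEXT);
    CREATE TABLE sales (id INTEGER, sale_date TEXT, units INTEGER);
    INSERT INTO ref_ranges VALUES
        (747375, '2018-01-01', '2018-01-31'),
        (747375, '2018-03-01', '2018-03-31'),
        (123456, '2018-01-01', '2018-12-31');
    INSERT INTO sales VALUES
        (747375, '2018-01-10', 5),
        (747375, '2018-01-20', 7),
        (747375, '2018-02-15', 100),
        (747375, '2018-03-05', 3),
        (123456, '2018-06-01', 4);
""")

rows = units_per_range(conn)
for row in rows:
    print(row)
```

Because the GROUP BY includes the range columns, the February sale for 747375 falls in neither range and is left out, while the January and March ranges each get their own row.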

Question 1: Can I reference an entire column?
Yes. A default select without a where clause will reference the entire column.
Your example SELECT ID, sum(units) FROM sales WHERE ID = any ID in Column A is not logically sound. From the select, I am presuming that you want the sum of units for each individual ID, not the sum of all the units without regard to the ID. For this, you want to use group by
select ID, sum(units) totalunits
from sales
group by ID
There is no need for a where clause because you want everything.
Question 2: Can I specify just a cell range?
Yes.
And no.
There is no direct concept of "cell range" in SQL (well, maybe top but not really). Data is stored unordered in SQL. In Excel, the cell range "A2:A10" means "whatever values just happen to be in those cells at this point in time". Often this will mean "the 2nd through 10th values entered in time", or "the first through 9th values entered in time" if there is a header row. But then later you can sort the data differently and now there is different data there. In SQL, there is no order in storage. You can specify an order for the output when you select data, but that is manually specified for each select.
However, the related concept is probably rather obvious. "A2:A10" is often going to mean "the first 9 values by date/time", or "the largest/smallest 9 values" etc.
Your example SELECT ID, sum(units) FROM table WHERE ID = any value in A2:A10 needs to change to define what values you expect to be in A2:A10. For example, if A2:A10 represents the first 9 values by date, you would do something like this: (untested)
select ID, sum(units) totalunits
from sales
where ID in (select top(9) ID
from sales
order by date
)
group by ID
This would provide the sum of units for each of the IDs that were amongst the first 9 IDs entered by date (what to do with a tie for 9th I will not go into here).
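If the values in A2:A10 are simply a list that client code can read from the spreadsheet, another option (not the TOP approach above) is to pass them to the query as parameters of an IN list. The sketch below assumes Python with the built-in sqlite3 module; the sales table layout and its rows are hypothetical, and reading the cells from Excel is left out.

```python
import sqlite3


def units_for_ids(conn, ids):
    """Return (id, total units) for the IDs supplied by the caller."""
    ids = list(ids)
    if not ids:
        return []
    # One "?" placeholder per value keeps the query parameterized.
    placeholders = ", ".join("?" for _ in ids)
    sql = f"""
        SELECT id, SUM(units) AS totalunits
        FROM sales
        WHERE id IN ({placeholders})
        GROUP BY id
        ORDER BY id
    """
    return conn.execute(sql, ids).fetchall()


# Hypothetical data standing in for the real sales table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (id INTEGER, units INTEGER);
    INSERT INTO sales VALUES (1, 5), (1, 2), (2, 10), (3, 1);
""")

cell_values = [1, 3]  # stand-in for the contents of A2:A10
print(units_for_ids(conn, cell_values))
```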
Question 3: Can I add a date range cell reference with the possibility that the same ID may appear more than once but have a different date range (see 747375 in sample) and return results for both ranges separately?
This one is difficult to understand, and it might be moot given the answer to your second question. However, you can set up a query that chooses the IDs you want and also selects the min and max dates for each. You can then use that query as a subquery to get, for each ID, the sum of units within the min/max dates and the sum of units outside them. This would require some effort, and I will not try to work it out here.

Related

SQL query for percentage change compared to previous date

I have a table in Access containing the performance of departments on different reference dates. All data is in one table, "tblmain". The table contains the following fields:
reference date (called "ref_date", formatted dd.mm.yyyy)
department identifier (called "dep_id")
performance value (called "val")
Every reference date covers roughly 100 departments, and every week I import a new reference date.
My goal now is to build a query that calculates the percentage change from one reference date to the previous reference date. Furthermore, it should only show the departments with a change bigger than 5%.
I am currently stuck. I have created a query that gives me the val from the previous reference date but only for one specific department. And I do not know how to continue. This query looks as follows:
SELECT TOP 1 tblmain.val
FROM (SELECT TOP 2 tblmain.val, tblmain.ref_date FROM tblmain WHERE dep_id=1 ORDER BY tblmain.ref_date DESC)
ORDER BY tblmain.ref_date;
I would appreciate any feedback. After finishing this query, I plan to use it in a form where I can choose a reference date and threshold.
Many thanks in advance!
Query to pull prior val for each record:
SELECT tblMain.ID, tblMain.ref_date, tblMain.dep_id, tblMain.val,
(SELECT TOP 1 val FROM tblMain AS Dupe
WHERE Dupe.dep_id=tblMain.dep_id AND Dupe.ref_Date < tblMain.ref_date
ORDER BY Dupe.ref_date DESC) AS PriorVal
FROM tblMain;
Now use that query to calculate percentage:
SELECT Query1.*, Abs(([PriorVal]-[val])/[PriorVal]*100) AS P
FROM Query1
WHERE (((Abs(([PriorVal]-[val])/[PriorVal]*100))>5));
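The two-step idea (look up the prior value, then filter on the percentage change) can be tried outside Access with a minimal sketch like the one below. It assumes Python's built-in sqlite3 module, so the dialect differs from Access (LIMIT 1 instead of TOP 1), dates are stored as ISO text, and the sample rows are invented. The subquery orders by ref_date descending so that LIMIT 1 picks the most recent earlier date for the same department.

```python
import sqlite3

PCT_CHANGE_SQL = """
    SELECT cur.dep_id, cur.ref_date, cur.val, cur.prior_val,
           ABS((cur.prior_val - cur.val) * 100.0 / cur.prior_val) AS pct_change
    FROM (
        SELECT m.dep_id, m.ref_date, m.val,
               (SELECT p.val
                FROM tblmain AS p
                WHERE p.dep_id = m.dep_id AND p.ref_date < m.ref_date
                ORDER BY p.ref_date DESC
                LIMIT 1) AS prior_val
        FROM tblmain AS m
    ) AS cur
    WHERE cur.prior_val IS NOT NULL
      AND cur.prior_val <> 0
      AND ABS((cur.prior_val - cur.val) * 100.0 / cur.prior_val) > ?
    ORDER BY cur.dep_id, cur.ref_date
"""

# Hypothetical weekly values for two departments.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblmain (ref_date TEXT, dep_id INTEGER, val REAL);
    INSERT INTO tblmain VALUES
        ('2020-01-01', 1, 100.0),
        ('2020-01-08', 1, 110.0),
        ('2020-01-15', 1, 112.0),
        ('2020-01-01', 2, 50.0),
        ('2020-01-08', 2, 51.0);
""")

threshold = 5  # percent; in the real form this would be user input
rows = conn.execute(PCT_CHANGE_SQL, (threshold,)).fetchall()
for row in rows:
    print(row)
```

The extra checks for a NULL or zero prior value skip the first reference date of each department and avoid a division by zero.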

Group or Sum the data based on overlapping period

I'm working on migrating legacy system data to a new system. I'm trying to migrate the data with history based on the changed date. My current query produces the output below.
Since it's a legacy system, some of the data falls within the same period. I want to group the data by id and name, and mark each value as an active or inactive record depending on whether the data falls within the same period.
My expected output:
For example, take 119. One row is marked in yellow because it does not overlap any other row's period, but the other two rows overlap in the period 01-Nov-18 to 30-Sep-19.
I need to split the data at the overlapping period and add the values only for the overlapped period. So I need to look for combinations based on date, which introduces new rows: the non-overlapped part results in the two rows below.
And another row for the overlapped part:
The same scenario applies to 148324: two rows are introduced, one for the overlapped period and one for the non-overlapped period.
Also, is it possible to get the non-overlapped data alone based on some condition? I want to move only the overlapping data to a temp table, so that I can move the non-overlapped data directly to the output table.
I don't think I have a 100% solution; it's hard to decide which data are right and how to sort them.
This query is based on the lead/lag analytic functions. I had to replace NULL values with suitable sentinel values at each end of the sequence (far future and far past).
Please try this query and modify it as needed; I hope it fits your case.
My table:
Query:
SELECT id,name,value,startdate,enddate,
CASE WHEN nvl(next_startdate,29993112)>nvl(prev_enddate,19900101) THEN 'Y' ELSE 'N' END AS active
FROM
(
SELECT datatable.*,
lag(enddate) over (partition by id,name order by startdate,value desc) prev_enddate,
lead(startdate) over (partition by id,name order by startdate,value desc) next_startdate
FROM datatable
) dt
Results:

How to count employees that have been promoted?

I'm trying to figure out how to come up with a calculation or query to count the number of employees by grade promoted on each pay period.
*Count the number of records whose value in grade has increased, by pay period.
Sample solution:
Soln:
Year Payroll Period Count
2018 16 2
2019 6 1
2019 10 1
I've tried pivots and queries in Access, but I think this needs an inner join to identify the specific employees who got promoted. Thanks for the assistance.
This formula in Excel seems to work, but it needs to be moved to Access because of the number of records. I think an inner join would make this work. =AND(B2<>B3,C2=C3,D3>D2)
Based on EXCEL, you can derive your solution, assuming that your records are in sequence for columns Year, Payroll, Employee & Grade.
Add another column to determine if there is a grade increase for that particular Payroll Period.
For excel cell reference sake, "Year" is in cell A1
Set the 1st cell of this column (row 2) to FALSE.
For the next cell in this new column (row 3), set it as follows: =AND(A3=A2,B3<>B2,C3=C2,D3>D2)
This checks whether there is a grade increase for that particular Payroll Period.
The formula's checks, in sequence: 1. the year is the same (A3=A2); 2. the Payroll Period is different (B3<>B2); 3. the Employee is the same (C3=C2); and finally 4. the grade has increased (D3>D2).
Copy this formula down to the rest of your range.
Next, you can start to pivot.
Add your pivot table from your table/range with the following
Filter Grade Increase to true and also change the values aggregation of Employee from Sum to Count.
You will get the following:
I would rename Count of Employees to make it more meaningful.
One caveat for the above approach: if the grade was increased at the 1st Payroll Period of a year, the increase won't be captured, because the previous record belongs to the prior year. To handle that, remove the year check (A3=A2) from the formula.
Edit:
Doing a bit of research, perhaps you can do
select t1.*, (t1.Grade > t2.Grade) as Grade_Increase
from YourTableName t1 left join YourTableName t2 on
t1.Employee = t2.Employee and
(((t1.Year - 2018)*26) + t1.Payroll_Period) =
(((t2.Year - 2018)*26) + t2.Payroll_Period + 1)
What the above does is essentially join the table to itself, assuming 26 payroll periods per year. The + 1 on the t2 side pairs each t1 record with the record from the prior payroll period, so t2 holds the earlier grade.
Records that are 'next in sequence' are combined into the same row, and the grades are compared.
This was not verified in Access.
Substitute 2018 with whatever your base year is. I'm using 2018 to calculate the sequence number of the records. Initially I thought of using common table expressions, rank and row_number, but Access doesn't seem to support these functions.
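To see the self-join counting promotions per pay period, here is a minimal sketch using Python's built-in sqlite3 module rather than Access. It assumes 26 payroll periods per year and one row per employee for every period; the table name payroll, the column names and the sample rows are hypothetical.

```python
import sqlite3

PROMOTIONS_SQL = """
    SELECT cur.year, cur.payroll_period, COUNT(*) AS promoted
    FROM payroll AS cur
    JOIN payroll AS prev
      ON prev.employee = cur.employee
     -- prev is the record exactly one period before cur
     AND prev.year * :n + prev.payroll_period + 1
       = cur.year * :n + cur.payroll_period
    WHERE cur.grade > prev.grade
    GROUP BY cur.year, cur.payroll_period
    ORDER BY cur.year, cur.payroll_period
"""

# Hypothetical data: A and B are promoted in 2018 period 16,
# C is promoted across the year boundary (2018/26 -> 2019/1).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payroll (
        year INTEGER, payroll_period INTEGER, employee TEXT, grade INTEGER
    );
    INSERT INTO payroll VALUES
        (2018, 15, 'A', 3), (2018, 16, 'A', 4),
        (2018, 15, 'B', 2), (2018, 16, 'B', 3), (2018, 17, 'B', 3),
        (2018, 26, 'C', 5), (2019, 1, 'C', 6);
""")

rows = conn.execute(PROMOTIONS_SQL, {"n": 26}).fetchall()
for row in rows:
    print(row)
```

Turning year and period into a single sequence number is what lets the join match a first-period record to the last period of the previous year.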

MS Access SQL statement count usage

I am new to SQL. I was given coursework to report usage data over the last 2 months. Can someone help me with the SQL statement?
SELECT COUNT(Member_ID,Non_Member_Name) AS Pool_usage_last_2_months
FROM Use_of_pool
WHERE DATEDIFF(‘2012-04-21’,’2012-02-21’)
What I meant to do is to count the total number of member usage(member_ID) and non member usage(no ID,name only) from the last two months and then output the name and date and time,etc. on the same report. Is there any SQL statement to output that kind of information? Correction/Suggestions are welcomed.
You need a different WHERE clause. Assuming your Use_of_pool table includes a Date/Time field, date_field:
WHERE date_field >= #2012-02-21# AND date_field <= #2012-04-21#
If date_field values can include a time component other than midnight, advance the end of the date range by one day and use a strict less-than, to capture all the possible Date/Time values from Apr. 21 without including midnight of Apr. 22:
WHERE date_field >= #2012-02-21# AND date_field < #2012-04-22#
That should restrict the rows to match what I think you want. It should offer fast performance with an index on date_field.
I'm unclear about the count(s) you want ... whether it is to be one count for all visits (both member and non-member), or separate counts for members and non-members.
Edit: If each row of the table represents a visit by one person, you can simply count the rows to determine the number of visits during your selected time frame.
SELECT Count(*) AS CountOfVisits
FROM Use_of_pool
WHERE date_field >= #2012-02-21# AND date_field <= #2012-04-21#
Notice each visit by the same person will contribute to CountOfVisits, which is what I think you want. If you wanted to know how many different people visited, we will need a different approach.
Edit2:
It sounds like you can use Member_ID and Non_Member_Name to distinguish between member and nonmember visits. Member_ID is Null for nonmembers and non-Null for members. And Non_Member_Name is Null for members and non-Null for nonmembers.
If that is true, try this query to count member and nonmember visits separately.
SELECT
Sum(IIf(Member_ID Is Not Null, 1, 0)) AS member_visits,
Sum(IIf(Non_Member_Name Is Not Null, 1, 0)) AS non_member_visits
FROM Use_of_pool
WHERE date_field >= #2012-02-21# AND date_field <= #2012-04-21#
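The conditional counts can be tried with a small sketch in Python's built-in sqlite3 module. This is not Access SQL: CASE replaces IIf, dates are ISO text, and the end of the range is passed as an exclusive bound so that visits with a time of day on the last date are still included. The sample rows are invented; the column names follow the question.

```python
import sqlite3


def count_visits(conn, start, end_exclusive):
    """Return (all visits, member visits, non-member visits) in the range."""
    sql = """
        SELECT COUNT(*) AS visits,
               COALESCE(SUM(CASE WHEN Member_ID IS NOT NULL THEN 1 ELSE 0 END), 0),
               COALESCE(SUM(CASE WHEN Member_ID IS NULL THEN 1 ELSE 0 END), 0)
        FROM Use_of_pool
        WHERE date_field >= ? AND date_field < ?
    """
    return conn.execute(sql, (start, end_exclusive)).fetchone()


# Hypothetical visits: members have a Member_ID, non-members only a name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Use_of_pool (
        Member_ID INTEGER, Non_Member_Name TEXT, date_field TEXT
    );
    INSERT INTO Use_of_pool VALUES
        (1, NULL, '2012-02-21 09:00:00'),
        (1, NULL, '2012-03-01 10:00:00'),
        (NULL, 'Ann', '2012-04-21 18:30:00'),
        (2, NULL, '2012-04-22 00:00:00'),
        (NULL, 'Bob', '2012-01-15 12:00:00');
""")

result = count_visits(conn, '2012-02-21', '2012-04-22')
print(result)
```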
Aggregate functions of SQL use all the data in a column (more precisely, all the data your WHERE clause selects) to produce a single datum. COUNT gives you the number of data rows that matched your WHERE clause. So for example:
SELECT COUNT(*) AS Non_members FROM Use_of_pool WHERE Member_ID IS NULL
will give you the number of times the pool was used by a non-member, and
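SELECT COUNT(*) AS Members FROM (SELECT DISTINCT Member_ID FROM Use_of_pool WHERE Member_ID IS NOT NULL) AS m
will give you the number of members who have used the pool at least once (the DISTINCT tells the database engine to ignore duplicates; Access SQL does not accept COUNT(DISTINCT ...), so the distinct IDs are selected in a subquery and then counted).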
You can expand the WHERE clause to further specify what you want to count. If "last two months" means the current and previous calendar month, you'll need:
... WHERE DateDiff("m",Date_field,Date())<=1
If it means a rolling 2-month period, I'd approximate that with 60 days and say
... WHERE DateDiff("d",Date_field,Date())<60
(Replace Date_field with the name of the field containing the date.)
If you want to count rows according to multiple different criteria, or output both aggregate data and individual data, you'll be best off using separate SELECT statements.

Having trouble using Partition() in Access

I have a table with the fields: resourceID,work_date,stringValue
I'm trying to build an Access query that will show a count of how many different resourceID numbers with a given stringValue occur in each week over a given date range. Using partition() seems to be the simplest approach, however, when I use the following query:
select partition([work_date],#6/6/2011#,#9/4/2011#,7),stringValue,
count(resourceID) from
(select distinct resourceID,work_date,stringValue from myTable) as subQuery
group by partition([work_date],#6/6/2011#,#9/4/2011#,7), stringValue
then I have two problems:
-My dates end up formatted as integers, e.g.:
:40699
40700:40706
40784:40790
whereas I want them to appear as, e.g., 6/6/2011:6/12/2011 (I also don't want the :40699 value)
-resourceID gets counted more than once per week if it appears on more than one weekday; I just want it to be counted once for each stringValue if it appears at all that week. I thought the distinct qualifier would accomplish this but it didn't.
EDIT: I've solved the excess resourceID count by putting the partition in a subquery as follows:
select datePartition,stringValue,count(ID)
from (select partition() as datePartition, stringValue, ID
from (select distinct stringvalue,ID,work_date))
group by datePartition,stringvalue
and then pulling count(ID) from that subquery. Still can't figure out the date formatting, though.
I see 2 solutions, and I see no reason to use Partition() here!
Solution 1: SELECT Format([DateFact],"ww-yyyy") AS weekOfYear returns the week number and year: fine for grouping by week and displaying the week number. (In the query design grid, the argument separator may be ; depending on your regional settings.)
Solution 2: SELECT [DateFact]-Weekday([DateFact],2)+1 AS weekStarting returns the Monday of the week as a date: fine for grouping by week and displaying the week's starting day. Here [DateFact] stands for your date field (work_date in the question).
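To illustrate grouping by week start and counting each resourceID once per week and stringValue, here is a minimal sketch in plain Python rather than Access. It assumes weeks start on Monday (6/6/2011 is a Monday); the function names and the sample rows are made up.

```python
from collections import defaultdict
from datetime import date, timedelta


def week_start(d):
    """Return the Monday on or before d."""
    return d - timedelta(days=d.weekday())


def weekly_distinct_counts(rows):
    """Count distinct resource IDs per (week start, string value).

    rows is an iterable of (resource_id, work_date, string_value).
    """
    seen = defaultdict(set)
    for resource_id, work_date, string_value in rows:
        seen[(week_start(work_date), string_value)].add(resource_id)
    return {key: len(ids) for key, ids in sorted(seen.items())}


# Hypothetical rows: resource 1 appears twice in the first week for 'a'.
sample = [
    (1, date(2011, 6, 6), 'a'),
    (1, date(2011, 6, 8), 'a'),
    (2, date(2011, 6, 12), 'a'),
    (1, date(2011, 6, 13), 'a'),
    (3, date(2011, 6, 7), 'b'),
]

counts = weekly_distinct_counts(sample)
for (start, value), n in counts.items():
    print(start.isoformat(), value, n)
```

Collecting IDs in a set per group is the in-memory counterpart of selecting DISTINCT on the week key instead of on work_date.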
To get rid of the range which ends with 40699, try revising the subquery piece. Add a WHERE condition to limit rows to work_date values >= #6/6/2011#
SELECT DISTINCT resourceID, work_date, stringValue
FROM myTable
WHERE work_date >= #6/6/2011#;
I'm not clear about your description regarding the duplicate rows from the SELECT DISTINCT. One possibility could be that at least some of your work_date values contain different time values from the same date. So two rows with the same resourceID and stringValue, but work_date values of #6/6/2011 01:00 AM# and #6/6/2011 02:00 AM#, would legitimately qualify as distinct.
If that's not the explanation, clarify your question by showing us a smallish set of data from myTable, and the resultset of that data which illustrates how you want the data evaluated.
AFAICT, the issue about Partition() presenting your date ranges as whole numbers is unavoidable. According to the help topic, Partition() expects whole numbers for its parameters. Apparently it's willing to accept your Date/Time values by casting them to whole numbers. But it's not willing/able to transform them back to date strings. You would have to transform them back yourself. A user-defined function could do it when called from a query.
Public Function WholeNumRange2Date(ByVal pRange As String) As String
    Const cstrFmt As String = "m/d/yyyy"
    Dim varPieces As Variant
    ' Split "40700:40706" into its start and end serial numbers
    varPieces = Split(pRange, ":")
    WholeNumRange2Date = Format(CDate(varPieces(0)), cstrFmt) & _
        ":" & Format(CDate(varPieces(1)), cstrFmt)
End Function
An example from the Immediate Window which uses that function ...
? WholeNumRange2Date("40700:40706")
6/6/2011:6/12/2011
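For checking such conversions outside Access, the same transformation can be sketched in Python. This assumes Access date serial numbers count days from 30 December 1899; the function names are hypothetical. Unlike the VBA function above, this version leaves an empty piece (as in ":40699") empty instead of failing.

```python
from datetime import date, timedelta

ACCESS_EPOCH = date(1899, 12, 30)  # day 0 of Access date serial numbers


def serial_to_date(serial):
    """Convert a whole-number Access date serial to a date."""
    return ACCESS_EPOCH + timedelta(days=int(serial))


def whole_num_range_to_date(range_text):
    """Turn '40700:40706' into '6/6/2011:6/12/2011'."""
    def fmt(piece):
        piece = piece.strip()  # Partition() may pad its output with spaces
        if not piece:
            return ""
        d = serial_to_date(piece)
        return f"{d.month}/{d.day}/{d.year}"

    return ":".join(fmt(piece) for piece in range_text.split(":"))


print(whole_num_range_to_date("40700:40706"))
```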