How to simplify WHERE clause with group by conditions - sql

I need help with SQL, as I am new to writing SQL queries.
Below are the sample table structure and the sample GROUP BY result.
Assume that the table can hold a very large number of records with many different column values.
The output I expect is the sum for each group that the user selects.
For example, below is the expected output:
Here is the SQL Query which I am using to fetch the above shown result:
SELECT project,
grant,
program,
department,
SUM(amount) AS Total
FROM tran_table
WHERE ( project = 'pj1'
AND grant = 'gr1'
AND program = 'pg1'
AND department = 'dp1' )
OR ( project = 'pj3'
AND grant = 'gr2'
AND program = 'pg1'
AND department = 'dp2' )
OR ( project = 'pj6'
AND grant = 'gr3'
AND program = 'pg2'
AND department = 'dp1' )
GROUP BY project,
grant,
program,
department
Question:
Is this a correct way to write the SQL Query with all different group values in the WHERE condition when the expected output could be for 100 different groups?

Your approach is fine for 100 selections; it will fail with more than 1,000 selections, which is Oracle's limit on the number of entries in an IN list.
You may use the following WHERE clause
WHERE (project, grant, program, department) in
(('pj1', 'gr1', 'pg1', 'dp1'),
('pj3', 'gr2', 'pg1', 'dp2'),
('pj6', 'gr3', 'pg2', 'dp1')
)
As this is a reporting query that leads to a full table scan on a large table, you need not care much about bind variables or about limiting parsing. You should, however, validate the user input to prevent SQL injection.
Of course the alternative is to use a temporary table filled with the input data with an INNER JOIN to filter the data.
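The temporary-table alternative can be sketched as follows. This is a minimal illustration only: it uses Python's built-in sqlite3 module as a stand-in for Oracle, the table name selection and all sample rows are made up, and the column names are taken from the question. The selections are loaded with bound parameters, which also keeps user input out of the SQL text.

```python
import sqlite3

# Hypothetical sample rows; column names follow the question's tran_table.
# "grant" is quoted because it is a reserved word in many SQL dialects.
rows = [
    ("pj1", "gr1", "pg1", "dp1", 100.0),
    ("pj1", "gr1", "pg1", "dp1", 50.0),
    ("pj3", "gr2", "pg1", "dp2", 70.0),
    ("pj6", "gr3", "pg2", "dp1", 30.0),
    ("pj9", "gr9", "pg9", "dp9", 999.0),  # a group the user did not select
]
# The groups chosen by the user (this list could hold hundreds of tuples).
selections = [
    ("pj1", "gr1", "pg1", "dp1"),
    ("pj3", "gr2", "pg1", "dp2"),
    ("pj6", "gr3", "pg2", "dp1"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE tran_table '
    '(project TEXT, "grant" TEXT, program TEXT, department TEXT, amount REAL)'
)
conn.executemany("INSERT INTO tran_table VALUES (?, ?, ?, ?, ?)", rows)

# Temporary table holding the user's selections, filled via bound parameters.
conn.execute(
    'CREATE TEMP TABLE selection '
    '(project TEXT, "grant" TEXT, program TEXT, department TEXT)'
)
conn.executemany("INSERT INTO selection VALUES (?, ?, ?, ?)", selections)

# The INNER JOIN replaces the long OR / IN list in the WHERE clause.
totals = conn.execute(
    '''
    SELECT t.project, t."grant", t.program, t.department, SUM(t.amount) AS total
    FROM tran_table AS t
    INNER JOIN selection AS s
        ON  s.project    = t.project
        AND s."grant"    = t."grant"
        AND s.program    = t.program
        AND s.department = t.department
    GROUP BY t.project, t."grant", t.program, t.department
    ORDER BY t.project
    '''
).fetchall()

for row in totals:
    print(row)
```

Note that a duplicate tuple in the selection table would double-count the matching rows, so the selections should be de-duplicated before loading (or the table given a unique key).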

Related

Access query to remove a group of records

I'm trying to create a query in MS Access that returns only records from groups in which no record meets a given criterion; if any record in a group has the criterion set to TRUE, the whole group should be left out of the result set.
I have the following fields ID, Teacher_Name, Dsp_Prd, Course_Key, Long_Description, Sec, Tot_Stds, Contains_CC, SchoolCode, IsCoTeach.
A teacher can appear multiple times in a given period (Dsp_Prd). However, if they are assigned to a class as an assistant (IsCoTeach=TRUE), then all of the classes that they appear in should be filtered from the dataset.
For example:
The results should be:
Thanks for your help!
You can use not exists:
select t.*
from SchoolData as t
where not exists (
select 1
from SchoolData as t1
where t1.teacher_name = t.teacher_name and t1.dsp_prd = t.dsp_prd and t1.IsCoTeach = 'TRUE'
)
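To see what the NOT EXISTS filter keeps, here is a small sketch run against made-up data. It uses Python's sqlite3 module rather than Access, keeps only a few of the question's columns, and stores IsCoTeach as 1/0 instead of the 'TRUE' text comparison used above; all of these are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SchoolData "
    "(Teacher_Name TEXT, Dsp_Prd INTEGER, Course_Key TEXT, IsCoTeach INTEGER)"
)
# Hypothetical rows: Smith co-teaches one class in period 1.
conn.executemany(
    "INSERT INTO SchoolData VALUES (?, ?, ?, ?)",
    [
        ("Smith", 1, "MATH", 0),
        ("Smith", 1, "ALG", 1),   # co-teaching row: drops all of Smith's period 1
        ("Smith", 2, "GEOM", 0),  # a different period, so it is kept
        ("Jones", 1, "ENG", 0),
    ],
)

# Same shape as the answer's query: a row survives only if no row for the
# same teacher and period is flagged as co-teaching.
kept = conn.execute(
    """
    SELECT t.Teacher_Name, t.Dsp_Prd, t.Course_Key
    FROM SchoolData AS t
    WHERE NOT EXISTS (
        SELECT 1
        FROM SchoolData AS t1
        WHERE t1.Teacher_Name = t.Teacher_Name
          AND t1.Dsp_Prd = t.Dsp_Prd
          AND t1.IsCoTeach = 1
    )
    ORDER BY t.Teacher_Name, t.Dsp_Prd
    """
).fetchall()

for row in kept:
    print(row)
```

The group here is teacher plus period, as in the answer. If a co-teaching assignment should remove the teacher from every period, the t1.Dsp_Prd = t.Dsp_Prd condition would be dropped.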

Randomize Return Results - Access

I need to match up an employee with a task in a small Microsoft Access DB I built. Essentially, I have a list of 45 potential tasks, and I have 25 employees. What I need is:
Each employee to have at LEAST one task
No employee to have more than TWO
Be able to randomize the results every time I run the query (so the same people don't get consistently the same tasks)
My table structure is:
Employees - w/ fields: ID, Name
Tasks - w/ fields: ID, Location, Task Group, Task
I know this is a dumb question, but I truly am struggling. I have searched through SO and Google for help but have been unsuccessful.
I don't have a way to link together employees to tasks since each employee is capable of every task, so I was going to:
1. SELECT * from Employees
2. SELECT * from Tasks
3. Union
4. COUNT(Name) <= 2
But I don't know how to randomize those results so that folks are randomly matched up, with each person at least once and nobody more than twice.
Any help or guidance is appreciated. Thank you.
Consider a cross join with an aggregate query that randomizes the choice set. Currently, at 45 X 25 this yields a cartesian product of 1,125 records which is manageable.
Select query (save as a query object, assumes Tasks has autonumber field)
SELECT cj.[Emp_Name], Max(cj.ID) As M_ID, Max(cj.Task) As M_Task
FROM
(SELECT e.[Emp_Name], t.ID, t.Task
FROM Employees e,
Tasks t) cj
GROUP BY cj.[Emp_Name], Rnd(cj.ID)
ORDER BY cj.[Emp_Name], Rnd(cj.ID)
However, the challenge is that the query above randomizes the order of all 45 tasks for each of the 25 employees, whereas you need only the top two tasks per employee. Unfortunately, MS Access has no row-number function like other DBMSs that could be used to select the top 2 per employee. And we cannot use a correlated subquery on Task ID per employee, since that would always return the two highest task IDs by value rather than two random ones.
Therefore, to do this in Access, you will need a temp table that is cleaned out before each allocation of employee tasks, with an autonumber field used for the selection via a correlated subquery.
Create table (run once, autonumber field required)
CREATE TABLE CrossJoinRandomPicks (
ID AUTOINCREMENT PRIMARY KEY,
Emp_Name TEXT(255),
M_ID LONG,
M_Task TEXT(255)
)
Delete query (run regularly)
DELETE FROM CrossJoinRandomPicks;
Append query (run regularly)
INSERT INTO CrossJoinRandomPicks ([Emp_Name], [M_ID], [M_Task])
SELECT [Emp_Name], [M_ID], [M_Task]
FROM mySavedCrossJoinQuery;
Final query (selects top two random tasks for each employee)
SELECT c.Emp_Name, c.M_Task
FROM CrossJoinRandomPicks c
WHERE
(SELECT Count(*) FROM CrossJoinRandomPicks sub
WHERE sub.Emp_Name = c.Emp_Name
AND sub.ID <= c.ID) <= 2;
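If the allocation can be done outside Access, the same constraints can be met with a plain shuffle. The sketch below is a different technique from the queries above, not a translation of them: it hands every task to exactly one employee, and the function name assign_tasks and the sample names are made up. It assumes the number of tasks lies between the number of employees and twice that number, which holds for 45 tasks and 25 employees.

```python
import random


def assign_tasks(employees, tasks, rng=None):
    """Randomly give each employee one or two tasks; each task is used once."""
    rng = rng or random.Random()
    if not len(employees) <= len(tasks) <= 2 * len(employees):
        raise ValueError("need len(employees) <= len(tasks) <= 2 * len(employees)")

    shuffled_tasks = list(tasks)
    rng.shuffle(shuffled_tasks)

    # Two independently shuffled passes over the employees: the first pass
    # guarantees everyone gets a task, the second hands out the remainder.
    first_pass = list(employees)
    rng.shuffle(first_pass)
    second_pass = list(employees)
    rng.shuffle(second_pass)

    assignment = {employee: [] for employee in employees}
    # zip stops when the tasks run out, so nobody can exceed two tasks.
    for employee, task in zip(first_pass + second_pass, shuffled_tasks):
        assignment[employee].append(task)
    return assignment


# Hypothetical data matching the sizes in the question.
employees = [f"Employee {i}" for i in range(1, 26)]
tasks = [f"Task {i}" for i in range(1, 46)]
assignment = assign_tasks(employees, tasks)

for employee, assigned in assignment.items():
    print(employee, assigned)
```

With 25 employees and 45 tasks this leaves 20 employees with two tasks and 5 with one, and a fresh run produces a different matching.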

Add aliases to each table in SQL query (programmatically)

I need to programmatically edit SQL commands in such a way that each table in the SQL command has an alias. The input is a schema for the database and an SQL command. The output should be an SQL query in which every table has an alias, and that alias is always used when we reference an attribute of that table.
For example, let us have a database person(id, name, salary, did) and department(did, name) and the following SQL command:
select id, t.did, maxs
from person
join (
select did, max(salary) maxs
from person
group by did
) t on t.maxs = salary and person.did = t.did
The expected result for such input would be
select p1.id, t.did, t.maxs
from person p1
join (
select p2.did, max(p2.salary) maxs
from person p2
group by p2.did
) t on t.maxs = p1.salary and p1.did = t.did
I was considering using ANTLR4 for this; however, I was curious whether there is a simpler solution. I recently came across TSqlParser. Is it possible to use this class to achieve such a rewrite in some simple way?

Multi-level GROUP BY clause not allowed in subquery

I have a query as follows in MS Access
SELECT tblUsers.Forename, tblUsers.Surname,
(SELECT COUNT(ID)
FROM tblGrades
WHERE UserID = tblUsers.UserID
AND (Grade = 'A' OR Grade = 'B' OR Grade = 'C')) AS TotalGrades
FROM tblUsers
I've put this into a report, and now when I try to view the report it displays the alert "Multi-level GROUP BY clause is not allowed in subquery".
What I don't get is that I don't even have any GROUP BY clauses in the query, so why is it returning this error?
From Allen Browne's excellent website of Access tips: Surviving Subqueries
Error: "Multi-level group by not allowed"
You spent half an hour building a query with subquery, and verifying it all works. You create a report based on the query, and immediately it fails. Why?
The problem arises from what Access does behind the scenes in response to the report's Sorting and Grouping or aggregation. If it must aggregate the data for the report, and that's the "multi-level" grouping that is not permitted.
Solutions
In report design, remove everything from the Sorting and Grouping dialog, and do not try to sum anything in the Report Header or Report Footer. (In most cases this is not a practical solution.)
In query design, uncheck the Show box under the subquery. (This solution is practical only if you do not need to show the results of the subquery in the report.)
Create a separate query that handles the subquery. Use this query as a source "table" for the query the report is based on. Moving the subquery to the lower level query sometimes (not always) avoids the problem, even if the second query is as simple as
SELECT * FROM Query1;
Use a domain aggregate function such as DSum() instead of a subquery. While this is fine for small tables, performance will be unusable for large ones.
If nothing else works, create a temporary table to hold the data for the report. You can convert your query into an Append query (Append on Query menu in query design) to populate the temporary table, and then base the report on the temporary table.
IMPORTANT NOTE: I'm reposting the info here because I believe Allen Browne explicitly allows it. From his website:
Permission
You may freely use anything (code, forms, algorithms, ...) from these articles and sample databases for any purpose (personal, educational, commercial, resale, ...). All we ask is that you acknowledge this website in your code, with comments such as:
'Source: http://allenbrowne.com
'Adapted from: http://allenbrowne.com
Try this version:
SELECT users.Forename, users.Surname, grades.TotalGrades
FROM tblUsers AS users
LEFT JOIN (SELECT COUNT(ID) AS TotalGrades, UserID
FROM tblGrades
WHERE Grade = 'A' OR Grade = 'B' OR Grade = 'C'
GROUP BY UserID) AS grades ON grades.UserID = users.UserID
I have not tested it. The query itself should be OK, but I'm not sure whether it works as the report's data source.
Try this:
SELECT users.Forename, users.Surname, COUNT(grades.ID) AS TotalGrades
FROM tblUsers AS users
INNER JOIN tblGrades AS grades ON users.UserID = grades.UserID
WHERE grades.Grade IN ("A", "B", "C")
GROUP BY users.UserID, users.Forename, users.Surname;
This is a simple join. It selects every row where a user has a grade of "A", "B" or "C", which gives you a table like this:
user1 | A
user1 | B
user1 | A
user2 | A
...
And then it groups it by users, counting how many times a grade appeared -> giving you the number of grades in the desired range for each user.
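The two join-based answers differ in one respect that a small test makes visible: the INNER JOIN drops users who have no grade in the range, while the LEFT JOIN onto a grouped subquery keeps them with an empty count. The sketch below runs both shapes with Python's sqlite3 module instead of Access; the sample users and grades are invented, and the key column is assumed to be UserID in both tables, as in the question's subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblUsers (UserID INTEGER, Forename TEXT, Surname TEXT)")
conn.execute("CREATE TABLE tblGrades (ID INTEGER, UserID INTEGER, Grade TEXT)")
conn.executemany(
    "INSERT INTO tblUsers VALUES (?, ?, ?)",
    [(1, "Ann", "Lee"), (2, "Bob", "Ray"), (3, "Cy", "Poe")],
)
conn.executemany(
    "INSERT INTO tblGrades VALUES (?, ?, ?)",
    [(1, 1, "A"), (2, 1, "B"), (3, 1, "A"), (4, 2, "A"), (5, 2, "D"), (6, 3, "F")],
)

# Join first, then group: users without an A, B or C disappear entirely.
inner_rows = conn.execute(
    """
    SELECT users.Forename, users.Surname, COUNT(grades.ID) AS TotalGrades
    FROM tblUsers AS users
    INNER JOIN tblGrades AS grades ON users.UserID = grades.UserID
    WHERE grades.Grade IN ('A', 'B', 'C')
    GROUP BY users.UserID, users.Forename, users.Surname
    ORDER BY users.UserID
    """
).fetchall()

# Group inside a derived table, then LEFT JOIN: every user is listed.
left_rows = conn.execute(
    """
    SELECT users.Forename, users.Surname, grades.TotalGrades
    FROM tblUsers AS users
    LEFT JOIN (
        SELECT UserID, COUNT(ID) AS TotalGrades
        FROM tblGrades
        WHERE Grade IN ('A', 'B', 'C')
        GROUP BY UserID
    ) AS grades ON grades.UserID = users.UserID
    ORDER BY users.UserID
    """
).fetchall()

print(inner_rows)
print(left_rows)
```

The original correlated subquery would report 0 for a user like Cy, so the LEFT JOIN version is the closer match once the empty count is converted to 0 (for example with Nz() in Access).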

Generate "scatter plot" result of members against sets from SQL query

I have a staff database table containing staff members, with user_no and user_name columns. I have another, department, table containing the departments which staff can be members of, with dept_no and dept_name as columns.
Because staff can be members of multiple departments, I have a third, staff_dept, table with a user_no column and a dept_no column, which are the primary keys of those other two tables. This table shows which departments each member of staff belongs to and contains one row for each user/department intersection.
I would like to have an output in the form of a spreadsheet (CSV file, whatever; I'll be fine mangling the results into a usable form after I've got them) with one column for each department, and one row for each user, with an X appearing at each intersection, as defined in staff_dept.
Can I write a single SQL query which will achieve this result? or will I have to do some "real" programming (because it's not a "real" program until you've nested three or four for loops, obviously) to collect and format this data?
This can be done with a PIVOT table (using SQL Server):
SELECT user_name, [dept1name], [dept2name], [dept3name], ...
FROM
(SELECT s.user_name, d.dept_name,
case when sd.user_no is not null then 'X' else '' end as matches
from staff s
cross join department d
left join staff_dept sd on s.user_no = sd.user_no and d.dept_no = sd.dept_no
) AS s
PIVOT
(
min(matches)
FOR dept_name IN ([dept1name], [dept2name], [dept3name], ...)
) AS pvt
order by user_name
Demo: http://www.sqlfiddle.com/#!3/c136d/5
Edit: To generate the PIVOT query dynamically from the list of departments in the table, you would make use of dynamic SQL, i.e., generate the code into a variable and use sp_executesql helper stored procedure. Here's an example: http://www.sqlfiddle.com/#!3/c136d/14
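The idea of generating the column list from the department table can be sketched outside SQL Server as well. The example below is an illustration under several assumptions: it uses Python's sqlite3 module, which has no PIVOT operator, so conditional aggregation (MAX over a CASE expression) is swapped in for PIVOT; the helper quote_ident and all sample rows are made up. Department numbers are passed as bound parameters and department names are quoted as identifiers, so nothing from the data is pasted into the SQL unescaped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (user_no INTEGER, user_name TEXT)")
conn.execute("CREATE TABLE department (dept_no INTEGER, dept_name TEXT)")
conn.execute("CREATE TABLE staff_dept (user_no INTEGER, dept_no INTEGER)")
# Hypothetical sample data.
conn.executemany("INSERT INTO staff VALUES (?, ?)",
                 [(1, "alice"), (2, "bob"), (3, "carol")])
conn.executemany("INSERT INTO department VALUES (?, ?)",
                 [(10, "Sales"), (20, "IT"), (30, "HR")])
conn.executemany("INSERT INTO staff_dept VALUES (?, ?)",
                 [(1, 10), (1, 20), (2, 30)])


def quote_ident(name):
    """Quote a value for use as an SQL identifier (doubles embedded quotes)."""
    return '"' + name.replace('"', '""') + '"'


# Step 1: read the departments that will become columns.
depts = conn.execute(
    "SELECT dept_no, dept_name FROM department ORDER BY dept_no"
).fetchall()

# Step 2: build one output column per department.
select_list = ["s.user_name"] + [
    f"MAX(CASE WHEN sd.dept_no = ? THEN 'X' ELSE '' END) AS {quote_ident(name)}"
    for _, name in depts
]
sql = (
    "SELECT " + ", ".join(select_list) + "\n"
    "FROM staff AS s\n"
    "LEFT JOIN staff_dept AS sd ON sd.user_no = s.user_no\n"
    "GROUP BY s.user_no, s.user_name\n"
    "ORDER BY s.user_name"
)

# Step 3: run the generated statement, binding the department numbers.
cursor = conn.execute(sql, [dept_no for dept_no, _ in depts])
header = [column[0] for column in cursor.description]
grid = cursor.fetchall()

print(header)
for row in grid:
    print(row)
```

The header and rows can then be written straight to a CSV file; staff with no department still appear, with an empty string in every department column.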
In SQL Server (if you're using SQL Server), I would start with a full outer join (to include all staff and departments, not just those involved in the relation), drop that into a pivot statement to pivot all departments into columns, and then build a short script to generate and dynamically execute that SELECT statement (because the columns created by a pivot statement must be hard-coded, they can't be dynamically generated at run time).
Here's a sample -- it's an unpivot statement, but the concept is pretty much the same.