I'm building a workload tracking system. I have a table that lists all the tasks to be completed (each with a unique ID), but it also holds every update with a datestamp so that I can track how long it took for the status to be updated.
My dilemma is that for a form I want to query only the latest update; currently the select query shows both the original task and the updated task separately.
In words, I guess what I need to do is select a row only if its ID is the last one for that task number (the task number is different from the ID, and it is duplicated whenever the task is updated).
So if I have:
ID Task Date
1 A 4/30/13
2 B 5/2/13
3 A 5/3/13
I want the result to show only:
ID Task Date
3 A 5/3/13
2 B 5/2/13
How can I do this? I think I'm missing something simple...
There are multiple ways to approach this query, even in Access. Here is one way, using IN with a subquery:
select t.*
from t
where t.id in (select MAX(id) as maxid
from t
group by task
)
order by task
The subquery finds the maximum ids for all the tasks. It then returns the rows from the original table that match those ids.
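To check the behaviour against the sample data from the question, here is a minimal sketch that runs the same query in an in-memory SQLite database through Python's `sqlite3` module. SQLite stands in for Access here, and the column types are assumptions.

```python
import sqlite3

# In-memory database holding the three sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INTEGER PRIMARY KEY, task TEXT, date TEXT);
INSERT INTO t VALUES (1, 'A', '4/30/13'), (2, 'B', '5/2/13'), (3, 'A', '5/3/13');
""")

# Keep only the row with the highest id for each task.
latest = conn.execute("""
SELECT t.*
FROM t
WHERE t.id IN (SELECT MAX(id) FROM t GROUP BY task)
ORDER BY task
""").fetchall()

print(latest)  # [(3, 'A', '5/3/13'), (2, 'B', '5/2/13')]
```

This relies on ids increasing with each update, so the highest id per task is also the latest one.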
I have a table that contains two different columns called: a_id and b_id.
I want to process these two columns and count the distinct number of job ids in each column and report that. Ultimately, I want to determine how many total jobs were created on any given day and how many jobs were completed on any given day.
I started writing up a rough query to attempt to solve this issue but realized that the select distinct is more complicated than I thought. I also am producing one column: called total_jobs and I think there should be two columns: total jobs created and total jobs completed based on job_created_date and job_completed_date. Kind of lost here.
SELECT
job_created_date,
job_completed_date,
category,
COUNT(
SELECT
DISTINCT a_id, b_id
FROM my_data_table
) AS total_jobs
FROM my_data_table
WHERE
ds BETWEEN '<DATEID-8>' AND '<DATEID-1>'
GROUP BY
1, 2, 3, 4
I want the output to help me create a bar graph with dates on the x-axis and stacked bars representing # of jobs created on that day and # of jobs remaining to be completed on that day.
I don't know what your data looks like, but I can speculate that you have one row per job and that job_completed_date is non-NULL for completed jobs.
If so, you can do:
SELECT t.*,
COUNT(*) OVER () as total_jobs,
COUNT(job_completed_date) OVER () as total_completed_jobs
FROM my_data_table t
WHERE t.ds BETWEEN '<DATEID-8>' AND '<DATEID-1>';
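As a rough illustration of how the window counts behave, here is a sketch using SQLite (3.25 or later, for window function support) through Python's `sqlite3`. The `job_id` column, the sample rows, and the literal dates replacing the `<DATEID-…>` placeholders are all made up for the example; it assumes one row per job with a NULL `job_completed_date` for unfinished jobs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical layout: one row per job, NULL completion date if unfinished.
CREATE TABLE my_data_table (
    job_id INTEGER, ds TEXT, job_created_date TEXT, job_completed_date TEXT
);
INSERT INTO my_data_table VALUES
    (1, '2024-01-01', '2024-01-01', '2024-01-02'),
    (2, '2024-01-01', '2024-01-01', NULL),
    (3, '2024-01-02', '2024-01-02', '2024-01-03');
""")

# COUNT(*) counts every row; COUNT(column) skips NULLs, so it counts
# only completed jobs. OVER () repeats both totals on every row.
rows = conn.execute("""
SELECT t.job_id,
       COUNT(*) OVER () AS total_jobs,
       COUNT(job_completed_date) OVER () AS total_completed_jobs
FROM my_data_table t
WHERE t.ds BETWEEN '2024-01-01' AND '2024-01-02'
ORDER BY t.job_id
""").fetchall()

print(rows)  # [(1, 3, 2), (2, 3, 2), (3, 3, 2)]
```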
I need to match up an employee with a task in a small Microsoft Access DB I built. Essentially, I have a list of 45 potential tasks, and I have 25 employees. What I need is:
Each employee to have at LEAST one task
No employee to have more than TWO
Be able to randomize the results every time I run the query (so the same people don't get consistently the same tasks)
My table structure is:
Employees - w/ fields: ID, Name
Tasks - w/ fields: ID, Location, Task Group, Task
I know this is a dumb question, but I truly am struggling. I have searched through SO and Google for help but have been unsuccessful.
I don't have a way to link together employees to tasks since each employee is capable of every task, so I was going to:
1. SELECT * from Employees
2. SELECT * from Tasks
3. Union
4. COUNT(Name) <= 2
But I don't know how to randomize those results so that folks are randomly matched up, with each person at least once and nobody more than twice.
Any help or guidance is appreciated. Thank you.
Consider a cross join with an aggregate query that randomizes the choice set. Currently, at 45 X 25 this yields a cartesian product of 1,125 records which is manageable.
Select query (save as a query object, assumes Tasks has autonumber field)
SELECT cj.[Emp_Name], Max(cj.ID) As M_ID, Max(cj.Task) As M_Task
FROM
(SELECT e.[Emp_Name], t.ID, t.Task
FROM Employees e,
Tasks t) cj
GROUP BY cj.[Emp_Name], Rnd(cj.ID)
ORDER BY cj.[Emp_Name], Rnd(cj.ID)
However, the challenge is that the query above randomizes the order of all 45 tasks for each of the 25 employees, whereas you need only the top two tasks per employee. Unfortunately, MS Access does not have a row id like other DBMSs that could be used to select the top two per employee. And we cannot use a correlated subquery on Task ID per employee, since that would always return the two highest task IDs by value rather than a random top two.
Therefore, to do this in Access you will need a temp table that is cleaned out before each allocation of employee tasks, with an autonumber field used for selection via a correlated subquery.
Create table (run once, autonumber field required)
CREATE TABLE CrossJoinRandomPicks (
ID AUTOINCREMENT PRIMARY KEY,
Emp_Name TEXT(255),
M_ID LONG,
M_Task TEXT(255)
)
Delete query (run regularly)
DELETE FROM CrossJoinRandomPicks;
Append query (run regularly)
INSERT INTO CrossJoinRandomPicks ([Emp_Name], [M_ID], [M_Task])
SELECT [Emp_Name], [M_ID], [M_Task]
FROM mySavedCrossJoinQuery;
Final query (selects top two random tasks for each employee)
SELECT c.Emp_Name, c.M_Task
FROM CrossJoinRandomPicks c
WHERE
(SELECT Count(*) FROM CrossJoinRandomPicks sub
WHERE sub.Emp_Name = c.Emp_Name
AND sub.ID <= c.ID) <= 2;
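The delete / append / select cycle can be tried outside Access. The following is a sketch only, using SQLite through Python's `sqlite3`: `random()` stands in for Access's `Rnd()`, the cross join is inlined instead of being a saved query, and the employee and task names are generated placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (ID INTEGER PRIMARY KEY, Emp_Name TEXT);
CREATE TABLE Tasks (ID INTEGER PRIMARY KEY, Task TEXT);
CREATE TABLE CrossJoinRandomPicks (
    ID INTEGER PRIMARY KEY AUTOINCREMENT,  -- plays the role of the autonumber
    Emp_Name TEXT, M_ID INTEGER, M_Task TEXT
);
""")
# Placeholder data: 25 employees and 45 tasks, as in the question.
conn.executemany("INSERT INTO Employees (Emp_Name) VALUES (?)",
                 [(f"Employee {i}",) for i in range(1, 26)])
conn.executemany("INSERT INTO Tasks (Task) VALUES (?)",
                 [(f"Task {i}",) for i in range(1, 46)])

# Delete step: empty the staging table before each allocation.
conn.execute("DELETE FROM CrossJoinRandomPicks")

# Append step: load the cross join in random order within each employee.
conn.execute("""
INSERT INTO CrossJoinRandomPicks (Emp_Name, M_ID, M_Task)
SELECT e.Emp_Name, t.ID, t.Task
FROM Employees e CROSS JOIN Tasks t
ORDER BY e.Emp_Name, random()
""")

# Final step: keep the two rows with the lowest autonumber per employee.
picks = conn.execute("""
SELECT c.Emp_Name, c.M_Task
FROM CrossJoinRandomPicks c
WHERE (SELECT COUNT(*) FROM CrossJoinRandomPicks sub
       WHERE sub.Emp_Name = c.Emp_Name AND sub.ID <= c.ID) <= 2
ORDER BY c.Emp_Name
""").fetchall()

for name, task in picks[:4]:
    print(name, task)
```

Every employee ends up with exactly two distinct tasks, but nothing in this approach stops the same task from being given to several employees, and some of the 45 tasks may go unassigned.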
I'm trying to run a query multiple times, once for each row of another query, in BIRT. For example, my DB2 query could be SELECT * FROM GROUPS and my dataset would look like
id | name
1 | group 1
2 | group 2
From that dataset I'd like to run another query for each row. So maybe something like SELECT * FROM ORDERS WHERE group_id = params['id'] where id is id of the current GROUP record.
The actual report would look something like:
Orders for Group 1
01/01/2015 Order #321
01/15/2015 Order #948
Orders for Group 2
01/02/2015 Order #123
01/23/2015 Order #456
I'm fairly new to BIRT and have seen examples of using scripts on certain events (beforeOpen, etc), but I wanted to make sure that was the proper way to go for something this rudimentary.
Note that I am grouping on Groups both in my example and in the OP's question, so read carefully.
From what I understand of your requirements, probably the easiest way to get what you want is to group on the table.
Put your fields in the data set and on the table, then 'group by' elements of the table.
The report below is grouped by date and UPMC_Assign, then I count the number of INCIDENT_ID (ticket owner is criteria that is not displayed)
Create your 'Data Set', drop the Data Set on the Layout, a table is auto created.
Add a 'Group' (red circle lower part of second image), in my case I grouped by Date then by Group, in your case you would group by your 'Group'
I added an aggregation from the Palette to get counts. You can delete anything from the table you don't want. In my case I started out with a line for every ticket, but I deleted the entire row and just show the groups and counts.
See my answer here for suggestions on versioning reports during development.
I have 3 tables that I wish to UPDATE data against (let's call them PROCESS, DIARY and HISTORY).
The 3 tables all have an ID column and the subset of data I wish to update is retrieved from a SELECT statement against the PROCESS table
SELECT ID FROM PROCESS WHERE STATUS = 1 AND COMPANY = 'XYZ'
Using T-SQL, I was planning to do 3 UPDATE statements (with the PROCESS table being last as it is the reference list) like so
UPDATE HISTORY ... WHERE ID IN (SELECT ID FROM PROCESS WHERE STATUS = 1 AND COMPANY = 'XYZ')
UPDATE DIARY ... WHERE ID IN (SELECT ID FROM PROCESS WHERE STATUS = 1 AND COMPANY = 'XYZ')
UPDATE PROCESS ... WHERE STATUS = 1 AND COMPANY = 'XYZ'
My question is: is this the most efficient way to do this within T-SQL, or should I be creating some sort of CTE to reference only once? (The number of documents/performance are not a problem; I'm just trying to find out if, as an ex OO developer coming to SQL, I'm slipping into bad habits or missing a trick somewhere.)
I don't think you will be able to use a CTE here, because a CTE is only in scope for the single statement that immediately follows it. Updating 3 tables requires 3 separate statements, so the CTE would have to be repeated in each one.
If obtaining the ID's in your inner query is expensive, you may consider running the query to get them only once and storing the results in a temporary table or table variable. This way you will be able to reference that temporary table or table variable in all update statements.
If the inner query is inexpensive to run, I would leave it as is to not complicate things unnecessarily.
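For the temporary-table variant, a minimal sketch follows. It uses SQLite through Python's `sqlite3` rather than SQL Server, so the syntax differs slightly from T-SQL (which would use a `#temp` table or a table variable). The `touched` column is hypothetical and stands in for whatever the elided `...` in the question's UPDATE statements actually sets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- 'touched' is a made-up column standing in for the real SET clause.
CREATE TABLE PROCESS (ID INTEGER PRIMARY KEY, STATUS INTEGER, COMPANY TEXT,
                      touched INTEGER DEFAULT 0);
CREATE TABLE DIARY   (ID INTEGER, touched INTEGER DEFAULT 0);
CREATE TABLE HISTORY (ID INTEGER, touched INTEGER DEFAULT 0);
INSERT INTO PROCESS (ID, STATUS, COMPANY) VALUES (1, 1, 'XYZ'), (2, 0, 'XYZ'), (3, 1, 'ABC');
INSERT INTO DIARY (ID) VALUES (1), (2), (3);
INSERT INTO HISTORY (ID) VALUES (1), (2), (3);
""")

# Run the selecting query once and keep its result.
conn.execute("""
CREATE TEMP TABLE target_ids AS
SELECT ID FROM PROCESS WHERE STATUS = 1 AND COMPANY = 'XYZ'
""")

# Reuse the stored IDs in all three updates; PROCESS goes last, as in the
# question, although with stored IDs the order no longer matters.
for table in ("HISTORY", "DIARY", "PROCESS"):
    conn.execute(
        f"UPDATE {table} SET touched = 1 WHERE ID IN (SELECT ID FROM target_ids)"
    )
conn.commit()

result = {
    table: conn.execute(f"SELECT ID, touched FROM {table} ORDER BY ID").fetchall()
    for table in ("HISTORY", "DIARY", "PROCESS")
}
print(result["PROCESS"])  # [(1, 1), (2, 0), (3, 0)]
```

A side benefit of storing the IDs is that the PROCESS update can change STATUS or COMPANY without affecting which rows the other updates see.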
I have been working on a query that identifies an issue with the data in my database:
SELECT t1.*
FROM [DailyTaskHours] t1
INNER JOIN (
SELECT ActivityDate
,taskId
,EnteredBy
FROM [DailyTaskHours]
WHERE hours != 0
GROUP BY EnteredBy
,taskId
,ActivityDate
HAVING COUNT(*) > 1
) t2 ON (
t1.ActivityDate = t2.ActivityDate
AND t1.taskId = t2.taskId
AND t1.EnteredBy = t2.EnteredBy
AND t1.Hours != 0
)
ORDER BY ActivityDate
What this does is find duplicate hours booked for the same person on the same task on the same day.
Now that I found the issues I want to correct them with an UPDATE. I want the duplicate activity that was created earlier than the other to move the value from Hours to doubleBookedHours and for Hours to be zeroed out. Secondly, I want the more recent row's DoubleBookedFLag column to be updated to 1.
How can I achieve this?
You can write a SQL Server Agent Job to call T-SQL or a SSIS package to perform your logic.
I always like using pseudo code when designing my algorithm.
For instance:
Find duplicate entries and save them to a temporary table, either in a staging area or tempdb. Some location that is accessible by multiple processes (spids).
Find least recent records (1+). Move hours to double booked column?
Find least recent records (1+). Zero out hours column.
Update the most recent record to have double book flag column set to 1.
You were not specific on moving the value from hours to double booked hours. Are these columns?
In short, a SQL Server Agent job and several correct T-SQL steps should solve your problem.
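The pseudo code steps above could be sketched roughly as follows. This runs on SQLite through Python's `sqlite3`, not SQL Server, so treat it as an outline of the logic rather than ready-made T-SQL. The question does not say which column tells an earlier row from a later one, so the sketch assumes a hypothetical `Id` column where a higher value means the row was created later; the sample data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Id is an assumption: higher Id = row created later.
CREATE TABLE DailyTaskHours (
    Id INTEGER PRIMARY KEY, ActivityDate TEXT, taskId INTEGER, EnteredBy TEXT,
    Hours INTEGER, doubleBookedHours INTEGER DEFAULT 0,
    DoubleBookedFlag INTEGER DEFAULT 0
);
INSERT INTO DailyTaskHours (Id, ActivityDate, taskId, EnteredBy, Hours) VALUES
    (1, '2020-01-01', 10, 'ann', 4),
    (2, '2020-01-01', 10, 'ann', 3),
    (3, '2020-01-01', 11, 'ann', 2),
    (4, '2020-01-02', 10, 'bob', 0),
    (5, '2020-01-02', 10, 'bob', 5);

-- Step 1: save the duplicate groups and the newest row of each.
CREATE TEMP TABLE dup AS
SELECT ActivityDate, taskId, EnteredBy, MAX(Id) AS newest_id
FROM DailyTaskHours
WHERE Hours != 0
GROUP BY ActivityDate, taskId, EnteredBy
HAVING COUNT(*) > 1;

-- Steps 2 and 3: on the older rows, move Hours to doubleBookedHours and
-- zero Hours (both assignments read the pre-update value of Hours).
UPDATE DailyTaskHours
SET doubleBookedHours = Hours, Hours = 0
WHERE Hours != 0
  AND EXISTS (SELECT 1 FROM dup d
              WHERE d.ActivityDate = DailyTaskHours.ActivityDate
                AND d.taskId = DailyTaskHours.taskId
                AND d.EnteredBy = DailyTaskHours.EnteredBy
                AND DailyTaskHours.Id < d.newest_id);

-- Step 4: flag the most recent row of each duplicate group.
UPDATE DailyTaskHours
SET DoubleBookedFlag = 1
WHERE Id IN (SELECT newest_id FROM dup);
""")

final_rows = conn.execute("""
SELECT Id, Hours, doubleBookedHours, DoubleBookedFlag
FROM DailyTaskHours ORDER BY Id
""").fetchall()
for row in final_rows:
    print(row)
```

Only rows 1 and 2 form a duplicate group here: row 1 has its hours moved across and row 2 is flagged. Row 4 has zero hours, so rows 4 and 5 are not treated as duplicates, matching the `hours != 0` filter in the question's query.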