Count the accumulated hours within a column - SQL

I have a table which holds information on the type of work a worker does and the amount of hours spent on the work.
eg.
work_id | user_id | work_type | hours_spent
-------------------------------------------------
1 | 1 | Maintain | 7
2 | 1 | sick | 4
3 | 1 | maintain | 3
4 | 1 | maintain | 6
5 | 2 | Web | 5
6 | 2 | Develop | 8
7 | 2 | develop | 5
8 | 3 | maintain | 5
9 | 3 | sick | 7
10 | 3 | sick | 7
I would like to sum the accumulated hours each user has spent on each type of work, to display something like this:
user_id | work_type | hours_spent
-----------------------------------
1 | maintain | 16
1 | sick | 4
2 | Web | 5
2 | develop | 13
3 | maintain | 5
3 | sick | 14
The SUM() function I'm using now returns the total of all the hours in the hours_spent column. Is this the right function for what I want to achieve?
I'm using SQL Server 2008 R2.

SELECT
user_id,
work_type = LOWER(work_type),
hours_spent = SUM(hours_spent)
FROM dbo.tablename
GROUP BY user_id, LOWER(work_type)
ORDER BY user_id, LOWER(work_type);
You don't need LOWER() there unless you have a case sensitive collation. And if you do, enter those strings consistently - or better yet, use a lookup table for those strings and store a tinyint in the main table instead.
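The grouped SUM() can be checked against the sample data. Here is a minimal sketch using SQLite from Python; the asker is on SQL Server 2008 R2, but GROUP BY + SUM and LOWER() behave the same for this query (only the collation default differs).

```python
import sqlite3

# Build the sample table from the question in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE work (
    work_id INTEGER, user_id INTEGER, work_type TEXT, hours_spent INTEGER)""")
conn.executemany("INSERT INTO work VALUES (?, ?, ?, ?)", [
    (1, 1, "Maintain", 7), (2, 1, "sick", 4),     (3, 1, "maintain", 3),
    (4, 1, "maintain", 6), (5, 2, "Web", 5),      (6, 2, "Develop", 8),
    (7, 2, "develop", 5),  (8, 3, "maintain", 5), (9, 3, "sick", 7),
    (10, 3, "sick", 7),
])

# LOWER() folds 'Maintain' and 'maintain' into one group, so each user gets
# one row per work type with the accumulated hours.
rows = conn.execute("""
    SELECT user_id, LOWER(work_type) AS work_type,
           SUM(hours_spent) AS hours_spent
    FROM work
    GROUP BY user_id, LOWER(work_type)
    ORDER BY user_id, LOWER(work_type)
""").fetchall()
for row in rows:
    print(row)
```

The per-user totals match the desired output (16 maintain / 4 sick for user 1, and so on), with the work types folded to lowercase.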

Related

Query to sum calls

I have a table containing call durations from a telecom company.
For example:
Table 1
| callerid | receiverid | call duration
| 1 | 2 | 5
| 1 | 2 | 2
| 2 | 3 | 4
| 1 | 5 | 2
I need to query the table above so that the result looks like this:
Table 2
| callerid | receiverid | call duration
| 1 | 2 | 7
| 2 | 3 | 4
| 1 | 5 | 2
Use the query below:
select callerid, receiverid, sum(call_duration) call_duration
from your_table
group by callerid, receiverid
Applied to the sample data in your question, this produces the output shown in Table 2.
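The same GROUP BY works in any SQL engine; a minimal sketch with SQLite from Python (an ORDER BY is added here only to make the output deterministic):

```python
import sqlite3

# The call table from the question, in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE calls (
    callerid INTEGER, receiverid INTEGER, call_duration INTEGER)""")
conn.executemany("INSERT INTO calls VALUES (?, ?, ?)",
                 [(1, 2, 5), (1, 2, 2), (2, 3, 4), (1, 5, 2)])

# One row per (caller, receiver) pair, with the durations summed.
rows = conn.execute("""
    SELECT callerid, receiverid, SUM(call_duration) AS call_duration
    FROM calls
    GROUP BY callerid, receiverid
    ORDER BY callerid, receiverid
""").fetchall()
```

The two (1, 2) calls collapse into a single row with duration 7, as in Table 2.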

SQL generate unique ID from rolling ID

I've been trying to find an answer to this for the better part of a day with no luck.
I have a SQL table with measurement data for samples and I need a way to assign a unique ID to each sample. Right now each sample has an ID number that rolls over frequently. What I need is a unique ID for each sample. Below is a table with a simplified dataset, as well as an example of a possible UID that would do what I need.
| Row | Time | Meas# | Sample# | UID (Desired) |
| 1 | 09:00 | 1 | 1 | 1 |
| 2 | 09:01 | 2 | 1 | 1 |
| 3 | 09:02 | 3 | 1 | 1 |
| 4 | 09:07 | 1 | 2 | 2 |
| 5 | 09:08 | 2 | 2 | 2 |
| 6 | 09:09 | 3 | 2 | 2 |
| 7 | 09:24 | 1 | 3 | 3 |
| 8 | 09:25 | 2 | 3 | 3 |
| 9 | 09:25 | 3 | 3 | 3 |
| 10 | 09:47 | 1 | 1 | 4 |
| 11 | 09:47 | 2 | 1 | 4 |
| 12 | 09:49 | 3 | 1 | 4 |
My problem is that rows 10-12 have the same Sample# as rows 1-3. I need a way to uniquely identify and group each sample. Having the row number or time of the first measurement on the sample would be good.
One other complication is that the measurement number doesn't always start with 1. It's based on measurement locations, and sometimes it skips location 1 and only has locations 2 and 3.
I am going to speculate that you want a unique number assigned to each sample, where now you have repeats.
If so, you can use lag() and a cumulative sum:
select t.*,
       sum(case when prev_sample = sample then 0 else 1 end)
           over (order by row) as new_sample_number
from (select t.*,
             lag(sample) over (order by row) as prev_sample
      from t
     ) t;
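A runnable sketch of the lag() + cumulative-sum approach, using SQLite (3.25+, for window function support) from Python. The answer's column is named row; it is renamed rownum here only because ROW is a keyword in SQLite:

```python
import sqlite3

# Rows 1-12 from the question: sample# rolls over (1,1,1,2,2,2,3,3,3,1,1,1)
# and the desired UID sequence is 1,1,1,2,2,2,3,3,3,4,4,4.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (rownum INTEGER, sample INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(i + 1, s) for i, s in
                  enumerate([1, 1, 1, 2, 2, 2, 3, 3, 3, 1, 1, 1])])

# lag() exposes the previous row's sample; the cumulative SUM adds 1 whenever
# the sample number changes (the NULL on the first row also hits ELSE 1).
rows = conn.execute("""
    SELECT t.*,
           SUM(CASE WHEN prev_sample = sample THEN 0 ELSE 1 END)
               OVER (ORDER BY rownum) AS new_sample_number
    FROM (SELECT t.*, LAG(sample) OVER (ORDER BY rownum) AS prev_sample
          FROM t) t
    ORDER BY rownum
""").fetchall()
```

Because the group boundary is detected by a change in sample#, this also copes with the complication that Meas# does not always start at 1.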

SQL Statement to show columns multiple times

I have a table containing an integer column that represents a work place, an integer column that represents the number of workpieces finished at that workplace and a date column.
I want to create a query that creates rows of the following type
location int | date of Max(workpiece) | max workpieces | Min(Date) | workpieces (Min(Date)) | max(Date) | workpieces (Max(Date))
So I want a row for each location containing: the date of the day on which the most pieces were finished, plus that amount; the oldest date and the pieces finished on that day; and the newest date plus the number of pieces finished on that day.
Do I have to join the table with itself three times, once per criterion, and then join on location? Is the GROUP BY operator involved? I don't quite get the hang of it.
EDIT: Here's some sample data
+-------+-----------+-----------+-------------------+
| id | location | amount | date |
+-------+-----------+-----------+-------------------+
| 1 | 1 | 10 | 01.01.2016 |
| 2 | 2 | 5 | 01.01.2016 |
| 3 | 1 | 6 | 02.01.2016 |
| 4 | 2 | 35 | 02.01.2016 |
| 5 | 1 | 50 | 03.01.2016 |
| 6 | 2 | 20 | 03.01.2016 |
+-------+-----------+-----------+-------------------+
I want my output to look like this:
loc | dateMaxAmount| MaxAmount | MinDate | AmountMinDate | MaxDate | MaxDateAmount
1 | 03.01.2016 | 50 | 01.01.2016| 10 | 03.01.2016| 50
2 | 02.01.2016 | 35 | 01.01.2016| 5 | 03.01.2016| 20
I am using MS Access.
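One common approach is indeed the self-join the asker suspects: aggregate once per location, then join the detail rows back three times to pick up the amount for each date of interest. A sketch using SQLite from Python; the table name production is hypothetical, and Access syntax differs slightly (it requires INNER JOIN and has its own date literals). The dates are kept as DD.MM.YYYY strings, which happen to sort correctly here because they differ only in the day:

```python
import sqlite3

# Sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE production (
    id INTEGER, location INTEGER, amount INTEGER, date TEXT)""")
conn.executemany("INSERT INTO production VALUES (?, ?, ?, ?)", [
    (1, 1, 10, "01.01.2016"), (2, 2, 5,  "01.01.2016"),
    (3, 1, 6,  "02.01.2016"), (4, 2, 35, "02.01.2016"),
    (5, 1, 50, "03.01.2016"), (6, 2, 20, "03.01.2016"),
])

# Aggregate once per location, then join the detail rows back three times:
# a = the row holding the max amount, b = the row on the oldest date,
# c = the row on the newest date.
rows = conn.execute("""
    SELECT m.location, a.date AS dateMaxAmount, m.MaxAmount,
           m.MinDate, b.amount AS AmountMinDate,
           m.MaxDate, c.amount AS MaxDateAmount
    FROM (SELECT location, MAX(amount) AS MaxAmount,
                 MIN(date) AS MinDate, MAX(date) AS MaxDate
          FROM production GROUP BY location) m
    JOIN production a ON a.location = m.location AND a.amount = m.MaxAmount
    JOIN production b ON b.location = m.location AND b.date = m.MinDate
    JOIN production c ON c.location = m.location AND c.date = m.MaxDate
    ORDER BY m.location
""").fetchall()
```

Note that ties (two days sharing the max amount at one location) would produce duplicate rows; the sample data has none.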

SQL - Select distinct on two columns

I have this table 'words' with more information:
+---------+------------+-----------
| ID |ID_CATEGORY | ID_THEME |
+---------+------------+-----------
| 1 | 1 | 1
| 2 | 1 | 1
| 3 | 1 | 1
| 4 | 1 | 2
| 5 | 1 | 2
| 6 | 1 | 2
| 7 | 2 | 3
| 8 | 2 | 3
| 9 | 2 | 3
| 10 | 2 | 4
| 11 | 2 | 4
| 12 | 3 | 5
| 13 | 3 | 5
| 14 | 3 | 6
| 15 | 3 | 6
| 16 | 3 | 6
And this query, which gives me 3 random ids from different categories, but not necessarily from different themes:
SELECT Id
FROM words
GROUP BY Id_Category, Id_Theme
ORDER BY RAND()
LIMIT 3
What I want as result is:
+---------+------------+-----------
| ID |ID_CATEGORY | ID_THEME |
+---------+------------+-----------
| 2 | 1 | 1
| 7 | 2 | 3
| 14 | 3 | 6
That is, repeat no category or theme.
When you use GROUP BY you cannot include in the select list a column which is neither grouped nor aggregated. So, in your query it's impossible to include Id in the select list.
You need to do something a bit more complex:
SELECT Id_Category, Id_Theme,
(SELECT Id FROM Words W
WHERE W.Id_Category = G.Id_Category AND W.Id_Theme = G.Id_Theme
ORDER BY RAND() LIMIT 1
) Id
FROM Words G
GROUP BY Id_Category, Id_Theme
ORDER BY RAND()
LIMIT 3
NOTE: the query groups by the required columns, and the subselect takes a random Id from all the possible Ids in each group. The main query then keeps three random groups.
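The answer targets MySQL; to try it elsewhere, here is a sketch with SQLite from Python, where RANDOM() replaces MySQL's RAND(). One caveat: the query guarantees one Id per distinct (category, theme) group, but the three groups it picks could still share a category or a theme.

```python
import sqlite3

# The words table from the question: ids 1-16 spread across
# 3 categories and 6 themes.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Words (
    Id INTEGER, Id_Category INTEGER, Id_Theme INTEGER)""")
data = [(1, 1, 1), (2, 1, 1), (3, 1, 1), (4, 1, 2), (5, 1, 2), (6, 1, 2),
        (7, 2, 3), (8, 2, 3), (9, 2, 3), (10, 2, 4), (11, 2, 4),
        (12, 3, 5), (13, 3, 5), (14, 3, 6), (15, 3, 6), (16, 3, 6)]
conn.executemany("INSERT INTO Words VALUES (?, ?, ?)", data)

# Group by (category, theme); the correlated subquery picks one random Id
# from inside each group, and the outer LIMIT keeps 3 random groups.
rows = conn.execute("""
    SELECT Id_Category, Id_Theme,
           (SELECT Id FROM Words W
            WHERE W.Id_Category = G.Id_Category AND W.Id_Theme = G.Id_Theme
            ORDER BY RANDOM() LIMIT 1) AS Id
    FROM Words G
    GROUP BY Id_Category, Id_Theme
    ORDER BY RANDOM()
    LIMIT 3
""").fetchall()
```

Each run returns three distinct (category, theme) pairs, each with an Id drawn from that pair's rows.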

Quickly calculating running totals in sql server using set based operations

I have some data that looks like this:
+---+--------+-------------+---------------+--------------+
| | A | B | C | D |
+---+--------+-------------+---------------+--------------+
| 1 | row_id | disposal_id | excess_weight | total_weight |
| 2 | 1 | 1 | 0 | 30 |
| 3 | 2 | 1 | 10 | 30 |
| 4 | 3 | 1 | 0 | 30 |
| 5 | 4 | 2 | 5 | 50 |
| 6 | 5 | 2 | 0 | 50 |
| 7 | 6 | 2 | 15 | 50 |
| 8 | 7 | 2 | 5 | 50 |
| 9 | 8 | 2 | 5 | 50 |
+---+--------+-------------+---------------+--------------+
And I am transforming it to look like this:
+---+--------+-------------+---------------+--------------+
| | A | B | C | D |
+---+--------+-------------+---------------+--------------+
| 1 | row_id | disposal_id | excess_weight | total_weight |
| 2 | 1 | 1 | 0 | 30 |
| 3 | 2 | 1 | 10 | 30 |
| 4 | 3 | 1 | 0 | 20 |
| 5 | 4 | 2 | 5 | 50 |
| 6 | 5 | 2 | 0 | 45 |
| 7 | 6 | 2 | 15 | 45 |
| 8 | 7 | 2 | 5 | 30 |
| 9 | 8 | 2 | 5 | 25 |
+---+--------+-------------+---------------+--------------+
Basically, I need to update the total_weight column by subtracting the sum of the excess_weights from previous rows in the table which belong to the same disposal_id.
I'm currently using a cursor because it's faster than the other solutions I've tried (CTE, triangular join, CROSS APPLY). My cursor solution keeps a running total that is reset to zero for each new disposal_id, increments it by the excess weight, performs updates when needed, and runs in about 40 seconds. The other solutions took anywhere from 3-5 minutes, and I'm wondering if there is a relatively performant way to do this using set-based operations?
I've spent a lot of time optimizing such queries, ended up with two performant options: either store precalculated running totals, as described in Denormalizing to enforce business rules: Running Totals, or calculate them on the client, which is also fast and easy.
The other solution you probably already tried is to do something like the answers found here
Unless you are using Oracle, which has decent aggregates for cumulative sums, you're better off using a cursor. At best, you're going to have to rejoin the table to itself or use other methods for what should be an O(n) operation. In general, the set-based solutions for problems like these are messy or really messy.
'Previous rows' implies an ordering, so no - no set-based operations there.
Oracle's LEAD and LAG are built for this, but SQL Server forces you into triangular joins... which I suppose you have investigated.
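For what it's worth, engines released after this discussion (SQL Server 2012+, PostgreSQL, SQLite 3.25+) support ordered, framed SUM() OVER, which turns this into a single set-based pass. A sketch with SQLite from Python, using the question's data and a hypothetical table name:

```python
import sqlite3

# The disposal data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE disposals (
    row_id INTEGER, disposal_id INTEGER,
    excess_weight INTEGER, total_weight INTEGER)""")
conn.executemany("INSERT INTO disposals VALUES (?, ?, ?, ?)", [
    (1, 1, 0, 30), (2, 1, 10, 30), (3, 1, 0, 30),
    (4, 2, 5, 50), (5, 2, 0, 50), (6, 2, 15, 50),
    (7, 2, 5, 50), (8, 2, 5, 50),
])

# Subtract the running total of excess_weight from all *previous* rows of the
# same disposal_id: the frame ends at 1 PRECEDING, so the current row's own
# excess is excluded, and COALESCE covers the first row of each group.
rows = conn.execute("""
    SELECT row_id, disposal_id, excess_weight,
           total_weight - COALESCE(SUM(excess_weight) OVER (
               PARTITION BY disposal_id ORDER BY row_id
               ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0)
               AS total_weight
    FROM disposals
    ORDER BY row_id
""").fetchall()
```

This reproduces the target table (30, 30, 20, 50, 45, 45, 30, 25) without a cursor or a triangular join.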