How to create SQL query - sql

I have data (result/output) in a table like this:
Project code | Project name | Associates | Time efforts in days
1 | Analytics | amol, manisha, sayali, pooja | (21+17+20+17)=57
I need to calculate the time efforts in days. I have done it by hand for February by adding up the days each person worked in that month, that is, all working days minus each associate's absences.
Now I need to do this with SQL queries.
I have one table which contains all the associates present with dates.
Like this:
UID username date
So can anyone give me a suggestion how I could do this?

It would be a better design to have a separate table that stores the project id, the team member id, and his/her effort in days, so that you can write a simple join query to achieve what you want.

Here is what I would do. Change your tables so you have:
projects
project_code project_name
1 Analytics
users
UID username date
1 amol
2 manisha
3 sayali
projects_users
project_code uid effort
1 1 21
1 2 17
1 3 20
Now you can query the result you asked for like this:
SELECT
    p.project_code,
    p.project_name,
    GROUP_CONCAT(DISTINCT u.username SEPARATOR ', ') AS associates,
    SUM(pu.effort) AS effort
FROM projects AS p
JOIN projects_users AS pu ON pu.project_code = p.project_code
JOIN users AS u ON u.uid = pu.uid
GROUP BY p.project_code, p.project_name
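For anyone who wants to try the normalized layout end to end, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table and column names follow the answer above. Using SQLite is an assumption made only so the example runs anywhere; note that SQLite's group_concat takes the separator as a second argument instead of MySQL's SEPARATOR keyword.

```python
import sqlite3

# In-memory database with the three tables proposed in the answer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE projects (project_code INTEGER PRIMARY KEY, project_name TEXT);
    CREATE TABLE users (uid INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE projects_users (project_code INTEGER, uid INTEGER, effort INTEGER);

    INSERT INTO projects VALUES (1, 'Analytics');
    INSERT INTO users VALUES (1, 'amol'), (2, 'manisha'), (3, 'sayali');
    INSERT INTO projects_users VALUES (1, 1, 21), (1, 2, 17), (1, 3, 20);
""")

# One row per project: member names joined into a list, efforts summed.
row = conn.execute("""
    SELECT p.project_code,
           p.project_name,
           group_concat(u.username, ', ') AS associates,
           SUM(pu.effort) AS effort
    FROM projects AS p
    JOIN projects_users AS pu ON pu.project_code = p.project_code
    JOIN users AS u ON u.uid = pu.uid
    GROUP BY p.project_code, p.project_name
""").fetchone()

project_code, project_name, associates, effort = row
print(project_code, project_name, associates, effort)
```

The order of names inside associates is not guaranteed by SQLite, so do not rely on it.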

Related

SQLite - Output count of all records per day including days with 0 records

I have a sqlite3 database maintained on an AWS exchange that is regularly updated by a Python script. One of the things it tracks is when any team generates a new post for a given topic. The entries look something like this:
id | client | team | date | industry | city
895 | acme industries | blueteam | 2022-06-30 | construction | springfield
I'm trying to create a table that shows me how many entries for construction occur each day. Right now, the entries with data populate, but they exclude dates with no entries. For example, if I search for just
SELECT date, count(id) as num_records
from mytable
WHERE industry = "construction"
group by date
order by date asc
I'll get results that looks like this:
date | num_records
2022-04-01 | 3
2022-04-04 | 1
How can I make sqlite output like this:
date | num_records
2022-04-01 | 3
2022-04-02 | 0
2022-04-03 | 0
2022-04-04 | 1
I'm trying to generate some graphs from this data and need to be able to include all dates for the target timeframe.
EDIT/UPDATE:
The table does not already include every date; it only includes dates relevant to an entry. If no team posts work on a day, the date column will jump from day 1 (e.g. 2022-04-01) to day 3 (2022-04-03).
Assuming your "mytable" table contains all the dates you need (for any industry), you can first select every distinct date, then LEFT JOIN your own query to it, and map the resulting NULL values of the "num_records" field to "0" using the COALESCE function.
WITH cte AS (
    SELECT date,
           COUNT(id) AS num_records
    FROM mytable
    WHERE industry = 'construction'
    GROUP BY date
)
SELECT dates.date,
       COALESCE(cte.num_records, 0) AS num_records
FROM (SELECT DISTINCT date FROM mytable) dates
LEFT JOIN cte
       ON dates.date = cte.date
ORDER BY dates.date
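Since the update to the question says the table does not contain every date, the LEFT JOIN needs a complete list of dates from somewhere else. One option in SQLite is a recursive CTE that generates the calendar on the fly. Below is a minimal sketch run through Python's sqlite3 module; the sample rows and the hard-coded range 2022-04-01 to 2022-04-04 are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER PRIMARY KEY, client TEXT, team TEXT,
                          date TEXT, industry TEXT, city TEXT);
    -- Made-up sample rows: three construction posts on 04-01, one on 04-04.
    INSERT INTO mytable (client, team, date, industry, city) VALUES
        ('acme industries', 'blueteam', '2022-04-01', 'construction', 'springfield'),
        ('acme industries', 'redteam',  '2022-04-01', 'construction', 'springfield'),
        ('acme industries', 'blueteam', '2022-04-01', 'construction', 'shelbyville'),
        ('other client',    'blueteam', '2022-04-02', 'retail',       'springfield'),
        ('acme industries', 'blueteam', '2022-04-04', 'construction', 'springfield');
""")

query = """
WITH RECURSIVE calendar(day) AS (
    SELECT :start                      -- first day of the range
    UNION ALL
    SELECT date(day, '+1 day')         -- add one day per iteration
    FROM calendar
    WHERE day < :end                   -- stop once the last day is reached
)
SELECT calendar.day AS date,
       COUNT(m.id)  AS num_records     -- COUNT ignores the NULLs of unmatched days
FROM calendar
LEFT JOIN mytable AS m
       ON m.date = calendar.day
      AND m.industry = 'construction'
GROUP BY calendar.day
ORDER BY calendar.day
"""
rows = conn.execute(query, {"start": "2022-04-01", "end": "2022-04-04"}).fetchall()
for row in rows:
    print(row)
```

The industry filter sits in the ON clause on purpose: putting it in WHERE would throw away the empty days again.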

Count of new users by Year SQL Query

We have a subscription-based business and a table with the account holders' details and their signup dates.
I want a query that gets the count of new signups for each year.
For example:
Table
USER / SIGNUPDATE
User 1 06/08/2013
User 2 06/08/2013
User 3 06/08/2014
User 4 06/08/2014
User 5 06/08/2014
User 6 06/08/2014
User 7 06/08/2014
User 8 06/08/2015
Returning record set
CountOfNewUsers2013 / CountOfNewUsers2014 / CountOfNewUsers2015
2 / 5 / 1
I can get the count for each year individually but not sure how or if I can group them together in one query.
select year(signupdate), count(*) as newusers
from tablename
group by year(signupdate)
You have to group by year of signupdate column.
I noticed you are trying to display the results in one row horizontally. This is not the way this data is typically displayed. It can be done that way, it's just a lot more work. I am assuming you are using SQL Server because you haven't mentioned which system. Here is how to do it with multiple rows (in two columns). USER is a reserved word in SQL Server, so the column name needs brackets:
SELECT YEAR(SIGNUPDATE) AS [Year]
, COUNT([USER]) AS CountOfNewUsers
FROM IDontKnowYourTableNameSorry
GROUP BY YEAR(SIGNUPDATE)
Group by and count are the key features here if you want to look up their documentation.
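If the single-row layout from the question really is needed, conditional aggregation is one way to get it. Here is a rough sketch using Python's sqlite3 module so it can be run as-is. The table name signups, the column name username, and the ISO date strings are assumptions, and strftime('%Y', ...) stands in for SQL Server's YEAR(). The same CASE expressions should carry over to SQL Server with YEAR(signupdate).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signups (username TEXT, signupdate TEXT)")
conn.executemany(
    "INSERT INTO signups VALUES (?, ?)",
    [
        ("User 1", "2013-08-06"), ("User 2", "2013-08-06"),
        ("User 3", "2014-08-06"), ("User 4", "2014-08-06"),
        ("User 5", "2014-08-06"), ("User 6", "2014-08-06"),
        ("User 7", "2014-08-06"), ("User 8", "2015-08-06"),
    ],
)

# One SUM per year: each row contributes 1 to the column of its own year.
row = conn.execute("""
    SELECT
        SUM(CASE WHEN strftime('%Y', signupdate) = '2013' THEN 1 ELSE 0 END) AS CountOfNewUsers2013,
        SUM(CASE WHEN strftime('%Y', signupdate) = '2014' THEN 1 ELSE 0 END) AS CountOfNewUsers2014,
        SUM(CASE WHEN strftime('%Y', signupdate) = '2015' THEN 1 ELSE 0 END) AS CountOfNewUsers2015
    FROM signups
""").fetchone()

print(row)  # (2, 5, 1)
```

The drawback is visible right away: every new year needs a new column, which is why the row-per-year form is usually preferred.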

Query to analyze log-in data and intelligently identify a shift worked

I have a large Vertica table that tracks almost any user activity within an enterprise wide program. There is a subset of users where I want to identify the hours they worked on a day to day basis. The tricky part is that some users work 12 hour shifts that span multiple days. Could anyone suggest the best way to do this? Here's what I was originally thinking:
select users.username,
       users.activity_day,
       users.max_hour - users.min_hour as shift_length
from
(select username,
        timestamp_trunc(activity_dt_tm ,'ddd') as activity_day,
        ceil(max(hour(activity_dt_tm))) as max_hour,
        floor(min(hour(activity_dt_tm))) as min_hour
 from user_activity
 where timestamp_trunc(activity_dt_tm ,'ddd') = '2014/11/10'
 group by username, timestamp_trunc(activity_dt_tm ,'ddd')
) users
I would look at the results from that query and see which users' shifts were under a minimum threshold of, say, 8 hours, indicating they probably started working in the afternoon and continued into the following day. Once I have that list of usernames, I would pass them into a second query that looks ahead to the next day, grabs the maximum hour of the activity data rows, and substitutes it for their max_hour. I'm not a SQL expert, but I think this might involve some temporary tables to pass the data around. If anyone could point me in the right direction it would be much appreciated.
Edit
Here's a SQL Fiddle with some staged data for 2 users. http://sqlfiddle.com/#!2/4ce900
User2 has activity of a normal 8-5 workday. User1 starts working around 7PM and works into the next day. I'd want the output to look something like this:
UserName | Shift Start | Shift End | Hours Worked
-------------------------------------------------
User1 | 7PM | 7AM | 12
User2 | 8AM | 5PM | 9
I'd want to attribute all the hours worked to the day the user started their shift.
You can use the SQL below to find the start, end and duration of breaks that a user had. You can then filter the breaks that are longer than a threshold and use them to separate user's shifts.
select t1.username, t1.end_dt_tm beforeBreak, t2.start_dt_tm afterBreak, t2.start_dt_tm - t1.end_dt_tm as diff
from user_activity t1, user_activity t2
where t1.username = t2.username and t2.start_dt_tm =
(
select min(nxt.start_dt_tm) from user_activity nxt
where nxt.username = t1.username and nxt.start_dt_tm > t1.end_dt_tm
)
;
(note that your fiddle has the same row twice for user 1)
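To make the "filter the breaks that are longer than a threshold" step concrete, here is a small sketch that runs the same next-row lookup in SQLite through Python's sqlite3 module. The sample rows, the 4-hour threshold (MIN_BREAK_HOURS), and the use of strftime('%s', ...) to get a difference in hours are all assumptions for illustration; in Vertica the timestamps could be subtracted directly as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_activity (username TEXT, start_dt_tm TEXT, end_dt_tm TEXT)"
)
conn.executemany(
    "INSERT INTO user_activity VALUES (?, ?, ?)",
    [
        # User1 works an overnight shift, then returns the next evening.
        ("User1", "2014-11-10 19:00:00", "2014-11-10 23:00:00"),
        ("User1", "2014-11-10 23:30:00", "2014-11-11 07:00:00"),
        ("User1", "2014-11-11 19:00:00", "2014-11-12 07:00:00"),
        # User2 works a normal day with a lunch break.
        ("User2", "2014-11-10 08:00:00", "2014-11-10 12:00:00"),
        ("User2", "2014-11-10 13:00:00", "2014-11-10 17:00:00"),
    ],
)

MIN_BREAK_HOURS = 4  # hypothetical cut-off between "pause" and "end of shift"

long_breaks = conn.execute("""
    SELECT t1.username,
           t1.end_dt_tm   AS before_break,
           t2.start_dt_tm AS after_break,
           (CAST(strftime('%s', t2.start_dt_tm) AS INTEGER)
            - CAST(strftime('%s', t1.end_dt_tm) AS INTEGER)) / 3600.0 AS break_hours
    FROM user_activity AS t1
    JOIN user_activity AS t2
      ON t2.username = t1.username
     AND t2.start_dt_tm = (SELECT MIN(nxt.start_dt_tm)      -- next activity row
                           FROM user_activity AS nxt
                           WHERE nxt.username = t1.username
                             AND nxt.start_dt_tm > t1.end_dt_tm)
    WHERE (CAST(strftime('%s', t2.start_dt_tm) AS INTEGER)
           - CAST(strftime('%s', t1.end_dt_tm) AS INTEGER)) / 3600.0 > ?
    ORDER BY t1.username, t1.end_dt_tm
""", (MIN_BREAK_HOURS,)).fetchall()

for row in long_breaks:
    print(row)
```

Each row returned marks a boundary: before_break is the end of one shift and after_break is the start of the next one, so the short 30-minute and lunch pauses stay inside a shift.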

Join to Calendar Table - 5 Business Days

So this is somewhat of a common question on here but I haven't found an answer that really suits my specific needs. I have 2 tables. One has a list of ProjectClosedDates. The other table is a calendar table that goes through like 2025 which has columns for if the row date is a weekend day and also another column for is the date a holiday.
My end goal is to find out based on the ProjectClosedDate, what date is 5 business days post that date. My idea was that I was going to use the Calendar table and join it to itself so I could then insert a column into the calendar table that was 5 Business days away from the row-date. Then I was going to join the Project table to that table based on ProjectClosedDate = RowDate.
If I was just going to check the actual business-date table for one record, I could use this:
SELECT actual_date from
(
SELECT actual_date, ROW_NUMBER() OVER(ORDER BY actual_date) AS Row
FROM DateTable
WHERE is_holiday= 0 and actual_date > '2013-12-01'
) X
WHERE row = 65
from here:
sql working days holidays
However, this is just one date and I need a column of dates based off of each row. Any thoughts of what the best way to do this would be? I'm using SQL-Server Management Studio.
Completely untested and not thought through:
If the concept of "business days" is common and important in your system, you could add a column "Business Day Sequence" to your table. The column would be a simple unique sequence, incremented by one for every business day and null for every day not counting as a business day.
The data would look something like this:
Date BDAY_SEQ
========== ========
2014-03-03 1
2014-03-04 2
2014-03-05 3
2014-03-06 4
2014-03-07 5
2014-03-08
2014-03-09
2014-03-10 6
Now it's a simple task to find the Nth business day from any date.
You simply do a self join with the calendar table, adding the offset in the join condition.
select a.actual_date
,b.actual_date as nth_business_day
from DateTable a
join DateTable b on(
b.bday_seq = a.bday_seq + 5
);
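Here is a small runnable sketch of the sequence-column idea, using Python's sqlite3 module with an in-memory calendar for 2014-03-03 through 2014-03-17. The is_weekend column name and the way bday_seq is filled (a correlated COUNT) are assumptions for illustration; in SQL Server a ROW_NUMBER() over the business days would be the more natural way to populate it.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE DateTable (
        actual_date TEXT PRIMARY KEY,
        is_weekend  INTEGER,   -- hypothetical column name
        is_holiday  INTEGER,
        bday_seq    INTEGER    -- NULL for non-business days
    )
""")

# Fill 15 calendar days starting on Monday 2014-03-03, no holidays.
start = date(2014, 3, 3)
days = [start + timedelta(days=i) for i in range(15)]
conn.executemany(
    "INSERT INTO DateTable (actual_date, is_weekend, is_holiday) VALUES (?, ?, 0)",
    [(d.isoformat(), int(d.weekday() >= 5)) for d in days],
)

# Number the business days 1, 2, 3, ... in date order.
conn.execute("""
    UPDATE DateTable
    SET bday_seq = (SELECT COUNT(*)
                    FROM DateTable AS b
                    WHERE b.is_weekend = 0 AND b.is_holiday = 0
                      AND b.actual_date <= DateTable.actual_date)
    WHERE is_weekend = 0 AND is_holiday = 0
""")

# Self join with the offset in the join condition, as in the answer.
rows = conn.execute("""
    SELECT a.actual_date, b.actual_date AS nth_business_day
    FROM DateTable a
    JOIN DateTable b ON b.bday_seq = a.bday_seq + 5
    ORDER BY a.actual_date
""").fetchall()

fifth_business_day = dict(rows)
print(fifth_business_day["2014-03-03"])  # 2014-03-10
```

Dates that are not business days themselves have a NULL bday_seq and drop out of the join, so a ProjectClosedDate on a weekend would need its own rule (for example, start counting from the next business day).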

SQL - combine two columns into a comma separated list

The problem I'm facing is probably easy to fix, but I can't seem to find an answer online due to the specificity of the issue.
In my database, I have 3 tables to denote how an educational course is planned. Suppose there is a course called Working with Excel. This means the table Courses has a row for this.
The second table denotes cycles of the same course. If the course is given on Jan 1 2013 and Feb 1 2013, in the underlying tables Cycles, you will find 2 rows, one for each date.
I currently already have an SQL script that gives me two columns: The course name, and a comma separated list with all the Cycle dates.
Please note I am using dd/MM/yyyy notation
This is how it's currently set up (small excerpt, this is the SELECT statement to explain the desired output):
SELECT course.name,
stuff((SELECT distinct ',' + CONVERT(varchar(10), cycleDate, 103) --code 101 = mm/dd/yyyy, code 103 = dd/mm/yyyy
FROM cycles t2
where t2.courseID= course.ID and t2.cycleDate > GETDATE()
FOR XML PATH('')),1,1,'') as 'datums'
The output it gives me:
NAME DATUMS
---------------------------------------------------
Working with Excel 01/01/2013,01/02/2013
Some other course 12/3/2013, 1/4/2013, 1/6/2013
The problem is that I need to add info from the third table I haven't mentioned yet. The table ExtraDays contains additional days for a cycle, in case this spans more than a day.
E.g., if the Working with Excel course takes 3 days, (Jan 1+2+3 and Feb 1+2+3), each of the course cycles will have 2 ExtraDays rows that contain the 'extra days'.
The tables would look like this:
Table COURSES
ID NAME
---------------------------------------------------
1 Working with Excel
Table CYCLES
ID DATE COURSEID
---------------------------------------------------
1 1/1/2013 1
2 1/2/2013 1
Table EXTRADAYS
ID EXTRADATE CYCLEID
---------------------------------------------------
1 2/1/2013 1
2 3/1/2013 1
3 2/2/2013 2
4 3/2/2013 2
I need to add these ExtraDates to the comma-separated list of dates in my output. Preferably sorted, but this is not necessary.
I've been stumped quite some time by this. I have some SQL experience, but apparently not enough for this issue :)
I'm hoping to get the following output:
NAME DATUMS
--------------------------------------------------------------------------------------
Working with Excel 01/01/2013,02/01/2013,03/01/2013,01/02/2013,02/02/2013,03/02/2013
I'm well aware that the database structure could be improved to simplify this, but unfortunately this is a legacy application, I cannot change the structure.
Can anyone point me in the right direction for combining these two columns?
I hope I described my issue clear enough for you. Else, just ask :)
SELECT course.name,
stuff((SELECT distinct ',' + CONVERT(varchar(10), t2.cycleDate, 103) --code 101 = mm/dd/yyyy, code 103 = dd/mm/yyyy
FROM (select courseID, cycleDate from cycles
union
-- extradays has no courseID, so look it up through the parent cycle
select c.courseID, e.extraDate from extradays e
join cycles c on c.ID = e.cycleID) t2
where t2.courseID = course.ID and t2.cycleDate > GETDATE()
FOR XML PATH('')),1,1,'') as 'datums'
FROM courses course
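To check the UNION idea against the sample data from the question, here is a minimal sketch using Python's sqlite3 module. It is not the SQL Server query itself: the dates are stored as ISO strings, the GETDATE() filter is left out because the sample dates are in the past, and the comma-separated list is built in Python instead of with STUFF / FOR XML PATH. All of that is an assumption made to keep the example runnable anywhere.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE courses   (ID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE cycles    (ID INTEGER PRIMARY KEY, cycleDate TEXT, courseID INTEGER);
    CREATE TABLE extradays (ID INTEGER PRIMARY KEY, extraDate TEXT, cycleID INTEGER);

    INSERT INTO courses VALUES (1, 'Working with Excel');
    INSERT INTO cycles VALUES (1, '2013-01-01', 1), (2, '2013-02-01', 1);
    INSERT INTO extradays VALUES (1, '2013-01-02', 1), (2, '2013-01-03', 1),
                                 (3, '2013-02-02', 2), (4, '2013-02-03', 2);
""")

# Cycle dates and extra dates stacked into one (courseID, d) list.
# extradays only knows its cycle, so the course is found through cycles.
rows = conn.execute("""
    SELECT course.name, all_dates.d
    FROM courses AS course
    JOIN (SELECT courseID, cycleDate AS d FROM cycles
          UNION
          SELECT c.courseID, e.extraDate
          FROM extradays AS e
          JOIN cycles AS c ON c.ID = e.cycleID) AS all_dates
      ON all_dates.courseID = course.ID
    ORDER BY course.name, all_dates.d
""").fetchall()

# Group per course and format each date as dd/mm/yyyy.
datums = {}
for name, iso_date in rows:
    formatted = datetime.strptime(iso_date, "%Y-%m-%d").strftime("%d/%m/%Y")
    datums.setdefault(name, []).append(formatted)

result = {name: ",".join(dates) for name, dates in datums.items()}
print(result["Working with Excel"])
```

Because the rows are ordered on the ISO date before formatting, the list comes out in chronological order, which sorting on the dd/mm/yyyy strings would not give.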