SQL Parent Child Relationship - sql

I would like to know how to build a parent/child relationship for a specific set of months. Say we have an employee John and I want to know everyone working under John; I would write a CTE like this:
WITH CTE
AS
(
SELECT @EmployeeIdTmp AS EmployeeId,
0 AS [Level]
UNION ALL
SELECT em.[EmployeeId],
[Level] + 1
FROM Employee em
INNER JOIN CTE t
ON em.[ManagerId] = t.EmployeeId
WHERE (em.[ManagerId] <> em.[EmployeeId]
AND em.[ManagerId] IS NOT NULL)
)
SELECT EmployeeId, [Level]
FROM CTE
This CTE has a specific WHERE condition, but that is just a business rule and doesn't matter here :)
This works perfectly on SQL Server 2008 R2. Now I need to build the hierarchy based not only on the current month; I also need to look back, for example, two months.
Looking at a single month is fine, but when I extend the logic to cover more than one month I get stuck in a circular reference. That makes sense, because John could have Maria working for him in January and the same relationship exists in February. My question is: how can I build a hierarchy based on what happened over a period of time, for example between January and February?
I'm sure there is a way to do it, but mine isn't it :)
Sorry, I'll provide more data. Let's say I need to run a report covering January and February 2015. The company has one organization hierarchy in January, but it could be different in February because an employee changed managers or left the company. All of these changes need to be reflected in my tree view for that period.
Here is an example of my tree view:
For January:
John
Maria
Julia
Darin
For February:
John
Maria
Julia
Nicolas
Darin
If I pick a date range from January to February, I should see a combination of both, including the new employee Nicolas from February. I have a table that keeps a history of the employee/manager hierarchy for each month, so yes, the same data can be repeated from month to month.
Table Employee:
EmployeeId int
ManagerId int
PeriodId int
The PeriodId column is a number that represents a month/year, so the hierarchy for January has PeriodId = 1, February has PeriodId = 2, and so on. PeriodId is unique per month/year.
I have a table-valued function containing the CTE above; it receives a manager and returns all employees under him along with their level.
My CTE including the PeriodId looks like this:
WITH CTE
AS
(
SELECT @EmployeeIdTmp AS EmployeeId,
0 AS [Level]
UNION ALL
SELECT em.[EmployeeId],
[Level] + 1
FROM Employee em
INNER JOIN #PeriodIds p
ON em.[PeriodId] = p.[PeriodId]
INNER JOIN CTE t
ON em.[ManagerId] = t.EmployeeId
WHERE (em.[ManagerId] <> em.[EmployeeId]
AND em.[ManagerId] IS NOT NULL)
)
SELECT EmployeeId, [Level]
FROM CTE
When I check a single month, all is good. But as soon as I request data for two months, for example, the query takes too long and returns the same rows more than twice, even though I specified just two months.

If I understand correctly, the question can be reduced to "how to prevent circular traversal in CTE queries". Look here for an answer: TSQL CTE: How to avoid circular traversal?
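One way to keep repeated monthly rows from multiplying the recursion is to collapse them into distinct manager/employee pairs before walking the tree. The following is only a minimal sketch under stated assumptions: it runs on SQLite through Python's sqlite3 module rather than SQL Server, the employee ids (1 = John, 2 = Maria, 3 = Julia, 4 = Darin, 5 = Nicolas) and the shape of the hierarchy are invented for illustration, and the depth limit of 20 is an arbitrary guard against a genuine cycle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (EmployeeId INT, ManagerId INT, PeriodId INT);
-- Assumed January hierarchy (PeriodId 1): John > Maria > Julia, John > Darin
INSERT INTO Employee VALUES (1, NULL, 1), (2, 1, 1), (3, 2, 1), (4, 1, 1);
-- Assumed February hierarchy (PeriodId 2): same, plus Nicolas under Maria
INSERT INTO Employee VALUES (1, NULL, 2), (2, 1, 2), (3, 2, 2), (5, 2, 2), (4, 1, 2);
""")

query = """
WITH RECURSIVE
Edges AS (
    -- One row per manager/employee pair, however many months it appears in
    SELECT DISTINCT EmployeeId, ManagerId
    FROM Employee
    WHERE PeriodId IN (?, ?)
      AND ManagerId IS NOT NULL
      AND ManagerId <> EmployeeId
),
Tree(EmployeeId, Level) AS (
    SELECT ?, 0
    UNION
    SELECT e.EmployeeId, t.Level + 1
    FROM Edges e
    JOIN Tree t ON e.ManagerId = t.EmployeeId
    WHERE t.Level < 20          -- arbitrary depth guard (assumption)
)
SELECT EmployeeId, MIN(Level) AS Level
FROM Tree
GROUP BY EmployeeId
ORDER BY EmployeeId
"""

# Periods 1 and 2, starting from manager 1 (John)
rows = conn.execute(query, (1, 2, 1)).fetchall()
for row in rows:
    print(row)
```

Taking MIN(Level) per employee is a design choice for the case where someone sits at different depths in different months; the report could just as well keep one row per period.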

Related

Summary data even when department is missing for a day

I have data submitted by several departments that I need to summarise to output on a report.
Most days, every department submits data. Some days, a department might miss submitting data.
I need to reflect a zero value entry for that department for the day, rather than skipping it.
I don't know why, but this is striking me as a difficult challenge.
If my data looks like this:
Date, Department, Employee
1 May 2016, First, Fred
1 May 2016, First, Wilma
1 May 2016, Second, Betty
1 May 2016, Second, Barney
2 May 2016, Second, Betty
3 May 2016, First, Wilma
3 May 2016, Second, Betty
3 May 2016, Second, Barney
If I do a count(*) on this data, the output I am hoping for is:
1 May 2016, First, 2
1 May 2016, Second, 2
2 May 2016, First, 0
2 May 2016, Second, 1
3 May 2016, First, 1
3 May 2016, Second, 2
It's the 3rd line, "2 May 2016, First, 0", that I can't get my output to include.
My underlying data is more complex than the above, but it is a reasonable simplified representation of the problem.
I'm at the point where I'm messing around with cursors trying to 'build' this recordset, so I think that's a clue that I need to ask for help.
Assuming that your main table is:
create table mydata
(ReportDate date,
department varchar2(20),
Employee varchar2(20));
We can use the below query:
with dates (reportDate) as
(select to_date('01-05-2016','dd-mm-yyyy') + rownum -1
from all_objects
where rownum <=
to_date('03-05-2016','dd-mm-yyyy')-to_date('01-05-2016','dd-mm-yyyy')+1 ),
departments( department) as
( select 'First' from dual
union all
select 'Second' from dual) ,
AllReports ( reportDate, Department) as
(select dt.reportDate,
dp.department
from dates dt
cross join
departments dp )
select ar.reportDate, ar.department, count(md.employee)
from AllReports ar
left join myData md
on ar.ReportDate = md.reportDate and
ar.department = md.department
group by ar.reportDate, ar.department
order by 1, 2
First we generate the dates we are interested in (between 01-05-2016 and 03-05-2016 in this sample); that is the dates CTE.
Next we generate the list of departments in the departments CTE.
We cross join the two to generate every possible report row in the AllReports CTE.
Finally we LEFT JOIN to your main table to find out which data exists and which is missing. Because count(md.employee) ignores NULLs, the missing combinations come back with a count of 0.
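The same calendar, cross join, and LEFT JOIN steps can be tried outside Oracle. This is a hedged sketch that assumes SQLite through Python's sqlite3 module, stores dates as ISO text, generates the date range with a recursive CTE instead of all_objects, and derives the department list from the data itself rather than hard-coding it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mydata (ReportDate TEXT, department TEXT, Employee TEXT);
INSERT INTO mydata VALUES
  ('2016-05-01', 'First',  'Fred'),
  ('2016-05-01', 'First',  'Wilma'),
  ('2016-05-01', 'Second', 'Betty'),
  ('2016-05-01', 'Second', 'Barney'),
  ('2016-05-02', 'Second', 'Betty'),
  ('2016-05-03', 'First',  'Wilma'),
  ('2016-05-03', 'Second', 'Betty'),
  ('2016-05-03', 'Second', 'Barney');
""")

rows = conn.execute("""
WITH RECURSIVE dates(reportDate) AS (
    -- Every day in the reporting window, whether or not data exists
    SELECT '2016-05-01'
    UNION ALL
    SELECT date(reportDate, '+1 day') FROM dates
    WHERE reportDate < '2016-05-03'
),
departments(department) AS (
    SELECT DISTINCT department FROM mydata
)
SELECT dt.reportDate, dp.department, COUNT(md.Employee) AS employees
FROM dates dt
CROSS JOIN departments dp
LEFT JOIN mydata md
       ON md.ReportDate = dt.reportDate
      AND md.department = dp.department
GROUP BY dt.reportDate, dp.department
ORDER BY 1, 2
""").fetchall()

for row in rows:
    print(row)
```

Deriving departments from mydata only works if every department appears at least once in the table; a separate departments table would be safer if one exists.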

Two tables with no direct relationship

I have 2 tables with no relation between them. I want to display the data in tabular format by month. Here is a sample output:
There are 2 different tables
1 for income
1 for expense
Problem is that we have no direct relation between these. The only commonality between them is month (date). Does anyone have a suggestion on how to generate such a report?
Here is my union query:
SELECT TO_DATE(TO_CHAR(PAY_DATE,'MON-YYYY'), 'MON-YYYY') , 'FEE RECEIPT', NVL(SUM(SFP.AMOUNT_PAID),0) AMT_RECIEVED
FROM STU_FEE_PAYMENT SFP, STU_CLASS SC, CLASS C
WHERE SC.CLASS_ID = C.CLASS_ID
AND SFP.STUDENT_NO = SC.STUDENT_NO
AND PAY_DATE BETWEEN '01-JAN-2014' AND '31-DEC-2014'
AND SFP.AMOUNT_PAID >0
GROUP BY TO_CHAR(PAY_DATE,'MON-YYYY')
UNION
SELECT TO_DATE(TO_CHAR(EXP_DATE,'MON-YYYY'), 'MON-YYYY') , ET.DESCRIPTION, SUM(EXP_AMOUNT)
FROM EXP_DETAIL ED, EXP_TYPE ET, EXP_TYPE_DETAIL ETD
WHERE ET.EXP_ID = ETD.EXP_ID
AND ED.EXP_ID = ET.EXP_ID
AND ED.EXP_DETAIL_ID = ETD.EXP_DETAIL_ID
AND EXP_DATE BETWEEN '01-JAN-2014' AND '31-DEC-2014'
GROUP BY TO_CHAR(EXP_DATE,'MON-YYYY'), ET.DESCRIPTION
ORDER BY 1
In order to do this you probably want to make the Income and Expenses into separate sub-queries.
I have taken the two parts of your union query and separated them into sub-queries, one called Income and one called Expenses. Both sub-queries summarise the data per month as before, but now you can JOIN on the month so the data from each sub-query can be connected. Note: I have used a FULL OUTER JOIN, because this still returns a month where there is no income but there is expense, and vice versa. This will require some manipulation, because you are probably better off returning zeros for a month in which no transactions occur.
In the top level SELECT, replace the use of *, with the correct listing of fields required. I simply used this to show that each field can be reused from the sub-query in the outer query, by referring to the alias as the table name.
SELECT Income.*, Expenses.*
FROM (SELECT TO_DATE(TO_CHAR(PAY_DATE,'MON-YYYY'), 'MON-YYYY') as Month, 'FEE RECEIPT', NVL(SUM(SFP.AMOUNT_PAID),0) AMT_RECIEVED
FROM STU_FEE_PAYMENT SFP, STU_CLASS SC, CLASS C
WHERE SC.CLASS_ID = C.CLASS_ID
AND SFP.STUDENT_NO = SC.STUDENT_NO
AND PAY_DATE BETWEEN '01-JAN-2014' AND '31-DEC-2014'
AND SFP.AMOUNT_PAID >0
GROUP BY TO_CHAR(PAY_DATE,'MON-YYYY')) Income
FULL OUTER JOIN (SELECT TO_DATE(TO_CHAR(EXP_DATE,'MON-YYYY'), 'MON-YYYY') as Month, ET.DESCRIPTION, SUM(EXP_AMOUNT)
FROM EXP_DETAIL ED, EXP_TYPE ET, EXP_TYPE_DETAIL ETD
WHERE ET.EXP_ID = ETD.EXP_ID
AND ED.EXP_ID = ET.EXP_ID
AND ED.EXP_DETAIL_ID = ETD.EXP_DETAIL_ID
AND EXP_DATE BETWEEN '01-JAN-2014' AND '31-DEC-2014'
GROUP BY TO_CHAR(EXP_DATE,'MON-YYYY'), ET.DESCRIPTION) Expenses
ON Income.Month = Expenses.Month
There are still many calculations that you will have to add to get your final result, and you will have to work on those separately. The final query that produces what you expect above will likely be a lot longer than this; I am just trying to show you the structure.
However, the final tricky part for you is going to be the BBF (Balance Brought Forward). SQL is great at joining tables and columns, but each row is handled separately: a plain query does not read a value from the previous row and let you manipulate that value in the next row. To do this you need another sub-query that SUM()s all the changes from a point in time up until the start of the month. Financial products normally store the balance at points in time, because it is possible that not all transactions are accurately recorded and there needs to be a mechanism to adjust the balance. Following that approach, you need to write your sub-query to summarise all changes since the previous stored balance.
IMO Financial applications are inherently complex, so the solution is going to take some time to mould into the right one.
Final Word: I am not familiar with OracleReports, but there may be something in there which will assist with maintaining the BBF.
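The brought-forward idea, summing every change before the start of a month, can be sketched with a correlated sub-query. This is only an illustration under assumptions: it uses SQLite through Python's sqlite3 module, a hypothetical MonthlyNet table that already holds income minus expense per month, made-up figures, and an opening balance of zero.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table: net change (income minus expense) per month
CREATE TABLE MonthlyNet (Month TEXT, net REAL);
INSERT INTO MonthlyNet VALUES
  ('2014-01', 0),
  ('2014-02', -12000),
  ('2014-03', 5000);
""")

rows = conn.execute("""
SELECT m.Month,
       -- Balance brought forward: all net changes before this month
       COALESCE((SELECT SUM(p.net)
                 FROM MonthlyNet p
                 WHERE p.Month < m.Month), 0.0) AS brought_forward,
       m.net
FROM MonthlyNet m
ORDER BY m.Month
""").fetchall()

for row in rows:
    print(row)
```

Storing months as 'YYYY-MM' text keeps the `<` comparison chronological; with a stored balance checkpoint the sub-query would start from that checkpoint instead of from the beginning.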
sqlite> create table Income(Month text, total_income real);
sqlite> create table Expense(Month text, total_expense real);
sqlite> insert into Income values('Jan 2014', 9000);
sqlite> insert into Income values('Feb 2014', 6000);
sqlite> insert into Expense values('Jan 2014', 9000);
sqlite> insert into Expense values('Feb 2014', 18000);
sqlite> select Income.Month, Income.total_income, Expense.total_expense, Income.total_income - Expense.total_expense as Balance from Income, Expense where Income.Month == Expense.Month;
Jan 2014|9000.0|9000.0|0.0
Feb 2014|6000.0|18000.0|-12000.0

SQL: Can GROUP BY contain an expression as a field?

I want to group a set of dated records by year, when the date is to the day. Something like:
SELECT venue, YEAR(date) AS yr, SUM(guests) AS yr_guests
FROM Events
...
GROUP BY venue, YEAR(date);
The above is giving me results instead of an error, but the results are not grouping by year and venue; they do not appear to be grouping at all.
My brute force solution would be a nested subquery: add the YEAR() AS yr as an extra column in the subquery, then do the grouping on yr in the outer query. I'm just trying to learn to do as much as possible without nesting, because nesting usually seems horribly inefficient.
I would tell you the exact SQL implementation I'm using, but I've had trouble discovering it. (I'm working through the problems on http://www.sql-ex.ru/ and if you can tell what they're using, I'd love to know.) Edited to add: Per test in comments, it is probably not SQL Server.
Edited to add the results I am getting (note the first two should be summed):
venue | yr | yr_guests
1 2012 15
1 2012 35
2 2012 12
1 2008 15
I expect those first two lines to instead be summed as
1 2012 50
Works Fine in SQL Server 2008.
See working Example here: http://sqlfiddle.com/#!3/3b0f9/6
Code pasted Below.
Create The Events Table
CREATE TABLE [Events]
( Venue INT NOT NULL,
[Date] DATETIME NOT NULL,
Guests INT NOT NULL
)
Insert the Rows.
INSERT INTO [Events] VALUES
(1,convert(datetime,'2012'),15),
(1,convert(datetime,'2012'),35),
(2,convert(datetime,'2012'),12),
(1,convert(datetime,'2008'),15);
GO
-- Testing, select newly inserted rows.
--SELECT * FROM [Events]
--GO
Run the GROUP BY Sql.
SELECT Venue, YEAR(date) AS yr, SUM(guests) AS yr_guests
FROM Events
GROUP BY venue, YEAR(date);
See the Output Results.
VENUE YR YR_GUESTS
1 2008 15
1 2012 50
2 2012 12
It depends on your database engine (or SQL dialect).
To be safe across different database systems and versions, use a subquery:
SELECT venue, theyear, SUM(guests) AS yr_guests
FROM (
SELECT venue, YEAR(date) AS theyear, guests
FROM Events
) AS t
GROUP BY venue, theyear
The subquery builds a derived table of
venue, YEAR(date) AS theyear, guests
aaaa, 2001, brother
aaaa, 2001, bbrother
bbbb, 2001, nobody
... and so on
and the outer query then
aggregates those rows per group
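Grouping directly by an expression can be checked quickly in another engine. A minimal sketch, assuming SQLite through Python's sqlite3 module: SQLite has no YEAR() function, so strftime('%Y', ...) stands in for it, and the exact dates are invented to match the years in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Events (venue INT NOT NULL, date TEXT NOT NULL, guests INT NOT NULL);
-- Dates are invented; only the years match the question's sample
INSERT INTO Events VALUES
  (1, '2012-03-01', 15),
  (1, '2012-09-15', 35),
  (2, '2012-05-20', 12),
  (1, '2008-07-04', 15);
""")

rows = conn.execute("""
SELECT venue,
       CAST(strftime('%Y', date) AS INTEGER) AS yr,
       SUM(guests) AS yr_guests
FROM Events
-- The GROUP BY repeats the expression rather than relying on the alias
GROUP BY venue, CAST(strftime('%Y', date) AS INTEGER)
ORDER BY venue, yr
""").fetchall()

for row in rows:
    print(row)
```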

select all records from one table and return null values where they do not have another record in second table

I have looked high and low for this particular query and have not seen it.
We have two tables: an Accounts table and a Visit table. I want to return the complete list of account names and fill in the corresponding fields with either null or the correct year etc. This data is used in a matrix report in SSRS.
sample:
Accounts:
AccountName   AccountGroup   Location
Brown Jug     Brown Group    Auckland
Top Shop      Top Group      Wellington
Super Shop    Super Group    Christchurch
Visit:
AcccountName  VisitDate      VisitAction
Brown Jug     12/12/2012     complete
Super Shop    1/10/2012      complete
I need to select weekly visits and show the accounts that have had a complete visit as well as the accounts that did not have a visit.
E.g. for the week of 10/12/2012 the output should show:
Year  Week  AccountName  VisitStatus
2012  50    Brown Jug    complete
2012  50    Top Group    not complete
2012  50    Super Shop   not complete
E.g. for the week of 1/10/2012 the output should show:
Year  Week  AccountName  VisitStatus
2012  2     Brown Jug    not complete
2012  2     Top Group    not complete
2012  2     Super Shop   complete
Please correct me if I am wrong:
select to_char(v.visitdate,'YYYY') year,
to_char(v.visitdate,'WW') week,a.accountname,v.visitaction
from accounts a,visit v
where a.accountname=v.ACCCOUNTNAME
and to_char(v.visitdate,'WW')=to_char(sysdate,'WW')
union all
select to_char(sysdate,'YYYY') year,
to_char(sysdate,'WW') week,a.accountname,'In Complete'
from accounts a
where a.accountname not in ( select v.ACCCOUNTNAME
from visit v where to_char(v.visitdate,'WW')=to_char(sysdate,'WW'));
The following answer assumes that
A) You want to see every week within a given range, whether any accounts were visited in that week or not.
B) You want to see all accounts for each week
C) For accounts that were visited in a given week, show their actual VisitAction.
D) For accounts that were NOT visited in a given week, show "not complete" as the VisitAction.
If all those are the case then the following query may do what you need. There is a functioning sqlfiddle example that you can play with here: http://sqlfiddle.com/#!3/4aac0/7
--First, get all the dates in the current year.
--This uses a Recursive CTE to generate a date
--for each week between a start date and an end date
--In SSRS you could create report parameters to replace
--these values.
WITH WeekDates AS
(
SELECT CAST('1/1/2012' AS DateTime) AS WeekDate
UNION ALL
SELECT DATEADD(WEEK,1,WeekDate) AS WeekDate
FROM WeekDates
WHERE DATEADD(WEEK,1,WeekDate) <= CAST('12/31/2012' AS DateTime)
),
--Next, add meta data to the weeks from above.
--Get the WeekYear and WeekNumber for each week.
--Note, you could skip this as a separate query
--and just include these columns in the next query;
--I've written it this way for clarity
Weeks AS
(
SELECT
WeekDate,
DATEPART(Year,WeekDate) AS WeekYear,
DATEPART(WEEK,WeekDate) AS WeekNumber
FROM WeekDates
),
--Cross join the weeks data from above with the
--Accounts table. This will make sure that we
--get a row for each account for each week.
--Be aware, this will be a large result set
--if there are a lot of weeks & accounts (weeks * account)
AccountWeeks AS
(
SELECT
*
FROM Weeks AS W
CROSS JOIN Accounts AS A
)
--Finally LEFT JOIN the AccountWeek data from above
--to the Visits table. This will ensure that we
--see each account/week, and we'll get nulls for
--the visit data for any accounts that were not visited
--in a given week.
SELECT
A.WeekYear,
A.WeekNumber,
A.AccountName,
A.AccountGroup,
IsNull(V.VisitAction,'not complete') AS VisitAction
FROM AccountWeeks AS A
LEFT JOIN Visits AS V
ON A.AccountName = V.AccountName
AND A.WeekNumber = DATEPART(WEEK,V.VisitDate)
--Set the maxrecursion number to a number
--larger than the number of weeks you will return
OPTION (MAXRECURSION 200);
I hope that helps.
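For readers without SQL Server at hand, the weeks, CROSS JOIN, and LEFT JOIN pattern can be sketched as follows. This is an assumption-laden illustration, not the answer's query: it runs on SQLite through Python's sqlite3 module, stores dates as ISO text, reads the question's dates as day/month/year, covers only two hard-coded weeks starting on Mondays, and matches visits by date range instead of DATEPART(WEEK, ...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Accounts (AccountName TEXT, AccountGroup TEXT, Location TEXT);
CREATE TABLE Visits (AccountName TEXT, VisitDate TEXT, VisitAction TEXT);
INSERT INTO Accounts VALUES
  ('Brown Jug',  'Brown Group', 'Auckland'),
  ('Top Shop',   'Top Group',   'Wellington'),
  ('Super Shop', 'Super Group', 'Christchurch');
-- Question's dates read as day/month/year (assumption)
INSERT INTO Visits VALUES
  ('Brown Jug',  '2012-12-12', 'complete'),
  ('Super Shop', '2012-10-01', 'complete');
""")

rows = conn.execute("""
WITH RECURSIVE Weeks(WeekStart) AS (
    -- Two consecutive week start dates, chosen for the example
    SELECT '2012-12-03'
    UNION ALL
    SELECT date(WeekStart, '+7 days') FROM Weeks
    WHERE WeekStart < '2012-12-10'
)
SELECT w.WeekStart,
       a.AccountName,
       COALESCE(v.VisitAction, 'not complete') AS VisitStatus
FROM Weeks w
CROSS JOIN Accounts a            -- one row per account per week
LEFT JOIN Visits v               -- NULL when the account was not visited
       ON v.AccountName = a.AccountName
      AND v.VisitDate >= w.WeekStart
      AND v.VisitDate <  date(w.WeekStart, '+7 days')
ORDER BY w.WeekStart, a.AccountName
""").fetchall()

for row in rows:
    print(row)
```

Matching on a date range avoids mixing up the same week number from different years, which a join on week number alone would do once the range spans more than one year.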

Advice on database design / SQL for retrieving data with chronological order

I am creating a database that will help keep track of which employees have been on a certain training course. I would like to get some guidance on the best way to design the database.
Specifically, each employee must attend the training course each year, and my database needs to keep a history of all the dates on which they have attended the course in the past.
The end user will use the software as a planning tool to help them book future course dates for employees. When they select a given employee they will see:
(a) Last attendance date
(b) Projected future attendance date(i.e. last attendance date + 1 calendar year)
In terms of my database, any given employee may have multiple past course attendance dates:
EmpName AttandanceDate
Joe Bloggs 1st Jan 2007
Joe Bloggs 4th Jan 2008
Joe Bloggs 3rd Jan 2009
Joe Bloggs 8th Jan 2010
My question is what is the best way to set up the database to make it easy to retrieve the most recent course attendance date? In the example above, the most recent would be 8th Jan 2010.
Is there a good way to use SQL to sort by date and pick the MAX date?
My other idea was to add a column called ‘MostRecent’ and just set this to TRUE.
EmpName AttandanceDate MostRecent
Joe Bloggs 1st Jan 2007 False
Joe Bloggs 4th Jan 2008 False
Joe Bloggs 3rd Jan 2009 False
Joe Bloggs 8th Jan 2010 True
I wondered if this would simplify the SQL i.e.
SELECT Joe Bloggs WHERE MostRecent = ‘TRUE’
Also, when the user updates a given employee’s attendance record (i.e. with the latest attendance date) I could use SQL to:
(a) Search for the employee and set the MostRecent value to FALSE
(b) Add a new record with MostRecent set to TRUE
Would anybody recommend either method over the other? Or do you have a completely different way of solving this problem?
To get the last attendance date use the group function called MAX, i.e.
SELECT MAX(AttandanceDate)
FROM course_data
WHERE employee_name = 'Joe Bloggs'
To get the max attendance date for all the employees:
SELECT employee_name, MAX(AttandanceDate)
FROM course_data
GROUP BY employee_name
ORDER BY employee_name
Query above will NOT return data for employees who haven't attended any courses. So you need to execute a different query.
SELECT A.employee_name, B.AttandanceDate
FROM employee AS A
LEFT JOIN (
SELECT employee_id, MAX(AttandanceDate) AS AttandanceDate
FROM course_data
GROUP BY employee_id
) AS B ON A.id = B.employee_id
ORDER BY A.employee_name
For employees who haven't attended any course, the query will return a NULL AttandanceDate.
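The last attendance date and the projected next date from the question can be computed in one pass. A minimal sketch, assuming SQLite through Python's sqlite3 module, dates stored as ISO text, the employee and course_data tables named as in the query above, and a hypothetical second employee (Jane Doe) who has never attended:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (id INTEGER PRIMARY KEY, employee_name TEXT);
CREATE TABLE course_data (employee_id INTEGER, AttandanceDate TEXT);
-- Jane Doe is a made-up employee with no attendance history
INSERT INTO employee VALUES (1, 'Joe Bloggs'), (2, 'Jane Doe');
INSERT INTO course_data VALUES
  (1, '2007-01-01'),
  (1, '2008-01-04'),
  (1, '2009-01-03'),
  (1, '2010-01-08');
""")

rows = conn.execute("""
SELECT e.employee_name,
       MAX(c.AttandanceDate)                   AS last_attended,
       date(MAX(c.AttandanceDate), '+1 year')  AS next_due
FROM employee e
LEFT JOIN course_data c ON c.employee_id = e.id
GROUP BY e.id, e.employee_name
ORDER BY e.employee_name
""").fetchall()

for row in rows:
    print(row)
```

Because the most recent date is derived with MAX on every read, no MostRecent flag has to be kept in sync when a new attendance row is inserted.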
The flag is redundant. Another way to get the last attendance date for an employee:
select top 1 AttandanceDate
from course_data
WHERE employee_name = 'Joe Bloggs'
order by AttandanceDate desc
This may already be the case, but the output from the AttandanceDate column makes me suspect that it may not be a datetime column. Most RDBMSs have some sort of date, time, and/or datetime data types for storing this information, in which case KandadaBoggu's and OMG Ponies' responses are perfect. But if you are storing your dates as strings, you WILL have issues trying to do any of their suggestions.
Using a datetime data type usually also opens up the possibility of extracting date details,
e.g. SELECT YEAR('2008-01-01') will return 2008 as an integer.
If you are running SQL Server 2005, 2008, or later, you can use row_number() to do something like the following. This will list everyone, with their most recent attendance.
with temp1 as
(select *
, (row_number() over (partition by EmpName order by AttandanceDate desc))
as [CourseAttendanceOrder]
from AttendanceHistory)
select *
from temp1
where CourseAttendanceOrder = 1
order by EmpName
This could be put into a view so you can use it as needed.
However, if you always will be focused on one individual at a time, it may be more efficient to make a stored procedure that can use statements like select max(AttandanceDate) for just the person you are working on.