I have been tasked with analyzing license utilization via data stored in a database controlled by Flexnet manager (flexlm). What I need to graph is the number of concurrent licenses in use for a specific period of time (high water mark). I am having some trouble doing this as I have very little experience of SQL or BI tools.
I have taken 2 tables, license_events and license_features (these have been filtered for specific users and features). I have then done a join to create a License_feature table. Sample data looks as follows:
CLIENT_PROJECT  FEATURE_NAME  LIC_COUNT  START_TIME  EVENT_TIME  DURATION
BGTV            eclipse       1          1422272438  1422278666  6228
BGTV            eclipse       1          1422443815  1422443845  30
BGTV            eclipse       1          1422615676  1422615681  5
BGTV            eclipse       1          1422631395  1422631399  4
BGTV            eclipse       4          1422631431  1422631434  3
BGTV            eclipse       1          1422631465  1422631474  9
BGTV            eclipse       1          1422631472  1422631474  2
BGTV            eclipse       2          1422632128  1422632147  19
BGTV            eclipse       1          1422632166  1422632179  13
BGTV            eclipse       6          1422632197  1422632211  14
What I need now is to graph something like this:
For each time (second)
sum(LIC_COUNT) where start_time <= time && end_time >= time
Ideally this should give me the number of concurrent licenses checked out at a specific second. Even better would be if I could get this information for a different time period such as hours or days.
How could I go about doing this?
Use the GROUP BY clause to group the SUM() by a specific column. For example, to group the SUM() of LIC_COUNT by each START_TIME:
SELECT START_TIME, SUM(LIC_COUNT) AS TOTAL_LIC_COUNT
FROM YOUR_TABLE
GROUP BY START_TIME
Now, to SUM() all LIC_COUNT at each increment between START_TIME and END_TIME, those time values need to exist somewhere as rows. For example, suppose you create a table called UniqueTimes that contains every possible value between your earliest START_TIME and your latest END_TIME (EVENT_TIME in your sample data). Then you could do something like the following:
SELECT UniqueTime, SUM(LIC_COUNT) AS TotalLicCount
FROM UniqueTimes
LEFT JOIN YOUR_TABLE ON (UniqueTime >= START_TIME AND UniqueTime <= END_TIME)
GROUP BY UniqueTime
This groups your rows by each unique time and shows the total of all summed LIC_COUNT at that specific time. Times at which no license is checked out appear with a NULL total.
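If a row-per-second table is too large, the same numbers can be computed outside the database. The following is a minimal Python sketch, not part of the original answer: it assumes the rows have already been fetched as (LIC_COUNT, START_TIME, EVENT_TIME) tuples in epoch seconds, that EVENT_TIME is the inclusive end of the checkout, and the function names are made up for illustration.

```python
from collections import defaultdict

# Sample rows from the question: (LIC_COUNT, START_TIME, EVENT_TIME), epoch seconds.
rows = [
    (1, 1422272438, 1422278666),
    (1, 1422443815, 1422443845),
    (1, 1422615676, 1422615681),
    (1, 1422631395, 1422631399),
    (4, 1422631431, 1422631434),
    (1, 1422631465, 1422631474),
    (1, 1422631472, 1422631474),
    (2, 1422632128, 1422632147),
    (1, 1422632166, 1422632179),
    (6, 1422632197, 1422632211),
]


def concurrency_changes(rows):
    """Return [(time, licenses_in_use)] for every second where the count changes."""
    delta = defaultdict(int)
    for count, start, end in rows:
        delta[start] += count    # licenses checked out at `start`
        delta[end + 1] -= count  # still in use at `end` (inclusive), released after
    in_use = 0
    changes = []
    for t in sorted(delta):
        in_use += delta[t]
        changes.append((t, in_use))
    return changes


def high_water_marks(rows, bucket_seconds=3600):
    """Peak concurrent licenses per bucket (hourly by default), keyed by bucket start."""
    changes = concurrency_changes(rows)
    peaks = defaultdict(int)
    for i, (t, in_use) in enumerate(changes):
        # This level holds until the second before the next change.
        last = changes[i + 1][0] - 1 if i + 1 < len(changes) else t
        bucket = t - t % bucket_seconds
        while bucket <= last:
            peaks[bucket] = max(peaks[bucket], in_use)
            bucket += bucket_seconds
    return dict(peaks)
```

Passing bucket_seconds=86400 would give daily high water marks instead of hourly ones.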
I hope this helps.
The gem we have installed (Blazer) on our site limits us to one query.
We are trying to write a query to show how many hours each employee has for the past 10 days. The first column would have employee names and the rest would have hours with the column header being each date. I'm having trouble figuring out how to make the column headers dynamic based on the day. The following is an example of what we have working without dynamic column headers and only using 3 days.
SELECT
pivot_table.*
FROM
crosstab(
E'SELECT
"User",
"Date",
"Hours"
FROM
(SELECT
"q"."qdb_users"."name" AS "User",
to_char("qdb_works"."date", \'YYYY-MM-DD\') AS "Date",
sum("qdb_works"."hours") AS "Hours"
FROM
"q"."qdb_works"
LEFT OUTER JOIN
"q"."qdb_users" ON
"q"."qdb_users"."id" = "q"."qdb_works"."qdb_user_id"
WHERE
"qdb_works"."date" > current_date - 20
GROUP BY
"User",
"Date"
ORDER BY
"Date" DESC,
"User" DESC) "x"
ORDER BY 1, 2')
AS
pivot_table (
"User" VARCHAR,
"2017-10-06" FLOAT,
"2017-10-05" FLOAT,
"2017-10-04" FLOAT
);
This results in
| User | 2017-10-05 | 2017-10-04 | 2017-10-03 |
|------|------------|------------|------------|
| John | 1.5 | 3.25 | 2.25 |
| Jill | 6.25 | 6.25 | 6 |
| Bill | 2.75 | 3 | 4 |
This is correct, but tomorrow, the column headers will be off unless we update the query every day. I know we could pivot this table with date on the left and names on the top, but that will still need updating with each new employee – and we get new ones often.
We have tried using functions and queries in the "AS" section with no luck. For example:
AS
pivot_table (
"User" VARCHAR,
current_date - 0 FLOAT,
current_date - 1 FLOAT,
current_date - 2 FLOAT
);
Is there any way to pull this off with one query?
You could select a row for each user, and then per column sum the hours for one day:
with user_work as
(
-- "user" is a reserved word in PostgreSQL, so the alias is user_name
select u.name as user_name
, to_char(w.date, 'YYYY-MM-DD') as dt_str
, w.hours
from qdb_works w
join qdb_users u
on u.id = w.qdb_user_id
where w.date >= current_date - interval '2 days'
)
select user_name
, sum(case when dt_str = to_char(current_date,
'YYYY-MM-DD') then hours end) as Today
, sum(case when dt_str = to_char(current_date - interval '1 day',
'YYYY-MM-DD') then hours end) as Yesterday
, sum(case when dt_str = to_char(current_date - interval '2 days',
'YYYY-MM-DD') then hours end) as DayBeforeYesterday
from user_work
-- group by user only, so each user gets a single row with one column per day
group by
user_name
It's often easier to return a list and pivot it client side. That also allows you to generate column names with a date.
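As a rough illustration of the client-side pivot, here is a small Python sketch. It assumes the query returns plain (user, date, hours) rows; the function name pivot_hours is hypothetical, and the sample values are taken from the result table in the question.

```python
from collections import defaultdict
from datetime import date, timedelta


def pivot_hours(rows, today, days=3):
    """rows: list of (user, work_date, hours). Returns (header, table_rows)."""
    dates = [today - timedelta(days=i) for i in range(days)]
    totals = defaultdict(float)
    for user, work_date, hours in rows:
        totals[(user, work_date)] += hours
    users = sorted({user for user, _, _ in rows})
    # Column names are generated from the dates, so they never go stale.
    header = ["User"] + [d.isoformat() for d in dates]
    # A user with no hours on a given day gets None in that cell.
    table = [[u] + [totals.get((u, d)) for d in dates] for u in users]
    return header, table


rows = [
    ("John", date(2017, 10, 5), 1.5),
    ("John", date(2017, 10, 4), 3.25),
    ("John", date(2017, 10, 3), 2.25),
    ("Jill", date(2017, 10, 5), 6.25),
    ("Jill", date(2017, 10, 4), 6.25),
    ("Jill", date(2017, 10, 3), 6.0),
    ("Bill", date(2017, 10, 5), 2.75),
    ("Bill", date(2017, 10, 4), 3.0),
    ("Bill", date(2017, 10, 3), 4.0),
]
header, table = pivot_hours(rows, today=date(2017, 10, 5))
```

New employees and new dates both show up without touching the SQL, because the query only returns the long list.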
Is there any way to pull this off with one query?
No, because a fixed SQL query cannot have any variability in its output columns. The SQL engine determines the number, types and names of every column of a query before executing it, without reading any data except in the catalog (for the structure of tables and other objects), execution being just the last of 5 stages.
A single-query dynamic pivot, if such a thing existed, couldn't be prepared, since a prepared query always has the same result structure, whereas by definition a dynamic pivot doesn't, as the rows that pivot into columns can change between executions. That would again be at odds with the Prepare-Bind-Execute model.
You may find some limited workarounds and additional explanations in other questions, for example: Execute a dynamic crosstab query, but since you mentioned specifically:
The gem we have installed (Blazer) on our site limits us to one
query
I'm afraid you're out of luck. Whatever the workaround, it always needs at least one step with a query to figure out the columns and generate a dynamic query from them, and a second step that executes the query generated at the previous step.
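To make the two-step idea concrete, the first step could look like the following Python sketch, which only generates the column definition list for the AS clause used in the question. The function name pivot_column_defs is made up for illustration, and the generated text would still have to be spliced into the crosstab query and executed as a second step, which is exactly what a one-query tool does not allow.

```python
from datetime import date, timedelta


def pivot_column_defs(today, days=3):
    """Build the column definition list for crosstab's AS clause, newest day first."""
    cols = ['"User" VARCHAR']
    for i in range(days):
        d = today - timedelta(days=i)
        cols.append(f'"{d.isoformat()}" FLOAT')
    return "pivot_table (\n    " + ",\n    ".join(cols) + "\n)"


clause = pivot_column_defs(date(2017, 10, 6))
```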
I have a table that looks like this:
**ActivityNumber -- TimeStamp -- PreviousActivityNumber -- Team**
1234-4 -- 01/01/2017 14:12 -- 1234-3 -- Team A
There are 400,000 rows.
The ActivityNumber is a unique ticket number with the activity count attached. There are 4 teams.
Each activitynumber is in the table.
I need to calculate the average time taken between updates for each team, for each month (to see how each team is improving over time).
I produced a query which counts the number of activities per team per month - so I'm part way there.
I'm unable to find the timestamp for the previousActivityNumber so I can subtract it from the current Activity number. If I could get this, I could run an average on it.
Conceptually:
select a1.Team,
a1.ActivityNumber,
a1.TimeStamp,
a2.Timestamp as PrevTime,
datediff('n', a2.TimeStamp, a1.TimeStamp) as WorkMinutes
from MyTable a1
left join MyTable a2
on ((a1.Team = a2.Team)
and (a1.PreviousActivityNumber = a2.ActivityNumber))
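The same self-join, followed by the per-team, per-month average the question asks for, can be sketched in Python. This is an illustration under assumptions: the rows are already loaded as tuples, the timestamps are datetime objects, the function name is hypothetical, and the sample rows other than 1234-4 are invented.

```python
from collections import defaultdict
from datetime import datetime


def avg_minutes_by_team_month(rows):
    """rows: list of (activity_number, timestamp, previous_activity_number, team)."""
    # Equivalent of the self-join: look up each previous activity's timestamp.
    ts_by_activity = {activity: ts for activity, ts, _, _ in rows}
    gaps = defaultdict(list)
    for activity, ts, previous, team in rows:
        prev_ts = ts_by_activity.get(previous)
        if prev_ts is None:
            continue  # first activity of a ticket has no predecessor
        minutes = (ts - prev_ts).total_seconds() / 60
        gaps[(team, ts.year, ts.month)].append(minutes)
    return {key: sum(v) / len(v) for key, v in gaps.items()}


rows = [
    ("1234-3", datetime(2017, 1, 1, 13, 42), None, "Team A"),      # invented row
    ("1234-4", datetime(2017, 1, 1, 14, 12), "1234-3", "Team A"),
    ("1234-5", datetime(2017, 1, 1, 14, 32), "1234-4", "Team A"),  # invented row
]
averages = avg_minutes_by_team_month(rows)
```

With 400,000 rows the dictionary lookup plays the role of an index on ActivityNumber, which is also what the SQL join would want.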
I have a database model that stores
visit time
last seen time
how many seconds online (derived value, calculated by subtracting visit time from last seen time)
I need to build a graph of online people for a time range (say 8pm to 9pm). I'm thinking of the x-axis as the time with the y-axis as the number of people. The granularity is 1 minute for the graph, but I have data granular to 5 seconds.
I can't just sum the seconds online value because people visit before or after 8pm.
I was thinking of just loading up all records found in a particular day and doing calculations in memory (which I would probably do for now, then just cache the derived values for later) but I wanted to know if there's a more efficient way?
I wonder if there's a special sql query group by thing I can do to make this work.
Edit: Here's a graphical representation I am stealing from another question (Count Grouped Gaps In Time For Time Range) :P
|.=========]
|.=============]
|=========.======]
|===.=================.====]
|.=================.==========]
T 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6
The bars represent the data I've stored (visit time, last seen time, and seconds online), and I need to know how many people are online at a particular point. In this example, for T=0 the number is 3 and for T=9 the number is 4.
Q: I can't understand what you mean with "but I have data granular to 5 seconds", how many records do you store per visit? Can you add some example data?
A: There's only one record per visit. Granular to 5 seconds means I'm storing up to 5 seconds worth of accurate data.
Sample data as requested:
id visit_time last_seen_time seconds_online
1 00:00:00 00:00:12 10
2 00:12:41 00:12:47 5
3 00:01:20 00:01:22 0
4 00:01:22 00:01:27 5
In this particular case, if I graph the people online at 00:00:00 there would be one person, until 00:00:15 where there would be 0 people:
4|
3|
2| *
1|* *
-*****-******************-
This is a very interesting and hard question. If we suppose that the graph covers one hour (for example from 8:00 to 8:59) with a granularity of one minute, we can simplify the problem by extracting those date parts (in PostgreSQL the function to use is EXTRACT). I also suggest using Common Table Expressions (CTEs).
We can then build a CTE to have first minute and last minute of each visit in the target hour, like:
SELECT CASE WHEN EXTRACT(hour FROM visit_time) = 8
THEN EXTRACT(minute FROM visit_time)
ELSE 0 END AS first_minute,
CASE WHEN EXTRACT(hour FROM last_seen_time) = 8
THEN EXTRACT(minute FROM last_seen_time)
ELSE 59 END AS last_minute
FROM visit_table
WHERE EXTRACT(hour FROM visit_time) <= 8 AND EXTRACT(hour FROM last_seen_time) >= 8
The number of visitors changes when a visit begins or ends, so we can build a second CTE from the first that lists every minute at which the number of visitors changes. If we name the first CTE target, the second can be defined as:
SELECT first_minute AS minute
FROM target
UNION
SELECT last_minute AS minute
FROM target
The UNION will also eliminate duplicates.
Finally we can join the two tables and count the visitors:
WITH target AS (
SELECT CASE WHEN EXTRACT(hour FROM visit_time) = 8
THEN EXTRACT(minute FROM visit_time)
ELSE 0 END AS first_minute,
CASE WHEN EXTRACT(hour FROM last_seen_time) = 8
THEN EXTRACT(minute FROM last_seen_time)
ELSE 59 END AS last_minute
FROM visit_table
WHERE EXTRACT(hour FROM visit_time) <= 8
AND EXTRACT(hour FROM last_seen_time) >= 8
), time_table AS (
SELECT first_minute AS minute
FROM target
UNION
SELECT last_minute AS minute
FROM target
)
SELECT time_table.minute, COUNT(*) AS Users
FROM target INNER JOIN
time_table ON time_table.minute BETWEEN target.first_minute
AND target.last_minute
GROUP BY time_table.minute
ORDER BY time_table.minute
You should obtain a table whose first record contains the first minute within the target hour with at least one online visitor, together with the number of people online. After that there is one record for each change in the number of online people, giving the minute of the change and the new count. You can easily build your graph from this.
Sorry that I can't test this solution, but I hope it helps anyway.
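For comparison, the in-memory approach mentioned in the question can be sketched in a few lines of Python. This is only a sketch under assumptions: visits are (visit_time, last_seen_time) pairs of datetime objects, a visit counts for every minute it overlaps, the function name is hypothetical, and the calendar date attached to the sample times is arbitrary.

```python
from datetime import datetime, timedelta


def online_per_minute(visits, range_start, range_end):
    """Count the visits overlapping each one-minute slot in [range_start, range_end)."""
    step = timedelta(minutes=1)
    counts = []
    minute = range_start
    while minute < range_end:
        # A visit overlaps the slot if it starts before the slot ends
        # and was last seen at or after the slot starts.
        n = sum(1 for start, end in visits if start < minute + step and end >= minute)
        counts.append((minute, n))
        minute += step
    return counts


def at(h, m, s):
    # Helper for the sample data; the date itself is arbitrary.
    return datetime(2020, 1, 1, h, m, s)


# Sample data from the question (id 1 to 4).
visits = [
    (at(0, 0, 0), at(0, 0, 12)),
    (at(0, 12, 41), at(0, 12, 47)),
    (at(0, 1, 20), at(0, 1, 22)),
    (at(0, 1, 22), at(0, 1, 27)),
]
counts = online_per_minute(visits, at(0, 0, 0), at(0, 15, 0))
```

This loops over every visit for every minute, which is fine for a single hour or day; for larger ranges, caching the per-minute results as the question suggests is probably the better choice.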
I'm working with PostgreSQL and trying to build a reporting query for my logs, so far without success.
Basically I have a LOG table which logs status changes of another entity. For the sake of simplicity, let's say it has the columns STATUS and STATUS_CHANGE_DATE. Each status change adds a row to this logging table with the new status and the time it was changed. What I need, for each status, is the total duration spent in it and the number of times it was entered (the same status can be used multiple times, e.g. going from status 1 to 2 and then back to 1). I would like to build a view for this and use it in my Java application's reporting by mapping that view directly to a Hibernate entity. Unfortunately I'm not that experienced with SQL, so maybe someone can give me some hints on what the best solution would be; I have tried a few things but basically don't know how to do it.
Lets say we have:
STATUS STATUS_CHANGE_DATE
1 2013 01 01
2 2013 01 03
1 2013 01 06
3 2013 01 07
The result I want is a table that contains status 1 with 2 occurrences and a 3-day duration, and status 2 with 1 occurrence and a 3-day duration too (assuming status 3 is the end (or closed) state, so its duration is not required).
Any ideas?
If your statuses change in every row, you can do this:
with cte as (
select
status,
lead(status_change_date) over(order by status_change_date) as next_date,
status_change_date
from Table1
)
select
status, count(*) as cnt,
sum(next_date - status_change_date) as duration
from cte
where next_date is not null
group by status
sql fiddle demo
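To check the expected numbers from the question, the pairing that lead() performs can be reproduced in a short Python sketch. It assumes the log is available as (status, change_date) tuples; the function name is made up for illustration.

```python
from collections import defaultdict
from datetime import date


def status_durations(log):
    """log: list of (status, change_date). Returns {status: (times, total_days)}."""
    ordered = sorted(log, key=lambda r: r[1])
    result = defaultdict(lambda: [0, 0])
    # Pair each row with the next one, like lead() over (order by change date).
    # The last row has no successor, so it is skipped (next_date is null in SQL).
    for (status, start), (_, next_date) in zip(ordered, ordered[1:]):
        result[status][0] += 1
        result[status][1] += (next_date - start).days
    return {s: tuple(v) for s, v in result.items()}


log = [
    (1, date(2013, 1, 1)),
    (2, date(2013, 1, 3)),
    (1, date(2013, 1, 6)),
    (3, date(2013, 1, 7)),
]
summary = status_durations(log)
```

For the sample data this gives status 1 twice with 3 days in total and status 2 once with 3 days, matching the result described in the question.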
Try this:
SELECT "STATUS", "STATUS_CHANGE_DATE" - lag("STATUS_CHANGE_DATE") OVER (ORDER BY "STATUS_CHANGE_DATE") AS "DURATION" FROM table ORDER BY "STATUS";
This works for me in a similar case, where I needed to calculate the average time between sessions in a log table. I hope it works for you.
The root problem: I have an application which has been running for several months now. Users have been reporting that it's been slowing down over time (so in May it was quicker than it is now). I need to get some evidence to support or refute this claim. I'm not interested in precise numbers (so I don't need to know that a login took 10 seconds), I'm interested in trends - that something which used to take x seconds now takes of the order of y seconds.
The data I have is an audit table which stores a single row each time the user carries out any activity - it includes a primary key, the user id, a date time stamp and an activity code:
create table AuditData (
AuditRecordID int identity(1,1) not null,
DateTimeStamp datetime not null,
DateOnly datetime null,
UserID nvarchar(10) not null,
ActivityCode int not null)
(Notes: DateOnly (datetime) is the DateTimeStamp with the time stripped off to make grouping for daily analysis easier; it's effectively duplicate data to make querying faster.
Also, for the sake of ease you can assume that the ID is assigned in date time order, that is, 1 will always be before 2, which will always be before 3. If this isn't true I can make it so.)
ActivityCode is an integer identifying the activity which took place, for instance 1 might be user logged in, 2 might be user data returned, 3 might be search results returned and so on.
Sample data for those who like that sort of thing...:
1, 01/01/2009 12:39, 01/01/2009, P123, 1
2, 01/01/2009 12:40, 01/01/2009, P123, 2
3, 01/01/2009 12:47, 01/01/2009, P123, 3
4, 01/01/2009 13:01, 01/01/2009, P123, 3
User data is returned (ActivityCode 2) immediately after login (ActivityCode 1), so this can be used as a rough benchmark of how long the login takes (as I said, I'm interested in trends, so as long as I'm measuring the same thing for May as for July it doesn't matter so much if this isn't the whole login process; it takes in enough of it to give a rough idea).
(Note: User data can also be returned under other circumstances so it's not a one to one mapping).
So what I'm looking to do is select the average time between login (say ActivityCode 1) and the first instance after that, for that user on that day, of user data being returned (say ActivityCode 2).
I can do this by going through the table with a cursor, getting each login instance and then for that doing a select to say get the minimum user data return following it for that user on that day but that's obviously not optimal and is slow as hell.
My question is (finally): is there a "proper" SQL way of doing this using self joins or similar, without using cursors or some similar procedural approach? I can create views and whatever to my heart's content; it doesn't have to be a single select.
I can hack something together but I'd like to make the analysis I'm doing a standard product function so would like it to be right.
SELECT TheDay, AVG(TimeTaken) AvgTimeTaken
FROM (
SELECT
CONVERT(DATE, logins.DateTimeStamp) TheDay
, DATEDIFF(SS, logins.DateTimeStamp,
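-- MIN() picks the first user-data row after the login; TOP 1 without
-- ORDER BY could return any matching row
(SELECT MIN(userinfo.DateTimeStamp)
FROM AuditData userinfo
WHERE userinfo.UserID = logins.UserID
and userinfo.ActivityCode = 2
and userinfo.DateTimeStamp > logins.DateTimeStamp
-- same day only, as the question asks
and userinfo.DateOnly = logins.DateOnly)
) TimeTaken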
FROM AuditData logins
WHERE
logins.ActivityCode = 1
) LogInTimes
GROUP BY TheDay
This might be dead slow in real world though.
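If the correlated subquery turns out to be too slow, one alternative is a single ordered pass over the audit rows. The Python sketch below is an illustration, not the answer's method: it assumes the rows are fetched as (DateTimeStamp, UserID, ActivityCode) tuples, and the function name and constants are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

LOGIN, USER_DATA = 1, 2


def avg_login_seconds_per_day(audit_rows):
    """audit_rows: list of (timestamp, user_id, activity_code)."""
    pending = {}                 # (user_id, day) -> timestamp of the unmatched login
    samples = defaultdict(list)  # day -> elapsed seconds for each matched login
    for ts, user, code in sorted(audit_rows):
        key = (user, ts.date())
        if code == LOGIN:
            pending[key] = ts    # a repeated login replaces the earlier one
        elif code == USER_DATA and key in pending:
            # First user-data row after the login, same user, same day.
            samples[key[1]].append((ts - pending.pop(key)).total_seconds())
    return {day: sum(v) / len(v) for day, v in samples.items()}


# Sample data from the question.
audit_rows = [
    (datetime(2009, 1, 1, 12, 39), "P123", 1),
    (datetime(2009, 1, 1, 12, 40), "P123", 2),
    (datetime(2009, 1, 1, 12, 47), "P123", 3),
    (datetime(2009, 1, 1, 13, 1), "P123", 3),
]
trend = avg_login_seconds_per_day(audit_rows)
```

User-data rows that are not preceded by a login on the same day are ignored, which handles the note that user data can also be returned under other circumstances.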
In Oracle this would be a cinch, because of analytic functions. In this case, LAG() makes it easy to find the matching pairs of activity codes 1 and 2 and also to calculate the trend. As you can see, things got worse on 2nd JAN and improved quite a bit on the 3rd (I'm working in seconds rather than minutes).
SQL> select DateOnly
, elapsed_time
, elapsed_time - lag (elapsed_time) over (order by DateOnly) as trend
from
(
select DateOnly
, avg(databack_time - prior_login_time) as elapsed_time
from
( select DateOnly
, databack_time
, ActivityCode
, lag(login_time) over (order by DateOnly,UserID, AuditRecordID, ActivityCode) as prior_login_time
from
(
select a1.AuditRecordID
, a1.DateOnly
, a1.UserID
, a1.ActivityCode
, to_number(to_char(a1.DateTimeStamp, 'SSSSS')) as login_time
, 0 as databack_time
from AuditData a1
where a1.ActivityCode = 1
union all
select a2.AuditRecordID
, a2.DateOnly
, a2.UserID
, a2.ActivityCode
, 0 as login_time
, to_number(to_char(a2.DateTimeStamp, 'SSSSS')) as databack_time
from AuditData a2
where a2.ActivityCode = 2
)
)
where ActivityCode = 2
group by DateOnly
)
/
DATEONLY ELAPSED_TIME TREND
--------- ------------ ----------
01-JAN-09 120
02-JAN-09 600 480
03-JAN-09 150 -450
SQL>
Like I said in my comment I guess you're working in MSSQL. I don't know whether that product has any equivalent of LAG().
If the assumptions are that:
Users will perform various tasks in no mandated order, and
That the difference between any two activities reflects the time it takes for the first of those two activities to execute,
Then why not create a table with two timestamps, the first column containing the activity start time, the second column containing the next activity start time. Thus the difference between these two will always be total time of the first activity. So for the logout activity, you would just have NULL for the second column.
So it would be kind of weird and interesting, for each activity (other than logging in and logging out), the time stamp would be recorded in two different rows--once for the last activity (as the time "completed") and again in a new row (as time started). You would end up with a jacob's ladder of sorts, but finding the data you are after would be much more simple.
In fact, to get really wacky, you could have each row have the time that the user started activity A and the activity code, and the time started activity B and the time stamp (which, as mentioned above, gets put down again for the following row). This way each row will tell you the exact difference in time for any two activities.
Otherwise, you're stuck with a query that says something like
SELECT TIME_IN_SEC(row2-timestamp) - TIME_IN_SEC(row1-timestamp)
which would be pretty slow, as you have already suggested. By swallowing the redundancy, you end up just querying the difference between the two columns. You probably would have less need of knowing the user info as well, since you'd know that any row shows both activity codes, thus you can just query the average for all users on any given day and compare it to the next day (unless you are trying to find out which users are having the problem as well).
This is a faster query for finding this out: each row ends up holding both its own datetime value and that of the row before it, after which you can use DATEDIFF ( datepart , startdate , enddate ). I use @DammyVariable and DammyField because, as I remember, there is some problem if the first assignment in the UPDATE statement is not of the form @variable = Field.
SELECT *, Cast(NULL AS DateTime) LastRowDateTime, Cast(NULL As INT) DammyField INTO #T FROM AuditData
GO
CREATE CLUSTERED INDEX IX_T ON #T (AuditRecordID)
GO
DECLARE @LastRowDateTime DateTime
DECLARE @DammyVariable INT
SET @LastRowDateTime = NULL
SET @DammyVariable = 1
UPDATE #T SET
@DammyVariable = DammyField = @DammyVariable
, LastRowDateTime = @LastRowDateTime
, @LastRowDateTime = DateTimeStamp
option (maxdop 1)