Grouping data in a sorted list - splunk

In Splunk, I want to group data in order after it is sorted.
Sample events:
Time: 1:00 server: A .....
Time: 1:01 server: A ......
Time: 1:02 server: B ......
Time: 1:03 server: A ......
Time: 1:04 server: A ......
Time: 1:05 server: C ......
Time: 1:06 server: A ......
I want to see
Server: A Events: 2
Server: B Events: 1
Server: A Events: 2
Server: C Events: 1
Server: A Events: 1
Extra points if I can get start and end times for each set too.

Try something like this:
index=ndx sourcetype=srctp server=*
| stats min(_time) as First max(_time) as Last count as Events by server
| rename server as Server
| eval First=strftime(First,"%c"), Last=strftime(Last,"%c")
| table First Last Events Server
Should give you a table like the following:
First  | Last   | Events | Server
-------|--------|--------|-------
<time> | <time> | 7      | serverA
etc.
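Note that `stats ... by server` merges all of a server's events into a single row, so the asker's expected output (a separate row for each consecutive run of the same server) would need a run detector such as `streamstats` before the `stats`. The run-length logic itself can be sketched in Python with `itertools.groupby`, which only merges *adjacent* equal keys:

```python
from itertools import groupby

# Sample events as (time, server) pairs, already sorted by time.
events = [
    ("1:00", "A"), ("1:01", "A"), ("1:02", "B"),
    ("1:03", "A"), ("1:04", "A"), ("1:05", "C"),
    ("1:06", "A"),
]

# groupby merges only adjacent equal keys, so each consecutive
# run of the same server becomes its own group.
runs = []
for server, grp in groupby(events, key=lambda e: e[1]):
    grp = list(grp)
    runs.append({
        "server": server,
        "events": len(grp),
        "start": grp[0][0],   # first event time in the run
        "end": grp[-1][0],    # last event time in the run
    })

for r in runs:
    print(f"Server: {r['server']} Events: {r['events']} "
          f"({r['start']} - {r['end']})")
```

This reproduces the asker's A:2, B:1, A:2, C:1, A:1 grouping, including the start/end times asked for as "extra points".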

Related

Review scripts which are older than 3 months and in the last 30 days

I'm trying to run a query that will let me see where we have scripts running that are more than 3 months old, based on the last 30 days of delivery data, so we know they need to be updated.
I have been able to build the query to show me all the scripts and their last regen dates (with specific dates put in), but I can't work out:
How to look at only the last 30 days of data.
How to see only the scripts where the date_regen column is older than 3 months from today's date, within the last 30 days of data I'm reviewing.
EXAMPLE TABLE
visit_datetime | client | script | date_regen |
2019/10/04 03:32:51 | 1 | script1 | 2019-09-17 13:12:01 |
2019/09/27 03:32:52 | 2 | script2 | 2019-07-18 09:44:02 |
2019/10/06 03:32:50 | 3 | script3 | 2019-03-18 14:08:02 |
2019/10/02 06:28:24 | 4 | script6 | 2019-09-11 10:02:01 |
2019/03/01 06:28:24 | 5 | script7 | 2019-02-11 10:02:01 |
The examples below haven't gotten me what I need. My idea was to get the current date (using now()) and then look at all data in the last 30 days.
After that I would filter with something like MONTH, -3 (so date_regen is 3+ months older than the current date).
However, I can't get it to work. I also tried subtracting days, but that had no success either.
-- WHERE MONTH = MONTH(now()) AND YEAR = YEAR(now())
-- WHERE date_regen <= DATEADD(MONTH,-3,GETDATE())
-- WHERE DATEDIFF(MONTH, date_regen, GetDate()) >= 3
Code I am currently using to get the table
SELECT split_part(js,'#',1) AS script,
date_regen,
client
FROM table
WHERE YEAR=2019 AND MONTH=10 AND DAY=01 -- placeholder: I'd need now() here, but I don't know what replaces "YEAR/MONTH/DAY ="
GROUP BY script,date_regen,client
ORDER BY client DESC;
END GOAL
I should only see client 3, as clients 1, 2, and 4 have a date_regen within the last 3 months, and client 5 has a visit_datetime outside the 30-day limit.
visit_datetime | client | script | date_regen |
2019/10/06 03:32:50 | 3 | script3 | 2019-03-18 14:08:02 |
I think you want simple filtering:
select t.*
from t
where visit_datetime >= current_timestamp - interval 30 day and
date_regen < add_months(current_timestamp, -3)
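To sanity-check the two conditions against the sample table, here is a minimal Python sketch. `months_ago` is a hypothetical stand-in for SQL's add_months, and "today" is pinned so the result is reproducible:

```python
from datetime import datetime, timedelta

def months_ago(dt, months):
    """Shift dt back by whole calendar months (day clamped to 28
    to avoid invalid dates like Feb 30); rough add_months stand-in."""
    month = dt.month - months - 1
    year = dt.year + month // 12
    month = month % 12 + 1
    return dt.replace(year=year, month=month, day=min(dt.day, 28))

rows = [
    # (visit_datetime, client, script, date_regen) from the question
    (datetime(2019, 10, 4, 3, 32, 51), 1, "script1", datetime(2019, 9, 17, 13, 12, 1)),
    (datetime(2019, 9, 27, 3, 32, 52), 2, "script2", datetime(2019, 7, 18, 9, 44, 2)),
    (datetime(2019, 10, 6, 3, 32, 50), 3, "script3", datetime(2019, 3, 18, 14, 8, 2)),
    (datetime(2019, 10, 2, 6, 28, 24), 4, "script6", datetime(2019, 9, 11, 10, 2, 1)),
    (datetime(2019, 3, 1, 6, 28, 24), 5, "script7", datetime(2019, 2, 11, 10, 2, 1)),
]

now = datetime(2019, 10, 7)  # pretend "today" for reproducibility
stale = [
    r for r in rows
    if r[0] >= now - timedelta(days=30)   # visited in the last 30 days
    and r[3] < months_ago(now, 3)         # regenerated over 3 months ago
]
print(stale)  # only client 3 survives both filters
```

Clients 1, 2, and 4 fail the 3-month test and client 5 fails the 30-day test, matching the question's end goal.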

How to get subtotals with time datatype in SQL?

I'm stuck generating a SQL query. I have a table in a Firebird DB like the following:
ID | PROCESS | STEP | TIME
654 | 1 | 1 | 09:08:40
655 | 1 | 2 | 09:09:32
656 | 1 | 3 | 09:10:04
...
670 | 2 | 15 | 09:30:05
671 | 2 | 16 | 09:31:00
and so on.
I need the subtotals for each process group (there are about 7 of them). The TIME column has the "time" data type. I have been trying DATEDIFF, but it doesn't work.
You need to use SUM
This question has been answered here.
How to sum up time field in SQL Server
and here.
SUM total time in SQL Server
For Firebird-specific documentation, read up on the SUM function here:
Sum() - Firebird Official Documentation
I think you should use GROUP BY to get the max and min times per process, and use them in the DATEDIFF function (note the argument order: DATEDIFF(second, start, end) returns end minus start). Something like this:
select process, datediff(second, min(time), max(time)) as nb_seconds
from your_table
group by process;
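The per-process elapsed time that the GROUP BY query computes (max time minus min time) can be verified with a small Python sketch over the sample rows:

```python
rows = [
    # (id, process, step, time) - rows from the question, "HH:MM:SS"
    (654, 1, 1, "09:08:40"),
    (655, 1, 2, "09:09:32"),
    (656, 1, 3, "09:10:04"),
    (670, 2, 15, "09:30:05"),
    (671, 2, 16, "09:31:00"),
]

def secs(t):
    """Convert "HH:MM:SS" to seconds since midnight."""
    h, m, s = map(int, t.split(":"))
    return h * 3600 + m * 60 + s

# Track (min, max) seconds per process, then take the difference:
# the same quantity DATEDIFF(SECOND, MIN(time), MAX(time)) returns.
totals = {}
for _id, process, _step, t in rows:
    lo, hi = totals.get(process, (secs(t), secs(t)))
    totals[process] = (min(lo, secs(t)), max(hi, secs(t)))

nb_seconds = {p: hi - lo for p, (lo, hi) in totals.items()}
print(nb_seconds)  # {1: 84, 2: 55}
```

Process 1 spans 09:08:40 to 09:10:04 (84 s) and process 2 spans 09:30:05 to 09:31:00 (55 s).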

Select timestamp difference by call status from a call attempts table

I have a table collecting verification call cdrs like this in REDSHIFT:
name | callid | timestamp | Anumber | BNumber | Duration | Status
customerA |1631c40d-e397 |2017/01/01 03:01:00| +390123456789 | +440123456700 | 0 | aborted
customerA |8a12ca23-8728 |2017/01/01 03:02:00| +390123456789 | +440123456701 | 0 | aborted
customerB |54739440-c297 |2017/01/01 03:03:00| +440123456755 | +440123456780 | 0 | aborted
customerA |87e01f74-ce9e |2017/01/01 03:03:01| +390123456789 | +440123456700 | 1 | success
customerB |54739440-c297 |2017/01/01 03:03:02| +440123456755 | +440123456123 | 0 | aborted
customerB |1d192eb7-01b0 |2017/01/01 03:03:03| +440123456755 | +440123456123 | 1 | success
Each attempt is a call with a unique callid.
Re-attempt means: I call you, the call fails, I call you again until a call is successful. So if anumber calls bnumber 3 times and only the third succeeds, I have 3 attempts, of which 2 re-attempts failed.
I need to get:
Number of aborted calls globally which have been re-attempted in less than 5 sec
Number of aborted calls globally which have been re-attempted in less than 20 sec
e.g.
customerA made 1000 calls globally in April 01-30, 2017. 600 successful, 300 failed, 100 aborted.
Among these (300+100) calls, how many of them were re-attempted in less than 5 sec (too quick), and how many in less than 20 sec?
Number/percentage of re-attempts when a number fails
e.g.
Say the (300+100) calls were made to 200 individual numbers.
I would like to know the percentage of those who actually retried on the same number.
If 100 users just tried once and failed, the remaining 100 users (so 50% here) tried on average 3 times when their number failed to be verified.
The big issue for me is that customerA's call1, call2, and call3 are not linked in any way. So if customerA calls numberB from numberA and it fails (aborted or failed), they can try again and again until one call is 'successful'.
I know it's not easy to explain; I hope it's a bit clearer now.
Results should be like this
customer | aborted_calls | percentage_aborted_5s | percentage_aborted_20s
customerA | 300 | 20 | 40
I was trying to calculate it by summing the attempts like this, but it's not counting all the attempts for the same 'call' (anumber calling bnumber many times without success):
select a.anumber, a.bnumber, sum(stat)
from (
    select anumber, bnumber, timestamp,
           case when status = 3 then 0 else 1 end as stat -- 3 is aborted or failed
    from mytable
) a
inner join (
    select * from mytable
) v
    on a.anumber = v.anumber and a.bnumber = v.bnumber
where datediff(s, v.timestamp, a.timestamp) <= 5
group by a.anumber, a.bnumber
If there is a success, then the counting should start again from 0.
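One way to restart the count at each success is to sort each (anumber, bnumber) pair's calls by timestamp and measure the gap from each failed/aborted call to the next attempt on the same pair; in Redshift this maps naturally onto LEAD(timestamp) OVER (PARTITION BY anumber, bnumber ORDER BY timestamp). The logic can be sketched in Python with the rows from the question:

```python
from datetime import datetime
from collections import defaultdict

calls = [
    # (timestamp, anumber, bnumber, status), simplified from the CDR table
    (datetime(2017, 1, 1, 3, 1, 0), "+390123456789", "+440123456700", "aborted"),
    (datetime(2017, 1, 1, 3, 2, 0), "+390123456789", "+440123456701", "aborted"),
    (datetime(2017, 1, 1, 3, 3, 0), "+440123456755", "+440123456780", "aborted"),
    (datetime(2017, 1, 1, 3, 3, 1), "+390123456789", "+440123456700", "success"),
    (datetime(2017, 1, 1, 3, 3, 2), "+440123456755", "+440123456123", "aborted"),
    (datetime(2017, 1, 1, 3, 3, 3), "+440123456755", "+440123456123", "success"),
]

# Group attempts per (anumber, bnumber) pair.
pairs = defaultdict(list)
for ts, a, b, status in calls:
    pairs[(a, b)].append((ts, status))

reattempt_5s = reattempt_20s = 0
for attempts in pairs.values():
    attempts.sort()
    for (ts, status), (next_ts, _) in zip(attempts, attempts[1:]):
        if status == "success":
            continue  # a success ends the retry chain; counting restarts
        gap = (next_ts - ts).total_seconds()
        if gap < 5:
            reattempt_5s += 1
        if gap < 20:
            reattempt_20s += 1

print(reattempt_5s, reattempt_20s)  # only the 1-second re-attempt qualifies
```

With the sample data, the 121-second gap on customerA's (789, 700) pair exceeds both thresholds, while customerB's 1-second re-attempt on (755, 123) counts under both.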

Comparing multiple rows of data within a date range to create a new table in MS Access

I'm a novice to SQL & MS Access; however, I have a table of data in MS Access that looks like:
ID | Start_Time | End_Time
1 | 1:00:00 PM | 1:00:30 PM
2 | 2:15:10 PM | 2:15:50 PM
3 | 2:15:30 PM | 2:18:40 PM
4 | 2:17:00 PM | 2:17:30 PM
5 | 2:45:10 PM | 3:03:10 PM
Each row is sequentially recorded into the database. I want to compare the start and end times of each row and combine rows that overlap. For instance, ID 1's Start_Time and End_Time do not overlap any other times in the table, so it would be posted into the new table as-is. However, IDs 2 through 4 overlap, so they would be combined into one row using ID 2's Start_Time as the group's Start_Time and ID 3's End_Time as the group's End_Time.
The end result would be a new table that should look like:
ID | Start_Time | End_Time | Duration_seconds
1 | 1:00:00 PM | 1:00:30 PM | 30
2 | 2:15:10 PM | 2:18:40 PM | 210
3 | 2:45:10 PM | 3:03:10 PM | 1080
How can I do this in SQL/MS-Access?
Thank you!!
This might need a few passes through the recordset.
Define two new variables, NewStart and NewEnd. As you grab each ID, assign its existing start/end times.
Using a nested loop, compare each record to every other record (new times). If the start time falls within another record's time range, replace that ID's NewStart with the other ID's start time. NewEnd gets the greater of the current ID's or the compared ID's end time.
As you cycle through, items 3 and 4 will end up with the same "new" times as ID 2. All will have transformed "new" times.
After this, query distinct times for each ID.
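The nested-loop idea above can actually be done in a single pass once the rows are sorted by Start_Time: keep extending the current group while the next interval starts before the group ends. A sketch in Python, using the question's sample times:

```python
from datetime import datetime

fmt = "%I:%M:%S %p"
rows = [
    # (Start_Time, End_Time) pairs from the question
    ("1:00:00 PM", "1:00:30 PM"),
    ("2:15:10 PM", "2:15:50 PM"),
    ("2:15:30 PM", "2:18:40 PM"),
    ("2:17:00 PM", "2:17:30 PM"),
    ("2:45:10 PM", "3:03:10 PM"),
]
intervals = sorted(
    (datetime.strptime(s, fmt), datetime.strptime(e, fmt)) for s, e in rows
)

merged = []
for start, end in intervals:
    if merged and start <= merged[-1][1]:        # overlaps the open group
        merged[-1][1] = max(merged[-1][1], end)  # extend its end time
    else:
        merged.append([start, end])              # start a new group

for i, (start, end) in enumerate(merged, 1):
    secs = int((end - start).total_seconds())
    print(i, start.strftime(fmt), end.strftime(fmt), secs)
```

This yields the three groups in the question's expected table, with durations of 30, 210, and 1080 seconds. The same merge could be done in Access with a VBA loop over a recordset sorted by Start_Time.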

Lost trying to use DISTINCT and GROUP BY

I'm having trouble with something that I thought would have been simple...
I have a simple model, Statistic, that stores a date (created_at), a user_fingerprint, and a structure_id. From that, I'd like to create a graph showing the number of visitors per day.
So I did
@structure.statistics.order('DATE(created_at) ASC').group('DATE(created_at)').count
Which works and return what I expect:
=> {Sat, 18 May 2014=>50, Mon, 19 May 2014=>90}
Now I'd like the same, but I want to collapse all rows with the same (created_at date, user_fingerprint) pair. For instance:
| created_at | user_fingerprint | structure_id |
|----------------------|------------------|--------------|
| Sat, 18 May 2014 2PM | '124512341' | 12 |
| Sat, 18 May 2014 4PM | '124512341' | 12 |
| Mon, 19 May 2014 6PM | '124512341' | 12 |
With this data, I would have:
=> {Sat, 18 May 2014=>1, Mon, 19 May 2014=>1}
# instead of
=> {Sat, 18 May 2014=>2, Mon, 19 May 2014=>1}
I could do it in Ruby, but I wondered if I could do it directly with SQL & Arel.
Solution regarding your answers
Here is what I did at the end:
@impressions = {}
# The following is to ensure I will have a key when there is no stat for a day.
(15.days.ago.to_date..Date.today).each { |date| @impressions[date] = 0 }
@structure.statistics.where( Statistic.arel_table[:created_at].gt(Date.today - 15.days) )
.order('DATE(created_at) ASC')
.group('DATE(created_at)')
.select('DATE(created_at) as created_at, COUNT(DISTINCT(user_fingerprint)) as user_count')
.each{ |stat| @impressions[stat.created_at] = stat.user_count }
I need to do a bit of Ruby though but that's good for me.
Your query would look something like this (Oracle dialect) -- note that user_fingerprint must not appear in the GROUP BY, or every group would count exactly one fingerprint:
select trunc(created_at), count(distinct user_fingerprint)
from statistic
group by trunc(created_at)
There is no SQL standard for extracting the date portion of a datetime field:
Oracle: trunc(dt_column)
SQL Server: cast(dt_column as date)
MySQL: DATE(dt_column)
@structure.statistics.order('DATE(created_at) ASC').group('DATE(created_at)').select('count(distinct(user_fingerprint)) as user_count').first.user_count
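The COUNT(DISTINCT user_fingerprint) ... GROUP BY DATE(created_at) approach boils down to keeping one set of fingerprints per day; a minimal Python sketch with the question's sample rows:

```python
from collections import defaultdict
from datetime import datetime

stats = [
    # (created_at, user_fingerprint) from the example table
    (datetime(2014, 5, 18, 14, 0), "124512341"),
    (datetime(2014, 5, 18, 16, 0), "124512341"),
    (datetime(2014, 5, 19, 18, 0), "124512341"),
]

# One set per day: duplicate fingerprints collapse automatically,
# mirroring COUNT(DISTINCT user_fingerprint) GROUP BY DATE(created_at).
visitors = defaultdict(set)
for created_at, fingerprint in stats:
    visitors[created_at.date()].add(fingerprint)

counts = {day: len(fps) for day, fps in sorted(visitors.items())}
print(counts)  # both days count the repeat visitor once
```

This gives 1 visitor on each day, matching the asker's expected {Sat, 18 May 2014=>1, Mon, 19 May 2014=>1}.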