splunk join 2 search queries - splunk

I am writing a Splunk query to find the top exceptions that are impacting clients. I have two queries, one over the client logs and one over the server logs, joined on a common field. These are production logs, so I have changed the names. I am trying to find the top 5 failures that are impacting clients. Below is my query.
index=pirs sourcetype=client-* env=* (type=Error error_level=fatal) error_level=fatal serviceName=FailedServiceEndpoint | table _time,serviceName,xab,endpoint,statusCode | join left=L right=R where L.xab = R.xab [search index=zirs sourcetype=server-* | rex mode=sed field=span_name "s#\..*$##" | search span_success = false spanName=FailedServiceEndpoint | table _time,spanName,xab] | chart count over L.serviceName
I explicitly mentioned a service name here. In the final query there won't be a service name, because we need the top 5 failures that are impacting clients.
This query gives me the service name and count. I also need other columns such as the endpoint name and httpStatusCode, and I am not sure how to do that. Also, does the query need any refactoring?

That's an odd use of join. I don't see that particular syntax documented, but apparently it works for you.
To get more fields, use stats instead of chart.
| stats count, values(endpointName) as endpointName, values(httpStatusCode) as httpStatusCode by serviceName
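For anyone unsure what that stats call produces, here is a rough sketch of the same aggregation in Python rather than SPL. The events and the helper names (`stats_by_service`, `top_n`) are made up for illustration; only the field names come from the command above.

```python
from collections import defaultdict

# Hypothetical joined events; field names follow the stats command above.
events = [
    {"serviceName": "svcA", "endpointName": "/a", "httpStatusCode": 500},
    {"serviceName": "svcA", "endpointName": "/a", "httpStatusCode": 503},
    {"serviceName": "svcB", "endpointName": "/b", "httpStatusCode": 500},
]


def stats_by_service(rows):
    """Count rows and collect the distinct field values per serviceName."""
    groups = defaultdict(
        lambda: {"count": 0, "endpointName": set(), "httpStatusCode": set()}
    )
    for row in rows:
        group = groups[row["serviceName"]]
        group["count"] += 1
        group["endpointName"].add(row["endpointName"])
        group["httpStatusCode"].add(row["httpStatusCode"])
    # Turn the sets into sorted lists so the output is stable.
    return {
        name: {
            "count": g["count"],
            "endpointName": sorted(g["endpointName"]),
            "httpStatusCode": sorted(g["httpStatusCode"]),
        }
        for name, g in groups.items()
    }


def top_n(result, n=5):
    """Return the n services with the highest count, largest first."""
    ranked = sorted(result.items(), key=lambda item: item[1]["count"], reverse=True)
    return ranked[:n]


result = stats_by_service(events)
top = top_n(result)
```

Unlike chart, each output row keeps the extra columns next to the count, which is what the question asks for.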

Related

SQL different null values in different rows

I have a quick question regarding writing a SQL query to obtain a complete entry from two or more entries where the data is missing in different columns.
This is the example, suppose I have this table:
Client Id | Name | Email
1234 | John | (null)
1244 | (null) | john#example.com
Would it be possible to write a query that would return the following?
Client Id | Name | Email
1234 | John | john#example.com
I am finding this particularly hard because these are 2 entries in the same table.
I apologize if this is trivial. I am still learning SQL and wasn't able to come up with a solution. I tried looking online, but I suppose I couldn't phrase the question properly, and I couldn't find the answer I was after.
Many thanks in advance for the help!
Yes, but actually no.
It is possible to write a query that works with your example data.
But only under the assumption that the part of the email before the '#' is always equal to the name.
SELECT clients.id,clients.name,bclients.email FROM clients
JOIN clients bclients ON upper(clients.name) = upper(substring(bclients.email from 0 for position('#' in bclients.email)));
Explanation:
We join the table onto itself, to get the information into one row.
For this we first search for the position of the '#' in the email, then take the substring from the start (0) of the string for that many characters (the result of position). String positions are 1-based, so starting at 0 with that length stops just before the '#'.
To avoid case problems, the name and the substring are converted to uppercase for the comparison.
(lowercase would work the same)
The design is flawed
How can a client have multiple ids and different kind of information about the same user at the same time?
I think you want to split the table between clients and users, so that a user can have multiple clients.
I recommend that you read up on database normalization, as it gives you the knowledge you need for successful database design.
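If you want to try the self-join without a PostgreSQL instance, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the same `clients` table and the two example rows from the question; since SQLite has no `substring ... from ... for` syntax, `substr` and `instr` are swapped in, and the function name `merge_split_rows` is made up.

```python
import sqlite3


def merge_split_rows():
    """Self-join clients on name = text before '#' in email (case-insensitive)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE clients (id INTEGER, name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO clients VALUES (?, ?, ?)",
        [(1234, "John", None), (1244, None, "john#example.com")],
    )
    rows = conn.execute(
        """
        SELECT c.id, c.name, b.email
        FROM clients AS c
        JOIN clients AS b
          ON upper(c.name) = upper(substr(b.email, 1, instr(b.email, '#') - 1))
        """
    ).fetchall()
    conn.close()
    return rows


merged = merge_split_rows()
```

Rows with a NULL name or a NULL email never satisfy the join condition, so only the combined row comes back.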

Splunk index usage search adding column titled NULL to results

I'm running a fairly simple search to identify index usage on my Splunk install by source, as we're running through the Enterprise 30-day trial with the intention of using Splunk Free after it expires:
index=_internal source=*license_usage.log | eval MB=b/1024/1024 | timechart span=1d sum(MB) by s where count in top50
The results for all of my data sources are returned as expected but there's an additional column titled "NULL" at the end of the results:
All of my data has an input source and when I click on the column and choose to view the data, it brings back no results.
Can anyone help me understand what this NULL column is please? If it's correct it suggests I'm using over the 500MB/day limit for Splunk Free, which I need to address before the trial period ends.
The NULL column appears because some events do not have an 's' field. You only want to sum the events that have an s field, so add type=Usage to your query:
index=_internal source=*license_usage.log type=Usage
| eval MB=b/1024/1024
| timechart span=1d sum(MB) by s where count in top50
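To see why a NULL column shows up, here is a small Python sketch of the grouping (not SPL). The events are invented; the assumption, as in the answer above, is that only type=Usage events carry an s field, and the non-Usage event type shown is just a placeholder.

```python
from collections import defaultdict

# Hypothetical license_usage.log events; 'b' is bytes, 's' is the source.
events = [
    {"type": "Usage", "s": "syslog", "b": 1048576},
    {"type": "Usage", "s": "web", "b": 2097152},
    {"type": "RolloverSummary", "b": 3145728},  # placeholder event without 's'
]


def mb_by_source(rows):
    """Sum megabytes per source; rows lacking 's' fall into a NULL bucket."""
    totals = defaultdict(float)
    for row in rows:
        key = row.get("s", "NULL")
        totals[key] += row["b"] / 1024 / 1024
    return dict(totals)


unfiltered = mb_by_source(events)
filtered = mb_by_source([e for e in events if e["type"] == "Usage"])
```

With the filter applied first, the NULL bucket never gets created and the per-source totals are unchanged.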

Stats Count Splunk Query

I wonder whether someone can help me please.
I'd made the following post about Splunk query I'm trying to write:
https://answers.splunk.com/answers/724223/in-a-table-powered-by-a-stats-count-search-can-you.html
I received some great help, but despite working on this for a few days, concentrating on eval if statements, I still have the same issue: the "Successful" and "Unsuccessful" columns show blank results. So I thought I'd cast the net a little wider and ask whether someone may be able to look at this and offer some guidance on how to get around the problem.
Many thanks and kind regards
Chris
I tried exploring your use-case with splunkd-access log and came up with a simple SPL to help you.
In this query I am actually joining the output of 2 searches which aggregate the required results (Not concerned about the search performance).
Give it a try. If you've access to _internal index, this will work as is. You should be able to easily modify this to suit your events (eg: replace user with ClientID).
index=_internal source="/opt/splunk/var/log/splunk/splunkd_access.log"
| stats count as All sum(eval(if(status <= 303,1,0))) as Successful sum(eval(if(status > 303,1,0))) as Unsuccessful by user
| join user type=left
[ search index=_internal source="/opt/splunk/var/log/splunk/splunkd_access.log"
| chart count BY user status ]
I updated your search from splunk community answers (should look like this):
`w2_wmf(RequestCompleted)` request.detail.Context="*test"
| dedup eventId
| rename request.ClientID as ClientID detail.statusCode AS statusCode
| stats count as All sum(eval(if(statusCode <= 303,1,0))) as Successful sum(eval(if(statusCode > 303,1,0))) as Unsuccessful by ClientID
| join ClientID type=left
[ search `w2_wmf(RequestCompleted)` request.detail.Context="*test"
| dedup eventId
| rename request.ClientID as ClientID detail.statusCode AS statusCode
| chart count BY ClientID statusCode ]
I answered in Splunk
https://answers.splunk.com/answers/724223/in-a-table-powered-by-a-stats-count-search-can-you.html?childToView=729492#answer-729492
but using dummy encoding, it looks like
`w2_wmf(RequestCompleted)` request.detail.Context="*test"
| dedup eventId
| rename request.ClientId as ClientID, detail.statusCode as Status
| eval X_{Status}=1
| stats count as Total sum(X_*) as X_* by ClientID
| rename X_* as *
Will give you ClientID, count and then a column for each status code found, with a sum of each code in that column.
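If the dummy-encoding trick is unfamiliar, here is a rough Python sketch of what the eval, stats, and rename steps do together. The events and the function name `dummy_encode_counts` are made up; only the ClientID and Status field names come from the search above.

```python
from collections import defaultdict

# Hypothetical events after the rename step.
events = [
    {"ClientID": "c1", "Status": 200},
    {"ClientID": "c1", "Status": 200},
    {"ClientID": "c1", "Status": 404},
    {"ClientID": "c2", "Status": 500},
]


def dummy_encode_counts(rows):
    """Build one column per status code and sum it per ClientID."""
    result = defaultdict(lambda: defaultdict(int))
    for row in rows:
        # eval X_{Status}=1: a field named after the status value, set to 1.
        encoded = {f"X_{row['Status']}": 1}
        out = result[row["ClientID"]]
        out["Total"] += 1  # count as Total
        for column, value in encoded.items():
            # sum(X_*) as X_*, then rename X_* as * (drop the prefix).
            out[column[len("X_"):]] += value
    return {client: dict(cols) for client, cols in result.items()}


table = dummy_encode_counts(events)
```

Each client only gets columns for the status codes it actually produced, which in a table view would show as blank cells for the others.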
As I gather you can't get this working, this query should show dummy encoding in action
index=_internal sourcetype=*access
| eval X_{status}=1
| stats count as Total sum(X_*) as X_* by source, user
| rename X_* as *
This would give an output of something like

How do you 'join' multiple SQL data sets side by side (that don't link to each other)?

How would I go about joining results from multiple SQL queries so that they are side by side (but unrelated)?
The reason I am thinking of this is so that I can run 1 query in Google BigQuery and have it return 1 single table, which I can import into Excel to build some charts.
e.g. Query 1 looks at dataset TableA and returns:
**Metric:** Sales
**Value:** 3,402
And then Query 2 looks at dataset TableB and returns:
**Name:** John
**DOB:** 13 March
They would both use different tables and different filters, etc.
What would I do to make it look like:
---Sales----------John----
---3,402-------13 March----
Or alternatively:
-----Sales--------3,402-----
-----John-------13 March----
Or is there a totally different way to do this?
I can see the use case for the above, I've used something similar to create a single table from multiple tables with different metrics to query in Data Studio so that filters apply to all data in the dataset for example. However in that case, the data did share some dimensions that made it worthwhile doing.
If you are going to put those together with no relationship between the tables, I'd have 4 columns with TYPE describing the data in that row to make for easier filtering.
Type | Sales | Name | DOB
Use UNION ALL to put the rows together so you have something like
"Sales" | 3402 | null | null
"Customer Details" | null | John | 13 March
However, like the others said, make sure you have a good reason to do that otherwise you're just creating a bigger table to query for no reason.
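Here is a minimal sketch of the UNION ALL layout, using Python's built-in sqlite3 so it can be run anywhere rather than BigQuery itself. The table names `table_a` and `table_b`, their columns, and the sample rows are assumptions made up to match the example in the question.

```python
import sqlite3


def stacked_results():
    """Stack two unrelated result sets into one table with a type column."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE table_a (amount INTEGER);
        CREATE TABLE table_b (name TEXT, dob TEXT);
        INSERT INTO table_a VALUES (3000), (402);
        INSERT INTO table_b VALUES ('John', '13 March');
        """
    )
    rows = conn.execute(
        """
        SELECT 'Sales' AS type, SUM(amount) AS sales, NULL AS name, NULL AS dob
        FROM table_a
        UNION ALL
        SELECT 'Customer Details', NULL, name, dob
        FROM table_b
        """
    ).fetchall()
    conn.close()
    return rows


rows = stacked_results()
```

Both SELECTs must produce the same number of columns, which is why each side pads the columns it does not use with NULL.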

Best way to use hibernate for complex queries like top N per group

I have been working for a while on a reporting application where I use Hibernate to define my queries. However, more and more I get the feeling that for reporting use cases this is not the best approach.
The queries only return partial columns, and thus not typed objects (unless you cast all fields in Java).
It is hard to express queries without going straight into SQL or HQL.
My current problem is that I want to get the top N per group, for example the last 5 days per element in a group, where on each day I display the amount of visitors.
The result should look like:
| RowName | 1-1-2009 | 2-1-2009 | 3-1-2009 | 4-1-2009 | 5-1-2009
| SomeName| 1 | 42 | 34 | 32 | 35
What is the best approach to transform the data which is stored per day per row to an output like this? Is it time to fall back on regular sql and work with untyped data?
I really want to use typed objects for my results but java makes my life pretty hard for that. Any suggestions are welcome!
Using the Criteria API, you can do this:
Session session = ...;
Criteria criteria = session.createCriteria(MyClass.class);
// Offsets are zero-based: 0 starts at the first row, 1 would skip it.
criteria.setFirstResult(0);
criteria.setMaxResults(5);
// ... any other criteria ...
List topFive = criteria.list();
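Note that setMaxResults limits the whole result, not each group. As a rough sketch of the per-group version the question asks about, here is one way to do it with a correlated subquery, run through Python's built-in sqlite3 instead of Hibernate. The `visits` table, its columns, the sample data, and the function name are all invented for illustration.

```python
import sqlite3


def last_n_days_per_row(n=2):
    """Keep the n most recent days per row_name, then pivot days into columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (row_name TEXT, day TEXT, visitors INTEGER)")
    conn.executemany(
        "INSERT INTO visits VALUES (?, ?, ?)",
        [
            ("SomeName", "2009-01-01", 1),
            ("SomeName", "2009-01-02", 42),
            ("SomeName", "2009-01-03", 34),
            ("Other", "2009-01-02", 7),
            ("Other", "2009-01-03", 9),
            ("Other", "2009-01-04", 5),
        ],
    )
    # A row qualifies if fewer than n rows in its own group have a later day.
    rows = conn.execute(
        """
        SELECT v.row_name, v.day, v.visitors
        FROM visits AS v
        WHERE (SELECT COUNT(*)
               FROM visits AS later
               WHERE later.row_name = v.row_name
                 AND later.day > v.day) < ?
        ORDER BY v.row_name, v.day
        """,
        (n,),
    ).fetchall()
    conn.close()
    # Pivot: one dict per row_name, keyed by day.
    pivot = {}
    for name, day, visitors in rows:
        pivot.setdefault(name, {})[day] = visitors
    return pivot


pivot = last_n_days_per_row(2)
```

The pivot into one column per day is done in application code here; the query itself only selects the top n rows per group.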
To do this in vanilla SQL (and to confirm that Hibernate is doing what you expect) check out this SO post: