How to aggregate or join two JSON datasets in Splunk? - splunk

I am trying to build an overview of some process steps of my application.
I generate two JSON documents:
{
"requestID": "abc-123",
"username": "ringo"
}
and
{
"requestID": "abc-123",
"favoriteCar": "Lada"
}
I also have other entries like these:
abc-456 / paul / Fiat
bcd-987 / george / Talbot
and so on, all linked by the requestID.
Now I want to build a table that shows me:
ID | Username | Car
---------|--------------|---------------
abc-123 | ringo | Lada
abc-456 | paul | Fiat
bcd-987 | george | Talbot
So my question is: How can I do this aggregation?
Kind regards
Markus

Aggregations are done with the stats command. Once the fields are extracted, group them with stats values(*) as * by requestID, which yields one row per requestID with the values of every other field collected alongside it.
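To make the grouping concrete, here is a minimal Python sketch of what stats values(*) as * by requestID does conceptually. It is not Splunk code; the helper name values_by and the in-memory event list are assumptions made for illustration, using the sample data from the question.

```python
from collections import defaultdict

# Events as they might look once the JSON fields have been extracted.
events = [
    {"requestID": "abc-123", "username": "ringo"},
    {"requestID": "abc-123", "favoriteCar": "Lada"},
    {"requestID": "abc-456", "username": "paul"},
    {"requestID": "abc-456", "favoriteCar": "Fiat"},
    {"requestID": "bcd-987", "username": "george"},
    {"requestID": "bcd-987", "favoriteCar": "Talbot"},
]


def values_by(events, key):
    """Group events by `key`, collecting the distinct values of every other field."""
    grouped = defaultdict(lambda: defaultdict(set))
    for event in events:
        for field, value in event.items():
            if field != key:
                grouped[event[key]][field].add(value)
    # Sort each value set so the result is deterministic.
    return {
        group: {field: sorted(vals) for field, vals in fields.items()}
        for group, fields in grouped.items()
    }


table = values_by(events, "requestID")
for request_id, fields in sorted(table.items()):
    print(request_id, fields.get("username"), fields.get("favoriteCar"))
```

Each requestID ends up as one row, with username and favoriteCar side by side, which is the shape of the table asked for in the question.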


Is it possible to UNNEST an array in BigQuery so that the nested data is split into columns by a key value?

Let's say I have some data in BigQuery which includes a nested array of objects like so:
{
"name" : "Bob",
"age": "24",
"customFields": [
{
"index": "1",
"value": "1.98"
},
{
"index": "2",
"value": "Nintendo"
},
{
"index": "3",
"value": "Yellow"
}
]
}
I've only been able to unnest this data so that the "index" and "value" fields are columns:
+------+-----+-------+----------+
| name | age | index | value |
+------+-----+-------+----------+
| Bob | 24 | 1 | 1.98 |
| Bob | 24 | 2 | Nintendo |
| Bob | 24 | 3 | Yellow |
+------+-----+-------+----------+
In most cases this would be the desired output, but as the data I'm using refers to Google Analytics custom dimensions I require something a bit more complex. I'm trying to get the index value to be used in the name of the column the data appears in, like so:
+------+-----+---------+----------+---------+
| name | age | index_1 | index_2 | index_3 |
+------+-----+---------+----------+---------+
| Bob | 24 | 1.98 | Nintendo | Yellow |
+------+-----+---------+----------+---------+
Is this possible? What would be the SQL query required to generate this output? It should use the "index" value in the column name, as the output won't always be in the order "1,2,3,...".
What you are describing is often referred to as a pivot table: a transformation where values are used as columns. SQL doesn't generally support this, because SQL is designed around a fixed schema while a pivot table requires a dynamic one.
However, if you have a fixed set of index columns, you can emulate it with something like:
SELECT
name,
age,
ARRAY(SELECT value FROM UNNEST(customFields) WHERE index="1")[SAFE_OFFSET(0)] AS index_1,
ARRAY(SELECT value FROM UNNEST(customFields) WHERE index="2")[SAFE_OFFSET(0)] AS index_2,
ARRAY(SELECT value FROM UNNEST(customFields) WHERE index="3")[SAFE_OFFSET(0)] AS index_3
FROM your_table;
This defines one column per index explicitly; each column picks the matching value out of the customFields array, and SAFE_OFFSET(0) returns NULL when a row has no entry for that index.
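The same fixed-column pivot can be sketched outside SQL. The Python below is only an illustration of the idea, not BigQuery code; the function name pivot_custom_fields is hypothetical, and the record is the sample from the question.

```python
record = {
    "name": "Bob",
    "age": "24",
    "customFields": [
        {"index": "1", "value": "1.98"},
        {"index": "2", "value": "Nintendo"},
        {"index": "3", "value": "Yellow"},
    ],
}


def pivot_custom_fields(record, indexes):
    """Flatten one record, giving each requested index its own column."""
    by_index = {}
    for field in record["customFields"]:
        # Keep one value per index (here, the first one seen).
        by_index.setdefault(field["index"], field["value"])
    row = {"name": record["name"], "age": record["age"]}
    for index in indexes:
        # A missing index yields None, much like the NULL from SAFE_OFFSET(0).
        row[f"index_{index}"] = by_index.get(index)
    return row


row = pivot_custom_fields(record, ["1", "2", "3"])
print(row)
```

As in the SQL version, the list of indexes has to be known up front; the column set is fixed by the caller rather than discovered from the data.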

ID Extracted from string not useable for connecting to bound form - "expression ... too complex"

I have a linked table to an Outlook Mailitem folder in my Access Database. This is handy in that it keeps itself constantly updated, but I can't add an extra field to relate these records to a parent table.
My workaround was to put an automatically generated/added ID String into the Subject so I could work from there. In order to make my form work the way I need it to, I'm trying to create a query that takes the fields I need from the linked table and adds a calculated field with the extracted ID so it can be referenced for relating records in the form.
The query works fine (I get all the records and their IDs extracted) but when I try to filter records from this query by the calculated field I get:
This expression is typed incorrectly, or it is too complex to be evaluated. For example, a numeric expression may contain too many complicated elements. Try simplifying the expression by assigning parts of the expression to variables.
I tried separating the calculated field out into three fields so it's easier to read, hoping that would make it easier to evaluate for Access, but I still get the same error. My base query is currently:
SELECT InStr(Subject,"Support Project #CS")+19 AS StartID,
InStr(StartID,Subject," ") AS EndID,
Int(Mid(Subject,StartID,EndID-StartID)) AS ID,
ProjectEmails.Subject,
ProjectEmails.[From],
ProjectEmails.To,
ProjectEmails.Received,
ProjectEmails.Contents
FROM ProjectEmails
WHERE (((ProjectEmails.[Subject]) Like "*Support Project [#]CS*"));
I've tried to bind a subform to this query on qryProjectEmailWithID.ID = SupportProject.ID where the main form is bound to SupportProject, and I get the above error. I tried building a query that selects all records from that query where the ID = a given parameter and I still get the same error.
The working query that adds Support Project IDs would look like:
+----+--------------------------------------+----------------------+----------------------+------------+----------------------------------+
| ID | Subject | To | From | Received | Contents |
+----+--------------------------------------+----------------------+----------------------+------------+----------------------------------+
| 1 | RE: Support Project #CS1 ID Extra... | questions@so.com | Isaac.Reefman@so.com | 2019-03-11 | Trying to work out how to add... |
| 1 | RE: Support Project #CS1 ID Extra... | isaac.reefman@so.com | questions@so.com | 2019-03-11 | Thanks for your question. The... |
| 1 | RE: Support Project #CS1 ID Extra... | isaac.reefman@so.com | questions@so.com | 2019-03-11 | You should use a different me... |
| 2 | RE: Support Project #CS2 IT issue... | support@domain.com | someone@company.com | 2019-02-21 | I really need some help with ... |
| 2 | RE: Support Project #CS2 IT issue... | someone@company.com | support@domain.com | 2019-02-21 | Thanks for your question. The... |
| 2 | RE: Support Project #CS2 IT issue... | someone@company.com | support@domain.com | 2019-02-21 | Have you tried turning it off... |
| 3 | RE: Support Project #CS3 email br... | support@domain.com | someone@company.com | 2019-02-12 | my email server is malfunccti... |
| 3 | RE: Support Project #CS3 email br... | someone@company.com | support@domain.com | 2019-02-12 | Thanks for your question. The... |
| 3 | RE: Support Project #CS3 email br... | someone@company.com | support@domain.com | 2019-02-13 | I've just re-started the nece... |
+----+--------------------------------------+----------------------+----------------------+------------+----------------------------------+
The view in question would populate a datasheet that looks the same, showing just the items whose ID matches the ID of the current SupportProject record and updating when a new record is selected. A separate text box should show the full content of whichever record is selected in that grid, like this:
Have you tried turning it off and on again?
From: support@domain.com
On: 21/02/2019
Thanks for your question. The matter has been assigned to Support Project #CS2, and a support staff member will be in touch shortly to help you out. As it is considered of medium priority, you should expect daily updates.
Thanks,
Support
From: someone@company
On: 21/02/2019
I really need some help with my computer. It seems really slow and I can't do my work efficiently.
Neither of these things happens when I try to use the calculated number to relate to the PK of the SupportProject table.
I don't know if this is a part of the problem, but whether I use Int(Mid(Subject... or Val(Mid(Subject... I still apparently get a Double, where the ID field (as an autoincrement ID) is a Long. I can't work out how to force it to return a Long, so I can't test whether that's the problem.
So that is the output of the posted SQL? I really wanted raw data, but this is close enough. If the requirement is to extract the number after ...CS, calculate it in a query and save the query:
Val(Mid([Subject],InStr([Subject],"CS")+2))
Then build another query that joins the first query to the table:
SELECT qryProjectEmailWithID.*, SupportProject.tst
FROM qryProjectEmailWithID
INNER JOIN SupportProject ON qryProjectEmailWithID.ID = SupportProject.ID;
Filter criteria can be applied to either ID field.
A subform can display the related child records synchronized with SupportProject records on main form.
I tested the ID calc with your data and then with a link to my Inbox. No issue with query join.
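For readers who want to check the extraction logic outside Access, here is a rough Python analogue of Val(Mid([Subject],InStr([Subject],"CS")+2)). It is a sketch under assumptions, not a faithful port: it reads only the leading digits after the first "CS", matches case-sensitively, and returns 0 when no "CS" is found. The function name extract_project_id is hypothetical.

```python
import re


def extract_project_id(subject):
    """Return the integer that follows the first "CS" in `subject`, or 0 if none does."""
    start = subject.find("CS")
    if start == -1:
        return 0
    # Read only the leading run of digits after "CS" and ignore the rest.
    match = re.match(r"\d+", subject[start + 2:])
    return int(match.group()) if match else 0


print(extract_project_id("RE: Support Project #CS2 IT issue..."))
```

Because the result is a plain integer, it compares directly against an integer key, which is the role the calculated ID plays in the join above.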

Is there any Eloquent way to get sum of relational table and sort by it?

Example:
table Users
ID | Username | sex
1 | Tony | m
2 | Andy | m
3 | Lucy | f
table Scores
ID | user_id | score
1 | 2 | 4
2 | 1 | 3
3 | 1 | 4
4 | 2 | 3
5 | 1 | 1
6 | 3 | 3
7 | 3 | 2
8 | 2 | 3
Expected Result:
ID | Username | sex | score_sum (sum) (desc)
2 | Andy | m | 10
1 | Tony | m | 8
3 | Lucy | f | 5
The code I use so far:
User model:
class User extends Authenticatable
{
...
public function scores()
{
return $this->hasMany('App\Score');
}
...
}
Score model
class Score extends Model
{
//i put nothing here
}
Code in controller:
$users = User::all();
foreach ($users as $user){
$user->score_sum = $user->scores()->sum('score');
}
$users = collect($users)->sortByDesc('score_sum');
return view('homepage', [
'users' => $users->values()->all()
]);
I hope my example above makes sense. My code does work, but I thought there must be an Eloquent and elegant way to do this without foreach?
There are 2 options for doing this in an Eloquent way.
Option 1
The first way is to add score_sum as an attribute that is always available when querying the users model. This is only a good idea if you will be using score_sum the majority of the time when querying the users table. If you only need score_sum in a very specific view or for specific business logic, then I would use the second option below.
To do this, add an accessor to the User model; see the documentation here: https://laravel.com/docs/5.6/eloquent-mutators#defining-an-accessor
Here is an example for your use case:
/app/User.php
class User extends Model
{
.
.
.
public function getScoreSumAttribute($value)
{
return $this->scores()->sum('score');
}
}
Option 2
If you just want to do this for a single use case, then the easiest solution is just to use the sum() function in the eventual foreach loop you will be using (most likely in the view).
For example in a view:
@foreach($users as $user)
<div>Username: {{$user->username}}</div>
<div>Sex: {{$user->sex}}</div>
<div>Score Sum: {{$user->scores()->sum('score')}}</div>
@endforeach
Additionally, if you do not want to do this in a foreach loop, you can use a raw subquery in the Eloquent call in your controller to get the score_sum. Here is an example of how that can be done:
$users = User::select('users.*', DB::raw('(SELECT SUM(score) FROM scores WHERE scores.user_id = users.id) AS score_sum'))->get();
I did not have an environment at hand to test this, so treat it as a sketch.
Hope this helps!
This is as nice as it gets:
User::selectRaw('*, (SELECT SUM(score) FROM scores WHERE user_id = users.id) as score_sum')
->orderBy('score_sum', 'DESC')
->get();
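The SQL behind that selectRaw call can be tried without a Laravel setup. The sketch below runs the same correlated subquery against an in-memory SQLite database through Python's sqlite3 module, loaded with the sample data from the question. The table layout is an assumption based on the tables shown there, and SQLite stands in for whatever database the application actually uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, sex TEXT);
    CREATE TABLE scores (id INTEGER PRIMARY KEY, user_id INTEGER, score INTEGER);
    INSERT INTO users VALUES (1, 'Tony', 'm'), (2, 'Andy', 'm'), (3, 'Lucy', 'f');
    INSERT INTO scores VALUES
        (1, 2, 4), (2, 1, 3), (3, 1, 4), (4, 2, 3),
        (5, 1, 1), (6, 3, 3), (7, 3, 2), (8, 2, 3);
""")

# One subquery per user row computes the sum; the outer query sorts by it.
rows = conn.execute("""
    SELECT *, (SELECT SUM(score) FROM scores WHERE user_id = users.id) AS score_sum
    FROM users
    ORDER BY score_sum DESC
""").fetchall()

for row in rows:
    print(row)
```

The rows come back as Andy (10), Tony (8), Lucy (5), matching the expected result in the question, and the sorting happens in the database rather than in a PHP loop.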

Lucene Query - AND operator failing in Azure Search?

I have a search index of sandwiches. The index has three fields: id, meat, and bread. Each field is an Edm.String. In this index, here is a subset of my data:
ID | Meat | Bread
-----------------------
1 | Ham | White
2 | Turkey | Hoagie
3 | Tuna | Wheat
4 | Roast Beef | White
5 | Ham | Wheat
6 | Roast Beef | Rye
7 | Turkey | Wheat
I need to write a query that returns all ham or turkey sandwiches on wheat bread. In an attempt to do this, I've created the following:
{
"search":"(meat:(Ham|Turkey) AND bread:\"Wheat\")",
"searchMode":"all",
"select":"id,meat,bread"
}
When I run this query, I don't get any results. What am I missing? I'm trying to understand full queries. Do field-level queries support the phrase operator?
You need to use "queryType": "full" to request the Lucene syntax. Without it, the service applies the default simple query syntax, which does not understand fielded expressions such as meat:(...). See an example on MSDN.
That said, what you're trying to accomplish can be done more easily and efficiently with filters. Assuming you make the relevant fields in your index filterable, you can use the following filter expression for your example: $filter=(meat eq 'Ham' or meat eq 'Turkey') and bread eq 'Wheat'. For more on filters, see this article. Hope this helps!
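As a sketch of the two approaches, the Python below builds both request bodies as plain dictionaries and checks locally which sample rows should come back. It does not call the search service. The OR spelling in the Lucene query is my substitution for the | in the question, and the "filter" key assumes the POST form of the search request, where the parameter is written without the $ prefix.

```python
import json

# Sample rows from the question; every field is a string (Edm.String).
sandwiches = [
    {"id": "1", "meat": "Ham", "bread": "White"},
    {"id": "2", "meat": "Turkey", "bread": "Hoagie"},
    {"id": "3", "meat": "Tuna", "bread": "Wheat"},
    {"id": "4", "meat": "Roast Beef", "bread": "White"},
    {"id": "5", "meat": "Ham", "bread": "Wheat"},
    {"id": "6", "meat": "Roast Beef", "bread": "Rye"},
    {"id": "7", "meat": "Turkey", "bread": "Wheat"},
]

# Lucene variant: "queryType": "full" is the piece missing from the question.
lucene_request = {
    "search": 'meat:(Ham OR Turkey) AND bread:"Wheat"',
    "queryType": "full",
    "searchMode": "all",
    "select": "id,meat,bread",
}

# Filter variant: match everything, then narrow with the filter expression.
filter_request = {
    "search": "*",
    "filter": "(meat eq 'Ham' or meat eq 'Turkey') and bread eq 'Wheat'",
    "select": "id,meat,bread",
}


def matches(row):
    """Local restatement of the intended condition, used only as a sanity check."""
    return row["meat"] in ("Ham", "Turkey") and row["bread"] == "Wheat"


expected_ids = [row["id"] for row in sandwiches if matches(row)]
print(json.dumps(lucene_request, indent=2))
print(json.dumps(filter_request, indent=2))
print(expected_ids)
```

Either body would be sent as the JSON payload of the search request; the local check only documents which documents (ids 5 and 7) the query is meant to return.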

Creating an SSIS job to split a column and insert into database

I have a column called Description:
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Description/Title |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Liszt, Hungarian Rhapsody #6 {'Pesther Carneval'}; 2 Episodes from Lenau's 'Faust'; 'Hunnenschlacht' Symphonic Poem. (NW German Phil./ Kulka) |
| Beethoven, Piano Sonatas 8, 23 & 26. (Justus Frantz) |
| Puccini, Verdi, Gounod, Bizet: Arias & Duets from Butterfly, Tosca, Boheme, Turandot, I Vespri, Faust, Carmen. (Fiamma Izzo d'Amico & Peter Dvorsky w.Berlin Radio Symph./Paternostro) |
| Puccini, Ponchielli, Bizet, Tchaikovsky, Donizetti, Verdi: Arias from Boheme, Manon Lescaut, Tosca, Gioconda, Carmen, Eugen Onegin, Favorita, Rigoletto, Luisa Miller, Ballo, Aida. (Peter Dvorsky, ten. w.Hungarian State Opera Orch./ Mihaly) |
| Thomas, Leslie: 'The Virgin Soldiers' (Hywel Bennett reads abridged version. Listening time app. 2 hrs. 45 mins. DOLBY) |
| Katalsky, A. {1856-1926}: Liturgy for A Cappella Chorus. Rachmaninov, 6 Choral Songs w.Piano. (Bolshoi Theater Children's Choir/ Zabornok. DOLBY) |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Please note that above I'm only showing 1 field.
Also, the output that I would like is:
+-------+-------+
| Word | Count |
+-------+-------+
| Arias | 3 |
| Duets | 2 |
| Liszt | 10 |
| Tosca | 1 |
+-------+-------+
I want this output to encompass EVERY record. I do not want a separate one of these for each record, just one global one.
I am choosing to use SSIS to do this job. I'd like your input on which controls to use to help with this task.
I'm not looking for a solution, but simply some direction on how to get started with this. I understand this can be done many different ways, but I cannot seem to think of a way to do this most efficiently. Thank you for any guidance.
FYI:
This script does an excellent job of concatenating everything:
select description + ', ' as 'data()'
from [BroincInventory]
for xml path('')
But I need guidance on how to work with this result to create the required output. How can this be done with C# or with one of the SSIS components?
edit: As siyual points out below I need a script task. The script above obviously will not work since there's a limit to the size of a data point.
I think term extraction might be the component you are looking for. Check this out: http://www.mssqltips.com/sqlservertip/3194/simple-text-mining-with-the-ssis-term-extraction-component/
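To show the shape of the result being asked for, here is a minimal Python sketch of a single global word count across all rows. It is not the SSIS Term Extraction component and makes no claim about how that component tokenizes text; the sample rows other than the Beethoven one are shortened, hypothetical stand-ins for the real descriptions, and the function name word_counts is an assumption.

```python
import re
from collections import Counter

# First row is from the question; the other two are shortened, hypothetical examples.
descriptions = [
    "Beethoven, Piano Sonatas 8, 23 & 26. (Justus Frantz)",
    "Verdi: Arias & Duets from Tosca",
    "Puccini: Arias from Tosca",
]


def word_counts(rows):
    """Count every alphabetic word across all rows into one shared tally."""
    counts = Counter()
    for text in rows:
        # Runs of letters count as words; digits and punctuation are skipped.
        counts.update(re.findall(r"[A-Za-z]+", text))
    return counts


counts = word_counts(descriptions)
for word, count in sorted(counts.items()):
    print(word, count)
```

Processing row by row keeps memory use flat and avoids concatenating every description into one oversized value, which is the limit the FOR XML PATH approach runs into.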