Is it possible to add the description or another custom field to the query result log? - osquery

I have the following scheduled query in combination with a TLS plugin logger.
"vssadmin.exe": {
"query": "select * from file WHERE directory = 'C:\\Windows\\Prefetch\\' and filename like '%vssadmin%';",
"interval": 600,
"description": "Vssadmin Execute, usaullay used to execute activity on Volume Shadow copy",
"platform": "windows"
},
I'd like to add the description field to the result output log of this specific query, so I can use it to map my queries to a framework. Unfortunately, the documentation doesn't mention such an option. Is it possible to add the description or another custom field to the logged output?

Like this?
Tag your #osquery queries/logs with MITRE ATT&CK IDs like so:
SELECT username, shell, 'T1136' AS attckID FROM users;
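Applied to the scheduled query from the question, that would look roughly like the following sketch (the extra column names and the ATT&CK ID are illustrative, not anything osquery requires):
"vssadmin.exe": {
"query": "select *, 'Vssadmin Execute, usually used to execute activity on Volume Shadow copy' AS description, 'T1490' AS attck_id from file WHERE directory = 'C:\\Windows\\Prefetch\\' and filename like '%vssadmin%';",
"interval": 600,
"platform": "windows"
},
Since the literals are selected as columns, they should show up as extra keys in every logged result row that the TLS logger forwards.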

Related

Left join not working properly in Azure Stream Analytics

I'm trying to create a simple left join between two inputs (event hubs); the source of the inputs is a function app that processes a RabbitMQ queue and sends to an event hub.
In my eventhub1 I have this data:
[{
"user": "user_aa_1"
}, {
"user": "user_aa_2"
}, {
"user": "user_aa_3"
}, {
"user": "user_cc_1"
}]
In my eventhub2 I have this data:
[{
"user": "user_bb_1"
}, {
"user": "user_bb_2"
}, {
"user": "user_bb_3
}, {
"user": "user_cc_1"
}]
I use this SQL to create my left join:
select hub1.[user] h1,hub2.[user] h2
into thirdTestDataset
from hub1
left join hub2
on hub2.[user] = hub1.[user]
and datediff(hour,hub1,hub2) between 0 and 5
and the test result looks OK...
The problem is when the job is actually running... I get this result in the Power BI dataset...
Any idea why my left join isn't working like a normal SQL query?
I tested your SQL query and it works well for me too. So when you can't get the expected output after executing the ASA job, I suggest you follow the troubleshooting steps in this document.
Based on your output, it seems that hub2 becomes the left table. You could use the diagnostic logs in ASA to locate the actual output of the job execution.
I tested end-to-end using blob storage for inputs 1 and 2 with your sample data, and a Power BI dataset as output, and observed the expected result.
I think there are a few things that can go wrong with your query:
First, your join has a 5-hour window: basically that means it looks at EH1 and EH2 for matches during that large window, so live results will differ from the sample input, for which you have only one row. Can you validate that you had no match during this 5-hour window?
Additionally, by default PBI streaming datasets are "hybrid datasets", so they will accumulate results without a good way to know when a result was emitted, since there is no timestamp in your output schema. So you may also be seeing previous data here. I'd suggest a few things:
In Power BI, change the option of your dataset: disable "Historic data analysis" to remove caching of data
Add a timestamp column so you can identify when the data was generated (the first line of your query becomes: select System.Timestamp() as time, hub1.[user] h1, hub2.[user] h2); the full query is sketched below.
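A minimal sketch of the full query with that change, keeping the same aliases as your original query:
select System.Timestamp() as time, hub1.[user] h1, hub2.[user] h2
into thirdTestDataset
from hub1
left join hub2
on hub2.[user] = hub1.[user]
and datediff(hour, hub1, hub2) between 0 and 5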
Let me know if it works for you.
Thanks,
JS (Azure Stream Analytics)

Dynamically Append datetime to filename during copy activity or when specifying name in blob dataset

I am saving a file to blob storage in Data Factory V2. When I specify the location to save to, I call the file (for example) file1 and it saves in blob as file1, no problem. But can I use the dynamic content feature to append the datetime to the filename so it's something like file1_01-07-2019_14-30-00? (7th Jan 14:30:00, just in case it's awkward to read.) Alternatively, can I output the result (the filename) of the webhook activity to the next activity (the function)?
Thank you.
I couldn't get this to work without editing the copy pipeline JSON file directly (late 2018 - may not be needed anymore). You need dynamic code in the copy pipeline JSON and settings defined in the dataset for setting filename parameters.
In the dataset define 'Parameters' for folder path and/or filename (click '+ New' and give them any name you like) e.g. sourceFolderPath, sourceFileName.
Then in dataset under 'Connection' include the following in the 'File path' definition:
@dataset().sourceFolderPath and @dataset().sourceFileName on either side of the '/'
(see screenshot below)
In the copy pipeline, click on 'Code' in the upper right corner of the pipeline window and look for the following code under the 'blob' object you want defined by a dynamic filename. If the 'parameters' code isn't included, add it to the JSON and click the 'Finish' button. This code may be needed in 'inputs', 'outputs' or both, depending on the dynamic files you are referencing in your flow. Below is an example where the output includes the date parameter in both the folder path and the file name (the date is set by a Trigger parameter):
"inputs": [
{
"referenceName": "tmpDataForImportParticipants",
"type": "DatasetReference"
}
],
"outputs": [
{
"referenceName": "StgParticipants",
"type": "DatasetReference",
"parameters": {
"sourceFolderPath": {
"value": <derived value of folder path>,
"type": "Expression"
},
"sourceFileName": {
"value": <derived file name>,
"type": "Expression"
}
}
}
]
The derived value of the folder path may be something like the following - this results in a folder path of yyyy/MM/dd within the specified blobContainer:
"blobContainer/@{formatDateTime(pipeline().parameters.windowStart,'yyyy')}/@{formatDateTime(pipeline().parameters.windowStart,'MM')}/@{formatDateTime(pipeline().parameters.windowStart,'dd')}"
or it could be hardcoded e.g. "blobContainer/directoryPath" - don't include '/' at start or end of definition
Derived file name could be something like the following:
"#concat(string(pipeline().parameters.'_',formatDateTime(dataset().WindowStartTime, 'MM-dd-yyyy_hh-mm-ss'))>,'.txt')"
You can include any parameter set by the Trigger (e.g. an ID value, account name, etc.) by referencing it through pipeline().parameters.
(Screenshots: Dynamic Dataset Parameters example; Dynamic Dataset Connection example)
Once you set up the copy activity and select your blob dataset as the sink, you need to put in a value for WindowStartTime; this can either be a plain timestamp, e.g. 1900-01-01T13:00:00Z, or you can pass in a pipeline parameter.
Having a parameter is probably more helpful if you're setting up a schedule trigger, as you will be able to set this WindowStartTime timestamp from when the trigger runs. For this you would use @trigger().scheduledTime as the value for the trigger parameter WindowStartTime.
https://learn.microsoft.com/en-us/azure/data-factory/concepts-pipeline-execution-triggers#trigger-type-comparison
You can add a dataset parameter such as WindowStartTime, which is in the format 2019-01-10T13:50:04.279Z. Then you would have something like the below for the dynamic filename:
@concat('file1_', formatDateTime(dataset().WindowStartTime, 'MM-dd-yyyy_hh-mm-ss'))
To use in the copy activity you will also need to add a pipeline parameter.

How to match following queries in Azure Search

I have the default Analyzer set for my index and the fields in Azure Search.
I have the following values for a field called name.
Demo 001
Demo Site 001
001 Demo Site
I am trying to get matching values for the following. My sample query is:
$count=true&queryType=full&searchFields=name&searchMode=any&$select=name,id&$skip=0&$top=10&search=name:/"Demo(.*)/
With this query I can get all the results.
How do I get the query to return matches for only "Demo S", that is, Demo Site 001? What change should I make to the query, or to the analyzer?
If I want a query that works with "001 " (001 followed by a space), how can I modify the query?
Finally, is there any way I could tell the search that I need only the values which start with 001?
Is it possible to achieve all the above three conditions with a single setup?
There are two possible ways to achieve this.
A. Custom Analyzer with a CharMap filter
1. For the index phase, you can use a custom analyzer with a character filter to map whitespace to an underscore/empty string.
e.g. if you map whitespace to an empty string, your data will be stored as:
Demo Site 001 ---> DemoSite001
001 Demo Site ---> 001DemoSite
"charFilters":[
{
"name":"map_dash",
"#odata.type":"#Microsoft.Azure.Search.MappingCharFilter",
"mappings":[" =>"]
}
]
In the query phase:
Step 1. Parse the query and substitute whitespace with the same identifier as used in the index phase.
So the search query "Demo S" translates to ---> "DemoS"
Step 2. Do a wildcard search for the new query string
search = DemoS*
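Reusing the request shape from your question, the wildcard request would look roughly like this (only the search parameter changes):
$count=true&queryType=full&searchFields=name&searchMode=any&$select=name,id&$skip=0&$top=10&search=DemoS*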
B. Custom Analyzer with an EdgeNGram token filter
Use a custom analyzer with an EdgeNGram token filter to index your documents.
eg:
"tokenFilters": [
{
"name": "edgeNGramFilter",
"#odata.type": "#Microsoft.Azure.Search.EdgeNGramTokenFilterV2",
"minGram": 2,
"maxGram": 20
}
],
"analyzers": [
{
"name": "prefixAnalyzer",
"#odata.type": "#Microsoft.Azure.Search.CustomAnalyzer",
"tokenizer": "keyword",
"tokenFilters": [ "lowercase", "edgeNGramFilter" ]
}
]
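For this to take effect, the field has to reference the custom analyzer. A minimal sketch of the field definition (the field name comes from your question; splitting indexAnalyzer/searchAnalyzer so the query terms are not n-grammed themselves is one reasonable choice):
"fields": [
{
"name": "name",
"type": "Edm.String",
"searchable": true,
"indexAnalyzer": "prefixAnalyzer",
"searchAnalyzer": "keyword"
}
]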
With either of these approaches:
"Demo S" will return only Demo Site 001
"001 " will only return 001 Demo Site
More details:
How Search works
Custom Analyzers

Add column description to BigQuery table?

I need to add descriptions to each column of a BigQuery table. It seems I can do it manually; how do I do it programmatically?
BigQuery now supports the ALTER COLUMN SET OPTIONS statement, which can be used to update the description of a column.
example:
ALTER TABLE mydataset.mytable
ALTER COLUMN price
SET OPTIONS (
description="Price per unit"
)
Documentation:
https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#alter_column_set_options_statement
As Adam mentioned, you can use the table PATCH method on the API to update the schema columns. The other method is to use bq.
You can first get the schema by doing the following:
1: Get the JSON schema:
TABLE=publicdata:samples.shakespeare
bq show --format=prettyjson ${TABLE} > table.txt
Then copy the schema from table.txt to schema.txt ... it will look something like:
[
{
"description": "A single unique word (where whitespace is the delimiter) extracted from a corpus.",
"mode": "REQUIRED",
"name": "word",
"type": "STRING"
},
{
"description": "The number of times this word appears in this corpus.",
"mode": "REQUIRED",
"name": "word_count",
"type": "INTEGER"
},
....
]
2: Set the description field to whatever you want (if it is not there, add it).
3: Tell BigQuery to update the schema with the added columns. Note that schema.txt must contain the complete schema.
bq update --schema schema.txt -t ${TABLE}
You can use the REST API to create or update a table, and specify a field description (schema.fields[].description) in your schema.
https://cloud.google.com/bigquery/docs/reference/v2/tables#methods
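As a rough sketch of that call (the project, dataset and table names are placeholders, and the patched schema must list every existing field, not just the one whose description you are changing):
curl -X PATCH \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://bigquery.googleapis.com/bigquery/v2/projects/myproject/datasets/mydataset/tables/mytable" \
-d '{"schema": {"fields": [
{"name": "word", "type": "STRING", "mode": "REQUIRED", "description": "A single unique word extracted from the corpus"},
{"name": "word_count", "type": "INTEGER", "mode": "REQUIRED", "description": "The number of times this word appears in this corpus"}
]}}'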

aws data pipeline datetime variable

I am using AWS Data Pipeline to save a text file to my S3 bucket from RDS. I would like the file name to have the date and the hour in it, like:
myfile-YYYYMMDD-HH.txt
myfile-20140813-12.txt
I have specified my S3DataNode FilePath as:
s3://mybucketname/out/myfile-#{format(myDateTime,'YYYY-MM-dd-HH')}.txt
When I try to save my pipeline I get the following error:
ERROR: Unable to resolve myDateTime for object:DataNodeId_xOQxz
According to the AWS Data Pipeline documentation for date and time functions this is the proper syntax for using the format function.
When I save the pipeline using a hard-coded date and time, I don't get this error and my file is in my S3 bucket and folder as expected.
My thinking is that I need to define "myDateTime" somewhere or use something like NOW().
Can somebody tell me how to set "myDateTime" to the current time (e.g. NOW) or give a workaround so I can format the current time to be used in my FilePath?
I am not aware of an exact equivalent of NOW() in Data Pipeline. I tried using makeDate with no arguments (just for fun) to see if that worked... it did not.
The closest are runtime variables scheduledStartTime, actualStartTime, reportProgressTime.
http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-object-s3datanode.html
The following, for example, should work:
s3://mybucketname/out/myfile-#{format(@scheduledStartTime,'YYYY-MM-dd-HH')}.txt
Just for fun, here is some more info on Parameters.
At the end of your Pipeline JSON (click List Pipelines, select one, click Edit Pipeline, then click Export), you need to add a Parameters and/or Values object.
I use a myStartDate for backfill processes, which you can manipulate once it is passed in for ad hoc runs. You can give this a static default, but you can't set it to a dynamic value, so it is limited for regularly scheduled tasks. For real-time/scheduled dates, you need to use @scheduledStartTime, etc., as suggested. Here is a sample of setting up some Parameters and/or Values. Both show up under Parameters in the UI. These values can be used throughout your pipeline activities (shell, hive, etc.) with the #{myVariableToUse} notation.
"parameters": [
{
"helpText": "Put help text here",
"watermark": "This shows if no default or value set",
"description": "Label/Desc",
"id": "myVariableToUse",
"type": "string"
}
]
And for Values:
"values": {
"myS3OutLocation": "s3://some-bucket/path",
"myThreshold": "30000",
}
You cannot add these directly in the UI (yet) but once they are there you can change and save the values.
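As a small illustration of that notation (the activity id, command and resource name here are made up), a Value such as myS3OutLocation can then be referenced from an activity object in the same pipeline definition:
{
"id": "CopyOutputToS3",
"type": "ShellCommandActivity",
"runsOn": { "ref": "MyEc2Resource" },
"command": "aws s3 cp /tmp/output.txt #{myS3OutLocation}/output.txt"
}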