Replace text at a specific position in a line with the system date - awk

I have a file with a single line. I want to replace the text from position 188 to position 197 (inclusive) with the system date (YYYY-MM-DD).
I tried this but it doesn't work:
sed 's/\(.\{188\}\)\([0-9-]\{10\}\)\(.*\)/\1$(date '+%Y-%m-%d')\188/g'
I want to use sed or anything else that works in a shell script.
The input file is:
{ "agent": { "run_as_user": "root" }, "logs": { "logs_collected": { "files": { "collect_list": [ { "file_path": "/home/ec2-user/logs/**", "log_group_name": "Staging", "log_stream_name": "2020-10-24", "timestamp_format": "[%Y-%m-%d %H:%M:%S]" } ] } } } }
. . . and in the output, I want to change only the date, as shown below.
{ "agent": { "run_as_user": "root" }, "logs": { "logs_collected": { "files": { "collect_list": [ { "file_path": "/home/ec2-user/logs/**", "log_group_name": "Staging", "log_stream_name": "2020-10-25", "timestamp_format": "[%Y-%m-%d %H:%M:%S]" } ] } } } }

Could you please try the following GNU awk solution, written per the OP's shown attempt:
awk -v date="$(date +%Y-%m-%d)" '{print substr($0,1,187) date substr($0,198)}' Input_file
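For the record, the sed attempt fails for two reasons: the single quotes stop the shell from expanding $(date '+%Y-%m-%d'), and \188 is parsed as backreference \1 followed by a literal 88. A double-quoted sed sketch along the same lines (assuming the fixed offsets from the question) might look like:
sed "s/^\(.\{187\}\)[0-9-]\{10\}/\1$(date +%Y-%m-%d)/" Input_file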

SQL Server Replace in MongoDB

I want to do a replace in a projection, like SQL Server's REPLACE. I'm pretty sure we can handle that in code, but I'm looking for a way to do it from the shell.
Here is what I have
db.OrderHistoryHeader.aggregate([
  {
    $project: {
      "_id": 0,
      "OrderNo": 1 // I want to do Replace(OrderNo,'XYZ','ABC')
    }
  }
],
{
  allowDiskUse: true
}).pretty();
There's no built-in operator for that currently, but you can use $indexOfBytes combined with $substr and $concat.
db.OrderHistoryHeader.aggregate([
  {
    $addFields: {
      index: { $indexOfBytes: [ "$OrderNo", "XYZ" ] }
    }
  },
  {
    $project: {
      OrderNo: {
        $concat: [
          { $substr: [ "$OrderNo", 0, "$index" ] },
          "ABC",
          { $substr: [ "$OrderNo", { $add: [ 3, "$index" ] }, -1 ] }
        ]
      }
    }
  },
  {
    $project: {
      index: 0
    }
  }
])
Here, 3 is the length of the text being replaced.
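Worth noting: MongoDB 4.4 added a built-in $replaceOne string aggregation operator, so on newer servers the whole pipeline collapses to a single stage (a sketch using the OP's collection and field names):
db.OrderHistoryHeader.aggregate([
  {
    $project: {
      "_id": 0,
      OrderNo: { $replaceOne: { input: "$OrderNo", find: "XYZ", replacement: "ABC" } }
    }
  }
])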
You can use the replaceOne method:
db.collection.replaceOne(filter, replacement, options)
From documentation:
Behavior
replaceOne() replaces the first matching document in the collection that matches the filter, using the replacement document.
upsert
If upsert: true and no documents match the filter, db.collection.replaceOne() creates a new document based on the replacement document.
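A minimal usage sketch (the filter and replacement values here are hypothetical); note that replaceOne swaps out the entire matched document, not a substring within a field:
db.OrderHistoryHeader.replaceOne(
  { OrderNo: "XYZ-1001" },   // filter: which document to replace (hypothetical value)
  { OrderNo: "ABC-1001" },   // full replacement document
  { upsert: false }          // do not insert if nothing matches
)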

merging s3 manifest files using jq

I have multiple s3 manifest files each corresponding to a date for a given date range. I am looking to merge all of the manifest files to generate a single manifest file, thus allowing me to perform a single Redshift copy.
manifest file 1:
{
  "entries": [
    {
      "url": "DFA/20161001/394007-OMD-Coles/dcm_account394007_activity_20160930_20161001_050403_294198927.csv.gz"
    }
  ]
}
manifest file 2:
{
  "entries": [
    {
      "url": "DFA/20161002/394007-OMD-Coles/dcm_account394007_activity_20161001_20161002_054043_294865863.csv.gz"
    }
  ]
}
I am looking for output like:
{
  "entries": [
    {
      "url": "DFA/20161001/394007-OMD-Coles/dcm_account394007_activity_20160930_20161001_050403_294198927.csv.gz"
    },
    {
      "url": "DFA/20161002/394007-OMD-Coles/dcm_account394007_activity_20161001_20161002_054043_294865863.csv.gz"
    }
  ]
}
I tried
jq -s '.[]' "manifest_file1.json" "manifest_file2.json"
and other suggestions posted on Stack Overflow, but I couldn't make it work.
Or, without resorting to reduce:
$ jq -n '{entries: [inputs.entries[]]}' manifest_file{1,2}.json
{
  "entries": [
    {
      "url": "DFA/20161001/394007-OMD-Coles/dcm_account394007_activity_20160930_20161001_050403_294198927.csv.gz"
    },
    {
      "url": "DFA/20161002/394007-OMD-Coles/dcm_account394007_activity_20161001_20161002_054043_294865863.csv.gz"
    }
  ]
}
Note that inputs was introduced in jq version 1.5. If your jq does not have inputs, you can use jq -s as follows:
$ jq -s '{entries: [.[].entries[]]}' manifest_file{1,2}.json
So if by "merge" you mean to combine the "entries" arrays into a single array by concatenating them, you could do this:
$ jq 'reduce inputs as $i (.; .entries += $i.entries)' manifest_file{1,2}.json
Which yields:
{
  "entries": [
    {
      "url": "DFA/20161001/394007-OMD-Coles/dcm_account394007_activity_20160930_20161001_050403_294198927.csv.gz"
    },
    {
      "url": "DFA/20161002/394007-OMD-Coles/dcm_account394007_activity_20161001_20161002_054043_294865863.csv.gz"
    }
  ]
}
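The same approach extends to any number of manifests; for example, if they all sit in one directory (the path here is hypothetical):
jq -n '{entries: [inputs.entries[]]}' manifests/*.json > merged_manifest.json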

nested select query in elasticsearch

I have to convert the following SQL query to an Elasticsearch query:
select * from index where observable not in (select observable from index where tags = 'whitelist')
I read that I should use a filter inside a not filter, but I don't understand how to do it.
Can anyone help me?
Thanks
EDIT:
I need to get all documents except those tagged 'whitelist', but I also need to check that none of the blacklist elements are contained in the whitelist.
Your SQL query can be simplified to this:
select * from index where tags not in ('whitelist')
As a result the "corresponding" ES query would be
curl -XPOST localhost:9200/index/_search -d '{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must_not": {
            "terms": {
              "tags": [
                "whitelist"
              ]
            }
          }
        }
      }
    }
  }
}'
or another using the not filter instead of bool/must_not:
curl -XPOST localhost:9200/index/_search -d '{
  "query": {
    "filtered": {
      "filter": {
        "not": {
          "terms": {
            "tags": [
              "whitelist"
            ]
          }
        }
      }
    }
  }
}'
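For completeness: the filtered and not constructs were deprecated in the 2.x series and removed in Elasticsearch 5.0. On current versions, the same filter is expressed with a top-level bool query (a sketch, assuming ES 5.x or later):
curl -XPOST localhost:9200/index/_search -H 'Content-Type: application/json' -d '{
  "query": {
    "bool": {
      "must_not": {
        "terms": {
          "tags": [ "whitelist" ]
        }
      }
    }
  }
}'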

ElasticSearch:filtering documents based on field length

I read a couple of similar problems on SO, but the suggested solutions don't work.
I want to find all documents where the word field is shorter than 8 characters.
(screenshot of the database omitted)
I tried to do this using the following query:
{
  "query": {
    "match_all": {}
  },
  "filter": {
    "script": {
      "script": "doc['word'].length < 5"
    }
  }
}
What am I doing wrong? Am I missing something?
Any field used in a script is loaded entirely into memory (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-scripting.html#_document_fields), so you may want to consider an alternative approach.
You can, e.g., use the regexp filter to just find terms of a certain length, with a pattern like .{0,4}.
Here's a runnable example you can play with: https://www.found.no/play/gist/2dcac474797b0b2b952a
#!/bin/bash
export ELASTICSEARCH_ENDPOINT="http://localhost:9200"
# Index documents
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_bulk?refresh=true" -d '
{"index":{"_index":"play","_type":"type"}}
{"word":"bar"}
{"index":{"_index":"play","_type":"type"}}
{"word":"barf"}
{"index":{"_index":"play","_type":"type"}}
{"word":"zip"}
'
# Do searches
# This will not match barf
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d '
{
"query": {
"filtered": {
"filter": {
"regexp": {
"word": {
"value": ".{0,3}"
}
}
}
}
}
}
'
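For the question's actual limit (words shorter than 8 characters), the same search with the pattern widened to .{0,7} should work:
# This will match only words shorter than 8 characters
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d '
{
  "query": {
    "filtered": {
      "filter": {
        "regexp": {
          "word": {
            "value": ".{0,7}"
          }
        }
      }
    }
  }
}
'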

How can I query elasticsearch for only one type of record?

I am issuing a query to elasticsearch and I am getting multiple record types. How do I limit the results to one type?
The following query will limit results to records with the type "your_type":
curl -XGET 'http://localhost:9200/_all/your_type/_search?q=your_query'
See http://www.elasticsearch.org/guide/reference/api/search/indices-types.html for more details.
You can also use the query DSL to filter results to a specific type, like this:
$ curl -XGET 'http://localhost:9200/_search' -d '{
  "query": {
    "filtered": {
      "filter": {
        "type": { "value": "my_type" }
      }
    }
  }
}
'
Update for version 6.1:
Type filter is now replaced by Type Query: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-type-query.html
You can use that in both Query and Filter contexts.
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            { "term": { "_type": "UserAudit" } },
            { "term": { "eventType": "REGISTRATION" } }
          ]
        }
      }
    }
  },
  "aggs": {
    "monthly": {
      "date_histogram": {
        "field": "timestamp",
        "interval": "1y"
      },
      "aggs": {
        "existing_visitor": {
          "terms": {
            "field": "existingGuest"
          }
        }
      }
    }
  }
}
"_type":"UserAudit" condition will look the records only specific to type
On version 2.3 you can query the _type field like:
{
  "query": {
    "terms": {
      "_type": [ "type_1", "type_2" ]
    }
  }
}
Or if you want to exclude a type:
{
  "query": {
    "bool": {
      "must_not": {
        "term": {
          "_type": "Hassan"
        }
      }
    }
  }
}