Filebeat - kafka output 'and' or 'regex' condition not working - filebeat

kafka topic filter filebeat
Hi, I am trying to filter all messages containing the indicator 'TEST01' from different log paths and send them to two different topics (topic1 and topic2) based on fields.type.
If fields.type is "first_test" then the messages should go to "topic1", otherwise to "topic2". Below is the configuration I tried, but the and: operator is not working. I would appreciate any help with writing composite conditions for a dynamic Kafka output in Filebeat. Thank you. https://www.elastic.co/guide/en/beats/filebeat/master/defining-processors.html#condition-equals
topics:
  - topic: "topic1"
    and:
      - when.contains:
          message: "TEST01"
      - equals:
          fields.type: "first_test"
  - topic: "topic2"
    and:
      - when.contains:
          message: "TEST01"
      - not:
          equals:
            fields.type: "first_test"

I got this to work without the and: operator, using the configuration below with 'contains' and different fields.type values. Answers on how to use and: or regexp in the context of my question would still be helpful. Thanks.
topics:
  - topic: "topic1"
    when.contains:
      message: "TEST01"
      fields.type: "first_test"
  - topic: "topic2"
    when.contains:
      message: "TEST01"
      fields.type: "second_test"
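For reference, here is a sketch of how a composite condition could be written per topic, based on the conditions syntax in the linked documentation (the when/and nesting and the regexp pattern are my assumptions, not a configuration tested against this setup):

topics:
  - topic: "topic1"
    when:
      and:
        - contains:
            message: "TEST01"
        - equals:
            fields.type: "first_test"
  - topic: "topic2"
    when:
      and:
        # regexp condition shown for illustration; the pattern is assumed
        - regexp:
            message: "TEST0[0-9]"
        - not:
            equals:
              fields.type: "first_test"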

Related

using multiple conditions in iam policy

I am trying to use two conditions, StringEquals and ArnEquals. The StringEquals block works, whereas the ArnEquals block fails even after I supplied my correct federated user, so I am unable to start a build. Here is what I am trying; I am getting an error that the user with the xxxx ARN is not allowed to perform the StartBuild action.
- Effect: Allow
  Action:
    - codebuild:StartBuild
    - codebuild:StopBuild
    - codeBuild:RetryBuild
  Resource:
    - !Sub arn:aws:codebuild:${AWS::Region}:${AWS::AccountId}:project/*
  Condition:
    StringEquals:
      aws:ResourceTag/Application:
        - yyy
        - xxx
      aws:ResourceTag/TargetEnvironmentType:
        - dev
    ArnEquals:
      aws:SourceArn:
        - !Sub arn:aws:sts::${AWS::AccountId}:federated-user/Builder-role/xxx#gmial.com
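One direction that may be worth checking (my assumption, not something confirmed in this thread): aws:SourceArn normally identifies a resource making a service-to-service request, while the ARN of the calling principal is evaluated against aws:PrincipalArn, so a condition along these lines could be what the ArnEquals block needs:

Condition:
  ArnEquals:
    aws:PrincipalArn:
      # illustrative wildcard; substitute the actual federated-user ARN
      - !Sub arn:aws:sts::${AWS::AccountId}:federated-user/*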

Calling Karate feature file returns response object including multiple copies of previous response object of parent scenario

I am investigating an exponential increase in Java heap size when executing complex scenarios, especially with multiple reusable scenarios. This is my attempt to troubleshoot the issue with a simple example and a possible explanation of the JVM heap usage.
Environment: Karate 1.1.0.RC4 | JDK 14 | Maven 3.6.3
Example: download the project, extract it, and execute the Maven command as per the README.
Observation: as per the following example, if we call the same scenario multiple times, the response object grows exponentially, since it includes the response from the previously called scenario along with copies of the global variables.
#unexpected
Scenario: Not over-writing nested variable
* def response = call read('classpath:examples/library.feature#getLibraryData')
* string out = response
* def resp1 = response.randomTag
* karate.log('FIRST RESPONSE SIZE = ', out.length)
* def response = call read('classpath:examples/library.feature#getLibraryData')
* string out = response
* def resp2 = response.randomTag
* karate.log('SECOND RESPONSE SIZE = ', out.length)
Output:
10:26:23.863 [main] INFO c.intuit.karate.core.FeatureRuntime - scenario called at line: 9 by tag: getLibraryData
10:26:23.875 [main] INFO c.intuit.karate.core.FeatureRuntime - scenario called at line: 14 by tag: libraryData
10:26:23.885 [main] INFO com.intuit.karate - FIRST RESPONSE SIZE = 331
10:26:23.885 [main] INFO c.intuit.karate.core.FeatureRuntime - scenario called at line: 9 by tag: getLibraryData
10:26:23.894 [main] INFO c.intuit.karate.core.FeatureRuntime - scenario called at line: 14 by tag: libraryData
10:26:23.974 [main] INFO com.intuit.karate - SECOND RESPONSE SIZE = 1783
10:26:23.974 [main] INFO c.intuit.karate.core.FeatureRuntime - scenario called at line: 9 by tag: getLibraryData
10:26:23.974 [main] INFO c.intuit.karate.core.FeatureRuntime - scenario called at line: 14 by tag: libraryData
10:26:23.988 [main] INFO com.intuit.karate - THIRD RESPONSE SIZE = 8009
Do we really need to include the response and global variables in the response of a called feature file (non-shared scope)?
When we read a large JSON file and call multiple reusable scenario files, a copy of the read JSON data gets added to the response object each time. Is there a way to avoid this behavior?
Is there a better way to script complex tests using reusable scenarios without having multiple copies of the same variables?
Okay, can you look at this issue:
https://github.com/intuit/karate/issues/1675
I agree we can optimize the response and global variables. It would be great if you could contribute code.
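One possible mitigation, purely as an assumption on my part and not something suggested in this thread: when the called feature's result is effectively static for the whole run, karate.callSingle() caches it, so the payload is built and copied only once instead of growing with every call (this assumes the feature can be called as a whole rather than by tag):

# sketch: call the feature once and reuse the cached result
* def libraryResult = karate.callSingle('classpath:examples/library.feature')
* def resp1 = libraryResult.randomTag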

Delta Table : org.apache.spark.sql.catalyst.parser.ParseException: mismatched input 'FROM'

I am trying to run this query on EMR / EMR Notebooks (Spark with Scala):
SELECT max(version), max(timestamp) FROM (DESCRIBE HISTORY delta.`s3://a/b/c/d`)
But I am getting the error from the title (org.apache.spark.sql.catalyst.parser.ParseException: mismatched input 'FROM'). The same query works fine on Databricks.
Another doubt I have: why does the colour of the S3 location change after the //?
So I tried to break up the above query and run only the DESCRIBE HISTORY part, and for some reason it says:
Error Log -
An error was encountered:
org.apache.spark.sql.AnalysisException: Table or view not found: HISTORY;
at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:47)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveRelations$$lookupTableFromCatalog(Analyzer.scala:835)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.resolveRelation(Analyzer.scala:787)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$8.applyOrElse(Analyzer.scala:817)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$8.applyOrElse(Analyzer.scala:810)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsUp$1$$anonfun$apply$1.apply(AnalysisHelper.scala:90)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsUp$1$$anonfun$apply$1.apply(AnalysisHelper.scala:90)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:71)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsUp$1.apply(AnalysisHelper.scala:89)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsUp$1.apply(AnalysisHelper.scala:86)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:194)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.resolveOperatorsUp(AnalysisHelper.scala:86)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUp(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:810)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:756)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1$$anonfun$2.apply(RuleExecutor.scala:92)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1$$anonfun$2.apply(RuleExecutor.scala:92)
at org.apache.spark.sql.execution.QueryExecutionMetrics$.withMetrics(QueryExecutionMetrics.scala:141)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:91)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:88)
at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:124)
at scala.collection.immutable.List.foldLeft(List.scala:84)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:88)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:80)
at scala.collection.immutable.List.foreach(List.scala:392)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:80)
at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:164)
at org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$execute$1.apply(Analyzer.scala:156)
at org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$execute$1.apply(Analyzer.scala:156)
at org.apache.spark.sql.catalyst.analysis.AnalysisContext$.withLocalMetrics(Analyzer.scala:104)
at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:155)
at org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$executeAndCheck$1.apply(Analyzer.scala:126)
at org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$executeAndCheck$1.apply(Analyzer.scala:125)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:201)
at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:125)
at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:76)
at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:74)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:66)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:80)
at org.apache.spark.sql.SparkSession.table(SparkSession.scala:630)
at org.apache.spark.sql.execution.command.DescribeColumnCommand.run(tables.scala:714)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196)
at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196)
at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3391)
at org.apache.spark.sql.execution.SQLExecution$.org$apache$spark$sql$execution$SQLExecution$$executeQuery$1(SQLExecution.scala:83)
at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1$$anonfun$apply$1.apply(SQLExecution.scala:94)
at org.apache.spark.sql.execution.QueryExecutionMetrics$.withMetrics(QueryExecutionMetrics.scala:141)
at org.apache.spark.sql.execution.SQLExecution$.org$apache$spark$sql$execution$SQLExecution$$withMetrics(SQLExecution.scala:178)
at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:93)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:200)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:92)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$withAction(Dataset.scala:3390)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:196)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:81)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:644)
... 50 elided
Caused by: org.apache.spark.sql.catalyst.analysis.NoSuchTableException: Table or view 'history' not found in database 'default';
at org.apache.spark.sql.hive.client.HiveClient$$anonfun$getTable$1.apply(HiveClient.scala:81)
at org.apache.spark.sql.hive.client.HiveClient$$anonfun$getTable$1.apply(HiveClient.scala:81)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.hive.client.HiveClient$class.getTable(HiveClient.scala:81)
at org.apache.spark.sql.hive.client.HiveClientImpl.getTable(HiveClientImpl.scala:84)
at org.apache.spark.sql.hive.HiveExternalCatalog.getRawTable(HiveExternalCatalog.scala:141)
at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$getTable$1.apply(HiveExternalCatalog.scala:723)
at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$getTable$1.apply(HiveExternalCatalog.scala:723)
at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:98)
at org.apache.spark.sql.hive.HiveExternalCatalog.getTable(HiveExternalCatalog.scala:722)
at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.getTable(ExternalCatalogWithListener.scala:138)
at org.apache.spark.sql.catalyst.catalog.SessionCatalog.lookupRelation(SessionCatalog.scala:706)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveRelations$$lookupTableFromCatalog(Analyzer.scala:832)
UPDATE (18-Feb-2021): what I have tried so far.
Query using spark.sql:
spark.sql("SELECT max(version), max(timestamp) FROM (DESCRIBE HISTORY delta.`s3://a/b/c/d`)")
But this didn't work; same error.
Creating the Spark session with
spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension
and spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog
also throws the same error.
UPDATE 2 (18-Feb-2021): trying the approach mentioned by @alex, using PySpark. It worked partly, but not completely.
Thanks in advance.
Per the documentation, to get support for DESCRIBE HISTORY you need to configure the Spark SQL extensions and catalog by passing two properties (see the docs):
spark.sql.extensions set to io.delta.sql.DeltaSparkSessionExtension
spark.sql.catalog.spark_catalog set to org.apache.spark.sql.delta.catalog.DeltaCatalog
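A minimal PySpark sketch of doing that when building the session (assuming Spark 3.x with Delta Lake 0.7.0 or later; the table path is the one from the question):

from pyspark.sql import SparkSession

# Configure the Delta extension and catalog before the session is created.
spark = (SparkSession.builder
         .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
         .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
         .getOrCreate())

# DESCRIBE HISTORY should then parse and resolve against the Delta table path.
spark.sql("DESCRIBE HISTORY delta.`s3://a/b/c/d`").show()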
Update:
For Spark 2.4.x, Delta 0.6.1 should be used, and its documentation has the following code snippet to activate the extensions:
spark.sparkContext._jvm.io.delta.sql.DeltaSparkSessionExtension() \
    .apply(spark._jsparkSession.extensions())
spark = SparkSession(spark.sparkContext, spark._jsparkSession.cloneSession())

Cakephp 4 Linking 2 existing entities with belongstomany association

First of all, thank you for your patience. I am getting to grips with PHP coding after a long while. I have finished the famous Cake tutorial and am now continuing with what I know and creating new solutions.
I am creating a voting system. When the current user votes on a suggestion (same as an article), a new entry needs to be saved in a related cross table between suggestions and users. I thought the public function link() was the best way to do this. See the function in the ArticlesController below. However, I get the error message "Call to undefined method Authentication\Identity::isNew()". This suggests I have not taken care of some authorization around this call. Can anyone tell me what I missed, or whether there is a better way to do this?
Note: I know it looks strange that I am working through the Suggestions table when I already have my one suggestion, but I could not find a way to link one entity directly to an associated entity (i.e. I found no link function on entities). It would be nice to hear whether this can be done in a leaner way.
public function upvote($id = null)
{
    $suggestion = $this->Suggestions->get($id);
    $this->Authorization->skipAuthorization();
    $user = $this->request->getAttribute('identity');
    // Create a link between the current suggestion and the current user in the suggestions_users table
    $suggestionsTable = $this->getTableLocator()->get('Suggestions');
    $suggestionsTable->Users->link($suggestion, [$user]);
    if ($voteTable->save($vote)) {
        // The foreign key value was set automatically.
        $this->Flash->success(__('Your vote has been cast.'));
    } else {
        $this->Flash->error(__('The vote could not be cast. Apologies for the technical issue.'));
    }
}
Stack trace:
2020-11-12 13:46:52 Error: [Error] Call to undefined method Authentication\Identity::isNew() in C:\wamp64\www\cake\vendor\cakephp\authorization\src\IdentityDecorator.php on line 125
Stack Trace:
- C:\wamp64\www\cake\vendor\cakephp\cakephp\src\ORM\Association\BelongsToMany.php:1323
- C:\wamp64\www\cake\vendor\cakephp\cakephp\src\ORM\Association\BelongsToMany.php:864
- C:\wamp64\www\cake\src\Controller\SuggestionsController.php:143
- C:\wamp64\www\cake\vendor\cakephp\cakephp\src\Controller\Controller.php:529
- C:\wamp64\www\cake\vendor\cakephp\cakephp\src\Controller\ControllerFactory.php:79
- C:\wamp64\www\cake\vendor\cakephp\cakephp\src\Http\BaseApplication.php:251
- C:\wamp64\www\cake\vendor\cakephp\cakephp\src\Http\Runner.php:77
- C:\wamp64\www\cake\vendor\cakephp\authorization\src\Middleware\AuthorizationMiddleware.php:129
- C:\wamp64\www\cake\vendor\cakephp\cakephp\src\Http\Runner.php:73
- C:\wamp64\www\cake\vendor\cakephp\authentication\src\Middleware\AuthenticationMiddleware.php:122
- C:\wamp64\www\cake\vendor\cakephp\cakephp\src\Http\Runner.php:73
- C:\wamp64\www\cake\vendor\cakephp\cakephp\src\Routing\Middleware\RoutingMiddleware.php:166
- C:\wamp64\www\cake\vendor\cakephp\cakephp\src\Http\Runner.php:73
- C:\wamp64\www\cake\vendor\cakephp\cakephp\src\Http\Middleware\CsrfProtectionMiddleware.php:156
- C:\wamp64\www\cake\vendor\cakephp\cakephp\src\Http\Runner.php:73
- C:\wamp64\www\cake\vendor\cakephp\cakephp\src\Http\Middleware\BodyParserMiddleware.php:159
- C:\wamp64\www\cake\vendor\cakephp\cakephp\src\Http\Runner.php:73
- C:\wamp64\www\cake\vendor\cakephp\cakephp\src\Routing\Middleware\RoutingMiddleware.php:166
- C:\wamp64\www\cake\vendor\cakephp\cakephp\src\Http\Runner.php:73
- C:\wamp64\www\cake\vendor\cakephp\cakephp\src\Routing\Middleware\AssetMiddleware.php:68
- C:\wamp64\www\cake\vendor\cakephp\cakephp\src\Http\Runner.php:73
- C:\wamp64\www\cake\vendor\cakephp\cakephp\src\Error\Middleware\ErrorHandlerMiddleware.php:121
- C:\wamp64\www\cake\vendor\cakephp\cakephp\src\Http\Runner.php:73
- C:\wamp64\www\cake\vendor\cakephp\debug_kit\src\Middleware\DebugKitMiddleware.php:60
- C:\wamp64\www\cake\vendor\cakephp\cakephp\src\Http\Runner.php:73
- C:\wamp64\www\cake\vendor\cakephp\cakephp\src\Http\Runner.php:58
- C:\wamp64\www\cake\vendor\cakephp\cakephp\src\Http\Server.php:90
- C:\wamp64\www\cake\webroot\index.php:40
Request URL: /suggestions/upvote/1
Referer URL: http://localhost:8000/cake/suggestions
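A possible direction (my assumption; not confirmed in this thread): link() expects actual entities, and the identity attribute is a decorator around the User entity (hence the forwarded isNew() call failing), so unwrapping it before linking may avoid the error:

// Sketch only: getOriginalData() returns the underlying entity wrapped by the identity.
$user = $this->request->getAttribute('identity')->getOriginalData();
// Link the current suggestion to that user through the belongsToMany association.
$this->Suggestions->Users->link($suggestion, [$user]);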

how to do count in flink sql

I'd like to do count(0) in Flink SQL, but it gives an exception like
org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: SQL parse failed. UDT in DDL is not supported yet.
I don't know whether anything is wrong; I expect the output to work fine.
INSERT INTO request_join
SELECT requestId, count(0) FROM requests
GROUP BY TUMBLE(rowtime, INTERVAL '1' HOUR), requestId;
The schema of the table is here
name: request_join
schema:
  - '`requestId` VARCHAR'
  - '`count` LONG'
properties:
  'connector.type': 'kafka'
  'connector.version': 'universal'
  'connector.topic': 'request_join_test'
  'connector.startup-mode': 'latest-offset'
  'connector.properties.0.key': 'zookeeper.connect'
  'connector.properties.0.value': '10.XXXXXXXXX'
  'connector.properties.1.key': 'bootstrap.servers'
  'connector.properties.1.value': '10.XXXXXXXXX'
  'connector.properties.2.key': 'group.id'
  'connector.properties.2.value': 'request_join_test'
  'update-mode': 'append'
  'format.type': 'json'
  'format.json-schema': '{type: "object", properties: {requestId: {type: "string"}, count: {type: "number"}}}'
I didn't find anything wrong, but it just doesn't work. If I do not count and delete the count field from the schema it works well, so I'm sure the SQL itself is good.
I checked the Flink SQL docs; they say some functions are not supported in DDL, so doesn't Flink support count? I can see from examples that it supports SUM very well.
There is something wrong with your schema:
schema:
  - 'requestId VARCHAR'
  - 'count BIGINT'
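To spell out the fix (my reading of it, not stated explicitly in the answer): LONG is not a Flink SQL type, so when the environment-file schema is translated to DDL the unknown type is treated as a user-defined type, which triggers the "UDT in DDL is not supported yet" error; BIGINT is the corresponding SQL type. Keeping the backticks around count (a reserved word), as in the original question, the schema block would be:

schema:
  - '`requestId` VARCHAR'
  - '`count` BIGINT'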