No Encoder found for play.api.libs.json.JsValue - dataframe

I have case classes like these
case class PrimeraRemoteCopyConfig(
  links: Option[Vector[JsObject]] = None,
  status: Option[Vector[JsObject]] = None,
  targets: Option[Vector[JsObject]] = None,
  groups: Option[Vector[JsObject]] = None,
  groupTargets: Option[Vector[JsObject]] = None,
  groupVolumes: Option[Vector[JsObject]] = None)

object PrimeraRemoteCopyConfig {
  implicit val _format = Json.format[PrimeraRemoteCopyConfig]
}

case class PrimeraConfig(
  systemUid: String,
  tenantId: String,
  systemWWN: String,
  remoteCopyConfig: Option[PrimeraRemoteCopyConfig] = None)

object PrimeraConfig {
  implicit val _format = Json.format[PrimeraConfig]
}
And I have a Spark Dataset that uses state management via flatMapGroupsWithState.
However, I am getting:
22/05/10 17:07:48 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.0.199, 53277, None)
Exception in thread "main" java.lang.UnsupportedOperationException: No Encoder found for play.api.libs.json.JsValue
- map value class: "play.api.libs.json.JsValue"
- field (class: "scala.collection.Map", name: "underlying")
- array element class: "play.api.libs.json.JsObject"
- option value class: "scala.collection.immutable.Vector"
- field (class: "scala.Option", name: "links")
- option value class: "model.PrimeraRemoteCopyConfig"
- field (class: "scala.Option", name: "remoteCopyConfig")
- root class: "model.PrimeraConfig"
at org.apache.spark.sql.errors.QueryExecutionErrors$.cannotFindEncoderForTypeError(QueryExecutionErrors.scala:1000)
I tried defining an encoder for JsObject using Encoders.product[JsObject], but it does not work. What am I doing wrong here?
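For context, Spark's reflection-based encoders only handle case-class (Product) fields built from Catalyst-supported types; JsObject wraps a Map[String, JsValue], and JsValue is an open sealed-trait hierarchy Spark cannot map to a column type, which is why Encoders.product does not help. One possible workaround (my own sketch, not something from the original post) is to fall back to a Kryo-based encoder for the whole case class, at the cost of the column becoming a single opaque binary value:

import org.apache.spark.sql.{Encoder, Encoders}
import model.PrimeraConfig

object PrimeraEncoders {
  // Kryo serializes the entire object graph, JsObject fields included, so Spark
  // never has to map JsValue onto a Catalyst type. The trade-off is that the
  // Dataset column is stored as opaque bytes and cannot be queried column-wise.
  implicit val primeraConfigEncoder: Encoder[PrimeraConfig] = Encoders.kryo[PrimeraConfig]
}

With flatMapGroupsWithState the same requirement applies to the state and output types, so whichever route is taken (Kryo, or replacing the Vector[JsObject] fields with pre-serialized JSON strings) has to cover those types as well.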

Related

Make Pydantic chuck error for wrong argument names

Suppose I have the following class:
class ModelConfig(pydantic.BaseModel):
    name: str = "bert"
If I were to instantiate it with model_config = ModelConfig(name2="hello"), it simply ignores that there is no name2 field and just keeps name="bert". Is there a way to raise an error for an unknown argument in Pydantic?
You can do this by setting extra to forbid in the model config.
For example:
class ModelConfig(pydantic.BaseModel, extra=pydantic.Extra.forbid):
    name: str = "bert"
Passing model_config = ModelConfig(name2="hello") will now raise an error.

Create CloudFormation stack from AWS Lambda function, passing API Gateway parameters

I am unable to get parameters in my Lambda function. If I hard-code the parameter values in the Lambda it works fine, but when I remove the parameter values from the Lambda function and invoke it from API Gateway or a Lambda test event, it only processes the default parameter values. Please help.
My Lambda function is:
import boto3
import time
import json

datetime = time.strftime("%Y%m%d%H%M%S")
stackname = 'myec2'

client = boto3.client('cloudformation')
response = client.create_stack(
    StackName=(stackname + '-' + datetime),
    TemplateURL='https://testnaeem.s3.amazonaws.com/ec2tags.yaml',
    Parameters=[
        {
            "ParameterKey": "MyInstanceName",
            "ParameterValue": " "
        },
        {
            "ParameterKey": "MyInstanceType",
            "ParameterValue": " "
        }
    ]
)

def lambda_handler(event, context):
    return(response)
My CloudFormation template is:
---
Parameters:
  MyInstanceType:
    Description: Instance type description
    Type: String
  MyInstanceName:
    Description: Instance type description
    Type: String
Resources:
  MyInstance:
    Type: AWS::EC2::Instance
    Properties:
      AvailabilityZone: us-east-1a
      ImageId: ami-047a51fa27710816e
      InstanceType: !Ref MyInstanceType
      KeyName: miankeyp
      Tags:
        - Key: Name
          Value: !Ref MyInstanceName
        - Key: app
          Value: demo
Please help: what changes are required in the Lambda function?
My test values are:
{
    "MyInstanceName": "demott",
    "MyInstanceType": "t2.micro"
}
I modified the code of your lambda function. Please check comments in the code for clarification:
import boto3
import time
import json

datetime = time.strftime("%Y%m%d%H%M%S")
stackname = 'myec2'

client = boto3.client('cloudformation')

def lambda_handler(event, context):
    print(event)  # to check what your event actually is;
    # it will be printed out in CloudWatch Logs for your function.
    # You have to check what the event actually looks like
    # and adjust event['MyInstanceName'] and event['MyInstanceType']
    # in the following code.
    response = client.create_stack(
        StackName=(stackname + '-' + datetime),
        TemplateURL='https://testnaeem.s3.amazonaws.com/ec2tags.yaml',
        Parameters=[
            {
                "ParameterKey": "MyInstanceName",
                "ParameterValue": event['MyInstanceName']
            },
            {
                "ParameterKey": "MyInstanceType",
                "ParameterValue": event['MyInstanceType']
            }
        ]
    )
    return(response)
By the way, such a function behind API Gateway can spin up a lot of EC2 instances very quickly, so be aware of this.

How to define an instance depending on some variable?

I'll give the code in Python, but the language doesn't really matter.
I have a module argument_parser with a dictionary and a class:
FORMATS = {
    'JSON': 'json',
    'XML': 'xml',
}

class ArgumentParser:
    # some methods

    def parse(self):
        """Returns an 'XML' or 'JSON' string."""
        return format
and a module presenter with the presenters:
class JSONPresenter:
    # some magic

class XMLPresenter:
    # some magic
The problem is in dependency injection:
argument_parser = ArgumentParser()
format = argument_parser.parse()

if format == argument_parser.FORMATS['JSON']:
    presenter = JSONFilePresenter()
elif format == argument_parser.FORMATS['XML']:
    presenter = XMLFilePresenter()
The if-elif construction is ugly. If I want to add other formats, I'll have to add more elif branches.
I could define the type of presenter in the ArgumentParser class, but I think that is semantically wrong -- it is not this class's responsibility. What should I do to do it right?
PRESENTERS = {
    'json': JSONPresenter(),
    'xml': XMLPresenter(),
}

argument_parser = ArgumentParser()
fmt = argument_parser.parse()
presenter = PRESENTERS[fmt]
Or, if you need a new presenter every time:
PRESENTERS = {
    'json': lambda: JSONPresenter(),
    'xml': lambda: XMLPresenter(),
}

argument_parser = ArgumentParser()
fmt = argument_parser.parse()
presenter = PRESENTERS[fmt]()

ScalikeJDBC: Connection pool is not yet initialized.(name:'default)

I'm playing with the ScalikeJDBC library and want to retrieve data from a PostgreSQL database. The error I get is quite strange to me. Even if I configure the connection pool manually:
val poolSettings = new ConnectionPoolSettings(initialSize = 100, maxSize = 100)
ConnectionPool.singleton("jdbc:postgresql://localhost:5432/test", "user", "pass", poolSettings)
I still see the error. Here is my DAO:
class CustomerDAO {
  case class Customer(id: Long, firstname: String, lastname: String)

  object Customer extends SQLSyntaxSupport[Customer]

  val c = Customer.syntax("c")

  def findById(id: Long)(implicit session: DBSession = Customer.autoSession) =
    withSQL {
      select.from(Customer as c)
    }.map(rs =>
      Customer(
        rs.int("id"),
        rs.string("firstname"),
        rs.string("lastname")
      )
    ).single.apply()
}
The App:
object JdbcTest extends App {
  val dao = new CustomerDAO
  val res: Option[dao.Customer] = dao.findById(2)
}
My application.conf file
# PostgreSQL
db.default.driver = "org.postgresql.Driver"
db.default.url = "jdbc:postgresql://localhost:5432/test"
db.default.user = "user"
db.default.password = "pass"
# Connection Pool settings
db.default.poolInitialSize = 5
db.default.poolMaxSize = 7
db.default.poolConnectionTimeoutMillis = 1000
The error:
Exception in thread "main" java.lang.IllegalStateException: Connection pool is not yet initialized.(name:'default)
at scalikejdbc.ConnectionPool$$anonfun$get$1.apply(ConnectionPool.scala:57)
at scalikejdbc.ConnectionPool$$anonfun$get$1.apply(ConnectionPool.scala:55)
at scala.Option.getOrElse(Option.scala:120)
at scalikejdbc.ConnectionPool$.get(ConnectionPool.scala:55)
at scalikejdbc.ConnectionPool$.apply(ConnectionPool.scala:46)
at scalikejdbc.NamedDB.connectionPool(NamedDB.scala:20)
at scalikejdbc.NamedDB.db$lzycompute(NamedDB.scala:32)
What did I miss?
To load application.conf, scalikejdbc-config's DBs.setupAll() should be called in advance.
http://scalikejdbc.org/documentation/configuration.html#scalikejdbc-config
https://github.com/scalikejdbc/hello-scalikejdbc/blob/9d21ec7ddacc76977a7d41aa33c800d89fedc7b6/test/settings/DBSettings.scala#L3-L22
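For reference, a minimal sketch (not part of the original answer) of how the JdbcTest app above could initialize the pool from application.conf, assuming the scalikejdbc-config artifact is on the classpath:

import scalikejdbc.config._

object JdbcTest extends App {
  DBs.setupAll()   // reads the db.default.* settings from application.conf and initializes the pool
  try {
    val dao = new CustomerDAO
    val res: Option[dao.Customer] = dao.findById(2)
    println(res)
  } finally {
    DBs.closeAll() // shut the pool down when the app is done
  }
}

setupAll() has to run before the first query; otherwise the "Connection pool is not yet initialized" error above is exactly what you get.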
In my case, I had omitted play.modules.enabled += "scalikejdbc.PlayModule" in conf/application.conf while using the ScalikeJDBC Play support...

Piglatin jodatime error with StanfordCoreNLP

I am trying to create a Pig UDF that extracts the locations mentioned in a tweet using the Stanford CoreNLP package interfaced through the sista Scala API. It works fine when run locally with 'sbt run', but throws a "java.lang.NoSuchMethodError" exception when called from Pig:
Loading default properties from tagger edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger
Reading POS tagger model from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger
Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz
2013-06-14 10:47:54,952 [communication thread] INFO org.apache.hadoop.mapred.LocalJobRunner - reduce > reduce done [7.5 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ...
2013-06-14 10:48:02,108 [Low Memory Detector] INFO org.apache.pig.impl.util.SpillableMemoryManager - first memory handler call - Collection threshold init = 18546688(18112K) used = 358671232(350264K) committed = 366542848(357952K) max = 699072512(682688K)
done [5.0 sec]. Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ...
2013-06-14 10:48:10,522 [Low Memory Detector] INFO org.apache.pig.impl.util.SpillableMemoryManager - first memory handler call - Usage threshold init = 18546688(18112K) used = 590012928(576184K) committed = 597786624(583776K) max = 699072512(682688K)
done [5.6 sec].
2013-06-14 10:48:11,469 [Thread-11] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
java.lang.NoSuchMethodError: org.joda.time.Duration.compareTo(Lorg/joda/time/ReadableDuration;)I
    at edu.stanford.nlp.time.SUTime$Duration.compareTo(SUTime.java:3406)
    at edu.stanford.nlp.time.SUTime$Duration.max(SUTime.java:3488)
    at edu.stanford.nlp.time.SUTime$Time.difference(SUTime.java:1308)
    at edu.stanford.nlp.time.SUTime$Range.<init>(SUTime.java:3793)
    at edu.stanford.nlp.time.SUTime.<clinit>(SUTime.java:570)
Here is the relevant code:
object CountryTokenizer {
  def tokenize(text: String): String = {
    val locations = TweetEntityExtractor.NERLocationFilter(text)
    println(locations)
    locations.map(x => Cities.country(x)).flatten.mkString(" ")
  }
}

class PigCountryTokenizer extends EvalFunc[String] {
  override def exec(tuple: Tuple): java.lang.String = {
    val text: java.lang.String = Util.cast[java.lang.String](tuple.get(0))
    CountryTokenizer.tokenize(text)
  }
}

object TweetEntityExtractor {
  val processor: Processor = new CoreNLPProcessor()

  def NERLocationFilter(text: String): List[String] = {
    val doc = processor.mkDocument(text)
    processor.tagPartsOfSpeech(doc)
    processor.lemmatize(doc)
    processor.recognizeNamedEntities(doc)
    val locations = doc.sentences.map(sentence => {
      val entities = sentence.entities.map(List.fromArray(_)) match {
        case Some(l) => l
        case _ => List()
      }
      val words = List.fromArray(sentence.words)
      (words zip entities).filter(x => {
        x._1 != "" && x._2 == "LOCATION"
      }).map(_._1)
    })
    List.fromArray(locations).flatten
  }
}
I am using sbt-assembly to construct a fat-jar, and so the joda-time jar file should be accessible. What is going on?
Pig ships with its own version of joda-time (1.6), which is incompatible with 2.x.
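One common way to deal with this kind of conflict (a sketch of my own, not part of the original answer) is to shade joda-time inside the fat jar so the UDF uses its own renamed copy rather than whichever version Pig puts first on the classpath. This assumes a version of sbt-assembly new enough to support shade rules (0.14 or later); the target package name is arbitrary:

// build.sbt
assemblyShadeRules in assembly := Seq(
  // Rewrite every reference to org.joda.time inside the fat jar to a private copy,
  // so Pig's bundled joda-time 1.6 can no longer shadow the 2.x methods SUTime calls.
  ShadeRule.rename("org.joda.time.**" -> "shaded.org.joda.time.@1").inAll
)

Shading keeps the two versions from colliding without having to control the ordering of Pig's classpath.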