Import an xx.sql file and execute it using Ebean

Is there any way to execute SQL statements read directly from SQL files (xx.sql) in Ebean?
For example, if I had a SQL file containing several SQL statements (with the values already written in the file), is there any way to execute that file using Ebean?

You have at least two options out of the box:
Play evolutions are meant for updating the DB schema, so you can also use them for inserting initial data (as long as the data is flat and does not contain relations to objects that have not been created yet). A sample evolution for MySQL:
# --- !Ups
INSERT INTO your_table (some_field) VALUES ('New value');
# --- !Downs
DELETE FROM your_table WHERE some_field = 'New value';
Use the Global object and insert initial data in the usual Ebean way:
public void onStart(Application app) {
    if (YourModel.find.findRowCount() == 0) {
        YourModel newItem = new YourModel();
        newItem.someField = "New value";
        newItem.save();
        YourModel newItem2 = new YourModel();
        // etc....
    }
}
For the second approach, you can check how the Zentask sample reads a YAML file holding initial data from its Global object (the sample file is placed in the conf directory).
Edit:
Take a closer look at initial-data.yml: there are also relations between tasks and projects, so they have fixed id values. You need to do the same in your YAML (a loading sketch follows the YAML below):
projects:
    - !!models.Project
        id:         1
        name:       Play 2.0
        folder:     Play framework

tasks:
    - !!models.Task
        title:      Fix the documentation
        done:       false
        folder:     Todo
        project:    !!models.Project
                        id: 1
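For reference, a minimal Global along the lines of the Zentask sample, loading a YAML file like the one above, could look like this (a sketch only; the entity classes come from the YAML above, and conf/initial-data.yml is assumed as in the sample):

import java.util.List;
import java.util.Map;

import com.avaje.ebean.Ebean;

import play.Application;
import play.GlobalSettings;
import play.libs.Yaml;

import models.Project;

public class Global extends GlobalSettings {

    @Override
    public void onStart(Application app) {
        // Seed the database only while it is still empty
        if (Ebean.find(Project.class).findRowCount() == 0) {
            // conf/initial-data.yml is parsed into a map of entity lists
            @SuppressWarnings("unchecked")
            Map<String, List<Object>> all =
                    (Map<String, List<Object>>) Yaml.load("initial-data.yml");

            // Save projects first so their fixed ids exist before tasks reference them
            Ebean.save(all.get("projects"));
            Ebean.save(all.get("tasks"));
        }
    }
}

Keep the class as Global in the default package (or point application.global in application.conf at it) and the data will be inserted on application start.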

Related

TypeORM View Entity synchronization (creation) order problems

Using TypeORM, I'm trying to create ViewEntities that depend on each other, for example "View B" selects from "View A". No matter what I do, I can't get the ViewEntities to be created in dependency order. Sometimes "View B" is created first, and the synchronization process fails because it can't find "View A", since it hasn't been created yet.
The error:
QueryFailedError: relation "public.course_item_view" does not exist
Solutions I have tried:
Renaming the ViewEntity files (to check if the system uses ABC ordering on file names)
Renaming the ViewEntity classes (to check if the system uses ABC ordering on class names)
Renaming the ViewEntity's "name" property (to check if the system uses ABC ordering on the final SQL view names)
Reordering the ViewEntity class references in the "entities: []" array of the connection options
Reordering the ViewEntity class imports in the file where I declare the connection options
Removing/Adding the file again (to check if the system uses Creation Date based ordering)
Modifying the files (to check if the system uses Modification Date based ordering)
All of these failed. I cannot figure out how the system determines the order in which the views are created.
Any help would be GREATLY appreciated!!
Expected Behavior
The views should be created in an order that is either specified by a property inside the views, resolved automatically from the SELECT statements (a dependency array), or based on the order in which I reference the ViewEntities in the "entities: []" array of the connection options; any other solution that lets one determine the order in which the ViewEntities are created would also be perfect.
Actual Behavior
The ViewEntities are created in an order that I honestly can't understand. Sometimes a dependent ViewEntity is created before the ViewEntity it depends on. This causes the synchronization to fail.
File name: "CourseItemView" which resolves to: "course_item_view"
@ViewEntity({
    expression: `
        SELECT
            "uvcv"."userId",
            "uvcv"."courseId",
            "uvcv"."videoId",
            CAST (null AS integer) AS "examId",
            "uvcv"."isComplete" AS "isComplete"
        FROM public.video_completed_view AS "uvcv"

        UNION ALL

        SELECT
            "uecv"."userId",
            "uecv"."courseId",
            CAST (null AS integer) AS "videoId",
            "uecv"."examId",
            "uecv"."isCompleted" AS "isComplete"
        FROM public.user_exam_completed_view AS "uecv"
    `
})
File name: "CourseItemStateView" which resolves to: "course_item_state_view"
This DEPENDS on the "course_item_view", as you can see in the SQL
@ViewEntity({
    expression: `
        SELECT
            "course"."id" AS "courseId",
            "user"."id" AS "userId",
            "civ"."videoId" AS "videoId",
            "civ"."isComplete" AS "isVideoCompleted",
            "civ"."examId" AS "examId",
            "civ"."isComplete" AS "isExamCompleted"
        FROM public."course"
        LEFT JOIN public."user"
            ON 1 = 1
        LEFT JOIN public.course_item_view AS "civ" -- <------------------ HERE
            ON "civ"."courseId" = "course"."id"
            AND "civ"."userId" = "user"."id"
        ORDER BY "civ"."videoId", "civ"."examId"
    `
})
My connection options:
const postgresOptions = {
    // properties, passwords etc...
    entities: [
        // entities....
        // ...
        // ...

        // views
        VideoCompletedView,
        UserExamCompletedView,
        UserExamAnswerSessionView,
        UserVideoMaxWatchedSecondsView,
        CourseItemView,      // <-------------------------------- HERE
        CourseItemStateView  // <--------------------------- HERE
    ],
} as ConnectionOptions;

createConnection(postgresOptions);
Steps to Reproduce
Create ViewEntities that depend on each other.
You will run into this issue, but it is hard to say exactly why and when; this is the main problem.

Bigquery dbt_external_tables External Data Configuration

I need some help when using the dbt_external_tables package.
I realized that some lines in the CSV I have in GCS appear to contain line breaks, and this is causing issues when trying to query the table created by the macro.
When doing this configuration of the external table manually, the BigQuery UI has two options:
Allow jagged rows (CSV)
Allow quoted newlines (CSV)
I usually set those options to true, and that sometimes solves the issues.
I don't know how to do this using the dbt_external_tables.
This is important, as I am getting errors like this when trying to query the table created by dbt: "Error while reading table: kpi-process.file_csv.History, error message: CSV table references column position 9, but line starting at position:10956 contains only 7 columns."
The dbt-external-tables package supports passing a dictionary of options for BigQuery external tables, which maps to the options documented here. In your case, it sounds like you want to turn on allow_jagged_rows and allow_quoted_newlines, so you can specify them like so:
version: 2

sources:
  - name: my_external_source
    tables:
      - name: my_external_table
        location: 'gs://bucket/path/*'
        options:
          format: csv
          allow_jagged_rows: true
          allow_quoted_newlines: true
And dbt will template a DDL statement accordingly:
create or replace external table my_external_source.my_external_table
options (
    format = 'csv',
    allow_jagged_rows = true,
    allow_quoted_newlines = true,
    uris = ['gs://bucket/path/*']
)

Terraform: How Do I Setup a Resource Based on Configuration

So here is what I want as a module in Pseudo Code:
IF UseCustom, Create AWS Launch Config With One Custom EBS Device and One Generic EBS Device
ELSE Create AWS Launch Config With One Generic EBS Device
I am aware that I can use the 'count' function within a resource to decide whether it is created or not... So I currently have:
resource "aws_launch_configuration" "basic_launch_config" {
  count = var.boolean ? 0 : 1
  # blah
}

resource "aws_launch_configuration" "custom_launch_config" {
  count = var.boolean ? 1 : 0
  # blah
  # blah
}
Which is great, now it creates the right Launch configuration based on my 'boolean' variable... But in order to then create the AutoScalingGroup using that Launch Configuration, I need the Launch Configuration Name. I know what you're thinking, just output it and grab it, you moron! Well of course I'm outputting it:
output "name" {
description = "The Name of the Default Launch Configuration"
value = aws_launch_configuration.basic_launch_config.*.name
}
output "name" {
description = "The Name of the Custom Launch Configuration"
value = aws_launch_configuration.custom_launch_config.*.name
}
But how the heck do I know, from the higher level where I'm calling the module that creates the Launch Configuration and then the Auto Scaling Group, which output to use for passing into the ASG?
Is there a different way to grab the value I want that I'm overlooking? I'm new to Terraform and the whole no real conditional thing is really throwing me for a loop.
This seemed to be the cleanest way I could find, using a ternary operator:
output "name {
description = "The Name of the Launch Configuration"
value = "${(var.booleanVar) == 0 ? aws_launch_configuration.default_launch_config.*.name : aws_launch_configuration.custom_launch_config.*.name}
}
Let me know if there is a better way!
You can use the same variable you used to decide which resource to enable to select the appropriate result:
output "name" {
value = var.boolean ? aws_launch_configuration.custom_launch_config[0].name : aws_launch_configuration.basic_launch_config[0].name
}
Another option, which is a little more terse but arguably also a little less clear to a future reader, is to exploit the fact that you will always have one list of zero elements and one list with one element, like this:
output "name" {
value = concat(
aws_launch_configuration.basic_launch_config[*].name,
aws_launch_configuration.custom_launch_config[*].name,
)[0]
}
Concatenating these two lists will always produce a single-item list due to how the count expressions are written, and so we can use [0] to take that single item and return it.
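Either way, the module exposes a single name that the calling configuration can hand straight to the Auto Scaling Group. A rough sketch of the calling side (the module path, variable names, and the extra ASG arguments here are only illustrative):

module "launch_config" {
  source  = "./modules/launch_config"
  boolean = var.use_custom
}

resource "aws_autoscaling_group" "this" {
  # The module's single "name" output hides which launch configuration was actually created
  launch_configuration = module.launch_config.name
  min_size             = 1
  max_size             = 2
  vpc_zone_identifier  = var.subnet_ids
}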

Can't figure out how to insert keys and values of nested JSON data into SQL rows with NiFi

I'm working on a personal project and very new (learning as I go) to JSON, NiFi, SQL, etc., so forgive any confusing language used here or a potentially really obvious solution. I can clarify as needed.
I need to take the JSON output from a website's API call and insert it into a table in my MariaDB local server that I've set up. The issue is that the JSON data is nested, and two of the key pieces of data that I need to insert are used as variable key objects rather than values, so I don't know how to extract it and put it in the database table. Essentially, I think I need to identify different pieces of the JSON expression and insert them as values, but I'm clueless how to do so.
I've played around with the EvaluateJSON, SplitJSON, and FlattenJSON processors in particular, but I can't make it work. All I can ever do is get the result of the whole expression, rather than each piece of it.
{"5381":{"wind_speed":4.0,"tm_st_snp":26.0,"tm_off_snp":74.0,"tm_def_snp":63.0,"temperature":58.0,"st_snp":8.0,"punts":4.0,"punt_yds":178.0,"punt_lng":55.0,"punt_in_20":1.0,"punt_avg":44.5,"humidity":47.0,"gp":1.0,"gms_active":1.0},
"1023":{"wind_speed":4.0,"tm_st_snp":26.0,"tm_off_snp":82.0,"tm_def_snp":56.0,"temperature":74.0,"off_snp":82.0,"humidity":66.0,"gs":1.0,"gp":1.0,"gms_active":1.0},
"5300":{"wind_speed":17.0,"tm_st_snp":27.0,"tm_off_snp":80.0,"tm_def_snp":64.0,"temperature":64.0,"st_snp":21.0,"pts_std":9.0,"pts_ppr":9.0,"pts_half_ppr":9.0,"idp_tkl_solo":4.0,"idp_tkl_loss":1.0,"idp_tkl":4.0,"idp_sack":1.0,"idp_qb_hit":2.0,"humidity":100.0,"gp":1.0,"gms_active":1.0,"def_snp":23.0},
"608":{"wind_speed":6.0,"tm_st_snp":20.0,"tm_off_snp":53.0,"tm_def_snp":79.0,"temperature":88.0,"st_snp":4.0,"pts_std":5.5,"pts_ppr":5.5,"pts_half_ppr":5.5,"idp_tkl_solo":4.0,"idp_tkl_loss":1.0,"idp_tkl_ast":1.0,"idp_tkl":5.0,"humidity":78.0,"gs":1.0,"gp":1.0,"gms_active":1.0,"def_snp":56.0},
"3396":{"wind_speed":6.0,"tm_st_snp":20.0,"tm_off_snp":60.0,"tm_def_snp":70.0,"temperature":63.0,"st_snp":19.0,"off_snp":13.0,"humidity":100.0,"gp":1.0,"gms_active":1.0}}
This is a snapshot of an output with a couple thousand lines. Each of the numeric keys that you see above (5381, 1023, 5300, etc) are player IDs for the following stats. I have a table set up with three columns: Player ID, Stat ID, and Stat Value. For example, I need that first snippet to be inserted into my table as such:
Player ID   Stat ID      Stat Value
5381        wind_speed   4.0
5381        tm_st_snp    26.0
5381        tm_off_snp   74.0
And so on, for each piece of data. But I don't know how to have NiFi select the right pieces of data to insert in the right columns.
I believe it's possible to use a Jolt transform to reshape your JSON into a format like:
[
  {"playerId": "5381", "statId": "wind_speed", "statValue": 0.123},
  {"playerId": "5381", "statId": "tm_st_snp",  "statValue": 0.456},
  ...
]
and then use PutDatabaseRecord with a JSON record reader.
Another approach is to use the ExecuteGroovyScript processor.
Add a new property to it named SQL.mydb and link it to your DBCP controller service.
Then use the following script as the Script Body parameter:
import groovy.json.JsonSlurper
import groovy.json.JsonBuilder

def ff = session.get()
if (!ff) return

// read flow file content and parse it
def body = ff.read().withReader("UTF-8") { reader ->
    new JsonSlurper().parse(reader)
}

def results = []
// use the defined sql connection to create a batch
SQL.mydb.withTransaction {
    def cmd = 'insert into mytable(playerId, statId, statValue) values(?,?,?)'
    results = SQL.mydb.withBatch(100, cmd) { statement ->
        // run through all keys/subkeys in the flow file body
        body.each { pid, keys ->
            keys.each { k, v ->
                statement.addBatch(pid, k, v)
            }
        }
    }
}

// write results as the new flow file content
ff.write("UTF-8") { writer ->
    new JsonBuilder(results).writeTo(writer)
}
// transfer to success
REL_SUCCESS << ff

How to reorder columns from a CSV input with a fixed column layout in Pentaho

Scenario:
I have created a transformation to load data into a table from a CSV file, and I have the following columns in the CSV file:
Customer_Id
Company_Id
Employee_Name
But the user may provide an input file with the columns in a different (random) order, such as:
Employee_Name
Company_Id
Customer_Id
So, if I try to load a file that has a different column ordering, will Kettle load the correct column values according to the column names?
Using ETL Metadata Injection you can use a transformation like this, to either normalize the data, or to store it to your database:
Then you just need to send the correct data to that transformation. You can read the header line from the CSV, and use Row Normaliser to convert to the format used by ETL Metadata Injection.
I have included a quick example here: csv_inject on Dropbox. If you build something like this and run it from a job that executes it once per CSV file, it should work.
Ooh, that's some nasty JavaScript!
The way to do this is with metadata injection. Look at the samples, but basically you need a template which reads the file and writes it back out. You then use another parent transformation to figure out the headings, configure that template, and then execute it.
There are samples in the PDI samples folder, and also take a look at the "figuring out file format" example in Matt Casters' blueprints project on GitHub.
You could try something like this as your JavaScript:
//Script here
var seen;
trans_Status = CONTINUE_TRANSFORMATION;
var col_names = ['Customer_Id', 'Company_Id', 'Employee_Name'];
var col_pos;

if (!seen) {
    // First line
    trans_Status = SKIP_TRANSFORMATION;
    seen = 1;
    col_pos = [-1, -1, -1];
    for (var i = 0; i < col_names.length; i++) {
        for (var j = 0; j < row.length; j++) {
            if (row[j] == col_names[i]) {
                col_pos[i] = j;
                break;
            }
        }
        if (col_pos[i] === -1) {
            writeToLog("e", "Cannot find " + col_names[i]);
            trans_Status = ERROR_TRANSFORMATION;
            break;
        }
    }
}

var Customer_Id = row[col_pos[0]];
var Company_Id = row[col_pos[1]];
var Employee_Name = row[col_pos[2]];
Here is the .ktr I tried: csv_reorder.ktr
(edit, here are the test csv files)
1.csv:
Customer_Id,Company_Id,Employee_Name
cust1,comp1,emp1
2.csv:
Employee_Name,Company_Id,Customer_Id
emp2,comp2,cust2
Assuming rejecting the input file is not an option, you basically have four solutions:
Reorder the fields in an external editor (don't use Excel if the file contains dates).
Use code within your transformation to detect the column headers and reorder the file.
Use metadata injection, as proposed by bolav.
Create a job. This needs to:
a. load the file into a temporary database table.
b. use an SQL statement to retrieve the fields in the right order (a SELECT listing the columns explicitly; see the sketch below).
c. output the file in the correct order.
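For step b, a minimal sketch of the idea, assuming the CSV was loaded into a staging table named csv_staging with the columns from the question:

-- The column order of the result is fixed by the SELECT list,
-- regardless of the column order in the incoming file.
SELECT
    Customer_Id,
    Company_Id,
    Employee_Name
FROM csv_staging;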