How to associate Pentaho Carte Object Id of a Job with the corresponding channel_id in the log Database?

I'm trying to set up a Carte server with logging capabilities.
I have created a logging database so that each job run by Carte or Kitchen gets logged to it. The problem is that each time I run a job through the Carte REST API (/kettle/executeJob/?job=/jobs/myjob.kjb&level=Basic) a Carte Object Id is generated, but I can't associate that object id with any field in the log database. The database does have a field called log_channel_id, but when I call the /kettle/jobStatus/ endpoint the log_channel_id field in the HTTP response is empty, so I am unable to associate the Carte Object Id of a job with any field in the logging database.
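For reference, here is a minimal sketch of how the two ids could be tied together over the same REST API. It assumes a Carte server reachable at http://carte.example:8080 with the default cluster/cluster credentials (all placeholders), and assumes that the jobStatus XML requested with xml=Y carries the log_channel_id element mentioned above once the job has started logging:

import time
import xml.etree.ElementTree as ET

import requests

CARTE = "http://carte.example:8080"        # placeholder Carte server
AUTH = ("cluster", "cluster")              # default Carte credentials; change as needed
JOB_PATH = "/jobs/myjob.kjb"
JOB_NAME = "myjob"                         # the job name as Carte reports it

# 1. Start the job; the response is a small <webresult> document whose <id>
#    element holds the Carte Object Id.
r = requests.get(CARTE + "/kettle/executeJob/",
                 params={"job": JOB_PATH, "level": "Basic"},
                 auth=AUTH)
r.raise_for_status()
carte_object_id = ET.fromstring(r.content).findtext("id")
print("Carte object id:", carte_object_id)

# 2. Ask for the job status as XML and read log_channel_id from it. It may be
#    empty until the job has started writing to its logging channel, hence the
#    small retry loop.
log_channel_id = None
for _ in range(10):
    s = requests.get(CARTE + "/kettle/jobStatus/",
                     params={"name": JOB_NAME, "id": carte_object_id, "xml": "Y"},
                     auth=AUTH)
    s.raise_for_status()
    log_channel_id = ET.fromstring(s.content).findtext("log_channel_id")
    if log_channel_id:
        break
    time.sleep(2)

# 3. The channel id is what the channel id column in the logging tables is keyed
#    on, so the pair (carte_object_id, log_channel_id) can now be stored
#    wherever the mapping between the two is needed.
print("log_channel_id:", log_channel_id)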

Related

Azure Data Factory Copy Activity - Does not fail, despite being unsuccessful

Overview
I have a Data Factory copy activity in a pipeline that moves tabular data (.csv) from a file in Azure Data Lake into a landing schema in a SQL database.
The table definitions will be refreshed on each load, in principle, so there should not be errors. However, I want to log errors in case the table exists and the file cannot be mapped to the destination.
I anticipate the activity failing when it cannot match a column to the destination table definition. Strangely, my process succeeds despite not loading the file into the database.
Testing
I added a column to my source file.
I did not update the destination database table definition.
I truncated the destination table.
I unset fault tolerance.
I deleted all copy activity logs (nascent development environment, so this is OK).
I ran the pipeline.
I checked the Data Factory pipeline output: 100% success rating!?
I queried the destination table (empty).
I opened each log file and looked for relevant information (nothing but headers per copy activity call).
Copy activity settings
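If it helps to verify outside the portal what the copy activity actually reported, here is a minimal sketch using the azure-mgmt-datafactory Python SDK. The subscription, resource group, factory name and run id below are placeholders, and rowsRead/rowsCopied are keys a Copy activity normally exposes in its run output:

from datetime import datetime, timedelta

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import RunFilterParameters

SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "<resource-group>"
FACTORY_NAME = "<data-factory-name>"
RUN_ID = "<pipeline-run-id>"           # from the pipeline Output pane or monitoring API

client = DataFactoryManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Query all activity runs belonging to the pipeline run within a time window.
filters = RunFilterParameters(
    last_updated_after=datetime.utcnow() - timedelta(days=1),
    last_updated_before=datetime.utcnow() + timedelta(days=1),
)
activity_runs = client.activity_runs.query_by_pipeline_run(
    RESOURCE_GROUP, FACTORY_NAME, RUN_ID, filters
)

for run in activity_runs.value:
    if run.activity_type != "Copy":
        continue
    output = run.output or {}
    # "Succeeded" with rowsCopied == 0 is the symptom described above.
    print(run.activity_name, run.status,
          "rowsRead:", output.get("rowsRead"),
          "rowsCopied:", output.get("rowsCopied"))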

Error in SSMS when running query from SQL On-Demand endpoint

I am attempting to pull in data from a CSV file that is stored in an Azure Blob container, and when I try to query the file I get this error:
File 'https://<storageaccount>.blob.core.windows.net/<container>/Sales/2020-10-01/Iris.csv' cannot be opened because it does not exist or it is used by another process.
The file does exist and, as far as I know, it is not being used by anything else.
I am using SSMS and also a SQL On-Demand endpoint from Azure Synapse.
What I did in SSMS was run the following commands after connecting to the endpoint:
CREATE DATABASE [Demo2];

CREATE EXTERNAL DATA SOURCE AzureBlob
WITH ( LOCATION = 'wasbs://<container>@<storageaccount>.blob.core.windows.net/' );

SELECT * FROM OPENROWSET (
    BULK 'Sales/2020-10-01/Iris.csv',
    DATA_SOURCE = 'AzureBlob',
    FORMAT = 'CSV'
) AS tv1;
I am not sure of where my issue is at or where to go next. Did I mess up anything with creating the external data source? Do I need to use a SAS token there and if so what is the syntax for that?
@Ubiquitinoob44, you need to create a database credential:
https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/develop-storage-files-storage-access-control?tabs=shared-access-signature
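For the SAS part of the question specifically, here is a minimal sketch of what creating such a credential could look like, scripted from Python with pyodbc against the On-Demand endpoint. The workspace name, database, login, SAS token and the credential/data source names are all placeholders, and the exact statements should be checked against the linked documentation:

import pyodbc

# Connect to the On-Demand endpoint with the Azure AD interactive (MFA) login,
# in the database where the external data source will live. Placeholders throughout.
conn = pyodbc.connect(
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=<workspace>-ondemand.sql.azuresynapse.net;"
    "Database=Demo2;"
    "Authentication=ActiveDirectoryInteractive;"
    "UID=<your-azure-ad-login>;",
    autocommit=True,
)
cur = conn.cursor()

# A master key is required once per database before a scoped credential can be created.
cur.execute("CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';")

# The SAS token goes in as the credential secret (without the leading '?').
cur.execute("""
    CREATE DATABASE SCOPED CREDENTIAL BlobSasCredential
    WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
         SECRET = '<sas-token>';
""")

# An external data source that uses the credential; queries then reference it
# with DATA_SOURCE = 'AzureBlobSas' in OPENROWSET.
cur.execute("""
    CREATE EXTERNAL DATA SOURCE AzureBlobSas
    WITH ( LOCATION = 'https://<storageaccount>.blob.core.windows.net/<container>',
           CREDENTIAL = BlobSasCredential );
""")
conn.close()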
I figured out what the issue was. I haven't tried Armando's suggestion yet.
First I had to go to the storage container and edit its IAM (access control) settings to give my Azure Active Directory login the Storage Blob Data Contributor role. The user to grant access to is the email address you use to log in to the Azure portal.
https://learn.microsoft.com/en-us/azure/storage/common/storage-auth-aad-rbac-portal?toc=/azure/synapse-analytics/toc.json&bc=/azure/synapse-analytics/breadcrumb/toc.json
After that I had to reconnect to the On-Demand endpoint in SSMS. Make sure you log in through the Azure AD - MFA option. Originally I was using the On-Demand endpoint's SQL username and password, which had not been granted the Storage Blob Data Contributor role on the container.
https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/resources-self-help-sql-on-demand

Cannot view or delete a table in BigQuery when it is named/saved as "test"

I created a query destination table and got the message below, even though I only saved the results of a query and did not import data using the automatic schema detection feature.
I named the table "test".
"The schema for this table was automatically detected. If the schema
was not correctly detected, you can re-run the load job with an
adjusted schema. Dismiss."
The table is displayed in the dataset, but when I try to access it an error message appears and there is no data present:
Unable to find table: [removed]:[removed].test
Also, when I try to delete it via the web UI nothing happens, and the network request in the browser returns a 404.
Why can I not view or delete this table?
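One way to check whether the phantom table exists anywhere outside the web UI, and to attempt the delete through the API instead, is a short script against the BigQuery client library. A minimal sketch with the google-cloud-bigquery Python client (project and dataset ids are placeholders):

from google.cloud import bigquery
from google.api_core.exceptions import NotFound

client = bigquery.Client(project="<project-id>")
table_id = "<project-id>.<dataset>.test"

try:
    table = client.get_table(table_id)          # same lookup the UI performs
    print("Table exists:", table.full_table_id, "rows:", table.num_rows)
except NotFound:
    print("The API cannot find the table either (matches the UI error).")

# not_found_ok=True makes the delete a no-op instead of an error if the
# table is already gone on the backend.
client.delete_table(table_id, not_found_ok=True)
print("Delete request issued for", table_id)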

Tracking changes in table

I am working with PostgreSQL. I want to save the audit records to a file (spreadsheet, Word document, ...) rather than a table. I have a web application, and any change (insert, delete, update) made in the app is recorded in an audit log table. But there are many tables in the database and each table has more than 5,000 rows, so it is difficult to keep the audit log as a table because of the bulk of the data. That is why I want to save the audit log as a file from PostgreSQL. How can I implement this?
Thank you.
Veena, I worked with PostgreSQL about two years back. To my knowledge, to configure a PostgreSQL database as a standalone audit log database, or to save audit data to a file, you can follow these steps: first gather the database information, then create the audit store schema, configure a PostgreSQL Server data source for CA SiteMinder, point the Policy Server to the database, and finally restart the Policy Server.
You can create the logging schema so the PostgreSQL server database can store the audit logs.
To create the audit log schema, open sm_postgresql_logs.sql in a text editor and copy the contents of the entire file. Start a SQL client, such as psql, and log in as the user who administers the Policy Server database. Select the database instance from the database list, paste the schema from sm_postgresql_logs.sql into the query window, and execute the query.
The audit log store schema is created in the database.
Hope this will help you.
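A minimal sketch of the same steps scripted with psycopg2 instead of pasting into psql (the connection details are placeholders, and the path must point at your copy of sm_postgresql_logs.sql):

import psycopg2

SCHEMA_FILE = "sm_postgresql_logs.sql"

# Connect as the user who administers the Policy Server database.
conn = psycopg2.connect(
    host="localhost",
    dbname="<audit-log-database>",
    user="<policy-server-db-admin>",
    password="<password>",
)
conn.autocommit = True

with open(SCHEMA_FILE, "r", encoding="utf-8") as f:
    schema_sql = f.read()

# Execute the whole schema script; psycopg2 sends it as a single batch, which
# mirrors pasting the file contents into the SQL client and running it.
with conn.cursor() as cur:
    cur.execute(schema_sql)

conn.close()
print("Audit log store schema created.")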

What happens when bigquery upload job fails after loaded a portion of the JSON file?

As the title says, what happens when I start a BigQuery load job and, let's say, the job fails after loading 50% of the rows in the JSON file? Does BigQuery roll back everything from the load job, or am I left with 50% of the data loaded?
I am appending data daily into a single table, and keeping it duplicate-free is very important. We are using the HTTP REST API.
BigQuery appends data atomically. You will never get half of the data in the table if the load fails. If the job completes successfully, all of the data will show up at once.
There are two additional tricks you can use to prevent duplicates:
Specify a job id for the load job. Imagine you pull your network cable midway through starting the job... how do you know whether it succeeded? Specifying a job id lets you look up the job later if the job creation request fails.
Perform your loads into a temporary table, and specify WRITE_TRUNCATE as the writeDisposition. This means that you can run import jobs idempotently against the temporary table; if you don't know whether a job succeeded, just run another one and it will overwrite the data. Once you have a load job that completes successfully, run a table copy job with writeDisposition set to WRITE_APPEND to append the new data to your main table.
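A minimal sketch of that two-step pattern with the google-cloud-bigquery Python client (the question uses the raw REST API, but the job semantics are the same; project, dataset, table and GCS URI values are placeholders):

from google.cloud import bigquery
from google.api_core.exceptions import Conflict

client = bigquery.Client(project="<project-id>")

staging_table = "<project-id>.<dataset>.daily_staging"   # temporary/staging table
main_table = "<project-id>.<dataset>.main"               # table being appended to daily
source_uri = "gs://<bucket>/daily-export.json"           # newline-delimited JSON file

# Trick 1: pick the job id yourself, so the job can be looked up later even if
# the response to the insert request is lost.
job_id = "daily-load-2020-10-01"

load_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    # Trick 2: WRITE_TRUNCATE on the staging table makes retries idempotent --
    # rerunning the same load simply overwrites the staging data.
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
    autodetect=True,
)

try:
    load_job = client.load_table_from_uri(
        source_uri, staging_table, job_id=job_id, job_config=load_config
    )
except Conflict:
    # The job id already exists, so a previous attempt did reach BigQuery;
    # look that job up instead of creating a duplicate load.
    load_job = client.get_job(job_id)

load_job.result()   # raises if the load failed; nothing is committed in that case

# Only after the staging load has definitely succeeded, append it to the main table.
copy_config = bigquery.CopyJobConfig(
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND
)
client.copy_table(staging_table, main_table, job_config=copy_config).result()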