I am unable to copy sample data to populate a table in Coginity Pro from Redshift - SQL

I have been trying to copy data into a table in Coginity Pro, but I get the error message below.
I copied my ARN from Redshift and pasted it into the relevant place, but I still could not populate the sample data into the tables already created in Coginity Pro.
Below is the error message:
Status: ERROR
copy users from 's3://awssampledbuswest2/tickit/allusers_pipe.txt'
credentials 'aws_iam_role='
delimiter '|' region 'us-west-2'
36ms 2022-11-28T02:23:51.059Z
(SQLSTATE: 08006, SQLCODE: 0): An I/O error occurred while sending to the backend.

@udemeribe, please check the STL_LOAD_ERRORS table (ordered by the starttime field).
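If it helps, here is a minimal sketch of that check run from Python, assuming psycopg2 and placeholder cluster connection details (the host, database, user, and password below are made up and not from the original post):

import psycopg2  # Redshift speaks the PostgreSQL wire protocol

# Placeholder connection details -- replace with your cluster's values
conn = psycopg2.connect(
    host="example-cluster.abc123.us-west-2.redshift.amazonaws.com",
    port=5439,
    dbname="dev",
    user="awsuser",
    password="********",
)

with conn, conn.cursor() as cur:
    # Most recent load failures first; err_reason usually explains why the COPY failed
    cur.execute("""
        select starttime, filename, line_number, colname, err_code, trim(err_reason)
        from stl_load_errors
        order by starttime desc
        limit 10;
    """)
    for row in cur.fetchall():
        print(row)

The same SELECT can of course be run directly in Coginity Pro.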

Related

AsterixDB ERROR: Code: 1 "HYR0010: Node asterix_nc2 does not exist" M1 Mac

I'm trying to set up a sample cluster with AsterixDB on my M1 Mac. I have my environment up and running, and I am able to successfully run SQL queries with the following code:
drop dataverse csv if exists;
create dataverse csv;
use csv;
create type csv_type as {
lat: int32,
long: int32
};
create dataset csv_set (csv_type)
primary key lat;
However, when I try to load the dataset from a CSV file, it seems to brick my sample cluster and throws the error: Error Code: 1 "HYR0010: Node asterix_nc2 does not exist". The code that causes this is below.
use csv;
load dataset csv_set using localfs
(("path"="127.0.0.1:///Users/nicholassantini/Downloads/test.csv"),
("format"="delimited-text"));
Thus far I have tried both Java's newest release (version 18) and 17.0.3, as well as a variety of ports for the queries. I'm not sure what else to try. Some logs that I think are relevant say that it is failing to connect to the node; I'm not sure if that's an issue with the port or the node itself. Here is a snippet of those logs.
(screenshot of the logs not included)
Also, in case it matters, my CSV is a simple 2-column, 2-row file with all single-digit integer values.
I appreciate any and all help.
After consulting the developer help email thread, I found that the issue stems from the release of AsterixDB I was using (0.9.7.1). Upgrading to the newest release (0.9.8) fixed the issue.
The link can be found here:
https://ci-builds.apache.org/job/AsterixDB/job/asterixdb-snapshot-integration/lastSuccessfulBuild/artifact/asterixdb/asterix-server/target/asterix-server-0.9.8-SNAPSHOT-binary-assembly.zip

AWS Glue - getSink() is throwing "No such file or directory" right after glue_context.purge_s3_path

I am trying to purge a partition of a Glue catalog table and then recreate the partition using the getSink option (similar to a truncate/load of a partition in a database).
To purge the partition, I am using the glueContext.purge_s3_path option with retentionPeriod = 0. The partition is getting purged successfully.
self._s3_path = "s3://server1/main/transform/Account/local_segment/source_system=SAP/"
# Delete everything under the partition prefix immediately (no retention)
self._glue_context.purge_s3_path(
    self._s3_path,
    {"retentionPeriod": 0, "excludeStorageClasses": ()}
)
Here the catalog database = Account, the table = local_segment, and the partition key = source_system.
However, when I try to recreate the partition right after the purge step, I get "An error occurred while calling o180.pyWriteDynamicFrame. No such file or directory" from getSink writeFrame.
If I remove the purge part, then getSink works fine and is able to create the partition and write the files.
I even tried "MSCK REPAIR TABLE" between the purge and getSink, but no luck.
Shouldn't getSink create the partition if it does not exist, i.e. if it was purged in the previous step?
target = self._glue_context.getSink(
connection_type="s3",
path=self._s3_path_prefix,
enableUpdateCatalog=True,
updateBehavior="UPDATE_IN_DATABASE",
partitionKeys=["source_system"]
)
target.setFormat("glueparquet")
target.setCatalogInfo(
catalogDatabase=f"{self._target_database}",
catalogTableName=f"{self._target_table_name}"
)
target.writeFrame(self._dyn_frame)
Where:
self._s3_path_prefix = "s3://server1/main/transform/Account/local_segment/"
self._target_database = "Account"
self._target_table_name = "local_segment"
Error message:
An error occurred while calling o180.pyWriteDynamicFrame. No such file or directory 's3://server1/main/transform/Account/local_segment/source_system=SAP/run-1620405230597-part-block-0-0-r-00000-snappy.parquet'
Try checking whether you have permission for this object on S3. I got the same error, and once I configured the object to be public (just for a test), it worked. So maybe it's a new object and your process might not have access to it.
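If it helps, here is a minimal sketch of that permission check using boto3, assuming the bucket and key from the error message above and that it runs under the same role as the Glue job (the boto3 calls are standard; everything else is taken from the post):

import boto3
from botocore.exceptions import ClientError

# Bucket and key taken from the error message in the question
bucket = "server1"
key = "main/transform/Account/local_segment/source_system=SAP/run-1620405230597-part-block-0-0-r-00000-snappy.parquet"

s3 = boto3.client("s3")
try:
    # HEAD request: succeeds only if the caller can read the object's metadata
    response = s3.head_object(Bucket=bucket, Key=key)
    print("Object is readable, size:", response["ContentLength"])
except ClientError as err:
    # 403 -> no permission on the object, 404 -> the object no longer exists (e.g. purged)
    print("Cannot access object:", err.response["Error"]["Code"])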

Google - BigQuery Location Error (dataset was not found in location europe-west1)

I am running a query on BigQuery and everything runs properly. However, when I try to save this query's result as a new table (in BQ), I get the following error:
"Not found: Dataset mycompany-data:google_analytics_de was not found in location europe-west1 "
The aforementioned table location on BigQuery is europe-west1.
I have double-checked for spelling errors and access permissions, but this error persists whatsoever.
Can you assist me with this?
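One thing worth verifying is the location BigQuery itself reports for the dataset, and that the query job writing the destination table uses that same location. A minimal sketch with the google-cloud-bigquery client library, using the project and dataset names from the error message (the destination table name and the query are hypothetical placeholders):

from google.cloud import bigquery

client = bigquery.Client(project="mycompany-data")

# Ask BigQuery where the dataset actually lives
dataset = client.get_dataset("mycompany-data.google_analytics_de")
print(dataset.location)  # should print "europe-west1" if the dataset really is there

# Save a query result into that dataset, pinning the job to the same location
job_config = bigquery.QueryJobConfig(
    destination="mycompany-data.google_analytics_de.my_new_table"  # hypothetical table name
)
job = client.query("SELECT 1 AS x", job_config=job_config, location=dataset.location)
job.result()  # waits for the query job to finish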

LeaseAlreadyPresent Error in Azure Data Factory V2

I am getting the following error in a pipeline that has a Copy activity with a REST API as source and Azure Data Lake Storage Gen2 as sink.
"message": "Failure happened on 'Sink' side. ErrorCode=AdlsGen2OperationFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=ADLS Gen2 operation failed for: Operation returned an invalid status code 'Conflict'. Account: '{Storage Account Name}'. FileSystem: '{Container Name}'. Path: 'foodics_v2/Burgerizzr/transactional/_567a2g7a/2018-02-09/raw/inventory-transactions.json'. ErrorCode: 'LeaseAlreadyPresent'. Message: 'There is already a lease present.'. RequestId: 'd27f1a3d-d01f-0003-28fb-400303000000'..,Source=Microsoft.DataTransfer.ClientLibrary,''Type=Microsoft.Azure.Storage.Data.Models.ErrorSchemaException,Message=Operation returned an invalid status code 'Conflict',Source=Microsoft.DataTransfer.ClientLibrary,'",
The pipeline runs in a for loop with Batch size = 5. When I make it sequential, the error goes away, but I need to run it in parallel.
This is a known issue caused by an ADF limitation around using variables across parallel threads.
You are probably trying to set the file name using a variable.
One option is to move the work into a child pipeline that runs after each variable is set, i.e. Set Variable -> Execute Pipeline.
Alternatively, remove the variable and hard-code the variable's expression directly in the activity.
Hope this helps.

java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.io.HivePassThroughOutputFormat

I have followed this link for installing Shark on CDH5. I have installed it, but as is also mentioned in the block above:
The -skipRddReload flag is only needed when you have some table with a Hive/HBase mapping, because of some issues in PassthroughOutputFormat caused by the Hive HBase handler.
the error message is something like:
"Property value must not be null"
or
"java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.io.HivePassThroughOutputFormat"
I have created an external table in Hive to access an HBase table. When I start Shark with -skipRddReload, Shark starts, but when I try to access that same external table within Shark I get the error:
java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.io.HivePassThroughOutputFormat
Is there any solution to get rid of this?
EDIT
The HBase table is mapped to Hive with:
CREATE EXTERNAL TABLE abc (key string,LPID STRING,Value int,ts1 STRING,ts2 STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES
("hbase.columns.mapping" = ":key,cf1:LPID,cf1:Value,cf1:ts1,cf1:ts2")
TBLPROPERTIES("hbase.table.name" = "abc");
This table abc is what I want to access in Shark. Any solution?