Is there a way to use a UDF in Redshift, execute a SQL query, and upload the result to AWS S3? Would really appreciate it if someone knows how to do this.
Thanks
To create a UDF in Redshift, you can use Python. You can then call the function in a SQL SELECT statement. To output the results of a query to a file in S3, you can use the UNLOAD statement.
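A rough sketch of how the two pieces fit together (the function, table, bucket, and IAM role names below are placeholders, not anything from your setup):

CREATE OR REPLACE FUNCTION f_normalize (value FLOAT, max_value FLOAT)
RETURNS FLOAT
STABLE
AS $$
    # Scalar Python UDF: Redshift calls this once per row
    if max_value is None or max_value == 0:
        return None
    return value / max_value
$$ LANGUAGE plpythonu;

-- Run a SELECT that uses the UDF and write the result set to S3
UNLOAD ('SELECT id, f_normalize(score, 100.0) FROM my_table')
TO 's3://my-bucket/exports/scores_'
IAM_ROLE 'arn:aws:iam::123456789012:role/my-redshift-unload-role'
CSV;

UNLOAD writes the output files under the given S3 prefix; the IAM role you pass must allow Redshift to write to that bucket.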
When I add a file to S3 and run a query against Athena, Athena returns the expected result with the data from this file.
Now if I then delete that same file from S3 and run the same query, Athena still returns the same data even though the file is not in S3 anymore.
Is this the expected behaviour? I thought Athena calls out to S3 on every query, but I'm now starting to think there is some sort of caching going on?
Does anyone have any ideas? I can't find any information online about this.
Thanks for the help in advance!
Athena (via the Hive/Glue catalog) only loads partitions periodically. If you want to query the latest data, you need to run

MSCK REPAIR TABLE table_name;

to refresh the partition metadata Athena uses.
Thanks for the help guys.
I actually was looking at the wrong files in S3 and the files I thought were removed were still present. Once I deleted them from S3, the query against Athena returned the expected results immediately.
Thanks!
I need to create procedural logic over data stored in AWS S3, using Athena or Glue.
It is actually a migration of a stored procedure from SQL Server to AWS, but I don't know which AWS service to implement it with; it doesn't use a database, just tables over S3.
Thank you very much for guiding me on how to do it.
Athena doesn't support stored procedures; however, you can leverage UDFs to implement the same logic as in your source stored procedure.
Below is the syntax for a UDF; refer to the Athena documentation for more information:
USING EXTERNAL FUNCTION UDF_name(variable1 data_type[, variable2 data_type][,...])
RETURNS data_type
LAMBDA 'lambda_function'
SELECT [...] UDF_name(expression) [...]
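For example, assuming a Lambda function named my-athena-udf-handler that implements a lower-casing routine (both the function and the table here are made up for illustration):

USING EXTERNAL FUNCTION to_lower(input VARCHAR)
RETURNS VARCHAR
LAMBDA 'my-athena-udf-handler'
SELECT to_lower(customer_name)
FROM sales_data;

The Lambda function receives rows in batches and returns one output value per input row, so the per-row procedural logic from your stored procedure can live inside that function.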
Is there a way for us to check how frequently a table has been accessed/queried in AWS Redshift?
The frequency could be daily/monthly/hourly, whatever. Can someone help me?
It could be SQL queries using system tables from AWS Redshift or some Python script. What is the best way?
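One possible approach (just a sketch built on the standard Redshift system views stl_scan and svv_table_info) is to count how many distinct queries scanned each table per day:

-- How many distinct queries touched each table, per day
SELECT ti."table" AS table_name,
       DATE_TRUNC('day', s.starttime) AS scan_day,
       COUNT(DISTINCT s.query) AS query_count
FROM stl_scan s
JOIN svv_table_info ti ON s.tbl = ti.table_id
GROUP BY 1, 2
ORDER BY scan_day, query_count DESC;

Keep in mind that STL system tables only retain a few days of history, so for monthly trends you would need to snapshot this result into your own table on a schedule.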
I have a Lambda function in which I am fetching a CSV file from S3. Now I want to run a SQL query on that CSV, or on JSON (after converting the CSV into JSON). What is the best and easiest approach for this in Node.js? Since I want to use a GROUP BY query, S3 Select is not possible.
I found the module "querycsv" in Python, so I changed the code over to Python. https://pythonhosted.org/querycsv/
Take a look at AWS Athena, which lets you run more complex queries on the files in S3.
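As a sketch (the bucket path and columns are placeholders), you would define an external table over the CSV files and then query it freely, including GROUP BY:

-- External table over the CSV files; the data stays in S3
CREATE EXTERNAL TABLE IF NOT EXISTS orders (
    order_id STRING,
    category STRING,
    amount DOUBLE
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LOCATION 's3://my-bucket/orders/'
TBLPROPERTIES ('skip.header.line.count' = '1');

-- GROUP BY works here, unlike with S3 Select
SELECT category, COUNT(*) AS order_count, SUM(amount) AS total_amount
FROM orders
GROUP BY category;

From a Node.js Lambda you can start such a query with the AWS SDK's Athena StartQueryExecution call and read the results from the query's S3 output location.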
I've been trying to store CSV data into a table in a database using a Pig script.
But instead of inserting the data into a table in the database, I ended up creating a new file in the metastore.
Can someone please let me know if it is possible to insert data into a database table with a Pig script, and if so, what that script might look like?
You can take a look at DBStorage, but be sure to REGISTER the JDBC jar in your Pig script and declare the storage UDF, as in the sketch below.
The documentation for the storage UDF is here:
http://pig.apache.org/docs/r0.12.0/api/org/apache/pig/piggybank/storage/DBStorage.html
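A minimal sketch, assuming a MySQL target (the driver path, connection details, and table layout are all placeholders):

-- Register the JDBC driver and the piggybank jar that ships DBStorage
REGISTER /path/to/mysql-connector-java.jar;
REGISTER /path/to/piggybank.jar;

data = LOAD 'input.csv' USING PigStorage(',') AS (id:int, name:chararray);

-- DBStorage ignores the INTO location; the INSERT statement drives the write
STORE data INTO 'ignored' USING org.apache.pig.piggybank.storage.DBStorage(
    'com.mysql.jdbc.Driver',
    'jdbc:mysql://dbhost:3306/mydb',
    'dbuser',
    'dbpassword',
    'INSERT INTO my_table (id, name) VALUES (?, ?)');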
Alternatively, if the target table is defined in the Hive metastore, you can use HCatStorer:

STORE data INTO 'tablename' USING org.apache.hcatalog.pig.HCatStorer();