I am working on a realtime project using Ontotext GraphDB in which SPARQL INSERT queries run against GraphDB every second; this works perfectly via a RESTful API web app built with Flask. My question: when doing a SELECT query to retrieve specific data, is it possible to retrieve the timestamp at which the INSERT happened in GraphDB, and if so, how do I write a SELECT query that returns data inserted within a given time period? Thanks in advance.
You could use the GraphDB history plugin. You can read more about it here: https://graphdb.ontotext.com/documentation/enterprise/data-history-and-versioning.html?highlight=history%20plugin
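Independently of the history plugin, a common pattern is to record the insert timestamp yourself as a triple and filter on it in the SELECT. Below is a minimal sketch with SPARQLWrapper, assuming a local GraphDB on port 7200, a repository named "myrepo", and a made-up ex:insertedAt predicate (all of these are assumptions, not something the question specifies):

```python
# A minimal sketch, independent of the history plugin: store the insert
# timestamp yourself and filter on it in the SELECT. Endpoint URLs, the
# repository name ("myrepo"), and the ex: predicates are all assumptions.
from datetime import datetime, timezone
from SPARQLWrapper import SPARQLWrapper, JSON, POST

QUERY_ENDPOINT = "http://localhost:7200/repositories/myrepo"
UPDATE_ENDPOINT = "http://localhost:7200/repositories/myrepo/statements"

def insert_reading(subject_iri: str, value: float) -> None:
    """INSERT the data triple together with the time it was inserted."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    sparql = SPARQLWrapper(UPDATE_ENDPOINT)
    sparql.setMethod(POST)
    sparql.setQuery(f"""
        PREFIX ex: <http://example.org/>
        PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
        INSERT DATA {{
            <{subject_iri}> ex:value "{value}"^^xsd:double ;
                            ex:insertedAt "{now}"^^xsd:dateTime .
        }}""")
    sparql.query()

def select_between(start: str, end: str):
    """SELECT everything inserted between two xsd:dateTime literals."""
    sparql = SPARQLWrapper(QUERY_ENDPOINT)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(f"""
        PREFIX ex: <http://example.org/>
        PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
        SELECT ?s ?v ?t WHERE {{
            ?s ex:value ?v ;
               ex:insertedAt ?t .
            FILTER (?t >= "{start}"^^xsd:dateTime &&
                    ?t <= "{end}"^^xsd:dateTime)
        }}""")
    return sparql.query().convert()
```

For example, select_between("2021-01-01T00:00:00Z", "2021-01-02T00:00:00Z") would return everything inserted on that day. The trade-off versus the history plugin is that you only get timestamps for data inserted after you start recording them.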
I am new to Superset and wanted to know if there's any way to stream data from BigQuery into Apache Superset. Currently, I have set up the database in Apache Superset with BigQuery, but when I update the table data using SQL commands in BigQuery, it doesn't reflect in Superset. Is there any way to get data streaming into Superset?
I've looked around the Apache Superset documentation and couldn't find anything related to "streaming data" from a source. What I think is happening in this scenario is that you have a dashboard which uses the data from the table you have in BigQuery, and after adding some new information to the table you expect that change to be reflected in the dashboard automatically.
Based on this, my theory is that Superset saves the result of your query in memory, or it could be using BigQuery's cached results, either of which would keep the dashboard from automatically picking up the changes. My suggestion is to re-run the query for your table to try to get the latest data. If Superset is using cached results, you'll have to look at the Superset configuration used for BigQuery for a way to disable them.
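If caching does turn out to be the culprit, here is a hedged sketch of what disabling it might look like. Superset's caching is built on Flask-Caching and is configured in superset_config.py; the exact key names vary by Superset version, so treat these as assumptions to verify against your version's docs:

```python
# superset_config.py -- a sketch, assuming a Flask-Caching based setup.
# Pointing the caches at NullCache effectively disables them.
CACHE_CONFIG = {"CACHE_TYPE": "NullCache"}        # metadata/chart cache
DATA_CACHE_CONFIG = {"CACHE_TYPE": "NullCache"}   # query-result cache (newer versions)
```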
I am creating a web app in which there is an analytical dashboard that can generate a query. This query will be forwarded to the backend server through an exposed API. The controller (written in Node.js) behind this API will execute the query on AWS Athena to fetch the required data.
Now the problem is: how should I send the query to the backend server? Should I use JSON format? And on the backend, how should I convert the JSON into a SQL query? Do I need to write a custom solution, or is there a supported library available?
Is there any better way of doing this?
I have tried some JavaScript libraries like JSON-SQL and JSON-SQL-Builder2, but these don't support the query format that Athena executes. Athena uses the Presto engine to run queries.
Good evening,
If your problem is sending a query to the database from Node, then the AWS SDK for JavaScript in Node.js seems like what you need.
Your workflow will likely be something like:
Start Query Execution
Get Query Execution
Get Data from S3
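For illustration, here is the same three-step workflow sketched in Python with boto3 (the JavaScript SDK exposes equivalent calls under the same names). The database name and S3 output location are assumptions:

```python
# Sketch of the Start -> Poll -> Fetch workflow against Athena using boto3.
# "my_database" and the S3 output bucket are hypothetical placeholders.
import time
import boto3

athena = boto3.client("athena", region_name="us-east-1")

def run_query(sql: str):
    # 1. Start Query Execution
    start = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": "my_database"},
        ResultConfiguration={"OutputLocation": "s3://my-bucket/athena-results/"},
    )
    qid = start["QueryExecutionId"]

    # 2. Get Query Execution -- poll until the query reaches a final state
    while True:
        state = athena.get_query_execution(QueryExecutionId=qid)[
            "QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(1)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query ended in state {state}")

    # 3. Get the data -- Athena writes results to the S3 OutputLocation;
    # get_query_results reads them back without a separate S3 call.
    return athena.get_query_results(QueryExecutionId=qid)
```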
I'm trying to create an API with GraphQL on top of Hive, has anyone done something of the sort?
Thanks
I've been working on this for the past week. I've gotten close to having it work by leveraging Graphene, Graphene-SQLAlchemy, and PyHive, but the queries aren't being generated in a way that Hive likes (my suspicion is the column aliasing done by Graphene-SQLAlchemy).
It definitely looks possible - just need to figure out this last bit and it should work.
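For anyone attempting the same stack, here is a sketch of the wiring described above. The host, port, table, and column names are assumptions, and as noted, the SQL that Graphene-SQLAlchemy emits may still need tuning before Hive accepts it:

```python
# Sketch of a GraphQL API over Hive via Graphene + Graphene-SQLAlchemy +
# PyHive. The "users" table and its columns are hypothetical.
import graphene
from graphene_sqlalchemy import SQLAlchemyObjectType
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

engine = create_engine("hive://hadoop@localhost:10000/default")  # PyHive dialect
Session = sessionmaker(bind=engine)
Base = declarative_base()

class UserModel(Base):                 # hypothetical Hive table
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String)

class User(SQLAlchemyObjectType):
    class Meta:
        model = UserModel

class Query(graphene.ObjectType):
    users = graphene.List(User)

    def resolve_users(self, info):
        # Each resolver runs an ORM query that PyHive translates to HiveQL.
        return Session().query(UserModel).all()

schema = graphene.Schema(query=Query)
result = schema.execute("{ users { id name } }")
```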
I want to design a web UI which fetches data from HDFS. I want to generate some reports from this data, in my own custom report format. I am writing REST APIs to fetch the data, but running Hive queries gives latency issues. Hence I want a different approach, and I can think of two:
Using Impala to create tables. But I am not sure about REST support for Impala.
Using Hive, but with Spark instead of MapReduce as the execution engine;
spark-job-server provides REST support, and data can be fetched with Spark SQL.
Which of the approaches would be suitable, or is there a better one?
Please, can anyone help, as I am very new to this?
I'd prefer Impala if latency is the main consideration. It's dedicated to SQL processing on HDFS and does it well. As for the REST API and the application logic you're building, the sketch below shows one way to wire them together.
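A hedged sketch of that route: a small Flask endpoint that runs a query against Impala through impyla. The host, port, and the "reports" table are assumptions:

```python
# Flask REST endpoint over Impala via impyla; "impala-host" and the
# "reports" table are hypothetical placeholders.
from flask import Flask, jsonify
from impala.dbapi import connect

app = Flask(__name__)

@app.route("/reports/<name>")
def report(name):
    conn = connect(host="impala-host", port=21050)  # default impalad port
    try:
        cur = conn.cursor()
        # Parameterized to avoid building SQL from raw user input.
        cur.execute(
            "SELECT * FROM reports WHERE name = %(name)s LIMIT 100",
            {"name": name},
        )
        cols = [d[0] for d in cur.description]
        return jsonify([dict(zip(cols, row)) for row in cur.fetchall()])
    finally:
        conn.close()
```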
I am trying to identify slow queries in a large-scale Django 1.3 web application. As it is rather difficult to match a raw SQL query in the slow query log with the specific ORM statement in the code, I wondered if it is possible to add a SQL comment to the query constructed with the ORM, something like:
Object.objects.filter(Q(pub_date__lte=datetime.now())).comment('query no. 123')
Solution found on the django-users mailing list: use .extra() to append raw SQL:
Object.objects.filter(Q(pub_date__lte=datetime.now())).extra(where=['1=1 /* query no. 123 */'])
For those reading in 2022 onwards - there is a much better answer these days:
Google's sqlcommenter project has a Django middleware
[A] Django middleware whose purpose is to augment a SQL statement right before execution, with information about the controller and user code to help with later making database optimization decisions, after those statements are examined from the database server’s logs.
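Wiring it up amounts to adding the middleware to your settings; a sketch (pip install google-cloud-sqlcommenter), with the caveat that the middleware path should be verified against the version you install:

```python
# settings.py -- sketch of enabling sqlcommenter for Django; the dotted
# path follows the project's docs but may differ between versions.
MIDDLEWARE = [
    "google.cloud.sqlcommenter.django.middleware.SqlCommenter",
    # ... the rest of your middleware ...
]
```

With this in place, every ORM query carries a trailing SQL comment identifying the view and route that issued it, so the slow query log entries can be matched back to code without the manual .extra() annotation.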