querying google fusion table - sql

I have a Google fusion table with 3 row layouts as shown below:
We can query the fusion table as,
var query = new google.visualization.Query("https://www.google.com/fusiontables/gvizdata?tq=select * from *******************");
which selects the data from the first row layout, i.e. Rows 1, by default. Is there any way to query the second or third row layout of a fusion table?

API queries apply to the actual table data. The row layout tabs are just different views onto that data. You can get the actual query being executed for a tab with Tools > Publish; the HTML/JavaScript contains the FusionTablesLayer request.
I would recommend using the regular Fusion Tables API rather than the gvizdata API, because it's much more flexible and not limited to 500 response rows.

The documentation for querying a Fusion Tables source has not been updated yet to account for the new structure, so this is just a guess. Try appending #rows:id=2 to the end of your table id:
select * from <table id>#rows:id=2

A couple of things:
Querying Fusion Tables with SQL is deprecated. Please see the porting guide.
Check out the Working With Rows part of the documentation. I believe this has your answers.

Related

Push data set alternate to Direct query

I have created a Power BI report using DirectQuery to get live reports. But when changing a slicer filter in a visual it takes a long time to load data in the visual table. So I searched for an alternative to DirectQuery and found push datasets. But on analysis, it seems a push dataset uses an API for streaming. Is it possible to use a push dataset for the below select query (15 columns and 20k rows) as simply as with DirectQuery?
EX: select * from persons p
left join students s on s.id=p.id

Finding the query that created a table in BigQuery

I am a new employee at the company. The person before me built some tables in BigQuery. I want to investigate the query that created a particular table.
Things I would want to check using that query are:
What joins were used?
What are the other tables used to make the table in question?
I have not worked with BigQuery before but I did my due diligence by reading tutorials and the documentation. I could not find anything related there.
A brief outline of the steps is below:
Step 1 - gather all query jobs of that user using the Jobs.list API - you must have the Is Owner permission on the respective projects to see someone else's jobs
Step 2 - extract only those jobs run by the user you mentioned that reference the table of interest - using the destination table attribute
Step 3 - for those extracted jobs, simply check the respective queries, which will show you how the table was populated
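If you'd rather stay in SQL, roughly the same lookup can also be done against BigQuery's INFORMATION_SCHEMA job views instead of the Jobs.list API (a hedged sketch - the project, region, dataset and table names below are placeholders):
-- find jobs that wrote to a given table, newest first
SELECT creation_time, user_email, query
FROM `my-project`.`region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE destination_table.dataset_id = 'my_dataset'
  AND destination_table.table_id = 'my_table'
ORDER BY creation_time DESC;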
Hth!
I had been looking for an answer for a long time. Finally found it:
Go to the three-bar menu at the top left.
From there go to the Analytics tab.
Select BigQuery, under which you will find the Scheduled queries option; click on that.
In the filter box you can enter keywords and find the required query for the table.
For me, I was able to go through my query history and find the query I used.
Step 1.
Go to the BigQuery UI; at the bottom there are Personal history and Project history tabs. If you can use the same account that executed the query, I recommend Personal history.
Step 2.
Click on the tab and there will be a list of queries ordered from most recently run. Check the time the table was created and find a query that ran just before the table creation time.
Since the query runs first and then creates the table, there will be a slight difference; for me it was within a few seconds.
Step 3.
After you find the query used to create the table, simply copy it. And you're done.

Spotfire - Information Link - Filter Not Working

I am a beginner in Spotfire. I developed a simple information link.
Steps
I created 2 tables by adding columns.
Then created joins: 3 simple inner joins on the above tables. The reason for 3 joins is that the query runs faster than with only 1 join.
Then created an information link by adding elements and joins.
This works perfectly well. The data is fetched properly. But as soon as I add a filter, it stops working.
I tried
Creating a filter -> and then adding it as an element to the information link
Adding a filter in the column filter itself: Column E_ID - Expression %1 = 1000
Editing the SQL query in the information link. I added one more clause in the WHERE section: AND E1."E_ID" = 1000
None of these work. If I remove the filter, it works perfectly fine. The filter is on the same column on which one of the joins is based.
Please suggest where I am making a mistake.
Too long to comment...
So, I've noticed joins in the information designer can be cumbersome. It's convenient for people who don't have access to the data source, but if you do have access to the data source (as you do in this scenario), I would handle all of the logic on the DB server side. Thus, you are just supplying Spotfire with a flat file which it can easily ingest and create visualizations on. This will prevent Spotfire from bogging down with data transformations as well.
With that being said, I would also recommend using stored procedures to serve up the data to Spotfire. Here are a couple of answers I posted on why, which will make your life easier.
https://stackoverflow.com/a/38247931/6167855
https://stackoverflow.com/a/39640197/6167855
https://stackoverflow.com/a/43523380/6167855
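To illustrate the pattern, here is a minimal sketch of such a stored procedure (SQL Server syntax; the table and column names are assumptions based on the question). The joins and the filter live on the DB side, and Spotfire just ingests the flat result set:
CREATE PROCEDURE dbo.GetEmployeeData
    @EmployeeId INT
AS
BEGIN
    SET NOCOUNT ON;
    -- all join/filter logic handled here, not in the information link
    SELECT e1.E_ID, e1.Name, e2.Department
    FROM dbo.Employees AS e1
    INNER JOIN dbo.Departments AS e2 ON e2.E_ID = e1.E_ID
    WHERE e1.E_ID = @EmployeeId;
END;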

Bigquery and Tableau

I connected Tableau to BigQuery and was working on the dashboards. The issue here is that BigQuery charges for the data a query scans every time.
My table is 200 GB. When someone queries the dashboard in Tableau, it runs the full query. Using any filter on the dashboard, it runs again against the whole table.
On 200 GB of data, if someone applies 5 filters in different analyses, BigQuery bills nearly 200 GB * 5 = 1 TB. For one day of testing the analysis we were charged for 30 TB, but the table behind it is only 200 GB. Is there any way I can stop Tableau from scanning the whole table in BigQuery every time something changes?
The extract in Tableau is indeed one valid strategy, but only when you are using a custom query. If you directly access the table it won't work, as that will download 200 GB to your machine.
Other options to limit the amount of data are:
Not requesting any columns that you don't need. Do this by hiding unused fields in Tableau; it will not include those fields in the query it sends to BigQuery. Otherwise it's a SELECT * and you pay for the full 200 GB even if you don't use those fields.
Another option that we use a lot is partitioning our tables. For instance, a partition per day of data if you have a date field. Using the TABLE_DATE_RANGE and TABLE_QUERY functions you can then smartly limit the number of partitions, and hence rows, that Tableau will query. I usually hide the complexity of these table wildcard functions away in a view, and then use the view in Tableau. Another option is to use a parameter in Tableau to control the TABLE_DATE_RANGE.
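For example, a minimal sketch of such a view in legacy BigQuery SQL (the dataset name and the events_ table prefix are assumptions), touching only the last 7 days of day-sharded tables:
-- view definition: limit the scan to the last 7 days of daily tables
SELECT *
FROM TABLE_DATE_RANGE(
  [mydataset.events_],
  DATE_ADD(CURRENT_TIMESTAMP(), -7, 'DAY'),
  CURRENT_TIMESTAMP())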
1) Right now I am learning BQ + Tableau too, and I found that using "Extract" is a must for BQ in Tableau. With this option you can also save time building the dashboard. So my current pipeline is "Build query > Add it to Tableau > Make dashboard > Upload dashboard to Tableau Online > Schedule update for Extract".
2) You can send a Custom Quota Request to Google and set up limits per project / per user.
3) If each of your queries touches 200 GB every time, consider optimizing those queries (don't use SELECT *, use only the dates you need, etc.).
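For point 3, a hedged sketch of what that looks like (the table and column names are placeholders): select only the columns you need and restrict the date range instead of a bare SELECT *.
SELECT user_id, event_name, event_ts
FROM `my-project.my_dataset.events`
WHERE event_ts >= TIMESTAMP('2018-01-01')
  AND event_ts < TIMESTAMP('2018-02-01');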
The best approach I found was to partition the table in BQ on a date (day) field which has no time component. BQ allows you to partition a table by a day-level field. The important thing here is that even though the field is a day/date with no time, it should be a TIMESTAMP datatype in the BQ table, i.e. you will end up with a column in BQ with data looking like this:
2018-01-01 00:00:00.000 UTC
The reason the field needs to be a TIMESTAMP datatype (even though there is no time in the data) is that when you create a viz in Tableau it generates SQL to run against BQ, and for the partition field to be used by that Tableau-generated SQL it needs to be a TIMESTAMP datatype.
In Tableau, you should always filter on your partitioned field and BQ will only scan the rows within the ranges of the filter.
I tried partitioning on a DATE datatype and looked up the logs in GCP and saw that the entire table was being scanned. Changing to TIMESTAMP fixed this.
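A minimal standard-SQL sketch of creating such a table (the project, dataset and column names are assumptions; the raw table is assumed to have a DATE column event_date):
-- build a table partitioned on a TIMESTAMP column derived from the date field
CREATE TABLE `my-project.my_dataset.events_partitioned`
PARTITION BY DATE(event_ts) AS
SELECT
  * EXCEPT (event_date),
  TIMESTAMP(event_date) AS event_ts
FROM `my-project.my_dataset.events_raw`;
Filtering on event_ts in Tableau should then prune the scan to the matching partitions only.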
The thing about Tableau and BigQuery is that Tableau calculates the filter values using your query (live query). What I have seen in my project logging is that it creates filters from your own query:
select `Custom SQL Query`.filtered_column from ( your_actual_datasource_query ) as `Custom SQL Query` group by `Custom SQL Query`.filtered_column
Instead, try to create the Tableau data source with incremental extracts, and also try to have your query date-partitioned (BigQuery only supports date partitioning) so that you can limit the data scanned.

How to fetch data for a news feed like system?

I have a few tables as shown below:
Polls
PollId  Question  Option
1       What      1
2       Why       4

Updates
UpdateId  Text
1         Sleep
2         Play
Polls and Updates are just two sample tables (in reality there are more tables, like photos, videos, links etc.). But when a user visits his home page (like the Facebook news feed) he must be shown data relevant to him (no such data is included in this example). I.e., I want to select data from all the tables with as few query executions as possible (I want to present a mixture of data: polls, photos, videos etc.).
Currently, I'm fetching only the ids and the type (i.e. which table) from all of the tables, and gathering further data while iterating through this result set (i.e. from C#, calling another SQL query).
Is there a way to query the data from whole tables at once? (OUTER JOIN?, UNION?)
Or simply,
How can I select different types of entities at once in a single SQL query?
You could write your query so that you have one long select list for everything you want, and it all comes back in one result set, but I suspect that wouldn't work too well because you might have varying numbers of the different types of items per user.
If you really must have it all in one hit then you can issue multiple queries in one go and get multiple result sets back. To handle this you can use an ADO.NET DataSet. See this SO example (but not the accepted answer - see Vikram Dibyal's answer, as that gives a very basic overview of what I think you're asking for).
I won't copy and paste the stuff from the linked thread, just head over and take a look.
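Roughly, the shape of that approach is the following (a hedged sketch; the OwnerId column used to pick rows for the current user is an assumption, since the sample tables don't include it). One batch returns several result sets, one per entity type, which the ADO.NET DataSet reads back as separate DataTables:
-- one round trip, several result sets ([Option] is bracketed because OPTION is a reserved word)
SELECT PollId, Question, [Option] FROM Polls   WHERE OwnerId = @UserId;
SELECT UpdateId, Text             FROM Updates WHERE OwnerId = @UserId;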