I recently linked my GA4 property to BigQuery to get a better look at the analytics data. The export was initially set to daily, so the data was exported from Google Analytics to BigQuery once a day. I then decided that I need streaming, so I switched from daily to streaming in the BigQuery Linking section of GA4's Admin tab. However, the streaming data has not shown up after a few hours. Has anyone run into a similar problem? Do I need to recreate the entire BigQuery project?
If you look at your configuration options for GA to BigQuery, you will see a message under the streaming option.
This option will take effect after the next date boundary (tomorrow) for this property.
Here, "this option" = the "Data exported continuously" option (streaming).
You will probably see your data tomorrow.
I have an app that sends me data from an API. The data is semi-structured (JSON).
I would like to send this data to Google BigQuery in order to store all the information.
However, I can't find out how to do it properly.
So far I have used Node.js on my own server to receive the data via POST requests.
Could you please help me? Thanks.
You can use the BigQuery API to do streaming inserts.
You can also write the data to Pub/Sub or Google Cloud Storage and use Dataflow pipelines to load it into BigQuery. The pipeline can use either streaming inserts (which incur costs) or batch load jobs (which are free).
You can also log to Stackdriver and, from there, select the logs and send them to BigQuery. GCP already has a direct option for this; note that under the hood it performs streaming inserts.
If setting up Dataflow seems too complicated, you can store your files and run batch load jobs by calling the BigQuery API directly. Note that there is a limit on the number of batch loads you can run per day against a particular table (1000 per day).
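As a rough illustration of the two direct options above (streaming inserts versus batch load jobs), here is a minimal Python sketch using the google-cloud-bigquery client library. The function names and the schema autodetection setting are assumptions made for the example, not something from the question; `table_id` is expected in the form `project.dataset.table`.

```python
import io
import json


def to_ndjson(records):
    """Serialize dicts as newline-delimited JSON, the JSON format load jobs accept."""
    return "".join(json.dumps(r, sort_keys=True) + "\n" for r in records)


def stream_rows(table_id, records):
    """Streaming insert: low latency, but streaming inserts incur costs."""
    from google.cloud import bigquery  # requires the google-cloud-bigquery package

    client = bigquery.Client()
    errors = client.insert_rows_json(table_id, records)
    if errors:
        raise RuntimeError(f"streaming insert failed: {errors}")


def batch_load(table_id, records):
    """Batch load job: no insert cost, but counts against the daily load-job limit."""
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
        autodetect=True,  # assumption: let BigQuery infer the schema
    )
    payload = io.BytesIO(to_ndjson(records).encode("utf-8"))
    job = client.load_table_from_file(payload, table_id, job_config=job_config)
    job.result()  # block until the load job finishes
```

If the data arrives one POST request at a time, buffering records and calling `batch_load` periodically keeps the number of load jobs per table low.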
There is a page in the official documentation that lists all the possibilities of loading data to BigQuery.
For simplicity, you can just send the data from your local data source. You should use the Google Cloud client libraries for BigQuery. Here you have a guide on how to do that as well as a relevant code example.
But my honest recommendation is to send the data to Google Cloud Storage and load it into BigQuery from there. This way the whole process will be more stable.
You can check all the options in the first link that I posted and choose the one you think will fit your workflow best.
Keep in mind the limitations of this process.
To track BQ usage, we created a new dataset and configured it in Billing export. But even after waiting a day, the dataset is still empty; no new tables have been created.
Is there any other setup that needs to be done for this to work?
Refer to this link:
https://cloud.google.com/billing/docs/how-to/export-data-bigquery
Thanks and regards,
Gour
You just need to follow the steps in How to enable billing export to BigQuery to begin using this functionality. Keep in mind that you have to wait some time before you start seeing your data, as mentioned in the Export Billing Data to BigQuery documentation.
After you enable BigQuery export, it might take a few hours to start seeing your data. Billing data automatically exports your data to BigQuery in regular intervals, but the frequency of updates in BigQuery varies depending on the services you're using.
If the issue persists, I recommend taking a look at the Issue Tracker tool, which you can use to raise a BigQuery ticket and have the Google Technical Support Team verify this scenario. Since this is an automated process, you might need their help to review your project's internal configuration.
Excuse me for a possibly imprecise question, but I need to check whether I am missing something or whether this really is a problem with Google Cloud (GC) BigQuery.
I've got a Java program that reads from a website and publishes the data to a GC Pub/Sub topic; a pipeline is up, pulling the messages from Pub/Sub and sending them to BigQuery via the template job offered in GC Dataflow. In the end, a Data Studio dashboard gets the data from the BigQuery table and builds its charts.
The thing is, the whole process is working fine: I can see the resulting dashboard being populated correctly, BUT I cannot see the data in the table in BigQuery, even after refreshing the whole page. Sometimes the results show up on the following day (!).
Is it me forgetting something, or is it GC BigQuery in a beta release being incomplete?
As @Pentium10 said, the GUI is just for quick previews. It does take some time to update itself. If you want to check whether the data is in the table, run a query.
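For instance, a row count is a cheap way to check whether the rows have arrived. This is only a sketch with the google-cloud-bigquery Python client; the function names are made up for the example, and `table_id` is assumed to be a full `project.dataset.table` path.

```python
def count_query(table_id):
    """Build a row-count query for the given table."""
    return f"SELECT COUNT(*) AS row_count FROM `{table_id}`"


def count_rows(table_id):
    """Run the count query and return the number of rows currently in the table."""
    from google.cloud import bigquery  # requires the google-cloud-bigquery package

    client = bigquery.Client()
    rows = client.query(count_query(table_id)).result()
    return next(iter(rows))["row_count"]
```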
I have some simple weekly aggregates from Google Analytics that I'd like to store somewhere. The reason for storing them is that if I run a query against too much data in Google Analytics, it becomes sampled, and I want it to be totally accurate.
What is the best way to solve this?
My thoughts are:
1) Write a process in BigQuery to append the data each week to a permanent dataset
2) Use an API that gets the data each week and stores it in a Google spreadsheet (appending a line each time)
What is the best recommendation for my problem - and how do I go about executing it?
Checking your previous questions, we see that you already use BigQuery.
When you run a query against the Google Analytics tables in BigQuery, the result is not sampled, because the tables contain all the data. There is no need to store the aggregates, as you can run the query every time you need them.
If you still want to store them, and pay for the additional table, you can go ahead and write the results to a destination table.
If you want quick access, try creating a view.
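If you do go with a destination table, appending last week's aggregate could look roughly like the sketch below, written with the google-cloud-bigquery Python client. The source table, its `event_date` DATE column, and the `sessions` aggregate are hypothetical placeholders; adapt the query to your own export schema.

```python
import datetime


def previous_week(today):
    """Return (monday, sunday) of the last full week before `today`."""
    this_monday = today - datetime.timedelta(days=today.weekday())
    last_monday = this_monday - datetime.timedelta(days=7)
    return last_monday, last_monday + datetime.timedelta(days=6)


def append_weekly_rollup(source_table, dest_table, today=None):
    """Aggregate last week's rows and append one summary row to dest_table."""
    from google.cloud import bigquery  # requires the google-cloud-bigquery package

    week_start, week_end = previous_week(today or datetime.date.today())
    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        destination=dest_table,
        # Append to the permanent table instead of overwriting it.
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
        query_parameters=[
            bigquery.ScalarQueryParameter("week_start", "DATE", week_start),
            bigquery.ScalarQueryParameter("week_end", "DATE", week_end),
        ],
    )
    # Hypothetical schema: one row per session with a DATE column `event_date`.
    sql = f"""
        SELECT @week_start AS week_start, COUNT(*) AS sessions
        FROM `{source_table}`
        WHERE event_date BETWEEN @week_start AND @week_end
    """
    client.query(sql, job_config=job_config).result()
```

Running `append_weekly_rollup` once a week from any scheduler adds one row per week to the permanent table.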
I suggest the following:
1) Make a roll-up table for your weekly data. You can do that either by writing a query for it and running it manually, or with a script in a Google Spreadsheet that uses the same query (through the API) and is scheduled to run every week. I tried a bunch of the tutorials out there and this one is the simplest to implement.
2) Depending on the data points you want, you can even use the Google Analytics API without having to go through BigQuery for this request. Try pulling this report of yours from here. If it works, there are a bunch of Google Sheets extensions that can make it a lot quicker to set up a weekly report. Or you can just code it yourself.
Would that work for you?
Thanks!
I'm trying to do log analysis with BigQuery. Specifically, I have an App Engine app and a JavaScript client that will be sending log data to BigQuery. In BigQuery, I'll store the full log text in one column but also extract important fields into other columns. I then want to be able to run ad hoc queries over those columns.
Two questions:
1) Is BigQuery particularly good or particularly bad at this use case?
2) How do I set up revolving logs? I.e. I want to store only the last N logs or the last X GB of log data. I see delete is not supported.
Just so you know, there is an excellent demo of moving App Engine Log data to BigQuery via App Engine MapReduce called log2bq (http://code.google.com/p/log2bq/)
Re: "use case" - Stack Overflow is not a good place for judgements about best or worst, but BigQuery is used internally at Google to analyse really really big log data.
I don't see the advantage of storing full log text in a single column. If you decide that you must set up revolving "logs," you could ingest daily log dumps by creating separate BigQuery tables, perhaps one per day, and then delete the tables when they become old. See https://developers.google.com/bigquery/docs/reference/v2/tables/delete for more information on the Table.delete method.
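A sketch of that rotation with the current google-cloud-bigquery Python client, assuming one table per day named `<prefix>YYYYMMDD`; the `applog_` prefix, the 30-day retention, and the function names are made up for the example, and `dataset_id` is expected as `project.dataset`.

```python
import datetime


def expired_tables(table_ids, prefix, keep_days, today):
    """Return daily tables named <prefix>YYYYMMDD that are older than keep_days."""
    cutoff = today - datetime.timedelta(days=keep_days)
    expired = []
    for table_id in table_ids:
        if not table_id.startswith(prefix):
            continue
        try:
            day = datetime.datetime.strptime(table_id[len(prefix):], "%Y%m%d").date()
        except ValueError:
            continue  # suffix is not a date, so this is not a daily table
        if day < cutoff:
            expired.append(table_id)
    return expired


def delete_old_logs(dataset_id, prefix="applog_", keep_days=30):
    """Delete daily log tables in dataset_id that fall outside the retention window."""
    from google.cloud import bigquery  # requires the google-cloud-bigquery package

    client = bigquery.Client()
    names = [t.table_id for t in client.list_tables(dataset_id)]
    for name in expired_tables(names, prefix, keep_days, datetime.date.today()):
        client.delete_table(f"{dataset_id}.{name}", not_found_ok=True)
```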
After implementing this - we decided to open source the framework we built for it. You can see the details of the framework here: http://blog.streak.com/2012/07/export-your-google-app-engine-logs-to.html
If you want your Google App Engine (Google Cloud) project's logs to be in BigQuery, Google has built this functionality into the new Cloud Logging system. It is a beta feature known as "Logs Export".
https://cloud.google.com/logging/docs/install/logs_export
They summarize it as:
Export your Google Compute Engine logs and your Google App Engine logs to a Google Cloud Storage bucket, a Google BigQuery dataset, a Google Cloud Pub/Sub topic, or any combination of the three.
We use the "Stream App Engine Logs to BigQuery" feature in our Python GAE projects. This sends our app's logs directly to BigQuery as they are occurring to provide near real-time log records in a BigQuery dataset.
There is also a page describing how to use the exported logs.
https://cloud.google.com/logging/docs/export/using_exported_logs
When you want to query logs exported to BigQuery over multiple days (e.g. the last week), you can use a SQL query with a FROM clause like this:
FROM
(TABLE_DATE_RANGE(my_bq_dataset.myapplog_,
DATE_ADD(CURRENT_TIMESTAMP(), -7, 'DAY'), CURRENT_TIMESTAMP()))
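`TABLE_DATE_RANGE` is a legacy SQL function. In standard SQL the same window is usually expressed with a wildcard table and `_TABLE_SUFFIX`. The Python helper below is a hypothetical sketch that builds such a query for daily tables named `<table_prefix>YYYYMMDD`; it only constructs the query string and does not run it.

```python
import datetime


def last_days_query(table_prefix, days, today):
    """Build a standard SQL query over daily tables <table_prefix>YYYYMMDD."""
    start = (today - datetime.timedelta(days=days)).strftime("%Y%m%d")
    end = today.strftime("%Y%m%d")
    return (
        f"SELECT * FROM `{table_prefix}*` "
        f"WHERE _TABLE_SUFFIX BETWEEN '{start}' AND '{end}'"
    )
```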