I have been looking at the new BigQuery Logging feature in the Cloud Platform Console, but it seems a bit inconsistent in what is being logged.
I can see some creates, deletes, inserts and queries. However, when I ran a few queries and copy jobs through the web UI, they did not show up.
Should activity in the BigQuery web UI also be logged?
Does it differ depending on where the request comes from, e.g. console or API access?
There is no difference between console and API access. Activity in the BigQuery web UI should be logged.
Are you using the Cloud Log viewer to view these logs? In some cases there can be a delay of a few seconds before these logs show up in the log viewer, and you might have to refresh the logs.
Logs containing information about queries are written to the Data Access log stream as opposed to the Admin Activity stream. Only users with Project Owner permissions can view the contents of the Data Access logs which might be why you aren't seeing these.
You should check with the project owner to confirm you have the right permissions to see these logs.
To view the Data Access audit logs, you must have the Cloud IAM role Logging/Private Logs Viewer or Project/Owner. I had a similar issue recently, and after being granted the Logging/Private Logs Viewer role I was able to see the logs.
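For reference, here is a minimal sketch of reading those Data Access entries with the Python Cloud Logging client; the project ID is a placeholder, and it assumes your account already holds the Private Logs Viewer role:
from google.cloud import logging

client = logging.Client(project="my-project")  # placeholder project ID

# Query activity is written to the Data Access audit log for BigQuery.
log_filter = (
    'resource.type="bigquery_resource" AND '
    'logName="projects/my-project/logs/cloudaudit.googleapis.com%2Fdata_access"'
)

for entry in client.list_entries(filter_=log_filter,
                                 order_by=logging.DESCENDING,
                                 max_results=10):
    # The audit payload is typically a dict; principalEmail says who ran the query.
    who = (entry.payload or {}).get("authenticationInfo", {}).get("principalEmail")
    print(entry.timestamp, who)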
I need some help understanding what happened in our cloud project: a BigQuery resource has been running every 12 hours without us having configured it. It also seems quite intensive, because we have been charged, on average, one dollar per day for the past month.
After checking in Logs Explorer, I saw several log entries regarding the BigQuery resource.
The entries showed the email of one of our software engineers. Since I removed him from our Firebase project, there have been no more requests.
However, that person did not set up or configure anything related to BigQuery, so we are a bit lost here, and this is why we are asking for help to investigate and understand what is going on.
Hope you will be able to help. Let me know if you need more information.
Thanks in advance
NB: I have not tried adding the engineer's email back yet. I wanted to see how it goes for the rest of the month.
The most likely causes I've observed for this in the wild:
A scheduled query was set up by a user.
A Data Studio dashboard was set up and configured to periodically refresh data in the background.
Someone set up a workflow that queries BigQuery, such as Cloud Composer or a Cloud Function.
It's also possible it's just something like a script running in a crontab on someone's machine. The audit log should have relevant details, like the request origin, for cases where it's just something running as part of a bespoke process.
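Besides reading the audit log, one way to see which account is issuing the recurring jobs is to list recent jobs with the BigQuery client library. This is a rough sketch in Python, with the project ID as a placeholder:
import datetime
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # placeholder project ID
since = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=2)

# all_users=True requires sufficient permissions on the project.
for job in client.list_jobs(all_users=True, min_creation_time=since):
    # user_email shows the user or service account behind each job.
    print(job.created, job.job_type, job.user_email)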
In Azure Log Analytics, logs are saved all day. Is there any way to change the time during which they are saved? I mean changing it so that logs are saved only, for example, between 6:00 AM and 9:00 PM.
It would be great if you could send a screenshot.
Thank you!
Unfortunately, there is no such option when logging into Azure Log Analytics. You can only control the log categories when configuring logs into Azure Log Analytics, via the UI or the REST API.
If you are only interested in the data from a specific time range, you can write a query that fetches just those logs from Azure Log Analytics.
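If it helps, here is a minimal sketch of that approach in Python with the azure-monitor-query client library; the workspace ID, the AzureActivity table, and the 06:00-21:00 window are assumptions you would adapt:
from datetime import timedelta
from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

client = LogsQueryClient(DefaultAzureCredential())

# hourofday() keeps only rows logged between roughly 06:00 and 21:00;
# everything is still stored, this just narrows what the query returns.
query = """
AzureActivity
| where hourofday(TimeGenerated) between (6 .. 20)
| project TimeGenerated, OperationNameValue, Caller
"""

response = client.query_workspace("my-workspace-id", query, timespan=timedelta(days=1))
for table in response.tables:
    for row in table.rows:
        print(row)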
Is there a way to see when a user's or service_account's last activity was?
This would help with user and account audits. For example, in AWS there is an IAM report that tells you "password_last_used" and "access_key_1_last_used_date".
Note: this is for bigquery itself, not the data I have loaded into BQ.
One option is to use the BigQuery Audit Logs to check for the most recent log entry tagged to a specific user.
e.g.
resource.type="bigquery_resource"
protoPayload.authenticationInfo.principalEmail="some.user@mycompany.com"
You could also export the audit logs back into BigQuery to make it easier to query the information for a specific user.
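As an illustration of that export-and-query approach, here is a hedged Python sketch; the dataset name, the daily table prefix, and the column layout assume a standard log sink export to BigQuery and may differ in your project:
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # placeholder project ID

# Assumes a log sink exporting audit logs into an "auditlog" dataset,
# which produces daily cloudaudit_googleapis_com_data_access_YYYYMMDD tables.
query = """
SELECT
  protopayload_auditlog.authenticationInfo.principalEmail AS principal,
  MAX(timestamp) AS last_activity
FROM `my-project.auditlog.cloudaudit_googleapis_com_data_access_*`
GROUP BY principal
ORDER BY last_activity DESC
"""

for row in client.query(query).result():
    print(row.principal, row.last_activity)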
I am trying to create an event tracking system for our website. I would like to insert the events into BigQuery directly from the consumer's browser. However, to do this, I believe I need to share the API key with the browser so it can insert into BigQuery. This creates a security flaw, where someone could take the API key and insert large volumes of false events into our BigQuery tables. Are there security features on the BigQuery server side that can filter out such events (perhaps by detecting malicious insertion patterns)?
See the solution "How to Do Serverless Pixel Tracking":
https://cloud.google.com/solutions/serverless-pixel-tracking-tutorial
Instead of logging straight to BigQuery, you could:
Create a pixel in Google Cloud Storage (a sketch of this step follows after the list).
Insert this pixel in your pages.
Configure GCS logs so they are routed to BigQuery in real time through Stackdriver.
Optionally, add a load balancer for the best performance around the world.
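As a sketch of the first step only (the bucket and object names are made up, and this assumes the bucket allows per-object ACLs), the pixel can be created with the Python Cloud Storage client:
from google.cloud import storage

client = storage.Client(project="my-project")   # placeholder project ID
bucket = client.bucket("my-tracking-bucket")     # placeholder bucket name

blob = bucket.blob("pixel.png")
blob.cache_control = "no-cache, no-store"  # so every page view hits GCS and gets logged
blob.upload_from_filename("pixel.png")     # a local 1x1 transparent PNG
blob.make_public()                         # pages then embed it with a plain <img> tag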
I'm trying to do log analysis with BigQuery. Specifically, I have an App Engine app and a JavaScript client that will be sending log data to BigQuery. In BigQuery, I'll store the full log text in one column but also extract important fields into other columns. I then want to be able to do ad-hoc queries over those columns.
Two questions:
1) Is BigQuery particularly good or particularly bad at this use case?
2) How do I set up revolving logs? I.e. I want to only store the last N logs or the last X GB of log data. I see that delete is not supported.
Just so you know, there is an excellent demo of moving App Engine log data to BigQuery via App Engine MapReduce, called log2bq (http://code.google.com/p/log2bq/).
Re: "use case" - Stack Overflow is not a good place for judgements about best or worst, but BigQuery is used internally at Google to analyse really really big log data.
I don't see the advantage of storing full log text in a single column. If you decide that you must set up revolving "logs," you could ingest daily log dumps by creating separate BigQuery tables, perhaps one per day, and then delete the tables when they become old. See https://developers.google.com/bigquery/docs/reference/v2/tables/delete for more information on the Table.delete method.
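For illustration, here is a rough Python sketch of that pruning step with the BigQuery client library, assuming one table per day named with a date suffix (the dataset and table prefix are made-up names):
import datetime
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # placeholder project ID
cutoff = datetime.date.today() - datetime.timedelta(days=30)

for table in client.list_tables("my_log_dataset"):  # placeholder dataset
    # Tables are assumed to be named like applog_YYYYMMDD, one per day.
    if table.table_id.startswith("applog_"):
        table_date = datetime.datetime.strptime(table.table_id[-8:], "%Y%m%d").date()
        if table_date < cutoff:
            client.delete_table(table.reference, not_found_ok=True)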
After implementing this - we decided to open source the framework we built for it. You can see the details of the framework here: http://blog.streak.com/2012/07/export-your-google-app-engine-logs-to.html
If you want your Google App Engine (Google Cloud) project's logs to be in BigQuery, Google has added this functionality built into the new Cloud Logging system. It is a beta feature known as "Logs Export".
https://cloud.google.com/logging/docs/install/logs_export
They summarize it as:
Export your Google Compute Engine logs and your Google App Engine logs to a Google Cloud Storage bucket, a Google BigQuery dataset, a Google Cloud Pub/Sub topic, or any combination of the three.
We use the "Stream App Engine Logs to BigQuery" feature in our Python GAE projects. This sends our app's logs directly to BigQuery as they are occurring to provide near real-time log records in a BigQuery dataset.
There is also a page describing how to use the exported logs.
https://cloud.google.com/logging/docs/export/using_exported_logs
When you want to query logs exported to BigQuery over multiple days (e.g. the last week), you can use a SQL query with a FROM clause like this:
FROM
(TABLE_DATE_RANGE(my_bq_dataset.myapplog_,
DATE_ADD(CURRENT_TIMESTAMP(), -7, 'DAY'), CURRENT_TIMESTAMP()))