Google's cloud SDK doesn't return `createTime` - google-cloud-sdk

I'm trying to list info about some instances using cloud sdk, but for some reason the createTime field is not returned. Any idea why?
$ gcloud compute instances list --format="table(name,createTime)" --filter="name:florin*"
NAME CREATE_TIME
florin-ubuntu-18-playground
This should work according to this https://cloud.google.com/sdk/gcloud/reference/topic/filters

The command gcloud compute instances list does not show the instances' creation time by default; the field is named creationTimestamp rather than createTime. The gcloud --format flag can change the default output displayed:
gcloud compute instances list --format="table(name,creationTimestamp)"
It is also possible to retrieve the instance creation time with the gcloud compute instances describe command:
gcloud compute instances describe yourInstance --zone=yourInstanceZone | grep creationTimestamp
Or:
gcloud compute instances describe yourInstance --zone=yourInstanceZone --flatten=creationTimestamp
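For completeness, the corrected field name can also be combined with the name filter from the original command; the timestamp column should then be populated:
gcloud compute instances list --format="table(name,creationTimestamp)" --filter="name:florin*"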


Fluentbit Cloudwatch templating with EKS and Fargate

I have an EKS cluster running purely on Fargate and I'm trying to set up logging to CloudWatch.
I have a lot of [OUTPUT] sections that could be unified using some variables. I'd like to unify the logs of each deployment into a single log_stream and separate the log_streams by environment (namespace). With a couple of variables I would only need to write a single [OUTPUT] section.
From what I understand, the new Fluent Bit plugin cloudwatch_logs doesn't support templating, but the old plugin cloudwatch does.
I've tried to set up a section like the documentation example:
[OUTPUT]
    Name cloudwatch
    Match *container_name*
    region us-east-1
    log_group_name /eks/$(kubernetes['namespace_name'])
    log_stream_name test_stream
    auto_create_group on
This generates a log_group called fluentbit-default, which according to the README.md is the fallback name used when the variables are not parsed.
The old plugin cloudwatch is supported (though not mentioned in the AWS documentation), because if I replace the variable $(kubernetes['namespace_name']) with any plain string it works perfectly.
Fluent Bit on Fargate manages the INPUT section automatically, so I don't really know which variables are sent to the OUTPUT section; I suppose the kubernetes variable is not there, or it has a different name or a different structure.
So my questions are:
Is there a way to get the list of the variables (or the input) that Fargate + Fluent Bit are generating?
Can I solve this in a different way? (I don't want to write more than 30 different OUTPUT sections, one for each service/log_stream_name. It would also be difficult to maintain.)
Thanks!
After a few days of testing, I realised that you need to enable the kubernetes filter so that the kubernetes variables reach the cloudwatch plugin.
This is the result; now I can generate the log_group depending on the environment label and the log_stream depending on the namespace and container names.
filters.conf: |
    [FILTER]
        Name kubernetes
        Match *
        Merge_Log Off
        Buffer_Size 0
        Kube_Meta_Cache_TTL 300s
output.conf: |
    [OUTPUT]
        Name cloudwatch
        Match *
        region eu-west-2
        log_group_name /aws/eks/cluster/$(kubernetes['labels']['app.environment'])
        log_stream_name $(kubernetes['namespace_name'])-$(kubernetes['container_name'])
        default_log_group_name /aws/eks/cluster/others
        auto_create_group true
        log_key log
Please note that app.environment is not a "standard" label; I've added it to all my deployments. The default_log_group_name is necessary in case that value is not present.
Please also note that if you use log_retention_days and new_log_group_tags, the setup stops working. To be honest, log_retention_days never worked for me even with the new cloudwatch_logs plugin.
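For context, on EKS Fargate these snippets are supplied as keys of the aws-logging ConfigMap in the aws-observability namespace. A minimal sketch of how the pieces above fit together (values copied from the answer, trimmed for brevity):
kind: ConfigMap
apiVersion: v1
metadata:
  name: aws-logging
  namespace: aws-observability
data:
  filters.conf: |
    [FILTER]
        Name kubernetes
        Match *
        Merge_Log Off
        Buffer_Size 0
        Kube_Meta_Cache_TTL 300s
  output.conf: |
    [OUTPUT]
        Name cloudwatch
        Match *
        region eu-west-2
        log_group_name /aws/eks/cluster/$(kubernetes['labels']['app.environment'])
        log_stream_name $(kubernetes['namespace_name'])-$(kubernetes['container_name'])
        default_log_group_name /aws/eks/cluster/others
        auto_create_group true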

Why don't I have the file saved_model.pbtxt or saved_model.pb in my bucket? [Cloud Storage, BigQuery ML, AI Platform]

I am currently following this tutorial: https://cloud.google.com/architecture/predicting-customer-propensity-to-buy
to create a predictive model of customer behavior in BigQuery and BigQuery ML. I then need to be able to extract my model to Cloud Storage and send it to AI Platform to make predictions online.
My problem is this: the "gcloud ai-platform versions create" step (STEP 5) does not work.
While searching, I noticed that the file Cloud Shell is asking for is missing from my bucket after the extract.
I show the error from my shell below. The file in question is saved_model.pb.
name@cloudshell:~ (name_account_analytics)$ gcloud ai-platform versions create --model=model_test V_1 --region=us-central1 --framework=tensorflow --python-version=3.7 --runtime-version=1.15 --origin=gs://bucket_model_test/V_1/ --staging-bucket=gs://bucket_model_test
Using endpoint [https://us-central1-ml.googleapis.com/]
ERROR: (gcloud.ai-platform.versions.create) FAILED_PRECONDITION: Field: version.deployment_uri Error: Deployment directory gs://bucket_model_test/V_1/ is expected to contain exactly one of: [saved_model.pb, saved_model.pbtxt].
- '@type': type.googleapis.com/google.rpc.BadRequest
  fieldViolations:
  - description: 'Deployment directory gs://bucket_model_test/V_1/ is expected to
      contain exactly one of: [saved_model.pb, saved_model.pbtxt].'
    field: version.deployment_uri
name@cloudshell:~ (name_account_analytics)$ gcloud ai-platform versions create --model=model_test V_1 --region=us-central1 --framework=tensorflow --python-version=3.7 --runtime-version=1.15 --origin=gs://bucket_model_test/V_1/model.bst --staging-bucket=gs://bucket_model_test
Using endpoint [https://us-central1-ml.googleapis.com/]
ERROR: (gcloud.ai-platform.versions.create) FAILED_PRECONDITION: Field: version.deployment_uri Error: The provided URI for model files doesn't contain any objects.
- '@type': type.googleapis.com/google.rpc.BadRequest
  fieldViolations:
  - description: The provided URI for model files doesn't contain any objects.
    field: version.deployment_uri
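(For reference, a quick way to confirm what the extract actually left in the deployment directory, using the bucket path from the error above:)
gsutil ls gs://bucket_model_test/V_1/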
How do I tell it to create it? Why doesn't it do it automatically?
Thanks for your help!
Welodya

Getting a zone's location via gcloud cli

Is there a way to get the Location value of a zone/region via the gcloud cli? I'm looking for the values listed here: https://cloud.google.com/compute/docs/regions-zones
I'm hoping to use these values to mark locations on a geochart.
Thanks!
You can get a zone's location via the gcloud CLI using the following command: gcloud compute zones list.
You can check here for the other parameters that you can use.
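For example, a quick sketch that limits the output to a few columns (any of the documented list fields can be substituted):
gcloud compute zones list --format="table(name,region,status)"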
I'm maintaining my own list of locations by referring to: https://cloud.google.com/compute/docs/regions-zones

Is it possible to use service accounts to schedule queries in BigQuery's "Schedule Query" feature?

We are using the beta scheduled query feature of BigQuery.
Details: https://cloud.google.com/bigquery/docs/scheduling-queries
We have a few ETL scheduled queries running overnight to optimize the aggregations and reduce query cost. They work well and there haven't been many issues.
The problem arises when the person who scheduled the query using their own credentials leaves the organization. I know we can do an "update credential" in such cases.
I read through the documentation and also gave it a try, but couldn't really find whether we can use a service account instead of individual accounts to schedule queries.
Service accounts are cleaner, tie into the rest of the IAM framework, and are not dependent on a single user.
So if you have any additional information regarding scheduled queries and service accounts, please share.
Thanks for taking the time to read the question and respond to it.
Regards
BigQuery scheduled queries now do support creating a scheduled query with a service account and updating a scheduled query to use a service account. Will these work for you?
While it's not supported in the BigQuery UI, it's possible to create a transfer (including a scheduled query) using the Python GCP SDK for DTS, or from the bq CLI.
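For example, the bq CLI route looks roughly like the following (a sketch: the dataset, display name, query, and service account below are placeholders, and --service_account_name is the flag that attaches the service account to the transfer config):
bq mk --transfer_config \
    --data_source=scheduled_query \
    --target_dataset=my_data_set \
    --display_name='scheduled_query_test' \
    --params='{"query":"SELECT CURRENT_DATE() as date, RAND() as val"}' \
    --service_account_name=my-sa@my-project.iam.gserviceaccount.com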
The following is an example using the Python SDK:
r"""Example of creating a TransferConfig using a service account.

Usage example:
1. Install the GCP BigQuery Data Transfer Python client library.
2. If it has not been done already, grant the P4 service account the
   iam.serviceAccounts.getAccessToken permission on your project:
   $ gcloud projects add-iam-policy-binding {user_project_id} \
       --member='serviceAccount:service-{user_project_number}@gcp-sa-bigquerydatatransfer.iam.gserviceaccount.com' \
       --role='roles/iam.serviceAccountTokenCreator'
   where {user_project_id} and {user_project_number} are the user project's
   project id and project number, respectively. E.g.,
   $ gcloud projects add-iam-policy-binding my-test-proj \
       --member='serviceAccount:service-123456789@gcp-sa-bigquerydatatransfer.iam.gserviceaccount.com' \
       --role='roles/iam.serviceAccountTokenCreator'
3. Set the environment variable PROJECT_ID to your user project, and
   GOOGLE_APPLICATION_CREDENTIALS to the service account key path. E.g.,
   $ export PROJECT_ID='my_project_id'
   $ export GOOGLE_APPLICATION_CREDENTIALS='./serviceacct-creds.json'
4. $ python3 ./create_transfer_config.py
"""
import os

from google.cloud import bigquery_datatransfer
from google.oauth2 import service_account
from google.protobuf.struct_pb2 import Struct

PROJECT = os.environ["PROJECT_ID"]
SA_KEY_PATH = os.environ["GOOGLE_APPLICATION_CREDENTIALS"]

# Authenticate as the service account instead of the end user.
credentials = (
    service_account.Credentials.from_service_account_file(SA_KEY_PATH))
client = bigquery_datatransfer.DataTransferServiceClient(
    credentials=credentials)

# Get the full path to the project.
parent_base = client.project_path(PROJECT)

# Parameters of the scheduled query.
params = Struct()
params["query"] = "SELECT CURRENT_DATE() as date, RAND() as val"

transfer_config = {
    "destination_dataset_id": "my_data_set",
    "display_name": "scheduled_query_test",
    "data_source_id": "scheduled_query",
    "params": params,
}

parent = parent_base + "/locations/us"
response = client.create_transfer_config(parent, transfer_config)
print(response)
As far as I know, unfortunately you can't use a service account to directly schedule queries yet. Maybe a Googler will correct me, but the BigQuery docs implicitly state this:
https://cloud.google.com/bigquery/docs/scheduling-queries#quotas
"A scheduled query is executed with the creator's credentials and project, as if you were executing the query yourself."
If you need to use a service account (which is great practice BTW), then there are a few workarounds listed here. I've raised a FR here for posterity.
This question is very old, but I came across this thread while searching for the same thing.
Yes, it is possible to use a service account to schedule BigQuery jobs.
While creating the scheduled query job, click on "Advanced options" and you will get an option to select a service account.
By default it uses the credentials of the requesting user.
(Image: the BigQuery "create scheduled query" dialog.)

How to use Google DataFlow Runner and Templates in tf.Transform?

We are in the process of establishing a Machine Learning pipeline on Google Cloud, leveraging GC ML-Engine for distributed TensorFlow training and model serving, and DataFlow for distributed pre-processing jobs.
We would like to run our Apache Beam apps as Dataflow jobs on Google Cloud. Looking at the ML-Engine samples,
it appears possible to get tensorflow_transform.beam.impl AnalyzeAndTransformDataset to specify which PipelineRunner to use as follows:
import apache_beam as beam
from tensorflow_transform.beam import impl as tft

pipeline_name = "DirectRunner"
p = beam.Pipeline(pipeline_name)
# "xxx" and "yyy" stand in for the actual read/decode transforms.
p | "xxx" >> xxx | "yyy" >> yyy | tft.AnalyzeAndTransformDataset(...)
TemplatingDataflowPipelineRunner provides the ability to separate our preprocessing development from the parameterized operations - see here: https://cloud.google.com/dataflow/docs/templates/overview - basically:
A) in PipelineOptions-derived types, change the option types to ValueProvider (Python way: type inference or type hints???)
B) change the runner to TemplatingDataflowPipelineRunner
C) mvn archetype:generate to store the template in GCS (Python way: a YAML file like TF Hypertune???)
D) gcloud beta dataflow jobs run --gcs-location --parameters (see the sketch after this question)
The question is: can you show me how we can use tf.Transform to leverage TemplatingDataflowPipelineRunner?
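For reference, the command in D) expands to something like the following once a template has been staged (a sketch; the job name, bucket path, and parameter name are placeholders):
gcloud beta dataflow jobs run my-preprocess-job \
    --gcs-location=gs://my-bucket/templates/my-template \
    --parameters=value_provider_arg=some_value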
Python templates are available as of April 2017 (see documentation). The way to operate them is the following:
Define UserOptions subclassed from PipelineOptions.
Use the add_value_provider_argument API to add specific arguments to be parameterized.
Regular non-parameterizable options will continue to be defined using argparse's add_argument.
from apache_beam.options.pipeline_options import PipelineOptions


class UserOptions(PipelineOptions):
    @classmethod
    def _add_argparse_args(cls, parser):
        parser.add_value_provider_argument('--value_provider_arg', default='some_value')
        parser.add_argument('--non_value_provider_arg', default='some_other_value')
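To show how such an option is consumed, here is a minimal sketch continuing from the class above (UserOptions and the argument names come from that snippet; ValueProvider values are only resolved via .get() at runtime):
options = PipelineOptions()
user_options = options.view_as(UserOptions)

# The templated value is resolved when the job actually runs,
# e.g. inside a DoFn:
# value = user_options.value_provider_arg.get()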
Note that Python doesn't have a TemplatingDataflowPipelineRunner, and neither does Java 2.X (unlike what happened in Java 1.X).
Unfortunately, Python pipelines cannot be used as templates. It is only available for Java today. Since you need to use the python library, it will not be feasible to do this.
tensorflow_transform would also need to support ValueProvider so that you can pass in options as a value provider type through it.