I've followed the instructions in the Google Translate Multiple Documents guide, set up a batch of Office documents to be translated, and successfully submitted a request using PowerShell. I get the expected response as follows, apparently indicating that the request was successful:
{
"name": "projects/<My-Project-Number-Here>/locations/us-central1/operations/20220525-16311653521501-<Generated-Job-ID-GUID>",
"metadata": {
"#type": "type.googleapis.com/google.cloud.translation.v3.BatchTranslateDocumentMetadata",
"state": "RUNNING"
}
}
All good so far.
However, the problem is that I don't see any translated documents appearing in the Storage Bucket that I specified in the output_config of the request .JSON file, and I can't seem to find a way to view the status of the long-running job that it has created.
If I re-submit the same job, it tells me that the output bucket is in use by another batch process, which indicates that something is happening.
But, for the life of me, I can't see where in the Google Cloud Dashboard, or via the gcloud command line, I can query the status of the job it has created. It just says that it has created a batch job and that's it. No further feedback. I am assuming that it fails somehow, as there are no resulting translated files in the output storage bucket location.
Does anyone have experience with this and could they share how to query the job?
Thanks in advance.
Regards,
Glenn
EDIT: OK - the job hasn't failed - I just needed to be patient. I checked the output storage bucket and there are a bunch of files in there, so it is working (phew - I have a lot of documents to translate). But I would still like to know if there is a way to see the status of the job.
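Since batchTranslateDocument returns a standard long-running operation, the operation name in the response above looks like it should be queryable with a plain GET against the v3 operations endpoint. A sketch of what I would expect to work (I haven't verified this beyond the documentation; the project number and operation name are the placeholders/values from the response above):
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://translation.googleapis.com/v3/projects/<My-Project-Number-Here>/locations/us-central1/operations/20220525-16311653521501-<Generated-Job-ID-GUID>"
The metadata in the reply should show the state field (RUNNING, SUCCEEDED or FAILED) updating as the batch progresses.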
In our current production system, we have several files that are processed by the Hybris hot folder from an external system on a daily/hourly basis. What is the best way to check the status of each file being processed by the hot folder? Is there any OOTB dashboard functionality available for the hot folder, or does it require custom development?
So far, I'm checking the Backoffice cronjob logs, but that is a very cumbersome process: monitoring logs, finding the unique cronjob ID, and so on. Are there any better approaches?
I'm looking for something similar to the Jenkins job status view.
Appreciate your inputs.
There is a workaround. Please check this link:
https://help.sap.com/viewer/d0224eca81e249cb821f2cdf45a82ace/1808/en-US/b8004ccfcbc048faa9558ae40ea7b188.html?q=CronJobProgressTracker
First, you need to add the CronJobProgressTracker class to your current cronjob. Then you can see the progress of the cronjob in either the HAC or the Backoffice:
HAC: execute a flexible search (see the sketch below).
Backoffice: add a setting for the CronJobHistory menu, then just click the refresh button to see the latest state of the progress.
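For the HAC route, this is roughly the kind of flexible search you could run (a sketch; the LIKE filter is just an example, adjust it to your own cronjob code):
SELECT {c.code}, {c.status}, {c.result}, {c.startTime}, {c.endTime} FROM {CronJob AS c} WHERE {c.code} LIKE '%hotfolder%' ORDER BY {c.startTime} DESC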
As far as I know, it is not possible to track file progress state in the OOTB hot folder. You could also write custom code in your upload process, although, to be honest, that suggestion is vague: I would need to see your hot folder XML context to give more specific hints.
The hot folder ingests a file in a series of steps specified by the beans in hot-folder-spring.xml. Add loggers to each of the beans, e.g. batchFilesHeader, batchExternalTaxConverterMapping.
Then you can see the status in the console logs.
I am using the Google Cloud Logging web UI to export Google Compute Engine logs to a BigQuery dataset. According to the docs, you can even create the BigQuery dataset from this web UI (it simply asks you to give the dataset a name). It also automatically sets up the correct permissions on the dataset.
It seems to save the export configuration without errors, but a couple of hours have passed and I don't see any tables created in the dataset. According to the docs, exporting the logs will stream the logs to BigQuery and will create the table with the following template:
my_bq_dataset.compute_googleapis_com_activity_log_YYYYMMDD
https://cloud.google.com/logging/docs/export/using_exported_logs#log_entries_in_google_bigquery
I can't think of anything else that might be wrong. I am the owner of the project and the dataset is created in the correct project (I only have one project).
I also tried exporting the logs to a Google Cloud Storage bucket and still had no luck there. I set the permissions correctly using gsutil according to this:
https://cloud.google.com/logging/docs/export/configure_export#setting_product_name_short_permissions_for_writing_exported_logs
And finally I made sure that the 'source' I am trying to export actually has some log entries.
Thanks for the help!
Have you ingested any log entries since configuring the export? Cloud Logging only exports entries to BigQuery or Cloud Storage that arrive after the export configuration is set up. See https://cloud.google.com/logging/docs/export/using_exported_logs#exported_logs_availability.
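One way to check is to read back recent entries from the command line and compare their timestamps with the time you created the export. A sketch with a current gcloud SDK (the filter is just a substring match on the Compute Engine activity log name; adjust it to whatever your export selects):
gcloud logging read 'logName:"activity_log"' --limit=5
If nothing there is newer than the export configuration, generate some fresh activity (e.g. stop and start an instance) and check the dataset again after a few minutes.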
You might not have granted edit permission to 'cloud-logs@google.com' on the dataset in the BigQuery console. Refer to this.
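If you want to confirm what access the dataset actually has, one option (a sketch, assuming the bq CLI is configured for the right project and the dataset is named my_bq_dataset as above) is to dump the dataset metadata and inspect the access list:
bq show --format=prettyjson my_bq_dataset
The access array in the output should include a WRITER entry for cloud-logs@google.com (or the logging service account); if it's missing, re-save the export configuration or add the permission by hand.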
Whenever I try to load a CSV file stored in Cloud Storage into BigQuery, I get an InternalError (both using the web interface and the command line). The CSV is (an abbreviated) part of the Google Ngram dataset.
A command like:
bq load 1grams.ngrams gs://otichybucket/import_test.csv word:STRING,year:INTEGER,freq:INTEGER,volume:INTEGER
gives me:
BigQuery error in load operation: Error processing job 'otichyproject1:bqjob_r28187461b449065a_000001504e747a35_1': An internal error occurred and the request could not be completed.
However, when I load this file directly through the web interface using File upload as the source (loading from my local drive), it works.
I need to load from Cloud Storage, since I need to load much larger files (original ngrams datasets).
I tried different files, always with the same result.
I'm an engineer on the BigQuery team. I was able to look up your job, and it looks like there was a problem reading the Google Cloud Storage object.
Unfortunately, we didn't log much of the context, but looking at the code, the things that could cause this are:
The URI you specified for the job is somehow malformed. It doesn't look malformed, but maybe there is some odd UTF-8 non-printing character that I didn't notice.
The 'region' for your bucket is somehow unexpected. Is there any chance you've set the data location on your GCS bucket to something other than {US, EU, or ASIA}? See here for more info on bucket locations. If so, and you've set the location to a region rather than a continent, that could cause this error (a quick way to check is sketched below).
There could have been some internal error in GCS that caused this. However, I didn't see this in any of the logs, and it should be fairly rare.
We're putting in some more logging to detect this in the future and to fix the issue with regional buckets (however, loads from regional buckets may still fail, because BigQuery doesn't support cross-region data movement, but at least they will fail with an intelligible error).
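In case it helps with the 'region' point above: a quick way to check the bucket's location from the command line (a sketch, assuming gsutil is configured for the project that owns the bucket) is:
gsutil ls -L -b gs://otichybucket
then look at the 'Location constraint' line in the output; anything other than US, EU or ASIA would point at the regional-bucket case described above.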
I submitted a load job to Google BigQuery which loads 12 compressed (gzip) tabular files from Google Cloud Storage. Each file is about 2 GB compressed. The command I ran was similar to:
bq load --nosync --skip_leading_rows=1 --source_format=CSV \
  --max_bad_records=14000 -F "\t" warehouse:some_dataset.2014_lines \
  gs://bucket/file1.gz,gs://bucket/file2.gz,gs://bucket/file12.gz \
  schema.txt
I'm receiving the following error from my BigQuery load job with no explanation of why:
Error Reason:internalError. Get more information about this error at
Troubleshooting Errors: internalError.
Errors: Unexpected. Please try again.
I'm certain that the schema file is correctly formatted, as I've successfully loaded files using the same schema but a different set of files.
I'm wondering in what kinds of situations an internal error like this would occur, and what are some ways I could go about debugging this issue?
My BQ job id: bqjob_r78ca777a8ad4bdd9_0000014e2dc86e0e_1
Thank you!
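For anyone debugging similar failures: the full job status, including the errors collection, can usually be pulled back with the bq CLI (a sketch using the job ID quoted above):
bq show --format=prettyjson -j bqjob_r78ca777a8ad4bdd9_0000014e2dc86e0e_1
For this job it presumably just repeats the generic internalError above, but for most load failures it includes per-record reasons and locations.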
There are some cases you can get into with large .gz input files that are not always reported with a clear cause. This can happen especially (but not exclusively) with highly compressible text, so that 1 GB of compressed data represents an unusually large amount of text.
The documented limit on this page for compressed CSV/JSON is 1 GB. If that is current, I would actually expect an error on your 2 GB input. Let me check that.
Are you able to split these files into smaller pieces and try again?
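If the files are line-oriented TSV, the splitting itself can be done with something along these lines (a sketch for a unix shell; pick a line count that keeps each compressed piece well under the limit, and note that only the first piece will keep the header row, so --skip_leading_rows=1 would eat a real data row from the others):
zcat file1.gz | split -l 20000000 - file1_part_
gzip file1_part_*
and then load gs://bucket/file1_part_*.gz (or list the pieces explicitly) instead of the original file.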
(Meta: Grace, you are correct that Google says that "Google engineers monitor and answer questions with the tag google-bigquery" on StackOverflow. I am a Google engineer, but there are also many knowledgeable people here who are not. Google's docs could perhaps give more explicit guidance: the questions that are most valuable to the StackOverflow community are ones that a future reader can recognize as the same problem they're seeing, and preferably ones that a non-Googler can answer from public information. It's tough in your case because the error is broad and the cause is unclear. But if you're able to reproduce the problem using an input file that you can make public, more people here will be able to take a crack at it. You can also file an issue for questions that really no one outside Google can do much with.)
I want to know where these log files actually get saved in Test Manager.
As per my understanding, they're saved in SQL Server, but I don't know how to retrieve them from it (maybe using SharePoint Reporting Services). If you look at the logs, you'll find a link like mtm://<tfsServerName>:8080/tfs/defaultcollection/p:trunk/Testing/testrun/open?id=118.
I'm trying to find more info. If I get it, I'll post it here.
In the meantime, I have created my own customized logs within the test case, so that I can use them to debug to some extent.