Having trouble uploading larger files in BigQuery - google-bigquery

I started the last course of the certificate, the Case Study, and am working on Case 1 from the Google Data Analytics Certificate with BigQuery SQL, but I am struggling to upload the 202008 file because it is too large: "Local uploads are limited to 100 MB. Please use Google Cloud Storage for larger files."
Then I saw a video suggesting I could reduce the size by saving the Excel file as an Excel Binary Workbook (*.xlsb), but that did not work either. The error appeared after I converted the file from CSV to binary, which reduced its size from 93,213 to about half of that. Did anyone face similar problems uploading the data for Case 1?
Any help will be appreciated

Basically, you first need to upload the files to Google Cloud Storage, preferably with gsutil, which can upload large files in multiple parts; then you can load them from GCS into BigQuery.
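For reference, a minimal sketch of that two-step flow using the Python client libraries (google-cloud-storage and google-cloud-bigquery); the bucket, object path, dataset, and table names below are placeholders, not the asker's actual names:

```python
from google.cloud import bigquery, storage

# Step 1: upload the local CSV to a Cloud Storage bucket (placeholder names).
storage_client = storage.Client()
bucket = storage_client.bucket("my-bucket")
blob = bucket.blob("trips/202008-tripdata.csv")
blob.upload_from_filename("202008-tripdata.csv")  # local file

# Step 2: load the GCS object into a BigQuery table.
bq_client = bigquery.Client()
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    autodetect=True,        # infer the schema from the file
    skip_leading_rows=1,    # skip the CSV header row
)
load_job = bq_client.load_table_from_uri(
    "gs://my-bucket/trips/202008-tripdata.csv",
    "my_dataset.trips_202008",
    job_config=job_config,
)
load_job.result()  # wait for the load job to finish
```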

Related

upload a >4mb file to an Azure File Share in 2023

I need to upload files larger than 4 MB to an Azure File Share.
Previously the guidance was to use the Data Movement Library. The GitHub page implies it is being abandoned / no longer worked on and that the v12 libraries should be used instead, but it looks like the 4 MB limit is still in place (see Azure Storage File Shares client library for .NET).
What is the current way to upload files >4 MB to a file share?
The maximum size of a file that can be uploaded to an Azure File Share is 4 TiB (Reference).
When you upload a file to a File Share, you have to upload it in chunks, and the maximum size of each chunk is 4 MiB. I think this is where you are getting confused.
So, to upload a file larger than 4 MiB, what you would need to do is create an empty file of the full target size using the ShareFileClient.CreateAsync method, specifying the size of the file there.
Once that is done, you would need to read the source file in chunks (maximum chunk size 4 MiB) and call the ShareFileClient.UploadAsync method, passing the stream data read from the source file.
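That answer refers to the .NET SDK; as a rough sketch of the same create-then-upload-in-ranges pattern using the Python azure-storage-file-share package (the connection string variable, share name, and paths are placeholders):

```python
import os
from azure.storage.fileshare import ShareFileClient

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB, the maximum range size per call

source_path = "large-file.bin"  # placeholder local file
file_client = ShareFileClient.from_connection_string(
    conn_str=os.environ["AZURE_STORAGE_CONNECTION_STRING"],  # assumed env var
    share_name="my-share",                                   # placeholder share
    file_path="uploads/large-file.bin",                      # placeholder destination
)

# 1. Create an empty file of the full size (analogous to CreateAsync in .NET).
total_size = os.path.getsize(source_path)
file_client.create_file(size=total_size)

# 2. Upload the content range by range, at most 4 MiB per call.
with open(source_path, "rb") as f:
    offset = 0
    while True:
        chunk = f.read(CHUNK_SIZE)
        if not chunk:
            break
        file_client.upload_range(data=chunk, offset=offset, length=len(chunk))
        offset += len(chunk)
```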

Bigquery Unloading Large Data to a Single GZIP File

I'm using the BigQuery console and was planning to extract a table into Google Cloud Storage as a GZIP file, but I ran into an error asking me to use a wildcard in the filename. Based on the Google docs, this is a limitation for large volumes of data: the extract needs to be split across multiple files.
https://cloud.google.com/bigquery/docs/exporting-data#console
By any chance is there a workaround so I could have a single compressed file loaded to Google Cloud Storage instead of multiple files? I was using Redshift previously and this wasn't an issue.
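For context, the wildcard export that the console and docs require looks roughly like this with the Python client (the table, bucket, and paths are placeholders); each output shard is compressed individually:

```python
from google.cloud import bigquery

client = bigquery.Client()

# A large table must be exported to a wildcard URI so BigQuery can shard
# the output; every shard is GZIP-compressed.
job_config = bigquery.ExtractJobConfig(
    destination_format=bigquery.DestinationFormat.CSV,
    compression=bigquery.Compression.GZIP,
)
extract_job = client.extract_table(
    "my_project.my_dataset.my_table",            # placeholder source table
    "gs://my-bucket/export/my_table-*.csv.gz",   # wildcard destination
    job_config=job_config,
)
extract_job.result()  # wait for the export to finish
```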

How to upload and download media files using GUNDB?

I'm trying to use GUN to create a file-sharing platform. I read the tutorial and the API docs, but I couldn't find a general way to upload/download a file.
I hear that there is a 5 MB localStorage limitation in GUN, so if I want to upload a large file, I have to slice it and then store the slices in GUN. But right now I can't find a way to store a file in GUN at all.
I read the question from Retric and I know how to store an image in GUN, but can I store other types of files such as .zip or .doc files? Is there a general API for file storage?
I wrote a quick little app in 35 lines of HTML that demonstrates file sharing for images, videos, sound, etc.
https://github.com/amark/gun/blob/master/examples/basic/upload.html
I've sent 20 MB files through it, though yeah, I'm sure there is a better way of splitting it up into 2 MB chunks - that is currently not automatic, you'd have to code it yourself (see the sketch after this answer).
We'll have a feature in the future that will automatically split up video files. Do you want to help with this?
I think on the download side, all you have to do is make sure you have the whole file (stitch it back together if you do write a splitter-upper) and add it to some <a href=" target. Actually, I'm not sure exactly how, but I know browsers have supported the download attribute on links for a few years now, so you can create a download link even for an in-memory file... but you'll have to search online for how. Then please write a tutorial and share it with the community!!
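The split-and-stitch logic itself is independent of GUN; as a rough, language-agnostic illustration only (sketched in Python here, while the actual GUN calls would be JavaScript), chunking and reassembly could look like this:

```python
import base64
from pathlib import Path

CHUNK_SIZE = 2 * 1024 * 1024  # 2 MB per chunk, matching the size mentioned above

def split_file(path):
    """Read a file and yield (index, base64-encoded chunk) pairs."""
    data = Path(path).read_bytes()
    for i in range(0, len(data), CHUNK_SIZE):
        yield i // CHUNK_SIZE, base64.b64encode(data[i:i + CHUNK_SIZE]).decode()

def stitch_file(chunks, out_path):
    """Reassemble (index, chunk) pairs back into the original file."""
    ordered = (c for _, c in sorted(chunks, key=lambda pair: pair[0]))
    Path(out_path).write_bytes(b"".join(base64.b64decode(c) for c in ordered))
```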
I would recommend using IPFS for file storage and GUN to store the links to those files. GUN isn't meant for file storage, I believe; it is primarily for user/graph data. Hence the 5 MB limitation.

Uploading a huge file to s3 (larger than my hard drive)

I'm trying to upload a file to an S3 bucket, but the problem is that the file I want to upload (and also create) is bigger than what my hard drive can hold (I want to store a 500 TB file in the bucket).
Is there any way to do so?
The file is generated, so I thought about generating the file as I go while it uploads, but I can't quite figure out how to do it.
Any help is appreciated :)
Thanks in advance
The Multipart Upload API allows you to upload a file in chunks, including on-the-fly content generation... but the maximum size of an object in S3 is 5 TiB.
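As a rough sketch of that multipart flow with content generated on the fly rather than read from disk (boto3, placeholder bucket and key, dummy generated bytes; every part except the last must be at least 5 MiB):

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "my-bucket", "generated/huge-object.bin"  # placeholders

PART_SIZE = 8 * 1024 * 1024  # parts (except the last) must be >= 5 MiB

def generate_parts(total_parts):
    """Stand-in generator: produce each part's bytes on the fly."""
    for _ in range(total_parts):
        yield b"\0" * PART_SIZE  # replace with real generated content

upload = s3.create_multipart_upload(Bucket=bucket, Key=key)
parts = []
try:
    for number, body in enumerate(generate_parts(total_parts=4), start=1):
        resp = s3.upload_part(
            Bucket=bucket, Key=key, UploadId=upload["UploadId"],
            PartNumber=number, Body=body,
        )
        parts.append({"PartNumber": number, "ETag": resp["ETag"]})
    s3.complete_multipart_upload(
        Bucket=bucket, Key=key, UploadId=upload["UploadId"],
        MultipartUpload={"Parts": parts},
    )
except Exception:
    # Abort so the partially uploaded parts don't keep accruing storage costs.
    s3.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload["UploadId"])
    raise
```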
Also, it costs a minimum of $11,500 to store 500 TiB in S3 for 1 month, not to mention the amount of time it takes to upload it... but if this is a justifiable use case, you might consider using some Snowball Edge devices, each of which has its own built-in 100 TiB of storage.

Can Someone Help Me Troubleshoot Error In BQ "does not contain valid backup metadata."

I keep trying to upload a new table onto my company's BQ, but I keep getting the error you see in the title ("does not contain valid backup metadata.").
For reference, I'm uploading a .csv file that has been saved to our Google Cloud Storage. It's being uploaded as a native table.
Can anyone help me troubleshoot this?
It sounds like you are specifying the file type DATASTORE_BACKUP. When you specify that file type, BigQuery takes whatever URI you provide (even if it has a .csv suffix) and looks for Cloud Datastore backup files relative to that URI.
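If that is the case, a minimal sketch of the fix with the Python client would be to set the source format to CSV explicitly when loading (the URI and table names below are placeholders):

```python
from google.cloud import bigquery

client = bigquery.Client()
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,  # not DATASTORE_BACKUP
    autodetect=True,
    skip_leading_rows=1,
)
load_job = client.load_table_from_uri(
    "gs://my-bucket/path/to/data.csv",  # placeholder source URI
    "my_dataset.my_new_table",          # placeholder destination table
    job_config=job_config,
)
load_job.result()
```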