I'm looking for a Google Cloud-based solution that provides a fast and safe way to pop from or fill queues from multiple threads/processes.
Basically, I have a list of unique items, and I need a low-latency way to hand each item to exactly one user (whenever a user requests one).
A pattern like Redis List/pop is perfect for that, but I'm open to any other solution that can achieve the same result.
The canonical way to achieve this is, or is about to become, Google’s own Cloud Memorystore: https://cloud.google.com/memorystore/
Cloud Memorystore for Redis provides a fully managed in-memory data store service built on scalable, more secure, and highly available infrastructure managed by Google.
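As a sketch of the List/pop pattern on Memorystore (assuming the redis-py client; the host address and the key name `unique_items` are placeholders, not anything from Memorystore itself):

```python
# Sketch of the Redis List/pop pattern, assuming the redis-py
# client. The key name "unique_items" is a placeholder.

def fill_items(client, items, key="unique_items"):
    # RPUSH appends all items in one round trip.
    if items:
        client.rpush(key, *items)

def claim_item(client, key="unique_items"):
    # LPOP executes atomically on the Redis server, so each item
    # is handed to exactly one caller even when many threads or
    # processes pop concurrently. Returns None when the list is empty.
    return client.lpop(key)

# Usage against a real Memorystore instance (IP is a placeholder):
#   import redis
#   client = redis.Redis(host="10.0.0.3", port=6379)
#   fill_items(client, ["a", "b", "c"])
#   item = claim_item(client)
```

Because the atomicity lives server-side in LPOP, no client-side locking is needed.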
I've found this nice article on how to stream data directly from Google Cloud Storage into tf.data. This is super handy if your compute tier has limited storage (like on Knative in my case) and network bandwidth is sufficient (and free of charge anyway).
tfds.load(..., try_gcs=True)
Unfortunately, my data resides in a non-Google bucket, and this isn't documented for other cloud object store systems.
Does anybody know if it also works in non-GCS environments?
I'm not sure how this is implemented in the library, but it should be possible to access other object store systems in a similar way.
You might need to extend the current mechanism to use a more generic API like the S3 API (most object stores have this as a compatibility layer). If you do need to do this, I'd recommend contributing it back upstream, as it seems like a generally-useful capability when either storage space is tight or when fast startup is desired.
I'm trying to build a data-collection web endpoint. The use case is similar to the Google Analytics collect API: I want to add this endpoint (GET method) to all pages on the website and, on page load, collect page info through this API.
Actually, I'm thinking of doing this with Google Cloud services like Endpoints and BigQuery (for storing the data). I don't want to host it on any dedicated servers; otherwise, I'll end up doing a lot of work managing and monitoring the service.
Please suggest how I can achieve this with Google Cloud services, or point me in the right direction if my idea is wrong.
I suggest focusing first on deciding where you want your code to run. There are several GCP options that don't require dedicated servers:
Google App Engine
Cloud Functions/Firebase Functions
Cloud Run (new!)
Look here to see which support Cloud Endpoints.
All of these products can support running code that takes the data from the request and sends it to the BigQuery API.
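As an illustration, a minimal HTTP Cloud Function in Python could look like the following sketch. The project/dataset/table name and the field names are made up; `insert_rows_json` is the streaming-insert call in the google-cloud-bigquery client library:

```python
# Sketch of an HTTP Cloud Function that records a page view in
# BigQuery. Table and field names are placeholders.
import datetime

def build_row(args):
    # Map query-string parameters onto a BigQuery row.
    return {
        "page": args.get("page", ""),
        "referrer": args.get("ref", ""),
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }

def collect(request):
    # Entry point for the Cloud Function (HTTP trigger).
    from google.cloud import bigquery  # imported lazily

    client = bigquery.Client()
    errors = client.insert_rows_json(
        "my-project.analytics.pageviews", [build_row(request.args)]
    )
    return ("", 500) if errors else ("", 204)
```

Returning 204 (no content) keeps the response tiny, which suits a tracking pixel-style endpoint embedded in every page.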
There are various ways of achieving what you want. David's answer is absolutely valid, but I would like to introduce Stackdriver Custom Metrics to the discussion.
Custom metrics are similar to regular Stackdriver Monitoring metrics, but you create your own time series (Stackdriver lingo, described here) to keep track of whatever you want, and clients can send in their data through an API.
You could achieve the same thing with a compute solution (Google Cloud Functions, for example) and a database (Google Cloud Bigtable, for example) and write your own logic, but Custom Metrics is an already-built solution that includes dashboards and alerting policies while being more managed.
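A hedged sketch of writing one data point to a custom metric with the google-cloud-monitoring (v3) client library; the metric name `pageviews` and the project ID are placeholders:

```python
# Sketch: write one point to a Cloud Monitoring (Stackdriver)
# custom metric. Metric name and project id are placeholders.
import time

CUSTOM_PREFIX = "custom.googleapis.com/"

def metric_type(name):
    # Custom metric types must live under the custom.googleapis.com/ prefix.
    return CUSTOM_PREFIX + name

def write_point(project_id, name, value):
    from google.cloud import monitoring_v3  # imported lazily

    client = monitoring_v3.MetricServiceClient()
    series = monitoring_v3.TimeSeries()
    series.metric.type = metric_type(name)
    series.resource.type = "global"
    now = time.time()
    interval = monitoring_v3.TimeInterval(
        {"end_time": {"seconds": int(now), "nanos": int((now % 1) * 1e9)}}
    )
    point = monitoring_v3.Point(
        {"interval": interval, "value": {"double_value": value}}
    )
    series.points = [point]
    client.create_time_series(
        name=f"projects/{project_id}", time_series=[series]
    )
```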
I am trying to query public utility data from an API (oasis.caiso.com) with a threaded script in R. Apparently this API will refuse requests from certain IP addresses if too many are made. Therefore, I need to run many different API requests in parallel across different IP addresses, and I am wondering if a machine with many CPUs on Google Cloud Platform will allow this.
I was looking at the n1-highcpu-96 option from this page: https://cloud.google.com/compute/docs/machine-types
If this is a poor solution can anyone suggest another distributed computing solution that can scale to allow dozens or even hundreds of API queries simultaneously from different IPs?
If I needed multiple IPs to perform "light" API calls, I would not scale vertically (with a single 96-core machine). I would create an instance group of 50, 100, or n preemptible micro or small Debian instances, with the machine size depending on the kind of computation you need to perform.
You can set up a startup script, loaded via instance metadata or baked into a custom image, that connects to the API server, does its work, and saves the result to a bucket. If an instance gets an "API refused" response, I would simply have it shut itself down, letting the instance group create a new one for me, possibly with a new IP.
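The worker run by that startup script could be a small loop along these lines (a sketch: the endpoint URL, the "refused" status codes, and the bucket-upload step are all assumptions, not details from the API):

```python
# Sketch of the per-instance worker: fetch from the API, save the
# result, and stop the VM when the API refuses us so the managed
# instance group can replace it (possibly with a new IP).
# Endpoint and "refused" status codes are assumptions.
import subprocess
import urllib.error
import urllib.request

def is_refused(status):
    # Treat rate-limit/forbidden responses as "this IP is blocked".
    return status in (403, 429)

def fetch(url):
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, b""

def worker(url):
    status, data = fetch(url)
    if is_refused(status):
        # Stopping the VM lets the managed instance group repair
        # (recreate) it, which may come up with a fresh external IP.
        subprocess.run(["sudo", "shutdown", "-h", "now"])
        return
    # ...save `data` to a bucket, e.g. via `gsutil cp` or the
    # google-cloud-storage client...
```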
This, I think, is one easy way to achieve what you want, but there are certainly other solutions.
I'm not sure exactly what you are trying to achieve, and I think you should first check that it is legal and that the owner of the API agrees.
We're working with Cumulocity and we'd like to offer services to our customers that are not currently possible to implement with Cumulocity. As an example, we'd like to be able to retrieve a list of devices located within x kilometers of a given point.
Currently there are two limitations that prevent us from doing so:
the impossibility of extending the Cumulocity API with custom routes/parameters
the impossibility of implementing custom functions for specific API GET calls
I can think of a workaround to achieve this, like a POST request of an event that would be processed by an Esper rule, generating another event/measurement that could then be accessed by a GET. But I think we can agree this is not a suitable mechanism.
Please note that the use case I described above is just an example. Our needs are not limited to this, and we need a standardized way to expand our services without requiring updates on Cumulocity's side.
There are two topics here, I believe:
Geo-querying: Some geographical querying and aggregation use cases can be handled through CEL. A general geo-querying API is on the Cumulocity roadmap. Note: This use case is not only related to extending the API, as such queries go right down into the database.
Extending the API: That is actually possible. Cumulocity has a microservices API in which you can expose other APIs under the URL /services/.... This is, for example, how connectivity platforms are interfaced. The API is not on the website because it's not GA yet, but you can certainly discuss it with your Cumulocity contact or open a ticket. This, by the way, also includes adding permissions for the new microservices, so that you can do proper authentication and authorization.
I'm trying to make a Google gadget that stores some data (say, statistics of users' actions) in a persistent way (i.e., statistics accumulate over time and over multiple users). I also want this data to be placed on Google's free hosting, possibly together with the gadget itself.
Any ideas on how to do that?
I know the Google Gadgets API has tools for working with remote data, but then the question is where to host it. Google Wave seemed to be an option, but it is no longer supported.
You should get a server and host it there.
You then have the best control over the code, the performance, and the data itself.
There are several hosting providers out there who offer hosting for a reasonable price.
To name a few: Hostgator.com (US), Hetzner.de (DE), http://swedendedicated.com (SE; never used it, just from a quick search on the internet).