Monitoring AWS account spend - amazon-s3

I am planning to build a dashboard to monitor our AWS expenditure. After googling, I realized AWS has no API that developers can hook into to build an app that gets this data in real time. Is there any way to achieve it? Kindly help me out.

I believe you are looking to monitor current AWS usage.
AWS provides an option for this through "AWS programmatic billing access".
Once you enable it, AWS uploads a CSV file of your current usage to a specified S3 bucket every few hours.
You then need to write a program in your favourite programming language, using the AWS S3 SDK, to download and parse the CSV file and get near real-time data.
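For illustration, here is a minimal Python sketch of that download step using boto3; the bucket name is a placeholder and the exact naming of the billing CSVs is an assumption, so adjust it to whatever programmatic billing access actually writes into your bucket.

```python
import boto3

s3 = boto3.client("s3")
BUCKET = "my-billing-bucket"   # placeholder: the bucket you configured for billing reports

# Billing files are re-uploaded several times a day; grab the most recent one.
resp = s3.list_objects_v2(Bucket=BUCKET)
objects = resp.get("Contents", [])
if objects:
    latest = max(objects, key=lambda o: o["LastModified"])
    s3.download_file(BUCKET, latest["Key"], "current-bill.csv")
    print("Downloaded", latest["Key"])
```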
Newvem has a very good set of how-to guides for working with AWS.
One of the guides,
http://www.newvem.com/how-to-set-up-programmatic-billing-access-for-your-aws-account/
covers enabling programmatic billing access.
Also refer to http://www.newvem.com/how-to-track-costs-of-amazon-s3-cloud-objects/ , which covers how to track the cost of Amazon S3.
Also, as mentioned by Mike, AWS provides billing alerts through CloudWatch.
I hope the above helps.
I recommend referring to the Newvem how-to guides to get more insight into AWS and its offerings.
Thanks,
Taral Shah

If you're looking to monitor actual spending data, @Taral is correct: AWS Programmatic Billing Access is the best tool for recording the data. However, you don't actually have to write a dashboard tool to view it; there are a lot of them out there already.
The company I work for, Cloudability, automatically imports all of your AWS Detailed Billing data and lets you build out all of the AWS spending and usage reports you need without ever having to write any code or mess with any spreadsheets.
If you want to learn more, there's a good blog post at http://blog.cloudability.com/introducing-cloudabilitys-aws-cost-analytics-powered-by-aws-detailed-billing-files/

For more information about CloudWatch-enabled billing monitoring, refer to
http://aws.amazon.com/about-aws/whats-new/2012/05/10/announcing-aws-billing-alerts/
To learn AWS faster, refer to the Newvem how-to guides at
http://www.newvem.com/amazon-cloud-knowledge-center/how-to-guides/
Regards
Taral

The first thing is to enable detailed billing export to an S3 bucket (see here).
Then I wrote a simplistic server in Python (BSD licensed) that retrieves your detailed bill and breaks it down per service type and usage type (see it on this GitHub repo).
Thus you can check at any time what your costs are and which services cost you the most, etc.
If you tag your EC2 instances, S3 buckets, etc., they will also show up on a dedicated line.
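As a rough sketch of that kind of per-service breakdown (not the linked project itself), assuming the legacy detailed billing CSV with "ProductName", "UsageType" and "Cost" columns; check the header of your own file, since the column names can differ:

```python
import csv
from collections import defaultdict

# Sum up cost per (service, usage type) from a previously downloaded detailed bill.
costs = defaultdict(float)
with open("detailed-bill.csv", newline="") as f:
    for row in csv.DictReader(f):
        if row.get("Cost"):   # skip summary/blank rows without a cost value
            costs[(row.get("ProductName", "?"), row.get("UsageType", "?"))] += float(row["Cost"])

# Print the most expensive line items first.
for (product, usage), cost in sorted(costs.items(), key=lambda kv: -kv[1]):
    print(f"{cost:10.2f}  {product:30s}  {usage}")
```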

CloudWatch has an "estimated billing" API, which will get you most of the way there. See this ServerFault question for more detail: https://serverfault.com/questions/350971/how-can-i-monitor-daily-spending-on-aws
If you are looking for more detail, you will need to download your CSV-formatted bill and parse it, but your question is too generic to give a more specific answer. Even this will not be real time, though.
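For illustration, a minimal sketch of pulling that metric with boto3; note that the billing metrics are only published in us-east-1, and only after you enable billing alerts in the account preferences.

```python
from datetime import datetime, timedelta

import boto3

# Billing metrics are only available in us-east-1.
cw = boto3.client("cloudwatch", region_name="us-east-1")

resp = cw.get_metric_statistics(
    Namespace="AWS/Billing",
    MetricName="EstimatedCharges",
    Dimensions=[{"Name": "Currency", "Value": "USD"}],
    StartTime=datetime.utcnow() - timedelta(days=1),
    EndTime=datetime.utcnow(),
    Period=21600,                     # the metric is published roughly every few hours
    Statistics=["Maximum"],
)

points = sorted(resp["Datapoints"], key=lambda p: p["Timestamp"])
if points:
    print("Estimated month-to-date charges: $%.2f" % points[-1]["Maximum"])
```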

BigQuery resource used every 12h without configuring it

I need some help understanding what happened in our cloud project to have a BigQuery resource running every 12 hours without us configuring it. It also seems fairly heavy, because we have been charged, on average, one dollar per day for the past month.
After checking in Logs Explorer, I saw several logs regarding the BigQuery resource.
The requests were made under the email of one of our software engineers. Since I removed him from our Firebase project, there are no more requests.
However, that person did not set up or configure anything regarding BigQuery, so we are a bit lost here, and this is why we are asking for help to investigate and understand what is going on.
Hope you will be able to help. Let me know if you need more information.
Thanks in advance.
NB: I have not tried re-adding the engineer's email yet; I want to see how things go for the rest of the month.
The most likely causes I've observed for this in the wild:
A scheduled query was set up by a user (see the sketch after this list for one way to check this).
A Data Studio dashboard was set up and configured to periodically refresh data in the background.
Someone set up a workflow that queries BigQuery, such as Cloud Composer or a Cloud Function, etc.
It's also possible it's just something like a script running in a crontab on someone's machine. The audit log should have relevant details, like the request origin, for cases where it's just something running as part of a bespoke process.
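For the first cause, one way to check is to list the project's scheduled queries, which are Data Transfer Service configs with data_source_id "scheduled_query". A minimal sketch, assuming the google-cloud-bigquery-datatransfer client library and a placeholder project id (add a locations/... segment to the parent if your transfers live outside the default US location):

```python
from google.cloud import bigquery_datatransfer

client = bigquery_datatransfer.DataTransferServiceClient()
parent = "projects/my-project-id"   # placeholder project id

# Scheduled queries show up here with data_source_id == "scheduled_query".
for config in client.list_transfer_configs(parent=parent):
    print(config.data_source_id, config.display_name, config.schedule, config.state)
```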

Snapshot IBM Cloud Object Storage

I am trying to figure out whether it is possible, also with the help of other software (like MinIO, Portworx, Veeam, etc.), to take a snapshot (and eventually restore it later) of the content stored on an IBM Cloud Object Storage instance used as the persistence layer for an OpenShift cluster, through its S3-compatible API endpoints.
Please check out this link and see if it provides what you are looking for:
https://cloud.ibm.com/docs/openshift?topic=openshift-object_storage#cos_backup_restore
Thanks.
In the end I found this in the official IBM Cloud documentation, which achieves roughly what I was after: it explains how to sync two buckets with each other, including across regions, so that you have both the data and its backup on S3 storage:
https://cloud.ibm.com/docs/cloud-object-storage?topic=cloud-object-storage-region-copy
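For illustration, a minimal sketch of that bucket-to-bucket copy over the S3-compatible API with boto3; the endpoint URL, HMAC credentials and bucket names are placeholders, and since this only mirrors the current objects you would run it on a schedule (or combine it with versioning) to approximate snapshots:

```python
import boto3

# IBM COS exposes an S3-compatible endpoint; HMAC credentials work with boto3.
cos = boto3.client(
    "s3",
    endpoint_url="https://s3.eu-de.cloud-object-storage.appdomain.cloud",  # placeholder regional endpoint
    aws_access_key_id="HMAC_ACCESS_KEY",        # placeholder
    aws_secret_access_key="HMAC_SECRET_KEY",    # placeholder
)

SRC, DST = "app-data-bucket", "app-data-backup"   # placeholder bucket names

# Server-side copy every object from the source bucket into the backup bucket.
paginator = cos.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=SRC):
    for obj in page.get("Contents", []):
        cos.copy_object(
            Bucket=DST,
            Key=obj["Key"],
            CopySource={"Bucket": SRC, "Key": obj["Key"]},
        )
        print("copied", obj["Key"])
```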

How to work with AWS Cognito in production environment?

I am working on an application in which I am using AWS Cognito to store user data. I am trying to understand how to manage backup and disaster recovery scenarios for Cognito.
Following are the main queries I have:
What is the availability of this stored user data?
What are the possible scenarios with Cognito that I need to take care of before we go to production?
AWS does not have any published SLA for AWS Cognito, so there is no official guarantee for your data stored in Cognito. As for how durable your data is, AWS Cognito uses other AWS services under the hood (for example, DynamoDB, I think), and data in those services is replicated across Availability Zones.
I guess you are asking about disaster recovery scenarios. There is not much you can do on your end. If you use user pools, there is no feature to export user data as of now. Although you can do so by writing a custom script, a built-in backup feature would be much more efficient and reliable. If you use Federated Identities, there is no way to export and re-use identities. If you use datasets provided by Cognito Sync, you can use Cognito Streams to capture dataset changes. Not exactly a stellar way to back up your data.
In short, there is no official word on availability, and no official backup or DR feature. I have heard there are feature requests for these, but who knows when they will be released. There is also not much you can do by writing custom code or following best practices. The only thing I can think of is to periodically back up your user pool's user data by writing a custom script using the AdminGetUser API. But again, there are rate limits on how many times you can call this API, so a backup using this method can take a long time.
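As a rough sketch of that periodic export, here is one using the paginated ListUsers API instead of calling AdminGetUser per user; the pool id is a placeholder, and passwords cannot be exported either way:

```python
import json

import boto3

idp = boto3.client("cognito-idp")
USER_POOL_ID = "us-east-1_XXXXXXXXX"   # placeholder user pool id

# Page through every user in the pool and keep the attributes we can back up.
users = []
for page in idp.get_paginator("list_users").paginate(UserPoolId=USER_POOL_ID):
    for user in page["Users"]:
        users.append({
            "Username": user["Username"],
            "Attributes": user["Attributes"],
            "Enabled": user["Enabled"],
            "UserStatus": user["UserStatus"],
        })

# Remember to store this file somewhere encrypted - it contains PII.
with open("userpool-backup.json", "w") as f:
    json.dump(users, f, indent=2, default=str)
```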
AWS now offers an SLA for Cognito. In the event they are unable to meet their availability target (99.9% at the time of writing), you will receive service credits.
Even though there are a couple of third-party solutions available, when restoring a user pool the users are created through the admin flow (they are not restored as-is; they are re-created by an admin) and they end up with "Force Change Password" status. So the users will be forced to change their password using a temporary password, and that has to be facilitated from the front end of the application.
More info: https://docs.amazonaws.cn/en_us/cognito/latest/developerguide/signing-up-users-in-your-app.html
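For illustration, a minimal sketch of what the restore side of such tools typically does: re-creating each user with AdminCreateUser, which is exactly why restored users land in "Force Change Password" status. The pool id and backup file are placeholders.

```python
import json

import boto3

idp = boto3.client("cognito-idp")
USER_POOL_ID = "us-east-1_XXXXXXXXX"   # placeholder: the pool being restored into

with open("userpool-backup.json") as f:
    users = json.load(f)

for user in users:
    # "sub" is generated per pool and cannot be set on the new user.
    attrs = [a for a in user["Attributes"] if a["Name"] != "sub"]
    idp.admin_create_user(
        UserPoolId=USER_POOL_ID,
        Username=user["Username"],
        UserAttributes=attrs,
        MessageAction="SUPPRESS",   # don't send invitation emails during the restore
    )
```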
Tools available.
https://www.npmjs.com/package/cognito-backup
https://github.com/mifi/cognito-backup
https://github.com/rahulpsd18/cognito-backup-restore
https://github.com/serverless-projects/cognito-tool
Please bear in mind that some of these tools are outdated and cannot be used. I have tested "cognito-backup-restore" and it is working as expected.
You also have to think about how to secure the user information output by these tools. They usually create a JSON file containing all the user information (except the passwords, as passwords cannot be backed up), and this file is not encrypted.
The best solution so far is to prevent accidental deletion of user pools with AWS SCPs (service control policies).

How to track Amazon AWS S3 bucket downloads using Google Analytics Measurement Protocol?

I'm using AWS S3 as my CDN to store files. Often these are directly linked from places all over the world. I'd like to track the file downloads in the S3 bucket using Google Analytics. It appears Google Analytics Measurement Protocol may be able to do this. But since I'm new to both the AWS environment and GAMP, I was hoping I'm not the first to ever do this. Anyone know of a way this can be accomplished?
I doubt this is possible without you doing extra work on top.
You could create a proxy site that, when hit, records an event to Google Analytics and then redirects to the download page/bucket.
You could also maybe have some script/job/etc scrape events from the AWS dashboards and write them to Google Analytics, although this would probably be less than real-time.
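As a sketch of the proxy idea, assuming the legacy Universal Analytics Measurement Protocol (GA4 uses a different /mp/collect endpoint with a measurement id and API secret), with a placeholder tracking id and bucket URL:

```python
import uuid

import requests
from flask import Flask, redirect

app = Flask(__name__)
GA_TRACKING_ID = "UA-XXXXXXX-1"                       # placeholder UA property id
BUCKET_URL = "https://my-bucket.s3.amazonaws.com"     # placeholder bucket URL


@app.route("/download/<path:key>")
def download(key):
    # Record a download event with the Measurement Protocol, then send the client on to S3.
    requests.post(
        "https://www.google-analytics.com/collect",
        data={
            "v": "1",
            "tid": GA_TRACKING_ID,
            "cid": str(uuid.uuid4()),   # anonymous client id
            "t": "event",
            "ec": "download",
            "ea": "s3_file",
            "el": key,
        },
        timeout=2,
    )
    return redirect(f"{BUCKET_URL}/{key}", code=302)
```

Links shared around the world would then point at the proxy (/download/path/to/file) instead of the bucket directly.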
You can turn on logging for the buckets you care about, then download the little log file fragments that Amazon delivers and feed them into an off-the-shelf analytics package such as Webalizer, if you're willing to spend the time and effort to build a pipeline and massage the data so that it fits.
I've written about how to do that here:
https://www.expatsoftware.com/articles/2007/11/roll-your-own-web-stats-for-amazon-s3.html
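If you do roll your own, a rough sketch of counting downloads from those log fragments might look like this; it assumes the standard space-delimited S3 server access log format with the timestamp in brackets, so treat the field positions as an assumption to verify against your own logs:

```python
import re
from collections import Counter
from pathlib import Path

# bucket-owner bucket [time] remote-ip requester request-id operation key "request-uri" ...
LOG_LINE = re.compile(r'^\S+ \S+ \[[^\]]+\] \S+ \S+ \S+ (?P<operation>\S+) (?P<key>\S+) ')

downloads = Counter()
for log_file in Path("s3-logs").glob("*"):   # log fragments downloaded from the logging bucket
    for line in log_file.read_text().splitlines():
        m = LOG_LINE.match(line)
        if m and m.group("operation") == "REST.GET.OBJECT":
            downloads[m.group("key")] += 1

for key, count in downloads.most_common(20):
    print(f"{count:6d}  {key}")
```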
If you just want the reports today, there are a handful of 3rd party services built around doing this for you, so if you have ~$10/month to spend that's probably the best solution.
S3stat (https://www.s3stat.com/) is my suggestion. But then it should be since it's also my product.

Collect and Display Hadoop MapReduce results in ASP.NET MVC?

Beginner questions. I read this article about Hadoop/MapReduce
http://www.amazedsaint.com/2012/06/analyzing-some-big-data-using-c-azure.html
I get the idea of Hadoop and what map and reduce are.
The thing for me is: if my application sits on top of a Hadoop cluster,
1) Is there no need for a database anymore?
2) How do I get my data into Hadoop in the first place from my ASP.NET MVC application? Say it's Stack Overflow (which is coded in MVC). After I post this question, how do the question's title, body and tags get into Hadoop?
3) In the above article, it collects data about "namespaces" used on Stack Overflow and how many times they were used.
If Stack Overflow wants to display the result data from the MapReduce job in real time, how do you do that?
Sorry for the rookie questions. I'm just trying to get a clear picture here, one piece at a time.
1) That would depend on the application. Most likely you will still need a database for user management, etc.
2) If you are using Amazon EMR, you'd place the inputs into S3 using the .NET API (or some other way) and get the results out the same way; see the sketch after this list. You can also monitor your EMR account via the API, which is fairly straightforward.
3) Hadoop is not really a real-time environment; it's more of a batch system. You could simulate real time by continuously processing incoming data, but it's still not true real time.
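To make item 2) concrete, here is a minimal sketch in Python with boto3 rather than the .NET SDK (the flow is the same): push the input to S3, queue a step on an existing EMR cluster, then pull the output back for the MVC app to display. The bucket, cluster id and jar arguments are placeholders.

```python
import boto3

s3 = boto3.client("s3")
emr = boto3.client("emr", region_name="us-east-1")

BUCKET = "my-emr-bucket"            # placeholder bucket
CLUSTER_ID = "j-XXXXXXXXXXXXX"      # placeholder: an existing EMR cluster

# 1. Push the input data (e.g. newly posted questions) into S3.
s3.upload_file("questions.tsv", BUCKET, "input/questions.tsv")

# 2. Queue a Hadoop streaming step that counts tag usage (the args are illustrative).
response = emr.add_job_flow_steps(
    JobFlowId=CLUSTER_ID,
    Steps=[{
        "Name": "count-tags",
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": [
                "hadoop-streaming",
                "-input", f"s3://{BUCKET}/input/",
                "-output", f"s3://{BUCKET}/output/",
                "-mapper", "mapper.py",
                "-reducer", "reducer.py",
            ],
        },
    }],
)
step_id = response["StepIds"][0]

# 3. Later, poll the step and pull the results back for the web app to display.
state = emr.describe_step(ClusterId=CLUSTER_ID, StepId=step_id)["Step"]["Status"]["State"]
if state == "COMPLETED":
    s3.download_file(BUCKET, "output/part-00000", "tag_counts.txt")
```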
I'd recommend taking a look at the Amazon EMR .NET docs and picking up a good book on Hadoop (such as Hadoop in Practice) to understand the stack and concepts, and one on Hive (such as Programming Hive).
Also, you can of course mix the environments for what they are best at; for example, use Azure Websites and SQL Azure for your .NET app and Amazon EMR for Hadoop/Hive. There is no need to park everything in one place, considering the cost models.
Hope this helps.