How does Gmail query 900 million records? With an RDBMS or NoSQL? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
According to this TechCrunch article,
Gmail has 900 million users. When I log in to Gmail with my username and password, the lookup returns almost instantly. Do they use an RDBMS (relational database) or NoSQL? Is this even possible with an RDBMS?

I'm sure this isn't exactly how it's done, but one billion records at, say, 50 bytes per username is only 50 gigabytes. They could keep it all in RAM in a sorted tree and simply search that tree.
A binary tree of that size is only about thirty levels deep (log2 of one billion is roughly 30), which would take microseconds to traverse, and I suspect they'd use something that branches more than a binary tree, so it would be even flatter.
All in all, there are probably far more amazing things Google does; this part is relatively trivial.
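
To make that concrete, here is a minimal sketch of the sorted-structure idea in Python, using binary search over an in-memory sorted list (the toy data and function name are my own illustration, not anything Google actually uses):

import bisect

# Toy stand-in for the "keep it all in RAM, sorted" idea: a lookup in a
# sorted list of one billion usernames needs only ~log2(1e9) = 30 comparisons.
usernames = sorted(["alice", "bob", "carol", "dave"])

def username_exists(name):
    i = bisect.bisect_left(usernames, name)
    return i < len(usernames) and usernames[i] == name

print(username_exists("carol"))    # True
print(username_exists("mallory"))  # False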

Related

Creating a Price tracker system [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I was recently asked the following question in an interview.
"How would you design a system to keep track of a million items at xyz.com ?
The xyz.com could update the prices maybe 2-3 times a day or once per month, so no guarentee on frequency.
Your system should show accurate prices for >95% of items at any given point of time and aim for 99%.
Also scale for 1billion items etc..
"
I answered along the lines of building a distributed system that categorizes items by priority (based on historical price fluctuations, the 80/20 rule, etc.) and makes API calls more frequently for the high-priority items (see the sketch below).
But I was not allowed to use API calls.
I then suggested scraping the HTML content (though the website could block my IP under such a heavy load).
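
For illustration, here is a minimal sketch of that priority-scheduling idea in Python, using a heap keyed by next-check time; fetch_price is a hypothetical placeholder for whatever API call or scraper is actually allowed:

import heapq
import time

def fetch_price(item_id):
    # Hypothetical placeholder for the permitted API call or scraper.
    raise NotImplementedError

# Each entry is (next_check_time, item_id, check_interval_seconds);
# volatile items get short intervals, stable ones long intervals.
queue = [(0.0, "item-1", 3600), (0.0, "item-2", 86400)]
heapq.heapify(queue)

def check_next_item():
    due, item_id, interval = heapq.heappop(queue)
    if due > time.time():
        heapq.heappush(queue, (due, item_id, interval))  # nothing due yet
        return
    price = fetch_price(item_id)
    # ...store the price, and shrink the interval if it changed recently...
    heapq.heappush(queue, (time.time() + interval, item_id, interval))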
I basically want to know which resources would help me answer these kinds of questions. I would prefer full-length courses (distributed systems?) or books rather than quick-fix blogs.

Quality and Speed of SQL Query [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
I want to know how much the conversion in the first query affects query speed and server load.
Suppose we receive 50K requests in one minute.
First query:
SELECT CONVERT(NUMERIC(13,1), ROUND(width, 1, 1)) FROM Home
For the second query we do the rounding on the client side instead, i.e. a combination of client-side and server-side work:
SELECT width FROM Home
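
For illustration, the client-side half of the second approach might look like this sketch in Python (sqlite3 stands in for the real SQL Server so the example is self-contained; note that ROUND(width, 1, 1) in SQL Server truncates rather than rounds, so the client-side equivalent is truncation):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Home (width REAL)")
conn.executemany("INSERT INTO Home VALUES (?)", [(12.378,), (5.941,)])

def truncate_1dp(x):
    # Client-side equivalent of ROUND(width, 1, 1): truncate toward zero
    # to one decimal place.
    return int(x * 10) / 10

for (width,) in conn.execute("SELECT width FROM Home"):
    print(truncate_1dp(width))  # 12.3, 5.9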
You could use a tool such as SQL Query Stress to put a load on your server. This will allow you to simulate as many users as you want, executing the query as many times as you like:
http://www.sqlstress.com/
You could then use a tool such as Brent Ozar's fantastic sp_AskBrent to get some important metrics out of the system:
https://www.brentozar.com/askbrent/
Ultimately it's down to you to see how your server performs in each case and to decide which route to go down.
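
If you would rather script a rough load test yourself, a minimal sketch in Python might look like the following (sqlite3 and the file name test.db are stand-ins so the example runs anywhere; SQL Query Stress does this far more thoroughly against SQL Server):

import concurrent.futures
import sqlite3
import time

def run_query():
    # One simulated request: open a connection, run the query, time it.
    conn = sqlite3.connect("test.db")
    start = time.perf_counter()
    conn.execute("SELECT width FROM Home").fetchall()
    conn.close()
    return time.perf_counter() - start

# Fire 1000 queries from 50 concurrent workers and report the mean latency.
with concurrent.futures.ThreadPoolExecutor(max_workers=50) as pool:
    timings = list(pool.map(lambda _: run_query(), range(1000)))
print("avg %.4fs over %d runs" % (sum(timings) / len(timings), len(timings)))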

Issues while implementing Google BigQuery [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
Our company is going to implement BigQuery.
We have seen several apparent drawbacks, for example:
1. Only 1000 requests per day are allowed.
2. No UPDATE or DELETE is allowed.
and so on...
Can you highlight some more drawbacks and comment on the two above?
Please also share any issues that came up during or after implementing BigQuery.
Thanks in advance.
"Only 1000 requests per day allowed"
Not true, fortunately! There is a limit of how many batch loads you can do to a table per day (1000, so one every 90 seconds), but this is about loading data, not querying it. And if you need to load data more frequently, you can use the streaming API for up to a 100,000 rows per second per table.
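
As a sketch, streaming rows with the Python client library looks roughly like this (the project, dataset, table, and row schema are hypothetical; assumes google-cloud-bigquery is installed and default credentials are configured):

from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.my_dataset.my_table"  # hypothetical table

# Stream rows in near real time instead of batch loading them.
rows = [{"user_id": 1, "event": "login"}]  # hypothetical schema
errors = client.insert_rows_json(table_id, rows)
if errors:
    print("Streaming insert failed:", errors)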
"No update delete allowed"
BigQuery is an analytical database which are not optimized for updates and deletes of individual rows. The analytical databases that support these operations usually do with caveats and performance costs. You can achieve the equivalent update and deletes with BigQuery by re-materializing your tables in just a couple minutes: https://stackoverflow.com/a/31663889/132438
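
As a sketch of that re-materialization workaround with the Python client (table and column names are hypothetical), you run a query that selects everything except the rows you want "deleted" and write the result back over the table:

from google.cloud import bigquery

client = bigquery.Client()
table_ref = bigquery.TableReference.from_string(
    "my-project.my_dataset.my_table")  # hypothetical table

# Rewrite the table without the unwanted rows: the "delete".
job_config = bigquery.QueryJobConfig(
    destination=table_ref,
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE)
query = """
    SELECT * FROM `my-project.my_dataset.my_table`
    WHERE user_id != 42
"""
client.query(query, job_config=job_config).result()  # wait for the rewrite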

Increment counter or query relations? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
Let's say I have a User model and a Favorite model. I want to know how many favorites a user has.
I see that you can accomplish this in two ways.
1. Atomically increment a counter attribute on the user model when a favorite is created. Access it using user_instance.favorite_count
2. Query the favorite count for the user: user_instance.favorite_set.count()
I would imagine that as the DB grows, counting becomes more expensive.
Which implementation is more scalable?
I smell some premature optimization here. Databases are extremely good at counting things. Unless you have measured and are seeing some identifiable slowness, you should not attempt to denormalize: it is difficult to get right and always at risk of getting out of sync. Go with the query; and don't forget you can use aggregation to query the counts for a queryset of users at one time.
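
As a sketch of that aggregation point in Django (assuming Favorite has a ForeignKey to User with the default related name):

from django.contrib.auth.models import User
from django.db.models import Count

# One query that annotates every user with their favorite count,
# instead of issuing favorite_set.count() once per user.
users = User.objects.annotate(favorite_count=Count("favorite"))
for user in users:
    print(user.username, user.favorite_count)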

SQL database convention [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
Apologies in advance if this is a stupid question. I've more or less just started learning how to use SQL.
I'm making a website that stores main accounts, each of which has many sub-accounts associated with it. Each sub-account has a few thousand records in various tables associated with it.
My question is about the conventional usage of databases. Is it better to use a separate database per main account, with everything associated with that account stored in one place; to store everything in a single database; or some amalgamation of both?
Some insight would be much appreciated.
Will you need to access more than one of these databases at the same time? If so, put everything in one database. You will not like the amount of effort and cost involved in 'joining' them back together to run a query. On top of that, every database you have needs to be managed, and should you need to transfer data between databases, that can get painful as well.
Segregating data by database is a last resort.
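
As a sketch of the single-database approach with Django-style models (names are hypothetical), sub-accounts and their records simply reference their parent via foreign keys, so cross-account queries are ordinary joins:

from django.db import models

class MainAccount(models.Model):
    name = models.CharField(max_length=100)

class SubAccount(models.Model):
    # All sub-accounts live in the same database, scoped by this key.
    main_account = models.ForeignKey(MainAccount, on_delete=models.CASCADE)
    name = models.CharField(max_length=100)

class Record(models.Model):
    sub_account = models.ForeignKey(SubAccount, on_delete=models.CASCADE)
    data = models.TextField()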