Why can't we just get the GPA and then directly compute the real physical address, as shown on page 8 of https://www.exploit-db.com/docs/45546? That would save a lot of memory accesses.
Why do we need the complex calculation with nested page tables, as shown on the same link on page 9?
I am not sure, but my guess is that it allows a larger address space. If the virtual machine "has" 4 GB of virtual space and 4 GB of physical space, then with the first approach we could only reach 4 GB on the real machine. But I think we can overcome that.
I got it!
Every guest page table is itself located at a GPA, so its address has to be translated with an EPT walk (starting from the EPTP) to get the real physical address of the table. We need to do this for each table, and that's why the page walk is so long.
For example, the guest PML4 is at a GPA, so we need to translate that address first, and so on for each level below it.
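To put a number on the long walk described above, here is a minimal sketch that only counts memory reads for one translation. It assumes four guest paging levels and four EPT levels, and it ignores TLBs and paging-structure caches, which in practice hide most of this cost. The function name and parameters are made up for illustration.

```python
def nested_walk_reads(guest_levels=4, ept_levels=4):
    """Count memory reads for one guest-virtual translation with no caching."""
    reads = 0
    # CR3 and every guest table entry hold a GPA, so each guest table
    # must first be located in real memory with a full EPT walk.
    for _ in range(guest_levels):
        reads += ept_levels  # EPT walk: GPA of this guest table -> real address
        reads += 1           # read the guest table entry itself
    # The last guest entry yields the GPA of the data page,
    # which needs one more EPT walk.
    reads += ept_levels
    return reads


print(nested_walk_reads())      # 4 * (4 + 1) + 4 = 24
print(nested_walk_reads(4, 0))  # no nested paging: 4
```

So under these assumptions a single uncached translation costs 24 reads instead of 4, which is the price of keeping the guest tables in guest-physical memory.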
I am trying to run a simple select query whose column list includes a column called instructions, of type varchar(8000). The table has
90,000 records, and it took SQL Server Management Studio 10 seconds to return and display the full table data:
SELECT id, name, instructions, etc.... FROM TABLE;
However, when I remove instructions from the select list, it takes only 1 second to execute and display the result. Can anyone please help me understand the theory behind this?
Thanks
Keth
There are some obvious things here that impact the time, and a few more subtle ones around it. The topic of the underlying storage of SQL Server and how it stores / retrieves this data is a book in itself, of which there are many. (I'd personally recommend Kalen Delaney but everyone will have their own preference and I appreciate we should keep away from subjectivity on SO).
If you are connected from a machine other than the server, 90k rows of instructions potentially have to be marshalled across your network connection.
The SSMS console itself has to display these, which takes time.
Depending on the size of what you are reading versus your buffer cache, and on the other queries being executed, you could be putting pressure on the cache and generating more physical IO load for the server as a whole.
As mentioned in comments, more data is being read, but does this mean more is being read from the disk? This one is far more subtle when looked at in detail.
In terms of the disk IO issue, it depends on when the instructions were placed in the row and on the column's settings for inlining data. It might be that the instructions are stored inline with the row, which means no additional disk IO occurs when reading them versus not reading them; it's more a case of whether SQL Server bothers to decode the value from the page in memory.
The varchar(8000) though might not be inline with the rest of the data, it could be on a row_overflow_data page, sometimes referred to as short large object (SLOB), in which case the instruction field itself stores a pointer where the data is stored, and when you read the instructions it causes SQL Server to have to read another entirely random page (and extent) elsewhere on the disk per row.
Depending how / when instructions are added, you could see a huge level of fragmentation / lack of contiguous extents being allocated for these instructions, although depending on the IO subsystem, this may be immaterial to the problem.
There are a lot of unknowns at this point which makes it harder to give anything definitive - you are in the 'it depends' area of the DB, which would need a lot more specifics and investigation to be able to point at a specific cause, vs the more general (and not entirely complete) list above.
As Tim Biegeleisen mentioned, do not read the instructions unless you need to.
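For a rough sense of scale on the network and display cost mentioned above, here is a worst-case estimate. It assumes every one of the 90,000 rows fills its varchar(8000) completely, which is almost certainly not true of the real table; the actual average length is unknown, so treat this as an upper bound only. The function name is made up for illustration.

```python
ROWS = 90_000
MAX_BYTES_PER_VALUE = 8_000  # varchar(8000) holds at most 8,000 bytes


def worst_case_bytes(rows, max_bytes):
    """Upper bound on the data carried by one column across all rows."""
    return rows * max_bytes


total = worst_case_bytes(ROWS, MAX_BYTES_PER_VALUE)
print(total)                 # 720000000
print(round(total / 2**20))  # 687 (MiB)
```

Even if the real values are a small fraction of that, the instructions column can easily dominate what has to be sent to and rendered by the client.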
I have recently discovered MonetDB and I am evaluating it for an internal project, so my questions probably come from a real newbie's point of view. Maybe someone could point me to a site or document where I could find more info (I haven't found much by googling).
Regarding scalability, please correct me if I am wrong, but my understanding is that if I need to scale, I would launch more server instances and discover them from the control node. Is that right?
Is there any limit on the number of servers?
The other point is about storage: is it possible to use Amazon S3 to back read-only MonetDB instances?
Update: we would need to store a massive amount of Call Detail Records from different sources, on a read-only basis. We would aggregate/reduce that data for the day-to-day operation, accessing the bigger tables only when the full detail is required.
We would store the historical data as well, to perform longer-term analysis. My concern is mostly about memory; disk storage wouldn't be the issue, I think. If the hot dataset involved in a report/analysis eats up the whole memory space (fast response times are needed, and I'm not sure how memory swapping would affect them), I would like to know whether I can scale somehow instead of reengineering the report/analysis process (maybe I am biased by the horizontal scaling thing :-) ).
Thanks!
You will easily find the advantages of MonetDB on the net, so let me highlight some disadvantages:
1. In MonetDB, deleting rows does not free up the space.
Solution: copy the data into another table, drop the existing table, and rename the other table.
2. Joins are a little slower.
3. We cannot pass a table name as a dynamic variable.
E.g. if you have table names stored in one main table, you can't write a query like "for each (select tablename from mytable) select data from tablename" in SQL.
You can't create functions that take a table name as a variable argument.
But it is still damn fast and can store a large amount of data.
I've just made a simple RAM in Minecraft (with redstone), with 4 bits for the address and 4 bits stored in each cell. Our next goal is to store different kinds of variables in it and to process them differently.
We are not engineers, so we don't really know, but we have made some quite complex things and we think we can do this. The problem is that we can't figure out how to store variables with more bits than can be stored in a single cell. I'll give an example.
Think of a 16-bit variable. We thought there was no sense in creating big cells, so we decided to store that data 4 bits per cell. But that's not enough: we had to relate those 4 cells. So we thought we had to create 8-bit cells, with 4 bits of content and 4 bits to store the address where the next 4 bits of the variable are stored. However, 4 bits of address is nothing for RAM; we can't store anything there. So we would need at least 8 bits for the address. 4 bits of content also seems quite low, and we would also need at least another 4 bits to store the type of the variable.
Well, finally we thought that technique was absurd and that it couldn't be done like that in real life. And we don't know how to do it now. I've searched the web for how RAM works, and the little I've found was too complex for our needs.
Could someone please explain to us how this is done in real life?
Heh, you're playing the blame game, trying to pin all the responsibility for memory management on the physical RAM implementation.
In fact, RAM is just that: a storage device (your redstone tiles). Actually storing data in it is your program's responsibility. In other words, there doesn't need to be a standardized memory cell "linking" strategy for RAM, because it's your program that writes to it and then reads it back, so it knows its own conventions.
With that in mind, storing values is easy. Say you want a 16-bit integer stored in your 4-bit/word RAM (so 4 words of data). Simply refer to addresses 0 through 3 as your variable and that's it. No "linking" necessary, because you know both how to read from it and how to write to it, and you won't step on your own toes (in theory).
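As a minimal sketch of that idea, here is a 16-bit value split across four consecutive 4-bit cells. The RAM is modelled as a plain list, the function names are made up for illustration, and storing the lowest 4 bits first is just one possible convention; the program only has to use the same order for reading and writing.

```python
ram = [0] * 16  # 16 cells of 4 bits each, as with a 4-bit address


def store16(ram, addr, value):
    """Write a 16-bit value into 4 consecutive cells, lowest 4 bits first."""
    for i in range(4):
        ram[addr + i] = (value >> (4 * i)) & 0xF


def load16(ram, addr):
    """Rebuild the 16-bit value from the same 4 cells."""
    value = 0
    for i in range(4):
        value |= ram[addr + i] << (4 * i)
    return value


store16(ram, 0, 0xBEEF)
print(ram[:4])              # [15, 14, 14, 11]
print(hex(load16(ram, 0)))  # 0xbeef
```

No cell stores a pointer or a type tag: the program simply knows that the variable lives at address 0 and is four cells long.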
Additional thoughts for growing your construct: special locations for specialized registers (a stack pointer so you can use a stack for recursive computing, a program pointer for a Turing machine, etc.). I had one more but I forgot it while writing that one; if I remember it, I'll edit.
I'm setting up a Virtuoso server on my local machine. The database is not big (about 2 GB).
The application I'm using the server for needs to make a very large number of queries, and the results need to come back fast.
The HDD I'm using is mechanical, so it's not that fast. I am now trying to find a way to allocate part of my main memory as local storage so that I can put the database file on it.
Is there an easy way to do that?
That's not what RAM is for.
If your server ever lost power, you would lose all of the data.
If you want a faster HDD, get one with a higher RPM, or get an SSD.
Take a look at the performance Tuning Guide...
It details how to configure exactly what you are looking for.
Data is still held on disk, but the more of it that can be loaded into memory, the better the performance.
Get all your data into memory and that's probably as fast as it gets :-)
There's software called RamDisk Plus.
You can see a demo here:
http://www.youtube.com/watch?v=vAdRsQJBEBE
This software allows you to create a disk partition right out of your RAM.
My Rails application keeps reaching the disk I/O rate threshold set for my VPS at Linode. The threshold is set at 3000 (I raised it from 2000), and every hour or so I get a notification that the rate has reached 4000-5000+.
What are the methods that I can use to minimize the disk IO rate? I mostly use Sphinx (Thinking Sphinx plugin) and Latitude and Longitude distance search.
What are the methods to avoid?
I'm using Rails 2.3.11 and MySQL.
Thanks.
Did you check whether your server is swapping itself to death? What does "top" say?
Your Linode may have limited RAM, and it is quite likely swapping like crazy to keep things running.
If you see red in the IO graph, that is swapping activity! You need to upgrade your Linode to more RAM,
or limit the number / size of the processes that are running. You should also add approximately 2x the RAM size as swap space (swap partition).
http://tinypic.com/view.php?pic=2s0b8t2&s=7
Your question is too vague to answer precisely, but this is generally a sign of one of a few things:
Your data set is too large because of historical data that you could prune. Delete what is no longer relevant.
Your tables are not indexed properly and you are hitting a lot of table scans. Check with EXPLAIN on each of your slow queries.
Your data structure is not optimized for the way you are using it, and you are doing too many joins. Some tactical de-normalization would help here. Make sure all your JOIN queries are strictly necessary.
You are retrieving more data than is required to service the request. It is, sadly, all too common that people load enormous TEXT or BLOB columns from a user table when displaying only a list of user names. Load only what you need.
You're being hit by some kind of automated scraper or spider robot that's systematically downloading your entire site, page by page. You may want to alter your robots.txt if this is an issue, or start blocking troublesome IPs.
Is it going high and staying high for a long time, or is it just spiking temporarily?
There aren't going to be specific methods to avoid (other than not writing to disk).
You could try using a profiler in production, like NewRelic, to get more insight into your performance. A profiler will highlight the actions that are taking a long time, and when you examine the specific algorithm used in such an action, you might discover what's inefficient about it.