terraform - azurerm: difference between Standard_LRS and StandardSSD_LRS

What is the difference between Terraform's Standard_LRS and StandardSSD_LRS?
The Terraform documentation only notes that you can choose one of the two, but doesn't specify what the difference is.
Is Standard_LRS an HDD disk? Or can it be an SSD disk as well?

Yes Irina, your guess is correct.
Microsoft now offers three types of storage for your Azure Virtual Machines: Standard HDD Storage, Standard SSD Storage, and Premium SSD Storage. Standard HDD Storage is based on the traditional hard disk model, while Standard SSD and Premium SSD Storage are both based on solid-state storage but offer different performance characteristics.
A Standard HDD disk will show 'Standard_LRS' in the StorageAccountType property, a Standard SSD will show 'StandardSSD_LRS', and a Premium SSD disk will show 'Premium_LRS' in that property.
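The tier-to-StorageAccountType mapping above can be sketched as a small lookup table, e.g. for validating a value before passing it to Terraform. The helper function and its name are hypothetical, purely for illustration:

```python
# Mapping of Azure managed disk tiers to the StorageAccountType value
# they report (as described above).
DISK_TIERS = {
    "Standard HDD": "Standard_LRS",
    "Standard SSD": "StandardSSD_LRS",
    "Premium SSD": "Premium_LRS",
}

def storage_account_type(tier: str) -> str:
    """Return the StorageAccountType string for a human-readable tier name."""
    try:
        return DISK_TIERS[tier]
    except KeyError:
        raise ValueError(f"unknown disk tier: {tier!r}")

print(storage_account_type("Standard HDD"))  # Standard_LRS
```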

Related

Which hardware to choose in Neo4j

I'm a beginner in Neo4j and I would like to store more than 500 million nodes and more than 20 billion relationships.
Which hardware is best to deal with all this data?
Thanks a lot.
Maxime
Neo4j does not restrict users to particular hardware. However, it recommends minimum specifications for RAM, CPU, and disk, which are as follows:
RAM:
Must have at least 2 GB
Good to have around 16 GB
CPU:
Must have an Intel Core i3 processor
Good to have an Intel Core i7 processor
Disk:
Must have SATA drives with 15k RPM
Good to have SSDs
Also have a look at these: Neo4j : Advices for hardware sizing and config and https://neo4j.com/developer/guide-sizing-and-hardware-calculator/
Just for general recommendations, the top two things to look for are plenty of memory and fast SSDs (especially for larger graphs).
Neo4j has a pagecache for caching node and relationship graph topology, and the more of this you can fit into the pagecache the better. We typically recommend between 8 and 31 GB of heap in addition to the pagecache, depending on the volume and kind of queries you expect to run.
SSDs help with Neo4j's index-free adjacency structure, as traversal involves pointer chasing across the disk. This matters mostly when you can't fit all of the graph in the pagecache, but it also speeds up lookups of node and relationship properties.
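As a concrete illustration of the heap/pagecache split described above, a neo4j.conf for a machine with, say, 64 GB of RAM might look like this. The sizes here are assumptions for illustration only; tune them for your graph size and workload:

```
# Fixed heap size within the recommended 8-31 GB range
dbms.memory.heap.initial_size=24g
dbms.memory.heap.max_size=24g

# Pagecache for node/relationship topology; the more of the
# graph that fits here, the less disk I/O at query time
dbms.memory.pagecache.size=28g
```

Leaving some headroom for the OS file cache and other processes on the box is generally a good idea rather than allocating all 64 GB to Neo4j.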

Azure VM disk attachment number is too low. Can this limit be increased?

Based on this blog post https://blogs.technet.microsoft.com/uspartner_ts2team/2015/08/26/azure-vm-drive-attachment-limits/ there is a limit on disk attachment following the model of number of CPUs × 2. Is there a technical reason why this limit is in place? If you use Kubernetes you may not be able to schedule a pod, because the scheduler is not aware of this limit.
This was proposed as a workaround: https://github.com/khenidak/dysk. But I'm wondering why this very low limit exists in the first place.
The number of data disks is directly tied to the size of the VM. For example, if you go here https://learn.microsoft.com/en-us/azure/virtual-machines/windows/sizes you will see that each VM size can handle more data disks as its resources increase.
This constraint is mainly built around performance. If you had a virtual machine with only 2 CPU cores and, say, 10 data disks, you would likely run into performance issues, as the CPU power and RAM needed to reach out to all those data disks at once could cause your VM to tap out.
The simple solution is to use larger VM sizes if you need more disks. Alternatively, depending on how much space you need, Azure supports data disks of up to 4 TB.
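To make the per-size limit concrete, here is a rough sketch of the "check before attaching" logic that the Kubernetes scheduler lacks. The VM sizes and disk counts below are illustrative values in the spirit of the size tables linked above, not an authoritative list; always consult the official documentation for real limits:

```python
# Illustrative max data-disk counts per Azure VM size (examples only;
# see the official VM size tables for authoritative numbers).
MAX_DATA_DISKS = {
    "Standard_DS1_v2": 4,
    "Standard_DS2_v2": 8,
    "Standard_DS3_v2": 16,
}

def can_attach(vm_size: str, disks_in_use: int, disks_needed: int) -> bool:
    """Return True if the VM size can take `disks_needed` more data disks."""
    limit = MAX_DATA_DISKS.get(vm_size)
    if limit is None:
        raise ValueError(f"unknown VM size: {vm_size!r}")
    return disks_in_use + disks_needed <= limit

print(can_attach("Standard_DS2_v2", 7, 2))  # False: would exceed the limit
```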

Aerospike performance with flash disks

I have read that Aerospike provides better performance than other NoSQL key-value databases because it uses flash disks. But DRAM is faster than flash. So how can it perform better than Redis or Scalaris, which use only DRAM? Is it because of Aerospike's own system for accessing flash disks directly?
Thank you for your answer.
Aerospike gives you the flexibility to store all or part of your data (partitioned into "namespaces") in DRAM, in Flash (writing in blocks on the device without using a filesystem), in both DRAM and Flash simultaneously (not the best use of your spend), or in DRAM plus files on Flash or HDD (i.e. using a filesystem). DRAM with a filesystem on HDD gives the performance of DRAM with the inexpensive persistence of spinning disks. Finally, for single-value integer or float data, there is an extremely fast option of storing the data in the primary index itself (the data-in-index option).
You can mix storage options in Aerospike, i.e. store one namespace's records purely in DRAM and another namespace's records purely in Flash, on the same horizontally scalable cluster of nodes. You can define up to 32 namespaces in Aerospike Enterprise Edition.
The Flash / HDD options allow you to persist your data.
Pure DRAM storage in Aerospike will definitely give you better latency than storing on a persistent layer.
Storing on Flash in "blocks" (i.e. without using the filesystem) is Aerospike's sweet spot: it gives you "RAM-like" performance with persistence. Other NoSQL solutions that use pure DRAM storage don't give you persistence, and those that do persist, but only in files via the filesystem, will quite likely have much higher latency.
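The mixed-storage setup described above might look roughly like this in aerospike.conf, with one pure-DRAM namespace and one raw-device namespace on the same cluster. The namespace names, device path, and sizes are placeholders for illustration:

```
namespace cache {
    replication-factor 2
    memory-size 8G
    storage-engine memory            # pure DRAM: fastest, no persistence
}

namespace store {
    replication-factor 2
    memory-size 4G                   # primary index still lives in DRAM
    storage-engine device {          # raw block writes to Flash, no filesystem
        device /dev/nvme0n1          # placeholder device path
        write-block-size 128K
    }
}
```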

Is it possible to make CPU work on large persistent storage medium directly without using RAM?

Is it possible to make CPU work on large persistent storage medium directly without using RAM ? I am fine if performance is low here.
You will need to specify the CPU architecture you are interested in. Most standard architectures (x86, Power, ARM) assume the existence of RAM on their data bus; I am afraid only a custom board for these processors would allow using something like an SSD instead of RAM.
Some numbers comparing RAM vs SSD latencies: https://gist.github.com/jboner/2841832
Also, RAM is there for a reason: to "smooth" CPU access to bigger storage. Have a look at this image (from https://www.reddit.com/r/hardware/comments/2bdnny/dram_vs_pcie_ssd_how_long_before_ram_is_obsolete/).
As a side note, it is possible to access persistent storage without involving the CPU (although RAM is still needed); see https://www.techopedia.com/definition/2767/direct-memory-access-dma or https://en.wikipedia.org/wiki/Direct_memory_access

Redis - Can data size be greater than memory size?

I'm rather new to Redis, and before using it I'd like to learn some details that are important to me. So...
Redis uses RAM and HDD for storing data: RAM is used as fast read/write storage, and HDD is used to make this data persistent. When Redis is started, does it load all data from HDD into RAM, or only frequently queried data? What if I have 500 MB of Redis data on HDD but only 100 MB of RAM for Redis? Where can I read about this?
Redis loads everything into RAM. All the data is written to disk, but will only be read for things like restarting the server or making a backup.
There are a couple of ways you can use it with less RAM than data though. You can set it up in combination with MySQL or another disk based store to work much like memcached - you manage cache misses and persistence manually.
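If you go the memcached-style route described above, a cache-style redis.conf can bound Redis's memory use and evict old keys automatically, with the disk-based store remaining the source of truth. The values here are illustrative:

```
# Cap Redis memory at roughly the RAM you can spare for it
maxmemory 100mb

# Evict least-recently-used keys when the cap is reached
maxmemory-policy allkeys-lru
```

On a cache miss your application reads from the backing store and re-populates the key in Redis.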
Redis has a VM mode where all keys must fit in RAM but infrequently accessed data can be on disk. However, I'm not sure if this is in the stable builds yet.
Recent versions (>2.0) have improved significantly and memory management is more efficient. See this blog post that explains how to use hashes to optimize RAM memory footprint: http://antirez.com/post/redis-weekly-update-7.html
The feature is called Virtual Memory and it is officially deprecated:
Redis VM is now deprecated. Redis 2.4 will be the latest Redis version featuring Virtual Memory (but it also warns you that Virtual Memory usage is discouraged). We found that using VM has several disadvantages and problems. In the future of Redis we want to simply provide the best in-memory database (but persistent on disk as usual) ever, without considering at least for now the support for databases bigger than RAM. Our future efforts are focused into providing scripting, cluster, and better persistence.
More information about VM: https://redis.io/topics/virtual-memory