Get the name of every command issued to Redis

I need to add a local proxy like twemproxy or dynomite in front of a remote Redis server.
I want to compare the commands we use against the supported commands.
We use Redis indirectly, so I cannot scan our code to determine which commands need support with a high degree of certainty that I haven't missed something. Instead, I would like to run the test suite against a Redis instance and afterwards determine every command that was run.
For example, at https://github.com/twitter/twemproxy/blob/master/notes/redis.md, a list like
+---------+------------+--------------------+
| Command | Supported? | Format             |
+---------+------------+--------------------+
| DEL     | Yes        | DEL key [key …]    |
| DUMP    | Yes        | DUMP key           |
| EXISTS  | Yes        | EXISTS key         |
| EXPIRE  | Yes        | EXPIRE key seconds |
+---------+------------+--------------------+
...
is provided to show which commands are supported.
How can I generate a list of Redis commands issued from a test suite?

You can just run:
redis-cli monitor
in your Terminal to see all commands run, along with their timestamps.
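To turn that stream into a list of the distinct commands your test suite uses, you can capture the MONITOR output to a file while the suite runs and then extract the command name from each line. A minimal sketch, assuming a monitor.log capture file and Redis on the default host/port:

# Capture everything the server sees while the tests run
redis-cli monitor > monitor.log &
MONITOR_PID=$!
# ... run the test suite here ...
kill $MONITOR_PID

# MONITOR lines look like: 1571234567.123456 [0 127.0.0.1:50000] "GET" "some-key"
# The 4th whitespace-separated field is the quoted command name.
awk 'NF >= 4 { gsub(/"/, "", $4); print toupper($4) }' monitor.log | sort -u

The resulting list can be compared directly against a table like the twemproxy one above.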

Related

Drill - Parquet IO performance issues with Azure Blob or Azure Files

The problem:
Parquet read performance in Drill appears to be 5x - 10x worse when reading from Azure storage, which renders it unusable for bigger data workloads.
It appears to be a problem only when reading Parquet. Reading CSV, on the other hand, runs normally.
Let's have:
Azure Blob storage account with ~1GB source.csv and Parquet files with the same data.
Azure Premium File Storage with the same files
Local disk folder containing the same files
Drill running on Azure VM in single mode
Drill configuration:
Azure blob storage plugin working as namespace blob
Azure files mounted with SMB to /data/dfs used as namespace dfs
Local disk folder used as namespace local
The VM
Standard E4s v3 (4 vcpus, 32 GiB memory)
256GB SSD
NIC 2Gbps
6400 IOPS / 96MBps
Azure Premium Files Share
1000GB
1000 IOPS base / 3000 IOPS Burst
120MB/s throughput
Storage benchmarks
Measured with dd, 1GB data, various block sizes, conv=fdatasync
FS cache dropped before each read test (sudo sh -c "echo 3 > /proc/sys/vm/drop_caches")
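For reference, the invocations were of this shape (a rough sketch; the file path is a placeholder and the block sizes in the tables below are assumed to be in KB):

# Write test: ~1GB, flushed to stable storage before dd reports a rate
dd if=/dev/zero of=/data/dfs/ddtest bs=1024k count=1024 conv=fdatasync
# Drop the filesystem cache so the read test hits the storage, not RAM
sudo sh -c "echo 3 > /proc/sys/vm/drop_caches"
# Read test over the same file
dd if=/data/dfs/ddtest of=/dev/null bs=1024k count=1024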
Local disk
+-------+------------+---------+
| Mode  | Block size | Speed   |
+-------+------------+---------+
| Write | 1024       | 37MB/s  |
| Write | 64         | 16MB/s  |
| Read  | 1024       | 70MB/s  |
| Read  | 64         | 44MB/s  |
+-------+------------+---------+
Azure Premium File Storage SMB mount
+-------+------------+---------+
| Mode  | Block size | Speed   |
+-------+------------+---------+
| Write | 1024       | 100MB/s |
| Write | 64         | 23MB/s  |
| Read  | 1024       | 88MB/s  |
| Read  | 64         | 40MB/s  |
+-------+------------+---------+
Azure Blob
The maximum known throughput of Azure Blob storage is 60MB/s. Upload/download speeds are clamped to the target storage's read/write speeds.
Drill benchmarks
The filesystem cache was purged before every read test.
IO performance was observed with iotop.
The queries were deliberately kept simple for demonstration; execution time grows linearly for more complex queries.
 Sample queries:
-- Query A: Reading parquet
select sum(`Price`) as test from namespace.`Parquet/**/*.parquet`;
-- Query B: Reading CSV
select sum(CAST(`Price` as DOUBLE)) as test from namespace.`sales.csv`;
Results
+-------------+--------------------+----------+-----------------+
| Query       | Source (namespace) | Duration | Disk read usage |
+-------------+--------------------+----------+-----------------+
| A (Parquet) | dfs (smb)          | 14.8s    | 2.8 - 3.5 MB/s  |
| A (Parquet) | blob               | 24.5s    | N/A             |
| A (Parquet) | local              | 1.7s     | 40 - 80 MB/s    |
+-------------+--------------------+----------+-----------------+
| B (CSV)     | dfs (smb)          | 22s      | 30 - 60 MB/s    |
| B (CSV)     | blob               | 29s      | N/A             |
| B (CSV)     | local              | 18s      | 68 MB/s         |
+-------------+--------------------+----------+-----------------+
Observations
When reading Parquet, more threads are spawned, but only the cifsd process shows any IO activity.
I tried tuning the Parquet reader performance as described here, but without any significant results.
There is a big peak of egress data when querying Parquet from Azure storage that exceeds the Parquet data size several times: the Parquet files total ~300MB, but the egress peak for a single read query is about 2.5GB.
Conclusion
Reading Parquet from Azure Files is, for some reason, slowed down to ridiculously low speeds.
Reading Parquet from Azure Blob is even a bit slower.
Reading Parquet from the local filesystem is nicely fast, but that is not suitable for real use.
Reading CSV from any source utilizes the storage throughput normally, so I assume there is some problem with or misconfiguration of the Parquet reader.
The questions
What are the reasons that Parquet read performance from Azure Storage is so drastically reduced?
Is there a way to optimize it?
I assume that you have already cross-checked the IO performance issue using Azure Monitor. If the issue still persists, I would like to work closely with you on it. This may require a deeper investigation, so if you have a support plan, I request that you file a support ticket; otherwise, please let us know and we will try to help you get one-time free technical support. In that case, could you send an email to AzCommunity[at]Microsoft[dot]com referencing this thread? Please mention "ATTN subm" in the subject field. Thank you for your cooperation on this matter; I look forward to your reply.

Load data from csv in google cloud storage as bigquery 'in' query

I want to compose a query like the following in BigQuery, with my file stored in Google Cloud Storage:
select * from my_table where id in ('gs://bucket_name/file_name.csv')
I get no results. Is this possible, or am I missing something?
Using the CLI or API, you can run ad-hoc queries against GCS files without creating tables; a full example is covered here: Accessing external (federated) data sources with BigQuery’s data access layer.
The code snippet is:
bq query --external_table_definition=healthwatch::date:DATETIME,bpm:INTEGER,sleep:STRING,type:STRING@CSV=gs://healthwatch2/healthwatchdetail*.csv 'SELECT date,bpm,type FROM healthwatch WHERE type = "elevated" and bpm > 150;'
Waiting on bqjob_r5770d3fba8d81732_00000162ad25a6b8_1 ... (0s)
Current status: DONE
+---------------------+-----+----------+
| date                | bpm | type     |
+---------------------+-----+----------+
| 2018-02-07T11:14:44 | 186 | elevated |
| 2018-02-07T11:14:49 | 184 | elevated |
+---------------------+-----+----------+
On the other hand, you can create a permanent EXTERNAL table with schema autodetection, which makes it available in the web UI and persists it; read more about that here: Querying Cloud Storage Data.
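A minimal sketch of that route with the bq CLI (the dataset, table names, and bucket path are placeholders, and it assumes the CSV has an id column matching my_table):

# Build a table definition for the CSV in GCS, letting BigQuery autodetect the schema
bq mkdef --autodetect --source_format=CSV "gs://bucket_name/file_name.csv" > file_def.json

# Create a permanent external table backed by that definition
bq mk --external_table_definition=file_def.json mydataset.ids_from_gcs

# The external table can then be used for the IN-style filter from the question
bq query --use_legacy_sql=false \
  'SELECT * FROM mydataset.my_table WHERE id IN (SELECT id FROM mydataset.ids_from_gcs)'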

How to set DTU for Azure Sql Database via SQL when copying?

I know that you can create a new Azure SQL DB by copying an existing one by running the following SQL command in the [master] db of the destination server:
CREATE DATABASE [New_DB_Name] AS COPY OF [Azure_Server_Name].[Existing_DB_Name]
What I want to find out is whether it's possible to change the number of DTUs the copy will have at the time the copy is created.
As a real-life example, if we're copying a [prod] database to create a new [qa] database, the copy might only need resources to handle a small testing team hitting the QA DB, not a full production audience. Scaling down the assigned DTUs would result in a cheaper DB. At the moment we manually scale after the copy is complete, but this takes just as long as the initial copy (several hours for our larger DBs) as it copies the database yet again. In an ideal world we would like to skip that step and be able to fully automate the copy process.
According to the docs, it is possible:
CREATE DATABASE database_name
AS COPY OF [source_server_name.] source_database_name
[ ( SERVICE_OBJECTIVE =
{ 'basic' | 'S0' | 'S1' | 'S2' | 'S3' | 'S4'| 'S6'| 'S7'| 'S9'| 'S12' |
| 'GP_GEN4_1' | 'GP_GEN4_2' | 'GP_GEN4_4' | 'GP_GEN4_8' | 'GP_GEN4_16' | 'GP_GEN4_24' |
| 'BC_GEN4_1' | 'BC_GEN4_2' | 'BC_GEN4_4' | 'BC_GEN4_8' | 'BC_GEN4_16' | 'BC_GEN4_24' |
| 'GP_GEN5_2' | 'GP_GEN5_4' | 'GP_GEN5_8' | 'GP_GEN5_16' | 'GP_GEN5_24' | 'GP_GEN5_32' | 'GP_GEN5_48' | 'GP_GEN5_80' |
| 'BC_GEN5_2' | 'BC_GEN5_4' | 'BC_GEN5_8' | 'BC_GEN5_16' | 'BC_GEN5_24' | 'BC_GEN5_32' | 'BC_GEN5_48' | 'BC_GEN5_80' |
| { ELASTIC_POOL(name = <elastic_pool_name>) } } )
]
[;]
CREATE DATABASE (sqldbls)
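Putting that syntax to work, a sketch of issuing the copy at a smaller service objective from the destination server's master database (server, database, and credential values are placeholders):

# Run against the [master] DB of the destination server, e.g. with sqlcmd
sqlcmd -S destination-server.database.windows.net -d master -U sqladmin -P '<password>' -Q "
  CREATE DATABASE [QA_DB]
  AS COPY OF [prod-server].[Prod_DB]
  (SERVICE_OBJECTIVE = 'S1');
"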
You can also change the DTU level during a copy via the PowerShell cmdlet
New-AzureRmSqlDatabaseCopy
but there you can only choose "a different performance level within the same service tier (edition)"; see Copy an Azure SQL Database.
You can, however, copy the database into an elastic pool in the same service tier, so you wouldn't be allocating new DTU resources. You might have a single pool for all your dev/test/qa databases and drop the copy there.
If you want to change the service tier, you could use a Point-in-time Restore instead of a Database Copy. The database can be restored to any service tier or performance level, using the Portal, PowerShell, or REST.
Recover an Azure SQL database using automated database backups

Return old master after redis sentinel failover

I have 3 box setup of redis sentinel:
        CLIENT (connects to S1)
                  |
                  ↓
               +----+
               | M1 |  us-east-1
               | S1 |
               +----+
                  |
    +----+        |        +----+
    | R2 |--------+--------| R3 |
    | S2 |                 | S3 |
    +----+                 +----+
   us-east-2              us-west-2
M1 - Master
S1 - Sentinel 1
S2 - Sentinel 2
S3 - Sentinel 3
R2 - First slave (R=replica)
R3 - Second slave
After my master died, sentinel made a failover to R2.
I brought M1 back online (cleared some disk space) and now M1 is alive and well, but it is a slave of R2. Is there an automatic (or semi-automatic) way to make M1 the master again, make R2 a slave of M1, and have my traffic use M1 as the master Redis instance again?
Essentially I want to revert to how it was prior to failover.
What currently happens is that Sentinel elects R2 as the master and reconfigures the setup to be:
        CLIENT (connects to S1)
                  |
                  ↓
               +----+
               |[R2]|  us-east-2
               | S2 |
               +----+
                  |
    +----+        |        +----+
    |[M1]|--------+--------| R3 |
    | S1 |                 | S3 |
    +----+                 +----+
   us-east-1              us-west-2
When I failover manually, it promotes R3 as master (which is kind of expected).
But then, when I failover again manually, it promotes R2, whereas I would expect it to promote M1.
All successive failovers rotate between R2 and R3 (while always keeping M1 as a slave of either).
My M1 slave priority is unspecified, so it has the default value of 100.
My R2 slave priority is 200 and R3's is 300. That leads me to think that it should rotate through all 3 boxes, but it rotates only between R2 and R3 after the initial failover.
This looks like a Sentinel bug to me.
I think kiddorails's answer is correct, but most probably you have a problem similar to the one I had, where for some reason the original master is not replicating correctly.
Once I fixed my replication issue, I could cycle through my masters by issuing SENTINEL FAILOVER mymaster. Initially it would just bounce between the two original slaves, but now that my original master is replicating correctly, it cycles through all 3.
So I would recommend checking the replication of your original master after a failover. If you are sure it is working, you could also stop the other slave and then use the SENTINEL FAILOVER mymaster command to force a failover to the original master. If that fails, you know there must be an issue with the replication.
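A sketch of those checks with redis-cli (host names are placeholders; 26379 is the default Sentinel port and mymaster the usual master group name):

# On the old master (M1): confirm it is attached as a replica and the link is up
redis-cli -h m1-host -p 6379 INFO replication | egrep 'role|master_link_status'

# Ask Sentinel which node it currently considers the master
redis-cli -h s1-host -p 26379 SENTINEL get-master-addr-by-name mymaster

# Force a failover of the monitored master group
redis-cli -h s1-host -p 26379 SENTINEL failover mymaster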
I am not sure why you want to do that in the first place. Redis failing over to R2 and using it as the master should work just as well as using the original M1 instance. If that's not the case, you are not actually using Sentinel correctly for high availability.
You can just trigger a manual failover with SENTINEL FAILOVER <master-group-name> (e.g. mymaster). It should switch to either M1 or R3.

How do I update a database that's in use?

I'm building a web application using ASP.NET MVC with SQL Server and my development process is going to be like
Make changes in SQL Server locally
Create LINQ-to-SQL classes as necessary
Before committing any change set that includes a database change, script out the database so that I can regenerate it if I ever need to
What I'm confused about is how I'm going to update the production database, which will have live data in it.
For example, let's say I have a table like
People
========================================
 Id | FirstName | LastName    | FatherId
----------------------------------------
 1  | 'Anakin'  | 'Skywalker' | NULL
 2  | 'Luke'    | 'Skywalker' | 1
 3  | 'Leah'    | 'Skywalker' | 1
in production and locally and let's say I add an extra column locally
ALTER TABLE People ADD LightsaberColor VARCHAR(16)
and update my LINQ-to-SQL classes, script it out, test it with sample data, and decide that I want to add that column to production.
As part of a deployment process, how would I do that? Does there exist some tool that could read my database generation file (call it GenerateDb.sql) and figure out that it needs to update the production People table to put default values in the new column, like
People
==========================================================
 Id | FirstName | LastName    | FatherId | LightsaberColor
----------------------------------------------------------
 1  | 'Anakin'  | 'Skywalker' | NULL     | NULL
 2  | 'Luke'    | 'Skywalker' | 1        | NULL
 3  | 'Leah'    | 'Skywalker' | 1        | NULL
???
You should have a staging DB that is identical to the production database.
When you make any changes to the database, you should apply them to the staging DB first, and you can of course compare the dev and staging DBs to generate a script with the differences.
Visual Studio has a Schema Compare that generates a script with the differences between two databases.
There are other tools as well that do the same.
So you can generate the script, apply it to the staging DB, and if everything goes fine, apply the script to the production DB.
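If you want to drive this from a script instead of the Visual Studio UI, one option (a sketch, assuming your schema is built into a .dacpac, e.g. from an SSDT database project) is SqlPackage, which can emit the upgrade script without applying it:

# Generate, but do not run, the T-SQL needed to bring the target DB up to the dacpac's schema
SqlPackage /Action:Script \
  /SourceFile:"MyDatabase.dacpac" \
  /TargetServerName:"prod-server" \
  /TargetDatabaseName:"MyDatabase" \
  /OutputPath:"upgrade.sql"
# Review upgrade.sql, then run it against production as part of the deployment.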
That's right, you must have a staging process. Whenever we commit features, we use TFS to move them from development to production; that is the staging step, and you can look up the history in TFS, whether for the database or the solution. This assumes you are using TFS with Visual Studio and MS SQL Server.
I guess you are committing your features directly to your server, which effectively makes production your test environment; you could test the changes on a test server first.
Another thing: if you use stored procedures, you can use temporary tables, if that is what you're asking about regarding the script.
I guess this is your first time committing to a live server.