Do data masking options exist in databases? - sql

I have a database with sensitive data in some tables (5 tables). When I bring that database into my local environment, I need to mask the data with special characters.
For example:
Table 1:
Name   Code
Mohan  100
Raju   200
I need to see the data like this:
Name   Code
M##$n  1#0
R##u   2##
This applies only to the tables that hold sensitive data.
When I take a backup of the database with all this data and restore it into my local environment, I need the data to appear masked like this.
Can you please suggest the best approach, or any features in SQL Server, for masking the data?

I don't think you can achieve that in a single backup/restore step using only SQL Server 2012. You will either need to write masking scripts and run them in an ETL workflow, or consider a data masking product that offers on-the-fly masking (masking the data while it is copied from source to destination).
If you don't want to write your own masking scripts, you could use the free DataVeil Platform data masking tool with its Redact mask. Disclaimer: I work for DataVeil.
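For the "write your own masking script" route, a minimal sketch in Python might look like the following. It is only an illustration under stated assumptions: the built-in sqlite3 module stands in for the restored SQL Server copy, the table name table1 and the id key column are hypothetical, and the masking rule (keep the first and last character, replace the rest with #) is a simplification of the pattern shown in the question.

```python
import sqlite3


def mask_value(value, mask_char="#"):
    """Keep the first and last character and replace everything in between."""
    text = str(value)
    if len(text) <= 2:
        return mask_char * len(text)
    return text[0] + mask_char * (len(text) - 2) + text[-1]


def mask_columns(conn, table, key_column, columns):
    """Overwrite the given columns of every row with masked values.

    Table and column names are interpolated directly, so they must come
    from trusted configuration, never from user input.
    """
    col_list = ", ".join(columns)
    rows = conn.execute(f"SELECT {key_column}, {col_list} FROM {table}").fetchall()
    assignments = ", ".join(f"{c} = ?" for c in columns)
    for key, *values in rows:
        masked = [mask_value(v) for v in values]
        conn.execute(
            f"UPDATE {table} SET {assignments} WHERE {key_column} = ?",
            (*masked, key),
        )
    conn.commit()


def demo():
    # Hypothetical stand-in for the restored local copy of the database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT, code TEXT)")
    conn.executemany(
        "INSERT INTO table1 (name, code) VALUES (?, ?)",
        [("Mohan", "100"), ("Raju", "200")],
    )
    mask_columns(conn, "table1", "id", ["name", "code"])
    return conn.execute("SELECT name, code FROM table1 ORDER BY id").fetchall()


if __name__ == "__main__":
    print(demo())
```

A script like this would run once after the restore, before anyone queries the local copy. Note that the unmasked data still exists in the backup file itself.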

Related

ADF - How should I copy table data from source Azure SQL Database to 6 other Azure SQL Databases?

We curate data in the "Dev" Azure SQL Database and currently use RedGate's Data Compare tool to push it up to 6 higher Azure SQL Databases. I am trying to migrate that manual process to ADFv2 and would like to avoid copying and pasting the 10+ copy data activities for each database (x6), to keep it maintainable for future changes. The static tables have some customization in the copy data activity, but the basic idea follows this post to perform an upsert.
How can the implementation described above be done in Azure Data Factory?
I was imagining something like the following:
1. Using one parameterized linked service, with the server name and database name configurable, to generate a dynamic connection to Azure SQL Database.
2. Creating a pipeline for each table's copy data activity.
3. Creating a master pipeline to nest each table's pipeline in.
4. Using variables to loop over the different connections and passing those to the sub-pipelines' parameters.
I am not sure whether that is the most efficient plan, or whether it even works yet. Other ideas or suggestions?
We cannot tell you whether that is the most efficient plan, but I think it is. Just get it working.
As you said in the comment:
"We can use dynamic pipelines: copy multiple tables in bulk with 'Lookup' and 'ForEach'. We can perform dynamic copies of your data table lists in bulk within a single pipeline. Lookup returns either the list of data or the first row of data. ForEach: @activity('Azure SQL Table lists').output.value ; @concat(item().TABLE_SCHEMA,'.',item().TABLE_NAME,'.csv'). This is efficient and cost-optimized since we are using fewer activities and datasets."
We would usually choose the same solution as you: dynamic parameters and pipelines, with Lookup and ForEach activities, to implement the scenario. In short, give the pipeline clear logic and keep it simple and efficient.
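To make the "master pipeline that loops over connections" idea concrete, here is a rough Python sketch that builds the pipeline definition as a dictionary and prints it as JSON. It is not taken from the question: the pipeline name MasterCopy, the parameter names targets, serverName and databaseName, and the child pipeline names are all hypothetical, and the property layout reflects my understanding of the ADF pipeline JSON format, so it should be checked against the official documentation before use.

```python
import json


def build_master_pipeline(targets, table_pipelines):
    """Build a master pipeline: one ForEach over the target databases,
    running each per-table child pipeline with that target's connection."""
    execute_activities = [
        {
            "name": f"Run_{name}",
            "type": "ExecutePipeline",
            "typeProperties": {
                "pipeline": {"referenceName": name, "type": "PipelineReference"},
                "waitOnCompletion": True,
                # Hypothetical child-pipeline parameters that feed the
                # parameterized linked service.
                "parameters": {
                    "serverName": {"value": "@item().serverName", "type": "Expression"},
                    "databaseName": {"value": "@item().databaseName", "type": "Expression"},
                },
            },
        }
        for name in table_pipelines
    ]
    return {
        "name": "MasterCopy",
        "properties": {
            "parameters": {"targets": {"type": "Array", "defaultValue": targets}},
            "activities": [
                {
                    "name": "ForEachTarget",
                    "type": "ForEach",
                    "typeProperties": {
                        "items": {
                            "value": "@pipeline().parameters.targets",
                            "type": "Expression",
                        },
                        "isSequential": True,
                        "activities": execute_activities,
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    targets = [
        {"serverName": "example-server", "databaseName": "Test"},
        {"serverName": "example-server", "databaseName": "Staging"},
    ]
    print(json.dumps(build_master_pipeline(targets, ["CopyStates", "CopyCodes"]), indent=2))
```

Generating the definition from a list keeps the per-table and per-database variation in data rather than in copied activities, which is the maintainability goal stated in the question.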
Adding the information from my comment as an answer.
Yes, we can use dynamic pipelines: copy multiple tables in bulk with 'Lookup' and 'ForEach'.
We can perform dynamic copies of your data table lists in bulk within a single pipeline. Lookup returns either the list of data or the first row of data.
ForEach: @activity('Azure SQL Table lists').output.value ;
@concat(item().TABLE_SCHEMA,'.',item().TABLE_NAME,'.csv')
This is efficient and cost-optimized since we are using fewer activities and datasets.

How to sync/update a database connection from MS Access to SQL Server

Problem:
I need to get data sets from CSV files into SQL Server Express (SSMS v17.6) as efficiently as possible. The data sets update daily into the same CSV files on my local hard drive. I am currently using MS Access 2010 (v14.0) as a middleman to aggregate the CSV files into linked tables.
Using the solutions below, the data transfers perfectly into SQL Server and does exactly what I want. But I cannot figure out how to refresh/update/sync the data at the end of each day with the newly added CSV data without having to re-import the entire data set each time.
Solutions:
Upsizing Wizard in MS Access - This works best in transferring all the tables perfectly to SQL Server databases. I cannot figure out how to update the tables though without deleting and repeating the same steps each day. None of the solutions or links that I have tried have panned out.
SQL Server Import/Export Wizard - This works fine also in getting the data over to SSMS one time. But I also cannot figure out how to update/sync this data with the new tables. Another issue is that choosing Microsoft Access as the data source through this method requires a .mdb file. The latest MS Access file formats are .accdb files so I have to save the database in an older .mdb version in order to export it to SQL Server.
Constraints:
I have no loyalty towards MS Access. I really am just looking for the most efficient way to get these CSV files consistently into a format where I can perform SQL queries on them. From all I have read, MS Access seems like the best way to do that.
I also have limited coding knowledge so more advanced VBA/C++ solutions will probably go over my head.
TLDR:
Trying to get several different daily updating local CSV files into a program where I can run SQL queries on them without having to do a full delete and re-import each day. Currently using MS Access 2010 to SQL Server Express (SSMS v17.6) which fulfills my needs, but does not update daily with the new data without re-importing everything.
Thank you!
You can use a staging table strategy to solve this problem.
When it's time to perform the daily update, import all of the data into one or more staging tables. Then execute SQL statements to insert rows that exist in the imported data but not in the base data, delete rows from the base data that don't exist in the imported data, and update base data rows whose values have changed in the imported data.
Use your data dependencies to determine in which order tables should be modified.
I would run all deletes first, then inserts, and finally all updates.
This should be a fun challenge!
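The staging-table steps above could be sketched roughly as follows. This is a simplified illustration, not a SQL Server implementation: Python's sqlite3 module stands in for SQL Server, and the table names base and staging, the key column id, and the single non-null value column are hypothetical.

```python
import sqlite3


def sync_from_staging(conn):
    """Bring `base` into alignment with `staging`: deletes, inserts, updates."""
    # 1. Delete base rows that no longer exist in the imported data.
    conn.execute("DELETE FROM base WHERE id NOT IN (SELECT id FROM staging)")
    # 2. Insert rows that exist in the imported data but not in the base data.
    conn.execute(
        "INSERT INTO base (id, value) "
        "SELECT id, value FROM staging WHERE id NOT IN (SELECT id FROM base)"
    )
    # 3. Update base rows whose values changed (assumes value is never NULL).
    conn.execute(
        "UPDATE base SET value = "
        "(SELECT s.value FROM staging s WHERE s.id = base.id) "
        "WHERE EXISTS (SELECT 1 FROM staging s "
        "WHERE s.id = base.id AND s.value <> base.value)"
    )
    # Purge the staging table once the comparison is done.
    conn.execute("DELETE FROM staging")
    conn.commit()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE base (id INTEGER PRIMARY KEY, value TEXT NOT NULL)")
    conn.execute("CREATE TABLE staging (id INTEGER PRIMARY KEY, value TEXT NOT NULL)")
    conn.executemany("INSERT INTO base VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])
    # Today's import: row 1 is gone, row 3 changed, row 4 is new.
    conn.executemany("INSERT INTO staging VALUES (?, ?)", [(2, "b"), (3, "C"), (4, "d")])
    sync_from_staging(conn)
    return conn.execute("SELECT id, value FROM base ORDER BY id").fetchall()


if __name__ == "__main__":
    print(demo())
```

With several related tables, the same three statements would be repeated per table in an order that respects the dependencies between them.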
EDIT
You said:
"I need to get data sets from CSV files into SQL Server Express (SSMS v17.6) as efficiently as possible."
The most efficient way to put data into SQL Server tables is using SQL Bulk Copy. This can be implemented from the command line, an SSIS job, or through ADO.NET via any .NET language.
You state:
"But I cannot figure out how to refresh/update/sync the data at the end of each day with the newly added CSV data without having to re-import the entire data set each time."
It seems you have two choices:
1. Toss the old data and replace it with the new data
2. Modify the old data so that it comes into alignment with the new data
In order to do number 1 above, you'd simply replace all the existing data with the new data, which you've already said you don't want to do, or at least you don't think you can do this efficiently. In order to do number 2 above, you have to compare the old data with the new data. In order to compare two sets of data, both sets of data have to be accessible wherever the comparison is to take place. So, you could perform the comparison in SQL Server, but the new data will need to be loaded into the database for comparison purposes. You can then purge the staging table after the process completes.
In thinking further about your issue, it seems the underlying issue is:
"I really am just looking for the most efficient way to get these CSV files consistently into a format where I can perform SQL queries on them."
There exist applications built specifically to allow you to query this type of data.
You may want to have a look at Log Parser Lizard or Splunk. These are great tools for querying and digging into data hidden inside flat data files.
An Append Query can incrementally add new records to an existing table. The question, however, is whether your starting data set (the CSV) contains only new records or also includes records already in the table.
This is a classic dilemma that needs to be handled in the Append Query setup.
If the CSV includes prior records, you have to establish the subset of new records inside the CSV and append just those. For instance, if you have a sequencing field, you can use a greater-than comparison against the maximum value in the existing table. If there is no such field, you need a NOT comparison (for example NOT EXISTS) of the table data against the CSV data to identify which CSV records are not already in the table.
You state that you seek something "more efficient", but in truth there is nothing more efficient than a wholesale delete of all records followed by a write of all records. Most of the time one can't do that, but if you can, I would just stick with it.
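For cases where a wholesale reload is not an option, the two ways of isolating new records described above might be sketched like this. It is only an illustration: Python's sqlite3 module stands in for the Access or SQL Server table, and the table names target and incoming and the columns seq and value are hypothetical.

```python
import sqlite3


def append_new_by_sequence(conn):
    """Append only incoming rows whose sequence number is above the table's max.

    Assumes sequence numbers are positive, so 0 works as the empty-table default.
    """
    conn.execute(
        "INSERT INTO target (seq, value) "
        "SELECT seq, value FROM incoming "
        "WHERE seq > (SELECT COALESCE(MAX(seq), 0) FROM target)"
    )
    conn.commit()


def append_new_by_compare(conn):
    """Append incoming rows that have no identical row in the target table."""
    conn.execute(
        "INSERT INTO target (seq, value) "
        "SELECT i.seq, i.value FROM incoming i "
        "WHERE NOT EXISTS (SELECT 1 FROM target t "
        "WHERE t.seq = i.seq AND t.value = i.value)"
    )
    conn.commit()


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target (seq INTEGER, value TEXT)")
    conn.execute("CREATE TABLE incoming (seq INTEGER, value TEXT)")
    conn.executemany("INSERT INTO target VALUES (?, ?)", [(1, "a"), (2, "b")])
    # The CSV load contains the two prior records plus one new one.
    conn.executemany("INSERT INTO incoming VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])
    return conn


def rows(conn):
    return conn.execute("SELECT seq, value FROM target ORDER BY seq").fetchall()


if __name__ == "__main__":
    conn = make_db()
    append_new_by_sequence(conn)
    print(rows(conn))
```

The sequence variant is cheaper because it touches only the tail of the data, while the comparison variant works without a sequencing field at the cost of comparing every incoming row.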

Visualization Using Tableau

I am new to Tableau and am having performance issues, so I need some help. I have a Hive query result in Azure Blob Storage named part-00000.
The issue is that I want to execute a custom query in Tableau and generate graphical reports from it in Tableau.
Can I do this? If so, how?
I have 7.0 M records in the Hive table.
You can find the custom query option in the data source connection (see the linked image).
You might want to consider creating an extract instead of a live connection. Additional considerations would include hiding unused fields and using filters at the data source level to limit data as per requirement.

Bteq Scripts to copy data between two Teradata servers

How do I copy data from multiple tables within one database to another database residing on a different server?
Is this possible through a BTEQ Script in Teradata?
If so, provide a sample.
If not, are there other options to do this other than using a flat-file?
This is not possible using BTEQ, since you have mentioned that the two databases reside on different servers.
There are two solutions for this:
1. Arcmain - You first use an Arcmain backup, which creates files containing the data from your tables. Then you use an Arcmain restore, which restores the data from those files.
2. TPT - Teradata Parallel Transporter. This is a very advanced tool. Unlike Arcmain, it does not create any files. It moves the data directly between two Teradata servers. (Wikipedia)
If I am understanding your question, you want to move a set of tables from one DB to another.
You can use the following syntax in a BTEQ Script to copy the tables and data:
CREATE TABLE <NewDB>.<NewTable> AS <OldDB>.<OldTable> WITH DATA AND STATS;
Or just the table structures:
CREATE TABLE <NewDB>.<NewTable> AS <OldDB>.<OldTable> WITH NO DATA AND NO STATS;
If you get real savvy you can create a BTEQ script that dynamically builds the above statement in a SELECT statement, exports the results, then in turn runs the newly exported file all within a single BTEQ script.
There are a bunch of other options that you can do with CREATE TABLE <...> AS <...>;. You would be best served reviewing the Teradata Manuals for more details.
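As a variation on the "dynamically build the statement" idea, the statements can also be generated outside BTEQ. The sketch below uses Python rather than a BTEQ SELECT and export, so it is a swapped-in technique, and the database and table names are hypothetical. It only produces the script text; running it through BTEQ is left to you.

```python
def build_copy_statements(old_db, new_db, tables, with_data=True):
    """Return one CREATE TABLE ... AS statement per table name."""
    suffix = "WITH DATA AND STATS" if with_data else "WITH NO DATA AND NO STATS"
    return [
        f"CREATE TABLE {new_db}.{table} AS {old_db}.{table} {suffix};"
        for table in tables
    ]


def build_script(old_db, new_db, tables, with_data=True):
    """Join the statements into script text, one statement per line."""
    return "\n".join(build_copy_statements(old_db, new_db, tables, with_data)) + "\n"


if __name__ == "__main__":
    # Hypothetical database and table names.
    print(build_script("OldDB", "NewDB", ["customers", "orders"]), end="")
```

As with the statements above, this copies tables between databases that the same session can reach.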
There are a few more options which will allow you to copy from one table to another.
Possibly the simplest way would be to write a smallish program which uses one of their communication layers (ODBC, .NET Data Provider, JDBC, cli, etc.) and use that to take a select statement and an insert statement. This would require some work, but it would have less overhead than trying to learn how to write TPT scripts. You would not need any 'DBA' permissions to write your own.
Teradata also sells other applications which hide the complexity of some of these tools. Teradata Data Mover provides an abstraction layer over tools like Arcmain and TPT. Access to this tool is most likely restricted to DBA types.
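The small select-and-insert program suggested above might look something like this sketch. It relies only on standard Python DB-API cursor methods, and two in-memory sqlite3 connections stand in for the two Teradata servers. The table name, the column names, and the ? parameter style are assumptions, and the parameter style in particular depends on the driver you actually use.

```python
import sqlite3


def copy_rows(src_conn, dst_conn, select_sql, insert_sql, batch_size=1000):
    """Stream rows from a SELECT on the source into an INSERT on the target."""
    src_cur = src_conn.cursor()
    dst_cur = dst_conn.cursor()
    src_cur.execute(select_sql)
    total = 0
    while True:
        # Fetch in batches so large tables do not have to fit in memory.
        batch = src_cur.fetchmany(batch_size)
        if not batch:
            break
        dst_cur.executemany(insert_sql, batch)
        total += len(batch)
    dst_conn.commit()
    return total


def demo():
    # Two separate connections stand in for the source and target servers.
    src = sqlite3.connect(":memory:")
    dst = sqlite3.connect(":memory:")
    for conn in (src, dst):
        conn.execute("CREATE TABLE states (statecode TEXT, statename TEXT)")
    src.executemany(
        "INSERT INTO states VALUES (?, ?)",
        [("AL", "Alabama"), ("AK", "Alaska"), ("AZ", "Arizona")],
    )
    copied = copy_rows(
        src,
        dst,
        "SELECT statecode, statename FROM states",
        "INSERT INTO states (statecode, statename) VALUES (?, ?)",
        batch_size=2,
    )
    return copied, dst.execute("SELECT COUNT(*) FROM states").fetchone()[0]


if __name__ == "__main__":
    print(demo())
```

Row-by-row copying like this is simple but slow for very large tables, which is where the bulk utilities mentioned in the other answers come in.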
If you want to move data from one server to another, you can do it with a flat file.
First, fetch the data from the source table into a flat file using a utility such as BTEQ or FastExport.
Then load this data into the target table with MultiLoad, FastLoad, or BTEQ scripts.

Copy all data from one SQLServer database to another on same machine

I want to copy all data from one database to another that has the same structure. The databases reside on the same machine and on the same SQL Server instance.
I have googled a bit and have found solutions like this:
INSERT states (statecode, statename)
SELECT statecode, statename
FROM server1.database1.dbo.states
But the problem is that these copy table by table, and I have more than 100 tables. I was wondering whether there is a way to copy all of the data at once.
Views and stored procedures should all be copied as well.
Or should I be looking in some other direction to achieve this?
If this is a one-time need, use the (Database) > Tasks > Generate Scripts menu option in SQL Server Management Studio.
Some options:
1. Use the DB backup and restore tools to just move a big backup file. This is the simplest option.
2. Replicate the second instance from the first. This keeps it up to date, but can be a pain.
3. Use the Import/Export Wizard to transfer the data from one DB to the other, and use Generate Scripts to transfer the procedures and views.
Check out tools like Red-Gate SQL Compare (for structural comparison) and SQL Data Compare (for data content compare). With Data Compare, you can also easily update one database from another (or a database backup, even).
They're not free - but if you have to do this several times over and over, just the time (not to speak of the hassle) you save yourself will easily outweigh the cost of purchasing these tools. Excellent stuff - highly recommended!