Are EBCDIC files supported by Azure Data Factory? - azure-data-factory-2

Can we ingest EBCDIC files using Azure Data Factory? If yes, which connectors support that? Does the DB2 connector support it?

You can copy EBCDIC files with the Copy Data activity; Data Flow can't handle EBCDIC-encoded files. You can vote here for this feature.
Many connectors support it, you can refer to this doc. But the DB2 connector doesn't support it.

Related

How can I export data from an Azure Storage table to a .csv file in .NET Core C#

Is there an Azure API to import/export an existing collection from Azure Table Storage in .csv?
The Table Storage REST API does not provide a response as CSV directly, so it's always necessary to transform the data accordingly, as, for example, Azure Storage Explorer does using an older version of AzCopy (v7.3).
I've built a little C# library that basically does the same. It currently caches all rows in memory to build the CSV headers, though, so that's something to be aware of.
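For illustration only (the library above is C#), here's a rough Python sketch of the same approach using the azure-data-tables SDK; the connection string and table name are placeholders, and like the library it caches everything in memory to build the header row:

```python
# Illustrative sketch: export an Azure Table to CSV with azure-data-tables.
# Connection string and table name are placeholders.
import csv
from azure.data.tables import TableServiceClient

CONNECTION_STRING = "<storage-connection-string>"
TABLE_NAME = "mytable"

service = TableServiceClient.from_connection_string(CONNECTION_STRING)
table = service.get_table_client(TABLE_NAME)

# Cache all entities in memory first so the CSV header can cover every property
# (entities in the same table may have different sets of properties).
entities = [dict(e) for e in table.list_entities()]
headers = sorted({key for entity in entities for key in entity})

with open("export.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=headers)
    writer.writeheader()
    writer.writerows(entities)
```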

Azure Function to convert a CSV to an Excel file

I have a requirement to read data from Azure SQL Server and write it to an Excel blob using Data Factory. I created a CSV file from Azure SQL Server using the Data Factory Copy activity. I have no idea how to convert the CSV to Excel, or how to read Excel directly from Azure SQL using Data Factory. I searched on the internet and found Azure Functions as an option.
Any suggestions you all have about saving CSV to XLSX via Azure Functions?
Excel format is supported for the following connectors: Amazon S3,
Azure Blob, Azure Data Lake Storage Gen1, Azure Data Lake Storage
Gen2, Azure File Storage, File System, FTP, Google Cloud Storage,
HDFS, HTTP, and SFTP. It is supported as source but not sink.
As the MSDN doc says, Excel format is not supported as a sink for now, so you can't directly convert a CSV file to an Excel file using the Copy activity.
In an Azure Function, you can create a Python function and use pandas to read the CSV file, then convert it to an Excel file, as @Marco Massetti comments.
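A minimal sketch of that approach, assuming the CSV already sits in Blob Storage; the connection string, container, and blob names are placeholders, and the Azure Functions trigger/binding boilerplate is left out:

```python
# Sketch: read a CSV blob with pandas and write it back as .xlsx.
# Connection string, container, and blob names are placeholders; in a real
# Azure Function this logic would sit inside the function body.
import io
import pandas as pd
from azure.storage.blob import BlobServiceClient

CONNECTION_STRING = "<storage-connection-string>"
CONTAINER = "data"

service = BlobServiceClient.from_connection_string(CONNECTION_STRING)
container = service.get_container_client(CONTAINER)

# Download the CSV produced by the Copy activity.
csv_bytes = container.get_blob_client("export.csv").download_blob().readall()
df = pd.read_csv(io.BytesIO(csv_bytes))

# Convert to Excel in memory (requires the openpyxl package) and upload.
xlsx_buffer = io.BytesIO()
df.to_excel(xlsx_buffer, index=False)
container.get_blob_client("export.xlsx").upload_blob(xlsx_buffer.getvalue(), overwrite=True)
```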

SSIS sending source OLE DB data to S3 buckets in Parquet file format

My source is SQL Server and I am using SSIS to export data to S3 buckets, but now my requirement is to send the files in Parquet file format.
Can you guys give some clues on how to achieve this?
Thanks,
Ven
For folks stumbling on this answer, Apache Parquet is a project that specifies a columnar file format employed by Hadoop and other Apache projects.
Unless you find a custom component or write some .NET code to do it, you're not going to be able to export data from SQL Server to a Parquet file. KingswaySoft's SSIS Big Data Components might offer one such custom component, but I have no familiarity with it.
If you were exporting to Azure, you'd have two options:
Use the Flexible File Destination component (part of the Azure feature pack), which exports to a Parquet file hosted in Azure Blob or Data Lake Gen2 storage.
Leverage PolyBase, a SQL Server feature. It lets you export to a Parquet file via the external table feature. However, that file has to be hosted in a location mentioned here. Unfortunately, S3 isn't an option.
If it were me, I'd move the data to S3 as a CSV file and then use Athena to convert the CSV file to Parquet. There is a nifty article here that talks through the Athena piece:
https://www.cloudforecast.io/blog/Athena-to-transform-CSV-to-Parquet/
Net-net, you'll need to spend a little money, get creative, switch to Azure, or do the conversion in AWS.
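For what it's worth, if "get creative" ends up meaning a small script outside SSIS, the conversion itself is not much code. This is only an illustrative sketch, not one of the SSIS options above; the connection string, query, bucket, and key are placeholders, and it assumes pyodbc, pandas, pyarrow, and boto3 are installed:

```python
# Sketch of a scripted alternative: pull rows from SQL Server, write Parquet
# locally, then upload to S3. All names and credentials are placeholders.
import boto3
import pandas as pd
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;DATABASE=mydb;"
    "UID=user;PWD=password"
)
df = pd.read_sql("SELECT * FROM dbo.MyTable", conn)

# to_parquet requires pyarrow (or fastparquet) to be installed.
df.to_parquet("mytable.parquet", index=False)

s3 = boto3.client("s3")
s3.upload_file("mytable.parquet", "my-bucket", "exports/mytable.parquet")
```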

Transfer a file from a computer to an Azure VM

I have a VB.NET application connected to a SQL Server database. This application handles files.
Recently, this application was connected to a SQL Server instance running in an Azure VM.
My question is, how can I handle the files?
I want my application to upload the files (over the internet) somewhere and then have the server side handle where these files are saved, and the opposite for downloads.
Can you tell me what options I have? I don't want OneDrive.
Depending on the kinds of files you store and the way your application handles them, you have multiple options with Azure: Azure Blob Storage (with the blob types Block, Append, and Page), Azure Files, or Azure Data Lake Store.
Azure Blob Storage:
The following blob types are great if your data is unstructured.
Block Blobs: for binary or text data, stored in blocks that can be managed individually.
Page Blobs: for random-access files; good for storing the VHDs that back VMs.
Append Blobs: similar to block blobs but append-only and optimized for append-only workloads; good for log file storage.
If you handle files using native file system APIs and want to "lift and shift" your application as is, Azure Files, which uses the SMB protocol, might be your best option.
Another option you might want to try, which is in preview (not generally available yet), is Azure Data Lake Store Gen2, which lets you interact with Azure Blob Storage through a file-system interface.
From the way you describe your application, I doubt you want to use Azure Disks service. Here is a comparison table to help you decide: https://learn.microsoft.com/en-us/azure/storage/common/storage-decide-blobs-files-disks?toc=%2fazure%2fstorage%2fblobs%2ftoc.json
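If you go the Block Blob route, here's a minimal sketch of the upload/download side using the Python SDK; the connection string and names are placeholders, and a VB.NET application would make the equivalent calls through the .NET storage SDK:

```python
# Sketch: upload and download a file with Azure Blob Storage (block blobs).
# Connection string, container, and file names are placeholders.
from azure.storage.blob import BlobServiceClient

CONNECTION_STRING = "<storage-connection-string>"
service = BlobServiceClient.from_connection_string(CONNECTION_STRING)
container = service.get_container_client("app-files")

# Upload a local file as a block blob.
with open("invoice.pdf", "rb") as data:
    container.upload_blob(name="invoices/invoice.pdf", data=data, overwrite=True)

# Download it again.
downloaded = container.get_blob_client("invoices/invoice.pdf").download_blob().readall()
with open("invoice-copy.pdf", "wb") as f:
    f.write(downloaded)
```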

Which dashboard analytics tools will support Parse.com as a data source?

I've developed an app that uses Parse.com as the back end. I now need a dashboard analytics software package (such as iDashboards) that will enable me to pull data from my Parse.com database classes and present some of that data in a pretty dashboard fashion.
iDashboards looks to be the kind of tool I'm after, but it only supports certain data source inputs such as JDBC, ODBC, SQL, MySQL, etc. Not being a database guru by any means, I'm not sure if Parse.com can be classed as any of the above, but from what I've read it doesn't come under any of these categories.
Can anybody recommend a way of connecting Parse.com to iDashboards, or suggest another dashboard tool that will support Parse.com as a data source?
The main issue you are facing is that data coming out of Parse.com is going to be in JSON format, while most dashboards prefer CSV files.
The best dashboard I am aware of is Tableau, and there is a discussion about getting JSON into Tableau here: http://community.tableau.com/ideas/1276
If your preference is using iDashboards, then you need to convert the JSON coming out of Parse into a CSV format that iDashboards can consume. You can do that using RJSON, as mentioned in the post above, but you'll probably have an easier time of it with a simple PHP or Python script that periodically connects to Parse, pulls out data updates for you, and then pushes them to your dashboard of choice.
Converting JSON to CSV in PHP is addressed here: Converting JSON to CSV format using PHP
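Here's a rough Python sketch of that periodic pull-and-convert script; the app ID, REST API key, and class name are placeholders, and it assumes the hosted Parse REST endpoint:

```python
# Sketch: pull rows from a Parse class via its REST API and flatten them to CSV.
# The app ID, REST key, and class name are placeholders.
import csv
import requests

headers = {
    "X-Parse-Application-Id": "<your-app-id>",
    "X-Parse-REST-API-Key": "<your-rest-api-key>",
}
resp = requests.get("https://api.parse.com/1/classes/GameScore", headers=headers)
resp.raise_for_status()
rows = resp.json()["results"]  # Parse returns objects under a "results" key

# Build the header from the union of all keys, since objects can differ.
# Nested values (pointers, arrays) are written as their string form here;
# see the denormalization caveat in the next answer.
fieldnames = sorted({key for row in rows for key in row})

with open("gamescore.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
```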
The difference is much more fundamental than "unsupported file format". In fact, JSON data coming out of Parse is stored in a so-called denormalized form, which means that a single JSON data file may contain the equivalent of arbitrarily many tables in a relational database. Stated differently, one JSON file may translate into potentially many CSV files, and there's no unique choice of how to perform that translation.
This is a so-called ETL problem, where ETL stands for Extract-Transform-Load. As such, you may be interested in open source ETL tools such as Kettle. Kettle is supported by Pentaho and includes functionality that can help you develop a workflow to turn JSON data into multiple CSV files that can then be imported into iDashboards (or similar). Aside from Kettle, Talend is also widely used for this purpose and has the same ability.
Finally, note that Parse is powered by MongoDB, and exports JSON data that is easily stored and manipulated in MongoDB. As such, a natural fit for reporting on Parse data is any reporting tool built for MongoDB.
As of the time of this writing, there are two such options:
JSON Studio, which is a commercial solution that is built explicitly for MongoDB and has your stated capability to produce dashboards.
SlamData, which is an open source solution, also built for MongoDB, which allows native SQL on the database. The current version does not have reporting capabilities (just CSV export), but the 2.09 version due out in June has reporting dashboards baked in.
An advantage of using a MongoDB reporting tool is that you will not have to wrangle your data into relational form. If it's heavily nested, uses arrays, and so forth, it can be quite painful to develop an ETL workflow and keep it in sync with how the data is changing. Instead, all you have to do is build a script to pipe the raw data from Parse into a MongoDB instance (perhaps hosted by MongoLab or equivalent, if you don't want to host it yourself), and connect the MongoDB reporting tool on top.
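A sketch of such a piping script, assuming pymongo and the same placeholder Parse endpoint and keys as in the earlier answer; the MongoDB URI is a placeholder too:

```python
# Sketch: pipe raw Parse objects into a MongoDB collection so a MongoDB
# reporting tool can sit on top. All credentials and names are placeholders.
import requests
from pymongo import MongoClient

headers = {
    "X-Parse-Application-Id": "<your-app-id>",
    "X-Parse-REST-API-Key": "<your-rest-api-key>",
}
objects = requests.get(
    "https://api.parse.com/1/classes/GameScore", headers=headers
).json()["results"]

client = MongoClient("mongodb://<user>:<password>@<host>:27017/")
collection = client["parse_mirror"]["GameScore"]

# Upsert by Parse's objectId so the script can be re-run periodically.
for obj in objects:
    obj["_id"] = obj["objectId"]  # reuse the Parse objectId as the Mongo _id
    collection.replace_one({"_id": obj["_id"]}, obj, upsert=True)
```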
You might also contact Parse and see if they have a recommended solution for this. It occurs to me they should probably bake some sort of analytical / reporting functionality into their APIs as this is such a common use case.
You can use Axibase Time Series Database to ingest your data from parse.com; it has built-in dashboards and widgets for visualization, or you can just export data from ATSD to CSV and use iDashboards.