How to reference user-defined operators (UDOs) in Microsoft Azure Data Lake Analytics without using Visual Studio - azure-data-lake

I have a TSV file in Azure Data Lake with the following fields:
paperId, language_code
I need to produce a file with the fields
language_id, language_code
where language_id is a unique id generated for each language code.
To do this I wrote a UDO, following the article https://learn.microsoft.com/en-us/azure/data-lake-analytics/data-lake-analytics-u-sql-develop-user-defined-operators.
using Microsoft.Analytics.Interfaces;
using System;
using System.Collections.Generic;

namespace USQL_UDO
{
    public class LanguageCode : IProcessor
    {
        // Maps each language code to the id generated for it.
        private static IDictionary<string, string> languageCodeID = new Dictionary<string, string>();

        public override IRow Process(IRow input, IUpdatableRow output)
        {
            long paperId = input.Get<long>("PaperId"); // read but not used in the output
            string languageCode = input.Get<string>("LanguageCode");
            string languageId;

            if (languageCodeID.ContainsKey(languageCode))
            {
                languageId = languageCodeID[languageCode];
            }
            else
            {
                languageId = GetTimestamp(DateTime.Now);
                languageCodeID[languageCode] = languageId;
            }

            output.Set<string>(0, languageId);
            output.Set<string>(1, languageCode);
            return output.AsReadOnly();
        }

        private static string GetTimestamp(DateTime value)
        {
            return value.ToString("yyyyMMddHHmmssfff");
        }
    }
}
But I cannot figure out a way to reference this in my U-SQL script. I cannot use Visual Studio as I'm working in a Linux environment. Is there a way to reference the custom class in a U-SQL query?
I'm very new to U-SQL and Azure, so I might be going about this in a completely nonsensical way.
My U-SQL script is this:
@inputA =
    EXTRACT PaperId long,
            LanguageCode string
    FROM "/graph/2018-04-13/PaperLanguages.txt"
    USING Extractors.Tsv(quoting : false);

@parsed_language =
    PROCESS @inputA
    PRODUCE Language_id string,
            LanguageCode string
    USING new USQL_UDO.LanguageCode();

OUTPUT @parsed_language
TO "/output/parsedData/mag2__language.csv"
USING Outputters.Text(outputHeader : true, quoting : false, delimiter : '~');

Could you use the VS Code ADL tooling from Linux instead?
In the worst case, you would compile your code, upload the DLL into your Azure Data Lake Store or Azure Storage account, and register it with CREATE ASSEMBLY. Then, in your U-SQL script, you bring in your code with a REFERENCE ASSEMBLY statement.
Some examples are here: https://blogs.msdn.microsoft.com/azuredatalake/2016/08/26/how-to-register-u-sql-assemblies-in-your-u-sql-catalog/
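A minimal sketch of that flow, assuming the class above is compiled into USQL_UDO.dll (on Linux, e.g. with Mono's csc against Microsoft.Analytics.Interfaces.dll) and uploaded to /assemblies/ in your store; the database, assembly name, and path are illustrative:
// Run once as its own job to register the uploaded DLL in the catalog:
USE DATABASE master;
CREATE ASSEMBLY IF NOT EXISTS [USQL_UDO] FROM "/assemblies/USQL_UDO.dll";
Then pull the code into the script above by adding, at the top:
REFERENCE ASSEMBLY master.[USQL_UDO];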

Related

REST API (JSON) that updates SQL Table using Windows Console Application and Scheduled Tasks

I am a newbie at JSON programming. Most of my experience is in C#, with some in XML and JavaScript, so I am a bit lost. I will attempt to be as specific as possible.
I have written a Windows console application that runs via the Task Scheduler. Basically, the application is supposed to take data from an API on a site that is managed by an outside company (but the information is owned by my company) and put that information into a SQL table. The API is pretty standard and returns JSON.
I am able to parse the JSON and (for example) display it in a command prompt, but I need to parse it and place it into a SQL table. I have read up on SQL injection attacks and I feel fairly confident that we have covered our bases there. The problem is that the table is not updated when the application runs, with or without the scheduler.
I have included a snippet of the JSON below, along with the code for my console application.
{"date":"2015-09-24","data":[{"cid":"17","rank":1},{"cid":"26","rank":1},{"cid":"80","rank":1},{"cid":"30","rank":1},{"cid":"90","rank":1},{"cid":"62","rank":1},{"cid":"147","rank":1},{"cid":"28","rank":1}"s":1,"e":null}
using System;
using System.Collections;
using System.IO;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Data.SqlClient;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Net.Http.Formatting;
using System.Threading.Tasks;
using System.Net;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

namespace JsonApiClient
{
    class Program
    {
        static void Main(string[] args)
        {
            ExecuteRiskSearch();
            Console.ReadLine();
        }

        static void ExecuteRiskSearch()
        {
            string url = "https://localhost/api/getWatchList/";
            string json = new WebClient().DownloadString(url);
            JObject results = JObject.Parse(json);
            foreach (var result in results)
            {
                string cid = (string)results["CID"];
                JToken rank = results["rank"];
                string risk = "";
                if (rank is JValue)
                {
                    risk = (string)rank;
                }
                else if (rank is JArray)
                {
                    risk = (string)((JArray)rank).First;
                }
                else
                {
                    SqlConnection connection = null;
                    SqlCommand command = null;
                    try
                    {
                        connection = new SqlConnection("Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=apiData;Data Source=serverName;");
                        command = new SqlCommand("UPDATE apiData.dbo.API SET [Category] WHERE CID=CID", connection);
                        connection.Open();
                        int numrows = command.ExecuteNonQuery();
                    }
                    catch (Exception ex)
                    {
                        System.Diagnostics.Debug.WriteLine(ex.Message);
                    }
                    finally
                    {
                        command.Dispose();
                        connection.Dispose();
                    }
                }
            }
        }
    }
}
What am I missing to make the JSON data update my SQL table? I have scoured Google search results and I haven't found much information. Any help would be so greatly appreciated.
Regarding the need to foreach over the correct part of the JSON object: your variable results contains the entire JSON object, from the "date" through the "e". You need to start with the "data" object and iterate through its array, or your string cid will error out on assignment, as it will be attempting to assign an array to a single value. The same goes for your JToken rank. I believe it should be this:
foreach (var datum in results["data"])
{
    string cid = (string)datum["cid"];
    JToken rank = datum["rank"];
    /* ... */
}
In addition, your SET clause isn't doing anything: you need SET columnName = newValue WHERE CID = cid (single =, not ==) to actually effect a change, where columnName is the column you wish to alter and newValue is your C# variable carrying the desired replacement.
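Since you mention SQL injection, a parameterized command is safer than concatenating values into the SQL text; a minimal sketch (the [Category] column and the risk/cid variables come from your code, the rest mirrors your connection string):
using (var connection = new SqlConnection("Integrated Security=SSPI;Initial Catalog=apiData;Data Source=serverName;"))
using (var command = new SqlCommand(
    "UPDATE apiData.dbo.API SET [Category] = @risk WHERE CID = @cid", connection))
{
    // Parameters are sent separately from the SQL text, so no escaping issues.
    command.Parameters.AddWithValue("@risk", risk);
    command.Parameters.AddWithValue("@cid", cid);
    connection.Open();
    int numRows = command.ExecuteNonQuery();
}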
It's also a best practice, when updating via an automated process, to set an updated-date field if one is present. Generally the convention is to have a created date and an updated date for each row in a table.
I hope this at least points you in the right direction.
-C§
As an alternative, you can send the entire JSON text to SQL Server and load it there.
SQL Server 2016 lets you parse JSON with a single command, OPENJSON. In older versions you can use existing CLR/JSON libraries such as Json4Sql or JsonSelect.
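A hedged sketch of that approach (assumes SQL Server 2016+; the table and column names are taken from the question, and the join condition is illustrative):
// Pass the raw JSON string to SQL Server and let OPENJSON shred the "data" array there.
using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand(
    @"UPDATE a SET a.[Category] = j.[rank]
      FROM apiData.dbo.API AS a
      JOIN OPENJSON(@json, '$.data')
           WITH (cid varchar(10) '$.cid', [rank] int '$.rank') AS j
        ON a.CID = j.cid;", connection))
{
    command.Parameters.AddWithValue("@json", json);
    connection.Open();
    command.ExecuteNonQuery();
}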

Basic info on how to export BLOBs as files

I have researched how to export BLOBs as image files. A DB has an IMAGE column storing several thousand images. I thought of exporting the table, but I get a BLOB file error in EMS SQL Manager for InterBase and Firebird.
There have been good posts, but I have still not been able to succeed.
SQL scripts to insert File to BLOB field and export BLOB to File
This example has appeared on numerous pages, including Microsoft's site, but I am using InterBase (Firebird). I have not found anything related to enabling xp_shell for Firebird or for EMS SQL Manager for InterBase and Firebird (which I have also installed); my guess is it's not possible. I also tried installing SQL Server Express, SQL Server 2008, and SQL Server 2012, but I am at a dead end without having even connected to the server, because I have not managed to start it. I followed the TechNet guide "How to: Start SQL Server Agent", but there are no services in the right pane for me.
PHP file to download an entire column (may not post link due to rep limitation).
It has a MySQL connect section that daunts me. The DB sits on my computer as a GDB file, and I also have XAMPP, so I can figure out a way to use this as a localhost environment. I hope this is making sense.
The last solution is to use bcp, an idea posted on Stack Overflow titled: fastest way to export blobs from table into individual files. I read the documentation and installed it, but cannot connect to the server. I use -S PC-PC -U xxx -P xxx (the server must be wrong), but the information I find all uses -T (Windows Authentication).
Summing up: I am using Firebird via EMS SQL Manager, and I am trying to extract all images from the images table into individual files. These tools both have SQL script screens, but they appear to work in conjunction with xp shell. What would you suggest? Am I using the wrong SQL manager to accomplish this?
There are several ways:
Use the isql command BLOBDUMP to write a blob to a file (see the sketch below),
Use a client library (e.g. Jaybird for Java, the Firebird .NET provider for C#) to retrieve the data,
With PHP you can use ibase_blob_get in a loop to get bytes from the blob, and write those to a file.
I neither use nor know EMS SQL Manager, so I don't know if (and how) you can export a blob with it.
The example you link to, and almost all the tools you mention, are for Microsoft SQL Server, not Firebird, so it is no wonder those don't work.
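For the first option, a minimal isql session might look like this (the table comes from the examples below; the blob ID 85:1 is illustrative, isql prints the real IDs when you SELECT a blob column):
SQL> SELECT filename, filedata FROM imagestorage;
SQL> BLOBDUMP 85:1 image1.jpg;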
Example in Java
A basic example to save blobs to disk using Java 8 (it might also work on Java 7) would be:
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/**
 * Example to save images to disk from a Firebird database.
 * <p>
 * Code assumes a table with the following structure:
 * <pre>
 * CREATE TABLE imagestorage (
 *     filename VARCHAR(255),
 *     filedata BLOB SUB_TYPE BINARY
 * );
 * </pre>
 * </p>
 */
public class StoreImages {
    // Replace testdatabase with alias or path of database
    private static final String URL = "jdbc:firebirdsql://localhost/testdatabase?charSet=utf-8";
    private static final String USER = "sysdba";
    private static final String PASSWORD = "masterkey";
    private static final String DEFAULT_FOLDER = "D:\\Temp\\target";

    private final Path targetFolder;

    public StoreImages(String targetFolder) {
        this.targetFolder = Paths.get(targetFolder);
    }

    public static void main(String[] args) throws IOException, SQLException {
        final String targetFolder = args.length == 0 ? DEFAULT_FOLDER : args[0];
        final StoreImages storeImages = new StoreImages(targetFolder);
        storeImages.store();
    }

    private void store() throws IOException, SQLException {
        if (!Files.isDirectory(targetFolder)) {
            throw new FileNotFoundException(String.format("The folder %s does not exist", targetFolder));
        }
        try (
            Connection connection = DriverManager.getConnection(URL, USER, PASSWORD);
            Statement stmt = connection.createStatement();
            ResultSet rs = stmt.executeQuery("SELECT filename, filedata FROM imagestorage")
        ) {
            while (rs.next()) {
                final Path targetFile = targetFolder.resolve(rs.getString("FILENAME"));
                if (Files.exists(targetFile)) {
                    System.out.printf("File %s already exists%n", targetFile);
                    continue;
                }
                try (InputStream data = rs.getBinaryStream("FILEDATA")) {
                    Files.copy(data, targetFile);
                }
            }
        }
    }
}
Example in C#
Below is an example in C#; it is similar to the code above.
using System;
using System.IO;
using FirebirdSql.Data.FirebirdClient;

class StoreImages
{
    private const string DEFAULT_FOLDER = @"D:\Temp\target";
    private const string DATABASE = @"D:\Data\db\fb3\fb3testdatabase.fdb";
    private const string USER = "sysdba";
    private const string PASSWORD = "masterkey";

    private readonly string targetFolder;
    private readonly string connectionString;

    public StoreImages(string targetFolder)
    {
        this.targetFolder = targetFolder;
        connectionString = new FbConnectionStringBuilder
        {
            Database = DATABASE,
            UserID = USER,
            Password = PASSWORD
        }.ToString();
    }

    static void Main(string[] args)
    {
        string targetFolder = args.Length == 0 ? DEFAULT_FOLDER : args[0];
        var storeImages = new StoreImages(targetFolder);
        storeImages.store();
    }

    private void store()
    {
        if (!Directory.Exists(targetFolder))
        {
            throw new FileNotFoundException(string.Format("The folder {0} does not exist", targetFolder), targetFolder);
        }
        using (var connection = new FbConnection(connectionString))
        {
            connection.Open();
            using (var command = new FbCommand("SELECT filename, filedata FROM imagestorage", connection))
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    string targetFile = Path.Combine(targetFolder, reader["FILENAME"].ToString());
                    if (File.Exists(targetFile))
                    {
                        Console.WriteLine("File {0} already exists", targetFile);
                        continue;
                    }
                    using (var fs = new FileStream(targetFile, FileMode.Create))
                    {
                        byte[] filedata = (byte[])reader["FILEDATA"];
                        fs.Write(filedata, 0, filedata.Length);
                    }
                }
            }
        }
    }
}

Only show effective SQL string P6Spy

I'm using P6Spy to log the SQL statements generated by my program. The format of the output in the spy.log file looks like this:
current time|execution time|category|statement SQL String|effective SQL string
I'm just wondering if anyone knows whether there's a way to alter the spy.properties file so that only the last column, the effective SQL string, is output to the spy.log file. I've looked through the properties file but haven't found anything that seems to support this.
Thanks!
In spy.properties there is a property called logMessageFormat that you can set to a custom implementation of MessageFormattingStrategy. This works for any type of logger (e.g. file, slf4j, etc.).
E.g.
logMessageFormat=my.custom.PrettySqlFormat
An example using Hibernate's pretty-printing SQL formatter:
package my.custom;

import org.hibernate.jdbc.util.BasicFormatterImpl;
import org.hibernate.jdbc.util.Formatter;
import com.p6spy.engine.spy.appender.MessageFormattingStrategy;

public class PrettySqlFormat implements MessageFormattingStrategy {

    private final Formatter formatter = new BasicFormatterImpl();

    @Override
    public String formatMessage(int connectionId, String now, long elapsed, String category, String prepared, String sql) {
        return formatter.format(sql);
    }
}
There is no such option to achieve this via configuration only yet. I think you have 2 options here:
file a new bug/feature request report (which could benefit others using p6spy as well) at: https://github.com/p6spy/p6spy/issues?state=open or
provide a custom implementation.
For the latter option, I believe you could achieve it via your own class (depending on the logger you use; let's assume you use Log4jLogger).
Well, if you check the relevant part of the Log4jLogger on GitHub as well as the SourceForge version, your implementation should be rather straightforward:
spy.properties:
appender=com.EffectiveSQLLog4jLogger
The implementation itself could look like this:
package com;

import com.p6spy.engine.logging.appender.Log4jLogger;

public class EffectiveSQLLog4jLogger extends Log4jLogger {

    public void logText(String text) {
        super.logText(getEffectiveSQL(text));
    }

    private String getEffectiveSQL(String text) {
        if (null == text) {
            return null;
        }

        final int idx = text.lastIndexOf("|");

        // non-perfect detection of the exception logged case
        if (-1 == idx) {
            return text;
        }

        return text.substring(idx + 1); // not sure about + 1, but check and see :)
    }
}
Please note the implementation should cover GitHub (the new project home, no version released yet) as well as SourceForge (the original project home, released version 1.3).
Please note: I didn't test the proposal myself, but it could be a good starting point, and from the code review itself I'd say it could work.
I agree with @boberj, we are used to having logs with the Hibernate formatter, but don't forget about batching; that's why I suggest using:
import com.p6spy.engine.spy.appender.MessageFormattingStrategy;
import org.hibernate.engine.jdbc.internal.BasicFormatterImpl;
import org.hibernate.engine.jdbc.internal.Formatter;

/**
 * Created by Igor Dmitriev on 1/3/16
 */
public class HibernateSqlFormatter implements MessageFormattingStrategy {

    private final Formatter formatter = new BasicFormatterImpl();

    @Override
    public String formatMessage(int connectionId, String now, long elapsed, String category, String prepared, String sql) {
        if (sql.isEmpty()) {
            return "";
        }
        String template = "Hibernate: %s %s {elapsed: %sms}";
        String batch = "batch".equals(category) ? ((elapsed == 0) ? "add batch" : "execute batch") : "";
        return String.format(template, batch, formatter.format(sql), elapsed);
    }
}
In P6Spy 3.9 this can be achieved quite simply. In spy.properties, set
customLogMessageFormat=%(effectiveSql)
You can patch com.p6spy.engine.spy.appender.SingleLineFormat.java,
removing the prepared element and any reference to P6Util, like so:
package com.p6spy.engine.spy.appender;

public class SingleLineFormat implements MessageFormattingStrategy {

    @Override
    public String formatMessage(final int connectionId, final String now, final long elapsed, final String category, final String prepared, final String sql) {
        return now + "|" + elapsed + "|" + category + "|connection " + connectionId + "|" + sql;
    }
}
Then compile just that file, with p6spy.jar on the classpath so MessageFormattingStrategy resolves:
javac -cp p6spy.jar com/p6spy/engine/spy/appender/SingleLineFormat.java
And replace the existing class file in p6spy.jar with the new one.
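For example, using the jar tool's update flag (assuming the compiled .class file sits in the matching package directory):
jar uf p6spy.jar com/p6spy/engine/spy/appender/SingleLineFormat.class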

TFS 2012 Backup and Restore BuildDefinitions only

I installed TFS 2012 as a test system and am doing some tests before we go productive.
This includes defining many build definitions, which was a lot of work.
After the tests are successful, a new server will be installed with TFS 2012 on it.
For this new server, which will then operate as the productive system, I would like to restore the build definitions from the test system. But only the build definitions, not the whole team collections, because I ran test check-ins and I don't want those on my productive server.
Now, is it possible to back up and restore build definitions only?
Maybe it is possible directly through the SQL database, but I'm a little afraid of references there pointing to other tables.
Best regards, Peter Bucher
Build definitions are not source controlled. The only option is relying on the TFS database backup, where you can restore or view the tbl_BuildDefinition* tables in the Tfs_DefaultCollection database.
There is a UserVoice item for this feature, and you can also use the TFS API to do it.
Add a vote on UserVoice:
provide a way to version-control build definitions
Using the TFS API:
How can I copy a TFS 2010 Build Definition?
Finally, I decided not to touch the database, because there are references to a lot of other tables.
I used the TFS API v11 (TFS 2012) and a bit of C# code, which I adapted to my needs from this base: How can I copy a TFS 2010 Build Definition?
It copies all build definitions from one TFS 2012 server to another. For both servers you need to specify a team collection and a team project.
So the copy task has to be done per team project.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Microsoft.TeamFoundation.Build.Client;
using Microsoft.TeamFoundation.Client;
using Microsoft.TeamFoundation.VersionControl.Client;

namespace TFSBuildDefinitionCreator
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            // Copies build definitions from one server to another.
            // Uses the TeamFoundation API V11 (TFS2012).
            // Code was used to copy build definitions from a test server to a productive one.
            string sourceServer = "http://testTfs:8080/tfs/MyTeamCollection";
            string sourceTeamProject = "MyTeamProject";

            string targetServer = "https://productiveTfs:8080/tfs/MyTeamCollection";
            string targetTeamProject = "MyTeamProject";

            // DropLocation for definitions: share on which the builds should be dropped.
            string defaultDropLocation = "\\\\MyBuildserver\\Builds$";

            // Change the DefaultProcessTemplate in the following method below: GetDefaultProcessTemplateByServerPathFromBuildServer.
            CopyBuildDefinitions(sourceServer, sourceTeamProject, targetServer, targetTeamProject, defaultDropLocation);

            Console.Read();
        }

        private static IBuildServer GetBuildServerFromServerUrl(string serverUrl)
        {
            var tfs = TeamFoundationServerFactory.GetServer(serverUrl);
            return (IBuildServer)tfs.GetService(typeof(IBuildServer));
        }

        private static IBuildController GetDefaultBuildControllerFromBuildServer(IBuildServer buildServer)
        {
            return buildServer.QueryBuildControllers()[0];
        }

        private static IProcessTemplate GetDefaultProcessTemplateByServerPathFromBuildServer(IBuildServer buildServer, string teamProject)
        {
            var processTemplates = buildServer.QueryProcessTemplates(teamProject);
            var result = processTemplates.First(t => t.ServerPath.Contains("/BuildProcessTemplates/MyDefaultTemplate.xaml"));
            return result;
        }

        private static void CopyBuildDefinitions(string sourceServer, string sourceTeamProject, string targetServer,
            string targetTeamProject, string defaultDropLocation)
        {
            var sourceBuildServer = GetBuildServerFromServerUrl(sourceServer);
            var sourceBuildDetails = sourceBuildServer.QueryBuildDefinitions(sourceTeamProject);

            foreach (var sourceBuildDetail in sourceBuildDetails)
            {
                CopyBuildDefinition(sourceBuildDetail, targetServer, targetTeamProject, defaultDropLocation);
            }
        }

        private static void CopyBuildDefinition(IBuildDefinition buildDefinition, string targetServer, string targetTeamProject, string defaultDropLocation)
        {
            var targetBuildServer = GetBuildServerFromServerUrl(targetServer);

            var buildDefinitionClone = targetBuildServer.CreateBuildDefinition(targetTeamProject);
            buildDefinitionClone.BuildController = GetDefaultBuildControllerFromBuildServer(targetBuildServer);
            buildDefinitionClone.ContinuousIntegrationType = buildDefinition.ContinuousIntegrationType;
            buildDefinitionClone.ContinuousIntegrationQuietPeriod = buildDefinition.ContinuousIntegrationQuietPeriod;

            // Still to adjust.
            //buildDefinitionClone.DefaultDropLocation = buildDefinition.DefaultDropLocation;
            buildDefinitionClone.DefaultDropLocation = defaultDropLocation;
            buildDefinitionClone.Description = buildDefinition.Description;
            buildDefinitionClone.Enabled = buildDefinition.Enabled;

            //buildDefinitionClone.Name = String.Format("Copy of {0}", buildDefinition.Name);
            buildDefinitionClone.Name = buildDefinition.Name;

            //buildDefinitionClone.Process = buildDefinition.Process;
            buildDefinitionClone.Process = GetDefaultProcessTemplateByServerPathFromBuildServer(targetBuildServer, targetTeamProject);
            buildDefinitionClone.ProcessParameters = buildDefinition.ProcessParameters;

            foreach (var schedule in buildDefinition.Schedules)
            {
                var newSchedule = buildDefinitionClone.AddSchedule();
                newSchedule.DaysToBuild = schedule.DaysToBuild;
                newSchedule.StartTime = schedule.StartTime;
                newSchedule.TimeZone = schedule.TimeZone;
            }

            foreach (var mapping in buildDefinition.Workspace.Mappings)
            {
                buildDefinitionClone.Workspace.AddMapping(
                    mapping.ServerItem, mapping.LocalItem, mapping.MappingType, mapping.Depth);
            }

            buildDefinitionClone.RetentionPolicyList.Clear();
            foreach (var policy in buildDefinition.RetentionPolicyList)
            {
                buildDefinitionClone.AddRetentionPolicy(
                    policy.BuildReason, policy.BuildStatus, policy.NumberToKeep, policy.DeleteOptions);
            }

            buildDefinitionClone.Save();
        }
    }
}
Hope that helps others.

Getting assemblyinfo of a feature in Sharepoint

I have a method that collects the assembly version of a web part, and it works fine:
private void GetVersion(object control, out string name, out string version)
{
    name = control.GetType().ToString();
    version = control.GetType().Assembly.GetName().Version.ToString();
}
Now I want to achieve the same for my features:
private void GetFeatureVersion(SPFeature feature, out string name, out string version)
{
    name = feature.Definition.GetTitle(new System.Globalization.CultureInfo("en-us"));
    version = feature.GetType().Assembly.GetName().Version.ToString();
}
But the assembly returned by feature.GetType() carries the version information of SharePoint (14.0.0.0), not of my feature. The name variable is fine, but that's no surprise, as it is not read from the type.
I added the following to the template.xml file:
ReceiverAssembly="$SharePoint.Project.AssemblyFullName$"
That did the trick.
If you want to get the version of the feature receiver assembly, you can do the following:
string version = Assembly.Load(feature.Definition.ReceiverAssembly).GetName().Version.ToString();