How would I implement custom functions in SQL?
Let's say I have a list of values and I want to transform them in a way for which SQL doesn't supply a built-in function. E.g. look at every field's string, count the number of chars, and append that count to the end of each string.
In a programming language this is easy, e.g. save the field's string in a variable A, loop through the string char by char, increasing a counter by one each time, and then append the counter to variable A.
But how would I do this in SQL? Can I implement EVERY function solely with the means of SQL, or do I need to use a common programming language like Java, or something like PL/SQL for that?
Thanks so much.
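As a rough illustration of the two routes the question mentions, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite as the database; the table name items and the function name append_length are made up for this example. It registers a host-language function so SQL can call it, and also shows that this particular transformation can be written with SQLite's built-in length() and || operator alone.

```python
import sqlite3


def append_length(text):
    """Return the text with its character count appended, e.g. 'abc' -> 'abc3'."""
    if text is None:
        return None
    return f"{text}{len(text)}"


def demo_append_length():
    """Run the transformation twice: via a registered function and via plain SQL."""
    conn = sqlite3.connect(":memory:")
    # Make the Python function callable from SQL under a hypothetical name.
    conn.create_function("append_length", 1, append_length)
    conn.execute("CREATE TABLE items (name TEXT)")
    conn.executemany("INSERT INTO items VALUES (?)", [("abc",), ("hello",)])

    via_udf = [row[0] for row in conn.execute(
        "SELECT append_length(name) FROM items ORDER BY rowid")]
    via_sql = [row[0] for row in conn.execute(
        "SELECT name || length(name) FROM items ORDER BY rowid")]
    conn.close()
    return via_udf, via_sql
```

Other database systems offer their own mechanisms for user-defined functions, so the registration step would look different there.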
Related
Please help,
How could I extract 2019-04-02 out of the following string with Azure data flow expression?
ABC_DATASET-2019-04-02T02:10:03.5249248Z.parquet
The first part of the string, received as a ChildItem from a GetMetaData activity, is dynamic. In this case, ABC_DATASET is the dynamic part.
Kind regards,
D
There are several ways to approach this problem, and they are really dependent on the format of the string value. Each of these approaches uses Derived Column to either create a new column or replace the existing column's value in the Data Flow.
Static format
If the format is always the same, meaning the length of the sections is always the same, then substring is simplest:
This will parse the string like so:
Useful reminder: substring and array indexes in Data Flow are 1-based.
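For comparison outside Data Flow, the same fixed-position idea can be sketched in Python. This is only an illustration, not the Data Flow expression itself. It assumes the prefix is always 12 characters long (ABC_DATASET plus the hyphen), and the function name extract_date_static is made up for this example. Note that Python slices are 0-based, unlike Data Flow's 1-based substring.

```python
def extract_date_static(file_name, prefix_length=12, date_length=10):
    """Cut a fixed-width yyyy-MM-dd date out of a fixed-format file name.

    prefix_length counts the characters before the date, including the hyphen.
    """
    return file_name[prefix_length:prefix_length + date_length]


sample = "ABC_DATASET-2019-04-02T02:10:03.5249248Z.parquet"
print(extract_date_static(sample))
```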
Dynamic format
If the format of the base string is dynamic, things get a tad trickier. For this answer, I will assume that the basic format of {variabledata}-{timestamp}.parquet is consistent, so we can use the hyphen as a base delineator.
Derived Column has support for local variables, which is really useful when solving problems like this one. Let's start by creating a local variable to convert the string into an array based on the hyphen. This will lead to some other problems later since the string includes multiple hyphens thanks to the timestamp data, but we'll deal with that later. Inside the Derived Column Expression Builder, select "Locals":
On the right side, click "New" to create a local variable. We'll name it and define it using a split expression:
Press "OK" to save the local and go back to the Derived Column. Next, create another local variable for the yyyy portion of the date:
The cool part of this is I am now referencing the local variable array that I created in the previous step. I'll follow this pattern to create a local variable for MM too:
I'll do this one more time for the dd portion, but this time I have to do a bit more to get rid of all the extraneous data at the end of the string. Substring again turns out to be a good solution:
Now that I have the components I need isolated as variables, we just reconstruct them using string interpolation in the Derived Column:
Back in our data preview, we can see the results:
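The split-then-reassemble steps above can also be sketched in Python for readers who want to check the logic outside Data Flow. This is an approximation, not the Data Flow expressions: it assumes the {variabledata}-{timestamp}.parquet layout, and it takes the last three hyphen-separated pieces so that hyphens inside the dynamic prefix don't shift the indexes. The function name extract_date_dynamic is hypothetical.

```python
def extract_date_dynamic(file_name):
    """Rebuild yyyy-MM-dd from a name shaped like {variabledata}-{timestamp}.parquet."""
    # Split from the right so only the timestamp's own hyphens are used.
    parts = file_name.rsplit("-", 3)
    if len(parts) < 4:
        raise ValueError(f"unexpected file name format: {file_name!r}")
    year = parts[-3]
    month = parts[-2]
    # The last piece still carries the time and extension, e.g. '02T02:10:03...'.
    day = parts[-1][:2]
    return f"{year}-{month}-{day}"


print(extract_date_dynamic("ABC_DATASET-2019-04-02T02:10:03.5249248Z.parquet"))
```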
Where else to go from here
If these solutions don't address your problem, then you have to get creative. Here are some other functions that may help:
regexSplit
left
right
dropLeft
dropRight
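To show what a pattern-based approach might look like, here is a small Python sketch that searches for the first yyyy-MM-dd shaped run of digits. It illustrates the idea only and is not a Data Flow expression. The function name find_date is made up, and it assumes the dynamic prefix never contains a date-like pattern of its own, since the first match wins.

```python
import re

# Four digits, hyphen, two digits, hyphen, two digits.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def find_date(file_name):
    """Return the first yyyy-MM-dd shaped substring, or None if there is none."""
    match = DATE_PATTERN.search(file_name)
    return match.group(0) if match else None


print(find_date("ABC_DATASET-2019-04-02T02:10:03.5249248Z.parquet"))
```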
I'm creating a function for querying a SQLite database that should be generic in the sense of reading from multiple pre-defined tables. As part of the function's parameters, it is necessary to tell which column of the table should be read, and that information is supposed to be passed as an enumerator value. So the call would be something like this:
callMyFunction(enumTableA,enumColumnB);
Since enumColumnB is an enumerator value, the argument is an integer and I would like to identify the column by that integer value without having to use a switch-case for that. Something like this:
SELECT column(enumColumnB) from ...
So instead of passing the name of the column or reading all columns with * and then selecting the desired one when reading, it would use the number of the column as passed by the enumerator value to identify what should be read.
Is it possible to do this in SQLite? I couldn't find any tutorial mentioning such a possibility, so I've almost concluded that there is none (and that I'll have to pass the name of the column as an argument or use a switch-case in the end), but I'd like to be sure this option isn't available.
SQLite has no mechanism for indirect columns.
(SQLite is designed as an embedded database, i.e., to be used together with a 'real' programming language.)
You have to replace the column name in whatever programming language you're using.
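One way to do that replacement without a switch-case is a lookup table indexed by the enumerator value. The sketch below uses Python's sqlite3 module purely as an illustration. The enum names, table_a, and its columns are assumptions, not taken from the question. Because the names come from a fixed list in the program rather than from user input, building the query text from them is safe here.

```python
import sqlite3
from enum import IntEnum


class Table(IntEnum):
    A = 0


class ColumnA(IntEnum):
    ID = 0
    NAME = 1
    PRICE = 2


# Hypothetical schema: table names by Table value, column names by column value.
TABLE_NAMES = ("table_a",)
COLUMN_NAMES = {"table_a": ("id", "name", "price")}


def read_column(conn, table, column):
    """Read one column, chosen by enumerator value, from a pre-defined table."""
    table_name = TABLE_NAMES[table]
    column_name = COLUMN_NAMES[table_name][column]
    # Identifiers cannot be bound as parameters, so they are substituted here.
    query = f'SELECT "{column_name}" FROM "{table_name}"'
    return [row[0] for row in conn.execute(query)]
```

A call such as read_column(conn, Table.A, ColumnA.NAME) then plays the role of callMyFunction(enumTableA, enumColumnB).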
I am creating a database in which certain derived attributes are computed using the Universal Gravitational Constant (G), whose approximate value is 6.673 * 10^-11.
I understand that a normal integer constant can be defined using a scalar function as follows:
CREATE FUNCTION MY_CONST()
RETURNS INT
AS
BEGIN
RETURN 123456789
END
Thing is, I'm new to SQL and not sure how to store a complex value such as G in there. In popular high-level programming languages like Java, I usually define the value as 6.673e-11 at the top of the editor and call it whenever I need it in my calculations. I would like to know how to simply do the same in SQL.
I really just don't get how to translate the value into SQL code as a constant.
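Most SQL dialects accept floating-point literals in scientific notation (such as 6.673E-11), so a function like the one above could likely be declared with a FLOAT return type instead of INT. The following sketch demonstrates the idea with Python's sqlite3 module rather than the asker's database, so treat it as an illustration only. The SQL function name grav_const and the helper names are made up.

```python
import sqlite3

# Approximate value from the question, written in scientific notation.
G = 6.673e-11


def gravitational_constant():
    """Return the constant; registered below as a zero-argument SQL function."""
    return G


def open_connection():
    conn = sqlite3.connect(":memory:")
    # Hypothetical SQL name; it behaves like a named constant inside queries.
    conn.create_function("grav_const", 0, gravitational_constant)
    return conn


def gravitational_force(conn, m1, m2, r):
    """Compute G * m1 * m2 / r^2 inside SQL, using the registered constant."""
    row = conn.execute(
        "SELECT grav_const() * ? * ? / (? * ?)", (m1, m2, r, r)
    ).fetchone()
    return row[0]
```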
Suppose you have a DB table like this:
Table t
....
column_a integer
column_b varchar(255)
....
Now, I want to store a string that is composed of a list of names in t.column_b, with the following format (separated by commas):
Word A, Word B, Word C...
The problem is, it might be the case that the string is larger than 255 characters and in my application logic I don't want to blindly trim to 255, but instead store the maximum number of words possible, eliminating the last word that exceeds the size. Also, I want to develop in such a way that if the column changes size, I don't want to change my application. Is it possible to write a SQL query that retrieves the declared size of a column? Or perhaps, I should use another column type?
If relevant, I am using Informix.
Thanks in advance.
Informix truncates blindly at the limit unless your database is MODE ANSI.
The DBI defines metadata attributes for columns and DBD::Informix implements them.
For a statement handle, $sth, you can use:
$sth->{PRECISION}->[0]
to get the precision (length) of the first column in the output.
See perldoc DBI under 'Statement Handle Attributes'.
If you need to know the type information for some column, write a SELECT statement, prepare it, then analyze the statement handle.
Because this is defined by DBI, you will get the same behaviour with any driver (DBD::YourDBMS).
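Once the column length is known, the trimming rule from the question (keep as many whole words as fit, drop the first word that would overflow) is simple to express. Here is a minimal Python sketch of that rule. The function name fit_words is made up, and max_len stands in for whatever length the driver's metadata reports.

```python
def fit_words(words, max_len, sep=", "):
    """Join as many leading words as fit within max_len characters.

    Stops at the first word that would push the joined string past max_len.
    """
    kept = []
    length = 0
    for word in words:
        # Every word after the first also costs one separator.
        extra = len(word) + (len(sep) if kept else 0)
        if length + extra > max_len:
            break
        kept.append(word)
        length += extra
    return sep.join(kept)


print(fit_words(["Word A", "Word B", "Word C"], 15))
```

Passing the length in as a parameter means the application code does not change if the column is later resized.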
In this application I need to allow users to enter a month as an integer (1-12) and then use Integer.TryParse to validate that input; that seems to be the easy part. I need to create two functions: one that returns the name of the month and another that returns the number of days in that month. The arrays are supposed to be defined and initialized within the functions so that the main program can take the user input, call the two functions, and then return the appropriate values as output to labels. I am not sure how to declare the arrays in their respective functions, or how to call those functions to retrieve the right value.
Since this is homework I'm not gonna write the code for you, but this should be pretty simple. Assuming the number is in the textbox and the user presses the OK button (or whatever the functionality is), the code for that OK button should include calls to the two functions you create, let's say GetMonth and GetDays.
GetMonth would take an integer, and frankly, I don't really see the need for declaring any arrays here. If the array declaration is part of your assignment then you can do that, but it just doesn't seem necessary. A simple Select...Case statement seems more straightforward: you just set up the cases for the integer being passed into the function, 1-12, and return the month's string. Similarly for GetDays, just set up cases 1-12 and return the number of days.
If you're unfamiliar with any of this, take a look at these MSDN articles; they should point you in the right direction:
Functions
Cases
Arrays
Hope this helps!
Edit: Realized I never expanded on how you could do this with arrays anyway (meant to, sorry). You would just create two arrays of size 12 (two string arrays, or one string array and one integer array), and then define each of the 12 elements in each array as whatever month or number of days you need. Then in the function, just return something like arrayDays[x - 1], where x is the input being passed in (the array is zero-based, while the months run from 1 to 12). (If you want to be fancy, you could create a 12x2 string array and store all the information in one place.) But I'm pretty sure it'll take slightly less code to do a Select...Case statement (it just seems more direct to me).
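To make the array approach concrete without giving away the assignment in its own language, here is a rough sketch in Python. All names are made up, February is fixed at 28 days (leap years are ignored), and parse_month stands in for the TryParse-style validation step.

```python
def get_month_name(month):
    """Return the month name for 1-12; the array lives inside the function."""
    names = ["January", "February", "March", "April", "May", "June",
             "July", "August", "September", "October", "November", "December"]
    # Arrays are zero-based, so month 1 is stored at index 0.
    return names[month - 1]


def get_days_in_month(month):
    """Return the number of days for month 1-12 (February fixed at 28)."""
    days = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
    return days[month - 1]


def parse_month(text):
    """Return the month as an int if text is a whole number from 1 to 12, else None."""
    try:
        value = int(text)
    except ValueError:
        return None
    return value if 1 <= value <= 12 else None


month = parse_month("4")
if month is not None:
    print(get_month_name(month), get_days_in_month(month))
```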