Convert all table columns into a binary string - sql

I currently have this SQL:
CREATE TEMPORARY VIEW binary_input_table
AS
SELECT binary(CONCAT(column_1, column_2, column_3)) AS binary_input_str
FROM input_table;
where I need binary_input_str as an input to a custom UDF I made. However, this solution isn't scalable when there are thousands of columns, which I would then have to CONCAT manually. I've also tried SELECT binary(*)... but that fails because binary only expects one argument.
Is there an easy way to convert all the columns into binary and store it into a variable?

Simple solution:
SELECT binary(CONCAT(*)) AS binary_input_str
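Applied back to the view from the question, that would look roughly like the following (a sketch that assumes a dialect such as Spark SQL, where * is allowed to expand to all columns inside CONCAT):
CREATE TEMPORARY VIEW binary_input_table
AS
-- CONCAT(*) is assumed to expand to every column of input_table
SELECT binary(CONCAT(*)) AS binary_input_str
FROM input_table;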

Related

Split multiple points in text format and switch coordinates in postgres column

I have a PostgreSQL column of type text that contains data as shown below
(32.85563, -117.25624)(32.855470000000004, -117.25648000000001)(32.85567, -117.25710000000001)(32.85544, -117.2556)
(37.75363, -121.44142000000001)(37.75292, -121.4414)
I want to convert this into another column of type text as shown below
(-117.25624, 32.85563)(-117.25648000000001,32.855470000000004 )(-117.25710000000001,32.85567 )(-117.2556,32.85544 )
(-121.44142000000001,37.75363 )(-121.4414,37.75292 )
As you can see, the values inside the parentheses have been switched around. Also note that I have shown two records here to indicate that not all fields have the same number of parenthesized figures.
What I've tried
I tried extracting the column to Java and performing my operations there. But due to the sheer number of records I have, I will run out of memory. I also cannot do this method in batches due to time constraints.
What I want
A SQL query or a sequence of SQL queries that will achieve the result that I have mentioned above.
I am using PostgreSQL 9.4 with pgAdmin III as the client.
This is the kind of problem that should not normally be solved in SQL, but you are lucky to be using Postgres.
I suggest the following steps in defining your algorithm.
The first part will turn your strings into structured data; the second will transform the structured data back into a string in the format that you require.
From string to data
First, you need to turn your bracketed values into an array, which can be done with the string_to_array function.
Now you can turn this array into rows with the unnest function, which will return a row per bracketed value.
Finally, you need to split the values in each row into two fields.
From data to string
You need to group the results of the first query, wrapping them in the string_agg function, which will combine the values from all rows back into a single string.
You will need to experiment with brackets to achieve exactly what you want.
P.S. I am not providing a query here. Once you have some code that you have tried, let me know.
Assuming you also have a PK or some unique column, and possibly other columns, you can do as follows:
SELECT id, (...), string_agg(point(pt[1], pt[0])::text, '') AS col_reversed
FROM (
  SELECT id, (...), unnest(string_to_array(replace(col, ')(', ');('), ';'))::point AS pt
  FROM my_table) sub
GROUP BY id; -- assuming id is PK or no other columns
PostgreSQL has the point type, which you can use here. First you need to make sure you can properly divide the long string into individual points (insert ';' between the parentheses), then turn that into an array of individual points in text format, unnest the array into individual rows, and finally cast those rows to the point data type:
unnest(string_to_array(replace(col, ')(', ');('), ';'))::point AS pt
You can then create a new point from the point you just created, but with the coordinates reversed, turn that into a string and aggregate into your desired output:
string_agg(point(pt[1], pt[0])::text, '') AS col_reversed
But you might also move away from the text format and make an array of point values as that will be easier and faster to work with:
array_agg(point(pt[1], pt[0])) AS pt_reversed
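Put back into the full query from above, that variant would look roughly like this (a sketch; id, col and my_table are the names used earlier):
SELECT id, array_agg(point(pt[1], pt[0])) AS pt_reversed
FROM (
  SELECT id, unnest(string_to_array(replace(col, ')(', ');('), ';'))::point AS pt
  FROM my_table) sub
GROUP BY id;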
As I put in the question, I tried extracting the column to Java and performing my operations there. But due to the sheer number of records I have, I would run out of memory. I also could not do this in batches due to time constraints.
I ran out of memory here because I was putting everything in a HashMap of <my_primary_key, the_newly_formatted_text>. As the text was sometimes very long, and due to the sheer number of records that I had, it wasn't surprising that I got an OOM.
Solution that I used:
As suggested by many folks here, this problem was better solved with code. I wrote a small script that formatted the text the way I wanted and wrote the primary key and the newly formatted text to a file in TSV format. Then I imported the TSV into a new table and updated the original table from the new one.

Convert all selected columns to_char

I am using Oracle SQL queries in an external program (Pentaho Data Integration, PDI).
I need to convert all columns to string values before I can proceed with using them.
What I am looking for is something that automatically applies
select to_char(col1), to_char(col2), ..., to_char(colN) from example_table;
to all columns, so that ideally you could just wrap this statement:
select * from example_table;
and have all columns converted automatically.
For explanation: I need this because PDI doesn't seem to work well when it gets uncast DATE columns. Since I have dynamic queries, I do not know whether a DATE column exists and simply want to convert all columns to strings.
EDIT
Since the queries vary and since I have a long list of them as input, I am looking for a more generic method than just manually writing to_char() in front of every column.
If you are looking for a solution in PDI, you need to create a job (.kjb) in which you use two transformations. The first .ktr will rebuild the query and the second .ktr will execute the new query.
1. First Transformation: Rebuild the query
Read the columns in the source table step (use a Table Input step in your case). Write the query select * from example_table; and limit the rows to either 0 or 1. The idea here is not to fetch all the rows but to recreate the query.
Use a Meta Structure step to get the meta-structure of the table. It will fetch you the list of columns coming in from the previous step.
In the Modified JavaScript step, use a small snippet of code to check whether the data type of a column is Date and, if so, concatenate to_char(column) into the rebuilt column list.
Finally, group the result and set it into a variable.
This is the point where the fields are recreated for you automatically. The next step is to execute the new query built from this field.
2. Second Transformation: Use this variable in the next step to get the result. ${NWFIELDNAME} is the variable you have set with the modified column list in the above transformation.
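For illustration, the rebuilt query stored in ${NWFIELDNAME} might end up looking something like this (the column names and the format model here are made up for illustration):
SELECT col1, TO_CHAR(col2, 'YYYY-MM-DD HH24:MI:SS') AS col2, col3 FROM example_table;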
Hope this helps :)
I have placed the code for the first .ktr in a gist here.
select TO_CHAR(*) from example_table;
You should not use * in your production code; it is a bad coding practice. You should explicitly mention the column names which you want to fetch.
Also, TO_CHAR(*) makes no sense. How would you convert a date to a string? You must use a proper format model.
In conclusion, it would take a minute or two at most to list the column names using a good text editor.
I can hardly imagine an application that does not know about the actual data types, but if you really want to automa(gi)cally convert all columns to strings, I see two possibilities in Oracle:
If your application language allows you to specify the binding type, you simply bind all your output variables to a string variable. The Oracle driver then takes care of converting all types to strings; this is possible, for example, with JDBC (Java).
If (as it seems) your application language does not allow the first solution, the best way I can think of is to define a view for each select you want to use, with the appropriate TO_CHAR conversions already applied, and then select from the view. Those views could eventually also be generated automatically from the data dictionary (user_tables) with some PL/SQL.
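Such a view might look roughly like this (table and column names are made up for illustration):
CREATE OR REPLACE VIEW example_table_str AS
SELECT TO_CHAR(id)                       AS id,
       TO_CHAR(created_at, 'YYYY-MM-DD') AS created_at,
       description
FROM   example_table;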
Please also note that TO_CHAR will convert your columns according to the NLS settings of your session, and this might lead to unwanted results, so you might also want to always specify how to convert:
SELECT TO_CHAR(SYSDATE, 'YYYY-MM-DD') FROM DUAL;
Using these two tables, you could write a procedure which looks at the columns of each table and then performs the appropriate TO_CHAR depending on the current data type:
select * from user_tab_columns
select * from user_tables
Pseudocode:
begin
  loop over each table                 -- user_tables
    loop over each column              -- user_tab_columns
      if data_type = DATE then
        new_column := TO_CHAR(old_column, ...)
      elsif data_type = NUMBER then
        ...
      end if
    end loop
  end loop
end
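A rough PL/SQL sketch of that idea, which only builds and prints the rewritten SELECT for a single table (the table name EXAMPLE_TABLE and the format model are assumptions, not from the question):
DECLARE
  l_cols VARCHAR2(32767);
BEGIN
  FOR c IN (SELECT column_name, data_type
            FROM   user_tab_columns
            WHERE  table_name = 'EXAMPLE_TABLE'
            ORDER  BY column_id)
  LOOP
    IF l_cols IS NOT NULL THEN
      l_cols := l_cols || ', ';
    END IF;
    IF c.data_type = 'DATE' THEN
      -- wrap DATE columns in an explicit TO_CHAR; other types could get their own branch
      l_cols := l_cols || 'TO_CHAR(' || c.column_name || ', ''YYYY-MM-DD HH24:MI:SS'')';
    ELSE
      l_cols := l_cols || c.column_name;
    END IF;
  END LOOP;
  DBMS_OUTPUT.put_line('SELECT ' || l_cols || ' FROM example_table');
END;
/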

Search through meta array in database field

I have a database field named meta that stores a set of datetimes in this format
date_approved"=>"2015-01-01T10:19:44+00:00", "date_realized"=>"2015-01-01T10:31:11+00:00", "date_tn_approved"=>"2015-01-01T10:09:40+00:00"
Is it possible to have a SQL query to select, for example, all the records where date_approved is in January?
I do not handle the insertion of values into the database, so I can't really change much about the manner in which the data is stored.
If the data is really stored in the way you show us, then this can be achieved using hstore, because that value can be cast directly to hstore:
select *
from the_table
where extract(month from ((meta::hstore -> 'date_approved')::timestamp)) = 1
This will fail if the format in the column isn't exactly as you have shown us or if the timestamps are formatted in a different way.
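As a quick sanity check, you can try the cast against a single literal first (this sketch quotes the first key, which is missing its opening quote in the sample above):
select extract(month from ('"date_approved"=>"2015-01-01T10:19:44+00:00"'::hstore -> 'date_approved')::timestamp);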
You might need to create the hstore extension to be able to use that:
create extension hstore;
This needs to be done as the superuser.
SQLFiddle: http://sqlfiddle.com/#!15/d41d8/4408

SSIS comparing Binary columns using Conditional Split

I need to compare two varbinary columns using SSIS Conditional Split component.
I cannot use the == operator with varbinary.
What is the best way to compare two varbinary columns?
I have tried converting to the TEXT data type, but this gives me various problems.
Two columns are compared in a Script Transformation Component, but I do not know how columns with binary data types are compared.
If anybody answers how binary values can be compared in a Script component, you will have your answer.
Could you use a Script Component, and compare the varbinary columns using VB.NET or C#? You could then assign a value to a new column in the data flow that lets you identify if they were identical. The Conditional Split could then use the new column as the condition.

How to parse a string and create rows in SQL (postgres)

I have a single database field that contains a start date, end date, and exclusions in the form
available DD/MONTH/YYYY [to DD/MONTH/YYYY]?[, exclude WORD [, WORD]*]?
Meaning it always starts with "available DD/MONTH/YYYY", optionally has a single "to DD/MONTH/YYYY", and optionally has an exclude clause that is a comma-separated list of strings. Think regular-expression meanings for +, *, and ?.
I have been tasked with extracting the data out so we will now have a "startdate" column, an "enddate" column, and a new table that will contain the exclusions. It will need to fill the startdate and enddate columns with the values parsed from the availability string. It will also need to create multiple records in the new exclusion table, one for each of the comma-separated values after the 'exclude' keyword in the availability string.
Is this a migration I can do in SQL only (postgres 8.4)?
This is against postgres 8.4.
Update: With the help of a co-worker, we now have a SQL script whose result is SQL that performs the insert statements based on the parsing of the exclusions. It uses a bunch of CASE statements and string manipulation within the SQL to generate the results. I then send the output to a file and execute that file to perform the inserts. I am doing the same for the start and end date columns.
It's not 100% SQL, but a simple .bat or .sh file that runs the first .sql file and then the generated one is all that is needed to get it to go.
Thanks for the input.
You can probably do that with a combination of the regexp functions and the to_date() or to_timestamp() functions.
But it may be easier to just mangle the text in a function in, say, PL/Perl. That will give you access to the full string-manipulation capabilities of Perl, while keeping the work inside the database, as seems to be your specification.
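For instance, a rough SQL-only sketch might look like this (it assumes the source table is called bookings with an id primary key, the string lives in a column called availability, and the exclusions go into exclusions(booking_id, word); all of those names are made up):
-- pull the start and (optional) end dates out of the availability string
UPDATE bookings
SET    startdate = to_date(substring(availability from 'available ([0-9]{2}/[A-Za-z]+/[0-9]{4})'), 'DD/Month/YYYY'),
       enddate   = to_date(substring(availability from ' to ([0-9]{2}/[A-Za-z]+/[0-9]{4})'), 'DD/Month/YYYY');

-- one exclusion row per comma-separated word after "exclude"
INSERT INTO exclusions (booking_id, word)
SELECT id,
       trim(regexp_split_to_table(substring(availability from 'exclude (.*)$'), ','))
FROM   bookings
WHERE  availability ~ 'exclude';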
Why a single SQL statement?
Write a simple script in Ruby/Python/Basic to read the data from the source, parse it, and put it into the destination database.
Or is it really that big?