PostgreSQL: manipulate binary data - change last byte with SQL command

In a PostgreSQL table I have a column file_bytes which has the data type bytea.
I am looking for a simple SQL statement to manipulate only the last byte of the content of this column.

demo: db<>fiddle
UPDATE test
SET file_bytes = overlay(file_bytes placing 'X'::bytea from octet_length(file_bytes));
https://www.postgresql.org/docs/current/static/functions-binarystring.html
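octet_length() returns the number of bytes in the binary value. overlay() replaces part of the value starting at a given position; positions are 1-based, so starting at octet_length(file_bytes) overwrites only the last byte.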
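To check the logic outside the database, here is a minimal Python sketch of what the overlay() call does to the column value. The function name replace_last_byte is hypothetical, and leaving empty input unchanged is an assumption of this sketch, not something taken from the SQL statement above.

```python
def replace_last_byte(data: bytes, new_byte: bytes) -> bytes:
    """Return data with its final byte replaced by new_byte (exactly one byte)."""
    if len(new_byte) != 1:
        raise ValueError("new_byte must be exactly one byte")
    if not data:
        # Assumption of this sketch: empty input is returned unchanged.
        return data
    # The SQL targets 1-based position octet_length(data),
    # which is index len(data) - 1 in Python.
    return data[:-1] + new_byte


print(replace_last_byte(b"\x01\x02\x03", b"X"))
```

Only the final byte changes; everything before it is copied as-is, which is the same effect the UPDATE has on each row.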

Related

Commas using SAS and TD SQL

I am using SAS to pull data from a Teradata environment. I am counting the rows in a Teradata table and want the output in comma format (e.g. 1,000,000). The code below displays the value with commas, but I can't add the column in SAS because the output is in character format. Does anyone have suggestions on how to format the number with commas while keeping it usable for calculations in SAS? Thanks.
CAST(CAST(Count(*) AS FORMAT 'Z,ZZZ,ZZ9') AS CHAR(10)) AS rowCount,
Assuming you're using pass-through, pull the count in as a numeric value and format it on the SAS side. Your current code converts it to character (char(10)), and SAS doesn't do math on character variables.
select rowCount format=comma12. from con
(select
count(*) as rowCount ....
)
If you have a select * you can always format it later in a data step or via PROC DATASETS. SAS separates the display and storage layers so the format controls the appearance but the underlying data still remains numeric.
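The same separation between stored value and displayed value can be sketched in Python. This is only an analogy for the SAS format behaviour described above; the variable names are hypothetical.

```python
# The count stays numeric, so it remains usable in arithmetic.
row_count = 1000000

# Formatting produces a separate display string; row_count itself is unchanged.
display = format(row_count, ",")

# Arithmetic uses the numeric value, never the formatted string.
total = row_count + 500

print(display, total)
```

Casting to a character type in the query is the equivalent of keeping only the display string and throwing the number away.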

How do I write a SQL Query to fetch data according to a specific format/ pattern of String?

I need to write a SQL query to fetch data according to a specific format of String. I need to fetch those records where the LOC column of my query looks like the following:
Cx-xxx-Lx
Or
Cxx-xxx-Lx
Or
x-xxx-Lx
Or
xx-xxx-Lx
Or
xxxxxLx_x
Or
xxxxxLx_xx
Or
BxxxLLxxxx
Where:
x is a number (0 to 9)
L is a letter (A to Z)
I have filtered the LOC column to fetch data where the length of the record is either 9 or 10. Although this is fetching correct data from the DB, this is not a correct way of doing so.
My current SQL:
select * from table
where length(LOC) in (9,10)
Any help would be appreciated.
You can use regular expressions. Something like:
where regexp_like(LOC, '^C?[0-9]{1,2}-[0-9]{3}-[A-Z][0-9]$') or
regexp_like(LOC, '^[0-9]{5}[A-Z][0-9]_[0-9]{1,2}$') or
regexp_like(LOC, '^B[0-9]{3}[A-Z]{2}[0-9]{4}$')
You can combine these into one regular expression using |. I think it is easier to follow and debug as three separate expressions.
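To sanity-check the three expressions before running them against the table, they can be tried in Python, whose re module accepts the same syntax for these particular patterns. This is a sketch: the sample values are made up to fit the formats listed in the question, and loc_matches is a hypothetical helper name.

```python
import re

# The three patterns from the answer, unchanged.
LOC_PATTERNS = [
    re.compile(r"^C?[0-9]{1,2}-[0-9]{3}-[A-Z][0-9]$"),
    re.compile(r"^[0-9]{5}[A-Z][0-9]_[0-9]{1,2}$"),
    re.compile(r"^B[0-9]{3}[A-Z]{2}[0-9]{4}$"),
]


def loc_matches(loc: str) -> bool:
    """Return True if loc fits at least one of the accepted formats."""
    return any(p.match(loc) for p in LOC_PATTERNS)


# Made-up samples, one per format in the question, plus two that should fail.
samples = [
    "C1-123-A1", "C12-123-A1", "1-123-A1", "12-123-A1",
    "12345A1_2", "12345A1_23", "B123AB1234",
    "ABCDEFGHI", "123456789",
]
for s in samples:
    print(s, loc_matches(s))
```

Keeping the patterns in a list mirrors the advice to leave them as three separate expressions, so each one can be tested and debugged on its own.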

Get an average value for element in column of arrays of json data in postgres

I have some data in a postgres table that is a string representation of an array of json data, like this:
[
{"UsageInfo"=>"P-1008366", "Role"=>"Abstract", "RetailPrice"=>2, "EffectivePrice"=>0},
{"Role"=>"Text", "ProjectCode"=>"", "PublicationCode"=>"", "RetailPrice"=>2},
{"Role"=>"Abstract", "RetailPrice"=>2, "EffectivePrice"=>0, "ParentItemId"=>"396487"}
]
This is the data in one cell from a single column of similar data in my database.
The datatype of this stored in the db is varchar(max).
My goal is to find the average RetailPrice of EVERY json item with "Role"=>"Abstract", including all of the json elements in the array, and all of the rows in the database.
Something like:
SELECT avg(json_extract_path_text(json_item, 'RetailPrice'))
FROM (
SELECT cast(json_items to varchar[]) as json_item
FROM my_table
WHERE json_extract_path_text(json_item, 'Role') like 'Abstract'
)
Now, obviously this particular query wouldn't work for a few reasons. Postgres doesn't let you directly convert a varchar to a varchar[]. Even after I had an array, this query would do nothing to iterate through the array. There are probably other issues with it too, but I hope it helps to clarify what it is I want to get.
Any advice on how to get the average retail price from all of these arrays of json data in the database?
It does not seem like Redshift would support the json data type per se. At least, I found nothing in the online manual.
But I found a few JSON functions in the manual, which should be instrumental:
JSON_ARRAY_LENGTH
JSON_EXTRACT_ARRAY_ELEMENT_TEXT
JSON_EXTRACT_PATH_TEXT
Since generate_series() is not supported, we have to substitute for that ...
SELECT tbl_id
, round(avg((json_extract_path_text(elem, 'RetailPrice'))::numeric), 2) AS avg_retail_price
FROM (
SELECT *, json_extract_array_element_text(json_items, pos) AS elem
FROM (VALUES (0),(1),(2),(3),(4),(5)) a(pos)
CROSS JOIN tbl
) sub
WHERE json_extract_path_text(elem, 'Role') = 'Abstract'
GROUP BY 1;
I substituted with a poor man's solution: A dummy table counting from 0 to n (the VALUES expression). Make sure you count up to the maximum number of possible elements in your array. If you need this on a regular basis create an actual numbers table.
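The logic of that query (walk every array element, keep the ones with Role = 'Abstract', average RetailPrice per row) can be sketched in Python for checking expected results. Assumptions of this sketch: the stored text is valid JSON with ":" separators rather than the "=>" notation shown in the question, and the names avg_abstract_price and rows are hypothetical.

```python
import json
from collections import defaultdict


def avg_abstract_price(rows):
    """rows: iterable of (row_id, json_text) pairs.

    Returns {row_id: average RetailPrice of 'Abstract' items, rounded to 2 places}.
    Rows without any matching item are omitted, like groups filtered out by WHERE.
    """
    totals = defaultdict(lambda: [0.0, 0])  # row_id -> [sum, count]
    for row_id, json_text in rows:
        for elem in json.loads(json_text):
            if elem.get("Role") == "Abstract" and "RetailPrice" in elem:
                totals[row_id][0] += float(elem["RetailPrice"])
                totals[row_id][1] += 1
    return {row_id: round(s / n, 2) for row_id, (s, n) in totals.items()}


# Made-up sample data in valid JSON form.
rows = [
    (1, '[{"Role": "Abstract", "RetailPrice": 2},'
        ' {"Role": "Text", "RetailPrice": 5},'
        ' {"Role": "Abstract", "RetailPrice": 3}]'),
    (2, '[{"Role": "Text", "RetailPrice": 9}]'),
]
print(avg_abstract_price(rows))
```

Unlike the VALUES workaround, the loop has no fixed upper bound on the number of array elements, which is the limitation the numbers table is meant to cover in SQL.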
Modern Postgres has much better options, like json_array_elements() to unnest a json array. Compare to your sibling question for Postgres:
Can get an average of values in a json array using postgres?
I tested in Postgres with the related operator ->>, where it works:
SQL Fiddle.

Manipulating a record data

I am looking for a way to take data from one table and manipulate it and bring it to another table using an SQL query.
I have a Column called NumberStuff that has data like this in it:
INC000000315482
I need to cut off the INC portion of the number and convert it into an integer and store it into a Column in another table so that it ends up looking like this:
315482
Any help would be much appreciated!
Another approach is to use the Replace function. Either in TSQL or as a Derived Column Expression in SSIS.
TSQL
SELECT REPLACE(T.MyColumn, 'INC', '') AS ReplacedINC
SSIS
REPLACE([MyColumn], "INC", "")
This removes the character based data. It then becomes an optional exercise in converting to a numeric type before storing it to the target table or letting the implicit conversion happen.
Simplest version of what you need.
select cast(right(column,6) as int) from table
Are you doing this in SSIS, or somewhere else? Is the number always the last 6 characters?
This version depends less on your formatting: it trims the first 3 characters, the cast to int drops the leading zeros, and the value can be any length.
select cast(SUBSTRING('INC000000315482',4,LEN('INC000000315482') - 3) as int)
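The prefix-stripping approach can be checked quickly in Python. This is a sketch, assuming every value starts with the literal prefix INC followed only by digits; strip_inc_prefix is a hypothetical name.

```python
def strip_inc_prefix(value: str) -> int:
    """Drop the leading 'INC' and convert the remaining digits to an integer."""
    prefix = "INC"
    if not value.startswith(prefix):
        raise ValueError(f"expected value to start with {prefix!r}: {value!r}")
    # int() ignores leading zeros, just like the cast to int in SQL.
    return int(value[len(prefix):])


print(strip_inc_prefix("INC000000315482"))
```

Because the conversion works on everything after the prefix, it does not depend on the number being exactly 6 digits long, which is the weakness of the right(column, 6) variant.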

String or binary data would be truncated. -- even though length shorter than field length

Trying to prefix a string field on a table by using the UPDATE command. For some reason I'm getting the
String or binary data would be truncated.
exception, even though the length of my data would easily fit in the field.
Using SQL Server 2008 R2 Standard Edition and SSMS 2008 R2.
Template: Learner is an nvarchar(60)
Update: Learner is actually an nvarchar(30), which uses 60 bytes as shown in sp_help
select LEN('Aaaaaaaa' + LEFT(learner, 52)) myLen from Template order by myLen desc
>>> max len = 31
update Template set learner = 'Aaaaaaaa' + LEFT(learner, 52)
>>> String or binary data would be truncated.
update Template set learner = CAST('Aaaaaaaa' + LEFT(learner, 52) AS NVARCHAR(60))
>>> String or binary data would be truncated.
Whereas the following works:
SELECT CAST('Aaaaaaaa' + LEFT(learner, 52) AS NVARCHAR(60)) FROM Template
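You can suppress this error by turning off ANSI warnings around the update. Note that the data is then silently truncated to fit the column: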
SET ANSI_WARNINGS OFF
update ...
SET ANSI_WARNINGS ON
Thanks for your responses.
Looks like I was fooled by the length value in sp_help and sp_columns. Because it's an NVARCHAR column, the length reported there is in bytes, not characters: NVARCHAR stores Unicode as UTF-16, at 2 bytes per character.
Looking at the CREATE script revealed the truth. Also the PRECISION field on sp_columns shows the number of characters too.
My problem was having another log table (for audit trail), filled by a trigger on the main table, where the column size also had to be changed.
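The byte-versus-character confusion can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the column holds 30 characters (60 bytes), the longest existing learner value has 23 characters (inferred from the reported max length of 31 minus the 8-character prefix), and fits_nvarchar is a hypothetical helper.

```python
def fits_nvarchar(text: str, declared_chars: int) -> bool:
    """Check whether text fits an NVARCHAR(declared_chars) column.

    UTF-16 little-endian without a BOM uses 2 bytes per code unit, so the
    column's byte capacity is declared_chars * 2.
    """
    byte_len = len(text.encode("utf-16-le"))
    return byte_len <= declared_chars * 2


# Assumed longest existing value: 23 characters.
learner = "x" * 23
prefixed = "Aaaaaaaa" + learner[:52]

print(len(prefixed), len(prefixed.encode("utf-16-le")))
print(fits_nvarchar(learner, 30), fits_nvarchar(prefixed, 30))
```

Reading the 60 from sp_help as a character count makes the prefixed value look safe, while against the real 30-character limit it is one character too long.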
Another possible cause is a column DEFAULT constraint whose default value is longer than the column's maximum length.
The error then occurs when inserting a row without specifying this column in the INSERT statement.