PostgreSQL and PHP: fetching from a query adds a space to a char string

I have a table with a few fields. One field is char(128), and I store the string 'hello' in it.
In PHP I call $arr = pg_fetch_array(pg_query('select * from table')), but when I read the value of this column I get 'hello '. When I execute 'select char_length(this_field) from table' in pgAdmin, I get 5, not 6. Do you know why there is an extra space in PHP?
Using VARCHAR instead of CHAR solves this problem.

Padding to the declared length is documented:
https://www.postgresql.org/docs/current/static/datatype-character.html
character(n), char(n): fixed-length, blank padded
Example:
t=# with c(t) as (values('abc'::char(3)),('a'::char(3)))
select t,concat(t,'.') from c;
  t  | concat
-----+--------
 abc | abc.
 a   | a  .
(2 rows)
Regarding length: char_length ignores the trailing padding, while octet_length counts it:
t=# with c(t) as (values('abc'::char(3)),('a'::char(3)))
select t,concat(t,'.'),octet_length(t),char_length(t) from c;
  t  | concat | octet_length | char_length
-----+--------+--------------+-------------
 abc | abc.   |            3 |           3
 a   | a  .   |            3 |           1
(2 rows)
Using character varying or text does indeed change this behaviour.
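To make the padding behaviour concrete, here is a minimal Python sketch that imitates it outside the database. The helper names pad_char and char_length_ignoring_padding are made up for this illustration; they are not PostgreSQL or PHP functions. It assumes the client receives the stored, blank-padded value, which is what the question observes.

```python
def pad_char(value: str, n: int) -> str:
    """Imitate char(n) storage: pad with blanks up to n characters."""
    if len(value) > n:
        raise ValueError(f"value too long for char({n})")
    return value.ljust(n)


def char_length_ignoring_padding(stored: str) -> int:
    """Imitate char_length() on char(n): trailing blanks are not counted."""
    return len(stored.rstrip(" "))


stored = pad_char("hello", 128)
print(len(stored))                            # length including the padding
print(char_length_ignoring_padding(stored))   # length without the padding
print(repr(stored.rstrip(" ")))               # what a client-side trim gives back
```

On the PHP side, the comparable client-side fix would be rtrim() on the fetched value, if switching the column to varchar or text is not an option.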

Related

PostgreSQL csv import not working for only integer

I have the following problem using PostgreSQL 14 on Windows 10 with the latest updates.
I need to insert values into the following table:
CREATE TABLE StateList (
    ID int GENERATED ALWAYS AS IDENTITY,
    State_Number int NOT NULL,
    ElectionGroup_ID INT NOT NULL,
    Election_Number int NOT NULL,
    UNIQUE (State_Number, ElectionGroup_ID, Election_Number),
    PRIMARY KEY (ID)
);
I want to do the following command:
COPY StateList(Election_Number, State_Number, ElectionGroup_ID)
FROM '...\csvFileStateLists19.csv'
WITH (
    FORMAT CSV,
    DELIMITER ','
);
The file "csvFileStateLists19.csv" contains:
"19","9","4"
"19","5","238"
"19","5","21"
"19","15","1"
"19","5","10"
It worked fine for another table that used strings and integers.
But here I always get:
ERROR: FEHLER: ungültige Eingabesyntax für Typ integer: »19«
CONTEXT: COPY statelist, Zeile 1, Spalte election_number: »19«
SQL state: 22P02
(In English: ERROR: invalid input syntax for type integer: "19"; CONTEXT: COPY statelist, line 1, column election_number: "19".)
This is usually a sign that the value is an empty string or really not a number. But it is not! It's a 19, so why doesn't it work?
I generated the file in Java; it is UTF-8 encoded, and the database locale is "German_Germany.1252".
show client_encoding; => UNICODE
show server_encoding; => UTF8
SELECT pg_encoding_to_char(encoding) FROM pg_database WHERE datname = 'database1'; => UTF8
select pg_encoding_to_char(encoding), datcollate, datctype from pg_database where datname = 'database1';
Returns
"UTF8" "German_Germany.1252" "German_Germany.1252"
Thank you for your help!
Well, with your input I get the same error message, just in English rather than German. I tried it in Vertica, Stonebraker's successor to PostgreSQL, whose CSV parser works very much the same way:
COPY statelist FROM LOCAL 'st.csv' DELIMITER ',' EXCEPTIONS 'st.log';
-- error messages in "st.log"
-- COPY: Input record 1 has been rejected (Invalid integer format '"19"' for column 1 (State_Number)).
-- COPY: Input record 2 has been rejected (Invalid integer format '"19"' for column 1 (State_Number)).
-- COPY: Input record 3 has been rejected (Invalid integer format '"19"' for column 1 (State_Number)).
-- COPY: Input record 4 has been rejected (Invalid integer format '"19"' for column 1 (State_Number)).
-- COPY: Input record 5 has been rejected (Invalid integer format '"19"' for column 1 (State_Number)).
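Well, that's no wonder really. "9" is a string literal, not an INTEGER literal. It's a VARCHAR(1) consisting of the digit character "9", not an INTEGER.
Try adding the ENCLOSED BY '"' clause. (ENCLOSED BY is Vertica syntax; the PostgreSQL COPY counterpart is the QUOTE option, which already defaults to '"' in CSV format.) It worked for me: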
COPY statelist FROM LOCAL 'st.csv' DELIMITER ',' ENCLOSED BY '"' EXCEPTIONS 'st.log';
-- out  Rows Loaded
-- out -------------
-- out            5
SELECT * FROM statelist;
-- out  State_Number | ElectionGroup_ID | Election_Number
-- out --------------+------------------+-----------------
-- out            19 |                5 |              10
-- out            19 |                5 |              21
-- out            19 |                5 |             238
-- out            19 |                9 |               4
-- out            19 |               15 |               1
Not an answer, just proof that double-quoted numeric values in a CSV are not the problem:
cat csv_test.csv
"19","9"
"19","5"
"19","5"
"19","15"
"19","5"
test(5432)=# \d csv_test
              Table "public.csv_test"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 col1   | integer |           |          |
 col2   | integer |           |          |
select * from csv_test;
 col1 | col2
------+------
(0 rows)
\copy csv_test from 'csv_test.csv' with csv;
COPY 5
select * from csv_test;
 col1 | col2
------+------
   19 |    9
   19 |    5
   19 |    5
   19 |   15
   19 |    5
So now maybe we can get on with answers that solve the issue.
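Since quoted integers load fine, one thing that may be worth ruling out is an invisible character at the very start of the file, for example a UTF-8 byte-order mark in front of the first "19". That is only a guess; nothing in the thread confirms it. The Python sketch below checks raw CSV bytes for such a mark. The function names are hypothetical, and the sample bytes merely stand in for the first line of the real file.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the raw content starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the content without a leading UTF-8 BOM, if there is one."""
    return data[len(codecs.BOM_UTF8):] if has_utf8_bom(data) else data


# Sample bytes standing in for the first line of the CSV file.
clean = b'"19","9","4"\n'
with_bom = codecs.BOM_UTF8 + clean

print(has_utf8_bom(clean))
print(has_utf8_bom(with_bom))
print(strip_utf8_bom(with_bom) == clean)
```

For a real file, the bytes would come from open(path, "rb").read(), and the stripped content could be written back before running COPY again.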

select a numeric column as string

Is it possible to select a numeric column as a string with PROC SQL in SAS?
For example, I have this table (table_1):
+------+------+------+------------+
| id_1 | id_2 | id_3 | date       |
+------+------+------+------------+
| 3    | 7    | 3    | 25/06/2017 |
| 4    | 11   | 9    | 25/06/2020 |
+------+------+------+------------+
id_1, id_2 and id_3 can be numeric or string.
I want to create a table (table_2) in which these three columns are of type string.
I wrote this code:
proc sql;
Create table table_2 as
select date, Convert(varchar(30),a.id_1), Convert(varchar(30),a.id_2), Convert(varchar(30),a.id_3)
from table_1 a
;quit;
But it doesn't work.
The best solution is to fix your upstream processes to always generate the variables using a consistent type.
In SAS you can use the PUT() function to convert a value to a string; just use a FORMAT that is appropriate for the type of the variable. For example, if your variable is numeric but should have been 9 digits long with significant leading zeros, you would use the Z9. format to convert it and have the leading zeros represented.
select put(id_1,z9.) as id_1
If you want to convert either a number or a character variable to a string without first knowing the type of the variable you could use the CATS() function. But then you will not have any control over how the numbers are converted into strings. Use the LENGTH= attribute to force SAS to define the variable with a length of 30.
select cats(id_1) as id_1 length=30
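For readers more at home in a general-purpose language, the same two ideas can be sketched in Python rather than SAS: one conversion with an explicit zero-padded format (analogous to PUT with Z9.) and one type-agnostic conversion (only loosely analogous to CATS, since Python's str() does not format numbers the way SAS does). The function names are invented for this illustration.

```python
def to_zero_padded(value: int, width: int = 9) -> str:
    """Format an integer with leading zeros, similar in spirit to PUT(value, Z9.)."""
    return f"{value:0{width}d}"


def to_text(value) -> str:
    """Convert a number or a string to a stripped string without knowing its type."""
    return str(value).strip()


print(to_zero_padded(3))
print(to_text(11))
print(to_text(" 7 "))
```

The trade-off is the same as in the SAS answer: the explicit format gives control over the result, while the generic conversion accepts either type but leaves the number formatting to the language.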
Just try it:
SELECT date, CAST(a.id_1 AS varchar(30)), CAST(a.id_2 AS varchar(30)), CAST(a.id_3 AS varchar(30))
FROM table_1 a

regex to convert alphanumeric and special characters in a string to * in oracle

I have a requirement to convert all the characters in my string to *. My string can contain special characters as well.
For example:
abc_d$ should be converted to ******.
Can anybody help me with a regex like this in Oracle?
Thanks
Use REGEXP_REPLACE and replace any single character (.) with *.
SELECT
REGEXP_REPLACE (col, '.', '*')
FROM yourTable
Instead of regex you could also use
select rpad('*', length('abc_d$ s'),'*') from dual
-- use '*' and pad it with more '*' until the length fits
Docs: rpad(string,length,appendWhat)
Repeating a string of '*' should work as well: repeat(string,count) (not tested)
Regex or rpad makes no difference; they are optimized down to the same execution plan:
n-th try of rpad:
Plan Hash Value : 1388734953
-----------------------------------------------------------------
| Id | Operation        | Name | Rows | Bytes | Cost | Time     |
-----------------------------------------------------------------
|  0 | SELECT STATEMENT |      |    1 |       |    2 | 00:00:01 |
|  1 |   FAST DUAL      |      |    1 |       |    2 | 00:00:01 |
-----------------------------------------------------------------
n-th try of regexp_replace:
Plan Hash Value : 1388734953
-----------------------------------------------------------------
| Id | Operation        | Name | Rows | Bytes | Cost | Time     |
-----------------------------------------------------------------
|  0 | SELECT STATEMENT |      |    1 |       |    2 | 00:00:01 |
|  1 |   FAST DUAL      |      |    1 |       |    2 | 00:00:01 |
-----------------------------------------------------------------
So it does not matter which you use.
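That the two approaches give the same output can be checked outside the database with a small Python sketch. This is not Oracle code, and the function names are invented here. One assumption to note: re.DOTALL is set so that '.' also covers newlines; without an extra match parameter the Oracle call may treat newlines differently.

```python
import re


def mask_with_regex(text: str) -> str:
    """Replace every character with '*', roughly like REGEXP_REPLACE(col, '.', '*')."""
    # re.DOTALL lets '.' match newlines too, so the result always has len(text) stars.
    return re.sub(r".", "*", text, flags=re.DOTALL)


def mask_with_padding(text: str) -> str:
    """Build a run of '*' of the same length, roughly like RPAD('*', LENGTH(col), '*')."""
    return "*" * len(text)


sample = "abc_d$"
print(mask_with_regex(sample))
print(mask_with_padding(sample))
```

Equal output says nothing about equal running time, which is a separate question from what the execution plan shows.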
THIS IS NOT AN ANSWER
As suggested by Tom Biegeleisen’s brother Tim, I ran a test to compare a solution based on regular expressions to one using just standard string functions. (Specifically, Tim's answer with regular expressions vs. Patrick Artner's solution using just LENGTH and RPAD.)
Details of the test are shown below.
CONCLUSION: On a table with 5 million rows, each consisting of one string of length 30 (in a single column), the regular expression query runs in 21 seconds. The query using LENGTH and RPAD runs in one second. Both solutions read all the data from the table; the only difference is the function used in the SELECT clause. As noted already, both queries have the same execution plan, AND the same estimated cost - because the cost does not take into account differences in function calculation time.
Setup:
create table tbl ( str varchar2(30) );
insert into tbl
  select a.str
  from   ( select dbms_random.string('p', 30) as str
           from   dual
           connect by level <= 100
         ) a
         cross join
         ( select level
           from   dual
           connect by level <= 50000
         ) b
;
commit;
Note that there are only 100 distinct values, and each is repeated 50,000 times for a total of 5 million values. We know the values are repeated; Oracle doesn't know that. It will really do "the same thing" 5 million times, it won't just do it 100 times and then simply copy the results; it's not that smart. This is something that would be known only by seeing the actual stored data, it's not known to Oracle beforehand, so it can't "prepare" for such shortcuts.
Queries:
The two queries - note that I didn't want to send 5 million rows to screen, nor did I want to populate another table with the "masked" values (and muddy the waters with the time it takes to INSERT the results into another table); rather, I compute all the new strings and take the MAX. Again, in this test all "new" strings are equal to each other - they are all strings of 30 asterisks - but there is no way for Oracle to know that. It really has to compute all 5 million new strings and take the max over them all.
select max(new_str)
from   ( select regexp_replace(str, '.', '*' ) as new_str
         from   tbl
       )
;
select max(new_str)
from   ( select rpad('*', length(str), '*') as new_str
         from   tbl
       )
;
Try this:
SELECT
REGEXP_REPLACE('B^%2',
'*([A-Z]|[a-z]|[0-9]|[ ]|([^A-Z]|[^a-z]|[^0-9]|[^ ]))', '*') "REGEXP_REPLACE"
FROM DUAL;
I have included whitespace too.
select name,lpad(regexp_replace(name,name,'*'),length(name),'*')
from customer;

How to use Regex in SQL for extracting values after repetitive numbers

I have the following table (table1):
+---+---------------------------------------------+
|   | att1                                        |
+---+---------------------------------------------+
| 1 | 10.2.5.4 4.3.2.1.in-addr.arpa               |
| 2 | asd 100.99.98.97 97.3.2.1.a.b.c fsdf        |
| 3 | fd 95.94.93.92 92.5.7.1.a.b.c               |
| 4 | a 11.4.99.75 75.77.52.41.in-addr.arpa       |
+---+---------------------------------------------+
I would like to get the following values (that are located after the repetitive numbers): in-addr.arpa, a.b.c, a.b.c, in-addr.arpa.
I tried to use the following format with no success:
SELECT att1
FROM table1
WHERE REGEXP_LIKE(att1 , '^(\d+?)\1$')
I would like it to run in Impala and Oracle.
Use REGEXP_SUBSTR (assuming you are using an Oracle DB).
select regexp_substr(att1,'[0-9]\.([^0-9]+)',1,1,null,1)
from table1
[0-9]\. matches a digit followed by a literal dot.
[^0-9]+ matches any characters other than digits until the next digit is found. The () around it marks the group (the first one in this case), and only that part of the string is extracted.
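The pattern can be tried against the sample rows with Python's re module, as a rough stand-in for Oracle's regex engine. Because [^0-9]+ keeps matching across spaces, row 2 comes back as "a.b.c fsdf" rather than "a.b.c". The second pattern below, which also stops at whitespace, is a variation introduced here and not part of the answer; extract_suffix is a hypothetical helper name.

```python
import re
from typing import Optional

# Same pattern as in the REGEXP_SUBSTR call above.
ORIGINAL = re.compile(r"[0-9]\.([^0-9]+)")
# Variant (an assumption, not from the answer) that also stops at whitespace.
STOP_AT_SPACE = re.compile(r"[0-9]\.([^0-9\s]+)")


def extract_suffix(text: str, pattern: re.Pattern = STOP_AT_SPACE) -> Optional[str]:
    """Return the first capture group of the first match, or None if nothing matches."""
    match = pattern.search(text)
    return match.group(1) if match else None


rows = [
    "10.2.5.4 4.3.2.1.in-addr.arpa",
    "asd 100.99.98.97 97.3.2.1.a.b.c fsdf",
    "fd 95.94.93.92 92.5.7.1.a.b.c",
    "a 11.4.99.75 75.77.52.41.in-addr.arpa",
]
for row in rows:
    # Left: result of the original pattern; right: result of the whitespace-aware variant.
    print(extract_suffix(row, ORIGINAL), "|", extract_suffix(row))
```

Whether the \s shorthand is accepted inside a bracket expression depends on the engine, so for Oracle or Impala the character class may need to be spelled differently.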

GROUP_CONCAT automatically adds double quotes only when the field contains double quotes

When I use GROUP_CONCAT in BigQuery on fields that contain double quotes, the resulting values are automatically escaped and wrapped in double quotes.
But if the fields don't contain double quotes, GROUP_CONCAT behaves a little differently.
Case1 (with double quotes)
Table
Row | word | num
--- | ---- | ---
1 | fo"o | 1
2 | ba"r | 1
3 | ba"z | 2
and the Query
SELECT GROUP_CONCAT(word) AS words, num
FROM Table
GROUP BY num
the results
Row | words | num
--- | ------------- | ---
1 | "fo""o,ba""r" | 1
2 | "ba""z" | 2
↑It's escaped automatically.
Case2 (without double quotes)
Table
Row | word | num
--- | ---- | ---
1 | fo'o | 1
2 | ba'r | 1
3 | ba'z | 2
and the Query
SELECT GROUP_CONCAT(word) AS words, num
FROM Table
GROUP BY num
the results
Row | words | num
--- | --------- | ---
1 | fo'o,ba'r | 1
2 | ba'z | 2
↑No double quotes added.
Case3 (normal CONCAT with double quotes)
※The normal CONCAT doesn't behave like GROUP_CONCAT.
Escaping double quotes are not added.
Table
Row | word
--- | ----
1 | fo"o
2 | ba'r
and the Query
SELECT CONCAT(word, '12"3') AS words
FROM Table
the results
Row | words
--- | ---------
1 | fo"o12"3
2 | ba'r12"3
Question
I wonder why the results are different between these cases.
In Case1 I don't want the value to be escaped and wrapped in double quotes.
Are there any solutions?
Thanks.
The issue has been reported to Google; however, they haven't provided an ETA for when it will be addressed.
The issue has been posted in the official Google BigQuery issue and feature request tracker; updates about this matter can be found at the link provided.
As mentioned in Marilu's answer, Google mentioned plans to support this use case but hadn't provided an ETA in the feature request at the time of posting. As of February 2015, the issue has been "Fixed".
Google has added a GROUP_CONCAT_UNQUOTED function, which behaves nearly identically to GROUP_CONCAT except that it doesn't escape the double quotes.
Here is the description of the function from the Google Docs for BigQuery Aggregate Functions:
GROUP_CONCAT_UNQUOTED('str' [, separator])
Concatenates multiple strings into a single string, where each value is separated by the optional separator parameter. If separator is omitted, BigQuery returns a comma-separated string.
Unlike GROUP_CONCAT, this function will not add double quotes to returned values that include a double quote character. For example, the string a"b would return as a"b.
Example: SELECT GROUP_CONCAT_UNQUOTED(x) FROM (SELECT 'a"b' AS x), (SELECT 'cd' AS x);
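The difference between the two functions can be imitated in a few lines of Python. This sketch only reproduces the outputs shown in the question (the joined string is wrapped in double quotes and inner quotes are doubled when any double quote is present); it is not BigQuery's implementation, and the Python function names simply mirror the SQL ones for readability.

```python
from typing import Iterable


def group_concat(values: Iterable[str], separator: str = ",") -> str:
    """Join values; if the result contains a double quote, double it and wrap the result."""
    joined = separator.join(values)
    if '"' in joined:
        return '"' + joined.replace('"', '""') + '"'
    return joined


def group_concat_unquoted(values: Iterable[str], separator: str = ",") -> str:
    """Join values and leave double quotes untouched."""
    return separator.join(values)


print(group_concat(['fo"o', 'ba"r']))         # Case1 input
print(group_concat(["fo'o", "ba'r"]))         # Case2 input
print(group_concat_unquoted(['a"b', "cd"]))   # documentation example input
```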