I have the following query:
SELECT DISTINCT col_name, toTypeName(col_name)
FROM remote('host_name', 'db.table', 'user', 'password')
The result is 6 records (none of them NULL). Example:
some_prefix-1, Nullable(String)
...
some_prefix-6, Nullable(String)
Now when I try splitByChar, I get:
Code: 43, e.displayText() = DB::Exception: Nested type Array(String)
cannot be inside Nullable type (version 20.1.2.4 (official build))
I tried using a NOT NULL condition and converting the type, but the problem remains. Like this:
SELECT DISTINCT toString(col_name) AS col_name_str,
splitByChar('-', col_name_str)
FROM remote('host_name', 'db.table', 'user', 'password')
WHERE col_name IS NOT NULL
Is this expected behavior? How to fix this?
Lack of Nullable support in splitByChar (https://github.com/ClickHouse/ClickHouse/issues/6517)
You are using the wrong cast: toString of a Nullable(String) is still Nullable(String). Use cast(col_name, 'String'), which returns a plain String:
SELECT DISTINCT
cast(col_name, 'String') AS col_name_str,
splitByChar('-', col_name_str)
FROM
(
SELECT cast('aaaaa-vvvv', 'Nullable(String)') AS col_name
)
WHERE isNotNull(col_name)
┌─col_name_str─┬─splitByChar('-', cast(col_name, 'String'))─┐
│ aaaaa-vvvv │ ['aaaaa','vvvv'] │
└──────────────┴────────────────────────────────────────────┘
or assumeNotNull
SELECT DISTINCT
assumeNotNull(col_name) AS col_name_str,
splitByChar('-', col_name_str)
FROM
(
SELECT cast('aaaaa-vvvv', 'Nullable(String)') AS col_name
)
WHERE isNotNull(col_name)
┌─col_name_str─┬─splitByChar('-', assumeNotNull(col_name))─┐
│ aaaaa-vvvv │ ['aaaaa','vvvv'] │
└──────────────┴───────────────────────────────────────────┘
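Applied back to the original remote() query (keeping the question's placeholder host and credentials), the same fix would look like:
SELECT DISTINCT
    assumeNotNull(col_name) AS col_name_str,
    splitByChar('-', col_name_str)
FROM remote('host_name', 'db.table', 'user', 'password')
WHERE col_name IS NOT NULL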
When I used:
SELECT * FROM hive('thrift://xxx:9083', 'ods_qn', 'ods_crm_prod_on_off_line_msg_es_df', 'bizid Nullable(String), corpid Nullable(Int32),time Nullable(Int64),reasontype Nullable(Int32),weworkid Nullable(Int64), type Nullable(Int8),pt String', 'pt');
I get:
Received exception from server (version 22.3.2):
Code: 210. DB::Exception: Received from localhost:9000. DB::Exception: Unable to connect to HDFS: InvalidParameter: Cannot create namenode proxy, does not contain host or port. (NETWORK_ERROR)
PS: my HDFS uses HA mode. This is the HDFS-related part of my ClickHouse config.xml:
<libhdfs3_conf>/etc/clickhouse-server/hdfs-client.xml</libhdfs3_conf>
What should I do? Thank you.
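For reference, as far as I understand, an HA client config has to define the nameservice itself, since there is no single host:port to connect to; something along these lines (ns1, nn1/nn2 and the hosts below are placeholders, not my real values):
<configuration>
    <property>
        <name>dfs.nameservices</name>
        <value>ns1</value>
    </property>
    <property>
        <name>dfs.ha.namenodes.ns1</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn1</name>
        <value>namenode1.example.com:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn2</name>
        <value>namenode2.example.com:8020</value>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.ns1</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
</configuration>
The HDFS URL would then reference the nameservice (hdfs://ns1/...) instead of a single host and port.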
PS: when I use:
CREATE TABLE hdfs_engine_table (name String, value UInt32) ENGINE=HDFS('hdfs://nn1:8020/testck/other_test', 'TSV')
INSERT INTO hdfs_engine_table VALUES ('one', 1), ('two', 2), ('three', 3)
select * from hdfs_engine_table;
┌─name──┬─value─┐
│ one │ 1 │
│ two │ 2 │
│ three │ 3 │
└───────┴───────┘
it works OK! But when I use the hive URL, I get the error above.
I would like to create a Temporary View from the results of a SQL Query - which sounds like a basic thing to do, but I just couldn't make it work and don't understand what is wrong.
This is my SQL query - which works fine and returns Col1.
%sql
SELECT
Col1
FROM
Table1
WHERE EXISTS (
select *
from TempView1)
I would like to write the results in another table which I can query. Therefore I do this:
df = spark.sql("""
SELECT
Col1
FROM
Table1
WHERE EXISTS (
select *
from TempView1)""")
OK
df
Out[28]: DataFrame[Col1: bigint]
df.createOrReplaceTempView("df_tmp_view")
OK
%sql
select * from df_tmp_view
Error in SQL statement: AnalysisException: Table or view not found: df_tmp_view; line 1 pos 14;
'Project [*]
+- 'UnresolvedRelation [df_tmp_view], [], false
display(df_tmp_view)
NameError: name 'df_tmp_view' is not defined
What am I doing wrong?
I don't understand the error saying the name is not defined, although I defined it just one command above. Also, the SQL query works and returns data... so what am I missing?
Thanks!
You need to create the view as a global temporary view and then read it through the global temp database; for example, in your case:
df.createOrReplaceGlobalTempView("df_tmp_view")
global_temp_db = spark.conf.get("spark.sql.globalTempDatabase")
display(table(global_temp_db + "." + 'df_tmp_view'))
documentation
for example:
import pandas as pd

df_pd = pd.DataFrame(
{
'Name' : [231232,12312321,3213231],
}
)
df = spark.createDataFrame(df_pd)
df.createOrReplaceGlobalTempView('test_tmp_view')
global_temp_db = spark.conf.get("spark.sql.globalTempDatabase")
display(table(global_temp_db + "." + 'test_tmp_view'))
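Once the view is registered as a global temp view, a %sql cell can read it too, by qualifying the name with the global temp database (global_temp by default):
%sql
select * from global_temp.test_tmp_view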
I have a table with a column called data that contains some JSON. If the data column for any given row in the table is not null, it will contain a JSON-encoded object with a key called companyDescription. The value associated with companyDescription is an arbitrary JavaScript object.
If I query my table like this
select data->>'companyDescription' from companies where data is not null;
I get rows like this
{"ops":[{"insert":"\n"}]}
I am trying to update all rows in the table so that the companyDescription values will be wrapped in another JSON-encoded JavaScript object in the following manner:
{"type":"quill","content":{"ops":[{"insert":"\n"}]}}
Here's what I have tried, but I think it won't work because the ->> operator is for selecting some JSON field as text, and indeed it fails with a syntax error.
update companies
set data->>'companyDescription' = CONCAT(
'{"type":"quill","content":',
(select data->>'companyDescription' from companies),
'}'
);
What is the correct way to do this?
You can use the function jsonb_set. XML and JSON values are immutable in Postgres: you cannot update parts of these values in place; you can only replace the whole value with a new, modified one.
postgres=# select * from test;
┌──────────────────────────────────────────────────────────────────────┐
│ v │
╞══════════════════════════════════════════════════════════════════════╡
│ {"companyId": 10, "companyDescription": {"ops": [{"insert": "\n"}]}} │
└──────────────────────────────────────────────────────────────────────┘
(1 row)
postgres=# select jsonb_build_object('type', 'quill', 'content', v->'companyDescription') from test;
┌───────────────────────────────────────────────────────────┐
│ jsonb_build_object │
╞═══════════════════════════════════════════════════════════╡
│ {"type": "quill", "content": {"ops": [{"insert": "\n"}]}} │
└───────────────────────────────────────────────────────────┘
(1 row)
postgres=# select jsonb_set(v, ARRAY['companyDescription'], jsonb_build_object('type', 'quill', 'content', v->'companyDescription')) from test;
┌────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ jsonb_set │
╞════════════════════════════════════════════════════════════════════════════════════════════════════╡
│ {"companyId": 10, "companyDescription": {"type": "quill", "content": {"ops": [{"insert": "\n"}]}}} │
└────────────────────────────────────────────────────────────────────────────────────────────────────┘
(1 row)
So your final statement can look like:
update companies
set data = jsonb_set(data::jsonb,
ARRAY['companyDescription'],
jsonb_build_object('type', 'quill',
'content', data->'companyDescription'))
where data is not null;
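If some rows have non-null data but no companyDescription key, the statement above would still write "content": null into them; guarding on the key with the jsonb ? operator keeps those rows untouched (a sketch, same column names as above):
update companies
set data = jsonb_set(data::jsonb,
           ARRAY['companyDescription'],
           jsonb_build_object('type', 'quill',
                              'content', data->'companyDescription'))
where data is not null
  and data::jsonb ? 'companyDescription';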
For example, my table is:
CREATE TABLE mytable (
id bigint NOT NULL,
foo jsonb
);
and it has some values:
 id | foo
-----+-------
  1  | {"a": false, "b": true}
  2  | {"a": true, "b": false}
  3  | NULL
I want to know how to check whether the value of a key is true, and which operator I should use.
I want something like this that can check the value:
SELECT 1
FROM mytable
WHERE
id=2
AND
foo['a'] is true
;
The syntax foo['a'] is invalid in Postgres.
If you want to access the value of a key, you need to use the ->> operator as documented in the manual
select *
from mytable
where id = 2
and foo ->> 'a' = 'true';
SELECT 1
FROM mytable
Where
id=2
AND
(foo ->> 'a')::boolean is true;
More correct might be
SELECT 1
FROM mytable
WHERE id=2
AND (foo -> 'a') = 'true'::JSONB;
This has the benefit of allowing Postgres to make better use of any indexes you may have on your jsonb data, as well as avoiding some of the ambiguity of the ->> operator that others have mentioned.
Using ->>
=> SELECT (('{"a": true}'::JSONB)->>'a') = 'true' as result;
result
--------
t
(1 row)
=> SELECT (('{"a": "true"}'::JSONB)->>'a') = 'true' as result;
result
--------
t
(1 row)
Using ->
=> SELECT (('{"a": "true"}'::JSONB)->'a') = 'true'::JSONB as result;
result
--------
f
(1 row)
=> SELECT (('{"a": true}'::JSONB)->'a') = 'true'::JSONB as result;
result
--------
t
(1 row)
Note: This is the same as Tamlyn's answer, but with an included example of how to compare against a JSONB true.
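On the index point above, a plain expression index is enough for the jsonb equality form (the index name here is made up):
CREATE INDEX mytable_foo_a_idx ON mytable ((foo -> 'a'));
A btree index on the (foo -> 'a') expression can serve (foo -> 'a') = 'true'::JSONB directly, while the ->> form would need a separate index on the text expression.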
To get the text value of a key use ->> (double head) and to get the json or jsonb value use -> (single head).
Be careful, because the text representations of the JSON boolean value true and the string value "true" are both true.
tamlyn=# select '{"a":true}'::json->>'a' bool, '{"a":"true"}'::json->>'a' str;
bool | str
------+------
true | true
(1 row)
In your case you probably want ->.
tamlyn=# select '{"a":true}'::json->'a' bool, '{"a":"true"}'::json->'a' str;
bool | str
------+--------
true | "true"
(1 row)
Get the JSON object field, cast to boolean and do a regular SQL where clause:
select *
from mytable
where (foo -> 'a')::boolean is true;
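Note that the direct jsonb-to-boolean cast used here needs Postgres 11 or later; a quick way to check it on your server:
select ('{"a": true}'::jsonb -> 'a')::boolean as result;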
I'm writing database queries with pg-promise. My tables look like this:
Table "public.setting"
│ user_id │ integer │ not null
│ visualisation_id │ integer │ not null
│ name │ character varying │ not null
Table "public.visualisation"
│ visualisation_id │ integer │ not null
│ template_id │ integer │ not null
I want to insert some values into setting - three are hard-coded, and one I need to look up from visualisation.
The following statement does what I need, but must be vulnerable to SQL injection:
var q = "INSERT INTO setting (user_id, visualisation_id, template_id) (" +
"SELECT $1, $2, template_id, $3 FROM visualisation WHERE id = $2)";
conn.query(q, [2, 54, 'foo']).then(data => {
console.log(data);
});
I'm aware I should be using SQL names, but if I try using them as follows I get TypeError: Invalid sql name: 2:
var q = "INSERT INTO setting (user_id, visualisation_id, template_id) (" +
"SELECT $1~, $2~, template_id, $3~ FROM visualisation WHERE id = $2)";
which I guess is not surprising since it's putting the 2 in double quotes, so SQL thinks it's a column name.
If I try rewriting the query to use VALUES I also get a syntax error:
var q = "INSERT INTO setting (user_id, visualisation_id, template_id) VALUES (" +
"$1, $2, SELECT template_id FROM visualisation WHERE id = $2, $3)";
What's the best way to insert a mix of hard-coded and variable values, while avoiding SQL injection risks?
Your query is fine. I think you know about value placeholders ($X parameters) and SQL Names, but you are a bit confused.
In your query you only assign values to placeholders; the database driver handles them for you, providing proper escaping and variable substitution.
The documentation says:
When a parameter's data type is not specified or is declared as
unknown, the type is inferred from the context in which the parameter
is used (if possible).
I can't find a source that states what the default type is, but I think the INSERT statement provides enough context to identify the real types.
On the other hand, you have to use SQL Names when you build your query dynamically, for example when you have variable column or table names. They must be inserted through $1~ or $1:name style parameters, which keeps you safe from injection attacks.
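For example, a short sketch combining both kinds of parameters (the query itself is made up for illustration):
// $1 and $2 are value placeholders, escaped by the driver;
// $3~ is an SQL Name, so pg-promise quotes it as an identifier
var q = "SELECT $3~ FROM visualisation WHERE visualisation_id = $1 LIMIT $2";
conn.query(q, [54, 10, 'template_id']).then(data => {
    console.log(data);
});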