SQL merge using a CAST column - sql

I want to perform a SQL merge but on columns that have been CAST in the select statement.
I have tried this code:
CREATE TABLE test
AS
SELECT
a.ID, a.curr_bus_date, a.branch_no, a.account_no,
CAST(b.sortcodeinfo AS int) AS sortcode,
CAST(b.accountnumberinfo AS int) AS accountnumber,
b.curr_bus_date
FROM
TABLE1 AS a
LEFT JOIN
TABLE2 AS b ON a.branch_no = sortcode
AND a.account_no = accountnumber
AND a.curr_bus_date = b.curr_bus_date
WHERE
MONTH(curr_bus_date) = 11
AND YEAR(curr_bus_date) = 2022
And I get the error:
DatabaseError: An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: line 2:37: mismatched input '.'. Expecting: ',', 'EXCEPT', 'FROM', 'GROUP', 'HAVING', 'INTERSECT', 'LIMIT', 'OFFSET', 'ORDER', 'UNION', 'WHERE',
I can't quite get the syntax correct for the cast variables I wish to merge on. I need to do this casting because the variables are of different types and otherwise won't merge. The volumes in the tables are too big to do the casting as a separate exercise.

The names created in the select clause are not yet available in the ON join conditions. You will have to repeat the cast expressions there.
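As a rough sketch of that fix applied to the query from the question (untested against the real tables): the CAST expressions are repeated in the ON clause. Two things here are assumptions rather than part of the original: the second curr_bus_date gets the hypothetical alias b_curr_bus_date so the new table does not end up with two columns of the same name, and the WHERE columns are qualified with a. on the guess that the filter is meant for TABLE1.

```sql
CREATE TABLE test AS
SELECT
    a.ID, a.curr_bus_date, a.branch_no, a.account_no,
    CAST(b.sortcodeinfo AS int) AS sortcode,
    CAST(b.accountnumberinfo AS int) AS accountnumber,
    b.curr_bus_date AS b_curr_bus_date  -- hypothetical alias, avoids a duplicate column name
FROM TABLE1 AS a
LEFT JOIN TABLE2 AS b
    -- aliases from the SELECT list are not visible here, so the casts are repeated
    ON  a.branch_no     = CAST(b.sortcodeinfo AS int)
    AND a.account_no    = CAST(b.accountnumberinfo AS int)
    AND a.curr_bus_date = b.curr_bus_date
WHERE MONTH(a.curr_bus_date) = 11   -- qualified with a. (assumption)
  AND YEAR(a.curr_bus_date) = 2022
```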
Additionally, for the WHERE clause, this will be MUCH more efficient:
WHERE curr_bus_date >= '20221101' and curr_bus_date < '20221201'
Before, the database had to run the month and year functions for the curr_bus_date column on every row in the table, even rows you won't need. Using a function or otherwise altering a field like curr_bus_date also means the result of the expression no longer matches any index values. Any index on the column would be worthless for the query. Writing the conditions so the column values remain unaltered means indexes can still work, which can be HUGE for performance.
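The error message in the question looks like it comes from Amazon Athena. Assuming that is the engine and that curr_bus_date is a DATE column (neither is confirmed in the question), the same range condition may need typed date literals rather than plain strings. A minimal sketch:

```sql
-- Assumes a.curr_bus_date is of type DATE; adjust the literals if it is a timestamp or a string.
SELECT COUNT(*)
FROM TABLE1 AS a
WHERE a.curr_bus_date >= DATE '2022-11-01'
  AND a.curr_bus_date <  DATE '2022-12-01'
```

The half-open range (>= first day of the month, < first day of the next month) covers all of November 2022 without applying any function to the column.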

Related

impala replacing in-clause with inner join

I have a query like this running in impala:
SELECT
COUNT(*) AS value
FROM
myTable
WHERE
mycolumn IN ('value1', 'value2',..... 'value_n')
It seems that when n is in the thousands, this query becomes quite slow. I am now trying
to replace the IN clause with an inner join, something like this:
SELECT
COUNT(*) AS value
FROM
myTable
INNER JOIN (SELECT UNNEST(['value1', 'value2', .. 'value_n']) AS myColumnName)
ON myColumn=myColumnName
Impala complains about the use of UNNEST there:
Query: SELECT UNNEST([1, 2, 3]) AS myColumnName
Query submitted at: 2022-12-29 12:08:08 (Coordinator: http://rpkgh21dev147:25000)
ERROR: ParseException: Syntax error in line 1:
SELECT UNNEST([1, 2, 3]) AS myColumnName
       ^
Encountered: A reserved word cannot be used as an identifier: UNNEST
Expected: ALL, CASE, CAST, DEFAULT, DISTINCT, EXISTS, FALSE, IF, INTERVAL, LEFT, NOT, NULL, REPLACE, RIGHT, STRAIGHT_JOIN, TRUNCATE, TRUE, IDENTIFIER
I am not able to figure out how to do this. What is the correct syntax in Impala? (Be aware that the n values do not come from another table, so I cannot replace the literal array with another SELECT.)
Thank you very much in advance.
Roberto

REPLACE with JOIN - SQL

I need help to understand what I did wrong ... I'm a beginner, so please excuse the simple question!
I have two tables that I want to JOIN. In one of the join columns I had to use REPLACE to remove the text 'RIxRE', which does not interest me.
In table 1, this is the original text of the column id_notification: RIxRE-1787216-BSB and this is the text that is returned when using REPLACE: 1787216-BSB
In table 2, this is the text that exists: 1787216-BSB
However, I get the following error:
# 1054 - Unknown column 'a.id_not' in 'on clause'
SELECT *, REPLACE(a.id_notificacao,'RIxRE','') AS id_not
FROM robo_qualinet_cadastro_remedy a
JOIN (SELECT * FROM painel_monitoracao) b ON a.id_not = b.id_notificacao
You cannot reference a column alias defined in the SELECT list from the FROM clause (including its ON conditions) or from the WHERE clause (and possibly not from other clauses either, depending on the database).
So, repeat the expression:
SELECT *, REPLACE(rqcr.id_notificacao, 'RIxRE', '') AS id_not
FROM robo_qualinet_cadastro_remedy rqcr JOIN
painel_monitoracao pm
ON REPLACE(rqcr.id_notificacao, 'RIxRE', '') = pm.id_notificacao;
Notes:
Use table aliases that mean something, such as abbreviations for the table names.
The subquery is not necessary in the FROM clause.
I suspect that you have a problem with your data model if you need a REPLACE() for the JOIN condition, but that is a different issue from this question.
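If repeating the expression is unappealing, one alternative is to compute it in a derived table, where the alias becomes an ordinary column that the ON clause can see. This is only a sketch using the table and column names from the question; the inner alias r is introduced here for illustration, and the REPLACE pattern is copied from the question unchanged.

```sql
SELECT rqcr.*, pm.*
FROM (
    -- id_not is computed once here and is visible to the outer query
    SELECT r.*, REPLACE(r.id_notificacao, 'RIxRE', '') AS id_not
    FROM robo_qualinet_cadastro_remedy r
) rqcr
JOIN painel_monitoracao pm
    ON rqcr.id_not = pm.id_notificacao;
```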

Issue with Postgres not recognizing CAST on join

I'm trying to join two tables together based on an ID column. The join is not working successfully because I cannot join a varchar column on an integer column, despite using cast().
In the first table, the ID column is character varying, in the format of: XYZA-123456.
In the second table, the ID column is simply the number: 123456.
-- TABLE 1
create table fake_receivers(id varchar(11));
insert into fake_receivers(id) values
('VR2W-110528'),
('VR2W-113640'),
('VR4W-113640'),
('VR4W-110528'),
('VR2W-110154'),
('VMT2-127942'),
('VR2W-113640'),
('V16X-110528'),
('VR2W-110154'),
('VR2W-110528');
-- TABLE 2
create table fake_stations(receiver_sn integer, station varchar);
insert into fake_stations values
('110528', 'Baff01-01'),
('113640', 'Baff02-02'),
('110154', 'Baff03-01'),
('127942', 'Baff05-01');
My solution is to split the string at the dash, take the number after the dash, and cast it as an integer, so that I may perform the join:
select cast(split_part(id, '-', 2) as integer) from fake_receivers; -- this works fine, seemingly selects integers
However, when I actually attempt to perform the join, I'm getting the following error, despite using an explicit cast:
select cast(split_part(id, '-', 2) as integer), station
from fake_receivers
inner join fake_stations
on split_part = fake_locations.receiver_sn -- not recognizing split_part cast as integer!
>ERROR: operator does not exist: character varying = integer
>Hint: No operator matches the given name and argument type(s). You might need to add explicit type casts.
Strangely enough, I can perform this join with my full dataset (a queried result set shows up) but I then can't manipulate it at all (e.g. sorting, filtering it) - I get an error saying ERROR: invalid input syntax for integer: "UWM". The string "UWM" appears nowhere in my dataset or in my code, but I strongly suspect it has to do with the split_part cast from varchar to integer going wrong somewhere.
-- Misc. info
select version();
>PostgreSQL 10.5 on x86_64-apple-darwin16.7.0, compiled by Apple LLVM version 9.0.0 (clang-900.0.39.2), 64-bit
EDIT: dbfiddle exhibiting behavior
You need to include your current logic directly in the join condition:
select *
from fake_receivers r
inner join fake_stations s
on split_part(r.id, '-', 2)::int = s.receiver_sn;
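The "UWM" error mentioned in the question suggests that at least one id in the full dataset has something other than digits after the dash. As a hedged sketch (the regular expression assumes valid serial numbers are plain unsigned digits), the first query lists the offending rows and the second joins while skipping them instead of failing on the cast:

```sql
-- List ids whose part after the dash is not purely digits
select id
from fake_receivers
where split_part(id, '-', 2) !~ '^[0-9]+$';

-- Join that ignores such rows: the CASE yields NULL for them, so they never match
select r.id, s.station
from fake_receivers r
inner join fake_stations s
  on case when split_part(r.id, '-', 2) ~ '^[0-9]+$'
          then split_part(r.id, '-', 2)::int
     end = s.receiver_sn;
```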

Querying inside Postgresql hstore

I have a products table, and each product has many variants. The variants table has a price column of the hstore data type.
I have two queries
Query 1
SELECT variants.* FROM variants WHERE (CAST(variants.price -> 'sg' AS INT) > 1000)
Query 2
SELECT products.* FROM products INNER JOIN variants ON variants.checkoutable_id = products.id AND variants.checkoutable_type = 'Product' WHERE (CAST(variants.price -> 'sg' AS INT) > 1000)
While the first query fails with an error message ERROR: invalid input syntax for integer: "not a valid number" the second query works perfectly fine.
Building off of my comment, let's figure out how to find the problematic data. I'm going to assume you have an overwhelming number of rows in the variants table -- enough rows that manually looking for non-numeric values is going to be difficult.
First, let's isolate the rows which are not covered by the second query.
SELECT *
FROM variants
WHERE
checkoutable_type != 'Product' OR
checkoutable_id NOT IN (SELECT id FROM products);
That will probably take a while to run, and just be a big data dump. We're really interested in just price->'sg', and specifically the ones where price->'sg' isn't a string representation of an integer.
SELECT price->'sg'
FROM variants
WHERE
(checkoutable_type != 'Product' OR
checkoutable_id NOT IN (SELECT id FROM products)) AND
price->'sg' !~ '^-?[0-9]+$';
That should list out the items not joined in, and which include non-numbers in the string. Clean those up, and your first query should work.
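Until the data is cleaned up, Query 1 can also be made tolerant of bad values by casting only when the string looks like an integer. This is a sketch, not a replacement for fixing the data; the pattern assumes prices are optionally signed whole numbers.

```sql
SELECT variants.*
FROM variants
WHERE CASE
        -- cast only values that consist entirely of digits (with an optional minus sign)
        WHEN (variants.price -> 'sg') ~ '^-?[0-9]+$'
        THEN CAST(variants.price -> 'sg' AS INT)
      END > 1000;
```

Rows whose 'sg' value is missing or non-numeric produce NULL from the CASE and are simply filtered out.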
One or more rows of variants have improper content for integer, namely "not a valid number". Run the query to check which ones:
select *
from variants
where price->'sg' like 'not%';

PL/SQL Developer: joining VARCHAR2 to NUMBER?

Here's where I am:
TABLE1.ITM_CD is VARCHAR2 datatype
TABLE2.ITM_CD is NUMBER datatype
Executing left join TABLE2 on TABLE1.ITM_CD = TABLE2.ITM_CD yields ORA-01722: invalid number error
Executing left join TABLE2 on to_number(TABLE1.ITM_CD) = TABLE2.ITM_CD also yields ORA-01722: invalid number error.
-- I suspect this is because one of the values in TABLE1.ITM_CD is the string "MIXED"
Executing left join TABLE2 on TABLE1.ITM_CD = to_char(TABLE2.ITM_CD) successfully runs, but it returns blank values for the fields selected from TABLE2.
Here is a simplified version of my working query:
select
A.ITM_CD
,B.COST
,B.SIZE
,B.DESCRIPTION
,A.HOLD_REASON
from
TABLE1 a
left join TABLE2 b on a.ITM_CD = to_char(b.ITM_CD)
This query returns a list of item codes and hold reasons, but just blank values for the cost, size, and descriptions. And I did confirm that TABLE2 contains values for these fields for the returned codes.
UPDATE: The original post included images here with additional info: the ALL_TAB_COLUMNS entries for the two columns (I don't necessarily know what all the fields mean, but thought they might be helpful), TABLE1 sample data, and TABLE2 sample data.
You can convert the TABLE1.ITM_CD to a number after you strip any leading zeros and filter out the "MIXED" values:
select A.ITM_CD
,B.COST
,B.SIZE
,B.DESCRIPTION
,A.HOLD_REASON
from ( SELECT * FROM TABLE1 WHERE ITM_CD <> 'MIXED' ) a
left join TABLE2 b
on TO_NUMBER( LTRIM( a.ITM_CD, '0' ) ) = b.ITM_CD
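If the database is Oracle 12.2 or later (an assumption, the question does not state the version), TO_NUMBER can return NULL for strings it cannot convert, which avoids filtering out 'MIXED' beforehand and keeps those TABLE1 rows in the result with NULLs for the TABLE2 columns. A minimal sketch with a trimmed column list:

```sql
-- Requires Oracle 12.2+; unconvertible strings such as 'MIXED' become NULL and match nothing
select a.ITM_CD
      ,b.COST
      ,b.DESCRIPTION
      ,a.HOLD_REASON
from TABLE1 a
left join TABLE2 b
  on TO_NUMBER(a.ITM_CD DEFAULT NULL ON CONVERSION ERROR) = b.ITM_CD
```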
This is a SQL (and really a database) problem, not PL/SQL. You will need to fix this - fight your bosses if you have to. Item code must be a Primary Key in one of your two tables, and a Foreign Key in the other table, pointing to the PK. Or perhaps Item code is PK in another table which you didn't show us, and Item code in both the tables you showed us should be FK pointing to that PK.
You don't have that arrangement now, which is exactly why you are having all this trouble. The data type must match (it doesn't now), and you shouldn't have values like 'MIXED' - unless your business rules allow it, and then the field should be VARCHAR2 and 'MIXED' should be one of the values in the PK column (whichever table that is in).
In your case, the problem is that the codes in VARCHAR2 format start with a leading 0, so when you compare them to the numbers converted to strings you never get a match, and the outer join returns NULL for every TABLE2 column.
Instead, when you convert your numbers to strings, add leading zero(s). Use the FM modifier so that TO_CHAR does not prepend a blank for the sign:
...on a.ITM_CD = TO_CHAR(b.ITM_CD, 'FM099999')
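One detail worth checking with this approach: TO_CHAR with a numeric format model reserves a leading position for the sign, so a positive number comes back with a leading blank unless the FM modifier is used. The query below shows the difference; the brackets are only there to make the blank visible, and the six-digit width is taken from the format above.

```sql
select '[' || TO_CHAR(123, '099999')   || ']' as padded,
       '[' || TO_CHAR(123, 'FM099999') || ']' as padded_fm
from dual;
-- padded    = [ 000123]   (leading blank reserved for the sign)
-- padded_fm = [000123]
```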
You can trim leading zeros from your string column.
select
A.ITM_CD
,B.COST
,B.SIZE
,B.DESCRIPTION
,A.HOLD_REASON
from
TABLE1 a
left join TABLE2 b on trim(LEADING '0' FROM a.ITM_CD ) = to_char(b.ITM_CD)