PostgreSQL: How do I sum after using a text function and cast? (ERROR SQL state: 22P02)

I am probably missing something obvious and asking a silly question, but I am unable to do a simple sum.
My data contains the '€' character, so I had to import it as text.
Original data sample:
"€31.51"
"€0.10"
"€24.23"
I tried to use a string function to remove the €. I was then hoping to convert to numeric and sum.
SELECT sum(coalesce(CAST(split_part(revenue_eur, '€', 2) as NUMERIC),'0'))
FROM revenue_test;
The only piece that runs is:
SELECT coalesce(split_part(revenue_eur, '€', 2),'0')
FROM revenue_test;
What I need is just a sum. Could someone please help me figure it out?
I tried a subquery but failed miserably.
Maybe there is a way to import the data without the € and straight into a numeric column?
Thank you!
EDIT: I imported via CSV into pgAdmin 4, using PostgreSQL 12 (the file is about 85k rows).
To import the data I tried COPY in the pgAdmin 4 query tool, but I got a 'permission denied' error. I checked all the permissions on my file but was clearly missing something; the most likely solution I found was to connect to PostgreSQL via the terminal on my Mac and use \COPY, but I didn't manage to do that.
So I ended up using the right-click 'Import' feature in pgAdmin.
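For completeness, the \copy route mentioned above would look roughly like this. This is only a hedged sketch: the file path, header option, and single-column target are assumptions about the CSV.
\copy revenue_test (revenue_eur) FROM '/path/to/revenue.csv' WITH (FORMAT csv, HEADER true)
Because \copy is a psql client command, the file is read on the client side, which avoids the server-side 'permission denied' problem that plain COPY runs into.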
EDIT 2: I found the problem: some values contain a ',' (thousands separator), so I am unable to cast without removing it first.
I found regular-expression examples for removing a character at a specific position, but the ',' appears at random positions.
Code works!
WITH test AS (
    SELECT translate(revenue_eur, '€,', '')::float AS eur
    FROM revenue_test
)
SELECT sum(coalesce(eur, 0))
FROM test;
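A variation on the same idea, as a hedged sketch: casting to numeric instead of float avoids floating-point rounding when summing money, and nullif() guards against empty strings. Both of those are assumptions about the data, not part of the original fix.
SELECT sum(coalesce(nullif(translate(revenue_eur, '€,', ''), '')::numeric, 0)) AS total_eur
FROM revenue_test;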

Related

ms access query expression field not accepting IIf expression

I'm trying to return 1 if the field is null and 0 if it is not, but when I enter the expression, an error pops up.
Here's the expression I wrote:
IIf(IsNull([_characterType]); 0; 1)
The database file (I had to empty it for legal reasons)
(The database was in another language with special characters and I changed all of it to English; that may be the problem, but I still don't know how to fix it.)
I tried using a comma instead of a semicolon and got the same result.
When I enter the exact same expression in SQL view it works, but when I switch to Design view it breaks again.
I tried using external software to fix the database.
I tried importing all of the data into a newly created Access file.
Try this for your expression:
Abs(Not IsNull([_characterType]))
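For instance, in SQL view the full query might look like this hedged sketch (the table name tblCharacters and the alias are assumptions; only the expression itself comes from the answer above):
SELECT Abs(Not IsNull([_characterType])) AS CharacterTypeFlag
FROM tblCharacters;
Abs(Not IsNull(...)) returns 1 when the field has a value and 0 when it is Null, and because none of these functions takes more than one argument, there is no comma/semicolon list separator to get wrong.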

How to pass a Date Pipeline Parameter to a Data Flow use in a Dataflow Expression Builder

I am doing something that seems like it should be very easy, yet I haven't been able to figure it out. I have read countless posts and tried everything I can think of, and still no success.
Here goes:
I created a Pipeline Parameter pplLastWritten with a default value of 2022-08-20 12:19:08 (I have tried without the time for troubleshooting and still get errors)
Then I created a Data Flow parameter ptblTableName.
I have tried converting the value to a Date, keeping it as-is and converting later... you name it, it still errors out.
In the expression builder I tried this, and many other variations, to build out the SQL statement:
"SELECT * FROM xxxxxx."+$ptblTableName+"where Lastwritten>='{$ptblLastWritten}'"
This is the post I got the idea from: ADF data flow concat expression with single quote
This is the error I got most of the time.
Operation on target df_DynamicSelect failed: {"StatusCode":"DF-Executor-StoreIsNotDefined","Message":"Job failed due to reason: at Source 'RptDBTEST'(Line 5/Col 0): The store configuration is not defined. This error is potentially caused by invalid parameter assignment in the pipeline.","Details":""}
I have tried so many things, but in the end nothing has worked. I am new to Data Factory and come from the SSIS world, which was so much easier. I would greatly appreciate someone helping. Please explain this like I'm a kindergartener, because this tool is making me feel like one. :) Thank you in advance.
What I have tried: various ways to format the value, different ideas in the expression builder, and the ideas in this post: ADF data flow concat expression with single quote
You can use the concat() function in a Data Flow dynamic expression, like below.
Here is the sample data in SQL.
I have created two Data Flow parameters, table_name and mydate.
I passed the values like below and checked the Expression checkbox. For the date you can also pass a value like '2022-11-07T00:00:00.0000000'.
In the Query option, use the expression below.
concat('select * from dbo.',$table_name,' where mydate >=','\'',$mydate,'\'')
Values inserted in Target table.
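As a side note, and only as a hedged sketch built from the same two parameters, the Data Flow expression language also supports string interpolation, which avoids the escaped quotes:
"select * from dbo.{$table_name} where mydate >= '{$mydate}'"
Either form should produce the same query text; the concat() version shown in the answer above is the one that was verified.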

Partial Loading Due to " in data - Snowflake Issue

I haven't been able to find anything that describes this issue I am having, although I am sure many have had this problem. It may be as simple as forcing pre-processing in Python before loading the data in.
I am trying to load data from S3 into Snowflake tables. I am seeing errors such as:
Numeric value '' is not recognized
Timestamp '' is not recognized
In the table definitions, these columns are set to DEFAULT NULL, so if there are NULL values here it should be able to handle them. I opened the files in Python to check these columns, and sure enough some of the rows (exactly the number throwing an error in Snowflake) are NaNs.
Is there a way to correct for this in Snowflake?
Good chance you need to add something to your COPY INTO statement to get this to execute correctly. Try this parameter in your format options:
NULL_IF = ('NaN')
If you have more than just NaN values (like actual strings of 'NULL'), then you can add those to the list in the () above.
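Put together, a COPY INTO statement with that option might look like this hedged sketch (the table name, stage path, and the FIELD_OPTIONALLY_ENCLOSED_BY option are assumptions, the latter based on the quoted fields mentioned in the title):
COPY INTO my_table
  FROM @my_s3_stage/path/
  FILE_FORMAT = (TYPE = 'CSV' FIELD_OPTIONALLY_ENCLOSED_BY = '"' NULL_IF = ('NaN', ''));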
If you are having issues loading data into tables (from any source) and are experiencing a similar issue to the one described above, where the error tells you *datatype* '' is not recognized, then you will need to follow these instructions:
Go into the FILE_FORMAT you are using through the DATABASES tab
Select the FILE_FORMAT and click EDIT in the tool bar
Click on Show SQL on the bottom left window that appears, copy the statement
Paste the statement into a worksheet and alter the NULL_IF statement as follows
NULL_IF = ('\\N','');
Snowflake doesn't seem to recognize a completely empty value by default, so you need to add it as an option!
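In SQL terms, the edit amounts to something like this hedged sketch (the file format name my_csv_format is an assumption):
ALTER FILE FORMAT my_csv_format SET NULL_IF = ('\\N', '');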

Microsoft Query in Excel - Casting types?

I have an Excel workbook that has an external connection to a .csv file for a pivot table. One of the columns in the .csv - let's call it ID - has data such as '00000000000000101'. I have a simple SELECT ID FROM DATA.csv set up.
In Microsoft Access, while importing the table, I can set that field to be text before running the query on it. However, in my current situation, Excel/Microsoft Query is taking it in as an int. I looked up the CAST function, and proceeded to try
SELECT CAST(ID AS TEXT) and
SELECT CAST(ID AS CHAR(255))
but both yielded the error
Incorrect column expression: 'CAST(ID AS xxx'.
Am I using the function incorrectly? How should I approach this?
If it is relevant, here is my connection string:
DBQ=C:\;Driver={Microsoft Access Text Driver (*.txt, *.csv)};DriverId=27;Extensions=txt,csv,tab,asc;FIL=text;MaxBufferSize=2048;MaxScanRows=25;PageTimeout=5;SafeTransactions=0;Threads=3;UID=admin;UserCommitSync=Yes;
I managed to fix this by using a schema.ini file. This helps to specify how Excel/Microsoft Query should read the fields and it seems to work nicely. Thanks for all the help!
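For reference, a minimal sketch of what such a schema.ini might look like, placed in the same folder as the CSV (the file name DATA.csv and the single forced column are assumptions):
[DATA.csv]
ColNameHeader=True
Format=CSVDelimited
Col1=ID Text
The Col1=ID Text line is what forces the ID column to be read as text instead of being inferred as an integer.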
Try the CSTR() function: use SELECT CSTR(ID)
http://office.microsoft.com/en-us/access-help/type-conversion-functions-HA001229018.aspx

Pentaho table output step not showing proper error in log

In Pentaho, I have a Table output step where I load a huge number of records into a Netezza target table.
One of the rows fails and the log shows me which values are causing the problem. But the log is probably not right, because when I create an insert statement with those values and run it separately on the database, it works fine.
My question is:
In Pentaho, is there a way to identify that when a db insert fails, exactly which values caused the problem and why?
EDIT: The error is 'Column width exceeded' and it shows me the values that are supposedly causing the problem. But I made an insert statement with those values and it works fine. So I think Pentaho is not showing me the correct error message; a different set of values is causing the problem.
Another way I've used to deal with these kinds of problems is to create another table in the DB with widened column types. Then in your transform, add a Table output step connected to the new table. Then connect your original Table output to the new step, but when asked, choose 'Error handling' as the hop type.
When you run your transform, the offending rows will end up in the new table. Then you can investigate exactly what the problem is with that particular row.
For example you can do something like:
insert into [original table] select * from [error table];
You'll probably get a better error message from your native DB interface than from the JDBC driver.
I don't know exactly what your problem is, but I think I have had the same problem before.
Everything seems right, but in my case the problem was that some transformations, for example converting a numeric value to a string, added a whitespace at the end of the field, so the length of the field became n+1 instead of n, and that is very difficult to see.
A practical example: if you use a Calculator step with the YEAR() function to extract the year from a date field, the new year field may pick up a trailing whitespace. So if the year had a length of 4, after that step it has a length of 5, and when you load a row (with that year field as a string(5)) into a data warehouse column that expects a string(4), you will get the same error you are getting now.
What you think is happening --> year = "2013" --> length 4
What is really happening --> year = "2013 " --> length 5
I recommend paying close attention to the string fields and their lengths, because if some transformation adds a whitespace you don't expect, you can lose a lot of time finding the error (speaking from experience).
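If you route the failing rows into a wider error table, as the earlier answer suggests, a quick check for this is a hedged sketch like the following (error_table and year_field are placeholder names, not from the original transform):
SELECT year_field, length(year_field) AS raw_len, length(trim(year_field)) AS trimmed_len
FROM error_table
WHERE length(year_field) <> length(trim(year_field));
Any rows returned have picked up unexpected leading or trailing whitespace.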
I hope this can be useful for you!
EDIT: I'm guessing you are working with PDI (Spoon, formerly known as Kettle) and the error occurs while you are loading a data warehouse, so correct me if I'm wrong.
Can you try loading the file with the nzload command? With it you can find the exact error, and the bad records are written to the bad-records file you provide for detailed analysis.
e.g.:
nzload -u <username> -pw <password> -host <netezzahost> -db <database> -t <tablename> -df <datafile> -lf <logfile> -bf <badrecords file name> -delim <delimiter>