How to convert this SAS code to SQL Server code? - sql

SAS CODE:
data table1;
set table2;
_sep1 = findc(policynum,'/&,');
_count1 = countc(policynum,'/&,');
_sep2 = findc(policynum,'-');
_count2 = countc(policynum,'-');
_sep3 = findc(policynum,'_*');
_count3 = countc(policynum,'_*');
How can I convert this into a select statement like below:
select
*,
/*Code converted to SQL from above*/
from table2
For example I tried the below code:
select
*,
charindex('/&,',policynum) as _sep1,
LEN(policynum) - LEN(REPLACE(policynum,'/&,','')) as _count1
from table2
But I got a ERROR 42S02: Function 'CHARINDEX(UNKNOWN, VARCHAR)' does not exist. Unable to identify a function that satisfies that given argument types. You may need to add explicit typecasts.
Please note that the variable pol_no is: 'character varying(50) not null'.
I am running this on using Aginity Workbench for Netezza. I believe this is IBM.

Assuming Oracle based on CHARINDEX() this may work:
You need to apply it twice, once for each character and take the minimum to find the first occurrence.
There may be a better suited function within Oracle, but I don't know enough to suggest one.
select
*,
min(charindex('/',policynum), charindex('&', policynum)) as _sep1
from table2
EDIT: based on OP notes.
Netezza seems like IBM which means use the INSTR function, not CHARINDEX.
select
*,
min(instr(policynum, '/'), instr(policynum, '&')) as _sep1
from table2
https://www.ibm.com/support/knowledgecenter/en/SSGU8G_12.1.0/com.ibm.sqls.doc/ids_sqs_2336.htm

FINDC & COUNTC functions are basically used for searching a character & counting them.
You can use LIKE operator from SQL to find characters with '%' and '_' wildcards
e.g. -
SELECT * FROM <table_name> WHERE <column_name> LIKE '%-%';
and
SELECT COUNT(*) FROM <table_name> WHERE <column_name> LIKE '%-%';
You can use regular expressions in the LIKE operator as well

Related

how to fetch range of values using LIKE operator in oracle?

I my DB i want to select data which having A15-A19 using LIKE operator but couldnt get required result.
the code i made as SELECT * FROM MASTER_RULE WHERE VALUE BETWEEN LIKE 'A15%' AND 'A19%' and also tried regular expression as SELECT * FROM MASTER_RULE WHERE REGEXP_LIKE(value, 'A[1-9]') . But regexp gives all records not specified range 15-19.
How to achieve the solution for this?
Your first query is not ok, it has one extra keyword that you do not need.
Here is the regexp_like solution:
SELECT * FROM MASTER_RULE WHERE REGEXP_LIKE(value, '^A[1][5-9]')
Here is a demo
UPDATE:
Here is the "BETWEEN SOLUTION":
SELECT *
FROM MASTER_RULE
WHERE substr(value, 2,length(value)-1) between 15 AND 19
You could just use regular string comparisons:
where value >= 'A15' and
value < 'A20'
Not only is this simple, but the code can also take advantage of an index on value.
As you mentioned in the comments your data is like A15, A16, A17. etc you can achive your requirement with simple in clause also.
SELECT * FROM MASTER_RULE WHERE VALUE in ('A15','A16','A17','A18,'A19');

Hive - SELECT inside WHEN clause of CASE function gives an error

I am trying to write a query in Hive with a Case statement in which the condition depends on one of the values in the current row (whether or not it is equal to its predecessor). I want to evaluate it on the fly, this way, therefore requiring a nested query, not by making it another column first and comparing 2 columns. (I was able to do the latter, but that's really second-best). Does anyone know how to make this work?
Thanks.
My query:
SELECT * ,
CASE
WHEN
(SELECT lag(field_with_duplicates,1) over (order by field_with_duplicates) FROM my_table b
WHERE b.id=a.id) = a.field_with_duplicates
THEN “Duplicate”
ELSE “”
END as Duplicate_Indicator
FROM my_table a
Error:
java.sql.SQLException: org.apache.spark.sql.AnalysisException: cannot recognize input near 'SELECT' 'lag' '(' in expression specification; line 4 pos 9
Notes:
The reason I needed the complicated 'lag' function is that the unique Id's in the table are not consecutive, but I don't think that's where it's at: I tested by substituting another simpler inner query and got the same error message.
Speaking of 'duplicates', I did search on this issue before posting, but the only SELECT's inside CASE's I found were in the THEN statement, and if that works the same, it suggests mine should work too.
You do not need the subquery inside CASE:
SELECT a.* ,
CASE
WHEN prev_field_with_duplicates = field_with_duplicates
THEN “Duplicate”
ELSE “”
END as Duplicate_Indicator
FROM (select a.*,
lag(field_with_duplicates,1) over (order by field_with_duplicates) as prev_field_with_duplicates
from my_table a
)a
or even you can use lag() inside CASE instead without subquery at all (I'm not sure if it will work in all Hive versions ):
CASE
WHEN lag(field_with_duplicates,1) over (order by field_with_duplicates) = field_with_duplicates
THEN “Duplicate”
ELSE “”
END as Duplicate_Indicator
Thanks to #MatBailie for the answer in his comment. Don't I feel silly...
Resolved

How to substring records with variable length

I have a table which has a column with doc locations, such as AA/BB/CC/EE
I am trying to get only one of these parts, lets say just the CC part (which has variable length). Until now I've tried as follows:
SELECT RIGHT(doclocation,CHARINDEX('/',REVERSE(doclocation),0)-1)
FROM Table
WHERE doclocation LIKE '%CC %'
But I'm not getting the expected result
Use PARSENAME function like this,
DECLARE #s VARCHAR(100) = 'AA/BB/CC/EE'
SELECT PARSENAME(replace(#s, '/', '.'), 2)
This is painful to do in SQL Server. One method is a series of string operations. I find this simplest using outer apply (unless I need subqueries for a different reason):
select *
from t outer apply
(select stuff(t.doclocation, 1, patindex('%/%/%', t.doclocation), '') as doclocation2) t2 outer apply
(select left(tt.doclocation2), charindex('/', tt.doclocation2) as cc
) t3;
The PARSENAME function is used to get the specified part of an object name, and should not used for this purpose, as it will only parse strings with max 4 objects (see SQL Server PARSENAME documentation at MSDN)
SQL Server 2016 has a new function STRING_SPLIT, but if you don't use SQL Server 2016 you have to fallback on the solutions described here: How do I split a string so I can access item x?
The question is not clear I guess. Can you please specify which value you need? If you need the values after CC, then you can do the CHARINDEX on "CC". Also the query does not seem correct as the string you provided is "AA/BB/CC/EE" which does not have a space between it, but in the query you are searching for space WHERE doclocation LIKE '%CC %'
SELECT SUBSTRING(doclocation,CHARINDEX('CC',doclocation)+2,LEN(doclocation))
FROM Table
WHERE doclocation LIKE '%CC %'

Regular expressions inside SQL Server

I have stored values in my database that look like 5XXXXXX, where X can be any digit. In other words, I need to match incoming SQL query strings like 5349878.
Does anyone have an idea how to do it?
I have different cases like XXXX7XX for example, so it has to be generic. I don't care about representing the pattern in a different way inside the SQL Server.
I'm working with c# in .NET.
You can write queries like this in SQL Server:
--each [0-9] matches a single digit, this would match 5xx
SELECT * FROM YourTable WHERE SomeField LIKE '5[0-9][0-9]'
stored value in DB is: 5XXXXXX [where x can be any digit]
You don't mention data types - if numeric, you'll likely have to use CAST/CONVERT to change the data type to [n]varchar.
Use:
WHERE CHARINDEX(column, '5') = 1
AND CHARINDEX(column, '.') = 0 --to stop decimals if needed
AND ISNUMERIC(column) = 1
References:
CHARINDEX
ISNUMERIC
i have also different cases like XXXX7XX for example, so it has to be generic.
Use:
WHERE PATINDEX('%7%', column) = 5
AND CHARINDEX(column, '.') = 0 --to stop decimals if needed
AND ISNUMERIC(column) = 1
References:
PATINDEX
Regex Support
SQL Server 2000+ supports regex, but the catch is you have to create the UDF function in CLR before you have the ability. There are numerous articles providing example code if you google them. Once you have that in place, you can use:
5\d{6} for your first example
\d{4}7\d{2} for your second example
For more info on regular expressions, I highly recommend this website.
Try this
select * from mytable
where p1 not like '%[^0-9]%' and substring(p1,1,1)='5'
Of course, you'll need to adjust the substring value, but the rest should work...
In order to match a digit, you can use [0-9].
So you could use 5[0-9][0-9][0-9][0-9][0-9][0-9] and [0-9][0-9][0-9][0-9]7[0-9][0-9][0-9]. I do this a lot for zip codes.
SQL Wildcards are enough for this purpose. Follow this link: http://www.w3schools.com/SQL/sql_wildcards.asp
you need to use a query like this:
select * from mytable where msisdn like '%7%'
or
select * from mytable where msisdn like '56655%'

Converting a String to HEX in SQL

I'm looking for a way to transform a genuine string into it's hexadecimal value in SQL. I'm looking something that is Informix-friendly but I would obviously prefer something database-neutral
Here is the select I am using now:
SELECT SomeStringColumn from SomeTable
Here is the select I would like to use:
SELECT hex( SomeStringColumn ) from SomeTable
Unfortunately nothing is that simple... Informix gives me that message:
Character to numeric conversion error
Any idea?
Can you use Cast and the fn_varbintohexstr?
SELECT master.dbo.fn_varbintohexstr(CAST(SomeStringColumn AS varbinary))
FROM SomeTable
I'm not sure if you have that function in your database system, it is in MS-SQL.
I just tried it in my SQL server MMC on one of my tables:
SELECT master.dbo.fn_varbintohexstr(CAST(Addr1 AS VARBINARY)) AS Expr1
FROM Customer
This worked as expected. possibly what I know as master.dbo.fn_varbintohexstr on MS-SQL, might be similar to informix hex() function, so possibly try:
SELECT hex(CAST(Addr1 AS VARBINARY)) AS Expr1
FROM Customer
The following works in Sql 2005.
select convert(varbinary, SomeStringColumn) from SomeTable
Try this:
select convert(varbinary, '0xa3c0', 1)
The hex number needs to have an even number of digits. To get around that, try:
select convert(varbinary, '0x' + RIGHT('00000000' + REPLACE('0xa3c','0x',''), 8), 1)
If it is possible for you to do this in the database client in code it might be easier.
Otherwise the error probably means that the built in hex function can't work with your values as you expect. I would double check the input value is trimmed and in the format first, it might be that simple. Then I would consult the database documentation that describes the hex function and see what its expected input would be and compare that to some of your values and find out what the difference is and how to change your values to match that of the expected input.
A simple google search for "informix hex function" brought up the first result page with the sentence: "Must be a literal integer or some other expression that returns an integer". If your data type is a string, first convert the string to an integer. It looks like at first glance you do something with the cast function (I am not sure about this).
select hex(cast SomeStringColumn as int)) from SomeTable
what about:
declare #hexstring varchar(max);
set #hexstring = 'E0F0C0';
select cast('' as xml).value('xs:hexBinary( substring(sql:variable("#hexstring"), sql:column("t.pos")) )', 'varbinary(max)')
from (select case substring(#hexstring, 1, 2) when '0x' then 3 else 0 end) as t(pos)
I saw this here:
http://blogs.msdn.com/b/sqltips/archive/2008/07/02/converting-from-hex-string-to-varbinary-and-vice-versa.aspx
Sorrry, that work only on >MS SQL 2005
OLD Post but in my case I also had to remove the 0x part of the hex so I used the below code. (I'm using MS SQL)
convert(varchar, convert(Varbinary(MAX), YOURSTRING),2)
SUBSTRING(CONVERT(varbinary,Addr1 ) ,1,1) as Expr1