How to select column names that start with a given prefix in a PROC SQL query in SAS

I am looking for a way to select all columns whose names start with a specific string. My data contains the same column name multiple times with a number at the end, and I want the code to always select all of those columns regardless of the trailing number.
For example, if I have 3 kinds of apple in my column names, the dataset will contain the columns "apple_1", "apple_2" and "apple_3". Therefore, I want to select all columns that start with "apple_" in a proc sql statement.
Thank you

In regular SAS code you can use : as a wildcard to create a variable list. You normally cannot use variable lists in SQL code, but you can use them in dataset options.
proc sql ;
create table want as
select *
from mydata(keep= id apple_: )
;
quit;
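Outside SAS, where the apple_: variable-list shorthand is not available, the same effect usually has to be built by reading the column names from the database's metadata and assembling the SELECT list yourself. A minimal Python sketch using the standard-library sqlite3 module; the table mydata, its columns, and the helper columns_with_prefix are made up for illustration and are not part of the original answer:

```python
import sqlite3


def columns_with_prefix(conn, table, prefix):
    """Return the column names of `table` that start with `prefix` (case-insensitive)."""
    # PRAGMA table_info yields one row per column; index 1 holds the column name.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [r[1] for r in rows if r[1].lower().startswith(prefix.lower())]


# Hypothetical sample table, mirroring the apple_1..apple_3 example in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mydata (id INTEGER, apple_1 REAL, apple_2 REAL, apple_3 REAL, pear_1 REAL)"
)
conn.execute("INSERT INTO mydata VALUES (1, 0.5, 0.6, 0.7, 9.9)")

apple_cols = columns_with_prefix(conn, "mydata", "apple_")
select_list = ", ".join(["id"] + apple_cols)
rows = conn.execute(f"SELECT {select_list} FROM mydata").fetchall()
```

The column names come from the database itself here, so interpolating them into the query text is acceptable; user-supplied names would need validation first.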

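Use LIKE (note that this filters rows by the values stored in a column; it does not select columns by name):
proc sql;
select t.*
from t
where col like 'apple%';
quit;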
If you want to match the _ character as well, you need to use the ESCAPE clause, because _ is a wildcard for any single character in LIKE:
proc sql;
select t.*
from t
where col like 'apple$_%' escape '$';
quit;
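The difference between the escaped and unescaped underscore can be checked with a small experiment. The sketch below uses Python's sqlite3 module rather than SAS, on the assumption that LIKE and ESCAPE behave the same way for this pattern; the table t and its values are invented for the demonstration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?)",
    [("apple_1",), ("appleX2",), ("apples",), ("pear_1",)],
)

# Unescaped: _ matches any single character, so 'appleX2' and 'apples' also qualify.
plain = [
    r[0]
    for r in conn.execute("SELECT col FROM t WHERE col LIKE 'apple_%' ORDER BY col")
]

# Escaped: $_ stands for a literal underscore, so only 'apple_1' qualifies.
escaped = [
    r[0]
    for r in conn.execute(
        "SELECT col FROM t WHERE col LIKE 'apple$_%' ESCAPE '$' ORDER BY col"
    )
]
```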

Related

Can an SQL query select variables into different tables with an INTO clause?

I'm working my way through this guide on how to convert several character variables at once to numeric using SAS. The guide uses PROC SQL, the SQL interface of SAS. What got me wondering is the part following the INTO clause below:
proc sql noprint; select
trim(left(name)), trim(left(newname)),
trim(left(newname))||'='||trim(left(name))
into :c_list separated by ' ', :n_list separated by ' ',
:renam_list separated by ' '
from vars;
Essentially, the clause seems to select each variable into a different table. Are you allowed to do this in SQL, or is this only a feature of PROC SQL in SAS?
I tried the syntax at several SQL sandboxes, but couldn't get it to work.
It is a feature of PROC SQL in SAS to read values into SAS Macro Variables.
Only the CREATE TABLE clause specifies the name of the table the query should create from the result set delivered by the SELECT clause.
The INTO clause in SAS's implementation of SQL is different from the INTO clause implemented in other flavors of SQL.
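To make the effect of INTO :name SEPARATED BY ' ' concrete: each macro variable ends up holding one string that concatenates the values of one selected expression across all rows. The Python sketch below imitates that outcome with sqlite3 and str.join; it is an analogy rather than SAS behavior, and the contents of the vars table are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vars (name TEXT, newname TEXT)")
conn.executemany(
    "INSERT INTO vars VALUES (?, ?)",
    [("age", "age_n"), ("height", "height_n")],
)

# One result row per variable, like the SELECT in the question.
rows = conn.execute(
    "SELECT trim(name), trim(newname) FROM vars ORDER BY name"
).fetchall()

# Each "macro variable" is a single space-separated string, not a table.
c_list = " ".join(name for name, _ in rows)
n_list = " ".join(newname for _, newname in rows)
renam_list = " ".join(f"{newname}={name}" for name, newname in rows)
```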
From SAS documentation
INTO Clause
Stores the value of one or more columns for use later in another PROC SQL query or SAS statement.
Restriction: An INTO clause cannot be used in a CREATE TABLE statement.
...
INTO macro-variable-specification
< , … macro-variable-specification>
Compare the above with Microsoft Transact-SQL INTO clause
[ INTO new_table ]
...
new_table
Specifies the name of a new table to be created, based on the columns in the select list and the rows chosen from the data source.
For these two the general patterns are
SAS: CREATE TABLE abc as SELECT ...
MS: SELECT ... INTO abc
The SAS/MS comparison for selecting a column into a variable is
SAS (: indicates a macro variable)
SELECT expression-1,...,expression-N
INTO :macro-variable-1,...,:macro-variable-N
MS (@ indicates a variable; T-SQL assigns inside the select list instead of using INTO)
SELECT @tsql-variable-1 = expression-1,...,@tsql-variable-N = expression-N
The variables the values are selected into have different scope contexts depending on system or implementation.
AFAIK there is no method to create multiple tables with a single query in SAS or ANSI SQL.
However, this is trivial within a SAS data step using a variety of methods.
The most basic:
data male female;
set sashelp.class;
if sex='M' then output male;
else if sex='F' then output female;
run;

sql query with multiple partial match condition

I have a table column that looks like the data below.
What SQL query can I use to apply multiple partial-match conditions?
Search by ID or Name:
if I search for abc, list rows A1 and A2
if I search for test, list rows A1, A2 and A3
if I search for ghj, list row A2
I tried this, but it returns nothing:
SELECT * FROM table where colB LIKE '"ID":"%abc%"'
Update: the data as text:
{"ItemId":"123","IDs":[{"ID":"abc","CodingSystem":"cs1"}],"Name":"test itemgh"}
{"ItemId":"123","IDs":[{"ID":"ghj","CodingSystem":"cs1"}],"Name":"test abc"}
{"ItemId":"123","IDs":[{"ID":"defg","CodingSystem":"cs1"}],"Name":"test 111"}
JSON parsing
Oracle
I looked into the JSON parsing capabilities of Oracle and managed to run a query like this:
select * from table t where json_exists(t.colB, '$.IDs[?(@.ID=="abc")]') or json_exists(t.colB, '$.IDs?(@.name=="abc")')
And with both conditions inside the same JSON query expression:
select * from table t where json_exists(t.colB, '$.IDs[?(@.ID=="abc" || @.name=="abc")]')
The call of function json_exists() is the key to this.
The first parameter can be a VARCHAR2, and I also tried with a BLOB containing text, and it works.
The second parameter is the path to your json object attribute that needs to be tested, with the condition.
I wrote two ORed conditions for the ID and for the Name, but maybe there is a better JSON query expression you can use to include them both.
More information is available in the Oracle documentation for the json_exists() function.
Postgres
There is a JSON datatype in Postgres that supports parsing in queries.
So, if your colB column is declared as JSON you can do something like this:
select * from table where colB->>'Name' LIKE '%abc%';
To access the elements of the IDs array, use the function json_array_elements():
select * from table, json_array_elements(colB->'IDs') e where colB->>'Name' LIKE '%abc%' or e->>'ID' = 'abc';
Check an example I created for you here.
Here is an online tool for online testing your JSON queries.
Check also this question in SO.
MSSQL Server 2017
I also ran a couple of tests with MS SQL Server and managed to create an example that searches for a partial match in the Name field.
select * from table where JSON_VALUE(colB,'$.Name') LIKE '%abc%';
Finally, I arrived at a working query that does a partial match on the Name field and a full match on the ID field, like this:
select * from table t
CROSS APPLY OPENJSON(colB, '$.IDs') WITH (
ID VARCHAR(10),
CodingSystem VARCHAR(10)
) e
where JSON_VALUE(t.colB,'$.Name') LIKE '%abc%'
or e.ID = 'abc';
The difficulty is that we need to open the IDs array and turn it into something like a table, so that its columns can be queried as well.
The example I created is here.
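The matching rule that the three database-specific queries implement (exact match on any ID inside the IDs array, partial match on Name) can also be written down independently of any database. A minimal Python sketch using the standard json module; the row labels A1 to A3 follow the question, while the function names matches and search are made up for illustration:

```python
import json

# The three sample documents from the question, labelled as in the question.
rows = {
    "A1": '{"ItemId":"123","IDs":[{"ID":"abc","CodingSystem":"cs1"}],"Name":"test itemgh"}',
    "A2": '{"ItemId":"123","IDs":[{"ID":"ghj","CodingSystem":"cs1"}],"Name":"test abc"}',
    "A3": '{"ItemId":"123","IDs":[{"ID":"defg","CodingSystem":"cs1"}],"Name":"test 111"}',
}


def matches(doc_text, term):
    """True if `term` equals any ID in the IDs array or occurs inside Name."""
    doc = json.loads(doc_text)
    id_hit = any(entry.get("ID") == term for entry in doc.get("IDs", []))
    name_hit = term in doc.get("Name", "")
    return id_hit or name_hit


def search(term):
    """Return the labels of all rows that match `term`."""
    return [label for label, text in rows.items() if matches(text, term)]
```

With this rule, search("abc") finds A1 through its ID and A2 through its Name, which is the behavior the question asks for.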
LIKE text query
Your attempt is close, but the % symbols are misplaced. They have to be the first and last characters of the pattern:
If you want the ID to be the given value:
SELECT * FROM table where colB LIKE '%"ID":"abc"%'
If the given value can be anywhere, then don't put the "ID" part:
SELECT * FROM table where colB LIKE '%abc%'
If the given value can be only on the ID or Name field then:
SELECT * FROM table where colB LIKE '%"ID":"abc"%' OR colB LIKE '%"Name":"abc"%'
Because the hard-coded field names (e.g. ID and Name) may appear in varying case, compare against the lowercased text:
SELECT * FROM table where lower(colB) LIKE '%"id":"abc"%' OR lower(colB) LIKE '%"name":"abc"%'
This assumes that the number of spaces between the : character and the property name or value does not vary.
For partial matching you can use more % symbols in between, like '%"name":"%abc%"%':
SELECT * FROM table where lower(colB) LIKE '%"id":"abc"%' OR lower(colB) LIKE '%"name":"%abc%"%'
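The effect of where the % symbols sit can be checked against the sample rows from the question. The sketch below uses Python's sqlite3 module as a stand-in database (an assumption; its LIKE is case-insensitive for ASCII by default, which may differ from your system), and the table name items and the helper like are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (rowname TEXT, colB TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [
        ("A1", '{"ItemId":"123","IDs":[{"ID":"abc","CodingSystem":"cs1"}],"Name":"test itemgh"}'),
        ("A2", '{"ItemId":"123","IDs":[{"ID":"ghj","CodingSystem":"cs1"}],"Name":"test abc"}'),
        ("A3", '{"ItemId":"123","IDs":[{"ID":"defg","CodingSystem":"cs1"}],"Name":"test 111"}'),
    ],
)


def like(pattern):
    """Return the labels of rows whose colB matches the LIKE pattern."""
    sql = "SELECT rowname FROM items WHERE colB LIKE ? ORDER BY rowname"
    return [r[0] for r in conn.execute(sql, (pattern,))]


original = like('"ID":"%abc%"')              # no leading %: text must start with "ID":"
by_id = like('%"ID":"abc"%')                 # exact ID value anywhere in the text
anywhere = like('%abc%')                     # value anywhere in the text
name_partial = like('%"Name":"%abc%"%')      # abc somewhere after "Name":"
```

The last pattern is only approximate: % can run across field boundaries, so it would also match a document in which abc appears in any field that follows Name.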
Regular Expressions
A different option would be to test with regular expressions.
Consider checking this: Oracle extract json fields using regular expression with oracle regexp_substr

Verify if the second character is a letter in SQL

I want to add a condition to my query that requires the second character of a column's value to be a letter.
How can I achieve this?
I've tried _[A-Z]% in the where clause, but it is not working. I've also tried [A-Z]%.
Any suggestions, please?
I think you want a MySQL query, like this:
SELECT * FROM table WHERE column REGEXP '^.[A-Za-z]'
or, for SQL Server:
select * from table where column like '_[a-zA-Z]%'
You can use regular expression matching in your query. For example:
SELECT * FROM `test` WHERE `name` REGEXP '^.[a-zA-Z].*';
That would match the name column from the test table against a regex that verifies if the second character is either a lowercase or uppercase alphabet letter.
Also see this SQL Fiddle for an example of data it does and doesn't match.
I agree with @Gordon Linoff; your pattern ('_[A-Z]%') should work.
If it does not, please add some sample data to your question.
Declare @Table Table
(
TextCol Varchar(20)
)
Insert Into @Table(TextCol) Values
('23423cvxc43f')
,('2eD97S9')
,('sAgsdsf')
,('3Ss08008')
Select *
From @Table As t
Where t.TextCol Like '_[A-Z]%'
The use of '%[A-Z]%' suggests that you are using SQL Server. If so, you can do this using LIKE:
where col like '_[A-Z]%'
For LIKE patterns, _ represents any single character. If the first character needs to be a digit:
where col like '[0-9][A-Z]%'
EDIT:
The above doesn't work in DB2. Instead:
where substr(col, 2, 1) between 'A' and 'Z'
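The patterns in these answers can be tried out against the sample values from the table-variable example above. A small Python sketch with the standard re module, assuming that letters of either case should count (in SQL Server that depends on the column's collation); the constant names are made up:

```python
import re

# Any first character, then a letter in second position.
SECOND_IS_LETTER = re.compile(r"^.[A-Za-z]")
# A digit in first position, then a letter in second position.
DIGIT_THEN_LETTER = re.compile(r"^[0-9][A-Za-z]")

samples = ["23423cvxc43f", "2eD97S9", "sAgsdsf", "3Ss08008"]

second_letter = [s for s in samples if SECOND_IS_LETTER.match(s)]
digit_letter = [s for s in samples if DIGIT_THEN_LETTER.match(s)]
```

Only '23423cvxc43f' is rejected by the first pattern, because its second character is the digit 3.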

TERADATA - How to split a character column and keep the last token?

I have a table with article names and I would like to select the last word of each article of the table.
Right now I'm doing it in SAS and my code looks like:
PROC SQL;
CREATE TABLE last_word as
SELECT scan(names,-1) as last_w
FROM articles;
QUIT;
I am aware of the STRTOK function in TERADATA, but it seems that it only accepts positive values as indexes, and in my case the article names don't have a constant number of words.
You could use function REGEXP_SUBSTR to do this:
CREATE TABLE last_word as
SELECT REGEXP_SUBSTR(names, '[^,]+$') as last_w
FROM articles;
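The regex here grabs the last element of a comma-delimited list. Because the question is about words, which are separated by spaces rather than commas, use '[^ ]+$' as the pattern to get the last word.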
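How the choice of delimiter inside the character class changes the result can be seen with Python's re module, which handles these two simple patterns the same way. The article name used below is invented, and last_word is a hypothetical helper:

```python
import re


def last_word(text):
    """Return the trailing run of non-space characters, or None if there is none."""
    m = re.search(r"[^ ]+$", text)
    return m.group(0) if m else None


name = "red apple pie"

# Space-delimited: only the last word is returned.
space_version = last_word(name)

# Comma-delimited pattern on text without commas: the whole string is returned.
comma_version = re.search(r"[^,]+$", name).group(0)
```

Unlike scan(names,-1) in SAS, this pattern finds nothing when the text ends in a space, so trailing blanks may need to be trimmed first.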

Sorting Table Variables by Prefix/Starting Letter

This is for a SAS table, so SQL commands would work, as well.
I have a table with 300 variables; they have 5 different prefixes, which I would like to sort them by. I want them in a particular order (mtr prefix before date prefix), but alphabetical would be acceptable.
I was thinking SQL would have something along the lines of:
Select mtr*, date* from Table
or
Select mtr%, date% from Table
As gbn says, you'll need to get the column names and dynamically build some sql (or data step code).
Here's a solution that retrieves the column names from an automatic SAS view that holds metadata about your session, ordered alphabetically, into a single macro variable which you can then use later in your code:
proc sql noprint;
select name into :orderedVarNames separated by ','
from sashelp.vcolumn
where libname='WORK' and memname='YOUR_TABLE_NAME'
order by name
;
quit;
(Obviously you'll need to replace the quoted values with the correct libname and table name for your table.) Then you can use this macro variable in another step, like this:
proc sql;
select &orderedVarNames
from YOUR_TABLE_NAME
;
quit;
Here, "&orderedVarNames" is resolved to the list of column names. You can check what is in the variable by putting it out to the log thus: %put &orderedVarNames;
There are other ways to do what you're thinking of, but this is probably the quickest and will work for any table. If you were going to use this technique for a variable list in a data step, change the separator to separated by ' '.
Once you've got the hang of this, you could then tailor the solution to get the exact order you want by generating more than one macro variable and filtering what you're retrieving from sashelp.vcolumn. Something like this:
proc sql noprint;
select name into :orderedMTRvars separated by ','
from sashelp.vcolumn
where libname='WORK' and memname='MYTABLE' and upcase(substr(name,1,3))='MTR'
order by name
;
select name into :orderedDATEvars separated by ','
from sashelp.vcolumn
where libname='WORK' and memname='MYTABLE' and upcase(substr(name,1,4))='DATE'
order by name
;
quit;
proc sql;
select &orderedMTRVars, &orderedDATEVars
from MYTABLE
;
quit;
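The same idea (read the column names from metadata, group them by prefix, then build the SELECT list) carries over to other environments. A minimal Python sketch with sqlite3, where PRAGMA table_info plays the role of sashelp.vcolumn; the table mytable, its columns, and the helper ordered_columns are assumptions made for the example:

```python
import sqlite3


def ordered_columns(conn, table, prefixes):
    """Return column names grouped by prefix (in the given order), sorted within each group."""
    names = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    ordered = []
    for prefix in prefixes:
        # Case-insensitive prefix test, alphabetical order inside each group.
        group = sorted(n for n in names if n.lower().startswith(prefix.lower()))
        ordered.extend(group)
    return ordered


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (date_b TEXT, mtr_2 REAL, date_a TEXT, mtr_1 REAL)")

cols = ordered_columns(conn, "mytable", ["mtr", "date"])
query = f"SELECT {', '.join(cols)} FROM mytable"
result_names = [d[0] for d in conn.execute(query).description]
```

Passing the prefixes as a list keeps the group order explicit, so mtr columns come before date columns regardless of how the table was created.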