What is the meaning of .WORK in SAS EG and can it be removed from my table names without consequences - sql

When SAS EG creates a query in the query builder, it puts "work." in front of table names. Here is an example:
%_eg_conditional_dropds(WORK.QUERY_FOR_UNL_OBLIGATIONSBEHOLDN);
PROC SQL;
CREATE TABLE WORK.QUERY_FOR_UNL_OBLIGATIONSBEHOLDN AS
SELECT t1.CUSTOM_1,
t1.CUSTOM_2,
/* REPORTING_AMOUNT */
(SUM(t1.REPORTING_AMOUNT)) AS REPORTING_AMOUNT,
t1.LINE_ITEM,
t1.CUSTOM_5
FROM WORK.UNL_OBLIGATIONSBEHOLDNING t1
WHERE t1.CUSTOM_5 IN
(
'VLIK9035_POS_NOTE',
'VLIK9023_POS_COVERED_BOND'
) AND t1.CUSTOM_1 BETWEEN '20500000' AND '20599999' AND t1.LINE_ITEM NOT ='orphans'
GROUP BY t1.CUSTOM_1,
t1.CUSTOM_2,
t1.LINE_ITEM,
t1.CUSTOM_5;
QUIT;
If I remove "WORK." from both the created table and the queried table, nothing changes; as far as I know, it works just as well as before.
What does it mean when a table is named WORK.?

Generally, a table is identified by a library name and a table name. A library can contain several tables. So, the normal form for identifying a table is [library].[table]. If you omit the library name, SAS interprets the one-level name as work.[table] (unless a USER library has been assigned), so you can remove 'work.' and nothing will change.

Work is a temporary library. So yes, you can remove the work. part of your code.

Related

Add Column to SAS via Proc SQL Statement

I haven't been able to find this exact question - but it seems simple enough that it's likely been asked before. I apologize in advance if my search skills aren't up to par...
Anyhow, I am trying to create a 'source_flag' column, appended to several tables I'm creating. Basically, each year and payment type has its own table. I can query and manipulate each table individually, but I'm joining them all together (full join) at the end of the process. I want to create a column with each observation equal to the table the data came from.
For example, I want to join six tables:
2019_PD
2020_PD
2019_PB
2020_PB
2019_PN
2020_PN
All I want to do is, in the query for each table, create a column assigning the table name to the entire row, so that I know where each row came from.
proc sql;
create table 2020_PD as select
...,
...,
...,
"2020_PD" as source_flg,
.
.
.
;
quit;
Right now SAS is trying to find a field called 2020_PD - which obviously doesn't exist. Is there an easy way to do this within the proc statement? I'm not trying to add additional data steps since I'm doing this with too many tables to make that viable.
Thank you!!
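In standard SQL, single quotes delimit string literals. So use:
'2020_PD' as source_flg,
Under ANSI rules, double quotes delimit an identifier rather than a string, which is why you are getting an unknown column error. (By default, PROC SQL accepts double-quoted values as string literals; the ANSI behavior applies when the DQUOTE=ANSI option is set or when the query is passed through to a database.)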
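As a minimal sketch of the source-flag idea, the script below uses Python's standard-library sqlite3 module as a stand-in for PROC SQL (an assumption; SAS syntax and naming rules differ). The table names come from the question, while the amount column, the sample rows and the tag_and_stack helper are hypothetical, and the tables are stacked with UNION ALL for brevity rather than full-joined.

```python
import sqlite3

def tag_and_stack(conn, table_names):
    """Stack the given tables, adding a source_flg column with the table name."""
    parts = []
    for name in table_names:
        # Single quotes make a string literal; double quotes make an identifier.
        parts.append(f"SELECT amount, '{name}' AS source_flg FROM \"{name}\"")
    query = " UNION ALL ".join(parts) + " ORDER BY source_flg, amount"
    return conn.execute(query).fetchall()

# Hypothetical sample data: one numeric column per table.
conn = sqlite3.connect(":memory:")
for name, amounts in {"2019_PD": [10.0], "2020_PD": [20.0, 25.0]}.items():
    conn.execute(f'CREATE TABLE "{name}" (amount REAL)')
    conn.executemany(f'INSERT INTO "{name}" VALUES (?)', [(a,) for a in amounts])

rows = tag_and_stack(conn, ["2019_PD", "2020_PD"])
print(rows)
```

Every row in the result carries the name of the table it came from, so the origin survives whatever join or union follows.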

Create a Global Persistent List of Strings as Variable

I'm using SQL Server 2008 R2, in MS SQL Server Management Studio. I've only ever done selecting and all the standard stuff, but I find myself frequently using the same lists of strings for different queries and I'd like to be able to build a variable that holds them. I don't have the access rights to create a new table, otherwise I would just do that. Is this even possible?
Let's say I have a bunch of client numbers that I want to use to include only their client account data with a query, example:
SELECT * FROM SALES
WHERE CLIENTNUMBER IN ('123','456','789')
Is there a way to create a variable that will hold those 3 values, so that I can instead just say
SELECT * FROM SALES
WHERE CLIENTNUMBER IN @CLOTHING_CLIENTS
The list is longer than 3 client numbers of course. And there are different categories etc. I think it would be MUCH simpler to do as a separate table but of course I don't have the ability to create new tables. I could do JOINs and the like too but that's getting even more work than just putting in the client numbers each time.
I'm trying to simplify things and make it more readable for other people, not make it more efficient for the database or more "correct".
There are a couple of ways you can do this, involving temp tables or table variables. Try something like this:
declare @CLOTHING_CLIENTS table (ClientNumber varchar(20) not null);
-- Your list of values goes here
insert into @CLOTHING_CLIENTS (ClientNumber)
values ('123')
,('456')
,('789');
select * from Sales
where ClientNumber in (select ClientNumber from @CLOTHING_CLIENTS);
The @CLOTHING_CLIENTS variable can be used again anywhere in the same batch that it was created in. This post does a good job explaining the scope of table variables.
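The pattern of keeping the list in one place and filtering with a subquery can be tried out without a SQL Server instance. The sketch below is only an approximation: it uses Python's sqlite3 module, and a session-scoped TEMP table stands in for the T-SQL table variable (SQLite has no table variables). The Sales rows and the Amount column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (ClientNumber TEXT, Amount REAL)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?)",
    [("123", 10.0), ("456", 20.0), ("999", 30.0)],
)

# Session-scoped list of client numbers, defined once and reused below.
conn.execute("CREATE TEMP TABLE clothing_clients (ClientNumber TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO clothing_clients VALUES (?)", [("123",), ("456",), ("789",)]
)

# The query refers to the list by name instead of repeating the values.
matching = conn.execute(
    "SELECT ClientNumber, Amount FROM Sales "
    "WHERE ClientNumber IN (SELECT ClientNumber FROM clothing_clients) "
    "ORDER BY ClientNumber"
).fetchall()
print(matching)
```

Any later query on the same connection can reuse clothing_clients, which mirrors how the table variable is reused within one batch.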
Good news: there are global things in T-SQL!
An extension of Jeff Lewis's answer that makes things a bit more 'global' is to use the ## type of table.
Assuming you can create them, a ##Table is a temporary table that can be accessed by other connections and even from other databases. Just make sure you're on the same server.
So you can do this:
CREATE TABLE ##MyValues(A INT)
INSERT INTO ##MyValues(A) VALUES (1)
Once done you can go anywhere and do
SELECT * FROM ##MyValues
Now all you need to do is update your snippets to do things like
SELECT * FROM SALES AS S
INNER JOIN ##MyClientIDTable AS MCIT ON MCIT.CLIENTNUMBER = S.CLIENTNUMBER
Just make sure your ##MyClientIDTable has a CLIENTNUMBER column in it and the correct data.
Hope this helps a bit.

Hide Empty columns

I have a table with 75 columns. What is the SQL statement to display only the columns that have values in them?
thanks
It's true that no such statement exists (in a SELECT you can filter rows with conditions, but not columns). But you could write a (somewhat tricky) procedure. It must check, using queries, which columns contain at least one non-NULL/non-empty value. Once you have this list of columns, join them into a comma-separated string and compose a query that you can run, returning what you wanted.
EDIT: I thought about it and I think you can do it with a procedure, but under one of these conditions:
find a way to retrieve the column names dynamically in the procedure, that is, from the metadata (I have never heard of it, but I'm new to procedures)
or hardcode all column names (losing generality)
You could collect the column names in an array, if the stored procedures of your DBMS support arrays (or write the procedure in a programming language like C), and loop over them, running a SELECT each time to check whether the column is empty* or not. If it contains at least one value, append it to a string in which the column names are comma-separated. Finally, you can build your query with only the non-empty columns!
As an alternative to a stored procedure, you could write a short program (e.g. in Java), which gives you more flexibility.
*If you check for NULL values it will be simple, but if you check for empty values you will need to handle each column's data type... another array with data types?
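The "short program" route described above can be sketched in Python with the standard-library sqlite3 module; this is an illustration under assumptions, not a general solution. SQLite's PRAGMA table_info stands in for whatever metadata catalog your DBMS offers, the items table and the non_empty_columns helper are hypothetical, and only NULL counts as empty here.

```python
import sqlite3

def non_empty_columns(conn, table):
    """Return the columns of `table` that hold at least one non-NULL value."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    names = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    kept = []
    for name in names:
        # COUNT(col) counts only non-NULL values.
        (count,) = conn.execute(f'SELECT COUNT("{name}") FROM "{table}"').fetchone()
        if count > 0:
            kept.append(name)
    return kept

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, label TEXT, notes TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)", [(1, "a", None), (2, None, None)]
)

# Compose the final query from the surviving column names.
columns = non_empty_columns(conn, "items")
query = "SELECT " + ", ".join(f'"{c}"' for c in columns) + " FROM items"
result = conn.execute(query).fetchall()
print(query)
print(result)
```

This runs one COUNT per column, so a 75-column table costs 75 scans; that is acceptable for a one-off inspection but not something to run on every request.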
I would suggest that you write a SELECT statement and define which COLUMNS you wish to display and then save that QUERY as a VIEW.
This will save you the trouble of typing in the column names every time you wish to run that query.
As marc_s pointed out in the comments, there is no select statement to hide columns of data.
You could do a pre-parse and dynamically create a statement to do this, but this would be very inefficient from a SQL performance perspective. I would strongly advise against what you are trying to do.
A simplified version of this is to just select the relevant columns, which was what I needed personally. A quick search of what we're dealing with in a table
SELECT * FROM table1 LIMIT 10;
-> shows 20 columns, of which I'm interested in 3. The LIMIT is just there to avoid flooding the console.
SELECT column1,column3,column19 FROM table1 WHERE column3='valueX';
It is a bit of a manual filter but it works for what I need.

SQL to search and replace in mySQL

I'm in the process of fixing a poorly imported database with issues caused by using the wrong database encoding, or something like that.
Anyway, coming back to my question: in order to fix these issues I'm using a query of this form:
UPDATE table_name SET field_name =
replace(field_name,'search_text','replace_text');
Thus, if the table I'm working on has multiple columns, I have to run this query for each of the columns. And since there is more than one pair of things to find and replace, I have to run the query for each of these pairs as well.
So as you can imagine, I end up running tens of queries just to fix one table.
What I was wondering is whether there is a way to combine multiple find-and-replace operations in one query: say, look for this set of things and, if found, replace each with the corresponding item from this other set of things.
Or if there would be a way to make a query of the form I've shown above, to run somehow recursively, for each column of a table, regardless of their name or number.
Thank you in advance for your support,
titel
Let's try and tackle each of these separately:
If the set of replacements is the same for every column in every table that you need to do this on (or there are only a couple of patterns), consider creating a user-defined function that takes a varchar and returns a varchar, and that just calls replace(replace(@input,'search1','replace1'),'search2','replace2'), nested as appropriate.
To update multiple columns at the same time you should be able to do UPDATE table_name SET field_name1 = replace(field_name1,...), field_name2 = replace(field_name2,...) or something similar.
As for running something like that for every column in every table, I'd think it would be easiest to write some code which fetches a list of columns and generates the queries to execute from that.
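The three points above (nest the replacements, update several columns in one statement, and generate the statement from a column list) can be combined in a small script. This is a sketch only: it uses Python's sqlite3 module, whose replace() takes the same three arguments as the REPLACE() in the question, and it reads column names from SQLite's PRAGMA table_info, whereas in MySQL you would presumably query information_schema.columns. The posts table and both helper functions are hypothetical.

```python
import sqlite3

def nested_replace(column, pairs):
    """Build replace(replace(col, ?, ?), ?, ?) with one level per pair."""
    expr = f'"{column}"'
    for _ in pairs:
        expr = f"replace({expr}, ?, ?)"
    return expr

def fix_table(conn, table, pairs):
    """Apply every (search, replacement) pair to every TEXT column of `table`."""
    info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    # Index 1 is the column name, index 2 the declared type.
    text_columns = [row[1] for row in info if row[2].upper() == "TEXT"]
    if not text_columns:
        return
    assignments = ", ".join(
        f'"{c}" = {nested_replace(c, pairs)}' for c in text_columns
    )
    # Placeholders appear in pair order (innermost first), once per column.
    params = [value for _ in text_columns for pair in pairs for value in pair]
    conn.execute(f'UPDATE "{table}" SET {assignments}', params)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER, title TEXT, body TEXT)")
conn.execute("INSERT INTO posts VALUES (1, 'foo see', 'see foo foo')")

fix_table(conn, "posts", [("foo", "bar"), ("see", "what")])
fixed = conn.execute("SELECT title, body FROM posts").fetchall()
print(fixed)
```

The pairs are applied in list order, so an earlier replacement can produce text that a later pair matches; order the list accordingly.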
I don't know of a way to automatically run a search-and-replace on each column, however the problem of multiple pairs of search and replace terms in a single UPDATE query is easily solved by nesting calls to replace():
UPDATE table_name SET field_name =
replace(
replace(
replace(
field_name,
'foo',
'bar'
),
'see',
'what'
),
'I',
'mean?'
);
If you have multiple replaces of different text in the same field, I recommend that you create a table with the current values and what you want them replaced with. (Could be a temp table of some kind if this is a one-time deal; if not, make it a permanent table.) Then join to that table and do the update.
Something like:
update t1
set field1 = t2.newvalue
from table1 t1
join mycrossreferncetable t2 on t1.field1 = t2.oldvalue
Sorry, I didn't notice this is MySQL. The code is what I would use in SQL Server; the MySQL syntax may be different, but the technique would be similar.
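The cross-reference technique can be sketched portably with a correlated subquery instead of the SQL Server UPDATE ... FROM form. The example below uses Python's sqlite3 module as a stand-in; the xref table name and the sample values are hypothetical. Note that, like the join above, it replaces whole field values that match oldvalue, not substrings inside a longer text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, field1 TEXT)")
conn.execute("CREATE TABLE xref (oldvalue TEXT PRIMARY KEY, newvalue TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(1, "colour"), (2, "grey"), (3, "other")],
)
conn.executemany(
    "INSERT INTO xref VALUES (?, ?)", [("colour", "color"), ("grey", "gray")]
)

# Correlated-subquery form of the join update.
# The WHERE clause leaves rows without a cross-reference entry untouched.
conn.execute(
    """
    UPDATE table1
    SET field1 = (SELECT newvalue FROM xref WHERE xref.oldvalue = table1.field1)
    WHERE field1 IN (SELECT oldvalue FROM xref)
    """
)
updated = conn.execute("SELECT id, field1 FROM table1 ORDER BY id").fetchall()
print(updated)
```

Adding another replacement then means inserting one row into the cross-reference table rather than editing the UPDATE statement.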
I wrote a stored procedure that does this. I use this on a per database level, although it would be easy to abstract it to operate globally across a server.
I would just paste this inline, but it would seem that I'm too dense to figure out how to use the markdown deal, so the code is here:
http://www.anovasolutions.com/content/mysql-search-and-replace-stored-procedure

Oracle DB simple SELECT where column order matters

I am doing a simple SELECT statement in an Oracle DB and need to select the columns in a somewhat-specific order. Example:
Table A has 100 attributes, one of which is "chapter" that occurs somewhere in the order of columns in the table. I need to select the data with "chapter" first and the remaining columns after in no particular order. Essentially, my statement needs to read something like:
SELECT a.chapter, a. *the remaining columns* FROM A
Furthermore, I cannot simply type:
SELECT a.chapter, a.*
because this will select "chapter" twice.
I know the SQL statement seems simple, but if I know how to solve this problem, I can extrapolate this thought into more complicated areas. Also, let's assume that I can't just scroll over to find the "chapter" column and drag it to the beginning.
Thanks.
You should not select * in a program. As your schema evolves, it will bring in things you do not know about yet. Think about what happens when someone adds a column with the whole book in it: the query you thought would be very cheap suddenly starts to bring in megabytes of data.
That means you have to specify every column you need.
Your best bet is just to select each column explicitly.
A quick way around this would be SELECT a.chapter AS chapterCol, a.* FROM table a; The result then starts with a column named chapterCol, followed by all columns of the table (chapter is still among them, but the names no longer clash), assuming there's not already a column named chapterCol. ;)
If you're going to embed the 'SELECT *' into program code, then I would strongly recommend against doing that. As noted by the previous authors, you're setting up the code to break if a column is ever added to (or removed from) the table. The simple advice is: don't do it.
If you're using this in development tools (viewing the data, and the like), then I'd recommend creating a view with the specific column order you need. Capture the output from 'SELECT COLUMN_NAME FROM ALL_TAB_COLUMNS' and create a select statement for the view with the column order you need.
This is how I would build your query without having to type all the names in, but with some manual effort.
Start with "Select a.chapter"
Now perform another select on your database as follows:
select ','|| column_name
from user_tab_cols
where table_name = 'YOUR_REAL_TABLE_NAME' -- the dictionary stores unquoted names in upper case
and column_name <> 'CHAPTER';
Now take the output from that, cut and paste it, and append it to what you started with. Now run that query. It should be what you asked for.
Ta-da!
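The cut-and-paste step can also be automated. The following is a rough sketch in Python with the standard-library sqlite3 module, so PRAGMA table_info plays the role of Oracle's user_tab_cols (an assumption made to keep the example self-contained). The table a, its columns and the select_with_first helper are hypothetical.

```python
import sqlite3

def select_with_first(conn, table, first):
    """Build a SELECT that lists `first`, then every other column of `table`."""
    # Index 1 of each PRAGMA table_info row is the column name.
    names = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    rest = [n for n in names if n.lower() != first.lower()]
    return f"SELECT {', '.join([first] + rest)} FROM {table}"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE a (id INTEGER, title TEXT, chapter INTEGER, pages INTEGER)"
)

query = select_with_first(conn, "a", "chapter")
# cursor.description exposes the column names of the result in order.
result_columns = [d[0] for d in conn.execute(query).description]
print(query)
print(result_columns)
```

The generated statement names every column explicitly, so it can be pasted into a view definition and will not silently change when the table gains a column.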
Unless you have a very good reason to do so, you should not use SELECT * in queries. It will break your application every time the schema changes.