How to run the same syntax on multiple SPSS files

I have 24 SPSS files in .sav format in a single folder. All these files have the same structure. I want to run the same syntax on all of these files. Is it possible to write code in SPSS for this?

You can use the SPSSINC PROCESS FILES user-submitted extension command to do this, or write your own macro. First, let's create some very simple fake data to work with.
*FILE HANDLE save /NAME = "Your Handle Here!".
*Creating some fake data.
DATA LIST FREE / X Y.
BEGIN DATA
1 2
3 4
END DATA.
DATASET NAME Test.
SAVE OUTFILE = "save\X1.sav".
SAVE OUTFILE = "save\X2.sav".
SAVE OUTFILE = "save\X3.sav".
EXECUTE.
*Creating a syntax file to call.
DO IF $casenum = 1.
PRINT OUTFILE = "save\TestProcess_SHOWN.sps" /"FREQ X Y.".
END IF.
EXECUTE.
Now we can use the SPSSINC PROCESS FILES command to specify the sav files in the folder and apply the TestProcess_SHOWN.sps syntax to each of those files.
*Now example calling the syntax.
SPSSINC PROCESS FILES INPUTDATA="save\X*.sav"
SYNTAX="save\TestProcess_SHOWN.sps"
OUTPUTDATADIR="save" CONTINUEONERROR=YES
VIEWERFILE= "save\Results.spv" CLOSEDATA=NO
MACRONAME="!JOB"
/MACRODEFS ITEMS.

Another (less advanced) way is to use the command INSERT. To do so, repeatedly GET each sav-file, run the syntax with INSERT, and save the file. Probably something like this:
get 'file1.sav'.
insert file='syntax.sps'.
save outf='file1_v2.sav'.
dataset close all.
get 'file2.sav'.
insert file='syntax.sps'.
save outf='file2_v2.sav'.
etc etc.
Good luck!

If the syntax you need to run is completely independent of the files, then you can either use INSERT FILE = 'Syntax.sps' or put the code in a macro, e.g.
Define !Syntax ()
* Put Syntax here
!EndDefine.
You can then run either of these 'manually':
get file = 'file1.sav'.
insert file='syntax.sps'.
save outfile ='file1_v2.sav'.
Or
get file = 'file1.sav'.
!Syntax.
save outfile ='file1_v2.sav'.
Or, if the files follow a reasonably strict naming structure, you can embed either of the above in a simple bit of Python:
Begin Program.
import spss
for i in range(1, 24 + 1):
    # build the commands for the i-th file, then submit them to SPSS
    syntax = "get file = 'file" + str(i) + ".sav'.\n"
    syntax += "insert file='syntax.sps'.\n"
    syntax += "save outfile ='file" + str(i) + "_v2.sav'.\n"
    print(syntax)
    spss.Submit(syntax)
End Program.
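If the names are less regular, a minimal sketch along the same lines (assuming the SPSS Python integration provides the spss module, and with a purely hypothetical folder path) could let Python's glob pick up every .sav file in the folder instead of relying on the numbering:
Begin Program.
import glob, os, spss
folder = r"C:\data\savfiles"  # hypothetical folder holding the 24 .sav files
for path in glob.glob(os.path.join(folder, "*.sav")):
    # run the shared syntax against this file, then save it under a new name
    out = path[:-4] + "_v2.sav"
    syntax = "get file = '" + path + "'.\n"
    syntax += "insert file = 'syntax.sps'.\n"
    syntax += "save outfile = '" + out + "'.\n"
    syntax += "dataset close all.\n"
    spss.Submit(syntax)
End Program.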

Related

How can I read two input files and create one output file via Python?

So I have two files and I need to create one output file from them. My prof wants me to use a while loop to process the first file and a for loop for the second file, as well as a try/except block to read from the input and write to the output. I think I have a general idea for the initial code, but I'm still lost.
#reads the file
n1 = open('nameslist1.txt', 'r')
n2 = open('nameslist2.txt', 'r')
print(n1.read())
print(n2.read())
n1.close()
n2.close()
#writes the file
n1_o = open('allnames.txt', 'w')
n2_o = open('allnames.txt', 'w')
n1_o.write('nameslist1.txt')
n2_o.write('nameslist2.txt')
n1_o.close()
n2_o.close()
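A minimal sketch of the structure the assignment describes, reusing the file names from the question (note that opening allnames.txt twice in 'w' mode, as above, truncates what the first handle wrote, so the output file is opened only once here):
# rough sketch only: a while loop for the first file, a for loop for the
# second, and a try/except around the reading and writing
try:
    with open('allnames.txt', 'w') as out:
        with open('nameslist1.txt', 'r') as n1:
            line = n1.readline()
            while line:              # while loop for the first input file
                out.write(line)
                line = n1.readline()
        with open('nameslist2.txt', 'r') as n2:
            for line in n2:          # for loop for the second input file
                out.write(line)
except IOError as err:
    print('Could not read or write a file:', err)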

How to read the contents of an .sql file into an R script to run a query?

I have tried the readLines and the read.csv functions but they don't work.
Here is the contents of the my_script.sql file:
SELECT EmployeeID, FirstName, LastName, HireDate, City FROM Employees
WHERE HireDate >= '1-july-1993'
and it is saved on my Desktop.
Now I want to run this query from my R script. Here is what I have:
conn = connectDb()
fileName <- "C:\\Users\\me\\Desktop\\my_script.sql"
query <- readChar(fileName, file.info(fileName)$size)
query <- gsub("\r", " ", query)
query <- gsub("\n", " ", query)
query <- gsub("", " ", query)
recordSet <- dbSendQuery(conn, query)
rate <- fetch(recordSet, n = -1)
print(rate)
disconnectDb(conn)
And I am not getting anything back in this case. What can I try?
I've had trouble reading SQL files myself, and have found that oftentimes the syntax gets broken if there are any single-line comments in the SQL. Since in R you store the SQL statement as a single-line string, any double dash in the SQL will essentially comment out all code after it.
This is a function that I typically use whenever I am reading in a .sql file to be used in R.
getSQL <- function(filepath){
  con = file(filepath, "r")
  sql.string <- ""

  while (TRUE){
    line <- readLines(con, n = 1)

    if (length(line) == 0){
      break
    }

    line <- gsub("\\t", " ", line)

    if (grepl("--", line) == TRUE){
      line <- paste(sub("--", "/*", line), "*/")
    }

    sql.string <- paste(sql.string, line)
  }

  close(con)
  return(sql.string)
}
I've found for queries with multiple lines, the read_file() function from the readr package works well. The only thing you have to be mindful of is to avoid single quotes (double quotes are fine). You can even add comments this way.
Example query, saved as query.sql
SELECT
  COUNT(1) as "my_count"
  -- comment goes here
FROM -- tabs work too
  my_table
I can then store the results in a data frame with
df <- dbGetQuery(con, statement = read_file('query.sql'))
You can use the read_file() function from the readr package.
fileName = read_file("C:/Users/me/Desktop/my_script.sql")
You will get a string variable fileName with the desired text.
Note: use / instead of \\ in the file path.
The answer by Matt Jewett is quite useful, but I wanted to add that I sometimes encounter the following warning when trying to read .sql files generated by sql server using that answer:
Warning message: In readLines(con, n = 1) : line 1 appears to contain
an embedded nul
The first line returned by readLines is often "ÿþ" in these cases (i.e. the UTF-16 byte order mark) and subsequent lines are not read properly. I solved this by opening the sql file in Microsoft SQL Server Management Studio and selecting
File -> Save As ...
then on the small downarrow next to the save button selecting
Save with Encoding ...
and choosing
Unicode (UTF-8 without signature) - Codepage 65001
from the Encoding dropdown menu.
If you do not have Microsoft SQL Server Management Studio and are using a Windows machine, you could also try opening the file with the default text editor and then selecting
File -> Save As ...
Encoding: UTF-8
to save with a .txt file extension.
Interestingly, changing the file within Microsoft SQL Server Management Studio removes the BOM (byte order mark) altogether, whereas changing the file within the text editor converts the BOM to the UTF-8 BOM; either way, the query is then read properly using the referenced answer.
The combination of readr and textclean works well without having to create any new functions. read_file() reads the file into a character vector and replace_white() ensures all escape-sequence characters are removed from your .sql file. Note: this does cause problems if you have comments in your SQL string!
library(readr)
library(textclean)
SQL <- replace_white(read_file("file_path"))

How to print the variable loaded and assigned from file

I am just learning machine learning using Octave. I want to load the data from the file and assign it to one variable, then print the data to the console. The data in the data.txt file is a matrix with several rows and two columns.
data = load('data.txt');
x = data(:, 1);
y = data(:, 2);
printf x;
printf y;
After executing the code, only the literal names x and y show up on the console, which is not what I expected. I just want to check the data loaded from the file. How do I print it? Am I using the wrong command?
Just type data in the console and press Enter. This will print the contents of the data variable with all its elements. Do NOT add a semicolon at the end of the line; that would suppress the output.
The explicit way of doing this is the command
display(data)

Why doesn't io:write() write to the output file?

I'm writing a short script in Lua to replicate Search/Replace functionality. The goal is to enter a search term and a replacement term, and it will comb through all the files of a given extension (not input-determined yet) and replace the Search term with the Replacement term.
Everything seems to do what it's supposed to, except the files are not actually written to. My Lua interpreter (compiled by myself in Pelles-C) does not throw any errors or exit abnormally; the script completes as if it worked.
At first I didn't have i:flush(), but I added it after reading that it is supposed to save any written data to the file (see the Lua docs). It didn't change anything, and the files are still not written to.
I think it might have something to do with how I'm opening the file to edit it, since the "w" option works (but overwrites everything in my test files).
Source:
io.write("Enter your search term:")
term = io.read()
io.write("Enter your replace term:")
replacement = io.read()
io.stdin:read()
t = {}
for z in io.popen('dir /b /a-d'):lines() do
if string.match(string.lower(z), "%.txt$") then
print(z)
table.insert(t, z)
end
end
print("Second loop")
for _, w in pairs(t) do
print(w)
i = io.open(w, "r+")
print(i)
--i:seek("set", 6)
--i:write("cheese")
--i:flush()
for y in i:lines() do
print(y)
p, count = string.gsub(y, term, replacement, 1)
print(p)
i:write(p)
i:flush()
io.stdin:read()
end
i:close()
end
This is the output I get (which is what I want to happen), but in reality it isn't being written to the file.
There was one time when it wrote output to a file, but it only wrote to one file, and after that write my script crashed with the message: No error. The line number it reported was the for y in i:lines() do line, but I don't know why it broke there. I've noticed file:lines() will break if the file itself has nothing in it and give an odd/gibberish error, but there are things in my text files.
Edit1
I tried doing this in my for loop:
for y in i:lines() do
    print(y)
    p, count = string.gsub(y, term, replacement, 1)
    print(p)
    i:write(p)
    i:seek("set", 3) --New
    i:write("TESTESTTEST") --New
    i:flush()
    io.stdin:read()
end
in order to see if I could force it to write regular text. It does, but then it crashes with No error and still doesn't write the replacement string (just TESTESTTEST). I don't know what the problem could be.
I suspect you can't write to a file while traversing its lines, i.e. this pattern does not work:
for y in i:lines() do
    i:write(p)
    i:flush()
end
Collect the replaced lines first and write them back after the loop has finished.

SAS - Reading a File Backwards?

I need SAS to read many large log files, which are set up to have the most recent activities at the bottom. All I need is the most recent time a particular activity occurred, and I was wondering if it's possible for SAS to skip parsing the (long) beginning parts of the file.
I looked online and found how to read a dataset backwards, but that would require SAS to first parse everything in the .log file into a dataset. Is it possible to directly read the file starting from the very end so that I can stop the data step as soon as I find the most recent activity of a particular type?
I read up on infile as well, and the firstobs option, but I have no idea how long these log files are until they are parsed, right? Sounds like a catch-22 to me. So is what I'm describing doable?
I'd probably set up a filename pipe statement to use an operating system command like tail -r or tac to present the file in reverse order to SAS. That way SAS can read the file normally and you don't have to worry about how long the file is.
If you mean parsing a SAS log file, I am not sure reading the log file backward is worth the trouble in practice. For instance, the following code executes in less than a tenth of a second on my PC, and it is writing and reading a 10,000-line log file. How big are your log files and how many are there? Also, as shown below, you don't have to "parse" everything on every line. You can selectively read some parts of the line, and if it is not what you are looking for, then you can just go to the next line.
%let pwd = %sysfunc(pathname(WORK));
%put pwd=&pwd;
x cd &pwd;

/* test file. more than 10,000 line log file */
data _null_;
  file "test.log";
  do i = 1 to 1e4;
    r = ranuni(0);
    put r binary64.;
    if r < 0.001 then put "NOTE: not me!";
  end;
  put "NOTE: find me!";
  do until (r < 0.1);
    r = ranuni(0);
    put r binary64.;
  end;
  stop;
run;

/* find the last line that starts with
   NOTE: and get the rest of the line. */
data _null_;
  length msg $80;
  retain msg;
  infile "test.log" lrecl=80 eof=eof truncover;
  input head $char5. #;
  if head = "NOTE:" then input #6 msg $char80.;
  else input;
  return;
eof:
  put "last note was on line: " _n_;
  put "and msg was: " msg $80.;
run;

/* on log
last note was on line: 10013
and msg was: find me!
*/