Using a table made from input file Lua - input

I have a text file with contents like this:
Jack 17
Will 16
Jordan 15
Elsie 16
You get the idea, it's a list of people's names with their ages.
I have a program that reads the file in. Like so:
file = io.open("ages.txt")
for line in file:lines() do
    local name, age = line:match("(%a+) (%d+)")
    print(age) --Not exactly what I want
end
file:close()
print(age) gives me the ages of all the people, without names. It runs for every line, as expected, since it's inside the loop. (As an aside, why does it not work outside the loop? It gives me nil there.)
What I want to do is load the data into a table. This way, if I want to know Jack's age, I can write print(Jack.age) and it will give me 17. How can this program be constructed to support this functionality?

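Perhaps you are looking for something like this to build a table in the loop. (As for the aside: name and age are declared local inside the loop body, so they no longer exist once the loop ends, and print(age) outside the loop looks up an unset global, which is nil.)
file = io.open("ages.txt")
names = {}
for line in file:lines() do
    local n, a = line:match("(%a+) (%d+)")
    if n then -- skip lines that do not match "Name 17"
        names[n] = {age = tonumber(a)} -- store the age as a number
    end
end
file:close()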
Here is a sample interaction:
> print(names.Will.age)
16
> print(names.Jordan.age)
15
> print(names.Elsie.age)
16
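For comparison only, here is a minimal sketch of the same idea in Python rather than Lua: a dictionary keyed by name, with one small record per person. The in-memory `lines` list is an assumption that stands in for reading ages.txt.

```python
import re

# Sample lines standing in for the contents of ages.txt (an assumption).
lines = ["Jack 17", "Will 16", "Jordan 15", "Elsie 16"]

names = {}
for line in lines:
    match = re.match(r"([A-Za-z]+) (\d+)", line)
    if match:  # skip lines that do not look like "Name 17"
        names[match.group(1)] = {"age": int(match.group(2))}

print(names["Jack"]["age"])
```

As in the Lua version, the lookup goes through the container (names["Jack"]) rather than through a variable called Jack.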

Related

Print multiple data sets in a range

I've made a small program that is supposed to read data in a range, feed it into an object-oriented program, and then return the full data set. The issue is that when I run the file it only returns data for the third procedure.
I tried printing the other procedure sets but I don't know how to do that. I'm thinking this will only work if I change the names from generic to specific, that is, instead of using procedure_name for all of them, use separate names for procedures 1, 2, and 3.
for i in range(3):
    procedure_name = ('Physical Exam')
    date_of = ("Nov 6th 2022")
    doctor = ('Dr. Irvine')
    charge = ('$ 250.00')
    procedure_name = ('X-ray')
    date_of = ("Nov 6th 2022")
    doctor = ('Dr. Jamison')
    charge = ('$ 500.00')
    procedure_name = ('Blood test')
    date_of = ("Nov 6th 2022")
    doctor = ('Dr. Smith')
    charge = ('$ 200.00')
    procedure = HW6_RODRIGUEZ_1.Procedure(procedure_name,date_of,doctor,charge)
    print(f'Procedure {i+1}')
    print(procedure)
    print(i, end=" ")
if __name__ == '__main__':
    main()
So, I think you may have misunderstood some things when it comes to variables, OOP and looping.
When you define a variable, that variable is set to the last value it is assigned. So if you have the following code:
a = 1
a = 2
a = 3
The final value of the variable 'a' will be 3, as that is the last value it is assigned.
As for loops, whatever you have written in a for loop will be repeated for a specified number of times. This means if you want to write a loop that prints "hello" 5 times, you'd write the following:
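for i in range(5):
    print("hello")
What your loop is essentially doing is overwriting the same four variables three times in every iteration, so by the time the object is created only the last set of values (the blood test) is left. It never creates a separate object for each procedure.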
When it comes to creating an object that holds these values, you first need to write the code for your class. Your class can have attributes like the variables you've listed. It could look something like this:
class procedure:
    def __init__(self, procedure_name, date_of, doctor, charge):
        self.procedure_name = procedure_name
        self.date_of = date_of
        self.doctor = doctor
        self.charge = charge
Now, to set up a procedure object, you call procedure with the desired values as arguments and assign the result to a variable, like so:
new = procedure('X-ray','Nov 6th 2022','Dr. Jamison','$ 500.00')
And to access an attribute, you just need to write objectName.attribute. For example, using the object I just set up:
print(new.doctor)
Would output 'Dr. Jamison'.
If you want to store a bunch of them, I would recommend storing them in a list or a dictionary, depending on how you want to look them up.
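A minimal sketch of the list approach, assuming a `Procedure` class shaped like the one above. The `__str__` method and the `records` list are illustrative additions, not part of the asker's HW6_RODRIGUEZ_1 module.

```python
class Procedure:
    """Hypothetical stand-in for the asker's own Procedure class."""

    def __init__(self, procedure_name, date_of, doctor, charge):
        self.procedure_name = procedure_name
        self.date_of = date_of
        self.doctor = doctor
        self.charge = charge

    def __str__(self):
        # Illustrative formatting so that print(procedure) shows the data.
        return f"{self.procedure_name}, {self.date_of}, {self.doctor}, {self.charge}"


# One tuple per procedure, instead of reassigning the same variables.
records = [
    ("Physical Exam", "Nov 6th 2022", "Dr. Irvine", "$ 250.00"),
    ("X-ray", "Nov 6th 2022", "Dr. Jamison", "$ 500.00"),
    ("Blood test", "Nov 6th 2022", "Dr. Smith", "$ 200.00"),
]

# Build one object per record and keep them all in a list.
procedures = [Procedure(*record) for record in records]

for number, procedure in enumerate(procedures, start=1):
    print(f"Procedure {number}")
    print(procedure)
```

Each pass through the loop now works on a different object, so all three procedures are printed rather than the last one three times.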
I hope this helps! If you are new to programming, I would recommend some simpler programs such as a program that prints the nursery rhyme 10 green bottles using loops, or maybe making a quiz.
Best of luck.

Sqldf in R - error with first column names

Whenever I use read.csv.sql I cannot select from the first column, and any output from the code places unusual characters (A(tilde)-..) at the beginning of the first column's name.
So suppose I create a df.csv file in Excel that looks something like this
df = data.frame(
a = 1,
b = 2,
c = 3,
d = 4)
Then if I use sqldf to query the csv which is in my working directory I get the following error:
> read.csv.sql("df.csv", sql = "select * from file where a == 1")
Error in result_create(conn@ptr, statement) : no such column: a
If I query a different column than the first, I get a result but with the output of the unusual characters as seen below
df <- read.csv.sql("df.csv", sql = "select * from file where b == 2")
View(df)
Any idea how to prevent these characters from being added to the first column name?
The problem is presumably that you have a file that is larger than R can handle, so you only want to read a subset of rows into R. Specifying the condition to filter by involves referring to the first column, whose name is messed up, so you can't use it.
Here are two alternative approaches. The first one involves a bit more code but has the advantage that it is 100% R. The second one is only one statement and also uses R but additionally makes use of an external utility.
1) skip header. Read the file in, skipping over the header. That will cause the columns to be labelled V1, V2, etc., so use V1 in the condition.
# write out a test file - BOD is a data frame that comes with R
write.csv(BOD, "BOD.csv", row.names = FALSE, quote = FALSE)
# read file skipping over header
DF <- read.csv.sql("BOD.csv", "select * from file where V1 < 3",
skip = 1, header = FALSE)
# read in header, assign it to DF and fix first column
hdr <- read.csv.sql("BOD.csv", "select * from file limit 0")
names(DF) <- names(hdr)
names(DF)[1] <- "TIME" # suppose we want TIME instead of Time
DF
## TIME demand
## 1 1 8.3
## 2 2 10.3
2) filter Another way to proceed is to use the filter= argument. Here we assume we know that the end of the column name is ime but there are other characters prior to that that we don't know. This assumes that sed is available and on your path. If you are on Windows install Rtools to get sed. The quoting might need to be changed depending on your shell.
When trying this on Windows I noticed that sed from Rtools changed the line endings so below we specified eol= to ensure correct processing. You may not need that.
DF <- read.csv.sql("BOD.csv", "select * from file where TIME < 3",
filter = 'sed -e "1s/.*ime,/TIME,/"' , eol = "\n")
DF
## TIME demand
## 1 1 8.3
## 2 2 10.3
So I figured it out by reading through the above comments.
I'm on a Windows 10 machine using Excel for Office 365. The special characters will go away by changing how I saved the file from a "CSV UTF-8 (Comma Delimited)" to just "CSV (Comma delimited)".
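The stray characters are most likely a UTF-8 byte order mark (BOM), which Excel's "CSV UTF-8" format writes at the start of the file. The sketch below is in Python rather than R, and the `raw` bytes are a made-up stand-in for df.csv; it only illustrates how the BOM ends up glued to the first column name when the reader does not expect it.

```python
import csv
import io

# Bytes of a tiny CSV as a "CSV UTF-8" export would presumably write it:
# the first three bytes are the UTF-8 byte order mark.
raw = b"\xef\xbb\xbfa,b,c,d\r\n1,2,3,4\r\n"


def header(data, encoding):
    """Decode the bytes with the given encoding and return the header row."""
    text = io.StringIO(data.decode(encoding), newline="")
    return next(csv.reader(text))


with_bom = header(raw, "latin-1")       # BOM bytes turn into stray characters
without_bom = header(raw, "utf-8-sig")  # BOM is recognised and dropped

print(with_bom[0] == "a")     # first column name no longer matches "a"
print(without_bom[0] == "a")  # first column name is clean
```

Saving as plain "CSV (Comma delimited)" avoids the problem because that format does not write a BOM.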

How to find every combination of a binary 16 digit number

I have 16 different options in my program and I have a 16-character variable which is filled with 1's or 0's depending on the options that are selected (0000000000000000 means nothing is selected, 0010101010000101 means options 3, 5, 7, 9, 14 and 16 are selected, 1111111111111111 means everything is selected).
When I run my program, the code looks (using an if statement) for a 1 in the designated character of the 16-digit number. If there is one there, it runs the code for that option; otherwise it skips it.
E.g. option 3 looks to see if the 3rd character (0010000000000000) is a 1, and if it is, it runs the code.
Now what I am trying to do is generate a list of every different combination that is possible, so I can create an option for it to just loop through and run every possible option:
0000000000000001
0000000000000010
0000000000000011
...
1111111111111100
1111111111111110
1111111111111111
I have tried this but I think it may take a couple of years to run, haha:
Dim binString As String
Dim binNUM As Decimal = "0.0000000000000001"
Do Until binNUM = 0.11111111111111111
    binString = binNUM.ToString
    If binString.Contains(1) Then
        If binString.Contains(2) Or binString.Contains(3) Or binString.Contains(4) Or binString.Contains(5) Or binString.Contains(6) Or binString.Contains(7) Or binString.Contains(8) Or binString.Contains(9) Then
        Else
            Debug.Print(binNUM)
        End If
    End If
    binNUM = binNUM + 0.0000000000000001
Loop
After the code above is complete I would then take the output list and remove any instances of "0.", and then for any lines which had fewer than 16 characters (because the final character would be a 0 and not show) I would add 0s until there were 16 characters. I know this bit might be stupid but it's as far as I've got.
Is there a faster way I can generate a list like this in VB.net?
You should be able to get the list by using Convert.ToString as follows:
Dim sb As New System.Text.StringBuilder
For i As Integer = 0 To 65535
sb.AppendLine(Convert.ToString(i, 2).PadLeft(16, "0"c))
Next
Debug.Print(sb.ToString())
BTW: This should finish in under one second, depending on your system ;-)
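The reason this is fast is that a 16-digit binary string is just an integer from 0 to 65535 written in base 2, so counting does all the work. A minimal sketch of the same counting idea in Python (not VB.NET), with `combos` as an illustrative name:

```python
# Every 16-bit pattern is the binary form of an integer in 0..65535,
# zero-padded on the left to 16 characters.
combos = [format(i, "016b") for i in range(2 ** 16)]

print(len(combos))
print(combos[1])
print(combos[-1])

# Checking whether option 3 is selected in a given pattern means
# looking at the 3rd character (index 2).
option_3_selected = combos[0b0010000000000000][2] == "1"
print(option_3_selected)
```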
Create an enum with the Flags attribute (FlagsAttribute), which lets you do the key things you list. Here is an example of setting it up, taken from a small project I am working on:
<FlagsAttribute>
Public Enum MyFlags As Integer
None = 0
One = 1
Two = 2
Three = 4
Four = 8
Five = 16
Recon = 32
Saboteur = 64
Mine = 128
Headquarters = 256
End Enum
e.g.
Dim temp as MyFlags
Dim doesIt as Boolean
temp = MyFlags.One
doesIt = temp.HasFlag(MyFlags.Two)
temp = temp OR MyFlags.Three
'etc.
The real advantage is how it prints out: ToString() on a flags enum gives the names of the flags that are set rather than a string of 0s and 1s, which is much more human friendly.
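For readers who want to experiment with the idea outside VB.NET, here is a rough Python analogue using the standard `enum.Flag` class. The `Options` enum and its member names are made up for illustration; each member takes one bit, as in the enum above.

```python
from enum import Flag


class Options(Flag):
    # Each option owns one bit, so any combination fits in a single value.
    NONE = 0
    ONE = 1
    TWO = 2
    THREE = 4


selected = Options.ONE
has_two = Options.TWO in selected  # membership test, like HasFlag
selected |= Options.THREE          # switch another option on

print(has_two)
print(selected.value)
```

The combined value is still a single integer underneath, so it can be stored or compared exactly like the 16-character string, without any string handling.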

Reading, parsing and storing .txt files contents in Torch tensors efficiently

I have a huge number of .txt files (maybe around 10 million), each having the same number of rows/columns. They are actually single-channel images, and the pixel values are separated by a space. Here's the code I've written to do the work, but it's very slow. I wonder if someone can suggest a more optimized/efficient way of doing this:
require 'torch'
f = assert(io.open(txtFilePath, 'r'))
local tempTensor = torch.Tensor(1, 64, 64):fill(0)
local i = 1
for line in f:lines() do
    local l = line:split(' ')
    for key, val in ipairs(l) do
        tempTensor[{1, i, key}] = tonumber(val)
    end
    i = i + 1
end
f:close()
In brief, change your source files if that is possible.
The only thing I can suggest is to use binary data instead of text as the source.
You have several time-consuming operations: f:lines(), line:split(' ') and tonumber(val). All of them work on strings.
As I understood, you have got file like this:
0 10 20
11 18 22
....
So convert your source files to binary, like this:
<0><10><20><11><18><22> ...
where each <n> is a single byte holding that pixel value; for example, <18> is the byte 0x12 and <22> is the byte 0x16.
To read it back:
fid = io.open(sup_filename, "rb")
while true do
    local bytes = fid:read(1)
    if bytes == nil then break end -- EOF
    local st = bytes:byte() -- numeric value (0-255) of the byte just read
    print(st)
end
fid:close()
https://www.lua.org/pil/21.2.2.html
It would be dramatically faster.
Using pattern matching instead of :split() and lines() might help a little, but I doubt it.
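To make the text-versus-binary trade-off concrete, here is a small sketch in Python (not Lua/Torch). It assumes every pixel value fits in one byte (0 to 255) and uses a tiny 2 x 3 image and a temporary file path, both made up for illustration: the text is parsed once, stored as raw bytes, and later reads need no splitting or number parsing.

```python
import os
import tempfile

ROWS, COLS = 2, 3  # tiny stand-in for the 64 x 64 images

# Text form of one image, as in the example above.
text_image = "0 10 20\n11 18 22\n"

# One-off conversion: parse the text once and keep one byte per pixel.
# bytes() raises ValueError if any value is outside 0..255.
pixels = bytes(int(token) for token in text_image.split())

path = os.path.join(tempfile.mkdtemp(), "image.bin")
with open(path, "wb") as f:
    f.write(pixels)

# Later reads: a single read call, no string handling at all.
with open(path, "rb") as f:
    data = f.read()

# Indexing a bytes object yields integers, so rows can be sliced directly.
image = [list(data[r * COLS:(r + 1) * COLS]) for r in range(ROWS)]
print(image)
```

The binary file also takes ROWS * COLS bytes exactly, which makes it easy to detect truncated or malformed files by size alone.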

Generating variable observations for one id to be observation for new variable of another id

I have a data set that allows linking friends (i.e. observing peer groups) and thereby one can observe the characteristics of an individual's friends. What I have is an 8 digit identifier, id, each id's friend id's (up to 10 friends), and then many characteristic variables.
I want to take an individual and create variables that hold the foreign-born status of each friend.
I already have an indicator for each person that is 1 if foreign born. Below is a small example, for just one friend. Notice, MF1 means male friend 1 and then MF1id is the id number for male friend 1. The respondents could list up to 5 male friends and 5 female friends.
So, I need Stata to look at MF1id and then match it down the id column, then look over to f_born for that matched id, and finally input the value of f_born there back up to the original id under MF1f_born.
edit: I did a poor job of explaining the data structure. I have a cross section so 1 observation per unique id. Row 1 is the first 8 digit id number with all the variables following over the row. The repeating id numbers are between the friend id's listed for each person (mf1id for example) and the id column. I hope that is a bit more clear.
Kevin Crow wrote vlookup that makes this sort of thing pretty easy:
use http://www.ats.ucla.edu/stat/stata/faq/dyads, clear
drop team y
rename (rater ratee) (id mf1_id)
bys id: gen f_born = mod(id,2)==1
net install vlookup
vlookup mf1_id, gen(mf1f_born) key(id) value(f_born)
So, Dimitriy's suggestion of vlookup is perfect, except it will not work for me. I tried vlookup with my data set, the UCLA data that Dimitriy used for his example, and a toy data set I created, and it always failed at the point where the program attempts to save a temp file to my temp folder. Below is the program for vlookup. Notice that it sets tempfile file, manipulates the data, and then saves the file.
*! version 1.0.0 KHC 16oct2003
program define vlookup, sortpreserve
    version 8.0
    syntax varname, Generate(name) Key(varname) Value(varname)
    qui {
        tempvar g k
        egen `k' = group(`key')
        egen `g' = group(`key' `value')
        local k = `k'[_N]
        local g = `g'[_N]
        if `k' != `g' {
            di in red "`value' is unique within `key';"
            di in red /*
            */ "there are multiple observations with different `value'" /*
            */ " within `key'."
            exit 9
        }
        preserve
        tempvar g _merge
        tempfile file
        sort `key'
        by `key' : keep if _n == 1
        keep `key' `value'
        sort `key'
        rename `key' `varlist'
        rename `value' `generate'
        save `file', replace
        restore
        sort `varlist'
        joinby `varlist' using `file', unmatched(master) _merge(`_merge')
        drop `_merge'
    }
end
exit
For some reason, Stata gave me an error, "invalid file," at the save `file', replace point. I have a restricted data set with requirements to point all my Stata temp files to a very specific folder that has an erasure program sweeping it every so often. I don't know why this would create a problem, but maybe it does. Regardless, I tweaked the vlookup program so that it saves to a fixed path instead of a tempfile, and it appears to do what I need now.
clear all
set more off
capture log close
input aid mf1aid fborn
1 2 1
2 1 1
3 5 0
4 2 0
5 1 0
6 4 0
7 6 1
8 2 .
9 1 0
10 8 1
end
program define justlinkit, sortpreserve
    syntax varname, Generate(name) Key(varname) Value(name)
    qui {
        preserve
        tempvar g _merge
        sort `key'
        by `key' : keep if _n ==1
        keep `key' `value'
        sort `key'
        rename `key' `varlist'
        rename `value' `generate'
        save "Z:\Jonathan\created data sets\justlinkit program\fchara.dta",replace
        restore
        sort `varlist'
        joinby `varlist' using "Z:\Jonathan\created data sets\justlinkit program\fchara.dta", unmatched(master) _merge(`_merge')
        drop `_merge'
    }
end
// set trace on
justlinkit mf1aid, gen(mf1_fborn) key(aid) value(fborn)
sort aid
list
Well, this fixed my problem. Thanks to all who responded; I would not have figured this out without you.
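For anyone who wants to check what the lookup should produce, the logic can be sketched outside Stata. The Python below is only an illustration of the matching step on the toy data above, with Stata's missing value (.) represented as None and the names `rows`, `fborn_by_id` and `linked` chosen for the example.

```python
# Toy data from above: (aid, mf1aid, fborn); None stands for Stata's missing ".".
rows = [
    (1, 2, 1), (2, 1, 1), (3, 5, 0), (4, 2, 0), (5, 1, 0),
    (6, 4, 0), (7, 6, 1), (8, 2, None), (9, 1, 0), (10, 8, 1),
]

# Key -> value table: one fborn per id (the "keep if _n == 1" step).
fborn_by_id = {aid: fborn for aid, _, fborn in rows}

# Look up each person's friend id in that table (the joinby step);
# friends that are not in the data would also come back as None.
linked = [
    (aid, mf1aid, fborn, fborn_by_id.get(mf1aid))
    for aid, mf1aid, fborn in rows
]

for aid, mf1aid, fborn, mf1_fborn in linked:
    print(aid, mf1aid, fborn, mf1_fborn)
```

Person 10, whose friend is person 8, gets a missing mf1_fborn because person 8's own fborn is missing in the toy data.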