Clean up string data before saving in database as integer


I have an integer column called dollars in my database table, and I am attempting to save values such as $1000 in it. However, saving the string $1000, dollar sign included, to an integer column causes an automatic conversion to the integer 0.
To prevent this, I added a before_save callback in my model that calls a clean_data method to remove the dollar sign. But it seems that Rails has already attempted to store the entire $1000 string before clean_data is called.
Is there a better way to remove the dollar sign from the values before saving them to the database as integers?
Here's my code:
In the create action of the bids_controller.rb, doing this will allow the database to save the value $1000 from the form properly:
# Remove dollar sign
if params[:bid][:dollars][0] == "$"
  params[:bid][:dollars] = params[:bid][:dollars].delete('$')
end
@bid = Bid.new(params[:bid])
@bid.save
However, if I were to remove the if-end fragment from the controller and clean the data in the Bid model, like this:
before_save :clean_data
def clean_data
  puts self.dollars
  self.dollars = self.dollars.delete('$')
end
I will get the value 0 for puts self.dollars before the clean_data method has a chance to remove the dollar sign.
My hypothesis is that Rails has attempted to "hold" the data in the database before either clean_data or save is called, and this "hold" causes the data to be converted to 0 since the integer column isn't able to store the dollar sign.
Thank you very much for your help! :)

Take a look at the tr method.
"$1000".tr("$", "") # => "1000"
You should be able to do this on the string variable prior to saving it to the database.
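One way to make sure the cleanup runs before any integer conversion is to do it in the attribute writer rather than in a before_save callback. The following is only a plain-Ruby sketch of that idea: the Bid class here is a stand-in, not an ActiveRecord model, and stripping commas as well as the dollar sign is an added assumption. In a real Rails model the same writer would presumably pass the cleaned value on with write_attribute(:dollars, ...); check that against your Rails version.

```ruby
# Plain-Ruby stand-in for the model (hypothetical; not ActiveRecord).
class Bid
  attr_reader :dollars

  # Clean the incoming string first, then convert it to an integer.
  # "$," is a character set: it deletes dollar signs and commas.
  def dollars=(value)
    cleaned = value.to_s.delete("$,").strip
    @dollars = cleaned.to_i
  end
end

bid = Bid.new
bid.dollars = "$1000"
puts bid.dollars # prints 1000
```

Because the writer sees the raw string, the value is already clean by the time anything tries to treat it as an integer.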

Related

Can you print variables that are dynamically generated?

I am trying to set up a program that takes user input for database DDL generation. I have it working to the point where it can ask a user for the name of the table, the number of columns, and any attributes that might be needed. The problem comes when I try to print a string that includes the variables used for the column names. To let users have as many columns as they want, I used variables named newvar followed by a number that increases every time a column name is entered (newvar1, newvar2, and so on). This works fine, and I can get the values if I do send %newvar1%, but it doesn't work to do send newvar%increasing number%. I need to know if this is possible or if I'm just missing something obvious. Also, I don't have the code with me, but I can post it once I get back to my main computer.
I have tried quite a few things, like send %newvar%%number%, send newvar%number%, and othervar = newvar%number% followed by send %othervar%.
I'll show some once I have access to it in about 2 hours.
I expect to be able to output names for increasing variables using an ever increasing number. Class is starting; I'll clarify some things later.

You can use a lone percent sign (%) at the start of the first argument of the Send command to achieve what you want. It forces everything after it, up to the next comma, to be evaluated as an expression. Here is an example:
f1::
newvar1 := "This " , newvar2 := "is just a " , newvar3 := "test."
Loop , 3
Send , % newvar%A_Index%
Return
See: https://www.autohotkey.com/docs/Language.htm#-expression

How to change the variable length in Progress?

I'm pretty new to Progress and I want to ask a question.
How do I change the length of a (string) variable at runtime?
ex.
define variable cname as char.
define variable clen as int.
cname= "".
DO cnts = 1 TO 5.
IF prc[cnts] <> "" THEN DO:
clen = clen + LENGTH(prc[cnts]).
cname = cname + prc[cnts].
END.
END.
Put cname format '???' at 1. /* here change variable length */
Thanks for the reply

If the PUT statement is what you want to change, then
PUT UNFORMATTED cname.
will write the entire string out without having to worry about the length of the FORMAT phrase.
If you need something formatted, then
PUT UNFORMATTED STRING(cname, fill("X", clen)).
will do what you want. Look up the "STRING()" function in the ABL Ref docs.

In Progress 4GL all data is variable length.
This is one of the big differences between Progress and lots of other development environments. (And in my opinion a big advantage.)
Each datatype has a default format, which you can override, but that is only for display purposes.
Display format has no bearing on storage.
You can declare a field with a display format of 3 characters:
define variable x as character no-undo format "x(3)".
And then stuff 60 characters into the field. Progress will not complain.
x = "123456789012345678901234567890123456789012345678901234567890".
It is extremely common for 4GL application code to over-stuff variables and fields.
(If you then use SQL-92 to access the data you will hear much whining and gnashing of teeth from your SQL client. This is easily fixable with the "dbtool" utility.)
You change the display format when you define something:
define variable x as character no-undo format "x(30)".
or when you use it:
put x format "x(15)".
or
display x format "x(43)".
(And in many other ways -- these are just a couple of easy examples.)
Regardless of the display format the length() function will report the actual length of the data.

Enter date into function without quotes, return date

I'm trying to write a function of this form:
Function cont(requestdate As Date)
cont = requestdate
End Function
Unfortunately, when I enter =cont(12/12/2012) into a cell, I do not get my date back. I get a very small number, which I think equals 12 divided by 12 divided by 2012. How can I get this to give me back the date? I do not want the user to have to enter =cont("12/12/2012").
I've attempted to google for an answer, unfortunately, I have not found anything helpful. Please let me know if my vocabulary is correct.
Let's say my user pulled a report with 3 columns, a, b and c. a has beginning of quarter balances, b has end of quarter balances and c has a first and last name. I want my user to put in column d: =cont(a1,b1,c1,12/12/2012) and make it create something like:
BOQ IS 1200, EOQ IS 1300, NAME IS EDDARD STARK, DATE IS 12/12/2012
So we could load this into a database. I apologize for the lack of info the first time around. To be honest, this function wouldn't save me a ton of time. I'm just trying to learn VBA, and thought this would be a good exercise... Then I got stuck.

Hard to tell what you are really trying to accomplish.
Function cont(requestdate As String) As String
cont = Format(Replace(requestdate, ".", "/"), "'mm_dd_YYYY")
End Function
This code takes a string that Excel does not recognize as a number (e.g. 12.12.12), formats it (about the only useful thing I can think of for this UDF), and returns it as a string (that is, not a number or date) to a cell that is formatted as text.
You can get as fancy as you like in processing the string entered and formatting the string returned - just that BOTH can never be a number or a date (or anything else Excel recognizes.)

There is no way to do exactly what you're trying to do. I will try to explain why.
You might think that because your function requires a Date argument, this somehow forces, or should force, 12/12/2012 to be treated as a Date. And it is treated as a Date, but only after the expression has been evaluated; you get an error only if the evaluated result cannot be interpreted as a Date.
Why does Excel evaluate this before the function receives it?
Without requiring string qualifiers, how could the application possibly know what type of data you intended, or whether you intended for that to be evaluated? It could not possibly know, so there would be chaos.
Perhaps this is best illustrated by example. Using your function:
=Cont(1/1/0000) should raise an error.
Or consider a very simple formula:
=1/2
Should this formula return .5 (double) or January 2 (date) or should it return "1/2" (string literal)? Ultimately, it has to do one of these, and do that one thing consistently, and the one thing that Excel will do in this case is to evaluate the expression.
TL;DR
Your problem is that an unqualified expression is evaluated before being passed to the function, and this is done to avoid confusion or ambiguity (per the examples above).

Here is my method for allowing quick date entry into a User Defined Function without wrapping the date in quotes:
Function cont(requestdate As Double) As Date
cont = CDate((Mid(Application.Caller.Formula, 7, 10)))
End Function
The UDF call lines up with the OP's initial request:
=cont(12/12/2012)
I believe that this method would adapt just fine for the OP's more complex ask, but suggest moving the date to the beginning of the call:
=cont(12/12/2012,a1,b1,c1)
I fully expect that this method can be optimized for both speed and flexibility. Working on a project now that might require me to further dig into the speed piece, but it suits my needs in the meantime. Will update if anything useful turns up.
Brief Explanation
Application.Caller returns a Range containing the cell that called the UDF. (See Caveat #2)
Mid returns part of a string (the formula from the range that called the UDF in this case) starting at the specified character count (7) of the specified length (10).
CDate may not actually be necessary, but forces the value into date format if possible.
Caveats
1. This requires the full ten-character dd/mm/yyyy form (1/1/2012 would fail), but it also works with my preferred yyyy/mm/dd format and with some other delimiters: dd-mm-yyyy or dd+mm+yyyy would work, but dd.mm.yyyy will not, because Excel does not recognize it as a valid number.
2. Additional work would be necessary for this to function as part of a multi-cell array formula, because Application.Caller returns a range containing all of the associated cells in that case.
3. There is no error handling, and =cont(123) or =cont(derp) (basically anything not dd/mm/yyyy) will naturally fail.
Disclaimers
A quick note to the folks who are questioning the wisdom of a UDF here: I've got a big grid of items and their associated tasks. With no arguments, my UDF calculates due dates based on a number of item and task parameters. When the optional date is included, the UDF returns a delta between the actual date and what was calculated. I use this delta to monitor and calibrate my calculated due dates.
All of this can absolutely be performed without the UDF, but bulk entry would be considerably more challenging to say the least.
Removing the need for quotes sets my data entry up such that loading =cont( into the clipboard allows my left hand to F2/ctrl-v/tab while my right hand furiously enters dates on the numpad without need to frequently (and awkwardly) shift left-hand position for a shift+'.

Am I misunderstanding String#hash in Ruby?

I am processing a bunch of data and I haven't coded a duplicate checker into the data processor yet, so I expected duplicates to occur. I ran the following SQL query:
SELECT body, COUNT(body) AS dup_count
FROM comments
GROUP BY body
HAVING (COUNT(body) > 1)
And get back a list of duplicates. Looking into this I find that these duplicates have multiple hashes. The shortest string of a comment is "[deleted]". So let's use that as an example. In my database there are nine instances of a comment being "[deleted]" and in my database this produces a hash of both 1169143752200809218 and 1738115474508091027. The 116 is found 6 times and 173 is found 3 times. But, when I run it in IRB, I get the following:
a = '[deleted]'.hash # => 811866697208321010
Here is the code I'm using to produce the hash:
def comment_and_hash(chunk)
  comment = chunk.at_xpath('*/span[@class="comment"]').text ##Get Comment##
  hash = comment.hash
  return comment,hash
end
I've confirmed that I don't touch comment anywhere else in my code. Here is my datamapper class.
class Comment
include DataMapper::Resource
property :uid , Serial
property :author , String
property :date , Date
property :body , Text
property :arank , Float
property :srank , Float
property :parent , Integer #Should Be UID of another comment or blank if parent
property :value , Integer #Hash to prevent duplicates from occurring
end
Am I correct in assuming that .hash on a string will return the same value each time it is called on the same string?
Which value is the correct value assuming my string consists of "[deleted]"?
Is there a way I could have different strings inside ruby, but SQL would see them as the same string? That seems to be the most plausible explanation for why this is occurring, but I'm really shooting in the dark.

If you run
ruby -e "puts '[deleted]'.hash"
several times, you will notice that the value is different each time. In fact, the hash value stays constant only for as long as your Ruby process is alive. The reason is that String#hash is seeded with a random value: rb_str_hash (the C function that implements it) uses rb_hash_start, which uses a random seed that is initialized every time a Ruby process starts.
You could use a CRC such as Zlib.crc32 for your purposes, or one of the message digests in OpenSSL::Digest, although the latter is overkill: for detecting duplicates you probably don't need its security properties.
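As a rough illustration of the digest option, here is a minimal sketch of a fingerprint that does not depend on the per-process seed. It uses Digest::SHA256 from the standard library's digest rather than OpenSSL::Digest, and the helper name stable_fingerprint is made up for this example. Truncating to 15 hex digits (60 bits) is an assumption, chosen so the value fits in a signed 64-bit integer column.

```ruby
require 'digest'
require 'zlib'

# Hypothetical helper: a process-independent fingerprint for a comment body.
# Keeps the first 15 hex digits (60 bits) of the SHA-256 digest and
# converts them to an Integer.
def stable_fingerprint(text)
  Digest::SHA256.hexdigest(text.to_s)[0, 15].to_i(16)
end

a = stable_fingerprint('[deleted]')
b = stable_fingerprint('[deleted]'.dup)
puts a == b # prints true: equal strings always give equal fingerprints

# The CRC alternative is deterministic in the same way.
puts Zlib.crc32('[deleted]') == Zlib.crc32('[deleted]'.dup) # prints true
```

Unlike String#hash, both values stay the same across runs and machines, so they can be stored and compared later. Truncating the digest does make collisions more likely than with the full digest.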

I use the following to create String#hash alternatives that are consistent across time and processes:
require 'zlib'
def generate_id(label)
  Zlib.crc32(label.to_s) % (2 ** 30 - 1)
end

Ruby intentionally makes String#hash produce different values in different sessions: Why is Ruby String.hash inconsistent across machines?

How to comment on MATLAB variables

When I'm using MATLAB, I sometimes feel the need to comment on some variables, and I would like to save these comments inside the variables themselves. That way, when I am working with many variables in the workspace and forget the context of some of them, I could read the comment attached to each one.

While I'm of the opinion that the best (and easiest) approach would be to make your variables self-documenting by giving them descriptive names, there is actually a way for you to do what you want using the object-oriented aspects of MATLAB. Specifically, you can create a new class which subclasses a built-in class so that it has an additional property describing the variable.
In fact, there is an example in the documentation that does exactly what you want. It creates a new class ExtendDouble that behaves just like a double except that it has a DataString property attached to it which describes the data in the variable. Using this subclass, you can do things like the following:
N = ExtendDouble(10,'The number of data points')
N =
The number of data points
10
and N could be used in expressions just as any double value would. Using this example subclass as a template, you could create "commented" versions of other built-in numeric classes, with the exception of those you are not allowed to subclass (char, cell, struct, and function_handle).
Of course, it should be noted that instead of using the ExtendDouble class like I did in the above example, I could instead define my variable like so:
nDataPoints = 10;
which makes the variable self-documenting, albeit with a little more typing needed. ;)

How about declaring another variable for your comments?
example:
>> num = 5;
>> numc = 'This is a number that contains 5';
>> whos
...
This is my first post in StackOverflow. Thanks.

A convenient way to solve this is to have a function that does the storing and displaying of comments for you, i.e. something like the function below that will pop open a dialog box if you call it with comments('myVar') to allow you to enter new (or read/update previous) comments to variable (or function, or co-worker) labeled myVar.
Note that the comments will not be available in your next Matlab session. To make this happen, you have to add save/load functionality to comments (i.e. every time you change anything, you write to a file, and any time you start the function and database is empty, you load the file if possible).
function comments(name)
%COMMENTS stores comments for a matlab session
%
% comments(name) adds or updates a comment stored with the label "name"
%
% comments prints all the current comments

%# database is a n-by-2 cell array with {label, comment}
persistent database

%# check input and decide what to do
if nargin < 1 || isempty(name)
    printDatabase;
else
    updateDatabase;
end

    function printDatabase
        %# prints the database
        if isempty(database)
            fprintf('no comments stored yet\n')
        else
            for i=1:size(database,1)
                fprintf('%20s : %s\n',database{i,1},database{i,2});
            end
        end
    end

    function updateDatabase
        %# updates the database
        %# check whether there is already a comment
        if size(database,1) > 0 && any(strcmp(name,database(:,1)))
            idx = strcmp(name,database(:,1));
            comment = database(idx,2);
        else
            idx = size(database,1)+1;
            comment = {''};
        end
        %# ask for new/updated comment
        comment = inputdlg(sprintf('please enter comment for %s',name),'add comment',...
            5,comment);
        if ~isempty(comment)
            database{idx,1} = name;
            database(idx,2) = comment;
        end
    end
end

Always always always keep the Matlab editor open with a script documenting what you do. That is, variable assignments and calculations.
Only exceptions are very short sessions where you want to experiment. Once you have something -- add it to the file (It's also easier to cut and paste when you can see your entire history).
This way you can always start over. Just clear all and rerun the script. You never have random temporaries floating around in your workspace.
Eventually, when you are finished, you will also have something that is close to 'deliverable'.

Have you thought of using structures (or cells, although structures would require extra memory use)?
>> dataset1.numerical=5;
>> dataset1.comment='This is the dataset that contains 5';
dataset1 =
numerical: 5
comment: 'This is the dataset that contains 5'
