Parsing a large value that includes 3 smaller values in scientific notation - vb.net

I'm using VB.Net 2013 and could really use some help; perhaps I have been staring at this too long. I am presented with a value in a variable. The specific value is this:
3.190E+01+3.366E+01+8.036E+00
The value is actually 3 smaller values in scientific notation as follows
3.190E+01
3.366E+01
8.036E+00
I need to get the individual values into separate variables. Once I have them, I need to evaluate the notation of each value, so 3.190E+01 is equivalent to 3.190*10^1 and 8.036E+00 is equivalent to 8.036*10^0. I can probably figure out that last part if I can just get the individual values. The caveat is that the numbers will vary in size and the exponent will not always be the same, but I believe it will always be of the form E+XX, so it should be possible with some regex that I don't fully understand.
Thank you, I look forward to your help and it is very much appreciated.
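One way to split this apart is to match each piece (a mantissa followed by its exponent) with a regular expression and then let Double.Parse evaluate the scientific notation for you. A minimal sketch, assuming each part always looks like digits, an optional decimal fraction, and an E+XX or E-XX exponent:

Imports System.Globalization
Imports System.Text.RegularExpressions

Module ParseScientific
    Sub Main()
        Dim input As String = "3.190E+01+3.366E+01+8.036E+00"

        ' Match a mantissa (digits with an optional decimal part)
        ' followed by an exponent of the form E+XX or E-XX.
        Dim pattern As String = "\d+(?:\.\d+)?E[+-]\d+"

        For Each m As Match In Regex.Matches(input, pattern)
            ' Double.Parse understands scientific notation directly,
            ' so "3.190E+01" evaluates to 31.9 (3.190 * 10^1).
            Dim value As Double = Double.Parse(m.Value, CultureInfo.InvariantCulture)
            Console.WriteLine("{0} -> {1}", m.Value, value)
        Next
    End Sub
End Module

If you need the mantissa and exponent separately, wrap the two parts of the pattern in capture groups and compute mantissa * 10 ^ exponent yourself.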

Related

How can I get SAS to forget decimal places in variables?

I have a problem with decimal places in my SAS-variables.
I have two numeric variables which I want to compare in SAS.
They are both formatted with numx10.1 and one of the variables is calculated like this:
avg(delay)/30.44 as delay_months format=numx10.1
I want to compare the two values in a datastep using
where var1^=delay_months ;
But even though both variables have the same value, for example 4,1, SAS still shows this observation in the data step with the WHERE statement.
I guess it is because of the decimals in the delay_months variable.
How can I get SAS to forget the decimal places and only focus on the number I see - for example 4,1?
A format does not change the value; it only changes how SAS displays it to you. One suggestion is to round during the comparison (or when you create the variables):
where round(var1, 0.1) ^= round(delay_months, 0.1);
SAS uses floating-point representation for numeric values, so sometimes what you see is not what is actually stored.
There is more about this in the SAS documentation.
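The effect is not SAS-specific; anything that stores numbers as IEEE doubles behaves the same way. A small illustration of the same pitfall (sketched in VB.NET only to demonstrate the idea; the SAS behaviour is analogous):

Module FloatingPointDemo
    Sub Main()
        ' 0.1 + 0.2 is not exactly 0.3 in binary floating point, even though
        ' both values display as 0.3 when formatted to one decimal place.
        ' A SAS value such as avg(delay)/30.44 can differ from 4,1 in the same way.
        Dim calculated As Double = 0.1 + 0.2
        Dim literal As Double = 0.3

        Console.WriteLine(calculated.ToString("0.0"))   ' 0.3  (what you see)
        Console.WriteLine(literal.ToString("0.0"))      ' 0.3  (what you see)
        Console.WriteLine(calculated = literal)         ' False (what is stored)
        Console.WriteLine(Math.Round(calculated, 1) = Math.Round(literal, 1)) ' True after rounding
    End Sub
End Module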

Generating Record Layouts for EBCDIC Data Files

We are attempting to write a tool in Perl which is expected to parse a fixed length EBCDIC data file and generate the record layout by looking at the hex value of each byte in the record.
It is assumed that each data file, which is written by a Cobol program whose source code we do not have, can have multiple record layouts. The aim of this tool is to perform data migration (EBCDIC to ASCII) by generating layout which would then be fed to a converter.
The problem is that there are hundreds of permutations and combinations that may arise with each byte. I thought that comparing the hex value of the corresponding byte in the record below the current one might give us some clue as to what this might be. But even in this case there is no concrete solution that one might arrive at. Decisions need to be taken at every juncture which might affect the end result.
Could someone please let me know of any known patterns that I can look for? For example, for all COMP-3s each nibble can represent a value from 0-9, and hence the hex value of the byte might be something like [0-9][0-9]. Essentially, for data migration one need not bother about COMPs and COMP-3s, as their values would not be affected by the migration. But identifying which fields are DISPLAY fields is also turning out to be a huge task. Can someone throw out some ideas or point me in a direction that I can explore further?
Any help would be highly appreciated. I am really stuck in a mire here.
Thanks,
Aditya.
There are many enterprise transformation tools that will do exactly what you need. Alternatively, it is easy to parse the ADATA records from the compiled copybooks to get the exact byte positions and representations of every field.
Can I hazard a guess? Do you have nobody skilled in Cobol? It isn't that hard to process Cobol copybooks, certainly not as hard as it is to use a write-only language like Perl.
Do you have SyncSort or DFSORT available? Either will do what you ask with a simple config file...
I guess you have to go with probabilities, and hope the data is varied enough to get a lot out of that.
Any field that contains only EBCDIC values for alphanumerics plus punctuation is most likely a character (DISPLAY) field.
Numeric DISPLAY fields will be the easiest, containing just EBCDIC 0-9. Note that if the field is signed, the sign is normally overpunched in the last digit, so that byte shows up as a letter: hex C1 ('A') is +1 and hex D1 ('J') is -1, for example.
A fairly random distribution of values, often leading with hex 0s, will likely be a binary numeric "COMP" field.
COMP-3 fields hold one decimal digit in each hex digit (nibble) of the data, so if all the nibbles happen to be 0-9, that's a strong sign of a COMP-3 field. The exception is the last nibble of the field, which will contain C for positive, D for negative, or F for unsigned (a sketch of this check appears after these points).
Some programs use spaces on numeric fields, so if a field contains all sorts of binary, and also hex 40 (spaces), it's probably best to toss the hex 40 out of the mix. It might tell you a group of bytes is one field if they are all spaces together, or all data together.
As for multiple layouts, that's tough. A common convention for records that can have multiple layouts is to have a limited set of values for "what type of data is this" near the front of the record. Like significantID, recordType, data. So the significantID should increase steadily, while the recordType fields will vary between just a few values and re-cycle.
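As a hedged illustration of the COMP-3 check described above, here is a small sketch in VB.NET rather than Perl (the sample record and the offset/length are made up; the heuristic is simply that every nibble except the last must be a decimal digit and the final nibble must be C, D or F):

Module Comp3Heuristic
    ' Returns True if the byte slice *looks like* a packed-decimal (COMP-3)
    ' field: every nibble is 0-9 except the last one, which must be the
    ' sign nibble C (positive), D (negative) or F (unsigned).
    Function LooksLikeComp3(record As Byte(), offset As Integer, length As Integer) As Boolean
        If length < 1 OrElse offset + length > record.Length Then Return False

        For i As Integer = 0 To length - 1
            Dim hi As Integer = record(offset + i) >> 4
            Dim lo As Integer = record(offset + i) And &HF

            If hi > 9 Then Return False                 ' high nibble is always a digit
            If i = length - 1 Then
                ' low nibble of the last byte is the sign
                If lo <> &HC AndAlso lo <> &HD AndAlso lo <> &HF Then Return False
            ElseIf lo > 9 Then
                Return False                            ' every other low nibble is a digit
            End If
        Next
        Return True
    End Function

    Sub Main()
        ' +1234 packed into three bytes: 01 23 4C (hypothetical sample)
        Dim record As Byte() = {&H1, &H23, &H4C}
        Console.WriteLine(LooksLikeComp3(record, 0, 3))   ' True
    End Sub
End Module

On real data you would run a check like this over every candidate offset and length and weigh the result against the other clues above.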
The FileWizard in RecordEditor / JRecord can search for mainframe Cobol fields in a file. The FileWizard results can be stored in an XML file for use in other languages, or you can use the copy function to copy from EBCDIC to either ASCII fixed-width or CSV formats.
There is some out-of-date documentation on the File Wizard.

Converting a number to words

Can someone help me with how to convert a number value to its string value in words?
(i.e., 1198.00 should be read as "one thousand one hundred and ninety-eight")
There is nothing built in - you will need to write your own.
Here is one way of doing it, and a quick search finds several other samples to look at.
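For illustration, a minimal sketch of the usual divide-and-recurse approach (in VB.NET, English only, whole numbers below one million; this is only an illustrative sketch, not the sample referred to above, and it assumes the example is meant to read "one thousand one hundred and ninety-eight"):

Module NumberToWords
    Private ReadOnly Ones() As String =
        {"zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
         "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen",
         "seventeen", "eighteen", "nineteen"}
    Private ReadOnly Tens() As String =
        {"", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"}

    ' Spell out 0..999999 in English words; purely illustrative, not localized.
    Function ToWords(n As Integer) As String
        If n < 20 Then Return Ones(n)
        If n < 100 Then Return Tens(n \ 10) & If(n Mod 10 <> 0, "-" & Ones(n Mod 10), "")
        If n < 1000 Then
            Return Ones(n \ 100) & " hundred" & If(n Mod 100 <> 0, " and " & ToWords(n Mod 100), "")
        End If
        Return ToWords(n \ 1000) & " thousand" & If(n Mod 1000 <> 0, " " & ToWords(n Mod 1000), "")
    End Function

    Sub Main()
        ' 1198.00 -> take the integer part, then spell it out:
        ' "one thousand one hundred and ninety-eight"
        Console.WriteLine(ToWords(CInt(Math.Truncate(1198.0))))
    End Sub
End Module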
As a note - all of these approaches suffer from being tied to a single language, so none of them can be used as-is for internationalization/localization efforts.

Always show decimal places in SQL?

I'm working on a query which returns numeric values (currency). Some of the values are whole numbers and are being displayed as 3, 5, 2 etc whilst other numbers are coming up like 2.52, 4.50 etc.
How can I force Oracle to always show the decimal places?
Thanks
TO_CHAR(pAmount, '9,999,999.99');
http://www.techonthenet.com/oracle/functions/to_char.php
http://www.ss64.com/orasyntax/to_char.html
To enhance the answers already given, you can use:
TO_CHAR(your_value,'fm999.99') to prevent leading spaces
____3.45 becomes 3.45 (_ indicates whitespace)
TO_CHAR(your_value,'fm990.99') to force values less than 1 to show a leading zero
.52 becomes 0.52
TO_CHAR(your_value,'fm990.00') to force 2 decimal places, even if 0
6.3 becomes 6.30
(TO_CHAR(your_value,'fm990.00')||'%') to add a percentage sign
18.6 becomes 18.60%
source: https://community.oracle.com/thread/968373?start=0&tstart=0
The display and formatting of the data should be handled at the presentation layer - not the data one.
Use the facilities provided by your front end to format the values as you see fit.
The to_char fixes the decimal issue, but you have to be careful about the length. If the number is longer than the format provided, it will be shown as # signs; if it is shorter, the result is padded with spaces before the number. e.g.
to_char(123.45, '99.00') will show ####
and
to_char(123.45, '999999.00') will show '   123.45'.
So, if you have to export the results to CSV or Excel, these numbers will be treated as strings.
I have not found a good solution to that.
In SQL*Plus you can use the COLUMN directive to specify formatting on a per-column basis, separate from the query itself. That way you keep your query "clean" for possible other uses and still get your formatting. (In SQL*Plus at least...)
e.g.
COLUMN SAL FORMAT 99,990.99
Google for "SQL*Plus User's Guide and Reference" and you should get links to the Oracle location for your Oracle version. 10.1 is here if that'll do. They'll probably all be about the same, mind you: I don't think SQL*Plus has changed much since I learned it in 1988 on Oracle 5.1.17...

Is the CHAR datatype in SQL obsolete? When do you use it?

The title pretty much frames the question. I have not used CHAR in years. Right now, I am reverse-engineering a database that has CHAR all over it, for primary keys, codes, etc.
How about a CHAR(30) column?
Edit:
So the general opinion seems to be that CHAR is perfectly fine for certain things. I, however, think that you can design a database schema that does not have a need for "these certain things", thus not requiring fixed-length strings. With the bit, uniqueidentifier, varchar, and text types, it seems that in a well-normalized schema you get a certain elegance that you don't get when you use encoded string values. Thinking in fixed lengths, no offense meant, seems to be a relic of the mainframe days (I learned RPG II once myself). I believe it is obsolete, and I have not heard a convincing argument claiming otherwise.
I use char(n) for codes and varchar(m) for descriptions. Char(n) seems to result in better performance because the data doesn't need to move around when the size of the contents changes.
Where the nature of the data dictates the length of the field, I use CHAR. Otherwise VARCHAR.
CHARs are still faster for processing than VARCHARs in the DBMS I know well. Their fixed size allow for optimizations that aren't possible with VARCHARs. In addition, the storage requirements are slightly less for CHARS since no length has to be stored, assuming most of the rows need to fully, or near-fully, populate the CHAR column.
This is less of an impact (in terms of percentage) with a CHAR(30) than a CHAR(4).
As to usage, I tend to use CHARs when either:
the fields will generally always be close to or at their maximum length (stock codes, employee IDs, etc); or
the lengths are short (less than 10).
Anywhere else, I use VARCHARs.
I use CHAR when the length of the value is fixed. For example, we generate a code based on some algorithm that returns a code with a specific fixed length, let's say 13.
Otherwise, I find VARCHAR better. One more reason to use VARCHAR is that when you get the value back in your application you don't need to trim it. With CHAR you get the full length of the column whether the value fills it or not; it is padded with spaces, so you end up trimming every value, and forgetting to do that leads to errors.
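To make the padding point concrete, here is a small sketch with plain VB.NET strings standing in for what a CHAR(10) column hands back (no real database involved):

Module CharPaddingDemo
    Sub Main()
        ' A CHAR(10) column stores "AB" as "AB" followed by eight spaces;
        ' PadRight simulates the padded value the application receives.
        Dim fromCharColumn As String = "AB".PadRight(10)

        Console.WriteLine(fromCharColumn = "AB")            ' False: trailing spaces matter in .NET
        Console.WriteLine(fromCharColumn.TrimEnd() = "AB")  ' True once the padding is trimmed
    End Sub
End Module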
For PostgreSQL, the documentation states that char() has no advantage in storage space over varchar(); the only difference is that it's blank-padded to the specified length.
Having said that, I still use char(1) or char(3) for one-character or three-character codes. I think that the clarity due to the type specifying what the column should contain provides value, even if there are no storage or performance advantages. And yes, I typically use check constraints or foreign key constraints as well. Apart from those cases, I generally just stick with text rather than using varchar(). Again, this is informed by the database implementation, which automatically switches from inline to out-of-line storage if the value is large enough, which some other database implementations don't do.
CHAR isn't obsolete; it just should only be used if the length of the field will never vary. In the average database, that applies to very few fields, mostly code fields like state abbreviations, which are a standard 2 characters if you use the postal codes. Using CHAR where the field length is variable means there will be a lot of trimming going on, which is extra, unnecessary work, and the database should be refactored.