Extract Domino database meta-data and data types - lotus-domino

I have to investigate moving a Domino database to an alternative database, probably SQL Server or Oracle. How do I produce a complete, easily readable report on the Domino database's metadata, including all field data types and embedded objects such as attached files, embedded text and images?
I have looked at creating the database synopsis, but I need something that leaves out the unnecessary information.

You can write your own tool using the NotesNoteCollection class and the NotesDXLExporter class, parsing out whatever parts you consider necessary and leaving out the parts you think are unnecessary.
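As a rough sketch of the parsing half of that idea, the snippet below tallies which value element each item carries in a DXL export. It assumes the export has already been written to a string by NotesDXLExporter; the sample DXL is a hypothetical, heavily simplified fragment, and `summarize_items` is a made-up helper name. Python is used here only for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace used by DXL documents.
DXL_NS = "{http://www.lotus.com/dxl}"

# Hypothetical, heavily simplified DXL fragment (real exports carry far more detail).
SAMPLE_DXL = """<database xmlns="http://www.lotus.com/dxl" title="Demo">
  <document form="Order">
    <item name="Customer"><text>ACME</text></item>
    <item name="Total"><number>12.5</number></item>
    <item name="Body"><richtext><par>See attachment</par></richtext></item>
  </document>
  <document form="Order">
    <item name="Customer"><text>Globex</text></item>
    <item name="Created"><datetime>20240101T120000,00+00</datetime></item>
  </document>
</database>"""


def summarize_items(dxl_text):
    """Count (form, item name, value element) combinations across all documents."""
    root = ET.fromstring(dxl_text)
    counts = Counter()
    for doc in root.iter(DXL_NS + "document"):
        form = doc.get("form", "(no form)")
        for item in doc.findall(DXL_NS + "item"):
            for value in item:
                # The child element name (text, number, richtext, ...) stands in for the type.
                kind = value.tag.replace(DXL_NS, "")
                counts[(form, item.get("name", "?"), kind)] += 1
    return counts


summary = summarize_items(SAMPLE_DXL)
for (form, name, kind), n in sorted(summary.items()):
    print(f"{form:10} {name:10} {kind:10} {n}")
```

The same loop could be narrowed to only the forms or item names you care about, which is how the unwanted parts of the synopsis would be left out.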

Related

One time migration of VSAM files from Mainframe to Cloud Azure

We want to migrate bulk files (e.g. VSAM) from the mainframe to Azure at the beginning of the project. How can that be achieved?
Is there a utility, or do we need to write our own scripts?
I suspect there are some utilities out there, but I suspect most or all of them are priced products. Since VSAM datasets are not defined using a language construct like DDL, you will likely have to do most of the heavy lifting yourself, either writing your own programs or custom scripts. You didn’t mention the operating system, but I assume you’re working on z/OS.
Here are some things to consider:
The structure of the VSAM dataset is basically record oriented. There are three basic types you’ll run into that host application data:
Key Sequenced Datasets (KSDS)
Entry Sequenced Datasets (ESDS)
Relative Record datasets (RRDS)
Familiarize yourself with how the datasets are defined, as it will give you some insight into the dataset specifics. The DFSMS Access Method Services Commands manual describes the utilities used to create them and to get information such as the key length and the offset of the key. DEFINE CLUSTER is the command that creates the dataset. You mentioned you are moving the data to Azure, but this will help you understand the characteristics of the data you are moving.
Since there is no DDL for VSAM datasets you will generally find the structure in the programs that manipulate them like COBOL Copybooks, HLASM DSECTs and similar constructs. This is the long pole in the tent for you.
Consider the semantics of accessing the data. VSAM as an access method has some ability to control read/write access at a macro level through a DEFINE CLUSTER option called SHAREOPTIONS. SHAREOPTIONS instructs the operating system how to handle the VSAM buffers for reading and writing so that multiple processes can access the same data. It's primitive compared to shared file systems like NFS. VSAM also allows the application to control access (or serialization) using the ENQ / DEQ functions. These enable applications to express intent in the cluster about a VSAM file and coordinate their own activities.
You might find that converting a VSAM file to a relational form like Db2 is better for you. Again, you’ll have to create the DDL to describe the tables, data formats and the like.
Another consideration is data conversion. You’ll find there is character data that is most likely in EBCDIC and needs to be converted to a new code page. Numeric data can be in Packed Decimal, Binary, or even text and will need to be converted.
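To give a feel for what that conversion involves, here is a minimal sketch of decoding one record with an EBCDIC text field, a packed-decimal field and a binary field. The record layout, the field widths and the choice of code page `cp037` are assumptions made up for this example; the real layout comes from the copybook, and the real code page may differ (for instance `cp500`).

```python
from decimal import Decimal


def unpack_comp3(raw, scale=0):
    """Decode a packed-decimal field: two digits per byte, sign in the last nibble."""
    digits = []
    for byte in raw:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    sign = digits.pop()  # the final nibble holds the sign, not a digit
    if sign < 0x0A or any(d > 9 for d in digits):
        raise ValueError(f"not a packed-decimal field: {raw.hex()}")
    value = int("".join(str(d) for d in digits))
    if sign in (0x0B, 0x0D):  # 0xB and 0xD mark negative values
        value = -value
    return Decimal(value).scaleb(-scale)


# Hypothetical 17-byte record: 10-byte name, 3-byte packed amount, 4-byte binary count.
record = (
    "SMITH     ".encode("cp037")
    + bytes([0x12, 0x34, 0x5D])
    + (1000).to_bytes(4, "big", signed=True)
)

name = record[:10].decode("cp037").rstrip()
amount = unpack_comp3(record[10:13], scale=2)
count = int.from_bytes(record[13:17], "big", signed=True)
print(name, amount, count)
```

Note that the text field must be decoded on its own: running a code page conversion over the whole record would corrupt the packed and binary bytes.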
The short answer is there isn’t an “Easy Button” to do what you want. The data itself is only one of the questions that need to be answered. Others are serialization and access to the data, code page conversion, and, if you are moving some data but not the rest, whether you will need to map converted data back to data that stays on the mainframe.
Consider exploring IBM CDC classic replication. It can be set up with a few clicks.
I have not done this for Azure, so I am not sure about support.

Storing spreadsheets in a database

I am attempting to create a relational database to hold data from experiments which return csv files filled with data. This would allow me to search up an experiment I want based on date, author, experimental values etc.
However, I am not sure how to implement the relational database when each experiment generates a separate csv file.
Would it be possible to have a csv file as a column of the database or would it just be better to hold the name of the file?
This is a bit long for a comment.
In general, databases have the ability to store large objects (usually, "BLOB"s -- binary large objects).
Whether this meets your needs depends on several factors. I would say that the first is accessibility to the data. Storing the strings in the database has some advantages:
Anyone with access to the database has access to the data.
To repeat: Users do not need separate user access to a file system.
The same API can be used for the metadata and for the underlying data.
You have more controls over the contents -- the underlying file cannot be deleted without deleting the row in the database, for instance.
The data is automatically included in backups and restores.
Of course, there are downsides as well, some of which are related to the above:
With a separate file, it is simpler to update the file, if that is necessary.
Storing the data in a database imposes overheads (although you might be able to get around this by compressing the data).
If the application is already file-based and you are adding a database component, then changing the application to support the database could be cumbersome.
I'm sure these lists are not complete. The point is that there is no "right" answer. It depends on your needs.
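For the store-it-in-the-database option, a minimal sketch might look like the following. It uses Python's built-in `sqlite3` module purely for illustration; the table and column names (`experiment`, `csv_data` and so on) and the sample values are invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, just for the demonstration
conn.execute(
    """CREATE TABLE experiment (
           id       INTEGER PRIMARY KEY,
           run_date TEXT NOT NULL,
           author   TEXT NOT NULL,
           csv_name TEXT NOT NULL,
           csv_data BLOB NOT NULL
       )"""
)

# The whole CSV file is stored as one BLOB next to its searchable metadata.
csv_bytes = b"time_s,voltage_v\n0.0,1.20\n0.5,1.35\n"
conn.execute(
    "INSERT INTO experiment (run_date, author, csv_name, csv_data) VALUES (?, ?, ?, ?)",
    ("2024-03-01", "jdoe", "run_001.csv", csv_bytes),
)

# Search on the metadata, then get the file contents back unchanged.
row = conn.execute(
    "SELECT csv_name, csv_data FROM experiment WHERE author = ? AND run_date >= ?",
    ("jdoe", "2024-01-01"),
).fetchone()
restored = row[1]
print(row[0], len(restored), "bytes")
```

A BLOB is opaque to queries, so searching by experimental values would still require pulling those values out into ordinary columns.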

Using a SQL database file as project files

I am wondering if it makes sense to use multiple SQL database files, such as SQLite (which I believe is single-file based?), as project files in my software. The project files contain basic information as well as multiple records (spectra) with lists of parameters (floating point values) and lists of measurement data (also floating point).
I currently use my own binary format, which is a pain to maintain. I tried to use XML which works very well, but the file sizes explode (500 kB before, 7.5 MB as XML).
Now I wonder if I can structure SQL databases to contain this kind of information and effectively load and save this data in my .NET software.
(I am not very experienced in SQL) so:
Can SQL tables contain sub-tables (like subnodes in XML) or be linked to other tables?
E.g. Can I make a table for the record, and this table has subtables for the lists of measurement data and parameters?
Will this be more efficient than XML in terms of storage space?
I went with a SQLite database. It can be easily implemented into .NET using the System.Data.SQLite Project, that can even be used with AnyCPU Builds.
It is working very nicely, both performance and storage space wise.
You still need to take a lot of care with different versions of your databases. If you try to save data for a newer schema into a database file that uses an older schema, some columns or tables might not exist. You need to implement a migration method to a new database file for this.
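One way to keep track of schema versions is SQLite's `PRAGMA user_version`, sketched below in Python for brevity (the same statements work from System.Data.SQLite). Note that this sketch upgrades the file in place rather than copying to a new database file as described above, and the `spectrum` table and the `MIGRATIONS` list are hypothetical.

```python
import sqlite3

# Hypothetical schema history: entry N upgrades a file from version N to N + 1.
MIGRATIONS = [
    "CREATE TABLE spectrum (id INTEGER PRIMARY KEY, name TEXT NOT NULL)",
    "ALTER TABLE spectrum ADD COLUMN recorded_at TEXT",
]


def upgrade(conn):
    """Apply every migration the file has not seen yet and return the new version."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version in range(current, len(MIGRATIONS)):
        conn.execute(MIGRATIONS[version])
        # PRAGMA does not accept bound parameters; the value is a trusted integer.
        conn.execute(f"PRAGMA user_version = {version + 1}")
    conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]


conn = sqlite3.connect(":memory:")  # a new, empty project file starts at version 0
version = upgrade(conn)
print("schema version:", version)
```

Calling `upgrade` when a project file is opened means an older file gains the missing columns before the application touches it.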
The real advantage is that it is an open format. I stand behind the premise that the data a user saves belongs to the user and does not need to be hidden in an obscure file structure, unless that structure brings significant advantages to the table.
If the user can no longer use your software, he or she can still access all data, using other tools like the Database Browser for SQLite if need be.
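On the question of sub-tables: SQL tables do not nest, but they can be linked with foreign keys, which covers the same ground. The sketch below shows one possible layout, written in Python's `sqlite3` for brevity; the table names `record`, `parameter` and `measurement` are assumptions made for this example, not the schema actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(
    """
    CREATE TABLE record (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE parameter (
        record_id INTEGER NOT NULL REFERENCES record(id) ON DELETE CASCADE,
        name      TEXT NOT NULL,
        value     REAL NOT NULL
    );
    CREATE TABLE measurement (
        record_id INTEGER NOT NULL REFERENCES record(id) ON DELETE CASCADE,
        position  INTEGER NOT NULL,
        value     REAL NOT NULL,
        PRIMARY KEY (record_id, position)
    );
    """
)

# One spectrum with one parameter and an ordered list of measured values.
record_id = conn.execute("INSERT INTO record (name) VALUES (?)", ("spectrum A",)).lastrowid
conn.execute("INSERT INTO parameter VALUES (?, ?, ?)", (record_id, "gain", 2.5))
conn.executemany(
    "INSERT INTO measurement VALUES (?, ?, ?)",
    [(record_id, i, v) for i, v in enumerate([0.1, 0.4, 0.9])],
)

values = [
    v
    for (v,) in conn.execute(
        "SELECT value FROM measurement WHERE record_id = ? ORDER BY position",
        (record_id,),
    )
]
print(values)
```

Because floating point values are stored in binary form rather than as text, this layout tends to be much more compact than the XML equivalent.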

Migrating RMS to RDB

We're approaching the migration of legacy OpenVMS RMS files into relational database (both MS SQL 2012 and Oracle 10g are available).
I wonder if there are:
Tools to retrieve schema of indexed files
Tools to parse indexed files
Tools to deal with custom RMS data formats (zoned decimals etc)
as a bundle/API/Library
Perhaps I should change the approach?
There are several tools available, notably through ODBC vendors (I work for one: Attunity).
1 >> Tools to retrieve schema of indexed files
Please clarify. Looking for just record/column layout and indexes within the files or also relationships between files.
1a) How are the files currently being used? Cobol, Basic, Fortran programs? Datatrieve?
They will be using some data definition method, so you want a tool which can exploit that.
Connx and Attunity Connect can 'import' CDD definitions, BASIC MAP files and Cobol Copybooks. Variants are typically covered as well. I have written many a (Perl/Awk) script to convert special definitions to XML.
1b ) Analyze/RMS, or a program calling RMS XABs, can get the available index information. Attunity Connect will know how to map those onto the fields from 1a)
1c ) There is no formal, stored relationship between (indexed) files on OpenVMS. That's all in the program logic. However, some modestly smart Perl/Awk/DCL script can often generate a table of likely foreign/primary keys by looking for matching field names and datatypes.
How many files / layouts / gigabytes are we talking about?
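The kind of script meant in 1c) can be quite small. Here is a sketch in Python (rather than Perl/Awk/DCL) that pairs up files sharing a field with the same name and datatype. The `LAYOUTS` dictionary is invented for illustration; in practice it would be filled from the imported record definitions of 1a).

```python
from itertools import combinations

# Hypothetical record layouts: file name -> {field name: datatype}.
LAYOUTS = {
    "CUSTOMER": {"CUST_ID": "char(8)", "NAME": "char(30)"},
    "ORDER": {"ORDER_ID": "char(10)", "CUST_ID": "char(8)", "AMOUNT": "packed(9,2)"},
    "ORDER_LINE": {"ORDER_ID": "char(10)", "LINE_NO": "word", "NAME": "char(40)"},
}


def candidate_keys(layouts):
    """List (file_a, file_b, field) where both files share a field name and datatype."""
    found = []
    for (a, fields_a), (b, fields_b) in combinations(sorted(layouts.items()), 2):
        for field in sorted(fields_a.keys() & fields_b.keys()):
            if fields_a[field] == fields_b[field]:
                found.append((a, b, field))
    return found


for file_a, file_b, field in candidate_keys(LAYOUTS):
    print(f"{file_a}.{field} <-> {file_b}.{field}")
```

The output is only a list of candidates: NAME is correctly skipped here because the datatypes differ, but a shared name and type is still no proof of a real relationship.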
2 >> Tools to parse indexed files
Please clarify? Once the structure is known (question 1), the parsing is done by reading using that structure right? You never ever want to understand the indexed file internals. Just tell RMS to fetch records.
3 >> Tools to deal with custom RMS data formats (zoned decimals etc) as a bundle/API/Library
Again, please clarify. Once the structure is known just use the 'right' tool to read using that structure and surely it will honor the detailed data definitions.
(I know it is quite simple to write one yourself, just thought there would be something in the industry)
Famous last words... 'quite simple'. Entire companies have been built and thrive doing just that for general cases. I admit that for specific cases it can be relatively straightforward, but 'the devil is in the details'.
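As an illustration of one such specific case, the sketch below decodes a zoned decimal field stored as ASCII text with a trailing sign overpunch. Both the convention assumed here (`{` and A to I for positive digits 0 to 9, `}` and J to R for negative) and the helper name `decode_overpunch` are assumptions for this example; the files in question may well use a different zoned representation, which is exactly where the details bite.

```python
from decimal import Decimal

# Assumed trailing-overpunch table: character -> (last digit, sign).
OVERPUNCH = {"{": (0, 1), "}": (0, -1)}
for digit, ch in enumerate("ABCDEFGHI", start=1):
    OVERPUNCH[ch] = (digit, 1)
for digit, ch in enumerate("JKLMNOPQR", start=1):
    OVERPUNCH[ch] = (digit, -1)


def decode_overpunch(field, scale=0):
    """Decode a zoned decimal string whose last character carries digit and sign."""
    body, last = field[:-1], field[-1]
    if last.isdigit():
        digit, sign = int(last), 1  # plain digit: treated as unsigned/positive
    elif last in OVERPUNCH:
        digit, sign = OVERPUNCH[last]
    else:
        raise ValueError(f"unexpected sign character in {field!r}")
    if body and not body.isdigit():
        raise ValueError(f"non-digit characters in {field!r}")
    return Decimal(sign * int(body + str(digit))).scaleb(-scale)


print(decode_overpunch("1234E", scale=2))
print(decode_overpunch("1234N", scale=2))
```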
In the Attunity Connect case we have a UDT (User Defined data Type) to handle the 'odd' cases, often involving DATES. Dates in integers, in strings, as units since xxx are all available out of the box, but for example some have -1 meaning 'some high date' which needs some help to be stored in a DB.
All the databases have some bulk load tool (BCP, SQL*Loader).
As long as you can deliver data conforming to what those expect (tabular, comma-separated, quoted-or-not, escapes-or-not) you should be in good shape.
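Producing such a file is mostly a matter of consistent quoting. A small sketch with Python's `csv` module follows; the column names and the sample rows are invented, and the quoting style chosen here (quote only when needed, double any embedded quotes) is an assumption that has to be matched in the loader's own settings.

```python
import csv
import io

# Hypothetical rows already decoded from the indexed file.
rows = [
    (1001, 'O\'Brien, "Pat"', "123.45"),
    (1002, "Smith", "-7.00"),
]

buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
writer.writerow(["cust_id", "name", "balance"])  # header row, if the loader wants one
writer.writerows(rows)

output = buffer.getvalue()
print(output)
```

Fields containing a comma or a quote come out quoted, so the loader sees three columns on every line.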
The EGH tool Vselect may be a handy, and high performance, way to bulk read indexed files, filter and format some and spit out sequential files for the DB loaders. It can read RMS indexed file faster than RMS can! (It has its own metadata language though!)
Attunity offers full access and replication services.
They include CDC (change data capture) to not only load the data, but also keep it up to date in near-real-time. That's useful for 'evolution' versus 'revolution'.
Check out Attunity 'Replicate'. Once you have a data dictionary, just point to the tables desired (include, exclude filters), point to a target DB and click to replicate. Of course there are options for (global or per-table) transformations (like an AREA-CODE+EXCHANGE+NUMBER to single phone number, or adding a modified-date column).
Will this be a single big switch conversion, or is there desire to migrate the data and keep the old systems alive for days, months, years perhaps, all along keeping the data in close sync?
Hope this helps some,
Hein van den Heuvel.
OP: Perhaps I should change the approach? Probably.
You might consider finding data migration vendors, some of which likely have off-the-shelf solutions, if not as a COTS tool, more likely packaged as a service (I don't think this is a big market).
What this won't help you with is what I think of as a much bigger problem with the application code: who is going to change all the code that makes RMS calls into the corresponding code that makes relational DB calls? How will the entity ("Joe Programmer", or some tool) know where the data migrated to, so that he can write the correct call? What are you going to do about the fact that the data representation is likely to change?
Ideally you'd like an automated migration tool that will move the data itself (and therefore knows the data layouts and representation changes), and will make the code changes that correspond. You can look for these kinds of vendors, too.

should xml or sqlite3 be used?

I just started iOS development and am currently developing an application that just reads data from a server and displays it on the screen. What I am not sure of is whether to use XML or sqlite3 to store the data. Which method is preferred, and why? Thanks in advance.
It is important to remember they are two different things, suited to different tasks. Choose the one that fits the problem. (In this case I would likely use XML or "just plain text" because it sounds like just a simple download-cache. Either the raw response could be kept or, perhaps the data already transformed into objects and then automatically serialized into XML or whatnot. In any case, keep it simple.)
XML is (at the very core) a markup format. XML documents are a (hopefully well-defined) structure. There is a large set of tooling that supports manipulation and querying within a hierarchical "document" model. I use XML a good bit for a serialization format and also use it for local caching if appropriate (e.g. there are no non-hierarchical relationships). XML is often loaded entirely into memory (e.g. a DOM) for manipulation.
SQLite is a relational database that is designed around tables and relationships between sets of tables. Being able to run (complex) queries is where a relational database really shines. SQLite is also very fast and can process large data-sets which can't all fit in memory. Columns in SQLite can also contain text (read: XML) so the approaches are not orthogonal.
Happy coding.
It probably all depends on how the data is processed after it is stored. If the data must be sorted, filtered with specific selections etc., then sqlite is the better solution.
A second, less important concern is how much data will be stored. If it's just one "table" with 10 rows, then sqlite is probably too much for it.
If you want to read data from the server and display it on screen and don't need to save it locally, then use XML.
If you want to store it locally and don't want to fetch from the server, then use XML files or a sqlite database in your project.
If you want to fetch from the server and also store it locally, then first use XML to fetch the data and then use sqlite to store it locally.
Also look at @pst's answer for the difference between them.
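The fetch-as-XML, store-in-sqlite combination can be sketched as follows. Python is used here only to keep the example short and runnable; the same steps apply with the XML parser and sqlite3 library available on iOS. The `RESPONSE` string stands in for the server reply, and its element and attribute names are made up.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Stand-in for the XML the server would return (hypothetical structure).
RESPONSE = "<items><item id='2' title='Beta'/><item id='1' title='Alpha'/></items>"

conn = sqlite3.connect(":memory:")  # a file path would be used for a real local cache
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")

# Parse the response once, then keep the rows locally.
rows = [(int(e.get("id")), e.get("title")) for e in ET.fromstring(RESPONSE).iter("item")]
conn.executemany("INSERT OR REPLACE INTO item VALUES (?, ?)", rows)
conn.commit()

# Sorting and selection are now done by the database rather than by hand.
titles = [t for (t,) in conn.execute("SELECT title FROM item ORDER BY title")]
print(titles)
```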