VB.NET: Properly manage data from XML - vb.net

Good morning all.
I'm relatively new to the Visual Basic realm (although traditionally a web-based script developer), and I've come to ask you a question. I am reading data from an XML file. This local XML file will be updated by another application, and I will need to periodically re-evaluate the XML file and import only new data into a list box. Furthermore, I want to be able to click on a particular item in the list box and display the other values for that particular XML entry.
So, I suppose this is a multi part question. What is the proper way to import only NEW data into the program, what is the proper way to store the data, and how do I associate a value in a listbox with the data stored elsewhere?
I've considered multidimensional arrays, but I have been told that converting strings to char arrays and then back to strings is a terrible way to manage the data. I was never offered an alternative.
I will be satisfied with a list of topics to study up on and/or an example for an answer to this question.

I would probably use classes that implement INotifyPropertyChanged and a BindingList. Then you just need to listen to ListChanged events off of the list and update the list box then.
I have a blog post that discusses binding classes and interfaces if you want to learn more about them: Data Binding Classes, Interfaces, and Attributes in Windows Forms 2.0. It might be a little dated by now; I haven't reviewed it since I wrote it in March 2007.
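A minimal sketch of that approach, assuming a Windows Forms project. The Entry class, its property names, and the form layout are hypothetical and only illustrate the pattern. When the BindingList is the list box's DataSource, the list box follows additions and property changes by itself, so an explicit ListChanged handler is only needed for extra work.

```vb
Imports System
Imports System.ComponentModel
Imports System.Windows.Forms

' Hypothetical model for one XML entry; the property names are assumptions.
Public Class Entry
    Implements INotifyPropertyChanged

    Public Event PropertyChanged As PropertyChangedEventHandler _
        Implements INotifyPropertyChanged.PropertyChanged

    Private _title As String

    Public Property Id As String
    Public Property Details As String

    Public Property Title As String
        Get
            Return _title
        End Get
        Set(value As String)
            If _title <> value Then
                _title = value
                ' Tell bound controls that the displayed text changed.
                RaiseEvent PropertyChanged(Me, New PropertyChangedEventArgs("Title"))
            End If
        End Set
    End Property
End Class

Public Class MainForm
    Inherits Form

    Private ReadOnly entries As New BindingList(Of Entry)()
    Private ReadOnly entryList As New ListBox() With {.Dock = DockStyle.Left, .Width = 200}
    Private ReadOnly detailsLabel As New Label() With {.Dock = DockStyle.Fill}

    Public Sub New()
        Controls.Add(detailsLabel)
        Controls.Add(entryList)

        ' The list box shows Title but keeps the whole Entry as its item.
        entryList.DisplayMember = "Title"
        entryList.DataSource = entries
        AddHandler entryList.SelectedIndexChanged, AddressOf ShowDetails

        ' Sample data; in the real program these would come from the XML file.
        entries.Add(New Entry With {.Id = "1", .Title = "First entry", .Details = "Details of the first entry"})
        entries.Add(New Entry With {.Id = "2", .Title = "Second entry", .Details = "Details of the second entry"})
    End Sub

    Private Sub ShowDetails(sender As Object, e As EventArgs)
        Dim selected = TryCast(entryList.SelectedItem, Entry)
        detailsLabel.Text = If(selected Is Nothing, "", selected.Details)
    End Sub
End Class

Module Program
    <STAThread>
    Sub Main()
        Application.Run(New MainForm())
    End Sub
End Module
```

Because SelectedItem is the Entry object itself, there is no need to keep a separate lookup between list box rows and the stored data.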

As a start look at the XmlDocument and XmlReader classes.
http://msdn.microsoft.com/en-us/library/system.xml.xmldocument.aspx
http://msdn.microsoft.com/en-us/library/system.xml.xmlreader.aspx
XmlDocument loads a document into memory and lets you look at the document in any way you desire. Depending on the size of the file, there may be implications for how long pulling in the file takes.
XmlReader allows access on the fly and works very much like a DataReader, i.e. it keeps track of your position in the data and does not retain any data once you have inspected it.
For keeping track of updates, it depends on where the XML is stored.
If it is in a file, a FileSystemWatcher may help in determining when you need to update.
http://msdn.microsoft.com/en-us/library/system.io.filesystemwatcher.aspx
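One possible way to combine these ideas, sketched with LINQ to XML (XDocument) rather than XmlDocument for brevity. The element name "entry" and the "id" attribute are assumptions about the file format, and the module and method names are hypothetical. Entries are treated as new when their id has not been seen before.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Xml.Linq

Module XmlImport
    ' IDs of entries that were already imported (assumes each entry has a unique id).
    Private ReadOnly seenIds As New HashSet(Of String)()

    ' Returns only the <entry> elements whose id attribute has not been seen before.
    Public Function LoadNewEntries(xmlPath As String) As List(Of XElement)
        Dim fresh As New List(Of XElement)()
        Dim doc = XDocument.Load(xmlPath)
        For Each item In doc.Descendants("entry")
            Dim idAttr = item.Attribute("id")
            ' HashSet.Add returns False when the id is already present.
            If idAttr IsNot Nothing AndAlso seenIds.Add(idAttr.Value) Then
                fresh.Add(item)
            End If
        Next
        Return fresh
    End Function

    ' Calls onNew with the new entries each time the file is written.
    ' Note: Changed may fire more than once per save, the handler runs on a
    ' worker thread unless SynchronizingObject is set, and the file may still
    ' be locked by the writer; production code should handle all three.
    Public Function WatchFile(folder As String, fileName As String,
                              onNew As Action(Of List(Of XElement))) As FileSystemWatcher
        Dim watcher As New FileSystemWatcher(folder, fileName)
        watcher.NotifyFilter = NotifyFilters.LastWrite
        AddHandler watcher.Changed,
            Sub(sender As Object, e As FileSystemEventArgs)
                onNew(LoadNewEntries(e.FullPath))
            End Sub
        watcher.EnableRaisingEvents = True
        Return watcher
    End Function
End Module
```

In a Windows Forms program, setting watcher.SynchronizingObject to the form makes the Changed handler run on the UI thread, so the callback can add items to the list box directly.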

Related

How to store a List or Collection in a dataset table/column? (VB.NET)

I have a dataset table with various columns that are created during form load.
These columns are currently either system.double or system.string types.
And it is displayed in a datagridview.
This works fine.
But I need another column that can store a "list" or some collection in the data table.
A list of strings would do but a custom class would be better.
How is this usually done?
I have spent literally weeks googling this and I don't know where to start. The more I have looked, the more confused I have become. I end up with more questions than answers, such as: how is it displayed in the DataGridView? I read something about a combo box?
I hope someone can give me some pointers on how to achieve this. I've not posted any code, as I think it's more the theory of this that I need help with.
What you are asking about involves two separate concerns for most programmers: the storage of data (#1) and the display of that data to the user (#2).
For #1 I recommend the .NET Entity Framework. It supports storing, querying and updating classes that are backed by a database. According to most tutorials I have found, you can either model the database tables and their relations and then build a database from that model, or take an existing database and create entities (Entity Framework's class objects) around its existing structures and relationships.
Here is a link to a very good beginner tutorial that I have used before: CodeProject Entity Framework Tutorial for Absolute Beginners
For #2 I can recommend Windows Presentation Foundation. It has lots of bells and whistles that make using a data source and displaying the relevant dependent data very easy through its unique method of data binding. From the tutorials I have used on Pluralsight, it can be as easy as dragging and dropping from an imported data source such as the Entity Framework database. Alternatively, you can handle selected-row changes for one data grid and then show the dependent data in another data grid.

Programming Theory: Saving VB.NET

What is the best way to save the state of a program? Maybe that is not the right way to describe it, but what I mean is this: in almost any application you can input a whole bunch of data, make selections and choices, and then save these in files unique to the application you're working with.
For the time being I will ask my question in terms of VB.NET, since it is what I am currently working with. I understand how to use a StreamWriter to write data to a file (any file extension can be used, even one you make up yourself) and how to later open the file with a StreamReader and load the saved application state. At least, that is what I know how to do.
Are there other ways to approach saving the state? In my case I have a dictionary that is defined through user input to store a lot of data and I am trying to find the best way to save the dictionary so I can load it again.
I would suggest the best way to save state is the way that makes sense to you (and presumably fits in with your architectural style).
There are various locations and methods of encoding state into a file but, with the exception of a few extreme cases, there's unlikely to be any (user) perceivable performance differences between the techniques.
If one was feeling especially concerned about such things, it might be worth hiving off the reading or writing of state onto a background worker thread but I'd probably hold off on that if / until you actually start running into any disk bound perf issues.
You can actually do this using the VS.NET IDE and click your way to happiness. Click the control whose state you want to save, then expand ApplicationSettings in the Properties window. Then click the ... (box) by PropertyBinding. Now choose the property you want to store the setting for and click New. Now name your setting and, under Scope, select whether it is a per-user setting or an application setting.
OMG, THAT IS IT, AMAZING!
Now, when you want to save the state of a windows form, just put in your code:
My.Settings.Save()
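Property bindings cover individual control values. For the dictionary mentioned in the question, one alternative is to serialize it to a file. The following is only a sketch: it assumes a Dictionary(Of String, String), uses DataContractSerializer (which needs a reference to System.Runtime.Serialization), and the StateStore module name is hypothetical.

```vb
Imports System.Collections.Generic
Imports System.IO
Imports System.Runtime.Serialization

Module StateStore
    ' Writes the dictionary to the given file as XML.
    Public Sub SaveState(state As Dictionary(Of String, String), filePath As String)
        Dim serializer As New DataContractSerializer(GetType(Dictionary(Of String, String)))
        Using stream = File.Create(filePath)
            serializer.WriteObject(stream, state)
        End Using
    End Sub

    ' Reads a dictionary previously written by SaveState.
    Public Function LoadState(filePath As String) As Dictionary(Of String, String)
        Dim serializer As New DataContractSerializer(GetType(Dictionary(Of String, String)))
        Using stream = File.OpenRead(filePath)
            Return DirectCast(serializer.ReadObject(stream), Dictionary(Of String, String))
        End Using
    End Function
End Module
```

A call such as StateStore.SaveState(myDictionary, "state.xml") on exit and StateStore.LoadState("state.xml") on startup would restore the data; the file name here is just an example.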

Populating PDF fields from a database

I have a PDF file (not created by me - I have no control over the design etc.) which allows users to fill in some form fields in Adobe Reader and save the result. I want to automate the process of populating the fields, using the following steps:
1. Fetch data from database.
2. Open PDF template.
3. Populate form fields with data.
4. Save modified file to a separate location on disk.
5. Lock modified file so that the form fields can no longer be edited.
6. Send file to user.
I'm happy to use PHP, Perl, Python or Java to do steps 2-5 (in descending order of preference), but whatever I use has to work under Linux (i.e. it mustn't rely on libraries which are only available on Windows for example).
The end result should be a PDF which the average user can open and print, but not modify (I'm sure advanced users could find a way to do so, but I accept that I can't guarantee complete security against modification). I don't want to change the structure of the PDF, merely populate the form fields.
Is there a standard piece of software for doing this? I've seen mentions of FDF Toolkit, but I'm not entirely sure if that's what I want and whether it will allow me to lock the file afterwards, and whether what I want to do fits in with the EULA.
Edit: Final answer is to use iText (as suggested by Mark Storer) but to implement it as a web service which allows you to pass in an array of form field names and values and the PDF file 'template'. The web service will be open source (and available on GitHub once I've written it), as per the AGPL, but anything connecting to it won't have to be.
Filling
Any number of different libraries can fill in field values. I'm partial to iText (Java) or iTextSharp (C#). I wrote one in Java a number of years ago (it's not that hard). There are lots. Search SO, you'll find 'em.
Locking
There are a couple different levels of "lock the fields".
Each field has a "read only" flag. This is pretty much a courtesy as far as other libraries capable of setting field values are concerned. In fact, it's generally considered to mean "the ui cannot make changes". Form script can, regardless.
Form flattening: Drawing the fields directly into the page and removing all the interactivity.
Each one has pros and cons.
Flag: None too secure. Form data still easily accessible. Scrolling fields still scroll.
Flattening: Pretty much the exact opposite. It's harder to modify (though far from impossible). The form data can only be extracted via text extraction (which is hard, but becoming increasingly common). List and text fields that contain more content than is visible will no longer scroll.
The ability to flatten forms is relatively rare. Again, iText can do it (as can iTextSharp), but I'm not aware of any other third party libraries that can... I'm sure they exist, I just can't name them off the top of my head.
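As a rough illustration of filling and then flattening, here is a sketch against the iTextSharp 5.x API (PdfReader, PdfStamper, AcroFields). It is written in VB.NET to match the rest of this page; the Java iText 5 API has equivalent calls (setField, setFormFlattening). The module name, paths and field names are hypothetical, and the field names passed in must match those defined in the PDF template.

```vb
Imports System.Collections.Generic
Imports System.IO
Imports iTextSharp.text.pdf

Module PdfFiller
    ' Copies the template to outputPath with the given field values drawn
    ' into the pages. Flattening removes the interactive form fields.
    Public Sub FillAndFlatten(templatePath As String, outputPath As String,
                              values As IDictionary(Of String, String))
        Dim reader As New PdfReader(templatePath)
        Try
            Using output As New FileStream(outputPath, FileMode.Create)
                Dim stamper As New PdfStamper(reader, output)
                For Each pair In values
                    ' SetField returns False when the template has no such field.
                    stamper.AcroFields.SetField(pair.Key, pair.Value)
                Next
                stamper.FormFlattening = True
                ' Closing the stamper writes the finished document to the stream.
                stamper.Close()
            End Using
        Finally
            reader.Close()
        End Try
    End Sub
End Module
```

Leaving FormFlattening at False and marking the fields read-only instead would give the "flag" variant described above.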

Loading XML Files into VB.Net Structures

I have many (15-20) different XML files that I need to load into VB.Net. They're structured as they would be in a database: they were designed in Access and bulk exported into XML files. Each file represents a different table in the database.
Now, I need to load this information into VB.Net. Initially, I'd love to use DAO and access the MDB directly via queries, but this won't be possible as I'm making sure the project will be easily ported to XNA/C# down the road. (Xbox 360 cannot use MDBs, so I'd rather deal with this problem now than down the road).
So, I'm stuck now trying to figure out how to wrangle all of these XML files together. I've tried using factories to parse each one individually. E.g., if three XML files contain data for a 'character' class, I'd pass an instance of Character to each XML factory and the classes would apply the necessary data.
I'm trying to get past this, though, as maintaining many different classes with redundant code is a pain, and it is hard to debug as well. So I'm trying to figure out a new solution.
The only thing I can think of right now is using System.Reflection, where I parse through each member of the class/structure I'm instantiating, and then using the names of those members to read in the data from that element of the XML file.
However, this makes the assumption that each member of the structure/class has a matching element in the XML file, and vice-versa.
If you know the schema of the XML files, you could create .NET classes and deserialize each of those XML files into an instance of a .NET object.
You can also use xsd.exe (it comes with the Windows SDK download) to generate the .NET class definitions for you if you have an XSD file (or can write an XSD more easily than you can write a serializable .NET class).
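A sketch of what such classes might look like, using XmlSerializer. The root element name "dataroot" and the Character, Name and Strength elements are assumptions about how the exported file is laid out, and the class names are hypothetical. The generic LoadTable function is one way to avoid writing a separate parser per file.

```vb
Imports System.Collections.Generic
Imports System.IO
Imports System.Xml.Serialization

' Maps the root element of one exported table (element names are assumed).
<XmlRoot("dataroot")>
Public Class CharacterTable
    ' Each <Character> child element becomes one CharacterRow.
    <XmlElement("Character")>
    Public Property Rows As New List(Of CharacterRow)()
End Class

Public Class CharacterRow
    Public Property Name As String
    Public Property Strength As Integer
End Class

Module TableLoader
    ' Deserializes any XML file into the class that describes its layout.
    Public Function LoadTable(Of T)(xmlPath As String) As T
        Dim serializer As New XmlSerializer(GetType(T))
        Using stream = File.OpenRead(xmlPath)
            Return CType(serializer.Deserialize(stream), T)
        End Using
    End Function
End Module
```

With this in place, loading a table is a single call such as TableLoader.LoadTable(Of CharacterTable)("Characters.xml"), where the file name is an example. Elements in the file that have no matching property are skipped by default, and properties without a matching element keep their default values.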
Linq-to-XML is a good solution (and even better in VB.NET with things like XML Literals and Global Namespaces). Treating multiple XML files as DB tables can be a rough road sometimes, but it is certainly not impossible. I guess I'd start with JOIN (even though it has "C#" in the title, the samples are also in VB).
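To give an idea of what a join across two "table" files could look like, here is a small sketch in VB query syntax. The element names (Character, Weapon, WeaponId, Id, Name) are invented for the example, as are the module and function names.

```vb
Imports System.Collections.Generic
Imports System.Linq
Imports System.Xml.Linq

Module XmlJoin
    ' Joins characters to weapons on a shared key, like an inner join in SQL.
    Public Function DescribeCharacters(characters As XElement, weapons As XElement) As List(Of String)
        Dim query = From c In characters.Elements("Character")
                    Join w In weapons.Elements("Weapon")
                        On c.Element("WeaponId").Value Equals w.Element("Id").Value
                    Select c.Element("Name").Value & " wields " & w.Element("Name").Value
        Return query.ToList()
    End Function
End Module
```

The two arguments would typically come from XElement.Load on the exported files. The sketch assumes every row contains the elements it reads; a missing element would cause a NullReferenceException.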

Object serialization practical uses?

How many of the software projects you have worked on used object serialization? I personally have never come across a scenario where object serialization was used. One use case I can think of is server software storing objects to disk to save memory. Are there other types of software where object serialization is essential or preferred over a database?
I've used object serialization in a lot of my projects. Sometimes we use it to store computer-specific settings locally. I have also used XML serialization to simplify interaction and generation of XML documents. It is also very beneficial in communication protocols. Serialize on one end and re-inflate on the other end.
Well, converting objects to XML or JSON is a form of serialization that is quite common on the web. I've also worked on a project where objects were created and serialized to a binary file in one application and then imported into another custom application (though that's fragile since it uses C# and serialization has broken in the past between versions of the .NET framework). Also, application settings that have a complex structure may be useful to serialize. I also think remoting APIs use serialization to communicate. Basically, serialization in general is simply a way to store the states of your objects, and this has many different uses.
Here are a few uses I can think of:
Send an object across network, the most common example is serializing objects across a cluster
Serialize object for (sort of) caching, ie save the state in a file and read it back later
Serialize passive/huge data to a file to minimize the memory consumption and read it back whenever required.
I'm using serialization to pass objects across a TCP socket. You put XmlSerializers on either side, and it parses your data into readily available objects. If you do a little ground work, you can get it so that you're basically passing objects back and forth, and it makes socket communication extremely easy, reducing it to nothing more than socket.Send(myObject);.
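A sketch of what that groundwork might look like, under some assumptions: the ChatMessage class and the ObjectWire module are hypothetical, and a 4-byte length prefix is added so the receiver knows where one object ends on the stream (XmlSerializer alone does not frame messages). Any Stream works, including the NetworkStream of a TcpClient.

```vb
Imports System.IO
Imports System.Xml.Serialization

' Hypothetical message type; XmlSerializer needs a public class
' with a parameterless constructor.
Public Class ChatMessage
    Public Property Sender As String
    Public Property Text As String
End Class

Module ObjectWire
    ' Writes a 4-byte length prefix followed by the XML payload.
    Public Sub SendObject(Of T)(target As Stream, value As T)
        Dim serializer As New XmlSerializer(GetType(T))
        Using memory As New MemoryStream()
            serializer.Serialize(memory, value)
            ' The writer is not disposed so that the target stream stays open.
            Dim writer As New BinaryWriter(target)
            writer.Write(CInt(memory.Length))
            writer.Write(memory.ToArray())
            writer.Flush()
        End Using
    End Sub

    ' Reads one length-prefixed payload and turns it back into an object.
    Public Function ReceiveObject(Of T)(source As Stream) As T
        Dim serializer As New XmlSerializer(GetType(T))
        Dim reader As New BinaryReader(source)
        Dim length = reader.ReadInt32()
        Dim payload = reader.ReadBytes(length)
        Using memory As New MemoryStream(payload)
            Return CType(serializer.Deserialize(memory), T)
        End Using
    End Function
End Module
```

Both ends need the same class definition. The sketch does no error handling for a connection that closes in the middle of a message.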
Interprocess communication is a biggie.
You can combine a DB and serialization, for example when you have to store an object with a lot of attributes (often dynamic, i.e. one object's attribute set will differ from another's) in a relational DB and you don't want to create a new column for each attribute.
We started out with a system that serialized all of the thousands of in-memory objects to disk every 15 minutes or so. When that started taking too long we switched over to a mixed mode of saving the objects into a relational db and pickle file (this was a python system btw). Eventually the majority of the data was stored in a relational database. Interestingly, the system was written in such a way that all of the application code couldn't care less what was going on down there. It was all done using XP and thousands of automated tests.
Document based applications such as word processors and vector graphics editors will often serialize the document model to disk when the user invokes the Save command. Serialization is often preferred over complex databases in these apps.
Using serialization saves you time each time you want to implement an import/export functionality.
Every time you need to export your system's data, create backups or store some kind of settings, you could use serialization instead and just save the state of the objects that represent the actual config, data or whatever else.
Only when you need a specific format for the exported/imported data does it make sense to build a custom parser and exporter/importer.
Serialization is also change-proof. Whenever you change the format of the object that is involved in the exchange functionality, it is automatically exportable and you don't have to change the logic behind your export/import parts.
We used it for backup and update functionality. Basically, serialized Hibernate objects were backed up, then the DB schema was altered by the update, and we delivered a helper class that converted the old objects to the new DB schema. This way we had a pretty solid update mechanism that wouldn't break easily and that did an automatic backup at the same time.
I've used XML serialization heavily on one project. The technique was used to persist to database data structures that had no common structure, so the data couldn't be stored directly. I also used serialization to separate application settings that could be changed at runtime.