SQL Server: big tables or storing data in an XML field - sql

I have a .NET solution with a big form that collects a lot of data from the customer, like a multi-step wizard that gathers everything we need.
So I was wondering which is better (from a performance and design standpoint): a traditional big table with many fields, or storing the data in a single column of XML type.
Example of a "TraditionalTable":

RecordId | CustomerId | Data1      | Data2 | ... | DataN
1        | 120        | 01/01/1980 | abcd  | ... | 123
2        | 20         | 04/02/2004 | fgh   | ... | 230
3        | 10         | 05/01/1995 | xyz   | ... | 135
Example of a "DataWithXMLField":

RecordId | CustomerId | FormData
1        | 120        | <data><customerdetails><borndate>01/01/1980</borndate></customerdetails><financialinfo>....

I've done many systems like this and prefer to keep the data as XML (often it's a serialized object). I find this efficient both at runtime and at design time. (See the item below about binary attachments.)
The following are some suggestions based on what I've done in the past. Obviously it's not a one-size-fits-all hammer...
Often data is "collected" by a user and "approved" by an administrator. While the data is being collected, it's stored as XML. When approved, the XML is shredded and placed into "normal" relational tables/fields (see the T-SQL sketch after these suggestions).
Often this data has been collected through multiple pages. Storing as XML allows collecting data in a way that is logical to the user but doesn't fit the final data structure very well.
If a form is abandoned (not completed or canceled) it's easy to delete a single row.
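As a rough illustration of that shredding step, here is a minimal T-SQL sketch. The FormSubmissions and Customers tables, the @RecordId parameter, and the XPath are assumptions based on the XML example above, not a prescribed schema:

-- Shred the approved XML from the staging row into a normal relational table
INSERT INTO Customers (CustomerId, BornDate)
SELECT
    s.CustomerId,
    x.c.value('(borndate)[1]', 'varchar(10)')  -- convert to a date type as appropriate
FROM FormSubmissions AS s
CROSS APPLY s.FormData.nodes('/data/customerdetails') AS x(c)
WHERE s.RecordId = @RecordId;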
Things to keep in mind:
Some data is related to workflow and is separate from the data being collected. For example, a field for "Form Status" may go from "In Progress" to "Submitted" to "Approved". This type of data should be kept as regular columns.
Store Binary Data separately. If your form includes submitting binary data (like uploading a PDF) I like to generate a GUID on the front end. Store that GUID in the XML and then save the binary data separately using the GUID. Possibly on disk or in a separate "attachments" table.
Define a column for a "version number" of the XML. This way you can programmatically identify what is in the XML. This will help in the future when you need to make changes to the XML.
Define a column for a "Summary" that is short human-friendly version of the XML. For example, if your XML contains information for registering for summer camps, your "XML Summary" might contain the text: "SMITH,JOHN, Camp White Pine 2021". This text us calculated on the front end. It can then be used for displaying rows of data without having to poke into the XML. For example, an administrative page may exist that lists applications that require approval.
Define a column to indicate whether the XML meets all your requirements. You don't want to validate XML in the database (it's often hard, and it likely duplicates the UI). Your business layer can apply business rules (validation) to the XML (or classes) and store in the database an indicator that all business rules are met.
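Putting those columns together, a minimal sketch of such a table might look like this (all names and types are illustrative, not prescriptive):

CREATE TABLE FormSubmissions (
    RecordId    int IDENTITY(1,1) PRIMARY KEY,
    CustomerId  int NOT NULL,
    FormStatus  varchar(20) NOT NULL,    -- workflow column: 'In Progress', 'Submitted', 'Approved'
    XmlVersion  int NOT NULL,            -- version number of the XML layout
    XmlSummary  nvarchar(200) NULL,      -- short human-friendly summary, calculated on the front end
    IsValid     bit NOT NULL DEFAULT 0,  -- set once the business layer says all rules are met
    FormData    xml NOT NULL             -- the collected form itself
);

-- Binary attachments live outside the XML, keyed by a GUID generated on the front end
CREATE TABLE Attachments (
    AttachmentId uniqueidentifier PRIMARY KEY,
    RecordId     int NOT NULL REFERENCES FormSubmissions (RecordId),
    Content      varbinary(max) NOT NULL
);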

Related

What kind of dynamic content is available in Eloqua?

In Eloqua, can you send out an email to a contact list but version the "hero" image headline for each segment using dynamic content blocks?
And then can you do the reverse: keep the main image the same, and dynamically populate below it the products they've purchased in the past?
For scenario 1, yes, that is possible out of the box.
Scenario 2, however, is a bit more complicated and would generally require a 3rd-party tool to provide this type of dynamic code generation based on a lookup table (in this case, a line-item inventory or purchases). Because a contact could have zero or more products (commonly held as individual records in a CDO), you would generally need to aggregate or count the related records, then generate your HTML table and formatting around those record values, and be contextually aware of whether it is the first or last record (to open and close the table).
Dynamic content does not have mathematical functions and would not be able to count those related records - this is something usually provided by a B2C system like SFMC using AMPscript, or dynamically generated through custom code and sent through a transactional SMTP service. You could layer multiple dynamic content blocks on top of each other, but your biggest limitation becomes the field merge, which only lets you select a record by earliest/latest creation date or last modified date. This is not suitable if you have more than 2 records. A third-party service that provides a cloud content module for your email is your best bet.

How to manage additional processed data in MarkLogic

MarkLogic 9.0.8.2
We have around 20M records in MarkLogic.
For one business requirement, we need to generate additional data for each XML document, which users will then need to search.
Since we can't change the original documents, we need input on the best way to manage this additional data. These are the options we have thought of:
Option 1: Create a separate collection and store the additional data in a separate XML document with the same unique number as the original XML. When a user searches, we search this collection, then retrieve the original documents and send the response back.
Option 2: Store the additional data in the original document's properties.
We also need to create an element range index to make sure this works when the end user supplies range operators.
<abc>
  <xyz>
    <quan>qty1</quan>
    <value1>1.01325E+05</value1>
    <unit>Pa</unit>
  </xyz>
  <xyz>
    <quan>qty2</quan>
    <value1>9.73E+02</value1>
    <value2>1.373E+03</value2>
    <unit>K</unit>
  </xyz>
  <xyz>
    <quan>qty3</quan>
    <value1>1.8E+03</value1>
    <unit>s</unit>
  </xyz>
  <xyz>
    <quan>qty4</quan>
    <value1>3.6E+03</value1>
    <unit>s</unit>
  </xyz>
</abc>
We need to process the data from the value1 element. Users will then search for something like:
qty1 >= minvalue AND qty1 <= maxvalue
qty2 >= minvalue AND qty2 <= maxvalue
qty3 >= minvalue AND qty3 <= maxvalue
So when a user searches on qty1, the search should only consider elements where quan is qty1, and so on.
So I would like to know:
What is the best approach to store data like this?
What kind of index should I create to implement this?
I would recommend wrapping the original data in an envelope, which allows adding the extra data in a header. It also allows creating a canonical view of the relevant pieces of the data: you can either store that canonical view as the instance and keep the original as an 'attachment' (a sub-property, not an attached binary), or keep the instance as-is and put canonical values for indexing in the header (a sketch of such an envelope follows below).
There is a lengthy blog article about the topic, that discusses pros and cons in high detail: https://www.marklogic.com/blog/envelope-design-pattern/
HTH!
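For illustration, an envelope around the sample document might look roughly like this (the header element names are made up for this sketch; the point is that the original content stays unchanged inside the instance):

<envelope>
  <header>
    <!-- canonical values lifted out of the instance purely for indexing -->
    <qty1>1.01325E+05</qty1>
    <qty2-min>9.73E+02</qty2-min>
    <qty2-max>1.373E+03</qty2-max>
    <qty3>1.8E+03</qty3>
    <qty4>3.6E+03</qty4>
  </header>
  <instance>
    <abc>
      <!-- the original document, exactly as before -->
    </abc>
  </instance>
</envelope>

With element range indexes (for example, of type double) on those header elements, a search like qty1 >= minvalue AND qty1 <= maxvalue becomes a pair of simple range queries, with no need to correlate the quan and value1 siblings at query time.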
Grtjn's answer would be the recommended solution, as it is more performant to keep all the information inside the document itself versus having to query across both the document and its properties, but it does require changing the document.
Option 1 & 2 could both work.
Properties documents already exist, so it doesn't add fragments, but the properties must conform to the schema.
Creating a sidecar document provides more flexibility, because you are creating new documents, it will increase number of fragments.

What are the security risks if I disclose database field name to web user interface?

I want to make my program simpler, so I use the table's field names as the name attributes of my HTML inputs.
That way I save some time mapping input names to database field names.
But are there security risks if users know my field names?
(Assume SQL injection is already handled in the server program.)
Update 1:
I am not trying to get around field-name validation.
I just don't want to do something like this:
$uid = $_POST['user_id'];
$ufname = $_POST['user_first_name'];
$ulname = $_POST['user_last_name'];
If I do this:
$user_id = $_POST['user_id'];
$user_first_name = $_POST['user_first_name'];
$user_last_name = $_POST['user_last_name'];
I can save coding time, I don't need to think of two names for one piece of data, and I reduce bugs.
And I can also do something like this to save more time, since I only type each name once:
$validField = array("user_id", "user_first_name", "user_last_name");
foreach ($validField as $field) {
    $orm[$field] = $field;
}
This also validates the field names,
so I think there is no way for hackers to get at my unpublished fields.
"I can save some time for mapping input name to database field name."
If you save time mapping input names to database field names, you would need to spend a roughly equivalent time validating that the field names are, in fact, among the fields that users can access in your database. There is no way around this validation, because otherwise your DB is exposed to hacks that try to get at your unpublished fields, such as IDs and hashes. This is pretty bad, so you would need to build that validation layer.
On the other hand, if you map from meaningless IDs to meaningful ones, then you do not need validation, because it is your program that produced the meaningful IDs. Essentially, the validation step is built into the process.
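In PHP terms, that validation layer can be a small whitelist check in front of the ORM. A minimal sketch, reusing the field list from the question:

$validFields = array("user_id", "user_first_name", "user_last_name");
$orm = array();
foreach ($validFields as $field) {
    if (isset($_POST[$field])) {
        // only whitelisted names are copied; any unexpected POST keys are ignored
        $orm[$field] = $_POST[$field];
    }
}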

How to fetch data for a news feed like system?

I have a few tables, as shown below.
Polls

PollId | Question | Option
1      | What     | 1
2      | Why      | 4

Updates

UpdateId | Text
1        | Sleep
2        | Play
Polls and Updates are just two sample tables (in reality there are more tables: photos, videos, links, etc.). When a user visits his home page (like the Facebook news feed), he must be shown data relevant to him (no such data is included in this example). That is, I want to select data from all the tables with as few query executions as possible, presenting a mixture of data: polls, photos, videos, etc.
Currently, I'm fetching only the ids and the type (i.e., which table) from all of the tables, and gathering the remaining data while iterating through this result set (i.e., calling another SqlQuery from C#).
Is there a way to query the data from all the tables at once (OUTER JOIN? UNION?)?
Or simply:
How can I select different types of entities at once in a single SQL query?
You could write your query so that you have one long select list for everything you want, and it would all come back in one result set, but I suspect that wouldn't work too well because you might have varying numbers of the different types of items per user.
If you really must have it all in one hit, then you can issue multiple queries in one go and get multiple result sets back. To handle this you can use an ADO.NET DataSet. See this SO example (but not the accepted answer - see Vikram Dibyal's answer, as that gives a very basic overview of what I think you're asking for).
I won't copy and paste the stuff from the linked thread, just head over and take a look.
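That said, here is a rough sketch of both shapes against the sample tables. The column choices are assumptions; [Option] and [Text] are bracketed because they are reserved words in T-SQL:

-- One result set: UNION ALL with a discriminator column telling you each row's type
SELECT 'poll' AS ItemType, PollId AS ItemId, Question AS Body FROM Polls
UNION ALL
SELECT 'update' AS ItemType, UpdateId AS ItemId, [Text] AS Body FROM Updates;

-- One round trip, multiple result sets: fill an ADO.NET DataSet from this batch
SELECT PollId, Question, [Option] FROM Polls;
SELECT UpdateId, [Text] FROM Updates;

In practice you would also want a shared timestamp column in each table to ORDER BY, so the feed can interleave the different item types chronologically.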

Storing Data as XML BLOB

At the moment the team I am working with is looking into the possibility of storing data, entered by users across a series of input wizard screens, as an XML blob in the database. The main reason for this is that I would like to write the input wizard as a component that can be dropped into a number of systems without having to bring a large table structure with it.
To clarify: if the wizard has 100 input fields (for example), then with a normal relational DB structure there will be a 1-to-1 relationship, so the table will have 100 columns. To get this working in another system, I would have to bring the tables, stored procedures, etc. into the new system.
I have a number of reservations about this, but I would like people's opinions.
Thanks
If those inputted fields don't need to be updated later, or used in later calculations or computations, then storing the values as XML or JSON is a smart choice.
So for your scenario it seems like a perfect solution.