What is a managable way to store e-mails for extended periods of time? - backup

If you have a site which sends out emails to the customer, and you want to save a copy of the mail, what is an effective strategy?
If you save it to a table in your database (e.g. create a table called Mail), it gets very large very quickly.
Some strategies I've seen are:
Save it to the file system
Run a scheduled task to clear old entries from the database - but then you wind up not having a copy;
Create a separate table for each time frame (one each year, or one each month)
What strategies have you used?

I don't agree that gmail is an effective backup for business data.
Why trust your business information to a provider who makes no guarantees of service, or over who you have no control whatsoever?
Makes no sense to me.
Depending on how frequently you need to access this information, I'd say go with the filesystem or database archive. At least that way, you have control over your own data.

Data you want to save is saved in a database. The only exception that is justified is large binary data (images, videos). Who cares how large the table gets? If the mails are automated and template-based, you just have to save the variable parts anyway. The size will be about the same wherever you save it, but you probably already have a mechanism to backup your database, so you won't have to invent one to handle millions of files.

Lots of assumptions:
1. You're running windows / would like an archive in windows
2. The ability to search in the mails is important.
Since you are sending mails to your customers there isn't any reason you can't bcc a mail account of your own. Assuming you have a suitable account on your own server then I'd look at using MailStore (home) to pull the mails out from your account and put them into it's own compressed database.

Another option (depending on the email content) is to not save the email, but make sure you can recreate the email by archiving the original content that went into generating the email.

It depends on the content of your email. If it contains large images. I would plump for the file system. Otherwise if your Mail table table is getting very large very quickly I would go for the separate table, archiving off dead customers.

We save the email to a database table. It really doesn't get that big that quickly. We've a table with 32,000 emails in it (they're biggish emails too # 50kb per email) and with compression, the file only uses 16MB.
If you're sending a shed load of email, then know that GMail(free) currently only allows 7GB of data. I'd be happy holding that on a disk.

I'd think about putting in place some sort of general archiving functionality. How you implement that depends on your specific retrieval needs.
For example if you wish just to retrieve emails sent to a particular customer for a certain month then stocking them in an appropriate heirachy on the File System (zip them up if necessary) should be simple to do. You might want to record a list of sent emails in a database table with a pointer to the appropriate directory but a naming convention for your directories and files might be sufficient
You might not need to access very old emails very infrequently so you might archive these to DVD for example if online storage is a problem
If you're wanting to often search the actual content of emails then your going to have to put the content in a DB table or use an indexer like Lucerne to examine the files stocked on disk

Related

Best way to export data from other company's SAP

I need to extract some data from my client's SAP ECC (the SUIM -> Users by Complex Selection Criteria -program RSUSR002)
Normally I give them a table of values that I they have to fill some field to extract what I need.
They have to make 63 different extractions (with different values of objects, for example - but inside the same transaction - you can see in the print) from their SAP, to later send to me all extracted files.
Do you know if there is an automated way to extract that, so they don't have to make 63 extractions?
My biggest problem is that every time they make mistakes. It's a lot of things to fill..
Can I create a variant and send it to them? Is it possible to export my variant so they can import it without the need to fill 63x different data?
Thank you.
When this is a task which takes considerable effort by multiple people each year, then it is something which might be worth automatizing.
First you need to find out where that transaction gets its data from. If you spend some time analyzing and debugging the program behind the transaction, you will surely find which SELECT's on which database table(s) provide that data. If you are lucky, there might even be a function module for it.
Then you just need to write an own ABAP program which performs the same selections.
Now about the interesting part: How to get that data to you. There are several approaches here. The best one depends on your requirements and your technical infrastructure. Some possibilities are:
Let users run the program in foreground, use the method cl_gui_frontend_services=>gui_download to save the data to a file on the user's PC and ask them to send it to you via email
Run the program in background and save the file on the application server. Then ask your sysadmins how to get that file from their application server to you. The simplest way would be to just map a network fileserver so they all write to the same place, but there might be some organizational hurdles in the way which prevent that. (Our security people would call me crazy if I proposed to allow access to SMB shares from outside of our network, but your mileage may vary)
Have the program send the data to you directly via email. You can send emails from an SAP system using the function module SO_NEW_DOCUMENT_ATT_SEND_API1. This of course requires that the system was configured to be able to send emails (which you can do with transaction code SCOT). Again, security considerations apply. When it's PII or other confidential data, then you should not send it in an unencrypted email.
Use an RFC call to send the data to your own SAP system which aggregates the data
Use a webservice call to send the data to your own non-SAP system which aggregates the data
You can create a recording in transaction SM35.
There you fill a tcode (SUIM), start recording, make some input in transaction SUIM and then press 'Execute'. Then you can go back to recording (F3 multiple times) and the system will generate some table with commands (structure is BDCDATA). You can delete unnecessary part (i.e. BACK button click) and save it to use as a 'macro'. Then you can replay this recording and it will do exactly what you did.
Also it's possible to export/import the recording to text file, so you can explore it's structure, write some VBA script to create such recording from your parameters and sent it to users. But keep in mind that blanks are meaningful.
It's a standard tools so there's no any coding in the system.
You can save the selection as a variant.
Fill in the selection criteria and press Save.
It can be reused.
You can also transport Variants if the they have a special name

Database model to manage documents

I need to build a tables related to manage documents such as jpg,doc,msg,pdf using a sql server 2008 .
As i know sql server support .jpg images, so my question is if it's possible to upload other kind of files into a db.
This is an example of the table (could be redefined if it's needed).
Document : document_id int(10)
name varchar(10)
type image (doesnt know how it might works)
Those are the initial values for a table, but i dont know how to make it useful for any type.
pd: do i need to assign a directory to save this documents into the server?
You can store almost any file type in an sql server table...if you do, you will almost certainly regret it.
Store a meta-data / a pointer to the file in your database instead, and store the files themselves on a disk directly where they belong.
Your database size - and thus hardware required to run it - will grow very rapidly, so you will be incurring large costs that you do not need to incur.
Use Filestream
https://learn.microsoft.com/en-us/sql/relational-databases/blob/filestream-sql-server
I know that a link-only answer is not an answer but I can't believe no one has mentioned it yet
The proper database design pattern is not to save Files into DBMS. You should develop a kind of File Manager Subsystem to manage your files for all of your projects.
File Manager Subsystem
This subsystem should be Reusable, Extendable, Secure and etc. All your projects that want to save Files, can use this subsystem.
Files can be saved in every where such as Local Hard, Network Drive, External Drives, Clouds and etc. So this subsystem should be design to support all kind of requests.
(you can improve the mentioned subsystem by adding a lot of features to it. for example checking duplicate files,...)
This subsystem, should generate a Unique Key for each file. After uploading and saving the files, the subsystem should generate that key.
Now, you can use this Unique Key to save in database (instead of file). Every time if you want to get the file, you can get the Unique Key from database and request to get file from the subsystem by unique key.

What is the best solution to save big amount files receiving everyday?

in my application, i receive at least 1000 files per days from different email.
Pentaho i store them in a folder, but what is the best solution to save these files:
storing in folder in my hard disk or saving in a table (sql)?
Thank you
Unless you plan to take advantage of the additional features offered by a sql database, I would advise you continue to store them to the hard drive, and DO NOT forget the importance of backing up your files.

How to find the document visitior's count?

Actually I am in need of counting the visitors count for a particular document.
I can do it by adding a field, and increasing its value.
But the problem is following.,
I have 10 replication copies in different location. It is being replicated by scheduled manner. So replication conflict is happening because of document count is editing the same document in different location.
I would use an external solution for this. Just search for "visitor count" in your favorite search engine and choose a third party tool. You can then display the count on the page if that is important.
If you need to store the value in the database for some reason, perhaps you could store it as a new doc type that gets added each time (and cleaned up later) to avoid the replication issues.
Otherwise if storing it isn't required consider Google Analytics too.
Also I faced this problem. I can not say that it has a easy solution. Document locking is the only solution that i had found. But the visitor's count is not possible.
It is possible, but not by updating the document. Instead have an AJAX call to an agent or form with parameters on the URL identifying the document being read. This call writes a document into a tracking DB with one or two views and then determines from those views how many reads you have had. The number of reads is the return value of the AJAX form.
This can be written in LS, Java or #Formulas. I would try to do it 100% in #Formulas to make it as efficient as possible.
You can also add logic to exclude reads from the same user or same source IP address.
The tracking database then replicates using the same schedule as the other database.
Daily or Hourly agents can run to create summary documents and delete the detail documents so that you do not exceed the limits for #DBLookup.
If you do not need very nearly real time counts (and that is the best you can get with replicated system like this) you could use the web logs that domino generates by finding the reads in the logs and building the counts in a document per server.
/Newbs
Back in the 90s, we had a client that needed to know that each person had read a document without them clicking to sign or anything.
The initial solution was to add each name to a text field on a separate tracking document. This ran into problems when it got over 32k real fast. Then, one of my colleagues realized you could just have it create a document for each user to record that they'd read it.
Heck, you could have one database used to track all reads for all users of all documents, since one user can only open one document at a time -- each time they open a new document, either add that value to a field or create a field named after the document they've read on their own "reader tracker" document.
Or you could make that a mail-in database, so no worries about replication. Each time they open a document for which you want to track reads, it create a tiny document that has only their name and what document they read which gets mailed into the "read counter database". If you don't care who read it, you have an agent that runs on a schedule that updates the count and deletes the mailed-in documents.
There really are a lot of ways to skin this cat.

I need to store HTML emails in a database. Is that a bad idea?

The templates for these HTML emails are all the same, but there are just different variables for say, first name, last name and such.
Would it just make sense to store the most minimal of data that I need, and load the template in and replace the variables everytime?
Another option would be to actually create the HTML file and store a reference to it, which probably would be the easiest to do except it might be a pain managing the files, and it adds complexity in regards to migrating, file permissions, et cetera.
Looking for opinions from people who've done this before...
GOAL/PURPOSE/USE:
I have a booking engine. When users make a booking, they are sent a confirmation email, generated from the sessionized booking data.
This email provides a "Cannot view this email? See it here" link which provides a web view of the email, in addition to a plaintext view.
I need to display the same email that was sent out, in addition to the plaintext view.
The template is subject to change, but I think because of that very fact I should have a table of templates and map the data to a template.
That's what I would do, because the template layout may change over the time, but the person information should remain the same. So, it makes sense to just store the person information in the database and leave the template out from the database.
In fact, it would be even better if you use template engine such as Velocity (in Java) to construct your HTML emails... very easy, by the way.
On the one hand cpu is more expensive then memory, so mostly it is better to save more data to reduce cpu power used by computation.
But in your case, I would save the minimal data, the emails or what you are tying to save, because it allows you to easily remodel your templates, and to reuse the data at multiple places of your application.
You persist redundant data (especially because of the template) which is in no way normalized. I would not suggest to do that. But mentioned in the comment it is important what you want to do with that data.
If you only save the data you need you could for example exchange that template easy and use another one.
Yea, your right on track. I did a similar thing. All dynamic/runtime variables were starting from ##symbol.
So in database you would have one Template table. One table would be for dynamic/runtime variables. One table for Mapping between Template and dynamic/runtime variables.
tblTemplate - TemplateID, TemplateValue
tblRuntimeVariables - RuntimeVariableID, VariableString, VariableSQL
tblMapping - TemplateID, RuntimeVariableID, RuntimeVariableValue
Advantage of using an extra mapping table is that on adding new dynamic variables to existing change would mean making no change to existing database. Only more rows would be added to tblMapping.
In my case I was also having one extra column for storing SQL Statements in tblRuntimeVariables in case the value for a runtime variable is fetched from database.