SocialEngine db structure/explanation - socialengine

I'm new to socialEngine and a client of mine asked me to modify his website which is using socialEngine. He is using this CMS for over 6 years now and since then, it has never been modified.
I'm trying to reduce the size of database tables which are really huge.
For example,
the whole db is around 2G,
engine4_user_logins is 435MiB,
engine4_user_fields_search is 615MiB and
engine4_authorization_allow is 234MiB.
I searched for socialEngine db structure but I couldn't find it. I'm not asking you to explain every single table in database. My questions are:
Is it safe to empty these tables? And why are these tables so full?! Is it because of the long time that no modification applied to them?!

It's safe to clear engine4_user_logins however that data in that table is used in statistics section in SocialEngine's admin panel. You can clear it but you'll loose stats data. Don't clear the other two tables. SE uses
engine4_authorization_allow for item permissions and engine4_user_fields_search for searching.
engine4_authorization_allow and engine4_user_fields_search get populated when items are created in SocialEngine. blogs, users, groups, photos, etc. That's why they're huge.
If client is having performance issues, I suggest tweaking server configurations or upgrading the server.

Related

Database schema sample for storing social media post

I am building a small social networking website, I have a doubt regarding database schema:
How should I store the posts(text) by a user?
I'll have a separate POST table and will link USERS table with it, through USERS_POST table.
But every time to display all the posts on user's profile, system will have to search the entire USERS_POST table for USER id and then display?
What else should I do?
Similarly how should I store the multiple places the user has worked or studied?
I understand it's broad but I am new to Database. :)
First don't worry too much, start by making it work and see where you get performance problems. The database might be a lot quicker then you expect. Also it is often much easier to see what the best solution is when you have an actual query that is too slow.
Regarding your design, if a post is never linked to more then one user then forget the USERS_POST table and put the user id in the POST table. In either case an index on the user id would help (as in not having to read the whole table) when the database grows large.
Multiple places for a single user you would store in an additional table. For instance called USERS_PLACES, give it a column user_id to link it to USERS plus other columns for the data you wish to store per place.
BTW In postgresql you might want to keep all object (tables, columns, ...) names lowercase because unless you take care to always quote them like "USERS" postgresql will make them lowercase which can be confusing.

SharePoint groups and shared libraries/lists

This is going to be vague, hopefully not annoyingly so. I know very little about SharePoint, but I'm asking for someone who's more knowledgable but is under lots of crippling pressure. Unfortunately I'm going to be held responsible for the project (it's due before Christmas!!), so I need to see what I can figure out on my own to help out. Please allow my desperation and helplessness to excuse any problems with this question.
We've created an InfoPath form that generates xml files that will be uploaded to SharePoint. The data from these files will be aggregated and used to generate reports. The biggest issue is that the users will be spread out over three locations, and the info generated from each location needs to be firewalled from the others. But we need the xml files from all three locations to go to the same place in order to make the aggregation feasible with minimal manual work.
I've read something about SharePoint groups (http://technet.microsoft.com/en-us/library/cc262778%28v=office.14%29.aspx) and figured that might be the way of doing it, so long as 1) the xml documents could all go to the same library/repository and 2) that shared repository would only show each group their own documents. For at least two users we also need a master view that shows all of the documents regardless of the group that created them.
That's the main question. Ultimately we'll also need a similar way of storing the generated reports (tables and charts) to the creators of the xml files AND a set of users at each location who won't be able to view or create those xml files. But first things first, I guess.
Is this possible and feasible? Any hints/links that could get us started down this path?
I think in your case the best option is to create a folder for each group, and set permissions on them to allow just the specific group of users to access that folder. The same with a separate library for reports. Then, you'd just setup a list view that flattens the folder hierarchy to view all items at once.
You could also set per-document permission programmatically in an event receiver, however, there's a pretty low limit (search for ACL) on the number of unique access control lists per library (it's 50.000 actually). So depending on the number of XMLs you are going to manage you may reach this limit.

How should I deal with copies of data in a database?

What should I do if a user has a few hundred records in the database, and would like to make a draft where they can take all the current data and make some changes and save this as a draft potentially for good, keeping the two copies?
Should I duplicate all the data in the same table and mark it as a draft?
or only duplicate the changes? and then use the "non-draft" data if no changes exist?
The user should be able to make their changes and then still go back to the live and make changes there, not affecting the draft?
Just simply introduce a version field in the tables that would be affected.
Content management systems (CMS) do this already. You can create a blog post for example, and it has version 1. Then a change is made and that gets version 2 and on and on.
You will obviously end up storing quite a bit more data. A nice benefit though is that you can easily write queries to load a version (or a snapshot) of data.
As a convention you could always make the highest version number the "active" version.
You can either use BEGIN TRANS, COMMIT and ROLLBACK statements or you can create a stored procedure / piece of code that means that any amendments the user makes are put into temporary tables until they are ready to be put into production.
If you are making a raft of changes it is best to use temporary tables as using COMMIT etc can result in locks on the live data for other uses.
This article might help if the above means nothing to you: http://www.sqlteam.com/article/temporary-tables
EDIT - You could create new tables (ie NOT temporary, but full fledged sql tables) "on the fly" and name them something meaningful. For instance, the users intials, followed by original table name, followed by a timestamp.
You can then programtically create, amend and delete these tables over long periods of time as well as compare against Live tables. You would need to keep track of how many tables are being created in case your database grows to vast sizes.
The only major headache then is putting the changes back into the live data. For instance, if someone takes a cut of data into a new table and then 3 weeks later decides to send it into live after making changes. In this instance there is a likelihood of the live data having changed anyway and possibly superseding the changes the user will submit.
You can get around this with some creative coding though. There are many ways to tackle this, so if you get stuck at the next step you might want to start a new question. Hopefully this at least gives you some inspiration though.

Limit the field that are syncronized

I'm building an application that runs on a Windows Mobile device. I'm using Microsoft's Sync Framework to sync the Sql CE database with the main corporate db.
The question is how can I limit the fields that are syncronized? The table in question has stacks of fields but I only need to display a few of them on the mobile device and replication is only one way (from the server to the mobile) so that shouldn't be an issue. I've seen this similar question but there's not much info there. Can anyone give me more advice on how to achieve this? I imagine that it's a very common requirement.
Also, does anyone know if I can use the Sync Framework Version 2.0 or do I have to stick to 1.0. I had a feeling that 2.0 doesn't support Windows Mobile but I'm not sure.
Cheers
Mark
You can change the T-SQL that's generated behind the scenes to not include all the columns of the table, but there are a couple of gotchas here. Firstly, it means that you can't use a wizard to modify the sync selection later - not a big deal, and creating your own partial class to override just the specific method with the T-SQL for your table mitigates that a bit.
Second, changes to the unincluded (not sure if that's a word?) columns can also trigger a download of that row as by default the change tracking is by row. You can change this by setting the Track_Columns_Updated flag
ALTER TABLE Employee
ENABLE CHANGE_TRACKING
WITH (TRACK_COLUMNS_UPDATED = ON)
Depending on the number of rows and size of the data and frequency updated, I have often found an easier solution is to provide a trigger on the main table of the server to update records in a separate table containing just the data you need, then sync that. It makes it much easier to change what's downloaded later. This is obviously not a solution if you are downloading the entire works of Shakespeare, but for a few 1000 records of a product catalogue, I think it's perfectly feasible.

CMS Settings, .INI file or MySQL?

I'm looking for the best solution to store the ettings for a website, like the limit of posts for users, limit of users online, ranks, min. number of posts to be able to do something.
Like here, if you're new you can't thumbs up/down a post, or whatever, so how would you store all of these?
I thought of creating a table with constants in mysql but i think it's not the best solution to add a new mysql query on every page refresh.
Why not? After all, MySQL can handle a large number of requests and I've seen much more complex queries than checking access rights. A MySQL query is a query to a file just like checking an INI file, but optimized. I'm guessing that if you don't expect a huge amount of traffic, you'll be fine with a database.
Here it's a matter of preference. I prefer to do this in MySQL because I don't like to parse files and find querying a database easier. Also, editing rows is easier than changing values in a text file.
I'd say your first thought was spot-on. Put constants into a database.