Composite C1 - develop locally, sync to live site - webdeploy

I have a couple of Composite C1 CMS websites.
To edit them currently I use the web based CMS on the live site.
However - I would like to update the (code & content) in Visual Studio locally - then sync to the web. However, if my local copy is older than that online (e.g. a non techy client has edited something on the live site) and I Web Deploy - it will go over the top of the new file on the server.
I need a solution that works out the newest change? I can't find anything in Google or the C1 docs.
How can I sync - preferably using Web Deploy. Do I need some kind of version control?
Is there a best practice for this - editing the live site through the web interface seems a bit dicey & is slow.

The general answer to this type of scenario seems to be to use the Package Creator. With that you can develop locally, add the files you've changed to a package, and install that package on a live site. This solution does not at all cover all the parts of you question though, and has certain limitations:
You cannot selectively add content to a package. It's all pages or no pages.
Adding datatypes is easy, but updating them later requires you to delete the datatype (and data), and recreate the datatype.
In my experience packages works well for incremental site updates, if you limit the packages content to be front end stuff, like css, images and such.
You say you need a solution that works out the newest changes - I believe the only solution to this is yourself, with the aid of some tooling. I don't think there's a silver bullet solution here.
Should you use a version control system? Yes! By all means. Even if you are not sharing your code with anyone, a VCS is a great way to get to know Composite C1 from a file system perspective, as you can carefully track what files are changed on disk, as you develop. This knowledge is crucial when you want to continuously add features the a website that is already alive and kicking - you need to know what to deploy, and what not to touch.
Make sure you read the docs on how Composite fits in VCS: http://docs.composite.net/Configuration/C1-and-Version-Control
I assume that your sites are using the XML data storage (if you where using SQL Data Store, your content would not be overridden upon sync).
This means that your entire web application lives in one folder on disk on the web server, which can be an advantage here.
I'll try to outline a solution that could work for you, although I must stress that I've never tried this - I'm making it up as I type.
Let's say you're using git, download the site in it's entirety from the production web server, and commit the whole damned thing* to your master branch.
Then you create a new feature branch from that commit, and start making the changes you want to deploy later, and carefully commit your work as you go along, making sure you only commit the changes that are needed for your feature to work, to the feature branch.
Now, you are ready to deploy, and you switch back the master branch, and again download the entire site and commit it to master.
You then merge your feature branch into the master branch, and have git do all the hard work of stitching you changes in with the changes from the live site. There are bound to be merge conflicts, and that is where you will have to jump in, and decide for yourself what content needs to go live.
After this is done and tested, you can web deploy the site up to the production environment.
Changes to the live site might have occurred while you where merging, so consider closing the site, or parts of it, during this process.
If you are using SQL Data Store i suggest paying for a tool like Red Gate's SQL Compare and SQL Data Compare or SQL Delta, to compare your dev database to the production database, and hand pick SQL scripts that can be applied to the production database along with your feature deployment.
'* Do consider using a .gitignore file to avoid committing certain files - refer to the docs for mere info.

I suppose you should use the Package Creator
Also have a look here: http://docs.composite.net/Configuration/C1-and-Version-Control

Related

Why is it better to work on a local website copy than live on the website?

I used to work on the live website when I'm editing a website (I'm working alone), but some people told me "it's the old way". I'm inclined to evolve and I like to work, but how can't I lose time doing this?
First, that means that I need to get a copy of the website on my computer. I need to copy the files, dump and restore the database, first waste of time. If my customer adds extension on the website in the meantime(for example, Wordpress) my modification should be impacted then I need to add it on my local copy to. If I need to modify the DB I will need to do it on the local copy too.
Secondly if I want to show a work in progress to my customer I need to apply all modifications to the live website and check than everything works, still a waste of time.
And finally when everything is ok, I need to update again the live website, files and DB.
So, there's two options:
this is not the correct way to do and there's tools to do all that transparently (I hope so)
this is not a waste of time but a needed time to work properly (I understand why agency have big prices and I'll keep my method)
It depends on the complexity of the project and the size of your team.
One of the major risk of working on a live site is the introduction bugs in production. You also want to have some confirmation of functionalities developed from QA or your customer before having your users access them.
Basically, you want to make sure your new code does not break the live site, so working on the local instance could help you in this way, and you could also deploy on the live test site you changes for approval and QA.
Also if you working with a larger team working on the live site just won't scale up and the risk of introducing bug is even higher.
You could consider using docker, to simplify development on your local machine.

SQL (or any relational db) engine with SCM-friendly backing store [duplicate]

I'm doing a web app, and I need to make a branch for some major changes, the thing is, these changes require changes to the database schema, so I'd like to put the entire database under git as well.
How do I do that? is there a specific folder that I can keep under a git repository? How do I know which one? How can I be sure that I'm putting the right folder?
I need to be sure, because these changes are not backward compatible; I can't afford to screw up.
The database in my case is PostgreSQL
Edit:
Someone suggested taking backups and putting the backup file under version control instead of the database. To be honest, I find that really hard to swallow.
There has to be a better way.
Update:
OK, so there' no better way, but I'm still not quite convinced, so I will change the question a bit:
I'd like to put the entire database under version control, what database engine can I use so that I can put the actual database under version control instead of its dump?
Would sqlite be git-friendly?
Since this is only the development environment, I can choose whatever database I want.
Edit2:
What I really want is not to track my development history, but to be able to switch from my "new radical changes" branch to the "current stable branch" and be able for instance to fix some bugs/issues, etc, with the current stable branch. Such that when I switch branches, the database auto-magically becomes compatible with the branch I'm currently on.
I don't really care much about the actual data.
Take a database dump, and version control that instead. This way it is a flat text file.
Personally I suggest that you keep both a data dump, and a schema dump. This way using diff it becomes fairly easy to see what changed in the schema from revision to revision.
If you are making big changes, you should have a secondary database that you make the new schema changes to and not touch the old one since as you said you are making a branch.
I'm starting to think of a really simple solution, don't know why I didn't think of it before!!
Duplicate the database, (both the schema and the data).
In the branch for the new-major-changes, simply change the project configuration to use the new duplicate database.
This way I can switch branches without worrying about database schema changes.
EDIT:
By duplicate, I mean create another database with a different name (like my_db_2); not doing a dump or anything like that.
Use something like LiquiBase this lets you keep revision control of your Liquibase files. you can tag changes for production only, and have lb keep your DB up to date for either production or development, (or whatever scheme you want).
Irmin (branching + time travel)
Flur.ee (immutable + time travel + graph query)
XTDB (formerly called 'CruxDB') (time travel + query)
TerminusDB (immutable + branching + time travel + Graph Query!)
DoltDB (branching + time-travel + SQL query)
Quadrable (branching + remote state verification)
EdgeDB (no real time travel, but migrations derived by the compiler after schema changes)
Migra (diffing for Postgres schemas/data. Auto-generate migration scripts, auto-sync db state)
ImmuDB (immutable + time-travel)
I've come across this question, as I've got a similar problem, where something approximating a DB based Directory structure, stores 'files', and I need git to manage it. It's distributed, across a cloud, using replication, hence it's access point will be via MySQL.
The gist of the above answers, seem to similarly suggest an alternative solution to the problem asked, which kind of misses the point, of using Git to manage something in a Database, so I'll attempt to answer that question.
Git is a system, which in essence stores a database of deltas (differences), which can be reassembled, in order, to reproduce a context. The normal usage of git assumes that context is a filesystem, and those deltas are diff's in that file system, but really all git is, is a hierarchical database of deltas (hierarchical, because in most cases each delta is a commit with at least 1 parents, arranged in a tree).
As long as you can generate a delta, in theory, git can store it. The problem is normally git expects the context, on which it's generating delta's to be a file system, and similarly, when you checkout a point in the git hierarchy, it expects to generate a filesystem.
If you want to manage change, in a database, you have 2 discrete problems, and I would address them separately (if I were you). The first is schema, the second is data (although in your question, you state data isn't something you're concerned about). A problem I had in the past, was a Dev and Prod database, where Dev could take incremental changes to the schema, and those changes had to be documented in CVS, and propogated to live, along with additions to one of several 'static' tables. We did that by having a 3rd database, called Cruise, which contained only the static data. At any point the schema from Dev and Cruise could be compared, and we had a script to take the diff of those 2 files and produce an SQL file containing ALTER statements, to apply it. Similarly any new data, could be distilled to an SQL file containing INSERT commands. As long as fields and tables are only added, and never deleted, the process could automate generating the SQL statements to apply the delta.
The mechanism by which git generates deltas is diff and the mechanism by which it combines 1 or more deltas with a file, is called merge. If you can come up with a method for diffing and merging from a different context, git should work, but as has been discussed you may prefer a tool that does that for you. My first thought towards solving that is this https://git-scm.com/book/en/v2/Customizing-Git-Git-Configuration#External-Merge-and-Diff-Tools which details how to replace git's internal diff and merge tool. I'll update this answer, as I come up with a better solution to the problem, but in my case I expect to only have to manage data changes, in-so-far-as a DB based filestore may change, so my solution may not be exactly what you need.
There is a great project called Migrations under Doctrine that built just for this purpose.
Its still in alpha state and built for php.
http://docs.doctrine-project.org/projects/doctrine-migrations/en/latest/index.html
Take a look at RedGate SQL Source Control.
http://www.red-gate.com/products/sql-development/sql-source-control/
This tool is a SQL Server Management Studio snap-in which will allow you to place your database under Source Control with Git.
It's a bit pricey at $495 per user, but there is a 28 day free trial available.
NOTE
I am not affiliated with RedGate in any way whatsoever.
I've released a tool for sqlite that does what you're asking for. It uses a custom diff driver leveraging the sqlite projects tool 'sqldiff', UUIDs as primary keys, and leaves off the sqlite rowid. It is still in alpha so feedback is appreciated.
Postgres and mysql are trickier, as the binary data is kept in multiple files and may not even be valid if you were able to snapshot it.
https://github.com/cannadayr/git-sqlite
I want to make something similar, add my database changes to my version control system.
I am going to follow the ideas in this post from Vladimir Khorikov "Database versioning best practices". In summary i will
store both its schema and the reference data in a source control system.
for every modification we will create a separate SQL script with the changes
In case it helps!
You can't do it without atomicity, and you can't get atomicity without either using pg_dump or a snapshotting filesystem.
My postgres instance is on zfs, which I snapshot occasionally. It's approximately instant and consistent.
I think X-Istence is on the right track, but there are a few more improvements you can make to this strategy. First, use:
$pg_dump --schema ...
to dump the tables, sequences, etc and place this file under version control. You'll use this to separate the compatibility changes between your branches.
Next, perform a data dump for the set of tables that contain configuration required for your application to operate (should probably skip user data, etc), like form defaults and other data non-user modifiable data. You can do this selectively by using:
$pg_dump --table=.. <or> --exclude-table=..
This is a good idea because the repo can get really clunky when your database gets to 100Mb+ when doing a full data dump. A better idea is to back up a more minimal set of data that you require to test your app. If your default data is very large though, this may still cause problems though.
If you absolutely need to place full backups in the repo, consider doing it in a branch outside of your source tree. An external backup system with some reference to the matching svn rev is likely best for this though.
Also, I suggest using text format dumps over binary for revision purposes (for the schema at least) since these are easier to diff. You can always compress these to save space prior to checking in.
Finally, have a look at the postgres backup documentation if you haven't already. The way you're commenting on backing up 'the database' rather than a dump makes me wonder if you're thinking of file system based backups (see section 23.2 for caveats).
What you want, in spirit, is perhaps something like Post Facto, which stores versions of a database in a database. Check this presentation.
The project apparently never really went anywhere, so it probably won't help you immediately, but it's an interesting concept. I fear that doing this properly would be very difficult, because even version 1 would have to get all the details right in order to have people trust their work to it.
This question is pretty much answered but I would like to complement X-Istence's and Dana the Sane's answer with a small suggestion.
If you need revision control with some degree of granularity, say daily, you could couple the text dump of both the tables and the schema with a tool like rdiff-backup which does incremental backups. The advantage is that instead of storing snapshots of daily backups, you simply store the differences from the previous day.
With this you have both the advantage of revision control and you don't waste too much space.
In any case, using git directly on big flat files which change very frequently is not a good solution. If your database becomes too big, git will start to have some problems managing the files.
Here is what i am trying to do in my projects:
separate data and schema and default data.
The database configuration is stored in configuration file that is not under version control (.gitignore)
The database defaults (for setting up new Projects) is a simple SQL file under version control.
For the database schema create a database schema dump under the version control.
The most common way is to have update scripts that contains SQL Statements, (ALTER Table.. or UPDATE). You also need to have a place in your database where you save the current version of you schema)
Take a look at other big open source database projects (piwik,or your favorite cms system), they all use updatescripts (1.sql,2.sql,3.sh,4.php.5.sql)
But this a very time intensive job, you have to create, and test the updatescripts and you need to run a common updatescript that compares the version and run all necessary update scripts.
So theoretically (and thats what i am looking for) you could
dumped the the database schema after each change (manually, conjob, git hooks (maybe before commit))
(and only in some very special cases create updatescripts)
After that in your common updatescript (run the normal updatescripts, for the special cases) and then compare the schemas (the dump and current database) and then automatically generate the nessesary ALTER Statements. There some tools that can do this already, but haven't found yet a good one.
What I do in my personal projects is, I store my whole database to dropbox and then point MAMP, WAMP workflow to use it right from there.. That way database is always up-to-date where ever I need to do some developing. But that's just for dev! Live sites is using own server for that off course! :)
Storing each level of database changes under git versioning control is like pushing your entire database with each commit and restoring your entire database with each pull.
If your database is so prone to crucial changes and you cannot afford to loose them, you can just update your pre_commit and post_merge hooks.
I did the same with one of my projects and you can find the directions here.
That's how I do it:
Since your have free choise about DB type use a filebased DB like e.g. firebird.
Create a template DB which has the schema that fits your actual branch and store it in your repository.
When executing your application programmatically create a copy of your template DB, store it somewhere else and just work with that copy.
This way you can put your DB schema under version control without the data. And if you change your schema you just have to change the template DB
We used to run a social website, on a standard LAMP configuration. We had a Live server, Test server, and Development server, as well as the local developers machines. All were managed using GIT.
On each machine, we had the PHP files, but also the MySQL service, and a folder with Images that users would upload. The Live server grew to have some 100K (!) recurrent users, the dump was about 2GB (!), the Image folder was some 50GB (!). By the time that I left, our server was reaching the limit of its CPU, Ram, and most of all, the concurrent net connection limits (We even compiled our own version of network card driver to max out the server 'lol'). We could not (nor should you assume with your website) put 2GB of data and 50GB of images in GIT.
To manage all this under GIT easily, we would ignore the binary folders (the folders containing the Images) by inserting these folder paths into .gitignore. We also had a folder called SQL outside the Apache documentroot path. In that SQL folder, we would put our SQL files from the developers in incremental numberings (001.florianm.sql, 001.johns.sql, 002.florianm.sql, etc). These SQL files were managed by GIT as well. The first sql file would indeed contain a large set of DB schema. We don't add user-data in GIT (eg the records of the users table, or the comments table), but data like configs or topology or other site specific data, was maintained in the sql files (and hence by GIT). Mostly its the developers (who know the code best) that determine what and what is not maintained by GIT with regards to SQL schema and data.
When it got to a release, the administrator logs in onto the dev server, merges the live branch with all developers and needed branches on the dev machine to an update branch, and pushed it to the test server. On the test server, he checks if the updating process for the Live server is still valid, and in quick succession, points all traffic in Apache to a placeholder site, creates a DB dump, points the working directory from 'live' to 'update', executes all new sql files into mysql, and repoints the traffic back to the correct site. When all stakeholders agreed after reviewing the test server, the Administrator did the same thing from Test server to Live server. Afterwards, he merges the live branch on the production server, to the master branch accross all servers, and rebased all live branches. The developers were responsible themselves to rebase their branches, but they generally know what they are doing.
If there were problems on the test server, eg. the merges had too many conflicts, then the code was reverted (pointing the working branch back to 'live') and the sql files were never executed. The moment that the sql files were executed, this was considered as a non-reversible action at the time. If the SQL files were not working properly, then the DB was restored using the Dump (and the developers told off, for providing ill-tested SQL files).
Today, we maintain both a sql-up and sql-down folder, with equivalent filenames, where the developers have to test that both the upgrading sql files, can be equally downgraded. This could ultimately be executed with a bash script, but its a good idea if human eyes kept monitoring the upgrade process.
It's not great, but its manageable. Hope this gives an insight into a real-life, practical, relatively high-availability site. Be it a bit outdated, but still followed.
Update Aug 26, 2019:
Netlify CMS is doing it with GitHub, an example implementation can be found here with all information on how they implemented it netlify-cms-backend-github
I say don't. Data can change at any given time. Instead you should only commit data models in your code, schema and table definitions (create database and create table statements) and sample data for unit tests. This is kinda the way that Laravel does it, committing database migrations and seeds.
I would recommend neXtep (Link removed - Domain was taken over by a NSFW-Website) for version controlling the database it has got a good set of documentation and forums that explains how to install and the errors encountered. I have tested it for postgreSQL 9.1 and 9.3, i was able to get it working for 9.1 but for 9.3 it doesn't seems to work.
Use a tool like iBatis Migrations (manual, short tutorial video) which allows you to version control the changes you make to a database throughout the lifecycle of a project, rather than the database itself.
This allows you to selectively apply individual changes to different environments, keep a changelog of which changes are in which environments, create scripts to apply changes A through N, rollback changes, etc.
I'd like to put the entire database under version control, what
database engine can I use so that I can put the actual database under
version control instead of its dump?
This is not database engine dependent. By Microsoft SQL Server there are lots of version controlling programs. I don't think that problem can be solved with git, you have to use a pgsql specific schema version control system. I don't know whether such a thing exists or not...
Use a version-controlled database, of which there are now several.
https://www.dolthub.com/blog/2021-09-17-database-version-control/
These products don't apply version control on top of another type of database -- they are their own database engines that support version control operations. So you need to migrate to them or start building on them in the first place.
I write one of them, DoltDB, which combines the interfaces of MySQL and Git. Check it out here:
https://github.com/dolthub/dolt
I wish it were simpler. Checking in the schema as a text file is a good start to capture the structure of the DB. For the content, however, I have not found a cleaner, better method for git than CSV files. One per table. The DB can then be edited on multiple branches and merges extremely well.

Merging 2 WordPress databases with both corresponding and unique content + minor tweaks

A little intro: I've been developing a WordPress website at my company, for a big, international client. Since it's launch some time ago, we've obviously had some improvements, tweaks, bugs, etc. We have a couple of developers here, working with a clean local - staging - production workflow, syncing our local environments with git. When changes have to be made, we pull a version from the live site, make changes locally, upload it on a staging site, and after approval and final testing, deploy to the live website.
Our last round of changes was a big one, recoding a lot of our theme files as well as tweaking the database a bit. After uploading it on the staging site for approval however, things went awry. Our contact person got confused with the versions, and went on to publish new posts and edit pages on the live site, as well as create new content on the staging site. I don't mean 'do the same thing on both'... they did some work on each.
Now I somehow have to sync the two, so I can deploy the site and move on to further development (we also need the site live because we wrote an API that communicates between the client's large corporate site, and it's smaller subsidiaries' sites)
I checked out these posts already:
How can I merge two MySQL tables?
Merge 2 SQL Server databases
Merging wordpress Databases
And I tried multiple built-in and plugin solutions.
Having a very large website a lot of custom fields (ACF) and having them translated in 4 languages, makes this even harder, I think. Exporting and importing has broken something on each try (often with the translation), and plugins like Database Sync only offer a complete replacement of the db's, and thus losing the unique content each site has. I have some knowledge of SQL queries, and could freshen these up a bit more, but I don't really see how I could manually merge the two sql-files.
In short: I need to sync 2 databases from 2 slightly different versions of mostly the same website. There is unique and duplicate content, as well as minor tweaks in WP.

How to set up development environment for yql and open table development? How to test it locally? (best practice)

I develop -- from time to time -- yahoo open tables to access different resources on the web. Currently I am using a JavaScript editor and -- when I want to test if my open table works -- I upload the xml table description to a server to test it with a yql client application. However this approach is quite slow and -- sometimes -- I get blocked by yahoo, because of a mistake in my open table description. Therefore I would like to learn about best practices on how to test and develop yahoo open table locally. How does look your set up for the open table development?
To clarify my question, I am looking for any convenient way (best practise) to develop and test yql tables, e.g., running a part of a Java Script inside Rhino.
First of all: I agree that I don't see a really convenient way either to test YQL datatable definitions locally. Nevertheless, here is how I approach this issue.
Hosting on github
YQL datatable definitions are often used in very open scenarios e.g. when there is an existing API that you want to wrap via YQL. Therefore I am normally working on a fork of the YQL community tables and I just add my own definitions there. The hosting of the .xml files takes place on github in this case: https://github.com/yql/yql-tables
The other advantage of this approach is as well that it is easy for me to share my datatables with the community if I feel that they might be valuable for others as well.
Hosting privately
The free github account only comes with free repositories though, so everybody would be able to see and use your datatables. If that is not good for you then you could either buy a github pro account to get private repositories, or host the datatable definitions yourself.
To do that you can upload them to your own server - as you are already doing - or you should also be able to set up a web server like Apache locally on your machine and then get a dynamic hostname from dyndns.com or similar, so that you can point to this definitions from YQL. I have not tried this because github was working sufficiently well for me but I am sure that it is possible.
Why don't you just put the file you are editing in a public dropbox folder? This is what I do and it works pretty well.

Best IT/back-office system hacks? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
Lots of people have things that their systems do for them or for their teams. Source control post-commit hooks are a standard example: have an automated build system that checks out the latest source, compiles, tests, and packages it is a back-office hack that most of us probably use.
What other cool things have you done?
We had one developer in our team who wasn't familiar with the concept of a subversion conflict. He deduced that if he simply deleted all that weird stuff in his code and clicked resolve that everything was ok (i.e. knocking out all the other changes in the file....)
Regardless to say, after the 5th time this occurred, and the 5th time that I had to explain why that defect that I just closed was reoccuring, I wrote a script.
It would diff for the changes to a file to see whether the consecutive checkin deleted all the previous changes and that they were done by the nameless developer.
It would then send an email to the boss with a description of what happened, and how much work was lost during the checkin.
There was no 7th occurrence.
We have a traffic-light that shows whether our daily build succeeds, has failed tests or simply doesn't build.
Also, we have a light bar that lights up for a few seconds whenever we receive an upload from a customer.
We aren't staffed 24x7 but we have critical processes that run throughout the night. We created an in-house alerts system to notify us of serious system issues, failed mission-critical processes, etc. It uses text-to-speech to create a descriptive message and then connects to our automated dialer to call the appropriate people with the message.
Working at a web design company I configured our dev server so we could see a working copy of a project in real time by a sub domain name. So if your name was joe and you were working on project jetfuel you would go to joe.jetfuel.test-example.com and you could see your changes instantly without committing.
This was a simple hack that used sub domain names as a partial directory structure. Our htdocs path looked like this htdocs/tag/project. We had a script (a php app that you would access by setup.test-example.com) that would create a new tag name for you and checkout whatever version you wanted and call the deploy script for that project. If it succeeded it would forward you to the new sub domain. You could then work on this new copy by a samba share.
This worked really well for us since we always deployed to the same linux build and our projects had simple database requirements.
Our original reason for doing this was because our developers worked on all kinds of different platforms. Besides fixing this platform problem this was awesome for viewing changes and testing. We had all kinds of tags ranging from peoples names, trunk versions, test tags, all the way to prototypes like jquery-menu-hack.jetfuel.test-example.com
Now that I look back I wonder how much easier it would have been to run virtual machines.
We had a dev working on a classic ASP site that didn't believe in source control. The code went from his machine straight to the production box. This lead to issues with lost changes or the inability to revert back to a stable version. Since CruiseControl.Net has the ability to monitor a directory, I added a project that actually checked in files whenever they were copied to production. Completely backward from CC.Net's original intent, but we didn't lose any more code.
Put in a pre-commit hook that checks the bug comment refers to an open bug, assigned to the user doing the checkin. (SCMBug can do this).
Then to make life REALLY interesting, spell check the comments!!
The commit comment, and the one in the code. (spell is my buddy)
Run the code through a code formatter set to compayn standard; and diff it to the original: if it's not in company offical format: reject the commit.
Do a coverage test with the unit test build.
Email all mistakes/errors caused to the development team.
I left OUT the name of the developer. They know they did it.
Not exactly hacks, but a couple of must-haves for IT dev work:
If you're using subversion, you've got to use CommitMonitor. (http://tools.tortoisesvn.net/CommitMonitor) It lets you monitor svn repositories for new commits & then review the new commits. Great if you're wanting to stay on top of what your team is doing. Particularly if you have a couple of juniors that need to be watched. ;)
Rsnapshot (http://www.rsnapshot.org/) is also invaluable - we have complete backup snapshots of our entire filesystem every four hours going back 2 years, and every day beyond that. It's like a data cube for your filesystem! The peace of mind this gives is pure bliss. :)
Hardly a hack, but back in the day, on our speedy VAX 11/730, our overnight process would print the file "BLAMMO.TXT" on the printer if something went amiss. Every morning, the first stop was the printer when coming in.
Back in the dotCom days about 9 years ago, I had to hack a failover system between two different locations. We had a funky setup with a powerbuilder front end website, and powerbuilder managment tool. Data was stored in MSSQL 7.0. The webservers used IPX to communicate to the SQL Servers (don't ask). Anyway, I was responsbile for coming up with a failover plan.
I ended up hacking together some linux boxes, and had them run our external DNS. One at each location. We had a remote site w/ webserver, and sql server I got SQL transaction replication working over a 128k ISDN IPX connection (of all things). Then built a monitoring tool at our production site to send packets out to various upstream network handoffs. If we experienced more than 20% outage the primary site, the monitoring tool ran a perl script on the Debian box to change DNS and point to our 2ndary. Our secondary had a heartbeat w/ our primary DNS, and monitoring station. It would duplicate records unless it lost both connections then it would roll over to pointing DNS to backup location.
The primary site would shut down the SQL server at the primary location to break replication. Automated site to site failover using 128k ISDN IPX connection :)
Back at my previous job, we had to audit many tables for data changes (inserts, updates and deletes). Our support crew had to be able to search through this data to find changes that users made.
The temporary solution that had become semi-permanent was to store each non-select query. However this was a large system, that the table would grow by about 1.5GB a day.
The solution I came up with was to create a script that for all tables in an external list, created the appropriate triggers that audit each table, row, column, before and after, when and by whom and store it in our new audit table. This table grew by about 10% the size of the older version and stored much more usable data. It enabled us to create a UI to search and view every change made to our data, without requiring any knowledge of SQL for our support team or business users.
This is at a lesser level, but I am fairly proud of a make file I wrote for compiling code for my research. It only needs to be given your source and header file names that can take care of the rest all by itself (though it does make the one assumption that you will not be compiling any header files into objects, only source files get compiled). The other downsides are the fact that it relies on the GNU make program's second expansion feature, so I don't know if it works on other make programs. Additionally the compiler used needs to support something similar to gcc's -MM feature. Here is hoping that no one laughs at it.
-include prereqs.mk
HEADERS=$(SRC_DIR)/gs_lib.h $(SRC_DIR)/gs_structs.h
SOURCES=$(SRC_DIR)/main.cpp $(SRC_DIR)/gs_lib.cpp
OBJECTS=$(patsubst $(SRC_DIR)/%.cpp,$(OBJ_DIR)/%.o,$(SOURCES))
release: FLAGS=$(GEN_FLAGS)$(OPT_FLAGS)
release: $(OBJECTS) prereqs.mk
$(CXX) $(FLAGS) $(LINKER_FLAGS) $(OUTPUT_FLAG) $(EXECUTABLE) $(OBJECTS)
prereqs.mk: $(SOURCES) $(HEADERS)
$(CXX) $(DIR_FLAGS) $(MAKE_FLAG) $(SOURCES) | sed 's,\([abcdefghijklmnopqrstuvwxyz_]*\).o:,\1= \\\n,' > $#
.SECONDEXPANSION:
$(OBJECTS): $$($$(patsubst $(OBJ_DIR)/%.o,%,$$#))
$(CXX) $(FLAGS) $(NO_LINK_FLAG) $(OUTPUT_FLAG) $# $(patsubst $(OBJ_DIR)/%.o,$(SRC_DIR)/%.cpp,$#)
Obviously I dropped the definition of a number of variables, but I think it gets the idea across.
Since my coding tools and style are compatible with the requirements of this script I like to use it. All I need to do to add (a) new piece(s) of source code is add its name(s) to the appropriate variable and the rest is taken care of.
We have Twitter accounts for many projects which tweet things like commit messages, notices from builds, failed unit tests, deployments, bug tracking activity - any kind of event associated with the project. Running a client like Twitter Gwibber (which displays a pop-up for each new status) is a great way to stay in touch with the activity on the projects you are interested. Using Twitter is good as you can take advantage of all the 3rd party apps - such as the iPhone clients.
Add commit-hook check for VRML/3d-model files with absolute path to textures/images. f:/maya/my-textures/newproject/xxxx.png just doesn't belong on the server.
Back in the 1993, when source control systems were really expensive and unwieldy, the company I worked about had an in-house source control built as 4DOS scripts. It wasn't as sofisticated as most current source control systems, for example it didn't have branching or integrates, but it did the basic job of supporting revisions history, checkout/checkin and rudimentary conflict resolution.