Importing Product in Adobe CQ5 - jcr

I have a question on how we can import/synchronise products from our back-office to the CQ5 front end.
The target architecture is pretty simple: a custom back-office manages all the products (it will basically be the source of truth), and a CQ5-driven website shows search results (driven by Adobe SearchAndPromote) and product details. Purchase transactions will be handled outside of CQ5.
I went through http://dev.day.com/docs/en/cq/current/ecommerce/eCommerce-framework.html and I think I have some idea of the direction we should move in, but I would like someone to confirm that my understanding is correct.
1) I need to create a scheduled job running on the author node that calls the back-office and imports the products as a JSON feed. I use the annotation-based @Service(Runnable.class) - is there a way to set it up so it runs on the author node only?
2) Create a custom service (called by the job above) that actually creates all the nodes in CRX. If I have desktop and mobile versions of the site, do I need to create all those nodes twice? Are there any tips on an easier way to create them?
3) Let CQ5 replicate those products to the publish nodes.
Is there an easier way? I mean, if I was using a more standard web app I would have one controller to show product details, two templates (one for mobile, one for desktop) and a service that would call the back-office and return details for the requested product. But the Sling world is very different, and I want to check that I understand it correctly.
Cheers.

Here are some answers:
1) Here is a good article about different configurations for different run modes: http://helpx.adobe.com/cq/kb/RunModeSetUp.html. You can create configs for the publish and author run modes with a flag that your code looks for, which will tell it whether or not to execute the import (a minimal sketch follows this list).
2) It depends. CQ tends to keep copies of content for a mobile site, so it may make sense to copy nodes for the mobile site, but only if those nodes are pages (cq:Page and cq:PageContent) that you create based on the imported data. Otherwise you just need to save the imported data somewhere and fetch it when needed (via JCR queries or methods like .getNode()). In that case it of course makes sense not to copy your data.
3) It depends here as well. I would consider the following factors: should the imported data be editable? How frequent are updates? How massive are they? How critical is consistency across publish instances? If updates are not massive, not frequent, and consistency matters, importing on author followed by replication can work. It may also be the right choice if you need to be able to edit the imported data. If updates are massive and/or frequent and consistency across publish instances does not matter much (you can afford that some people may see different results from different publish instances during an import), I'd suggest running the import on all publish instances at the same time, since massive replication of imported data may affect regular page/image replication.
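For point 1), here is a minimal sketch of what such a run-mode-aware scheduled importer could look like with the Felix SCR annotations mentioned in the question, using SlingSettingsService to skip the work on publish instances. The cron expression and the ProductImportService interface are hypothetical placeholders for your own code.

    import org.apache.felix.scr.annotations.Component;
    import org.apache.felix.scr.annotations.Property;
    import org.apache.felix.scr.annotations.Reference;
    import org.apache.felix.scr.annotations.Service;
    import org.apache.sling.settings.SlingSettingsService;

    @Component(immediate = true, metatype = true)
    @Service(Runnable.class)
    @Property(name = "scheduler.expression", value = "0 0 2 * * ?") // hypothetical schedule: every night at 02:00
    public class ProductImportJob implements Runnable {

        @Reference
        private SlingSettingsService slingSettings;

        @Reference
        private ProductImportService importService; // hypothetical service that fetches the JSON feed and writes the nodes

        @Override
        public void run() {
            // Only import on author instances; publish instances receive the products via replication.
            if (!slingSettings.getRunModes().contains("author")) {
                return;
            }
            importService.importProducts();
        }
    }

Alternatively, the flag described in 1) can live in a run-mode-specific OSGi configuration (config.author vs config.publish) and be injected into the component, instead of checking the run modes in code.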
Thanks,
Max.


Is it possible to add a new wiki entry in a GitLab project using a standard merge request?

Using free, self-managed GitLab
Because group-level custom project templates are available only to paid tiers (and even there with restrictions), I am looking into an alternative solution using a simple project, where
in the code repository each branch provides a template
in the wiki each code repository branch has an entry documenting what the template does
I know that the wiki and the code are actually two separate repositories.
By its nature, a template is a construct that offers a pre-made setup for working on a recurring task. A group template adds the additional restriction that the recurring task applies to more than one individual.
In order to limit tomfoolery and people pushing whatever they want thinking it's worth becoming a group-level template (even though they made something real quick to tackle a problem that has long been forgotten and even they themselves will never work on it again), I would like to impose access restrictions on all members. Besides the maintainer/owner, all other members are assigned the Developer role. All branches are protected, so changing an existing branch or creating a new one can only be done through a merge request, which leads to an assessment of whether the committed changes are actually worthy of becoming a template for the whole group.
Many members of my group have the bad habit of choosing poor names for functionality they have developed (e.g. a script called jennifers_help_script_23.py) and not documenting what was actually implemented. And yes, we are not a software development company but a research institute. :D So, in order to improve the documentation and the ability to actually reuse some of the quite useful things that people have developed, I would like to make it mandatory for people to provide documentation if they want their stuff added to the project.
So the question here is: can a user submit a code merge request that also acts as a merge request for a change in the wiki (e.g. the user has created a new template, which also requires a new wiki page documenting that template), or do the two have to be handled separately given the nature of a GitLab project (wiki separate from code)?
I was thinking maybe each branch (representing a template) could contain a markdown file that is inserted into the wiki automatically after the merge request has been approved. However, I don't know how to automate this. I am currently looking into uploading a file to the wiki using the GitLab API, hoping I can somehow add a trigger in GitLab to execute the "command" upon a successful merge. Needless to say, I am quite new to all of this.
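One possible way to automate that last step (not the only one) is to call the GitLab Wikis API (POST /projects/:id/wikis) from a pipeline job that only runs on the default branch, so it fires once the merge commit lands. Below is a rough Java sketch of that API call; the instance URL, project ID, token variable and the idea of passing the markdown file path as an argument are my own assumptions, and error handling is minimal.

    import java.net.URI;
    import java.net.URLEncoder;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;

    public class WikiUploader {

        public static void main(String[] args) throws Exception {
            String gitlabUrl = "https://gitlab.example.com";   // assumption: your self-managed instance
            String projectId = "42";                            // assumption: numeric project ID
            String token = System.getenv("WIKI_TOKEN");         // assumption: a CI/CD variable holding an access token
            String title = args[0];                             // e.g. the template/branch name
            String content = Files.readString(Path.of(args[1]), StandardCharsets.UTF_8); // the markdown file from the branch

            String form = "title=" + URLEncoder.encode(title, StandardCharsets.UTF_8)
                    + "&content=" + URLEncoder.encode(content, StandardCharsets.UTF_8);

            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create(gitlabUrl + "/api/v4/projects/" + projectId + "/wikis"))
                    .header("PRIVATE-TOKEN", token)
                    .header("Content-Type", "application/x-www-form-urlencoded")
                    .POST(HttpRequest.BodyPublishers.ofString(form))
                    .build();

            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        }
    }

If the wiki page already exists, the call would be PUT /projects/:id/wikis/:slug instead; whether you wire this up through a CI job, a webhook receiver, or a manual script run after approval is up to you.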

Composite C1 - develop locally, sync to live site

I have a couple of Composite C1 CMS websites.
To edit them currently I use the web based CMS on the live site.
However, I would like to update the code and content locally in Visual Studio, then sync to the web. The problem is that if my local copy is older than the one online (e.g. a non-techy client has edited something on the live site) and I Web Deploy, it will go over the top of the newer file on the server.
I need a solution that works out which change is the newest. I can't find anything on Google or in the C1 docs.
How can I sync, preferably using Web Deploy? Do I need some kind of version control?
Is there a best practice for this - editing the live site through the web interface seems a bit dicey & is slow.
The general answer to this type of scenario seems to be to use the Package Creator. With that you can develop locally, add the files you've changed to a package, and install that package on the live site. This solution does not cover all the parts of your question though, and has certain limitations:
You cannot selectively add content to a package. It's all pages or no pages.
Adding datatypes is easy, but updating them later requires you to delete the datatype (and data), and recreate the datatype.
In my experience packages work well for incremental site updates, if you limit the package's content to front-end stuff like CSS, images and such.
You say you need a solution that works out the newest changes - I believe the only solution to this is yourself, with the aid of some tooling. I don't think there's a silver bullet solution here.
Should you use a version control system? Yes! By all means. Even if you are not sharing your code with anyone, a VCS is a great way to get to know Composite C1 from a file-system perspective, as you can carefully track which files change on disk as you develop. This knowledge is crucial when you want to continuously add features to a website that is already alive and kicking - you need to know what to deploy, and what not to touch.
Make sure you read the docs on how Composite fits in VCS: http://docs.composite.net/Configuration/C1-and-Version-Control
I assume that your sites are using the XML data storage (if you were using the SQL Data Store, your content would not be overridden upon sync).
This means that your entire web application lives in one folder on disk on the web server, which can be an advantage here.
I'll try to outline a solution that could work for you, although I must stress that I've never tried this - I'm making it up as I type.
Let's say you're using git: download the site in its entirety from the production web server, and commit the whole damned thing* to your master branch.
Then you create a new feature branch from that commit, and start making the changes you want to deploy later, and carefully commit your work as you go along, making sure you only commit the changes that are needed for your feature to work, to the feature branch.
Now, when you are ready to deploy, you switch back to the master branch, and again download the entire site and commit it to master.
You then merge your feature branch into the master branch, and have git do all the hard work of stitching your changes in with the changes from the live site. There are bound to be merge conflicts, and that is where you will have to jump in and decide for yourself what content needs to go live.
After this is done and tested, you can web deploy the site up to the production environment.
Changes to the live site might have occurred while you were merging, so consider closing the site, or parts of it, during this process.
If you are using the SQL Data Store, I suggest paying for a tool like Red Gate's SQL Compare and SQL Data Compare, or SQL Delta, to compare your dev database to the production database, and hand-pick SQL scripts that can be applied to the production database along with your feature deployment.
* Do consider using a .gitignore file to avoid committing certain files - refer to the docs for more info.
I suppose you should use the Package Creator.
Also have a look here: http://docs.composite.net/Configuration/C1-and-Version-Control

How to access results of Sonar metrics for use with applications like PowerPivot

I'm trying to run a number of applications with known failure rates through Sonar, in the hope of deciding which metrics are most valuable in determining whether a particular application will fail. Ultimately I'll be making some sort of algorithm that will look at the outputs of whatever metrics I'm using and generate a score from 1 to 100. I've put about 21 applications through Sonar, and the results have been stored in a MySQL database. I originally planned to use PowerPivot to find relationships in the data, but it seems like the formatting of the tables doesn't lend itself well to that. Other questions on Stack Overflow have told me that Sonar's tables are unformatted, and that I should instead use the Web Service API to get the information. I'm unfamiliar with the API and was unsuccessful in trying to do what I wanted by looking at Sonar's API documentation.
From an answer to another question:
http://nemo.sonarsource.org/api/timemachine?resource=org.apache.cxf:cxf&format=csv&metrics=ncloc,violations_density,comment_lines_density,public_documented_api_density,duplicated_lines_density,blocker_violations,critical_violations,major_violations,minor_violations
This looks very similar to what I'd like to have, except I'm only looking at each application once (I'm analyzing a sample of all the live applications on a grid), which means Timemachine isn't really what I'm looking for. Would it be possible to generate a similar table, except instead of the stats for a particular application per date, it showed the statistics for an application and all of its classes, etc?
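For reference, here is a rough sketch of how such a one-shot report could be pulled over HTTP. It assumes an older (3.x-era) Sonar instance that exposes the /api/resources web service and that depth=-1 returns the project together with its descendants (packages, files/classes); check the web service documentation of your own version, since the endpoint and parameters here are an assumption on my part.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;

    public class SonarMetricsDump {

        public static void main(String[] args) throws Exception {
            String sonarUrl = "http://localhost:9000";            // assumption: your Sonar instance
            String resourceKey = "com.example:my-app";            // assumption: the project key
            String metrics = "ncloc,violations_density,comment_lines_density,duplicated_lines_density";

            // depth=-1 is assumed to return the project plus all of its descendant resources
            URL url = new URL(sonarUrl + "/api/resources?resource=" + resourceKey
                    + "&metrics=" + metrics + "&depth=-1&format=json");

            try (BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()))) {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line); // raw JSON; flatten it into rows/CSV for PowerPivot from here
                }
            }
        }
    }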
If you're not familiar with the WS API, you can also create your own Sonar plugin to achieve whatever you want: it is written in Java and it will execute on every analysis you run. This way, in the code of this custom plugin, you can do whatever you want: flush the metrics you need into an output file, push them into a third-party system, etc.
Just take a look at how to write a plugin (most probably you will create a Decorator). There are also concrete examples to get you started faster.
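To illustrate the plugin route, here is a rough sketch of a Decorator against the older (3.x-era) Sonar plugin API that appends one CSV row per analysed resource to a file. The class name, the output path and the chosen metrics are assumptions on my part; the class would still have to be registered in your SonarPlugin's getExtensions() list, and newer SonarQube versions have since replaced this API.

    import java.io.FileWriter;
    import java.io.IOException;

    import org.sonar.api.batch.Decorator;
    import org.sonar.api.batch.DecoratorContext;
    import org.sonar.api.measures.CoreMetrics;
    import org.sonar.api.measures.Measure;
    import org.sonar.api.resources.Project;
    import org.sonar.api.resources.Resource;

    public class MetricsExportDecorator implements Decorator {

        private static final String OUTPUT_FILE = "metrics-export.csv"; // assumption: relative to the analysis working directory

        public boolean shouldExecuteOnProject(Project project) {
            return true; // run on every analysed project
        }

        public void decorate(Resource resource, DecoratorContext context) {
            Measure ncloc = context.getMeasure(CoreMetrics.NCLOC);
            Measure violations = context.getMeasure(CoreMetrics.VIOLATIONS);
            try (FileWriter out = new FileWriter(OUTPUT_FILE, true)) { // append one row per resource
                out.write(resource.getKey() + ","
                        + (ncloc == null ? "" : ncloc.getValue()) + ","
                        + (violations == null ? "" : violations.getValue()) + "\n");
            } catch (IOException e) {
                throw new IllegalStateException("Could not write metrics export", e);
            }
        }
    }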

Creating a test site for updating a CMS

I have been asked by a client to make amendments to their site using the custom CMS that was built for them (by somebody else). Making the changes is not a problem, but they want the changes to be viewed on a test server before going live, and the only way I can think of doing that is by pulling the entire site down, duplicating and reconnecting the databases, and uploading it to a test server. Then I would have to make all the changes twice, which isn't really ideal.
Does anyone know of a way to do this that isn't such a ball ache? There are hundreds of files and data tables, as you would expect with a custom CMS, and for changes that would only take a few hours, duplicating the entire site seems a tad unnecessary.
Cheers,
Sam
Does the CMS have "preview mode"?
Typically, in a CMS you make your changes using the content editing interface, save the changes, allow authorized users to view the changes in preview mode, and then change the status to "approved"; this then sends the changes live.
Different products call this by a different name, and have different ways of doing it - but it's worth rooting around in the custom CMS to see if there's something similar.

Do you put your database static data into source-control ? How?

I'm using SQL-Server 2008 with Visual Studio Database Edition.
With this setup, keeping your schema in sync is very easy. Basically, there's a 'compare schema' tool that allows me to sync the schema of two databases and/or a database schema with a source-controlled folder of creation scripts.
However, the situation is less clear when it comes to data, which can be of three different kinds:
Static data referenced in the code. Typical example: my users can change their settings, and their configuration is stored on the server. However, there's a system-wide default value for each setting that is used in case the user didn't override it. The table containing those default settings grows as more options are added to the program. This means that when a new feature/option is checked in, the system-wide default setting is usually created in the database as well.
Static data. E.g. a product list populating a dropdown list. The program doesn't rely on the existence of a specific product in the list to work. This can be, for example, a list of Unicode-encoded products that should be deployed in production when the new "unicode version" of the program is deployed.
Other data, i.e. everything else (logs, user accounts, user data, etc.)
It seems obvious to me that my third item shouldn't be source-controlled (of course, it should be backed up on a regular basis).
But regarding the static data, I'm wondering what to do.
Should I append the insert scripts to the creation scripts? Or maybe use separate scripts?
How do I (as a developer) warn the people doing the deployment that they should execute an insert statement?
Should I differentiate my two kinds of data? (the first one being usually created by a dev, while the second one is usually created by a non-dev)
How do you manage your DB static data?
I have explained the technique I used in my blog Version Control and Your Database. I use database metadata (in this case SQL Server extended properties) to store the deployed application version. I only have scripts that upgrade from version to version. At startup the application reads the deployed version from the database metadata (lack of metadata is interpreted as version 0, ie. nothing is yet deployed). For each version there is an application function that upgrades to the next version. Usually this function runs an internal resource T-SQL script that does the upgrade, but it can be something else, like deploying a CLR assembly in the database.
There is no script to deploy the 'current' database schema. New installations iterate through all intermediate versions, from version 1 to the current version.
There are several advantages I enjoy with this technique:
It is easy for me to test a new version. I have a backup of the previous version, I apply the upgrade script, then I can revert to the previous version, change the script, and try again until I'm happy with the result.
My application can be deployed on top of any previous version. Various clients have various deployed versions. When they upgrade, my application supports upgrading from any previous version.
There is no difference between a fresh install and an upgrade, it runs the same code, so I have fewer code paths to maintain and test.
There is no difference between DML and DDL changes (your original question): they are all treated the same way, as scripts run to change from one version to the next. When I need to make a change like you describe (change a default), I actually increase the schema version even if no other DDL change occurs. So at version 5.1 the default was 'foo', in 5.2 the default is 'bar', and that is the only difference between the two versions; the 'upgrade' step is simply an UPDATE statement (followed of course by the version metadata change, i.e. sp_updateextendedproperty).
All changes are in source control, part of the application sources (T-SQL scripts mostly).
I can easily get to any previous schema version, eg. to repro a customer complaint, simply by running the upgrade sequence and stopping at the version I'm interested in.
This approach has saved my skin a number of times and I'm a true believer now. There is only one disadvantage: there is no obvious place to look in source to find 'what is the current form of procedure foo?'. Because the latest version of foo might have been changed two or three versions ago and not touched since, I need to look at the upgrade script for that version. I usually resort to just looking into the database to see what's in there, rather than searching through the upgrade scripts.
One final note: this is actually not my invention. This is modeled exactly after how SQL Server itself upgrades the database metadata (mssqlsystemresource).
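As a rough illustration of the startup logic described in this answer (not the author's actual code), here is a Java/JDBC sketch that reads the deployed version from a database-level extended property and applies per-version upgrade steps. The property name 'SchemaVersion' and the per-version scripts are hypothetical.

    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class SchemaUpgrader {

        private static final int CURRENT_VERSION = 5; // the version this build of the application expects

        public void upgrade(Connection con) throws Exception {
            int deployed = readDeployedVersion(con);
            for (int v = deployed + 1; v <= CURRENT_VERSION; v++) {
                runUpgradeStep(con, v);       // e.g. execute the T-SQL resource embedded in the application for version v
                writeDeployedVersion(con, v); // record progress after each successful step
            }
        }

        private int readDeployedVersion(Connection con) throws Exception {
            // Lack of the extended property is interpreted as version 0, i.e. nothing is deployed yet.
            try (Statement st = con.createStatement();
                 ResultSet rs = st.executeQuery(
                     "SELECT CAST(value AS nvarchar(20)) AS version FROM sys.fn_listextendedproperty(" +
                     "N'SchemaVersion', default, default, default, default, default, default)")) {
                return rs.next() ? Integer.parseInt(rs.getString("version")) : 0;
            }
        }

        private void writeDeployedVersion(Connection con, int version) throws Exception {
            String proc = version == 1 ? "sp_addextendedproperty" : "sp_updateextendedproperty";
            try (Statement st = con.createStatement()) {
                st.execute("EXEC sys." + proc + " @name = N'SchemaVersion', @value = N'" + version + "'");
            }
        }

        private void runUpgradeStep(Connection con, int version) {
            // Hypothetical: load and execute the upgrade script (or CLR deployment, etc.) for this version.
        }
    }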
If you are changing the static data (adding a new item to the table that is used to generate a drop-down list) then the insert should be in source control and deployed with the rest of the code. This is especially true if the insert is needed for the rest of the code to work. Otherwise, this step may be forgotten when the code is deployed and not so nice things happen.
If static data comes from another source (such as an import of the current airport codes in the US), then you may simply need to run an already documented import process. The import process itself should be in source control (we do this with all our SSIS packages), but the data need not be.
Here at Red Gate we recently added a feature to SQL Data Compare allowing static data to be stored as DML (one .sql file for each table) alongside the schema DDL that is currently supported by SQL Compare.
(A diagram in the original answer illustrates how this works.)
The idea is that when you want to push changes to your target server, you do a comparison using the scripts as the source data source, which generates the necessary DML synchronization script to update the target. This means you don't have to assume that the target is being recreated from scratch each time. In time we hope to support static data in our upcoming SQL Source Control tool.
David Atkinson, Product Manager, Red Gate Software
I have come across this when developing CMS systems.
I went with appending the static data (the stuff referenced in the code) to the database creation scripts, then a separate script to add in any 'initialisation data' (like countries, initial product population etc).
For the first two steps, you could consider using an intermediate format (i.e. XML) for the data, then using a home-grown tool, or something like CodeSmith, to generate the SQL, and possibly source files as well if (for example) you have lookup tables which relate to enumerations used in the code - this helps enforce consistency.
This has another benefit that if the schema changes, in many cases you don't have to regenerate all your INSERT statements - you just change the tool.
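As a tiny illustration of that intermediate-format idea (the file name and XML structure are purely hypothetical), such a tool could read a lookup definition and emit both the INSERT script and a matching enum, so the two can never drift apart:

    import java.nio.file.Files;
    import java.nio.file.Path;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.NodeList;

    public class LookupCodeGen {

        public static void main(String[] args) throws Exception {
            // Hypothetical input, e.g. <lookup table="OrderStatus"><item id="0" code="EMPTY"/><item id="1" code="SIMPLE"/></lookup>
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(Files.newInputStream(Path.of("lookups/order-status.xml")));

            Element root = doc.getDocumentElement();
            String table = root.getAttribute("table");
            NodeList items = root.getElementsByTagName("item");

            StringBuilder sql = new StringBuilder();
            StringBuilder enumSrc = new StringBuilder("public enum " + table + " {\n");
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                sql.append("INSERT INTO ").append(table)
                   .append(" (Id, Code) VALUES (").append(item.getAttribute("id"))
                   .append(", '").append(item.getAttribute("code")).append("');\n");
                enumSrc.append("    ").append(item.getAttribute("code"))
                       .append("(").append(item.getAttribute("id")).append("),\n");
            }
            enumSrc.append("    ;\n    private final int id;\n    ")
                   .append(table).append("(int id) { this.id = id; }\n}\n");

            // Write the generated artifacts side by side so they stay in sync.
            Files.createDirectories(Path.of("generated"));
            Files.writeString(Path.of("generated/" + table + ".sql"), sql);
            Files.writeString(Path.of("generated/" + table + ".java"), enumSrc);
        }
    }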
I really like your distinction of the three types of data.
I agree for the third.
In our application, we try to avoid putting the first kind in the database, because it would be duplicated (as it has to exist in the code, the database copy is a duplicate). A secondary benefit is that we need no join or query to get access to that value from the code, so this speeds things up.
If there is additional information that we would like to have in the database, for example if it can be changed per customer site, we separate the two. Other tables can still reference that data (either by index ex: 0, 1, 2, 3 or by code ex: EMPTY, SIMPLE, DOUBLE, ALL).
For the second, the scripts should be in source control. We separate them from the structure scripts (I think they are typically replaced as time goes by, while the structure keeps adding deltas).
How do I (as a developer) warn the people doing the deployment that they should execute an insert statement?
We have a complete procedure for that, and a readme coming with each release, with scripts and so on...
First off, I have never used Visual Studio Database Edition. You are blessed (or cursed) with whatever tools this utility gives you. Hopefully that includes a lot of flexibility.
I don't know that I'd make that big a difference between your type 1 and type 2 static data. Both are sets of data that are defined once and then never updated, barring subsequent releases and updates, right? In which case the main difference is in how or why the data is what it is, and not so much in how it is stored or initialized. (Unless the data is environment-specific, as in "A" for development, "B" for production. This would be "type 4" data, and I shall cheerfully ignore it in this post, because I've solved it using SQLCMD variables and they give me a headache.)
First, I would make a script to create all the tables in the database - preferably only one script, otherwise you can have a LOT of scripts lying about (and find-and-replace when renaming columns becomes very awkward). Then I would make a script to populate the static data in these tables. This script could be appended to the end of the table script, or made its own script, or even split into one script per table - a good idea if you have hundreds or thousands of rows to load. (Some folks make a CSV file and then issue a BULK INSERT on it, but I'd avoid that, as it just gives you two files and a complex process [configuring drive mappings on deployment] to manage.)
The key thing to remember is that data (as stored in databases) can and will change over time. Rarely (if ever!) will you have the luxury of deleting your Production database and replacing it with a fresh, shiny, new one devoid of all that crufty data from the past umpteen years. Databases are all about changes over time, and that's where scripts come into their own. You start with the scripts to create the database, and then over time you add scripts that modify the database as changes come along -- and this applies to your static data (of any type) as well.
(Ultimately, my methodology is analogous to accounting: you have accounts, and as changes come in you adjust the accounts with journal entries. If you find you made a mistake, you never go back and modify your entries; you just make subsequent entries to reverse and fix them. It's only an analogy, but the logic is sound.)
The solution I use is to have create and change scripts in source control, coupled with version information stored in the database.
Then, I have an install wizard that can detect whether it needs to create or update the db - the update process is managed by picking appropriate scripts based on the stored version information in the database.
See this thread's answer. Static data from your first two points should be in source control, IMHO.
Edit:
All-in-one or a separate script? It does not really matter as long as you (the dev team) agree with your deployment team. I prefer separate files, but I can always create an all-in-one.sql from those in the proper order [Logins, Roles, Users; Tables; Views; Stored Procedures; UDFs; Static Data; (Audit Tables, Audit Triggers)].
How do you make sure they execute it? Make it another step in your application/database deployment documentation. If you roll out an application which really needs specific (new) static data in the database, you might want to perform a DB version check in your application: update the DB_VERSION to your new release number as part of that script, and then have your application check it on start-up and report an error if a newer DB version is required (see the sketch after this list).
Dev and non-dev static data: I have never actually seen this case. More often there is real static data, which you might call "dev", which is major configuration, ISO static data, etc. The other type is default lookup data, which is there for users to start with, but they might add more. The mechanism to INSERT this data might be different, because you need to ensure you do not destroy (power-)user-created data.
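To make the start-up check from the second point concrete, here is a small sketch, assuming a single-row table named DB_VERSION that your static-data script updates as its last statement (both the table and the required version number are hypothetical).

    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class DbVersionGuard {

        private static final int REQUIRED_DB_VERSION = 12; // the version this release of the application needs

        public static void check(Connection con) throws Exception {
            try (Statement st = con.createStatement();
                 ResultSet rs = st.executeQuery("SELECT version FROM DB_VERSION")) {
                int deployed = rs.next() ? rs.getInt("version") : 0;
                if (deployed < REQUIRED_DB_VERSION) {
                    // Fail fast with a clear message instead of letting the application
                    // break later because some static data row is missing.
                    throw new IllegalStateException("Database is at version " + deployed
                            + " but this release requires version " + REQUIRED_DB_VERSION
                            + "; please run the deployment scripts.");
                }
            }
        }
    }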