Automated content creation for the web? - automation

I see a lot of new websites lately that generate content automatically, most notably SiteGuruji and 7zoom:
http://www.siteguruji.com/site/youtube.com
Is there an application framework or text analysis framework available to create such sites? SiteGuruji is doing full SEO analysis of the sites as well. Is there an SEO analysis library available? How do I do such an analysis?
Sorry for the noobish question, but I am new to programming and not sure which direction to start in.

By SEO did you mean this section of the page?
http://www.siteguruji.com/site/youtube.com#seo_status ?
I don't think any frameworks are available for SEO... however you can check out NLTK for text analysis and natural language processing:
http://www.nltk.org/book

You basically need to write your own classes to scrape content from the site and from third-party sites and analyse it. I have not found anything ready-made for this, though you can use bits from here and there.
Personally, I have created everything from scratch using Zend Framework as the basis.
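To give an idea of what "scrape and analyse" can look like, here is a minimal sketch in Python using only the standard library. It is not the Zend-based code mentioned above and not how SiteGuruji works; the class name, function name and the choice of signals (title, meta description, h1 count, images without alt text) are my own assumptions about what a basic on-page check might cover.

```python
from html.parser import HTMLParser


class SeoSignals(HTMLParser):
    """Collects a few basic on-page signals while parsing an HTML document."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = None
        self.h1_count = 0
        self.images_missing_alt = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.meta_description = attrs.get("content") or ""
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "img" and not attrs.get("alt"):
            # Counts images whose alt attribute is absent or empty.
            self.images_missing_alt += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def analyse(html):
    """Return a small report (hypothetical format) for one HTML page."""
    parser = SeoSignals()
    parser.feed(html)
    parser.close()
    title = parser.title.strip()
    return {
        "title": title,
        "title_length": len(title),
        "has_meta_description": bool(parser.meta_description),
        "h1_count": parser.h1_count,
        "images_missing_alt": parser.images_missing_alt,
    }
```

You would feed it the page source, for example fetched with urllib.request.urlopen, and then add your own checks (keyword density, link counts, data from third-party sites) on top.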

Related

Polarion testing reports. What is the best way to build dashboards and live reports?

Where do you find knowledge about Polarion testing reports?
Are you using an external adviser? Documentation? Videos (I found only a few)?
Or have you just been doing "play and learn"?
That is a very broad question.
One of the best ways is to study some extensions from the extension portal and to study the extensions from the sdk. There is a custom widget example available in the SDK.
External consultants are, depending on where you are, also a good but pricier alternative.
The SDK documentation is quite good, or at least OK.
A good entry point is the ITrackerService and IWorkItem objects. For testing you will additionally need ITestManagementService and ITestRun.
If you want to set up and configure the system by yourself, you need to "play and learn"; otherwise it is clever (but costly) to let an external consultant do the job.

Link to change language of site

Does anyone know or recommend a simple way to convert a site into a different language? I just need the site to change from Spanish to English and vice versa, but the site will load in Spanish first. Perhaps a plugin is available? Most of the content is dynamic and the site is being developed with Concrete5 CMS. Any ideas would be appreciated. Thank you.
I think you need to install an add-on for this. Check out Internationalization, it's free: http://www.concrete5.org/marketplace/addons/internationalization/
Here is a YouTube video showing it in action, so you can quickly see if it's what you had in mind: https://www.youtube.com/watch?v=Hd936iaDLqw&feature=player_embedded
You need some automatic translation tool (since you said your content is dynamic). There are many on the internet; you just have to search for "Automatic translation API".
I recommend using the Google Translate one.

What are the main differences between: Seaside vs Aida vs Iliad

What are the differences between the three Smalltalk web application frameworks?
Some starting points:
What is the sweet spot for each framework? In which cases would you use one or the other?
What are their weaknesses?
Which one has the cleanest URLs?
How do they handle Ajax?
Do they have some preference in their use of persistence?
I'm just trying to decide which framework is appropriate for each kind of application.
I can only answer for Seaside:
Target: Seaside targets complex web applications with focus on reusability and development productivity. There is automatic session state management and back-button support. The two free online books Dynamic Web Development with Seaside and Seaside Tutorial provide documentation.
Weakness: For RESTful URLs you have to do some extra work.
Clean URLs: For RESTful URLs you have to do some extra work, but it can be worth it (e.g. Pier).
AJAX: There are plenty of AJAX libraries integrated in Seaside (jQuery, jQueryUI, Prototype, script.aculo.us, ...). The integrations give you full access to these libraries from within Smalltalk. New libraries can be easily integrated, e.g. JQueryWidgetBox.
Persistency: Seaside is a web application framework, not a persistency framework. You can use whatever persistency solution fits you the best, e.g. GemStone, GOODS, GLORP, ...
Also see these other questions/discussions on StackOverflow:
What is the difference between Seaside programming and other web programming
Is Seaside still a valid option?
I can say something on the Iliad side:
Sweet spot(s): It handles AJAX painlessly. For me, that was the turning point that made me switch to Iliad. Also, it's so small and non-bloated that you can read the whole code in a day and have a grasp on how it works.
Weaknesses: The community is also very small. This results in a lack of documentation, additional modules or pre-made widgets. OTOH, small communities tend to be willing to help each other more eagerly, so pretty much all your doubts can be solved by asking at the mailing list.
URLs: Well, since all calls in Iliad are AJAX by default, the URL stays clean the whole time.
Ajax: Yep. For free and by default. You just #markDirty a widget and it'll update automatically. Dependencies are as easy to define as sending #addDependantWidget: to a widget, so that when the first is marked dirty, both will be updated. Also, if the client doesn't have a javascript capable browser, all calls will fall back to regular HTTP requests automatically.
Persistence: No preference. Since the model is separated from the framework (I think this applies for the three frameworks) you can still follow the same guidelines you would for Aida or Seaside.
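To make the "mark dirty plus dependencies" idea a bit more concrete, here is a toy sketch in Python. It is not Iliad code and not how Iliad is implemented; the names Widget, add_dependant, mark_dirty and collect_updates are made up for illustration. It only shows the mechanism described above: marking one widget dirty also marks its dependants, and the next response carries just the dirty widgets.

```python
class Widget:
    """Toy widget that tracks whether it needs to be re-rendered."""

    def __init__(self, name):
        self.name = name
        self.dirty = False
        self.dependants = []

    def add_dependant(self, widget):
        # When this widget is marked dirty, `widget` is marked dirty too.
        self.dependants.append(widget)

    def mark_dirty(self, _seen=None):
        # `_seen` guards against cycles in the dependency graph.
        seen = set() if _seen is None else _seen
        if id(self) in seen:
            return
        seen.add(id(self))
        self.dirty = True
        for dependant in self.dependants:
            dependant.mark_dirty(seen)


def collect_updates(widgets):
    """Return the names of dirty widgets and reset their flags,
    roughly what a single AJAX response would need to re-render."""
    names = [w.name for w in widgets if w.dirty]
    for w in widgets:
        w.dirty = False
    return names
```

For example, if a "cart" widget has a "total" widget as a dependant, marking the cart dirty means both are re-rendered on the next update, while an unrelated "footer" widget is left alone.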
And for Aida/Web:
Sweet spots: Realtime web support out of the box, for both content websites and complex web apps, HTML5 and mobile support, web server included so it works immediately after installation, you can serve many virtual websites from the same image.
Weaknesses: lack of documentation, small community
URLs: clean REST-like URLs all the time, because Aida has followed from the start the motto: every domain object can have its URL (also by Alan Kay), and a domain object can even choose its URL by itself.
Ajax: Seamlessly integrated, you don't see it anymore, it is all just there. To refresh an element on the web page you simply call e update. No need to know any jQuery or other JavaScript. The same goes for realtime web apps as well. The WebSocket protocol is the default communication channel on supported browsers for exchanging JSON messages between the browser and an Aida-based server.
Persistence: Image based persistence with automatic snapshot every hour is turned on by default. Gemstone/GLASS support provided for the next step. Relational/other DB is a duty of domain level, if needed.
For more:
Comparison of Smalltalk web frameworks from Aida centric perspective
ToDo example in Aida/Web shows the newest realtime web/HTML5 features, as part of Comparison by example initiative
For some persistency solutions for Seaside, there is a page. Most of the solutions there are independent of Seaside.

Anyone know the Click Framework?

I've been recommended the Click framework from Apache. But I can't find any forums discussing benchmarks, reviews, advantages, disadvantages, usefulness, ease of implementation, etc.
I've been asked to use it to develop a web site, but I'm completely in the dark about its strengths and weaknesses.
And its damn name isn't helping!! Click? Hey Apache! Call your next framework "the" just for fun. I dare you.
So can anyone comment on their experience with Click?
What I personally like about the Click framework is that it is fairly close to HTML/HTTP and the Servlet API. There is no huge abstraction to get familiar with. You have a Page class, a Form class, ... If you need to preserve state across invocations you put it in the session or you pass it through the URL... This makes it easy to start using it. It is also straightforward to control the HTML pages being generated. It may sound like a very basic framework, but the simplicity is actually one of its greatest strengths.
Other frameworks (e.g. Seam) are better suited to creating a very large web application with lots of reusable components and complicated pageflows, but the learning curve is much steeper. So for me Click works well for small to medium-sized websites.
It's an Apache Incubator project, but that does not mean the project is not stable; rather, it reflects that it is in transition to the Apache project model.
Click is Apache's version of a component-based web framework, equivalent to JSF (other component-based Java frameworks are Tapestry and Wicket).
Click is rated at Ohloh
There is an official blog and some Wikipedia references: Framework Comparison and info page

Software/Platform to Share Specs

What software/wiki do you use to write and share your specs among the developers, testers and management?
Do you use a wiki system, and if so, which wiki software do you use?
Or do you use Sharepoint to manage and version the specs? One problem with SharePoint 2003 as specs platform is that it's very hard to collaborate among different people.
For backward compatibility's sake, I would also like the platform to be able to import Microsoft Word documents seamlessly. And it would certainly help if the interface were similar to Microsoft Word.
Any ideas?
I've used Confluence at a number of places; it's a pretty powerful wiki and very good for creating specifications that can be shared amongst various parties. See:
http://www.atlassian.com/software/confluence/
There's some more information here on the advantages of using Confluence:
https://stackoverflow.com/questions/170352/confluence-experiences
EDIT: I've updated this to deal with the Microsoft Word import feature you mentioned. Confluence supports this through the Office Connector here:
http://www.atlassian.com/software/confluence/plugins/office-connector.jsp
There's also a Sharepoint connector:
http://www.atlassian.com/software/confluence/plugins/sharepoint-connector.jsp
plus a whole bunch of plugins:
http://www.atlassian.com/software/confluence/plugins/sharepoint-connector.jsp
Some of these are user contributed also. I can't recommend Confluence enough as a commercial wiki.
I've also used JSPWiki, which is open source. It's OK but not as good as Confluence; see:
http://www.jspwiki.org/
You could try Google Docs - I have successfully used this in the past. It supports import / export to MS Word, and it has great support for multiple users - see http://www.brighthub.com/internet/google/articles/8236.aspx.
It supports versioning, allows you to chat with other people who are currently working on the document, and shows you a list of all the changes others have made to the document (without needing to close / reopen the document).
If you want corporate support, Google also provides that - see Google Apps for business.
We use SharePoint -- it's not ideal, but it does a decent job. If I were you, I would seriously look at getting off SharePoint 2003 and on to MOSS (SharePoint 2007). It's not perfect, but it's substantially better. Here's a little bit on using MOSS as a wiki. I think in general wikis are a good tool for getting people up to speed on your system. We used to pass around "getting started documents" and now we have all that type of stuff in our developer portal.
Per John's comment, I looked up this feature comparison. I have to go back and look at what features I'm using that are not in WSS -- I might be paying for licenses I don't need! :)
We use email. I know it isn't elaborate, but it is easy to use. Everyone has it installed and there are no licensing issues. All spec changes are sent to a superset email distribution list, indicating the updates and the location on the network share where the spec can be found.
We use Alfresco, in its Community version, from both its Share and Explorer web interfaces.
Quite useful, with a document library, wiki, forum and calendar.
We currently host about 1.8 GB, consisting mainly of docs, versioned and sometimes automatically converted to PDF (by creating an automatic content rule).
FTP, WebDAV and network shares are also used to access the same repository.
You could take a look at Microsoft Groove - the collaboration software that Microsoft bought a few years back.
It's bundled free with premium versions of Microsoft Office.
You can customize the workspace with discussion boards and can fairly seamlessly store collaboratively-edited Office documents.
We use MediaWiki for docs & specs. A wiki definitely beats anything like Microsoft Word or SharePoint: it lets you develop documentation in a "first refer, then describe" = "divide and rule" way. Perfect for developers, since they are used to thinking the same way. The process of developing documentation is almost ideal: you start from the TOC and drill down until you have written the document for every link you put in earlier.
MediaWiki is quite customizable - there are lots of extensions there. The most necessary ones are:
Source code highlighter - CSO_Source
Our own templates integrating wiki with class reference.
Others are InterWiki, FileProtocolLinks, YouTube (we use customized version of it to display HD video), ReCaptcha, SpecialDeleteOldRevisions, Maintenance.
Some integration examples are here.
And we use Google issue tracker to track the issues. Its main advantages:
Input usability: the process of adding/changing an issue is really convenient there. Earlier we tried Track Studio; the same actions take 2-3 times longer there, so it died fast simply because most of us hated using it.
Customizable grids. See the examples. Really helpful.
Atom/RSS support. So everyone knows what's going on.
There is a Gurtle tool integrating it with TortoiseSVN. Really helpful.
Its main disadvantage is that it can't be closed off from public access. This makes it simply unusable in many cases.
If you want a UI similar to Word, why not use Word with SharePoint 2007? You're on 2003 so the experience is there. Upgrade to SharePoint 2007 and you can have the collaboration, Word features, document sharing, and so on.
This is the kind of thing Microsoft wants people to use Office for, so there's a ton of doco out there about how to configure your SharePoint and Office environment to support collaboration.
Google is doing something in this direction that looks really cool: wave.google.com. It would be a great step forward in collaboration and is worth waiting for.
Here we use Google Docs. It makes documents available to everyone, read-write or read-only, public or private, whether or not they have Google accounts. It can also import Word docs, and it runs directly in the browser, so it has high availability with zero cost and zero setup. It is also computer/OS agnostic. We have had a nice experience with it.
Also, perhaps you should take a look at Basecamp or Backpack from 37signals; either of them might also fit the bill.
We use DocBook for all of our specifications (and other customer-facing documentation). DocBook is an XML format that lets you easily generate documents in just about any format, including PDF, which is how we distribute things to clients to get them signed off. We can divide a document into files (by section) and commit everything to our source control system (Subversion). Because it is all XML (i.e. text-based), Subversion's automatic merging and conflict resolution works great if two people work on the same file. We have a set of stylesheets that all of our documents use, so all documents share the exact same style/format, with no extra work on our part.
And if you don't like editing XML files directly, there are GUI front-ends that provide a reasonably WYSIWYG-like experience. I believe that most people in my office use XMLMind. Still, we happen to all be technical people so if we had to write XML directly it wouldn't be an issue.
As a sidenote, we also put out release notes. We have some XSLT that lets us write documents like this:
<bugs>
<bug id="1234" component="web">JavaScript error when clicking the Kick Me button</bug>
</bugs>
We then have a script that runs through our Subversion repository doing an svn log from the previous release tag to the current release tag, and some Bugzilla integration to automatically generate release notes on-the-fly.
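For anyone wanting to build something similar, here is a rough sketch in Python of what such a script might do. It is not our actual script. It assumes commit messages mention bugs as "bug 1234" or "Bug #1234" (your convention may differ), and the `lookup` callback stands in for whatever Bugzilla integration you have. The function names are made up for this example; the only real-world format relied on is the output of svn log --xml, which wraps each commit in a logentry element with a msg child.

```python
import re
import xml.etree.ElementTree as ET

# Assumed commit-message convention: "bug 1234" or "Bug #1234".
BUG_REF = re.compile(r"[Bb]ug\s*#?(\d+)")


def bug_ids_from_svn_log(log_xml):
    """Return the sorted, de-duplicated bug IDs mentioned in `svn log --xml` output."""
    ids = set()
    for entry in ET.fromstring(log_xml).iter("logentry"):
        msg = entry.findtext("msg") or ""
        ids.update(int(number) for number in BUG_REF.findall(msg))
    return sorted(ids)


def build_release_notes(bug_ids, lookup):
    """Build a <bugs> document like the one shown above.

    `lookup` is a hypothetical callback mapping a bug ID to a
    (component, summary) pair, e.g. backed by a bug tracker query.
    """
    root = ET.Element("bugs")
    for bug_id in bug_ids:
        component, summary = lookup(bug_id)
        bug = ET.SubElement(root, "bug", id=str(bug_id), component=component)
        bug.text = summary
    return ET.tostring(root, encoding="unicode")
```

In practice you would capture the output of svn log --xml for the revision range between the two release tags (for example via subprocess.run) and pass it to bug_ids_from_svn_log, then run the resulting XML through your XSLT.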
(also, for most internal-only documentation, we use MediaWiki, which is also a great way to collaborate.)
We use OnTime. It was originally only used for defect tracking, but we've started using it to track features as well. These can be used to document the feature as it evolves during development. Features can be grouped together into sprints or releases, and time can be tracked against each feature. If you are using SCRUM, you can also plot burn-down charts for each sprint. It also has wiki functionality.