Partial document update in Apache Solr

I'm using Apache Solr and I want to do a partial update of a document in Solr.
I am new to Apache Solr and I'm using the Windows operating system. Please give a suitable example for this.
We need to update a single field in the index and we don't want to send the whole document. Let's say we need to update a product price, which changes a few times a day. We don't want to index the whole document again and again.

Also:
Look at http://solr.pl/en/2012/07/09/solr-4-0-partial-documents-update/
Interestingly, the site above uses the same product-price example that you are trying to solve.
Note that all fields of your document need to be stored (stored="true" in the schema) for this to work, because Solr rebuilds the full document from the stored fields behind the scenes.
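Since Solr 4.0 this is supported as an "atomic update": you send only the key field plus the fields you want to change, each wrapped in a modifier such as "set", "add", or "inc". A minimal sketch of the payload (the document id, field name, and collection URL are assumptions):

```python
import json

# Atomic update: only "id" plus the fields to change are sent.
# "set" replaces the value; "add" appends to a multiValued field;
# "inc" increments a numeric field.
payload = [{"id": "P1234", "price": {"set": 19.99}}]
body = json.dumps(payload)

# POST this body to http://<host>:8983/solr/<collection>/update?commit=true
# with Content-Type: application/json, e.g. via curl or urllib.
print(body)
```

Solr merges this with the stored copy of the document and re-indexes the result, so the client never has to resend the unchanged fields.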

Related

TYPO3 v7.6.x migration to Drupal 8

I have to migrate a complex TYPO3 v7.6.30 website to Drupal 8.
So far I have investigated how TYPO3's administration part works.
I've also been digging into the TYPO3 database to find the correct mapping pattern, but I just don't seem to be getting anywhere.
My question is whether there is a nice way to map/join all of the content with its images/files/categories, so I can get, row by row, all page content like:
title
description
text fields
images
documents
tables
...
So in the end I will end up with a joined table with all of the data for each page on a single row, which then I can map in the migration.
I need a smooth way to map the pages with their fields.
I need the same for users (haven't researched this one yet).
The same is for the nesting of the pages in order to recreate the menus in the new CMS.
Any help on this will be highly appreciated.
You need a detailed plan of the configuration and a good understanding of how TYPO3 works.
Here is a basic introduction:
All content is organized in records, and the main table is pages, which forms the page tree.
Nearly all records share some common fields:
uid: unique identifier
pid: page ID (the 'page' in which the record is 'stored', important for editing; even pages are stored in pages to build the page tree)
title: name of the record
hidden, deleted, starttime, endtime, fe_group: visibility
There are further fields for:
versioning and workspaces
language support
sorting
Some records (especially tt_content) have type fields, which decide how the record is rendered and which of its fields are used.
There are relations to files (which are represented by sys_file records) and to other records like file metadata or categories.
Aside from the default content elements, where the data is stored in the tt_content record itself, you can have plugins which display other records (e.g. news, addresses, events, ...) or which get their data from another application or server.
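As a starting point for the row-by-row extraction you describe, a join of pages and tt_content can be sketched like this. The column names follow the TYPO3 core schema, but the filters are a bare minimum and workspace/language handling is omitted, so treat this as an assumption-laden sketch rather than a complete query:

```sql
-- Sketch: one row per content element, joined to its page.
SELECT p.uid      AS page_uid,
       p.title    AS page_title,
       c.uid      AS content_uid,
       c.CType    AS content_type,
       c.header   AS content_header,
       c.bodytext AS content_text
FROM pages p
JOIN tt_content c ON c.pid = p.uid
WHERE p.deleted = 0 AND p.hidden = 0
  AND c.deleted = 0 AND c.hidden = 0
ORDER BY p.uid, c.sorting;
```

File relations (sys_file_reference), categories, and plugin records would each need additional joins depending on which extensions the site uses.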
You need to understand the complete configuration to extract everything.
What you might need is a special rendering of the pages.
That is doable with TYPO3: aside from the default HTML rendering you can define other page types where you can get the content in any format you define, e.g. XML, JSON, CSV, ...
This needs detailed knowledge of the individual TYPO3 configuration, so nobody can give you a fully detailed picture of your installation.
And of course you need good knowledge of your Drupal target installation to answer the question 'what information should be stored where?'
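Such an alternative page type is declared in TypoScript. A minimal sketch (the typeNum value is arbitrary, and a real export would need a renderer that turns the records into actual JSON instead of the default HTML rendering of each content element):

```
# Serve page content under index.php?id=...&type=1533
json = PAGE
json {
  typeNum = 1533
  config {
    disableAllHeaderCode = 1
    additionalHeaders.10.header = Content-Type: application/json
  }
  10 = CONTENT
  10 {
    table = tt_content
    select.orderBy = sorting
  }
}
```

With a page type like this in place, a migration script can simply crawl every page uid with the extra type parameter and collect the output.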

Creating dynamic facets using apache solr

I'm new to apache solr.
I have uploaded a few log files using Solr Cell and I want to create facets based on the content of the log files.
For example: inside my log file I have a record for a transaction. I would like to use transactionid as a facet, and clicking it should trigger a search across the uploaded log files and return results for that particular id.
Note: I need the facet field to come from the content of the log.
As long as the field is indexed, you can facet on it. So, you can use either schemaless configuration or use dynamicField definitions to match and automatically create fields for your log records.
Go through Solr examples first, there should be enough information there.
(updated based on the comments)
If the text needs to be pre-processed and split, there are two basic avenues:
Using DataImportHandler (DIH), probably with LineEntityProcessor and RegexTransformer to split the field into multiple fields
Using UpdateRequestProcessor chains (in solrconfig.xml) and probably clone the field multiple times and then use RegexReplaceProcessorFactory to extract relevant parts. That's even uglier than DIH though as there is no easy way to split one field into many.
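The second avenue can be sketched in solrconfig.xml roughly as follows. The chain name, source field, and regex are assumptions, written for a log line containing something like `... transactionid=ABC123 ...`:

```xml
<!-- solrconfig.xml sketch: clone the raw log field, then reduce the
     clone to just the transaction id via a regex capture group. -->
<updateRequestProcessorChain name="extract-txid">
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">content</str>
    <str name="dest">transactionid_s</str>
  </processor>
  <processor class="solr.RegexReplaceProcessorFactory">
    <str name="fieldName">transactionid_s</str>
    <str name="pattern">^.*transactionid=(\S+).*$</str>
    <str name="replacement">$1</str>
    <!-- allow $1 group references in the replacement -->
    <bool name="literalReplacement">false</bool>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

The chain is then selected per request with `update.chain=extract-txid`, and the resulting `transactionid_s` field can be faceted on directly.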
Still, specifically for logs, it is better to use something like Logstash with Solr output plugin.
+1 to Alex's answer.
Another alternative is to write a custom update processor where you figure out what field you want to facet on and explicitly add that field to your document.
This makes sense only if you know what kind of fields to expect, based on some pattern. If that is not the case, then using dynamic fields or a schemaless config is your best bet.

Store All Input Data Acrobat Form

I have an Acrobat form for work that our salesmen use to create proposals for jobs and their corresponding estimates.
The problem I am facing is that the form only stores data for one customer at a time. I am trying to get it to where they can type in a customer's name (or job number, etc.) and have it pull up all the form data used for that customer when that exact estimate was done (no matter how long ago it was).
How can I get my PDF form to do this (save the current and all previous inputs) and not just the current data in each editable field?
I currently use OmniForm and it does all of this; however, we are trying to switch over to Adobe and I am not too familiar with the software and how I can accomplish this!
Thank you in advance!
If you want to do all the processing locally (without a server round trip) you would have to embed all the data in the PDF itself. There are several ways to do this, but I would recommend using JavaScript, declared at the document level. You would handle the blur event of the customer-name field (or another key field), find a match among the customers, and populate the secondary fields.
Assuming the data sits somewhere in a database, you would have to generate such a PDF, or manipulate an existing template, programmatically using a library. I'm not sure if you are looking for a programming solution or a tool.
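A document-level JavaScript sketch of that lookup might look like the following. The field names ("CustomerName", "JobNumber", "Price") and the embedded data array are assumptions for illustration; in practice the array would be generated from your database when the PDF is produced:

```javascript
// Embedded customer data (generated into the PDF at build time).
var customers = [
  { name: "Acme Corp", jobNumber: "J-1001", price: "4200.00" },
  { name: "Globex",    jobNumber: "J-1002", price: "1875.50" }
];

// Pure lookup: find a customer record by case-insensitive name.
function findCustomer(list, name) {
  for (var i = 0; i < list.length; i++) {
    if (list[i].name.toLowerCase() === String(name).toLowerCase()) {
      return list[i];
    }
  }
  return null;
}

// Wire this to the blur event of the CustomerName field in Acrobat:
// it reads the typed name and fills in the dependent fields.
function onCustomerBlur() {
  var match = findCustomer(customers, this.getField("CustomerName").value);
  if (match) {
    this.getField("JobNumber").value = match.jobNumber;
    this.getField("Price").value = match.price;
  }
}
```

To keep a full history per customer, the same idea extends to storing every saved estimate as an entry in the embedded array and rewriting it on save.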
Here is more info on JavaScript for Acrobat:
http://www.adobe.com/devnet/acrobat/javascript.html

Wordpress manually Attach Images to Post using SQL

I have created a custom post type called products and need to associate multiple images with one product. The actual data already exists in a different, non-WordPress database, in a table called productimages that simply has a productid, imageurl, and image title. I need to convert this data into WordPress format so WordPress can display everything I need. This will be done regularly using SQL, automatically deleting products and their related metadata and re-adding them from our main database, which changes regularly.
From what I can tell, the best way to make this work is to manually use SQL to insert the productimages as attachment posts, just like WordPress does when you upload media. Then, to associate the image with a product, I have to manually insert records into postmeta, but the data in postmeta is serialized, and I am unsure how I would insert the data in the correct serialized format using MySQL. Is it even possible to do this with MySQL?
Am I going about this wrong? Should I be doing something different? I was originally going to use custom fields until I realized a custom field can only have one value, and I need two values (imageurl and imagetitle) for each image. So it seems programmatically creating a post of type attachment for each image is the best way to go, yes?
I look forward to anyone's response to help in this matter.
I'm not an expert on serialization for WordPress, but have you checked the WordPress PHP function maybe_serialize? It serializes arrays and objects the same way WordPress does before writing them to postmeta.
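For reference, the rows WordPress creates on upload can be approximated with SQL like the following. This is a sketch: the wp_ prefix, IDs, paths, and URLs are assumptions. Note that `_wp_attached_file` is a plain string; only `_wp_attachment_metadata` (image sizes etc.) is PHP-serialized, which is where maybe_serialize, or generating the value from PHP, comes in:

```sql
-- Create the attachment post itself (parented to product post 123).
INSERT INTO wp_posts
  (post_author, post_date, post_date_gmt, post_title, post_status,
   post_name, post_parent, post_type, post_mime_type, guid,
   post_content, post_excerpt, post_content_filtered, to_ping, pinged)
VALUES
  (1, NOW(), UTC_TIMESTAMP(), 'Image title', 'inherit',
   'image-title', 123, 'attachment', 'image/jpeg',
   'http://example.com/wp-content/uploads/product.jpg',
   '', '', '', '', '');

-- Point the attachment at its file, relative to the uploads directory.
INSERT INTO wp_postmeta (post_id, meta_key, meta_value)
VALUES (LAST_INSERT_ID(), '_wp_attached_file', 'product.jpg');
```

If you can route the import through PHP instead, wp_insert_attachment and wp_update_attachment_metadata handle all of this (including the serialized metadata) for you.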

Building a ColdFusion Application with Version Control

We have a CMS built entirely in house. I'm the new web developer with literally 4 weeks of ColdFusion experience. What I want to do is add version control to our dynamic pages, something like what WordPress does: when you modify a page in WordPress it makes some database entries and keeps a copy of each page when you save it. So if you create a page and modify it 6 times, all in one day, you have 7 different versions to roll back to if necessary. Is there an easy way to do something similar in ColdFusion?
Please note I'm not talking about source control or version control of the actual CFM files; all pages are built on the back end dynamically using SQL.
Sure you can. Just stash the page content in another database table. You can do that with ColdFusion or via a trigger in the database.
One way (there are many) to do this is to add a column called "version" and a column called "live" in the table where you're storing all of your cms pages.
The "live" column is optional but might make things easier for you in some ways when starting out.
The "version" column tells you the revision number of a document in the CMS. By a process of elimination you could say the newest one (highest version number) is the latest and live one. However, you may sometimes need to override this and turn an old page live, which is what the "live" flag is for.
So when you click "edit" on a page, you would take the version that was clicked and copy it into a new, higher version number. It stays a draft until you click publish (at which time it's written as 'live').
I hope that helps. This kind of an approach should work okay with most schema designs but I can't say for sure either without seeing it.
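In SQL terms, the copy-on-edit and publish steps above might look roughly like this (the table and column names are invented for the sketch):

```sql
-- "Edit": copy version 3 of page 42 into a new draft with the
-- next version number.
INSERT INTO cms_pages (page_id, version, live, title, body)
SELECT page_id,
       (SELECT MAX(version) + 1 FROM cms_pages WHERE page_id = 42),
       0, title, body
FROM cms_pages
WHERE page_id = 42 AND version = 3;

-- "Publish": make exactly one version live.
UPDATE cms_pages SET live = (version = 7) WHERE page_id = 42;
```

Rolling back then just means publishing an older version number; nothing is ever overwritten.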
Jas' solution works well if most of the changes are to one field, for example the full text of a page of content.
However, if you have many fields, and people only tend to change one or two at a time, a new entry in to the table for each version can quickly get out of hand, with many almost identical versions in the history.
In this case what I like to do is store the changes on a per-field basis in a table called ChangeHistory. I include the table name, row ID, field name, previous value, new value, and who made the change and when.
This acts as a complete change history for any field in any table. I'm also able to view changes by record, by user, or by field.
For realtime page generation from the database, your best bet is separate "live" and "versioned" tables, the reason being that keeping all data, live and versioned, in one table will negatively impact performance. So if page generation relies on a single SELECT query from the live table, you can easily version the result set using ColdFusion's Web Distributed Data eXchange format (WDDX) via the tag <cfwddx>. WDDX is a serialized data format that works particularly well with ColdFusion data (sorta like Python's pickle, albeit without the ability to deal with objects).
The versioned table could be as such:
PageID
Created
Data
Where data is the column storing the WDDX.
Note, you could also use the built-in JSON support for version serialization (serializeJSON & deserializeJSON), but cfwddx tends to be more stable.
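A minimal CFML sketch of writing one version this way (the datasource, table, and column names are assumptions):

```cfml
<!--- Fetch the live page, serialize the query to WDDX, and store
      it as a new row in the versioned table. --->
<cfquery name="livePage" datasource="cms">
    SELECT title, body FROM live_pages WHERE page_id = 42
</cfquery>

<cfwddx action="cfml2wddx" input="#livePage#" output="packet">

<cfquery datasource="cms">
    INSERT INTO versioned_pages (PageID, Created, Data)
    VALUES (42,
            <cfqueryparam cfsqltype="cf_sql_timestamp" value="#now()#">,
            <cfqueryparam cfsqltype="cf_sql_longvarchar" value="#packet#">)
</cfquery>

<!--- To restore, read Data back and deserialize with
      <cfwddx action="wddx2cfml" ...>. --->
```

Because the whole result set is serialized as one packet, restoring a version is a single read plus one wddx2cfml call, regardless of how many columns the page has.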