Lucene query results not correct for long and double values

I use Lucene 6.1.0 to index elements with a name and a value.
E.g.
<documents>
<Document>
<field name="NAME" value="Long_-1"/>
<field name="VALUE" value="-1"/>
</Document>
<Document>
<field name="NAME" value="Double_-1.0"/>
<field name="VALUE" value="-1.0"/>
</Document>
<Document>
<field name="NAME" value="Double_-0.5"/>
<field name="VALUE" value="-0.5"/>
</Document>
<Document>
<field name="NAME" value="Long_0"/>
<field name="VALUE" value="0"/>
</Document>
<Document>
<field name="NAME" value="Double_0.0"/>
<field name="VALUE" value="0.0"/>
</Document>
<Document>
<field name="NAME" value="Double_0.5"/>
<field name="VALUE" value="0.5"/>
</Document>
<Document>
<field name="NAME" value="Long_1"/>
<field name="VALUE" value="1"/>
</Document>
<Document>
<field name="NAME" value="Double_1.0"/>
<field name="VALUE" value="1.0"/>
</Document>
<Document>
<field name="NAME" value="Double_1.5"/>
<field name="VALUE" value="1.5"/>
</Document>
<Document>
<field name="NAME" value="Long_2"/>
<field name="VALUE" value="2"/>
</Document>
</documents>
According to the documentation I use the LongPoint and DoublePoint to build the index.
public static void addLongField(String name, long value, Document doc) {
    doc.add(new LongPoint(name, value));
    // Since Lucene 6.x a second field is required to store the value.
    doc.add(new StoredField(name, value));
}

public static void addDoubleField(String name, double value, Document doc) {
    doc.add(new DoublePoint(name, value));
    // Since Lucene 6.x a second field is required to store the value.
    doc.add(new StoredField(name, value));
}
Since I use the same field for long and double values, I get strange results for my RangeQuery if the min and max value have different signs.
LongPoint.newRangeQuery(field, minValue, maxValue);
DoublePoint.newRangeQuery(field, minValue, maxValue);
This example is correct:
VALUE:[1 TO 1] VALUE:[0.5 TO 1.0]
Results in:
0.5 Double_0.5
1 Long_1
1.0 Double_1.0
This example is erroneous:
VALUE:[0 TO 1] VALUE:[-0.5 TO 1.0]
Results in:
0 Long_0
0.0 Double_0.0
1 Long_1
-1 Long_-1
-0.5 Double_-0.5
0.5 Double_0.5
1.0 Double_1.0
2 Long_2
In addition to the correct results, all long values are returned.
Does anybody know why?
Is it not possible to store long and double values in the same field?
Thank you very much.
BR Tobias

No, you should not be keeping different data types in the same field. You should either put them in separate fields, or convert your longs into doubles (or vice versa), so that they are all indexed in the same format.
To understand what is going on, it helps to understand what the numeric fields are really doing. Numeric fields are encoded in a binary representation that facilitates range searching for that type. The encoding for integral types and the encoding for floating point types are not comparable. For example, for the number 1:
long 1 = lucene BytesRef: [80 0 0 0 0 0 0 1]
double 1.0 = lucene BytesRef: [bf f0 0 0 0 0 0 0]
These BytesRef binary representations are what is actually being searched. Since one part of your query is from double -0.5 to 1.0, you are effectively running a query:
encodedvalue: [40 1f ff ff ff ff ff ff] - [bf f0 0 0 0 0 0 0]
That range doesn't pick up just a few extra hits from the long values. It covers most of them: every long except those in the really high and low reaches (you'd need to be getting into the neighborhood of Long.MAX_VALUE/2 in magnitude to fall outside it).
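To see the mismatch concretely, here is a minimal sketch in plain Java that reproduces the encodings quoted above without depending on Lucene. The class and method names are made up for illustration, and the bit tricks are a re-implementation of what I understand Lucene's sortable encoding to do (flip the sign bit for longs; make the IEEE 754 bits sortable first for doubles), not a call into Lucene itself.

```java
public class SortableEncodingDemo {
    // Flip the sign bit so that unsigned byte order matches signed numeric order.
    public static long encodeLong(long value) {
        return value ^ 0x8000000000000000L;
    }

    // Make the IEEE 754 bits sortable as a signed long, then encode like a long.
    public static long encodeDouble(double value) {
        long bits = Double.doubleToLongBits(value);
        long sortable = bits ^ ((bits >> 63) & 0x7fffffffffffffffL);
        return encodeLong(sortable);
    }

    // Render the 8 encoded bytes as 16 hex digits.
    public static String hex(long encoded) {
        return String.format("%016x", encoded);
    }

    // Would a long's encoded bytes fall inside the encoded double range?
    public static boolean longFallsInDoubleRange(long value, double min, double max) {
        long encoded = encodeLong(value);
        return Long.compareUnsigned(encoded, encodeDouble(min)) >= 0
            && Long.compareUnsigned(encoded, encodeDouble(max)) <= 0;
    }

    public static void main(String[] args) {
        System.out.println("long 1      -> " + hex(encodeLong(1L)));
        System.out.println("double 1.0  -> " + hex(encodeDouble(1.0)));
        System.out.println("double -0.5 -> " + hex(encodeDouble(-0.5)));
        System.out.println("long 2 in [-0.5, 1.0]? " + longFallsInDoubleRange(2L, -0.5, 1.0));
        System.out.println("long 2 in [0.5, 1.0]?  " + longFallsInDoubleRange(2L, 0.5, 1.0));
    }
}
```

The first three lines print the same byte values as the BytesRef examples above. The last two show why the sign matters: a double range that crosses zero spans the encoded form of ordinary longs such as 2, while a purely positive double range does not.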


Update 3rd level deep field value in an XML using xQuery

I have the following sample XML and I am looking to update:
AttendeeID to 7878 (will be a SQL variable) where Sequence = 1 (will be a SQL variable)
The expected output is to see 7878 as the AttendeeID field's value. Neither of the two options I tried yields the correct result. For example, the delete works but the element is not added, and replace value of does not update the value.
Any input is highly appreciated. Thank you.
--------XML ---------------------------
DECLARE @cartXML XML =
'<OBJECT CLASS="Test1" ID="-1" FULL="FULL" VERSION="1">
<FIELD NAME="OrderDate">20220619</FIELD>
<FIELD NAME="OrderParty">Individual</FIELD>
<FIELD NAME="ShipToID">34567</FIELD>
<FIELD NAME="ShipToAddress1">123 Test Street</FIELD>
<FIELD NAME="ShipToCity">TestCity</FIELD>
<FIELD NAME="ShipToState">IL</FIELD>
<FIELD NAME="ShipTocountry">USA</FIELD>
<FIELD NAME="TaxNumber">444</FIELD>
<FIELD NAME="DiscountCode">Summer22</FIELD>
<SUBTYPE NAME="SubType1">
<OBJECT NAME="SubType111" ID="-1">
<FIELD NAME="TestID">-1</FIELD>
<FIELD NAME="Sequence">1</FIELD>
<FIELD NAME="ParentSequence">-1</FIELD>
<FIELD NAME="ExtID">-1</FIELD>
<FIELD NAME="ExtName">ABC</FIELD>
</OBJECT>
<OBJECT NAME="SubType111" ID="-1">
<FIELD NAME="TestID">-1</FIELD>
<FIELD NAME="Sequence">2</FIELD>
<FIELD NAME="ParentSequence">1</FIELD>
<FIELD NAME="ExtID">-1</FIELD>
<FIELD NAME="ExtName">DEF</FIELD>
<FIELD NAME="__ExtendedData">&lt;OBJECT
CLASS="Meet123" ID="-1" FULL="FULL"
VERSION="1"&gt;&lt;FIELD
NAME="OrderDetailID"&gt;-1&lt;/FIELD&gt;&lt;FIELD
NAME="OrderID"&gt;-1&lt;/FIELD&gt;&lt;FIELD
NAME="Sequence"&gt;0&lt;/FIELD&gt;&lt;FIELD
NAME="AttendeeID"&gt;123&lt;/FIELD&gt;&lt;FIELD NAME="AttendeeID_Name"&gt;Test, Mark/I
H 6&lt;/FIELD&gt;&lt;FIELD
NAME="ShowList"&gt;1&lt;/FIELD&gt;&lt;FIELD
NAME="BdgeName"&gt;Mark&lt;/FIELD&gt;&lt;FIELD
NAME="BadgeCompanyName"&gt;I H 6&lt;/FIELD&gt;
&lt;/OBJECT&gt;</FIELD>
</OBJECT>
<OBJECT NAME="SubType111" ID="-1">
<FIELD NAME="TestID">-1</FIELD>
<FIELD NAME="Sequence">3</FIELD>
<FIELD NAME="ParentSequence">1</FIELD>
<FIELD NAME="ExtID">-1</FIELD>
<FIELD NAME="ExtName">GHI</FIELD>
</OBJECT>
</SUBTYPE>
<SUBTYPE NAME="SubType2"/>
<SUBTYPE NAME="SubType3"/>
</OBJECT>';
-----------------------SQL -----------------------
select @cartXML as originalXML
DECLARE @ID as int ,@productID as int, @attendeeId as int = 7878,
@sequenceId as int, @orderLineXML as XML , @ExtendedAttrDetail as XML
SET @sequenceId = 2
select @orderlineXML = c.query('.'), @ExtendedAttrDetail = w.query('.') from
@cartXML.nodes('/OBJECT/SUBTYPE/OBJECT[FIELD[@NAME="Sequence"]/text()=sql:variable("@sequenceId")]') t1(c)
Cross APPLY (VALUES(TRY_CAST(c.query('FIELD[@NAME="__ExtendedData"]').value('.','NVARCHAR(MAX)') AS XML)))AS t2(w)
-----This works..But I am looking to alter @cartXML as it contains the entire XML
SET @ExtendedAttrDetail.modify('replace value of
(/OBJECT/FIELD[@NAME="AttendeeID"]/text())[1]
with sql:variable("@attendeeId")')
--select @ExtendedAttrDetail
------- Option 1( Preferred)---does not work--
SET @cartXML.modify ('replace value of
(/OBJECT/SUBTYPE/OBJECT[FIELD[@NAME="Sequence"]/text()=sql:variable("@sequenceId")]/FIELD[@NAME="__ExtendedData"]/OBJECT/FIELD[@NAME="AttendeeID"]/text())[1] with sql:variable("@attendeeId")')
select @cartXML as ModifiedDirectly
---Option 2 (Insert does not add correctly )
--SET @cartXML.modify('delete
--/OBJECT/SUBTYPE/OBJECT[FIELD[@NAME="Sequence"]/text()=sql:variable("@sequenceId")]
--/FIELD[@NAME="__ExtendedData"]');
--SET @cartXML.modify('insert sql:variable("@ExtendedAttrDetail") into
--(/OBJECT/SUBTYPE/OBJECT[FIELD[@NAME="Sequence"]/text()=sql:variable("@sequenceId")])
--[1]');
--SELECT @cartXML as UpdatedXL;
Please try the following solution.
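The issue is that the XML fragment in question is encoded: it is stored as escaped text inside the __ExtendedData field rather than as child elements, so an XPath expression on the outer document cannot reach into it.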
SQL
DECLARE @cartXML XML =
N'<OBJECT CLASS="Test1" ID="-1" FULL="FULL" VERSION="1">
<FIELD NAME="OrderDate">20220619</FIELD>
<FIELD NAME="OrderParty">Individual</FIELD>
<FIELD NAME="ShipToID">34567</FIELD>
<FIELD NAME="ShipToAddress1">123 Test Street</FIELD>
<FIELD NAME="ShipToCity">TestCity</FIELD>
<FIELD NAME="ShipToState">IL</FIELD>
<FIELD NAME="ShipTocountry">USA</FIELD>
<FIELD NAME="TaxNumber">444</FIELD>
<FIELD NAME="DiscountCode">Summer22</FIELD>
<SUBTYPE NAME="SubType1">
<OBJECT NAME="SubType111" ID="-1">
<FIELD NAME="TestID">-1</FIELD>
<FIELD NAME="Sequence">1</FIELD>
<FIELD NAME="ParentSequence">-1</FIELD>
<FIELD NAME="ExtID">-1</FIELD>
<FIELD NAME="ExtName">ABC</FIELD>
</OBJECT>
<OBJECT NAME="SubType111" ID="-1">
<FIELD NAME="TestID">-1</FIELD>
<FIELD NAME="Sequence">2</FIELD>
<FIELD NAME="ParentSequence">1</FIELD>
<FIELD NAME="ExtID">-1</FIELD>
<FIELD NAME="ExtName">DEF</FIELD>
<FIELD NAME="__ExtendedData">&lt;OBJECT
CLASS="Meet123" ID="-1" FULL="FULL"
VERSION="1"&gt;&lt;FIELD
NAME="OrderDetailID"&gt;-1&lt;/FIELD&gt;&lt;FIELD
NAME="OrderID"&gt;-1&lt;/FIELD&gt;&lt;FIELD
NAME="Sequence"&gt;0&lt;/FIELD&gt;&lt;FIELD
NAME="AttendeeID"&gt;123&lt;/FIELD&gt;&lt;FIELD NAME="AttendeeID_Name"&gt;Test, Mark/I
H 6&lt;/FIELD&gt;&lt;FIELD
NAME="ShowList"&gt;1&lt;/FIELD&gt;&lt;FIELD
NAME="BdgeName"&gt;Mark&lt;/FIELD&gt;&lt;FIELD
NAME="BadgeCompanyName"&gt;I H 6&lt;/FIELD&gt;
&lt;/OBJECT&gt;</FIELD>
</OBJECT>
<OBJECT NAME="SubType111" ID="-1">
<FIELD NAME="TestID">-1</FIELD>
<FIELD NAME="Sequence">3</FIELD>
<FIELD NAME="ParentSequence">1</FIELD>
<FIELD NAME="ExtID">-1</FIELD>
<FIELD NAME="ExtName">GHI</FIELD>
</OBJECT>
</SUBTYPE>
<SUBTYPE NAME="SubType2"/>
<SUBTYPE NAME="SubType3"/>
</OBJECT>';
DECLARE @ExtendedData XML
, @attendeeId INT = 770;
-- Step #1: select XML fragment in question as real XML data type
SELECT @ExtendedData = w
FROM @cartXML.nodes('/OBJECT/SUBTYPE/OBJECT[@ID="-1"]') as t1(c)
CROSS APPLY (VALUES(TRY_CAST(c.query('FIELD[@NAME="__ExtendedData"]').value('.','NVARCHAR(MAX)') AS XML))) AS t2(w)
WHERE w.exist('/OBJECT[@CLASS="Meet123"]') = 1;
-- Step #2: remove encoded XML fragment
SET @cartXML.modify('replace value of (/OBJECT/SUBTYPE[@NAME="SubType1"]/OBJECT/FIELD[@NAME="__ExtendedData"]/text())[1]
with ""');
-- Step #3: modify AttendeeID
SET @ExtendedData.modify('replace value of
(/OBJECT/FIELD[@NAME="AttendeeID"]/text())[1]
with sql:variable("@attendeeId")');
-- Step #4: insert real XML fragment
SET @cartXML.modify('insert sql:variable("@ExtendedData") into
(/OBJECT/SUBTYPE[@NAME="SubType1"]/OBJECT/FIELD[@NAME="__ExtendedData"])[1]');
-- test
SELECT @cartXML;
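The effect of the encoding can be reproduced outside SQL Server. The following is a minimal sketch in Java using only the JDK's XML APIs; the class name, method names, and the cut-down sample document are assumptions made for illustration. It shows that an XPath that tries to walk into the escaped fragment finds nothing, while reading the field as text and parsing that text as XML (the same idea as the TRY_CAST in Step #1) makes AttendeeID reachable.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EncodedFragmentDemo {
    // Hypothetical, shortened version of the cart XML: the inner OBJECT is escaped text.
    public static final String OUTER =
        "<OBJECT><FIELD NAME='__ExtendedData'>"
        + "&lt;OBJECT&gt;&lt;FIELD NAME='AttendeeID'&gt;123&lt;/FIELD&gt;&lt;/OBJECT&gt;"
        + "</FIELD></OBJECT>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    static String eval(String expression, Object context) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, context);
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    // Walks straight into the field: there are no child elements, so nothing matches.
    public static String directLookup() {
        return eval("/OBJECT/FIELD[@NAME='__ExtendedData']/OBJECT/FIELD[@NAME='AttendeeID']",
            parse(OUTER));
    }

    // Reads the field as text, parses that text as XML, then queries the result.
    public static String decodedLookup() {
        String innerXml = eval("/OBJECT/FIELD[@NAME='__ExtendedData']", parse(OUTER));
        return eval("/OBJECT/FIELD[@NAME='AttendeeID']", parse(innerXml));
    }

    public static void main(String[] args) {
        System.out.println("direct:  '" + directLookup() + "'");
        System.out.println("decoded: '" + decodedLookup() + "'");
    }
}
```

The direct lookup yields an empty string and the decoded lookup yields 123, which mirrors why Option 1 in the question silently changes nothing.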

SOLR - split string field into list

In Solr 8.9, I want to index a string field as a list by splitting it.
It works partially.
If my source string is A|B|C.
After indexing, the Solr output is:
"field": ["A|B|C", "A", "B", "C"]
I would like it to be:
"field": ["A", "B", "C"]
Could someone explain why my multivalued field contains both the source string and the split values?
My data_config.xml
<document>
<entity name="items"
query="SELECT Id, Structures FROM Items"
transformer="RegexTransformer"
>
<field column="structures" splitBy="\|" sourceColName="Structures" />
</entity>
</document>
Below is the field definition in schema.xml file
<field name="structures" type="text_general" indexed="true" stored="true" multiValued="true" />
It was resolved after I aliased the column name in the query.
My query in data_config.xml:
query="SELECT Id, Structures as structures FROM Items"
The field definition in data_config.xml:
<field column="structures" splitBy="\|" sourceColName="structures" />

XQuery SQL Extract text value from a child element of a specific node

I'm struggling with the following situation: I need to extract that '10000' after
...name="stateid"> <from>10000</from>
I've managed to extract some that I need from the first node audittrail
SELECT ref.value ('@datetime', 'nvarchar(364)') as [LastTimeActioned],
ref.value ('@changedby', 'nvarchar(364)') as [Actioned_By]
FROM Audit
CROSS APPLY audit.Xml.nodes ('/audittrail') R(ref)
<audittrail id="137943" datetime="29-Feb-2016 15:42:06" changedby="quality" type="update">
<fields>
<field id="10022" name="Content Control">
<mergedValue>dsad</mergedValue>
<from />
<to>dsad</to>
</field>
<field id="10027" name="Document Controller">
<mergedValue>quality</mergedValue>
<from />
<to>quality</to>
</field>
<field id="10028" name="Document Owner">
<mergedValue>quality</mergedValue>
<from />
<to>quality</to>
</field>
<field id="10029" name="Document Type">
<mergedValue>Contract/Agreement</mergedValue>
<from />
<to>Contract/Agreement</to>
</field>
<field id="10067" name="StateId">
<from>10000</from>
<to>10000</to>
</field>
....
</fields>
</audittrail>
You're looking for the XPath:
(fields/field[@name="StateId"]/from)[1]
i.e. find the field element under fields whose name attribute has the value StateId, and select the contents of its immediate child from element:
SELECT
ref.value ('@datetime', 'nvarchar(364)') as [LastTimeActioned],
ref.value ('@changedby', 'nvarchar(364)') as [Actioned_By],
ref.value ('(fields/field[@name="StateId"]/from)[1]', 'integer') as StateIdFrom
FROM Audit
CROSS APPLY audit.Xml.nodes ('/audittrail') R(ref)
Note that you need to explicitly select the first result in XQuery (hence the extra parentheses () and the [1]).
More technically correctly, you could also restrict the selection to the text() of the child from, i.e.:
(fields/field[@name="StateId"]/from/text())[1]
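The same expression can be tried outside SQL Server. Here is a minimal sketch in Java with the JDK's built-in XPath support; the class name, method name, and the shortened sample document are assumptions for illustration, and the expression uses single quotes only to avoid escaping inside a Java string. It is evaluated relative to the audittrail element, just as the nodes('/audittrail') call sets the context in the query above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class StateIdXPathDemo {
    // Hypothetical, shortened version of the audit trail document.
    public static final String SAMPLE =
        "<audittrail id='137943' changedby='quality'>"
        + "<fields>"
        + "<field id='10022' name='Content Control'><from /><to>dsad</to></field>"
        + "<field id='10067' name='StateId'><from>10000</from><to>10000</to></field>"
        + "</fields>"
        + "</audittrail>";

    // Returns the text of the first from element under the StateId field.
    public static String stateIdFrom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Context node is the audittrail root element.
            return xpath.evaluate("(fields/field[@name='StateId']/from)[1]",
                doc.getDocumentElement());
        } catch (Exception e) {
            throw new IllegalStateException("XPath lookup failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(stateIdFrom(SAMPLE));
    }
}
```

The attribute predicate is what skips the earlier field elements, whose from children are empty.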

What determines which More button an option shows in?

In OpenERP there is the More button in the tree view, and the More button in the form view. Sometimes options show in one, or the other, or both -- what determines which one an option shows in?
Which <More> button an option is available in is determined by the table:field ir.actions.act_window:multi and ir.values:key2 pair.
The <act_window> shortcut (which makes the option available only in the form view More) looks like this:
<act_window
name="My Custom Name Here"
id="model_table_whatever_name"
res_model="model.table"
src_model="another_model.table"
/>
which, by default, also sets:
multi = False
key2 = 'client_action_relate'
view_type = 'form'
view_mode = 'tree,form'
target = 'current'
The value combinations/results of multi/key2 are:
multi / key2 --> tree More / form More
0 / client_action_relate --> No / Yes
1 / client_action_relate --> Yes / No
1 / client_action_multi --> Yes / No
0 / client_action_multi --> Yes / Yes
If you need/want more control over all the fields created in the two tables that the shortcut affects:
<record id="action_model_table_whatever_name" model="ir.actions.act_window">
<field name="name">My Custom Name Here</field>
<field name="type">ir.actions.act_window</field>
<field name="res_model">model.table</field>
<field name="src_model">another_model.table</field>
<field name="multi" eval="0"/>
... more fields here ...
</record>
<record id="model_table_whatever_name" model="ir.values">
<field name="name">My Custom Name Here</field>
<field name="model">sample.request</field>
<field name="value" eval="'ir.actions.act_window,' + str(ref('action_model_table_whatever_name'))"/>
<field name="key2">client_action_relate</field>
... more fields here ...
</record>
Note: While the default key2 value in the act_window shortcut is 'client_action_relate', the default value for key2 when using the record format is 'tree_but_open' -- so you can omit it when using the shortcut, but you must include it when using the record style.

How to get BCP to generate a Format File for importing fixed-width data into a SQL Server table?

The bcp command that I am using:
bcp TableName format nul -c -f c:\folder\TargetFile.xml -x -S ServerName -T -q
I think I just need the fields to have a type of xsi:type="CharFixed" rather than xsi:type="CharTerm".
The xml that it creates which doesn't work for me:
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="24" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="150" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="3" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="150" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="4" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="20" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="5" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="12" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="UID" xsi:type="SQLNCHAR"/>
<COLUMN SOURCE="2" NAME="FNAME" xsi:type="SQLNCHAR"/>
<COLUMN SOURCE="3" NAME="LNAME" xsi:type="SQLNCHAR"/>
<COLUMN SOURCE="4" NAME="PHONE" xsi:type="SQLNCHAR"/>
<COLUMN SOURCE="5" NAME="Target" xsi:type="SQLNCHAR"/>
</ROW>
</BCPFORMAT>
What I actually need: (xsi:type="CharFixed")
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharFixed" LENGTH="3"/>
<FIELD ID="2" xsi:type="CharFixed" LENGTH="3"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="Field1" xsi:type="SQLCHAR" LENGTH="3"/>
<COLUMN SOURCE="2" NAME="Field2" xsi:type="SQLCHAR" LENGTH="3"/>
</ROW>
</BCPFORMAT>
Here is the method I created to help me resolve my issue...
// Requires: using System.Xml;
private XmlDocument CreateFormatFile()
{
    const string xsiURI = "http://www.w3.org/2001/XMLSchema-instance";
    var ff = new XmlDocument();
    var dec = ff.CreateXmlDeclaration("1.0", null, null);
    ff.AppendChild(dec);

    // Root element with the bcp format and xsi namespaces.
    var bcpFormat = ff.CreateElement("BCPFORMAT");
    bcpFormat.SetAttribute("xmlns", "http://schemas.microsoft.com/sqlserver/2004/bulkload/format");
    bcpFormat.SetAttribute("xmlns:xsi", xsiURI);
    var record = ff.CreateElement("RECORD");
    var row = ff.CreateElement("ROW");

    for (var x = 0; x < Columns.Count; x++)
    {
        var col = Columns[x];
        var id = (col.Index + 1).ToString();
        var length = col.Length.ToString();

        // Target table column.
        var column = ff.CreateElement("COLUMN");
        column.SetAttribute("SOURCE", id);
        column.SetAttribute("NAME", col.Name);
        column.SetAttribute("type", xsiURI, "SQLCHAR");
        column.SetAttribute("LENGTH", length);

        // Source file field: fixed width, except the last one,
        // which is terminated by the end of the line.
        var field = ff.CreateElement("FIELD");
        field.SetAttribute("ID", id);
        if (x != Columns.Count - 1)
        {
            field.SetAttribute("type", xsiURI, "CharFixed");
            field.SetAttribute("LENGTH", length);
        }
        else
        {
            field.SetAttribute("type", xsiURI, "CharTerm");
            field.SetAttribute("TERMINATOR", @"\r\n");
        }

        record.AppendChild(field);
        row.AppendChild(column);
    }

    bcpFormat.AppendChild(record);
    bcpFormat.AppendChild(row);
    ff.AppendChild(bcpFormat);
    return ff;
}
Try using the bcp Native Format Option:
How bcp Handles Data in Native Format
...
char or varchar data
At the beginning of each char or varchar field, bcp adds the prefix length.
You'd use the "-n" option instead of "-c":
bcp TableName format nul -n -f c:\folder\TargetFile.xml -x -S ServerName -T -q