Sphinx Multiple Sources - One Index

Assuming that I have a database with 20 some tables, all with the same schema, how do I create one index for all tables?
If I have ONE index per table, the search works just fine.
I successfully created ONE index for the 20 tables, but every search returns the first record of the first table.
Index conf:
index all_table_index
{
type = plain
source = TABLE1
source = TABLE2
source = TABLE3
source = TABLE4
source = TABLE5
...
path = /data/sphinx/all_table_index
#docinfo = extern
charset_type = utf-8
}
Additionally: The unique integer field has duplicates (Primary ID Auto increment - same for each table!). Does that affect any search for any other fields?
Thank you for your help.

> The unique integer field has duplicates (Primary ID Auto increment - same for each table!).
That is your problem.
The document ID MUST be unique. It's how Sphinx tracks documents, so if multiple documents share the same ID they will overwrite each other, and you then have no way to distinguish the separate underlying documents.
... So you need to arrange for the IDs to be unique.
There are many ways to do it, e.g.
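source TABLE1 {
sql_query = SELECT id*20 AS id, ... FROM table1
}
source TABLE2 {
sql_query = SELECT (id*20)+1 AS id, ... FROM table2
}
source TABLE3 {
sql_query = SELECT (id*20)+2 AS id, ... FROM table3
}
etc... (the multiplier must be at least the number of tables, so use a larger one if you have more than 20)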
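To see why the multiply-and-offset scheme gives unique IDs, here is a minimal Python sketch. The names `NUM_TABLES`, `doc_id` and `origin` are made up for illustration and are not part of Sphinx; the sketch only mirrors the arithmetic of the `sql_query` lines above.

```python
# Hypothetical illustration of the id*20 + offset scheme; not Sphinx code.
NUM_TABLES = 20  # assumed multiplier; must be >= the number of source tables


def doc_id(row_id, table_index):
    """Mirror SELECT (id*20)+table_index AS id for the table at table_index."""
    if not 0 <= table_index < NUM_TABLES:
        raise ValueError("table_index must be smaller than the multiplier")
    return row_id * NUM_TABLES + table_index


def origin(document_id):
    """Recover (row_id, table_index) from a document ID."""
    return divmod(document_id, NUM_TABLES)


# Every table reuses the same auto-increment IDs 1..1000, yet no document ID repeats.
ids = {doc_id(r, t) for t in range(NUM_TABLES) for r in range(1, 1001)}
```

Because the offset is always smaller than the multiplier, the original row ID and table can be recovered from a search result with a single division.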

I usually use this:
source TABLE1 {
sql_query = SELECT CONCAT(id,'01') AS id, 'table1' AS source, '01' AS source_code, ... from table1
sql_attr_string = source
sql_attr_str2ordinal = source_code
}
Use '01' and not '1' (quoted, because an unquoted 01 is read as the number 1), or '001' if you have more than 100 sources (that has never happened to me).
'table1' AS source just makes it easy to see which source a document came from.
'01' AS source_code just makes it easy to filter or group by.
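The reason for a fixed-width source code can be shown with a small Python sketch. The helper `concat_id` is hypothetical and only imitates concatenating the row ID with the source code.

```python
def concat_id(row_id, source_code, width):
    """Append source_code to row_id, zero-padding the code to `width` digits."""
    return int(f"{row_id}{source_code:0{width}d}")


# Without padding, two different (row, source) pairs produce the same ID.
unpadded_a = concat_id(1, 11, width=1)   # "1" + "11"
unpadded_b = concat_id(11, 1, width=1)   # "11" + "1"

# With a two-digit code, the same pairs stay distinct.
padded_a = concat_id(1, 11, width=2)     # "1" + "11"
padded_b = concat_id(11, 1, width=2)     # "11" + "01"
```

With the padded form, the last two digits always identify the source and the remaining digits are the original row ID.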

Related

Update all rows for one column in a table with data in another table

I appreciate any advice on this..
I have two tables where I have to update a column in my primary table with data that resides in another secondary table. I cannot rely on views, etc as this data has to be able to be edited by the user in APEX in the future. I am basically pre-populating the data for the users to reduce their manual entry.
Primary Table = Table 1
Secondary Table = Table 2
Columns to be updated in Table 1 = FTE_ID, ACCOUNT_TYPE
Columns where the data will come from Table 2 = R_ID, ACCOUNT_TYPE
Common column in both tables = TABLE1.FID AND TABLE2.FID
Here is what I have tried, but I get "single-row subquery returns more than one row" because there are multiple table1.fid rows in table1. I basically want to perform this update for ALL rows where TABLE1.FID = TABLE2.FID.
Here is my attempt:
UPDATE TABLE1
SET TABLE1.FTE_ID =
(SELECT TABLE2.R_ID FROM TABLE2 WHERE TABLE1.FID = TABLE2.FID);
Error:
single-row subquery returns more than one row
Thanks for your help,

You can fix the proximate problem by using aggregation or row number:
UPDATE TABLE1
SET TABLE1.FTE_ID = (SELECT MAX(TABLE2.R_ID)
FROM TABLE2
WHERE TABLE1.FID = TABLE2.FID
);
The subquery can only return one row; it is an "arbitrary" value from the possible matching values.
If the field is a character field and you want all matching values, then perhaps listagg is more appropriate:
UPDATE TABLE1
SET TABLE1.FTE_ID = (SELECT LISTAGG(t2.R_ID, ',') WITHIN GROUP (ORDER BY t2.R_ID)
FROM TABLE2 t2
WHERE TABLE1.FID = t2.FID
);
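A runnable sketch of the aggregated correlated update, using Python's built-in sqlite3 module as a stand-in for Oracle (the sample rows are invented). The added WHERE EXISTS clause is an assumption about what is wanted: without it, rows in TABLE1 that have no match in TABLE2 would have FTE_ID overwritten with NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (fid INTEGER, fte_id INTEGER);
    CREATE TABLE table2 (fid INTEGER, r_id INTEGER);
    -- fid 1 appears twice in both tables; fid 3 has no match in table2
    INSERT INTO table1 VALUES (1, NULL), (1, NULL), (2, NULL), (3, 99);
    INSERT INTO table2 VALUES (1, 10), (1, 12), (2, 20);
""")

conn.execute("""
    UPDATE table1
    SET fte_id = (SELECT MAX(t2.r_id) FROM table2 t2 WHERE t2.fid = table1.fid)
    WHERE EXISTS (SELECT 1 FROM table2 t2 WHERE t2.fid = table1.fid)
""")

rows = conn.execute("SELECT fid, fte_id FROM table1 ORDER BY fid").fetchall()
```

Both rows with fid 1 receive the largest matching R_ID, and the unmatched row keeps its existing value.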

Merging two tables into one with the same column names

I use this command to merge 2 tables into one:
CREATE TABLE table1 AS
SELECT name, sum(cnt)
FROM (SELECT * FROM table2 UNION ALL SELECT * FROM table3) X
GROUP BY name
ORDER BY 1;
table2 and table3 are tables with columns named name and cnt, but the result table (table1) has the columns name and sum.
The question is how to change the command so that the result table will have the columns name and cnt?

Have you tried this (note the AS cnt)?
CREATE TABLE table1 AS SELECT name,sum(cnt) AS cnt
FROM ...
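A quick way to check the effect of the alias, sketched with Python's sqlite3 module rather than Postgres (the default name of an unaliased aggregate differs between databases, but the alias works the same way). The sample data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table2 (name TEXT, cnt INTEGER);
    CREATE TABLE table3 (name TEXT, cnt INTEGER);
    INSERT INTO table2 VALUES ('a', 1), ('b', 2);
    INSERT INTO table3 VALUES ('a', 5), ('c', 7);

    CREATE TABLE table1 AS
    SELECT name, SUM(cnt) AS cnt
    FROM (SELECT * FROM table2 UNION ALL SELECT * FROM table3) x
    GROUP BY name
    ORDER BY 1;
""")

# Column names of the new table, read from SQLite's schema pragma.
columns = [row[1] for row in conn.execute("PRAGMA table_info(table1)")]
totals = conn.execute("SELECT name, cnt FROM table1 ORDER BY name").fetchall()
```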

In the absence of an explicit name, the output of a function inherits the basic function name in Postgres. You can use a column alias in the SELECT list to fix this - like #hennes already supplied.
If you need to inherit all original columns with name and type (and possibly more) you can also create the table with a separate command:
To copy columns with names and data types only, still use CREATE TABLE AS, but add LIMIT 0:
CREATE TABLE table1 AS
TABLE table2 LIMIT 0; -- "TABLE" is just shorthand for "SELECT * FROM"
To copy (per documentation):
all column names, their data types, and their not-null constraints:
CREATE TABLE table1 (LIKE table2);
... and optionally also defaults, constraints, indexes, comments and storage settings:
CREATE TABLE table1 (LIKE table2 INCLUDING ALL);
... or, for instance, just defaults and constraints:
CREATE TABLE table1 (LIKE table2 INCLUDING DEFAULTS INCLUDING CONSTRAINTS);
Then INSERT:
INSERT INTO table1 (name, cnt)
SELECT ... -- column names are ignored

compare data between 2 table

Hey, I have a requirement to compare two tables with the same structure.
Table1
EmpNO - Pkey
EmpName
DeptName
FatherName
IssueDate
ValidDate
I need to pass the EmpNO as a parameter, check whether any of the columns have changed, and return a YES or NO value.
Can I do that using a PL/SQL function? I was thinking of using the built-in CONCAT function to do that.
I'm trying the following:
Table1Concat = Select CONCAT(Column1.....6) from table1 where emp_no= in_empno;
Table2Concat = Select CONCAT(Column1.....6) from table2 where emp_no= in_empno;
IF(Table1Concat<>Table2Concat ) THEN return data_changed :='YES';
else data_changed :='NO';
END;

If you only want to detect whether any value is different then ...
select count(*)
from (select * from table1 where emp_no = my_emp_no
union
select * from table2 where emp_no = my_emp_no
)
If it returns 1 then the rows are the same, if it returns 2 then there is a difference.
The columns must be in the same order for this to work, or you'll have to list out all the column names in the order in which they match.
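The UNION count can be wrapped in a small function. This is a sketch in Python with sqlite3 standing in for Oracle; the function name `row_changed`, the reduced column set and the sample rows are assumptions. It also assumes the employee exists in both tables: if a row is missing from one of them, the count is 1 and the function reports "NO".

```python
import sqlite3


def row_changed(conn, emp_no):
    """Return 'YES' if the row for emp_no differs between table1 and table2."""
    count = conn.execute(
        """
        SELECT COUNT(*) FROM (
            SELECT * FROM table1 WHERE emp_no = ?
            UNION
            SELECT * FROM table2 WHERE emp_no = ?
        )
        """,
        (emp_no, emp_no),
    ).fetchone()[0]
    # UNION removes duplicates: identical rows collapse to 1, differing rows give 2.
    return "YES" if count == 2 else "NO"


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (emp_no INTEGER PRIMARY KEY, emp_name TEXT, dept_name TEXT);
    CREATE TABLE table2 (emp_no INTEGER PRIMARY KEY, emp_name TEXT, dept_name TEXT);
    INSERT INTO table1 VALUES (1, 'Ann', 'HR'), (2, 'Bob', 'IT');
    INSERT INTO table2 VALUES (1, 'Ann', 'HR'), (2, 'Bob', 'Ops');
""")

unchanged = row_changed(conn, 1)
changed = row_changed(conn, 2)
```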
If you wanted to do this in bulk for a great many rows then you'd most likely use a different solution, so do not loop through every emp_no running this code for each one.
For bulk data where every emp_no is present in both tables, use a query of the form:
select table1.emp_no,
case when table1.column1 = table2.column1 and
table1.column2 = table2.column2 and
table1.column3 = table2.column3 and
...
then 'Yes'
else 'No'
end columns_match
from table1
join table2 on table1.emp_no = table2.emp_no
You can insert this result directly into a logging table.
Take care of null values though. "any_value = null" is never true, and "any_value != Null" is also never true, so you might need to add logic to take care of cases where one or both values are null.
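One portable way to handle the NULL case is to treat two NULLs as a match explicitly. The sketch below runs the bulk comparison through Python's sqlite3 module as a stand-in for Oracle; the two-column layout and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (emp_no INTEGER PRIMARY KEY, emp_name TEXT, dept_name TEXT);
    CREATE TABLE table2 (emp_no INTEGER PRIMARY KEY, emp_name TEXT, dept_name TEXT);
    INSERT INTO table1 VALUES (1, 'Ann', 'HR'), (2, 'Bob', NULL), (3, 'Cy', 'IT');
    INSERT INTO table2 VALUES (1, 'Ann', 'HR'), (2, 'Bob', NULL), (3, 'Cy', NULL);
""")

# Each column pair matches if the values are equal or if both are NULL.
NULL_SAFE_COMPARE = """
    SELECT t1.emp_no,
           CASE WHEN (t1.emp_name = t2.emp_name
                      OR (t1.emp_name IS NULL AND t2.emp_name IS NULL))
                 AND (t1.dept_name = t2.dept_name
                      OR (t1.dept_name IS NULL AND t2.dept_name IS NULL))
                THEN 'Yes'
                ELSE 'No'
           END AS columns_match
    FROM table1 t1
    JOIN table2 t2 ON t1.emp_no = t2.emp_no
    ORDER BY t1.emp_no
"""

matches = conn.execute(NULL_SAFE_COMPARE).fetchall()
```

Employee 2 has NULL in both tables and counts as a match, while employee 3 has a value on one side only and is reported as different.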

Prevent the insertion of duplicate rows using SQL Server 2008

I am trying to insert some data from one table into another but I would like to prevent the insertion of duplicate rows. I have currently the following query:
INSERT INTO Table1
(
Table1Col1,
Table1Col2,
Table1Col3,
Table1Col4,
Table1Col5
)
SELECT
Table2Col1,
Table2Col2 = constant1,
Table2Col3 = constant2,
Table2Col4 = constant3,
Table2Col5 = constant4
FROM Table2
WHERE
Condition1 = constant5
AND
Condition2 = constant6
AND
Condition3 = constant7
AND
Condition4 LIKE '%constant8%'
What I do not know in advance is whether a row I am trying to insert from Table2 into Table1 already exists. I would like to prevent this possible duplication by skipping that insert and moving on to the next unique row.
I have seen that I can use a WHERE NOT EXISTS clause or the INTERSECT keyword, but I did not fully understand how to apply them to my particular query, as I only want to use some of the selected data from Table2 plus some constant values to insert into Table1.
EDIT:
I should add that the columns Table2Col2 through Table2Col5 don't actually exist in the result set; I am just populating these columns alongside the Table2Col1 value that is returned.

Since you are on SQL Server 2008, you can use a MERGE statement.
You can easily check whether a row exists based on a key.
Something like this:
merge TableMain AS target
using TableA as source
ON <join tables here>
WHEN MATCHED THEN <update>
WHEN NOT MATCHED BY TARGET THEN <insert>
WHEN NOT MATCHED BY SOURCE THEN <delete>

INTERSECT (like EXCEPT, which is SQL Server's name for MINUS) is out of the question because it compares whole rows. The other options are NOT IN / NOT EXISTS / LEFT JOIN, and MERGE. NOT IN works only with a single-column key, so it is out of the question in this instance. NOT EXISTS and LEFT JOIN should have the same performance in SQL Server, so I'll just use NOT EXISTS:
INSERT INTO Table1
(
Table1Col1,
Table1Col2,
Table1Col3,
Table1Col4,
Table1Col5
)
SELECT
Table2Col1,
Table2Col2 = constant1,
Table2Col3 = constant2,
Table2Col4 = constant3,
Table2Col5 = constant4
FROM Table2
WHERE
Condition1 = constant5
AND
Condition2 = constant6
AND
Condition3 = constant7
AND
Condition4 LIKE '%constant8%'
AND NOT EXISTS
(
SELECT *
FROM Table1 target
WHERE target.Table1Col1 = Table2.Table2Col1
AND target.Table1Col2 = Table2.Table2Col2
AND target.Table1Col3 = Table2.Table2Col3
)
MERGE is used to sync two tables; it can insert, update and delete records in the target table.
merge into table1 as target
using table2 as source
on target.Table1Col1 = source.Table2Col1
AND target.Table1Col2 = source.Table2Col2
AND target.Table1Col3 = source.Table2Col3
when not matched by target then
insert (Table1Col1,
Table1Col2,
Table1Col3,
Table1Col4,
Table1Col5)
values (Table2Col1,
Table2Col2,
Table2Col3,
Table2Col4,
Table2Col5);
If the columns taken from table2 are computed during the transfer (as with the constants in the question), use a derived table in place of table2 in the NOT EXISTS version, so that the computed columns can be referenced by name. The same applies to the MERGE example: put your query in place of the reference to table2.
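The derived-table variant of the NOT EXISTS insert can be sketched as follows. It uses Python's sqlite3 module as a stand-in for SQL Server, and the shortened column names, constants and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 INTEGER, col2 TEXT, col3 TEXT);
    CREATE TABLE table2 (col1 INTEGER, condition1 TEXT);
    INSERT INTO table1 VALUES (1, 'x', 'y');           -- already present
    INSERT INTO table2 VALUES (1, 'keep'), (2, 'keep'), (3, 'skip');
""")

# The derived table src carries the constants, so NOT EXISTS can compare them by name.
INSERT_NEW_ROWS = """
    INSERT INTO table1 (col1, col2, col3)
    SELECT src.col1, src.col2, src.col3
    FROM (SELECT col1, 'x' AS col2, 'y' AS col3
          FROM table2
          WHERE condition1 = 'keep') AS src
    WHERE NOT EXISTS (
        SELECT 1
        FROM table1 AS target
        WHERE target.col1 = src.col1
          AND target.col2 = src.col2
          AND target.col3 = src.col3
    )
"""

conn.execute(INSERT_NEW_ROWS)
conn.execute(INSERT_NEW_ROWS)  # running it again inserts nothing new

rows = conn.execute("SELECT col1, col2, col3 FROM table1 ORDER BY col1").fetchall()
```

Row 1 is skipped because it already exists, row 2 is inserted once, and row 3 is filtered out by the condition.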

You have to check whether the data already exists in the table. For this, use an IF condition around the insert to avoid inserting a duplicate.

how to compare two rows in one mdb table?

I have one mdb table with the following structure:
Field1 Field2 Field3 Field4
A ...
B ...
I am trying to use a query to list the fields that differ between rows A and B in a result set:
SELECT * From Table1
WHERE Field1 = 'A'
UNION
SELECT * From Table1
WHERE Field1 = 'B';
However, this query has two problems:
it lists all the fields including the identical cells, with a large table
it gives out an error message: too many fields defined.
How could I get around these issues?

Is it not easiest to just select all fields needed from the table, based on the Field1 value and group on the values needed?
So something like this:
SELECT field1, field2,...field195
FROM Table1
WHERE field1 = 'A' or field1 = 'B'
GROUP BY field1, field2, ....field195
This will give you all rows where field1 is A or B and there is a difference in one of the selected fields.
Oh and for the group by statement as well as the SELECT part, indeed use the previously mentioned edit mode for the query. There you can add all fields (by selecting them in the table and dragging them down) that are needed in the result, then click the 'totals' button in the ribbon to add the group by- statements for all. Then you only have to add the Where-clause and you are done.
Now that the question is more clear (you want the query to select fields instead of records based on the particular requirements), I'll have to change my answer to:
This is not possible.
(until proven otherwise) ;)
As far as I know, a query is used to select records, for example with the WHERE clause; it is never used to decide which fields should be shown depending on a certain criterion.
One thing that MIGHT help in this case is to look at the database design. Are those tables correctly made?
Suppose you have 190 of those fields that are merely details of the main data. You could separate this in another table, so you have a main table and details table.
The details table could look something like:
ID ID_Main Det_desc Det_value
This way you can find all detail values that differ between the two main values A and B using something like:
Select a.det_desc, a.det_value, b.det_value
from
(Select Det_desc, det_value
from tblDetails
where id_main = 'A') as A inner join
(Select Det_desc, det_value
from tblDetails
where id_main = 'B') as B
on A.det_desc = B.det_desc and A.det_value <> B.det_value
This you can join with your main table again if needed.
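A small runnable sketch of the details-table idea, using Python's sqlite3 module instead of Access. The sample rows are invented, and id_main is assumed to hold the main-row keys 'A' and 'B'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblDetails (
        id        INTEGER PRIMARY KEY,
        id_main   TEXT,
        det_desc  TEXT,
        det_value TEXT
    );
    INSERT INTO tblDetails (id_main, det_desc, det_value) VALUES
        ('A', 'Field2', 'red'), ('A', 'Field3', '10'), ('A', 'Field4', 'x'),
        ('B', 'Field2', 'red'), ('B', 'Field3', '12'), ('B', 'Field4', 'y');
""")

# Pair up the details of A and B by description and keep only differing values.
differences = conn.execute("""
    SELECT a.det_desc, a.det_value, b.det_value
    FROM (SELECT det_desc, det_value FROM tblDetails WHERE id_main = 'A') AS a
    INNER JOIN
         (SELECT det_desc, det_value FROM tblDetails WHERE id_main = 'B') AS b
      ON a.det_desc = b.det_desc AND a.det_value <> b.det_value
    ORDER BY a.det_desc
""").fetchall()
```

Each result row names one field that differs, with the value from A and the value from B side by side, which is the field-level output the original wide table could not produce.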

You can full join the table on itself, matching identical rows. Then you can filter on mismatches if one of the two join parts is null. For example:
select *
from (
select *
from Table1
where Field1 = 'A'
) A
full join
(
select *
from Table1
where Field1 = 'B'
) B
on A.Field2 = B.Field2
and A.Field3 = B.Field3
where A.Field1 is null
or B.Field1 is null
If you have 200 fields, ask Access to generate the column list by creating a query in design view. Switch to SQL view and copy/paste. An editor with column mode (like UltraEdit) will help create the query.
