Indexing one to many relational structure in solr

Indexing one to many relational structure in solr - indexing

I have a schema like the following that I want to index in SOLR. But I am not sure how to manage the one to many relationship between the first table: users and second table: address
ID NAME DESCRIPTION
-------------------------------------------------
1 NAME1 DEMO DESCRIPTION ONE
2 NAME2 DEMO DESCRIPTION TWO
3 NAME3 DEMO DESCRIPTION THREE
-------------------------------------------------
ADDR_ID USER_ID CITY STATE COUNTRY
-------------------------------------------------
1 1 cityv statev countryv
2 1 cityw statew countryw
3 2 cityx statex countryx
4 2 cityy statey countryy
5 3 cityz statez countryz
How to index users with multiple addresses?
Also how to search them by either users.name, address.city / address.state name?

Try Block Join with parent/child structure

Related

relational link within one table

I have Table User, which consists of two columns - id_user and name_user?
id_user
name_user
1
Vova
2
John
3
Ivan
4
Kate
I need to make relation between two users. If I understood correctly I should make Table relation_ids:
id_user_1
id_user_2
1
2
1
3
1
4
2
3
Tell me please do I need to double relationships between two users (for example 1-2 and 2-1)? Is there another ways to make relation within one table?

Knex.js Getting values from comma-separated

I have two SQlite3 tables task and tags
task is my master table and tags is storing tag names
I store comma-separated values in task
Now I want to get Tag names with use of a knex.js
table task
id task tags
---------------------
1 abc 1,2,3
2 xyz 3,1
3 apple 2
table tags
id tag
------------
1 cold
2 hot
3 normal
Now i want output as below
OUTPUT:
id task tags
---------------------
1 abc cold,hot,normal
2 xyz normal,cold
3 apple hot
I know i will have to use joins but not sure how to actually use it in knex.js. Please do help me.

Part of the problem is that your database is not properly normalised. Instead of having two tables task and tabs, with table tasks containing multiple tag IDs in the column 'tags' you should have three tables; 'tasks', 'tags' and the 'joining' table 'task_tags'. They would store the following data...
Tasks
id task
----------
1 abc
2 xyz
3 apple
Tags
id tag
------------
1 cold
2 hot
3 normal
task_tags
task_id tag_id
1 1
1 2
1 3
2 1
2 3
3 2
Now you can have as many tags as you like (whether or not any tasks use them) and as many tasks as you like (whether or not they use any tags) and you associate a task with it's tags via the task_tags table.
Then to get the result you want you would use the select
SELECT
tasks.id,
tasks.task,
GROUP_CONCAT(tags.tag) -- this gives you the csv line eg cold,hot,normal
from tasks
left join task_tags
ON tasks.id = task_tags.task_id
left join tags
on tags.id = task_tags.tag_id
GROUP BY task.id, tags.id
see https://www.sqlite.org/lang_aggfunc.html for explanation of GROUP_CONCAT

Your task table should be redesigned to hold one tag per row, not multiple tags in a single row:
id task tag
---------- ---------- ----------
1 abc 1
1 abc 2
1 abc 3
2 xyz 3
2 xyz 1
3 apple 2
Then it's easy:
SELECT task.id, task.task, group_concat(tags.tag, ',') AS tags
FROM task
JOIN tags ON task.tag = tags.id
GROUP BY task.id, task.task
ORDER BY task.id;
which gives
id task tags
---------- ---------- ---------------
1 abc cold,hot,normal
2 xyz normal,cold
3 apple hot
A design that follows the rules of relational databases makes life much easier (And the above can be normalized further; see the other answer); while some databases do support array types, sqlite is not one of them. If you insist on keeping your current design, though, there's an ugly hack involving the JSON1 extension and turning your CSV list of numbers into a JSON array:
SELECT task.id, task.task, group_concat(tags.tag, ',') AS tags
FROM task
JOIN json_each('[' || task.tags || ']') AS j
JOIN tags ON tags.id = j.value
GROUP BY task.id, task.task
ORDER BY task.id;

Table Join issue

Right now I've got a Main table in which I am uploading data. Because the Main table has many different duplicates, I Append various data out of the Main table into other tables such as, username, phone number, and locations in order to keep things optimized. Once I have everything stripped down from the Main table, I then append what's left into a final optimized Main table. Before this happens though, I run a select query joining all the stripped tables with the original Main table in order to connect the IDs from each table, with the correct data. For example:
Original Main Table
--Name---------Number------Due Date-------Location-------Charges Monthly-----Charges Total--
John Smith 111-1111 4/3 Chicago 234.56 500.23
Todd Jones 222-2222 4/3 New York 174.34 323.56
John Smith 111-1111 4/3 Chicago 274.56 670.23
Bill James 333-3333 4/3 Orlando 100.00 100.00
This gets split into 3 tables (name, number, location) and then there is a date table with all the dates for the year:
Name Table Number Table Location Table Due Date Table
--ID---Name------ -ID--Number--------- ---ID---Location---- --Date---
1 John Smith 1 111-1111 1 Chicago 4/1
2 Todd Jones 2 222-2222 2 New York 4/2
3 Bill James 3 333-3333 3 Orlando 4/3
Before The Original table gets stripped, I run a select query that grabs the ID from the 3 new tables, and joins them based on the connection they have with the original Main table.
Select Output
--Name ID----Number ID---Location ID---Due Date--
1 1 1 4/3
2 2 2 4/3
1 1 1 4/3
3 3 3 4/3
My issue comes when I need to introduce a new table that isn't able to be tied into the Original Main Table. I have an inventory table that, much like the original Main table, has duplicates and needs to be optimized. I do this by creating a secondary table that takes all the duplicated devices out and put them in their own table, and then strips the username and number out and puts them into their tables. I would like to add the IDs from this new device table into the select output that I have above. Resulting in:
Select Output
--Name ID----Number ID---Location ID---Due Date--Device ID---
1 1 1 4/3 1
2 2 2 4/3 1
1 1 1 4/3 2
3 3 3 4/3 1
Unlike the previous tables, the device table has no relationship to the originalMain Table, which is what is causing me so much headache. I can't seem to find a way to make this happen...is there anyway to accomplish this?

Any two tables can be joined. A table represents an application relationship. In some versions (not the original) of Entity-Relationship Modelling (notice that the "R" in E-R stands for "(application) relationship"!) a foreign key is sometimes called a "relationship". You do not need other tables or FKs to join any two tables.
Explain, in terms of its column names and the values for those names, exactly when a row should turn up in the result. Maybe you want:
SELECT *
FROM the stripped-and-ID'd version of the Original AS o
JOIN the stripped-and-ID'd version of the Device AS d
USING NameID, NumberID, LocationID and DueDate
Ie
SELECT *
FROM the stripped-and-ID'd version of the Original AS o
JOIN the stripped-and-ID'd version of the Device AS d
ON o.NameID=d.NameId AND o.NumberID=d.NumberID
AND o.LocationID=d.LocationID AND o.DueDateID=d.DueDate.
Suppose p(a,...) is some statement parameterized by a,... .
If o holds the rows where o(NameID,NumberID,LocationID,DueDate) and d holds the rows where d(NameID,NumberID,LocationID,DueDate,DeviceID) then the above holds the rows where o(NameID, NumberID, LocationID, DueDate) AND d(NameID,NumberID,LocationID,DueDate,DeviceID). But you really have not explained what rows you want.

The only way to "join" tables that have no relation is by unioning them together:
select attribute1, attribute2, ... , attributeN
from table1
where <predicate>
union // or union all
select attribute1, attribute2, ... , attributeN
from table2
where <predicate>
the where clauses are obviously optional
EDIT
optionally you could join the tables together by stating ON true which will act like a cross product

SQL Two SELECT vs. JOIN best performance?

I wonder which has better performance in this case. First of all, I want to show to the user his medical information. I have two tables
user
-----
id_user | type_blood | number | ...
1 O 123
2 A+ 442
user_allergies
-----------
id_user | name
1 name1
1 name2
I want to return:
JSON {id_user=1, type_blood=0, allergies=(name1,name2)}
So, Its better do a JOIN for user and user_allergies and iterate, or maybe two SELECT?
But if then I have another table like user_allergies, that the result can be:
user_another_table
-----------
id_user | name
1 namet1
1 namet2
1 namet3
JSON {id_user=1, type_blood=0, allergies=(name1,name2), table=(namet1,namet2,namet3)}
It's better three SELECT or a JOIN, but then I have to iterate on the results and I can't imagine a esay way. A JOIN can give me a result like:
id_user | type_blood | allergy_name | another_table_name
1 O name1 namet1
1 O name1 namet2
1 O name1 namet3
1 O name2 namet1
1 O name2 namet2
1 O name2 namet3
Is there any way to extract:
id_user | type_blood | allergy_name | another_table_name
1 O name1 namet1
1 O name2 namet2
1 O namet3
Thanks community, I'm newbie in SQL

Depending on the data - there is no way to get the 2nd set of results you've shown, if the 1st set of results shows the values. The 2nd one is throwing data away - in this case allergy 'name2' for another_table_name 'namet3'. This is why you get many rows back with repeated data.
You can use the group by clause to restrict this in some cases, but again - it won't let you throw away data like that.
You could try using the COALESCE clause, if your DB supports it.
If not, I think you're going to have to construct your JSON in some business logic, in which case its fine to read the data in a 3-way join. You order by the user id and either create or append the row data to the JSON document depending if a user record is present or not (if you order by user id, you only need to keep track of when the user id value changes).
Alternatively, you can read a list of users and single-item data in one query, and then ht the DB again for the repeating data.

Using different columns values twice in a single SQL query?

I have a mySQL table called "User" containing multiple mixed values as this:
[user_id] [user_email] [birthday]
---------------------------------
1 x#xxx.com 01/01/1981
2 y#yyy.com 02/02/1982
3 z#zzz.com 03/03/1983
I have another table called "Name" which contains name of the user, but also of some movies like this:
[node_id] [name] [user_id]
----------------------------------
9 John Doe 1
10 Star Wars 90
11 Mike Smith 2
12 Mary Lord 3
13 Rocky III 91
Finally, I have a third table named "Vote" with which is a relationship between a user and some movies he likes.
[vote_id] [node_id] [user_id]
------------------------------
1 10 1
2 10 2
3 13 1
12 10 3
13 13 2
What I'm struggling to do is pull a query with twice the "name" value for two separate things: the name of the user, and the name of the movie he likes. Like this:
[user_id] [user_name] [Birthday] [movie_name]
-------------------------------------------------
1 John Doe 01/01/1981 Star Wars
2 Mike Smith 02/02/1982 Star Wars
1 John Doe 01/01/1981 Rocky III
3 Mary Lord 03/03/1983 Rocky III
2 Mike Smith 02/02/1982 Rocky III
SELECT user.id,
node.name,
user.birthday,
IF(node.type = "movie", node.name, "")
FROM user,
node
JOIN vote ON vote.user_id = user.user_id
WHERE user.id = node.id
I think I'm all mixed up... anyone can help please?

Assuming your schema is exactly what you posted above this should work verbatim.
Query
SELECT user.user_id,
node.name user_name,
user.birthday,
(select node.name from node where node_id = vote.node_id) as movie_name
FROM user
JOIN node ON user.user_id = node.user_id
JOIN vote ON vote.user_id = user.user_id
Result

You have got the database structure wrong. Store the user name in your first table "User"

I would strongly suggest that you store the user_name in the users table. With that change you can then have a much more simple query and a properly normalized schema.
New proposed schema.
users table
(Added user_name column)
[user_id][name][user_email][birthday]
1 name1 x#xxx.com 01/01/1981
2 name2 y#yyy.com 02/02/1982
3 name3 z#zzz.com 03/03/1983
nodes table (call this movies)
(removed user entries and the user_id column as you'll be using votes to link these to users)
[node_id] [name]
10 Star Wars
11 Mike Smith
12 Mary Lord
13 Rocky III
votes table (call this something like movies_users)
(removed the vote_id column as it's just a join table)
[node_id] [user_id]
10 1
10 2
13 1
10 3
13 2
Then your query should look something like this:
select users.user_id, users.name, users.birthday, nodes.name as movie_name
from users
join votes on users.id = votes.user_id
join nodes on votes.node_id = nodes.node_id

select user_id,user_name,birthday,name
from user,name,vote
where (and here you do all the joins like user_id from one table equals user_id from another table)
But here we have a problem which makes me impossible to understand how to write the correct code you have 2 fields in two different tables, user_name and name, you want to join the tables by this name? I don't understand.) I think you are mixing the movie names with the user names, reformulate the question please

I agree with the other answers that you would be better off if you moved the user name into the user table. However, if you are stuck with your current table structure, try this:
SELECT user.id,
uname.name user_name,
user.birthday,
movie.name movie_name
FROM user
JOIN node uname ON uname.user_id = user.user_id
JOIN vote ON vote.user_id = user.user_id
JOIN node movie ON vote.node_id = movie.id
(Assuming votes can only be cast for Movies, it should be unnecessary to blank out non-movies as these should never exist.)

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Indexing one to many relational structure in solr - indexing

Try Block Join with parent/child structure

Related

relational link within one table

Knex.js Getting values from comma-separated

Table Join issue

SQL Two SELECT vs. JOIN best performance?

Using different columns values twice in a single SQL query?

Categories

Resources