Selecting rows using multiple LIKE conditions from a table field - sql

I created a table out of a CSV file which is produced by an external software.
Amongst the other fields, this table contains one field called "CustomID".
Each row on this table must be linked to a customer using the content of that field.
Every customer may use one or more sets of CustomIDs at their own discretion, as long as all IDs in a set start with the same prefix.
So for example:
Customer 1 may use "cust1_n" and "cstm01_n" (where n is a number)
Customer 2 may use "customer2_n"
ImportedRows
PKID CustomID Description
---- --------------- --------------------------
1 cust1_001 Something
2 cust1_002 ...
3 cstm01_000001 ...
4 customer2_00001 ...
5 cstm01_000232 ...
..
Now I have created 2 support tables as follows:
Customers
PKID Name
---- --------------------
1 Customer 1
2 Customer 2
and
CustomIDs
PKID FKCustomerID SearchPattern
---- ------------ -------------
1 1 cust1_*
2 1 cstm01_*
3 2 customer2_*
What I need to achieve is the retrieval of all rows for a given customer, using all the LIKE conditions found in the CustomIDs table for that customer.
I have failed miserably so far.
Any clues, please?
Thanks in advance.
Silver.

To use LIKE you must replace the * with % in the pattern. Different DBMSs use different functions for string manipulation; let's assume there is a REPLACE function available:
SELECT ir.*
FROM ImportedRows ir
JOIN CustomIDs c ON ir.CustomID LIKE REPLACE(c.SearchPattern, '*', '%')
WHERE c.FKCustomerID = 1;
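If you would rather filter by the customer's name than by the numeric key, a sketch joining through the Customers table (still assuming a REPLACE function) could look like this:
SELECT ir.*
FROM ImportedRows ir
JOIN CustomIDs ci ON ir.CustomID LIKE REPLACE(ci.SearchPattern, '*', '%')
JOIN Customers cu ON cu.PKID = ci.FKCustomerID
WHERE cu.Name = 'Customer 1';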

Related

Hibernate criteria left join with query

I have two classes Apartment and AdditionalSpace representing tables as below.
Apartment table
ID AREA SOLD
---- ------ ----
1 100 1
2 200 0
AdditionalSpace table
ID AREA APARTMENTID
---- ------ -----------
10 10 1
11 10 1
12 10 1
20 20 2
21 20 2
As you can see, the Apartment table has a one-to-many relation with the AdditionalSpace table, i.e. Apartment.ID = AdditionalSpace.APARTMENTID.
Question: How do I retrieve the total area of a sold apartment, including its additional space area?
The SQL which I have used so far to retrieve a similar result is:
select sum(apt.area + ads.adsarea)
from apartment apt
left outer join (select sum(area) as adsarea, apartmentid
                 from additionalspace
                 group by apartmentid) ads on ads.apartmentid = apt.id
where apt.sold = 1
I am struggling to find a way to implement the above scenario via Criteria instead of SQL/HQL. Please suggest. Thanks.
I don't think this is possible in criteria. The closest I can see is to simply get the size of the apartment and the sum of the additional areas as two columns in your result, like this:
Criteria criteria = session.createCriteria(Apartment.class, "a");
criteria.createAlias("additionalSpaces", "ads"); // join the one-to-many collection
criteria.setProjection(Projections.projectionList()
    .add(Projections.property("area"))           // the apartment's own area
    .add(Projections.groupProperty("a.id"))      // one result row per apartment
    .add(Projections.sum("ads.area")));          // sum of its additional space areas
Alternatively, if you still want to use Hibernate but are happy to write it in HQL, you can do the following:
select ads.apartment.id,max(a.area)+sum(ads.area)
from Apartment a
join a.additionalSpaces ads
group by ads.apartment.id
This works because HQL allows you to write the + to add together the two projections, but I don't know that an analogous method exists in the Projections API.
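For reference, with the table layout shown in the question that HQL corresponds to roughly this SQL (a sketch, not necessarily the exact statement Hibernate generates):
SELECT ads.apartmentid,
       MAX(apt.area) + SUM(ads.area) AS totalarea
FROM apartment apt
JOIN additionalspace ads ON ads.apartmentid = apt.id
-- add WHERE apt.sold = 1 to restrict it to sold apartments, as in the original query
GROUP BY ads.apartmentid;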

SQL Two SELECT vs. JOIN best performance?

I wonder which has better performance in this case. First of all, I want to show the user his medical information. I have two tables:
user
-----
id_user | type_blood | number | ...
1 O 123
2 A+ 442
user_allergies
-----------
id_user | name
1 name1
1 name2
I want to return:
JSON {id_user=1, type_blood=0, allergies=(name1,name2)}
So, is it better to do a JOIN on user and user_allergies and iterate, or maybe two SELECTs?
But what if I then have another table like user_allergies, so that the result could be:
user_another_table
-----------
id_user | name
1 namet1
1 namet2
1 namet3
JSON {id_user=1, type_blood=0, allergies=(name1,name2), table=(namet1,namet2,namet3)}
Is it better to do three SELECTs or a JOIN? But then I have to iterate over the results and I can't see an easy way. A JOIN can give me a result like:
id_user | type_blood | allergy_name | another_table_name
1 O name1 namet1
1 O name1 namet2
1 O name1 namet3
1 O name2 namet1
1 O name2 namet2
1 O name2 namet3
Is there any way to extract:
id_user | type_blood | allergy_name | another_table_name
1 O name1 namet1
1 O name2 namet2
1 O namet3
Thanks community, I'm a newbie in SQL.
Depending on the data - there is no way to get the 2nd set of results you've shown if the 1st set of results shows the values. The 2nd one is throwing data away - in this case the pairing of allergy 'name2' with another_table_name 'namet3'. This is why you get many rows back with repeated data.
You can use the GROUP BY clause to restrict this in some cases, but again - it won't let you throw away data like that.
You could try using the COALESCE function, if your DB supports it.
If not, I think you're going to have to construct your JSON in some business logic, in which case it's fine to read the data in a 3-way join. You order by the user id and either create or append the row data to the JSON document depending on whether a user record is present or not (if you order by user id, you only need to keep track of when the user id value changes).
Alternatively, you can read a list of users and their single-item data in one query, and then hit the DB again for the repeating data.
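For the 3-way join option, a minimal sketch using the table and column names from the question (LEFT JOINs keep users that have no rows in the child tables; note that user may need quoting in some DBMSs) would be:
SELECT u.id_user,
       u.type_blood,
       a.name AS allergy_name,
       t.name AS another_table_name
FROM user u
LEFT JOIN user_allergies a ON a.id_user = u.id_user
LEFT JOIN user_another_table t ON t.id_user = u.id_user
WHERE u.id_user = 1
ORDER BY u.id_user;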

How to make an SQL query that combines rows from one table with rows from another table under specific conditions in SQLite

I have a SQLite3 database with three tables. Sample data looks like this:
Original
id aName code
------------------
1 dog DG
2 cat CT
3 bat BT
4 badger BDGR
... ... ...
Translated
id orgID isTranslated langID aName
----------------------------------------------
1 2 1 3 katze
2 1 1 3 hund
3 3 0 3 (NULL)
4 4 1 3 dachs
... ... ... ... ...
Lang
id Langcode
-----------
1 FR
2 CZ
3 DE
4 RU
... ...
I want to select all data from Original and Translated in such a way that the result consists of all the data in the Original table, but the aName of rows that have a translation is replaced with the aName from the Translated table, so that I can then apply an ORDER BY clause and sort the data in the desired way.
All data and table designs are examples just to show the problem. The schema does contain some elements like an isTranslated column, or translated and original names in separate tables; these elements are required by the application's design.
To be more specific, this is an example rowset I would like to produce. It's all the data from table Original, modified by data from Translated where a translation is available for that particular id from Original.
Desired Result
id aName code isTranslated
---------------------------------
1 hund DG 1
2 katze CT 1
3 bat BT 0
4 dachs BDGR 1
... ... ... ...
This is a typical application for the CASE expression:
SELECT Original.id,
CASE isTranslated
WHEN 1 THEN Translated.aName
ELSE Original.aName
END AS aName,
code,
isTranslated
FROM Original
JOIN Translated ON Original.id = Translated.orgID
WHERE Translated.langID = (SELECT id FROM Lang WHERE Langcode = 'DE')
If not all records in Original have a corresponding record in Translated, use LEFT JOIN instead.
If untranslated names are guaranteed to be NULL, you can just use IFNULL(Translated.aName, Original.aName) instead.
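Combining those two remarks, a sketch of the LEFT JOIN variant could look like this (note that the language filter moves into the ON clause, otherwise the LEFT JOIN would effectively turn back into an inner join):
SELECT Original.id,
       IFNULL(Translated.aName, Original.aName) AS aName,
       Original.code,
       IFNULL(Translated.isTranslated, 0) AS isTranslated
FROM Original
LEFT JOIN Translated
       ON Original.id = Translated.orgID
      AND Translated.langID = (SELECT id FROM Lang WHERE Langcode = 'DE');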
You should probably list the actual results you want, which would help people help you in the future.
In the current case, I'm guessing you want something along these lines:
SELECT Original.id, Original.code, Translated.aName
FROM Original
JOIN Lang
ON Lang.langCode = 'DE'
JOIN Translated
ON Translated.orgId = Original.id
AND Translated.langId = Lang.id
AND Translated.aName IS NOT NULL;
(Check out my example to see if these are the results you want).
In any case, the table set you've got is heading towards a fairly standard 'translation table' setup. However, there are some basic changes I'd make.
Original
Name the table to something specific, like Animal
Don't include a 'default' translation in the table (you can use a view, if necessary).
'code' is fine, although in the case of animals, genus/species probably ought to be used
Lang
'Language' is often a reserved word in RDBMSs, so the shorter name is fine.
Specifically name which 'language code' you're using (and don't abbreviate column names). There are actually (up to) three different ISO codes possible - just grab them all.
(Also, remember that languages have language-specific names, so language also needs its own 'translation' table.)
Translated
Name the table something entity-specific, like AnimalNameTranslated, or some such.
isTranslated is unnecessary - you can derive it from the existence of the row - don't add a row if the term isn't translated yet.
Put all 'translations' into the table, including the 'default' one. This means all your terms are in one place, so you don't have to go looking elsewhere.
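As a rough sketch of what that redesign could look like in SQLite (the table and column names below are illustrative, based on the suggestions above rather than the original schema):
-- one row per animal; translated names live elsewhere
CREATE TABLE Animal (
    id   INTEGER PRIMARY KEY,
    code TEXT NOT NULL
);
-- unabbreviated, specific language-code columns; grab all the ISO variants
CREATE TABLE Lang (
    id            INTEGER PRIMARY KEY,
    iso639_1_code TEXT,
    iso639_2_code TEXT,
    iso639_3_code TEXT
);
-- every name, including the 'default' one, is just another row here;
-- 'is translated' is implied by the existence of the row
CREATE TABLE AnimalNameTranslated (
    animal_id INTEGER NOT NULL REFERENCES Animal(id),
    lang_id   INTEGER NOT NULL REFERENCES Lang(id),
    aName     TEXT NOT NULL,
    PRIMARY KEY (animal_id, lang_id)
);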

SQL - ordering results by parent child

I have entries in my table of products and categories with columns id and parent.
Let's say I have the following:
0 ----- 0 ------ home
1 ----- 4 ------ PD1
2 ----- 0 ------ CAT1
3 ----- 2 ------ PD2
4 ----- 2 ------ CAT2
The first column being the id, the second being the parent, and a title at the end.
Is there a way (using ORDER BY or some other method) of returning the results in the following order?
0 ----- 0 ------ home
2 ----- 0 ------ CAT1
3 ----- 2 ------ PD2
4 ----- 2 ------ CAT2
1 ----- 4 ------ PD1
Try this:
SELECT id, parent, title
FROM yourtable
ORDER BY parent, id
try this
SELECT * FROM yourtablename ORDER BY parentfieldname
could it be as simple as
ORDER BY ParentID, ID
Firstly, if you want to order in a custom way (not using PKs or alphabetically on a name field), you need to add a field to define the ordering weight of the various objects. I would add a field to the table called something like ordering_weight - you do not want to use the field name order or sequence because they are reserved SQL words.
Secondly, you need an ORDER BY clause: ORDER BY top_level.ordering_weight, next_level.ordering_weight, ..., deepest_level.ordering_weight. Notice that my ORDER BY clause orders first by the highest level of my tree, and last by the lowest or deepest level of the tree.
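As a sketch, assuming the table is called categories (a hypothetical name) and has the suggested ordering_weight column, a two-level version of that ordering could look like this:
SELECT child.id, child.parent, child.title
FROM categories AS parent
JOIN categories AS child ON child.parent = parent.id
ORDER BY parent.ordering_weight, child.ordering_weight;  -- each extra level of the tree needs another self-join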
Of course, disregard the above if what you are looking for is a dynamic level of recursion.
Generally when I see a parent-child relationship like this, I see people wanting to do more than one level of recursion. The problem with your schema is that it doesn't support dynamic levels of recursion as it is. You can only fetch the top-level parent's children; every additional level requires another join (there are some clever ways to get past this, but they still require additional SQL per level).
I think what might be more useful to you is to look into the Nested Set Model, which allows querying of infinite levels of recursion.
see: http://en.wikipedia.org/wiki/Nested_set_model
For example, the following tree of parent-child relationships is extremely difficult to handle using standard joins, but is very easy using a model like nested sets.
Category A
- Category B
- - Category D
- Category E
Category F
- Category G
- - Category H
- - - Category I
- - - - Category J
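In the nested set model each row stores a left and right boundary (commonly called lft and rgt) instead of, or alongside, a parent id. As a sketch with those hypothetical columns on the same hypothetical categories table, fetching a whole subtree at any depth becomes a single range query:
SELECT child.id, child.title
FROM categories AS child
JOIN categories AS root ON child.lft BETWEEN root.lft AND root.rgt
WHERE root.title = 'Category F'
ORDER BY child.lft;  -- lft order is depth-first tree order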

SQL Alternative to performing an INNER JOIN on a single table

I have a large table (TokenFrequency) which has millions of rows in it. The TokenFrequency table is structured like this:
Table - TokenFrequency
id - int, primary key
source - int, foreign key
token - char
count - int
My goal is to select all of the rows in which two sources share the same token. For example, if my table looked like this:
id --- source --- token --- count
1 ------ 1 --------- dog ------- 1
2 ------ 2 --------- cat -------- 2
3 ------ 3 --------- cat -------- 2
4 ------ 4 --------- pig -------- 5
5 ------ 5 --------- zoo ------- 1
6 ------ 5 --------- cat -------- 1
7 ------ 5 --------- pig -------- 1
I would want a SQL query to give me source 1, source 2, and the sum of the counts. For example:
source1 --- source2 --- token --- count
---- 2 ----------- 3 --------- cat -------- 4
---- 2 ----------- 5 --------- cat -------- 3
---- 3 ----------- 5 --------- cat -------- 3
---- 4 ----------- 5 --------- pig -------- 6
I have a query that looks like this:
SELECT F.source AS source1, S.source AS source2, F.token,
(F.count + S.count) AS sum
FROM TokenFrequency F
INNER JOIN TokenFrequency S ON F.token = S.token
WHERE F.source <> S.source
This query works fine but the problems that I have with it are that:
I have a TokenFrequency table that has millions of rows and therefore need a faster alternative to obtain this result.
The current query that I have is giving duplicates. For example, it's selecting:
source1=2, source2=3, token=cat, count=4
source1=3, source2=2, token=cat, count=4
This isn't too much of a problem, but if there is a way to eliminate those and in turn obtain a speed increase then it would be very useful.
The main issue that I have is the speed of the query; with my current query it takes hours to complete. The INNER JOIN of the table with itself is what I believe to be the problem. I'm sure there has to be a way to eliminate the inner join and get similar results using just one instance of the TokenFrequency table. The second problem that I mentioned might also yield a speed increase in the query.
I need a way to restructure this query to provide the same results in a faster, more efficient manner.
Thanks.
I'd need a little more info to diagnose the speed issue, but to remove the dups, add this to the WHERE:
AND F.source<S.source
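Applied to the query from the question, that gives:
SELECT F.source AS source1, S.source AS source2, F.token,
       (F.count + S.count) AS sum
FROM TokenFrequency F
INNER JOIN TokenFrequency S ON F.token = S.token
WHERE F.source < S.source;  -- keeps only one row per unordered pair of sources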
Try this:
SELECT token, GROUP_CONCAT(source), SUM(count)
FROM TokenFrequency
GROUP BY token;
This should run a lot faster and also eliminate the duplicates. But the sources will be returned in a comma-separated list, so you'll have to explode that in your application.
You might also try creating a compound index over the columns token, source, count (in that order) and analyze with EXPLAIN to see if MySQL is smart enough to use it as a covering index for this query.
Update: I seem to have misunderstood your question. You don't want the sum of counts per token; you want the sum of counts for every pair of sources for a given token.
I believe the inner join is the best solution for this. An important guideline for SQL is that if you need to calculate an expression with respect to two different rows, then you need to do a join.
However, one optimization technique that I mentioned above is to use a covering index so that all the columns you need are included in an index data structure. The benefit is that all your lookups are O(log n), and the query doesn't need to do a second I/O to read the physical row to get other columns.
In this case, you should create the covering index over columns token, source, count as I mentioned above. Also try to allocate enough cache space so that the index can be cached in memory.
If token isn't indexed, it certainly should be.
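For example, the covering index suggested above could be created like this (the index name is just illustrative):
CREATE INDEX idx_tokenfreq_token_source_count
    ON TokenFrequency (token, source, count);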