Data structure for efficient multi-parameters search - sql

I have collection of multidimensional object (e.g class Person = {age : int , height : int, weight : int etc...}).
I need to query the collection with queries where some dimensions are fixed and the rest unspecified (e.g getallPersonWith {age = c , height = a} or getAllPersonWith {weigth = d}...)
Right now i have a multimap with {age, Height,...} (e.g all dimension that can be fixed) -> List : Person.To perform a query i first compute the set of keys that verify the query, then merge the corresponding list from the map.
Is there anything better, in terms of query speed ? in particular is there anything closer to using one sorted list by dimension (which i believe to be the fastest solutions, but too cumbersome to manage:) )
Just to be clear, i am not looking for an sql query.

For your purpose you can have a look at:
http://code.google.com/p/cqengine/
Should get you in the right direction

You mean something like:
SELECT * FROM person p
WHERE gender = 'F'
AND age >=18
AND age < 30
AND weight > 60 -- metric measures here !!
AND weight < 70
AND NOT EXISTS (
SELECT * from couple c
WHERE c.one = p.id OR c.two=p.id
);
Why do you think I use SQL?

Related

Query always gives a result although it shouldn't

I'd like to print the number of a room (stanza) only if that room has a reservation (prenotazione) between two dates in order to display an error message in php when the variable with the result is set. The problem is my query seems to check if any room has a reservation set between these dates and always gives output for any room asked.
SELECT stanze.num_stanza
FROM stanze, prenotazioni
WHERE prenotazioni.num_stanza=stanze.num_stanza
AND prenotazioni.check_in
BETWEEN '20190615'
AND '20190620'
OR prenotazioni.check_out
BETWEEN '20190615'
AND '20190620'
AND stanze.num_stanza='100'
Usually it is not good to have WHERE clauses like x OR y AND z. Try (x OR y) AND z. Otherwise if x is true then it will select no matter what z is.
e.g.
WHERE prenotazioni.num_stanza = stanze.num_stanza
AND (prenotazioni.check_in BETWEEN '20190615' AND '20190620' OR prenotazioni.check_out BETWEEN '20190615' AND '20190620')
AND stanze.num_stanza = '100'
It is not clear what you want exactly -- a full overlap or a partial overlap.
Either way, you can use EXISTS. The following is for a partial overlap:
SELECT s.num_stanza
FROM stanze s
WHERE EXISTS (SELECT 1
FROM prenotazioni p
WHERE p.num_stanza = s.num_stanza AND
p.check_in < '20190620' AND
p.check_out >= '20190615'
) AND
s.num_stanza = 100 -- looks like number so it probably is

SAP Query IMRG Measure documents

I'm learning SAP queries.
I want to get all the Measure documents from an equipement.
To do that, I use 3 tables :
EQUI, IMPTT, IMRG
The query works but I have all documents instead I only want to get the last one by Date. But I can't do that. I'm sure that I have to add a custom field, but I have tried but none of them works.
For example, my last code :
select min( IMRG~INVTS ) IMRG~RECDV
from IMRG inner join IMPTT on
IMRG~POINT = IMPTT~POINT into (INVTS, IMRGVAL)
where IMRG~POINT = IMPTT-POINT AND
IMPTT~MPOBJ = EQUI-OBJNR
and IMRG~CANCL = '' group by IMRG~MDOCM IMRG~RECDV.
ENDSELECT.
Thanks for your help.
You will need to get the date from IMRG, and the inverted timestamp field, so the MIN() of this will be the most recent - that looks correct.
However your GROUP BY looks wrong. You should be grouping on the IMPTT~POINT field so that you get one record per measurement point. Note that one Point IMPTT can have many measurements (IMRG), so something like this:
SELECT EQUI-OBJNR, IMPTT~POINT, MIN(IMRG~IMRC_INVTS)
...
GROUP BY EQUI-OBJNR, IMPTT~POINT
If I got you correctly, you are trying to get the freshest measurement of the equipment disregard of measurement point. So you can try this query, which is not so beautiful, but it just works.
SELECT objnr COUNT(*) MIN( invts )
FROM equi AS eq
JOIN imptt AS tt
ON tt~mpobj = eq~objnr
JOIN imrg AS ig
ON ig~point = tt~point
INTO (wa_objnr, count, wa_invts)
WHERE ig~cancl = ''
GROUP BY objnr.
SELECT SINGLE recdv FROM imrg JOIN imptt ON imptt~point = imrg~point INTO wa_imrgval WHERE invts = wa_invts AND imptt~mpobj = wa_objnr.
WRITE: / wa_objnr, count, wa_invts, wa_imrgval.
ENDSELECT.

Comparing Elements across collections

I have the following models:
class Collection(models.Model):
...
class Record(models.Model):
collection = models.ForeignKey(Collection, related_name='records')
filename = models.CharField(max_length=256)
checksum = models.CharField(max_length=64)
class Meta:
unique_together = (('filename', 'collection'),)
I want to perform the following query:
For each filename of Record I want to know the Collections that:
Do not provide a Record with that filename
or that provide such a Record but has a differing checksum
I have in mind an output like that:
| C1 C2 C3 <- collections
-----------+------------
file-1.txt | x
file-2.txt | x
file-3.txt | ! ! !
file-4.txt | x ! !
file-5.txt | ! ! x
x = missing
! = different checksum
What I've com up so far is that I create a query for each Collection, excluding all filenames that are within this collection but exist in others.
for collection in collections:
other_collections = [c for c in collections if c is not collection]
results[collection] = qs.filter(collection__in=other_collections).exclude(
filename__in=qs.filter(
collection=collection
).values_list('filename', flat=True)
).order_by('filename').values_list('filename', flat=True)
This somewhat solves the first part of my question, but is rather quirky and requires post-processing to get to the format I desire. And, more importantly, it does not address the checksum comparison.
Is it possible to perform the two queries in one combined step to get the results in the format I described above?
The solution would not necessarily have to use the QuerySet APIs, a fallback to raw SQL is fine by me too.
It is not possible to write a SQL query that returns a variable number of columns, although you can achieve that effect if you wrap everything in an array or JSON object.
If you know the collections, you could write SQL like this:
SELECT r.filename,
(SELECT r.checksum = r2.checksum FROM records r2 WHERE r.filename = r2.filename AND r2.collection_id = 1) AS c1,
(SELECT r.checksum = r2.checksum FROM records r2 WHERE r.filename = r2.filename AND r2.collection_id = 2) AS c2,
...
FROM records r
WHERE r.collection_id = 1
GROUP BY r.filename, r.checksum
For each filename/collection pair, you will get NULL if the collection doesn't have the record, true if the collection has it with the right checksum, or false if the collection has it with a different checksum.
I include WHERE r.collection_id = 1 because otherwise for the checksum comparison, you have to answer "different from what?"

Selecting rows from Parent Table only if multiple rows in Child Table match

Im building a code that learns tic tac toe, by saving info in a database.
I have two tables, Games(ID,Winner) and Turns(ID,Turn,GameID,Place,Shape).
I want to find parent by multiple child infos.
For Example:
SELECT GameID FROM Turns WHERE
GameID IN (WHEN Turn = 1 THEN Place = 1) AND GameID IN (WHEN Turn = 2 THEN Place = 4);
Is something like this possible?
Im using ms-access.
Turm - Game turn GameID - Game ID Place - Place on matrix
1=top right, 9=bottom left Shape - X or circle
Thanks in advance
This very simple query will do the trick in a single scan, and doesn't require you to violate First Normal Form by storing multiple values in a string (shudder).
SELECT T.GameID
FROM Turns AS T
WHERE
(T.Turn = 1 AND T.Place = 1)
OR (T.Turn = 2 AND T.Place = 4)
GROUP BY T.GameID
HAVING Count(*) = 2;
There is no need to join to determine this information, as is suggested by other answers.
Please use proper database design principles in your database, and don't violate First Normal Form by storing multiple values together in a single string!
The general solution to your problem can be accomplished by using a sub-query that contains a self-join between two instances of the Turns table:
SELECT * FROM Games
WHERE GameID IN
(
SELECT Turns1.GameID
FROM Turns AS Turns1
INNER JOIN Turns AS Turns2
ON Turns1.GameID = Turns2.GameID
WHERE (
(Turns1.Turn=1 AND Turns1.Place = 1)
AND
(Turns2.Turn=2 AND Turns2.Place = 4))
);
The Self Join between Turns (aliased Turns1 and Turns2) is key, because if you just try to apply both sets of conditions at once like this:
WHERE (
(Turns.Turn=1 AND Turns.Place = 1)
AND
(Turns.Turn=2 AND Turns.Place = 4))
you will never get any rows back. This is because in your table there is no way for an individual row to satisfy both conditions at the same time.
My experience using Access is that to do a complex query like this you have to use the SQL View and type the query in on your own, rather than use the Query Designer. It may be possible to do in the Designer, but it's always been far easier for me to write the code myself.
select GameID from Games g where exists (select * from turns t where
t.gameid = g.gameId and ((turn =1 and place = 1) or (turn =2 and place =5)))
This will select all the games that have atleast one turn with the coresponding criteria.
More info on exist:
http://www.techonthenet.com/sql/exists.php
I bypassed this problem by adding a column which holds the turns as a string example : "154728" and i search for it instead. I think this solution is also less demanding on the database

Help with a complex join query

Keep in mind I am using SQL 2000
I have two tables.
tblAutoPolicyList contains a field called PolicyIDList.
tblLossClaims contains two fields called LossPolicyID & PolicyReview.
I am writing a stored proc that will get the distinct PolicyID from PolicyIDList field, and loop through LossPolicyID field (if match is found, set PolicyReview to 'Y').
Sample table layout:
PolicyIDList LossPolicyID
9651XVB19 5021WWA85, 4421WWA20, 3314WWA31, 1121WAW11, 2221WLL99 Y
5021WWA85 3326WAC35, 1221AXA10, 9863AAA44, 5541RTY33, 9651XVB19 Y
0151ZVB19 4004WMN63, 1001WGA42, 8587ABA56, 8541RWW12, 9329KKB08 N
How would I go about writing the stored proc (looking for logic more than syntax)?
Keep in mind I am using SQL 2000.
Select LossPolicyID, * from tableName where charindex('PolicyID',LossPolicyID,1)>0
Basically, the idea is this:
'Unroll' tblLossClaims and return two columns: a tblLossClaims key (you didn't mention any, so I guess it's going to be LossPolicyID) and Item = a single item from LossPolicyID.
Find matches of unrolled.Item in tblAutoPolicyList.PolicyIDList.
Find matches of distinct matched.LossPolicyID in tblLossClaims.LossPolicyID.
Update tblLossClaims.PolicyReview accordingly.
The main UPDATE can look like this:
UPDATE claims
SET PolicyReview = 'Y'
FROM tblLossClaims claims
JOIN (
SELECT DISTINCT unrolled.LossPolicyID
FROM (
SELECT LossPolicyID, Item = itemof(LossPolicyID)
FROM unrolling_join
) unrolled
JOIN tblAutoPolicyList
ON unrolled.ID = tblAutoPolicyList.PolicyIDList
) matched
ON matched.LossPolicyID = claims.LossPolicyID
You can take advantage of the fixed item width and the fixed list format and thus easily split LossPolicyID without a UDF. I can see this done with the help of a number table and SUBSTRING(). unrolling_join in the above query is actually tblLossClaims joined with the number table.
Here's the definition of unrolled 'zoomed in':
...
(
SELECT LossPolicyID,
Item = SUBSTRING(LossPolicyID,
(v.number - 1) * #ItemLength + 1,
#ItemLength)
FROM tblLossClaims c
JOIN master..spt_values v ON v.type = 'P'
AND v.number BETWEEN 1 AND (LEN(c.LossPolicyID) + 2) / (#ItemLength + 2)
) unrolled
...
master..spt_values is a system table that is used here as the number table. Filter v.type = 'P' gives us a rowset with number values from 0 to 2047, which is narrowed down to the list of numbers from 1 to the number of items in LossPolicyID. Eventually v.number serves as an array index and is used to cut out single items.
#ItemLength is of course simply LEN(tblAutoPolicyList.PolicyIDList). I would probably also declared #ItemLength2 = #ItemLength + 2 so it wasn't calculated every time when applying the filter.
Basically, that's it, if I haven't missed anything.
If the PolicyIDList field is a delimited list, you have to first separate the individual policy IDs and create a temporary table with all of the results. Next up, use an update query on the tblLossClaims with 'where exists (select * from #temptable tt where tt.PolicyID = LossPolicyID).
Depending on the size of the table/data, you might wish to add an index to your temporary table.