Slick query => duplicated result

I have these models (simplified):
User(id: Int, name: String)
Restaurant(id: Int, ownerId: Int, name: String)
Employee(userId: Int, restaurantId: Int)
when I use this query :
for {
r <- Restaurants
e <- Employees
if r.ownerId === userId || (e.userId === userId && e.restaurantId === r.id)
} yield r
which is converted to :
select x2."id", x2."owner_id", x2."name" from "restaurants" x2, "employees" x3 where (x2."owner_id" = 2) or ((x3."user_id" = 2) and (x3."restaurant_id" = x2."id"))
So far, no problems. But when I insert this data:
User(1, "Foo")
User(2, "Fuu")
Restaurant(1, 2, "Fuu")
Restaurant(2, 1, "Foo")
Restaurant(3, 1, "Bar")
Employee(2, 2)
Employee(2, 3)
and then run the query, I get this result:
List(Restaurant(1, 2, "Fuu"), Restaurant(1, 2, "Fuu"), Restaurant(2, 1, "Foo"), Restaurant(3, 1, "Bar"))
I do not understand why Restaurant(1, 2, "Fuu") appears twice.
(I am using org.h2.Driver with url jdbc:h2:mem:play)
Am I missing something ?

Why you are getting 4 rows back
Cross joins are hard; what you are asking for with your SQL query is:
-- A Cartesian product of all of the rows in restaurants and employees
Employee.user_id | Employee.restaurant_id | Restaurant.name | Restaurant.owner_id
2 | 2 | Fuu | 2
2 | 3 | Fuu | 2
2 | 2 | Foo | 1
2 | 3 | Foo | 1
2 | 2 | Bar | 1
2 | 3 | Bar | 1
-- Keeping only the rows where the owner is 2 (the first half of the OR);
-- "Fuu" survives once per employee row, which is where the duplicate comes from
Employee.user_id | Employee.restaurant_id | Restaurant.name | Restaurant.owner_id
2 | 2 | Fuu | 2
2 | 3 | Fuu | 2
-- And combining that set with the set of those where the employee's user_id = 2
-- and the restaurant's ID is equal to the employee's restaurant ID
Employee.user_id | Employee.restaurant_id | Restaurant.name | Restaurant.owner_id
2 | 2 | Foo | 1
2 | 3 | Bar | 1
How to fix it
Make it an explicit left-join instead:
for {
(r, e) <- Restaurants leftJoin Employees on (_.id === _.restaurantId)
if r.ownerId === userId || e.userId === userId
} yield r
Alternatively, use exists to make it even clearer:
for {
r <- Restaurants
if r.ownerId === userId ||
Employees.filter(e => e.userId === userId && e.restaurantId === r.id).exists
} yield r
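To see the difference outside Slick, here is a minimal sketch that runs both SQL shapes against the sample data. It assumes Python's built-in sqlite3 module instead of H2, and the table and column names are taken from the generated SQL in the question; the variable names are only illustrative.

```python
import sqlite3

# In-memory database standing in for the H2 database from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE restaurants (id INTEGER, owner_id INTEGER, name TEXT);
    CREATE TABLE employees (user_id INTEGER, restaurant_id INTEGER);
    INSERT INTO restaurants VALUES (1, 2, 'Fuu'), (2, 1, 'Foo'), (3, 1, 'Bar');
    INSERT INTO employees VALUES (2, 2), (2, 3);
""")
user_id = 2

# Implicit cross join: every restaurant is paired with every employee row,
# so an owned restaurant is returned once per employee row.
cross_join_ids = [row[0] for row in conn.execute("""
    SELECT r.id
    FROM restaurants r, employees e
    WHERE r.owner_id = ? OR (e.user_id = ? AND e.restaurant_id = r.id)
    ORDER BY r.id
""", (user_id, user_id))]

# EXISTS: each restaurant is tested once, so it can appear at most once.
exists_ids = [row[0] for row in conn.execute("""
    SELECT r.id
    FROM restaurants r
    WHERE r.owner_id = ?
       OR EXISTS (SELECT 1 FROM employees e
                  WHERE e.user_id = ? AND e.restaurant_id = r.id)
    ORDER BY r.id
""", (user_id, user_id))]

print(cross_join_ids)  # [1, 1, 2, 3]
print(exists_ids)      # [1, 2, 3]
```

The EXISTS form never multiplies restaurant rows, so it stays duplicate-free even when a restaurant has several matching employee rows.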

Related

query by specific value to query

Hi, I am trying to write a stored procedure in PostgreSQL,
and I have to fill a table (vol_raleos), from 3 others, these are the tables:
super
zona | sitio | manejo
1 | 1 | 1
2 | 2 | 2
datos_vol_raleos
zona | sitio | manejo |vol_prodn
1 | 1 | 10 | 0
2 | 2 | 15 | 0
datos_manejos
manejoVR | manejoSuper
10 | 1
15 | 2
table to fill
vol_raleos
zona | sitio | manejo |vol_prodn
1 | 1 | 1 | 0
2 | 2 | 2 | 0
So, what I do is take the data that is in datos_vol_raleos, verify that it is in super, but first I must convert the manejoVR value according to the table datos_manejos
INSERT INTO vol_raleos
(zona, sitio, manejo, edad, densidad, vol_prod1, vol_prod2, ..., vol_prod36)
select zona, sitio, manejo, edad, densidad, vol_prod1, vol_prod2, ..., vol_prod36
from (
select volr.*, sup.zona, sup.sitio, sup.manejo, dm.manejo,
from datos_vol_raleos volr
left join super sup on (sup.zona = volr.zona and sup.sitio = volr.sitio and sup.manejo = volr.manejo) selrs
order by zona, sitio, manejo, edad, densidad
) sel_min_max;
Here I don't know how to get the manejoSuper value from datos_manejos so that I can compare it later.

You can insert from a select with a couple of joins. For example:
insert into vol_raleos
select s.zona, s.sitio, s.manejo, m.manejoSuper
from super s
join datos_vol_raleos d on (d.zona, d.sitio) = (s.zona, s.sitio)
join datos_manejos m on m.manejoVR = d.manejo
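As a rough illustration of the same idea on the simplified four-column tables from the question, the sketch below translates manejo through datos_manejos before checking it against super. It assumes Python's sqlite3 rather than PostgreSQL, and it assumes the translated value (manejoSuper) is what should land in vol_raleos.manejo, as the expected output in the question suggests.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE super (zona INTEGER, sitio INTEGER, manejo INTEGER);
    CREATE TABLE datos_vol_raleos (zona INTEGER, sitio INTEGER, manejo INTEGER, vol_prodn INTEGER);
    CREATE TABLE datos_manejos (manejoVR INTEGER, manejoSuper INTEGER);
    CREATE TABLE vol_raleos (zona INTEGER, sitio INTEGER, manejo INTEGER, vol_prodn INTEGER);
    INSERT INTO super VALUES (1, 1, 1), (2, 2, 2);
    INSERT INTO datos_vol_raleos VALUES (1, 1, 10, 0), (2, 2, 15, 0);
    INSERT INTO datos_manejos VALUES (10, 1), (15, 2);
""")

# Translate manejoVR to manejoSuper first, then keep only the rows
# whose (zona, sitio, translated manejo) exists in super.
conn.execute("""
    INSERT INTO vol_raleos (zona, sitio, manejo, vol_prodn)
    SELECT d.zona, d.sitio, m.manejoSuper, d.vol_prodn
    FROM datos_vol_raleos d
    JOIN datos_manejos m ON m.manejoVR = d.manejo
    JOIN super s ON s.zona = d.zona
                AND s.sitio = d.sitio
                AND s.manejo = m.manejoSuper
""")

filled = conn.execute("SELECT * FROM vol_raleos ORDER BY zona").fetchall()
print(filled)  # [(1, 1, 1, 0), (2, 2, 2, 0)]
```

The real table has many more columns (edad, densidad, vol_prod1 and so on); they would be added to both column lists in the same way.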

SQLite - select rows by existence of date in given intervals

I have an sqlite3 database with two tables that looks like this:
Table: Position
| pk | name | ...
------------------
| 1 | pos1 | ...
| 2 | pos2 | ...
Table: Status
| pk_position | datetime | ...
----------------------
| 1 | 20170201 | ...
| 1 | 20170204 | ...
| 1 | 20170205 | ...
| 1 | 20170207 | ...
| 2 | 20170204 | ...
| 2 | 20170201 | ...
| 2 | 20170208 | ...
Where datetime is "YYYYMMDD" (i.e. %Y%m%d) and pk_position is a foreign key to the table Position.
I need the following: given two intervals of time int1 = [day1:day2] and int2 = [day3:day4], I want a unique selection of pk_position for which there exists at least 1 row with datetime contained in each interval.
Examples (using example tables):
int1 = ["20170201" : "20170202"] and int2 = ["20170202" : "20170203"] => (null)
int1 = ["20170204" : "20170205"] and int2 = ["20170205" : "20170206"] => 1
int1 = ["20170203" : "20170204"] and int2 = ["20170204" : "20170205"] => 1, 2
I tried to use the EXISTS but I can't find any smart way to achieve this.
Thanks!
OBS: I tried to keep the question as broad as possible. In reality in all my use cases the intervals will have the form [day1 : day2], [day2 : day3] (i.e. they share a common boundary), just like all examples. If doing this for a common boundary is easier, I'll be happy with a solution to this simpler problem.

I think this will get you there. I don't have a way to test against SQLite, so it wouldn't surprise me if there were a missing comma or a misplaced semicolon somewhere, but this is the logic in SQL, adapted to SQLite as best I can without final testing:
CREATE TEMP TABLE
_time_intervals
(
int1 TEXT,
int2 TEXT,
int1Start TEXT,
int1End TEXT,
int2Start TEXT,
int2End TEXT
);
INSERT INTO
_time_intervals(int1, int2)
VALUES ( '["20170201" : "20170202"]', '["20170202" : "20170203"]' );
UPDATE _time_intervals
SET int1Start = SUBSTR(int1, 3, 8),
int1End = SUBSTR(int1, 16, 8),
int2Start = SUBSTR(int2, 3, 8),
int2End = SUBSTR(int2, 16, 8);
SELECT
pk_position
FROM
(
SELECT
s.pk_position,
1 AS int1_counter,
0 AS int2_counter
FROM
Status AS s
WHERE
s.datetime BETWEEN (SELECT int1Start FROM _time_intervals) AND (SELECT int1End FROM _time_intervals)
UNION ALL
SELECT
s.pk_position,
0 AS int1_counter,
1 AS int2_counter
FROM
Status AS s
WHERE
s.datetime BETWEEN (SELECT int2Start FROM _time_intervals) AND (SELECT int2End FROM _time_intervals)
) AS sq
GROUP BY
pk_position
HAVING
SUM(int1_counter) > 0
AND
SUM(int2_counter) > 0;
DROP TABLE _time_intervals;
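If the interval bounds are already available as separate values, the temp table and string parsing can probably be skipped. The sketch below is one possible simplification, not the answer's exact method: it assumes Python's sqlite3 module, assumes datetime is stored as TEXT, and uses a hypothetical helper named positions_in_both. It relies on SQLite evaluating a comparison to 0 or 1, so SUM counts the matching rows per position.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Status (pk_position INTEGER, datetime TEXT);
    INSERT INTO Status VALUES
        (1, '20170201'), (1, '20170204'), (1, '20170205'), (1, '20170207'),
        (2, '20170204'), (2, '20170201'), (2, '20170208');
""")

def positions_in_both(conn, int1, int2):
    """Return pk_position values with at least one row in each interval.

    int1 and int2 are (start, end) tuples of 'YYYYMMDD' strings, inclusive.
    """
    rows = conn.execute("""
        SELECT pk_position
        FROM Status
        GROUP BY pk_position
        HAVING SUM(datetime BETWEEN ? AND ?) > 0
           AND SUM(datetime BETWEEN ? AND ?) > 0
        ORDER BY pk_position
    """, (*int1, *int2))
    return [row[0] for row in rows]

# The three examples from the question.
print(positions_in_both(conn, ("20170201", "20170202"), ("20170202", "20170203")))  # []
print(positions_in_both(conn, ("20170204", "20170205"), ("20170205", "20170206")))  # [1]
print(positions_in_both(conn, ("20170203", "20170204"), ("20170204", "20170205")))  # [1, 2]
```

Because the dates are zero-padded YYYYMMDD strings, plain text comparison orders them chronologically.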

Remove all records with opposite sign

I'm looking for a SQL query (or even better a LINQ query) to remove people who have cancelled their leave, i.e. remove all records with the same NAME and same START and END and the DAYS_TAKEN values differ only in the sign.
How to get from this
NAME |DAYS_TAKEN |START |END |UNIQUE_LEAVE_ID
--------|-----------|-----------|-----------|-----------
Alice | 2 | 1 June | 3 June | 1 --remove because cancelled
Alice | -2 | 1 June | 3 June | 2 --cancelled
Alice | 3 | 5 June | 8 June | 3 --keep
Bob | 10 | 4 June | 14 June | 4 --keep
Charles | 12 | 2 June | 14 June | 5 --remove because cancelled
Charles | -12 | 2 June | 14 June | 6 --cancelled
David | 5 | 3 June | 8 June | 7 --keep
To this?
NAME |DAYS_TAKEN |START |END |UNIQUE_LEAVE_ID
--------|-----------|-----------|-----------|-----------
Alice | 3 | 5 June | 8 June | 3 --keep
Bob | 10 | 4 June | 14 June | 4 --keep
David | 5 | 3 June | 8 June | 7 --keep
What I've tried
Query1 to find all the cancelled records (not sure if this is correct)
SELECT L1.UNIQUE_LEAVE_ID
FROM LEAVE L1
INNER JOIN LEAVE L2 ON L2.DAYS_TAKEN > 0 AND ABS(L1.DAYS_TAKEN) = L2.DAYS_TAKEN AND L1.NAME= L2.NAME AND L1.START = L2.START AND L1.END = L2.END
WHERE L1.DAYS_TAKEN < 0
Then I use Query1 twice in an inner select like so
SELECT L.* FROM LEAVE L WHERE
L.UNIQUE_LEAVE_ID NOT IN (Query1)
AND L.UNIQUE_LEAVE_ID NOT IN (Query1)
Is there a way to use the inner query only once?
(It's an Oracle database, being called from .NET/C#)

You can use a query like the following:
SELECT NAME, START, END
FROM LEAVE
GROUP BY NAME, START, END
HAVING SUM(DAYS_TAKEN) = 0
in order to get NAME, START, END groups that have been cancelled (assuming DAYS_TAKEN of the cancellation record negates the days of the initial record).
Output:
NAME |START |END
--------|-----------|----------
Alice | 1 June | 3 June
Charles | 2 June | 14 June
Using the above query as a derived table you can get records not being related to 'cancelled' groups:
SELECT L1.NAME, L1.DAYS_TAKEN, L1.START, L1.END, L1.UNIQUE_LEAVE_ID
FROM LEAVE L1
LEFT JOIN (
SELECT NAME, START, END
FROM LEAVE
GROUP BY NAME, START, END
HAVING SUM(DAYS_TAKEN) = 0
) L2 ON L1.NAME = L2.NAME AND L1.START = L2.START AND L1.END = L2.END
WHERE L2.NAME IS NULL
Output:
NAME |DAYS_TAKEN |START |END |UNIQUE_LEAVE_ID
--------|-----------|-----------|-----------|-----------
Alice | 3 | 5 June | 8 June | 3
Bob | 10 | 4 June | 14 June | 4
David | 5 | 3 June | 8 June | 7

You can use not exists:
select l.*
from leave l
where not exists (select 1
from leave l2
where l2.name = l.name and l2.start = l.start and
l2.end = l.end and l2.days_taken = - l.days_taken
);
This query can take advantage of an index on leave(name, start, end, days_taken).
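A quick way to check the NOT EXISTS approach against the sample rows is sketched below. It assumes Python's sqlite3 instead of Oracle, and the columns are renamed to start_date and end_date (an assumption made here so that no keywords need quoting).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE leave (
        name TEXT, days_taken INTEGER,
        start_date TEXT, end_date TEXT,   -- renamed from START / END
        unique_leave_id INTEGER
    );
    INSERT INTO leave VALUES
        ('Alice',     2, '1 June', '3 June',  1),
        ('Alice',    -2, '1 June', '3 June',  2),
        ('Alice',     3, '5 June', '8 June',  3),
        ('Bob',      10, '4 June', '14 June', 4),
        ('Charles',  12, '2 June', '14 June', 5),
        ('Charles', -12, '2 June', '14 June', 6),
        ('David',     5, '3 June', '8 June',  7);
""")

# Keep a row only if no other row has the same name and dates
# with the opposite number of days.
kept_ids = [row[0] for row in conn.execute("""
    SELECT l.unique_leave_id
    FROM leave l
    WHERE NOT EXISTS (
        SELECT 1
        FROM leave l2
        WHERE l2.name = l.name
          AND l2.start_date = l.start_date
          AND l2.end_date = l.end_date
          AND l2.days_taken = -l.days_taken
    )
    ORDER BY l.unique_leave_id
""")]

print(kept_ids)  # [3, 4, 7]
```

Both the original request and its cancellation disappear, because each one finds the other as its opposite-sign partner.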

Here is a variation with SUM() OVER:
SELECT x.*
FROM (SELECT l.*, SUM (days_taken) OVER (PARTITION BY name, "START", "END", ABS (days_taken) ORDER BY NULL) s
FROM leave l) x
WHERE s <> 0
And if you have Oracle 12, this gives you the cancelled records:
SELECT l.*
FROM leave l,
LATERAL (SELECT days_taken
FROM leave l2
WHERE l2.name = l.name
AND l2."START" = l."START"
AND l2."END" = l."END"
AND l2.days_taken = -l.days_taken) x
and this is what should remain:
SELECT l.*
FROM leave l
OUTER APPLY (SELECT days_taken
FROM leave l2
WHERE l2.name = l.name
AND l2."START" = l."START"
AND l2."END" = l."END"
AND l2.days_taken = -l.days_taken) x
WHERE x.days_taken IS NULL
A note on the column names: using reserved words as identifiers in Oracle SQL is not recommended, but if you must, enclose them in double quotes as done here.

I used Giorgos's answer to come up with this LINQ solution. It also handles people who cancel and re-apply for their leave multiple times. See Alice and Edgar below.
Sample data
int id = 0;
List<Leave> allLeave = new List<Leave>()
{
new Leave() { UniqueLeaveID=id++, Name="Alice", Start=new DateTime(2016,6,1), End=new DateTime(2016,6,3), Taken=-2 },
new Leave() { UniqueLeaveID=id++,Name="Alice", Start=new DateTime(2016,6,1), End=new DateTime(2016,6,3), Taken=2 },
new Leave() { UniqueLeaveID=id++, Name="Alice", Start=new DateTime(2016,6,1), End=new DateTime(2016,6,3), Taken=2 },
new Leave() { UniqueLeaveID=id++,Name="Alice", Start=new DateTime(2016,6,3), End=new DateTime(2016,6,5), Taken=3 },
new Leave() { UniqueLeaveID=id++,Name="Bob", Start=new DateTime(2016,6,4), End=new DateTime(2016,6,14), Taken=10 },
new Leave() { UniqueLeaveID=id++,Name="Charles", Start=new DateTime(2016,6,2), End=new DateTime(2016,6,14), Taken=12 },
new Leave() { UniqueLeaveID=id++,Name="Charles", Start=new DateTime(2016,6,2), End=new DateTime(2016,6,14), Taken=-12 },
new Leave() { UniqueLeaveID=id++,Name="David", Start=new DateTime(2016,6,3), End=new DateTime(2016,6,8), Taken=5 },
new Leave() { UniqueLeaveID=id++,Name="Edgar", Start=new DateTime(2016,6,3), End=new DateTime(2016,6,8), Taken=5 },
new Leave() { UniqueLeaveID=id++,Name="Edgar", Start=new DateTime(2016,6,3), End=new DateTime(2016,6,8), Taken=5 },
new Leave() { UniqueLeaveID=id++,Name="Edgar", Start=new DateTime(2016,6,3), End=new DateTime(2016,6,8), Taken=5 },
new Leave() { UniqueLeaveID=id++,Name="Edgar", Start=new DateTime(2016,6,3), End=new DateTime(2016,6,8), Taken=5 }
};
Linq Query (watch out for Oracle version 11 vs 12)
var filteredLeave = allLeave
.GroupBy(a => new { a.Name, a.Start, a.End })
.Select(a => new { Group = a.OrderByDescending(b=>b.Taken), Count = a.Count() })
.Where(a => a.Count % 2 != 0)
.Select(a => a.Group.First());
"OrderByDescending" ensures only positive days taken are returned.
Oracle SQL
SELECT
*
FROM
(
SELECT
L1.NAME, L1.START, L1.END, MAX(TAKEN) AS TAKEN, COUNT(*) AS CNT
FROM LEAVE L1
GROUP BY L1.NAME, L1.START, L1.END
) L2
WHERE MOD(L2.CNT,2)<>0 -- replace MOD with % for Microsoft SQL
The condition "WHERE MOD(L2.CNT,2)<>0" (or in LINQ "a.Count % 2 != 0") returns only people who applied once or an odd number of times (e.g. apply - cancel - apply). People who apply - cancel - apply - cancel are filtered out.

Postgresql : Alternative to joining the same table multiple times

I have two tables, entry and entry_metadata. entry_metadata describes an entry, which it references by entry_id, as variable/value pairs.
Given this data:
entry
id | name |
-------------
1 | entry1 |
2 | entry2 |
3 | entry3 |
entry_metadata
id | entry_id | variable | value
1 | 1 | width | 10
2 | 1 | height | 5
3 | 2 | width | 8
4 | 2 | height | 7
5 | ... | .... | ..
I'm getting this table:
id | name | width | height| ... | ...
-----------------------------------------
1 | entry1 | 10 | 5 |
2 | entry2 | 8 | 7 |
3 | entry3 | .. | .. |
with this SQL:
select e.id, e.name, em.value as width, emr.value as height
from
public.entry e
left join
public.entry_metadata em
on
em.entry_id = e.id and em.variable = 'width'
left join
public.entry_metadata emr
on
emr.entry_id = e.id and emr.variable = 'height'
The query above works, but as I add more variables (the entry_metadata table includes a large variety of them), the query gets really slow. Every join I add slows down execution considerably. Is there a way to get around this?

You can also do this with conditional aggregation:
select e.id, e.name,
max(case when em.variable = 'width' then em.value end) as width,
max(case when em.variable = 'height' then em.value end) as height
from public.entry e
left join public.entry_metadata em on em.entry_id = e.id
group by e.id, e.name;
Adding additional columns is just adding more aggregation functions.
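A minimal sketch of conditional aggregation on the sample data follows. It assumes Python's sqlite3 in place of PostgreSQL, assumes value is stored as text, and joins entry to entry_metadata once so that entries without metadata still appear.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entry (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE entry_metadata (
        id INTEGER PRIMARY KEY, entry_id INTEGER, variable TEXT, value TEXT
    );
    INSERT INTO entry VALUES (1, 'entry1'), (2, 'entry2'), (3, 'entry3');
    INSERT INTO entry_metadata VALUES
        (1, 1, 'width', '10'), (2, 1, 'height', '5'),
        (3, 2, 'width', '8'),  (4, 2, 'height', '7');
""")

# One join, then one MAX(CASE ...) per variable to pivot rows into columns.
pivoted = conn.execute("""
    SELECT e.id, e.name,
           MAX(CASE WHEN em.variable = 'width'  THEN em.value END) AS width,
           MAX(CASE WHEN em.variable = 'height' THEN em.value END) AS height
    FROM entry e
    LEFT JOIN entry_metadata em ON em.entry_id = e.id
    GROUP BY e.id, e.name
    ORDER BY e.id
""").fetchall()

for row in pivoted:
    print(row)
# (1, 'entry1', '10', '5')
# (2, 'entry2', '8', '7')
# (3, 'entry3', None, None)
```

The metadata table is read once regardless of how many variables are pivoted, which is the main reason this shape tends to scale better than one join per variable.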

Just use subselects for this:
SELECT
e.id,
e.name,
(SELECT em.value FROM public.entry_metadata em WHERE em.entry_id = e.id AND em.variable = 'width') AS width,
(SELECT em.value FROM public.entry_metadata em WHERE em.entry_id = e.id AND em.variable = 'height') AS height
FROM
public.entry e
So for each new variable you just need to add one more subselect.

Is there a way to get around this?
Yes, replace the entry_metadata table with an additional column in entry (possible types are hstore or jsonb) that stores the entry metadata as key-value pairs.
By the way, your tables follow a well-known and controversial database design pattern called "Entity Attribute Value".

how to make one sql query asking for many rows

I have a table like this (only part of it is shown):
node_id | k | v
123 | addr:housenumber | 50
123 | addr:street | Kingsway
123 | addr:city | London
123 | (some other stuff) | .....
100 | addr:housenumber | 121
100 | addr:street | Edmund St
100 | addr:city | London
I want to find out, using a single query, whether this table contains e.g. London, Kingsway 50, and what its node_id is. How can I write such a query, and is it even possible?
pseudocode:
SELECT node_id WHERE (k == 'addr:city') == 'London' AND (k == 'addr:street') == 'Kingsway' AND (k == 'addr:housenumber') == '50' AND for all node_id the same
Schema for database: http://pastebin.com/Yigjt77f, my table node_tags.

You could use self-joins like this:
SELECT n1.node_id FROM node_tags n1
INNER JOIN node_tags n2 ON n1.node_id = n2.node_id
INNER JOIN node_tags n3 ON n1.node_id = n3.node_id
WHERE n1.k = 'addr:housenumber' AND n1.v = '50'
AND n2.k = 'addr:street' AND n2.v = 'Kingsway'
AND n3.k = 'addr:city' AND n3.v = 'London'
There might be better ways of doing this with PostgreSQL but I'm not that familiar with that DBMS.
Sample SQL Fiddle.
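An alternative to the self-joins is to filter the matching key/value rows and count them per node. The sketch below shows that GROUP BY / HAVING variant, not the self-join above; it assumes Python's sqlite3 rather than PostgreSQL, and find_nodes is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node_tags (node_id INTEGER, k TEXT, v TEXT);
    INSERT INTO node_tags VALUES
        (123, 'addr:housenumber', '50'),
        (123, 'addr:street', 'Kingsway'),
        (123, 'addr:city', 'London'),
        (100, 'addr:housenumber', '121'),
        (100, 'addr:street', 'Edmund St'),
        (100, 'addr:city', 'London');
""")

def find_nodes(conn, wanted):
    """Return node_ids that carry every key/value pair in the dict `wanted`."""
    # One "(k = ? AND v = ?)" test per requested tag, combined with OR.
    conditions = " OR ".join("(k = ? AND v = ?)" for _ in wanted)
    params = [x for pair in wanted.items() for x in pair]
    sql = f"""
        SELECT node_id
        FROM node_tags
        WHERE {conditions}
        GROUP BY node_id
        HAVING COUNT(DISTINCT k) = ?   -- all requested keys matched
        ORDER BY node_id
    """
    return [row[0] for row in conn.execute(sql, params + [len(wanted)])]

address = {"addr:city": "London", "addr:street": "Kingsway", "addr:housenumber": "50"}
print(find_nodes(conn, address))  # [123]
```

Only the placeholders are generated dynamically; the values are still passed as bound parameters, so adding a fourth tag needs no extra join.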
