Hi, I am trying to write a stored procedure in PostgreSQL.
I have to fill a table (vol_raleos) from three others. These are the tables:
super
zona | sitio | manejo
1 | 1 | 1
2 | 2 | 2
datos_vol_raleos
zona | sitio | manejo | vol_prodn
1 | 1 | 10 | 0
2 | 2 | 15 | 0
datos_manejos
manejoVR | manejoSuper
10 | 1
15 | 2
Table to fill:
vol_raleos
zona | sitio | manejo | vol_prodn
1 | 1 | 1 | 0
2 | 2 | 2 | 0
So what I do is take the rows in datos_vol_raleos and verify that each one exists in super, but first I must convert the manejo value (manejoVR) to its manejoSuper equivalent according to the table datos_manejos:
INSERT INTO vol_raleos
(zona, sitio, manejo, edad, densidad, vol_prod1, vol_prod2, ..., vol_prod36)
select zona, sitio, manejo, edad, densidad, vol_prod1, vol_prod2, ..., vol_prod36
from (
select volr.*, sup.zona, sup.sitio, sup.manejo, dm.manejo,
from datos_vol_raleos volr
left join super sup on (sup.zona = volr.zona and sup.sitio = volr.sitio and sup.manejo = volr.manejo) selrs
order by zona, sitio, manejo, edad, densidad
) sel_min_max;
Here I don't know how to get the manejoSuper value from datos_manejos so that I can compare it with super.manejo afterwards.
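You can insert from a select with a couple of joins. Translate the manejo value through datos_manejos first, then join to super on the translated value so that only rows present in super are inserted. For example:
insert into vol_raleos (zona, sitio, manejo, vol_prodn)
select d.zona, d.sitio, m.manejoSuper, d.vol_prodn
from datos_vol_raleos d
join datos_manejos m on m.manejoVR = d.manejo
join super s on (s.zona, s.sitio, s.manejo) = (d.zona, d.sitio, m.manejoSuper);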
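A minimal sketch of the same insert-from-select pattern, run from Python against an in-memory SQLite database rather than PostgreSQL, using the sample rows from the question. The explicit column list on the INSERT and the check against super.manejo are assumptions about what the asker wants, and the row-value comparison is written out as plain AND conditions so it does not depend on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE super (zona INT, sitio INT, manejo INT);
    CREATE TABLE datos_vol_raleos (zona INT, sitio INT, manejo INT, vol_prodn INT);
    CREATE TABLE datos_manejos (manejoVR INT, manejoSuper INT);
    CREATE TABLE vol_raleos (zona INT, sitio INT, manejo INT, vol_prodn INT);

    INSERT INTO super VALUES (1, 1, 1), (2, 2, 2);
    INSERT INTO datos_vol_raleos VALUES (1, 1, 10, 0), (2, 2, 15, 0);
    INSERT INTO datos_manejos VALUES (10, 1), (15, 2);
""")

# Translate manejoVR -> manejoSuper, then keep only rows that exist in super.
conn.execute("""
    INSERT INTO vol_raleos (zona, sitio, manejo, vol_prodn)
    SELECT d.zona, d.sitio, m.manejoSuper, d.vol_prodn
    FROM datos_vol_raleos d
    JOIN datos_manejos m ON m.manejoVR = d.manejo
    JOIN super s ON s.zona = d.zona
                AND s.sitio = d.sitio
                AND s.manejo = m.manejoSuper
""")

rows = conn.execute(
    "SELECT zona, sitio, manejo, vol_prodn FROM vol_raleos ORDER BY zona"
).fetchall()
print(rows)
```

With the sample data this fills vol_raleos with the two rows shown in the question's "table to fill".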
I have an sqlite3 database with two tables that looks like this:
Table: Position
| pk | name | ...
------------------
| 1 | pos1 | ...
| 2 | pos2 | ...
Table: Status
| pk_position | datetime | ...
----------------------
| 1 | 20170201 | ...
| 1 | 20170204 | ...
| 1 | 20170205 | ...
| 1 | 20170207 | ...
| 2 | 20170204 | ...
| 2 | 20170201 | ...
| 2 | 20170208 | ...
Where datetime is "YYYYMMDD" (i.e. %Y%m%d) and pk_position is a foreign key to the table Position.
I need the following: given two intervals of time int1 = [day1:day2] and int2 = [day3:day4], I want a unique selection of pk_position for which there exists at least 1 row with datetime contained in each interval.
Examples (using example tables):
int1 = ["20170201" : "20170202"] and int2 = ["20170202" : "20170203"] => (null)
int1 = ["20170204" : "20170205"] and int2 = ["20170205" : "20170206"] => 1
int1 = ["20170203" : "20170204"] and int2 = ["20170204" : "20170205"] => 1, 2
I tried to use EXISTS but I can't find any smart way to achieve this.
Thanks!
OBS: I tried to keep the question as broad as possible. In reality in all my use cases the intervals will have the form [day1 : day2], [day2 : day3] (i.e. they share a common boundary), just like all examples. If doing this for a common boundary is easier, I'll be happy with a solution to this simpler problem.
I think this will get you there. I don't have a way to test against SQLite, so it wouldn't surprise me if there were a missing comma or a misplaced semicolon somewhere, but this is the logic in SQL, adapted to SQLite as best I can without final testing:
CREATE TEMP TABLE
_time_intervals
(
int1 TEXT,
int2 TEXT,
int1Start TEXT,
int1End TEXT,
int2Start TEXT,
int2End TEXT
);
INSERT INTO
_time_intervals(int1, int2)
VALUES ( '["20170201" : "20170202"]', '["20170202" : "20170203"]' );
UPDATE _time_intervals
SET int1Start = SUBSTR(int1, 3, 8),
int1End = SUBSTR(int1, 16, 8),
int2Start = SUBSTR(int2, 3, 8),
int2End = SUBSTR(int2, 16, 8);
SELECT
pk_position
FROM
(
SELECT
s.pk_position,
1 AS int1_counter,
0 AS int2_counter
FROM
Status AS s
WHERE
s.datetime BETWEEN (SELECT int1Start FROM _time_intervals) AND (SELECT int1End FROM _time_intervals)
UNION ALL
SELECT
s.pk_position,
0 AS int1_counter,
1 AS int2_counter
FROM
Status AS s
WHERE
s.datetime BETWEEN (SELECT int2Start FROM _time_intervals) AND (SELECT int2End FROM _time_intervals)
) AS sq
GROUP BY
pk_position
HAVING
SUM(int1_counter) > 0
AND
SUM(int2_counter) > 0;
DROP TABLE _time_intervals;
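As an alternative to the temp-table approach, here is a hedged sketch that uses two EXISTS subqueries (the construct the asker mentioned) with bound parameters for the interval boundaries. It is not the query above, just another way to express the same condition. It assumes datetime is stored as TEXT in YYYYMMDD form, so string comparison matches date order; the function name positions_in_both is made up for illustration.

```python
import sqlite3

def positions_in_both(conn, int1, int2):
    """Return pk values with at least one Status row in each inclusive interval."""
    sql = """
        SELECT p.pk
        FROM Position AS p
        WHERE EXISTS (SELECT 1 FROM Status AS s
                      WHERE s.pk_position = p.pk
                        AND s.datetime BETWEEN ? AND ?)
          AND EXISTS (SELECT 1 FROM Status AS s
                      WHERE s.pk_position = p.pk
                        AND s.datetime BETWEEN ? AND ?)
        ORDER BY p.pk
    """
    return [row[0] for row in conn.execute(sql, (*int1, *int2))]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Position (pk INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Status (pk_position INTEGER REFERENCES Position(pk), datetime TEXT);
    INSERT INTO Position VALUES (1, 'pos1'), (2, 'pos2');
    INSERT INTO Status VALUES
        (1, '20170201'), (1, '20170204'), (1, '20170205'), (1, '20170207'),
        (2, '20170204'), (2, '20170201'), (2, '20170208');
""")

# The three examples from the question.
print(positions_in_both(conn, ("20170201", "20170202"), ("20170202", "20170203")))
print(positions_in_both(conn, ("20170204", "20170205"), ("20170205", "20170206")))
print(positions_in_both(conn, ("20170203", "20170204"), ("20170204", "20170205")))
```

Passing the boundaries as parameters avoids parsing the interval strings with SUBSTR, and an index on Status(pk_position, datetime) would likely help both subqueries.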
I'm looking for a SQL query (or even better a LINQ query) to remove people who have cancelled their leave, i.e. remove all records that have the same NAME, START and END and whose DAYS_TAKEN values differ only in sign.
How to get from this
NAME |DAYS_TAKEN |START |END |UNIQUE_LEAVE_ID
--------|-----------|-----------|-----------|-----------
Alice | 2 | 1 June | 3 June | 1 --remove because cancelled
Alice | -2 | 1 June | 3 June | 2 --cancelled
Alice | 3 | 5 June | 8 June | 3 --keep
Bob | 10 | 4 June | 14 June | 4 --keep
Charles | 12 | 2 June | 14 June | 5 --remove because cancelled
Charles | -12 | 2 June | 14 June | 6 --cancelled
David | 5 | 3 June | 8 June | 7 --keep
To this?
NAME |DAYS_TAKEN |START |END |UNIQUE_LEAVE_ID
--------|-----------|-----------|-----------|-----------
Alice | 3 | 5 June | 8 June | 3 --keep
Bob | 10 | 4 June | 14 June | 4 --keep
David | 5 | 3 June | 8 June | 7 --keep
What I've tried
Query1 to find all the cancelled records (not sure if this is correct)
SELECT L1.UNIQUE_LEAVE_ID
FROM LEAVE L1
INNER JOIN LEAVE L2 ON L2.DAYS_TAKEN > 0 AND ABS(L1.DAYS_TAKEN) = L2.DAYS_TAKEN AND L1.NAME= L2.NAME AND L1.START = L2.START AND L1.END = L2.END
WHERE L1.DAYS_TAKEN < 0
Then I use Query1 twice in an inner select like so
SELECT L.* FROM LEAVE L WHERE
L.UNIQUE_LEAVE_ID NOT IN (Query1)
AND L.UNIQUE_LEAVE_ID NOT IN (Query1)
Is there a way to use the inner query only once?
(It's an Oracle database, being called from .NET/C#)
You can use a query like the following:
SELECT NAME, START, END
FROM LEAVE
GROUP BY NAME, START, END
HAVING SUM(DAYS_TAKEN) = 0
in order to get NAME, START, END groups that have been cancelled (assuming DAYS_TAKEN of the cancellation record negates the days of the initial record).
Output:
NAME |START |END
--------|-----------|----------
Alice | 1 June | 3 June
Charles | 2 June | 14 June
Using the above query as a derived table, you can get the records that do not belong to any 'cancelled' group:
SELECT L1.NAME, L1.DAYS_TAKEN, L1.START, L1.END, L1.UNIQUE_LEAVE_ID
FROM LEAVE L1
LEFT JOIN (
SELECT NAME, START, END
FROM LEAVE
GROUP BY NAME, START, END
HAVING SUM(DAYS_TAKEN) = 0
) L2 ON L1.NAME = L2.NAME AND L1.START = L2.START AND L1.END = L2.END
WHERE L2.NAME IS NULL
Output:
NAME |DAYS_TAKEN |START |END |UNIQUE_LEAVE_ID
--------|-----------|-----------|-----------|-----------
Alice | 3 | 5 June | 8 June | 3
Bob | 10 | 4 June | 14 June | 4
David | 5 | 3 June | 8 June | 7
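The derived-table anti-join can be tried out quickly from Python with an in-memory SQLite database. This is only a sketch: the columns are renamed to start_date and end_date (hypothetical names, chosen to avoid the reserved-word problem), dates are stored as plain text, and SQLite stands in for Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE leave (name TEXT, days_taken INT, start_date TEXT,
                        end_date TEXT, unique_leave_id INT);
    INSERT INTO leave VALUES
        ('Alice',     2, '1 June', '3 June',  1),
        ('Alice',    -2, '1 June', '3 June',  2),
        ('Alice',     3, '5 June', '8 June',  3),
        ('Bob',      10, '4 June', '14 June', 4),
        ('Charles',  12, '2 June', '14 June', 5),
        ('Charles', -12, '2 June', '14 June', 6),
        ('David',     5, '3 June', '8 June',  7);
""")

# Groups whose days sum to zero are "cancelled"; keep rows with no such group.
kept = conn.execute("""
    SELECT l1.name, l1.days_taken, l1.unique_leave_id
    FROM leave l1
    LEFT JOIN (
        SELECT name, start_date, end_date
        FROM leave
        GROUP BY name, start_date, end_date
        HAVING SUM(days_taken) = 0
    ) l2 ON l1.name = l2.name
        AND l1.start_date = l2.start_date
        AND l1.end_date = l2.end_date
    WHERE l2.name IS NULL
    ORDER BY l1.unique_leave_id
""").fetchall()
print(kept)
```

The inner query runs only once, which addresses the asker's wish to avoid repeating the subquery.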
You can use not exists:
select l.*
from leave l
where not exists (select 1
from leave l2
where l2.name = l.name and l2.start = l.start and
l2.end = l.end and l2.days_taken = - l.days_taken
);
This query can take advantage of an index on leave(name, start, end, days_taken).
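To see how the NOT EXISTS form behaves when someone applies, cancels and applies again, here is a small sketch in Python with SQLite standing in for Oracle. The column names start_date and end_date, the index name and the three-row Alice scenario are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE leave (name TEXT, days_taken INT, start_date TEXT,
                        end_date TEXT, unique_leave_id INT);
    -- Index matching the lookup done by the correlated subquery.
    CREATE INDEX leave_lookup ON leave (name, start_date, end_date, days_taken);
    INSERT INTO leave VALUES
        ('Alice',  2, '1 June', '3 June',  1),  -- apply
        ('Alice', -2, '1 June', '3 June',  2),  -- cancel
        ('Alice',  2, '1 June', '3 June',  3),  -- apply again
        ('Bob',   10, '4 June', '14 June', 4);
""")

# Keep a row only if no row with the same key and the opposite sign exists.
remaining_ids = [row[0] for row in conn.execute("""
    SELECT l.unique_leave_id
    FROM leave l
    WHERE NOT EXISTS (SELECT 1
                      FROM leave l2
                      WHERE l2.name = l.name
                        AND l2.start_date = l.start_date
                        AND l2.end_date = l.end_date
                        AND l2.days_taken = -l.days_taken)
    ORDER BY l.unique_leave_id
""")]
print(remaining_ids)
```

Every Alice row has an opposite-signed partner, so all three are dropped and only Bob's row remains. If a re-application should survive, the rows need to be paired or counted rather than just matched by sign.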
Here is a variation with SUM() OVER:
SELECT x.*
FROM (SELECT l.*, SUM (days_taken) OVER (PARTITION BY name, "START", "END", ABS (days_taken) ORDER BY NULL) s
FROM leave l) x
WHERE s <> 0
And if you have Oracle 12, this gives you the cancelled records:
SELECT l.*
FROM leave l,
LATERAL (SELECT days_taken
FROM leave l2
WHERE l2.name = l.name
AND l2."START" = l."START"
AND l2."END" = l."END"
AND l2.days_taken = -l.days_taken) x
and this gives you what should remain:
SELECT l.*
FROM leave l
OUTER APPLY (SELECT days_taken
FROM leave l2
WHERE l2.name = l.name
AND l2."START" = l."START"
AND l2."END" = l."END"
AND l2.days_taken = -l.days_taken) x
WHERE x.days_taken IS NULL
A note about the column names: using reserved words as identifiers in Oracle SQL is not recommended, but if you must, enclose them in double quotes (") as in the queries above.
I used Giorgos's answer to come up with this LINQ solution. It also handles people who apply for and cancel their leave multiple times. See Alice and Edgar below.
Sample data
int id = 0;
List<Leave> allLeave = new List<Leave>()
{
new Leave() { UniqueLeaveID=id++, Name="Alice", Start=new DateTime(2016,6,1), End=new DateTime(2016,6,3), Taken=-2 },
new Leave() { UniqueLeaveID=id++,Name="Alice", Start=new DateTime(2016,6,1), End=new DateTime(2016,6,3), Taken=2 },
new Leave() { UniqueLeaveID=id++, Name="Alice", Start=new DateTime(2016,6,1), End=new DateTime(2016,6,3), Taken=2 },
new Leave() { UniqueLeaveID=id++,Name="Alice", Start=new DateTime(2016,6,3), End=new DateTime(2016,6,5), Taken=3 },
new Leave() { UniqueLeaveID=id++,Name="Bob", Start=new DateTime(2016,6,4), End=new DateTime(2016,6,14), Taken=10 },
new Leave() { UniqueLeaveID=id++,Name="Charles", Start=new DateTime(2016,6,2), End=new DateTime(2016,6,14), Taken=12 },
new Leave() { UniqueLeaveID=id++,Name="Charles", Start=new DateTime(2016,6,2), End=new DateTime(2016,6,14), Taken=-12 },
new Leave() { UniqueLeaveID=id++,Name="David", Start=new DateTime(2016,6,3), End=new DateTime(2016,6,8), Taken=5 },
new Leave() { UniqueLeaveID=id++,Name="Edgar", Start=new DateTime(2016,6,3), End=new DateTime(2016,6,8), Taken=5 },
new Leave() { UniqueLeaveID=id++,Name="Edgar", Start=new DateTime(2016,6,3), End=new DateTime(2016,6,8), Taken=5 },
new Leave() { UniqueLeaveID=id++,Name="Edgar", Start=new DateTime(2016,6,3), End=new DateTime(2016,6,8), Taken=5 },
new Leave() { UniqueLeaveID=id++,Name="Edgar", Start=new DateTime(2016,6,3), End=new DateTime(2016,6,8), Taken=5 }
};
Linq Query (watch out for Oracle version 11 vs 12)
var filteredLeave = allLeave
.GroupBy(a => new { a.Name, a.Start, a.End })
.Select(a => new { Group = a.OrderByDescending(b=>b.Taken), Count = a.Count() })
.Where(a => a.Count % 2 != 0)
.Select(a => a.Group.First());
"OrderByDescending" ensures only positive days taken are returned.
Oracle SQL
SELECT
*
FROM
(
SELECT
L1.NAME, L1.START, L1.END, MAX(TAKEN) AS TAKEN, COUNT(*) AS CNT
FROM LEAVE L1
GROUP BY L1.NAME, L1.START, L1.END
) L2
WHERE MOD(L2.CNT,2)<>0 -- replace MOD with % for Microsoft SQL
The condition "WHERE MOD(L2.CNT,2)<>0" (or in Linq "a.Count % 2 != 0") only returns people who applied once or an odd number of times (e.g. apply - cancel - apply). People who apply - cancel - apply - cancel are filtered out.
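The parity rule can be checked outside the database. The following Python sketch mirrors the group, count and pick-largest steps of the LINQ query on the same sample data; representing each record as a plain tuple is an assumption made to keep it short.

```python
from collections import defaultdict
from datetime import date

# (name, start, end, taken), same rows as the sample data above
all_leave = [
    ("Alice", date(2016, 6, 1), date(2016, 6, 3), -2),
    ("Alice", date(2016, 6, 1), date(2016, 6, 3), 2),
    ("Alice", date(2016, 6, 1), date(2016, 6, 3), 2),
    ("Alice", date(2016, 6, 3), date(2016, 6, 5), 3),
    ("Bob", date(2016, 6, 4), date(2016, 6, 14), 10),
    ("Charles", date(2016, 6, 2), date(2016, 6, 14), 12),
    ("Charles", date(2016, 6, 2), date(2016, 6, 14), -12),
    ("David", date(2016, 6, 3), date(2016, 6, 8), 5),
    ("Edgar", date(2016, 6, 3), date(2016, 6, 8), 5),
    ("Edgar", date(2016, 6, 3), date(2016, 6, 8), 5),
    ("Edgar", date(2016, 6, 3), date(2016, 6, 8), 5),
    ("Edgar", date(2016, 6, 3), date(2016, 6, 8), 5),
]

# Group the days taken by (name, start, end); dicts keep insertion order.
groups = defaultdict(list)
for name, start, end, taken in all_leave:
    groups[(name, start, end)].append(taken)

# Keep groups with an odd number of rows and report the largest value,
# which plays the role of OrderByDescending(...).First().
filtered = [
    (name, start, end, max(takens))
    for (name, start, end), takens in groups.items()
    if len(takens) % 2 != 0
]
for row in filtered:
    print(row)
```

Note that the rule counts rows regardless of sign, so Edgar's four positive entries are treated like apply - cancel - apply - cancel and dropped. That is worth keeping in mind if duplicate applications can occur in the real data.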
I have two tables, entry and entry_metadata, where entry_metadata is a description table for entry: each row references an entry by entry_id and stores one variable and its value.
Given this data:
entry
id | name |
-------------
1 | entry1 |
2 | entry2 |
3 | entry3 |
entry_metadata
id | entry_id | variable | value
1 | 1 | width | 10
2 | 1 | height | 5
3 | 2 | width | 8
4 | 2 | height | 7
5 | ... | .... | ..
I am getting this table:
id | name | width | height| ... | ...
-----------------------------------------
1 | entry1 | 10 | 5 |
2 | entry2 | 8 | 7 |
3 | entry3 | .. | .. |
with this SQL:
select e.id, e.name, em.value as width, emr.value as height
from
public.entry e
left join
public.entry_metadata em
on
em.entry_id = e.id and em.variable = 'width'
left join
public.entry_metadata emr
on
emr.entry_id = e.id and emr.variable = 'height'
The query above works, but as I add more variables (the entry_metadata table contains a large variety of them), it gets really slow: every join I add slows down execution considerably. Is there a way to get around this?
You can also do this with conditional aggregation:
select e.id, e.name,
max(case when em.variable = 'width' then em.value end) as width,
max(case when em.variable = 'height' then em.value end) as height
from public.entry e
left join public.entry_metadata em on em.entry_id = e.id
group by e.id, e.name;
Adding additional columns is just adding more aggregation functions.
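A quick way to confirm what conditional aggregation returns is to run it on the sample rows. The sketch below does that from Python with an in-memory SQLite database standing in for PostgreSQL (so the public schema prefix is dropped); the join to entry is included so that entries without metadata still appear.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entry (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE entry_metadata (id INTEGER PRIMARY KEY, entry_id INT,
                                 variable TEXT, value INT);
    INSERT INTO entry VALUES (1, 'entry1'), (2, 'entry2'), (3, 'entry3');
    INSERT INTO entry_metadata VALUES
        (1, 1, 'width', 10), (2, 1, 'height', 5),
        (3, 2, 'width', 8),  (4, 2, 'height', 7);
""")

# One pass over entry_metadata; each CASE picks out a single variable.
pivot = conn.execute("""
    SELECT e.id, e.name,
           MAX(CASE WHEN em.variable = 'width'  THEN em.value END) AS width,
           MAX(CASE WHEN em.variable = 'height' THEN em.value END) AS height
    FROM entry e
    LEFT JOIN entry_metadata em ON em.entry_id = e.id
    GROUP BY e.id, e.name
    ORDER BY e.id
""").fetchall()
print(pivot)
```

Only one join is needed however many variables are pivoted, which is the point of the approach; entry3 has no metadata, so its width and height come back as NULL (None in Python).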
Just use subselects for this:
SELECT
e.id,
e.name,
(SELECT em.value FROM public.entry_metadata em WHERE em.entry_id = e.id AND em.variable = 'width') AS width,
(SELECT em.value FROM public.entry_metadata em WHERE em.entry_id = e.id AND em.variable = 'height') AS height
FROM
public.entry e
So for each new variable you just need to add one more subselect.
Is there a way to get around this?
Yes: replace the entry_metadata table with an additional column in entry (possible types are hstore or jsonb) that stores the entry metadata as key-value pairs.
By the way, your tables follow a well-known and controversial database design pattern called "Entity Attribute Value".
I have a table like this (an example showing only part of it):
node_id | k | v
123 | addr:housenumber | 50
123 | addr:street | Kingsway
123 | addr:city | London
123 | (some other stuff) | .....
100 | addr:housenumber | 121
100 | addr:street | Edmund St
100 | addr:city | London
I want to find out, using one query, whether a given address exists in this table, e.g. London, Kingsway 50, and what its node_id is. How can I write such a query, and is it even possible? How should I approach this kind of problem?
pseudocode:
SELECT node_id WHERE (k == 'addr:city') == 'London' AND (k == 'addr:street') == 'Kingsway' AND (k == 'addr:housenumber') == '50' AND for all node_id the same
Schema for database: http://pastebin.com/Yigjt77f, my table node_tags.
You could use self-joins like this:
SELECT n1.node_id FROM node_tags n1
INNER JOIN node_tags n2 ON n1.node_id = n2.node_id
INNER JOIN node_tags n3 ON n1.node_id = n3.node_id
WHERE n1.k = 'addr:housenumber' AND n1.v = '50'
AND n2.k = 'addr:street' AND n2.v = 'Kingsway'
AND n3.k = 'addr:city' AND n3.v = 'London'
There might be better ways of doing this with PostgreSQL but I'm not that familiar with that DBMS.
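One alternative to the self-joins, sketched here as an assumption rather than as the recommended PostgreSQL idiom, is to filter for the wanted key/value pairs and keep the node_ids that match all of them. The example runs from Python against an in-memory SQLite copy of the sample rows; the wanted dictionary is a hypothetical way of passing the address.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node_tags (node_id INT, k TEXT, v TEXT);
    INSERT INTO node_tags VALUES
        (123, 'addr:housenumber', '50'),
        (123, 'addr:street', 'Kingsway'),
        (123, 'addr:city', 'London'),
        (100, 'addr:housenumber', '121'),
        (100, 'addr:street', 'Edmund St'),
        (100, 'addr:city', 'London');
""")

wanted = {"addr:city": "London", "addr:street": "Kingsway", "addr:housenumber": "50"}

# One "(k = ? AND v = ?)" test per wanted tag; values are bound as parameters.
conditions = " OR ".join("(k = ? AND v = ?)" for _ in wanted)
params = [part for pair in wanted.items() for part in pair]

sql = f"""
    SELECT node_id
    FROM node_tags
    WHERE {conditions}
    GROUP BY node_id
    HAVING COUNT(DISTINCT k) = ?
    ORDER BY node_id
"""
node_ids = [row[0] for row in conn.execute(sql, params + [len(wanted)])]
print(node_ids)
```

Node 100 matches only the city, so it is excluded. Adding another tag to the search means adding an entry to wanted rather than another join.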