Is there a programming language or package that supports table-based, reactive, declarative programming in memory, much like SQL with its trigger facility?
For example, I could define PERSON and JOB tables as functions:
name: PERSON -> STRING
female: PERSON -> BOOLEAN
mother: PERSON -> PERSON
father: PERSON -> PERSON
title: JOB -> STRING
company: JOB -> STRING
salary: JOB -> INTEGER
employee: JOB -> PERSON
Then I would like to calculate functions like:
childcount: PERSON -> INTEGER
childcount(P) = |{ Q in PERSON : father(Q) = P or mother(Q) = P }|
income: PERSON -> INTEGER
income(P) = SUM { salary(J) : J in JOB and employee(J) = P }
incomeperchild: PERSON -> INTEGER
incomeperchild(P) = income(P) / childcount(P)
parent: PERSON x PERSON -> BOOLEAN
parent(P,Q) = (P = father(Q)) or (P = mother(Q))
error: PERSON -> BOOLEAN
error(P) = (female(P) and (exists Q in PERSON)(father(Q) = P))
or (not female(P) and (exists Q in PERSON)(mother(Q) = P))
or (exists Q in PERSON)(parent(P,Q) and error(Q))
So essentially I would like to have calculated columns in tables that are automatically updated whenever values in the tables change. Similar things could be expressed with SQL triggers, but I would like to have such functionality built into a language and executed in memory. The propagation of changes needs to be optimized. Are there frameworks to do this?
The observer pattern and reactive programming focus on individual objects. But I do not want to maintain pointers and extra structure for each row in my tables, as there could be millions of rows. All the rules are generic (although they can refer to different rows via parent/children relations, etc.), so some form of recursion is required.
One way to approach this is via attribute grammars: http://www.haskell.org/haskellwiki/Attribute_grammar
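To make the requirement concrete, here is a minimal Python sketch of incrementally maintained derived columns. It is not an existing framework: the class `Tables` and its method names are hypothetical, and it only covers the non-recursive rules (childcount, income, incomeperchild). Each update adjusts the affected derived values by a delta instead of recomputing whole columns; the recursive error rule would need a real dependency-propagation scheme on top of this.

```python
from collections import defaultdict


class Tables:
    """Toy in-memory PERSON/JOB store with incrementally maintained columns."""

    def __init__(self):
        self.father = {}                     # person -> father (or None)
        self.mother = {}                     # person -> mother (or None)
        self.jobs = {}                       # job id -> (employee, salary)
        self.childcount = defaultdict(int)   # derived: person -> number of children
        self.income = defaultdict(int)       # derived: person -> sum of salaries

    def add_person(self, person, father=None, mother=None):
        self.father[person] = father
        self.mother[person] = mother
        # Only the (at most two) parents are touched, not the whole table.
        for parent in {father, mother} - {None}:
            self.childcount[parent] += 1

    def set_job(self, job, employee, salary):
        # Retract the old contribution before applying the new one.
        if job in self.jobs:
            old_employee, old_salary = self.jobs[job]
            self.income[old_employee] -= old_salary
        self.jobs[job] = (employee, salary)
        self.income[employee] += salary

    def income_per_child(self, person):
        # Computed on demand from the two maintained columns.
        count = self.childcount[person]
        return self.income[person] / count if count else None
```

The derived values live in plain dictionaries keyed by row, so no per-row observer objects are needed; the cost of an update is proportional to the number of rows the change actually affects.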
I have a scenario where I have to iterate through multiple tables in a fairly large SQLite database. The tables store information about planets' positions in the sky over the years, so for Mars, for example, I have tables Mars_2000, Mars_2001 and so on. The table structure is always the same:
|id:INTEGER|date:TEXT|longitude:REAL|
The problem is that for certain tasks I need to iterate through these tables, which takes a long time (with more than 10 queries it becomes painful).
I suppose that if I merge all the per-year tables into one big table, performance might improve, since one query against one big table should be better than 50 queries against smaller tables. I wanted to make sure this would work, as the database is huge (around 20 GB) and reshaping it would take a while.
Is the plan I just described viable? Is there any other solution for such a case?
It might be helpful, so I attach the function that produces my SQL query, which is unique for each table:
pub fn transition_query(
select_param: &str, // usually an asterisk
table_name: &str, // table I'd like to query
birth_degree: &f64, // constant number
wanted_degree: &f64, // another constant number
orb: &f64, // another constant number
upper_date_limit: DateTime<Utc>, // casts to SQL-like string
lower_date_limit: DateTime<Utc>, // casts to SQL-like string
) -> String {
let parsed_upper_date_limit = CelestialBodyPosition::parse_date(upper_date_limit);
let parsed_lower_date_limit = CelestialBodyPosition::parse_date(lower_date_limit);
return format!("
SELECT *,(SECOND_LAG>60 OR SECOND_LAG IS NULL) AS TRANSIT_START, (SECOND_LEAD > 60 OR SECOND_LEAD IS NULL) AS TRANSIT_END, time FROM (
SELECT
*,
UNIX_TIME - LAG(UNIX_TIME,1) OVER (ORDER BY time) as SECOND_LAG,
LEAD(UNIX_TIME,1) OVER (ORDER BY time) - UNIX_TIME as SECOND_LEAD FROM (
SELECT {select_param},
DATE(time) as day_scoped_date,
CAST(strftime('%s', time) AS INT) AS UNIX_TIME,
longitude
FROM {table_name}
WHERE ((-{orb} <= abs(realModulo(longitude -{birth_degree} -{wanted_degree},360))
AND abs(realModulo(longitude -{birth_degree} -{wanted_degree},360)) <= {orb})
OR
(-{orb} <= abs(realModulo(longitude -{birth_degree} +{wanted_degree},360))
AND abs(realModulo(longitude -{birth_degree} +{wanted_degree},360)) <= {orb}))
AND time < '{parsed_upper_date_limit}' AND time > '{parsed_lower_date_limit}'
)
) WHERE (TRANSIT_START AND NOT TRANSIT_END) OR (TRANSIT_END AND NOT TRANSIT_START) ;
");
}
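The merge proposed in the question could be tried on a small copy of the database first. The following is a rough Python sketch using the standard sqlite3 module, not a measured recommendation: the function name `merge_year_tables` and the target table name `<body>_all` are hypothetical, and the column names are taken from the table structure given above (the query itself refers to a `time` column, so adjust as needed). Whether the merged table is actually faster depends on the index being usable for the date-range filter.

```python
import sqlite3


def merge_year_tables(conn, body, years):
    """Copy <body>_<year> tables into one indexed table named <body>_all."""
    target = f"{body}_all"  # hypothetical naming scheme
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {target} (date TEXT, longitude REAL)"
    )
    for year in years:
        # Table names cannot be bound as parameters, hence the f-string;
        # only use this with trusted, known table names.
        conn.execute(
            f"INSERT INTO {target} (date, longitude) "
            f"SELECT date, longitude FROM {body}_{year}"
        )
    # An index on the date column lets range filters avoid a full scan.
    conn.execute(
        f"CREATE INDEX IF NOT EXISTS idx_{target}_date ON {target} (date)"
    )
    conn.commit()
    return target
```

Usage would be along the lines of `merge_year_tables(conn, "Mars", range(2000, 2050))`, after which a single query against `Mars_all` replaces the per-year loop.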
I solved the issue programmatically. The whole thing was done with Rust and the r2d2_sqlite library. I'm still running a lot of queries, but now they run in threads, which reduced execution time from 25 s to around 3 s. Here's the code:
use std::sync::mpsc;
use std::thread;
use r2d2_sqlite::SqliteConnectionManager;
use r2d2;
use rusqlite::params; // the params! macro used below comes from rusqlite

let manager = SqliteConnectionManager::file("db_path");
let pool = r2d2::Pool::builder().build(manager).unwrap();
let mut result: Vec<CelestialBodyPosition> = vec![]; // vector of structs
let (tx, rx) = mpsc::channel(); // allows asynchronous communication
let mut children = vec![]; // vector of join handles (not sure if needed at all)
for query in queries {
    let pool = pool.clone(); // each iteration clones the handle to the connection pool
    let inner_tx = tx.clone(); // and the sender, as each thread needs its own
    children.push(thread::spawn(move || {
        let conn = pool.get().unwrap();
        add_real_modulo_function(&conn); // this adds the custom SQLite function I needed
        let mut sql = conn.prepare(&query).unwrap();
        // this runs the query and maps the result to my internal type
        let positions: Vec<CelestialBodyPosition> = sql
            .query_map(params![], |row| {
                Ok(CelestialBodyPosition::new(row.get(1)?, row.get(2)?))
            })
            .unwrap()
            .map(|position| position.unwrap())
            .collect();
        // this sends the partial result to the receiver
        return inner_tx.send(positions).unwrap();
    }));
}
// the original sender has to be dropped, otherwise the program will wait for its input
drop(tx);
for received in rx {
    result.extend(received); // combine all results
}
return result;
As you can see, no optimization happened on the SQLite side, which kind of makes me feel I'm doing something wrong, but for now it's all right. It might be good to exert more control over the number of spawned threads.
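One way to cap the number of threads is a fixed-size worker pool instead of one thread per query. The sketch below shows the idea in Python rather than Rust, using the standard library's ThreadPoolExecutor; the function name `run_queries` and the default of four workers are assumptions for illustration, not part of the solution above. Each task opens its own connection, mirroring the one-connection-per-thread approach of the Rust code.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def run_queries(db_path, queries, max_workers=4):
    """Run each query on a bounded pool of threads and combine the rows."""

    def run_one(query):
        # One connection per task, so no connection is shared between threads.
        conn = sqlite3.connect(db_path)
        try:
            return conn.execute(query).fetchall()
        finally:
            conn.close()

    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the order of the input queries.
        for rows in pool.map(run_one, queries):
            results.extend(rows)
    return results
```

With this shape, at most `max_workers` queries run at once regardless of how many tables are involved, which keeps thread and connection counts predictable.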
Suppose I have 3 hypothetical models:
from django.db import models

class State(models.Model):
    name = models.CharField(max_length=20)

class Company(models.Model):
    name = models.CharField(max_length=60)
    # on_delete is required since Django 2.0; CASCADE is an assumption here
    state = models.ForeignKey(State, on_delete=models.CASCADE)

class Person(models.Model):
    name = models.CharField(max_length=60)
    state = models.ForeignKey(State, on_delete=models.CASCADE)
I want to be able to return results in a Django app, where the results, if using SQL directly, would be based on a query such as this:
SELECT a.name as 'personName',b.name as 'companyName', b.state as 'State'
FROM Person a, Company b
WHERE a.state=b.state
I have tried using the select_related() method as suggested here, but I don't think this is quite what I am after, since I am trying to join two tables that have a common foreign-key, but have no key-relationships amongst themselves.
Any suggestions?
Since a Person can have multiple Company objects in the same state, it is not a good idea to do the JOIN at the database level: the database would (likely) return the same Company multiple times, making the output quite large.
We can prefetch the related companies with:
qs = Person.objects.select_related('state').prefetch_related('state__company_set')
Then we can query the Company objects in the same state with:
for person in qs:
    print(person.state.company_set.all())
You can use a Prefetch object [Django-doc] to prefetch the list of related companies into a custom attribute. Note that to_attr stores the list on the State reached through the lookup, not on the Person itself:
from django.db.models import Prefetch
qs = Person.objects.prefetch_related(
    Prefetch('state__company_set', Company.objects.all(), to_attr='same_state_companies')
)
Then you can print the companies with:
for person in qs:
    print(person.state.same_state_companies)
This is my Z schema for Appointment DB.
|--AppointmentDB----------------
|attendees : P Person /** those involved in the appointment **/
|
|/** a new TYPE object to store attendees, schedule and purpose **/
|appointments : P APPOINTMENT
|hasAppointment : Person <-> APPOINTMENT
|schedule : APPOINTMENT -> DateTime
|purpose : APPOINTMENT -> Report
|
|/** a forward relation compositions to relate attendees with purpose and schedule **/
|attendeePurpose : hasAppointment;purpose
|attendeeSchedule : hasAppointment;schedule
|-----------------------------
|attendees ⊆ dom(hasAppointment)
|attendees ⊆ dom(attendeePurpose)
|appointments ⊆ ran(hasAppointment)
|-----------------------------
I would like to create a search function that finds an appointment based on the name of the attendees.
I want the search function to return all the details of the appointment object.
How do I design it?
Here is my take:
|--FindAppointment---------------------------------------------------
|ΞAppointmentDB
|attendees? : Person
|appointmentAttendees! : P Person
|appointmentPurpose! : Report
|appointmentSchedule! : DateTime
|-----------------------------
|/** if name of any attendees is given, then it must exist in appointments' domain
|respectively before this function can run**/
|attendees? ∈ dom(attendees)
|
|/** return the set of attendees of the same APPOINTMENT using attendees? as input **/
|appointmentAttendees! = hasAppointment~(|{attendees?}|)
|
|/** Get the image of both forward relational compositions according to set of
|attendees?**/
|appointmentPurpose! = attendeePurpose(|{attendees?}|)
|appointmentSchedule! = attendeeSchedule(|{attendees?}|)
|----------------------------------------------------------------------
Have you type checked your specification?
Your declaration subject? : P Person states that subject? is a set of persons, but subject? : dom(attendees) implies that subject? is a single person.
If you want to have either none or one person given you could introduce a datatype analogous to the Maybe monad in functional programming languages (or null values in other programming languages):
MaybePerson ::= NoPerson | JustPerson <<Person>>
Then you can declare an input like
subject? : MaybePerson
Then I would suggest restricting the possible solutions for one input:
subject? : ran(JustPerson) => schedule! : schedule(|{ JustPerson~ subject? }|)
If subject? is a set of persons you can achieve the same with:
subject? /= {} => schedule! : schedule(|subject?|)
And then just do the same for the other possible input. You can also add a condition that the two entries must not both be NoPerson (or, respectively, that the two input sets must not both be empty).
I'm using Slick version 2.0.0-M3. If I have two Query objects representing relations of the same type, I see there is a union operator to inclusively disjoin them, but I don't see a comparable operator for obtaining their intersection or their difference. Do such operators not exist in Slick?
I think the foregoing explains what I'm looking for, but if not, here's an example. I have the suppliers table:
case class Supplier(snum: String, sname: String, status: Int, city: String)
class Suppliers(tag: Tag) extends Table[Supplier](tag, "suppliers") {
def snum = column[String]("snum")
def sname = column[String]("sname")
def status = column[Int]("status")
def city = column[String]("city")
def * = (snum, sname, status, city) <> (Supplier.tupled, Supplier.unapply _)
}
val suppliers = TableQuery[Suppliers]
If I want to know about suppliers that either are in a particular city or have a particular status, I see how to use Query.union for that:
scala> val thirtySuppliers = suppliers.filter(_.status === 30)
thirtySuppliers: scala.slick.lifted.Query[Suppliers,Suppliers#TableElementType] = scala.slick.lifted.WrappingQuery@166f63a
scala> val londonSuppliers = suppliers.filter(_.city === "London")
londonSuppliers: scala.slick.lifted.Query[Suppliers,Suppliers#TableElementType] = scala.slick.lifted.WrappingQuery@1bea855
scala> (thirtySuppliers union londonSuppliers).foreach(println)
Supplier(S1,Smith,20,London)
Supplier(S4,Clark,20,London)
Supplier(S3,Blake,30,Paris)
Supplier(S5,Adams,30,Athens)
No problem. But what if I want only the suppliers that are both in a particular city and have a particular status? Seems as if I ought to be able to do something like:
(thirtySuppliers intersect londonSuppliers).foreach(println)
Or if I want the suppliers in a particular city except the ones that have a particular status. Can I not do something like:
(londonSuppliers except thirtySuppliers).foreach(println)
SQL has UNION, INTERSECT, and EXCEPT operations, and Slick's Query class has a union method that builds an SQL query using SQL's UNION, but I'm not seeing Query methods in Slick for deriving intersections nor differences. Am I missing them?
There is a pull request that implements this. It will likely make it into 2.0 or 2.1. https://github.com/slick/slick/pull/242 We still need to figure out some details and clean up a bit.
The operations are pretty much composable, in that an intersect on the same table can just be two filters. For instance:
val intersect = suppliers.filter(_.status === 30).filter(_.city === "London")
or except:
val except = suppliers.filter(_.city === "London").filterNot(_.status === 30)
I have a class Org, which has ParentId (which points to a Consumer) and Orgs properties, to enable a hierarchy of Org instances. I also have a class Customer, which has an OrgId property. Given any Org instance, named Owner, how can I retrieve all Customer instances for that org? That is, before LINQ I would do a 'manual' traversal of the Org tree with Owner as its root. I'm sure something simpler exists though.
Example: If I have a root-level Org called 'Film', with Id '1', and a sub-Org called 'Horror' with a ParentId of '1' and an Id of 23, I want to query for all Customers under Film, so I must get all customers with OrgIds of both 1 and 23.
Linq won't help you with this but SQL Server will.
Create a CTE to generate a flattened list of Org Ids, something like:
CREATE PROCEDURE [dbo].[OrganizationIds]
    @rootId int
AS
    WITH OrgCte AS
    (
        SELECT OrganizationId FROM Organizations WHERE OrganizationId = @rootId
        UNION ALL
        SELECT parent.OrganizationId FROM Organizations parent
        INNER JOIN OrgCte child ON parent.Parent_OrganizationId = child.OrganizationId
    )
    SELECT * FROM OrgCte
RETURN 0
Now add a function import to your context mapped to this stored procedure. This results in a method on your context (the returned values are nullable int since the original Parent_OrganizationId is declared as INT NULL):
public partial class TestEntities : ObjectContext
{
public ObjectResult<int?> OrganizationIds(int? rootId)
{
...
Now you can use a query like this:
// get all org ids for specific root. This needs to be a separate
// query or LtoE throws an exception regarding nullable int.
var ids = OrganizationIds(2);
// now find all customers
Customers.Where (c => ids.Contains(c.Organization.OrganizationId)).Dump();
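The recursive-CTE idea is not specific to SQL Server and can be tried out in isolation. Below is a small sketch using Python's sqlite3 module; the table and column names (`org`, `parent_id`, `customer`, `org_id`) and the function name `customers_under` are hypothetical stand-ins for the schema in the question, and the whole lookup is done in a single query rather than through a stored procedure.

```python
import sqlite3


def customers_under(conn, root_id):
    """Return names of customers whose org is root_id or any of its descendants."""
    sql = """
    WITH RECURSIVE subtree(id) AS (
        -- anchor: the root org itself
        SELECT id FROM org WHERE id = ?
        UNION ALL
        -- recursive step: orgs whose parent is already in the subtree
        SELECT o.id FROM org o JOIN subtree s ON o.parent_id = s.id
    )
    SELECT c.name
    FROM customer c
    WHERE c.org_id IN (SELECT id FROM subtree)
    ORDER BY c.name
    """
    return [row[0] for row in conn.execute(sql, (root_id,))]
```

For the 'Film' (Id 1) and 'Horror' (Id 23, parent 1) example from the question, calling `customers_under(conn, 1)` would collect the customers attached to either org.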
Unfortunately, not natively in Entity Framework. You need to build your own solution. Probably you need to iterate up to the root. You can optimize this algorithm by asking EF to get a certain number of parents in one go like this:
...
select new { x.Customer, x.Parent.Customer, x.Parent.Parent.Customer }
You are limited to a statically fixed number of parents with this approach (here: 3), but it will save you 2/3 of the database roundtrips.
Edit: I think I did not get your data model right but I hope the idea is clear.
Edit 2: In response to your comment and edit I have adapted the approach like this:
var rootOrg = ...;
var orgLevels = new [] {
    (from o in db.Orgs where o == rootOrg select o), //level 0
    (from o in db.Orgs where o.ParentOrg == rootOrg select o), //level 1
    (from o in db.Orgs where o.ParentOrg.ParentOrg == rootOrg select o), //level 2
    (from o in db.Orgs where o.ParentOrg.ParentOrg.ParentOrg == rootOrg select o), //level 3
};
var setOfAllOrgsInSubtree = orgLevels.Aggregate((a, b) => a.Union(b)); //query for all org levels
var customers = from c in db.Customers where setOfAllOrgsInSubtree.Contains(c.Org) select c;
Notice that this only works for a bounded maximum tree depth. In practice, this is usually the case (like 10 or 20).
Performance will not be great but it is a LINQ-to-Entities-only solution.