Zend JOIN clause return as array in the object? - sql

I am currently using the following code:
$select = $this->select()
->setIntegrityCheck(false)
->from(array('st' => $this->_name))
->join(array('sp' => 'staff_permissions'), 'sp.staff_id = st.id and sp.pool_id = ' . $pool_id )
->join(array('p' => 'permissions'), 'p.id = sp.permission_id')
->where('staff_id = ?', $staff_id);
return $this->fetchAll($select)->toArray();
It combines three tables and returns the result. The 'st' table corresponds to one staff (so one row), and the other two tables correspond to multiple rows. So what I was hoping was to get a single object back such that the other two tables are arrays inside the object.
So as an example, I get back $row, so that $row->first_name is the name, but $row->permission_id is an array with all the ids in it.
Can that be done using the JOIN clause?

This query belongs in the lowest layer of your application. The next layer up the stack is your mappers. In the mapper layer you map the flat result to your entities: a 'staff' entity (object) that contains a collection for 'staff_permissions' and a collection for 'permissions'.
model diagram:
-----------
| service |   // business logic
-----------
     |
-----------
| mapper  |   // maps external data to internal entities (or vice versa)
-----------
     |
-----------      ----------------------
|dao (sql)|  ->  | zend table gateway |
-----------      ----------------------
mapper example code:
$staffEntity = new StaffEntity();
$staffEntity->setName($response['name']);
// $response is assumed to hold the staff data with its permissions already grouped
foreach ($response['staff_permissions'] as $permissionData) {
    // use a separate variable so the source row is not overwritten by the new object
    $permission = new Permission();
    $permission->setName($permissionData['name']);
    $permission->setRule($permissionData['rule']);
    // ... etc ...
    $staffEntity->addPermission($permission);
}
// ... same for permissions ...
return $staffEntity;
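A JOIN always returns one flat row per matching permission, so the nesting has to happen in code after the fetch. Here is a minimal sketch of that grouping step, written in Python for illustration. The function name group_staff_rows, the key permission_ids and the sample rows are assumptions, not part of Zend or of the original code.

```python
def group_staff_rows(rows):
    """Collapse flat JOIN rows (one per permission) into one dict per staff member."""
    staff = {}
    for row in rows:
        # Create the staff entry the first time its id is seen.
        if row["id"] not in staff:
            staff[row["id"]] = {
                "id": row["id"],
                "first_name": row["first_name"],
                "permission_ids": [],
            }
        # Every joined row contributes one permission id.
        staff[row["id"]]["permission_ids"].append(row["permission_id"])
    return list(staff.values())


# Hypothetical flat result, shaped like fetchAll(...)->toArray() output.
rows = [
    {"id": 7, "first_name": "Ann", "permission_id": 1},
    {"id": 7, "first_name": "Ann", "permission_id": 4},
]
result = group_staff_rows(rows)
```

The same loop translates directly to PHP inside the mapper: iterate over the array returned by toArray() and build the staff entity as you go.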

Related

Keep a relation map in Objection.js while removing the table

I'm developing a reddit-like site where votes are stored per-user (instead of per-post). Here's my relevant schema:
content
id | author_id | title | text
---|-----------|-------------|---
1 | 1 (adam) | First Post | This is a test post by adam
vote: All the votes ever voted by anyone on any post
id | voter_id | content_id | category_id
---|-------------|------------------|------------
1 | 1 (adam) | 1 ("First Post") | 1 (upvote)
2 | 2 (bob) | 1 ("First Post") | 1 (upvote)
vote_count: Current tally ("count") of total votes received by a post by all users
id | content_id | category_id | count
---|------------------|--------------|-------
1 | 1 ("First Post") | 1 (upvote) | 2
I've defined a voteCount relation in Objection.js model for the content table:
class Content extends Model {
static tableName = 'content';
static relationMappings = {
voteCount: {
relation: Model.HasManyRelation,
modelClass: VoteCount,
join: {
from: 'content.id',
to: 'vote_count.content_id'
}
}
}
}
But I recently (learned and) decided that I don't need to keep (and update) a separate vote_count table, when in fact I can just query the vote table and essentially get the same table as a result:
SELECT content_id
, category_id
, COUNT(*) AS count
FROM vote
GROUP
BY content_id
, category_id
So now I wanna get rid of the vote_count table entirely.
But it seems that would break my voteCount relation, since there would no longer be a VoteCount model either (not shown here, but it's the model corresponding to the vote_count table). (Right?)
How do I keep voteCount relation while getting rid of vote_count table (and thus VoteCount model with it)?
Is there a way to somehow specify in the relation that instead of looking at a concrete table, it should look at the result of a query? Or is it possible to define a model class for the same?
My underlying database is PostgreSQL, if that helps.
Thanks to @Belayer. Views were exactly the solution to this problem.
Objection.js supports using views (instead of table) in a Model class, so all I had to do was create a view based on the above query.
I'm also using Knex's migration strategy to create/version my database, and although it doesn't (yet) support creating views out of the box, I found you can just use raw queries:
module.exports.up = async function(knex) {
await knex.raw(`
CREATE OR REPLACE VIEW "vote_count" AS (
SELECT content_id
, category_id
, COUNT(*) AS count
FROM vote
GROUP
BY content_id
, category_id
)
`);
};
module.exports.down = async function(knex) {
await knex.raw('DROP VIEW "vote_count";');
};
The above migration step replaces my vote_count table with the equivalent view. The Objection.js Model class for it (VoteCount) worked as usual without needing any change, and so did the voteCount relation on the Content class.
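To illustrate why the model keeps working: a view is queried by name exactly like a table, and its rows are recomputed from vote on every read. Here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for PostgreSQL here, and the helper vote_counts and the sample votes are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vote (
        id INTEGER PRIMARY KEY,
        voter_id INTEGER,
        content_id INTEGER,
        category_id INTEGER
    );
    INSERT INTO vote (voter_id, content_id, category_id)
    VALUES (1, 1, 1), (2, 1, 1);

    -- Same aggregate as in the question, exposed under the old table name.
    CREATE VIEW vote_count AS
        SELECT content_id, category_id, COUNT(*) AS count
        FROM vote
        GROUP BY content_id, category_id;
""")


def vote_counts(content_id):
    """Read tallies from the view as if it were a regular table."""
    return conn.execute(
        "SELECT * FROM vote_count WHERE content_id = ?", (content_id,)
    ).fetchall()


before = vote_counts(1)
conn.execute("INSERT INTO vote (voter_id, content_id, category_id) VALUES (3, 1, 1)")
after = vote_counts(1)  # the tally follows the vote table; nothing else to update
```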

Spark SQL computing rows it shouldn't

I have a DataFrame loaded from a parquet file with many columns. Two of them matter here: an array of user identifiers and the state the user has visited. The user identifier column is stored as an array of arrays (a WrappedArray, since this is Spark), where every sub-array has the identifier type as its first element and the value as its second. For example, a user named Jon Smith with ID 1045 is stored as: WrappedArray(WrappedArray("name","Jon Smith"), WrappedArray("id","1045")) (the sub-arrays are arrays of Strings).
So the table looks like this:
uid | state
---------------------------------------------------------------------------------------
WrappedArray(WrappedArray("name","Jon Smith"), WrappedArray("id","1045")) | TX
WrappedArray(WrappedArray("name","Jon Smith"), WrappedArray("id","1045")) | ND
WrappedArray(WrappedArray("name","Jane Katz"), WrappedArray("id","1056")) | IO
and so on. I want a table with the ID of each user and the number of states he's been to:
id | states
--------------------
1045 | 2
1056 | 1
So for this I've created a new UDF that parses the ID from the uid array, which looks like this:
import scala.collection.mutable
def IDfromUID(uid: mutable.WrappedArray[mutable.WrappedArray[String]]): String = {
val ID = uid.filter(_(0) == "id")
ID.length match {
case 0 => null
case _ => ID(0)(1) }
}
and I use it in the following query:
sqlContext.udf.register("IDfromUID", IDfromUID _)
df.registerTempTable("RawData")
sqlContext.sql("with Data as (select IDfromUID(uid) as id, state from RawData where uid is not null) select id, count(state) as states from Data where id not null group by id")
Yet, despite the fact that I mention where uid is not null, I still get NullPointerException coming from IDfromUID. It stops only when I change the UDF to be:
import scala.collection.mutable
def IDfromUID(uid: mutable.WrappedArray[mutable.WrappedArray[String]]): String = {
if (uid==null) null
else {
val ID = uid.filter(_(0) == "id")
ID.length match {
case 0 => null
case _ => ID(0)(1) } }
}
which leaves me with the question: why does Spark try to compute rows of data it has been strictly told not to?
I use Spark 1.6.2, and there are multiple queries running in parallel using the same UDF.
I don't know if this answer/note will have any value for you, but I noticed a few things in your query. First, in the following CTE you already filter out NULL id values:
WITH Data AS
(
SELECT IDfromUID(uid) AS id, state
FROM RawData
WHERE uid IS NOT NULL
)
I also noticed that there seemed to be a typo in your query:
select id, count(state) as states from Data where id id not null group by id")
The typo being where id id not null. I think you can remove this WHERE clause entirely:
SELECT id, count(state) AS states
FROM Data
GROUP BY id
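On the original question of why the UDF still sees nulls: Spark SQL does not guarantee the order in which WHERE predicates and other expressions are evaluated, so the optimizer may invoke a UDF on rows that a filter would otherwise discard. The usual defence is the one the question arrived at, a null check inside the UDF itself. Here is a minimal sketch of that guard, written as a plain Python function for illustration. The name id_from_uid and the list-of-pairs input are assumptions, and the function is not registered with Spark here.

```python
def id_from_uid(uid):
    """Return the value paired with the "id" key, or None if there is none."""
    # Guard first: the engine may pass NULL even though the query filters it out.
    if uid is None:
        return None
    for pair in uid:
        if pair[0] == "id":
            return pair[1]
    return None


jon = [["name", "Jon Smith"], ["id", "1045"]]
jon_id = id_from_uid(jon)
```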

SQL / (Django) : efficient database schema for translations

Situation
I am trying to set up a database schema to store translations between different languages. So far it looks like this (simplified):
class Language(models.Model):
    tag = models.CharField(max_length=2)

    def __unicode__(self):
        return self.tag

class Phrase(models.Model):
    name = models.TextField()
    language = models.ForeignKey(Language)

    def __unicode__(self):
        return self.name

    class Meta:
        unique_together = ("name", "language")
        index_together = [
            ["name", "language"]
        ]

class Translation(models.Model):
    phrase1 = models.ForeignKey(Phrase, related_name="translation_as_1")
    phrase2 = models.ForeignKey(Phrase, related_name="translation_as_2")

    def __unicode__(self):
        return self.phrase1.name + " <=> " + self.phrase2.name

    class Meta:
        unique_together = ("phrase1", "phrase2")
        index_together = [
            ["phrase1", "phrase2"]
        ]
This database schema seems logical to me. I store phrases in different languages and then have translations that contain exactly two phrases.
Problem
The problem is that the queries resulting from this schema look kind of nasty. For instance:
from django.db.models import Q

name = "my phrase"
# the phrase may be stored on either side of the translation
translations = Translation.objects.filter(Q(phrase1__name=name) | Q(phrase2__name=name))
translated_names = []
for translation in translations:
    name1 = translation.phrase1.name
    name2 = translation.phrase2.name
    # keep whichever side is not the phrase we searched for
    if name1 == name:
        translated_names.append(name2)
    else:
        translated_names.append(name1)
I always have to include the "OR" condition to make sure that I get all possible translations, since the phrase could be stored as phrase1 or phrase2. On top of that, I have to filter my result afterwards to get the correct translated_name (the for loop).
Further Explanation
Before I switched to the described schema, I had the following schema instead (Phrase and Language are the same as before):
class Translation(models.Model):
    phrase = models.ForeignKey(Phrase)
    name = models.TextField()

    def __unicode__(self):
        return self.phrase.name + " => " + self.name

    class Meta:
        unique_together = ("phrase", "name")
        index_together = [
            ["phrase", "name"]
        ]
This schema let me make queries like this:
name = "my phrase"
translations = Translation.objects.filter(phrase__name=name)
translated_names = [t.name for t in translations]
This looks much nicer, and is of course faster. But this schema had the disadvantage that it represents translations in only one direction, so I moved to the other one. That isn't quite what I want either, because the queries are too slow and too complicated.
Question
So is there a good schema for this kind of problem that I may be overlooking?
Remark
I'm not only interested in Django related answers. A pure SQL schema for this kind of problem would also be interesting for me.
This is the way that I have done it in the past. Adapt it for your naming convention.
Suppose that I had a table with a name and other columns in it like this
Table TR_CLT_clothing_type
clt_id | clt_name | other columns ....
--------------------------------------
1 | T Shirt ...
2 | Pants ...
Now if I decided that it needs translations, first I make a languages table
Table TR_LNG_language
lng_id | lng_name | lng_display
-------------------------------
1 | English | English (NZ)
2 | German | Deutsch
I also need to store the current language in the database (you will see why soon). It will only have one row
Table TA_INF_info
inf_current_lng
---------------
1
Then I drop the clt_name column from my clothing table TR_CLT_clothing_type. Instead I make relation table.
Table TL_CLT_clothing_type
clt_id | lng_id | clt_name
--------------------------
1 | 1 | T Shirt
1 | 2 | (German for T-Shirt)
2 | 1 | Pants
2 | 2 | keuchen (thank you google translate)
Now to get the name, you want to make a stored procedure for it. I have not attempted this in ORM.
CREATE PROCEDURE PS_CLT
    @clt_id int
AS
SELECT lng.clt_name, clt.*
FROM TR_CLT_clothing_type clt
JOIN TL_CLT_clothing_type lng
    ON lng.clt_id = clt.clt_id
WHERE clt.clt_id = @clt_id AND
    lng.lng_id IN (SELECT inf_current_lng FROM TA_INF_info)
This stored proc will return the name in the current language and all other columns for the specified clothing type. To set the language, update inf_current_lng in the TA_INF_info table.
Disclaimer: I don't have anything to check the syntax of what I have typed but it should hopefully be straightforward.
-- EDIT
There was a concern to be able to do "give me all translations for word X in language Y to language Z"
There is a "not so elegant" way to do this with the schema. You can do something like
for each table in database like "TL_%"
    SELECT name
    FROM table
    WHERE id IN ( SELECT id
                  FROM table
                  WHERE name = @name
                  AND lng_id = german
                )
    AND lng_id = english
Now I would imagine that this would require some auto-generated SQL code but I could pull it off.
I have no idea how you would do this in ORM
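For a single translation table, the "word X in language Y to language Z" lookup from the edit above can be tried out directly. Here is a minimal sketch using Python's built-in sqlite3 module. The function translate is an assumption, and the German strings are sample values chosen for this example rather than the rows shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TL_CLT_clothing_type (
        clt_id INTEGER,
        lng_id INTEGER,
        clt_name TEXT,
        PRIMARY KEY (clt_id, lng_id)
    );
    -- lng_id 1 = English, 2 = German (sample data for illustration)
    INSERT INTO TL_CLT_clothing_type VALUES
        (1, 1, 'T Shirt'), (1, 2, 'T-Shirt'),
        (2, 1, 'Pants'),   (2, 2, 'Hose');
""")


def translate(name, from_lng, to_lng):
    """Find rows sharing a clt_id with `name` in from_lng, then read them in to_lng."""
    rows = conn.execute(
        """
        SELECT clt_name
        FROM TL_CLT_clothing_type
        WHERE lng_id = ?
          AND clt_id IN (SELECT clt_id
                         FROM TL_CLT_clothing_type
                         WHERE clt_name = ? AND lng_id = ?)
        """,
        (to_lng, name, from_lng),
    ).fetchall()
    return [row[0] for row in rows]


german_to_english = translate("Hose", 2, 1)
english_to_german = translate("Pants", 1, 2)
```

Because both directions go through the shared clt_id, the lookup is symmetric and needs no OR over two phrase columns.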

NHibernate stateless session - what is the data aliasing effect?

The NHibernate documentation for the stateless session interface states, among others:
Stateless sessions are vulnerable to data aliasing effects, due to the
lack of a first-level cache.
I couldn't find an explanation for this. What does 'data aliasing effects' mean?
If you could give examples... that'd be great.
Consider the following example:
table Orders
id | customer_id | quantity
---------------------------
1 | 1 | 5
2 | 1 | 20
// with a stateless session: each order gets its own Customer instance
var orders = statelessSession.Query<Order>().ToList();
orders[0].Customer.HasDiscount = true;
Assert.False(orders[0].Customer == orders[1].Customer);
Assert.False(orders[1].Customer.HasDiscount);
// while with a regular session: both orders share one Customer instance
var orders = session.Query<Order>().ToList();
orders[0].Customer.HasDiscount = true;
Assert.True(orders[1].Customer.HasDiscount);
So with a stateless session the two Customer references are not the same instance: an update made through one is not visible through the other, and ReferenceEquals returns false. You have two aliases of the same Customer.
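The effect comes down to whether the loader keeps an identity map (the first-level cache) keyed by primary key. Here is a minimal sketch of both behaviours in plain Python. The class and function names are assumptions chosen for illustration and are not NHibernate APIs.

```python
class Customer:
    def __init__(self, customer_id):
        self.customer_id = customer_id
        self.has_discount = False


ORDER_ROWS = [(1, 1, 5), (2, 1, 20)]  # (id, customer_id, quantity)


def load_without_cache(rows):
    """Stateless style: build a fresh Customer object for every row."""
    return [Customer(customer_id) for _order_id, customer_id, _qty in rows]


def load_with_identity_map(rows):
    """Session style: reuse one Customer object per primary key."""
    cache = {}
    customers = []
    for _order_id, customer_id, _qty in rows:
        if customer_id not in cache:
            cache[customer_id] = Customer(customer_id)
        customers.append(cache[customer_id])
    return customers


aliased = load_without_cache(ORDER_ROWS)
aliased[0].has_discount = True   # only the first copy changes

shared = load_with_identity_map(ORDER_ROWS)
shared[0].has_discount = True    # visible through every reference
```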

Find matching tag items in table from string array using linq

I have a SQL table of tags like this:
Id | Tag
-----------------
1 | car
1 | red
1 | sport
2 | car
2 | red
2 | SUV
And I want to retrieve ONLY the ID for exact matching search strings. So, using LINQ, I'd like to query for:
"car,red"
and have it return: 1 and 2.
Then searching for "car,red,sport" would return 1 only.
I'm not sure at all how to do this using LINQ. If I do the following (using EF context and table as example):
string[] tags = {"car","red","sport"};
var query = context.CarTags.Where(a => tags.Contains(a.Tag)).Select(s=>s);
...will of course return both 1 & 2.
So, how do I do this using LINQ?
string[] tags = {"car","red","sport"};
var query = context.CarTags
.GroupBy(a => a.Id)
.Where(g => tags.All(t => g.Any(a => a.Tag == t)))
.Select(g => g.Key);
I'm not sure if it's gonna work with LINQ to Entities, but I would definitely give it a try.
All the methods used are listed as supported in the "Supported and Unsupported LINQ Methods (LINQ to Entities)" list, so it should work. Even if it does, I would check the generated SQL to make sure it's not overly complicated or inefficient.
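The logic of the grouped query (group rows by Id, then keep the groups that contain every searched tag) can be checked in memory against the sample table. Here is a minimal Python sketch; the function name ids_with_all_tags is an assumption.

```python
from collections import defaultdict

ROWS = [
    (1, "car"), (1, "red"), (1, "sport"),
    (2, "car"), (2, "red"), (2, "SUV"),
]


def ids_with_all_tags(rows, tags):
    """Return the ids whose tag set contains every tag in `tags`."""
    tags_by_id = defaultdict(set)
    for item_id, tag in rows:
        tags_by_id[item_id].add(tag)
    wanted = set(tags)
    # `wanted <= found` is the subset test, the counterpart of tags.All(...).
    return sorted(item_id for item_id, found in tags_by_id.items() if wanted <= found)


two_tags = ids_with_all_tags(ROWS, ["car", "red"])
three_tags = ids_with_all_tags(ROWS, ["car", "red", "sport"])
```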