I have a table visitors(id, email, first_seen, sessions, etc.)
and another table trackings(id, visitor_id, field, value) that stores custom, user supplied data.
I want to query these and merge the visitor data columns and the trackings into a single column called data
For example, say I have two trackings
(id: 3, visitor_id: 1, field: "orders_made", value: 2)
(id: 4, visitor_id: 1, field: "city", value: 'new york')
and a visitor
(id: 1, email: 'hello@gmail.com', sessions: 5)
I want the result to be in the form of
(id: 1, data: {email: 'hello@gmail.com', sessions: 5, orders_made: 2, city: 'new york'})
What's the best way to accomplish this using Postgres 9.4?
I'll start by saying trackings is a bad idea. If you don't have many things to track, just store json instead; that's what it's made for. If you have a lot of things to track, you'll become very unhappy with the performance of trackings over time.
First you need a json object from trackings:
-- WARNING: Behavior of this with duplicate field names is undefined!
SELECT json_object(array_agg(field), array_agg(value)) FROM trackings WHERE ...
Getting json for visitors is relatively easy:
-- row_to_json takes a single row value, so wrap the columns in a subquery
SELECT row_to_json(v) FROM (SELECT email, sessions FROM visitors WHERE ...) v;
I recommend you do not just squash all those together. What happens if you have a field called email? Instead:
SELECT row_to_json(merged) FROM (SELECT
(
SELECT row_to_json(v) FROM (SELECT email, sessions FROM visitors WHERE ...) v
) AS visitor
, (
SELECT json_object(array_agg(field), array_agg(value)) FROM trackings WHERE ...
) AS trackings
) merged;
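To see why keeping the two sources in separate keys matters, here is a minimal Python sketch (plain dictionaries, not Postgres). The sample email addresses and the colliding `email` tracking row are made up for illustration; they are not from the question's data.

```python
# Hypothetical sample data shaped like the question's rows, plus one
# made-up tracking whose field name collides with a visitor column.
visitor = {"email": "visitor@example.com", "sessions": 5}
trackings = [
    {"field": "orders_made", "value": "2"},
    {"field": "city", "value": "new york"},
    {"field": "email", "value": "user-supplied@example.com"},
]

# Equivalent in spirit to json_object(array_agg(field), array_agg(value)).
tracking_data = {t["field"]: t["value"] for t in trackings}

# Flat merge: the user-supplied "email" silently overwrites the real one.
flat = {**visitor, **tracking_data}

# Nested merge: each source keeps its own namespace, so nothing is lost.
nested = {"visitor": visitor, "trackings": tracking_data}

print(flat["email"])
print(nested["visitor"]["email"])
```

With the flat merge the visitor's real address is gone; with the nested shape both values survive and the caller can decide which one wins.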
Related
I have 2 tables contractPoint and contractPointHistory
ContractPointHistory
ContractPoint
I would like to fetch each contractPoint with its point adjusted by the matching pointChange values. For example: ContractPoint -> id: 3, point: 5
ContractPointHistory has contractPointId: 3 and pointChange: -5, so after the adjustment the point in contractPoint should be 0.
I wrote this code, but it only works with getRawMany(), not with getMany():
const contractPoints = await getRepository(ContractPoint).createQueryBuilder('contractPoint')
.addSelect('"contractPoint".point + COALESCE((SELECT SUM(cpHistory.point_change) FROM contract_point_history AS cpHistory WHERE cpHistory.contract_point_id = contractPoint.id), 0) AS points')
.andWhere('EXTRACT(YEAR FROM contractPoint.validFrom) = :year', { year })
.andWhere('contractPoint.contractId = :contractId', { contractId })
.orderBy('contractPoint.grantedAt', OrderByDirection.Desc)
.getMany();
The getMany method returns entities, that is, all the mapped attributes of an entity. If you want to select specific values that are not entity attributes, such as a computed column, you need to use getRawMany.
As per the documentation -
There are two types of results you can get using select query builder:
entities or raw results. Most of the time, you need to select real
entities from your database, for example, users. For this purpose, you
use getOne and getMany. But sometimes you need to select some specific
data, let's say the sum of all user photos. This data is not an
entity, it's called raw data. To get raw data, you use getRawOne and
getRawMany
From this, we can conclude that the query you want to generate cannot be made using the getMany method.
I have a SQL table message(application, type, action, date, ...) and I would like to get all the actions for a type and all the types for an application in a single query if possible.
So far I have managed to get the result in two separate queries like so:
select application, array_agg(distinct type) as types from message group by application;
application | types
--------------+----------------------------------------------------------------------------------------------------------------------------
app1 | {company,user}
app2 | {document,template}
app3 | {organization,user}
and the second query:
select type, array_agg(distinct action) as actions from message group by type;
type | actions
--------------------------------------+-----------------------------------------
company | {created,updated}
document | {created,tested,approved}
organization | {updated}
template | {deleted}
user | {created,logged,updated}
The most obvious single query I could come up with so far is just:
select application, type, array_agg(distinct action) from message group by application, type;
Which would require some programmatic processing to build the type array.
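As a rough illustration of that post-processing, here is a small Python sketch. The rows are hypothetical and only shaped like the output of the grouped query above; the function name `nest_by_application` is made up for this example.

```python
from collections import defaultdict

# Hypothetical rows shaped like the output of the grouped query:
# (application, type, array of distinct actions).
rows = [
    ("app1", "company", ["created", "updated"]),
    ("app1", "user", ["created", "logged", "updated"]),
    ("app2", "document", ["created", "tested", "approved"]),
]


def nest_by_application(rows):
    """Group (application, type, actions) rows into {application: {type: actions}}."""
    nested = defaultdict(dict)
    for application, type_, actions in rows:
        nested[application][type_] = actions
    return dict(nested)


result = nest_by_application(rows)
print(result["app1"])
```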
What I wanted to do was something theoretically like:
select application, array_agg(type, array_agg(action)) from message group by application, type
which isn't possible as is, but I feel there is a way to do it. I have also thought about nesting the second query into the first one but haven't found how to make it work yet.
You can create tuples (records): (col1, col2). So if col2 is an array, you get a record of type (text, text[]). These records can in turn be aggregated into an array of records:
SELECT
app,
array_agg((type, actions)) AS types -- here is the magic
FROM (
SELECT
app,
type,
array_agg(actions) actions
FROM
message
GROUP BY app, type
) s
GROUP BY app
To access the fields, you have to explicitly define the record type when unnesting:
SELECT
*
FROM (
-- your query with tuples
)s,
unnest(types) AS t(type text, actions text[]) -- unnesting the tuple array
Nevertheless, as stated in the comments, JSON may be a better approach for you:
SELECT
app,
json_agg(json_build_object('type', type, 'actions', actions))
FROM (
SELECT
app,
type,
json_agg(actions) actions
FROM
message
GROUP BY app, type
) s
GROUP BY app
Result:
[{
"type": "company",
"actions": ["created","updated"]
},
{
"type": "user",
"actions": ["logged","updated"]
}]
Another possible JSON output:
SELECT
json_agg(data)
FROM (
SELECT
json_build_object(app, json_agg(types)) as data
FROM (
SELECT
app,
json_build_object(type, json_agg(actions)) AS types
FROM
message
GROUP BY app, type
) s
GROUP BY app
) s
Result:
[{
"app1": [{
"company": ["created","updated"]
},
{
"user": ["logged","updated"]
}]
},
{
"app2": [{
"company": ["created"]
}]
}]
I would like to select all rows from my database where one row contains at least two terms from a set of words/array.
As an example:
I have the following array:
'{"test", "god", "safe", "name", "hello", "pray", "stay", "word", "peopl", "rain", "lord", "make", "life", "hope", "whatever", "makes", "strong", "stop", "give", "television"}'
and I have a tweet dataset stored in the database. So I would like to know which tweets (column name: tweet.content) contain at least two of the words.
My current code looks like this (but of course it only selects one word...):
CREATE OR REPLACE VIEW tweet_selection AS
SELECT tweet.id, tweet.content, tweet.username, tweet.geometry
FROM tweet
WHERE tweet.topic_indicator > 0.15::double precision
AND string_to_array(lower(tweet.content)) = ANY(SELECT '{"test", "god", "safe", "name", "hello", "pray", "stay", "word", "peopl", "rain", "lord", "make", "life", "hope", "whatever", "makes", "strong", "stop", "give", "television"}'::text[])
So the last line needs to be adjusted somehow, but I have no idea how. Maybe with an inner join?
I also have the words stored with a unique id in a different table.
A friend of mine recommended getting a count for each row, but I have no write access to add an additional column to the original tables.
Background:
I am storing my tweets in a Postgres database and I applied LDA (latent Dirichlet allocation) to the dataset. Now I have the generated topics and the words associated with each topic (20 topics and 25 words).
select DISTINCT ON (tweet.id) tweet.id, tweet.content, tweet.username, tweet.geometry
from tweet
where
tweet.topic_indicator > 0.15::double precision
and (
select count(distinct word)
from
unnest(
array['test', 'god', 'safe', 'name', 'hello', 'pray', 'stay', 'word', 'peopl', 'rain', 'lord', 'make', 'life', 'hope', 'whatever', 'makes', 'strong', 'stop', 'give', 'television']::text[]
) s(word)
inner join
regexp_split_to_table(lower(tweet.content), ' ') v (word) using (word)
) >= 2
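The counting logic in the query above can be mirrored in a few lines of Python, which may help when checking individual tweets by hand. This is only a sketch: the function names are made up, and it splits on single spaces just like the `regexp_split_to_table(..., ' ')` call, so punctuation stays attached to words.

```python
# The word list from the question.
TOPIC_WORDS = {
    "test", "god", "safe", "name", "hello", "pray", "stay", "word", "peopl",
    "rain", "lord", "make", "life", "hope", "whatever", "makes", "strong",
    "stop", "give", "television",
}


def count_topic_words(content, words=TOPIC_WORDS):
    """Count distinct topic words among the space-separated tokens of content."""
    tokens = set(content.lower().split(" "))
    return len(tokens & words)


def has_at_least_two(content):
    """Same condition as the subquery's `>= 2` check."""
    return count_topic_words(content) >= 2


print(has_at_least_two("God is my hope"))
print(has_at_least_two("hello world"))
```

Note that a token such as `safe,` does not match `safe` under this splitting rule; the SQL version behaves the same way, so stripping punctuation first may be worth considering.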
The setup I have has publications, drafts, and live versions. Publication has a polymorphic belongs_to since many different types of objects can be drafted.
# Publication.all
Publication id: 1, publishable_id: 2, publishable_type: "Foo",
original_id: 1, original_type: "Foo"
# published scope on Foo
select('*, MAX(publications.created_at)').
joins(:publications).
group('publications.original_id')
# Foo.published.all
[<Foo id: 1, ...>]
Here is the published scope's to_sql:
SELECT *, MAX(publications.created_at)
FROM "foos"
INNER JOIN "publications"
ON "publications"."publishable_id" = "foos"."id"
AND "publications"."publishable_type" = 'Foo'
GROUP BY publications.original_id
Because there is only one publication with a publishable_id of 2, I expect this query to return the second Foo. But when I call the published scope on Foo, I instead get the first one. How is this possible? I thought that an INNER JOIN would limit the results to where the join condition is satisfied? How am I getting the complete opposite of what I'm looking for?
Something interesting: just performing the joins returns the correct result:
self.class.unscoped.joins(:publications)
However, the published scope (shown above) returns the incorrect result. Is something happening with the SELECT or GROUP BY parts of the query that is causing this?
I have:
Two database tables:
Users: id, title
Infos: id, type, user_id_createdby, user_id_addressedto, text
In Infos I have records that set the dates of meetings between users:
Infos record: id: 1, type: "nextmeeting", user_id_createdby: 47, user_id_addressedto: 51, text: "2011/01/13"
Besides "nextmeeting", I have other types of data between users as well.
When a user is logged in, I show them a list of the users with whom they have meetings, by collecting the unique user_id_addressedto values where user_id_createdby is the current user
from Infos into an array @repilents and then calling User.find(:all, :conditions => {:id => @repilents})
Question:
How can I sort the list of Users by the dates in Infos.text where type is "nextmeeting"?
like:
User 4 - 2011/01/05
User 8 - 2011/01/13
User 2 - 2011/01/21
User 5 - Next meeting not defined
User 3 - Next meeting not defined
If you're sure that there will only be one row in Infos per user that has the type "nextmeeting", you can use a left outer join to link users to meetings (the type condition goes in the ON clause so that users without a meeting are still returned):
@users = User.joins("LEFT OUTER JOIN infos ON infos.user_id_createdby = users.id AND infos.type = 'nextmeeting'")
To order by the descending date:
@users = User.joins("LEFT OUTER JOIN infos ON infos.user_id_createdby = users.id AND infos.type = 'nextmeeting'").order("infos.text DESC")
Now some random comments:
type is reserved by Rails for single-table inheritance, so you shouldn't use it as a column name (http://stackoverflow.com/questions/2293618/rails-form-not-saving-type-field-newbie)
Storing dates as text is going to make your life difficult.
You might want to rethink this generic "infos" table and make a distinct Meeting model.
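To illustrate the point about text dates: strings in 'YYYY/MM/DD' form happen to sort correctly, but that stops working as soon as the format varies, and missing values need special handling either way. Below is a rough Python sketch of parsing the text and sorting with undefined meetings last. The user labels and dates are taken from the question's example; the helper names are hypothetical.

```python
from datetime import date, datetime

# (user, infos.text) pairs; None means no "nextmeeting" row exists.
meetings = [
    ("User 8", "2011/01/13"),
    ("User 5", None),
    ("User 2", "2011/01/21"),
    ("User 4", "2011/01/05"),
    ("User 3", None),
]


def parse_meeting(text):
    """Parse 'YYYY/MM/DD' text into a date, or return None if missing."""
    return datetime.strptime(text, "%Y/%m/%d").date() if text else None


def sort_key(item):
    """Sort by real date, with users lacking a meeting placed last."""
    _, text = item
    parsed = parse_meeting(text)
    return (parsed is None, parsed or date.max)


ordered = sorted(meetings, key=sort_key)
for user, text in ordered:
    print(user, "-", text or "Next meeting not defined")
```

A real date column would let the database do this ordering directly instead of relying on the text format.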