Rails, SQL: private chat, how to find last message in each conversation - sql

I'v got the folowing schema
+----+------+------+-----------+---------------------+--------+
| id | from | to | message | timestamp | readed |
+----+------+------+-----------+---------------------+--------+
| 46 | 2 | 6 | 123 | 2013-11-19 19:12:19 | 0 |
| 44 | 2 | 3 | 123 | 2013-11-19 19:12:12 | 0 |
| 43 | 2 | 1 | ????????? | 2013-11-19 18:37:11 | 0 |
| 42 | 1 | 2 | adf | 2013-11-19 18:37:05 | 0 |
+----+------+------+-----------+---------------------+--------+
from/to is the ID of the user's, message – obviously, the message, timestamp and read flag.
When user open's his profile I want him to see the list of dialogs he participated with last message in this dialog.
To find a conversation between 2 people I wrote this code, it's simple (Message model):
def self.conversation(from, to)
where(from: [from, to], to: [from, to])
end
So, I can now sort the messages and get the last one. But it's not cool to fire a lot of queries for each dialog.
How could I achieve the result I'm looking for with less queries?
UPDATE:
Ok, looks like it's not really clear, what I'm trying to achieve.
For example, 4 users – Kitty, Dandy, Beggy and Brucy used that chat.
When Brucy entered in dialogs, she shall see
Beggy: hello brucy haw ar u! | <--- the last message from beggy
-------
Dandy: Hi brucy! | <---- the last message from dandy
--------
Kitty: Hi Kitty, my name is Brucy! | <–– this last message is from current user
So, three separated dialogs. Then, Brucy can enter anyone dialog to continue private conversation.
And I can't figured out how could I fetch this records without firing a query for each dialog between users.

This answer is a bit late, but there doesn't seem to be a great way to do this, in Rails 3.2.x at least.
However, here is the solution I came up with
(as I had the same problem on my website).
#sender_ids =
Message.where(recipient_id: current_user.id)
.order("created_at DESC")
.select("DISTINCT owner_id")
.paginate(per_page: 10, page: params[:page])
sql_queries =
#sender_ids.map do |user|
user_id = user.owner_id
"(SELECT * FROM messages WHERE owner_id = #{user_id} "\
"AND recipient_id = #{current_user.id} ORDER BY id DESC "\
"LIMIT 1)"
end.join(" UNION ALL ")
#messages = Message.find_by_sql(sql_queries)
ActiveRecord::Associations::Preloader.new(#messages, :owner).run
This gets the last 10 unique people you sent messages to.
For each of those people, it creates a UNION ALL query to get the last message sent to each of those 10 unique people. With 50, 000 rows, the query completes in about ~20ms. And of course to get assocations to preload, you have to use .includes will not work when using .find_by_sql

def self.conversation(from, to)
order("timestamp asc").last
end
Edit:
This railscast will be helpful..
http://railscasts.com/episodes/316-private-pub?view=asciicast
EDIT2:
def self.conversation(from, to)
select(:from, :to, :message).where(from: [from, to], to: [from, to]).group(:from, :to, :country).order("timestamp DESC").limit(1)
end

Related

how to have one itempointer serialize from 1 to n across the selected rows

as shown in the example below, the output of the query contains blockid startds from 324 and it ends at 127, hence, the itempointer or the row index within the block starts from one for each new block id. in otherwords, as shown below
for the blockid 324 it has only itempointer with index 10
for the blockid 325 it has itempointers starts with 1 and ends with 9
i want to have a single blockid so that the itempointer or the row index starts from 1 and ends with 25
plese let me know how to achive that and
why i have three different blockids?
ex-1
query:
select ctid
from awanti_grid_cell_data agcd
where selectedsiteid = '202230060950'
and centerPointsOfWindowAsGeoJSONInEPSG4326ForCellsInTreatment IS NOT NULL
and centerPointsOfWindowAsGeoJSONInEPSG4326ForCellsInTreatment <> 'None'
result:
|ctid |
|--------|
|(324,10)|
|(325,1) |
|(325,2) |
|(325,3) |
|(325,4) |
|(325,5) |
|(325,6) |
|(325,7) |
|(325,8) |
|(325,9) |
|(326,1) |
|(326,2) |
|(326,3) |
|(326,4) |
|(326,5) |
|(326,6) |
|(326,7) |
|(326,8) |
|(326,9) |
|(327,1) |
|(327,2) |
|(327,3) |
|(327,4) |
|(327,5) |
|(327,6) |
You are missing the point. The ctid is the physical address of a row in the table, and it is none of your business. The database is free to choose whatever place it thinks fit for a table row. As a comparison, you cannot go to the authorities and request that your social security number should be 12345678 - it is simply assigned to you, and you have no say. That's how it is with the physical location of tuples.
Very likely you are not asking this question out of pure curiosity, but because you want to solve some problem. You should instead ask a question about your real problem, and there may be a good answer to that. But whatever problem you are trying to solve, using the ctid is probably not the correct answer, in particular if you want to control it.

ID Extracted from string not useable for connecting to bound form - "expression ... too complex"

I have a linked table to a Outlook Mailitem folder in my Access Database. This is handy in that it keeps itself constantly updated, but I can't add an extra field to relate these records to a parent table.
My workaround was to put an automatically generated/added ID String into the Subject so I could work from there. In order to make my form work the way I need it to, I'm trying to create a query that takes the fields I need from the linked table and adds a calculated field with the extracted ID so it can be referenced for relating records in the form.
The query works fine (I get all the records and their IDs extracted) but when I try to filter records from this query by the calculated field I get:
This expression is typed incorrectly, or it is too complex to be evaluated. For example, a numeric expression may contain too many complicated elements. Try simplifying the expression by assigning parts of the expression to variables.
I tried separating the calculated field out into three fields so it's easier to read, hoping that would make it easier to evaluate for Access, but I still get the same error. My base query is currently:
SELECT InStr(Subject,"Support Project #CS")+19 AS StartID,
InStr(StartID,Subject," ") AS EndID,
Int(Mid(Subject,StartID,EndID-StartID)) AS ID,
ProjectEmails.Subject,
ProjectEmails.[From],
ProjectEmails.To,
ProjectEmails.Received,
ProjectEmails.Contents
FROM ProjectEmails
WHERE (((ProjectEmails.[Subject]) Like "*Support Project [#]CS*"));
I've tried to bind a subform to this query on qryProjectEmailWithID.ID = SupportProject.ID where the main form is bound to SupportProject, and I get the above error. I tried building a query that selects all records from that query where the ID = a given parameter and I still get the same error.
The working query that adds Support Project IDs would look like:
+----+--------------------------------------+----------------------+----------------------+------------+----------------------------------+
| ID | Subject | To | From | Received | Contents |
+----+--------------------------------------+----------------------+----------------------+------------+----------------------------------+
| 1 | RE: Support Project #CS1 ID Extra... | questions#so.com | Isaac.Reefman#so.com | 2019-03-11 | Trying to work out how to add... |
| 1 | RE: Support Project #CS1 ID Extra... | isaac.reefman#so.com | questions#so.com | 2019-03-11 | Thanks for your question. The... |
| 1 | RE: Support Project #CS1 ID Extra... | isaac.reefman#so.com | questions#so.com | 2019-03-11 | You should use a different me... |
| 2 | RE: Support Project #CS2 IT issue... | support#domain.com | someone#company.com | 2019-02-21 | I really need some help with ... |
| 2 | RE: Support Project #CS2 IT issue... | someone#company.com | support#domain.com | 2019-02-21 | Thanks for your question. The... |
| 2 | RE: Support Project #CS2 IT issue... | someone#company.com | support#domain.com | 2019-02-21 | Have you tried turning it off... |
| 3 | RE: Support Project #CS3 email br... | support#domain.com | someone#company.com | 2019-02-12 | my email server is malfunccti... |
| 3 | RE: Support Project #CS3 email br... | someone#company.com | support#domain.com | 2019-02-12 | Thanks for your question. The... |
| 3 | RE: Support Project #CS3 email br... | someone#company.com | support#domain.com | 2019-02-13 | I've just re-started the nece... |
+----+--------------------------------------+----------------------+----------------------+------------+----------------------------------+
The view in question would populate a datasheet that looks the same with just the items whos ID matches the ID of the current SupportProject record, updating when a new record is selected. A separate text box should show the full content of whichever record is selected in that grid, like this:
Have you tried turning it off and on again?
From: support#domain.com
On: 21/02/2019
Thanks for your question. The matter has been assigned to Support Project #CS2, and a support staff member will be in touch shortly to help you out. As it is considered of medium priority, you should expect daily updates.
Thanks,
Support
From: someone#company
On: 21/02/2019
I really need some help with my computer. It seems really slow and I can't do my work efficiently.
Neither of these things happens as when I try to use the calculated number to relate to the PK of the SupportProject table...
I don't know if this is a part of the problem, but whether I use Int(Mid(Subject... or Val(Mid(Subject... I still apparently get a Double, where the ID field (as an autoincrement ID) is a Long. I can't work out how to force it to return a Long, so I can't test whether that's the problem.
So that is output resulting from posted SQL? I really wanted raw data but close enough. If requirement is to extract number after ...CS, calculate in query and save query:
Val(Mid([Subject],InStr([Subject],"CS")+2))
Then build another query to join first query to table.
SELECT qryProjectEmailWithID.*, SupportProject.tst
FROM qryProjectEmailWithID
INNER JOIN SupportProject ON qryProjectEmailWithID.ID = SupportProject.ID;
Filter criteria can be applied to either ID field.
A subform can display the related child records synchronized with SupportProject records on main form.
I tested the ID calc with your data and then with a link to my Inbox. No issue with query join.

How to write a select query or server-side function that will generate a neat time-flow graph from many data points?

NOTE: I am using a graph database (OrientDB to be specific). This gives me the freedom to write a server-side function in javascript or groovy rather than limit myself to SQL for this issue.*
NOTE 2: Since this is a graph database, the arrows below are simply describing the flow of data. I do not literally need the arrows to be returned in the query. The arrows represent relationships.*
I have data that is represented in a time-flow manner; i.e. EventC occurs after EventB which occurs after EventA, etc. This data is coming from multiple sources, so it is not completely linear. It needs to be congregated together, which is where I'm having the issue.
Currently the data looks something like this:
# | event | next
--------------------------
12:0 | EventA | 12:1
12:1 | EventB | 12:2
12:2 | EventC |
12:3 | EventA | 12:4
12:4 | EventD |
Where "next" is the out() edge to the event that comes next in the time-flow. On a graph this comes out to look like:
EventA-->EventB-->EventC
EventA-->EventD
Since this data needs to be congregated together, I need to merge duplicate events but preserve their edges. In other words, I need a select query that will result in:
-->EventB-->EventC
EventA--|
-->EventD
In this example, since EventB and EventD both occurred after EventA (just at different times), the select query will show two branches off EventA as opposed to two separate time-flows.
EDIT #2
If an additional set of data were to be added to the data above, with EventB->EventE, the resulting data/graph would look like:
# | event | next
--------------------------
12:0 | EventA | 12:1
12:1 | EventB | 12:2
12:2 | EventC |
12:3 | EventA | 12:4
12:4 | EventD |
12:5 | EventB | 12:6
12:6 | EventE |
EventA-->EventB-->EventC
EventA-->EventD
EventB-->EventE
I need a query to produce a tree like:
-->EventC
-->EventB--|
| -->EventE
EventA--|
-->EventD
EDIT #3 and #4
Here is the data with edges shown as opposed to the "next" column above. I also added a couple additional columns here to hopefully clear up any confusion about the data:
# | event | ip_address | timestamp | in | out |
----------------------------------------------------------------------------
12:0 | EventA | 123.156.189.18 | 2015-04-17 12:48:01 | | 13:0 |
12:1 | EventB | 123.156.189.18 | 2015-04-17 12:48:32 | 13:0 | 13:1 |
12:2 | EventC | 123.156.189.18 | 2015-04-17 12:48:49 | 13:1 | |
12:3 | EventA | 103.145.187.22 | 2015-04-17 14:03:08 | | 13:2 |
12:4 | EventD | 103.145.187.22 | 2015-04-17 14:05:23 | 13:2 | |
12:5 | EventB | 96.109.199.184 | 2015-04-17 21:53:00 | | 13:3 |
12:6 | EventE | 96.109.199.184 | 2015-04-17 21:53:07 | 13:3 | |
The data is saved like this to preserve each individual event and the flow of a session (labeled by the ip address).
TL;DR
Got lots of events, some duplicates, and need them all organized into one neat time-flow graph.
Holy cow.
After wrestling with this for over a week I think I FINALLY have a working function. This isn't optimized for performance (oh the loops!), but gets the job done for the time being while I can work on performance. The resulting OrientDB server-side function (written in javascript):
The function:
// Clear previous runs
db.command("truncate class tmp_Then");
db.command("truncate class tmp_Events");
// Get all distinct events
var distinctEvents = db.query("select from Events group by event");
// Send 404 if null, otherwise proceed
if (distinctEvents == null) {
response.send(404, "Events not found", "text/plain", "Error: events not found" );
} else {
var edges = [];
// Loop through all distinct events
distinctEvents.forEach(function(distinctEvent) {
var newEvent = [];
var rid = distinctEvent.field("#rid");
var eventType = distinctEvent.field("event");
// The main query that finds all *direct* descendents of the distinct event
var result = db.query("select from (traverse * from (select from Events where event = ?) where $depth <= 2) where #class = 'Events' and $depth > 1 and #rid in (select from Events group by event)", [eventType]);
// Save the distinct event in a temp table to create temp edges
db.command("create vertex tmp_Events set rid = ?, event = ?", [rid, event]);
edges.push(result);
});
// The edges array defines which edges should exist for a given event
edges.forEach(function(edge, index) {
edge.forEach(function(e) {
// Create the temp edge that corresponds to its distinct event
db.command("create edge tmp_Then from (select from tmp_Events where rid = " + distinctEvents[index].field("#rid") + ") to (select from tmp_Events where rid = " + e.field("#rid") + ")");
});
});
var result = db.query("select from tmp_Events");
return result;
}
Takeaways:
Temp tables appeared to be necessary. I tried to do this without temp tables (classes), but I'm not sure it could be done. I needed to mock edges that didn't exist in the raw data.
Traverse was very helpful in writing the main query. Traversing through an event to find its direct, unique descendents was fairly simple.
Having the ability to write stored procs in Javascript is freaking awesome. This would have been a nightmare in SQL.
omfg loops. I plan to optimize this and continue to make it better so hopefully other people can find some use for it.

Creating an SSIS job to split a column and insert into database

I have a column called Description:
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Description/Title |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Liszt, Hungarian Rhapsody #6 {'Pesther Carneval'}; 2 Episodes from Lenau's 'Faust'; 'Hunnenschlacht' Symphonic Poem. (NW German Phil./ Kulka) |
| Beethoven, Piano Sonatas 8, 23 & 26. (Justus Frantz) |
| Puccini, Verdi, Gounod, Bizet: Arias & Duets from Butterfly, Tosca, Boheme, Turandot, I Vespri, Faust, Carmen. (Fiamma Izzo d'Amico & Peter Dvorsky w.Berlin Radio Symph./Paternostro) |
| Puccini, Ponchielli, Bizet, Tchaikovsky, Donizetti, Verdi: Arias from Boheme, Manon Lescaut, Tosca, Gioconda, Carmen, Eugen Onegin, Favorita, Rigoletto, Luisa Miller, Ballo, Aida. (Peter Dvorsky, ten. w.Hungarian State Opera Orch./ Mihaly) |
| Thomas, Leslie: 'The Virgin Soldiers' (Hywel Bennett reads abridged version. Listening time app. 2 hrs. 45 mins. DOLBY) |
| Katalsky, A. {1856-1926}: Liturgy for A Cappella Chorus. Rachmaninov, 6 Choral Songs w.Piano. (Bolshoi Theater Children's Choir/ Zabornok. DOLBY) |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Please note that above I'm only showing 1 field.
Also, the output that I would like is:
+-------+-------+
| Word | Count |
+-------+-------+
| Arias | 3 |
| Duets | 2 |
| Liszt | 10 |
| Tosca | 1 |
+-------+-------+
I want this output to encompass EVERY record. I do not want a separate one of these for each record, just one global one.
I am choosing to use SSIS to do this job. I'd like your input on which controls to use to help with this task:
I'm not looking for a solution, but simply some direction on how to get started with this. I understand this can be done many different ways, but I cannot seem to think of a way to do this most efficiently. Thank you for any guidance.
FYI:
This script does an excellent job of concatenating everything:
select description + ', ' as 'data()'
from [BroincInventory]
for xml path('')
But I need guidance on how to work with this result to create the required output. How can this be done with c# or with one of the SSIS components?
edit: As siyual points out below I need a script task. The script above obviously will not work since there's a limit to the size of a data point.
I think term extraction might be the component you are looking for. Check this out: http://www.mssqltips.com/sqlservertip/3194/simple-text-mining-with-the-ssis-term-extraction-component/

RavenDB query a scoreboard

Given a scoreboard like this: ie (same score and user allowed)
user | score | date
user1 | 1000 | 10-05-2012
user1 | 999 | 10-05-2012
user2 | 998 | 10-05-2012
user1 | 998 | 10-05-2012
user4 | 987 | 10-05-2012
I d like to query the document for a particular score and get x results below and above
for example say I want to target user2 with score 998 and get 2 above and 2 below
The date is there but doesn't influence the query, when a new score comes in and the same value existed, it would be included below.
What would be the best way to go about it?
at the moment I m doing the worse possible thing, that is to have three queries: one that gets the score I m looking for and then another two to get the some rows above and below.
I've seen some discussion on the GG but I cant make anything that makes sense work, hopefully posting the question here is ok
Update: there is also this thread at the discussion group but it doesn't seem to resolve it either.
Thanks
Miau,
You don't really have a choice, you can drop it to two queries:
var q1 = session.Query<Score>().Where(x=>x.Value >= 998).OrderBy(x=> x.Value).Take(3).ToList();
var q2 = session.Query<Score>().Where(x=>x.Value < 998).OrderByDescending(x=> x.Value).Take(3).ToList();
You can improve on that by using lazy queries:
var q1 = session.Query<Score>().Where(x=>x.Value >= 998).OrderBy(x=> x.Value).Take(3).Lazily();
var q2 = session.Query<Score>().Where(x=>x.Value < 998).OrderByDescending(x=> x.Value).Take(3).Lazily();
Now both queries will be issued against the db in a single request.