Get a Facebook user's first-ever status in Rails - ruby-on-rails-3

I am trying to get a Facebook user's first-ever status in my Rails app.
I tried this:
SELECT status_id,time,message FROM status WHERE uid=me() AND source = 0 ORDER BY status_id ASC LIMIT 1
but it does not return the first status ever made.

SELECT status_id,time,message FROM status WHERE uid=me() AND source = 0 ORDER BY time ASC LIMIT 1
The query above works well. I just replaced status_id with time to sort the result by time (a Unix timestamp, where the smallest value is the earliest).

Related

find rejected items from the customer's order

I need to generate a report of the rejected items of an order. I have to do it once the order has finished being processed by the system, and the conditions under which I consider an order to have stopped being processed are:
The status of the order in the process is equal to or greater than 600
All items in the order were rejected and are in status 999
I want to write an SQL query that considers the two previous conditions and returns the rejected items of the order once it is no longer being processed by the system.
scenario example:
So I am trying the following:
select * from order_detail_status
where order_number = 'OR_001'
and process_status= '999'
and process_id = (select max(process_id) from configuracion.order_detail_status where order_number = 'OR_001' and process_status >= 600)
this would work if only scenario 1 existed, but for scenario 2 the request never reaches that status, so I am trying to add a second condition:
or (select distinct (process_status) from configuracion.order_detail_status where order_number = 'OR_002' ) = '999'
With the second condition I want to express that all records of the order were rejected with status 999, but it does not work for me. Any suggestions?
If you want to find orders where ALL items have process_status of 999, then try something like this:
SELECT order_number, MIN(process_status) AS minps, MAX(process_status) AS maxps
FROM order_detail_status
GROUP BY order_number
HAVING MIN(process_status) = MAX(process_status) AND MIN(process_status) = 999
Grouping the rows by order and then taking MIN() and MAX() gives you the lowest and highest status for the order. If they match, there is only one status across all items in the order. If that single status is 999 (or > 600), then you have your answer.
HAVING is like a WHERE condition but operates after the grouping is done.
Results:
OR_002 999 999
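The MIN()/MAX() trick is easy to verify locally. Here is a minimal sketch using Python's built-in sqlite3 module with made-up sample rows (the table and column names follow the question; the data is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_detail_status (
    order_number   TEXT,
    process_id     INTEGER,
    process_status INTEGER
);
-- OR_001: one item rejected, others still in flight
INSERT INTO order_detail_status VALUES
    ('OR_001', 1, 600), ('OR_001', 2, 999), ('OR_001', 3, 700),
-- OR_002: every item rejected
    ('OR_002', 1, 999), ('OR_002', 2, 999);
""")

# If MIN and MAX of process_status agree, all rows of the order share a
# single status; requiring that shared status to be 999 finds orders
# whose items were all rejected.
rows = conn.execute("""
    SELECT order_number,
           MIN(process_status) AS minps,
           MAX(process_status) AS maxps
    FROM order_detail_status
    GROUP BY order_number
    HAVING MIN(process_status) = MAX(process_status)
       AND MIN(process_status) = 999
""").fetchall()
print(rows)  # only OR_002 qualifies
```

The aggregate expressions are repeated in HAVING for portability; some engines (MySQL, SQLite) also accept the column aliases there.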

This OFFSET FETCH command passes validation but produces no results?

I have to connect accounts to users, and in doing so I have to ensure I'm selecting the right account for eligibility, across multiple accounts. The following code passes validation but yields no results.
I already have a list of the CustomerNumbers in the table; they are the primary key.
SELECT
x.CustomerNumber,
a.ACC_AccountId
FROM Eligibles x
LEFT JOIN
ACCOUNTS a
ON x.CustomerNumber = a.ACC_CustomerNumber
ORDER BY a.ACC_LIVEcode ASC,
a.ACC_Limit DESC,
a.ACC_Amount DESC
OFFSET 0 ROWS
FETCH FIRST 1 ROWS ONLY
I am not getting any Account IDs populated, whereas nearly everyone should have one.
In an ascending sort, NULLs are first. So, you get the mismatches first. Instead, add an expression so the matches are first:
SELECT x.CustomerNumber, a.ACC_AccountId
FROM Eligibles x LEFT JOIN
ACCOUNTS a
ON x.CustomerNumber = a.ACC_CustomerNumber
ORDER BY (CASE WHEN a.ACC_CustomerNumber IS NULL THEN 1 ELSE 0 END),
a.ACC_LIVEcode ASC, a.ACC_Limit DESC, a.ACC_Amount DESC
OFFSET 0 ROWS
FETCH FIRST 1 ROWS ONLY
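To see the NULLs-first effect concretely, here is a small sketch using Python's sqlite3 with invented, simplified tables (SQLite uses LIMIT rather than OFFSET ... FETCH, and like SQL Server it sorts NULLs first in an ascending sort):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Eligibles (CustomerNumber INTEGER PRIMARY KEY);
CREATE TABLE Accounts  (ACC_CustomerNumber INTEGER, ACC_AccountId TEXT);
INSERT INTO Eligibles VALUES (1), (2), (3);
INSERT INTO Accounts  VALUES (1, 'A-100'), (2, 'A-200');  -- customer 3 has no account
""")

# Naive version: the NULL produced by the unmatched LEFT JOIN row sorts
# first in ascending order, so LIMIT 1 returns a row with no account.
naive = conn.execute("""
    SELECT x.CustomerNumber, a.ACC_AccountId
    FROM Eligibles x
    LEFT JOIN Accounts a ON x.CustomerNumber = a.ACC_CustomerNumber
    ORDER BY a.ACC_AccountId ASC
    LIMIT 1
""").fetchone()

# Fixed version: the CASE expression ranks matched rows (0) ahead of
# unmatched rows (1) before the real sort keys apply.
fixed = conn.execute("""
    SELECT x.CustomerNumber, a.ACC_AccountId
    FROM Eligibles x
    LEFT JOIN Accounts a ON x.CustomerNumber = a.ACC_CustomerNumber
    ORDER BY (CASE WHEN a.ACC_CustomerNumber IS NULL THEN 1 ELSE 0 END),
             a.ACC_AccountId ASC
    LIMIT 1
""").fetchone()

print(naive)  # (3, None)  -- the unmatched customer wins the sort
print(fixed)  # (1, 'A-100')
```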

SQL: get records based on user likes of other records

I'm trying to write a SQL Server query that will provide some results based on what other users like.
It is a bit like on Amazon when it says 'Users who bought this also bought...'
It is based on the vote field, where a vote of '1' means a user liked a record; or a vote of '0' means they disliked it.
So when a user is on a particular record, I want to list 3 other records that users who liked the current record also liked.
A snippet of the relevant table is provided below:
ID  UserID  RecordID     Vote  DateAdded
16  9999    12013011290  1     2008-11-11 13:23:44.000
17  8888    12013011290  0     2008-11-11 13:23:44.000
18  7777    12013011290  0     2008-11-11 13:23:44.000
20  4930    12013011290  1     2013-11-19 15:04:06.263
I think this requires ordering by a sub-select, but I'm not sure. Can anyone advise me on whether this is possible and, if so, how? Thanks.
p.s.
To maintain the quality of the results I think it would be extra useful to filter by DateAdded. That is,
- 'user x' is seeing recommended records about 'record z'
- 'user y' is someone who has liked 'record z' and 'record a'
- only count 'user y's' like of 'record a' IF they liked 'record a' an HOUR before or after they liked 'record z'
- in other words, only count the 'record a's' like if it was during the same website-browsing session as 'record z'
Hope this makes sense!
something like this?
select r.description
from record r
join (
    select top 3 v.recordid
    from votes v
    where v.vote = 1
      and v.recordid != 123456789
      and v.userid in (
          select userid from votes
          where recordid = 123456789 and vote = 1
      )
    order by dateadded desc
) as x on x.recordid = r.id
A method I used for the basic version of this problem is indeed multiple selects: figure out which users liked a specific item, then query further for what else they liked.
with likers as (
    select user_id from likes where content_id = 10
)
select count(user_id) as like_count, content_id
from likes
natural join likers
where content_id <> 10
group by content_id
order by like_count desc;
(Tested using Sqlite3)
What you will receive is a list of items that were liked by users who liked item 10, ordered by the number of likes (within that group of users). I would probably want to limit this as well, since on a larger dataset it's likely to produce a large number of stray items with only one or two overlapping likes, which are in turn buried under items with hundreds of likes.
I suspect the reason you are checking timestamps in the first place is so that if somebody likes laundry detergent, then comes back two days later to like a movie, the system would not associate "people who like Epic Shootout 17 also like Clean More."
I would not recommend using date arithmetic for this. I might suggest creating another table to represent individual "sessions" and using the session_id for this task. Since there are (hopefully!) many, many like records on your database, you want to reduce the amount of work you are making it do. You can also use this session_id for logging any other actions a person did (for analytics purposes.) It is also computationally cheaper to ask for all things that happened within a session with a simple index and identity comparison than to perform date computations on potentially millions of records.
For reference, Piwik defines a new session as thirty minutes since the last action taken.
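As a runnable sketch of the CTE approach above, here is the same "users who liked item 10" query against Python's built-in sqlite3 with invented sample data (an explicit JOIN ... USING replaces the NATURAL JOIN, and a LIMIT caps the result as suggested):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE likes (user_id INTEGER, content_id INTEGER);
INSERT INTO likes VALUES
    (1, 10), (1, 20), (1, 30),
    (2, 10), (2, 20),
    (3, 40);            -- user 3 never liked item 10
""")

# Collect everyone who liked item 10, then count what else they liked.
rows = conn.execute("""
    WITH likers AS (
        SELECT user_id FROM likes WHERE content_id = 10
    )
    SELECT COUNT(l.user_id) AS like_count, l.content_id
    FROM likes l
    JOIN likers USING (user_id)
    WHERE l.content_id <> 10
    GROUP BY l.content_id
    ORDER BY like_count DESC
    LIMIT 3
""").fetchall()
print(rows)  # item 20 was liked by both likers, item 30 by one
```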

How to check data integrity within an SQL table?

I have a table for logging access data for a lab. The table structure is like this:
create table accesslog
(
userid int not null,
direction int not null,
accesstime datetime not null
);
This lab has only one gate, which is under access control, so users must first "enter" the lab before they can "leave". In my original design, I set the "direction" field as a flag that is either 1 (for entering the lab) or -1 (for leaving the lab), so that I can use queries like:
SELECT SUM(direction) FROM accesslog;
to get the total user count currently within the lab. Theoretically this works, since the "direction" values will always follow the pattern 1 => -1 => 1 => -1 for any given userid.
But soon I found that log messages can get lost on the transmission path from the lab gate to the server, dropped either by a busy network or by hardware glitches. Of course I can harden the transmission path with sequence numbers, ACKs, retransmissions, hardware redundancy, etc., but in the end I might still get something like this:
userid  direction  accesstime
-------------------------------------
1           1      2013/01/03 08:30
1          -1      2013/01/03 09:20
1           1      2013/01/03 10:10
1          -1      2013/01/03 10:50
1          -1      2013/01/03 13:40
1           1      2013/01/03 18:00
This is a recent log for user "1". It's clear that I lost one log message for that user entering the lab between 10:50 and 13:40. At the time I query this data he is still in the lab, so there are no exit logs after 2013/01/03 18:00 yet; that is expected.
My question is: is there any way to "find" this kind of data inconsistency with an SQL query? There are 5000 users in my system and the lab operates 24 hours a day, so there is no "magic time" at which the lab is guaranteed to be empty. It would be horrible if I had to write code checking the continuity of the "direction" field line by line, user by user.
I know it's not possible to "fix" the log with the correct data. I just want to know "Oh, I have a data inconsistency issue for userid=1" so that I can add a marked amending record to correct the final statistics.
Any advice would be appreciated, even changing the table structure would be OK.
Thanks.
Edit: Sorry I didn't mention the details.
Currently I'm using a mixed SQL solution. The table shown above is MySQL, and it contains only the logs from the last 24 hours, as a "real time" view for fast browsing.
Every day at 03:00 AM a pre-scheduled process, written in C++ on POSIX, is launched. This process calculates the statistics, adds the daily statistics to an Oracle DB via a proprietary-protocol TCP socket, and then removes the old data from MySQL.
The Oracle part is not handled by me and I can do nothing about it. I just want to make sure that the final statistics for each day are correct.
The data size is about 200,000 records per day -- I know it sounds crazy, but it's true.
You didn't state your DBMS, so this is ANSI SQL (which works on most modern DBMS).
select userid,
direction,
accesstime,
case
when lag(direction) over (partition by userid order by accesstime) = direction then 'wrong'
else 'correct'
end as status
from accesslog
where userid = 1
For each row in accesslog you'll get a "status" column which indicates whether the row "breaks" the rule.
You can filter out those that are invalid using:
select *
from (
select userid,
direction,
accesstime,
case
when lag(direction) over (partition by userid order by accesstime) = direction then 'wrong'
else 'correct'
end as status
from accesslog
where userid = 1
) t
where status = 'wrong'
I don't think there is a way to enforce this kind of rule using constraints in the database (although I have the feeling that PostgreSQL's exclusion constraints could help here)
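The LAG() detection can be tried end-to-end with Python's sqlite3 (window functions require SQLite 3.25+, bundled with recent Python versions); the sample rows reproduce the lost "enter" event from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accesslog (
    userid     INTEGER NOT NULL,
    direction  INTEGER NOT NULL,   -- 1 = enter, -1 = leave
    accesstime TEXT    NOT NULL
);
INSERT INTO accesslog VALUES
    (1,  1, '2013-01-03 08:30'),
    (1, -1, '2013-01-03 09:20'),
    (1,  1, '2013-01-03 10:10'),
    (1, -1, '2013-01-03 10:50'),
    (1, -1, '2013-01-03 13:40'),   -- the matching "enter" was lost
    (1,  1, '2013-01-03 18:00');
""")

# A row whose direction equals the previous row's direction (per user,
# in time order) marks the point where a log record went missing.
bad = conn.execute("""
    SELECT userid, direction, accesstime
    FROM (
        SELECT userid, direction, accesstime,
               LAG(direction) OVER (PARTITION BY userid
                                    ORDER BY accesstime) AS prev
        FROM accesslog
    ) AS t
    WHERE prev = direction
""").fetchall()
print(bad)  # the 13:40 "leave" repeats the previous direction
```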
Why not use SUM() with a WHERE clause to filter by user? If the sum is ever anything other than 0 or 1, you surely have a problem.
Ok I figured it out. Thanks for the idea provided by a_horse_with_no_name.
My final solution is this query:
SELECT userid, COUNT(*), SUM(direction * rule) FROM (
    SELECT userid, direction, @inout := @inout * -1 AS rule
    FROM accesslog l, (SELECT @inout := -1) r
    ORDER BY userid, accesstime
) g GROUP BY userid;
First I create an alternating pattern with the @inout user variable, which yields 1 => -1 => 1 => -1 down the rows in the "rule" column. Then I compare the direction field against the rule column by taking the product of the two.
This is fine even if some users have an odd number of records, since each user is supposed to follow either the identical or the exactly reversed pattern of "rule". So per user, the sum of the products should be equal to either COUNT(*) or -1 * COUNT(*).
By comparing SUM() and COUNT(), I can know exactly which userid went wrong.
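The @inout trick relies on MySQL user variables; on engines with window functions, an equivalent alternating "rule" can be generated per user with ROW_NUMBER(), which also keeps the pattern from carrying over between users. A sketch in Python's sqlite3 with invented rows (user 1 reproduces the broken log, user 2 is clean):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accesslog (
    userid     INTEGER NOT NULL,
    direction  INTEGER NOT NULL,
    accesstime TEXT    NOT NULL
);
INSERT INTO accesslog VALUES
    (1,  1, '08:30'), (1, -1, '09:20'),
    (1,  1, '10:10'), (1, -1, '10:50'),
    (1, -1, '13:40'), (1,  1, '18:00'),   -- a lost "enter" breaks user 1
    (2,  1, '08:00'), (2, -1, '17:00');   -- user 2 alternates correctly
""")

# rule alternates 1, -1, 1, -1, ... per user; a clean log satisfies
# SUM(direction * rule) = +COUNT(*) or -COUNT(*).
rows = conn.execute("""
    SELECT userid, COUNT(*) AS n, SUM(direction * rule) AS s
    FROM (
        SELECT userid, direction,
               (ROW_NUMBER() OVER (PARTITION BY userid
                                   ORDER BY accesstime) % 2) * 2 - 1 AS rule
        FROM accesslog
    ) AS g
    GROUP BY userid
""").fetchall()
suspects = [u for (u, n, s) in rows if abs(s) != n]
print(suspects)  # only user 1 fails the alternation check
```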

How to write an SQL query that retrieves high scores over a recent subset of scores -- see explanation

Given a table of responses with columns:
Username, LessonNumber, QuestionNumber, Response, Score, Timestamp
How would I run a query that returns which users got a score of 90 or better on their first attempt at every question in their last 5 lessons? "Last 5 lessons" is a limiting condition, rather than a requirement, so if they completed only 1 lesson but got all of their first attempts for each question right, they should be included in the results. We just don't want to look back further than 5 lessons.
About the data: Users may be on different lessons. Some users may have not yet completed five lessons (may only be on lesson 3 for example). Each lesson has a different number of questions. Users have different lesson paths, so they may skip some lesson numbers or even complete lessons out of sequence.
Since this seems to be a problem of transforming temporally non-uniform/discontinuous values into uniform/contiguous values per-user, I think I can solve the bulk of the problem with a couple ranking function calls. The conditional specification of scoring above 90 for "first attempt at every question in their last 5 lessons" is also tricky, because the number of questions completed is variable per-user.
So far...
As a starting point or hint at what may need to happen, I've transformed Timestamp into an "AttemptNumber" for each question, by using "row_number() over (partition by Username,LessonNumber,QuestionNumber order by Timestamp) as AttemptNumber".
I'm also trying to transform LessonNumber from an absolute value into a contiguous ranked value for individual users. I could use "dense_rank() over (partition by Username order by LessonNumber desc) as LessonRank", but that assumes the order lessons are completed corresponds with the order of LessonNumber, which is unfortunately not always the case. However, let's assume that this is the case, since I do have a way of producing such a number through a couple of joins, so I can use the dense_rank transform described to select the "last 5 completed lessons" (i.e. LessonRank <= 5).
For the >= 90 condition, I think I can transform the score into an integer so that it's "1" if >= 90, and "0" if < 90. I can then introduce a clause like "group by Username having SUM(Score) = COUNT(Score)", which will select only those users with all transformed scores equal to 1.
Any solutions or suggestions would be appreciated.
You kind of gave away the solution:
SELECT DISTINCT Username
FROM Results
WHERE Username NOT in (
SELECT DISTINCT Username
FROM (
SELECT
r.Username,r.LessonNumber, r.QuestionNumber, r.Score, r.Timestamp
, row_number() over (partition by r.Username,r.LessonNumber,r.QuestionNumber order by r.Timestamp) as AttemptNumber
, dense_rank() over (partition by r.Username order by r.LessonNumber desc) AS LessonRank
FROM Results r
) as f
WHERE LessonRank <= 5 and AttemptNumber = 1 and Score < 90
)
Concerning the LessonRank, I used exactly what you described, since it is not clear how to order the lessons otherwise: the timestamp of the first attempt at the first question of a lesson? Or the timestamp of the first attempt at any question of a lesson? Or simply the first (or the most recent?) timestamp of any result of any question of a lesson?
The innermost Select adds all the AttemptNumber and LessonRank as provided by you.
The next Select retains only the results which would disqualify a user to be in the final list - all first attempts with an insufficient score in the last 5 lessons. We end up with a list of users we do not want to display in the final result.
Therefore, in the outermost Select, we can select all the users which are not in the exclusion list. Basically all the other users which have answered any question.
EDIT: As so often, second try should be better...
One more EDIT:
Here's a version including your remarks in the comments.
SELECT Username
FROM
(
SELECT Username, CASE WHEN Score >= 90 THEN 1 ELSE 0 END AS QuestionScoredWell
FROM (
SELECT
r.Username,r.LessonNumber, r.QuestionNumber, r.Score, r.Timestamp
, row_number() over (partition by r.Username,r.LessonNumber,r.QuestionNumber order by r.Timestamp) as AttemptNumber
, dense_rank() over (partition by r.Username order by r.LessonNumber desc) AS LessonRank
FROM Results r
) as f
WHERE LessonRank <= 5 and AttemptNumber = 1
) as ff
Group BY Username
HAVING MIN(QuestionScoredWell) = 1
I used a Having clause with a MIN expression on the calculated QuestionScoredWell value.
When comparing the execution plans for both queries, this query is actually faster. Not sure though whether this is partially due to the low number of data rows in my table.
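The whole pipeline (ROW_NUMBER for first attempts, DENSE_RANK for the last 5 lessons, then HAVING MIN(...) = 1) can be exercised with Python's sqlite3 on a toy Results table; the names follow the question, the data is invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Results (
    Username TEXT, LessonNumber INTEGER, QuestionNumber INTEGER,
    Score INTEGER, Timestamp TEXT
);
INSERT INTO Results VALUES
    ('alice', 1, 1, 95, 't1'),
    ('alice', 1, 2, 91, 't2'),
    ('alice', 2, 1, 93, 't3'),
    ('bob',   1, 1, 80, 't1'),   -- first attempt below 90
    ('bob',   1, 1, 99, 't2'),   -- the retry does not count
    ('bob',   2, 1, 95, 't3');
""")

# Keep only first attempts in the 5 most recent lessons, flag each as
# scored-well or not, then require the per-user minimum flag to be 1.
good = conn.execute("""
    SELECT Username
    FROM (
        SELECT Username,
               CASE WHEN Score >= 90 THEN 1 ELSE 0 END AS QuestionScoredWell
        FROM (
            SELECT Username, Score,
                   ROW_NUMBER() OVER (PARTITION BY Username, LessonNumber, QuestionNumber
                                      ORDER BY Timestamp) AS AttemptNumber,
                   DENSE_RANK() OVER (PARTITION BY Username
                                      ORDER BY LessonNumber DESC) AS LessonRank
            FROM Results
        ) AS f
        WHERE LessonRank <= 5 AND AttemptNumber = 1
    ) AS ff
    GROUP BY Username
    HAVING MIN(QuestionScoredWell) = 1
""").fetchall()
print(good)  # only alice passed every first attempt with 90+
```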
Random suggestions:
1
The conditional specification of scoring above 90 for "first attempt at every question in their last 5 lessons" is also tricky, because the number of questions is variable per-user.
is equivalent to
There exists no first attempt with a score below 90 in the most recent 5 lessons
which strikes me as a little easier to grab with a NOT EXISTS subquery.
2
First attempt is the same as where timestamp = (select min(timestamp) ... )
You need to identify the top 5 lessons per user first, using the timestamp to prioritize lessons, then you can limit by score. Try:
Select username
from table t inner join
     (select top 5 username, lessonNumber
      from table
      order by timestamp desc) l
  on t.username = l.username and t.lessonNumber = l.lessonNumber
where score >= 90