This code is supposed to select the TOP 1, but it's not working properly. Instead of showing only the TOP 1 record, it is showing tons of records. It may be because I have 2 tables referenced. In another code I only had 1 and it worked. I need to reference table attendance though so I'm not sure how to work around that. Thanks!
SELECT TOP 1 userID
FROM attendance, CFRRR
WHERE [attendance.Programs] LIKE CFRRR.program
AND [attendance.Language] LIKE CFRRR.language
AND [attendance.Status] = 'Available'
ORDER BY TS ASC
Here are the table fields for attendance: userID, username, Supervisor, Category, AttendanceDay, AttendanceTime, Programs, Language, Status, TS.
Here are the table fields for CFRRR: CFRRRID, WorkerID, Workeremail, Workername, Dateassigned, assignedby, RRRmonth, Scheduledate, scheduledtime, type, ScheduledType, caseid, language, lastname, firstname, Checkedin, Qid, status, CompletedType, comments, actiondate, verifduedate, program.
Clearly the last table has a lot of records.
SELECT TOP in MS Access differs from SELECT TOP in SQL Server and from similar features in other databases. It returns the top rows according to the ORDER BY, and then keeps returning any further rows that tie with the last of those rows on the ORDER BY values. This is sometimes convenient, which is why SQL Server offers the same behavior as SELECT TOP (n) WITH TIES.
To fix this, add one or more columns to the ORDER BY so that the sort key is unique for each generated row:
SELECT TOP 1 userID
FROM attendance as a,
CFRRR
WHERE a.Programs LIKE CFRRR.program AND
a.Language LIKE CFRRR.language AND
a.Status = 'Available'
ORDER BY TS ASC, userID, CFRRRID
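To see why the extra ORDER BY columns matter, here is a minimal sketch that emulates the tie behaviour in plain Python. It is not Access itself; the rows and the helper `top_with_ties` are hypothetical, made up only to illustrate the rule described above.

```python
# Emulation of "TOP n, plus anything tying with the nth row" (illustrative only).
# Each hypothetical row is (TS, userID, CFRRRID), as produced by the two-table join.
rows = [
    (1, "u1", 10),
    (1, "u1", 11),
    (1, "u2", 12),
    (2, "u3", 13),
]


def top_with_ties(rows, n, key):
    """Return the first n rows by key, plus later rows whose key ties with the nth."""
    if n <= 0:
        return []
    ordered = sorted(rows, key=key)
    if len(ordered) <= n:
        return ordered
    cutoff = key(ordered[n - 1])
    return [r for r in ordered if key(r) <= cutoff]


# ORDER BY TS only: three rows share the lowest TS, so all three are returned.
by_ts = top_with_ties(rows, 1, key=lambda r: r[0])

# ORDER BY TS, userID, CFRRRID: the full sort key is unique, so one row is returned.
by_unique = top_with_ties(rows, 1, key=lambda r: (r[0], r[1], r[2]))

print(len(by_ts), len(by_unique))
```

With a sort key that cannot tie, TOP 1 has nothing extra to return.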
How can I exclude the yellow highlighted rows? The requirement of the query is to return only rows where the job title changed, but the raw data includes extraneous rows with different effective dates which we want to be able to automatically exclude with SQL.
This is because the source system has a record effective date column that is common to several columns. We cannot change this architecture and so need the ability to exclude records from the output.
Edit to include error image from suggested answer:
select
a.*
FROM
jobtitles a
LEFT JOIN jobtitles b
ON a.id = b.id AND
a.effdate < b.effdate
WHERE b.id IS NULL
Something like that would get the latest job title, anyway.
You might be able to use that "pseudo table" as the basis for a further query.
On considering your question further, how about:
select
MIN(effdate) as effdate, jobtitle
FROM jobtitles
group by employeeid, jobtitle
(I'm assuming they don't change job titles back and forth. If they do, you're basically screwed with this query, because the GROUP BY merges both stints in the same title into one row, so be aware of that.)
If an employee's JobTitle never reverts to a previous job title, use the following query:
SELECT EmployeeID,
JobTitle,
MAX(Name) AS Name,
MIN(EffectiveDate) AS EffectiveDate
FROM jobtitles
GROUP BY EmployeeID, JobTitle
ORDER BY EmployeeID ASC, EffectiveDate DESC
If an employee's JobTitle can revert to a title they have already held in the past, use the following query:
Edit: Updated the query to match the table schema provided in the question.
SELECT ASSOCIATE_ID,
JOB_TITLE_DESCRIPTION,
POSITION_EFFECTIVE_DATE
FROM (
SELECT
ASSOCIATE_ID,
JOB_TITLE_DESCRIPTION,
JOB_TITLE_CODE,
POSITION_EFFECTIVE_DATE,
LEAD(JOB_TITLE_CODE,1, '0') OVER (PARTITION BY ASSOCIATE_ID ORDER BY POSITION_EFFECTIVE_DATE DESC) AS PREV_TITLE_ID
FROM EMP_JOB_HISTORY
) AS tmp
WHERE tmp.PREV_TITLE_ID <> tmp.JOB_TITLE_CODE
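A small runnable sketch of the same idea, using Python's built-in sqlite3 module (this assumes the bundled SQLite is 3.25 or newer, which is when window functions arrived). The table contents are made up. It uses LAG over ascending dates, which looks at the same "previous title" as LEAD over descending dates, and partitions by associate so one employee's rows are never compared with another's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp_job_history (
    associate_id INTEGER,
    job_title_code TEXT,
    position_effective_date TEXT
);
-- Hypothetical data: associate 1 goes ANL -> ANL -> MGR -> ANL.
INSERT INTO emp_job_history VALUES
    (1, 'ANL', '2020-01-01'),
    (1, 'ANL', '2020-06-01'),
    (1, 'MGR', '2021-01-01'),
    (1, 'ANL', '2022-01-01'),
    (2, 'ANL', '2020-03-01');
""")

# Keep a row only when its title differs from the previous row of the same associate.
# The default '0' makes the first row of each associate count as a change.
changes = conn.execute("""
SELECT associate_id, job_title_code, position_effective_date
FROM (
    SELECT associate_id, job_title_code, position_effective_date,
           LAG(job_title_code, 1, '0') OVER (
               PARTITION BY associate_id
               ORDER BY position_effective_date
           ) AS prev_code
    FROM emp_job_history
) AS tmp
WHERE prev_code <> job_title_code
ORDER BY associate_id, position_effective_date
""").fetchall()

for row in changes:
    print(row)
```

The repeated 'ANL' row from 2020-06-01 is dropped, while the later return to 'ANL' in 2022 is kept, which is the case the GROUP BY approach cannot handle.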
I am using Microsoft SQL Server 2014.
I am able to list emails which are duplicated.
But I am unable to list the entire row, which contain other fields such as EmployeeId, Username, FirstName, LastName, etc.
SELECT Email,
COUNT(Email) AS NumOccurrences
FROM EmployeeProfile
GROUP BY Email
HAVING ( COUNT(Email) > 1 )
How can I list all fields of the rows whose Email appears more than once in the table?
Thank you.
Try this:
WITH DataSource AS
(
SELECT *
,COUNT(*) OVER (PARTITION BY email) count_calc
FROM EmployeeProfile
)
SELECT *
FROM DataSource
WHERE count_calc > 1
select distinct * from EmployeeProfile where email in (SELECT
Email
FROM EmployeeProfile
GROUP BY Email
HAVING COUNT(*) > 1 )
with cte as (
select *
, count(1) over (partition by email) noDuplicates
from Demo
)
select *
from cte
where noDuplicates > 1
order by Email, EmployeeId
Explanation:
I've used a common table expression (cte) here; but you could equally use a subquery; it makes no difference.
This cte/subquery fetches every row, and includes a new field called noDuplicates which says how many records have that same email address (including the record itself; so noDuplicates=1 actually means there are no duplicates; whilst noDuplicates=2 means the record itself and 1 duplicate, or 2 records with this email address). This field is calculated using an aggregate function over a window. You can read up on window functions here: https://learn.microsoft.com/en-us/sql/t-sql/queries/select-over-clause-transact-sql?view=sql-server-2017
In our outer query we then select only those records with noDuplicates greater than 1, i.e. where there are multiple records with the same email address.
Finally, I've sorted by Email and EmployeeId, so that duplicates are listed alongside one another and are presented in the sequence in which they were (presumably) created. That just makes life easier for whoever then has to deal with these results.
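The explanation above can be tried out with a minimal sketch in Python's built-in sqlite3 module (assuming SQLite 3.25+ for window functions). The three sample rows and the column name email_count are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeProfile (
    EmployeeId INTEGER PRIMARY KEY,
    Username TEXT,
    Email TEXT
);
-- Hypothetical data: two profiles share one email address.
INSERT INTO EmployeeProfile VALUES
    (1, 'ann',  'a@example.com'),
    (2, 'bob',  'b@example.com'),
    (3, 'anna', 'a@example.com');
""")

# Attach to every row the number of rows sharing its email, then keep counts > 1.
dupes = conn.execute("""
WITH counted AS (
    SELECT *, COUNT(*) OVER (PARTITION BY Email) AS email_count
    FROM EmployeeProfile
)
SELECT EmployeeId, Username, Email, email_count
FROM counted
WHERE email_count > 1
ORDER BY Email, EmployeeId
""").fetchall()

for row in dupes:
    print(row)
```

Both rows for a@example.com come back in full, each carrying a count of 2, and bob's unique row is filtered out.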
If EmployeeId is unique, then you can use EXISTS:
SELECT ep.*
FROM EmployeeProfile ep
WHERE EXISTS (SELECT 1
FROM EmployeeProfile ep1
WHERE ep1.Email = ep.Email AND ep1.EmployeeId <> ep.EmployeeId
);
I would like to be able to extract one field from multiple records from within a single table. For example, assuming I have a schema as follows
userId, eventTimestamp, theField
What I want to do is concatenate all instances of the field 'theField' into a single string for a given userId, ordered by eventTimestamp. And for an extra wrinkle, let's say I only want to include the fifty oldest records.
My first attempt was to try something like:
SELECT
userId,
eventTimestamp,
LEAD(theField,0) OVER (PARTITION BY userId ORDER BY eventTimestamp) AS step0,
LEAD(theField,1) OVER (PARTITION BY userId ORDER BY eventTimestamp) AS step1,
....,
LEAD(theField,50) OVER (PARTITION BY userId ORDER BY eventTimestamp) AS step50,
And then the next step was to wrap that first step up in another SELECT statement as follows:
SELECT userId, eventTimestamp, CONCAT(STRING(step0), STRING(step1),...,STRING(step50)) as concatenatedString
FROM [whateverDataset.whateverTable],
GROUP BY
userId, eventTimestamp
This approach doesn't work though because if I have more than 50 steps (which I do), then I end up getting multiple rows for each of those outer SELECT statements, basically N-50 rows, where N = the total number of records for a particular userId. A 'solution' to this would be to have a HAVING statement in the inner SELECT statement to limit itself to only reporting the first 50 records, but overall this seems like a rather cumbersome solution. In non-BigQuery variants of SQL the GROUP_CONCAT seems to be a good way to go forward, but it either doesn't work here or I lack the creativity to get it to work. Anyone have any suggestions?
Thanks,
Brad
For BigQuery Legacy SQL:
SELECT
userid, GROUP_CONCAT(theField) AS Fields
FROM (
SELECT
userid, eventTimestamp, theField,
ROW_NUMBER() OVER(PARTITION BY userid ORDER BY eventTimestamp) AS pos
FROM YourTable
ORDER BY eventTimestamp
)
WHERE pos < 51
GROUP BY userid
Please note: the inner ORDER BY does not guarantee the order of theField in GROUP_CONCAT. So far, in all practical cases I have seen, the order does carry over, but test carefully.
For BigQuery Standard SQL:
Don't forget to uncheck the Use Legacy SQL checkbox under Show Options.
SELECT
userid,
(SELECT STRING_AGG(fields) FROM t.fields) AS fields
FROM (
SELECT
userid,
ARRAY(SELECT theField FROM t.fields ORDER BY eventTimestamp) fields
FROM (
SELECT
userid,
ARRAY_AGG(STRUCT(theField, eventTimestamp)) fields
FROM (
SELECT
userid,
eventTimestamp,
theField,
ROW_NUMBER() OVER(PARTITION BY userid ORDER BY eventTimestamp) AS pos
FROM YourTable
)
WHERE pos < 51
GROUP BY userid
) t
) t
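If the ordering inside the aggregate is a worry, one alternative is to let SQL pick and sort the rows and do the concatenation in client code. A minimal sketch with Python's built-in sqlite3 (assuming SQLite 3.25+; this is not BigQuery). The table name events, the sample rows, and the per-user limit of 2 (standing in for 50) are all made up for the demo.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (userId TEXT, eventTimestamp INTEGER, theField TEXT);
-- Hypothetical data, deliberately inserted out of timestamp order.
INSERT INTO events VALUES
    ('u1', 3, 'c'), ('u1', 1, 'a'), ('u1', 2, 'b'),
    ('u2', 5, 'y'), ('u2', 4, 'x');
""")

LIMIT_PER_USER = 2  # stands in for the 50 in the question

# Number each user's rows from oldest to newest, keep the oldest N,
# and return them in a guaranteed order via the outer ORDER BY.
rows = conn.execute("""
SELECT userId, theField
FROM (
    SELECT userId, eventTimestamp, theField,
           ROW_NUMBER() OVER (PARTITION BY userId ORDER BY eventTimestamp) AS pos
    FROM events
) AS numbered
WHERE pos <= ?
ORDER BY userId, eventTimestamp
""", (LIMIT_PER_USER,)).fetchall()

# Rows arrive sorted by userId, so groupby sees each user exactly once.
concatenated = {
    user: ",".join(field for _, field in group)
    for user, group in groupby(rows, key=lambda r: r[0])
}
print(concatenated)
```

The concatenation order here depends only on the outer ORDER BY, which the database does guarantee.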
I have many records where the same taskid is assigned to multiple people, but I want the output to show distinct records, meaning only one row per taskid.
Below is my query, which is not working. Please suggest a solution.
SELECT DISTINCT
taskid, taskname, person, userid, dept, date, status, monitor,
comments, monitor_comments, respondtime, assignedby,
reassigncomment, priority,date_complete, followup_date, re_status
FROM
task
WHERE
(status IS NULL)
In your case the result is distinct, but not in the way you want, because DISTINCT applies to the whole row. If you only need distinct task IDs, use this:
SELECT DISTINCT taskid
FROM task
WHERE (status IS NULL)
Then the result will be the distinct task IDs.
First, if you have a column called taskid in a table called task, I think it should be unique -- unless it is somehow a slowly changing dimension.
If it is not unique, then you are begging the question: which row do you want?
In any case, SQL Server 2005 and later have a function called row_number() that can solve your problem:
select t.*
from (select t.*, row_number() over (partition by taskid order by taskid) as seqnum
from task t
) t
where seqnum = 1;
This will return one arbitrary row for each taskid. If you have a way of preferring one row over another, then adjust the order by clause.
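Here is a minimal sketch of "adjusting the order by clause" so that the chosen row is no longer arbitrary, using Python's built-in sqlite3 (assuming SQLite 3.25+, not SQL Server). The cut-down task table and the rule "prefer the alphabetically first person" are assumptions made only for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE task (taskid INTEGER, taskname TEXT, person TEXT);
-- Hypothetical data: task 1 is assigned to two people.
INSERT INTO task VALUES
    (1, 'Audit',  'Carol'),
    (1, 'Audit',  'Alice'),
    (2, 'Deploy', 'Bob');
""")

# Number the rows within each taskid; ordering by person decides which row gets 1.
one_per_task = conn.execute("""
SELECT taskid, taskname, person
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY taskid ORDER BY person) AS seqnum
    FROM task t
) AS ranked
WHERE seqnum = 1
ORDER BY taskid
""").fetchall()

for row in one_per_task:
    print(row)
```

Each taskid appears once, and for task 1 the row kept is Alice's because of the ORDER BY inside the window.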
I have added a priority column whose value is 1 for one row of each taskid and 0 for the others, so I can find the rows I want with:
SELECT DISTINCT
taskid, taskname, person, userid, dept, date, status, monitor,
comments, monitor_comments, respondtime, assignedby,
reassigncomment, priority,date_complete, followup_date, re_status
FROM
task
WHERE
(status IS NULL) and (priority='1')
I want to do query as below. Query is wrong but describes my intentions.
SELECT name, dateTime, data
FROM Record
WHERE dateTime = MAX(dateTime)
Update: OK, the query does not describe my intentions very well. My bad.
I want to select latest record for each person.
Try This:
SELECT name, dateTime, data
FROM Record
WHERE dateTime = SELECT MAX(dateTime) FROM Record
You could also write it using an inner join:
SELECT R.name, R.dateTime, R.data
FROM Record R
INNER JOIN (SELECT MAX(dateTime) AS dateTime FROM Record) RMax ON R.dateTime = RMax.dateTime
Which is the same but written from a different perspective
SELECT R.name, R.dateTime, R.data
FROM Record R,
(SELECT MAX(dateTime) AS dateTime FROM Record) RMax
WHERE R.dateTime = RMax.dateTime
I like Miky's answer and the one from Quassnoi (and upvoted Miky's) but, if your needs are similar to mine, you should keep in mind some limitations. First and most importantly, they only work if you are looking for the latest record overall or the latest record for a single name. If you want the latest record for each person in a set (one record per person, but the latest record for each), then the above solutions fall short. Second, and less importantly, if you'll be working with large datasets, they might prove a bit slow over the long run. So, what is the work-around?
What I do is to add a bit field to the table marked "newest." Then, when I store a record (which is done in a stored procedure in SQL Server) I follow this pattern:
Update Table Set Newest=0 Where Name=@Name;
Insert into Table (Name, dateTimeVal, Data, Newest) Values (@Name, GetDate(), @Data, 1);
Also, there is an index on Name and Newest to make Selects very fast.
Then the Select is just:
Select dateTimeVal, Data From Table Where (Name=@Name) and (Newest=1);
A select for a group will be something like:
Select Name, dateTimeVal, Data from Table Where (Newest=1); -- Gets multiple records
If the records may not be entered in date order, then your logic is a little bit different:
Update Table Set Newest=0 Where Name=@Name;
Insert into Table (Name, dateTimeVal, Data, Newest) Values (@Name, GetDate(), @Data, 0); -- NOTE ZERO
Update Table Set Newest=1 Where Name=@Name And dateTimeVal=(Select Max(dateTimeVal) From Table Where Name=@Name);
The rest stays the same.
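For anyone who wants to try the flag pattern, here is a minimal sketch of the in-order case using Python's built-in sqlite3 rather than a SQL Server stored procedure. The table name Record, the helper `store`, and the sample rows are assumptions for the demo; the two statements run in one transaction so the flag is never left half-updated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Record (name TEXT, dateTimeVal TEXT, data TEXT, newest INTEGER)"
)
# Index on (name, newest), as suggested above, so lookups by flag stay cheap.
conn.execute("CREATE INDEX idx_record_name_newest ON Record (name, newest)")


def store(conn, name, date_time_val, data):
    """Clear the flag on the person's older rows, then insert the new row flagged 1."""
    with conn:  # commits both statements together, or rolls both back on error
        conn.execute("UPDATE Record SET newest = 0 WHERE name = ?", (name,))
        conn.execute(
            "INSERT INTO Record (name, dateTimeVal, data, newest) VALUES (?, ?, ?, 1)",
            (name, date_time_val, data),
        )


# Hypothetical data, stored in date order.
store(conn, "ann", "2024-01-01", "first")
store(conn, "ann", "2024-02-01", "second")
store(conn, "bob", "2024-01-15", "only")

# Latest record for every person: just filter on the flag.
latest = conn.execute(
    "SELECT name, dateTimeVal, data FROM Record WHERE newest = 1 ORDER BY name"
).fetchall()
print(latest)
```

The trade-off is an extra write on every insert in exchange for a trivially simple read.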
In MySQL and PostgreSQL:
SELECT name, dateTime, data
FROM Record
ORDER BY
dateTime DESC
LIMIT 1
In SQL Server:
SELECT TOP 1 name, dateTime, data
FROM Record
ORDER BY
dateTime DESC
In Oracle:
SELECT *
FROM (
SELECT name, dateTime, data
FROM Record
ORDER BY
dateTime DESC
)
WHERE rownum = 1
Update:
To select the latest record for each person in SQL Server, use this:
WITH q AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY person ORDER BY dateTime DESC) AS rn
FROM Record
)
SELECT *
FROM q
WHERE rn = 1
or this:
SELECT ro.*
FROM (
SELECT DISTINCT person
FROM Record
) d
CROSS APPLY
(
SELECT TOP 1 *
FROM Record r
WHERE r.person = d.person
ORDER BY
dateTime DESC
) ro
See the article "SQL Server: Selecting records holding group-wise maximum" in my blog for the benefits and drawbacks of both solutions.
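For engines without ROW_NUMBER or CROSS APPLY, a third option for "latest record per person" is a correlated subquery. This is a swapped-in technique, not one of the two solutions above. A minimal sketch with Python's built-in sqlite3; the sample rows are made up, and it uses the name column from the question's Record table as the person key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Record (name TEXT, dateTime TEXT, data TEXT);
-- Hypothetical data: ann has two records, bob has one.
INSERT INTO Record VALUES
    ('ann', '2024-01-01', 'a1'),
    ('ann', '2024-03-01', 'a2'),
    ('bob', '2024-02-01', 'b1');
""")

# For each row, compare its dateTime with the maximum dateTime of the same name.
latest_per_name = conn.execute("""
SELECT r.name, r.dateTime, r.data
FROM Record r
WHERE r.dateTime = (
    SELECT MAX(r2.dateTime)
    FROM Record r2
    WHERE r2.name = r.name
)
ORDER BY r.name
""").fetchall()

for row in latest_per_name:
    print(row)
```

Note that if a person has two records with exactly the same latest dateTime, this form returns both of them.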
I tried Milky's advice, but all three ways of constructing the subquery resulted in HQL parser errors.
What does work, though, is a slight change to the first method (extra parentheses added around the subquery):
SELECT name, dateTime, data
FROM Record
WHERE dateTime = (SELECT MAX(dateTime) FROM Record)
PS: This is just for pointing out the obvious to HQL newbies and the like. Thought it would help.