postgresql Select date from min value - sql

I want to select the list of people who have their FIRST 'E' diagnosis on or after 2016-01-01.
My code looks like this:
SELECT DISTINCT ON(form25.visit.patient_id )
form25.visit.patient_id,
CONCAT(demographics.party.firstname,' ', demographics.party.lastname) AS pacientas,
form25.visit.disease_code,
demographics.party.code,
form25.visit.date,
form25.diagnose.disease_type
FROM form25.visit
JOIN demographics.party
ON form25.visit.patient_id = demographics.party.id
JOIN form25.diagnose
ON form25.visit.patient_id = form25.diagnose.card_id
WHERE form25.visit.disease_code LIKE 'E%'
AND form25.diagnose.disease_type = '+'
AND form25.visit.date >= '2016-01-01'
ORDER BY form25.visit.patient_id, form25.visit.date ASC
The result I get:
patients with E.. in 2013, 2014, 2016, 2017
(they contain an E.. in 2016, so they show up)
All I want is to eliminate those who have E.. date(s) before 2016 and see only those who had it for the first time from 2016 on.

First, your query would be much easier to read (and write) if you used table aliases.
Second, you can get what you want by doing an aggregation and then looking at the minimum date:
SELECT v.patient_id, CONCAT(p.firstname, ' ', p.lastname) AS pacientas,
MIN(v.date) as first_date
FROM form25.visit v JOIN
demographics.party p
ON v.patient_id = p.id JOIN
form25.diagnose d
ON v.patient_id = d.card_id
WHERE v.disease_code LIKE 'E%' AND d.disease_type = '+'
GROUP BY v.patient_id, pacientas
HAVING MIN(v.date) >= '2016-01-01';
Note that this gives you the patients that meet the conditions. If you need visit details, you can join back or use window functions.
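A minimal sketch of the MIN/HAVING idea, run from Python against an in-memory SQLite database rather than PostgreSQL. The single visit table, the visit_date column name and the sample rows are made up for illustration, and the joins to party and diagnose are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visit (patient_id INTEGER, disease_code TEXT, visit_date TEXT);
    INSERT INTO visit VALUES
        (1, 'E10', '2013-05-02'),  -- patient 1: first E code is before 2016
        (1, 'E10', '2016-03-01'),
        (2, 'E11', '2016-02-10'),  -- patient 2: first E code is in 2016
        (2, 'E11', '2017-01-15'),
        (3, 'J45', '2012-07-07'),  -- patient 3: the earlier visit is not an E code
        (3, 'E66', '2016-09-09');
""")

# The date filter lives in HAVING, so it is applied to each patient's
# earliest E visit instead of to individual visit rows.
first_time = conn.execute("""
    SELECT patient_id, MIN(visit_date) AS first_date
    FROM visit
    WHERE disease_code LIKE 'E%'
    GROUP BY patient_id
    HAVING MIN(visit_date) >= '2016-01-01'
    ORDER BY patient_id
""").fetchall()

print(first_time)
```

Patient 1 drops out because the earliest E visit is in 2013, even though there is also one in 2016. With the date test in WHERE, as in the question, that patient would still be listed.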

Related

null result for average number of days by month

select a.clientid, a.CaseType, b.EnrollmentStartDate, a.EligibilityStartDate, datediff(day, a.EligibilityStartDate, b.EnrollmentStartDate) as date_diff
INTO ##temptable1
FROM dbo.Client a, dbo.ClientEnrollment b
WHERE a.ClientId = b.ClientId
AND a.CaseType = 99
ORDER BY a.ClientId
select avg (date_diff) from ##temptable1
The above query gives me the overall average number of days it takes a client to enroll in a program from their eligibility start date. I now want to break the results down by month:
select avg (date_diff) from ##temptable1
where EligibilityStartDate = '2019-03-01'
For some reason I'm getting NULL no matter what date I specify (even though the original query produces over 40k results). I've tried inserting EligibilityStartDate = '2019-03-01' into the table itself, but that did not work either.
Presumably, you want something like this:
SELECT YEAR(c.EligibilityStartDate) as yyyy,
MONTH(c.EligibilityStartDate) as mm,
AVG(DATEDIFF(DAY, c.EligibilityStartDate, ce.EnrollmentStartDate)) as date_diff
FROM dbo.Client c JOIN
dbo.ClientEnrollment ce
ON c.ClientId = ce.ClientId AND c.CaseType = 99
GROUP BY YEAR(c.EligibilityStartDate), MONTH(c.EligibilityStartDate)
ORDER BY YEAR(c.EligibilityStartDate), MONTH(c.EligibilityStartDate);
Notes:
Never use commas in the FROM clause.
Always use proper, explicit, standard JOIN syntax.
Use meaningful table aliases (i.e. abbreviations of table names) rather than meaningless ones.
You seem to want an aggregation query.
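The per-month aggregation can be tried out on toy data. This is only a sketch under assumptions: it uses SQLite from Python instead of SQL Server, so strftime() stands in for YEAR()/MONTH() and a julianday() difference stands in for DATEDIFF(DAY, ...). The table names, column names and rows are invented. The last query also shows that AVG over zero matching rows returns NULL, which is one possible explanation for the NULL in the question (for example, if the stored values carry a time part, an equality test against a bare date matches nothing); the thread does not confirm that this is the cause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client (client_id INTEGER, eligibility_start TEXT);
    CREATE TABLE enrollment (client_id INTEGER, enrollment_start TEXT);
    INSERT INTO client VALUES (1, '2019-03-01'), (2, '2019-03-15'), (3, '2019-04-02');
    INSERT INTO enrollment VALUES (1, '2019-03-11'), (2, '2019-04-04'), (3, '2019-04-07');
""")

# Days between the two dates (SQLite substitute for DATEDIFF).
DAYS = "julianday(e.enrollment_start) - julianday(c.eligibility_start)"

# One row per year/month of the eligibility date.
by_month = conn.execute(f"""
    SELECT strftime('%Y', c.eligibility_start) AS yyyy,
           strftime('%m', c.eligibility_start) AS mm,
           AVG({DAYS}) AS avg_days
    FROM client c
    JOIN enrollment e ON c.client_id = e.client_id
    GROUP BY yyyy, mm
    ORDER BY yyyy, mm
""").fetchall()

# A filter that matches no rows: the aggregate comes back as NULL (None).
no_rows = conn.execute(f"""
    SELECT AVG({DAYS})
    FROM client c
    JOIN enrollment e ON c.client_id = e.client_id
    WHERE c.eligibility_start = '2019-03-02'
""").fetchone()[0]

print(by_month)
print(no_rows)
```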

Unpivot date columns to a single column of a complex query in Oracle

Hi guys, I am stuck with a stubborn problem that I am unable to solve. I am trying to compile a report in which all the dates coming from different tables need to go into a single date field. Of course, it is the max, i.e. the most recent, date from all these date columns that needs to go into the single date column of the report. I have multiple users of multiple branches/courses for whom the report would be generated.
There are multiple blogs and the latest date w.r.t to the blogtitle needs to be grouped, i.e. max(date_value) from the six date columns should give the greatest or latest date for that blogtitle.
Expected Result:
select u.batch_uid as ext_person_key, u.user_id, cm.batch_uid as ext_crs_key, cm.crs_id, ir.role_id as
insti_role, (CASE when b.JOURNAL_IND = 'N' then
'BLOG' else 'JOURNAL' end) as item_type, gm.title as item_name, gm.disp_title as ITEM_DISP_NAME, be.blog_pk1 as be_blogPk1, bc.blog_entry_pk1 as bc_blog_entry_pk1,bc.pk1,
b.ENTRY_mod_DATE as b_ENTRY_mod_DATE ,b.CMT_mod_DATE as BlogCmtModDate, be.CMT_mod_DATE as be_cmnt_mod_Date,
b.UPDATE_DATE as BlogUpDate, be.UPDATE_DATE as be_UPDATE_DATE,
bc.creation_date as bc_creation_date,
be.CREATOR_USER_ID as be_CREATOR_USER_ID , bc.creator_user_id as bc_creator_user_id,
b.TITLE as BlogTitle, be.TITLE as be_TITLE,
be.DESCRIPTION as be_DESCRIPTION, bc.DESCRIPTION as bc_DESCRIPTION
FROM users u
INNER JOIN insti_roles ir on u.insti_roles_pk1 = ir.pk1
INNER JOIN crs_users cu ON u.pk1 = cu.users_pk1
INNER JOIN crs_mast cm on cu.crsmast_pk1 = cm.pk1
INNER JOIN blogs b on b.crsmast_pk1 = cm.pk1
INNER JOIN blog_entry be on b.pk1=be.blog_pk1 AND be.creator_user_id = cu.pk1
LEFT JOIN blog_CMT bc on be.pk1=bc.blog_entry_pk1 and bc.CREATOR_USER_ID=cu.pk1
JOIN gradeledger_mast gm ON gm.crsmast_pk1 = cm.pk1 and b.grade_handler = gm.linkId
WHERE cu.ROLE='S' AND BE.STATUS='2' AND B.ALLOW_GRADING='Y' AND u.row_status='0'
AND u.available_ind ='Y' and cm.row_status='0' and u.batch_uid='userA_157'
I am getting a result set for the above query with multiple date columns, which I want to put into a single column. The dates have to be the most recent, i.e. the max of the dates in the date columns.
I have successfully done the unpivot by using a view to store the above result set and putting all the dates in one column. However, I do not want to use a view or a table to store the result set and then unpivot it, simply because I cannot keep creating views for every user one would query for.
The max(date_value) from the date columns needs to be put in one single column. The columns are:
1) b.entry_mod_date, 2) b.cmt_mod_date, 3) be.cmt_mod_date, 4) b.update_date, 5) be.update_date, 6) bc.creation_date
Apologies that I could not provide the desc of all the tables and the fields being used.
Any help getting the max of the dates from these multiple date columns into a single column without using a view or a table would be greatly appreciated.
It is not clear what results you want, but the easiest solution is to use greatest().
with t as (
YOURQUERYHERE
)
select t.*,
greatest(b_ENTRY_mod_DATE, BlogCmtModDate, be_cmnt_mod_Date, BlogUpDate,
be_UPDATE_DATE, bc_creation_date
) as greatestdate
from t;
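One thing to watch with this approach: Oracle's GREATEST returns NULL as soon as any argument is NULL, and the comment table in the question is LEFT JOINed, so its date can be missing. The sketch below illustrates the effect in SQLite from Python, where the multi-argument max() behaves the same way and stands in for GREATEST. The table, columns, rows and the '0001-01-01' fallback value are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (blog_title TEXT, entry_mod TEXT, update_date TEXT, comment_created TEXT);
    INSERT INTO t VALUES
        ('blog A', '2013-01-05', '2013-02-01', '2013-01-20'),
        ('blog B', '2013-03-01', '2013-02-15', NULL);  -- no comment row joined
""")

# Row-wise maximum across the date columns; a single NULL makes the result NULL.
naive = conn.execute("""
    SELECT blog_title, max(entry_mod, update_date, comment_created)
    FROM t ORDER BY blog_title
""").fetchall()

# Replacing a missing date with a very old one keeps the other columns in play.
safe = conn.execute("""
    SELECT blog_title,
           max(entry_mod, update_date, COALESCE(comment_created, '0001-01-01'))
    FROM t ORDER BY blog_title
""").fetchall()

print(naive)
print(safe)
```

In Oracle the same guard would be a COALESCE or NVL with a DATE literal around each nullable argument. To get one latest date per blog title, wrap the row-wise value in MAX(...) with a GROUP BY on the title.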
select <columns>,
case
when greatest (b_ENTRY_mod_DATE) >= greatest (BlogCmtModDate) and greatest(b_ENTRY_mod_DATE) >= greatest(BlogUpDate)
then greatest( b_ENTRY_mod_DATE )
--<same implementation to compare each time BlogCmtModDate and BlogUpDate separately to get the greatest then 'date'>
end as date_value
,<columns>
FROM table
<rest of the query>
UNION ALL
Select <columns>,
case
when greatest (be_cmnt_mod_Date) >= greatest (be_UPDATE_DATE)
then greatest( be_cmnt_mod_Date )
when greatest (be_UPDATE_DATE) >= greatest (be_cmnt_mod_Date)
then greatest( be_UPDATE_DATE )
end as date_value
,<columns>
FROM table
<rest of the query>
UNION ALL
Select <columns>,
GREATEST(bc_creation_date)
,<columns>
FROM table
<rest of the query>

TSQL select clause includes more data than needed

I have a query that is used to pull some data, but when I join another table, it duplicates my results for every record it has joined on in the other table.
I'm sure this is a simple issue I am overlooking, but I can't seem to get it.
My query is here:
SELECT A.[id],
A.[subject],
A.[description],
CONVERT(VARCHAR(17), A.[startTime], 100) as startTime,
CONVERT(VARCHAR(17), A.[endTime], 100) as endTime,
A.[whoCreated],
A.[center],
B.[FirstName],
B.[LastName],
B.[ntid] as empNTID,
C.[centerName],
D.[employee],
E.[segmentID]
FROM Focus_Meetings AS A
JOIN empTable as B
ON A.[whoCreated] = B.[empID]
JOIN Focus_Centers as C
ON A.[center] = C.[id]
JOIN Focus_Attendees as D
ON D.[meetingID] = A.[id]
JOIN Focus_Meetings_Segments as E
ON E.[meetingID] = A.[id]
WHERE
(CAST(A.startTime AS DATE) = CAST(COALESCE(@meetingDate, A.startTime) AS DATE) OR
CAST(A.endTime AS DATE) = CAST(COALESCE(@meetingDate, A.endTime) AS DATE) OR
(E.[segmentID] IN( SELECT ParamValues.x2.value('segment[1]', 'INT')
FROM @meetingSegment.nodes('/segments/theSegment') AS ParamValues(x2))
)
)
FOR XML PATH ('details'), TYPE, ELEMENTS, ROOT ('root');
There is 1 record in the Focus_Meetings table and 5 records in Focus_Meetings_Segments.
My result should only be the one meeting, but it's giving a record for every D.[employee] and E.[segmentID].
I assume that's how it's supposed to work with my query, but that's not my intent.
There are 5 segments attached to the meeting in Focus_Meetings_Segments, and when I search one of them, it should only be showing me the meeting 1 time, not once for each segment.
You are correct that this is how your query is supposed to work. This is a common problem that many people new to JOINS run into.
Essentially, you are currently asking SQL Server to return every set of data based on your JOINS and that is what it is doing. It sounds like what you want is for it to arbitrarily drop records from the result set.
Consider the following simplified version of your result set:
Subject | Description | SegmentId
-----------------------------------------
Whatever | Some desc... | 1
Whatever | Some desc... | 2
Based on your description, you only want the Whatever | Some desc... portion of the results to display one time.
If that is what you want to do, you have a couple of options.
Stop selecting the data (SegmentId) that is causing the records to show twice and only select distinct records.
SELECT DISTINCT Subject, Description...
Specify an aggregate function on the data that is causing records to show twice and group by the rest.
SELECT Subject, Description, MAX(SegmentId)... GROUP BY Subject, Description
You should also evaluate exactly what you need to select vs. what you are selecting. If you are arbitrarily selecting the SegmentId then you probably don't need it in the first place.
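The row multiplication and both options can be seen on a toy example. This is a sketch in SQLite from Python, not SQL Server; the meeting and segment tables and their rows are invented stand-ins for Focus_Meetings and Focus_Meetings_Segments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE meeting (id INTEGER, subject TEXT);
    CREATE TABLE segment (meeting_id INTEGER, segment_id INTEGER);
    INSERT INTO meeting VALUES (1, 'Whatever');
    INSERT INTO segment VALUES (1, 1), (1, 2), (1, 3), (1, 4), (1, 5);
""")

JOIN = "FROM meeting m JOIN segment s ON s.meeting_id = m.id"

# Plain join: one output row per matching segment, so the meeting repeats.
joined = conn.execute(f"SELECT m.subject, s.segment_id {JOIN}").fetchall()

# Option 1: leave the segment column out and ask for distinct rows.
distinct_rows = conn.execute(f"SELECT DISTINCT m.subject {JOIN}").fetchall()

# Option 2: collapse the segment column with an aggregate.
grouped = conn.execute(
    f"SELECT m.subject, MAX(s.segment_id) {JOIN} GROUP BY m.subject"
).fetchall()

print(len(joined), distinct_rows, grouped)
```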
"when I search one of them, it should only be showing me the meeting 1 time, not once for each segment"
Then take Segment and Employee out of the main query and do a subquery:
SELECT A.[id],
A.[subject],
A.[description],
CONVERT(VARCHAR(17), A.[startTime], 100) as startTime,
CONVERT(VARCHAR(17), A.[endTime], 100) as endTime,
A.[whoCreated],
A.[center],
B.[FirstName],
B.[LastName],
B.[ntid] as empNTID,
C.[centerName]
FROM Focus_Meetings AS A
JOIN empTable as B
ON A.[whoCreated] = B.[empID]
JOIN Focus_Centers as C
ON A.[center] = C.[id]
WHERE A.[id] IN
(
SELECT E.[meetingID]
FROM Focus_Meetings_Segments as E
WHERE
(CAST(A.startTime AS DATE) = CAST(COALESCE(@meetingDate, A.startTime) AS DATE) OR
CAST(A.endTime AS DATE) = CAST(COALESCE(@meetingDate, A.endTime) AS DATE) OR
(E.[segmentID] IN( SELECT ParamValues.x2.value('segment[1]', 'INT')
FROM @meetingSegment.nodes('/segments/theSegment') AS ParamValues(x2))
)
)
)
FOR XML PATH ('details'), TYPE, ELEMENTS, ROOT ('root');
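A reduced version of the same idea, sketched in SQLite from Python with invented tables and rows: the segment table is used only as a filter inside IN (...), so however many segments match, each meeting comes back once. The XML parameter handling and the date conditions of the real query are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE meeting (id INTEGER, subject TEXT);
    CREATE TABLE segment (meeting_id INTEGER, segment_id INTEGER);
    INSERT INTO meeting VALUES (1, 'Planning'), (2, 'Review');
    INSERT INTO segment VALUES (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (2, 9);
""")

wanted_segments = (2, 3)  # both belong to meeting 1

# The subquery returns meeting 1 twice, but IN only tests membership,
# so the outer query still yields a single row for that meeting.
rows = conn.execute("""
    SELECT m.id, m.subject
    FROM meeting m
    WHERE m.id IN (SELECT s.meeting_id
                   FROM segment s
                   WHERE s.segment_id IN (?, ?))
    ORDER BY m.id
""", wanted_segments).fetchall()

print(rows)
```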

Trying to select multiple columns on an inner join query with group and where clauses

I'm trying to run a query where it will give me one Sum Function, then select two columns from a joined table and then to group that data by the unique id i gave them. This is my original query and it works.
SELECT Sum (Commission_Paid)
FROM [INTERN_DB2].[dbo].[PaymentList]
INNER JOIN [INTERN_DB2]..[RealEstateAgentList]
ON RealEstateAgentList.AgentID = PaymentList.AgentID
WHERE Close_Date >= '1/1/2013' AND Close_Date <= '12/31/2013'
GROUP BY RealEstateAgentList.AgentID
I've tried the query below, but I keep getting an error and I don't know why. It says its a syntax error.
SELECT Sum (Commission_Paid)
FROM [INTERN_DB2].[dbo].[PaymentList]
INNERJOIN [INTERN_DB2]..[RealEstateAgentList](
Select First_Name, Last_Name
From [Intern_DB2]..[RealEstateAgentList]
Group By Last_name
)
ON RealEstateAgentList.AgentID = PaymentList.AgentID
WHERE Close_Date >= '1/1/2013' AND Close_Date <= '12/31/2013'
GROUP BY RealEstateAgentList.AgentID
Your query has multiple problems:
SELECT rl.AgentId, rl.first_name, rl.last_name, Sum(Commission_Paid)
FROM [INTERN_DB2].[dbo].[PaymentList] pl inner join
(Select AgentID, min(first_name) as first_name, min(last_name) as last_name
From [Intern_DB2]..[RealEstateAgentList]
GROUP BY AgentID
) rl
ON rl.AgentID = pl.AgentID
WHERE Close_Date >= '2013-01-01' AND Close_Date <= '2013-12-31'
GROUP BY rl.AgentID, rl.first_name, rl.last_name;
Here are some changes:
INNERJOIN --> inner join.
Fixed the syntax of the subquery next to the table name.
Removed columns for first and last name. They are not used.
Changed the subquery to include AgentID.
Added AgentID, first_name, and last_name to the outer aggregation, so you can tell where the values are coming from.
Changed the date formats to a less ambiguous standard form.
Added table alias for subquery.
I suspect the subquery on the agent list is not important. You can probably do:
SELECT rl.AgentId, rl.first_name, rl.last_name, Sum(pl.Commission_Paid)
FROM [INTERN_DB2].[dbo].[PaymentList] pl inner join
[Intern_DB2]..[RealEstateAgentList] rl
ON rl.AgentID = pl.AgentID
WHERE pl.Close_Date >= '2013-01-01' AND pl.Close_Date <= '2013-12-31'
GROUP BY rl.AgentID, rl.first_name, rl.last_name;
EDIT:
I'm glad this solution helped. As you continue to write queries, try to always do the following:
Use table aliases that are abbreviations of the table names.
Always use table aliases when referring to columns.
When using date constants, either use "YYYY-MM-DD" format or use convert() to convert a string using the specified format. (The latter is actually the safer method, but the former is more convenient and works in almost all databases.)
Pay attention to the error messages; they can be informative in SQL Server (unfortunately, other databases are not so clear).
Format your query so other people can understand it. This will help you understand and debug your queries as well. I have a very particular formatting style (which no one is going to change at this point); the important thing is not the particular style but being able to "see" what the query is doing. My style is documented in my book "Data Analysis Using SQL and Excel".
There are other rules, but these are a good way to get started.
SELECT Sum (Commission_Paid)
FROM [INTERN_DB2].[dbo].[PaymentList] pl
INNER JOIN (
Select AgentID, First_Name, Last_Name
From [Intern_DB2]..[RealEstateAgentList]
Group By AgentID, First_Name, Last_Name
) x ON x.AgentID = pl.AgentID
WHERE Close_Date >= '1/1/2013'
AND Close_Date <= '12/31/2013'
GROUP BY x.AgentID
This is how the query should look... however, if you subquery first and last name, you'll also have to include them in the group by. Assuming Close_Date is in the PaymentList table, this is how I would write the query:
SELECT
al.AgentID,
al.First_Name,
al.Last_Name,
Sum(pl.Commission_Paid) AS Commission_Paid
FROM [INTERN_DB2].[dbo].[PaymentList] pl
INNER JOIN [Intern_DB2].dbo.[RealEstateAgentList] al ON al.AgentID = pl.AgentID
WHERE YEAR(pl.Close_Date) = 2013
GROUP BY al.AgentID, al.First_Name, al.Last_Name
Subqueries are evil, for the most part. There's no need for one here, because you can just get the columns from the join.
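The join-then-aggregate shape can be checked on a few rows. This is a sketch in SQLite from Python with invented agent and payment tables. One deliberate variation, which is mine and not from the answers above: the year is expressed as a half-open range (>= '2013-01-01' and < '2014-01-01'), which also covers values with a time part on the last day of the year and does not wrap the column in a function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE agent (agent_id INTEGER, first_name TEXT, last_name TEXT);
    CREATE TABLE payment (agent_id INTEGER, commission_paid INTEGER, close_date TEXT);
    INSERT INTO agent VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
    INSERT INTO payment VALUES
        (1, 100, '2013-02-01'),
        (1,  50, '2013-12-31'),
        (1,  70, '2014-01-01'),  -- outside 2013, must not be counted
        (2,  30, '2013-06-15');
""")

# Every non-aggregated column in the SELECT list also appears in GROUP BY.
totals = conn.execute("""
    SELECT a.agent_id, a.first_name, a.last_name,
           SUM(p.commission_paid) AS commission_paid
    FROM payment p
    JOIN agent a ON a.agent_id = p.agent_id
    WHERE p.close_date >= '2013-01-01' AND p.close_date < '2014-01-01'
    GROUP BY a.agent_id, a.first_name, a.last_name
    ORDER BY a.agent_id
""").fetchall()

print(totals)
```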

Group by Week beginning Sunday

I have a table with eventdatetime, userid, etc. The data is inserted into the table daily.
For the report, I need to give the count of userid and projectid grouped by week (Tue-Mon) for a month range at a time.
I need help on grouping the data by week for month. I'm using Oracle.
select count(distinct( table1.projectid))as Projects, count(distinct( table2.userid)) as Users,??
from table1
join table2
on table1.a= table2.a
where table1.e='1'
and table1.eventdatetime between sysdate-30 and sysdate-1
group by ??
I want the output to be grouped by week like :
WeekBegin
2013-04-14
2013-04-21
http://www.techonthenet.com/oracle/functions/to_char.php Use the To_Char function with IW to get the week. Then you can GROUP BY that IW value.
Note that IW is the ISO week, which always starts on Monday regardless of the database's language settings. It is the D format element and TRUNC(date, 'DAY') whose first day of the week depends on the territory setting (NLS_TERRITORY). For weeks that begin on Sunday, you can shift the date by one day before truncating, e.g. TRUNC(eventdatetime + 1, 'IW') - 1, and GROUP BY that expression.
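Bucketing by a week that begins on Sunday can be sketched outside Oracle to check the arithmetic. The example uses SQLite from Python; the events table and its rows are made up, and SQLite's 'weekday 0' date modifier (move forward to the next Sunday unless the date already is one) takes the place of an Oracle date expression. A small Python helper computes the same week start for comparison.

```python
import sqlite3
from datetime import date, timedelta

def week_start_sunday(d: date) -> date:
    """Return the Sunday on or before d."""
    # weekday(): Monday == 0 ... Sunday == 6, so this is the days since Sunday.
    return d - timedelta(days=(d.weekday() + 1) % 7)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (user_id INTEGER, event_date TEXT);
    INSERT INTO events VALUES
        (1, '2013-04-14'),  -- Sunday
        (2, '2013-04-15'),
        (1, '2013-04-20'),  -- Saturday, same week as the 14th
        (3, '2013-04-21'),  -- Sunday, starts the next week
        (3, '2013-04-23');
""")

# Going back six days and then forward to the next Sunday lands on the
# Sunday that begins the week containing event_date.
weekly = conn.execute("""
    SELECT date(event_date, '-6 days', 'weekday 0') AS week_begin,
           COUNT(DISTINCT user_id) AS users
    FROM events
    GROUP BY week_begin
    ORDER BY week_begin
""").fetchall()

print(weekly)
```

The grouping key is the week's first date rather than a week number, which matches the WeekBegin output asked for in the question and avoids week numbers colliding across years.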
if the example you have posted is your work in progress version - before worrying about getting the days of the week in you should look into getting the basics of the query right
you are selecting e.projectid and u.userid but you haven't got any tables named e or u in your query - it looks like you want to alias them as e and u?
the where clause of your query is also looking for the table e which isn't present
in that case you should change
from table1
join table2
on table1.a= table2.a
to
from table1 e -- select from table1 using alias e
join table2 u -- join table2 using alias u
on ( e.a = u.a ) -- joining on column a from table1 (e) = a from table2 (u)
once you have replaced the a's in the on section with the column names you want to join on, it might well run after you remove the last column ", ??" from the select, perhaps something along these lines
select
count (e.projectid) PROJECTS,
count (u.userid) USERS
from table1 e
join table2 u
on ( e.a = u.a )
where e.FILTERING_COLUMN = '1'
and e.eventdatetime >= sysdate-30
note that as sysdate is the current time on the server (depending on localisation and session settings), you can use greater than sysdate-30 instead of between, which may well give the query optimiser an easier time if the table is suitably indexed
the basic rule for grouping is that to select a column you need to either be grouping by it or using an aggregate function such as COUNT()
so you'll probably want something like
select
count (e.projectid) PROJECTS,
count (u.userid) USERS,
to_char(e.eventdatetime,'MM') MONTH
from table1 e
join table2 u
on ( e.a = u.a )
where e.FILTERING_COLUMN = '1'
and e.eventdatetime >= sysdate-30
group by to_char(e.eventdatetime,'MM')
though this won't be the most optimal way to do this; it would be easier if you posted the schemas involved in the issue