2 column with same ID to 1 row - sql

I have a table with only two columns, as follows:
|ID | Date |
===================
|1 | 03/04/2017 |
|1 | 09/07/1997 |
|2 | 04/04/2014 |
I want to achieve the following end result:
|ID | Date 1 |Date 2 |
================================
|1 | 03/04/2017 | 09/07/1997 |
|2 | 04/04/2014 | NULL |
I'm currently reading up on the PIVOT function, but I'm not sure whether I'm on the right track. I'm still new to SQL.

A simple pivot query should work here, with a twist. For your ID 2 data, there is only one row, but in this case you want to report a first date and a NULL second date. We can use a CASE expression to handle this case.
SELECT
ID,
MAX(Date) AS date_1,
CASE WHEN COUNT(*) = 2 THEN MIN(Date) ELSE NULL END AS date_2
FROM yourTable
GROUP BY ID
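To check the query against the sample rows, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database and the ISO-formatted date strings are assumptions made for illustration (the question does not say how the dates are stored), and an ORDER BY is added only to make the row order stable.

```python
import sqlite3

# In-memory database; the table layout mirrors the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (ID INTEGER, Date TEXT)")
# Assumption: dates stored as ISO strings so MIN/MAX compare chronologically.
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?)",
    [(1, "2017-04-03"), (1, "1997-07-09"), (2, "2014-04-04")],
)

rows = conn.execute(
    """
    SELECT ID,
           MAX(Date) AS date_1,
           CASE WHEN COUNT(*) = 2 THEN MIN(Date) ELSE NULL END AS date_2
    FROM yourTable
    GROUP BY ID
    ORDER BY ID
    """
).fetchall()

for row in rows:
    print(row)
```

ID 2 has a single row, so its second date comes back as NULL (None in Python).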

This can be done easily using the MIN/MAX aggregate functions:
select Id,min(Date),
case when min(Date)<>max(Date) then max(Date) end
From yourtable
Group by Id
If this does not work with your original data, then update the sample data and the expected result.

Related

SQL DB2 Split result of group by based on count

I would like to split the result of a GROUP BY into several rows based on a count, but I don't know if it's possible. For instance, if I have a query like this:
SELECT doc.client, doc.template, COUNT(doc) FROM document doc GROUP BY doc.client, doc.template
and a table document with the following data :
ID | name | client | template
1 | doc_a | a | temp_a
2 | doc_b | a | temp_a
3 | doc_c | a | temp_a
4 | doc_d | a | temp_b
The result for the query would be :
client | template | count
a | temp_a | 3
a | temp_b | 1
But I would like to split a row of the result into two or more rows if the count is higher than 2:
client | template | count
a | temp_a | 2
a | temp_a | 1
a | temp_b | 1
Is there a way to do this in SQL?
You can use a recursive common table expression (RCTE) like the one below. Run this statement as is first, experimenting with different values in the last column. The maximum batch size here is 1000.
WITH
GRP_RESULT (client, template, count) AS
(
-- Place your SELECT ... GROUP BY here
-- instead of VALUES
VALUES
('a', 'temp_a', 4500)
, ('a', 'temp_b', 3001)
)
, T (client, template, count, max_batch_size) AS
(
SELECT client, template, count, 1000
FROM GRP_RESULT
UNION ALL
SELECT client, template, count - max_batch_size, max_batch_size
FROM T
WHERE count > max_batch_size
)
SELECT client, template, CASE WHEN count > max_batch_size THEN max_batch_size ELSE count END count
FROM T
ORDER BY client, template, count DESC
The result is:
|CLIENT|TEMPLATE|COUNT |
|------|--------|-----------|
|a |temp_a |1000 |
|a |temp_a |1000 |
|a |temp_a |1000 |
|a |temp_a |1000 |
|a |temp_a |500 |
|a |temp_b |1000 |
|a |temp_b |1000 |
|a |temp_b |1000 |
|a |temp_b |1 |
Afterwards, you can substitute your SELECT ... GROUP BY statement where indicated above to achieve your goal.
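As a rough check of the recursive approach on the question's own data, the sketch below runs the same idea in SQLite through Python's sqlite3 module with a batch size of 2. SQLite stands in for DB2 here, which is an assumption: it needs the RECURSIVE keyword, and the column is named cnt rather than count for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE document (id INTEGER, name TEXT, client TEXT, template TEXT)"
)
conn.executemany(
    "INSERT INTO document VALUES (?, ?, ?, ?)",
    [
        (1, "doc_a", "a", "temp_a"),
        (2, "doc_b", "a", "temp_a"),
        (3, "doc_c", "a", "temp_a"),
        (4, "doc_d", "a", "temp_b"),
    ],
)

rows = conn.execute(
    """
    WITH RECURSIVE
    grp_result (client, template, cnt) AS (
        SELECT client, template, COUNT(*)
        FROM document
        GROUP BY client, template
    ),
    t (client, template, cnt, max_batch_size) AS (
        -- Anchor: one row per group with its full count.
        SELECT client, template, cnt, 2 FROM grp_result
        UNION ALL
        -- Recursive step: peel off one batch while more than a batch remains.
        SELECT client, template, cnt - max_batch_size, max_batch_size
        FROM t
        WHERE cnt > max_batch_size
    )
    SELECT client, template,
           CASE WHEN cnt > max_batch_size THEN max_batch_size ELSE cnt END
               AS batch_count
    FROM t
    ORDER BY client, template, batch_count DESC
    """
).fetchall()

for row in rows:
    print(row)
```

The group with three documents is reported as a batch of 2 followed by a batch of 1.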
You can use window functions and then aggregate:
SELECT client, template, COUNT(*)
FROM (SELECT doc.client, doc.template,
ROW_NUMBER() OVER (PARTITION BY doc.client, doc.template ORDER BY doc.client) - 1 as seqnum,
COUNT(*) OVER (PARTITION BY doc.client, doc.template) as cnt
FROM document doc
) d
GROUP BY client, template, FLOOR(seqnum / 2)
The subquery enumerates the rows. The outer query then splits the rows into groups of two by grouping on seqnum divided by 2 and rounded down.
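The window-function idea can be sketched the same way: number the rows within each (client, template) group starting at 0, then group on that number divided by the batch size. The example below uses SQLite via Python's sqlite3 module as a stand-in for DB2 (an assumption; window functions need SQLite 3.25 or later), with a batch size of 2 and integer division in place of FLOOR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE document (id INTEGER, name TEXT, client TEXT, template TEXT)"
)
conn.executemany(
    "INSERT INTO document VALUES (?, ?, ?, ?)",
    [
        (1, "doc_a", "a", "temp_a"),
        (2, "doc_b", "a", "temp_a"),
        (3, "doc_c", "a", "temp_a"),
        (4, "doc_d", "a", "temp_b"),
    ],
)

rows = conn.execute(
    """
    SELECT client, template, COUNT(*) AS batch_count
    FROM (
        SELECT client, template,
               -- 0-based position of the row inside its group
               ROW_NUMBER() OVER (PARTITION BY client, template ORDER BY id) - 1
                   AS seqnum
        FROM document
    ) d
    -- seqnum / 2 is integer division: rows 0,1 -> bucket 0, rows 2,3 -> bucket 1
    GROUP BY client, template, seqnum / 2
    ORDER BY client, template, batch_count DESC
    """
).fetchall()

for row in rows:
    print(row)
```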

Analytic function - Comparing values using LAG()

Assume the following data:
| Col1 | Col2 |
| 3 | 20-dec-15 |
| 4 | 20-dec-15 |
| 8 | 25-dec-15 |
|10 | 25-dec-15 |
I have to compare the values of column Col1 within a particular date.
For example, for 20-dec-15 a change occurred, since 3 changed to 4.
I have to solve this using an analytic function.
This is the query I am using:
decode(LAG(Col1,1,Col1) OVER (partition by Col2 order by Col2),Col1,0,1) Changes
Since Col2 is a date column, partitioning by it is not working for me. Can a date column be used in PARTITION BY?
Expected Result should be:
| Changes |
| 0 |
| 1 |
| 0 |
| 1 |
Here 1 means a change occurred when comparing values for the same date.
You need to use trunc() in the PARTITION BY to reset the time part to 00:00:00, but you should still keep ORDER BY col2 so that all rows on the same day are ordered by the time part.
I also prefer an explicit CASE for this kind of comparison; personally, I find decode() really hard to read:
select case
when col1 = lag(col1,1,col1) over (partition by trunc(col2) order by col2) then 0
else 1
end as changes
from the_table;
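The effect of partitioning by the day while ordering by the full timestamp can be tried out with the sketch below, which uses SQLite through Python's sqlite3 module. This is an approximation of the Oracle query: SQLite's date() plays the role of trunc(), the times of day are invented for illustration, and window functions require SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (col1 INTEGER, col2 TEXT)")
# Assumption: the time parts are made up; the question only shows the dates.
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?)",
    [
        (3, "2015-12-20 09:00:00"),
        (4, "2015-12-20 10:00:00"),
        (8, "2015-12-25 09:00:00"),
        (10, "2015-12-25 10:00:00"),
    ],
)

rows = conn.execute(
    """
    SELECT CASE
             WHEN col1 = LAG(col1, 1, col1)
                         OVER (PARTITION BY date(col2) ORDER BY col2) THEN 0
             ELSE 1
           END AS changes
    FROM the_table
    ORDER BY col2
    """
).fetchall()

changes = [r[0] for r in rows]
print(changes)
```

The first row of each day compares against itself (the LAG default) and yields 0; the second row of each day differs from its predecessor and yields 1.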

Select most recent date and calculate avg of multiple values in different rows

I am building a query for a table that has several columns:
name
date
about 10 columns with integer values
Now I want to return a table with one row for each unique value in the name column.
Each row has three columns: name, date, and the average of the integer value columns.
The date is the most recent date for that name.
The average is taken over all integer values in that particular row.
SELECT name, Max(date), SUM(value1+value2+value3+value4...+value10)/10
FROM myTable
WHERE *join statements*
Group by name
The problem is that, because SUM aggregates the values across all rows for a name, the computed value is not the average of the single most recent row.
Adding the value columns to the GROUP BY clause is not an option, because the result should contain only one row per name.
I hope the problem is clear. Any ideas? Thanks!
Edit:
Thank you for your reply Gordon Linoff! The formulation of my problem was not that clear, sorry for that.
Actually, there can be entries with the same name and date. I therefore need to aggregate these entries when they fall on the most recent date for that name.
To clarify, here a possible table:
|Name |Date |value1|value2|
|Name A | 14/09/24 | 1 | 2 |
|Name A | 14/09/24 | 2 | 1 |
|Name A | 14/09/24 | 9 | 9 |
|Name A | 14/09/22 | 4 | 3 |
|Name B | 14/09/23 | 3 | 5 |
|Name B | 14/09/22 | 2 | 4 |
|Name B | 14/09/21 | 4 | 2 |
|Name C | 14/09/23 | 5 | 1 |
The result should be:
|Name |Date |avg|
|Name A | 14/09/24 | 4 |
|Name B | 14/09/23 | 4 |
|Name C | 14/09/23 | 3 |
With your hint, I think I found the right query for this problem, where 2 is the number of values per row:
SELECT name, max(date), avg(value1+value2)/2
FROM myTable t
WHERE not exists (select 1
from myTable t2
where t2.name = t.name and t2.date > t.date
)
group by name
You don't actually want an aggregate. You want to choose the most recent row for each name. Here is a method that uses not exists:
SELECT name, date, (value1+value2+value3+value4...+value10)/10
FROM myTable t
WHERE not exists (select 1
from myTable t2
where t2.name = t.name and t2.date > t.date
);
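To see how the NOT EXISTS filter combines with the aggregation from the asker's edit, here is a small sketch that loads the sample table into SQLite via Python's sqlite3 module. The dates are kept as the text values shown in the question, which happen to sort correctly as strings; a real date column is assumed in practice. The aliases latest_date and avg_value are illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (name TEXT, date TEXT, value1 INTEGER, value2 INTEGER)"
)
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?, ?)",
    [
        ("Name A", "14/09/24", 1, 2),
        ("Name A", "14/09/24", 2, 1),
        ("Name A", "14/09/24", 9, 9),
        ("Name A", "14/09/22", 4, 3),
        ("Name B", "14/09/23", 3, 5),
        ("Name B", "14/09/22", 2, 4),
        ("Name B", "14/09/21", 4, 2),
        ("Name C", "14/09/23", 5, 1),
    ],
)

rows = conn.execute(
    """
    SELECT name,
           MAX(date) AS latest_date,
           AVG(value1 + value2) / 2 AS avg_value
    FROM myTable t
    -- Keep only rows for which no later row with the same name exists.
    WHERE NOT EXISTS (SELECT 1
                      FROM myTable t2
                      WHERE t2.name = t.name AND t2.date > t.date)
    GROUP BY name
    ORDER BY name
    """
).fetchall()

for row in rows:
    print(row)
```

Only the rows on each name's latest date survive the filter, so the three "Name A" rows from 14/09/24 are averaged together while the 14/09/22 row is ignored.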

Column 'Course.Course_Name' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause

I need to link columns from two tables; please help me. This is my code:
SELECT Student.Stu_Course_ID, Course.Course_Name, COUNT(Student.Stu_ID) AS NoOfStudent FROM Student
INNER JOIN Course
ON Student.Stu_Course_ID=Course.Course_ID
GROUP BY Stu_Course_ID;
This is my course table:
__________________________________________
|Course_ID | Course_Name |
|1 | B.Eng in Software Engineering |
|2 | M.Eng in Software Engineering |
|3 | BSC in Business IT |
I got the number of students from the Student table:
_____________________________
|Stu_Course_ID | NoOfStudents |
|1 | 30 |
|2 | 12 |
|3 | 20 |
This is what i want
____________________________________________________________
|Stu_Course_ID | Course_Name | NoOfStudents|
|1 | B.Eng in Software Engineering | 30 |
|2 | M.Eng in Software Engineering | 12 |
|3 | BSC in Business IT | 20 |
You need to add Course.Course_Name to your group by clause:
SELECT Student.Stu_Course_ID,
Course.Course_Name,
COUNT(Student.Stu_ID) AS NoOfStudent
FROM Student
INNER JOIN Course
ON Student.Stu_Course_ID=Course.Course_ID
GROUP BY Student.Stu_Course_ID, Course.Course_Name;
Imagine the following simple table (T):
ID | Column1 | Column2 |
----|---------+----------|
1 | A | X |
2 | A | Y |
Your query is similar to this:
SELECT ID, Column1, COUNT(*) AS Count
FROM T
GROUP BY Column1;
You know there are 2 records with A in Column1, so you expect a count of 2. However, you are also selecting ID, and there are two different values of ID where Column1 = A, so the following result:
ID | Column1 | Count |
----|---------+----------|
1 | A | 2 |
Is no more or less correct than
ID | Column1 | Count |
----|---------+----------|
2 | A | 2 |
This is why ID cannot appear in the select list unless it is included in the GROUP BY clause or used inside an aggregate function.
For what it's worth, if Course_ID is the primary key of the Course table, then the following query is legal according to the SQL standard and will work in PostgreSQL; I suspect that at some point Microsoft will build this functionality into SQL Server too:
SELECT Course.Course_ID,
Course.Course_Name,
COUNT(Student.Stu_ID) AS NoOfStudent
FROM Student
INNER JOIN Course
ON Student.Stu_Course_ID=Course.Course_ID
GROUP BY Course.Course_ID;
The reason for this is that, since Course.Course_ID is the primary key of Course, there can be no duplicates of it in the table, and therefore there can only be one value of Course_Name for each Course_ID.
List every non-aggregated column you want to retrieve in the GROUP BY clause, so you have to include Course.Course_Name as well.
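The corrected query, with both non-aggregated columns in the GROUP BY, can be tried out with the sketch below using Python's sqlite3 module. The student rows are hypothetical and much smaller than the counts in the question. Note that SQLite is lenient about bare columns in aggregate queries, so it would not reproduce the SQL Server error; the sketch only shows the shape of the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Course (Course_ID INTEGER PRIMARY KEY, Course_Name TEXT)")
conn.execute("CREATE TABLE Student (Stu_ID INTEGER PRIMARY KEY, Stu_Course_ID INTEGER)")
conn.executemany(
    "INSERT INTO Course VALUES (?, ?)",
    [
        (1, "B.Eng in Software Engineering"),
        (2, "M.Eng in Software Engineering"),
        (3, "BSC in Business IT"),
    ],
)
# Hypothetical enrolments: 2 students in course 1, 1 in course 2, 3 in course 3.
conn.executemany(
    "INSERT INTO Student VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 2), (4, 3), (5, 3), (6, 3)],
)

rows = conn.execute(
    """
    SELECT Student.Stu_Course_ID,
           Course.Course_Name,
           COUNT(Student.Stu_ID) AS NoOfStudent
    FROM Student
    INNER JOIN Course
            ON Student.Stu_Course_ID = Course.Course_ID
    GROUP BY Student.Stu_Course_ID, Course.Course_Name
    ORDER BY Student.Stu_Course_ID
    """
).fetchall()

for row in rows:
    print(row)
```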

SQL: get maximum value and its corresponding field(s)

I need to get the max lesson_score from the following table, along with the respective date for a particular user:
--------------------------------
|uid |lesson_score |date |
--------------------------------
|1 |2 |1391023460 |
|1 |8 |1391023518 |
|1 |4 |1391023596 |
--------------------------------
I need a result of:
---------------------------
|lesson_score |date |
---------------------------
|8 |1391023518 |
---------------------------
My SQL looks like this:
SELECT date, MAX(lesson_score) AS lesson_score
FROM cdu_user_session_progress
WHERE uid = 1
GROUP BY date;
But it just gives me three rows:
---------------------------
|lesson_score |date |
---------------------------
|2 |1391023460 |
|4 |1391023596 |
|8 |1391023518 |
---------------------------
What am I doing wrong? Thanks!
SELECT date, MAX(lesson_score) AS lesson_score
FROM cdu_user_session_progress
WHERE uid = 1
GROUP BY date;
MAX is an aggregate function: it returns the maximum lesson_score within each group.
Because your query groups by date and every date is different, each group contains a single row, so MAX simply returns that row's own lesson_score. With GROUP BY, MAX is not evaluated over the whole table; it is evaluated once per group.
You can get your result using ORDER BY like this:
SELECT top 1 date, lesson_score AS lesson_score
FROM cdu_user_session_progress
WHERE uid = 1
ORDER BY lesson_score DESC;
Try using
SELECT lesson_score, date FROM cdu_user_session_progress WHERE uid = 1 ORDER BY lesson_score DESC LIMIT 1;
The ORDER BY part ensures that the row with the maximum lesson_score comes first.
After the ORDER BY, you get the following result:
---------------------------
|lesson_score |date |
---------------------------
|8 |1391023518 |
|4 |1391023596 |
|2 |1391023460 |
---------------------------
Now the LIMIT part tells the database to return only the first row; all other result rows are ignored, and the result is this:
---------------------------
|lesson_score |date |
---------------------------
|8 |1391023518 |
---------------------------
To get the minimum score, just write ASC instead of DESC (or omit it, because ASC is the default).
Why do you want the GROUP BY? Just use ORDER BY ColumnName DESC LIMIT 1 in your query. It returns all the records because uid = 1 is true for all of them, and grouping by date just shows the MAX for each date.
SELECT [date], lesson_score
FROM cdu_user_session_progress
WHERE lesson_score = (SELECT MAX(lesson_score) FROM cdu_user_session_progress GROUP BY [uid] HAVING [uid] = 1)
AND [uid] =1;
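Both approaches from the answers (sort and take the first row, or match against a MAX subquery) can be compared with the sketch below, run in SQLite through Python's sqlite3 module. The row for uid 2 is hypothetical and was added only to show that the uid filter matters; LIMIT is the SQLite/MySQL spelling of what SQL Server writes as TOP 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cdu_user_session_progress "
    "(uid INTEGER, lesson_score INTEGER, date INTEGER)"
)
conn.executemany(
    "INSERT INTO cdu_user_session_progress VALUES (?, ?, ?)",
    [
        (1, 2, 1391023460),
        (1, 8, 1391023518),
        (1, 4, 1391023596),
        (2, 9, 1391023700),  # hypothetical second user, not in the question
    ],
)

# Approach 1: sort the user's rows by score and keep the top one.
top_by_order = conn.execute(
    """
    SELECT lesson_score, date
    FROM cdu_user_session_progress
    WHERE uid = 1
    ORDER BY lesson_score DESC
    LIMIT 1
    """
).fetchone()

# Approach 2: find the user's maximum score, then fetch the matching row(s).
top_by_subquery = conn.execute(
    """
    SELECT lesson_score, date
    FROM cdu_user_session_progress
    WHERE uid = 1
      AND lesson_score = (SELECT MAX(lesson_score)
                          FROM cdu_user_session_progress
                          WHERE uid = 1)
    """
).fetchall()

print(top_by_order)
print(top_by_subquery)
```

The two differ when scores tie: the ORDER BY version returns one arbitrary row among the tied ones, while the subquery version returns all of them.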