In SQL, how to select rows with matching values in one column, based on earliest date in another column - sql

I think this should be a simple SQL exercise, but I am not sure how it is done as I am new to querying dbs with SQL. I have a table that looks like this:
select * from myschema.mytable
customer_name date
nick 2017-06-19 19:26:40
tom 2017-06-21 19:24:40
peter 2017-06-23 21:25:10
nick 2017-06-24 13:43:39
I'd like for this query to return only one row for each unique name. Specifically, I'd like the query to return the rows for each customer_name with the earliest date. In this case, the first row for nick should be returned (with date 2017-06-19), but not the other row with date 2017-06-24.
Is this a simple exercise in SQL?
Thanks!

A simple MIN will do:
SELECT
customer_name,
MIN(date) AS earliest_date
FROM myschema.mytable
GROUP BY customer_name;

For this kind of problem you can use aggregate functions. Basically, you group the rows by name and choose the minimum date:
select customer_name, MIN(date)
FROM myschema.mytable
GROUP BY customer_name
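To check the GROUP BY / MIN approach against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory table (named mytable, without the schema prefix) and the TEXT date column are assumptions made for the demo; the asker's real database may store dates differently.

```python
import sqlite3

# In-memory SQLite database as a stand-in for myschema.mytable (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (customer_name TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        ("nick", "2017-06-19 19:26:40"),
        ("tom", "2017-06-21 19:24:40"),
        ("peter", "2017-06-23 21:25:10"),
        ("nick", "2017-06-24 13:43:39"),
    ],
)

# One row per customer, keeping the earliest timestamp.
# ISO-8601 text sorts chronologically, so MIN works on the TEXT column.
earliest = conn.execute(
    """
    SELECT customer_name, MIN(date) AS earliest_date
    FROM mytable
    GROUP BY customer_name
    ORDER BY customer_name
    """
).fetchall()

for row in earliest:
    print(row)
```

Only the 2017-06-19 row comes back for nick; the ORDER BY is there just to make the output order predictable.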

Related

How to calculate ages in Oracle SQL

I have two databases on the same server (dbA and dbB). Both have a table called CUSTOMERS. I want to calculate the age of my customers and insert it into a column AGE on dbB.CUSTOMERS, based on the current date and the DateOfBirth column of the dbA.CUSTOMERS table.
To calculate the age I tried to use
SELECT floor(months_between(SYSDATE, (SELECT BIRTH_DATE FROM dbA.CUSTOMERS)) /12) from dual;
but this returns ORA-01427: single-row subquery returns more than one row
I am guessing this is because the subquery returns every row of the BIRTH_DATE column. Is there some way to do this for all the rows and then insert the result into my dbB.CUSTOMERS table?
I am using OracleSQL
There is no need for the subquery here
select floor(months_between(sysdate,BIRTH_DATE)/12) as Age
from CUSTOMERS
The subquery returns every row of the table at once, but months_between expects a single value, hence the error. Selecting directly from the table applies the function to each row in turn, so it works.
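The "one value per row" idea can be sketched outside the database too. The snippet below is only an illustration in Python: age_in_years is a hypothetical helper that uses the plain calendar rule (has the birthday already happened this year?), which is not a reimplementation of Oracle's months_between and may differ from it on end-of-month edge cases. The customer rows and the fixed "today" are made up for the demo.

```python
from datetime import date


def age_in_years(birth_date: date, today: date) -> int:
    """Whole years elapsed between birth_date and today (calendar rule)."""
    had_birthday = (today.month, today.day) >= (birth_date.month, birth_date.day)
    return today.year - birth_date.year - (0 if had_birthday else 1)


# Hypothetical rows standing in for dbA.CUSTOMERS: (id, BIRTH_DATE).
customers = [
    (1, date(1990, 6, 15)),
    (2, date(2000, 1, 1)),
]

# Fixed date so the result is reproducible; real code would use date.today().
today = date(2024, 6, 14)

# The function is applied to each row separately, never to the whole column.
ages = {customer_id: age_in_years(born, today) for customer_id, born in customers}
print(ages)
```

Customer 1 has not yet had a birthday in 2024 on the chosen date, so the age is one less than the difference in years.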

I want to merge the duplicate names in my table or at least see the names that are unique and look alike

I have an employee table with the following schema:
Id Name Birthday DeathDay Startdate EndDate
The problem is that I have data as follows:
Bergh Celestin 06/09/1791 14/12/1861
Bergh Célestin 06/09/1791 14/12/1861
Bergh Francois 04/04/1958 11/12/2001
Bergh Jozef Francois 04/04/1958 11/12/2001
Now I want to merge these records into one, as they are the same person. How can I do that?
Also, if I just want to list only those people in the table whose names are possibly the same, like above, how can I do that?
I used:
select Distinct name,birthday,deathday from table
but that is not good enough.
I would use a function (.NET or SQL) of sorts to remove the accents as per https://stackoverflow.com/a/12715102/1662973 and then group on that together with the dates. You will need to group on something, as essentially "Bergh Célestin" could actually be a different person to "Bergh Celestin".
Sample:
select
RemoveExtraChars(name)
,birthday
,deathday
from
TABLE
group by
RemoveExtraChars(name)
,birthday
,deathday
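The remove-accents-then-group idea can be sketched in Python as follows. remove_accents is a hypothetical stand-in for the RemoveExtraChars function above, built on the standard unicodedata module; the rows are the sample data from the question.

```python
import unicodedata
from collections import defaultdict


def remove_accents(text: str) -> str:
    """Strip combining marks, e.g. 'Célestin' -> 'Celestin'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Sample rows from the question: (Name, Birthday, DeathDay).
rows = [
    ("Bergh Celestin", "06/09/1791", "14/12/1861"),
    ("Bergh Célestin", "06/09/1791", "14/12/1861"),
    ("Bergh Francois", "04/04/1958", "11/12/2001"),
    ("Bergh Jozef Francois", "04/04/1958", "11/12/2001"),
]

# Group on the accent-free name together with both dates.
groups = defaultdict(list)
for name, birthday, deathday in rows:
    groups[(remove_accents(name), birthday, deathday)].append(name)

# Keep only groups that contain more than one spelling.
possible_duplicates = {key: names for key, names in groups.items() if len(names) > 1}
print(possible_duplicates)
```

This catches the Celestin / Célestin pair but not "Bergh Francois" versus "Bergh Jozef Francois", since those names differ by more than an accent; grouping on the dates alone would flag that pair, at the cost of more false positives.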
For your second question you can use the SQL LIKE operator:
SELECT column_name(s)
FROM table_name
WHERE column_name LIKE pattern;

SQL server select max two dates from a table

I would like to know if it is possible to select the two most recent dates from a column in a table. Please see the simple example below. I know that to get the max date I can use the max function. I'm also aware that I could then run another max query with a where condition stating that the date must be less than the one returned by my first max query. I was wondering, though, whether there is a way of doing this in one query?
Name DateAdded
ABC 2014-04-20
ABC 2014-04-20
ABC 2014-03-01
ABC 2014-03-01
ABC 2014-02-25
ABC 2014-05-22
ABC 2014-04-01
The two dates that should be returned are the two most recent, i.e. 2014-05-22 & 2014-04-20.
EDIT
Sorry, I should have mentioned that yes, I want two distinct dates. The table is large and the dates are not sorted. I think sorting the table could be quite slow.
SELECT distinct top 2 Dateadded
FROM table
ORDER BY Dateadded desc
select top(2) DateAdded
from table
order by DateAdded DESC
Try This :
select distinct top(2) format(Dateadded ,'yyyy-MM-dd') as Dateadded
from TableName
order by Dateadded DESC
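For anyone who wants to try the DISTINCT + ORDER BY idea on the sample data, here is a small sketch with Python's sqlite3 module. SQLite has no TOP, so LIMIT 2 plays the role of TOP 2 here; the table name t is made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (Name TEXT, DateAdded TEXT)")
conn.executemany(
    "INSERT INTO t VALUES ('ABC', ?)",
    [
        ("2014-04-20",),
        ("2014-04-20",),
        ("2014-03-01",),
        ("2014-03-01",),
        ("2014-02-25",),
        ("2014-05-22",),
        ("2014-04-01",),
    ],
)

# DISTINCT removes the repeated dates before the two newest are picked.
# LIMIT 2 is SQLite's counterpart of SQL Server's TOP 2 (assumption: SQLite demo).
top_two = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT DateAdded FROM t ORDER BY DateAdded DESC LIMIT 2"
    )
]
print(top_two)
```

Without DISTINCT, duplicates of the newest date could fill both slots, which is why the plain top(2) variant does not meet the "two distinct dates" requirement in general.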

SQL - Insert using Column based on SELECT result

I currently have a table called tempHouses that looks like:
avgprice | dates | city
dates are stored as yyyy-mm-dd
However I need to move the records from that table into a table called houses that looks like:
city | year2002 | year2003 | year2004 | year2005 | year2006
The information in tempHouses contains average house prices from 1995 - 2014.
I know I can use SUBSTRING to get the year from the dates:
SUBSTRING(dates, 0, 4)
So basically for each city in tempHouses.city I need to get the the average house price from the above years into one record.
Any ideas on how I would go about doing this?
This is a SQL Server approach, and a PIVOT may be better, but here's one way:
SELECT City,
AVG(year2002) AS year2002,
AVG(year2003) AS year2003,
AVG(year2004) AS year2004
FROM (
SELECT City,
-- No ELSE branch: rows from other years yield NULL, which AVG ignores
CASE WHEN Dates BETWEEN '2002-01-01T00:00:00' AND '2002-12-31T23:59:59' THEN avgprice
END AS year2002,
CASE WHEN Dates BETWEEN '2003-01-01T00:00:00' AND '2003-12-31T23:59:59' THEN avgprice
END AS year2003,
CASE WHEN Dates BETWEEN '2004-01-01T00:00:00' AND '2004-12-31T23:59:59' THEN avgprice
END AS year2004
-- Repeat for each year
FROM tempHouses
) AS yearly
GROUP BY City
The inner query gets the data into the correct format for each record (City, year2002, year2003, year2004), whilst the outer query gets the average for each City.
There may be many ways to do this, and performance may be the deciding factor in which one to choose.
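The conditional-aggregation pattern can be tried out with a small sketch in Python's sqlite3 module. The city names and prices are made up, and only two years are shown. Note that the CASE expressions have no ELSE branch, so rows from other years become NULL and are skipped by AVG; using 0 instead would pull every average down.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tempHouses (avgprice REAL, dates TEXT, city TEXT)")
# Hypothetical sample data, dates stored as yyyy-mm-dd text.
conn.executemany(
    "INSERT INTO tempHouses VALUES (?, ?, ?)",
    [
        (100.0, "2002-01-15", "Leeds"),
        (200.0, "2002-07-15", "Leeds"),
        (300.0, "2003-03-01", "Leeds"),
        (400.0, "2002-05-01", "York"),
    ],
)

# One output row per city, one averaged column per year.
pivoted = conn.execute(
    """
    SELECT city,
           AVG(CASE WHEN dates >= '2002-01-01' AND dates < '2003-01-01'
                    THEN avgprice END) AS year2002,
           AVG(CASE WHEN dates >= '2003-01-01' AND dates < '2004-01-01'
                    THEN avgprice END) AS year2003
    FROM tempHouses
    GROUP BY city
    ORDER BY city
    """
).fetchall()

for row in pivoted:
    print(row)
```

A city with no rows for a given year gets NULL (None in Python) in that column rather than a misleading 0.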
The best way would be to use a script to perform the query execution for you because you will need to run it multiple times and you extract the data based on year. Make sure that the only required columns are city & row id:
http://dev.mysql.com/doc/refman/5.0/en/insert-select.html
INSERT INTO <table> (city) SELECT DISTINCT `city` FROM <old_table>;
Then for each city extract the average values, insert them into a temporary table and then insert into the main table.
SELECT avg(avgprice), substring(dates, 1, 4) AS yr FROM <old_table> GROUP BY yr;
Otherwise you're looking at a combination query using joins and potentially unions to extrapolate the data. Because you're flattening the table into a single row per city it's going to be a little tough to do. You should create indexes first on the date column if you don't want the database query to fail with memory limits or just take a very long time to execute.

How can I select if Date column is as same as current year

This is my Student table
Id(int) | Name(varchar) | registerDate(Date)
1 John 2012-01-01
How can I write a query that checks whether the person's registerDate falls in the current year (2012)?
SELECT *
FROM Student
WHERE YEAR(registerDate) = YEAR(getdate())
The most direct solution would be to use the YEAR or DATEPART function in whatever flavor of SQL you're using. This will probably meet your needs but keep in mind that this approach does not allow you to use an index if you're searching the table for matches. In this case, it would be more efficient to use the BETWEEN operator.
e.g.
SELECT id, name, registerDate
FROM Student
WHERE registerDate BETWEEN '2012-01-01' AND '2012-12-31'
How you would generate the first and last day of the current year will vary by SQL flavor.
Because you're using a range, an index can be utilized. If you were using a function to calculate the year for each row, it would need to be computed for every row in the table instead of seeking directly to the relevant rows.
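Here is a rough sketch of the range approach using Python's sqlite3 module. year_bounds is a hypothetical helper, the index name and the extra students are made up, and the year is fixed at 2012 so the result is reproducible (real code would pass date.today().year). It uses a half-open range (>= start, < next year's start), which behaves like BETWEEN for pure dates and also stays correct if the column carries a time part.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student (Id INTEGER PRIMARY KEY, Name TEXT, registerDate TEXT)"
)
# Index on the date column so a range predicate can seek into it.
conn.execute("CREATE INDEX idx_student_registerdate ON Student (registerDate)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [
        (1, "John", "2012-01-01"),
        (2, "Mary", "2011-12-31"),  # hypothetical extra rows
        (3, "Alex", "2012-12-31"),
        (4, "Sam", "2013-01-01"),
    ],
)


def year_bounds(year: int) -> tuple[str, str]:
    """First day of the year and first day of the next year, as ISO text."""
    return date(year, 1, 1).isoformat(), date(year + 1, 1, 1).isoformat()


start, end = year_bounds(2012)
range_sql = (
    "SELECT Id, Name, registerDate FROM Student "
    "WHERE registerDate >= ? AND registerDate < ? ORDER BY Id"
)
matches = conn.execute(range_sql, (start, end)).fetchall()
print(matches)

# The query plan text varies by SQLite version; print it to see whether
# the index is used on your build.
for step in conn.execute("EXPLAIN QUERY PLAN " + range_sql, (start, end)):
    print(step)
```

The column itself is left untouched in the WHERE clause, which is what lets the database consider the index at all.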
If by chance your flavor of SQL is Microsoft T-SQL, then this works:
SELECT * FROM Student Where datepart(yy,registerDate) = datepart(yy,GetDate())
This should work in MySQL:
SELECT * FROM myTable
WHERE YEAR(registerDate) = YEAR(CURDATE())