Select Specific value from a table - sql

ID   FName  Lname  Status    MajorCode  GPA   AdmittedDate
101  Tom    Smith  Freshman  103        3.51  3-May-2015
I have a table with these columns, and I need to "list the sophomores who were admitted before 2015". I've got the "admitted before 2015" part down, but I am struggling with selecting only the sophomores from this large table.
I'm thinking I'll need to use DISTINCT or WHERE IN, but I'm not sure how to select only the sophomores, since 'sophomore' isn't something the table recognizes as a keyword.
SELECT status, fname, lname, gpa, admitteddate
FROM mytablename
WHERE status = 'Sophomore' AND
TO_CHAR(admitteddate, 'YYYY') <=2014;
This is the correct query, thank you!

I assume the Status column is the one with the key information. If you have rows with Sophomore in this column, it is as simple as adding Status = 'Sophomore' to the WHERE clause.
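For completeness, here is a sketch of the full query with that filter in place (table and column names are taken from the query above; the 'Sophomore' literal assumes the Status values are stored with that exact casing, and the DATE literal assumes Oracle or another database that accepts ANSI date literals):
SELECT status, fname, lname, gpa, admitteddate
FROM mytablename
WHERE status = 'Sophomore'              -- only sophomores, per the Status column
  AND admitteddate < DATE '2015-01-01'; -- admitted before 2015
Comparing admitteddate directly against a date literal, rather than running it through TO_CHAR, also lets an index on admitteddate be used.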

Related

Combining two mostly identical rows in SQL

I have a table that contains data like below:
Name  ID    Dept
Joe   1001  Accounting
Joe   1001  Marketing
Mary  1003  Administration
Mary  1009  Accounting
Each person is uniquely identified by the combination of Name and ID. I want the resulting table to combine rows that have the same Name and ID and put their Dept values together, separated by commas, in alphabetical order. So the result would be:
Name  ID    Dept
Joe   1001  Accounting, Marketing
Mary  1003  Administration
Mary  1009  Accounting
I am not sure how to approach this. So far I have this, which doesn't really do what I need:
SELECT Name, ID, COUNT(*)
FROM employees
GROUP BY Name, ID
I know COUNT(*) is irrelevant here, but I am not sure what to do. Any help is appreciated! By the way, I am using PostgreSQL and I am new to the language.
PostgreSQL has an aggregate function for string concatenation, string_agg; see the PostgreSQL documentation on aggregate functions. Try the following:
SELECT Name, ID, string_agg(Dept, ', ' ORDER BY Dept ASC) AS Departments
FROM employees
GROUP BY Name, ID
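An equivalent variant, if you prefer working with arrays, is array_agg combined with array_to_string; this is just a sketch against the same employees table and produces the same comma-separated, alphabetically ordered list:
SELECT Name, ID,
       array_to_string(array_agg(Dept ORDER BY Dept), ', ') AS Departments
FROM employees
GROUP BY Name, ID;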

How to use Max while taking other values from another column?

I am new to SQL and am having trouble picking the biggest value of a column for every manager_id while also keeping the other information from the same row.
Let me show the example - consider this table:
name    manager_id  sales
John    1           100
David   1           80
Selena  2           26
Leo     1           120
Frank   2           97
Sara    2           105
and the result I am expecting would be like this:
name  manager_id  top_sales
Leo   1           120
Sara  2           105
I tried using MAX, but the problem is that I must group by manager_id and then I am not able to take the name of the salesperson.
select manager_id, max(sales) as top_sales
from table
group by manager_id ;
This is just an example; the actual query is very long and pulls information from several tables with multiple conditions. I know that I could join the result back to itself, but because the query is already so long I don't want to build it twice, and I also don't want to save it into a temporary table. It should be done in one single query. I actually did solve it with an inner join against a second copy of the original query, but the result was extremely long.
My question is: can I use MAX and still get the value in the name column, or is there another method to solve this?
I appreciate any help.
You can use row_number() with a CTE to get the highest sales for each manager, as below:
with MaxSales as (
    select name, manager_id, sales,
           row_number() over (partition by manager_id order by sales desc) as rownumber
    from table
)
select name, manager_id, sales
from MaxSales
where rownumber = 1
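row_number() returns exactly one row per manager even if two salespeople tie for the top sales figure. If you want every tied row instead, a rank()-based variant (a sketch, using the same assumed table and columns) would be:
with MaxSales as (
    select name, manager_id, sales,
           rank() over (partition by manager_id order by sales desc) as salesrank
    from table
)
select name, manager_id, sales
from MaxSales
where salesrank = 1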

How to check if record is updated?

I'm trying to get the count of records that were updated since the previous week, but I'm having trouble approaching the problem.
For example, I have a table 'Org':
week 1:
id   name  age  address  date
123  Joe   35   xyz      12/01/2017
week 2:
id   name  age  address  date
123  Joe   35   abc      12/03/2017
I'm trying to get the records that have been updated. In the above example, the address for the record with id 123 has been updated. Currently I'm checking in an inefficient way.
Query:
select * from Org where date='12/01/2017'
and
select * from Org where date='12/03/2017'
Query:
select distinct on (id) count(*) from Org group by Org.id
A file is pushed to the database every day, so an updated record gets a new timestamp, and the records keep accumulating over time, which makes my job a little harder. I tried joining the table to itself, but it didn't make any sense to me. I'm not sure how to approach this problem. I was close to a solution, but I don't understand why I'm getting the count two times. Example Fiddle
Try this Query:
select col1
from (
    select distinct col1, col2, col3, col4
    from table_not
) D
group by col1
having count(1) > 1
SQL Fiddle link: SQL Fiddle
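Applied to the Org table from the question, the same idea looks like this (a sketch; the column names come from your example, and "updated" is taken to mean that an id appears with more than one distinct combination of the tracked columns):
-- ids whose tracked columns changed at some point, i.e. updated records
select id
from (
    select distinct id, name, age, address
    from Org
) d
group by id
having count(*) > 1;
Wrapping this in an outer select count(*) would then give the number of updated records.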

Historical sql Table With Bits Of User Information - Make New Table With 1 Entry & All Information

I have a table (customers) that has 43 columns of user information (first name, last name, address, city, state, zip, phone, email, visitDate, lastActive, etc...)
Every night, I'm getting a feed from our clients with the customers that visited them that day. These visits are stored into the customers table without removing the old record. The old record is marked lastActive = 0 and the new one is marked lastActive = 1. Any null fields are stored as "Unknown".
Obviously this results in a very large table that takes a while to query. So, I plan on making a new table that is only the distinct users and their most complete information.
For example: If Bob Smith was imported on January 1st with no phone or email, and then he was imported again on August 1st with a phone, but no email, and then imported again on September 1st with no phone, but an email, my customers table would look something like this:
CustImportID CustomerKey FirstName LastName Phone Email visitDate lastActive
1 1 Bob Smith Unknown Unknown 2016-01-01 0
2 1 Bob Smith 5551231234 Unknown 2016-08-01 0
3 1 Bob Smith Unknown 1#2.io 2016-09-01 1
So my question is this: what's the best way to get the distinct people from the customers table and insert them into the new table so that Bob is only one entry, but with values for every field (if several entries have a phone, for example, we would pull the phone from the most recent entry), resulting in something like this:
CustomerKey FirstName LastName Phone Email visitDate
1 Bob Smith 5551231234 1#2.io 2016-09-01
You can use FIRST_VALUE with a trick to ignore 'Unknown' values:
SELECT CustomerKey, FirstName, LastName,
       -- most recent non-'Unknown' phone per customer
       FIRST_VALUE(Phone) OVER (PARTITION BY CustomerKey
                                ORDER BY CASE WHEN Phone = 'Unknown' THEN 1 ELSE 0 END,
                                         visitDate DESC) AS Phone,
       -- most recent non-'Unknown' email per customer
       FIRST_VALUE(Email) OVER (PARTITION BY CustomerKey
                                ORDER BY CASE WHEN Email = 'Unknown' THEN 1 ELSE 0 END,
                                         visitDate DESC) AS Email
FROM mytable
FIRST_VALUE is available from SQL Server 2012 onwards. Within each customer's partition, it picks the value from the first row in the order given by the ORDER BY of the OVER clause. Due to the CASE expression in the ORDER BY clause, 'Unknown' values get the lowest priority, so the most recent known value wins.
You can also take the MAX of each column across all of a customer's records, which gives one row per customer:
select customerkey, max(firstname), max(lastname), max(phone), max(email), max(visitdate)
from yourtablename
group by customerkey
If you have two or more valid entries for a field, then use ROW_NUMBER and pick the value from the most recent row instead.
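To actually populate the new one-row-per-customer table, the window-function query above can be wrapped in DISTINCT and an INSERT. This is only a sketch: the target table customers_dedup is a hypothetical name, the source table is the customers table from the question, and it assumes FirstName/LastName are consistent within a CustomerKey and SQL Server 2012 or later:
INSERT INTO customers_dedup (CustomerKey, FirstName, LastName, Phone, Email, visitDate)
SELECT DISTINCT
       CustomerKey, FirstName, LastName,
       FIRST_VALUE(Phone) OVER (PARTITION BY CustomerKey
                                ORDER BY CASE WHEN Phone = 'Unknown' THEN 1 ELSE 0 END,
                                         visitDate DESC) AS Phone,
       FIRST_VALUE(Email) OVER (PARTITION BY CustomerKey
                                ORDER BY CASE WHEN Email = 'Unknown' THEN 1 ELSE 0 END,
                                         visitDate DESC) AS Email,
       MAX(visitDate) OVER (PARTITION BY CustomerKey) AS visitDate  -- most recent visit
FROM customers;
The window functions return the same value for every row of a customer, so DISTINCT collapses them to a single row per CustomerKey.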

UPDATE query that fixes orphaned records

I have an Access database with two tables that are related by a PK/FK. Unfortunately, the tables have allowed duplicate/redundant records, which has made the database a bit screwy. I am trying to figure out a SQL statement that will fix the problem.
To better explain the problem and goal, I have created example tables to use as reference:
[Screenshot of the example Student and TestScore tables: http://img38.imageshack.us/img38/9243/514201074110am.png]
You'll notice there are two tables, a Student table and a TestScore table where StudentID is the PK/FK.
The Student table contains duplicate records for students John, Sally, Tommy, and Suzy. In other words, the Johns with StudentIDs 1 and 5 are the same person, the Sallys with StudentIDs 2 and 6 are the same person, and so on.
The TestScore table relates test scores with a student.
Ignoring how/why the Student table allowed duplicates, the goal I'm trying to accomplish is to update the TestScore table so that it replaces the StudentIDs that have been disabled with the corresponding enabled StudentID. So, all StudentIDs = 1 (John) will be updated to 5, all StudentIDs = 2 (Sally) will be updated to 6, and so on. Here's the resultant TestScore table that I'm shooting for (notice there is no longer any reference to the disabled StudentIDs 1-4):
[Screenshot of the desired TestScore table: http://img163.imageshack.us/img163/1954/514201091121am.png]
Can you think of a query (compatible with MS Access's JET Engine) that can accomplish this goal? Or, maybe, you can offer some tips/perspectives that will point me in the right direction.
Thanks.
The only way to do this is through a series of queries and temporary tables.
First, I would create the following Make Table query that you would use to create a mapping of the bad StudentID to correct StudentID.
Select S1.StudentId As NewStudentId, S2.StudentId As OldStudentId
Into zzStudentMap
From Student As S1
Inner Join Student As S2
On S2.Name = S1.Name
Where S1.Disabled = False
And S2.StudentId <> S1.StudentId
And S2.Disabled = True
Next, you would use that temporary table to update the TestScore table with the correct StudentID.
Update TestScore
Inner Join zzStudentMap
On zzStudentMap.OldStudentId = TestScore.StudentId
Set StudentId = zzStudentMap.NewStudentId
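Once the TestScore rows point at the enabled StudentIDs, the now-unreferenced duplicate Student rows and the temporary mapping table can be cleaned up. A sketch, assuming the Disabled flag from the question and that nothing else references the old IDs:
DELETE FROM Student
WHERE Disabled = True
  AND StudentId IN (SELECT OldStudentId FROM zzStudentMap);

DROP TABLE zzStudentMap;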
The most common technique to identify duplicates in a table is to group by the fields that represent duplicate records:
ID FIRST_NAME LAST_NAME
1 Brian Smith
3 George Smith
25 Brian Smith
In this case we want to remove one of the Brian Smith records, or in your case, update the ID field so they both have the value 25 or 1 (it is completely arbitrary which one to use).
SELECT MIN(id) AS id, first_name, last_name
FROM example
GROUP BY first_name, last_name
Using MIN on the ID will return:
ID FIRST_NAME LAST_NAME
1 Brian Smith
3 George Smith
If you use MAX instead, you would get:
ID FIRST_NAME LAST_NAME
25 Brian Smith
3 George Smith
I usually use this technique to delete the duplicates, not update them:
DELETE FROM example
WHERE ID NOT IN (SELECT MAX(ID)
                 FROM example
                 GROUP BY first_name, last_name)