Select same last names but not same names - sql

I have a table with the columns fname|lname|startyear|endyear.
Assume that a person with the same fname and lname is a unique person.
There can be multiple entries with the same fname|lname.
1) How do I find all the last names that are shared by different people?
E.g.
'tom' |'jerry'|1990|1991|
'vlad' |'jerry'|1991|1992|
'tim' |'cook' |1991|1992|
'tim' |'cook' |1992|1993|
Output:
jerry
2)Which people (first and last names) served between 'Mary' 'Jane's two terms?
Eg
'mary' |'jane'|1989|1990|
'tom' |'jerry'|1990|1991|
'vlad' |'jerry'|1991|1992|
'tim' |'cook' |1991|1992|
'tim' |'cook' |1992|1993|
'mary' |'jane'|1993|1994
Output
tom jerry
vlad jerry
tim cook

1) In the query below, the inline view returns each unique fname, lname combination (one row per person). Grouping that result by lname and keeping only the groups with more than one row gives the last names that belong to more than one person.
SELECT t2.lname
FROM
( SELECT fname, lname
FROM table
GROUP BY fname, lname
) t2
GROUP BY t2.lname
HAVING COUNT(*) > 1;
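To check the idea against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. The table name `terms` is an assumption (the question never names the table), and the query uses the `COUNT(DISTINCT fname)` variant of the grouping rather than an inline view.

```python
import sqlite3

# In-memory database; "terms" is a made-up table name for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE terms (fname TEXT, lname TEXT, startyear INT, endyear INT)"
)
conn.executemany(
    "INSERT INTO terms VALUES (?, ?, ?, ?)",
    [
        ("tom", "jerry", 1990, 1991),
        ("vlad", "jerry", 1991, 1992),
        ("tim", "cook", 1991, 1992),
        ("tim", "cook", 1992, 1993),
    ],
)

# A last name qualifies when more than one distinct first name carries it.
shared_lnames = [
    row[0]
    for row in conn.execute(
        """
        SELECT lname
        FROM terms
        GROUP BY lname
        HAVING COUNT(DISTINCT fname) > 1
        ORDER BY lname
        """
    )
]
print(shared_lnames)
```

'cook' is left out because both of its rows belong to the same person.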
2) In this query, the inline view returns the earliest start year and the latest end year of the terms served by Mary Jane. It is cross joined to the original table, and the comparison on startyear and endyear returns the fname, lname of everyone whose term falls within that range. Note that this range also contains Mary Jane's own rows, and that the string comparison may be case-sensitive depending on the database (the sample data is lowercase).
SELECT t1.fname, t1.lname
FROM table t1
CROSS JOIN
( SELECT MIN(startyear) AS minstart, MAX(endyear) AS maxend
FROM table
WHERE fname = 'Mary' AND lname = 'Jane'
) t2
WHERE t1.startyear >= t2.minstart AND t1.endyear <= t2.maxend;
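As a sketch of a tighter reading of "between the two terms", the range can run from the end of Mary Jane's first term to the start of her last one. The example below uses Python's sqlite3 with the question's sample rows; the table name `terms` and the aliases `gap_start` and `gap_end` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE terms (fname TEXT, lname TEXT, startyear INT, endyear INT)"
)
conn.executemany(
    "INSERT INTO terms VALUES (?, ?, ?, ?)",
    [
        ("mary", "jane", 1989, 1990),
        ("tom", "jerry", 1990, 1991),
        ("vlad", "jerry", 1991, 1992),
        ("tim", "cook", 1991, 1992),
        ("tim", "cook", 1992, 1993),
        ("mary", "jane", 1993, 1994),
    ],
)

# The gap starts when the first term ends and stops when the last term starts.
# DISTINCT collapses people who served more than once inside the gap.
between = conn.execute(
    """
    SELECT DISTINCT t1.fname, t1.lname
    FROM terms t1
    CROSS JOIN (
        SELECT MIN(endyear) AS gap_start, MAX(startyear) AS gap_end
        FROM terms
        WHERE fname = 'mary' AND lname = 'jane'
    ) t2
    WHERE t1.startyear >= t2.gap_start
      AND t1.endyear <= t2.gap_end
    ORDER BY t1.lname, t1.fname
    """
).fetchall()
print(between)
```

With these bounds Mary Jane's own rows fall outside the range, so no extra filter on her name is needed for this data.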

Related

LIMIT by distinct values in PostgreSQL

I have a table of contacts with phone numbers similar to this:
Name Phone
Alice 11
Alice 33
Bob 22
Bob 44
Charlie 12
Charlie 55
I can't figure out how to query such a table while LIMITing the rows not by a plain row count but by distinct names. For example, if I had a magic LIMIT_BY clause, it would work like this:
SELECT * FROM "Contacts" ORDER BY "Phone" LIMIT_BY("Name") 1
Alice 11
Alice 33
-- ^ only the first contact
SELECT * FROM "Contacts" ORDER BY "Phone" LIMIT_BY("Name") 2
Alice 11
Charlie 12
Alice 33
Charlie 55
-- ^ now with Charlie because his phone 12 goes right after 11. Bob isn't here because he's third, beyond the limit
How could I achieve this result?
In other words, select all rows containing top N distinct Names ordered by Phone
I don't think that PostgreSQL provides any particularly efficient way to do this, but for 6 rows it doesn't need to be very efficient. You could do a subquery to compute which people you want to see, then join that subquery back against the full table.
select * from
"Contacts" join
(select name from "Contacts" group by name order by min(phone) limit 2) as limited
using (name)
You could put the subquery in an IN-list rather than a JOIN, but that often performs worse.
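The join-against-a-limited-subquery idea can be tried out with a small sketch in Python's sqlite3 (SQLite stands in for PostgreSQL here, and the lowercase table name `contacts` and the helper `rows_for_first_names` are assumptions for the example).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, phone INTEGER)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?)",
    [
        ("Alice", 11), ("Alice", 33),
        ("Bob", 22), ("Bob", 44),
        ("Charlie", 12), ("Charlie", 55),
    ],
)


def rows_for_first_names(n):
    """Return all rows of the n names whose smallest phone sorts first."""
    return conn.execute(
        """
        SELECT c.name, c.phone
        FROM contacts c
        JOIN (
            SELECT name
            FROM contacts
            GROUP BY name
            ORDER BY MIN(phone)
            LIMIT ?
        ) AS limited ON limited.name = c.name
        ORDER BY c.phone
        """,
        (n,),
    ).fetchall()


one_name = rows_for_first_names(1)
two_names = rows_for_first_names(2)
print(one_name)
print(two_names)
```

On the question's data this reproduces the two outputs the imaginary LIMIT_BY clause was supposed to give.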
If you want all names that are in the first n rows, you can use in:
select t.*
from t
where t.name in (select t2.name
from t t2
order by t2.phone
limit 2
);
If you want the first n names by phone:
select t.*
from t
where t.name in (select t2.name
from t t2
group by t2.name
order by min(t2.phone)
limit 2
);
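The two IN variants only differ when one name occupies several of the first n rows. A minimal sketch with Python's sqlite3 and made-up data (not the question's table) shows the difference; the table name `t` follows the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, phone INTEGER)")
# Hypothetical data: Alice holds both of the two lowest phone numbers.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("Alice", 11), ("Alice", 12), ("Bob", 13), ("Bob", 44)],
)

# Variant 1: names that appear in the first 2 rows by phone.
names_in_first_rows = [
    r[0]
    for r in conn.execute(
        """
        SELECT DISTINCT name FROM t
        WHERE name IN (SELECT name FROM t ORDER BY phone LIMIT 2)
        ORDER BY name
        """
    )
]

# Variant 2: the first 2 distinct names, ranked by their lowest phone.
first_names = [
    r[0]
    for r in conn.execute(
        """
        SELECT DISTINCT name FROM t
        WHERE name IN (
            SELECT name FROM t GROUP BY name ORDER BY MIN(phone) LIMIT 2
        )
        ORDER BY name
        """
    )
]
print(names_in_first_rows, first_names)
```

Only the grouped variant guarantees n distinct names, which is what the question asks for.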
Try this:
SELECT distinct X.name
,X.phone
FROM (
SELECT *
FROM (
SELECT name
,rn
FROM (
SELECT name
,phone
,row_number() OVER (
ORDER BY phone
) rn
FROM "Contacts"
) AA
) DD
WHERE rn <= 2 --rn is the "limit" variable
) EE
,"Contacts" X
WHERE EE.name = X.name
The above seems to work correctly on the following dataset:
create table "Contacts" (name text, phone text);
insert into "Contacts" (name, phone) VALUES
('Alice', '11'),
('Alice', '33'),
('Bob', '22'),
('Bob', '44'),
('Charlie', '13'),
('Charlie', '55'),
('Dennis', '12'),
('Dennis', '66');

Opposite of SUBSTR for big query

I have two tables in BigQuery that can be matched on an ID. Unfortunately, one of the IDs has a prefix (3 characters that are not consistent).
For example, one ID is "12345" (Table two / id) and the second ID is "T1_12345" (Table one / Link_id).
When selecting from the first table I can just use SUBSTR to remove the prefix before working in the second table. However, if I want to select from the second table (with the shorter, unprefixed ID) first and then from the first table, I can't find a way to do that.
The code below is an example of what I'm working with.
I'm looking for something similar to the RIGHT or SUBSTR functions, but in reverse, basically.
SELECT body from [table] where link_id in
(SELECT
id
FROM
[table_two]
WHERE
author == "Username")
This code isn't correct, but might give a clearer picture of what I'm trying to do.
SELECT body from [table] where "12345" in
(SELECT
"T1_12345"
FROM
[table_two]
WHERE
author == "Username")
Edit:
For example, if I had these two tables...
Table 1
| First_name| Link ID |
|-----------|-----------|
| James |T1_12345 |
| John |T2_12346 |
Table 2
| Surname| ID |
|-----------|--------|
| Tobbin |12345 |
| Peterson |12346 |
And I ran this query...
SELECT first_name from [table 1] where Link_ID in
(SELECT
ID
FROM
[table_two]
WHERE
Surname == "Peterson")
The output I want is: John Peterson
Below is for BigQuery Standard SQL
#standardSQL
SELECT first_name
FROM `project.dataset.table_one`
WHERE SUBSTR(Link_ID, 4) IN (
SELECT ID
FROM `project.dataset.table_two`
WHERE Surname = 'Peterson'
)
with result:
Row first_name
1 John
--
#standardSQL
SELECT CONCAT(first_name, ' ', Surname) full_name
FROM `project.dataset.table_one`
LEFT JOIN `project.dataset.table_two`
ON SUBSTR(Link_ID, 4) = ID
WHERE Surname = 'Peterson'
with result:
Row full_name
1 John Peterson
Below is for BigQuery Legacy SQL
#legacySQL
SELECT first_name
FROM (
SELECT First_name, SUBSTR(Link_ID, 4) short_ID
FROM [project:dataset.table_one]
)
WHERE short_ID IN (
SELECT ID
FROM [project:dataset.table_two]
WHERE Surname = 'Peterson'
)
--
#legacySQL
SELECT CONCAT(first_name, ' ', Surname) full_name
FROM (
SELECT First_name, SUBSTR(Link_ID, 4) short_ID
FROM [project:dataset.table_one]) t1
LEFT JOIN [project:dataset.table_two] t2
ON short_ID = ID
WHERE Surname = 'Peterson'
If you want to use IN, can't you just strip the prefix from link_id, since that is the column that carries it?
SELECT body
FROM [table]
WHERE SUBSTR(link_id, 4) IN (SELECT id
FROM [table_two]
WHERE author = 'Username'
);
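The prefix-stripping join can be checked locally with a small sketch in Python's sqlite3. SQLite stands in for BigQuery here, and its `substr` is 1-based like the `SUBSTR(Link_ID, 4)` calls above; the table and column names follow the question's example tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_one (first_name TEXT, link_id TEXT)")
conn.execute("CREATE TABLE table_two (surname TEXT, id TEXT)")
conn.executemany(
    "INSERT INTO table_one VALUES (?, ?)",
    [("James", "T1_12345"), ("John", "T2_12346")],
)
conn.executemany(
    "INSERT INTO table_two VALUES (?, ?)",
    [("Tobbin", "12345"), ("Peterson", "12346")],
)

# substr(link_id, 4) drops the 3-character prefix ("T1_", "T2_"),
# leaving the bare ID that table_two stores.
full_names = [
    r[0]
    for r in conn.execute(
        """
        SELECT t1.first_name || ' ' || t2.surname
        FROM table_one t1
        JOIN table_two t2 ON substr(t1.link_id, 4) = t2.id
        WHERE t2.surname = 'Peterson'
        """
    )
]
print(full_names)
```

The key point is that the function is applied to the prefixed column, whichever table the query starts from.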

ms-access 2010: count duplicate names per household address

I am currently working with a spreadsheet in MS Access 2010 which contains about 130k rows of information about people who voted in a local election recently. Each row has their residential information (street name, number, postcode etc.) and personal information (title, surname, forename, middle name, DOB etc.). Each row represents an individual person rather than a household (therefore in many cases the same residential address appears more than once as more than one person resides in a particular household).
What I want to achieve is basically to create a new field in this dataset called 'count'. I want this field to give me a count of how many different surnames reside at a single address.
Is there an SQL script that will allow me to do this in Access 2010?
+------------------+----------+-------+---------+----------+-------------+
| PROPERTYADDRESS1 | POSTCODE | TITLE | SURNAME | FORENAME | MIDDLE_NAME |
+------------------+----------+-------+---------+----------+-------------+
| FAKEADDRESS1     | EEE 5GG  | MR    | BLOGGS  | JOE      | N           |
| FAKEADDRESS2     | EEE 5BB  | MRS   | BLOGGS  | SUZANNE  | P           |
| FAKEADDRESS3     | EEE 5RG  | MS    | SMITH   | PAULINE  | S           |
| FAKEADDRESS4     | EEE 4BV  | DR    | JONES   | ANNE     | D           |
| FAKEADDRESS5     | EEE 3AS  | MR    | TAYLOR  | STUART   | A           |
+------------------+----------+-------+---------+----------+-------------+
The following syntax has got me close so far:
SELECT COUNT(electoral.SURNAME)
FROM electoral
GROUP BY electoral.UPRN
However, instead of returning me all 130k odd rows, it only returns me around 67k rows. Is there anything I can do to the syntax to achieve the same result, but just returning every single row?
Any help is greatly appreciated!
Thanks
You could use something like this:
select *,
count(surname) over (partition by householdName)
from myTable
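If you have only one column which contains the full name,
e.g. Rob Adams,
then you can split the surname out into its own column so it will be easier to use in the select. CHARINDEX finds the position of the space and LEFT returns the part before it:
SELECT LEFT('HELLO WORLD',CHARINDEX(' ','HELLO WORLD')-1)
In our example the surname is the part after the space (assuming the combined column is called fullname):
select right (fullname, len (fullname) - charindex (' ', fullname)) as surname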
example on how to use charindex, left and right here:
http://social.technet.microsoft.com/wiki/contents/articles/17948.t-sql-right-left-substring-and-charindex-functions.aspx
if there are any questions, leave a comment.
EDIT: I edited the query, had a syntax error, please try it again. This works on sql server.
here is an example:
create table #temp (id int, PropertyAddress varchar(50), surname varchar(50), forname varchar(50))
insert into #temp values
(1, 'hiddenBase', 'Adamns' , 'Kara' ),
(2, 'hiddenBase', 'Adamns' , 'Anne' ),
(3, 'hiddenBase', 'Adamns' , 'John' ),
(4, 'QueensResidence', 'Queen' , 'Oliver' ),
(5, 'QueensResidence', 'Queen' , 'Moira' ),
(6, 'superSecretBase', 'Diggle' , 'John' ),
(7, 'NandaParbat', 'Merlin' , 'Malcom' )
select * from #temp
select *,
count (surname) over (partition by PropertyAddress) as CountMembers
from #temp
gives:
1 hiddenBase Adamns Kara 3
2 hiddenBase Adamns Anne 3
3 hiddenBase Adamns John 3
7 NandaParbat Merlin Malcom 1
4 QueensResidence Queen Oliver 2
5 QueensResidence Queen Moira 2
6 superSecretBase Diggle John 1
Your query should look like this:
select *,
count (SURNAME) over (partition by PropertyAddress) as CountFamilyMembers
from electoral
EDIT
If OVER (PARTITION BY ...) isn't supported, then I guess you can get to your desired result by using GROUP BY:
select PropertyAddress,
count (SURNAME) as CountFamilyMembers
from electoral
group by PropertyAddress -- list every non-aggregated field from the select here (one by one); you can't write group by *
GROUP BY creates an aggregate query, so it's by design that you get fewer records (one per UPRN).
To get the count for each row in the original table, you can join the table with the aggregate query:
SELECT electoral.*, elCount.NumberOfPeople
FROM electoral
INNER JOIN
(
SELECT UPRN, COUNT(*) AS NumberOfPeople
FROM electoral
GROUP BY UPRN
) AS elCount
ON electoral.UPRN = elCount.UPRN
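Here is a minimal sketch of the join-back technique using Python's sqlite3. The rows are made up for illustration (an address column stands in for UPRN), and it also counts distinct surnames per address, which is what the question literally asks for. SQLite's `COUNT(DISTINCT ...)` is used here; in Access that part may need rewriting as a nested GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE electoral (address TEXT, surname TEXT, forename TEXT)")
# Hypothetical rows: three voters at one address, two of them sharing a surname.
conn.executemany(
    "INSERT INTO electoral VALUES (?, ?, ?)",
    [
        ("FAKEADDRESS1", "BLOGGS", "JOE"),
        ("FAKEADDRESS1", "BLOGGS", "SUZANNE"),
        ("FAKEADDRESS1", "SMITH", "PAULINE"),
        ("FAKEADDRESS2", "JONES", "ANNE"),
    ],
)

# The aggregate subquery has one row per address; joining it back to the
# base table repeats the counts on every original row.
rows = conn.execute(
    """
    SELECT e.address, e.surname, e.forename, c.people, c.surnames
    FROM electoral e
    JOIN (
        SELECT address,
               COUNT(*) AS people,
               COUNT(DISTINCT surname) AS surnames
        FROM electoral
        GROUP BY address
    ) c ON c.address = e.address
    ORDER BY e.address, e.forename
    """
).fetchall()
for row in rows:
    print(row)
```

Every input row survives the join, so a 130k-row table would still return 130k rows.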
Given the update I want to post another answer. Try it like this:
create table #temp2 ( PropertyAddress1 varchar(50), POSTCODE varchar(20), TITLE varchar (20),
surname varchar(50), FORENAME varchar(50), MIDDLE_NAME varchar (50) )
insert into #temp2 values
('FAKEADDRESS1', 'EEE 5GG', 'MR', 'BLOGGS', 'JOE', 'N'),
('FAKEADDRESS1', 'EEE 5BB', 'MRS', 'BLOGGS', 'SUZANNE', 'P'),
('FAKEADDRESS2', 'EEE 5RG', 'MS', 'SMITH', 'PAULINE', 'S'),
('FAKEADDRESS3', 'EEE 4BV', 'DR', 'JONES', 'ANNE', 'D'),
('FAKEADDRESS4', 'EEE 3AS', 'MR', 'TAYLOR', 'STUART', 'A')
select PropertyAddress1, surname,count (#temp2.surname) as CountADD
into #countTemp
from #temp2
group by PropertyAddress1, surname
select * from #temp2 t2
left join #countTemp ct
on t2.PropertyAddress1 = ct.PropertyAddress1 and t2.surname = ct.surname
This yields:
PropertyAddress1 POSTCODE TITLE surname FORENAME MIDDLE_NAME PropertyAddress1 surname CountADD
FAKEADDRESS1 EEE 5GG MR BLOGGS JOE N FAKEADDRESS1 BLOGGS 2
FAKEADDRESS1 EEE 5BB MRS BLOGGS SUZANNE P FAKEADDRESS1 BLOGGS 2
FAKEADDRESS2 EEE 5RG MS SMITH PAULINE S FAKEADDRESS2 SMITH 1
FAKEADDRESS3 EEE 4BV DR JONES ANNE D FAKEADDRESS3 JONES 1
FAKEADDRESS4 EEE 3AS MR TAYLOR STUART A FAKEADDRESS4 TAYLOR 1

One SQL statement for counting the records in the master table based on matching records in the detail table?

I have the following master table called Master and sample data
ID---------------Date
1 2014-09-07
2 2014-09-07
3 2014-09-08
The following details table called Details
masterId-------------Name
1 John Walsh
1 John Jones
2 John Carney
1 Peter Lewis
3 John Wilson
Now I want to find the count of Master records (grouped on the Date column) that have a corresponding Details record whose Name has the value "John".
I cannot figure out how to write a single SQL statement for this job.
Please note that a join is needed in order to find the master records to count. However, such a join creates duplicate master records. I need to keep these duplicates from being counted when grouping on the Date column in the Master table.
The correct results should be:
count: grouped on Date column
2 2014-09-07
1 2014-09-08
Thanks and regards!
This answer assumes the following:
The Name field is always FirstName LastName.
You are looking once and only once for the first name John. The search criteria would be different, depending on what you need.
SELECT Date, Count(*)
FROM tblmaster
INNER JOIN tbldetails ON tblmaster.ID=tbldetails.masterId
WHERE NAME LIKE 'John%'
GROUP BY Date, tbldetails.masterId
What we're doing here is using a wildcard character in our string search to say "look for John followed by any characters of any length".
Also, here is a way to create table variables based on what we're working with (table variables are prefixed with @; # is for temp tables):
DECLARE @tblmaster as table(
ID int,
[date] datetime
)
DECLARE @tbldetails as table(
masterID int,
name varchar(50)
)
INSERT INTO @tblmaster (ID,[date])
VALUES
(1,'2014-09-07'),(2,'2014-09-07'),(3,'2014-09-08')
INSERT INTO @tbldetails(masterID, name) VALUES
(1,'John Walsh'),
(1,'John Jones'),
(2,'John Carney'),
(1,'Peter Lewis'),
(3,'John Wilson')
Based on all comments below, this SQL statement in its clunky glory should do the trick.
SELECT date,count(t1.ID) FROM @tblmaster mainTable INNER JOIN
(
SELECT ID, COUNT(*) as countOfAll
FROM @tblmaster t1
INNER JOIN @tbldetails t2 ON t1.ID=t2.masterId
WHERE NAME LIKE 'John%'
GROUP BY id)
as t1 on t1.ID = mainTable.id
GROUP BY mainTable.date
Is this what you want?
select date, count(distinct m.id)
from master m join
details d
on d.masterid = m.id
where name like '%John%'
group by date;
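To see why the DISTINCT matters, here is a small sketch in Python's sqlite3 that runs the same join twice on the question's sample data, once with a plain COUNT(*) and once with COUNT(DISTINCT m.id). The pattern 'John %' (first name John followed by a space) is an assumption about how names are stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE master (id INTEGER, date TEXT)")
conn.execute("CREATE TABLE details (masterid INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO master VALUES (?, ?)",
    [(1, "2014-09-07"), (2, "2014-09-07"), (3, "2014-09-08")],
)
conn.executemany(
    "INSERT INTO details VALUES (?, ?)",
    [
        (1, "John Walsh"),
        (1, "John Jones"),
        (2, "John Carney"),
        (1, "Peter Lewis"),
        (3, "John Wilson"),
    ],
)

QUERY = """
    SELECT m.date, {count_expr}
    FROM master m
    JOIN details d ON d.masterid = m.id
    WHERE d.name LIKE 'John %'
    GROUP BY m.date
    ORDER BY m.date
"""

# COUNT(*) counts joined rows, so master 1 is counted once per matching detail.
naive = conn.execute(QUERY.format(count_expr="COUNT(*)")).fetchall()
# COUNT(DISTINCT m.id) counts each master record once per date.
deduped = conn.execute(QUERY.format(count_expr="COUNT(DISTINCT m.id)")).fetchall()
print(naive)
print(deduped)
```

The deduplicated result matches the counts the question lists as correct.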

Join logic from two separate tables in sql

We returned a list of cardID's after a query and those cardID's belong to two tables Student and Personnel. So how can I join those cardID's with Student and Personnel so I can return a table that shows name of Student and Personnel according to cardID's?
Personnel table:
PERSONNELID NUMBER(9,0)
PERSONNELNAME VARCHAR2(20)
PERSONNELSURNAME VARCHAR2(20)
PERSONNELJOB VARCHAR2(40)
PERSONNELCARDID NUMBER(4,0)
Student table:
STUDENTID NUMBER(9,0)
STUDENTNAME VARCHAR2(20)
STUDENTSURNAME VARCHAR2(20)
STUDENTDEPT VARCHAR2(40)
STUDENTFACULTY VARCHAR2(20)
STUDENTCARDID NUMBER(4,0)
CardID table
CARDID NUMBER(4,0)
USERTYPE VARCHAR2(20)
CHARGE NUMBER(3,2)
CREDIT NUMBER(4,2)
PaymentDevice table:
ORDERNO NUMBER
PAYDEVIP NUMBER(8,0)
PAYDEVDATE DATE
PAYDEVTIME VARCHAR2(8)
CHARGEDCARDID NUMBER(9,0)
MEALTYPE VARCHAR2(10)
I tried to return the names and surnames of the first 10 people who ate at the cafeteria on 27/12/2012:
SELECT C.CARDID
FROM CARD C, PAYMENTDEVICE P
WHERE P.ORDERNO
BETWEEN (SELECT MIN(ORDERNO)
FROM PAYMENTDEVICE
WHERE PAYDEVDATE='27/12/2012') AND (SELECT MIN(ORDERNO)
FROM PAYMENTDEVICE
WHERE PAYDEVDATE='27/12/2012')+10 AND C.CARDID=P.CHARGEDCARDID;
Our orderNo isn't reset every day but keeps increasing, so we found the minimum orderNo for that day and added 10 to it, to find the first 10 people who ate on that day between those order numbers.
This is what the query returns:
CARDID
1005
1000
1002
1003
1009
2000
2001
1007
2002
1004
1006
Some of those cardIDs (the ones starting with 1) are StudentCardIds and some of them (the ones starting with 2) are PersonnelCardIds. So how can I match them and show the names accordingly?
SELECT *
FROM Personnel p INNER JOIN Student s
ON p.PersonnelCardId = s.StudentCardId
INNER JOIN ReturnedQuery rq
ON rq.CardId = p.PersonnelCardId
Updated:
SELECT p.PersonnelName, rq.CardId
FROM Personnel p INNER JOIN ReturnedQuery rq
ON rq.CardId = p.PersonnelCardId
UNION
SELECT s.StudentName, rq.CardId
FROM Student s INNER JOIN ReturnedQuery rq
ON s.StudentCardId = rq.CardId
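A minimal sketch of the UNION idea, using Python's sqlite3 with made-up names and a small `returned_query` table standing in for the card IDs returned by the first query (all sample values here are hypothetical). Each card ID matches in only one of the two tables, so the two halves of the union never overlap.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE personnel (personnelcardid INTEGER, personnelname TEXT, personnelsurname TEXT)"
)
conn.execute(
    "CREATE TABLE student (studentcardid INTEGER, studentname TEXT, studentsurname TEXT)"
)
conn.execute("CREATE TABLE returned_query (cardid INTEGER)")

# Hypothetical people: student cards start with 1, personnel cards with 2.
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(1000, "Ann", "Lee"), (1002, "Bob", "Ray")])
conn.executemany("INSERT INTO personnel VALUES (?, ?, ?)",
                 [(2000, "Cy", "Doe")])
conn.executemany("INSERT INTO returned_query VALUES (?)",
                 [(1000,), (2000,), (1002,)])

# One SELECT per source table; UNION ALL stacks the two result sets.
names = conn.execute(
    """
    SELECT rq.cardid, p.personnelname, p.personnelsurname
    FROM returned_query rq
    JOIN personnel p ON p.personnelcardid = rq.cardid
    UNION ALL
    SELECT rq.cardid, s.studentname, s.studentsurname
    FROM returned_query rq
    JOIN student s ON s.studentcardid = rq.cardid
    ORDER BY 1
    """
).fetchall()
print(names)
```

UNION ALL is used instead of UNION because no duplicate elimination is needed when the two ID ranges are disjoint.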
Your original query is actually pretty fragile. I'd rewrite it like so (and added the needed joins):
WITH First_Daily_Purchase as (SELECT chargedCardId,
MIN(payDevTime) as payDevTime,
MIN(orderNo) as orderNo
FROM PaymentDevice
WHERE payDevDate >=
TO_DATE('2012-12-27', 'YYYY-MM-DD')
AND payDevDate <
TO_DATE('2012-12-28', 'YYYY-MM-DD')
GROUP BY chargedCardId),
First_10_Daily_Purchasers as (SELECT chargedCardId
FROM (SELECT chargedCardId,
RANK() OVER(ORDER BY payDevTime,
orderNo) as rank
FROM First_Daily_Purchase) a
WHERE a.rank < 11)
SELECT a.chargedCardId, b.personnelName, b.personnelSurname
FROM First_10_Daily_Purchasers a
JOIN Personnel b
ON b.personnelCardId = a.chargedCardId
UNION ALL
SELECT a.chargedCardId, b.studentName, b.studentSurname
FROM First_10_Daily_Purchasers a
JOIN Student b
ON b.studentCardId = a.chargedCardId
(Have a working SQL Fiddle - generally bullet-proofing this took me a while.)
This should get you the first 10 people who made a purchase (not the first 11 purchases, which is what you were actually getting). This of course assumes that payDevTime is actually stored in a sortable format (if it isn't you have bigger problems than this query not working quite right).
That said, there's a number of troubling things about your schema design.