Am I on the right track..finding matches between two tables(SQL) - sql

I'm a SQL noob and I'm wondering if someone could give me
some help with the following problem:
I have two tables, "Schools" and "Teachers".
The "search_key" column of the "Schools" table is one big string that
combines the teacher's name and other elements (example: "ENGLISH | JANE | [90, 56]").
What I'm trying to do is match the string from the "name" column
of the "Teachers" table against that string, returning the rows that have a match.
SELECT * FROM(
SELECT substr(a.search_key, 6, instr(a.search_key, '|'))
FROM schools a
) JOIN teachers s ON a.search_key = s.search_key
This is what I've been trying to do: take a substring and try to match it, but no luck so far.
Any ideas?

I'm not sure why it would be more complicated than:
SELECT *
FROM schools s
INNER JOIN teachers t ON s.search_key LIKE '%' || t.name || '%'
(|| is the standard string concatenation operator; SQL Server uses + instead.)

This should get you going -
WITH SCHOOLS AS (
SELECT
'ENGLISH | JANE | [90,56]' AS SEARCH_KEY
FROM
DUAL
),TEACHER AS (
SELECT
'JANE' AS NAME
FROM
DUAL
) SELECT
S.SEARCH_KEY
FROM
SCHOOLS S,
TEACHER T
WHERE
S.SEARCH_KEY LIKE '%' || T.NAME || '%';
Output -
SEARCH_KEY
ENGLISH | JANE | [90,56]
Another approach would be -
WITH SCHOOLS AS (
SELECT
'ENGLISH | JANE | [90,56]' AS SEARCH_KEY
FROM
DUAL
),TEACHER AS (
SELECT
'JANE' AS NAME
FROM
DUAL
) SELECT
S.SEARCH_KEY
FROM
SCHOOLS S,
TEACHER T
WHERE
TRIM(REGEXP_SUBSTR(S.SEARCH_KEY,'(\S*)(\W)',1,3)) = T.NAME;
This works if the name is always the 2nd field in SEARCH_KEY, delimited by |.
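One thing to watch with the plain LIKE approach is that a short name can match inside a longer one. Here is a minimal sketch of that, using Python's sqlite3 as a stand-in database (the original answers target other dialects). The second school row and the teacher 'ANN' are made up for illustration, and the stricter pattern assumes the name is always wrapped as "| NAME |".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE schools  (search_key TEXT);
    CREATE TABLE teachers (name TEXT);
    -- The second school row and the teacher 'ANN' are hypothetical test data.
    INSERT INTO schools  VALUES ('ENGLISH | JANE | [90, 56]'),
                                ('MATH | JOANNE | [70, 81]');
    INSERT INTO teachers VALUES ('JANE'), ('ANN');
""")

# Plain substring match: 'ANN' also hits the row whose teacher is 'JOANNE'.
loose = conn.execute("""
    SELECT t.name, s.search_key
    FROM schools s
    JOIN teachers t ON s.search_key LIKE '%' || t.name || '%'
    ORDER BY t.name
""").fetchall()

# Delimiter-aware match: the name must fill the whole field between two pipes.
strict = conn.execute("""
    SELECT t.name, s.search_key
    FROM schools s
    JOIN teachers t ON s.search_key LIKE '%| ' || t.name || ' |%'
    ORDER BY t.name
""").fetchall()

print(loose)
print(strict)
```

The strict version only helps if the spacing around the pipes is consistent; if it is not, extracting the field (as in the REGEXP_SUBSTR answer) is probably the safer route.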

Related

LIMIT by distinct values in PostgreSQL

I have a table of contacts with phone numbers similar to this:
Name Phone
Alice 11
Alice 33
Bob 22
Bob 44
Charlie 12
Charlie 55
I can't figure out how to query such a table with LIMITing the rows not just by plain count but by distinct names. For example, if I had a magic LIMIT_BY clause, it would work like this:
SELECT * FROM "Contacts" ORDER BY "Phone" LIMIT_BY("Name") 1
Alice 11
Alice 33
-- ^ only the first contact
SELECT * FROM "Contacts" ORDER BY "Phone" LIMIT_BY("Name") 2
Alice 11
Charlie 12
Alice 33
Charlie 55
-- ^ now with Charlie because his phone 12 goes right after 11. Bob isn't here because he's third, beyond the limit
How could I achieve this result?
In other words, select all rows containing top N distinct Names ordered by Phone
I don't think that PostgreSQL provides any particularly efficient way to do this, but for 6 rows it doesn't need to be very efficient. You could do a subquery to compute which people you want to see, then join that subquery back against the full table.
select * from
"Contacts" join
(select name from "Contacts" group by name order by min(phone) limit 2) as limited
using (name)
You could put the subquery in an IN-list rather than a JOIN, but that often performs worse.
If you want all names that are in the first n rows, you can use in:
select t.*
from t
where t.name in (select t2.name
from t t2
order by t2.phone
limit 2
);
If you want the first n names by phone:
select t.*
from t
where t.name in (select t2.name
from t t2
group by t2.name
order by min(t2.phone)
limit 2
);
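To check the "first n names by phone" variant against the rows from the question, here is a small sketch using Python's sqlite3 as a stand-in for PostgreSQL. The lowercase table name contacts, the integer phone column, and the helper rows_for_top_names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (name TEXT, phone INTEGER);
    INSERT INTO contacts VALUES
        ('Alice', 11), ('Alice', 33), ('Bob', 22),
        ('Bob', 44), ('Charlie', 12), ('Charlie', 55);
""")

def rows_for_top_names(n):
    """Return every row whose name is among the n names with the lowest phone."""
    return conn.execute("""
        SELECT c.name, c.phone
        FROM contacts c
        WHERE c.name IN (SELECT name
                         FROM contacts
                         GROUP BY name
                         ORDER BY MIN(phone)
                         LIMIT ?)
        ORDER BY c.phone
    """, (n,)).fetchall()

print(rows_for_top_names(1))
print(rows_for_top_names(2))
```

With a limit of 2 this returns both Alice rows and both Charlie rows, ordered by phone, which is the result the question asks for.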
try this:
SELECT distinct X.name
,X.phone
FROM (
SELECT *
FROM (
SELECT name
,rn
FROM (
SELECT name
,phone
,row_number() OVER (
ORDER BY phone
) rn
FROM "Contacts"
) AA
) DD
WHERE rn <= 2 --rn is the "limit" variable
) EE
,"Contacts" X
WHERE EE.name = X.name
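The above seems to work correctly on the following dataset. Note that rn counts rows, not distinct names, so it only matches the requested behaviour when the first rn rows belong to different names: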
create table "Contacts" (name text, phone text);
insert into "Contacts" (name, phone) VALUES
('Alice', '11'),
('Alice', '33'),
('Bob', '22'),
('Bob', '44'),
('Charlie', '13'),
('Charlie', '55'),
('Dennis', '12'),
('Dennis', '66');

Opposite of SUBSTR for big query

I have two tables in BigQuery that can be matched on an ID. Unfortunately one of the IDs has a prefix (3 characters, not always the same).
For example, one ID is "12345" (Table two / id) and the second ID is "T1_12345" (Table one / Link_id)
When selecting from the first table I can just use SUBSTR to remove the prefix before working in the second table. However, if I want to select from the second table first (the one with the shorter ID) and then from the first table, I can't find a way to do that.
The code below is an example of what I'm working with.
I'm looking for something similar to the RIGHT or SUBSTR functions, but in reverse basically.
SELECT body from [table] where link_id in
(SELECT
id
FROM
[table_two]
WHERE
author == "Username")
This code isn't correct, but might give a clearer picture of what I'm trying to do.
SELECT body from [table] where "12345" in
(SELECT
"T1_12345"
FROM
[table_two]
WHERE
author == "Username")
Edit:
For example, if I had these two tables...
Table 1
| First_name| Link ID |
|-----------|-----------|
| James |T1_12345 |
| John |T2_12346 |
Table 2
| Surname| ID |
|-----------|--------|
| Tobbin |12345 |
| Peterson |12346 |
And I ran this query...
SELECT first_name from [table 1] where Link_ID in
(SELECT
ID
FROM
[table_two]
WHERE
Surname == "Peterson")
The output I want is: John Peterson
Below is for BigQuery Standard SQL
#standardSQL
SELECT first_name
FROM `project.dataset.table_one`
WHERE SUBSTR(Link_ID, 4) IN (
SELECT ID
FROM `project.dataset.table_two`
WHERE Surname = 'Peterson'
)
with result:
Row first_name
1 John
--
#standardSQL
SELECT CONCAT(first_name, ' ', Surname) full_name
FROM `project.dataset.table_one`
LEFT JOIN `project.dataset.table_two`
ON SUBSTR(Link_ID, 4) = ID
WHERE Surname = 'Peterson'
with result:
Row full_name
1 John Peterson
Below is for BigQuery Legacy SQL
#legacySQL
SELECT first_name
FROM (
SELECT First_name, SUBSTR(Link_ID, 4) short_ID
FROM [project:dataset.table_one]
)
WHERE short_ID IN (
SELECT ID
FROM [project:dataset.table_two]
WHERE Surname = 'Peterson'
)
--
#legacySQL
SELECT CONCAT(first_name, ' ', Surname) full_name
FROM (
SELECT First_name, SUBSTR(Link_ID, 4) short_ID
FROM [project:dataset.table_one]) t1
LEFT JOIN [project:dataset.table_two] t2
ON short_ID = ID
WHERE Surname = 'Peterson'
If you want to use IN, can't you just use this?
SELECT body
FROM [table]
WHERE link_id IN (SELECT SUBSTR(id, 4)
FROM [table_two]
WHERE author = 'Username'
);
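If the prefix is not guaranteed to be exactly three characters, cutting at the underscore may be more robust than a fixed offset. A rough sketch of both variants on the example tables, using Python's sqlite3 as a stand-in for BigQuery (so substr and instr here are SQLite's functions, and the assumption is that every Link_ID contains exactly one underscore before the real ID):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_one (first_name TEXT, link_id TEXT);
    CREATE TABLE table_two (surname TEXT, id TEXT);
    INSERT INTO table_one VALUES ('James', 'T1_12345'), ('John', 'T2_12346');
    INSERT INTO table_two VALUES ('Tobbin', '12345'), ('Peterson', '12346');
""")

# Fixed-width prefix: keep everything from the 4th character on.
fixed = conn.execute("""
    SELECT t1.first_name || ' ' || t2.surname
    FROM table_one t1
    JOIN table_two t2 ON substr(t1.link_id, 4) = t2.id
    WHERE t2.surname = 'Peterson'
""").fetchall()

# Variable-width prefix: keep everything after the first underscore.
by_delimiter = conn.execute("""
    SELECT t1.first_name || ' ' || t2.surname
    FROM table_one t1
    JOIN table_two t2
      ON substr(t1.link_id, instr(t1.link_id, '_') + 1) = t2.id
    WHERE t2.surname = 'Peterson'
""").fetchall()

print(fixed, by_delimiter)
```

Either way the prefix is stripped from the longer ID; nothing is added to the shorter one, since the prefix value cannot be known from the short ID alone.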

Select same last names but not same names

I have a table with fname|lname|startyear|endyear
Take it that a person with same fname and lname is a unique person.
There can be multiple entries with the same fname|lname.
1) How do I find all the same last names belonging to different people?
Eg
'tom' |'jerry'|1990|1991|
'vlad' |'jerry'|1991|1992|
'tim' |'cook' |1991|1992|
'tim' |'cook' |1992|1993|
Output:
jerry
2)Which people (first and last names) served between 'Mary' 'Jane's two terms?
Eg
'mary' |'jane'|1989|1990|
'tom' |'jerry'|1990|1991|
'vlad' |'jerry'|1991|1992|
'tim' |'cook' |1991|1992|
'tim' |'cook' |1992|1993|
'mary' |'jane'|1993|1994|
Output
tom jerry
vlad jerry
tim cook
1) In the query below, the inline view returns each unique fname, lname combination once (one row per person). Grouping those rows by lname and keeping only the groups with more than one row gives the last names shared by different people.
SELECT lname
FROM
( SELECT fname,lname
FROM table
GROUP BY fname,lname
) t1
GROUP BY lname
HAVING COUNT(1) > 1;
2) In this query, the inline view returns the end year of Mary Jane's first term and the start year of her second term. It is cross joined to the original table, and the comparison on startyear and endyear keeps everyone whose term falls between those two years. DISTINCT collapses people with several matching terms into one row.
SELECT DISTINCT t1.fname,t1.lname
FROM table t1
CROSS JOIN
( SELECT MIN(endyear) AS firstend,MAX(startyear) AS secondstart
FROM table
WHERE fname = 'mary' AND lname = 'jane'
) t2
WHERE t1.startyear >= t2.firstend AND t1.endyear <= t2.secondstart;
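One way to sanity-check queries like these is to run them against the sample rows from the question. The sketch below does that with Python's sqlite3. The table name terms is made up for the example, and "between the two terms" is read as: starts no earlier than the end of Mary Jane's first term and ends no later than the start of her second.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE terms (fname TEXT, lname TEXT, startyear INTEGER, endyear INTEGER);
    INSERT INTO terms VALUES
        ('mary', 'jane',  1989, 1990),
        ('tom',  'jerry', 1990, 1991),
        ('vlad', 'jerry', 1991, 1992),
        ('tim',  'cook',  1991, 1992),
        ('tim',  'cook',  1992, 1993),
        ('mary', 'jane',  1993, 1994);
""")

# 1) Last names carried by more than one distinct first name.
shared = conn.execute("""
    SELECT lname
    FROM terms
    GROUP BY lname
    HAVING COUNT(DISTINCT fname) > 1
""").fetchall()

# 2) People whose term lies between Mary Jane's first and second term.
between = conn.execute("""
    SELECT DISTINCT t.fname, t.lname
    FROM terms t
    CROSS JOIN (SELECT MIN(endyear)   AS first_end,
                       MAX(startyear) AS second_start
                FROM terms
                WHERE fname = 'mary' AND lname = 'jane') m
    WHERE t.startyear >= m.first_end
      AND t.endyear   <= m.second_start
    ORDER BY t.fname
""").fetchall()

print(shared)
print(between)
```

COUNT(DISTINCT fname) is used here so that tim cook, who appears twice, is still counted as one person.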

UPDATE and return some rows twice

I try to update and return rows. The problem is I use a nested select with UNION to get some rows twice and I want to get them returned twice. Example:
Table:
First_name | last_name | ready
-----------+-----------+------
john | doe | false
| smith | false
jane | | false
Query:
With list(name) as (
Select First_name
from table1
where First_name Not null and ready=false
union
Select last_name
from table1
where last_name Not null and ready=false
)
Select * from list
This returns:
john
jane
doe
smith
Now I want to update the rows found by the select and use update ... returning instead. But the update only returns the three affected rows, while I want it to return the rows as the select in the example does. Is there any way?
Rewrite to:
WITH cte AS (
UPDATE table1
SET ready = true
WHERE (first_name IS NOT NULL OR last_name IS NOT NULL)
AND NOT ready
RETURNING first_name, last_name
)
SELECT first_name FROM cte WHERE first_name IS NOT NULL
UNION ALL
SELECT last_name FROM cte WHERE last_name IS NOT NULL;
Same result, just shorter and faster: This query accesses table1 a single time instead of three times like in your original.
(Verify the superior performance with EXPLAIN ANALYZE on a test table.)
UNION ALL like @Clodoaldo already mentioned. UNION would eliminate duplicates, which is substantially slower (and probably wrong here).
with list(name) as (
select first_name
from table1
where first_name is not null and ready=false
union all
select last_name
from table1
where last_name is not null and ready=false
), u as (
update table1
set ready = true
where
(first_name is not null or last_name is not null)
and
not ready
)
select * from list
You need UNION ALL to get the four rows. Also, the predicate is IS [NOT] NULL, not just [NOT] NULL.
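For engines that do not accept an UPDATE inside a CTE, a similar effect can be had by reading the names and then updating inside one transaction. This is a different technique from the PostgreSQL answers above, sketched with Python's sqlite3; the 0/1 encoding of ready is an assumption of the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (first_name TEXT, last_name TEXT, ready INTEGER);
    INSERT INTO table1 VALUES
        ('john', 'doe',   0),
        (NULL,   'smith', 0),
        ('jane', NULL,    0);
""")

with conn:  # commits on success, rolls back if an exception is raised
    conn.execute("BEGIN")  # open the transaction before reading
    # UNION ALL keeps one output row per name, so a row with both
    # names set contributes two entries.
    names = [row[0] for row in conn.execute("""
        SELECT first_name FROM table1
        WHERE first_name IS NOT NULL AND ready = 0
        UNION ALL
        SELECT last_name FROM table1
        WHERE last_name IS NOT NULL AND ready = 0
    """)]
    updated = conn.execute("""
        UPDATE table1
        SET ready = 1
        WHERE (first_name IS NOT NULL OR last_name IS NOT NULL)
          AND ready = 0
    """).rowcount

print(names, updated)
```

Three rows are updated but four names come back, which is the shape the question asks for.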

SQL Select multiple rows using where in one table

I have a table named "Student".
Student
id | name | age
1 | john | 10
2 | jack | 10
3 | jerry| 10
I want to select rows 1 and 2. I wrote: Select * from Student where name=john and name=jack
But it returns "Empty set".
How do I do it? Help me.
select *
from student
where name in ('john', 'jack')
Or
select *
from student
where name = 'john'
or name = 'jack'
You need an OR rather than an AND.
Whatever conditions you write, it checks them all against each record. As no single record has both name = 'john' AND name = 'jack', they all fail.
If, instead, you use OR...
- The 1st record yields TRUE OR FALSE which is TRUE.
- The 2nd record yields FALSE OR TRUE which is TRUE.
- The 3rd record yields FALSE OR FALSE which is FALSE.
Select * from Student where name='john' OR name='jack'
Or, using a different way of saying it all...
SELECT * FROM Student WHERE name IN ('john', 'jack')
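The row-by-row reasoning above can be checked on the three sample rows. A minimal sketch with Python's sqlite3 follows; the wanted list and the generated placeholders are additions for the example, showing one way to pass a variable number of names safely.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER, name TEXT, age INTEGER);
    INSERT INTO student VALUES (1, 'john', 10), (2, 'jack', 10), (3, 'jerry', 10);
""")

# AND: no single row can have both names, so nothing comes back.
with_and = conn.execute(
    "SELECT id FROM student WHERE name = 'john' AND name = 'jack'"
).fetchall()

# OR: each row is kept if it matches either name.
with_or = conn.execute(
    "SELECT id FROM student WHERE name = 'john' OR name = 'jack' ORDER BY id"
).fetchall()

# IN with bound parameters: one '?' per wanted name.
wanted = ["john", "jack"]
placeholders = ", ".join("?" for _ in wanted)
with_in = conn.execute(
    f"SELECT id FROM student WHERE name IN ({placeholders}) ORDER BY id",
    wanted,
).fetchall()

print(with_and, with_or, with_in)
```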
Use single quotes to surround your values, plus use or instead of and:
where name='john' or name = 'jack'
try this one.
declare @names varchar(100)
set @names='john,jack'
Select * from Student
where charIndex(',' + rtrim(cast(name as nvarchar(max))) + ',',',' +isnull(@names,name) +',') >0