Joining tables on an array with string - sql

I am looking to do a left join on a table that has an array column called tags with a table that has the definitions of the tag, tag_definitions. There will only be one (at the most) match per row in the Cities Table. I can't join an array with a string and i'm not sure how to proceed.
Cities_Table
City_Code | State |Tags
NYC | NY | 1, 4, 5
SF | CA | 2,4, 6
CHI | IL | 3, 8, 10
.
Tag_Definitions
Tag_ID | Name
5 | East_Coast
6 | West_Coast
10 | MidWest
So I'm looking to get something like this...
City_Code | State |Tags | Tag_Descr
NYC | NY | 1, 4, 5 | East_Coast
SF | CA | 2,4, 6 | West_Coast
CHI | IL | 3, 8, 10 | MidWest

Depending on your database (the syntax might be different), you can do something like the following:
select *
from cities_table c
join tag_definitions t on concat(',',c.tags,',') like concat('%,',t.tag_id,',%')
SQL Fiddle Demo
However as noted, a better idea would be to create a City_Tags table and store the individual ids in that table. Generally it's not a good idea to store comma delimited data.

Related

How to fetch records from DB which fulfill a certain criteria

I have the following problem and wanted to ask if this is the correct way to do it or if there is a better way of doing it:
Assume I have the following table/data in my DB:
|---|----|------|-------------|---------|---------|
|id |city|street|street_number|lastname |firstname|
|---|----|------|-------------|---------|---------|
| 1 | ar | K1 | 13 |Davenport| Hector |
| 2 | ar | L1 | 27 |Cannon | Teresa |
| 3 | ar | A1 | 135 |Brewer | Izaac |
| 4 | dc | A2 | 8 |Fowler | Milan |
| 5 | fr | C1 | 18 |Kaiser | Ibrar |
| 6 | fr | C1 | 28 |Weaver | Kiri |
| 7 | ny | O1 | 37 |Petersen | Derrick |
I now get some some requests of the following structures: (city/street/street_number)
E.g.: {(ar,K1,13),(dc,A2,8),(ny,01,37)}
I want to retrieve the last name of the person living there. Since the request amount is quite large I don't want to run over all the request one-by-one. My current implementation is to insert the data into a temporary table and join the values.
Is this the right approach or is there some better way of doing this?
You can construct a query using in with tuples:
select t.*
from t
where (city, street, street_number) in ( (('ar', 'K1', '13'), ('dc', 'A2', '8'), ('ny', '01', '37') );
However, if the data starts in the database, then a temporary table or subquery is better than bringing the results back to the application and constructing such a query.
I think you can use the hierarchy query and string function as follows:
WITH YOUR_INPUT_DATA AS
(SELECT '(ar,K1,13),(dc,A2,8),(ny,01,37)' AS INPUT_STR FROM DUAL),
--
CTE AS
( SELECT REGEXP_SUBSTR(STR,'[^,]',1,2) AS STR1,
REGEXP_SUBSTR(STR,'[^,]',1,3) AS STR2,
REGEXP_SUBSTR(STR,'[^,]',1,4) AS STR3
FROM (SELECT SUBSTR(INPUT_STR,
INSTR(INPUT_STR,'(',1,LEVEL),
INSTR(INPUT_STR,')',1,LEVEL) - INSTR(INPUT_STR,'(',1,LEVEL) + 1) STR
FROM YOUR_INPUT_DATA
CONNECT BY LEVEL <= REGEXP_COUNT(INPUT_STR,'\),\(') + 1))
--
SELECT * FROM YOUR_TABLE WHERE (city,street,street_number)
IN (SELECT STR1,STR2,STR3 FROM CTE);

How to create a table from different query results SQL

I want to create a new table using the results from some queries. I might be looking at this the wrong way so please feel free to let me know. Because of this I will try to make this question simple without putting my code to match each employee number with each manager level column from table2
I have two tables, one has employee names and employee numbers example
table 1
+-------------+-----------+-------------+-------------+
| emplpyeenum | firstname | last name | location |
+-------------+-----------+-------------+-------------+
| 11 | joe | free | JE |
| 22 | jill | yoyo | XX |
| 33 | yoda | null | 9U |
+-------------+-----------+-------------+-------------+
and another table with employee numbers under each manager level so basically a hierarchy example
Table 2
+---------+----------+----------+
| manager | manager2 | manager3 |
+---------+----------+----------+
| 11 | 22 | 33 |
+---------+----------+----------+
I want to make a new table that will have the names besides the numbers, so for example but with employee number beside the names
+---------+--------+----------+
| level 1 | level2 | level3 |
+---------+--------+----------+
| jill | joe | yoda |
+---------+--------+----------+
How can I do this?
edit sorry guys I don't have permission to create a new table or view
Why not change your table2 to this?
+------------+----------+
| EmployeeId | ManagerId|
+------------+----------+
| 11 | NULL |
+------------+----------+
| 22 | 11 |
+------------+----------+
| 33 | 22 |
+------------+----------+
Then you can do what you want with the data. At least your data will be properly normalized. In your table2. What happen if employee 33 hire another employee below him? You will add another column?
Based on your available table, this should give you the result you want.
SELECT m1.firstname, m2.firstname, m3.firstname
FROM table2 t
LEFT JOIN table1 m1 ON m1.employeenum = t.manager
LEFT JOIN table1 m2 ON m2.employeenum = t.manager2
LEFT JOIN table1 m3 ON m3.employeenum = t.manager3
You can just do a basic create table, then do a insert select to that will fill the table the way you need it. All you have to do is replace the select statement that I provided with the one you used to create the levels table output.
create table Levels
(
level1 varchar(25),
level2 varchar(25),
level3 varchar(25)
)
insert into Levels(level1, level2, level3)
select * from tables --here you would put the select statement that you used to create the information. If you dont have this script then let me know

Using a table to lookup multiple IDs on one row

I have two tables I am using at work to help me gain experience in writing SQL queries. One table contains a list of Applications and has three columns -
Application_Name, Application_Contact_ID and Business_Contact_ID. I then have a separate table called Contacts with two columns - Contact_ID and Contact_Name. I am trying to write a query that will list the Application_Name and Contact_Name for both the Applications_Contact_ID and Business_Contact_ID columns instead of the ID number itself.
I understand I need to JOIN the two tables but I haven't quite figured out how to formulate the correct statement. Help Please!
APPLICATIONS TABLE:
+------------------+------------------------+---------------------+
| Application_Name | Application_Contact_ID | Business_Contact_ID |
+------------------+------------------------+---------------------+
| Adobe | 23 | 23 |
| Word | 52 | 14 |
| NotePad++ | 44 | 989 |
+------------------+------------------------+---------------------+
CONTACTS TABLE:
+------------+--------------+
| Contact_ID | Contact_Name |
+------------+--------------+
| 23 | Tim |
| 52 | John |
| 14 | Jen |
| 44 | Carl |
| 989 | Sam |
+------------+--------------+
What I am trying to get is:
+------------------+--------------------------+-----------------------+
| Application_Name | Application_Contact_Name | Business_Contact_Name |
+------------------+--------------------------+-----------------------+
| Adobe | Tim | Tim |
| Word | John | Jen |
| NotePad++ | Carl | Sam |
+------------------+--------------------------+-----------------------+
I've tried the below but it is only returning the name for one of the columns:
SELECT Application_Name, Application_Contact_ID, Business_Contact_ID, Contact_Name
FROM Applications
JOIN Contact ON Contact_ID = Application_Contact_ID
This is a pretty critical and 101 part of SQL. Consider reading this other answer on a different question, which explains the joins in more depth. The trick to your query, is that you have to join the CONTACTS table twice, which is a bit hard to visualize, because you have to go there for both the application_contact_id and business_contact_id.
There are many flavors of joins (INNER, LEFT, RIGHT, etc.), which you'll want to familiarize yourself with for the future reference. Consider reading this article at the very least: https://www.techonthenet.com/sql_server/joins.php.
SELECT t1.application_name Application_Name,
t2.contact_name Application_Contact_name,
t3.contact_name Business_Contact_name
FROM applications t1
INNER JOIN contacts ON t2 t1.Application_Contact_ID = t2.contact_id -- join contacts for appName
INNER JOIN contacts ON t3 t1.business_Contact_ID = t3.contact_id; -- join contacts for busName

Remove newest redundant row and update timestamp

I'm working with a SQLite database that receives large data dumps on a regular basis from several sources. Unfortunately, those sources aren't intelligent about what they dump, and I end up with a lot of repeated records from one time to the next. I'm looking for a way to remove these repeated records without affecting the records that have legitimately changed from the past dump to this one.
Here's the general structure of the data (_id is the primary key):
| _id | _dateUpdated | _dateEffective | _dateExpired | name | status | location |
|-----|--------------|----------------|--------------|------|--------|----------|
| 1 | 2016-05-01 | 2016-05-01 | NULL | Fred | Online | USA |
| 2 | 2016-05-01 | 2016-05-01 | NULL | Jim | Online | USA |
| 3 | 2016-05-08 | 2016-05-08 | NULL | Fred | Offline| USA |
| 4 | 2016-05-08 | 2016-05-08 | NULL | Jim | Online | USA |
| 5 | 2016-05-15 | 2016-05-15 | NULL | Fred | Offline| USA |
| 6 | 2016-05-15 | 2016-05-15 | NULL | Jim | Online | USA |
I'd like to be able to reduce this data to something like this:
| _id | _dateUpdated | _dateEffective | _dateExpired | name | status | location |
|-----|--------------|----------------|--------------|------|--------|----------|
| 1 | 2016-05-01 | 2016-05-01 | 2016-05-07 | Fred | Online | USA |
| 2 | 2016-05-15 | 2016-05-01 | NULL | Jim | Online | USA |
| 3 | 2016-05-15 | 2016-05-08 | NULL | Fred | Offline| USA |
The idea here is that rows 4, 5, and 6 exactly duplicate rows 2 and 3 except for the timestamps (I'd need to compare by all three fields - name, status, location). However, row 3 does not duplicate row 1 (status changed from Online to Offline), so the _dateExpired field is set in row 1, and row 3 becomes the most recent record.
I'm querying this table with something like this:
SELECT * FROM Data WHERE
date(_dateEffective) <= date("now")
AND (_dateExpired IS NULL OR date(_dateExpired) > date("now"))
Is this sort of reduction possible in SQLite?
I am still a beginner to SQL and database design in general, so it's possible that I haven't structured the database in the best way. I'm open to suggestions there as well...I'm going for the ability to query data at a given point in time - for example, "what was Jim's status around 2016-05-06?"
Thanks in advance!
Consider using a staging table where the dump file goes into a DumpTable (regularly cleaned out before each dump) and then an INSERT...SELECT query migrates to your final table.
Now the SELECT portion maintains a correlated subquery (to calculate new [_dateExpired] for needed rows) and derived table subquery (to filter out non-dups according to your criteria). Finally, the LEFT JOIN...NULL with FinalTable is to ensure no duplicate records are appended, assuming [_id] is a unique identifier. Below is the routine:
Clean Out DumpTable
DELETE FROM DumpTable;
Run Dump Routine to be appended into DumpTable
Append Records to FinalTable
INSERT INTO FinalTable ([_id], [_dateUpdated], [_dateEffective], [_dateExpired],
[name], status, location)
SELECT d.[_id], d.[_dateUpdated], d.[_dateEffective],
(SELECT Min(date(sub.[_dateEffective], '-1 day'))
FROM DumpTable sub
WHERE sub.[name] = DumpTable.[name]
AND sub.[_dateEffective] > DumpTable.[_dateEffective]
AND sub.status <> DumpTable.status) As calcExpired
d.name, d.status, d.location
FROM DumpTable d
INNER JOIN
(SELECT Min(DumpTable.[_id]) AS min_id,
DumpTable.name, DumpTable.status
FROM DumpTable
GROUP BY DumpTable.name, DumpTable.status) AS c
ON (c.name = d.name)
AND (c.min_id = d.[_id])
AND (c.status = d.status)
LEFT JOIN FinalTable f
ON d.[_id] = f.[_id]
WHERE f.[_id] IS NULL;
-- INSERTED RECORDS:
-- _id _dateUpdated _dateEffective _dateExpired name status location
-- 1 2016-05-01 2016-05-01 2016-05-07 Fred Online USA
-- 2 2016-05-01 2016-05-01 Jim Online USA
-- 3 2016-05-08 2016-05-08 Fred Offline USA
Is this sort of reduction possible in SQLite?
The answer to any "reduction" question in SQL is always Yes. The trick is to find what axes you're reducing along.
Here's a partial solution to illustrate; it gives the first Online date for each name & location.
select min(_dateEffective) as start_date
, name
, location
from Data
where status = 'Online'
group by
name
, location
With an outer join back to the table (on name & location) where the status is 'Offline' and the _dateEffective is greater than start_date, you get your _dateExpired.
_id is the primary key
There is a commonly held misunderstanding that every table needs some kind of sequential "ID" number as a primary key. The key you really care about is known as a natural key, 1 or more columns in the data that uniquely identify the data. In your case, it looks to me like that's _dateEffective, name, status, and location. At the very least, declare them unique to prevent accidental duplication.

SQL select / join performace (PHP, PDO)

What is the most efficient way to get data from two tables set up in the following way:
Table 1:
ID(PK) | Name | Age
--------------------------
1 | Jim | 44
2 | Jane | 35
3 | John | 22
Table 2
Name(PK) | Pet(PK)
-----------------
Jim | Cat
Jim | Dog
Jane | Fish
There is a constraint on "Name" with the FK in Table 2
Results
I want the age and all the pets for a specific person.
Name | Age | Pet
---------------------
Jim | 44 | Cat
Jim | 44 | Dog
As I see it these are my options:
1) Left join table 2 on Name and end up with redundant data in my resulting array for Name and Age (as above).
2) Use a function that turns the pets into a comma separated list.
3) Use 2 separate selects.
My question is relating to performance of the 3 options above. I don't need SQL (specifically, unless you want to suggest another method).
Thanks!
select
tb01.name, tb01.age, tb02.pet
from
table01 tb01
left join table02 tb02 on tb02.name = tb01.name