Query to compare values across different tables? - sql

I have a pair of models in my Rails app that I'm having trouble bridging.
These are the tables I'm working with:
states
+----+--------+------------+
| id | fips | name |
+----+--------+------------+
| 1 | 06 | California |
| 2 | 36 | New York |
| 3 | 48 | Texas |
| 4 | 12 | Florida |
| 5 | 17 | Illinois |
| … | … | … |
+----+--------+------------+
places
+----+--------+
| id | place |
+----+--------+
| 1 | Fl |
| 2 | Calif. |
| 3 | Texas |
| … | … |
+----+--------+
Not all places are represented in the states model, but I'm trying to perform a query where I can compare a place's place value against all state names, find the closest match, and return the corresponding fips.
So if my input is Calif., I want my output to be 06
I'm still very new to writing SQL queries, so if there's a way to do this using Ruby within my Rails (4.1.5) app, that would be ideal.
My other plan of attack was to add a fips column to the "places" table, and write something that would run the above comparison and then populate fips so my app doesn't have to run this query every time the page loads. But I'm very much a beginner, so that sounds... ambitious.

This is not an easy query in SQL. Your best bet is one of the fuzzy string matching routines, which are documented here.
For instance, soundex() or levenshtein() may be sufficient for what you want. Here is an example:
select distinct on (p.place) p.place, s.name, s.fips, levenshtein(p.place, s.name) as dist
from places p cross join
states s
order by p.place, dist asc;
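Since the question asks for something that could run inside the app, here is a minimal sketch of the same closest-match idea in plain Python rather than in the database. The function names levenshtein and closest_fips are made up for this example, the edit distance is reimplemented by hand instead of calling the database function, and the data is limited to the five sample states from the question.

```python
# Sample rows from the question's "states" table: (name, fips).
STATES = [
    ("California", "06"),
    ("New York", "36"),
    ("Texas", "48"),
    ("Florida", "12"),
    ("Illinois", "17"),
]


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            insert_cost = current[j - 1] + 1
            delete_cost = previous[j] + 1
            replace_cost = previous[j - 1] + (char_a != char_b)
            current.append(min(insert_cost, delete_cost, replace_cost))
        previous = current
    return previous[-1]


def closest_fips(place: str, states=STATES) -> str:
    """Return the fips of the state whose name has the smallest edit distance."""
    _name, fips = min(states, key=lambda state: levenshtein(place, state[0]))
    return fips


if __name__ == "__main__":
    for place in ("Calif.", "Texas"):
        print(place, "->", closest_fips(place))
```

Very short inputs are a weak spot of plain edit distance: with this sample data, "Fl" is the same distance (5) from both Florida and Texas, so the matches are worth reviewing before they are stored in a fips column.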

Related

Postgres: How do I count occurrences of each enum value when they exist in columns as an array?

I have an enum State which can contain values like CA, NY, etc.
If I have a table Users with a column states that contains an array of State values, for example {CA, NY}, how can I write a query to count the users grouped by each State value? For {CA, NY}, that should count 1 for CA and 1 for NY.
So if I had records like:
| id | states |
| -- | ------- |
| 1 | {CA,NY} |
| 2 | {CA} |
| 3 | {NV,CA} |
I would expect a query to output:
| State | count |
| ----- | ----- |
| CA | 3 |
| NV | 1 |
| NY | 1 |
The first piece of advice is to normalise your data. You are breaking first normal form by holding multiple pieces of information in a single column.
Assuming you can't change that, you will need to split the array into one row per value (in Postgres, unnest() does this); you can then COUNT() and group the result.

SQL - specific requirement to compare tables

I'm trying to merge 2 queries into 1 (which cuts the number of daily queries in half). I have 2 tables: I want to run a query against one table, then the same query against the other table, which holds the same list with fewer entries.
Basically it's a list of (let's call it, for obfuscation) people and hobbies. One table is ALL people & hobbies; the other, shorter list is the people & hobbies that I've met. Everything in table 2 is also found in table 1. Table 1 includes entries (people I have yet to meet) not found in table 2.
The tables are synced up from elsewhere. What I'm looking to do is print a list of ALL people in the first column, then print the hobby ONLY for people that are on both lists. That way I can see the lists merged, and track the rate at which the gap between both lists is closing. I have tried a number of SQL combinations, but they either filter out the first table and match only items that are true for both (i.e. just giving me table 2) or just add table 2 to table 1.
Example of what I'm trying to do below:
+---------+----------+--+----------+---------+--+---------+----------+
| table1 | | | table2 | | | query | |
+---------+----------+--+----------+---------+--+---------+----------+
| name | hobby | | activity | person | | name | hobby |
| bob | fishing | | fishing | bob | | bob | fishing |
| bill | vidgames | | hiking | sarah | | bill | |
| sarah | hiking | | planking | sabrina | | sarah | hiking |
| mike | cooking | | | | | mike | |
| sabrina | planking | | | | | sabrina | planking |
+---------+----------+--+----------+---------+--+---------+----------+
Normally I'd just take the few days to learn SQL a bit better however I'm stretched pretty thin at work as it is!
I should mention the table 2 is flipped and the headings are all unique (don't think this matters)!
I think you just want a left join:
select t1.name, t2.activity as hobby
from table1 t1 left join
table2 t2
on t1.name = t2.person;
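To see the left join against the sample data from the question, here is a small sketch that loads both tables into an in-memory SQLite database through Python's sqlite3 module. SQLite is only an assumption for the demo (the question does not name its database), and the ORDER BY on rowid is added just to keep the output in the question's order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (name TEXT, hobby TEXT);
    CREATE TABLE table2 (activity TEXT, person TEXT);
    INSERT INTO table1 VALUES
        ('bob', 'fishing'), ('bill', 'vidgames'), ('sarah', 'hiking'),
        ('mike', 'cooking'), ('sabrina', 'planking');
    INSERT INTO table2 VALUES
        ('fishing', 'bob'), ('hiking', 'sarah'), ('planking', 'sabrina');
""")

# Every row of table1 is kept; hobby is NULL (None) when there is no table2 match.
rows = conn.execute("""
    SELECT t1.name, t2.activity AS hobby
    FROM table1 t1
    LEFT JOIN table2 t2 ON t1.name = t2.person
    ORDER BY t1.rowid
""").fetchall()

# Number of people present in both tables, for tracking how the gap closes.
met_count = sum(1 for _name, hobby in rows if hobby is not None)

if __name__ == "__main__":
    for name, hobby in rows:
        print(name, hobby or "")
    print("met:", met_count, "of", len(rows))
```

The unmatched people (bill and mike here) come back with an empty hobby, which matches the "query" columns in the question's example.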

Calculate Equation From Separate Tables Data

I'm working on my senior High School Project and am reaching out to the community for help! (As my teacher doesn't know the answer to my question).
I have a simple "Products" table as shown below:
I also have a "Orders" table shown below:
Is there a way I can create a field in the "Orders" table named "Total Cost", and make that automatically calculate the total cost from all the products selected?
Firstly, I would advise against storing calculated values, and would also strongly advise against using calculated fields in tables. In general, calculations should be performed by queries.
I would also strongly advise against the use of multivalued fields, as your images appear to show.
In general, when following the rules of database normalisation, most sales databases are structured in a very similar manner, containing the following main tables (amongst others):
Products (aka Stock Items)
Customers
Order Header
Order Line (aka Order Detail)
A good example for you to learn from would be the classic Northwind sample database provided free of charge as a template for MS Access.
With the above structure, observe that each table serves a purpose with each record storing information pertaining to a single entity (whether it be a single product, single customer, single order, or single order line).
For example, you might have something like:
Products
Primary Key: Prd_ID
+--------+-----------+-----------+
| Prd_ID | Prd_Desc | Prd_Price |
+--------+-----------+-----------+
| 1 | Americano | $8.00 |
| 2 | Mocha | $6.00 |
| 3 | Latte | $5.00 |
+--------+-----------+-----------+
Customers
Primary Key: Cus_ID
+--------+--------------+
| Cus_ID | Cus_Name |
+--------+--------------+
| 1 | Joe Bloggs |
| 2 | Robert Smith |
| 3 | Lee Mac |
+--------+--------------+
Order Header
Primary Key: Ord_ID
Foreign Keys: Ord_Cust
+--------+----------+------------+
| Ord_ID | Ord_Cust | Ord_Date |
+--------+----------+------------+
| 1 | 1 | 2020-02-16 |
| 2 | 1 | 2020-01-15 |
| 3 | 2 | 2020-02-15 |
+--------+----------+------------+
Order Line
Primary Key: Orl_Order + Orl_Line
Foreign Keys: Orl_Order, Orl_Prod
+-----------+----------+----------+---------+
| Orl_Order | Orl_Line | Orl_Prod | Orl_Qty |
+-----------+----------+----------+---------+
| 1 | 1 | 1 | 2 |
| 1 | 2 | 3 | 1 |
| 2 | 1 | 2 | 1 |
| 3 | 1 | 1 | 4 |
| 3 | 2 | 3 | 2 |
+-----------+----------+----------+---------+
You might also opt to store the product description & price on the order line records, so that these are retained at the point of sale, as the information in the Products table is likely to change over time.
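With that structure, the total cost of an order can be calculated by a query instead of being stored. The sketch below is one way to do it, assuming SQLite via Python's sqlite3 purely for demonstration (the syntax in MS Access will differ), with a hypothetical table name OrderLine standing in for the Order Line table and prices stored as plain numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (
        Prd_ID INTEGER PRIMARY KEY,
        Prd_Desc TEXT,
        Prd_Price REAL
    );
    -- "OrderLine" is a hypothetical name for the Order Line table above.
    CREATE TABLE OrderLine (
        Orl_Order INTEGER,
        Orl_Line INTEGER,
        Orl_Prod INTEGER REFERENCES Products (Prd_ID),
        Orl_Qty INTEGER,
        PRIMARY KEY (Orl_Order, Orl_Line)
    );
    INSERT INTO Products VALUES (1, 'Americano', 8.00), (2, 'Mocha', 6.00), (3, 'Latte', 5.00);
    INSERT INTO OrderLine VALUES (1, 1, 1, 2), (1, 2, 3, 1), (2, 1, 2, 1), (3, 1, 1, 4), (3, 2, 3, 2);
""")

# Total cost per order = sum over its lines of quantity * unit price.
totals = conn.execute("""
    SELECT ol.Orl_Order, SUM(ol.Orl_Qty * p.Prd_Price) AS Total_Cost
    FROM OrderLine ol
    JOIN Products p ON p.Prd_ID = ol.Orl_Prod
    GROUP BY ol.Orl_Order
    ORDER BY ol.Orl_Order
""").fetchall()

if __name__ == "__main__":
    for order_id, total_cost in totals:
        print(order_id, total_cost)
```

For the sample rows, order 1 is 2 Americanos plus 1 Latte (2 × 8 + 1 × 5 = 21), order 2 is 1 Mocha (6), and order 3 is 4 Americanos plus 2 Lattes (42).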

Query M:N contains

I am trying to filter a set of tables that includes an M:N junction table in Android Room (SQLite).
An image can have many subjects. I'd like to allow filtering by a subject, so that I get a row with complete image information (including all subjects). So if an image had (National Park, Yosemite) filtering for either would result in one row with both keywords. Unless I messed something up, a typical join will result in multiple rows such that matching Yosemite would get the right image, but you'd be lacking National Park. I came up with this:
SELECT *,
(SELECT GROUP_CONCAT(name)
FROM meta_subject_junction
JOIN subject
ON subject.id = meta_subject_junction.subjectId
WHERE meta_subject_junction.metaId = meta.id) AS keywords,
(SELECT documentUri
FROM image_parent
WHERE meta.parentId = image_parent.id ) AS parentUri
FROM meta
Now this gets me the complete rows, but I think at this point I'd need to:
WHERE keywords LIKE '%Yosemite%'
and I think the LIKE is less than ideal, not to mention an imprecise match. Is there a better way to accomplish this? Thanks, this is bending my novice SQL brain.
Further details
meta
+----+----------+--+
| id | name | |
+----+----------+--+
| 1 | yosemite | |
| 2 | bryce | |
| 3 | flowers | |
+----+----------+--+
subject
+----+---------------+--+
| id | name | |
+----+---------------+--+
| 1 | National Park | |
| 2 | Yosemite | |
| 3 | Tulip | |
+----+---------------+--+
junction
+--------+-----------+
| metaId | subjectId |
+--------+-----------+
| 1 | 1 |
| 1 | 2 |
| 2 | 1 |
| 3 | 3 |
+--------+-----------+
Although I may have done something wrong, as far as I can tell Android Room doesn't like:
+----+-----------+---------------+
| id | name | subject |
+----+-----------+---------------+
| 1 | yosemite | National Park |
| 1 | yosemite | Yosemite |
+----+-----------+---------------+
so I'm trying to reduce the rows:
+----+-----------+-------------------------+
| id | name | subject |
+----+-----------+-------------------------+
| 1 | yosemite | National Park, Yosemite |
+----+-----------+-------------------------+
which the above query does. However, I also want to query for a subject. So that National Park filter will yield:
+----+-----------+-------------------------+
| id | name | subject |
+----+-----------+-------------------------+
| 1 | yosemite | National Park, Yosemite |
| 2 | bryce | National Park |
+----+-----------+-------------------------+
I'd like to be more precise/efficient than LIKE with the already 'concat' subject. Most of my attempts end up with no results in Room (multi-row) or reducing the subject to only the filter keyword.
Update
Here's a test I've been using to compare the actual SQL results from a query to what Android Room ends up with:
http://sqlfiddle.com/#!7/0ac11/10/0
That join query is interpreted as four objects in Android Room, so I'm trying to reduce the rows, but retain the full subject results while filtering for any image containing the subject keyword.
If you want multiple keywords, then where and group by and having can be used:
select image_id
from image_subject
where subject_id in ('a', 'b', 'c') -- whatever
group by image_id
having count(distinct subject_id) = 3; -- same count as in `where`
This gets the result I need, though I'd love to hear a better option if this is particularly inefficient.
SELECT meta.*,
(SELECT GROUP_CONCAT(name)
FROM junction
JOIN subject
ON subject.id = junction.subjectId
WHERE junction.metaId = meta.id) AS keywords,
junction.subjectId
FROM meta
LEFT JOIN junction ON junction.metaId = meta.id
WHERE subjectId IN (1,2)
GROUP BY meta.id
+----+----------+------------------------+-----------+
| id | name | keywords | subjectId |
+----+----------+------------------------+-----------+
| 1 | yosemite | National Park,Yosemite | 2 |
| 2 | bryce | National Park | 1 |
+----+----------+------------------------+-----------+
http://sqlfiddle.com/#!7/86a76/13
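One possible alternative, offered as a sketch rather than a tested Room query: keep the GROUP_CONCAT subquery for the full keyword list, and filter with EXISTS on the junction table so that no join or GROUP BY is needed in the outer query. The example runs against plain SQLite through Python's sqlite3 using the sample tables from the question; the helper name images_with_subject is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE meta (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE subject (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE junction (metaId INTEGER, subjectId INTEGER);
    INSERT INTO meta VALUES (1, 'yosemite'), (2, 'bryce'), (3, 'flowers');
    INSERT INTO subject VALUES (1, 'National Park'), (2, 'Yosemite'), (3, 'Tulip');
    INSERT INTO junction VALUES (1, 1), (1, 2), (2, 1), (3, 3);
""")

# One row per image: all of its keywords, but only for images that have
# the requested subject. EXISTS does an exact match on the subject name.
QUERY = """
    SELECT meta.id,
           meta.name,
           (SELECT GROUP_CONCAT(subject.name)
            FROM junction
            JOIN subject ON subject.id = junction.subjectId
            WHERE junction.metaId = meta.id) AS keywords
    FROM meta
    WHERE EXISTS (SELECT 1
                  FROM junction j
                  JOIN subject s ON s.id = j.subjectId
                  WHERE j.metaId = meta.id
                    AND s.name = ?)
    ORDER BY meta.id
"""


def images_with_subject(subject_name):
    """Hypothetical helper: rows of (id, name, keywords) for one subject."""
    return conn.execute(QUERY, (subject_name,)).fetchall()


if __name__ == "__main__":
    for row in images_with_subject("National Park"):
        print(row)
```

Filtering for "National Park" should return the yosemite and bryce rows, each with its complete keyword list. Note that SQLite does not guarantee the order of names inside GROUP_CONCAT.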

sqlite3: Create tables with many rows or one table with more columns

I'm asking for a best practice when creating some tables for localization of a web interface in sqlite3.
My first intention was to create a table with the different languages, and another one for the message codes and entries.
tblLanguage
+------------+-------------+---------+
| idLangCode | txtLangName | txtCode |
+------------+-------------+---------+
| 1 | English | en |
| 2 | German | de |
| 3 | French | fr |
| 4 | Spanish | es |
| 5 | Chinese | zh |
+------------+-------------+---------+
tblMessageText
+----+-------+--------------------------+------------+
| Id | Code | Message | LanguageID |
+----+-------+--------------------------+------------+
| 1 | 20500 | Set Point changed | 1 |
| 2 | 20500 | Sollwert geändert | 2 |
| 3 | 20500 | Punto de ajuste cambiado | 5 |
+----+-------+--------------------------+------------+
So in the second table I would have several rows with the same Message Code but with a different language text.
The other possibility would be to have just one table with just one row for each Message Code but a column for each language.
tblMessageTextMulti
+----+-------+-------------------+-------------------+--------------------------+
| id | Code | txtMessageText_EN | txtMessageText_DE | txtMessageText_ES |
+----+-------+-------------------+-------------------+--------------------------+
| 1 | 20500 | Set Point changed | Sollwert geändert | Punto de ajuste cambiado |
+----+-------+-------------------+-------------------+--------------------------+
My team prefers the second solution with just one table, because it has only one entry for each Message Code, and you see all language texts side by side.
What I like about the first solution is that I could dynamically query the language with just one line in PHP:
$query = 'SELECT * FROM qryInfoMessage WHERE idLangCode=' .$Language;
For the second solution with one table, I cannot store the query itself in the database, because I have to change the column name in my query dynamically. So I have to put this together in PHP.
$query = 'SELECT Code, txtMessageText_EN FROM tblMessageTextMulti';
What I don't show here is that my query is much more complex, with string substitution and date time conversion.
Besides that, what are the advantages or disadvantages of these solutions? Which one should be more performant, and what is the best practice?
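To make the first design concrete, here is a minimal sketch of a row-per-language lookup, written in Python with the stdlib sqlite3 module rather than PHP. It uses only the English and German sample rows, joins directly on the two tables instead of the qryInfoMessage view (whose definition is not shown), and the helper name messages_for is invented for this example. The language is passed as a bound parameter, so the SQL text stays the same for every language.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblLanguage (
        idLangCode INTEGER PRIMARY KEY,
        txtLangName TEXT,
        txtCode TEXT
    );
    CREATE TABLE tblMessageText (
        Id INTEGER PRIMARY KEY,
        Code INTEGER,
        Message TEXT,
        LanguageID INTEGER REFERENCES tblLanguage (idLangCode)
    );
    INSERT INTO tblLanguage VALUES (1, 'English', 'en'), (2, 'German', 'de'), (3, 'French', 'fr');
    INSERT INTO tblMessageText VALUES
        (1, 20500, 'Set Point changed', 1),
        (2, 20500, 'Sollwert geändert', 2);
""")


def messages_for(lang_code):
    """Hypothetical helper: map message Code -> text for one language code."""
    rows = conn.execute(
        """
        SELECT m.Code, m.Message
        FROM tblMessageText m
        JOIN tblLanguage l ON l.idLangCode = m.LanguageID
        WHERE l.txtCode = ?
        """,
        (lang_code,),
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    print(messages_for("de"))
    print(messages_for("fr"))  # no French rows in the sample data
```

With this layout, adding a language means inserting rows rather than altering the table, whereas the one-column-per-language layout requires the column name to be built into the query text.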