Create a summary result with one query - sql

I have a table with the following format.
mysql> describe unit_characteristics;
+----------------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| uut_id | int(10) unsigned | NO | PRI | NULL | |
| uut_sn | varchar(45) | NO | | NULL | |
| characteristic_name | varchar(80) | NO | PRI | NULL | |
| characteristic_value | text | NO | | NULL | |
| creation_time | datetime | NO | | NULL | |
| last_modified_time | datetime | NO | | NULL | |
+----------------------+------------------+------+-----+---------+----------------+
Each uut_sn has multiple characteristic_name/value pairs. I want to use MySQL to generate a table like this:
+--------+-------------+-------------+-------------+-------------+-----+
| uut_sn | char_name_1 | char_name_2 | char_name_3 | char_name_4 | ... |
+--------+-------------+-------------+-------------+-------------+-----+
| 00000  | char_val_1  | char_val_2  | char_val_3  | char_val_4  | ... |
| 00001  | char_val_1  | char_val_2  | char_val_3  | char_val_4  | ... |
| 00002  | char_val_1  | char_val_2  | char_val_3  | char_val_4  | ... |
| .....  | char_val_1  | char_val_2  | char_val_3  | char_val_4  | ... |
+--------+-------------+-------------+-------------+-------------+-----+
Is this possible with just one query?
Thanks,
-peter

This is a standard pivot query:
SELECT uc.uut_sn,
MAX(CASE
WHEN uc.characteristic_name = 'char_name_1' THEN uc.characteristic_value
ELSE NULL
END) AS char_name_1,
MAX(CASE
WHEN uc.characteristic_name = 'char_name_2' THEN uc.characteristic_value
ELSE NULL
END) AS char_name_2,
MAX(CASE
WHEN uc.characteristic_name = 'char_name_3' THEN uc.characteristic_value
ELSE NULL
END) AS char_name_3
FROM unit_characteristics uc
GROUP BY uc.uut_sn
To make it dynamic, you need MySQL's dynamic SQL syntax, called Prepared Statements. That requires two queries: the first gets the list of characteristic_name values, so that you can concatenate the matching CASE expressions, like the ones in my example, into the final query.
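The two-query approach can be sketched outside MySQL as well. What follows is a minimal sketch and not the answerer's code: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the sample rows and the helper names quote_ident and build_pivot_sql are assumptions made for illustration. The first query reads the distinct characteristic_name values; the second is the pivot query assembled from them. In MySQL the alias quoting would use backticks instead of double quotes.

```python
import sqlite3


def quote_ident(name):
    # Double-quote an identifier so it is safe as a column alias (SQLite style).
    return '"' + name.replace('"', '""') + '"'


def build_pivot_sql(conn):
    # Query 1: the distinct characteristic names become the output columns.
    names = [row[0] for row in conn.execute(
        "SELECT DISTINCT characteristic_name "
        "FROM unit_characteristics ORDER BY characteristic_name")]
    # One MAX(CASE ...) expression per name; the name is bound as a parameter.
    columns = ["uc.uut_sn"] + [
        "MAX(CASE WHEN uc.characteristic_name = ? "
        "THEN uc.characteristic_value END) AS " + quote_ident(name)
        for name in names]
    sql = ("SELECT " + ",\n       ".join(columns) +
           "\nFROM unit_characteristics uc"
           "\nGROUP BY uc.uut_sn"
           "\nORDER BY uc.uut_sn")
    return sql, names


# Hypothetical sample data, reduced to the three columns the pivot needs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE unit_characteristics ("
             "uut_sn TEXT, characteristic_name TEXT, characteristic_value TEXT)")
conn.executemany("INSERT INTO unit_characteristics VALUES (?, ?, ?)", [
    ("00000", "color", "red"),
    ("00000", "weight", "12"),
    ("00001", "color", "blue"),
])

# Query 2: run the generated pivot, binding one name per CASE expression.
sql, names = build_pivot_sql(conn)
rows = conn.execute(sql, names).fetchall()
for row in rows:
    print(row)
```

A unit that lacks a characteristic gets NULL (None in Python) in that column, because MAX over no matching rows yields NULL.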

You're using the EAV antipattern. There's no way to automatically generate the pivot table you describe without hardcoding the characteristics you want to include. As @OMG Ponies mentions, you need to use dynamic SQL to generate the query in a custom fashion for the set of characteristics you want to include in the result.
Instead, I recommend you fetch the characteristics one per row, as they are stored in the database, and if you want an application object to represent a single UUT with all its characteristics, you write code to loop over the rows as you fetch them in your application, collecting them into objects.
For example in PHP:
$sql = "SELECT uut_sn, characteristic_name, characteristic_value
        FROM unit_characteristics";
$stmt = $pdo->query($sql);
$objects = array();
// Collect the one-row-per-characteristic result into one Uut object per serial number.
while ($row = $stmt->fetch()) {
    if (!isset($objects[ $row["uut_sn"] ])) {
        $objects[ $row["uut_sn"] ] = new Uut();
    }
    // Braces make PHP use the characteristic name as the property name.
    $objects[ $row["uut_sn"] ]->{$row["characteristic_name"]}
        = $row["characteristic_value"];
}
This has a few advantages over the solution of hardcoding characteristic names in your query:
This solution takes only one SQL query instead of two.
No complex code is needed to build your dynamic SQL query.
If you forget one of the characteristics, this solution automatically finds it anyway.
GROUP BY in MySQL is often slow, and this avoids the GROUP BY.

Related

Replacing for loop by sql

I have SQL for example
show tables from mydb;
It shows the list of tables:
|table1|
|table2|
|table3|
Then I run a SQL statement for each table,
such as "show full columns from table1;"
+----------+--------+-----------+------+-----+---------+----------------+---------------------------------+---------+
| Field | Type | Collation | Null | Key | Default | Extra | Privileges | Comment |
+----------+--------+-----------+------+-----+---------+----------------+---------------------------------+---------+
| id | bigint | NULL | NO | PRI | NULL | auto_increment | select,insert,update,references | |
| user_id | bigint | NULL | NO | MUL | NULL | | select,insert,update,references | |
| group_id | int | NULL | NO | MUL | NULL | | select,insert,update,references | |
+----------+--------+-----------+------+-----+---------+----------------+---------------------------------+---------+
So in this case I can use a programming language, for example (this is not correct code, it just shows the flow):
tables = "show tables from mydb;"
for t in tables:
cmd.execute("show full columns from {t} ;")
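However, is it possible to do this in SQL only?
If you are using MySQL, you can use the system views in INFORMATION_SCHEMA.
The COLUMNS view contains the table name and column name (and other details) for every column. No loop is required, and you can easily filter by the other columns too; for example, in MySQL, WHERE TABLE_SCHEMA = 'mydb' restricts the result to the mydb database.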
SELECT *
FROM INFORMATION_SCHEMA.COLUMNS
If you are using Microsoft SQL Server, you can use the same query.

SQL for calculated column that chooses from value in own row

I have a table in which several identifiers of a person may be stored. In this table I would like to create a single calculated identifier column that stores the best identifier for that record, depending on which identifiers are available.
For example (some fictional sample data) ....
Table = "Citizens"
Id | LastName | DL-No | SS-No | State-Id-No | Calculated
------------------------------------------------------------------------
1 | Smith | NULL | 374-784-8888 | 7383204848 | ?
2 | Jones | JG892435262 | NULL | NULL | ?
3 | Trask | TSK73948379 | NULL | 9276542119 | ?
4 | Clinton | CL231429888 | 543-123-5555 | 1840430324 | ?
I know the order in which I would like to choose identifiers ...
Drivers-License-No
Social-Security-No
State-Id-No
So I would like the calculated identifier column to be part of the table schema. The desired results would be ...
Id | LastName | DL-No | SS-No | State-Id-No | Calculated
------------------------------------------------------------------------
1 | Smith | NULL | 374-784-8888 | 7383204848 | 374-784-8888
2 | Jones | JG892435262 | NULL | 4537409273 | JG892435262
3 | Trask | NULL | NULL | 9276542119 | 9276542119
4 | Clinton | CL231429888 | 543-123-5555 | 1840430324 | CL231429888
Is this possible? If so, what SQL would I use to calculate what goes in the "Calculated" column?
I was thinking of something like ..
SELECT
CASE
WHEN ([DL-No] is NOT NULL) THEN [DL-No]
WHEN ([SS-No] is NOT NULL) THEN [SS-No]
WHEN ([State-Id-No] is NOT NULL) THEN [State-Id-No]
AS "Calculated"
END
FROM Citizens
The easiest solution is to use coalesce():
select c.*,
coalesce([DL-No], [SS-No], [State-ID-No]) as calculated
from citizens c
However, I think your case statement will also work once you fix the syntax: END has to come before the AS alias.
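To check that the two forms agree, here is a small sketch using Python's built-in sqlite3 module as a stand-in for the asker's database (an assumption; the question's bracket quoting suggests SQL Server). It loads the first sample table from the question and runs both COALESCE and a CASE expression with END placed before the alias. Double quotes are used for the hyphenated column names here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Citizens (Id INTEGER, LastName TEXT, '
             '"DL-No" TEXT, "SS-No" TEXT, "State-Id-No" TEXT)')
# Sample rows copied from the first table in the question.
conn.executemany("INSERT INTO Citizens VALUES (?, ?, ?, ?, ?)", [
    (1, "Smith", None, "374-784-8888", "7383204848"),
    (2, "Jones", "JG892435262", None, None),
    (3, "Trask", "TSK73948379", None, "9276542119"),
    (4, "Clinton", "CL231429888", "543-123-5555", "1840430324"),
])

# COALESCE returns the first non-NULL argument, in priority order.
coalesce_sql = '''
    SELECT Id, COALESCE("DL-No", "SS-No", "State-Id-No") AS Calculated
    FROM Citizens ORDER BY Id'''

# The equivalent CASE expression: END closes the CASE, then comes the alias.
case_sql = '''
    SELECT Id,
           CASE
               WHEN "DL-No" IS NOT NULL THEN "DL-No"
               WHEN "SS-No" IS NOT NULL THEN "SS-No"
               WHEN "State-Id-No" IS NOT NULL THEN "State-Id-No"
           END AS Calculated
    FROM Citizens ORDER BY Id'''

coalesce_rows = conn.execute(coalesce_sql).fetchall()
case_rows = conn.execute(case_sql).fetchall()
for row in coalesce_rows:
    print(row)
```

Both queries pick the driver's license number when present, then the social security number, then the state id.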

Joining two tables and show data from one if there is any

I have these two tables that I need to join:
fields_data fields
+------------+-----------+------+ +------+-------------+----------+
| relationid | fieldname | data | | name | displayname | position |
+------------+-----------+------+ +------+-------------+----------+
| 2 | ftp | test | | user | Username | top |
| 2 | other | 1234 | | pass | Password | top |
+------------+-----------+------+ | ftp | FTP | top |
| log | Log | top |
| txt | Text | mid |
+------+-------------+----------+
I want to get all the rows from the "fields" table that have the position "top", and if a row has a match on name = fieldname in fields_data, it should also show the data. This is my join:
SELECT
fd.`data`,
fd.`relationid`,
fd.`fieldname`,
f.`name`,
f.`displayname`
FROM `fields` AS f
LEFT OUTER JOIN `fields_data` AS fd
ON fd.`fieldname` = f.`name`
WHERE f.`position`='top' AND (fd.`relationid`='3' OR fd.`relationid` IS NULL)
My problem is that the above query only gives me this result:
+------+------------+-----------+------+-------------+
| data | relationid | fieldname | name | displayname |
+------+------------+-----------+------+-------------+
| NULL | NULL | NULL | user | Username |
| NULL | NULL | NULL | pass | Password |
| NULL | NULL | NULL | log | Log |
+------+------------+-----------+------+-------------+
The field called "ftp" is missing because it has a relation to "2". However, I still want it in the result, with NULL values like the others. And if the query had fd.relationid='2' instead of '3', it should give the same rows, but with the ftp row holding data in the three fields_data columns.
I hope you get what I mean; my English is not the best. Here's the result I want.
With the above query containing fd.`relationid`='3':
+------+------------+-----------+------+-------------+
| data | relationid | fieldname | name | displayname |
+------+------------+-----------+------+-------------+
| NULL | NULL | NULL | user | Username |
| NULL | NULL | NULL | pass | Password |
| NULL | NULL | NULL | ftp | FTP |
| NULL | NULL | NULL | log | Log |
+------+------------+-----------+------+-------------+
With the above query containing fd.`relationid`='2':
+------+------------+-----------+------+-------------+
| data | relationid | fieldname | name | displayname |
+------+------------+-----------+------+-------------+
| NULL | NULL | NULL | user | Username |
| NULL | NULL | NULL | pass | Password |
| test | 2 | ftp | ftp | FTP |
| NULL | NULL | NULL | log | Log |
+------+------------+-----------+------+-------------+
You want to move the condition to the on clause:
SELECT fd.`data`, fd.`relationid`, fd.`fieldname`, f.`name`, f.`displayname`
FROM `fields` f LEFT OUTER JOIN
`fields_data` fd
ON fd.`fieldname` = f.`name` AND fd.`relationid` = '3'
WHERE f.`position`='top' ;
It is interesting that the semantics of your query and this query are different -- and you found the exact situation: when there is a match on another value, the where clause form filters out the row. This will still keep everything.
As a note, the following also does what you want:
SELECT fd.`data`, fd.`relationid`, fd.`fieldname`, f.`name`, f.`displayname`
FROM `fields` f LEFT OUTER JOIN
(SELECT fd.*
FROM `fields_data` fd
WHERE fd.`relationid` = '3'
) fd
ON fd.`fieldname` = f.`name`
WHERE f.`position` = 'top' ;
I wouldn't recommend writing the query this way, particularly in MySQL (because the subquery is materialized). However, understanding why your version is different from these versions (and why these are the same) is a big step forward in mastering outer joins.
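The difference between the two placements of the condition can be reproduced on the question's sample data. This is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; outer-join semantics are the same for this case). The names WHERE_FORM, ON_FORM and run are introduced here for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fields (name TEXT, displayname TEXT, position TEXT);
CREATE TABLE fields_data (relationid INTEGER, fieldname TEXT, data TEXT);
INSERT INTO fields VALUES
    ('user', 'Username', 'top'), ('pass', 'Password', 'top'),
    ('ftp', 'FTP', 'top'), ('log', 'Log', 'top'), ('txt', 'Text', 'mid');
INSERT INTO fields_data VALUES (2, 'ftp', 'test'), (2, 'other', '1234');
""")

# Condition in WHERE: applied after the join, so a row that matched a
# different relationid is filtered out entirely.
WHERE_FORM = """
    SELECT f.name, fd.data
    FROM fields AS f
    LEFT OUTER JOIN fields_data AS fd ON fd.fieldname = f.name
    WHERE f.position = 'top'
      AND (fd.relationid = ? OR fd.relationid IS NULL)
    ORDER BY f.name"""

# Condition in ON: decides only which fields_data rows may match, so every
# 'top' row of fields is kept and non-matches are padded with NULL.
ON_FORM = """
    SELECT f.name, fd.data
    FROM fields AS f
    LEFT OUTER JOIN fields_data AS fd
        ON fd.fieldname = f.name AND fd.relationid = ?
    WHERE f.position = 'top'
    ORDER BY f.name"""


def run(sql, relationid):
    return conn.execute(sql, (relationid,)).fetchall()


where_3 = run(WHERE_FORM, 3)  # the 'ftp' row is missing
on_3 = run(ON_FORM, 3)        # all four 'top' rows, data is NULL everywhere
on_2 = run(ON_FORM, 2)        # all four 'top' rows, 'ftp' carries its data
print(where_3, on_3, on_2, sep="\n")
```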

Rewriting this subquery?

I am trying to build a new table from the values of an existing table that are NOT contained in another table (the query below checks for contained, obviously). This is my table structure:
mysql> explain t1;
+-----------+---------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+---------------------+------+-----+---------+-------+
| id | int(11) | YES | | NULL | |
| point | bigint(20) unsigned | NO | MUL | 0 | |
+-----------+---------------------+------+-----+---------+-------+
mysql> explain whitelist;
+-------------+---------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------------------+------+-----+---------+----------------+
| id | bigint(20) unsigned | NO | PRI | NULL | auto_increment |
| x | bigint(20) unsigned | YES | | NULL | |
| y | bigint(20) unsigned | YES | | NULL | |
| geonetwork | linestring | NO | MUL | NULL | |
+-------------+---------------------+------+-----+---------+----------------+
My query looks like this:
SELECT point
FROM t1
WHERE EXISTS(SELECT source
FROM whitelist
WHERE MBRContains(geonetwork, GeomFromText(CONCAT('POINT(', t1.point, ' 0)'))));
Explain:
+----+--------------------+--------------------+-------+-------------------+-----------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+--------------------+-------+-------------------+-----------+---------+------+------+--------------------------+
| 1 | PRIMARY | t1 | index | NULL | point | 8 | NULL | 1001 | Using where; Using index |
| 2 | DEPENDENT SUBQUERY | whitelist | ALL | _geonetwork | NULL | NULL | NULL | 3257 | Using where |
+----+--------------------+--------------------+-------+-------------------+-----------+---------+------+------+--------------------------+
The query is taking 6 seconds to execute for 1000 records in t1 which is unacceptable for me. How can I rewrite this query using Joins (or perhaps a faster way if that exists) if I don't have a column to join on? Even a stored procedure is acceptable I guess in the worst case. My goal is to finally create a new table containing entries from t1. Any suggestions?
Unless the query optimizer is failing, a WHERE EXISTS construct should result in the same plan as a join with a GROUP BY clause. Look at optimizing MBRContains(geonetwork, GeomFromText(CONCAT('POINT(', t1.point, ' 0)'))); that's probably where your query is spending all its time. I don't have a suggestion for that, but here's your query written with a JOIN:
Select t1.point
from t1
join whitelist on MBRContains(whitelist.geonetwork, GeomFromText(CONCAT('POINT(', t1.point, ' 0)')))
group by t1.point
;
or to get the points in t1 not in whitelist:
Select t1.point
from t1
left join whitelist on MBRContains(whitelist.geonetwork, GeomFromText(CONCAT('POINT(', t1.point, ' 0)')))
where whitelist.id is null
;
This seems like a case where denormalizing t1 might be beneficial. Adding a geometry column to t1 that stores the precomputed value of GeomFromText(CONCAT('POINT(', t1.point, ' 0)')) could speed up the query you already have.

SQL LIKE question

I was wondering if there's a drawback (other than bad practice) to using something like this
SELECT * FROM my_table WHERE id LIKE '1';
where id is an integer. I know you're supposed to use id=1, but I am writing a Java program and if everything can use LIKE it'll be a lot easier for me. Also, so far everything works fine; I get the correct query results, so if there is no drawback I will continue doing it like this.
edit: I am using MySQL.
MySQL will allow it, but will ignore the index:
mysql> describe METADATA_44;
+---------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------+--------------+------+-----+---------+-------+
| AtextId | int(11) | NO | PRI | NULL | |
| num | varchar(128) | YES | | NULL | |
| title | varchar(128) | YES | | NULL | |
| file | varchar(128) | YES | | NULL | |
| context | varchar(128) | YES | | NULL | |
| source | varchar(128) | YES | | NULL | |
+---------+--------------+------+-----+---------+-------+
6 rows in set (0.00 sec)
mysql> explain select * from METADATA_44 where Atextid like '7';
+----+-------------+-------------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+------+---------------+------+---------+------+------+-------------+
| 1 | SIMPLE | METADATA_44 | ALL | PRIMARY | NULL | NULL | NULL | 591 | Using where |
+----+-------------+-------------+------+---------------+------+---------+------+------+-------------+
mysql> explain select * from METADATA_44 where Atextid=7;
+----+-------------+-------------+-------+---------------+---------+---------+-------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+-------+---------------+---------+---------+-------+------+-------+
| 1 | SIMPLE | METADATA_44 | const | PRIMARY | PRIMARY | 4 | const | 1 | |
+----+-------------+-------------+-------+---------------+---------+---------+-------+------+-------+
1 row in set (0.00 sec)
You'd need to look at the Query Execution Plan on your RDBMS to verify that LIKE with no wildcards is treated as efficiently as an = would be. A quick test in SQL Server shows that it gives you an index scan rather than a seek, so I guess it doesn't look at that when generating the plan, and for SQL Server using = would be much more efficient. I don't have a MySQL install to test against.
Edit: Just to update this, SQL Server seems to handle it fine and do a seek when the data type is varchar. When it is run against an int column, though, you get the scan. This is because it does an implicit conversion to varchar on the int column, so it can't use the index.
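Checking the plan is quick to script. The sketch below uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN, which is an assumption standing in for EXPLAIN in MySQL or the execution plan in SQL Server; the table contents and the helper name plan are made up for illustration. The plan text differs between engines and versions, so treat the printed output as something to inspect rather than a fixed result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO my_table VALUES (?, ?)",
                 [(i, "row %d" % i) for i in range(1, 101)])

LIKE_QUERY = "SELECT * FROM my_table WHERE id LIKE '1'"
EQ_QUERY = "SELECT * FROM my_table WHERE id = 1"


def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row describes one plan step.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Both forms return the same row ...
like_rows = conn.execute(LIKE_QUERY).fetchall()
eq_rows = conn.execute(EQ_QUERY).fetchall()

# ... but the engine may reach it very differently.
like_plan = plan(LIKE_QUERY)
eq_plan = plan(EQ_QUERY)
print("LIKE:", like_plan)
print("=:   ", eq_plan)
```

In SQLite the LIKE form is reported as a full scan of the table and the = form as a primary-key search, which mirrors the MySQL and SQL Server observations in this thread.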
You are better off writing your query as
SELECT * FROM my_table WHERE id = 1;
otherwise MySQL has to do a type conversion to compare the string pattern '1' with the integer column id,
so there is a performance penalty. When you know the type of the column, supply the value in that type.
Speed.
Without any wildcards, LIKE should be fine for your needs if speed and efficiency are not a concern.