After much mucking about, I'm close (for my purposes, I don't care about the type differences). I do, however, want the exact same output format as MySQL. The reason is that I'm trying to adapt a MySQL-only tool for use with PostgreSQL. Here's an example output from MySQL (albeit with fewer columns):
mysql> show columns from users;
+-------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+-------+
| id | int(11) | NO | PRI | NULL | |
| name | varchar(200) | YES | | NULL | |
| institution | varchar(200) | YES | | NULL | |
+-------------+--------------+------+-----+---------+-------+
Here's the table on which I'm testing this:
Table "public.users"
Column | Type | Collation | Nullable | Default
--------------------+------------------------+-----------+----------+-----------------------------------
id | integer | | not null | nextval('users_id_seq'::regclass)
name | character varying(255) | | |
role_id | integer | | |
image_url | character varying(510) | | |
institution | character varying(255) | | |
qualifications | text | | |
cv_url | character varying(510) | | |
specializations | text | | |
text_collaboration | text | | |
Indexes:
    "users_pkey" PRIMARY KEY, btree (id)
Check constraints:
    "users_name_not_null" CHECK (name IS NOT NULL)
Foreign-key constraints:
    "fk_role_id" FOREIGN KEY (role_id) REFERENCES roles(id)
Referenced by:
    TABLE "novel_reviews" CONSTRAINT "novels_reviewer_id_fkey" FOREIGN KEY (reviewer_id) REFERENCES users(id)
    TABLE "review_translations" CONSTRAINT "review_translations_recorder_id_fkey" FOREIGN KEY (recorder_id) REFERENCES users(id)
Here's the query I have... it's probably poorly done esp. with the GROUP BY part:
SELECT column_name AS "Field"
, data_type AS "Type"
, is_nullable AS "Null"
, CASE WHEN is_primary=true THEN 'PRI' ELSE NULL END AS "Key"
, column_default as "Default"
, CASE WHEN column_default LIKE 'nextval(%' THEN 'auto_increment' ELSE '' END AS "Extra"
FROM
(
SELECT c.column_name
, c.data_type
, c.is_nullable
, tc.constraint_type='PRIMARY KEY' AS is_primary
, c.column_default
FROM information_schema.columns AS c
LEFT JOIN information_schema.constraint_column_usage AS ccu USING (column_name, table_name)
LEFT JOIN information_schema.table_constraints tc USING (constraint_name)
WHERE c.table_name = 'users'
GROUP BY c.column_name
, c.data_type
, c.is_nullable
, is_primary
, c.column_default
) as sq;
Here are the results I'm currently getting:
Field | Type | Null | Key | Default | Extra
--------------------+-------------------+------+-----+-----------------------------------+----------------
cv_url | character varying | YES | | |
id | integer | NO | | nextval('users_id_seq'::regclass) | auto_increment
id | integer | NO | PRI | nextval('users_id_seq'::regclass) | auto_increment
image_url | character varying | YES | | |
institution | character varying | YES | | |
name | character varying | YES | | |
qualifications | text | YES | | |
role_id | integer | YES | | |
specializations | text | YES | | |
text_collaboration | text | YES | | |
(10 rows)
I can't figure out how to get the second occurrence of id to go away, the one coming from the non-primary-key constraint. I tried adding WHERE is_primary IS NULL OR is_primary = TRUE, but that drops the name field as well, since it is joined to a constraint that is also not a primary key.
What I'd like is to get all columns from the table (each only once), plus the string "PRI" if the field is a primary key.
Help! I'm in a bit over my head. Thanks.
This is the query you need:
SELECT *
FROM information_schema.columns
WHERE table_schema = 'public'
AND table_name = 'users'
Figured it out after a lot of banging my head against the wall. First I made a view:
CREATE VIEW table_column_constraints AS (
SELECT c.table_schema
, c.table_name
, c.column_name
, c.data_type
, c.is_nullable
, tc.constraint_type
, c.column_default
FROM information_schema.columns AS c
LEFT JOIN information_schema.constraint_column_usage AS ccu USING (column_name, table_name)
LEFT JOIN information_schema.table_constraints tc ON tc.constraint_name=ccu.constraint_name
WHERE c.table_schema='public');
Then I de-duplicated the rows by comparing the view to itself:
SELECT column_name as "Field"
, data_type AS "Type"
, is_nullable AS "Null"
, CASE WHEN constraint_type='PRIMARY KEY' THEN 'PRI' ELSE NULL END AS "Key"
, column_default AS "Default"
, CASE WHEN column_default LIKE 'nextval(%' THEN 'auto_increment' ELSE '' END AS "Extra"
FROM table_column_constraints AS given
WHERE given.table_name = 'users'
AND NOT EXISTS (SELECT * FROM table_column_constraints other
WHERE other.table_name=given.table_name AND other.column_name=given.column_name
AND given.constraint_type!='PRIMARY KEY' AND other.constraint_type='PRIMARY KEY');
To get the following results:
Field | Type | Null | Key | Default | Extra
--------------------+-------------------+------+-----+-----------------------------------+----------------
name | character varying | YES | | |
id | integer | NO | PRI | nextval('users_id_seq'::regclass) | auto_increment
image_url | character varying | YES | | |
institution | character varying | YES | | |
qualifications | text | YES | | |
cv_url | character varying | YES | | |
specializations | text | YES | | |
text_collaboration | text | YES | | |
role_id | integer | YES | | |
(9 rows)
I was inspired by https://stackoverflow.com/a/45065229/1151229 on a question called "Selecting rows ordered by some column and distinct on another"
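The de-duplication step above can be tried out in isolation. The following is only a sketch: it uses Python's built-in sqlite3 module instead of PostgreSQL, and a small hand-filled table stands in for the table_column_constraints view (the sample rows are made up for illustration). It also matches on table_name inside the subquery, which is an assumption on my part about what the query should do when several tables share column names.

```python
import sqlite3

# Hypothetical stand-in for the table_column_constraints view: one row per
# (column, constraint) pair, so "id" shows up twice.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_column_constraints (
        table_name TEXT, column_name TEXT, constraint_type TEXT
    );
    INSERT INTO table_column_constraints VALUES
        ('users', 'id',      'PRIMARY KEY'),
        ('users', 'id',      'FOREIGN KEY'),
        ('users', 'name',    'CHECK'),
        ('users', 'role_id', NULL);
""")

# Keep a row unless it is a non-primary-key row for a column that also has
# a primary-key row. Rows with a NULL constraint_type are always kept,
# because the <> comparison against NULL is never true.
dedup_sql = """
    SELECT given.column_name,
           CASE WHEN given.constraint_type = 'PRIMARY KEY' THEN 'PRI' ELSE '' END
    FROM table_column_constraints AS given
    WHERE given.table_name = 'users'
      AND NOT EXISTS (
          SELECT 1
          FROM table_column_constraints AS other
          WHERE other.table_name = given.table_name
            AND other.column_name = given.column_name
            AND given.constraint_type <> 'PRIMARY KEY'
            AND other.constraint_type = 'PRIMARY KEY')
    ORDER BY given.column_name
"""
rows = conn.execute(dedup_sql).fetchall()
for row in rows:
    print(row)
```

Note that a column with two different non-primary-key constraints would still appear twice with this approach; it only removes the duplicates that compete with a primary-key row.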
Related
For example, I have this SQL:
show tables from mydb;
It shows the list of tables:
|table1|
|table2|
|table3|
Then I run a SQL statement for each table, such as "show full columns from table1;":
+----------+--------+-----------+------+-----+---------+----------------+---------------------------------+---------+
| Field | Type | Collation | Null | Key | Default | Extra | Privileges | Comment |
+----------+--------+-----------+------+-----+---------+----------------+---------------------------------+---------+
| id | bigint | NULL | NO | PRI | NULL | auto_increment | select,insert,update,references | |
| user_id | bigint | NULL | NO | MUL | NULL | | select,insert,update,references | |
| group_id | int | NULL | NO | MUL | NULL | | select,insert,update,references | |
+----------+--------+-----------+------+-----+---------+----------------+---------------------------------+---------+
So in this case I can use a programming language, something like this (not real code, just showing the flow; cursor is assumed to be an open database cursor):
cursor.execute("SHOW TABLES FROM mydb")
for (t,) in cursor.fetchall():
    cursor.execute(f"SHOW FULL COLUMNS FROM mydb.`{t}`")
    print(cursor.fetchall())
However, is it possible to do this in SQL only?
If you are using MySQL, you can use the system views in INFORMATION_SCHEMA.
The COLUMNS view contains the table name and column name (and other details). No loop is required, and you can easily filter by other information, too.
SELECT *
FROM INFORMATION_SCHEMA.COLUMNS
If you are using Microsoft SQL Server, you can use the above query as well.
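To show the "one catalog query instead of a loop" idea in a form that runs anywhere, here is a small sketch using Python's built-in sqlite3 module. SQLite has no INFORMATION_SCHEMA, so sqlite_master and pragma_table_info() play that role here; the two sample tables are made up. On MySQL the equivalent would be the INFORMATION_SCHEMA.COLUMNS query above, typically filtered with WHERE TABLE_SCHEMA = 'mydb'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, label TEXT);
""")

# A single set-based query over the catalog: every table joined to its own
# column list, with no per-table loop in application code.
columns = conn.execute("""
    SELECT m.name AS table_name, p.name AS column_name, p.type AS column_type
    FROM sqlite_master AS m
    JOIN pragma_table_info(m.name) AS p
    WHERE m.type = 'table'
    ORDER BY m.name, p.cid
""").fetchall()

for table_name, column_name, column_type in columns:
    print(table_name, column_name, column_type)
```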
I'm trying to execute the following query several times:
SELECT st2.stop_id AS to_stop_id,
TIME_TO_SEC(
ADDTIME(TIMEDIFF(MIN(st1.time), %time),
TIMEDIFF(st2.time, st2.time))) AS duration
FROM stop_times st1,
stop_times st2,
trips tr,
calendar cal
WHERE tr.service_id = cal.service_id
AND tr.trip_id = st1.trip_id
AND st1.trip_id = st2.trip_id
AND st1.stop_id = %sid
AND st1.stop_seq +1 = st2.stop_seq
AND st1.time > %time
AND DATE(NOW()) BETWEEN cal.start_date AND
cal.end_date
GROUP BY st2.stop_id
However, it runs extremely slowly. I indexed the following attributes:
+------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| stop_times | 0 | st_id | 1 | st_id | A | 11431583 | NULL | NULL | | BTREE | | |
| stop_times | 1 | fk_tid_s | 1 | trip_id | A | 1039234 | NULL | NULL | YES | BTREE | | |
| stop_times | 1 | st_per_sid | 1 | stop_id | A | 33135 | NULL | NULL | YES | BTREE | | |
| calendar | 0 | PRIMARY | 1 | service_id | A | 5206 | NULL | NULL | | BTREE | | |
| calendar | 0 | PRIMARY | 1 | service_id | A | 5206 | NULL | NULL | | BTREE | | |
| trips | 0 | PRIMARY | 1 | trip_id | A | 449489 | NULL | NULL | | BTREE | | |
| trips | 1 | fk_rid | 1 | route_id | A | 1937 | NULL | NULL | YES | BTREE | | |
| trips | 1 | fk_sid | 1 | service_id | A | 7749 | NULL | NULL | YES | BTREE | | |
+------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
(For some reason, st_id is not shown as a PRIMARY KEY, but it is. I don't know if that's important, but just in case.)
I ran SQL EXPLAIN on this query and it gave me the following answer :
+------+-------------+-------+--------+-------------------------------------------------+---------------------+---------+------------------------------+------+---------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+--------+-------------------------------------------------+---------------------+---------+------------------------------+------+---------------------------------------------------------------------+
| 1 | SIMPLE | st1 | range | comp_uniq_st_seq,st_per_sid,comp_uniq_stid_time | comp_uniq_stid_time | 9 | NULL | 1396 | Using index condition; Using where; Using temporary; Using filesort |
| 1 | SIMPLE | tr | eq_ref | PRIMARY,fk_sid | PRIMARY | 8 | reseau_ratp.st1.trip_id | 1 | Using where |
| 1 | SIMPLE | cal | eq_ref | PRIMARY,comp_sid_date_en,comp_sid_date_st | PRIMARY | 4 | reseau_ratp.tr.service_id | 1 | Using where |
| 1 | SIMPLE | st2 | ref | comp_uniq_st_seq | comp_uniq_st_seq | 14 | reseau_ratp.st1.trip_id,func | 1 | Using index condition |
+------+-------------+-------+--------+-------------------------------------------------+---------------------+---------+------------------------------+------+---------------------------------------------------------------------+
What should I do to get this query running faster?
EDIT :
Query using the requested syntax :
SELECT st2.stop_id AS to_stop_id,
TIME_TO_SEC(
ADDTIME(TIMEDIFF(MIN(st1.time), %time),
TIMEDIFF(st2.time, st2.time))) AS duration
FROM stop_times st1
INNER JOIN stop_times st2
ON st1.trip_id = st2.trip_id AND st1.stop_seq + 1 = st2.stop_seq
INNER JOIN trips tr
ON tr.trip_id = st1.trip_id
INNER JOIN calendar cal
ON tr.service_id = cal.service_id
WHERE st1.stop_id = %sid
AND st1.time > %time
AND cal.start_date <= NOW()
AND cal.end_date >= NOW()
GROUP BY st2.stop_id
Here is SHOW CREATE TABLE for stop_times:
CREATE TABLE `stop_times` (
`trip_id` bigint(10) unsigned DEFAULT NULL,
`stop_id` int(10) DEFAULT NULL,
`time` time DEFAULT NULL,
`stop_seq` int(10) unsigned DEFAULT NULL,
UNIQUE KEY `comp_uniq_st_seq` (`trip_id`,`stop_seq`),
KEY `comp_uniq_stid_time` (`stop_id`,`time`),
CONSTRAINT `fk_sid_s` FOREIGN KEY (`stop_id`) REFERENCES `stops` (`stop_id`),
CONSTRAINT `fk_tid_s` FOREIGN KEY (`trip_id`) REFERENCES `trips` (`trip_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
For calendar :
CREATE TABLE `calendar` (
`service_id` int(10) unsigned NOT NULL,
`start_date` date DEFAULT NULL,
`end_date` date DEFAULT NULL,
PRIMARY KEY (`service_id`),
KEY `comp_sid_date_en` (`service_id`,`end_date`),
KEY `comp_sid_date_st` (`service_id`,`start_date`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
And for trips :
CREATE TABLE `trips` (
`trip_id` bigint(10) unsigned NOT NULL DEFAULT '0',
`route_id` int(10) unsigned DEFAULT NULL,
`service_id` int(10) unsigned DEFAULT NULL,
`trip_headsign` varchar(15) DEFAULT NULL,
`trip_short_name` varchar(15) DEFAULT NULL,
`direction_id` tinyint(1) DEFAULT NULL,
PRIMARY KEY (`trip_id`),
KEY `fk_rid` (`route_id`),
KEY `fk_sid` (`service_id`),
CONSTRAINT `fk_rid` FOREIGN KEY (`route_id`) REFERENCES `routes` (`route_id`),
CONSTRAINT `fk_sid` FOREIGN KEY (`service_id`) REFERENCES `calendar` (`service_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
st1 needs this composite index: INDEX(stop_id, time)
Please use the JOIN ... ON syntax.
Please provide SHOW CREATE TABLE.
Here is a Cookbook on creating INDEXes from a SELECT.
(Edit)
Calendar is trickier to handle, and there is no "good" index. These may help:
INDEX(service_id, start_date)
INDEX(service_id, end_date)
Also, reformulate AND DATE(NOW()) BETWEEN cal.start_date AND cal.end_date into
AND cal.start_date <= NOW()
AND cal.end_date >= NOW()
(Edit 2)
Wherever practical, say NOT NULL. This is probably especially important in stop_times which does not have a PRIMARY KEY. Change the two columns in UNIQUE KEY comp_uniq_st_seq (trip_id,stop_seq) to be NOT NULL and turn it into PRIMARY KEY (trip_id, stop_seq). This will allow the performance benefits of "the PK is clustered with the data" to kick in.
Now that I see the CREATE TABLE for Calendar, and that service_id is the PRIMARY KEY, the two indexes I suggested for it are probably useless. (Again, this relates to "clustering".)
My Cookbook for building indexes may come in handy.
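As a rough illustration of the composite index advice, the sketch below builds a cut-down stop_times table in SQLite (via Python's sqlite3 module) and asks the planner how it would run the st1 filter. This is not MySQL: the plan format and optimizer differ, and the index name comp_stid_time and the sample parameters are made up. It only shows that an equality column followed by a range column in one index lets the engine seek directly to the matching rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stop_times (
        trip_id  INTEGER NOT NULL,
        stop_seq INTEGER NOT NULL,
        stop_id  INTEGER,
        time     TEXT,
        PRIMARY KEY (trip_id, stop_seq)
    );
    -- Equality column first (stop_id), range column second (time).
    CREATE INDEX comp_stid_time ON stop_times (stop_id, time);
""")

# Ask SQLite how it would execute the filter used for st1 in the question.
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT trip_id, stop_seq, time
    FROM stop_times
    WHERE stop_id = ? AND time > ?
""", (1234, "08:00:00")).fetchall()

# The last field of each plan row is the human-readable description.
plan_text = " ".join(row[-1] for row in plan)
print(plan_text)
```

If the plan text names comp_stid_time, the planner is searching the index rather than scanning the whole table.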
I have a database like this:
+-------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| PosScore | float | YES | | NULL | |
| NegScore | float | YES | | NULL | |
| SynsetTerms | varchar(45) | YES | | NULL | |
+-------------+-------------+------+-----+---------+----------------+
Some of the SynsetTerms values have a # at the end.
Can I just use an SQL query and remove them?
Thanks in advance.
You can use an update statement:
update t
set SynsetTerms = left(SynsetTerms, length(SynsetTerms) - 1)
where SynsetTerms like '%#';
If you want to remove all occurrences of '#':
update t
set SynsetTerms = replace(SynsetTerms, '#', '')
where SynsetTerms like '%#%';
In your SELECT or UPDATE statement, just do this:
SELECT replace(synsetTerms, '#','') from table
or
UPDATE table set synsetTerms = replace(synsetTerms, '#','')
If you just want to update the records that contain the '#' symbol, you can add the following WHERE clause:
WHERE synsetTerms like '%#%'
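The difference between stripping only a trailing '#' and removing every '#' can be checked with a quick sketch. It runs against SQLite through Python's sqlite3 module, so substr() stands in for MySQL's LEFT(); the table name t and the sample values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER PRIMARY KEY, SynsetTerms TEXT);
    INSERT INTO t (SynsetTerms) VALUES ('able#'), ('unable'), ('c#sharp#');
""")

# Strip only a trailing '#': drop the last character of matching rows.
conn.execute("""
    UPDATE t
    SET SynsetTerms = substr(SynsetTerms, 1, length(SynsetTerms) - 1)
    WHERE SynsetTerms LIKE '%#'
""")
trailing_removed = [r[0] for r in conn.execute(
    "SELECT SynsetTerms FROM t ORDER BY id")]

# Remove every remaining '#', wherever it occurs in the value.
conn.execute("""
    UPDATE t
    SET SynsetTerms = replace(SynsetTerms, '#', '')
    WHERE SynsetTerms LIKE '%#%'
""")
all_removed = [r[0] for r in conn.execute(
    "SELECT SynsetTerms FROM t ORDER BY id")]

print(trailing_removed)
print(all_removed)
```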
I have the following table structures:
matches:
+-------------------+---------------------------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+---------------------------+------+-----+-------------------+-----------------------------+
| id | bigint(20) | NO | PRI | NULL | auto_increment |
| creator_id | bigint(20) | NO | | NULL | |
| mode | enum('versus','freeplay') | NO | | NULL | |
| name | varchar(100) | NO | | NULL | |
| team_1_id | varchar(100) | NO | | NULL | |
| team_2_id | varchar(100) | NO | | NULL | |
+-------------------+---------------------------+------+-----+-------------------+-----------------------------+
teams:
+--------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+----------------+
| id | bigint(20) | NO | PRI | NULL | auto_increment |
| creator_id | bigint(20) | NO | MUL | NULL | |
| name | varchar(100) | NO | | NULL | |
+--------------+--------------+------+-----+---------+----------------+
I need a query that returns all matches from the matches table along with the team names. When mode is "versus", the team name should come from the teams table; when mode is "freeplay", the team name is the value of team_1_id or team_2_id itself (those columns can hold strings, which is why they are varchar instead of int), without going to the teams table.
Use:
SELECT m.id,
m.creator_id,
m.mode,
m.name,
m.team_1_id,
m.team_2_id
FROM MATCHES m
WHERE m.mode = 'freeplay'
UNION ALL
SELECT m.id,
m.creator_id,
m.mode,
m.name,
t1.name,
t2.name
FROM MATCHES m
LEFT JOIN TEAMS t1 ON t1.id = m.team_1_id
LEFT JOIN TEAMS t2 ON t2.id = m.team_2_id
WHERE m.mode = 'versus'
SELECT CASE mode WHEN 'versus' THEN t1.name
ELSE team_1_id END AS name FROM matches
LEFT JOIN teams t1 ON t1.id=team_1_id
Do the same for team 2 and add a WHERE clause to suit your needs.
First, I would suggest you change your table structure slightly. Instead of using team_X_id for a name or an id, use it only for an id. Add an additional column for team_X_name that you can put the string in. That way you can define foreign keys and have the correct datatype. Set the team_X_id field to null and team_X_name to the team name when in freeplay mode, and set the team_X_id to the team id in versus mode.
That said, this should do what you want:
SELECT mode,
IF(team_1.id IS NULL, team_1_id, team_1.name),
IF(team_2.id IS NULL, team_2_id, team_2.name)
FROM matches
LEFT JOIN teams AS team_1 ON (matches.team_1_id=team_1.id)
LEFT JOIN teams AS team_2 ON (matches.team_2_id=team_2.id);
edit:
Actually, perhaps I misunderstood the design. If you are saying that mode 'freeplay' means neither team_X_id will be an actual team id, then you need a slightly different query:
SELECT mode,
IF(mode = 'freeplay' OR team_1.id IS NULL, team_1_id, team_1.name),
IF(mode = 'freeplay' OR team_2.id IS NULL, team_2_id, team_2.name)
FROM matches
LEFT JOIN teams AS team_1 ON (matches.team_1_id=team_1.id)
LEFT JOIN teams AS team_2 ON (matches.team_2_id=team_2.id);
But I would strongly suggest improving your DB design.
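Here is a small runnable sketch of the conditional lookup, using SQLite through Python's sqlite3 module rather than MySQL. The sample rows are made up, CASE replaces MySQL's IF(), and the explicit CAST is my own addition to make the string-to-integer comparison unambiguous in SQLite (MySQL would convert implicitly). Putting the mode test in the ON clause keeps freeplay rows from ever matching a team.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE teams (id INTEGER PRIMARY KEY, name VARCHAR(100) NOT NULL);
    CREATE TABLE matches (
        id        INTEGER PRIMARY KEY,
        mode      TEXT NOT NULL,
        team_1_id VARCHAR(100) NOT NULL,
        team_2_id VARCHAR(100) NOT NULL
    );
    INSERT INTO teams (id, name) VALUES (1, 'Red'), (2, 'Blue');
    INSERT INTO matches (id, mode, team_1_id, team_2_id) VALUES
        (10, 'versus',   '1',     '2'),
        (11, 'freeplay', 'Alice', 'Bob');
""")

# In 'versus' mode the name comes from teams; in 'freeplay' mode the
# team_X_id column already holds the name, so it is returned as-is.
rows = conn.execute("""
    SELECT m.id,
           m.mode,
           CASE WHEN m.mode = 'versus' THEN t1.name ELSE m.team_1_id END AS team_1,
           CASE WHEN m.mode = 'versus' THEN t2.name ELSE m.team_2_id END AS team_2
    FROM matches AS m
    LEFT JOIN teams AS t1
           ON m.mode = 'versus' AND t1.id = CAST(m.team_1_id AS INTEGER)
    LEFT JOIN teams AS t2
           ON m.mode = 'versus' AND t2.id = CAST(m.team_2_id AS INTEGER)
    ORDER BY m.id
""").fetchall()

for row in rows:
    print(row)
```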
I am trying to build a new table from the values in an existing table that are NOT contained in another table (the query below obviously checks for contained instead). Following is my table structure:
mysql> explain t1;
+-----------+---------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+---------------------+------+-----+---------+-------+
| id | int(11) | YES | | NULL | |
| point | bigint(20) unsigned | NO | MUL | 0 | |
+-----------+---------------------+------+-----+---------+-------+
mysql> explain whitelist;
+-------------+---------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------------------+------+-----+---------+----------------+
| id | bigint(20) unsigned | NO | PRI | NULL | auto_increment |
| x | bigint(20) unsigned | YES | | NULL | |
| y | bigint(20) unsigned | YES | | NULL | |
| geonetwork | linestring | NO | MUL | NULL | |
+-------------+---------------------+------+-----+---------+----------------+
My query looks like this:
SELECT point
FROM t1
WHERE EXISTS(SELECT source
FROM whitelist
WHERE MBRContains(geonetwork, GeomFromText(CONCAT('POINT(', t1.point, ' 0)'))));
Explain:
+----+--------------------+--------------------+-------+-------------------+-----------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+--------------------+-------+-------------------+-----------+---------+------+------+--------------------------+
| 1 | PRIMARY | t1 | index | NULL | point | 8 | NULL | 1001 | Using where; Using index |
| 2 | DEPENDENT SUBQUERY | whitelist | ALL | _geonetwork | NULL | NULL | NULL | 3257 | Using where |
+----+--------------------+--------------------+-------+-------------------+-----------+---------+------+------+--------------------------+
The query is taking 6 seconds to execute for 1000 records in t1 which is unacceptable for me. How can I rewrite this query using Joins (or perhaps a faster way if that exists) if I don't have a column to join on? Even a stored procedure is acceptable I guess in the worst case. My goal is to finally create a new table containing entries from t1. Any suggestions?
Unless the query optimizer is failing, a WHERE EXISTS construct should result in the same plan as a join with a GROUP BY clause. Look at optimizing MBRContains(geonetwork, GeomFromText(CONCAT('POINT(', t1.point, ' 0)'))); that's probably where your query is spending all its time. I don't have a suggestion for that, but here's your query written with a JOIN:
Select t1.point
from t1
join whitelist on MBRContains(whitelist.geonetwork, GeomFromText(CONCAT('POINT(', t1.point, ' 0)')))
group by t1.point
;
or to get the points in t1 not in whitelist:
Select t1.point
from t1
left join whitelist on MBRContains(whitelist.geonetwork, GeomFromText(CONCAT('POINT(', t1.point, ' 0)')))
where whitelist.id is null
;
This seems like a case where denormalizing t1 might be beneficial. Adding a GeomFrmTxt column holding the value of GeomFromText(CONCAT('POINT(', t1.point, ' 0)')) could speed up the query you already have.
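The two join shapes above (join plus GROUP BY for "contained", LEFT JOIN plus IS NULL for "not contained") can be tried without a spatial engine. The sketch below is a deliberately simplified stand-in: it runs on SQLite through Python's sqlite3 module, and a one-dimensional interval test (point BETWEEN lo AND hi) replaces MBRContains. The lo and hi columns, the sample data, and the t1_outside table name are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (point INTEGER NOT NULL);
    CREATE TABLE whitelist (id INTEGER PRIMARY KEY, lo INTEGER, hi INTEGER);
    INSERT INTO t1 (point) VALUES (5), (15), (25), (35);
    INSERT INTO whitelist (lo, hi) VALUES (0, 10), (20, 30), (24, 40);
""")

# Join on a containment predicate instead of an equality column; GROUP BY
# collapses points that fall inside more than one whitelist entry.
contained = [r[0] for r in conn.execute("""
    SELECT t1.point
    FROM t1
    JOIN whitelist AS w ON t1.point BETWEEN w.lo AND w.hi
    GROUP BY t1.point
    ORDER BY t1.point
""")]

# Anti-join: keep only points with no matching whitelist entry and
# materialize them as the new table.
conn.execute("""
    CREATE TABLE t1_outside AS
    SELECT t1.point
    FROM t1
    LEFT JOIN whitelist AS w ON t1.point BETWEEN w.lo AND w.hi
    WHERE w.id IS NULL
""")
not_contained = [r[0] for r in conn.execute(
    "SELECT point FROM t1_outside ORDER BY point")]

print(contained)
print(not_contained)
```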