PostgreSQL: How to optimize this query - sql

I am developing a small application in C++ with PostgreSQL as the back-end database. Along with other tables, my database has a "projects" table. For each primary key in this table, a new table is dynamically added to my database.
Example:
Suppose the projects table contains following 3 rows:
--------------------------------
| Id |Other Columns Goes here |
--------------------------------
| 1 | |
--------------------------------
| 2 | |
--------------------------------
| 3 | |
--------------------------------
So in this case I also have the following three tables:
Table1, Table2, Table3
As you can see, the table names are generated by appending projects.Id to the fixed string "Table".
It is also possible that no table is generated for some projects.
Example:
Suppose the projects table contains following 3 rows:
--------------------------------
| Id |Other Columns Goes here |
--------------------------------
| 1 | |
--------------------------------
| 2 | |
--------------------------------
| 3 | |
--------------------------------
In this case I might find only the following two tables in my database:
Table1, Table3
Now I simply need to get all the valid projects. Currently I am using the following algorithm:
//part1
SELECT * FROM projects
Read the projects one by one from the results of the above query and store each in a new instance of my custom class Project
Store each instance in a container, e.g. a vector, say vProjects
//part 2
For each Project p in vProjects
if (TableExist(p.Id))
Store p in a new container, say vValidatedProjects
Note: The TableExist() method executes the following query:
"SELECT COUNT(*) FROM pg_tables WHERE tablename = 'Table" + p.Id + "'"
Everything works as expected, but the program runs very slowly because of the second part of the algorithm. With one thousand projects, TableExist() is called a thousand times, and each call executes a new query, which slows the program down :(
The solution I have in mind is something like this:
//part1
SELECT * FROM projects
WHERE a table exists for projects.Id
Get only those projects for which a dynamic table exists. Read them from the results of the above query and store each in a new instance of my custom class Project
Store each instance in a container, e.g. a vector, say vProjects.
This way a single query does the job rather than N+1 queries (where N is the number of rows in the projects table).
But I don't know how to write a query that returns these results. Please help me achieve this.

Changing the design would be the best solution.
If that is not an option, then you could change the second part:
//part 2
For each Project p in vProject
if (TableExist(p.Id))
Store p in a new container say vValidatedProjects
Note: The TableExist() method executes the following query:
"SELECT COUNT(*) FROM pg_tables WHERE tablename = 'Table" + p.Id + "'"
by first adding a new boolean column to the projects table (let's name it projects.TableExists).
Then run your current TableExist() function once to populate that column. In addition, change the code that creates a table for a project so that it also sets that column, and the code that deletes a table so that it clears it.
Then your second part would be:
//part 2
For each Project p in vProject
if (p.TableExists)
Store p in a new container say vValidatedProjects
Note: The TableExist() method will not be used any more

I would rather have one table with a project_id column in it and do all selects with WHERE project_id = .... That would result in better table statistics, and the query planner will do a better job.
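If the design has to stay as it is, the single query the question asks for can be sketched as a join between the projects table and the system catalog. This is a minimal sketch that uses Python's built-in sqlite3 as a stand-in so it can run anywhere: sqlite_master plays the role that pg_tables (column tablename) would play in PostgreSQL, and the name column is made up for the demo.

```python
import sqlite3

# SQLite stand-in for the PostgreSQL setup described in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT);  -- name is a made-up column
    INSERT INTO projects VALUES (1, 'a'), (2, 'b'), (3, 'c');
    CREATE TABLE Table1 (x INTEGER);
    CREATE TABLE Table3 (x INTEGER);  -- no Table2 on purpose
""")

# One query instead of N+1: keep only the projects whose per-project
# table is listed in the catalog (pg_tables.tablename in PostgreSQL).
valid_projects = conn.execute("""
    SELECT p.id, p.name
    FROM projects AS p
    JOIN sqlite_master AS m
      ON m.type = 'table'
     AND m.name = 'Table' || p.id
    ORDER BY p.id
""").fetchall()

print(valid_projects)
```

In PostgreSQL the same idea would presumably read SELECT p.* FROM projects p JOIN pg_tables t ON t.tablename = 'Table' || p.id. Keep in mind that PostgreSQL folds unquoted identifiers to lower case, so a table created as Table1 without quotes is stored as table1 and the string being compared has to match that.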

Related

Language dependent column headers

I am working on a PostgreSQL-based application and am very curious whether there is a clever way to have language-dependent column headers.
I know that I can set an alias for a header with the AS keyword, but that obviously has to be done for every select, over and over again.
So I have a table for converting the technical column name to a mnemonic one, to be shown to the user.
I can handle the mapping in the application, but would prefer a database solution. Is there any?
At the very least, could I set the column header to table.column?
You could use a "view". You can think of a view as a pseudo-table defined by a query over one or more tables. For instance, if I have a table with the following shape
Table: Pets
Id | Name | OwnerId | AnimalType
1 | Frank| 1 | 1
2 | Jim | 1 | 2
3 | Bobo | 2 | 1
I could create a "view" that changes the Name field to look like PetName instead without changing the table
CREATE VIEW PetView AS
SELECT Id, Name as PetName, OwnerId, AnimalType
FROM Pets
Then I can use the view just like any other table
SELECT PetName
FROM PetView
WHERE AnimalType = 1
Further we could combine another table as well into the view. For instance if we add another table to our DB for Owners then we could create a view that automatically joins the two tables together before subjecting to other queries
Table: Owners
Id | Name
1 | Susan
2 | Ravi
CREATE VIEW PetsAndOwners AS
SELECT p.Id, p.Name as PetName, o.Name as OwnerName, p.AnimalType
FROM Pets p, Owners o
WHERE p.OwnerId = o.Id
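Now we can query the new view just like any other table. Note that a join view like this one is not directly writable: in PostgreSQL, inserts, updates and deletes through it would need INSTEAD OF triggers or rules, whereas simple single-table views are automatically updatable.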
SELECT * FROM PetsAndOwners
WHERE OwnerName = 'Susan'
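Applied to the original question, the same idea could be sketched as one view per language, each renaming the technical columns. The sketch below uses Python's built-in sqlite3 purely so it is runnable; the view name PetView_de and the German headers are made up for illustration, and CREATE VIEW works the same way in PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Pets (Id INTEGER PRIMARY KEY, Name TEXT, OwnerId INTEGER, AnimalType INTEGER);
    INSERT INTO Pets VALUES (1, 'Frank', 1, 1), (2, 'Jim', 1, 2), (3, 'Bobo', 2, 1);

    -- Hypothetical German view: same data, translated column headers.
    CREATE VIEW PetView_de AS
        SELECT Id AS Nummer, Name AS Tiername, OwnerId AS Besitzer, AnimalType AS Tierart
        FROM Pets;
""")

cur = conn.execute("SELECT * FROM PetView_de WHERE Tierart = 1 ORDER BY Nummer")
headers = [col[0] for col in cur.description]  # headers as reported by the driver
rows = cur.fetchall()

print(headers)
print(rows)
```

The application then only has to pick the view that matches the user's language; the aliases live in one place instead of in every SELECT.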

How to create a central parameter, like Report-Date?

I would like to create one location on SQL Server where I store the report date, and all queries and procedures should refer to this one value.
That way I only have to change the report date in one place and it is valid for all related queries and procedures.
I started with a scalar function that retrieves a value from a table, but this slows down the queries enormously.
I tried an inline table-valued function, but have no idea how to include it in a query.
I tried with a table that contains the report-date and used a cross join.
But it says:
The multi-part identifier could not be bound
Maybe some of you have an idea what to do here?
One possibility is to create a table, let's say TblReportDate with two columns: id and reportDate.
Then add one row with id 1 like following:
+----+------------+
| id | reportDate |
+----+------------+
| 1 | 04.04.2018 |
+----+------------+
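Now join the table with a LEFT JOIN and use the >= operator to compare the id column of the main table with the id column of TblReportDate, so that the single report-date row is attached to every row whose id is at least 1: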
SELECT * FROM mainTable
LEFT JOIN TblReportDate ON mainTable.id >= TblReportDate.id
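Since the question mentions trying a cross join, here is a minimal sketch of that variant: a one-row parameter table cross-joined to the main table, with the report date used in the WHERE clause. It runs on Python's built-in sqlite3 rather than SQL Server, and the bookedOn column and the sample dates are assumptions made only for this demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Single-row parameter table holding the central report date.
    CREATE TABLE TblReportDate (id INTEGER PRIMARY KEY, reportDate TEXT);
    INSERT INTO TblReportDate VALUES (1, '2018-04-04');

    -- bookedOn is a hypothetical date column on the main table.
    CREATE TABLE mainTable (id INTEGER PRIMARY KEY, bookedOn TEXT);
    INSERT INTO mainTable VALUES (1, '2018-03-30'), (2, '2018-04-04'), (3, '2018-04-10');
""")

QUERY = """
    SELECT m.id, m.bookedOn, r.reportDate
    FROM mainTable AS m
    CROSS JOIN TblReportDate AS r   -- exactly one row, so no row multiplication
    WHERE m.bookedOn <= r.reportDate
    ORDER BY m.id
"""

rows_before = conn.execute(QUERY).fetchall()

# Changing the report date in one place affects every query that joins to it.
conn.execute("UPDATE TblReportDate SET reportDate = '2018-04-30' WHERE id = 1")
rows_after = conn.execute(QUERY).fetchall()

print(rows_before)
print(rows_after)
```

Giving both tables an alias and qualifying every column with it, as done here, is usually worth trying when a "multi-part identifier could not be bound" error shows up.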

SELECT the FROM table with a sub select and modify the resulting table name

I have the following two tables, which cannot be changed.
1: DataTypes
+----------------------+-----------------------+
| datatypename(String) | datatypetable(String) |
+----------------------+-----------------------+
Example data:
+-----------+------------+
| CycleTime | datalong |
+-----------+------------+
| InjTime1 | datadouble |
+-----------+------------+
2: datalong_1 (data model does not matter here)
I want to make a query now that reads the datatypetable attribute from the datatypes table, adds the String "_1" to it and selects all content from it.
I imagined it, from a programmatic perspective, to look something similar to this statement which obviously doesn't work yet:
SELECT * FROM
(SELECT datatypetable FROM datatypes WHERE datatypename = 'CycleTime') + '_1'
How can I make this happen in SQL using HSQLDB?
Thanks to Leonidas199x I now know how to append the '_1', but how do I tell the FROM clause that the subselect is not a new derived table to read from, but the name of an existing table I want to read from?
SELECT * FROM
(SELECT RTRIM(datatypetable)+'_1' FROM datatypes WHERE datatypename = 'CycleTime')
According to this question which is identical to mine this is not possible:
using subquery instead of the tablename
:(
Can you explain your data model in a little more detail? I am not sure I understand exactly what it is you are looking to do.
If you are wanting to add _1 to the 'datatypename', you can use:
SELECT datatypename+'_1'
FROM datatypes
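Because a FROM clause cannot take its table name from a subquery, a common workaround is to do it in two steps in the client: read the name, then build the second statement from it. The following is only a sketch of that idea using Python's built-in sqlite3 instead of HSQLDB; the helper read_data_table, the columns of datalong_1 and the catalog check against sqlite_master are all assumptions for the demo (with HSQLDB the existence check would go against its own catalog).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE datatypes (datatypename TEXT, datatypetable TEXT);
    INSERT INTO datatypes VALUES ('CycleTime', 'datalong'), ('InjTime1', 'datadouble');

    -- Made-up layout; the real data model does not matter here.
    CREATE TABLE datalong_1 (ts INTEGER, value INTEGER);
    INSERT INTO datalong_1 VALUES (1, 100), (2, 105);
""")

def read_data_table(conn, datatypename, suffix="_1"):
    """Hypothetical helper: resolve the table name first, then query it."""
    # Step 1: look up the base table name with an ordinary bound parameter.
    row = conn.execute(
        "SELECT datatypetable FROM datatypes WHERE datatypename = ?",
        (datatypename,),
    ).fetchone()
    if row is None:
        raise LookupError(f"unknown data type: {datatypename}")
    table = row[0].strip() + suffix

    # Step 2: make sure the derived name is a real table before using it.
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    if exists is None:
        raise LookupError(f"no such table: {table}")

    # Identifiers cannot be bound as parameters, so the name is interpolated;
    # that is acceptable only because it was just checked against the catalog.
    return conn.execute(f'SELECT * FROM "{table}"').fetchall()

rows = read_data_table(conn, "CycleTime")
print(rows)
```

The catalog check matters: it keeps a value stored in datatypes from being pasted into a statement unverified.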

Multiple records in a table matched with a column

The architecture of my DB involves records in a Tags table. Each record in the Tags table has a string, which is a Name, and a foreign key to the PrimaryID of a record in another table, Worker.
Records in the Worker table have tags. Every time we create a Tag for a worker, we add a new row in the Tags table with the inputted Name and foreign key to the worker's PrimaryID. Therefore, we can have multiple Tags with different names per same worker.
Worker Table
ID | Worker Name | Other Information
__________________________________________________________________
1 | Worker1 | ..........................
2 | Worker2 | ..........................
3 | Worker3 | ..........................
4 | Worker4 | ..........................
Tags Table
ID |Foreign Key(WorkerID) | Name
__________________________________________________________________
1 | 1 | foo
2 | 1 | bar
3 | 2 | foo
5 | 3 | foo
6 | 3 | bar
7 | 3 | baz
8 | 1 | qux
My goal is to filter WorkerID's based on an inputted table of strings. I want to get the set of WorkerID's that have the same tags as the inputted ones. For example, if the inputted strings are foo and bar, I would like to return WorkerID's 1 and 3. Any idea how to do this? I was thinking something to do with GROUP BY or JOINING tables. I am new to SQL and can't seem to figure it out.
This is a variant of relational division. Here's one attempt:
select workerid
from tags
where name in ('foo', 'bar')
group by workerid
having count(distinct name) = 2
You can use the following:
select WorkerID
from tags where name in ('foo', 'bar')
group by WorkerID
having count(*) = 2
and this will retrieve your desired result, provided a worker cannot have the same tag twice (otherwise count(*) counts the duplicates).
Regards.
This article is an excellent resource on the subject.
While the answer from @Lennart works fine in Query Analyzer, you're not going to be able to duplicate that in a stored procedure or from a consuming application without opening yourself up to SQL injection attacks. To extend the solution, you'll want to look into passing your list of tags as a table-valued parameter, since SQL Server doesn't support arrays.
Essentially, you create a custom type in the database that mimics a table with only one column:
CREATE TYPE list_of_tags AS TABLE (t varchar(50) NOT NULL PRIMARY KEY)
Then you populate an instance of that type in memory:
DECLARE @mylist list_of_tags
INSERT @mylist (t) VALUES ('foo'), ('bar')
Then you can select against that as a join using the GROUP BY/HAVING described in the previous answers:
select workerid
from tags inner join @mylist on name = t
group by workerid
having count(distinct name) = 2
*Note: I'm not at a computer where I can test the query. If someone sees a flaw in my query, please let me know and I'll happily correct it and thank them.
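From a consuming application, the injection concern can also be handled with ordinary bound parameters: generate one placeholder per tag and bind the values. Here is a minimal sketch with Python's built-in sqlite3 and the sample data from the question; the function name workers_with_all_tags is made up, and the sketch is not tied to SQL Server.

```python
import sqlite3

def workers_with_all_tags(conn, wanted):
    """Return the WorkerIDs that carry every tag in `wanted` (relational division)."""
    wanted = sorted(set(wanted))  # duplicates would break the count comparison
    if not wanted:
        return []
    # Only the placeholders are generated; the tag values themselves are bound.
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT WorkerID
        FROM Tags
        WHERE Name IN ({placeholders})
        GROUP BY WorkerID
        HAVING COUNT(DISTINCT Name) = ?
        ORDER BY WorkerID
    """
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tags (ID INTEGER PRIMARY KEY, WorkerID INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO Tags VALUES (?, ?, ?)",
    [(1, 1, "foo"), (2, 1, "bar"), (3, 2, "foo"), (5, 3, "foo"),
     (6, 3, "bar"), (7, 3, "baz"), (8, 1, "qux")],
)

result = workers_with_all_tags(conn, ["foo", "bar"])
print(result)
```

The count in the HAVING clause is derived from the list itself, so the query keeps working when the number of tags changes.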

How can I order the rows inside a table by a column (but not the `SELECT`'s response)?

For example, I want to order a table like this
Foo | Bar
---------
1 | a
5 | d
2 | c
1 | b
2 | a
to this:
Foo | Bar
---------
1 | a
1 | b
2 | a
2 | c
5 | d
(ordered by Foo column)
That's because I only want to select the Bars that have a given Foo, and if it's already ordered I guess they will be faster to select because I won't have to use ORDER BY.
And if it's possible, once sorting by columns Foo, I want to sort the rows which have the same Foo by Bar column.
Of course, if I INSERT or UPDATE to table, it should remain ordered.
In SQL, tables are inherently unordered. This is a very important characteristic of databases. For instance, you can delete a row in the middle of a table, and when a new row is inserted, it can reuse the space occupied by the deleted row. This is more efficient than just appending rows to the end of the data.
In other words, the order by clause is used basically for output purposes only. Okay, I can think of two other situations: with limit (or a related clause) and with window functions (which SQLite has supported only since version 3.25).
In any case, ordering the data also would not matter for a query such as this:
select bar
from t
where foo = $FOO
The SQL engine does not "know" that the table is ordered. So, it will start at the beginning of the table and do the comparison for each row.
The way to make this more efficient is by building an index on foo. Then you will be able to get the efficiencies that you want.
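A minimal sketch of the index suggestion, using Python's built-in sqlite3 and the sample rows from the question. The index name idx_t_foo_bar is made up; indexing (foo, bar) rather than foo alone is an optional extra that lets the rows for one foo also come back sorted by bar.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (foo INTEGER, bar TEXT);
    INSERT INTO t VALUES (1, 'a'), (5, 'd'), (2, 'c'), (1, 'b'), (2, 'a');

    -- The table stays unordered; the index is the ordered structure,
    -- and the engine maintains it on every INSERT and UPDATE.
    CREATE INDEX idx_t_foo_bar ON t (foo, bar);
""")

bars = [row[0] for row in conn.execute(
    "SELECT bar FROM t WHERE foo = ? ORDER BY bar", (2,))]

# Ask the planner how it would run the query; the last column of each
# EXPLAIN QUERY PLAN row is a human-readable description of the step.
plan = " ".join(str(row[-1]) for row in conn.execute(
    "EXPLAIN QUERY PLAN SELECT bar FROM t WHERE foo = ? ORDER BY bar", (2,)))

print(bars)
print(plan)
```

The plan text should mention the index instead of a full table scan, which is the efficiency the question was hoping to get from physically sorting the table.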