How do I select rows in a table based on a key (PK) from another table? I have selected multiple polygons that lie within a geographical region in one layer.
The attribute table of the selected layer looks like this:
| Bloknr | Column 1 | Column 2 | Column 3 |
| 111-08 | xqyz | xyzq | qxyz |
| 208-09 | abc | cba | bca |
Here the row in question (row 1) is selected.
I now want to select the matching rows in a non-geographic layer (from a PostgreSQL database) whose table looks like this:
| BLOKNR | Column 1 | Column 2 | Column 3 |
| 111-08 | cab | bac | cab |
| 208-09 | abc | cba | bca |
| 111-08 | cba | bca | cab |
Here the first and third rows are to be selected.
There are about 20,000,000 rows in the PostgreSQL table, with multiple matches for each bloknr.
I work in QGIS 3.2 and PostgreSQL with pgAdmin 4.
Any help is most appreciated.
UPDATE to answer the comments
It would be simple if it were a matter of doing it within PostgreSQL, which is more or less made for that, but I cannot figure out how to run the query within QGIS. I would prefer not to export each table to PostgreSQL (I have a few, and for each I need multiple selection queries based on geography), partly because I would like to keep the workflow in QGIS, and partly because the export feature in the QGIS DB Manager gives me the following error, which I think means that I have to create all the tables manually.
" ERROR: function addgeometrycolumn(unknown, unknown, unknown,
integer, unknown, integer) does not exist LINE 1: SELECT
AddGeometryColumn('public','Test',NULL,0,'MULTIPOLYGO...
HINT: No function matches the given name and argument types. You might need to add explicit type casts."
So again, any help is appreciated.
I have come up with an answer that will work in theory.
First, make the desired geographical selection and create a new layer from the selection.
Then export that layer to the PostGIS database you are connected to.
Now it is possible to run queries in PostgreSQL, for example from pgAdmin.
Note that this does not keep the workflow in QGIS. For further processing (statistics etc.) one will have to work on the integration between the new PostGIS layer and selections within it, and it doesn't quite solve the geographical, map-based selection approach, although it will work.
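Once the exported selection and the large table sit in the same database, the key-based part of the selection could look roughly like the sketch below. It is only an illustration, not a QGIS recipe: Python's built-in sqlite3 module stands in for PostgreSQL, and the table names selected_blocks and big_table, the column names col1 to col3, and the index name are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the PostgreSQL database
conn.executescript("""
    -- hypothetical table holding the keys of the exported selection
    CREATE TABLE selected_blocks (bloknr TEXT);
    -- hypothetical stand-in for the large non-geographic table
    CREATE TABLE big_table (bloknr TEXT, col1 TEXT, col2 TEXT, col3 TEXT);
    CREATE INDEX big_table_bloknr_idx ON big_table (bloknr);
    INSERT INTO selected_blocks VALUES ('111-08');
    INSERT INTO big_table VALUES
        ('111-08', 'cab', 'bac', 'cab'),
        ('208-09', 'abc', 'cba', 'bca'),
        ('111-08', 'cba', 'bca', 'cab');
""")

# Keep every row of the big table whose key appears in the selection.
rows = conn.execute("""
    SELECT b.bloknr, b.col1, b.col2, b.col3
    FROM big_table AS b
    WHERE b.bloknr IN (SELECT bloknr FROM selected_blocks)
    ORDER BY b.rowid
""").fetchall()

for row in rows:
    print(row)
```

With the sample data from the question this keeps the first and third rows. On a table of about 20,000,000 rows, an index on the key column (as in the sketch) is likely to matter for speed.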
I have a linked table to an Outlook Mailitem folder in my Access database. This is handy in that it keeps itself constantly updated, but I can't add an extra field to relate these records to a parent table.
My workaround was to put an automatically generated/added ID String into the Subject so I could work from there. In order to make my form work the way I need it to, I'm trying to create a query that takes the fields I need from the linked table and adds a calculated field with the extracted ID so it can be referenced for relating records in the form.
The query works fine (I get all the records and their IDs extracted) but when I try to filter records from this query by the calculated field I get:
This expression is typed incorrectly, or it is too complex to be evaluated. For example, a numeric expression may contain too many complicated elements. Try simplifying the expression by assigning parts of the expression to variables.
I tried separating the calculated field out into three fields so it's easier to read, hoping that would make it easier to evaluate for Access, but I still get the same error. My base query is currently:
SELECT InStr(Subject,"Support Project #CS")+19 AS StartID,
InStr(StartID,Subject," ") AS EndID,
Int(Mid(Subject,StartID,EndID-StartID)) AS ID,
ProjectEmails.Subject,
ProjectEmails.[From],
ProjectEmails.To,
ProjectEmails.Received,
ProjectEmails.Contents
FROM ProjectEmails
WHERE (((ProjectEmails.[Subject]) Like "*Support Project [#]CS*"));
I've tried to bind a subform to this query on qryProjectEmailWithID.ID = SupportProject.ID where the main form is bound to SupportProject, and I get the above error. I tried building a query that selects all records from that query where the ID = a given parameter and I still get the same error.
The working query that adds Support Project IDs would look like:
+----+--------------------------------------+----------------------+----------------------+------------+----------------------------------+
| ID | Subject | To | From | Received | Contents |
+----+--------------------------------------+----------------------+----------------------+------------+----------------------------------+
| 1 | RE: Support Project #CS1 ID Extra... | questions#so.com | Isaac.Reefman#so.com | 2019-03-11 | Trying to work out how to add... |
| 1 | RE: Support Project #CS1 ID Extra... | isaac.reefman#so.com | questions#so.com | 2019-03-11 | Thanks for your question. The... |
| 1 | RE: Support Project #CS1 ID Extra... | isaac.reefman#so.com | questions#so.com | 2019-03-11 | You should use a different me... |
| 2 | RE: Support Project #CS2 IT issue... | support#domain.com | someone#company.com | 2019-02-21 | I really need some help with ... |
| 2 | RE: Support Project #CS2 IT issue... | someone#company.com | support#domain.com | 2019-02-21 | Thanks for your question. The... |
| 2 | RE: Support Project #CS2 IT issue... | someone#company.com | support#domain.com | 2019-02-21 | Have you tried turning it off... |
| 3 | RE: Support Project #CS3 email br... | support#domain.com | someone#company.com | 2019-02-12 | my email server is malfunccti... |
| 3 | RE: Support Project #CS3 email br... | someone#company.com | support#domain.com | 2019-02-12 | Thanks for your question. The... |
| 3 | RE: Support Project #CS3 email br... | someone#company.com | support#domain.com | 2019-02-13 | I've just re-started the nece... |
+----+--------------------------------------+----------------------+----------------------+------------+----------------------------------+
The view in question would populate a datasheet that looks the same but contains only the items whose ID matches the ID of the current SupportProject record, updating when a new record is selected. A separate text box should show the full content of whichever record is selected in that grid, like this:
Have you tried turning it off and on again?
From: support#domain.com
On: 21/02/2019
Thanks for your question. The matter has been assigned to Support Project #CS2, and a support staff member will be in touch shortly to help you out. As it is considered of medium priority, you should expect daily updates.
Thanks,
Support
From: someone#company
On: 21/02/2019
I really need some help with my computer. It seems really slow and I can't do my work efficiently.
Neither of these things happens when I try to use the calculated number to relate to the PK of the SupportProject table...
I don't know if this is a part of the problem, but whether I use Int(Mid(Subject... or Val(Mid(Subject... I still apparently get a Double, where the ID field (as an autoincrement ID) is a Long. I can't work out how to force it to return a Long, so I can't test whether that's the problem.
So that is the output resulting from the posted SQL? I really wanted raw data, but this is close enough. If the requirement is to extract the number after ...CS, calculate it in a query and save the query:
Val(Mid([Subject],InStr([Subject],"CS")+2))
Then build another query that joins the first query to the table.
SELECT qryProjectEmailWithID.*, SupportProject.tst
FROM qryProjectEmailWithID
INNER JOIN SupportProject ON qryProjectEmailWithID.ID = SupportProject.ID;
Filter criteria can be applied to either ID field.
A subform can display the related child records, synchronized with the SupportProject records on the main form.
I tested the ID calc with your data and then with a link to my Inbox. No issue with query join.
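For readers who want to check the extraction logic outside Access, here is a minimal Python sketch of the same idea (read the digits that follow "#CS" in the subject). It is not Access code; the function name extract_project_id and the third sample subject are made up for illustration.

```python
import re
from typing import Optional

# Matches the "#CS<digits>" marker that the question puts into each subject line.
ID_PATTERN = re.compile(r"#CS(\d+)")


def extract_project_id(subject: str) -> Optional[int]:
    """Return the support project number from a subject, or None if there is none."""
    match = ID_PATTERN.search(subject)
    return int(match.group(1)) if match else None


subjects = [
    "RE: Support Project #CS1 ID Extra...",
    "RE: Support Project #CS2 IT issue...",
    "Lunch on Friday?",  # hypothetical subject without an ID
]
ids = [extract_project_id(s) for s in subjects]
print(ids)
```

Returning an integer (or nothing) keeps the extracted ID in the same kind of type as the key it is compared with, which is the concern raised in the question about Double versus Long.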
I have these sample rows of plate numbers with bay numbers:
Plate no | Bay no
------------------
AAA111 | 1
AAA222 | 1
AAA333 | 2
BBB111 | 3
BBB222 | 3
CCC111 | 1
Is there a way to make it look like this in a DataWindow in PowerBuilder?
1      | 2      | 3
------------------------
AAA111 | AAA333 | BBB111
AAA222 |        | BBB222
CCC111 |        |
There isn't a simple answer, especially if you need the cells to be updatable.
Variable Column Count Strategy
If the number of columns across the top is unknown at development time, then you might get by with a "Crosstab" style DataWindow, but it would be display only. If you need updates, you'll need to do manual data manipulation and updates, as each cell would probably represent one row.
Fixed Column Count Strategy
If the number of columns is known (fixed), you could flatten the data at the database and use a standard tabular (or grid) DataWindow control, but you'll still need to get creative if updates are needed.
If you use Oracle to obtain the data, you can use the PIVOT and UNPIVOT operators to do what you are looking for. Here is an example of how to do it:
http://www.oracle.com/technetwork/es/articles/sql/caracteristicas-database11g-2108415-esa.html
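To make the "flatten the data" idea concrete, here is a small sketch of the reshaping itself, written in Python rather than PowerBuilder or Oracle SQL. It only shows how the rows from the question turn into one column per bay; it says nothing about how a DataWindow would display or update the result.

```python
from collections import defaultdict
from itertools import zip_longest

# (plate no, bay no) rows from the question
rows = [
    ("AAA111", 1), ("AAA222", 1), ("AAA333", 2),
    ("BBB111", 3), ("BBB222", 3), ("CCC111", 1),
]

# Collect the plate numbers for each bay, keeping the input order.
by_bay = defaultdict(list)
for plate, bay in rows:
    by_bay[bay].append(plate)

bays = sorted(by_bay)
# Pad the shorter columns with empty strings so every display row has full width.
grid = list(zip_longest(*(by_bay[b] for b in bays), fillvalue=""))

print(bays)
for line in grid:
    print(line)
```

Each tuple in grid is one display row, and each position in the tuple corresponds to one bay. Note that a display cell no longer maps to a single column of one source row, which is why updates need the manual handling mentioned above.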
I want to use the same labels from a SQLAlchemy table, to re-aggregate some data (e.g. I want to iterate through mytable.c to get the column names exactly).
I have some spending data that looks like the following:
| name | region | date | spending |
| John | A | .... | 123 |
| Jack | A | .... | 20 |
| Jill | B | .... | 240 |
I'm then passing it to an existing function we have, that aggregates spending over 2 periods (using a case statement) and groups by region:
grouped table:
| Region | Total (this period) | Total (last period) |
| A | 3048 | 1034 |
| B | 2058 | 900 |
The function returns a SQLAlchemy query object that I can then use subquery() on to re-query e.g.:
subquery = get_aggregated_data(original_table).subquery()
region_A_results = session.query(subquery).filter(subquery.c.region == 'A')
I then want to re-aggregate this subquery (summing every column that can be summed and replacing the region column with the string 'other').
The problem is, if I iterate through subquery.c, I get labels that look like:
anon_1.region
anon_1.sum_this_period
anon_1.sum_last_period
Is there a way to get the textual label from a set of column objects, without the anon_1. prefix? Especially since I feel that the prefix may change depending on how SQLAlchemy decides to generate the query.
Split the name string and take the second part. If you want to allow for the chance that the name is not prefixed by the table name, put the code in a try/except block:
for col in subquery.c:
    try:
        # "anon_1.region" -> "region"
        print(col.name.split('.')[1])
    except IndexError:
        # no prefix present, so use the name as-is
        print(col.name)
Also, the result proxy (region_A_results) has a method keys which returns a list of column names. Again, if you don't need the table names, you can easily get rid of them.
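The same prefix stripping can be written without try/except. The sketch below works on plain label strings and does not import SQLAlchemy, so it makes no claim about what SQLAlchemy itself returns for a given column; the helper name bare_label is hypothetical.

```python
def bare_label(label: str) -> str:
    """Drop everything up to the last dot; return the label unchanged if it has no dot."""
    return label.rpartition(".")[2]


labels = ["anon_1.region", "anon_1.sum_this_period", "sum_last_period"]
bare = [bare_label(label) for label in labels]
print(bare)
```

Because rpartition returns the whole string as its last element when the separator is missing, prefixed and unprefixed labels go through the same code path.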
I have the following table, which is a minimal example of the result of multiple joined tables. I would now like to group by 'person_ID' and get all the 'value' entries in one row, sorted by feature_ID.
person_ID | feature_ID | value
123 | 1 | 1.1
123 | 2 | 1.2
123 | 3 | 1.3
123 | 4 | 1.2
124 | 1 | 1.0
124 | 2 | 1.1
...
The result should be:
123 | 1.1 | 1.2 | 1.3 | 1.2
124 | 1.0 | 1.1 | ...
There should be an elegant SQL solution, but I can neither come up with one nor find one.
For quick reproduction, here is the example data:
create table example(person_ID integer, feature_ID integer, value float);
insert into example(person_ID, feature_ID, value) values
(123,1,1.1),
(123,2,1.2),
(123,3,1.3),
(123,4,1.2),
(124,1,1.0),
(124,2,1.1),
(124,3,1.2),
(124,4,1.4);
Edit: Every person has 6374 entries in the real-life application.
I am using a PostgreSQL 8.3.23 database, but I think this should be solvable with standard SQL.
Databases aren't much good at transposing. There is a nebulous column growth issue at hand: how does the database deal with a variable number of columns? It's not a spreadsheet.
This kind of transposing is normally done in the report writer, not in SQL.
... or in a program, for example in PHP.
A dynamic cross tab in SQL is only possible with a procedure; see:
https://www.simple-talk.com/sql/t-sql-programming/creating-cross-tab-queries-and-pivot-tables-in-sql/
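As an illustration of the "in a program" route, here is a minimal sketch in Python (instead of PHP). It lets the database sort the rows and then collects each person's values into one list. The built-in sqlite3 module is used only as a stand-in for the PostgreSQL database, and the variable name wide is made up.

```python
import sqlite3
from itertools import groupby
from operator import itemgetter

conn = sqlite3.connect(":memory:")  # stand-in for the PostgreSQL database
conn.execute("create table example(person_ID integer, feature_ID integer, value float)")
conn.executemany(
    "insert into example(person_ID, feature_ID, value) values (?, ?, ?)",
    [(123, 1, 1.1), (123, 2, 1.2), (123, 3, 1.3), (123, 4, 1.2),
     (124, 1, 1.0), (124, 2, 1.1), (124, 3, 1.2), (124, 4, 1.4)],
)

# Let the database do the sorting; groupby needs rows ordered by the grouping key.
cursor = conn.execute(
    "select person_ID, value from example order by person_ID, feature_ID"
)

# One entry per person: the values in feature_ID order.
wide = {
    person: [value for _, value in group]
    for person, group in groupby(cursor, key=itemgetter(0))
}
for person, values in wide.items():
    print(person, values)
```

Since every person ends up with a list rather than a fixed set of columns, the number of entries per person (6374 in the real application) does not have to be known in advance.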
The Question
One thing that I am confused about is the technical definition of possibly the most basic component of a database: a single value.
Some Examples
I understand and follow (at a minimum) the first three normal forms of database normalization - or so I think. That said, with the introduction of RANGE in PostgreSQL 9.2 I started thinking about what makes a single value.
From the docs:
Range types are useful because they represent many element values in a single range value
So, what are you? Several values, or a single value... nothingness... 42?
Why does this matter?
Because it speaks directly to the Second Normal Form:
Create separate tables for sets of values that apply to multiple records.
Relate these tables with a foreign key.
#1 Ranges
For example, in Postgres 9.1 I had some tables structured like this:
"SomeSchema"."StatusType"
"StatusTypeID" | "StatusType"
--------------------|----------------
1 | Start
2 | Stop
"SomeSchema"."Statuses"
"StatusID" | "Identifier" | "StatusType" | "Value" | "Timestamp"
---------------|----------------|----------------|---------|---------------------
1 | 1 | 1 | 0 | 2000-01-01 00:00:00
2 | 1 | 2 | 5 | 2000-01-02 12:00:00
3 | 2 | 1 | 1 | 2000-01-01 00:00:00
4 | 3 | 1 | 2 | 2000-01-01 00:00:00
5 | 2 | 2 | 7 | 2000-01-01 18:30:00
6 | 1 | 2 | 3 | 2000-01-02 12:00:00
This enabled me to keep an historical record of how things were configured at any given point in time.
This structure takes the position that the data in the "Value" column were all separate values.
Now, in Postgres 9.2 if I do the same thing with a RANGE value it would look like this:
"SomeSchema"."Statuses"
"StatusID" | "Identifier" | "Value" | "Timestamp"
---------------|----------------|-------------|---------------------
1 | 1 | (0, NULL) | 2000-01-01 00:00:00
2 | 1 | (0, 5) | 2000-01-02 12:00:00
3 | 2 | (1, NULL) | 2000-01-01 00:00:00
4 | 3 | (2, NULL) | 2000-01-01 00:00:00
5 | 2 | (1, 7) | 2000-01-01 18:30:00
6 | 1 | (0, 3) | 2000-01-02 12:00:00
Again, this structure would enable me to keep an historical record of how things were configured, but I would be storing the same value several times in separate places. It makes updating (technically inserting a new record) more tricky because I have to make sure the data rolls over from the original record.
#2 Arrays
Arrays have been around for a long time, and while they can be abused, I tend to use them for things like color codes. For example, my project stores information and at times needs to know how to display it. I could create three columns to store red, green, and blue values, but that just seems silly. When would I ever create a foreign key (or even just filter) based on one of the given color codes?
When I created the field it was from the perspective that I needed to store a color in a neutral format so that I could feed anything that accepts a color value. I made the column an array and filled it with the appropriate codes to make the color I want.
#3 PostGIS: Geometry & Geography
When storing a polygon in PostGIS, it stores all the points that make the boundary in a single field. If one point were to change and I wanted to keep an historical record, I would have to store all of the points that have not changed twice in order to store the new polygon along with the old.
So, what is a value? And if RANGE, ARRAY, and GEOGRAPHY are values, do they really break the second normal form?
The fact that some operation can derive new values from X that appear to be components of X's value doesn't mean X itself isn't "single valued". Thus "range" values and "geography" values should be single values as far as the DBMS's type system is concerned. I don't know enough about PostgreSQL's implementation to know whether "arrays" can be considered single values in themselves. SQL DBMSs like PostgreSQL are not truly relational DBMSs, and SQL supports various structures that certainly aren't proper relation variables, values or types (pointers, nulls and other exotica).
This is a difficult and sometimes controversial topic, however. If you haven't read it, I recommend the book Databases, Types, and the Relational Model - The Third Manifesto by Date and Darwen. It addresses exactly the kind of questions you are asking.
I don't like your description of 2NF but it's not very relevant here.