I have a list of about 900 postcodes for each solar farm in England and Wales. I would like to find the house prices for each postcode, to see how house prices may have changed after the solar farms were implemented.
I have been given a query which gives me the HPI corresponding to each postcode, but I would like to get the individual house transactions (houses sold) in each postcode.
I am new to SPARQL and have no idea how to do a single query for all the postcodes. If anyone can help it would be great.
This is the link to searching via postcode: http://landregistry.data.gov.uk/app/qonsole.
Many thanks
Reece
There is an example query there, with transactions in a postcode. It uses a VALUES clause, so simply add all your postcodes to that VALUES clause.
Note: they store postcodes as typed literals, so you'll have to append ^^xsd:string to each quoted value.
Add ?postcode to the ORDER BY clause to make a bit more sense of the result.
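With around 900 postcodes, typing the VALUES clause by hand is error-prone, so here is a minimal sketch that generates it from a list. The function name build_values_clause and the sample postcodes are only for illustration, and it assumes the query already declares the xsd: prefix, as the Qonsole example does.

```python
def build_values_clause(postcodes, variable="postcode"):
    """Return a SPARQL VALUES clause with each postcode as an xsd:string literal."""
    literals = []
    for raw in postcodes:
        # Collapse stray whitespace and uppercase, e.g. "sw1a  1aa" -> "SW1A 1AA"
        cleaned = " ".join(raw.split()).upper()
        if not cleaned:
            continue  # skip blank entries
        # Escape backslashes and double quotes so the literal stays well-formed
        escaped = cleaned.replace("\\", "\\\\").replace('"', '\\"')
        literals.append(f'"{escaped}"^^xsd:string')
    return f"VALUES ?{variable} {{ " + " ".join(literals) + " }"


# Sample input only; replace with the real list of postcodes
clause = build_values_clause(["PL6 8RU", "sw1a  1aa", ""])
print(clause)
```

Paste the printed clause over the VALUES line of the example query. If the endpoint rejects a very long query, splitting the list into batches and running the query once per batch may help.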
It's very basic. I have one table named "AddressPrimary", which holds all 500,000 addresses for a city. It has three columns: "id", "address" and "rentalPrice". Rental price is currently null, as I am trying to get rental prices for these addresses.
I then have a second table named "AddressSecondary" that consists of 1,000 addresses that have rental prices. It has three columns: "id", "unmatchedAddress" and "rentalPriceGood".
The 1,000 addresses are not in the same format, or spelt the same way, as those in "AddressPrimary", so I cannot just upload these rental prices to the correct address. How can I write a quick SQL query to compare the 1,000 addresses to the 500,000 and get the best match %, so that the output would be
"Id from AddressPrimary", "address", "unmatchedAddress", "rentalPriceGood",
This way I can then export to CSV, and see if the "address" actually does equal "unmatchedAddress" then I can upload the rental prices for these 1,000 properties into Postgres.
Any suggestions? I've read numerous threads online and tried doing it, but it wouldn't produce what I wanted.
Thanks.
You can use the <-> or % operators from pg_trgm. There are many ways to formulate this, and it isn't clear what you want. You could look for all matches within a certain cutoff (join with % condition), or find the single closest match in AddressPrimary for each AddressSecondary (lateral join using <-> in an ORDER BY with LIMIT 1), or find the single closest match but only if it is also within a certain cutoff (combine the two with % in WHERE and <-> in ORDER BY, this might be important for performance).
The simple case with <-> would be:
-- requires the pg_trgm extension: create extension pg_trgm;
select *, address <-> unmatchedAddress as distance
from AddressSecondary cross join lateral (
    select * from AddressPrimary order by address <-> unmatchedAddress limit 1
) as foo;
You also might want to try domain-specific address standardization software. I think the USPS publishes some of that.
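To get a feel for what the <-> distance is measuring, here is a rough Python approximation of trigram similarity. It is a sketch of the idea, not the extension's actual code: the function names are made up, the padding and word-splitting rules are simplified from the pg_trgm documentation, and the sample addresses are invented. The real matching should still be done in Postgres.

```python
import re


def trigrams(text):
    """Approximate pg_trgm: lowercase, split into alphanumeric words,
    pad each word with two leading spaces and one trailing space."""
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def closest(unmatched, candidates):
    """Pick the candidate with the smallest distance (1 - similarity),
    which mirrors ORDER BY address <-> unmatchedAddress LIMIT 1."""
    return min(candidates, key=lambda c: 1.0 - similarity(unmatched, c))


# Invented sample data
best = closest("12 Main Str", ["12 Main Street", "98 Oak Avenue"])
print(best, round(similarity("12 Main Str", best), 2))
```

Exporting the score alongside each pair makes it easier to sort the CSV and review the weakest matches first.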
I am trying to run a query to do the following; all of the information will be contained in the same table.
For a specified product of interest find the price of the product and the country it is sold in. The result of this query should produce a set of countries and a set of prices that correspond to the sales price of that product in each country (the prices will differ across countries).
For example, the result of this query may reveal that the product is sold in India for 100$ and Russia for 200$.
(In advance of running this query, we did not know the product was sold in these countries or the prices)
The results of that query should then be used as a joined set of conditions to pull other parameters of interest from the table as per another query. The results of this second query are the ultimate outcome we hope to achieve.
i.e., the 2nd query should work as if it stated where country = India AND price = $100 (i.e. only if both are true), and return the names of all products that match these criteria (this will reveal alternative products at this price point in this country).
Repeat this search approach to show all products where country = Russia and price = $200, etc.
The first query should produce a set of conditions and the 2nd query should loop through each of those conditions and produce the result.
The country and price combinations will differ and should not be defined statically at any point.
I have seen a few different approaches including WITH, CTE and subqueries but have struggled to do this correctly.
Part of the problem I am having is that my condition for the 2nd query is a combination of the results of the 1st query.
Any help with this is really appreciated! Thanks in advance!
You should use a cursor that will run on the first query, will keep the 2 results in 2 local variables in every iteration and then will run the second query with those variables.
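As an alternative to a cursor, the same result can usually be had in one set-based query by joining the table to itself on the (country, price) pair, so nothing is defined statically. Below is a minimal sketch run through Python's sqlite3; the table name products, its columns and the sample rows are all assumptions, since the question does not give the schema.

```python
import sqlite3


def alternatives(conn, product):
    """Other products sold at the same price in the same country as `product`."""
    sql = """
        SELECT other.country, other.price, other.name
        FROM products AS target
        JOIN products AS other
          ON other.country = target.country   -- both conditions must hold,
         AND other.price = target.price       -- i.e. country AND price
        WHERE target.name = ?
          AND other.name <> target.name
        ORDER BY other.country, other.name
    """
    return conn.execute(sql, (product,)).fetchall()


# Hypothetical schema and made-up rows
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, country TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        ("Widget", "India", 100),
        ("Widget", "Russia", 200),
        ("Gadget", "India", 100),
        ("Gizmo", "Russia", 200),
        ("Gizmo", "India", 150),
        ("Doohickey", "Russia", 100),
    ],
)
matches = alternatives(conn, "Widget")
print(matches)
```

The "target" side of the join plays the role of the first query and the "other" side plays the role of the second, so the database does the looping.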
For the record, I am a newbie when it comes to SQL. I am practicing and getting familiar with the syntax and how it is all structured. I am using this website as a crash course in SQL. If you follow the link and take a look at question #5, the directions state to pull up data for only three countries: France, Germany, and Italy.
My dilemma is that I do not understand how multiple strings should be structured.
Here is my code:
SELECT name, population
FROM world
WHERE name = 'france';
The code above will produce a table with both selected columns, and the row that contains France's data, but I still need two other rows with data. However, when I try to edit it to achieve the results I need it won't work:
SELECT name, population
FROM world
WHERE name = 'germany' 'france' 'italy';
The code above only produces columns and the rows disappear. I need my table to include the name and population of all three countries. I have searched for a simple answer on how to properly add multiple strings and haven't found anything conclusive.
Despite my issue being relatively simple, I still seek help from all y'all, so please help!
Thank you!
Try using the IN operator, which lets you specify multiple values in a WHERE clause.
Your code should look like this:
SELECT name, population
FROM world
WHERE name IN ('France', 'Germany', 'Italy');
You can put many criteria in a WHERE clause. Use AND and OR to combine them (and parentheses when needed such as in where a = 1 and (b = 2 or c = 3)). So your query can be written as
SELECT name, population
FROM world
WHERE name = 'germany' OR name = 'france' OR name = 'italy';
As has been shown, however, when looking for several values of one attribute you can replace all the ORs with an IN clause:
WHERE name IN ('germany', 'france', 'italy')
which is more readable and hence preferable.
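A quick way to convince yourself that the two forms are equivalent is to run both against a tiny table. This sketch uses Python's built-in sqlite3 rather than the tutorial site's database, and the population figures are placeholders, not real numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE world (name TEXT, population INTEGER)")
# Placeholder figures, not real populations
conn.executemany(
    "INSERT INTO world VALUES (?, ?)",
    [("France", 1), ("Germany", 2), ("Italy", 3), ("Spain", 4)],
)

# Note: SQLite compares text case-sensitively with = and IN,
# so the names are written exactly as stored.
with_or = conn.execute(
    "SELECT name, population FROM world "
    "WHERE name = 'Germany' OR name = 'France' OR name = 'Italy' "
    "ORDER BY name"
).fetchall()

with_in = conn.execute(
    "SELECT name, population FROM world "
    "WHERE name IN ('Germany', 'France', 'Italy') "
    "ORDER BY name"
).fetchall()

print(with_or == with_in, with_in)
```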
I am trying to determine Medicare costs per capita in each state using Google BigQuery.
I already have population numbers for each state (represented as Total) as well as total Medicare cost (Cost) in each state. I am trying to divide total cost by the population of each state.
At the moment the query runs, but every entry is null. I am admittedly a beginner with both BigQuery and SQL.
Here is my code:
SELECT State, Cost / Total AS PerCapita
FROM medicare.population, medicare.CostByState
GROUP BY State, PerCapita;
One thing that may be causing issues is that the 'State' column exists in both 'population' and 'CostByState' tables. Not sure how to address this.
Here are my tables:
population
CostByState
You seem to have data with one row per state, so you only need a JOIN.
SELECT p.State, cbs.Cost / p.Total AS PerCapita
FROM medicare.population p JOIN
medicare.CostByState cbs
ON p.state = cbs.state;
You would only need aggregation if the tables had multiple rows per state.
Indeed, you need to join the two tables.
If the relationship is one-to-one, you're good. If not, you may need some type of aggregation, such as:
sum(cost)/sum(total) as per_capita
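If either table can have several rows per state, one way to avoid double counting is to aggregate each table on its own before joining. Here is a small sketch using Python's sqlite3 instead of BigQuery; the state names and figures are made up, and the columns are stored as REAL so the division is not truncated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE population (State TEXT, Total REAL);
    CREATE TABLE CostByState (State TEXT, Cost REAL);
    -- Made-up figures; state 'A' has two cost rows on purpose
    INSERT INTO population VALUES ('A', 100.0), ('B', 200.0);
    INSERT INTO CostByState VALUES ('A', 150.0), ('A', 100.0), ('B', 300.0);
""")

per_capita = conn.execute("""
    SELECT p.State, c.Cost / p.Total AS PerCapita
    FROM (SELECT State, SUM(Total) AS Total
          FROM population GROUP BY State) AS p
    JOIN (SELECT State, SUM(Cost) AS Cost
          FROM CostByState GROUP BY State) AS c
      ON p.State = c.State
    ORDER BY p.State
""").fetchall()

print(per_capita)
```

Summing after the join instead would count the population of 'A' twice here, because the join produces one row per cost entry.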
I'm trying to filter my query to show only table entries with the word WORLD and also contains other country codes.
I have used the wildcard function but I am unsure as to how I can have it only return entries with both WORLD and other country codes.
SELECT *
FROM [****].[dbo].[Titles]
WHERE Territories LIKE '%world%'
Any help is appreciated
EDIT: The expected result would return all rows with both WORLD and one or more country codes in the field. The Territories column in this table contains either WORLD or country codes and should not contain both. The reason I'm running this query is to search for any rows with bad data.
You could try something like this, where countryCode stands for an actual country code:
SELECT *
FROM [****].[dbo].[Titles]
WHERE Territories LIKE '%world%'
AND Territories LIKE '%countryCode%'
but I am not sure how fast that will be. If you know that the other country codes will always come after the world code, you could do something like this
SELECT *
FROM [****].[dbo].[Titles]
WHERE Territories LIKE '%world%countryCode%'
which I think should run faster.
Well, if it should contain both world and another country code, then you need
WHERE Territories LIKE '%world%' AND Territories LIKE '%{other country code}%'
If you are looking for world or another country code, you need
WHERE Territories LIKE '%world%' OR Territories LIKE '%{other country code}%'
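Since the goal is to find bad rows without listing every country code, another option is to look for rows that mention world but are not just world. This is only a sketch under the assumption that a clean WORLD row holds nothing but that word (give or take whitespace and case); it runs in Python's sqlite3 with made-up rows rather than against the real Titles table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Titles (id INTEGER, Territories TEXT)")
# Made-up rows; the separator between codes is an assumption
conn.executemany(
    "INSERT INTO Titles VALUES (?, ?)",
    [
        (1, "WORLD"),
        (2, "GB,FR"),
        (3, "WORLD,GB"),
        (4, "US, world"),
        (5, " WORLD "),
    ],
)

# Rows that contain 'world' but also something else
bad_rows = conn.execute("""
    SELECT id, Territories
    FROM Titles
    WHERE Territories LIKE '%world%'
      AND UPPER(TRIM(Territories)) <> 'WORLD'
    ORDER BY id
""").fetchall()

print(bad_rows)
```

The same WHERE clause should carry over to SQL Server largely unchanged, though whether LIKE ignores case there depends on the column's collation.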