Neighbours list extracted out of polygon regions - sql

I've got a SQL database which contains some coded polygon structures. Those can be extracted as follows
poly <- data.frame(sqldf("SELECT ST_astext(geometry) FROM table"))
The data.frame 'poly' contains strings that now can be converted to real 'SpatialPolygons' objects as follows (for the first string)
readWKT(poly[1,1])
I can do the previous for each string, and save it in a vector
list <- c()
for (i in 1:100){
  list <- c(list, readWKT(poly[i,1]))
}
The last thing I want to do is create a neighbourhood list from all the SpatialPolygons objects, using the following function
poly2nb(list)
But sadly, this command results in the following error
Error: extends(class(pl), "SpatialPolygons") is not TRUE
I know that the problem has something to do with the class of the list, but I really don't see a way out. Any help will be appreciated!
Edit
As suggested, some parts of the output. Keep in mind that the rows of 'poly' are really long strings of coordinates
> poly[1,1]
[1] "POLYGON((4.155976 50.78233,...,4.153225 50.76121,4.152384 50.761191,4.151878 50.761194,4.151319 50.761163,4.150872 50.761126))"
> poly[2,1]
[1] "POLYGON((5.139526 50.914059,...,5.140994 50.913612,5.156976 50.895945))"

This seems to work:
library(rgeos)  # provides readWKT
list <- lapply(1:2,function(i)readWKT(poly[i,1],id=i))
sp <- SpatialPolygons(lapply(list,function(sp)sp@polygons[[1]]))
library(spdep)  # provides poly2nb
poly2nb(sp)
The internal structure of SpatialPolygons is rather complex. A SpatialPolygons object is a collection (list) of Polygons objects (which represent geographies), and each of these is a list of Polygon objects, which represent geometric shapes. So for example, a SpatialPolygons object that represents US states, has 50 or so Polygons objects (one for each state), and each of those can have multiple Polygon objects (if the state is not contiguous, e.g. has islands, etc.).
It looks like poly2nb(...) takes a single SpatialPolygons object and calculates neighborhood structure based on the contained list of Polygons objects. You were passing a list of SpatialPolygons objects.
So the challenge is to convert the result of your SQL query to a single SpatialPolygons object. readWKT(...) converts each row to a SpatialPolygons object, each of which contains exactly one Polygons object. So you have to extract those and re-assemble them into a single SpatialPolygons object. The line:
sp <- SpatialPolygons(lapply(list,function(sp)sp@polygons[[1]]))
does that. The line before:
list <- lapply(1:2,function(i)readWKT(poly[i,1],id=i))
replaces your for (...) loop and also adds a polygon id to each polygon, which is necessary for the call to SpatialPolygons(...).


Extract rows from numpy and add rows to specific indexes

I have a really big 3D array (1103546X2504X3). These are genotype data, imported from a VCF file. First I want to filter it with my own data. After that, I would like to extract the rows I need and add the missing ones, keeping everything sorted. Right now my code is:
chr_pos is the position from my "reference" file, pos is the index position from the big array, and needed_index the rows I need.
needed_index = []
for i in range(len(chr_pos)):
    for k in range(len(pos)):
        if chr_pos[i] == k:
            needed_index.append(k)
After the extraction I check whether any row from the reference is missing:
list_difference = [item for item in chr_pos if item not in needed_pos]
needed_pos was made with the same code, but with .append(pos[k]).
My questions would be:
How to extract specific array rows, according to the list needed_index or needed_pos?
How to add the missing items to the array? It would be in the format indexesX2504X[0,0], where indexes is from list_difference, 2504 is the number of columns (sample numbers), and [0,0] is the value for every position I want to add.
Edit1: So basically I want to find the rows I need in the array (from a reference file), and if some of the positions are not in the main array, add them at the specific position, with the 2504 columns and [0,0] as the value in the third dimension.
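One possible approach, sketched under some assumptions: pos is a non-empty 1D sequence holding the position of each row of the big array (row k has position pos[k]) with no duplicates, and chr_pos holds the reference positions. The function name align_to_reference is hypothetical. The sketch looks up every reference position with np.searchsorted instead of nested loops, copies the rows that are found, and leaves rows of zeros for the positions that are missing. The output follows the order of chr_pos, so it is sorted if chr_pos is sorted.

```python
import numpy as np


def align_to_reference(geno, pos, chr_pos):
    """Return one row per reference position, in the order of chr_pos.

    Assumptions (not from the original post): pos[k] is the position of
    row k of geno, pos is non-empty and has no duplicate values.
    """
    pos = np.asarray(pos)
    chr_pos = np.asarray(chr_pos)

    # Sort the array positions once so they can be binary-searched.
    order = np.argsort(pos, kind="stable")
    sorted_pos = pos[order]

    # Candidate location of every reference position in the sorted positions.
    idx = np.searchsorted(sorted_pos, chr_pos)
    idx = np.clip(idx, 0, len(sorted_pos) - 1)

    # True where the reference position really exists in the big array.
    found = sorted_pos[idx] == chr_pos

    # Start from zeros, so missing positions stay filled with 0.
    out = np.zeros((len(chr_pos),) + geno.shape[1:], dtype=geno.dtype)
    out[found] = geno[order[idx[found]]]
    return out, found


# Tiny stand-in for the real (1103546 x 2504 x ...) array.
geno = np.arange(4 * 2 * 2).reshape(4, 2, 2)
pos = [10, 20, 30, 40]
chr_pos = [20, 25, 40]

aligned, found = align_to_reference(geno, pos, chr_pos)
print(aligned.shape)
print(found)
```

Here chr_pos[~found] plays the role of list_difference, and the trailing dimensions of the output are taken from the input array, so the fill value has whatever shape the third dimension has.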

List comprehension- Multiple inputs

I am a beginner, trying to understand how list comprehension for multiple inputs works.
Can someone explain how the below code works?
x,y = [int(x) for x in input("Enter the value ").split()]
print(x,y)
Thanks in advance!
This is actually not directly related to list comprehensions, but to a concept called "sequence unpacking", which applies to any sequence type (list, tuple, range). What is happening here is that the user input is expected to be two whitespace-separated values. The split call will split the user input on the whitespace, returning a list of size 2. Then, the list comprehension loops over each element of this split-produced list and converts each one to an int. Thus, the list comprehension will return a list of length 2, and each of its elements will be "unpacked" separately into the x and y variables on the left-hand side of the assignment operator. Here is an excerpt from the Data Structures section of the Python tutorial that explains sequence unpacking:
The statement t = 12345, 54321, 'hello!' is an example of tuple packing: the values 12345, 54321 and 'hello!' are packed together in a tuple. The reverse operation is also possible:
>>> x, y, z = t
This is called, appropriately enough, sequence unpacking and works for
any sequence on the right-hand side. Sequence unpacking requires that
there are as many variables on the left side of the equals sign as
there are elements in the sequence. Note that multiple assignment is
really just a combination of tuple packing and sequence unpacking.
Note that this only works if the user input is of length 2, else the
sequence unpacking will not work and will result in an error.
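To make the two steps visible, here is a small sketch that uses a fixed string in place of input(). The helper name parse_pair is made up for illustration.

```python
def parse_pair(line):
    """Split a line on whitespace, convert each piece to int, unpack into two names."""
    # Step 1: the list comprehension builds a list of ints, e.g. [3, 4].
    values = [int(token) for token in line.split()]
    # Step 2: sequence unpacking needs exactly two elements on the right-hand side.
    x, y = values
    return x, y


print(parse_pair("3 4"))  # (3, 4)

# With three values the unpacking step raises ValueError.
try:
    parse_pair("1 2 3")
except ValueError as err:
    print("unpacking failed:", err)
```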

Blender: split object with a shape

I've got a flat object that I want to split in multiple pieces (background: I want to print it later, but the surface of my printer is not large enough). I've modeled a simple puzzle-shape:
I would like to use this shape to cut through my object, but if I use the boolean modifier, Blender generates vertices where the shape and the object intersect, but it won't cut the object since my shape has a thickness of 0:
I don't want to make my shape thicker, because otherwise it would delete something of my object...
You are able to separate the two sides of the object from each other, and then rejoin them afterwards if you need to. (This does include the use of the boolean modifier)
First, you should add the boolean modifier to the main mesh where you want it, with the 'difference' operation. Then in edit mode, as you explained before, the vertexes are created but there isn't the actual 'cut' that you were looking for.
I recreated the scenario with a plane intersecting a cube:
This is what it looks like in edit mode after having applied the boolean modifier:
Second, after applying the boolean modifier, select the faces you want to separate in edit mode. Then press P (the shortcut for Separate; you can also reach it by right-clicking) and click 'Selection'; you should now have two separate objects. One of the objects will have what looks like a missing face. If you wanted two separate objects, you just need to add a face on the object with the missing face and you are done. If you wanted parts that are disconnected in edit mode but still form one object in object mode, select the two objects and press Ctrl+J. Hope this helps somewhat!
I have selected half of the cube that I want cut out (the selection does not include the face in the middle):
There are now two objects, completely separated from each other:

How to efficiently append a dataframe column with a vector?

Working with Julia 1.1:
The following minimal code works and does what I want:
using DataFrames

function test()
    df = DataFrame(NbAlternative = Int[], NbMonteCarlo = Int[], Similarity = Float64[])
    append!(df.NbAlternative, ones(Int, 5))
    df
end
Appending a vector to one column of df. Note: in my whole code, I add a more complicated Vector{Int} than ones' return.
However, @code_warntype test() does return:
%8 = invoke DataFrames.getindex(%7::DataFrame, :NbAlternative::Symbol)::AbstractArray{T,1} where T
Which, I suppose, means this isn't efficient. I can't work out what this @code_warntype output means. More generally, how can I understand what @code_warntype reports and fix it? This is a recurring, unclear issue for me.
EDIT: @BogumiłKamiński's answer
Then how would one write the following code?
for na in arr_nb_alternative
    @show na
    for mt in arr_nb_montecarlo
        println("...$mt")
        append!(df.NbAlternative, ones(Int, nb_simulations)*na)
        append!(df.NbMonteCarlo, ones(Int, nb_simulations)*mt)
        append!(df.Similarity, compare_smaa(na, nb_criteria, nb_simulations, mt))
    end
end
compare_smaa returns a nb_simulations length vector.
You should never do such things as it will cause many functions from DataFrames.jl to stop working properly. Actually such code will soon throw an error, see https://github.com/JuliaData/DataFrames.jl/issues/1844 that is exactly trying to patch this hole in DataFrames.jl design.
What you should do is append a data frame-like object to a DataFrame using the append! function (this guarantees that the result has consistent column lengths), or use push! to add a single row to a DataFrame.
Now the reason you have type instability is that a DataFrame can hold vectors of any type (technically columns are held in a Vector{AbstractVector}), so it is not possible to determine at compile time what the type of the vector under a given name will be.
EDIT
What you ask for is a typical scenario that DataFrames.jl supports well and I do it almost every day (as I do a lot of simulations). As I have indicated - you can use either push! or append!. Use push! to add a single run of a simulation (this is not your case, but I add it as it is also very common):
for na in arr_nb_alternative
    @show na
    for mt in arr_nb_montecarlo
        println("...$mt")
        for i in 1:nb_simulations
            # here you have to make sure that compare_smaa returns a scalar
            # if it is passed 1 in nb_simulations
            push!(df, (na, mt, compare_smaa(na, nb_criteria, 1, mt)))
        end
    end
end
And this is how you can use append!:
for na in arr_nb_alternative
    @show na
    for mt in arr_nb_montecarlo
        println("...$mt")
        # here you have to make sure that compare_smaa returns a vector
        append!(df, (NbAlternative=ones(Int, nb_simulations)*na,
                     NbMonteCarlo=ones(Int, nb_simulations)*mt,
                     Similarity=compare_smaa(na, nb_criteria, nb_simulations, mt)))
    end
end
Note that I append here a NamedTuple. As I have written earlier you can append a DataFrame or any data frame-like object this way. What "data frame-like object" means is a broad class of things - in general anything that you can pass to DataFrame constructor (so e.g. it can also be a Vector of NamedTuples).
Note that append! adds columns to a DataFrame using name matching so column names must be consistent between the target and appended object.
This differs from push!, which also allows pushing a row that does not specify column names (in my example above I show that a Tuple can be pushed).

How to read data from data bag within a PIG script

I have a databag which is the following format
{([ChannelName#{ (bigXML,[])} ])}
The DataBag consists of only one item, which is a Tuple.
The Tuple consists of only one item, which is a Map.
The Map maps channel names to values.
Each value is of type DataBag, which consists of only one tuple.
That tuple consists of two items: one is a chararray (a very big string) and the other is a map.
I have a UDF that emits the above bag.
Now I need to invoke another UDF, passing it the only tuple within the DataBag stored against a given channel in the Map.
Assuming there were no data bag, just a tuple such as
([ChannelName#{ (bigXML,[])} ])
I can access the data using $0.$0#'StdOutChannel'
Now with the tuple inside a bag
{([ChannelName#{ (bigXML,[])} ])}
If I do $0.$0.$0#'StdOutChannel' (prepending $0), I get the following error
ERROR 1052: Cannot cast bag with schema bag({bytearray}) to map
How can I access data within a data bag?
Try to break this problem down a little.
Let's say you get your inner bag:
MYBAG = $0.$0#'StdOutChannel';
First, can you ILLUSTRATE or DUMP this?
What can you do with this bag? Usually FOREACH over the tuples inside.
A = FOREACH MYBAG {
GENERATE $0 AS MyCharArray, $1 AS MyMap
};
ILLUSTRATE A; -- or if this doesn't work
DUMP A;
Can you try this interactively, and maybe edit your question with some more details based on what you find?
Some editing hints for StackOverflow:
put backticks around your code (`ILLUSTRATE`)
indent code blocks by 4 spaces on each line