Incorrect graph legend - RStudio mixes up factor and numeric variables

I have a database with several columns containing different species names, and one with a range of years. I have to produce a graph of species against years of sightings. Everything works and the graph comes up, but the legend shows numbers instead of names. For example, instead of "Citizen Science"=red, it writes 1=red. I use a function that has already been tested on other PCs, where everything works correctly, while on mine it doesn't.
Plot with incorrect legend
I noticed that on other PCs it reads the variables as characters while on mine it gives them as factors. I tried modifying them with stringsAsFactors=F and indeed they all come out as character, but doing so completely throws the graph off. I can't understand what the problem is and why only my computer reads the dataframe this way.
If it is helpful, this is the plot part of the function:
tipidati<-unique(data[,3])
tipidati<-tipidati[order(tipidati)]
if(plot){
plot(rbind(c(start-2,0),c(end,length(species))),type="n",ylab="Species richness",xlab="Years of record")
mtext(paste(paste(c(tipidati,"Plural"),"=",col=c(colours[1:length(tipidati)],colours[length(colours)]),sep=""),collapse=","),cex=0.9)

pandas writes NUL character (\0) when calling to_csv

In one of my scripts I call the following code on my dataframe to save the data to disk.
to_csv(input_folder / 'tmp' / section_fname, index=False, encoding='latin1', quoting=csv.QUOTE_NONNUMERIC)
When I opened the created file with Notepad++ in "show all characters" mode, it showed a lot of NUL characters (\0) inside one of the rows. In addition to this, some rows of the dataframe are not being written.
However, if I scroll along this line, some data from my dataframe appears after the NUL characters.
This appears somewhere in the middle of my dataframe, so I decided to call head and then tail to look inside the specific portion of the data where it appears. As far as I can see, the data looks fine: there are integers and strings, as there should be.
I am using pandas 1.1.5
I have looked through the data to ensure that nothing weird is being written that could cause it to be read this way. I have also googled whether someone else has faced the same issue, but mostly people report getting NUL characters when they read data with pandas, not when they write it.
I have spent a lot of time digging into the data and the code and have no explanation for this behavior. Maybe someone can help me?
By the way, every time I write my dataframe it occurs in a different place and removes a different number of rows.
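One way to narrow this down is to inspect the written file byte by byte rather than through an editor. The following is a minimal sketch, not a fix: `find_nul_runs` is a hypothetical helper name, and it assumes the file is small enough to read into memory. It reports the byte offset and length of every run of NUL bytes, so the positions can be compared between two writes of the same dataframe.

```python
import re
from pathlib import Path


def find_nul_runs(path):
    """Return a list of (byte_offset, run_length) for each run of NUL bytes."""
    data = Path(path).read_bytes()
    # rb"\x00+" matches one or more consecutive NUL bytes.
    return [(m.start(), m.end() - m.start()) for m in re.finditer(rb"\x00+", data)]
```

An empty list means the file on disk contains no NUL bytes at all, which would point at the viewer or at the storage layer rather than at the data.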
Kind regards,
Mike

What to expect when using mariadb/phpmyadmin when working with datatype 'geometry'

Trying to get familiar with the datatype 'geometry', I want to import a GPX file into a table and show it on an OSM map. I'm using MariaDB/phpmyadmin because that's what my hoster provides. I'm using the 'geometry' type because I like the ST_ functions (instead of putting lat and lon in two columns and developing or copying the needed algorithms).
After googling and youtubing for some time now, I'm at the point that I'm wondering if I'm doing things wrong or if I'm encountering bugs. Because I don't know what to expect, I hope someone can get me on the right track.
I started on a local PC with XAMPP 7.3.4 (phpmyadmin 4.8.5/mariadb 10.1.38) installed. I started with a column of datatype POINT, was surprised that phpmyadmin has the option to show the contents of a record on a map, and was disappointed that I only saw blue water. When editing a record, phpmyadmin showed a map and the data to be presented, which made clear that the SRID is '0'. I couldn't get the SRID set to '4326' until some text somewhere hinted that I should use a column with datatype GEOMETRY. But then only a world map showed up.
After a lot of trying, I decided to use the hosted environment (phpmyadmin 4.9.5/mariadb 10.3.22). To my surprise the point was visible on the map, only in a different part of the world. Looking at the lat-lon values I saw that they were interchanged. After putting them in lon-lat order, the point was visible at the place where I expected it.
Because the hoster provides higher versions installed, I installed a newer XAMPP 7.4.6 (phpmyadmin 5.0.2/mariadb 10.4.1). It was a big surprise that my point wasn't showing up, just the worldmap again. So it's some configuration with the OSM-map on the local machine that needs attention? The lat-lon still have to be interchanged.
Mapping is ok, lat-lon interchanged
Mapping wrong, lat-lon ok
Mapping of a walk in Paris. The first is the mapping of a GPX file in Prune, the second is an import of the tracked points into MariaDB. The mapping is exactly the same; I just had to interchange lat and lon for the points. So I think there is nothing wrong with the SRID used and/or the coordinates; phpmyadmin just takes lat as y and lon as x, instead of the expected lat as x and lon as y, which puts the walk somewhere in the sea off the coast of Somalia:
Mapping of walk in Paris presented in Prune
Mapping of same gps-points in phpmyadmin, lat-lon interchanged
Apart from the presentation of the data, I have difficulties when using the insert option of phpmyadmin. I can only get data into the table in one pass when using SQL. The insert option generates SQL that gives errors. I have to edit that SQL and remove some " ' " and " \ " characters. Comparing the versions I used, I found differences in the number and positions of the " ' " and " \ " characters that have to be removed.
I looked at the phpmyadmin issues; nothing relevant seems to be open. I can find closed ones that point to the sort of issues I'm experiencing. A lot of documentation on geo is of course about postgresql, some about mysql, but less about mariadb and phpmyadmin, so it's hard to find good directions.
So, my three biggest questions are: First, is it intended to store lon-lat instead of lat-lon (or do I have to use another SRID)? Second, what do I have to configure to get the map working locally like it does with my hoster (if that's what's causing the problem)? Third, can people use the insert option of phpmyadmin without editing the generated SQL?
Thanks in advance.
If you are going to use great-circle distances, be sure to have a version of MySQL/MariaDB that includes st_distance_sphere. (I think that limits you to MySQL 8.0.)
If you are going to have code to "find the nearest", you will find that challenging. Here's my discussion of techniques for such. http://mysql.rjweb.org/doc.php/find_nearest_in_mysql (Also included is a Stored Function to do great-circle calculations.) That discusses using SPATIAL and other techniques.
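For reference, a great-circle distance can also be computed in application code. The following is a minimal Python sketch of the standard haversine formula; it is not the Stored Function from the linked page, and the function name and the mean Earth radius of 6371 km are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against rounding slightly above 1 for antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))
```

Note that this function takes latitude first, which is the opposite of the x, y order used by POINT().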
Part of your issue is the mapping from y and x to whatever projection of the world your map has. Spherical long-lat coordinates describe positions on a sphere, not on a projection.
phpmyadmin is just a UI tool. It may not be smart enough to deal with some of the SPATIAL issues. I suggest you switch to the MySQL command-line tool and/or your application code. BTW, what language will you be writing in?
Backticks are used around table and column names. Quotes (' or ") are used around strings. In some contexts, backslash (\) may need to be doubled or even quadrupled.
POINT is a 25-byte binary format. It is best to dynamically construct the value, not spell out the value, as can be done with decimal numbers for INT and FLOAT.
And, yes, longitude comes first in POINT() and other SPATIAL thingies.
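As an illustration of constructing the value dynamically with longitude first, here is a minimal Python sketch. It assumes MariaDB as in the question; the table name `track`, the column name `pt`, and both function names are hypothetical. It builds the WKT text and a parameterized statement using the `%s` placeholder style common to Python MySQL/MariaDB drivers, so no manual quoting or backslash editing is needed.

```python
def gpx_point_to_wkt(lat, lon):
    """Build WKT for a GPX track point; WKT order is x y, so longitude comes first."""
    return f"POINT({lon} {lat})"


def build_insert(lat, lon, table="track", column="pt"):
    """Return (sql, params) for a parameterized insert of one point.

    Table and column names are hypothetical; they are quoted with backticks.
    """
    sql = f"INSERT INTO `{table}` (`{column}`) VALUES (ST_GeomFromText(%s))"
    return sql, (gpx_point_to_wkt(lat, lon),)
```

The pair returned by `build_insert` would be passed to a driver's `cursor.execute(sql, params)`; the driver then handles the quoting of the WKT string.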

plotting direction field for systems of three equations in SageMath using maxima

I would like to plot the direction field for a system of 3 or more equations in SageMath using Maxima. I know how to do this for a system of 2 equations. I don't know what to modify so that I extend it to 3 or more equations. I tried the following example for a system of two equations
maxima('plotdf([x,-y],[x,y],[x,-2,2],[y,-2,2])')
I was thinking that for three or more equations I simply have to add more variables, like
maxima('plotdf([x,-y,z],[x,y,z],[x,-2,2],[y,-2,2],[z,-2,2])')
but it's not working. I don't know what I am missing.
The Maxima documentation for plotdf makes the syntax clear. I'm not sure what a slope/direction field would look like for more equations unless you had more variables, but then it would have to be in three dimensions, which is not supported.
In any case I'm surprised this worked from within Sage; you must have had wxmaxima or something analogous already there.
Finally, note that SageMath has slope fields natively, though this may not correspond with your workflow.
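Since plotdf is limited to two dimensions, one option is to sample the three-dimensional field yourself and hand the vectors to a 3D plotting tool. The following is a minimal sketch in plain Python (which also runs inside Sage) for the system dx/dt = x, dy/dt = -y, dz/dt = z from the question. The function names and grid size are assumptions, and the sketch only computes the arrows; drawing them is left to whatever 3D plotting facility is available.

```python
from itertools import product


def field(x, y, z):
    """Right-hand side of the system dx/dt = x, dy/dt = -y, dz/dt = z."""
    return (x, -y, z)


def sample_field(lo=-2.0, hi=2.0, steps=5):
    """Return a list of (point, vector) pairs on a regular grid in [lo, hi]^3."""
    ticks = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return [((x, y, z), field(x, y, z)) for x, y, z in product(ticks, repeat=3)]
```

Each returned pair gives the base point of an arrow and its direction.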

OpenNLP splits our sentences in half along special characters

We're facing an issue while processing text extracted from PDF documents (the content of which we do not have control over).
Most of our text data happen to have sections which pose a challenge for OpenNLP which we use for detecting sentences for further processing. We are using the en-sent.bin model file from the OpenNLP website.
For example, one can often encounter GPS coordinates like 40° 43.554’ N, 73° 59.814’ W in these texts, where OpenNLP believes anything after a ’ character must belong to a new sentence.
This results in unwanted splitting of some of our sentences, for which we'd like to find a solution or workaround.
The above character turns out not to be a regular single quote (U+0027), but one called 'RIGHT SINGLE QUOTATION MARK' (U+2019 or 0xE2 0x80 0x99 in hex). It looks like the sentence that contains the coordinates is split exactly along these.
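To identify such characters in other inputs, a small diagnostic can list every non-ASCII character together with its code point, Unicode name, and UTF-8 bytes. This is a minimal Python sketch using only the standard library; `describe_non_ascii` is a hypothetical helper name, and it is independent of OpenNLP.

```python
import unicodedata


def describe_non_ascii(text):
    """Return (index, code_point, name, utf8_hex) for each non-ASCII character."""
    found = []
    for i, ch in enumerate(text):
        if ord(ch) > 127:
            utf8_hex = " ".join(f"{b:02X}" for b in ch.encode("utf-8"))
            found.append((i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"), utf8_hex))
    return found
```

Running it over the text around each unwanted split shows whether the same few characters are involved every time.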
We don't know how the en-sent.bin Sentence Detector model is trained or what character encoding it is working with (our input is UTF-8), as we found no such information in the documentation of OpenNLP (despite it mentioning that the character encoding to be used is specified during training of the model).
Filtering out such characters (i.e. all of those along which the splits happen) as a solution was dismissed, since we can't know for sure which ones are affected and it might also introduce the very similar problem of accidentally joining two sentences.
Since our team is highly inexperienced with OpenNLP, we're struggling to fix this. We have so far identified what we believe to be two candidate causes for the unwanted split, which I'd rather not post unless absolutely necessary, in order not to affect your thinking.
Please note that I'm obliged not to include our source code or the exact data we're feeding as those are highly confidential and the latter may contain personal or otherwise protected information.

How many Axis can we use in MDX practically?

I heard that there are around 128 axes in MDX.
AXIS(0) or simply 0 – Columns
AXIS(1) or simply 1 – Rows
AXIS(2) or simply 2 – Pages
AXIS(3) or simply 3 – Sections
...
So far I have used only two of them, Columns (0) and Rows (1).
I am just curious about
how,
where,
when or why
I can use the other MDX axes.
As far as I know, SSMS only supports two axes.
Thanks.
How :
select ... on 0, ... on 1, ... on 2 and so on .... from [cube]
Where :
Any client that will not crash with unexpected result format ;-)
When / Why :
A client could take advantage of several axes to render the result in 3D using 3 axes. Even if the client does not render the result in 3D, it might be interesting to ask the server to return the result split over 3 axes for ad-hoc (or easier) processing.
I do not know of any standard client that supports this.
But a typical application that comes to mind: Some years ago (before I was working with Analysis Services), we had a client requiring one and the same report for ten countries and five markets on fifty PowerPoint slides. If we had used Analysis Services at that time, we might have written a custom client application that uses a four dimensional report and thus can get the data to be put into all fifty PowerPoint slides with a single MDX query.
You need not think of OLAP dimensions as dimensions in space. You can also think of them (as the axis name aliases suggest) as, e.g., pages and chapters.
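To make the PowerPoint example concrete, here is a minimal Python sketch of how a client might walk a four-axis result. The cellset is simulated as a dictionary keyed by (column, row, country, market); all names and the dummy cell values are hypothetical, and no real Analysis Services client API is used.

```python
from itertools import product

# Hypothetical axis members: axes 0 and 1 form each table, axes 2 and 3 select the slide.
columns = ["Sales", "Units"]
rows = ["2019", "2020"]
countries = [f"Country {i}" for i in range(1, 11)]
markets = [f"Market {i}" for i in range(1, 6)]

# Simulated four-axis cellset with dummy values.
cells = {key: 0.0 for key in product(columns, rows, countries, markets)}


def slides(cells, columns, rows, countries, markets):
    """Yield ((country, market), table) where table is a list of rows of cell values."""
    for country, market in product(countries, markets):
        table = [[cells[(c, r, country, market)] for c in columns] for r in rows]
        yield (country, market), table
```

Ten countries times five markets gives the fifty two-dimensional tables, one per slide, from a single query result.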